Fuzz testing has passed its 35th birthday and, in that time, has gone from a disparaged and mocked
technique to one that is the foundation of many efforts in software engineering and testing. The key
idea behind fuzz testing is using random input and having an extremely simple test oracle that only looks
for crashes or hangs in the program. Importantly, in all of our studies, we made all of our tools, test
data, and results public so that others could reproduce the work. In addition, we located the cause of
each failure that we triggered and identified the common causes of such failures.
In the last several years, there has been a huge amount of progress and new developments in fuzz
testing. Hundreds of papers have been published on the subject and dozens of PhD dissertations have
been produced. In this talk, I will review the progress over the last 35 years describing our simple
approach – using what is now called black box generational testing – and show how it is still relevant
and effective today.
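The basic loop behind this approach can be sketched in a few lines: generate a random input, feed it to the program under test, and flag only crashes (termination by a signal) or hangs (timeouts). The sketch below is a minimal, hypothetical illustration in Python, not the actual tools from the studies; `cat` merely stands in for a utility under test, and all names and parameters are illustrative.

```python
import random
import subprocess

def random_input(max_len=10000):
    """Generate a random byte string mixing printable and non-printable characters."""
    n = random.randint(1, max_len)
    return bytes(random.randrange(256) for _ in range(n))

def fuzz_once(cmd, timeout=5):
    """Feed one random input to a program; the oracle only checks for crashes or hangs."""
    data = random_input()
    try:
        proc = subprocess.run(cmd, input=data, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "hang", data
    if proc.returncode < 0:  # a negative return code means the process was killed by a signal
        return "crash", data
    return "ok", data

if __name__ == "__main__":
    failures = 0
    for i in range(100):
        verdict, data = fuzz_once(["cat"])  # 'cat' stands in for any utility under test
        if verdict != "ok":
            failures += 1
            # Save the failing input so the failure can be reproduced later.
            with open(f"failure-{i}.bin", "wb") as f:
                f.write(data)
    print(f"{failures} failures out of 100 runs")
```

The essential point is how little machinery is needed: no grammar, no instrumentation, and an oracle that ignores output entirely, checking only whether the program survived.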
In 1990, we published the results of a study of the reliability of standard UNIX application/utility
programs. This study showed that by using simple (almost simplistic) random testing techniques, we
could crash or hang 25-33% of these utility programs.
In 1995, we repeated and significantly extended
this study using the same basic techniques: subjecting programs to random input streams. This study
also included X-Window applications and servers. A distressingly large number of UNIX applications still
crashed under our tests, and X-Window applications were at least as unreliable as command-line applications.
The commercial versions of UNIX fared slightly better than in 1990, but the biggest surprise was that
Linux and GNU applications were significantly more reliable than the commercial versions.
In 2000, we took another stab at random testing, this time testing applications running on Microsoft
Windows. Given valid random mouse and keyboard input streams, we could crash or hang 45% (NT) to
64% (Win2K) of these applications.
In 2006, we continued the study, looking at both command-line and GUI-based applications on the
relatively new Mac OS X operating system. While the command-line tests had a reasonable 7% failure
rate, the GUI-based applications, from a variety of vendors, had a distressing 73% failure rate.
Recently, we decided to revisit our basic techniques on commonly used UNIX systems. We found that
these techniques are still effective and useful.
In this talk, I will discuss our testing techniques and then present the various test results in more detail.
These results include, in many cases, identification of the bugs and the coding practices that caused the
bugs. In several cases, these bugs introduced issues relating to system security. The talk will conclude
with some philosophical musings on the current state of software development.
Papers on the five studies (1990, 1995, 2000, 2006, and 2020), the software, and the bug reports can be
found at the UW fuzz home page:
http://www.cs.wisc.edu/~bart/fuzz/
Brief bio:
Barton Miller is the Vilas Distinguished Achievement Professor at UW-Madison.
Miller is a co-PI on the Trusted CI NSF Cybersecurity Center of Excellence, where he leads the
software assurance effort. His research interests include software security, in-depth vulnerability
assessment, binary and malicious code analysis and instrumentation, extreme scale systems, and
parallel and distributed program measurement and debugging. In 1988, Miller founded the field of
fuzz random software testing, which is the foundation of many security and software engineering
disciplines. In 1992, Miller (working with his then-student, Prof. Jeffrey Hollingsworth) founded the
field of dynamic binary code instrumentation and coined the term “dynamic instrumentation”.
Miller is a Fellow of the ACM and recently won the Jean-Claude Laprie Award in Dependable
Computing for his work on fuzz testing.
Miller was a member of the FAA VECTOR Task Force reviewing cybersecurity of the U.S. aviation
infrastructure. He was the chair of the Institute for Defense Analyses Center for Computing Sciences
Program Review Committee, a member of the U.S. National Nuclear Security Administration Los
Alamos and Lawrence Livermore National Labs Cyber Security Review Committee (POFMR), and a
member of the Los Alamos National Laboratory Computing, Communications and Networking
Division Review Committee. He has served on the U.S. Secret Service Electronic Crimes Task Force
(Chicago Area) and is currently an advisor to the Wisconsin National Guard 176th Cyber Protection
Team and the Wisconsin Security Research Consortium.