
Fuzzing is a testing technique that feeds randomized inputs to code that accepts user input, in order to find problematic edge cases or security problems. Go package developers can use Dmitry Vyukov's popular go-fuzz tool for fuzz testing their code; it has found hundreds of obscure bugs in the standard library as well as in third-party packages. However, this tool is not built in, and is not as simple to use as it could be; to address this, Go team member Katie Hockman recently published a draft design that proposes adding fuzz testing as a first-class feature of the standard go test command.

Using random test inputs to find bugs has a history that goes back to the days of punch cards. Author and long-time programmer Gerald Weinberg recollects:

We didn’t call it fuzzing back in the 1950s, but it was our standard practice to test programs by inputting decks of punch cards taken from the trash. We also used decks of random number punch cards. We weren’t networked in those days, so we weren’t much worried about security, but our random/trash decks often turned up undesirable behavior.

More recently, fuzz testing has been used to find countless bugs, and some notable security issues, in software from Bash and libjpeg to the Linux kernel, using tools such as american fuzzy lop (AFL) and Vyukov’s Go-based syzkaller tool.

The basic idea of fuzz testing is to generate random inputs for a function to see if it crashes or raises an exception that is not part of the function’s API. However, using a naive method to generate random inputs is extremely time-consuming, and doesn’t find edge cases efficiently. That is why most modern fuzzing tools use “coverage-guided fuzzing” to drive the testing and determine whether newly-generated inputs are executing new code paths. Vyukov co-authored a proposal which has a succinct description of how this technique works:

    start with some (potentially empty) corpus of inputs
    for {
        choose a random input from the corpus
        mutate the input
        execute the mutated input and collect code coverage
        if the input gives new coverage, add it to the corpus
    }

Collecting code coverage data and detecting when an input “gives new coverage” is not trivial; it requires a tool to instrument code with special calls to a coverage recorder. When the instrumented code runs, the fuzzing framework compares code coverage from previous test inputs with coverage from a new input, and if different code blocks have been executed, it adds that new input to the corpus. Obviously this glosses over a lot of details, such as how the input is mutated, how exactly the coverage instrumentation works, and so on. But the basic technique is effective: AFL has used it on many C and C++ programs, and has a section on its web page listing the huge number of bugs found and fixed.
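To make the loop above concrete, here is a toy, self-contained sketch in Go. The coverage function, the single-byte mutator, and the block numbering are all invented stand-ins (a real fuzzer gets coverage from compiler instrumentation, and real mutators are far more sophisticated); only the corpus-growing logic mirrors the technique:

```go
package main

import (
	"fmt"
	"math/rand"
)

// toyCoverage simulates instrumented coverage for a hypothetical
// parser: it reports which numbered "blocks" an input would execute.
func toyCoverage(input []byte) map[int]bool {
	cov := map[int]bool{0: true} // block 0: function entry
	for i, b := range input {
		switch {
		case b == '(':
			cov[1] = true // block 1: open paren seen
		case b == ')':
			cov[2] = true // block 2: close paren seen
		case b >= '0' && b <= '9':
			cov[3] = true // block 3: digit seen
		}
		if i > 4 {
			cov[4] = true // block 4: long-input path
		}
	}
	return cov
}

// mutate returns a copy of the input with one random byte changed
// (a deliberately minimal mutator).
func mutate(r *rand.Rand, in []byte) []byte {
	out := append([]byte(nil), in...)
	if len(out) == 0 {
		return []byte{byte(r.Intn(256))}
	}
	out[r.Intn(len(out))] = byte(r.Intn(256))
	return out
}

func main() {
	r := rand.New(rand.NewSource(1))
	corpus := [][]byte{[]byte("0")} // seed corpus
	seen := map[int]bool{}
	for i := 0; i < 1000; i++ {
		in := mutate(r, corpus[r.Intn(len(corpus))])
		cov := toyCoverage(in)
		grew := false
		for blk := range cov {
			if !seen[blk] {
				seen[blk] = true
				grew = true
			}
		}
		if grew { // new coverage: keep this input
			corpus = append(corpus, in)
		}
	}
	fmt.Println("corpus size:", len(corpus), "blocks covered:", len(seen))
}
```

Even with this crude mutator, the corpus quickly accumulates inputs that reach the parenthesis, digit, and long-input paths, which is exactly why the technique finds edge cases so much faster than purely random generation.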

The go-fuzz tool

AFL is an excellent tool, but it only works for programs written in C, C++, or Objective C, which need to be compiled with GCC or Clang. Vyukov's go-fuzz tool operates in a similar way to AFL, but is written specifically for Go. In order to add coverage recording to a Go program, a developer first runs the go-fuzz-build command (instead of go build), which uses the standard library's go/ast package to add instrumentation to each block in the source code, and sends the result through the regular Go compiler. Once the instrumented binary has been built, the go-fuzz command runs it over and over on multiple CPU cores with randomly mutating inputs, recording any crashes (along with their stack traces and the inputs that caused them) as it goes.

Damian Gryski has written a tutorial showing how to use the go-fuzz tool in more detail. As mentioned, the go-fuzz README lists the many bugs it has found, however, there are almost certainly many more in third-party packages that have not been listed there; I personally used go-fuzz on GoAWK and it found several “crashers”.

Journey to first class

Go has a built-in command, go test, that automatically finds and runs a project's tests (and, optionally, benchmarks). Fuzzing is a type of testing, but without built-in tool support it is somewhat cumbersome to set up. Back in February 2017, an issue was filed on the Go GitHub repository on behalf of Vyukov and Konstantin Serebryany, proposing that the go tool "support fuzzing natively, just like it does tests and benchmarks and race detection today". The issue notes that "go-fuzz exists but it's not as easy as writing tests and benchmarks and running go test -race". This issue has garnered a huge amount of support and many comments.

At some point Vyukov and others added a motivation document as well as the API and tooling proposal for what such an integration would look like. Go tech lead Russ Cox pressed for a prototype version of "exactly what you want the new go test fuzz mode to be". In January 2019 "thepudds" shared just that — a tool called fzgo that implements most of the original proposal in a separate tool. This was well-received at the time, but does not seem to have been developed further.

More recently, however, the Go team has picked this idea back up, with Hockman writing the recent draft design for first-class fuzzing. The goal is similar, to make it easy to run fuzz tests with the standard go test tool, but the proposed API is slightly more complex to allow seeding the initial corpus programmatically and to support input types other than byte strings (“slice of byte” or []byte in Go).

Currently, developers can write test functions with the signature TestFoo(t *testing.T) in a *_test.go source file, and go test will automatically run those functions as unit tests. The existing testing.T type is passed to test functions to control the test and record failures. The new draft design adds the ability to write FuzzFoo(f *testing.F) fuzz tests in a similar way and then run them using a simple command like go test -fuzz. The proposed testing.F type is used to add inputs to the seed corpus and implement the fuzz test itself (using a nested anonymous function). Here is an example that might be part of calc_test.go for a calculator library:

    func FuzzEval(f *testing.F) {
        // Seed the initial corpus
        f.Add("1+2")
        f.Add("1+2*3")
        f.Add("(1+2)*3")

        // Run the fuzz test
        f.Fuzz(func(t *testing.T, expr string) {
            t.Parallel()      // allow parallel execution
            _, _ = Eval(expr) // function under test (discard result and error)
        })
    }

Just these few lines of code form a basic fuzz test that will run the calculator library’s Eval() function with randomized inputs and record any crashes (“panics” in Go terminology). Some examples of panics are out-of-bounds array access, dereferencing a nil pointer, or division by zero. A more involved fuzz test might compare the result against another library (called calclib in this example):

        ...

        // Run the fuzz test
        f.Fuzz(func(t *testing.T, expr string) {
            t.Parallel()
            r1, err := Eval(expr)
            if err != nil {
                t.Skip() // got parse error, skip rest of test
            }

            // Compare result against calclib
            r2, err := calclib.Eval(expr)
            if err != nil {
                t.Errorf("Eval succeeded but calclib had error: %v", err)
            }
            if r1 != r2 {
                t.Errorf("Eval got %d, calclib got %d", r1, r2)
            }
        })
    }

In addition to describing fuzzing functions and the new testing.F type, Hockman's draft design proposes that a new coverage-guided fuzzing engine be built that "will be responsible for using compiler instrumentation to understand coverage information, generating test arguments with a mutator, and maintaining the corpus". Hockman makes it clear that this would be a new implementation, but would draw heavily from existing work (go-fuzz and fzgo). The mutator would generate new randomized inputs (the "generated corpus") from existing inputs, and would work automatically for built-in types or structs composed of built-in types. Other types would also be supported if they implemented the existing BinaryUnmarshaler or TextUnmarshaler interfaces.

By default, the engine would run fuzz tests indefinitely, stopping a particular test run when the first crash is found. Users will be able to tell it to run for a certain duration with the -fuzztime command line flag (for use in continuous integration scripts), and tell it to keep running after crashes with the -keepfuzzing flag. Crash reports will be written to files in a testdata directory, and will contain the inputs that caused the crash as well as the error message or stack trace.
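Putting those flags together, invocations under the proposal might look like the following (the FuzzEval pattern argument is an assumption; only -fuzz, -fuzztime, and -keepfuzzing come from the draft design):

    go test -fuzz FuzzEval                 # run until the first crash
    go test -fuzz FuzzEval -fuzztime 10m   # bounded run, e.g. for CI
    go test -fuzz FuzzEval -keepfuzzing    # keep running after crashes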

Discussion and what’s next

As with the recent draft design on filesystems and file embedding, official discussion for this design was done using a Reddit thread; overall, the feedback was positive.

There was some discussion about the testing.F interface. David Crawshaw suggested that it should implement the existing testing.TB interface for consistency with testing.T and testing.B (used for benchmarking); Hockman agreed, updating the design to reflect that. Based on a suggestion by "etherealflaim", Hockman also updated the design to avoid reusing testing.F in both the top level and the fuzz function. There was also some bikeshedding over whether the command should be spelled go test -fuzz or go fuzz; etherealflaim suggested that reusing go test would be a bad idea because it "has history and lots of folks have configured timeouts for it and such".

Jeremy Bowers recommended that the mutation engine should be pluggable:

I think the fuzz engine needs to be pluggable. Certainly a default one can be shipped, and pluggability can even be pushed to a “version 2”, but I think it ought to be in the plan. Fuzzing can be one-size-fits-most but there’s always going to be the need for more specialized stuff.

Hockman, however, responded that pluggability is not required in order to add the feature, but might be "considered later in the design phase".

The draft design states up front that "the goal of circulating this draft design is to collect feedback to shape an intended eventual proposal", so it's hard to say exactly what the next steps will be and when they will happen. However, it is good to see some official energy being put behind this from the Go team. Based on Cox's feedback on Vyukov's original proposal, my guess is that we'll see a prototype of the updated proposal being developed on a branch, or in a separate tool that developers can run, similar to fzgo.

Discussion on the Reddit thread is ongoing, so it seems unlikely that a formal proposal and an implementation for a feature this large would be ready when the Go 1.16 release freeze hits in November 2020. Inclusion in Go 1.17, due out in August 2021, would be more likely.

Index entries for this article
GuestArticles Hoyt, Ben