ISSTA 2023 Workshops
32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2023)

2nd International Fuzzing Workshop (FUZZING 2023), July 17, 2023, Seattle, WA, USA

FUZZING 2023 – Proceedings


2nd International Fuzzing Workshop (FUZZING 2023)

Frontmatter

Title Page


Message from the Organizers
It is our great pleasure to welcome you to the 2nd International Workshop on Fuzzing (FUZZING 2023), co-located with ISSTA in Seattle, Washington, USA on 17 July 2023. This workshop continues last year's successful inaugural workshop, which introduced a preregistration-based publication process to our community. As last year, the workshop hosts presentations of the draft registered reports accepted in the first stage of a two-stage publication process. In the first stage, the program committee (PC) evaluates all submissions based on (i) the significance and novelty of the hypotheses or techniques and (ii) the soundness and reproducibility of the methodology specified to validate the claims or hypotheses -- but explicitly not on the strength of the (preliminary) results. These draft registered reports are presented and refined at the FUZZING 2023 workshop in Seattle.

FUZZING 2023 Organization


Keynotes

Three Colours of Fuzzing: Reflections and Open Challenges (Keynote)
Cristian Cadar
(Imperial College London, UK)
In this talk, I will reflect on my experiences designing and applying different forms of fuzzing (whitebox, greybox, and blackbox) to various types of software (file processing applications, network servers, compilers, document readers, etc.) and software engineering problems (patch testing, test suite augmentation, etc.).
While the goal of fuzzing is to find bugs, our objective as fuzzing researchers and practitioners should be to improve the reliability, security and quality of software. I therefore argue that we need to pay closer attention to how fuzzing is integrated into the software development process and how we can use fuzzing to help with other software engineering tasks.

Publisher's Version
Rich Coverage Signal and the Consequences for Scaling (Keynote)
Kostya Serebryany
(Google, USA)
Most existing fuzzing tools use edge coverage to identify interesting inputs and guide the expansion of the corpus. This coverage signal is convenient because it is bounded in size. Once fuzzing discovers all reachable edges, however, this form of coverage stops being useful. To keep providing useful guidance to the fuzzer, we can add additional signals, such as call stacks, bounded execution paths, arguments to comparison instructions, and signals derived from anomaly detection. Most of these signals can generate a large amount of data that the fuzzer needs to handle, which can have a drastic impact on the computational resources required. It is nonetheless tempting to use these rich signals: in the SiliFuzz project, we have used rich coverage signals to uncover bugs that would otherwise have remained hidden. In this talk, we will discuss approaches to scaling fuzzing with rich coverage signals in a new fuzzing engine called Centipede.
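The abstract's idea of enriching edge coverage with extra context can be sketched as follows. This is a toy illustration, not Centipede's actual implementation: the feature constructors and the `FeatureCorpus` class are hypothetical names, and the bucketing choices are assumptions; the point is only that richer features enlarge the feature set a novelty-seeking fuzzer maintains.

```python
# Toy sketch of "rich" coverage features: instead of tracking only edges,
# track tuples that combine an edge with extra context. The cost is a much
# larger feature set for the fuzzer to store and deduplicate.

def edge_feature(edge_id):
    return ("edge", edge_id)

def callstack_feature(edge_id, call_stack, depth=3):
    # Bound the context: hash only the top few frames of the call stack.
    return ("stack", edge_id, hash(tuple(call_stack[-depth:])))

def cmp_feature(pc, lhs, rhs):
    # Bucket the "distance" between comparison operands so near-matches
    # on the same instruction count as new signal.
    return ("cmp", pc, (lhs ^ rhs).bit_length())

class FeatureCorpus:
    """Keep an input only if it produced a feature never seen before."""
    def __init__(self):
        self.seen = set()
        self.corpus = []

    def consider(self, data, features):
        new = set(features) - self.seen
        if new:
            self.seen |= new
            self.corpus.append(data)
        return bool(new)
```

With plain edge features this degenerates to classic novelty search; adding stack or comparison features lets two inputs covering the same edges still be distinguished.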

Publisher's Version

Registered Reports

Grammar Mutation for Testing Input Parsers (Registered Report)
Bachir Bendrissou, Cristian Cadar, and Alastair F. Donaldson
(Imperial College London, UK)
Grammar-based fuzzing is an effective method for testing programs that consume structured inputs, particularly input parsers. A prerequisite of this method is to have a specification of the input format in the form of a grammar. Consequently, the success of a grammar-based fuzzing campaign is highly dependent on the available grammar. If the grammar does not accurately represent the input format, or if the system under test (SUT) does not conform strictly to that grammar, there may be an impedance mismatch between inputs generated via grammar-based fuzzing and inputs accepted by the SUT. Even if the SUT has been designed to strictly conform to the grammar, the SUT parser may exhibit vulnerabilities that would only be triggered by slightly invalid inputs. Grammar-based fuzzing, by construction, will not yield such edge case inputs. To overcome these limitations, we present Gmutator, an approach that mutates an input grammar and leverages the Grammarinator fuzzer to produce inputs conforming to the mutated grammars. As a result, Gmutator can find inputs that do not conform to the original grammar but are (wrongly) accepted by an SUT. In addition, Gmutator-generated inputs have the potential to increase SUT code coverage compared with the standard approach. We present preliminary results applying Gmutator to two JSON parsing libraries, where we are able to identify a few inconsistencies and observe an increase in covered code. We propose a plan for a full experimental evaluation over four different input formats—JSON, XML, URL and Lua—and twelve SUTs (three per input format).
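The core idea, mutating the grammar itself so that generated inputs fall slightly outside the original language, can be sketched in a few lines. This toy generator is only illustrative (Gmutator builds on Grammarinator; the grammar encoding and `mutate_grammar`/`generate` helpers below are assumptions of this sketch):

```python
import random

# Grammar: nonterminal -> list of alternatives; each alternative is a list
# of symbols. Strings in angle brackets are nonterminals.
GRAMMAR = {
    "<value>": [["<number>"], ["[", "<value>", "]"]],
    "<number>": [["<digit>"], ["<digit>", "<number>"]],
    "<digit>": [[d] for d in "0123456789"],
}

def mutate_grammar(grammar, rng):
    """Return a copy of the grammar with one random alternative altered
    (a symbol dropped or duplicated), so generated inputs may no longer
    conform to the original language."""
    g = {nt: [alt[:] for alt in alts] for nt, alts in grammar.items()}
    alt = rng.choice(g[rng.choice(list(g))])
    if len(alt) > 1:
        del alt[rng.randrange(len(alt))]   # drop one symbol
    else:
        alt.append(alt[0])                 # or duplicate it
    return g

def generate(grammar, symbol="<value>", rng=random, depth=0):
    if symbol not in grammar:
        return symbol
    # Bias towards the shortest alternative as depth grows, to terminate.
    alts = grammar[symbol]
    alt = sorted(alts, key=len)[0] if depth > 8 else rng.choice(alts)
    return "".join(generate(grammar, s, rng, depth + 1) for s in alt)
```

Feeding the same generator both the original and the mutated grammar then yields pairs of conforming and near-conforming inputs for differential testing of the parser.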

Publisher's Version
Novelty Not Found: Adaptive Fuzzer Restarts to Improve Input Space Coverage (Registered Report)
Nico Schiller, Xinyi Xu, Lukas Bernhard, Nils Bars, Moritz Schloegel, and Thorsten Holz
(CISPA Helmholtz Center for Information Security, Germany)
Feedback-driven greybox fuzzing is one of the cornerstones of modern bug detection techniques. Its flexibility, automated nature, and effectiveness render it an indispensable tool for making software more secure. A key feature that enables its impressive performance is coverage feedback, which guides the fuzzer to explore different parts of the program. The most prominent way to use this feedback is novelty search, in which the fuzzer generates new inputs and only keeps those that have exercised a new program edge. This is grounded in the assumption that novel coverage is a proxy for interestingness. Given the widespread success of novelty search, it is easy to overlook its limitations. In particular, the phenomenon of input shadowing, in which an “interesting” input is discarded because it contributes no novel coverage, needs to be considered. This phenomenon limits the explorable input space and risks missing bugs when shadowed inputs are more amenable to mutations that would trigger bugs.
In this work, we analyze input shadowing in more detail and find that multiple fuzzing runs of the same target exhibit different basic block hit frequencies despite overlapping code coverage. In other words, different fuzzing runs may find the same set of basic blocks, but one run might exercise specific basic blocks significantly more often than the other, and vice versa. To distribute hit frequencies more evenly, we propose restarting the fuzzer to reset its state, diversifying the fuzzer’s attention across basic blocks. Our preliminary evaluation on three FuzzBench targets finds that fuzzer restarts effectively distribute the basic block hit frequencies and boost the achieved coverage by up to 9.3%.
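The restart strategy described above can be sketched as a loop of independent novelty-search sessions that share only the aggregated coverage. This is a hypothetical skeleton under stated assumptions (a toy `target` callback returning the set of executed blocks, a single-byte mutator), not the authors' implementation:

```python
import random

def fuzz_session(target, seed, total_coverage, patience=100):
    """One fuzzing session: novelty search until no new block is found
    for `patience` consecutive attempts. Local state (corpus, RNG, hit
    counts) is discarded at the end; only total_coverage persists."""
    rng = random.Random(seed)
    corpus = [b""]
    stale = 0
    session_hits = {}                     # basic block -> hit count
    while stale < patience:
        data = mutate_input(rng.choice(corpus), rng)
        blocks = target(data)             # blocks executed by this input
        for b in blocks:
            session_hits[b] = session_hits.get(b, 0) + 1
        if blocks - total_coverage:       # novelty search
            total_coverage |= blocks
            corpus.append(data)
            stale = 0
        else:
            stale += 1
    return session_hits

def mutate_input(data, rng):
    pos = rng.randrange(len(data) + 1)
    return data[:pos] + bytes([rng.randrange(256)]) + data[pos + 1:]

def fuzz_with_restarts(target, restarts=3):
    total_coverage = set()
    per_session_hits = []
    for seed in range(restarts):          # each restart resets local state
        per_session_hits.append(fuzz_session(target, seed, total_coverage))
    return total_coverage, per_session_hits
```

Because each session rebuilds its corpus from scratch, the hit-frequency profile differs across sessions even when the covered blocks largely overlap, which is exactly the effect the report proposes to exploit.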

Publisher's Version
DiPri: Distance-Based Seed Prioritization for Greybox Fuzzing (Registered Report)
Ruixiang Qian, Quanjun Zhang, Chunrong Fang, and Zhenyu Chen
(Nanjing University, China)
Greybox fuzzing is a powerful testing technique. Given a set of initial seeds, greybox fuzzing continuously generates new test inputs to execute the program under test and gravitates executions towards rarely explored program regions, using code coverage as feedback. Seed prioritization is an important step of greybox fuzzing that prioritizes promising seeds for input generation. However, mainstream greybox fuzzers like AFL++ and Zest tend to neglect the importance of seed prioritization and simply pick seeds in the order they were queued, or rely on an approach involving randomness, which may consequently degrade their performance. In this paper, we propose a novel distance-based seed prioritization approach named DiPri to facilitate greybox fuzzing. Specifically, DiPri calculates the distances among seeds and preferentially selects the ones farthest from the others, improving the probability of discovering previously unexplored regions. For a preliminary evaluation, we integrate DiPri into AFL++ and Zest and conduct experiments on eight fuzz targets (four in C/C++ and four in Java). We also consider six configurations, i.e., three prioritization modes multiplied by two distance measures, to investigate how different prioritization timings and measures affect DiPri. The experimental results show that, compared to the default seed prioritization approaches of AFL++ and Zest, DiPri covers 1.87%∼13.86% more edges in three out of four C/C++ fuzz targets and 0.29%∼4.97% more edges in the four Java fuzz targets with certain configurations. The results highlight the potential of facilitating greybox fuzzing with distance-based seed prioritization.
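The general shape of distance-based prioritization can be sketched in a few lines. This is a toy illustration under assumptions of this sketch (byte-histogram features, Euclidean distance, average-distance isolation), not DiPri's actual modes or measures:

```python
# Pick the queue entry farthest from the rest: represent each seed by a
# small feature vector (here, a byte-value histogram) and rank seeds by
# their average distance to all other seeds.

def features(seed, buckets=16):
    hist = [0] * buckets
    for b in seed:
        hist[b * buckets // 256] += 1
    return hist

def euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def pick_farthest(seeds):
    vecs = [features(s) for s in seeds]
    def isolation(i):
        # Average distance from seed i to every other seed in the queue.
        return sum(euclidean(vecs[i], vecs[j])
                   for j in range(len(vecs)) if j != i) / (len(vecs) - 1)
    return max(range(len(seeds)), key=isolation)
```

The intuition matches the abstract: a seed far from its neighbors is more likely to sit near unexplored regions, so mutating it first diversifies exploration.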

Publisher's Version
Large Language Models for Fuzzing Parsers (Registered Report)
Joshua Ackerman and George Cybenko
(Dartmouth College, USA)
Ambiguity in format specifications is a significant source of software vulnerabilities. In this paper, we propose a natural language processing (NLP) driven approach that implicitly leverages the ambiguity of format specifications to generate instances of a format for fuzzing. We employ a large language model (LLM) to recursively examine a natural language format specification and generate instances from the specification for use as strong seed examples for a mutation fuzzer. Preliminary experiments show that our method outperforms a basic mutation fuzzer and is capable of synthesizing examples from novel handwritten formats.
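The recursive examination described above might look like the following skeleton. Everything here is hypothetical: `ask_llm` is a stand-in for whatever LLM API the authors use (stubbed so the control flow runs offline), and the spec encoding is an assumption of this sketch:

```python
def ask_llm(prompt):
    # Placeholder for a real LLM call; returns a canned example so the
    # skeleton is self-contained and runnable without network access.
    return '{"example": 1}'

def seeds_from_spec(section, depth=0, max_depth=2):
    """Recursively walk a natural-language specification, asking for one
    conforming example per (sub)section, and collect the answers as
    candidate seed inputs for a mutation fuzzer."""
    seeds = [ask_llm("Give one example input conforming to: "
                     + section["text"])]
    if depth < max_depth:
        for sub in section.get("subsections", []):
            seeds += seeds_from_spec(sub, depth + 1, max_depth)
    return seeds
```

The resulting seeds would then be handed to a mutation fuzzer as its initial corpus, which is where the approach's leverage over a basic mutation fuzzer comes from.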

Publisher's Version
CrabSandwich: Fuzzing Rust with Rust (Registered Report)
Addison Crump, Dongjia Zhang, Syeda Mahnur Asif, Dominik Maier, Andrea Fioraldi, Thorsten Holz, and Davide Balzarotti
(CISPA Helmholtz Center for Information Security, Germany; EURECOM, France; TU Berlin, Germany)
The Rust programming language is one of the fastest-growing programming languages, thanks to its unique blend of high-performance execution and memory safety. Still, programs implemented in Rust can contain critical bugs. Apart from logic bugs and crashes, code in unsafe blocks can still trigger memory corruption. To find these, the community uses traditional fuzzers like libFuzzer or AFL++, in combination with Rust-specific macros. Of course, the fuzzers themselves are still written in memory-unsafe languages.
In this paper, we explore the possibility of replacing the input generators with Rust, while staying compatible with existing harnesses. Based on the Rust fuzzing library LibAFL, we develop CrabSandwich, a drop-in replacement for the C++ component of cargo-fuzz. We evaluate our tool, written in Rust, against the original fuzzer, libFuzzer. We show that we are not only able to successfully fuzz all three targets we tested with CrabSandwich, but also outperform cargo-fuzz in bug coverage. During our preliminary evaluation, we already uncovered new bugs in the pdf crate that cargo-fuzz could not find, demonstrating the real-world applicability of our approach and giving us high hopes for the planned follow-up evaluations.

Publisher's Version
Beyond the Coverage Plateau: A Comprehensive Study of Fuzz Blockers (Registered Report)
Wentao Gao, Van-Thuan Pham, Dongge Liu, Oliver Chang, Toby Murray, and Benjamin I.P. Rubinstein
(University of Melbourne, Australia; Google, Australia)
Fuzzing, and particularly code coverage-guided greybox fuzzing, is highly successful in automated vulnerability discovery, as evidenced by the multitude of vulnerabilities uncovered in real-world software systems. However, results on large benchmarks such as FuzzBench indicate that state-of-the-art fuzzers often reach a plateau after a certain period, typically around 12 hours. With the aid of the newly introduced FuzzIntrospector platform, this study aims to analyze and categorize the fuzz blockers that impede the progress of fuzzers. Such insights can shed light on future fuzzing research, suggesting areas that require further attention. Our preliminary findings reveal that the majority of top fuzz blockers are not directly related to the program input, emphasizing the need for enhanced techniques in automated fuzz driver generation and modification.

Publisher's Version
InFuzz: An Interactive Tool for Enhancing Efficiency in Fuzzing through Visual Bottleneck Analysis (Registered Report)
Qian Yan, Huayang Cao, Shuaibing Lu, and Minhuan Huang
(National Key Laboratory of Science and Technology on Information System Security, China)
Despite the effectiveness of current fuzzing methods, fully automated fuzzing techniques still face an important challenge: overcoming complex code constraints to achieve high coverage and find new vulnerabilities. As a result, experts are increasingly inserting themselves into the fuzzing workflow to look for defects. In this context, current state-of-the-art fuzzing methods are of limited help in improving the efficiency of human-assisted fuzzing. We therefore introduce an interactive tool called InFuzz that helps humans better understand and intervene in the fuzzing process through visual bottleneck analysis. InFuzz extracts information from source code and runtime coverage, maps blocking branches encountered during testing to source code lines, and identifies potential inputs to blocking branches through dynamic data flow analysis; the results are presented to the tester as HTML web pages. In addition, it provides code annotation techniques for better intervention in fuzzing. Using InFuzz, testers can focus their attention on blocking constraints and learn their semantic context and associated input sources to better design code annotations, construct new input seeds, or update test drivers.

Publisher's Version
