Powered by
3rd ACM International Fuzzing Workshop (FUZZING 2024),
September 16, 2024,
Vienna, Austria
3rd ACM International Fuzzing Workshop (FUZZING 2024)
Frontmatter
Welcome from the Chairs
It is our great pleasure to welcome you to the 3rd International Workshop on Fuzzing (FUZZING 2024), co-located with ISSTA in Vienna, Austria on 16 September 2024. This workshop is the continuation of a two-year series of workshops that introduced a preregistration-based publication process to our community. Similar to the last two years, this workshop hosts the presentations of the accepted drafts of the registered reports that were accepted as part of the first stage in a two-stage publication process. In the first stage, the program committee (PC) evaluates all submissions based on:
(i) the significance and novelty of the hypotheses or techniques and
(ii) the soundness and reproducibility of the methodology specified to validate the claims or hypotheses -- but explicitly not based on the strength of the (preliminary) results.
These draft registered reports are presented and improved at the FUZZING 2024 workshop in Vienna.
Keynotes
Reasons for the Unreasonable Success of Fuzzing (Keynote)
Thomas (Halvar Flake) Dullien
(Independent, France)
The hacker culture of my youth (90s) was a very typical male-centric teenage subculture, with norms and value systems that were at odds with broader society. In my particular corner of the culture, the term ‘fuzz-tester’ was used as a derogatory put-down for people that were unable to find bugs by reading code. I wrote my first fuzzer around the age of 19, not to use it myself, but as part of a paid gig where someone else needed one. I couldn’t bring myself to use it; my pride in my ability to audit code wouldn’t let me go there. The fuzzer turned out to be annoyingly effective. And over the course of my 20s, I saw more and more people find surprisingly important and relevant bugs through fuzzing. Being humbled by looking down on fuzzing for years, only to realize that I would’ve been much more effective if I had fuzzed harder and earlier, was one of the more important experiences in becoming “a reasonable adult”. This keynote will discuss some of the reasons why fuzzing is so effective, why fuzzing is always an automatic winner of the hardware lottery, some historical bugs that were fuzzed (from OpenSSH to Cisco IPSec) and why brute-force exploration is an important component of all AI (even if it might not be a component of biological intelligence).
@InProceedings{FUZZING24p1,
author = {Thomas (Halvar Flake) Dullien},
title = {Reasons for the Unreasonable Success of Fuzzing (Keynote)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {1--1},
doi = {10.1145/3678722.3695730},
year = {2024},
}
Publisher's Version
Is “AI” Useful for Fuzzing? (Keynote)
Brendan Dolan-Gavitt
(New York University, USA)
Discussion of AI and its applications to security seems unavoidable nowadays, and, alas, this keynote is no exception. But is it actually useful for problems we care about, like fuzzing? In classic academic fashion I will answer “maybe” at great length, but hopefully with enough concrete examples and references to actual code that the talk will be worth listening to. I will cover: 1) Places where it seems obviously misguided (input generation in the fuzzing loop); 2) Areas where it seems to have demonstrable benefits (harness generation); and 3) Promising future directions (generating input seeds, evolving input seed generators).
@InProceedings{FUZZING24p2,
author = {Brendan Dolan-Gavitt},
title = {Is “AI” Useful for Fuzzing? (Keynote)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {2--2},
doi = {10.1145/3678722.3695731},
year = {2024},
}
Publisher's Version
Papers
The Havoc Paradox in Generator-Based Fuzzing (Registered Report)
Ao Li,
Madonna Huang,
Caroline Lemieux, and
Rohan Padhye
(Carnegie Mellon University, USA; University of British Columbia, Canada)
Parametric generators are a simple way to combine coverage-guided and generator-based fuzzing. Parametric generators can be thought of as decoders of an arbitrary byte sequence into a structured input. This allows mutations on the byte sequence to map to mutations on the structured input, without requiring the writing of specialized mutators. However, this technique is prone to the havoc effect, where small mutations on the byte sequence cause large, destructive mutations to the structured input. This registered report first provides a preliminary investigation of the paradoxical nature of the havoc effect for generator-based fuzzing in Java. In particular, we measure mutation characteristics and confirm the existence of the havoc effect, as well as scenarios where it may be more detrimental. The proposed evaluation extends this investigation over more benchmarks, with the tools Zest, JQF’s EI, BeDivFuzz, and Zeugma.
@InProceedings{FUZZING24p3,
author = {Ao Li and Madonna Huang and Caroline Lemieux and Rohan Padhye},
title = {The Havoc Paradox in Generator-Based Fuzzing (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {3--12},
doi = {10.1145/3678722.3685529},
year = {2024},
}
Publisher's Version
Visualization Task Taxonomy to Understand the Fuzzing Internals (Registered Report)
Sriteja Kummita,
Miao Miao,
Eric Bodden, and
Shiyi Wei
(Fraunhofer IEM, Germany; University of Texas at Dallas, USA; University of Paderborn, Germany)
Greybox fuzzing is used extensively in research and practice. There
are umpteen improvements proposed in the literature to improve
greybox fuzzing. However, to what extent do these improvements
affect the internal components (or internals) of a given fuzzer is
not yet understood as the improvements are mostly evaluated in
terms of code coverage and bug finding capability. Such an evaluation
is insufficient to understand the effect of improvements on
the internals of fuzzer. Some of the literature developed tools to
visualize the outcomes of the fuzzing to enhance the understanding.
However, they only focus on high-level information and no previous
research on visualization has been dedicated to understanding
fuzzing internals.
To close this gap, we propose the first step towards the development
of a fuzzing-specific visualization framework: a taxonomy
of visualization analysis tasks that fuzzing experts desire to help
them understand the internals of fuzzing. Our approach involves
conducting semi-structured interviews with fuzzing experts and
using qualitative data analysis to systematically extract the task
taxonomy from the interview data. We also evaluate the support
of existing visualization tools for fuzzing through the lens of our
taxonomy. In our pilot study, we conducted interviews with six
fuzzing experts and extracted a preliminary taxonomy. We aim to
conduct another 20 interviews to gain more insights and make the
taxonomy more robust at Phase 2.
@InProceedings{FUZZING24p13,
author = {Sriteja Kummita and Miao Miao and Eric Bodden and Shiyi Wei},
title = {Visualization Task Taxonomy to Understand the Fuzzing Internals (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {13--22},
doi = {10.1145/3678722.3685530},
year = {2024},
}
Publisher's Version
WebAssembly as a Fuzzing Compilation Target (Registered Report)
Florian Bauckholt and
Thorsten Holz
(CISPA Helmholtz Center for Information Security, Germany)
By monitoring the execution of the program under test, fuzzers can gather feedback on how different inputs affect the program’s behavior and detect crashes and other abnormal behaviors. To achieve these objectives, fuzzers typically rely on a static instrumentation phase, which can be cumbersome to extend and experiment with. In this paper, we explore a different strategy: By compiling to a common high-level compilation target, we can retain most of the instrumentation opportunities with the potential for dynamic instrumentation. Compiling to an intermediate target disentangles instrumentation from the harness build process and produces fuzzer-independent harness artifacts. More specifically, we propose to use WebAssembly (WASM) as a suitable target due to its widespread language support, deterministic and isolated nature, and simple and easy-to-JIT instruction set. To explore this approach, we present and discuss WasmFuzz, a fuzzer for WebAssembly binaries that bridges the gap between native and WASM fuzzing. To enable meaningful WebAssembly fuzzer comparisons, we demonstrate a generic way to retrofit WASM modules into source-based fuzzers through wasm2c. This approach already raises the performance baseline of WebAssembly fuzzing significantly. In our preliminary evaluation, WasmFuzz achieves, on average, more basic blocks per target compared to other WebAssembly fuzzers and seems competitive with native setups like cargo-fuzz (LibFuzzer).
@InProceedings{FUZZING24p23,
author = {Florian Bauckholt and Thorsten Holz},
title = {WebAssembly as a Fuzzing Compilation Target (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {23--32},
doi = {10.1145/3678722.3685531},
year = {2024},
}
Publisher's Version
Directed or Undirected: Investigating Fuzzing Strategies in a CI/CD Setup (Registered Report)
Madonna Huang and
Caroline Lemieux
(University of British Columbia, Canada)
Fuzzing best practices suggest that fuzzing should be run for at least 24 hours, if not longer. This recommendation makes it hard to integrate fuzzing into CI/CD contexts, to rapidly check a commit for bugs. Existing studies on CI/CD fuzzing simulated a CI/CD environment by running undirected fuzzers on Magma benchmark programs, which have multiple bugs injected into a single version of the program. Directed fuzzers, such as AFLGo, aim to generate inputs that reach specific target locations in the program being fuzzed. Thus, they should be more effective at fuzzing in a CI/CD environment. In this study, we propose to evaluate both directed and undirected fuzzers in a simulated CI/CD environment. Like prior work, we will use Magma as a source of benchmarks, and run fuzzers for 10 minutes. Unlike prior work, we will start the fuzzing process from a saturated corpus, rather than Magma's default corpus. Also unlike prior work, we will run the fuzzers on versions of Magma programs with a single bug injected. To deal with the threat that Magma patches give directed fuzzers access to too precise information as to the bug location, we will also conduct experiments where we add additional lines of target code, to evaluate the sensitivity of directed fuzzers. Our registered report gives preliminary results on a small subset of benchmarks.
@InProceedings{FUZZING24p33,
author = {Madonna Huang and Caroline Lemieux},
title = {Directed or Undirected: Investigating Fuzzing Strategies in a CI/CD Setup (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {33--41},
doi = {10.1145/3678722.3685532},
year = {2024},
}
Publisher's Version
LOOL: Low-Overhead, Optimization-Log-Guided Compiler Fuzzing (Registered Report)
Florian Schwarcz,
Felix Berlakovich,
Gergö Barany, and
Hanspeter Mössenböck
(JKU Linz, Austria; University of the Bundeswehr Munich, Germany; Oracle Labs, Austria)
Compiler fuzzing with randomly generated input programs is a powerful technique for finding compiler crashes and miscompilation bugs. Existing fuzzers for compilers are often unguided and must be manually parameterized to cover different parts of the compiler under test.
In this work we present LOOL, an approach for fuzzing a compiler with low overhead, guided by optimization log information produced by the compiler. The optimization log tracks program transformations performed by the compiler on the level of individual methods compiled. We argue that using the optimization log has less overhead than off-the-shelf code coverage tools. At the same time, the optimization log's per-method data gives more information than code coverage collected over a number of distinct compilations. The level of detail of the optimization log is also easy to tune for the use case of guiding a fuzzer.
We are integrating the LOOL approach in an existing fuzzer for the GraalVM compiler. A genetic optimization algorithm uses optimization log information for tuning code generation parameters with the goal of covering optimizations that were previously rarely exercised. Initial experiments confirm that varying the generator's parameters is effective at finding new bugs. The genetic algorithm will automate the exploration of the parameter space to improve testing of currently insufficiently fuzzed parts of the compiler.
@InProceedings{FUZZING24p42,
author = {Florian Schwarcz and Felix Berlakovich and Gergö Barany and Hanspeter Mössenböck},
title = {LOOL: Low-Overhead, Optimization-Log-Guided Compiler Fuzzing (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {42--51},
doi = {10.1145/3678722.3685533},
year = {2024},
}
Publisher's Version
Effective Fuzzing within CI/CD Pipelines (Registered Report)
Arindam Sharma,
Cristian Cadar, and
Jonathan Metzman
(Imperial College London, United Kingdom; Google, USA)
Deploying fuzzing within CI/CD pipelines can help ensure safe and secure code evolution.
Directed greybox fuzzing techniques such as AFLGo are a good match for the CI/CD context.
These techniques prioritise inputs based on estimated distances to the changed code.
Unfortunately, computing these distances is often expensive, making the techniques impractical for short CI/CD runs.
In this paper, we propose an AFLGo-based technique called PaZZER, which optimises the distance calculation by dropping the expensive control-flow graph component and computing the call-graph component in an incremental fashion.
Preliminary results are promising, showing that PaZZER can make CI/CD testing feasible for large applications:
eg. for Objdump the distance computation time is decreased from 34 min to just 2.5 min, with a further 2.3 min saved when an incremental algorithm is used.
The significant time reduction in distance computation allows PaZZER to use most of the time on actual fuzzing, making it practical for short CI/CD runs of around 10 minutes.
Our planned full evaluation will involve real-world commits from a diverse set of 9 applications of different sizes.
This will include coverage experiments and an ablation study to investigate the impact of PaZZER's design decisions, and a bug-finding case study comparing it against AFLGo and Google's CIFuzz.
We will assess the benefits and effectiveness of our approach in terms of patch coverage, patch proximity, distance computation time, and time-to-exposure for bugs.
@InProceedings{FUZZING24p52,
author = {Arindam Sharma and Cristian Cadar and Jonathan Metzman},
title = {Effective Fuzzing within CI/CD Pipelines (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {52--60},
doi = {10.1145/3678722.3685534},
year = {2024},
}
Publisher's Version
Published Artifact
Artifacts Available
Sparse Symbolic Loop Execution (Registered Report)
Frank Busse,
Martin Nowack, and
Cristian Cadar
(Imperial College London, United Kingdom)
Dynamic symbolic execution is a powerful program analysis technique but is often limited by the path-explosion problem, particularly in the presence of heavily branching loops. In this paper, we introduce sparse symbolic loop execution (SSLE), a novel approach aimed at mitigating this issue.
SSLE observes the edge patterns of sibling states, spawned from the same loop, at program branches up to a pre-computed loop-impact barrier. States that exhibit unique patterns of taken edges are selected for further exploration, while others are postponed.
We implemented SSLE in a prototype called SparKLE and evaluated it on a set of 8 benchmarks against the popular symbolic execution engine KLEE. SSLE shows promising results and could increase line coverage by up to 55% in 1h runs using a depth-first-search heuristic. In other cases up to 99.997% of states could be postponed without any loss in coverage.
Our planned evaluation will include a diverse set of 50 real-world benchmarks and will aim to better understand the effectiveness of SparKLE in handling symbolic loops, and how it compares with less complex approaches.
@InProceedings{FUZZING24p61,
author = {Frank Busse and Martin Nowack and Cristian Cadar},
title = {Sparse Symbolic Loop Execution (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {61--69},
doi = {10.1145/3678722.3685535},
year = {2024},
}
Publisher's Version
Automated Feature Testing of Verilog Parsers using Fuzzing (Registered Report)
Quentin Corradi,
John Wickerson, and
George A. Constantinides
(Imperial College London, United Kingdom)
In this article we propose a methodology based on fuzzing to test which features are supported by pasers and register an experiment applying this methodology to SystemVerilog-consuming tools.
SystemVerilog is a hardware description, specification and verification language widely used in hardware design, and with an active standard committee.
Most SystemVerilog-consuming tools have incomplete support and support additional features.
These tools do not provide the list of features they support, so identifying commonly supported SystemVerilog features is complicated.
This hinders design portability and tool interoperability.
We think current efforts to test these tools' feature support are insufficient.
All of the previous points justify why SystemVerilog-consuming tools are a good candidate for our methodology.
We also provide the first (to our knowledge) open-source parser and fuzzer for Verilog with full support and compliance with the 2005 standard.
@InProceedings{FUZZING24p70,
author = {Quentin Corradi and John Wickerson and George A. Constantinides},
title = {Automated Feature Testing of Verilog Parsers using Fuzzing (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {70--79},
doi = {10.1145/3678722.3685536},
year = {2024},
}
Publisher's Version
Understanding and Improving Coverage Tracking with AFL++ (Registered Report)
Vasil Sarafov,
David Markvica,
Felix Berlakovich,
Matthias Bernad, and
Stefan Brunthaler
(University of the Bundeswehr Munich, Germany)
Coverage-based fuzzers track which program parts they visit when executing a specific input as a proxy measure to (1) guide the fuzzing process, and (2) explore the PUT's state space.
One way to record coverage progress is to enumerate basic block pairs (e.g., edges in the control-flow graph) and use them to index into a hash table that holds counters.
The counter is incremented every time a fuzzer's input exercises the corresponding edge.
Traditionally the coverage map has been a compact bitmap that fits the L2 CPU cache to reduce runtime overhead and boost fuzzing throughput.
In such a design where space is traded for speed, two sources of imprecision can arise: (1) collisions, and (2) arithmetic inaccuracies.
Collisions refer to the situation when two different basic block pairs hash to the same entry.
Imprecision arises since one pair is now counted together, but the fuzzer cannot tell one apart from the other.
Arithmetic inaccuracies refer to errors in the counting strategy.
For example, a monotonically incrementing counter inside the hash table can overflow.
This indicates a situation where high-frequency control-flow exceeds the predefined, expected maximum counter size (e.g., in loops).
Due to execution frequencies obeying exponential power laws, such overflows will affect a small number of hash table entries.
Another arithmetic inaccuracy results from range-based counters that capture only predefined frequency intervals (e.g., logarithmic counters).
In 2018, CollAFL examined how collisions impact precision, and presented a new hashing scheme to reduce the number of collisions.
CollAFL did not address the problem of arithmetic inaccuracies.
Furthermore, CollAFL considered only a single-core virtual machine, a limited set of benchmark programs, and did not explore hardware-specific effects (e.g., cache utilization for concurrent fuzzing processes).
This registered report aims at providing new insights of how collisions and arithmetic inaccuracies affect coverage tracking for fuzzing.
We propose experiments for multiple hardware architectures with different cache topologies, and a more diverse set of benchmark programs.
Leveraging the evaluation data, our aim is to determine precise architecture-aware settings for AFL++.
Furthermore, we plan to demonstrate an adaptive optimization strategy that optimizes the coverage map to collisions and counting strategies for a specific combination of the CPU architecture and PUT.
@InProceedings{FUZZING24p80,
author = {Vasil Sarafov and David Markvica and Felix Berlakovich and Matthias Bernad and Stefan Brunthaler},
title = {Understanding and Improving Coverage Tracking with AFL++ (Registered Report)},
booktitle = {Proc.\ FUZZING},
publisher = {ACM},
pages = {80--89},
doi = {10.1145/3678722.3685537},
year = {2024},
}
Publisher's Version
proc time: 3.71