We're currently working on a project called Tree of AST, inspired by Tree of Thoughts. We've observed that when hunting for vulnerabilities in open-source projects or conducting white-box testing, many static analysis tools already pinpoint the sinks: threat functions like eval or system in Python, or other sensitive functions capable of code execution or file operations. Yet in practice, the bulk of our time is spent tracing backward from these sensitive functions to locate the source, the origin of the input. Many high-risk functions may surface, but not all of them are exploitable: perhaps the input cannot reach the dangerous function at all, or the conditional checks sitting between the input and the dangerous function prevent the payload from getting through.

That's why we built Tree of AST: a Tree of Thoughts-based dataflow analysis methodology.

First, we locate dangerous functions. We still rely on off-the-shelf static analysis tools such as Bandit to identify dangerous sensitive functions: eval in Python, file-sensitive operations, and so on. Second, we construct the call graph by parsing the source files, mapping each function's calling relationships together with symbol resolution. This enables backward taint propagation from the variables consumed by a sensitive function along its call chains. The result is a tree structure: the root is the sensitive function, and the branches represent the call chains that lead into it.
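As a rough illustration of this phase, here is a minimal Python sketch that parses a toy module with the standard ast module, records which functions call which, and prints the backward call tree rooted at an eval sink. The source snippet and the helper names are invented for the example; the real implementation resolves symbols across files and handles far more call shapes.

```python
import ast
from collections import defaultdict

# Toy module: handler() feeds user input into run(), which calls eval().
SOURCE = '''
def handler(request):
    run(request["cmd"])

def run(cmd):
    eval(cmd)
'''

def build_caller_map(tree):
    """Map each callee name to the set of functions that call it."""
    callers = defaultdict(set)
    for fn in ast.walk(tree):
        if isinstance(fn, ast.FunctionDef):
            for node in ast.walk(fn):
                if isinstance(node, ast.Call):
                    callee = getattr(node.func, "id", getattr(node.func, "attr", None))
                    if callee:
                        callers[callee].add(fn.name)
    return callers

def print_backward_tree(sink, callers, depth=0, seen=frozenset()):
    """Print the call tree rooted at the sink, one layer of callers at a time."""
    print("  " * depth + sink)
    for caller in sorted(callers.get(sink, ())):
        if caller not in seen:
            print_backward_tree(caller, callers, depth + 1, seen | {sink})

callers = build_caller_map(ast.parse(SOURCE))
print_backward_tree("eval", callers)   # eval -> run -> handler
```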

Now comes the intriguing part. Some of you in the audience might ask: Tree of Thoughts, like Chain of Thought, is fundamentally an LLM-driven reasoning paradigm. So what is its connection to dataflow analysis, or to program analysis tasks in general?

Reading the Tree of Thoughts paper, we discovered a compelling idea: introducing a second large language model to vote on intermediate thoughts and to prune the search dynamically. The original purpose of this pruning strategy is to elect the reasoning paths most favorable to the task at hand.

We can transplant this methodology into dataflow analysis to elect the most viable dangerous call chains from a threat function back to its source inputs.

Taking the call tree rooted at the dangerous sensitive function that we constructed in the previous phase, we start from the dangerous function and traverse the tree layer by layer. At each elected node we hold an election among the next layer of callers, and we continue until we reach the data entry point or determine that the sink cannot be triggered.
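Here is a minimal sketch of that election loop, under the assumption that the second LLM is replaced by a stub scoring function so the example runs on its own. The toy caller map and the names ask_llm_votes and elect_chain are invented for illustration.

```python
CALLERS = {                       # function -> its callers, one layer up
    "eval": ["run", "log_expr"],
    "run": ["handler"],
    "log_expr": [],               # dead end: never sees user input
    "handler": [],                # the data entry point
}
ENTRY_POINTS = {"handler"}

def ask_llm_votes(chain, candidates):
    """Stub for the second LLM that votes on which caller keeps taint flowing."""
    return {c: (1.0 if CALLERS.get(c) or c in ENTRY_POINTS else 0.0)
            for c in candidates}

def elect_chain(sink):
    chain = [sink]
    while chain[-1] not in ENTRY_POINTS:
        candidates = CALLERS.get(chain[-1], [])
        if not candidates:
            return None                              # non-triggerable on this path
        votes = ask_llm_votes(chain, candidates)
        chain.append(max(votes, key=votes.get))      # elect the top-voted caller
    return chain

print(elect_chain("eval"))                           # ['eval', 'run', 'handler']
```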

We implement several intriguing tricks. Lookahead: during an election, we pre-reveal the call tree rooted at each candidate caller. Depth-based voting: we weight each vote by the depth of that candidate caller's own call tree. The rewind mechanism: when we discover that the currently elected call chain cannot reach the data entry point, we allow the voters to rewind, essentially rolling back to a previous voting state.
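To show how the three tricks could slot into the same toy example, here is another hedged sketch. The depth-weighting factor, the scoring formula, and the rollback policy are placeholders rather than the framework's actual values, and the LLM vote is again stubbed out.

```python
CALLERS = {"eval": ["run", "log_expr"], "run": ["handler"],
           "log_expr": [], "handler": []}
ENTRY_POINTS = {"handler"}

def subtree_depth(fn, seen=frozenset()):
    """Lookahead: depth of the call tree rooted at a candidate caller."""
    callers = [c for c in CALLERS.get(fn, []) if c not in seen]
    return 1 + max((subtree_depth(c, seen | {fn}) for c in callers), default=0)

def score(candidate, vote):
    """Depth-based voting: weight the LLM's vote by the candidate's lookahead depth."""
    return vote * (1 + 0.5 * subtree_depth(candidate))

def elect_with_rewind(sink, llm_vote=lambda c: 1.0):
    chain, rejected = [sink], set()
    while chain[-1] not in ENTRY_POINTS:
        candidates = [c for c in CALLERS.get(chain[-1], []) if c not in rejected]
        if not candidates:               # dead end: rewind to the previous state
            rejected.add(chain.pop())
            if not chain:
                return None
            continue
        chain.append(max(candidates, key=lambda c: score(c, llm_vote(c))))
    return chain

print(elect_with_rewind("eval"))         # ['eval', 'run', 'handler']
```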

Finally, assuming the above methodology successfully infers a path from the taint function (the sensitive function) to the data entry point, we don't stop there. While developing this methodology, we also engineered an approach for generating a proof-of-concept or exploit payload from a single call chain; this is the keystone that gives the framework its autonomous discovery-to-exploitation character.

After generating a call chain from the taint function to the data entry point, we traverse this call chain in reverse. We analyze the constraints along the chain and, in parallel, generate tags for these constraints: HTML-like tags describing what must hold on the way from the data entry point to the taint function. The tags contain both concrete constraints and abstract content covering the unconstrained portions, and the containment relationships between tags express the ordering of the constraints.
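To make the idea concrete, here is a hypothetical example of what such nested tags could look like for the toy eval chain used earlier. The tag names and format are invented for illustration, not taken from the tool; the nesting mirrors the ordering of constraints from the entry point down to the sink, and the payload tag marks the unconstrained portion.

```python
CONSTRAINT_TAGS = """\
<entry fn="handler" param="request">
  <constraint>request is a dict and request["cmd"] is a non-empty string</constraint>
  <call fn="run" arg="cmd">
    <constraint>cmd is passed to eval() unmodified</constraint>
    <payload>any Python expression</payload>
  </call>
</entry>
"""
```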

Finally, by traversing the constraint tags generated from this reverse pass over the taint-function-to-entry-point call chain, we hierarchically and progressively synthesize a payload that satisfies each condition on the way from the data entry point to the taint function.
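Continuing the hypothetical tag example above, a sketch of that traversal might look like the following: it recurses from the outermost tag down to the innermost payload, then wraps the result on the way back out. The wrapping rules here are hard-coded placeholders; in the framework an LLM decides how each constraint is satisfied.

```python
import xml.etree.ElementTree as ET

def synthesize(node):
    """Recurse to the innermost tag, then wrap the payload on the way back out."""
    child = next((c for c in node if c.tag != "constraint"), None)
    inner = synthesize(child) if child is not None else None
    if node.tag == "payload":
        return "__import__('os').system('id')"     # any Python expression
    if node.tag == "call":
        return inner                                # cmd reaches eval() unmodified
    if node.tag == "entry":
        return {"cmd": inner}                       # the request dict handed to handler()
    return inner

root = ET.fromstring(CONSTRAINT_TAGS)               # tags from the previous sketch
print(synthesize(root))        # {'cmd': "__import__('os').system('id')"}
```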