
Researchers at Protect AI have launched Vulnhuntr, a free and open-source static code analysis tool. It is designed to uncover zero-day vulnerabilities in Python codebases using Anthropic’s Claude Artificial Intelligence (AI) model.

This innovative tool was unveiled during the recent No Hat security conference in Italy.

Vulnhuntr takes a distinctive approach: it breaks codebases down into smaller, manageable chunks. This works around a limitation of Large Language Models (LLMs), whose context windows are finite. A context window is the amount of information an LLM can parse in a single chat request.
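To make the idea of chunking concrete, here is a minimal sketch that splits a Python source string into one chunk per function using the standard-library `ast` module. It is an illustration only, not Vulnhuntr's actual implementation; the helper name `split_into_chunks` and the sample code are assumptions made for this example.

```python
import ast


def split_into_chunks(source: str) -> dict:
    """Map each function name to its source text (hypothetical helper)."""
    tree = ast.parse(source)
    chunks = {}
    for node in ast.walk(tree):
        # Treat every (async) function definition as one self-contained chunk.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            chunks[node.name] = ast.get_source_segment(source, node)
    return chunks


# Made-up sample: a request handler that passes user input to a file read.
SAMPLE = '''
def read_upload(path):
    return open(path).read()

def handler(request):
    return read_upload(request.args["file"])
'''

chunks = split_into_chunks(SAMPLE)
print(sorted(chunks))
```

Each chunk is small enough to fit in a single prompt, and the function names give the model a way to ask for exactly the code it has not yet seen.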

Instead of analysing entire files at once, Vulnhuntr uses prompt-engineering methods to send targeted, detailed prompts to the AI, which then requests the additional snippets of code it needs.

This iterative process allows Vulnhuntr to map the entire application flow, from user input to server output. By maintaining a comprehensive view of the flow, Vulnhuntr can reveal complex vulnerabilities that traditional static analyzers might miss. This enhances its analysis capabilities and significantly reduces false positives and negatives.
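The request-and-respond loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Vulnhuntr's code: the reply format (`needs`, `finding`), the `analyse` function and the `fake_model` stand-in for the LLM are all hypothetical, and no real API is called.

```python
from typing import Callable, Dict, List, Optional, Tuple


def analyse(
    entry: str,
    chunks: Dict[str, str],
    ask_model: Callable[[str], dict],
    max_rounds: int = 5,
) -> Tuple[List[str], Optional[str]]:
    """Grow the prompt context until the model stops asking for more code."""
    context = [entry]
    for _ in range(max_rounds):
        prompt = "\n\n".join(chunks[name] for name in context)
        reply = ask_model(prompt)
        # Only fetch snippets that exist and have not been sent already.
        wanted = [n for n in reply["needs"] if n in chunks and n not in context]
        if not wanted:
            return context, reply["finding"]
        context.extend(wanted)
    return context, None  # gave up after max_rounds


def fake_model(prompt: str) -> dict:
    """Stand-in for the LLM: asks for read_upload until it has seen it."""
    if "def read_upload" not in prompt:
        return {"needs": ["read_upload"], "finding": None}
    return {"needs": [], "finding": "LFI: user-controlled path reaches open()"}


CHUNKS = {
    "handler": 'def handler(request):\n    return read_upload(request.args["file"])',
    "read_upload": "def read_upload(path):\n    return open(path).read()",
}

context, finding = analyse("handler", CHUNKS, fake_model)
print(context, finding)
```

The loop starts at the point where user input enters the application and follows the call chain one snippet at a time, so the model only ever sees the code that is relevant to the flow under analysis.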

Vulnhuntr currently identifies seven types of remotely exploitable vulnerabilities:

  1. Arbitrary File Overwrite (AFO)
  2. Local File Inclusion (LFI)
  3. Server-Side Request Forgery (SSRF)
  4. Cross-Site Scripting (XSS)
  5. Insecure Direct Object References (IDOR)
  6. SQL Injection (SQLi)
  7. Remote Code Execution (RCE)

The tool has already proven effective: it has identified more than a dozen zero-day vulnerabilities in popular Python projects on GitHub, such as gpt_academic and FastChat. It also flagged an RCE flaw in the machine learning library Ragflow, which has since been fixed.

One of Vulnhuntr’s key features is its ability to generate Proof-of-Concept (PoC) exploits along with confidence scores that range from 1 to 10. A score of 7 suggests a likely vulnerability, while a score of 8 or higher indicates a strong probability that the finding is valid. A finding that scores 6 or less, on the other hand, is unlikely to be valid.
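For anyone triaging results in bulk, the thresholds above translate into a simple lookup. The sketch below assumes integer scores; the `triage` function and its labels are hypothetical and not part of the tool.

```python
def triage(score: int) -> str:
    """Translate a 1-10 confidence score into the article's three bands."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    if score >= 8:
        return "strong probability"
    if score == 7:
        return "likely"
    return "unlikely"


for s in (6, 7, 9):
    print(s, triage(s))
```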

An example of the analysis result is shown below:

Sample Analysis Result

Credit: The Register

At the moment, the tool only works on Python code. As a result, it is more likely to generate false positives when scanning Python projects that incorporate code in other languages.

Vulnhuntr is available on GitHub. Protect AI urges bug bounty hunters to try the tool on open-source projects listed on its bug bounty website, huntr.com.

About the author: