Project Aura
state of the art static analysis for Python

So that one may code in peace.

Open Source

You can find the complete source code on GitHub. Aura is licensed under the GPL-3 license.

Highly customizable

Everything can be customized via configuration or extended using plugins.

Taint Analysis

Find unknown vulnerabilities in your codebase by tracking the propagation of untrusted input in the data flow.
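
To illustrate the idea only, here is a toy sketch of taint tracking built on Python's standard `ast` module. It is not Aura's implementation: the `SOURCES` and `SINKS` sets, the helper names, and the restriction to top-level, straight-line code are all assumptions made for this example.

```python
import ast

# Assumed sources (untrusted input) and sinks (dangerous calls); illustration only.
SOURCES = {"input"}
SINKS = {"eval", "exec"}


def call_name(node):
    """Return the function name for a direct call like f(...), else None."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return node.func.id
    return None


def is_tainted(expr, tainted):
    """An expression is tainted if it contains a source call or a tainted variable."""
    for sub in ast.walk(expr):
        if call_name(sub) in SOURCES:
            return True
        if isinstance(sub, ast.Name) and sub.id in tainted:
            return True
    return False


def find_tainted_sinks(source):
    """Return line numbers where tainted data reaches a sink (top-level statements only)."""
    tainted = set()
    findings = []
    for stmt in ast.parse(source).body:  # parsed, never executed
        # Propagate (or clear) taint through simple assignments: x = <expr>
        if isinstance(stmt, ast.Assign):
            value_tainted = is_tainted(stmt.value, tainted)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if value_tainted:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
        # Report every sink call that receives a tainted argument
        for node in ast.walk(stmt):
            if call_name(node) in SINKS and any(is_tainted(a, tainted) for a in node.args):
                findings.append(node.lineno)
    return findings


TAINT_SAMPLE = (
    "name = input()\n"
    'greeting = "hello " + name\n'
    'safe = "1 + 1"\n'
    "eval(safe)\n"
    "eval(greeting)\n"
)

if __name__ == "__main__":
    print(find_tainted_sinks(TAINT_SAMPLE))
```

In the sample, only the second `eval` call is reported, because `greeting` is derived from `input()` while `safe` is a plain literal. A real analyzer also has to follow taint through functions, branches, attributes, and containers, which this sketch deliberately ignores.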

Production ready

Tested on over 4 TB of compressed source code.

What is Aura?

Audit your source code

Aura is a static analysis framework for Python source code that detects errors, defects, vulnerabilities, and patterns in your code. All of this is done just by looking at your code, without executing it! You can tune the framework by customizing a set of semantic rules that act as patterns telling Aura what to look for. The framework also comes with a rich set of default signatures that are ready to go!
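
As a rough illustration of what "looking at code without executing it" means, the sketch below parses a snippet with Python's standard `ast` module and flags direct calls to a few builtins. This is not Aura's rule engine or signature format: the `SUSPICIOUS_CALLS` set and the function name are hypothetical and chosen only for this example.

```python
import ast

# Hypothetical rule set for illustration; not Aura's actual signatures.
SUSPICIOUS_CALLS = {"eval", "exec", "compile"}


def find_suspicious_calls(source):
    """Return sorted (line, name) pairs for direct calls to a suspicious builtin."""
    tree = ast.parse(source)  # builds a syntax tree; the code is never run
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SUSPICIOUS_CALLS:
                hits.append((node.lineno, node.func.id))
    return sorted(hits)


CALL_SAMPLE = (
    "import base64\n"
    "payload = base64.b64decode('cHJpbnQoMSk=')\n"
    "eval(payload)\n"
)

if __name__ == "__main__":
    for line, name in find_suspicious_calls(CALL_SAMPLE):
        print(f"line {line}: call to {name}()")
```

Because the snippet is only parsed, the `eval` call in the sample is reported but never runs. Matching on names alone is easy to evade (for example through aliasing), which is one reason a rule like this is a starting point rather than a complete check.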

Other tools, such as Bandit, dlint, and semgrep, overlap with Aura in functionality, but their focus is different, which affects both what they do and how they are used. These alternatives are mainly intended to be used like linters: integrated into IDEs and run frequently during development. That makes it important for them to minimize false positives and, ideally, to report findings with clear, actionable explanations.

Aura, on the other hand, reports on the behavior of the code, anomalies, and vulnerabilities with as much information as possible, at the cost of more false positives. Many of the things Aura reports are not necessarily actionable, but they tell you a lot about what the code does, such as communicating over the network, accessing sensitive files, or using mechanisms associated with obfuscation, which may indicate malicious code. By collecting and aggregating this kind of data, Aura is comparable in function to other security systems such as antivirus software, IDSs, or firewalls, which essentially perform the same analysis on a different kind of data (network communication, running processes, etc.).

For developers

Powerful configuration options to look for patterns
Better understand unfamiliar source code that has been handed over to you
Works out of the box, no need to modify your existing code
Supports a large number of output formats and integration with CI systems

For businesses

Track and approve which Python modules developers can use
Find hardcoded passwords, secrets, or other sensitive information
Enforce secure coding policies by whitelisting or blacklisting specific code patterns and modules
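
As a minimal sketch of how a hardcoded-secret check can work in principle, the example below uses Python's standard `ast` module to find assignments where a secret-looking variable name receives a string literal. It does not reflect Aura's actual detection logic: the `SECRET_NAME` pattern and the function name are assumptions made for this illustration.

```python
import ast
import re

# Hypothetical naming pattern; real rule sets are far more extensive.
SECRET_NAME = re.compile(r"(password|passwd|secret|token|api_key)", re.IGNORECASE)


def find_hardcoded_secrets(source):
    """Return sorted (line, variable) pairs where a secret-like name gets a string literal."""
    findings = []
    for node in ast.walk(ast.parse(source)):  # parsed, never executed
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Only non-empty string literals count as hardcoded values
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str) and value.value):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and SECRET_NAME.search(target.id):
                findings.append((node.lineno, target.id))
    return sorted(findings)


SECRET_SAMPLE = (
    "import os\n"
    'DB_PASSWORD = "hunter2"\n'
    'api_token = os.environ["API_TOKEN"]\n'
)

if __name__ == "__main__":
    for line, name in find_hardcoded_secrets(SECRET_SAMPLE):
        print(f"line {line}: possible hardcoded secret in {name}")
```

The sample flags `DB_PASSWORD` but not `api_token`, since the latter is read from the environment rather than written into the source. A name-based check like this will produce false positives and miss secrets stored under innocuous names.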

For researchers

Scan terabytes of data - designed to scan the whole PyPI repository
No code execution makes it safe for researching potentially untrusted code and analyzing malware
Scan raw data, not only Python source code

Fork us on GitHub
Browse the documentation

docker run -ti --rm sourcecodeai/aura:dev

Our Vision

Developers are more at risk today than ever before. It has become common practice to build software by piecing together unaudited libraries and untrusted code snippets. We believe there is a severe lack of tools designed to detect potentially malicious or vulnerable code that a developer may be using unknowingly. Development packages are distributed via repositories such as PyPI, which are not audited. We aim to fill this gap by monitoring all source code uploaded to the package repository. While monitoring and analyzing the source code in public repositories, we collect a huge amount of data, which we share back for free to fuel more research in this area.

Protect the developers

Times have changed: developers are becoming an increasingly lucrative target for cyber threats.

Aura datasets

Global PyPI scans

We periodically scan all packages published on PyPI to gather research data and to look for anomalies and signs of malicious code in those packages. All data collected during a scan is released publicly, free of charge for non-commercial use, unchanged and in the same format in which it was generated.

Latest Aura dataset, from the 01.11.2020 global PyPI scan

Packages scanned: 216959 out of 266952
Aura Git version: b86df7701e692e559a380cad520f39eaa2711fda
Compressed/uncompressed dataset size: 3.48 / 42.96 GB

Magnet link
Torrent file
List of scanned packages
License: CC BY-NC
Documentation
