
Python Application Analysis

Evaluation: Advanced Binary Fingerprinting

Python scanning supports binary packaged archives (.whl and .tar.gz) and Python Package Index (PyPI) coordinates listed in requirements.txt manifests. For the best results, we recommend first creating a Python virtual environment and resolving the dependencies by running pip install against the requirements file. This brings only the dependencies the project needs into the build and resolves any dependency ranges in the requirements file to exact versions.

How to get the best results

For the best results, we suggest the following steps:

  • Create a requirements.txt file by using "pip freeze > requirements.txt".

  • Using the requirements.txt download the binaries by executing "pip download -r requirements.txt -d <output_dir>" or "pip wheel -r requirements.txt -d <output_dir>"

  • Run a scan on the <output_dir>

Example workflow. Check out a video demonstration here.

    1. cd project_folder

    2. mkdir ~/py-envs

    3. python3 -m venv ~/py-envs/project_folder

    4. source ~/py-envs/project_folder/bin/activate

    5. pip install -r requirements.txt

  • requirements.txt:

    • Use pip freeze to create the requirements file.

    • Lifecycle scanners will only use the manifest named requirements.txt while ignoring other variants.

    • Avoid version ranges for dependencies, as these will be ignored by the Lifecycle scanner.

    • The requirements.txt must use the == operator and version without wildcards.

    • Add environment markers to entries in requirements.txt to scope them to the target OS/architecture, as described in the environment markers section of the Python documentation.

  • .whl files: may be matched to multiple environment-specific distributions of the same Python package. These will show as duplicates in the Lifecycle scan report.
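The requirements.txt rules above (exact == pins, no wildcards, no ranges) can be checked before a scan. The following is a minimal sketch, not the Lifecycle scanner's own parser: the regular expression and the helper name find_unpinned are assumptions made for illustration, and it only approximates the requirements syntax (for example, entries with extras such as name[extra]==1.0 are reported as unpinned).

```python
import re

# Approximate pattern for "name==version" with an optional "; marker" suffix.
# This is an illustrative assumption, not the scanner's real grammar.
PINNED = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*"
    r"(?P<version>[^\s;*=]+)\s*(?:;\s*(?P<marker>.+?))?\s*$"
)


def find_unpinned(lines):
    """Return the requirement lines that are not exact '==' pins."""
    problems = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank lines
        if not PINNED.match(line):
            problems.append(line)
    return problems


sample = [
    "altgraph==0.10.2",
    "Django>=1.6",  # range: would be ignored
    "requests==2.*",  # wildcard: would be ignored
    "pywin32 ==1.0 ; sys_platform == 'win32'",
]
print(find_unpinned(sample))  # ['Django>=1.6', 'requests==2.*']
```

Running a check like this on the output of pip freeze should normally report nothing, since pip freeze writes exact pins for packages installed from PyPI.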

Evaluation: Source code and manifest analysis

The Python coordinate-based matching feature provides the ability to scan and evaluate Python dependencies found in Python manifest files without first running pip install. Files named requirements.txt (generated using a pip command) and poetry.lock (Poetry lock files) will be analyzed.

Converting from other formats

setup.py

A setup.py file can be used to generate a requirements.txt file by first installing its packages (e.g. via pip install . ), ideally into a virtual environment, and then running:

  • pip freeze > requirements.txt for Python 2 or

  • pip3 freeze > requirements.txt for Python 3

What do we parse from the file?

requirements.txt

Only requirements that use the "==" operator and a version without wildcards are considered. One requirement could be matched to multiple distributions of the same Python package. Using the sys_platform marker makes the dependency more specific. For example:

altgraph==0.10.2

pywin32 ==1.0 ; sys_platform == 'win32'
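To make the structure of these two lines concrete, here is a small sketch that splits a pinned requirement into its name, version, and optional marker. The function split_requirement is hypothetical and is only meant to show which parts of the line carry the coordinate and which part carries the marker; it is not how the scanner is implemented.

```python
def split_requirement(line):
    """Split 'name==version ; marker' into (name, version, marker or None)."""
    spec, _, marker = line.partition(";")  # text after ';' is the marker
    name, sep, version = spec.partition("==")
    if not sep:
        raise ValueError(f"not an exact pin: {line!r}")
    return name.strip(), version.strip(), marker.strip() or None


print(split_requirement("altgraph==0.10.2"))
# ('altgraph', '0.10.2', None)
print(split_requirement("pywin32 ==1.0 ; sys_platform == 'win32'"))
# ('pywin32', '1.0', "sys_platform == 'win32'")
```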

poetry.lock

Refer to the sample poetry.lock excerpt below:

Dependencies with name and exact version in the [[package]] section are required and evaluated. For example:

  • Name: six

  • Version: 1.16.0

Dependencies with extension and qualifier in the [metadata.files] are required and evaluated. For example:

  • Extension: whl

  • Qualifier: py2.py3-none-any

Package dependencies with name and exact version in the [package.dependencies] section are evaluated. For example:

  • Name: colorama

  • Version: 0.4.4

Sample poetry.lock

[[package]]
name = "six"
version = "1.16.0"
description = "Python 2 and 3 compatibility utilities"
category = "main"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*"

[package.dependencies]
colorama = "0.4.4"

[metadata]
lock-version = "1.1"
python-versions = "^3.8"
content-hash = "7ae52da2736b4294be7a184e040cc78412add14e92b816077ede183f9e1c636c"

[metadata.files]
six = [
    {file = "six-1.16.0-py2.py3-none-any.whl", hash = "sha256:8abb2f1d86890a2dfb989f9a77cfcfd3e47c2a354b01111771326f8aa26e0254"},
    {file = "six-1.16.0.tar.gz", hash = "sha256:1e61c37477a1626458e36f7b1d82aa5c9b094fa4802892072e49de9c60c4c926"},
]
colorama = [
    {file = "colorama-0.4.4-py2.py3-none-any.whl", hash = "sha256:9f47eda37229f68eee03b24b9748937c7dc3868f906e8ba69fbcbdd3bc5dc3e2"},
    {file = "colorama-0.4.4.tar.gz", hash = "sha256:5941b2b48a20143d2267e95b1c2a7603ce057ee39fd88e7329b0c292aa16869b"},
]
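The extension and qualifier listed earlier (whl and py2.py3-none-any) correspond to parts of the file names in [metadata.files]. The sketch below shows one way to derive them from a file name, assuming the standard wheel naming convention (name-version[-build]-python-abi-platform.whl). The function parse_dist_filename is hypothetical, and whether the scanner derives these values in the same way is an assumption.

```python
def parse_dist_filename(filename):
    """Split a distribution file name into name, version, qualifier, extension."""
    if filename.endswith(".whl"):
        parts = filename[: -len(".whl")].split("-")
        if len(parts) < 5:
            raise ValueError(f"unexpected wheel name: {filename!r}")
        return {
            "name": parts[0],
            "version": parts[1],
            # The last three fields are the python, abi, and platform tags.
            "qualifier": "-".join(parts[-3:]),
            "extension": "whl",
        }
    if filename.endswith(".tar.gz"):
        # Source archives are named name-version.tar.gz and have no qualifier.
        name, _, version = filename[: -len(".tar.gz")].rpartition("-")
        return {"name": name, "version": version,
                "qualifier": None, "extension": "tar.gz"}
    raise ValueError(f"unsupported file type: {filename!r}")


print(parse_dist_filename("six-1.16.0-py2.py3-none-any.whl"))
# {'name': 'six', 'version': '1.16.0', 'qualifier': 'py2.py3-none-any', 'extension': 'whl'}
```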

Exclude devDependencies

IQ Server automatically excludes devDependencies from scans of projects that use Poetry versions earlier than 1.5.1.

Because the category field was removed from the poetry.lock format in Poetry 1.5.1, older versions of IQ Server do not exclude devDependencies when scanning projects that use Poetry 1.5.1 or higher.

For Poetry 1.5.1 or higher, IQ Server automatically excludes devDependencies if pyproject.toml exists and is discoverable.

Steps to analyze using the Sonatype IQ CLI

Create requirements

Run pip freeze

pip freeze > requirements.txt

The requirements.txt file must be encoded in UTF-8. Microsoft Windows users may need to change the cmd.exe encoding to UTF-8; refer to the Microsoft documentation for instructions.
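If you are unsure whether a generated requirements.txt is UTF-8, a quick check such as the following may help. It is a sketch under one stated assumption: that the most likely wrong encoding is UTF-16 with a byte order mark, which some Windows shells produce when output is redirected to a file. The helper name ensure_utf8 is hypothetical.

```python
import codecs


def ensure_utf8(path):
    """Rewrite the file as UTF-8 if it starts with a UTF-16 byte order mark.

    Returns True if the file was converted, False if it was already UTF-8.
    Raises UnicodeDecodeError if the content is neither.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = data.decode("utf-16")  # the BOM selects the byte order
        # newline="" keeps the original line endings unchanged.
        with open(path, "w", encoding="utf-8", newline="") as fh:
            fh.write(text)
        return True
    data.decode("utf-8")  # validation only
    return False
```

For example, ensure_utf8("requirements.txt") could be run once after pip freeze and before invoking the scan.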

Example file content

altgraph==0.10.2

backports-abc==0.5

backports.ssl-match-hostname==3.5.0.1

bdist-mpkg==0.5.0

certifi==2018.1.18

chardet==3.0.4

click==6.7

confire==0.2.0

Django==1.6

django-countries==3.3

django-make-app==0.1.3

docopt==0.6.2

enum34==1.1.6

Add environment markers (optional)

Adding environment markers can simplify the results by filtering out components that are not relevant to your deployment platform. Only the sys_platform environment marker is supported at the moment.

Add the environment marker next to the component(s) in the requirements.txt.

For example:

Django==1.6; sys_platform == 'win32'
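When many components need the same marker, editing the file by hand can be tedious. The following sketch appends a sys_platform marker to selected pinned entries. The function add_platform_marker and its parameters are hypothetical; it leaves untouched any line that already carries a marker.

```python
def add_platform_marker(lines, names, platform):
    """Append a sys_platform marker to pinned lines whose name is in `names`."""
    wanted = {n.lower() for n in names}
    result = []
    for line in lines:
        name = line.partition("==")[0].strip().lower()
        if name in wanted and ";" not in line:
            line = f"{line.strip()}; sys_platform == '{platform}'"
        result.append(line)
    return result


print(add_platform_marker(["Django==1.6", "click==6.7"], ["django"], "win32"))
# ["Django==1.6; sys_platform == 'win32'", 'click==6.7']
```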

Run a scan

Invoke a Sonatype IQ CLI scan of the directory containing requirements.txt. Instructions on how to do this can be found in the Sonatype CLI documentation.