Repository Firewall Hashing
Repository Firewall uses the SHA-1 hashing algorithm for component identification. The identifier is either the full SHA-1 hash (20 bytes, 40 hexadecimal characters) or the SHA-1 truncated to its first 10 bytes (the first 20 hexadecimal characters). Truncation improves performance when searching and indexing the database.
The hashing used by Repository Firewall for supported ecosystems falls into two categories: package file hashing and synthetic hashing.
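As a minimal sketch of the two identifier forms described above, the following Python computes a full SHA-1 and its truncated form. The function names are hypothetical and not part of Repository Firewall, and the sketch assumes the truncated form is simply the first 10 bytes of the digest rendered as 20 hexadecimal characters.

```python
import hashlib


def full_sha1(data: bytes) -> str:
    """Return the full SHA-1 digest as 40 hexadecimal characters."""
    return hashlib.sha1(data).hexdigest()


def truncated_sha1(data: bytes) -> str:
    """Return the first 10 bytes of the SHA-1 digest as 20 hexadecimal characters."""
    # Assumption: truncation keeps the leading bytes of the digest unchanged.
    return hashlib.sha1(data).digest()[:10].hex()


if __name__ == "__main__":
    payload = b"abc"
    print(full_sha1(payload))
    print(truncated_sha1(payload))
```

Because the truncated value is a prefix of the full hash, a lookup keyed on the short form can still be checked against the full form when both are available.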
Package File Hashing
Package file hashing involves creating a hash of the compressed package downloaded from public open-source ecosystems. Most supported ecosystems are in this category.
Maven, PyPI, Composer, RubyGems, CocoaPods, NuGet, CRAN, Conan
The file hash may be generated by directly hashing the file or by accessing the hash from the open-source ecosystem website.
shasum /path/to/component
For a Maven component, for example, you may visit Maven Central to read the SHA-1 directly from the repository. Example: commons-fileupload.
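Hashing the file directly can also be done in a script. The sketch below is one way to get the same digest that shasum reports by default (SHA-1), reading the file in chunks so large packages do not need to fit in memory. The function name and chunk size are illustrative choices, not part of any Repository Firewall tooling.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-1 of a file's bytes as 40 hexadecimal characters."""
    digest = hashlib.sha1()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage: python sha1_file.py /path/to/component
    if len(sys.argv) == 2:
        print(sha1_of_file(sys.argv[1]))
```

The result can then be compared with the SHA-1 published by the ecosystem for the same file.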
Synthetic Hashes
In contrast, synthetic hashes are generated using elements other than the package file. For instance, the package/version combination is used for Golang, while MD5 checksums are employed for Conda packages.
Golang, Conda
For the Golang ecosystem, the SHA-1 hash is created using a string composed of the package name and version.
source + '/x/' + name + '@v' + version
Calculating the SHA-1 of the Golang package named text at version 0.3.7:
echo -n "golang.org/x/text@v0.3.7" | openssl sha1
> SHA1(stdin)= fe597b3fed5dbc388e7ce53c58b6de6bce5e104e
{
  "format": "golang",
  "components": [
    {
      "packageUrl": "pkg:golang/golang.org/x/text@v0.3.7",
      "sha1": "fe597b3fed5dbc388e7ce53c58b6de6bce5e104e"
    }
  ]
}
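The Golang steps above can be sketched in Python as follows. The function names are hypothetical. The string is hashed as UTF-8 with no trailing newline, which matches what echo -n sends to openssl, and the dictionary layout mirrors the JSON example shown above rather than any documented schema.

```python
import hashlib
import json


def golang_coordinate(source: str, name: str, version: str) -> str:
    """Build the string that is hashed: source + '/x/' + name + '@v' + version."""
    return source + "/x/" + name + "@v" + version


def golang_synthetic_sha1(source: str, name: str, version: str) -> str:
    """Return the SHA-1 of the coordinate string as 40 hexadecimal characters."""
    coordinate = golang_coordinate(source, name, version)
    # No trailing newline is appended, mirroring `echo -n`.
    return hashlib.sha1(coordinate.encode("utf-8")).hexdigest()


def golang_component(source: str, name: str, version: str) -> dict:
    """Assemble a structure shaped like the JSON example in the text."""
    coordinate = golang_coordinate(source, name, version)
    return {
        "format": "golang",
        "components": [
            {
                "packageUrl": "pkg:golang/" + coordinate,
                "sha1": golang_synthetic_sha1(source, name, version),
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(golang_component("golang.org", "text", "0.3.7"), indent=2))
```

Keeping the coordinate builder separate from the hashing step makes it easy to print the exact string being hashed when a computed value does not match an expected one.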
For Conda, we use the MD5 checksum of the package to calculate the SHA-1. Find the MD5 checksum of a package by searching the package info using the conda search tool.
conda search --info <package-name>
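As a rough sketch, the MD5 values can be pulled out of saved conda search --info output and turned into a SHA-1. Two assumptions are made here and neither is confirmed by the text above: that each package entry in the output carries a line of the form "md5 : <32 hex characters>", and that the SHA-1 is taken over the MD5 hexadecimal string encoded as ASCII. The text only says the MD5 checksum is used to calculate the SHA-1, so the exact input may differ. All function names are hypothetical.

```python
import hashlib
import re
from typing import List

# Assumed line shape in `conda search --info` output: "md5         : <32 hex chars>"
MD5_LINE = re.compile(r"^md5\s*:\s*([0-9a-fA-F]{32})\s*$", re.MULTILINE)


def md5_values(search_output: str) -> List[str]:
    """Return every MD5 checksum found in the given text, lowercased."""
    return [value.lower() for value in MD5_LINE.findall(search_output)]


def sha1_from_md5(md5_hex: str) -> str:
    """Return the SHA-1 of the MD5 hex string.

    ASSUMPTION: the MD5 hex digest is hashed as an ASCII string. The source
    does not specify the exact input format.
    """
    return hashlib.sha1(md5_hex.encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Hypothetical sample text, not real conda output.
    sample = (
        "file name   : example-1.0-0.tar.bz2\n"
        "md5         : d41d8cd98f00b204e9800998ecf8427e\n"
    )
    for md5_hex in md5_values(sample):
        print(md5_hex, sha1_from_md5(md5_hex))
```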