Cleanup Policies

Cleanup Policies are the automation rules for removing content stored in the repositories of your Nexus Repository instance. Unless components are removed at roughly the same rate as they are added, the number of stored components grows quickly over time.

When not managed early, this growth puts your deployment at risk in the following ways:

  • A continuing increase in storage costs as more artifacts are published

  • Impact on performance as searching takes longer

  • Discovery is challenging as you sort through more artifacts

  • Consuming the available storage results in server failure and outages

Cleanup Policies and Tasks are not configured by default. Define the policies that best suit your development lifecycle.

Security Requirements

  • Only users with admin privileges (i.e., nexus:*) may create cleanup policies.

  • Users with edit repository privileges may add a cleanup policy to the repository.

    nexus:repository-admin:maven2:maven-central:edit
  • Users with either of the following privileges may modify cleanup tasks:

    nx-tasks-update, nx-tasks-all

Cleanup Policies Workflow

Nexus Repository cleans up components using a set of rules, or cleanup policies, set on the repository configuration and a series of tasks to safely flag and remove the components.

Follow these steps to set up cleanup policies:

  1. Administrators first create Cleanup Policies depending on their requirements.

  2. Add one or more Cleanup Policies to a repository's configuration.

  3. Cleanup Tasks are set to run regularly.

    Admin - Cleanup repositories using their associated policies
    Admin - Cleanup unused asset blobs

    Cleanup tasks "soft delete" components by flagging them for removal. Components still consume space but may be recovered when needed.

  4. Run the compact blob store task during off-peak hours for each blob store to reclaim disk space. These tasks completely remove the components from the storage disk, freeing up the available space.

    Admin - Compact blob store
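
The difference between the soft delete in step 3 and the hard delete in step 4 can be illustrated with a toy model. This is only a sketch: `ToyBlobStore` and its methods are hypothetical names invented for illustration and are not part of any Nexus Repository API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlobStore:
    """Hypothetical in-memory stand-in for a file-based blob store."""
    blobs: dict = field(default_factory=dict)        # path -> size in bytes
    soft_deleted: set = field(default_factory=set)   # paths flagged for removal

    def soft_delete(self, path: str) -> None:
        # Models a cleanup task: flag the blob, keep its bytes on disk.
        if path in self.blobs:
            self.soft_deleted.add(path)

    def used_bytes(self) -> int:
        # Flagged blobs still count toward disk usage.
        return sum(self.blobs.values())

    def compact(self) -> int:
        # Models the compact task: drop flagged blobs, return bytes reclaimed.
        reclaimed = sum(self.blobs.pop(path) for path in self.soft_deleted)
        self.soft_deleted.clear()
        return reclaimed


store = ToyBlobStore({"a.jar": 100, "b.jar": 250})
store.soft_delete("b.jar")
print(store.used_bytes())  # 350: the soft delete alone frees nothing
print(store.compact())     # 250: bytes reclaimed by compaction
print(store.used_bytes())  # 100
```

In this model, disk usage drops only after `compact()` runs, which is why the compact task must be scheduled in addition to the cleanup tasks.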

Creating Cleanup Policies

Cleanup Policies are located in the repository section of the administration menu and require admin privileges to create or modify. They are intended for use in one or many repositories, typically associated with a specific repository format type.

  1. Select the Create Cleanup Policy button from the Cleanup Policies view

  2. Provide a unique name for the policy

    Only letters, digits, underscores (_), hyphens (-), and dots (.) are allowed, and the name may not start with an underscore or a dot.

  3. Select a target format for the policy

    The 'All formats' option may be selected for any repository type; however, most formats have specific cleanup criteria available for use.

  4. Optionally enter notes in the Description field (limit of 400 characters)

    Understanding what the cleanup policy is for at a later date may be challenging. We recommend providing details on your policy for auditing purposes.

  5. Select at least one criterion for the policy

    See the available cleanup criteria by format table below for details.

  6. Select the Save button

See Cleanup Criteria for using each criterion.
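
When policies are created by script or reviewed in bulk, the naming and description rules from steps 2 and 4 can be checked up front. The following is a minimal sketch, assuming that "letters" and "digits" mean ASCII characters; the server-side validation may differ. `validate_policy` is a hypothetical helper, not a Nexus Repository function.

```python
import re

# Assumption: ASCII letters and digits only; the real validator may accept more.
# The first character may be a letter, digit, or hyphen (not underscore or dot).
NAME_PATTERN = re.compile(r"[A-Za-z0-9-][A-Za-z0-9_.-]*")
MAX_DESCRIPTION_LENGTH = 400


def validate_policy(name: str, description: str = "") -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name: invalid character or leading underscore/dot")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description: longer than 400 characters")
    return problems


for candidate in ["maven-snapshots_30d", "_hidden", ".dot", "bad name"]:
    print(candidate, "->", validate_policy(candidate) or "ok")
```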

Strategies for Creating a Cleanup Policy

  • Cleanup Policies are intended to target components to be removed from either hosted or proxy repositories.

  • The criteria of the policy are combined together to remove only the components that meet every condition specified.

  • Multiple cleanup policies may be applied to the same repository.

  • It is possible for policies to have overlapping criteria targeting the same components.
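
The way criteria and policies combine can be sketched as boolean logic. Within one policy, every criterion must hold. The sketch also assumes that, when several policies are applied to a repository, a component is selected as soon as any one policy matches it; that second rule is an interpretation, not something stated above. All names are hypothetical.

```python
from typing import Callable, Iterable

Component = dict                          # hypothetical: a plain dict of attributes
Criterion = Callable[[Component], bool]   # one condition of a policy


def policy_matches(policy: Iterable[Criterion], component: Component) -> bool:
    # Every criterion in the policy must hold (logical AND).
    return all(criterion(component) for criterion in policy)


def selected_for_cleanup(policies: Iterable[Iterable[Criterion]],
                         component: Component) -> bool:
    # Assumption: any single matching policy is enough (logical OR).
    return any(policy_matches(policy, component) for policy in policies)


def older_than_90_days(c: Component) -> bool:
    return c["age_days"] > 90


def unused_for_30_days(c: Component) -> bool:
    return c["days_since_download"] > 30


policies = [[older_than_90_days, unused_for_30_days]]
recently_used = {"age_days": 120, "days_since_download": 5}
stale = {"age_days": 120, "days_since_download": 45}

print(selected_for_cleanup(policies, recently_used))  # False: only one criterion holds
print(selected_for_cleanup(policies, stale))          # True: both criteria hold
```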

Adding A Cleanup Policy to a Repository

Cleanup policies may be assigned to both proxy and hosted repositories. Users need the repository admin edit privilege for a repository to add cleanup policies to it.

  1. Select a repository from the Repositories view in the Administration menu or create a new one

  2. Navigate to the Cleanup section

  3. Optionally use the search filter to limit the available cleanup policies from the list

  4. Select the required cleanup policies from the available section

  5. Use the right-facing arrow to move the cleanup policy to the applied section

  6. Repeat for as many policies as required

  7. Select the Save button

Previewing Cleanup Policy Results

This section covers previewing a sample of the results of a cleanup policy against a specific repository for testing purposes. These results are not a comprehensive audit, as the actual results may differ depending on the selected criteria and on when the cleanup policy runs.

  1. Select a repository from the Preview Repository drop-down menu below the Save button

  2. Select the Preview button to return a sample of the components

  3. Use the filter to check for specific results not shown in the sample

    The sample may be an incomplete list of what may be removed

  4. Note that the preview times out after 1 minute to reduce the impact on performance

PostgreSQL Cleanup Preview

Generate the complete list of components that the policy would remove as configured.

  1. Select a repository from the Preview Repository drop-down menu below the Save button

  2. Select the Generate CSV Report button

  3. The CSV file is downloaded once the query is complete

    Filename: <cleanup policy name>-<repository name>-<timestamp>.csv
    Fields:   namespace, name, version, path

    Generating the cleanup preview CSV takes time depending on deployment size and configuration. The table below provides generation times based on our internal testing with the following specifications:

    Deployed on an AWS ECS c6i.4xlarge instance with an Aurora PostgreSQL db.r6g.large database

    Components in repository | Time to generate CSV report
    -------------------------+----------------------------
     1 million               | ~1 minute
     5 million               | ~2 minutes
    20 million               | ~8 minutes
    25 million               | ~15 minutes
    27 million               | ~25 minutes
    30 million               | ~40 minutes
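
Once downloaded, the report can be summarized with a short script before the policy is applied. This is a minimal sketch, assuming the first row of the CSV is a header containing the four field names listed above; the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter

# Made-up sample data. Assumption: the first row is a header with these names.
SAMPLE_REPORT = """namespace,name,version,path
org.example,app,1.0.0,/org/example/app/1.0.0/app-1.0.0.jar
org.example,app,1.0.1,/org/example/app/1.0.1/app-1.0.1.jar
com.acme,lib,2.3.0,/com/acme/lib/2.3.0/lib-2.3.0.jar
"""


def summarize(report_file) -> Counter:
    """Count report rows per (namespace, name) pair."""
    counts = Counter()
    for row in csv.DictReader(report_file):
        counts[(row["namespace"], row["name"])] += 1
    return counts


summary = summarize(io.StringIO(SAMPLE_REPORT))
for (namespace, name), count in summary.most_common():
    print(f"{namespace}:{name} -> {count} row(s) in the report")
```

To run it against a real report, open the downloaded file with `open(filename, newline="")` and pass the file object to `summarize`.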

Cleanup Tasks

Nexus Repository automatically creates a few system tasks to soft delete components identified for cleanup by the cleanup policies. You do not create these tasks manually; if they are deleted, they are re-added when the service restarts.

  • Cleanup service: Admin - Cleanup repositories using their associated policies

    This task soft deletes components based on the repository's configured cleanup policies. It may be rescheduled or run manually. By default, this task is set to run once an hour.

  • Cleanup unused {format} blobs from nexus: Admin - Cleanup unused asset blobs

    These tasks soft delete orphaned assets that are no longer needed after a component is removed. They are added when a new format has been added as a repository. By default, these tasks run every 30 minutes.

Hard Deleting Components

Cleanup Policies soft delete the components they select, after which Nexus Repository no longer displays those components in the user interface.

However, these components are not immediately deleted from storage and will still use disk space.

  • Create and schedule the Admin - Compact blob store task to reclaim disk space.

  • Create the task for every blob store that requires cleanup.

Azure Blob Store Cleanup

The compact blob store task requests the Azure blob store to mark blobs for deletion. These are later hard-deleted during garbage collection on the Azure side.

This behavior may vary depending on whether the soft delete feature is enabled, as described in Azure's documentation.

AWS S3 Blob Store Cleanup

AWS S3-based blob stores use a bucket lifecycle policy managed on the S3 blob store configuration to delete components. When components are soft-deleted using cleanup policies, the expiration days property sets the lifecycle on the blob in the S3 bucket.

The compact blob store task is not used for S3 blob stores.

Docker Cleanup Strategies

Docker's tagging, manifests, and layers are unique ways of managing components and assets that require additional configuration when designing a cleanup strategy.

See Components and Assets in Docker to learn more about the docker format.

  1. Docker - Delete incomplete uploads

    Soft-delete uploads to the temporary storage that are not complete

  2. Admin - Cleanup repositories using their associated policies

    Soft-delete old published or downloaded docker components i.e. tags, not layers or manifests

  3. Docker - Delete unused manifests and images

    Soft-delete orphaned layers and manifests no longer referenced by tags, possibly orphaned by cleanup policies

Once the above tasks have run, the following tasks are needed to hard delete the components and reclaim space depending on your deployment.

  • Run the Admin - Compact blob store task for file-based blob stores

  • Set the configuration Expiration Days on object-based blob stores such as S3

Additional Information

  • Clean Up Components That Have Never Been Downloaded

    Use the Component Usage (Days) criterion to clean up components that have never been downloaded. This criterion removes components that haven't been downloaded in a specified number of days. The date the component was published is used when the component has never been downloaded.

  • Cleanup Policies Do Not Remove Components from Replicated Repositories

    Content Replication does not replicate the deletion of components on remote repositories. Cleanup policies only remove components from the specific instance on which they are run. The remote repository requires its own setup of cleanup policies.

  • SQL-Based Cleanup Performance

    As of release 3.65.0, Nexus Repository Pro instances using PostgreSQL databases use SQL-based cleanup by default. SQL-based cleanup is proven to take considerably less time than Java-based cleanup.

    See the metrics at Cleanup Performance Data.

  • Determining the Space Your Repositories Are Using

    Use the following Support article to determine the space your repositories are using.

    Investigating Blob Store and Repository Size and Space Usage

  • Replace the following tasks with Cleanup Policies
    Maven - Delete unused SNAPSHOT
    Repository - Delete unused components

Cleanup Criteria

The table below lists the available cleanup criteria and the formats to which they apply:

Format       Component Age  Component Usage  Release Type  Retain Select Versions  Asset Name Matcher
-----------  -------------  ---------------  ------------  ----------------------  ------------------
All Formats  Yes            Yes              -             -                       -
APT          Yes            Yes              -             -                       Yes
Bower²       Yes            Yes              -             -                       Yes
CocoaPods    Yes            Yes              -             -                       Yes
Conan        Yes            Yes              -             -                       Yes
Conda        Yes            Yes              -             -                       Yes
Docker¹      Yes            Yes              -             Yes                     Yes
GitLFS       Yes            Yes              -             -                       -
Go           Yes            Yes              -             -                       Yes
Helm         Yes            Yes              -             -                       Yes
Maven        Yes            Yes              Yes           Yes                     Yes
npm          Yes            Yes              Yes           -                       Yes
NuGet        Yes            Yes              -             -                       Yes
p2           Yes            Yes              -             -                       Yes
PyPI         Yes            Yes              -             -                       Yes
R            Yes            Yes              -             -                       Yes
Raw          Yes            Yes              -             -                       Yes
RubyGems     Yes            Yes              -             -                       Yes
Yum          Yes            Yes              Yes           -                       Yes

¹ - Cleanup only evaluates tagged manifests for Docker.

² - Bower functionality is for proxy repositories only.

Component Age (Days)

This criterion sets how long to keep content based on component age.

  • Proxy repositories: based on when the component was first downloaded

  • Hosted repositories: based on when the component was uploaded or updated

Component Usage (Days)

This criterion sets how long to keep content based on when a component was last downloaded. The published or updated dates are used when the component has never been downloaded.
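
The two day-based criteria differ only in the date they measure from. The sketch below models that choice, including the fallback to the published date for components that were never downloaded. The function names are hypothetical, and whether the real comparison is strict or inclusive at the boundary is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def older_than(reference: datetime, days: int, now: datetime) -> bool:
    # Assumption: strictly more than `days` must have elapsed.
    return now - reference > timedelta(days=days)


def matches_component_age(created_or_updated: datetime, days: int,
                          now: datetime) -> bool:
    # Proxy: first download date; hosted: upload or update date.
    return older_than(created_or_updated, days, now)


def matches_component_usage(last_downloaded: Optional[datetime],
                            published: datetime, days: int,
                            now: datetime) -> bool:
    # Fall back to the published date when there has never been a download.
    reference = last_downloaded if last_downloaded is not None else published
    return older_than(reference, days, now)


now = datetime(2024, 6, 1)
published = datetime(2024, 1, 1)

# Never downloaded: measured from the published date, 152 days ago.
print(matches_component_usage(None, published, 90, now))                   # True
# Downloaded 12 days ago: kept, even though it was published long ago.
print(matches_component_usage(datetime(2024, 5, 20), published, 90, now))  # False
```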

Release Type

Use this criterion to restrict the cleanup policy to either PRERELEASES or RELEASES. How a prerelease is identified differs by format:

  • Maven

    Versions contain the -SNAPSHOT phrase

  • npm

    Uses semantic versioning where a version is a prerelease when it contains the dash "-" character

  • Yum

    The "release" property in the RPM header contains one of the following (matched case-insensitively):

    alpha, beta, rc, pre, prerelease, snapshot
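
The per-format rules above can be expressed as a small classifier. This is a sketch of the rules as described, not the product's implementation: `is_prerelease` is a hypothetical helper, and the Yum check reads "contains" as a plain substring test, which is an assumption.

```python
YUM_PRERELEASE_MARKERS = ("alpha", "beta", "rc", "pre", "prerelease", "snapshot")


def is_prerelease(fmt: str, version: str, rpm_release: str = "") -> bool:
    """Classify a component as a prerelease using the rules listed above."""
    if fmt == "maven":
        return "-SNAPSHOT" in version
    if fmt == "npm":
        # Semantic versioning: a dash introduces the prerelease part.
        return "-" in version
    if fmt == "yum":
        # Assumption: case-insensitive substring match on the release property.
        release = rpm_release.lower()
        return any(marker in release for marker in YUM_PRERELEASE_MARKERS)
    raise ValueError(f"release type is not described for format: {fmt}")


print(is_prerelease("maven", "1.0-SNAPSHOT"))                # True
print(is_prerelease("maven", "1.0"))                         # False
print(is_prerelease("npm", "1.2.3-beta.1"))                  # True
print(is_prerelease("yum", "5.4", rpm_release="0.1.RC2"))    # True
print(is_prerelease("yum", "5.4", rpm_release="3.el8"))      # False
```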

Retain Select Versions (PostgreSQL Only)

Those using a PostgreSQL database have the option to exclude the most recent versions from the cleanup policy. Select the number of versions to keep even when matching other criteria.

  • Maven

    The version number is used. Available only for the Releases release type.

  • Docker

    The age of the manifest is used.

  1. Select the checkbox

  2. Select the number of versions to keep
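
The effect of retaining versions can be sketched as a filter applied after the other criteria have selected their candidates. The sketch assumes the newest versions are determined across all versions of the component, and it orders versions by their dot-separated numeric parts only; real Maven version ordering is more involved. Both function names are hypothetical.

```python
def version_key(version: str) -> tuple:
    # Simplification: compare dot-separated numeric parts only.
    return tuple(int(part) for part in version.split("."))


def apply_retain(candidates: list[str], all_versions: list[str],
                 keep: int) -> list[str]:
    """Remove the `keep` most recent versions from the cleanup candidates."""
    newest = sorted(all_versions, key=version_key, reverse=True)[:keep]
    return [version for version in candidates if version not in newest]


all_versions = ["1.0.0", "1.1.0", "1.2.0", "2.0.0", "1.10.0"]
# Suppose every version already matches the other criteria of the policy.
print(apply_retain(all_versions, all_versions, keep=2))
# ['1.0.0', '1.1.0', '1.2.0']  (2.0.0 and 1.10.0 are retained)
```

Note that 1.10.0 sorts above 1.2.0 because the parts are compared as numbers rather than as text.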

Asset Name Matcher

Rules are based on the component name, namespace, or path in the repository. Supported regular expression patterns differ between the legacy OrientDB and the newer PostgreSQL and H2 environments.

When migrating to PostgreSQL or H2, legacy cleanup policies may result in more assets being removed than expected.

PostgreSQL and H2 Expressions

In H2 or PostgreSQL environments, the Asset Name Matcher uses Java regular expressions.

  • Not compatible with OrientDB Lucene regular expressions

  • Java regular expressions may match any part of the component path

When migrating to PostgreSQL, revise cleanup policies to include the leading slash in asset matcher names. Failure to do so may result in assets not being matched and cleaned up as expected.

OrientDB Expressions

Expressions in OrientDB use the Elasticsearch regular expression query syntax, from Apache Lucene.

  • Not compatible with Perl (PCRE) or Java util.regex.Pattern regular expressions

  • Expressions must match the entire name when wildcards are not used.

  • Asset names do not require a leading slash and use a limited set of operators

  • Asset matchers in OrientDB are different than the asset request path value used when evaluating content selector or routing rule expressions

Comparison between OrientDB and PostgreSQL

This example evaluates one pattern against a repository containing two assets:

Pattern
 antlr.*

Repository
 /antlr/antlr/2.7.2/antlr-2.7.2.jar
 /org/antlr/antlr-master/3.1.3/antlr-master-3.1.3.pom

  • OrientDB - the first component is matched while the second is not

  • H2 or PostgreSQL - both components are matched
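
The difference comes down to whole-name matching versus matching anywhere in the path. The sketch below reproduces the result with Python's `re` module as a stand-in for both engines, which is an approximation: Lucene and Java regular expression syntax each differ from Python's in other respects. The function names are hypothetical.

```python
import re

PATTERN = "antlr.*"
ASSETS = [
    "/antlr/antlr/2.7.2/antlr-2.7.2.jar",
    "/org/antlr/antlr-master/3.1.3/antlr-master-3.1.3.pom",
]


def orientdb_style(pattern: str, path: str) -> bool:
    # Whole-name match against the asset name without its leading slash.
    return re.fullmatch(pattern, path.lstrip("/")) is not None


def postgres_style(pattern: str, path: str) -> bool:
    # The expression may match any part of the path.
    return re.search(pattern, path) is not None


for asset in ASSETS:
    print(asset)
    print("  OrientDB-style match:  ", orientdb_style(PATTERN, asset))
    print("  PostgreSQL-style match:", postgres_style(PATTERN, asset))
```

A policy migrated unchanged would therefore start selecting the second asset as well, which is how a legacy pattern can remove more than expected.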

Expression Examples

Each example below shows an asset name matcher, the assets in a repository, and the assets that remain after a cleanup policy using that matcher has run.

  • Components in a version range
    /hello/-/hello-0.0.[1-2].tgz
    
    Repository
     /hello/-/hello-0.0.1.tgz
     /hello/-/hello-0.0.2.tgz
     /hello/-/hello-0.0.3.tgz
    
    Remaining
     /hello/-/hello-0.0.3.tgz
  • Components with a specific path
    /(org|com)/.*
    
    Repository
     /org/example/test.jar
     /com/example/test.jar
     /test/example/test.jar
    
    Remaining
     /test/example/test.jar
    
  • Components not belonging to a specific team
    /org/sonatype/(?!team2/).*
    
    Repository
     /org/sonatype/team1/ui/5.0/ui-5.0.jar
     /org/sonatype/team2/format/1.0/format-1.0.jar
     /org/sonatype/team3/database/10.0/database-10.0.jar
    
    Remaining
     /org/sonatype/team2/format/1.0/format-1.0.jar
    
  • A specific component
    /pool/main/z/zsh/zsh-common_5.4.2-3ubuntu3_all.deb
    
    Repository
     /pool/main/libc/libcap2/libcap2_2.25-1.2_amd64.deb
     /pool/main/z/zsh/zsh_5.4.2-3ubuntu3_amd64.deb
     /pool/main/z/zsh/zsh-common_5.4.2-3ubuntu3_all.deb
    
    Remaining
     /pool/main/libc/libcap2/libcap2_2.25-1.2_amd64.deb
     /pool/main/z/zsh/zsh_5.4.2-3ubuntu3_amd64.deb
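
Matchers like these can be tried out offline before they go into a policy. The following sketch checks three of the examples above, using Python's `re.search` to approximate the "match any part of the path" behavior described for PostgreSQL and H2; it ignores every other criterion a real policy would apply. `remaining` is a hypothetical helper.

```python
import re


def remaining(pattern: str, assets: list[str]) -> list[str]:
    """Return the assets that the matcher does not select for removal."""
    matcher = re.compile(pattern)
    return [asset for asset in assets if not matcher.search(asset)]


EXAMPLES = {
    "/hello/-/hello-0.0.[1-2].tgz": [
        "/hello/-/hello-0.0.1.tgz",
        "/hello/-/hello-0.0.2.tgz",
        "/hello/-/hello-0.0.3.tgz",
    ],
    "/(org|com)/.*": [
        "/org/example/test.jar",
        "/com/example/test.jar",
        "/test/example/test.jar",
    ],
    "/pool/main/z/zsh/zsh-common_5.4.2-3ubuntu3_all.deb": [
        "/pool/main/libc/libcap2/libcap2_2.25-1.2_amd64.deb",
        "/pool/main/z/zsh/zsh_5.4.2-3ubuntu3_amd64.deb",
        "/pool/main/z/zsh/zsh-common_5.4.2-3ubuntu3_all.deb",
    ],
}

for pattern, assets in EXAMPLES.items():
    print(pattern)
    for asset in remaining(pattern, assets):
        print("  remaining:", asset)
```

Because the unescaped dots in these patterns match any character, a stricter matcher would escape them (for example `0\.0\.[1-2]\.tgz`).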