Storage Guide

Introduction

A binary large object (blob) store provides object storage for components and assets. Each blob store can be used by one or many repositories or repository groups. By default, Nexus Repository automatically creates a file blob store named "default" in the directory during installation. You can configure new blob stores by navigating to Administration → Repository → Blob Stores in Nexus Repository. Once you've created a blob store, many of its attributes are immutable, so plan your blob store configuration carefully.

Learn the basics of storage management in our technical guide.

Terminology

  • Binary large object (Blob) - An object containing data (e.g., component binaries and metadata files) within a blob store.
  • Blob Store - An internal storage mechanism for the binary parts of components and their assets.
  • Concrete Blob Store - Any blob store that is not a group.
  • Fill Policy PRO - For a group blob store, the fill policy determines to which blob store a blob is written.
  • Group Blob Store PRO - A blob store that delegates operations to one of the other blob stores on its list.
  • Hard Delete - When a soft-deleted blob is permanently removed from the blob store (usually via the Admin - Compact blob store task).
  • Promote to Group PRO - A process by which a concrete blob store becomes a group blob store that can access the original concrete blob store's blobs.
  • Soft Delete - When a blob is marked for deletion from a blob store but still exists within the blob store. This is enabled by default to protect against accidental deletion.
  • Soft Quota - A feature that monitors a blob store and raises an alert when a specified metric exceeds a constraint. If a monitoring check fails, then the blob store is considered unhealthy. Writes still proceed, but a warning is logged. Types of soft quotas are as follows:
    • Space Limit - A byte limit (i.e., Quota Limit in MB) compared against specific blob store metrics by a soft quota.
    • Space Used Quota - This type of quota is violated when the total size of the blob store exceeds the space limit.
    • Space Remaining Quota - This type of quota is violated when the available space of a blob store falls below the space limit.
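
The two quota types differ only in which metric is compared against the space limit. The following is a minimal sketch of that comparison, not Nexus Repository's implementation. The class and function names are hypothetical, and the guide does not say whether "MB" means 10^6 or 2^20 bytes, so the value of MB here is an assumption.

```python
from dataclasses import dataclass

# Assumption: 1 MB = 2**20 bytes. The guide does not specify the unit precisely.
MB = 1024 * 1024


@dataclass
class BlobStoreMetrics:
    """Hypothetical container for the two metrics a soft quota can inspect."""
    total_size_bytes: int        # corresponds to "Total Size"
    available_space_bytes: int   # corresponds to "Available Space"


def space_used_violated(metrics: BlobStoreMetrics, limit_mb: int) -> bool:
    """Space Used Quota: violated when the total size exceeds the limit."""
    return metrics.total_size_bytes > limit_mb * MB


def space_remaining_violated(metrics: BlobStoreMetrics, limit_mb: int) -> bool:
    """Space Remaining Quota: violated when available space falls below the limit."""
    return metrics.available_space_bytes < limit_mb * MB


metrics = BlobStoreMetrics(total_size_bytes=900 * MB, available_space_bytes=100 * MB)
print(space_used_violated(metrics, limit_mb=800))      # True: 900 MB used is above the 800 MB limit
print(space_remaining_violated(metrics, limit_mb=50))  # False: 100 MB free is not below the 50 MB limit
```

In both cases a violation only marks the blob store as unhealthy and logs a warning. Writes are not blocked.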

Fields

The following fields appear in the blob store listing on the Administration → Repository → Blob Stores page in Nexus Repository:

  • Name - The blob store's name.

  • Type - The type of blob store implementation.

    • Azure Cloud Storage PRO - Stores blobs in Azure cloud storage. 

    • File - Stores blobs in file system-based storage.

    • Group PRO - Combines multiple blob stores into one. 

    • S3 - Stores blobs in AWS S3 cloud storage.

  • State - Whether the blob store is started or not.
  • Blob Count - The number of blobs currently stored in a blob store.
  • Total Size - The blob store's total size in bytes.
  • Available Space - A blob store's remaining storage capacity.

Planning

If you have a High Availability cluster, then you should also refer to Configuring Blob Stores.


You will need to choose which types to use, how many blob stores to create, and how you allocate repositories to these blob stores. These decisions should be based on:

  • the size of your repositories
  • the rate at which you expect them to grow over time
  • the storage space available
  • the options you have available for adding storage space

Blob Store Layouts

Once a repository is allocated to a blob store, it can take significant time to run the tasks necessary to change this. Blob stores cannot be split but they can be moved from one storage device to another using a process described below. For these reasons, your approach to using blob stores should be chosen carefully.

Single Blob Store

The simplest approach is to create a single blob store per storage device and divide your repositories among them. This is suitable in the following cases:

  • You will be able to move blob stores to larger storage devices if you exceed available storage.
  • Your repositories are growing slowly enough that you will not exceed your available storage within a year.

Multiple Blob Stores

If you need to implement multiple blob stores in your configuration, do so with caution. While Nexus Repository can handle many blob store configurations, performance issues can occur in some scenarios: 

  • Repositories that are grouped together or are a part of a shared build pipeline (e.g., staging) benefit from being on a single blob store. If you use multiple blob stores in this scenario, components must be copied across multiple blob stores instead of just updating a database.
  • Because cleanup tasks are configured and serialized against each blob store, having many of them can cause lagging.
  • The remaining disk space calculations can be inaccurate when multiple blob stores are located on the same disk.
  • Having many blob stores may negatively impact database search efficiency.

When creating blob stores, focus on the utility gained versus the cost associated with managing multiple blob stores. Here are some best practices and considerations:

  • Consider separating blob stores by format as they are often on different build pipelines or repository stages.
  • You might split out Docker repositories since they can grow very quickly.
  • Consider splitting out more critical components (hosted vs. proxy/replicated repositories). Nexus Repository does not support partial backup restoration, so you may benefit from keeping more actively developed components on a separate blob store from archived components. You can also put priority components on faster disks while archived ones go on slower, cheaper storage.
  • Only split by teams or lines of business when the teams will never overlap or share components and when there is a reasonably fixed number of teams.
  • Only use automation for initial blob store provisioning, not for mass blob store creation.
  • Don't partition blobs per repository. This will cause performance impacts at scale.

Group Blob Store 

A group blob store combines concrete blob stores so that they act as a single blob store for repositories. A concrete blob store can be used either directly by repositories or as a member of a single group, but not both.

Group blob stores are a good choice if you meet the following criteria:

  • You need to add more storage space via multiple devices.
  • You need the ability to spread writes and reads across multiple blob stores.
  • You need to mix both disk and cloud storage.

Promoting a Blob Store to a Group 

To promote a blob store to a group, select the Promote to group button. This launches the promotion process to add your blob store to a group for more flexibility. Follow the on-screen prompts to create the blob store group, which will contain the previously concrete blob store and to which you can add other blob stores.

You cannot undo promoting a blob store to a group.


What is a Fill Policy?

When configuring a blob store group, you will be asked to select a fill policy (i.e., a write policy). A fill policy is the method that the blob store group uses to choose a member for writing blobs. You can change the fill policy at any time.

Available fill policy choices include the following:

  • Round Robin - Incoming writes alternate between all blob stores in the group. If writing to one blob store fails, then it attempts again with the next blob store in the group. This is useful when you have a number of blob stores available and you want each of them to receive a roughly equal number of writes. This does not balance based upon any other metric.
  • Write to First - All incoming writes are given to the first writeable blob store (skipping blob stores in a read-only state). If you need to direct all blobs to a specific blob store (e.g., you have a blob store for a new empty disk), then this fill policy will ensure that the new blob store gets all the writes.
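
The two fill policies can be sketched as follows. This is an illustration of the selection rules described above, not Nexus Repository's code. The Member and RoundRobin names are hypothetical, and a failed write is modeled here simply as a member that is not writable.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Member:
    """Hypothetical stand-in for a concrete blob store in a group."""
    name: str
    writable: bool = True


class RoundRobin:
    """Alternate writes across members, skipping to the next one on failure."""

    def __init__(self, members: List[Member]) -> None:
        self.members = members
        self._next = 0  # index of the member that should receive the next write

    def choose(self) -> Optional[Member]:
        for offset in range(len(self.members)):
            index = (self._next + offset) % len(self.members)
            candidate = self.members[index]
            if candidate.writable:
                # Advance past the chosen member so writes keep alternating.
                self._next = (index + 1) % len(self.members)
                return candidate
        return None  # no member can accept the write


def write_to_first(members: List[Member]) -> Optional[Member]:
    """Return the first writable member, skipping read-only ones."""
    for member in members:
        if member.writable:
            return member
    return None


group = [Member("old-disk", writable=False), Member("new-disk"), Member("s3-archive")]

policy = RoundRobin(group)
print([policy.choose().name for _ in range(4)])
# ['new-disk', 's3-archive', 'new-disk', 's3-archive']

print(write_to_first(group).name)  # new-disk
```

Note that the round-robin sketch balances only the number of writes. As stated above, it does not consider blob size or free space.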

Removing a Blob Store from a Group

Groups allow you to add members dynamically, but removing a member requires running the Admin - Remove a member from a blob store group task. The task ensures that repositories still have access to their blobs after the member is removed.

Moving a Blob Store

The following steps can be used to move a blob store to a new location. You can also use these steps to change a blob store's storage (e.g., moving a file-based blob store to S3).

  1. Create a new blob store with the storage path set to the new location.
  2. Promote the existing blob store to a group.
  3. When asked in the form, set the new group's Fill Policy to Write to First.
  4. Add the new blob store that you created in step 1 to the newly promoted group blob store, underneath the original blob store.
  5. Schedule and run an Admin - Remove a member from a blob store group task via Administration → System → Tasks to remove the original blob store from the group.

The original blob store's contents will be moved over to the new blob store before the original blob store is removed. 

Estimating Blob Store Storage Requirements

Blob stores contain two files for each binary component stored in Nexus Repository:

  • The binary component (stored as a .bytes file whose size is the same as the component).
  • A properties file that stores a small amount of metadata for disaster recovery purposes (<1KB).

The total blob store storage size is therefore approximately the total size of all of your components plus an allowance for the properties files and the block size of your storage device.

On average, the allowance will be 1.5 * # of components * block size.
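
As a worked example of the formula above, the following sketch estimates the storage needed for a given set of components. The function name and the sample figures (500 GiB of components, one million components, a 4096-byte block size) are illustrative assumptions, not measurements.

```python
GIB = 1024 ** 3


def estimate_blob_store_bytes(total_component_bytes: int,
                              component_count: int,
                              block_size_bytes: int) -> int:
    """Total component size plus the allowance of 1.5 * components * block size."""
    # 3 * n * b // 2 is the same as 1.5 * n * b, kept in integer arithmetic.
    allowance = 3 * component_count * block_size_bytes // 2
    return total_component_bytes + allowance


estimate = estimate_blob_store_bytes(
    total_component_bytes=500 * GIB,
    component_count=1_000_000,
    block_size_bytes=4096,
)
print(f"{estimate / GIB:.2f} GiB")  # 505.72 GiB
```

In this example the allowance adds roughly 5.7 GiB on top of the 500 GiB of component data. The overhead grows with the number of components rather than with their size, so repositories with many small components are affected most.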

Blob Store Types

Nexus Repository supports several types of blob stores. File blob store is the default and is recommended for most installations.

File Blob Store

A file blob store lets Nexus Repository store blobs as files in a directory. The Path parameter supplied during blob store creation determines the blob files' location. The Path can be on either a local disk or an NFSv4-compatible mount. See Configuring Blob Stores for more in-depth information about configuring storage services for use with blob stores.

Nexus Repository does not support the following file system types:

  • NFS versions 3 and older
  • GlusterFS
  • FUSE-based user-space file systems

S3 Blob Store

An S3 blob store saves blobs as objects within a bucket on AWS S3.

Requirements

  • The S3 blob store is only recommended for Nexus Repository installations hosted in AWS.
  • The S3 blob store should be in the same AWS region as the Nexus Repository installation. Using different regions will result in unacceptably slow performance.
  • Nexus Repository only supports AWS S3. We neither test nor support other S3 protocol implementations.

Nexus Repository automatically creates an S3 bucket when you create a blob store if one does not already exist; however, you may also create the bucket yourself. Note the following:

  • Nexus Repository will automatically apply a lifecycle rule to expire deleted content.
  • The bucket can use server-side encryption with KMS key management transparently. Nexus Repository does not support other server-side encryption methods.
  • If running on EC2 instances, Nexus Repository can use the IAM role assigned to those instances when accessing S3 buckets.

Carefully consider whether S3 is the right storage solution for you. Performance is highly dependent on the speed of the network between Nexus Repository and the AWS endpoint to which you connect. Nexus Repository will send multiple outbound HTTP requests to AWS to store blobs into S3; large blobs are split into chunks over multiple requests. If your Nexus Repository instance is not in AWS or connects to another region, an S3 blob store may be significantly slower than a file-based blob store.

For optimum performance:

  • Run Nexus Repository on AWS on EC2 instances.
  • Ensure that the S3 connection is using the region in which Nexus Repository is run.

The chunk size when uploading to S3 can be adjusted by setting the property nexus.s3.multipartupload.chunksize in the nexus.properties file. The unit is bytes and the default is 5242880 (5MB). This can be tuned for optimal performance on your network.
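
To get a feel for how the chunk size affects the number of upload requests, the sketch below counts the chunks needed for a blob of a given size. It assumes one request per chunk and ignores any additional requests used to start or complete an upload. The function name is hypothetical, and the 16 MB value is only an example, not a recommendation.

```python
DEFAULT_CHUNK_SIZE = 5_242_880  # default from this guide, in bytes (5 MB)


def chunk_count(blob_size_bytes: int, chunk_size_bytes: int = DEFAULT_CHUNK_SIZE) -> int:
    """Number of chunks needed to upload a blob, rounding up to whole chunks."""
    if chunk_size_bytes <= 0:
        raise ValueError("chunk size must be positive")
    # Ceiling division without floats; even an empty blob needs one request.
    return max(1, -(-blob_size_bytes // chunk_size_bytes))


one_gib = 1024 ** 3
print(chunk_count(one_gib))                    # 205 chunks at the default size
print(chunk_count(one_gib, 16 * 1024 * 1024))  # 64 chunks with a 16 MB chunk size

# The second case corresponds to this line in nexus.properties:
#   nexus.s3.multipartupload.chunksize=16777216
```

Larger chunks mean fewer round trips per blob, which may help on high-latency links, at the cost of more data to resend if a single request fails.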

Using Replicated Blob Stores for Recovery or Testing 

NEW IN 3.31.0 PRO

Should you need to use a replica of a production S3 blob store for recovery or testing purposes, you will also need a copy of the Nexus Repository database that corresponds to the blob store. This database will contain references to the production blob store. Using the database copy unmodified will result in unintended modifications to the production S3 blob store. For this reason, Nexus Repository provides a mechanism to override specific blob store attributes via an environment variable (NEXUS_BLOB_STORE_OVERRIDE) during Nexus Repository startup. Using this mechanism, you can override the S3 blob store bucket name attribute to point to the replica to avoid unintended modification of the production blob store.

For example, suppose you have a production S3 blob store named nxrm-blob-store that is associated with an S3 bucket named nxrm-bucket-prod. You have set up replication of this bucket to another bucket named nxrm-bucket-stage for testing purposes. In the testing environment, you can restore a backup or snapshot of the production Nexus Repository database and use the following environment variable when starting Nexus Repository to update the S3 blob store bucket name in the staging environment to point to nxrm-bucket-stage instead of nxrm-bucket-prod:

NEXUS_BLOB_STORE_OVERRIDE='{"nxrm-blob-store": {"s3": {"bucket": "nxrm-bucket-stage"}}}'

The NEXUS_BLOB_STORE_OVERRIDE environment variable is expected to contain a JSON representation of a map for which the key is the name of the blob store you are modifying. The value is another map with the same structure as the blob store attributes in the Nexus Repository database. In the case of an S3 blob store, the attributes map key is expected to be "s3" and the value associated with that key is a map of attributes that you wish to override (in our example, it would be "bucket").

Nexus Repository will only attempt blob store overrides where the blob store name in the environment variable matches an existing blob store in the Nexus Repository database. When there is a matching blob store name, Nexus Repository will only make a modification when the attribute value provided is different than the existing value in the Nexus Repository database.
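
The matching rules in the previous paragraph can be illustrated with plain dictionaries. This is a sketch of the described behavior only: the apply_overrides function and the in-memory stored dictionary are hypothetical stand-ins, and the actual database layout and implementation in Nexus Repository may differ.

```python
from typing import Dict, List, Tuple

Attributes = Dict[str, Dict[str, str]]


def apply_overrides(stored: Dict[str, Attributes],
                    overrides: Dict[str, Attributes]) -> List[Tuple[str, str, str]]:
    """Apply overrides in place and return the (store, section, key) entries changed."""
    changed = []
    for store_name, sections in overrides.items():
        if store_name not in stored:
            continue  # no blob store with this name exists: the override is ignored
        for section, values in sections.items():
            current = stored[store_name].setdefault(section, {})
            for key, value in values.items():
                if current.get(key) != value:  # only modify when the value differs
                    current[key] = value
                    changed.append((store_name, section, key))
    return changed


# Hypothetical stored configuration restored from a production backup.
stored = {"nxrm-blob-store": {"s3": {"bucket": "nxrm-bucket-prod", "region": "us-east-2"}}}

overrides = {
    "nxrm-blob-store": {"s3": {"bucket": "nxrm-bucket-stage", "region": "us-east-2"}},
    "unknown-store": {"s3": {"bucket": "ignored"}},
}

print(apply_overrides(stored, overrides))
# [('nxrm-blob-store', 's3', 'bucket')]
print(stored["nxrm-blob-store"]["s3"]["bucket"])  # nxrm-bucket-stage
```

Only the bucket changes here: the region already matches the stored value, and the override for the unknown blob store name is skipped.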

It is important to note that the blob store override environment variable only changes blob store configuration in the Nexus Repository database and does not modify the referenced underlying blob store in any way. It is up to you to ensure that the new S3 bucket in this case contains blob store files that correspond to the Nexus Repository database being used (in this case, a replica of the production S3 bucket). The blob store override environment variable will not do any sort of copying of information from the existing production S3 bucket to the staging S3 bucket.

You can use NEXUS_BLOB_STORE_OVERRIDE to modify several blob stores:

NEXUS_BLOB_STORE_OVERRIDE='{"blob-store-1": {"s3": {...}}, "blob-store-2": {"s3": {...}}, "blob-store-3": {"s3": {...}}}'

You can also modify several attributes for each blob store:

NEXUS_BLOB_STORE_OVERRIDE='{"nxrm-blob-store": {"s3": {"bucket": "nxrm-bucket-stage", "region": "us-east-2"}}}'
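
Because the value is nested JSON inside a shell string, it is easy to misplace a quote or brace when writing it by hand. One option is to generate the value from a dictionary. The sketch below uses only the Python standard library and reuses the blob store and bucket names from the examples above. How you pass the resulting variable to your Nexus Repository process depends on your deployment.

```python
import json
import shlex

# Blob store name -> attribute group ("s3") -> attributes to override.
overrides = {
    "nxrm-blob-store": {
        "s3": {"bucket": "nxrm-bucket-stage", "region": "us-east-2"},
    },
}

value = json.dumps(overrides)

# shlex.quote wraps the JSON in single quotes so a POSIX shell passes it through unchanged.
print(f"NEXUS_BLOB_STORE_OVERRIDE={shlex.quote(value)}")
# NEXUS_BLOB_STORE_OVERRIDE='{"nxrm-blob-store": {"s3": {"bucket": "nxrm-bucket-stage", "region": "us-east-2"}}}'
```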

The attributes available for override are any attributes that are defined for the blob store configuration in the Nexus Repository database. For S3 blob stores, the most common attributes are as follows: 

  • "bucket"
  • "region"
  • "prefix"
  • "accessKeyId"
  • "assumeRole" 
  • "sessionToken"

Azure Blob Store 

NEW IN 3.31.0 PRO

An Azure blob store saves blobs as objects within a storage account container on Microsoft Azure.

Requirements

  • The Azure blob store is only recommended for Nexus Repository installations hosted in Azure.
  • The Azure blob store should be in the same Azure region as the Nexus Repository installation. Using different regions will result in unacceptably slow performance.

You must create the Azure storage account in Azure before using Nexus Repository to create an Azure blob store. Below are the recommended storage account settings:

  • Location: the location hosting Nexus Repository
  • Performance: Standard or Premium
  • Account kind: StorageV2
  • Replication: Any

Nexus Repository will automatically create an Azure container when a blob store is created if one does not already exist.

The Azure storage container name must be a valid DNS name that follows the rules that Microsoft states in its documentation.

There are two methods of gaining access to the Azure storage account from Nexus Repository:

  1. Use a secret access key supplied by the Azure storage account.
  2. If you're running Nexus Repository on an Azure VM, you can use System Managed Identity access.

System Managed Identity allows Azure to manage the access via roles assigned to the VM in which you are running Nexus Repository. See the Microsoft documentation for details.

To properly use the System Managed Identity, the Azure VM will need the following roles assigned to the Azure storage container:

  • Storage Account Contributor
  • Storage Blob Data Contributor


Nexus Repository does not validate that the proper roles are assigned before storing the configuration. If the aforementioned roles are not properly granted to the VM, you will need to delete the blob store and then add it again after the roles have been set up in the Azure storage instance.

Carefully consider whether Azure is the right storage solution for you. Performance is highly dependent on the network speed between Nexus Repository and Azure. Nexus Repository will send multiple outbound HTTP requests to Azure to store blobs in the storage account; large blobs are split into chunks over multiple requests. If your Nexus Repository instance is not in Azure or connects to another location, an Azure blob store may be significantly slower than a file-based blob store.

For optimum performance, you'll want to take the following steps:

  • Run Nexus Repository on Azure on virtual machines.
  • Ensure that the Azure connection is using the location where Nexus Repository is being run.

The chunk size when uploading to Azure can be adjusted by setting the property nexus.azure.blocksize in the nexus.properties file (e.g., nexus.azure.blocksize=1000000). By default, this is set to 5242880 bytes (5MB). You can tune this for optimal performance on your network.