Change Repository Blob Store Task Performance Testing

Overview

The Admin - Change repository blob store task allows you to change a repository's blob store source. This can be helpful for moving from local volume blob storage to AWS S3, from one S3 bucket to another, etc.

Note

In release 3.45.0, we temporarily disabled the Admin - Change repository blob store task to address a number of important issues.

In release 3.58.0, we re-enabled this task for PostgreSQL and H2 deployments only. You should only use this task if you are on release 3.58.0 or newer.

Due to known issues with OrientDB, the task remains disabled for OrientDB deployments while we further investigate these issues.
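The release and database conditions above can be expressed as a small pre-flight check. The following is a minimal sketch, not part of Sonatype Nexus Repository itself: the function names, the lowercase database labels, and the `repo_type` parameter are hypothetical, and the 3.65.0 threshold for group repositories comes from the Disclaimers section of this page.

```python
# Hypothetical helper: encodes the release notes on this page, nothing more.
SUPPORTED_DATABASES = {"postgresql", "h2"}   # OrientDB remains disabled
MIN_VERSION = (3, 58, 0)                     # task re-enabled in 3.58.0
MIN_GROUP_VERSION = (3, 65, 0)               # group repository support added in 3.65.0


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '3.58.0' (or '3.58.0-01') into (3, 58, 0) for numeric comparison."""
    core = version.split("-")[0]
    return tuple(int(part) for part in core.split("."))


def task_is_recommended(version: str, database: str, repo_type: str = "hosted") -> bool:
    """Return True only when the release, database, and repository type all qualify."""
    parsed = parse_version(version)
    if database.lower() not in SUPPORTED_DATABASES:
        return False
    if parsed < MIN_VERSION:
        return False
    if repo_type == "group" and parsed < MIN_GROUP_VERSION:
        return False
    return True


if __name__ == "__main__":
    print(task_is_recommended("3.58.0", "PostgreSQL"))        # hosted/proxy on a supported release
    print(task_is_recommended("3.60.0", "h2", "group"))       # group support arrived later
    print(task_is_recommended("3.70.0", "orientdb"))          # OrientDB is not supported
```

Comparing versions as integer tuples avoids the string-comparison pitfall where "3.9.0" would sort after "3.58.0".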

Disclaimers

  • Testing for hosted and proxy repositories took place separately and on a different Sonatype Nexus Repository version from testing for group repositories. Support for group repositories was not added until Sonatype Nexus Repository version 3.65.0.

  • Sonatype only tested specific scenarios and did not test against any special conditions or edge cases. While there may be other scenarios in which you would use this task, we have not tested those scenarios; your results may be very different from what we report here.

  • Sonatype tested and validated this task only in specific scenarios, from both a functional and a performance perspective. The functional validation covers only the basic path under expected conditions, so this testing should not be considered comprehensive functional testing.

  • The results displayed below should only be used as a reference as they come from testing in controlled scenarios with conditions that likely vary from your specific environment.

  • Due to the time required for the task to complete, some information in this report comes from statistical calculations instead of literal results; the calculations are based on actual result samples.

Performance Testing for Hosted and Proxy Repositories

For full transparency and to help you plan for using the Admin - Change repository blob store task, we are providing performance testing information for the following common scenarios:

  1. Changing a blob store from a local volume blob store to an AWS S3 blob store

  2. Changing a blob store from one AWS S3 blob store to another S3 blob store

  3. Changing a blob store from one local volume file blob store to another local volume file blob store in a different volume

Key Takeaways

  • Successful results were only achieved for Sonatype Nexus Repository deployments using PostgreSQL and H2 databases. There is a known and proven issue when using OrientDB that results in database failure when using this task to move anything more than 10GB. This is not due to an issue with the task but rather due to the number of database changes that must be made. We do not recommend using this task to transfer anything more than 10GB in OrientDB deployments; doing so may result in database corruption.

  • The following list illustrates scenario performance from best to worst:

    1. local volume to local volume

    2. local volume to S3 bucket

    3. S3 bucket to local volume

    4. S3 bucket to S3 bucket

  • The Admin - Change repository blob store task takes significant time to complete; plan accordingly before using the task.

Test Environment

  • The test environment was constructed as follows:

    • m5d.2xlarge EC2 nodes

    • 16 GB maximum heap size for the JVM

    • local storage based on NVMe volumes

    • default network bandwidth for this EC2 node type

    • db.r5.xlarge RDS node type for PostgreSQL runs

  • Testing used Raw format repositories.

    • As this feature does not directly relate to repository format, we expect that this approach reflects general behavior regardless of repository formats contained in the blob store.

Summary of Findings

| Scenario | Database | Time to move 10 GB | Time to move 100 GB | Time to move 1000 GB |
|---|---|---|---|---|
| Local Volume to Local Volume | PostgreSQL | +7 minutes | +2 hours | +11 hours, 30 minutes |
| | H2 | +17 minutes | +3 hours | +30 hours |
| | OrientDB | Not supported | Not supported | Not supported |
| Local Volume to S3 Bucket | PostgreSQL | ~4 hours | ~40 hours | +2 weeks |
| | H2 | ~4 hours, 20 minutes | +60 hours | +3 weeks |
| | OrientDB | Not supported | Not supported | Not supported |
| S3 Bucket to Local Volume | PostgreSQL | ~6 hours, 40 minutes | +2 days, 12 hours | +2 weeks, 5 days |
| | H2 | +8 hours | +3 days, 6 hours | +1 month |
| | OrientDB | Not supported | Not supported | Not supported |
| S3 Bucket to S3 Bucket | PostgreSQL | +10 hours, 30 minutes | +5 days | +1 month and a half |
| | H2 | +11 hours, 50 minutes | +5 days | +1 month, 3 weeks |
| | OrientDB | Not supported | Not supported | Not supported |
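To put these durations in perspective, it can help to convert them into an implied transfer rate. The sketch below does this for the PostgreSQL figures at 10 GB and 100 GB. It rests on several assumptions that are not stated in the source: 1 GB is treated as 1000 MB, the "+" and "~" markers are dropped (reading "+" as "more than"), and the dictionary and function names are purely illustrative.

```python
# Durations in minutes for PostgreSQL, copied from the summary above with the
# "+" / "~" markers dropped. Assumption: "+7 minutes" is treated as 7 minutes.
POSTGRESQL_MINUTES = {
    "Local Volume to Local Volume": {10: 7, 100: 2 * 60},
    "Local Volume to S3 Bucket": {10: 4 * 60, 100: 40 * 60},
    "S3 Bucket to Local Volume": {10: 6 * 60 + 40, 100: (2 * 24 + 12) * 60},
    "S3 Bucket to S3 Bucket": {10: 10 * 60 + 30, 100: 5 * 24 * 60},
}


def throughput_mb_per_s(gigabytes: float, minutes: float) -> float:
    """Implied average rate in MB/s, assuming 1 GB = 1000 MB."""
    return (gigabytes * 1000) / (minutes * 60)


def summarize(table: dict) -> dict:
    """Map each scenario to {size_in_gb: implied MB/s rounded to 2 decimals}."""
    return {
        scenario: {
            size: round(throughput_mb_per_s(size, minutes), 2)
            for size, minutes in sizes.items()
        }
        for scenario, sizes in table.items()
    }


if __name__ == "__main__":
    for scenario, rates in summarize(POSTGRESQL_MINUTES).items():
        for size, rate in rates.items():
            print(f"{scenario}, {size} GB: about {rate} MB/s")
```

If the "+" figures are indeed lower bounds on duration, the rates computed this way are upper bounds. They describe the tested environment only and are not a prediction for other deployments.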

Charts and Discussion for Scenarios When Moving 10GB

Local Volume to AWS S3

While OrientDB saw the best performance in this scenario, it is important to note that OrientDB is proven to fail with repositories exceeding 10GB. Therefore, those using OrientDB should not attempt to use this task on repositories exceeding 10GB.

PostgreSQL performed similarly, and H2 was the slowest.

[Chart: 168755403.png]

S3 to Local Volume

PostgreSQL performed best in this scenario, and its results differed significantly from those of H2 and OrientDB.

[Chart: 168755404.png]

Local Volume to Local Volume

As anticipated, this scenario was the fastest for all databases. Please note that the scale below is in minutes rather than hours like the other charts.

While OrientDB experienced the best performance, we once again reiterate that OrientDB is proven to fail with repositories exceeding 10GB.

PostgreSQL performance was close to that of OrientDB. H2 took more than twice as long to complete the task.

[Chart: 168755405.png]

S3 to S3

AWS S3 to S3 was the slowest scenario observed, although it is expected to be one of the most stable due to the high availability of S3 buckets. Customers should still account for the significant time the task takes to complete.

[Chart: 168755406.png]

Performance Testing for Group Repositories

When run on a group repository, the task moves only the metadata assets directly related to the group repository's content ID; it does not move member repository content.

All test runs used S3 blob stores as both the source and destination.

The tested Sonatype Nexus Repository instance was using a PostgreSQL database.

Results Summary

| Number of Blobs Moved | Time Elapsed |
|---|---|
| 7,958 | 58 minutes, 14 seconds |
| 158 | 1 minute, 36 seconds |
| 8,720 | 55 minutes, 10 seconds |
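The three group repository runs above can be compared by converting each to an average rate of blobs moved per second. This is a minimal arithmetic sketch using only the figures reported here; the function and variable names are illustrative, and the rates say nothing about blob sizes, which the source does not report.

```python
# (blobs moved, minutes, seconds) for each reported group repository run.
GROUP_RUNS = [
    (7958, 58, 14),
    (158, 1, 36),
    (8720, 55, 10),
]


def blobs_per_second(blobs: int, minutes: int, seconds: int) -> float:
    """Average number of blobs moved per second over the whole run."""
    elapsed_seconds = minutes * 60 + seconds
    return blobs / elapsed_seconds


if __name__ == "__main__":
    for blobs, minutes, seconds in GROUP_RUNS:
        rate = blobs_per_second(blobs, minutes, seconds)
        print(f"{blobs} blobs in {minutes}m {seconds}s: about {rate:.2f} blobs/s")
```

The rates fall between roughly 1.6 and 2.6 blobs per second, which may serve as a rough reference point for S3-to-S3 moves on PostgreSQL in an environment similar to the one tested.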