Change Repository Blob Store Task Performance Testing
Overview
The Admin - Change repository blob store task allows you to change a repository's blob store source. This can be helpful for moving from local volume blob storage to AWS S3, from one S3 bucket to another, etc.
In release 3.45.0, we temporarily disabled the Admin - Change repository blob store task to address a number of important issues.
In release 3.58.0, we re-enabled this task for PostgreSQL and H2 deployments only. You should only use this task if you are on release 3.58.0 or newer.
Due to known issues with OrientDB, the task remains disabled for OrientDB deployments while we further investigate these issues.
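The availability rules above (release 3.58.0 or newer, PostgreSQL or H2 only) can be expressed as a small pre-flight check. The following is a minimal sketch, not part of Sonatype Nexus Repository: the function names, the lowercase database labels, and the assumption that a release string looks like `3.58.0` or `3.58.1-02` are all illustrative.

```python
from typing import Tuple

# Thresholds taken from the notes above; the names themselves are hypothetical.
MIN_RELEASE = (3, 58, 0)
SUPPORTED_DATABASES = {"postgresql", "h2"}


def parse_release(release: str) -> Tuple[int, ...]:
    """Turn '3.58.0' or '3.58.1-02' into a tuple of ints, ignoring any build suffix."""
    core = release.split("-", 1)[0]
    parts = [int(p) for p in core.split(".")]
    # Pad short versions such as '3.58' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def task_is_available(release: str, database: str) -> bool:
    """Return True only for release 3.58.0+ on PostgreSQL or H2."""
    return (
        parse_release(release) >= MIN_RELEASE
        and database.strip().lower() in SUPPORTED_DATABASES
    )
```

For example, `task_is_available("3.57.1-02", "h2")` is false because the release is too old, and `task_is_available("3.60.0", "orientdb")` is false because the task remains disabled for OrientDB.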
For full transparency and to help you plan for using the Admin - Change repository blob store task, we are providing performance testing information for the following common scenarios:
- Changing a blob store from a local volume blob store to an AWS S3 blob store
- Changing a blob store from one AWS S3 blob store to another S3 blob store
- Changing a blob store from one local volume file blob store to another local volume file blob store in a different volume
Disclaimers
- Sonatype only tested the specific scenarios listed in the Overview and did not test against any special conditions or edge cases. While there may be other scenarios in which you would use this task, we have not tested those scenarios; your results may be very different from what we report here.
- Sonatype only tested and validated this task in specific scenarios from a functional and performance perspective. Functional validation covered only the basic path under expected conditions, so this testing should not be considered comprehensive functional testing.
- The results displayed below should only be used as a reference as they come from testing in controlled scenarios with conditions that likely vary from your specific environment.
- Due to the time required for the task to complete, some information in this report comes from statistical calculations instead of literal results; the calculations are based on actual result samples.
Test Environment
The test environment was constructed as follows:
- m5d.2xlarge EC2 nodes
- 16G maximum heap size for the JVM
- local storage based on NVMe volumes
- default network bandwidth for this EC2 node type
- db.r5.xlarge RDS node type for PostgreSQL executions
- Testing used Raw format repositories.
- As this feature does not directly relate to repository format, we expect that this approach reflects general behavior regardless of repository formats contained in the blob store.
Key Takeaways
- Successful results were only achieved for Sonatype Nexus Repository deployments using PostgreSQL and H2 databases. There is a known and proven issue when using OrientDB that results in database failure when using this task to move anything more than 10GB. This is not due to an issue with the task but rather due to the number of database changes that must be made. We do not recommend using this task to transfer anything more than 10GB in OrientDB deployments; doing so may result in database corruption.
- The following list illustrates scenario performance from best to worst:
  - local volume to local volume
  - local volume to S3 bucket
  - S3 bucket to local volume
  - S3 bucket to S3 bucket
- The Admin - Change repository blob store task takes significant time to complete; plan accordingly before using the task.
Summary of Findings
Scenario | Database | Time to move 10 GB | Time to move 100 GB | Time to move 1000 GB |
---|---|---|---|---|
Local Volume to Local Volume | PostgreSQL | +7 minutes | +2 hours | +11 hours, 30 minutes |
Local Volume to Local Volume | H2 | +17 minutes | +3 hours | +30 hours |
Local Volume to Local Volume | OrientDB | Not supported | Not supported | Not supported |
Local Volume to S3 Bucket | PostgreSQL | ~4 hours | ~40 hours | +2 weeks |
Local Volume to S3 Bucket | H2 | ~4 hours, 20 minutes | +60 hours | +3 weeks |
Local Volume to S3 Bucket | OrientDB | Not supported | Not supported | Not supported |
S3 Bucket to Local Volume | PostgreSQL | ~6 hours, 40 minutes | +2 days, 12 hours | +2 weeks, 5 days |
S3 Bucket to Local Volume | H2 | +8 hours | +3 days, 6 hours | +1 month |
S3 Bucket to Local Volume | OrientDB | Not supported | Not supported | Not supported |
S3 Bucket to S3 Bucket | PostgreSQL | +10 hours, 30 minutes | +5 days | +1 month and a half |
S3 Bucket to S3 Bucket | H2 | +11 hours, 50 minutes | +5 days | +1 month, 3 weeks |
S3 Bucket to S3 Bucket | OrientDB | Not supported | Not supported | Not supported |
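For planning, the 10 GB figures in the table can be turned into a rough throughput and scaled to other sizes. The sketch below is an illustration only: it uses the PostgreSQL 10 GB values, treats entries marked `+` or `~` as approximate, and assumes linear scaling, which the table itself does not strictly follow (local volume to local volume, for instance, takes longer at 100 GB than a linear estimate suggests). The function and dictionary names are hypothetical. Where your data size matches a column in the table, prefer the reported value.

```python
# Approximate hours to move 10 GB, taken from the PostgreSQL rows in the table.
REPORTED_HOURS_FOR_10_GB = {
    "local volume to local volume": 7 / 60,
    "local volume to S3 bucket": 4.0,
    "S3 bucket to local volume": 6 + 40 / 60,
    "S3 bucket to S3 bucket": 10.5,
}


def throughput_gb_per_hour(size_gb: float, hours: float) -> float:
    """Effective throughput implied by moving size_gb in the given number of hours."""
    return size_gb / hours


def estimate_hours(scenario: str, size_gb: float) -> float:
    """Rough estimate: scale the reported 10 GB duration linearly (an assumption)."""
    return REPORTED_HOURS_FOR_10_GB[scenario] * size_gb / 10


if __name__ == "__main__":
    for scenario, hours in REPORTED_HOURS_FOR_10_GB.items():
        rate = throughput_gb_per_hour(10, hours)
        print(f"{scenario}: about {rate:.1f} GB/hour, "
              f"250 GB in roughly {estimate_hours(scenario, 250):.0f} hours")
```

Under this assumption, local volume to S3 bucket works out to about 2.5 GB per hour, which is consistent with the ~40 hours reported for 100 GB in that row.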
Charts and Discussion for Scenarios When Moving 10GB
Local Volume to AWS S3
While OrientDB saw the best performance in this scenario, it is important to note that OrientDB is proven to fail with repositories exceeding 10GB. Therefore, those using OrientDB should not attempt to use this task on repositories exceeding 10GB.
PostgreSQL performed similarly, and H2 was the slowest.
S3 to Local Volume
PostgreSQL performed best in this scenario; H2 and OrientDB were significantly slower.
Local Volume to Local Volume
As anticipated, this scenario was the fastest for all databases. Please note that the scale below is in minutes rather than hours like the other charts.
While OrientDB performed best, we reiterate that OrientDB is proven to fail with repositories exceeding 10GB.
PostgreSQL performance was close to that of OrientDB. H2 took more than twice as long to complete the task.
S3 to S3
AWS S3 to S3 was the slowest scenario observed, although it is expected to be one of the most stable due to the high availability of S3 buckets. Customers should still account for the significant time the task will take to complete.