Cleanup Performance Data

Test Focus and Limitations

This testing compared the performance of Java-based and SQL-based cleanup. It also measured the performance impact of retaining specific versions while using SQL-based cleanup, a feature added in release 3.65.0.

As of release 3.65.0, all Sonatype Nexus Repository Pro deployments using PostgreSQL databases will use SQL-based cleanup by default.

Summary of Findings

  • SQL-based cleanup far outperforms Java-based cleanup.

    • The average time per component for SQL-based cleanup was 10.6 ms versus 17.2 ms for Java-based cleanup.

    • For ~3.2 million components, the total process could take more than 10 hours with the SQL-based approach and more than 15 hours with the Java-based approach.

    • Based on this data, we conclude that SQL-based cleanup is approximately 38% faster than Java-based cleanup (results may vary depending on the deployment environment).

  • Neither the SQL- nor the Java-based cleanup test scenarios showed detectable bottlenecks or memory leaks when running the cleanup task.

  • Neither SQL- nor Java-based cleanup test scenarios experienced hardware degradation.

    • SQL-based cleanup saw 10% more CPU utilization versus Java-based cleanup.

    • Other hardware utilization statistics were similar between both approaches.

    • Retaining specific versions did cause a spike in RDS CPU utilization; however, this is expected, as filtering queries require more effort from the database engine than generic queries.

  • When testing the ability to retain specific versions, we saw no performance difference between using the last blob updated criterion and the last downloaded criterion to determine which versions to retain.
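
As a rough illustration of how the per-component averages above translate into total run time, the following sketch projects cleanup duration for a given component count. It assumes cost scales linearly with the number of components, which the test data does not establish; the function name `estimate_cleanup_hours` is hypothetical, and the projections will differ from measured elapsed times because ~3.2 million is a rounded count.

```python
def estimate_cleanup_hours(component_count: int, avg_ms_per_component: float) -> float:
    """Project total cleanup time in hours.

    Assumption: time grows linearly with the number of components.
    """
    total_seconds = component_count * avg_ms_per_component / 1000.0
    return total_seconds / 3600.0


# Averages reported in the summary of findings.
SQL_AVG_MS = 10.6
JAVA_AVG_MS = 17.2

# Rounded component count from the test environment.
components = 3_200_000

sql_hours = estimate_cleanup_hours(components, SQL_AVG_MS)
java_hours = estimate_cleanup_hours(components, JAVA_AVG_MS)

print(f"SQL-based cleanup:  ~{sql_hours:.1f} hours")
print(f"Java-based cleanup: ~{java_hours:.1f} hours")
```

Treat the output as an order-of-magnitude planning aid only; actual duration depends on the deployment environment.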

Test Environment and Setup

The test environment included one Sonatype Nexus Repository deployment using first Java-based cleanup and then SQL-based cleanup.

The tested deployment met the following specifications:

  • 1 Sonatype Nexus Repository node deployed in an AWS m5d.2xlarge EC2 instance (8 vCPU, 32 GB RAM)

    • External Amazon PostgreSQL RDS backed by db.r5.xlarge (4 vCPU, 32 GB RAM)

    • ~3.2 million Maven components in hosted and proxy Maven repositories

In both the Java- and SQL-based cleanup scenarios, the applied cleanup policy covered all components (i.e., all components were flagged for cleanup).

When testing the performance impact of retaining specific versions, we adjusted the policy so that not all components would be identified for cleanup.

Resource Usage Sample

SQL-based and Java-based Cleanup Comparison

Testing concluded that SQL-based cleanup performed approximately 38% faster than Java-based cleanup. Note that this is a calculated value; use it only as a reference.
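
One way to arrive at a figure of roughly 38% is to take the relative reduction in average time per component, using the two averages from the summary of findings (10.6 ms and 17.2 ms). This is a sketch of that arithmetic, not necessarily the exact method used in the test report; the variable names are illustrative.

```python
# Average time per component, as reported in the summary of findings.
java_avg_ms = 17.2
sql_avg_ms = 10.6

# Relative reduction in time per component when moving from Java to SQL.
reduction_pct = (java_avg_ms - sql_avg_ms) / java_avg_ms * 100

# The same result expressed as a throughput ratio.
speedup = java_avg_ms / sql_avg_ms

print(f"Time per component reduced by ~{reduction_pct:.0f}%")
print(f"Equivalent throughput ratio: ~{speedup:.1f}x")
```

Note that a 38% reduction in time per component corresponds to roughly 1.6 times the throughput, not 1.38 times.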

309526704.png

As expected, the average time per component was lower for SQL-based than for Java-based cleanup. There was only minimal difference in hardware utilization.

309526701.png

Testing for Retaining a Specified Version Range

| Item | criteriaLastBlobUpdated | criteriaLastDownloaded |
| --- | --- | --- |
| Average Time per Component, Retain 1 | 2.13 ms | 3.25 ms |
| Average Time per Component, Retain 2 | 4.29 ms | 3.74 ms |
| Average Time per Component, Retain 3 | 4.27 ms | 2.89 ms |

Average Time per Component Using Last Blob Updated Criterion

309526551.png

Average Time per Component Using Last Download Date Criterion

309526548.png

Summaries for Each Test Execution

Java-Based Cleanup

| Test Execution | Elapsed Time | Avg. Time per Component | CPU Avg. Sonatype Nexus Repository Node | CPU Avg. RDS | Observations |
| --- | --- | --- | --- | --- | --- |
| 1 | 16 hours, 15 minutes, and 9 seconds (58,509 seconds) | 17.9 ms | 309526749.png | 309526746.png | Sonatype Nexus Repository CPU utilization was ~2%; RDS CPU utilization ranged from ~10% to ~25%; memory behavior was as expected, with no bottlenecks or memory leaks detected |
| 2 | 15 hours, 31 minutes, and 19 seconds (55,879 seconds) | 17.1 ms | 309526743.png | 309526740.png | Sonatype Nexus Repository CPU utilization was ~1.7%; RDS CPU utilization ranged from ~8% to ~25%; memory behavior was as expected, with no bottlenecks or memory leaks detected |
| 3 | 15 hours, 3 minutes, and 27 seconds (54,207 seconds) | 16.6 ms | 309526737.png | 309526734.png | Sonatype Nexus Repository CPU utilization was ~1.5%; RDS CPU utilization ranged from ~8% to ~28%; memory behavior was as expected, with no bottlenecks or memory leaks detected |
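
The elapsed times and per-component averages in the Java-based runs can be cross-checked against each other. The sketch below converts each elapsed time to seconds and backs out the component count implied by the reported average. The helper name `to_seconds` is illustrative, and the implied counts are an inference from the published figures rather than values stated in the test report.

```python
def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert an elapsed time expressed as h/m/s into total seconds."""
    return hours * 3600 + minutes * 60 + seconds


# (hours, minutes, seconds, reported avg ms per component) for the Java-based runs.
java_runs = [
    (16, 15, 9, 17.9),
    (15, 31, 19, 17.1),
    (15, 3, 27, 16.6),
]

elapsed_seconds = [to_seconds(h, m, s) for h, m, s, _ in java_runs]

# Component count implied by elapsed time divided by time per component.
implied_components = [
    elapsed / (avg_ms / 1000.0)
    for elapsed, (_, _, _, avg_ms) in zip(elapsed_seconds, java_runs)
]

for elapsed, implied in zip(elapsed_seconds, implied_components):
    print(f"{elapsed:,} s elapsed -> ~{implied / 1e6:.2f} million components implied")
```

All three runs imply a count a little above 3.2 million, which is consistent with the "~3.2 million" figure given for the test environment.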

SQL-Based Cleanup

| Test Execution | Elapsed Time | Avg. Time per Component | CPU Avg. Sonatype Nexus Repository Node | CPU Avg. RDS | Observations |
| --- | --- | --- | --- | --- | --- |
| 1 | 7 hours, 57 minutes, and 12 seconds (28,632 seconds) | 8.7 ms | 309526731.png | 309526728.png | Average time per component improved substantially compared to the Java-based approach (from ~17 ms to ~9 ms); Sonatype Nexus Repository CPU utilization was ~1%; for the first 4 hours, RDS CPU utilization ranged from ~20% to ~40%; after the first 4 hours and until the end of the test, RDS CPU utilization ranged from ~10% to ~35%; no memory management concerns detected; this test seemed to be an outlier with better results than the average, so it should be treated as the best result seen rather than the generally expected one |
| 2 | 10 hours, 23 minutes, and 37 seconds (37,417 seconds) | 11.4 ms | 309526725.png | 309526722.png | Sonatype Nexus Repository CPU utilization was ~1%, similar to the previous test; for the first 5 hours, RDS CPU utilization ranged from ~15% to ~30%; after the first 5 hours and until the end of the test, RDS CPU utilization ranged from ~10% to ~35%; no bottlenecks or memory leaks detected; average time per component increased slightly compared to the previous test but was still far lower than the Java-based test average |
| 3 | 10 hours, 22 minutes, and 35 seconds (37,355 seconds) | 11.4 ms | 309526719.png | 309526716.png | Sonatype Nexus Repository CPU utilization was ~1%; RDS CPU utilization followed a pattern similar to previous tests; no bottlenecks or memory leaks detected; average time per component (11.4 ms) was very close to the previous test value |
| 4 | 10 hours, 7 minutes, and 27 seconds (36,447 seconds) | 11.1 ms | 309526713.png | 309526710.png | CPU utilization for both Sonatype Nexus Repository and the RDS followed a similar pattern, confirming the behavior seen in the previous executions; average time per component was 11.1 ms, slightly lower than but still very close to the previous test values |
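
Because the first SQL-based run is described as an outlier, it can be useful to see how much the headline comparison depends on it. The sketch below recomputes the relative improvement with and without that run, using the per-run averages from the Java-based and SQL-based tables. The figure that excludes the outlier is not a result from the test report; it is a sensitivity check derived from the published numbers, and the function name `improvement_pct` is illustrative.

```python
from statistics import mean

# Reported average time per component (ms) for each run.
java_ms = [17.9, 17.1, 16.6]
sql_ms = [8.7, 11.4, 11.4, 11.1]  # the first run is the one flagged as an outlier


def improvement_pct(baseline_ms: float, candidate_ms: float) -> float:
    """Relative reduction in time per component, as a percentage."""
    return (baseline_ms - candidate_ms) / baseline_ms * 100


java_avg = mean(java_ms)

all_runs = improvement_pct(java_avg, mean(sql_ms))
without_outlier = improvement_pct(java_avg, mean(sql_ms[1:]))

print(f"All SQL runs:          ~{all_runs:.0f}% less time per component")
print(f"Excluding first run:   ~{without_outlier:.0f}% less time per component")
```

Even with the best run excluded, SQL-based cleanup remains roughly a third faster per component in this data set.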

SQL-Based Cleanup with Retaining a Specified Version Range

| Test Execution | Elapsed Time | Avg. Time per Component | CPU Avg. Sonatype Nexus Repository Node | CPU Avg. RDS | Observations |
| --- | --- | --- | --- | --- | --- |
| 1 | 33 minutes and 54 seconds (2,034 seconds) | 0.6 ms | 309526698.png | 309526695.png | 876,142 components cleaned based on policy and regex; configured to retain 1 version per component; average time per component was much lower than when performing a complete cleanup |
| 2 | 34 minutes and 27 seconds (2,067 seconds) | 0.6 ms | 309526692.png | 309526689.png | 876,142 components cleaned based on policy and regex; configured to retain 1 version per component; hardware utilization was similar to previous executions; maximum RDS CPU utilization was ~60% |
| 3 | 25 minutes and 13 seconds (1,513 seconds) | 0.4 ms | 309526686.png | 309526683.png | 876,142 components cleaned based on policy and regex; configured to retain 1 version per component; as with previous tests, the RDS consumed the most resources, but no bottlenecks or memory leaks were detected |
| 4 | 55 minutes and 56 seconds (3,356 seconds) | 1.0 ms | 309526680.png | 309526677.png | 649,882 components cleaned based on policy, retain number, and regex; configured to retain 2 versions per component |
| 5 | 58 minutes and 30 seconds (3,510 seconds) | 1.0 ms | 309526674.png | 309526671.png | 649,882 components cleaned based on policy, retain number, and regex; configured to retain 2 versions per component |
| 6 | 25 minutes and 13 seconds (1,513 seconds) | 0.4 ms | 309526668.png | 309526665.png | 649,882 components cleaned based on policy, retain number, and regex; configured to retain 2 versions per component |
| 7 | 27 minutes and 47 seconds (1,667 seconds) | 0.5 ms | 309526662.png | 309526659.png | 540,371 components cleaned based on policy, retain number, and regex; configured to retain 3 versions per component |
| 8 | 57 minutes and 59 seconds (3,479 seconds) | 1.0 ms | 309526656.png | 309526653.png | 540,371 components cleaned based on policy, retain number, and regex; this test was an outlier, but its results are included because they may reflect behavior that could occur in production under similar conditions; configured to retain 3 versions per component |
| 9 | 29 minutes and 49 seconds (1,789 seconds) | 0.5 ms | 309526650.png | 309526647.png | 540,371 components cleaned based on policy, retain number, and regex; configured to retain 3 versions per component |
| 10 | 51 minutes and 50 seconds (3,110 seconds) | 0.9 ms | 309526644.png | 309526641.png | 876,142 components cleaned based on policy, retain number, and regex; configured to retain 1 version per component |
| 11 | 1 hour, 0 minutes, and 1 second (3,601 seconds) | 1.1 ms | 309526638.png | 309526635.png | 876,142 components cleaned based on policy, retain number, and regex; configured to retain 1 version per component |
| 12 | 30 minutes and 50 seconds (1,850 seconds) | 0.5 ms | 309526632.png | 309526629.png | 876,142 components cleaned based on policy, retain number, and regex; configured to retain 1 version per component |
| 13 | 56 minutes and 27 seconds (3,387 seconds) | 1.0 ms | 309526626.png | 309526623.png | 649,882 components cleaned based on policy, retain number, and regex; this seemed to be an outlier with worse performance than the average, although hardware utilization showed no indication of poor resource management; configured to retain 2 versions per component |
| 14 | 30 minutes and 20 seconds (1,820 seconds) | 0.5 ms | 309526620.png | 309526617.png | 649,882 components cleaned based on policy, retain number, and regex; performance was similar to tests 4, 5, and 6 (similar configurations), suggesting that test 13 was an outlier; configured to retain 2 versions per component |
| 15 | 34 minutes and 56 seconds (2,096 seconds) | 0.6 ms | 309526614.png | 309526611.png | 649,882 components cleaned based on policy, retain number, and regex; configured to retain 2 versions per component |
| 16 | 28 minutes and 16 seconds (1,696 seconds) | 0.5 ms | 309526608.png | 309526605.png | 540,371 components cleaned based on policy, retain number, and regex; configured to retain 3 versions per component |
| 17 | 28 minutes and 17 seconds (1,697 seconds) | 0.5 ms | 309526602.png | 309526599.png | 540,371 components cleaned based on policy, retain number, and regex; configured to retain 3 versions per component |
| 18 | 21 minutes and 37 seconds (1,297 seconds) | 0.4 ms | 309526596.png | 309526593.png | 540,371 components cleaned based on policy, retain number, and regex; confirms the behavior seen in tests 16 and 17, with no hardware management problems detected; configured to retain 3 versions per component |
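
The averages in the summary table under "Testing for Retaining a Specified Version Range" appear to be reproducible from the per-run data above: take the mean elapsed time of each group of three runs, divide by the number of components cleaned, and truncate to two decimals. The sketch below shows that calculation. Two points are assumptions rather than statements from the test report: that runs 1 through 9 used criteriaLastBlobUpdated and runs 10 through 18 used criteriaLastDownloaded, and that runs are grouped by their cleaned-component counts. The function names are illustrative.

```python
import math

# Key: (criterion, versions retained).
# Value: (elapsed seconds of the three runs, components cleaned).
# Assumption: runs 1-9 map to criteriaLastBlobUpdated, runs 10-18 to
# criteriaLastDownloaded, grouped by the cleaned-component count.
groups = {
    ("criteriaLastBlobUpdated", 1): ([2034, 2067, 1513], 876_142),
    ("criteriaLastBlobUpdated", 2): ([3356, 3510, 1513], 649_882),
    ("criteriaLastBlobUpdated", 3): ([1667, 3479, 1789], 540_371),
    ("criteriaLastDownloaded", 1): ([3110, 3601, 1850], 876_142),
    ("criteriaLastDownloaded", 2): ([3387, 1820, 2096], 649_882),
    ("criteriaLastDownloaded", 3): ([1696, 1697, 1297], 540_371),
}


def avg_ms_per_cleaned_component(elapsed_seconds, cleaned):
    """Mean elapsed time of the runs, divided by components cleaned, in ms."""
    mean_elapsed = sum(elapsed_seconds) / len(elapsed_seconds)
    return mean_elapsed / cleaned * 1000


def truncate(value, places=2):
    """Drop digits beyond `places` decimals without rounding up."""
    factor = 10 ** places
    return math.floor(value * factor) / factor


summary = {
    key: truncate(avg_ms_per_cleaned_component(elapsed, cleaned))
    for key, (elapsed, cleaned) in groups.items()
}

for (criterion, retain), ms in summary.items():
    print(f"{criterion}, retain {retain}: {ms:.2f} ms")
```

Note that these figures are per cleaned component, whereas the "Avg. Time per Component" column in the per-run table is much smaller and looks to be calculated over the full set of ~3.2 million components.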