Requirements for High Availability
This section covers the specific requirements for high-availability (HA) deployments of Nexus Repository. Review the following details before starting:
Requires a Nexus Repository Pro license
High-availability (HA) deployments are only available for Nexus Repository Pro customers.
Requires technology outside of Nexus Repository scope
Prepare to have in-house expertise in these technologies, for example load balancers, Kubernetes, Helm charts, Docker, and cloud infrastructure.
Requires significant infrastructure and maintenance costs
Evaluate the total investment costs of your architecture to meet your expected traffic and business requirements.
Requires full testing before implementation in production
Improper deployment to production may result in excessive downtime and data loss.
High Availability Requirements
Every node must meet the system requirements. The system requirements, including the CPU and memory requirements, must be met for each node independently. Every node must be on the same version of Nexus Repository with the same configuration in its nexus.properties file. See Zero Downtime Upgrades to learn more.
Each node is on a separate server. For fault-tolerant deployments, ensure each Nexus Repository instance is on a separate physical server. Use a different server for each node to limit the risk of a system failing.
Deploy to a single region or data center. HA deployments are restricted to a single data center. Deploying across regions leads to significant database latency and data loss. Nexus Repository may be deployed in a federated model to support other regions.
See the Deployment Pattern Library
Review the performance metrics when sizing deployments
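The requirement that every node share the same nexus.properties configuration can be checked before go-live with a small script. The following is a minimal sketch, not a Sonatype tool: it assumes you have already collected the text of each node's nexus.properties file, and the node names and the `example.flag` key are placeholders for illustration.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments.

    Simplification: line continuations and ':' separators, which Java
    properties files also allow, are not handled here.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def find_config_drift(node_configs):
    """Return {key: {node: value}} for every key whose value differs between nodes."""
    parsed = {node: parse_properties(text) for node, text in node_configs.items()}
    all_keys = set().union(*(p.keys() for p in parsed.values()))
    drift = {}
    for key in sorted(all_keys):
        values = {node: p.get(key) for node, p in parsed.items()}
        if len(set(values.values())) > 1:
            drift[key] = values
    return drift


# Placeholder contents; in practice, read each node's nexus.properties file.
configs = {
    "node-a": "application-port=8081\nexample.flag=true\n",
    "node-b": "# same port, different flag\napplication-port=8081\nexample.flag=false\n",
}
print(find_config_drift(configs))
```

An empty result means the parsed files agree. A key missing on one node shows up with the value `None` for that node.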
Technology Requirements
The following external technologies are required for HA deployments:
A load balancer is required to distribute traffic. Examples include HAProxy, NGINX, Apache HTTP Server, and AWS ELB.
Blob stores accessible to active nodes. See Migrating to Shared Blob Storage for details.
An external PostgreSQL database accessible to active nodes. A low-latency and high-performance database server is critical for HA deployments.
Supported Infrastructure and Limitations
The following table outlines the supported infrastructure for HA deployments. Our support team's expertise is in AWS technologies. Other cloud infrastructure is supported but may fall outside of where our support team can provide guidance.
You must retain in-house expertise for the technologies you are using for your deployment.
Hardware/Software Components | Specific Technology and Version |
---|---|
Nexus Repository | Nexus Repository Pro 3.50.0+ |
Java | See the Java Compatibility Matrix |
Cloud providers | AWS, Azure, and Google Cloud are supported. |
Operating systems | Operating systems that run your Java version. |
Servers | AWS EC2, Azure VM, bare metal servers, Hyper-V virtual machines, VMware virtual machines |
PostgreSQL database | AWS Aurora PostgreSQL 13.3+; AWS RDS PostgreSQL 13.3+; Azure Database for PostgreSQL Flexible Server using PostgreSQL 13+; Google Cloud SQL for PostgreSQL 14.13+; self-managed PostgreSQL 13+ |
sonatype-work directory | AWS EBS, AWS EFS, Azure Storage, Google Cloud Filestore, NFS v4 |
Blob storage | AWS S3, Azure Blob Storage, Google Cloud Storage, NFS v4 |
Kubernetes | AWS EKS v1.23+; Azure AKS 1.25.5; Google Kubernetes Engine 1.30+; Kubernetes (on-prem deployments) |
Load balancer | AWS ALB; Azure Load Balancer; Nginx/HTTPD (latest version) |
 | Not available |
Feature Differences Between Legacy OrientDB and H2/PostgreSQL
While we have made every effort to keep Nexus Repository's external behavior consistent while moving away from OrientDB, differences include some unsupported features and formats.
Items Not Supported for H2 and PostgreSQL
The following features/formats are not supported for H2 and PostgreSQL environments:
Legacy high-availability clustering (HA-C). High-availability deployments now require a PostgreSQL database instead.
Bower format
Community formats (APK, Composer, CPAN, Puppet)
When migrating to an H2 or PostgreSQL database, unsupported formats are not migrated. You cannot add repositories for formats that these databases do not support.
Update Your External Plugins
You will need to update external plugins that introduce new repository types, interact with repositories or repository content, or interact with the database directly to be compatible with the new data access approach.
Major Changes to Asset Name Matcher Regular Expressions
Regular expressions used in cleanup policies for path and name matching are handled differently when queried in OrientDB than they are in H2 or PostgreSQL environments.
More content may be removed from repositories during cleanup after the migration.
Anchored queries match the entire asset name when there isn’t a wildcard in the regex
In OrientDB, the Asset Name Matcher uses Lucene regular expressions that are anchored by default.
H2 and PostgreSQL environments use Java regular expressions that are not anchored.
Example of an Expression That Matches More Items After Migration to PostgreSQL/H2
This example contains the following assets for consideration:
/antlr/antlr/2.7.2/antlr-2.7.2.jar
/org/antlr/antlr-master/3.1.3/antlr-master-3.1.3.pom
A cleanup policy sets the Asset Name Matcher using the following regular expression:
antlr.*
Cleanup in OrientDB would match only the first asset, while H2 and PostgreSQL queries match both, so both are removed.
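The difference between anchored and unanchored matching can be reproduced outside Nexus Repository. The sketch below uses Python's `re` module as a stand-in: `re.fullmatch` models the anchored OrientDB behavior and `re.search` models the unanchored H2/PostgreSQL behavior. It assumes that OrientDB compared against asset names without the leading slash, which is an inference from the asset path change described in this document, not a confirmed implementation detail.

```python
import re

PATTERN = "antlr.*"

assets = [
    "/antlr/antlr/2.7.2/antlr-2.7.2.jar",
    "/org/antlr/antlr-master/3.1.3/antlr-master-3.1.3.pom",
]


def anchored_matches(pattern, names):
    """Whole-name matching, standing in for the OrientDB behavior."""
    # Assumption: names were stored without the leading slash in OrientDB.
    return [n for n in names if re.fullmatch(pattern, n.lstrip("/"))]


def unanchored_matches(pattern, names):
    """Match-anywhere semantics, standing in for the H2/PostgreSQL behavior."""
    return [n for n in names if re.search(pattern, n)]


print(anchored_matches(PATTERN, assets))        # only the first asset
print(unanchored_matches(PATTERN, assets))      # both assets
print(unanchored_matches("^/antlr.*", assets))  # explicit anchor: first asset only
```

In this model, anchoring the expression explicitly (here `^/antlr.*`) narrows the unanchored match back to the first asset. Treat that as an illustration of regex semantics only, and test any rewritten expression against a non-production instance before relying on it.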
Backup and Restore Considerations
You can back up the embedded H2 database using the same process you've used before, but you will need to use the Admin - Backup H2 Database scheduled task.
Any existing OrientDB backups will not be compatible with either of the new database options. Nexus Repository will no longer handle external PostgreSQL database backups; this will be the system administrator's responsibility.
The H2 and PostgreSQL implementations in Nexus Repository do not use triggers or stored procedures, so the exclusion of them in the backup is not an issue.
Asset Paths Require Forward Slash
References to assets via the REST API now require a forward slash in front of the path to work. For example, ticketlist.txt must now be /ticketlist.txt.
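Client scripts that build asset paths can normalize them before calling the REST API. This is a minimal sketch; the helper name `normalize_asset_path` is hypothetical and not part of any Nexus Repository client library.

```python
def normalize_asset_path(path):
    """Return the path with exactly one leading forward slash."""
    return "/" + path.lstrip("/")


print(normalize_asset_path("ticketlist.txt"))   # /ticketlist.txt
print(normalize_asset_path("/ticketlist.txt"))  # /ticketlist.txt
```

Applying the helper to a path that already starts with a slash leaves it unchanged, so it is safe to call on every path.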
Log Differences
There are no changes to log file locations, but any logging related to database interactions will be different.
Groovy Scripting is Not Recommended
Nexus Repository has a feature for extending its functionality with Groovy scripts. (This feature is disabled by default for security reasons, but is still available.) In many cases, these scripts accessed undocumented, non-public Nexus Repository APIs. You may need to update scripts connecting to non-public APIs for them to work.
For both security and forward compatibility reasons, we recommend making use of the public REST API as much as possible rather than using Groovy scripting.
Changes Impacting Webhooks Events
Asset events contain the same information as before except that the value assigned to the asset name now begins with a forward slash. Beginning the asset's name with a forward slash is not specific to webhooks.
There are no changes to the contents of a Component event.
The following event types are generated for repository uploads:
Asset and Component events of type CREATED: the quantities of these are the same as before.
Several Asset and Component events of type UPDATED: more events of this type are generated than before.
The following scenarios will emit an event of type PURGED containing the IDs of all deleted assets and components:
When performing component cleanup. Nexus Repository no longer generates DELETED events for each component and asset deleted during component cleanup.
Whenever deleting the last component from a Maven repository.
Deleting individual assets or components (i.e., not via component cleanup) from a repository generates the same number and types of events as before.
Coordinate-Based Content Selectors
Path-based content selectors are fully compatible with the new architecture. However, we are using the transition to remove a deprecated feature: coordinate-based content selectors. These are any content selectors with references to format-specific coordinates.
Notable Search Functionality Differences Between Environments
Due to ongoing work for improving component search in Sonatype Nexus Repository, some functionality differences currently exist between deployments using OrientDB, H2, PostgreSQL, and/or High Availability (HA). Take note of the specific differences and considerations in the sections below before you begin searching for components.
Searching for Components in an OrientDB Environment
Most of the documentation in this section focuses on how search works in an OrientDB environment. However, searching by Conan Package Id and Conan Package or Recipe Revisions is not available for those using OrientDB.
Searching for Components in an H2 or Non-HA PostgreSQL Environment
Searching by Conan Package Id and Conan Package or Recipe Revisions is supported in an H2 environment.
When searching for raw components, you must use a leading slash (i.e., "/").
Search functionality is otherwise the same as that of an OrientDB environment.
Search Feature Differences in an HA Environment
The search feature implemented for high availability (HA) differs greatly from the search currently used in a non-HA environment.
HA search has the following requirements:
At least one search criterion is required when searching through the UI
At least one additional search criterion is required beyond format when searching through the UI
Each search criterion must be at least three characters long
Leading wildcard search is not supported; however, you may use a trailing wildcard (i.e., prefix search)
Searches cannot begin with special characters followed by a wildcard
All keyword searches automatically append a wildcard (*) at the end of each criterion
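The requirements above can be approximated with a client-side pre-check. This is a rough sketch, not the validation Nexus Repository performs. It covers only three of the rules (minimum length, no leading wildcard, no special characters followed by a wildcard), and it assumes that wildcard characters do not count toward the three-character minimum and that "special character" means anything other than an ASCII letter or digit. The function name `check_ha_search_term` is hypothetical.

```python
import re

MIN_LENGTH = 3
# One or more non-alphanumeric, non-wildcard characters, then a wildcard.
SPECIAL_THEN_WILDCARD = re.compile(r"^[^A-Za-z0-9*]+\*")


def check_ha_search_term(term):
    """Return the reasons a term would be rejected (empty list if none found)."""
    problems = []
    if term.startswith("*"):
        problems.append("leading wildcard")
    if SPECIAL_THEN_WILDCARD.match(term):
        problems.append("special character followed by wildcard")
    # Assumption: wildcards are not counted toward the minimum length.
    if len(term.replace("*", "")) < MIN_LENGTH:
        problems.append("fewer than three characters")
    return problems


for term in ["*", "he*", "/*/hel", "$%*hel", "hello*"]:
    print(repr(term), check_ha_search_term(term))
```

An empty list means the term passed these three checks only. The server may still reject it for reasons the sketch does not model.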
Example Invalid Search Patterns
The following search patterns may work in non-HA environments but do not work in an HA environment:
Invalid Pattern | Reasoning |
---|---|
"*" | Less than three characters long; leading wildcard not supported |
"he*" | Less than three characters long |
"he*@_" | User criteria ("@_") contains only an underscore |
"he*@/" | User criteria ("@/") contains only a forward slash |
"he*@*" | User criteria ("@*") contains only a wildcard |
"/*/hel" | Search begins with a special character followed by a wildcard |
"$%*hel" | Search begins with two special characters followed by a wildcard |
Re-fetch Limits and Search Configuration Capability
As of 3.72.0, Nexus Repository makes up to 10 trips to the database to return a result set in a PostgreSQL HA deployment. You can adjust this limit by creating a Search Configuration capability that specifies a different re-fetch limit. Setting the re-fetch limit to "-1" allows unlimited database queries; however, this can result in slower performance in large deployments. In some scenarios, the re-fetch limit may result in missing or empty search results.
You must have clustering enabled to see the Search Configuration capability option.
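The trade-off behind the re-fetch limit can be illustrated with a toy model. This is not Nexus Repository's implementation, and the source does not describe why several trips are needed. The sketch assumes that each trip returns a batch of candidate rows and that only some rows survive a later filter. All names (`search_with_refetch`, `fetch_batch`, `is_visible`) and the data are hypothetical.

```python
def search_with_refetch(fetch_batch, is_visible, page_size, refetch_limit=10):
    """Collect up to page_size visible rows using at most refetch_limit trips.

    fetch_batch(offset) returns the next list of candidate rows ([] when exhausted).
    A refetch_limit of -1 means keep fetching until the page fills or data runs out.
    """
    results, offset, trips = [], 0, 0
    while len(results) < page_size and (refetch_limit == -1 or trips < refetch_limit):
        batch = fetch_batch(offset)
        trips += 1
        if not batch:
            break
        offset += len(batch)
        results.extend(row for row in batch if is_visible(row))
    return results[:page_size], trips


# Toy data: 100 rows, fetched 5 at a time; only rows 60 and above pass the filter.
ROWS = list(range(100))


def fetch_batch(offset):
    return ROWS[offset:offset + 5]


def is_visible(row):
    return row >= 60


print(search_with_refetch(fetch_batch, is_visible, page_size=3, refetch_limit=10))
print(search_with_refetch(fetch_batch, is_visible, page_size=3, refetch_limit=-1))
```

With a limit of 10 the model gives up after scanning the first 50 rows and returns an empty page, even though matching rows exist further on. With -1 it keeps going and finds them at the cost of more trips. That mirrors the missing-results and performance trade-off described above.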