Resiliency and High Availability
Only available in Sonatype Nexus Repository Pro.
What is Resiliency?
Choosing the appropriate resiliency options to meet your needs should be your primary goal when designing your Nexus Repository architecture. Resiliency refers to the ability to recover from disruptions to critical processes and supporting technology systems. Disruptions may include any of the following:
failure of a single service (the repository node, the external relational database, or the artifact storage)
a data center outage for the production environment
an availability zone outage in the case of cloud services
The scope of the disruption you plan to mitigate determines which architecture you need to reach the required level of resiliency.
See our migration documentation
Backup and Restoration
As you review backup strategies, there are two important terms to remember:
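Recovery Point Objective (RPO) - the maximum amount of data loss that is acceptable if a restore becomes necessary, usually expressed as a span of time before the failure
Recovery Time Objective (RTO) - the maximum acceptable length of time to restore the service after a disruption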
Your backup plan will need to balance the cost of maintenance against the risk of data loss and service disruption. Tighter requirements (faster recovery, less data loss) increase infrastructure complexity and maintenance costs. You will also need to test the recovery process regularly to confirm that it works and to train process owners. Regardless of implementation size, document your plan and keep it up to date with any infrastructure changes.
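To make the two objectives concrete, the following minimal sketch checks a backup schedule against a target RPO and RTO. It is an illustration only: the `BackupPlan` class, its fields, and both functions are hypothetical names, and the sketch assumes each backup captures the state of the system at the moment the backup starts.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BackupPlan:
    """Hypothetical description of a backup schedule (not a Nexus Repository API)."""
    backup_interval: timedelta   # time between the starts of consecutive backups
    backup_duration: timedelta   # time one backup needs to complete
    restore_duration: timedelta  # measured time to restore and restart the service


def worst_case_data_loss(plan: BackupPlan) -> timedelta:
    """Largest window of changes that can be lost under this schedule.

    Assumption: a backup reflects the state at its start time. A failure just
    before a running backup completes falls back to the previous backup, so
    one full interval plus one backup duration of changes is lost.
    """
    return plan.backup_interval + plan.backup_duration


def meets_objectives(plan: BackupPlan, rpo: timedelta, rto: timedelta) -> bool:
    """True when the schedule satisfies both the RPO and the RTO."""
    return worst_case_data_loss(plan) <= rpo and plan.restore_duration <= rto


if __name__ == "__main__":
    nightly = BackupPlan(
        backup_interval=timedelta(hours=24),
        backup_duration=timedelta(hours=1),
        restore_duration=timedelta(hours=2),
    )
    print(worst_case_data_loss(nightly))
    print(meets_objectives(nightly, rpo=timedelta(hours=24), rto=timedelta(hours=4)))
```

With these example figures, a nightly backup that takes one hour does not satisfy a 24-hour RPO, because the worst-case loss window is 25 hours. The restore duration should come from an actual recovery test rather than an estimate.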
You can configure your architecture to schedule database exports or use third-party tooling to transfer and back up files from one location to another.
For OrientDB or H2, Nexus Repository provides tasks to create database snapshots and relocate them to a target disk. Other directories in your local instance (or instances) should also be copied and rebuilt on a backup disk (see Prepare a Backup).
You will need to back up blob storage outside of the repository service.
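As a rough illustration of backing up files outside of the repository service, the sketch below archives a directory into a timestamped file on a backup disk. It uses only the Python standard library; the function name, the `label` parameter, and any paths you pass in are hypothetical and not part of Nexus Repository. It is meant for exported snapshots and other static directories, and it assumes nothing is writing to the source directory while the archive is created.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def archive_directory(source: Path, backup_root: Path, label: str) -> Path:
    """Write `source` into a timestamped .tar.gz under `backup_root`.

    Returns the path of the archive that was created.
    """
    if not source.is_dir():
        raise FileNotFoundError(f"not a directory: {source}")

    # Create the backup location if it does not exist yet.
    backup_root.mkdir(parents=True, exist_ok=True)

    # A UTC timestamp keeps archive names unique and sortable.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_root / f"{label}-{stamp}.tar.gz"

    # Store the directory under its own name so it restores to a single folder.
    with tarfile.open(target, "w:gz") as archive:
        archive.add(source, arcname=source.name)
    return target
```

A scheduler such as cron could call this function once per directory you need to preserve. For large blob stores, a storage-level snapshot or a dedicated transfer tool is likely a better fit than a single compressed archive.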
See Backup and Restore (for H2 and OrientDB) and Backup and Restore in Amazon Web Services for further information.
Library of Patterns
The sections below list various patterns to use depending on your resiliency requirements.
Active-Active
A cluster of redundant active Nexus Repository instances within a single region or on-premises data center. You can scale the number of instances manually or use Kubernetes to scale them automatically.
Use Cases: This model maximizes uptime while protecting against application and hardware failures. It may be extended across multiple availability zones in a single region for additional protection.
Limitations: Uses multiple technology stacks that require in-house expertise to manage appropriately. Carries high infrastructure and maintenance overhead, which may not yield your desired return on investment.
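To see why redundant active instances improve uptime, and why each additional instance returns less than the one before it, consider this back-of-the-envelope sketch. It relies on two simplifying assumptions that real deployments rarely meet: instances fail independently of one another, and shared components such as the external database, blob storage, and load balancer never fail. The function name is hypothetical.

```python
def cluster_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one of `nodes` instances is up.

    Assumes independent failures and perfectly available shared services,
    so the result is an optimistic upper bound, not a prediction.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")

    # The cluster is down only when every instance is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    for count in (1, 2, 3):
        print(count, cluster_availability(0.99, count))
```

Under these assumptions, two instances that are each available 99% of the time give a combined availability of about 99.99%. In practice the shared database and storage often dominate the result, which is one reason the added overhead may not yield the return you expect.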
Examples:
Single Node with Backup
A single active node with a cold backup that can be used to recover from data loss.
Use Cases: This model limits data loss to the changes made since the most recent backup.
Limitations: Backups are manual, and recovery requires downtime.
Examples:
Single Node with Dynamic Failover
A single active node in one availability zone. If the node or its availability zone fails, Kubernetes activates a second node in either the same or a second availability zone.
Use Cases: This model reduces downtime, infrastructure costs, and data loss.
Limitations: Incurs more downtime than the active-active pattern, because the replacement node must start before service resumes.
Examples: