Install Nexus Repository with a PostgreSQL Database

Nexus Repository™ defaults to an embedded H2 database. For production deployments, we recommend PostgreSQL. This topic covers installing Nexus Repository™ with an external PostgreSQL database.

See the PostgreSQL Database Requirements.

Follow these steps to start Nexus Repository with an external PostgreSQL database:

1. Download and configure Nexus Repository, but do not start the instance. See Install Nexus Repository.
2. Create a PostgreSQL database.
3. Set the database configuration.
4. Start the Nexus Repository instance.

Create a PostgreSQL Database

The following are the basic steps for setting up a PostgreSQL database.

See the PostgreSQL documentation

Connect to your PostgreSQL server as a superuser. This is typically 'postgres':
```
psql -U postgres 
```

Create a new database for Nexus Repository:

CREATE DATABASE nexus ENCODING 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' TEMPLATE template0;

Connect to the newly created database:
```
\c nexus;
```
Create a schema (optional, but recommended):
```
CREATE SCHEMA nexus;
```

Create a user for Nexus Repository:

CREATE USER nexus WITH PASSWORD 'somepassword';

Grant necessary privileges to the user:

GRANT ALL PRIVILEGES ON DATABASE nexus TO nexus;
GRANT ALL PRIVILEGES ON SCHEMA nexus TO nexus;

Install the required trigram module. Note that this example command assumes you have created a schema named nexus. If you have not, you can use public:
```
CREATE EXTENSION pg_trgm SCHEMA nexus;
```
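
The statements above can also be collected into a single script file and run in one pass. The following is a minimal sketch, not an official procedure: the file name `nexus-init.sql` and the `SQL_FILE` variable are assumptions made for this example, and the password is the placeholder used above.

```shell
# Sketch: collect the SQL statements above into one script file.
# SQL_FILE is a hypothetical path chosen for this example.
SQL_FILE="${SQL_FILE:-./nexus-init.sql}"

# The quoted 'EOF' keeps the shell from expanding anything inside the script.
cat > "$SQL_FILE" <<'EOF'
CREATE DATABASE nexus ENCODING 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' TEMPLATE template0;
\c nexus
CREATE SCHEMA nexus;
CREATE USER nexus WITH PASSWORD 'somepassword';
GRANT ALL PRIVILEGES ON DATABASE nexus TO nexus;
GRANT ALL PRIVILEGES ON SCHEMA nexus TO nexus;
CREATE EXTENSION pg_trgm SCHEMA nexus;
EOF

echo "Wrote $SQL_FILE"
# To apply it as the superuser, run for example:
#   psql -U postgres -f "$SQL_FILE"
```

The script only writes the file; running it against the server is left as a manual step so the statements can be reviewed first.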

Database Configuration

Nexus Repository supports three methods for providing the database configuration settings. When Nexus Repository first starts, it uses the first connection method it encounters and ignores the others. Mixing methods is not supported.

The settings are checked in the following order:

(1) Environment Variables, (2) JVM Arguments, (3) the Properties File

Choose the method that fits your infrastructure requirements. In containerized environments, use environment variables or JVM arguments to avoid modifying the property files stored in the container. New settings take effect on restarting the service.

Environment Variables

Pass the connectivity details as environment variables:

NEXUS_DATASTORE_NEXUS_JDBCURL
NEXUS_DATASTORE_NEXUS_USERNAME
NEXUS_DATASTORE_NEXUS_PASSWORD
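
As an illustration, the variables might be exported in the shell that launches Nexus Repository. The host, database name, user, and password below are hypothetical. The URL is written without backslash escapes, on the assumption that the escaping described under JDBC Url Property applies to the properties file; verify this against your version.

```shell
# Hypothetical connection details; replace with your own values.
export NEXUS_DATASTORE_NEXUS_JDBCURL="jdbc:postgresql://localhost:5432/nexus"
export NEXUS_DATASTORE_NEXUS_USERNAME="nexus"
export NEXUS_DATASTORE_NEXUS_PASSWORD="somepassword"

# Confirm the variables are visible to child processes without printing the password.
env | grep '^NEXUS_DATASTORE_NEXUS_' | sed 's/\(PASSWORD=\).*/\1********/'
```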

JVM Arguments

Specify the properties as JVM arguments:

-Dnexus.datastore.enabled
-Dnexus.datastore.nexus.username
-Dnexus.datastore.nexus.password
-Dnexus.datastore.nexus.jdbcUrl
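
A rough sketch of the full argument list with values filled in follows. The values are hypothetical, `JVM_ARGS` is only a name used in this example, and `=true` for `nexus.datastore.enabled` is an assumption. How the arguments reach the JVM depends on your installation.

```shell
# Hypothetical values; JVM_ARGS is just a name used in this sketch.
JVM_ARGS="-Dnexus.datastore.enabled=true"
JVM_ARGS="$JVM_ARGS -Dnexus.datastore.nexus.username=nexus"
JVM_ARGS="$JVM_ARGS -Dnexus.datastore.nexus.password=somepassword"
JVM_ARGS="$JVM_ARGS -Dnexus.datastore.nexus.jdbcUrl=jdbc:postgresql://localhost:5432/nexus"

# Print one argument per line for review before adding them to your startup configuration.
# $JVM_ARGS is intentionally unquoted so the shell splits it on spaces.
printf '%s\n' $JVM_ARGS
```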

nexus-store.properties
Create the <data-dir>/etc/fabric directory if it does not already exist, then create the nexus-store.properties file in it with the following properties:
```
username=<postgres_user>
password=<postgres_password>
jdbcUrl=<jdbcUrl_property>
```
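
One possible way to create the directory and file from a shell is sketched below. `DATA_DIR` stands in for <data-dir> and defaults to a scratch directory so the sketch can be tried safely. The credentials and host are hypothetical, and the `chmod` step is an optional precaution rather than a documented requirement.

```shell
# DATA_DIR is a stand-in for <data-dir>; it defaults to a scratch directory here.
DATA_DIR="${DATA_DIR:-$(mktemp -d)}"
STORE_FILE="$DATA_DIR/etc/fabric/nexus-store.properties"

mkdir -p "$DATA_DIR/etc/fabric"

# Quoted 'EOF' preserves the backslashes that escape the colons in the URL.
cat > "$STORE_FILE" <<'EOF'
username=nexus
password=somepassword
jdbcUrl=jdbc\:postgresql\://localhost\:5432/nexus
EOF

# The file holds a password, so restrict it to the owning user.
chmod 600 "$STORE_FILE"
cat "$STORE_FILE"
```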

JDBC Url Property

The format of the PostgreSQL JDBC URL is as follows:

jdbc\:postgresql\://<database-host>\:<database-port>/<database-name>?<param1>=<value1>&<param2>=<value2>

Colons in the JDBC URL need to be escaped with a backslash (\). Parameters are included at the end of the string starting with a question mark (?) and using ampersands (&) in between each parameter.

For AWS Aurora databases, include the gssEncMode=disable query parameter in the JDBC URL.

database-host
The database server address (e.g., localhost or an IP address). Use the cluster ingress port when PostgreSQL is deployed with multiple nodes.
database-port
The PostgreSQL port (default: 5432).
database-name
The name of the database.
?param1&param2
Optional parameters may be included at the end of the query string.
For example, use currentSchema=<schema-name> to specify the default schema to use for the connection.
Note that the first parameter must be prefixed with a ?, and each subsequent parameter must use &. So, you would use ?currentSchema=<schema-name> if this is the first parameter you use.
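
The format rules above can be sketched as two small shell helpers. The function names `build_jdbc_url` and `escape_colons` are hypothetical and exist only for this illustration. The first joins the parts using `?` before the first parameter and `&` before the rest; the second produces the backslash-escaped form.

```shell
# build_jdbc_url HOST PORT DATABASE [PARAM...]
# Hypothetical helper: joins the parts into a PostgreSQL JDBC URL.
build_jdbc_url() {
  url="jdbc:postgresql://$1:$2/$3"
  shift 3
  sep='?'                       # the first parameter is introduced by ?
  for param in "$@"; do
    url="$url$sep$param"
    sep='&'                     # every later parameter is joined with &
  done
  printf '%s\n' "$url"
}

# escape_colons: backslash-escape each colon read from standard input.
escape_colons() {
  sed 's/:/\\:/g'
}

build_jdbc_url localhost 5432 nexus currentSchema=nexus
build_jdbc_url localhost 5432 nexus currentSchema=nexus | escape_colons
```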

Recommended Parameters for Failover Recovery

The following parameters are recommended when deploying a multi-node Nexus Repository installation with an AWS RDS PostgreSQL multi-AZ cluster. These settings force Nexus Repository to restore the PostgreSQL connection as quickly as possible to limit downtime during failure scenarios.

The complete JDBC URL string with these parameters is similar to the following:

jdbc:postgresql://<rds-cluster-endpoint-url>:5432/nexus?gssEncMode=disable&tcpKeepAlive=true&loginTimeout=5&connectionTimeout=5&socketTimeout=30&cancelSignalTimeout=5&targetServerType=primary

tcpKeepAlive
Enables the TCP keepalive feature for the network socket used by the database connection. When enabled, the operating system periodically sends low-level TCP probes to prevent network devices from dropping the database connection unexpectedly.
```
tcpKeepAlive=true
```
loginTimeout
Sets the maximum time, in seconds, that the driver waits to successfully establish a connection to the PostgreSQL server and complete the authentication process.
```
loginTimeout=5
```
connectionTimeout
Specifies a timeout, in seconds, for establishing the connection.
```
connectionTimeout=5
```
socketTimeout
Sets the maximum time, in seconds, that the driver will wait for data to be sent or received on the network socket after the connection has been successfully established.
```
socketTimeout=30
```
cancelSignalTimeout
Sets the maximum time, in seconds, that the driver waits for the PostgreSQL server to acknowledge a query cancellation request.
```
cancelSignalTimeout=5
```
targetServerType
Instructs the driver to connect only to a server that identifies itself as the primary node handling write operations. Connection attempts made to a server identifying as a replica/secondary/standby will fail.
```
targetServerType=primary
```
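
To keep the individual values easy to review, the query string can be assembled from a one-per-line list. This is only a sketch: `RDS_ENDPOINT` holds a made-up host name, and the parameter values are the ones listed above.

```shell
# RDS_ENDPOINT is a hypothetical host name used for illustration only.
RDS_ENDPOINT="nexus-db.cluster-example.us-east-1.rds.amazonaws.com"

# One parameter per line keeps each value easy to review and change.
PARAMS="gssEncMode=disable
tcpKeepAlive=true
loginTimeout=5
connectionTimeout=5
socketTimeout=30
cancelSignalTimeout=5
targetServerType=primary"

# Join the lines with & to form the query string.
QUERY=$(printf '%s\n' "$PARAMS" | paste -sd '&' -)
JDBC_URL="jdbc:postgresql://${RDS_ENDPOINT}:5432/nexus?${QUERY}"
printf '%s\n' "$JDBC_URL"
```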

Configuration Options for PostgreSQL

Set optional configuration settings through the same methods used above. Set multiple advanced properties in the JVM arguments by delimiting the values with "\n".

advanced=maximumPoolSize\=200\nmaxLifetime\=840000
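
When the combined value is passed on a command line, shell quoting matters, because the backslashes must reach the JVM unchanged. The sketch below assumes a POSIX shell. `ADVANCED_ARG` is a name used only for this example.

```shell
# ADVANCED_ARG is a name used only in this sketch.
# Single quotes keep \= and \n as literal two-character sequences.
ADVANCED_ARG='-Dnexus.datastore.nexus.advanced=maximumPoolSize\=200\nmaxLifetime\=840000'

# printf with %s prints the value as-is, without interpreting the backslashes.
printf '%s\n' "$ADVANCED_ARG"
```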

Maximum Pool Size

Servers under heavy load may need increased connection pool size for the database. Nexus Repository uses a default pool of 100.

Note that, for high-availability deployments, you must increase the number of connections that PostgreSQL allows so that your Maximum Connection Pool size does not exceed your maximum number of allowed connections.

See Adjust the PostgreSQL Max Connections.
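
A quick arithmetic check can show whether the allowed connections cover the pools. All numbers below are hypothetical, and the sketch assumes that each node keeps its own pool and that a few connections are held back for administrative sessions.

```shell
# All numbers are hypothetical; substitute values from your own deployment.
NODES=3                 # Nexus Repository nodes sharing the database
POOL_SIZE=100           # maximumPoolSize per node (the default pool size)
RESERVED=10             # connections kept free for admin and maintenance sessions
MAX_CONNECTIONS=200     # PostgreSQL max_connections setting

REQUIRED=$(( NODES * POOL_SIZE + RESERVED ))
if [ "$REQUIRED" -gt "$MAX_CONNECTIONS" ]; then
  echo "Raise max_connections to at least $REQUIRED (currently $MAX_CONNECTIONS)"
else
  echo "max_connections=$MAX_CONNECTIONS covers the $REQUIRED connections required"
fi
```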

Environment variable

NEXUS_DATASTORE_NEXUS_ADVANCED="maximumPoolSize=200"

JVM argument

-Dnexus.datastore.nexus.advanced=maximumPoolSize\=200

nexus-store.properties file
```
maximumPoolSize=200
advanced=maximumPoolSize\=200
```
Only one of the settings above is used. The maximumPoolSize property takes precedence over the advanced property.

Max Lifetime

Configure the max lifetime for database connections when using container orchestration tools, relational database services, or other infrastructure to launch Nexus Repository.

The default max lifetime is 30 minutes (1800000ms).
Set the max lifetime to be several seconds shorter than any infrastructure-imposed connection time limit.

Environment variable

NEXUS_DATASTORE_NEXUS_ADVANCED="maxLifetime=840000"

JVM argument

-Dnexus.datastore.nexus.advanced=maxLifetime\=840000

nexus-store.properties file
```
advanced=maxLifetime\=840000
```
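
As a worked example of choosing the value, suppose the infrastructure closes connections after 15 minutes. That limit and the 60-second margin are assumptions made for this sketch, not values from your environment.

```shell
# Hypothetical infrastructure limit: connections are closed after 15 minutes.
INFRA_LIMIT_MS=$(( 15 * 60 * 1000 ))
# Safety margin; 60 seconds is an assumption chosen for this sketch.
MARGIN_MS=$(( 60 * 1000 ))

MAX_LIFETIME_MS=$(( INFRA_LIMIT_MS - MARGIN_MS ))

# Print the line in the form used by the nexus-store.properties file.
printf 'advanced=maxLifetime\\=%s\n' "$MAX_LIFETIME_MS"
```

With these inputs the result is 840000, the value used in the examples above.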

PostgreSQL Database Maintenance

The following tasks help your PostgreSQL database maintain optimum performance. These should be done outside of normal working hours to reduce the impact on active users as the tasks can impact performance while running.

Using Vacuuming
PostgreSQL databases require periodic maintenance known as vacuuming, which reclaims disk space and helps keep database performance optimal. You might need to adjust the autovacuum parameters to obtain the best results for your situation.
Review the PostgreSQL documentation on vacuuming.
Routine Reindexing
In some situations, it is worthwhile to rebuild indexes periodically with the REINDEX command or a series of individual rebuilding steps.
See the PostgreSQL documentation on Reindexing
Manage logging
It is a good idea to save the database server's log output somewhere rather than discarding it. The log output is invaluable when diagnosing problems.
See Log maintenance
Continuous Archiving and Point-in-Time Recovery (PITR)
Continuous Archiving and Point-in-Time Recovery (PITR) are essential features in PostgreSQL that provide robust data protection and disaster recovery capabilities. They work together to ensure you can restore your database to any specific moment in time, even if a failure occurs.
See Backups/Archiving

Configuring PostgreSQL Logging

Enabling logging in PostgreSQL is essential for maintaining a healthy, secure, and performant database. It provides valuable information for troubleshooting, auditing, performance monitoring, and general database administration.

See the PostgreSQL documentation for details.

Modify your postgresql.conf file to apply these logging settings:

log_line_prefix = '%m [%p:%l] "%v %x" %q<%u@%d/%a> '
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 0
log_error_verbosity = default
log_statement = 'none'
log_min_duration_statement = 1000ms
log_transaction_sample_rate = 0

Log File Size: Enabling detailed logging can increase the size of your log files. Make sure you have adequate disk space and implement log rotation to manage log file growth.
Performance Impact: Excessive logging can impact database performance. Start with a reasonable level of logging and adjust as needed. The settings above are generally a good starting point for identifying performance issues.
Security: Be mindful of the information logged. Avoid logging sensitive data like passwords or credit card numbers.

Explanation of the PostgreSQL logging parameters

log_line_prefix
This defines the format of each log message.
log_checkpoints
Enables logging of checkpoint activity. Checkpoints are crucial for database recovery.
log_connections
Enables logging of new connections to the database.
log_disconnections
Enables logging of disconnections from the database.
log_lock_waits
Enables logging of long lock waits. This is essential for diagnosing performance issues related to lock contention.
log_temp_files
Logs the creation of temporary files at least as large as the specified size (in kB). Setting it to 0 logs all temporary files regardless of size; setting it to -1 disables this logging. To log only larger files, set a positive value (e.g., log_temp_files = 8192 to log files of 8MB or more).
log_autovacuum_min_duration
Logs autovacuum actions that take longer than the specified duration. Setting it to 0 logs all autovacuum actions. This is very helpful for monitoring autovacuum's activity.
log_error_verbosity
Controls the verbosity of error messages. default provides a good balance. Other options are terse and verbose.
log_statement
Controls which SQL statements are logged. none disables logging of SQL statements. Other options are ddl, mod, and all.
log_min_duration_statement
Logs statements that run for at least the specified duration (in milliseconds). The 1000ms setting above logs statements that take 1 second or longer. This is useful for identifying slow queries.
log_transaction_sample_rate
Used for sampling transactions for logging. A value of 0 disables transaction sampling.

Determine Current Database

Use the Data Store view in the Settings menu to determine the database mode used in previously initialized Nexus Repository instances.

PostgreSQL: the JDBC URL contains no reference to H2
H2: the JDBC URL contains a reference to H2
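
If you have copied the JDBC URL from that view, the same rule can be applied in a script. The helper name `db_mode_from_url` and the sample URLs are hypothetical. Treating a `jdbc:h2:` prefix as the "reference to H2" is an assumption of this sketch.

```shell
# db_mode_from_url: hypothetical helper applying the rule above to a JDBC URL string.
db_mode_from_url() {
  case "$1" in
    jdbc:h2:*) echo "H2" ;;          # URL names the H2 driver
    *)         echo "PostgreSQL" ;;  # no reference to H2
  esac
}

db_mode_from_url "jdbc:postgresql://localhost:5432/nexus"
db_mode_from_url "jdbc:h2:file:nexus"
```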
