Skip to main content

Install Nexus Repository with a PostgreSQL Database

This topic covers installing Nexus Repository with an external PostgreSQL database. New Nexus Repository version 3.71.0 installations and above default to using an embedded H2 database. For production deployments larger than the minimum usage, we highly recommend using PostgreSQL databases.

See the database requirements section in Nexus Repository system requirements documentation.

Determine Current Database

Use the Data Store view in the Administration Menu to determine the database mode used in previously initialized Nexus Repository instances.

  • PostgreSQL: no reference to H2 in the JDBC URL

  • H2: reference to H2 in their JDBC URL

Start Nexus Repository with PostgreSQL

These instructions are to start Nexus Repository using an external PostgreSQL database.

  1. Before proceeding, check the PostgreSQL requirements with the required trigram module installed.

  2. Download the latest Nexus Repository version; expand and install the archive.

  3. Create a PostgreSQL database using the UTF8 character set.

    See the PostgreSQL documentation

  4. Optional: Add the license property to initialize the instance with a Professional license. This property is stored in the database and is subsequently later ignored in the configuration file.

    Add the following property to the nexus.properties file:

    nexus.licenseFile=/path/to/your/sonatype-license.lic
    

    See nexus.properties documentation for details.

  5. Set the database configuration using any of the 3 methods shown below.

  6. Start the Nexus Repository instance using a PostgreSQL database.

Database Configuration Methods

Nexus Repository supports 3 methods for providing the database configuration settings. When Nexus Repository initially starts, the first connection method encountered is used while the other methods are ignored. Mixing methods are not supported.

The settings are checked in the following order:

(1) Environment Variables, (2) JVM Arguments, (3) the Properties File

Choose the method that fits your infrastructure requirements. In containerized environments, use environment variables or JVM arguments to avoid modifying the property files stored in the container. New settings take effect on restarting the service.

  • Environment Variables

    Pass the connectivity details as environment variables:

    NEXUS_DATASTORE_NEXUS_JDBCURL
    NEXUS_DATASTORE_NEXUS_USERNAME
    NEXUS_DATASTORE_NEXUS_PASSWORD
  • JVM Arguments

    Specify the properties as JVM arguments:

    -Dnexus.datastore.enabled
    -Dnexus.datastore.nexus.username
    -Dnexus.datastore.nexus.password
    -Dnexus.datastore.nexus.jdbcUrl
  • nexus-store.properties

    Create the <data-dir>/etc/fabric/nexus-store.properties directory and file with the following properties:

    username=<postgres_user>
    password=<postgres_password>
    jdbcUrl=<jdbcUrl_property>
    

JDBC Url Property

The format of the PostgreSQL JDBC URL is as follows:

jdbc\:postgresql\://<database-host>\:<database-port>/<database-name>?<param1>=<value1>&<param2>=<value2>

Colons in the JDBC URL need to be escaped with a backslash (\). Parameters are included at the end of the string starting with a question mark (?) and using ampersands (&) in between each parameter.

For AWS Aurora databases, include the gssEncMode=disable query parameter in the JDBC URL

  • database-host

    The database server address (e.g., localhost or an IP address). Use the cluster ingress port when PostgreSQL is deployed with multiple nodes.

  • database-port

    The PostgreSQL port (default: 5432).

  • database-name

    The name of the database.

  • param1&param2

    Optional parameters may be included at the end of the query string.

Configuration Options for PostgreSQL

Set optional configuration settings through the same methods used above. Set multiple advanced properties in the JVM arguments by delimiting the values with "\n".

advanced=maximumPoolSize\=200\nmaxLifetime\=840000

Maximum Pool Size

Servers under heavy load may need increased connection pool size for the database. Nexus Repository uses a default pool of 100.

Note that, for high-availability deployments, you must increase the number of connections that PostgreSQL allows so that your Maximum Connection Pool size does not exceed your maximum number of allowed connections.

See Adjust the PostgreSQL Max Connections.

  • Environment variable

    NEXUS_DATASTORE_NEXUS_ADVANCED="maximumPoolSize=200"
  • JVM argument

    -Dnexus.datastore.nexus.advanced=maximumPoolSize\=200
  • nexus-store.properties file

    maximumPoolSize=N
    advanced=maximumPoolSize\=200
    

    Only one setting from above is used. The maximumPoolSize takes precedence over the advanced

Max Lifetime

Configure the max lifetime for database connections when using container orchestration tools, relational database services, or other infrastructure to launch Nexus Repository.

  • The default max lifetime is set to 30 minutes (1800000ms)

  • Set the max lifetime to be several seconds shorter than any infrastructure-imposed connection time limit.

  1. Environment variable

    NEXUS_DATASTORE_NEXUS_ADVANCED="maxLifetime=840000"
  2. JVM argument

    -Dnexus.datastore.nexus.advanced=maxLifetime\=840000
  3. nexus-store.properties file

    advanced=maxLifetime\=840000

PostgreSQL Database Maintenance

The following tasks help your PostgreSQL database maintain optimum performance. These should be done outside of normal working hours to reduce the impact on active users as the tasks can impact performance while running.

  • Using Vacuuming

    PostgreSQL databases require periodic maintenance known as vacuuming. Vacuuming in PostgreSQL is a crucial maintenance process that helps to optimize database performance and reclaim disk space. You might need to adjust the auto-vacuuming parameters to obtain the best results for your situation.

    Review the PostgreSQL documentation on vacuuming.

  • Routine Reindexing

    In some situations, it is worthwhile to rebuild indexes periodically with the REINDEX command or a series of individual rebuilding steps.

    See the PostgreSQL documentation on Reindexing

  • Manage logging

    It is a good idea to save the database server's log output somewhere, rather than just discarding them. The log output is invaluable when diagnosing problems.

    See Log maintenance

  • Continuous Archiving and Point-in-Time Recovery (PITR)

    Continuous Archiving and Point-in-Time Recovery (PITR) are essential features in PostgreSQL that provide robust data protection and disaster recovery capabilities. They work together to ensure you can restore your database to any specific moment in time, even if a failure occurs.

    See Backups/Archiving

Configuring PostgreSQL Logging

Enabling logging in PostgreSQL is essential for maintaining a healthy, secure, and performant database. It provides valuable information for troubleshooting, auditing, performance monitoring, and general database administration.

See the PostgreSQL documentation for details.

Modify your postgreSQL.conf file to apply these logging settings:

log_line_prefix = '%m [%p:%l] "%v %x" %q<%u@%d/%a> '
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 0
log_error_verbosity = default
log_statement = 'none'
log_min_duration_statement = 1000ms
log_transaction_sample_rate = 0
  • Log File Size: Enabling detailed logging can increase the size of your log files. Make sure you have adequate disk space and implement log rotation to manage log file growth.

  • Performance Impact: Excessive logging can impact database performance. Start with a reasonable level of logging and adjust as needed. Your current settings are generally good for identifying performance issues.

  • Security: Be mindful of the information logged. Avoid logging sensitive data like passwords or credit card numbers.

  • log_line_prefix

    This defines the format of each log message.

  • log_checkpoints

    Enables logging of checkpoint activity. Checkpoints are crucial for database recovery.

  • log_connections

    Enables logging of new connections to the database.

  • log_disconnections

    Enables logging of disconnections from the database.

  • log_lock_waits

    Enables logging of long lock waits. This is essential for diagnosing performance issues related to lock contention.

  • log_temp_files

    Logs the creation of temporary files larger than the specified size (in kB). Setting it to 0 disables logging of temporary files. If you want to log temporary files, set it to a value greater than 0 (e.g., log_temp_files = 8192 to log files larger than 8MB).

  • log_autovacuum_min_duration

    Logs autovacuum actions that take longer than the specified duration. Setting it to 0 logs all autovacuum actions. This is very helpful for monitoring autovacuum's activity.

  • log_error_verbosity

    Controls the verbosity of error messages. default provides a good balance. Other options are terse, verbose, and sqlstate.

  • log_statement

    Controls which SQL statements are logged. none disables logging of SQL statements. Other options include all, ddl, mod, and read.

  • log_min_duration_statement

    Logs statements that take longer than the specified duration (in milliseconds). Your setting of 1000ms will log statements that take 1 second or longer. This is useful for identifying slow queries.

  • log_transaction_sample_rate

    Used for sampling transactions for logging. A value of 0 disables transaction sampling.