Routing refers to the internal activities the repository manager performs to determine where to look for a specific component in a repository. The routing information affects both the performance of component retrieval and the availability of components.
A large portion of the performance gains achievable with correct and optimized routing information is configured by the repository manager itself with automatic routing, documented in Automatic Routing. Fine-grained control and further customization of access provision can be achieved with manual routing configuration, documented in Manual Routing Configuration.
Automatic routing is handled on a per-repository basis. You can access the configuration and further details in the Routing tab after selecting a repository in the list accessible via the Repositories item in the Views/Repositories left-hand menu.
The routing information consists of the top two levels of the directory structure of the repository and is stored in a prefixes.txt file. It allows the repository manager to automatically route to a repository only those component requests whose groupId values correspond to entries found in the text file. This, in turn, avoids unnecessary index or even remote repository access and therefore greatly improves performance.
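The prefix check described above can be sketched as follows. This is a minimal illustration, not the repository manager's implementation: the function names, the sample entries, and the assumption that the file holds one path prefix per line with `#` starting a comment are all hypothetical.

```python
def parse_prefixes(text):
    """Collect path prefixes from prefix-file text.

    Assumed format (hypothetical): one prefix per line, blank lines and
    lines starting with '#' are ignored.
    """
    prefixes = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        prefixes.add(line.rstrip("/"))
    return prefixes


def may_contain(prefixes, request_path):
    """Return True if request_path falls under one of the listed prefixes."""
    path = "/" + request_path.strip("/")
    return any(path == p or path.startswith(p + "/") for p in prefixes)


# Hypothetical prefix file content for illustration only.
SAMPLE = """\
# example entries
/org/apache
/com/somecompany
"""

prefixes = parse_prefixes(SAMPLE)
hit = may_contain(prefixes, "/org/apache/maven/maven-core/3.0/maven-core-3.0.jar")
miss = may_contain(prefixes, "/junit/junit/4.12/junit-4.12.jar")
print(hit, miss)
```

A request whose path does not fall under any listed prefix can be skipped for that repository without consulting its index or its remote.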
The repository manager generates the prefixes.txt file for a hosted repository and makes it available for remote downloads. Each deployment of a new component will trigger an update of the file for the hosted repository as well as the prefix files for any repository groups that contain the hosted repository. You can access it in the Routing tab of a hosted repository as displayed in Figure 6.17, “Automatic Routing for a Hosted Repository” by clicking on the Show prefix file link on the right. In addition, the Publishing section shows the Status of the routing information, a Message with further details, and the date and time of the last update in the Published On field.
Figure 6.17. Automatic Routing for a Hosted Repository
The Routing tab for a proxy repository displayed in Figure 6.18, “Automatic Routing for a Proxy Repository” contains the Discovery section. It displays the Status and a more detailed Message about the prefix file access. The Last run field displays the date and time of the last execution of the prefix file discovery. Such an execution can be triggered by pressing the Update now button. Otherwise, the Update Interval allows you to configure a new discovery every one, two, three, six, nine or twelve hours or on a daily or weekly basis.
Figure 6.18. Automatic Routing for a Proxy Repository
For a proxy repository, the prefix file is either downloaded from the remote repository or generated by scraping the HTML directory listing of the remote repository. If a prefix file is published by the remote, it is always used. The scraping strategy is only used in cases where the repository manager can be sure the remote directory listing contains all available artifacts. For example, if the remote is a hosted repository on a Nexus Repository Manager, or uses a well-known format such as a Subversion-based repository, then the directory listing will be used if no prefix file is available.
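The order of preference between the two discovery strategies can be sketched as below. All names here are hypothetical, and the two callables stand in for network operations that are not shown; the sketch only illustrates that a published prefix file always wins and that scraping is attempted only for remotes whose listing is trusted to be complete.

```python
from typing import Callable, List, Optional, Tuple


def discover_prefixes(
    fetch_prefix_file: Callable[[], Optional[List[str]]],
    scrape_listing: Callable[[], Optional[List[str]]],
    listing_is_complete: bool,
) -> Tuple[str, Optional[List[str]]]:
    """Pick a prefix source for a proxy repository (illustrative only).

    fetch_prefix_file: returns the remote's published prefixes, or None.
    scrape_listing: returns prefixes derived from the HTML listing, or None.
    listing_is_complete: whether the remote's listing can be trusted to
        show every available artifact.
    """
    published = fetch_prefix_file()
    if published is not None:
        return ("prefix-file", published)  # a published file is always used
    if listing_is_complete:
        scraped = scrape_listing()
        if scraped is not None:
            return ("scrape", scraped)
    return ("none", None)  # no routing information could be discovered


# Usage with stand-in callables.
source, found = discover_prefixes(
    fetch_prefix_file=lambda: None,
    scrape_listing=lambda: ["/org/apache"],
    listing_is_complete=True,
)
print(source, found)
```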
The generation of the prefix file in all repository manager deployments proxying each other greatly improves performance for all repository manager instances. It lowers network traffic and load on the servers, since failing requests, and serving the respective HTTP error pages for components that are not found, are avoided. Instead, the regular, lightweight download of the prefix file establishes good high-level knowledge of the components available.
Automatic Routing is configured automatically and brings significant performance benefits to all Nexus Repository Manager Pro and Nexus Repository Manager OSS instances proxying each other in a network and on the wider internet. It does not need to be changed apart from tweaking the update interval. To exercise even finer control than provided by Automatic Routing, use Routing as documented in Manual Routing Configuration.
Manual Routing Configuration
Routes are like filters you can apply to groups in terms of security access and general component retrieval. They can reduce the number of repositories within a group that are accessed in order to retrieve a component. The administration interface for routes can be accessed via the Routing menu item in the Views/Repositories menu in the left-hand navigation panel.
Routes allow you to configure the repository manager to include or exclude specific repository content paths from a particular component search when the repository manager is trying to locate a component in a repository group. There are a number of different scenarios in which you might configure a route.
The most commonly configured scenario is when you want to make sure that you are retrieving components in a particular group ID from a particular repository. This is especially useful when you want your own organization’s components from the hosted Release and Snapshot repositories only.
Routes are applicable when you are trying to resolve a component from a repository group. Using routes allows you to modify the repositories the repository manager consults when it tries to resolve a component from a group of repositories.
Figure 6.19. Routing Configuration Screen
Figure 6.19, “Routing Configuration Screen” shows the Routing configuration screen. Clicking on a route will bring up a screen that will allow you to configure the properties of a route. The configuration options available for a route are:
The repository manager uses the URL Pattern to match a request. If the regular expression in this pattern is matched, the repository manager will either include or exclude the listed repositories from a particular component query. In Figure 6.19, “Routing Configuration Screen” the two patterns are:
This pattern would match all paths that start with either /org/somecompany/. The expression in the parentheses matches either org, and the .* matches zero or more characters. You would use a route like this to match your own organization’s components and map these requests to the hosted Releases and Snapshots repositories.
This pattern is used in an exclusive route. It matches every path that starts with /org/some-oss/. This particular exclusive route excludes the local hosted Releases and Snapshots repositories for all components that match this path. When the repository manager tries to resolve components that match this path, it will exclude the Releases and Snapshots repositories.
A further example:
Using this pattern in an exclusive route allows you to exclude everything except the "org/some-oss" project(s). It uses a special negative matching regular expression.
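The three kinds of pattern described above can be tried out with a short script. The patterns below are illustrative stand-ins written to fit the descriptions in the text, not necessarily the ones shown in Figure 6.19, and the assumption that a pattern must match the whole request path is also ours. Python's `re` module is used for convenience; the repository manager's own regular expression engine may differ in details.

```python
import re

# Hypothetical patterns, chosen to fit the prose descriptions.
OWN_COMPONENTS = r"/(com|org)/somecompany/.*"   # alternation in parentheses
SOME_OSS = r"/org/some-oss/.*"                  # paths starting with /org/some-oss/
ALL_BUT_SOME_OSS = r"(?!/org/some-oss/.*).*"    # negative lookahead


def matches(pattern, path):
    """Return True if the pattern matches the entire request path."""
    return re.fullmatch(pattern, path) is not None


own = "/com/somecompany/app/1.0/app-1.0.jar"
oss = "/org/some-oss/lib/2.1/lib-2.1.pom"
other = "/org/apache/maven/maven-core/3.0/maven-core-3.0.jar"

for name, pattern in [
    ("own components", OWN_COMPONENTS),
    ("some-oss", SOME_OSS),
    ("all but some-oss", ALL_BUT_SOME_OSS),
]:
    print(name, [matches(pattern, p) for p in (own, oss, other)])
```

The negative lookahead `(?!...)` succeeds only where the enclosed expression does not match, which is why the last pattern accepts every path except those under /org/some-oss/.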
Rule Type can be either inclusive, exclusive or blocking. An inclusive rule type defines the set of repositories that should be searched for components when the URL pattern has been matched. An exclusive rule type defines repositories which should not be searched for a particular component. A blocking rule will completely remove accessibility to the components under the specific pattern in a specified repository group.
Ordered Route Repositories
The repository manager searches an ordered list of repositories to locate a particular component. This order only affects the order of routes used and not the order of the repositories searched. That order is set by the order of the repositories in the group repository’s configuration.
In Figure 6.19, “Routing Configuration Screen” you can see the two dummy routes that are configured as default routes. The first route is an inclusive route, and it is provided as an example of a custom route an organization might use to make sure that internally generated components are resolved from the Releases and Snapshots repositories only. If your organization’s group IDs all start with com.somecompany, and if you deploy internally generated components to the Releases and Snapshots repositories, this route will make sure that the repository manager doesn’t waste time trying to resolve these components from public repositories like the Central Repository or the Apache Snapshots repository.
The second dummy route is an exclusive route. This route excludes the Releases and Snapshots repositories when the request path contains /org/some-oss. This example might make more sense if we replaced codehaus. If the pattern was /org/apache, this rule is telling the repository manager to exclude the internal Releases and Snapshots repositories when it is trying to resolve these dependencies. In other words, don’t bother looking for an Apache dependency in your organization’s internal repositories.
Exclusive rules will positively impact performance, since the number of repositories that qualify for locating the component, and therefore the search effort, is reduced.
What if there is a conflict between two routes? The repository manager will process inclusive routes before it processes exclusive routes. Remember that routes only affect the repository manager’s resolution of components when it is searching a group. When it starts to resolve a component from a repository group, it will start with the list of repositories in the group. If there are matching inclusive routes, the repository manager will then take the intersection of the repositories in the group and the repositories in the inclusive route. The order as defined in the group will not be affected by the inclusive route. The repository manager will then take the result of applying the inclusive route and apply the exclusive route to that list of repositories. The resulting list is then searched for a matching component.
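The resolution order described above can be sketched in a few lines. This is a simplified model under stated assumptions rather than the repository manager's actual code: the `Route` class, the repository identifiers, and the patterns are hypothetical, patterns are assumed to match the whole path, and repositories from several matching inclusive routes are assumed to be combined as a union.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    pattern: str             # regular expression matched against the path
    rule_type: str           # "inclusive" or "exclusive"
    repositories: List[str]  # repository ids the route refers to


def resolve_repositories(group: List[str], routes: List[Route], path: str) -> List[str]:
    """Return the repositories of a group to search for path, in group order."""
    matching = [r for r in routes if re.fullmatch(r.pattern, path)]

    # Step 1: inclusive routes narrow the group to an intersection.
    candidates = list(group)
    inclusive = [r for r in matching if r.rule_type == "inclusive"]
    if inclusive:
        allowed = {repo for r in inclusive for repo in r.repositories}
        candidates = [repo for repo in candidates if repo in allowed]

    # Step 2: exclusive routes remove repositories from that result.
    excluded = {
        repo
        for r in matching
        if r.rule_type == "exclusive"
        for repo in r.repositories
    }
    return [repo for repo in candidates if repo not in excluded]


# Hypothetical group and routes for illustration.
group = ["releases", "snapshots", "central", "apache-snapshots"]
routes = [
    Route(r"/(com|org)/somecompany/.*", "inclusive", ["snapshots", "releases"]),
    Route(r"/org/some-oss/.*", "exclusive", ["releases", "snapshots"]),
]

print(resolve_repositories(group, routes, "/com/somecompany/app/1.0/app-1.0.jar"))
print(resolve_repositories(group, routes, "/org/some-oss/lib/2.1/lib-2.1.jar"))
print(resolve_repositories(group, routes, "/junit/junit/4.12/junit-4.12.jar"))
```

Note that the result for the inclusive route keeps the order defined in the group, even though the route itself lists the repositories in a different order.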
One straightforward use of routes is to create a route that excludes the Central Repository from all searches for your own organization’s hosted components. If you are deploying your own components to the repository manager under a group ID of org.mycompany, and if you are not deploying these components to a public repository, you can create a rule that tells the repository manager not to interrogate Central for your own organization’s components. This will improve performance because the repository manager will not need to communicate with a remote repository when it serves your own organization’s components. In addition to the performance benefits, excluding the Central Repository from searches for your own components will reduce needless queries to the public repositories.
Defining an inclusive route so that requests for your internal components only hit internal repositories is a crucial best practice for implementing secure component management in your organization and a recommended step for initial configuration of the repository manager. Without this configuration, requests for internal components will be broadcast to all configured external proxy repositories. This could lead to an information leak where, for example, your internet traffic reveals that your organization works on a component with the component coordinates of
In addition to defining inclusive and exclusive routes, you can define blocking routes. A blocking route can be created by creating a route with no repositories in the ordered list of repositories. It allows you to completely block access to components with the specified pattern(s) from the group. As such, blocking routes are a simplified, coarse-grained access control.
Check out Procurement Suite for fine-grained control of component availability and use blocking routes sparingly.
To summarize, there are creative possibilities with routes that the designers of Nexus Repository Manager Pro may not have anticipated, but we advise you to proceed with caution if you start relying on conflicting or overlapping routes. Use routes sparingly, and use coarse URL patterns. Remember that routes are only applied to groups and are not used when a component is requested from a specific repository.