Data Retention and Purging

NEW IN RELEASE 63

Every policy evaluation that you perform produces data, for example in the form of application composition reports, that is persisted in IQ Server's working directory. As you onboard hundreds or even thousands of applications into IQ Server, each having daily policy evaluations thanks to integration with your CI system or continuous monitoring, the disk space consumption can grow notably over time. Naturally, you want to free up some of that disk space by deleting obsolete data that you no longer care about. And to define what denotes "obsolete data", IQ Server offers user-defined data retention policies.

The specifics of those retention policies vary depending on the kind of data they govern and are described in more detail below. The general idea is that once data meets the criteria for being obsolete, IQ Server automatically purges it from the system via background tasks.

Application Reports

An application report for some internal dev build typically becomes irrelevant after a few months. In contrast, a report for a release/production version of your application should probably be kept for years to match the lifetime of the corresponding binary or simply to meet legal requirements. Consequently, you can specify different retention policies for each stage of the application lifecycle (build, release, operate, etc.).

The daily application reports produced by continuous monitoring are not governed by the retention policy of the stage being monitored. Instead, they are subject to a separate retention policy of their own.

Any purging needed to satisfy the configured retention policies is carried out automatically by a background task. That task runs once a day, around midnight local server time.

Inheritance of Retention Policies

Similar to license threat groups, retention policies can be configured for each organization individually. Applications simply inherit the retention policies from their parent organization, which in turn can inherit them from the root organization.

We recommend using the root organization to centrally manage the retention policies and overriding them only in organizations where business constraints truly demand different handling for a given lifecycle stage.
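
The inheritance chain described above can be pictured as a walk from the application up to the root organization. The following is only an illustrative sketch, not IQ Server's implementation: the `Node` class, the `effective_policy` function, and the plain-string policy values are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical stand-in for an application or an organization."""
    name: str
    parent: Optional["Node"] = None
    # Stage name -> policy. A stage that is absent here is inherited.
    policies: Dict[str, str] = field(default_factory=dict)


def effective_policy(node: Node, stage: str) -> Optional[str]:
    """Return the nearest policy for the stage, walking up towards the root."""
    current: Optional[Node] = node
    while current is not None:
        if stage in current.policies:
            return current.policies[stage]
        current = current.parent
    return None  # nothing configured anywhere in the chain


# Assumed example hierarchy: the root manages policies centrally,
# and one organization overrides only the build stage.
root = Node("Root Organization", policies={"build": "3 months", "release": "10 years"})
org = Node("Payments", parent=root, policies={"build": "1 month"})
app = Node("checkout-service", parent=org)

print(effective_policy(app, "build"))    # taken from the organization override
print(effective_policy(app, "release"))  # taken from the root organization
```

Keeping overrides sparse, as recommended above, means most lookups in this sketch fall through to the root organization.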

Structure of a Retention Policy

For a given stage, a retention policy for application reports consists of these three properties:

  • A switch that generally enables or disables automatic purging of obsolete reports.
    Given the issue of ongoing disk space consumption, you normally want to enable automatic purging of some form. This switch is merely meant as an escape hatch in case the retention policy cannot adequately capture your business needs and you would rather use some manual purge process outside of IQ Server.
  • A criterion that specifies the maximum age an application report can reach before it should be purged.
    This criterion is expressed in N days, weeks, months or years and is the recommended means to define obsolete reports. Do note that the latest application report in a given stage is always kept, regardless of how long ago it was created.
  • A criterion that denotes the maximum number of reports that are kept before the oldest among them should be purged.
    This criterion is useful for cases where a high frequency of policy evaluations produces more reports within a short time period than you are willing to accumulate on disk.

The two purge criteria are independent of each other. When using automatic purging, you can specify only one of them or both. Whichever criterion is satisfied first will initiate purging of reports. For instance, if the retention policy retains reports for the last month but only 100 reports at most, the 101st report to occur for the stage will cause purging of the oldest report even if that report is still younger than one month. Likewise, any report (except the latest) older than one month gets purged, even if the total number of reports in that stage is already less than 100.
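
To make the interplay of the two criteria concrete, here is a minimal sketch of how the reports of a single stage could be selected for purging. It is an illustration under stated assumptions, not IQ Server's actual code: the `Report` class and the `reports_to_purge` function are hypothetical, and the maximum age is modeled as a `timedelta` (so "one month" is approximated as 30 days).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Report:
    """Hypothetical record of one application report in a stage."""
    report_id: str
    created: datetime


def reports_to_purge(
    reports: List[Report],
    now: datetime,
    max_age: Optional[timedelta] = None,
    max_count: Optional[int] = None,
    enabled: bool = True,
) -> List[Report]:
    """Select the reports of one stage that either criterion marks as obsolete."""
    if not enabled or not reports:
        return []
    newest_first = sorted(reports, key=lambda r: r.created, reverse=True)
    purge = []
    # Position 1 is the latest report, which is always kept, so start at 2.
    for position, report in enumerate(newest_first[1:], start=2):
        too_old = max_age is not None and now - report.created > max_age
        over_count = max_count is not None and position > max_count
        if too_old or over_count:  # whichever criterion is met first
            purge.append(report)
    return purge


now = datetime(2019, 3, 14)
stage_reports = [
    Report("a", now - timedelta(days=1)),
    Report("b", now - timedelta(days=10)),
    Report("c", now - timedelta(days=40)),
]
# Age only: report "c" is older than 30 days and gets selected.
print([r.report_id for r in reports_to_purge(stage_reports, now, max_age=timedelta(days=30))])
# Count only: keeping a single report selects "b" and "c" despite their age.
print([r.report_id for r in reports_to_purge(stage_reports, now, max_count=1)])
```

Note how the latest report is excluded before either criterion is checked, which mirrors the rule that it is kept regardless of its age.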

Default Retention Policies

For fresh IQ Server installations, the following default retention policies are put in place by the root organization:

Stage                   Maximum Age
develop                 3 months
build                   3 months
stage-release           3 months
release                 10 years
operate                 10 years
continuous-monitoring   3 months

For existing IQ Server installations that predate the introduction of data retention policies, you will need to manually enable automatic purging after upgrading. In other words, IQ Server will not purge any reports after being upgraded unless you explicitly opt in.

Trash Directory

Enabling automatic purging and choosing appropriate purge criteria can be a daunting task, especially for a big IQ Server installation: Is it really safe to purge any reports older than X or does somebody/something still rely on them? Given the uncertainty involved in answering that question, IQ Server provides a safety net in the form of a trash directory. Purged reports are not completely deleted but rather moved into this trash directory. This way, the decision to permanently delete reports is ultimately up to you, after you have had the chance to assess who might miss the purged reports.

The trash directory resides in a sub directory called trash of IQ Server's working directory, so by default it's located at sonatype-work/clm-server/trash.

The trash is further divided into sub directories of the form YYYY-MM-DD, denoting the date (in ISO 8601 form) when the contained files were purged. Hence, you can easily tell how long ago some reports were purged and conclude that it is now safe to permanently delete them, given that nobody has since complained about their absence.
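
Because the date directories use ISO 8601 names, finding trash that has been sitting around long enough is a matter of parsing directory names. The sketch below only lists candidates and deletes nothing. The function name `trash_dirs_older_than` and the 30-day threshold are assumptions chosen for illustration.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import List, Optional


def trash_dirs_older_than(trash_root, days: int, today: Optional[date] = None) -> List[Path]:
    """List YYYY-MM-DD trash directories purged more than `days` days ago."""
    today = today or date.today()
    cutoff = today - timedelta(days=days)
    old = []
    for entry in sorted(Path(trash_root).iterdir()):
        if not entry.is_dir():
            continue
        try:
            purged_on = date.fromisoformat(entry.name)
        except ValueError:
            continue  # not a date directory, leave it alone
        if purged_on < cutoff:
            old.append(entry)
    return old


if __name__ == "__main__":
    trash = Path("sonatype-work/clm-server/trash")  # default location
    if trash.is_dir():
        for directory in trash_dirs_older_than(trash, days=30):
            print(directory)
```

Reviewing the printed list before deleting anything by hand keeps the final decision with you, as intended by the trash directory.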

To avoid directories with so many files that browsing them becomes a pain, the date directories use the first two hex characters of the internal application id to organize their files into another level of sub directories. Within these sub directories reside the purged reports, in the form of a single ZIP file per report.

The ZIP files are named using the pattern app-{internalApplicationId}-report-{reportId}.zip. The reportId portion in particular allows you to locate a purged report if somebody misses it later on.
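
Given this naming pattern, a purged report can be located by its report id alone with a glob over the date and application sub directories. This is a small sketch. `find_purged_report` is a hypothetical helper, and the report id used in the example is the one from the sample path in this section.

```python
from pathlib import Path
from typing import List


def find_purged_report(trash_root, report_id: str) -> List[Path]:
    """Search every date and application-prefix directory for a report id."""
    # Layout: <trash>/<YYYY-MM-DD>/<first two hex chars>/app-<appId>-report-<reportId>.zip
    pattern = f"*/*/app-*-report-{report_id}.zip"
    return sorted(Path(trash_root).glob(pattern))


if __name__ == "__main__":
    matches = find_purged_report(
        "sonatype-work/clm-server/trash",
        "0ab5a667857e49ad9a75ac3f270c44e0",
    )
    for match in matches:
        print(match)
```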

Application reports tend to compress well, so when they are purged and zipped up into the trash directory, disk space is already freed up even though the reports are not fully gone yet.

To sum up the structure of the trash directory, the following is an example path of a purged report: sonatype-work/clm-server/trash/2019-03-14/42/app-42794458631b458294c7a4ec7ad55657-report-0ab5a667857e49ad9a75ac3f270c44e0.zip
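
The layout can also be expressed as a small path-building function, which is handy when scripting around the trash directory. This is a sketch based purely on the structure described in this section. The function name `trash_path` is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def trash_path(work_dir: str, purged_on: date, app_id: str, report_id: str) -> PurePosixPath:
    """Compose the trash location of a purged report from its parts."""
    return (
        PurePosixPath(work_dir)
        / "trash"
        / purged_on.isoformat()  # YYYY-MM-DD date directory
        / app_id[:2]             # first two hex characters of the application id
        / f"app-{app_id}-report-{report_id}.zip"
    )


print(trash_path(
    "sonatype-work/clm-server",
    date(2019, 3, 14),
    "42794458631b458294c7a4ec7ad55657",
    "0ab5a667857e49ad9a75ac3f270c44e0",
))
```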

To restore any given report from the trash, simply unzip it into the report sub directory of IQ Server's working directory, i.e. sonatype-work/clm-server/report by default.
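
If you prefer to script the restore, the unzip step could look like the following sketch. It assumes that the paths stored inside the ZIP file are relative to the report directory, as the instruction above implies. The helper name `restore_report` is hypothetical.

```python
import zipfile
from pathlib import Path


def restore_report(zip_path, work_dir) -> Path:
    """Unzip a purged report into the report sub directory of the working directory."""
    target = Path(work_dir) / "report"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


if __name__ == "__main__":
    purged = Path(
        "sonatype-work/clm-server/trash/2019-03-14/42/"
        "app-42794458631b458294c7a4ec7ad55657-report-0ab5a667857e49ad9a75ac3f270c44e0.zip"
    )
    if purged.is_file():
        print(restore_report(purged, "sonatype-work/clm-server"))
```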

Success Metrics

To derive data about the number of discovered policy violations and their average age until resolution for Success Metrics, IQ Server persists policy violation data in its database even after those violations have been resolved and no longer affect the application. But once this policy violation data has reached an age where it is no longer relevant to the time frame for which you need Success Metrics, it can be purged. The retention policy for Success Metrics allows you to control how long historic violation data is kept.

The necessary purging of violation data that the retention policy deems obsolete happens via a background task that runs daily during the night (local server time).

Note that this purging is limited to those policy violations which do not affect the current state of an application, i.e. policy violations that were previously resolved. Any unresolved policy violations, including those which have merely been waived or grandfathered, are not purged, regardless of how many years ago those violations were first discovered.
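
The rule above can be illustrated with a short selection sketch. Everything in it is hypothetical and is not IQ Server's data model: the `Violation` class, the `violations_to_purge` function, and in particular the assumption that a violation's age is measured from the time it was resolved, which this document does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Violation:
    """Hypothetical policy violation record."""
    violation_id: str
    discovered_at: datetime
    # None means unresolved, which here includes waived or grandfathered violations.
    resolved_at: Optional[datetime] = None


def violations_to_purge(
    violations: List[Violation],
    now: datetime,
    max_age: timedelta,
    enabled: bool = True,
) -> List[Violation]:
    """Select resolved violations that fall outside the retention period."""
    if not enabled:
        return []
    return [
        v for v in violations
        # Assumption: age counts from resolution. Unresolved violations never qualify.
        if v.resolved_at is not None and now - v.resolved_at > max_age
    ]


now = datetime(2019, 3, 14)
history = [
    Violation("resolved-long-ago", datetime(2016, 1, 1), resolved_at=datetime(2017, 1, 1)),
    Violation("resolved-recently", datetime(2018, 6, 1), resolved_at=datetime(2019, 1, 1)),
    Violation("still-open", datetime(2015, 1, 1)),
]
print([v.violation_id for v in violations_to_purge(history, now, timedelta(days=365))])
```

The violation that has been open since 2015 is never selected, no matter how old it is.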

Inheritance of the Retention Policy

Just like the retention policies for application reports, the retention policy for Success Metrics can be configured for each organization individually and is inherited by its applications. Each organization in turn can inherit the retention policy from the root organization.

Structure of the Retention Policy

The retention policy for Success Metrics boils down to two settings:

  • A switch to generally enable automatic purging or not.
  • The maximum length of the time period for which Success Metrics should be provided.

Default Retention Policy

For pristine IQ Server installations, the root organization configures the retention policy to provide Success Metrics only for the last 12 months and purge any older data.

If you are instead upgrading an existing IQ Server installation, automatic purging remains disabled by default until you explicitly configure the server otherwise.