Guidelines for running Security Compliance Management at scale

You can run Security Compliance Management (SCM) on a maximum of 100,000 nodes. Before you run Security Compliance Management at scale, review the guidelines for configuring the environment and running the scan. The process of running Security Compliance Management at scale was tested by Puppet in a controlled environment. Because many factors affect performance, results in your system environment might vary.

System requirements and configuration for large-scale environments

To support environments with more than 10,000 nodes, your Security Compliance Management installation needs a total of at least 16GB of memory and 100GB of storage space available.

Depending on your node count, scan frequency, and desired retention period, you might also need to increase the storage capacity and memory available to the Security Compliance Management node. The following table lists recommended values for several landmark node counts. These values assume one scan per week and the default data retention period of 14 weeks.


Node count (14 weeks data retention at 1 scan per week)	Additional capacity	Additional memory
1,000 nodes	Default	Default
10,000 nodes	25Gi	Default
50,000 nodes	75Gi	4Gi
75,000 nodes	105Gi	8Gi
100,000 nodes	135Gi	12Gi

For higher node counts, add 6Gi to the PostgreSQL capacity for each additional 5,000 nodes. For longer retention periods, divide the calculated storage requirement by the default retention period (14 weeks) to determine the per-week storage requirement, and then multiply the result by the desired retention period in weeks.
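The sizing rules above can be sketched as a small calculation. This is an illustrative estimate only, not a Puppet tool. The function name `additional_capacity_gi` is hypothetical, "Default" in the table is treated as 0Gi of additional capacity, and the 6Gi increment is assumed to apply from the nearest landmark node count at or below your node count, with partial blocks of 5,000 nodes rounded up.

```python
import math

# Additional capacity in Gi, copied from the table above (keyed by node count).
# Assumption: "Default" means no additional capacity, represented here as 0.
TABLE_CAPACITY_GI = {1_000: 0, 10_000: 25, 50_000: 75, 75_000: 105, 100_000: 135}

DEFAULT_RETENTION_WEEKS = 14
GI_PER_EXTRA_5000_NODES = 6


def additional_capacity_gi(nodes: int, retention_weeks: int = DEFAULT_RETENTION_WEEKS) -> float:
    """Estimate additional capacity in Gi, assuming one scan per week."""
    # Assumption: start from the nearest table landmark at or below `nodes`.
    landmark = max((n for n in TABLE_CAPACITY_GI if n <= nodes), default=1_000)
    # Add 6Gi for each additional 5,000 nodes (partial blocks rounded up).
    extra_blocks = math.ceil(max(nodes - landmark, 0) / 5_000)
    base = TABLE_CAPACITY_GI[landmark] + GI_PER_EXTRA_5000_NODES * extra_blocks
    # Per-week requirement multiplied by the desired retention period.
    return base * retention_weeks / DEFAULT_RETENTION_WEEKS


if __name__ == "__main__":
    print(additional_capacity_gi(100_000))      # table value for 100,000 nodes
    print(additional_capacity_gi(100_000, 28))  # same node count, 28 weeks retention
```

For example, doubling the retention period to 28 weeks doubles the estimate for the same node count.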

Configure the scan process

To help optimize the scan process, follow these guidelines:

In Puppet orchestrator, set the task_concurrency parameter to a value appropriate for your environment and number of nodes. This value sets the maximum number of task or plan actions that can run concurrently in the orchestrator. For example, if you set the parameter to 250 and run a scan of 5,000 nodes, the orchestrator is fully consumed until the scans are completed on all 5,000 nodes. (For more information about optimizing performance, see Tune task and plan performance in Puppet Enterprise (PE).)
Schedule scans to coincide with periods of minimal workflow to help ensure adequate network throughput.
Plan adequate time for the initial inventory ingestion from Puppet Enterprise (PE). In lab testing, the ingestion of 100,000 nodes took 20 minutes.
If you have a large number of nodes, consider configuring ad hoc and scheduled scans in smaller batches of up to 10,000 nodes.
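To make the concurrency and batching guidance concrete, the following sketch estimates how many rounds of work a scan needs at a given task_concurrency value and splits a node list into batches of up to 10,000 nodes. The helper names `task_waves` and `scan_batches` are hypothetical. The wave count is an idealized lower bound, because the orchestrator does not necessarily process nodes in strict rounds.

```python
import math
from typing import Iterator, Sequence


def task_waves(node_count: int, task_concurrency: int) -> int:
    """Idealized number of full-concurrency rounds needed to cover all nodes."""
    return math.ceil(node_count / task_concurrency)


def scan_batches(nodes: Sequence[str], batch_size: int = 10_000) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `batch_size` nodes."""
    for start in range(0, len(nodes), batch_size):
        yield nodes[start:start + batch_size]


if __name__ == "__main__":
    # The example from the guideline: 5,000 nodes at task_concurrency 250.
    print(task_waves(5_000, 250))  # 20

    # Hypothetical node names, used only to show the batch sizes.
    inventory = [f"node{i}.example.com" for i in range(25_000)]
    print([len(batch) for batch in scan_batches(inventory)])  # [10000, 10000, 5000]
```

Each batch could then be used as the node list for a separate ad hoc or scheduled scan.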

Upgrade Security Compliance Management in a large-scale environment

Before you upgrade Security Compliance Management in an environment with thousands of nodes, review the limitations and consider the best strategy for your environment.

During the standard upgrade process, a new version of the CIS-CAT Pro Assessor is downloaded to each Puppet-managed node. However, Security Compliance Management supports a limited number of concurrent downloads of the assessor. In lab testing, a maximum of about 120 concurrent downloads was achieved. Thus, if you initiate an upgrade of thousands of nodes, not all nodes are updated on the first run.

You can resolve the issue in one of the following ways:

Run Puppet manually on a maximum of 120 nodes at a time. Repeat the process until all nodes are updated.
Configure Security Compliance Management to host the assessor file on an internal web server and then upgrade Security Compliance Management. If you choose this option, ensure that you host the correct assessor bundle for your operating system.

To host the assessor file internally and upgrade Security Compliance Management, complete the following steps:

Download the appropriate assessor bundle for your operating system. The assessor bundles are located at:
- https://<SCM_FQDN>/files/assessor/linux
- https://<SCM_FQDN>/files/assessor/mac
- https://<SCM_FQDN>/files/assessor/windows
In the Puppet Enterprise (PE) console, click Node Groups > PE Infrastructure > PE Agent > Classes.
In the Add new class field, select the Security Compliance Management class.
In the Parameter name field, select scanner_source.
Set the value of scanner_source to the URL where the assessor is hosted. For example, the URL can have the following structure, where server-hosting-assessor-ip is the IP address of the server that hosts the assessor and os is one of mac, linux, or windows:
```
http://server-hosting-assessor-ip/assessor/os/assessor.zip
```
Commit the changes.
In the PE console, click Run > Puppet.
Complete the upgrade process by selecting the relevant nodes and running the job.
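As a small aid for the scanner_source step above, the following sketch builds a URL with the example structure shown in that step and rejects unsupported operating system values. The function name `scanner_source_url` is hypothetical, and the path layout is only the example structure from this page. Adjust it to match how your internal web server actually serves the file.

```python
# Operating system values named in the example URL structure.
SUPPORTED_OS = ("linux", "mac", "windows")


def scanner_source_url(host: str, os_name: str) -> str:
    """Build a scanner_source value that follows the example URL structure."""
    if os_name not in SUPPORTED_OS:
        raise ValueError(f"os must be one of {SUPPORTED_OS}, got {os_name!r}")
    # Assumption: the assessor bundle is served over HTTP at this example path.
    return f"http://{host}/assessor/{os_name}/assessor.zip"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address used as a placeholder.
    print(scanner_source_url("192.0.2.10", "linux"))
```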

Optimize scanning and reporting at scale

You can compare the results of your scanning and reporting processes against the results obtained in lab testing. If performance is not adequate in your environment, determine the cause of bottlenecks and address the issues.

Security Compliance Management has been tested and can process reports from up to 100,000 nodes in a single scan. Processing this number of reports can take up to 120 minutes, depending on system resources. However, total scan time might be significantly longer, depending on Puppet orchestrator concurrency limits and the time that the CIS-CAT Pro Assessor takes to run on individual nodes. If you have a large number of nodes, consider configuring ad hoc and scheduled scans in smaller batches of up to 10,000 nodes.

Raw data exports of node results can take up to 36 minutes for 90,000 nodes, or up to 4 minutes per batch of 10,000 nodes. Allow additional time if you generate several exports of more than 10,000 nodes concurrently.
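The export figures above imply a simple upper-bound estimate, sketched below. The function name `export_minutes_upper_bound` is hypothetical, and the estimate assumes that the lab figure of up to 4 minutes per 10,000 nodes applies to each started batch and that exports do not run concurrently. Results in your environment might differ.

```python
import math

MINUTES_PER_BATCH = 4      # lab figure: up to 4 minutes per batch
NODES_PER_BATCH = 10_000   # lab figure: batch of 10,000 nodes


def export_minutes_upper_bound(node_count: int) -> int:
    """Rough upper bound on raw data export time, in minutes."""
    # Assumption: a partial batch costs as much as a full one.
    return math.ceil(node_count / NODES_PER_BATCH) * MINUTES_PER_BATCH


if __name__ == "__main__":
    print(export_minutes_upper_bound(90_000))  # 36, matching the lab figure
```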

The assessor run times are affected by the host type. In general, scans on Microsoft Windows systems take longer than scans on *nix systems. Run times can vary significantly, depending on many other factors. For example, run times are longer for nodes with many user accounts and for nodes with many types of software installed. Results obtained in the lab represent an optimal use case.