Run impact analysis on fewer nodes
If an environment has a lot of nodes, impact analysis might take a long time to run. You can analyze only a subset of your total nodes, but there are tradeoffs:
- Not all nodes are analyzed. Running impact analysis on fewer nodes means that some nodes don't get analyzed. For example, if you analyze only 10% of your nodes, the remaining 90% are skipped. When your code is deployed, those excluded nodes might have unexpected changes that impact analysis never detected.
- Additional heap space is consumed. To run impact analysis on fewer nodes, you must create one or more dedicated impact analysis environments. Each impact analysis environment has the same code as its corresponding primary environment (for example, production and production-ia). Because every environment consumes heap space in Puppet Server, these additional environments consume additional heap space even though they deploy the same code.
Impact analysis runs on nodes in a designated environment. Therefore, if your control repo pipeline runs impact analysis on your production environment, it analyzes all nodes in the production environment node group. If you have a lot of nodes, this can take a long time to run and might be taxing on system resources. If your nodes are mostly similar, it might make sense to run impact analysis on a subset of your total nodes, rather than always analyzing every node. However, this requires changing your environment structure to accommodate one or more impact analysis environments.
For example, if you want to run impact analysis on a few nodes before deploying code to all production nodes, you'll need to set up a production impact analysis environment. First, create a production-ia branch in your control repo and deploy the new environment. Next, create a production-ia environment node group as a child of your production environment node group. Then, add nodes to the production-ia group representing a subset of your total production nodes.
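One way to choose that subset is to take a few nodes from every combination of operating system and location, and then add any known problematic nodes. The following is a minimal sketch of that idea, not part of the product. The Node fields, the pick_ia_nodes helper, and the example certnames are all hypothetical, and the inventory is assumed to come from your own source of node data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical inventory record; populate it from your own node data.
    certname: str
    os_family: str
    location: str
    known_problematic: bool = False


def pick_ia_nodes(nodes, per_stratum=1):
    """Return certnames for a small, representative subset of nodes.

    Keeps up to `per_stratum` nodes for every (os_family, location)
    combination, plus every node flagged as known problematic.
    """
    strata = defaultdict(list)
    for node in nodes:
        strata[(node.os_family, node.location)].append(node)

    selected = {}
    for members in strata.values():
        # Sort by certname so the selection is the same on every run.
        for node in sorted(members, key=lambda n: n.certname)[:per_stratum]:
            selected[node.certname] = node

    # Always include known problematic nodes, whatever their stratum.
    for node in nodes:
        if node.known_problematic:
            selected[node.certname] = node

    return sorted(selected)


# Made-up example inventory.
inventory = [
    Node("web01.example.com", "RedHat", "us-east"),
    Node("web02.example.com", "RedHat", "us-east"),
    Node("web03.example.com", "RedHat", "eu-west"),
    Node("db01.example.com", "Debian", "us-east"),
    Node("db02.example.com", "Debian", "us-east", known_problematic=True),
    Node("win01.example.com", "windows", "eu-west"),
]

ia_nodes = pick_ia_nodes(inventory)
```

The resulting list of certnames is a candidate set of nodes to add to the production-ia group. Raising per_stratum analyzes more nodes per combination at the cost of longer impact analysis runs.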
Impact analysis runs only on nodes in the production-ia group, so make sure the nodes in this group are a good representation of your total production nodes. For example, make sure to include different operating systems or geographic locations, as well as any outliers and known problematic nodes.
You now have two environments where your production code is deployed: production-ia, which contains some production nodes, and production, which contains all production nodes. To run impact analysis on the smaller production-ia group, you need to add the new production-ia environment to your control repo pipeline:
- Add a deployment for the production-ia environment, in addition to the production environment's deployment. Since you're deploying the same code to production-ia and production, you can configure your pipeline to auto-promote to the production deployment stage after completing the production-ia deployment.
- Edit the impact analysis task so that it only runs on nodes in the production-ia environment. Make sure the impact analysis task is set to Run for selected environments and includes only the production-ia environment. Since your goal is to analyze only a subset of nodes, you don't want to run impact analysis on the production environment anymore.
- Check the promotion settings between the impact analysis stage and the production-ia deployment stage. If you want to review the impact analysis report before deploying your code, make sure your pipeline doesn't auto-promote to the deployment stage.
Although this example uses the production environment, you can set up a similar structure for any environment you want to partially analyze, such as UAT or preproduction.