Hiera hierarchies
Hiera looks up data by following a hierarchy, an ordered list of data sources.
Hierarchies are configured in a hiera.yaml configuration file. Each level of the hierarchy tells Hiera how to access some kind of data source. A hierarchy is usually organized like this:
---
version: 5
defaults:  # Used for any hierarchy level that omits these keys.
  datadir: data         # This path is relative to hiera.yaml's directory.
  data_hash: yaml_data  # Use the built-in YAML backend.
hierarchy:
  - name: "Per-node data"                   # Human-readable name.
    path: "nodes/%{trusted.certname}.yaml"  # File path, relative to datadir.
                                   # ^^^ IMPORTANT: include the file extension!
  - name: "Per-datacenter business group data" # Uses custom facts.
    path: "location/%{facts.whereami}/%{facts.group}.yaml"
  - name: "Global business group data"
    path: "groups/%{facts.group}.yaml"
  - name: "Per-datacenter secret data (encrypted)"
    lookup_key: eyaml_lookup_key   # Uses non-default backend.
    path: "secrets/%{facts.whereami}.eyaml"
    options:
      pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem
      pkcs7_public_key:  /etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem
  - name: "Per-OS defaults"
    path: "os/%{facts.os.family}.yaml"
  - name: "Common data"
    path: "common.yaml"

In this example, every level configures the path to a YAML file on disk.
Hierarchies interpolate variables
Most levels of a hierarchy interpolate variables into their configuration:
path: "os/%{facts.os.family}.yaml"
The percent-and-braces %{variable} syntax is a Hiera interpolation token. It is similar to the Puppet language’s ${expression} interpolation tokens. Wherever you use an interpolation token, Hiera determines the variable’s value and inserts it into the hierarchy.
The facts.os.family token uses Hiera’s special key.subkey notation for accessing elements of hashes and arrays. It is equivalent to $facts['os']['family'] in the Puppet language, but the dot notation produces an empty string instead of raising an error if part of the data is missing. Make sure that an empty interpolation does not end up matching an unintended path.
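The empty-string behavior is easiest to see in a small simulation. The following Python sketch is not Hiera's implementation. It assumes that a token contains only a plain dotted key and ignores other things a real token may contain. The helper names dig and interpolate and the sample facts are hypothetical.

```python
import re

# Matches a %{...} interpolation token and captures what is inside the braces.
TOKEN = re.compile(r"%\{([^}]+)\}")


def dig(scope, dotted_key):
    """Walk a dotted key such as 'facts.os.family' through nested dicts and lists.

    Returns an empty string as soon as any segment is missing.
    """
    value = scope
    for segment in dotted_key.split("."):
        if isinstance(value, dict) and segment in value:
            value = value[segment]
        elif isinstance(value, list) and segment.isdigit() and int(segment) < len(value):
            value = value[int(segment)]
        else:
            return ""
    return str(value)


def interpolate(template, scope):
    """Replace every %{key.subkey} token in template with its value from scope."""
    return TOKEN.sub(lambda match: dig(scope, match.group(1).strip()), template)


# Hypothetical node data: this node has no 'group' fact.
scope = {"facts": {"os": {"family": "Debian"}, "whereami": "belfast"}}

print(interpolate("os/%{facts.os.family}.yaml", scope))
# os/Debian.yaml
print(interpolate("location/%{facts.whereami}/%{facts.group}.yaml", scope))
# location/belfast/.yaml
```

The second result shows the risk: because the group fact is missing, the path collapses to location/belfast/.yaml, which is a valid file name that could match data it was never meant to match.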
You can only interpolate values into certain parts of the config file. For more info, see the hiera.yaml format reference.
With node-specific variables, each node gets a customized set of paths to data. The hierarchy is always the same.
Hiera searches the hierarchy in order
After Hiera replaces the variables to make a list of concrete data sources, it checks those data sources in the order they were written.
Generally, if a data source doesn’t exist, or doesn’t specify a value for the current key, Hiera skips it and moves on to the next source until it finds one that provides a value, then uses that value. This first-found behavior is the default merge strategy, but it does not always apply: Hiera can also use data from all data sources and merge the results.
Earlier data sources have priority over later ones. In the example above, the node-specific data has the highest priority, and can override data from any other level. Business group data is separated into local and global sources, with the local one overriding the global one. Common data used by all nodes always goes last.
That’s how Hiera’s “defaults, with overrides” approach to data works — you specify common data at lower levels of the hierarchy, and override it at higher levels for groups of nodes with special needs.
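As a rough illustration of the search order, the sketch below models data sources as an ordered Python list, highest priority first. It is a simplification under stated assumptions, not Hiera's code: files are replaced by in-memory dictionaries, None stands for a file that does not exist, and the keys, values, and function names are hypothetical. The second function loosely models a merge that collects array values from every source and drops duplicates.

```python
def lookup_first(key, sources):
    """First-found lookup: return the value from the highest-priority source that defines key."""
    for _path, data in sources:
        if data is None:   # the data source does not exist: skip it
            continue
        if key in data:    # the first source that defines the key wins
            return data[key]
    raise KeyError(key)


def lookup_unique(key, sources):
    """Merging lookup: collect array values from every source, without duplicates."""
    merged = []
    for _path, data in sources:
        for item in (data or {}).get(key, []):
            if item not in merged:
                merged.append(item)
    return merged


# Hypothetical data, ordered from highest to lowest priority.
sources = [
    ("nodes/thrush.example.com.yaml", None),  # no per-node file for this node
    ("groups/ops.yaml", {"ntp::servers": ["ntp1.ops.example.com"]}),
    ("common.yaml", {
        "ntp::servers": ["0.pool.ntp.org", "ntp1.ops.example.com"],
        "ntp::service_enable": True,
    }),
]

print(lookup_first("ntp::servers", sources))
# ['ntp1.ops.example.com']
print(lookup_unique("ntp::servers", sources))
# ['ntp1.ops.example.com', '0.pool.ntp.org']
print(lookup_first("ntp::service_enable", sources))
# True
```

The group-level value overrides the common one in the first-found lookup, while a key that only appears in common.yaml still resolves because every higher level is skipped.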
Layered hierarchies
Hiera uses layers of data with a hiera.yaml for each layer.
Each layer can configure its own independent hierarchy. Before a lookup, Hiera combines them into a single super-hierarchy: global → environment → module.
Note: A fourth layer, default_hierarchy, can be used in a module’s hiera.yaml. It only comes into effect when there is no data for a key in any of the other regular hierarchies.
Suppose the example above is the hierarchy for the production environment. If the global layer also had the following hierarchy:
---
version: 5
hierarchy:
  - name: "Data exported from our old self-service config tool"
    path: "selfserve/%{trusted.certname}.json"
    data_hash: json_data
    datadir: data

And the NTP module had the following hierarchy for default data:
---
version: 5
hierarchy:
  - name: "OS values"
    path: "os/%{facts.os.name}.yaml"
  - name: "Common values"
    path: "common.yaml"
defaults:
  data_hash: yaml_data
  datadir: data

Then in a lookup for the ntp::servers key, thrush.example.com would use the following combined hierarchy:
<CODEDIR>/data/selfserve/thrush.example.com.json
<CODEDIR>/environments/production/data/nodes/thrush.example.com.yaml
<CODEDIR>/environments/production/data/location/belfast/ops.yaml
<CODEDIR>/environments/production/data/groups/ops.yaml
<CODEDIR>/environments/production/data/os/Debian.yaml
<CODEDIR>/environments/production/data/common.yaml
<CODEDIR>/environments/production/modules/ntp/data/os/Ubuntu.yaml
<CODEDIR>/environments/production/modules/ntp/data/common.yaml
The combined hierarchy works the same way as a layer hierarchy. Hiera skips empty data sources, and either returns the first found value or merges all found values.
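The way the layers line up can be sketched in a few lines of Python. This is only an illustration: the layer roots and the already-interpolated paths are copied from the listing above, and the expand helper is a hypothetical name, not part of Hiera.

```python
def expand(layer_root, datadir, paths):
    """Turn a layer's relative paths into full paths under that layer's datadir."""
    return [f"{layer_root}/{datadir}/{path}" for path in paths]


# Layers in priority order: global, then environment, then module.
layers = [
    ("<CODEDIR>", "data", [
        "selfserve/thrush.example.com.json",
    ]),
    ("<CODEDIR>/environments/production", "data", [
        "nodes/thrush.example.com.yaml",
        "location/belfast/ops.yaml",
        "groups/ops.yaml",
        "os/Debian.yaml",
        "common.yaml",
    ]),
    ("<CODEDIR>/environments/production/modules/ntp", "data", [
        "os/Ubuntu.yaml",
        "common.yaml",
    ]),
]

# Concatenating the layers in order gives the single combined hierarchy.
combined = [
    full_path
    for layer_root, datadir, paths in layers
    for full_path in expand(layer_root, datadir, paths)
]

for full_path in combined:
    print(full_path)
```

Because the layers are simply concatenated, every global source outranks every environment source, which in turn outranks every module source.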
Note: datadir refers to the directory named ‘data’ next to the hiera.yaml file.
Tips for making a good hierarchy
- Keep the hierarchy short. Fewer levels make the data files easier to work with.
- Use the roles and profiles method to manage less data in Hiera. Sorting hundreds of class parameters is easier than sorting thousands.
- If the built-in facts don’t provide an easy way to represent differences in your infrastructure, make custom facts. For example, create a custom datacenter fact that is based on information particular to your network layout so that each datacenter is uniquely identifiable.
- Give each environment (production, test, development) its own hierarchy.