Style Guide

This style guide covers writing and managing configurations.

Function & Pattern Libraries

Use the function libraries and pattern libraries -- seriously!

Formatting & Linting

Use the Jsonnet tooling for formatting (jsonnetfmt) and linting (jsonnet-lint) configuration files. Both tools can be added to CI/CD pipelines to keep configs consistent across a team.

Organization

Configuration files should be organized by pipeline and resource using this hierarchical folder structure: root/[pipeline]/[resource]/

This hierarchy supports three layers of configuration:

  • global -- configs used in multiple pipelines, stored in root/foo.libsonnet
  • regional -- configs used in multiple resources of a single pipeline, stored in root/[pipeline]/foo.libsonnet
  • local -- configs used in one resource of a single pipeline, stored in root/[pipeline]/[resource]/foo.libsonnet
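As a rough sketch of how the three layers might be used together, a resource-level file could import one library from each layer. The file name config.jsonnet and the processors field exported by each library are assumptions made for illustration; [pipeline] and [resource] remain placeholders.

```jsonnet
// File: root/[pipeline]/[resource]/config.jsonnet (hypothetical name).
// Jsonnet resolves import paths relative to the importing file.
local global = import '../../foo.libsonnet';  // root/foo.libsonnet
local regional = import '../foo.libsonnet';   // root/[pipeline]/foo.libsonnet
local resource = import 'foo.libsonnet';      // root/[pipeline]/[resource]/foo.libsonnet

{
  // Assumes each library exports an array named `processors`.
  processors: global.processors + regional.processors + resource.processors,
}
```

Keeping each import at the narrowest layer that needs it makes it easier to see which pipelines a change will affect.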

Further segmentation of files at the local level is recommended if users want to logically group configs or if a single config becomes too large (the larger the config, the harder it is to understand).

For example, configs for processing event data into the Elastic Common Schema (ECS) are easier to manage if they are logically grouped according to the ECS data model (e.g., client.* fields are in client.libsonnet, process.* fields are in process.libsonnet, user.* fields are in user.libsonnet, etc.).
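One way the segmented ECS files could be recombined is sketched below. It assumes, hypothetically, that each file exports an object with a processors array and that the combining file sits in the same folder.

```jsonnet
// File: root/[pipeline]/[resource]/config.jsonnet (hypothetical name).
// Each import is one logical group of the ECS data model.
local groups = [
  import 'client.libsonnet',
  import 'process.libsonnet',
  import 'user.libsonnet',
];

{
  // Assumes every group exports `processors`; merge them into one flat array.
  processors: std.flattenArrays([g.processors for g in groups]),
}
```

Adding another field group then only requires a new file and one more entry in the list.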

Variables

If you are referencing a single value more than twice, then it should almost always be defined as a variable. In some cases you may even want to use a variable if it describes a value more clearly than the value's literal object key does.

{
  processors: [
    process.process(
      condition=operatorPatterns.and([
        condition.inspector(options=condition.regexp(expression='([0-9]{1,3}\\.){3}[0-9]{1,3}'), key='req.details.LocalAddr'),
        condition.inspector(options=condition.strings(expression='OUTBOUND', type='equals'), key='networkDirection'),
      ]),
      options=process.copy,
      key='req.details.LocalAddr',
      set_key='client_ip',
    ),
    process.process(
      condition=operatorPatterns.and([
        condition.inspector(options=condition.regexp(expression='([0-9]{1,3}\\.){3}[0-9]{1,3}'), key='req.details.LocalAddr'),
        condition.inspector(options=condition.strings(expression='INBOUND', type='equals'), key='networkDirection'),
      ]),
      options=process.copy,
      key='req.details.LocalAddr',
      set_key='server_ip',
    ),
  ],
}

The same config with the repeated key and condition defined as variables:

local public_addr = 'req.details.LocalAddr';
local inspect_addr = condition.inspector(options=condition.regexp(expression='([0-9]{1,3}\\.){3}[0-9]{1,3}'), key=public_addr);

{
  processors: [
    process.process(
      condition=operatorPatterns.and([
        inspect_addr,
        condition.inspector(options=condition.strings(expression='OUTBOUND', type='equals'), key='networkDirection'),
      ]),
      options=process.copy,
      key=public_addr,
      set_key='client_ip',
    ),
    process.process(
      condition=operatorPatterns.and([
        inspect_addr,
        condition.inspector(options=condition.strings(expression='INBOUND', type='equals'), key='networkDirection'),
      ]),
      options=process.copy,
      key=public_addr,
      set_key='server_ip',
    ),
  ],
}
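A variable can also be worth defining for a value used only once, when its name explains the value better than the literal does. The minimal sketch below uses the plain condition object shape found elsewhere in this guide; the variable names are illustrative only.

```jsonnet
// The names state what each literal is meant to represent.
local ipv4_pattern = '([0-9]{1,3}\\.){3}[0-9]{1,3}';
local local_addr = 'req.details.LocalAddr';

{
  conditions: [
    {
      type: 'regexp',
      settings: {
        key: local_addr,
        expression: ipv4_pattern,
        negate: false,
      },
    },
  ],
}
```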

Functions

If you are using the same config blocks many times, then they should almost always be defined as a function. If the function can be reused across many pipelines, then it should be defined globally.

local public_addr = 'req.details.LocalAddr';

{
  conditions: [
    {
      type: 'regexp',
      settings: {
        key: public_addr,
        expression: '([0-9]{1,3}\\.){3}[0-9]{1,3}',
        negate: false,
      },
    },
    {
      type: 'regexp',
      settings: {
        key: 'event_type',
        expression: 'network_connect',
        negate: true,
      },
    },
    {
      type: 'regexp',
      settings: {
        key: 'network_direction',
        expression: 'OUTBOUND',
        negate: false,
      },
    },
  ],
}

The same config with the repeated block defined as a function:

local public_addr = 'req.details.LocalAddr';

local regexp(key, expression, negate=false) = {
  type: 'regexp',
  settings: {
    key: key,
    expression: expression,
    negate: negate,
  },
};

{
  conditions: [
    regexp(public_addr, '([0-9]{1,3}\\.){3}[0-9]{1,3}'),
    regexp('event_type', 'network_connect', negate=true),
    regexp('network_direction', 'OUTBOUND'),
  ],
}
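To define a function like this globally, it could be exported as a field of a library object. The sketch below inlines the library so that it runs on its own; placing it in root/foo.libsonnet and naming it lib are assumptions for illustration.

```jsonnet
// Contents that would live in a global library such as root/foo.libsonnet.
local lib = {
  // Builds one regexp condition; `negate` defaults to false.
  regexp(key, expression, negate=false): {
    type: 'regexp',
    settings: {
      key: key,
      expression: expression,
      negate: negate,
    },
  },
};

// In a resource-level file, the inline object would be replaced by:
//   local lib = import '../../foo.libsonnet';
{
  conditions: [
    lib.regexp('event_type', 'network_connect', negate=true),
    lib.regexp('network_direction', 'OUTBOUND'),
  ],
}
```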

For Loops

If the same config block is repeated with only its values changing, then it should almost always be generated with a for loop (an array comprehension in Jsonnet).

local processors = [
  {
    local output = 'cloud.account.name',
    local cond = operatorPatterns.and([
      condition.inspector(options=condition.strings(expression='123', type='equals'), key='recipientAccountId'),
    ]),

    processors: [
      process.process(
        condition=cond,
        options=process.insert(value='foo'),
        set_key=output,
      ),
    ],
  },
  {
    local output = 'cloud.account.name',
    local cond = operatorPatterns.and([
      condition.inspector(options=condition.strings(expression='456', type='equals'), key='recipientAccountId'),
    ]),

    processors: [
      process.process(
        condition=cond,
        options=process.insert(value='bar'),
        set_key=output,
      ),
    ],
  },
  {
    local output = 'cloud.account.name',
    local cond = operatorPatterns.and([
      condition.inspector(options=condition.strings(expression='789', type='equals'), key='recipientAccountId'),
    ]),

    processors: [
      process.process(
        condition=cond,
        options=process.insert(value='baz'),
        set_key=output,
      ),
    ],
  },
];

{
  processors: std.flattenArrays([p.processors for p in processors]),
}

The same config generated with a for loop:

local cloud_accounts = {
  '123': 'foo',
  '456': 'bar',
  '789': 'baz',
};

local processors = [
  {
    local output = 'cloud.account.name',
    local cond = operatorPatterns.and([
      condition.inspector(options=condition.strings(expression=id, type='equals'), key='recipientAccountId'),
    ]),

    processors: [
      process.process(
        condition=cond,
        options=process.insert(value=cloud_accounts[id]),
        set_key=output,
      ),
    ],
  }

  for id in std.objectFields(cloud_accounts)
];

{
  processors: std.flattenArrays([p.processors for p in processors]),
}
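When each iteration produces exactly one item, the comprehension can build the final array directly and the std.flattenArrays step is not needed. The sketch below uses a hypothetical plain-object shape in place of the process library, only so that it runs without imports.

```jsonnet
local cloud_accounts = {
  '123': 'foo',
  '456': 'bar',
  '789': 'baz',
};

{
  // std.objectFields returns field names in sorted order, so the
  // generated array has a stable order across runs.
  processors: [
    {
      // Hypothetical shape, not the actual output of the process library.
      type: 'insert',
      settings: {
        condition: { key: 'recipientAccountId', equals: id },
        set_key: 'cloud.account.name',
        value: cloud_accounts[id],
      },
    }
    for id in std.objectFields(cloud_accounts)
  ],
}
```

If the output order must differ from the sorted key order, iterating over an array of entries instead of an object is one option.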