Enrich
Enrich transforms enrich data using an external system or process.
enrich.aws.dynamodb
Transforms data by querying an AWS DynamoDB table and returning all matched items as an array of objects.
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object for transformation. | No |
object.target_key | string | Places a value into an object after transformation. | No |
aws.region | string | AWS region that the DynamoDB table is in. Defaults to the AWS_REGION and AWS_DEFAULT_REGION environment variables. | No |
aws.role_arn | string | AWS role that is used to authenticate. Defaults to an empty string (no role assumption is used). | No |
retry.count | integer | Maximum number of times to retry calls to the DynamoDB table. Defaults to the AWS_MAX_ATTEMPTS environment variable. | No |
table_name | string | The DynamoDB table name that items are read from. | Yes |
partition_key | string | Retrieves a value from the object that is used as the Partition Key in the DynamoDB table. | Yes |
sort_key | string | Retrieves a value from the object that is used as the Sort Key in the DynamoDB table. Defaults to an empty string (no Sort Key). | No |
key_condition_expression | string | The DynamoDB key condition expression string. | Yes |
limit | integer | Determines the maximum number of items to evaluate in the DynamoDB query. Defaults to zero (no limit). | No |
scan_index_forward | boolean | Specifies the order of index traversal. If set to true, then traversal is performed in ascending order; if set to false, then traversal is performed in descending order. | No |
Example
sub.transform.enrich.aws.dynamodb(
  settings={key_condition_expression: 'PK = :pk'}
)
sub.tf.enrich.aws.dynamodb({key_condition_expression: 'PK = :pk'})
enrich.aws.lambda
Transforms data by synchronously invoking an AWS Lambda function and returning the payload.
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object for transformation. | No |
object.target_key | string | Places a value into an object after transformation. | No |
aws.region | string | AWS region that the Lambda function is in. Defaults to the AWS_REGION and AWS_DEFAULT_REGION environment variables. | No |
aws.role_arn | string | AWS role that is used to authenticate. Defaults to an empty string (no role assumption is used). | No |
retry.count | integer | Maximum number of times to retry invocations to the Lambda function. Defaults to the AWS_MAX_ATTEMPTS environment variable. | No |
function_name | string | The Lambda function that is synchronously invoked. | Yes |
Example
sub.transform.enrich.aws.lambda(
  settings={function_name: 'myFunction'}
)
sub.tf.enrich.aws.lambda({function_name: 'myFunction'})
enrich.dns.domain_lookup
Transforms data by querying a domain in the Domain Name System (DNS).
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object for transformation. | No |
object.target_key | string | Places a value into an object after transformation. | No |
request.timeout | string | Maximum time to wait for a query to complete. Defaults to 1s. | No |
Example
sub.transform.enrich.dns.domain_lookup(
  settings={object: {source_key: 'domain'}}
)
sub.tf.enrich.dns.domain_lookup({obj: {src: 'domain'}})
enrich.dns.ip_lookup
Transforms data by querying an IP address in the Domain Name System (DNS).
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object for transformation. | No |
object.target_key | string | Places a value into an object after transformation. | No |
request.timeout | string | Maximum time to wait for a query to complete. Defaults to 1s. | No |
Example
sub.transform.enrich.dns.ip_lookup(
  settings={object: {source_key: 'ip'}}
)
sub.tf.enrich.dns.ip_lookup({obj: {src: 'ip'}})
enrich.dns.txt_lookup
Transforms data by querying a TXT record in the Domain Name System (DNS).
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object for transformation. | No |
object.target_key | string | Places a value into an object after transformation. | No |
request.timeout | string | Maximum time to wait for a query to complete. Defaults to 1s. | No |
Example
sub.transform.enrich.dns.txt_lookup(
  settings={object: {source_key: 'ip'}}
)
sub.tf.enrich.dns.txt_lookup({obj: {src: 'ip'}})
enrich.http.get
Transforms data by performing a GET request to an HTTP(S) URL.
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object for transformation. | No |
object.target_key | string | Places a value into an object after transformation. | No |
url | string | The HTTP(S) URL used in the GET request. URLs support loading secrets. | Yes |
headers | []object | An array of objects that contain HTTP headers sent in the request. Header values support loading secrets. Defaults to no headers. | No |
Example
sub.transform.enrich.http.get(
  settings={url: 'https://my.url/'}
)
sub.tf.enrich.http.get({url: 'https://my.url/'})
enrich.http.post
Transforms data by performing a POST request to an HTTP(S) URL.
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object for transformation. | No |
object.target_key | string | Places a value into an object after transformation. | No |
object.body_key | string | Retrieves a value from an object that is used as the message body. | Yes |
url | string | The HTTP(S) URL used in the POST request. URLs support loading secrets. | Yes |
headers | []object | An array of objects that contain HTTP headers sent in the request. Header values support loading secrets. Defaults to no headers. | No |
Example
sub.transform.enrich.http.post(
  settings={object: {body_key: 'payload'}, url: 'https://my.url/'}
)
sub.tf.enrich.http.post({object: {body_key: 'payload'}, url: 'https://my.url/'})
enrich.kv_store.get
Transforms data by retrieving data from a key-value store.
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object that is used as the key in the KV store. | Yes |
object.target_key | string | Places the KV store result into an object. | Yes |
prefix | string | String that is prepended to the value retrieved by object.source_key. Defaults to an empty string (no prefix is used). | No |
kv_store | object | The KV store configuration settings. Refer to each KV store backend described in Key-Value Stores for more information. | Yes |
close_kv_store | boolean | Determines if the KV store should be closed when a control message is received. Defaults to false (KV store is not closed). | No |
Example
sub.transform.enrich.kv_store.get(
  settings={kv_store: sub.kv_store.memory(settings={}), object: {source_key: 'ip', target_key: 'domain'}}
)
sub.tf.enrich.kv_store.get({kv_store: sub.kv_store.memory(settings={}), obj: {src: 'ip', trg: 'domain'}})
enrich.kv_store.set
Transforms data by putting data into a key-value store.
Settings
Field | Type | Description | Required |
---|---|---|---|
object.source_key | string | Retrieves a value from an object that is used as the key in the KV store. | Yes |
object.target_key | string | Retrieves a value from an object that is used as the value in the KV store. | Yes |
object.ttl_key | string | Retrieves a value from an object that is used as the time-to-live (TTL) of the item set into the KV store. This value must be an integer that represents the Unix time when the item will be evicted from the store. Any precision greater than seconds (e.g., milliseconds, nanoseconds) is truncated to seconds. Defaults to an empty string (no TTL is used when setting items into the store). | No |
prefix | string | String that is prepended to the value retrieved by object.source_key. Defaults to an empty string (no prefix is used). | No |
ttl_offset | string | An offset used to determine the time-to-live (TTL) of the item set into the KV store. If object.ttl_key is configured, then this value is added to the TTL value retrieved from the object. If object.ttl_key is not configured, then this value is added to the current time. For example, if object.ttl_key is not configured and the offset is "1d" (1 day), then the item is evicted from the store once more than 1 day has passed. Defaults to an empty string (no TTL is used when setting items into the store). | No |
kv_store | object | The KV store configuration settings. Refer to each KV store backend described in Key-Value Stores for more information. | Yes |
close_kv_store | boolean | Determines if the KV store should be closed when a control message is received. Defaults to false (KV store is not closed). | No |
Example
sub.transform.enrich.kv_store.set(
  // The value of `domain` is put into the KV store as the value of `ip`.
  settings={kv_store: sub.kv_store.memory(settings={}), object: {source_key: 'ip', target_key: 'domain'}}
)
sub.tf.enrich.kv_store.set({kv_store: sub.kv_store.memory(settings={}), obj: {src: 'ip', trg: 'domain'}})
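The TTL rules described for ttl_key and ttl_offset can be illustrated with a small Python sketch. This is not the transform's implementation: the function name `expiry_unix_seconds` is hypothetical, the offset is passed as a `timedelta` rather than parsed from a string such as "1d", and the current time is fixed so the result is deterministic. The case where neither setting is configured (no TTL) is not covered.

```python
from datetime import timedelta
from typing import Optional


def expiry_unix_seconds(now: int, offset: timedelta, ttl_value: Optional[int] = None) -> int:
    """Return the Unix time (in seconds) at which an item would be evicted.

    If a TTL value was read from the object (ttl_key), the offset is added
    to that value; otherwise the offset is added to the current time.
    """
    base = ttl_value if ttl_value is not None else now
    return base + int(offset.total_seconds())


now = 1_700_000_000  # assumed "current time" in Unix seconds

# ttl_key not configured: the offset is added to the current time.
print(expiry_unix_seconds(now, timedelta(days=1)))

# ttl_key configured: the offset is added to the TTL read from the object.
print(expiry_unix_seconds(now, timedelta(days=1), ttl_value=1_700_003_600))
```

With a 1 day offset, the first call yields a time 86400 seconds after `now`, and the second yields a time 86400 seconds after the TTL taken from the object.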
Use Cases
Data Interpolation
The enrich_http_get and enrich_http_post transforms can optionally interpolate data into the URL by placing the string ${data} anywhere in the URL. For example:
Configured URL | Data | Interpolated URL |
---|---|---|
hxxps://foo.com/path/to/${data} | {"ip_addr":"8.8.8.8"} | hxxps://foo.com/path/to/8.8.8.8 |
hxxps://foo.com/path/to/${data} | 8.8.8.8 | hxxps://foo.com/path/to/8.8.8.8 |
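The substitution shown in the table can be sketched in Python. This only illustrates the documented behavior and is not the transform's implementation: the function name `interpolate_data` is hypothetical, and the sketch assumes the value has already been read from the object (for example, from `ip_addr`). The placeholder is a parameter because its exact spelling is taken from the table.

```python
def interpolate_data(url: str, value: str, placeholder: str = "${data}") -> str:
    """Replace every occurrence of the placeholder in the URL with the value."""
    return url.replace(placeholder, value)


# The value is assumed to have been retrieved from the object already.
configured_url = "hxxps://foo.com/path/to/${data}"
print(interpolate_data(configured_url, "8.8.8.8"))
```

A URL without the placeholder is returned unchanged, which matches the interpolation being optional.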
Secrets Interpolation
The enrich_http_get and enrich_http_post transforms can also optionally interpolate secrets into the URL and header values. Multiple secrets can be interpolated in a single string. For example:
Configured URL | Data | Environment Variables | Interpolated URL |
---|---|---|---|
hxxps://foo.com/path/to/${data}?token=${SECRETS_ENV:TOKEN} | {"ip_addr":"8.8.8.8"} | TOKEN=mysecret | hxxps://foo.com/path/to/8.8.8.8?token=mysecret |
hxxps://foo.com/path/to/${data}?token=${SECRETS_ENV:TOKEN}&user=${SECRETS_ENV:USERNAME} | {"ip_addr":"8.8.8.8"} | TOKEN=mysecret USERNAME=myusername | hxxps://foo.com/path/to/8.8.8.8?token=mysecret&user=myusername |
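As a rough Python illustration of the rows above, the sketch below replaces each `${SECRETS_ENV:NAME}` placeholder with a value looked up by name. It is not how the transforms are implemented: the function name `interpolate_secrets` is hypothetical, a plain dictionary stands in for the process environment, and raising an error on a missing secret is an assumption.

```python
import re
from typing import Mapping

# Matches placeholders of the form ${SECRETS_ENV:NAME}, as used in the table.
_SECRET_PATTERN = re.compile(r"\$\{SECRETS_ENV:([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate_secrets(text: str, env: Mapping[str, str]) -> str:
    """Replace each ${SECRETS_ENV:NAME} with env[NAME].

    Raises KeyError if a referenced name is not present (an assumption).
    """
    return _SECRET_PATTERN.sub(lambda match: env[match.group(1)], text)


env = {"TOKEN": "mysecret", "USERNAME": "myusername"}  # stand-in for os.environ
url = "hxxps://foo.com/path/to/8.8.8.8?token=${SECRETS_ENV:TOKEN}&user=${SECRETS_ENV:USERNAME}"
print(interpolate_secrets(url, env))
```

Because the pattern is applied across the whole string, any number of secrets in one URL or header value is handled in a single pass.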
Enriching Data with HTTP APIs
The enrich_http_get transform is the recommended method for interacting with external HTTP/S APIs:
sub.tf.util.secret(settings={
  secret: sub.secrets.environment_variable({ id: 'ENV_VAR', name: 'API_KEY' })
}),
sub.tf.enrich.http.get(settings={
  // The value of `ip_addr` is interpolated into the URL and the API
  // results are set into `api_result`.
  object: {source_key: 'ip_addr', target_key: 'api_result'},
  url: 'hxxps://api.foo.com/${DATA}',
  // The secret is interpolated into the TOKEN header for auth.
  headers: {
    TOKEN: '${SECRET:ENV_VAR}',
  },
}),
Downloading Text Files via HTTP
The enrich_http_get transform can be used to download text files from any HTTP/S endpoint. For example, it can download Moby Dick by Herman Melville:
URL | Message |
---|---|
https://www.gutenberg.org/files/2701/old/moby10b.txt | {"moby_dick":"**The Project Gutenberg Etext of Moby Dick, by Herman Melville**"} |
This can be combined with the object_copy transform to overwrite the original data, leaving only the downloaded file.
Downloading Non-Text Files via HTTP
The enrich_http_get transform can also be used to retrieve non-text (e.g., binary) files. For example, it can be used to download a PDF version of Moby Dick:
URL | Message |
---|---|
http://www.gasl.org/refbib/Melville__Moby_Dick.pdf | {"moby_dick":"JVBERi0xLjU="} |
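The value in the example row is base64 text: `JVBERi0xLjU=` decodes to `%PDF-1.5`, the opening bytes of a PDF file. Assuming non-text responses are base64-encoded before being placed into the object (this section does not state that explicitly), a downstream consumer could recover the original bytes along these lines:

```python
import base64
import json

# Example message from the table above.
message = '{"moby_dick":"JVBERi0xLjU="}'

encoded = json.loads(message)["moby_dick"]
raw = base64.b64decode(encoded)  # original bytes of the (truncated) file

print(raw)
```

The decoded bytes can then be written to disk or passed to a PDF parser as needed.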
KV Store
The KV store transforms enable several use cases. See Key-Value Stores for detailed examples.