FilterX function reference

FilterX is an experimental feature currently under development. Feedback is most welcome on Discord and GitHub.

Available in AxoSyslog 4.8.1 and later.

This page describes the functions you can use in FilterX blocks.

Functions have arguments that can be either mandatory or optional.

  • Mandatory arguments are always positional, so you must pass them in the correct order. You cannot set them in the arg=value format.
  • Optional arguments are always named, like arg=value. You can pass optional arguments in any order.
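
The following sketch shows both kinds of arguments in a single call, reusing the regexp_subst function described later on this page. The pattern and replacement values are only illustrative.

```
filterx {
  # Mandatory arguments come first, in the documented order:
  # input string, pattern to find, replacement.
  # Optional arguments follow as name=value pairs, in any order.
  ${MESSAGE} = regexp_subst(${MESSAGE}, "IP", "IP-Address", ignorecase=true, global=true);
};
```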

cache_json_file

Load the contents of an external JSON file in an efficient manner. You can use this function to look up contextual information. (Basically, this is a FilterX-specific implementation of the add-contextual-data() functionality.)

Usage: cache_json_file("/path/to/file.json")

For example, if your context-info-db.json file contains the following:

{
  "nginx": "web",
  "httpd": "web",
  "apache": "web",
  "mysql": "db",
  "postgresql": "db"
}

Then the following FilterX expression selects only “web” traffic:

filterx {
  declare known_apps = cache_json_file("/context-info-db.json");
  ${app} = known_apps[${PROGRAM}] ?? "unknown";
  ${app} == "web";  # drop everything that's not a web server log
};

datetime

Cast a value into a datetime variable.

Usage: datetime(<string or expression to cast as datetime>)

For example:

date = datetime("1701350398.123000+01:00");

Usually, you use the strptime FilterX function to create datetime values. Alternatively, you can cast an integer, double, string, or isodate variable into datetime with the datetime() FilterX function. Note that:

  • When casting from an integer, the integer is the number of microseconds elapsed since the UNIX epoch (00:00:00 UTC on 1 January 1970).
  • When casting from a double, the double is the number of seconds elapsed since the UNIX epoch (00:00:00 UTC on 1 January 1970). (The part before the decimal point is the seconds, the part after the decimal point is the microseconds.)
  • When casting from a string, the string (for example, 1701350398.123000+01:00) is interpreted as: <the number of seconds elapsed since the UNIX epoch>.<microseconds>+<timezone relative to UTC (GMT +00:00)>
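
As a sketch of the casting rules above, the following three calls are meant to describe the same instant (1701350398.123 seconds after the UNIX epoch). The variable names are hypothetical, and the +00:00 offset is chosen only to keep the example simple.

```
filterx {
  from_integer = datetime(1701350398123000);          # integer: microseconds since the epoch
  from_double = datetime(1701350398.123);             # double: seconds, with a fractional part
  from_string = datetime("1701350398.123000+00:00");  # string: <seconds>.<microseconds>+<timezone>
};
```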

endswith

Available in AxoSyslog 4.9 and later.

Returns true if the input string ends with the specified substring. By default, matches are case sensitive. Usage:

endswith(input-string, substring);
endswith(input-string, [substring_1, substring_2], ignorecase=true);

For details, see String search in FilterX.
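
For instance, a minimal sketch that keeps only the messages arriving from hosts in one of two domains could look like the following. The domain names are placeholders, and the sketch assumes that the list form matches if any of the listed substrings match (see String search in FilterX for the exact semantics).

```
filterx {
  # Intended to be true if ${HOST} ends with one of the listed substrings,
  # compared without regard to case.
  endswith(${HOST}, [".example.com", ".example.net"], ignorecase=true);
};
```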

flatten

Flattens the nested elements of an object using the specified separator, similarly to the format-flat-json() template function. For example, you can use it to flatten nested JSON objects in the output if the receiving application cannot handle nested JSON objects.

Usage: flatten(dict_or_list, separator=".")

You can use multi-character separators, for example, =>. If you omit the separator, the default dot (.) separator is used.

sample-dict = json({"a": {"b": {"c": "1"}}});
${MESSAGE} = flatten(sample-dict);

The value of ${MESSAGE} will be: {"a.b.c": "1"}
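
As a further sketch, the same object can be flattened with a multi-character separator. The variable name is illustrative. Based on the description above, the resulting key should be a=>b=>c.

```
filterx {
  sample_dict = json({"a": {"b": {"c": "1"}}});
  ${MESSAGE} = flatten(sample_dict, separator="=>");
};
```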

format_csv

Formats a dictionary or a list into a comma-separated string.

Usage: format_csv(<input-list-or-dict>, columns=<json-list>, delimiter=<delimiter-character>, default_value=<string>)

Only the input is mandatory, other arguments are optional. Note that the delimiter must be a single character.

By default, the delimiter is the comma (delimiter=","), the columns and default_value are empty.

If the columns option is set, AxoSyslog checks that the number of fields or entries in the input data matches the number of columns. If there are fewer items, it adds the default_value to the missing entries.
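
To illustrate how columns and default_value interact, here is a sketch with a hypothetical input dictionary that lacks one of the requested columns. Following the description above, the missing user field should be filled with the default value, so the result is expected to be along the lines of sshd;1234;-.

```
filterx {
  fields = json({"program": "sshd", "pid": "1234"});
  # "user" is not present in fields, so default_value is used for it
  ${MESSAGE} = format_csv(fields, columns=json_array(["program", "pid", "user"]), delimiter=";", default_value="-");
};
```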

format_kv

Formats a dictionary into a string containing key=value pairs.

Usage: format_kv(kvs_dict, value_separator="<separator-character>", pair_separator="<separator-string>")

By default, format_kv uses = to separate keys from values, and , (comma and space) to separate the pairs:

filterx {
    ${MESSAGE} = format_kv(<input-dictionary>);
};

The value_separator option must be a single character, the pair_separator can be a string. For example, to use the colon (:) as the value separator and the semicolon (;) as the pair separator, use:

format_kv(<input-dictionary>, value_separator=":", pair_separator=";")

format_json

Formats any value into a raw JSON string.

Usage: format_json($data)
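
A minimal sketch of building a small object from message fields and serializing it. The variable name and the key names are illustrative.

```
filterx {
  summary = json();                   # empty JSON object
  summary.program = ${PROGRAM};
  summary.pid = ${PID};
  ${MESSAGE} = format_json(summary);  # ${MESSAGE} now holds the JSON string
};
```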

get_sdata

See Handle SDATA in RFC5424 log records.

has_sdata

See Handle SDATA in RFC5424 log records.

includes

Available in AxoSyslog 4.9 and later.

Returns true if the input string contains the specified substring. By default, matches are case sensitive. Usage:

includes(input-string, substring);
includes(input-string, [substring_1, substring_2], ignorecase=true);

For details, see String search in FilterX.

isodate

Parses a string as a date in ISODATE format: %Y-%m-%dT%H:%M:%S%z

is_sdata_from_enterprise

See Handle SDATA in RFC5424 log records.

isset

Returns true if the argument exists and its value is not empty or null.

Usage: isset(<name of a variable, macro, or name-value pair>)

istype

Returns true if the object (first argument) has the specified type (second argument). The type must be a quoted string. (See List of type names.)

Usage: istype(object, "type_str")

For example:

obj = json();
istype(obj, "json_object"); # True

istype(${PID}, "string");
istype(my-local-json-object.mylist, "json_array");

If the object doesn’t exist, istype() returns with an error, causing the FilterX statement to become false, and logs an error message to the internal() source of AxoSyslog.
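
Because istype() fails on a nonexistent object, one way to avoid the error is to check for the object first. The sketch below assumes that the FilterX if statement is available in your AxoSyslog version, and uses ${PID} only as an example.

```
filterx {
  if (isset(${PID})) {
    # Only evaluated when ${PID} exists and is not empty
    istype(${PID}, "string");
  };
};
```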

json

Cast a value into a JSON object.

Usage: json(<string or expression to cast to json>)

For example:

js_dict = json({"key": "value"});

Starting with version 4.9, you can use {} without the json() keyword as well. For example, the following creates an empty JSON object:

js_dict = {};

json_array

Cast a value into a JSON array.

Usage: json_array(<string or expression to cast to json array>)

For example:

js_list = json_array(["first_element", "second_element", "third_element"]);

Starting with version 4.9, you can use [] without the json_array() keyword as well. For example, the following creates an empty JSON list:

js_list = [];

len

Returns the number of items in an object as an integer: the length (number of characters) of a string, the number of elements in a list, or the number of keys in an object.

Usage: len(object)

lower

Converts all characters of a string to lowercase characters.

Usage: lower(string)
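
For example, the following sketch normalizes the program name before comparing it, so that Nginx, NGINX, and nginx are treated the same way.

```
filterx {
  lower(${PROGRAM}) == "nginx";  # true for any capitalization of nginx
};
```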

otel_array

Creates a list represented as an OpenTelemetry array.

otel_kvlist

Creates a dictionary represented as an OpenTelemetry key-value list.

otel_logrecord

Creates an OpenTelemetry log record object.

otel_resource

Creates an OpenTelemetry resource object.

otel_scope

Creates an OpenTelemetry scope object.

parse_csv

Split a comma-separated or similar string.

Usage: parse_csv(msg_str [, columns=json_array, delimiter=string, string_delimiters=json_array, dialect=string, strip_whitespace=boolean, greedy=boolean])

For details, see Comma-separated values.

parse_kv

Split a string consisting of whitespace or comma-separated key=value pairs (for example, WELF-formatted messages).

Usage: parse_kv(msg, value_separator="=", pair_separator=", ", stray_words_key="stray_words")

The value_separator must be a single character. The pair_separator can consist of multiple characters.

For details, see key=value pairs.
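
As a sketch, the following parses a hypothetical message such as src=10.0.0.1, dst=10.0.0.2 using the arguments shown in the usage line, then serializes the pairs again with the format_kv function using different separators. The kvs variable name is illustrative.

```
filterx {
  kvs = parse_kv(${MESSAGE}, value_separator="=", pair_separator=", ");
  ${MESSAGE} = format_kv(kvs, value_separator=":", pair_separator=";");
};
```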

parse_leef

Parse a LEEF-formatted string.

Usage: parse_leef(msg)

For details, see LEEF.

parse_xml

Parse an XML object into a JSON object.

Usage: parse_xml(msg)

For details, see /axosyslog-core-docs/filterx/filterx-parsing/xml/

parse_windows_eventlog_xml

Parses a Windows Event Log XML object into a JSON object.

Usage: parse_windows_eventlog_xml(msg)

For details, see /axosyslog-core-docs/filterx/filterx-parsing/xml/

regexp_search

Searches a string and returns the matches of a regular expression as a list or a dictionary. If there are no matches, the list or dictionary is empty.

Usage: regexp_search("<string-to-search>", <regular-expression>)

For example:

# ${MESSAGE} = "ERROR: Sample error message string"
my-variable = regexp_search(${MESSAGE}, "ERROR");

You can also use unnamed match groups (()) and named match groups ((?<first>ERROR)(?<second>message)).

Note the following points:

  • Regular expressions are case sensitive by default. For case insensitive matches, add (?i) to the beginning of your pattern.
  • You can use regexp constants (slash-enclosed regexps) within FilterX blocks to simplify escaping special characters, for example, /^beginning and end$/.
  • FilterX regular expressions are interpreted in “leave the backslash alone mode”, meaning that a backslash in a string before something that doesn’t need to be escaped is interpreted as a literal backslash character. For example, string\more-string is equivalent to string\\more-string.

Unnamed match groups

${MY-LIST} = json(); # Creates an empty JSON object
${MY-LIST}.unnamed = regexp_search("first-word second-part third", /(first-word)(second-part)(third)/);

${MY-LIST}.unnamed is a list containing: ["first-word second-part third", "first-word", "second-part", "third"]

Named match groups

${MY-LIST} = json(); # Creates an empty JSON object
${MY-LIST}.named = regexp_search("first-word second-part third", /(?<one>first-word)(?<two>second-part)(?<three>third)/);

${MY-LIST}.named is a dictionary with the names of the match groups as keys, and the corresponding matches as values: {"0": "first-word second-part third", "one": "first-word", "two": "second-part", "three": "third"}

Mixed match groups

If you use mixed (some named, some unnamed) groups in your regular expression, the output is a dictionary, where AxoSyslog automatically assigns a key to the unnamed groups. For example:

${MY-LIST} = json(); # Creates an empty JSON object
${MY-LIST}.mixed = regexp_search("first-word second-part third", /(?<one>first-word)(second-part)(?<three>third)/);

${MY-LIST}.mixed is: {"0": "first-word second-part third", "one": "first-word", "2": "second-part", "three": "third"}
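
The returned list or dictionary can be indexed like other FilterX objects. The sketch below uses hypothetical names (matches, ${level}) and falls back to a default value when there is no match, using the ?? operator shown in the cache_json_file example.

```
filterx {
  # ${MESSAGE} = "ERROR: Sample error message string"
  matches = regexp_search(${MESSAGE}, /(?<level>ERROR|WARNING)/);
  ${level} = matches["level"] ?? "none";  # "none" if the pattern did not match
};
```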

regexp_subst

Rewrites a string using regular expressions. This function implements the subst rewrite rule functionality.

Usage: regexp_subst(<input-string>, <pattern-to-find>, <replacement>, flags)

The following example replaces the first IP in the text of the message with the IP-Address string.

regexp_subst(${MESSAGE}, "IP", "IP-Address");

To replace every occurrence, use the global=true flag:

regexp_subst(${MESSAGE}, "IP", "IP-Address", global=true);

Note the following points:

  • Regular expressions are case sensitive by default. For case insensitive matches, add (?i) to the beginning of your pattern.
  • You can use regexp constants (slash-enclosed regexps) within FilterX blocks to simplify escaping special characters, for example, /^beginning and end$/.
  • FilterX regular expressions are interpreted in “leave the backslash alone mode”, meaning that a backslash in a string before something that doesn’t need to be escaped is interpreted as a literal backslash character. For example, string\more-string is equivalent to string\\more-string.

Options

You can use the following flags with the regexp_subst function:

  • global=true:

    Replace every match of the regular expression, not only the first one.

  • ignorecase=true:

    Do case insensitive match.

  • jit=true:

    Enable just-in-time compilation for PCRE regular expressions.

  • newline=true:

    When configured, it changes the newline definition used in PCRE regular expressions to accept any of the following:

    • a single carriage-return (\r)
    • a single linefeed (\n)
    • the sequence carriage-return and linefeed (\r\n)

    This newline definition is used when the circumflex and dollar patterns (^ and $) are matched against an input. By default, PCRE interprets the linefeed character as indicating the end of a line. It does not affect the \r, \n or \R characters used in patterns.

  • utf8=true:

    Use Unicode support for UTF-8 matches: UTF-8 character sequences are handled as single characters.

startswith

Available in AxoSyslog 4.9 and later.

Returns true if the input string begins with the specified substring. By default, matches are case sensitive. Usage:

startswith(input-string, substring);
startswith(input-string, [substring_1, substring_2], ignorecase=true);

For details, see String search in FilterX.

string

Cast a value into a string. Note that currently AxoSyslog evaluates strings and executes template functions and template expressions within the strings. In the future, template evaluation will be moved to a separate FilterX function.

Usage: string(<string or expression to cast>)

For example:

myvariable = string(${LEVEL_NUM});

Sometimes you have to explicitly cast values to strings, for example, when you want to concatenate them into a message using the + operator.
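
A minimal sketch of such a concatenation, with an illustrative message prefix:

```
filterx {
  # ${LEVEL_NUM} is cast to a string before it is appended to the text
  ${MESSAGE} = "Severity level: " + string(${LEVEL_NUM});
};
```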

strptime

Creates a datetime object from a string, similarly to the date-parser(). The first argument is the string containing the date. The second argument is a format string that specifies how to parse the date string. Optionally, you can specify additional format strings that are applied in order if the previous one doesn’t match the date string.

Usage: strptime(time_str, format_str_1, ..., format_str_N)

For example:

${MESSAGE} = strptime("2024-04-10T08:09:10Z", "%Y-%m-%dT%H:%M:%S%z");

You can use the following format codes in the format string:

%%      PERCENT
%a      day of the week, abbreviated
%A      day of the week
%b      month abbr
%B      month
%c      MM/DD/YY HH:MM:SS
%C      ctime format: Sat Nov 19 21:05:57 1994
%d      numeric day of the month, with leading zeros (eg 01..31)
%e      like %d, but a leading zero is replaced by a space (eg  1..31)
%f      microseconds, leading 0's, extra digits are silently discarded
%D      MM/DD/YY
%G      GPS week number (weeks since January 6, 1980)
%h      month, abbreviated
%H      hour, 24 hour clock, leading 0's
%I      hour, 12 hour clock, leading 0's
%j      day of the year
%k      hour
%l      hour, 12 hour clock
%L      month number, starting with 1
%m      month number, starting with 01
%M      minute, leading 0's
%n      NEWLINE
%o      ornate day of month -- "1st", "2nd", "25th", etc.
%p      AM or PM
%P      am or pm (Yes %p and %P are backwards :)
%q      Quarter number, starting with 1
%r      time format: 09:05:57 PM
%R      time format: 21:05
%s      seconds since the Epoch, UTC
%S      seconds, leading 0's
%t      TAB
%T      time format: 21:05:57
%U      week number, Sunday as first day of week
%w      day of the week, numerically, Sunday == 0
%W      week number, Monday as first day of week
%x      date format: 11/19/94
%X      time format: 21:05:57
%y      year (2 digits)
%Y      year (4 digits)
%Z      timezone in ascii format (for example, PST), or in format -/+0000
%z      timezone in ascii format (for example, PST), or in format -/+0000  (Required element)

For example, for the date 01/Jan/2016:13:05:05 PST use the following format string: "%d/%b/%Y:%H:%M:%S %Z"
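
Because strptime accepts several format strings, a sketch that tries the ISO format from the earlier example first and then falls back to the format above could look like this. ${raw_date} is a hypothetical name-value pair holding the date string, and parsed_date is an illustrative variable name.

```
filterx {
  # The second format is only tried if the first one does not match
  parsed_date = strptime(${raw_date}, "%Y-%m-%dT%H:%M:%S%z", "%d/%b/%Y:%H:%M:%S %Z");
};
```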

The isodate FilterX function is a specialized variant of strptime that accepts only a fixed format.

unset

Deletes a variable, a name-value pair, or a key in a complex object (like JSON), for example: unset(${<name-value-pair-to-unset>});

You can also list multiple values to delete: unset(${<first-name-value-pair-to-unset>}, ${<second-name-value-pair-to-unset>});

See also Delete values.

unset_empties

Deletes (unsets) the empty fields of an object, for example, a JSON object or list. By default, the object is processed recursively, so the empty values are deleted from inner dicts and lists as well. If you set the replacement option, you can also use this function to replace fields of the object with custom values.

Usage: unset_empties(object, options)

The unset_empties() function has the following options:

  • ignorecase: Set to false to perform case-sensitive matching. Default value: true. Available in AxoSyslog 4.9 and later.
  • recursive: Enables recursive processing of nested dictionaries. Default value: true
  • replacement: Replace the target elements with the value of replacement instead of removing them. Available in AxoSyslog 4.9 and later.
  • targets: A list of elements to remove or replace. Default value: ["", null, [], {}]. Available in AxoSyslog 4.9 and later.

For example, to remove the fields with - and N/A values, use:

unset_empties(input_object, targets=["-", "N/A"], ignorecase=false);
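
Similarly, a sketch that replaces the same values instead of deleting them, and leaves nested objects untouched, might look like this. The input_object variable and the unknown replacement value are illustrative, and the targets and replacement options require AxoSyslog 4.9 or later.

```
filterx {
  # Replace "-" and "N/A" at the top level only, nested dicts and lists are not processed
  unset_empties(input_object, targets=["-", "N/A"], replacement="unknown", recursive=false);
};
```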

update_metric

Updates a labeled metric counter, similarly to the metrics-probe() parser. For details, see Metrics.

upper

Converts all characters of a string to uppercase characters.

Usage: upper(string)

vars

Returns the variables (including pipeline variables and name-value pairs) defined in the FilterX block as a JSON object.

For example:

filterx {
  ${logmsg_variable} = "foo";
  local_variable = "bar";
  declare pipeline_level_variable = "baz";
  ${MESSAGE} = vars();
};

The value of ${MESSAGE} will be: {"logmsg_variable":"foo","pipeline_level_variable":"baz"}