FilterX

FilterX is an experimental feature currently under development. Feedback is most welcome on Discord and GitHub.

Available in AxoSyslog 4.8.1 and later.

FilterX helps you to route, parse, and modify your logs: a message passes through the FilterX block in a log path only if all the FilterX statements evaluate to true for the particular message. If a log statement includes multiple FilterX blocks, the messages are sent to the destinations only if they pass all FilterX blocks of the log path. For example, you can select only the messages originating from a particular host, or create complex filters using operators, functions, and logical expressions.

FilterX blocks consist of a list of FilterX statements, and each statement evaluates to either truthy or falsy. If a message matches all FilterX statements, it passes through the FilterX block to the next element of the log path, for example, the destination.

  • Truthy values are:
    • complex values (for example, a datetime object),
    • non-empty lists and objects,
    • non-empty strings,
    • non-zero numbers,
    • the true boolean object.
  • Falsy values are:
    • empty strings,
    • the false value,
    • the 0 value,
    • null.

Statements that result in an error (for example, if a comparison cannot be evaluated because of type error, or a field or a dictionary referenced in the statement doesn’t exist or is unset) are also treated as falsy.

Define a filterx block

You can define filterx blocks inline in your log statements. (If you want to reuse filterx blocks, see Reuse FilterX blocks.)

For example, the following FilterX block selects the messages that contain the word deny and come from the host example.

log {
    source(s1);
    filterx {
        ${HOST} == "example";
        ${MESSAGE} =~ "deny";
    };
    destination(d1);
};

You can use filterx blocks together with other blocks in a log path, for example, use a parser before/after the filterx block if needed.

FilterX statements

A FilterX block contains one or more FilterX statements. The order of the statements is important, as they are processed sequentially. If any of the statements is falsy (or results in an error), AxoSyslog drops the message from that log path.

FilterX statements can be one of the following:

  • A comparison, for example, ${HOST} == "my-host";. This statement is true only for messages where the value of the ${HOST} field is my-host. Such simple comparison statements can be the equivalents of traditional filter functions.

  • A value assignment for a name-value pair or a local variable, for example, ${my-field} = "bar";. The left-side variable automatically gets the type of the right-hand expression. Assigning the false value to a variable (${my-field} = false;) is a valid statement that doesn’t automatically cause the FilterX block to return as false.

  • Existence of a variable or field. For example, the ${HOST}; expression is true only if the ${HOST} macro exists and isn’t empty.

  • A conditional statement (if (expr) { ... } elif (expr) { ... } else { ... };), which allows you to evaluate complex decision trees.

  • A declaration of a pipeline variable, for example, declare my_pipeline_variable = "something";.

  • A FilterX action. This can be one of the following:

    • drop;: Intentionally drop the message. This means that the message was successfully processed, but discarded. Processing of the dropped message stops at the drop statement: subsequent sections or other branches of the FilterX block won’t process the message. For example, you can use this to discard unneeded messages, like debug logs. Available in AxoSyslog 4.9 and later.
    • done;: Stop executing the rest of the FilterX block and return truthy. This is an early return that you can use to avoid unnecessary processing, for example, when the message matches an early classification in the block. Available in AxoSyslog 4.9 and later.
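
As a minimal sketch of how a conditional statement and the drop and done actions can work together (the ${app_type} name-value pair and the matched strings are hypothetical, chosen only for illustration):

filterx {
  if (${PROGRAM} == "nginx") {
    ${app_type} = "web"; # Hypothetical name-value pair
    done; # Return truthy and skip the rest of the block
  } elif (${MESSAGE} =~ "debug") {
    drop; # Discard the message
  } else {
    ${app_type} = "other";
  };
  # Runs only for messages that reached the else branch
  ${MESSAGE} = ${app_type} + ": " + ${MESSAGE};
};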

When you assign the value of a variable using another variable (for example, ${MESSAGE} = ${HOST};), AxoSyslog copies the current value of the ${HOST} variable. If a statement later changes the value of the ${HOST} field, the ${MESSAGE} field won’t change. For example:

filterx {
  ${HOST} = "first-hostname";
  ${MESSAGE} = ${HOST}; # The value of ${MESSAGE} is first-hostname
  ${HOST} = "second-hostname"; # The value of ${MESSAGE} is still first-hostname
};

The same is true for complex objects, like JSON, for example:

js = json({
    "key": "value",
    "second-key": "another-value"
});

${MESSAGE} = js;

js.third_key = "third-value-not-available-in-MESSAGE";

You can use FilterX operators and functions.

Data model and scope

Each FilterX block can access data from the following elements.

  • Macros and name-value pairs of the message being processed (for example, $PROGRAM). The names of macros and name-value pairs begin with the $ character. If you define a new variable in a FilterX block and its name begins with the $ character, it’s automatically added to the name-value pairs of the message.

  • Local variables. These have a name that doesn’t start with a $ character, for example, my_local_variable. Local variables are available only in the FilterX block where they’re defined.

  • Pipeline variables. These are similar to local variables, but must be declared before first use, for example, declare my_pipeline_variable=5;

    Pipeline variables are available in the current and all subsequent FilterX blocks. They’re global in the sense that you can access them from multiple FilterX blocks, but note that they’re still attached to the particular message that is processed, so the values of pipeline variables aren’t preserved between messages.

    If you don’t need to pass the variable to another FilterX block, use local variables, as pipeline variables have a slight performance overhead.
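
The following sketch illustrates the three scopes in a log path with two FilterX blocks. The s1 source and d1 destination are the placeholders used elsewhere on this page, and the variable and field names are hypothetical:

log {
    source(s1);
    filterx {
        declare my_pipeline_variable = "set-in-first-block"; # Pipeline variable
        my_local_variable = "visible-only-here"; # Local variable
        ${my_field} = my_local_variable; # Name-value pair, stored in the message
    };
    filterx {
        # my_local_variable isn't available in this block,
        # but the pipeline variable and the name-value pair are
        ${MESSAGE} = my_pipeline_variable + " " + ${my_field};
    };
    destination(d1);
};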

Variable names

FilterX variable names have more restrictions than generic name-value pair names. They:

  • can contain alphanumeric characters and the underscore character (_), but cannot contain hyphens,
  • cannot begin with numbers,
  • can begin with underscore.
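
As an illustration of these rules, the names in the following sketch are hypothetical, and the invalid ones are commented out:

filterx {
  my_variable = 1; # Valid: letters and underscores
  _my_variable = 2; # Valid: begins with an underscore
  variable2 = 3; # Valid: contains a number, but doesn't begin with one
  # my-variable = 4; # Invalid: contains a hyphen
  # 2variable = 5; # Invalid: begins with a number
};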

Variable types

Variables can have the following types. All of these types have a matching function that can be used to type cast something into the specific type.

Assign values

To assign a value to a name-value pair or a variable, use the following syntax:

<variable-name> = <value-of-the-variable>;

In most cases you can omit the type, and AxoSyslog automatically assigns the type based on the syntax of the value, for example:

  • mystring = "string-value";
  • myint = 3;
  • mydouble = 2.5;
  • myboolean = true;

When needed, you can explicitly specify the type of the variable, and AxoSyslog attempts to convert the value to the specified type:

<variable-name> = <variable-type>(<value-of-the-variable>);

For example:

filterx {
  ${MESSAGE} = string("Example string message");
};

You can also assign the value of other name-value pairs, for example:

filterx {
  ${MESSAGE} = ${HOST};
};

When processing RFC5424-formatted (IETF-syslog) messages, you can modify the SDATA part of the message as well. The following example sets the sequenceId:

filterx {
  ${.SDATA.meta.sequenceId} = 55555;
};

Template functions

You can use the traditional template functions of AxoSyslog to access and format name-value pairs. For that you must enclose the template function expression between double-quotes, for example:

${MESSAGE} = "$(format-json --subkeys values.)";

However, note that template functions cannot access the local and pipeline variables created in FilterX blocks.

Delete values

To delete a value without deleting the object itself (for example, a name-value pair), use the null value, for example:

${MY-NV-PAIR-KEY} = null;

To delete the name-value pair (or a key from an object), use the unset function:

unset(${MY-NV-PAIR-KEY});
unset(${MY-JSON}["key-to-delete"]);

To unset every empty field of an object, use the unset_empties function.
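
A minimal sketch of such a call, assuming that the function accepts the object as its first argument and that its default settings decide which values count as empty (check the FilterX function reference for the exact options):

unset_empties(${MY-JSON});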

Add two values

The plus operator (+) adds two arguments, if possible. (For example, you can’t add two datetime values.)

  • You can use it to add two numbers (two integers, two double values). If you add a double to an integer, the result is a double.

  • Adding two strings concatenates the strings. Note that if you want to have spaces between the added elements, you have to add them manually, like in Python, for example:

    ${MESSAGE} = ${HOST} + " first part of the message," + " second part of the message" + "\n";
    
  • Adding two lists merges the lists. Available in AxoSyslog 4.9 and later.

  • Adding two dicts updates the dict with the values of the second operand. For example:

    x = {"key1": "value1", "key2": "value1"};
    y = {"key3": "value1", "key2": "value2"};
    ${MESSAGE} = x + y; # ${MESSAGE} value is {"key1": "value1", "key3": "value1", "key2": "value2"};
    

    Available in AxoSyslog 4.9 and later.
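
The numeric and list cases described above can be sketched like this (the variable names are hypothetical, and merging lists assumes AxoSyslog 4.9 or later):

filterx {
  my_sum = 3 + 2; # Two integers, the result is an integer
  my_mixed = 3 + 2.5; # An integer and a double, the result is a double
  my_merged = ["first_element"] + ["second_element"]; # Merges the two lists
  ${MESSAGE} = my_merged;
};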

Complex types: lists, dicts, and JSON

The list and dict types are similar to their Python counterparts. FilterX uses JSON to represent generic dictionary and list types, but you can create other, specific dictionary and list types as well (currently for OTEL, for example, otel_kvlist, or otel_array). All supported dictionary and list types are compatible with each other, and you can convert them to and from each other, copy values between them (retaining the type), and so on.

For example:

my_list = []; # Creates an empty list (which defaults to a JSON list)
my_array = {}; # Creates an empty dictionary (which defaults to a JSON object)

my_list2 = json_array(); # Creates an empty JSON list
my_array2 = json(); # Creates an empty JSON object.

You can add elements to lists and dictionaries like this:

list = json_array(); # Create an empty JSON list
#list = otel_array(); # Create an OTEL list
list += ["first_element"]; # Append entries to the list
list += ["second_element"];
list += ["third_element"];
${MESSAGE} = list;

You can also create the list and assign values in a single step:

list = json_array(["first_element", "second_element", "third_element"]);
${MESSAGE} = list;

You can refer to the elements using an index (starting with 0):

list = json_array(); # Create an empty JSON list
list[0] = "first_element"; # Append entries to the list
list[1] = "second_element";
list[2] = "third_element";
${MESSAGE} = list;

In all three cases, the value of ${MESSAGE} is the same JSON array: ["first_element", "second_element", "third_element"].

You can define JSON objects using the json() type, for example:

js1 = json();
js1 += {
    "body": "mystring",
    "time_unix_nano": 123456789,
    "attributes": {
        "int": 42,
        "flag": true
    }
};

js2 = json({"key": "value"});

Naturally, you can assign values from other variables to an object, for example:

list = json_array(["foo", "bar", "baz"]);
${MESSAGE} = json({
    "key": "value",
    "list": list
});

or

js = json({
    "key": ${MY-NAME-VALUE-PAIR},
    "key-from-expression": isset(${HOST}) ? ${HOST} : "default-hostname",
    "list": list
});

Within a FilterX block, you can access the fields of complex data types by using indexes and the dot notation, for example:

  • dot notation: js.key
  • indexing: js["key"]
  • or mixed mode if needed: js.list[1]

When referring to the field of a name-value pair (which begins with the $ character), place the dot or the square bracket outside the curly bracket surrounding the name of the name-value pair, for example: ${MY-LIST}[2] or ${MY-OBJECT}.mykey. If the name of the key contains characters that are not permitted in FilterX variable names, for example, a hyphen (-), use the bracketed syntax and enclose the key in double quotes: ${MY-LIST}["my-key-name"].
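
The access styles above can be sketched in one block. The object, its keys, and the local variable names are hypothetical:

filterx {
  list = json_array(["foo", "bar", "baz"]);
  js = json({
      "key": "value",
      "my-key-name": "another-value",
      "list": list
  });

  from_dot = js.key; # Dot notation
  from_index = js["key"]; # Indexing, same field as above
  from_mixed = js.list[1]; # Mixed mode, the second element of the list
  from_quoted = js["my-key-name"]; # The key contains a hyphen, so use brackets

  ${MY-OBJECT} = js;
  ${MESSAGE} = ${MY-OBJECT}.key; # The dot goes outside the curly brackets
};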

You can add two lists or two dicts using the plus operator.

Operators

FilterX has the following operators.

For details, see FilterX operator reference.

Functions

FilterX has the following built-in functions.

  • cache_json_file: Loads an external JSON file to lookup contextual information.
  • endswith: Checks if a string ends with the specified value.
  • flatten: Flattens the nested elements of an object.
  • format_csv: Formats a dictionary or a list into a comma-separated string.
  • format_json: Dumps a JSON object into a string.
  • format_kv: Formats a dictionary into key=value pairs.
  • get_sdata: Returns the SDATA part of an RFC5424-formatted syslog message as a JSON object.
  • has_sdata: Checks if the message contains SDATA (the structured data part of an RFC5424-formatted syslog message).
  • includes: Checks if a string contains a specific substring.
  • isodate: Parses a string as a date in ISODATE format.
  • is_sdata_from_enterprise: Checks if the message contains the specified organization ID.
  • isset: Checks that the argument exists and its value is not empty or null.
  • istype: Checks the type of an object.
  • len: Returns the length of an object.
  • lower: Converts a string into lowercase characters.
  • parse_csv: Parses a comma-separated or similar string.
  • parse_kv: Parses a string consisting of whitespace or comma-separated key=value pairs.
  • parse_leef: Parses a LEEF-formatted string.
  • parse_xml: Parses an XML object into a JSON object.
  • parse_windows_eventlog_xml: Parses a Windows Event Log XML object into a JSON object.
  • regexp_search: Searches a string using regular expressions.
  • regexp_subst: Rewrites a string using regular expressions.
  • startswith: Checks if a string begins with the specified value.
  • strptime: Converts a string containing a date/time value, using a specified format string.
  • unset: Deletes a name-value pair, or a field from an object.
  • unset_empties: Deletes empty fields from an object.
  • update_metric: Updates a labeled metric counter.
  • upper: Converts a string into uppercase characters.
  • vars: Lists the variables defined in the FilterX block.

For details, see FilterX function reference.

Use cases and examples

The following list shows you some common tasks that you can solve with FilterX:

  • To set message fields (like macros or SDATA fields) or replace message parts: you can assign values to change parts of the message, or use one of the FilterX functions to rewrite existing values.

  • To delete or unset message fields, see Delete values.

  • To rename a message field, assign the value of the old field to the new one, then unset the old field. For example:

    $my_new_field = $my_old_field;
    unset($my_old_field);
    
  • To use conditional rewrites, you can either:

    • embed the FilterX block in an if-else block, or

    • use value comparison in the FilterX block to select the appropriate messages. For example, to rewrite only messages of the NGINX application, you can use:

      ${PROGRAM} == "nginx";
      # <your rewrite expression>
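
As a sketch of the first option, the following log path embeds FilterX blocks in an if-else block of the log path. The s1 source and d1 destination are placeholders, and the ${app_type} name-value pair is hypothetical:

log {
    source(s1);
    if {
        filterx {
            ${PROGRAM} == "nginx"; # Only NGINX messages take this branch
            ${app_type} = "web";
        };
    } else {
        filterx {
            ${app_type} = "other"; # Every other message ends up here
        };
    };
    destination(d1);
};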
      

Create an iptables parser

The following example shows you how to reimplement the iptables parser in a FilterX block. The following is a sample iptables log message (with line-breaks added for readability):

Dec 08 12:00:00 hostname.example kernel: custom-prefix:IN=eth0 OUT=
MAC=11:22:33:44:55:66:aa:bb:cc:dd:ee:ff:08:00 SRC=192.0.2.2 DST=192.168.0.1 LEN=40 TOS=0x00
PREC=0x00 TTL=232 ID=12345 PROTO=TCP SPT=54321 DPT=22 WINDOW=1023 RES=0x00 SYN URGP=0

This is a normal RFC3164-formatted message logged by the kernel (where iptables logging messages originate from), and contains space-separated key-value pairs.

  1. First, create some filter statements to select iptables messages only:

    block filterx parse_iptables() {
        ${FACILITY} == "kern"; # Filter on the kernel facility
        ${PROGRAM} == "kernel"; # Sender application is the kernel
        ${MESSAGE} =~ "PROTO="; # The PROTO key appears in all iptables messages
    };
    
  2. To make the parsed data available under macros beginning with ${.iptables}, like in the case of the original iptables-parser(), create the ${.iptables} JSON object.

    block filterx parse_iptables() {
        ${FACILITY} == "kern"; # Filter on the kernel facility
        ${PROGRAM} == "kernel"; # Sender application is the kernel
        ${MESSAGE} =~ "PROTO="; # The PROTO key appears in all iptables messages
    
        ${.iptables} = json(); # Create an empty JSON object
    };
    
  3. Add a key=value parser to parse the content of the messages into the ${.iptables} JSON object. The key=value pairs are space-separated, while equal signs (=) separate the values from the keys.

    block filterx parse_iptables() {
        ${FACILITY} == "kern"; # Filter on the kernel facility
        ${PROGRAM} == "kernel"; # Sender application is the kernel
        ${MESSAGE} =~ "PROTO="; # The PROTO key appears in all iptables messages
    
        ${.iptables} = json(); # Create an empty JSON object
    
        ${.iptables} = parse_kv(${MESSAGE}, value_separator="=", pair_separator=" ");
    };
    

FilterX variables in destinations

If you’re modifying messages using FilterX (for example, you extract a value from the message and add it to another field of the message), note the following points:

  • Macros and name-value pairs (variables with names beginning with the $ character) are included in the outgoing message in case the template of the destination includes them. For example, if you change the value of the ${MESSAGE} macro, it’s automatically sent to the destination if the destination template includes this macro.
  • Local and pipeline variables are not included in the message. To send them to the destination, you must assign their value to a macro or name-value pair that’s included in the destination template.
  • When sending data to opentelemetry() destinations, if you’re modifying messages received via the opentelemetry() source, then you must explicitly update the original (raw) data structures in your FilterX block, otherwise the changes won’t be included in the outgoing message. For details, see Modify incoming OTEL.
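
The first two points can be sketched as follows. The d_example destination, the file path, and the ${my_field} name-value pair are hypothetical, and the template is only one possible layout:

destination d_example {
    file("/var/log/filterx-example.log" template("${ISODATE} ${HOST} ${MESSAGE} ${my_field}\n"));
};

log {
    source(s1);
    filterx {
        my_local_variable = "extracted-value"; # Not sent to the destination by itself
        ${my_field} = my_local_variable; # Sent, because the template includes ${my_field}
    };
    destination(d_example);
};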