group-lines parser

Available in AxoSyslog version 4.2 and newer.

The group-lines() parser correlates multi-line messages received as separate, but subsequent lines into a single log message. AxoSyslog first collects the received messages into streams of related messages (based on the key() parameter), then grouped into correlation contexts up to timeout() seconds long. Multi-line messages are then identified within these contexts.

  group-lines(
    key("$FILE_NAME")
    multi-line-mode("smart")
    template("$MESSAGE")
    timeout(10)
    line-separator("\n")
  );

The parser has the following options.

key()

Type: template
Default:

Description: Specifies a template that determines which messages form a single stream. Messages where the template expansion results in the same key are considered part of the same stream. Using the key() option, you can extract multi-line messages even if different streams are interleaved in your input.

line-separator()

Type: string
Default: \n

Description: In case a multi-line message is found, this string is inserted between the of the new multi-line message. Defaults to the newline character.

multi-line-garbage()

Type: regular expression
Default: empty string

Description: Use the multi-line-garbage() option when processing multi-line messages that contain unneeded parts between the messages. Specify a string or regular expression that matches the beginning of the unneeded message parts. If the multi-line-garbage() option is set, AxoSyslog ignores the lines between the line matching the multi-line-garbage() and the next line matching multi-line-prefix(). See also the multi-line-prefix() option.

When receiving multi-line messages from a source when the multi-line-garbage() option is set, but no matching line is received between two lines that match multi-line-prefix(), AxoSyslog will continue to process the incoming lines as a single message until a line matching multi-line-garbage() is received.

To use the multi-line-garbage() option, set the multi-line-mode() option to prefix-garbage.

multi-line-mode()

Type: indented, prefix-garbage, prefix-suffix, regexp, smart
Default: empty string

Description: Use the multi-line-mode() option when processing multi-line messages. The AxoSyslog application provides the following methods to process multi-line messages:

  • indented: The indented mode can process messages where each line that belongs to the previous line is indented by whitespace, and the message continues until the first non-indented line. For example, the Linux kernel (starting with version 3.5) uses this format for /dev/log, as well as several applications, like Apache Tomcat.

        source s_tomcat {
            file("/var/log/tomcat/xxx.log" multi-line-mode(indented));
        };
    
  • prefix-garbage: The prefix-garbage mode uses a string or regular expression (set in multi-line-prefix()) that matches the beginning of the log messages, ignores newline characters from the source until a line matches the regular expression again, and treats the lines between the matching lines as a single message. For details on using multi-line-mode(prefix-garbage), see the multi-line-prefix() and multi-line-garbage() options.

  • prefix-suffix: The prefix-suffix mode uses a string or regular expression (set in multi-line-prefix()) that matches the beginning of the log messages, ignores newline characters from the source until a line matches the regular expression set in multi-line-suffix(), and treats the lines between multi-line-prefix() and multi-line-suffix() as a single message. Any other lines between the end of the message and the beginning of a new message (that is, a line that matches the multi-line-prefix() expression) are discarded. For details on using multi-line-mode(prefix-suffix), see the multi-line-prefix() and multi-line-suffix() options.

    The prefix-suffix mode is similar to the prefix-garbage mode, but it appends the garbage part to the message instead of discarding it.

  • smart: The smart mode recognizes multi-line data backtraces even if they span multiple lines in the input. The backtraces are converted to a single log message for easier analysis. Backtraces for the following programming languages are recognized : Python, Java, JavaScript, PHP, Go, Ruby, and Dart.

    smart mode is available in AxoSyslog version 4.2 and newer.

    The regular expressions to recognize these programming languages are specified in an external file called /usr/share/syslog-ng/smart-multi-line.fsm (installation path depends on configure arguments), in a format that is described in that file.

multi-line-prefix()

Type: regular expression starting with the ^ character
Default: empty string

Description: Use the multi-line-prefix() option to process multi-line messages, that is, log messages that contain newline characters (for example, Tomcat logs). Specify a string or regular expression that matches the beginning of the log messages (always start with the ^ character). Use as simple regular expressions as possible, because complex regular expressions can severely reduce the rate of processing multi-line messages. If the multi-line-prefix() option is set, AxoSyslog ignores newline characters from the source until a line matches the regular expression again, and treats the lines between the matching lines as a single message. See also the multi-line-garbage() option.

Example: Processing Tomcat logs

The log messages of the Apache Tomcat server are a typical example for multi-line log messages. The messages start with the date and time of the query in the YYYY.MM.DD HH:MM:SS format, as you can see in the following example.

   2010.06.09. 12:07:39 org.apache.catalina.startup.Catalina start
    SEVERE: Catalina.start:
    LifecycleException:  service.getName(): "Catalina";  Protocol handler start failed: java.net.BindException: Address already in use null:8080
           at org.apache.catalina.connector.Connector.start(Connector.java:1138)
           at org.apache.catalina.core.StandardService.start(StandardService.java:531)
           at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
           at org.apache.catalina.startup.Catalina.start(Catalina.java:583)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
           at java.lang.reflect.Method.invoke(Method.java:597)
           at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
           at java.lang.reflect.Method.invoke(Method.java:597)
           at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:177)
    2010.06.09. 12:07:39 org.apache.catalina.startup.Catalina start
    INFO: Server startup in 1206 ms
    2010.06.09. 12:45:08 org.apache.coyote.http11.Http11Protocol pause
    INFO: Pausing Coyote HTTP/1.1 on http-8080
    2010.06.09. 12:45:09 org.apache.catalina.core.StandardService stop
    INFO: Stopping service Catalina

To process these messages, specify a regular expression matching the timestamp of the messages in the multi-line-prefix() option. Such an expression is the following:

source s_file{file("/var/log/tomcat6/catalina.2010-06-09.log" follow-freq(0) multi-line-mode(regexp) multi-line-prefix("[0-9]{4}\.[0-9]{2}\.[0-9]{2}\.") flags(no-parse));};
    };

Note that flags(no-parse) is needed to prevent AxoSyslog trying to interpret the date in the message.

multi-line-suffix()

Type: regular expression
Default: empty string

Description: Use the multi-line-suffix() option when processing multi-line messages. Specify a string or regular expression that matches the end of the multi-line message.

To use the multi-line-suffix() option, set the multi-line-mode() option to prefix-suffix. See also the multi-line-prefix() option.

scope()

Type: process, program, host, or global
Default: global

Description: Specifies which messages belong to the same context. The following values are available:

  • process: Only messages that are generated by the same process of a client belong to the same context, that is, messages that have identical ${HOST}, ${PROGRAM} and ${PID} values.
  • program: Messages that are generated by the same application of a client belong to the same context, that is, messages that have identical ${HOST} and ${PROGRAM} values.
  • host: Every message generated by a client belongs to the same context, only the ${HOST} value of the messages must be identical.
  • global: Every message belongs to the same context. This is the default value.

template()

Type: template
Default:

Description: A template string that specifies what constitutes an line to group-lines(). In simple cases this is ${MSG} or ${RAWMSG}.