Of course we cannot always share details about our work with customers, but it is nevertheless nice to show our technical achievements and share some of our implemented solutions.
Logstash's filter capabilities are very helpful for sorting incoming logs based on rules and patterns, rewriting logs, adding or removing fields, changing metadata, or simply selecting a different output based on a filter.
A filter can also be applied to IP addresses. As we saw in a previous article (Quick and easy log listener with Logstash and local file output), even for the most basic log events, a client (or source) IP address is logged and saved in the [host] field:
root@logstash:~# cat /var/log/applications.log
{
      "@version" => "1",
    "@timestamp" => 2022-01-24T14:41:29.540Z,
       "message" => "Sending log event to Logstash\n",
          "host" => "192.168.253.110"
}
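For context, a minimal listener sketch that produces such events could look like this (assuming a plain tcp input as in that article's setup; the port number 5000 is a hypothetical choice):

input {
  # Every line a TCP client sends becomes a log event; the tcp input
  # records the client IP address in the [host] field.
  tcp {
    port => 5000
  }
}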
We can use this (remote) [host] field to separate incoming logs. Let's assume that this Logstash instance receives logs from many applications, spread across many networks. If we know the (static) IP addresses of certain application servers, we also know the application. Another possibility is to use network segmentation to separate TEST and PRODUCTION environments.
The first obvious solution is to create a filter which uses an if condition to match the IP address found in the [host] field. Here's a practical example of how to determine the application environment based on the remote host's IP address:
root@logstash:~# cat /etc/logstash/conf.d/10-filter.conf
filter {
  if [host] == "192.168.253.111" or [host] == "192.168.253.112" or [host] == "10.150.79.139" or [host] == "10.150.79.140" or [host] == "10.150.79.143" or [host] == "10.150.79.144" {
    mutate { add_field => { "logtarget" => "test" } }
  } else if [host] == "192.168.253.115" or [host] == "192.168.253.116" or [host] == "10.150.79.11" or [host] == "10.150.79.12" or [host] == "10.150.79.13" or [host] == "10.150.79.14" or [host] == "10.150.79.15" or [host] == "10.150.79.16" or [host] == "10.150.79.17" or [host] == "10.150.79.18" {
    mutate { add_field => { "logtarget" => "prod" } }
  } else {
    mutate { add_field => { "logtarget" => "generic" } }
  }
}
If the log event was sent by a (remote) host matching one of the IP addresses in the first if condition, the field [logtarget] with the value "test" is added. The second (else if) condition does the same for the production environment. If no condition matches the IP address, the "generic" value is added.
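Reusing the sample event from above, a log event arriving from 192.168.253.111 would then carry the new field:

{
      "@version" => "1",
    "@timestamp" => 2022-01-24T14:41:29.540Z,
       "message" => "Sending log event to Logstash\n",
          "host" => "192.168.253.111",
     "logtarget" => "test"
}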
The Logstash output {} section can now be configured to use the [logtarget] field in the file name:
root@logstash:~# cat /etc/logstash/conf.d/99-output.conf
output {
  file {
    path => "/log/app-%{logtarget}-%{+YYYY-MM-dd-HH}.log"
    file_mode => 0644
    codec => plain
  }
}

Note: Logstash's %{+...} date format follows Joda-Time patterns, where dd is the day of the month (DD would be the day of the year).
Note: The variable %{logtarget} can also be used in an Elasticsearch output, for example to write into a specific index.
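A minimal sketch of such an Elasticsearch output (the hosts address and the index naming scheme are assumptions, not taken from the setup above):

output {
  elasticsearch {
    # Hypothetical cluster address; adjust to your environment.
    hosts => ["http://localhost:9200"]
    # Writes into one index per environment, e.g. app-test-2022.01.24.
    index => "app-%{logtarget}-%{+YYYY.MM.dd}"
  }
}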
Depending on where the logs came from, the log file will be one of:

/log/app-test-2022-01-24-14.log
/log/app-prod-2022-01-24-14.log
/log/app-generic-2022-01-24-14.log
This way of handling IP addresses (if conditions) works fine for a couple of static IP addresses. But as you can see from the condition above: the more IP addresses need to be handled, the larger the condition becomes. That's not only annoying to manage; Logstash also needs more time to evaluate the condition (for each incoming event!).
If all the application hosts are known and use static IP addresses, the if conditions can also be shortened by creating a regular expression (regex) around the IP addresses:
root@logstash:~# cat /etc/logstash/conf.d/10-filter.conf
filter {
  if [host] =~ /^192\.168\.253\.(111|112)$/ or [host] =~ /^10\.150\.79\.(139|140|143|144)$/ {
    mutate { add_field => { "logtarget" => "test" } }
  } else if [host] =~ /^192\.168\.253\.(115|116)$/ or [host] =~ /^10\.150\.79\.1[1-8]$/ {
    mutate { add_field => { "logtarget" => "prod" } }
  } else {
    mutate { add_field => { "logtarget" => "generic" } }
  }
}

Note the escaped dots and the $ anchors: without them, a pattern such as /^10.150.79.1[1-9]/ would also match unintended addresses like 10.150.79.110.
This is basically the same filter as before, just shorter and (at least in my opinion) more manageable.
But there's still one problem with this filter: we must know the IP addresses of the application hosts. If the application scales up, additional hosts may need to be added, and the chances are pretty high that the new hosts are forgotten and never added to the filter!
There's one more filter plugin which can help us out here: the cidr filter plugin. The plugin's documentation lacks a lot of information and examples and is at first hard to understand. Basically what you need to understand is the following: The cidr filter compares an IP address (the address option) against a list of network ranges (the network option). If the address falls into one of the networks, the filter's actions (such as add_field) are applied; if it doesn't, nothing happens.
"Translating" the above filters to use the cidr filter, results in the following new filter:
root@logstash:~# cat /etc/logstash/conf.d/10-filter.conf
filter {
  # Test networks
  cidr {
    address => [ "%{host}" ]
    network => [ "192.168.253.111/32", "192.168.253.112/32", "10.150.79.128/25" ]
    add_field => { "logtarget" => "test" }
  }
  # Production networks
  cidr {
    address => [ "%{host}" ]
    network => [ "192.168.253.115/32", "192.168.253.116/32", "10.150.79.0/25" ]
    add_field => { "logtarget" => "prod" }
  }
}
In this example, the [host] field is used as a variable and serves as the value of the "address" option. The "address" is then compared against the networks defined in "network". Note that there are no if conditions in this situation; the comparison happens within the cidr filter, and the actions (add_field) are only applied if the comparison results in a match (true).
Although each environment has two static host IP addresses defined (using the /32 suffix), the big advantage here is the separation of segmented networks. By checking the "address" against the TEST (10.150.79.128/25) and PROD (10.150.79.0/25) network ranges, we don't need to know the exact source IPs where the application runs. Application scaling (within the environment) is perfectly supported and the filter does not need to be adjusted.
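One difference from the if/else variants above: an event that matches neither cidr filter ends up without a [logtarget] field at all, so the file output path would contain a literal %{logtarget}. A fallback mirroring the earlier "generic" else branch can be added after the two cidr filters:

filter {
  # Fallback for events matching neither network range: without this,
  # unmatched events would have no [logtarget] field at all.
  if ![logtarget] {
    mutate { add_field => { "logtarget" => "generic" } }
  }
}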