# Introducing Log4Shell Sentinel

A smart Log4Shell/Log4j/CVE-2021-44228 scanner

# Introduction

The Internet’s on fire. CVE-2021-44228, a critical vulnerability in Apache Log4j, one of the most widely used Java libraries, allows unauthenticated users to easily gain full control of servers running vulnerable applications and is currently being massively exploited. This is hands down one of the worst Java-related vulnerabilities of the past decade. For more on the vulnerability itself, I would recommend reading this and this by the Lunasec team.

To help organizations identify vulnerable applications in their environment, I developed Log4Shell Sentinel, which takes a somewhat unique approach to tackling the issue at enterprise scale.

NOTE This article will only touch on some of the tool’s features. For more documentation, kindly refer to the GitHub repository.

# Why Another Tool?

There are already a number of excellent tools available to help organizations scan their environments. These include:

• passive and active DAST tools including Burp Suite plugins
• SAST tools to identify vulnerable code (if you have access to the source code)
• filesystem scanners
• web log scanners such as log4shell-detector.

But the truth is that despite the availability of these tools, the vast majority of enterprises are struggling. Even if they successfully identify a few applications, this is typically only a small subset of the applications that may be vulnerable. Add to this the tremendous adoption of containers over the past few years and this problem is further compounded. Until now.

Log4Shell Sentinel empowers companies to quickly:

• identify all uses of the vulnerable library in their environments (except for appliances you don’t have access to), typically within minutes. It uses a filesystem-scanning approach with some smarts added to it
• enable an IT / security analyst / GRC specialist to:
  • identify duplicates of the same application running across their environment and handle them as a single instance
  • map file system paths to container images
  • set up ignore lists for applications known not to be impacted by this vulnerability, even if they use the vulnerable library

Log4Shell Sentinel is the only open-source tool I’m aware of that takes a holistic view of the lifecycle of identifying and tracking the vulnerability across the enterprise. Using it, along with other tools that can help validate findings, will give enterprises more visibility into their exposure levels.

Let’s first look at the general solution design.

# Solution Design

The approach I adopted was a low-level approach based on file system scanning. While a match does not necessarily mean that an application is vulnerable, it does mean that the application may be vulnerable, and you’ll have to either analyze the source code or run a SAST or DAST tool to know definitively whether it is vulnerable (or possibly take the vendor’s word for it if it is a 3rd-party application). This also means that:

• a scan across your environment can be completed in minutes
• a re-scan can easily be run to identify if all instances of a vulnerable application were updated or not

When combined with configuration management tools such as Ansible and the ability to ignore findings, you can easily scale a scan across hundreds / thousands of servers while progressively moving through the list of findings.

## Filesystem Scanning

As there are a number of ways to deploy Java applications, Log4Shell Sentinel inspects the following file types:

• simple jar
• fat / uber jar
• WAR
• EAR
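To illustrate why these packaging formats matter, here is a minimal Python sketch of how a scanner could look inside them. It is not Log4Shell Sentinel's implementation; the function name `find_log4j_core` and the matching rule (a file named `log4j-core-*.jar`) are assumptions made for this example. A simple jar matches on its own file name, while fat jars, WARs and EARs are zip archives that have to be opened, possibly recursively.

```python
import io
import re
import zipfile

# Matches e.g. "log4j-core-2.14.1.jar" at the end of a path or archive entry name.
LOG4J_CORE = re.compile(r"(?:^|/)(log4j-core-[^/]*\.jar)$")
ARCHIVE_EXTENSIONS = (".jar", ".war", ".ear")


def find_log4j_core(name, data):
    """Return the log4j-core jar names found in `data`, the bytes of file `name`.

    A simple jar matches on its own name. Fat jars, WARs and EARs are opened
    as zip archives and searched recursively, since an EAR may hold a WAR that
    in turn holds the library.
    """
    match = LOG4J_CORE.search(name)
    if match:
        return [match.group(1)]
    if not name.lower().endswith(ARCHIVE_EXTENSIONS):
        return []
    found = []
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for entry in archive.namelist():
                if entry.lower().endswith(ARCHIVE_EXTENSIONS):
                    found.extend(find_log4j_core(entry, archive.read(entry)))
    except zipfile.BadZipFile:
        pass  # not a readable archive; skip it
    return found


# Build a tiny in-memory "fat jar" (hypothetical content) and scan it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr("BOOT-INF/lib/log4j-core-2.14.1.jar", b"placeholder")
print(find_log4j_core("spring-boot-application.jar", buffer.getvalue()))
```

Reading nested archives into memory keeps the sketch short; a production scanner would likely stream large files instead.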

Filesystem scanning has both pros and cons. Pros include:

• being able to identify with high certainty possibly vulnerable apps
• speed

The main cons are:

• not being able to actually trigger the vulnerability to definitively know if it is vulnerable or not
• not knowing if the artifact found actually corresponds to a running process / container (Log4Shell Sentinel does this for you in some cases)

One of the key features of Log4Shell Sentinel is its ability to enrich findings, allowing analysts to focus on what’s important. Out of the box, for each jar file detected, it adds the following:

• application fingerprinting
• container image mapping

### Host Enrichment

To allow an analyst to quickly identify what host a given application runs on, all matches add the following fields:

• IP
• hostname

For hosts with multiple IPs, the IP used for default routing is used. This allows an analyst to easily locate the host running the potentially vulnerable application.
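One common way to find "the IP used for default routing" is to ask the kernel which source address it would pick for an outbound connection. The Python sketch below shows that technique. It is an illustration only and not necessarily how Log4Shell Sentinel does it; the function name `primary_ip` and the fallback value are assumptions.

```python
import socket


def primary_ip(fallback="127.0.0.1"):
    """Return the local IP address the kernel would use for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only asks the kernel
        # to pick a route (and therefore a source address) for this destination.
        # 192.0.2.1 is a reserved documentation address (TEST-NET-1).
        sock.connect(("192.0.2.1", 53))
        return sock.getsockname()[0]
    except OSError:
        return fallback  # no usable route, e.g. an isolated host
    finally:
        sock.close()


print(primary_ip(), socket.gethostname())
```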

### Application Fingerprinting

Although there is no way to identify the actual application used, Log4Shell Sentinel does help analysts identify multiple copies of a single application across their environment, even if it doesn’t know what this application translates to in terms of business functionality. It does this by calculating an MD5 hash of the corresponding JAR/WAR/EAR file and adding it to the output. As the scanner outputs findings in CSV format, the analyst can then sort on this field within Excel and treat all rows sharing a hash as a single finding. However, in instances where a simple jar is used, no hash is calculated, as this would be meaningless: many applications are likely to use the same log4j-core-X.X.jar file. In this case, the analyst can use the full path to possibly identify multiple instances of the same application.
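As a rough illustration of the fingerprinting idea, the Python sketch below computes the MD5 hash of an archive file in chunks, so large fat jars do not have to be loaded into memory at once. The function name `md5_fingerprint` is an assumption for this example, not part of the tool.

```python
import hashlib


def md5_fingerprint(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two copies of the same WAR deployed on different servers yield the same digest, which is what makes sorting on this column useful. MD5 is used here only as a fingerprint for grouping, not as a security control.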

### Container Image Mapping

One major issue with current filesystem scanners is that they can be difficult to work with, especially for containerized apps. For example, a typical scan may result in a match such as this:

/var/lib/docker/overlay2/192768f471818601094bf4edd96d14bfc0e2b178a04a2efd00b2231ad4e46b33/merged/app/spring-boot-application.jar

An analyst would then have to SSH to the host and try to determine what container this file belongs to. This is further complicated by the fact that different container runtimes use different paths and require different techniques to match a file to a container. In fact, even within a given container runtime, the storage driver used (overlay2, aufs, devicemapper etc.) impacts this. As of v1.0.0, Log4Shell Sentinel supports the following mappings:

• CRI-O
• Docker (overlay2, aufs)
• containerd

This is a killer feature that can save analysts hours of frustration trying to determine what an entry like the above maps to. Instead, a match gives the analyst both the full path and the container image this container was started from. In our case, this is: ghcr.io/christophetd/log4shell-vulnerable-app. For me, personally, this is the most unique feature of Log4Shell Sentinel and the one that took the most effort to implement. Not only does it perform the lookup, but it also removes superfluous matches that would drive an analyst insane. For example, these two matches:

/var/lib/docker/overlay2/192768f471818601094bf4edd96d14bfc0e2b178a04a2efd00b2231ad4e46b33/merged/app/spring-boot-application.jar
/var/lib/docker/overlay2/9e570f0cec8dcff5662a940f205600b541f82bd7d5d9c9bea8975ecb072506f4/diff/app/spring-boot-application.jar


point to the same container. The vulnerable JAR file is detected at both the diff and merged layers of the container but Log4Shell Sentinel understands this and will only show you a single entry. For more on container layers, refer to this article.
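For Docker with the overlay2 driver, the mapping and de-duplication described above can be sketched in Python as follows. This is a simplified illustration, not the tool's code: it assumes the `docker inspect` output for the running containers has already been loaded (for instance with `json.loads`), and it relies on the `GraphDriver.Data` and `Config.Image` fields of that output. The helper names and the sample inspect entry are hypothetical.

```python
def layer_dirs(container):
    """Yield the overlay2 directories that belong to one `docker inspect` entry."""
    data = container["GraphDriver"]["Data"]
    yield data["MergedDir"]
    yield data["UpperDir"]
    # LowerDir is a colon-separated list of read-only image layers.
    for lower in data.get("LowerDir", "").split(":"):
        if lower:
            yield lower


def map_to_image(path, containers):
    """Return (image, path inside the container) for `path`, or None."""
    for container in containers:
        for directory in layer_dirs(container):
            prefix = directory.rstrip("/") + "/"
            if path.startswith(prefix):
                return container["Config"]["Image"], "/" + path[len(prefix):]
    return None


def dedupe(paths, containers):
    """Collapse findings that resolve to the same file of the same image."""
    seen = {}
    for path in paths:
        key = map_to_image(path, containers) or (None, path)
        seen.setdefault(key, path)  # keep the first path seen for each key
    return seen


# Hypothetical, trimmed-down `docker inspect` entry for the container above.
OVERLAY = "/var/lib/docker/overlay2/"
MERGED_ID = "192768f471818601094bf4edd96d14bfc0e2b178a04a2efd00b2231ad4e46b33"
IMAGE_LAYER_ID = "9e570f0cec8dcff5662a940f205600b541f82bd7d5d9c9bea8975ecb072506f4"
containers = [{
    "Config": {"Image": "ghcr.io/christophetd/log4shell-vulnerable-app"},
    "GraphDriver": {"Name": "overlay2", "Data": {
        "MergedDir": OVERLAY + MERGED_ID + "/merged",
        "UpperDir": OVERLAY + MERGED_ID + "/diff",
        "LowerDir": OVERLAY + IMAGE_LAYER_ID + "/diff",
    }},
}]
matches = [
    OVERLAY + MERGED_ID + "/merged/app/spring-boot-application.jar",
    OVERLAY + IMAGE_LAYER_ID + "/diff/app/spring-boot-application.jar",
]
# Both matches resolve to the same image and in-container path, so one line is printed.
for (image, inner_path), first_seen in dedupe(matches, containers).items():
    print(image, inner_path)
```

Other runtimes and storage drivers lay out their directories differently, so each would need its own lookup.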

## Sample Finding

A sample finding would look like this:

IP,Hostname,AppName,Team,Ignore (Y/N),Comments,MD5Hash,Timestamp,Container,ContainerImage,FullPath,Version


A breakdown of the fields is as follows:

| Field | Example | Description |
|---|---|---|
| IP | 192.168.121.121 | If the instance has multiple IPs, the primary IP is added |
| Hostname | server3 | Hostname |
| AppName | | This is left to the user to complete after the scan is completed |
| Team | | This is left to the user to complete after the scan is completed |
| Ignore (Y/N) | | This is left to the user to complete after the scan is completed |
| Comments | | This is left to the user to complete after the scan is completed |
| MD5Hash | 4e615cd580758b70c49ade1f79103328 | A unique fingerprint of our application |
| Timestamp | 2021-12-21T07:38:09Z | A timestamp of when the scan was performed. This is useful if you want to aggregate multiple scans of the same host |
| Container | true | True = this is a container |
| ContainerImage | ghcr.io/christophetd/log4shell-vulnerable-app:latest | If a container was detected, this is the corresponding image |
| FullPath | /run/containerd/io.containerd. | The full path of the finding. In this example, the vulnerable JAR file is part of a fat / uber jar |
| Version | log4j-core-2.14.1.jar | The log4j-core version detected |
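If you would rather process findings programmatically than in a spreadsheet, the columns above map naturally onto dictionaries. The Python sketch below is one possible way to do that; the function name `parse_findings` is an assumption, and the sample line is the server3 finding that appears later in this article.

```python
import csv

FIELDS = ["IP", "Hostname", "AppName", "Team", "Ignore (Y/N)", "Comments",
          "MD5Hash", "Timestamp", "Container", "ContainerImage", "FullPath", "Version"]


def parse_findings(lines):
    """Turn raw CSV output lines into dictionaries keyed by column name."""
    return list(csv.DictReader(lines, fieldnames=FIELDS))


sample = (
    "192.168.121.121,server3,,,,,4e615cd580758b70c49ade1f79103328,"
    "2021-12-22T05:51:35Z,true,"
    "docker.io/mlinarik/log4j-log4shell-vulnerable-app:latest,"
    "/run/containerd/io.containerd.runtime.v2.task/k8s.io/"
    "ddd445b7143e5d11598ef476c4c7a1279d7a472576cd76542e4ed409b2f48564"
    "/rootfs/app/spring-boot-application.jar,log4j-core-2.14.1.jar"
)
finding = parse_findings([sample])[0]
print(finding["Hostname"], finding["ContainerImage"], finding["Version"])
```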

# Running Log4Shell Sentinel

Running Log4Shell Sentinel is straightforward. As it is a single statically compiled binary, all you need to do is download it to the target machine and run it. No configuration file is required (unless you want to pass in an ignore list). You simply point it to the directory where your jar files are located and it does the rest.

NOTE For the best coverage, I would recommend passing in / and running it as root to ensure that you actually cover your entire filesystem.

A system scan typically takes a few seconds, scanning about 100K files / second. During this time, CPU usage for a single core will spike, so you probably want to run it during off-peak hours or when your current CPU utilization is low.

Here the scanner detected a possibly vulnerable version of:

• Logstash
• Elasticsearch
• awspx

We can select to show only the CSV lines and import them into Excel and start analyzing them. For a more thorough example, we’ll scan our environment.

# Workflow & Mass Scanning Our Environment

While it is perfectly fine to run Log4Shell Sentinel on a single server, it really shines when combined with your favorite configuration management tool (Ansible, Chef, Puppet, Salt, AWS SSM, etc.) or even simple parallel SSH tools such as parallel-ssh.

This gives you numerous benefits including:

• finding duplicate instances of your applications and containers and treating these findings as a single finding at the analysis phase
• scanning your entire environment in minutes

To really benefit from this tool, we want to run it across our servers and aggregate the results. In our example, we’ll use Ansible. Ansible is fantastic and easy to start using. All it requires is SSH access to the target servers (and ideally an account on the target servers that supports passwordless sudo). We won’t go into the details of setting things up as this is out of scope, but an illustration will show how useful this is. The inventory parameter mentioned in the commands below is simply a file listing the servers we wish to SSH to and scan. In our case, it points to 3 servers:

[all]
server1
server2
server3


In our environment, we’ll start by copying over the binary to our servers:

$ ansible all -i inventory -m copy -a 'src=./log4shell_sentinel dest=$HOME/log4shell_sentinel' --become


The --become option tells it to sudo to the root user. After running the above, our binary is available under /root/ on all our servers. We then make it executable:

$ ansible all -i inventory -a "chmod +x /root/log4shell_sentinel" --become

Finally, we run it:

$ ansible all -i inventory -a "/root/log4shell_sentinel --p / --nb --nm --nh" --become


and get back our results:

server1 | CHANGED | rc=0 >>
192.168.1.110,server1,,,,,,2021-12-22T16:41:11+11:00,false,,/home/ubuntu/Apps/log4j-core-2.14.0.jar,log4j-core-2.14.0.jar
192.168.1.110,server1,,,,,,2021-12-22T16:41:11+11:00,false,,/home/ubuntu/logstash-6.8.0/logstash-core/lib/jars/log4j-core-2.9.1.jar,log4j-core-2.9.1.jar
192.168.1.110,server1,,,,,,2021-12-22T16:41:13+11:00,true,owasp/zap2docker-stable:2.10.0,/var/lib/docker/overlay2/07678b8899ca644697d67bd73092c72e974fdccc382ae26b4ac01c72b43972c4/merged/zap/lib/log4j-core-2.14.0.jar,log4j-core-2.14.0.jar
192.168.1.110,server1,,,,,c128eae8be5c9b5afcde153009474e88,2021-12-22T16:41:13+11:00,true,owasp/zap2docker-stable:2.10.0,/var/lib/docker/overlay2/07678b8899ca644697d67bd73092c72e974fdccc382ae26b4ac01c72b43972c4/merged/zap/webswing/webswing-server.war,log4j-core-2.13.2.jar
server2 | CHANGED | rc=0 >>
192.168.121.43,server2,,,,,,2021-12-22T05:41:16Z,false,,/home/vagrant/logstash-6.8.0/logstash-core/lib/jars/log4j-core-2.9.1.jar,log4j-core-2.9.1.jar
192.168.121.43,server2,,,,,,2021-12-22T05:41:21Z,true,docker.elastic.co/elasticsearch/elasticsearch:7.14.2,/var/lib/docker/overlay2/1a1a6aa868ce85f258c4a57e5d8523c83d484e2e324bcdd4c99a50d36331d121/merged/usr/share/elasticsearch/lib/log4j-core-2.11.1.jar,log4j-core-2.11.1.jar
192.168.121.43,server2,,,,,,2021-12-22T05:41:26Z,true,beatro0t/awspx:latest,/var/lib/docker/overlay2/c2ed81227ab6a9e33327459ea34db9ae151f0170e6b15a763785feb0db2f15f8/merged/var/lib/neo4j/lib/log4j-core-2.14.0.jar,log4j-core-2.14.0.jar
server3 | CHANGED | rc=0 >>


We get findings on all 3 of our servers. To remove the extra metadata added by Ansible, we can re-run the scan and use grep -v to filter it out:

$ ansible all -i inventory -a "/root/log4shell_sentinel --p / --nb --nm --nh" --become | grep -v "| CHANGED |"

Another option is to simply redirect all output to a given directory. This is a cleaner option:

$ mkdir results
$ ansible all -i inventory -a "/root/log4shell_sentinel --p / --nb --nm --nh" --become -t results

The output is saved in JSON format and looks like this:

{
  "ansible_facts": {
    "discovered_interpreter_python": "/usr/bin/python3"
  },
  "changed": true,
  "cmd": [
    "/root/log4shell_sentinel",
    "--p",
    "/",
    "--nb",
    "--nm",
    "--nh"
  ],
  "delta": "0:00:46.752778",
  "end": "2021-12-22 05:52:16.308776",
  "msg": "",
  "rc": 0,
  "start": "2021-12-22 05:51:29.555998",
  "stderr": "",
  "stderr_lines": [],
  "stdout": "192.168.121.121,server3,,,,,4e615cd580758b70c49ade1f79103328,2021-12-22T05:51:35Z,true,docker.io/mlinarik/log4j-log4shell-vulnerable-app:latest,/run/containerd/io.containerd.runtime.v2.task/k8s.io/ddd445b7143e5d11598ef476c4c7a1279d7a472576cd76542e4ed409b2f48564/rootfs/app/spring-boot-application.jar,log4j-core-2.14.1.jar",
  "stdout_lines": [
    "192.168.121.121,server3,,,,,4e615cd580758b70c49ade1f79103328,2021-12-22T05:51:35Z,true,docker.io/mlinarik/log4j-log4shell-vulnerable-app:latest,/run/containerd/io.containerd.runtime.v2.task/k8s.io/ddd445b7143e5d11598ef476c4c7a1279d7a472576cd76542e4ed409b2f48564/rootfs/app/spring-boot-application.jar,log4j-core-2.14.1.jar"
  ]
}

To extract only the stdout_lines, we simply run:

$ cat results/* | jq -r ".stdout_lines[]"
...


In practice, we would do the following to generate a header and append the results:

$ ./log4shell_sentinel --ph > results.csv
$ cat results/* | jq -r ".stdout_lines[]" >> results.csv
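Where jq is not available, the same aggregation can be approximated with a few lines of Python. This is only a sketch under assumptions: the function name `aggregate` is made up for this example, the header is hard-coded from the sample finding shown earlier rather than obtained from the tool, and each file in the results directory is expected to be one of the per-host JSON documents shown above.

```python
import json
from pathlib import Path

# Copied from the sample finding header; the tool itself can print it.
HEADER = ("IP,Hostname,AppName,Team,Ignore (Y/N),Comments,MD5Hash,"
          "Timestamp,Container,ContainerImage,FullPath,Version")


def aggregate(results_dir, output_csv):
    """Write the header plus every host's stdout_lines to `output_csv`."""
    lines = [HEADER]
    for result_file in sorted(Path(results_dir).iterdir()):
        if not result_file.is_file():
            continue
        result = json.loads(result_file.read_text())
        lines.extend(result.get("stdout_lines", []))
    Path(output_csv).write_text("\n".join(lines) + "\n")
    return len(lines) - 1  # number of findings written
```

Calling `aggregate("results", "results.csv")` would then produce a file ready to open in Excel. Hosts without findings simply contribute no rows.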


We then open our findings in Excel:

As an analyst, we would then start to process the results. We’ll notice the following:

• the archives on server1 under /home/ubuntu/Apps/spring-* are all the same application (we determine this using the MD5 hash). We can treat them as a single entry. After a quick check, we determine that this isn’t currently being run and we choose to set the Ignore field to Y for these entries. In future scans, we can use the ignore list to keep them out of our results.
• by sorting by MD5 hashes, we notice that two different container images, docker.io/mlinarik/log4j-log4shell-vulnerable-app:latest and ghcr.io/christophetd/log4shell-vulnerable-app, are actually wrappers for the same application. We may decide to further scan one to determine if it is vulnerable or not by performing a source code audit or running one of the available SAST or DAST tools
• for the zap2docker-stable:2.10.0 and docker.elastic.co/elasticsearch/elasticsearch:7.14.2, we can try to determine the system owners by using a combination of the application name + the hostname and IP
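The "sort by MD5 hash" step can also be automated. The Python sketch below groups findings by hash so that images wrapping the same application show up together. The function name `group_by_hash` is an assumption, and the sample rows are reduced to the three columns the sketch needs, using hashes and image names from this article.

```python
from collections import defaultdict


def group_by_hash(findings):
    """Map each non-empty MD5Hash to the sorted images (or paths) carrying it."""
    groups = defaultdict(set)
    for finding in findings:
        if finding["MD5Hash"]:  # simple jars have no hash; skip them
            groups[finding["MD5Hash"]].add(
                finding["ContainerImage"] or finding["FullPath"])
    return {md5: sorted(members) for md5, members in groups.items()}


# Illustrative rows; FullPath values are shortened placeholders.
findings = [
    {"MD5Hash": "4e615cd580758b70c49ade1f79103328", "FullPath": "/example/a.jar",
     "ContainerImage": "docker.io/mlinarik/log4j-log4shell-vulnerable-app:latest"},
    {"MD5Hash": "4e615cd580758b70c49ade1f79103328", "FullPath": "/example/b.jar",
     "ContainerImage": "ghcr.io/christophetd/log4shell-vulnerable-app:latest"},
    {"MD5Hash": "", "FullPath": "/home/ubuntu/Apps/log4j-core-2.14.0.jar",
     "ContainerImage": ""},
]
for md5, members in group_by_hash(findings).items():
    if len(members) > 1:
        print(md5, "is shared by", ", ".join(members))
```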

We update our Excel sheet with this data and then use the Filter option to view only entries where ignore = Y and then copy the columns into a new Worksheet. We then save it as paths.csv which now looks like this:

/home/ubuntu/Apps/log4j-core-2.14.0.jar,None,Not an issue
/home/ubuntu/Apps/spring-boot-application.ear,Sample App,Not running
/home/ubuntu/Apps/spring-boot-application.jar,Sample App,Not running
/home/ubuntu/Apps/spring-boot-application.war,Sample App,Not running
/home/ubuntu/logstash-6.8.0/logstash-core/lib/jars/log4j-core-2.9.1.jar,Logstash,Run only from the CLI
/home/vagrant/logstash-6.8.0/logstash-core/lib/jars/log4j-core-2.9.1.jar,Logstash,Run only from the CLI


WARNING We have to be careful when configuring ignore lists. For example, if we used an MD5 hash instead for spring-boot-application.jar, our two containers running the application would have been added to the ignore list.
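To make the effect of a path-based ignore list concrete, here is a small Python sketch that filters findings against a file shaped like paths.csv (full path in the first column). It mimics the behaviour described here and is not the tool's own logic; both function names are assumptions.

```python
import csv


def load_ignored_paths(lines):
    """Read the first column (the full path) of each ignore-list row."""
    return {row[0] for row in csv.reader(lines) if row}


def apply_ignore_list(findings, ignored_paths):
    """Drop findings whose FullPath is on the ignore list."""
    return [f for f in findings if f["FullPath"] not in ignored_paths]


ignore_rows = [
    "/home/ubuntu/Apps/log4j-core-2.14.0.jar,None,Not an issue",
    "/home/ubuntu/Apps/spring-boot-application.jar,Sample App,Not running",
]
scan = [
    {"Hostname": "server1", "FullPath": "/home/ubuntu/Apps/log4j-core-2.14.0.jar"},
    {"Hostname": "server1", "FullPath": "/home/ubuntu/Apps/spring-boot-application.jar"},
    {"Hostname": "server3", "FullPath": "/rootfs/app/spring-boot-application.jar"},
]
remaining = apply_ignore_list(scan, load_ignored_paths(ignore_rows))
print([f["FullPath"] for f in remaining])
```

Because matching is done on the exact path, the containerized copy of the application (the placeholder path on server3) stays in the results, which is the behaviour the warning above is about.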

To re-run our scan, we would have to copy this file to our servers and then re-run our scan. This is simple:

$ ansible all -i inventory -m copy -a 'src=./paths.csv dest=$HOME/paths.csv' --become


and then re-running our scan shows that they have been removed from the findings:

$ ansible all -i inventory -a "/root/log4shell_sentinel --p / --nb --nm --nh --ip /root/paths.csv"  --become | grep -v '| CHANGED |'
192.168.121.43,server2,,,,,,2021-12-22T06:21:33Z,true,docker.elastic.co/elasticsearch/elasticsearch:7.14.2,/var/lib/docker/overlay2/1a1a6aa868ce85f258c4a57e5d8523c83d484e2e324bcdd4c99a50d36331d121/merged/usr/share/elasticsearch/lib/log4j-core-2.11.1.jar,log4j-core-2.11.1.jar
192.168.121.43,server2,,,,,,2021-12-22T06:21:35Z,true,beatro0t/awspx:latest,/var/lib/docker/overlay2/c2ed81227ab6a9e33327459ea34db9ae151f0170e6b15a763785feb0db2f15f8/merged/var/lib/neo4j/lib/log4j-core-2.14.0.jar,log4j-core-2.14.0.jar
192.168.1.110,server1,,,,,,2021-12-22T17:21:36+11:00,true,owasp/zap2docker-stable:2.10.0,/var/lib/docker/overlay2/07678b8899ca644697d67bd73092c72e974fdccc382ae26b4ac01c72b43972c4/merged/zap/lib/log4j-core-2.14.0.jar,log4j-core-2.14.0.jar
192.168.1.110,server1,,,,,c128eae8be5c9b5afcde153009474e88,2021-12-22T17:21:36+11:00,true,owasp/zap2docker-stable:2.10.0,/var/lib/docker/overlay2/07678b8899ca644697d67bd73092c72e974fdccc382ae26b4ac01c72b43972c4/merged/zap/webswing/webswing-server.war,log4j-core-2.13.2.jar
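Since re-scans are cheap, it can be handy to compare two scans and see what disappeared, what is still present and what is new. The following Python sketch is one possible way to do this, assuming findings have been parsed into dictionaries with Hostname and FullPath keys. The function name `compare_scans` is an assumption, and the sample paths are shortened placeholders.

```python
def compare_scans(before, after):
    """Split findings into resolved, remaining and new, keyed by host and path."""
    def key(finding):
        # The timestamp changes on every run, so it is left out of the key.
        return (finding["Hostname"], finding["FullPath"])

    old = {key(f) for f in before}
    new = {key(f) for f in after}
    return {
        "resolved": sorted(old - new),
        "remaining": sorted(old & new),
        "new": sorted(new - old),
    }


first_scan = [
    {"Hostname": "server1", "FullPath": "/home/ubuntu/Apps/log4j-core-2.14.0.jar"},
    {"Hostname": "server2", "FullPath": "/example/elasticsearch/lib/log4j-core-2.11.1.jar"},
]
second_scan = [
    {"Hostname": "server2", "FullPath": "/example/elasticsearch/lib/log4j-core-2.11.1.jar"},
]
report = compare_scans(first_scan, second_scan)
print(len(report["resolved"]), "resolved,", len(report["remaining"]), "remaining")
```

Note that entries hidden by an ignore list also land in the "resolved" bucket, so the comparison is most meaningful between scans run with the same ignore list.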