Detecting Log4J on a global scale using collaborative security

Klaus Agnoletti

Playlists: 'MCH2022' videos starting here / audio

Utilizing collaborative security to collect data on attacks we were able to detect Log4J in a quite unusual but effective manner. We'll show you how CrowdSec enables the entire infosec community to stand together by detecting attempts to exploit a critical 0day, reporting them centrally thereby enabling anyone to protect themselves shortly after the vulnerability was made public. The unusual part is that this is done using FOSS software and by analyzing logs of real production systems but in a way that doesn't compromise the anonymity of anyone (except the attacker, of course) and doing so with a reliable result where poisoning and false positives are almost impossible. Too good to be true? Come by and judge for yourself!

The objective with the talk is to inspire the audience to understand why the world needs to think differently towards the threats of cyberattacks from criminals and which advantages it has when you’re really utilizing the power of the crowd.

Basically we’ve been doing it wrong until now by thinking that all the world’s problems can be solved by throwing money at them. Guess what: They can’t. Defending against hackers is a full time, complex task that requires a lot of complex tasks to be carried out in the same order and same way every time to be effective. That’s difficult to do so in order to make it more doable we should try working together. CrowdSec is FOSS software that does exactly this by enabling users to share information about current attacks by parsing log files and sharing basic information (anonymously) about the attack (source ip, timestamp, IoC) with the crowd.

CrowdSec could be perceived as a modern form of Fail2ban, though for Cloud and container-based infrastructure as well and capable of taking way more advanced decisions a lot faster. Mainly, it’s using a decoupled and distributed approach (detect here, remedy there) and an inference engine that leverages leaky buckets, YAML & Grok patterns to identify aggressive behaviors. It acquires signals from various data sources like files, syslogd, journald, AWS Cloudwatch and Kinesis, Docker logs and Windows Event Log, normalizes them, enriches them to apply heuristics and triggers a bouncer to deal with the threat, if need be. Since it’s written in Go, it’s compatible with almost any environment, fast in execution and ressource conservative.

To make sure signals are generally trustworthy we’ve implemented a reputation engine. Not only don’t we want any false positives - we also don’t want data to be poisoned. This is taken care of by a trust-ranking system where we assign a trust to each agent that will grow over time as the agent provides reliable signals. In this process both persistence and consistency is taken into account. In this process both persistence and consistency is taken into account. When an ip is voted for, it needs a certain amount of points based on the trust rank of each agent that has reported the ip. This system makes it expensive to poison collected data. Not only does it take a long time to reach a trust rank that makes any real difference - also diversity of AS NNumbers are being taken into account as well. The outcome of this is a reliable blocklist that’s constantly redistributed to network members in order to achieve a form of digital herd immunity. An ip caught aggressing WordPress sites will quickly be banned by all members who subscribed to the WordPress defense collection.

While CrowdSec is in charge of the detection, the reaction is performed by “bouncers” that aim to be deployable at any level of the applicative / infrastructure stack :
via nftables/iptables/pf based on an IP set
via nginx/openresty LUA scripting
via a Wordpress plugin
via a general PHP/Python/JS bouncer that works with all applications written in those languages
on Cloudflare or Fastly via our bouncer that integrates with the provider’s API
on AWS WAF via our bouncer that integrates with AWS’ API.
.. or in many other ways. Over time the possibilities will increase as the application design basically supports anything.

Bouncers can enforce several types of remediations, like blocking, sending a captcha, notifying, lowering rights, speed, sending a 2FA request, etc.

This approach, combined with a declarative configuration and a stateless behavior, makes it an efficient tool to enhance the security of modern stacks (containers, k8s, serverless and more generally automatically deployed infrastructures).

We are committed to building a strong community, with all that it implies :
a public hub to find, share and amend parsers, scenarios, and blockers
permissive open-source license (MIT) to stay business-friendly
and overall a strong commitment to transparency and community-first mentality, by tooling and behavior

In my talk I will dig into the technical nitty-gritty part of CrowdSec, the architecture and concepts and focus specifically on how we managed to collect data from live Log4J exploitation attempts using the crowd and how efficient this strategy turned out to be. CrowdSec is still collecting data and tracking the result on

The bigger the CrowdSec community becomes, the better protection against cyber criminals. So I really want to inspire the audience to engage in CrowdSec either by installing and using the software, to contribute documentation or code - or all of them. For the good of everybody!

Currently CrowdSec is collecting data from more than 45.000 agents in 158 countries. Each day more than 3M signals are collected. Over the last more than a year over 2.4M malevolent ips have been detected and verified.