- In 3.0 we introduced Investigations for digging into incidents. We’ve continued to build out its capabilities, because faster and more efficient investigating means faster time to incident resolution:
  - Investigations data can now be stored in partitions, so you can query only what you need to. We recommend partitioning based on “hostname” and “date” (YYYY-MM-DD) for the easiest integration with our console, but you can do you, too
  - We’ve added some useful data to queries, such as changes to file permissions and file metadata, and whether a file execution attempt succeeded
- Alerts where you want them, whether in the clouds or on the ground: we’ve added support for Google Pub/Sub for alert output, as well as MinIO for both alert output and Investigations data storage.
- The “file” detection had its retirement party, and the “fileAccess” and “fileMonitor” detections took its place. These new FIM detections should be less brittle (if only we could output alerts as peanut brittle). fileAccess alerts on reading files and their metadata, while fileMonitor looks for file modifications.
- We’ve added more granularity to our newFileExec detection, allowing filtering based on what program created the file
- You can now see whether a program is a network service or has any network services associated with it. This can be helpful in identifying reverse shells (shells on the target machine that communicate back to the attacker), like a netcat shell running a process status command
- We’ve made it easier for you to configure the number of concurrent kretprobes probing away. This helps improve data quality (i.e. fewer dropped events) on larger systems with more cores
- Metrics are now available for the number of events processed out of order, letting you keep an eye on conditions that can lead to false positives
- We’ve added the ability to configure detection rules based on current working directory
- You can now create rules related to a specific incident ID, letting you log all the things related to an incident in progress
- Reducing our footprint on computing resources:
  - Event processing uses less CPU
  - Per-process state tracking and kernel loading both use less memory
  - Syscall subscriptions are now kprobe subscriptions, resulting in the same data with less performance impact
- We’re gonna need a bigger buffer! To avoid dropping events on busy workloads when the buffer fills, we have increased perf ring buffer sizes. Process events get even bigger buffers.
- You can now send alerts and Investigations data to multiple blob storage outputs, sorting on parameters like the type of alert or data from a specific sensor.
- We’ve also added support for a directory structure in blob storage, so you can output to different parts of the same blob. (This all sounds like a horror movie but with more Linux.)
- When a container escape alert fires, we now tell you why
- To avoid false positives, we’ve improved the way we handle lost events…
- … but we also want fewer lost events! Long program names and long arguments no longer cause exec events to be lost.
- Upon starting, the sensor now does a self-health check to ensure that it can get the data it needs. And if it can’t, now it will tell you.
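
The “hostname” and “date” partitioning recommended above could be laid out as a storage key prefix. Here is a minimal Python sketch, assuming a Hive-style `key=value` layout and UTC dates. Neither assumption comes from these notes, and `partition_path` is a hypothetical helper, not part of the product:

```python
from datetime import datetime, timezone


def partition_path(hostname: str, timestamp: datetime) -> str:
    """Return a 'hostname=<host>/date=<YYYY-MM-DD>' partition prefix.

    Hypothetical helper: the key=value layout and the use of UTC are
    assumptions for illustration, not the product's documented format.
    """
    if not hostname or "/" in hostname:
        raise ValueError("hostname must be non-empty and contain no '/'")
    # Normalize to UTC so events near midnight land in a predictable partition.
    day = timestamp.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"hostname={hostname}/date={day}"
```

With a layout like this, a query for one host on one day only needs to read the objects under a single prefix, which is the point of partitioning.
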
Notable Bug Fixes
- Fixed an issue that was causing false positives in stack pivot detection
- When processing got backed up, events would be processed out of order, leading to false positives and false negatives. We’ve fixed this so that events are properly reordered. We’ve also added a warning log message for when events are processed out of order.
- We fixed a ton of other bugs, improving performance in many different areas, but we’ll spare you from having to read that entire list (unless you need help falling asleep)
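
To make the event reordering fix and the out-of-order metric more concrete, here is one common way to do it, sketched in Python: hold events in a min-heap for a short window, release them in timestamp order, and count (and warn about) any event that arrives too late to be slotted in. This is an illustration only, not the sensor’s actual implementation; `ReorderBuffer`, `window`, and `out_of_order` are hypothetical names.

```python
import heapq
import logging
from itertools import count

log = logging.getLogger("reorder")


class ReorderBuffer:
    """Hold events briefly and release them in timestamp order (illustrative)."""

    def __init__(self, window):
        self.window = window                  # how long to wait for stragglers
        self._heap = []                       # min-heap of (timestamp, seq, event)
        self._seq = count()                   # tie-breaker for equal timestamps
        self._max_seen = float("-inf")        # newest timestamp seen so far
        self._last_released = float("-inf")   # timestamp of last released event
        self.out_of_order = 0                 # metric: events that arrived too late

    def push(self, timestamp, event):
        """Add an event; return the events that are now safe to process."""
        if timestamp < self._last_released:
            # Too late to reorder: count it, warn, and pass it through.
            self.out_of_order += 1
            log.warning("event at %s processed out of order", timestamp)
            return [event]
        heapq.heappush(self._heap, (timestamp, next(self._seq), event))
        self._max_seen = max(self._max_seen, timestamp)
        return self._drain(self._max_seen - self.window)

    def flush(self):
        """Release everything still buffered, in timestamp order."""
        return self._drain(float("inf"))

    def _drain(self, watermark):
        released = []
        while self._heap and self._heap[0][0] <= watermark:
            timestamp, _, event = heapq.heappop(self._heap)
            self._last_released = timestamp
            released.append(event)
        return released
```

A larger `window` tolerates more delay at the cost of latency and memory. A climbing `out_of_order` count is the signal that detections may be working from a scrambled timeline.
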