How Flowmill Works

Flowmill runs an eBPF-based "kernel" collector, also called the agent, on each host (cloud instance, VM, or bare metal). The agent sends a detailed summary of network communication, collected directly from the operating system, to the cloud-based Flowmill service. It uses eBPF, an interface exposed by the Linux kernel, to automatically instrument the network stack and collect real-time data on every socket, along with the associated process and container metadata. Because this approach is targeted, the agent operates with negligible overhead, typically 0.25% per CPU core.

Additionally, Flowmill includes a number of optional metadata collectors. These include:

  1. Kubernetes Collector: Collects service, deployment, pod, and node updates from the Kubernetes APIs.

  2. AWS Collector: Collects information on network interfaces exposed in the VPC to label components such as ELBs, NAT Gateways, PrivateLink, etc.

Figure 1: Flowmill Architecture

The Flowmill agents encrypt and compress data before sending it to the Flowmill service. This 'raw' data must be matched across both sides of each connection, labeled and enriched with service names, and then aggregated and analyzed in real time. This step is critical because it turns low-level socket metrics into service-oriented and host-oriented views of system behavior.
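
The matching, labeling, and aggregation steps described above can be illustrated with a minimal Python sketch. It is not Flowmill's implementation: the SocketReport record, its fields, and the service_of lookup table are hypothetical names introduced only for this example, and it assumes each agent reports the local and remote address of a socket along with the bytes it sent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketReport:
    """One host's view of a connection (hypothetical record shape)."""
    host: str
    local: tuple    # (ip, port) on the reporting host
    remote: tuple   # (ip, port) on the peer
    bytes_sent: int


def connection_key(report):
    # Both endpoints see the same address pair in opposite order,
    # so sorting the pair yields a key shared by both sides.
    return tuple(sorted([report.local, report.remote]))


def match_sides(reports):
    """Group reports so both sides of each connection land together."""
    by_conn = defaultdict(list)
    for report in reports:
        by_conn[connection_key(report)].append(report)
    return dict(by_conn)


def aggregate_by_service(reports, service_of):
    """Label each report with service names and sum bytes per direction."""
    totals = defaultdict(int)
    for report in reports:
        src = service_of.get(report.local[0], "unknown")
        dst = service_of.get(report.remote[0], "unknown")
        totals[(src, dst)] += report.bytes_sent
    return dict(totals)


# Illustrative data: a client on host-a talking to a server on host-b.
reports = [
    SocketReport("host-a", ("10.0.0.1", 43512), ("10.0.0.2", 8080), 1200),
    SocketReport("host-b", ("10.0.0.2", 8080), ("10.0.0.1", 43512), 5400),
]
service_of = {"10.0.0.1": "frontend", "10.0.0.2": "backend"}

matched = match_sides(reports)
totals = aggregate_by_service(reports, service_of)
```

In this sketch, matched holds a single connection with two reports, one from each host, and totals holds one byte count per direction between the "frontend" and "backend" services. A production pipeline would also need to handle details this sketch ignores, such as address translation and reports that arrive from only one side.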

The resulting dataset can then be exposed through an API, a user interface, or various partner integrations.