Detecting Network Reliability and Performance Problems

Distributed, service-based applications are particularly vulnerable to network issues. When reliability and performance problems emerge, operators frequently find themselves asking, "Is it the network?" Flowmill connects real-time network statistics with metadata about containers and services to help separate software bugs from infrastructure issues.

Because Flowmill captures and presents metrics for each source and destination pair, it is possible to automatically monitor connections to managed services and third-party providers that may operate on the public Internet, as well as reliability within your cloud provider.
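To make the idea of pairwise metrics concrete, here is a minimal Python sketch that rolls per-connection samples up into a retransmission rate for each source and destination pair. It is an illustration only, not Flowmill's implementation or API: the FlowSample record, its field names, and the service names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FlowSample:
    """One hypothetical per-connection measurement (field names are illustrative)."""
    source: str        # e.g. a service or container name
    destination: str   # e.g. a managed service or third-party endpoint
    packets_sent: int
    retransmits: int


def pairwise_retransmit_rate(samples):
    """Aggregate samples into a retransmission rate per (source, destination) pair."""
    totals = defaultdict(lambda: [0, 0])  # pair -> [packets_sent, retransmits]
    for s in samples:
        pair = (s.source, s.destination)
        totals[pair][0] += s.packets_sent
        totals[pair][1] += s.retransmits
    # Guard against division by zero for pairs that sent no packets.
    return {
        pair: (retx / sent if sent else 0.0)
        for pair, (sent, retx) in totals.items()
    }


# Made-up sample data: two measurements for one pair, one for another.
samples = [
    FlowSample("checkout", "payments-api", 1000, 5),
    FlowSample("checkout", "payments-api", 1000, 15),
    FlowSample("checkout", "managed-db", 500, 0),
]
rates = pairwise_retransmit_rate(samples)
```

Keying the aggregation on the pair, rather than on the source or destination alone, is what lets a problem be attributed to one specific path, such as traffic to a single external provider.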

Flowmill can help identify a number of network issues affecting services including:

  • Packet Loss: Retransmissions in a TCP stream

  • Connection Failures: TCP SYN timeouts

  • DNS Timeouts: DNS requests without corresponding responses

  • Network Performance: Detailed metrics on network round-trip time can be correlated with application-layer latency to determine whether 'the network' is the root cause of a performance bottleneck.
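As an illustration of the DNS timeout signal described in the list above, the following Python sketch flags requests that never received a matching response within a timeout. It is a simplified sketch under stated assumptions, not how Flowmill detects timeouts: the DnsEvent record and the find_dns_timeouts function are hypothetical, and requests are matched to responses by query ID alone, whereas a real collector would also match on the client and server addresses and ports.

```python
from dataclasses import dataclass


@dataclass
class DnsEvent:
    """Hypothetical observed DNS message (field names are illustrative)."""
    timestamp: float  # seconds since the start of the capture
    query_id: int     # DNS transaction ID
    kind: str         # "request" or "response"


def find_dns_timeouts(events, timeout_s, now):
    """Return IDs of requests with no response that are older than timeout_s."""
    pending = {}  # query_id -> time the request was seen
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind == "request":
            pending[ev.query_id] = ev.timestamp
        elif ev.kind == "response":
            # A response clears the matching request, if one is outstanding.
            pending.pop(ev.query_id, None)
    # Anything still pending past the timeout counts as a DNS timeout.
    return sorted(
        qid for qid, sent_at in pending.items() if now - sent_at > timeout_s
    )


# Made-up events: query 1 is answered, query 2 is not, query 3 is still recent.
events = [
    DnsEvent(0.0, 1, "request"),
    DnsEvent(0.02, 1, "response"),
    DnsEvent(1.0, 2, "request"),
    DnsEvent(9.5, 3, "request"),
]
timeouts = find_dns_timeouts(events, timeout_s=5.0, now=10.0)
```

With this sample data only query 2 is reported: query 1 was answered, and query 3 has not yet been outstanding for longer than the timeout.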

Network reliability demonstration

DNS demonstration