Sieve: Towards Actionable Insights from Monitored Metrics in Microservices
Most distributed systems are constantly monitored to understand their current and prior state, and this monitoring is a crucial part of any system deployment. Many distributed applications today follow the microservices architecture: they are split into smaller services that can be deployed individually and that communicate with each other over well-defined, network-based APIs. In such deployments, the number of services and metrics can grow beyond the understanding of a single developer or operator. In this paper we present SIEVE, a metric reduction framework for microservices. SIEVE reduces the number of metrics that need to be considered. First, it automatically filters out unimportant metrics by observing their signal over time. Second, it uses a novel time-series clustering algorithm called K-Shape to group highly related metrics of a service and selects a representative metric from each group, reducing the overall number of metrics. Finally, it infers dependencies between service components using a predictive-causality model by testing for Granger Causality. We show that SIEVE's generic approach is useful in two case studies: auto-scaling and root-cause analysis in microservices.
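To make the reduction steps concrete, the following is a minimal sketch, not the SIEVE implementation. It illustrates two building blocks: dropping metrics whose signal barely varies over time, and the shape-based distance (one minus the maximum normalized cross-correlation over all shifts) that K-Shape uses to compare time series. The variance threshold, the function names, and the metric names are assumptions chosen for illustration; the clustering loop itself and the Granger Causality test are omitted.

```python
from math import sqrt
from statistics import pvariance


def filter_unvarying(metrics, min_variance=1e-3):
    """Keep only metrics whose variance exceeds a (hypothetical) threshold."""
    return {name: s for name, s in metrics.items() if pvariance(s) > min_variance}


def z_normalize(series):
    """Rescale a series to zero mean and unit variance (requires variance > 0)."""
    mean = sum(series) / len(series)
    std = sqrt(pvariance(series))
    return [(v - mean) / std for v in series]


def shape_based_distance(x, y):
    """Shape-based distance between two equal-length series.

    Computed as 1 - max over shifts of the normalized cross-correlation,
    so series with the same shape score near 0 regardless of scale.
    """
    x, y = z_normalize(x), z_normalize(y)
    n = len(x)
    norm = sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    best = max(
        # Overlapping part of x and y when y is shifted by `shift` samples.
        sum(x[i] * y[i - shift] for i in range(max(0, shift), min(n, n + shift)))
        for shift in range(-(n - 1), n)
    )
    return 1.0 - best / norm


# Hypothetical metrics of one service, sampled at the same timestamps.
metrics = {
    "cpu_usage": [1, 2, 4, 8, 4, 2, 1, 1],
    "requests_per_s": [10, 20, 40, 80, 40, 20, 10, 10],
    "open_fds": [5, 5, 5, 5, 5, 5, 5, 5],  # constant, carries no signal
}

kept = filter_unvarying(metrics)
distance = shape_based_distance(kept["cpu_usage"], kept["requests_per_s"])
```

Here `open_fds` is filtered out because it never changes, and the two remaining metrics have a distance close to zero because one is a scaled copy of the other. In a clustering step they would land in the same group, and one of them could serve as the representative.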