SRAudit: Auditing the Structural Reliability of the Clouds to Ward Off Correlated Failures

  • Chen R.
  • Ennan Zhai
  • Ruzica Piskac

Cloud computing systems rely heavily on redundancy techniques to ensure reliability. Nevertheless, as cloud systems become ever more structurally complex, infrastructure components, e.g., replica servers, may unwittingly share deep dependencies, such as aggregation switches. These unexpected common dependencies may result in correlated failures that undermine redundancy efforts. Although existing diagnosis tools offer post-failure forensics, they typically lead to prolonged failure recovery time. This paper presents SRAudit, a practical framework that aims to prevent correlated failures before cloud outages occur by allowing administrators to proactively audit the structural reliability of redundant systems of interest. SRAudit offers expressive, accurate and efficient structural reliability auditing by introducing three novel components: 1) a declarative domain-specific language, RAL, that enables administrators to easily write auditing programs expressing diverse auditing tasks; 2) a high-performance auditing engine that parses RAL programs and efficiently generates accurate auditing results by leveraging various verification tools (e.g., MinCostSAT and model counters); and 3) a repair engine that can automatically generate reliability improvement plans based on easily written specifications. Our experimental results show that SRAudit can determine the top-20 critical correlated failure root causes in a 70,656-node redundant system within 5 minutes, which is 300x more efficient in auditing time than previous systems.
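
To make the notion of a shared deep dependency concrete, the following is a minimal Python sketch, not SRAudit's actual method or the RAL language. It assumes a hypothetical dependency map (`DEPENDENCIES`, with invented component names) and uses brute-force enumeration in place of the MinCostSAT-based engine the abstract describes. It finds components common to all replicas and the minimal sets of components whose joint failure would disable every replica.

```python
from itertools import combinations

# Hypothetical dependency map (illustrative names, not from the paper):
# each replica server maps to the set of components it needs to stay up.
DEPENDENCIES = {
    "replica_a": {"tor_switch_1", "agg_switch_1", "power_1"},
    "replica_b": {"tor_switch_2", "agg_switch_1", "power_2"},
}


def shared_dependencies(deps):
    """Return the components that every replica depends on."""
    return set.intersection(*deps.values())


def minimal_risk_groups(deps, max_size=2):
    """Enumerate minimal component sets whose joint failure downs all replicas.

    Assumes a replica fails as soon as any one of its components fails.
    This is exhaustive search, so it only suits very small systems.
    """
    components = sorted(set.union(*deps.values()))
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(components, size):
            group = set(combo)
            # Skip supersets of a group already found: they are not minimal.
            if any(prev <= group for prev in found):
                continue
            # The group is a risk group if it touches every replica.
            if all(group & needed for needed in deps.values()):
                found.append(group)
    return found


if __name__ == "__main__":
    print("Shared by all replicas:", shared_dependencies(DEPENDENCIES))
    for group in minimal_risk_groups(DEPENDENCIES):
        print("Risk group:", sorted(group))
```

In this toy map the single aggregation switch is a risk group of size one, which is the kind of hidden single point of failure that redundancy alone does not remove. Exhaustive enumeration grows exponentially with the number of components, which is presumably why an engine at the scale reported in the abstract relies on solver-based techniques instead.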

Recent Publications

August 09, 2017

A Cloud Native Approach to 5G Network Slicing

  • Francini A.
  • Miller R.
  • Sharma S.

5G networks will have to support a set of very diverse and often extreme requirements. Network slicing offers an effective way to unlock the full potential of 5G networks and meet those requirements on a shared network infrastructure. This paper presents a cloud native approach to network slicing. The cloud ...

August 01, 2017

Modeling and simulation of RSOA with a dual-electrode configuration

  • De Valicourt G.
  • Liu Z.
  • Violas M.
  • Wang H.
  • Wu Q.

Based on the physical model of a bulk reflective semiconductor optical amplifier (RSOA) used as a modulator in radio over fiber (RoF) links, the distributions of carrier density, signal photon density, and amplified spontaneous emission photon density are demonstrated. One of the limits in the use of RSOA is the lower ...

July 12, 2017

PrivApprox: Privacy-Preserving Stream Analytics

  • Chen R.
  • Christof Fetzer
  • Le D.
  • Martin Beck
  • Pramod Bhatotia
  • Thorsten Strufe

How to preserve users' privacy while supporting high-utility analytics for low-latency stream processing? To answer this question, we describe the design, implementation and evaluation of PRIVAPPROX, a data analytics system for privacy-preserving stream processing. PRIVAPPROX provides three properties: (i) Privacy: zero-knowledge privacy (ezk) guarantees for users, a privacy bound tighter ...