StreamApprox: Approximate Computing for Stream Analytics

  • Chen R.
  • Christof Fetzer
  • Hilt V.
  • Le D.
  • Pramod Bhatotia
  • Thorsten Strufe

Approximate computing, a paradigm in which an approximate output is sufficient for a given application workflow instead of the exact output, is increasingly being adopted to accelerate jobs and use computing resources efficiently. The idea is to run the user's application over a representative sample of the input data rather than over the entire dataset, which lets users trade output accuracy for system efficiency. Existing advances in approximate computing are either limited to batch processing (where the input data remains unchanged during sampling) or not transparent (where users must design and implement application-specific data streaming algorithms for sampling). Thus, existing approaches are not well suited to supporting approximate computing for low-latency stream analytics in a transparent way. In this paper, we present StreamApprox, a stream analytics system for approximate computing that requires no modifications to existing applications. To realize this idea, we designed an online stratified sampling algorithm that produces approximate output with bounded error. The online algorithm facilitates a systematic trade-off between output accuracy and user-specified latency requirements. We implemented StreamApprox based on Apache Spark Streaming. Our evaluation using a set of micro-benchmarks and real-world case studies shows that StreamApprox achieves a 2x speedup over native Spark Streaming execution with an accuracy loss of less than 1%.
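To make the idea of online stratified sampling concrete, the following is a minimal sketch, not the StreamApprox implementation. It assumes each stream item carries a stratum key (for example, the sub-stream it came from), keeps one fixed-size reservoir per stratum using classic reservoir sampling, and estimates the total by weighting each sampled value by the ratio of items seen to items kept. The class name `StratifiedReservoirSampler`, its methods, and the `reservoir_size` parameter are hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


class StratifiedReservoirSampler:
    """Illustrative per-stratum reservoir sampler (hypothetical, not the paper's code)."""

    def __init__(self, reservoir_size, seed=None):
        self.reservoir_size = reservoir_size  # max items kept per stratum
        self.rng = random.Random(seed)
        self.samples = defaultdict(list)      # stratum -> sampled values
        self.counts = defaultdict(int)        # stratum -> items seen so far

    def add(self, stratum, value):
        """Process one stream item in O(1) time."""
        self.counts[stratum] += 1
        seen = self.counts[stratum]
        reservoir = self.samples[stratum]
        if len(reservoir) < self.reservoir_size:
            # Reservoir not yet full: keep every item.
            reservoir.append(value)
        else:
            # Reservoir full: the new item replaces a random slot
            # with probability reservoir_size / seen.
            j = self.rng.randrange(seen)
            if j < self.reservoir_size:
                reservoir[j] = value

    def estimate_sum(self):
        """Estimate the sum over all items seen, scaling each stratum by its weight."""
        total = 0.0
        for stratum, reservoir in self.samples.items():
            weight = self.counts[stratum] / len(reservoir)
            total += weight * sum(reservoir)
        return total


if __name__ == "__main__":
    sampler = StratifiedReservoirSampler(reservoir_size=100, seed=42)
    stream_rng = random.Random(7)
    exact = 0.0
    for _ in range(10_000):
        # Two sub-streams with very different arrival rates and value ranges.
        if stream_rng.random() < 0.9:
            stratum, value = "frequent", stream_rng.uniform(0, 10)
        else:
            stratum, value = "rare", stream_rng.uniform(100, 200)
        exact += value
        sampler.add(stratum, value)
    print("exact sum:    ", exact)
    print("estimated sum:", sampler.estimate_sum())
```

Sampling each stratum separately keeps rarely arriving sub-streams represented in the sample, which a single uniform reservoir over the whole stream would not guarantee. Raising `reservoir_size` in this sketch tightens the estimate at the cost of more work per window, which is the accuracy versus efficiency trade-off the abstract describes.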
