Location K-Anonymity under a Streaming Model

  • Andrews D.
  • Wilfong G.
  • Zhang Y.

In this paper we study how to achieve location anonymization under a streaming model. K-anonymity is a popular notion of privacy for which every recorded data point is identical to at least $k-1$ other data points. For location based data, k-anonymity refers to recording a coarse region that contains at least $k$ data points as the rough location of each of the data points in the region. One primary goal is to make sure the anoynimized location data is as accurate as possible, namely the size the recorded coarse region with $k+$ data points are as small as possible. Specifically, under our model, location data arrive online and are initially stored in a temporary memory a fixed size $m$. When or before this temporary memory becomes full, the data has to move to a permanent memory k-anonymized. Our problem formulation is motivated by applications such as location based advertising and disease tracking, in combination of certain legal requirement on privacy. We first show that in the {em competitive analysis framework} when an adversary specifies location data, any online algorithm can be arbitrarily bad in terms of the recorded region size. We therefore resort to assume the location distribution is known a priori. If the distribution is uniform, we consider a simple and natural algorithm {sc PickMax}, in which we prepartition the region into $m/k$ subregions uniform in size and pick the subregion with the largest occupancy whenever the buffer is full. We give a detailed analysis and show that the largest occupancy converges to $2k$. We therefore discover, perhaps somewhat unintuitively that it is sufficient to achieve k-anonymity if we partition the region into $2m/k$ subregions each of which is half the size. We also use simulation to compare the performance of {sc PickMax} and a variant {sc PickK} which records a subregion the moment it reaches occupancy $k$. Finally we apply simulation to a real wireless trace for which the distribution is not uniform. We partition the region in a quadtree form of nonuniform sizes and apply variants of {sc PickMax} and {sc PickK}.

Recent Publications

August 09, 2017

A Cloud Native Approach to 5G Network Slicing

  • Francini A.
  • Miller R.
  • Sharma S.

5G networks will have to support a set of very diverse and often extreme requirements. Network slicing offers an effective way to unlock the full potential of 5G networks and meet those requirements on a shared network infrastructure. This paper presents a cloud native approach to network slicing. The cloud ...

August 01, 2017

Modeling and simulation of RSOA with a dual-electrode configuration

  • De Valicourt G.
  • Liu Z.
  • Violas M.
  • Wang H.
  • Wu Q.

Based on the physical model of a bulk reflective semiconductor optical amplifier (RSOA) used as a modulator in radio over fiber (RoF) links, the distributions of carrier density, signal photon density, and amplified spontaneous emission photon density are demonstrated. One of limits in the use of RSOA is the lower ...

July 12, 2017

PrivApprox: Privacy-Preserving Stream Analytics

  • Chen R.
  • Christof Fetzer
  • Le D.
  • Martin Beck
  • Pramod Bhatotia
  • Thorsten Strufe

How to preserve users' privacy while supporting high-utility analytics for low-latency stream processing? To answer this question: we describe the design, implementation and evaluation of PRIVAPPROX, a data analytics system for privacy-preserving stream processing. PRIVAPPROX provides three properties: (i) Privacy: zero-knowledge privacy (ezk) guarantees for users, a privacy bound tighter ...