Youngsu Chae, Georgia Institute of Technology; Katie Guo, Dept. of Network Software, Lucent Bell Labs; Milind M. Buddhikot, Dept. of Network Software Research, Lucent Bell Labs; Subhash Suri, Department of Computer Science, University of California, Santa Barbara; Ellen Zegura, Georgia Institute of Technology
Abstract
In the current Internet, web content is increasingly being cached closer to the end-user to reduce network and web server load and improve performance. Existing web caching systems \cite{CommercialCaches} typically cache entire web documents and attempt to keep them consistent with the origin server. This approach works well for text and images; for bandwidth-intensive multimedia data such as audio and video, caching entire documents is not cost-effective and does not scale. An alternative approach is to cache parts of the multimedia stream on different caches in the network and coordinate stream playback from these independent caches. From the perspective of the clients, the collection of cooperating distributed caches acts as a single fault-tolerant, scalable cache. In this paper, we focus on data placement and replacement techniques for such cooperating distributed caches. Specifically, we propose the following new schemes that work together: (1) a family of two distributed layouts, RCache and Silo. The RCache layout is a simple, randomized, easy-to-implement layout that distributes constant-length segments of a clip among caches and provides modest storage efficiency. The Silo layout improves upon RCache; it accounts for long-term clip popularity and intra-clip segment popularity metrics and provides parameters to tune storage efficiency, server load, and playback switch-overs; (2) two novel local data replacement schemes, alpha-beta and Rainbow. The alpha-beta scheme uses simple thresholds to capture the macroscopic clip popularity and the microscopic segment popularity. The Rainbow scheme is more sophisticated and uses the concept of segment access potential, which accurately captures the popularity metrics; (3) Caching Token, a dynamic global data replacement or redistribution scheme that exploits existing data in distributed caches to minimize distribution overhead.
Our schemes optimize storage space, start-up latency, server load, network bandwidth usage, and overhead from playback switch-overs. Our analytical and simulation results show that the Silo scheme provides a cache hit ratio 3 to 7 times as high as that of a traditional web caching system while using the same amount of storage space.
(Proceedings of SPIE Annual Conf., Denver, Colorado, Aug. 2001)
Katie Guo, Dept. of Network Software, Lucent Bell Labs; Milind M. Buddhikot, Dept. of Network Software Research, Lucent Bell Labs; Youngsu Chae, Georgia Institute of Technology; Subhash Suri, Department of Computer Science, University of California, Santa Barbara
Abstract
Internet content is increasingly being cached closer to the end-user to reduce network and web server load and thereby improve performance and user-perceived quality. Existing web caching systems \cite{CommercialCaches} typically cache entire web documents and attempt to keep them consistent with the origin server. This approach works well for text and images; for bandwidth-intensive multimedia data such as audio and video, caching entire documents is not cost-effective and does not scale. An alternative approach is to cache parts of the multimedia stream on different caches in the network and coordinate stream playback from these caches. From the perspective of the clients, the collection of cooperating distributed caches acts as a single cache that is both scalable and fault-tolerant. This paper focuses on the design and evaluation of novel data placement and replacement techniques for such distributed caches. Specifically, we propose schemes in two categories: (1) RCache, a family of easy-to-implement, fault-tolerant multimedia data layout schemes; and (2) TwoD, a two-dimensional local data replacement scheme, based on the concept of segment access potential, that is applied at each cache.
Our schemes optimize storage space, start-up latency, server load, network bandwidth usage, and overhead from playback switch-overs. Our analytical and simulation results show that the RCache scheme provides a 30% higher cache hit ratio than a comparable traditional web caching system that has the same amount of storage space.
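Both abstracts describe RCache as a randomized layout that distributes constant-length segments of a clip among cooperating caches. The following is a minimal sketch of that general idea, not the layout defined in the papers: it assumes that each cache independently keeps each segment with a fixed probability. The function names and the `store_prob` parameter are hypothetical and chosen only for illustration.

```python
import random


def split_into_segments(clip_length_s, segment_length_s):
    """Return (start, end) offsets in seconds of constant-length segments.

    The last segment may be shorter if the clip length is not a multiple
    of the segment length.
    """
    segments = []
    start = 0
    while start < clip_length_s:
        end = min(start + segment_length_s, clip_length_s)
        segments.append((start, end))
        start = end
    return segments


def random_layout(num_segments, num_caches, store_prob, seed=None):
    """Assumed placement rule: every cache independently stores each
    segment with probability store_prob (hypothetical parameter).

    Returns one set of stored segment indices per cache.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(num_caches):
        stored = {s for s in range(num_segments) if rng.random() < store_prob}
        layout.append(stored)
    return layout


def missing_segments(layout, num_segments):
    """Segments held by no cache; these would have to come from the origin server."""
    covered = set().union(*layout)
    return [s for s in range(num_segments) if s not in covered]


# Example: a 10-minute clip in 60-second segments spread over 8 caches.
segments = split_into_segments(clip_length_s=600, segment_length_s=60)
layout = random_layout(len(segments), num_caches=8, store_prob=0.25, seed=1)
print("segments not cached anywhere:", missing_segments(layout, len(segments)))
```

In this sketch, raising `store_prob` lowers the chance that a segment is missing from every cache, at the cost of storing more copies overall. This is the kind of tradeoff between storage efficiency and server load that the abstracts say the layouts are designed to tune.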