Information networks and communication systems generate huge amounts of real-time and non-real-time data that are essential for functions ranging from network self-configuration and optimization to the discovery of emerging needs among network users.

To support these and other functions, and to extract insight from these massive flows of data, we draw on multiple disciplines in statistics and data science. Among these are statistical machine learning, Bayesian methods and graphical models, pattern recognition, and other forms of data analysis and forecasting.

At Bell Labs, our tradition of fundamental research driven by real-world applications has its roots in the 1930s with Walter Shewhart, and continued later with John Tukey and Robert Tarjan. Our research led to the design of the S language, which is implemented in S-PLUS and R, as well as local regression methods, extended linear models, and learning methods such as support vector machines, random decision forests, boosting, and convolutional networks for deep learning.

Dynamic system modeling and change detection methods applied to uncover anomalies in wireless networks.

Today, we focus on complex problems related to load balancing and security in distributed clouds, energy efficiency, wireless “HetNet” optimization, geo-location, and customer experience modeling. These include work on mapping indoor wireless signals using sensor fusion, and work by Jin Cao on inferring, in real time, how network conditions affect the video streaming experience.

Our industry has evolved on a path largely defined by the economics and technologies associated with Moore’s Law. As a result, we have the capacity to generate and process enormous amounts of data. As we progress into the era of Big Data, advances in the statistical and data sciences will be even more critical for harvesting the information and insights embedded within that data, and for defining the path forward. To this end, we are investigating various sampling methods, streaming algorithms, and parallel and distributed computing techniques.