A Generative Model for Predicting Outcomes in College Basketball

  • Perez-Cruz F.
  • Ruiz F.

Gambling is one of mankind's oldest activities, as evidenced by writings and equipment found in tombs and other places. The birth of probability theory is attributed to the 1654 work of Pascal and Fermat on the fair division of an interrupted game of chance (Hacking, 1975). In this paper, we aim to estimate probabilities in sports. Specifically, we focus on the March Madness tournament in college basketball, although the model is general enough to cover nearly any team sport, for both regular-season and play-off games (assuming that both teams are trying to win).

Estimating probabilities in sporting events is challenging, because it is unclear which variables affect the outcome and what information is publicly known before the games begin. Team sports are even more complicated, because information about individual players becomes relevant. Although there have been some attempts to model individual players (Miller, Bornn, Adams, and Goldsberry, 2014), there is no standard method to evaluate the importance of individual players and remove their contribution to the team when they do not play, are injured, or are suspended. It is also unclear whether individual player information can improve predictions without overfitting. For college basketball, even more variables come into play: there are 351 teams divided into 32 conferences, each team plays only about 30 regular-season games, and the match-ups are not random, so the results do not directly show the level of each team.

In the literature, we can find several variants of a simple model for soccer that identifies each team by its attack and defense coefficients (Maher, 1982; Dixon and Coles, 1997; Crowder, Dixon, Ledford, and Robinson, 2002; Baio and Blangiardo, 2010; Heuer, Müller, and Rubner, 2010). In all these works, the score of the home team is drawn from a Poisson distribution whose mean is the product of the home team's attack coefficient and the away team's defense coefficient. The score of the visiting team is an independent Poisson random variable whose mean is the visiting team's attack coefficient multiplied by the home team's defense coefficient. These coefficients are estimated by maximum likelihood from past results and are then used to predict future outcomes.
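
The attack/defense Poisson model described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in any of the cited works: the team names, the scores in `GAMES`, and all function names are hypothetical, and the alternating update in `fit` is just one simple way to climb the likelihood (each half-step sets the attack or defense coefficients to their exact maximizers given the other set). The coefficients are only identified up to a common scale factor, since multiplying every attack by a constant and dividing every defense by it leaves all means unchanged.

```python
import math

# Hypothetical past results: (home team, away team, home score, away score).
GAMES = [
    ("A", "B", 2, 1), ("B", "C", 1, 1), ("C", "A", 0, 2),
    ("B", "A", 1, 3), ("C", "B", 2, 2), ("A", "C", 1, 1),
]

def poisson_logpmf(k, lam):
    """Log-probability of observing k under a Poisson with mean lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_likelihood(games, attack, defense):
    """Home and away scores are independent Poisson variables."""
    total = 0.0
    for home, away, h_score, a_score in games:
        total += poisson_logpmf(h_score, attack[home] * defense[away])
        total += poisson_logpmf(a_score, attack[away] * defense[home])
    return total

def fit(games, n_iter=200):
    """Alternating maximization of the likelihood (assumes every team
    has scored and conceded at least once, so no coefficient hits zero)."""
    teams = sorted({t for g in games for t in g[:2]})
    attack = {t: 1.0 for t in teams}
    defense = {t: 1.0 for t in teams}
    for _ in range(n_iter):
        # Attack update: points scored / sum of opponents' defense coefficients.
        scored = {t: 0.0 for t in teams}
        opp_def = {t: 0.0 for t in teams}
        for home, away, h_score, a_score in games:
            scored[home] += h_score
            opp_def[home] += defense[away]
            scored[away] += a_score
            opp_def[away] += defense[home]
        attack = {t: scored[t] / opp_def[t] for t in teams}
        # Defense update: points conceded / sum of opponents' attack coefficients.
        conceded = {t: 0.0 for t in teams}
        opp_att = {t: 0.0 for t in teams}
        for home, away, h_score, a_score in games:
            conceded[home] += a_score
            opp_att[home] += attack[away]
            conceded[away] += h_score
            opp_att[away] += attack[home]
        defense = {t: conceded[t] / opp_att[t] for t in teams}
    return attack, defense

def outcome_probabilities(lam_home, lam_away, max_score=40):
    """P(home win), P(draw), P(away win), truncating scores at max_score."""
    p_home = p_draw = p_away = 0.0
    for h in range(max_score + 1):
        for a in range(max_score + 1):
            p = math.exp(poisson_logpmf(h, lam_home) + poisson_logpmf(a, lam_away))
            if h > a:
                p_home += p
            elif h == a:
                p_draw += p
            else:
                p_away += p
    return p_home, p_draw, p_away

attack, defense = fit(GAMES)
# Predict a future game with team A at home against team C.
probs = outcome_probabilities(attack["A"] * defense["C"], attack["C"] * defense["A"])
print("P(home win), P(draw), P(away win):", probs)
```

The truncation at `max_score` suits low-scoring sports such as soccer; for basketball-sized scores the bound would need to be much larger, and, as described above, the mean here contains no home-advantage term.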
