It’s my favorite time of year: fall, which means it’s time for college football. I have always loved college sports. Growing up, I lived in a Big Ten/SEC household and a Big East (now ACC) town, which meant a deluge of college sports filled the television screen from the first kickoff in August to the last buzzer-beater in April. Recently, analytics has come to dominate both football and basketball, but since it is football season, let’s start there.
The last two off-seasons in college sports have been abuzz with NIL, transfer portal, and conference realignment news. I think the sentiment among most fans is captured by Dr Pepper’s “Chaos Comes to Fansville” commercial. I began to notice that every conversation about conference realignment, in particular, was filled with speculation and fueled by gut feeling. There was, however, a common faith that some great and powerful college football Oz was crunching numbers to decide which team was worth adding to which conference. I still haven’t had the opportunity to meet this man behind the curtain, so until then I’d like to take a shot at proposing a data-driven conference realignment.
This is a four-part blog series that will hopefully serve as a fun way to learn some new data science tools:
- College Football Conference Realignment — Exploratory Data Analysis in Python
- College Football Conference Realignment — Regression
- College Football Conference Realignment — Clustering
- College Football Conference Realignment — node2vec
I’ll preface this post by saying there are many ways to perform exploratory data analysis, so I’ll cover only a few methods here that are relevant to conference realignment.
I took the time to build my own dataset using sources I compiled from across the web. These data include basic information about each FBS program, a non-canonical approximation of all college football rivalries, stadium size, historical performance, frequency of appearances in AP Top 25 polls, whether the school is an AAU or R1 institution (historically important for membership in the Big Ten and Pac-12), the number of NFL draft picks, data on program revenue from…