We Used Broadband Data We Shouldn’t Have — Here’s What Went Wrong

Over the summer, FiveThirtyEight published two stories on broadband internet access in the U.S. that were based on a data set made public by academic researchers who had acquired data from Catalist, a well-known political data firm. After further reporting, we can no longer vouch for the academics’ data set. The preponderance of evidence we’ve collected has led us to conclude that it is fundamentally flawed. That’s because:

  1. When compared with other sources, the academics’ data does not provide an accurate picture of broadband use at the county level.
  2. Some of the data that the academic researchers received from Catalist originated with a third-party commercial source, and Catalist acknowledged that it did not vet that data itself. The researchers and Catalist also disagree about what Catalist said the data represents and what it could be used for.

The idea behind the stories was to demonstrate that broadband is not ubiquitous in the U.S. today, even as more of our lives and the economy go online. We stand by this sentiment and the on-the-ground reporting in the two stories even though we have lost confidence in the data set. We should have been more careful in how we used the data to decide where to report our stories on inadequate internet access, and we were reminded of an important lesson: A data set is not necessarily reliable just because it comes from reputable institutions.

