What is public health? Various ideas of public health pervade our society, several of which have drifted into the political domain. This site is concerned with empiricism and correct data science, and it is bounded by available data.
Since much of what is currently presented as factual and/or scientific is neither, one of my bolder purposes for this website is to offer those willing to look at things themselves a chance to see what the available data actually indicate. Unfortunately, governments don’t always understand their own data, and sadly, some constituents and academics don’t care as long as they get paid. 🤦‍♂️
There are a lot of data out there that people, including some data scientists, have failed to understand. There are various reasons for this, and perhaps this meager website will begin to make a dent.
In a way, it is a bit like leadership. There is a difference between people in leadership positions and actual leaders, just as there is a difference between people who are “experts” and actual expertise. Expertise can be demonstrated, because there is functional utility in competence, but opinions only beget emotions, social media posts, news articles, and PowerPoint slides.
Reality is always the final arbiter, and either an airplane flies at the end of a runway, or it does not. Either code runs or it does not. Either the math is correct, or it is not. Either the predictions are right, or they are not. Either there is impact, or there is not. Either data are generalizable, or they are not. Wrong understandings invariably make everything seem more complicated than it is and eventually fail, and then things get worse.
The extent to which intellectuals and academics spectacularly mix this all up is something about which I may someday write a book.
Fine. But what do I mean by Public Health Data?
There are, in fact, massive health data resources in the public domain. It is very exciting.
Some of these datasets are preprocessed, already cleaned and prepared for analysis, and others have to be cleaned prior to running any meaningful inquiries. Examples of prepped data include sources such as the UCI Machine Learning Repository and Kaggle.
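To make "cleaned prior to running any meaningful inquiries" concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the rows are made up and do not come from any dataset listed on this site, and the column names (`county`, `year`, `obesity_pct`), the missing-value markers, and the `clean_rows` helper are hypothetical. Real extracts will have their own quirks.

```python
import csv
import io

# Hypothetical raw extract: made-up rows, NOT taken from any real dataset.
RAW = """county,year,obesity_pct
 Adams ,2020,31.2
Baker,2020,
Clark,2020,N/A
Adams ,2020,31.2
Dale,2020,28.5
"""

# Assumed markers for "no value"; real sources document their own codes.
MISSING = {"", "N/A", "NA", "*"}


def clean_rows(text):
    """Trim whitespace, skip rows with a missing value, and drop repeated county-year rows."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        county = row["county"].strip()
        year = row["year"].strip()
        value = (row["obesity_pct"] or "").strip()
        if value in MISSING:
            continue  # nothing to analyze in this row
        key = (county, year)
        if key in seen:
            continue  # same county and year already kept
        seen.add(key)
        cleaned.append(
            {"county": county, "year": int(year), "obesity_pct": float(value)}
        )
    return cleaned


rows = clean_rows(RAW)
for r in rows:
    print(r)
```

Of the five raw rows, only two survive: the rest are either missing a value or repeat a county and year already seen. Whether dropping a row is the right call, as opposed to flagging it or imputing a value, depends on the question being asked of the data.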
The myriad prepared data visualizations from various organizations are outside the scope of this site, because 90% of them amount to the data equivalent of ultra-processed food, and there’s really no telling what kind of bizarre ingredients reside under all the artificial coloring and lab-created crunch.
The content on this site demonstrates how to transform and explore available data, the sources of which are plentiful. Here is a partial list of what comes to mind:
U.S.
https://data.cms.gov/
https://seer.cancer.gov/
https://catalog.data.gov/dataset/
https://data.cdc.gov/500-Cities-Places/PLACES-Local-Data-for-Better-Health-County-Data-20/swc5-untb/data_preview
https://data.ed.gov/
https://www.cdc.gov/cdi/index.html
https://www.cdc.gov/yrbs/data/index.html
https://www.cdc.gov/brfss/index.html
https://www.cdc.gov/nchs/nhis/index.htm
https://www.cdc.gov/nchs/nhanes/index.htm
https://www.cdc.gov/nchs/index.html
https://cde.ucr.cjis.gov/
https://bjs.ojp.gov/data
International
https://ghdx.healthdata.org/
https://data.worldbank.org/
https://ourworldindata.org/
https://data.who.int/
https://api.dhsprogram.com/#/index.html
https://data.humdata.org/dataset
https://data.unicef.org/
https://globaldatabarometer.org/open-data/
https://www.data.gov.in/
https://ndap.niti.gov.in/
Those are some of them. There are many others.
As to what you will find when you start looking at these data, if you are reading these words...
I promise it will be a grand adventure!