Monday, October 6, 2025

Data, People, and the Limits of Clarity


Data promises clarity. It offers the hope of structure, insight, and confidence. But can data truly capture people? The answer is complicated. This article explores where data shines, where it falters, and what it means for how we use it in practice.


Statistics and the Signal in the Noise

At its heart, data analysis is about variance—differences in people, outcomes, and conditions. Statistics was built to make sense of those differences. Traditional epidemiology used probabilistic equations to describe distributions amid the noise: what portion of the variation in outcomes could be explained by certain predictors, and how much was simply chance.
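
To make that decomposition concrete, here is a minimal sketch in Python on simulated data (the predictor, effect size, and noise level are illustrative assumptions, not drawn from any real study): an outcome is generated partly from a measured predictor and partly from chance, and R² reports the share of variance the predictor explains.

```python
# A minimal sketch of the "explained vs. unexplained variance" idea on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
predictor = rng.normal(size=n)              # e.g., an exposure or covariate (hypothetical)
noise = rng.normal(scale=2.0, size=n)       # everything we did not measure
outcome = 1.5 * predictor + noise           # true signal plus chance

# Ordinary least squares fit of outcome on the predictor.
slope, intercept = np.polyfit(predictor, outcome, deg=1)
fitted = slope * predictor + intercept

ss_total = np.sum((outcome - outcome.mean()) ** 2)
ss_residual = np.sum((outcome - fitted) ** 2)
r_squared = 1 - ss_residual / ss_total      # share of variance "explained" by the predictor

print(f"R^2 = {r_squared:.2f}")             # roughly 0.36 here; the rest is noise
```

The point of the toy example is the split itself: one number for what the predictor accounts for, and a remainder that the model treats as chance.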

This “signal-to-noise” framing is central. Whether the variable in question is categorical or continuous, whether we’re looking at simple associations or complex interactions, the job of statistics is the same: describe patterns in a way that helps us decide what is most likely true for most people, most of the time.

But that phrasing is deliberate. Statistics never claims to capture all people, all of the time. In fact, the mathematics that makes statistical inference possible is built on precisely that limitation.


Statistics, AI, and Machine Learning: More Alike Than Different

The debates between statistics and machine learning can get heated, but at the foundation they are more alike than different. Both rely on the same mathematics—probability, optimization, distributions, variance.

The distinction lies in practice. Traditional statistics often required you to commit to an analysis plan before touching the data. AI and machine learning tend to be more iterative: throw a range of models, heuristics, and parameters at the dataset, see what performs best, and refine from there. That’s only a slight exaggeration.

Whether you’re fitting a logistic regression, running a Bayesian hierarchical model, or training a GAN, you’re still leveraging the same non-negotiable mathematical principles. That’s why all of these approaches “work” at all. The difference is whether you prioritize theory and planning, or performance and iteration.
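
To make the contrast concrete, here is a hedged sketch of the two workflows on the same simulated dataset, assuming scikit-learn as the tooling (the dataset and candidate models are illustrative, not the author's): the first block commits to one prespecified model, the second tries several candidates and keeps whichever cross-validates best. Underneath, both lean on the same probability and optimization machinery.

```python
# Sketch: a prespecified analysis vs. an iterative "try several, keep the best" workflow.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Simulated classification data standing in for a real study dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# "Statistics" style: one model, committed to before looking at performance.
prespecified = LogisticRegression(max_iter=1000)
print("Prespecified logistic regression:",
      cross_val_score(prespecified, X, y, cv=5).mean().round(3))

# "ML" style: try several candidates and keep whichever scores best.
candidates = {
    "logistic": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "boosting": GradientBoostingClassifier(random_state=0),
}
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in candidates.items()}
best = max(scores, key=scores.get)
print("Best of the candidates:", best, round(scores[best], 3))
```

Neither loop is more "mathematical" than the other; they differ in when you commit and how much you let the data steer the choice of model.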


Tree People and Forest People

Beyond the methods, there are human differences in how people approach data. In my experience across academia and the pharmaceutical industry, I’ve come to see two broad types:

  • Tree people: detail-oriented, precise, and focused on execution. They thrive on getting the minutiae right, even if the end result has limited impact.

  • Forest people: abstraction-oriented, big-picture thinkers. They are less concerned with local details and more interested in the cumulative impact at scale.

Neither orientation is inherently better. Both are forms of pattern recognition—just tuned to different levels of granularity. And both are needed. Organizations tend to sort themselves accordingly: the higher you go, the more abstract the vantage point; the closer you are to local execution, the more tactical the focus. Data science sits in the middle, trying to bridge detail with strategy.


Meaning, Measurement, and What Works

Language itself mirrors this tension. Words are symbols, placeholders for patterns of shared meaning. “Dog” doesn’t refer to one animal, but to an abstract category. Our ability to generalize through symbols is what allows us to draw connections from individual cases to broader patterns.

But science is not about symbols alone. It is grounded in what works. Physics is judged not by elegance but by predictive power—whether it can describe how an electron moves in an accelerator. Chemistry matters because reactions behave the same way every time under the same conditions.

The same is true in data science. Equations, algorithms, models: they only matter when they reliably predict and control outcomes in the observable world. A smartphone functions only because its circuitry, programming languages, and software rest on physical truths that work every single time.


The Bounds of Empiricism

That last point matters. Industrial and digital technologies are built on direct, local observations of the physical universe. Data from hard sciences is grounded in what can be seen, measured, and repeated. Theoretical abstractions may stretch into complexity, but their validity depends on whether they can be translated back into practical, predictive outcomes.

Science, then, is not a static set of conclusions or a matter of consensus. It is a structured process for iterating toward truths that were real long before we recognized them. It is about workable knowledge—what consistently explains and predicts what we observe.


Closing

Data is powerful. It gives us ways to see patterns in the noise, to generalize beyond the individual case, and to make decisions with confidence. But its limits are as important as its strengths. Models will never capture everyone, everywhere, all the time. People approach data differently, focusing on details or abstractions. And science itself is bounded by what can be observed, predicted, and replicated.

That is both the promise and the discipline of data: it tells us what works, and just as importantly, reminds us where clarity ends.


Jimmy Fisher
