2025-11-24

Data engineering is having a moment.

Everyone suddenly cares about pipelines, lineage and “AI foundations.” It still surprises me, mostly because these are the same things no one wanted to talk about for years. They were the unglamorous parts of data work, the plumbing behind the dashboards.

Now they’re headline topics again. Progress always circles back to fundamentals — and that’s a good thing, because nothing about AI works if the data doesn’t.

AI has compressed years of technology maturity into months. We went from “that’s interesting” to “deploy it everywhere” without stopping to learn what breaks and why. That’s the work now. And it isn’t happening in model architectures or tuning algorithms. Rather, it’s happening in data engineering, in the same fundamentals we’ve always needed: clean pipelines, solid governance, traceable lineage and systems that fail gracefully.

AI is just the latest technology to pass through that same cycle of discovery and disillusionment. What makes it stick are the principles data engineers have practiced for years.

Data engineering is infrastructure for the digital world. You don’t get credit when it works, but everything stops when it doesn’t. The job isn’t just moving data from point A to point B. It’s turning raw information into something that makes sense: adding context, shaping structure and creating the connective tissue that turns data into knowledge. Just look at the findings from a recent Snowflake and MIT Technology Review report, “Redefining Data Engineering in the Age of AI”: 72% of the 400 surveyed technology leaders deem data engineers integral to their business.

The work is invisible most of the time: saying no to shortcuts that will break later, tracing issues no one else sees and keeping systems alive through quiet discipline. That’s the craft.

How we learn what we don’t know

Every new technology follows the same pattern: excitement, confusion, failure and eventually understanding. AI is no different; it’s just moving faster.

We’re still learning what it can do, where it breaks and how to make it trustworthy. And that learning curve isn’t just technical — it’s cultural. It’s about how people share what they know and how organizations turn uncertainty into progress.

The hard part isn’t building these systems. It’s understanding what we truly know, what we only think we know and what we’ve never even questioned.

Recently, I came across a simple yet revealing framework often used for risk analysis: knowns and unknowns. It describes where we are with data and AI. It helps us see not just what we know but what we assume, ignore or forget to ask. It shows us where the real danger lives.

The 2x2 of reality

The “known knowns” model has been around for decades. It became famous when then-Secretary of Defense Donald Rumsfeld used it during a press briefing in 2002, but the idea goes back to psychology research from the 1950s, when Joseph Luft and Harrington Ingham created the Johari window, a way to think about what’s known to us, what’s known to others and what’s still hidden.

The model suits data and AI because it shows how people and systems actually learn.