← All writing
AIOps7 min read · LinkedIn

From notebook to production

The analysis works on your laptop. Now it needs to run every morning at 6am, survive a dependency bump, and not page you when it does. That gap — between "it ran once, for me" and "it runs every day, for everyone" — is where most data science value quietly dies. The model was never the hard part. The last mile is.

"It ran" is not "it runs"

A notebook that produced the right answer once proves almost nothing about whether it'll produce the right answer unattended next Tuesday. The notebook had you in the loop: you re-ran the failed cell, you remembered to update the date, you noticed the weird number and fixed it by hand. Production has none of that. Every bit of judgment you supplied live has to be made explicit, or it silently disappears.

So the work of productionizing isn't rewriting the logic. It's removing yourself from the runtime — finding every place you were the safety net and replacing yourself with something that runs on its own.

The checklist

Before any notebook earns a cron schedule, I run it through this:

  • Pinned environment. Exact dependency versions, captured and reproducible. "It worked with whatever was installed in March" is not an environment. A dependency bump should be a deliberate change, not a surprise at 6am.
  • Deterministic inputs. No "yesterday" computed from the wall clock in three different cells, no reaching into local files only you have. Inputs are parameters, passed in, the same every run.
  • Idempotent writes. Running it twice produces the same result as running it once — upserts, not blind appends. Reruns are routine in production; they shouldn't double your numbers.
  • Observability that means something. Not "it logged a lot." A clear signal of success or failure, and a record of what it produced, so a human can answer "did today's run go fine?" in ten seconds.
  • Alerting with a low false-positive rate. A page should mean act now. An alert that cries wolf gets muted, and a muted alert is the same as no alert the day it finally matters.

The mindset shift

The through-line is a shift from it produced output to it will keep producing correct output without me watching. That means designing for the bad day, not the demo: the upstream table that's late, the schema that drifted, the rerun after a failure. Most of the checklist is just making your own implicit habits survive your absence.

Earn the schedule

A one-off analysis and a production job look like the same code and are completely different artifacts. The difference is everything that lets the second one run without you. Get that right and an analysis stops being a thing you perform every morning and becomes a thing that simply happens — which is the only way its value ever compounds instead of evaporating the week you get busy.

Also asListensoonSlidessoonPodcastsoonVideosoon

Have data that should be doing more?

Tell me about the pipeline that breaks, the metric nobody trusts, or the analysis stuck in a notebook. Let's operationalize it.