Migrating 70 Python Libraries to uv

July 22, 2024

A small Python project has six dependencies. A medium one has twenty. A backend service in a five-year-old codebase has, give or take, seventy — and the transitive closure that follows from those seventy sits at three or four hundred. At that size, two specific things start to dominate the team's life in ways that nobody would have predicted at six.

This is a post about both of those things, and about migrating to uv, which fixes one of them and exposes the shape of the other clearly enough that you finally understand what you're paying for.

The honest framing, which I'll spend the rest of the post earning:

A 70-library Python tree is a chronically inconsistent constraint graph. The tools used to manage it are slow precisely when you need them most. And the language underneath doesn't catch the class of upgrade breakage that hurts the most: API-shape changes in transitive dependencies, which surface only at runtime, only after the full test suite has run, only after the cost of running it has been paid. uv changes the economics of the second problem. Nothing changes the economics of the third, short of switching languages.

That last sentence is what this post is really about. The migration steps and the wall-clock numbers are evidence. The deeper observation — about types, validation cost, and the long-run maintenance economics of a dynamic-language codebase — is the part that, in the end, reshaped how I thought about my own career.

The constraint graph is chronically inconsistent

Walk into any seventy-library Python project and look at what's in requirements.in. There's a web framework, an ORM, a queue client, a cache client, two HTTP libraries (because of course), a PDF library, an OCR library, a handful of cloud SDKs, three observability agents, half a dozen format parsers, and a long tail of small utilities. Each one is maintained by a different group of people on a different release cadence with a different sense of when breaking changes are acceptable.

A handful of those direct dependencies share a transitive sub-dependency — call it lib-E. Maybe it's urllib3 underneath every HTTP client. Maybe it's cryptography underneath every package that signs or verifies anything. Maybe it's pydantic v1 underneath your ORM and v2 underneath your newer validation code. Whatever it is, it's the kind of foundational lib that everyone's transitive closure pulls in, and not all of them want the same version.

    flowchart TB
        PROJ["your project<br/>(70 direct deps)"]
        PROJ --> A["lib-A<br/>(web framework)"]
        PROJ --> B["lib-B<br/>(legacy auth)"]
        PROJ --> C["lib-C<br/>(http client)"]
        PROJ --> D["lib-D<br/>(pdf generator)"]
        PROJ --> F["lib-F<br/>(cache client)"]
        A --> E["lib-E"]
        B --> E
        C --> E
        D --> E
        F --> E
        B -. "wants lib-E < 2.0<br/>(hasn't been updated in 14 months)" .-> E
        A -. "happy with lib-E >= 2.2" .-> E
        C -. "happy with lib-E >= 2.2" .-> E

This is the canonical pattern. Five direct dependencies all need lib-E. Four of them have been updated recently and are happy with lib-E >= 2.2. One of them (lib-B, which hasn't been touched in 14 months) has a hard upper bound at lib-E < 2.0. There's only one lib-E per Python environment, so the resolver has to pick a single version that satisfies every constraint, and the most restrictive bound wins: the resolver backtracks to older releases of the other four that still accept lib-E 1.x, and the whole tree gets dragged back with them.

You can't upgrade lib-E because lib-B would break. You can't upgrade lib-B because the maintainer hasn't shipped a release. You're stuck — not because pip is bad, not because uv is good, but because the ecosystem has produced a constraint graph that doesn't have a satisfying answer at the version you actually want. This is a property of the Python world, not of the tooling.
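The deadlock can be sketched as a toy model. This is only an illustration of intersecting version constraints, not how pip or uv resolve; the lib-E release list and the assumption that older lib-A and lib-C releases accepted lib-E >= 1.8 are invented for the example.

```python
# Illustrative only: a toy model of the lib-E conflict, not a real resolver.

def parse(version: str) -> tuple[int, ...]:
    """Turn '2.1' into (2, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

# Hypothetical published releases of lib-E.
LIB_E_RELEASES = ["1.8", "1.9", "2.0", "2.1", "2.2", "2.3"]

# Each dependent's constraint on lib-E, as a predicate over parsed versions.
CONSTRAINTS = {
    "lib-A": lambda v: v >= (2, 2),
    "lib-B": lambda v: v < (2, 0),   # the stale upper bound
    "lib-C": lambda v: v >= (2, 2),
}

def satisfying(releases, constraints):
    """Return the releases that every constraint accepts."""
    return [r for r in releases if all(ok(parse(r)) for ok in constraints.values())]

# With the current releases of lib-A and lib-C, nothing satisfies everyone.
latest_everywhere = satisfying(LIB_E_RELEASES, CONSTRAINTS)

# Assumption: older releases of lib-A and lib-C accepted lib-E >= 1.8.
older_a_and_c = {
    **CONSTRAINTS,
    "lib-A": lambda v: v >= (1, 8),
    "lib-C": lambda v: v >= (1, 8),
}
dragged_back = satisfying(LIB_E_RELEASES, older_a_and_c)

print(latest_everywhere)  # []
print(dragged_back)       # ['1.8', '1.9']
```

The empty first result is the conflict; the second is what "dragged back to lib-E 1.x" looks like once the other dependents are downgraded to match.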

Finding the next-best workable permutation is the actual job: "is there a lib-B' I can swap to that doesn't pin lib-E so aggressively?", "can I drop the feature lib-B provides and use something else?", "is lib-E 2.1 close enough to make lib-B happy and the rest of the tree workable?" It is, in essence, manual SAT-solving against a noisy, evolving graph.

The slow resolver makes that job miserable. uv makes it tolerable. Neither of them solves it.

Every upgrade has a fixed validation cost

Even when you find the workable permutation, you can't merge the change without verifying that nothing broke. And here's where Python's dynamic typing turns into the dominant cost.

A library upgrade can change three categories of things:

  1. API-shape changes — a function got renamed, a kwarg got removed, a class moved to a different module, a return type changed from dict to list[dict].
  2. Behavioral changes — same signatures, different output. The retry logic now backs off at 100ms instead of 50ms. The serializer now emits ISO timestamps with microseconds instead of milliseconds.
  3. Bug fixes — pure improvements; semantically transparent.

In a typed language, the first category is caught at compile time. The compiler refuses to build until every call site that used the renamed function gets updated. You don't even need a test suite to learn about it. In Python, all three categories survive the build. The only way to find out which of your hundred call sites is now broken is to run the test suite. End to end. For every upgrade. Regardless of whether the change was a one-line patch or a major-version bump.
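A minimal sketch of the first category surviving the "build". The fetch function and its renamed keyword are hypothetical stand-ins for a library API that changed shape in an upgrade.

```python
# Hypothetical library function after an upgrade: the keyword `timeout`
# was renamed to `timeout_s`. All names here are invented for illustration.
def fetch(url, *, timeout_s=5.0):
    return f"GET {url} (timeout {timeout_s}s)"

# A call site written against the old signature.
CALL_SITE = "fetch('https://example.invalid/health', timeout=3)"

# Step 1: byte-compilation succeeds; Python does not check keyword names here.
code = compile(CALL_SITE, "<call-site>", "eval")

# Step 2: the mismatch only appears when the call is actually executed.
try:
    eval(code, {"fetch": fetch})
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```

Nothing fails until the line runs, which is why only a test that exercises this exact call site would report the breakage.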

    flowchart TB
        BUMP["Bump one direct dep"]
        BUMP --> RESOLVE["Re-resolve the tree<br/>(possibly bumps 30 transitive deps)"]
        RESOLVE --> COMPILE{"Did the build succeed?"}
        COMPILE -- "yes (always, in Python)" --> TESTS["Run the full test suite<br/>(only way to know if anything broke)"]
        TESTS --> RESULT{"All green?"}
        RESULT -- "yes" --> SHIP["Merge"]
        RESULT -- "no" --> DEBUG["Find which transitive change broke<br/>which call site"]
        DEBUG --> RESOLVE

The fixed cost per upgrade is the entire test suite, regardless of the size of the upgrade. That's the thing teams instinctively work around by batching upgrades: "we'll do all the dep bumps in one big PR at the end of the quarter, run tests once, and ship together." That is exactly the practice that produces slow CVE response times and security debt that compounds quietly until it doesn't.

The constraint graph and the validation cost compound. A team that has to validate every upgrade against a full test suite does fewer upgrades. A team that does fewer upgrades has more drift in its constraint graph. More drift means more incompatibilities pile up. More incompatibilities means each eventual upgrade is a bigger change and a riskier validation. The treadmill speeds up, then breaks.

What uv actually changes (and what it doesn't)

uv is a Rust rewrite of the Python packaging stack (resolver, installer, lockfile manager) that ships as a single static binary. In this migration it was faster than pip and pip-tools by roughly an order of magnitude on the operations that matter, and its conflict diagnostics are dramatically better than pip's.

What uv changes:

  • Resolution wall-clock, from minutes to seconds. A 400-package tree that took pip-tools three to four minutes to resolve takes uv three to five seconds.
  • Iteration economics. When re-resolving costs five seconds, you can ask "what if I bump lib-A to v3?" — see the result — undo — try "what if I drop lib-B entirely?" — see that result. You can run ten experiments in the time pip-tools would have done one. The constraint graph, which used to be opaque because exploring it was expensive, becomes legible.
  • Conflict diagnostics. When the resolution fails, uv tells you which constraint conflicts with which other one, with the version ranges and the package paths spelled out. pip's "could not find a version that satisfies the requirement" becomes, in paraphrase, "lib-B requires lib-E < 2.0; lib-C requires lib-E >= 2.2; these constraints are mutually unsatisfiable."
  • Reproducibility. The optional project mode produces a universal lockfile (uv.lock) that captures resolutions across every supported Python version and platform simultaneously. The macOS-arm64 vs Linux-x86_64 drift class of bugs disappears.

What uv does not change:

  • The constraint graph itself. If lib-B pins lib-E < 2.0, neither pip nor uv can magic that pin away. The graph is what the maintainers committed.
  • The validation cost. Running the test suite still costs what it costs. uv resolves in five seconds, but if your test suite takes nine minutes, the upgrade still takes nine minutes plus five seconds.
  • The dynamic-typing tax. API-shape breakage still surfaces only at runtime, and only when the test suite exercises the broken call site. The compiler still does not help.

The accurate way to describe the win, then: uv compresses the tooling part of the upgrade loop until the language's part is the entire bottleneck. This is genuinely useful — it makes the actual cost legible — but it doesn't make the actual cost go away.
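A back-of-envelope version of that claim, using only the resolver and test-suite figures quoted in this post. The midpoint of "three to four minutes" is an assumption, and human review time is ignored, so treat the shares as rough.

```python
# Figures taken from the post; the 3.5-minute midpoint is an assumption.
TEST_SUITE_S = 9 * 60          # nine-minute test suite
RESOLVE_BEFORE_S = 3.5 * 60    # pip-tools: "three to four minutes"
RESOLVE_AFTER_S = 5            # uv: about five seconds

def loop(resolve_s, tests_s=TEST_SUITE_S):
    """Return (total seconds, fraction of the loop spent in tests)."""
    total = resolve_s + tests_s
    return total, tests_s / total

before_total, before_share = loop(RESOLVE_BEFORE_S)
after_total, after_share = loop(RESOLVE_AFTER_S)

print(f"before: {before_total / 60:.1f} min, tests are {before_share:.0%} of the loop")
print(f"after:  {after_total / 60:.1f} min, tests are {after_share:.0%} of the loop")
```

The total shrinks, but the test suite's share of what remains climbs toward the whole loop.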

The migration in steps

The migration was a week of staged changes, each one reversible, each one merged on its own. None of these are exotic; the value is in the staging order, not the commands.

    flowchart TB
        S0["Day 0 — audit existing<br/>requirements.in / .txt files"]
        S0 --> S1["Day 1 — install uv,<br/>swap pip-compile for uv pip compile"]
        S1 --> S2["Day 2 — verify resolved<br/>requirements.txt matches"]
        S2 --> S3["Day 3 — update CI to use<br/>uv pip sync"]
        S3 --> S4["Day 4 — update Dockerfile<br/>and dev docs"]
        S4 --> S5["Optional — adopt uv project mode<br/>(pyproject.toml + uv.lock)"]

Day 0 — audit

Before changing anything, list every requirements file in the repo and what each one is for:

    requirements.in        # production direct deps (~70 entries)
    requirements.txt       # production resolved (~400 packages)
    dev-requirements.in    # dev/test additions (~25 entries)
    dev-requirements.txt   # dev resolved
    constraints.txt        # version pins applied across projects

Two things often surface here that are worth fixing before the migration so they don't become uv-flavored mysteries afterwards: lines in .in files with no version specifier (which means each resolve picks up whatever the latest is), and constraint files that are honored inconsistently across local and CI installs.
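The first of those checks is easy to script. The following is a rough sketch, not a full requirements parser: it assumes simple name-based lines, skips option lines such as -e and -c, and the sample entries are invented.

```python
import re

# Matches a requirement name (with optional extras) at the start of a line.
NAME_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)(\[[^\]]*\])?\s*(.*)$")

def unpinned(lines):
    """Return names of requirements that carry no version specifier."""
    found = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()     # drop comments
        if not line or line.startswith("-"):    # skip blanks and options (-e, -c, -r)
            continue
        match = NAME_RE.match(line)
        if not match:
            continue
        name, _extras, rest = match.groups()
        rest = rest.split(";", 1)[0].strip()    # ignore environment markers
        if not rest:                            # nothing after the name: no specifier
            found.append(name)
    return found

# Invented sample standing in for a requirements.in file.
sample = [
    "# production direct deps",
    "requests>=2.31,<3",
    "pyyaml",
    "uvicorn[standard]  # no bound",
    "lxml ; python_version >= '3.9'",
    "-c constraints.txt",
    "cryptography==42.0.5",
]
print(unpinned(sample))  # ['pyyaml', 'uvicorn', 'lxml']
```

In a real repo the list would come from reading each .in file line by line; anything it reports is a candidate for an explicit bound before the migration starts.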

Day 1 — swap the compile step

    # install uv (one static binary, no Python required)
    curl -LsSf https://astral.sh/uv/install.sh | sh

    # before:
    pip-compile --output-file=requirements.txt requirements.in --generate-hashes

    # after:
    uv pip compile requirements.in -o requirements.txt --generate-hashes

uv pip compile is a drop-in for pip-compile. The --generate-hashes flag matters; it produces a fully-hashed lockfile that's reproducible the same way pip-tools' was.

This is the moment that makes the migration feel different. The compile that had taken three to four minutes finished in under five seconds. I re-ran it three times before I trusted the output.

Day 2 — verify

The output of uv pip compile is not guaranteed to be byte-identical to pip-tools'. Sort order and tiebreaks on equally-valid versions can differ. Diff and reconcile:

    uv pip compile requirements.in -o requirements.uv.txt --generate-hashes
    diff requirements.txt requirements.uv.txt | head -50

The differences are usually cosmetic (header comments, sort order) plus the occasional case where pip-tools picked an older version that uv picks newer. Walk through the substantive ones; accept or constrain as needed.
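To separate the substantive differences from the cosmetic ones, it can help to compare only the name==version pins. This is a small sketch under the assumption that both files use plain pinned lines; the package versions and hash values in the sample are placeholders.

```python
import re

# A pinned line looks like: name[extras]==version, optionally followed by " \".
PIN_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?==([^\s;\\]+)")

def pins(text):
    """Map normalized package name -> pinned version, ignoring comments and hashes."""
    result = {}
    for line in text.splitlines():
        match = PIN_RE.match(line.strip())
        if match:
            # Normalize names the way package indexes do (PEP 503).
            name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
            result[name] = match.group(2)
    return result

def substantive_diff(old_text, new_text):
    """Return {name: (old_version, new_version)} for pins that differ."""
    old, new = pins(old_text), pins(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }

# Placeholder lockfile contents; the hashes are not real.
old_lock = "\n".join([
    "# autogenerated by pip-compile",
    "certifi==2024.2.2 \\",
    "    --hash=sha256:aaaa",
    "urllib3==1.26.18 \\",
    "    --hash=sha256:bbbb",
    "    # via requests",
])
new_lock = "\n".join([
    "# autogenerated by uv",
    "urllib3==2.2.1 \\",
    "    --hash=sha256:cccc",
    "certifi==2024.2.2 \\",
    "    --hash=sha256:aaaa",
    "idna==3.7",
])
print(substantive_diff(old_lock, new_lock))
```

Header comments and sort order drop out entirely; what remains is the list of versions to accept or constrain.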

Day 3 — CI

    # before
    pip install -r requirements.txt

    # after
    uv pip sync requirements.txt

uv pip sync makes the environment match the lockfile exactly, removing packages that aren't listed. This is the right semantics for CI; it makes the build reproducible regardless of what was in the cache.
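The difference between "install" and "sync" can be pictured as a set computation. This is a conceptual model only, not uv's implementation, and the package names and versions are made up.

```python
def plan_sync(installed: dict[str, str], locked: dict[str, str]):
    """Compute the changes needed to make `installed` equal `locked`."""
    # Anything present in the environment but absent from the lockfile goes away.
    to_remove = sorted(name for name in installed if name not in locked)
    # Anything missing, or present at the wrong version, gets (re)installed.
    to_install = sorted(
        name for name, version in locked.items() if installed.get(name) != version
    )
    return to_install, to_remove

# Hypothetical environment left over from an earlier job.
installed = {"requests": "2.31.0", "urllib3": "1.26.18", "ipython": "8.24.0"}
# Hypothetical contents of requirements.txt.
locked = {"requests": "2.31.0", "urllib3": "2.2.1", "certifi": "2024.2.2"}

to_install, to_remove = plan_sync(installed, locked)
print(to_install)  # ['certifi', 'urllib3']
print(to_remove)   # ['ipython']
```

A plain pip install -r would perform only the install half and leave the stray package in place, which is how a cached environment drifts away from the lockfile.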

CI install times went from ~90 seconds (cache hit) to ~12 seconds. The existing pip cache stayed mounted; uv has its own cache layout but they coexist without complaint.

Day 4 — Dockerfile and dev docs

    # before
    RUN pip install --no-cache-dir -r requirements.txt

    # after
    COPY --from=ghcr.io/astral-sh/uv:0.3.0 /uv /usr/local/bin/uv
    RUN uv pip sync --system requirements.txt

The --system flag tells uv to install into the system Python rather than a virtual environment — appropriate inside a container where the system Python is the project's Python.

For local dev, the README updated to a single command — uv pip sync requirements.txt dev-requirements.txt — instead of the old two-pip-install dance.

Optional — project mode

After a few weeks of stability on the basic migration, you can adopt uv's project mode (pyproject.toml as canonical, uv.lock as the universal lockfile). This step is not required to capture the speed wins — those landed on day 1. Project mode buys you cross-platform reproducibility and a tidier dependency-group setup; it's a real improvement, but not the part that pays for the migration.

The gotchas

Six small surprises that came up during the migration. None blocked anything; all are worth knowing so you don't lose an hour.

  • Editable installs of the project itself. pip install -e . syntax (-e . in requirements.in) is identical in uv. Behavior on Windows differs slightly — uv uses a .pth file rather than a setup.py develop shim — but the end result is the same import semantics.
  • Native-package builds. cryptography, psycopg2-binary, lxml, numpy and similar packages contain native code, and may be compiled on install when no prebuilt wheel matches the platform. uv builds them in isolated environments and caches the results under different cache keys than pip. If your CI cached pip's .cache/pip/wheels, add uv's cache directory (~/.cache/uv by default) to the CI cache config.
  • Hashes by default. uv pip compile --generate-hashes produces a hashed lockfile; uv pip sync enforces hashes the way pip install --require-hashes did. The supply-chain reviewer's question — "are these dependencies pinned with hashes?" — becomes a one-word answer.
  • Index URLs and private mirrors. uv does not read pip.conf / pip.ini or pip's PIP_* environment variables. It accepts --index-url / --extra-index-url flags and the UV_INDEX_URL env var. If your private mirror is configured only in pip.conf, export UV_INDEX_URL in CI.
  • Lockfile semantics in project mode. requirements.txt is platform-specific; uv.lock is universal. The two represent different mental models. You can keep requirements.txt as the deployment artifact even after adopting project mode if your deploy pipeline wants that format.
  • Python interpreter discovery. uv prefers a "managed" Python (one it has installed itself via uv python install) over a system Python. On dev machines with pyenv already in charge, set UV_PYTHON_PREFERENCE=only-system so uv uses the active pyenv Python.

What the numbers actually were

Five measurements at the points that matter:

Metric                                        | pip-tools (before) | uv (after)  | Speedup
----------------------------------------------|--------------------|-------------|--------
Lockfile compile, cold (~70 deps, ~400 tree)  | 3–4 minutes        | 3–5 seconds | ~50×
Fresh CI install (cache cold)                 | ~3 minutes         | ~25 seconds | ~7×
Fresh CI install (cache warm)                 | ~90 seconds        | ~12 seconds | ~7×
CVE bump cycle (compile → CI green)           | ~12 minutes        | ~3 minutes  | ~4×
Local dev env from clean clone                | ~4 minutes         | ~30 seconds | ~8×

The wall-clock numbers are not the most interesting part. The most interesting part is what they enabled.

CVE patching went from "something we batch up because it costs half a day" to "the Dependabot PR is already green by the time anyone gets to it." Constraint-graph experiments went from "we'll think about that next sprint" to "give me an hour." And — quietly — supply-chain reviews became one-word answers, because the lockfile, the hashes, and the resolver were all the same shape and all defensible.

But the test suite still took nine minutes. And every upgrade still triggered it. uv didn't change that part of the cost — and that part is what the rest of the post is really about.

What this taught me about types

Sit with the migration for a few months and the ratio of where time goes shifts in a way that's worth noticing.

Before uv: maybe 40% of upgrade-cycle time was the resolver, 50% was the test suite, 10% was the human looking at diffs. After uv: maybe 5% was the resolver, 85% was the test suite, 10% was the human. The resolver compressed away. The test suite didn't.

That ratio is a precise statement about where the real cost of maintaining a Python dependency tree comes from. It comes from the language not being able to tell you, at compile time, which of your hundred call sites just broke. Every other cost in the loop can be optimized — caching, parallel installation, faster wheels, pre-built layers. The "run the entire test suite to find out which transitive bump renamed a kwarg" cost is structural. It's the language's contract with you.

In a statically-typed language with a strong dependency story — Rust is the example I lived with after this — that cost is dramatically smaller, for exactly the failure mode that dominates dependency upgrades. When a Rust crate renames a function, removes a method, or changes a signature, your build fails. You don't run the test suite to find that out; the compiler tells you in seconds, with the file and line number. The test suite is reserved for what it's actually good at: catching behavioral bugs, semantic regressions, edge cases the type system can't express.

Being honest about what Rust doesn't catch is important — it doesn't catch behavioral changes (a function with the same signature now returns different values), and it doesn't catch every runtime concern. So the gap between Python and Rust isn't "tests vs no tests"; it's "tests catch behavior, compiler catches API shape." But the dominant failure mode of a transitive dependency upgrade is API-shape breakage. That's the failure mode the Rust compiler eats for free. That's the failure mode the Python test suite has to brute-force.

Multiply that across years of upgrades on a long-lived service and the maintenance economics tilt heavily. It's not that Python is unmaintainable — it's that the per-upgrade cost is bounded below by something the language has decided to never give you.

I had been writing Python for almost the entire decade of my career when this migration happened. I had been working with Rust on a smaller side-project for less than a year. The honest realization, sitting with the new numbers and the unchanged test-suite cost, was that I had been paying a structural tax I'd never accounted for, and that the tax was not going to go away as long as I was working in dynamic-language ecosystems. uv made the Python tax bearable. It did not eliminate it, and it could not.

That's the calculus that, eventually, tipped me toward Rust full-time. The migration to uv was the last thing that had to land before I could see the cost clearly.

What stays the same

A subtle and good property of the migration: nothing about how the team thought about dependencies changed. requirements.in is still the canonical declaration. requirements.txt is still the lockfile. CI still syncs. Procurement still reviews the same artifact. uv slid underneath the existing mental model and made the slow part fast.

That's why the migration was safe to do in a week. The conceptual layer above the tooling was identical, which meant every engineer's existing knowledge was directly portable. Nothing had to be relearned.

The deeper observation — that the tooling is downstream of the language, that the language sets a floor on maintenance cost that no tool can lift — is what stayed with me afterwards. It's what made me read the next year's career options through a different lens. uv was the right migration to do. Rust was the right direction to start moving in. Both can be true, and both came out of the same project.
