By Bryon Kroger, Defense Opinion Writer.
Software factories were supposed to fix defense tech. They promised a faster, leaner path to warfighter-ready software. But instead of operating like real factories, many slid into innovation theater—promising greater agility and speed, building internal platforms and polishing slide decks—while rarely shipping software that users wanted and missions needed.
The root of the problem? Software factories are measuring the wrong things, or worse, not measuring at all.
At Kessel Run, the Department of Defense’s first modern “software factory,” we coined the term to resonate with leaders accustomed to physical production. But we also meant it literally: a system of people, process, and technology built to continuously deliver valuable software that warfighters love.
That definition implied discipline, measurement and accountability. In manufacturing, cycle time, defect rates and customer satisfaction scores are foundational measures of performance and efficiency. Metrics drive improvement.
So, if we’re going to keep calling them factories, we need to start measuring them like factories. That means linking delivery performance to mission impact, because there is no value in shipping code weekly if it fails to advance mission metrics or creates a poor user experience.
It’s time for “Software Factory 2.0,” a model built on continuous delivery, outcomes measured in production, real-world validation and alignment with mission impact.
Metrics that matter
To measure software factories, start with a new value equation: performance over cost and schedule.
Performance = mission impact. Did the software help achieve an operational goal? Did it improve stability and reliability?
Cost includes not just dollars spent, but the cost of delay and instability.
Schedule = throughput. How fast and often are you shipping to production?
To optimize this value equation, every software factory should be able to continuously deliver impactful software that warfighters love. This requires measuring software delivery and mission performance.
The DevOps Research and Assessment (DORA) team, a part of Google that helps companies improve software delivery and operations performance, offers the clearest benchmark for delivery performance.
Their gold-standard metrics are:
Deployment frequency – How often are changes deployed to production?
Change lead time – How long does it take a code change to go from the decision to make it to its deployment in production?
Change fail percentage – How often do deployments cause failures in production?
Failed deployment recovery time – How long does it take to fix a failed deployment?
Reliability – Does the system consistently perform as expected?
When teams track these metrics, they gain visibility into the real cost of inefficiency: long lead times, rework and fragile deployments. Improving DORA metrics means shipping high-quality, secure software earlier and more often. DORA metrics offer a neutral, data-driven assessment of delivery performance.
The DoD should also hold fly-offs between competitors to spotlight what works and force contractors and program offices to support claims with results. Competition breeds excellence, and right now software factories need less posturing and a lot more performance.
Efficacy before efficiency
Efficiency only matters if you’re building the right thing. As Peter Drucker said, “There is nothing quite so useless as doing with great efficiency something that should not be done at all.” If a software factory doesn’t move the mission forward, it should not be funded. Efficacy must come before efficiency.
Software efficacy begins with a clear, measurable mission impact. If the goal is to cut the time to produce a master air attack plan by 10%, that’s the benchmark. After delivery, measure it. It’s the only way to know if the software worked. Most government programs either don’t measure delivery performance at all, allow metrics to be self-reported, or stop measuring at delivery, conflating shipping code with solving problems.
Meaningless milestones as metrics
The culture underpinning defense software development needs a reset. Start by abandoning poorly defined milestones like minimum viable product and minimum viable capability releases—new labels for the same old waterfall requirements.
The only milestone that matters is net value release. If the latest software update provides net value to the mission (effectiveness), and if it delivers more value than it cost to build, deploy and implement (efficiency), then and only then should it be shipped.
Too many programs still reward activity over impact: slide decks over shippable code, intentions over outcomes. Even when they aim higher, they still evaluate price as cost per hour instead of the value equation’s outcomes per unit of time, per dollar.
It’s time to evolve. Rebuild around what works. And hold every program accountable for continuously delivering war-winning software that warfighters love to use.
Bryon Kroger is founder and CEO of Rise8, a government technology software company based in Tampa. He co-founded Kessel Run, the Department of Defense’s first software factory, where he served as chief operating officer.