The Hidden Cost of AI Infrastructure Lock-In — and the Engineering Bet Behind Oxmiq Labs

Organizations building enterprise AI infrastructure have discovered, often belatedly, that GPU hardware pricing is not the primary cost driver of their compute spend. The deeper expense is structural: a dependency on a single vendor’s software platform that constrains hardware procurement, inflates operational costs, and reduces the strategic flexibility of every technical decision downstream. That dependency is called CUDA. The company working to resolve it at the architectural level is Oxmiq Labs, founded in 2023 by Raja Koduri — an engineer and executive with more than two decades of direct experience designing the GPU systems at the center of this problem.


When Hardware Prices Are Not the Real Issue

NVIDIA’s GPU hardware is priced at a premium. That much is well-documented. What receives less attention is why competing hardware — some of it technically competitive in specific workload categories — has not broken that pricing power despite years of market presence.

The answer is not that NVIDIA’s chips are irreplaceable. They are not. The answer is that the software written to run on NVIDIA hardware is not easily portable. CUDA, NVIDIA’s parallel computing platform and programming model, has been the default execution environment for GPU-accelerated workloads since 2006. Machine learning frameworks, scientific computing libraries, and production inference pipelines were built on top of it — not as a deliberate lock-in strategy by individual developers, but as the organic result of years of framework development in an environment where CUDA was the most capable, most documented, and most widely supported option.

The result is that enterprises evaluating non-NVIDIA GPU hardware do not face a simple question of price per compute unit. They face a more complex question: what is the cost of migrating software that was written for a CUDA execution environment to run on a different hardware target? In most production contexts, that cost — measured in engineering time, regression risk, and toolchain disruption — exceeds the projected hardware savings. The lock-in is not contractual. It is architectural.


The Mechanics of Software-Driven Procurement Lock-In

Understanding how CUDA creates hardware lock-in requires understanding how AI software is actually written. Developers building machine learning systems typically work in Python, using frameworks such as PyTorch or TensorFlow. Those frameworks contain layer upon layer of CUDA-specific optimization: memory management routines, kernel libraries, profiling integrations, and hardware acceleration hooks that assume NVIDIA GPU architecture is present and available.

This is not an implementation detail. It is load-bearing infrastructure. When a machine learning engineer writes a training script, they are not choosing NVIDIA — they are using a framework that has already made that choice, embedded it in its core libraries, and optimized for it across years of production use. The CUDA dependency is invisible to the developer precisely because it is so deeply integrated.

This invisibility is what makes the lock-in structurally durable. A developer who has never explicitly chosen CUDA has nonetheless written code that cannot easily run without it. Asking that developer to switch hardware vendors is not a request to evaluate a new chip. It is a request to revalidate every dependency in their stack against a different execution environment — at their own engineering expense, on their own timeline, without any guarantee that the performance characteristics they have optimized for will be preserved.
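The shape of that invisible dependency can be sketched in a few lines. This is a toy illustration only — not real PyTorch or CUDA API; every name in it is hypothetical — but it shows how a framework default can embed a vendor choice the developer never writes down:

```python
# Toy sketch: a framework default embeds a hardware choice.
# "cuda" stands in for the vendor-specific backend a real framework
# would auto-select; none of these names are real library APIs.

AVAILABLE_BACKENDS = {"cuda": True, "cpu": True}  # what the install detected

def default_device() -> str:
    """Mimic a framework preferring the accelerator when present."""
    return "cuda" if AVAILABLE_BACKENDS.get("cuda") else "cpu"

def train_step(batch, device=None):
    """The developer names no vendor; the framework already chose one."""
    device = device or default_device()
    # ...per-device kernel dispatch would happen here...
    return f"ran on {device}"

print(train_step([1, 2, 3]))  # the CUDA dependency is implicit
```

The developer's script contains no reference to NVIDIA at all, yet every execution of it assumes the CUDA backend exists — which is exactly why switching hardware is invisible work until it isn't.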

For cloud providers, hyperscalers, and enterprises managing large-scale AI programs, this constraint translates directly into procurement leverage for NVIDIA. When there is no credible portability layer, the price of the hardware is, functionally, the price of the only hardware that works.


What Previous Attempts Revealed

The GPU industry has not lacked for well-funded attempts to address this dynamic. Advanced Micro Devices invested substantially in ROCm, its open-source GPU software platform, as a path to making Radeon hardware viable for AI and compute workloads. Intel, under Raja Koduri's leadership as Chief Architect and Executive Vice President of the Architecture, Graphics and Software (IAGS) division, developed oneAPI — a unified programming model designed to abstract AI workloads across heterogeneous hardware targets.

Both efforts produced real technology. ROCm has a developer community and meaningful deployment in certain high-performance computing contexts. oneAPI represents a serious attempt at heterogeneous compute abstraction. Neither has displaced CUDA as the default developer choice.

Koduri was not a distant observer of Intel’s effort — he led it. That operational experience produced a precise diagnosis of where the strategy encountered friction. Building a new programming model and asking developers to adopt it requires those developers to invest time they do not have, accept compatibility risks they cannot afford, and trust a platform that lacks the years of production validation CUDA carries. The barrier is not technical capability. It is developer inertia compounded by switching cost.

The lesson Koduri drew from that experience directly shaped the Oxmiq Labs approach: the solution cannot ask developers to change anything.


Oxmiq Labs: Portability Without Migration

Oxmiq Labs, headquartered in San Francisco and founded in 2023, is a GPU software and intellectual property startup. Its core focus is CUDA workload portability — enabling Python-based AI applications written for CUDA execution environments to run on non-NVIDIA GPU hardware without modification to the application code.

This is a fundamentally different strategy from the approaches that preceded it. ROCm and oneAPI both required, at some level, that developers engage with a new platform. They offered migration paths, compatibility layers, and porting guides — tools that acknowledged a gap between where the code was written and where it needed to run. Oxmiq’s architecture does not acknowledge that gap. It eliminates it. The developer’s code does not move. The infrastructure underneath it does.
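The strategy can be pictured as a translation layer that satisfies the application's request for a CUDA device rather than asking the application to request something else. The sketch below is purely conceptual — it is not Oxmiq's actual mechanism, and every name in it is hypothetical — but it captures the inversion: the code above the line stays fixed, the mapping below it changes:

```python
# Conceptual sketch of a compatibility layer: the application still
# asks for "cuda"; the layer resolves that request to whatever
# hardware is actually present. All names here are hypothetical —
# this is NOT Oxmiq's API.

class Device:
    def __init__(self, kind: str):
        self.kind = kind

# Hypothetical substitution table maintained by the infrastructure,
# not by the application developer.
BACKEND_MAP = {"cuda": "riscv-gpu"}

def get_device(requested: str) -> Device:
    """Resolve a device request; unknown kinds pass through unchanged."""
    return Device(BACKEND_MAP.get(requested, requested))

# Unmodified application code: it believes it is targeting CUDA.
dev = get_device("cuda")
print(dev.kind)  # resolved to the substituted backend
```

The design point is that the substitution table lives in the infrastructure, so migrating hardware becomes an operations change rather than a code change.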

The company builds on RISC-V — an open, royalty-free instruction set architecture not owned or controlled by any single commercial entity. That foundation matters for the software strategy above it. A RISC-V base carries no proprietary architectural constraints on the compatibility layer that sits on top of it, no licensing encumbrances that limit how the execution environment can be constructed, and no legacy design assumptions inherited from an era when AI compute workloads did not exist.

For a company building software infrastructure intended to be hardware-agnostic, starting on an architecturally unconstrained, openly licensed foundation is not incidental. It is a prerequisite.


The Infrastructure Problem Oxmiq Is Actually Solving

Framed from an enterprise perspective, the problem Oxmiq Labs targets has three distinct layers.

The first is procurement leverage. Organizations that cannot run CUDA workloads on non-NVIDIA hardware cannot negotiate GPU pricing with multiple vendors. A credible portability layer restores that leverage by making hardware substitution technically viable.

The second is supply chain resilience. The surge in enterprise AI demand that accompanied large-scale language model deployment exposed a structural vulnerability: GPU supply is concentrated, lead times are long, and allocation is prioritized toward the largest customers. Organizations with no alternative compute path are fully exposed to that supply risk. Hardware portability is also supply chain diversification.

The third is long-term infrastructure flexibility. AI infrastructure built exclusively on NVIDIA GPU architecture is infrastructure that is fully dependent on NVIDIA’s product roadmap, pricing decisions, and ecosystem choices for every future capability upgrade. Organizations that solve the portability problem now retain the ability to evaluate future hardware on technical and commercial merit rather than ecosystem compatibility.

All three of these problems are downstream of the same root cause: software that cannot move. Oxmiq Labs is building the mechanism that makes it portable.


The Depth of Experience Behind the Design Decisions

Raja Koduri's educational background — a bachelor's degree in electronics and communications engineering from Andhra University and a Master of Technology from the Indian Institute of Technology (IIT) Kharagpur — established the systems engineering foundation that GPU architecture work demands. The IIT Kharagpur program, one of the most rigorous engineering graduate programs in Asia, grounded his work in the computational and electrical principles that govern how hardware and software interact at the architectural level.

His career, spanning senior roles at ATI Technologies, Advanced Micro Devices, Apple, and Intel, built on that foundation through direct operational experience. As Senior Vice President and Chief Architect of AMD’s Radeon Technologies Group, Koduri worked within the organization most directly competing with NVIDIA across consecutive GPU product generations — observing, at close range, how CUDA ecosystem effects translated into commercial outcomes regardless of hardware performance. At Apple, he encountered a hardware-software integration model that offered a different kind of lesson: what is possible when the software stack is designed from the beginning to fit the hardware it runs on, rather than adapted to it after the fact.

Beyond his role at Oxmiq Labs, Koduri serves in advisory and board capacities for leading semiconductor and AI companies. That professional positioning keeps him embedded in the industry’s current technical and commercial developments — and gives Oxmiq’s technology a path to evaluation and adoption that runs through relationships built across decades of senior technical leadership.


The Addressable Outcome

If CUDA workload portability on RISC-V-based hardware becomes technically reliable and commercially available, the consequences extend beyond any individual company’s GPU procurement decision. Cloud providers gain a credible negotiating alternative for GPU infrastructure. Chip manufacturers who have designed non-NVIDIA GPU silicon gain a software path to developer adoption that does not require the developer community to change its tools. Enterprises gain infrastructure flexibility that has not existed, in practical terms, since CUDA became the default AI compute environment.

None of this is guaranteed by the existence of Oxmiq Labs or the quality of its technical approach. Execution at the infrastructure level is difficult, adoption requires trust that is built through validation rather than claimed through specification, and the commercial timeline for platform-level changes in enterprise technology is measured in years.

But the problem is real, its costs are quantifiable, and the organizations most affected by it are actively looking for a credible solution. Raja Koduri has spent his career understanding exactly why previous attempts at that solution fell short — and building the architectural clarity to approach it differently.


About Raja Koduri

Raja Koduri is an Indian-American computer engineer, technology executive, and founder with more than two decades of experience in GPU architecture and computing platform development. He holds a bachelor’s degree in electronics and communications from Andhra University and a Master of Technology from the Indian Institute of Technology (IIT) Kharagpur. Koduri has held senior roles at ATI Technologies, Advanced Micro Devices (AMD), Apple, and Intel, where he served as Chief Architect and Executive Vice President of the Architecture, Graphics and Software division. In 2023, he founded Oxmiq Labs Inc., a San Francisco-based GPU software and IP startup focused on enabling CUDA workloads to run on non-NVIDIA hardware through RISC-V-based designs and open software frameworks. He also serves in advisory and board capacities for leading semiconductor and AI companies.