
Lumai Launches Lumai Iris for Real-Time LLM Inference

Apr 29, 2026

Optical computing server family includes Nova, Aura and Tetra, with Iris Nova available for evaluation
Lumai Iris

OXFORD, UK, Apr 29, 2026 – Lumai has introduced Lumai Iris, an optical computing inference server family built to run billion-parameter large language models (LLMs) in real time as data centers face rising power and scalability limits from AI deployment.

Lumai Iris is designed for large-scale AI inference workloads and uses optical computing for core AI operations. Lumai said the system can deliver up to 90% lower energy consumption than conventional architectures, while improving inference performance and execution efficiency.

The Energy Wall

AI compute demand is shifting from model training to large-scale inference, where models are deployed in real-world applications. As inference workloads grow, data centers must manage tighter power limits, scalability constraints, and rising demand for more efficient compute systems.

Lumai Iris Server Rack

Lumai cited International Energy Agency projections that global data center power demand will double by 2030. The company said the Iris family is intended to address power and cost pressures in AI infrastructure by improving performance per kilowatt.

The Silicon Ceiling

Traditional silicon architectures face scaling, power, and thermal constraints as AI workloads grow. Each new silicon generation delivers smaller performance gains while demanding more power and higher cost to support larger workloads.

“As the industry transitions into the inference era, we are simultaneously crossing the threshold into the post-silicon era,” said Dr. Xianxin Guo, CEO and co-founder of Lumai. “By shifting the computation paradigm from electrons to photons, Lumai can deliver an order-of-magnitude increase in performance with significant energy savings.”

A New Architecture for AI Compute

Lumai is building on research from the University of Oxford to develop an optical computing architecture for AI inference workloads. The technology performs computation with light in a 3D volume, rather than being confined to the 2D plane of conventional chip architectures.

The architecture is designed to exploit spatial parallelism, running millions of operations simultaneously. Lumai says this approach supports low-cost, high-throughput token processing in compute-bound workloads, and applies to the prefill stage of disaggregated inference architectures, where prompt tokens are processed before generation begins.
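To see why prefill is the compute-bound stage, contrast it with decode: prefill processes every prompt token in one large matrix multiply, while decode generates one token at a time against a cached state. The toy single-layer attention sketch below illustrates that split; all shapes, names, and the NumPy backend are illustrative assumptions, not Lumai's implementation.

```python
import numpy as np

# Toy single-layer attention contrasting the two phases of
# disaggregated inference. Shapes and names are illustrative only.
d_model = 64
rng = np.random.default_rng(0)
W_qkv = rng.standard_normal((d_model, 3 * d_model)) * 0.02

def prefill(prompt_embeddings):
    """Compute-bound phase: one dense matmul over ALL prompt tokens
    at once -- the kind of large linear-algebra operation an optical
    tensor engine would target."""
    qkv = prompt_embeddings @ W_qkv              # (n_tokens, 3*d_model)
    q, k, v = np.split(qkv, 3, axis=-1)
    scores = q @ k.T / np.sqrt(d_model)          # (n_tokens, n_tokens)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, (k, v)                   # output + KV cache

def decode_step(token_embedding, kv_cache):
    """Memory-bound phase: one token at a time, reusing the keys and
    values cached during prefill."""
    k_cache, v_cache = kv_cache
    qkv = token_embedding @ W_qkv
    q, k, v = np.split(qkv, 3, axis=-1)
    k_cache = np.vstack([k_cache, k])
    v_cache = np.vstack([v_cache, v])
    scores = q @ k_cache.T / np.sqrt(d_model)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ v_cache, (k_cache, v_cache)

prompt = rng.standard_normal((16, d_model))      # 16 prompt tokens
out, cache = prefill(prompt)                     # one batched matmul
next_tok = rng.standard_normal((1, d_model))
out2, cache = decode_step(next_tok, cache)       # token-by-token
```

In a disaggregated deployment, the two functions above would run on separate hardware pools sized for their different bottlenecks, which is what makes the prefill pool a natural target for a throughput-oriented accelerator.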

Lumai Iris Nova

Iris Nova runs real-time inference on Llama 8B and 70B through a hybrid processor. Digital processing manages system control and software, while an optical tensor engine performs the core mathematical operations. The design supports integration into data center environments.
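One way to picture the hybrid split described above, with digital logic orchestrating the model while a tensor engine handles the heavy linear algebra, is a dispatch layer that routes matrix multiplies to an accelerator backend. Everything in this sketch (the class names, the NumPy stand-in for the optical engine) is a hypothetical illustration of the pattern, not Lumai's software stack.

```python
import numpy as np

class TensorEngine:
    """Hypothetical accelerator interface: only the dense matmul is
    offloaded. A real optical tensor engine would replace this class."""
    def matmul(self, a, b):
        return a @ b  # stand-in for an offloaded matrix multiply

class HybridRunner:
    """Digital side of the split: owns control flow, nonlinearities,
    and state, and delegates core tensor ops to the engine."""
    def __init__(self, engine, weights):
        self.engine = engine
        self.weights = weights          # list of layer weight matrices

    def forward(self, x):
        for w in self.weights:
            x = self.engine.matmul(x, w)  # offloaded to the engine
            x = np.maximum(x, 0.0)        # ReLU stays on digital logic
        return x

engine = TensorEngine()
runner = HybridRunner(engine, [np.eye(4), np.eye(4)])
y = runner.forward(np.ones((2, 4)))
```

The design choice this illustrates is that only the regular, high-arithmetic-intensity operations cross the boundary to the accelerator, while branching, scheduling, and software integration remain on conventional digital hardware.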

“The demands on existing AI processors necessitate an urgent search for alternative scaling pathways,” said Suraj Bramhavar, program director at the Advanced Research and Invention Agency (ARIA). “Lumai is leading the charge in demonstrating that optical processors could provide one such pathway, and ARIA is excited to partner with them to explore the shift beyond our traditional digital computing paradigm.”

Availability

Lumai Iris consists of three server lines: Nova, Aura, and Tetra. Lumai Iris Nova, the first server in the family, is available for evaluation by hyperscalers, neo-clouds, enterprises, and research institutions.

Source: Lumai

About Lumai

Lumai develops optical computing hardware for AI workloads, building processors that use light to perform AI inference and related computing tasks while reducing energy use. The technology originated from optics research at the University of Oxford; the company was founded in 2021 and is headquartered in Oxford, United Kingdom. Its products target data center operators, cloud service providers, and other organizations running large-scale AI systems, focusing on applications where traditional electronic processors face power and performance constraints. Lumai aims to address the scalability and efficiency challenges associated with silicon-based GPUs and existing photonic approaches, and to lower the operating cost of AI processing through improved power efficiency. The company serves customers in AI infrastructure, cloud computing, and advanced research.