
Researchers at UCLA have created a hybrid optical-digital model that shifts portions of generative AI from electrons to photons, aiming to reduce energy use and accelerate image synthesis, IEEE Spectrum reports. Rather than relying entirely on electronic diffusion models, their system encodes noise with a digital network and then passes the result through an optical “diffractive processor” that decodes it into an image using controlled phase patterns.
The mechanism relies on spatial light modulators (SLMs). The first SLM displays a phase “seed” produced by the digital network; laser light then passes through a second SLM that acts as a diffractive decoder. The light interferes and forms an image that is captured by a sensor. Because image formation happens optically, the system produces a picture essentially in a single step, at the speed of light.
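As a rough illustration of that optical pipeline, the sketch below simulates the two-SLM arrangement with a standard angular-spectrum (scalar diffraction) propagation model. Every parameter here, along with the random phase patterns standing in for the network's seed and the trained decoder, is a placeholder assumption, not the UCLA team's actual design or values.

    import numpy as np

    def angular_spectrum_propagate(field, wavelength, pixel_pitch, distance):
        """Propagate a complex optical field over a distance using the
        angular-spectrum method (a standard scalar diffraction model)."""
        n = field.shape[0]
        fx = np.fft.fftfreq(n, d=pixel_pitch)
        fx, fy = np.meshgrid(fx, fx)
        # Free-space transfer function; evanescent components are suppressed.
        arg = 1.0 - (wavelength * fx) ** 2 - (wavelength * fy) ** 2
        h = np.exp(1j * 2 * np.pi * distance / wavelength * np.sqrt(np.maximum(arg, 0.0)))
        return np.fft.ifft2(np.fft.fft2(field) * h)

    # Hypothetical tabletop parameters (illustrative only).
    n = 256               # SLM resolution, pixels per side
    wavelength = 520e-9   # green laser, meters
    pitch = 8e-6          # SLM pixel pitch, meters
    z = 0.05              # propagation distance between planes, meters

    rng = np.random.default_rng(0)
    phase_seed = rng.uniform(0, 2 * np.pi, (n, n))      # stand-in for the digital network's output
    decoder_phase = rng.uniform(0, 2 * np.pi, (n, n))   # stand-in for the trained diffractive decoder

    # SLM 1: encode the seed as a phase-only pattern on a plane wave.
    field = np.exp(1j * phase_seed)
    field = angular_spectrum_propagate(field, wavelength, pitch, z)

    # SLM 2: the diffractive decoder applies its own phase pattern.
    field = field * np.exp(1j * decoder_phase)
    field = angular_spectrum_propagate(field, wavelength, pitch, z)

    # Sensor: only the intensity of the interfered light is recorded.
    image = np.abs(field) ** 2

The point of the sketch is the structure, not the numbers: the heavy lifting between the two phase planes is performed by light propagation itself, which is why the decode step costs essentially no digital computation.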
The team built two versions. The “snapshot” model yields images in a single optical pass, offering speed advantages. The “iterative” version refines its output over multiple passes, achieving higher image quality and richer backgrounds. In tests, both models generated monochrome and color images, from handwritten digits and fashion items to more artistic scenes, that closely resembled those produced by conventional diffusion models.
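Purely as a schematic of the snapshot-versus-iterative distinction, and continuing from the sketch above, the iterative mode can be pictured as repeating the optical decode stage several times. The article does not spell out the actual refinement rule, so the feedback step below is an assumption for illustration only.

    num_passes = 4     # a single pass corresponds to the snapshot mode
    field = np.exp(1j * phase_seed)
    for _ in range(num_passes):
        f = angular_spectrum_propagate(field, wavelength, pitch, z)
        f = angular_spectrum_propagate(f * np.exp(1j * decoder_phase), wavelength, pitch, z)
        image = np.abs(f) ** 2
        # Hypothetical feedback rule: reuse the captured intensity as the next phase seed.
        field = np.exp(1j * 2 * np.pi * image / image.max())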
Beyond performance, the optical model offers privacy benefits. The encoded phase seed is unintelligible unless run through the correct decoder, meaning eavesdroppers can’t interpret the data. The researchers emphasize, however, that their approach is not intended to replace digital generative models outright. Instead, they see it as a “visual computer” well-suited to AR/VR and scenarios where output is ultimately for human viewing.
Scaling and integration remain challenges. Transitioning between digital and optical domains introduces complexity, and the prototypes need to be miniaturized for practical use. Yet this work points to a future where parts of AI generation are offloaded to light itself, bringing gains in energy efficiency, speed, and privacy.