Every time you ask ChatGPT a question, your request triggers a data relay race. Information leaves memory, passes through a CPU for preprocessing, travels to a GPU for heavy computation, and then makes its way back — and that entire journey repeats for every single word the AI generates.
The bottleneck is structural: every request is routed through some of the most expensive and power-hungry chips in the industry. That inefficiency is exactly what XCENA, a startup with offices in South Korea and the U.S., is trying to eliminate. The four-year-old company has designed a chip that places compute capabilities much closer to DRAM (the fast, short-term memory chips that store data a processor is actively using), allowing routine data operations to be handled near memory, without the costly round trips between CPUs, GPUs, and memory.