OpenAI and Broadcom announce chip designed for LLM inference at scale


OpenAI, the company behind ChatGPT and Codex and the models those tools use, and Broadcom, an established silicon supplier, have announced a new chip called Jalapeño, designed specifically for large language model inference in data centers.

The chip is intended for deployment at large data centers. Both companies claim this is just the first generation in a long-term project that will see chips refined over time.

Broadcom says that this ASIC (Application-Specific Integrated Circuit) was designed from scratch for LLM inference, based on “detailed insights” from the company’s conversations with researchers at OpenAI, and that the chip’s development was informed by OpenAI’s own roadmap for future models and products. The design and production of the chip took nine months.

The promise is that this chip is more specialized for the current needs of LLMs than the chips that inference systems currently run on in existing data centers.

OpenAI claims that “early testing shows that Jalapeño will deliver performance per watt substantially better than current state-of-the-art,” but notes that it is not done measuring performance, and that a “detailed technical report will be presented in the coming months.”

Until then, we don’t have many details to go on.

OpenAI, which is known for its ChatGPT and Codex services and harnesses, hopes to ultimately own the full stack behind its models and products, reducing dependence on outside companies like Nvidia and ostensibly providing better performance or efficiency thanks to vertical integration.

More generally, OpenAI and its competitors are interested in custom silicon because it's another way to potentially squeeze out more capacity amid a global compute crunch in which companies are scrambling for limited data center capacity.

While Broadcom was already a successful chipmaker for customers building out compute infrastructure, it has seen substantial movement recently as it has built new business around providing custom chips to hyperscalers and the teams building frontier models during the current AI boom.

Both companies claim Jalapeño chips will be deployed in data centers by the end of this year.

Samuel Axon, Senior Editor

Samuel Axon is the editorial lead for tech and gaming coverage at Ars Technica. He covers AI, software development, gaming, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He is also an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.
