OpenAI and Broadcom Inc. have developed a custom AI inference chip called Jalapeño to improve efficiency for large-language-model workloads [1].

The partnership marks a significant shift toward a full-stack AI platform. By reducing its reliance on third-party hardware providers, OpenAI aims to lower the cost of running models like ChatGPT [2].

According to the companies, the new hardware is designed specifically for inference, the process of generating responses from a trained model. The chip could lower inference costs by 50 percent [1]. This efficiency is intended to expand the availability of AI services while reducing the massive electrical and financial overhead associated with data center operations [3].

The development process moved rapidly, spanning roughly nine months from the initial design phase to tape-out [4]. This accelerated timeline suggests a streamlined collaboration between the software expertise of OpenAI and the silicon engineering capabilities of Broadcom [2].

Reports differ on the deployment timeline. Some sources said the chip will be deployed by the end of 2026 [5]. Others said the chip is still being tested with samples and that no specific deployment date has been finalized [2].

Broadcom has a history of developing custom ASICs (application-specific integrated circuits) for major tech firms. By partnering with Broadcom, OpenAI gains a tailored hardware solution optimized for the specific mathematical operations required by its proprietary transformer models [3].

This move aligns with a broader industry trend in which AI developers create custom silicon to avoid the bottlenecks and high pricing of the general-purpose GPU market [2].

The Jalapeño chip represents OpenAI's transition from a software-centric company to a vertically integrated hardware and software provider. By controlling the silicon layer, OpenAI can design its chips and models around each other, potentially creating a performance and cost moat that competitors relying on off-the-shelf chips cannot match. The move also signals a strategic effort to diversify the AI supply chain, reducing the industry's overall dependence on Nvidia.