A team of former Google and Meta executives claims to have found a solution to the problem of overwhelmed servers and idle high-end chips caused by massive AI models.
Majestic Labs was co-founded by Ofer Shacham, Masumi Ledes, and Sha Rabii. The company has developed a new server system, 'Prometheus', designed to break through the 'memory wall', the performance bottleneck that hinders the operation of extremely large AI models.
The founders, who previously worked at Google designing first-generation data center and mobile chips and later led custom chip development at Meta's Reality Labs, announced a $100 million funding round last November from investors including Powef Capital, Lux Capital, and Grove Capital.
Operating from a modest office in Los Altos, California, the startup believes its proprietary chip can solve the memory wall issue, a growing computational bottleneck that severely limits AI model response speeds. The Prometheus server system incorporates hundreds of the company's custom AIU (Artificial Intelligence Unit) chips. According to the founders, the server offers 1,000 times the memory capacity of competing GPUs from companies like NVIDIA, making it suitable for AI models with trillions of parameters.
Rabii noted that due to memory constraints, commercializing top-tier large models with existing infrastructure is becoming increasingly unfeasible. Even powerful chips often sit idle, waiting to fetch additional memory from other chips, leading to wasted computational resources.
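The idle-chip problem Rabii describes is the classic memory wall: during inference, streaming weights from memory often takes far longer than computing on them. A minimal, purely illustrative sketch of this compute-versus-bandwidth trade-off (the chip figures below are hypothetical round numbers, not Majestic's or NVIDIA's published specs):

```python
# Roofline-style check: a workload is memory-bound when streaming its data
# takes longer than computing on it, leaving the arithmetic units idle.

def bottleneck(flops: float, bytes_moved: float,
               peak_flops_per_s: float, bandwidth_bytes_per_s: float) -> str:
    """Classify a workload on a chip with the given peak compute and bandwidth."""
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / bandwidth_bytes_per_s
    return "memory-bound" if memory_time > compute_time else "compute-bound"

# Decoding one token of a 70B-parameter FP16 model takes roughly 2 FLOPs per
# parameter (~140 GFLOPs) while reading all ~140 GB of weights once.
print(bottleneck(
    flops=1.4e11,
    bytes_moved=1.4e11,
    peak_flops_per_s=1e15,         # hypothetical 1 PFLOP/s chip
    bandwidth_bytes_per_s=3.3e12,  # hypothetical 3.3 TB/s memory bandwidth
))  # prints "memory-bound": ~0.14 ms of compute versus ~42 ms waiting on memory
```

Under these assumed numbers the arithmetic units sit idle more than 99% of the time, which is the waste the founders are targeting.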
To address this, Majestic's new server can be scaled up to 128TB of high-speed memory, enough to smoothly run AI models with 5 to 10 trillion parameters, with memory configurations customizable to client needs.
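Some rough arithmetic (my own illustration, not figures from Majestic) shows how 128 TB lines up with the 5-to-10-trillion-parameter claim: a model's weights alone need roughly parameters × bytes per parameter, and the remainder is headroom for activations and key-value caches:

```python
def weight_memory_tb(params_trillions: float, bytes_per_param: float) -> float:
    """Terabytes needed to hold the weights alone (excludes KV cache, activations)."""
    return params_trillions * 1e12 * bytes_per_param / 1e12  # 1 TB = 10**12 bytes

# Assuming 16-bit (2-byte) weights:
print(weight_memory_tb(5, 2))   # prints 10.0 (TB for a 5T-parameter model)
print(weight_memory_tb(10, 2))  # prints 20.0 (TB for a 10T-parameter model)
# Even at FP16, a 10T-parameter model's weights fit in 20 TB, well inside 128 TB.
```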
CEO Shacham, based in Tel Aviv, stated in a video interview, "This is the industry's first AI processor designed from the ground up with memory at its core, specifically tailored for the massive memory requirements of giant models."
The rapid adoption of agentic AI and autonomous AI agents for tasks like writing code has created a severe global shortage of computing resources. Rental prices for high-end chips have surged, and many AI services face frequent outages and usage restrictions.
This surging demand for high-performance, low-power chips capable of fast inference tasks has created an opportunity for dozens of hardware and software startups like Majestic.
Major tech companies are also intensifying their efforts. Advanced Micro Devices (AMD) is heavily promoting the inference capabilities of its newest chips. Late last year, NVIDIA spent $20 billion to license technology from chip firm Groq and bring on its core management team, and recently launched its own server built around inference-specific chips.
Last week, Google Cloud announced its next-generation TPU, featuring a dual-chip architecture with separate chips for training and inference, emphasizing high-bandwidth memory. Another inference chip newcomer, Cerebras, secured a major partnership with Amazon Web Services this year and filed for an IPO in early April.
The founders of Majestic believe that all current inference solutions are inadequate for the massive memory demands of future super-sized AI models. Rabii analogized that this forces companies to over-purchase computational power just to get sufficient memory, "like being forced to buy an entire house when all you need is a garage."
A significant challenge for the company's future is the ongoing shortage of memory chips needed for its servers, with most manufacturers expecting the shortage to persist at least until next year. To mitigate supply chain pressure, Majestic is using standard DRAM memory, which is simpler to implement and less expensive than High Bandwidth Memory (HBM). HBM requires a complex 3D stacking process, leading to longer production cycles and limited capacity.
The founders revealed that the company's core technical advantage is its proprietary interconnect architecture, which enables high-speed data transfer between processors and hundreds of terabytes of memory at low power, surpassing the speed of traditional HBM.
Majestic has secured multiple customer partnerships, with deployments expected to begin in 2027, representing potential revenue in the hundreds of millions of dollars. Customer names were not disclosed due to confidentiality agreements.