As the Spring Festival approaches, competition in the AI industry is heating up. Since the beginning of the year, Baidu, Alibaba, and DeepSeek have released a rapid series of new technologies and products built around foundational large models, each racing to seize the high ground in AI innovation. On January 22, Baidu launched the official version of its ERNIE 5.0 model, which uses native full-modal unified modeling to support the input and output of text, images, audio, and video. On January 26, Alibaba released its flagship reasoning model, Qwen3-Max-Thinking, whose innovations in reasoning technology deliver a leap in performance. Subsequently, DeepSeek introduced and open-sourced its new DeepSeek-OCR-2 model. Industry experts believe that within a short period China's AI landscape has "entered a tripartite balance of power," with the pace of innovation noticeably accelerating.
The evolution of foundational large models determines the upper limit of what AI applications can do. The official version of ERNIE 5.0 builds its competitive advantage on a native full-modal architecture. A Baidu representative explained that, unlike most multi-modal solutions in the industry, which rely on "late fusion," the official ERNIE 5.0 model uses a unified autoregressive architecture for native full-modal modeling. This approach trains jointly on multi-source data such as text, images, video, and audio within a single model framework, so that multi-modal features are fully integrated and co-optimized under one architecture, yielding native, unified full-modal understanding and generation. On the previously published LMArena global large model leaderboard, the official ERNIE 5.0 model repeatedly ranked first among Chinese models in both the text and visual understanding categories, placing it in the top international tier.
Alibaba has also kept up its investment in the Qwen large model. Its newly released reasoning model, Qwen3-Max-Thinking, adopts a novel test-time scaling mechanism that enables more efficient reasoning computation within the same context, yielding more intelligent reasoning results at a lower cost. Alibaba also draws on its application ecosystem, using its advantage in user traffic entry points to bring the Qwen model into core domains such as e-commerce, hotel and travel, and payments. The model has been deeply integrated into platforms such as Taobao, Alipay, and Fliggy, creating efficient synergy between technology and application scenarios.
Industry experts indicate that unlike Baidu and Alibaba, which are making comprehensive efforts backed by their strong proprietary business ecosystems, DeepSeek builds on its open-source advantage, focusing on foundational model capabilities and building an open ecosystem. Through full-stack open-source offerings of "model weights + training framework + deployment tools," it maximizes its cost-performance advantage. The newly launched DeepSeek-OCR-2 employs an innovative DeepEncoder V2 method, allowing the model to dynamically rearrange parts of an image based on its meaning, simulating the logical process of human scene observation. This makes it more intelligent and logical when processing complex images, showcasing unique technical innovation.
On January 29, in a move to benchmark against DeepSeek-OCR-2, Baidu quickly released and open-sourced PaddleOCR-VL-1.5, a model derived from ERNIE. It pioneered an "irregular bounding box localization" technology for OCR (Optical Character Recognition) models, enabling accurate recognition of irregular documents that are tilted, folded, or have curled edges. A Baidu representative stated that, as one of the very few companies with full-stack AI capabilities, Baidu has also invested more consistently in software-hardware coordination and real-world deployment, demonstrating strong resilience in the current competition among "leading players."
In terms of underlying hardware, Baidu incubated the AI chip brand Kunlun Core, driven by the extreme computational demands of its own business. Kunlun Core has moved from specialized to general-purpose chips and from internal support to external services, validating the feasibility of scenario-defined chips. It recently began the process of an independent listing to accelerate its expansion across multiple fields. Baidu has now brought online the first fully self-developed 30,000-card Kunlun Core cluster in China, capable of simultaneously supporting the training of multiple large models with hundreds of billions of parameters, marking a leap in self-developed computing power from "usable" to "scalable and replicable."
Reliable underlying computing power, leading cloud services, and powerful model capabilities have given rise to a richer array of product services. Based on the ERNIE foundational model, Baidu has built a portfolio of matrix models and specialized models. Matrix models are aimed at rapid deployment for product-level applications and general scenarios; specialized models target industry applications and vertical scenarios. For example, the ERNIE Digital Human model has achieved large-scale application in areas like live-stream e-commerce, creating new interactive experiences and content formats. During the 2025 "Double 11" shopping festival, the Gross Merchandise Volume from digital human livestreams increased by 91% year-on-year, the number of livestreaming rooms increased by 119%, and over 100,000 merchants utilized the technology.
Since DeepSeek rose to prominence during the 2025 Spring Festival, Chinese AI has remained a hot topic in the international technology sector. After more than a year of development, AI technology in China is entering a new phase of large-scale deployment. The value of AI is being further validated through its role in driving industrial transformation and creating broader social benefits. Although these companies are following different development paths, they share a clear trajectory of strengthening innovation capabilities, collectively propelling China's AI industry from "catching up" to "leading the way."