DeepSeek Falls Short of Shaking the De Facto U.S.-China AI Landscape

TMTPost App (钛媒体APP)
01-31

TMTPOST --- DeepSeek is undoubtedly the most discussed topic in the tech world right now. The internet is awash in technical and financial analyses. Here, I would like to share some personal insights in the hope of providing a clearer perspective.

The Path of Technological Innovation

Having worked in Silicon Valley for 40 years, I have gained a profound understanding of technological innovation. Technological innovation is like a pioneer searching for a gold mine in a vast mountain range. Although some point out that there are gold mines in the mountains, the exact locations remain unknown. As a result, many explorers rush in, each searching in different directions.

Looking back at the development of artificial intelligence, from the breakthrough in image recognition in 2012 to the later applications in the field of Go, we can consider it the "AI 1.0" phase. At that time, the AI industry in Silicon Valley and globally mainly focused on image and video recognition.

In 2017, Google introduced the Transformer model, focusing on language translation, particularly between English and French. However, after completing related research, Google did not delve deeper into the field, as the industry widely believed that the language translation market was limited and did not compare to the potential of image and video recognition.

However, OpenAI took a different approach, recognizing the vast potential in the language domain. Human intelligence is often presented through language, so OpenAI fully committed itself to research and development. Despite its resource limitations at the time, and the stark contrast in manpower and financial resources compared to giants like Google and Microsoft, OpenAI, with its sharp insight, released ChatGPT in November 2022.

It was like discovering a small detour along the well-trodden path, where OpenAI ventured in, only to unexpectedly find a gold mine, shaking the entire industry. Many practitioners quickly flocked to the field, turning it into a new "sunlit avenue."

Since then, the industry has kept scaling up pre-trained models and training data, but has gradually lost its sense of direction. During this period, OpenAI made another contribution: reasoning training. Its research found that, even without an extremely large model, carefully training reasoning ability could greatly improve a model's performance. On September 12, 2024, OpenAI released a preview of its o1 model, once again opening a new path for the industry.

In this exploration, global teams, especially those from the U.S., invented many tools—like sharper machetes and shovels—that helped accelerate progress through the thorny path of exploration.

DeepSeek's Open Source Ideal and Its Success by Serendipity

DeepSeek can be considered a team driven by technological ideals. They insist on open sourcing, which contrasts with the usual approach of industry leaders who tend to keep their core technologies confidential. Instead, DeepSeek chooses to leverage global wisdom to push technological development forward.

In technological development, leading companies often withhold their core technologies, while those that are lagging behind tend to promote progress through open source, such as Meta (formerly Facebook) in its competition with OpenAI. Open source is akin to a community spirit, much like public welfare. Entrepreneurs from Alashan, for example, would understand this better—sometimes, even without knowing whether there will be any return, they still choose to contribute.

Open Source Culture and DeepSeek's Contribution to AI

The United States has a strong open-source culture, exemplified by the open-source operating system Linux and by Wikipedia, which has become the world's encyclopedia. China has tried to replicate these models, but with poor results. Open-source culture is not deeply rooted in China, so it is rare and commendable for DeepSeek to persist in open-sourcing its high-quality work and sharing it with the world. This reflects the influence of the American open-source community on young Chinese programmers and entrepreneurs, though that influence is still uncommon domestically.

The team's core figure, Liang Wenfeng, began applying machine learning to quantitative investing as early as 2013, and the team has accumulated over a decade of machine learning experience. Given their technical acuity, they may have started developing large language models on the Transformer architecture well before ChatGPT, possibly as early as 2019. The team has also gathered some of China's top talent. In their exploration, they used advanced tools developed by their predecessors to discover a new path for reasoning models: fully automated training, which differs from OpenAI's manual training approach and helps reduce costs.

This automated reasoning training is similar to the AlphaGo Zero model. After AlphaGo defeated Lee Sedol, Google engineers tried to make AlphaGo Zero learn from scratch without relying on human experience. As a result, in a short period, AlphaGo Zero surpassed the older version of AlphaGo that had defeated Lee Sedol. DeepSeek has delved deeply into this path and achieved success. Although its contribution is not as significant as the breakthroughs of ChatGPT and reasoning training, it can still be considered the third most important contribution since ChatGPT, reducing reasoning costs by two orders of magnitude.

From a technical landscape perspective, although DeepSeek's achievements have narrowed the gap between China and the U.S. in artificial intelligence technology, the overall AI landscape between China and the U.S. remains unchanged.

China still trails the U.S. significantly in several key areas of AI technology. In chips, the gap remains large. In algorithms, most major breakthroughs of the past decade have occurred in the U.S.: AlexNet in 2012, the Transformer in 2017, ChatGPT in 2022, and subsequent advances such as chain-of-thought, RAG, and reasoning training. The French company Mistral has made a few contributions, and DeepSeek's contribution accounts for roughly 5% of the total, which is already an impressive achievement.

DeepSeek’s success, to some extent, is serendipitous. In a scientific exploration process full of chance, many teams explore different directions, and eventually, one team will achieve a breakthrough first. China possesses a vast AI foundation and an extensive pool of engineers, and after 30 years of development, has established a solid basis for communication with the West. Therefore, the emergence of a team like DeepSeek in China is not surprising. This is akin to the Soviet Union successfully launching its first artificial satellite in 1957, a feat built on the technological openness the U.S. offered to the Soviet Union during the two World Wars, which enabled the Soviet Union to establish a strong technological base. However, once the Soviet Union closed itself off, its technological capabilities rapidly declined.

The Prospects of AI, Still in Its Early Stages

In the commercial and stock market sectors, the development of artificial intelligence is closely tied to Nvidia. In my 2017 book Dark Knowledge, I pointed out that Nvidia would be the chip giant of the AI era. At that time, its market value was around $30 billion, and since then, its value has increased nearly a hundredfold.

Recently, the U.S. stock market has seen a significant decline, partly because the market believes that the efficiency improvements in algorithms will reduce the demand for computing power, and thus lower dependence on chips like those from Nvidia. However, this is a static way of thinking. It's similar to a phenomenon discovered by 19th-century British economist Jevons: after the efficiency of steam engines dramatically improved, coal consumption did not decrease; instead, it significantly increased. This phenomenon is known as the "Jevons Paradox." The underlying logic is that although the coal consumption per steam engine decreased, because the steam engine became more economical and efficient, people began using it in more areas, leading to a large increase in the total number of steam engines, and ultimately, coal consumption rose instead of falling.
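The logic of the Jevons Paradox can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions that are not in the article: the cost of one unit of useful work falls in proportion to the efficiency gain, and demand follows a constant price elasticity. The function name `total_resource_use` and the elasticity values are hypothetical, chosen for illustration.

```python
def total_resource_use(efficiency_gain: float, elasticity: float,
                       baseline_use: float = 1.0) -> float:
    """Resource consumed after an efficiency gain, relative to a baseline.

    Assumptions (illustrative, not from the article):
    - cost per unit of useful work falls in proportion to the efficiency gain
    - demand for useful work scales as cost ** (-elasticity)
    """
    if efficiency_gain <= 0:
        raise ValueError("efficiency_gain must be positive")
    cost_ratio = 1.0 / efficiency_gain          # new cost per unit of work / old cost
    demand_ratio = cost_ratio ** (-elasticity)  # new amount of work demanded / old
    resource_per_unit = 1.0 / efficiency_gain   # resource needed per unit of work
    return baseline_use * demand_ratio * resource_per_unit


# Compare inelastic, unit-elastic, and elastic demand for a 10x efficiency gain.
for elasticity in (0.5, 1.0, 1.5):
    use = total_resource_use(efficiency_gain=10.0, elasticity=elasticity)
    print(f"elasticity={elasticity}: total resource use = {use:.2f}x baseline")
```

In this model, total consumption equals the efficiency gain raised to the power (elasticity - 1), so it rises only when demand is elastic (elasticity above 1). The paradox is therefore a claim about how strongly demand responds to lower cost, not an automatic consequence of efficiency.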

Similarly, the reduction in the cost of AI models will drive the widespread expansion of AI applications, and the demand for chips will increase accordingly. In the past, one of the major barriers to AI applications was their high cost. Take ChatGPT, for example: users need to pay for each query, and as the complexity of the question increases, the computational cost grows exponentially. For instance, answering a simple question like "Who was the first emperor of the Tang Dynasty?" can be completed instantly, whereas a complex question like "What are the core reasons for the Tang-Song transition?" requires in-depth analysis and verification, and the computation time might increase a hundredfold or more.

Today, AI has advanced to the stage where it can act as an intelligent agent, providing comprehensive services to users. For example, if a user is planning to travel to Brazil, AI can function like a personal assistant, helping with everything from itinerary planning to hotel reservations and ticket bookings. The computational cost of this process could be thousands of times greater than a simple Q&A. If the cost of each service is as high as 10,000 RMB, users may hesitate; but if the cost drops to 100 RMB, users will use it without hesitation. Therefore, low-cost AI models will dramatically expand the application scenarios for AI, potentially leading to a hundredfold, thousandfold, or even ten thousandfold increase in usage.
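The arithmetic behind this claim can be checked directly. The sketch below uses the prices from the example above (10,000 RMB and 100 RMB per task) together with the hundredfold, thousandfold, and ten-thousandfold usage multipliers. It assumes, purely for illustration, that total spending is price times usage; the function name `spend_ratio` is hypothetical.

```python
OLD_PRICE_RMB = 10_000  # per-task price before the cost reduction (from the example)
NEW_PRICE_RMB = 100     # per-task price after the cost reduction (from the example)


def spend_ratio(usage_multiplier: float,
                old_price: float = OLD_PRICE_RMB,
                new_price: float = NEW_PRICE_RMB) -> float:
    """Total spending after the price drop, relative to spending before it.

    Assumes spending = price * usage (an illustrative simplification).
    """
    return usage_multiplier * new_price / old_price


for multiplier in (100, 1_000, 10_000):
    print(f"usage x{multiplier}: total spending x{spend_ratio(multiplier)}")
```

Under these assumptions, a hundredfold increase in usage merely holds total spending level after a hundredfold price cut; spending grows only if usage expands further, tenfold at a thousandfold increase and a hundredfold at a ten-thousandfold increase. Whether chip demand tracks that spending is a separate assumption.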

As for DeepSeek, outside observers have raised concerns about which chips it used and whether data theft was involved. However, its market pricing, at only 1/30 of OpenAI's cost per access while still turning a profit, shows that DeepSeek has been exceptionally effective at cost control. As for the other concerns, there is currently no concrete evidence, and they are not crucial to judging its technological value and market impact.

Whether DeepSeek can continue to achieve major breakthroughs in the future remains uncertain. As the team's profile grows, the high demands from the government and the high salaries offered by large companies may negatively affect its pure technical pursuit. Whether it can stay true to its technological ideals and continue to push forward in such an environment is a question worth pondering.

The prospects of the AI wave are vast, and we are still in its early stages. Much like the internet in the late 1990s and early 2000s, AI is poised for rapid growth and significant breakthroughs over the next ten to twenty years.

More importantly, how great is the potential of AI? Currently, major labs including OpenAI, Anthropic, Google, and Microsoft are actively researching artificial general intelligence (AGI). The definition of AGI is: machines that can complete most of the intellectual activities of humans. Based on current research progress, AGI may emerge in the next two to five years. If AGI becomes a reality, its market size will be enormous, likely surpassing the internet market, reaching trillions or even tens of trillions of dollars.

DeepSeek’s technological breakthrough further proves that it is possible to achieve AGI at a lower cost. If the cost of AI is too high, even surpassing human labor costs, its application range will be limited. This is similar to how many production lines in China still use human labor rather than robots, because labor costs are lower. Only when the cost of robots is far lower than that of human labor will robots become widely popular. DeepSeek has made significant contributions to reducing the cost of AGI, which is highly commendable.
