
The "turning point" of artificial intelligence: the troika has fully found its stride

Intelligent relativity 2024/08/02 16:40

Text | Intelligent relativity

Author | Chen Bocheng

A few days ago, Nvidia CEO Jensen Huang and Meta founder Mark Zuckerberg had a "fireside chat".

As leaders in today's AI field, one commands the heights of computing power through the absolute advantage of its AI chips, while the other has risen strongly to become the benchmark for open source on the strength of the open-source large model Llama 3.1. A conversation between the two offers a distinctive perspective on the future direction of AI.


Huang talks to Zuckerberg

The dialogue between the two sketches a blueprint for AI's future: from open-source AI algorithms, to advanced humanoid robots, to the smart glasses expected to become commonplace, the development of AI technology is full of opportunities and challenges. Going forward, products such as AI phones, AI PCs, AI cars, smart glasses, and servers will all be upgraded with intelligence, and their complex models, massive data, and heavy computation will rely greatly on AI computing power.

AI computing power is also expanding from dedicated computing to all computing scenarios, gradually forming a pattern of "all computing is AI".

In fact, the moves of computing-power vendors also reflect what the market now demands of computing. On the one hand, processors of all kinds, including CPUs, GPUs, and NPUs, are being put to work on AI computing.

On the other hand, for general-purpose servers adapted to different scenarios, Inspur Information is committed to providing options that combine high performance with low cost. Not long ago, on its 2U four-socket flagship general-purpose server NF8260G7, Inspur Information innovatively applied leading techniques such as tensor parallelism and NF4 model quantization, so that the server can run the 100-billion-parameter "Source 2.0" large model on just 4 CPUs, setting a new benchmark for general-purpose AI computing power.

In today's market, the industrial standing of computing power is rising rapidly. The troika of artificial intelligence development, computing power, algorithms, and data, has finally reached equal footing, with the three advancing neck and neck.

In the early stages of AI development, China's huge Internet user base and abundant online data resources concentrated development on data, while the United States, with its long research tradition in basic disciplines such as computer science, mathematics, and statistics, focused more on algorithms. Compared with these two, computing power received far less attention early on.

Today the troika moves in step, and public thinking about AI development has grown clearer and clearer: the explosion of the AI industry is the result of the coordinated development of algorithms, computing power, and data. This state also means the AI industry is entering a new stage.

The artificial intelligence industry has come to a "turning point".

At this stage, the accelerated iteration of large-model technology has produced a steady stream of ever-improving models with hundreds of billions of parameters. Related AI applications are penetrating every industry at unprecedented speed and scale, becoming part of daily life and work.

The AI industry is moving from initial exploration to the turning point of widespread application. In this process, the troika of AI has also reached a critical moment of comprehensive, coordinated development, providing the technical support needed for scenario applications to leap forward.

Take a bank's fraud-prevention system as an example. Early systems were built on big data, using hand-set rules and statistical models to judge and detect suspicious transactions. Now, by combining big-data systems and financial fraud-prevention AI models on higher-performance general-purpose computing, bank fraud-prevention systems have been upgraded: they achieve higher accuracy and lower false-alarm rates, and they can also learn and adjust from new data, adapting quickly to new fraud patterns.
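The contrast between a fixed rule and a model that adjusts to new data can be sketched in a few lines. This is a toy illustration, not any real bank's system: the static rule and the adaptive outlier check (a running mean/variance via Welford's algorithm) are assumptions chosen for clarity.

```python
def rule_based_flag(amount, threshold=10000):
    """Early-style system: one fixed, hand-set rule."""
    return amount > threshold

class AdaptiveFlagger:
    """Learns a running mean/std of transaction amounts and flags
    outliers, so the threshold shifts as new data arrives."""
    def __init__(self, k=3.0):
        self.n, self.mean, self.m2, self.k = 0, 0.0, 0.0, k

    def update(self, amount):
        # Welford's online algorithm for mean and variance
        self.n += 1
        delta = amount - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (amount - self.mean)

    def flag(self, amount):
        if self.n < 2:
            return False
        std = (self.m2 / (self.n - 1)) ** 0.5
        return abs(amount - self.mean) > self.k * std

flagger = AdaptiveFlagger()
for amt in [120, 95, 130, 110, 105, 90, 125]:  # normal history
    flagger.update(amt)

print(rule_based_flag(5000))  # the static rule misses this amount
print(flagger.flag(5000))     # the adaptive check flags it as an outlier
```

The point of the sketch is the upgrade path the paragraph describes: the first function never changes, while the second keeps learning from each transaction it sees.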

The synergy of algorithms, computing power, and data constitutes the basic paradigm of current AI applications. A successful AI project often requires appropriate input and optimization in all three areas.

Algorithms act like the brains of AI, processing information, learning knowledge, and making decisions. Data is the foundation of algorithms, and if there is not enough data, even the most advanced algorithms will not be able to perform as well as they should.

On this basis, neither the operation of algorithms nor the processing of data can do without the support of computing power. This is especially true in scenarios involving massive data processing, complex model training, and real-time inference, and as such scenarios reach mass scale, economy must be taken into account as well.

Today, for the AI industry's troika, the upgrading of algorithms, computing power, and data continues in parallel, and collaboration among the three has reached a new height driven by the industry's development. The accelerated development of the AI industry requires the troika to keep an ever more consistent pace.

It's time for a full tune-up of the troika

The widespread application of artificial intelligence must rest on the coordinated development of the troika. In the time ahead, upgrading the AI industry requires solving one key problem: how to keep the troika in a stable, balanced state.

First, technologies advancing "side by side": one strong horse does not make the best team; three horses pulling together are the most stable.

Computing power, algorithms, and data complement one another. A lead in any single technology cannot by itself ignite a comprehensive explosion of the AI industry; the other two must catch up quickly for the related technical problems to be solved.

For example, the accelerated development of large models with hundreds of billions or even trillions of parameters has brought more powerful information processing and decision-making capabilities, laying a foundation for the emergence of intelligence. But a breakthrough at the algorithm level demands matching upgrades in computing power and data before its effect can be fully felt in applications. Put simply, if there is not enough computing power to drive the training and inference of hundred-billion-parameter models, then no matter how powerful a model is, it has nowhere to be put to use.
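A back-of-envelope calculation shows why. The byte widths below are standard for each format, but the sizing is an illustrative estimate, not a vendor figure: just holding the weights of a 100-billion-parameter model is already a hardware problem before a single token is generated.

```python
params = 100e9  # 100 billion parameters

def weight_gb(bytes_per_param):
    """Memory needed just to hold the model weights, in GB."""
    return params * bytes_per_param / 1e9

# FP32 = 4 bytes/param, FP16 = 2, NF4 = 4 bits = 0.5
for fmt, b in [("FP32", 4), ("FP16", 2), ("NF4", 0.5)]:
    print(f"{fmt:>4}: ~{weight_gb(b):,.0f} GB just for weights")
```

At FP16 that is roughly 200 GB of weights alone, before activations, KV caches, or batching, which is why quantization and multi-processor memory pooling both matter in what follows.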

To accelerate the development of artificial intelligence and support the broadest range of general scenarios across thousands of industries, hundred-billion-parameter models must run efficiently alongside big data, databases, cloud, and other workloads.

This goal, however, demands large amounts of hardware resources for computing, memory, and communication. To meet the AI computing needs of more users, computing-power vendors must work out how to overcome existing bottlenecks in a targeted way. The NF8260G7, the general-purpose AI server that carries inference for hundred-billion-parameter models, shows the deliberate design Inspur Information has brought to this problem.

To address the low-latency and huge memory requirements of hundred-billion-parameter inference, the NF8260G7 is equipped with 4 Intel Xeon processors featuring AMX (Advanced Matrix Extensions) AI acceleration. For memory, it carries 32 × 32 GB DDR5-4800 DIMMs, with measured bandwidths of 995 GB/s for reads, 423 GB/s for writes, and 437 GB/s for mixed read/write. This lays the foundation for low-latency, multi-processor concurrent inference on hundred-billion-parameter models. At the same time, Inspur Information has optimized the routing paths and impedance continuity of the high-speed interconnect signals between CPUs, and between CPUs and memory, to better support large-scale concurrent computing.
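A simplified model shows why that read bandwidth is the headline number. Under the common assumption that decoding one token reads every weight once, single-stream throughput is bounded by bandwidth divided by weight size. The 995 GB/s figure is the one cited above; the model sizes and the one-pass assumption are illustrative, not measured results for this server.

```python
read_bw_gb_s = 995                  # cited measured read bandwidth
weight_gb_fp16 = 100e9 * 2 / 1e9    # 100B params at 2 bytes each
weight_gb_nf4 = 100e9 * 0.5 / 1e9   # same model quantized to 4 bits

# Rough ceiling: tokens/s <= bandwidth / bytes of weights read per token
for label, size in [("FP16", weight_gb_fp16), ("NF4", weight_gb_nf4)]:
    print(f"{label}: <= {read_bw_gb_s / size:.1f} tokens/s per decode stream")
```

The gap between the two lines is the bandwidth argument for quantization: shrinking the weights by 4x raises the same ceiling by 4x on identical hardware.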


This design optimizes computing power around the algorithm, providing key support for the future large-scale application of hundred-billion-parameter models.

Second, the system "drives" as one: three horses, one cart, with the focus on systematic optimization.

As AI technology develops, computing power, algorithms, and data are becoming increasingly systematized. Many tech giants are racing to find AI solutions that pair a high-level model with a low computing-power threshold. AI solutions are no longer applications of a single technology, but comprehensive breakthroughs across multiple fields that deliver an overall, systematic upgrade.

For example, Google's EfficientNet model optimizes the network architecture to improve accuracy on the ImageNet dataset by about 6% over traditional models while requiring roughly 70% less computation. Clearly, while pushing computing power forward, today's large-model vendors also pursue innovation at the software level to improve the fit between computing power and algorithms.

To make general-purpose servers run hundred-billion-parameter models better, Inspur Information not only innovated on the server itself but also optimized the model side. Building on its R&D of the Source 2.0 algorithms, Inspur Information applied tensor slicing to the convolution operators of the 102.6-billion-parameter Source 2.0 model, enabling efficient tensor-parallel computing on general-purpose servers and ultimately improving inference efficiency.
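The article does not publish Inspur's implementation, but the general idea behind tensor slicing can be shown with a toy sketch: split a weight matrix column-wise so each "processor" computes one shard of the output, then concatenate the shards. Everything below (plain Python lists, two workers, the tiny matrix) is an illustrative assumption.

```python
def matmul(x, w):
    """x: 1 x k vector (list); w: k x n matrix (list of rows) -> 1 x n."""
    n = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(n)]

def split_columns(w, parts):
    """Slice a weight matrix into `parts` equal column blocks."""
    step = len(w[0]) // parts
    return [[row[p * step:(p + 1) * step] for row in w] for p in range(parts)]

x = [1.0, 2.0, 3.0]
w = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

# Serial reference result
full = matmul(x, w)

# "Tensor parallel" result: each of 2 workers holds half the columns,
# computes its shard, and the shards are concatenated (an all-gather
# in a real multi-processor system).
parallel = []
for shard in split_columns(w, 2):
    parallel += matmul(x, shard)

print(full)
print(parallel)  # identical: column slicing changes nothing numerically
```

The numerical result is unchanged; what the split buys is that no single processor ever needs to hold, or stream, the whole weight matrix.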


Parallel computing based on CPU servers

At the same time, Inspur Information also applies NF4 quantization to "slim down" the model and improve decoding efficiency during inference.


NF4 quantization technique
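A hedged sketch of the idea behind NF4-style quantization: weights are scaled by each block's absolute maximum and snapped to one of 16 fixed levels. The level table below approximates the published NF4 code book (quantiles of a standard normal); the block size and everything else here is illustrative, not Inspur's implementation.

```python
NF4_LEVELS = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]

def quantize_block(block):
    """Return (scale, 4-bit indices) for one block of weights."""
    scale = max(abs(w) for w in block) or 1.0  # per-block absmax scaling
    idx = [min(range(16), key=lambda i: abs(w / scale - NF4_LEVELS[i]))
           for w in block]
    return scale, idx

def dequantize_block(scale, idx):
    """Reconstruct approximate weights from scale + 4-bit indices."""
    return [scale * NF4_LEVELS[i] for i in idx]

weights = [0.12, -0.5, 0.33, 0.91, -0.07, 0.0, 0.44, -0.88]
scale, idx = quantize_block(weights)
approx = dequantize_block(scale, idx)
err = max(abs(a - b) for a, b in zip(weights, approx))
print(f"max reconstruction error: {err:.3f}")  # small relative to the range
```

Each weight is stored as a 4-bit index plus one shared scale per block, an 8x reduction from FP32, which is the "slimming down" that raises decoding throughput on bandwidth-bound hardware.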

When computing power and algorithms move toward synergy, the results of systematic optimization are built on the cooperation between the two, and the ultimate goal is a stable, powerful technical foundation for the AI industry's implementation. Going forward, a full-scale explosion of the AI industry will require an even more systematic mindset to drive the troika's development.

Third, application "acceleration": industry implementation needs the comprehensive optimal solution of the troika.

AI is no longer a laboratory artifact but a commodity competing in the market. Whether it is the emergence of hundred-billion-parameter models or the upgrading of computing-power solutions, the fundamental goal is to accelerate the implementation of AI applications, bring them to the public, and deliver real economic benefits. Beyond the technical level, therefore, the industry must also consider the economics.

By contrast, AI servers built around NVIDIA GPU chips excel at high-performance computing tasks such as machine learning and deep learning. Why, then, are computing-power vendors such as Inspur Information still committed to developing and upgrading general-purpose servers built around CPUs?

The fundamental reason is that CPUs remain irreplaceable in general-purpose computing, energy efficiency, and cost-effectiveness. Cost-effectiveness in particular is the key factor currently restricting large-scale deployment in many scenarios: the cost of AI-specific infrastructure remains so high that ordinary enterprises struggle to afford it. By offering a lower-cost option that still delivers high performance, Inspur Information provides exactly what the market needs.

Through the collaborative innovation of software and hardware on the general-purpose server NF8260G7, Inspur Information has realized inference deployment of hundred-billion-parameter models on general-purpose servers, offering a choice with stronger performance at more economical cost. This lets AI large-model applications integrate more closely with cloud, big data, database, and other applications, supporting high-quality development of the industry. Such a comprehensive optimal solution is the condition the industry most needs for a large-scale explosion.

Epilogue

The systematization of the AI troika has taken shape: stronger computing power supports more complex algorithmic models, which in turn process large-scale data better. High-quality datasets improve the effectiveness of algorithms, which again demand more computing power to run. Advances in algorithms can also reduce the need for computing power, cutting computational cost through more efficient model design.

The formation of this system will greatly promote the development of the AI industry and provide a clear general direction for AI vendors upgrading products, iterating technology, and advancing services at this stage. At the same time, it poses a new challenge: how to integrate the technologies and resources across computing power, algorithms, and data to achieve new breakthroughs.

*The pictures in this article are all from the Internet

#智能相对论 (Intelligent Relativity) focuses on new intelligent industries and services. This is in-depth interpretation of intelligent services No. 263.

This content is original to [Intelligent Relativity] and represents only personal opinions.

Without authorization, no one may use it in any way, including reprinting, excerpting, copying, or mirroring.

Some images are from the Internet and their copyright could not be verified; they are not used commercially. If there is any infringement, please contact us.

•New media for the AI industry;

•Monthly Top 5 on The Paper's technology list;

•Articles have long ranked among the top 10 most popular on Titanium Media;

•Author of "Artificial Intelligence: 100,000 Whys";

•[Key focus areas] Smart home appliances (including white goods, black goods, smartphones, drones, and other AIoT devices), intelligent driving, AI + medical, robots, the Internet of Things, AI + finance, AI + education, AR/VR, cloud computing, developers, and the chips and algorithms behind them.

This article is from Xinzhi self-media and does not represent the views and positions of Business Xinzhi. If there is any suspicion of infringement, please contact the administrator of the Business Xinzhi platform. Contact: system@shangyexinzhi.com