
There is no free lunch for large models

Zero-state LT 2024/06/24 11:56


The wind of the 618 shopping festival has blown into the large-model circle. In just one week, large models have gone from being priced in fractions of a cent to being free.

According to statistics, a total of 21 large-model vendors joined this wave of price cuts. From the smartphone wars, the ride-hailing wars, the bike-sharing wars and the community group-buying wars to the later car-making wars and today's large-model war, can consumers once again pick up a "bargain" from this lively round of price cuts? Among small and medium-sized enterprises, cloud vendors and large-model makers, who is the real winner?

As the price war drives down inference costs, the commercialization of large models may accelerate, and C-end applications may finally take off.

1. Is there a free lunch for large models?

No one expected that the first domino in the large-model price-cut wave would be toppled by a small company.

On May 6, DeepSeek, a subsidiary of the quant fund High-Flyer, released DeepSeek-V2, priced at roughly one percent of GPT-4-Turbo. On May 11, Zhipu AI's open platform cut the price of its entry-level GLM-3 Turbo model by 80%.

A price war among domestic large-model vendors immediately broke out. On May 15, at the spring Volcano Engine Force conference, ByteDance priced its flagship Doubao model at just 0.0008 yuan per 1,000 tokens for the enterprise market, 99.3% below the going industry price, pushing large-model pricing from the "fen" (hundredths of a yuan) level down to the "li" (thousandths of a yuan) level.

On May 21, Alibaba Cloud announced sharp price cuts on a number of its Tongyi Qianwen commercial and open-source models. The API input price of Qwen-Long, a commercial model benchmarked against GPT-4, fell from 0.02 yuan per 1,000 tokens to 0.0005 yuan per 1,000 tokens, a reduction of 97%; its output price fell from 0.02 yuan per 1,000 tokens to 0.002 yuan per 1,000 tokens, a reduction of 90%. The same day, Baidu Intelligent Cloud announced that two of its main Wenxin (ERNIE) models, ERNIE Speed and ERNIE Lite, would become completely free, effective immediately.
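To make the scale of these cuts concrete, here is a minimal arithmetic sketch in Python using the Qwen-Long list prices quoted above. The prices are as announced; the one-million-token volume is purely illustrative, and actual bills depend on each vendor's billing rules.

```python
# Illustrative arithmetic only, using the list prices quoted above
# (yuan per 1,000 tokens); real billing rules vary by vendor.

def cost_per_million(price_per_1k_tokens: float) -> float:
    """Cost in yuan to process 1,000,000 tokens at a given list price."""
    return price_per_1k_tokens * 1_000_000 / 1_000

def discount(old_price: float, new_price: float) -> float:
    """Percentage reduction from old_price to new_price."""
    return (1 - new_price / old_price) * 100

# Qwen-Long input price: 0.02 -> 0.0005 yuan per 1,000 tokens
print(f"Old cost per 1M input tokens: {cost_per_million(0.02):.2f} yuan")    # 20.00
print(f"New cost per 1M input tokens: {cost_per_million(0.0005):.2f} yuan")  # 0.50
print(f"Input price reduction: {discount(0.02, 0.0005):.1f}%")               # 97.5%

# Qwen-Long output price: 0.02 -> 0.002 yuan per 1,000 tokens
print(f"Output price reduction: {discount(0.02, 0.002):.1f}%")               # 90.0%
```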

Tencent, iFLYTEK and other vendors followed with their own cuts. iFLYTEK announced that its Spark API capabilities are now officially open free of charge: the Spark Lite API is permanently free, and the API price of the top-tier version (Spark 3.5 Max) has been lowered to 0.21 yuan per 10,000 tokens.

With price cuts on one side and free access on the other, do these vendors, seemingly selling at a loss, really intend to push large models into a free era?

A closer look shows otherwise. ByteDance's Doubao model, which led the price cuts, only reduced its input price; the output price barely moved. Alibaba Cloud lowered both input and output prices, but the output price of Qwen-Max, the model with the largest parameter count and the highest cost in the Qianwen series, was not reduced. Baidu made two smaller-parameter models free, while Wenxin (ERNIE) 4.0 was left out of the cuts. Although the discounted models are claimed to be comparable to GPT-4, in practice they are not even on par with GPT-3.5; their capability falls short of the bar and they cannot run in real production environments.

The sincerity of these price cuts is suspect: it looks like putting out a few unappetizing side dishes and declaring that the meal is free. No wonder netizens suspect that once large-model companies have attracted developers and enterprises, they may start throttling key metrics such as call rate, inference speed and task throughput. The routine of "forcing you to become a VIP, then forcing you to become an SVIP" is all too familiar.

2. Price cuts are more than a simple market strategy

At present, the AIGC industry's business models mainly include pay-per-call API pricing, SaaS (Software as a Service) subscription fees, value-added services and solutions, covering customer groups ranging from content-production companies to ordinary consumers.

Take OpenAI as an example: it has three main revenue sources. The first is subscription fees from C-end members. The second is API call services for enterprise developers; after paying for access to OpenAI's endpoints, B-end users can call the GPT series of large models, the DALL·E text-to-image models and the Whisper speech-recognition model to build applications, and this is OpenAI's core business. The third is a revenue share from Microsoft Azure cloud services. With these three businesses, OpenAI's revenue exceeded US$1.3 billion in 2023.


Because they are on the same track, domestic vendors' charging models are broadly similar to OpenAI's. Yet on both the C-end and the B-end, the bottlenecks holding back domestic large models have long been a reality. The drop in inference prices is driven mainly by vendors' desire to seize market share and accelerate commercialization.

One reality is that the number of large-model users in China is still too small, user growth is weak, and usage data is not rich enough. Another is that adoption of domestic large models at the application level, that is, in AIGC-based products, remains low. Expanding the user base through free access, and thereby increasing user interaction and training data, is therefore the main goal of this price war.

The market generally believes the essence of the price war is that domestic large models differ little in technology, and users can hardly perceive any technical distinction. With their deep pockets, the major Internet companies move quickly to weaken and squeeze out rivals with weaker finances. The smartphone wars, ride-hailing wars, bike-sharing wars, community group-buying wars and later car-making wars all followed a similar playbook, and all were rooted in small technology gaps.

But the "price war" of large models is not just a simple market competition strategy. Many people believe that the price reduction marks the commercial inflection point of the development of domestic large models. Liu Yang, director of the information research department of Shenwan Hongyuan Research Institute and chief analyst of the TMT industry, said in a public interview that domestic large-scale model enterprises attach importance to industrial value over financial value, and the user side and value-added version have reached the time when they can be promoted on a large scale. iFLYTEK said at the press conference that based on the company's engineering advantages of domestic independent and controllable large models, iFLYTEK's Xinghuo API capabilities are officially free and open, which is conducive to helping developers reduce call costs, drive product innovation and verification, and accelerate the arrival of large model empowerment and AI inclusiveness.

Every technological advance in human history has steadily driven down the cost of that technology, and large models follow their own kind of "Moore's Law" and scale effect. OpenAI and other foreign companies had already taken the lead in cutting prices; that domestic vendors, caught in the dual dilemma of model capability and computing cost, dare to cut prices is not simply the result of falling computing costs.


For the cloud vendors that insist on building large models, beyond seizing market share, the bigger play is actually the public cloud market.

The public cloud is the technical foundation of large models. After OpenAI released its 4o model, it generated a wave of buzz on social media, but some professionals pointed out that if ChatGPT, built on the Transformer architecture, keeps relying on the public cloud, it will inevitably struggle to deliver low latency once large numbers of users access it at the same time.

In the era of generative AI, the public cloud's role lies in providing the machines on which neural networks are trained and served, and it remains a highly lucrative business as the large-model industry grows rapidly. If large-model applications truly take off, the AI inference market will explode along with them; a public cloud vendor with large-model capability built into its base can use the "public cloud + API" model to carve out a new growth curve and a larger profit margin. Cloud computing vendors, represented by Alibaba Cloud, have kept innovating from underlying computing power and AI platforms up to model services, while AI in turn has fed back improvements in cloud management, application, computing and infrastructure capabilities.

3. After the starting gun, who is the winner?

Just as the ride-hailing war let users take taxis at rock-bottom prices, every price war has been a chance for consumers to reap dividends. The difference this time is that ordinary consumers may not be the beneficiaries of large-model price cuts.

For a long time, the awkward reality facing large models has been weak user growth, and writing complex prompts remains the biggest obstacle for ordinary users. The AI assistants now being built into phones by handset makers are aimed squarely at the C-end, yet these AI tools are hard to monetize from consumers.

Who can benefit from the price reduction of large models?

Looking at the layers of the AIGC industry chain, the infrastructure layer currently sits at the bottom; its core is the AI servers that supply computing power, built around CPUs and GPUs. Above it is the model layer, which requires long-term investment by many top scientists and is where OpenAI and Google abroad, as well as domestic vendors, are mainly competing today. The application layer built on top of large models serves consumer products for C-end users and industry solutions for B-end users.

For the large-model industry, technical progress alone cannot deliver real applications. If inference costs cannot be brought down, any attempt at commercialization becomes an expensive experiment; for C-end applications in particular, the cost can be a bottomless pit.
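To illustrate why inference cost can become a "bottomless pit" for a free consumer app, here is a hypothetical back-of-the-envelope estimate in Python. The user count, usage pattern and blended token price below are illustrative assumptions, not figures from any vendor or product.

```python
# Hypothetical estimate of the monthly inference bill for a free C-end app.
# Every parameter below is an illustrative assumption.

daily_active_users = 1_000_000        # assumed DAU of the app
requests_per_user_per_day = 10        # assumed average requests per user
tokens_per_request = 1_500            # assumed prompt + completion tokens
price_per_1k_tokens = 0.008           # assumed blended price, yuan per 1,000 tokens

monthly_tokens = daily_active_users * requests_per_user_per_day * tokens_per_request * 30
monthly_cost = monthly_tokens / 1_000 * price_per_1k_tokens

print(f"Tokens per month: {monthly_tokens:,}")          # 450,000,000,000
print(f"Monthly inference bill: {monthly_cost:,.0f} yuan")  # 3,600,000 yuan

# With no direct revenue from free users, the bill scales linearly with usage,
# which is why lower token prices matter so much for C-end applications.
```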

It is clear that the biggest beneficiaries of across-the-board large-model price cuts are still the small and medium-sized enterprises doing application development; lower costs will help this kind of company flourish.

These companies use an AI model as the technical base to build targeted applications. The most typical example is Remini, the generative AI product that went viral in China this year: its ugly-but-funny clay-style filter sent Remini straight to the top of the domestic iOS free-app download chart. Remini's explosion recalls another image-generation app, Miaoya, which for just 9.9 yuan and about 20 everyday photos generates a "digital avatar" comparable to studio portraits that sell for hundreds of yuan. Like Miaoya, Remini's moment may prove short-lived, and together with those lukewarm AI applications it leaves people worrying about the commercial path of generative AI and how stable that path really is.

So is an expensive API the main thing limiting the application side? Clearly not. As Jia Yangqing, former vice president of Alibaba, has said, enterprises adopting AI today are not driven by cost: if you don't know how to generate business value with it, then no matter how cheap it is, it is still a waste.

Zhou Hongyi, chairman and CEO of 360, has also publicly shared his view of the current large-model industry. He argues that moving from general-purpose large models to application-specific large models is the right way of thinking: for C-end users, find real user needs and customize around them; for enterprises, build dedicated professional large models and deploy them privately. "In the future, an enterprise should have multiple vertical large models that are simple and easy to use and improve internal efficiency." A recent example is its self-developed vertical model focused on finance and taxation, the BPai financial-and-taxation model.

Model technology and commercial application must be grasped with both hands, and even after the price war there is still a long way to go.

"Cheap can't win the business war, whoever can land and get profits is the last laugh."
