Kimi is starting to recoup its costs
Author|Tao Ran Editor|Wei Xiao
Kimi, whose consumer-facing (to C) business has been soaring for more than half a year, has begun to push into the enterprise (to B) side.
With August half over, two pieces of news directly tied to its commercialization have spread through the market:
On August 2, Kimi's parent company, Dark Side of the Moon (Moonshot AI), officially announced the release of Kimi's enterprise-level API. Compared with the general model that serves to C needs, the enterprise-level model inference API offers stronger data security and higher concurrency, so that it can support complex workflows and large-scale data processing inside an enterprise.
Five days later, the company made another commercial move, announcing that the context cache storage fee on the Kimi open platform would be cut by 50%, from 10 yuan/1M tokens/min to 5 yuan/1M tokens/min, with the new price taking effect on August 7, 2024.
In fact, this technology had already launched on the open platform on July 1. It "reduces costs and increases efficiency" in model inference by storing in advance the text or data that is likely to be referenced repeatedly and requested frequently.
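To make the price cut concrete, here is a minimal sketch of the arithmetic implied by the unit "yuan/1M tokens/min". It assumes the storage fee scales linearly with the number of cached tokens and the minutes they stay cached; the function name and the sample workload are hypothetical, and any other fees the platform may charge are not modeled.

```python
def cache_storage_fee(tokens: int, minutes: float, price_per_million_per_min: float) -> float:
    """Storage fee in yuan, assuming: price x (tokens / 1M) x minutes in cache."""
    return price_per_million_per_min * (tokens / 1_000_000) * minutes


OLD_PRICE = 10.0  # yuan per 1M tokens per minute (before the cut)
NEW_PRICE = 5.0   # yuan per 1M tokens per minute (from August 7, 2024)

# Hypothetical workload: a 200,000-token document kept in cache for 60 minutes.
tokens, minutes = 200_000, 60
old_fee = cache_storage_fee(tokens, minutes, OLD_PRICE)
new_fee = cache_storage_fee(tokens, minutes, NEW_PRICE)
print(f"before: {old_fee:.2f} yuan, after: {new_fee:.2f} yuan")
```

Under these assumptions, the same cached document costs half as much to keep available, which matters most for workloads that hold a large context open for a long time.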
Kimi's current push to monetize AI on the B side is by no means as tentative as its earlier in-app "grayscale test of a tipping feature":
From enterprise-level solutions to scenario-specific optimization to price adjustments, Dark Side of the Moon has clearly come prepared.
In 2024, as competition over the technology grows ever more intense, the industry's stance on large-model applications has clearly split into two camps.
At the World Artificial Intelligence Conference (WAIC) held in Shanghai in early July, Robin Li, CEO of Baidu, whose AI applications draw the most visits in China, shared his view on large-model applications in his speech: the C side should of course be pursued, but the more fruitful application scenarios for large models are still on the B side.
Image source: AI Product List
Robin Li believes that in the AI era, "super capable" applications, meaning those that can profoundly reshape an industry and significantly improve efficiency in its scenarios, may matter even more, and that the overall value they create will far exceed that of the "super applications" of the mobile Internet era.
He expects that in healthcare, finance, education, manufacturing, transportation, agriculture and other fields, companies will build a variety of agents based on the characteristics of their own scenarios, their unique experience, industry rules and data resources. Eventually, millions of agents will appear, forming a huge agent ecosystem.
This approach can be seen as representative of BAT and other major technology companies at this stage.
According to bidding data on large-model projects compiled by Silicon Star People, Baidu has won a total of 17 projects this year across fields including healthcare, finance, energy, environmental protection and transportation. The clients include many large state-owned enterprises and leading companies in their industries, and the contract amounts are mostly in the millions or even tens of millions of yuan.
Representatives of start-ups, such as Wang Xiaochuan of Baichuan Intelligence and Yang Zhilin of Dark Side of the Moon, have long come across to outsiders as staunch supporters of to C.
At the launch event for the AI assistant Bai Xiaoying, Wang Xiaochuan said that to B is not the business model Baichuan mainly relies on. To B is a good business in the United States, he said, but in the domestic market the C side is "ten times larger" than the B side.
Yang Zhilin, the founder of Dark Side of the Moon, has not said much in public about the company's monetization. Still, in a speech at the Shanghai Innovation and Entrepreneurship Youth 50 Forum a few months ago, he said that thanks to the Transformer architecture, advances in the semiconductor industry, and the large amount of data the Internet has accumulated for AI, this may be "the first time in the world that this kind of AI to C opportunity has appeared."
As for whether to turn Kimi into a to C AI super application or to branch out once the name is established, Yang Zhilin left himself some room: "We don't say that we won't do it at all, but we may still focus and concentrate our efforts mainly on the C side."
Now is probably the moment to conclude that Dark Side of the Moon, after insisting on to C for so long, has finally come around to to B after all.
At the most basic level, the first problem a to B solution has to solve, compared with Kimi on the C side, is this:
For paying customers, the service cannot go down at random.
The scale of computing power is an unavoidable topic. Dark Side of the Moon spent a year taking Kimi to the top of the large-model field in traffic and usage (some statistics show that in July, Kimi and Wenxin Yiyan were the only mainstream large models in China with more than 10 million monthly active users). But it is still a start-up, and its resources are obviously not as abundant as those of the big tech companies.
One rarely hears of Wenxin Yiyan or Tongyi Qianwen running short of computing power at peak usage, but frequent Kimi users have almost certainly hit the computing power wall in the middle of a few rounds of Q&A (although this seems to have improved recently).
If enterprise customers adopt Kimi as an everyday productivity tool, the enterprise API servers must be stable and reliable enough to keep running normally under high load.
The context caching technology behind this price cut shows where the effort is going: besides expanding server capacity on demand as business volume grows, Kimi has made "reducing costs and increasing efficiency" in existing model inference another focus.
The fee for this technology is typically what a platform or service provider charges customers for maintaining and serving the cache. For example, if a user frequently uses the same shopping website or app, the website or app is likely to create a separate data set in its system to store the user ID, shopping cart contents and preference information.
In the case of a large model, when a user submits a request, such as asking a series of questions or handing Kimi a 10,000-word text and asking for a report, the model needs to understand the context of the query while processing it, including previous questions, related topics, or specific information in a given field.
The intermediate results and key information computed during this inference are often referenced (called) again in the user's subsequent Q&A. Caching them so that later requests can access them quickly is a relatively resource-efficient choice.
Unlike caching usernames and passwords for the convenience of logging in, this kind of caching first of all reduces the resources the model spends on repeated reading and inference, and it also improves the efficiency of result generation to some extent. With cached context, a large model can generate responses or recommendations quickly without recomputing from scratch. This speeds up responses for users who ask related questions or need related information, and cuts down on wasted waiting time.
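The idea described above can be sketched in a few lines. This is a toy illustration only, not Kimi's actual implementation: the class `ContextCache`, the functions `expensive_read` and `answer`, and the time-to-live parameter are all hypothetical names, and the "expensive" step is a placeholder for the model reading a long document.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class ContextCache:
    """Toy cache: stores the processed form of a shared context, keyed by its hash."""

    def __init__(self, process: Callable[[str], str], ttl_seconds: float = 300.0):
        self._process = process          # stand-in for the expensive model pass
        self._ttl = ttl_seconds          # how long an entry stays valid
        self._store: Dict[str, Tuple[float, str]] = {}
        self.misses = 0                  # times the expensive pass actually ran

    def get(self, context: str) -> str:
        key = hashlib.sha256(context.encode("utf-8")).hexdigest()
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]              # cache hit: reuse the earlier work
        self.misses += 1
        result = self._process(context)  # cache miss: do the expensive work once
        self._store[key] = (now, result)
        return result


def expensive_read(context: str) -> str:
    # Hypothetical placeholder for the model reading a long document.
    return f"digest({len(context)} chars)"


def answer(cache: ContextCache, document: str, question: str) -> str:
    processed = cache.get(document)      # the shared context is processed at most once
    return f"{processed} -> {question}"


cache = ContextCache(expensive_read)
document = "long report text " * 1000
first = answer(cache, document, "Summarize section 1")
second = answer(cache, document, "List the key risks")
print(cache.misses)  # 1: the second question reused the cached context
```

Two questions about the same document trigger only one expensive read. The time-to-live also hints at why storage is billed per minute: keeping a context available longer occupies resources for longer.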
This caching mechanism improves the responsiveness and processing efficiency of the system while preserving the coherence and accuracy of dialogue or text generation, which can be critical to a smooth user experience and to making the best use of resources.
Its value may become even clearer in the future, when the company faces more users and more concentrated data processing requests from the B side, and rapid response and efficient processing matter more.
Alongside its frequent to B moves, Dark Side of the Moon has also recently raised a large round of financing from Tencent (nicknamed the "goose factory").
According to market sources, Tencent took part in Dark Side of the Moon's latest financing round of $300 million, which would lift the company's valuation to $3.3 billion on completion, the highest among domestic large-model start-ups.
Dark Side of the Moon did not respond to the matter, but a source close to Tencent reportedly confirmed that the participation was real.
So far, Tencent and Alibaba have both appeared among the investors behind all four of the so-called "new AI Four Little Dragons": Zhipu AI, MiniMax, Baichuan Intelligence and Dark Side of the Moon.
Within BAT, Baidu has chosen to do more of the work in-house, while Alibaba and Tencent continue to place more bets through venture investment.
Start-ups are busy turning the technology into applications, while the big companies seem to have turned part of their attention to return on investment, or to having a say in the industry's future.
This article is from Xinzhi self-media and does not represent the views and positions of Business Xinzhi. If there is any suspicion of infringement, please contact the administrator of the Business News Platform. Contact: system@shangyexinzhi.com