Transformer author warns: You can't compete with OpenAI by selling only models!
Aidan Gomez, the youngest of the eight Transformer authors, lamented in a recent interview:
It's really not profitable to just sell models!
Aidan Gomez, formerly of Google, is one of the authors of the Transformer, which has had a profound impact on the field of AI.
Today Aidan Gomez is the co-founder and CEO of Cohere, a company whose valuation has soared to $5.5 billion. (Cohere previously launched the Command R series of open-source large models.)
In this conversation with Harry Stebbings, the host of 20VC, Aidan Gomez talks about the development of AI.
Several of the topics have drawn attention and discussion online, such as:
1. Scale is not the only way to improve model performance
2. Selling only models cannot compete with OpenAI
3. AI startups should not rely on cloud vendors
4. Optimistic about robotics, predicting a major breakthrough within 5 years
5. Data quality is crucial for models
For the details, see the text version of the interview below.
Q: Before we begin, I'd like to ask you a question: did you like playing games when you were younger?
Aidan Gomez: I do love games; I've loved technology since I was a kid.
Q: Then you would never start a game with a first level so difficult that people think, "Impossible, I don't want to play this."
Aidan Gomez: Yes, this is called "curriculum learning" in machine learning. You teach the model to do something very simple, then gradually increase the complexity and build on what it already knows.
Interestingly, curriculum learning doesn't really work in machine learning. We don't actually follow a curriculum; we throw the hardest and the easiest material at the model at the same time and let it figure things out on its own.
But for humans, it's very effective and an important part of how we learn. It's really interesting that it hasn't succeeded in machine learning.
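The two data orderings contrasted above can be sketched in a few lines. This is only an illustration under stated assumptions: the `difficulty` field and both function names are hypothetical, and nothing here reflects how Cohere actually orders its training data.

```python
import random

def curriculum_order(examples):
    """Curriculum learning: present examples from easiest to hardest.

    Assumes each example carries a hypothetical numeric 'difficulty' score.
    """
    return sorted(examples, key=lambda ex: ex["difficulty"])

def mixed_order(examples, seed=0):
    """The approach described in the interview: easy and hard examples
    are shuffled together and shown to the model in no particular order."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return shuffled

# Toy dataset with made-up difficulty scores from 0 (easy) to 4 (hard).
examples = [{"text": f"example {i}", "difficulty": i % 5} for i in range(10)]

curriculum = curriculum_order(examples)
mixed = mixed_order(examples)
```

Both orderings contain exactly the same examples; only the order in which a training loop would see them differs.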
Q: You just said that you throw everything at the model at once, and I want to get to the bottom of that. A lot of people say that performance increases with more computing power. Do you think that's right? Are there other factors limiting performance gains?
Aidan Gomez: It's true that if you add more computing power to the model, or make the model bigger, it does get better. This is the surest way to improve model performance, but it is also the dumbest.
It is a very attractive, extremely low-risk strategy for those who are well funded. You know the model is going to get better: just scale it, spend more money, buy more computing power. I believe in this, but I think it's extremely inefficient.
There is a better way. Look at the past year and a half, from the release of ChatGPT and GPT-4 to now. If GPT-4 really does have 1.7 trillion parameters, as has been claimed, it's a huge MoE.
We now have better models than that one, and they only have 13 billion parameters. The speed of this change, and the rate at which costs are falling, is simply incredible, even somewhat surreal.
So yes, you can reach that level of quality by scaling up, but you probably shouldn't.
Q: Will this progress continue? I mean, will we continue to see progress on this scale? Or will it reach a bottleneck at some point?
Aidan Gomez: Yes, it does require exponential investment. You need to keep doubling your computing power to maintain a linear increase in intelligence. But this growth is likely to continue for a very, very long time.
Models are going to get smarter and smarter. But you will run into financial constraints. Not many people bought the original GPT-4, and businesses in particular did not, because it is very large, expensive and inefficient to run, and it's not smart enough to justify that cost.
So there's a lot of pressure in the market to make models smaller and more efficient, and to make them smarter through data, algorithms, and methods rather than relying on scale alone.
Q: Will we live in a world of smaller, more efficient vertical models designed for specific use cases, or will there be a few large, dominant models? Or will it be both?
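The claim that each doubling of compute buys a fixed, linear step in capability corresponds to a logarithmic relationship. The sketch below is a toy model of that statement only; the function names, the `gain_per_doubling` parameter, and the units are assumptions for illustration, not a measured scaling law.

```python
import math

def intelligence_score(compute, base_compute=1.0, gain_per_doubling=1.0):
    """Toy model: every doubling of compute adds a constant amount of 'score'."""
    return gain_per_doubling * math.log2(compute / base_compute)

def compute_needed(score, base_compute=1.0, gain_per_doubling=1.0):
    """Inverse of the toy model: compute grows exponentially with the target score."""
    return base_compute * 2 ** (score / gain_per_doubling)

# Going from score 1 to score 10 takes 512 times more compute in this model.
ratio = compute_needed(10) / compute_needed(1)
```

Under this assumption the cost of each additional step doubles, which is why the financial constraints mentioned above appear long before the gains stop.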
Aidan Gomez: One of the trends we've seen over the last few years is that people like to prototype with a generic, smart model. They don't want to prototype with a specific model, they don't want to spend time fine-tuning the model to make it particularly good at the things they care about.
What they want is to grab a big, expensive model, use it to prototype, prove that it can get the job done, and then distill it into an efficient model that excels in a particular domain. This pattern has really emerged.
As a result, we will continue to live in a world where multiple models coexist, some verticalized and focused, others completely horizontal.
Q: For example, OpenAI is spending $3 billion right now. How can you hold your place in this race, unless you're a company like Microsoft, Amazon, Google, Facebook?
Aidan Gomez: If you're only pursuing scale, you really do need to be one of these companies, or a subsidiary of one of them. However, there are many other things that can be done.
If you don't treat scaling as the only way forward, and you believe in innovation in data or in models and methods, there's a lot more to explore.
Q: Can we dive into what data innovation and model and method innovation are?
Aidan Gomez: Almost all of the big advances we've seen in the open source space have come from improvements in data. That means getting higher quality data from the internet: improving web scraping algorithms, parsing web pages, extracting the important parts, and increasing the weight of specific parts of the internet, because there is a lot of duplicated and spammy content out there.
It also means extracting the most valuable, knowledge-rich parts and emphasizing them to the model. And there is the ability to generate synthetic data, which lets us obtain large amounts of text or web-like content that is generated automatically by a model, without human involvement.
These innovations, particularly the ability to improve data quality, have driven much of the progress we are seeing today.
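Two of the data-quality steps mentioned here, removing duplicates and weighting some sources more heavily than others, can be sketched as follows. This is a minimal illustration, assuming exact-match deduplication after simple normalization; the `SOURCE_WEIGHTS` table, the source labels, and the sample documents are all hypothetical.

```python
import hashlib
import re

def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.strip().lower())

def deduplicate(docs):
    """Keep the first occurrence of each document, compared after normalization."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc["text"]).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

# Hypothetical weights: emphasize knowledge-rich sources, drop spam entirely.
SOURCE_WEIGHTS = {"reference": 3.0, "forum": 1.0, "spam": 0.0}

def sampling_weights(docs):
    """Turn per-source weights into sampling probabilities that sum to 1."""
    raw = [SOURCE_WEIGHTS.get(doc["source"], 1.0) for doc in docs]
    total = sum(raw)
    if total == 0:
        raise ValueError("all documents have zero weight")
    return [w / total for w in raw]

docs = [
    {"text": "Transformers use attention.", "source": "reference"},
    {"text": "transformers  use attention. ", "source": "forum"},  # near-copy of the first
    {"text": "Buy cheap watches!!!", "source": "spam"},
    {"text": "How do I fine-tune a model?", "source": "forum"},
]

unique = deduplicate(docs)
weights = sampling_weights(unique)
```

Real pipelines typically use fuzzier duplicate detection and learned quality scores; the point here is only the shape of the idea.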
Q: Okay, this is data innovation, but what about model innovation?
Aidan Gomez: It's about things like new reinforcement learning algorithms. You know, there are a lot of rumors about Q* and the changes it might bring. There are also ideas around search, like how to search for solutions.
The way models work today, you ask a question and the model has to give the right answer immediately. That's an extremely demanding requirement for a model, right?
You can't do that to humans. You can't ask a person a hard question and expect them to spit out the answer right away. They need time to think and process.
Q: They sometimes need a little brainstorming time.
Aidan Gomez: Yes, exactly. So a very obvious next step in the development of models is to get them to think and work through problems. You need to let them make mistakes: try something, fail, understand why they failed, and then backtrack and try again.
Currently, models have no notion of problem solving.
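The loop described here (try something, fail, backtrack, try again) is the classic shape of backtracking search. The sketch below shows that pattern on a toy puzzle, picking numbers that add up to a target. It is an analogy chosen for illustration, with a hypothetical `solve` function, and makes no claim about how a language model would implement reasoning.

```python
def solve(numbers, target, chosen=None, start=0):
    """Find a subset of non-negative numbers that sums to target, or return None."""
    chosen = [] if chosen is None else chosen
    total = sum(chosen)
    if total == target:
        return list(chosen)          # success: this attempt worked
    if total > target:
        return None                  # failure: overshot, so this path is dead
    for i in range(start, len(numbers)):
        chosen.append(numbers[i])    # try something
        result = solve(numbers, target, chosen, i + 1)
        if result is not None:
            return result
        chosen.pop()                 # backtrack and try the next option
    return None

answer = solve([8, 6, 7, 5, 3], 15)
```

In this run the search first tries 8 and 6, finds that every extension overshoots, backs out of the 6, and then succeeds with 8 and 7.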
Q: Is the problem solving you mentioned the same concept as reasoning?
Aidan Gomez: Yes.
Q: Why is it so hard to reason? Why don't we have the concept of reasoning yet?
Aidan Gomez: Reasoning itself is not difficult. The hard part is that there isn't much training data on the internet that shows the reasoning process. Most of the internet is the output of the reasoning process.
When you write something online, you don't show your thought process. You go straight to your conclusions and ideas, which are the result of a lot of thinking, experience, and discussion.
So we lack that kind of training data. It's not available for free; you have to build it yourself. That is what companies like Cohere, OpenAI, and Anthropic are doing: collecting data that shows the human reasoning process.
Q: Speaking of which, I wonder how you feel about competing with OpenAI, which can draw on user-generated content?
Aidan Gomez: It's hard, especially in the enterprise space, where we face a huge challenge: the privacy and confidentiality of customer data.
Customers treat their data as intellectual property, and it contains a lot of confidential information, so they don't allow us to use it for training. I fully understand that position. As a result, we've shifted our focus to synthetic data and invested significant resources in this area.
We also formed a team of human annotators and partnered with Scale AI. This puts a lot of pressure on us: because we're not a consumer-facing company, we have to generate our own data.
Luckily, our focus is relatively narrow. We concentrate on areas where business needs are clear, such as automating finance and HR functions. This allows us to dig deeper and address those specific needs.
Q: So what would a synthetic data market look like? Will it be dominated by two or three vendors?
Aidan Gomez: I've heard that the current market for large model APIs is dominated by synthetic data. Most people use these large, expensive models to generate data and then use that data to fine-tune smaller, more efficient models.
So they're basically distilling a bigger model. I don't know if that is sustainable as a market. But I do think there will always be new tasks, new problems, and new data needs, whether that data comes from models or from humans, and we have to meet those needs.
Q: There's one thing that worries me, or makes me hesitate. You're seeing OpenAI competing on price, and you're seeing companies like Meta releasing models for free, while also not making a clear statement about the value of open source and open ecosystems.
Are we seeing a real decline in the value of these models? Is this a race to lower prices, or even to zero?
Aidan Gomez: If you're just selling models, it's going to be a very tough game for the next period of time. It's not going to be a small market.
Q: There will be a lot of people who just sell models, and there will be people who will sell models and other things.
Aidan Gomez: I don't want to name names, but I can say, Cohere, for example, only sells models right now. We have an API through which you can access our models.
That will soon change. The product landscape will change, and we will add something new to the existing product. If you only sell models, the situation will be difficult because it will turn into a zero-profit business and the price competition will be too fierce. A lot of people offer models for free.
Still, it's going to be a big business, and the market demand is growing very fast. But at least at this stage, the margins will be very small.
That's why there's a lot of excitement at the application level. The discussion in the market is right to point out that value is accruing below, at the chip layer, because in the beginning everybody invests a lot of money in chips to build these models, and above, at the application layer, in products like ChatGPT, which charges $20 per user per month.
This seems to be where the value is happening at the moment. The model layer is an attractive business in the long run, but in the short term, as it stands, it's a very low-margin, commoditized business.
Q: Many people now think that it's too late for startups to enter the AI model space. However, as the cost barrier lowers, does this actually make it easier for startups to enter the space?
Aidan Gomez: It's true that every year the cost of building last year's model falls by 10x or even 100x. Thanks to better quality data and cheaper computing resources, the barrier to building a previous-generation model keeps getting lower.
But the problem is, no one really cares about those outdated models. Last year's model was almost worthless compared to this year's. With every technological advancement, old technology quickly becomes obsolete, and the cost of AI development is rising dramatically.
It may only cost $10 million to develop version 1, but it may cost another $1 million to $2 million to make version 2 slightly better. And now, it could cost $3 billion to develop a new model, and it could even cost $5 billion to update it.
This growth is no longer linear, but orders of magnitude. I'm not sure if the development of a new generation of technology is always cheaper than the previous one. In the case of chips and other complex technologies, we continue to develop even though the cost of development is rising, because it's worth it.
Q: So you're saying that people don't really care if those improvements are consistently effective?
Aidan Gomez: That's right. What I'm saying is that it's getting harder and harder to improve these models, and there's more resistance. Another interesting phenomenon is that as models become smarter, the ability of the average person (including me) to distinguish between them decreases.
Because we have limited expertise in medicine, mathematics, physics, etc., we can't really feel these changes. The model has done an excellent job with the basics, which is exactly the level of knowledge we can achieve.
Therefore, when we interact with them, it is difficult to feel the differences between model generations. But in reality, these models have made huge strides in specific abilities and in raw intelligence.
As for whether it's worth continuing to invest heavily in technology, I think the answer is yes. Even though these technologies may not be important to the average consumer, they can be very valuable to researchers in certain areas of expertise.
We help them make more progress by providing these tools. It's like asking if we should continue to invest in next-generation technologies, such as creating a new material for a spacecraft to improve its efficiency in getting into orbit.
While this may not matter to most people, it is very important for those who need it, and there is a market demand, which is what keeps technological progress going.
Q: Let's go back to the issue of cost. Obviously, the cost is high and will continue to increase in the future. You mentioned earlier the idea of effectively being a "subsidiary" of a larger company.
Many companies are now being acquired or merged, and cloud services are in the spotlight as a driver of continued growth. Do you think in the next three to five years, most of the small model providers will be acquired by the large cloud service providers?
Aidan Gomez: I think the space really is going to go through a consolidation, and it's already starting to happen. Many model developers have already been absorbed by large cloud service providers such as Amazon.
I'm sure there will be more of this in the future. However, it is important to note that becoming a subsidiary of a cloud service provider can be risky. It is not good for the development of the business.
Often, in order to raise capital, you need to convince investors who are only concerned about the return on capital. But when you're raising money from a cloud service provider, it's a completely different story.
Q: So do you think venture capitalists have made money from model investing in the last few years?
Aidan Gomez: For Cohere's investors, they're definitely making a lot of money.
I am happy for those who believe in us. Our first investor, Jordan Jacobs of Radical Ventures, is still on our board and is very actively involved in building the company. I would even go so far as to call him the fourth co-founder of Cohere.
Q: Recent media reports put the company's valuation at just over $5.5 billion. Does that put pressure on you?
Aidan Gomez: It's real pressure, but it's also positive pressure. Eventually, every company has to face revenue multiples, which will converge with the multiples of the public market.
I think we're actually in a much better position than many of our peers, because our valuation hasn't grown like crazy the way some other companies' have. Of course, we still need to keep growing, but I have full confidence in the market.
Currently, margins are under pressure due to price competition and the existence of free models, but this will change over time. At the same time, Cohere's product portfolio will continue to evolve.
Q: If you were an investor at 20VC today, where do you think the opportunities would be?
Aidan Gomez: The product areas, the application areas, are still very attractive. These technologies will lead to new products that will transform social media. People love to communicate with these models, and the time of use is amazing.
Q: Do you think that's a good thing? I don't want my children to live in a world where they communicate with generative systems that imitate humans. I don't want them to get their satisfaction from a conversation with a model.
Aidan Gomez: You could be wrong. You may want your child to be able to communicate with an agent who is extremely compassionate, very intelligent, knowledgeable, and safe.
It can teach them things, play with them, it doesn't lose its temper with them, it doesn't get angry with them, it doesn't bully them, it doesn't make them feel insecure.
Of course, nothing can replace humanity. Nothing can replace the human world, and we are not all suddenly going to start dating chatbots and cause the human birth rate to decline.
I don't think that's going to happen, right? I want a baby and I can't have a baby with a ChatBot.
A human companion is much more valuable to me than any chatbot. Just like in the workplace, I don't think we can completely replace humans. AI will empower humans and make them more productive, but that doesn't mean there will be fewer jobs.
You can't replace humans. Think about sales, if I was pitched by a bot, I wouldn't buy it. It's that simple, I don't want to talk to a machine.
Of course, some simple purchases may be handled by bots, but for those purchases that are very important to me and my company, I want the other party to be a real person who can take responsibility.
If something goes wrong, I need someone who has the authority to intervene. So whether it's the consumer side, where people supposedly lose themselves in conversations with chatbots, or the work side, where jobs supposedly disappear and cause mass unemployment, I really don't see that happening.
Q: I agree with you, but I'm really concerned that low-end positions, let's say a customer service team might lose 70 to 80 percent of its staff, and there will definitely be partial replacement.
Aidan Gomez: There will definitely be partial replacement. But overall, it will be growth, not replacement. Certain roles really are exposed to this technology, and customer support is one of them.
In the end, someone will still need to do the work, though perhaps fewer people than today. But customer support is a tough role, and a psychologically draining one. If you've ever listened to recordings of those calls, you know how emotionally exhausting the job is.
Q: Yes, it's a bit like content moderation on social media platforms, which is also psychologically traumatic in many ways.
Aidan Gomez: Every day you wake up, go to work, get yelled at all day, and have to apologize. So maybe we should let the model handle those conversations, and let humans handle the customer support issues that really need human help: solving a problem without the emotional abuse, and with the chance to make that person's life better.
Q: What do you think AI can't do today, but will become a reality and bring about a huge change in three years?
Aidan Gomez: The next big breakthrough in AI will be in robotics. Costs need to be reduced, but they are already falling. Then we need more powerful models.
Q: Why do you think there will be a big breakthrough in the field of robotics?
Aidan Gomez: Because a lot of the obstacles have disappeared. Previously, the reasoning and planning components of robots were very brittle. You had to program each task, and the tasks were hard-coded to a specific environment.
So you had to have a kitchen with exactly the same layout and the same dimensions, without any differences, which is very fragile. But on the research side, by using foundation models and language models, people have developed better planners that are able to reason about the world more naturally.
So there are already a lot of companies working on this, and someone may soon solve the puzzle of general-purpose humanoid robots, making them cheaper and more robust.
It's going to be a huge shift. I don't know if this will happen in the next five years or ten years, but it will definitely come out in this time frame.
Q: It's been really interesting talking to you today. I want to do a quick-fire round: I give you a short question, and you immediately give me your thoughts.
What has changed your opinion the most in the last 12 months?
Aidan Gomez: The importance of data. I seriously underestimated it. I used to think it was all about scale, but a lot of what has happened inside Cohere completely changed my understanding of what matters in building this technology.
Data quality is critical. Even among billions of data points, a single bad example can have a significant impact on a model. It's almost unbelievable: the model is so sensitive to its data that everyone underestimates it.
Q: How much money has your company raised so far?
Aidan Gomez: About $1 billion.
Q: Which round of funding is the easiest?
Aidan Gomez: Probably the first round. It was like a simple conversation, and they said, "Here's a few million dollars, give it a try." So I thought that round was pretty easy.
Q: When it comes to raising $500 million, it's definitely a more complicated process. Isn't it a little hard to believe when you see $500 million coming in?
Aidan Gomez: Sort of. For example, $25 million a year, although I don't know the exact number, but it's a lot of money. Cohere has changed the way I think about the economy and money, and now $500 million is not a big number for me.
Q: Does that worry you?
Aidan Gomez: No, that's part of our strategy. If we are willing to accept that condition, we can accept it. But our strategy is to remain independent and develop autonomously.
Q: If you could choose any world-class board member, who would it be?
Aidan Gomez: Mike Volpi and Jordan Jacobs, they're now on my board of directors.
Q: Why do you think Mike is a good board member?
Aidan Gomez: Mike is brilliant, he seems to have been through everything. I can ask him almost any question, he has had similar experiences and can give valuable advice.
Q: Geoff Hinton or Yann LeCun, who do you prefer?
Aidan Gomez: Geoff for sure; I'm closer to him personally.
Q: Do you think Yann is overly optimistic?
Aidan Gomez: No, I agree more with Yann about AI. Geoff leans more toward doomsday predictions, while Yann is more optimistic. Yann has become a bit of a "reply guy" to Elon Musk these days, but Geoff really is a smart and thoughtful person.
Q: Last question: what is the one question you have never been asked but should be?
Aidan Gomez: People always ask me about the future and potential risks of technology, but they rarely talk about the opportunities that technology brings.
Q: Where do you want the future of technology to go?
Aidan Gomez: I think we should use technology to make the world more productive, increase supply, and make things more abundant and cheaper. Productivity may not sound very appealing, but imagine applying a 5% productivity boost to the NHS.
That would have a significant impact on the state of the country, on budgets, and on the lives of millions of people. So I think our priority should be to increase productivity and growth.
This article is from Xinzhi self-media and does not represent the views and positions of Business Xinzhi. If there is any suspicion of infringement, please contact the administrator of the Business News Platform. Contact: system@shangyexinzhi.com