DeepSeek Founder Exclusive: China's AI — Time to Lead, Not Follow
DeepSeek is once again a hot topic following the release of its open-source V3 model, and this time it has gone viral not only in China but across the global internet.

Its estimated training cost is only one-eleventh that of Llama 3.1 405B, yet it outperforms that model.
In multiple evaluations, DeepSeek V3 has reached state-of-the-art (SOTA) performance among open-source models, surpassing Llama 3.1 405B, and it competes head-on with top models such as GPT-4o and Claude 3.5 Sonnet. Moreover, its price is lower than that of Claude 3.5 Haiku and only 9% of Claude 3.5 Sonnet's. It ranks 7th on the Chatbot Arena large-model leaderboard. It is the only open-source model in the top ten, and it carries the least restrictive license, MIT.
In May 2024, DeepSeek shot to fame with the release of an open-source model named DeepSeek V2, which offered an unprecedented cost-performance ratio and set off a price war among Chinese large models.
As the only company outside China's large enterprises with a reserve of 10,000 A100 chips, DeepSeek has made many unusual decisions. It has given up the "wanting everything" route and has so far focused on research and technology. It has not built any consumer-facing (to-C) applications, it is the only such company that has not fully considered commercialization, it has firmly chosen the open-source route, and it has not even raised funds.
How was DeepSeek actually forged? A team at 36Kr, a US-listed company, interviewed Liang Wenfeng, DeepSeek's rarely seen founder, in May 2023 and again in July 2024.

This technological idealist offers a voice that is particularly scarce in China's technology community at present. He is one of the few who put the "sense of right and wrong" before the "sense of gains and losses", and he reminds Chinese people to recognize the inertia of the times and to put "original innovation" on the agenda.
01 How Was the First Shot of the Price War Fired?
Host: After the release of the DeepSeek V2 model, it quickly triggered a fierce price war for large models. Some people say that you are a "catfish" in the industry.
Liang Wenfeng: We didn't intend to be a catfish, but we accidentally became one.
Host: Did this result surprise you?
Liang Wenfeng: Very much so. I didn't expect everyone to be so sensitive to price. We simply worked at our own pace, calculated the cost, and set the price. Our principle is neither to lose money nor to make huge profits. The price includes only a small profit on top of cost.
Host: Five days later, ZhipuAI followed suit, and then large enterprises such as ByteDance, Alibaba, Baidu, and Tencent did the same.
Liang Wenfeng: ZhipuAI reduced the price of an entry-level product, while its models at the same level as ours remain very expensive. ByteDance was the first to truly follow. Its flagship model was cut to the same price as ours, which then triggered other large enterprises to cut prices one after another. Since large enterprises' model costs are much higher than ours, we didn't expect anyone to do this at a loss. In the end, it turned into the Internet-era logic of burning money on subsidies.
Host: From the outside, the price reduction seems to be about grabbing users, which is usually the case in price wars in the Internet era.
Liang Wenfeng: Grabbing users is not our main purpose. On the one hand, we reduced the price because, in exploring the structure of the next-generation model, our costs came down first. On the other hand, we also think that both APIs and AI should be inclusive and affordable for everyone.
Host: Before this, most Chinese companies would directly copy the Llama structure of this generation to make applications. Why did you start from the model structure?
Liang Wenfeng: If the goal is to make applications, following the Llama structure and launching products quickly and cheaply is a reasonable choice. But our destination is AGI (Artificial General Intelligence), which means we need to study new model structures and achieve stronger model capabilities with limited resources. This is one of the basic research tasks required to scale up to larger models.
In addition to the model structure, we have done a lot of other research, including how to construct data and how to make the model more human-like, all of which is reflected in the models we have released. Moreover, in terms of training efficiency and inference cost, the Llama structure is estimated to be about two generations behind the advanced level abroad.
Host: Where does this generation gap mainly come from?
Liang Wenfeng: First of all, there is a gap in training efficiency. We estimate that, compared with the best abroad, the best in China may be behind by about a factor of two in model structure and training dynamics. This alone means we need twice the computing power to achieve the same effect. In addition, there may be a further gap of about a factor of two in data efficiency, that is, we need twice the training data, and therefore twice the computing power again, to achieve the same effect. Combined, we need about four times the computing power. What we need to do is keep narrowing these gaps.
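The arithmetic in this answer can be made explicit with a minimal sketch. It assumes, as the answer implies, that the two efficiency gaps are independent and therefore multiply. The function name and the exact factors of 2.0 are illustrative stand-ins for the rough estimates quoted above, not measured values.

```python
def total_compute_multiplier(*efficiency_gaps: float) -> float:
    """Multiply independent efficiency gaps into one overall compute multiplier.

    Each gap is "how many times more compute is needed" for one factor.
    Assumption: the gaps are independent, so they compound multiplicatively.
    """
    total = 1.0
    for gap in efficiency_gaps:
        total *= gap
    return total


# Rough estimates taken from the interview (illustrative, not measured):
training_efficiency_gap = 2.0  # model structure and training dynamics
data_efficiency_gap = 2.0      # twice the data for the same effect

print(total_compute_multiplier(training_efficiency_gap, data_efficiency_gap))  # 4.0
```

Halving either gap would halve the total, which is why narrowing the two gaps separately still pays off.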
Host: Most Chinese companies choose to pursue both models and applications. Why has DeepSeek currently chosen to only do research and exploration?
Liang Wenfeng: Because we think the most important thing now is to participate in the global innovation wave. For many years, Chinese companies have been used to others doing the technological innovation while we take it over for application and monetization, but this should not be taken for granted. In this wave, our starting point is not to seize the chance to make a fortune, but to go to the forefront of technology and promote the development of the entire ecosystem.
Host: The inertial perception left by the Internet and mobile Internet eras for most people is that the United States is good at technological innovation, while China is more adept at making applications.
Liang Wenfeng: We believe that with economic development, China should gradually become a contributor rather than remain a free-rider. In the IT wave of the past thirty-odd years, we have basically not participated in real technological innovation. We have become used to Moore's Law dropping from the sky, sitting at home while better hardware and software arrive every 18 months. The Scaling Law is being treated the same way.
But in fact, this is the result of the tireless work of generations in the Western-dominated technology community. Because we did not participate in this process before, we have overlooked its existence.
02 The Real Gap Is between Originality and Imitation
Host: Why did DeepSeek V2 surprise many people in Silicon Valley?
Liang Wenfeng: Among the large number of innovations that occur every day in the United States, this is a very ordinary one. The reason they are surprised is that it is a Chinese company joining their game as an innovative contributor. After all, most Chinese companies are used to following rather than innovating.
Host: But this choice is also too extravagant in the Chinese context. Large models are a game of heavy investment. Not all companies have the capital to only focus on research and innovation instead of considering commercialization first.
Liang Wenfeng: The cost of innovation is definitely not low, and the past habit of taking whatever was available was tied to the national conditions of the time. But now, neither China's economic volume nor the profits of large enterprises like ByteDance and Tencent are low by global standards. What we lack for innovation is definitely not capital, but confidence and the knowledge of how to organize a high density of talent to achieve effective innovation.
Host: Why do Chinese companies, including large enterprises that are not short of money, so easily regard rapid commercialization as the top priority?
Liang Wenfeng: Over the past thirty years, we have only emphasized making money and ignored innovation. Innovation is not entirely driven by business; it also requires curiosity and creativity. We are just bound by past inertia, but that too is a phenomenon of a particular stage.
Host: But you are a commercial organization, not a public-welfare scientific research institution. By choosing innovation and sharing it through open source, where will you form a moat? For example, the MLA (Multi-Head Latent Attention) architecture innovation of May 2024 will also be quickly copied by others, right?
Liang Wenfeng: In the face of disruptive technologies, the moat formed by closed source is short-lived. Even though OpenAI is closed-source, it cannot prevent others from overtaking it. So we let value accumulate in the team. Our colleagues grow in this process, accumulate a lot of know-how, and form an innovative organization and culture, which is our moat.
Open-sourcing and publishing papers don't actually mean losing anything. For technical people, being followed is very fulfilling. In fact, open-sourcing is more a cultural behavior than a commercial one. Giving is an additional honor, and a company that does this also gains cultural appeal.
Host: What do you think of market-faith views like Zhu Xiaohu's?
Liang Wenfeng: Zhu Xiaohu (a well-known Chinese investor) is self-consistent, but his approach is more suitable for companies that want to make money quickly. However, if you look at the most profitable companies in the United States, they are all high-tech companies that have achieved success through long-term accumulation.
Host: But in the field of large models, simply being technologically ahead is also difficult to form an absolute advantage. What is the bigger thing you are betting on?
Liang Wenfeng: We see that China's AI cannot always be in a following position. We often say that there is a gap of one or two years between China's AI and that of the United States, but the real gap is between originality and imitation. If this does not change, China will always be a follower. Therefore, some explorations are inevitable.
NVIDIA's leadership is not just the effort of one company, but the joint effort of the entire Western technology community and industry. They can see next-generation technology trends and have roadmaps in hand. The development of China's AI also requires such an ecosystem. Many chips fail to develop because they lack a supporting technology community and have only second-hand information. Therefore, someone in China must stand at the forefront of technology.
03 High-Flyer's Large-Model Endeavor Is for Research and Exploration
Host: High-Flyer (a hedge fund and technology company founded by Liang Wenfeng that relies on artificial intelligence for quantitative investment) decided to enter the large-model field. Why would a quantitative fund do such a thing?
Liang Wenfeng: Our work on large models has no direct relation to quantitative trading and finance. We established a new company named DeepSeek to do this. Among the main team members of High-Flyer, many are engaged in artificial intelligence. At that time, we tried many scenarios and finally entered the complex field of finance. General artificial intelligence may be one of the next most difficult things, so for us, it is a matter of how to do it rather than why.
Host: Are you going to train a large model by yourself, or a large model related to a vertical industry, such as finance?
Liang Wenfeng: We want to do general artificial intelligence, that is, AGI. Large language models may be the inevitable path to AGI and already show some initial characteristics of AGI, so we will start from here, and vision-related work will follow later.
Host: With the entry of large enterprises, many start-up companies have given up the direction of doing only general-purpose large models.
Liang Wenfeng: We will not design some applications based on the model too early and will focus on large models.
Host: Many people think that it is not a good time for start-up companies to enter the field after large enterprises have reached a consensus and entered.
Liang Wenfeng: Currently, it seems that neither large enterprises nor start-up companies can easily establish a crushing technological advantage in a short time. Because OpenAI is leading the way and everyone builds on public papers and code, both large enterprises and start-up companies will have developed their own large language models by next year at the latest. Both have their own opportunities. Existing vertical scenarios are not in the hands of start-up companies, so this stage is not very friendly to them. However, since these scenarios ultimately consist of scattered, fragmented small demands, they are better suited to flexible start-up organizations.
In the long run, the application threshold of large models will become lower and lower, and start-up companies will have opportunities to enter at any time in the next 20 years. Our goal is also very clear: not to do vertical or application-related things, but to do research and exploration.
Host: Why do you define it as "doing research and exploration"?
Liang Wenfeng: It is driven by curiosity. In the long term, we want to verify some conjectures. For example, our understanding is that the essence of human intelligence may be language, and human thinking may be a language process. You think you are thinking, but in fact you may be weaving language in your mind. This means that human-like artificial intelligence (AGI) may be born in large language models. In the short term, there are still many unsolved mysteries in GPT-4. While we are replicating it, we will also do research to uncover those mysteries.
Host: But research means greater costs.
Liang Wenfeng: If you only do replication, you can build on public papers or open-source code and train only a few times, or even just fine-tune, and the cost is very low. Research, however, requires many experiments and comparisons, more computing power, and more highly qualified people, so the cost is higher.
Host: Where does the research funding come from?
Liang Wenfeng: As one of our funders, High-Flyer has a sufficient R&D budget. In addition, it has a donation budget of several hundred million yuan every year. In the past, all of it went to public-welfare institutions. If necessary, that can be adjusted.
Host: But to do foundation-layer large models, you can't even enter the game without US$200-300 million. How can continuous investment be supported?
Liang Wenfeng: We are also talking to different funders. After contacting them, we feel that many VCs have concerns about doing research. They have exit requirements and hope to commercialize products as soon as possible. Given our idea of giving priority to research, it is difficult to obtain financing from VCs. But we have computing power and an engineering team, which amounts to holding half the bargaining chips.
Host: What deductions and assumptions have we made about the business model?
Liang Wenfeng: What we are thinking about now is that we can make most of our training results publicly available later, so that it can be combined with commercialization. We hope that more people, even a small app, can use large models at a low cost, rather than the technology being monopolized by a few people and companies.
Host: Some large enterprises will also provide some services later. What is our differentiated part?
Liang Wenfeng: The models of large enterprises may be tied to their platforms or ecosystems, while we are completely free.
Host: In any case, it seems crazy for a commercial company to do research-oriented exploration with unlimited investment.
Liang Wenfeng: If you insist on finding a commercial reason, you may not find one, because it is not cost-effective. From a commercial perspective, basic research has a very low return on investment. When OpenAI's early investors put in money, they were certainly not thinking about how much return they would get; they really wanted to do this thing. What we are relatively certain about now is that, since we want to do this and have the ability, we are at this moment one of the most suitable candidates.
04 The Reserve of 10,000 Cards Is Actually Driven by Curiosity
Host: GPUs are scarce in this ChatGPT start-up boom. You had the foresight to reserve 10,000 of them in 2021. Why?
Liang Wenfeng: In fact, it was a gradual process: from a single card at the beginning, to 100 cards in 2015, 1,000 cards in 2019, and then 10,000 cards. When we had no more than a few hundred cards, we hosted them in an IDC (Internet Data Center). When the scale grew, hosting could no longer meet our requirements, so we started building our own machine rooms. Many people may think there is some hidden business logic behind this, but in fact it was mainly driven by curiosity.
Host: What kind of curiosity?
Liang Wenfeng: Curiosity about the boundaries of AI capabilities. For many outsiders, the impact of the ChatGPT wave is particularly large; but for insiders, the impact of AlexNet in 2012 had already ushered in a new era. AlexNet's error rate was much lower than that of other models at the time, which revived neural network research that had been dormant for decades. Although the specific technical direction keeps changing, the combination of models, data, and computing power remains the same. Especially after OpenAI released GPT-3 in 2020, the direction was clear, and a large amount of computing power was required. Yet even in 2021, when we invested in building Yinghuo-2, most people still could not understand it.
Host: So since 2012, you have started to pay attention to the reserve of computing power?
Liang Wenfeng: For researchers, the thirst for computing power is endless. After doing small-scale experiments, they always want to do larger-scale ones. Since then, we have also consciously deployed as much computing power as possible.
Host: Many people think that this computer cluster was built so that the quantitative private-equity business could use machine learning for price prediction?
Liang Wenfeng: If you only do quantitative investment, very few cards can achieve the goal. We have done a lot of research besides investment, and we are more interested in figuring out what kind of paradigm can completely describe the entire financial market, whether there is a more concise expression, where the boundaries of the capabilities of different paradigms are, and whether these paradigms have a wider application, etc.
Host: But this process also burns money.
Liang Wenfeng: An exciting thing cannot be measured simply in money. It's like buying a piano for the home. First, we can afford it, and second, there is a group of people eager to play music on it.
Host: Graphics cards usually depreciate at a rate of 20%.
Liang Wenfeng: We haven't calculated it precisely, but it should be less than that. NVIDIA's graphics cards are hard currency. Even old cards from many years ago are still in use by many people. The old cards we retired before were quite valuable when sold second-hand, and we didn't lose much.
Host: Building a computer cluster, the maintenance costs, labor costs, and even electricity bills are all substantial expenses.
Liang Wenfeng: In fact, the electricity and maintenance costs are very low, accounting for only about 1% of the hardware cost per year. The labor cost is not low, but it is also an investment in the future and the company's largest asset. The people we choose are relatively down-to-earth and curious, and they have the opportunity to do research here.
Host: In 2021, High-Flyer was among the first batch of companies in the Asia-Pacific region to obtain A100 graphics cards. Why were you earlier than some cloud service providers?
Liang Wenfeng: We started pre-research, testing, and planning for new cards very early. As for some cloud service providers, as far as I know, their earlier demand was scattered. It was not until 2022, when the autonomous driving field needed to rent machines for training and was able to pay, that some cloud service providers began to build the infrastructure. It is difficult for large companies to do research and training for its own sake. They are driven more by business needs.
Host: How do you view the competition landscape of large models?
Liang Wenfeng: Large companies definitely have advantages, but if they cannot apply them quickly, they may not be able to persevere, because they have a greater need to see results. Some leading start-up companies also have solid technology, but like the previous wave of AI start-ups, they all face the problem of commercialization.
Host: Some people think that a quantitative fund emphasizing its work in AI is hyping up other businesses.
Liang Wenfeng: In fact, our quantitative fund has basically stopped raising funds externally.
Host: How do you distinguish between AI believers and speculators?
Liang Wenfeng: Believers were here before and will still be here in the future. They are more likely to buy cards in bulk or sign long-term agreements with cloud service providers than to rent short-term.
05 R&D of the V2 Model: All by Talent in China
Host: Jack Clark, the former policy director of OpenAI and co-founder of Anthropic, believes that DeepSeek has hired "a group of mysterious and brilliant talents". What kind of people made DeepSeek V2?
Liang Wenfeng: There are no mysterious and brilliant talents. They are all fresh graduates from top universities, fourth-year and fifth-year doctoral interns who haven't graduated yet, and some young people who graduated only a few years ago.
Host: Many large-model companies are persistent in recruiting talent from outside China. Many people think that the top 50 people in this field may not be in Chinese companies. Where do your people come from?
Liang Wenfeng: No one who worked on the V2 model returned from overseas. They are all homegrown in China. The top 50 people may not be in China, but maybe we can cultivate such people ourselves.
Host: How did this MLA innovation happen? I heard that the idea originally came from the personal interest of a young researcher?
High-Flyer proposed a brand-new MLA (multi-head latent attention) architecture, which reduced GPU memory usage to 5%-13% of that of MHA, the most commonly used architecture in the past.
Liang Wenfeng: After summarizing some of the mainstream patterns in how the Attention architecture has evolved, he had a sudden idea to design an alternative. However, it was a long process from idea to implementation. We formed a team for this, and it took several months to get it running.
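To give a feel for where a memory reduction of this kind can come from, here is a minimal back-of-the-envelope sketch. It assumes the commonly described idea behind latent attention: instead of caching a full key and value vector for every head, the model caches one compressed latent vector per token (plus, optionally, a small positional key) and reconstructs keys and values from it. The function names and all dimensions below are hypothetical and chosen only for illustration; they are not DeepSeek's actual configuration, and the interview itself does not describe the mechanism.

```python
def kv_cache_elems_mha(n_heads: int, head_dim: int) -> int:
    """Cached elements per token per layer for standard multi-head attention:
    one key vector and one value vector for every head."""
    return 2 * n_heads * head_dim


def kv_cache_elems_latent(latent_dim: int, rope_dim: int = 0) -> int:
    """Cached elements per token per layer when keys and values are rebuilt
    from one shared compressed latent vector, plus an optional small
    positional key that is stored uncompressed."""
    return latent_dim + rope_dim


# Hypothetical dimensions, for illustration only.
mha = kv_cache_elems_mha(n_heads=32, head_dim=128)            # 8192 elements
latent = kv_cache_elems_latent(latent_dim=512, rope_dim=64)   # 576 elements

ratio = latent / mha
print(f"latent cache is {ratio:.1%} of the MHA cache")  # 7.0%
```

With these made-up sizes the latent cache is about 7% of the MHA cache; the real ratio depends entirely on the model's actual head count and latent width.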
Host: The birth of this kind of divergent inspiration is closely related to your thoroughly innovative organizational structure. During the High-Flyer era, you rarely assigned goals or tasks from the top down. But for frontier AGI exploration, which is full of uncertainty, have there been more management actions?
Liang Wenfeng: At DeepSeek, everything is bottom-up. We generally do not pre-assign a division of labor; it emerges naturally. Everyone has their own unique growth experience and comes with their own ideas, so there is no need to push them. During exploration, when they encounter problems, they spontaneously pull others in to discuss. However, when an idea shows potential, we do allocate resources from the top down.
Host: I heard that DeepSeek is very flexible in mobilizing cards and personnel.
Liang Wenfeng: There is no upper limit on the cards and people each of us can mobilize. If you have an idea, you can call on the training cluster's cards at any time without approval. At the same time, because there are no hierarchies or cross-departmental barriers, anyone can flexibly call on anyone else, as long as the other party is interested.
Host: Such a loose management style also depends on having selected a group of people driven by strong passion. I heard that you are very good at recruiting based on details, so that people who excel by non-traditional measures can be selected.
Liang Wenfeng: Our recruitment criteria have always been passion and curiosity. So many of our people have unusual experiences, which are very interesting. Many of them care far more about doing research than about money.
Host: The Transformer was born in Google's AI lab, and ChatGPT was born at OpenAI. What do you think are the differences in the value of innovation between the AI labs of large companies and a start-up company?
Liang Wenfeng: Whether it is Google's lab, OpenAI, or even the AI labs of large Chinese companies, they are all very valuable. That OpenAI was the one to finally make it also involved an element of historical contingency.
06 Routines Are the Products of the Previous Generation and May Not Hold in the Future
Host: Is innovation largely a matter of chance? I see that there are doors that can be pushed open at will on both sides of the row of meeting rooms in the middle of your office area. Your colleagues said that this is to leave room for chance. In the birth of the Transformer, there was a story where someone who happened to pass by heard about it and joined in, and finally turned it into a general framework.
Liang Wenfeng: I think innovation is first and foremost a matter of belief. Why is Silicon Valley so innovative? First of all, because it dares to try. When ChatGPT came out, the whole industry in China lacked the confidence to do frontier innovation. From investors to large companies, everyone thought the gap was too large and that it was better to focus on applications. But innovation first requires self-confidence. This kind of confidence is usually more evident in young people.
Host: But you don't participate in financing and rarely speak out externally. Surely you have less social influence than those companies that are active in financing. How do you ensure that DeepSeek is the first choice for people who want to work on large models?
Liang Wenfeng: Because we are doing the most difficult things. What attracts top talents the most is definitely to solve the most difficult problems in the world. In fact, top talents in China are underestimated. Because there is too little hardcore innovation at the social level, they do not have the opportunity to be recognized. We are doing the most difficult things, which is attractive to them.
Host: In OpenAI's previous release, GPT-5 did not come out as expected. Many people think that the technology curve is obviously slowing down, and many people have begun to question the Scaling Law. What do you think?
Liang Wenfeng: We are rather optimistic. The whole industry seems to be in line with expectations. OpenAI is not a god and cannot always be in the lead.
Host: How long do you think it will take to achieve AGI? Before the release of DeepSeek V2, you released models for code generation and mathematics, and also switched from dense models to MoE. So what are the coordinates of your AGI roadmap?
Liang Wenfeng: It may be 2 years, 5 years, or 10 years. In short, it will be achieved in our lifetime. As for the roadmap, there is no unified opinion even within our company. But we have indeed bet on three directions. One is mathematics and code, the second is multi-modality, and the third is natural language itself. Mathematics and code are natural testing grounds for AGI. They are a bit like Go, a closed and verifiable system, in which it is possible to achieve very high intelligence through self-learning. On the other hand, multi-modality and learning in the real human world may also be necessary for AGI. We are open to all possibilities.
Host: What do you think the end state of large models will be?
Liang Wenfeng: There will be specialized companies providing basic models and basic services, and there will be a long chain of professional division of labor. More people will meet the diverse needs of the whole society on this basis.
Host: In the past year, there have been many changes in China's large-model start-ups. For example, Wang Huiwen, who was very active at the beginning of last year, withdrew midway, and the companies that joined later have also begun to diverge.
Liang Wenfeng: Wang Huiwen took on all the losses himself and let others get out safely. He made a choice that was most unfavorable to himself but good for everyone, so he is very honest and I admire him for that.
Host: Where do you put most of your energy now?
Liang Wenfeng: I mainly focus on researching the next-generation large models. There are still many unsolved problems.
Host: Other large-model start-up companies insist on "wanting both". After all, technology will not bring permanent leadership, and it is also important to seize the time window to turn technological advantages into products. Does DeepSeek dare to focus on model research because its model capabilities are not yet strong enough?
Liang Wenfeng: All routines are the products of the previous generation and may not hold in the future. Discussing the future profit models of AI with the business logic of the Internet is like discussing General Electric and Coca-Cola when Ma Huateng started his business. It is likely to be an act of "marking the boat to find the sword" (a metaphor for sticking to rigid rules without considering changing circumstances).
Host: In the past, High-Flyer had a strong technological and innovative gene and grew relatively smoothly. Is this the reason why you are optimistic?
Liang Wenfeng: To some extent, High-Flyer has strengthened our confidence in technology-driven innovation, but it has not been all smooth sailing. We have gone through a long process of accumulation. What the outside world sees is the part of High-Flyer after 2015, but in fact, we have been working on it for 16 years.
Host: Returning to the topic of original innovation. Now that the economy is entering a downward trend and capital is entering a cold cycle, will it bring more constraints to original innovation?
Liang Wenfeng: I don't think so. The adjustment of China's industrial structure will rely more on hardcore technological innovation. When many people find that making quick money in the past was likely due to the luck of the times, they will be more willing to bend down and do real innovation.
Host: So you are also optimistic about this?
Liang Wenfeng: I grew up in a fifth-tier city in Guangdong in the 1980s. My father was a primary-school teacher. In the 1990s, there were many opportunities to make money in Guangdong. At that time, many parents came to my home, basically believing that studying was useless. But looking back now, that view has changed, because making money is no longer easy, and even the chance to drive a taxi may be gone. It has changed in just one generation.
There will be more and more hardcore innovation in the future. It may not be easily understood now because the whole social group needs to be educated by facts. When this society enables hardcore innovators to achieve success and fame, the group mentality will change. We just need more facts and a process.
07 More Investment Doesn't Necessarily Lead to More Innovation
Host: DeepSeek currently has the idealistic temperament of the early days of OpenAI and is also open-source. Will you choose to go closed-source in the future? Both OpenAI and Mistral have gone through the process of shifting from open source to closed source.
Liang Wenfeng: We won't go closed-source. We believe that it's more important to first establish a powerful technological ecosystem.
Host: Do you have any financing plans? According to some media reports, High-Flyer has a plan to spin off DeepSeek for independent listing. AI start-up companies in Silicon Valley will ultimately, more or less, be tied to large corporations.
Liang Wenfeng: We don't have any financing plans in the short term. The problem we're facing has never been about money, but rather the embargo on high-end chips.
Host: Many people think that working on AGI and doing quantitative trading are two completely different things. Quantitative trading can be carried out quietly, but AGI may require more fanfare and the formation of alliances, which can increase your investment.
Liang Wenfeng: More investment doesn't necessarily generate more innovation. Otherwise, large companies could monopolize all innovation.
Host: You're not currently developing applications. Is it because you lack the genes for operation?
Liang Wenfeng: We believe that the current stage is a period of explosive technological innovation, rather than a period of explosive application development. In the long run, we hope to create an ecosystem where the industry can directly utilize our technology and output. We will only be responsible for basic model development and frontier innovation, and other companies can build to-B and to-C businesses on the basis of DeepSeek. If a complete industrial upstream and downstream can be formed, there will be no need for us to develop applications ourselves. Of course, if necessary, we have no obstacles in developing applications, but research and technological innovation will always be our top priority.
Host: But if someone is choosing an API, why should they choose DeepSeek instead of a large company?
Liang Wenfeng: The future world is likely to be characterized by professional division of labor. Basic large models require continuous innovation, and large companies have their own capacity limitations, so they may not necessarily be suitable.
Host: But can technology really create a significant gap? You've also said that there are no absolute technological secrets.
Liang Wenfeng: There are no technological secrets, but rebuilding the technology takes time and money. Theoretically, NVIDIA's graphics cards have no technological secrets and can be easily replicated. However, reorganizing the team and catching up with the next-generation technology both take time, so the actual moat is still quite wide.
Host: After you reduced the price, ByteDance was the first to follow suit, indicating that they still felt a certain threat. What do you think of the new solutions for start-up companies to compete with large companies?
Liang Wenfeng: To be honest, we don't really care about this. It was just something we did on the side. Providing cloud services is not our main goal. Our goal is still to achieve AGI.
Currently, we haven't seen any new solutions, and large companies don't have an obvious advantage either. Large companies have existing users, but their cash-flow businesses are also a burden, making them vulnerable to being subverted at any time.
Host: What do you think of the final outcome for the six large-model start-up companies other than DeepSeek?
Liang Wenfeng: Probably two to three of them will survive. They are all still in the money-burning stage. So, those with a clear self-positioning and more refined operations have a greater chance of survival. Other companies may undergo a complete transformation. Valuable things won't vanish completely, but they will exist in a different form.
Host: During the High-Flyer era, your attitude towards competition was described as "going your own way", and you rarely cared about horizontal comparisons. What is the starting point of your thinking about competition?
Liang Wenfeng: I often think about whether something can improve the operational efficiency of society, and whether you can find a position in its chain of industrial division of labor that plays to your strengths. As long as the end result is to make society more efficient, it is valid. Many things in the middle are just stage-specific. Over-focusing on them will definitely lead to confusion.
08 Innovation Is Self-Generated, Not Deliberately Arranged, and Certainly Not Taught
Host: How is the recruitment progress of the DeepSeek team?
Liang Wenfeng: The initial team has been assembled. In the early stage, due to a shortage of manpower, we will temporarily second some people from High-Flyer. We started the recruitment process at the end of last year when ChatGPT3.5 became popular. However, we still need more people to join.
Host: Talent for large-model start-ups is also scarce. Some investors say that many suitable candidates may only be in the AI labs of giants such as OpenAI and Facebook AI Research. Will you recruit such talent from overseas?
Liang Wenfeng: If you pursue short-term goals, it's reasonable to look for people with ready-made experience. But if you look at the long term, experience is not that important. Basic capabilities, creativity, and passion are more crucial. From this perspective, there are quite a few suitable candidates in China.
Host: Why is experience not that important?
Liang Wenfeng: It doesn't follow that only someone who has done this job before can do it. One of High-Flyer's recruitment principles is to focus on capabilities rather than experience. Our core technical positions are mainly filled by fresh graduates and those who graduated one or two years ago.
Host: In innovative business, do you think experience is a hindrance?
Liang Wenfeng: When doing something, someone with experience will tell you without hesitation how it should be done. But someone without experience will grope around repeatedly, think carefully about how to do it, and then find a solution that suits the current actual situation.
Host: High-Flyer entered the industry as an outsider with no financial genes at all and became a leading player within a few years. Is this recruitment rule one of the secrets to its success?
Liang Wenfeng: Our core team, including myself, didn't have quantitative trading experience at the beginning. This is quite special. It can't be said to be the secret to success, but it's one of the cultures of High-Flyer. We don't deliberately avoid people with experience, but we focus more on their capabilities.
Take the sales position as an example. Our two main salespeople are both newcomers to this industry. One used to be engaged in foreign trade of German mechanical products, and the other used to write code in the back office of a securities firm. When they entered this industry, they had no experience, resources, or accumulation.
And now we may be the only large private equity firm that mainly relies on direct sales. Doing direct sales means not having to share fees with middlemen. With the same scale and performance, the profit margin is higher. Many companies have tried to imitate us, but they haven't succeeded.
Host: Why haven't many companies succeeded in imitating you?
Liang Wenfeng: Because relying on this alone is not enough to make innovation happen. It needs to be matched with the company's culture and management. In fact, our two salespeople couldn't achieve anything in the first year and only started to achieve some results in the second year. But our assessment criteria are quite different from those of ordinary companies. We don't have KPIs, nor do we have so-called tasks.
Host: Then what are your assessment criteria?
Liang Wenfeng: Unlike ordinary companies that focus on the number of customer orders, we don't determine in advance how much our salespeople sell and how much commission they get. Instead, we encourage salespeople to expand their own circles, get to know more people, and have a greater influence. Because we believe that an honest salesperson who earns the trust of customers may not be able to get customers to place orders in a short time, but can make you feel that he is a reliable person.
Host: After selecting the right person, how do you help him get into the right state?
Liang Wenfeng: Give him important tasks and don't interfere with him. Let him figure out the solutions and give full play to his abilities. In fact, it's very difficult to imitate a company's genes. For example, when recruiting people without experience, it's not easy to directly imitate how to judge their potential and how to help them grow after they are recruited.
Host: What do you think are the necessary conditions for building an innovative organization?
Liang Wenfeng: Our conclusion is that innovation requires as little intervention and management as possible, giving everyone the space to freely express themselves and the opportunity to make mistakes. Innovation often occurs spontaneously, not through deliberate arrangement, and certainly not through teaching.
Host: This is an unconventional management method. In this case, how do you ensure that a person works efficiently and in the direction you want?
Liang Wenfeng: Ensure that the values are consistent when recruiting people, and then ensure that everyone is in step through the corporate culture. Of course, we don't have a written corporate culture because all written things will hinder innovation. More often, it's the managers who set an example. How you make decisions when facing something will become a kind of criterion.
Host: Do you think that in this wave of competition in large-model development, a more innovative organizational structure for start-up companies will be the breakthrough point for competing with large companies?
Liang Wenfeng: According to the methodological principles in textbooks, the things that start-up companies are doing now seem like they won't be able to survive. But the market is changing. The real decisive force is often not some existing rules and conditions, but the ability to adapt to and adjust to changes. The organizational structures of many large companies can no longer respond quickly and act fast. Moreover, they are easily bound by previous experience and inertia. Under this new wave of AI, a group of new companies will definitely emerge.
Host: What excites you the most about doing this?
Liang Wenfeng: To figure out whether our conjectures are true. If they are, we will be very excited.
Host: What are the must-meet conditions for recruiting people for large-model development this time?
Liang Wenfeng: Passion and solid basic capabilities. Other things are not that important.
Host: Are such people easy to find?
Liang Wenfeng: Their passion usually shows. Because they really want to do this, these people often seek you out at the same time.
Host: Working on large models may be a never-ending investment. Does the cost involved concern you?
Liang Wenfeng: Innovation is expensive and inefficient, and sometimes it comes with waste. So, innovation can only occur when the economy has developed to a certain extent. When an economy is very poor, or in an industry that is not driven by innovation, cost and efficiency are extremely crucial. Look at OpenAI, which also spent a huge amount of money before achieving results.
Host: Do you think you're doing something crazy?
Liang Wenfeng: I'm not sure if it's crazy, but there are many things in this world that can't be explained by logic. Just like many programmers, who are crazy contributors to the open-source community. Even when they are very tired after a day's work, they still contribute code.
Host: There is a kind of spiritual reward in it.
Liang Wenfeng: It's like when you hike 50 kilometers. Your whole body may be exhausted, but you feel very satisfied spiritually.
Host: Do you think the madness driven by curiosity can last forever?
Liang Wenfeng: Not everyone can be crazy for a lifetime. But for most people, when they are young, they can devote themselves to doing something completely without utilitarian purposes.