Mistral AI, the Parisian rival of OpenAI, just closed a $415 million funding round at a $2B valuation, after releasing an open source MoE (Mixture of Experts) model that beats GPT-3.5 and SOTA 70B models like Llama 2 70B on most benchmarks. This is an impressive achievement for an 8-month-old startup. I have to admit that I was a bit skeptical about Mistral when they first raised hundreds of millions in VC funding without any product. But now I am impressed. That’s why I decided to take a closer look at their strategy.
Mistral AI, a Paris-based OpenAI rival, closed its $415 million funding round
I was able to get my hands on the strategy memo Mistral wrote before raising VC money the first time. In this article, I will comment on and analyze it. You can find the link to the original memo at the end of this article.
Limiting factors and limitations
The limiting factor of generative AI (genAI) is the limited number of researchers able to train and deploy foundation models. It is the scarce resource at the moment and I do agree with Mistral about how crucial it is to assemble a team of such researchers.
I also agree that genAI will boost productivity in the next decade. It is one of the reasons I founded Rimbaud Inc, a company whose purpose is to explore how AI can help increase human productivity. Software engineers are already experiencing productivity gains by using tools like GitHub Copilot. My personal experience is in line with that observation too. When using ChatGPT to code part of a Python package or create a web application using Ruby on Rails, I am able to quickly grasp new concepts, approaches and design patterns, even though I was not aware of them before. The only thing I do is ask ChatGPT good questions. This is akin to a pure Socratic experience. It doesn’t matter what you don’t know, as long as you know which questions will lead you to the knowledge you need.
Using the analysis mode of GPT-4 can also help me quickly get some data analysis and visualization ideas based on a full or sample CSV dataset. But there are some limitations. ChatGPT, GitHub Copilot and similar tools are very effective if you are already an expert, or at least have an intermediate to advanced level in the domain you use the tool for. Why? Because LLMs hallucinate, and the only way to prevent that from harming your productivity is to be able to identify the good answers and reject the bad ones, including answers that look correct but have subtle deficiencies. Let’s take an example. As a software engineer, you can ask ChatGPT to write a Python class to solve a specific problem you have. But the solution you get from ChatGPT or GitHub Copilot might fail to abide by the SOLID principles, making your code hard to extend and maintain over time.
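To make the SOLID point concrete, here is a minimal sketch in Python. Every name in it (ReportGenerator, parse_values, FORMATTERS, report) is hypothetical and chosen for illustration; it is not actual ChatGPT or Copilot output. The first class is the kind of answer that works but tangles parsing, computation and formatting together, so each new output format forces you to edit it. The second version is one possible refactoring, under the assumption that new formats are the most likely change.

```python
import json
from typing import Callable, Dict, List


# Hypothetical "first draft": it works, but one class parses, computes and
# formats. Supporting a new format means adding another branch here.
class ReportGenerator:
    def __init__(self, raw: str) -> None:
        self.raw = raw

    def run(self, fmt: str) -> str:
        values = [float(x) for x in self.raw.split(",")]
        mean = sum(values) / len(values)
        if fmt == "text":
            return f"mean={mean:.2f}"
        elif fmt == "json":
            return json.dumps({"mean": round(mean, 2)})
        raise ValueError(f"unknown format: {fmt}")


# Refactored sketch: each function has a single responsibility, and formats
# are registered in a dictionary instead of being hard-coded as branches.
def parse_values(raw: str) -> List[float]:
    return [float(x) for x in raw.split(",")]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


FORMATTERS: Dict[str, Callable[[float], str]] = {
    "text": lambda m: f"mean={m:.2f}",
    "json": lambda m: json.dumps({"mean": round(m, 2)}),
}


def report(raw: str, fmt: str) -> str:
    try:
        formatter = FORMATTERS[fmt]
    except KeyError:
        raise ValueError(f"unknown format: {fmt}") from None
    return formatter(mean(parse_values(raw)))


# Adding a format no longer requires touching existing code.
FORMATTERS["csv"] = lambda m: f"mean\n{m:.2f}"
```

Both versions return the same result for the formats they share. The difference only shows up later, when the code has to grow, and that is exactly the kind of deficiency a non-expert is unlikely to spot in a generated answer.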
When it comes to coding, and even article writing, LLMs are thus good for building demos, prototypes and parts of articles. They fail miserably when they have to create something complex, like a full Python package with different subpackages and modules, or a book. Some of it is due to the context length limitation, and some of it to hallucinations and the autoregressive nature of LLMs. So, in that sense, we can conclude that LLMs are good at improving human productivity on low complexity tasks but are not yet useful for high complexity tasks that require maintaining coherence between several abstractions.
If you use LLMs for coding without a good command of the programming language or framework you are using, you risk falling into an endless loop of bugs that you don’t understand. You will spend too much time trying to fix bugs without understanding what is really going on, and the LLM will often suggest solutions that introduce other unanticipated bugs. This downward spiral reduces your productivity. Find below a graphical representation of the framework I created to understand how the productivity gains offered by current LLMs vary with the user’s expertise level and the complexity of the task at hand. The true productivity boost happens for low complexity tasks and users with a high level of expertise.
Variation of productivity gains with user expertise level and task complexity
The challenge in the coming years will be to make AI help people increase their productivity on tasks of low to high complexity, whether the human in the loop has a low or high level of expertise.
That means having AI systems that can adapt the depth and breadth of their responses to the user’s skill level, and do so without sacrificing factuality.
LLMs: The Noise And The Value
With open source models quickly catching up with the closed source ones, I will even go further and say that true value will be captured by companies serving these models effectively through an API. And the biggest winners of all may be GPU providers (NVIDIA and co.) and cloud computing platforms (Google, AWS, Azure, etc.), because ultimately, everyone will have to use their services or rely on someone who uses their services. It is a good time to be NVIDIA these days…
Evolution of NVIDIA's stock price over the last 5 years
The increased competition between model providers is already driving inference costs per token down, which is good for people building products on top of LLMs and maybe for the model providers themselves as they are forced to improve their unit economics. But the cutthroat competition might also drive the overall profitability of the space down.
What I found smart in Mistral’s positioning was the focus on being the “European leader in productivity and creativity enhancing AI”. As a long-time observer of the AI space, I had always wondered why there wasn’t an equivalent of OpenAI in the EU. Mistral AI’s team saw that opportunity and was able to seize it thanks to the pedigree of the founding team and its political access. It was a smart move to recruit the former French secretary of state for digital affairs as part of the founding team. These guys from Mistral (unfortunately there is no woman in the founding team) have been really strategic from the start.
In the strategy memo, Mistral argues that the closed technology approach chosen by OpenAI and its rivals (Anthropic, Cohere, etc.) does not meet market constraints. They argue that feeding business data to black box models deployed in the cloud is not safe and can cause legal problems, specifically for EU companies, which, unlike US-based companies, have to comply with GDPR rules. I agree with this point. But companies already use cloud providers for compute and data storage, so the only way to circumvent these limitations is to let companies run an instance of the model in their private cloud. That means only helping them set up the inference infrastructure (consulting) and letting their cloud provider reap all the benefits of the value created. There is no way Mistral transforms itself into a consulting company. And, unsurprisingly, that is not what they are proposing. Instead, they have recently launched their commercial offering, and their best model is now only available via an API, which is similar to what OpenAI, Cohere, Anthropic and others are doing. So Mistral is not solving the safety and data confidentiality issues they identified any differently than OpenAI.
Mistral also touts the virtues of entirely exposing the model, and it is true that they do it for their open source models. But here is the catch: they do not expose their commercial models. Finally, Mistral criticizes the fact that the data used to train OpenAI’s models is kept secret. But the same is true for the data used to train theirs. Mistral even goes as far as prohibiting the use of their models to train or improve other models or compete against them (see the tweet below).
This leads me to conclude that the criticism Mistral directed at OpenAI was more of a marketing device to raise funds. The goal was to create a wedge and differentiate themselves from the competition. Now that they have launched their commercial offering, we all realise that they are not so different from OpenAI, except for the fact that they are from Europe and that they open source some of their models, as Meta or Yi do.
So, in my opinion, the true differentiator of Mistral is its geopolitical positioning as the European equivalent of OpenAI. As AI becomes the new technological battleground, it is only natural that Europe would not want to be left behind. Mistral is well positioned to capture the enterprise market for foundation models in the EU. I may be wrong, but I now see the open source argument Mistral used as a marketing tactic that is inconsistent with the need to generate profit afterwards. After all, if all your models are open source, the only way you can make money is through API serving. But since your models are open source, you will be competing with ten or more other startups that are well versed in the art of serving API endpoints for foundation models (Hugging Face, Together AI, Replicate, etc.). It’s like giving your competitive edge to your competitors for free. It doesn’t work in the long term, because proprietary technology or a technological edge is what lets tech companies like Mistral gain outsized shares of their markets. As the pressure to generate profit increases, I expect Mistral to become more and more closed source and less and less open source.