A look at Mistral.ai’s economics
Mistral.ai (the “Company” or “Mistral”) has made a global impact since it was founded one year ago. Just recently, Sequoia cited it as one of their “Top 50 AI Companies of the Future”. CB Insights ranked the company in its “Top 100 AI companies for 2024” and noted that the $544m it has raised places it among the Top 10 AI companies in terms of equity funding. Finally, The Information reports that Mistral is raising “several hundred million dollars at a valuation of $5 billion.”
Even in a tech sector that has seen its share of hype and investment excess over several decades, the level of funding and valuation we are seeing for GenAI companies is extraordinary. But even if this type of valuation for such young companies is unprecedented, it doesn’t necessarily mean that it’s irrational. The economics still have to be figured out, and this gets even more complicated when it comes to open-source models like Mistral’s.
To evaluate that, we need to look more closely at Mistral’s economics to understand how much of this investment can be justified — and how big the risk actually is.
So, using what we know publicly about Mistral AI’s business, I thought it would be instructive to break down its business model to see what we can know and what questions investors should be asking.
Let’s start with a brief description of the company.
Mistral.ai is a French company operating in the space of genAI [Foundation Models]. It develops and commercializes open-source Large Language Models (LLMs) for enterprise-targeted use cases. The term “open source” refers to the LLM code and underlying architecture being accessible to the public, meaning developers and researchers are free to use, improve or otherwise modify the model.
The Company sells both large and small models. Its small 8x7B model is a leading one within the developer community. Unlike their larger counterparts, small language models (SLMs) are designed to serve more specific, niche purposes within an enterprise.
I believe smaller models will have strong momentum in the coming months and years with the development of domain-specific genAI at the application level (e.g., in healthcare, financial services, etc.). Hence, there will be a need for application-specific model derivatives that are small in size, from a few MB to 100s of MB. Mistral is well positioned on that front.
Nor is there any doubt about the overall quality of their models.
It is clear that competition is going to keep increasing in the open-source LLM space:
· First, Meta’s recent Llama 3 release will likely put pressure on Mistral;
· Second, the closed race many had predicted (in the sense that there would be just a few foundational GenAI companies) is unfolding differently, with new open-source LLMs burgeoning, including from software firms: Databricks and Snowflake both released “cheap” LLMs this month.
With that context, let’s dig into their economics, both today and in the long term:
Quality of Revenue
As with many other LLM providers, Mistral offers pay-as-you-go or “usage” based pricing for its APIs, based on the volume of input and output tokens. Tokens are essentially sequences of textual characters that LLMs convert into numeric representations, enabling the efficient processing of language. Prices are per million tokens and vary by model complexity: they range from $0.25/1M tokens to $2/1M tokens across their open-source models. Costs are significantly higher for their optimized models (up to $8/1M tokens for Mistral Large).
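To make this pricing concrete, here is a minimal sketch of what a usage-based bill looks like, using the per-million-token figures cited above. The tier names and the single blended rate are illustrative assumptions, not Mistral’s official API (real pricing may distinguish input from output tokens).

```python
# Hypothetical cost estimate for usage-based LLM API pricing.
# Prices per 1M tokens are taken from the figures in the text;
# the tier names and blended rate are illustrative assumptions.

PRICE_PER_1M_TOKENS = {            # USD per million tokens
    "open-source-small": 0.25,
    "open-source-large": 2.00,
    "mistral-large": 8.00,
}

def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload, assuming one blended per-token
    rate for input and output tokens combined."""
    rate = PRICE_PER_1M_TOKENS[model]
    return (input_tokens + output_tokens) / 1_000_000 * rate

# Example: a workload of 10M input + 5M output tokens
print(api_cost("open-source-small", 10_000_000, 5_000_000))  # 3.75
print(api_cost("mistral-large", 10_000_000, 5_000_000))      # 120.0
```

The 30x spread between the cheapest open-source tier and Mistral Large for the same workload is what makes the mix of small vs. large model usage so central to revenue quality.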
Revenue comes from API tokens, as the open-source models themselves are free.
Mistral also charges professional services fees, which is consistent with its enterprise positioning. There is little transparency here, of course, but total amounts charged can apparently be high.
API token prices will likely drop as competition intensifies among open-source models (619,047 models on Hugging Face as of today!) and across both large and small models. Though Mistral’s revenue needs to be first and foremost driven by token revenue to be considered high-quality (vs. non-recurring professional services), that revenue is partly at risk from potential price decreases.
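The margin risk from price erosion can be made concrete with some back-of-the-envelope arithmetic: since token revenue is price times volume, a price decline must be offset by disproportionate volume growth just to keep revenue flat. The figures below are purely illustrative, not Mistral data.

```python
# Back-of-the-envelope: how much token-volume growth is needed to keep
# revenue flat when per-token prices erode. Illustrative figures only.

def volume_growth_needed(price_decline: float) -> float:
    """Fractional volume growth required to offset a fractional price
    decline, holding revenue = price * volume constant."""
    return 1 / (1 - price_decline) - 1

for decline in (0.25, 0.50, 0.75):
    growth = volume_growth_needed(decline)
    print(f"{decline:.0%} price drop -> {growth:.0%} more volume needed")
```

A 50% price cut requires a doubling of token volume, and a 75% cut requires 4x the volume, which is why sustained price erosion puts token-driven revenue models under pressure unless usage grows very fast.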
Though we understand that today the vast majority of its customers use the smaller models, Mistral may try to migrate part of its customer base to its larger commercial models. That, however, would put it in direct competition with OpenAI’s GPT-4 and Anthropic’s Claude, raising yet other challenges.
Quality of Growth
Mistral develops and sells LLMs but has no other major revenue streams (perhaps none at all?) apart from the professional services attached to the models.
Many LLM players indirectly monetize their LLMs (and grow their revenues) through other products within their respective companies (e.g., cloud revenue for AWS, Microsoft, etc.; ad revenue for Meta). This limitation on growth at the company level is not a problem in itself, but it must be considered when Mistral is compared to LLM providers that are part of a larger ensemble. It also has implications for access to compute power (see below).
Quality of Margins
The key margin driver for LLMs is computing cost. Though computational costs vary with the size and complexity of models (and so vary across Mistral’s different LLMs), they are critical considerations in building a winning economic engine. Mistral must obtain compute at pricing conditions similar to those of the major LLM providers; compute pricing should in no case become an unfair disadvantage for Mistral. Of course, the actual availability of compute is also a critical factor in the company’s viability, and restricted access to GPUs would constitute a black swan event. The Company’s partnership with Microsoft would certainly help prevent this.
**
From my analysis, it’s clear that Mistral holds a unique position as a Tier 1 LLM provider, spearheading the advancement of open-source models. It is now in direct competition with Llama 3 and offers models of various sizes, including small ones with promising growth prospects. This strategic positioning aligns well with the boom in genAI vertical applications.
My questions at this stage revolve more around the viability of the company’s economic engine. Token prices should erode as competition continues to increase, putting pressure on margins. Future compute costs, over which the company has little or no control, further compound this. Both could put the company at high risk of margin compression.
A path forward rests on building strong (niche?) expertise around targeted horizontal and vertical use cases for enterprises, with a strong service proposition through which the company can create a defensible moat. If national pride (i.e., French sovereignty) helps in striking significant deals, this should be “normalized” when forming a view of the robustness of the company’s business model.