In a whirlwind of events over the past few days, the burgeoning open source AI community has been abuzz with the unveiling of a potentially groundbreaking development.
On or around January 28, a user known as "Miqu Dev" uploaded a series of files onto HuggingFace, the prominent open source AI model and code sharing platform, introducing what appears to be a novel open source large language model (LLM) dubbed "miqu-1-70b."
This revelation, still accessible on HuggingFace at the time of writing, showed that the new LLM's interaction template, listed under "Prompt format," closely resembled that of Mistral, the well-known Paris-based open source AI company behind the Mixtral 8x7b model, which many consider a leading open source LLM. Mixtral 8x7b is a sparse mixture-of-experts model that Mistral released in December 2023.
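To illustrate what observers were comparing: Mistral's instruct-tuned models expect prompts wrapped in [INST] ... [/INST] tags, and it was this template in the uploaded files that looked familiar. The Python sketch below simply reproduces that published convention; the helper function and example prompt are illustrative, not taken from the leaked files.

```python
# Sketch of the Mistral-style instruction template that observers recognized
# in the miqu-1-70b upload. The [INST] ... [/INST] convention follows Mistral's
# published prompt format; the function name and example text are hypothetical.

def build_mistral_prompt(user_message: str) -> str:
    """Wrap a user message in the [INST] ... [/INST] tags Mistral models expect."""
    return f"<s>[INST] {user_message} [/INST]"

if __name__ == "__main__":
    print(build_mistral_prompt("Summarize the key differences between open and closed LLMs."))
    # -> <s>[INST] Summarize the key differences between open and closed LLMs. [/INST]
```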
Coinciding with this upload, an anonymous individual on 4chan, potentially linked to "Miqu Dev," shared a link to the miqu-1-70b files, sparking discussions among users on the platform.
News of the model rippled through various online channels, with some enthusiasts taking to X, Elon Musk's social network, to highlight its purportedly exceptional performance on common LLM tasks, approaching that of the reigning frontrunner, OpenAI's GPT-4, on EQ-Bench.
The intrigue didn't end there; professionals in the machine learning (ML) field also took notice, with discussions unfolding on platforms like LinkedIn. Maxime Labonne, an ML scientist at JPMorgan Chase, speculated on the significance of the name "miqu," wondering whether it stood for "MIstral QUantized." Labonne's post hinted that the model could rival leading systems, GPT-4 among them, on existing benchmarks.
Speculation ran rife within the community, with conjecture swirling around whether "Miqu" was a clandestine release from Mistral itself or the result of an internal leak from an employee or client.
The mystery took a turn when Arthur Mensch, co-founder and CEO of Mistral, weighed in on X to shed light on the situation. He revealed that an over-eager employee of one of Mistral's early access customers had leaked a quantized and watermarked version of an older model the company had distributed fairly openly. Mensch clarified that the model in question had been retrained from Llama 2, with training finishing on the day of Mistral 7B's release.
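For context on what a "quantized" leak means in practice: community-shared weights of this kind typically circulate as compressed GGUF files that can be run locally through llama.cpp bindings. The sketch below uses the llama-cpp-python library under that assumption; the file name, context size, and sampling settings are hypothetical.

```python
# Minimal sketch of running a quantized GGUF checkpoint locally with
# llama-cpp-python. Assumptions: the weights are available as a GGUF file
# (hypothetical path below) and the llama-cpp-python package is installed.
from llama_cpp import Llama

llm = Llama(
    model_path="miqu-1-70b.q4_k_m.gguf",  # hypothetical local file name
    n_ctx=4096,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

prompt = "<s>[INST] In one paragraph, explain what model quantization does. [/INST]"
output = llm(prompt, max_tokens=200)
print(output["choices"][0]["text"])
```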
Despite the unauthorized dissemination, Mensch notably made no demand that the files be taken down; instead, he suggested the poster might consider attributing the model.
With Mensch's revelation, and his hint that more is on the way, the community is waiting eagerly. Read generously, his statement suggests Mistral is on the cusp of achieving, or even surpassing, GPT-4-level performance with its "Miqu" model.
Such a milestone would not only transform open source generative AI but also reverberate throughout the broader AI and computer science field. Since its release in March 2023, GPT-4 has reigned as the most capable LLM by numerous benchmarks. The emergence of an open source model at GPT-4's level could put immense competitive pressure on companies like OpenAI, particularly as enterprises increasingly gravitate toward open source or hybrid models for their applications.
As the open source AI community rapidly advances, the future of AI supremacy hangs in the balance. Will OpenAI maintain its lead, fortified by innovations like GPT-4 Turbo and GPT-4V, or will the growing momentum of open source alternatives tip the scales? The answer remains to be seen in this evolving saga of technological advancement and competition.