The launch of ChatGPT in late November 2022 made 2023 a turning point for the field, triggering a competitive race among AI companies and tech giants to dominate the expanding market for large language model (LLM) applications. Despite the prevailing trend toward private models, the open-source LLM ecosystem surged over the year, challenging the dominance of proprietary services.
Before 2023, the prevailing belief in the industry was that bigger models meant better LLM performance. Models like BLOOM and OPT, which, like OpenAI's GPT-3, weighed in at roughly 175 billion parameters, exemplified this approach. In February 2023, however, Meta introduced Llama, a family of models ranging from 7 to 65 billion parameters. Llama showed that smaller language models trained on a substantially larger corpus of data could rival the performance of much bigger counterparts, shifting the emphasis from sheer parameter count to training data.
Llama's success stemmed from its ability to run on a single GPU, or a small handful of them, and from its openly released weights. Its release spurred a wave of open-source LLMs, including Cerebras-GPT, Pythia, MPT, XGen, Falcon, and others. The open-source momentum grew when Meta released Llama 2 in July, which became the foundation for many derivative models. Mistral AI's Mixtral, in particular, drew praise for its capabilities and cost-effectiveness.
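To make the single-GPU claim concrete, here is a minimal sketch of loading a 7-billion-parameter open-weight chat model with the Hugging Face transformers library. The specific checkpoint, precision, and prompt are illustrative assumptions rather than details from the article, and gated models such as Llama 2 additionally require accepting the license on the Hub.

```python
# Minimal sketch: running a ~7B open-weight model on a single GPU.
# Assumes `transformers`, `accelerate`, and `torch` are installed; the checkpoint
# below is an illustrative choice, not one prescribed by the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps a 7B model near ~14 GB of VRAM
    device_map="auto",          # place the weights on the available GPU(s)
)

prompt = "In one paragraph, explain why open-weight LLMs gained traction in 2023."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```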
Data from Hugging Face, a machine learning model hub, revealed the rapid growth of open-source LLMs. Developers created thousands of forks and specialized versions of models like "Llama," "Mistral," and "Falcon." Mixtral, despite its recent release, already served as the basis for 150 projects. This openness not only facilitated the creation of new models but also let developers combine them for greater versatility in practical applications.
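For a rough sense of how such figures can be gathered, the sketch below uses the huggingface_hub client to count publicly listed models whose names mention a given base model and to list the most downloaded ones. The keywords and the name-search approach are assumptions for illustration, not the methodology behind the numbers cited above.

```python
# Sketch: surveying derivative models on the Hugging Face Hub by name search.
# Requires `pip install huggingface_hub`; the keywords are illustrative assumptions.
from huggingface_hub import list_models

# Count models whose names match each keyword (paginates the Hub API; can be slow).
for keyword in ("llama", "mistral", "falcon", "mixtral"):
    count = sum(1 for _ in list_models(search=keyword))
    print(f"{keyword}: {count} models match by name")

# The five most downloaded models matching one family, as a view of popular derivatives.
for m in list_models(search="mixtral", sort="downloads", direction=-1, limit=5):
    print(m.id, m.downloads)
```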
As proprietary models advanced, the open-source community kept its relevance, and tech giants recognized its potential. Microsoft, OpenAI's primary backer, released the weights of its small Orca 2 and Phi-2 models, while Amazon introduced Bedrock, a cloud service hosting both proprietary and open models. The trend points to a growing recognition of the open-source ecosystem's role in the future of LLMs.
Despite widespread experimentation with closed model APIs, reliance on external services poses significant risks to data privacy and security. The open-source ecosystem therefore offers a distinct proposition for businesses that want to adopt generative AI while keeping control over privacy, security, and compliance. As the landscape evolves, the open-source approach may become an integral part of how companies embrace generative AI.