At Google I/O 2024, Google announced significant updates to its flagship Gemini model, along with the introduction of new model variations tailored to different use cases.
Gemini 1.5 Pro, the current flagship version, is now available to all developers globally and to consumers through the Gemini Advanced plan in 35 languages. The model has received several improvements since its initial rollout, particularly in translation, coding, and reasoning, based on early user feedback. Gemini 1.5 Pro is multimodal, able to reason over text, images, audio, and video in prompts. Notably, its context window has been expanded to 2 million tokens, enabling it to handle very large inputs in a single prompt.
In addition to the flagship model, Google introduced Gemini 1.5 Flash, a smaller, lighter-weight model optimized for speed and efficiency in high-frequency, low-latency tasks. Despite its smaller size, Flash retains the same long context window and multimodal reasoning capabilities as 1.5 Pro.
Google also unveiled PaliGemma, an open vision-language model that takes both image and text inputs and can generate image captions, labels, and detailed answers to questions about images. Additionally, Gemma 2, the next generation of Google's lightweight open model family, is designed to be more efficient for developers and businesses with limited infrastructure. At 27 billion parameters, Gemma 2 is claimed to outperform models more than twice its size and can run on a single TPU host through Vertex AI.
Gemini 1.5 Flash and Gemini 1.5 Pro are both slated for general availability in June, giving developers versatile options to suit their specific needs. PaliGemma is available now, and Gemma 2 is set to launch in June, further expanding Google's range of AI offerings.