Google launches Gemini, its biggest challenge to OpenAI

[Image: Gemini at Google I/O 2023. Credit: Google]

Google has officially launched Gemini, its multimodal AI architecture.
Gemini 1.0 will come in three different sizes: Gemini Ultra, Gemini Pro, and Gemini Nano.
Gemini Pro and Nano are rolling out today across a range of Google products.

Back in April 2023, Google revealed it was merging two of its research teams, Brain and DeepMind, into a single unit called Google DeepMind. This new team would be responsible for working on Google’s next-generation AI model, Gemini. The firm is now launching three versions of Gemini, with two being made available starting today.

In a blog post, Google officially introduced its new AI architecture, Gemini. The company describes Gemini as having state-of-the-art performance and says it was built from the ground up to be multimodal. As the company explains:

Until now, the standard approach to creating multimodal models involved training separate components for different modalities and then stitching them together to roughly mimic some of this functionality. These models can sometimes be good at performing certain tasks, like describing images, but struggle with more conceptual and complex reasoning.

We designed Gemini to be natively multimodal, pre-trained from the start on different modalities. Then we fine-tuned it with additional multimodal data to further refine its effectiveness. This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state of the art in nearly every domain.

Gemini will come in three sizes, each aimed at different needs. The largest and most capable version, Gemini Ultra, is said to be designed for highly complex tasks. Below that is Gemini Pro, which is designed to be used across a range of devices. The third version, Gemini Nano, is meant to be the most efficient model for on-device tasks. Google says it optimized three sizes for the first version of Gemini, which suggests other sizes could arrive in the future.

In terms of performance, the Mountain View-based organization claims Gemini Ultra exceeds current state-of-the-art results on 30 of 32 academic benchmarks used for LLMs. On text benchmarks, it reportedly beats OpenAI’s GPT-4 in every category except commonsense reasoning for everyday tasks.

[Image: Gemini text benchmark results table. Credit: Google]

With these improved capabilities, Google acknowledges the need for more safety measures. The company says it is adding new protections to its current AI Principles policy. It also says it has “conducted novel research into potential risk areas,” applied adversarial testing techniques, worked with “a diverse group of external experts and partners” to identify blind spots, and “built dedicated safety classifiers” to filter out violence and negative stereotypes.

As for availability, Google says Gemini 1.0 is rolling out to various products starting today. One of them is Bard, which will reportedly use a fine-tuned version of Gemini Pro. The Pixel 8 Pro is also getting Gemini Nano today, where it will power Summarize in the Recorder app and Smart Reply in Gboard for WhatsApp. Google’s Search Generative Experience is being enhanced with Gemini as well, reportedly reducing latency by 40% in English in the US.

Gemini Ultra, on the other hand, won’t be rolling out today as it is said to be undergoing “extensive trust and safety checks.” However, Google says it will make Ultra available for early experimentation to select customers, developers, and partners early next year.
