Cohere tutorial: How to use Coheres' new multilingual model to answer questions about your business more efficiently

VISIT

If you have a business, you will get a lot of questions from your customers. Additionally, people from different countries will ask questions in their native language. This leads to duplicated questions and a lot of work for your customer support. Wouldn't it be awesome, if you could first cluster these questions and then answer them with a single reply? Generative AI can do it for you and this is exactly what Coheres' new multilingual model does. In this tutorial we will show you how to use embeddings to answer questions about your business more efficiently.

As an example for this tutorial we imagine that we own a hotel and we want to answer questions from our customers. Questions come in all different languages and we want to answer them with a single answer in English. We will use the Coheres' new multilingual model to cluster the questions to see which questions are similar. Coheres new model is the industry’s first multilingual text understanding model that supports 100+ languages and delivers 3X better performance than existing open-source models.

This is just one usecase for Coheres' new multilingual model. For example you could use it for these three usecases (and more of course):

Multilingual Semantic Search: To improve the quality of search results, Cohere’s multilingual model can produce fast, accurate results regardless of the language used in the search query or source content.

Aggregate Customer Feedback: Cohere’s multilingual model can be deployed to organize customer feedback across hundreds of languages, simplifying a major challenge for international operations.

Cross-Lingual Zero-Shot Content Moderation: Identifying harmful content in online global communities is challenging. By training Cohere’s multilingual model with a few English examples, it can then detect harmful content in 100+ languages.

Before we start with our hotel example, let's have a quick look at how Coheres' new multilingual model works. Cohere's multilingual text understanding model uses a technique called semantic vector space mapping to position texts with similar meanings in close proximity, which enables a range of valuable use cases for multilingual settings. For example, one can map a query to this vector space during a search to locate relevant documents nearby, which often yields better search results than keyword search. To train these models, however, large quantities of training data are needed, and this data has historically been mostly available in English. Prior work has attempted to use machine translation to map this data to other languages, but this approach does not capture the nuances behind language usage in different countries. In contrast, this new model is trained on a dataset of nearly 1.4 billion question/answer pairs across hundreds of languages, which are an actual questions asked by speakers of those languages, allowing to capture language- and country-specific nuances.

Let's get started. For our hotel questions example we will use the following questions:

You can see that the questions are in different languages. English, German, French and Chinese. Upon closer inspection you can see that some of the questions are very similar. Overall 5 topic clusters can be identified:

Food

Pool

Charging Station

Theater

Breakfast

Let's automate this process of clustering with Coheres' new multlingual model. We recommend to use the playground to test the model and to get a feeling for it. You can find the playground here.

Let's add our questions to the 'Texts' field like so:

On the right side you can set parameters. Change the model to 'multilingual-22-12' and truncate to 'None'. Then click on 'Calculate'. You will see that the model has clustered the questions into 5 clusters.

Now you can see that the model has clustered the questions into 5 clusters. Points that are close to each other are similar. Now you can answer the questions with a single answer.

But let's move from Coheres Playground to our own code. You can export this current example with the button 'Export code'. Choose the programming language you want to use. We will use Python.