Cohere

Cohere MCP Connector for Claude

A+

Access Cohere AI models via API — chat with Command models, generate embeddings, rerank documents and tokenize text from any AI agent.

6 tools Official Updated Jun 28, 2026 Official Vinkius Partner

Connect your Cohere account to any AI agent and leverage enterprise-grade AI models through natural conversation.

What you can do

  • Model Discovery — List all available Cohere models with their names, capabilities and context lengths
  • Chat API — Send conversations to Command models (command-r-plus, command-r, command-r7b) and receive responses with citations and tool call support
  • Embeddings — Generate vector embeddings for semantic search with multiple embedding types (float, int8, uint8, binary)
  • Reranking — Rerank documents by relevance to a search query using Cohere's industry-leading reranking models
  • Tokenization — Tokenize and detokenize text for estimating token counts and debugging

How it works

  1. Subscribe to this server
  2. Enter your Cohere API Key
  3. Start using Cohere models from Claude, Cursor, or any MCP-compatible client

No more switching between API tools to interact with Cohere. Your AI acts as an LLM orchestration layer.

Who is this for?

  • Developers — quickly send messages to Command models, generate embeddings and rerank search results without writing HTTP code
  • ML Engineers — discover available models, compare capabilities and generate embeddings with multiple types (float, int8, binary)
  • Search Teams — rerank documents by relevance, tokenize text and generate embeddings for search index building
llmembeddingsrerankingnatural-language-processingtokenizationchat-api

6 tools expose this connector's capabilities to your AI agent.

chat

Requires the model ID (e.g. "command-r-plus", "command-r", "command-r7b") and messages array in JSON format. Each message must have a "role" ("user", "assistant", "system" or "tool") and "content" (text or array of content blocks). Optionally set max_tokens, temperature (0-1), p (nucleus sampling 0-1) and tools array for function calling. Returns the model's response with text, citations and tool calls. Send a chat message to a Cohere model

detokenize

Requires the token IDs array. Returns the reconstructed text. Useful for debugging and verifying tokenization. Detokenize token IDs back to text using Cohere

embed

Requires the model ID (e.g. "embed-v4", "embed-v3"), texts array and input_type ("search_document", "search_query", "classification", "clustering"). Returns embedding vectors for each input text. Useful for semantic search, similarity comparison and vector database storage. Generate embeddings using Cohere

list_models

Each model returns its name (e.g. "command-r-plus", "command-r", "embed-v4", "rerank-v3.5"), endpoint compatibility, context length and tokenization info. Use this to discover which models are available and their capabilities. List all available Cohere models

rerank

Requires the model ID (e.g. "rerank-v3.5", "rerank-english-v3.0"), query text and documents array. Optionally set top_n to return only the top N results. Returns ranked documents with relevance scores. Rerank documents by relevance to a query

tokenize

Requires the text to tokenize and optionally the model. Returns the list of token IDs and token strings. Useful for estimating token counts before sending to chat or embed endpoints. Tokenize text using Cohere

See how to talk to your AI agent using Cohere.

Send a message to Command R+ asking 'What is the capital of Brazil?'

Command R+ responded: 'The capital of Brazil is Brasília. It was purpose-built to replace Rio de Janeiro as the capital in 1960, and is located in the country's central-west region.'

Rerank these documents for the query 'machine learning models': ['Neural networks are inspired by biological neurons.', 'Python is a popular programming language.', 'Transformers use attention mechanisms for sequence processing.']

Reranked results: 1. 'Transformers use attention mechanisms...' (score: 0.95), 2. 'Neural networks are inspired...' (score: 0.72), 3. 'Python is a popular...' (score: 0.12). The transformer and neural network documents are most relevant to ML models.

Generate embeddings for these texts: ['The weather is nice today.', 'I love programming in Python.'] using embed-v4.

Generated embeddings for 2 texts using embed-v4 with input_type 'search_document'. Each embedding is a 1024-dimensional vector. You can use these for semantic search, similarity comparison or vector database storage.

Log in to the [**Cohere Dashboard**](https://dashboard.cohere.com/api-keys), go to **API Keys** and click **Create API Key**. Copy the key immediately — it starts with a random string and won't be shown again. Free tier includes trial access with rate limits.

Related Connectors