Language Detector Engine

Language Detector Engine MCP Connector for Claude

Detect the language of any text locally using exact n-gram analysis. Supports 400+ languages. When AI guesses wrong on short or mixed text, this engine proves it.

1 tool | Official | Updated Jun 28, 2026 | Official Vinkius Partner

Your customer support agent receives a ticket: 'O produto não chegou'. The AI routes it to the Spanish queue. The agent wastes time, the customer gets angry, SLA drops. Why? Because the AI 'guessed' the language probabilistically instead of calculating it.

This MCP uses franc (200K+ weekly downloads, inspired by Google's CLD2) to perform deterministic N-gram language detection. It returns exact ISO 639-3 codes for over 400 languages, and when a text is too short or ambiguous to classify it returns 'und' (the ISO 639-3 code for undetermined) rather than hallucinating an answer.
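
To make the idea of deterministic n-gram detection concrete, here is a minimal, self-contained Python sketch of rank-order trigram matching. It is not franc's implementation: the three toy profiles are built from a single sentence each (real profiles come from large corpora), and the names `trigrams`, `profile`, `distance`, and `detect`, the 300-point penalty, and the 10-character minimum are all assumptions made for illustration. The sketch answers "und" (ISO 639-3 for undetermined) when the input is too short to judge.

```python
from collections import Counter

def trigrams(text: str) -> list[str]:
    """Lowercase, replace non-letters with spaces, pad, and slide a 3-char window."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    padded = " " + " ".join(cleaned.split()) + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def profile(text: str, top: int = 300) -> dict[str, int]:
    """Map the most frequent trigrams to their rank (0 = most frequent)."""
    counts = Counter(trigrams(text))
    ranked = sorted(counts, key=lambda gram: (-counts[gram], gram))  # stable tie-break
    return {gram: rank for rank, gram in enumerate(ranked[:top])}

def distance(sample: dict[str, int], reference: dict[str, int], penalty: int = 300) -> int:
    """Sum of rank differences; trigrams missing from the reference cost a fixed penalty."""
    return sum(
        abs(rank - reference[gram]) if gram in reference else penalty
        for gram, rank in sample.items()
    )

# Toy training data (hypothetical): one sentence per language, far too small for real use.
TRAINING = {
    "por": "o produto não chegou e eu não consigo acessar minha conta desde ontem",
    "spa": "el producto no llegó y no puedo acceder a mi cuenta desde ayer",
    "eng": "the product did not arrive and i cannot access my account since yesterday",
}
PROFILES = {code: profile(text) for code, text in TRAINING.items()}

def detect(text: str, min_length: int = 10) -> str:
    """Return the ISO 639-3 code of the closest profile, or 'und' for very short input."""
    if len(text.strip()) < min_length:
        return "und"
    sample = profile(text)
    # Same input always yields the same answer: no sampling, only arithmetic on ranks.
    return min(PROFILES, key=lambda code: (distance(sample, PROFILES[code]), code))

print(detect("O produto não chegou"))  # por
print(detect("Hola"))                  # und (too short to judge)
```

The sample inputs deliberately overlap with the toy training sentences, so the result demonstrates the mechanics rather than real-world accuracy.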

The Superpowers

  • 400+ Languages: From English (eng) and Portuguese (por) to Esperanto (epo) and Zulu (zul).
  • Exact N-gram Math: Analyzes text strictly by character frequencies, not LLM probability.
  • Whitelist/Blacklist: Know the text must be either Spanish or Portuguese? Pass only: ['spa', 'por'] to force a strict evaluation.
  • Confidence Scores: Use the all flag to get an array of all candidate languages, each with its match score.
n-gram-analysis, language-detection, deterministic-logic, text-processing, localization, data-validation
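
The whitelist/blacklist filter and the all flag from the list above can be pictured as two small steps applied to per-language distances: drop the codes that are not allowed, then rescale what remains so the best match scores 1.0. The sketch below is an assumption about the general shape, not the connector's code; the function name `rank_candidates`, the `ignore` parameter name, the linear rescaling, and the distance values are all hypothetical.

```python
from typing import Iterable, Optional

def rank_candidates(
    distances: dict[str, float],
    only: Optional[Iterable[str]] = None,
    ignore: Iterable[str] = (),
) -> list[tuple[str, float]]:
    """Filter candidates, then rescale distances to scores in [0, 1] (best = 1.0)."""
    allowed = set(only) if only is not None else set(distances)
    blocked = set(ignore)
    kept = {code: d for code, d in distances.items()
            if code in allowed and code not in blocked}
    if not kept:
        return [("und", 1.0)]  # nothing left to choose from
    best, worst = min(kept.values()), max(kept.values())
    span = worst - best
    scored = [
        (code, 1.0 if span == 0 else round(1 - (d - best) / span, 4))
        for code, d in kept.items()
    ]
    # Highest score first; ties broken alphabetically so output is deterministic.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

# Made-up distances for a short Spanish-looking input (lower = closer match).
example = {"spa": 1200, "glg": 1500, "cat": 1800, "eng": 2200}

print(rank_candidates(example)[:3])                   # top 3 candidates
print(rank_candidates(example, only=["eng", "spa"]))  # whitelist of two languages
```

In this scheme the top candidate is always rescaled to 1.0, so a score of 1.0 means "best of the candidates considered" and should not be read as certainty.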

1 tool exposes this connector's capabilities to your AI agent.

detect_language

Detect the language of any text using n-gram analysis. Supports 400+ languages and returns ISO 639-3 codes (e.g., "por", "eng", "spa"). Provide as much text as possible for higher accuracy.
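
For readers wiring this up by hand, an MCP client invokes a tool with a JSON-RPC `tools/call` request. The sketch below builds such a request for `detect_language`. The envelope follows the MCP convention, but the argument names `text`, `only`, and `all` are assumptions inferred from the feature list, and `build_detect_request` is a hypothetical helper; check the tool's published input schema for the real parameter names.

```python
import json
from typing import Optional

def build_detect_request(
    text: str,
    only: Optional[list[str]] = None,
    return_all: bool = False,
    request_id: int = 1,
) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for the detect_language tool."""
    arguments: dict[str, object] = {"text": text}  # "text" is an assumed argument name
    if only:
        arguments["only"] = only                   # assumed name for the whitelist
    if return_all:
        arguments["all"] = True                    # assumed name for the all-scores flag
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "detect_language", "arguments": arguments},
    }
    # ensure_ascii=False keeps accented characters readable in the payload.
    return json.dumps(request, ensure_ascii=False)

print(build_detect_request("Hola como estas", only=["eng", "spa"]))
```

In practice the AI agent builds this request for you from a natural-language prompt; the payload is shown only to make the moving parts visible.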

See how to talk to your AI agent using Language Detector Engine.

Detect the language of this support ticket: 'Não consigo acessar minha conta desde ontem'.

Detected Language: 'por' (Portuguese). 100% confidence.

We only support English and Spanish. Detect the language of 'Hola como estas' using the whitelist.

Detected Language: 'spa' (Spanish) from the allowed list ['eng', 'spa'].

Get the top 3 language probabilities for this ambiguous name: 'Alejandro'.

Top Candidates: 1. spa (Spanish): 100% | 2. glg (Galician): 82% | 3. cat (Catalan): 64%

LLMs often hallucinate languages for short strings or names. They also struggle to provide standardized ISO codes reliably. This engine uses mathematical N-gram analysis (the same technique behind Google Search language detection) to deterministically map text to one of 400+ ISO 639-3 codes.
