Which LLM models can I use with the chat tool?

You can use any model hosted on DeepInfra, such as `deepseek-ai/DeepSeek-V3` or `meta-llama/Llama-3.3-70B-Instruct`, by passing the model name to the `create_chat_completion` tool.

How do I generate images using FLUX or Stable Diffusion?

Use the `generate_image` tool. Simply provide the model name (e.g., `black-forest-labs/FLUX-1-schnell`) and your text prompt to receive the generated image URL.

What is the 'run_native_inference' tool used for?

It is used for models that don't follow the OpenAI chat/image spec, such as audio transcription (Whisper), specialized OCR models, or your own private model deployments on DeepInfra.

DeepInfra (Serverless LLM Inference) MCP Connector for Claude

A+

Run top-tier LLMs, image generation, and embeddings via DeepInfra's serverless infrastructure directly from your AI agent.

4 tools Official Updated Jun 28, 2026 Official Vinkius Partner

More Details Connect to Claude

Connect to DeepInfra to access a massive library of open-source models including DeepSeek, Llama 3, and FLUX. This MCP server provides high-performance, serverless inference for text, images, and specialized tasks.

What you can do

Chat Completions — Generate text using state-of-the-art models like DeepSeek-V3 or Llama-3.3-70B with full control over temperature and tokens.
Image Generation — Create stunning visuals using models like FLUX-1 or Stable Diffusion by simply providing a text prompt.
Text Embeddings — Convert text into high-dimensional vectors for RAG (Retrieval-Augmented Generation) or semantic search.
Native Inference — Access specialized models for speech-to-text (Whisper), OCR, or custom deployments that don't follow standard OpenAI specs.

How it works

Subscribe to this server
Enter your DeepInfra API Token
Start querying world-class AI models from Claude, Cursor, or any MCP-compatible client

Who is this for?

Developers — integrate powerful LLMs into your coding workflow without managing GPU infrastructure.
Content Creators — generate high-quality images and text variations directly within your workspace.
Data Engineers — build semantic search pipelines using serverless embedding endpoints.

llm-inferenceserverless-aitext-to-imageembeddingsai-models

Connect to Claude

Subscribe on Vinkius, then add this connector to Claude.ai or Claude Code.

① Claude.ai (web app)

Go to Settings → Connectors → Add custom connector
Paste the MCP endpoint URL below

https://edge.vinkius.com/vk_preview_RMvS2UlIMhdShOjNbZfqar6XAZE9QfzTMbaejxJY/mcp

② Claude Code (terminal)

claude mcp add --transport http deepinfra-serverless-llm-inference https://edge.vinkius.com/vk_preview_RMvS2UlIMhdShOjNbZfqar6XAZE9QfzTMbaejxJY/mcp

③ Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "deepinfra-serverless-llm-inference": {
      "url": "https://edge.vinkius.com/vk_preview_RMvS2UlIMhdShOjNbZfqar6XAZE9QfzTMbaejxJY/mcp"
    }
  }
}

Get full access on Vinkius

The preview token above works for testing. Powered by Vinkius.

Details

Tools: 4
Grade: A+
Score: 100/100
Updated: Jun 28, 2026

Related Connectors

Geoapify MCP

17 tools Official

Access powerful location intelligence — geocoding, routing, place search, and IP tracking directly from your AI agent.

D View details →

Gainsight PX MCP

12 tools Official

Manage product experience, track user behavior, and oversee engagements via AI agents with Gainsight PX.

A+ View details →

Commerce Layer MCP

9 tools

Enable your AI agent to manage orders, SKUs, customers, and shipments via the Commerce Layer API.

A+ View details →

7shifts MCP

12 tools Official

Schedule restaurant staff, manage shifts, track labor costs, and coordinate your team with intelligent workforce planning.

A+ View details →