Can I use my fine-tuned adapters with this server?

Yes. When using the `generate_text` tool, you can provide an `adapter_id` to apply your specific fine-tuned LoRA adapter to the base model deployment.
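
As a rough illustration, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `generate_text`. Only `adapter_id` is named in this listing; the `prompt` argument name, the adapter value, and the `build_tool_call` helper are assumptions made for the example, so check the tool's published input schema before relying on them.

```python
import json

def build_tool_call(name, arguments, request_id=1):
    """Build an MCP JSON-RPC 2.0 `tools/call` request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# `prompt` and the adapter name are assumptions for illustration;
# only `adapter_id` is named in this listing.
request = build_tool_call(
    "generate_text",
    {
        "prompt": "Summarize the ticket below in one sentence.",
        "adapter_id": "my-support-adapter/1",
    },
)
print(json.dumps(request, indent=2))
```

In practice Claude or another MCP client builds this request for you; the sketch only shows where `adapter_id` would sit among the tool arguments.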

How do I monitor the performance of my Predibase deployment?

Use the `get_metrics` tool to scrape Prometheus-formatted metrics or `get_info` to retrieve metadata like model ID and device type.
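
If you want to post-process what `get_metrics` returns, the Prometheus text format is simple enough to parse by hand. The following is a minimal sketch, assuming the tool returns the raw exposition text as a string; the metric names in the sample payload are hypothetical, and label values are kept in their escaped form.

```python
import re

# Matches `name{labels} value` or `name value`; trailing timestamps are ignored.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

def parse_metrics(text):
    """Parse Prometheus text exposition format into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples

# Hypothetical payload; the real metric names depend on the deployment.
example = """\
# HELP request_count Total requests served.
# TYPE request_count counter
request_count{method="generate"} 42
queue_size 3
"""

metrics = parse_metrics(example)
for name, labels, value in metrics:
    print(name, labels, value)
```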

Does this support structured JSON responses?

Absolutely. The `generate_text` tool includes a `schema` parameter that allows you to pass a JSON schema to ensure the model output follows a specific structure.
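
For example, a schema for a small ticket-triage response might look like the sketch below. It assumes the `schema` parameter accepts a JSON Schema object and that the prompt is passed as `prompt`; neither detail is confirmed by this listing. The `check_required` helper is a hypothetical client-side sanity check, not part of the connector.

```python
import json

# JSON Schema describing the structure the model output should follow.
ticket_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "bug", "other"]},
        "urgent": {"type": "boolean"},
    },
    "required": ["category", "urgent"],
}

# Arguments for `generate_text`; `prompt` is an assumed parameter name,
# and passing the schema as a JSON object (not a string) is also an assumption.
arguments = {
    "prompt": "Classify this ticket: 'I was charged twice this month.'",
    "schema": ticket_schema,
}

def check_required(raw_output, schema):
    """Parse model output and confirm every required key is present."""
    data = json.loads(raw_output)
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

# Hypothetical model output, used here only to exercise the check.
result = check_required('{"category": "billing", "urgent": false}', ticket_schema)
print(result)
```

Checking the parsed output on the client side is a cheap safeguard even when the server enforces the schema, since downstream code then fails early with a clear message.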

Predibase (LLM Serving & Finetuning) MCP Connector for Claude

Deploy and query fine-tuned LLMs via Predibase — run inference, classify text, and monitor deployment metrics directly from your AI agent.

7 tools · Official Vinkius Partner · Updated Jun 28, 2026

Connect your Predibase account to any AI agent to manage high-performance LLM serving and fine-tuning workflows. Predibase provides a unified interface for serverless LLM deployment and LoRA adapter management.

What you can do

LLM Inference — Generate text or chat completions using generate_text, chat_completion, and completion tools.
Fine-tuning Integration — Dynamically apply LoRA adapters during inference using the adapter_id parameter in generation tasks.
Text Classification — Perform batch classification tasks with the classify tool for structured data workflows.
Deployment Monitoring — Check the status of your endpoints with get_health, get_info, and get_metrics.
Structured Output — Enforce JSON schemas on model responses for reliable downstream automation.

How it works

Subscribe to this server
Provide your Predibase API Token and Tenant ID
Start querying your deployments from Claude, Cursor, or any MCP client

Who is this for?

AI Engineers — deploy and test fine-tuned models without leaving the chat interface
Data Scientists — monitor inference metrics and health of production deployments
Developers — integrate high-performance LLM capabilities into apps with structured JSON output

Tags: llm-serving, fine-tuning, inference, machine-learning, ai-ops

Connect to Claude

Subscribe on Vinkius, then add this connector to Claude.ai or Claude Code.

① Claude.ai (web app)

Go to Settings → Connectors → Add custom connector
Paste the MCP endpoint URL below

https://edge.vinkius.com/vk_preview_anUwoOMv2E6QQVudvBXsDytO30kiB1fpaXuzKUbp/mcp

② Claude Code (terminal)

claude mcp add --transport http predibase-llm-serving-finetuning https://edge.vinkius.com/vk_preview_anUwoOMv2E6QQVudvBXsDytO30kiB1fpaXuzKUbp/mcp

③ Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "predibase-llm-serving-finetuning": {
      "url": "https://edge.vinkius.com/vk_preview_anUwoOMv2E6QQVudvBXsDytO30kiB1fpaXuzKUbp/mcp"
    }
  }
}
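
If `claude_desktop_config.json` already lists other servers, the new entry should be merged into the existing `mcpServers` object instead of replacing the file. A minimal sketch of that merge on plain dictionaries follows; the `add_mcp_server` helper and the `other-server` entry are hypothetical, and reading or writing the actual file is left out.

```python
import json

def add_mcp_server(config, name, url):
    """Return a copy of `config` with one more entry under `mcpServers`."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"url": url}
    return {**config, "mcpServers": servers}

# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"url": "https://example.com/mcp"}}}

updated = add_mcp_server(
    existing,
    "predibase-llm-serving-finetuning",
    "https://edge.vinkius.com/vk_preview_anUwoOMv2E6QQVudvBXsDytO30kiB1fpaXuzKUbp/mcp",
)
print(json.dumps(updated, indent=2))
```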

Get full access on Vinkius

The preview token above works for testing.