Baseten

Baseten MCP Connector for Claude

A+

Manage your Baseten AI models — orchestrate deployments, list secrets, and run serverless inference predictions autonomously.

6 tools | Official Vinkius Partner | Updated Jun 28, 2026

Connect your Baseten account to any AI agent to track your machine learning models, inspect their deployments, and run predictions through natural conversation.

What you can do

  • Model Management: List managed models and fetch the configuration of any one of them
  • Serverless Deployments: Inspect replica states, autoscaling configurations, and deployment versions
  • Inference Execution: Run predictions with the predict tool by sending JSON or tensor payloads to a deployed model
  • Workspace Secrets: List the names of the secrets configured in your workspace; their values are never shown

How it works

  1. Subscribe to this server
  2. Enter your Baseten API Key
  3. Manage your deployed models from Claude, Cursor, or your preferred agent

Manage your inference infrastructure without bouncing between terminal windows. Your agent acts as a machine learning operator, tracking the state of your models and deployments.

Who is it for?

  • ML Engineers: send test payloads to deployments without spinning up a local Python notebook
  • DevOps/SREs: audit running deployment resources and verify replica states from your IDE
  • AI Researchers: inspect model versions and deployment configurations quickly
model-deployment, inference-api, serverless-ml, model-scaling, mlops

6 tools expose this connector's capabilities to your AI agent.

list_models

List Baseten managed models

get_model

Get a specific Baseten model

predict

Invoke a serverless model inference prediction. The input payload (tensor shapes or dictionaries) must match exactly what the deployed model expects.

list_deployments

List the active deployments of a specific model

get_deployment

Get the details of a running deployment

list_secrets

List securely managed workspace secrets without showing values
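
Under the Model Context Protocol, an agent invokes each of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what such requests might look like. The argument names (`model_id`, `input`) are assumptions for illustration, since this listing does not publish the tools' input schemas, and the `tool_call` helper is hypothetical.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # each JSON-RPC request in a session needs a unique id


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP `tools/call` request envelope (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# list_models is assumed to take no arguments in this sketch.
list_request = tool_call("list_models")

# Hypothetical argument names; the real schema is whatever the server
# advertises in response to a `tools/list` request.
predict_request = tool_call(
    "predict",
    {
        "model_id": "12345",
        "input": {"text": "The new feature completely broke my workflow."},
    },
)

print(json.dumps(predict_request, indent=2))
```

In practice the agent builds these envelopes itself; the sketch is only meant to show what a tool invocation carries: a tool name plus a dictionary of arguments.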

See how to talk to your AI agent using Baseten.

List standard machine learning models we currently host on Baseten.

You have 2 managed models: a fine-tuned 'Llama-3-70B-Instruct' (ID: XYZ) and a custom computer vision model, 'Defect-Detector-V2' (ID: ABC). Want me to pull the active deployment states for the Llama model?

Run a prediction against the Sentiment model ID 12345 using this text input: 'The new feature completely broke my workflow.'

The prediction ran successfully. Baseten returned: `{"score": 0.12, "label": "NEGATIVE"}`.
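
The prediction exchange above could map onto a direct HTTP request to Baseten's inference endpoint. The sketch below only builds that request and does not send it. The URL pattern and the `Api-Key` authorization scheme are assumptions based on Baseten's REST conventions and should be verified against the current API reference. The `{"text": ...}` input shape is also an assumption, since the real payload must match whatever the deployed model expects. The listing does not say whether the connector issues this exact call internally.

```python
import json
import urllib.request


def build_predict_request(model_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to a model's production predict endpoint."""
    # Assumed URL pattern; verify against Baseten's current API reference.
    url = f"https://model-{model_id}.api.baseten.co/production/predict"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_predict_request(
    model_id="12345",
    api_key="YOUR_BASETEN_API_KEY",  # placeholder; load from the environment in practice
    payload={"text": "The new feature completely broke my workflow."},  # hypothetical input shape
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```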

Check if our Baseten project has a secret scoped as 'OPENAI_API_KEY_FALLBACK'.

Yes. `OPENAI_API_KEY_FALLBACK` is provisioned in your workspace, alongside `HF_TOKEN`. Only secret names are listed; in line with Baseten guidelines, the values are never returned in plaintext.
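
The check above amounts to a name lookup over the `list_secrets` result. A minimal sketch, assuming the tool returns a list of objects with a `name` field (a hypothetical shape; the listing only states that values are not shown):

```python
def has_secret(secrets: list[dict], name: str) -> bool:
    """Return True if a secret with the given name appears in the listing."""
    return any(entry.get("name") == name for entry in secrets)


# Hypothetical `list_secrets` result: names only, never values.
listed = [{"name": "OPENAI_API_KEY_FALLBACK"}, {"name": "HF_TOKEN"}]

print(has_secret(listed, "OPENAI_API_KEY_FALLBACK"))
print(has_secret(listed, "ANTHROPIC_API_KEY"))
```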

Yes. When the agent sends a correctly formatted JSON payload to the 'predict' tool, it triggers inference on the model's GPU instances and returns the response directly to your editor context.
