Can the AI agent run a prediction directly against my hosted model?

Yes. When the agent sends a correctly formatted JSON payload to the 'predict' tool, it securely triggers inference on the model's GPU instances and returns the response data directly to your editor context.
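
As a rough illustration of what such a request can look like on the wire, the sketch below builds a JSON-RPC 2.0 `tools/call` message, which is the general shape MCP clients use to invoke a tool. The argument names `model_id` and `payload`, and the placeholder values, are assumptions for illustration only; the real input schema is whatever the 'predict' tool publishes.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request in the shape MCP uses for tool calls."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names and values; check the tool's input schema.
request = build_tool_call(
    1,
    "predict",
    {"model_id": "MODEL_ID", "payload": {"prompt": "Hello"}},
)

# Serialize the request as an MCP client would before sending it.
print(json.dumps(request, indent=2))
```

In practice the agent constructs this message for you; the sketch only shows why the payload has to be valid JSON that matches the tool's schema.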

Are my workspace and environment secrets kept safe?

Baseten's secret listing hides variable values. When you use 'list_secrets', the agent sees only the key names and identifiers present in your environment, so it can verify configuration without exposing plaintext passwords.
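
A minimal sketch of that kind of verification, assuming the tool returns a plain list of secret names: compare the names a deployment needs against the names that exist, without ever touching a value. The secret names below are hypothetical.

```python
def missing_secret_names(required, listed):
    """Return required secret names that are absent from the listed names."""
    return sorted(set(required) - set(listed))


# Hypothetical names as 'list_secrets' might report them (names only, no values).
listed_names = ["hf_access_token", "s3_bucket_key"]
required_names = ["hf_access_token", "openai_api_key"]

print(missing_secret_names(required_names, listed_names))
```

Because only names are compared, the check works even though the plaintext values never reach the agent.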

How do I check the autoscaling configuration of a specific deployed model?

Use 'get_deployment' to see how instances are managed. Give the agent an active deployment ID and it reports the scaling limits, replica status, and container bounds.
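
To make the kind of audit concrete, here is a small sketch that checks a deployment summary for common autoscaling problems. The field names (`autoscaling_settings`, `min_replica`, `max_replica`, `active_replica_count`) and the sample values are assumptions about what the tool might return, not a confirmed response schema.

```python
def check_autoscaling(deployment):
    """Return a list of warnings for a deployment summary dict."""
    settings = deployment.get("autoscaling_settings", {})
    min_replica = settings.get("min_replica", 0)
    max_replica = settings.get("max_replica", 1)
    warnings = []
    if min_replica > max_replica:
        warnings.append("min_replica exceeds max_replica")
    if min_replica == 0:
        warnings.append("scales to zero, so the first request may hit a cold start")
    if deployment.get("active_replica_count", 0) > max_replica:
        warnings.append("active replicas exceed max_replica")
    return warnings


# Hypothetical response shape; adjust the keys to the actual tool output.
sample = {
    "id": "DEPLOYMENT_ID",
    "status": "ACTIVE",
    "active_replica_count": 1,
    "autoscaling_settings": {"min_replica": 0, "max_replica": 3},
}

for warning in check_autoscaling(sample):
    print(warning)
```

An agent can apply the same checks conversationally; the function form is only meant to show which fields matter when reviewing scaling limits.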

Baseten MCP Connector for Claude

A+

Manage your Baseten AI models: orchestrate deployments, list secrets, and run serverless inference predictions autonomously.

6 tools · Official · Updated Jun 28, 2026 · Official Vinkius Partner


Connect your Baseten account to any AI agent and track, deploy, and execute your machine learning models through natural conversation.

What you can do

Model Management: list managed models, fetch their configurations, and see how traffic is routed to them
Serverless Deployments: inspect replica states, autoscaling configurations, and deployment versions
Inference Execution: run predictions directly ('predict') by sending JSON or tensor payloads to your GPU-hosted models
Workspace Secrets: list the names of the secrets configured in your workspace without exposing their values

How it works

Subscribe to this server
Enter your Baseten API Key
Manage your active inference deployments from Claude, Cursor, or your preferred agent

Scale unified AI infrastructure without bouncing between terminal windows. Your agent becomes a capable Machine Learning Operator tracking your GPU lifecycle.

Who is it for?

ML Engineers: send test payloads to deployments immediately, without spinning up local Python notebooks
DevOps/SREs: audit running deployment resources and verify replica states from your IDE
AI Researchers: inspect version schemas and manage inference pipeline architectures quickly

model-deployment, inference-api, serverless-ml, model-scaling, mlops
