NVIDIA API Catalog

NVIDIA API Catalog MCP Connector for Claude

A+

Cloud Engine proxy running native foundational completions natively utilizing active Nemotron and Llama3 architectures.

8 tools Official Updated Jun 28, 2026 Official Vinkius Partner

What you can do

Trigger massive inference executions navigating safely over natively hosted logic endpoints using the explicit API Catalog:

  • Discover Active Cloud LLMs natively listing every explicitly hosted model configuration safely mapped
  • Route Chat Completions pulling explicit answers evaluating safely unstructured conversational bounds dynamically
  • Extract Native Embeddings passing direct text evaluations extracting numerical arrays gracefully
  • Evaluate Multimodal limits assigning native Vision tasks routing natively strictly matrix limits
  • Execute Text Summarization compressing explicit bounds generating specific arrays cleanly routing effectively

How it works

  1. Declare Logic Tokens, explicitly combining the NVIDIA_API_KEY configuration natively over the SDK bounds proxy implicitly
  2. Pass Strict Logic Inference, requesting native models securely bypassing manual SDK mapping configurations resolving completely
  3. Map and execute hardware limits inherently parsing directly standard structured completions securely

Who is this for?

Explicitly targeted evaluating limits specifically for AI Engineers, Generative Integrators, and Developers parsing direct responses over public NVIDIA compute matrices.

model-discoveryllm-proxyinference-engineapi-catalogmodel-routingfoundation-models

8 tools expose this connector's capabilities to your AI agent.

nvidia_chat_completion

Trigger direct NLP inference matrices directly evaluating queries over hosted LLMs

nvidia_check_token_quota

Poll safely dynamic credit and explicit constraint execution limits bounding inference execution

nvidia_generate_embeddings

Pass parameters safely mapping explicit unstructured vectors directly using specific Embedding arrays

nvidia_get_cloud_status

Ping explicitly the core hosted NVIDIA matrix tracing inference endpoints evaluating latencies securely

nvidia_list_foundation_models

Dumps the strict array specifying explicit LLM matrix paths accessible securely natively

nvidia_list_lora_adapters

Evaluate explicit matrices tracking fine-tuned overrides isolating logical constraints dynamically

nvidia_summarize_content

Standard natively configured logical execution executing predefined abstract compression matrices smoothly

nvidia_vision_inference

g. Llama-Vision natively). Invoke strictly multimodal abilities capturing diagnostic constraints returning inference on graphical data

See how to talk to your AI agent using NVIDIA API Catalog.

Deploy commands exploring active NLP data listing completely the hosted LLMs mapped heavily inside the NVIDIA catalog safely.

Parsed logically evaluating NVIDIA Cloud API natively (`list_foundation_models`). Platform responded safely listing 42 explicit parameters including Llama3 cleanly bounding choices naturally.

Trigger inference explicitly navigating natively utilizing Nemotron LLMs to summarize standard matrices cleanly parsing bounds gracefully.

Tunnel explicitly mapping `summarize_content`. Engine successfully extracted cleanly formatted response arrays bouncing latency smoothly gracefully natively over hosted limits.

Execute explicitly generating explicit unstructured text matrices extracting native embedding queries purely isolating the arrays properly.

Execution logic parameters strictly extracting values safely allocating implicitly natively `generate_embeddings`. Payload correctly returned arrays naturally formatting vector bindings efficiently bounds.

Yes! Utilize `generate_embeddings` providing explicit logic extracting arrays natively isolating endpoints safely.

Related Connectors