Description
Is this related to an existing feature request or issue?
No response
Summary
SageMaker functionality, such as creating training jobs, deploying inference endpoints, and invoking deployed endpoints, is currently accessed through the SageMaker Python SDK and the SageMaker HyperPod CLI. To make these capabilities available to AI coding agents, we propose exposing the same interfaces through a Model Context Protocol (MCP) server, so that users can drive SageMaker's core functionality directly from an agent.
The MCP server will first expose all the functionality offered by the SageMaker HyperPod CLI (the hyp CLI, https://github.com/aws/sagemaker-hyperpod-cli) and later the SageMaker PySDK (https://github.com/aws/sagemaker-python-sdk) interfaces.
Design proposal: https://quip-amazon.com/wmwUAcatMGbR/Sagemaker-MCP-Server
Use case
Phase 1
Support SageMaker HyperPod training and inference jobs.
Running training and inference jobs:
- Parameter exposition: Display required and optional parameters based on the predefined schema for PyTorch training jobs.
- Automated parameter population: Fill in parameters as requested by the user, streamlining the setup process.
- Schema validation: Validate all input parameters against the schema to ensure they meet the required specifications.
- Job submission: Submit the training job once all parameters are set and validated.
- Status monitoring: Provide real-time tracking of job status so users can monitor progress.
- Log retrieval: Fetch and display the logs associated with the job to aid debugging and analysis.
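The parameter exposition and schema validation steps above can be sketched as follows. This is a minimal illustration, not the proposed implementation: the proposal lists Pydantic as a dependency, but the sketch uses only standard-library dataclasses so it stays self-contained. The PyTorchJobSpec class, its fields, and the describe_parameters and validate helpers are all hypothetical; the real schema would come from the pytorch-job template.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class PyTorchJobSpec:
    # Hypothetical fields for illustration only; the real schema is defined
    # by the pytorch-job template package.
    job_name: str
    image: str
    node_count: int = 1
    instance_type: Optional[str] = None


def describe_parameters(schema) -> dict:
    """Split a dataclass schema into required and optional parameter names."""
    required, optional = [], []
    for f in fields(schema):
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        (optional if has_default else required).append(f.name)
    return {"required": required, "optional": optional}


def validate(schema, params: dict) -> list:
    """Return a list of problems; an empty list means the input is valid."""
    known = {f.name: f for f in fields(schema)}
    errors = [
        f"missing required parameter: {name}"
        for name in describe_parameters(schema)["required"]
        if name not in params
    ]
    errors += [f"unknown parameter: {name}" for name in params if name not in known]
    # Minimal type check, limited to fields annotated with plain str or int.
    for name, value in params.items():
        f = known.get(name)
        if f is not None and f.type in (str, int) and not isinstance(value, f.type):
            errors.append(
                f"{name} must be {f.type.__name__}, got {type(value).__name__}"
            )
    return errors
```

Calling validate(PyTorchJobSpec, {"job_name": "my-test-job"}) reports the missing image parameter, which is the information an agent would need in order to ask the user for more details before submission.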
Phase 2
Cluster creation and management. We need to evaluate whether to leverage and merge with the in-progress cluster creation MCP server or to use the solution offered by the HyperPod CLI.
Phase 3 & beyond
- Enable recipe support for training jobs.
- Expose PySDK constructs via the SageMaker MCP server.
Examples
- Create a pytorch job with job name "my-test-job" and image as
  The MCP server determines whether this is sufficient information to start the job or asks the user to provide more details. The required and optional parameters are fetched automatically via the pytorch-job template.
- Create a jumpstart endpoint with model-ID and name "my-js-endpoint1"
- What's the status of job my-test-job?
- List all the pytorch jobs.
- List all the jumpstart endpoints.
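As a rough sketch of how the example prompts above might map onto server-side handlers, the snippet below uses an in-memory dictionary in place of a real cluster. Every name here (create_pytorch_job, get_job_status, list_pytorch_jobs, REQUIRED_FIELDS, the returned dictionaries) is an assumption made for illustration. In an actual server these functions would be registered as MCP tools and would call the HyperPod CLI or SDK rather than a dictionary.

```python
from typing import Dict, List

# Assumed required fields; the real list would be read from the pytorch-job template.
REQUIRED_FIELDS = ("job_name", "image")

# In-memory stand-in for the cluster, used only so the sketch is runnable.
_jobs: Dict[str, dict] = {}


def create_pytorch_job(**params) -> dict:
    """Submit the job if enough information was given, else say what is missing."""
    missing = [name for name in REQUIRED_FIELDS if not params.get(name)]
    if missing:
        # The agent can turn this into a follow-up question for the user.
        return {"status": "needs_input", "missing": missing}
    _jobs[params["job_name"]] = {"params": params, "state": "Submitted"}
    return {"status": "submitted", "job_name": params["job_name"]}


def get_job_status(job_name: str) -> dict:
    """Look up a job by name and report its recorded state."""
    job = _jobs.get(job_name)
    if job is None:
        return {"status": "not_found", "job_name": job_name}
    return {"status": "ok", "job_name": job_name, "state": job["state"]}


def list_pytorch_jobs() -> List[str]:
    """Return the names of all known jobs in sorted order."""
    return sorted(_jobs)
```

Returning a structured "needs_input" result instead of raising an error is one possible design choice: it lets the agent ask the user for the missing values and retry the same call.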
Proposal
Design proposal: https://quip-amazon.com/wmwUAcatMGbR/Sagemaker-MCP-Server
Out of scope
Only client-facing SageMaker interfaces (e.g. those offered by sagemaker-hyperpod or the SageMaker PySDK) are in scope; everything else is out of scope.
Potential challenges
- The sagemaker-hyperpod CLI is designed so that spec changes are abstracted behind template packages, which are distributed separately on PyPI. The MCP server will take a dependency on these template packages to expose the interface.
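One way the server could cope with separately distributed template packages is to check at startup which ones are installed and at what version. The sketch below uses the standard-library importlib.metadata module for that. The helper names are hypothetical, and the package names passed in are left to the caller because the exact template distribution names are not given in this issue.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, List, Optional


def installed_template_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def missing_templates(packages: Iterable[str]) -> List[str]:
    """Return the distribution names that are not installed."""
    return [
        name
        for name, installed in installed_template_versions(packages).items()
        if installed is None
    ]
```

A server could call missing_templates at startup and disable, or clearly report, the tools whose template package is unavailable rather than failing on first use.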
Dependencies and Integrations
AWS CLI
AWS SDK for Python (Boto3)
FastMCP
Pydantic
sagemaker-hyperpod
sagemaker-core
Alternative solutions