Transform AI Development with Advanced Amazon SageMaker AI Customization and Training Capabilities
News | 15.01.2026
Building AI That Understands Your Business
As generative AI becomes more accessible, many organizations now use the same foundation models (FMs). True competitive advantage, however, comes from customizing AI models with your own data, workflows, and preferences—creating solutions that competitors cannot easily replicate.
While modern foundation models offer impressive reasoning and general intelligence, they lack the context that makes AI truly valuable to a business. They don’t inherently understand your terminology, industry constraints, or operational nuances. To bridge this gap, models must be trained and customized in a structured learning process—moving from general knowledge to deep, domain-specific understanding.
Amazon SageMaker AI now supports this entire journey: from large-scale pre-training, through fine-tuning and preference alignment, to efficient inference and continuous adaptation.
Accelerating Customization with Serverless Model Training
At AWS re:Invent 2025, Amazon SageMaker AI introduced major advancements that simplify and accelerate AI model development. These new capabilities address two long-standing challenges:
- The complexity and time required to customize foundation models
- Infrastructure failures that disrupt large-scale training and delay results
With serverless model customization, organizations can now fine-tune and align models in days instead of months—without managing infrastructure.
AI agent–guided customization
For teams seeking the highest level of abstraction, SageMaker AI introduces an AI agent–guided workflow (preview). Developers can describe their business objectives in natural language, and the AI agent generates a complete customization plan, including dataset guidance, evaluation metrics, and model recommendations.
This workflow supports advanced techniques such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Reinforcement Learning from AI Feedback (RLAIF)
- Reinforcement Learning from Verifiable Rewards (RLVR)
All training, evaluation, and responsible AI controls are handled in a fully serverless environment, removing operational overhead.
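Of the techniques listed above, Direct Preference Optimization is compact enough to sketch in a few lines. The following is a minimal illustration of the commonly published DPO loss for a single preference pair, not SageMaker AI's implementation. The function name `dpo_loss`, its arguments, and the default `beta` are assumptions made for this example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    Inputs are summed token log-probabilities of each response under the
    model being trained (policy) and under a frozen reference model.
    """
    # How much more the policy favours each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written as a numerically stable softplus(-logits).
    x = -logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

When the policy and the reference assign identical log-probabilities, the loss equals log 2. It falls below that value as the policy shifts probability toward the preferred response relative to the reference.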
Self-guided customization for advanced teams
For teams that require greater control, SageMaker AI provides a self-guided interface in SageMaker Studio. Organizations can select popular models such as Amazon Nova, Llama, Qwen, DeepSeek, and GPT-OSS, then apply parameter-efficient fine-tuning (LoRA) or full fine-tuning with recommended best practices.
Integrated MLflow tracking provides full visibility into experiments, performance, and outcomes—while AWS manages scaling, provisioning, and optimization automatically.
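To illustrate why the parameter-efficient option (LoRA) mentioned above is cheaper than full fine-tuning, here is a small pure-Python sketch of the general LoRA idea: the original weight matrix stays frozen, and only two small low-rank matrices are trained. The helper names (`lora_param_counts`, `lora_forward`) and the layer sizes are hypothetical and do not reflect how SageMaker AI configures LoRA internally.

```python
def lora_param_counts(d_in, d_out, rank):
    """Trainable parameters for one linear layer: full fine-tuning vs. LoRA."""
    full = d_in * d_out              # every entry of W is updated
    lora = rank * (d_in + d_out)     # only A (rank x d_in) and B (d_out x rank)
    return full, lora


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / rank) * B (A x) with W frozen."""
    rank = len(A)
    base = matvec(W, x)                   # frozen pretrained path
    update = matvec(B, matvec(A, x))      # trainable low-rank path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# Example with assumed sizes: a 4096 x 4096 projection adapted at rank 8.
full, lora = lora_param_counts(4096, 4096, 8)
```

In this example the low-rank path trains 65,536 parameters instead of 16,777,216 for the layer. Initialising `B` to zeros makes the adapted layer start out identical to the pretrained one.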
Extending Beyond Fine-Tuning with Amazon Nova Forge
For organizations that require deep domain expertise embedded directly into their models, fine-tuning alone may not be sufficient. The usual alternative, continued pre-training on proprietary data, often introduces risks such as catastrophic forgetting, where a model loses foundational capabilities it previously had.
To address this, AWS introduced Amazon Nova Forge, available through SageMaker AI. Nova Forge enables organizations to build their own frontier models using Amazon Nova, starting from early checkpoints across pre-training, mid-training, and post-training phases.
By blending proprietary datasets with curated Amazon Nova data on fully managed infrastructure, organizations can:
- Preserve foundational intelligence and safety characteristics
- Embed deep industry-specific knowledge
- Reduce the risks associated with traditional pre-training approaches
This makes Nova Forge a cost-effective and scalable way to build truly differentiated, domain-aware AI models.
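The source does not describe how Nova Forge blends data internally, so the following is only a generic sketch of the idea: each training batch draws a fixed share of examples from the proprietary dataset and the rest from a curated general dataset, so the model keeps seeing general-purpose data while it learns domain content. The function name `blended_batch` and the 25% share are hypothetical.

```python
import random


def blended_batch(proprietary, curated, proprietary_fraction, batch_size, rng):
    """Sample one batch mixing proprietary and curated examples.

    proprietary_fraction is the share of the batch taken from the
    proprietary dataset; the remainder comes from the curated dataset.
    """
    n_proprietary = round(batch_size * proprietary_fraction)
    n_curated = batch_size - n_proprietary
    # Sample with replacement so small datasets can still fill a batch.
    batch = rng.choices(proprietary, k=n_proprietary)
    batch += rng.choices(curated, k=n_curated)
    rng.shuffle(batch)  # avoid a fixed ordering of the two sources
    return batch


# Example with assumed data: 25% proprietary, 75% curated, seeded for repeatability.
rng = random.Random(0)
batch = blended_batch(["p1", "p2"], ["c1", "c2", "c3"], 0.25, 8, rng)
```

Tuning the fraction trades domain specialization against retention of general capabilities. The appropriate value depends on the datasets and is not specified in the source.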
Scaling Efficiently with Elastic and Checkpointless Training
Elastic training with SageMaker HyperPod
AI workloads are dynamic, and resource availability changes constantly. Traditional training jobs run at a fixed size, so they cannot take advantage of accelerators that become free mid-run and often leave expensive capacity idle.
Elastic training on Amazon SageMaker HyperPod automatically scales training workloads up or down based on available resources. Training continues uninterrupted, even as capacity shifts, maximizing utilization and reducing manual intervention.
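One practical consequence of a changing worker count is that the effective batch size changes unless the job compensates. The sketch below shows a common way to keep the global batch size constant by adjusting gradient accumulation steps. It is an assumption about how an elastic job might be configured, not a description of HyperPod's internal mechanism, and the function name and batch sizes are hypothetical.

```python
def accumulation_steps(global_batch, per_device_batch, world_size):
    """Gradient accumulation steps needed to keep the global batch fixed.

    global_batch = per_device_batch * world_size * accumulation_steps
    """
    per_step = per_device_batch * world_size
    if per_step <= 0 or global_batch % per_step != 0:
        raise ValueError(
            f"global batch {global_batch} is not divisible by "
            f"{per_device_batch} x {world_size}"
        )
    return global_batch // per_step


# Example with assumed numbers: the same 1024-sample global batch
# as the cluster shrinks to 16 accelerators or grows to 64.
plans = {n: accumulation_steps(1024, 8, n) for n in (16, 32, 64)}
```

With fewer accelerators the job takes more accumulation steps per optimizer update, and with more accelerators it takes fewer, so the optimization behaviour stays comparable while throughput follows available capacity.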
Checkpointless training for resilience
Infrastructure failures can derail weeks of training. Checkpointless training mitigates this risk by preserving model state continuously across the distributed cluster. If a failure occurs, training resumes in seconds without manual recovery, achieving up to 95% training efficiency (the share of cluster time spent on productive training) at scale.
Together, elastic and checkpointless training significantly reduce costs, downtime, and operational complexity.
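One way to read the training efficiency figure above is as the fraction of wall-clock time spent making forward progress. The arithmetic below is a simple illustration under that assumption. The failure counts and recovery times are made up for the example and are not AWS measurements.

```python
def training_efficiency(wall_clock_hours, failures, hours_lost_per_failure):
    """Fraction of wall-clock time spent on productive training.

    hours_lost_per_failure covers both the restart delay and any work
    that has to be redone after rolling back to the last saved state.
    """
    lost = failures * hours_lost_per_failure
    if lost > wall_clock_hours:
        raise ValueError("lost time exceeds total wall-clock time")
    return (wall_clock_hours - lost) / wall_clock_hours


# Hypothetical 100-hour run with 4 failures.
# Checkpoint-and-restart: assume 5 hours lost per failure.
slow_recovery = training_efficiency(100, 4, 5)
# Near-instant recovery: assume 15 minutes lost per failure.
fast_recovery = training_efficiency(100, 4, 0.25)
```

Under these assumed numbers, efficiency rises from 80% to 99%. The gain comes entirely from shrinking the time lost per failure, which is the quantity that fast in-place recovery targets.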
Serverless MLflow for Full Observability
To support experimentation, evaluation, and governance, Amazon SageMaker AI now offers serverless MLflow. This removes the need to deploy and manage tracking infrastructure, while providing:
- Real-time experiment tracking
- Prompt versioning and reuse
- Seamless integration with SageMaker Model Registry
- Cross-account collaboration via AWS Resource Access Manager
Serverless MLflow is available at no additional cost and automatically stays up to date—allowing teams to focus on innovation rather than maintenance.
Enabling Faster, More Secure AI Innovation
These new Amazon SageMaker AI capabilities form a comprehensive platform for enterprise AI development—from natural language–driven customization to large-scale, fault-tolerant training and production deployment.
As an official AWS Partner, Softprom helps organizations design, implement, and optimize AI solutions on AWS—ensuring scalability, security, and alignment with business goals. Whether you are customizing existing models or building domain-specific AI from the ground up, Softprom and AWS provide the expertise and technology to accelerate your AI journey.
Get Started with Amazon SageMaker AI
The new Amazon SageMaker AI customization and SageMaker HyperPod capabilities are available today across AWS Regions worldwide. Existing customers can access them via the SageMaker AI console, and new customers can begin with the AWS Free Tier.
To learn how these capabilities can support your AI initiatives, contact Softprom for expert consultation and implementation support.