RamprasathS commented Jan 30, 2026

Fix: Pipeline creation fails with Pydantic validation error due to incorrect Tag class

Issue Description

When BenchmarkEvaluator finds no existing SageMaker Pipeline and attempts to create one, Pipeline.create() fails with a Pydantic validation error. As a result, MMLU_PRO and other benchmark evaluations cannot run unless the pipeline already exists.

Error Message

4 validation errors for Pipeline.create
tags.0.key - Field required
tags.0.value - Field required  
tags.0.Key - Extra inputs are not permitted
tags.0.Value - Extra inputs are not permitted

Impact

  • First-time benchmark evaluations fail completely
  • Any scenario requiring new pipeline creation fails
  • Works only when pipeline already exists (reuse path)
  • Affects all evaluation types (BENCHMARK, CUSTOM_SCORER, LLM_AS_JUDGE)

Root Cause

The code in sagemaker-train/src/sagemaker/train/evaluate/execution.py has two issues:

  1. Wrong Tag class imported: Imports Tag from sagemaker.core.resources, but Pipeline.create() expects Tag from sagemaker.core.shapes

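  2. Wrong tag format: Passes plain dicts instead of Tag objects. Judging by the validation error, the dicts use capitalized keys ({"Key": ..., "Value": ...}), while the expected Tag has lowercase key and value fields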

There are two different Tag classes in the SDK:

  • sagemaker.core.resources.Tag - Used for Tag.get_all() to retrieve tags
  • sagemaker.core.shapes.Tag - Used for Pipeline.create() to create resources
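
The four errors reported for a single tag are consistent with a dict keyed by Key/Value being validated against a model that requires lowercase key/value fields and forbids extras. The sketch below mimics that check using only the standard library. ShapeTag and validate_tag_dict are hypothetical stand-ins for illustration, not the SDK's or Pydantic's actual code.

```python
from dataclasses import dataclass, fields


@dataclass
class ShapeTag:
    """Hypothetical stand-in for a strict tag model with lowercase fields."""
    key: str
    value: str


def validate_tag_dict(index: int, raw: dict) -> list:
    """Return error strings for one tag dict, mimicking strict validation."""
    expected = [f.name for f in fields(ShapeTag)]
    # Every expected field that is absent counts as one error.
    errors = [
        f"tags.{index}.{name} - Field required"
        for name in expected
        if name not in raw
    ]
    # Every unexpected field counts as one more error.
    errors += [
        f"tags.{index}.{name} - Extra inputs are not permitted"
        for name in raw
        if name not in expected
    ]
    return errors


errors = validate_tag_dict(0, {"Key": "team", "Value": "ml"})
print(f"{len(errors)} validation errors")
for line in errors:
    print(line)
```

A dict with two wrongly cased keys therefore yields two "required" errors plus two "extra" errors, which matches the count in the error message above.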

Changes Made

1. Fixed imports (lines 16-24)

# Before
from sagemaker.core.resources import Pipeline, PipelineExecution, Tag

# After  
from sagemaker.core.resources import Pipeline, PipelineExecution
from sagemaker.core.resources import Tag as ResourceTag  # For Tag.get_all()
from sagemaker.core.shapes import Tag  # For Pipeline.create() tags parameter

2. Fixed tag creation (lines 66-99)

  • Create proper Tag objects from sagemaker.core.shapes
  • Handle both dict and Tag inputs gracefully
  • Add proper error handling and logging
  • Maintain backward compatibility
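
The PR description does not show the conversion code itself, so the following is only a rough sketch of how dict and Tag inputs might be normalized. The Tag dataclass is a stand-in for sagemaker.core.shapes.Tag (assumed to have key and value fields), normalize_tags is a hypothetical helper name, and skipping malformed entries with a warning (rather than raising) is an assumption.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class Tag:
    """Stand-in for sagemaker.core.shapes.Tag (assumed fields: key, value)."""
    key: str
    value: str


def normalize_tags(raw_tags):
    """Convert a mix of dicts and Tag objects into a list of Tag objects."""
    normalized = []
    for item in raw_tags or []:
        if isinstance(item, Tag):
            # Already in the expected shape; pass through unchanged.
            normalized.append(item)
            continue
        if isinstance(item, dict):
            # Accept both lowercase and capitalized dict keys.
            key = item.get("key", item.get("Key"))
            value = item.get("value", item.get("Value"))
            if key is not None and value is not None:
                normalized.append(Tag(key=str(key), value=str(value)))
                continue
        # Assumption: malformed entries are logged and skipped, not fatal.
        logger.warning("Skipping tag with unsupported format: %r", item)
    return normalized


tags = normalize_tags([
    {"Key": "team", "Value": "ml"},
    Tag(key="env", value="dev"),
    {"key": "owner", "value": "eval"},
    "not-a-tag",
])
print(tags)
```

Accepting both key casings keeps older callers that pass boto3-style dicts working while ensuring the create call only ever sees Tag objects.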

3. Updated Tag.get_all() calls

  • Changed Tag.get_all() to ResourceTag.get_all() (2 occurrences)
  • Lines ~235 and ~677

Testing

Before Fix

INFO Creating new pipeline: SagemakerEvaluation-BenchmarkEvaluation-...
ERROR 4 validation errors for Pipeline.create
✓ Evaluation job started!
   Job ARN: None  ← FAILED

After Fix

INFO Creating new pipeline: SagemakerEvaluation-BenchmarkEvaluation-...
INFO Creating pipeline with 2 tags
INFO Successfully created pipeline
INFO Pipeline is now active and ready for execution
✓ Evaluation job started!
   Job ARN: arn:aws:sagemaker:us-east-1:...  ← SUCCESS

Tested both scenarios:

  • ✅ Creating new pipeline (first run)
  • ✅ Reusing existing pipeline (subsequent runs)

Related Issues

This fixes the pipeline creation failure when running benchmark evaluations for the first time or after pipeline deletion.

Contributors

Pipeline creation was failing with Pydantic validation errors when
BenchmarkEvaluator attempted to create a new SageMaker Pipeline. This
occurred because the code imported Tag from sagemaker.core.resources
instead of sagemaker.core.shapes, which is what Pipeline.create()
expects for its tags parameter.

Root Cause:
The SDK has two different Tag classes:
- sagemaker.core.resources.Tag: Used for Tag.get_all() operations
- sagemaker.core.shapes.Tag: Used for Pipeline.create() parameter

Changes:
- Import Tag from sagemaker.core.shapes for Pipeline.create()
- Import Tag as ResourceTag from sagemaker.core.resources for Tag.get_all()
- Create proper Tag objects instead of dicts
- Add error handling for tag conversion
- Update Tag.get_all() calls to use ResourceTag

Impact:
This fixes benchmark evaluation failures (MMLU_PRO, BBH, GPQA, etc.)
when creating new pipelines.

Testing:
Verified both creating new pipeline and reusing existing pipeline.