diff --git a/v3-examples/inference-examples/inference-pipeline-modelbuilder-vs-core-example.ipynb b/v3-examples/inference-examples/inference-pipeline-modelbuilder-vs-core-example.ipynb new file mode 100644 index 0000000000..e544bb5024 --- /dev/null +++ b/v3-examples/inference-examples/inference-pipeline-modelbuilder-vs-core-example.ipynb @@ -0,0 +1,795 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# SageMaker V3 Inference Pipeline - ModelBuilder vs Core\n", + "\n", + "This notebook demonstrates how to create and deploy an **inference pipeline** in SageMaker V3. An inference pipeline chains multiple containers together, where the output of one container becomes the input to the next.\n", + "\n", + "### Prerequisites\n", + "Note: Ensure you have sagemaker and ipywidgets installed in your environment. The ipywidgets package is required to monitor endpoint deployment progress in Jupyter notebooks.\n", + "\n", + "## What You'll Learn\n", + "\n", + "1. Train models using `ModelTrainer` (high-level training API)\n", + "2. Package inference code with model artifacts using `repack_model`\n", + "3. Create multi-container pipeline models with `Model.create()`\n", + "4. Deploy pipelines using both low-level APIs and `ModelBuilder`\n", + "\n", + "## Pipeline Architecture\n", + "\n", + "```\n", + "Raw Data → [SKLearn: StandardScaler] → Scaled Data → [XGBoost: Classifier] → Predictions\n", + "```\n", + "\n", + "- **Container 1 (Preprocessing)**: SKLearn StandardScaler normalizes input features\n", + "- **Container 2 (Inference)**: XGBoost binary classifier predicts outcomes\n", + "\n", + "## Why Use Inference Pipelines?\n", + "\n", + "- **Separation of concerns**: Preprocessing and inference logic in separate containers\n", + "- **Reusability**: Same preprocessing can be used with different models\n", + "- **Scalability**: Each container can be optimized independently\n", + "- **Maintainability**: Update one component without affecting others" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Step 1: Setup and Data Preparation\n", + "\n", + "We start by importing the required modules and creating synthetic data for our binary classification task. The data has features at different scales to demonstrate the value of preprocessing." 
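As noted in the prerequisites, the imports below assume `sagemaker` and `ipywidgets` are already available in the kernel. A minimal install cell is sketched here; versions are left unpinned, so adjust to your environment as needed.

```python
# Install the packages this notebook relies on (run once per environment).
# %pip is the standard Jupyter magic and installs into the active kernel.
%pip install --quiet --upgrade sagemaker ipywidgets
```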
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import uuid\n", + "import os\n", + "import tempfile\n", + "import numpy as np\n", + "import pandas as pd\n", + "import boto3\n", + "\n", + "from sagemaker.core.resources import Model, Endpoint, EndpointConfig\n", + "from sagemaker.core.shapes import ContainerDefinition, InferenceExecutionConfig, ProductionVariant\n", + "from sagemaker.core.image_uris import retrieve\n", + "from sagemaker.core.utils import repack_model\n", + "from sagemaker.core.helper.session_helper import Session, get_execution_role\n", + "from sagemaker.train.model_trainer import ModelTrainer\n", + "from sagemaker.train.configs import SourceCode, InputData\n", + "from sagemaker.serve import ModelBuilder" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Initialize session\n", + "sagemaker_session = Session()\n", + "role = get_execution_role()\n", + "region = sagemaker_session.boto_region_name\n", + "bucket = sagemaker_session.default_bucket()\n", + "unique_id = str(uuid.uuid4())[:8]\n", + "prefix = f\"inference-pipeline-v3/{unique_id}\"\n", + "\n", + "print(f\"Region: {region}\")\n", + "print(f\"Bucket: {bucket}\")\n", + "print(f\"Prefix: {prefix}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Generate synthetic data\n", + "np.random.seed(42)\n", + "n_samples = 1000\n", + "\n", + "feature1 = np.random.normal(100, 15, n_samples)\n", + "feature2 = np.random.normal(50, 10, n_samples)\n", + "feature3 = np.random.normal(0.5, 0.1, n_samples)\n", + "feature4 = np.random.normal(1000, 200, n_samples)\n", + "target = ((feature1 > 100) & (feature2 > 50) | (feature4 > 1100)).astype(int)\n", + "\n", + "df = pd.DataFrame({\n", + " 'feature1': feature1, 'feature2': feature2,\n", + " 'feature3': feature3, 'feature4': feature4, 'target': target\n", + "})\n", + "\n", + "train_df = df[:800]\n", + "test_df = df[800:]\n", + "\n", + "# Upload training data\n", + "data_dir = tempfile.mkdtemp()\n", + "train_file = os.path.join(data_dir, 'train.csv')\n", + "train_df.to_csv(train_file, index=False, header=False)\n", + "\n", + "s3_client = boto3.client('s3')\n", + "train_s3_key = f\"{prefix}/data/train.csv\"\n", + "s3_client.upload_file(train_file, bucket, train_s3_key)\n", + "train_data_uri = f\"s3://{bucket}/{train_s3_key}\"\n", + "print(f\"Training data: {train_data_uri}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Step 2: Train SKLearn Model with ModelTrainer\n", + "\n", + "`ModelTrainer` is the V3 high-level API for training. 
It simplifies job creation compared to the low-level `TrainingJob.create()` API.\n", + "\n", + "**Key components:**\n", + "- `SourceCode`: Points to your training script and source directory\n", + "- `InputData`: Defines training data channels\n", + "- The training script only needs training logic - inference code is added separately later" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create SKLearn training script (training only - no inference functions)\n", + "sklearn_source_dir = tempfile.mkdtemp()\n", + "\n", + "sklearn_train_script = '''import argparse, os, joblib\n", + "import pandas as pd\n", + "from sklearn.preprocessing import StandardScaler\n", + "\n", + "if __name__ == \"__main__\":\n", + " parser = argparse.ArgumentParser()\n", + " parser.add_argument(\"--model-dir\", type=str, default=os.environ.get(\"SM_MODEL_DIR\", \"/opt/ml/model\"))\n", + " parser.add_argument(\"--train\", type=str, default=os.environ.get(\"SM_CHANNEL_TRAIN\", \"/opt/ml/input/data/train\"))\n", + " args = parser.parse_args()\n", + " \n", + " train_files = [os.path.join(args.train, f) for f in os.listdir(args.train) if f.endswith(\".csv\")]\n", + " df = pd.concat([pd.read_csv(f, header=None) for f in train_files])\n", + " X = df.iloc[:, :4].values\n", + " \n", + " scaler = StandardScaler()\n", + " scaler.fit(X)\n", + " \n", + " os.makedirs(args.model_dir, exist_ok=True)\n", + " joblib.dump(scaler, os.path.join(args.model_dir, \"model.joblib\"))\n", + " print(f\"Model saved to {args.model_dir}\")\n", + "'''\n", + "\n", + "with open(os.path.join(sklearn_source_dir, 'train.py'), 'w') as f:\n", + " f.write(sklearn_train_script)\n", + "\n", + "print(f\"SKLearn training script: {sklearn_source_dir}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Get SKLearn training image\n", + "sklearn_training_image = retrieve(\n", + " framework=\"sklearn\", region=region, version=\"1.4-2\", py_version=\"py3\"\n", + ")\n", + "print(f\"SKLearn training image: {sklearn_training_image}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Train SKLearn model using ModelTrainer\n", + "sklearn_trainer = ModelTrainer(\n", + " training_image=sklearn_training_image,\n", + " source_code=SourceCode(\n", + " source_dir=sklearn_source_dir,\n", + " entry_script=\"train.py\"\n", + " ),\n", + " base_job_name=\"sklearn-preprocess\",\n", + " role=role,\n", + " sagemaker_session=sagemaker_session\n", + ")\n", + "\n", + "sklearn_trainer.train(\n", + " input_data_config=[InputData(channel_name=\"train\", data_source=train_data_uri)]\n", + ")\n", + "\n", + "sklearn_model_uri = sklearn_trainer._latest_training_job.model_artifacts.s3_model_artifacts\n", + "print(f\"SKLearn model artifacts: {sklearn_model_uri}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Step 3: Train XGBoost Model with ModelTrainer\n", + "\n", + "We train an XGBoost classifier using the same `ModelTrainer` pattern. Note that we pass hyperparameters directly to the trainer." 
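The hyperparameters dictionary is surfaced to the entry script as command-line arguments, which is why `train.py` below declares matching `argparse` flags. The sketch that follows illustrates that mapping locally; it assumes the usual script-mode convention of `--key value` arguments, and the simulated invocation is for illustration only.

```python
# A sketch of how the hyperparameters reach train.py, assuming the usual
# script-mode convention of passing them as "--key value" arguments, e.g.:
#   python train.py --num-round 100 --max-depth 5 --eta 0.2
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--num-round", type=int, default=100)
parser.add_argument("--max-depth", type=int, default=5)
parser.add_argument("--eta", type=float, default=0.2)

# Simulate the container's invocation locally to confirm the flags line up
args = parser.parse_args(["--num-round", "100", "--max-depth", "5", "--eta", "0.2"])
print(args.num_round, args.max_depth, args.eta)  # -> 100 5 0.2
```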
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create XGBoost training script\n", + "xgboost_source_dir = tempfile.mkdtemp()\n", + "\n", + "xgboost_train_script = '''import argparse, os\n", + "import pandas as pd\n", + "import xgboost as xgb\n", + "\n", + "if __name__ == \"__main__\":\n", + " parser = argparse.ArgumentParser()\n", + " parser.add_argument(\"--model-dir\", type=str, default=os.environ.get(\"SM_MODEL_DIR\", \"/opt/ml/model\"))\n", + " parser.add_argument(\"--train\", type=str, default=os.environ.get(\"SM_CHANNEL_TRAIN\", \"/opt/ml/input/data/train\"))\n", + " parser.add_argument(\"--num-round\", type=int, default=100)\n", + " parser.add_argument(\"--max-depth\", type=int, default=5)\n", + " parser.add_argument(\"--eta\", type=float, default=0.2)\n", + " args = parser.parse_args()\n", + " \n", + " train_files = [os.path.join(args.train, f) for f in os.listdir(args.train) if f.endswith(\".csv\")]\n", + " df = pd.concat([pd.read_csv(f, header=None) for f in train_files])\n", + " X, y = df.iloc[:, :4].values, df.iloc[:, 4].values\n", + " \n", + " dtrain = xgb.DMatrix(X, label=y)\n", + " params = {\"max_depth\": args.max_depth, \"eta\": args.eta, \"objective\": \"binary:logistic\"}\n", + " model = xgb.train(params, dtrain, num_boost_round=args.num_round)\n", + " \n", + " os.makedirs(args.model_dir, exist_ok=True)\n", + " model.save_model(os.path.join(args.model_dir, \"xgboost-model\"))\n", + " print(f\"Model saved to {args.model_dir}\")\n", + "'''\n", + "\n", + "with open(os.path.join(xgboost_source_dir, 'train.py'), 'w') as f:\n", + " f.write(xgboost_train_script)\n", + "\n", + "print(f\"XGBoost training script: {xgboost_source_dir}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Get XGBoost training image\n", + "xgboost_training_image = retrieve(\n", + " framework=\"xgboost\", region=region, version=\"3.0-5\",\n", + ")\n", + "print(f\"XGBoost training image: {xgboost_training_image}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Train XGBoost model using ModelTrainer\n", + "xgboost_trainer = ModelTrainer(\n", + " training_image=xgboost_training_image,\n", + " source_code=SourceCode(\n", + " source_dir=xgboost_source_dir,\n", + " entry_script=\"train.py\"\n", + " ),\n", + " hyperparameters={\n", + " \"num-round\": 100,\n", + " \"max-depth\": 5,\n", + " \"eta\": 0.2\n", + " },\n", + " base_job_name=\"xgboost-classifier\",\n", + " role=role,\n", + " sagemaker_session=sagemaker_session\n", + ")\n", + "\n", + "xgboost_trainer.train(\n", + " input_data_config=[InputData(channel_name=\"train\", data_source=train_data_uri)]\n", + ")\n", + "\n", + "xgboost_model_uri = xgboost_trainer._latest_training_job.model_artifacts.s3_model_artifacts\n", + "print(f\"XGBoost model artifacts: {xgboost_model_uri}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Step 4: Create Inference Scripts and Repack Models\n", + "\n", + "Training produces model artifacts (e.g., `model.tar.gz`) but these don't include inference code. The `repack_model` utility:\n", + "\n", + "1. Downloads the original model artifacts from S3\n", + "2. Extracts them to a temporary directory\n", + "3. Adds your inference script to a `code/` subdirectory\n", + "4. 
Re-packages and uploads to S3\n", + "\n", + "**Important for pipelines:** The `output_fn` must return a tuple `(data, content_type)` to explicitly set the content type passed to the next container. Without this, intermediate containers receive `application/json` as the default accept type." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create SKLearn inference script\n", + "sklearn_inference_dir = tempfile.mkdtemp()\n", + "\n", + "sklearn_inference_script = '''import joblib, os\n", + "import numpy as np\n", + "\n", + "def model_fn(model_dir):\n", + " return joblib.load(os.path.join(model_dir, \"model.joblib\"))\n", + "\n", + "def input_fn(request_body, request_content_type):\n", + " if request_content_type == \"text/csv\":\n", + " return np.array([[float(x) for x in line.split(\",\")] for line in request_body.strip().split(\"\\\\n\")])\n", + " raise ValueError(f\"Unsupported content type: {request_content_type}\")\n", + "\n", + "def predict_fn(input_data, model):\n", + " return model.transform(input_data)\n", + "\n", + "def output_fn(prediction, accept):\n", + " # Always return CSV with explicit content-type for pipeline compatibility\n", + " csv_output = \"\\\\n\".join([\",\".join([str(x) for x in row]) for row in prediction])\n", + " return csv_output, \"text/csv\"\n", + "'''\n", + "\n", + "with open(os.path.join(sklearn_inference_dir, 'inference.py'), 'w') as f:\n", + " f.write(sklearn_inference_script)\n", + "\n", + "print(f\"SKLearn inference script: {sklearn_inference_dir}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Repack SKLearn model with inference code using repack_model utility\n", + "sklearn_repacked_uri = f\"s3://{bucket}/{prefix}/sklearn/repacked/model.tar.gz\"\n", + "\n", + "repack_model(\n", + " inference_script=\"inference.py\",\n", + " source_directory=sklearn_inference_dir,\n", + " dependencies=[],\n", + " model_uri=sklearn_model_uri,\n", + " repacked_model_uri=sklearn_repacked_uri,\n", + " sagemaker_session=sagemaker_session\n", + ")\n", + "\n", + "print(f\"Repacked SKLearn model: {sklearn_repacked_uri}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# XGBoost uses built-in inference - no custom script needed for basic CSV input/output\n", + "# The XGBoost container handles text/csv natively\n", + "xgboost_repacked_uri = xgboost_model_uri\n", + "print(f\"XGBoost model (no repack needed): {xgboost_repacked_uri}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Step 5: Deploy with ModelBuilder (Recommended)\n", + "\n", + "`ModelBuilder` provides a simplified deployment experience for inference pipelines. This is the **recommended approach** for most use cases.\n", + "\n", + "**How it works:**\n", + "1. Create individual `Model` objects using `Model.create()` with `primary_container`\n", + "2. Pass the list of models to `ModelBuilder(model=[model1, model2, ...])`\n", + "3. Call `build()` to create the pipeline model\n", + "4. Call `deploy()` to create the endpoint\n", + "\n", + "**Note:** Each `Model` must use `primary_container` (not `containers`). ModelBuilder extracts the container definitions and combines them into a pipeline." 
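Once the pipeline model has been built later in this step, you can optionally confirm that both container definitions were combined into a single serial pipeline. The sketch below uses the standard boto3 `DescribeModel` API and assumes `pipeline_model_mb` from the build cell further down, plus the `region` variable defined in Step 1.

```python
# Optional sanity check: a multi-container pipeline model exposes a "Containers"
# list (in execution order) instead of a single "PrimaryContainer".
import boto3

sm_client = boto3.client("sagemaker", region_name=region)
description = sm_client.describe_model(ModelName=pipeline_model_mb.model_name)

print("Execution mode:", description.get("InferenceExecutionConfig", {}).get("Mode"))
for index, container in enumerate(description.get("Containers", [])):
    print(index, container["Image"], container.get("ModelDataUrl"))
```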
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Get inference images\n", + "sklearn_inference_image = retrieve(\n", + " framework=\"sklearn\", region=region, version=\"1.4-2\"\n", + ")\n", + "xgboost_inference_image = retrieve(\n", + " framework=\"xgboost\", region=region, version=\"3.0-5\"\n", + ")\n", + "print(f\"SKLearn inference image: {sklearn_inference_image}\")\n", + "print(f\"XGBoost inference image: {xgboost_inference_image}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create individual Model objects for each container\n", + "sklearn_model_name = f\"sklearn-model-{unique_id}\"\n", + "xgboost_model_name = f\"xgboost-model-{unique_id}\"\n", + "\n", + "# SKLearn preprocessing model\n", + "sklearn_model = Model.create(\n", + " model_name=sklearn_model_name,\n", + " primary_container=ContainerDefinition(\n", + " image=sklearn_inference_image,\n", + " model_data_url=sklearn_repacked_uri,\n", + " environment={\n", + " \"SAGEMAKER_PROGRAM\": \"inference.py\",\n", + " \"SAGEMAKER_SUBMIT_DIRECTORY\": \"/opt/ml/model/code\"\n", + " }\n", + " ),\n", + " execution_role_arn=role\n", + ")\n", + "\n", + "# XGBoost inference model\n", + "xgboost_model = Model.create(\n", + " model_name=xgboost_model_name,\n", + " primary_container=ContainerDefinition(\n", + " image=xgboost_inference_image,\n", + " model_data_url=xgboost_repacked_uri\n", + " ),\n", + " execution_role_arn=role\n", + ")\n", + "\n", + "print(f\"Created sklearn model: {sklearn_model_name}\")\n", + "print(f\"Created xgboost model: {xgboost_model_name}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create ModelBuilder with list of Models for inference pipeline\n", + "pipeline_builder = ModelBuilder(\n", + " model=[sklearn_model, xgboost_model],\n", + " role_arn=role,\n", + " sagemaker_session=sagemaker_session\n", + ")\n", + "\n", + "# Build the pipeline model\n", + "pipeline_model_mb = pipeline_builder.build()\n", + "print(f\"Pipeline model built: {pipeline_model_mb.model_name}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Deploy using ModelBuilder\n", + "endpoint_name_mb = f\"pipeline-mb-{unique_id}\"\n", + "\n", + "endpoint_mb = pipeline_builder.deploy(\n", + " endpoint_name=endpoint_name_mb,\n", + " instance_type=\"ml.m5.large\",\n", + " initial_instance_count=1\n", + ")\n", + "\n", + "print(f\"Endpoint deployed: {endpoint_name_mb}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Test the ModelBuilder-deployed endpoint\n", + "test_samples = test_df.iloc[:5, :4].values\n", + "test_labels = test_df.iloc[:5, 4].values\n", + "\n", + "csv_data = \"\\n\".join([\",\".join([str(x) for x in row]) for row in test_samples])\n", + "\n", + "response = endpoint_mb.invoke(\n", + " body=csv_data,\n", + " content_type=\"text/csv\",\n", + " accept=\"text/csv\"\n", + ")\n", + "\n", + "result = response.body.read().decode('utf-8')\n", + "predictions = [float(x) for x in result.strip().split('\\n')]\n", + "\n", + "print(\"ModelBuilder Pipeline Results:\")\n", + "print(f\"Predictions: {predictions}\")\n", + "print(f\"Binary: {[1 if p > 0.5 else 0 for p in predictions]}\")\n", + "print(f\"Actual: {list(test_labels)}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": 
{}, + "outputs": [], + "source": [ + "# Clean up ModelBuilder resources\n", + "try:\n", + " endpoint_mb.delete()\n", + " print(f\"Deleted endpoint: {endpoint_name_mb}\")\n", + "except Exception as e:\n", + " print(f\"Error: {e}\")\n", + "\n", + "try:\n", + " sklearn_model.delete()\n", + " xgboost_model.delete()\n", + " pipeline_model_mb.delete()\n", + " print(\"Deleted models\")\n", + "except Exception as e:\n", + " print(f\"Error: {e}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Alternative: Low-level API Deployment\n", + "\n", + "This section demonstrates the low-level approach using `Model.create()` with multiple `ContainerDefinition` objects. Use this when you need fine-grained control over the deployment configuration.\n", + "\n", + "**Key parameters:**\n", + "- `containers`: List of `ContainerDefinition` objects executed in order\n", + "- `container_hostname`: Identifies each container in logs and metrics\n", + "- `inference_execution_config`: Set to `Serial` for pipeline execution\n", + "- `environment`: Must include `SAGEMAKER_PROGRAM` and `SAGEMAKER_SUBMIT_DIRECTORY` for custom inference scripts" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create inference pipeline model\n", + "pipeline_model_name = f\"pipeline-model-{unique_id}\"\n", + "\n", + "pipeline_model = Model.create(\n", + " model_name=pipeline_model_name,\n", + " containers=[\n", + " ContainerDefinition(\n", + " container_hostname=\"preprocessing\",\n", + " image=sklearn_inference_image,\n", + " model_data_url=sklearn_repacked_uri,\n", + " environment={\n", + " \"SAGEMAKER_PROGRAM\": \"inference.py\",\n", + " \"SAGEMAKER_SUBMIT_DIRECTORY\": \"/opt/ml/model/code\"\n", + " }\n", + " ),\n", + " ContainerDefinition(\n", + " container_hostname=\"inference\",\n", + " image=xgboost_inference_image,\n", + " model_data_url=xgboost_repacked_uri\n", + " )\n", + " ],\n", + " inference_execution_config=InferenceExecutionConfig(mode=\"Serial\"),\n", + " execution_role_arn=role\n", + ")\n", + "\n", + "print(f\"Pipeline model created: {pipeline_model.model_name}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "### Deploy the Inference Pipeline\n", + "\n", + "Deployment requires creating an `EndpointConfig` and then an `Endpoint`. This is the low-level approach that gives you full control over the deployment configuration." 
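The next cell blocks on `endpoint.wait_for_status()` until the endpoint is ready. If you prefer to poll explicitly (for example, to log intermediate states), a minimal sketch using the boto3 `DescribeEndpoint` API is shown below; `endpoint_name` and `region` are the variables defined in this notebook.

```python
# A minimal polling loop around the boto3 DescribeEndpoint API, as an explicit
# alternative to endpoint.wait_for_status() used in the next cell.
import time
import boto3

def wait_until_in_service(name, region_name, poll_seconds=30, timeout_seconds=1800):
    """Poll the endpoint status until it is InService, raising on failure or timeout."""
    sm_client = boto3.client("sagemaker", region_name=region_name)
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        status = sm_client.describe_endpoint(EndpointName=name)["EndpointStatus"]
        print(f"Endpoint status: {status}")
        if status == "InService":
            return
        if status == "Failed":
            raise RuntimeError(f"Endpoint {name} failed to deploy")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {name} did not reach InService within {timeout_seconds}s")

# Example (run after the endpoint below has been created):
# wait_until_in_service(endpoint_name, region)
```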
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create endpoint config and endpoint\n", + "endpoint_config_name = f\"pipeline-config-{unique_id}\"\n", + "endpoint_name = f\"pipeline-endpoint-{unique_id}\"\n", + "\n", + "endpoint_config = EndpointConfig.create(\n", + " endpoint_config_name=endpoint_config_name,\n", + " production_variants=[\n", + " ProductionVariant(\n", + " variant_name=\"AllTraffic\",\n", + " model_name=pipeline_model_name,\n", + " initial_instance_count=1,\n", + " instance_type=\"ml.m5.large\"\n", + " )\n", + " ]\n", + ")\n", + "\n", + "endpoint = Endpoint.create(\n", + " endpoint_name=endpoint_name,\n", + " endpoint_config_name=endpoint_config_name\n", + ")\n", + "\n", + "print(f\"Creating endpoint: {endpoint_name}\")\n", + "endpoint.wait_for_status(target_status=\"InService\")\n", + "print(f\"Endpoint ready: {endpoint_name}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "### Test the Inference Pipeline\n", + "\n", + "When invoking the pipeline:\n", + "- Your input goes to Container 1 (SKLearn preprocessing)\n", + "- Container 1's output automatically flows to Container 2 (XGBoost)\n", + "- Container 2's output is returned as the final response\n", + "\n", + "The `content_type` you specify applies to Container 1's input, and `accept` applies to Container 2's output." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Test inference\n", + "test_samples = test_df.iloc[:5, :4].values\n", + "test_labels = test_df.iloc[:5, 4].values\n", + "\n", + "csv_data = \"\\n\".join([\",\".join([str(x) for x in row]) for row in test_samples])\n", + "\n", + "response = endpoint.invoke(\n", + " body=csv_data,\n", + " content_type=\"text/csv\",\n", + " accept=\"text/csv\"\n", + ")\n", + "\n", + "result = response.body.read().decode('utf-8')\n", + "predictions = [float(x) for x in result.strip().split('\\n')]\n", + "\n", + "print(\"Pipeline Inference Results:\")\n", + "print(f\"Predictions (probabilities): {predictions}\")\n", + "print(f\"Binary predictions: {[1 if p > 0.5 else 0 for p in predictions]}\")\n", + "print(f\"Actual labels: {list(test_labels)}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "### Clean Up\n", + "\n", + "Delete resources in reverse order of creation: Endpoint → EndpointConfig → Model." 
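If you want to exercise the endpoint one final time before tearing it down, the same request can be made with the low-level `sagemaker-runtime` client. The sketch below reuses `endpoint_name`, `region`, and `csv_data` from the test cell above.

```python
# Equivalent invocation through the low-level SageMaker runtime client.
# ContentType applies to the first container's input and Accept to the last
# container's output, exactly as with endpoint.invoke() above.
import boto3

runtime_client = boto3.client("sagemaker-runtime", region_name=region)
runtime_response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="text/csv",
    Accept="text/csv",
    Body=csv_data,
)
print(runtime_response["Body"].read().decode("utf-8"))
```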
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Clean up resources\n", + "print(\"Cleaning up...\")\n", + "\n", + "try:\n", + " endpoint.delete()\n", + " print(f\"Deleted endpoint: {endpoint_name}\")\n", + "except Exception as e:\n", + " print(f\"Error: {e}\")\n", + "\n", + "try:\n", + " endpoint_config.delete()\n", + " print(f\"Deleted endpoint config: {endpoint_config_name}\")\n", + "except Exception as e:\n", + " print(f\"Error: {e}\")\n", + "\n", + "try:\n", + " pipeline_model.delete()\n", + " print(f\"Deleted model: {pipeline_model_name}\")\n", + "except Exception as e:\n", + " print(f\"Error: {e}\")\n", + "\n", + "print(\"Cleanup completed!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "This notebook demonstrated two approaches for deploying inference pipelines in SageMaker V3.\n", + "\n", + "### Approach 1: ModelBuilder (Recommended)\n", + "\n", + "| Step | API | Description |\n", + "|------|-----|-------------|\n", + "| Training | `ModelTrainer` | High-level training with `SourceCode` and `InputData` |\n", + "| Repacking | `repack_model()` | Adds inference code to model artifacts |\n", + "| Models | `Model.create(primary_container=...)` | Individual models per container |\n", + "| Deploy | `ModelBuilder(model=[...]).deploy()` | Single call for build + deploy |\n", + "\n", + "### Approach 2: Low-level APIs (Full Control)\n", + "\n", + "| Step | API | Description |\n", + "|------|-----|-------------|\n", + "| Training | `ModelTrainer` | Same as above |\n", + "| Repacking | `repack_model()` | Same as above |\n", + "| Model | `Model.create(containers=[...])` | Creates multi-container pipeline model |\n", + "| Deploy | `EndpointConfig` + `Endpoint` | Explicit endpoint configuration |\n", + "\n", + "### Key Concepts\n", + "\n", + "**Training vs Inference Code Separation:**\n", + "- Training scripts focus on model fitting\n", + "- Inference logic added via `repack_model`\n", + "\n", + "**Pipeline Data Flow:**\n", + "- `content_type` in `invoke()` → applies to first container's input\n", + "- `accept` in `invoke()` → applies to last container's output\n", + "- Intermediate data: controlled by `output_fn` return value\n", + "\n", + "### When to Use Each Approach\n", + "\n", + "- **ModelBuilder**: Quick deployment, recommended for most use cases\n", + "- **Low-level APIs**: Fine-grained control over endpoint configuration" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}