Merged
23 commits
6abf0de
in progress
brown9804 Apr 29, 2025
f8d52a3
Merge 6abf0de7e27b6e8b66ebb4c8fc9d2d8c75ef1000 into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
f3d2b1b
visual guidance added
brown9804 Apr 29, 2025
a7bc310
Merge f3d2b1b0e911bb6287a5faeb0627606bb34fb9f5 into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
ff2078d
working step 6
brown9804 Apr 29, 2025
a26297c
Merge ff2078dbe17d292954331c938f089a83b4287da5 into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
0eadce7
until step 8
brown9804 Apr 29, 2025
9d172a8
Merge 0eadce76013305ac9815127b83cccb19d733bc9b into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
cf438d8
hw to register a model
brown9804 Apr 29, 2025
e1e23b2
Merge cf438d86041e073dbe5deff462bf9de2257edd6e into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
5407865
model creation
brown9804 Apr 29, 2025
3a04bdd
Merge 5407865d4e52a9e13a3f67d628cd26dbec752650 into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
10bb194
updated
brown9804 Apr 29, 2025
8134f32
Merge 10bb194263cb374015fbc304935e7dbddcbd643c into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
8c8acb5
adding title
brown9804 Apr 29, 2025
6c66dcb
Merge 8c8acb5b9dd22c02a464410b7fa5673520fbb14b into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
a2a47a9
pending deploy the model + consume
brown9804 Apr 29, 2025
960ca29
Merge a2a47a942441f8a7a55aedf248e49e1572045ea0 into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
d4ba7f4
add in progress
brown9804 Apr 29, 2025
72f2d41
Merge d4ba7f4f78bebdb348ad70b311cab02d34c8e46a into 6a07ef6938189e07c…
brown9804 Apr 29, 2025
062f323
state
brown9804 May 6, 2025
f6a07d8
Merge 062f323735eede8c04d2ec28e4ed871a41dd079c into 6a07ef6938189e07c…
brown9804 May 6, 2025
8fd3c63
Update last modified date in Markdown files
github-actions[bot] May 6, 2025
7 changes: 2 additions & 5 deletions README.md
@@ -5,7 +5,7 @@ Costa Rica
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-04-29
Last updated: 2025-05-06

------------------------------------------

@@ -15,7 +15,7 @@ Last updated: 2025-04-29
- Terraform [Demonstration: Deploying Azure Resources for a Data Platform (Microsoft Fabric)](./infrastructure/msFabric/)
- Terraform [Demonstration: Deploying Azure Resources for an ML Platform](./infrastructure/azMachineLearning/)
- [Demonstration: How to integrate AI in Microsoft Fabric](./msFabric-AI_integration/)
- [Demonstration: Creating a Machine Learning Model](./azML-modelcreation/)
- [Demonstration: Creating a Machine Learning Model](./azML-modelcreation/) - in progress

> Azure Machine Learning (PaaS) is a cloud-based platform from Microsoft designed to help `data scientists and machine learning engineers build, train, deploy, and manage machine learning models at scale`. It supports the `entire machine learning lifecycle, from data preparation and experimentation to deployment and monitoring.` It provides powerful tools for `both code-first and low-code users`, including Jupyter notebooks, drag-and-drop interfaces, and automated machine learning (AutoML). `Azure ML integrates seamlessly with other Azure services and supports popular frameworks like TensorFlow, PyTorch, and Scikit-learn.`

@@ -284,9 +284,6 @@ Read more about [Endpoints for inference in production](https://learn.microsoft.
</details>





<div align="center">
<h3 style="color: #4CAF50;">Total Visitors</h3>
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
246 changes: 207 additions & 39 deletions azML-modelcreation/README.md
@@ -5,19 +5,34 @@ Costa Rica
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
[brown9804](https://github.com/brown9804)

Last updated: 2025-04-29
Last updated: 2025-05-06

------------------------------------------


<details>
<summary><b>List of References </b> (Click to expand)</summary>

- [AutoML Regression](https://learn.microsoft.com/en-us/azure/machine-learning/component-reference-v2/regression?view=azureml-api-2)
- [Evaluate automated machine learning experiment results](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml?view=azureml-api-2)
- [Evaluate Model component](https://learn.microsoft.com/en-us/azure/machine-learning/component-reference/evaluate-model?view=azureml-api-2)

</details>

<details>
<summary><b>Table of Contents </b> (Click to expand)</summary>

- [Step 1: Set Up Your Azure ML Workspace](#step-1-set-up-your-azure-ml-workspace)
- [Step 2: Create a Compute Instance](#step-2-create-a-compute-instance)
- [Step 3: Prepare Your Data](#step-3-prepare-your-data)
- [Step 4: Create a New Notebook or Script](#step-4-create-a-new-notebook-or-script)
- [Step 5: Load and Explore the Data](#step-5-load-and-explore-the-data)
- [Step 6: Train Your Model](#step-6-train-your-model)
- [Step 7: Evaluate the Model](#step-7-evaluate-the-model)
- [Step 8: Register the Model](#step-8-register-the-model)
- [Step 9: Deploy the Model](#step-9-deploy-the-model)
- [Step 10: Test the Endpoint](#step-10-test-the-endpoint)

</details>

## Step 1: Set Up Your Azure ML Workspace
@@ -69,86 +84,239 @@ https://github.com/user-attachments/assets/c199156f-96cf-4ed0-a8b5-c88db3e7a552

https://github.com/user-attachments/assets/f8cbd32c-94fc-43d3-a7a8-00f63cdc543d

## Step 4: Create a New Notebook or Script

### **4. Create a New Notebook or Script**
- Use the compute instance to open a **Jupyter notebook** or create a Python script.
- Import necessary libraries:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
```

---
https://github.com/user-attachments/assets/16650584-11cb-48fb-928d-c032e519c14b

## Step 5: Load and Explore the Data

> Load the dataset and perform basic EDA (exploratory data analysis):

### **5. Load and Explore the Data**
- Load the dataset and perform basic EDA (exploratory data analysis):
```python
data = pd.read_csv('your_dataset.csv')
print(data.head())
import mltable
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())
data_asset = ml_client.data.get("employee_data", version="1")

tbl = mltable.load(f'azureml:/{data_asset.id}')

df = tbl.to_pandas_dataframe()
df
```

---
https://github.com/user-attachments/assets/5fa65d95-8502-4ab7-ba0d-dfda66378cc2

### **6. Train Your Model**
- Split the data and train a model:
```python
X = data.drop('target', axis=1)
y = data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
## Step 6: Train Your Model

> Split the data and train a model:

model = RandomForestClassifier()
```python
# Step 1: Preprocessing
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Encode categorical columns
label_encoder = LabelEncoder()
df['Department'] = label_encoder.fit_transform(df['Department'])

# Drop non-informative or high-cardinality columns
if 'Name' in df.columns:
    df = df.drop(columns=['Name'])  # 'Name' is likely not predictive

# Optional: Check for missing values
if df.isnull().sum().any():
    df = df.dropna()  # or use df.ffill() to forward-fill instead of dropping rows

# Step 2: Define Features and Target
X = df.drop('Salary', axis=1) # Features: Age and Department
y = df['Salary'] # Target: Salary

# Optional: Feature Scaling (especially useful for models sensitive to scale)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Step 3: Split the Data
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)

# Step 4: Train a Regression Model
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(
    n_estimators=100,
    max_depth=None,
    random_state=42,
    n_jobs=-1  # Use all available cores
)
model.fit(X_train, y_train)
```
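
To make the encoding step above more concrete, here is a minimal pure-Python sketch of what label encoding does: each distinct category gets an integer code, assigned in sorted order. The `label_encode` helper and the department values are hypothetical and only illustrate the idea; the notebook itself uses scikit-learn's `LabelEncoder`.

```python
def label_encode(values):
    """Map each distinct category to an integer code, assigned in sorted order."""
    classes = sorted(set(values))
    mapping = {category: code for code, category in enumerate(classes)}
    return [mapping[value] for value in values], classes


# Hypothetical department values, not taken from the real dataset
departments = ["Sales", "HR", "Engineering", "HR", "Sales"]
codes, classes = label_encode(departments)

print(classes)  # ['Engineering', 'HR', 'Sales']
print(codes)    # [2, 1, 0, 1, 2]
```

The integer codes imply an ordering between departments that does not really exist. Tree-based models such as random forests usually tolerate this, but a one-hot encoding may be a safer choice for linear models.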

---
https://github.com/user-attachments/assets/2176c795-5fda-4746-93c7-8b137b526a09

## Step 7: Evaluate the Model

> Check performance:

### **7. Evaluate the Model**
- Check performance:
```python
# Step 5: Make Predictions
predictions = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, predictions))

# Step 6: Evaluate the Model
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import numpy as np

mae = mean_absolute_error(y_test, predictions)
mse = mean_squared_error(y_test, predictions)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, predictions)

print("Model Evaluation Metrics")
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
print(f"R² Score: {r2:.2f}")
```
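
For readers who want to see what these metrics measure, the sketch below computes MAE, MSE, RMSE, and R² by hand on a tiny example. The `regression_metrics` helper and the salary figures are made up for illustration; they are not results from this model.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MAE, MSE, RMSE, and R² from two equal-length lists of numbers."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up salaries, for illustration only
actual = [50000, 60000, 70000, 80000]
predicted = [52000, 58000, 73000, 79000]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MAE and RMSE are in the same unit as the target (here, salary), which makes them easier to interpret than MSE. R² is unitless and compares the model against always predicting the mean.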

---
<img width="550" alt="image" src="https://github.com/user-attachments/assets/6aa19680-cadb-4fe4-a419-a626942e15f9" />

> Distribution of prediction errors:

```python
import matplotlib.pyplot as plt

# Plot 1: Distribution of prediction errors
errors = y_test - predictions
plt.figure(figsize=(10, 6))
plt.hist(errors, bins=30, color='skyblue', edgecolor='black')
plt.title('Distribution of Prediction Errors')
plt.xlabel('Prediction Error')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()

# Plot 2: Predicted vs Actual values
plt.figure(figsize=(10, 6))
plt.scatter(y_test, predictions, alpha=0.3, color='darkorange')
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.title('Predicted vs Actual Salary')
plt.xlabel('Actual Salary')
plt.ylabel('Predicted Salary')
plt.grid(True)
plt.show()
```

<img width="550" alt="image" src="https://github.com/user-attachments/assets/d8ec1f2c-eb97-4106-9cee-809849d02796">

## Step 8: Register the Model

> Save and register the model in Azure ML:

### **8. Register the Model**
- Save and register the model in Azure ML:
```python
import joblib
joblib.dump(model, 'model.pkl')

from azureml.core import Workspace, Model
ws = Workspace.from_config()
Model.register(workspace=ws, model_path="model.pkl", model_name="my_model")
Model.register(workspace=ws, model_path="model.pkl", model_name="my_model_RegressionModel")
```

---
https://github.com/user-attachments/assets/a82ff03e-437c-41bc-85fa-8b9903384a5b


> [!TIP]
> Click [here](./src/0_ml-model-creation.ipynb) to read the script used.

## Step 9: Deploy the Model

> Create the Scoring Script:

```python
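import json

import joblib
import numpy as np
from azureml.core.model import Model


def init():
    # Load the registered model once, when the service starts
    global model
    model_path = Model.get_model_path("my_model_RegressionModel")
    model = joblib.load(model_path)


def run(data):
    try:
        # The request body arrives as a JSON string, so parse it first
        payload = json.loads(data)
        input_data = np.array(payload["data"])
        result = model.predict(input_data)
        return result.tolist()
    except Exception as e:
        # Return the error message so the caller can see what failed
        return str(e)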
```
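
Before deploying, it can help to exercise the `run` contract locally. The sketch below assumes the service hands `run` the raw request body as a JSON string with a `"data"` key. `StubModel` is a hypothetical stand-in for the trained regressor, so the example needs neither Azure nor scikit-learn.

```python
import json


class StubModel:
    """Hypothetical stand-in for the trained model: predicts the sum of each row."""

    def predict(self, rows):
        return [sum(row) for row in rows]


model = StubModel()


def run(raw_data):
    """Parse a JSON request body and return predictions, or an error payload."""
    try:
        payload = json.loads(raw_data)
        result = model.predict(payload["data"])
        return list(result)
    except Exception as e:
        return {"error": str(e)}


# A well-formed request body, followed by a malformed one
request_body = json.dumps({"data": [[30, 1], [45, 2]]})
print(run(request_body))   # [31, 47]
print(run("not json"))     # a dict with an "error" key
```

Returning the error in a structured payload, rather than a bare string, makes it easier for callers to tell failures apart from predictions.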

https://github.com/user-attachments/assets/cdc64857-3bde-4ec9-957d-5399d9447813

> Create the Environment File (env.yml):

https://github.com/user-attachments/assets/8e7c37a2-e32b-4630-8516-f95926c374c0

> Create a new notebook:

https://github.com/user-attachments/assets/1b3e5602-dc64-4c39-be72-ed1cbd74361e

> Create an **inference configuration** and deploy to a web service:

### **9. Deploy the Model**
- Create an **inference configuration** and deploy to a web service:
```python
from azureml.core import Workspace
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

env = Environment.from_conda_specification(name="myenv", file_path="env.yml")

# Load the workspace
ws = Workspace.from_config()

# Get the registered model
registered_model = Model(ws, name="my_model_RegressionModel")

# Create environment from requirements.txt (no conda)
env = Environment.from_pip_requirements(
    name="regression-env",
    file_path="requirements.txt"  # Make sure this file exists in your working directory
)

# Define inference configuration
inference_config = InferenceConfig(entry_script="score.py", environment=env)


# Define deployment configuration
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(workspace=ws,
name="my-service",
models=[model],
inference_config=inference_config,
deployment_config=deployment_config)

# Deploy the model
service = Model.deploy(
    workspace=ws,
    name="regression-model-service",
    models=[registered_model],
    inference_config=inference_config,
    deployment_config=deployment_config
)

service.wait_for_deployment(show_output=True)
print(f"Scoring URI: {service.scoring_uri}")
```
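
The deployment code reads a `requirements.txt` from the working directory. A minimal sketch for generating one is shown below. The package list is an assumption based on what the scoring script imports (`azureml-defaults` provides the web service hosting dependencies), and versions are left unpinned here; in practice, pin them to match the environment used for training.

```python
from pathlib import Path

# Assumed package list; pin versions to match the training environment
REQUIREMENTS = [
    "azureml-defaults",
    "scikit-learn",
    "joblib",
    "numpy",
]


def write_requirements(path="requirements.txt", packages=REQUIREMENTS):
    """Write one package per line and return the path of the file."""
    target = Path(path)
    target.write_text("\n".join(packages) + "\n", encoding="utf-8")
    return target


req_file = write_requirements()
print(req_file.read_text(encoding="utf-8"))
```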

---

### **10. Test the Endpoint**
- Once deployed, you can send HTTP requests to the endpoint to get predictions.

## Step 10: Test the Endpoint

> Once deployed, you can send HTTP requests to the endpoint to get predictions.
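
A minimal sketch of such a request, using only the Python standard library, might look like the following. The URI is a placeholder (use the scoring URI printed after deployment), the `{"data": [...]}` payload shape is assumed to match what the scoring script expects, and the `build_scoring_request` and `score` helpers are hypothetical. The input rows are also assumed to be preprocessed the same way as the training data (encoded and scaled), since only the model itself is registered.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, rows, api_key=None):
    """Build a POST request carrying the rows as a JSON body."""
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when key-based authentication is enabled on the service
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


def score(scoring_uri, rows, api_key=None, timeout=30):
    """Send the request and decode the JSON response (requires a live endpoint)."""
    request = build_scoring_request(scoring_uri, rows, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder URI; replace it with the value of service.scoring_uri
request = build_scoring_request("http://example.invalid/score", [[30, 1]])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Calling `score(...)` with the real scoring URI sends the request. It is kept separate from request construction so the payload can be inspected without network access.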


