
Commit 661d6df: move examples
1 parent eebf286

22 files changed
Lines changed: 130 additions & 88 deletions

.github/workflows/run_jupyter_notebooks.yml

Lines changed: 2 additions & 1 deletion
@@ -90,8 +90,9 @@ jobs:
       HF_TOKEN: ${{ secrets.HF_TOKEN }}
     run: |
       MAXTEXT_REPO_ROOT=$(pwd)
-      MAXTEXT_NOTEBOOKS_ROOT="$MAXTEXT_REPO_ROOT/src/MaxText/examples"
+      MAXTEXT_NOTEBOOKS_ROOT="$MAXTEXT_REPO_ROOT/src/maxtext/examples"

+      shopt -s nullglob # Prevents loop from running if no files match
       for notebook in "$MAXTEXT_NOTEBOOKS_ROOT"/{sft,rl}*.ipynb; do
         filename=$(basename "$notebook")
         output_name="${filename%.ipynb}_output.ipynb"
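A note on the `shopt -s nullglob` line added above: in bash, a glob that matches nothing is left in place as a literal string, so without nullglob the loop body would still run on the unexpanded pattern(s). A minimal sketch of the behavior (the directory path is illustrative, not from this commit):

```bash
#!/usr/bin/env bash
# Brace expansion turns {sft,rl}*.ipynb into two patterns before globbing.
# Without nullglob, each pattern that matches nothing stays literal, so this
# loop still runs once per unmatched pattern with a bogus "filename".
for notebook in /tmp/empty-dir/{sft,rl}*.ipynb; do
  echo "without nullglob: $notebook"
done

# With nullglob, unmatched patterns expand to nothing and the loop is skipped.
shopt -s nullglob
for notebook in /tmp/empty-dir/{sft,rl}*.ipynb; do
  echo "with nullglob: $notebook"  # never printed when there are no matches
done
```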

README.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ See our guide on running MaxText in decoupled mode, without any GCP dependencies

 * \[December 22, 2025\] [Muon optimizer](https://kellerjordan.github.io/posts/muon) is now supported.
 * \[December 10, 2025\] DeepSeek V3.1 is now supported. Use existing configs for [DeepSeek V3 671B](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/configs/models/deepseek3-671b.yml) and load in V3.1 checkpoint to use model.
-* \[December 9, 2025\] [New RL and SFT Notebook tutorials](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/MaxText/examples) are available.
+* \[December 9, 2025\] [New RL and SFT Notebook tutorials](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/maxtext/examples) are available.
 * \[December 4, 2025\] The [ReadTheDocs documentation site](https://maxtext.readthedocs.io/en/latest/index.html) has been reorganized.
 * \[December 3, 2025\] Multi-host support for GSPO and GRPO is now available via [new RL tutorials](https://maxtext.readthedocs.io/en/latest/tutorials/posttraining/rl_on_multi_host.html).
 * \[November 20, 2025\] A new guide, [What is Post Training in MaxText?](https://maxtext.readthedocs.io/en/latest/tutorials/post_training_index.html), is now available.

codecov.yml

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ fixes:
 ignore:
   - "src/MaxText/assets"
   - "src/MaxText/configs"
-  - "src/MaxText/examples"
+  - "src/maxtext/examples"
   - "src/MaxText/experimental"
   - "src/MaxText/inference"
   - "src/MaxText/inference_mlperf"

docs/guides/run_python_notebook.md

Lines changed: 40 additions & 27 deletions
@@ -19,6 +19,7 @@ Before starting, make sure you have:
 - ✅ Basic familiarity with Jupyter, Python, and Git

 **For Method 2 (Visual Studio Code) and Method 3 (Local Jupyter Lab) only:**
+
 - ✅ A Google Cloud Platform (GCP) account with billing enabled
 - ✅ TPU quota available in your region (check under IAM & Admin → Quotas)
 - `tpu.nodes.create` permission to create a TPU VM
@@ -36,16 +37,18 @@ Currently, this method only supports the **`sft_qwen3_demo.ipynb`** notebook, wh

 Before proceeding, please verify that the specific notebook you are running works reliably on the free-tier TPU resources. If you encounter frequent disconnections or resource limitations, you may need to:

-* Upgrade to a Colab Pro or Pro+ subscription for more stable and powerful TPU access.
+- Upgrade to a Colab Pro or Pro+ subscription for more stable and powerful TPU access.

-* Move to local Jupyter Lab setup method with access to a powerful TPU machine.
+- Move to local Jupyter Lab setup method with access to a powerful TPU machine.

 ### Step 1: Choose an Example
-1.a. Visit the [MaxText examples directory](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/MaxText/examples) on Github.
+
+1.a. Visit the [MaxText examples directory](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/maxtext/examples) on Github.

 1.b. Find the notebook you want to run (e.g., `sft_qwen3_demo.ipynb`) and copy its URL.

 ### Step 2: Import into Colab
+
 2.a. Go to [Google Colab](https://colab.research.google.com/) and sign in.

 2.b. Select **File** -> **Open Notebook**.
@@ -63,9 +66,11 @@ Before proceeding, please verify that the specific notebook you are running work
 3.c. Click **Save**

 ### Step 4: Run the Notebook
+
 Follow the instructions within the notebook cells to install dependencies and run the training/inference.

 ## Method 2: Visual Studio Code with TPU (Recommended)
+
 Running Jupyter notebooks in Visual Studio Code (VS Code) provides a powerful, interactive environment that combines the flexibility of notebooks with the robust features of a code editor. Follow these steps to get your environment up and running.

 ### Step 1: Set Up TPU VM
@@ -75,9 +80,10 @@ In Google Cloud Console, create a standalone TPU VM:
 1.a. **Compute Engine** → **TPUs** → **Create TPU**

 1.b. Example config:
-   - **Name:** `maxtext-tpu-node`
-   - **TPU type:** Choose your desired TPU type
-   - **Runtime Version:** `tpu-ubuntu2204-base` (or other compatible runtime)
+
+- **Name:** `maxtext-tpu-node`
+- **TPU type:** Choose your desired TPU type
+- **Runtime Version:** `tpu-ubuntu2204-base` (or other compatible runtime)

 ### Step 2: SSH to TPU-VM via VS Code

@@ -86,11 +92,12 @@ In Google Cloud Console, create a standalone TPU VM:
 2.b. Follow [Connect to a remote host](https://code.visualstudio.com/docs/remote/ssh#_connect-to-a-remote-host) guide to connect to your TPU-VM via VS Code.

 ### Step 3. Install Necessary Extensions on VS Code
+
 To enable notebook support, you must install two official extensions from the VS Code Marketplace:

-* Python Extension: Provides support for the Python language.
+- Python Extension: Provides support for the Python language.

-* Jupyter Extension: Enables you to create, edit, and run `.ipynb` files directly inside VS Code.
+- Jupyter Extension: Enables you to create, edit, and run `.ipynb` files directly inside VS Code.

 To install, click the `Extensions` icon on the left sidebar (or press `Ctrl+Shift+X` or `Cmd+Shift+X`), search for `Jupyter` and `Python`, and click `Install`.

@@ -99,6 +106,7 @@ To install, click the `Extensions` icon on the left sidebar (or press `Ctrl+Shif
 To execute post-training notebooks on your TPU-VM, follow the official [MaxText installation guides](https://maxtext.readthedocs.io/en/latest/tutorials/posttraining/rl.html#create-virtual-environment-and-install-maxtext-dependencies) to install MaxText and its dependencies inside a dedicated virtual environment.

 ### Step 5: Install the necessary library for Jupyter
+
 Jupyter requires a kernel to execute code. This kernel is tied to a specific Python environment. Open your terminal inside VS Code and run:

 ```bash
@@ -110,9 +118,9 @@ uv pip install ipykernel
 Before you can run the notebook, you must tell VS Code which Python environment to use.

 1. Look at the top-right corner of the notebook editor.
-2. Click `Select Kernel`.
-3. Choose Python Environments and select the virtual environment you created in Step 4.
-4. Open [available post-training notebooks in MaxText](#available-examples) inside VS Code and run the jupyter notebook cells.
+1. Click `Select Kernel`.
+1. Choose Python Environments and select the virtual environment you created in Step 4.
+1. Open [available post-training notebooks in MaxText](#available-examples) inside VS Code and run the jupyter notebook cells.

 ## Method 3: Local Jupyter Lab with TPU (Recommended)
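Looking back at Method 2, Steps 5 and 6 (installing `ipykernel`, then picking the kernel): it can help to register the MaxText virtual environment as a named Jupyter kernel so it is easy to find in the VS Code kernel picker. A hedged sketch, assuming a venv at `~/maxtext-venv`; the path, kernel name, and display name are placeholders, not values from this commit:

```bash
# Activate the virtual environment that has MaxText and ipykernel installed.
source ~/maxtext-venv/bin/activate

# Register this environment as a named kernel; it then shows up under
# "Select Kernel" in VS Code and in the Jupyter Lab launcher.
python -m ipykernel install --user --name maxtext-venv --display-name "Python (MaxText)"
```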

@@ -125,12 +133,15 @@ In Google Cloud Console, create a standalone TPU VM:
 1.a. **Compute Engine** → **TPUs** → **Create TPU**

 1.b. Example config:
-   - **Name:** `maxtext-tpu-node`
-   - **TPU type:** Choose your desired TPU type
-   - **Runtime Version:** `tpu-ubuntu2204-base` (or other compatible runtime)
+
+- **Name:** `maxtext-tpu-node`
+- **TPU type:** Choose your desired TPU type
+- **Runtime Version:** `tpu-ubuntu2204-base` (or other compatible runtime)

 ### Step 2: Connect with Port Forwarding
+
 Run the following command on your local machine:
+
 > **Note**: The `--` separator before the `-L` flag is required. This tunnels the remote port 8888 to your local machine securely.

 ```bash
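The command itself is elided by the hunk above; for orientation only, a typical Step 1 and Step 2 flow with the gcloud CLI might look like the sketch below. The TPU name, zone, accelerator type, and runtime version are placeholders, not values taken from this commit; check your own quota and the supported runtimes for your TPU type.

```bash
# Step 1 (CLI equivalent of the console flow): create a standalone TPU VM.
gcloud compute tpus tpu-vm create maxtext-tpu-node \
  --zone=us-central2-b \
  --accelerator-type=v4-8 \
  --version=tpu-ubuntu2204-base

# Step 2: SSH in and forward remote port 8888 to your local machine.
# Everything after "--" is passed straight to ssh, hence the -L tunnel flag.
gcloud compute tpus tpu-vm ssh maxtext-tpu-node \
  --zone=us-central2-b \
  -- -L 8888:localhost:8888
```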
@@ -170,13 +181,15 @@ jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root
 ```

 ### Step 7: Access the Notebook
+
 7.a. Look at the terminal output for a URL that looks like: `http://127.0.0.1:8888/lab?token=...`.

 7.b. Copy that URL.

 7.c. Paste it into your **local computer's browser**.
-   * **Important:** If you changed the port in Step 2 (e.g., to `9999`), you must manually replace `8888` in the URL with `9999`.
-   * *Example:* `http://127.0.0.1:9999/lab?token=...`
+
+- **Important:** If you changed the port in Step 2 (e.g., to `9999`), you must manually replace `8888` in the URL with `9999`.
+- *Example:* `http://127.0.0.1:9999/lab?token=...`

 7.d. Once the interface opens in your browser, Click on the current kernel name (e.g., `Python 3 (ipykernel)`).

@@ -197,13 +210,13 @@ jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root

 ## Common Pitfalls & Debugging

-| Issue | Solution |
-|-------|----------|
-| ❌ TPU runtime mismatch | Check TPU runtime version matches VM image |
-| ❌ Colab disconnects | Save checkpoints to GCS or Drive regularly |
-| ❌ "RESOURCE_EXHAUSTED" errors | Use smaller batch size or v5e-8 instead of v5e-1 |
-| ❌ Firewall blocked | Ensure port 8888 open, or always use SSH tunneling |
-| ❌ Path confusion | In Colab use `/content/maxtext`; in TPU VM use `~/maxtext` |
+| Issue                          | Solution                                                    |
+| ------------------------------ | ------------------------------------------------------------ |
+| ❌ TPU runtime mismatch        | Check TPU runtime version matches VM image                  |
+| ❌ Colab disconnects           | Save checkpoints to GCS or Drive regularly                  |
+| ❌ "RESOURCE_EXHAUSTED" errors | Use smaller batch size or v5e-8 instead of v5e-1            |
+| ❌ Firewall blocked            | Ensure port 8888 open, or always use SSH tunneling          |
+| ❌ Path confusion              | In Colab use `/content/maxtext`; in TPU VM use `~/maxtext`  |

 ## Support and Resources

@@ -217,9 +230,9 @@ jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root
 If you encounter issues or have improvements for this guide, please:

 1. Open an issue on the MaxText repository
-2. Submit a pull request with your improvements
-3. Share your experience in the discussions
+1. Submit a pull request with your improvements
+1. Share your experience in the discussions

----
+______________________________________________________________________

-**Happy Training! 🚀**
+**Happy Training! 🚀**

docs/tutorials/first_run.md

Lines changed: 6 additions & 1 deletion
@@ -75,7 +75,12 @@ In the same TPU VM where you just installed all the dependencies of MaxText, You

 #### Decoding in MaxText via notebook

-You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/examples/demo_decoding.ipynb) to try out decoding on MaxText's `Llama3.1-8b` model implementation. In this notebook, we give `"I love to"` as the prompt, and the greedily sampled first output token is `" cook"`. Please remember to provide the path to your `Llama3.1-8b` checkpoint for the `load_parameters_path` argument in the config inside the notebook. You can use [to_maxtext.py](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/to_maxtext.py) to create a MaxText/Orbax checkpoint from a Huggingface checkpoint.
+You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/demo_decoding.ipynb) to try out decoding on MaxText's `Llama3.1-8b` model implementation. In this notebook, we give `"I love to"` as the prompt, and the greedily sampled first output token is `" cook"`. Please remember to provide the path to your `Llama3.1-8b` checkpoint for the `load_parameters_path` argument in the config inside the notebook. You can use [to_maxtext.py](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/to_maxtext.py) to create a MaxText/Orbax checkpoint from a Huggingface checkpoint.
+>>>>>>> fdeee6de (move examples)
+You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/demo_decoding.ipynb) to try out decoding on MaxText's `Llama3.1-8b` model implementation. In this notebook, we give `"I love to"` as the prompt, and the greedily sampled first output token is `" cook"`. Please remember to provide the path to your `Llama3.1-8b` checkpoint for the `load_parameters_path` argument in the config inside the notebook. You can use [to_maxtext.py](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/to_maxtext.py) to create a MaxText/Orbax checkpoint from a Huggingface checkpoint.
+=======
+You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/demo_decoding.ipynb) to try out decoding on MaxText's `Llama3.1-8b` model implementation. In this notebook, we give `"I love to"` as the prompt, and the greedily sampled first output token is `" cook"`. Please remember to provide the path to your `Llama3.1-8b` checkpoint for the `load_parameters_path` argument in the config inside the notebook. You can use [to_maxtext.py](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/to_maxtext.py) to create a MaxText/Orbax checkpoint from a Huggingface checkpoint.
+>>>>>>> fdeee6de (move examples)

 ### Run MaxText on NVIDIA GPUs

docs/tutorials/posttraining/multimodal.md

Lines changed: 6 additions & 1 deletion
@@ -6,7 +6,12 @@ This document provides a guide to use the multimodal functionalities in MaxText
 - **Multimodal Decode**: Inference with text+images as input.
 - **Supervised Fine-Tuning (SFT)**: Apply SFT to the model using a visual-question-answering dataset.

-We also provide a [colab](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/examples/multimodal_gemma3_demo.ipynb) for multimodal features demonstration. The following table provides a list of models and modalities we currently support:
+We also provide a [colab](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/multimodal_gemma3_demo.ipynb) for multimodal features demonstration. The following table provides a list of models and modalities we currently support:
+>>>>>>> fdeee6de (move examples)
+We also provide a [colab](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/multimodal_gemma3_demo.ipynb) for multimodal features demonstration. The following table provides a list of models and modalities we currently support:
+=======
+We also provide a [colab](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/multimodal_gemma3_demo.ipynb) for multimodal features demonstration. The following table provides a list of models and modalities we currently support:
+>>>>>>> fdeee6de (move examples)

 | Models | Input Modalities | Output Modalities |
 | :--------------------------------------------- | :--------------- | :---------------- |

pedagogical_examples/__init__.py

Lines changed: 0 additions & 13 deletions
This file was deleted.

src/MaxText/configs/rl.yml

Lines changed: 1 addition & 1 deletion
@@ -166,7 +166,7 @@ reasoning_start_token: '<reasoning>'
 reasoning_end_token: '</reasoning>'
 solution_start_token: '<answer>'
 solution_end_token: '</answer>'
-chat_template_path: 'src/MaxText/examples/chat_templates/gsm8k_rl.json'
+chat_template_path: 'src/maxtext/examples/chat_templates/gsm8k_rl.json'
 skip_jax_distributed_system: True

 # # TODO(@mazumdera): fix this
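Since `rl.yml` now resolves the chat template under the lower-case `src/maxtext/` tree, a quick sanity check after pulling this commit is to confirm the file exists at the new location. Run from the repository root; the echoed messages are illustrative:

```bash
# Verify the relocated GSM8K RL chat template is where rl.yml expects it.
test -f src/maxtext/examples/chat_templates/gsm8k_rl.json \
  && echo "chat template found" \
  || echo "chat template missing; check the examples move"
```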

src/MaxText/maxtext_utils.py

Lines changed: 0 additions & 2 deletions
@@ -15,8 +15,6 @@
 # pylint: disable=line-too-long, disable=bare-except, consider-using-generator
 """ Utils that are only interesting to MaxText. """

-"""This is PR#3000!"""
-
 import functools
 import pickle
 import os

src/maxtext/__init__.py

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+# Copyright 2023–2026 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+MaxText is a high performance, highly scalable, open-source LLM written in pure Python/Jax and targeting Google Cloud
+TPUs and GPUs for training and inference. MaxText achieves high MFUs and scales from single host to very large clusters
+while staying simple and "optimization-free" thanks to the power of Jax and the XLA compiler.
+"""
+
+__author__ = "Google LLC"
+__version__ = "0.1.1"
+__description__ = (
+    "MaxText is a high performance, highly scalable, open-source LLM written in pure Python/Jax and "
+    "targeting Google Cloud TPUs and GPUs for training and **inference."
+)
+
+from MaxText.layers import models
+
+Transformer = models.Transformer
+transformer_as_linen = models.transformer_as_linen
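With the new `src/maxtext/__init__.py` in place, a quick way to confirm that the lower-case package imports and re-exports `Transformer` is a one-liner smoke test. This assumes the repository's `src/` layout is installed into the active environment (for example via an editable install); the command is illustrative and not part of this commit:

```bash
# Hypothetical smoke test for the new lower-case package name.
python3 -c "import maxtext; print(maxtext.__version__); print(maxtext.Transformer)"
```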
