OpenELM Architecture Adapter #1172

Open
jlarson4 wants to merge 13 commits into dev-3.x from feature/OpenELM-architecture-adapter
@jlarson4
Collaborator

Added an Architecture Adapter for Apple's OpenELM family of models.
Also added handling for the following cases, which are new with OpenELM:

  • prepare_model and prepare_loading in ArchitectureAdapters – These hooks let us patch compatibility issues between the OpenELM remote code and transformers v5.

  • trust_remote_code – OpenELM's modeling code comes from the model's Hugging Face repository rather than from transformers v5. To fetch and run that code, we need to set trust_remote_code to true for all HF calls.

  • repetition_penalty – Apple's OpenELM example generations suggest using this Hugging Face generation parameter, so we have recreated it in TransformerLens so that our output can match generation done in Hugging Face.
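
For reviewers unfamiliar with repetition_penalty, here is a minimal sketch of the rule that Hugging Face applies for this parameter: the logit of every token that has already appeared is divided by the penalty when positive and multiplied by it when negative. The function name `apply_repetition_penalty` and the use of plain Python lists are assumptions made for illustration. This is not the code added in this PR, which operates on tensors inside the generation loop.

```python
from typing import Iterable, List


def apply_repetition_penalty(
    logits: List[float],
    previous_token_ids: Iterable[int],
    penalty: float,
) -> List[float]:
    """Return a copy of `logits` with already-seen tokens made less likely.

    Hypothetical helper for illustration only; not part of TransformerLens.
    """
    if penalty <= 0:
        raise ValueError("penalty must be strictly positive")

    adjusted = list(logits)  # copy so the caller's logits are not mutated
    for token_id in set(previous_token_ids):
        score = adjusted[token_id]
        # With penalty > 1, dividing a positive logit or multiplying a
        # negative logit both lower the score of a repeated token.
        adjusted[token_id] = score * penalty if score < 0 else score / penalty
    return adjusted


# Tokens 0 and 1 have already been generated; tokens 2 and 3 have not.
logits = [2.0, -1.0, 0.5, 3.0]
penalized = apply_repetition_penalty(logits, previous_token_ids=[0, 1, 0], penalty=2.0)
print(penalized)  # [1.0, -2.0, 0.5, 3.0]
```

A penalty of 1.0 leaves the logits unchanged, so it can serve as the "off" value.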

Type of change:

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@jlarson4 jlarson4 changed the title from "Feature/open elm architecture adapter" to "OpenELM Architecture Adapter" on Feb 12, 2026
@jlarson4 jlarson4 changed the base branch from feature/StableLM-architecture-adapter to dev-3.x on February 12, 2026 17:14