
Plan for MoE #5456

@nilsjohanbjorck

Description

Is there a plan for adding Mixture of Experts (MoE) support for GPT-style models? I've found this PR, https://github.com/NVIDIA/NeMo/pull/5409/files, but it seems to target T5-like models. Thanks!
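
For context, here is a minimal sketch of the kind of layer being asked about: a token-routed MoE feed-forward block with Switch-Transformer-style top-1 routing, written in plain PyTorch. This is purely illustrative; the class name `MoEFeedForward`, the routing scheme, and all dimensions are assumptions, and none of this is taken from NeMo or from PR #5409.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEFeedForward(nn.Module):
    """Illustrative top-1 routed mixture-of-experts FFN block (not NeMo code)."""

    def __init__(self, d_model: int, d_ff: int, num_experts: int):
        super().__init__()
        # Router produces one gating score per expert for each token.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is an ordinary transformer FFN.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten to a stream of tokens for routing.
        tokens = x.reshape(-1, x.size(-1))
        gate_probs = F.softmax(self.router(tokens), dim=-1)
        # Top-1 routing: each token is sent to exactly one expert.
        top_prob, top_idx = gate_probs.max(dim=-1)
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                # Scale each token's expert output by its gate probability.
                out[mask] = top_prob[mask].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape_as(x)


if __name__ == "__main__":
    layer = MoEFeedForward(d_model=64, d_ff=256, num_experts=4)
    y = layer(torch.randn(2, 10, 64))
    print(y.shape)  # torch.Size([2, 10, 64])
```

In a GPT-style decoder this block would replace the dense FFN in some or all transformer layers; production implementations add load-balancing losses and capacity limits on top of this basic routing.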
