Closed
Description
Is there a plan to add Mixture of Experts (MoE) support for GPT-style models? I found this PR, https://github.com/NVIDIA/NeMo/pull/5409/files, but it seems to target T5-like models. Thanks!