windows support omnidrive by 3a1b2c3 · Pull Request #321 · NVIDIA/flashdreams

3a1b2c3 · 2026-06-10T01:42:13Z

No description provided.

copy-pr-bot · 2026-06-10T01:42:17Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

greptile-apps · 2026-06-10T01:44:45Z

Greptile Summary

This PR adds Windows support to the omnidrive integration across three modules: optional fcntl import with entry-point guards in wheel_profiles.py, OS-specific compiler/linker flag branches in _plugin.py, and a defensive getattr guard in rasterizer.py's cleanup().

wheel_profiles.py: fcntl is now imported inside a try/except ImportError; read_evdev_name and scan_evdev_devices return safe sentinels (None / ()) when fcntl is absent, and a new test validates both paths.
_plugin.py: Compiler options are split into POSIX and Windows branches (MSVC rejects -Wall/-Werror); Windows linker flags add cuda.lib plus an optional /LIBPATH: derived from CUDA_HOME/CUDA_PATH; the dead lib_dir variable is removed.
rasterizer.py: cleanup() uses getattr(self, "_temp_dir", None) so a failed __init__ (e.g. CUDA extension build error) no longer masks the real exception with an AttributeError from __del__.

Confidence Score: 5/5

Safe to merge — the changes are well-scoped Windows stubs and defensive guards that don't alter existing POSIX behavior.

All three changed modules have correct OS guards: the fcntl optional-import pattern is sound, the entry-point short-circuits prevent any unreachable ioctl path from being accidentally exercised, the compiler/linker flag split mirrors the existing POSIX structure, and the getattr cleanup fix is a standard defensive pattern. A new test covers the Windows fcntl-absent code paths. No existing logic is modified.

The Windows linker path in _plugin.py silently omits /LIBPATH: when neither CUDA_HOME nor CUDA_PATH is set — noted in a prior thread — but no files require blocking attention for this change.

Important Files Changed

Filename	Overview
integrations/omnidreams/ludus-renderer/ludus_renderer/_ops/_plugin.py	Restructures compiler/linker options into OS-specific branches for Windows support; removes the dead `lib_dir` variable; adds CUDA driver lib linkage via env-var-derived `/LIBPATH`.
integrations/omnidreams/omnidreams/interactive_drive/input/wheel_profiles.py	Makes `fcntl` import optional via try/except; adds `if fcntl is None` short-circuit guards on `read_evdev_name` and `scan_evdev_devices` so the module loads and behaves safely on Windows.
integrations/omnidreams/omnidreams/interactive_drive/rasterizer.py	Defensive `getattr` guard in `cleanup()` prevents `AttributeError` masking real `__init__` failures when `_temp_dir` was never assigned.
integrations/omnidreams/tests/interactive_drive/test_wheel_profiles.py	Adds `test_evdev_helpers_without_fcntl` to verify both public entry points return safe sentinels when `fcntl` is patched to `None`, covering the Windows stub behaviour.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_get_plugin called] --> B{os.name == 'nt'?}
    B -- Yes --> C[find_cl_path / ensure cl.exe in PATH]
    C --> D[cc_opts = MSVC flags /wd4067 /wd4624]
    D --> E[cuda_opts = nvcc flags no -Xcompiler -Wall]
    E --> F[_cuda_home from CUDA_HOME or CUDA_PATH]
    F --> G{_cuda_home set?}
    G -- Yes --> H[ldflags = cuda.lib + /LIBPATH]
    G -- No --> I[ldflags = cuda.lib only]
    B -- No --> J[cc_opts = GCC/Clang -Wall -Werror -Wno-psabi]
    J --> K[cuda_opts with -Xcompiler -Wall,-Werror,-Wno-psabi]
    K --> L[ldflags = -lcuda]
    H --> M[cpp_extension.load]
    I --> M
    L --> M

_{Reviews (3): Last reviewed commit: "Address PR #321 review: Windows fcntl gu..." | Re-trigger Greptile}

greptile-apps · 2026-06-10T01:44:50Z

+    elif os.name == "nt":
+        # Link the CUDA driver API (cu* functions). MSVC needs cuda.lib, which
+        # lives in $CUDA_HOME\lib\x64 (not auto-added like the runtime libs).
+        _cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH") or ""
+        ldflags = ["cuda.lib"]
+        if _cuda_home:
+            ldflags.append("/LIBPATH:" + os.path.join(_cuda_home, "lib", "x64"))


cuda.lib added without a search path when env vars are unset

When both CUDA_HOME and CUDA_PATH are absent, ldflags is set to just ["cuda.lib"] with no /LIBPATH: directive. The comment notes that cuda.lib is "not auto-added like the runtime libs", so the MSVC linker will only find it if it happens to be in a default search path (e.g. via the Visual Studio CUDA integration). In a CI/container environment without VS integration this silently produces a link error. Emitting a warning or raising an error when _cuda_home is empty would make the failure actionable rather than cryptic.

jmccaffrey-nv · 2026-06-10T05:37:41Z

/ok to test fcfe4e4

jmccaffrey-nv · 2026-06-10T05:37:54Z

Thank you for your contribution !

jmccaffrey-nv · 2026-06-10T06:16:12Z

_ No description provided. _

Please add a concise MR description , including which Windows version and HW tested

jmccaffrey-nv · 2026-06-10T17:27:38Z

assisted with claude

Review — PR #321 "windows support omnidrive"

Verdict: LGTM minor suggestions. I checked out fcfe4e44 and tested it end-to-end natively on Windows 11 (RTX 5090 32 GB, driver 591.86, CUDA toolkit 13.1, MSVC 2022 Build Tools 14.44, Python 3.12, torch 2.11.0+cu130 from the pytorch-cu130 index already pinned for win32 in flashdreams/pyproject.toml). Summary of results:

uv sync --package flashdreams-omnidreams --extra interactive-drive resolves and installs cleanly on native Windows (no Linux-only wheels in the path; transformer-engine correctly excluded by marker, triton-windows picked up).
The Ludus renderer extension compiles and links first-try with this PR's flags — MSVC 19.44 + nvcc 13.1, all 11 build steps, loads as ludus_renderer_plugin.pyd. Without the PR, cl.exe rejects the GCC -Werror/-Xcompiler flags exactly as the new comment says, and the driver-API symbols don't link.
The full interactive-drive demo runs natively: --stream-mjpeg presenter, scene staging cache, Cosmos-Reason1-7B text encoder with --offload-text-encoder, DiT pipeline, browser driving. Benchmark below.

Per-file notes

`ludus_renderer/_ops/_plugin.py` — looks right; the greptile linker comment is covered by torch itself

Re: the bot's "cuda.lib without a search path" concern — in practice this is already double-covered:

torch.utils.cpp_extension with with_cuda=True appends its own /LIBPATH:<cuda_home>\lib\x64 on Windows (cpp_extension.py:2497 in torch 2.11), where cuda_home is resolved via CUDA_HOME → CUDA_PATH → nvcc on PATH → the default install glob — and it raises a clear "CUDA_HOME not found" error if all of those fail, before any silent link error can happen.
The CUDA Windows installer always sets CUDA_PATH, so the PR's explicit /LIBPATH fires in any normal install.

I verified this on my link line — the PR's /LIBPATH and torch's identical one appear side by side. So the explicit /LIBPATH here is harmless belt-and-suspeder and no warning is strictly needed; if you want one anyway, a logger.debug when _cuda_home is empty would do. Two small suggestions:

A one-line comment that paths containing spaces (C:\Program Files\...) survive because torch's ninja writer quotes ldflags (_nt_quote_args) — it looks fragile otherwise.
The NT branch drops warnings-as-errors entirely (POSIX keeps -Wall -Werror; NT has only the two /wd suppressions). Fine as a bootstrap, but consider a follow-up to add /W3 /WX (or a TODO) so the Windows build doesn't permanently drift to warnings-ignored. FWIW the build is warning-clean today except a benign LNK4098 (LIBCMT vs /MD).
Optional drive-by: lib_dir at the top of the NT block is computed but never used (pre-existing).

`wheel_profiles.py` — the greptile `AttributeError` finding is real-but-unreachable; the comment is what needs fixing

I traced all five fcntl.ioctl() call sites: every one is gated behind a successful open()/os.open() of a /dev/input/* evdev node, and scan_evdev_devices() enumerates nothing on Windows (/dev/input/by-id fails is_dir(), the event* glob is empty). The FFB backends additionally bail on _fd is None. So guarding all five sites is over-prescribed — but the bot is right that the new comment ("let the call sites no-op") is inaccurate: if one were reached it would raise AttributeError, which notably is not caught by the surrounding except OSError handlers.

Cheapest honest fix: early-return in the two public entry points (read_evdev_name() → None, scan_evdev_devices() → ()) when fcntl is None, and reword the comment to say the paths are unreachable on Windows rather than no-ops. A tiny test would lock it in (none of this module is covered today):

def test_evdev_helpers_without_fcntl(monkeypatch, tmp_path):
    monkeypatch.setattr(wheel_profiles, "fcntl", None)
    assert wheel_profiles.scan_evdev_devices() == ()
    assert wheel_profiles.read_evdev_name(tmp_path / "event0") is None

`rasterizer.py` — good catch

The getattr guard is correct, and worth keeping even apart from Windows: pre-PR, a failed extension build raised in __init__ before _temp_dir was assigned, and __del__ → cleanup()'s AttributeError masked the real build error. (Equivalent alternative: assign self._temp_dir = None first thing in __init__; the chosen fix is fine and self-documenting.)

Native 5090 benchmark (`interactive-drive --stream-mjpeg --offload-text-encoder`)

Steady-state wall_present_fps from INTERACTIVE_DRIVE_PROFILE_INPUT_TO_PRESENT=1, same scene (clipgt-0d404ff7), same GPU. WSL2 numbers are from prior runs on the identical machine for comparison:

Config	Native Windows (this PR)	WSL2 (same rig)
Default manifest (1280×704, 2 steps)	7.3 FPS (8-frame chunk ≈ 1035 ms, model 1007 ms; peak 21.9 GiB)	~6.8 FPS
Perf manifest w/ `native_dit_acceleration: disabled` (1168×640, `compile_net`)	13.4 FPS (chunk ≈ 539 ms; peak 20.0 GiB)	—
Perf manifest, full native FP8/cuDNN DiT	does not build on Windows (see below)	~15–20 FPS

Native Windows matches (slightly beats) WSL2 at identical config, fits comfortably in 32 GB with --offload-text-encoder, and the browser drive loop works end to end.

Remaining gap to Linux is entirely the omnidreams_singleview native extension, which is out of this PR's scope but worth tracking as follow-ups (happy to file issues):

sync_thirdparty.py fails on Windows checkouts with core.autocrlf=true: the *.patch files get CRLF-translated and git apply reports "corrupt patch". A .gitattributes with *.patch -text (repo has none today) fixes it at the root; I worked around it with GIT_CONFIG_* env overrides.
omnidreams/native/omnidreams_singleview.py cuDNN discovery is Linux-only (libcudnn.so.9 checks, no Windows cudnn lookup), so the build dies at cudnn.h, and the -lcublas/-l:libcudnn.so.9 ldflags are GCC-style.
Sage3/Sparge are already explicitly reported unsupported on win32 (optimized_dit.py:860,886) — the cuDNN attention backend would be the natural first Windows target.

Other run notes: Triton autotune rejects the 128–144 KB smem kernel candidates on consumer Blackwell (benign "out of resource" warnings, same as Linux); triton-windows 3.5 handled all Inductor kernels without issue.

Request-changes-level items

None. The fcntl comment wording is the only thing I'd fix before merge (a one-liner), plus please add a PR description covering the tested toolchain and the Sage3/Sparge scope limit — the diff is small but it's a platform-support claim, and the next Windows user will want to know what was validated ("win" as the commit message undersells a working native port!).

Nice work — small diff that's the missing piece for native Windows; everything else (cu130 index pin for win32, triton-windows, TE exclusion) was already in main and holds up under a real run.

Guard the evdev entry points (read_evdev_name, scan_evdev_devices) when fcntl is unavailable (Windows) so the unreachable ioctl paths are never entered, and reword the comment to match. Add a regression test. Tidy the Windows renderer-plugin branch: drop the unused lib_dir, add a /W3 /WX TODO, and note that torch _nt_quote_args keeps the /LIBPATH intact for space-containing CUDA paths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

3a1b2c3 · 2026-06-11T00:00:38Z

My Claude did:
Reworded the fcntl comment and added short-circuit guards in read_evdev_name/scan_evdev_devices (closes P1), plus a regression test. Also tidied the _plugin.py NT branch: dropped the unused lib_dir, added a /W3 /WX TODO, and a comment noting torch's _nt_quote_args keeps /LIBPATH intact for space-containing CUDA paths.

jmccaffrey-nv · 2026-06-11T01:13:49Z

thanks for revision ! Approved , testing next.

win

fcfe4e4

greptile-apps Bot reviewed Jun 10, 2026

View reviewed changes

3a1b2c3 changed the title ~~windows support~~ windows support omnidrive Jun 10, 2026

liruilong940607 requested review from gtong-nv and wlewNV June 10, 2026 04:58

3a1b2c3 force-pushed the windows branch from f257944 to e6c1ee9 Compare June 10, 2026 22:33

jmccaffrey-nv approved these changes Jun 11, 2026

View reviewed changes

Comment thread integrations/omnidreams/ludus-renderer/ludus_renderer/_ops/_plugin.py

jmccaffrey-nv enabled auto-merge June 11, 2026 01:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

windows support omnidrive#321

windows support omnidrive#321
3a1b2c3 wants to merge 2 commits into
NVIDIA:mainfrom
3a1b2c3:windows

3a1b2c3 commented Jun 10, 2026

Uh oh!

copy-pr-bot Bot commented Jun 10, 2026

Uh oh!

greptile-apps Bot commented Jun 10, 2026 •

edited

Loading

Uh oh!

Uh oh!

greptile-apps Bot Jun 10, 2026

Uh oh!

jmccaffrey-nv commented Jun 10, 2026

Uh oh!

jmccaffrey-nv commented Jun 10, 2026

Uh oh!

jmccaffrey-nv commented Jun 10, 2026

Uh oh!

jmccaffrey-nv commented Jun 10, 2026

Uh oh!

3a1b2c3 commented Jun 11, 2026 •

edited

Loading

Uh oh!

jmccaffrey-nv commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

3a1b2c3 commented Jun 10, 2026

Uh oh!

copy-pr-bot Bot commented Jun 10, 2026

Uh oh!

greptile-apps Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

greptile-apps Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

jmccaffrey-nv commented Jun 10, 2026

Uh oh!

jmccaffrey-nv commented Jun 10, 2026

Uh oh!

jmccaffrey-nv commented Jun 10, 2026

Uh oh!

jmccaffrey-nv commented Jun 10, 2026

Review — PR #321 "windows support omnidrive"

Per-file notes

ludus_renderer/_ops/_plugin.py — looks right; the greptile linker comment is covered by torch itself

wheel_profiles.py — the greptile AttributeError finding is real-but-unreachable; the comment is what needs fixing

rasterizer.py — good catch

Native 5090 benchmark (interactive-drive --stream-mjpeg --offload-text-encoder)

Request-changes-level items

Uh oh!

3a1b2c3 commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jmccaffrey-nv commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

greptile-apps Bot commented Jun 10, 2026 •

edited

Loading

`ludus_renderer/_ops/_plugin.py` — looks right; the greptile linker comment is covered by torch itself

`wheel_profiles.py` — the greptile `AttributeError` finding is real-but-unreachable; the comment is what needs fixing

`rasterizer.py` — good catch

Native 5090 benchmark (`interactive-drive --stream-mjpeg --offload-text-encoder`)

3a1b2c3 commented Jun 11, 2026 •

edited

Loading