-
Notifications
You must be signed in to change notification settings - Fork 157
Reset more vcpu state on snapshot::restore #1120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
9cfcc9c to
8507c7a
Compare
| xsave | ||
| } | ||
|
|
||
| fn hyperlight_vm(code: &[u8]) -> Result<HyperlightVm> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this duplicating alot of the configuration code we do during VM setup in the test? how will we maintain this with the actual setup? What if there is a difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is calling HyperlightVm::new so they should be the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am concerned about the logic setting up the exclusive memory, stack cookie, etc. It seems like alot of the same in set_up_hypervisor_partition but simplier and having that not be the same might cause difference in configuration that might not be caught. This seems like a good case of doing integration testing at the host API and guest api layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it does indeed do a lot of setup... But that's more related to memory, and whether it's different or not shouldn't affect these tests which exercises vcpu state tests (I think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After rebasing on top of main branch with changes that changed snapshotting, it got even worse. But I still believe it's fine, because memory-specifics is irrelevant to these tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this but I guess it might be fine. We might be able to add an abstraction that "preps the vm" but I haven't really thought this through fully.
d09b1fc to
18d4ff9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR enhances vCPU state management during snapshot restoration by adding debug register abstraction and ensuring all vCPU state is properly reset. The changes follow the established Common* struct pattern for hypervisor abstraction and include comprehensive test coverage across KVM, MSHV, and WHP hypervisors.
Key changes:
- Added
CommonDebugRegsabstraction with implementations for all supported hypervisors (KVM, MSHV, WHP) - Removed padding fields from
CommonFputo simplify comparisons while maintaining correct conversions to hypervisor-specific types - Implemented
reset_vcpu()to reset general purpose registers, debug registers, FPU/XSAVE state, and special registers - Added
reset_vcpu()call inMultiUseSandbox::restore()to ensure clean state after snapshot restoration - Added extensive tests covering all reset scenarios with both direct state setting and actual code execution
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
src/hyperlight_host/src/hypervisor/regs/debug_regs.rs |
New file implementing CommonDebugRegs abstraction with conversions for KVM, MSHV, and WHP |
src/hyperlight_host/src/hypervisor/regs/fpu.rs |
Removed pad1 and pad2 fields from CommonFpu; updated all conversions to use literal 0 for padding |
src/hyperlight_host/src/hypervisor/regs.rs |
Added debug_regs module export |
src/hyperlight_host/src/hypervisor/mod.rs |
Added debug_regs, set_debug_regs, reset_xsave, and set_xsave methods to Hypervisor trait |
src/hyperlight_host/src/hypervisor/kvm.rs |
Implemented debug register and xsave methods for KVM hypervisor |
src/hyperlight_host/src/hypervisor/hyperv_linux.rs |
Implemented debug register and xsave methods for MSHV hypervisor |
src/hyperlight_host/src/hypervisor/hyperv_windows.rs |
Implemented debug register and xsave methods for WHP; refactored gdb breakpoint code to use new abstractions |
src/hyperlight_host/src/hypervisor/hyperlight_vm.rs |
Added reset_vcpu() method and comprehensive test suite covering all vCPU state reset scenarios |
src/hyperlight_host/src/sandbox/initialized_multi_use.rs |
Added reset_vcpu() call in restore() and test to verify debug registers are reset |
src/hyperlight_host/src/sandbox/hypervisor.rs |
Added Copy and Clone derives to HypervisorType enum |
src/hyperlight_host/src/mem/layout.rs |
Changed visibility from pub(super) to pub(crate) for methods used by tests |
src/tests/rust_guests/simpleguest/src/main.rs |
Added SetDr0 and GetDr0 guest functions for testing debug register reset |
.github/workflows/ValidatePullRequest.yml |
Changed fail-fast from true to false for build-test job to allow all matrix combinations to complete |
180efe7 to
6cf483b
Compare
jsturtevant
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great to see. Great work figuring out the differences between the hypervisors and paying attention and documenting the differences. I learned a bunch.
My main concern is around the readability and maintenance of the tests. I've left a few suggestions on how we could maybe clean them up to be more concise and easier to reason about
| #[cfg(feature = "init-paging")] | ||
| fn set_xsave(&self, xsave: &[u32]) -> Result<()>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it only for the init-paging feature? shouldn't there be a test for all the features?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a lot of current code (unrelated to this) depends on init-paging feature, and note that it's a default-feature. All hyperlight usages except nanvix must use this feature. For example GuestPageTableBuffer, set_pt_size, standard_64bit_defaults are some things already gated behind this feature. We can consider ungating these, but let's do that in another PR as it would be a substantial effort.
| #[rustfmt::skip] | ||
| const CODE: [u8; 289] = [ | ||
| // xmm0-xmm7: use movd + pshufd to fill with pattern | ||
| 0xb8, 0x11, 0x11, 0x11, 0x11, // mov eax, 0x11111111 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems really hard to reason about, Could we use something like https://docs.rs/iced-x86/latest/iced_x86/ or put the files in test.asm and compile them at build time with nasm then bring them in with include_bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
| /// Test that snapshot restore properly resets vCPU debug registers. This test verifies | ||
| /// that restore() calls reset_vcpu. | ||
| #[test] | ||
| fn snapshot_restore_resets_debug_registers() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we also add a test like this for other registers too? Doesn't need to be all the registers but atleast the common registers kind of like it is done here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is just making sure reset_vcpu is called. Doing it with other registers is not really feasible because they would be clobbered before/after the guest call, unlike debug registers. I think this is fine because the other registers are tested in hyperlight_vm.rs's tests
0118f6d to
1e8d95a
Compare
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
MSRs will be added in another PR.
into()implementation for kvm/mshv due to single memcyp, but that seems like premature optimization to me