Fix regex test OOM on x86 checked coreclr #126007
danmoseley wants to merge 3 commits into dotnet:main
Chunk SourceGenRegexAsync batches to 200 patterns on 32-bit processes to avoid OutOfMemoryException from Roslyn compiling thousands of patterns at once in the ~2GB address space.

Skip CharClassSubtraction_DeepNesting_DoesNotStackOverflow on 32-bit as the 1000-depth nested BDD exhausts address space.

Fixes dotnet#126003

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
I think the recently added CharClassSubtraction_DeepNesting test is what pushed this over the edge; the chunking is a reasonable extra change here.
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
Pull request overview
This PR addresses OutOfMemoryException failures in System.Text.RegularExpressions.Tests on the x86 checked CoreCLR test leg by reducing peak memory usage in source-generated regex compilation and skipping a deep-nesting stress test on 32-bit processes.
Changes:
- Chunk source-generator Roslyn compilations into smaller batches on 32-bit processes to avoid exhausting the ~2GB address space.
- Skip CharClassSubtraction_DeepNesting_DoesNotStackOverflow on 32-bit processes to avoid address-space exhaustion in deep BDD construction paths.
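The batching described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in RegexGeneratorHelper.netcoreapp.cs: the names BatchingSketch, SplitIntoBatches, and MaxPatternsPer32BitBatch are hypothetical, and the process bitness is passed as a parameter so both paths can be exercised on any machine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BatchingSketch
{
    // Hypothetical constant mirroring the 200-pattern cap described in this PR.
    const int MaxPatternsPer32BitBatch = 200;

    // Returns the inputs as one batch on 64-bit, or as chunks of at most
    // MaxPatternsPer32BitBatch items on 32-bit. Each batch would then be
    // handed to its own compilation (not shown here).
    public static IEnumerable<T[]> SplitIntoBatches<T>(T[] items, bool is64BitProcess)
    {
        if (is64BitProcess || items.Length <= MaxPatternsPer32BitBatch)
        {
            return new[] { items };
        }

        // Enumerable.Chunk (available since .NET 6) yields arrays of the requested size,
        // with a shorter final array when the count does not divide evenly.
        return items.Chunk(MaxPatternsPer32BitBatch);
    }

    static void Main()
    {
        // 2,903 placeholder patterns, matching the largest batch size mentioned in the PR.
        string[] patterns = Enumerable.Range(0, 2903).Select(i => $"a{{{i}}}").ToArray();

        foreach (bool is64 in new[] { true, false })
        {
            int count = SplitIntoBatches(patterns, is64).Count();
            Console.WriteLine($"is64BitProcess={is64}: {count} batch(es)");
        }
    }
}
```

With 2,903 patterns and a cap of 200, the 32-bit path produces 15 batches (14 full batches plus one of 103), while the 64-bit path keeps a single batch. In real use the flag would presumably come from Environment.Is64BitProcess.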
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/RegexGeneratorHelper.netcoreapp.cs | Adds 32-bit-only batching (max 200 patterns per compilation) for source-generated regex creation. |
| src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.Match.Tests.cs | Adds a 32-bit guard to skip the deep nesting char class subtraction test. |
… list

- Use [ConditionalTheory(typeof(PlatformDetection), nameof(PlatformDetection.Is64BitProcess))] instead of runtime check, matching the pattern used by StressTestDeepNestingOfLoops
- Pre-allocate List<Regex> with regexes.Length capacity to avoid resizing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@copilot address any remaining feedback |
… enclosing scope

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
[ConditionalTheory(typeof(PlatformDetection), nameof(PlatformDetection.Is64BitProcess))] // deep nesting exhausts address space on 32-bit
[SkipOnTargetFramework(TargetFrameworkMonikers.NetFramework, "Fix is not available on .NET Framework")]
The new [ConditionalTheory(...Is64BitProcess)] skips this deep-nesting test for all engines on 32-bit, but the OOM described in the PR/issue is specific to the NonBacktracking (symbolic) engine path. Skipping the whole theory reduces coverage for Interpreter/Compiled (and potentially SourceGenerated) on x86 even though they may run fine. Consider removing the 64-bit-only attribute and instead skipping only when !Environment.Is64BitProcess && RegexHelpers.IsNonBacktracking(engine) (and optionally lowering depth on 32-bit for the other engines if needed to keep memory reasonable).
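The narrower skip condition suggested in this comment could look something like the sketch below. It is a standalone illustration under stated assumptions: the RegexEngine enum and the local IsNonBacktracking helper are hypothetical stand-ins for the test suite's own types (the comment refers to RegexHelpers.IsNonBacktracking), and how the test would actually report a skip is left out.

```csharp
using System;

// Hypothetical stand-in for the engine enumeration used by the regex tests.
enum RegexEngine { Interpreter, Compiled, NonBacktracking, SourceGenerated }

static class SkipSketch
{
    // Hypothetical local equivalent of the RegexHelpers.IsNonBacktracking check named in the review.
    static bool IsNonBacktracking(RegexEngine engine) => engine == RegexEngine.NonBacktracking;

    // Skip only the combination the review identifies as problematic:
    // a 32-bit process running the NonBacktracking engine.
    public static bool ShouldSkipDeepNesting(RegexEngine engine, bool is64BitProcess) =>
        !is64BitProcess && IsNonBacktracking(engine);

    static void Main()
    {
        foreach (RegexEngine engine in Enum.GetValues<RegexEngine>())
        {
            bool skip = ShouldSkipDeepNesting(engine, Environment.Is64BitProcess);
            Console.WriteLine($"{engine}: skip in this process = {skip}");
        }
    }
}
```

Compared with a 64-bit-only attribute on the whole theory, a per-engine check like this would keep the other engines running on x86, at the cost of a runtime check inside the test body.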
Four System.Text.RegularExpressions.Tests tests fail with OutOfMemoryException on the x86 checked coreclr leg. The SourceGenerated regex engine path runs full Roslyn compilations at test time, and large batches (up to ~2,903 patterns) exhaust the ~2GB x86 address space, especially with 4 parallel xunit threads and checked runtime overhead.

Changes
- Chunk batches on 32-bit (RegexGeneratorHelper.netcoreapp.cs): When !Environment.Is64BitProcess, compile at most 200 patterns per Roslyn invocation instead of all at once. No change on 64-bit.
- Skip CharClassSubtraction_DeepNesting_DoesNotStackOverflow on 32-bit (Regex.Match.Tests.cs): The 1000-depth nested char class BDD exhausts address space via a different mechanism (recursive RegexNodeConverter.CreateBDDFromSetString). This test was added 3 weeks ago in Fix StackOverflowException from deeply nested character class subtractions #124995 and already had to be skipped on browser-wasm (Skip NonBacktracking deep nesting test on single-threaded WASM #125021).

Failing tests covered
- RegexPcre2Tests.IsMatchTests: DefiniteAssignmentPass (Array.Resize)
- MonoTests.ValidateRegex: BlobBuilder (Roslyn emit)
- RegexGroupTests.Groups
- CharClassSubtraction_DeepNesting: RegexNodeConverter.CreateBDD

Fixes #126003