Skip to content

Conversation

@r-hensley
Copy link

@r-hensley r-hensley commented Jul 14, 2025

Description

This pull request fixes misclassification of certain .v Verilog files that were previously identified as Rocq Prover or Coq due to syntax edge cases. These cases include:

  • Module stubs where the module keyword is followed directly by the port list with no space (e.g., module foo(...))
  • Files consisting mainly of `pragma protect directives (common in encrypted IP cores)
  • Compact syntax using always@ or initial@ without space before the sensitivity list

The updated heuristics.yml pattern broadens the existing Verilog match to allow for:

  • Optional spacing before # and ( in module declarations
  • `pragma as a valid Verilog preprocessor directive
  • Optional spacing before @ in always and initial blocks

Two new .v samples are added to demonstrate these cases, and an existing sample (button_debounce.v) is updated to include a real-world use of always@.


Checklist:

  • I am fixing a misclassified language
    • I have included a new sample for the misclassified language:
      • Sample source(s):
      • Sample license(s):
        • Vivado-generated code: compatible with example/sample use under Xilinx’s tools
      • Updated source(s):
        • button_debounce.v: Change always @ ( to always@( to provide an example of the possibility of no whitespace between these operators.
    • I have included a change to the heuristics to distinguish my language from others using the same extension.

Supporting Evidence

  • GitHub code search shows both always @ and always@ are widely used:
  • Example of real-world use of always@:
    1bitSDR/NCO.v#L36

…d IP

- Allow no space between module name and parameter/port list using \s*
- Add `pragma` to preprocessor directive match
- Allow no space between `always`/`initial` and @

This improves detection of real-world Verilog files such as:
- Vivado-generated module stubs like `module name(...)`
- Encrypted vendor IP with `pragma protect`
- Compact constructs like `always@(posedge clk)`

Previously, these were sometimes misclassified as Rocq Prover or Coq.
- Added `module_stub.v`: Vivado-style stub using `module name(...)` syntax with no space
- Added `encrypted_module.v`: representative of encrypted IP using `pragma protect`
- Modified `button_debounce.v` to include a valid `always@(...)` syntax case

These examples demonstrate the heuristic's improved ability to detect real-world Verilog files,
including those previously misclassified as Rocq Prover or Coq due to compact or directive-heavy syntax.
@r-hensley r-hensley marked this pull request as ready for review July 14, 2025 19:59
@r-hensley r-hensley requested a review from a team as a code owner July 14, 2025 19:59
state <= (!reset_n) ? WAIT : next_state;

always @ (posedge clk or negedge reset_n) begin
always@(posedge clk or negedge reset_n) begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally don't change samples. It's better to add a new sample.

@lildude lildude enabled auto-merge December 12, 2025 11:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants