C++: Add cpp/extraction-information query#21512

Merged
paldepind merged 2 commits intogithub:mainfrom
paldepind:cpp/extraction-information
Mar 20, 2026

@paldepind paldepind commented Mar 19, 2026

This PR adds a cpp/telemetry/extraction-information query similar to the $LANG/telemetry/extraction-information query for Rust, Java, and C#.

The implementation is consistent with the other languages. For now the query only includes metrics for resolved calls. The difference between this metric and the "Calls with explicit target" metric in cpp/telemetry/extraction-metrics is that only calls in the source are considered.

This distinction is important for evaluating BMN with dependency installation. In the table below the first row shows the existing metric and the remaining rows show the new metric for the project nmap.

We believe that dependency installation has a detrimental effect on this project, but the existing metric looks like an improvement. This is because many additional calls are added in dependencies, which causes the number to increase. With the new metric, the degradation in the database is clear.

The new metric also shows clearly the quality difference between traced and BMN (irrespective of dependency installation).
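To make the distinction concrete, here is a minimal toy model in Python. It is not the query's implementation: the `Call` record, its `in_source` and `has_target` fields, and all the counts are hypothetical, chosen only to show how a whole-database count can rise while the source-only numbers get worse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    in_source: bool   # hypothetical flag: the call site is in the project's own source
    has_target: bool  # hypothetical flag: the call was resolved to a target


def whole_database_resolved(calls):
    """Count resolved calls anywhere in the database (shape of the existing metric)."""
    return sum(1 for c in calls if c.has_target)


def source_only_stats(calls):
    """Return (resolved, missing, percent resolved) for calls in source only."""
    src = [c for c in calls if c.in_source]
    resolved = sum(1 for c in src if c.has_target)
    missing = len(src) - resolved
    percent = 100.0 * resolved / len(src) if src else 0.0
    return resolved, missing, percent


# Made-up databases: installing dependencies adds 10 resolved calls outside
# the source, while two source calls lose their targets.
without_deps = [Call(True, True)] * 8 + [Call(True, False)] * 2
with_deps = (
    [Call(True, True)] * 6 + [Call(True, False)] * 4 + [Call(False, True)] * 10
)

print(whole_database_resolved(without_deps), source_only_stats(without_deps))
print(whole_database_resolved(with_deps), source_only_stats(with_deps))
```

In this made-up data the whole-database count doubles from 8 to 16, while the source-only resolution rate drops from 80% to 60%.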

| Metric | traced | BMN | BMN + deps inst |
| --- | --- | --- | --- |
| Existing "Calls with explicit target" | 41,261 | 43,192 | 51,688 |
| Percentage of calls with call target | 100 | 79.18 | 71.709 |
| Number of calls with call target | 25,218 | 28,983 | 22,729 |
| Number of calls with missing call target | 0 | 7,621 | 8,967 |
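As a sanity check on the table, the percentage row appears to be the number of calls with a target divided by the sum of calls with and without a target. That relationship is inferred from the figures above, not taken from the query source. The helper name `percent_with_target` is made up for this sketch.

```python
# (calls with call target, calls with missing call target), copied from the table.
rows = {
    "traced": (25_218, 0),
    "BMN": (28_983, 7_621),
    "BMN + deps inst": (22_729, 8_967),
}


def percent_with_target(with_target, missing):
    """Percentage of calls that have a target; assumes at least one call."""
    total = with_target + missing
    if total == 0:
        raise ValueError("no calls to compute a percentage from")
    return 100.0 * with_target / total


for name, (with_target, missing) in rows.items():
    print(f"{name}: {percent_with_target(with_target, missing):.3f}%")
```

To the precision shown in the table, the computed values match the reported 100, 79.18, and 71.709.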

In DCA this should populate the "Missing call targets, per source" that's produced for every language. In my DCA run it looks a bit weird (maybe because only one side had the query?). To get the other metrics as well we'll have to add new summaries to DCA.

Note that I didn't modify the existing cpp/telemetry/extraction-metrics for two reasons:

  • Going forward I think we should use cpp/telemetry/extraction-information for consistency with other languages.
  • I didn't want to break anything existing. Maybe cpp/telemetry/extraction-metrics is hooked up to some telemetry and dashboards?

@github-actions github-actions bot added the C++ label Mar 19, 2026
@paldepind paldepind force-pushed the cpp/extraction-information branch from e212be4 to 1af128f Compare March 19, 2026 13:28
@paldepind paldepind force-pushed the cpp/extraction-information branch from 1af128f to 4c525ce Compare March 19, 2026 13:29
@paldepind paldepind marked this pull request as ready for review March 20, 2026 07:55
@paldepind paldepind requested a review from a team as a code owner March 20, 2026 07:55
Copilot AI review requested due to automatic review settings March 20, 2026 07:55
Copilot AI left a comment

Pull request overview

Adds a C/C++ “extractor information” telemetry metric query to align C++ with other languages’ $LANG/telemetry/extraction-information reporting, focused initially on call-target resolution in source files.

Changes:

  • Add cpp/telemetry/extraction-information metric query (ExtractorInformation.ql) based on CallTargetStatsReport.
  • Introduce a new Telemetry/DatabaseQuality.qll library module that defines call-target quality stats restricted to source files.
  • Update C++ query-suite .expected manifests to include the new telemetry query.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| cpp/ql/src/Telemetry/ExtractorInformation.ql | New C++ extractor-information metric query exporting call-target quality stats. |
| cpp/ql/src/Telemetry/DatabaseQuality.qll | New stats module feeding extractor-information with call-target quality metrics. |
| cpp/ql/integration-tests/query-suite/cpp-security-extended.qls.expected | Adds the new telemetry query to the expected suite contents. |
| cpp/ql/integration-tests/query-suite/cpp-security-and-quality.qls.expected | Adds the new telemetry query to the expected suite contents. |
| cpp/ql/integration-tests/query-suite/cpp-code-scanning.qls.expected | Adds the new telemetry query to the expected suite contents. |

@paldepind paldepind added the no-change-note-required This PR does not need a change note label Mar 20, 2026
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
jketema commented Mar 20, 2026

> Note that I didn't modify the existing cpp/telemetry/extraction-metrics for two reasons:

Are we using it in DCA to get extraction stats out, or was that done differently?

paldepind commented Mar 20, 2026

> > Note that I didn't modify the existing cpp/telemetry/extraction-metrics for two reasons:
>
> Are we using it in DCA to get extraction stats out, or was that done differently?

Yes, we use cpp/telemetry/extraction-metrics in DCA to create this table.

If we keep expanding on cpp/telemetry/extraction-information (like in other languages) then I'd think that we could eventually remove cpp/telemetry/extraction-metrics.

@jketema jketema left a comment

LGTM

@paldepind paldepind merged commit f6c81ff into github:main Mar 20, 2026
22 checks passed
@paldepind paldepind deleted the cpp/extraction-information branch March 20, 2026 13:14