The OSV database contains 100% of vulnerabilities from NVD/CVE since 2016 that are determined to relate to OSS #783

Open
andrewpollock opened this issue Oct 24, 2022 · 11 comments
Assignees
Labels
datasource Requests for new data sources enhancement New feature or request

Comments

@andrewpollock
Contributor

andrewpollock commented Oct 24, 2022

We need to generate OSV records from historical and future CVE records in the NVD that we can determine to relate to Open Source Software.

These records will be keyed by commit.
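As a rough illustration of what "keyed by commit" could look like, here is a minimal sketch of an OSV-style record fragment that uses a `GIT` range from the OSV schema. The CVE ID, repository URL, and commit hash are placeholders, the helper name `make_commit_keyed_record` is hypothetical, and a real record needs further fields (such as `modified`) that are omitted here.

```python
import json


def make_commit_keyed_record(cve_id, repo_url, fixed_commit, introduced_commit="0"):
    """Build a minimal OSV-style fragment whose affected range is expressed in commits.

    Hypothetical helper for illustration only; not the converter used by OSV.dev.
    """
    return {
        "id": cve_id,
        "affected": [
            {
                "ranges": [
                    {
                        "type": "GIT",
                        "repo": repo_url,
                        "events": [
                            # "0" means "affected from the start of history".
                            {"introduced": introduced_commit},
                            {"fixed": fixed_commit},
                        ],
                    }
                ]
            }
        ],
    }


# Placeholder values; none of these identify a real vulnerability or repository.
record = make_commit_keyed_record(
    "CVE-0000-0000",
    "https://example.com/hypothetical/project.git",
    "0123456789abcdef0123456789abcdef01234567",
)
print(json.dumps(record, indent=2))
```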

A side effect of this is that we will start picking up vulnerabilities in C/C++ packages.

@andrewpollock andrewpollock added enhancement New feature or request datasource Requests for new data sources labels Oct 24, 2022
@andrewpollock andrewpollock self-assigned this Dec 14, 2022
@evverx

evverx commented Dec 16, 2022

I wonder if it would be possible to somehow exclude/mark duplicates? For example, https://nvd.nist.gov/vuln/detail/CVE-2021-45941 was presumably generated automatically based on https://osv.dev/vulnerability/OSV-2021-1576 and I don't think it makes much sense to show them both without any links to each other. I think it should also help to somewhat address #258.

@slonka

slonka commented Jan 23, 2023

Any updates on this?

@andrewpollock
Contributor Author

Hi Krzysztof, not sure if you're asking for an update with respect to the work overall, or an update in relation to #783 (comment) specifically?

I can give a progress update for interested parties:

  • "low hanging fruit" CVE records are convertible to OSV today, but not productionized
  • I'm able to derive a reasonably exhaustive set of OSS repositories for CVEs, using the ones from 2022 as my test bench
  • I'm now focusing on mapping from versions in CVEs to commits via Git tags to derive fix commits for CVEs where that is not first-order self-evident.
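The version-to-commit mapping in the last bullet might be sketched roughly as follows. This is not the approach used in `vulnfeeds`, only an assumption-laden illustration: the tag names and commit hashes are made up, the helper names are hypothetical, and the normalization rule (take the first dotted or underscored number run in the tag) is a simplification that real repositories will frequently break.

```python
import re


def normalize_tag(tag):
    """Reduce a tag such as 'v1.2.3' or 'release-1_4_0' to '1.2.3' / '1.4.0'.

    Returns None when the tag contains no version-like number run.
    """
    match = re.search(r"\d+(?:[._]\d+)*", tag)
    if match is None:
        return None
    return match.group(0).replace("_", ".")


def build_version_index(tags_to_commits):
    """Map normalized versions to commits, keeping the first tag seen per version."""
    index = {}
    for tag, commit in tags_to_commits.items():
        version = normalize_tag(tag)
        if version is not None:
            index.setdefault(version, commit)
    return index


def commit_for_version(version, index):
    """Look up the commit for a version string taken from a CVE; None if unknown."""
    return index.get(normalize_tag(version) or version)


# Made-up tag data; in practice this could come from `git ls-remote --tags <repo>`.
tags = {
    "v1.2.3": "a" * 40,
    "release-1_4_0": "b" * 40,
    "nightly": "c" * 40,  # no version number, so it is skipped
}
index = build_version_index(tags)
print(commit_for_version("1.4.0", index))
```

A fix version that has no matching tag simply yields `None`, which is one plausible reason a CVE with an identified repository could still fail to convert.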

@slonka

slonka commented Jan 25, 2023

Hi Andrew, thanks for the update! 🙂 I was just wondering if this was actively being worked on and if there is anything that can be split into smaller tasks so people can help out.

@andrewpollock
Contributor Author

Yes, it's very much being actively worked on. I will look at defining and sharing some milestones; help is always welcome :-)

Everything's currently in a proof-of-concept stage, as I familiarise myself with the input data. Feel free to poke at https://github.com/google/osv.dev/tree/master/vulnfeeds/cpp (I think the README file needs an update)

@andrewpollock
Contributor Author

This issue is overdue for a status update.

Work has been progressing well, and the most recent conversion runs are yielding the following results from the 2022 NVD CVE data set:

  • 15,698 CVEs are determined to relate to applications
  • 6,787 of these CVEs have one or more Git repositories identified for them
  • 5,579 of these CVEs with one or more Git repositories are successfully converting to OSV records

There's still much validation work to be done before expanding to previous years and automating ongoing conversion.

@andrewpollock
Contributor Author

Another overdue status update

We're close (targeting early October) to going live with the data currently available. CVEs from 2016 through 2023 are being processed.

From the 2023 NVD CVE data set:

  • 11,728 CVEs are determined to relate to applications
  • 5,201 of these CVEs have one or more Git repositories identified for them
  • 3,752 of these CVEs with one or more Git repositories are successfully converting to OSV records
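For readers who want the funnel as rates rather than raw counts, the figures reported in this thread for the 2022 and 2023 data sets can be turned into stage-by-stage percentages with a short script. The counts are copied from the status updates; the dictionary keys and the `rates` helper are just illustrative names, and the two runs were taken at different times, so the years are not strictly comparable.

```python
# Counts as reported in the status updates in this issue.
funnel = {
    2022: {"applications": 15698, "with_repos": 6787, "converted": 5579},
    2023: {"applications": 11728, "with_repos": 5201, "converted": 3752},
}


def rates(counts):
    """Return the fraction surviving each stage of the conversion funnel."""
    return {
        # Application CVEs for which at least one Git repository was identified.
        "repo_identified": counts["with_repos"] / counts["applications"],
        # CVEs with a repository that converted to an OSV record.
        "converted_given_repo": counts["converted"] / counts["with_repos"],
        # Application CVEs that ended up as OSV records.
        "converted_overall": counts["converted"] / counts["applications"],
    }


for year, counts in sorted(funnel.items()):
    r = rates(counts)
    print(
        f"{year}: repositories identified {r['repo_identified']:.1%}, "
        f"converted given a repository {r['converted_given_repo']:.1%}, "
        f"converted overall {r['converted_overall']:.1%}"
    )
```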

@andrewpollock
Contributor Author

andrewpollock commented Oct 11, 2023

Early access update for followers of this issue 👋

We've soft-launched this into OSV.dev production overnight, and we now have 31,889 NVD CVE-based records from 2016 through 2023 in OSV.dev.

This enables much broader commit hash-based vulnerability scanning, e.g.

curl -d \
  '{"commit": "227d2c20509f85a394133e2be6d0b0fc1fda54b2"}' \
  "https://api.osv.dev/v1/query" | jq '.vulns | map(.id)'
[
  "CVE-2023-26130"
]
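The same commit query can be issued from a script. Below is a minimal Python sketch using only the standard library; it mirrors the `curl` call above (same endpoint, same `commit` field) and the `jq` filter that keeps only the IDs. The function names are illustrative, and the handling of a response without a `vulns` key is a defensive assumption rather than documented behaviour.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_query(commit):
    """Encode the JSON body for a commit-hash query."""
    return json.dumps({"commit": commit}).encode("utf-8")


def extract_ids(response):
    """Return the vulnerability IDs from a decoded query response.

    A response without a "vulns" key is treated as "no matches".
    """
    return [vuln["id"] for vuln in response.get("vulns", [])]


def query_by_commit(commit, url=OSV_QUERY_URL):
    """POST the query and return the list of matching vulnerability IDs."""
    request = urllib.request.Request(
        url,
        data=build_query(commit),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_ids(json.load(reply))
```

Calling `query_by_commit("227d2c20509f85a394133e2be6d0b0fc1fda54b2")` performs the same request as the `curl` example; the result depends on the live database at the time of the call.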

Any individual record feedback can be filed using this issue template

We'll now continue to iterate on three groups of records: those that didn't convert but look like they should have (weeding out any false positives from this set and addressing any remaining deficiencies); those that converted but did not import successfully; and those that were discarded because they only half-converted and would have caused a false positive.

@slonka

slonka commented Nov 2, 2023

Hi @andrewpollock, thank you very much for the update! Is this only available via the REST API, or does it already work with osv-scanner?

@andrewpollock
Contributor Author

Hi @slonka, I can advise that as of https://github.com/google/osv-scanner/releases/tag/v1.4.3, which went out a few hours ago, the functionality is now fully available in OSV-Scanner directly, as well as via the REST API.

@oliverchang
Collaborator

Our blog post announcing this has just been published: https://osv.dev/blog/posts/introducing-broad-c-c++-support/
