Add inner product metric to vector index #21814

Merged: 9 commits, Jun 24, 2025

Conversation

jbajic
Contributor

@jbajic jbajic commented Jun 20, 2025

Scope & Purpose

Add an inner product metric for the vector index. The inner product metric is similar to the cosine metric, except that vectors are not normalized on search or insertion. An example of creating an index with the inner product metric:

  db.<collection>.ensureIndex({
    type: "vector",
    fields: ["value"],
    params: {
      "metric": "innerProduct",
      "dimension": 300,
      "nLists": 100
    }
  });
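
To make the relationship to the cosine metric concrete, here is a minimal sketch in plain JavaScript. It is not the index implementation from this PR; the helper names (`innerProduct`, `normalize`, `cosineSimilarity`) and the sample vectors are made up for illustration. It only shows that cosine similarity is the inner product of normalized vectors, so skipping normalization lets vector length influence the score.

```javascript
// Illustrative only: plain JavaScript, not ArangoDB's index code.
// All names below are hypothetical helpers for this example.

// Inner product (dot product) of two equal-length numeric arrays.
function innerProduct(a, b) {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Scale a vector to unit length. This is the extra step the cosine
// metric needs and the inner product metric skips.
function normalize(v) {
  const norm = Math.sqrt(innerProduct(v, v));
  if (norm === 0) {
    throw new Error("cannot normalize the zero vector");
  }
  return v.map((x) => x / norm);
}

// Cosine similarity is the inner product of the normalized vectors.
function cosineSimilarity(a, b) {
  return innerProduct(normalize(a), normalize(b));
}

const ipQuery = [1, 0];
const ipAligned = [2, 0]; // same direction as the query, length 2
const ipLonger = [3, 4];  // different direction, length 5

// Inner product: the longer vector scores higher despite pointing elsewhere.
console.log(innerProduct(ipQuery, ipAligned), innerProduct(ipQuery, ipLonger));
// Cosine: the aligned vector scores higher because length is factored out.
console.log(cosineSimilarity(ipQuery, ipAligned), cosineSimilarity(ipQuery, ipLonger));
```

With these sample vectors the two metrics rank the candidates differently, which is the practical consequence of not normalizing.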

This PR also adds a new AQL function, APPROX_NEAR_INNER_PRODUCT, which uses a vector index built with the inner product metric. A query using it looks like this:

FOR d IN col
SORT APPROX_NEAR_INNER_PRODUCT(d.vector, @qp) DESC
LIMIT 3
RETURN d; 

As with the cosine metric, a larger inner product score means the vectors are more similar, so the query must sort with DESC to return the most similar vectors first.
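
As a rough reference for what the query above computes, the following sketch does an exact brute-force search in plain JavaScript: score every document by inner product, sort descending, keep the first k. This is an assumption-laden illustration, not how the vector index works internally (the index returns approximate results). `dot`, `topKByInnerProduct`, and the sample documents are hypothetical.

```javascript
// Exact brute-force reference, for illustration only.
// The real vector index returns approximate results.

// Inner product of two equal-length numeric arrays.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Return the k documents with the largest inner product against queryPoint.
// Sorting is descending because a larger score means more similar.
function topKByInnerProduct(docs, queryPoint, k) {
  return docs
    .map((doc) => ({ doc, score: dot(doc.vector, queryPoint) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.doc);
}

// Hypothetical sample data, mirroring the `d.vector` attribute in the query.
const sampleDocs = [
  { _key: "a", vector: [1, 0] },
  { _key: "b", vector: [0, 1] },
  { _key: "c", vector: [2, 2] },
  { _key: "d", vector: [-1, 0] },
];

const nearest = topKByInnerProduct(sampleDocs, [1, 0.5], 3);
console.log(nearest.map((doc) => doc._key));
```

Sorting ascending instead would return the least similar documents first, which is why the AQL example uses DESC.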

  • 💩 Bugfix
  • 🍕 New feature
  • 🔥 Performance improvement
  • 🔨 Refactoring/simplification

Checklist

  • Tests
    • Regression tests
    • C++ Unit tests
    • integration tests
    • resilience tests
  • 📖 CHANGELOG entry made
  • 📚 documentation written (release notes, API changes, ...)
  • Backports
    • Backport for 3.12.0: (Please link PR)
    • Backport for 3.11: (Please link PR)
    • Backport for 3.10: (Please link PR)

Related Information

(Please reference tickets / specification / other PRs etc)

  • Docs PR:
  • Enterprise PR:
  • GitHub issue / Jira ticket:
  • Design document:

@cla-bot cla-bot bot added the cla-signed label Jun 20, 2025
@maierlars
Contributor

Happy to see you working on vector index again :) If I get this PR right, the difference between this and cosine is that we do not normalize the vectors?

@jbajic
Contributor Author

jbajic commented Jun 20, 2025

> Happy to see you working on vector index again :) If I get this PR right, the difference between this and cosine is that we do not normalize the vectors?

Exactly, this one should be more efficient since we avoid normalization! 😃

@jbajic jbajic self-assigned this Jun 20, 2025
@jbajic jbajic marked this pull request as ready for review June 20, 2025 14:46
Member

@neunhoef neunhoef left a comment

See two small wishes in the comments. But I am also missing the place where the kCosine case and the inner product case differ w.r.t. normalization. Maybe I overlooked it, but can you explain to me where this difference is achieved?

Member

@neunhoef neunhoef left a comment

LGTM

@goedderz goedderz merged commit 8a4f2cd into devel Jun 24, 2025
7 checks passed
@goedderz goedderz deleted the feature/inner-product-metric branch June 24, 2025 09:04