-
Notifications
You must be signed in to change notification settings - Fork 233
feat: support redis #1550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support redis #1550
Conversation
Signed-off-by: jupyterjazz <[email protected]>
Signed-off-by: jupyterjazz <[email protected]>
Signed-off-by: jupyterjazz <[email protected]>
Signed-off-by: jupyterjazz <[email protected]>
Signed-off-by: jupyterjazz <[email protected]>
Signed-off-by: Saba Sturua <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
Signed-off-by: jupyterjazz <[email protected]>
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
Signed-off-by: jupyterjazz <[email protected]>
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small comments, but looking good!
docarray/index/backends/redis.py
Outdated
|
||
:return: Number of documents in the index. | ||
""" | ||
num_docs = self._client.ft(self._index_name).info()['num_docs'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this work? Maybe I'm wrong, but it seems like this 'num_docs'
key is not something that is built into redis. Do we manually keep track of this somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, num_docs
is included in the index info
but apparently there's a better way to get number of docs
return self._client.dbsize()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think we should stick with the old way because dbsize returns number of keys the client stores, combining all indices so it's inaccurate if you have more than one index
docarray/index/backends/redis.py
Outdated
|
||
ids = [self._prefix + id for id in doc_ids] | ||
docs = self._client.json().mget(ids, '$') | ||
docs = [doc[0] for doc in docs if doc] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs = [doc[0] for doc in docs if doc] | |
docs = (doc[0] for doc in docs if doc) |
I think this avoids doing an iteration over the data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes but the abstract class expects a list and not a generator, we can refactor that in a separate pr maybe
def __contains__(self, item: BaseDoc) -> bool: | ||
""" | ||
Checks if a given document exists in the index. | ||
|
||
:param item: The document to check. | ||
It must be an instance of BaseDoc or its subclass. | ||
:return: True if the document exists in the index, False otherwise. | ||
""" | ||
if safe_issubclass(type(item), BaseDoc): | ||
return self._doc_exists(item.id) | ||
else: | ||
raise TypeError( | ||
f"item must be an instance of BaseDoc or its subclass, not '{type(item).__name__}'" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worthy of a separete PR, but I think we should move __contains__
to the abstract class and tell backends to implement _doc_exitsts()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
1 similar comment
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
Signed-off-by: jupyterjazz <[email protected]>
Signed-off-by: jupyterjazz <[email protected]>
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size. |
📝 Docs are deployed on https://ft-feat-add-redis--jina-docs.netlify.app 🎉 |
Redis as a Document Index Backend
Simple usage:
Supported functionality: