-
Notifications
You must be signed in to change notification settings - Fork 233
feat: i/o for DocVec #1562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: i/o for DocVec #1562
Conversation
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets discuss I don't think we should implement this that way.
If I am understanding correctly here the json representation of a DocList and a DocVec is the same right ?
IMO DocVec should have its own column based json representation.
For csv etcc I would say we ask first user to instantiate a DocList then cast to docVec
Fair
Why? The whole point of this issue is to natively enable these, no? Why do you think DocVec should not have these methods? |
My point is that saving line by line in csv for instance is a bit weird for a column base format. But maybe yes we can have them, I just don't want it to be confusing |
yeah this is fair, I will address that |
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
Signed-off-by: Johannes Messner <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very nice!
📝 Docs are deployed on https://ft-feat-docvec-io--jina-docs.netlify.app 🎉 |
This PR implements the IOMixinArray for DocVec, enabling the following (de)serializations:
Depends on #1561 (protobuf serialization bug)
JSON data now contains four columns, like this:
CSV is not supported, since it is a row-based format.
All other formats are used the same way as in DocList, but have a different serialized representation.
Closes #1330