fix: make DocList.to_json() return str instead of bytes #1769

Merged: 4 commits, Sep 1, 2023

JohannesMessner
Member

@JohannesMessner JohannesMessner commented Sep 1, 2023

💥 Contains breaking change

This changes the return type of DocList.to_json() and DocVec.to_json() from bytes to str, by decoding the bytes output obtained from the orjson library.
This is done to make it consistent with to_json() on BaseDoc and in pydantic.

from_json() does not need to be touched, since orjson happily accepts strings as well.
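
A minimal sketch of the change described above, not the actual docarray implementation. The helper names `to_json_old` and `to_json_new` are hypothetical, and plain dicts stand in for documents. It uses `orjson` when available and otherwise falls back to a stdlib stand-in that mimics its bytes output, so the sketch runs either way:

```python
import json

try:
    import orjson  # third-party; orjson.dumps() returns bytes

    dumps_bytes = orjson.dumps
    loads = orjson.loads
except ImportError:
    # Stand-in that mimics the bytes-out behaviour, only so the sketch runs.
    def dumps_bytes(obj):
        return json.dumps(obj).encode("utf-8")

    loads = json.loads


def to_json_old(docs):
    """Hypothetical pre-change behaviour: return the serializer's bytes as-is."""
    return dumps_bytes(docs)


def to_json_new(docs):
    """Hypothetical post-change behaviour: decode the bytes to str (UTF-8)."""
    return dumps_bytes(docs).decode()


docs = [{"text": "hello"}, {"text": "wörld"}]
old, new = to_json_old(docs), to_json_new(docs)

# The parser accepts both bytes and str, which is why the loading side
# needs no change.
restored_from_old = loads(old)
restored_from_new = loads(new)
```

Both inputs parse to the same data, which matches the remark that from_json() can stay as it is.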

This decoding step incurs a small performance penalty:
[image: benchmark screenshot, not preserved]

Across multiple runs, I saw a to_json() performance penalty of 8-15%, depending on the run.
The penalty for a to_json()/from_json() round trip was consistently in the single-digit percent range, presumably because the from_json() call dominates the runtime.
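
One way such a penalty could be measured is sketched below. This is not the benchmark used in this PR: the payload, the function names, and the use of the stdlib `json` module as a stand-in serializer are all assumptions, so the number it produces will differ from the figures quoted above.

```python
import json
import timeit

# Hypothetical payload; a real measurement would use a DocList instead.
payload = [{"id": i, "text": "x" * 50} for i in range(1_000)]


def serialize_bytes():
    """Stand-in for the old behaviour: serializer output as bytes."""
    return json.dumps(payload).encode("utf-8")


def serialize_str():
    """Stand-in for the new behaviour: the same bytes, decoded to str."""
    return serialize_bytes().decode()


def relative_penalty(repeat=5, number=20):
    """Relative slowdown of str output over bytes output (best of `repeat`)."""
    t_bytes = min(timeit.repeat(serialize_bytes, repeat=repeat, number=number))
    t_str = min(timeit.repeat(serialize_str, repeat=repeat, number=number))
    return (t_str - t_bytes) / t_bytes


penalty = relative_penalty()
print(f"relative penalty: {penalty:.1%}")
```

Taking the minimum over several repeats reduces noise from other processes, but results can still vary from run to run, as noted above.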

Note that this is a breaking change for code like the example in the documentation, which is updated as part of this PR.
Also note that in that scenario, after this change, the bytes output is decoded to a string, only to be encoded into bytes again.
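
For callers that hand the JSON to something expecting bytes, a small compatibility shim can absorb the change. The helper `as_bytes` below is hypothetical, not part of docarray, and the literal payloads stand in for real to_json() output:

```python
def as_bytes(payload):
    """Return UTF-8 bytes whether `payload` is str (new) or bytes (old)."""
    if isinstance(payload, str):
        # New behaviour: re-encode the str. This is the decode-then-encode
        # round trip mentioned above.
        return payload.encode("utf-8")
    return bytes(payload)


new_style = '[{"text": "hello"}]'   # what to_json() returns after this change
old_style = b'[{"text": "hello"}]'  # what it returned before

# Code written against the old return type, e.g. calling .decode() on the
# result, fails on str, because str has no decode() method in Python 3.
old_code_breaks = not hasattr(new_style, "decode")
```

A shim like this lets code run against versions from both before and after the change.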

closes #1766


codecov bot commented Sep 1, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (cc2339d) 85.00% compared to head (04a3797) 85.00%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1769   +/-   ##
=======================================
  Coverage   85.00%   85.00%           
=======================================
  Files         134      134           
  Lines        8845     8845           
=======================================
  Hits         7519     7519           
  Misses       1326     1326           
Flag Coverage Δ
docarray 85.00% <100.00%> (ø)

Files Changed Coverage Δ
docarray/array/doc_list/io.py 88.99% <100.00%> (ø)

@JohannesMessner JohannesMessner marked this pull request as ready for review September 1, 2023 11:55
@JohannesMessner JohannesMessner marked this pull request as draft September 1, 2023 11:57
Signed-off-by: Johannes Messner <[email protected]>

github-actions bot commented Sep 1, 2023

📝 Docs are deployed on https://ft-fix-to-json-list--jina-docs.netlify.app 🎉

@JohannesMessner JohannesMessner marked this pull request as ready for review September 1, 2023 12:31
@JohannesMessner JohannesMessner merged commit 3dc525f into main Sep 1, 2023
@JohannesMessner JohannesMessner deleted the fix-to-json-list branch September 1, 2023 12:31
@JoanFM JoanFM mentioned this pull request Sep 7, 2023
Development

Successfully merging this pull request may close these issues.

Inconsistent to_json() return type