feat(bigquery/storage/managedwriter): support default value controls #8686

Merged: 14 commits, Oct 24, 2023

@shollyman shollyman commented Oct 11, 2023

This PR adds new options to control how missing values are interpreted when writing.

For ManagedStream instantiation, the options are:

  • WithDefaultMissingValueInterpretation (blanket setting for all columns)
  • WithMissingValueInterpretations (per-column settings)
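How the blanket and per-column settings combine can be sketched with the functional-options pattern. This is a minimal, stdlib-only model and not the library's code: `Interpretation`, `streamSettings`, `newSettings`, and `effective` are hypothetical, and the two option functions only borrow their names from this PR (the real options should take the enum values defined in the `storagepb` package). The precedence shown, where a per-column entry overrides the blanket default, follows the documented behavior of the `AppendRowsRequest` fields as best understood here.

```go
package main

import "fmt"

// Interpretation is a local stand-in for the missing-value interpretation
// enum; it is not the library's type.
type Interpretation int

const (
	Unspecified  Interpretation = iota // no preference expressed
	NullValue                          // write NULL for a missing value
	DefaultValue                       // use the column's default value
)

func (i Interpretation) String() string {
	switch i {
	case NullValue:
		return "NULL_VALUE"
	case DefaultValue:
		return "DEFAULT_VALUE"
	}
	return "UNSPECIFIED"
}

// streamSettings is a hypothetical holder for writer-level settings.
type streamSettings struct {
	defaultInterp Interpretation
	perColumn     map[string]Interpretation
}

// WriterOption configures settings when the writer is created.
type WriterOption func(*streamSettings)

// WithDefaultMissingValueInterpretation sets the blanket value for all columns.
func WithDefaultMissingValueInterpretation(def Interpretation) WriterOption {
	return func(s *streamSettings) { s.defaultInterp = def }
}

// WithMissingValueInterpretations sets per-column values, keyed by column name.
func WithMissingValueInterpretations(m map[string]Interpretation) WriterOption {
	return func(s *streamSettings) {
		s.perColumn = make(map[string]Interpretation, len(m))
		for col, v := range m {
			s.perColumn[col] = v // copy so later caller mutations don't leak in
		}
	}
}

func newSettings(opts ...WriterOption) *streamSettings {
	s := &streamSettings{}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

// effective returns the per-column value if one exists, else the blanket one.
func (s *streamSettings) effective(column string) Interpretation {
	if v, ok := s.perColumn[column]; ok {
		return v
	}
	return s.defaultInterp
}

func main() {
	s := newSettings(
		WithDefaultMissingValueInterpretation(DefaultValue),
		WithMissingValueInterpretations(map[string]Interpretation{"comment": NullValue}),
	)
	fmt.Println(s.effective("created_at"), s.effective("comment"))
	// prints "DEFAULT_VALUE NULL_VALUE"
}
```

In this model, a writer that wants defaults everywhere except one nullable column sets the blanket value to `DefaultValue` and lists only the exception in the per-column map.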

To support updates, these are added as AppendOption options:

  • UpdateDefaultMissingValueInterpretation
  • UpdateMissingValueInterpretations
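The update path can be modeled the same way. The sketch below is a stdlib-only illustration under stated assumptions, not the managedwriter implementation: `stream`, `request`, and `Append` are hypothetical, the option functions reuse this PR's names but operate on local types, and the model assumes an update supplied with one append stays in effect for later appends on the same stream.

```go
package main

import "fmt"

// Interpretation is a local stand-in for the missing-value enum.
type Interpretation string

const (
	NullValue    Interpretation = "NULL_VALUE"
	DefaultValue Interpretation = "DEFAULT_VALUE"
)

// request is a hypothetical, trimmed-down stand-in for an append request.
type request struct {
	Rows          []string
	DefaultInterp Interpretation
	PerColumn     map[string]Interpretation
}

// stream is a hypothetical writer that remembers its current settings.
type stream struct {
	defaultInterp Interpretation
	perColumn     map[string]Interpretation
}

// AppendOption adjusts the stream's settings before a request is built.
type AppendOption func(*stream)

// UpdateDefaultMissingValueInterpretation replaces the blanket setting.
func UpdateDefaultMissingValueInterpretation(def Interpretation) AppendOption {
	return func(s *stream) { s.defaultInterp = def }
}

// UpdateMissingValueInterpretations replaces the per-column settings.
func UpdateMissingValueInterpretations(m map[string]Interpretation) AppendOption {
	return func(s *stream) {
		cp := make(map[string]Interpretation, len(m))
		for col, v := range m {
			cp[col] = v
		}
		s.perColumn = cp
	}
}

// Append applies any updates first, then builds the request from the
// stream's current settings, so an update also affects later appends.
func (s *stream) Append(rows []string, opts ...AppendOption) request {
	for _, opt := range opts {
		opt(s)
	}
	return request{Rows: rows, DefaultInterp: s.defaultInterp, PerColumn: s.perColumn}
}

func main() {
	s := &stream{defaultInterp: NullValue}
	r1 := s.Append([]string{"row1"})
	r2 := s.Append([]string{"row2"}, UpdateDefaultMissingValueInterpretation(DefaultValue))
	r3 := s.Append([]string{"row3"}) // no option: the earlier update still applies
	fmt.Println(r1.DefaultInterp, r2.DefaultInterp, r3.DefaultInterp)
	// prints "NULL_VALUE DEFAULT_VALUE DEFAULT_VALUE"
}
```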

Implementation-wise, this PR removes the previous schema-specific
versioner (descriptorVersion) and expands the concept to a versioned
AppendRowsRequest template (versionedTemplate). This more general
mechanism allows us to version all settings that manifest as request fields
in the AppendRowsRequest.
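The idea of a versioned request template can be illustrated with a small stdlib-only sketch. Everything here is hypothetical and simplified: `template`, `versioned`, `revision`, `fingerprint`, and the `revise*` helpers are illustrative names, the template holds plain strings rather than an actual `AppendRowsRequest`, and equal fingerprints are treated as "no change". It is not the PR's `versionedTemplate` code, only one way such a mechanism could behave.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// template is a hypothetical stand-in for the request fields worth versioning.
type template struct {
	schemaName    string
	defaultInterp string
	perColumn     map[string]string
}

// versioned pairs a template with a version number and a content fingerprint.
type versioned struct {
	version int
	hash    uint64
	tmpl    template
}

// revision is one change applied to a copy of the template.
type revision func(*template)

func reviseSchema(name string) revision { return func(t *template) { t.schemaName = name } }
func reviseDefault(v string) revision   { return func(t *template) { t.defaultInterp = v } }
func reviseColumn(col, v string) revision {
	return func(t *template) { t.perColumn[col] = v }
}

// fingerprint hashes the fields in a deterministic order (map keys sorted).
func fingerprint(t template) uint64 {
	h := fnv.New64a()
	fmt.Fprintf(h, "%q|%q", t.schemaName, t.defaultInterp)
	keys := make([]string, 0, len(t.perColumn))
	for k := range t.perColumn {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Fprintf(h, "|%q=%q", k, t.perColumn[k])
	}
	return h.Sum64()
}

func newVersioned() *versioned {
	t := template{perColumn: map[string]string{}}
	return &versioned{version: 1, hash: fingerprint(t), tmpl: t}
}

// revise applies the changes to a copy of the template. If the fingerprint is
// unchanged it returns the receiver, so callers can compare pointers to tell
// whether the template fields need to be sent again.
func (v *versioned) revise(changes ...revision) *versioned {
	next := v.tmpl
	next.perColumn = make(map[string]string, len(v.tmpl.perColumn))
	for k, val := range v.tmpl.perColumn {
		next.perColumn[k] = val
	}
	for _, change := range changes {
		change(&next)
	}
	h := fingerprint(next)
	if h == v.hash {
		return v
	}
	return &versioned{version: v.version + 1, hash: h, tmpl: next}
}

func main() {
	v1 := newVersioned()
	v2 := v1.revise(reviseSchema("schema-a"), reviseDefault("DEFAULT_VALUE"))
	v3 := v2.revise(reviseDefault("DEFAULT_VALUE")) // same value: no new version
	v4 := v3.revise(reviseColumn("comment", "NULL_VALUE"))
	fmt.Println(v1.version, v2.version, v3 == v2, v4.version)
	// prints "1 2 true 3"
}
```

Treating schema changes and missing-value settings as revisions of one template means a single comparison covers every versioned field, which is the generalization described above.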

Fixes: #8387

This feature introduces a new abstraction, the versionedTemplate.  The
intent is for this to replace the existing schema versioning mechanism
with something more general and robust.  Once the swap is complete, we
can support default value changes and schema changes through the same
templating mechanism.
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: bigquery Issues related to the BigQuery API. labels Oct 11, 2023
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Oct 13, 2023
@shollyman shollyman changed the title feat(bigquery/storage/managedwriter): refactor to add versionedTemplate feat(bigquery/storage/managedwriter): support default value controls Oct 13, 2023
@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. and removed size: l Pull request size is large. labels Oct 13, 2023
@shollyman shollyman marked this pull request as ready for review October 13, 2023 22:22
@shollyman shollyman requested review from a team as code owners October 13, 2023 22:22
@shollyman shollyman requested review from chalmerlowe and alvarowolfx and removed request for chalmerlowe October 13, 2023 22:22
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: xl Pull request size is extra large. labels Oct 20, 2023

@alvarowolfx alvarowolfx left a comment

Added a minor nit about a missing comment. I'm not a huge fan of the naming of the reviseXXX functions, but I don't have a better alternative (they are all internal, so 🤷, and it may just be that the name doesn't make much sense to a non-native English speaker, haha).

@shollyman shollyman added the automerge Merge the pull request once unit tests and other checks pass. label Oct 24, 2023
@gcf-merge-on-green

Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot.

@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label Oct 24, 2023
@shollyman shollyman added the automerge Merge the pull request once unit tests and other checks pass. label Oct 24, 2023
@shollyman shollyman merged commit dfa8e22 into googleapis:main Oct 24, 2023
9 checks passed
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label Oct 24, 2023
@shollyman shollyman deleted the new-templater branch October 24, 2023 17:55
gcf-merge-on-green bot pushed a commit that referenced this pull request Oct 30, 2023
🤖 I have created a release *beep* *boop*
---


## [1.57.0](https://togithub.com/googleapis/google-cloud-go/compare/bigquery/v1.56.0...bigquery/v1.57.0) (2023-10-30)


### Features

* **bigquery/biglake:** Promote to GA ([e864fbc](https://togithub.com/googleapis/google-cloud-go/commit/e864fbcbc4f0a49dfdb04850b07451074c57edc8))
* **bigquery/storage/managedwriter:** Support default value controls ([#8686](https://togithub.com/googleapis/google-cloud-go/issues/8686)) ([dfa8e22](https://togithub.com/googleapis/google-cloud-go/commit/dfa8e22edf560211ae2a2ebf1f9a23b86887c7be))
* **bigquery:** Expose Apache Arrow data through ArrowIterator  ([#8506](https://togithub.com/googleapis/google-cloud-go/issues/8506)) ([c8e7692](https://togithub.com/googleapis/google-cloud-go/commit/c8e76923621b379fb7deb6dfb944011af1d980bd)), refs [#8100](https://togithub.com/googleapis/google-cloud-go/issues/8100)
* **bigquery:** Introduce query preview features ([#8653](https://togithub.com/googleapis/google-cloud-go/issues/8653)) ([f29683b](https://togithub.com/googleapis/google-cloud-go/commit/f29683bcd06567e4fc2d404f53bedbea5b5f0f90))


### Bug Fixes

* **bigquery:** Handle storage read api Recv call errors ([#8666](https://togithub.com/googleapis/google-cloud-go/issues/8666)) ([c73963f](https://togithub.com/googleapis/google-cloud-go/commit/c73963f64ef667daa8a33a5a4cc2156818fc6914))
* **bigquery:** Update golang.org/x/net to v0.17.0 ([174da47](https://togithub.com/googleapis/google-cloud-go/commit/174da47254fefb12921bbfc65b7829a453af6f5d))
* **bigquery:** Update grpc-go to v1.56.3 ([343cea8](https://togithub.com/googleapis/google-cloud-go/commit/343cea8c43b1e31ae21ad50ad31d3b0b60143f8c))
* **bigquery:** Update grpc-go to v1.59.0 ([81a97b0](https://togithub.com/googleapis/google-cloud-go/commit/81a97b06cb28b25432e4ece595c55a9857e960b7))

---
This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).
bhshkh pushed a commit that referenced this pull request Nov 3, 2023
…8686)

* feat(bigquery/storage/managedwriter): support default value controls
bhshkh pushed a commit that referenced this pull request Nov 3, 2023
Linked issue: bigquery: add missing_value_interpretations as append option