
2020-07-15 (GCS 2.1.4, BQ 1.1.4)

hongyegong released this 16 Jul 06:44
· 54 commits to branch-2.1.x since this release

Changelog

Cloud Storage connector:

  1. Added a new parameter to configure output stream pipe type:

    fs.gs.outputstream.pipe.type (default: IO_STREAM_PIPE)
    

    Valid values are NIO_CHANNEL_PIPE and IO_STREAM_PIPE.

    When the property is set to NIO_CHANNEL_PIPE, the output stream uses a Java NIO Pipe, which allows multiple threads to write to the output stream reliably without "Pipe broken" exceptions.

    Note that with the NIO_CHANNEL_PIPE option, maximum upload throughput can decrease by 10%.

  2. Throw ClosedChannelException in GoogleHadoopOutputStream.write methods if the stream is already closed. This fixes Spark Streaming jobs checkpointing to Cloud Storage.

  3. Add a property to impersonate a service account:

    fs.gs.auth.impersonation.service.account (not set by default)
    

    If this property is set, an access token will be generated for this service account to access GCS. The caller who issues a request for the access token must have been granted the Service Account Token Creator role (roles/iam.serviceAccountTokenCreator) on the service account to impersonate.

  4. Add properties to impersonate a service account through user or group name:

    fs.gs.auth.impersonation.service.account.for.user.<USER_NAME> (not set by default)
    fs.gs.auth.impersonation.service.account.for.group.<GROUP_NAME> (not set by default)
    

    If any of these properties is set, an access token will be generated for the service account associated with the specified user name or group name in order to access GCS. The caller who issues a request for the access token must have been granted the Service Account Token Creator role (roles/iam.serviceAccountTokenCreator) on the service account to impersonate.

  5. Update all dependencies to latest versions.
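
To illustrate item 1, here is a minimal sketch of several threads writing into a single java.nio.channels.Pipe while one reader drains it. It is not the connector's code: the class name NioPipeSketch and the method transfer are hypothetical, and the sketch only counts bytes to show that writer threads can finish and exit without breaking the pipe.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of the Cloud Storage connector.
class NioPipeSketch {

    // Starts `threads` writers that each push `recordsPerThread` 7-byte records
    // into one NIO pipe, and returns how many bytes reached the read end.
    static int transfer(int threads, int recordsPerThread) {
        try {
            Pipe pipe = Pipe.open();
            byte[] record = "record\n".getBytes(StandardCharsets.UTF_8);

            Thread[] writers = new Thread[threads];
            for (int i = 0; i < threads; i++) {
                writers[i] = new Thread(() -> {
                    try {
                        for (int r = 0; r < recordsPerThread; r++) {
                            ByteBuffer buf = ByteBuffer.wrap(record);
                            // A write may be partial, so loop until the buffer is drained.
                            while (buf.hasRemaining()) {
                                pipe.sink().write(buf);
                            }
                        }
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
                writers[i].start();
            }

            // Close the write end only after every writer has finished, so the
            // reader sees end-of-stream exactly once.
            Thread closer = new Thread(() -> {
                try {
                    for (Thread w : writers) {
                        w.join();
                    }
                    pipe.sink().close();
                } catch (IOException | InterruptedException e) {
                    throw new IllegalStateException(e);
                }
            });
            closer.start();

            // Drain the read end concurrently with the writers.
            int total = 0;
            ByteBuffer in = ByteBuffer.allocate(4096);
            int n;
            while ((n = pipe.source().read(in)) != -1) {
                total += n;
                in.clear();
            }
            closer.join();
            pipe.source().close();
            return total;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling NioPipeSketch.transfer(4, 100) should report 2800 bytes (4 threads x 100 records x 7 bytes). By contrast, the java.io piped streams track the thread that last wrote and can raise "Pipe broken" once that thread has exited, which is presumably the failure mode the new option avoids.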
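
The behavior described in item 2 can be sketched with a small wrapper stream that rejects writes once it has been closed. This is an assumption-laden illustration, not the actual GoogleHadoopOutputStream implementation; the class ClosedAwareOutputStream and its helper method are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.ClosedChannelException;

// Hypothetical wrapper, not the connector's GoogleHadoopOutputStream.
class ClosedAwareOutputStream extends OutputStream {
    private final OutputStream delegate;
    private boolean closed;

    ClosedAwareOutputStream(OutputStream delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(int b) throws IOException {
        throwIfClosed();
        delegate.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        throwIfClosed();
        delegate.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        // Closing twice is a no-op, matching the usual Closeable contract.
        if (!closed) {
            closed = true;
            delegate.close();
        }
    }

    private void throwIfClosed() throws ClosedChannelException {
        if (closed) {
            throw new ClosedChannelException();
        }
    }

    // Returns true if a write after close() fails with ClosedChannelException.
    static boolean writeAfterCloseFails() {
        try {
            ClosedAwareOutputStream out =
                new ClosedAwareOutputStream(new ByteArrayOutputStream());
            out.write(1);
            out.close();
            out.write(2);
            return false;
        } catch (ClosedChannelException expected) {
            return true;
        } catch (IOException other) {
            return false;
        }
    }
}
```

ClosedChannelException is a subclass of IOException, so callers that already handle IOException keep working, while callers that check for the specific type can tell a closed stream apart from other I/O failures.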

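The impersonation properties from items 3 and 4 follow a simple naming scheme, sketched below with a plain Map standing in for the Hadoop configuration. The lookup order shown (user mapping first, then group mappings, then the general property) is an assumption made for illustration and is not stated in this changelog; the class ImpersonationLookupSketch and the service account names in the usage note are likewise hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup helper, not the connector's own resolution code.
class ImpersonationLookupSketch {
    static final String ACCOUNT_KEY = "fs.gs.auth.impersonation.service.account";
    static final String USER_PREFIX = ACCOUNT_KEY + ".for.user.";
    static final String GROUP_PREFIX = ACCOUNT_KEY + ".for.group.";

    // ASSUMPTION: a user-specific mapping wins over a group mapping, which
    // wins over the general property. The changelog does not specify this.
    static Optional<String> resolve(
            Map<String, String> conf, String user, List<String> groups) {
        String byUser = conf.get(USER_PREFIX + user);
        if (byUser != null) {
            return Optional.of(byUser);
        }
        for (String group : groups) {
            String byGroup = conf.get(GROUP_PREFIX + group);
            if (byGroup != null) {
                return Optional.of(byGroup);
            }
        }
        // Empty when no impersonation property is set at all.
        return Optional.ofNullable(conf.get(ACCOUNT_KEY));
    }
}
```

For example, with fs.gs.auth.impersonation.service.account.for.user.alice mapped to one account and fs.gs.auth.impersonation.service.account.for.group.analysts mapped to another, this sketch resolves user alice to the first account and any other member of group analysts to the second.
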
BigQuery connector:

  1. Update all dependencies to latest versions.