
2018-06-15 (GCS 1.9.0, BQ 0.13.0)

@medb released this 15 Jun 17:11

Changelog

Cloud Storage connector:

  1. Update all dependencies to latest versions.

  2. Delete metadata cache functionality, because Cloud Storage list operations are already strongly consistent. Deleted properties:

    fs.gs.metadata.cache.enable
    fs.gs.metadata.cache.type
    fs.gs.metadata.cache.directory
    fs.gs.metadata.cache.max.age.info.ms
    fs.gs.metadata.cache.max.age.entry.ms
    
  3. Decrease default value for max requests per batch from 1,000 to 30.

  4. Make max requests per batch value configurable with property:

    fs.gs.max.requests.per.batch (default: 30)
    
  5. Support Hadoop 3.

  6. Change Maven project structure to be more compatible with IDEs.

  7. Delete deprecated GoogleHadoopGlobalRootedFileSystem.

  8. Fix thread leaks that occurred when YARN log aggregation uploaded logs to GCS.

  9. Add an interface through which users can provide the access token directly.

  10. Add more retries and error handling in GoogleCloudStorageReadChannel to make it more resilient to network errors. Also add a property that lets users specify the number of retries for low-level GCS HTTP requests that fail with server errors or I/O errors.

  11. Add properties that let users specify the connect timeout and read timeout for low-level GCS HTTP requests.

  12. Include prefix/directory object metadata in storage.objects.list responses to improve performance (i.e. set the includeTrailingDelimiter parameter to true for storage.objects.list GCS requests).
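
For items 2 to 4 above, it may help to check an existing Hadoop configuration before upgrading. The following is a minimal sketch, not part of the connector: it scans Hadoop-style configuration XML for the deleted metadata cache properties and reports the effective value of fs.gs.max.requests.per.batch, falling back to the new default of 30. The function names and the sample configuration are hypothetical, and the changelog does not say how the connector treats leftover properties, so the script only flags them.

```python
import xml.etree.ElementTree as ET

# Properties deleted in GCS connector 1.9.0 (taken from the changelog).
REMOVED_PROPERTIES = {
    "fs.gs.metadata.cache.enable",
    "fs.gs.metadata.cache.type",
    "fs.gs.metadata.cache.directory",
    "fs.gs.metadata.cache.max.age.info.ms",
    "fs.gs.metadata.cache.max.age.entry.ms",
}

BATCH_PROPERTY = "fs.gs.max.requests.per.batch"
DEFAULT_MAX_REQUESTS_PER_BATCH = 30  # new default in 1.9.0


def read_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_config(xml_text):
    """Return (deleted properties still set, effective max requests per batch)."""
    props = read_properties(xml_text)
    stale = sorted(name for name in props if name in REMOVED_PROPERTIES)
    batch = int(props.get(BATCH_PROPERTY, DEFAULT_MAX_REQUESTS_PER_BATCH))
    return stale, batch


# Hypothetical configuration content, for illustration only.
SAMPLE = """
<configuration>
  <property>
    <name>fs.gs.metadata.cache.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.gs.max.requests.per.batch</name>
    <value>15</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    stale, batch = check_config(SAMPLE)
    print("Deleted properties still set:", stale)
    print("Effective max requests per batch:", batch)
```

In practice the XML text would come from a site configuration file such as core-site.xml rather than an inline string.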

BigQuery connector:

  1. POM updates for GCS connector 1.9.0.
  2. Update all dependencies to latest versions.
  3. Change Maven project structure to be more compatible with IDEs.
  4. Support Hadoop 3.
  5. Default BigQueryInputFormats to use unsharded exports and deprecate sharded exports.
  6. Deprecate BigQueryOutputFormat in favor of IndirectBigQueryOutputFormat.
  7. Add an interface through which users can provide the access token directly.
  8. Support Cloud KMS key name in the output table spec.