Hi @gradientopt, I hope the earlier solution has unblocked you! Before I
give you a more accurate answer to your follow-up questions, would you
mind sharing with me one example of your failed job (task)'s Batch JSON logs,
so that I can understand more about ...
Hi @thiago-oliveira, Thanks for your patience. Batch now has a
`BATCH_TASK_RETRY_ATTEMPT` environment variable that you can use to
track retry attempts.
https://cloud.google.com/batch/docs/create-run-basic-job#create-job-environment-variables
has more ...
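As a minimal sketch of how a task could read that variable, the Python below parses `BATCH_TASK_RETRY_ATTEMPT` into an integer. The helper name `retry_attempt` is hypothetical, and the fallback to `0` when the variable is missing or not numeric is an assumption made for running the script outside Batch, not documented Batch behavior.

```python
import os


def retry_attempt(environ=os.environ):
    """Return the retry attempt reported by Batch as an int.

    Falls back to 0 when the variable is unset or not a number
    (an assumption for local runs, not documented Batch behavior).
    """
    raw = environ.get("BATCH_TASK_RETRY_ATTEMPT", "0")
    try:
        return int(raw)
    except ValueError:
        return 0


if __name__ == "__main__":
    attempt = retry_attempt()
    if attempt > 0:
        print(f"This task is being retried (attempt {attempt}).")
    else:
        print("This is the first attempt of the task.")
```

A task could use the returned value to skip work that an earlier attempt already completed, or to include the attempt number in its own log lines.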
Hi @Ash15998, Could you try to follow
https://cloud.google.com/batch/docs/analyze-job-using-logs and collect
Batch-related logs? The logs should tell you the direct reason why the
Batch container Job has task failures with exit code 125. Thanks!
Hi @gradientopt, Thanks for elaborating in your message! If your
tasks partially fail both with and without an external IP address on the
network, I would suggest you also add a retry on exit code 125 to unblock
yourself for this issue. We are doing f...
Hi @vedantroy-genmo, Below is an example from submitting a small Batch
container-only Job which uses the Batch Container-Optimized OS image as the
default image (with 30GB as the boot disk size) and `e2-highcpu-2` as the
machine type: ``` ~ $ df -h Filesystem Size U...