* Initial checkin of the TPU training on GKE reference guide
* Updates to the main README
* Created reusable Terraform modules
* Modified Terraform scripts
* Updated Terraform modules
* Updated Terraform modules
* Updated Terraform modules
* Updated the README files
* Increased time-outs in TPU node pool creation
* Updated Terraform
* Updated Terraform
* Updated JobSet and Kueue configuration
* Updated Cloud Build configuration
* Updated the main README
* Updated the setup for examples
* Updated the main README
* Updated Hello World examples
* Updated Hello World examples
* Updated hello world examples
* Updated maxtext examples
* Updated maxtext examples
* Updated the jobset examples
* Updated the jobset examples
* Updates to JobSet examples
* Updates to JobSet examples
* Updated the README
* Updated the README
* Updated the Jobset examples
* wip
* Updated xpk examples
* Updated the Terraform module docs
* Updated the Terraform module readme
* Updated the README for Terraform modules
* Updated Terraform to support v5p
* Cleanup Terraform
* Clean up Kustomize
* Updated the main README
* Updated Terraform
* Updated the README
* Updated JobSet examples
* updated xpk examples
Showing 89 changed files with 4,550 additions and 0 deletions.
134 changes: 134 additions & 0 deletions
ai-infrastructure/terraform-modules/bootstrap/README.md
@@ -0,0 +1,134 @@
# Automation bootstrap

This Terraform module performs the initial configuration of a GCP project, a step that requires elevated administrative permissions. Its primary purpose is to set up Terraform and Cloud Build automation for subsequent provisioning tasks. The module enables the specified set of services and creates an automation service account and an automation GCS bucket. Note that `main.tf` in this commit sets `project_create = false`, so the module configures an existing project rather than creating one.

## Examples

```
module "automation_bootstrap" {
  source     = "github.com/GoogleCloudPlatform/applied-ai-engineering-samples//ai-infrastructure/terraform-modules/bootstrap"
  project_id = "project-id"
  automation_bucket = {
    name     = "automation-bucket-name"
    location = "us-central1"
  }
  automation_sa_name = "service-account-name"
  services = [
    "aiplatform.googleapis.com"
  ]
  roles = [
    "roles/aiplatform.user"
  ]
}
```

By default, the module enables the following services:

- accesscontextmanager.googleapis.com
- artifactregistry.googleapis.com
- cloudbuild.googleapis.com
- cloudkms.googleapis.com
- cloudresourcemanager.googleapis.com
- compute.googleapis.com
- container.googleapis.com
- iam.googleapis.com
- iamcredentials.googleapis.com
- serviceusage.googleapis.com
- sourcerepo.googleapis.com
- stackdriver.googleapis.com
- storage-component.googleapis.com
- storage.googleapis.com
- sts.googleapis.com

You can enable additional services through the `services` input variable.

By default, the following roles are assigned to the automation service account:

- roles/iam.securityAdmin
- roles/iam.serviceAccountAdmin
- roles/compute.networkAdmin
- roles/container.admin
- roles/iam.serviceAccountUser
- roles/storage.admin
- roles/artifactregistry.admin

You can assign additional roles to the automation service account through the `roles` input variable.

## Impersonating the automation service account

To use the automation service account, the account that runs Terraform commands in the other deployment stages needs the `roles/iam.serviceAccountTokenCreator` role on the automation service account. You can grant this role using the following command. Make sure to set the AUTOMATION_SERVICE_ACCOUNT and TERRAFORM_USER_ACCOUNT variables to the email addresses of the accounts in your environment.

```
AUTOMATION_SERVICE_ACCOUNT=your-automation-service-account-name@jk-mlops-dev.iam.gserviceaccount.com
TERRAFORM_USER_ACCOUNT=user@example.com
gcloud iam service-accounts add-iam-policy-binding $AUTOMATION_SERVICE_ACCOUNT --member="user:$TERRAFORM_USER_ACCOUNT" --role='roles/iam.serviceAccountTokenCreator'
```

If the impersonating account is itself a service account, such as the Cloud Build service account, use the `serviceAccount:` member prefix instead:

```
AUTOMATION_SERVICE_ACCOUNT=your-automation-service-account-name@jk-mlops-dev.iam.gserviceaccount.com
TERRAFORM_USER_ACCOUNT=your-project-number@cloudbuild.gserviceaccount.com
gcloud iam service-accounts add-iam-policy-binding $AUTOMATION_SERVICE_ACCOUNT --member="serviceAccount:$TERRAFORM_USER_ACCOUNT" --role='roles/iam.serviceAccountTokenCreator'
```

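The same binding could also be managed in Terraform instead of with `gcloud`. The following is a minimal sketch, not part of this commit. It assumes the module is instantiated as `module.automation_bootstrap` (as in the Examples section), that the hypothetical local `project_id` matches the project passed to the module, and that the hypothetical variable `terraform_user_account` holds the email of the impersonating user.

```hcl
# Hypothetical variable: email of the user that runs Terraform in later stages.
variable "terraform_user_account" {
  description = "Email of the account allowed to impersonate the automation service account"
  type        = string
}

locals {
  # Assumed to match the project_id passed to the bootstrap module.
  project_id = "project-id"
}

# Grants the Token Creator role on the automation service account itself.
resource "google_service_account_iam_member" "terraform_user_token_creator" {
  service_account_id = "projects/${local.project_id}/serviceAccounts/${module.automation_bootstrap.automation_sa}"
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = "user:${var.terraform_user_account}"
}
```

For a service account such as the Cloud Build service account, the `member` value would use the `serviceAccount:` prefix. This resource has to be applied by an identity that can already administer IAM on the automation service account, so it fits best in the bootstrap stage itself.
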
## Input variables

| Name | Description | Type | Required | Default |
|---|---|---|---|---|
|[project_id](variables.tf#L31)|The ID of the project in which to enable services and create the automation service account and the automation bucket|`string`|✓||
|[deletion_protection](variables.tf#L15)|Prevent Terraform from destroying the automation bucket. When `true`, the bucket is created with `force_destroy = false`, so a terraform destroy or terraform apply that would delete the bucket fails while the bucket still contains objects.|`bool`||`true`|
|[automation_bucket](variables.tf#L22)|Settings for the automation bucket (`name` and `location`)|`object({name = string, location = string})`|✓||
|[automation_sa_name](variables.tf#L37)|The name of the automation service account|`string`|✓||
|[services](variables.tf#L43)|The list of additional services to enable|`list(string)`||`[]`|
|[roles](variables.tf#L50)|The list of additional roles to assign to the automation service account|`list(string)`||`[]`|

## Outputs

| Name | Description |
|---|---|
|[automation_sa](outputs.tf#L42)|The email of the automation service account|
|[automation_gcs](outputs.tf#L37)|The name of the automation bucket|

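A calling configuration can read these outputs from the module instance. The following is a minimal sketch, assuming the module is instantiated as `module.automation_bootstrap` as in the Examples section. The caller-side output names `automation_sa_email` and `automation_bucket_name` are hypothetical.

```hcl
# Re-export the bootstrap module outputs from the calling configuration.
output "automation_sa_email" {
  description = "Email of the automation service account created by the bootstrap module"
  value       = module.automation_bootstrap.automation_sa
}

output "automation_bucket_name" {
  description = "Name of the automation bucket created by the bootstrap module"
  value       = module.automation_bootstrap.automation_gcs
}
```
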
The module also creates two files in the `gs://<AUTOMATION_BUCKET_NAME>/providers` folder:

- the `providers.tf` file

```
provider "google" {
  impersonate_service_account = "service-account-name@project-id.iam.gserviceaccount.com"
}
provider "google-beta" {
  impersonate_service_account = "service-account-name@project-id.iam.gserviceaccount.com"
}
```

- the `backend.tf` file

```
terraform {
  backend "gcs" {
    bucket                      = "automation-bucket-name"
    impersonate_service_account = "service-account-name@project-id.iam.gserviceaccount.com"
    # remove the newline between quotes and set the prefix to the folder for Terraform state
    prefix = "
    "
  }
}
```

You can use these files in the downstream Terraform stages to store Terraform state in Cloud Storage and to enable service account impersonation.

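Before a downstream stage can use the generated `backend.tf`, the placeholder `prefix` has to be edited as the embedded comment instructs. The following sketch shows what the edited file might look like, assuming the bucket, project, and service account names from the Examples section. The state folder name `stage-1` is hypothetical.

```hcl
terraform {
  backend "gcs" {
    bucket                      = "automation-bucket-name"
    impersonate_service_account = "service-account-name@project-id.iam.gserviceaccount.com"
    # Hypothetical folder name; each downstream stage would typically use its own prefix.
    prefix = "stage-1"
  }
}
```

Giving each stage a distinct prefix keeps the stages' state files separate within the single automation bucket.
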
@@ -0,0 +1,90 @@
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

locals {
  # Locations without a hyphen (for example "US") are treated as multi-regions.
  gcs_storage_class = (
    length(split("-", var.automation_bucket.location)) < 2
    ? "MULTI_REGIONAL"
    : "REGIONAL"
  )

  default_services = [
    "accesscontextmanager.googleapis.com",
    "artifactregistry.googleapis.com",
    "cloudbuild.googleapis.com",
    "cloudkms.googleapis.com",
    "cloudresourcemanager.googleapis.com",
    "compute.googleapis.com",
    "container.googleapis.com",
    "iam.googleapis.com",
    "iamcredentials.googleapis.com",
    "serviceusage.googleapis.com",
    "sourcerepo.googleapis.com",
    "stackdriver.googleapis.com",
    "storage-component.googleapis.com",
    "storage.googleapis.com",
    "sts.googleapis.com"
  ]
  services = concat(local.default_services, var.services)

  default_roles = [
    "roles/iam.securityAdmin",
    "roles/iam.serviceAccountAdmin",
    "roles/compute.networkAdmin",
    "roles/container.admin",
    "roles/iam.serviceAccountUser",
    "roles/storage.admin",
    "roles/artifactregistry.admin",
  ]
  roles = concat(local.default_roles, var.roles)
}

module "project_config" {
  source         = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/project?ref=v28.0.0&depth=1"
  name           = var.project_id
  project_create = false
  services       = local.services
}

module "automation_gcs" {
  source        = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/gcs?ref=v28.0.0&depth=1"
  project_id    = module.project_config.project_id
  name          = var.automation_bucket.name
  location      = var.automation_bucket.location
  storage_class = local.gcs_storage_class
  versioning    = true
  # Only allow deleting a non-empty bucket when deletion protection is off.
  force_destroy = !var.deletion_protection
}

module "automation_sa" {
  source       = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/iam-service-account?ref=v28.0.0&depth=1"
  project_id   = module.project_config.project_id
  name         = var.automation_sa_name
  display_name = "Terraform automation service account."
  # allow SA used by CI/CD workflow to impersonate this SA
  #iam = {
  #  "roles/iam.serviceAccountTokenCreator" = compact([
  #    try(module.automation-tf-cicd-sa["bootstrap"].iam_email, null)
  #  ])
  #}
  iam_storage_roles = {
    (module.automation_gcs.name) = ["roles/storage.admin"]
  }
  iam_project_roles = {
    (module.project_config.project_id) = local.roles
  }
}

@@ -0,0 +1,52 @@
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

locals {
  _tpl_providers = "${path.module}/templates/providers.tf.tpl"
  _tpl_backend   = "${path.module}/templates/backend.tf.tpl"
  providers = {
    "providers" = templatefile(local._tpl_providers, {
      sa = module.automation_sa.email
    })

    "backend" = templatefile(local._tpl_backend, {
      backend_extra = join("\n", [
        "# remove the newline between quotes and set the prefix to the folder for Terraform state",
        "prefix = \"",
        "\""
      ])
      bucket = module.automation_gcs.name
      sa     = module.automation_sa.email
    })
  }
}

output "automation_gcs" {
  description = "GCS bucket where Terraform automation artifacts are managed"
  value       = module.automation_gcs.name
}

output "automation_sa" {
  description = "The email of the automation service account"
  value       = module.automation_sa.email
}

resource "google_storage_bucket_object" "providers" {
  for_each = local.providers
  bucket   = module.automation_gcs.name
  name     = "providers/${each.key}.tf"
  content  = each.value
}

24 changes: 24 additions & 0 deletions
ai-infrastructure/terraform-modules/bootstrap/templates/backend.tf.tpl
@@ -0,0 +1,24 @@
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

terraform {
  backend "gcs" {
    bucket                      = "${bucket}"
    impersonate_service_account = "${sa}"
    %{~ if backend_extra != null ~}
    ${indent(4, backend_extra)}
    %{~ endif ~}
  }
}

22 changes: 22 additions & 0 deletions
ai-infrastructure/terraform-modules/bootstrap/templates/providers.tf.tpl
@@ -0,0 +1,22 @@
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

provider "google" {
  impersonate_service_account = "${sa}"
}
provider "google-beta" {
  impersonate_service_account = "${sa}"
}

55 changes: 55 additions & 0 deletions
ai-infrastructure/terraform-modules/bootstrap/variables.tf
@@ -0,0 +1,55 @@
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

variable "deletion_protection" {
  description = "Prevent Terraform from destroying data storage resources (storage buckets, GKE clusters). When this field is set, a terraform destroy or terraform apply that would delete data storage resources will fail."
  type        = bool
  default     = true
  nullable    = false
}

variable "automation_bucket" {
  description = "The parameters of the bucket to be used by automation tools including Terraform backend"
  type = object({
    name     = string
    location = string
  })
  nullable = false
}

variable "project_id" {
  description = "The GCP project ID"
  type        = string
  nullable    = false
}

variable "automation_sa_name" {
  description = "The name of the automation service account"
  type        = string
  nullable    = false
}

variable "services" {
  description = "Additional services to enable"
  type        = list(string)
  default     = []
  nullable    = false
}

variable "roles" {
  description = "Additional roles to add to an automation account"
  type        = list(string)
  default     = []
  nullable    = false
}