Commit f56811d

[AIRFLOW-6290] Create guide for GKE operators (#8883)
1 parent 997ddb6 commit f56811d

5 files changed, +147 -2 lines changed

airflow/providers/google/cloud/example_dags/example_kubernetes_engine.py

Lines changed: 10 additions & 0 deletions
@@ -32,7 +32,9 @@
 GCP_LOCATION = os.environ.get("GCP_GKE_LOCATION", "europe-north1-a")
 CLUSTER_NAME = os.environ.get("GCP_GKE_CLUSTER_NAME", "cluster-name")
 
+# [START howto_operator_gcp_gke_create_cluster_definition]
 CLUSTER = {"name": CLUSTER_NAME, "initial_node_count": 1}
+# [END howto_operator_gcp_gke_create_cluster_definition]
 
 default_args = {"start_date": days_ago(1)}
 
@@ -42,12 +44,14 @@
     schedule_interval=None,  # Override to match your needs
     tags=['example'],
 ) as dag:
+    # [START howto_operator_gke_create_cluster]
     create_cluster = GKECreateClusterOperator(
         task_id="create_cluster",
         project_id=GCP_PROJECT_ID,
         location=GCP_LOCATION,
         body=CLUSTER,
     )
+    # [END howto_operator_gke_create_cluster]
 
     pod_task = GKEStartPodOperator(
         task_id="pod_task",
@@ -59,6 +63,7 @@
         name="test-pod",
     )
 
+    # [START howto_operator_gke_start_pod_xcom]
     pod_task_xcom = GKEStartPodOperator(
         task_id="pod_task_xcom",
         project_id=GCP_PROJECT_ID,
@@ -70,18 +75,23 @@
         cmds=["sh", "-c", 'mkdir -p /airflow/xcom/;echo \'[1,2,3,4]\' > /airflow/xcom/return.json'],
         name="test-pod-xcom",
     )
+    # [END howto_operator_gke_start_pod_xcom]
 
+    # [START howto_operator_gke_xcom_result]
     pod_task_xcom_result = BashOperator(
         bash_command="echo \"{{ task_instance.xcom_pull('pod_task_xcom')[0] }}\"",
         task_id="pod_task_xcom_result",
     )
+    # [END howto_operator_gke_xcom_result]
 
+    # [START howto_operator_gke_delete_cluster]
     delete_cluster = GKEDeleteClusterOperator(
         task_id="delete_cluster",
         name=CLUSTER_NAME,
         project_id=GCP_PROJECT_ID,
         location=GCP_LOCATION,
     )
+    # [END howto_operator_gke_delete_cluster]
 
     create_cluster >> pod_task >> delete_cluster
     create_cluster >> pod_task_xcom >> delete_cluster
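
Note on the [START ...] / [END ...] comments added in this file: the new guide pulls its code samples from the example DAG by naming these markers in :start-after: and :end-before:, with :dedent: 4 for snippets inside the DAG block. The following is a minimal sketch of that extraction, assuming the directive behaves like Sphinx's literalinclude options of the same names. The helper extract_snippet and the shortened EXAMPLE text are hypothetical and are not Airflow's implementation.

```python
def extract_snippet(source: str, name: str, dedent: int = 0) -> str:
    """Return the lines strictly between the START and END markers for `name`.

    Hypothetical helper: it only mimics the start-after / end-before / dedent
    options used in the guide, it is not the real directive.
    """
    start_marker = f"[START {name}]"
    end_marker = f"[END {name}]"
    lines = source.splitlines()
    # Index of the line that carries the START marker.
    start = next(i for i, line in enumerate(lines) if start_marker in line)
    # First END marker that appears after the START marker.
    end = next(
        i for i, line in enumerate(lines) if i > start and end_marker in line
    )
    body = lines[start + 1:end]
    # Drop `dedent` leading characters from every non-blank line.
    return "\n".join(line[dedent:] if line.strip() else "" for line in body)


# Shortened stand-in for the example DAG (assumption, not the full file).
EXAMPLE = '''\
with dag:
    # [START howto_operator_gke_delete_cluster]
    delete_cluster = GKEDeleteClusterOperator(
        task_id="delete_cluster",
    )
    # [END howto_operator_gke_delete_cluster]
'''

snippet = extract_snippet(EXAMPLE, "howto_operator_gke_delete_cluster", dedent=4)
print(snippet)
```

Because the markers are ordinary comments, they do not change how the DAG runs. They only need to stay paired and keep the names the guide refers to.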
Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements. See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership. The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License. You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+
+Google Kubernetes Engine Operators
+==================================
+
+`Google Kubernetes Engine (GKE) <https://cloud.google.com/kubernetes-engine/>`__ provides a managed environment for
+deploying, managing, and scaling your containerized applications using Google infrastructure. The GKE environment
+consists of multiple machines (specifically, Compute Engine instances) grouped together to form a cluster.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+^^^^^^^^^^^^^^^^^^
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Manage GKE cluster
+^^^^^^^^^^^^^^^^^^
+
+A cluster is the foundation of GKE - all workloads run on top of the cluster. It is made up of a cluster master
+and worker nodes. The lifecycle of the master is managed by GKE when creating or deleting a cluster.
+The worker nodes are represented as Compute Engine VM instances that GKE creates on your behalf when creating a cluster.
+
+.. _howto/operator:GKECreateClusterOperator:
+
+Create GKE cluster
+""""""""""""""""""
+
+Here is an example of a cluster definition:
+
+.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_kubernetes_engine.py
+    :language: python
+    :start-after: [START howto_operator_gcp_gke_create_cluster_definition]
+    :end-before: [END howto_operator_gcp_gke_create_cluster_definition]
+
+A dict object like this, or a
+:class:`~google.cloud.container_v1.types.Cluster`
+definition, is required when creating a cluster with
+:class:`~airflow.providers.google.cloud.operators.kubernetes_engine.GKECreateClusterOperator`.
+
+.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_kubernetes_engine.py
+    :language: python
+    :dedent: 4
+    :start-after: [START howto_operator_gke_create_cluster]
+    :end-before: [END howto_operator_gke_create_cluster]
+
+.. _howto/operator:GKEDeleteClusterOperator:
+
+Delete GKE cluster
+""""""""""""""""""
+
+To delete a cluster, use
+:class:`~airflow.providers.google.cloud.operators.kubernetes_engine.GKEDeleteClusterOperator`.
+This also deletes all the nodes allocated to the cluster.
+
+.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_kubernetes_engine.py
+    :language: python
+    :dedent: 4
+    :start-after: [START howto_operator_gke_delete_cluster]
+    :end-before: [END howto_operator_gke_delete_cluster]
+
+Manage workloads on a GKE cluster
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+GKE works with containerized applications, such as those created with Docker, and deploys them to run on the cluster.
+These are called workloads, and when deployed on the cluster they use the CPU and memory resources of the cluster
+to run effectively.
+
+.. _howto/operator:GKEStartPodOperator:
+
+Run a Pod on a GKE cluster
+""""""""""""""""""""""""""
+
+Two operators are available to run a pod on a GKE cluster:
+
+* :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator`
+* :class:`~airflow.providers.google.cloud.operators.kubernetes_engine.GKEStartPodOperator`
+
+``GKEStartPodOperator`` extends ``KubernetesPodOperator`` to provide authorization using Google Cloud credentials.
+There is no need to manage the ``kube_config`` file, as it will be generated automatically.
+All Kubernetes parameters (except ``config_file``) are also valid for the ``GKEStartPodOperator``.
+For more information on ``KubernetesPodOperator``, see the :ref:`howto/operator:KubernetesPodOperator` guide.
+
+We can enable the usage of :ref:`XCom <concepts:xcom>` on the operator. This works by launching a sidecar container
+with the pod specified. The sidecar is automatically mounted when the XCom usage is specified and its mount point
+is the path ``/airflow/xcom``. To provide values to the XCom, ensure your Pod writes them into a file called
+``return.json`` in the sidecar. The contents of this file can then be used downstream in your DAG.
+Here is an example of it being used:
+
+.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_kubernetes_engine.py
+    :language: python
+    :dedent: 4
+    :start-after: [START howto_operator_gke_start_pod_xcom]
+    :end-before: [END howto_operator_gke_start_pod_xcom]
+
+And then use it in other operators:
+
+.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_kubernetes_engine.py
+    :language: python
+    :dedent: 4
+    :start-after: [START howto_operator_gke_xcom_result]
+    :end-before: [END howto_operator_gke_xcom_result]
+
+Reference
+^^^^^^^^^
+
+For further information, look at:
+
+* `GKE API Documentation <https://cloud.google.com/kubernetes-engine/docs/reference/rest>`__
+* `Product Documentation <https://cloud.google.com/kubernetes-engine/docs/>`__
+* `Kubernetes Documentation <https://kubernetes.io/docs/home/>`__
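
Note on the XCom flow described in the guide above: the pod writes JSON to return.json under /airflow/xcom, and a downstream task reads the parsed value (the example DAG takes element [0]). The following is a local sketch of that data flow only, using a temporary directory in place of the real mount point. The helpers write_xcom_result and read_xcom_result are hypothetical names, and nothing here reproduces how Airflow or the sidecar container actually moves the data.

```python
import json
import tempfile
from pathlib import Path


def write_xcom_result(xcom_dir: Path, value) -> Path:
    """Stand-in for the pod's shell command: write `value` as JSON to return.json."""
    xcom_dir.mkdir(parents=True, exist_ok=True)
    result_file = xcom_dir / "return.json"
    result_file.write_text(json.dumps(value))
    return result_file


def read_xcom_result(xcom_dir: Path):
    """Stand-in for the consumer side: parse return.json back into Python data."""
    return json.loads((xcom_dir / "return.json").read_text())


with tempfile.TemporaryDirectory() as tmp:
    # Temporary directory used instead of the real /airflow/xcom (assumption).
    xcom_dir = Path(tmp) / "airflow" / "xcom"
    write_xcom_result(xcom_dir, [1, 2, 3, 4])
    pulled = read_xcom_result(xcom_dir)
    # Mirrors the template expression xcom_pull('pod_task_xcom')[0].
    first = pulled[0]

print(first)
```

The point of the sketch is that the file must contain valid JSON: the downstream template indexes into the parsed list, so a plain-text file would not support the [0] lookup.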

docs/howto/operator/kubernetes.rst

Lines changed: 6 additions & 0 deletions
@@ -22,6 +22,12 @@
 KubernetesPodOperator
 =====================
 
+.. note::
+  If you use `Google Kubernetes Engine <https://cloud.google.com/kubernetes-engine/>`__, consider
+  using the
+  :ref:`GKEStartPodOperator <howto/operator:GKEStartPodOperator>` operator as it
+  simplifies the Kubernetes authorization process.
+
 The :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator`:
 
 * Launches a Docker image as a Kubernetes Pod to execute an individual Airflow

docs/operators-and-hooks-ref.rst

Lines changed: 1 addition & 1 deletion
@@ -735,7 +735,7 @@ These integrations allow you to perform various operations within the Google Clo
      -
 
    * - `Kubernetes Engine <https://cloud.google.com/kubernetes_engine/>`__
-     -
+     - :doc:`How to use <howto/operator/gcp/kubernetes_engine>`
      - :mod:`airflow.providers.google.cloud.hooks.kubernetes_engine`
      - :mod:`airflow.providers.google.cloud.operators.kubernetes_engine`
      -

tests/test_project_structure.py

Lines changed: 0 additions & 1 deletion
@@ -146,7 +146,6 @@ class TestGoogleProviderProjectStructure(unittest.TestCase):
         'datastore',
         'dlp',
         'gcs_to_bigquery',
-        'kubernetes_engine',
         'mlengine',
         'mssql_to_gcs',
         'mysql_to_gcs',
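
Note on this one-line removal: the hunk does not show what the list is for. Given the commit title, a plausible reading is that it names operator modules that are still allowed to lack a how-to guide, so the entry goes away once the GKE guide exists. The following sketch works under that assumption only. The function check_guides and the sample sets are hypothetical and are not taken from the actual test.

```python
def check_guides(operator_modules, existing_guides, known_missing):
    """Compare modules against guides, honouring an exemption list.

    Hypothetical check: returns (modules lacking a guide without an exemption,
    exemptions that are stale because the guide now exists).
    """
    modules = set(operator_modules)
    guides = set(existing_guides)
    exempt = set(known_missing)
    missing = modules - guides
    unexpected_missing = missing - exempt
    stale_exemptions = exempt & guides
    return unexpected_missing, stale_exemptions


# Illustrative data only (assumption): three modules, one guide.
modules = {"dlp", "kubernetes_engine", "mlengine"}
guides = {"kubernetes_engine"}

# Before the commit: the exemption for kubernetes_engine is stale.
before = check_guides(modules, guides, {"dlp", "kubernetes_engine", "mlengine"})
# After the commit: the exemption has been removed, nothing is flagged.
after = check_guides(modules, guides, {"dlp", "mlengine"})

print(before)
print(after)
```

Under this reading, leaving the entry in place would let the guide be deleted later without any test noticing, which is why the list shrinks in the same commit that adds the document.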

0 commit comments