-
Notifications
You must be signed in to change notification settings - Fork 62
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Feat: Added a template for tutorials. (#299)
- Loading branch information
1 parent
4a5a2fb
commit ae23d4b
Showing
2 changed files
with
251 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,250 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "7i8lsRFe2lu5" | ||
}, | ||
"source": [ | ||
"# Overview\n", | ||
"\n", | ||
"Add a brief description of this tutorial here." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "cFeLh7K7KI6B" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"%%capture\n", | ||
"\n", | ||
"# Installing the required libraries:\n", | ||
"!pip install matplotlib pandas scikit-learn tensorflow pyarrow tqdm\n", | ||
"!pip install google-cloud-bigquery google-cloud-bigquery-storage\n", | ||
"!pip install flake8 pycodestyle pycodestyle_magic" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "6_jTxerkMtkg" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Python Builtin Libraries\n", | ||
"from datetime import datetime\n", | ||
"\n", | ||
"# Third Party Libraries\n", | ||
"from google.cloud import bigquery\n", | ||
"\n", | ||
"# Configurations\n", | ||
"%matplotlib inline" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "2x4wG61omjBQ" | ||
}, | ||
"source": [ | ||
"### Authentication\n", | ||
"In order to run this tutorial successfully, we need to be authenticated first. \n", | ||
"\n", | ||
"Depending on where we are running this notebook, the authentication steps may vary:\n", | ||
"\n", | ||
"| Runner | Authentiction Steps |\n", | ||
"| ----------- | ----------- |\n", | ||
"| Local Computer | Use a service account, or run the following command: <br><br>`gcloud auth login` |\n", | ||
"| Colab | Run the following python code and follow the instructions: <br><br>`from google.colab import auth` <br> `auth.authenticate_user() ` |\n", | ||
"| Vertext AI (Workbench) | Authentication is provided by Workbench |" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "i7aszhgnkxuv" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"try:\n", | ||
" from google.colab import auth\n", | ||
"\n", | ||
" print(\"Authenticating in Colab\")\n", | ||
" auth.authenticate_user()\n", | ||
" print(\"Authenticated\")\n", | ||
"except: # noqa\n", | ||
" print(\"This notebook is not running on Colab.\")\n", | ||
" print(\"Please make sure to follow the authentication steps.\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "N7qEYK98Nx89" | ||
}, | ||
"source": [ | ||
"### Configurations\n", | ||
"\n", | ||
"Let's make sure we enter the name of our GCP project in the next cell." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "So_ed4wf0lKu" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# ENTER THE GCP PROJECT HERE\n", | ||
"gcp_project = \"YOUR-GCP-PROJECT\"\n", | ||
"print(f\"gcp_project is set to {gcp_project}\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"" | ||
], | ||
"metadata": { | ||
"id": "xhyBJNts3MHK" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"def helper_function():\n", | ||
" \"\"\"\n", | ||
" Add a description about what this function does.\n", | ||
" \"\"\"\n", | ||
" return None" | ||
], | ||
"metadata": { | ||
"id": "eo0IVOdr3MSU" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "UXIitTXv7IEu" | ||
}, | ||
"source": [ | ||
"## Data Preparation" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "LBLKdccfZzKu" | ||
}, | ||
"source": [ | ||
"### Query the Data" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "P1VfAL04kwF7" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"query = \"\"\"\n", | ||
" SELECT\n", | ||
" created_date, category, complaint_type, neighborhood, latitude, longitude\n", | ||
" FROM\n", | ||
" `bigquery-public-data.san_francisco_311.311_service_requests`\n", | ||
" LIMIT 1000;\n", | ||
"\"\"\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "eoJgVS5KkwF7" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"bqclient = bigquery.Client(project=gcp_project)\n", | ||
"dataframe = bqclient.query(query).result().to_dataframe()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "_I8mnYhOBsnr" | ||
}, | ||
"source": [ | ||
"### Check the Dataframe\n", | ||
"\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "84lvVNg8odvS" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"print(dataframe.shape)\n", | ||
"dataframe.head()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"### Process the Dataframe" | ||
], | ||
"metadata": { | ||
"id": "K3-yckkMKX3g" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"# Convert the datetime to date\n", | ||
"dataframe['created_date'] = dataframe['created_date'].apply(datetime.date)" | ||
], | ||
"metadata": { | ||
"id": "fFMgzswOG0Wz" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
} | ||
], | ||
"metadata": { | ||
"colab": { | ||
"collapsed_sections": [], | ||
"name": "Cloud Datasets - Tutorial Template", | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.8.2" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |