Feat: Adding a tutorial for the Iowa Liquor dataset #419

Merged
merged 9 commits into from
Aug 3, 2022

Conversation

edkrizo (Contributor) commented Jul 19, 2022

Description

Note: If you are adding or editing a dataset, please specify the dataset folder involved, e.g. datasets/google_trends

Checklist

  • (Required) This pull request is appropriately labeled
  • Please merge this pull request after it's approved

Use the sections below based on what's applicable to your PR and delete the rest:

Feature

  • I'm adding or editing a feature
  • I have updated the README accordingly
  • I have added/revised tests for the feature

Data Onboarding

  • I'm adding or editing a dataset
  • The Google Cloud Datasets team is aware of the proposed dataset
  • I put all my code inside datasets/<DATASET_NAME> and nothing outside of that directory

Documentation

  • I'm adding/editing documentation

Bug fix

  • I'm submitting a bugfix
  • I have added/revised tests related to my bugfix (see the tests folder)

Code cleanup or refactoring

  • I'm refactoring or cleaning up some code

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

@edkrizo edkrizo changed the title iowa_liquor_sale tutorial datasets/iowa_liquor_sale/docs Jul 19, 2022
@happyhuman happyhuman changed the title datasets/iowa_liquor_sale/docs Feat: Adding a tutorial for the Iowa Liquor dataset Jul 19, 2022
@@ -0,0 +1,1026 @@
{

Line #2.    from datetime import datetime

I am not sure if you actually made use of datetime in your code. If not, flake8 will warn you and you can delete this line.




artifact:
title: "Iowa Liquor sales predictions"
description: "Predict liquor sales prices based on previous years' sales data using tree-based ML estimators such as Random Forest"

lint is complaining about line 17: 17:16 [colons] too many spaces after colon

import pandas as pd
mock_client = mock.MagicMock()
mock_df = pd.DataFrame()
mock_df['week'] = range(50)

You will want to change mock_df. These lines are there so that our test code does not make an actual request to BigQuery. Instead, we "mock" the BQ call and create a mock object (in this case, mock_df) that we pretend the BQ call returned. Make it a dataframe shaped like the one you expect the BQ call to return, so the rest of your code can run properly.
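
For illustration only, here is a minimal sketch of the mocking pattern described in this comment. It is not the code from the PR: the helper load_sales, the table name, and the "sales" and "price" columns are hypothetical, chosen only so the frame has three columns. It assumes the notebook fetches data with client.query(...).to_dataframe().

```python
from unittest import mock

import pandas as pd


def load_sales(client):
    """Hypothetical notebook helper: run a query and return a DataFrame."""
    query = "SELECT week, sales, price FROM `hypothetical.dataset.table`"
    return client.query(query).to_dataframe()


# Build the frame we pretend BigQuery returned.
# Only "week" comes from the PR; the other two columns are made up.
mock_df = pd.DataFrame(
    {
        "week": range(50),
        "sales": [100.0 + i for i in range(50)],
        "price": [10.0] * 50,
    }
)

# A MagicMock accepts any attribute or call, so no real client is needed.
mock_client = mock.MagicMock()
mock_client.query.return_value.to_dataframe.return_value = mock_df

# The code under test now receives mock_df without any network request.
dataframe = load_sales(mock_client)
```

Because the mock returns a frame with the columns the downstream code expects, the rest of the notebook can run against it unchanged.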

dataframe = tb.get("dataframe")
assert dataframe.shape == (50, 3)

train_pred_plot = tb.get("train_pred_plot")

Pick an object from your code (ideally one defined near the end) and replace train_pred_plot with it, then test whether it exists. That way, the test checks that your notebook runs all the way to the end.
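
The reasoning behind this suggestion can be sketched without a notebook. The example below is a simplified stand-in, not the test harness used in the PR: it runs a list of hypothetical "cells" in one shared namespace with exec, and all names in it (cells, run_cells, final_report) are invented for illustration.

```python
# Stand-in for notebook cells, executed in order in one shared namespace.
cells = [
    "data = list(range(50))",
    "total = sum(data)",
    "final_report = {'rows': len(data), 'total': total}",  # last cell
]


def run_cells(cell_sources):
    """Execute each cell source in order and return the resulting namespace."""
    namespace = {}
    for source in cell_sources:
        exec(source, namespace)
    return namespace


namespace = run_cells(cells)

# A name defined in the last cell exists only if every earlier cell ran.
assert "final_report" in namespace

# If an earlier cell fails, execution stops before the last cell is reached.
broken_cells = [cells[0], "total = undefined_name + 1", cells[2]]
try:
    run_cells(broken_cells)
    reached_end = True
except NameError:
    reached_end = False
```

Checking an object from the final cell is therefore a cheap proxy for "the whole notebook executed", which is what the test is meant to confirm.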

edkrizo and others added 3 commits July 21, 2022 14:32
reducing the librarires
reducing spaces after columns
@happyhuman happyhuman merged commit b619b71 into GoogleCloudPlatform:main Aug 3, 2022
vijay-google pushed a commit to vijay-google/public-datasets-pipelines that referenced this pull request Aug 17, 2022
…orm#419)

* putting my files with docs

* changing my project id

* flake8 passed on notebook & test

* reducing spaces in files names

* reformatting the artifact

* Update artifact.yaml

reducing the librarires

* Update artifact.yaml

reducing spaces after columns

* Adding test file and notebook

Co-authored-by: Edouard Gahou <[email protected]>

2 participants