Unable to train any data #1007

Closed
@EVAUTOAI

I am trying to train a model on the iris dataset by running the following command in a Postgres database that contains pgml:
SELECT * FROM pgml.train(project_name => 'test1c_extra_trees', task => 'classification',
relation_name => 'public.iris_new', y_column_name => 'Class');

I am getting the below error:

INFO: Snapshotting table "public.iris_new", this may take a little while...
INFO: Dataset { num_features: 4, num_labels: 1, num_distinct_labels: 4, num_rows: 150, num_train_rows: 112, num_test_rows: 38 }
INFO: Column "Class": Statistics { min: 1.0, max: 3.0, max_abs: 3.0, mean: 1.6607143, median: 2.0, mode: 2.0, variance: 0.43845627, std_dev: 0.6621603, missing: 0, distinct: 3, histogram: [50, 0, 0, 0, 0, 0, 0, 0, 0, 0, 50, 0, 0, 0, 0, 0, 0, 0, 0, 12], ventiles: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0], categories: Some({"Iris-setosa": Category { value: 1.0, members: 50 }, "Iris-virginica": Category { value: 3.0, members: 12 }, "Iris-versicolor": Category { value: 2.0, members: 50 }, "NULL": Category { value: 0.0, members: 0 }}) }
INFO: Column "Sepallength": Statistics { min: 4.3, max: 7.6, max_abs: 7.6, mean: 5.5866075, median: 5.5, mode: 5.0, variance: 0.5275885, std_dev: 0.7263529, missing: 0, distinct: 32, histogram: [4, 5, 2, 11, 19, 4, 7, 12, 7, 7, 8, 2, 8, 5, 4, 2, 2, 1, 1, 1], ventiles: [4.6, 4.8, 4.9, 5.0, 5.0, 5.1, 5.1, 5.2, 5.4, 5.5, 5.6, 5.7, 5.8, 6.0, 6.1, 6.3, 6.4, 6.6, 6.9], categories: None }
INFO: Column "Sepalwidth": Statistics { min: 2.0, max: 4.4, max_abs: 4.4, mean: 3.0776787, median: 3.0, mode: 3.0, variance: 0.21280527, std_dev: 0.4613082, missing: 0, distinct: 23, histogram: [1, 2, 4, 3, 6, 10, 6, 10, 17, 8, 13, 10, 6, 3, 7, 2, 1, 1, 1, 1], ventiles: [2.3, 2.5, 2.6, 2.7, 2.8, 2.9, 2.9, 3.0, 3.0, 3.0, 3.1, 3.2, 3.2, 3.3, 3.4, 3.4, 3.5, 3.7, 3.9], categories: None }
INFO: Column "Petallength": Statistics { min: 1.0, max: 6.6, max_abs: 6.6, mean: 3.1633925, median: 3.7, mode: 1.5, variance: 2.645713, std_dev: 1.6265649, missing: 0, distinct: 36, histogram: [4, 33, 11, 2, 0, 0, 0, 1, 4, 2, 9, 9, 15, 9, 4, 1, 1, 4, 2, 1], ventiles: [1.3, 1.4, 1.4, 1.4, 1.5, 1.5, 1.6, 1.7, 3.0, 3.6, 4.0, 4.1, 4.2, 4.4, 4.5, 4.6, 4.8, 5.1, 5.8], categories: None }
INFO: Column "Petalwidth": Statistics { min: 0.1, max: 2.5, max_abs: 2.5, mean: 0.91785717, median: 1.0, mode: 0.2, variance: 0.43753833, std_dev: 0.6614668, missing: 0, distinct: 20, histogram: [34, 7, 7, 1, 1, 0, 0, 7, 3, 18, 7, 10, 3, 2, 6, 1, 2, 1, 0, 2], ventiles: [0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.3, 0.4, 1.0, 1.0, 1.2, 1.3, 1.3, 1.4, 1.4, 1.5, 1.6, 1.8, 2.0], categories: None }
INFO: Training Model { id: 233, task: classification, algorithm: linear, runtime: python }
INFO: Hyperparameter searches: 1, cross validation folds: 1
INFO: Hyperparams: {}

ERROR: assertion failed: (left == right)
left: 4,
right: 2

SQL state: XX000
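
One check that may help narrow this down: the log above reports num_distinct_labels: 4, while the "Class" statistics show distinct: 3 plus a "NULL" category with 0 members. Counting the label values directly in the source table would show which values are actually present. The query below is only a sketch and was not run as part of this report; it assumes nothing beyond the table and column names used in the pgml.train call above.

```sql
-- Count rows per label in the table passed to pgml.train.
-- "Class" is double-quoted because the column name is mixed-case.
SELECT "Class", COUNT(*) AS n_rows
FROM public.iris_new
GROUP BY "Class"
ORDER BY "Class";
```

If a row with a NULL or unexpected label appears in the result, that would be worth including in the report.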

I also tried running the following command from the project's GitHub page:
SELECT pgml.transform(
task => 'text-classification',
inputs => ARRAY[
'I love how amazingly simple ML has become!',
'I hate doing mundane and thankless tasks. ☹️'
]
) AS positivity;

This threw the error:
ERROR: Lazy instance has previously been poisoned

SQL state: XX000
