Fix spurious warning from type_of_target when called on estimator.classes_ #31584

Open
wants to merge 2 commits into base: main

@saskra saskra commented Jun 18, 2025

Reference Issues/PRs

Fixes #31583

What does this implement/fix? Explain your changes.

This PR suppresses a spurious warning in _get_response_values, where type_of_target is called on estimator.classes_. Because classes_ holds each class label exactly once rather than sample-level data, its number of unique values always equals its length, so an estimator with more than 20 classes triggers the warning:

"The number of unique classes is greater than 50% of the number of samples."

This is now avoided by passing suppress_warning=True to type_of_target() at this specific location.

This patch is intentionally minimal and does not affect calls to type_of_target that operate on actual sample labels (y, y_true, etc.).

Any other comments?

This was first observed while calibrating classifiers with many classes. Although the dataset was large and well-balanced, the warning appeared due to how classes_ was passed into type_of_target.
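
To illustrate the observation above, here is a minimal sketch of the check quoted in the review below, written in plain Python. The function `warns_on_many_classes` is a hypothetical stand-in that mirrors only the quoted condition; it is not scikit-learn's actual implementation, and the data is made up for illustration.

```python
def warns_on_many_classes(y):
    """Hypothetical stand-in for the condition quoted in this PR.

    Returns True when the "more than 50% unique classes" warning would fire.
    """
    n_samples = len(y)
    n_classes = len(set(y))
    return n_samples > 20 and n_classes > round(0.5 * n_samples)


# A large, balanced label vector: 30 classes, 100 samples each.
y = [label for label in range(30) for _ in range(100)]

# What a fitted estimator would expose as classes_: each label exactly once.
classes_ = sorted(set(y))

print(warns_on_many_classes(y))         # False
print(warns_on_many_classes(classes_))  # True
```

The sample labels pass the check, while the list of classes fails it simply because every entry is unique.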

Apologies in advance if this is already known or intentional – this is my first contribution here, and I appreciate any feedback or corrections.

Thanks for your time and for maintaining this great library!

github-actions bot commented Jun 18, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 0b5cfc5. Link to the linter CI: here

Member

@jeremiedbb jeremiedbb left a comment

Thanks for the PR @saskra. Please add a test for _get_response_values in test_response.py to check that no warning is raised now.
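
One common way to check that no warning is raised is to turn warnings into errors inside the test. The sketch below shows that pattern using only the standard library. `assert_no_warnings`, `quiet`, and `noisy` are hypothetical names for illustration. A real test would instead call `_get_response_values` on a classifier fitted with more than 20 classes, and that part is not shown here.

```python
import warnings


def assert_no_warnings(func, *args, **kwargs):
    """Call func and let any warning it emits surface as an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # every warning becomes an exception
        return func(*args, **kwargs)


def quiet(values):
    # Hypothetical function under test that emits no warning.
    return len(values)


def noisy(values):
    # Hypothetical function under test that emits a warning.
    warnings.warn("too many classes", UserWarning)
    return len(values)


assert_no_warnings(quiet, [0, 1, 2])  # passes silently

try:
    assert_no_warnings(noisy, [0, 1, 2])
except UserWarning as exc:
    print(f"caught: {exc}")
```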

Comment on lines +417 to +421
    if (
        not suppress_warning
        and y.shape[0] > 20
        and classes.shape[0] > round(0.5 * y.shape[0])
    ):
Since the spurious warning is only triggered because we call type_of_target on classes, I'd prefer to add an extra condition to avoid it rather than adding a new parameter to a public function.

Suggested change
-    if (
-        not suppress_warning
-        and y.shape[0] > 20
-        and classes.shape[0] > round(0.5 * y.shape[0])
-    ):
+    if y.shape[0] > 20 and y.shape[0] > classes.shape[0] > round(0.5 * y.shape[0]):

With this suggestion, there is one edge case: a y that contains only unique values would no longer trigger the warning. Such a y is essentially useless for a classification task, and many other problems would have surfaced before reaching this point, so I wouldn't worry about it.
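
The behaviour of the suggested condition, including the edge case, can be sketched with plain integers. `should_warn` is a hypothetical helper that takes the sample and class counts directly (standing in for `y.shape[0]` and `classes.shape[0]`); it is an illustration of the chained comparison, not the library's code.

```python
def should_warn(n_samples, n_classes):
    """Hypothetical helper mirroring the suggested chained comparison."""
    return n_samples > 20 and n_samples > n_classes > round(0.5 * n_samples)


# Many classes relative to samples, but not all unique: still warns.
print(should_warn(100, 60))   # True

# Large, balanced dataset: no warning.
print(should_warn(3000, 30))  # False

# Input where every value is unique, such as classes_ itself or the
# all-unique y from the edge case: no longer warns.
print(should_warn(30, 30))    # False
```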

Development

Successfully merging this pull request may close these issues.

Unjustified "number of unique classes > 50%" warning in CalibratedClassifierCV
2 participants