Reduce iteration over steps in _sk_visual_block_ #29022

Merged: 4 commits merged into scikit-learn:main on Jul 1, 2025

Conversation

deepyaman
Contributor

@deepyaman deepyaman commented May 14, 2024

Reference Issues/PRs

N/A

What does this implement/fix? Explain your changes.

I was reading the code for _sk_visual_block_, and I felt like it was unnecessary to get names and estimators from self.steps separately (the first pass simply throws away names), so I figured I'd try some drive-by refactoring.

Any other comments?

I know the "optimization" is very minimal, but I think it could also be viewed as cleaner (than discarding names in _ the first time around).
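
For readers without the diff at hand, the two approaches can be sketched roughly as follows. This is an illustration under assumptions, not the actual scikit-learn code: `DummyImputer`, `two_passes`, and `single_pass` are hypothetical names, and `_get_name` is an assumed helper modelled on the name format that appears in the test output quoted later in this thread.

```python
class DummyImputer:
    """Hypothetical stand-in for a real estimator such as SimpleImputer."""


def _get_name(name, est):
    # Assumed helper: label a step as "<name>: <class name>", or as
    # "<name>: passthrough" when the step is None or "passthrough".
    if est is None or est == "passthrough":
        return f"{name}: passthrough"
    return f"{name}: {est.__class__.__name__}"


def two_passes(steps):
    # First pass: unzip only to get the estimators; the names go to `_`.
    _, estimators = zip(*steps)
    # Second pass: walk the steps again to build the display names.
    names = [_get_name(name, est) for name, est in steps]
    return names, estimators


def single_pass(steps):
    # Build (display name, estimator) pairs in one comprehension,
    # then unzip both sequences together.
    names, estimators = zip(*[(_get_name(name, est), est) for name, est in steps])
    return names, estimators


steps = [
    ("imputer", DummyImputer()),
    ("do_nothing", "passthrough"),
    ("do_nothing_more", None),
]
old_names, old_estimators = two_passes(steps)
new_names, new_estimators = single_pass(steps)
```

In this sketch both functions produce the same names and estimators, but `single_pass` returns the names as a tuple (a side effect of `zip`), whereas `two_passes` builds a list.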

github-actions bot commented May 14, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2c2ef80. Link to the linter CI: here

@deepyaman deepyaman marked this pull request as ready for review May 17, 2024 16:34
@deepyaman deepyaman changed the title from "[WIP] Reduce iteration over steps in _sk_visual_block_" to "Reduce iteration over steps in _sk_visual_block_" May 17, 2024
 name_details = [str(est) for est in estimators]
 return _VisualBlock(
     "serial",
     estimators,
-    names=names,
+    names=list(names),
Contributor Author

Having to add this partially undid the idea of removing one loop. 🤷 While I still think the refactor is an ever-so-slight improvement, I'm OK if you want to close it, too.

Member

@adrinjalali adrinjalali left a comment

This LGTM. WDYT @Charlie-XIAO

@Charlie-XIAO
Contributor

I'm wondering why list(names) is needed, can you explain a bit @deepyaman? At first glance I think any iterable would work in _VisualBlock.

@deepyaman
Contributor Author

deepyaman commented Sep 15, 2024

@adrinjalali thanks for the review, and @Charlie-XIAO sorry I missed your question!

> I'm wondering why list(names) is needed, can you explain a bit @deepyaman? At first glance I think any iterable would work in _VisualBlock.

To be honest, it's been so long that I forgot (and obviously I didn't explain it properly in my comment...), but looking at the CI failure, I think it's because of this:

    def test_get_visual_block_pipeline():
        pipe = Pipeline(
            [
                ("imputer", SimpleImputer()),
                ("do_nothing", "passthrough"),
                ("do_nothing_more", None),
                ("classifier", LogisticRegression()),
            ]
        )
        est_html_info = _get_visual_block(pipe)
        assert est_html_info.kind == "serial"
        assert est_html_info.estimators == tuple(step[1] for step in pipe.steps)
>       assert est_html_info.names == [
            "imputer: SimpleImputer",
            "do_nothing: passthrough",
            "do_nothing_more: passthrough",
            "classifier: LogisticRegression",
        ]
E       AssertionError

It seems _VisualBlock doesn't modify/coerce the passed names in any way, so it needs to be done somewhere to maintain the same attribute value/exact API.
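
The likely cause can be reproduced without scikit-learn. The snippet below is only a sketch: `StandInBlock` is a hypothetical class that, like the behaviour described above for `_VisualBlock`, stores `names` exactly as passed. It relies on two plain Python facts: `zip(*pairs)` yields tuples, and a tuple never compares equal to a list, even with identical elements.

```python
class StandInBlock:
    """Hypothetical stand-in for _VisualBlock: keeps names as passed."""

    def __init__(self, names):
        self.names = names  # no coercion to list


expected = ["imputer: SimpleImputer", "classifier: LogisticRegression"]

# Unzipping (name, estimator) pairs produces tuples, not lists.
pairs = [
    ("imputer: SimpleImputer", object()),
    ("classifier: LogisticRegression", object()),
]
names, _ = zip(*pairs)

block_raw = StandInBlock(names)         # names stays a tuple
block_list = StandInBlock(list(names))  # names coerced before passing

raw_matches = block_raw.names == expected    # tuple vs. list
list_matches = block_list.names == expected  # list vs. list
```

Here `raw_matches` is `False` and `list_matches` is `True`, which is consistent with the `AssertionError` above disappearing once `list(names)` is passed.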

@deepyaman
Contributor Author

> It seems _VisualBlock doesn't modify/coerce the passed names in any way, so it needs to be done somewhere to maintain the same attribute value/exact API.

Upon second thought, I guess I could change the test instead? _VisualBlock is not part of the public API, so I guess it should be OK to have .names be a tuple (if you are all comfortable with the slight change, and aren't too worried about potential downstream effects). 🤷
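
A test-side change along these lines could take one of two forms, sketched below. The `names` value is a hypothetical stand-in for `est_html_info.names` after the refactor; which form the final commit uses is not shown in this thread.

```python
# Hypothetical value of est_html_info.names once it is a tuple.
names = (
    "imputer: SimpleImputer",
    "do_nothing: passthrough",
    "do_nothing_more: passthrough",
    "classifier: LogisticRegression",
)

# Option A: expect a tuple, which pins down the new attribute type.
assert names == (
    "imputer: SimpleImputer",
    "do_nothing: passthrough",
    "do_nothing_more: passthrough",
    "classifier: LogisticRegression",
)

# Option B: compare contents only, so the test passes whether the
# attribute is a list or a tuple.
assert list(names) == [
    "imputer: SimpleImputer",
    "do_nothing: passthrough",
    "do_nothing_more: passthrough",
    "classifier: LogisticRegression",
]
```

Option A documents the new type explicitly, while Option B is more tolerant of future changes to the container type.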

@adrinjalali
Member

Fixing the test sounds good.

@adrinjalali
Member

@deepyaman would you be able to finish this PR?

@deepyaman
Contributor Author

> @deepyaman would you be able to finish this PR?

Sorry, a lot of things happened. :) Let me try and wrap it up later today, or on the weekend.

@deepyaman
Contributor Author

deepyaman commented Nov 18, 2024

@adrinjalali @Charlie-XIAO Sorry for the delay on this; I've made the minor requested change in 99a2177!

Member

@jeremiedbb jeremiedbb left a comment

LGTM. Thanks @deepyaman

@jeremiedbb jeremiedbb enabled auto-merge (squash) July 1, 2025 14:19
@jeremiedbb jeremiedbb merged commit 00763ab into scikit-learn:main Jul 1, 2025
39 of 40 checks passed