Skip to content

chore(coderd/database): optimize GetRunningPrebuiltWorkspaces #18588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 1, 2025

Conversation

johnstcn
Copy link
Member

@johnstcn johnstcn commented Jun 25, 2025

Fixes coder/internal#715

After this change, the only use of the workspace_prebuilds view is the ClaimPrebuiltWorkspace query. A subsequent PR will update the view.

Before: ~44ms https://explain.dalibo.com/plan/76cbe21d1a4c9329#plan

~After: 3.7ms https://explain.dalibo.com/plan/f61beg93f9306fc0#plan

After: 7.3ms https://explain.dalibo.com/plan/5abbdf926315677e#plan

@johnstcn johnstcn marked this pull request as ready for review June 26, 2025 20:50
@johnstcn johnstcn requested review from mafredri and SasSwart June 26, 2025 20:50
Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit thrown off by the indentation, but I tried my best 😆, left some suggestions and questions.

AS (
SELECT
latest_prebuilds.job_id,
BOOL_AND(workspace_agents.lifecycle_state = 'ready'::workspace_agent_lifecycle_state)::boolean AS ready
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
BOOL_AND(workspace_agents.lifecycle_state = 'ready'::workspace_agent_lifecycle_state)::boolean AS ready
workspace_agents.lifecycle_state = 'ready'::workspace_agent_lifecycle_state AS ready

What's this, it actually works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just copied it from the workspace_prebuilds view.

latest_prebuilds.created_at
FROM
latest_prebuilds
LEFT JOIN ready_agents ON
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This join will duplicate rows when there are multiple agents since there is no group-by here or in the ready_agents CTE, was this considered?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'll add some unit tests to compare the behaviour of the previous query.

Copy link
Member Author

@johnstcn johnstcn Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'correct' behaviour seems to be one row per provisioner job / workspace build.

@johnstcn johnstcn force-pushed the cj/optimize/GetRunningPrebuiltWorkspaces branch from ab80dc0 to 9131d78 Compare June 30, 2025 11:23
@@ -3888,6 +3888,95 @@ func TestWorkspacePrebuildsView(t *testing.T) {
}
}

func TestGetRunningPrebuiltWorkspaces(t *testing.T) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reivew: adapted from TestWorkspacePrebuildsView.

Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would love to see an updated plan but looks good, nice work on the optimization!

latest_prebuilds.created_at
FROM latest_prebuilds
LEFT JOIN ready_agents ON ready_agents.job_id = latest_prebuilds.job_id
ORDER BY latest_prebuilds.workspace_id ASC;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, formatting is chef's kiss

@johnstcn
Copy link
Member Author

The updated plan is slower than the previous one, but correct according to the added unit tests.

@mafredri
Copy link
Member

The updated plan is slower than the previous one, but correct according to the added unit tests.

Interesting, it could be due to moving WHERE conditions. It's always interesting how a hypothesis for optimization can crumble in the face of actual data layout. And where naive improvements may actually stomp on the query planners feet. 😅 Either way, the "slower" improvement is still great, no need to micro optimize IMO. 👍🏻

@johnstcn johnstcn merged commit 258a839 into main Jul 1, 2025
34 checks passed
@johnstcn johnstcn deleted the cj/optimize/GetRunningPrebuiltWorkspaces branch July 1, 2025 08:42
@github-actions github-actions bot locked and limited conversation to collaborators Jul 1, 2025
@johnstcn
Copy link
Member Author

johnstcn commented Jul 1, 2025

Revert here: #18688

This caused the prebuilds reconciler to continuously attempt to delete failed prebuilds.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: GetRunningPrebuiltWorkspaces creates lots of DB load
3 participants