Skip to content

fix(coderd): mark sub agent deletion via boolean instead of delete #18411

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 19, 2025

Conversation

mafredri
Copy link
Member

@mafredri mafredri commented Jun 17, 2025

Since deletion of data is uncommon in our database, the introduction of sub agents and their deletion introduced issues with foreign key assumptions in coder/internal#685. We could have only addressed the specific case by allowing cascade deletion of stats as well as handling in the stats collector, but it's unclear how many more such edge-cases we would've run into.

For this reason, we decided to mark the rows as deleted via boolean instead, and filter them out in all queries that seem relevant so that coderd doesn't see the deleted agents.

Testing is covered by updating the relevant agentapi tests as well as adding an antagonist deleted agent to dbgen which automatically gets applied to all tests.

Fixes coder/internal#685

Note that there are some queries that read workspace_agents that were not updated with deletion handling, but each one was analyzed and deemed unnecessary because of how they are used, doing so may introduce more harm than good. On that note, many of these queries can return results for non-latest builds which essentially means, "deleted" agents.

@mafredri mafredri force-pushed the mafredri/fix-subagent-deletion branch 2 times, most recently from db5700a to 7d73585 Compare June 18, 2025 10:29

// Add a test antagonist. For every agent we add a deleted sub agent
// to discover cases where deletion should be handled.
subAgt, err := db.InsertWorkspaceAgent(genCtx, database.InsertWorkspaceAgentParams{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: This may be a bit unusual, but it already helped me catch an issue with the workspace_prebuilds VIEW.

Ideally this potential cause would somehow be propagated to the user in case of test failure, but not sure how that could be done. I'll add a log entry to help with discovery.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot! I do agree that it could get confusing to troubleshoot, but the log message in combination with dbtestutil.DumpOnFailure should help. You could also potentially set some relevant fields of the sub-agent so that its purpose is clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this? 😆 bc370a8 (#18411)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SATISFACTORY

@mafredri mafredri marked this pull request as ready for review June 18, 2025 12:06
Copy link
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spot-checked the migrations and the updated views/triggers look OK to me.

Would like a second pair of eyes on this for approval in case there's something I missed.


// Add a test antagonist. For every agent we add a deleted sub agent
// to discover cases where deletion should be handled.
subAgt, err := db.InsertWorkspaceAgent(genCtx, database.InsertWorkspaceAgentParams{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot! I do agree that it could get confusing to troubleshoot, but the log message in combination with dbtestutil.DumpOnFailure should help. You could also potentially set some relevant fields of the sub-agent so that its purpose is clear.

Comment on lines +360 to +367
UPDATE
workspace_agents
SET
deleted = TRUE
WHERE
id = $1
AND parent_id IS NOT NULL
AND deleted = FALSE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 literally impossible to delete a 'top-level' agent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obligatory reminder to check migration number!

Copy link
Contributor

@DanielleMaywood DanielleMaywood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but awaiting approval from someone on prebuilds

@mafredri mafredri force-pushed the mafredri/fix-subagent-deletion branch from c26d6ab to 732eb5f Compare June 19, 2025 13:11
@mafredri mafredri force-pushed the mafredri/fix-subagent-deletion branch from 732eb5f to 6121231 Compare June 19, 2025 13:12
@mafredri mafredri enabled auto-merge (squash) June 19, 2025 13:20
@mafredri mafredri merged commit 511fd09 into main Jun 19, 2025
35 of 37 checks passed
@mafredri mafredri deleted the mafredri/fix-subagent-deletion branch June 19, 2025 13:32
@github-actions github-actions bot locked and limited conversation to collaborators Jun 19, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Devcontainers: As a side-effect of deleting agents, it's possible for the stats collector to become sad
4 participants