Skip to content

fix(coderd): fix panics by always checking for non-nil request logger #18228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 12, 2025

Conversation

mafredri
Copy link
Member

@mafredri mafredri commented Jun 4, 2025

These panics can happen during tests (racy), saw that some places protect the request logger via non-nil check so fixed all instances.

(Discussion: This seems like a dangerous tool, perhaps a noop type would be better than nil to cover our bases, but I did not make that decision here.)

@mafredri mafredri self-assigned this Jun 4, 2025
@mafredri mafredri requested review from johnstcn and dannykopping June 4, 2025 12:54
Copy link
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could lead to strange issues where we expect to find some information in the logs but end up not doing so due to forgetting to include the logger context.

Can we instead introduce a t.Logf adapter in coderdtest?

@mafredri
Copy link
Member Author

mafredri commented Jun 4, 2025

Looking at the commit history, it seems this was introduced by @ibetitsmike. Do you have any thoughts on how to proceed based on @johnstcn comment?

I'm just looking to fix the test race, but happy to hand this over to someone else if another approach is preferred.

@ibetitsmike
Copy link
Contributor

@mafredri @johnstcn generally the only scenario in which requestLogger is nil is if the request circumvents our API Logger middleware. It's responsible for attaching RequestLogger to context.

So the only scenario where it fails is if we call the method that tries to log, but doesn't do it through API. An example that was failing for me was provisioner_job_tests which I fixed here: https://github.com/coder/coder/pull/17304/files#diff-a0b103dd31ee53f52c43564220dbada29c0d9147e7210b27cb42e375b08fd947

If we experience flakes I'd try to understand why a given test path fails to run this line: https://github.com/coder/coder/blob/5f7e5d709787ca4e6ec287197f112d56d0eae2c8/coderd/httpmw/loggermw/logger.go#L44C11-L44C28

RequestLogger is useful to debug issues with long lived connections, so it's not critical to have that information 100% of the time, but @deansheather was the one that initially asked me to remove those nil checks (I had them in the first implementation).

@mafredri
Copy link
Member Author

mafredri commented Jun 4, 2025

Here's a repro:

go test ./coderd -run=Test_logFollower_completeBeforeFollow -v -race -count=10

There was another test as well that failed, also log follower. It's an easy miss to include the request logger when writing a test of this type, so I'm in favor of nil checks tbh as this happens too late and is racy otherwise.

@johnstcn
Copy link
Member

johnstcn commented Jun 5, 2025

What if we added another middleware only in tests that immediately fails the test if the request logging middleware is not present when any request comes in? That should allow surfacing this issue quickly.

@dannykopping
Copy link
Contributor

This one seems to be languishing and I'm getting pinged daily for review.
Thus far I've deferred since others have reviewed.
If you need another voice, I'm happy to add it - but either way let's come to a conclusions on this if possible.

@mafredri
Copy link
Member Author

mafredri commented Jun 12, 2025

I don't really have the bandwidth to refactor the implementation or any tests that touch upon this code path. The problem here is that components are being tested with a http test server and there's nothing enforcing or informing the user/test writer to add certain middleware, leading to occasional flakes.

There is also pre-existing precedent for the != nil check so I don't see why we'd have it in some places but not in others. Either case, we can either merge or close this, either works for me.

Edit: Meanwhile @dannykopping, I've released you from your agony and dropped you as a reviewer. 😜

@mafredri mafredri removed the request for review from dannykopping June 12, 2025 08:01
Copy link
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving to unblock.

@mafredri mafredri merged commit 70723d3 into main Jun 12, 2025
33 checks passed
@mafredri mafredri deleted the mafredri/fix-coderd-panic-request-log branch June 12, 2025 10:50
@github-actions github-actions bot locked and limited conversation to collaborators Jun 12, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants