Skip to content

Add Keycloak Refresh Token Middleware #51657

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

bugraoz93
Copy link
Contributor


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

Copy link
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR should be separated in multiple PRs, you are trying to implement multiple different features. I started to work on the create token API (https://github.com/orgs/apache/projects/500/views/1?pane=issue&itemId=112609811) but it seems you want to work on it as well?

@bugraoz93
Copy link
Contributor Author

bugraoz93 commented Jun 13, 2025

I think this PR should be separated in multiple PRs, you are trying to implement multiple different features. I started to work on the create token API (Keycloak auth manager (view)) but it seems you want to work on it as well?

I should have communicated this with you better. My bad, Vincent! Although I saw the issue, I first started adding these features because I thought we would have these endpoints to manage login, similar to FAB and SAM but then realised that we should manage the login and refresh part separately. Those things just stayed. I can also help on those parts if there isn't too much duplicate work already on your side 😅 Please let me know which one can stay and I can separate the PR and implement unit test on it

@vincbeck
Copy link
Contributor

vincbeck commented Jun 13, 2025

No worries at all Bugra, and thanks a lot for your hard work, I understand Keycloak can be confusing and auth in general is not easy to grasp at first sight. Let's do it step by step, could you update this PR to focus only on the refresh token logic? We'll see other points later. I think that will simplify the review.

@bugraoz93
Copy link
Contributor Author

Thanks Vincent! It has been a great time spent to learn and connecting all the pieces for both Keycloak and our internal authentication. I only worked on the backend API part in other authentication managers, so I missed completely what and how the entire authentication system works.
Sure, Vincent, I will update the PR to include only the refresh_token logic. I will update the PR today or tomorrow. Today, I have a full day plan, so I will try the night or tomorrow morning to update the PR :)

@bugraoz93 bugraoz93 force-pushed the feat/keycloak/api branch from 0e3aeb3 to 891394a Compare June 15, 2025 19:39
@bugraoz93 bugraoz93 changed the title Add Keycloak endpoints, middleware, service and logout Add Keycloak Refresh Token Middleware Jun 15, 2025
return await call_next(request)
except (InvalidTokenError, ExpiredSignatureError):
code = None
if "code" in request.query_params:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that work? We do we get this code when the token expires? Have you tested it? My idea was to do something like this.

# Extract Authorization header
auth = request.headers.get("authorization")

if auth and auth.lower().startswith("bearer "):
    token_str = auth.split(" ", 1)[1]
    user = await get_auth_manager().get_user_from_token(token_str)

    token = get_auth_manager().refresh_token(user.refresh_token)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your time! This is not working because the code coming is different. Also, only auth requests end up in this middleware, which makes it impossible to cover each cases. This is a middle state of the code. I tested multiple times. It was really hard and problematic to do this in KeycloakMiddleware. I tried multiple ways in a couple of days this week. Nothing worked because I couldn't properly get the cureent user. I will move the middleware to Core Airlow. I will try to finilise it today or tomorrow.

There is one thing, though I tested multiple times with Keycloak and see that after the token expires internal one, UI automatically calls the login and refreshes the token. I have tested via default to 10 seconds expiration. Without these changes, UI refreshing the token, the only thing is it fallbacks to the home page of Airflow. If we want to fall back to the same page rather than the homepage, that could indeed be done in core Airflow middleware. Of course, if the Keycloak session expires, login is needed. So I am thinking maybe this refresh token is not needed after all. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, sorry for the delay!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this concept there are 2 tokens:

  • The Airflow token which expiration is set by [api_auth] jwt_expiration_time. When this token expires, Airflow redirects to login page. We do not want to modify this logic
  • The Keycloak access token. When a user makes an action through the UI or API, it sends a JWT token. This JWT token is the Airflow token. Airflow deserializes this token and extracts information from it. One piece of information is the Keycloak access_token. Keycloak auth manager then use this access token to communicate with Keycloak server to check whether a user has access to a given resource. This token expires as well and by default it expires every 5 minutes, which means, if a user uses the Keycloak auth manager, after 5 minutes, their Airflow token will still be valid but the Keycloak auth manager will be invalid, which result in every call to Keycloak server failing. This is what we are trying to solve here.

After 5 minutes, we should refresh the Keycloak access token automatically and transparently for the user. This is the purpose of the middleware

Copy link
Contributor Author

@bugraoz93 bugraoz93 Jun 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Vincent! Updated the code according to comments.
I have tested and the refresh part is working. The only remaining task is to update the UI to handle this case, as we need to cascade new user information to the UI via the token to retrieve the latest values at a later stage to refresh again. Otherwise, that refresh token expires after the default 5 minutes. Plus, unit tests. Please let me know if there are things that is missed in the meanwhile 🙏

Copy link
Member

@pierrejeambrun pierrejeambrun Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically all remains the same. We just expose a "get_refresh_token_url" in the provider, a "refresh_token" in the public API, and we remove the middleware. Then anybody (clients) is free to implement his own refresh flow how he sees fit.

The UI will most likely wait for an "Expired Token" to refresh the token.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep I agree with this approach. I think this is indeed better. Thanks Pierre. @bugraoz93 Is everything clear on your side?

Copy link
Contributor Author

@bugraoz93 bugraoz93 Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All clear, thanks for the discussion and comments! I will follow up with this soon

  • Expose an API endpoint
  • UI calls the endpoint when the token has expired error is obtained
  • Remove middleware
  • Call refresh token from exposed endpoint
  • Return token to UI, which updates local storage

Bonus could be an additional endpoint to check if the token is valid because this will always be on the auth manager side and will not be directly available to the API without calling an additional endpoint, such as in this case, the Keycloak introspect method. This can tell UI to understand if the token has expired on the backend (not internal JWT). Since this will be available in the auth manager, it is easier to expose to the UI. It would also help automations to understand that their token needs a refresh. Since this will be publicly available, I thought it could be a good addition. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this is needed, we should not check whether the token is valid, we should assume it is and when it is not (catching the exception), kicking off the refresh token flow

Copy link
Member

@pierrejeambrun pierrejeambrun Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also feel free to split it in two, the backend part, and once we have the endpoints available you can follow up (or someone else) with the front-end refresh token flow. That should make the PR much smaller and easy to review / merge.

@bugraoz93 bugraoz93 force-pushed the feat/keycloak/api branch from 891394a to ce1fa1c Compare June 21, 2025 15:10
@bugraoz93 bugraoz93 force-pushed the feat/keycloak/api branch 2 times, most recently from a59ce34 to 0e94d22 Compare June 22, 2025 21:12
Copy link
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it is much better :) I like the direction and I think we are close. Thanks for being receptive to my comments :)

On top of my comments, one other thing: I think we should call refresh_token only when the token expired. A bit like your previous implementation. Basically:

  • Detecting when Keycloak token expired by catching the exception. When that happens, raising another generic exception (e.g. using an exception from jwt package)
  • In the middleware, catching this exception, and calling refresh_token when that happens

That allows to call refresh_token only when the token expires, otherwise we call refresh_token at every call, which can lead to severe latency increase.

@bugraoz93 bugraoz93 force-pushed the feat/keycloak/api branch from bdf69d3 to 3a8a002 Compare June 23, 2025 19:58
@bugraoz93
Copy link
Contributor Author

Overall it is much better :) I like the direction and I think we are close. Thanks for being receptive to my comments :)

On top of my comments, one other thing: I think we should call refresh_token only when the token expired. A bit like your previous implementation. Basically:

  • Detecting when Keycloak token expired by catching the exception. When that happens, raising another generic exception (e.g. using an exception from jwt package)
  • In the middleware, catching this exception, and calling refresh_token when that happens

That allows to call refresh_token only when the token expires, otherwise we call refresh_token at every call, which can lead to severe latency increase.

Thanks a lot, Vincent! I would be a lot faster in different areas but in auth managers, I have learned a lot during the implementation. Still small unit test is needed for refresh_token. Functionality is also working as expected. I will try to do it and finalise this one in the coming hours.

@bugraoz93 bugraoz93 requested a review from jason810496 as a code owner June 24, 2025 22:21
@bugraoz93 bugraoz93 force-pushed the feat/keycloak/api branch from c053c59 to 1793cc9 Compare June 24, 2025 22:24
Copy link
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Only this comment thread to resolve and we are good :)

@bugraoz93 bugraoz93 force-pushed the feat/keycloak/api branch from 1793cc9 to ad0eeba Compare June 25, 2025 18:32
@vincbeck
Copy link
Contributor

One last ask (promise, the last one), can you add a test for the middleware? Thanks again for the work!

@bugraoz93
Copy link
Contributor Author

One last ask (promise, the last one), can you add a test for the middleware? Thanks again for the work!

For sure, thanks for pointing out! I would say this is a good collaboration rather than an ask :)

@bugraoz93
Copy link
Contributor Author

Added the test, which I had missed while deleting it for the CORS. That was on me

@bugraoz93
Copy link
Contributor Author

This needs small care on the test. It worked in my local but seems not here. Will check this out

@bugraoz93 bugraoz93 force-pushed the feat/keycloak/api branch from 73ffbdd to 4499ad3 Compare June 26, 2025 15:59
@bugraoz93 bugraoz93 force-pushed the feat/keycloak/api branch from 58894a9 to 24fdd3a Compare June 26, 2025 16:03
Copy link
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just on comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants