-
Notifications
You must be signed in to change notification settings - Fork 15.3k
Add Keycloak Refresh Token Middleware #51657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR should be separated in multiple PRs, you are trying to implement multiple different features. I started to work on the create token API (https://github.com/orgs/apache/projects/500/views/1?pane=issue&itemId=112609811) but it seems you want to work on it as well?
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/routes/login.py
Outdated
Show resolved
Hide resolved
I should have communicated this with you better. My bad, Vincent! Although I saw the issue, I first started adding these features because I thought we would have these endpoints to manage login, similar to FAB and SAM but then realised that we should manage the login and refresh part separately. Those things just stayed. I can also help on those parts if there isn't too much duplicate work already on your side 😅 Please let me know which one can stay and I can separate the PR and implement unit test on it |
No worries at all Bugra, and thanks a lot for your hard work, I understand Keycloak can be confusing and auth in general is not easy to grasp at first sight. Let's do it step by step, could you update this PR to focus only on the refresh token logic? We'll see other points later. I think that will simplify the review. |
Thanks Vincent! It has been a great time spent to learn and connecting all the pieces for both Keycloak and our internal authentication. I only worked on the backend API part in other authentication managers, so I missed completely what and how the entire authentication system works. |
0e3aeb3
to
891394a
Compare
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/middleware.py
Outdated
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/middleware.py
Outdated
Show resolved
Hide resolved
return await call_next(request) | ||
except (InvalidTokenError, ExpiredSignatureError): | ||
code = None | ||
if "code" in request.query_params: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that work? We do we get this code when the token expires? Have you tested it? My idea was to do something like this.
# Extract Authorization header
auth = request.headers.get("authorization")
if auth and auth.lower().startswith("bearer "):
token_str = auth.split(" ", 1)[1]
user = await get_auth_manager().get_user_from_token(token_str)
token = get_auth_manager().refresh_token(user.refresh_token)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your time! This is not working because the code coming is different. Also, only auth requests end up in this middleware, which makes it impossible to cover each cases. This is a middle state of the code. I tested multiple times. It was really hard and problematic to do this in KeycloakMiddleware. I tried multiple ways in a couple of days this week. Nothing worked because I couldn't properly get the cureent user. I will move the middleware to Core Airlow. I will try to finilise it today or tomorrow.
There is one thing, though I tested multiple times with Keycloak and see that after the token expires internal one, UI automatically calls the login and refreshes the token. I have tested via default to 10 seconds expiration. Without these changes, UI refreshing the token, the only thing is it fallbacks to the home page of Airflow. If we want to fall back to the same page rather than the homepage, that could indeed be done in core Airflow middleware. Of course, if the Keycloak session expires, login is needed. So I am thinking maybe this refresh token is not needed after all. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, sorry for the delay!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this concept there are 2 tokens:
- The Airflow token which expiration is set by
[api_auth] jwt_expiration_time
. When this token expires, Airflow redirects to login page. We do not want to modify this logic - The Keycloak access token. When a user makes an action through the UI or API, it sends a JWT token. This JWT token is the Airflow token. Airflow deserializes this token and extracts information from it. One piece of information is the Keycloak access_token. Keycloak auth manager then use this access token to communicate with Keycloak server to check whether a user has access to a given resource. This token expires as well and by default it expires every 5 minutes, which means, if a user uses the Keycloak auth manager, after 5 minutes, their Airflow token will still be valid but the Keycloak auth manager will be invalid, which result in every call to Keycloak server failing. This is what we are trying to solve here.
After 5 minutes, we should refresh the Keycloak access token automatically and transparently for the user. This is the purpose of the middleware
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Vincent! Updated the code according to comments.
I have tested and the refresh part is working. The only remaining task is to update the UI to handle this case, as we need to cascade new user information to the UI via the token to retrieve the latest values at a later stage to refresh again. Otherwise, that refresh token expires after the default 5 minutes. Plus, unit tests. Please let me know if there are things that is missed in the meanwhile 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So basically all remains the same. We just expose a "get_refresh_token_url" in the provider, a "refresh_token" in the public API, and we remove the middleware. Then anybody (clients) is free to implement his own refresh flow how he sees fit.
The UI will most likely wait for an "Expired Token" to refresh the token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I agree with this approach. I think this is indeed better. Thanks Pierre. @bugraoz93 Is everything clear on your side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All clear, thanks for the discussion and comments! I will follow up with this soon
- Expose an API endpoint
- UI calls the endpoint when the token has expired error is obtained
- Remove middleware
- Call refresh token from exposed endpoint
- Return token to UI, which updates local storage
Bonus could be an additional endpoint to check if the token is valid because this will always be on the auth manager side and will not be directly available to the API without calling an additional endpoint, such as in this case, the Keycloak introspect method. This can tell UI to understand if the token has expired on the backend (not internal JWT). Since this will be available in the auth manager, it is easier to expose to the UI. It would also help automations to understand that their token needs a refresh. Since this will be publicly available, I thought it could be a good addition. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think this is needed, we should not check whether the token is valid, we should assume it is and when it is not (catching the exception), kicking off the refresh token flow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also feel free to split it in two, the backend part, and once we have the endpoints available you can follow up (or someone else) with the front-end refresh token flow. That should make the PR much smaller and easy to review / merge.
891394a
to
ce1fa1c
Compare
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
a59ce34
to
0e94d22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it is much better :) I like the direction and I think we are close. Thanks for being receptive to my comments :)
On top of my comments, one other thing: I think we should call refresh_token
only when the token expired. A bit like your previous implementation. Basically:
- Detecting when Keycloak token expired by catching the exception. When that happens, raising another generic exception (e.g. using an exception from
jwt
package) - In the middleware, catching this exception, and calling
refresh_token
when that happens
That allows to call refresh_token
only when the token expires, otherwise we call refresh_token
at every call, which can lead to severe latency increase.
airflow-core/src/airflow/api_fastapi/auth/managers/base_auth_manager.py
Outdated
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/middleware/refresh_token.py
Outdated
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/middleware/refresh_token.py
Outdated
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/base_auth_manager.py
Outdated
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/base_auth_manager.py
Outdated
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/middleware/refresh_token.py
Outdated
Show resolved
Hide resolved
bdf69d3
to
3a8a002
Compare
Thanks a lot, Vincent! I would be a lot faster in different areas but in auth managers, I have learned a lot during the implementation. Still small unit test is needed for |
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/base_auth_manager.py
Outdated
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/middleware/refresh_token.py
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/middleware/refresh_token.py
Outdated
Show resolved
Hide resolved
airflow-core/src/airflow/api_fastapi/auth/managers/middleware/refresh_token.py
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Outdated
Show resolved
Hide resolved
providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py
Show resolved
Hide resolved
ebe67d1
to
9c8fca7
Compare
c053c59
to
1793cc9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Only this comment thread to resolve and we are good :)
1793cc9
to
ad0eeba
Compare
One last ask (promise, the last one), can you add a test for the middleware? Thanks again for the work! |
For sure, thanks for pointing out! I would say this is a good collaboration rather than an ask :) |
Added the test, which I had missed while deleting it for the CORS. That was on me |
This needs small care on the test. It worked in my local but seems not here. Will check this out |
73ffbdd
to
4499ad3
Compare
… and change refresh_token
… and change refresh_token
…ocs and add tests
…anager.py Co-authored-by: Vincent <[email protected]>
58894a9
to
24fdd3a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just on comment
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in airflow-core/newsfragments.