
Refactor serve_logs with FastAPI #52581

Open
wants to merge 1 commit into
base: main
from

Conversation

jason810496
Member

@jason810496 jason810496 commented Jun 30, 2025

closes: #52526
related: https://lists.apache.org/thread/hfr8q85rgr6knpp5wblbz301ysnmzhht

Why

  • Move Flask (or FAB, as you said) out of the airflow-core dependencies

What

Replace Flask's send_from_directory with fastapi_app.mount and JWTAuthStaticFiles (which inherits from FastAPI's StaticFiles and extends the existing authorization; see fastapi/fastapi#858 (comment)).

@jason810496 jason810496 self-assigned this Jun 30, 2025
@jason810496 jason810496 added the full tests needed We need to run full set of tests for this PR to merge label Jun 30, 2025
@jason810496 jason810496 force-pushed the refactor/logging/replace-flask-serve-log-with-fastapi branch from 8be599f to 99651ed Compare June 30, 2025 18:00
Fix test_invalid_characters_handled

Refactor with StaticFiles
@jason810496 jason810496 force-pushed the refactor/logging/replace-flask-serve-log-with-fastapi branch from 99651ed to b5facdf Compare June 30, 2025 18:02
@jason810496 jason810496 marked this pull request as ready for review June 30, 2025 18:02
Member Author

@jason810496 jason810496 left a comment

I thought there was no alternative to Flask's send_from_directory in FastAPI, so in my first attempt I implemented the file path validation myself (and the security check complained).

Fortunately, I found a gist that adds authorization to FastAPI's StaticFiles (linked in the PR description).
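
For context, hand-rolled file path validation of the kind mentioned above usually means resolving the requested path and checking that it stays inside the log directory. A minimal sketch, assuming a hypothetical helper named resolve_log_path and Python 3.9+; this is not the code from the PR, which leaves the check to StaticFiles:

```python
from pathlib import Path


def resolve_log_path(base_dir: str, relative_path: str) -> Path:
    """Return the resolved location of relative_path inside base_dir.

    Hypothetical helper for illustration only. Raises ValueError when the
    requested path would escape base_dir (via ".." or an absolute path).
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute path discards base, so resolve first and
    # then verify that the result is still located under base.
    candidate = (base / relative_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the log directory")
    return candidate
```

Getting every edge case right (symlinks, absolute paths, encoded separators) is easy to miss, which is one reason to prefer a maintained implementation such as StaticFiles over a custom check.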

raise ImportError(f"Unable to load {log_config_class} due to error: {e}")

fastapi_app = FastAPI()
fastapi_app.state.signer = JWTValidator(
Member Author

We have to store the signer instance in app.state, just as we do for dag_bag in core-api, to make it a singleton.

Member

Nice! TIL a bit more about FastAPI by reviewing this PR.

Comment on lines 211 to 219

-    options = [bind_option, GunicornOption("workers", 2)]
-    StandaloneGunicornApplication(wsgi_app, options).run()
+    # Use Uvicorn worker class for ASGI applications
+    options = [
+        bind_option,
+        GunicornOption("workers", 2),
+        GunicornOption("worker_class", "uvicorn.workers.UvicornWorker"),
+    ]
+    StandaloneGunicornApplication(asgi_app, options).run()

Member Author

I'm also thinking about replacing StandaloneGunicornApplication with uvicorn.run, since api_server_command uses uvicorn.run to start the whole core-api.

Any comments on this?

uvicorn.run(
    "airflow.api_fastapi.main:app",
    host=args.host,
    port=args.port,
    workers=num_workers,
    timeout_keep_alive=worker_timeout,
    timeout_graceful_shutdown=worker_timeout,
    ssl_keyfile=ssl_key,
    ssl_certfile=ssl_cert,
    access_log=access_logfile,
    proxy_headers=proxy_headers,
)

Member

Not sure what the difference is. But we have to remember that "serve_logs" runs in the "celery" worker, where we do not have "api_server" running: serve_logs is the only thing that celery workers expose. So I think that's the reason we had the "Standalone server". I do not know too much about those.

@pierrejeambrun -> maybe you can help here?

@@ -43,74 +44,55 @@
logger = logging.getLogger(__name__)


def create_app():
Member

❤️

 if token_filename is None:
     logger.warning("The payload does not contain 'filename' key: %s.", payload)
-    abort(403)
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Invalid token payload")
Member

We do not actually want to give the client more details about why we rejected the request. This is an important security principle: never explain to the client why a request failed if the reason is an authentication problem. Just return "403" without any details, and log the details on the server side.

Otherwise it might make it easier for a potential attacker to see what is wrong and adjust their attack, including leveraging "timing" attacks, for example, to see whether the tokens partially match.

The less we tell the client about the reasons, the more secure we are.
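
The principle above can be sketched without any web framework. This is an illustration rather than the PR's code: Forbidden and check_filename_claim are hypothetical names, and in the FastAPI version the exception would presumably be an HTTPException with status 403 and no custom detail.

```python
import logging

logger = logging.getLogger("serve_logs_example")


class Forbidden(Exception):
    """Hypothetical stand-in for an HTTP 403 response with no explanation."""

    status_code = 403


def check_filename_claim(payload: dict, request_filename: str) -> None:
    """Reject the request unless the token's 'filename' claim matches it.

    Every failure raises the same bare Forbidden; the specific reason is
    written only to the server log.
    """
    token_filename = payload.get("filename")
    if token_filename is None:
        logger.warning("The payload does not contain 'filename' key: %s.", payload)
        # Deliberately no detail: never tell the client why it was rejected.
        raise Forbidden()
    if token_filename != request_filename:
        logger.warning(
            "Request path: %s. Token path: %s.", request_filename, token_filename
        )
        # Deliberately identical to the failure above.
        raise Forbidden()
```

Because both branches raise the same exception with an empty message, a client cannot tell a missing claim from a mismatched one; only the server log distinguishes them.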

 if token_filename != request_filename:
     logger.warning(
         "The payload log_relative_path key is different than the one in token:"
         "Request path: %s. Token path: %s.",
         request_filename,
         token_filename,
     )
-    abort(403)
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Token filename mismatch")
Member

Yeah. This one, for example, is pretty informative to the attacker, so we should just return 403 and keep all the details in the server log for diagnostics.

Member

What we can do in this case is associate a random ID with each such request and report it to the client (and log it on the server side). This makes it easier to correlate client-side requests with server-side errors when the failure is legitimate.
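
A minimal sketch of that idea, assuming a hypothetical helper named reject whose returned dict stands in for the body of the 403 response; nothing here is taken from the PR:

```python
import logging
import uuid

logger = logging.getLogger("serve_logs_example")


def reject(reason: str) -> dict:
    """Log the real reason under a random ID and expose only the ID.

    Hypothetical helper: the returned dict represents what the client sees.
    """
    error_id = uuid.uuid4().hex
    # The reason stays on the server; the ID is the only link to it.
    logger.warning("Request rejected [error_id=%s]: %s", error_id, reason)
    return {"status_code": 403, "error_id": error_id}
```

A user who reports the error_id lets an operator find the matching log line, while the response itself still reveals nothing about why the request was refused.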

 except HTTPException:
     raise
 except InvalidAudienceError:
     logger.warning("Invalid audience for the request", exc_info=True)
-    abort(403)
+    raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Invalid audience")
Member

Same here and in all the other cases. It was pretty deliberate to just return 403 here.

We could actually add a code comment here mentioning that it's deliberate; otherwise future contributors might try to "fix" it in the same way.


 import gunicorn.app.base
-from flask import Flask, abort, request, send_from_directory
Member

❤️ ❤️

@potiuk
Member

potiuk commented Jul 1, 2025

One other thing: I think we can remove these dependencies from airflow-core/pyproject.toml:

    # We could get rid of flask and gunicorn if we replace serve_logs with a starlette + unicorn
    "flask>=2.1.1",
    # We could get rid of flask and gunicorn if we replace serve_logs with a starlette + unicorn
    "gunicorn>=20.1.0",

Labels
full tests needed We need to run full set of tests for this PR to merge
Development

Successfully merging this pull request may close these issues.

Convert serve_logs to use fast-api
2 participants