Skip to content

feat(api/ui)#50779 : add asset group filtering and asset group views #51682

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

davidfgcorreia
Copy link
Contributor

Add backend and OpenAPI support for filtering assets by group (group_pattern) in /assets endpoint.

Implement multi-asset dependency queries in /ui/dependencies endpoint with node_ids parameter.

Add new frontend pages and components for asset group views, including group graph and sidebar.

Refactor asset list and group list to support group filtering, merging, and deduplication.

Add and update tests for group filtering, group views, and multi-asset dependency queries.

introduced small bug in caching on the asset_grath.

https://youtu.be/ybpRUjEhdoQ


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:UI Related to UI/UX. For Frontend Developers. labels Jun 13, 2025
Copy link
Contributor

@bugraoz93 bugraoz93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API side looks good! Could you please fix static checks? It is failing for Compile / format / lint UI

@davidfgcorreia
Copy link
Contributor Author

Runned precommit locally and got zero errors how can I replicate this setup and make sure it passes consistently?

Copy link
Contributor

@guan404ming guan404ming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this feature is great. Feel free to ask anything.

<HStack gap={4} mb={2}>
<Box>
<Text color="chakra-fg" fontSize="xs">
{/* i18n: Producing Tasks label */}Producing Tasks
Copy link
Contributor

@guan404ming guan404ming Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently disable literal string in ui due to i18n thus you need to update the translation file like en/assets.json or en/common.json and use useTranslation hook to access them. You could also run npm run lint in ui folder to see your lint error.

@davidfgcorreia
Copy link
Contributor Author

Thanks, this feature is great. Feel free to ask anything.

Yes, I have a question. I created a new file called sidebar.tsx, but I'm not able to catch the errors in it , even though the runner here on Git is marking them. Do you know what might be going on?

Copy link
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice effort!

I believe there are a few things to update before we can merge but this can be a great feature and addition for Airflow 3.1.0

Comment on lines 102 to 117
all_nodes = []
all_edges = []
seen_nodes = set()
seen_edges = set()
for nid in ids_to_fetch:
comp = extract_single_connected_component(nid, data["nodes"], data["edges"])
for n in comp["nodes"]:
if n["id"] not in seen_nodes:
all_nodes.append(n)
seen_nodes.add(n["id"])
for e in comp["edges"]:
edge_tuple = (e["source_id"], e["target_id"])
if edge_tuple not in seen_edges:
all_edges.append(e)
seen_edges.add(edge_tuple)
data = {"nodes": all_nodes, "edges": all_edges}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should do this manually.

extract_single_connected_component is actually calling extract_connected_components which holds all the connected components of the graph. We could directly use all of those and merge them to avoid computing multiple times extract_connected_components (for each node id).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since extract_connected_components returns all components, should I use it even for single node IDs (by finding the one containing the ID), or should I keep using extract_single_connected_component in that case for simplicity?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can rework it a little bit to have:

  • extract_connected_components (return all)
  • rework and rename extract_single_connected component to instead take a list of node, and extrat the relevant coponent IDs. So you can always use this function everywhere, whether you want a single connected component or multiple of them filtered out.

Comment on lines 50 to 52
# If node_id contains commas, treat as multiple IDs (extra protection)
if node_ids:
ids_to_fetch = [nid.strip() for nid in node_ids.split(",") if nid.strip()]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, FastAPI handles node_ids: list[str] | None = None natively. FastAPI chose exploded form query params, i.e ?node_ids=1&node_ids=2 you will directly receive `node_ids=["1","2"] in your function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to send the parameters as a string specifically to avoid the exploded format in the URL. node_ids=1&node_ids=2, I send something like ?node_ids=1,2 (or another delimited format).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please stay consistent with the rest of the API, we don't want some endpoints doing that and some other doing something else, also this will remove the need for you to manually parse that string.

Comment on lines 34 to 52
const ChevronDownIcon = () => (
<svg fill="currentColor" height="1em" viewBox="0 0 20 20" width="1em">
<path
clipRule="evenodd"
d="M5.23 7.21a.75.75 0 011.06.02L10 11.085l3.71-3.855a.75.75 0 111.08 1.04l-4.24 4.4a.75.75 0 01-1.08 0l-4.24-4.4a.75.75 0 01.02-1.06z"
fillRule="evenodd"
/>
</svg>
);

const ChevronRightIcon = () => (
<svg fill="currentColor" height="1em" viewBox="0 0 20 20" width="1em">
<path
clipRule="evenodd"
d="M7.21 5.23a.75.75 0 011.06-.02l4.4 4.24a.75.75 0 010 1.08l-4.4 4.24a.75.75 0 11-1.04-1.08L11.085 10 7.23 6.29a.75.75 0 01-.02-1.06z"
fillRule="evenodd"
/>
</svg>
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already do this at different places in the code. (We use chevron icons from react-icon lib instead), you can do something similar.

Comment on lines 126 to 147
const { data, error, isLoading } = useAssetServiceGetAssets({
limit: 1000,
offset: 0,
orderBy,
});

const [search, setSearch] = useState("");
const [sortAsc, setSortAsc] = useState(true);

// Group assets by group
const grouped = useMemo(() => {
const filtered = (data?.assets ?? []).filter((asset) =>
asset.group.toLowerCase().includes(search.toLowerCase()),
);
const groupedObj = filtered.reduce<Record<string, Array<AssetResponse>>>((acc, asset) => {
const groupName = asset.group;

acc[groupName] ??= [];
acc[groupName].push(asset);

return acc;
}, {});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logic can holds a few problems, first of the endpoint useAssetServiceGetAssets is paginated. It means that you'll only retrieve the first 50 items, and you need a mechanism to display the remaining items (cf tables pagination component).

In addition, there is no guarantee at this point that all assets for a specified group will be on the same page, you could end up with a group missing some assets because those were not returned yet.

I think the easiest approach would be to add a backend ui endpoints to list/retrieve asset groups directly. (I group would list all of it's child asset and we would paginate on groups directly). Maybe there are other approaches possible but I would need to think about it.

Comment on lines 123 to 132
const {
data: dataByName,
error: errorByName,
isLoading: isLoadingByName,
} = useAssetServiceGetAssets({
limit: pagination.pageSize,
namePattern,
namePattern: searchValue ?? undefined,
offset: pagination.pageIndex * pagination.pageSize,
orderBy,
});

const {
data: dataByGroup,
error: errorByGroup,
isLoading: isLoadingByGroup,
} = useAssetServiceGetAssets({
groupPattern: searchValue ?? undefined,
limit: pagination.pageSize,
offset: pagination.pageIndex * pagination.pageSize,
orderBy,
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't do that manually in the UI. (call the endpoints with two different parameters and then merge results in the UI).

The backend should support this. (You can write a custom filter that will do the search accross both 'asset.nameandasset.group, or expand the search_param_factoryand_SearchParamto be able to take alist[ColumnElement]` as attribute. And perform a search across multiple columns.

Comment on lines 41 to 48
query.data?.nodes.forEach((node) => {
const key = UseDependenciesServiceGetDependenciesKeyFn({ nodeId: node.id });
const queryData = queryClient.getQueryData(key);

if (!Boolean(queryData)) {
queryClient.setQueryData(key, query.data);
}
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea but I think in its current form it could induce some bugs. Setting the cache for a particular node_id, with the result of useDependenciesServiceGetDependencies with many node ids. Because the response returned for many node ids will hold more connected components than just the 1 single connected component that we should retrieve from the nodeId: node.id filter.

This could therefore display unrelated dependencies when displaying dependencies of a particular node.

@pierrejeambrun
Copy link
Member

pierrejeambrun commented Jun 23, 2025

Yes, I have a question. I created a new file called sidebar.tsx, but I'm not able to catch the errors in it , even though the runner here on Git is marking them. Do you know what might be going on?

Be sure to be rebased on latest main with latest devtools updates (We recently merged a few improvement to this part). You can then run pre-commit hooks or eslint:

  • pre-commit run ts-compile-lint-ui (with your files stages) or pre-commit run ts-compile-lint-ui --all-files (for all files) This is what is run in the CI so you should be able to reproduce
  • or pnpm eslint --fix your_file (maybe don't run that, I'm having trouble running it locally at the moment, I thing something might be broken in the setup)

@davidfgcorreia
Copy link
Contributor Author

I'm getting a mypy error when running the pre-commit hook for airflow-core. Here's the stack trace:

AssertionError: Should never get here in normal mode, got TypeAlias:numpy.bool_ instead of TypeInfo
It suggests running breeze ci-image build --python 3.9 or breeze down --cleanup-mypy-cache, but I'm not sure what's actually causing this.
Any idea what might be going on or how to fix it?

Copy link
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea what is wrong in your setup. You can run pre-commit on main to be sure that everything is in order. It is really hard for me to help you without more context and iterating. Everything should be detailed in the contribution docs where you can find all the steps and troubleshooting help.

Comment on lines 102 to 117
all_nodes = []
all_edges = []
seen_nodes = set()
seen_edges = set()
for nid in ids_to_fetch:
comp = extract_single_connected_component(nid, data["nodes"], data["edges"])
for n in comp["nodes"]:
if n["id"] not in seen_nodes:
all_nodes.append(n)
seen_nodes.add(n["id"])
for e in comp["edges"]:
edge_tuple = (e["source_id"], e["target_id"])
if edge_tuple not in seen_edges:
all_edges.append(e)
seen_edges.add(edge_tuple)
data = {"nodes": all_nodes, "edges": all_edges}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can rework it a little bit to have:

  • extract_connected_components (return all)
  • rework and rename extract_single_connected component to instead take a list of node, and extrat the relevant coponent IDs. So you can always use this function everywhere, whether you want a single connected component or multiple of them filtered out.

@davidfgcorreia
Copy link
Contributor Author

Hi, the pipeline is currently failing on Tests AMD / CI image checks / Static checks (pull_request), specifically at the step "Generate SVG from Airflow CTL Commands".

However, when I run the checks locally:

pre-commit run (on modified files only) skips this check, as expected, since I haven't modified any related files;

pre-commit run --all-files passes the check successfully, including the "Generate SVG from Airflow CTL Commands" step.

Please let me know if there's anything else I should look into or adjust. Thanks!

@potiuk
Copy link
Member

potiuk commented Jun 29, 2025

Hi, the pipeline is currently failing on Tests AMD / CI image checks / Static checks (pull_request), specifically at the step "Generate SVG from Airflow CTL Commands".

However, when I run the checks locally:

pre-commit run (on modified files only) skips this check, as expected, since I haven't modified any related files;

pre-commit run --all-files passes the check successfully, including the "Generate SVG from Airflow CTL Commands" step.

Please let me know if there's anything else I should look into or adjust. Thanks!

That was temporary. As usual - rebase. Always Be Rebased. There is another issue in main being fixed as we speak though so you might wait a few minutes.

@potiuk
Copy link
Member

potiuk commented Jun 29, 2025

This is a PR that targets fixing the current main "generate constraints" failure #52464 - once merged, rebasing your PR should fix unrelated issue.

davidfgcorreia and others added 2 commits June 29, 2025 18:18
…views

Add backend and OpenAPI support for filtering assets by group
(`group_pattern`) in /assets endpoint.

Implement multi-asset dependency queries in /ui/dependencies endpoint
with node_ids parameter.

Add new frontend pages and components for asset group views, including
group graph and sidebar.

Refactor asset list and group list to support group filtering, merging,
and deduplication.

Add and update tests for group filtering, group views, and multi-asset
dependency queries.

Co-authored-by: Francisco Núncio <[email protected]>
- Added `/assets/groups` endpoint to return asset groups with their assets and counts.
- Added `AssetGroupResponse` and `AssetGroupCollectionResponse` models to API and OpenAPI schemas.
- Updated backend, OpenAPI YAML, and TypeScript types to support asset group listing and filtering.
- Updated frontend to use new `useAssetServiceGetAssetGroups` hook for grouped asset views and sidebar.
- Refactored asset list and group pages to use new group API and improved i18n.
- Updated dependencies API and frontend to use exploded `node_ids` query parameter (array, not comma-separated).
- Updated backend logic for `/dependencies` to return 404 if any requested node_id is missing.
- Updated and fixed unit tests for dependencies and asset groups to match new API and error handling.
- Improved search and filtering for assets and asset groups in UI.
@potiuk
Copy link
Member

potiuk commented Jun 29, 2025

Rebased it for you after merging my PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API area:UI Related to UI/UX. For Frontend Developers.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants