Nicer value rendering in API errors #132314

thockin · 2025-06-15T01:15:57Z

Today, if the value passed is a struct, map, or list, we get Go's native rendering which is clunky.

This uses JSON (could be kyaml when that is ready) instead.

I hear it already: "But JSON is slow!". I benchmarked it -- for a simple int or string field, JSON is only a little slower (~20%) than a type assertion, but it IS slower, so I left the type assertion in. Remember that this is only called when an API error has occurred.

The type assertions do not handle typedefs-to-{string, int64, etc} so those will fall back on JSON. Almost all of our errors go thru standard functions which demand string or int64 anyway, so handling typedefs there would be mostly pointless.
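To illustrate the fast path and the fallback described above, here is a minimal sketch. It is not the code from this PR: `renderValue` and `Port` are hypothetical names, and quoting strings on the fast path is an assumption made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Port is a typedef-to-int64. A type switch on int64 does not match it,
// so it takes the JSON fallback below.
type Port int64

// renderValue is a hypothetical helper, not the function changed in this PR.
// Plain scalars take a cheap type-switch path; everything else goes to JSON.
func renderValue(v any) string {
	switch t := v.(type) {
	case string:
		return strconv.Quote(t) // assumption: strings are shown quoted
	case int:
		return strconv.Itoa(t)
	case int64:
		return strconv.FormatInt(t, 10)
	case bool:
		return strconv.FormatBool(t)
	}
	b, err := json.Marshal(v)
	if err != nil {
		// Last resort: Go's native rendering.
		return fmt.Sprintf("%#v", v)
	}
	return string(b)
}

func main() {
	fmt.Println(renderValue("abc"))              // fast path
	fmt.Println(renderValue(int64(80)))          // fast path
	fmt.Println(renderValue(Port(80)))           // typedef: JSON fallback
	fmt.Println(renderValue([]string{"a", "b"})) // composite: JSON fallback
}
```

In this sketch the typedef still renders the same as a plain int64; it just pays the JSON cost to get there.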

I also benchmarked using reflect to check CanInt() and that is almost exactly as fast as type-switch but handles more cases, so we COULD switch to that instead, if we wanted. I thought it wasn't worth the complexity.
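For comparison, a reflect-based check could look roughly like the sketch below. `renderScalar`, `Port`, and `Name` are hypothetical names; only `reflect.Value.CanInt()` and friends are real standard-library calls (Go 1.18+).

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// Hypothetical typedefs that a plain type switch would miss.
type Port int64
type Name string

// renderScalar is a hypothetical helper. It reports false for anything that
// is not string-like or integer-like, so the caller can fall back to JSON.
func renderScalar(v any) (string, bool) {
	rv := reflect.ValueOf(v)
	switch {
	case rv.Kind() == reflect.String:
		return strconv.Quote(rv.String()), true
	case rv.CanInt():
		return strconv.FormatInt(rv.Int(), 10), true
	case rv.CanUint():
		return strconv.FormatUint(rv.Uint(), 10), true
	}
	return "", false
}

func main() {
	fmt.Println(renderScalar(Port(80)))
	fmt.Println(renderScalar(Name("web")))
	fmt.Println(renderScalar([]int{1, 2}))
}
```

The kind-based checks cover typedefs as well as the builtin types, which is the "handles more cases" part; whether that is worth the extra reflection code is the judgment call made above.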

JSON is really there to handle composite types.

/kind bug
/kind cleanup

NONE

k8s-triage-robot · 2025-06-15T02:38:18Z

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

k8s-triage-robot · 2025-06-15T03:38:18Z

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

yongruilin

/triage accepted

staging/src/k8s.io/apimachinery/pkg/util/validation/field/errors_test.go

thockin · 2025-06-18T02:22:17Z

Open questions:

* For types that define String() - should we prefer that or JSON? I chose JSON, but it changes some results.
  * If we choose String() then do we put explicit quotes or leave it "naked"?
  * One place (logs) renders int-type values as hex because of the default %#v, but other places like time render things that are clearly meant to be a string. We would not want to add MarshalJSON() to logs, since hex is not JSON. We could add ANOTHER optional method like MarshalLog() or MarshalErrorValue()? @pohly
* metav1.Time has a MarshalJSON() and inherits a String() (from embedded time.Time) and they are different - should we make them the same? @deads2k
* Since validation runs on internal types, we still get some GoNames instead of goNames, but this was true before.
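The String() versus MarshalJSON() difference can be seen with the standard library's time.Time alone. This sketch deliberately uses time.Time rather than metav1.Time, so it only shows the general shape of the mismatch; `sample` and `renderings` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// sample returns a fixed timestamp so the output is reproducible.
func sample() time.Time {
	return time.Date(2025, time.June, 15, 1, 15, 57, 0, time.UTC)
}

// renderings returns the String() form and the JSON form of the same value.
func renderings(t time.Time) (asString, asJSON string) {
	b, err := json.Marshal(t)
	if err != nil {
		return t.String(), ""
	}
	return t.String(), string(b)
}

func main() {
	s, j := renderings(sample())
	fmt.Println(s) // 2025-06-15 01:15:57 +0000 UTC
	fmt.Println(j) // "2025-06-15T01:15:57Z"
}
```

So a type that offers both methods can produce two different user-visible renderings of the same value, which is why the choice between them changes some results.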

Today, if the value passed is a struct, map, or list, we get Go's native rendering which is clunky. This uses JSON (could be kyaml when that is ready) instead. I hear it already: "But JSON is slow!". I benchmarked it -- for a simple int or string field, JSON is only a little slower (~20%) than a type assertion, but it IS slower, so I left the type assertion in. Remember that this is only called when an API error has occurred. The type assertions do not handle typedefs-to-{string, int64, etc} so those will fall back on JSON. Almost all of our errors go thru standard functions which demand string or int64 anyway, so mostly pointless. I also benchmarked using reflect to check `CanInt()` and that is almost exactly as fast as type-switch but handles more cases, so we COULD switch to that instead, if we wanted. I thought it wasn't worth the complexity. JSON is really there to handle composite types.

Notes:
* For types that define String() - should we prefer that or JSON?
* metav1.Time has a MarshalJSON() and inherits a String() and they are different
* Since validation runs on internal types, we still get some GoNames instead of goNames.
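The GoNames versus goNames point follows from how encoding/json picks field names: without a `json` tag it uses the Go field name as-is. The two struct types below are hypothetical stand-ins for an internal type (assumed here to carry no tags) and a versioned type (with tags).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// internalContainer is a hypothetical stand-in for an internal API type,
// assumed to have no json tags.
type internalContainer struct {
	Name            string
	ImagePullPolicy string
}

// versionedContainer is a hypothetical stand-in for a versioned API type.
type versionedContainer struct {
	Name            string `json:"name"`
	ImagePullPolicy string `json:"imagePullPolicy"`
}

// toJSON marshals v, returning the error text if marshaling fails.
func toJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(toJSON(internalContainer{Name: "c", ImagePullPolicy: "Always"}))
	// {"Name":"c","ImagePullPolicy":"Always"}
	fmt.Println(toJSON(versionedContainer{Name: "c", ImagePullPolicy: "Always"}))
	// {"name":"c","imagePullPolicy":"Always"}
}
```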

k8s-ci-robot · 2025-06-19T01:12:38Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/apis/OWNERS~~ [thockin]
~~pkg/credentialprovider/OWNERS~~ [thockin]
~~staging/src/k8s.io/apiextensions-apiserver/pkg/apis/OWNERS~~ [thockin]
~~staging/src/k8s.io/apimachinery/pkg/api/validation/OWNERS~~ [thockin]
~~staging/src/k8s.io/apimachinery/pkg/util/validation/OWNERS~~ [thockin]
~~staging/src/k8s.io/apiserver/pkg/apis/OWNERS~~ [thockin]
~~staging/src/k8s.io/component-base/logs/api/OWNERS~~ [thockin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

stlaz · 2025-06-23T16:10:26Z

(triage):
looks good from sig-auth side (credentialplugin changes)

pohly · 2025-06-25T13:58:29Z

> For types that define String() - should we prefer that or JSON? I chose JSON, but it changes some results.

At first glance, this seems like a situation where a user-visible representation of a value is needed, which is what String is supposed to provide. Was JSON chosen because some of our values have String implementations which are fairly unreadable (the protobuf generated String implementations come to mind) or because we want a complete dump of the bad value (similar to how some test frameworks dump the entire error in addition to the "summary string" returned by Error)?

> If we choose String() then do we put explicit quotes or leave it "naked"?

As we use error-wrapping style (i.e. <prefix>: <details>) one can read from left to right and figure out what the value is without quoting it. I prefer leaving it "naked".

pohly · 2025-06-25T14:01:38Z

> one can read from left to right and figure out what the value is without quoting it

Except that there is more text after the value:

path.to.field: Invalid value: "the value": the details

That invalidates my argument and quoting becomes necessary.

Or can we shuffle things around?

path.to.field: Invalid value, the details: the value

pohly · 2025-06-25T14:06:43Z

Apropos error rendering: should validation tests check for expected errors by comparing against full strings, by checking for more or less complete sub-strings, or by comparing against errors produced by calling the same error method as in the validation code?

For DRA, I chose the latter because I didn't want the test to depend on the implementation of those methods. As seen in this PR, some other validation tests use strings which then need to be updated when changing the implementation. Strings have the advantage that one can check the readability of the user-visible error message and more easily spot when the wrong method is used when the result makes no sense.

thockin · 2025-06-26T08:19:19Z

The genesis of this PR (other than being a long-annoying thing) is declarative validation. We added support to auto-check listmap types for duplicates and throw errors. Testing those exposed the fact that we produce a "duplicate value" error, where the value is (for example) a whole container. It's clearly NOT a duplicate value, but the key is buried in there.

I thought "perhaps we can return a map[string]any with just the key field(s) set", but that still gets rendered with Go's "%#v", which is not super helpful. But if I run that through JSON (or KYAML) it is nicer.
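A small sketch of that comparison, using a hypothetical key map for a duplicated container name. `renderGo` and `renderJSON` are illustrative helpers, not functions from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// renderGo shows Go's native %#v rendering of the value.
func renderGo(v any) string {
	return fmt.Sprintf("%#v", v)
}

// renderJSON shows the JSON rendering, falling back to %#v on error.
func renderJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return renderGo(v)
	}
	return string(b)
}

func main() {
	// Only the listmap key field(s), not the whole container.
	key := map[string]any{"name": "nginx"}

	fmt.Println("Duplicate value: " + renderGo(key))   // includes the Go type syntax
	fmt.Println("Duplicate value: " + renderJSON(key)) // Duplicate value: {"name":"nginx"}
}
```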

As for readability, I don't think we intend the final error to be machine parseable or splittable, but I am eager to make it more useful to humans.

E.g. is something like "Invalid value ("the value"): the details" better? I think the quotes are useful to distinguish "true" from true (as in labels) or 4 from "4" (as in resources).
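To make the "true" versus true point concrete, here is a sketch that formats an error line in the `path.to.field: Invalid value: <value>: <details>` shape discussed above, rendering the value as JSON. `invalidValue` is a hypothetical helper, not the real field.Invalid.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// invalidValue is a hypothetical formatter that renders the value as JSON,
// so strings keep their quotes and bools and numbers do not get any.
func invalidValue(path string, value any, detail string) string {
	b, err := json.Marshal(value)
	if err != nil {
		b = []byte(fmt.Sprintf("%#v", value))
	}
	return fmt.Sprintf("%s: Invalid value: %s: %s", path, b, detail)
}

func main() {
	fmt.Println(invalidValue("path.to.field", "true", "the details"))
	// path.to.field: Invalid value: "true": the details
	fmt.Println(invalidValue("path.to.field", true, "the details"))
	// path.to.field: Invalid value: true: the details
	fmt.Println(invalidValue("path.to.field", "4", "the details"))
	// path.to.field: Invalid value: "4": the details
	fmt.Println(invalidValue("path.to.field", 4, "the details"))
	// path.to.field: Invalid value: 4: the details
}
```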

As for unit tests, I prefer they operate with the new Matcher logic, so they are less dependent on exact strings. For example, I want to make the error message better for dns-label, and it breaks hundreds of tests. This is why we are adding "origin".

pohly · 2025-06-27T08:44:17Z

> where the value is (for example) a whole container

So that's exactly the case where ignoring the fmt.Stringer implementation in favor of some nicer rendering makes sense. I sometimes wish we didn't need those fmt.Stringer implementations (but protobuf needs them) or that they produced nicer output (let's replace with KYAML?!), but for now preferring JSON as proposed in this PR makes sense.

One can also argue that the API errors are meant to provide a data dump of the values, not just a user visible rendering, because one may have to inspect the entire value.

> I don't think we intend the final error to be machine parseable or splittable

Agreed, that's why I thought it would be okay to not use quoting. Quoting can make strings less readable, and leaving it out works fine as long as humans can "spot" where the value starts.

> is something like "Invalid value ("the value"): the details" better

If we keep quoting the value, then path.to.field: Invalid value: "the value": the details is fine.

> I prefer they operate with the new Matcher logic, so they are less dependent on exact strings

So produce expected errors and compare against the actual errors, which is what DRA does - except that it doesn't use the ErrorMatcher helper yet. Let me look into changing that now...

thockin · 2025-06-27T11:28:54Z

The advantage of the matcher is that you can decide which criteria to match. Often the field path + error type + detail substring is OK, but as we add more origins, the detail string matters less and becomes the opposite of useful.
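The idea of choosing match criteria can be sketched as follows. This is a self-contained illustration with hypothetical types (`valErr`, `matcher`); it does not use or reproduce the real ErrorMatcher API.

```go
package main

import (
	"fmt"
	"strings"
)

// valErr is a hypothetical, simplified validation error.
type valErr struct {
	Type, Field, Detail, Origin string
}

// matcher is a hypothetical set of switches choosing what to compare.
type matcher struct {
	byType, byField, byOrigin, byDetailSubstring bool
}

// matches reports whether got satisfies want under the enabled criteria.
func (m matcher) matches(want, got valErr) bool {
	if m.byType && want.Type != got.Type {
		return false
	}
	if m.byField && want.Field != got.Field {
		return false
	}
	if m.byOrigin && want.Origin != got.Origin {
		return false
	}
	if m.byDetailSubstring && !strings.Contains(got.Detail, want.Detail) {
		return false
	}
	return true
}

func main() {
	want := valErr{Type: "Invalid", Field: "metadata.name", Detail: "lowercase", Origin: "format=dns-label"}
	// Same error with a reworded detail message.
	got := valErr{Type: "Invalid", Field: "metadata.name", Detail: "must be a valid DNS label", Origin: "format=dns-label"}

	byOrigin := matcher{byType: true, byField: true, byOrigin: true}
	byDetail := matcher{byType: true, byField: true, byDetailSubstring: true}

	fmt.Println(byOrigin.matches(want, got)) // true: rewording does not matter
	fmt.Println(byDetail.matches(want, got)) // false: the test depends on the wording
}
```

Matching on origin instead of the detail string is what lets a message be reworded without touching every test that expects it.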

pohly · 2025-06-27T13:40:40Z

I want my error matching to be pretty complete, but the "origin instead of detail string" is nice. I converted pkg/apis/resource/validation, which included making some changes elsewhere - see #132577

thockin · 2025-06-30T18:50:21Z

AFAIK this is OK to review now.

thockin assigned jpbetz and deads2k Jun 15, 2025

k8s-ci-robot requested review from derekwaynecarr and sttts June 15, 2025 01:16

k8s-ci-robot added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Jun 15, 2025

dims added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 15, 2025

yongruilin reviewed Jun 17, 2025

staging/src/k8s.io/apimachinery/pkg/util/validation/field/errors_test.go (outdated)

k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 17, 2025

thockin force-pushed the jp_nicer_api_errors branch from 96a6584 to 6fcb038 Compare June 18, 2025 02:19

k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. labels Jun 18, 2025

github-project-automation bot added this to SIG Auth and SIG Apps Jun 18, 2025

github-project-automation bot moved this to Needs Triage in SIG Apps Jun 18, 2025

thockin force-pushed the jp_nicer_api_errors branch from 6fcb038 to 5ac9af4 Compare June 18, 2025 02:52

enj moved this to Needs Triage in SIG Auth Jun 18, 2025

thockin added 3 commits June 19, 2025 10:11

WIP: Fix tests

4ca91a0

Don't panic in case of an unknown API error code

e68d601

thockin force-pushed the jp_nicer_api_errors branch from 5ac9af4 to e68d601 Compare June 19, 2025 01:12

stlaz moved this from Needs Triage to In Review in SIG Auth Jun 23, 2025

pohly mentioned this pull request Jun 27, 2025

API: enhance validation with ErrorMatcher #132577

Open
