proposal 246

Tristan Swadell edited this page Oct 27, 2022 · 3 revisions

CEL - Optional Values

Summary

Introduce first-class support for optional values within CEL to address the following use cases:

  • Conditionally provided variables to CEL expressions
  • Conditional field selections with support for alternative values upon absence
  • Conditionally set fields based on the presence of an optional value

Author(s): Tristan Swadell, Matthias Blume
Reviewer(s): Justin King, Jonathan Tatum
Status: Approved
Last-Modified: 2022-10-27

Overview

Currently, users have to write potentially deeply nested ternary operators using the has macro to determine whether a field is set, to obtain the value, and to return an alternative value.

{ "int_value" : has(data.int_value) ? data.int_value : dyn(null) }

The expression above then has a further problem in that the 'null' return is intended to indicate an optional return value, but doing so requires the use of 'dyn' which obscures the type information from subsequent checks.

Goals

  • Introduce a type-safe option for indicating an optional value.
  • Provide a simple UX for selecting among alternative values, including defaults.
  • Simplify the creation of map and message objects with optional field values.
  • Minimize syntactic changes as these are hard to version and flag-guard.

Non-Goals

  • Provide an exact replica of what is exposed by the java.util.Optional or absl::optional implementations.
  • Provide the optimal syntactic experience for working with optional values

Proposal

Introduce a new opaque type called optional, a small set of syntax changes, and namespaced functions for the instantiation of parameterized optional values:

Syntax Semantics

msg.?field

Optionally select a field.

// select semantics
has(msg.field) ? optional.of(msg.field) : optional.none()

map[?key]

// map key semantics
key in map ? optional.of(map[key]) : optional.none()

list[?index]

// list index semantics
index >= 0 && index < list.size()
    ? optional.of(list[index]) : optional.none()

Msg{ ?field: <expr> }

Optionally set a message field. Note, the <expr> must be of type optional(T) where T is the type of the field referenced.

To support the indication of an optional type on the field, a new boolean flag will be added to the CEL syntax.proto which marks the message entry as optional.

// optional field set semantics pseudo-code.
<expr>.hasValue() ? Msg{field: <expr>.value()} : Msg{}

{?key: <expr>}

Optionally set a map key. The <expr> must be of the same type as the map's value type. If the map is a literal, then the value type is expected to be optional(type(<expr>)).

During literal construction when there are multiple value types, the map entry's value type is deferred to runtime. A map with mixed value types, even if some of those types are optional and some are not, will have the type map(<keyType>, dyn) rather than map(<keyType>, optional(dyn)).

To support the indication of an optional type on the field, a new boolean flag will be added to the CEL syntax.proto which marks the struct (map/message) entry as optional.

// optional key set semantics pseudo-code.
<expr>.hasValue() ? {key: <expr>.value()} : {}
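
The rules in the table above can be modeled outside of CEL to make the behavior concrete. The following is a minimal Python sketch, not CEL itself and not any CEL implementation: messages and maps are both modeled as Python dicts (an assumption that ignores proto field-presence details), and the names Opt, opt_select, opt_index, and build_map are hypothetical helpers introduced only for illustration.

```python
_ABSENT = object()  # sentinel standing in for optional.none()


class Opt:
    """Hypothetical stand-in for CEL's optional(T)."""

    def __init__(self, value=_ABSENT):
        self._value = value

    def has_value(self):
        return self._value is not _ABSENT

    def value(self):
        if not self.has_value():
            raise ValueError("optional.none() has no value")
        return self._value


def opt_select(msg, field):
    """Models msg.?field, with a dict standing in for a message."""
    return Opt(msg[field]) if field in msg else Opt()


def opt_index(container, key):
    """Models map[?key] for dicts and list[?index] for lists."""
    if isinstance(container, dict):
        return Opt(container[key]) if key in container else Opt()
    return Opt(container[key]) if 0 <= key < len(container) else Opt()


def build_map(entries):
    """Models {?key: <expr>}; each entry is (key, value, is_optional)."""
    result = {}
    for key, value, is_optional in entries:
        if not is_optional:
            result[key] = value          # plain entry: always set
        elif value.has_value():
            result[key] = value.value()  # optional entry: set only if present
    return result


data = {"int_value": 7}
present = build_map([("int_value", opt_select(data, "int_value"), True)])
absent = build_map([("int_value", opt_select({}, "int_value"), True)])
```

In this sketch, present contains the int_value entry while absent is an empty map, which is the behavior the optional entry syntax is meant to provide without a ternary.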

This proposal also recommends that support be added for chaining optional field selection and indexing, where selection / indexing on an optional value produces another optional value. This means that the '?' is a viral choice that causes subsequent lookups to produce optional values as well. The intent here is to ensure that nested optional value checks, the common case, are succinct and easy to read.

The following will produce equivalent results:

msg.?map_field[key][0].subfield
msg.?map_field[?key][0].subfield
msg.?map_field[?key][?0].subfield

Chained semantics:

has(msg.map_field)
    && key in msg.map_field
    && 0 < msg.map_field[key].size()
    && has(msg.map_field[key][0].subfield)
    ? optional.of(
        msg.map_field[key][0].subfield)
    : optional.none()

Example usage:

msg.?map_field[key][0].subfield.orValue("hello world").size()

Since the optional selection behavior should propagate from parent to child select / index expressions, the checker will be updated to support the standard select (a.b) behavior over optional objects, and overloads for the index operators (map[value], list[index]) will be created to support indexing on such values.
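
To illustrate how absence propagates through a chain of lookups, here is a minimal Python sketch of the viral behavior described above. It is a model under stated assumptions rather than an implementation: dicts stand in for both messages and maps, the Opt class with its get method and the chained_size function are hypothetical names, and raising TypeError for a lookup on a non-container is a choice made for this sketch only.

```python
_ABSENT = object()  # sentinel standing in for optional.none()


class Opt:
    """Hypothetical stand-in for CEL's optional(T)."""

    def __init__(self, value=_ABSENT):
        self._value = value

    def has_value(self):
        return self._value is not _ABSENT

    def or_value(self, default):
        return self._value if self.has_value() else default

    def get(self, key):
        """Select or index on an optional; the result is always optional."""
        if not self.has_value():
            return self  # absence propagates to every later lookup
        container = self._value
        if isinstance(container, dict):
            return Opt(container[key]) if key in container else Opt()
        if isinstance(container, list):
            in_range = isinstance(key, int) and 0 <= key < len(container)
            return Opt(container[key]) if in_range else Opt()
        raise TypeError("cannot select or index into " + type(container).__name__)


def chained_size(msg, key):
    # Mirrors msg.?map_field[key][0].subfield.orValue("hello world").size()
    found = Opt(msg).get("map_field").get(key).get(0).get("subfield")
    return len(found.or_value("hello world"))


full = {"map_field": {"k": [{"subfield": "hi"}]}}
```

With the full value, every lookup succeeds and the size of the stored string is returned. When any step is missing (the field, the key, the list element, or the subfield), the chain falls through to the default string without further checks.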

The following functions are needed to support interoperability with the proposed operators and syntax changes:

Global Function Signature

optional.of

If the input value is an error or unknown value, the input is returned as-is; otherwise, create an optional value which wraps the input.

// overload: optional_of_value
optional.of(T) -> optional(T)

optional.ofNonZeroValue

The semantics of this function are identical to those of optional.of(); however, if the input is a zero value or null, the result will be optional.none().

The purpose of this function is to assist users who are working with literals where an empty literal value is intended to be treated equivalently to unset. In some instances when working with JSON data, a map key-value may be set to 'null' in order to indicate absence. Under normal circumstances, CEL would treat the zero value or 'null' as a present value rather than an absent one.

Zero values include:

  • Empty message, map, list, string, and bytes values
  • Zero values for int, double, uint, duration, and timestamp
  • The concrete 'null' and boolean 'false'

// overload: optional_ofNonZeroValue_value
optional.ofNonZeroValue(T) -> optional(T)

optional.none

// overload: optional_none
optional.none() -> optional(T)

_==_

The equality operator definition does not change with the introduction of optional, but at runtime the comparison of optional values would require that both optionals have a value and that the contained values are equal.

Optional values will use the standard definition of runtime heterogeneous equality, and will not automatically dereference when compared with non-optional values. This constraint may be relaxed in future revisions, but the use of optional values should increase the likelihood of a strong type-check signal, and there is presently no way to produce an optional value through an object traversal (unlike producing a dyn-typed float when an int is expected).

// operator: equals
optional(T) == optional(T) -> bool

_.?_

It's not possible to represent the type change upon field selection as an operator signature; however, to differentiate between selection and optional field selection a new call will be introduced. The declaration of the signature would be: _.?_(dyn, string) -> dyn

The new call makes it simpler to differentiate selection behaviors within parsed-only expressions, and can be handled in the same fashion as existing select expressions within the type-checker implementations.

// overload: optional_select_optional_field
optional(T).?field -> optional(type(T.field))

// overload: select_optional_field
T.?field -> optional(type(T.field))

_[?_]

The new call makes it simpler to differentiate between optional indexing and the standard indexing behaviors.

// overload: list_index_optional_int
list(T)[?int] -> optional(T)

// overload: optional_list_index_optional_int
optional(list(T))[?int] -> optional(T)

// overload: map_index_optional_value
map(K, V)[?K] -> optional(V)

// overload: optional_map_index_optional_value
optional(map(K, V))[?K] -> optional(V)
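
The difference between optional.of and optional.ofNonZeroValue is easiest to see side by side. The following Python sketch models the zero-value list above under several assumptions: Python values stand in for CEL values (dict for both map and message, timedelta for duration, an aware datetime at the Unix epoch for the zero timestamp), the pass-through of error and unknown values is not modeled, and the names Opt, of, of_non_zero_value, and is_zero_value are hypothetical.

```python
from datetime import datetime, timedelta, timezone

_ABSENT = object()  # sentinel standing in for optional.none()
_EPOCH = datetime.fromtimestamp(0, tz=timezone.utc)  # assumed zero timestamp


class Opt:
    """Hypothetical stand-in for CEL's optional(T)."""

    def __init__(self, value=_ABSENT):
        self._value = value

    def has_value(self):
        return self._value is not _ABSENT

    def value(self):
        if not self.has_value():
            raise ValueError("optional.none() has no value")
        return self._value


def is_zero_value(v):
    """True for the zero values listed in the proposal, as modeled here."""
    if v is None or v is False:
        return True
    if isinstance(v, bool):
        return False  # True is a non-zero bool; checked before int
    if isinstance(v, (int, float)):
        return v == 0
    if isinstance(v, (str, bytes, list, dict)):
        return len(v) == 0
    if isinstance(v, timedelta):
        return v == timedelta(0)
    if isinstance(v, datetime):
        return v == _EPOCH
    return False


def of(v):
    """Models optional.of: any value, including null, is wrapped as present."""
    return Opt(v)


def of_non_zero_value(v):
    """Models optional.ofNonZeroValue: zero values and null become none."""
    return Opt() if is_zero_value(v) else Opt(v)
```

For a JSON-style input where a key is set to null, of(None) still reports a value, while of_non_zero_value(None) is treated as absent.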

The new optional values will also have the following member functions to assist with chained operations and testing whether the optional has a value:

Member Function Signature

or

If the first optional hasValue() == true, then the second optional expression is not evaluated and the first optional is returned.

While this operator supports short-circuiting, it is still strict as this function is not commutative like the boolean || operator.

// overload: optional_or_optional
optional(T).or(optional(T)) -> optional(T)

orValue

If the first optional hasValue() == true, then the optional.value() is returned, and the else branch is not evaluated; otherwise, the else value is returned.

While this operator supports short-circuiting, it is still strict as this function is not commutative like the boolean || operator.

// overload: optional_orValue_value
optional(T).orValue(T) -> T

value

Return the value held within the optional, or error if the optional hasValue() == false.

// overload: optional_value
optional(T).value() -> T // error if none()

hasValue

Return whether the optional has a value.

// overload: optional_hasValue
optional(T).hasValue() -> bool
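
The short-circuiting behavior of or and orValue can be sketched in Python, with one caveat: Python evaluates call arguments eagerly, so this model passes the alternative as a zero-argument callable to imitate an unevaluated CEL sub-expression. The Opt class and the method names or_ and or_value are hypothetical and exist only for this illustration.

```python
_ABSENT = object()  # sentinel standing in for optional.none()


class Opt:
    """Hypothetical stand-in for CEL's optional(T)."""

    def __init__(self, value=_ABSENT):
        self._value = value

    def has_value(self):
        return self._value is not _ABSENT

    def value(self):
        if not self.has_value():
            raise ValueError("optional.none() has no value")
        return self._value

    def or_(self, alternative):
        """Models optional(T).or(optional(T)); alternative is a callable."""
        return self if self.has_value() else alternative()

    def or_value(self, default):
        """Models optional(T).orValue(T); default is a callable."""
        return self._value if self.has_value() else default()


calls = []


def fallback():
    # Records each evaluation so short-circuiting can be observed.
    calls.append("evaluated")
    return Opt(2)


first = Opt(1).or_(fallback)      # fallback is not evaluated
skipped = list(calls)
second = Opt().or_(fallback)      # fallback is evaluated once
```

Because the receiver always decides the outcome, swapping the two operands can change the result, which is why the text above describes the function as not commutative.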

With the methods proposed, the original CEL expression is simplified and provides a clear, strongly-typed optional(int) result type which also provides better assurances about the presence of the int_value field referenced in the original expression:

{ ?"int_value" : data.?int_value }

The new functions make it possible to express whether a value is optional; however, they pose a different challenge when it comes to consuming optional values within message / map creation statements and in 'let'-style expressions which have been implemented in a few different contexts. The following macros will also be introduced to assist with conditionally adapting optional values:

Macro Signature

optMap

optional(T).optMap(T var, T -> U) -> optional(U)

Here's an example of how optMap might be used to transform the value and express complex logic:

optional.of(some.field.path)
    .optMap(value, value == x || value == y)

This macro is equivalent to the following:

(has(some.field.path)
    ? [some.field.path].map(value,
          optional.of(value == x || value == y))[0]
    : optional.none())

optFlatMap

optional(T).optFlatMap(T var, T -> optional(U)) -> optional(U)

As an example, the optFlatMap can be used to implement optional.ofNullable(). It's possible that similar expressions will need to be provided to ensure that derived expressions within pToken which produce optional values can be appropriately inspected and adapted to other optional values.

optional.of(msg.field[key])
    .optFlatMap(value, value == null
        ? optional.none()
        : optional.of(value))

This macro is equivalent to the following:

(has(msg.field[key])
    ? [msg.field[key]].map(value, value == null
        ? optional.none()
        : optional.of(value))[0]
    : optional.none())
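
The two macros differ only in whether the mapping expression's result is wrapped. A minimal Python sketch of that difference follows, assuming a Python lambda can stand in for the macro's bound variable and expression. The Opt class, opt_map, opt_flat_map, and of_nullable are hypothetical names, and None stands in for CEL's null.

```python
_ABSENT = object()  # sentinel standing in for optional.none()


class Opt:
    """Hypothetical stand-in for CEL's optional(T)."""

    def __init__(self, value=_ABSENT):
        self._value = value

    def has_value(self):
        return self._value is not _ABSENT

    def value(self):
        if not self.has_value():
            raise ValueError("optional.none() has no value")
        return self._value

    def opt_map(self, fn):
        """Models optMap: fn returns U, and the result is wrapped."""
        return Opt(fn(self._value)) if self.has_value() else Opt()

    def opt_flat_map(self, fn):
        """Models optFlatMap: fn already returns an optional."""
        return fn(self._value) if self.has_value() else Opt()


def of_nullable(value):
    # Mirrors the optFlatMap example: null becomes none, anything else is kept.
    return Opt(value).opt_flat_map(lambda v: Opt() if v is None else Opt(v))


x, y = 3, 4
matched = Opt(3).opt_map(lambda v: v == x or v == y)
unmatched = Opt(5).opt_map(lambda v: v == x or v == y)
missing = Opt().opt_map(lambda v: v == x or v == y)
```

Note that unmatched still has a value (false), whereas missing has none: optMap transforms a present value but never turns presence into absence, which is the job of optFlatMap.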

Alternative(s) Considered

A strict implementation of java.util.Optional or absl::optional is not desired since it ties the CEL implementation to a single particular flavor of optional value.

Note, if the has() macro were updated to expand index calls into 'index' in value, then has() would be more broadly useful than it is presently and the semantics more simply expressed as a series of nested has() calls over a field path.

As a potential further improvement in the common case, an operator such as ?? could be considered to simplify expressions such as a.?b.orValue('hello world') to a.b ?? 'hello world'. While this is more succinct, it's not clear how commonly users will need to leverage optional values. At present, there are a number of cases where the optional introduces optional field proto semantics where they had otherwise been lacking in CEL. With more experience and knowledge about the kinds of use cases this feature opens up, we will be able to better determine which changes will best serve CEL users.

Production Safety

Concerns Yes No
Alters AST representation
Alters type-check semantics
Alters evaluation semantics
Impacts evaluation performance
Introduces new runtime function
Introduces syntax change

Mitigations / Pre-Mortem

Feature flags will need to be introduced to manage exposing new functions and new syntax to existing CEL applications. However, all changes are forward compatible with previously compiled CEL expressions.

All stacks may initially support this feature by implementing the abstract "optional" type, functions, and macros. The bulk of the risk comes from the syntactic changes as they also have implications for type-checking and introduce new overloads for existing operators.