Skip to content

Investigate new partition selection and texel assignment algorithms #216

Open
@solidpixel

Description

@solidpixel

The current partition selection algorithms are based on kmeans clustering, and bit mismatch assignment. The basic algorithm is:

texel_assignment = compute_block_kmeans()

partition_accuracy = []
foreach partition_index in range(0, 1024):
    mismatch = compute_texel_assignment_mismatches(texel_assignment)
    patition_accuracy.append((mismatch, partition_index))

partition_accuracy.sort()

foreach trial_index in range(0, tune_max_partion_indexes):
    trial_partition = partition_accuracy[trial_index]

Ideas

Improved partition selecting based on quality threshold

With the current compressor we consider all active partitionings and then rank those by pixel mismatch counts. Should we introduce percentiles for partitionings to reduce the active set further? This has a natural benefit for RDO as we can use the known percentile table to repack the partitioning index.

Improved partition index sorting

Selecting a specific partition is done based simply on a counting of texels that end up in the "wrong" partition, and then sorting by that count. Nothing factors in the magnitude of the error introduced; some partition swaps are likely to be worse than others. Can we use to this choose better block candidates and either (1) improve PSNR or (2) reduce the number of partition trials we need.

Improved partition selection

A brute-force test which puts every valid partitioning through full 1 plane compression improves -thorough image quality for the Kodak suite by up to 0.26 dB (> 0.15 dB common). This isn't viable as a strategy (50x slower), but indicates that there is potential for selecting better partitionings. Unclear if this is because we are using partitionings that are not tested by the search, or partitionings which are dropped by the search due to a bad error estimate.

Improved edge case assignment

Today the algorithm requires at least one texel in each partition. If a partition ends up without a texel we assign the last texel in the block to it (repeating this N times if needed). This choice isn't based on any actual error metric, so there is a good chance that assignment is a bad choice. We should track the texels which are "worst" and assign them to the empty partition.

Try clustering endpoints not whole partitions

Today we make 3 clusters for 3 partitions. We could instead make 6 clusters as endpoint approximators, and then try to find good pairings. The cluster centers themselves would be bad endpoints as they will truncate, but a lerp line passing though each pair of them might be interesting.

Good pairings could be encoding-aware:

  • Prefer endpoint pairings that are stronger for RGBS.
  • Prefer endpoint pairings that are all of similar distance apart (tend to have similar quant requirements).

Some of the data we need for determining good pairings could actually be good input into later selection passes. For example, if this pass can tell us ahead of time that RGBS is going to be terrible we can just skip trying it later.

Improved texel assignment

Clustering using kmeans will select texels based on distance to the cluster center, tending to making blobs of texels rather than lines of texels. Using kmeans is probably OK for a broad classification, but today we also use it for detailed texel assignment to partitions. The partition that is chosen by texel distance to a cluster center is not necessarily the same partition that would be chosen based on distance to the projected lerp line. Can we chose better partition assignments for each texel?

Special case constant color handling

For blocks with high amounts of constant color we may be able to free up bits by fusing two constant color partitions together, provided that the magnitude of the endpoint diff is similar to the other partitions (this keeps the quant requirements similar). Not always a good idea though - might disallow RGBS encoding if there is a significant chroma skew, for example.

Color heuristics for choosing 4 partitions

The biggest reason to use 4 partitions are chroma shifts that cannot be managed by a "lerp" between two existing chroma values. Unexpected high intensity chroma shifts also stand out in images perceptually (even though luma is more important overall). Determine need for multiple partitions based on the area of the bounding box in HS(L) space?

Early out on bad options

We currently complete every partition search. For bad options we should look at when it is possible to early out.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions