Investigate new partition selection and texel assignment algorithms #216

solidpixel opened this issue Feb 20, 2021 · 1 comment

solidpixel commented Feb 20, 2021

The current partition selection algorithm is based on kmeans clustering, followed by texel assignment mismatch counting against each candidate partitioning. The basic algorithm is:

texel_assignment = compute_block_kmeans()

partition_accuracy = []
for partition_index in range(0, 1024):
    mismatch = compute_texel_assignment_mismatches(texel_assignment, partition_index)
    partition_accuracy.append((mismatch, partition_index))

partition_accuracy.sort()

for trial_index in range(0, tune_max_partition_indexes):
    trial_partition = partition_accuracy[trial_index]

Ideas

Improved partition selection based on a quality threshold

With the current compressor we consider all active partitionings and then rank those by pixel mismatch counts. Should we introduce percentiles for partitionings to reduce the active set further? This has a natural benefit for RDO as we can use the known percentile table to repack the partitioning index.
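
A hedged sketch of the idea, assuming a precomputed centile table ranking partitioning indices by observed usefulness; the table, names, and limit value are illustrative and not part of the current compressor.

def filter_and_repack(centile_table, centile_limit):
    # Keep only partitionings at or below the centile limit, ranked best-first.
    active = sorted(
        (index for index, centile in centile_table.items() if centile <= centile_limit),
        key=lambda index: centile_table[index])
    # Dense remap so an RDO pass can code the surviving indices compactly.
    remap = {index: packed for packed, index in enumerate(active)}
    return set(active), remap

# Example: keep the partitionings covering up to the 95th centile of observed usage.
# active, remap = filter_and_repack(centile_table, 95.0)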

Improved partition index sorting

Selecting a specific partitioning is currently based simply on counting the texels that end up in the "wrong" partition, and then sorting by that count. Nothing factors in the magnitude of the error introduced; some partition swaps are likely to be worse than others. Can we use this to choose better block candidates and either (1) improve PSNR or (2) reduce the number of partition trials we need?
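
As an illustration only (the inputs and weighting are assumptions, not the current implementation), a mismatch could be charged the extra squared error it introduces rather than a flat count of one:

import numpy as np

def weighted_mismatch(texel_colors, kmeans_assignment, partition_assignment, cluster_centers):
    # Sum the estimated extra squared error introduced by texels whose partition
    # disagrees with their kmeans cluster, instead of counting each mismatch as 1.
    centers = [np.asarray(c, dtype=float) for c in cluster_centers]
    error = 0.0
    for color, cluster, part in zip(texel_colors, kmeans_assignment, partition_assignment):
        if cluster != part:
            color = np.asarray(color, dtype=float)
            d_part = float(np.sum((color - centers[part]) ** 2))
            d_cluster = float(np.sum((color - centers[cluster]) ** 2))
            error += max(d_part - d_cluster, 0.0)
    return error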

Improved partition selection

A brute-force test which puts every valid partitioning through full 1-plane compression improves -thorough image quality for the Kodak suite by up to 0.26 dB (> 0.15 dB is common). This isn't viable as a strategy (50x slower), but it indicates that there is potential for selecting better partitionings. It is unclear whether this is because we are using partitionings that are not tested by the search, or partitionings which are dropped by the search due to a bad error estimate.

Improved edge case assignment

Today the algorithm requires at least one texel in each partition. If a partition ends up without a texel we assign the last texel in the block to it (repeating this N times if needed). This choice isn't based on any actual error metric, so there is a good chance that assignment is a bad choice. We should track the texels which are "worst" and assign them to the empty partition.
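
A minimal sketch of that idea, assuming a per-texel error estimate against the current cluster centers is available; the helper names are hypothetical.

def fill_empty_partition(assignment, texel_error, empty_partition):
    # Move the worst-fitting texel into the otherwise empty partition,
    # rather than blindly taking the last texel in the block.
    worst_texel = max(range(len(assignment)), key=lambda t: texel_error[t])
    assignment[worst_texel] = empty_partition
    return assignment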

Try clustering endpoints not whole partitions

Today we make 3 clusters for 3 partitions. We could instead make 6 clusters as endpoint approximators, and then try to find good pairings. The cluster centers themselves would be bad endpoints as they will truncate, but a lerp line passing through each pair of them might be interesting.

Good pairings could be encoding-aware:

  • Prefer endpoint pairings that are stronger for RGBS.
  • Prefer endpoint pairings that are all of similar distance apart (tend to have similar quant requirements).

Some of the data we need for determining good pairings could actually be good input into later selection passes. For example, if this pass can tell us ahead of time that RGBS is going to be terrible we can just skip trying it later.
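
A brute-force pairing sketch under those heuristics (illustrative only; it scores pairings purely on similarity of endpoint separation and ignores the RGBS preference; cluster centers are assumed to be color vectors):

from itertools import permutations
import numpy as np

def pair_endpoint_clusters(centers):
    # Pair 2k cluster centers into k endpoint pairs, preferring pairings whose
    # pair separations are similar so the quant requirements stay similar.
    centers = [np.asarray(c, dtype=float) for c in centers]
    count = len(centers)
    best_pairs, best_score = None, float("inf")
    for perm in permutations(range(count)):
        pairs = [(perm[i], perm[i + 1]) for i in range(0, count, 2)]
        lengths = [np.linalg.norm(centers[a] - centers[b]) for a, b in pairs]
        score = max(lengths) - min(lengths)
        if score < best_score:
            best_pairs, best_score = pairs, score
    return best_pairs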

Improved texel assignment

Clustering using kmeans selects texels based on distance to the cluster center, tending to make blobs of texels rather than lines of texels. Using kmeans is probably OK for a broad classification, but today we also use it for detailed texel assignment to partitions. The partition that is chosen by texel distance to a cluster center is not necessarily the same partition that would be chosen based on distance to the projected lerp line. Can we choose better partition assignments for each texel?
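
A sketch of line-based assignment, assuming each partition already has a candidate endpoint pair from the clustering pass; names and inputs are illustrative.

import numpy as np

def distance_to_segment(color, ep0, ep1):
    # Squared distance from a texel color to the lerp line segment ep0..ep1.
    axis = ep1 - ep0
    t = np.clip(np.dot(color - ep0, axis) / max(float(np.dot(axis, axis)), 1e-12), 0.0, 1.0)
    return float(np.sum((ep0 + t * axis - color) ** 2))

def assign_by_line(texel_colors, endpoints0, endpoints1):
    # Assign each texel to the partition whose projected lerp line is closest,
    # rather than to the nearest cluster center.
    ep0 = [np.asarray(e, dtype=float) for e in endpoints0]
    ep1 = [np.asarray(e, dtype=float) for e in endpoints1]
    return [min(range(len(ep0)),
                key=lambda p: distance_to_segment(np.asarray(color, dtype=float), ep0[p], ep1[p]))
            for color in texel_colors]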

Special case constant color handling

For blocks with high amounts of constant color we may be able to free up bits by fusing two constant color partitions together, provided that the magnitude of the endpoint diff is similar to the other partitions (this keeps the quant requirements similar). Not always a good idea though - might disallow RGBS encoding if there is a significant chroma skew, for example.
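
One possible check, sketched under assumed inputs (per-partition texel colors in [0, 1] and the other partitions' endpoint diffs); the tolerance is arbitrary and not taken from the compressor.

import numpy as np

def can_fuse_constant_partitions(colors_a, colors_b, other_endpoint_diffs, tolerance=1.5):
    a = np.asarray(colors_a, dtype=float)
    b = np.asarray(colors_b, dtype=float)
    # Both partitions must be near-constant before fusing is worth considering.
    if np.ptp(a, axis=0).max() > 1e-3 or np.ptp(b, axis=0).max() > 1e-3:
        return False
    # Only fuse if the fused endpoint diff stays comparable to the other
    # partitions' diffs, keeping the quant requirements similar.
    fused_diff = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
    mean_other = np.mean([np.linalg.norm(np.asarray(d, dtype=float)) for d in other_endpoint_diffs])
    return fused_diff <= tolerance * mean_other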

Color heuristics for choosing 4 partitions

The biggest reason to use 4 partitions is chroma shifts that cannot be managed by a "lerp" between two existing chroma values. Unexpected high-intensity chroma shifts also stand out perceptually in images (even though luma is more important overall). Could we determine the need for multiple partitions based on the area of the bounding box in HS(L) space?
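
A heuristic sketch in that direction; the colour-space handling (hue wrap-around is ignored) and the threshold are assumptions for illustration only.

import colorsys

def chroma_bbox_area(texel_colors_rgb):
    # Bounding-box area of the block in (hue, saturation) space, colors in [0, 1].
    hls = [colorsys.rgb_to_hls(r, g, b) for r, g, b in texel_colors_rgb]
    hues = [h for h, _, _ in hls]
    sats = [s for _, _, s in hls]
    return (max(hues) - min(hues)) * (max(sats) - min(sats))

def consider_four_partitions(texel_colors_rgb, threshold=0.05):
    # Only consider 4 partitions when the chroma spread is large.
    return chroma_bbox_area(texel_colors_rgb) > threshold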

Early out on bad options

We currently run every partition trial to completion. We should look at when it is possible to early-out of trials that are clearly bad options.
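
A minimal early-out sketch, assuming the per-partition trial can report its accumulating error and abort once it exceeds the best result so far; the callable and its abort contract are illustrative.

def search_partitions(candidates, trial_error):
    # Try candidates in ranked order; a trial returns None if it aborted early
    # because its accumulating error already exceeded the best seen so far.
    best_error, best_candidate = float("inf"), None
    for candidate in candidates:
        error = trial_error(candidate, abort_above=best_error)
        if error is None:
            continue
        if error < best_error:
            best_error, best_candidate = error, candidate
    return best_candidate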

@solidpixel solidpixel added this to the 3.0 milestone Feb 20, 2021
@solidpixel solidpixel removed this from the 3.0 milestone Jun 9, 2021

solidpixel commented Oct 24, 2022

Experiment - introducing partitioning centile tables

A lot of the potential partitioning options do get used, but there is a long tail of low-value partitioning indices. This experiment looked at marking the long-tail partitioning indices above the Nth centile as invalid, reducing the number of partitionings tested.

Note: The centile table for this experiment was built using only the Kodak suite as input, so it's biased towards photographic images. We'd need to consider use of a wider input set for a real production build (in particular hard-body textures with sharp edges).

Testing with 4x4 blocks shows it is definitely possible to get performance improvements (2-6% with the 98th centile, 3-8% with the 95th centile), and even some image quality improvements on Kodak (neutral on the Khronos tests). This needs testing more widely with other block sizes.

This mostly seems to benefit high-bitrate encodings - the equivalent table for 6x6 only gets a minor improvement (0-2%). Interestingly, the 6x6 table seems to work "reasonably" for 4x4 too, so presumably there is a lot of overlap in "good" indices.
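
For reference, a sketch of how such a table might be derived offline, assuming per-index usage counts from a corpus run; the inputs are an assumption about the experiment's setup, not the actual tooling.

def build_centile_table(usage_counts):
    # Map each partitioning index to the cumulative usage centile at which it
    # appears when indices are ranked most-used first.
    total = sum(usage_counts.values()) or 1
    ranked = sorted(usage_counts.items(), key=lambda kv: kv[1], reverse=True)
    table, cumulative = {}, 0
    for index, count in ranked:
        cumulative += count
        table[index] = 100.0 * cumulative / total
    return table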
