AI and imperfect memory
Tropical Foliage, Waterfall. (c) 2017 B. Ellison

The uproar over the rise of AI-generated imagery, text, and music has had me thinking about a few questions recently, and I wanted to share them. I'll focus on AI for visual art -- specifically photography -- rather than the written word or music, but I think much of what I've been thinking applies to all three.

Two related questions came to mind -- is this basically just extremely advanced data compression? And why do we care?

Compression can be split into two general buckets: lossless and lossy. Lossless compression is typically used where exact fidelity matters (think financial spreadsheets, books, or studio recordings). Lossy compression is the opposite: those algorithms typically analyze the data being compressed and discard information that "doesn't matter." With photos, one technique relies on our understanding of human vision and converts several adjacent chunks of somewhat-similar blue to a single blue. Both kinds of compression often also rely on building a "library" of repeated sequences, because referencing a single entry in that library multiple times is smaller than spelling the sequence out every time it occurs.
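To make the two buckets concrete, here is a minimal Python sketch. The pixel values are made up, and `quantize` is a hypothetical helper written for this illustration; real JPG compression is far more sophisticated than this. The lossless half uses Python's standard `zlib` module, whose back-references to earlier data are loosely the "library" idea described above.

```python
import zlib

# Hypothetical 8-bit blue-channel values for a patch of sky (made-up data).
sky = bytes([200, 201, 199, 202, 200, 198, 201, 200] * 32)


def quantize(data: bytes, step: int = 8) -> bytes:
    """Lossy step: snap each value to the nearest multiple of `step`."""
    return bytes(min(255, round(v / step) * step) for v in data)


# Lossless: decompressing gives back exactly the bytes we started with.
packed = zlib.compress(sky)
assert zlib.decompress(packed) == sky

# Lossy: the somewhat-similar blues collapse to a single blue. The result
# is more repetitive, so it tends to compress smaller, but the original
# values are gone for good.
flattened = quantize(sky)
packed_lossy = zlib.compress(flattened)

print("original bytes:       ", len(sky))
print("lossless packed bytes:", len(packed))
print("lossy packed bytes:   ", len(packed_lossy))
print("distinct blues before:", sorted(set(sky)))
print("distinct blues after: ", sorted(set(flattened)))
```

The point of the sketch is the trade: the lossy version gives up the small variations between neighboring blues in exchange for a smaller file, and nothing in the compressed data can bring those variations back.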

Anyone who has ever browsed the internet or listened to music on an electronic device is likely familiar with the JPG and MP3 formats, both of which typically use lossy compression. And broadly speaking, not only are we familiar with it, we accept it. Most of us have accepted that it is ok for the digital photos we store to be somewhat less than 100% accurate, because we can't visually tell the difference between the original photo and the version after compression.

This blends into my other thought -- why do we care? At a deeper level, why do we take pictures? Generally speaking, we use photography to evoke a feeling, tell a story, or serve as evidence of a particular scenario or event, with or without emotion attached. From the evidentiary perspective, now that we can no longer rely on a photograph as incontrovertible proof of something, we're in an arms race. But for the rest of the reasons we photograph, ask yourself: what are the important parts of a photo?

Think about a good photo from the last vacation you took -- do you remember exactly everything in that picture, or just a few pertinent details? Does every blade of grass or wisp of cloud or the way your shoes are tied in the photo matter? To put it differently, if JPG threw away minor details like how your shoes were laced instead of throwing away color information, would that still be acceptable? Most of us wouldn't be able to tell the difference. In fact, many of us are perfectly ok with the idea of photos being retouched and objects being erased from them. These photos are no longer a 100% true-to-life representation, but we still find merit in them and hold them dear.

To set up our thought exercise, let us pretend that we have an AI model which, in addition to being generally well trained, is also extensively trained on our personal collection of photos. This model is so good that if you asked it for a picture of one of your family members, you couldn't tell the result apart from an actual photo taken with a camera. You would still carry your phone or camera around and take pictures like normal, and upload them to "the cloud." The trick is, rather than being compressed and stored as separate files, your photos are instead fed to the AI model and then either completely discarded or perhaps saved as just a description: "family standing under waterfall in Puerto Rico." When you want to browse your photos, the AI generates them from scratch. Here's a picture of your family standing under a waterfall in Puerto Rico. Your kid has the same lopsided smile or quirky expression that they often do in the pictures, and all the major elements you remember are there. The picture on your computer screen could be an imperfect recreation of the color data captured by however many pixels the manufacturer was able to cram into your camera, then compressed with some low-order bits thrown out. Or it could be a photo created by an AI model based on a deeper understanding of the world, how we perceive it, and the elements you've indicated are important to you. Does it matter which?

Does it matter to you? If so, why?
