Bay 12 Games Forum

Pages: 1 ... 95 96 [97] 98 99 ... 180

Author Topic: Random thoughts - On the Origins of "I Could Eat A Horse"  (Read 218987 times)

Frumple

  • Bay Watcher
  • The Prettiest Kyuuki
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1440 on: April 08, 2021, 10:14:41 pm »

... why isn't five million output images feasible, again? Those words read to me like saying it's not feasible for several million word works of fiction to exist, tbh. Maybe it isn't, but that doesn't stop people :V
Logged
Ask not!
What your country can hump for you.
Ask!
What you can hump for your country.

methylatedspirit

  • Bay Watcher
  • it/its
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1441 on: April 08, 2021, 10:35:59 pm »

Consider that I'm working with raw image data, in 24-bit RGB. Uncompressed raw image data. The image dimensions are 2000 x 3000. This means:
Size per raw image = 2000 px * 3000 px * 3 bytes per pixel = 18,000,000 bytes = 18 MB per image.

So, uh, 18 megs times 5 million would be 90 terabytes, if I've understood my SI prefixes correctly. That's feasible for a large server, completely inaccessible for the average Joe. Imagine how much you'd pay in hard drive space!
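
For anyone who wants to check the arithmetic, here is a minimal Python sketch. The function and constant names are made up for illustration (they are not from the script mentioned in this thread), and it assumes decimal SI prefixes, i.e. 1 TB = 10^12 bytes.

```python
# Back-of-the-envelope storage estimate for uncompressed 24-bit RGB frames.
# All names here are illustrative; none come from the original script.

BYTES_PER_PIXEL = 3  # 24-bit RGB: one byte each for R, G and B


def raw_image_bytes(width: int, height: int) -> int:
    """Size in bytes of one uncompressed 24-bit RGB image."""
    return width * height * BYTES_PER_PIXEL


def total_terabytes(width: int, height: int, count: int) -> float:
    """Total size of `count` such images in decimal terabytes (10**12 bytes)."""
    return raw_image_bytes(width, height) * count / 10**12


if __name__ == "__main__":
    print(raw_image_bytes(2000, 3000) / 10**6, "MB per image")
    print(total_terabytes(2000, 3000, 5_000_000), "TB for 5 million images")
```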

Hell, even with the lower bound of [128, 500], you'd still need 2 and a bit terabytes. Maybe you have the budget for a 4 TB hard drive. I don't.

But, hey, if you wanna try, I've patched my Python script to iterate over resolutions instead of bitrates! You'll need a copy of FFmpeg in PATH and a 2000x3000 raw image as input. Oh, and you'll still need an image editor that will support opening raw image data to open the output files. I suggest Irfanview.

Spoiler (click to show/hide)
« Last Edit: April 08, 2021, 10:39:39 pm by methylatedspirit »
Logged

King Zultan

  • Bay Watcher
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1442 on: April 09, 2021, 02:35:20 am »

So what exactly is the endgame for this whole project?
Logged
The Lawyer opens a briefcase. It's full of lemons, the justice fruit only lawyers may touch.
Make sure not to step on any errant blood stains before we find our LIFE EXTINGUSHER.
but anyway, if you'll excuse me, I need to commit sebbaku.
Quote from: Leodanny
Can I have the sword when you’re done?

methylatedspirit

  • Bay Watcher
  • it/its
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1443 on: April 09, 2021, 04:42:51 am »

There is none! Just like before. Not every project has to have a satisfying ending; do most things have satisfying endings? Does every movie have a good ending? Do most people die in a blaze of glory? Is every discovery as useful as the Discrete Cosine Transform? Good endings are statistically very, very rare. It's nice if anything I do ends up being significant in the future, but it's not likely.

The metapoint in me documenting these things is that I'll be able to recall elements from these explorations ("project" implies a definite goal) that'll be used later on. Whether it'll bear fruit now is irrelevant. I'm sure Einstein had a couple of duds before coming up with Relativity, but to say that those duds were entirely worthless because they weren't themselves revolutionary is to deny the fact that knowledge is incremental. Rome wasn't built in a day, just like knowledge can't be.

I've basically been guessing at things to do with raw data, and the ones that produce viable results are just lucky. This one... I'll let it incubate in the Idea Machine, freely mixed in with a hundred different ideas. Something will happen in there, and I have no way of knowing what.
Logged

taat

  • Bay Watcher
  • Doesn't notify people when putting them in the sig
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1444 on: April 09, 2021, 07:55:35 am »

You don't actually need to save the files to disk, you can just discard every image that doesn't set a new "noisiness" record (unless you want to compare every image to every other image or something horrible like that)

Granted iterating all 5 million images would probably take several weeks still, and you need an actual objective way to determine noisiness
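
A rough sketch of the "only keep the record holder" idea, in Python. The `noisiness` callable is a placeholder for whatever metric ends up being used, and the demo data is tiny stand-in byte strings rather than real 18 MB frames; both are assumptions for illustration only.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def keep_noisiest(
    candidates: Iterable[T], noisiness: Callable[[T], float]
) -> Tuple[Optional[T], float]:
    """Return the highest-scoring candidate while holding only one at a time."""
    best, best_score = None, float("-inf")
    for candidate in candidates:
        score = noisiness(candidate)
        if score > best_score:  # new record: keep it, let the old one go
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    # Stand-in data: a generator of small byte strings, scored by a
    # placeholder metric (number of distinct byte values).
    frames = (bytes(i % n for i in range(64)) for n in range(1, 10))
    winner, score = keep_noisiest(frames, lambda f: len(set(f)))
    print(score)
```

Because the candidates come from a generator, nothing but the current frame and the current record holder needs to exist at any moment, so disk space stops being the bottleneck and only run time remains.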
Logged
Hypothesis: I will kill all dwarves in a horrible fashion

methylatedspirit

  • Bay Watcher
  • it/its
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1445 on: April 09, 2021, 08:12:51 am »

You don't actually need to save the files to disk, you can just discard every image that doesn't set a new "noisiness" record (unless you want to compare every image to every other image or something horrible like that)

Ayyy! You've moved the problem from "infeasible" to "doable given enough time"!

Granted iterating all 5 million images would probably take several weeks still

I do have a little Intel NUC sitting at home that I can control remotely. I suppose the computational requirement is basically done; that thing's just sitting around doing mostly nothing for months on end. I've been itching to find dumb ways to use all 2.8 GHz of its power. It's "cloud computing". I think I (or someone else; I put it in public domain) could repatch that script to automatically create PNG files and do nice image comparison stuffs; FFmpeg's very versatile, and I've made that script platform-agnostic without even realizing it.

and you need an actual objective way to determine noisiness

I dunno, my first idea was "use SSIM to compare between a random noise image and the current image", but I really don't think that was the intended use for such a metric. The theory there would be "find the image that most closely resembles noise", but I doubt it works at all like that. Not when you're literally pitting noise against slightly-less noisy noise to see if slightly-less noisy noise is noisier than the noisiest noise found thus far.
« Last Edit: April 09, 2021, 08:19:14 am by methylatedspirit »
Logged

Frumple

  • Bay Watcher
  • The Prettiest Kyuuki
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1446 on: April 09, 2021, 09:14:06 am »

Who cares about intent if it works, though? Might as well try!
Logged

taat

  • Bay Watcher
  • Doesn't notify people when putting them in the sig
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1447 on: April 09, 2021, 09:41:01 am »

Now that I think of it, the best way to figure out how noisy the image is would probably be to use compression, since images with higher entropy/noise are less compressible. Either see how large a losslessly compressed result is (though the fact that the dimensions of the image change with the parameter may skew the result), or see how large the difference between a lossy-compressed image and the original is.

There's still the problem that different compression methods give different results. And the absolute best compression methods are extremely slow
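
A minimal sketch of the lossless variant, using `zlib` from the Python standard library as a stand-in compressor (any other codec would give different numbers, as noted above). Dividing by the original size is my own assumption for getting around the changing-dimensions problem; the function name is hypothetical.

```python
import os
import zlib


def compressed_ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size; closer to 1.0 means noisier."""
    if not data:
        raise ValueError("need at least one byte of data")
    return len(zlib.compress(data, level)) / len(data)


if __name__ == "__main__":
    flat = bytes(100_000)        # all zeros: compresses to almost nothing
    noise = os.urandom(100_000)  # random bytes: essentially incompressible
    print(compressed_ratio(flat), compressed_ratio(noise))
```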
Logged

McTraveller

  • Bay Watcher
  • This text isn't very personal.
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1448 on: April 09, 2021, 07:17:58 pm »

I think you can use an FFT or other spectral analysis, which can be very fast (much faster than compression) and get a deterministic metric describing the noise.

For an FFT, if the levels of all frequencies are the same, that basically means it's random and as noisy as you can get.  If you have peaks at one or more frequencies, that means it's not random.

You could probably do the FFT, take the standard deviation of the levels across frequency, and the lower it is, the noisier the image.
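
Here is a small, hedged sketch of that metric in plain Python. It uses a naive O(n^2) DFT instead of a real FFT so it needs no third-party library, works on a 1-D list of samples rather than an image, and divides the standard deviation by the mean so that overall brightness does not change the score; that normalization and all the names are assumptions of mine, not part of the suggestion above.

```python
import cmath
import math
import random
import statistics
from typing import List, Sequence


def magnitude_spectrum(samples: Sequence[float]) -> List[float]:
    """Magnitudes of DFT bins 1..n/2 (DC skipped), via a naive O(n^2) DFT."""
    n = len(samples)
    mags = []
    for k in range(1, n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        mags.append(abs(acc))
    return mags


def spectral_spread(samples: Sequence[float]) -> float:
    """Std dev of bin magnitudes over their mean; lower means a flatter spectrum.

    Undefined for an all-zero signal (the mean magnitude would be zero).
    """
    mags = magnitude_spectrum(samples)
    return statistics.pstdev(mags) / statistics.mean(mags)


if __name__ == "__main__":
    n = 256
    rng = random.Random(0)
    tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    print("tone :", spectral_spread(tone))
    print("noise:", spectral_spread(noise))
```

Even pure white noise does not score exactly zero here, since the bin magnitudes of random data scatter around their mean; the score is only useful for ranking images against each other.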
Logged

methylatedspirit

  • Bay Watcher
  • it/its
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1449 on: April 11, 2021, 04:58:23 am »

I've been thinking of this "uncompressed data conversion" matrix. I've sorta wanted to review it. Here:
Code: [Select]
                            Increasing bandwidth
                       ----------------------------->

                       +-----+-------+-------+-------+
                     | |\    |       |       |       |
                     | | \ In|       |       |       |
                     | |  \  | Audio | Image | Video |
                     | |Out\ |       |       |       |
                     | |    \|       |       |       |
                     | +-----+-------+-------+-------+
                     | |     | Audio | Image | Video |
                     | |Audio|  to   |  to   |  to   |
                     | |     | audio | audio | audio |
Increasing bandwidth | +-----+-------+-------+-------+
                     | |     | Audio | Image | Video |
                     | |Image|  to   |  to   |  to   |
                     | |     | image | image | image |
                     | +-----+-------+-------+-------+
                     | |     | Audio | Image | Video |
                     | |Video|  to   |  to   |  to   |
                     | |     | video | video | video |
                     v +-----+-------+-------+-------+

If we were all to go back to dial-up over POTS lines (the technical term is DS0), we'd end up converting all our data to audio at some point. All data ends up as audio. Hell, even T-whatever and DSL are literally transmitting data over many POTS lines at once, so that too is actually audio at some point. Interesting point, but not the one I'm focusing on.

See, I'm thinking of the idea that any uncompressed data can be converted to any of these formats (text is a valid format, but too-low bandwidth to consider here) with enough prodding. With a lossless encoder, it must always yield a result that's identical to the original when decoded back to its original form, though you may lose a couple bytes converting to video. With a lossy encoder... you'll get interesting results. Even more prodding is required in practice, but I know it can yield viable results. Something that's implied here is that the third and final step is always "> (Original format)". I am not touching that rabbit hole of doing stuff like "Audio to video to image to audio to image to...".

The trivial ones are the X to X ones. These do have some use, but it's mostly what is called "digital generation loss". If you've ever seen those "deep-fried" memes, that's an example of going "Original image > JPEG > JPEG > JPEG > (Superimpose smaller image on top) > JPEG..." until you end up with a suitably-degraded image. 'Video to image' and 'Image to video' are also trivial. That's under "no shit, Sherlock": all videos are series of images; series of images can be compiled into videos.

So that's 5 excluded for being too boring. This leaves 4.

Image to audio is pretty good. Depending on your choice of lossy codec, you could end up with some good effects, though it's a lot of blurring and line-by-line artifacting at lower bitrates.

Video to audio is rather limited, as far as lossy codecs go. Using Bad Apple as a test video (as you do), there really aren't many codecs that run at less than 32 kbps to yield suitable amounts of distortion. Off the top of my head, they are Opus, AAC, Speex, and G.7-something (it's one of the GSM codecs). Then again, I'm using black-and-white footage; it's not really showing off the full power of this technique. God, I need good, freely-licensed test footage. I'm aware of Xiph's collection of lossless footage, so I really should look at it in depth.

Audio to image... I tried it once, ghastly results. I should really revisit it; maybe screwing with royalty-free music might be seen as "ironic", even if I haven't a clue what I'm doing.

Audio to video is also pretty good. All my ears can hear is quantization noise; it's great! Seems to need bitrates below 32 kbps, and there aren't many video codecs able to operate at this low a bitrate. Then again, I could always increase the internal resolution to evade this issue entirely. It's good as long as you intend it to be audio at the end. Just 'audio to video' yields something like:

Logged

methylatedspirit

  • Bay Watcher
  • it/its
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1450 on: April 13, 2021, 08:02:30 pm »

The thing about trying to degrade raw image/video data using lossy audio-based methods (recursive re-recording, lossy audio codecs) is that color is the first thing that dies out. Always. Doesn't matter which color space you use.

In RGB-land (and other color-component-separated color spaces), the R, G and B components in an image/video are almost always strongly correlated, at least when working with footage of things in real life. If you were to define the average RGB component, Av = (R + G + B) / 3, then my strong suspicion is that the values Av - R, Av - G, and Av - B are very "weak" relative to the magnitude of Av. As such, the first things to go are the colors, since they're the weakest part of the "signal". Those are the parts that tend to get "smoothed-off" first, as it is. Even if each component is processed individually (and they are), color is still the first thing to go.
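
To make that concrete, here is a tiny Python sketch, taking Av to be the per-pixel mean (R + G + B) / 3. The pixel values in the demo are made up for illustration and are not measurements from any real footage.

```python
from typing import Tuple


def channel_deviations(r: float, g: float, b: float) -> Tuple[float, Tuple[float, float, float]]:
    """Return (Av, (Av - R, Av - G, Av - B)) for one pixel, with Av the mean."""
    av = (r + g + b) / 3
    return av, (av - r, av - g, av - b)


if __name__ == "__main__":
    # Hypothetical, fairly desaturated pixel values.
    for pixel in [(200, 180, 170), (90, 100, 85), (30, 32, 40)]:
        av, dev = channel_deviations(*pixel)
        print(pixel, "Av =", round(av, 1), "deviations =", [round(d, 1) for d in dev])
```

The three deviations always sum to zero, so they carry only the "which way does the color lean" information, and for desaturated pixels they are small next to Av.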

In YUV-land (or any color space that defines a grayscale component and a set of chroma components), a different problem arises. The grayscale component is the "sharpest" signal. That's where your detail is. The chroma parts are very "blurry", as it is; the color is painted on top of a grayscale image. When you process this, you end up losing the already-blurry chroma signal first. This is why VHS tapes lose their color long before they lose their grayscale if you induce generation loss by recursively re-recording tapes. There's simply very little detail to preserve in the color components. Don't blame them for this, blame evolution. Our eyes have good grayscale perception, but weak color perception. As such, blurry color is always harder to spot than blurry grayscale.

So at this point, does there exist a color space that doesn't suffer from either issue? If there is, there wouldn't be much incentive to support it; any such 'space simply doesn't follow human visual perception, nor does it have the convenience that RGB offers for our tech. Still, I'd consider a color space in which the grayscale and color parts degrade at roughly the same rate an interesting thought experiment, if such a thing can exist. Send me the encoder and decoder for that if you make one. I wanna see that shit.
Logged

bloop_bleep

  • Bay Watcher
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1451 on: April 14, 2021, 01:35:22 am »

The deviations from average brightness are high frequency and low amplitude, so I can see why they could get shaved off. You could just try encoding the color channels separately and then interleaving.
Logged
Quote from: KittyTac
The closest thing Bay12 has to a flamewar is an argument over philosophy that slowly transitioned to an argument about quantum mechanics.
Quote from: thefriendlyhacker
The trick is to only make predictions semi-seriously.  That way, I don't have a 98% failure rate. I have a 98% sarcasm rate.

methylatedspirit

  • Bay Watcher
  • it/its
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1452 on: April 14, 2021, 04:54:46 am »

So at this point, does there exist a color space that doesn't suffer from either issue?

I'll respond to myself. I have tentative evidence to say that there likely does exist such a color space, and I sure as hell didn't expect it to be widely available. This is a frame from Big Buck Bunny, encoded in YUV 4:2:0, and pushed through the reference Opus audio codec (libopus) at 8 kbps. The wraparound isn't intentional, but I think it adds to the VHS-like aesthetic. The ones next to it are the same thing using YUV 4:4:4, again with 24-bit RGB, and the original.



For the YUV color spaces:
Extreme loss of detail on all channels, check.
Colors retained despite that, check.
Unintentional artificial VHS look, check.

Oh, and the attribution? "The movie frames shown above are (c) copyright 2008, Blender Foundation / www.bigbuckbunny.org". Check. (Hey, between this and paying someone to use their stuff, I'd much prefer this)

Edit: I tentatively take it back. YUV spaces tend to do quite well with color preservation, at least when working with video and using lossy audio codecs. RGB 24-bit suffers from sudden color death. For Big Buck Bunny over there, the point of sudden color death is between 20 kbps and 20.5 kbps with libopus.

I wonder if 'abuse of lossy audio codecs' and 'generation loss by recursive re-recording' actually behave differently with respect to color spaces and color preservation under their respective forms of degradation.

The deviations from average brightness are high frequency and low amplitude, so I can see why they could get shaved off. You could just try encoding the color channels separately and then interleaving.

Wait, isn't that what interleaved RGB is? I'm having trouble trying to interpret what you mean. If I just induce exponential generation loss on this here innocent picture...



And those are, left-to-right, the original, pass 1, pass 2, pass 4, pass 8, and pass 16. If you can see the hue shifting a bit, your eyes aren't fooling you. Interleaved images are pains-in-the-ass to deal with, precisely for this reason; they can sometimes shift hue by 30 degrees (rather than the 60 you'd expect from byte misalignment), which I didn't believe until I put some calibration bars on an earlier image.

It's doing pretty good color-wise given that's 16 iterations. If that's what you meant, then I'd agree.

Here's one I "baked" earlier: a planar RGB image. These are the original, pass 1, 2, and 3.



It's not even on an exponential scale, and the color's still fading so damn hard. It's almost disheartening to see the color fade so quickly. A shame, too. I love working with planar images because they don't suffer from the hue-shifting issue that interleaved images do.
« Last Edit: April 14, 2021, 08:15:53 am by methylatedspirit »
Logged

bloop_bleep

  • Bay Watcher
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1453 on: April 16, 2021, 01:45:29 am »

I mean separate the channels first (uninterleave), encode each, then interleave again.
Logged

methylatedspirit

  • Bay Watcher
  • it/its
    • View Profile
Re: Random thoughts - On the Origins of "I Could Eat A Horse"
« Reply #1454 on: April 16, 2021, 02:52:46 am »

My first thought was:

1. Start with RGB.
2. Split into planes of R, G and B with FFmpeg's extractplanes filter.
3. Use mergeplanes to generate colorized versions of the initially-grayscale planes by using one plane each as input, since extractplanes yields grayscale versions.
4. Encode those components as YUV.
5. Pipe those individually through an audio system, save them. (Or perform some other databending activity)
6. Re-encode those back to RGB.
7. Use extractplanes to extract only the individual components.
8. Use mergeplanes to merge together the individual components back to RGB.
9. Done.

Incredibly convoluted. That's Rube Goldberg-levels of complexity. That's, like, 15 individual FFmpeg operations, probably fewer if you used filtergraphs (which are black magic to me).

That, or you mean:

1. Start with RGB.
2. Split into RGB planes with extractplanes (remember, the output planes are grayscale).
3. Re-encode them back to RGB.
4. Pipe them individually through an audio system.
5. Use extractplanes to extract only the individual components in the RGB-ized components (which themselves represent RGB planes)
6. Use mergeplanes to merge them back to RGB.
7. Done.

I think that's 8 individual FFmpeg operations without filtergraphs.

I'm not sure of any other interpretations. Is this what you meant? "Interleave" is a 'reserved word' to me within this context; to me, it can only ever mean the way in which some pixel formats are stored RGB RGB RGB (...) instead of RRR (...) GGG (...) BBB as planar formats are stored (replacing R, G and B with the appropriate components in the general case). "Packed" is also reserved for this reason. That's why I thought you meant interleaved RGB.
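
To pin down the terminology, here is a minimal Python sketch of the two layouts on raw bytes, with no FFmpeg involved. The function names are hypothetical, and it assumes 8 bits per component with no row padding.

```python
def packed_to_planar(packed: bytes, channels: int = 3) -> bytes:
    """RGB RGB RGB ... -> RRR... GGG... BBB... (uninterleave)."""
    if len(packed) % channels:
        raise ValueError("length must be a multiple of the channel count")
    # Every `channels`-th byte, starting at offset c, belongs to plane c.
    return b"".join(packed[c::channels] for c in range(channels))


def planar_to_packed(planar: bytes, channels: int = 3) -> bytes:
    """RRR... GGG... BBB... -> RGB RGB RGB ... (interleave)."""
    if len(planar) % channels:
        raise ValueError("length must be a multiple of the channel count")
    plane_len = len(planar) // channels
    out = bytearray(len(planar))
    for c in range(channels):
        # Scatter plane c back into every `channels`-th byte of the output.
        out[c::channels] = planar[c * plane_len:(c + 1) * plane_len]
    return bytes(out)


if __name__ == "__main__":
    packed = b"RGBRGBRGB"  # letters stand in for component bytes
    planar = packed_to_planar(packed)
    print(planar)                    # b'RRRGGGBBB'
    print(planar_to_packed(planar))  # b'RGBRGBRGB'
```

Under this reading, "separate the channels, encode each, then interleave again" would mean slicing the planar buffer into its three planes, sending each plane through the audio codec on its own, and calling `planar_to_packed` on the results.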
Logged