Two Minute Papers:
TecoGAN, the latest NN for upscaling crappy video.
This kind of thing has several implications. First, put old VHS footage through it and get 'remastered' versions of old shows even where the originals are lost. Second, better compression: squish the crap out of your video based on how the detail-enhancer actually works. But just making video look how it should have looked is clearly only the start. What if you could teach it what a good cartoon is meant to look like, then feed shittily made cartoons into it? Would it spit out a better version that never existed? How much better than the original could it look? Imagine an NN that basically redraws a cartoon while keeping the general parameters of the unique art style intact, and adds things like decent shading and more of a cel-shaded, 3D-ish feel.
Just upscaling with fake detail is the start; move a little beyond that and NNs can start creatively enhancing works.
See the Two Minute Papers episode on animating the Mona Lisa for an early example of that. Think of this: a tool that could animate any drawing anyone had ever made. If, say, a character only existed as book cover art, you could feed that into an algorithm and get a 3D, animatable character you could put in your own fan movies and stuff.
Also, for reference from Computerphile,
Vectoring Words (Word Embeddings). This stuff is more low-level and technical than the stuff in Two Minute Papers, which is all about the wow factor, but vectoring words is something that's really easy to do at home (it's word2vec-style embeddings, by the way, not a Transformer). The general idea is the same as an autoencoder: take a large set of inputs, use the same set as the targets, and train an NN to map each input back to itself (the output equals the input). The trick is that in the middle, the data gets squeezed through a very narrow layer, and the values at that layer are the vector of numbers we're after.
In the video he shows how you can "vectorize" words, then manipulate the vectors in surprisingly powerful ways. For example, take the vector for 'King', subtract the vector for 'Man', add the vector for 'Woman', and the closest vector is 'Queen'. Take 'Moo', subtract 'Cow', add 'Pig', and the vector you get is closest to 'Oink'. And many others. It basically lets you do arithmetic on concepts. And all of this comes just from feeding data in, with no hand-built model of language.
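You can try that word arithmetic at home in a few lines. Here's a minimal sketch using gensim's downloader and pretrained GloVe vectors (not the exact model from the video, but the same kind of thing):

```python
# Minimal sketch: word-vector arithmetic with pretrained GloVe vectors via gensim.
import gensim.downloader as api

# Downloads the vectors the first time you run it (roughly 100+ MB).
wv = api.load("glove-wiki-gigaword-100")

# king - man + woman  ->  nearest remaining vector
result = wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # 'queen' should come out on top
```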
For a visual example: if you fed millions of faces into an autoencoder like that and told the network to output the same face, but squeezed the information through a layer only 10 cells wide, the network would boil everything down to 10 factors/sliders that encode what's different about each face. The second half of the network then becomes a face-building algorithm that takes those 10 sliders and works out how to make a face from them. All the information about how faces work in general now lives in the network itself; the vector doesn't contain it. You can then chop off the first half of the network, feed 10 random numbers into the second half, and it'll keep spitting out random faces. This is roughly what This Person Does Not Exist is doing (it actually uses a GAN, StyleGAN, rather than an autoencoder, but the "random numbers in, face out" half is the same idea), and with far more than the 10 sliders in my example.
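To make the "10 sliders" idea concrete, here's a rough PyTorch sketch of a plain autoencoder with a 10-wide bottleneck. It assumes you've already loaded flattened 64x64 greyscale faces into a tensor (the random tensor below is just a placeholder), and it's the autoencoder version of the idea, not the GAN that the website actually uses:

```python
# Sketch of an autoencoder with a 10-number bottleneck ("sliders").
import torch
import torch.nn as nn

BOTTLENECK = 10  # the "10 sliders"

encoder = nn.Sequential(
    nn.Linear(64 * 64, 512), nn.ReLU(),
    nn.Linear(512, BOTTLENECK),
)
decoder = nn.Sequential(
    nn.Linear(BOTTLENECK, 512), nn.ReLU(),
    nn.Linear(512, 64 * 64), nn.Sigmoid(),
)
model = nn.Sequential(encoder, decoder)

faces = torch.rand(1000, 64 * 64)  # placeholder: swap in real face images in [0, 1]

optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(10):
    optimiser.zero_grad()
    reconstruction = model(faces)
    loss = loss_fn(reconstruction, faces)  # the output is trained to equal the input
    loss.backward()
    optimiser.step()

# "Chop off" the first half: feed random slider values straight into the decoder
# and it produces new (blurry, but novel) faces.
with torch.no_grad():
    random_sliders = torch.randn(5, BOTTLENECK)
    new_faces = decoder(random_sliders).reshape(5, 64, 64)
```

A real face generator would use convolutional layers and a much wider bottleneck, but the chop-off-the-decoder trick is exactly the part this sketch shows.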
I'm interested to try something like that out, but I'm looking for a good data set to do it on. One thing I'm thinking of is to put stats for heaps of D&D monsters or similar into an autoencoder like that and make a new monster generator from it; a rough sketch of the data prep is below.
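Here's a sketch of how that could be set up, assuming a hypothetical monsters.csv with numeric columns like hp, ac, str, dex, con, cr. The same encoder/decoder shape as the face sketch above would work, just with the input width set to the number of stat columns instead of 64*64:

```python
# Sketch: preparing a table of monster stats for an autoencoder.
import numpy as np
import pandas as pd

stats = pd.read_csv("monsters.csv")            # hypothetical file of stat blocks
stats = stats.select_dtypes(include="number")  # keep only the numeric stats

# Normalise each column to [0, 1] so the network treats HP and CR comparably
# (assumes no column is constant).
lo, hi = stats.min(), stats.max()
normalised = (stats - lo) / (hi - lo)

data = normalised.to_numpy(dtype=np.float32)   # shape: (num_monsters, num_stats)
print(data.shape)

# After training an autoencoder on `data`, feed random numbers into the decoder
# and un-normalise the output ( * (hi - lo) + lo ) to get brand-new stat blocks.
```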