It seems that you're somewhat out of touch with the modern state-of-the-art neural networks. Nothing they do is actually "beyond the presumptions of the programmer" (after all, they don't actually change their own code or anything weird and unpredictable like that), but they already do some seriously powerful stuff. And this is just the beginning of the intelligence revolution.
Very interesting, but I don't see any AI in that.
I also missed the bit about why the author wanted to analyse/compare fonts, so perhaps I missed a previous blog/whatever about the actual aims and expectations. The most interesting part to me was the morphing between cases [1], although you must admit that some of those intermediate forms are effectively unusable (e.g. g to G on that static grid can clearly be seen to pass through forms indistinguishable from a rough single-story minuscule 'a'), so it shares that quality with standard image-morphing techniques [2]. I see no 'understanding'; it's just a fancy (and impressive!) statistical analysis, then poked and prodded by the human in charge.
Unless, again, I miss something.
[1] Here credited to manually selecting fonts with a small-caps paradigm to morph between, but it would have been as easy to ask the analysis to produce vectors from all a-z forms to A-Z forms, and maybe also accented forms (ä into Ä, etc.) and ligatures (æ into Æ), assuming that wasn't beyond the envisaged capabilities of the font-comparer because of the 'fiddly bits'.
[2] In which the more unguided the initial 'gridding' is, prior to the cross-fading of intermediate distortions, the more it sends you through an intermediate that is not just an uncanny valley but actually 'shows your working'. Whereas, with understanding and intelligence behind the triangulations, not only can (say) the legs of the source cat be engineered to become the wheels of the target vehicle but, with effort, the warps can 'wrap' the feet around and 'hide' the offside leg-wheels behind the descending body, which will fully obscure them in a 'believable' way. But that would need one helluva image-recognition algorithm to determine and enact without an actual operator/artist in charge.