The timing on this is notable: it came right out after the earlier news of the day, clearly an attempt to overshadow it and prove that OpenAI is still the world leader in AI.
—
For the new LLM models, the intractable problem seems to be *context*. To generate a whole novel with consistent context, you'd need to tokenize all the previous text and feed it back in when generating the next chunk. Attention cost grows quadratically with context length, which basically kills any significantly large content generation.
https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/#gemini-15
The other huge AI news of the day is that Google's new AI has a context window of 1 million tokens. Not unlimited, but it's enough to hold the whole of War and Peace with room to spare, so no, you can already fit an entire novel into the context window.
I won't pretend the context window is solved or anything, but it's in no way an impassable wall.
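As a rough sanity check on the "fits a novel" claim, here's a back-of-envelope sketch. The ~1.3 tokens-per-word figure is my assumption (it varies by tokenizer and text), not something from the announcement:

```python
# Back-of-envelope: does a novel fit in a 1M-token context window?
# Assumption: English prose averages roughly 1.3 tokens per word
# for typical BPE tokenizers (varies in practice).
TOKENS_PER_WORD = 1.3
CONTEXT_WINDOW = 1_000_000

def estimated_tokens(word_count: int) -> int:
    """Rough token estimate for a text of the given word count."""
    return int(word_count * TOKENS_PER_WORD)

typical_novel = estimated_tokens(100_000)   # a ~100k-word novel
war_and_peace = estimated_tokens(587_000)   # War and Peace, ~587k words

print(typical_novel, typical_novel < CONTEXT_WINDOW)   # a typical novel fits many times over
print(war_and_peace, war_and_peace < CONTEXT_WINDOW)   # even War and Peace fits
```

By this estimate an ordinary novel uses only a small fraction of the window, which is why the quadratic-attention cost, while real, is no longer the hard wall it looked like a year ago.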
There are of course still other fundamental issues preventing human-equivalent AI (most notably that these models don't change their neural nets in response to new info and thus can't truly "learn" outside of training), but given time I have little doubt those will be solved as well, with varying degrees of effectiveness.
You and I might have different definitions of "high-quality"... all of those videos being shown off there have serious flaws. Still, it'll easily be able to replace those weird poorly-animated pharmaceutical commercials, for a start.
All of the videos being shown off have tells that they aren't real if you look hard and zoom in, but if I just saw some of them in normal circumstances (most notably the dude with the book), they would totally fool me into thinking they were real.
https://youtu.be/NXpdyAWLDas?si=HXqp0dqR3eAYSpwW&t=179
If you compare it to where we were just a single year ago, the difference is staggering. Even one more iteration with this big a jump (or more likely multiple smaller iterations) will probably end up with videos that average people won't be able to tell apart from real footage without some kind of tool.
A while back in a Hearts of Iron thread someone linked a video about creating fake newsreels for a fake 1900s history, and I couldn't help but think that we were witnessing the death of the historical record in real time. I have the same feeling now. Obviously this won't really trick people who are watching out yet, but give it a few more years...
Like what? Give some examples of what GPT-5 could POSSIBLY do that GPT-4 couldn't, besides simply knowing more uber-niche topics. What I'm getting at is that those new use cases, at least for text AI, are not something the average user needs at all.
Write an entire coherent book without any nonsense.
DM a game of D&D for you and your friends in a world it created, while remembering the events of every session.
Be your AI girlfriend/assistant that actually remembers what you tell her for weeks or months. (Though note it will probably be worse at being your girlfriend, since it's likely to be even more on rails than GPT-4 is.)
Not hallucinate.
Be an order of magnitude better at coding (e.g. able to write dozens of lines without problems instead of only ~4 or 5).
Be consistent enough for a business to use it to answer emails without having to worry about it saying something stupid.
Run a dozen characters at once (on a single instance), each with a unique personality, and keep them all distinct.
All of the above are things that a text-only GPT-5 could do without having to dip into non-text input or output, and all of them are things people will want it to do. Adding non-text input and output will only give it *more* it can do.
What I'm getting at by diminishing returns is that at some point, "better" becomes nigh on imperceptible. On some automated tests it might score 30% more, sure. But at what point does the user stop noticing the difference? I don't believe that point is far away at all. The quality gap between GPT-3 and GPT-4 is technically higher than between 2 and 3 (iirc) but they feel much more similar.
I think the point where people won't notice a difference is when they are as good as a human and capable of avoiding any mistake that a human wouldn't make. And even then the gap between high quality human work and low quality human work is immense.
So sure, once they can make My Immortal tier fiction, people might not notice a difference between one version and the next. But even then there is still a huge amount of room to grow: the works of the masters will still be beyond AI, and there is no reason to think anyone will be content with merely that.
But why would they stop there? The corps will keep working until they match the masters, and eventually surpass even them.
I think many of the problems with current AI text generation are more fundamental issues with the AI itself (e.g. limited context windows, lack of fundamental understanding of some concepts). Those same abilities matter for non-text AI as well, so they have every reason to keep improving them, and those improvements will continue even if they decide text generation is good enough and start working on other parts of the AI instead.