Bay 12 Games Forum


Poll

Reality, The Universe and the World. Which will save us from AI?

Reality
- 13 (65%)
Universe
- 4 (20%)
The World
- 3 (15%)

Total Members Voted: 20


Pages: 1 ... 27 28 [29] 30 31 ... 50

Author Topic: What will save us from AI? Reality, the Universe or The World $ Place your bet.  (Read 49667 times)

Maximum Spin

  • Bay Watcher
  • [OPPOSED_TO_LIFE] [GOES_TO_ELEVEN]

I mean, it was originally defined as meaning an impossible inflection directly into undefinable infinity. That was exactly what was intended.
Logged

Strongpoint

  • Bay Watcher

https://openai.com/sora

Wow...

High-quality video arrived sooner than I expected. So many people will lose their jobs... Who will waste money filming an ad if an AI can generate it?
Logged
No boom today. Boom tomorrow. There's always a boom tomorrow. Boom!!! Sooner or later.

Maximum Spin

  • Bay Watcher
  • [OPPOSED_TO_LIFE] [GOES_TO_ELEVEN]

Quote from: Strongpoint
https://openai.com/sora

Wow...

High-quality video arrived sooner than I expected. So many people will lose their jobs... Who will waste money filming an ad if an AI can generate it?
You and I might have different definitions of "high-quality"... all of those videos being shown off there have serious flaws. Still, it'll easily be able to replace those weird poorly-animated pharmaceutical commercials, for a start.
Logged

Strongpoint

  • Bay Watcher

I don't mean that they have reached the point of looking real and can replace human content now. I didn't expect this level of quality to arrive so soon. Even considering that those are hand-picked best generations, we seem to be a few years away from videos that will require careful examination for detecting AI.
Logged

lemon10

  • Bay Watcher
  • Citrus Master

The timing on this is notable because it came right after the earlier news of the day, clearly in an attempt to suppress it and prove that OpenAI is still the world leader in AI.

Quote
For the new LLM models, the intractable problem I think it seems to have is *context*. To generate a whole novel with consistent context, you'd need to tokenize the previous data and feed it in when generating the next chunk. This is an exponential problem, and basically kills any significantly large content generation.
https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/#gemini-15
The other other huge AI news of the day is that Google’s new AI has a context window of 1 million tokens. Not unlimited, but it's still basically two War and Peaces in a row, so no, you can already fit an entire novel into the context window.
I won’t pretend the context window is solved or anything, but it's in no way an impassable wall.
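For a rough sense of why long context is costly without being an exponential wall, here is a minimal sketch. It assumes plain dense self-attention, where every token is scored against every other token, so the number of scores grows with the square of the context length. That is an assumption for illustration only: the linked post does not say how Gemini 1.5 handles attention internally, and the function names are made up for this example.

```python
def attention_pairs(context_tokens: int) -> int:
    """Score entries in one dense self-attention map (assumed: every token vs. every token)."""
    return context_tokens * context_tokens


def relative_cost(new_context: int, old_context: int) -> float:
    """How many times more score entries the new context length needs than the old one."""
    return attention_pairs(new_context) / attention_pairs(old_context)


if __name__ == "__main__":
    baseline = 8_000  # hypothetical baseline context length, chosen only for comparison
    for n in (8_000, 128_000, 1_000_000):
        print(f"{n:>9} tokens: {attention_pairs(n):.3e} scores, "
              f"{relative_cost(n, baseline):.0f}x the baseline")
```

Under this assumption, doubling the context quadruples the score count, which is steep but polynomial rather than exponential.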

There are of course still other fundamental issues preventing human equivalent AI (most notably they don’t change their neural net in response to new info and thus can’t truly “learn” outside of training), but given time I have little doubt they will solve that as well with varying degrees of effectiveness.
Quote from: Maximum Spin
You and I might have different definitions of "high-quality"... all of those videos being shown off there have serious flaws. Still, it'll easily be able to replace those weird poorly-animated pharmaceutical commercials, for a start.
All of the videos being shown off have tells that they aren't real if you look hard and zoom in, but if I just saw some of them in normal circumstances (most notably the dude with the book), they would totally fool me into thinking they were real.
https://youtu.be/NXpdyAWLDas?si=HXqp0dqR3eAYSpwW&t=179
If you compare it to where we were just a single year ago, the difference is staggering. Even one more iteration with this big a difference (or more likely multiple smaller iterations) is probably going to end up with videos that average people won’t be able to tell apart from real videos without using some kind of tool.

A while back in a Hearts of Iron thread someone linked a video about creating fake newsreels for fake 1900s history, and I couldn’t help but think that we were witnessing the death of the historical record in real time. I have the same feeling now. Obviously this won't really trick people who are watching out for it yet, but with a few more years...
Quote from: KittyTac
Like what? Give some examples of what GPT-5 could POSSIBLY do that GPT-4 couldn't, besides simply knowing more uber-niche topics. What I'm getting at is that those new use cases, at least for text AI, are not something the average user needs at all.
Write an entire coherent book without any nonsense.
DM a game of DND for you and your friends in a world that it created while remembering the events of every session.
Be your AI girlfriend assistant that actually remembers what you tell her for weeks or months. (Although note that it will actually be worse at being your girlfriend, since it's likely to be even more on rails than GPT-4 is.)
Not hallucinate.
Be an order of magnitude better at coding (eg. be able to code dozens of lines without problems instead of only ~4 or 5).
Be consistent enough to use for a business in answering your emails without having to worry about it saying something stupid.
Have it run a dozen people at once (on a single instance) with all of them having unique personalities and keeping them all apart.

All of the above are things that a text-only GPT-5 could do without having to dip into non-text input or output. All of them are things that people will want it to be able to do. Adding non-text input and output will only add *more* stuff it can do.
Quote from: KittyTac
What I'm getting at by diminishing returns is that at some point, "better" becomes nigh on imperceptible. On some automated tests it might score 30% more, sure. But at what point does the user stop noticing the difference? I don't believe that point is far away at all. The quality gap between GPT-3 and GPT-4 is technically higher than between 2 and 3 (iirc) but they feel much more similar.
I think the point where people won't notice a difference is when they are as good as a human and capable of avoiding any mistake that a human wouldn't make. And even then the gap between high quality human work and low quality human work is immense.
So sure, once they can make My Immortal-tier fiction, people might not notice a difference between it and the previous version, but even then there is still a large amount of room to grow: at that point the works of the masters will still be beyond AI, and there is no reason to think the companies will be content with merely that.
But why would they stop there? The corps will keep working until they match the masters, and eventually surpass even them.

I think that many of the problems with current AI text generation are more fundamental issues with the AI itself (eg. a limited context window, a lack of fundamental understanding of some concepts), and those issues matter for non-text AI as well. The companies have every reason to keep improving those lacking abilities, and those improvements will continue even if they decide their text generation is good enough and start working on other parts of the AI instead.
« Last Edit: February 16, 2024, 01:19:06 am by lemon10 »
Logged
And with a mighty leap, the evil Conservative flies through the window, escaping our heroes once again!
Because the solution to not being able to control your dakka is MOAR DAKKA.

That's it. We've finally crossed over and become the nation of Da Orky Boyz.

KittyTac

  • Bay Watcher
  • Impending Catsplosion. [PREFSTRING:aloofness]

Yeah this is what I brought up earlier. It depends on if you believe that GPT could ever do any of those things.

I don't. idk what else is there to talk about. I'll change my mind if it somehow does but until then I'm finding it hard to believe it could.

Sora is... interesting. I'll refrain from commenting on it until we have more info about how it works and what are its limitations.
Logged
Don't trust this toaster that much, it could be a villain in disguise.
Mostly phone-posting, sorry for any typos or autocorrect hijinks.

Strongpoint

  • Bay Watcher

Quote
The other other huge AI news of the day is that Google’s new AI has a context window of 1 million tokens. Not unlimited, but it's still basically two War and Peaces in a row, so no, you can already fit an entire novel into the context window.

But a larger context window means a higher chance of hallucinating based on something irrelevant from 500K tokens ago. The problem is not that a huge context window is impossible (that is a matter of memory, computing power, and efficiency); the problem is diminishing returns and hallucinations.
Logged

lemon10

  • Bay Watcher
  • Citrus Master

Quote from: KittyTac
Yeah this is what I brought up earlier. It depends on if you believe that GPT could ever do any of those things.

I don't. idk what else is there to talk about. I'll change my mind if it somehow does but until then I'm finding it hard to believe it could.
Fair enough, we just have to wait and see what they manage over the next few years. As they say, the proof is in the pudding.
Quote
And besides, AI image gen basically plateaued already, for the general use case.
Although this is objectively wrong. Over the past year AI image generation has improved in basically every way, in stuff like optimization, ability to respond to prompts, ability to make a good picture even if you *don't* have any clue how to specify what you want, ability to generate and understand text in images, ability to use existing images as guides for style, ability to use previous images you generate for context, ability to comprehend and generate tricky things like fingers and hands, etc.
All of that is stuff that people care about, and all of it improves the general use case. There is still a ton of stuff to improve on (eg. not even Sora gets hands correct 100% of the time), and to my complete lack of surprise new image generation (Sora, if you pause the video and look at individual frames) seems to have improved even further on what already existed in ways that people will totally care about and that will very much improve the general use case.
E: And yes, newer image generation does just flat-out generate visually better images on average.
« Last Edit: February 17, 2024, 01:58:08 am by lemon10 »
Logged

Strongpoint

  • Bay Watcher

Yes, image models are getting better every day. Progress may be slower but it is there, especially in the area of prompt comprehension.

Also, publicly available image generation models are generalist models or slightly tweaked generalist models. We are yet to see what an image generation model trained to do something specific can do.
Logged

lemon10

  • Bay Watcher
  • Citrus Master

Yeah, I'm looking through the paper now and Sora can generate HD images with resolutions of up to 2048x2048. It still isn't flawless... but some of them kind of are?
Spoiler: Large image
https://openai.com/research/video-generation-models-as-world-simulators
Quote from: Paper
Simulating digital worlds. Sora is also able to simulate artificial processes–one example is video games. Sora can simultaneously control the player in Minecraft with a basic policy while also rendering the world and its dynamics in high fidelity. These capabilities can be elicited zero-shot by prompting Sora with captions mentioning “Minecraft.”

These capabilities suggest that continued scaling of video models is a promising path towards the development of highly-capable simulators of the physical and digital world, and the objects, animals and people that live within them.
That's... uh... sure something. It might even be bigger than the whole video generation thing. Maybe? I'm honestly not quite sure what *exactly* they are saying and what the limits of it are.
---
E: On a different note, over the past few months I've noticed quite a few posts on the internet (eg. here in other threads, reddit) that basically have been going "Well, it looks like this AI stuff is overblown because it hasn't advanced over the last year, and GPT isn't really that big a deal". (And no, I'm not calling out kitty here; they seem to have put a lot more thought into this than most people, at least.)
Which is both A) wrong (basically every company + open source has advanced substantially, the only reason that progress seems even somewhat static is because the most advanced company was hiding their progress) and B) even if there had been no advances, it's still such a crazy take to me.
It's basically them saying that since there wasn't a categorical, epoch-altering change in the human condition in the last six months, the technology is dead and we don't have to worry about it that much. I do really, really hope they are right but...
« Last Edit: February 17, 2024, 04:12:52 am by lemon10 »
Logged

KittyTac

  • Bay Watcher
  • Impending Catsplosion. [PREFSTRING:aloofness]

Quote from: lemon10
Quote from: KittyTac
Yeah this is what I brought up earlier. It depends on if you believe that GPT could ever do any of those things.

I don't. idk what else is there to talk about. I'll change my mind if it somehow does but until then I'm finding it hard to believe it could.
Fair enough, we just have to wait and see what they manage over the next few years, as they say, the proof is in the pudding.
Quote
And besides, AI image gen basically plateaued already, for the general use case.
Although this is objectively wrong. Over the past year AI image generation has improved in basically every way, in stuff like optimization, ability to respond to prompts, ability to make a good picture even if you *don't* have any clue how to specify what you want, ability to generate and understand text in images, ability to use existing images as guides for style, ability to use previous images you generate for context, ability to comprehend and generate tricky things like fingers and hands, etc.
All of that is stuff that people care about, and all of it improves the general use case. There is still a ton of stuff to improve on (eg. not even Sora gets hands correct 100% of the time), and to my complete lack of surprise new image generation (Sora, if you pause the video and look at individual frames) seems to have improved even further on what already existed in ways that people will totally care about and that will very much improve the general use case.
E: And yes, newer image generation does just flat-out generate visually better images on average.
I meant newer as in "latest half of past year" really. Yes it got more convenient. No it didn't get better, in terms of quality and being less obviously AI, from what I have seen. Which is what I meant.

Quote from: lemon10
Yeah, I'm looking through the paper now and Sora can generate HD images with resolutions of up to 2048x2048. It still isn't flawless... but some of them kind of are?
Spoiler: Large image
https://openai.com/research/video-generation-models-as-world-simulators
Quote from: Paper
Simulating digital worlds. Sora is also able to simulate artificial processes–one example is video games. Sora can simultaneously control the player in Minecraft with a basic policy while also rendering the world and its dynamics in high fidelity. These capabilities can be elicited zero-shot by prompting Sora with captions mentioning “Minecraft.”

These capabilities suggest that continued scaling of video models is a promising path towards the development of highly-capable simulators of the physical and digital world, and the objects, animals and people that live within them.
That's... uh... sure something. It might even be bigger than the whole video generation thing. Maybe? I'm honestly not quite sure what *exactly* they are saying and what the limits of it are.
---
E: On a different note, over the past few months I've noticed quite a few posts on the internet (eg. here in other threads, reddit) that basically have been going "Well, it looks like this AI stuff is overblown because it hasn't advanced over the last year, and GPT isn't really that big a deal". (And no, I'm not calling out kitty here; they seem to have put a lot more thought into this than most people, at least.)
Which is both A) wrong (basically every company + open source has advanced substantially, the only reason that progress seems even somewhat static is because the most advanced company was hiding their progress) and B) even if there had been no advances, it's still such a crazy take to me.
It's basically them saying that since there wasn't a categorical, epoch-altering change in the human condition in the last six months, the technology is dead and we don't have to worry about it that much. I do really, really hope they are right but...
One of their videos has been discovered to be 95% source material with some fuzzing. This is hype.
Logged

Robsoie

  • Bay Watcher
  • Urist McAngry

Is the Sora AI creating those from actual scratch (well, from its training), or is it doing a video2video (I mean each frame of an existing video processed by an AI in the desired/prompted style) like the guys from Corridor Digital did with "Rock, Paper, Scissor" a year ago?
https://www.youtube.com/watch?v=GVT3WUa-48Y
« Last Edit: February 18, 2024, 12:55:43 pm by Robsoie »
Logged

McTraveller

  • Bay Watcher
  • This text isn't very personal.

That person in the Sora image has too many lips.
Logged
This product contains deoxyribonucleic acid which is known to the State of California to cause cancer, reproductive harm, and other health issues.

Robsoie

  • Bay Watcher
  • Urist McAngry

It's a change from the usual "too many fingers" from AI :D
Logged

Starver

  • Bay Watcher

Quote from: Robsoie
Is the Sora AI creating those from actual scratch (well, from its training), or is it doing a video2video (I mean each frame of an existing video processed by an AI in the desired/prompted style) like the guys from Corridor Digital did with "Rock, Paper, Scissor" a year ago?
https://www.youtube.com/watch?v=GVT3WUa-48Y
When I earlier had a look at the Sora examples (on the main link given, the other day), various revealing errors were... revealing.

Take the Dalmatian at the 'ground' floor window (it wasn't that, much as the cat never got fed treats by the man in the bed, and the rabbit-squirrel never looked up at the fantasy tree): it was clearly a reskinned cat video. A cat making some windowsill-to-windowsill movement (not something even asked for in the Prompt text), reskinned with the body of the desired breed of dog (but still moving like a cat), rendered over the sort-of-desired background (windows of the appropriate types, if not position). And the notable folded-out shutter absolutely does not impede even the cat-footed dog's movement across it.

It'll have its ultimate roots in the morphing algorithms that I was (manually) using back in the '90s. Improvements by context-(semi-)aware AI configuration of the layered image recomposition, rather than painstaking manual 'meshing' and merging. The footage store clearly didn't have the right kind of windows, couldn't supply the dog (or baseline cat) just obeying the Prompt text instructions, did not have the things in its preprocessing display stock to give the necessary background (ground-level) street-stuff requested. But, with what it had, it managed to blend together something that might have been done differently by a human, but exquisitely meticulously as far as its 3-to-3.5D (X,Y,t, with rudimentary understanding of the third spatial dimension) rendering was able.
Logged