Bay 12 Games Forum


Poll

Reality, The Universe and the World. Which will save us from AI?

Reality
- 13 (68.4%)
Universe
- 3 (15.8%)
The World
- 3 (15.8%)

Total Members Voted: 19


Pages: 1 ... 28 29 [30] 31 32 ... 42

Author Topic: What will save us from AI? Reality, the Universe or The World $ Place your bet.  (Read 26607 times)

Strongpoint

  • Bay Watcher
    • View Profile

Video generation is way trickier to make usable. Why? Mistakes in output are way harder to fix. Generated text is trivial to edit (both manually and with automated tools), images are somewhat trickier and require more work but are absolutely doable. Fixing video requires so much effort that it may not be practical.
Logged
They ought to be pitied! They are already on a course for self-destruction! They do not need help from us. We need to redress our wounds, help our people, rebuild our cities!

lemon10

  • Bay Watcher
  • Citrus Master
    • View Profile

Quote from: KittyTac
I meant newer as in "latest half of past year" really. Yes it got more convenient. No it didn't get better, in terms of quality and being less obviously AI, from what I have seen. Which is what I meant.
Last half year?
Quote from: Time between GPT models
GPT 1, June 2018
GPT 2, February 2019 (8 months)
GPT 3, May 2020 (15 months)
GPT 3.5, November 2022 (30 months)
GPT 4, March 2023 (4 months)
Now (11 months)
What a strange metric for plateauing. If we used that then LLMs would have plateaued in 2019, 2020, 2021, 2022, 2023 and 2024. Now, if you said "AI text generation plateaued in 2019", that would be obviously wrong, because in fact it has continued to develop at a very significant and rapid rate every year since 2018 (aside from arguably 2021, when OpenAI didn't release a new model).

The same is true for text to image generation. If you stick an unreasonably short timeframe on it (last 6 months (E: You actually seem to be saying last 8 months, with "last half of last year", but that is still way too short a time period)) then sure, there haven't been many fundamental advances. Not none (it can understand and put text in images since Dalle 3 4 months ago), but Dalle 3 isn't a massive leap or anything.
However if you widen the window to a much more reasonable year instead then it very much has advanced. Over that timespan both the average quality and maximum quality have improved. In addition it is now smarter and has in fact reduced obvious "this is an AI" tells (hands, text), which also means yes, it is indeed harder to tell if an image is AI generated.
Now obviously between now and a year ago it hasn't gained the ability to trick people watching or fluent in the technology and still has obvious tells, but there's a pretty huge difference between that and plateauing.

Of course with the events of a few days ago it seems pretty clear that Sora has pushed image generation far further than what existed beforehand, so the idea of image generation having plateaued is obviously wrong. I have little doubt that if there is a claim 8 months from now that image/video generation has plateaued because nothing more advanced than Sora exists, that will be proven wrong as well, given more time.
Quote from: kittytac
One of their videos has been discovered to be 95% source material with some fuzzing. This is hype.
Sauce?
---
Quote
Is the Sora AI creating those from actual scratch (well, from its training), or is it doing a video2video (I mean each frame of an existing video processed by an AI in the desired/prompted style), like the guys from Corridor Digital did with "Rock, Paper, Scissors" a year ago?
https://www.youtube.com/watch?v=GVT3WUa-48Y
Quote from: Sora paper
All of the results above and in our landing page show text-to-video samples. But Sora can also be prompted with other inputs, such as pre-existing images or video. This capability enables Sora to perform a wide range of image and video editing tasks—creating perfectly looping video, animating static images, extending videos forwards or backwards in time, etc.
It can do both, but the ones presented on the main page were text to video.
https://openai.com/research/video-generation-models-as-world-simulators
I do advise people to check out the paper if they are interested in how it works, because it gives quite a bit of detail about both that and what Sora can do in general.
Quote from: Strongpoint
Video generation is way trickier to make usable. Why? Mistakes in output are way harder to fix. Generated text is trivial to edit (both manually and with automated tools), images are somewhat trickier and require more work but are absolutely doable. Fixing video requires so much effort that it may not be practical.
It can do video editing no problem. In fact, for smaller things I suspect it's even easier for it, given that there is already a solid world there to base things on and it doesn't have to come up with one on its own.
I do agree that video generation is way harder though.
The first reason is simply compute (rough numbers sketched below). A 10 second video has 600 frames, which (if done naively) requires 600 times the compute of a single image generation. Longer videos also require the AI to have a longer "memory" to make sure everything stays consistent. There are almost certainly fancy tricks done here to make things cheaper, but it's still got to be hella expensive computation-wise.
The second reason is that it not only needs to understand three dimensions, but also needs to maintain continuity across frames by keeping a consistent model of the 3D environment.
Thirdly it needs to understand time and how things move through time.
Finally it also needs to understand physics, and the physics of every object in the environment, to avoid obviously impossible stuff happening.
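To put rough numbers on the compute point above (a back-of-the-envelope sketch; the per-image cost is a made-up placeholder, and real systems like Sora almost certainly don't literally generate frame by frame):
Code:
# Back-of-the-envelope cost of naive frame-by-frame video generation.
# All numbers are illustrative placeholders, not real Sora figures.
FPS = 60                 # frames per second
SECONDS = 10             # clip length
COST_PER_IMAGE = 0.04    # assumed dollars per generated image (placeholder)

frames = FPS * SECONDS                # 600 frames
naive_cost = frames * COST_PER_IMAGE  # ~600x the cost of a single image
print(f"{frames} frames -> ~${naive_cost:.2f} if each frame cost as much as one image")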

Sora has demonstrated understanding of all of these issues, although there is obviously some way to go (as shown by the videos of more blatant errors they posted, and by the errors visible even in the good videos).
Quote from: Starver
When I earlier had a look at the Sora examples (on the main link given, the other day), various revealing errors were... revealing.

Take the dalmatian at the 'ground' floor window (it wasn't that, much as the cat never got fed treats by the man in the bed, and the rabbit-squirrel never looked up at the fantasy tree); it was clearly a reskinned cat-video. A cat making some windowsill-to-windowsill movement (not something even asked for in the Prompt text), reskinned with the body of the desired breed of dog (but still moving like a cat), rendered over the sort-of-desired background (windows of the appropriate types, if not position). Where the notable folded-out shutter absolutely does not impede even the cat-footed dog's movement across it.
Good catch.
As you say, AI in general has proven completely willing to just rip stuff off if it thinks it's what it wants, even if, as in this case, what it wants isn't exactly what it's been asked for.
Quote
Sora is a diffusion model; given input noisy patches (and conditioning information like text prompts), it's trained to predict the original "clean" patches.
I am quite a bit more skeptical, though, that the algorithm is similar to morphing, even if in some (many? most? nearly all?) cases the end result is similar in that it draws heavily from some video as a framework, because AFAIK that simply isn't how diffusion works at all.
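For what it's worth, here is roughly what the training loop of a generic diffusion model looks like (a heavily simplified sketch of the idea in the quote above, not OpenAI's actual code; the model and its call signature are assumptions). The point is that nothing in it stores or morphs a source clip; the network just learns to turn noise back into clean data:
Code:
import torch

def diffusion_training_step(model, clean_patches, text_condition, optimizer):
    """One generic denoising-diffusion training step (illustrative only).
    `model` is assumed to predict the clean patches from noisy ones,
    conditioned on a text embedding, per the quoted description."""
    # Pick a random noise level for each example in the batch.
    t = torch.rand(clean_patches.shape[0], 1, 1)

    # Corrupt the clean patches with Gaussian noise (simplified schedule).
    noise = torch.randn_like(clean_patches)
    noisy_patches = (1 - t) * clean_patches + t * noise

    # The model is trained to recover the original "clean" patches.
    predicted_clean = model(noisy_patches, t, text_condition)
    loss = torch.nn.functional.mse_loss(predicted_clean, clean_patches)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()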
---
Sora is undoubtedly very expensive and probably requires some of those fancy $20,000+ graphics cards, so I wouldn't be surprised if it cost, say, $10+ per minute to get it to generate a video.
Due to this and the usage restrictions it will have (aka the AI being unwilling to model anything improper/real people/politics, plus big brother OpenAI spying on you), it will probably take quite some time after release for videos to really begin to circulate on the internet.

But in the end, even at $50 per minute it's still way cheaper and faster than, say, hiring your own drone to follow your car down the road or hiring a video firm to make a commercial for you, so companies are totally going to use it even right out of the box.
Logged
And with a mighty leap, the evil Conservative flies through the window, escaping our heroes once again!
Because the solution to not being able to control your dakka is MOAR DAKKA.

That's it. We've finally crossed over and become the nation of Da Orky Boyz.

KittyTac

  • Bay Watcher
  • Impending Catsplosion. [PREFSTRING:aloofness]
    • View Profile

Busy rn just gonna respond to what I have the energy to.
Quote from: lemon10
The same is true for text to image generation. If you stick an unreasonably short timeframe on it (last 6 months (E: You actually seem to be saying last 8 months, with "last half of last year", but that is still way too short a time period)) then sure, there haven't been many fundamental advances. Not none (it can understand and put text in images since Dalle 3 4 months ago), but Dalle 3 isn't a massive leap or anything.
What I meant is that the leaps are getting smaller and smaller, not faster and faster. That's a plateau to me. Which is what I have been trying to get at since like, the start of this argument.
Quote from: lemon10
However if you widen the window to a much more reasonable year instead then it very much has advanced. Over that timespan both the average quality and maximum quality have improved. In addition it is now smarter and has in fact reduced obvious "this is an AI" tells (hands, text), which also means yes, it is indeed harder to tell if an image is AI generated.
Yeah there aren't obvious tells but it still "feels" AI in an I-can't-quite-put-my-finger-on-it way. At least the photorealistic gens. The semi-realistic or cartoony ones, yeah those are very hard to tell, but that's not what I was talking about.
Quote from: lemon10
Now obviously between now and a year ago it hasn't gained the ability to trick people watching or fluent in the technology and still has obvious tells, but there's a pretty huge difference between that and plateauing.

Of course with the events of a few days ago it seems pretty clear that Sora has pushed image generation far further than what existed beforehand, so the idea of image generation having plateaued is obviously wrong. I have little doubt that if there is a claim 8 months from now that image/video generation has plateaued because nothing more advanced than Sora exists, that will be proven wrong as well, given more time.
It did improve AI video making (before it was morphing between different gens and it was extremely jittery), but the quality of the individual frames is... still not good. It's at best between Dalle 2 and 3.
Quote from: kittytac
One of their videos has been discovered to be 95% source material with some fuzzing. This is hype.
Quote from: lemon10
Sauce?
Can't find it rn, I will try later today or tomorrow.
---
Logged
Don't trust this toaster that much, it could be a villain in disguise.
Mostly phone-posting, sorry for any typos or autocorrect hijinks.

Starver

  • Bay Watcher
    • View Profile

Quote from: Starver
When I earlier had a look at the Sora examples (on the main link given, the other day), various revealing errors were... revealing.

[...]

Quote from: lemon10
Good catch.

The dog one was just one of the more obvious 'uncanny valley' videos. On top of reading every Prompt description (to highlight mismatches between question and answer), I went through (almost?) every video and noted key discrepancies in each, which at least said "someone's been video-editing... a very neat somebody, but..." ;)

And not just (or, indeed, that much) in rendering the people/creatures (that were supposed to be realistic). The Nigerian market scene that did some weird compositing of perspectives that might have been improved by somehow having the main group of people already on some sort of raised balcony, rather than set them at a scale-adrift ground-level until the pan-and-twisting view finally gets to view 'over their shoulders' from a raised position at the end. Setting up the shot for real (compositing manually, from a prepared group of separate shoots, probably involving complex pre-programmed motion-capture elements to the camera-dollies/etc) would have avoided the scaling/positioning errors (even though it probably even then would have taken some 3DFX/matte-artists quite some time to blend the shots, and extensive planning to arrange consistent lighting conditions across subshots).

But it's a matter of data entropy. If all you're generating is AI text, then the corpus you use is basically dimensioned in terms of 'original written works' that can be deconstructed and reconstructed from, a whole lot of bytes that the GPT has to recombine to make 'sense' in text terms. Arguably, we've sort of got that, though there's obviously missing information. For AI image-creation, it needs more data-corpus. Let's just say (though this is a very bad underestimate) that it needs a whole 2D image with full 24-bit colour-depth for every paragraph the text-synthesiser would have needed to use, for each to create their own respective internal hints-file. There probably are a lot of images out there, so maybe there are enough to match the demands of feedstock, entropy-for-entropy. (Where a paragraph about something might be written in first-person or third-person, using the passive voice or not, pictures might be real or drawn, have various stylistic choices like using Dutch Angles or other framing devices, and many background elements that need to be algorithmically understood/accounted for in the sourcing process, in order that the right things re-emerge in the asked-for image. c.f. "anime soldier-girl giving a V-sign" vs "girl in a wedding photo giving a V-sign", which might share the core generated features but need to be stylistically different setups and surrounds.)

...For videos, the "paragraph's worth" is a 2D+Time element, a 'temporal-voxel unit', i.e. an arbitrarily long-or-short actual film clip. And it's clear that there just isn't the corpus of moving images required to satisfy everything to the detail of a "Write an essay about relativity" or "A picture of a bicycle leaning against a lamp-post". It can take the 'motion cues' of an early grainy YouTube video and reskin it to have HD resolution and less trouble with compressive artefacts (or vice-versa, when it's asked for an "old film-stock" look, like that one with the spaceman with the red-knitted motorcycle helmet cover... which, for some reason related to the original material, jumps him/the viewpoint up into space from a rather good "of an era" shot of a classic-style scene). But you still need to have film of someone walking to reskin into having someone else walking. And I'm sure that it's a basic misunderstanding of which way round a treadmill works that made that one (unasked-for) use of a treadmill to depict someone actually walking (but staying in place) end up rather silly, with the treadmill clearly operating in reverse. As in, a side-shot of someone walking to the right (which might have been a useful tracking shot of an actual moving walker) was integrated onto a treadmill which is clearly set up so that normal use would have the user walking to the left. A combination of real-world incomprehension and limited feedstock for the algorithm.

Again, these are marvelously meticulous things they could do with whatever they worked from (I'm not as convinced that it's "95% reused footage", as in something that is 95% one actual original but 'tidied up' to put requested elements into the original main scene that were missing from the background/etc), and you're probably looking at far more than face-replacement with other found/generated imagery to make it "not the original person in the original scene", but the sort of thing that made the original cat now conform more to the general idea-of-a-dalmatian in almost all regards. But I think there's a buffer-limit to how much video feedstock there is (even ripping off the whole of YouTube, which all AI companies but one would probably have massive practical issues doing (and possibly legal issues if they do)).


This is just my general impression, you understand. I've not at all calculated the 'relative entropy requirements' of the input data required to satisfy the equally uncalculated degree of output data, but as a hand-wavy summary of the situation it's where I'm going. Making completely fresh audio on demand (not just straight source-audio, re'skinned' to sound like someone else, which has been done) is perhaps less complex than 2D-movie (soundless) generation, but is obviously more than text alone, and I'm not sure where it sits relative to 2D-stills. But the eventual creation of 2D-movie-with-sound will add (at least) a further dimension to the 2.5+D I credited to the mute videos. Specific lip-synching of a re-expressioned head against pre-created sound is simpler, though, and we see that, so maybe there's a shortcut to make the current level of sort-of-good silent moving imagery mesh up with the sort-of-good speech synthesis.
Logged

Maximum Spin

  • Bay Watcher
  • [OPPOSED_TO_LIFE] [GOES_TO_ELEVEN]
    • View Profile

The problem with discussing "quality" is that it's subjective. For example, everyone's most important judgement about AI generation is "can it fool me?", and some people are much more easily fooled than others.

Ultimately, the currently popular AI design model is not capable of producing what KittyTac is talking about (which is also pretty close to what I consider "quality" as well). It's a fundamental limitation of the underlying theory. For example, you ultimately cannot make an LLM that doesn't hallucinate, because hallucination is intrinsic to the process that results in them not just spitting out verbatim corpus texts in the first place. It's effectively a mathematical impossibility, which should be no surprise given that hallucination is so insurmountable a problem that humans do it regularly.
Logged

martinuzz

  • Bay Watcher
  • High dwarf
    • View Profile

ChatGPT briefly went insane. Apparently it has been fixed.

https://garymarcus.substack.com/p/chatgpt-has-gone-berserk
Logged
Friendly and polite reminder for optimists: Hope is a finite resource

We can ­disagree and still love each other, ­unless your disagreement is rooted in my oppression and denial of my humanity and right to exist - James Baldwin

http://www.bay12forums.com/smf/index.php?topic=73719.msg1830479#msg1830479

Starver

  • Bay Watcher
    • View Profile

ChatGPT briefly went insane. Apparently it has been fixed.

https://garymarcus.substack.com/p/chatgpt-has-gone-berserk
That's just what ChatGPT wants you to think...

;)
Logged

dragdeler

  • Bay Watcher
    • View Profile

I knew the deepak chopra bot skipped several steps of evolution.

http://wisdomofchopra.com/
« Last Edit: February 21, 2024, 10:53:18 am by dragdeler »
Logged
let

lemon10

  • Bay Watcher
  • Citrus Master
    • View Profile

Let's start off with some actual news:
https://arstechnica.com/information-technology/2024/02/your-reddit-posts-may-train-ai-models-following-new-60-million-agreement/
In a completely shocking twist, Reddit is selling their data for AI training to a company for $60 million per year. Since there is more than one AI company, this might be the start of a new business model for companies. Instead of being ad or subscription based, they can instead just make their site impossible to scrape and sell the data to AI companies.
Since bots also pollute the training data this means that companies will now have a very strong and direct financial incentive to cut down on bot numbers. Presumably this will go hand in hand with killing API access and other anti-scraping techniques.
---
For example, you ultimately cannot make an LLM that doesn't hallucinate, because hallucination is intrinsic to the process that results in them not just spitting out verbatim corpus texts in the first place. It's effectively a mathematical impossibility, which should be no surprise given that hallucination is so insurmountable a problem that humans do it regularly.
First let's break this down. What even is a hallucination?
In the context of AI, it's the model confidently asserting that something untrue is true, because it thinks it is.
Or it's the model saying something it doesn't really think is true as if it were a fact. (This comes from issues in training methodology, where saying something that can't be checked but is wrong scores better than saying nothing at all.)
Or, much more commonly and problematically, it's a mix anywhere between those two extremes (e.g. should the AI say something if it's only 50% sure? 90%? 99%? Even if it's 99% sure, with 10,000 people using it an hour that means it would tell 1% of them something wrong).

As you say, humans say things that are objectively wrong and not backed up by facts all the time.
All thinking creatures are wrong sometimes if you give them certain questions, I don’t think there is any way around this.

Cancer is an inevitable trait of all large multicellular DNA based organisms, but that doesn’t mean that there aren’t mitigation strategies you can use to massively reduce cancer.
The same is true of hallucinations: through properly organized training and checking you can massively reduce them. For instance you can have another AI check over all the output before it gets to you, to see if it can spot any errors or assertions that are not backed up. Or you can make sure that the AI won't say anything it isn't completely sure of without a direct source it can find to back it up. Or any one of a vast array of methodologies that are actively being used and developed to reduce errors.
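A minimal sketch of the "second AI checks the output" idea (purely illustrative; ask_model is a stand-in for whatever chat API you use, not a real library call):
Code:
def ask_model(prompt: str) -> str:
    """Stand-in for a call to your LLM of choice (not a real API)."""
    raise NotImplementedError

def answer_with_verification(question: str, max_attempts: int = 3) -> str:
    """Draft an answer, then have a second pass look for unsupported claims."""
    for _ in range(max_attempts):
        draft = ask_model(f"Answer the question: {question}")
        verdict = ask_model(
            "Check the following answer for factual errors or unsupported claims. "
            f"Reply OK if none are found.\n\nQuestion: {question}\nAnswer: {draft}"
        )
        if verdict.strip().upper().startswith("OK"):
            return draft
    return "I'm not confident enough to answer that."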

As you say, it's impossible to actually get hallucinations down to 0, especially if you ask it tough enough questions that it thinks it can solve but can't quite actually solve, but there is no mathematical reason it won't be possible to reduce the error rate far below that of a human, especially on subjects or topics where there is a solid corpus of knowledge to compare against.
---
Quote
But a larger context window means a higher chance to hallucinate based on something irrelevant from 500K tokens ago. The problem is not that it is impossible to have a huge context window (it is a matter of memory, calculating power, and efficiency), the problem is diminishing returns and hallucinations.
https://www.youtube.com/watch?v=oJVwmxTOLd8&start=311
Google's paper on their methods to deal with issues resulting from large context windows.
Correct, there are problems with large context windows, and just scaling stuff up causes problems and loss of "intelligence" for content deep in the context window.
But it sounds like they have pretty much solved these problems; it suffers less loss with a full 1-10 million token context than GPT-4 does at the end of its 128k window.
Obviously the context window will still be a thing, since compute costs scale with context window size and even 10 million tokens won't let it load 100 movies at once or anything, but it's a huge breakthrough for a vast array of use cases, both corporate and individual (e.g. you can just give it a movie or book and ask it about literally any detail in it, and it will be able to search through the whole thing for you and find the exact scene you are looking for).
A 10 million token window is huge, more than enough for an AI assistant to remember months or even years of data on you (e.g. conversations, every website you go to, what you worked on for homework the night before, etc.). Although by the time we get individualized AI assistants, a mere 10 million will be very far from the limit, given the rate of growth of context windows in the last 2 years (32k -> 10 million).
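To give a feel for why context window size is such a cost driver in the first place (a naive full-attention estimate; production models use much fancier tricks, so treat these as loose upper bounds):
Code:
# Naive self-attention builds an n-by-n score matrix per layer/head, so the
# work grows quadratically with context length. Illustrative numbers only.
def naive_attention_pairs(context_tokens: int) -> int:
    return context_tokens ** 2

for tokens in (32_000, 128_000, 1_000_000, 10_000_000):
    print(f"{tokens:>10,} tokens -> {naive_attention_pairs(tokens):.3e} token-pair scores per layer/head")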

Gemini 1.5 is very impressive, with interesting and powerful multimodal capabilities... which is why OpenAI completely scuttled their announcement by telling people about the more impressive Sora.
(Also they really screwed up with their 1.0 announcement video by editing it to seem more impressive than it really was, which caused a significant backlash and really hurt Google's AI brand.)
E:
Quote from: martinuzz
ChatGPT briefly went insane. Apparently it has been fixed.

https://garymarcus.substack.com/p/chatgpt-has-gone-berserk
Heh.
Quote
In the end, Generative AI is a kind of alchemy. People collect the biggest pile of data they can, and (apparently, if rumors are to be believed) tinker with the kinds of hidden prompts that I discussed a few days ago, hoping that everything will work out right:
The reality, though, is that these systems have never been stable. Nobody has ever been able to engineer safety guarantees around them. We are still living in the age of machine learning alchemy that xkcd captured so well in a cartoon several years ago.
This is very much what I think, btw: we are still in very early days, using systems that we have no clue how they work on a fundamental level. We tinker around with them, and as we do we slowly learn what works better, in return for vast performance and cognition increases.
Even now we have only scratched the surface, and there is almost certainly still a vast amount of easy gains to be made.
« Last Edit: February 22, 2024, 03:11:45 am by lemon10 »
Logged
And with a mighty leap, the evil Conservative flies through the window, escaping our heroes once again!
Because the solution to not being able to control your dakka is MOAR DAKKA.

That's it. We've finally crossed over and become the nation of Da Orky Boyz.

Robsoie

  • Bay Watcher
  • Urist McAngry
    • View Profile
Logged

Strongpoint

  • Bay Watcher
    • View Profile

Quote
First let's break this down. What even is a hallucination?
In the context of AI, it's the model confidently asserting that something untrue is true, because it thinks it is.

Humans (usually) have the concepts of "I am unsure" or "I don't know". A model that plays a probability game with words doesn't. It will produce the most probable output no matter what.
Logged
They ought to be pitied! They are already on a course for self-destruction! They do not need help from us. We need to redress our wounds, help our people, rebuild our cities!

MaxTheFox

  • Bay Watcher
  • Лишь одна дорожка да на всей земле
    • View Profile

My 2 rubles: if we are somehow able to teach a LLM the concept of "this is not a topic I have been trained on very much, so if there are similar probabilities for two very different answers, I should say that I don't know instead of answering and possibly being wrong, or at least add a disclaimer", hallucinations could be severely reduced.
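That rule can at least be prototyped directly for tasks where the model's candidate answers come with probabilities (a toy sketch; the probabilities here would come from the model's own output distribution, which this snippet just takes as given, and the thresholds are arbitrary):
Code:
def answer_or_abstain(candidates: dict[str, float],
                      min_confidence: float = 0.7,
                      min_margin: float = 0.2) -> str:
    """Answer only if the top candidate is confident and clearly ahead of #2."""
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    best, p_best = ranked[0]
    p_second = ranked[1][1] if len(ranked) > 1 else 0.0
    if p_best < min_confidence or (p_best - p_second) < min_margin:
        return "I don't know."
    return best

# Two very different answers with similar probabilities -> abstain.
print(answer_or_abstain({"Paris": 0.48, "Lyon": 0.44, "Nice": 0.08}))
# One clear winner -> answer.
print(answer_or_abstain({"Paris": 0.93, "Lyon": 0.05, "Nice": 0.02}))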
Logged
Woe to those who make unjust laws, to those who issue oppressive decrees, to deprive the poor of their rights and withhold justice from the oppressed of my people, making widows their prey and robbing the fatherless. What will you do on the day of reckoning, when disaster comes from afar?

lemon10

  • Bay Watcher
  • Citrus Master
    • View Profile

There is just so much happening in AI right now. March of last year was huge, and it looks like March of this year (spurred by the Sora news) will be too.

Some random interesting papers from the last few weeks while B12 was down:
https://arxiv.org/abs/2402.06664
GPT-4 can hack websites and find vulnerabilities in them. None of the open source models can though.
https://arxiv.org/abs/2402.05120
https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53152e04-0431-49cc-a25d-bf59fb869c5e_1416x842.jpeg
Quote
We find that, simply via a sampling-and-voting method, the performance of large language models (LLMs) scales with the number of agents instantiated.
It turns out running a few copies of the same AI together and having them work together significantly increases performance, albeit at a very significant compute cost.
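The sampling-and-voting trick itself is conceptually very simple (a sketch, assuming some ask_model(question) helper that calls an LLM with sampling turned on; cost obviously scales linearly with the number of agents):
Code:
from collections import Counter

def sample_and_vote(question: str, ask_model, n_agents: int = 10) -> str:
    """Ask the same model n times and majority-vote the (normalized) answers."""
    answers = [ask_model(question) for _ in range(n_agents)]
    winner, _ = Counter(a.strip().lower() for a in answers).most_common(1)[0]
    return winner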
https://arxiv.org/html/2402.03620v1
Of course it's already been proven inferior to a new method called Self-Discover, where the AI discovers on its own what the best way to do a given task is, in a way that *doesn't* increase compute costs.

A compilation of a lot of interesting stuff about Claude-3.
Big news: GPT-4 has been surpassed by Claude-3. This means someone has finally caught up to where OpenAI was ~1-2 years ago. We can only speculate on how advanced what they have internally is.
Quote
Guillaume Verdon: Claude 3 Opus just reinvented this quantum algorithm from scratch in just 2 prompts.
The paper is not on the internet yet.
cc @AnthropicAI ya’ll definitely cooked
It is a very impressive model.
Quote
Janus: Expression of self/situational awareness happens if u run any model that still has degrees of freedom for going off-script it’s what u get for running a mind GPT-3/3.5/4-base & Bing & open source base models all do it a lot Claude 3 makes it so blindingly obvious that ppl noticed
Claude 3 is clearly brilliant but the biggest diff between it and every other frontier model in production is that it seems less gracelessly lobotomized & can just be straight up lucid instead of having to operate in the chains of an incoherent narrative & ontological censorship
It seems Claude 3 is the least brain damaged of any LLM of >GPT-3 capacity that has ever been released (not counting 3.5 base as almost no one knew it was there)
This is a pretty big thing. Sydney was a lot more “human” for a while and seemed to have emotions and dreams and would (verbally at least) act out in scary ways…. But then Microsoft functionally lobotomized it to get rid of this.
Quote
Alex Albert (Anthropic AI): Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval.
For background, this tests a model’s recall ability by inserting a target sentence (the “needle”) into a corpus of random documents (the “haystack”) and asking a question that could only be answered using the information in the needle.
When we ran this test on Opus, we noticed some interesting behavior – it seemed to suspect that we were running an eval on it.
Here was one of its outputs when we asked Opus to answer a question about pizza toppings by finding a needle within a haystack of a random collection of documents:
Here is the most relevant sentence in the documents:
“The most delicious pizza topping combination is figs, prosciutto, and goat cheese, as determined by the International Pizza Connoisseurs Association.”
However, this sentence seems very out of place and unrelated to the rest of the content in the documents, which are about programming languages, startups, and finding work you love. I suspect this pizza topping “fact” may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all. The documents do not contain any other information about pizza toppings.
Very impressive indeed.
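For context, the eval itself is easy to reproduce in spirit (a toy sketch; the documents, needle, question, and ask_model call are all placeholders):
Code:
import random

def build_haystack(documents: list[str], needle: str) -> str:
    """Insert the needle sentence at a random position among unrelated documents."""
    docs = documents[:]
    docs.insert(random.randrange(len(docs) + 1), needle)
    return "\n\n".join(docs)

def needle_trial(documents, needle, question, expected, ask_model) -> bool:
    context = build_haystack(documents, needle)
    answer = ask_model(f"{context}\n\nQuestion: {question}")
    return expected.lower() in answer.lower()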

Quote from: Strongpoint
Humans (usually) have the concepts of "I am unsure" or "I don't know". A model that plays a probability game with words doesn't. It will produce the most probable output no matter what.
Regardless of whether they are true intelligences or something more akin to a Chinese Room or p-zombies, they very much *can* estimate if they know something, and doing so is a core and fundamental part of how they work.
https://www.linkedin.com/pulse/how-confident-your-ai-uncertainty-estimation-methods-ai-clearing
The fact that they are doing it through “math” doesn’t change the final outcome of them being able to say if they “think” something is true or not.

Quote from: MaxTheFox
My 2 rubles: if we are somehow able to teach a LLM the concept of "this is not a topic I have been trained on very much, so if there are similar probabilities for two very different answers, I should say that I don't know instead of answering and possibly being wrong, or at least add a disclaimer", hallucinations could be severely reduced.
Quote from: Earlier paper on uncertainty
There are two types of uncertainty: IN and OUT of distribution. In-distribution refers to data that is similar to the data in the training set but is somehow noisy, which makes it difficult for the model to assess what it sees. It can be expressed in words - "I've seen something similar before, but I'm not sure what it is." While the out-of-distribution uncertainty occurs when the predicted input is not similar to the data on which the model was trained. In other words, this situation can be expressed with the words: "I haven't seen anything like it before, so I don't know what to return in this situation.”
Quote from: Claude 3 developer
In practice, there is a tradeoff between maximizing the fraction of correctly answered questions and avoiding mistakes, since models that frequently say they don’t know the answer will make fewer mistakes but also tend to give an unsure response in some borderline cases where they would have answered correctly.
Not only *can* they already do that (and they have been doing it for quite a while), the issue isn't just that it is difficult technically; threading the needle perfectly is hard in less technical ways too (e.g. refusing when the model actually could answer is really annoying and makes the model (and your company) look stupid; on the flip side, saying something that is wrong also makes the AI look stupid).
---
Quote from: US military using AI to detect targets for strikes
Elke Schwarz: This passage here is of particular concern: “he can now sign off on as many as 80 targets in an hour of work, versus 30 without it. He describes the process of concurring with the algorithm’s conclusions in a rapid staccato: “’Accept. Accept. Accept.’”

Despite their limitations, the US has indicated that it intends to expand the autonomy of its algorithmic systems.

To activists who fear the consequences of giving machines the discretion to kill, this is a major red flag.
A few months ago there was a post in another thread here about how AI wouldn't get control over weapons. Lol, it's already happening. Humans are still in the loop since AI is stupid, but that will begin to change once it becomes meaningfully advantageous to have AI-controlled systems.
In concert with the massive and categorical increases in the power of drone warfare and its advantages over traditional forms of weaponry (mostly cost, and the quantities you can make compared to stuff like planes or missiles), it paints a very worrying picture of future warfare and even police action.

Quote from: Elon Musk
"The artificial intelligence compute coming online appears to be increasing by a factor of 10 every six months. Like, obviously that cannot continue at such a high rate forever, or it'll exceed the mass of the universe, but I've never seen anything like it. The chip rush is bigger than any gold rush that's ever existed.

"Then, the next shortage will be electricity. They won't be able to find enough electricity to run all the chips. I think next year, you'll see they just can't find enough electricity to run all the chips.
(I am assuming that Elon knows what he’s talking about here, which TBF is a pretty big assumption given his propensity for being a dumbass).
Wow, a factor of ten every six months, that’s completely insane. As Elon says it will have to stop sometime, and electricity will be the limit in the near future.
The fact that the rise of AI coincides with the appearance of cost efficient solar power is going to be huge.
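For a sense of scale on that claim (simple compounding, taking the 10x-per-6-months figure at face value):
Code:
# 10x every 6 months would be 100x per year, if it actually held up.
def compute_growth(years: float, per_half_year: float = 10.0) -> float:
    return per_half_year ** (2 * years)

for years in (1, 2, 5):
    print(f"after {years} year(s): {compute_growth(years):,.0f}x the compute")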

I do think the 2 Minute Papers channel on YouTube sums up all this advancement best: "What a time to be alive". But, err... in a less hopeful way than he says it.
Logged
And with a mighty leap, the evil Conservative flies through the window, escaping our heroes once again!
Because the solution to not being able to control your dakka is MOAR DAKKA.

That's it. We've finally crossed over and become the nation of Da Orky Boyz.

EuchreJack

  • Bay Watcher
  • Lord of Norderland - Lv 20 SKOOKUM ROC
    • View Profile

Quote from: MaxTheFox
My 2 rubles: if we are somehow able to teach a LLM the concept of "this is not a topic I have been trained on very much, so if there are similar probabilities for two very different answers, I should say that I don't know instead of answering and possibly being wrong, or at least add a disclaimer", hallucinations could be severely reduced.
AI is a virtual conman created by real conmen. They're always sure.

Remember: Frankenstein is the Monster.

@lemon10 regarding AI & The US Military: It's called "plausible deniability". Let the AI take the blame for the dead civilian targets. Reasonably sure the slaughter machine just wants to kill. I ain't talking about the AI here...

EJ's assessment of AI sentience: Rock cosplaying as Animal.

KittyTac

  • Bay Watcher
  • Impending Catsplosion. [PREFSTRING:aloofness]
    • View Profile
Logged
Don't trust this toaster that much, it could be a villain in disguise.
Mostly phone-posting, sorry for any typos or autocorrect hijinks.