So it's time for yet another roundup of AI news. As expected, AI is still developing at a breakneck pace.
—
Three weeks ago Elon Musk promised that his new Grok 1.5 AI would be released the next week. As with almost every Musk timeline promise, it turned out to be nonsense: it still isn't released.
—
There are now numerous companies that have matched or nearly matched GPT 4 at release. Catching up to where OpenAI was two years ago is impressive, but it's not like OpenAI is standing still; new versions of GPT 4 are being released that are notably and measurably better.
—
Speaking of OpenAI…
GPT 5 has finished training and could now be released. However, it's almost certain that its release will be delayed at least a few months for safety purposes, given the release delay on every single one of their other projects. I suspect it will be released at some point after the US election is finished.
Apparently it's substantially and meaningfully better than 4 in everything, as well as being significantly larger. Only rumors, though, since it's still under wraps.
Unlock the power of accurate predictions and confidently navigate uncertainty. Reduce uncertainty and resource limitations. With TimeGPT, you can effortlessly access state-of-the-art models to make data-driven decisions. Whether you're a bank forecasting market trends or a startup predicting product demand, TimeGPT democratizes access to cutting-edge predictive insights.
The new TimeGPT is also out (it's from Nixtla, not OpenAI, despite the name); it's designed for time series analysis and forecasting the future. Not that useful to a regular person, but it sounds like it could be a very big deal for businesses, since it's flat-out better than existing forecasting services.
Sora will be released at some point this year as well.
In addition, OpenAI has developed an AI that can clone your voice after listening to it for just 15 seconds. Like a lot of AI tech, this is really scary stuff. Even if OpenAI keeps a lid on it, someone else will soon develop and release equivalent tech to the public, and scammers and people creating deepfakes will absolutely love it.
—
DarkGemini is a powerful new GenAI chatbot, now being sold on the dark web for a $45 monthly subscription.
It can generate a reverse shell, build malware, or even locate people based on an image. A “next generation” bot, built specifically to make GenAI more accessible to the attacker next door.
A few pages back I was talking about the end of the open internet, and criminal AI was brought up, with someone questioning why it didn't exist. Well, it exists now: on the darknet you can find DarkGemini, which will assist you with criminal activities.
—
Prompt: a song about boatmurdered.
https://www.udio.com/songs/gnqdHVMZjX89866jQjTQ7P
A new AI music generation service called Udio is now out, and it makes pretty decent music. Not amazing, but as I keep saying, it's still just early days.
Musicians are now officially in trouble. Not as much as writers or even artists, since people care about who wrote the songs they listen to in a way they don't care about who wrote what they read or who made the art they see, but things aren't looking good for them either.
Like the ability to create functionally free art on demand, this will be a big tool in the creator's toolbox.
—
There are various regulations on AI in the works, but aside from the anti-deepfake stuff I'm very doubtful about what will actually get through. Money talks, after all, and the US Congress has huge amounts of trouble acting against anyone with any real amount of money.
—
Claude 3 (the best AI out there right now, aside from possibly the newest fork of GPT 4) is now about as persuasive as a human. When it's acting deceptively, it is more persuasive than your average person.
Durably reduce belief in conspiracy theories by about 20% via debate, also reducing belief in other unrelated conspiracy theories.
On some topics (such as convincing people that conspiracy theories are wrong) it's vastly better than your average person, presumably because it knows all the conspiracy theory talking points that regular people don't and can counter them point by point.
Of course AI is just going to get better at persuasion, and there is no reason at all to think that it won't get far better than your average human at it.
—
Some interesting stuff summarized from an interview with AI engineers working for Google and Anthropic (Claude).
https://www.youtube.com/watch?v=UTuuTTnjxMQ
(8:45) Performance on complex tasks follows log scores. It gets it right one time in a thousand, then one in a hundred, then one in ten. So there is a clear window where the thing is in practice useless, but you know it soon won't be. And we are in that window on many tasks. This goes double if you have complex multi-step tasks. If you have a three-step task and are getting each step right one time in a thousand, the full task is one in a billion, but you are not so far from being able to do the task in practice.
(9:15) The model being presented here is predicting scary capabilities jumps in the future. LLMs can actually (unreliably) do all the subtasks, including identifying what the subtasks are, for a wide variety of complex tasks, but they fall over on subtasks too often and we do not know how to get the models to correct for that. But that is not so far from the whole thing coming together, and that would include finding scaffolding that lets the model identify failed steps and redo them until they work, if which tasks fail is sufficiently non-deterministic from the core difficulties.
The interview talks about this quite a bit: how reliability (especially on multi-step tasks) is a huge bottleneck for actually using these models. But once a model can do a task even infrequently, being able to do the same thing reliably is just around the corner.
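The "one in a thousand per step becomes one in a billion per task" arithmetic above is just independent probabilities multiplying. A minimal sketch (assuming, as a simplification, that every step succeeds independently with the same probability):

```python
# Success probability of a multi-step task, assuming each step
# succeeds independently with the same per-step probability.
def task_success_rate(per_step: float, steps: int) -> float:
    return per_step ** steps

# A three-step task where each step works one time in a thousand:
# roughly 1e-9, i.e. one in a billion.
print(task_success_rate(0.001, 3))

# The same task once each step works nine times in ten:
# roughly 0.73, which becomes usable once you add retries.
print(task_success_rate(0.9, 3))
```

This is also why scaffolding that detects and redoes failed steps matters so much: retrying changes the math from multiplying failure-prone steps to retrying each one until it lands.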
(51:00) “I think the Gemini program would probably be maybe five times faster with 10 times more compute or something like that. I think more compute would just directly convert into progress.”
The two current bottlenecks are compute and highly skilled engineers with the right "taste," or intuition for how to design experiments. More compute is still the biggest bottleneck.
(1:01:30) If we don't get AGI by GPT-7-levels-of-OOMs (this assumes each level requires 100x the compute), are we stuck? Sholto basically buys this: orders of magnitude have diminishing returns at their core, and although they unlock reliability, reasoning progress is sublinear in OOMs. Dwarkesh notes this is highly bearish, which seems right.
(1:03:15) Sholto points out that even with smaller progress, another 3.5→4 jump in GPT-levels is still pretty huge. We should expect smart plus a lot of reliability. This is not to undersell what is coming; rather, the jumps so far are huge, and even smaller jumps from here unlock lots of value. I agree.
Yeah, sounds reasonable enough. Eventually things will become too costly to continue scaling, and if we don't reach AGI before then, progress will slow down dramatically. But we are currently nowhere near the end of the S-curve.
(1:32:30) Getting better at code makes the model a better thinker. Code is reasoning; you can see how it would transfer. I certainly see this happening in humans.
(They *also* say that making it better at coding improves its more mundane language skills too).
The interview has a few things in this vein where the researchers point out how cross-learning has interesting side effects; for instance, apparently fine-tuning a model to make it better at math makes it better at entity recognition at the same time.
—
https://dreams-of-an-electric-mind.webflow.io/
Claudes talking to each other. This sure looks like creativity to me.
—
I don't think AI will replace humans for several more decades given the cost of the AI, especially since they're saying better AI will need even more money to make than the current ones.
I don't think that AI will replace humans period.
The simplest example is chess. Hardcoded chess engines have been far better than humans since the late 1990s. Neural network chess engines came along about five years ago and kicked the ass of hardcoded chess engines. Modern engines are a combination of the two, and their level of play is ungodly; they make moves beyond human comprehension that somehow work.
And yet chess is alive both as a hobby and as a professional sport.
This is why I chuckle when I hear that AI will replace humans in fields like graphic design or movie script writing, where a concept like "better" is very vague compared to chess.
Do you think a hobby/sport where 99% of people make no money off it operates remotely the same as profit-driven businesses where everyone involved expects a paycheck?
Because I can tell you with 100% certainty, if AI can deliver an equivalent product* at significantly lower costs**, companies will drop screenwriters like hot potatoes.
*Obviously if they can’t then things are different, but going “well, if AI sucks then it won’t replace everyone” is obvious.
**And of course it will, because the "AI is expensive" crowd is forgetting that people are really expensive. On average a screenplay sells for $110k. Even if you increase the price of AI generation by literally ten thousand times, it will still be cheaper.
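The ten-thousand-times margin is easy to sanity-check. A back-of-the-envelope sketch (the $110k figure is from the text above; the $1 per-script AI cost is a made-up placeholder, since the point holds for any plausibly small inference cost):

```python
# Back-of-the-envelope cost comparison: human screenplay vs. AI draft.
screenplay_price_usd = 110_000   # average screenplay sale price (from the text)
assumed_ai_cost_usd = 1          # hypothetical per-script inference cost

# Even inflating the assumed AI cost by 10,000x, it stays under $110k.
inflated_ai_cost = assumed_ai_cost_usd * 10_000
print(inflated_ai_cost < screenplay_price_usd)
```

The exact placeholder doesn't matter much; the gap only closes if per-script AI costs somehow reach the tens of thousands of dollars.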
Parts of the movie industry that people care about as individuals (movie stars) will have protection, but nobody actually cares who or what wrote the movie they are watching as long as it's good.