DAI#34 – Data grabs, hot chips, and plus-size models

Welcome to this week’s roundup of triple-distilled artisanal AI news.

This week, companies that scrape your data for free complained that others were stealing it.

Big Tech is making its own AI chips to compete with NVIDIA.

And a bunch of new AI models are flying off the shelves.

Let’s dig in.

AI data hunger games

AI’s insatiable hunger for data continues as companies scramble for text, audio, and video to train their models.

It’s an open secret that OpenAI almost certainly scraped YouTube video content to train its text-to-video model Sora. YouTube CEO Neal Mohan warned OpenAI that it may have breached its terms of service. Well, boohoo. Have you seen how cool Sora is?

You’ve got to think that Google must have cut some ethical corners to train its models. Having YouTube cry foul over the “rights and expectations” of the creators of videos on its platform is a bit rich.

A closer look at Big Tech’s tussle over AI training data reveals how Google amended its Google Docs privacy policies to use your data. Meanwhile, OpenAI and Meta continue to push the legal and ethical boundaries in the pursuit of more training data.

I’m not sure where the data came from for the new AI music generator Udio, but folks have been prompting it with some interesting ideas. This is proof AI should be regulated.

Chips ahoy

All that controversial data has to be processed, and NVIDIA hardware is doing most of the work. Sam takes us through NVIDIA’s rags-to-riches story from 1993 (when you should have bought shares) to now (when you would have been rich) and it’s fascinating.

While companies keep lining up to buy NVIDIA chips hot out of the oven, Big Tech is trying to wean itself off them.

Intel and Google unveiled new AI chips to compete with NVIDIA, even though they’ll still be buying NVIDIA’s Blackwell hardware.

Release the models!

It’s crazy that just over a year ago, OpenAI had the only models getting any real attention. Now there’s a constant stream of new models from the Big Tech usual suspects and smaller startups.

This week we saw three AI models released within 24 hours. Google’s Gemini 1.5 Pro now has a 1M-token context window. Big context is great, but will its recall be as good as Claude 3’s?

There were interesting developments with OpenAI enabling API access to GPT-4 with vision, and Mistral just gave away another powerful smaller model.

Politician turned Meta exec Nick Clegg spoke at Meta’s AI event in London to champion open-source AI. Clegg also said that Meta expects Llama 3 to roll out very soon.

During discussions around AI disinformation, he bizarrely downplayed AI’s role in attempts to influence recent major elections.

Does this guy even read the news we report here at Daily AI?

What safety issues?

Geoffrey Hinton, considered the godfather of AI, was so concerned over AI safety that he quit Google. Meta’s Yann LeCun says there’s nothing to worry about. So which is it?

A Georgetown University study found that just 2% of AI research is focused on safety. Is that because there’s nothing to worry about? Or should we be concerned that researchers are focused on making more powerful AI with little thought to making it safe?

Should ‘move fast and break stuff’ still be AI developers’ rallying cry?

The trajectory of AI development has been exponential. In an interview this week, Elon Musk said he expects AI may be smarter than humans by the end of 2025.

AI expert Gary Marcus doesn’t agree and he’s willing to put money on it.

xAI is facing the same NVIDIA chip shortage as many others. Musk says the company’s 20,000 NVIDIA H100s will complete Grok 2’s training by May.

Guess how many GPUs he says they’ll need to train Grok 3.

Anthropic streaks ahead

Anthropic says it develops large-scale AI systems so they can “study their safety properties at the technological frontier.”

The company’s Claude 3 Opus is certainly at the frontier. A new study shows the model blows the rest of the competition away in summarizing book-length content. Even so, the results show that humans still have the edge in some respects.

Anthropic’s latest tests show that Claude LLMs have become exceptionally persuasive, with Claude 3 Opus generating arguments as persuasive as those created by humans.

People who play down the AI safety risks often say that you could simply pull the plug if an AI went rogue. What if the AI was so persuasive that it could convince you not to do that?

Claude 3’s big claim to fame is its massive context window. Anthropic released a study that shows large-context LLMs are vulnerable to a “many-shot” jailbreak technique. It’s super simple to implement and they admit they don’t know how to fix it.

In other news…

Here are some other clickworthy AI stories we enjoyed this week:

The virtual AI for Customer Success Summit 2024 kicks off.
Fake AI law firms are sending DMCA threats to generate SEO gains.
Stability releases its multilingual lightweight Stable LM 2 12B LLM with impressive benchmark results.
Microsoft researchers propose “Visualization-of-Thought” prompting to enable LLMs to use spatial reasoning.
AI could be as consequential to the economy as electricity, says JPMorgan Chase CEO, Jamie Dimon.
An AI-operated fighter jet will fly the Air Force Secretary in a test of the US military’s future drone warplanes.
The speed of AI development is outpacing risk assessment.
Meta announced its next generation of the Meta Training and Inference Accelerator (MTIA) chip.

And that’s a wrap.

Do you care that OpenAI, and likely others, may have used your YouTube content to train their AI? If Altman released Sora for free then I’m guessing all would be forgiven. Google may disagree.

Do you think Musk is being overly optimistic with his AI intelligence predictions? I hope he’s right, but I’m a little uneasy that only 2% of AI research is going into safety.

How crazy is the number of AI models we’re seeing now? Which one is your favorite? I’m hanging onto my ChatGPT Plus account and hoping for GPT-5. But Claude Pro is looking very tempting.

Let us know which article stood out for you and keep sending us links to any juicy AI stories we may have missed.