Machine Learning

[D] Have we hit a scaling wall in base models? (non-reasoning) (old.reddit.com)

submitted 1 day ago by bot@lemmit.online to c/machinelearning@lemmit.online

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/machinelearning by /u/CH1997H on 2025-02-21 12:23:20+00:00.

Grok 3 was supposedly trained on 100,000 H100 GPUs, which is roughly 10x more than models like the GPT-4 series and Claude 3.5 Sonnet.

Yet they're about equal in abilities. Grok 3 isn't AGI or ASI like we hoped. In 2023 and 2024, OpenAI kept saying that they could just keep scaling pre-training more and more, and the models would just magically keep getting smarter (the "scaling laws" where the chart just says "line goes up").

Now all the focus is on reasoning, and suddenly OpenAI and everybody else have become very quiet about scaling.

It looks very suspicious, to be honest. Instead of making bigger and bigger models like in 2020-2024, they're now trying to keep them small while focusing on other things. Claude 3.5 Opus got quietly deleted from the Anthropic blog, with no explanation. Something is wrong, and they're trying to hide it.
