overview for scruiser

Stubsack: weekly thread for sneers not worth an entire post, week ending 13th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 11 points 2 weeks ago (1 children)

The hidden prompt is only cheating if the reviewers fail to do their job right and outsource it to a chatbot; it does nothing to a human reviewer actually reading the paper properly. So I won't say it's right or ethical, but I'm much more sympathetic to these authors than to reviewers and editors outsourcing their job to an unreliable LLM.

Stubsack: weekly thread for sneers not worth an entire post, week ending 13th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 12 points 2 weeks ago (2 children)

The only question is who will get the blame.

Isn't it obvious? Us sneerers and the big name skeptics (like Gary Marcuses and Yann LeCuns) continuously cast doubt on LLM capabilities, even as they are getting within just a few more training runs and one more scaling of AGI Godhood. We'll clearly be the ones to blame for the VC funding drying up, not years of hype without delivery.

Stubsack: weekly thread for sneers not worth an entire post, week ending 6th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 9 points 2 weeks ago* (last edited 2 weeks ago)

I think we mocked this one back when it came out on /r/sneerclub, but I can't find the thread. In general, I recall Yudkowsky went on a mini-podcast tour a few years back. I think the general trend was that he didn't interview that well, even by lesswrong's own standards. He tended to simultaneously assume too much background familiarity with his writing, such that anyone not already familiar with it would be lost, and fail to add anything actually new for anyone who was already familiar with it. There were also lots of circular arguments and repetitious discussion with the hosts. I guess that's the downside of hanging around within your own echo chamber blog for decades instead of engaging with wider academia.

Stubsack: weekly thread for sneers not worth an entire post, week ending 6th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 7 points 2 weeks ago

For purposes of something easily definable and legally valid that makes sense, but it is still so worthy of mockery and sneering. Also, even if they needed a benchmark like that for their bizarre legal arrangements, there was no reason besides marketing hype to call that threshold "AGI".

In general the definitional games around AGI are so transparent and stupid, yet people still fall for them. AGI means performing at least at human level across all cognitive tasks. Not across all benchmarks of cognitive tasks, the tasks themselves. Not superhuman in some narrow domains and blatantly stupid in most others. To be fair, the definition might not be that useful, but it's not really in question.

Stubsack: weekly thread for sneers not worth an entire post, week ending 6th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 5 points 2 weeks ago

Optimistically, he's merely giving in to the urge to try to argue with people: https://xkcd.com/386/

Pessimistically, he realized how much money is in the doomer and e/acc grifts and wants in on it.

Stubsack: weekly thread for sneers not worth an entire post, week ending 6th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 5 points 2 weeks ago (1 children)

Best case scenario is Gary Marcus hangs around lw just long enough to develop even more contempt for them and he starts sneering even harder in his blog.

Stubsack: weekly thread for sneers not worth an entire post, week ending 6th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 10 points 3 weeks ago (5 children)

Gary Marcus has been a solid source of sneer material and debunking of LLM hype, but yeah, you're right. Gary Marcus has been taking victory laps over a bar set so so low by promptfarmers and promptfondlers. Also, side note, his negativity towards LLM hype shouldn't be misinterpreted as general skepticism towards all AI... in particular Gary Marcus is pretty optimistic about neurosymbolic hybrid approaches, it's just his predictions and hypothesizing are pretty reasonable and grounded relative to the sheer insanity of LLM hypsters.

Also, new possible source of sneers in the near future: Gary Marcus has made a lesswrong account and started directly engaging with them: https://www.lesswrong.com/posts/Q2PdrjowtXkYQ5whW/the-best-simple-argument-for-pausing-ai

Predicting in advance: Gary Marcus will be dragged down by lesswrong, not lesswrong dragged up towards sanity. He'll start to use lesswrong lingo and terminology and start using P(some event) based on numbers pulled out of his ass. Maybe he'll even start to be "charitable" to meet their norms and avoid down votes (I hope not, his snark and contempt are both enjoyable and deserved, but I'm not optimistic based on how the skeptics and critics within lesswrong itself learn to temper and moderate their criticism within the site). Lesswrong will moderately upvote his posts when he is sufficiently deferential to their norms and window of acceptable ideas, but won't actually learn much from him.

‘AI is no longer optional’ — Microsoft admits AI doesn’t help at work in c/techtakes@awful.systems

[–] scruiser@awful.systems 13 points 3 weeks ago (5 children)

Unlike with coding, there are no simple “tests” to try out whether an AI’s answer is correct or not.

So for most actual practical software development, writing tests is in fact an entire job in and of itself, and it's a tricky one, because covering even a fraction of the use cases and complexity the software will actually face when deployed is really hard. So simply letting the LLMs brute force trial-and-error their code through a bunch of tests won't actually get you good working code.

AlphaEvolve kind of did this, but it was testing very specific, well defined, well constrained algorithms that could have very specific evaluation written for them and it was using an evolutionary algorithm to guide the trial and error process. They don't say exactly in their paper, but that probably meant generating code hundreds or thousands or even tens of thousands of times to generate relatively short sections of code.
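To make the testing point concrete, here is a toy sketch of brute force trial and error against a test suite. Everything in it is made up for illustration: the `spec` function (3x + 2), the candidate space of integer pairs `(a, b)`, and the two suites. It is not how AlphaEvolve or any real tool works, just a small demonstration that "passes the tests" and "is correct" are different things when the tests cover too little.

```python
from itertools import product

def spec(x):
    # Hypothetical behaviour we actually want: f(x) = 3x + 2.
    return 3 * x + 2

def passes(candidate, cases):
    # A candidate (a, b) stands for the program f(x) = a*x + b.
    a, b = candidate
    return all(a * x + b == expected for x, expected in cases)

def brute_force(cases, coefficients=range(-5, 6)):
    # Try every candidate in the (tiny, assumed) search space and
    # keep the ones that pass every test case.
    tried = 0
    survivors = []
    for candidate in product(coefficients, repeat=2):
        tried += 1
        if passes(candidate, cases):
            survivors.append(candidate)
    return survivors, tried

# A weak suite that only checks x = 0, and a slightly better one.
weak_suite = [(0, spec(0))]
better_suite = [(x, spec(x)) for x in (0, 1)]

weak_survivors, tried = brute_force(weak_suite)
better_survivors, _ = brute_force(better_suite)

# Candidates that pass the weak suite but are wrong on an unseen input.
wrong_but_passing = [c for c in weak_survivors if not passes(c, [(4, spec(4))])]

print(f"tried {tried} candidates")
print(f"weak suite accepts {len(weak_survivors)}, of which {len(wrong_but_passing)} are wrong")
print(f"better suite accepts {better_survivors}")
```

Even in this two-parameter toy, the weak suite lets through many wrong programs, and the search only lands on the right one because the target is simple enough that two test cases pin it down completely. Real software rarely has a spec that small.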

I've noticed a trend where people assume other fields have problems LLMs can handle, but the actually competent experts in that field know why LLMs fail at key pieces.

Stubsack: weekly thread for sneers not worth an entire post, week ending 6th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 10 points 3 weeks ago (1 children)

Exactly. I would almost give the AI 2027 authors credit for committing to a hard date... except they already have a subtly hidden asterisk in the original AI 2027 noting some of the authors have longer timelines. And I've noticed lots of hand-wringing and but achkshuallies in their lesswrong comments about the difference between mode and median and mean dates and other excuses.

Like see this comment chain https://www.lesswrong.com/posts/5c5krDqGC5eEPDqZS/analyzing-a-critique-of-the-ai-2027-timeline-forecasts?commentId=2r8va889CXJkCsrqY :

My timelines moved up to median 2028 before we published AI 2027 actually, based on a variety of factors including iteratively updating our models. But it was too late to rewrite the whole thing to happen a year later, so we just published it anyway. I tweeted about this a while ago iirc.

...You got your AI 2027 reposted like a dozen times to /r/singularity, maybe many dozens of times total across Reddit. The fucking vice president has allegedly read your fiction project. And you couldn't be bothered to publish your best timeline?

So yeah, come 2028/2029, they already have a ready-made set of excuses to backpedal and move back the doomsday prophecy.

Stubsack: weekly thread for sneers not worth an entire post, week ending 6th July 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 12 points 3 weeks ago* (last edited 3 weeks ago) (5 children)

So two weeks ago I linked titotal's detailed breakdown of what is wrong with AI 2027's "model" (tldr; even accepting the line goes up premise of the whole thing, AI 2027's math was so bad that they made the line always asymptote to infinity in the near future regardless of inputs). Titotal went to pretty extreme lengths to meet the "charitability" norms of lesswrong, corresponding with one of the AI 2027 authors, carefully considering what they might have intended, responding to comments in detail and depth, and in general not simply mocking the entire exercise in intellectual masturbation and hype generation like it rightfully deserves.

But even with all that effort, someone still decided to make an entire (long, obviously) post with a section dedicated to tone-policing titotal: https://thezvi.substack.com/p/analyzing-a-critique-of-the-ai-2027?open=false#%C2%A7the-headline-message-is-not-ideal (here is the lw link: https://www.lesswrong.com/posts/5c5krDqGC5eEPDqZS/analyzing-a-critique-of-the-ai-2027-timeline-forecasts)

Oh, and looking back at the comments on titotal's post... his detailed elaboration of some pretty egregious errors in AI 2027 didn't really change anyone's mind, at most moving them back a year to 2028.

So, moral of the story, lesswrongers and rationalists are in fact not worth the effort to talk to and we are right to mock them. The numbers they claim to use are pulled out of their asses to fit vibes they already feel.

And my choice for most sneerable line out of all the comments:

https://forum.effectivealtruism.org/posts/KgejNns3ojrvCfFbi/a-deep-critique-of-ai-2027-s-bad-timeline-models?commentId=XbPCQkgPmKYGJ4WTb

And I therefore am left wondering what less shoddy toy models I should be basing my life decisions on.

Stubsack: weekly thread for sneers not worth an entire post, week ending 29th June 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 12 points 1 month ago* (last edited 1 month ago) (3 children)

Following up because the talk page keeps providing good material...

Hand of Lixue keeps trying to throw around the Wikipedia rules like the other editors haven't seen people try to weaponize the rules to push their views many times before.

Particularly for the unflattering descriptions I included, I made sure they reflect the general view in multiple sources, which is why they might have multiple citations attached. Unfortunately, that has now led to complaints about overcitation from @Hand of Lixue. You can't win with some people...

Looking back on the original lesswrong ~~brigade organizing~~ discussion of how to improve the wikipedia article, someone tried explaining the rules to Habryka back then and they were dismissive.

I don’t think it counts as canvassing in the relevant sense, as I didn’t express any specific opinion on how the article should be edited.

Yes Habryka, because you clearly have such a good understanding of the Wikipedia rules and norms...

Also, heavily downvoted on the lesswrong discussion is someone suggesting Wikipedia is irrelevant because LLMs will soon be the standard for "access to ground truth". I guess even lesswrong knows that is bullshit.

Stubsack: weekly thread for sneers not worth an entire post, week ending 29th June 2025 in c/techtakes@awful.systems

[–] scruiser@awful.systems 12 points 1 month ago (7 children)

The wikipedia talk page is some solid sneering material. It's like Habryka and HandofLixue can't imagine any legitimate reason why Wikipedia has the norms it does, and they can't imagine how a neutral Wikipedian could come to write that article about lesswrong.

Eigenbra accurately calling them out...

"I also didn't call for any particular edits". You literally pointed to two sentences that you wanted edited.

Your twitter post also goes against Wikipedia practices by casting WP:ASPERSIONS. I can't speak for any of the other editors, but I can say I have never read nor edited RationalWiki, so you might be a little paranoid in that regard.

As to your question:

Was it intentional to try to pick a fight with Wikipedians?

It seems to be ignorance on Habryka's part, but judging by the talk page, instead of acknowledging their ignorance of Wikipedia's reasonable policies, they seem to be doubling down.