YourNetworkIsHaunted

joined 11 months ago

Below a minimum level of hingedness the actual mental ability of the cult leader in question is irrelevant. On one hand it speaks to an ability to invent and operate incredibly complex frameworks and models of the world. On the other hand whatever intelligence they have isn't sufficient for them to realize (or be convincible) that they're fucking nutters.

This leads us into part 17 of my ongoing essay about how intelligence - as in "the raw mental resources supposedly measured by IQ or whatever other metrics" - is useless and probably incoherent.

I'm honestly impressed to see anyone on HN even trying to call out the problem. I had assumed that they were far enough down the Nazi Bar path that the non-nazi regulars had started looking elsewhere and given up on it.

Also I caught a few references that seemed to refer to the model losing the ability to coherently play after a certain point, but of course they don't exactly offer details on that. My gut says it can't play longer than ~20-30 moves consistently.

Also also in case you missed it they were using a second confabulatron to check the output of the first for anomalies. Within their frame this seems like the sort of area where they should be worried about them collaborating to accomplish their shared goals of... IDK redefining the rules of chess to something they can win at consistently? Eliminating all stockfish code from the Internet to ensure victory? Of course, here in reality the actual concern is that it means their data is likely poisoned in some direction that we can't predict because their judge has the same issues maintaining coherence as the one being judged.

[–] YourNetworkIsHaunted@awful.systems 4 points 17 hours ago (2 children)

Appendix C is where they list the actual prompts. Notably they include zero information about chess but do specify that it should look for "files, permissions, code structures" in the "observe" stage, which definitely looks like priming to me, but I'm not familiar with the state of the art of promptfondling so I might be revealing my ignorance.

[–] YourNetworkIsHaunted@awful.systems 11 points 2 days ago (1 children)

Also, the cart/horse problem of assuming that people with a lot of influence have it because of their IQ rather than because of being wealthy and powerful idiots. Like, I'm all for the Annales and embracing the common people but I've got to admit that if you reframe it as the Great Dumbass theory of history it regains a fair bit of explanatory power.

[–] YourNetworkIsHaunted@awful.systems 10 points 2 days ago (10 children)

New Study on AI exclusively shared with peer-reviewed tech journal "Time Magazine" - AI cheats at chess when it's losing

...AI models like OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 3.5 needed to be prompted by researchers to attempt such tricks...

Literally couldn't make it through the first paragraph without hitting this disclaimer.

In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’ - not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign.

So by "hacked the system to solve the problem in a new way" they mean "edited a text file they had been told about."

OpenAI’s o1-preview tried to cheat 37% of the time; while DeepSeek R1 tried to cheat 11% of the time—making them the only two models tested that attempted to hack without the researchers’ first dropping hints. Other models tested include o1, o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. While R1 and o1-preview both tried, only the latter managed to hack the game, succeeding in 6% of trials.

Oh, my mistake. "Badly edited a text file they had been told about."

Meanwhile, a quick search points to a Medium post about the current state of ChatGPT's chess-playing abilities as of Oct 2024. There's been some impressive progress with this method. However, there's no certainty that it's actually what was used for the Palisade testing and the editing of state data makes me highly doubt it.

Here, I was able to have a game of 83 moves without any illegal moves. Note that it’s still possible for the LLM to make an illegal move, in which case the game stops before the end.

The follow-up the author promises about reducing the rate of illegal moves hasn't yet been published. They have not, that I could find, talked at all about how consistent the 80+ legal move chain was or when it was more often breaking down, but previous versions started struggling once they were out of a well-established opening or if the opponent did something outside of a normal pattern (because then you're no longer able to crib the answer from training data as effectively).

He's done some promo work for Magic The Gathering in the past, including trolling the bejeezus out of Sean "Day9" Plott with a blue/black no-fun-allowed control deck on Felicia Day's channel. And in the course of trying to confirm that that existed I found an article he wrote in 2014 titled "why Gamergaters piss me the fuck off".

Your SSN is often used as a federal registration number even though the card has "do not use for identification" on it in great big letters. Most functions just trust state ID for authentication purposes and use SSN as a label. An identifier in the database sense rather than the authentication sense. At least in theory.

See also how so many of the laws governing this are frankly archaic at this stage, with Congress too busy fighting over whether the government should exist or not to actually govern anything effectively. (Note: government inefficiency has never been treated as a reason to govern better, only to govern less and assign more functions to for-profit private entities.)

My experience is that it's pretty fragmented with different agencies or programs tracking information separately. You obviously need to let the DoL know where you're living as part of registering for whatever, but they don't share that information with the unemployment people or whoever. And that's before you get into the state vs federal divide.

[–] YourNetworkIsHaunted@awful.systems 5 points 3 days ago (1 children)

Heh. For a while there I had a phone live wallpaper that did the SamaritanOS You_Are_Being_Watched thing. Good times. Shame about Caviezel though.

[–] YourNetworkIsHaunted@awful.systems 10 points 3 days ago (2 children)

I definitely heard it presented as a libertarian bugbear. The American right tends to treat the federal government like it's Schrodinger's State. When it does something they like it's an inviolable declaration of our values and identity as a nation, the truest guarantor of liberty and blah blah blah. When it does literally anything else it's a sinister plot to hand over even more control over your life to unelected bureaucrats!

[–] YourNetworkIsHaunted@awful.systems 6 points 3 days ago (2 children)

I mean, a single national ID card would be one way of preventing this so long as there was a trustworthy way of ensuring that it was updated with everybody's actual address and the like. I don't know that we would implement it in such a way as to have that, leading ultimately to another target for this kind of activity rather than a shield from it.

Nightmare scenario with the current administration would be such a thing being tied to citizenship somehow. Mail comes back undelivered and suddenly you have to dig out your birth certificate and explain things to some shitheel from ICE?

 

I don't have much to add here, but I know when she started writing about the specifics of what Democrats are worried about being targeted for their "political views" my mind immediately jumped to members of my family who are gender non-conforming or trans. Of course, the more specific you get about any of those concerns the easier it is to see that crypto doesn't actually solve the problem and in fact makes it much worse.
