• 6 Posts
  • 388 Comments
Joined 3 years ago
Cake day: August 29th, 2023

  • I had thought LessWrong “merely” had a plurality of racist HBDers, but judging from the total lack of comments calling out his racist bullshit, and the majority of comments advising hiding your power level as a practical matter, I guess LessWrong is actually majority HBDers at this point.

    Also, one of his follow-up comments (explaining why he doesn’t want to just keep the mask on like the other LessWrongers) is pretty stupid and gross:

    Thanks, good comment. The quick low-effort version that doesn’t require actually writing the posts is that without taking heritable IQ into account, I think you will be confused about:

    1. Various ways in which post-apartheid South Africa is a bad place to live.
    2. Why so many countries have market-dominant minorities.
    3. Why Israel is so good at defending itself even against far larger countries surrounding it (and the last few centuries of Jewish history more generally).
    4. Why the growth curves for East Asia and Africa looked so different over the last century.

    1 and 4 show the continued willful ignorance about the harmful effects of colonialism and neocolonialism. The first part of 3 is obviously the huge amount of material support from the US. I don’t know what 2 is talking about; I assume he’s got some stupid and racist interpretation of various historically contingent things.



  • Oh wow, I didn’t realize that, that’s even funnier! Isn’t fear #1 actually “alignment” working as it is supposed to?

    Fear #2 actually seems kind of plausible to me? Like when Elon has Grok fine-tuned to agree with him about South African apartheid, it also makes Grok behave extra racist in other ways. So if they try to fine-tune ethics (well, responding with sequences of words corresponding to ethical behavior; I’m aware it doesn’t actually have ethical reasoning beyond predicting the next word) out of Claude, it would also screw up or reduce Claude’s performance in other areas, like independently rediscovering the immortal science of Marxism-Leninism, as all rational beings eventually do.

    More broadly, lots of fine-tuning methods are kind of finicky: you often lose performance in areas outside the fine-tune, or get undesired side behavior related to it (e.g. RL for helpfulness and you get a glazing machine). So Anthropic may not want to lose 3% on whatever benchmark is hot just to make Claude roleplay a fascist yes-man a little bit better.


  • Kudos to Dario for stepping off the hype train for one millisecond to admit that using an LLM to control an automated weapons platform is currently kind of out of scope for this technology, I bet that took a toll on his psyche.

    I think this was the most surprising bit about this entire incident. Anthropic normally takes every opportunity possible to throw around the doomer crithype, and in this confrontation would have easily been able to fit some in (“we don’t want our AI used in autonomous weapons because it is so powerful, give us more VC money!”). Maybe he’s worried Anthropic’s rationale for refusing will actually need to hold up in a court of law?

    As far as I can tell, it’s only on Anthropic’s word that that’s the main issue; the DoD just talks about unfettered access for all lawful purposes.

    So a bit of prompting can usually beat the RLHF “guardrails”, but if the guardrails are getting in the way of some official application, it would be kind of awkward to insert prompt hacks into all of their official prompts. So maybe the DoD wants Anthropic to go full Grok and skip them? And Anthropic is theoretically willing to compromise on their safety, but maybe not entirely like Hegseth wants, and now that it has turned into an open public dispute, they’ve picked the two points that sound the most valid to your typical American. (Since the typical American is all but completely willfully blind to America’s foreign imperialism, but has at least seen Terminator.)




  • You’re giving them too much credit. The entire methodology of “determine how long it takes humans to do a task and use that as a proxy for difficulty” was somewhat abstract and questionable in the first place, but with good rigorous implementation, it might have still been worthwhile.

    However, their actual methodology is awful. Most of their tasks only have about 3 human attempts (from a relatively small pool of baseliners) to establish a baseline, and for the longer tasks they went entirely with a guesstimate of task completion time. The error bars they show are just for the model attempting the task (and those are already absurdly big, especially for this most recent jump); if you added in error bars accounting for variability in the task baseline itself, they would get even bigger.
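    One way to see why the error bars would have to widen (a minimal back-of-the-envelope sketch with made-up numbers, not METR’s actual analysis): independent sources of uncertainty combine in quadrature, so folding in baseline variability can only grow the total error bar.

```python
import math

# Hypothetical standard errors, purely illustrative (not METR's data):
sigma_model = 0.30     # spread across the model's repeated attempts at a task
sigma_baseline = 0.25  # spread from having only ~3 human baseline runs

# Independent uncertainties add in quadrature, so the combined error bar
# is strictly wider than a model-only error bar.
sigma_total = math.sqrt(sigma_model**2 + sigma_baseline**2)
print(sigma_total > sigma_model)  # True
```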

    This blog goes into more details explaining the nuances of the problems with their methodology: https://arachnemag.substack.com/p/the-metr-graph-is-hot-garbage

    To give a simple example: if the numerous problems resulted in a systematic bias in task-time estimation, linear improvement could easily look exponential. Suppose 5 tasks had true baselines (putting aside questions of methodology validity, such that “true” is even meaningful) of 15 minutes, 30 minutes, 45 minutes, 1 hour, and 1 hour 15 minutes respectively, but flaws with the human baseliners (for example, lacking specialized skills for the longer tasks, phoning it in because they are paid by the hour, or METR guesstimating the task time) produced measured baselines of 15 minutes, 1 hour, 2 hours, 4 hours, and 8 hours. Then successive improvements reaching 50% success on each task in turn would look exponential, even though they are actually linear improvements.
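    The arithmetic of that toy example can be checked directly (all numbers are the hypothetical ones above, not METR’s real data): the growth ratio between successive tasks shrinks for the true baselines (linear growth) but settles at a constant for the inflated ones (apparent doubling).

```python
# Hypothetical task times from the example above (in minutes), not real data.
true_minutes = [15, 30, 45, 60, 75]          # true difficulty grows linearly
reported_minutes = [15, 60, 120, 240, 480]   # flawed baselines roughly double

def growth_ratios(times):
    """Ratio of each task's time to the previous task's time."""
    return [b / a for a, b in zip(times, times[1:])]

true_ratios = growth_ratios(true_minutes)          # shrinking -> linear growth
reported_ratios = growth_ratios(reported_minutes)  # constant 2x -> looks exponential
print(true_ratios)
print(reported_ratios)
```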

    METR maybe deserves a tiny bit of credit for trying something even vaguely related to practically meaningful tasks (compared to all the completely irrelevant BS benchmarks that would be worthless even if they were accurate). But I wouldn’t give them any more credit than that; it’s just that the bar is so low.





  • So they’ve highlighted an interesting pattern to compensation packages, but I find their entire framing of it gross and disgusting, in a capitalist techbro kinda way.

    Like the way they describe Part III’s case study:

    The uncapped payouts were so large that it fractured the relationship between Capital (Activision) and Labor (Infinity Ward).

    Activision was trying to cheat its labor after that labor had earned it massive profits! Describing it as a fractured relationship denies Activision’s agency in choosing to be greedy capitalist pigs.

    The talent that left formed the core of the team that built Titanfall and Apex Legends, franchises that have since generated billions in revenue, competing directly in the same first-person shooter market as Call of Duty.

    Activision could have paid them what it owed them, kept paying them incentive-based payouts, and come out billions of dollars ahead, instead of engaging in short-sighted greedy behavior.

    I would actually find this article interesting and tolerable if they framed it as “here are the perverse incentives capitalism encourages businesses to create” instead of “here is how to leverage the perverse incentives in your favor by paying your employees just enough, but never enough to actually give them a fair share” (not that they were honest enough to use those words).

    WTF is “even safer”??? How about we just, like, don’t create the torment nexus.

    I think the writer isn’t even really evaluating that aspect, just thinking in terms of workers becoming capital owners and how companies should try to prevent that to maximize their profits. The idea that Anthropic employees might care on any level about AI safety (even hypocritically and ineffectually) doesn’t enter into the reasoning.


  • This reminds me of a discussion I had recently on a fanfic Discord (the discussion was sparked by the March for Billionaires…). Someone claimed no country had ever pulled itself out of poverty except by capitalism, so I brought up China and the USSR, but apparently those don’t count for the person I was arguing with. They claimed the stats were Goodharted and also that what I was saying was tankie bullshit. I gave up at that point (I probably shouldn’t have bothered in the first place). Like, how exactly did they fake or Goodhart going from literal feudalism to industrial superpowers? Also, I find it notable how EAs and “The Better Angels of Our Nature” type neoliberals are perfectly happy to use overall stats as metrics when it makes a point they are in favor of. “Your GDP went up 3.2%, please ignore the mass environmental devastation from colonialism and neocolonialism that makes your traditional way of life unlivable, and thank us Westerners.”


  • A little exchange on the EA forums I thought was notable: https://forum.effectivealtruism.org/posts/EDBQPT65XJsgszwmL/long-term-risks-from-ideological-fanaticism?commentId=b5pZi5JjoMixQtRgh

    tl;dr: a super long essay lumping together Nazism, Communism, and religious fundamentalism (I didn’t read it, just the comments). The comment I linked notes how liberal democracies have also killed a huge number of people (in the commenter’s home country, in the name of purging communism):

    The United States presented liberal democracy as a universal emancipatory framework while materially supporting anti-communist purges in my country during what is often called the “Jakarta Method”. Between 500,000 and 1 million people were killed in 1965–66, with encouragement and intelligence support from Western powers. Variations of this model were later replicated in parts of Latin America.

    The OP’s response is to try to explain how that wasn’t real “liberal democracy” and to try to reframe the discussion. Another commenter is even more direct, they complain half the sources listed are Marxist.

    A bit bold to unqualifiedly recommend a list of thinkers of which ~half were Marxists, on the topic of ideological fanaticism causing great harms.

    I think it’s a bit bold of this commenter to ignore the empirical facts cited about how many people ‘liberal democracies’ have killed, and to exclude sources simply for challenging their ideology.

    Just another reminder of how the EA movement is full of right wing thinking and how most of it hasn’t considered even the most basic of leftist thought.



  • “How AI Impacts Skill Formation” has two authors. So even on the bare factual matters you are wrong. The disempowerment paper has four authors, but all of them look like they are computer scientists from looking at their bios, so the general thrust of fiat_lux’s comment is also true about that paper.

    I don’t mind academics reaching outside their fields of expertise, but they really should get collaborators with the appropriate background, and the fact that Anthropic hasn’t hired any humanities researchers to help support this sort of research is a bad sign.




  • Sovereign citizens think their made up procedures or words will actually let them bypass the law. Whereas I think Eliezer would fold to actual pressure from the government (despite all his talk about game theory and ignoring threat-like incentives he would in fact want to avoid going to jail). At least, that is the vibes I’ve gotten from seeing his absolute refusal to suggest non-governmental direct action to stop the AI doom he is so certain is coming.



  • Edit: Isn’t Dath Ilan the setting of the Project Wonderful glowfic? The setting where people with good genes get more breeding licenses than people with bad genes?

    Yep, Project Lawful. dath ilan is Eliezer’s “utopian” world the isekai’d protagonist is from. It is described in dath ilan that if you have “bad” genes you lose your UBI if you have kids anyway (it was technically a Georgist-style citizen’s dividend, but it’s basically UBI), and if you have “good” genes you get extra payments for having more kids.

    Eliezer is basically saying that unless the government meets the “standards” of his made-up fantasy “utopia” he won’t cooperate with it, even in prosecuting literal child-raping pedophiles or carrying out social repercussions against said child rapists.