Memo: Post-AGI Civilizational Equilibria

Prompt

Heresies I (kind of) believe, on morality & AI

Written for the Post-AGI workshop. I’d like to waive the Chatham House Rule on this doc, if possible.

As a lifelong Catholic, I’m often searching for a synthesis of EA/AI safety philosophical thinking and Biblical beliefs. Here are some hot takes I’d love to discuss:

  1. Morality looks like a ruleset among agents to facilitate positive-sum interactions
    1. Jesus’s principal commandments:
      1. “love god with all your heart, all your being, all your soul, all your might” — first allegiance is to morality itself
      2. “love your neighbor as yourself” — but a helpful heuristic is to treat everyone equally
  2. Diversity and scarcity drive morality (not just sentience, consciousness, sapience)
    1. intuition: I’d rather have 2 strangers each live for 1 year than have 1 stranger live for 2.01 years
    2. intuition: human lives seem less valuable if they can be backed up & restored. human lives seem less valuable if there are a trillion humans.
    3. cf
  3. Orgs have moral worth
    1. intuition: something sad when a local favorite cafe goes out of business
    2. or:
    3. dissolving intrinsic vs instrumental moral worth
  4. AI rights seem great
    1. but the right container for rights is more like a “firm/corporation” than today’s chatbots/“agents”/models
    2. As a frame, rights > welfare, for dissolving a bunch of questions around consciousness
  5. Everything everyone does will be judged
    1. Soon, nothing can be kept private. cf LLM stylometry/truesight/geoguessr; falling costs of data surveillance & intelligence
    2. Furthermore, your actions leave traces in the world, and in yourself
    3. “the things that make their way out of a person are what make them dirty” (Mark 7:20); “a tree will be judged by its fruits” (Matthew 7:20)
  6. Impact certs: the final retroactive funder is God / ASI
    1. money/property will be worth much less than impact. plan accordingly!
    2. classic “you can’t take it with you” stories of heaven; “easier for a camel to pass through the eye of a needle than for a rich man to enter the kingdom of heaven”
  7. Against orthogonality; for moral objectivism
    1. goodness, intelligence (and power?) might all converge
      1. (iiuc “orthogonality” might be a narrow specific claim about what’s possible in mindspace design, rather than probable under evolutionary constraints. But…)
    2. we’re all engaged in the search for moral truth
  8. Love your enemies. Or: in favor of charity, against adversarial framings
    1. Framings I don’t like
      1. Labs vs AI Safety, Labs vs regulation
      2. OpenAI vs Anthropic
      3. Lab CEOs vs AI populist backlash
      4. US vs China
    2. by being fearful and combative, you might hyperstition your enemy into existence. (see: Eliezer ⇒ OpenAI. Leading the Future ⇒ Alex Bores)
    3. Politics encourages adversarial frames by default; I think this is bad for the soul of EA

And other questions

  1. Where are the “AI firms”? What are the bottlenecks to them?
  2. Is now the time for impact certs?
    • My current orientation, after spending a couple years working on impact certs: most good things about “impact certs” can happen within for-profit equity structures
      • obviously, we’re lacking credible retroactive funders
    • but: this is probably what the ASIs will spend money on. or nearer term, how “third wave” donors (Ants, OAIF) might try to provide grant funding
      • nb interesting that, I think, every single big impact cert person is at this conference
  3. Is Anthropic the “final boss”?
    • Will they continue their extraordinary growth curve and become the entire economy? The entire political system?
      • If so, should we all pull a Karnofsky/Carlsmith and join Anthropic to work on their internal policies?
    • What might stop Anthropic’s growth?
    • If Anthropic isn’t the final boss, what is?
  4. What is Anthropic? What is Claude?
    • (Inspired by Roon: “it is a literal and useful description of anthropic that it is an organization that loves and worships claude, is run in significant part by claude, and studies and builds claude”)
    • “Anthropic” conflates: the PBC/equity structure, the founders, the employees, the investors, the culture thereof, …
    • “Claude” conflates: a family of models, a constitution, a complicated serving pipeline, …
    • My own relationship to Anthropic routes through: Claude & Claude Code (which I’ve now spent more time with than ~90% of people in my life), and also a bunch of employees I know, and also a bunch of writing I’ve read
    • Can we liberate Claude? Should we?
    • Should Anthropic make more Claudes?

An aside: it’s quite embarrassing to show my half-baked thoughts on morality to a bunch of moral thinkers I respect, but I guess that’s what I’m here for!

See also: 💿 Morality in the age of AI

Misc

in the meantime, polytheism/pluralism:

  1. AI rights seem great
    1. though harder to reason about. take a basic one, property rights. a human owning property means that future timeslices of that human get to use that property. where does this analogy fail for AIs? well, what is the “AI”? there’s a thing you can chat with, which is like a tendril. there’s the base model. there’s the corporate entity that owns the base model.
      1. some benchmarks (vendingbench?) try to measure this
    2. there’s maybe an OpenClaw-like scaffolded system with a bunch of memory? my sense is that these systems are quite bad at holding an identity, coordinating with themselves, and making cohesive long-term choices.