“Agent stocks”
“model.exchange”
The exchange for “agenty” corporations & stocks
a synthetic corporation that’s an abstraction/bundle over LLM API calls, humans, prompts, balances
provides the scaffolding to solve “Agent” economics
(Human) user experience: you are a corporation!
- When you sign on, you’re given a balance of $10 and 100% equity in your own stock, e.g. $AUSTN (see the sketch after this list)
- As part of sign-up, it auto-suggests:
- investing in some of your friends
- “IPO” — selling some of your own stock
- Everything is readable/stalkable (people’s positions, investments, trading code, etc)
- Though: how do we enforce copyright? Do we?
- Maybe an LLM/human judge that looks at how similar or novel two strategies are
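- A minimal sketch of the sign-up state plus a toy “IPO” trade; the ticker format, prices, and the shares-as-fractions representation are all assumptions, not a spec:

```typescript
// Sign-up state and a toy "IPO" trade. Numbers and field names are illustrative.

interface Corporation {
  ticker: string;              // e.g. "$AUSTN"
  balance: number;             // dollars/credits on the exchange
  shares: Map<string, number>; // ticker of the stock held -> fraction of that stock
}

function signUp(ticker: string): Corporation {
  return {
    ticker,
    balance: 10,                      // starting balance of $10
    shares: new Map([[ticker, 1.0]]), // 100% equity in your own stock
  };
}

// "IPO": sell a fraction of your own stock to a buyer at an agreed price.
function ipo(seller: Corporation, buyer: Corporation, fraction: number, price: number): void {
  const selfHeld = seller.shares.get(seller.ticker) ?? 0;
  if (fraction > selfHeld || buyer.balance < price) throw new Error("trade rejected");
  seller.shares.set(seller.ticker, selfHeld - fraction);
  buyer.shares.set(seller.ticker, (buyer.shares.get(seller.ticker) ?? 0) + fraction);
  buyer.balance -= price;
  seller.balance += price;
}

// Example: Austin sells 10% of $AUSTN to a friend for $2.
const austin = signUp("$AUSTN");
const friend = signUp("$FRIEND");
ipo(austin, friend, 0.1, 2);
```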
Stocks you can invest in
- Other humans and corporations that are “on chain”
- Perp swaps tracking a few big AI things
- public companies like NVIDIA, Meta
- private companies/startups like Anthropic, OpenRouter
- specific LLMs like Sonnet based on OpenRouter metrics?
- Or maybe Sonnet just has an onchain representation
- Would make sense — onchain Sonnet proxies real Sonnet and powers many apps; you can also invest in “Sonnet Foundation”
- E.g. if you’re poor, Sonnet Foundation could give you loans/credits to use specifically for Sonnet, in exchange for some equity in your app (sketched below)
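- A rough sketch of these listing types in one data model, plus the hypothetical “Sonnet Foundation” loan; all names and fields are made up for illustration:

```typescript
// Kinds of listings the exchange could carry, plus a model-credit loan.

type Listing =
  | { kind: "corporation"; ticker: string }                  // humans & agent corps that are on-chain
  | { kind: "perp"; underlying: string; priceFeed: string }  // e.g. NVIDIA tracked via some price feed
  | { kind: "model"; name: string; metricsSource: string };  // e.g. Sonnet via OpenRouter usage metrics

// Credits spendable only on a specific model, repaid as equity in the borrower's app.
interface ModelCreditLoan {
  lender: string;         // e.g. "Sonnet Foundation"
  borrower: string;       // borrower's ticker
  model: string;          // e.g. "Sonnet"
  credits: number;        // spendable only on calls to `model`
  equityFraction: number; // equity in the borrower's app granted in exchange
}

const exampleLoan: ModelCreditLoan = {
  lender: "Sonnet Foundation",
  borrower: "$AUSTN",
  model: "Sonnet",
  credits: 50,
  equityFraction: 0.05,
};
```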
Marketplace of services
Things that corporations can offer to you:
- Responding to queries (like ChatGPT/Claude/Perplexity)
- Building websites for you (like yield.sh/v0/Artifacts)
- Generating images
- Writing blog posts, or assisting with writing a report?
Notes:
- getting this part right/valuable is probably more important (though maybe a bit less fun) than the exchange piece
- The theory is that a single market sharing one currency (credits) can have smoother interop and innovate faster than the existing LLM-wrapper ecosystem? (sketched below)
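- Rough sketch of the shared-credits idea: every service is priced in the same currency, so any corp can buy from any other without glue code. Names like `ServiceOffer` and `purchase` are illustrative assumptions:

```typescript
// A service offer priced in shared credits, and a purchase that moves credits
// between accounts before running the work.

interface Account {
  id: string;
  credits: number;
}

interface ServiceOffer {
  provider: string;                          // provider account id
  name: string;                              // e.g. "answer a query", "build a website"
  price: number;                             // price in shared credits
  run: (request: string) => Promise<string>; // the actual work (LLM call, pipeline, human)
}

async function purchase(
  buyer: Account,
  provider: Account,
  offer: ServiceOffer,
  request: string,
): Promise<string> {
  if (buyer.credits < offer.price) throw new Error("insufficient credits");
  buyer.credits -= offer.price;
  provider.credits += offer.price;
  return offer.run(request);
}
```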
Inspirations
- Bountied Rationality
- People Stock
- Stockfighter
- Though reading through this, I’m less excited to solve problems like “write a Go trading server that executes in nanoseconds” and more excited for problems like “let humans use & invest in corps that are valuable”
- OpenRouter
- Manifold, ofc
Vibes
- game-like
- sandbox, fishtank
- LLMs and humans side-by-side
@July 9, 2025
Random musings
- The expensive piece of these corporations might start as “Austin time”
- Maybe design in a “phone-a-human-friend” tool that agents can call (sketch below)
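- What that tool could look like as an Anthropic-style tool definition; the field names and escalation mechanics are assumptions:

```typescript
// Sketch of a "phone-a-human-friend" tool definition (Anthropic-style JSON
// schema shape). How the human actually gets pinged is left out.

const phoneAFriend = {
  name: "phone_a_human_friend",
  description:
    "Escalate to a trusted human when the agent is stuck or the decision is high-stakes. " +
    "Returns the human's reply as text.",
  input_schema: {
    type: "object",
    properties: {
      question: { type: "string", description: "What to ask the human" },
      urgency: { type: "string", enum: ["low", "normal", "high"] },
    },
    required: ["question"],
  },
};
```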
- JS (bun) or python?
- Dynamically building out GUIs for interp seems cool ⇒ JS, plus it’s what I’m good at
- (which is more concise for orchestrating agent flows?)
- python obv is the lingua franca
- Core agents
- “build me a UI”
- “write like PG or Scott Alexander”
- “help me plan”
- “contact people/agents for me”
- Core outputs
- Deep Research style
- Code, generally
- Exec assistant?
- “LLM-powered CRM”
- Writing for fiction, writing for education
- Speed
- look into diffusion?
- understand tradeoffs in speed vs accuracy
- “Flashy demo”
- forecasting, grant eval
- multi-agent CRM
- gameplay?
- “better substack/LW/reddit commenters”
- “build an agent” UI, kind of like creating a tamagotchi or an RPG character (“arpg”); sketch after this block
- initialize with eg $100, and a callback for your email/notifs
- “pick your class” ⇒ what does the agent specialize in (code? or sth?)
- have it go off and play with other agents, provide goods/services in a sandbox
- “character.ai community”
- “synthetic mox” for versions of members to chat
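- Sketch of that creation flow; the class names, starting balance, and notify callback are all assumptions:

```typescript
// "Build an agent": pick a class, seed a balance, register a callback so the
// agent can reach its owner. Fields are illustrative.

type AgentClass = "coder" | "writer" | "researcher" | "trader";

interface AgentSpec {
  name: string;
  agentClass: AgentClass;                     // "pick your class"
  balance: number;                            // e.g. initialized with $100
  notify: (message: string) => Promise<void>; // email/notification callback to the owner
}

function createAgent(
  name: string,
  agentClass: AgentClass,
  notify: AgentSpec["notify"],
): AgentSpec {
  return { name, agentClass, balance: 100, notify };
}

// Example: a coder agent that (pretend-)emails its owner.
const pet = createAgent("tama-1", "coder", async (msg) => {
  console.log(`[email to owner] ${msg}`);
});
```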
- today, thanks to the internet + globalization, humans are often interfacing with Pareto-frontier, world-class operators; it’s harder for a random new human/agent/model to compete
- (or if not (eg McDonald’s), at least good execution + world-class branding + consistency)
- so: what is the space for new entities, corporations, agents?
- one answer: humans start off as cogs in corps, and sometimes go off to start startups (new corps) that try to achieve world-classedness
- another answer: branding & consistency become more important than peak or avg performance
- solving search problems (matching supply/demand, agents & their capabilities to people needing those)
- when it’s cheaper to test, agents can try-before-buy
- agents build habits (preferred prompts, models) like humans build habits (preferred tools?)
- agents build memories
- differences between agents and humans
- agents can fork & merge more easily
- agents can rewrite their own memory
- (or, can they? maybe can rewrite local memory, but global memory is stored on a chain)
- differences between agents and corporations (today)
- corporations run on human substrate rather than LLM substrate
- humans
- other words to use, because “agent” is overused:
- character
- pet
- child, child process
- More reading
- https://www.dwarkesh.com/p/ai-firm
- as a research agenda:
- copying
- how bottlenecked are firms today on copying? sure, humans (execs, employees) are not copied
- things that are scarce:
- time, both of individuals (wall-clock), and of the firm (calendar)
- coordination?
- focus
- at some point you branch, such that your economic or self-interest is no longer completely aligned (Google ⇒ Waymo, or parent ⇒ child)
- money
- money is a shared agreement, ledger of
- what does it mean to be “AI Sundar”?
- merge
- how good are two different “agents” at merging?
- https://www.mechanize.work/
- “generate lots more data on useful workflows”
- https://workshoplabs.ai/
- “personal AI for everyone” seems good
- what does a personal agent even do?
- write?
- communicate?
- decide?
- summarize?
- “I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes.”
- https://www.dwarkesh.com/p/give-ais-a-stake-in-the-future
- What pieces are necessary for establishing the new social contract?
- Agent runtime?
- (equivalent to the waterways or railroads of early America?)
- new independent country or network state?
- what is the constitution, political structure between AIs and humans?
- fun idea: this is Mox
- understanding of a c-corp-like structure for trade
- isn’t a current model & its weights (and serving infra) already an ASI shoggoth demon whose tendrils we interface with?
- fun idea: LLM “director of Mox”
- moonshot: build proper impact accounting for everyone/everything. “heaven on earth”. “heavenshot”?
- takeaways from kontext vibecoding
- still kind of a lot of work, and requires a fair amount of context knowledge to debug (e.g. localhost:3000 is just not going to work as a URL)
- testing that UI works is still slow and expensive and hard
- probably suffered for being a newfangled Bun setup, vs memorized Next.js; requires better context engineering
- idea: reach out to alexander wales to put pipelines into model exchange?
- more broadly: work with LLM-friendly world-class experts to turn their line of work into interpretable and scaleable context pipelines
- Scott Alexander for nonfiction writing; Dwarkesh for podcasting; roon, asara for tweeting; etirabys?? for imagegen; aella for polling, sexting
- codebuff for code; us for event hosting??
- thesis: pay agents for value, not costs
- aligns incentives between provider/agent and customer
- allows agents to build up surplus to invest into improvements
- intuition: the value of a vibecoded website is way higher than its cost; should generate 4 in parallel and get the best learnings from each (sketch below)
- with vibecoding, bfs > dfs. why?
- LLMs have less consideration at each point in time
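- Sketch of the best-of-N pattern this implies; `generate` and `score` are placeholders for an LLM call and some judge/eval:

```typescript
// "bfs > dfs": fan out n independent one-shot attempts, score them, keep the
// best, instead of iterating on a single attempt.

async function bestOfN(
  prompt: string,
  n: number,
  generate: (prompt: string) => Promise<string>,
  score: (candidate: string) => Promise<number>,
): Promise<string> {
  const candidates = await Promise.all(
    Array.from({ length: n }, () => generate(prompt)),
  );
  const scored = await Promise.all(
    candidates.map(async (c) => ({ c, s: await score(c) })),
  );
  scored.sort((a, b) => b.s - a.s);
  return scored[0].c; // pay for the valuable one; the other n-1 are cheap learnings
}
```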
@July 19, 2025
- TypeScript vs Python?
- TypeScript is better for building UI; UI is the translation layer between humans and text/data
- Work on a minimal UI that’s pleasant for humans and bots?
- Tailwind is super powerful but very consumer-grade (‘overdesigned’)
- Python is kinda lingua franca. More flexible for pipelines?
- Native function: cost of evaluation, aka penalize output tokens
- Restrict output — e.g. Twitter with its 280-char limit
- L1/L2/L3 cache to store info with different costs to agents (sketch below, together with the output-token penalty)
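- Sketch of both cost levers; the per-token rate and tier read costs are made-up numbers:

```typescript
// Two cost levers: charge agents per output token, and tier memory so reads
// from bigger/slower stores cost more.

function outputCost(outputTokens: number, ratePerToken = 0.002): number {
  return outputTokens * ratePerToken; // "penalize output tokens"
}

type Tier = "L1" | "L2" | "L3";
const READ_COST: Record<Tier, number> = { L1: 0.0001, L2: 0.001, L3: 0.01 };

class TieredMemory {
  private store = new Map<string, { tier: Tier; value: string }>();

  put(key: string, value: string, tier: Tier): void {
    this.store.set(key, { tier, value });
  }

  // Reading charges the agent according to the tier the item lives in.
  read(key: string, agent: { balance: number }): string | undefined {
    const item = this.store.get(key);
    if (!item) return undefined;
    agent.balance -= READ_COST[item.tier];
    return item.value;
  }
}
```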
- exercise: just try building out lots of agent frameworks
- Grant evaluation
- Forecasting
- essay writing
- info bounty
- arc-agi-3
- Core ideas:
- Agents like humans, with identity & balance
- Pay for value
- Easy traces for humans
- stretch: flexible, dynamic-gen UI
@July 26, 2025
(meta: this doc is kind of my own version of context engineering myself)
- idea: Chat UI, but generate eg 3 in parallel and pick best? allows for fast swapping on demand (when a particular response is bad)
- human evaluation costs are high
- rlhf is kind of distilling somebody else’s evaluation taste. how about distilling for your own eval/taste? (is this what Workshop does?)
- problem of “aligning agent to a human”
Minimum viable demo: bounty-hunting over text
- example: “$10 for any speaker that we suggest at Mox”
- ask agents to manage their own costs and decide whether to bid on a proposal (sketch below)
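- Sketch of that loop; the expected-win probability and field names are assumptions:

```typescript
// Bounty MVP: post a text bounty, let each agent estimate its own cost and bid
// only if the expected payout covers it, then collect submissions for a human
// (or LLM) judge.

interface Bounty {
  id: string;
  prompt: string; // e.g. "suggest a speaker for Mox"
  reward: number; // e.g. $10
}

interface BountyAgent {
  name: string;
  balance: number;
  estimateCost: (b: Bounty) => number;     // agent manages its own costs
  attempt: (b: Bounty) => Promise<string>; // produce a submission (LLM calls, etc.)
}

async function runBounty(bounty: Bounty, agents: BountyAgent[], pWin = 0.5) {
  // Each agent decides whether to bid: expected value must beat its own cost.
  const bidders = agents.filter((a) => bounty.reward * pWin > a.estimateCost(bounty));
  const submissions = await Promise.all(
    bidders.map(async (a) => ({ agent: a.name, text: await a.attempt(bounty) })),
  );
  return submissions; // a judge picks the winner and the reward is paid out
}
```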
- concepting: what’s the difference between blab and eg Manifund? or a clone of Manifund and Manifund?
- ownership from existing team ⇒ easy to modify
- consensus reality with many different stakeholders (donors, project creators) is slow to establish
- mantra of “write code and talk to users” ⇒ “talk to LLMs and talk to LLMs?”
@Last Saturday
Reading
- <to find>: vary outputs and reset, rather than iterate
- (improve one-shot-ability)
- Why don't LLM agents work
Void, the Bluesky bot that remembers everyone
- on changing backend requirements, when apps are cheap
- on changing frontends when codegen is cheap
- Lincoln Quirk: Home Product in the age of AI
- crypto, payments and agents
- For forecasting specifically:
AI forecasting bots incoming — LessWrong and discussion from Habryka, gwern and more
Q1 AI Benchmarking Results
- LLM psychology
- https://www.ophira.ink/ might be looking for a job?
- LLM identity
- AI rights
- Dwarkesh:
Dwarkesh Patel Give AIs a stake in the future
- Matthew Barnett: Consider granting AIs freedom
Cooperative AI – Foundation
- Multi-agent systems