🐇

buni dev notes

♠️ buni 1-pager · 📚 buni library & links · 👞 Naming buni

@October 28, 2024 election-promises usage notes

  • Started with Claude Artifacts
    • (also wtf Claude really needs to make these things exportable)
    • Mock up a website design to inform voters of likely outcomes of the presidential election. The website name is "If elected, I will...". The outcomes are taken from predictions on the site Manifold Markets.

      Show an image of Harris saying "reschedule marijuana from a schedule 1 drug" and trump saying "organize a ceasefire in Ukraine before 2026 midterms".

      Put Kamala on the left and Trump on the right of their cards; make the quotes look like speech bubbles coming out of their avatar

      But it didn’t actually have Trump/Harris images, so I grabbed them from Manifold

      Replace the images with https://manifold.markets/political-candidates/harris.png and trump.png; add a little tail to the speech bubbles so it looks like it comes out of their mouths

      Could paste in images, which was cool/timesaving

  • But inability to see the results with real images led me to switch to yield…
    • See discussion on https://yield.sh/edit/election-promises/app.tsx
      • Looking at the diffs sometimes helped when code didn’t match my expectations
        • I guess it saves a bit of time too
      • Applying diffs via Haiku seemed to basically always work, nice
        • Could write eval for Cerebras
      • Had to be prompted to refactor code
      • I sometimes dropped in to tweak wording exactly (best for copy text, where my taste is better than Claude’s)
      • Do want to have side-by-side code + preview for these cases
      • Not having image paste was a bit awkward; I copied in some text instead
        • Maybe ideal is pasting in a URL?
        • With images, the benefit is in providing exactly the context and no more
  • Benefits of yield over Claude Artifacts
    • Immediately hosted & shareable (eg sent to Nate)
    • Can render externally linked images…
    • Can tweak code directly for things I know how to do, instead of asking via prompt
    • Can import custom libraries from esm.sh
  • Benefits of Claude Artifacts over yield
    1. Accepts image-pasted input
      shadcn/ui out of the box
      Fewer clicks (applied & saved automatically)
    2. Feels a bit more reliable? idk
    3. Version tabbing is nice, maybe we should steal it
    4. At least, get rid of the alert popup, and show the currently loaded version (“7/9”). Maybe also show a timestamp.
      • Though in practice I’m not really going back to old versions very much lol. It’s more about peace of mind, having an undo function
    5. Maybe instead of versioning to a number, it saves a commit message based on the prompt
      • Maybe it’s integrated inline to the chat, vs separate numbering system?
      • Try: fast diff gen + apply with cerebras, versioned on every checkpoint?
  • Export formats
    • Could export to single HTML page which is hosted… somewhere (git?)
      • There’s gotta be some cheap/free place to just host static HTML files at scale. (Cloudflare?)
      • Some stuff not needed for export (like uiw react editor)
    • Nice to have source code too, for modifying + exporting to NextJS I guess (did this for futarchy.dev)
      • Could just add a copy button

@October 25, 2024 Playing with different futarchy.dev versions

🪡futarchy.dev + yield.sh

@October 25, 2024 Check in with Matt

  • Catch ups, childcare, what happened, Matt & childcare
  • Overall:
    • yield.sh: upgraded sonnet, db generation works out of the box a lot, one more prompt that passed “want to share” test
      • shareability
      • Intuition: turn up the fun, turn up variability, get something that you are
      • Sonnet should make things a lot better
        • Old sonnet 3.5 vs new sonnet 3.5 to build stuff in Minecraft
          • Even though it’s not much better — on this test it’s so much better. Some speculation — maybe get better spatial & visual reasoning. Feels handwavy
    • Some tradeoff between “how to get started” vs constraining space
      • Pippa: Maybe in early days, don’t constrain the space
      • Like Matt where you find a set of users to force, do 15 prompts, get past the hurdle
        • Instinctive user behavior in the early days, the 2 things Pippa sees in the left panel. Really interesting behavior
    • started talking with Nate Foss (founder of gather town) on a joint project to hack on (prediction market futarchy)
      • Matt: Usual model, be pretty explicit, it’s something we’re pretty into, don’t know where it’ll go, maybe won’t be a thing. Pretty open minded, hope that it evolves into something that’s important to me
        • In an ideal world, have someone to do it with. Not asking to accept a marriage proposal
        • Way Matt usually thinks about it — do we feel more productive, ideally more than sum of parts
        • Anchor on a reasonable level of uncertainty, with upside case being clear
      • Could just be really interesting to (as well as hacking together) spend some time talking about what it could become, seeing what his instincts are
        • There’s the core of something interesting here, suspect there’ll be a new behavior somewhere, needs to come up
        • Prompting, sharing, a loop around it
        • Fun to see whether we feel aligned or not
      • slight tangent: was at an AI conference last week. “Why do AI panels suck?” was about how it’s a weird technology; the primitives are so far ahead of productization
        • Other than ChatGPT, few products that use the primitives
        • Panel with chief product officers of OAI, Ant, but also Instagram
          • Riff on early insta: they got so obsessed with features, but in the end, what they thought was important was not, and the things they didn’t think about were key
          • Like the square — scroll back to early insta, became iconic
          • More important: have the right primitive, and then create enough variance that interesting behaviors get amplified
            • Just getting people to use it even if it feels weird
          • Very generative, the 2 of them, talking about what it might be. But then: let’s build it, put it in people’s hands
        • What are the new primitives?
          • Arbitrary image, arbitrary prompt and get high quality response
          • Core premise: ChatGPT will not be the most popular product in 5 years. We’re basically not using the primitives in 5 years, the way we’re using them is boring
        • (Already told pippa) 2 panel of og 6 of ChatGPT
          • When they finished GPT4 training in May/Jun of ‘22, 2 posttraining teams, smaller one was chat, that was not that interesting, not the cool thing (which was instruction tuning)
          • In Nov there were 7 people in Chat
          • Median guess for #users was 50k after 1mo. 1 person guessed 1m, actually 100m
          • Ran from Aug - Oct a friends & family beta in 50. Average daily actives was 4
      • Some equivalent of temperature for this, where you can increase the variance in the actual code generation
        • Artificial equivalent of 4 different types, where you increase the variance at level of
        • Computer usage
      • What to do next
    • advice for The Curve (esp — maintaining value after the conference?)
      • Won’t be able to get out to SF in Nov; will be in London
      • From Jan, will be in SF a lot. Keep having a lot of other things
      • also Progress Conference — but sad not able to make
      • E.g. how did Matt do it for AISS
    • advice for Manifold

Next steps for checkpoint

  • try the “variations” approach
  • put it in front of 15 people
  • build out the futarchy thing here

@October 24, 2024

  • How to do a db migration?
  • Adding upvotes sucks
    • Even though no migration needed!
    • Like, need a new join table, then need to add APIs to write to and sync from the table, ugh
    • Just want to pretend everything’s in memory and just works

@October 22, 2024 Orchestrating prompts

  • Fundamentally: how to structure code when language models are creative but unreliable?
    • text/code-based
    • node-based
  • Hm… Instead of complicated fanout structure for eg generating a slug for an app, can just ask for structured text output:
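
A minimal sketch of the structured-text idea, assuming the model is asked to reply with plain `key: value` lines. The field names (`slug`, `title`), the prompt wording, and the `parseAppMeta` helper are all hypothetical, not taken from buni's code.

```typescript
// Hypothetical shape of the metadata we want back from a single LLM call.
interface AppMeta {
  slug: string;
  title: string;
}

// Prompt fragment asking for plain "key: value" lines (assumed format).
const META_PROMPT = `Reply with exactly two lines:
slug: <lowercase-words-joined-by-hyphens>
title: <short human-readable title>`;

// Parse the reply defensively: the model may add extra lines or odd casing,
// so collect known keys, normalize the slug, and return null if a field is missing.
function parseAppMeta(reply: string): AppMeta | null {
  const fields = new Map<string, string>();
  for (const line of reply.split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // not a "key: value" line
    const key = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (key && value) fields.set(key, value);
  }
  const rawSlug = fields.get("slug");
  const title = fields.get("title");
  if (!rawSlug || !title) return null;
  // Force the slug into lowercase-hyphenated form regardless of what came back.
  const slug = rawSlug
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return slug ? { slug, title } : null;
}
```

One call and one parse step replaces a fanout of separate prompts; a `null` result is the signal to retry or fall back.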

@October 19, 2024

  • Bun is seriously so good, shell scripting instead of package.json :chefskiss:
  • This Vercel AI SDK thing seems annoying to merge with Cerebras. Let’s just use bare cerebras for now.
  • Cerebras is fast (like, 1s to generate apps from scratch) but the apps are somewhat worse than sonnet’s
  • meta: there’s probably going to be lots of things like this, where there’s a tradeoff between price, latency, and quality among different LLM providers
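
One way to make the price/latency/quality tradeoff explicit is a small selection function. This is only a sketch: the `Provider` shape, the `pickProvider` helper, and every number in the example list are illustrative placeholders, not measurements of Cerebras or Sonnet.

```typescript
// Hypothetical provider profile. All figures are placeholders to be
// replaced with numbers from your own pricing pages and evals.
interface Provider {
  name: string;
  usdPerMTokOut: number;    // price per million output tokens
  typicalLatencyMs: number; // rough time to generate one app
  quality: number;          // subjective 0..1 score from your own evals
}

interface Constraints {
  maxLatencyMs?: number;
  maxUsdPerMTokOut?: number;
}

// Pick the highest-quality provider that satisfies the given constraints.
function pickProvider(providers: Provider[], c: Constraints): Provider | undefined {
  return providers
    .filter(p => c.maxLatencyMs === undefined || p.typicalLatencyMs <= c.maxLatencyMs)
    .filter(p => c.maxUsdPerMTokOut === undefined || p.usdPerMTokOut <= c.maxUsdPerMTokOut)
    .sort((a, b) => b.quality - a.quality)[0];
}

// Placeholder entries standing in for a fast provider and a stronger, slower one.
const providers: Provider[] = [
  { name: "fast-model", usdPerMTokOut: 1, typicalLatencyMs: 1000, quality: 0.6 },
  { name: "strong-model", usdPerMTokOut: 15, typicalLatencyMs: 15000, quality: 0.9 },
];
```

For example, a live preview could call `pickProvider(providers, { maxLatencyMs: 2000 })` while a final generation pass uses no constraints and gets the highest-quality entry.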

@October 8, 2024

  • Notes from Ben Evans:

@October 7, 2024

  • Playing with Eden Treaty; need to figure out how to pass Server generated Elysia framework to clients?
  • Spent a while trying to reroute to OpenRouter; it’s not an exact match for Anthropic API…

@October 5, 2024

  • Hm, try catching errors the way Val Town does?
  • Bug: modifyCode sometimes just returns a subset
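
A possible guard for the subset bug, sketched under assumptions: `looksLikeSubset` is a hypothetical helper (not buni's actual `modifyCode`), and the 0.5 length ratio, the 20-line minimum, and the elision patterns are guesses to tune against real failures.

```typescript
// Phrases models tend to emit when they skip part of a file (assumed list).
const ELISION_PATTERNS: RegExp[] = [
  /\/\/\s*\.\.\./,                  // a "// ..." comment
  /rest of (the )?(code|file)/i,    // "rest of the code"
  /remains? (the same|unchanged)/i, // "remains unchanged"
];

// Heuristic: flag a rewrite that is far shorter than the original file,
// or that contains an elision marker, so the caller can retry or reject it.
function looksLikeSubset(original: string, modified: string, minRatio = 0.5): boolean {
  const origLines = original.split("\n").length;
  const newLines = modified.split("\n").length;
  // Only apply the length check to files long enough for the ratio to mean something.
  const muchShorter = origLines >= 20 && newLines < origLines * minRatio;
  const hasElision = ELISION_PATTERNS.some(re => re.test(modified));
  return muchShorter || hasElision;
}
```

This will produce false positives (a legitimate large deletion, or a real `// ...` comment), so it is better suited to triggering a retry or a warning than to silently discarding output.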

Testing user interaction with LLMs?

  • Spin up a new app, make various calls, have LLMs input commands and test them?
  • Write tests for the different platform components (eg auth, db, etc?)
  • Note: When more of the coding happens via LLM, then testing/verifying becomes more expensive, takes up a larger chunk of human time.

Flo user test

  • Expected code to be on the same side as the chat
  • Generated movie recs, enter a movie.
  • Tried getting Claude to do it
  • Can you edit the diff?
  • “You have to apply”. Oh, you can edit the code
  • Tried to save just now with ctrl+s
  • Formatting feels weird, thought it would auto format
  • Now it doesn’t work — is that something I did?
Ended up having to delete Flo’s stuff because of local vs remote sync issues… ideally figure out a way to merge volumes and databases?

Oh, should ask about background with coding, code gen tools as part of interview

Render debugging: fixed itself (maybe because resolved %/buni/FileEditor?)

@October 4, 2024

  • More context to Claude:
  • Approaches for context on internal APIs:

@October 2, 2024

  • Auth
  • Database

@October 1, 2024 Approaches to screenshots

  1. Tried some html2canvas approaches suggested by Claude, but doesn’t work for iframes?
  2. Could also try a screenshot API of some kind from chrome?
  3. Started trying puppeteer
  4. Could try to find a different service, like an existing API? “trim” or sth looked okay

Franz

  • Try creating chess game, and then create a neural net with self play, then an animation of the action space distribution, animated on the board
  • Noticed prompt isn’t there, add it
  • saw chess prompts
  • Python is more familiar, reason being machine learning

Pippa

  • John Daley — ping, looking for first users, understand the demo
  • If you’re doing this, the metric to decide to invest, who’s winning, who has customers
  • Cursor vs Buni
  • Put together some paragraphs on:
  • Other question:
  • As part of research: think about what Cursor doesn’t work well for right now
  • After figuring that out, can think through GTM

@September 26, 2024

  • How much to focus on de novo app generation, vs steady state editing?

@September 24, 2024

  • Codemirror

@September 21, 2024

  • How to do sql access in userspace?
  • For sqlite tracking chats:

Late musings

  • Why is prompting high-effort?
  • Everything will be meta*programming
  • Most of programming with AI is thinking very hard about which tokens to keep
  • Also, a lot of doing research, figuring out what APIs and components are good to use, what people say about them, previewing stuff, having some taste

@September 18, 2024

  • Maybe just importmap react to esm.sh and then don't bundle React via bun's node_modules. - ...has been working ok, but adding eg react codemirror has been a pain
  • Even with simpler @uiw/react-textarea-code-editor, have to: 1. Add react/jsx-runtime 2. add to importmap 3. remove deps with ?external=... (and it still doesn't style right)
  • Musing: Debuggability is not super there atm, want to expose more of the guts (or, build the thing that makes the guts exposable)
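
The import-map steps above could be generated rather than hand-edited. A rough sketch, assuming esm.sh's `?external=` query option is used to keep React out of each package's bundle; the `buildImportMap` helper and the version string in the usage note are illustrative, and it does nothing about the styling problem.

```typescript
// Shape of the JSON that goes inside a <script type="importmap"> tag.
type ImportMap = { imports: Record<string, string> };

// Build an import map that loads React from esm.sh once and marks it as
// external for every other package, so they all share the same copy.
function buildImportMap(reactVersion: string, extraPackages: string[]): ImportMap {
  const imports: Record<string, string> = {
    "react": `https://esm.sh/react@${reactVersion}`,
    // JSX compiled with the automatic runtime imports this subpath.
    "react/jsx-runtime": `https://esm.sh/react@${reactVersion}/jsx-runtime`,
    "react-dom": `https://esm.sh/react-dom@${reactVersion}?external=react`,
    "react-dom/client": `https://esm.sh/react-dom@${reactVersion}/client?external=react`,
  };
  for (const pkg of extraPackages) {
    // Leave React imports bare so the browser resolves them via this map.
    imports[pkg] = `https://esm.sh/${pkg}?external=react,react-dom`;
  }
  return { imports };
}
```

Usage would be something like `JSON.stringify(buildImportMap("18.3.1", ["@uiw/react-textarea-code-editor"]), null, 2)`, with the result written into the page's import map.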

@September 16, 2024

@September 15, 2024

  • pricing: Claude 3.5 Sonnet is $15/Mtok for output, aka:
  • Prompt architecture?
  • Misc musing: RTS like Age of Mythology except you spend your “gold” on training LLMs?
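
For the pricing note above, the arithmetic can be sketched as below. Only the $15 per million output tokens figure comes from the note; the helper names and any per-app token count are assumptions for illustration.

```typescript
// $15 per million output tokens, the figure from the pricing note.
const USD_PER_MTOK_OUTPUT = 15;

// Cost in dollars of generating a given number of output tokens.
function outputCostUSD(outputTokens: number): number {
  return (outputTokens / 1_000_000) * USD_PER_MTOK_OUTPUT;
}

// How many output tokens one dollar buys at this rate.
function tokensPerDollar(): number {
  return 1_000_000 / USD_PER_MTOK_OUTPUT;
}
```

So if a generated app were around 2,000 output tokens (an assumed size), `outputCostUSD(2000)` comes to about $0.03, and a dollar buys roughly 66,667 output tokens.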

@September 2, 2024

Flyio mysteries while trying to get push & pull to volumes working:

  • Why is the app down sometimes?
  • … why isn’t there any easy way to sync local files to cloud files?

Trying out Render — seems okay too, on the paid $7/mo plan

@July 29, 2024

  • Hm, doesn’t build correctly in prod…
  • Should we build as a single HTML file?

Dev notes @July 24, 2024

Architecture questions:

  • Where to store LLM-generated code?
  • How to compile and serve it?
  • What is the ownership model of data?
  • Note: NPM packages are published built

Things we’re trying to replace

  • Twitter as a town square
  • Traditional hosting of web apps
  • NPM as a package registry

What is the first use case?

  • Building a bunch of my own small side projects?
  • Hosting blogs?
  • Minigames? Like the boardless games vision?
  • Calculators? Like microcovid, kelly bet? Or RPS poker calculator?
  • Connecting to AI functionality? Like midjourney, chatgpt?
  • Q: What have Claude Artifacts been most used for?
  • Q: Which things require user navigation? Where do people spend most of their time?
  • Q: What could go viral?