Maybe the goal of education should be giving people a foundation that they can build on, not making them an expert in something that has a low skill ceiling and diminishing returns.
I'm actually a huge fan of Bret Victor, and I feel like he's kinda missing the dynamic, adaptable nature of AI that allows non-technical people like me to finally access the system layer of computation for our creative ends.
In other words, in many ways AI (or rather LLMs) is the very thing that Bret Victor has spent his whole career imagining and creating: a computing interface that closes the gap between human imagination and creation. But here, he's focusing on the negatives while neglecting, IMHO, the vast potential of AI to allow people to connect, create, and express themselves. As in truly having a PERSONAL computer.
At Dynamicland, he was attempting to build a system that non-technical people like me can interface with in a way that makes sense to us.
Taking your unnecessarily disparaging microwave analogy: using ChatGPT, I can understand it, reprogram it, and do fun stuff, like, I don't know, set up a basketball hoop that sets the timer based on how many shots I make, despite having limited or no technical background. I can tell ChatGPT my crazy vision, and it will give me a step-by-step approach, with proper resources, and respond in a way that I can grok to build this thing.
THIS is why I'm awestruck.
My anecdote is just my personal reaction to the post. Besides, what’s wrong with people expressing themselves freely here?
This was the most surprising/disturbing/enlightening part of the post, imo. Surprising: this person literally had no clue! Disturbing: this person literally had no clue? Enlightening: this person literally did not need a clue.
My takeaway as an AI skeptic is that AI as human augmentation may really have potential?
I feel like AI makes learning way more accessible. At least it did for me, where it evoked a childlike sense of curiosity and joy for learning new things.
I'm also working on a Trading Card Game, where I feed it my drawings and it renders them into a final polished form based on a visual style that I spent some time building in ChatGPT. It's like an amplifier / accelerator.
I feel like, yes, while it can augment us, at the end of the day it depends on our desire to grow and learn. Otherwise, you will end up with the same result as everybody else.
I enjoy the fun attitude; I had a family member say something similar. But I always warn: with powerful AI come serious consequences. Are we ready for those consequences? How far do we want AI to reach into our lives and livelihood?
We managed to survive the nuke and environmental lead (two examples of humanity veering in drastically wrong directions).
We are never ready for seismic changes. But we will have to adapt one way or another; we might as well find a good use for it and develop awareness, the way a child would around handling knives.
We cannot predict what the consequences will be, but as a species we are pretty good at navigating upheavals and opportunities. There is no guarantee that human ingenuity will always save the day, but evolution has bestowed us with risk-taking and curiosity, so we won't stop.
No, we are not good at that. We have had dire warnings of extreme catastrophes heading our way for decades now, and instead of fixing what is broken, we collectively decide to race toward our extinction faster.
Modern humans have been around and successful for tens of thousands of years. This might be a genetic dead end in the relatively near term, but I bet we, and at least one branch of our descendants, will live for many more tens or hundreds of thousands of years.
Genuine question, how would you feel about reading the dual to your comment here?
"I'm a computer scientist who's always struggled to learn how to paint." "Last night I managed to create a gorgeous illustration with Stable Diffusion, and best part ... it was FUN!" "Art hasn't felt this fun for a long time."
Makes sense. I know what I built is nowhere near actual software development. Still I was able to quickly learn how things work through GPT.
Since I've literally been working on this project for two days, here's a somewhat related answer to your question: I've been using ChatGPT to build art for a TCG. Initially I was resistant, and upset that AI companies were hoovering up people's work wholesale for training data (which is why I think now is an excellent time to have a serious conversation about UBI, but I digress).
But I finally realized that I could develop my own distinctive 3D visual style by feeding GPT my drawings and having it iterate in interesting directions. It's fun to refine the style by having GPT simulate an actual camera lens and lighting setup.
But yes, I've used AI to make numerous stylistic tweaks to my site, including building out a tagging system that allows me to customize the look of individual pages when I write a post.
Hope I’ll be able to learn how to build an actual complex app one day, or games.
Maybe CDN isn’t the right term after all, see I’m not a software engineer!
But, basically I wanted a way to have a custom repository of fonts a la Google Fonts (found their selection kinda boring) that I could pull from.
Ran fonts through transfonter to convert them to .woff2, set up a GitHub repository (which is not designed for people like me), and set up an instance on Netlify, then wrote custom CSS tags for my ghost.org site.
The thing that amazes me is that aside from my vague whiff of GitHub, I had absolutely no idea how to do this. Zilch. Nada. ChatGPT gave me a clear step-by-step plan, and exposed me to Netlify, how to write CSS injections, and how ghost.org tagging works from the styling side of things. And I'm able to have a back-and-forth dialogue with it, not only to figure out how to do it, but to understand how it works.
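For anyone curious what the font-hosting part of a setup like this usually boils down to, here is a minimal sketch. The font name and Netlify URL are made-up placeholders rather than the actual setup described above, and the same rules could just as well be pasted straight into Ghost's code-injection field as a <style> block instead of being added from a script.

    // Minimal sketch of self-hosting a web font instead of pulling from Google Fonts.
    // "MyDisplayFont" and the Netlify URL are hypothetical placeholders.
    const fontFaceCss = `
    @font-face {
      font-family: "MyDisplayFont";
      src: url("https://my-fonts.netlify.app/my-display-font.woff2") format("woff2");
      font-weight: 400;
      font-style: normal;
      font-display: swap; /* show fallback text until the font file arrives */
    }
    h1, h2, h3 {
      font-family: "MyDisplayFont", sans-serif;
    }
    `;

    // Inject the rules into the page, equivalent to a <style> block in the site header.
    const style = document.createElement("style");
    style.textContent = fontFaceCss;
    document.head.appendChild(style);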
Sounds more like a Continuous Integration / Continuous Deployment (CI/CD) pipeline - defined as a set of practices that automate the process of building, testing and deploying software. Or rather, fonts in this case.
A Content Delivery Network (CDN) is a collection of geographically scattered servers that speeds up delivery of web content by being closer to users. Most video/image services use CDNs to efficiently serve up content to users around the world. For example, someone watching Netflix in California will connect to a different server than someone watching the same show in London.
Yes, it's an ever-patient teacher that's willing to chop up any subject matter into bits that are just the correct shape and size for every brain, for as long and as deep as you're willing to go. That's definitely one effective way to use it.
Nice! That's what we all want, but without the content reappropriation, surveillance, and public discourse meddling.
Think of a version that is even more fun, won't teach your kids wrong stuff, won't need a datacenter full of expensive chips, and won't hit the news with sensationalist headlines.
I've never experienced Dynamicland in person (only seen videos). However, one concern I have about its demos so far is that they use a projector. So you need a room dark enough for the projected light, and you need to keep your head, hands, and body out of the way of it.
Projectors have been bright enough to be visible in decently lit rooms for ages. The reason you want the room extremely dark for most projector setups is contrast, because the darkest thing you can make in a projected image is the ambient surface illumination (and the brightest is that surface under full power from your projector [0]). If you accept that compromise you don't need a super dark room; the recommendation for tight light control is mostly for media viewing, where you want reasonable black levels.
You do still need to keep hands out of the light to see everything, but that can also be part of the interaction. If we ever get ubiquitous AR glasses or holograms I'm sure Bret will integrate them into DL.
[0] Which leads to a bit of a catch-22: you want a surface that looks dark but perfectly reflects all the colors of your projector, so you need a white screen, which means you ideally want zero light other than the projector to make the projector act the most like a screen.
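To put rough numbers on the contrast point above (purely illustrative values, not measurements of any particular projector or room): the achievable contrast is roughly the projector's peak illumination on the surface plus the ambient light, divided by the ambient light.

    // Rough contrast estimate for a projected image.
    // Darkest achievable level ~ ambient light on the surface;
    // brightest ~ projector at full power plus that same ambient light.
    // All values below are illustrative assumptions.
    function contrastRatio(projectorLux: number, ambientLux: number): number {
      return (projectorLux + ambientLux) / ambientLux;
    }

    console.log(contrastRatio(500, 1));  // very dark room: ~501:1
    console.log(contrastRatio(500, 50)); // daylit room: ~11:1 (washed out, but still clearly visible)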
>you need to keep your heads, hands, and body out of the way of it.
I've seen systems like this that use multiple projectors from different angles, calibrated for the space and the angle. They're very effective at preventing occlusion, and it takes fewer than you'd think (also see Valve's Lighthouse tech for motion tracking).
Unfortunately, doing that is expensive, big, and requires recalibrating whenever it's moved.
The light level isn't an issue in practice: when I visited the actual installation during the day, the building was brightly lit with natural light and the projections were easily visible, to the point that I didn't think about it at the time.
This is true, but modern laser projectors are very, very bright. I use one as my main computer display with no problems with the blinds open, and the sun shining in.
> we aim for a computing system that is fully visible and understandable top-to-bottom
I mean, even for something that is in theory fully understandable, like the Linux kernel, it is not feasible to actually read the source before using it.
To me this really makes no sense. Even in traditional programming we only have such powerful systems because we use a layered approach. You can look into these layers and understand them, but understanding all of them is out of scope for a single human being.
That's because you're conflating "understanding" with "comprehension". You can understand every component in a chain and its function, how it works, where its fragilities lie or capabilities are absent, without reviewing the source code for everything you install. To comprehend, however, you must be intimately familiar with the underlying source code, how it compiles, how it speaks to the hardware, etc.
I believe this is the crux of what the author is getting at: LLMs are, by their very nature, a black box that cannot ever be understood. You will never understand how an LLM reached its output, because their innate design prohibits that possibility from ever manifesting. These are token prediction machines whose underlying logic would take mathematicians decades to reverse engineer, even for a single query, by design.
I believe that's what the author was getting at. Since we can never understand how LLMs reached their output, we cannot rely on them as trustworthy agents of compute or knowledge. Just as we would not trust a human who gives a correct answer much of the time but can never explain how they knew that answer or how they reached that conclusion, so should we not trust LLMs in that same capacity.
I would go one step further: if you look at what Bret Victor has been researching for the past 20ish years it has been all about explaining difficult systems in a simple way. I'm pretty sure he genuinely wants every part to be explained, from the bottom up. You'll also get that sense if you read his library short fiction[1].
Now, whether that's possible is up for debate, but we could definitely use a push in that direction, a system that we can genuinely understand top to bottom, so that we can modify it to what we need. That's where agency comes into his argument.
I get that LLMs are a black box in ways that most other technologies aren't. It still feels to me like they have to be okay with abstracting out some of the details of how things work.
Unless they have a lot of knowledge in electrical engineering/optics, the average user of this isn't going to understand how the camera or projector work except at a very high level.
I feel like the problem with LLMs here is more that they are not very predictable in their output and can fail in unexpected ways that are hard to resolve. You can rely on the camera to output some bits corresponding to whatever you're pointing it at even if you don't know anything about its internals.
Building projection optics is a bench top experiment that we did in 7th grade in school. Electric circuitry isn't exactly rocket science, either. Things like LCD panels for projecting arbitrary images and CCD chips for cameras become harder to understand.
But the point is to make users understand the system enough to instill the confidence to change things and explore further. This is important because the true power of computer systems comes from their flexibility and malleability.
You can never build that level of confidence with LLMs.
I get what you are talking about. My gripe with that is: yes, it would indeed be great if we could at some point get the structure down to such a deep level as to write, pen on paper, on one page, a string of maths symbols that is a good enough description. However, it's possible that for many things that's simply not possible. I suspect it may not be possible in, e.g., biology. Perhaps the great success of physics in the 20th century over-indulged us, so our expectations are out of kilter with the realities of our world.
FWIW I personally describe them as white, not black, boxes. For we know, and can trace, every single bit of the output back to the input. That does not help us as much as we'd like, though. When drilling down into "why did the model answer wrongly 1, and not rightly 2", it comes down to "well, it added one trillion small numbers, and the sum came close to 1, but didn't reach 2". Which is unsatisfactory, and your "understanding" vs. "comprehension" delineates that nicely.
Maybe it's more productive to think of them as "artefacts" rather than "mechanical contraptions". We shape them in many ways, but we are not in complete control of their making. We don't make them explicitly with our hands: we make a maker algorithm, and that algorithm then makes them. Or think of them even as "biological", grown artefacts, given that we don't control the end result fully. Yes, we know and apply the algorithm that builds them, but we don't know the end result beforehand, the final set of weights. Unlike, say, when we are making a coffee machine, where we know all the parts to a millimetre in advance and have it all worked out and pre-planned before embarking on the making of the machine.
You're right, and when the behaviour of large codebases violates expectations, the intertwined webs of code are problems. Alan Kay's Viewpoints Research studied this, and famously Kay proposed "t-shirt sized algorithms", where short rules can be used to make even fancy desktop word processing and presentation software. There are also projects like From NAND to Tetris that show understanding a full stack is achievable. Could this go broader, and deeper? Of course, and this is what Bret Victor is getting at. Not just for when code goes wrong, but to make it more modifiable and creative in the first place (See Maggie Appleton's essay/talk "Home-Cooked Software and Barefoot Developers").
Projects like SerenityOS show how powerful "small" software can be. For instance, its spin-off project, the Ladybird browser, has vastly fewer lines of code than Chromium, and yet it seems the Ladybird team is able to implement one specification after another. Last I saw, they were close to meeting the minimum feature set Apple requires for shipping as a browser on iOS.
There's a fundamental difference between systems that are theoretically comprehensible but practically large (Linux) versus systems whose internal reasoning is inherently opaque by design (modern neural networks).
I'm sympathetic, but I do think Realtalk could be improved with some simple object recognition and LLMing.
One of the challenges I found when I played with Realtalk is interoperability. The aim is to use the "spatial layer" to bootstrap people's intuitions about how programs should work and interact with the world. It's really cool when this works. But key intuitions about how things interact when combined with each other only work if the objects have been programmed to be compatible. A balloon wants to "pop if it comes into contact with anything sharp". A cactus wants to say "I am sharp". But if someone else has programmed a needle card to say "I am pointy", then it won't interact with the balloon in a satisfying way. Or, to use one of Dynamicland's favorite examples: say I have an interactive chart which shows populations of different countries when I place the "Mexico card" into the filter spot. What do you think should happen if I put a card showing the Mexican flag in that same spot, or some other card which just says the string "Mexico" on it? Wouldn't it be better if their interaction "just works"?
Visual LLMs can aid with this. Even a thin layer which can assign tags or answer binary questions about objects could be used to make programs massively more interoperable.
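To make the "thin layer" idea concrete, here is a minimal sketch of what it could look like. None of this is Realtalk's actual API: askModel is a stand-in (a silly keyword heuristic here, just so the sketch runs) for whatever vision or language model call would actually answer the yes/no question, and claimImplies caches answers so the balloon isn't re-asking the model every frame.

    // Sketch of a thin semantic layer between object claims and wishes.
    // Nothing here is Realtalk's real API; every name is made up for illustration.

    async function askModel(question: string): Promise<boolean> {
      // Placeholder: a real implementation would send this yes/no question to a
      // vision/language model and parse the reply. Hard-coded so the sketch runs.
      return /sharp|pointy|spiky|needle|cactus/i.test(question);
    }

    // Cache answers so objects don't re-query the model on every frame.
    const cache = new Map<string, boolean>();

    // "Does an object that claims <claim> satisfy the property <wanted>?"
    async function claimImplies(claim: string, wanted: string): Promise<boolean> {
      const key = `${claim} => ${wanted}`;
      if (!cache.has(key)) {
        cache.set(key, await askModel(
          `An object says: "${claim}". Is it reasonable to treat it as "${wanted}"? Answer yes or no.`
        ));
      }
      return cache.get(key)!;
    }

    // The balloon's own rule stays dead simple; the vocabulary bridging happens in one place.
    async function balloonShouldPop(nearbyClaims: string[]): Promise<boolean> {
      for (const claim of nearbyClaims) {
        if (await claimImplies(claim, "sharp")) return true;
      }
      return false;
    }

    // The needle card never said "sharp", but the layer bridges the gap.
    balloonShouldPop(["I am pointy"]).then(console.log); // true

The objects keep their plain-language claims; only the matching step consults the model, so a pointy/sharp mismatch degrades to one cached yes/no question rather than a silent failure.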
That's similar to the issue with the whole NFT craze where you'd "take items from one game to another": it requires everything to work with everything.
For Dynamicland I get the issue, though. Putting the whole thing through an LLM to make "pointy" and "sharp" both trigger the same effects on another card would just hide the interaction entirely. It could work or not work for reasons completely opaque to both designer and user.
The way you'd figure this out in Dynamicland is you'd look at the balloon, which by custom would have the code taped on somewhere. You'd read that code, figure out what it's looking for, and write said trigger.
I just realized that the OP said they had already played around with Realtalk, but it's too late to edit: sorry for assuming that the printed code is sufficient! Was term mismatch one of the biggest issues you ran into, and if so, was it that the printed code didn't contain enough information?
I'm only just now reading about Dynamicland for the first time, so maybe I'm not understanding something obvious. The text description is not very helpful; as far as I can tell from pictures, it's a place where you can move around physical objects and papers to do computer-programming-type stuff?
Under visibility they say:
>To empower people to understand and have full agency over the systems they are involved in, we aim for a computing system that is fully visible and understandable top-to-bottom — as simple, transparent, trustable, and non-magical as possible
But the programming behind the projector-camera system feels like it would be pretty impenetrable to the average person, right? What is so different about AI?
Dynamicland is bootstrapped in a sense,[0] the same way you write the first compiler/interpreter for a language in another language and then later write it in its own language. The code running the camera and projector systems is also running from physically printed programs; in one of the videos you can see a wall that's the core 'OS', so to speak, of Dynamicland.
I think the vision is neat but hampered by the projector tech and the cost of setting up a version of your own. Since it's so physically tied, and Bret is (imo stubbornly) dedicated to the concept, there's no community building on this outside the local area that can make it to DL in person. It'd be neat to have a version for VR, for example, and maybe some day AR becomes ubiquitous enough to make it work anywhere.
[0] Annoyingly it's not open sourced so you can't really build your own version easily or examine it. There have been a few attempts at making similar systems but they haven't lasted as long or been as successful as Bret's Dynamicland.
That's pretty cool. I figure this is explained in some of the videos but I can't watch them right now.
I'm reading more about the "OS", Realtalk:
>Some operating system engineers might not call Realtalk an operating system, because it’s currently bootstrapped on a kernel which is not (yet) in Realtalk.
You definitely couldn't fit the code for an LLM on the wall, so that makes sense. But I still have so many questions.
Are they really intending to have a whole kernel written down? How does this work in practice? If you make a change to Realtalk which breaks it, how do you fix it? Do you need a backup version of it running somewhere? You can't boot a computer from paper (unless you're using punch cards or something) so at some level it must exist in a solely digital format, right?
Yeah he's put out a fair number of videos and the whole idea makes more sense there or if you can manage to visit in person.
Even if you could squeeze down an LLM and get it to run in Realtalk, I don't think it fits with the radical simplicity model they're going for. LLMs are fundamentally opaque; we have no idea why they output what they do, and as users we can only twiddle the prompt knobs. That's the complete opposite direction from a project that refuses to provide the tools to build your own version, because that would put the program back into the box instead of filleting it out into the physical instantiation.
I wish he'd relent and package it up in a way that could be replicated more simply than reimplementing entirely from scratch.
I'm not sure where to draw the line between Realtalk and the underlying operating system. I'm willing to give it some credit; it's interesting without being written entirely from scratch. IIRC most of the logic that defines how things interact IS written in Realtalk and physically accessible within the conceptual system instead of only through traditional computing.
I believe it runs on stock Linux, but it's an operating system in the sense that it multiplexes tasks and facilitates IPC. The closest analogy I can think of is something like a lisp machine or Smalltalk, where the line between program and OS is really blurry.
Also, if you haven't heard of Folk Computer[1] as a viable alternative, I'd highly suggest checking it out! I'm one of the contributors, and it's definitely not dead (unlike all the other Dynamicland spin-offs I've seen). The head programmers—Omar and Andreas—both worked at Dynamicland for a couple of months, so they've been able to carry over the good parts while also open-sourcing it. The implementations have definitely diverged, but imho in a good way—Folk Computer is working on multithreading and is much more realtime-safe (you'll see in the later Dynamicland videos that it pauses every second or so).
[1] https://folk.computer/
I'll have to take a look and scrounge up a decent projector to try. The other Bret project I was actually even more interested in was the robotics lab project when I was on a big robotics kick in the 2010s.
> You definitely couldn't fit the code for an LLM on the wall, so that makes sense. But I still have so many questions.
You probably could fit the code for an LLM on a wall. Usually the code for an LLM is no more than a couple hundred lines.
Of course the weights wouldn't fit on a wall.
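Some rough arithmetic on why the weights are the problem (illustrative assumptions only, not measurements of any particular model):

    // Back-of-envelope: could you print an LLM's weights on a wall?
    // All numbers below are illustrative assumptions.
    const params = 7e9;        // a "small" open-weights model, ~7 billion parameters
    const charsPerWeight = 8;  // e.g. "-0.0231 " written out in decimal
    const charsPerPage = 3000; // a dense, single-spaced printed page

    const pages = (params * charsPerWeight) / charsPerPage;
    console.log(`~${Math.round(pages).toLocaleString()} pages`); // roughly 19 million pages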
If you’re looking for an open source project like Realtalk there is https://folk.computer/
Here's a video of Dynamicland.[1] The textual description doesn't tell you much.
It's still at the cool demo level, though. How do you scale this thing?
[1] https://www.youtube.com/watch?v=7wa3nm0qcfM
Reminds me of Microsoft's "Surface/Table computing"
https://www.youtube.com/watch?v=kr1O917o4jI
> How do you scale this thing?
You release a tablet and call it Dynamicland? I think that's what Microsoft did, but don't quote me on that.
What do you mean by "scale"? It's designed to be decentralized and to promote the agency of small, co-located groups of people.
The typical “scale” mindset is almost the opposite of that — the people doing the scaling are the ones with agency, and the rest get served slop they didn’t choose!
If the system is an unreliable demo, then that can promote agency. In the same way that you could fix your car 40 years ago, but you can’t now, because of scaled corporate processes.
They've been at this since 2017 and there's only one location where it's working. Occasionally they do a demo somewhere else. It is only used, apparently, when supervised by its promoters. That's the "scaling" issue. They need a few more deployments.
Ah OK, that's a valid criticism. I'd call it reducing the "bus factor", or enabling independent replication of the research results.
Not necessarily "scaling", but the point stands. If the goal is to enable agency, then obviously you want that to happen in multiple places!
Primary and high schools are such an obvious "market" for this. Even if they just get a smart projector on the ceiling and some decent apps, and not all the possibilities.
The difficulty with that is there's no code or instructions to build your own, so despite being "more open than open source" you're stuck implementing it from scratch if you want to make your own. Even if you can make the trek out to the current instance, you can't take it home, because the core interpreter you need to run on a regular system to read the cameras, recognize the fiducial marks, run the interpreter, and output the result to the projectors isn't immediately replicable.
I love the project, but it's nearly a decade old and still lives in one location or in places Bret's directly collaborated with, like the biolab. [0]
[0] https://dynamicland.org/2023/Improvising_cellular_playground...
Folk.computer (https://folk.computer) is an open source version of a DL-like system, and even though the code uses Tcl it's pretty easy to reimplement any bits you see in the Dynamicland archives (I've done this). For example, the code in the video here https://dynamicland.org/archive/2022/Knobs can be translated 1-1 into Tcl and it works the same.
If you really wanted to play around with similar ideas, you don't need a full reimplementation of the reactive engine.
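For a sense of what that core loop amounts to conceptually, here is a hand-wavy sketch (in TypeScript rather than Tcl, with every name invented for illustration; it is not Realtalk's or Folk Computer's actual code). The hard parts are the real camera capture, fiducial detection, and projector calibration hiding behind the stubs.

    // Hand-wavy sketch of a Dynamicland-style perception/reaction loop.
    // Every type and function here is a made-up placeholder.
    interface Page { id: number; program: string }
    interface Claim { subject: string; fact: string }

    // Stubs: a real system would talk to a camera, a fiducial-marker detector,
    // and a calibrated projector here.
    function grabCameraFrame(): Uint8Array { return new Uint8Array(0); }
    function detectFiducials(_frame: Uint8Array): Page[] { return []; }
    function runProgram(_page: Page, _db: Claim[]): Claim[] { return []; }
    function project(_claims: Claim[]): void { /* draw the reactions onto the table */ }

    function tick(): void {
      const frame = grabCameraFrame();
      const pages = detectFiducials(frame);

      // Each page on the table contributes claims; later pages can react to earlier ones.
      let db: Claim[] = [];
      for (const page of pages) {
        db = db.concat(runProgram(page, db));
      }
      project(db);
    }

    setInterval(tick, 33); // re-run the whole "world" roughly 30 times a second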
> you could fix your car 40 years ago, but you can’t now, because of scaled corporate processes.
You can fix your car just fine - just not the electronics. And those were to a large degree added for safety reasons. It is due to the complexity that they are difficult or impossible to fix.
The electronics don't have any more complexity than any other computer system. If you can fix your PC you could fix your car's electronics. Except that they aren't documented. So then your service light comes on, and the car has all kinds of detailed information about why, but the manufacturer doesn't give it to you because they want you to take it to the stealership so they can pick your pocket or try to sell you a new car instead of fixing it yourself or taking it to an independent mechanic.
This isn't about the cost; they already pay the cost to write the documentation or software for their own dealerships. It isn't about other carmakers; any company large enough to actually make a car would have no trouble getting a copy of it from one of the dealers. The only reason it's not published on their websites is that they don't want the vehicle owners and independent mechanics to have it, which is spiteful and obnoxious.
Such an amazing project.
I've made a lot of progress recently working on my own homebrew version, running it in the browser in order to share it with people. Planning to take some time soon to take another stab at the real (physical) thing.
Progress so far: https://deosjr.github.io/dynamicland/
I'm genuinely blown away by LLMs.
I'm an artist who's always struggled to learn how to code. I can pick up on computer science concepts, but when I try to sit down and write actual code my brain just pretends it doesn't exist.
Over like 20 years, despite numerous attempts, I could never get past a few beginner exercises. I viscerally can't stand the headspace that coding puts me in.
Last night I managed to build a custom CDN to deliver cool fonts to my site a la Google Fonts, and create a gorgeous site with custom code-injected CSS and JavaScript (while grokking most of it), and the best part … it was FUN! I have never remotely done anything like that in my entire life, and with ChatGPT's help I managed to do it in like 3 hours. It's bonkers.
AI is truly what you make of it, and I think it’s an incredible tool that allows you to learn things in a way that fits how your brain works.
I think schools should have curriculum that teaches people how to use AI effectively. It’s truly a force multiplier for creativity.
Computers haven’t felt this fun for a long time.
> "I'm an artist"
This is actually what I'm most excited about: in the reasonably near future, productivity will be related to who is most creative and who has the most interesting problems rather than who's spent the most hours behind a specific toolchain/compiler/language. Solutions to practical problems won't be required to go through a layer of software engineer. It's going to be amazing, and I'm going to be without a job.
So your understanding is that the chief value a software engineer provides is experience utilizing a specific toolchain/compiler/language to generate code. Is that correct?
> productivity will be related to who is most creative and who has the most interesting problems rather than who's spent the most hours behind a specific toolchain/compiler/language.
Why stop at software? AI will do this to pretty much every discipline and artform, from music and painting, to law and medicine. Learning, mastery, expertise, and craftsmanship are obsolete; there's no need to expend 10,000 hours developing a skill when the AI has already spent billions of hours in the cloud training in its hyperbolic time chamber. Academia and advanced degrees are worthless; you can compress four years of study into a prompt the size of a tweet.
The idea guy will become the most important role in the coming aeon of AI.
Also, since none of us will have any expertise at all anymore, everything our AI makes will look great. No more “experts” pooping our parties. It’s gonna be awesome!
Why would you be out of a job? Nothing he described is something that someone is being paid to do. Look at everything he needs just to match a fraction of your power.
Consumer apps may see fewer sales as people opt to just clone an app using AI for their own personal use, customized to their preferences.
But there’s a lot of engineering being done out there that people don’t even know exists, and that has to be done by people who know exactly what they’re doing, not just weekend warriors shouting stuff at an LLM.
I think much of HN has a blind spot that prevents them from engaging with the facts.
Yes, AI currently has limitations and isn't a panacea for cognitive tasks. But in many specific use cases it is enormously useful, and the rapid growth of ChatGPT, AI startups, etc. is evidence of that. Many will argue that it's all fake, that it's all artificial hype to prop up VC valuations, etc. They will literally see the billions in revenue as not real, and the same with all the real people upskilled via LLMs in ways that are entirely unique to the utility of AI.
I would trust many people's evaluations of the impacts of AI if they could at least engage with reality first.
Some people see it as a political identity issue.
One person told me the other day that for the rest of time people will see using an AI as equivalent to crossing a picket line.
To me the progress achieved so far has been overhyped in many respects. The numbers out of Google that 25% of their code is AI-generated, or some high number like that? BS. It's gamified statistics that look at code completion (not AI trying to solve a problem) vs what's accepted, and it's likely hyper-inflated even then.
It works better than you for UI prototypes when you don't know how to do UI (and is maybe even faster if you do). It doesn't work at all on problems it hasn't seen. I literally just saw a coworker staring at code for hours and getting completely off track trying to correct AI output, vs stepping through the problem step by step using how we thought the algorithm should work.
There's a very real difference between where it could be in the future vs what you can do with it today in a useful way, and you have to be very careful about utilizing it correctly. If you don't know what you're doing and AI helps you get it done, cool, but also keep in mind that you won't know if it has catastrophic bugs, because you don't understand the problem and the conceptual idea of the solution well enough to know if what it did is correct. For most people there's not much difference, but for those of us who care it's a huge problem.
I'm not sure if this post is ragebait or not but I'll bite...
If anything, HN is in general very much on the LLM hype train. The contrarian takes tend to be from more experienced folks working on difficult problems that very much see the fundamental flaws in how we're talking about AI.
> Many will argue that it's all fake, that it's all artificial hype to prop up VC valuations, etc. They will literally see the billions in revenue as not real
That's not what people are saying. They're noting that revenue is meaningless in the absence of looking at cost. And it's true, investor money is propping up extremely costly ventures in AI. These services operate at a substantial loss. The only way they can hope to survive is through promising future pricing power by promising they can one day (the proverbial next week) replace human labor.
> same with all the real people upskilled via LLMs in ways that are entirely unique to the utility of AI.
Again, no one really denies that LLMs can be useful in learning.
This all feels like a strawman; it's important to approach these topics with nuance.
I was talking to a friend today about where AI would actually be useful in my personal life, but it would require much higher reliability.
This is very basic stuff, not rewriting a codebase, creating a video game from a text prompt, or generating imagery.
Simply: I would like to be able to verbally prompt my phone with something like "make sure the lights and AC are set so I will be comfortable when I get home, follow up with that plumber if they haven't gotten back to us, place my usual grocery order plus add some berries plus anything my wife put on our shared grocery list, and schedule a haircut for the end of next week some time after 5pm".
Basically 15-30min of daily stupid personal time sucks that can all be accomplished via smartphone.
Given the promise of IoT, smart home, LLMs, voice assistants, etc.. this should be possible.
This would require it having access to my calendar and location, and the ability to navigate apps on my phone, read/send email and texts, and spend money. Given the current state of the tools, even if there is a 0.1% chance it changes my contact card photo to Hitler, replies to an email from my boss with an insult, purchases $100,000 in bananas, or sets the thermostats to 99F... then I couldn't imagine giving an LLM access to all those things.
Are we 3 months, 5 years, or never away from that being achievable? These feel like the kind of things previous voice assistants promised 10 years ago.
because the preachers preach how amazing it is on their greenfield 'i built a todo list app in 5 minutes from scratch' and then you use it on an established codebase with a bigger context than the llm could ever possibly consume and spend 5x more time debugging the slop than it would've taken you to do the task yourself, and you become jaded
Stop underestimating the amount of internalized knowledge people have about projects in the real world; it's so annoying.
An LLM can't ever get close to it. There's some guy on a team in another building who knows why a certain weird piece of critical business logic was put there 6 years ago. The LLM will never know this, and won't understand it even if it consumed the whole repository, because it would have to work there for years to understand how the business works.
A completely non-technical saleslady on our team prototyped a whole JS web app that generated some data based on user inputs (and even generated PDFs), which solved a problem our customers were having that our devs didn't have time to address yet.
This obviously was a temporary tool we'd never let touch our github repo but it still very much worked and solved a niche problem. It even looked like our app because the LLM could consume screenshots to copy our designs.
I'm on board with vibe coding = non-maintainable, non-tested, mostly useless code by non-devs. But on the plus side, it will expose many, many people to basic programming and fill many tiny gaps not solved by bigger, more serious pieces of code. Especially once people start building infrastructure and tooling around these non-devs, like hosting, deployment, webhook integrations, etc.
Do people actually learn when using these tools though? I mean, I’m sure they can be used to learn, just like TikTok could be used to read John Stuart Mill. But I doubt that’s what it’s going to be used for in real life.
If the barrier to entry is lower then more people will engage with it. Everything in life is about incentives. This is a hugely powerful tool for people working in the information industry, which is most people with office jobs. A sales person who can overcome a simple customer objection without a major time investment with devs is a sales person who makes more $$ and gets more promotions.
Most people in practice won't, they'll stick to what they know, but there's tons of semi-nerds on the edges who are going to flourish in the next decade. Which is great news for the economy.
But that's not good. You don't want Bob to be the gatekeeper for why a process is the way it is.
In my experience working with agents helps eliminate that crap, because you have to bring the agent along as it reads your code (or process or whatever) for it to be effective. Just like human co-workers need to be brought along, so it’s not all on poor Bob.
Totally. Especially when I'm debugging something for colleagues or friends: given a domain and a handful of ways of doing it, if I'm familiar with it I generally already have a sense of why it's failing or falling short. This has nothing to do with the codebase, any given language notwithstanding. It comes from years and decades of working with systems, their idiosyncratic behaviors, and examples which strangely rear their heads in notable ways.
These notable ways may not be commonly known or put into words, but they persist nevertheless.
This post is about a specific, complex system that stretches from operating system to the physical world, as well as some philosophical problems.
What you're describing is a dead simple hobby project that could be completed by a complete novice in less than a week before the advent of LLMs.
It's like saying "I'm absolutely blown away by microwaves, I can have a meal hot and ready in just a few minutes with no effort or understanding. I think all culinary schools should have a curriculum that teaches people how to use microwaves effectively."
Maybe the goal of education should be giving people a foundation that they can build on, not making them an expert in something with a low skill ceiling and diminishing returns.
I mean - going from “doable in a week with the right mindset” to “doable in a day when I’ve struggled before” is uhhh kinda worth being blown away by
Yeah, great, but also completely irrelevant. Is every post on Hacker News related to AI in any way a place to post anecdotes about AI?
I'm actually a huge fan of Bret Victor, and I feel like he's kinda missing the dynamic, adaptable nature of AI that allows non-technical people like me to finally access the system layer of computation for our creative ends.
In other words, in many ways AI (or rather LLMs) is the very thing that Bret Victor has spent his whole career imagining and creating - a computing interface that closes the gap between human imagination and creation. But here he's focusing on the negatives while neglecting, IMHO, the vast potential of AI to let people connect, create, and express themselves. As in truly having a PERSONAL computer.
At Dynamicland, he was attempting to build a system that non-technical people like me can interface with in a way that makes sense to us.
Taking your unnecessarily disparaging microwave analogy: using ChatGPT, I can understand it, reprogram it, and do fun stuff - like, I don't know, set up a basketball hoop that sets the timer based on how many shots I make - despite having limited or no technical background. I can tell ChatGPT my crazy vision, and it will give me a step-by-step approach, with proper resources, and respond in a way that I can grok to build this thing.
THIS is why I'm awestruck.
My anecdote is just my personal reaction to the post. Besides, what’s wrong with people expressing themselves freely here?
> a gorgeous site with custom code injected CSS and Java (while grokking most of it)
From the context, it's not Java, but Javascript.
This was the most surprising/disturbing/enlightening part of the post imo. Surprising: this person literally had no clue! Disturbing: this person literally had no clue? Enlightening: this person literally did not need a clue.
My takeaway as an AI skeptic is AI as human augmentation may really have potential?
Just a vision and goal.
I feel like AI makes learning way more accessible - at least it did for me, evoking a childlike sense of curiosity and joy for learning new things.
I'm also working on a Trading Card Game, where I feed it my drawings and it renders them into a final polished form based on a visual style that I spent some time building in ChatGPT. It's like an amplifier / accelerator.
I feel like, yes, while it can augment us, at the end of the day it depends on our desire to grow and learn. Otherwise you will end up with the same result as everybody else.
Gotcha. Thanks.
I enjoy the fun attitude - I had a family member say something similar - but I always warn: with powerful AI come serious consequences. Are we ready for those consequences? How far do we want AI to reach into our lives and livelihoods?
We managed to survive the nuke and environmental lead (two examples of humanity veering into drastically wrong directions).
We are never ready for seismic changes. But we will have to adapt one way or another, so we might as well find a good use for it and develop awareness, the way a child would around handling knives.
We cannot predict what the consequences will be, but as a species we are pretty good at navigating upheavals and opportunities. There is no guarantee that human ingenuity will always save the day, but evolution has bestowed us with risk-taking and curiosity, so we won't stop.
Enjoy the ride.
No, we are not good at that. We have had dire warnings of extreme catastrophes heading our way for decades now, and instead of fixing what is broken, we collectively decide to race toward our extinction faster.
Modern humans have been around and successful for tens of thousands of years. This might be a genetic dead end in the relatively near term, but I bet we, and at least one branch of our descendants, will live for many more tens or hundreds of thousands of years.
Genuine question, how would you feel about reading the dual to your comment here?
"I'm a computer scientist who's always struggled to learn how to paint." "Last night I managed to create a gorgeous illustration with Stable Diffusion, and best part ... it was FUN!" "Art hasn't felt this fun for a long time."
I do love the approachability they create. Personally as a coder, I don't find the AI experience at all fun, but I get that others do.
Have you tried using AI to further changes to any of these projects down the line?
Makes sense. I know what I built is nowhere near actual software development. Still I was able to quickly learn how things work through GPT.
Since I've literally been working on this project for two days, here's a somewhat related answer to your question: I've been using ChatGPT to build art for the TCG. Initially I was resistant, and upset that AI companies were hoovering up people's work wholesale for training data (which is why I think now is an excellent time to have a serious conversation about UBI, but I digress).
But I finally realized that I could develop my own distinctive 3D visual style by feeding GPT my drawings and having it iterate in interesting directions. It's fun to refine the style by having GPT simulate an actual camera lens and lighting setup.
But yes, I've used AI to make numerous stylistic tweaks to my site, including building out a tagging system that allows me to customize the look of individual pages when I write a post.
Hope I’ll be able to learn how to build an actual complex app one day, or games.
Interesting, CDNs require a lot of infrastructure to be effective, what did you (or the LLM) use to set that up?
Maybe CDN isn’t the right term after all, see I’m not a software engineer!
But, basically I wanted a way to have a custom repository of fonts a la Google Fonts (found their selection kinda boring) that I could pull from.
Ran fonts through transfonter to convert them to .woff2, set up a GitHub repository (which is not designed for people like me), and set up an instance on Netlify, then wrote custom CSS tags for my ghost.org site.
The thing that amazes me is that, aside from my vague whiff of GitHub, I had absolutely no idea how to do this. Zilch. Nada. ChatGPT gave me a clear step-by-step plan, and exposed me to Netlify, how to write CSS injections, and how ghost.org tagging works from the styling side of things. And I'm able to have a back-and-forth dialogue with it, not only to figure out how to do it, but to understand how it works.
Sounds more like a Continuous Integration / Continuous Deployment (CI/CD) pipeline - defined as a set of practices that automate the process of building, testing and deploying software. Or rather, fonts in this case.
A Content Delivery Network (CDN) is a collection of geographically scattered servers that speeds up delivery of web content by being closer to users. Most video/image services use CDNs to efficiently serve up content to users around the world. For example, someone watching Netflix in California will connect to a different server than someone watching the same show in London.
Gotcha. Appreciate you taking the time to teach me!
Yes, it's an ever patient teacher that's willing to chop up any subject matter into bits that are just the correct shape and size for every brain, for as long and as deep as you're willing to go. That's definitely one effective way to use it.
CDNs are usually one of the more expensive things you can build, there's a reason most are run by large public companies.
Those are probably near the top of the list of things you don't want to blindly trust an LLM with building.
Nice! That's what we all want, but without the content reappropriation, surveillance and public discourse meddling.
Think of a version that is even more fun, won't teach your kids wrong stuff, won't need a datacenter full of expensive chips and won't hit the news with sensationalist headlines.
That’s the dream eh?!
It's a tradition, a way of building things.
You're an artist, I'm sure you understand what art that comes from tradition tries to accomplish.
RealTalk has some interesting features; I wish there were a more complete writeup explaining them in detail.
Like, you can write a script that talks to functionality that may or may not exist yet.
Programming by moving pieces of paper around deservedly gets attention, but there's a lot more to it.
I've never experienced Dynamicland in person (only seen videos). However, one concern I have about its demos so far is that they use a projector. So you need a room dark enough for the projected light, and you need to keep your head, hands, and body out of the way of it.
Projectors have been strong enough to be visible in decently lit rooms for ages. The reason you want the room extremely dark for most projector setups is contrast, because the darkest thing you can make on a projected image is the ambient surface illumination (and the brightest is that surface under full power from your projector [0]). If you accept that compromise you don't need a super dark room; the recommendation for tight light control is mostly for media viewing, where you want reasonable black levels.
You do still need to keep hands out of the light to see everything, but that can be part of the interaction too. If we ever get ubiquitous AR glasses or holograms I'm sure Bret will integrate them into DL.
[0] Which leads to a bit of a catch-22: you want a surface that looks dark but perfectly reflects all the colors of your projector, so you need a white screen, which means you ideally want zero light other than the projector so the projection acts the most like a screen.
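To make the contrast point concrete, here's a rough back-of-envelope sketch (the luminance numbers are made up purely for illustration, not measured from any real setup): ambient light mostly costs you black level, not visibility.

    # Rough sketch with made-up numbers: why ambient light hurts contrast
    # more than visibility. Values are arbitrary luminance units.
    projector_on_surface = 300.0  # surface lit at full projector power
    ambient_dark_room = 1.0       # stray light on the surface in a dark room
    ambient_bright_room = 50.0    # the same surface under daylight

    def contrast_ratio(projector, ambient):
        # Brightest achievable level vs. darkest achievable level: "black"
        # is just the unlit surface, which still reflects the ambient light.
        return (projector + ambient) / ambient

    print(contrast_ratio(projector_on_surface, ambient_dark_room))    # 301:1
    print(contrast_ratio(projector_on_surface, ambient_bright_room))  # 7:1

In the bright room the projected highlights are still several times brighter than the unlit "black" areas, so the image stays perfectly readable; what you lose is the deep black you'd want for watching a movie.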
>you need to keep your heads, hands, and body out of the way of it.
I've seen systems like this that use multiple projectors from different angles, calibrated for the space and the angle. They're very effective at preventing occlusion, and it takes fewer than you'd think (also see Valve's Lighthouse tech for motion tracking).
Unfortunately, doing that is expensive, big, and requires recalibrating whenever it's moved.
The light level isn't an issue in practice: when I visited the actual installation during the day, the building was brightly lit with natural light and the projections were easily visible, to the point that I didn't think about it at the time.
This is true, but modern laser projectors are very, very bright. I use one as my main computer display with no problems with the blinds open, and the sun shining in.
Occlusion is definitely a problem.
There is never anything more satisfying than seeing maestro Victor or maestro Kay being upvoted on Hacker News.
> we aim for a computing system that is fully visible and understandable top-to-bottom
I mean, even for something that is in theory fully understandable, like the Linux kernel, it is not feasible to actually read the source before using it.
To me this really makes no sense. Even in traditional programming we only have such powerful systems because we use a layered approach. You can look into these layers and understand them, but understanding all of them is totally out of scope for a single human being.
That's because you're conflating "understanding" with "comprehension". You can understand every component in a chain and its function, how it works, where its fragilities lie or its capabilities are absent, without reviewing the source code for everything you install. To comprehend, however, you must be intimately familiar with the underlying source code, how it compiles, how it speaks to the hardware, etc.
I believe this is the crux of what the author is getting at: LLMs are, by their very nature, a black box that cannot ever be understood. You will never understand how an LLM reached its output, because their innate design prohibits that possibility from ever manifesting. These are token prediction machines whose underlying logic would take mathematicians decades to reverse engineer for even a single query, by design.
I believe that’s what the author was getting at. As we can never understand LLMs in how they reached their output, we cannot rely on them as trustworthy agents of compute or knowledge. Just like we would not trust a human who gives a correct answer much of the time but can never explain how they knew that answer or how they reached that conclusion, so should we not trust LLMs in that same capacity.
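For what it's worth, here's a deliberately tiny toy sketch (my own illustration, nothing to do with any real model's code) of what "token prediction machine" means at the loop level. The generation loop itself is completely transparent; all of the behaviour comes from a table of learned probabilities, which in a real LLM is billions of weights rather than a hand-written five-entry dict, and that table is where the "can't explain how it got there" problem lives.

    # Toy sketch (not any real model): autoregressive token prediction
    # reduced to its skeleton. The loop is trivially readable; the
    # "reasoning" is entirely inside the probability table, which for a
    # real LLM is billions of learned weights, not a hand-written dict.
    import random

    next_word_probs = {
        "the": {"cat": 0.6, "dog": 0.4},
        "cat": {"sat": 0.7, "ran": 0.3},
        "dog": {"ran": 0.8, "sat": 0.2},
        "sat": {"down": 1.0},
        "ran": {"away": 1.0},
    }

    def generate(word, steps=3):
        out = [word]
        for _ in range(steps):
            choices = next_word_probs.get(word)
            if not choices:
                break
            # Pick the next word in proportion to its learned probability.
            word = random.choices(list(choices), weights=list(choices.values()))[0]
            out.append(word)
        return " ".join(out)

    print(generate("the"))  # e.g. "the cat sat down"

Ask the table why "cat" follows "the" and the only answer is "0.6 was the bigger number"; scale that up by a few billion parameters and you have the interpretability problem being described above.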
I would go one step further: if you look at what Bret Victor has been researching for the past 20ish years it has been all about explaining difficult systems in a simple way. I'm pretty sure he genuinely wants every part to be explained, from the bottom up. You'll also get that sense if you read his library short fiction[1].
Now, whether that's possible is up for debate, but we could definitely use a push in that direction, a system that we can genuinely understand top to bottom, so that we can modify it to what we need. That's where agency comes into his argument.
[1] https://dynamicland.org/2019/The_Library.pdf
I get that LLMs are a black box in ways that most other technologies aren't. It still feels to me like they have to be okay with abstracting out some of the details of how things work.
Unless they have a lot of knowledge in electrical engineering/optics, the average user of this isn't going to understand how the camera or projector work except at a very high level.
I feel like the problem with LLMs here is more that they are not very predictable in their output and can fail in unexpected ways that are hard to resolve. You can rely on the camera to output some bits corresponding to whatever you're pointing it at even if you don't know anything about its internals.
Building projection optics is a bench top experiment that we did in 7th grade in school. Electric circuitry isn't exactly rocket science, either. Things like LCD panels for projecting arbitrary images and CCD chips for cameras become harder to understand.
But the point is to make users understand the system enough to instill the confidence to change things and explore further. This is important because the true power of computer systems comes from their flexibility and malleability.
You can never build that level of confidence with LLMs.
I get what you are talking about. My gripe with that is: yeah, it would indeed be great if we could at some point understand the structure at such a deep level that we could write down, pen on paper on one page, a string of maths symbols that is a good enough description. However, it's possible that for many things that's simply not possible. I suspect it may not be possible in e.g. biology. Possibly the great success of physics in the 20th century over-indulged us, so our expectations are out of kilter with the realities of our world.
Fwiw, I personally describe them as white boxes, not black boxes. We know, and can trace, every single bit of the output back to the input. That does not help us as much as we'd like, though. When drilling down into "why did the model wrongly answer 1, and not rightly 2", it comes down to "well, it added one trillion small numbers, and the sum came close to 1, but didn't reach 2". Which is unsatisfactory, and your "understanding" vs. "comprehension" distinction captures that nicely.
Maybe it's more productive to think of them as "artefacts" rather than "mechanical contraptions". We shape them in many ways, but we are not in complete control of their making. We don't make them explicitly with our hands: we make a maker algorithm, and that algorithm then makes them. Or even think of them as "biological", grown artefacts, given that we don't fully control the end result. Yes, we know and apply the algorithm that builds them, but we don't know the end result, the final set of weights, beforehand. Unlike, say, when we are making a coffee machine - we know all the parts to a millimetre in advance, all worked out and pre-planned, before embarking on the making of the machine.
1. Tell me you trust only humans who always explain to you in detail how they came to their beliefs. You are probably very lonely.
2. There is a lot of ongoing work on mechanistic interpretability, by e.g. Anthropic, that shows we can understand LLMs better than we initially thought.
You're right, and when the behaviour of large codebases violates expectations, the intertwined webs of code are a problem. Alan Kay's Viewpoints Research studied this, and Kay famously proposed "t-shirt sized algorithms", where short rules can be used to build even fancy desktop word processing and presentation software. There are also projects like From NAND to Tetris that show understanding a full stack is achievable. Could this go broader, and deeper? Of course, and this is what Bret Victor is getting at. Not just for when code goes wrong, but to make it more modifiable and creative in the first place (see Maggie Appleton's essay/talk "Home-Cooked Software and Barefoot Developers"). Projects like SerenityOS show how powerful "small" software can be. For instance, its spin-off project, the Ladybird browser, has vastly fewer lines of code than Chromium, and yet the Ladybird team seems able to implement one specification after another. Last I saw, they were close to meeting the minimum feature set Apple requires for shipping as a browser on iOS.
There's a fundamental difference between systems that are theoretically comprehensible but practically large (Linux) versus systems whose internal reasoning is inherently opaque by design (modern neural networks).
reminds me a bit of http://tunes.org (and I'm sure there's many more).. so cool to see deep exploration of computing/operating system ideas.
Thanks for sharing this; it seems like a very idealistic project. I had not heard of Bret nor Dynamicland before.
Highly recommend one of the most memorable talks I've ever seen...
Inventing on Principle: https://www.youtube.com/watch?v=PUv66718DII
"Stop drawing dead fish" by Bret has stuck with me for a decade – https://www.youtube.com/watch?v=ZfytHvgHybA
Very much worth the watch if you haven't seen this one before.