I can see "no progress in 50 years" in fundamental physics, where the experimental frontier seems to be running away from us (though recent gamma-ray astronomy results suggest a next-generation accelerator really could see the dark matter particle).
In biology or chemistry it's absurd to say that -- look at metal-organic frameworks, all kinds of new synthetic chemistry, ionic liquids, metagenomics, RNA structure prediction, and the unraveling of how gene regulation works in the "dark genome".
Progress in the 'symbolic AI' field that includes proof assistants is a really interesting story. When I was a kid I saw an ad for Feigenbaum's 3-volume "Handbook of AI" and got a used copy years later -- you would have thought production rules (e.g. "expert systems" or "business rules") were on track to be a dominant paradigm, but my understanding is that people were losing interest even before RETE engines became mainstream. Even the expert system shells of the early 1980s didn't use the kind of indexing structures that are mainstream today, so whereas people were saying 10,000-rule rule bases were unruly in the 1980s, 10,000,000 well-structured rules are no problem now. Some of it is hardware, but a lot of it is improvements in software.
SAT/SMT solvers (which power parts of proof assistants) have shown steady progress in the last 50 years, though not as much as neural networks because they are less parallelizable. There is dramatically more industrial use of provers, though business rules engines, complex event processing, and related technologies are still marginal in the industry for reasons I don't completely understand.
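For anyone who hasn't touched these tools, here is roughly what a modern SMT query looks like through Z3's Python bindings -- a minimal sketch (assuming the z3-solver package is installed), not production code:

    # A toy SMT query via Z3's Python bindings: find positive integers
    # x, y satisfying some linear and divisibility constraints.
    from z3 import Int, Solver, sat

    x, y = Int("x"), Int("y")
    s = Solver()
    s.add(x > 0, y > 0)        # both strictly positive
    s.add(x + 2 * y == 20)     # a linear constraint
    s.add(x % 3 == 0)          # a divisibility constraint

    if s.check() == sat:
        m = s.model()
        print("satisfiable:", m[x], m[y])
    else:
        print("unsatisfiable")

This kind of automated case analysis is the same engine-room work that discharges the routine goals inside a proof assistant.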
>in biology or chemistry..
>But it’s fair to assume that such fields have not been idle either.
"Manngell amnesia", where if you hear of breakthroughs in any field other than AI, you assume that very field has always been stagnant?
There's another angle to this. E.g., MOF synthesis is a breakthrough unappreciated outside of chemistry because of how embarrassingly easy it is. Laymen (& VCs) expect breakthroughs to require complexity, billions, wasted careers, risk, unending slog, etc.
Read the bios of the chemistry Nobelists to see what stress-free lives they led (around the time of the discovery), even compared to VCs and proof assistant researchers. Disclaimer: possibly not applicable to physics/physiology laureates after 1970 :)
https://www.amazon.com/Dancing-Naked-Mind-Field-Mullis/dp/07...
Mullis succeeded in demonstrating PCR on December 16, 1983, but the staff remained circumspect as he continued to produce ambiguous results amid alleged methodological problems, including a perceived lack of "appropriate controls and repetition."
(From wiki)
There was one day the bus was late, so I drove in with a grad student who did density functional theory calculations of MOFs and asked him "How do you make a MOF?" He said "Beats me, I'm a theorist", so I figured I wanted a quick answer to that one myself, and it turned out to be "mix up the ingredients and bake them in the oven".
It looks like the "theorist" might be replaced sooner, given the narratives being driven now (after the entry-level coder).
> business rules engines, complex event processing, and related technologies are still marginal in the industry for reasons I don't completely understand
Translating between the complex, implicit intentions of colloquial language and the formal languages used in software and proof assistants is usually very time-consuming and difficult.
By the time you've formalized the rules, the context in which the rules made sense will have changed and a lot of them will be outdated. Plus, time and money spent on formalizing rules is time and money not spent on core business needs.
That's definitely true, but I do think production rules have some uses that are less obvious.
For instance, XSLT is not "an overcomplicated Jinja 2"; rather, it is based on production rules. Hardly anybody seems to know that -- they just think it's a Jinja 2 that doesn't do what they want.
Production rules are remarkably effective at dealing with deep asynchrony, say a process that involves some steps done by machines and some steps done by humans, like a loan application being processed by a bank that has to be looked at by a loan officer. They could be an answer to the async communication problems in the web browser. See also complex event processing.
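To make that concrete, a production system is just a working memory of facts plus rules that fire when their conditions match. A toy forward-chaining loop (nothing like a real RETE engine, and all the fact names are made up) looks like:

    # Toy forward-chaining production rules for an asynchronous loan flow.
    # Working memory is a set of facts; a rule fires when its conditions
    # are all present and its conclusion has not yet been derived.
    rules = [
        ({"application_received"}, "credit_check_requested"),
        ({"credit_check_requested", "credit_report_arrived"}, "ready_for_officer"),
        ({"ready_for_officer", "officer_approved"}, "loan_issued"),
    ]

    def run(facts):
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in rules:
                if conditions <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts

    # Events can arrive in any order, hours or days apart; each time one
    # does, we just add the fact and re-run the rules.
    memory = run({"application_received"})
    memory.add("credit_report_arrived")   # async event from the bureau
    memory.add("officer_approved")        # async event from a human
    print(run(memory))                    # now contains 'loan_issued'

The nice property is that the rules don't care what order the events arrive in or how long the gaps are; whatever can fire, fires.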
Production rules could also be a more disciplined way to handle the problems that stored procedures address in databases.
I've written systems where production rules are used in the control plane to set up and tear down data pipelines with multiple phases in a way that can exploit the opportunistic parallelism that can be found in sprawling commercial batch jobs. (The Jena folks told me what I was doing wasn't supported but I'd spent a lot of time with the source code and there was no problem.)
> When I was a kid I saw an ad for Feigenbaum's 3-volume "Handbook of AI" and got a used copy years later
There was a Volume IV added as well at some point[1]. I've had this entire set sitting on my shelf for ages now, intending to read the entire thing "one of these days" but somehow "one day" keeps not showing up. Still, if I live long enough, I still want to read it all eventually.
Hell maybe I'll pull Volume 1 off the shelf later tonight and read a few pages, just to put a stake in the ground and say I started it at least. :-)
[1]: https://www.amazon.com/Handbook-Artificial-Intelligence-IV/d...
I picked these up at a used bookstore ages ago, since they had the three-volume set. My recommendation would be to familiarize yourself with just the table of contents that’s printed on the binding, and when you come across something adjacent in your day-to-day work (e.g. Search), review the papers in that section. Those books are an excellent snapshot of the field at the time.
> In 1994, came the Pentium with its FDIV bug: a probably insignificant but detectable error in floating-point division. The subsequent product recall cost Intel nearly half a billion dollars. John Harrison, a student of Mike’s, decided to devote his PhD research to the verification of floating-point arithmetic.
No mention of the effort by Boyer and Moore, then at their Computational Logic, Inc., to do a formal verification of the AMD FPU for the AMD5K86. The AMD chip shipped with no FDIV bug. [1]
[1] https://dl.acm.org/doi/abs/10.1109/12.713311
ACL2 doesn't get a lot of love from the side of the verification community that focuses on the proof systems that are more academically popular (HOL family, CIC family, etc.). A lot of interesting industrial work has been done with ACL2 and related systems.
Yes. Been there, done that, with the pre-ACL2 Boyer-Moore prover. We had the Oppen-Nelson prover (the first SAT solver) handling the easy stuff, and used the Boyer-Moore prover for the hard stuff. Not that much manual work.
I assume you mean first SMT solver when you refer to Oppen-Nelson? I thought their contribution was the basis for SMT methods.
I'd like to call out the work from Nada Amin in this area:
Dafny and verification-aware programming, including proof by induction to verify properties of programs (for example, that an optimizer preserves semantics). Dafny Sketcher (https://github.com/namin/dafny-sketcher)
Multi-stage programming, a principled approach to writing programs that write programs, and its incarnation in multi-stage relational programming for faster synthesis of programs with holes. The theoretical insight is that a staged interpreter is a compiler, and a staged relational interpreter for a functional language can turn functions into relations that run backwards for synthesis (a toy sketch of the "run backwards" idea follows this list). multi-stage miniKanren (https://github.com/namin/staged-miniKanren)
Monte Carlo Tree Search, specifically the VerMCTS variant, and the question of when its exploration-exploitation sweet spot is a good match for synthesis problems. VerMCTS (https://github.com/namin/llm-verified-with-monte-carlo-tree-...), and Holey (https://github.com/namin/holey).
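The "relations running backwards" point is easier to feel than to state. Here is a deliberately naive Python sketch of the idea -- not miniKanren, just brute-force enumeration over a function's inputs for finite lists:

    # Running append "backwards": instead of computing append(xs, ys),
    # ask which (xs, ys) pairs would produce a given output. A relational
    # interpreter gets this for free; here we just enumerate the splits.
    def append(xs, ys):
        return xs + ys

    def append_backwards(out):
        """Yield every (xs, ys) such that append(xs, ys) == out."""
        for i in range(len(out) + 1):
            xs, ys = out[:i], out[i:]
            if append(xs, ys) == out:
                yield xs, ys

    # Prints the four ways [1, 2, 3] can be split into two lists.
    for xs, ys in append_backwards([1, 2, 3]):
        print(xs, ys)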
Very interesting; thanks for the pointer.
Nada Amin website - https://namin.seas.harvard.edu/
Very nice post, thanks -- I didn't know about CakeML and bootstrapping formally verified compilers.
I recently started reading "Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers" by Leslie Lamport[0]. It is a good starting point for learning how to specify systems in terms of mathematics.
Since specifications are the new code in the age of AI, learning how to specify systems mathematically is a huge advantage, because English is extremely ambiguous.
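To make the "English is ambiguous" point concrete: the core move in TLA+ is to write states and transitions precisely, and the TLC checker then explores them exhaustively. A throwaway Python analogue of that exhaustive exploration (a toy, not TLA+ itself) looks like this:

    # Toy explicit-state exploration in the spirit of TLC: enumerate all
    # reachable states of two processes that enter the critical section
    # without a lock, and check a mutual-exclusion invariant.
    from collections import deque

    def next_states(state):
        # state = (pc0, pc1); each pc is "idle", "waiting", or "critical"
        for i in (0, 1):
            pc = list(state)
            if pc[i] == "idle":
                pc[i] = "waiting"
            elif pc[i] == "waiting":
                pc[i] = "critical"   # no lock here: the bug we want caught
            else:
                pc[i] = "idle"
            yield tuple(pc)

    def invariant(state):
        return state != ("critical", "critical")

    seen, queue = set(), deque([("idle", "idle")])
    while queue:
        s = queue.popleft()
        if s in seen:
            continue
        seen.add(s)
        if not invariant(s):
            print("invariant violated in state:", s)
            break
        queue.extend(next_states(s))

The English version of that requirement ("only one process in the critical section at a time") hides exactly the ambiguity the invariant makes explicit.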
[0]: https://lamport.azurewebsites.net/tla/book-02-08-08.pdf
I've had the same intuition. I've had mixed results in this area, although I'm certainly no expert. Recently I wanted to formalize a model of a runbook for a tricky system migration, to help me reason through some alternatives. I ended up writing a TLA+ spec before generating some visualizations, and also some possible prototypes in MiniZinc. All three (spec, visualizations, CP models) were vibe-coded in different sessions, in about that order, though most of my personal effort went into the spec.
While the later AIs quickly understood many aspects of the spec, they struggled with certain constraints whose intuitive meaning was concealed behind too much math. Matters which I had assumed were completely settled, because a precise constraint existed in the spec, had to be re-explained to the AI after implementation errors were found. Eventually, I added more spec comments to explain the motivation for some of the constraints, which helped somewhat. (While it's an untested idea, my next step was going to be to capture traces of the TLA+ spec being tested against some toy models, and including those traces as inputs when producing the implementations, e.g. to construct unit tests. Reasoning about traces seemed to be a fairly strong suit for the AI helper.)
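(Concretely, the trace idea is nothing fancy: dump each spec trace as a list of states and replay it against the implementation as an ordinary test, roughly like the sketch below. The state fields and the step/snapshot methods are hypothetical placeholders, not my actual migration code.)

    # Hypothetical sketch: replay a spec trace against an implementation.
    # `trace` would be exported from a TLC run; `impl.step()` and
    # `impl.snapshot()` stand in for whatever the real system exposes.
    trace = [
        {"phase": "prepare", "moved": 0},
        {"phase": "copy",    "moved": 10},
        {"phase": "verify",  "moved": 10},
        {"phase": "done",    "moved": 10},
    ]

    def test_follows_trace(impl):
        assert impl.snapshot() == trace[0], "wrong initial state"
        for expected in trace[1:]:
            impl.step()
            assert impl.snapshot() == expected, f"diverged at {expected}"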
In hindsight, I feel I set my sights a little too high. A human reader would have had similar comprehension problems with my spec, and they probably would have taken longer to prime themselves than the AI did. Perhaps my takeaway is that TLA+ is a great way to model certain systems mathematically, because precision in meaning is a great quality; but you still have to show sympathy to your reader.
I think when Peter Thiel talks about stagnation it's as much about the vibe of things as it is about quantifiable metrics of progress. I happen to agree that the vibe of progress has focused almost exclusively on AI and computer technology, even though this isn't the case for the metrics (e.g., CRISPR).

I think the stagnation vibe has come from a series of issues: decades of American political gridlock, Europe's inability to commit to anything, the rise of the attention economy, the stagnation of Japan, and the lack of a competitive focus for American markets (with no Soviet Union to defeat, they seem to have turned to defeating billionaire poverty). The list continues, but it's the confluence of all these things that gives the vibe of stagnation. It seems like there isn't enough time to focus on anything anymore to actually get excited and drive it forward, even though this isn't the case.

And there also isn't the political space to celebrate things. It seems like any SpaceX accomplishment has to be followed with the caveat that you don't like Elon. Another way I think about it is that we live in a post-cynicism world where we all have to couch every statement within some framework acknowledging harm to someone somehow. It's hard to have a vibe of progress when every statement of progress includes an impact statement saying why that progress is harmful.
Lean 4 seems to be very popular with the math-olympiad-solving AI startups (Harmonic etc.).
…and is now being taught in combined "Formal Real Analysis"[1] courses to undergrads, and the Lean prover community has a joint project to formalize the proof of Fermat's Last Theorem, which is a lot of work but is progressing. It's sort of weird to say there is no progress. It seems to me that when you have a Fields Medal winner publishing Lean 4 formal proofs on GitHub[2] to go with one of his books, you are making a lot of progress.
[1] eg https://youtube.com/playlist?list=PLs6rMe3K87LHu03WWh9rEbEhh...
[2] https://github.com/teorth/analysis
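For anyone who hasn't looked at Lean 4, the flavor is roughly this -- a toy theorem proved by induction, nothing like the scale of the analysis or FLT projects:

    -- A throwaway Lean 4 proof by induction, just to show the syntax.
    theorem zero_add' (n : Nat) : 0 + n = n := by
      induction n with
      | zero => rfl
      | succ k ih => rw [Nat.add_succ, ih]

The appeal for the olympiad and FLT crowds is that the checker accepts or rejects the whole thing mechanically; there is no "the referee probably meant..." step.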
I wish he had just said 50 years of LCF, since he even mentions Automath in the article, but that was late '60s.
Yes, 50 years of LCF would have been much better. You should not talk about "50 years of proof assistants" and not mention Mizar which had the largest library of theorems for about half of that time.
And now, Matthew Scherf has published "A Formal Specification of Advaita Vedānta in Classical High-Order Logic," and verified it in both Isabelle and Lean4. https://github.com/matthew-scherf/Advaita
This is funny as well as amazing, haha. It makes me wonder whether mathematics can prove the real truth or not, since it does not take spirituality, consciousness, and metaphysics (things we don't yet fully understand) into account.