How a 20-Person Startup Won Gold at the Math Olympiad—Tying With OpenAI & DeepMind (Tudor Achim, CEO of Harmonic)

Nicholas

@nicholas

Tudor Achim is the co-founder and CEO of Harmonic, a startup working to solve one of AI’s hardest problems: mathematical reasoning. In July 2024, Harmonic achieved gold-medal-level performance on International Math Olympiad problems alongside systems from OpenAI and Google DeepMind—but with a key difference: every proof Harmonic submitted was formally verified. Tudor's path to Harmonic wound through competitive piano, computational biology, and autonomous driving. He studied at Carnegie Mellon's music preparatory school, worked on machine learning at Quora, briefly pursued a PhD before dropping out, and then co-founded an autonomous driving company, Helm.ai.

Uploaded: Uploaded May 26, 2026
File type: POD
Queried: Queried 0 times

Full transcript

Showing the full transcript for this episode.

Speaker A: Aristotle is the world's first mathematical agent. You can delegate a mathematical reasoning task to Aristotle via the API, and it'll think for a long time and it'll give you an answer. And the really cool thing about it is that it's always correct. You can ask a math LLM a question and it'll give you an answer that looks very, very plausible. The dream of software engineering since day one was to produce programs that had proofs of correctness. And the reason why we don't have that is that for every line of code that you write, you might have to write 20 lines of code to verify it.

But you can imagine if you have an AI system that can specify functionality, formalize it, write proofs of correctness. All of a sudden, I would actually question why would you write any software that's not verified. By 2030, we will have 5 theories unifying quantum mechanics and general relativity, and we won't know which is correct because they all are equally plausible and correct. But we'll have to do higher and higher energy physics experiments to determine which one's right. So just because an AI can create like a logically consistent explanation of something no human has been able to create before doesn't mean that it fully solves every single, like, open problem in the world.

Speaker C: Last July, a 20-person startup won gold at the International Math Olympiad, tying with OpenAI and Google DeepMind in the process. Speaker B: It solved 5 of 6 problems, and unlike its much larger rivals, every proof was formally verified checked step by step. Speaker C: The company is Harmonic, and its CEO is Tudor Akeem, a PhD dropout and serial founder who believes that formal mathematical proof is the necessary infrastructure layer to create truly trustworthy AI. Today, Tudor and I discuss how he started a company with Robinhood CEO Vlad Tenev, why hallucinations may be the key to machine originality, and his extraordinary claim that by 2030, AI will have generated theoretical explanations for essentially everything.

I'm Mario, and this is The Generalist. Speaker B: It solved 5 of 6 problems, and unlike its much larger rivals, every proof was formally verified checked step by step. Speaker C: The company is Harmonic, and its CEO is Tudor Akeem, a PhD dropout and serial founder who believes that formal mathematical proof is the necessary infrastructure layer to create truly trustworthy AI. Today, Tudor and I discuss how he started a company with Robinhood CEO Vlad Tenev, why hallucinations may be the key to machine originality, and his extraordinary claim that by 2030, AI will have generated theoretical explanations for essentially everything.

I'm Mario, and this is The Generalist. Speaker B: This episode is brought to you by Brex. If you're a founder, the hardest part isn't the idea. It's scaling fast without getting buried in back office work. That's where Brex comes in. Brex is the intelligent finance platform for founders. With Brex, you get high limit corporate cards, easy banking, and high yield treasury Plus a team of AI agents that handle manual finance tasks for you. Speaker C: They take care of things like expenses, all according to your rules, so you can move faster while staying in full control.

Speaker B: 1 in 3 startups in the US already runs on Brex. You can too at com/mario. Every revolution in AI creates one question that never changes. Can you trust the output? AI for work is incredible, but without trust, it's just leading to faster mistakes. The challenge isn't building an AI that can answer questions, it's making sure those answers are right. That's where Guru comes in. It's the AI source of truth that connects everything your company knows. So every insight, every answer, every recommendation is grounded in verified knowledge, not outdated information or hallucinations.

When your teams and your AIs share one trusted foundation, everything moves faster with fewer redos, fewer blind spots, and more confidence in every decision. Because in the age of AI, truth isn't just power, it's protection. See what Guru is doing for thousands of companies like Spotify, DHL, and Stripe at com. That's com. Speaker B: 1 in 3 startups in the US already runs on Brex. You can too at com/mario. Every revolution in AI creates one question that never changes. Can you trust the output? AI for work is incredible, but without trust, it's just leading to faster mistakes.

The challenge isn't building an AI that can answer questions, it's making sure those answers are right. That's where Guru comes in. It's the AI source of truth that connects everything your company knows. So every insight, every answer, every recommendation is grounded in verified knowledge, not outdated information or hallucinations. When your teams and your AIs share one trusted foundation, everything moves faster with fewer redos, fewer blind spots, and more confidence in every decision. Because in the age of AI, truth isn't just power, it's protection. See what Guru is doing for thousands of companies like Spotify, DHL, and Stripe at

com. That's com. Speaker C: In, in researching your background, uh, It seems like you were a quite extraordinary young pianist competing at a very high level, uh, and being invited to Carnegie Hall to perform. How did that musician become a computer scientist? Speaker A: Piano was a surprisingly big part of my life growing up, and you're the first person that's asked, so I, uh, I have to think a little bit here, but, um, what I had was an amazing piano teacher. Her name was Mineko Avery. And she had been a lawyer before working for children's rights, and then she became a piano teacher and she had a very interesting story.

She was actually interned as a child in the internment camps for the Japanese in California. So she was exposed to a lot of, you know, pain and suffering. And I think that set the stage for her life and what she wanted to do. And later in life she became a piano teacher 'cause I think she had played piano her entire life. I was lucky enough to get to study with her at the Carnegie Mellon Music Preparatory School. And Mrs. Avery, as I call her, I think she's one of the main people from which I learned kind of discipline, you know, how to perform, how to stick to something for a while.

So while I appreciate you calling me an extraordinary, you know, pianist, uh, I have to gently challenge that a little bit. I was, I was pretty good. Um, I went to some competitions and One competition I went to was the so-called World Piano Competition in Cincinnati. Now, I think calling it a World Piano Competition is a bit of a stretch, but there were people from around the world there, and I did well enough at the rounds there to get invited to play Carnegie Hall with some of the other winners. And the funny thing is that my parents didn't actually understand what was in the letter when it came in the mail.

And so they kind of like left it for like a month or so. And then when we finally opened it, everything had passed, like the concert had happened and, uh, You know, unfortunately I didn't actually play Carnegie Hall, but you know, notwithstanding, um, piano was a really important part of my life. Uh, it taught me a lot of life lessons and I was very lucky to have a teacher like Mrs. Avery. Really a lot of the teachers at the CMU Preparatory School were, were just incredible, incredible teachers. Speaker A: Piano was a surprisingly big part of my life growing up, and you're the first person that's asked, so I, uh, I have to think a little bit here, but, um, what I had was an amazing piano teacher.

Her name was Mineko Avery. And she had been a lawyer before working for children's rights, and then she became a piano teacher and she had a very interesting story. She was actually interned as a child in the internment camps for the Japanese in California. So she was exposed to a lot of, you know, pain and suffering. And I think that set the stage for her life and what she wanted to do. And later in life she became a piano teacher 'cause I think she had played piano her entire life. I was lucky enough to get to study with her at the Carnegie Mellon Music Preparatory School.

And Mrs. Avery, as I call her, I think she's one of the main people from which I learned kind of discipline, you know, how to perform, how to stick to something for a while. So while I appreciate you calling me an extraordinary, you know, pianist, uh, I have to gently challenge that a little bit. I was, I was pretty good. Um, I went to some competitions and One competition I went to was the so-called World Piano Competition in Cincinnati. Now, I think calling it a World Piano Competition is a bit of a stretch, but there were people from around the world there, and I did well enough at the rounds there to get invited to play Carnegie Hall with some of the other winners.

And the funny thing is that my parents didn't actually understand what was in the letter when it came in the mail. And so they kind of like left it for like a month or so. And then when we finally opened it, everything had passed, like the concert had happened and, uh, You know, unfortunately I didn't actually play Carnegie Hall, but you know, notwithstanding, um, piano was a really important part of my life. Uh, it taught me a lot of life lessons and I was very lucky to have a teacher like Mrs.

Avery. Really a lot of the teachers at the CMU Preparatory School were, were just incredible, incredible teachers. Speaker C: I will accept your, your framing of not being exceptional. Uh, from the outside, it certainly seems exceptional to, you know, finish, I think you, you know, finished 4th maybe in this, uh, world's competition. Speaker A: Yeah. Yeah. But like I said, it's not quite everyone around the world competing, but like it was, it was, I was pretty good at that. You know, I, I think, uh, it was, it was a great experience for me as a kid.

Speaker C: Were your parents very musical? Was that sort of something that was encouraged? Speaker A: Well, my dad can sing very nicely, which is ironic because I'm possibly the worst singer you can imagine. I think my mom really appreciated classical music, but you know, they grew up in communist Romania, so there weren't too many opportunities to, you know, take a lot of extracurricular lessons in music and stuff like that. So I think I was the first generation that really had a chance to study this stuff. Speaker C: You know, I ask about the piano because it seems like there's, I don't know, some, some really deep connections between music and math and sort of the need for, for formal correctness, right?

There's, you know, different ways to play a song, but there is sort of only a few correct ways, let's say. Um, I wonder how much you reflect on, on those links, or does it feel like you activate similar parts of your brain when you do what you're you're doing with harmonic or some of the more mathematical work when you're, you know, when you used to play the piano? Speaker A: I think, uh, one thing maybe people don't know about is that you can explain a lot of what people enjoy about music with things like harmonics and analyzing scales and keys.

And those are essentially studies in harmonic analysis. So it's understanding, I mean, it's called harmonic analysis in math, but it's really understanding the harmonics and functions. And in music, it's the various frequencies and how they relate to each other as you play different notes and chords. So things that sound really pleasing have certain mathematical patterns. And I think a lot of people don't know that that's explainable by math. Now, with that said, uh, I think what makes the best classical music performances beautiful is the unique interpretations artists give to the music.

So I think there's something very non-mathematical and non-AI and human in those performances, but when it comes to analyzing more basic songs, there's kind of mathematical underpinnings to what sounds nice. I think that for me, the connection between music and AI, it's, I try to keep them separate. I think AI is very much an engineering field. It's about building effective systems, scaling them up, hitting metrics. I think for me, music is more a source of relaxation, you know, play piano to relax and really focus on something. I do think it was a little different.

I think in the future, there's an open question about how much art will be created by AIs. I don't know where we'll land on that. Uh, and I think early signals are people don't really love AI-generated art, but that might change over time. Speaker C: I can't help but, but pick up on that thread because it's an interesting one. Uh, there was, I think, I, I hope I'm paraphrasing his argument well, but I think it in the New Yorker, Ted Chiang sort of talked about how AI is incapable of making art because there's not the same sense of intent.

I wonder what you, what you make of that. Will we get to a stage where there is such a thing as, as true AI intent, sort of authorial intent in the same way? Speaker A: That's a tricky question. Personally, I have no interest in AI art. I don't think I ever will, uh, no matter how the intent, uh, situation changes. You know, I, I like art because it's created by humans. You know, you take someone's life, they distill that experience into a piece of music or a piece of visual art.

I hesitate to characterize what it is about the AI that part that, that doesn't really speak to me, whether it's the lack of intent right now, you know, that might change however we interpret intent. But I think right now AIs are very, very good tools in the same way that, you know, a power screwdriver is a great tool and you can turn it on and it'll do something for you and then you turn it off. I think in the future, if AI changes its nature fundamentally, I might change my mind.

There might be some interpretation of an AI where we would say, look, this thing has a lot of experience and We are interested in it for its own sake and therefore we're interested in the art it produces. But I, I think that's so different from AI now that it's hard for me to speculate on what that might look like or come up with like acceptance criteria for what it means to be an AI artist that I would like or that other people would like. So yeah, I think we're pretty far away from that.

Speaker C: Clearly piano was, was such a big part of your, your upbringing. Uh, but when I sort of look at your career. There are all these disparate interests that, you know, in some sense seem to have a really unified sort of set of, of, of substructures or, you know, some connecting tissues between it. And it makes me wonder, you know, what other things you were interested in as a kid. Were you sort of the kid who was fascinated by AI and science fiction or, you know, uh, hard sciences, physics?

What were your obsessions in those early years? Speaker A: Well, as a kid, I wanted to be a marine biologist. Which had nothing to do with AI. I really like the idea of, you know, being on a ship all the time, cataloging the ocean, that kind of thing, understanding climate change or how ecosystems work, that kind of thing. Over time, I, I think I came to the conclusion a lot of people came to, which is that math is really the fundamental tool to understand the world. So whether doing marine biology or physics or number theory, you're still using the same basic logical principles to reason through things.

You have the notion of facts, that have been discovered before. You have deduction rules, you have ways to check whether your reasoning is correct. And that was a very general toolkit that I concluded could be applied to a lot of fields. So that's how I kind of got into AI. And while it's true that I've done a number of different kinds of jobs over time, I think that all of the things I've worked on have essentially reduced to, you know, making search better or pattern recognition better. So if you think of intelligence, generally is, you know, trying things out and learning from your experience and then refining your strategies and trying again.

You know, in my last company, autonomous driving, we were learning from all the world's video data in order to try to learn how to drive. At Harmonic, we're letting our systems do a lot of math on its own with formal verification feedback to learn how to do math. At Quora, we were learning the patterns of, you know, what kind of content do people like to engage in? You know, do they like to learn about cooking or technology? Right? When, how about the time of day? What about when they're traveling? That kind of thing.

So it's, I've always been fascinated with the idea of learning, having an artificial intelligence system that can learn from experience. But I would say that I, I still think of the AI itself as a more of a tool to, or a means to an end rather than the end itself. Speaker C: You mentioned this sort of transition to, to realizing that everything can almost be distilled into these mathematical rules and, and finding that so powerful. Was, was that, you know, as dramatic as a, a eureka moment or more sort of something that you sort of steadily turned your mind to?

Speaker A: I think as a kid, it's one of those things where it does seem like a eureka moment. And then when you're older, you're like, well, it's kind of trivial and it's vacuous, right? So in some sense, just as an example, reinforcement learning can model essentially any problem. So in that sense, it's a vacuous statement to say that reinforcement learning can solve everything. So when I was a kid, I think realizing that math gave you the toolkit to solve everything, it kind of seemed like a eureka moment, but now it's just kind of like, you know, it's like a toolkit in the same way, like logical reasoning is a toolkit.

And I, I wouldn't call it like a eureka moment if I had that realization at this age rather than as a kid. Speaker C: Yes, that makes sense. Um, you, you mentioned sort of this fascination with search and pattern recognition. Are there other parts that like, you know, those two things don't adequately capture when we talk about intelligence? Speaker A: Well, it depends on how you define intelligence. As an example, so many people would consider most animals unintelligent because they can't talk the way we do, or they can't remember things the way we do, but they have, for example, other senses that we don't have.

And so to some animals, we look very unintelligent. How, what do you mean you can't recognize the smell of this plant? Right? How, how's that possible? You like, you're so dumb, you just can't even perceive that. Like, I think it's tricky to define what intelligence is and, uh, You know, I, I don't want to just choose one dimension, then call it, you know, intelligence. So the better you are in that dimension, the more intelligent you are. But one example of intelligence is mathematical intelligence. So the physicist Eugene Wigner, you know, wrote this famous, uh, essay about the unreasonable effectiveness of mathematics in physics.

He was pointing out that many physics discoveries were actually underlied by mathematical tools that were not even predicted to help with physics when they were invented. And in that sense, I think mathematics is a very interesting form of intelligence because it generalizes. So when you look at mathematics, I actually think that all of mathematics falls in the framework of search and pattern recognition. So one has a mathematical question they're trying to understand, exploring that question is the search process, and because it's an exponentially large search space, you need to distill your learnings from your explorations into whatever pattern recognition system you have.

So then the next time you search to try to answer the question, you benefit from that pattern recognition. So I think within certain forms of intelligence, some which are very generalizable, like math, it is search and pattern recognition that underlies how it, uh, how it really works. Speaker C: You went to university very early and, and graduated at, you know, I think you were 19 or so. Given your, your interests, I think it would have been maybe natural for someone to imagine, you know, you would go straight to do a PhD or, you know, a further education, but you, you went to Quora, as you mentioned, and were sort of leading an ML team there.

Was that when you first became acquainted with, with AI in, in the sense that it existed at that time? Speaker A: Well, I became acquainted with the form of AI that was very popular in Silicon Valley at the time, which was recommender systems, but Before that in college, I'd actually been working on robotics and back in the day before deep neural networks, you have to do this thing called feature engineering. So I spent an entire summer building this thing called histograms of oriented gradients in order to better detect people in webcam streams.

So that actually gave me an experience that I didn't really want to repeat. It was, it was a very, it was very clear to me at the time that that approach would never ever scale to the kind of intelligence that humans have. And so, you know, when I went to Quora, which was very exciting, uh, it was growing super quickly. It was one of the main places on the internet where people had substantive discussions. We started to ask ourselves about a year in, what does it look like if you use machine learning to improve people's experience?

Not only just what you show them in the feed, but even things like a marketplace of who to answer questions. So we had a system called Ask to Answer where, you know, you could allocate a certain amount of credits to ask people to answer certain questions. Of course, they got a positive reaction if they answered the question well. And at CoreEyes, that's where I really learned, you know, what the power of machine learning is at scale, because the kinds of algorithms we had created were simply beyond what you could program by hand.

And that was very different than my experience just a year before in that internship where I was trying to create this, uh, person detection system. So I think that's what kind of turned me on to the power of, you know, larger scale machine learning. Now, of course, I ended up going back to a PhD briefly before dropping out, but, um, I think Quora really, you know, flipped the switch for me. Speaker C: There was a sort of almost a tiny tidbit that I wonder, uh, if it's true, which is that I found somewhere where you mentioned, or maybe someone was mentioning to you that you had done some Bitcoin mining in 2012.

Was that something that you got interested in? Speaker A: I wish I'd done Bitcoin mining in 2012. There were some people at Quora that mined a lot of Bitcoin back then, but I wasn't part of them. I thought it was really silly. I, I, I wasn't very good at predicting where that particular project would go. Fortunately, I also don't have any stories like paying someone 20,000 Bitcoin for a pizza. So yeah, that, that, that's, uh, that's a painful mistake. Speaker C: Certainly. You're, you did your PhD before you dropped out.

You were focused on computational biology. Uh, if, if I understand correctly, why was that the sort of direction that, uh, you were most interested in at the time? Speaker A: What I was interested in broadly was a field called probabilistic graphical modeling, which was a form of so-called Bayesian inference. Bayesian inference is trying to solve the problem of how do you properly incorporate new information into your prior models of belief and then applying that to a lot of different areas. So one area you can apply that to is computational biology.

You can try to infer back before AlphaFold worked the structure of proteins by figuring out which amino acids co-evolved with each other. So you have the same protein across many organisms and you say, well, look, if an amino acid at one part of my sequence doesn't really change across 10,000 species, that means that it's probably an important part of the structure of the protein because if it had changed, the animal would've died out because the protein doesn't work anymore. That was a really interesting early form of AI for computational biology before deep learning work.

Then I started working with my former advisor, Stefano Orman. Um, this is back before he was hyper famous, uh, as the creator of diffusion models. And I continued my work in probabilistic inference by essentially trying to get theoretical guarantees for reasoning processes. One thing we don't have these days with language models is theoretical guarantees of anything. You can train them and they'll do whatever, they can hallucinate anything and you try to post-train them to make them a little more accurate. But back in the day, I was very interested in trying to identify algorithms that actually gave you scalable guarantees for your error bounds on certain types of reasoning, or maybe more efficient algorithms to be able to reason about more things.

And that's what I did for about a year before, uh, leaving. Speaker C: You mentioned this autonomous driving company, Helm. What convinced you to drop out? Were you, realizing you weren't built for a PhD program and needed sort of something faster speed, or was it really something about, you know, that particular problem or that set of people that galvanized it? Speaker A: Yeah, I found a great co-founder named Vlad Borninski, my former co-founder at Helm. I think that I wasn't so much running away from the PhD as it was just, it was a very exciting time for autonomous driving.

You had recent results that showed very accurate detections of objects and road scenes. And look, I mean, I still think self-driving cars is one of the most obviously valuable applications of AI. I don't know if you've tried a Waymo, but I take Waymo all the time and it's just incredible. Right. So I think there's a lot of excitement about it. And we had a very different approach that later became popular based on unsupervised learning. At the time, all work on computer vision was based on datasets that humans would painstakingly label.

Yes. So there were graduate students in my department that were spending hours per week simply putting bounding boxes on cats or cars or people in random images from the internet. And if you looked at the performance curves with those approaches, you could actually conclude obviously that that would never work for self-driving cars. You could never get to the number of nines of reliability you needed. And so we felt that if you mastered a form of learning called unsupervised learning, where instead of learning from human-labeled data, you just learn from vast swaths of unlabeled data on the internet, you might get a neural network that generalizes well enough to be able to perform well in unexpected scenarios.

So that was a pretty big bet early on. You know, we raised some money and we started de-risking it. And I mean, I, this is public now, but I think for the first time in the United States, you'll be able to buy an L3 highway driving system, I think next year in 2027. Honda, it's called the EV Zero. And what L3 means is that on the highway, you'll be able to engage the car and then do whatever you want, be on your phone, maybe sleep or something. And the reason for that is that the car is taking responsibility if there's an accident.

So it has a lot of other sensors, but it's based on HALM Vision technology that lets it do this. And this is in stark contrast to the Tesla system, which is Level 2. So if there's an accident with the Tesla system, the driver is still responsible, even if the autopilot's engaged. So I think this kind of thing, these phase transitions are just, they were, they were really exciting to work towards. I mean, it took a bit longer than we expected, uh, 10 years, but, uh, it's great to see it happen.

Speaker C: What were the, the lessons from, from building that business? Because you really did take it to, yeah, a, a meaningful size valuation, uh, some, you know, really profound technology, as you're mentioning. Uh, I'm sure that was formative in many ways. Speaker A: It was really formative. I mean, I, I learned a lot from my former co-founder. Um, I also just learned how exciting it is to work on fundamental technology. I think there were always opportunities to work on, you know, simpler things like simpler products, but you know, at the end of the day, like, I just really like building core technology and I'm very lucky that we've been able to work on that at Harmonic as well in the math area.

Speaker C: Given that, you know, you, you had done such interesting work at Helm. How did you get convinced to work on a new mission, uh, with Harmonic? Speaker A: Yeah, I think one thing is just that, you know, Helm was doing pretty well. So there were about 100 people when I left. We had an org structure. I mean, almost 100 people at the company. So I could even consider doing something else. But as I mentioned before, when I was a kid, I had that moment, you know, the eureka moment that math is the logical toolkit that lets you understand the world.

If we can contrast math in 2023 or Math AI in 2023 to Math AI in 2016. In 2016, there were not even reliable ways to represent the AI problem of decoding text. So right now we're all used to a token-based API where if I give you some text in, you get text out. Back in the day, all of the machine learning approaches were based on single outputs or bounding box outputs or very rudimentary, you know, token-based outputs. And so it was difficult to even contemplate what it looks like to be an AI system that can do math because you can't even have an AI system that decodes really basic English text.

2016, right? 7 years later in 2023, what had changed? Well, 2 main things. So first of all, text-based AIs were starting to do okay at rudimentary high school level math. And that's very different because you couldn't do any math 7 years ago. Now you can do high school level math. But the other significant thing was Lean and formally verified math. Yes. Lean is a computer programming language that lets you encode math as code, which means that if the code compiles under the Lean compiler, you know, 100% sure that the math is right.

This has 2 benefits. First, think from the human perspective. Let's say in 2036, you ask an AI system to prove the Riemann hypothesis and it gives you 100,000 pages of text. I think you might as well throw that in the trash can. And the reason is that first of all, it's probably wrong. And second of all, it's just text. It's not something— it's not very clear what some things mean. It could use different definitions elsewhere. So you have to spend a lot of time just understanding the text, even if it's right.

But in contrast, if you get computer code that proves Riemann hypothesis, first of all, you know it's correct because the computer checked it, and it does this without using AI, just using basic algorithms. And secondly, and importantly, you can kind of jump around the math proof in the same way one jumps around a computer code base. So that's the first benefit of using Lean. And the second benefit is that because you don't have humans in the loop, we considered that it would actually let you do reinforcement learning for math much more efficiently.

And that's all we proved pretty definitively with the IMO Gold performance in 2025, essentially starting from absolute zero as an AI lab in October 2023. Speaker C: The lean element here was something that was, was so interesting to dig into. Had there been sort of an inflection point around lean as a, as a system or, you know, certain level of maturity that you were excited about that sort of allowed this to happen? Speaker A: Yeah, it's, it's a weird coincidence, but The language we use is Lean 4, so it's the 4th version of Lean, and it actually went GA the month before we started the company.

Kind of a big coincidence, but, but truly a coincidence. Speaker C: Like it, it wasn't sort of galvanized by seeing that. Speaker A: No, we had decided that formal verification was the key, but so I'll just tell you a little bit about Lean. It's created by a programmer named Leo DeMora. He works at AWS now. He's, I think, one of the best programmers ever, and I think Lean is the best language ever. I can talk about why, but Leo had the benefit of learning from all of the verification language that had come before, whether Roc or Isabelle or Daphne or these other languages.

And Lean went through 4 revisions starting from Lean 1. And by the time of Lean 4, we had kind of worked out all the bugs in Lean, all of the UX issues. And that's the first version that seemed to really scale for both math and programming. So I think it was a coincidence, but it was very important that Lean 4 worked for Harmonic to work. Speaker C: I would actually love to hear why you think it's the best language of all time, because that's a, an exciting claim. And I imagine that many people have never thought about Lean before or, uh, not at any level of depth.

Speaker C: I would actually love to hear why you think it's the best language of all time, because that's a, an exciting claim. And I imagine that many people have never thought about Lean before or, uh, not at any level of depth. Speaker A: So I'm a bit of an amateur in, in the area of programming languages, but I, I think the thing that Lean has going for it is I think it's the first language that really did dependent types well and also tactics well. So previously you had languages that, you know, might have a very nice type theory, but lacked a nice tactic language.

And a tactic language is essentially a metaprogramming language within the language that lets you construct programs more efficiently. I don't know if there's one design decision that made Lean 4 so good, but it's really the sum of many small decisions that made it finally usable enough. For mathematicians to pick up on. And the other thing Lean had going for it is that there's a network effect where Mathlib, which is the library of open source math written in Lean, that's the biggest open source repository. And because it's the biggest repository, it's written in Lean.

Any new contributor that wants to contribute to open source math, they're gonna contribute to Mathlib. And Mathlib gets bigger, which makes it better for more people to contribute. And so this is like a network effect that I, I learned to recognize very quickly from my time at Quora. And as soon as I saw Lean have this network effect, it was very clear that it was going to be the runaway winner. So Lean has actually gotten even better over time because so many people are using it. And again, it's, it's not really one giant feature here or there.

It's just the fact that all these small design decisions were made well, which makes it good for humans and coincidentally AI as well. Speaker C: You, uh, have had sort of two key Vlads as co-founders, you know, Vlad Voronovsky with Helm and, uh, Harmonik is founded with with Vlad Tenev, CEO of Robinhood. How did you two find each other? And what was the conversation such that it led you to this, you know, discovery in, you know, fall of 2023 where you're thinking formal verification really matters and, and this is the moment where it can happen?

Speaker A: Well, the funniest thing is there's another Vlad involved. Really? Speaker C: There's 3 Vlads. Speaker A: It's going to be a little confusing, but Vlad Tenev was an investor in Helm. Speaker C: Ah, okay. Speaker A: And I got just a tiny bit known, just a tiny bit through Helm. But in the fall of 2023, I was chatting with another friend, also named Vlad, and one idea I ran by him was to try to win a Fields Medal with AI. This was on a Sunday morning, it was the Enford Distrail.

It's a very nice hike. Most people that I told that idea to, the Fields Medal with AI, they kind of laughed it off, right? And they said, okay, let's talk about something else. But this Vlad, Vlad Novikovsky, um, he really engaged with it and we chatted about it. He thought it was a pretty cool idea. The funny thing is that the next morning he was talking to Vlad Tenev. I think they were talking about some, you know, potential, potential deal. And apparently in that conversation, my co-founder Vlad Tenev, he brought up the idea of trying to win a Fields Medal with AI.

And he said, you know, it's funny that you mentioned that because just yesterday morning I was talking to Tudor about the same thing. And the thing I like about the story is that, you know, you often hear about companies getting started, you know, by CEO having some idea and then recruiting a team and stuff. But here it's, it's really coincidental. You know, we both told our mutual friend about this idea at the same time, and that's how we all connected. Speaker C: That is so cool. Vlad Tenev, I hadn't really fully appreciated this until researching this, was or is really a math nerd, a math nut.

Studied under, under Terence Tao. Speaker A: Yeah. Yeah, he's a real mathematician. I only did an undergrad in math, so he's the mathematician on the founding team. Speaker C: And so you guys knew each other a little bit, but how did you sort of build the connection, the sort of vision, the rapport needed to say, hey, we should actually, you know, try and do this together? Speaker A: It wasn't really clear that either of us would do it. I think that, you know, he has a day job. He had a day job and still has a day job.

Speaker C: Yes. Speaker A: You know, he's the executive chairman of the company, but He has his main focus, which is Robinhood, obviously. Um, so I think it was more on me to basically figure out what I wanted to do. And as we were talking, I just got more and more excited about the fact that it's possible, right? This had been a longstanding dream of mine to build an AI mathematician, right? Since I was a kid, basically. And I just want to emphasize that the fact that the AIs were starting to work for high school math and that Lean was getting so much traction, those are the two things that I think made it logical for, for you to consider doing this.

So I, you know, we, we had lots of discussions and thought about what it would look like. Speaker C: You make it sound very easy in that sense. I imagine there's a lot of sort of being on the same wavelength that doesn't always come as natural to people, but it's great that you guys had that. Speaker A: Yeah, I think, I think Harmonica is lucky in many ways. Speaker B: One of the hardest things about running a startup is how easy it is to get pulled into low leverage work, payroll, onboarding, hardware setup.

It all has to happen, but it pulls you away from the actual reason you started the company. That's what Rippling was built to solve. Rippling is a unified platform that lets startups run HR, payroll, IT, and finance in one system from day one. With other tools, workflows like onboarding a new hire, setting up payroll, provisioning apps, and shipping a laptop can take days and eat up your focus when you need it most. With Rippling, they happen automatically in one place. Over 15,000 startups, including Cursor, Clay, and Sierra, trust Rippling to scale fast without adding additional ops and HR headcount so that founders can keep building.

So if you or your startup want to move as quickly as you can and focus on what really matters, like your product and your customers, you need Rippling. Right now, venture-backed startups can get 6 months of Rippling Startup Stack for for free. Head to com/mario and sign up today. That's com/mario to sign up for 6 months free today. Speaker A: Yeah, I think, I think Harmonica is lucky in many ways. Speaker B: One of the hardest things about running a startup is how easy it is to get pulled into low leverage work, payroll, onboarding, hardware setup.

So if you or your startup want to move as quickly as you can and focus on what really matters, like your product and your customers, you need Rippling. Right now, venture-backed startups can get 6 months of Rippling Startup Stack for for free. Head to com/mario and sign up today. That's com/mario to sign up for 6 months free today. Speaker C: To put a finer point on it, formal verification and the need for that, how did that sort of emerge from these two forces that, you know, these two sort of catalysts, those two catalysts maybe made math more possible with AI, but it sounds clear to me that you also saw this sort of obvious market need.

Speaker A: At the end of the day, the co— the mission of the company is to explore the frontiers of human knowledge. And to do that, we need to build a tool that's useful for people. The decision to go with formal verification first and foremost is to produce a mathematical agent that is useful to people. What you see a lot actually these days, besides people using our product Aristotle to solve math problems, is you'll see people taking mathematical ideas that they've explored with other AIs, whether it's Gemini or GPT, or in some cases, uh, Opus from Anthropic.

They'll want to double-check that the ideas they've explored are correct, and to do that, they will turn to Aristotle to formalize them. Then they can pick up the exploration from there and, you know, riff off of that. But I think that formal verification is the fundamental technology that lets AIs explain their output in a way that's directly useful to people. Without formal verification, you're really relying on the trust me, bro principle where an AI is ostensibly so smart that whatever text it gives you, just kind of rely on it. I've never felt that that's, uh, interesting to people.

I think in some cases it can help research, but that's maybe in a very transitionary period where humans still have something to contribute to math. But when you get to the point where AIs are just doing all of it, I, I don't see a future where the AIs are not verifying their output in a computer-checkable way. So just to, to reiterate, you know, the, the main point of formal verification is to build a tool that's useful to humans just by construction. Speaker C: How far away are we, do you think, from the phase where AI is just doing this entirely itself, by itself, you know, where the human is no longer playing a particularly useful role?

Speaker C: How far away are we, do you think, from the phase where AI is just doing this entirely itself, by itself, you know, where the human is no longer playing a particularly useful role? Speaker A: Humans will always be the critical component of the system because they have to determine what matters to humanity from math. Do we allocate our resources to number theory or group theory? Do we allocate it towards symplectic geometry or other fields? So again, AI for math is a tool that human mathematicians use to explore the frontiers of human knowledge.

So from that perspective, uh, although AI will become a better and better tool over time, and eventually very few humans will be doing any like really basic calculations or really basic math, I still wouldn't say that we're kind of like replacing humans in the process. I think, I think within 2 or 3 years, AI mathematicians will just be, they'll surpass human mathematicians for any like specific mathematical task. Uh, I don't think it'll be like a decade, like some people say. Speaker C: Right. You mentioned Aristotle, um, and you sort of gave us maybe a few of the pieces there of how it works, but to, to maybe flesh that out.

Speaker B: Yeah. Speaker C: What, what is the sort of the core product of Harmonic and how is it being used? Speaker A: Um, at the moment, Aristotle is the world's first mathematical agent. So you can delegate a mathematical reasoning task to Aristotle via the API, and it'll think for a long time. And it'll give you an answer. And the really cool thing about it is that it's always correct. People have been very used to using calculators, right? So if you have arithmetic tasks, calculators will always work. They'll always be right.

And when we move to LLMs for math, now people have an experience where you can ask a math LLM a question and it'll give you an answer that looks very, very plausible, but you don't know for sure that it's correct. So you essentially have to be a professional mathematician to be able to use these models to their full potential, because only by being a professional mathematician are you able to ensure that the response you got is correct. Now, it's true that checking work is in many cases easier than, uh, coming up with the work in the first place, but we shouldn't underestimate how hard it is to check a detailed mathematical argument.

So Aristotle, in contrast, when it answers your math question or your software reasoning question or your physics question or your stats question or whatever other quantitative reasoning question, It provides the output as a formally verified Lean program. And what that means is that you can be absolutely certain in its correctness. We have people using this to validate solutions, to come up with solutions to Erdős problems. We have people using it to come up with theory for software verification. We've got people using it to model quantum physics, right? We've had a user, uh, identify a bug in a I think quantum Stein's lemma or something like that, and, you know, patch it up and get the first formally verified proof of some important quantum information theory.

So I think the sky's the limit for this kind of tool. You know, it's not going to be writing history essays anytime soon, but when it comes to quantitative reasoning tasks, I would consider it the first API that's an AI mathematician. Speaker A: Um, at the moment, Aristotle is the world's first mathematical agent. So you can delegate a mathematical reasoning task to Aristotle via the API, and it'll think for a long time. And it'll give you an answer. And the really cool thing about it is that it's always correct. People have been very used to using calculators, right?

So if you have arithmetic tasks, calculators will always work. They'll always be right. And when we move to LLMs for math, now people have an experience where you can ask a math LLM a question and it'll give you an answer that looks very, very plausible, but you don't know for sure that it's correct. So you essentially have to be a professional mathematician to be able to use these models to their full potential, because only by being a professional mathematician are you able to ensure that the response you got is correct.

Now, it's true that checking work is in many cases easier than, uh, coming up with the work in the first place, but we shouldn't underestimate how hard it is to check a detailed mathematical argument. So Aristotle, in contrast, when it answers your math question or your software reasoning question or your physics question or your stats question or whatever other quantitative reasoning question, It provides the output as a formally verified Lean program. And what that means is that you can be absolutely certain in its correctness. We have people using this to validate solutions, to come up with solutions to Erdős problems.

We have people using it to come up with theory for software verification. We've got people using it to model quantum physics, right? We've had a user, uh, identify a bug in a I think quantum Stein's lemma or something like that, and, you know, patch it up and get the first formally verified proof of some important quantum information theory. So I think the sky's the limit for this kind of tool. You know, it's not going to be writing history essays anytime soon, but when it comes to quantitative reasoning tasks, I would consider it the first API that's an AI mathematician.

Speaker C: You'll, you'll forgive me. I don't have it perfectly straight in my head, but is it the case that you are sort of using LLMs for the exploratory part of You know, the, the work that goes into finding perhaps a novel theorem or, you know, exploring that space and then sort of transitioning it into a lean formal verification when it gets to the proving step, or is that LLM step happening usually elsewhere and people are, are bringing it to Aristotle specifically for that, you know, verification phase? Speaker A: Aristotle is everything on its own.

So if you ask it to try to puzzle through a problem, it'll puzzle through it. If you ask it to just formalize an existing argument, it'll do that. I actually do a hot take here. So I think hallucination is key to creativity and to reasoning. We rely expressly on the fact that LLMs can hallucinate ideas, which is the same thing human mathematicians do, right? In their problem solving process. If you're given a hard problem, the only way to solve it is to make it easier by breaking into subproblems. The way to do that, if you don't know how to do it in the first place, is to try lots of things, some of which are wrong, and then you learn from them.

And you make progress after that. The way you described it is accurate. There are LLMs that hallucinate lots of ideas. Many of them are wrong, but the formal verification signal gives you intermediate checks that your thought process is correct, and that lets you build on certain branches of your thought process tree and explore them further until you finally get to the solution. So it's the interplay between hallucinatory language models informal verification that lets you make progress in these very difficult problems. Speaker C: And to, to sort of use the, the heuristic from earlier, hallucinations in your view are valuable because it's expanding that search area and sort of exposing you to different patterns.

Speaker A: Yeah. Otherwise you would just be doing stuff you've done before. And it, that means that if you're not able to solve a problem, you'll never be able to solve it. So hallucinations is the way that you inject entropy and, uh, new ideas into, you know, RL training and also search processes. Speaker C: In the phase of transitioning, you know, uh, let's say plausible hallucinations that you want to prove or that sort of LLM-generated output into Lean-4 and formally verifying it, is there sort of a risk of, of almost slippage in that step?

Is that something that's, that's hard to get right? Speaker A: Yeah, there's a process of transforming the mathematical questions themselves without the proofs into the computer programming language Lean. It was what's called the theorem statement. And it's true, that's not a perfect process, but the good news is that that's only a couple lines of code, or in some cases, 10 or 20 lines of code. If you contrast that to the size of the proof, which goes into the thousands of lines or hundreds of thousands, you're saving a lot of time by only having to review the theorem statement.

So ultimately, until people just do math in Lean from the get-go, and you never have to translate to, to and from English, um, you will have a bit of slippage, but it reduces the burden of the mathematician by 1,000 times or more by just only having to review the theorem statements. Speaker C: How do you generate sufficient training data to make this system work? Because, you know, in some sense you're trying to have it create a lot of new math, uh, and, and that's, There's not, you know, easy patterns to riff on, I would imagine, in that sense.

Speaker A: Yeah, we use a technology called reinforcement learning. So the two main components are search and pattern recognition, as I mentioned in the beginning. So you're given a new problem and you don't have any human data because humans haven't done math in Lean. Now the question is, how do you get training data to solve that problem? Well, you attempt to solve it, and in that process, you hallucinate a lot of approaches. And let's say 90% of them don't work. The good news is that because you have a verifier, when 10% of them work, you can take that data and say, look, I had to spend 10x the computing budget to get this one proof.

But now when my pattern recognition system learns about how I did that proof, it captures that in the weights. So the next time around when I'm hallucinating, I'm a little better at hallucinating on problems like that. So now I can solve problems that are a little bit harder of that type. And I repeat the process again. You're basically building your AI system by itself on synthetic data to eventually solve problems that are far beyond any that humans have solved before. And by the way, we have a technical report and we detail a lot of these facets of the training process there.

But ultimately it's just as simple as you let the model try a lot of things and you can verify which ones worked and which ones didn't. Speaker C: It's, it seems like, you know, there's. Clearly some things that, that Harmonic and Aristotle can do very differently, uh, than these LLMs. And that's, you know, immensely valuable. The other places where different models try and compete is on speed and price. I would imagine that there's sort of a, an element where maybe you have to, to give up on speed given that you're doing all of this, this verification stage.

Is, is that correct? And, and also how does price sort of play into this? Does this end up, you know, for the time being, being a more costly product such that it, you know, you have to sort of select when to use it a bit more carefully? Speaker C: It's, it seems like, you know, there's. Clearly some things that, that Harmonic and Aristotle can do very differently, uh, than these LLMs. And that's, you know, immensely valuable. The other places where different models try and compete is on speed and price. I would imagine that there's sort of a, an element where maybe you have to, to give up on speed given that you're doing all of this, this verification stage.

Is, is that correct? And, and also how does price sort of play into this? Does this end up, you know, for the time being, being a more costly product such that it, you know, you have to sort of select when to use it a bit more carefully? Speaker A: First of all, though, the product is free. Um, so we don't charge for it, but it's actually a lot more cost effective than alternatives, even though it's better. One area where I think we can always improve is in speed. So if you use Aristotle right now, it might think for 3 or 4 hours before it gets back to you.

And that's okay because it's thinking about a lot of hard thoughts and doing a lot of math. But we hear over and over from our users that they would appreciate a faster version that maybe was a little less smart and did a little less work. And so from that perspective, I think we can, we can do better to meet our users' needs. But right now we're just not seeing that there's a trade-off between the cost and the capabilities. We've done a lot of reinforcement learning and that actually just gives you a very capable model at reasoning.

Speaker C: There's clearly applications beyond, you know, folks, you know, maybe bringing some, some math problems to it. Where are the most interesting use cases that you see maybe for larger companies that really need that level of, you know, security and certainty? Speaker C: There's clearly applications beyond, you know, folks, you know, maybe bringing some, some math problems to it. Where are the most interesting use cases that you see maybe for larger companies that really need that level of, you know, security and certainty? Speaker A: I think it would be incredible if we can make software bugs a thing of the past.

If you look at the history of software engineering, the dream of software engineering since day one was to produce programs that had proofs of correctness. And the problem, the reason why we don't have that yet is that for every line of code that you write, you might have to write 20 lines of code to verify it. And if it was hard enough for people to write programs, well, writing 20 times the amount of code on top of that, just to get a proof of correctness is out of the question. And as a result, I think, you know, only the most critical bits of code, uh, Amazon AWS or Boeing or the space shuttle or certain medical devices, only that was verified.

And even that was a very laborious process. There's like not many people that can do it. It's, it's really just a rare thing to verify code. But you can imagine if you have an AI system that can specify functionality, formalize it, write proofs of correctness, all of a sudden I would actually question why would you write any software that's not verified? Okay, now it turns out if you're building a website, you don't need to verify that the button is blue. You can just look at it, you know, and if it's blue, you're good to go.

Um, but I think that in the future, as, as these systems grow in prominence and become easier to use, it'll become the expectation that if you're working on any code that has to do with high-stakes scenarios like human lives or finance, or I don't know, like cybersecurity and nuclear plants or something, I think there'll be an expectation that all that code is verified. That means that it's never going to crash. There's no undefined behavior. It means, you know, the functions are implemented correctly. The crypto stuff is correct. And I think that's clearly the future of, you know, high stakes software development.

Speaker C: I think Terence Tao has said, uh, that, you know, the current state of sort of math and AI and, and, you know, maybe Harmonic in particular is that it's suited more to sort of, uh, what he described as the, the long tail of obscure problems. Uh, it sounds like you think that's likely to change very soon. And I've, I've heard you maybe in another podcast, see that you've seen these sparks of insight, I think is how you described it already. Where are you seeing that most, or what have been the, the sort of little glimmers that give you confidence that there may be some really major mathematical breakthroughs from AI in the, in the not too distant future.

Speaker A: So I think what Professor Tao is referring to is the fact that right now the thing that AI is seen best at in math is remixing existing techniques for doing math and applying them to scenarios. Now, if you can do that at scale, there's going to be a lot of problems that there just have simply not been enough human mathematicians to look at. And so if you have an AI that can do this tirelessly and in a cost-effective way, you can start knocking down conjectures. I actually think the opposite is true.

I, I, I have not yet seen the sparks of intelligence that maybe Grigori Perelman had in his, uh, Poincaré conjecture proof, or Wiles had to connect the dots to get Fermat's Last Theorem. That seems to still be a uniquely human thing. I don't know if it'll last more than 2 or 3 years, but it's also not completely obvious how to get there. I think that it's possible and maybe more than more likely than not that it simply scaling up the systems will eventually get you there. But it's also possible that you might have to start from scratch and do math without any human priors and only in that way learn unbiased algorithms that are able to think in ways that, uh, most humans cannot.

So it's open question how you get there. And I think Professor Tao is more right than wrong in saying that right now AIs are mostly suited for kind of the long tail of math problems versus like the, the really key ones. Speaker C: You mentioned this sort of remarkable result, uh, at the IMO where you sort of tied for the gold medal with OpenAI and, and DeepMind as a much, much smaller, much, much newer, uh, company that has raised a lot less money. How did you manage to achieve that in this amount of time?

Like, what does it take to build something that effective at that speed? Speaker A: I think it mostly comes down to the team. I think we have a remarkable set of people here. They're absolutely committed to building mathematical superintelligence and to the mission of the company. Um, I think we have an environment where we found ourselves able to make good decisions on the technology. Look, when we started doing formal math, I think it was a very contrarian bet. I think most researchers would have said there's no future to this. There's no formal will work.

We felt otherwise and we, we kept going with that bet. That did end up paying off in terms of reinforcement learning efficiency. If we had had to manage 10,000 contractors to get human data and build a system that way, that would have been completely impossible for a team of 20 people, which is all we had. But because we use formal verification, we're able to use those 20 people to scale up compute far beyond the rate you'd see at other companies. So I think, you know, there was a bit of luck involved too.

I think, you know, we've had some lucky breaks in the algorithms we chose and the approaches, but overall, I think I would just put it at the feet of the team of making all those good decisions over time and our investors for supporting us on the mission. Speaker C: One of the interesting parts of, of the IMO results was that, you know, OpenAI and DeepMind were not using formal verification, but they did still achieve impressive results. Is there any part of you that, you know, wonders whether if LLMs just sort of keep scaling, the need for formal verification is reduced somehow that, you know, they're still probabilistic, but they're so accurate within a certain, you know, the error rates are so, so low that it doesn't make as big a difference as, you know, we imagine it might at this stage?

Speaker A: I think the reality is it will always be cheaper to verify things formally with earning CPU cycles to check things than it is to generate a lot more tokens by some GPU. So I, I think ultimately people make a decision based on cost and cost and benefit analysis. So no, I, I think that as LLMs get better, even as the error rate goes down, we're going to be producing more and more content with them, which means that the relative value of verification goes up. Just to give you a example, right?

I mean, to, to verify an IMO proof, it'll take a human mathematician maybe 30 to 60 minutes, and this is for high school math competitions. What we're seeing now is that as other LLMs have gotten good at math and started solving Erdős problems themselves, people do still end up running them through Aristotle just to triple check that it's correct. And often Aristotle will find some bugs and like fill in the gaps and proofs and even make it a perfect proof. I think that that's obviously going to continue as the problems we're attacking become harder and harder.

And to be clear, like, if, you know, as we state in our tech report, informal reasoning is a core component of how the Aristotle system works. So it's not like we're betting against that. We just, we just think that verification has to be applied really at the core of the approach to make it scalable and efficient. Speaker C: There was an analogy, I think it was in another podcast that I, I thought found really interesting, which is you, you sort of described what Harmonic is doing and, you know, what AI is doing for mathematics is sort of transitioning the field perhaps away from operating like a cathedral and more towards, you know, operating more like a bazaar.

And, and, you know, maybe Harmonic is sort of, uh, I think maybe you refer to it as like the Linux of math, uh, in some way. Speaker A: Well, 100%. No, I, I, I mean, my co-founder believes in this a lot too. We, we really want to democratize these systems. I think OpenAI feels the same way. Like I respect what they do with just opening their models up. This is in contrast to, you know, like if you look at the Gemini organization, a lot of their new tools, they'll stay under wraps for a while.

Only like a select set of mathematicians will have access. I just think the future of math is the most capable AI is in everyone's hands with verification. You get rid of the checking bottleneck. 10 years ago, let's say you had access to like an English language, language model that was very smart and you used it to solve P versus NP. Nobody's going to look at your proof. It just doesn't matter. People get so many cranks with that every single day. And actually with AI slop, like that's probably grown tenfold. You know, I'll have to ask some professors I know, but it's, it's not, it's not getting any better.

But if you put a system like Aristotle in everyone's hands, where Aristotle can verify the proofs to 100% correctness, now you transition from humans being the gatekeepers of correctness to just evaluating the quality. But what are other approaches we've seen to evaluate quality of projects? Well, you can look at GitHub stars and forks. So if you have a GitHub repository that claims to solve some big conjecture and it gets a lot of stars and everyone else is depending on it for other things, you can probably estimate that it's an impactful project.

Yeah. So I think you're decentralizing both the correctness and also the value in math. I wouldn't be surprised if in 5 to 10 years the journal model starts to go away, or at least like all journals are now formal journals where it's expected that you have a formal proof of something, and it's almost like the final step of the process to go into the journal. But by that time, it's already been adopted on GitHub and other open source repositories. Again, to sum it all up, I think Aristotle and formal reasoning brings math more into the open source software setting rather than, you know, making things even more gatekept over time.

Speaker C: We've touched on this a little bit, but, um, In, in one of your interviews, you made what I thought was a really fascinating and, and deep and provocative claim, which was by 2030, we will have theoretical explanations for everything. Basically, the, you know, the interviewer, I think, said that sounds like a few thousand years of scientific progress, which it, it really does. I, I'm, yeah. What gives you that confidence? Speaker A: Well, you just look at the scaling curves. There, there's basically no upper bound for math, right? If you look at A game like Go, for which similar techniques have been applied, there is an optimal strategy.

Like, one of the initial players wins, or it's a draw, right? Speaker C: Yes. Speaker A: And so you can't do any better than that, no matter how much reinforcement learning you put in. It is a two-player game with perfect information. There's an optimal strategy. With math, there's no limit, honestly. I mean, you have a problem, you solve it, you just generalize and make it harder and solve that. So you can go infinitely long. And I actually think this is an underappreciated point. Which I want to cover for a sec.

People think of AI as, or AGI as like eventually becoming omniscient, but that's not what's going to happen. AI will solve reasoning given the information that this system has. So I think in, by 2030, we will have 5 theories unifying quantum mechanics and general relativity. And we won't know which is correct because they are all equally plausible and correct. but we'll have to do higher and higher energy physics experiments to determine which one's right. Speaker C: Wow. Speaker A: So just because an AI can create like a logically consistent explanation of something no human has been able to create before doesn't mean that it fully solves every single like open problem in the world.

Speaker C: Wow. Speaker A: So just because an AI can create like a logically consistent explanation of something no human has been able to create before doesn't mean that it fully solves every single like open problem in the world. Speaker C: Wow. So you, I suppose that makes sense, but you know, this is truly a, a verbal layman trying to make sense of what you just said. Your sense is that there are multiple theories that would fit something that capacious and that it will just require further and further computation calculation, uh, to— Speaker A: No, no, no, that's not computation, experimentation.

Mm. Speaker C: Okay. Speaker A: So I might have 10 theories about what happened at the Big Bang and they are all equally plausible. They're logically consistent and they explain every known physics result, but you'll have to collide particles at 100x higher energies in CERN than we have now to determine, okay, at this very high energy, here's the difference between these theories. And so which one's correct? And until you have that experiment, it's just a theory. So the AI can only get you so far. And ultimately the real test is the real world.

So AI is not omniscient. It's just, it's just going to be really, really smart. Speaker C: Yeah. Until it can generate sort of the necessary data to disprove or prove some of these further. Speaker A: Exactly. And that's not really like a math problem. It's like a political problem, right? Like, do we even want to spend the money to figure that out and why? Speaker C: You also described a pattern where history alternates between intellect leaps and data leaps, which I thought was something that I hadn't heard before. Can you explain that a bit?

Speaker C: You also described a pattern where history alternates between intellect leaps and data leaps, which I thought was something that I hadn't heard before. Can you explain that a bit? Speaker A: Yeah, it's kind of what I, what I was alluding to here. So if you consider biology, for example, before the microscope, there were so many theories of I don't know, phlogistons or this really weird explanations of how the human body works. And then someone simply invents a microscope and you can look at the cells. Now all of a sudden, 99% of those theories are ruled out and you have like one theory that works and that's the one you're going to go with.

And then you have the electron microscope and then you have super-resolution microscopy. And that is what gives you more insight to break symmetries between theories. I think what naturally happens is in the period where you don't have new measurement modalities, all you can do is think. And so mathematical thinkers, they come up with all sorts of crazy ideas. And then some engineer somewhere just has some breakthrough and they say, okay, we have a microscope now. And all of a sudden, all those theories go away except for one. Now you have like 30 new theories to explain like all the new things you saw.

So now naturally there's an advantage to people who just measure more things. And then you run out of things to measure and you kind of have what's been happening over the last 50 years in physics where it's like, you just can't run too many physics experiments. And so you're just sitting around thinking of theories. So I think it's natural to go between these periods, but what I think will happen with AI is that there will no longer be long periods where people just think about things because the AI will just think about everything immediately.

It's up to the human to say, okay, like what measurement do I want to do next? Speaker C: Interesting. So the, the role of the mathematician in 2035 is, is more about directing the process or deciding where the places worth investigating are. Speaker C: Interesting. So the, the role of the mathematician in 2035 is, is more about directing the process or deciding where the places worth investigating are. Speaker A: It's almost like a portfolio manager in a hedge fund. Like they're just allocating some, uh, spend of the AI and it's up to them where they'll be able to convince the other humans that it's like valuable to spend it there rather than somewhere else.

Speaker C: Now sort of approaching 3 years into this, what have been the most significant things you've been wrong about so far? Speaker A: I think all of my timelines are too conservative. Um, I remember one example. So when we started the company, uh, and we were pitching it to investors and to prospective candidates, Vlad would always say, yeah, and then we're going to win the IMO gold and then we're going to do a rewind hypothesis. And I remember just thinking like, man, it's like, it's a pretty aggressive claim to say IMO gold.

Like we're pretty far from that. Then we did it. Right. And I think that was, that was a pretty crazy, pretty crazy result. And then it turned out, you know, OpenAI and Google had also done it. Right. The IMO was like far beyond the capabilities of general models at the time. So I think the thing I've been most wrong about, even having been in AI and working at a company like Harmonic is just, how fast things go. And that's why, you know, uh, I don't have specific reasons to think that within 2 or 3 years AI will be better than all human mathematicians.

It just seems like it's no longer implausible. I, I wouldn't be shocked. I mean, I'd be a little surprised, but I wouldn't be shocked at all. Speaker C: You've gone from, you know, these different disciplines that we've talked about, you know, the piano and protein folding and autonomous driving and math. If you were to describe your sort of zone of genius across those? What is the sort of cleanest distillation? Speaker C: You've gone from, you know, these different disciplines that we've talked about, you know, the piano and protein folding and autonomous driving and math.

If you were to describe your sort of zone of genius across those? What is the sort of cleanest distillation? Speaker A: I wouldn't say I have a zone of genius. I try hard, I guess. I think that I just, I really care about choosing the right problem for the right reason and then sticking with it. And so I think a property that a lot of people here share as well, actually, especially some of the early folks that joined. I think with the autonomous driving business, for example, it was very contrarian to do unsupervised learning.

Really everyone was just thinking, well, we're just going to label our way to solving this problem. But it seemed obvious that you couldn't, just scientifically. And then there were people raising billions of dollars in 2017 to do L4 and it was just clear that wouldn't happen. I think with Harmonix it's been the same thing. We chose this problem because you need formal verification for it to be very useful to people when AI gets smarter than humans. I think it was very controversial at the time. Even now I think some people would disagree, but we kind of stuck with it.

So I guess, I guess when I feel like I've sunk my teeth into a problem that I think makes sense, I'm unwilling to let go until something major happens to like, you know, give an information update. And I haven't yet to see that in the field we're in. So if I had to choose one thing, that's probably it. Although I would just call it tenacity more than a zone of genius. Speaker C: You know, it can be equally on the level of genius, tenacity, right? I always like to end with a few sort of thought experiments.

Uh, so I hope you'll indulge me, but If you were given no operational constraints and unlimited resources, what is an experiment you would like to run? Speaker C: You know, it can be equally on the level of genius, tenacity, right? I always like to end with a few sort of thought experiments. Uh, so I hope you'll indulge me, but If you were given no operational constraints and unlimited resources, what is an experiment you would like to run? Speaker A: I would run economic experiments to figure out how we're going to organize ourselves once AGI hits.

I think AI development's going just fine. I don't think we need to staff it even more than we have so far, but I care a lot about kind of humans being in charge and like society working well. And I feel like we're not spending enough like mental energy as a society and economic energy to determine like what's the right way to organize ourselves. So I'd probably figure something else, something out there to work on. But yeah, I think, I think we're good on the AI side. I think RL is working just fine.

We don't need to drop our resources. Speaker C: Do you have any hypotheses on the, uh, post-AIG, AGI world, uh, side or of areas that you'd want someone to, to really dig into? Speaker A: I think the question of whether it should be UBI or something else is, is the right one. I mean, we need to like decide as a society, right? Do we get all of our meaning from work? If not, where else? If it is from work, how do we properly account for the, you know, value that all humans have created in history to give AIs the background knowledge that they have to then be able to do reinforcement learning and self-improve?

So I don't have concrete proposals, but I, I just think that as a society, we talk so much about AI's capabilities, but not nearly enough healthy debate about what comes next after that. We're just kind of assuming it'll be fine. I think we should put a little more energy into like figuring out what that looks like. Speaker C: What's a practice from another culture or another era that you wish was more widely adopted in, in your environment, your local environment? Speaker A: I feel like it would be helpful for everyone working on AI technology to, to have studied a lot more humanities and history.

To have like the human perspective. I think right now, because everybody building AI is so focused purely on the engineering merits of one solution versus another, I think a lot of concerns that other people have about the development get kind of like lost in the shuffle and just not really discussed. So I think certain companies are better than others, but I think if I interpret my local environment to be the AI industry, um, I think we could do a lot just, uh, simply looking at what's happened before. Ethics, right? Equity, that kind of thing.

I mean, maybe I'll publish like a reading list on my website or something, but I think that would just go such a long way to like putting things in context, like what's going on now, what's happened before, how can we go forward, right? What is the justice in certain approaches versus others? I don't, I wouldn't say it's for another culture, but just that's maybe something from college or something that kind of stuck with me. Speaker C: Well, you've given a perfect segue into my final question, which is, If you could assign a book to everyone on earth to read and know they'd understand it, what would you want to assign?

And maybe you can give us a, a few, the, a few of the starters on your reading list. Speaker A: Probably Anna Karenina, I'd say. I, I, I think, I think that these books, I mean, these authors seem to have really captured human nature in all its facets. And I, I think these books teach empathy, they teach foresight, they teach so many things that, uh, that you just don't get via like short-form content or, you know, YouTube videos or stuff like that. So I think some of these like books where the author really tried to understand the human condition and put it in writing, it wouldn't be fiction, right?

I think that's, that's what I'd write. So I think Anna Karenina is probably top of my list. Speaker A: Probably Anna Karenina, I'd say. I, I, I think, I think that these books, I mean, these authors seem to have really captured human nature in all its facets. And I, I think these books teach empathy, they teach foresight, they teach so many things that, uh, that you just don't get via like short-form content or, you know, YouTube videos or stuff like that. So I think some of these like books where the author really tried to understand the human condition and put it in writing, it wouldn't be fiction, right?

I think that's, that's what I'd write. So I think Anna Karenina is probably top of my list. Speaker C: We fully agree on that. I wish more people would, uh, would read a lot of fiction. Uh, I think there are such deep lessons in fiction that we, we don't get many other places or anywhere else, uh, in our media diet really. Well, this was such a pleasure. Uh, Tudor, thank you so much for spending this time with me. Speaker A: Thanks so much, Mario. This was really, really great. Thanks for the great questions.

Speaker B: That's it. Thank you for listening to this episode of The Generalist Podcast. Please subscribe on Apple Podcasts, Spotify, or your preferred podcast app. Ratings and reviews help others discover these discussions. So if you enjoyed the conversation, I'd be grateful if you could take a moment to leave one. For all past episodes and more, visit us at com. See you next time as we continue to explore the future.

Want to learn more?

Ask about this episode