Decart’s Dean Leitersdorf on AI-Generated Video Games and Worlds

Can GenAI allow us to connect our imagination to what we see on our screens? Decart’s Dean Leitersdorf believes it can. In this episode, Dean Leitersdorf breaks down how Decart is pushing the boundaries of compute in order to create AI-generated consumer experiences, from fully playable video games to immersive worlds. From achieving real-time video inference on existing hardware to building a fully vertically integrated stack, Dean explains why solving fundamental limitations rather than specific problems could lead to the next trillion-dollar company.

Hosted by: Sonya Huang and Shaun Maguire, Sequoia Capital

00:00 Introduction
03:22 About Oasis
05:25 Solving a problem vs overcoming a limitation
08:42 The role of game engines
11:15 How video real-time inference works
14:10 World model vs pixel representation
17:17 Vertical integration
34:20 Building a moat
41:35 The future of consumer entertainment
43:17 Rapid fire questions

Published: Nov 13, 2024
Uploaded: Jun 11, 2026
File type: POD

Full transcript


AI-generated transcript with timestamped sections.

0:00-1:46

[00:00] So we launched Oasis a few weeks ago. [00:02] And really, when we launched it, the incredible thing from a tech perspective was, "Oh, this is the first video model that actually runs in real time, and you can interact with it in response to user actions. You can move around the world. You can break blocks. You can place blocks." And so we got this nice game without a game engine. [00:18] Okay. But that's not interesting. Why is this actually interesting? [00:22] And so to answer that, forget about Oasis 1. Think about, say, Oasis 3. [00:29] Okay. And imagine this. [00:32] Imagine for a sec, [00:34] put tech aside for a second. Imagine you're looking at a mirror, [00:38] and you have this magical mirror. You can talk to it. [00:42] You can tell it to do cool things. You can say, "Hey, I'm here, and here's my hand, and I want to hold a sword. You can give me a sword." And then you look at yourself in the mirror and boom, there's a sword in the mirror where your hand is. And you move your hand around and the sword moves. And you can be like, "No, no, no, make this sword bigger, make it blue." And it changes. And you can be like, "Okay, now turn me into Game of Thrones." And everything around you becomes Game of Thrones. And then you get a crown and everything. And you can be like, "I don't like my crown. Change it a bit." And then you start jumping and you move around. And the mirror responds to that. [01:12] Okay? [01:14] And that's interesting. Now, the reason that's interesting is because it's a completely different experience than anything we've had before [01:22] on Earth. And it allows us to kind of channel our imagination through screens that we can see. [01:29] It connects two things. It connects what we see in our minds and what we can see with our eyes. And so that's where we're going with this. In a sentence, how can GenAI really allow us to connect our imagination to what we see

1:46-3:25

[01:46] on our screens? And with that, we can take it into worlds that we didn't explore before. It can change everything from applications we can't do today all the way to how we even interact with computers or with hardware. [02:01] *music* [02:18] Hey everyone, I'm Shaun Maguire. I'm a partner at Sequoia Capital. [02:21] Today, my partner, Sonya Huang, and I are going to interview Dean Leitersdorf. [02:25] Dean is a brilliant young mind. [02:27] He grew up back and forth between Israel and the United States. [02:30] He was the youngest person ever to get a PhD from the Technion in Israel, at 23 years old, [02:35] at least until his younger brother beat him and got his PhD when he was 21. [02:39] Decart is trying to deliver delightful AI experiences, [02:45] really trying to let people interact with their imagination and other people's imaginations in a way that's never been possible before. [02:52] To do this, they are fully vertically integrated, optimizing everything from as low-level as CUDA kernels up to designing their own models, training the models, and then, at the end of the day, delivering experiences. [03:04] Over the next few months, we're going to see some pretty impressive launches from these guys. [03:10] Dean, thank you for joining us today. I was just playing Oasis this morning. I had so much fun. So let me start by asking about Oasis, a fully playable AI game engine. What is it? Why did you launch it? So we launched Oasis a few weeks ago.

3:25-4:59

[03:25] And really, when we launched it, the incredible thing from a tech perspective was, "Oh, this is the first video model that actually runs in real time, and you can interact with it in response to user actions. You can move around the world. You can break blocks. You can place blocks." And so we got this nice game without a game engine. [03:41] Okay. But that's not interesting. Why is this actually interesting? And so to answer that, forget about Oasis 1. Think about, say, Oasis 3. [03:51] Okay. And imagine this. [03:55] Imagine for a sec, [03:57] put tech aside for a second. Imagine you're looking at a mirror, [04:01] okay? And you have this magical mirror. You can talk to it, okay? You can tell it to do cool things. You can say, "Hey, I'm here, and here's my hand, and I want to hold a sword, okay? You can give me a sword." And then you look at yourself in the mirror and boom, there's a sword in the mirror where your hand is, okay? And you move your hand around and the sword moves. And you can be like, "No, no, no, make the sword bigger, or make it blue." And it changes. And you can be like, "Okay, now turn me into Game of Thrones." And everything around you becomes Game of Thrones. And then you get a crown and everything. And you can be like, "I don't like my crown. Change it a bit." [04:31] You start jumping and you move around, and the mirror responds to that. [04:35] Okay? [04:37] And that's interesting. Now, the reason that's interesting is because it's a completely different experience than anything we've had before [04:46] on Earth. And it allows us to kind of channel our imagination through screens that we can see. [04:52] It connects two things. It connects what we see in our minds and what we can see with our eyes. And so that's...

4:59-6:38

[04:59] That's where we're going with this. In a sentence, how can GenAI really allow us to connect our imagination to what we see now [05:09] on our screens? [05:11] And with that, we can take it into worlds that we didn't explore before. It can change everything from applications we can't do today all the way to how we even interact with computers or with hardware. [05:23] I love the mirror. Let's take it further. Where are you going with that? Is this a social media thing? Are you building a game? Are you building a world model, an interactive world model? How should I think about what is Decart? What is Oasis? So let me ask you this. [05:43] What problems does ChatGPT solve? [05:45] Homework. [05:46] Homework, great. And what else does it solve? It makes it easier to talk to computers. Nice. Shaun knows the answer because... Because I spend a lot of time with you. Classic Shaun. I spend a lot of time with you. But exactly that. The TL;DR is that ChatGPT doesn't solve any given problem. It helps you do your homework better, it helps you write emails, it helps you... Summarize. Exactly. Now... [06:08] It doesn't solve a problem; it overcomes some fundamental limitation, which is exactly what Shaun was saying: [06:13] it overcomes this communication barrier between humans and computers. Computers speak in structured languages, humans in unstructured languages, or [06:19] in languages with complex structure, and LLMs just bridge that gap. Now computers and humans can interact with each other in a language that we can both understand. That itself, the second you have that, you get 100 different things that are solved on top of that. So what you get with the mirror, or what you get with...

6:39-8:09

[06:39] generative interactive video is you get that communication barrier now overcome not just with [06:45] text, but also with what we can see. Now computers will be able to see the world the way we see it, and they'll be able to show us the world in ways that we can understand. And once you solve that, you build a platform that allows you to build everything on top of it, from next-gen Snapchat or TikTok to simulators for fighter pilots. [07:08] Okay? And that's the cool thing here. [07:12] Now we're in 2024, and I think one of the most fun things we have at Decart is that we're founding a company at a time when you have an opportunity to build something that doesn't solve a problem, but overcomes a limitation. 99% of companies solve problems. When you look at companies that come to pitch Sequoia or pitch any other VC, they start with, "Here's the problem. Here's how big the problem is. That's our TAM and everything. And here's how we're going to solve the problem." [07:42] Otherwise, you call it a pivot, right? You say, okay, this is the problem I'm solving. If you change the problem you're solving, you call that a pivot. And 500 times, you change the way you're going to solve it. That's 99% of the companies. And that's what you can do in any regular year. [07:56] There are moments in history, [07:59] recently it's been like once every decade, maybe 15 years, when you actually have the chance to build something that doesn't solve a problem, but just overcomes a limitation. And...

8:10-10:03

[08:10] Let me ask you this in a different way. [08:13] Is the Mac a consumer product or an enterprise product? [08:18] And is it a hardware company or a software company? And what problems does it solve? Okay. And if you tried to give me a list of problems that the personal computer solves, you'd have everything from gaming to Excel. [08:34] And that's the nice thing about this: you're building an insane piece of tech that you'll be able to productize in so many different ways. Yeah, I love that. [08:43] One of the things that's so cool about what you've built is that there's no game engine inside, as far as I can tell. What do you think that means? [08:52] Do you think that game engines are an artifact of the past? Or, like, what does that mean? [08:56] Game engines were supposed to make it [08:59] so that one person can create a world, and a different person can interact with that world. That's the purpose of game engines. You have the game developer and you have the user that uses that. And it might go for movies or whatever else people use game engines for. Unreal has been used for movies a lot recently as well. [09:20] Now, that is a very valuable product and it has lots of advantages to it. The world is very consistent. You can really make things very accurate. The problem is that it does take a lot of time to interact with it. [09:35] People like taking the basic game and turning it into a bunch of different things. And, you know, as we got into this and we actually saw what people do with it, do you know there's an actual mod to put Pokemon inside Minecraft? Okay. You can walk around the forest and there's Pokemon running around. That's an actual mod someone built. Okay. And so people inherently have this, "Oh, we got this platform and we want to change it." And so that's the nice thing about mods. What you get here,

10:03-11:33

[10:03] is that because what's running your [10:06] game or your environment [10:08] is an AI, you can interact with it in the ways we're used to interacting with AI. You'd be able to say, "Hey, can you turn this into Elsa-themed?" And then boom, everything becomes Elsa-themed. "And can you add a flying elephant?" And there's a flying elephant in the game, and it's not just there as a picture, you can actually interact with it. You can punch the elephant, it'll punch you back, or whatever you can do with an elephant. [10:29] And [10:30] so I think that [10:31] if this trend were to replace game engines, it would have to be at the state where you can program for it, so that it's some machine that one person can build worlds on and another can interact with. [10:45] And that is definitely coming. And not only that, it's going to be much easier to program for this. You can just use it. [10:52] You don't have to write code. And even if you do know how to write code, you can iterate so much faster on it. [10:59] So, [11:01] basically, to summarize this, I think what this will allow us to do is we'll get modding much, much, much, much, much faster. [11:10] And we'll get interactive modding. [11:12] To get a little more technical for a second, [11:16] you're the first video model I've ever seen that has real-time inference. What are some of the things that go into having real-time inference? How hard is it? Give us some of the flavor of what goes into that. If we go back three, four months, back to the summer...

11:33-13:20

[11:34] I don't remember where this was published, but there were a few headlines about, [11:38] oh, [11:40] when Blackwell chips come out, when NVIDIA's Blackwell chips come out, we'll get real-time video. [11:45] Okay. Hoppers just can't do it. The H100s can't do it. We have to wait for NVIDIA's next generation. And I think I heard this from quite a few different sources. There were like two weeks during the summer where everyone was saying that for some reason. Okay. [12:01] And no, H100s can actually do it. Okay? [12:05] And [12:05] to pull that off, you have to do [12:09] two things at once. You have to change a lot of things around the model itself. Not every video model can be run in real time. You have to train the model differently. The architecture needs to look different. Now, it's not major architectural changes, but you do have to make them. On the other hand, you also have to do lots of the systems-level stuff. You actually have to write your own CUDA kernels. We threw out, like, PyTorch's garbage collector and wrote, like, half of it from scratch, okay? And you really have to write everything on the systems level as well [12:39] to actually pull this off. Because if you do only one of the two, you'll be waiting for someone else to do the other half for you. If you're only doing the systems-level part, then you won't be able to pull this off because you won't have a model that's ready to be interacted with this way. If you do just the modeling stuff, you won't have the systems-level support to be able to make it run in real time. [13:01] Can you say a word on how the model works? Like, you know, is it transformer-based? Is it similar to, like, the Soras of the world? Like, what have you built on the model side? Yeah. TL;DR, it's exactly like the Soras of the world. Just the prompt is user actions instead of text. Like, that's the easiest way to think about it. Think of it like, we have text-to-video models, right? You have--

13:21-15:00

[13:21] Sora, where you put in a sentence, and you get a video. Same thing here. Just your prompt is your keyboard actions and your past frames, and it generates the next frame. OK. So how do you get the data between actions and video? So yeah, you do have to do some pre-processing steps here that you don't do with regular video models. For example, you do have to take the raw recordings of, hey, this is the gameplay. [13:51] taken. And so we retrained a small model that does that. It actually doesn't need too much data. You can solve that with a small model that doesn't need too many examples. And so you can just have our team play for a bit and record that. You get a small model, and then you use that to label all your data. Super interesting. And are you building a world model, or is this just purely pixel representation? Nice. So it is... [14:16] The beautiful thing here is that it's purely pixel representation. [14:20] Now let's compare that to exactly what you were saying with, like, world models or 3D stuff and the other things. [14:27] In AI, for more than a decade, there's been the general question of: do you solve stuff end to end, or do you take an existing workflow and make something more efficient? There could have been two ways to solve this problem. You could say, hey, game engines exist. Unity is amazing. Unreal is amazing. [14:44] Plug into that workflow. [14:46] Let's build text-to-3D. So I'll describe an elephant, and I'll get the 3D mesh of an elephant, and that'll be embedded into Unity or Unreal, or whatever game engine you're using. Okay? So compare that to the end-to-end solution of

15:01-16:32

[15:01] At the end of the day, what I have is a screen. The screen needs to show something and that needs to work. [15:06] And at the end of the day, what people do is they see their computer screen and they touch their keyboard and they move their mouse, and that's your interface, and you solve this end to end, from keystroke to frame. [15:18] Okay, so obviously these two are competing directions. [15:22] Now, [15:24] over time, I think that there will be some merging between them. From a technical perspective, they each have their own advantages. The first is much more consistent over time. It's much easier to say, oh, here's this object. Here's how it looks. And when it comes back in two hours, it'll look exactly the same. And the other one, the end-to-end one, the diffusion version that works in pixel space, that one is much easier to work with. It's much more flexible. [15:54] You can really say, "Oh, no, no, change the elephant's tail. It's too big." Or you can actually edit it [16:00] live in a way that's just more dynamic. So I do think that long-term, though, [16:05] these two things will converge. [16:07] And just if [16:11] we roughly map this out: today we really just have [16:16] prompt to pixels, [16:19] keystrokes to pixels. You could, in theory, say that the right way to solve this in, say, the next two or three years is to have two models. To have a model that... Everything's Transformers, right? Transformers win. You have...

16:32-18:32

[16:32] One model that's in charge of holding some state, [16:36] the state of the game. And that's unrelated to pixels. It's literally just like an LLM-like transformer. It just gets the current state, it gets the new user's action, and just outputs changes to that state. [16:48] And you have one model that's doing that. And then the second model takes that state and renders it to pixels. So it makes sense that that's roughly where we'll converge, because that will really take into account both the advantages of world models and the advantages of diffusion models. Do you want to build both of those models? [17:05] Of course. Yeah, definitely. [17:10] Love it. But yeah. One of the things for me... I will say that we are a bit off. It will take some time to reach that stage. [17:17] One of the things for me that really caught my attention about Dean and Decart is they have this ambition to be completely vertically integrated. These guys understand literally down to electrons. I'm serious. They understand how [17:36] electrons move in logic gates and even alternate logic gates and how you can represent them, [17:44] you know, in levels even below assembly, and then in, like, assembly, CUDA kernels. They literally go all the way from electrons to pixels that your eye sees, and they're optimizing every single level in there. And I [18:02] think, by doing that, they'll always have like a kind of 10x-plus advantage over anyone that's just on the application layer. Actually, so talk about this, because Shaun loves to talk about this. You know, I think the counterargument would be specialization, right? You know, there's 10,000 very smart people at NVIDIA, you know, choose your favorite company, working on this. You should focus on building the best possible user experience and the viral loops and things like that. So talk about your decision to be vertically integrated.
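
The two-model split described in this section (one model that updates a pixel-free game state from each user action, and a second model that renders that state to a frame) can be sketched as a toy loop. This is a minimal sketch under stated assumptions, not Decart's implementation: `WorldState`, `state_model`, `renderer`, and `play` are hypothetical names, and simple hand-written functions stand in for the two transformers Dean describes.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; plain functions stand in for learned models.


@dataclass
class WorldState:
    """Toy game state that knows nothing about pixels."""
    player_x: int = 0
    blocks: set = field(default_factory=set)


def state_model(state, action):
    """Stand-in for the state model: (current state, user action) -> next state."""
    x = state.player_x
    blocks = set(state.blocks)
    if action == "right":
        x += 1
    elif action == "left":
        x -= 1
    elif action == "place":
        blocks.add(x)
    elif action == "break":
        blocks.discard(x)
    return WorldState(player_x=x, blocks=blocks)


def renderer(state, width=8):
    """Stand-in for the rendering model: state -> 'frame' (one row of characters)."""
    row = []
    for i in range(width):
        if i == state.player_x:
            row.append("P")      # the player is drawn on top of any block
        elif i in state.blocks:
            row.append("#")
        else:
            row.append(".")
    return "".join(row)


def play(actions, width=8):
    """Interactive loop: every user action produces exactly one new frame."""
    state = WorldState()
    frames = [renderer(state, width)]
    for action in actions:
        state = state_model(state, action)
        frames.append(renderer(state, width))
    return frames


frames = play(["place", "right", "right", "left"])
for frame in frames:
    print(frame)
```

Because the state lives outside the frames, the placed block is still there when the player walks away and comes back, which is the consistency advantage attributed to world models above. The keystrokes-to-pixels approach that Oasis uses today would instead feed past frames and the action directly into a single model.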

18:32-20:22

[18:32] Let me actually say something because Dean can't brag about himself the way we can brag about him. [18:40] I've been studying business models my whole life; it's been a passion of mine from a young age. And for myself, like, Google to me is one of the most amazing companies of all time, one of the most amazing business models. And I worked at Google for a few years. [18:55] I really feel like people have the wrong understanding of what Google's moat was. For what it's worth, I also think people have the wrong understanding of what NVIDIA's moat is today. But for me with Google, obviously Sergey and Larry had invented PageRank. PageRank was a very beautiful algorithm. But it was actually like a [19:14] deep insight that's very simple to implement. It's like a very basic graph-theoretical idea. And it was a published paper. So once [19:23] PageRank came out, everyone replicated it very quickly. For me, the real advantage of Google was that these guys were [19:31] some of the best in the world at distributed systems and at, like, low-level systems optimization. And they had this very profound insight from early on, that basically all the other search engines were buying [19:44] Sun Microsystems, like, server racks. The way they would get fault tolerance was by buying expensive hardware. Whereas for Google, they realized that they could [19:55] buy just cheap consumer commodity hardware that fails all the time. You buy Intel Pentium processors that are in your gaming computer, or like SanDisk memory. And you need five times as many total flops or five times as many bits to get the same performance because of all the failure rates. But the cost per flop is like 1/50th. So you can have like a 10x cost optimization,

20:25-22:02

[20:25] really leaning into distributed systems and getting the most out of the hardware. And what that led to with Google is, like, for me, when I look back on when I first started using it, it was this very, very simple front end. It was literally just a white web page with, like, a search box. It was, [20:41] I think, a worse front end than Yahoo at the time. Yahoo also had chat rooms and other, like, these kind of flashier, exciting things. But Google had this magical back end. Like, all the magic to me of Google was on the back end. And I think that back end, the performance came from this cost advantage. And it came from the fact that they had optimized all the way down to the bare metal. And with Dean and Decart, the story really rhymes for me. And look, we need to stay humble. [21:11] The company hasn't done jack shit yet. It's a very long way before they deserve a comparison to Google. But, and for what it's worth, Sequoia [21:21] co-led the Series A in Google. I'm very proud of that. Also led the seed in NVIDIA. So we have good history. Good track record. Good track record. Also Series A in Apple. [21:36] Commercial break is over. Commercial break is over. But anyways, I think [21:43] to [21:45] really deliver these, like, delightful, like, say, a delightful mirror experience, which is a very simple front end, I think you need this absolutely insane back end that is optimized to the bare metal. And I think it's kind of all or nothing. Like if you can't deliver real time,

22:02-23:48

[22:02] I don't think it's very good. And I don't think you can deliver real time in the next year without going all the way to the bottom. And so I just, I don't know, for me, I think you kind of have to do that. And these guys are the only ones I've seen doing that. Wow. I love what Shaun just said. Because two things really caught my attention. One is about the vertical integration. We'll touch on that in a sec. And it goes back to your original question. The second is really about... [22:32] So, [22:34] I won't name names, but I was speaking recently to someone who's a very, very senior executive at Google. [22:39] Okay? And just, you know, reminiscing about the past and trying to hear, because [22:44] I was three months old when Google was founded. Okay, so I was around back then, but not really paying attention. [22:53] Knowing you, Dean, you might have been paying attention. [22:58] So I was trying to understand exactly what happened there, why that was interesting. It came from an unrelated conversation. And [23:09] the way that person brought it up, we were talking about how GPU clusters are just unreliable. [23:14] Okay. Just, you know, in general today, if you try to train a model like the one we trained on any cluster, [23:19] okay, whether it's hyperscalers or GPU clouds, that thing's going to crash every few hours. [23:24] Okay. And you're going to have, like, the weirdest things. Okay. You'll have one node crashing, and it'll be because two other nodes have dust on the cable between them. Okay. And there won't be any error to really tell you that that's what's happening. Okay. So your training run will just crash, and you'll be like, okay, why did it crash? And you'll try rebooting it and it won't work. And then you'll try removing random nodes until you understand what happens. And that's the state of the entire industry.

23:54-25:26

[23:54] built, they really built everything down to, like, Google built everything down to the hardware as well. OpenAI had a lot of time to really focus on a lot of this reliability stuff, but anyone else who's training, from the big companies to the small startups, they're all experiencing this. And so I was talking to this person who's very, very high up at Google, and [24:15] they said, hey, [24:16] training today is back where CPUs were in the '90s. Forget Kubernetes. There was no VMware. [24:27] Okay? Nothing was reliable and your servers would just crash all the time, and you had the exact same thing: [24:35] most companies didn't want to deal with that. And so they just either paid for the premium service that was somehow better, or... [24:43] A, so they both paid more money, but B, they also paid with time. [24:47] The broken hardware exists before the stable hardware exists. Sure, we'll get to stable training runs in a year, [24:57] in two years, whenever that will happen. OK, NVIDIA will make their chips more stable. They'll make their code more stable. The GPU clouds will figure out stuff around this. That'll happen. It's not the state today. [25:09] If you want to train a model today, you're going to face all of that. And so [25:13] it's really a challenge you have to deal with, and at Decart we just deal with it. [25:20] Okay? The reason we can... So the model that you saw, Oasis, okay? Oasis 1,

25:27-27:18

[25:27] Oasis 1 converges from start to finish in 20 hours. [25:30] Well, [25:31] and you can compare that. We know what the... We have lots of [25:37] joint work or communication with other AI labs, and they were all shocked by this. [25:43] Now, we're talking really about the best labs training diffusion models. For this model, their convergence would usually take around two weeks. [25:51] And it's both because they're not using optimized systems-layer stuff, but also because they crash every few hours, [26:00] or every few days or whatever. We can actually hold the training run end-to-end without crashing. We can also hold a training run for a week or for two weeks without crashing. [26:11] And that reliability part really, really resonates with what happened back then. [26:17] Now, the thing about it is that it's really not simple to pull off. [26:22] You see, we have this internal doc, I think it's around 200 pages now, of everything that can go wrong when you're training a model. And it's everything from, if you see this error on this node, then yeah, tell your hardware operators that these two nodes have a problem between them, these other nodes have a problem between them. And all the way to, and here's a fun one. At a certain point, as we were training Oasis, [26:48] we were doing the training run, [26:50] and we needed some synthetic data to generate as well. And so we said, okay, well, we have this cluster. It has a shit ton of CPUs as well. Like, great, it has lots of GPUs, but there's lots of CPUs and they're being utilized at, like, 3% or something. Okay, we can just use this and just generate lots of synthetic data on the same cluster as the training is happening. Okay. By the way, this, like, blew the minds of our GPU cloud.

27:20-28:56

[27:20] cluster to like 200%. You're using the CPUs, you're using the GPUs, and we even use, like, the InfiniBand to send data around during training. So we're getting a lot more out of the cluster than should be expected. Okay. [27:36] Now, that all makes sense. So on one hand, you have this: the GPUs are utilized, the CPUs are not utilized. So you run, like, synthetic data in parallel. It uses just the CPUs, and so it's not supposed to hurt anything. [27:49] And then your training run doesn't work, [27:51] and you get a random error that literally says, the team will know how to say this better, but the error that you get is something like "missing lock file in the data loader." [28:02] Okay, and it's like, how are these two related? Do you know why I know how they're related? They're related like this. [28:08] The synthetic data gen was using up more RAM, [28:13] which is fine, but it caused-- sorry, no, it was-- okay. To move the data around between the different nodes as the synthetic data was being generated, it was using more network bandwidth than before. And that caused Python's data loader to take one of its lock files that's usually network-mapped and swap it out to disk. [28:37] Okay. And that caused the state where different nodes had different lock files. And that caused the data loader to crash. Okay. Now I'm probably saying this wrong and the team's probably listening to this and like, "No, Dean, you're getting it all wrong." But that's the TL;DR of what happened. Okay. You did something that was supposed to make sense.

28:56-30:44

[28:57] And you got a random error. And that's the day-to-day. And we have a 200-page doc of all of these things. And so that's my part. And this is a simple example that Dean is happy to share. Like, there's, you know, there's... It's one of the simpler ones. There's [29:11] 100x [29:13] harder, more important things that they've had to figure out. One that I think is also relatively simple, but it just [29:20] kind of shows the current state of AI. And Dean, feel free, if you don't want to talk about this, don't talk about it. But they got access to a new cluster, and somehow the cluster had not installed memory yet. But the GPUs have some very small amount of onboard memory. And so, like, [29:41] most people would just not even be able to use the GPUs. Can you share anything about this story? Yeah, so this is actually a nice story. So, you know, [29:51] we call this the best place on Earth to train a video model. [29:54] Training a video model isn't just the cluster. It's everything surrounding the cluster. Okay, you need to have storage there. You need to have the networking there. There's so much that needs to go into building the best place on Earth to train a video model. [30:05] And we're actually very far away from that. Okay, like, [30:07] I'm assuming that roughly over the next half year, lots of this will stabilize, and lots of the GPU clouds are working on this. [30:16] But yeah, with one of the clusters that we got to, [30:18] there wasn't any storage. And by the way, it wasn't even with one; it happened with a few clusters and different clouds. Okay. You know, the clouds, they bring the GPUs and they try to get everything. They're so focused on getting the H100s that, you know, they forgot the memory or the storage. And it's fine and it's okay. And they were going to install it, they would get there, but, you know, they try to release everything as fast as possible, which is great, which makes sense. And so, okay, there was no, you know,

30:45-32:17

[30:45] stable storage, storage optimized nodes that you can use, or an S3 bucket or something that you can use. [30:51] And so we said, okay, well, [30:53] every node has a few SSDs connected to it. What if we just, you know, [31:00] build our own mini fake distributed file system on top of that. Okay. And that's what we did. And it worked. And it was, there were so many things to overcome to make that happen. [31:12] But it works at the end of the day. And that's, I think, and it goes back to your question about vertical integration. [31:18] Yeah. [31:20] Vertical integration, so I'm... [31:22] Shaun knows business much better than I do and has been around... [31:26] all of these fields way longer than I have. Okay? I did PhDs and, like, technical stuff. I think he just called you old. Oh, no, no. I said experience. I was using Google when it first came out, and I bought NVIDIA shares in the IPO, which is also right around when he was born. So, yes. NVIDIA, I think, IPO'd before I was born, no? 96? 99, I think. 99? Okay, okay. [31:50] Um... [31:51] But yeah, as far as I see it, and correct me if I'm wrong, vertical integration usually gives you two things. It gives you a cost reduction, like higher margins or whatever. And it gives you the ability to move faster. [32:01] Maybe it gives you a third thing, because usually things give you three things, but who knows. So I think here in AI... [32:07] The more important part, sure, they're both important, but I think the second one is even more important than the first. Because at the end of the day, if you look at all the problems we're facing, great, they will be solved.

32:18-33:49

[32:18] but it'll take time for them to be solved. And if you, you know, [32:23] I think there was a great article, I think in The Information, it was like a few months ago, about how people who leave Google to start startups [32:32] suddenly realize that nothing works. Because everything works inside Google, and then you go outside like... [32:37] "Oh, there's no storage." Or, "Oh, my cloud provider doesn't provide me with this. I actually need to take care of this." And so, okay, fine. Over time, these things will stabilize and your clouds will provide you what the cloud needs to provide you. And you'll have great companies that provide you with like middle layer for the system stuff, or even for the model training stuff, that will make lots of this easier for you. But, [33:02] If you really do everything end to end, you can get to market a year before everyone else. You can get to market two years before everyone else. [33:08] And that's, I think, what's key here. Because even if we go to the Google story, [33:14] or the OpenAI story, tech moats don't last. [33:17] Right. [33:18] Sure, Google is a great search engine. Bing is probably not that bad. [33:23] Sure, maybe Google has more data, so they're able to do that. But Microsoft is a huge company. They've been working on Bing for so long. It's a good search engine. They have the tech. [33:33] It still doesn't mean that now Bing and Google are balanced. So at the end of the day, the entire game here is get your tech moat quickly, two years before everyone else, like Google and OpenAI did,

33:49-35:22

[33:49] and work as fast as possible to convert that to a different moat. [33:52] And that's the game here. That's what you have to play. Because we can all say, okay, you know what? Sequoia invested, all good. Let's put the money in the bank for a sec, okay, let's get some interest on that. We'll go be on the beach for like two years, wait for everything to stabilize. We'll come back in two years and then we'll build the same company and that'll be great. But someone else would have done it before. [34:13] And that's, I think, why we chose to be vertically integrated. I love it. What's your moat going to be? [34:21] Long-term or short-term? Both. Both. Perfect. Short-term tech. [34:26] Okay, short-term tech, and that's great. And we have the best systems layer stuff, and we're also doing the model layer stuff as well. So we're fully integrating, and that's your moat at the end of the day, short-term. [34:39] Long term? [34:41] Long term, I think that's a great question. And let me... [34:46] Let me share something that I found really interesting. Okay? So, [34:51] Thank you. [34:52] There is a new, weaker version of network effects that exists today that didn't exist before. [34:59] And that network effect is called what people see on TikTok. [35:04] Now, why is that interesting? Okay, one of the companies that I really, that we learned a lot from and that I think is actually a really, really good company, they did end up selling to Google, it's Character AI. They did end up selling to Google and wanting to go back to training big models, but...

35:22-37:05

[35:22] Character, there's a lot to learn from Character. And one of the things that the second they took off, they had lots of competition instantly. [35:31] Like... [35:32] Fine. The tech moat lasted for like half a year until Meta released open source models and then [35:37] Other people started running this. They were still vertically integrated. And so they were able to be 10x cheaper than everyone else, which was great. [35:45] But one of the things that really stood out to me was their TikTok moat. [35:51] If you go on TikTok and you look for any Character AI competitor, fine, you'll find a video of that competitor and then you'll scroll and you'll see a hundred videos of Character. And if you even, you know, if you go on the videos which are not Character, all the comments are full of... [36:06] Character. And if you talk to a random Character AI user, they don't even know the competition. And so we have somehow, literally because of TikTok, there is a new moat [36:19] of what people say about you on TikTok. And do you have a mini network effect there? [36:25] I'm not sure if it's a network effect or brand effect, but-- - Why is this different from just brand? - So, [36:33] So it's very similar to brand, but it's... [36:37] it's in your face. Like brand like 20 years ago was okay. Did you hear your friends talking about this or your parents talking about this? Here you're always on, like the younger generation, especially they're always on TikTok. And so they just see this instantly. And so there's even a big question of whether a moat like that could survive for the two, three years until you need, until you get your long-term moat of like insane brand, like Google's or a distribution brand or

37:07-39:03

[37:07] So I think we're really in this new market here that we're not necessarily going to have the same moats we had 10 years ago. Hmm. [37:16] Super interesting. Hardware is always the best moat, though. [37:20] And for what it's worth, Google, I think, [37:23] you know, they elevated what was initially like a, [37:28] software moat and a distributed systems moat to becoming a hardware moat. [37:35] I [37:35] I personally think that Google has not leveraged that enough. On the application layer, they haven't had that many really [37:46] like fantastic breakout, you know, consumer products since the early days. But they have an absolutely gigantic, you know, cost advantage, really because of the hardware layer. When I was at Google, there was a... [38:01] There was this project that just absolutely blew my mind and [38:05] gave me a prepared mind for a few investments, which is basically Google built optical interconnects to move data in data centers. One of the papers, if you Google Jupiter Rising, like Google data center, you'll find the papers. And basically these optical switches, by turning them on, basically about doubled the performance of the data centers. Like this one switch is mainly rack to rack in data centers, you know, moving from electrons to photons. And [38:33] These switches were insanely hard to build. And basically everyone outside of Google, if you asked them at the time, is it possible to build-- They'd say no way. --this 100 terabit per second switch or whatever, they'd say absolutely no way. But they did it. People didn't even know for years that Google had this. And it reduced power consumption of the data center by 30% or something. Those things are real fundamental moats. I think it's always hard to know what the moats will be

39:03-40:37

[39:03] for a company in the future. [39:05] But I strongly believe hardware is the ultimate moat, in part because there's always going to be an extreme delay to move atoms, to spin up fabs, to get power, to build a power plant. Even in a world with AGI, the timescale of hardware, even in a world with a billion Optimus robots, the timescale to... [39:30] make new hardware will be much slower or the time scale will be longer. So anyways, I hope Decart has a hardware moat. I think I agree with you on that. Like long term, [39:42] Okay, you know, this actually goes back to when we were founding Decart. So we said, okay, [39:47] Thank you. [39:48] We're in this, we called it the golden ticket. We got this ticket that you get once in your life of starting a company at a time where, going back to what we were discussing before, [39:58] starting a company at a time we can solve some fundamental limitation, like there's some huge tech shift going on. And we said, okay, there are... [40:06] Three huge companies you can build here. [40:08] That was our analysis of the field. A, you can build an NVIDIA competitor. [40:14] And if you, like the next gen chip that's actually built for AI, and it'll be very tough to do, but NVIDIA, you know, NVIDIA is not just a chip giant, but they're a supply chain giant. Okay. Insanely hard to do, but, you know, if you hustle your way around, everyone in the industry wants to help you. And so it's doable if you really excel on the business side.

40:38-42:08

[40:38] Two was to build the next AWS. [40:41] Like there is an opportunity because the workloads themselves are changing. There is an opportunity to be able to build a new cloud. Very, very, very tough because in that market, there's a default winner. If you all lose, the big three will still win. The big three plus Oracle or the other clouds as well. And the third was create new experiences. [41:03] Mm-hmm. [41:04] that new experiences will happen and these experiences will be drastic enough so that the next trillion dollar company can come out of these in five years and not in 30 years. [41:14] And so we had to choose one to start with. We chose the experiences one. But a definitely strong second was let's build an NVIDIA competitor. [41:24] And so we have that lingering thought of one day we'll get back to this. I see why you two are friends. [41:33] I will close it out with one last question. If everything goes right, what is Decart in 10, 15, 20 years? And what experiences have you crafted? And what is the future of consumer entertainment? I don't know if that's the right market. [41:54] firm. [41:54] - Okay. - Generated experiences. [41:57] GX. [41:59] Okay? And... [42:01] We call this UX is dead, long live GX. Okay, basically...

42:08-43:45

[42:08] We're going to have... [42:10] new experiences that are generated in ways that match [42:16] how humans want to interact with computers. And that encapsulates everything from [42:22] Character as a generated experience to real-time video models as generated experiences. And that's what we're going to see. [42:30] What Decart, at the end of the day, is, is a generated experiences company. We're implementing this with being fully vertically integrated, with having the systems layer. At the end of the day, [42:40] you're a generated experiences company. [42:42] You're creating the new wave of experiences that's going to touch... [42:47] every single person on the planet. [42:49] And that's where Decart is. Now, the only question is whether it does take 10 or 15 years. [42:54] In today's age, it might take less. [42:57] It took a long time for the previous Titans to rule the world. [43:02] I... [43:04] I don't know if it'll take that long. [43:06] This time will definitely take at least five years. You operate on a different timescale than a lot of the best AI researchers that are in our orbit. And I really respect that about you. Should we close that with a rapid fire round? Sure. [43:18] Favorite AI app other than Oasis? [43:21] It has to be between ChatGPT and Character. It has to be between ChatGPT and Character. What do you use Character for? [43:27] Not using Character. Okay. But on the basic notion of that we'll have these... [43:34] apps that are entities that hold some kind of relationship, whether it's friendship or whether it's utilitarian with hundreds of millions of people. I think that's an insane platform that's

43:46-45:17

[43:46] So many things going forward. [43:48] Yeah, I love that. Favorite AI company could be the same as last. Same as the last. Same as last answer. Okay, let's see. When did you first program a computer? First program a computer? When I was 13, bots for RuneScape. [44:03] Okay, great game RuneScape. I botted the hell out of it for years until six years in I used a bot that I downloaded from the internet. 24 hours later it got banned. Are we going to have AI generated video games first or AI generated novels? And I mean at the level where I would actually pay for it. [44:22] Um... [44:23] You're going to have, the first thing you're going to have is a platform that lets other people use their creativity to create this content. Because AI is still far away from creating creative content. Super interesting. [44:32] Okay. [44:34] Who's your favorite scientist ever? [44:36] Thank you. [44:37] Favorite scientist. That one I like. That one I like. It's... [44:44] You know, there's a reason we chose the name Decart. [44:47] We chose the name Decart. [44:51] Because, okay, first of all, I'll answer the question. Favorite scientist is Da Vinci, because I think he's both an insane scientist and engineer, and somehow was able to get people to fund his projects. He was like, if you go back to Da Vinci, he literally was a great scientist, engineer, and somehow knew how to raise money from VCs back then, which were kings. [45:15] So yeah, definitely Da Vinci.

45:18-46:33

[45:18] And... [45:19] Descartes and Tesla are close seconds. [45:21] The reason we chose the name Decart was we looked at Tesla. We're like, okay, we love both that company and the name. And we needed someone who resembles the same thing that [45:33] Nikola Tesla resembled to the company Tesla. And for that, that was Descartes because... [45:39] You know, "I think, therefore I am" resembles almost... [45:42] A lot of what AI is to do. [45:45] it's a perfect note to end on [45:47] Dean, congratulations on what you've done. Thank you for joining us today. We love this conversation. Dean, I'm not going to congratulate you. You haven't done jack shit yet. Let's build something insane. But I love the sentiment. We can't celebrate until we really win. There's no celebrating small wins. [46:17] Thank you.
