Joseph Carlsmith - Utopia, AI, and Infinite Ethics
Why this matters
This episode strengthens first-principles understanding of alignment risk and the strategic conditions that shape safe outcomes.
Summary
This conversation examines core safety questions through Joseph Carlsmith's discussion of utopia, AI risk, and infinite ethics, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.
Perspective map
The amber marker shows the most Risk-forward score. The white marker shows the most Opportunity-forward score. The black marker shows the median perspective for this library item. Tap the band, a marker, or the track to open the transcript there.
An explanation of the Perspective Map framework can be found here.
Episode arc by segment
Early → late · height = spectrum position · colour = band
Risk-forward · Mixed · Opportunity-forward
Each bar is tinted by where its score sits on the same strip as above (amber → cyan midpoint → white). Same lexicon as the headline. Bars are evenly spaced in transcript order (not clock time).
Across 86 full-transcript segments: median 0 · mean -1 · spread -13 to 10 (p10–p90: 0 to 0) · 0% risk-forward, 100% mixed, 0% opportunity-forward slices.
Mixed leaning, primarily in the Technical lens. Evidence mode: interview. Confidence: medium.
- Emphasizes alignment
- Emphasizes safety
- Full transcript scored in 86 sequential slices (median slice 0).
Editor note
A high-leverage addition to the AI Safety Map that clarifies one important safety bottleneck.
Play on sAIfe Hands
Episode transcript
YouTube captions (auto or uploaded) · video M3TUe4zUCKk · stored Apr 2, 2026 · 2,710 caption segments
Captions are an imperfect primary: they can mis-hear names and technical terms. Use them alongside the audio and publisher materials when verifying claims.
No editorial assessment file yet. Add content/resources/transcript-assessments/joseph-carlsmith-utopia-ai-and-infinite-ethics.json when you have a listen-based summary.
So utopia, for me, just means a kind of profoundly better future, and I think it's important because I think it's just actually possible. I just think it's actually something that we could do: if we sort of play our cards right, we could just build a world that is radically better than the world we live in today. Infinite ethics is ethics that tries to grapple with how we should act with respect to infinite worlds. There's a middle ground between "I shall ignore this completely" and "I shall be a Jain," which is recognizing that this is a real trade-off, there's uncertainty here, and taking responsibility for how you're responding to that. The future is a big thing to try to model with this tiny mind, and so of necessity you need to use these extremely lossy abstractions.

[Music] …of interviewing Joe Carlsmith, who's a senior research analyst at Open Philanthropy and a doctoral student in philosophy at the University of Oxford. Joe has a really interesting blog that I got to check out, called Hands and Cities, and that's the reason I wanted to have him on the podcast: it has a bunch of thought-provoking and insightful posts about philosophy, morality, ethics, the future. So I really wanted to talk to you, Joe, but do you want to give a bit of a longer intro on what you're up to?

Sure. So I work at Open Philanthropy on existential risk from artificial intelligence, and so I think about what's going to happen with AI, how we can make sure it goes well, and in particular how we can make sure that advanced AI systems are safe. And then I have a side project, which is this blog, where I write about philosophy and the future and things like that. That emerges partly from my background: before getting into AI and working at Open Philanthropy, I was in academic philosophy.

Okay, yeah, that's quite an ambitious side
project. I mean, given the length and the regularity of those posts, it's actually quite stunning. Do you want to talk more about what you're working on about AI at Open Philanthropy?

So it's a mix of things. Right now I'm thinking about AI timelines and what's called takeoff speeds: how fast the transition is from pretty impressive AI systems to AI systems that are kind of radically transformative. And I'm trying to use that to provide more perspective on the probability that everything goes terribly wrong.

I see, okay. I don't know what the implications are, I suppose, if it's higher or lower than I would expect. I guess if it's higher, maybe I should work on AI alignment, but other than that, what are the implications of that figure changing?

I think there are a number of implications just from understanding timelines, with respect to how you prioritize. To some extent, the sooner something is, the more you need to plan for it coming sooner, cutting more corners, or counting less on having more time. And I think overall, the higher you think the probability of catastrophe is, the easier it is for this to become kind of the most important priority. I do think there's a range of probabilities where it maybe doesn't matter that much, but I think the difference between, say, 1% and 10% is quite substantive, and the difference between 10% and 90% is quite substantive, and I know people in all of those ranges.

Gotcha, okay, interesting. So let's back up here and talk a bit more about the philosophy motivating this. I think you identify as a longtermist. So maybe a big-picture question here: you have an interesting blog post about what the future, looking back on us, might think about the 21st century, given the risks we're taking. So what do you think about the
possibility that we're potentially giving up resources, potentially dedicating (well, not me; you're dedicating your career) to building a future that, given the fact that you're alive now, you might find strange or disturbing or disgusting? I guess to add more context to the question: from a utilitarian perspective, the present is clearly much, much better than the past, but somebody from the past might think there are a lot of bad things about the present that are kind of disturbing. They might not like how isolating a modern city might be; they might find the kind of free or cheap information you can access on your phone kind of disturbing. So how do you think about that?

Yeah, a few comments there. One: I do think that for most people throughout history, if you brought them to the present day, my guess is that fairly quickly, depending on exactly the circumstances, they would come to prefer living in the present day to the past, even if there's a bit of future shock and some things are alienating or disturbing. But that said, I think the distance, the gap between historical humans and the present, is actually much, much smaller, both in terms of time and other factors, than the gap I envision between present-day humans and the future humans who are living, ideally, in a kind of radically better situation. And so I do expect a greater distance and possibly greater alienation when you first show up. My personal view is that the best futures are going to be such that if you really understood them, and if you really experienced what they're like (which may be a big step, and might require extensive engagement and possibly changes to your capacities to understand and experience), then you would
think it's really good. And I think that's the relevant standard. So for me, I worry less if the future is initially alienating, and the question for me is how I feel once I really understood what's going on.

I see. So I wonder how much we should value that kind of inside view you would get into the future from being there. If you think of many existing ideologies, take an Islamist, say: you might say, listen, if you could just come to Iraq and feel the bliss of fighting for the caliphate, you would understand better than you can from the outside view, just sitting on a couch eating Doritos, what it's like to fight for a cause, and maybe their experience is kind of blissful in some way. But I feel like the outside view is more useful than the inside view there.

Well, I think there are a couple of different questions there. One is what the experience would be if you had it from the inside, and then there's a subtly different question, which is what your take on this would be if you fully understood, where fully understanding is not just a matter of having the internal experience of being in a certain situation, but also a matter of understanding what that situation is causing, what sorts of beliefs are structuring the ideology, whether those beliefs are true, and all sorts of other factors. And it's the latter thing that I have in mind. So I'm not just imagining, oh, the future will feel good if you're there, because sort of by hypothesis, the people who are there, at least one hopes, are enjoying it; one hopes they're thumbs-up. If the people who are there aren't thumbs-up, that's a strange utopia. But I'm thinking that in addition to their perspective, there's a more holistic perspective, which is the sort of full understanding, and
that's the perspective from which you would endorse this situation.

I see. And then another respect in which it's interesting to think about what they might think of us: what will they think of the crazy risks we're taking by not optimizing against existential risks? One analogy you could offer (I think MacAskill does this in his new book) is to think of us as teenagers in our civilization's history, and then think of the crazy things you did as a teenager. Maybe there's an aspect in which one would wish they could take back the crazy things they did as a teenager, but my impression is that most adults probably think that while those things were risky, they were very formative and important, and they feel nostalgic about them. So do you think the future, looking back, is going to regret the way we were living in the 21st century, or will they look back and think, oh, that was kind of a cool time?

I guess this is conditional on there being a future, which takes away a lot of the mystery here, but I doubt that they will look back with pleasure at the risks and horrors of the 21st century. If you just think about how we, or at least I, tend to think about something like the Cuban Missile Crisis or World War II, I don't personally have a kind of nostalgia: oh, sure, it was risky, but it made me who I am, or something like that. I also want to say, I think it's true that when you look back on your teenage years, there is often a sense of... let's say you did something crazy: you and your friends used to race around and play chicken or something at the local quarry, and it's like, all right, but you survived, right? And the real reason not to do that is
the chunk of probability where you just died. And so I think, to some extent, the ex post perspective of looking back on certain sorts of risks is not the right one, especially for death risks; that's not the right perspective to use to calibrate your understanding of how to feel about it overall.

I see, okay. So I think you brought up utopia, and you have a really interesting post about the concept of utopia. Do you want to talk a little bit more about this concept and why it's important, and also why we have so much trouble thinking of a compelling utopia?

Yeah. So utopia, for me, just means a kind of profoundly better future, and I think it's important because I think it's just actually possible. It's actually something we could make: if we play our cards right, in sort of non-crazy ways, we could just build a world that is radically better than the world we live in today. And in particular, I think in thinking about that sort of possibility, we often underestimate just how big the difference in value could be between our current situation and what's available. I think often utopias are anchored too hard on the status quo, changing it in small ways but imagining our fundamental situation basically unaltered. It's a little bit like the difference between having a crappy job and a beach vacation, and utopia is: everyone has a beach vacation. I don't know how you feel about beach vacations, but I think the difference is more like being asleep and being awake, or like living in a cave versus living under the open sky. I think it's a really big difference, and that matters a lot.

That's
interesting, because I remember in the essay you had a section where you mentioned that you expect utopia to be recognizable, like, to a person alive now. I guess the way you put it just earlier made it seem like it would be a completely different category of experience than we would be familiar with. So is there a contradiction there, or am I missing something?

So I think there's at least a tension, and the way I see the tension playing out, or being reconciled, is specifically via the notion I referenced earlier: if you truly understood, you would come to see the utopia as genuinely good. But I think that process... I mean, ideally, the way we end up building utopia is we go through a long, patient process of becoming wiser and better and more capable as a species, and it's in virtue of that process culminating that we're in a position to build a civilization that is profoundly good and radically different. But that's a long process, and so I do think, as I say, if I just transported you right there and you skipped the process, then you might not like it, and it is quite alien in some sense. But if you went through the process of really understanding and becoming wiser, you would endorse it.

Uh-huh. That's interesting to me, that you think the process to get to utopia is, maybe I'm mishearing it, but when you mention it's a process of getting wiser, it sounds like a more philosophical process, rather than, I don't know, we figure out how to convert everything to hedonium and it's eternal bliss from then on. So am I getting it right that you think it's more a philosophical process, and why do you think so?

Yeah, so I definitely don't sit around thinking that we know what utopia is right now
and it's hedonium. I'm not especially into the notion of hedonium, but I think the brand is bad. People talk about pleasure with this kind of dismissive attitude sometimes, and "hedonium" implies a kind of sterile uniformity; people talk about tiling the universe with hedonium, and it's like, wow, this sounds rough. Whereas I think actually the relevant perspective, when you're thinking about something like hedonium, is the internal perspective, from which the experience of the subject is something joyful and boundless and energizing, whatever pleasure is actually like. Pleasure is not a trivial thing; I think pleasure is a profound thing in a lot of ways. But I really don't assume that that's what utopia is about at all. My own values seem to be quite complicated: I don't think I just value pleasure, I value a lot of different things. And more broadly, I have a lot of uncertainty about how I would think and feel about things if I were to go through a process of significantly increasing my capacity to understand. I think sometimes when people imagine that, they imagine, oh, we're going to sit around and do a bunch of philosophy, and then we'll have solved normative ethics, and then we'll implement our solution to normative ethics. That's not what I'm imagining by wisdom. I'm imagining something richer, and also something that importantly involves an enhancement to our cognitive capacities. We're really limited in our ability to understand the universe right now, and I think there's just a huge amount of uncharted territory in terms of what minds can be and do and see, and so I want to sort
of chart that territory before we start making big and irreversible decisions about what sort of civilization we want to build in the long term.

I see. And then another maybe concerning part of utopia is that, as you mentioned in the piece, many of the worst ideologies in history have had elements of utopian thinking in them. To the extent that EA and utilitarianism generally are compatible with utopian thinking (maybe they don't advocate utopian thinking, but they are compatible with it), do you see that as a problem for the movement's health and potential impact?

Is the question something like: is this a red flag? We look at other ideologies throughout history, and they've been compatible with utopian thinking, and maybe effective altruism or utilitarianism is similarly compatible, so should we worry in the same way?

Yeah, partly. And another part is: maybe it's still right that, morally speaking, utopia is compatible with this worldview and the worldview is correct, but the implication is that somebody misunderstands what is best, they identify as an EA, and this leads to bad consequences when they try to implement their scheme.

Yeah, so I think there are certainly reasons to be cautious in this broad vein. I don't see them as very specific to EA or utilitarianism (I don't identify as a utilitarian, but I'm sympathetic to utilitarianism). I see them as better understood as risks that come from believing that something is very important at all, and from acting from a place of conviction, especially where that conviction has a certain flavor. It's interesting what exactly constitutes an ideology, but I think it's reasonable to look at EA and be like, this looks like an ideology, and I think
that's right, and it's important to have the relevant red flags about it. I think it's pretty hard to have a view of the world, or at least a plausible view of the world, that doesn't in some sense imply that it could be a lot better, and when I say utopia, I don't really mean anything much different from that. I'm not saying a perfect thing; I do have a more specific view about exactly how much better things could be, but more broadly, it seems to me many, many people believe in the possibility of a much better world and are fighting for that in different ways. So I wouldn't pin the red flag specifically to the belief that things can be better. I think it would have more to do with what degree of rigidness you relate to that belief with, how you're acting on it in the world, how much you're willing to break things or act in uncooperative ways in virtue of that conviction. And there, I think caution is definitely warranted.

I see. Yes, I'm not sure I agree that most people have a view or an ideology that implies anything close to the kind of utopia that utopian thinking can involve. If you think of modern political parties in a developed democracy, like in the United States, for example, what is the utopian vision that either party has? It's actually quite banal: oh, we'll have universal healthcare, or, I don't know, GDP will be higher in the next couple of decades, which doesn't seem utopian to me. It does seem like a limited worldview, where they're not really thinking about how much better or worse things could be, but it doesn't exactly seem utopian. I'll let you react to that.

I
think that's a good point. So maybe the relevant notion of utopian here is something like: to what extent is a concept of a radically better world operative in your day-to-day engagement? To some extent, what I meant is that I think if I sat down and talked with most people, we could eventually, with some constraints on reasonableness, come to agree that things could be a lot better in the world. We could just cure cancer, we could cure XYZ disease, we could go through a few things like that; we could talk about the degree of abundance that could be available. But the question is whether that's the structuring or important dimension in how people are relating to the world, and I think you're right that it's often not. Part of what I'm hoping to push back against with that post is exactly that: I think this is a really important feature of our situation. It's true that it can be dangerous, and if you're wrong about it, or if you're acting in an unwise way with respect to it, that can be really bad. But I also think it's just a really basic fact, and we need to learn to deal with it maturely; pretending it's not true isn't the way to do that.

I see. But to me, at least, utopian, or utopia, sounds like some sort of peak, and maybe you didn't mean it this way, but are you saying, in the essay and generally, that you think there is some sort of carrying capacity to how good things can get? Or that beyond a certain point things can keep getting indefinitely better, but at this point we're willing to say that we have reached utopia?

Yeah, so I certainly don't have a hard threshold: here's exactly where I'm going to call it utopia. I mean something that is profoundly better. I do think that
at a very basic level, if there are only a finite number of states that the affectable universe can be in, and your ranking of these states in terms of how good they are is transitive and complete, then there will be a top. But I don't think that's an important thing to focus on from the perspective of just taking seriously that things could be radically better at all. Talking about exactly how good, and what the perfect thing is, is often distracting in that respect, and it gets into these issues about, oh, how much suffering is good to have; a lot of the discourse on utopia gets distracted from basic facts, like: at the very least, we can do just a ton better, and that's important to keep in mind.

I see, I see. You point out in the piece that many religions and spiritual movements have done the most thinking on what a utopia could look like, and there's a very interesting essay by Nick Bostrom in 2008 where he lays out his vision of what somebody speaking from the future utopia, talking back to us, would sound like. When you read it, it sounds very much like a mystical essay, the kind of thing that, if you changed a few words, a Christian could have written, like C.S. Lewis could have written, about what it's like to speak down from heaven. So, to what extent, and I don't mean this pejoratively, is there some sort of spiritual or religious dimension to utopian thinking that relies on some amount of faith that things can get indescribably better, in some ephemeral, indescribable way?

So I think there are definitely analogs and similarities between some ways of relating to the notion of utopia and attitudes and orientations that are common in religious contexts and spiritual
contexts. But personally, I don't think it needs to be that. As I say, I don't think it requires faith; I don't think it requires anything mystical. I think it's just a basic fact about our current cognitive situation, our current civilizational situation, that things could be radically better. And it's ephemeral in the sense that it's quite hard to imagine. For me, an important source of evidence here is the variance in the quality of human experiences. If you think about your peak experiences, they're often a really big deal: you're sitting there going, wow, this is radical, this is serious, and feeling that this is, in some sense, something you would trade much, much mundane experience for the sake of. And the thing I think we need to do is extrapolate from there. You look at the trajectory your mind moved along as you moved into some experience, or some broader non-experiential improvement (your community got a lot better, your relationships got a lot better), look at that trajectory, and then stare down where it's going. And I do think that requires, I don't want to call it faith, but a kind of extrapolation into a zone that is in some sense beyond your experience but that is deeply worthy and important, and I think that's something often associated with spirituality and religion, and I think that's okay. But I actually think there are a number of really important differences between utopia and something like heaven. Centrally, utopia will be a concrete, limited situation: there are going to be
frictions, there are going to be resource constraints, it's going to be finite; it's still going to be in the real world. Whereas most religious visions don't have those constraints, and that's an important feature of their situation.

Yeah, speaking of constraints, this reminds me of Robin Hanson's theory that eventually the economy will just be made up of these digital people, ems, and that because of competition their wages will be driven down to subsistence levels. Maybe that's compatible with some engineering of their ability to experience, such that it's still blissful for them to work at subsistence levels of compute or whatever, but it seems like this sort of first-order economic thinking implies that there will be no utopia; in fact, things will get worse on average, though maybe better overall if you just add up all the experience. So this vision seems incompatible with yours of a utopia. What do you think?

Yeah, I would not call Robin's world a utopia. A thing I haven't been talking about is what our overall probability distribution should be with respect to different qualities of futures: exactly how possible is it, and how likely is it, that we build something that is profoundly good, as opposed to mediocre or much worse? And I would class Robin's scenario in the mediocre-or-much-worse zone.

So do you have a criticism of the logic he uses to derive that?

To some extent. My main criticism, or the first thing that comes to mind, is that I think competitive pressures are a source of pushing the world in bad directions, but I also think there are ways
in which wise forms of coordination and preemptive action can stave off the bad effects of competitive pressures, and that's the way I imagine avoiding stuff in the vicinity of what Robin is talking about, though there are a lot of complexities there.

Yeah. The last few years have not reinforced my belief in the possibility of wise coordination, but anyway. One thing I wanted to talk to you about: you have a paper on what it would take to match the human brain's computational capacity, and associated with that, a very good summary on Open Philanthropy. Do you want to talk about the approach you took to estimate this, and why this is an important metric to try to figure out?

Yeah. So the approach I took was to look at the evidence from neuroscience and the literature on the computational capacity of the human brain, to talk to a bunch of neuroscientists, and to try to see what we know right now about the number of floating-point operations per second that would be sufficient to reproduce the task-relevant aspects of human cognition in a computer. As for why that's important: it's actually not clear to me exactly how important this parameter is to our overall picture. The way in which it's relevant to thinking that I've been doing, and that Open Phil has been doing, is as an input into an overall methodology for estimating when we might see human-level AI systems. That methodology proceeds by first trying to estimate roughly the computational capacity of the brain, or the size of an AI system (its overall parameter count and compute capacity) that would be analogous to humans, and then you extrapolate from that to the
training cost: the cost to create a system of that kind using current methods in machine learning and current scaling laws. That methodology, though, brings in a number of additional assumptions that aren't transparent, that aren't, oh yeah, of course that's how we would do it; you have to be a little more in the weeds to see exactly how it feeds in.

I see. And then I think you said it was 10^15 FLOP/s for a human, right? What estimate did you have for how many FLOP it would take to train something like the human brain? I know GPT-3 is only 175 billion parameters or something, which can fit onto a microSD card even, but it was something like 20 million dollars to train. So were you able to come up with some estimate of what it would cost to train something like this?

Yeah, so my focus in that report was not on the training extrapolation; that was work that Ajeya Cotra at Open Philanthropy did, using my report's estimate as an input. Her methodology involves assigning different probabilities to different ways of using that input to derive an overall training estimate. In particular, an important source of uncertainty there is the amount of compute required, or the number of times we need to run a system per data point that it gets. In the case of something like GPT-3, you get a meaningful data point, a gradient update as to how well you're performing, with each token that you output. In GPT-3-style training, you're predicting text from the internet: you guess the next token, and then your training process says, nope, do better next time, or something like that. Whereas if you're, say, learning to play Go, and you have to play, I mean, this isn't
exactly how AlphaGo works, but it's an example: if you have to play the full game out, hundreds of moves, before you get an update on whether you're playing well or poorly, that's a big multiplier on the compute requirement. That's one of the central pieces, what Ajeya calls the horizon length of training, and it's a very important source of uncertainty in getting to your overall training estimate. Ultimately she ends up with a big, spread-out distribution: GPT-3 was something like 4 × 10^23 FLOPs, and she spreads out all the way up to the evolution anchor, which is something like 10^41, with her distribution centered somewhere in the low 30s. Okay, that's still quite a bit. How much does this rely on the scaling hypothesis? If one thought the current approach was not likely to lead, or at least not in an example-efficient way, toward human-level intelligence, it might be analogous to saying we have enough deuterium on Earth to power civilization for millions of years; if you haven't figured out fusion, that may be an irrelevant statistic. So I think the approach does assume you can train a human-level or transformative AI system with a non-astronomical amount of compute and data, without major conceptual or algorithmic breakthroughs relative to what's currently available. Now, the actual methodology Ajeya uses allows you to assign probabilities to that assumption too; you can, if you want, say you're only 20% on that. And then there are a few other options; you can also rerun evolution.
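The extrapolation described here, from a brain-sized compute budget to a training cost via a horizon-length multiplier, can be sketched in a few lines. This is a toy illustration, not Ajeya Cotra's actual model: the function name, the 1e13 data-point count, and the horizon values are invented for the example; only the 10^15 FLOP/s brain estimate comes from the conversation.

```python
# Toy sketch of a horizon-length multiplier in a training-compute
# extrapolation. Only BRAIN_FLOP_PER_S comes from the conversation;
# everything else is an invented placeholder.

BRAIN_FLOP_PER_S = 1e15  # Carlsmith's central brain-compute estimate

def training_flop(flop_per_run, n_data_points, horizon_length):
    """Total training compute if each data point requires running the
    model horizon_length times before one gradient signal arrives.
    (Backward-pass overhead is folded into flop_per_run for simplicity.)"""
    return flop_per_run * n_data_points * horizon_length

# Hypothetical: a brain-sized model trained on 1e13 data points.
short = training_flop(BRAIN_FLOP_PER_S, 1e13, horizon_length=1)    # per-token updates, GPT-3 style
long_ = training_flop(BRAIN_FLOP_PER_S, 1e13, horizon_length=1e6)  # feedback only after long episodes

print(f"short horizon: {short:.0e} FLOP")
print(f"long horizon:  {long_:.0e} FLOP")
```

A horizon length of a million turns a 10^28 FLOP estimate into 10^34, which is why this single parameter spreads the final distribution across many orders of magnitude.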
That's an anchor she provides, and it's often cited as an upper bound on how hard it is to create human-level systems: doing something analogous to simulating evolution, though there are a lot of open questions as to how hard that actually is. But I do think this methodology is a lot more compelling and interesting if you're compelled by the available techniques in deep learning and by scaling-hypothesis-like views, at least as an upper bound. There are different ways of being interested in algorithmic breakthroughs: one is because you think deep learning isn't enough; another is because you think they'll provide a lot of efficiency relative to deep learning, such that an estimate like Ajeya's is an overestimate, because we'll make some breakthrough and it'll happen a lot earlier. I put weight on that view as well. That's really interesting; it implies that even if you think current techniques are not optimal, maybe that should update you in favor of thinking it could happen sooner. So how did you go about estimating the number of FLOPs it would take to emulate the interactions that happen in the brain? Obviously it would be unreasonable to say you have to emulate every atomic interaction, so what is the proxy that you think would be sufficient to emulate? I used a few different methodologies and tried to synthesize them. One was looking at the mechanisms of the brain, what we know about the complexity of what they're doing, and how hard it is to capture our best guess
about the task-relevant dimensions of the signaling happening in the brain. Then I also tried to bring in comparisons with existing AI systems that are replicating chunks of functionality the human brain has, in particular in the context of vision: how do our current vision systems compare with the parts of the brain that are plausibly doing analogous processing (though often those parts are doing other things as well)? A third method has to do with physical limits on the energy consumption per unit of computation the brain could possibly be doing. And a fourth method, which I only gesture at, tries to extrapolate from the communication capacity of the brain to its computational capacity, using comparisons with current computers. So it's a triangulation: you look at a bunch of different sources of evidence, all of which, in my opinion, are pretty weak; the physical-limits stuff is maybe more complicated, but it serves as an upper bound. We are significantly uncertain about all of this, and my distribution is pretty spread out, but the hope is that by looking at a bunch of things at once you can at least get an educated guess.
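The four-method triangulation can be sketched as combining weak, spread-out estimates in log space. All of the point estimates below are hypothetical placeholders, not the report's numbers, and the geometric-mean combination is just one simple way to synthesize quantities spanning orders of magnitude.

```python
import math

# Hypothetical point estimates (FLOP/s) from the four methods; the
# report's actual numbers and weights differ.
estimates = {
    "mechanistic":   1e15,  # signaling mechanisms in the brain
    "functional":    1e14,  # scaling up vision-system comparisons
    "communication": 1e16,  # extrapolating from communication capacity
    "energy_limit":  1e21,  # physical bound, treated as a ceiling
}

# Combine the non-bound methods in log space: a crude way to
# triangulate estimates of very different magnitudes.
core = [v for k, v in estimates.items() if k != "energy_limit"]
log_mean = sum(math.log10(v) for v in core) / len(core)

print(f"combined estimate ~ 10^{log_mean:.1f} FLOP/s")
assert 10 ** log_mean <= estimates["energy_limit"]  # sanity: below the ceiling
```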
I'm very curious: is there consensus in neuroscience, or other relevant fields, that we understand the signaling mechanisms well enough to say, basically, this is what's involved, this is what the system is reducible to, this is how many bits you need to represent all the synaptic connections? Or is there a variance of opinion about just how complicated the enterprise is? There's definitely disagreement, and it was interesting, and in some sense disheartening, to talk with neuroscientists about just how difficult neuroscience is. A consistent message, and I have a section on this in the report, was how far we are from really understanding what's going on in the brain, especially at an algorithmic level. So in some sense the report is somewhat opinionated: there are experts I found more compelling than others. Some experts are much more in an agnosticism mode: we just don't know, the brain is really, really complicated. They err on the side of very large compute estimates, with a lot of emphasis on biophysical detail and on mysterious things that could be happening. Other neuroscientists are more willing to say, well, we basically know what's going on at a mechanistic level, which isn't the same as knowing the algorithmic organization overall and how to replicate it. I lean toward the latter view, though I give weight to both and try to synthesize the opinions of the people I spoke with. Just looking at the post itself, I haven't looked deeper into the report, but it seemed like, to estimate the FLOPs mechanistically, you were adding up the different systems at play. Should we expect it to be additive in that way? Or might it be multiplicative, with more complicated interactions such that the FLOPs grow superlinearly in the inputs? That probably sounds naive to someone who has studied it, but it's a first-glance question I had. So the way I was understanding and breaking down the forms of processing you'd need to replicate in the brain made them seem not multiplicative in this way. A simple example:
suppose we have some neurons signaling centrally via spikes through synapses, and then we have glial cells as well, signaling via slower calcium waves, a sort of separate network. You could think that if the rate of calcium signaling were dependent on the rate of spikes through synapses, that would be an important interaction. But overall, on my conception, you can estimate these networks independently and then add them up; they're not multiplicative processes at a fundamental level. I do think there are correlations between the estimates for the different parts, but it's additive at bottom. I see. And how much credence do you put in these almost woo hypotheses, like Roger Penrose's idea that there's something quantum mechanical happening in the brain that's very important for understanding cognition? I put very little credence in those hypotheses. I don't see a lot of reason to think that, and I see a good amount of reason not to think it, but it wasn't something I dug in on a ton. Gotcha. All right, so you have this really interesting blog post about infinite ethics. Do you want to talk about why this is an important topic, and why it's important to integrate into a worldview? Sure. Infinite ethics is ethics that tries to grapple with how we should act with respect to infinite worlds: how we should rank them, and how they should enter into our expected utility calculations or our attitudes toward risk.
I think this is important for both theoretical and practical reasons. At a theoretical level, when you try to apply a lot of common ethical theories, constraints, and principles to infinite worlds, they just break, and I think that's an important clue as to their viability, because infinite worlds are at the very least possible. Even if our world is finite, or our causal influence is finite, it's possible to have infinite worlds, and we have opinions about them: an infinite heaven is better than an infinite hell. In ethics we often expect our ethical principles to extend to ranking hypothetical scenarios, to all possible situations rather than just our actual situation, and infinities come in there. But maybe more importantly, I think it's an issue with practical relevance. One way to see that: I think we should have non-zero credence that we live in an infinite world. It's a very live physical hypothesis that the universe is infinite, even if the mainstream view is that our causal influence on that universe is finite, in virtue of things like entropy and light speed. The universe itself may well be infinite, and possibly infinite in a number of different ways; Max Tegmark has some work on all the different ways the universe could be really very large. So I think we should have non-zero credence that we can have infinite influence with our actions: the supposed limitations on our causal influence could be wrong, and it may be that in the future we'll be able to do infinite things.
I also think, somewhat more exotically, that there are ways of having an acausal influence on an infinite universe even if your causal influence is limited; that comes from some additional work I've done on decision theory. If you try to incorporate that as an expected-value reasoner, it very quickly starts to dominate, or at least break, your expected value calculations. You mentioned long-termism earlier: a natural argument for long-termism is that in the future there could be all these people whose lives are incredibly important, so if you do the EV calculation, your effect on them is what dominates. But actually, if you have even a tiny credence that you can do an infinite thing, either that dominates or the calculation breaks. And if you have tiny credences on doing different types of infinite things and you need to compare them, you need to know how to do it. So I think this is actually part of our epistemology now, though we often don't treat it that way, because we're often not doing EV reasoning or really thinking about the fact that these questions apply to us.
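The point about expected-value reasoning breaking can be made concrete with ordinary floating-point arithmetic, where infinity behaves much like it does in the decision-theoretic worry. The probabilities and values below are invented for illustration.

```python
# Tiny credences in infinite value swamp or break EV calculations.
# All probabilities and values are invented for illustration.

def expected_value(outcomes):
    """outcomes: iterable of (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)

# A long-termist-style calculation: huge but finite future value.
finite_ev = expected_value([(0.01, 1e30), (0.99, 0.0)])

# Add a one-in-a-trillion credence in an infinite outcome: it dominates.
infinite_ev = expected_value([(1e-12, float("inf")), (1 - 1e-12, 0.0)])

print(finite_ev)    # huge but finite
print(infinite_ev)  # inf

# Comparing two acts that each carry credence in opposite-signed
# infinities is undefined: inf + (-inf) is nan.
print(float("inf") + float("-inf"))  # nan
```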
That's super fascinating. If it's the case that we can only have an impact on a finite amount of stuff, then maybe there's infinite suffering or happiness in the universe at large, but the delta between the best-case and worst-case scenarios for what we do is finite. Though that still seems less compelling if the hell or heaven we're surrounded by doesn't change overall. Can you talk a bit more about what you've mentioned in your other work, about having infinite impact beyond the scope of what light speed and entropy would seem to allow? Sure. A contender decision theory in the literature, though not, I think, the mainstream one, is evidential decision theory, which says roughly that you should act such that you would be happiest to learn that you had acted that way, for that reason. The reason this allows acausal influence: suppose you are a deterministic simulation, and there's a copy of you being run too far away for you ever to causally interact with it, but you know it's a deterministic copy, so it will do exactly what you do, absent some computer malfunction. Now you're deciding between two options. It's a little complicated because he's too far away to, say, send a million dollars to, but in general: suppose I have to make some ethical decision, like whether to take an expensive vacation or donate that money to save someone's life. The other guy is going to act just as I do, even though I can't cause him to; when I make my choice, I should afterwards think that he made the same choice. So evidential decision theory treats his action as, in some sense, under my control. And if you imagine an infinite universe with an infinite number of copies of you, or even not copies, just people whose actions are correlated with yours, such that acting a certain way gives you evidence about what they do, then in some sense their actions are under your control. And if there are an infinite number of them, then on evidential decision theory and a few other
decision theories, in some sense you're having influence on the universe. This sounds really similar to a thought experiment in quantum mechanics, the EPR pair, which you might have heard of. The basic idea: you have two entangled bits, you take them very far apart, and before separating you agree on a rule, if it's plus we do this, if it's minus we do the other thing. It seems at first glance that measuring something yourself has an impact on what the other person does, even though that shouldn't be allowed by light speed; it gets resolved if you take a many-worlds view. So is this just a thought experiment, or is it something we should anticipate, for some cosmological reason, to actually be a way we could have influence on the world? I haven't dug into the cosmology a lot, but my understanding is that it's at the very least a very live hypothesis that the universe is infinite in extent, and that suitably far away there are copies of us having just this conversation, and even further away, copies of us having this conversation but wearing raccoons for hats, and all the rest, which is itself something to wonder about and sit with. Infinite universes are just a part of mainstream cosmology at this point, so I don't think it's just a thought experiment. And these non-causal decision theories are actually my best-guess decision theories, though that's not a mainstream view.
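The contrast between causal and evidential reasoning about correlated copies can be put in toy form. The payoffs and the assumption of perfect correlation are invented for the example, and `causal_value` and `evidential_value` are hypothetical helpers, not terms from the decision-theory literature.

```python
# Toy contrast between causal and evidential reasoning when your act
# is perfectly correlated with n_copies of you. Hypothetical helpers
# and payoffs, invented for illustration.

def causal_value(act_value, n_copies):
    """Causal reasoning: only the act you physically cause counts."""
    return act_value  # n_copies is irrelevant on this view

def evidential_value(act_value, n_copies):
    """Evidential reasoning: choosing is evidence that the copies chose
    likewise, so their acts count as well."""
    return act_value * (1 + n_copies)

print(causal_value(1.0, n_copies=10**6))      # 1.0
print(evidential_value(1.0, n_copies=10**6))  # 1000001.0

# With infinitely many correlated copies the evidential value diverges,
# which is where the expected-value machinery starts to break.
print(evidential_value(1.0, n_copies=float("inf")))  # inf
```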
So I think it comes in fairly directly and substantively if you have that combination of views, but I also think everyone should have non-zero credence in all sorts of infinity-involving hypotheses, so infinite ethics gets a grip regardless. I see. Taking that example: if you're having an impact on every identical copy of yourself in the infinite universe, it seems that for any such copy there's an infinite number of other copies that are slightly different. So it's not even clear you're increasing anything; maybe it makes no sense to talk about proportions in an infinite universe, but if there's another infinite set of copies that scribbled the exact opposite thing on the whiteboard, it's not clear you had any impact on the total amount of good or bad stuff that happened. My brain breaks here, but maybe you can help me understand it. So there are a couple of dimensions here. One is trying to understand, at an empirical level, what difference it actually makes if you're in this infinite situation exerting this sort of acausal influence: what did you even change, before you talk about how to value it? I think that's a pretty gnarly question. But even if we settled that empirical question, there's a further question of how you rank the outcomes, the normative dimension, and there things get really gnarly very fast. In fact there are impossibility results showing that even very basic constraints, ones you really would have thought our ethical theories could satisfy at the same time, cannot all be satisfied when it comes to infinite universes.
So we know something is going to have to go and change if we're going to extend our ethics to infinities. I see. So is there some reason you've settled on, I guess you mentioned you're not a utilitarian, but on some version of EA or long-termism as your tentative moral hypothesis, despite this seeming unresolved? How do you sit with that tension while remaining in EA? I think there are two dimensions there. One is that I think it's good practice, if you encounter some destabilizing philosophical idea, especially one that's difficult and that you don't totally have a grip on, not to totally upend your life. But isn't that what long-termism is? Yeah, I think there's a real tension there. How seriously should we take these ideas, and at what point should you be making what sorts of changes to your life on the basis of what you're thinking and believing? It's a real art. Some people grab the first idea they see and start doing crazy stuff in an unwise way; some people are too sluggish, not willing to take ideas seriously or to reorient their life on the basis of changes in what seems true. Nevertheless, especially with ideas that say, ah, turns out it's fine to do terrible things, ideas that really holistically break your ethical views, I think one should tread very cautiously. So that's one aspect. At a philosophical level, the way I resolve it is that for many of these issues, the right path forward, or at least a path that looks pretty good, is to
survive long enough for our civilization to become much wiser, and then to use that position of wisdom and empowerment to act better with respect to these issues. That's what I say at the end of the infinite ethics post: if all goes well, future civilization will be much better equipped to deal with this, and we are at square one in really understanding how these issues play out and how to respond, both at an empirical and at a philosophical level. So it looks convergently pretty good to me to survive, become wiser, keep your options open, and then act from there. That ends up pretty similar to a lot of long-termist existential-risk work; it's just focused less on the main event being what happens to future people, and more on getting to the point where we're wise enough to understand and reorient in a better way. Okay. What I find really interesting about this is that different people tend to have different thresholds for epistemic learned helplessness, where they basically say, this is too weird, I'm not going to think about it, let's just stick with my current moral theories. For somebody else it might come before they became a long-termist: future people, what are we talking about here, you're not changing my mind. And for you maybe it's before the infinite ethics stuff. Is there some principled reason for thinking this is where that stop should be, or is it just a matter of temperament and openness? I don't think there's a principled reason, and I should say I don't think of my attitude toward infinite ethics as solely, oh, this has gotten too far down the crazy path, I'm out. The thing about wisdom in the future is pretty important to
me as a mode of orientation. A first-pass cut I use is: when does it feel real? If a thing feels real, as opposed to a kind of abstract fun argument, that's a real signal. The mode I'm drawn to is something like this: if an idea seems compelling intellectually, that's a reason to investigate it a lot, think about it, and really grapple with why it doesn't seem right to you or seems too crazy; it's a reason to pay a lot of attention. But if you've paid a lot of attention and at the end of the day you're like, well, at an abstract level that sort of makes sense, but it just doesn't feel to me like the real world, like wisdom, or like a healthy way of living, then maybe you shouldn't act on it. Some people will do that wrong and end up bouncing off ideas that are in fact good, but overall these are sufficiently intense and difficult issues that being actually persuaded, rather than chopping off the rest of your epistemology for the sake of some version of the abstraction, seems important to me, and a healthier way to relate. So another example of this: you have a really interesting blog post on ants, on your thoughts after sterilizing a colony of them. This is another example of a thing where almost everybody, other than maybe a Jain who wears a face mask to prevent bugs from going into his mouth, would say: okay, at this point, if we're talking about how many hedons are in a hectare of forest from all the millions of insects there, then
you've lost me. But somebody else might say, well, there's no strong reason for thinking they have absolutely no capacity to suffer. So I wonder how you think about such questions, because you can't stop living; you're not even going to stop going on road trips, where you're probably killing hundreds of insects just by driving. What do you think about these conundrums? I have significant uncertainty, and I think this is the appropriate position, about exactly how much consciousness or suffering, or the other properties we associate with moral patienthood, apply to different types of insects. I think it's a strange view to be extremely confident that what happens with insects is totally morally neutral, and I think that doesn't actually fit with our common sense. If you see a child frying ants with a magnifying glass, you could say, ah, that just indicates the child will be cruel to other things that matter, but I don't think so: you see the ants twitching around, and I think that, as in many cases within animal ethics, we're quite inconsistent about which cases we view as morally relevant and which not. We have pet-treatment laws and then we have factory farms. So I don't see it as a radical position that ants matter somewhat. There's a further question of what your overall practical response should be, and as in a lot of ethical life there are trade-offs: you have to
make a call about what constraints you're going to put on yourself at the cost of other goals. In the case of insects, it's not my current moral focus, and I don't pay a lot of costs to lower my impact on ants in particular; I don't sweep the sidewalk or anything. That's my best-guess response, and it has to do with other ethical priorities in my life. But there's a middle ground between ignoring this completely and being a Jain, which is recognizing that this is a real trade-off, that there's uncertainty here, and taking responsibility for how you're responding to it. It seems similar to the infinite ethics example: if you put any credence on their having the ability to suffer, then, at least if you're not going to say it doesn't matter compared to the far future, the trillions and trillions of insects seem like a compelling thing to think about. But then the result isn't even something like becoming vegan, a change of diet. As you might know, this is used as a reductio ad absurdum of veganism: if you're going to start caring about non-human animals, why not also care about insects? Even if they're worth a millionth of a cow, you're probably still killing something like a million of them on any given day from all your activities, indirectly, say through the food you eat and the pesticides used to produce it. How do you go about resolving that kind of thing? I guess I'd want to really hear the empirical case. It's true there are a lot of insects.
But if you want to say that taking seriously the idea that there's some reason not to squash a bug leads immediately to Jain-like behavior, absent long-termism or something like that, I really want to hear the empirical case about exactly what impact you're having and how, and I'm not at all persuaded that that's the practical upshot. If there is a really strong case, then that's an interesting implication of the view, and worth concern. But it feels to me like it's easy to jump to that almost out of a desire to get to the reductio. I would try to move slower and really ask: wait, is that right? There are a lot of trade-offs here; what's the source of my hesitation about that? And not jump too quickly to something sufficiently absurd that I can say, ah, therefore I get to reject this whole mode of thinking, even though I don't know why. I see. Okay, let's talk about the two different ways of thinking about observer effects and their implications. You have a four-part series on this, but do you want to explain the self-indication assumption and the self-sampling assumption, as much as possible given it's a big topic? Sure. One way to get into this debate is by thinking about the following case. You wake up in a white room, and there's a message written on the wall, and let's say you're going to believe this message. The message says: I, God, flipped a coin. If it was heads, I created one person in a white room; if it was tails, I created a million people, all in white rooms. Now you're asked to assign probabilities to the coin having come up heads versus
tails. One approach to this question, the approach I favor, or at least think is better than the other, is the self-indication assumption (these names are terrible). SIA says your probability that the coin came up heads should be approximately one in a million, because SIA holds that it's more likely you exist in worlds where there are more people in your epistemic situation, more people who have your evidence, which in this case is just waking up in a white room. That can lead to weird conclusions, but I think it's better than the alternative, the main one I consider in that post, the self-sampling assumption. SSA says you should think it more likely that you exist in worlds where people with your evidence are a larger fraction of something called your reference class. It's quite opaque what a reference class is supposed to be, but broadly speaking it functions in SSA's discourse as the set of people you could have been. In this case, in both worlds everyone has your evidence, so the fraction is the same, and you stick with the one-half prior. But in other contexts not everyone has your evidence, and SSA updates toward worlds where people with your evidence are a larger fraction. Famously, SSA leads to what's known as the Doomsday argument. Imagine two possibilities: either humanity goes extinct very soon, or it doesn't, and there will be tons of people in the future. And imagine everyone ranked by when they're born. In the former case, people born around now are a much larger percentage of all the people who will ever have lived.
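The white-room case can be worked through numerically. This is a minimal sketch of the two updates as described in the conversation: SIA weights worlds by the number of people with your evidence, while SSA weights by the fraction of the reference class with your evidence, which here is 1 in both worlds.

```python
# God flips a fair coin: heads -> 1 person in a white room,
# tails -> 1,000,000 people, all with identical evidence.

N_HEADS, N_TAILS, PRIOR = 1, 1_000_000, 0.5

# SIA: weight each world by the NUMBER of people with your evidence.
sia_heads = PRIOR * N_HEADS / (PRIOR * N_HEADS + PRIOR * N_TAILS)

# SSA: weight each world by the FRACTION of the reference class with
# your evidence; everyone has it in both worlds, so the prior survives.
ssa_heads = PRIOR * 1.0 / (PRIOR * 1.0 + PRIOR * 1.0)

print(f"SIA P(heads) = {sia_heads:.2e}")  # 1.00e-06
print(f"SSA P(heads) = {ssa_heads}")      # 0.5
```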
ever lived um and so if you imagine you know if God first creates a world and then he inserts you randomly into like some group it's much more likely uh that you would find yourself in the 21st century um if Humanity goes extinct soon then if it's uh if there are tons of people in the future if God randomly inserted you into these tons of people in the future then it's like really that's it's a tiny fraction of them are in the 21st century um so SSA and other contexts actually you know it has these important implications namely that in this case you update very very hard towards the future being short and that matters a lot for long-termism because uh long-termism is all about the future being big in expectation okay so and then what is it what does the Sia take on this yeah so I think a way to think about essays kind of story so I gave this story about SSA which is it's sort of like this it's like first God creates a world this is SSA first it creates a world and then he takes and he's dead set on on putting you into this world so you he's got your soul right and he really wants and your soul is going in there no matter what right um but the way he's going to insert your soul into the world is by throwing you randomly into some set of people um the reference class uh and so if you wake so you should expect um to end up in the world where uh the the kind of person you end up as uh is is sort of um more like a more likely result of that throwing process is a sort of larger fraction of of the total people you could have been what SSA or what Sia thinks is different the way the story that I'll use for Sia though it doesn't this is the only gloss is God decides he's going to create a world and then he and say there's like a big line of souls in heaven and he goes and grabs them kind of randomly out of heaven and puts them into the world right and so in that case if there are more people in the world then you've got more shots and you're one of these Souls you're sort 
of sitting in heaven hoping to get created um uh on Sia God has more chances to grab you out of out of heaven and put you into the world if there are more people uh who uh more people like you in that World um and so you should expect to be in a world where there was sort of there are more such people and that that's that's kind of Sia's vibe doesn't this also imply that you should be in the future assuming there will be more people in the future tell me more about why why I would apply that okay in an analogous scenario maybe like go back to the god tossing the coin scenario where it just a substitute for people in right rooms you substitute um being a thing uh a conscious entity and if there's going to be more conscious entities in the future like you would really expect to just like in that example of being in that scenario where there's a lot more rooms just as maybe you should expect you to be in that scenario where there's a lot more conscious beings which presumably is the future so then it's still odd that you're in the present uh under Sia yes so in in a specific sense so um it's true that on Sia uh say that um say that we don't know what room you're in first right so so um you wake up in the white room and you're wondering uh am I in room one or am I in rooms two through a million right um and on Sia what you did first so you woke up and you don't know what room you're in but there's a lot more people in their world with lots of rooms and so you become very very confident that you're in that world right so you're very very confident on tales and then you're right that uh conditional entails you think it's uh much more like you sort of split your Credence evenly between all these rooms so you are uh very confident that you're in one of the the sort of two through a million rooms and not not room one um but that's before you've seen your room number um once you see your room number it's true that you should be quite surprised about your room number um uh 
but the uh once you get the room number you're back you're back to 50 50 on uh heads versus Tails because you had sort of equal Credence in being in room one uh conditional on Tails um or sorry uh you had equal Credence in being in tails in room one uh and uh heads in room one and so when you get rid of all of the other tails and rooms two through a million you're left with 50 50 overall on heads versus Tails um and so uh the the sense in which Sia leaves you back at normality with the Doomsday argument is once you update on being in the 21st century which admittedly should be surprising like if you didn't know what that you were in the 21st century and then you learned that you were you should be like wow that's really unexpected and fair so and that's that's true but I think once you do that you're back at um uh you know whatever your prior was about about Extinction maybe I'm still not sure on why the fact that you were surprised should not itself be the Doomsday argument yeah I think there's an intuition there um which is sort of like yeah is Sia making a bad prediction so you you could you could kind of update against Sia because Sia would have predicted that you're in the future um I I think there's something there and I think there's a few other analogs um like for example I think Sia naively predicts that um you know you you should find yourself in a situation where there are just tons of people that you know a situation obsessed with creating people with your evidence um and you know there's is one of the one of the problems with SIA so you should expect to find you know in every nook and cranny a simulation of you as soon as you like you know you open the door it's actually this giant Bank of simulations of you in like your previous epistemic state um and so you know I think there are and and then you don't see that you might be like well I should update against the anthropic theory that predicted uh that I would see that and I I think there are arguments 
in that vein.

Yeah, so maybe let's back up to the original example that was used to distinguish these two theories. Can you help me resolve my intuitions here? My intuition is very much SSA, because it seems to me that you knew you were going to wake up in a white room before you actually did wake up, and your prior should have been one-half heads or tails. So it's not clear to me why, having learned nothing new, your posterior probability on either of those scenarios should change.

I think the SIA response to that, or at least the way of making it intuitive, would be to say that you didn't know that you were going to wake up. If we go back to that just-so story where God is grabbing you out of heaven, it's actually incredibly unlikely that he grabs you; there are so many people. There's a different point here, which is that SIA in general is very surprised to exist, and in fact you could make the argument: SIA says you shouldn't exist, so isn't it weird that you exist? I actually think that's a good argument. But once you're in that headspace, the way to think about it is that it's not a guarantee that you exist. God is not dead set on creating you. You are a particular contingent arrangement of the world, and you should expect that arrangement to come about more often if there are more arrangements of that type, rather than assuming that, no matter what, existence will include you.

Okay. Can you talk more about the problems with SSA, like scenarios where you think it breaks down, and why you prefer SIA?

Yeah. An easy problem, or one of the most dramatic problems, is that SSA predicts that it's possible to have a kind of telekinetic influence on the world.
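(Editor's aside, not part of the conversation: the coin-toss arithmetic described above can be checked with a short sketch. The one-million room count and the fair-coin prior are taken from the example as discussed; the code itself is an illustration added here, not anything from the episode.)

```python
# Illustration of the SIA arithmetic in the God's-coin-toss example:
# heads -> God creates 1 room, tails -> 1,000,000 rooms, fair coin.
from fractions import Fraction

N = 1_000_000           # rooms created on tails
prior = Fraction(1, 2)  # fair coin

# SIA weighs each world by its prior times the number of observers
# who share your evidence ("I woke up in a white room").
w_heads = prior * 1
w_tails = prior * N
p_heads_waking = w_heads / (w_heads + w_tails)
# roughly one in a million, matching the figure quoted above

# After learning "I am in room one": each world now contains exactly
# one observer with that evidence, so the weights equalize.
p_heads_room1 = (prior * 1) / (prior * 1 + prior * 1)
# back to 1/2, the "back to 50-50" point in the conversation
```

Run with exact fractions, `p_heads_waking` comes out to exactly 1/1,000,001: the surprise of seeing room one precisely cancels the earlier update towards tails.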
So imagine that you wake up in an empty universe, except for a puppy, and a boulder that's rolling towards the puppy. The boulder is inexorably going to kill the puppy. It's a very large boulder, and it's basically guaranteed that the puppy is dead meat. But you have the power to make binding pre-commitments that you will in fact execute, and you also have, to your right, a button that would allow you to create tons of people, zillions and zillions of people, all of whom are wearing different clothes from you, so they would be in a different epistemic state than you if you created them. Now you make the following resolution: unless this boulder jumps out of the way of this puppy, in some very weird, very unlikely way, I will press this button and create zillions and zillions of people, all of whom are in a different epistemic state than me, but let's assume they're in my reference class. SSA thinks it's sufficiently unlikely that you would be in a world with zillions of those people while being you, at the very beginning, with different-colored clothes (because that's a tiny fraction of the reference class if those people get created) that SSA thinks it's actually more likely, once you've made that commitment, that the boulder will jump out of the way. And that looks weird, right? It seems like that's not going to work. You can't just make that commitment and then expect the boulder to jump.

So that's the exotic example, but you get similar analogues even in the God's-coin-toss case, where, naively, it doesn't actually matter whether God has tossed the coin yet. Suppose you wake up and learn that you're in room one, but God hasn't tossed the coin. He created room one first, before the toss, and the toss is going to determine whether or not he creates all the other rooms in the future. On SSA, once you wake up and learn that you're in room one, you think it's incredibly unlikely that there are going to be these future people. So now, before the toss (it's a fair coin, and God's going to toss it in front of you), you're still going to say, "I'm sorry, God, it's a one-in-a-million chance that this coin lands tails" (or some very small number; I forget exactly). And that's very weird. That's a fair coin, it hasn't been tossed, but you, with the power of SSA, have become extremely confident about how it's going to land. So that's another argument, and there are a number of other really, really bad problems for SSA, I'd say.

Yeah, while I digest that, let me just mention the problems you already pointed out against SIA, in the post and earlier. If one thinks SIA is true, one should be very confident that you're in a universe with many other people who have been sampled just like you, and then it's kind of surprising that we're in a universe that is not filled to the brim with people. You could imagine Mars being completely made up of bodies, or every single star having a simulation of a trillion people inside. The fact that this is not happening seems like very strong evidence against SIA. And then there are other things, like the presumptuous philosopher, that you might want to talk about as well. Do you just bite the bullet on these things, or how do you think about them?

My main claim is that SIA is better than SSA, and I think it's just a horrible situation with anthropics generally. I think overall SIA is an update towards bigger, more populated universes. The most salient populated universes don't involve hidden people on other planets; maybe we're in a simulation and people are obsessed with simulating us, or something like that. And then, I think this is actually more important: the way I see this dialectic is that a big problem with SIA is that it immediately, naively, becomes certain that you live in an infinite universe, or a universe with an infinite number of people, and then it breaks, because it doesn't know how to compare infinite universes. Now, to be fair, SSA also isn't great at comparing infinite universes. In both cases you can try things that are quite analogous to things you can try in infinite ethics, where you have expanding spheres of space-time and you count some fraction or density of people in those spheres, and there's this general problem in cosmology of trying to understand what it means to have a fraction or a density of different types of observers. But my own take on what happens here is: you hit infinite universes fairly fast, and then they break your anthropics in ways analogous to how they break your ethics. That's kind of where I'm currently at, and I'm hoping to understand better how to do anthropics with infinities. Some of my work on the universal distribution (I have a couple of blog posts on that) was attempting to go a little bit in that direction, though it has its own giant problems.

Okay, interesting. Just vaguely, it seems to me that Robin Hanson's grabby-aliens work probably uses SSA, but do you know if
he's using SSA in there?

I haven't looked closely at that work.

Okay, cool. It's hard for me to think about; maybe it'll take me a few more weeks before I can digest it fully. Okay, so this is really interesting: you have a really interesting blog post about believing in things you cannot see, and, though this is almost an aside in the post itself, you make an interesting comment about futurism. Here's what you say: "Much of futurism, in my experience, has a distinct flavor of unreality. The concepts (mind uploads, nanotechnology, settlement and energy capture in space) are, I think, meaningful, even if loosely defined. But at a certain point, models become so abstracted and incomplete that the sense of talking about a real thing, even a possibly real thing, is lost." So why do you think that is, and is there a way to do futurism better?

I think it comes partly because imagination is just quite a limited tool, and when you're talking about the whole world (the future is a big thing to try to model with this tiny mind), of necessity you need to use these extremely lossy abstractions. So it puts you in a mode of having really sketchy and gappy maps that you're trying to manipulate. I think that's one dimension. And then I think there's also a way in which this isn't all that unique to futurism, insofar as, in general, I think it's hard sometimes to keep our intellectual engagement rooted and grounded in the real world. It's easy to move into a zone, especially if that zone is inflected with social dynamics, or it's a kind of intellectual game, or you're enjoying it for its own sake, or there are status dimensions in the way people talk, and other things that start to move our discourse in directions that aren't about "we're talking about the real world right now, let's actually get it right." I think that happens with futurism, and maybe more so, because there are topics that people treat as real, serious topics about real stuff, and then there are other topics treated as a chance to kind of make stuff up. My experience is that sometimes people relate to futurism that way: they move into a zone of "one can just say stuff here," with no constraints. I think that's actually wrong. With futurism, there are important constraints and important things we can say, but that vibe can seep in nonetheless.

Yeah, and it's interesting that it's true of the future and the past. I recently interviewed somebody who wrote a book about the Napoleonic Wars, and it's very interesting to talk about in an abstract sense, but then you can also (and this is very seldom done) think of the reality of a million men marching out of Russia, freezing, eating the remains of horses and other people, and then starving. When you're dealing with that concrete reality, and not just abstracts like "oh, the border changed so much in these few decades," how you think about history changes so much. Even recently, I was reading this book about the use of meth by the Nazis, and there's this really cynical part of the book where the leaders in the Nazi regime are talking about how meth is the perfect drug, because it gives soldiers the courage to just blitz through an area without thinking about how cold it is, without thinking about how scary it is to be in no man's land: this idea of a messed-up soldier who's been forced to go out into the middle of nowhere, marching into Russia in the winter. I don't know if that was leading up to a question; I don't know if you have a reaction.

Yeah, I think that's a great example, specifically the image of the difference between relating to history as "how is the border changing" versus the concreteness of these people. Often, I think, engaging with history is horrifying in this respect: when you really bring to mind the lived reality of all these events, it's a really different experience. And I think one of the reasons that concreteness might often be lacking from futurism is that any attempt to specify the thing will be wrong. You might be right about some abstract thing, like "we will have the ability to manipulate matter at such-and-such a scale," but if you try to dig in and say "here's what it's like to wake up in the future, and here's what you're eating," or whatever, you're wrong immediately; that's not how it's going to be. So you don't have the ability to really hone in on concrete details that are actually true. In some sense you need this back-and-forth where you imagine a concrete thing, then say, "okay, that's wrong, but I'll take the flavor of concreteness I got from that and say: it will be a concrete thing, it just won't be the specific one I imagined," and keep that flavor of concreteness even as you talk in more abstract ways. That's, I think, a delicate dance. It's like that talking point that
gets brought up a lot, about how we've become indefinite optimists, and the preference instead for a sort of definite optimism, where you have a concrete vision of what the future could be.

Okay, so to close out, one of the things I wanted to ask you about: you said this blog was a side project. Actually, before you mentioned that your main work is AI, I thought this was at least part of your main work, so it's really surprising to me that you're able to keep up the regularity. It's basically like you're publishing a small book every week or so, and I've found a lot of insight in it. It's also unlike many other blogs on the internet in style; you've got great prose. How are you able to maintain such productivity on a side project?

I should say, a few of my most recent posts, which were especially long, were written while I had taken some time off from work, and I was working on those partly in an academic context. But the first year and a half or so of the blog was just on the side, and I've gone back to having it be on the side now. I think one thing that helps is that my blog posts are too long. I have dreams of taking my long blog posts and really crunching them down into a pithy, elegant statement that's really concise and condensed, but that would take more time. So one way I've increased my output is by not doing that editing, and I feel bad about that, but that's one thing, at least.

Yeah, well, what is that quote? I think it's something like, "I didn't have time to write you a short letter, so I wrote you a long letter," or something like that.

Yeah, exactly. I have a friend who says the actual version should be, "I didn't have time to write you a short letter, so I wrote you a bad letter." And I hope it's not that bad, but I do think, if I had more time for these posts, I would try to cut them down. That's one time saving, for better or worse.

At least as a reader, it often seems to me that people like you who write well... I don't know how you'd describe your process, but Scott Alexander says he kind of just writes stream of consciousness, and it turns out to be really readable, and your blog posts are really readable too. Even with the stuff I write, the things where I'm consciously not trying to make edits as I go end up reading much better than the ones where I'm trying to optimize each sentence, taking two steps back for every one I take forward. It could just be a selection effect, where the things that are harder to convey are the ones you spend more time editing, but it's kind of interesting.

Yeah, I wonder. My feeling is that my writing is quite a bit better if I have a chance to edit it; it's just a time thing. But I do think people vary quite a bit, and it's interesting: I was recently reading this book by George Saunders, a writer I really admire, about fiction writing, called "A Swim in a Pond in the Rain." The vibe he tries to convey, and I think this is relatively common among writer types, is this obsessive focus, even at a sentence-by-sentence level, on really thinking about where the reader's mind is right now: how are they engaging, are they interested, are they surprised, am I losing them? His writing is really, really engaging in ways that aren't even obvious; you just start reading along and you're like, "oh wow, I'm really into this." But it's also quite a daunting picture of the level of attentiveness required. If I'm going to write everything like that, that's going to cut down a lot on my overall output. So I do think there's a balance there, and to the extent you're one of these people who can just write stream of consciousness and get close to what you would get out of editing (which I'm not sure I am), all the better. You're lucky.

Yeah, there's also an additional consideration: if you think there's going to be some kind of power law in how interesting a piece is, or how many people see it and find value in it, then it's not clear whether that advises you to spend a lot of time on each piece, to increase the odds that that one piece blows up (given that there's a big difference between the pieces that blow up and those that don't), or whether you should just do a whole bunch and try to sample as often as possible.

Yeah, and I actually started the blog partly as an exercise in just getting stuff out there. I had had the idea that I would one day write up a bunch of stuff I'd been thinking about; I would finally write it up, and it would be this beautiful thing, and I would take all this time. But I had ended up, for various reasons, feeling like I was approaching some aspects of my life with too much perfectionism, and I needed to just get stuff out there faster. So the blog was an exercise in that, and I think that's paid off in ways. I don't think I would have done it otherwise.

I see. All right, final question: I'm curious if you have any book recommendations you can give the audience.

Probably my primary recommendation, and this is somewhat self-serving because I helped with the project, is the book "The Precipice" by Toby Ord. It's maybe familiar to many of your listeners, but I think it's a book that really conveys the ideas that matter most to me, or that have had close to the biggest impact on my own life. Among other books, I love the play "Angels in America"; I think it's epic and amazing. That's not quite a book, but you can read it, though I'd actually recommend watching the HBO miniseries. And then, last year I read the book "Housekeeping" by Marilynne Robinson, and it had this sort of numinous quality that I think a lot of her writing does, so I really liked it and recommend it to people. That's also a piece of fiction. If you're looking for philosophy: a lot of my work is in dialogue with Nick Bostrom and his overall corpus, and I think that's really valuable to engage with.

Cool, cool. All right, Joe, thanks so much for coming on the podcast. It was a lot of fun.

A lot of fun, yeah. Thanks for having me. Oh, I'll also say: everything I've said here is purely my personal opinion. I'm not speaking for my employer, not speaking for anyone else, just myself. So just keep that in mind.

Cool, cool. And where can people find your stuff? If you want to give your blog link, your Twitter link, and other things.

Yep. My blog is handsandcities.com, and my Twitter handle is jkcarlsmith. Those are good places to reach me, and then my personal website is
josephcarlsmith.com.

Okay, and where are we going to find your stuff on AI and those kinds of things?

The stuff on AI is linked from my personal website, so that's the best place to go.

All right, cool, cool.

Thanks for watching. I hope you enjoyed that episode. If you did, and you want to support the podcast, the most helpful thing you can do is share it on social media and with your friends. Other than that, please like and subscribe on YouTube, and leave good reviews on podcast platforms. Cheers. I'll see you next time.