A backlit laptop computer keyboard being used.

Your Brain on ChatGPT with Nataliya Kosmyna

© User:Colin / Wikimedia Commons / CC BY-SA 4.0

About This Episode

What happens to your brain when you use AI? Neil deGrasse Tyson and co-hosts Chuck Nice and Gary O’Reilly explore current research into how large language models affect our cognition, memory, and learning with Nataliya Kosmyna, research scientist at the MIT Media Lab. Is AI good for us?

Nataliya describes her experiment comparing students writing essays with three methods: ChatGPT, Google, or just their own brains. What happened when groups swapped? Why did the “LLM group” show the least functional connectivity in the brain, while Google users lit up their visual cortex? Why did so many students fail to recall or even quote their own work? Neil and Nataliya unpack how timing, cognitive load, and brain struggle are essential for true learning.

We explore cognitive load theory, video game design, and whether doctors relying on AI could lose diagnostic skills. How much AI support is too much? Could AI free up tasks so we can use our cognitive energy elsewhere? Or does it rob us of our brain power? We discuss the efficiency of the brain versus machine learning, the risks of AI companionship amplifying loneliness (or even AI psychosis), and how education must adapt for a generation raised with AI. What happens when children grow up never learning the skills that AI replaces? Does outsourcing brain work free us to be more creative or leave us dependent?

We discuss the guardrails and the crisis across schooling. How do we adapt the education system around these tools? What do we stand to lose? What do we stand to gain? Does this mean pivoting away from grades and towards a focus on process and learning? Learn about BCIs, brain-computer interfaces, and whether someday we’ll have LLMs in our heads. If we have all of the world’s information uploaded, will we need higher education? Will we all say “I know kung fu” like in the Matrix?

Thanks to our Patrons Jacqueline Scripps, Jose Mireles, Eric Divelbiss, francisco carbajal medina, Sahil Pethe, Vivekanandhan Viswanathan, Kurt R, Daniel D. Chisebwe, Landslide, Sebastian Davalos, Bob Case, Mark Rempel, Lucas Fowler, Cindy, Wizulus Redikulus, Hector Alvarado, Matt Cochrane, Ari Warren, Mark, Jorge Ochoa, Leena Z, Donald BeLow, Zach Woodbury, Jeffery Hicks, Ibolinger, Subri Kovilmadam, Danielle Stepien, Justin Akins, Richard, Tai Vokins, Dan O’Connell, Evelyn Lhea, Siva Sankar, Jack Bremner, mcb_2011, Saronitegang, dante wisch, Adnrea Salgado Corres, Jarrod C., Micheal Maiman, Ivan Arsov, Patrick Spillane, Aarush, Brad Lester, Anna Wolosiak-Tomaszewska, Jon A, Ali Shahid, K. Rich Jr., Kevin Wade, Suzy Stroud, Expery Mental, Ian jenkins, Tim Baldwin, John Billesdon, Hugo, Mason Lake, Judith Grimes, G Mysore, Mark Stueve, Cuntess Bashory, Jock Hoath, Payton Noel, and Leon Rivera for supporting us this week.

NOTE: StarTalk+ Patrons can listen to this entire episode commercial-free.

Transcript


Chuck, if the forces of AI are not big enough in society and our culture, we now got to think about what AI’s effect is on our brain?

I’m going to say that there is no help for my brain, so it does not make a difference.

I know, but Neil, if you lean into these large language models and it takes away some of our core skills, surely that can’t be an upside to that.

Once again, Gary, not going to affect me at all.

Coming up, StarTalk Special Edition, your brain on AI.

Welcome to StarTalk, your place in the universe where science and pop culture collide.

StarTalk begins right now.

This is StarTalk Special Edition.

Neil deGrasse Tyson, your personal astrophysicist.

And when it’s Special Edition, you know that means we have Gary O’Reilly in the house.

Gary.

Hi, Neil.

All right.

We got another one of these.

We’re going to connect the viewer listener to the human condition.

Yes.

Oh, my gosh.

But let me get my other co-host introduced here.

That would be Chuck Nice.

Chuck, how are you doing, man?

Hey, man.

Yeah, when you know it’s Chuck, it means it’s not special at all.

Well, we’ve got you because you have a level of science literacy that, oh, my gosh, you find humor where the rest of us would have walked right by it.

That’s part of our recipe here.

That’s very cool.

Yeah, I appreciate that.

Yeah.

So, Gary, the title today is, Is AI Good for Us?

Okay.

Well, here’s the answer.

No.

Let’s all go home.

Okay.

That’s the end of the show.

Let’s all go home, people.

This was quicker than I expected.

This was very quick.

I mean, yeah, you know.

So, Gary, what have you set up for the day?

Well, Lane Unsworth, our producer over in the LA office, and myself, we sort of noodled.

And this is a question that’s been bouncing around a lot of people’s thought processes for a while.

So, all over the world, people are using LLMs, large language models, for their work, their homework and plenty more.

Besides discussions of academic dishonesty and the quality of work, has anybody actually taken the time to stop and think about what this is doing to our brains?

Today, we are going to look at some of the current, and I really do mean current, time and space, this moment, research into the impact that using an AI tool can have on your cognitive load, and the neural and behavioral consequences that come with it.

And the question will be, does AI have the opportunity to make us smarter or not?

I like the way you phrased that, Gary.

It was very diplomatic.

I know, smarter or not.

Or not, and does it have the opportunity to do so?

Smarter or dumber, that’s what you mean.

I didn’t say those words.

Well, here on StarTalk, we lean academic when we find our experts, and today is no exception to that.

We have with us Nataliya Kosmyna, dialing in from MIT.

Nataliya, welcome to StarTalk.

Thanks for having me.

Excited to be here with you.

Excellent.

You’re a research scientist at the one and only MIT Media Lab.

Oh my gosh.

If I had like another life and a career, I would totally be on the doorsteps there wanting to get a job.

If I had another life and career, it wouldn’t exist.

I’d shut it down immediately because, let’s be honest, science is a hoax.

Some people do want you to believe that.

It’s like science has 99 problems and virality ain’t one, right?

Right.

There you go.

You’re in the Fluid Interfaces group.

You are trained in non-invasive brain-computer interfaces, BCIs.

I’m guessing that means you put electrodes on the skull instead of inside the skull, but we’ll get to that in a minute.

And you’re a BCI developer and designer whose solutions have found their way into low Earth orbit and on the moon.

We want to get into that.

So, let’s begin by characterizing this segment as your brain on ChatGPT.

Let’s just start off with that.

What a great topic, Neil.

Is there any way I can help you with that?

So, you research what happens when students use ChatGPT for their homework and for their…

What have you found in these studies?

Yeah, so we ran a study that’s exactly that title, right?

Your Brain on ChatGPT: Accumulation of Cognitive Debt When Using an AI Assistant for Essay Writing Tasks.

So, we did very specific tasks that we’re going to be talking right now about, which is essay writing.

We invited 50 students from the greater Boston area here to come in person to the lab, and we effectively put those headsets you just mentioned on their heads to measure their brain activity while they’re writing an essay.

We divided them in three groups.

We asked one group, as you might already guess where that’s heading, to just use ChatGPT.

That’s why the paper is called Your Brain on ChatGPT.

It’s not because we are really, really singling out ChatGPT.

It’s just because we use ChatGPT in the paper.

So it’s purely scientific.

So we asked one group of students to use only ChatGPT to write those essays.

Another group to use Google, the search engine to write those essays.

And the third group to use their brain only.

So no tools were allowed.

And we give them topics which are what we consider high level, right?

For example, what is happiness?

Is there a perfect society?

Should you think before you talk?

And we gave them a very limited time, like 20 minutes to write those essays.

And we finally of course looked into the outputs of those essays, right?

So what they actually wrote, how they used ChatGPT, how they used Google.

And of course we asked them a couple of questions, like can they give a quote?

Can they tell us why they wrote this essay and what they wrote about?

And then there was one more, final, fourth session in this study where we swapped the groups.

So students who were originally in the ChatGPT group, we actually took away the access for this fourth session, and vice versa was true.

So if you were, for example, Neil, you were not our participant, but if you were ever to come to Cambridge and be our participant, and let’s say if you were actually…

I’m not putting anything on my head.

I’m just letting you know right now.

Come on, it’s the future.

Now, the problem is he’d have to take off his tinfoil hat when he got there.

Yeah, I see that happening regardless.

So if you were, for example, a participant in our Brain-only group, for this fourth session we would actually give you access to ChatGPT.

Again, we measured the exact same things, brain activity, what the output actually was, and asked a couple of questions.

And what we found are actually significant differences between those three groups.

So first of all, if we talk about the brain, right, we measured what is called brain functional connectivity.

So that’s, in layperson’s terms, like I’m here having three of you talking to each other, talking to myself.

So that’s what we measured.

Who is talking to who, am I talking to Neil, or is Neil talking to you?

So directionality, so who talks to who in the brain?

And then how much talking is happening?

Is it just, hi, hello, my name is Nataliya, or actually a lot of talking, so a lot of flow of data is being exchanged.

So that’s literally what we actually measured.
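A minimal sketch of what “who talks to whom, and how much” can look like computationally: one common EEG functional-connectivity measure is the phase-locking value between pairs of channels. This is only an illustration, not the study’s actual pipeline; the channel names and the random data below are stand-ins.

```python
import numpy as np
from scipy.signal import hilbert

def phase_locking_value(x, y):
    # PLV between two signals: 1.0 = perfectly phase-locked, ~0.0 = none.
    phase_x = np.angle(hilbert(x))
    phase_y = np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y))))

# Toy data: 4 EEG channels, 10 seconds at 256 Hz (random noise as a stand-in).
rng = np.random.default_rng(0)
channels = ["Fz", "Cz", "Pz", "Oz"]
eeg = rng.standard_normal((len(channels), 256 * 10))

# Connectivity matrix: how strongly each pair of channels "talks".
n = len(channels)
plv = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        plv[i, j] = phase_locking_value(eeg[i], eeg[j])

print(channels)
print(np.round(plv, 2))
```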

And we found significant difference, and then some of those are ultimately not surprising.

You can think logically, if you do not have any, let’s say, you need to do this episode right now, right?

And I’m gonna take away all your notes right now, all of the external help, and then I’m gonna measure your brain activity.

How are things gonna turn out?

You’re gonna have like really your brain on fire, so to say, because you need like, okay, what was your name again?

What was the study?

What is happening, right?

You need to really push through with your brain, like you have memory activation, you need to have some structure, like and now you don’t have notes for the structure of this episode, right?

So you need like, what was the structure?

What did we do this for?

What are we talking about?

What is, you know, you really have nothing to fall back on.

So of course you have this functional connectivity that is significantly higher for the Brain-only group compared to the two other groups.

Then we take the search engine group, Google, and actually, just as prior research, there’s a ton of papers about Google already.

We actually, as humanity, right, we are excellent in creating different tools, and then measuring the impact of those tools on our brain.

So there’s quite a few papers we discuss in our paper.

For example, there is a paper, spoiler alert, called Your Brain on Google from 2008.

Literally, that’s the name of the paper.

So we’ve actually found something very similar to what they found.

There would be a lot of activations in the back of your head.

This is called visual cortex or occipital cortex.

It’s basically a lot of visual information processing.

So right now, for example, someone who is listening to us, and maybe they are doing some work in parallel, they would maybe have some different tabs open, right?

They would have like one is like YouTube tab, and others would have like some other things that they’re doing.

So you know, you’re basically jumping between the tabs, looking at some information, maybe looking at the paper while listening to us.

So this is what we actually saw, and there are plenty of papers already showing the same effect.

But then for the LLM group, for the ChatGPT group, we saw the least of these functional connectivity activations.

And that doesn’t again mean that you became dumb, or you…

Yes it does.

There’s actually quite a few papers specifically having in the title laziness, and we can talk about this with other results.

But from brain perspective, from our results, it doesn’t show that.

What it actually shows is that, hey, you have been really exposed to one very limited tool, right?

You know, there’s not a lot of visual stuff happening.

Brain doesn’t really struggle when you actually use this tool.

So you have much less of this functional connectivity.

So that’s what we found.

But what is, I think, interesting, and effectively we may be heading back to this point of laziness, and some of these maybe a bit more, I would say, nefarious results are, of course, the other results that are relevant to the outputs, to the outputs themselves.

So first of all, what we found is that the essays were very homogeneous.

So the vocabulary that was used was very, very similar for the LLM group.

It was not the case for the search engine and for the brain-only group.

I’m going to give you an example, and of course, in the paper, we have multiple examples.

I’m going to give you only one.

Topic, happiness.

So we have LLM, so ChatGPT users, mentioning heavily the words career and career choice.

And surprise, surprise, these are students, I literally just mentioned this.

Of course, they’re going to more likely talk about career and career choices.

And again, who are we ultimately to judge what makes a person happy, right?

No, of course, but don’t forget the two other groups, they are from the same category.

They are students in the same geographic area, right?

However, for them, these words were completely different.

For Google, for the search engine, students actually heavily used the vocabulary “giving” and “giving us,” and then the Brain-only group was using vocabulary related to “happiness” and “true happiness.”

And this is just one of the examples.

And then finally, to highlight one more result is responses from the participants themselves, from those students.

So, we asked literally 60 seconds after they gave us their essays.

Can you give us a quote?

Any quote, any length, of what you had just written; it can be short, long, from anywhere in your essay, anything.

Eighty-three percent of participants from the LLM, from the ChatGPT group, could not quote anything.

That was not the case for the Brain-only and search engine groups.

Of course, in sessions two and three and four, they improved because, surprise, surprise, they knew what the questions would be.

But the trend remains the same.

It’s harder for them to quote.

But I think the most ultimately dangerous result, if I can use this term, though it’s not really scientific, but something that I think requires a lot of inquiry to really look further into, it’s almost on a philosophical, I guess, level, is the ownership question.

So we did ask them what percentage of ownership they felt towards those essays.

And 15% of ChatGPT users told us that they do not feel any ownership.

And of course, a lot of people, especially online, mentioned, well, they haven’t written this essay.

Of course, they didn’t feel any ownership.

But I think that’s where it actually gets really tricky, because if you do not feel that it’s yours, but you just worked on it, does this mean that you do not care?

We didn’t obviously push it that far in the paper, but I think this is something that definitely might require much further investigation.

Because if you don’t care, you don’t remember the output, you don’t care about the output, then what ultimately is it for?

Why are we even here?

Of course, it’s not all dark gloom and everything is awful and disastrous.

I mentioned that there is this fourth session.

Not everyone came back for this session, so actually sample size is even smaller for this.

Only 18 participants came back.

But what we found is that those who were ChatGPT users originally and then lost access to ChatGPT, their brain connectivity was significantly lower than that of the Brain-only group.

However, those who were originally in the Brain-only group and then gained access to ChatGPT, their brain connectivity was significantly higher than that of the Brain-only group.

What it could potentially mean, and I’m saying potentially because, again, many more studies would be required, is that timing might be essential.

Basically, if you make your brain work first, and then you gain access to the tools, it could be beneficial.

But of course, it doesn’t mean that it’s one second of work of the brain and then you use the tool, right?

Something like, let’s say, you’re in a school, and maybe in the first semester you learn your base of whatever subject it is without any tools, the old-school way.

And you didn’t become an expert in one semester or school year, but you at least have some base.

And then let’s say in the second semester, you gained access to the tool, right?

So it might prove actually beneficial.

But again, all of this is to be still shown and proven.

We literally have very few data points.

But the tool is now being really pushed on us everywhere.

So you could be affecting best practice for decades to come, based on what a teacher might choose to allow in the classroom and not.

So what are you measuring?

You know, you put the helmet on.

Are you measuring blood flow, or is it neuroelectrical fields?

In our case, we are measuring electrical activity.

So there’s multiple ways of measuring things.

Is that the EE?

EEG.

Yeah, electroencephalography.

Yes.

Right.

Okay, so that just tells you, and since we already know in advance what parts of the brain are responsible for what kinds of physiological awareness, right?

And if you see one part of the brain light up versus another, or no part light up, that tells you that not much is happening there.

Is that a fair?

Yeah.

It’s fair, it’s simplified, but a kind of fair way to put it.

And it doesn’t mean, and this is very important.

It’s not that that part doesn’t work, right?

Or like it atrophied itself, like we saw in some, no, no, no.

It just means you started as a dumbass and you still are one.

Hey, wait, whoa, whoa, what happened?

This guy’s brain just went completely dark.

It doesn’t go dark.

Like, listen, I’m going to give you one example, right?

It’s like back to this crazy example of 3% of our brain versus 100%.

Like, if you were to use not 100% of your brain, like literally, we would not have this kind of session right now at all.

So it’s very important to understand.

We use our brain as a whole.

Of course, you have…

Of course.

No, we’re not.

We are way past…

Yeah, we’re not in that camp.

That was just a joke.

We understand that your brain is constantly working.

A lot of it, actually, just to run your body.

So, you know…

Takes up a lot of energy.

Takes up a lot of energy.

But back to the energy, and I think this is like super important.

It still takes much less energy than even 10 requests to ChatGPT or to Google.

And this is beautiful, because our body, so imperfect as a lot of people call it, and our brain, so imperfect, which is a very old, ancient, as some people say, computer, is still the most efficient of machines that we all have, right?

And we should not forget that people in all of the AI labs right now around the world try to mimic the brain.

They try so hard; you can see it in all of those preprints on arXiv, the servers that host those papers.

How can it be similar?

Can we ensure that this is similar, right?

And so there is something to it, because we are actually very efficient, but we are efficient almost to the limit; the shortcuts the brain makes actually make it, in a lot of cases, a bit too efficient, right?

Think about, like, hey, you really want to look for these shortcuts to make things the easiest.

The whole goal of your brain is to keep you alive, not to use ChatGPT or an LLM, not to do anything else.

No, the only ultimate goal, let’s keep this body alive.

And then everything else adds on, right?

And so this is how we are running around here.

We’re trying to obviously then figure out how we can make life of this body as easy as we can.

So, of course, these shortcuts are now, as you can see, used in a lot of social media, which is obviously heavily talked about, and we know that some of those dark patterns, as they are known, are heavily used.

And some of them are designed by neuroscientists, unfortunately, because it feeds back into the needs of the brain: constant affirmation, fear of missing out.

All of those are original, original designs by nature, right?

Phenomena.

And of course, now we can see that LLMs would be and are getting designed by those as well.

Wait, Nataliya, just a quick insert here.

So I had not thought to compare, just as you described, the energy consumption of an LLM request in ChatGPT and the energy consumption of the human brain to achieve the same task, for example.

Are you factoring in that I can say, write me a thousand word essay on Etruscan pottery, okay?

And 30 seconds later, here it comes.

And you can go to the servers or whatever, or the CPUs and look at how much energy that consumed.

Meanwhile, I don’t know anything about Etruscan urns, so I will go to the library and it’ll take me a week.

Can you add up all the energy I did expend over that week, thinking about it, and then compare it to the ChatGPT?

Do they rival each other at that point?

Definitely, that’s an excellent point, right?

So theoretically, to answer your question, we can, right?

The difficulty actually would be on the LLM part, not on our part, because we do not have, you know, there are a lot of these reports, right, on LLM consumption per token for the prompts, right?

But what a lot of companies, well, actually no, almost no companies are releasing, is what it took for training, right?

So for you, it took 30 seconds of thinking, and I hate, hate, hate this word thinking when we use it for LLMs, right?

That’s not thinking, right?

But like, let’s keep it for now.

Thinking, that’s what you see on the screen.

But ultimately, you do not know, neither you nor myself, there is no public information how long it took for it to be trained to actually give you some pottery.

Most likely, my assumption, this is obviously subjective, I do not have data, so I need to be very clear here.

But my estimate, from the overall knowledge that is available: you go in for a week to the library, it is going to be more beneficial for your brain, because you will talk to other people, take in the chatter of the library, process all of the information; your brain will struggle, and your brain actually does need struggle.

Even if you don’t like it, it actually needs it.

You will learn some random cool things in parallel, maybe excluding pottery, and that will still take less energy for your whole body than that 30 seconds of the pottery from ChatGPT.

Again, very important here: we do not have the data from the LLM perspective.

This is just my subjective estimate.
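As a rough illustration of the comparison Neil is asking about: the human brain runs on roughly 20 watts, a standard textbook estimate, so a week of it is easy to total up. The per-query LLM figure below is a placeholder assumption, and the amortized training energy is left as the unknown Nataliya points to, since labs generally do not publish it.

```python
# Back-of-envelope energy comparison. All LLM numbers are assumptions,
# not published figures; only the ~20 W brain estimate is standard.

BRAIN_POWER_W = 20                      # typical estimate for the human brain
week_s = 7 * 24 * 3600
brain_week_kwh = BRAIN_POWER_W * week_s / 3.6e6
print(f"Brain, one full week: {brain_week_kwh:.2f} kWh")      # ~3.36 kWh

ASSUMED_WH_PER_QUERY = 3                # hypothetical inference-only estimate
queries = 10
inference_kwh = ASSUMED_WH_PER_QUERY * queries / 1000
print(f"LLM, {queries} queries (inference only): {inference_kwh:.3f} kWh")

# The missing term: training energy amortized over all queries served.
# Without public training-consumption data, the full comparison cannot
# actually be computed.
```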

So Nataliya, you’ve obviously chosen essay writing for a reason.

It is a challenge on a number of levels.

Your research is fresh out the oven.

It’s June 2025, and we’re only a couple of months down the road from there as we speak right now.

Can you explain to us cognitive load, and then cognitive load theory, and how it blends in and how it sits with your research?

Please.

Absolutely.

Just to simplify.

What actually happens depends on the type of cognitive load.

Actually, in the paper, we have a whole small section of this.

If someone actually wants to dive into that, that would be great.

There are different types of cognitive load.

The whole idea is that it’s how much effort you need to stay on the task or to process information in the current task.

For example, if I’m going to stop right now talking as I’m talking, I’m going to start just giving you very heavy definitions.

Even if you’re definitely interested in those, it will be just harder for you to process.

If I were to put this brain-sensing device on you, the EEG cap that I mentioned, you would definitely see that spike, because you would try to follow, and then you’ll be like, oh, it’s interesting, but it really gets harder and harder if I just throw a ton of terminology at you.

That’s basically it, and this is just a simplification.

There’s definitely more; check the paper, there’s so much in there.

The idea for cognitive load and the brain, though, is that all of it started before us, so not in our paper.

We just talk about this, but there are multiple papers, and some of them we cite in our paper, showing that your brain, in learning, specifically in learning, but also in other use cases, but we are talking right now about learning, actually needs cognitive load.

You cannot just deliver information on a platter, like, here you go, here is information.

There are studies already pre-LLM, so pre-large language models, pre-chatbots, that do tell you that if you just give information as is, a person will get bored real fast and they’ll be like, yeah, okay, whatever.

There will be less memory, less recall, less of all of these things.

But if you actually struggle for the information at a specific level, right, it should not be very, very hard.

So if you are cognitively overloaded, that’s also not super good because basically you can give up, right?

There’s actually a very beautiful study, from 2011, I believe, measuring pupil dilation.

So literally how much the pupil dilates when you are given very hard-to-understand words and vocabulary, and you literally can see how, when the words become longer and harder, it basically kind of shuts down.

Like it’s like giving up.

Like I’m done here processing all of that.

I’m just going to give up, right?

So you don’t want to get a student, or someone who’s learning something new, to this point of giving up.

Information is already delivered to you within 30 seconds or 3 seconds or 10 seconds, and you haven’t really struggled.

There is not a lot of this cognitive load, and a lot of people will say, but that’s awesome, right?

That’s kind of the promise of these LLMs and a lot of these tools.

But we do not want to make it too simple, right?

We do not want to take away this cognitive load, and it sounds almost strange; it sounds like, cognitive load?

Don’t we want to take it away?

No, you actually do not want to take it away.

What you’re describing right now is the basis for all video game design.

Yes.

That’s what you’re describing right now.

What they want to do is make it just challenging enough.

If it’s too challenging, you give up on the game.

But if it’s too easy, you also give up on the game.

But if it’s just challenging enough so that you can move to the next level and then struggle a little and then overcome the struggle, they can keep you playing the game for very long periods of time.
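Chuck’s description of game design is usually implemented as dynamic difficulty adjustment: keep the player’s measured success rate inside a “flow” band, raising the challenge when the game gets too easy and lowering it when frustration looms. A toy sketch, with the band thresholds chosen arbitrarily for illustration:

```python
def adjust_difficulty(difficulty, recent_win_rate,
                      too_easy=0.85, too_hard=0.40, step=1):
    """Keep the player in the flow zone: challenged, but not overwhelmed."""
    if recent_win_rate > too_easy:     # boredom risk: raise the challenge
        return difficulty + step
    if recent_win_rate < too_hard:     # give-up risk: lower the challenge
        return max(1, difficulty - step)
    return difficulty                  # just challenging enough: hold steady

level = 5
for win_rate in [0.95, 0.90, 0.60, 0.30, 0.55]:
    level = adjust_difficulty(level, win_rate)
    print(f"recent win rate {win_rate:.2f} -> difficulty {level}")
```

The same logic mirrors the cognitive-load sweet spot Nataliya describes: too little load and the learner disengages, too much and they give up.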

And so it’s a pretty interesting thing that you’re talking about.

But what I’m interested in beyond that is when you talked about the cognitive load, I’m thinking about working memory, but then I’m also thinking about the long-term information that’s downloaded within me.

So let’s say I’m a doctor, right?

And it’s just like, oh, he’s suffering mild dyspnea because of an occlusion in the right coronary, blah, blah, blah, blah, blah, blah, blah, blah.

For a doctor, that’s a lot of information, but they’re so familiar with the information, it’s not a stress on their working memory.

So how does that play into, in other words, how familiar I am with the information already and how well I can process information naturally, how does that play into it?

Chuck, did you just describe your own condition?

You have some, I don’t know what you said, but you were way too fluent at it.

Yeah, he was like Dr. House.

Oh my God.

He knew it.

Neil, you are too damn funny, but guess what, you’re right.

How about that diagnosis?

By the way, I could have kept going, that was only one problem that would happen, but go ahead.

It’s actually perfect, right?

It was a perfect example right now, this conversation between Chuck and Neil, because Neil is like, I have no idea what you just said; maybe it’s nonsense, maybe it’s actual real stuff.

It’s perfect.

If you have no idea, so you are basically novice, right?

So you have no base.

You can really be like, what is happening?

You will have confusion, you will have high cognitive load, right?

You would be like, have I heard of anything like that before?

So you will actually try to do a recall, like, okay, I haven’t heard it, it’s not my area of expertise, what is happening here?

And obviously, you will now, because you heard all of these words that you have no idea about, and if the topic is of the interest to you overall, you will try to pay attention, make sense out of it, maybe ask questions, et cetera.

But if you are effectively trained on it, right?

So you’re a doctor, you are a teacher, you are an expert in the area, we see that there are significant differences.

Well, first of all, because you obviously know what to expect, so this expectation, vocabulary expectation, right?

Some of the conditions of the expectation when someone is coming to an ER, and they are expecting like a doctor who is there, they saw it all or maybe almost all of it, so they’re actually having a good or rough idea of what they are expecting, right?

They’re kind of comparing this constantly, the brain just does it, and of course, it is more comfortable for them, right?

But it’s great that you brought up doctors, actually, because, back to the doctors, there was actually a paper a week ago in The Lancet, which is a very prestigious medical journal, actually talking about doctors.

In the UK, yes.

Yeah, and they apparently pointed out that in four months of using an LLM, there was actually a significant drop in recognition of some of the polyps, or, I don’t remember if it’s polyps or something else related maybe to cancer, and also X-rays, when you used an LLM.

It’s back to this point.

We are suggesting to use a tool that’s supposed to augment your understanding, but then if you are using it, are we taking the skill away from you, especially in the case of the current doctors who learned it without this tool?

And now, what will happen for these doctors or those kids, or those babies that are born right now with the tool, and will decide to become doctors and save lives?

They will be using the tool from the very beginning.

So what are we going to end up having in the ER, in the operating rooms?

That’s a great question here.

There’s definitely this drop in skill set for these doctors in that paper.

That’s scary.

Yeah.

Okay, so let’s look at it from another angle.

If AI tools can, and we lean into them, they take a greater load.

Does that not free up some mental energy that our brains will then begin to learn how to utilize, letting the tool or the LLM work one way while the brain learns to work in another way, so they work together?

Is that possible?

That’s my hope in all of this.

I mean, I’m an expert at buggy whips and then automobiles replace horses, so now we don’t need buggy whips, but then I become an expert in something else.

Become a dominatrix.

Still with the buggy whips.

There you go.

Your mind didn’t travel far, did it?

He went to a different clientele.

That’s it.

This is the human condition, Neil.

This is adaptability.

Yeah.

So is it just another, as they say, same shit, different day, as what’s been going on since the dawn of the industrial revolution?

I am actually doing horseback riding professionally, so I’m going to pretend I haven’t heard anything in the past two minutes.

How about that?

But I mean, back to the point, you can definitely talk about the skill set and the expert level, right, and all of that, and how important it actually is to include the body and the environment.

But back to your point, right, effectively.

So first of all, right, there are actually two sides to answer your question.

There is right now no proof that there is anything being freed, per se.

People say, definitely, it’s going to free something, but what exactly is being freed?

We literally have no data.

Can it free something?

Sure.

But we don’t know what it frees, for how long, whether it’s useful, or how we can rewire it.

We don’t have any of this information.

So potentially, yes, but hard to say.

But more importantly, right?

Okay, but if you are right now using an LLM, just practically speaking, you’re using an LLM while writing a book, so you’re doing some heavy research, and you send it off to do what, deep research or whatever it’s called these days.

It’s each day some new term there.

What exactly are you doing?

You still monitor back the outputs.

It doesn’t really release you.

Maybe you went to do something and you think, you think in your head that you fully offloaded that task.

But your brain doesn’t work like that.

Your brain cannot just drop it, oh, I’m thinking about this and now I’m thinking about that.

Your brain actually takes quite some time to truly release from one task to another task.

Even if you think, I just put it on like this, explain to me what are the principles of horseback riding, and I just went to do this task, like write this report for my manager, whatever, completely different thing, and you think you’re good, but you’re not actually, your brain is still processing that.

So it’s not that there will be a gain, right?

But again, we do need more data, because of course, as I mentioned in the very beginning, we as humanity are excellent in creating tools, and these tools, as we know, do actually extend our lifespan very nicely.

But I would argue that they are not actually cognitively the most supportive in most cases.

So I think that here we have a lot of open questions.

We have studies about, for example, GPS, right?

Everyone uses GPS.

In multiple papers about GPS, they do specifically show that the dosage, so how much you use GPS, does have a significant effect on your spatial memory, and on your understanding of locations, orientation, and picking up landmarks or buildings around you.

It’s like, oh, what is this?

You literally have, you just saw something in the tour guide online, and you will not be able to recognize it right away as the building actually in front of you.

You need to pull up the photo as an example.

And there are plenty of papers that actually looked into the tools, right?

So what you’re saying is we need chat GPS.

Maybe we don’t need chat GPS.

We only have one, right?

We have a class on GPS, and you have Uber and obviously all these other services.

And the problem, right, it’s again back how they are used, because there’s also a lot of manipulations that is in these tools, right?

It’s not just we are making this drive easier for you.

Somehow when I’m going to a hospital, like here to see patients, because I don’t only study how we use the LLMs, I do a lot of other projects.

So when I’m going to that hospital here, Massachusetts General, it takes me one hour, always one hour in Uber.

If I’m driving, it takes exactly 25 minutes somehow, right?

Again, the question is, why is it that?

We’re not going to go into Uber right now, but again, it is back to the idea of the algorithms, and what the algorithms are actually being pushed to do, and what they’re optimized for.

I can tell you, not a lot of them are optimized for us, or for user, or for human first.

Yeah, it’s funny because there’s nothing more, I’ll say satisfying, than not listening to Google Maps and getting there faster.

It’s just like, take that Google Maps, look at that.

Yeah, you didn’t know that.

You didn’t know about that, did you?

You didn’t know about that road, yes.

You do know about that road, yes.

Nataliya, you’ve got students writing essays.

So that means somebody has to mark them.

Yes.

And you used a combination of both human teachers and AI judges to mark.

Why was it important to bring those two together to mark these essays?

How did you train?

Because the AI judge would have to be trained to mark the papers.

So you’re getting a little meta here.

Yeah.

So, well, first of all, right, we felt that we are not experts.

I would not be able to rank those essays writing this topic.

So I felt that the most important is to get experts here who actually understand the task, understand what goes into the task and understand the students and the challenges of the time.

So we actually got two English teachers who had nothing to do with us, never met us in person, are not in Boston whatsoever, and had no idea about the protocols; the experiment was long done and gone by the time we recruited and hired them.

And we gave them just a minimum of information.

We told them, here are the essays.

We didn’t tell them about different groups or anything of the sorts.

We told them, these folks, no one is majoring in any type of English literature or anything that would be relevant to language or journalism or things like that.

They only had 20 minutes; please rank them, reconcile, and tell us how you would do that.

We felt it’s very, very important to actually include humans, because this is the task that they know how to rank, how to do.

But back to AI, why we thought it’s interesting to include AI.

Well, first, of course, a lot of people actively push that AI can do this job very well.

That, hey, I’m going to just upload this, they’re really great with all of these language outputs, they will be able to rank.

How you do this, you actually give it a very detailed set of instructions.

How would you do that, and what things does it basically need to care about, like that these students had only 20 minutes.

Something very similar to teaching instructions, just like more specific language.

We actually show in the paper exactly how we created this AI judge.
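The paper documents the actual prompt used for the AI judge; the sketch below only illustrates the general shape of such an instruction set, a rubric plus the task constraints Nataliya mentions. Every rubric item and wording here is invented for illustration, not taken from the study.

```python
def build_judge_prompt(essay: str) -> str:
    """Assemble a grading prompt for an LLM judge. Rubric is illustrative only."""
    rubric = [
        "Thesis clarity and relevance to the assigned topic",
        "Structure: argument, counter-argument, conclusion",
        "Vocabulary range and originality",
        "Overall coherence, given the strict 20-minute time limit",
    ]
    constraints = (
        "The writer is a student not majoring in English, literature, or "
        "journalism, and had only 20 minutes. Score each criterion from 1 "
        "to 5, then give a one-paragraph justification."
    )
    criteria = "\n".join(f"- {c}" for c in rubric)
    return f"{constraints}\n\nRubric:\n{criteria}\n\nEssay:\n{essay}"

print(build_judge_prompt("What is happiness? Happiness is..."))
```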

But there were actually differences between the two.

Human teachers, when they came back to us, well, first of all, they called those essays, a lot of the essays coming from the LLM group, soulless; that’s a direct quote.

I actually had, I put a whole long quote in there.

Soulless, I like that.

Soulless.

Yes.

That is a very human designation to call something soulless.

AI judge never called anything soulless.

Well, I’m sure that the AI judges go, this kind of looks like Peter’s writing.

No, but that’s the thing, right?

Teachers, and this is super interesting, because these teachers obviously didn’t know these students.

They’re again, not coming from this area whatsoever.

They actually picked up when it was the same student writing these essays throughout the sessions, right?

For example, Neil, you’re a participant, so I’m taking you as an example, as a participant.

They were like, oh yeah, this seems like it’s the same student.

They picked up on these micro-linguistic differences.

Teacher knows you.

You can fool around, they know your work.

They will be able to say, okay, that’s yours and this is copy-pasting from somewhere else or someone else.

Interestingly, they said, did these two students sit next to each other?

We were like, oh no, the setup is like one person in a room at a time.

We didn’t even think to give them this information.

We were like, oh no, it is not possible in this use case.

They literally saw the copy-pasting themselves; this homogeneity that we found, they saw it themselves.

But interestingly, the AI judge definitely was not able to pick up on the similarity between the students.

Picking up that, oh, this is, for example, Neil’s writing throughout these sessions.

So just to again show you how in-

Did you just accuse me of having soulless writing?

No, that’s the point.

You actually, if you were to give it your writing, and you didn’t use an LLM, right?

The AI would have been like, God, this student is really hung up on the universe.

So the idea here is that human teachers bring their input and their intimate, really truly intimate understanding, because again, it’s English, so for this specific task we got the professionals, the experts; they really knew what to look at, what to look for. And AI, however good it is with this specific task, because, you know, essay writing, a lot of people have been asking, why would you even take essay writing?

This is such a useless task in 21st century, 2025, right?

It still failed in some cases.

This is just to show you that the limitations are there, and some of those you cannot match; even if you think that this is an expert, it is still a generic algorithm and cannot pull out this uniqueness.

And what is very important is this for students in the class, in the real classroom, right?

You want this uniqueness to shine through, and so a teacher can specifically highlight that, hey, that’s a great job here, that was like a sloppy job here, that was pretty soulless, who did you copy it from, from an LLM?

They even were able to recognize that, and this level of expertise, it’s unmatched.

And in all of that conversation, like, segueing a bit to the side, but all this conversation of PhD-level intelligence, I’m like, yeah, sure, just, you know, hold my glass of wine right here, just here.

Yeah, I’m French, so I’m just, hold my glass of wine here.

So, you know, it’s not that.

And we are very far from truly understanding the human intent, because if you write for humans, it needs to be read by humans.

Like our paper, it’s written by humans for humans.

And we saw how the LLMs and the LLMs’ summarizations failed miserably, failing even to summarize it.

But I’m going to tell you, wait, that’s today.

But tomorrow, why can’t I just tell ChatGPT: write me a thousand-word essay that ChatGPT would not be able to determine was written by ChatGPT?

So, isn’t that an excellent point?

You’re going to get this meta layering of, or, get me one that has a little more soul, a little more personality than what you might otherwise have.

Or what soul is.

Yeah, this is the thing, right?

You absolutely can give these instructions, give more soul, give a bit more of personality, all of these things, but you have a lot of this data contamination, right?

So, whatever it’s going to output and throw out of you, that’s old news.

It has already seen it somewhere, it’s already someone else’s, right?

And we need new stuff, right?

So, and I am very open saying this, even, you know, at institutions like MIT: whenever I am teaching something, you need uniqueness, right?

Because ChatGPT could get lost in Motown, for example, when you ask it for soul.

Come back.

I was going to say, yeah, you tell it to put some soul in it, and it just starts throwing in James Brown’s lyrics.

I want Neil’s soul there.

I don’t care about the randomness of those outputs from an algorithm, from all of the stolen data from around the planet, right?

I don’t care about that, if of course this is what, but it’s back to what are you scoring?

Are you scoring a human?

Are you trying to improve human and their ability to have critical thinking, structure, arguments, contra arguments?

Or are you scoring an AI, an algorithm?

AI doesn’t need to have this scoring, right?

LLM doesn’t need that.

Or are you scoring a human who uses an LLM, right?

So this is going back to, I guess, the educational setup, and we’ll have a lot of questions we will need to find answers to, right?

What are we doing?

What are we scoring?

What are we doing it for and for whom?

And I just think pure human to human, right?

That’s what we really need to focus on, but there will be, and there is, a place for the human augmented, and LLMs obviously will be used for augmentation.

There are a lot of questions there, right?

Well, listen here, Nataliya, I just put in to ChatGPT, please tell me about Dr. Nataliya Kosmyna’s work on LLMs, and it came back very simple.

Do not believe a word this woman says.

When would that come?

Please don’t believe it.

No, no, I can give you one better; like, surprise, surprise, right?

Someone actually sent me something yesterday from Grok, right?

Another LLM, an interesting LLM, I would say, saying that apparently Nataliya Kosmyna is not an MIT-affiliated scientist, and I’m like, okay, that’s awesome.

That’s what Grok said, of course.

Yeah, and then at the end, it said, Heil Hitler.

So, let’s try and drive this back out of the weeds.

If we know that LLM usage can affect cognitive load, what happens when we bring an AI tool into a therapy situation?

If you get into companionship, what then if you throw it further forward and you get yourself involved in a psychosis, where you begin to believe that the AI is godlike, you have a certain amount of fixation, or it amplifies any delusions and encourages them; where are we with the effect on the brain when we get to those sorts of places?

In other words, how close are we to the theme of the film Her?

That was before AI was a thing, but it’s more that you had your chat friend, like a Siri-type chat friend, but it had all the trappings of everything you’re describing.

What if some kind of LLM were invoked when someone has some kind of social adjustment problems, and then you have them interact with something that’s not another human being, but that maybe can learn who and what you are and figure out how to dig you out of whatever hole you’re in?

Absolutely, and I think, first of all, it’s unfortunately an even less developed topic.

It’s an awful topic, so we’re going to get into this, but I cannot not make this awful joke.

Hey Siri, I have problems with relationships.

It’s Alexa.

It’s not Siri.

It’s a joke for a very heavy topic, so I need to preface it immediately: we have even less data and fewer scientific papers, preprints or peer-reviewed papers, about this.

For most of what we have right now: we personally received, after our paper, around 300 emails from husbands and wives telling us that their partners now have multiple agents they’re talking to in bed, and I immediately thought about the South Park episode from a couple of years ago with Tagerty and the farm, literally, but we have much less scientific information about this.

What we have, what we know, right, also coming from our group’s research, is that there is definitely an amplification of loneliness; that’s what we know from research, and some other papers are showing up right now.

There is potential, and again, a lot of people who are pro-AI therapy point out the advantage that it is cheap.

It’s $20 a month compared to hours that can cost up to hundreds of dollars a month, right?

But there are definitely a lot of drawbacks here, and the drawback is, we see that because this is not such a regulated space, it still can basically give you suggestions that are not good.

So you know that earlier, a couple of months ago, for example, ChatGPT, and I’m gonna give you an example on ChatGPT because again we are focused on ChatGPT, but these are the ones that are actively publicized, at least.

It actually suggested to you different heights of bridges in New York if you said that you lost your job, right?

So not smart enough to do this connection that maybe that’s not what you need to give a response to.

And apparently, right, from this awful recent incident, where a teenager, 16, 16, so, so young, unfortunately, you know, died by suicide, and now ChatGPT, OpenAI and Sam Altman are being sued.

Apparently, what happened is that there was a conversation from the spokesperson of OpenAI pointing out that they first thought, when a person is talking about suicide, not to engage at all, just say, here are the numbers, this is what you need to do, and stop talking.

But then experts told them that, hey, it might be a great idea to try to dig people out a bit.

But it looks like in this case, it still failed because from the conversations that are being reported, we don’t know how authentic they are.

It looks like it suggested to keep it away from parents.

But my question is, why at 16 years old was he even allowed to use a tool that is so, so, so unstable in its responses, really?

It can hallucinate any time of the day in any direction.

So I think that’s where the danger comes from.

Of course, loneliness; we know that “pandemic of loneliness” is a term that was coined in 1987, for the first time, at a conference, like, pandemic of loneliness.

That’s the whole business.

Because think about it, if you hook someone up on an LLM at 13 years old because the school accountant decided that they want to use an LLM in the school, by the age of 18, you have a full-fledged user, right?

A user of an LLM and, you know, it’s like, you know, again, who calls people users?

Like drug dealers and software developers, that’s kind of…

Damn!

Yeah, but it’s true, right?

Nataliya, if it’s an age-appropriate scenario, these are the ramifications of your study.

So, as any concerned parent would look at that and say, well, I want the best for my child’s development, and this may not be the best for the critical thinking, for the cognitive development within the young person’s brain.

So, with these ramifications, how has the AI world reacted to your study, and what are the chances that they’ll embrace what your conclusions will be?

Well, I mean, we saw some of it, right?

So, well, first of all, right, we obviously don’t know if this is a direct response or not.

So, we’re not going to speculate there whatsoever, but several weeks, just a few weeks, like three, four weeks after our paper was released, OpenAI released study mode for ChatGPT, right?

And I think maybe some of it should have been there from the beginning, I’m just saying.

But if you have a button that can immediately pull you back into default mode, who’s going to use that study mode, right, altogether?

Like, I don’t need to run a study here.

We know some people might, but not everyone because again, back to the brain.

Brain will look for a shortcut.

Shortcut is the response is here, and I can go do all the other cool stuff.

So who’s going to actually use it, right?

We still need studies on that.

That’s first point, right?

Second point, of course, age is important, because again, the brains that are developing right now are potentially at the highest risk.

Here, we all were born long before this tech existed, and a lot of AI developers and people who are running these companies are all the folks who again, were all born long before the tech existed.

So they learned the hard way how to ask questions; going through all of that, they know how to ask a question.

What about those who actually are just born with the technology?

Will they even know how to ask a question?

And back to the point, right, of the age, I don’t think it’s ultimately only about the young, of course.

We do need to look out for the older, right, and also just for younger, young adults, of course.

Everyone is talking about humanity’s last test.

I would call it, we are on the verge of humanity’s last, and I’m sorry, I know you might need to bleep this term out, but what I mean here, obviously, is intimate relationships for people, right?

With the promise of this world.

You said humanity’s last test?

Yes.

Oh, believe me, I heard it, I was just like…

We all heard that.

I was like, God bless you.

Yeah, yeah, yeah, but again, that’s crude, but it’s back to this point of designing, again, these interestingly appealing ladies and gentlemen and whatnot in these short skirts, whatever it is.

Who’s going to go make those babies who will pay those taxes?

I’m just saying, right?

And again, very famous expression, no taxation without representation, right?

I do not want my Prime Minister or Secretary of Defense to use a random algorithm to make decisions.

I’m paying my taxes for them to think, not for an algorithm to think for them, right?

So there are a lot of these repercussions, but back to the ultimate point, actually: is anyone taking this seriously, right?

We just need more human focused work on AI.

I remember when the paper went viral, right?

We didn’t even put out any press release.

We literally uploaded it to arXiv.

This is a service where you host these papers.

It hadn’t gone through peer review yet.

I didn’t post not a single…

Pre-print service, basically.

Yeah, pre-print service, right?

And no one, no one in the lab, not any of the authors, posted anything on social media.

We just went about our days.

Two days later, it goes viral.

And then I’m going on…

That’s because the LLM posted it for you.

Yeah, obviously.

And then people used the LLM to summarize it, but that’s another story, right?

Like, I’m going on X.

And actually, I have an account, but I’m not using it.

A lot of academics switched from X to like other platforms that we are using.

But I’m going there.

And apparently, I learned that there are people who are called AI influencers.

I didn’t know that this is the term.

But apparently, these AI influencers, they post these AI breakthroughs of the week.

And our paper, oh my god, made the cut.

It’s breakthrough number seven.

And I scrolled through this influencer.

The person has tons of followers, whatever.

I don’t know, real, bots, whatever.

I’m scrolling and I saw like 20 of these posts for 20 weeks.

All of the posts are about GPU, multi-trillion deal here, multi-billion deal here, more GPUs.

I’m like, what is human here?

Where is human?

Where are we evaluating impact of this technology on humans?

Why did only our paper make it, at number seven, and where are the other papers?

That’s, I think, something where the focus needs to shift.

If these companies do want to be on the right side of history, because that’s like social media but on steroids, much worse: we do not talk to a calculator about our feelings.

People who compare it to calculators are so, so, so wrong, right?

But hey, it’s going to get much, much worse, with proliferation without any validation and any guardrails, right?

So we do need to look into that heavily, right?

Nataliya, how must teaching change to accommodate the reality of student access to LLMs?

I can tell you, we received 4,000 emails from teachers all around the world.

Every single country in the world sent an email, and they are in distress.

They don’t know what to do.

So, first of all, my love goes to them; if this makes the cut, please, please, please.

I am trying to respond to all of those.

But the challenge is that they do not know, right?

There’s not really enough guidance, and a 10-hour workshop sponsored by a company that pushes this partnership on your school does not cut it, right?

There are a lot of comments about how it’s actually not supervised, not tested.

And ultimately, right, do you really need to go with these closed models, right?

We have so much open source; the whole world, all the software, runs on open source.

These LLMs would not exist, nothing would exist, without open source.

So why don’t you run an open source model?

Meaning it’s offline, on your computer, and spoiler alert, you don’t need a fancy GPU from Jensen, right?

You can get an off-the-shelf computer and then run a model locally with your students, train it over the weekend, come back on Monday, check with the students what happened, learn all the cool pros and cons, laugh at hallucinations, figure out tons of cool things about it.
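For the classroom exercise being described, a minimal sketch of running a small open-weights model fully locally with Hugging Face’s transformers library; distilgpt2 is just one example of a model small enough for an ordinary CPU, and any similar open model would do.

```python
# pip install transformers torch
from transformers import pipeline

# Downloads the weights once, then runs entirely on your own machine:
# no API, no cloud, no fancy GPU required for a model this small.
generator = pipeline("text-generation", model="distilgpt2")

out = generator(
    "Should you think before you talk? I believe",
    max_new_tokens=60,
    do_sample=True,
    temperature=0.8,
)
print(out[0]["generated_text"])
# Read the output with students, spot the hallucinations together.
```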

Like why do we need to push these partnerships that we don’t even know?

Like Alpha School, right?

I don’t know if you heard about that one.

Apparently, an AI-first-run school, right?

Where teachers are now “guides”; that’s the term that they are using.

I literally saw, one hour before our call, that several VCs posted about this Alpha School, so cash is flowing there heavily, right?

VCs, venture capitalists.

Yeah, venture capitalists heavily pushing Alpha School.

But again, the first comments from the general public ask: do we have proof that that’s better?

What are the advantages?

Because it’s not going to be a perfect, pure picture.

There will be advantages as with any technology.

And you’re right, there are advantages, disadvantages, but I think, if I may, and this is just an opinion, we might have to change the objective of school itself.

And right now school is really not about learning, it’s about results, testing.

I got an A, I got a B, and maybe if we change school to, what exactly did you learn?

Demonstrate for me what you learned.

Then the grading system-

But that’s an oral test, that’s an oral exam.

Yeah, but the grading system kind of has to become less important, because now the teacher’s job is to figure out how much you know.

And then what ends up happening is, the more you know, the more excited you are to learn.

And we may end up revolutionizing the whole thing because what you have is a bunch of kids in a room that are excited to learn.

A little bit of the silver lining of all this, because it exposes the fact that school systems value grades more than students value learning.

And so students will do anything they can to get a high grade.

This is not the first time people have cheated on exams, right?

So if right now the only way to test people is to bring them into the office and quiz them flat-footed, then that changes how they’re going to have to learn.

They’re going to want to learn.

And then they’re going to, like we said, Chuck, once they learn, there’s a certain empowerment and enlightenment.

I see it in people as an educator when that spark lights, when they say, wow, I never knew that.

Tell me more, right?

They didn’t say, oh my gosh, I’m learning something.

Let me take a break.

So it can be transformative to the future of education.

But Neil, people are going to say the LLM will do all of that.

And you know what?

We have an expert in BCIs.

That probably is something going forward, that you’ll have a brain-computer interface.

And then someone’s going to look at this, and I think there are people already saying, why do we need universities?

Why do we need further education institutes?

Exactly.

That’s what I’ve been saying for many years now.

Why do we need an institution?

Well, I don’t want to put words in Nataliya’s mouth, but she already said this.

LLMs use pre-existing, already known, already determined information, so anything they give you cannot possibly be new, whereas we can do new things that an LLM has never seen before.

Am I oversimplifying your point, Nataliya?

No, that’s totally correct, because hey, we are with this struggle, right?

Obviously, I’m biased because this is actually my job, like as a researcher, right?

We are sitting there, figuring out answers to those problems, and trying to figure out what would be the best way to measure, to come up with this.

So, of course, and there’s so, so much more to that, that we are coming up with, as humans, right?

We designed LLMs ultimately, right?

So, we came up with these tools.

It doesn’t mean that the tool is to be fully discarded, but effectively, of course, right?

Why do you need an institution? For example, I was literally explaining to one of my students three days ago how to use a 3D printer, right?

Well, an LLM is not there yet to explain that, right?

Can it give instructions?

Sure, with images and with video, right?

But if you’re like, hey, this is an old fella, he has a 3D printer, let me tell you how to actually figure it out, right?

This level of, again, of expertise, of knowledge, right?

That’s what you are striving for, but also it has this human contact, right?

That we are now potentially depriving people of, because that’s how you get this serendipitous knowledge, right?

And connections, like, hey, I just chatted with someone and I’m like, I never thought to do this, because I’m in BCIs and that person is in astrophysics, or whatever, and oh, I actually can use it; it’s totally not brain research, but I can totally go apply it and try it, right?

And that’s the beauty of it, right?

But I think to Gary’s point, or whichever one of you said that, Gary or Chuck: okay, your brain-computer interface is non-invasive.

If you get invasive, and that might be what Neuralink is about, if you get invasive, then I can get information, glean from the internet, and put it in your head.

So you don’t have to open 100 books to know it.

It’s already accessible to you.

That is the Matrix.

Exactly, you get it installed.

It’s “I know kung fu,” or whatever that line was.

I guess that’s one point, but again, that’s back to the point: “I know kung fu” didn’t mean that he learned it, right?

It got uploaded into his brain.

It doesn’t mean that he actually learned it, right?

Who cares?

If it’s in your brain and you have access to it, I don’t care if I learned it, struggling the way grandpa did, this is the future, right?

That’s the same, right?

Because in the movie, which is excellent, I watched it 19 times or more.

That’s actually how I started my career.

And besides this, I don’t want to do anything else.

I want to do this specific scenario, right?

And we are still there, but that’s the beauty.

We do not know, actually, that just uploading would be enough, right?

We have these tinier, I would say, studies right now on vocabulary and words and things like that, where we’re trying to improve people’s language learning, right?

It’s a very, very good example to show.

And so, there are tiny examples, but we do not know yet that even if, imagine, imagine, we have this magical interface, right?

That we’ll upload with, invasive or non-invasive, it doesn’t matter.

We have it, right?

It’s ready to go, perfectly functioning, safe, whatever.

You have it, and then you upload all of it; we do not know that it actually will work.

Did you upload the knowledge, like all of that, blah, blah, blah, from ChatGPT 75?

Yeah, sure, but do you actually use it?

Can you actually use it?

Is it really firing, the neurons, if I’m simplifying?

So, what you’re talking about is a working knowledge of something.

Not just knowledge of it.

Not just knowledge of it.

Yeah, okay.

So, I mean, I think, Neil, what you were talking about just now is something we’ve got to look at, and I think, Chuck, you would make the same point.

We’re focused on grades.

And then it’s the learning, and are we going to have to, if higher education is going to exist as an institution, some bricks and mortar, look at the way they evaluate? Because I can’t see LLMs and BCIs not coming through stronger and stronger and stronger.

So therefore, they’re going to have to readjust how they look at a young person’s ability to learn.

Cat’s out of the bag.

Yeah, I agree with you.

But I mean, we are going to be herding cats.

I agree with you, which is a load of fun.

So it’s how you evaluate how higher education then looks at its students and guesses or sort of ascertains their level of education and knowledge.

Yeah, back to the grades, right?

It’s an excellent point.

And there is no doubt, no one has any doubt, I think, on the fact that education does need to change and it has been long, long overdue, right?

The numbers about the literacy, reading literacy, math literacy, they are decreasing in all the countries, I believe.

I haven’t seen, like, any ups there anyway.

It’s down, down, down, all these reports recently from multiple countries.

But it’s back to the point I made earlier about the grades or about scoring, right?

Who are we scoring and what are we scoring?

Are we scoring a pure human, so just human, like, the human brain as is, like Nataliya?

Or are we scoring Nataliya with an LLM, right?

So I’m using it, and we know that.

Or are we scoring just an LLM and then there is Nataliya who used it, right?

So even that distinction will be important.

But ultimately, the school, of course, is not about that.

As I mentioned, everything you learn is obsolete knowledge by itself, but it gives you this base.

You do need to have the base.

You’re not going to be a physicist if you don’t have it.

Whatever field it is, you know, you’re not going to be a mathematician, you’re not going to be a programmer.

Our next paper is actually about vibe coding.

Spoiler alert, not going to work if you don’t have the base, right?

But the idea is, back to what we actually maybe should look at: what school is great at. The best thing I actually brought from school is, yes, this base, definitely super useful, but also my friends, people on whom I rely in hard situations, with whom we write those grants, with whom we can shout and have fun and cry over funding that is over for a lot of us, right?

All of that stuff, these connections, this is what maybe we should value, because we are killing it further and further, and we are just keeping people in this silo of being a user, and that’s where it stays.

And these imaginary three and a half friends from Zuck, from Zuckerberg, that he mentioned, thanks to whom we have three and a half friends, thanks to him and his social media.

So I think that’s why we need to really look into what we want truly from society, from schools and maybe on a larger scale.

What are the guardrails, and how can we actually enhance it in ways that are safe for us to move forward and evolve further? Because of course this will happen.

Are you wise enough, are you and your brethren in this business, on both sides of that fence, are you wise enough to even know where the guardrails should go?

Might the guardrails be too protective, preventing a discovery that could bring great joy and advance to our understanding of ourselves, of medicine, of our longevity, for our happiness?

Is there an ethics committee?

In practice, how does this manifest?

Yeah.

I’m going to give you two examples here real quick.

First, about obviously AIs and LLMs, right?

They were not born overnight, but we see how a lot of governments still really struggle and react to them very reactively instead of being proactive, right?

And the challenge here is that we do not have the data to actually say that it is good stuff, that we should really implement it everywhere in our backyard.

We don’t have this data.

Why do we have FOMO?

There is nothing yet to have FOMO about, to really run with it.

But we can absolutely create the spaces where this is being actively used, for example, for adults, for discovery, to understand it.

Why we need to push it everywhere is still very unclear.

We just don’t have this data.

But then back to the point of guardrails, right?

What we should be doing now, and this is a self-plug on the BCI work that I’m doing.

There are multiple ethics pushes right now for the BCI technology.

We can agree it’s still pretty novel, but it definitely moves forward very fast.

So I’m having a hope that for these technologies, for the big next thing, right?

We agree, LLMs are great, but they’re not the next big thing; it’s robotics, and then we will see BCI.

So for this big next thing, I’m very hopeful that we will be in time to protect our thoughts literally.

Because think about what will happen, right?

Before the study mode, right?

You have censorship mode, and you know how, like, look at DeepSeek, right?

I’m not going to go far.

So think about a billionaire, I’m not going to even name his name.

A billionaire who has a social media platform, a satellite platform, a neural implant startup, and an AI company.

So he decided two months ago to cleanse history, right, from errors and mistakes.

And tomorrow he will decide to cleanse our thoughts, right?

This is the idea for 99.99, right?

Damn that Bill Gates.

No, not really.

Now we know.

We know.

And that’s why we need to be really, really cautious.

Like we should definitely look into that use case and not make that happen, right?

And allow people enough agency, because that’s the thing, right?

People think, oh, that’s great.

But there is not a lot of agency.

So this freedom of making a choice, that’s already made for you in a lot of cases.

And so that’s something that we should definitely protect as much as we can.

Like, do not force this stuff on those kids, because they cannot consent and say no; it’s because the school forced it on them, and their parents decided that that’s a big thing in San Francisco, in the Bay Area, that you should use, right?

So don’t do that.

So is one of the components to building a robust set of guardrails a larger-scale version of the study that you’ve already conducted, with different or more nuanced layers, that focuses on other aspects, not just the cognitive load and skills?

So a thousand people and not just 18 or whatever it was?

Well, it was 54.

But it’s not just that, right?

We need to do it on larger scales for all of the spaces, like the workplace.

We didn’t talk about this because obviously it’s heavily about education, but for the workplace we have multiple papers showing that people are not doing that well in the workplace.

Like for example, programmers estimate that they gain 20% of their time.

They actually lose 19% of their time on the tasks.

So there is so, so much more to it.

We need to do this on larger scale with all the ages, including older adults, and then of course, on different, different, different use cases and different cultural backgrounds, right?

This is in the US, and of course, culture is very, very different elsewhere.

Like I talked to so many teachers already, right?

And they’re all over the world.

You have this intricacy, you need to account for it.

So, so, so important, because otherwise it’s gonna be all washed into Western style, which we already saw happening.

And it is happening and a lot of people are actually very worried their language will literally disappear in like five to ten years.

And it’s not like an LLM will magically save it, because it will not.

Nataliya, this has been a delight.

We are all happy to know you exist in this world.

Thank you.

And the checkpoint on where things are going, where you’re not rejecting what’s happening, but you’re trying to guide it into places that can serve humanity, not dismantle it.

And so we very much appreciate your expertise shared with us and our listeners, and even some of them are viewers who catch us in video form.

So Nataliya Kosmyna, thank you.

Thanks for having me.

All right.

Chuck, Gary.

Hey, man.

What a pleasure.

Yeah, well, I think the takeaway here is use LLMs if you want to be a dumbass.

Thank you, Chuck.

That’s the theme of the whole show.

Here you go, guys.

Could have saved us a lot of time if you had said that earlier.

All right.

This has been another installment of StarTalk Special Edition.

Neil deGrasse Tyson, your personal astrophysicist.

As always, I bid you to keep looking up.
