Conversational interfaces are a bit of a meme. Every couple of years a shiny new AI development emerges and people in tech go “This is it! The next computing paradigm is here! We’ll only use natural language going forward!”. But then nothing actually changes and we continue using computers the way we always have, until the debate resurfaces a few years later.
Reminded me of this piece I wrote shortly after Alexa "happened":
"New input devices don’t kill their predecessors, they stack on top of them. Voice won’t kill touchscreens. Touchscreens didn’t kill the mouse. The mouse didn’t kill the command line. Analysts yearn for a simple narrative where the birth of every new technology instantly heralds the death of the previous one, but interfaces are inherently multimodal. The more the merrier. Every new technology starts in a new underserved niche and slowly expands until it finds all the areas it’s best suited for."
https://www.intercom.com/blog/benefits-of-voice-ui/
ha! great post! ahead of its time!
This essay really made me think. I saved so many highlights. Here are a few of my thoughts, but would love to keep the conversation going.
1. I couldn't agree more about the low information density of text. LLM input is a sum of prompt + context. So in order to decrease the magnitude of input necessary, the AI entrypoint or affordance has to live where the intention (prompt) and background information (context) are known or implied. The problem with chat interfaces is that the interaction with the AI lives in a completely different application, or part of the application, which means the user has to very explicitly say what they want (prompt) and what they know (context). So this naturally lends itself towards more integrated/embedded AI experiences, where the user can invoke specific AI tasks with the click of a button, because the button is exactly where the user was already working (this might be the equivalent of keyboard shortcuts for AI).
2. There's a bit of irony in this piece: saying that AI is a complementary interface implies that it is not native or fundamental. But most successful AI-native applications (like Cursor) treat AI with the same patterns that you describe – as complementary entrypoints to the main application that the user still interacts with.
AKA, the transformation from VSCode (non-AI-native) to Cursor (AI-native) requires that VSCode exists and isn't going anywhere.
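To make point 1 concrete, here's a rough sketch of the difference (all names here are hypothetical, and `call_llm` is just a stand-in for a real model call):

```python
# Hypothetical illustration of detached chat vs. an embedded AI entrypoint.
# `call_llm`, `chat`, and `summarize_button` are made-up names, not a real API.

def call_llm(prompt: str, context: str = "") -> str:
    # Stand-in for an actual model call.
    return f"[response to: {prompt} | context: {len(context)} chars]"

# Detached chat: the user must spell out both intent and background themselves.
def chat(user_message: str) -> str:
    return call_llm(prompt=user_message)  # no implicit context

# Embedded entrypoint: the button lives next to the user's work, so both the
# intention (prompt) and the background information (context) come for free.
def summarize_button(document_text: str, selection: str) -> str:
    return call_llm(
        prompt=f"Summarize this selection: {selection}",  # implied intent
        context=document_text,                            # implied context
    )

print(summarize_button("Full draft of my essay...", "the second paragraph"))
```

The user clicks once; the surrounding application supplies everything the chat user would have had to type out by hand.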
Anyways, great read :-D
Nice piece and couldn't agree more. While AI has become ingrained in our day-to-day workflows, it is still just another tool, which we can choose to use or not, rather than something like a PC, without which any work we want to do today is incomplete. For AI to go from being just another tool to an essential part of our work, something without which we cannot do anything, there's a long way to go.
Great piece, sound thesis. This def resonates: "This wasn’t just a one-sided “Hey, can you write a few paragraphs about x” prompt. It felt like a genuine, in-depth conversation and exchange of ideas with a true thought partner. Even weeks later, I’m still amazed at how well it worked. It was one of those rare, magical moments where software makes you feel like you’re living in the future."
Some really fascinating thoughts here. It’s interesting to see how this could connect with your thoughts on calendars, and layered productivity.
One of the big barriers for AI using voice commands at the moment is the ubiquity it needs. It has to be "always on", always listening: it needs to understand context and be ready when you need it.
As you say, at the OS level.
With fears around privacy, and what companies do when they listen all the time (see the long-running claims about Facebook ads based on listening), it's a real tricky one to navigate.
Great essay Julian.