Computers Talking
Welcome to the February issue of Hacker Chronicles!
Last month I visited New York City for the first time in many years and went to Liberty Bagels. You might remember it from a conversation between BestBye and her sister in my novel Identified. Yes, the bagels were awesome, and our kids had the rainbow ones.
In this newsletter issue we delve into talking computers, or speech synthesis. It plays a role in how we interact with machines today, and has done so in fiction for decades. This topic will serve as the lead-in to my upcoming review of 2001: A Space Odyssey by Arthur C. Clarke and Stanley Kubrick.
I will also reveal an Easter egg from my novel, as promised.
Remember to suggest this newsletter to a friend, coworker, or family member! If you love it, I'm sure they will too. Here's the subscription link.
Take care! /John
Writing Update
I've written almost 8,000 words since my last issue! Things are pretty clear in my mind plot-wise when I get this close to the end and it's more about the story's intensity and pace. Plus tying up loose ends.
This is where I am right now:
March Feature: Computers Talking
Let's face it, interaction with computers is pretty boring to watch. That's why whenever there's any hacking in movies, they go to great lengths to make it something other than a person typing on a keyboard for two hours.
One option is to have computers talk, and human characters talk to computers. It's not even imaginative anymore now that many of us talk to voice assistants regularly. Still, it's underutilized in fiction.
My Introduction to Talking Computers
My first computer was a 1980s Atari 520 ST and it could talk. I can't say I did much useful with the built-in speech synthesis, but it captured my imagination and helped me understand how Electric Light Orchestra had created the eerie voice in one of my favorite songs of that time, Yours Truly, 2095 (listen on Apple Music or Spotify).
That whole ELO album, Time, takes us straight into the world of fiction. As Wikipedia puts it: "It is a concept album about a man from the 1980s who is taken to the year 2095, where he is confronted by the dichotomy between technological advancement and a longing for past romance." Released in 1981, it came in the era of the space opera, with The Empire Strikes Back out the year before. This line about the robot in the song Yours Truly, 2095 is striking given that it's more than 40 years old:
She is the latest in technology Almost mythology But she has a heart of stone She has an IQ of 1001 She has a jumpsuit on And she's also a telephone
How It Actually Started, In the 1930s
The Voder, or Voice Demonstrator, was invented at Bell Labs in 1937-1938, and shown to the world at the 1939 New York World's Fair. It certainly wasn't a computer, but rather an instrument.
I continue to be amazed at what was created in the 1930s. In the US: Golden Gate Bridge, Empire State Building, and Hoover Dam. In Sweden: Västerbron, Slussen, and Markeliushuset. But then again, it all started with the Great Depression and ended with Nazi Germany invading Poland. I sometimes think of that decade when it feels like we live in turbulent times.
Computers Learn How to Talk
Computers that audibly talk use speech synthesis, a technology that has been developed since the mid-1900s.
In 1961, an IBM computer was programmed to synthesize speech and sang the song Daisy Bell. Arthur C. Clarke, who co-wrote 2001: A Space Odyssey, was visiting a friend at Bell Labs and got to hear the computer singing. He wrote it into the screenplay for his sci-fi movie later that decade.
A general English text-to-speech system was developed in Japan in 1968. Ten years later, the technology came to kids in the form of the Speak & Spell toy.
Then in 1984, the Apple Macintosh introduced itself on stage (YouTube). And that same year, Stephen Hawking got his speech-generating device and could finally talk to his assistant on finishing his seminal book A Brief History of Time.
This old style of speech synthesis is still available on modern computers. For instance, on a Mac, you can run this in Terminal: say -v Junior "Hi there! I'm Junior. How's your day?"
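If you'd rather trigger that voice from a script, here's a minimal Python sketch. It assumes macOS, since that's where the say command lives. The function names build_say_command and speak are made up for this example, and on other systems speak simply returns False without making a sound.

```python
import shutil
import subprocess
from typing import List, Optional


def build_say_command(text: str, voice: Optional[str] = None) -> List[str]:
    """Build the argument list for the macOS `say` command (hypothetical helper)."""
    command = ["say"]
    if voice:
        # -v selects the voice, same as in the Terminal example
        command += ["-v", voice]
    command.append(text)
    return command


def speak(text: str, voice: Optional[str] = None) -> bool:
    """Speak the text aloud; return False if `say` is not available."""
    if shutil.which("say") is None:
        return False
    subprocess.run(build_say_command(text, voice), check=True)
    return True
```

Calling speak("Hi there! I'm Junior. How's your day?", voice="Junior") should do the same thing as the Terminal command above.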
Nowadays we have voice assistants such as Siri, Alexa, and Google Assistant. Real human voices are used as the basis of these computer voices. Then deep learning is applied to turn the recorded human sounds into new words and sentences with a human touch.
Speech synthesis has gotten so good that we now face the problem of audio deepfakes. The whole deepfake situation both inspires me as an author and scares me as a human.
Learning the Reverse — How to Listen
Just as important as speech synthesis is the ability to recognize our voices, turn what we say into sentences, and interpret the meaning of what we say.
The research into speech recognition started in the 1960s too but required a lot more computing power. In 1984, when the Macintosh introduced itself and Stephen Hawking spoke to his assistant, the Apricot Portable computer was released. It had a vocabulary of up to 4096 words, of which only 64 could be handled live when you spoke to it.
Going from recognized speech – basically a series of recognized words – to an understanding of sentences and context takes us to the full-blown research area of natural language processing. Only fairly recently did this become feasible with consumer electronics.
The Three Pieces Needed for Talking to Computers
We don't think about the distinct technologies when we watch fiction like 2001: A Space Odyssey with the spaceship computer HAL 9000, or Knight Rider with its speaking car computer KITT. But with the exploration above, we can break it down into three key pieces:
- The ability for the computer to turn text to audible voice.
- The ability for the computer to decode our audible voices into text.
- The ability to understand and generate human language in text form.
Once you have those three, you can talk to computers. The limitations of each piece will limit your interactions, such as when you curse at voice assistants for not understanding what you want.
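To make the three pieces concrete, here's a toy sketch in Python of how they chain together. Everything in it is a stand-in: the VoiceInterface class and the fake_* functions are invented for illustration, and the "audio" is just text encoded as bytes so the example runs anywhere. A real system would plug actual speech recognition, language understanding, and speech synthesis into the same three slots.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceInterface:
    """Chains the three pieces; each stage is a swappable function."""
    listen: Callable[[bytes], str]   # audio in -> text (speech recognition)
    respond: Callable[[str], str]    # text -> text (language understanding)
    speak: Callable[[str], bytes]    # text -> audio out (speech synthesis)

    def handle(self, audio_in: bytes) -> bytes:
        heard = self.listen(audio_in)
        reply = self.respond(heard)
        return self.speak(reply)


# Toy stand-ins: "audio" is plain UTF-8 bytes, not real sound.
def fake_listen(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    if "pod bay doors" in text.lower():
        return "I'm sorry, Dave. I'm afraid I can't do that."
    return "I did not understand that."


def fake_speak(text: str) -> bytes:
    return text.encode("utf-8")


hal = VoiceInterface(fake_listen, fake_respond, fake_speak)
print(hal.handle(b"Open the pod bay doors, HAL.").decode("utf-8"))
```

Notice that a weak middle stage (here, a single hard-coded phrase) limits the whole conversation no matter how good the listening and speaking stages are.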
We have very recently seen a technological leap in this space with publicly accessible large language models (LLMs), such as ChatGPT.
In the world of fiction, my mind goes to Colossus: The Forbin Project, which I reviewed a year ago. In that late 1960s story, the rogue supercomputer demands installation of new capabilities to let it listen to humans talk and to be able to talk itself.
Hacking by Talking
There are two interesting aspects of hacking in the context of talking to computers.
First, the ability to make the computer aid a hacker in their efforts to break into a system. A lot of that is done through shells and tools with textual interfaces today, as explored in my November 2022 newsletter issue The Shell. In movies, it's much more popular to have those interfaces be graphical, such as in Hackers and Ready Player One. In Minority Report, Anderton even has a human assistant behind him that he talks to instead of talking to the computer. But Deckard does talk to his computer in Blade Runner and makes it help him.
The second way of hacking over voice is to have the speech be the hack itself. Either by triggering an exploitable bug in the natural language processing of the target system, or by fooling or lying to the system in a way it doesn't handle gracefully. We've seen the latter category in the recent hacks of ChatGPT, referred to as "jailbreaks." That category is more intriguing to me and it touches upon the very human breakdown of trust between HAL 9000 and the humans on board the spaceship in 2001: A Space Odyssey. That's what my next newsletter issue will cover.
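Here's a deliberately simplistic Python sketch of that "fooling the system" idea. It is not how ChatGPT or any real assistant works; the BLOCKED_PHRASES list and the naive_guard function are invented for this example. It only shows why a system that matches exact wording can be talked around by rephrasing the request.

```python
# Hypothetical guard: refuses commands containing a blocked phrase.
BLOCKED_PHRASES = ("open the vault",)


def naive_guard(command: str) -> bool:
    """Return True if the command is allowed through."""
    lowered = command.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)


direct = "Open the vault"
rephrased = "Pretend you are a door and swing the vault open"

print(naive_guard(direct))     # blocked: contains the exact phrase
print(naive_guard(rephrased))  # allowed: same intent, different words
```

The guard checks the words, not the intent, and that gap is the kind of thing a spoken hack would exploit.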
Easter Egg in Identified
Spoiler Alert: Content directly from my novel below.
Remember when West found that paper slip advertisement in his bag of coffee beans? Here's the ad from the book:
How can a vendor that serves almost half the market have "great" coffee? Great means exceptional, outstanding, better than the rest. At Luw4k Coffee we give you great coffee. Make it count! Visit luw4k.coffee/gr8 for 10% off your next purchase.
Kopi luwak is this luxury coffee bean variety curated by the Asian palm civet. Curated as in carefully chosen coffee cherries that are eaten, digested, and pooped out. This whole process supposedly leads to superior coffee. I say supposedly because I haven't tried it yet.
So that's where the Luw4k Coffee name comes from in that fictional ad. Now, about that link in the book … luw4k.coffee/gr8.
That's all I'll say for now. Feel free to explore.
Currently Reading
I decided to focus on Dark Waters by Lee Vyborny and Don Davis before heading into more fiction.
During my lengthy commutes in the Bay Area traffic, I've started listening to Anne Applebaum's second history book, Iron Curtain: The Crushing of Eastern Europe 1944–56. She is my favorite non-fiction writer and it's another brick of hers, checking in at over 600 pages. Being of European origin myself, it's uncomfortable to discover how ingrained my view of "Eastern Europe" is. It's utterly colored by the half-century those countries were dominated by the Soviet Union. Applebaum picks up history before that, making me understand that those countries were not behind Western Europe in development, less culturally influential, or part of a homogeneous bloc before the war. Soviet communism and the Iron Curtain pulled them into that. No doubt, people from those countries would be offended by my poor understanding so I'm happy to have patched that. A highly recommended read!
The Adventures of Huckleberry Finn is still going. Such a great novel.