How Do You Take Your Coffee?

At the risk of this blog being devoted to a few people that have influenced my thinking and subsequently my work, I’ve had enough people ask me about the core ideas that it’s worth discussing some of them here. Before we can even go there, it’s worth distilling a couple of meta points that play into the thinking I put into my work as a VUI designer and have tried to capture in my course at UW.

One of the most important ones is my broken-record-like insistence that people use the products, services, and devices they’re working on in situ, especially for speech interactions.  I work on experiences and devices primarily for home use, and I can easily anticipate my partner’s voice ringing out, telling me to stop talking to robots, when I say something ungrammatical, offensive, or otherwise weird. She knows it’s my habit to hypothesize about and test anything I stumble upon.

Another is to get lots and lots of diverse feedback, especially from people who don’t agree with you, sound like you, or look like you.  This comes in many forms, direct and indirect, but especially in the concept and design phases it means bouncing ideas off people and testing prototypes by simply saying stuff.

People often refer to this as hallway testing, but it’s not limited to that. I’ve become well-accustomed to feeling uncomfortable while I say something out loud to my headphones on a sidewalk or on a bus, using bystanders’ stares (or their not paying attention at all) as indirect input for the problems I’m working on.

Doing both, and doing them a lot, is important for two reasons:

  1. How speech “feels” when you’re speaking and listening in an interaction is very difficult to separate from our interpretation of text.  Specifically, when you parse your own output while you’re producing it, your brain processes it differently than when you type it. That applies to producing the speech, to hearing your own production, and to hearing the subsequent response from the system. This is true even if you designed the system.
  2. It’s very difficult to distinguish between our limbic system’s “instinctual” responses, which are inseparable from our beliefs and worldview, and our frontal lobe functions, which we’ll generalize as “rational.”

It’s very, very difficult to distinguish between our beliefs and our rational thought.  If you find yourself nodding in agreement, you might need help. However sure you are that you’re being rational and not responding emotionally, recall that rationality is itself a worldview that not everyone agrees on.

Part of the scientific method’s reason for being is to divorce interpretation (objectivity) from beliefs (subjectivity) by removing the observer from the equation and introducing the concept of reproducibility.  If your experimental results are only credible because someone else can do the same experiment and get the same result, you’ve effectively created a mechanism to avoid your own bias to believe you’re right.

On the other hand, you must interpret the “feel” of language interactions without producing and parsing language, precisely because we’re capable of rational and voluntary reasoning. It’s how we write.  Voice designers I’ve known have tended to be readers and writers by nature, and readers and writers have honed the skills to craft a voice, tone, and take that add up to more than the sum of a bunch of glyphs on a page.

The ability to intentionally and deliberately create impressions and trigger beliefs is what we call “writing”; however, since it’s a composed process and not a natural one, we must effectively run it through the lizard brain to truly understand whether we’re on the right track.

I say “on the right track” because it’s of course likely that despite my efforts my worldview specifically excludes lots of ways of thinking, which I’m biased to believe I don’t do (paradoxically because I’m rational), so I can’t be the only one to test my work.  The more universal your feature or system needs to be, the more broadly you need to seek feedback and input on your work.

This has its limits, of course.  Asking everyone to participate in your science experiment for free is annoying and unethical, but most of the time it’s just talking to people, and talking to people is just humans being humans.

“How do you take your coffee?”

“To go.”

You’d be surprised how easy it can be to get on the right track (or off the wrong one) by asking someone a simple question and finding out their answer is neither what you’d say nor what you’d expect. This can become a vitally important part of your own design practice!

By developing both a sensitivity to your own parsing of speech experiences and the habit of not trusting it, you can create the conditions for stumbling on hidden truths more frequently.