Press 1 to check your balance, press 2 to pay your bill, press 3 to update contact information, press 4 …
Menus, when spoken, are boring at best, rage-inducing at worst.
Information architecture (IA) aims to minimize that frustration by offering clear options and organizing them so that most people can press 1, 1, 1, 1 and be done with their task.
However, it rarely goes so smoothly.
Oftentimes we are stuck on the phone, waiting through option 9 just to realize that option 2 was the best one for us. I believe it is human nature to want to hear the whole menu, just to make sure we aren't going down the wrong path. I also believe it is human nature to press the first button that's read, hoping a human comes along; to be completely blocked by a broken system and start over at square one; and then to wait through option 9 all over again to make sure we don't have to try the whole process a third time.
The visual organization of IA on the web is a much more enjoyable experience. As menus become standardized by industry and the ability to search across multiple levels improves, finding what you want online is often easy and requires very few tries.
And now we are discussing the IA of Alexa, Siri, Google Home, Microsoft Cortana, etc. These devices are much more intelligent than a landline, and aren’t going to ask you to press 1 for English, but how much exposure do these tools need to their user before they’re really efficient? And will the hardware even last that long?!
My personal use of Siri is almost more of a game than a true functionality. I think most of my time with her is spent listening to "I didn't quite get that" and asking her what 0 divided by 0 is to make friends laugh. However, there are instances where I am driving and she is the best option (Apple CarPlay often makes her the ONLY option). I ask her to play the "White Stripes" and she plays the "Night Lights," and I say, "hey, at least it's music." I try to respond to a text, and Siri somehow mistakes my speech for me wanting to close my texts. I ask to go to Chipotle, and she directs me on a 1,089-mile journey to Wisconsin, because I obviously meant that specific Chipotle, and not the one 8 minutes away.
I feel like maybe voice recognition was supposed to spend some time on training wheels that its makers forgot to put on.
I understand that some people might say "Hey, Siri" and "Okay, Google" are the training wheels, but I mean a little more than that.
Maybe voice recognition software should start asking, "Say 1 for the White Stripes; say 2 for the Night Lights."
As for the future, like the future future, I'm very excited to see how well voice recognition begins to work. And I wonder how many layers of legislation will be created in response to questions of privacy invasion and data collection.
If I’m watching an action movie, I don’t want the CIA asking why my Alexa was hearing “bomb” over and over. If I’m discussing purchasing a sous vide, I don’t want one to show up on my porch the next day, and if I’m having some intimate time with my partner, I definitely don’t want smooth jazz to start playing automatically.