Press 4 for “Funner Options”, and use our Facebook fan page!

This is brilliant. We can’t take credit for it, as some competitor apparently built this client’s system, but it’s a great idea. It would be easy to build the same thing in our own software.

As it says in some wall postings on the Facebook fan page for a popular candy bar:

“Hotline rocks! Thanks for being an organization that’s not afraid to show their sense of humor! I believe you just upped your popularity even more by doing that!”

“Loooooooooooooove the hotline more companies need to have a sense of humor this way!!”

“Love the hotline! Hilarious. Thanks for the laughs.”

“1-800-295-0051 OMG LOL!!!!! ok press 1 for english or 2 for spanish, then hit 4 for “funner options” then hit 7. you will be rolling on the floor laughing!”

Exactly. Dial that number, listen through the short advertisement of their product, and the address of this Facebook fan group. Then, wait through the menu that offers 1 or 2 for the language, but do nothing. After a short pause, it gives additional options. (An “Easter egg” hidden feature that nobody expects on a boring corporate product line, and that’s why it is fun.) …For Pig Latin, esspray orway aysay eethray. For a knock-knock joke and other funner stuff, press or say 4.

Way down into that inner “funner stuff” menu, which itself was pretty funny, option 7 tells the caller about the different kinds of cooties, and how to get rid of them. Well done, and quite entertaining!

Option 5 is pretty good, too: “Hear me give a noogie to the operator next to me”, and then it sounds like the two guys clowning in a call center.

As of this morning, that Facebook group has 214,000 fans. That’s 214,000 potential customers for their product, plus all their family and friends.

And, if these fans are spreading excitement about the IVR hotline by word of mouth (and by forwarding e-mails and Facebook statuses)…WOW! That’s where I heard about it: seeing a thing on the Internet from somebody I’ve never met…advertising an IVR system as “this is so much fun, you’ve gotta call it!”

Some observations:

Once, when I called it back to hear some of the other options, it didn’t give any of the extras. There could be several possible causes for that: (1) Maybe those were temporarily taken down? (2) Maybe the system is capturing Caller ID, and deliberately not playing the extras for subsequent calls: so, a caller won’t keep calling it back all day and running up the charges. But also, it forces the caller to use a different phone, which gives another opportunity to capture another Caller ID, store it into a database, do a reverse lookup, get a mailing address, and send out some promotional materials…. Clever!

I waited a while, and tried it from a different phone. On my first three attempts in that session, the system did not answer. It gave me a busy signal. Did the company get overwhelmed by the success of this application and its viral spread of enthusiasm? Didn’t they scale it big enough when they built it? What platform are they using that can’t handle all the traffic….?  That’s a problem: being ready for overwhelming success.

Eventually, I got through again, and it let me get to all the options each time. So, maybe they aren’t blocking multiple calls by Caller ID, after all…although it would be clever, and might become necessary.

The “press or say” stuff on the options is annoying, and this doesn’t really need to be a speech recognition system. It would be just as funny and useful if it were keypad-only (DTMF). But hey, it’s their money, and if they want to create more error-handling problems for themselves with this speech design, they are welcome to it. When some kid is playing this phone call on a cell phone’s speaker to amuse a friend, and they’re laughing, the laughing and other noises shouldn’t make it cut off the prompts.

The guy introducing the Spanish option obviously isn’t a Spanish speaker. That’s a demerit. They could have done better. He’s a good actor for the funny options, though: deadpan enough to mock other bad IVR systems and their cliches, but giving just enough twist to the delivery that the caller realizes it’s funny.

All around, it’s brilliant in generating traffic to advertise their product. If they’ve considered those other problems, they’ve done a great thing here: marketing to the approximately fourth-grade level, giving it some viral hooks for free publicity, and making it “funner” than everybody else’s boring hotlines.

Again, that phone number is: 1-800-295-0051

The Elements of Tuning

No matter how carefully you crafted your VUI design, or how diligently the design was implemented, or how thoroughly the implementation was tested, your application will need regular and careful tuning once deployed if your aim is to maintain a world-class, highly usable voice solution.

Tune up

To effectively tune your application, you should have at your disposal three sources of information: (1) Call Logs: which will enable you to identify patterns across calls (e.g., where are people hanging up), (2) Call Recordings: which will enable you to understand the nature of a problem (why are people hanging up?), and (3) Your callers: usually, you do this by assessing their level of satisfaction with the solution.


Here are the basic questions that need to be asked in order to begin tuning a voice application:


Where are people hanging up? A hang up prior to completion of a task is usually a sign of frustration. If the goal of your application is automation, your first tuning task is to identify such hang up spots in your application and understand why people are hanging up.


Where are people asking to be routed to an agent? If you have designed your application with the goal of empowering the caller, you must have provided the caller with the option to route to an agent. A caller actively asking to speak to an agent is a caller who has decided that the application is not successfully enabling them to serve themselves. This is especially true of callers who have engaged the application over several minutes of interaction and then decided to bail out.


Where are people saying the wrong thing? The aim here is to identify those spots in your application where no-match failures are significantly higher than the average or the expected. The remedy is to listen to the prompt the caller hears and then listen to what people are saying in response to that prompt. In such situations, adjust your application either by re-writing the prompt or by adding what callers are actually saying to the language the system is listening for.


Where are people not saying anything? These are the spots in your application where the caller goes quiet on you. This usually occurs because the prompt is confusing or the caller was asked for some information that they don’t have (or don’t have ready access to, such as a subscription ID or an account number). If the issue is a lack of clarity or an ambiguity, then re-craft your prompt (see Chapter 3). If the issue is a lack of readiness, then give the caller the time they need to retrieve the information, or suggest that they call back when they have it handy. Another strategy is to inform the caller at the very outset of the interaction that the subscription ID or the account number will be needed.


Where are people speaking too soon? At times, callers are impatient and speak sooner than they should, often missing crucial information or instructions. To remedy, either turn the barge-in setting off, or re-craft the wording of the prompt the caller is interrupting.


How noisy are the places your callers are calling from? When you listen to your recordings, pay attention to the noise level and how the noise is affecting the no-match error rates.


What options are people asking for? If you discover that 80% of your callers are checking their savings balance, then ask 100% of your callers if they are calling to check their savings balance. By definition, 80% of the time you will be right.


How are people feeling about the application? You can probably get a good sense of how people feel about the application by just listening to the tone of their voice in your call recordings.
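The log-based questions above (hang-ups, agent requests, no-matches, silences) all come down to the same arithmetic: count each outcome per prompt, and look for prompts that stand out from the overall rate. Here is a minimal sketch of that idea. The log format (one prompt name and one outcome per prompt played) and the outcome names are invented for this example; real call logs will look different.

```python
from collections import Counter, defaultdict

# Hypothetical log format: one (prompt_name, outcome) pair per prompt played.
# The outcome names are assumptions made up for this sketch.
PROBLEM_OUTCOMES = ("hangup", "agent_request", "no_match", "no_input")


def problem_rates(events):
    """Return {prompt: {outcome: rate}} for each problem outcome."""
    totals = Counter()
    by_prompt = defaultdict(Counter)
    for prompt, outcome in events:
        totals[prompt] += 1
        by_prompt[prompt][outcome] += 1
    return {
        prompt: {o: by_prompt[prompt][o] / totals[prompt] for o in PROBLEM_OUTCOMES}
        for prompt in totals
    }


def hot_spots(events, outcome, factor=2.0):
    """Prompts whose rate for `outcome` is at least `factor` times the overall rate."""
    if not events:
        return []
    overall = sum(1 for _, o in events if o == outcome) / len(events)
    if overall == 0:
        return []
    rates = problem_rates(events)
    return sorted(p for p, r in rates.items() if r[outcome] >= factor * overall)


events = [
    ("main_menu", "ok"), ("main_menu", "ok"),
    ("main_menu", "ok"), ("main_menu", "no_match"),
    ("account_number", "no_input"), ("account_number", "hangup"),
    ("account_number", "no_input"), ("account_number", "ok"),
]
print(hot_spots(events, "no_input"))
```

The counts only say where to look. The recordings still have to tell you why, so treat a flagged prompt as a listening assignment rather than a diagnosis. The factor of 2 is an arbitrary starting point.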

Caller ID - The Phone Cookie


We’ve all heard of Web cookies. They’re very common, have been around for years now, and can be used to achieve many results.  One in particular being personalization - remembering user preferences in anything from shopping carts to news and weather sites. They can even be used to push advertising to a user based on browsing activity.

Much like web cookies are used in websites, Caller ID can be used in IVR to achieve similar results. Again, personalization - offering custom menu options, anticipating why a caller is calling based on selections in previous calls, routing based on the area code, etc. This behavior in Web applications is very common, nearly ubiquitous. But why is it so uncommon in voice applications? Are most IVR designers living in the dark ages? Do they have limited imaginations? Are there too many barriers to adding personalization and their budgets make it prohibitive?

Perhaps the first 2 can’t be helped but the 3rd can. Enter Angel.com. We’ve always had the capability to add personalization to a voice application, using Caller ID as the ‘cookie’, all without any programming necessary. It all comes out of the box using standard Angel Voice Pages. Of course, if you wanted to use your CRM data to personalize the caller experience, again, based on Caller ID or any other identifier, we make that pretty simple as well. This is why it’s so easy for us to stand behind our mantra of ‘putting the caller first’ - because we make it exceptionally easy to do so.

Strategies for “Caller First” design: advertising over the phone?

So, your company wants to “upsell” some new product or service, and your marketing department wants to add an advertisement into the system that answers your phone.

The following example demonstrates some strategies for designing a “Caller First” experience that will delight your callers, while still slipping the marketing message into the flow if it is absolutely required.

Suppose the marketing department has drafted the following advertisement to be played as the greeting to all your callers:

“Welcome to our company.
If you or someone you know is thinking about quitting drinking, you should know our office is hosting a free, hour-long workshop that can show you a fresh approach to quitting.
It’s led by a quit drinking expert.
Plus you’ll meet an ex-drinker who quit with the help of a doctor-recommended treatment option and support.
We’d love to save you a seat, so be sure to ask the receptionist, or go to www.blahblah.com.
You might be asking yourself, “What’s going to be different about quitting this time?”
Well, for starters, several things.
No finger-pointing. No scary statistics. Just honest information…and it’s free.
You’ll even get your very own take-home materials to help jump-start your quit.
Others like you have found these workshops to be useful.
All you have to do is ask our office receptionist, or go to www.blahblah.com, and you’ll be taking an important first step toward planning your quit.”

Ummmm…woof!  If I’m the caller there, I’ve already hung up at “You might be asking yourself”, because “I’m asking myself” why I phoned this company that evidently doesn’t care about wasting my time. I’ll take my business elsewhere, if I can.

Reading that monologue aloud, it’s a 60-second message.  It’s repetitive and patronizing. The callers are already captive on the phone for a full minute before they get to say or do anything!  Do the callers really want to hear all of it?  Will they be paying attention to such a thing?  Do they listen to radio or TV ads, either, or just numb their minds until it’s over?  Is the phone really the right place for such a 60-second advertisement?

What about your callers who don’t want to hear any of it, because they were calling about something else, not about any interest in a quit-drinking workshop?  Should they have to listen to the whole thing, or any of it?

Let’s see what we can do about that. Rewrite it to be good IVR. This isn’t the radio, and we don’t have 60 seconds to burn.  Make every word matter!

“If you or someone you know is thinking about quitting drinking, you should know….”  First, “If you or someone you know is” sounds clunky, and some callers might think it’s grammatically incorrect. (Should it be “is” or “are”?)  Why go there?

“Thinking about quitting drinking”: that’s three “-ing” words jammed together.  Furthermore, the word “quit” comes up at least five or six more times in the message, and reasonably patient callers might get weary of hearing it.  Maybe they’ll quit this phone call.

“You should know our office is hosting…” Is “you should know” really the right way to lead this, when the meaning is actually “we want you to know that…”?  It’s generally not a good idea to tell people what they “should” know.

“It’s led by a quit drinking expert.”  All decent workshops are led by experts in their topic, supposedly, so why is there any need to say this?

Step back. We’re really trying to convey the information:

  1. Somebody reasonably qualified
  2. is leading a free workshop
  3. that lasts for one hour, and
  4. the workshop is about quitting the habit of drinking.

So, let’s blow away all the other hype, and convey that information directly and respectfully. “Perhaps you know someone who would like to quit drinking. Our office is hosting a free one-hour workshop about quitting that habit.”

The next information to convey is that some successful quitter will speak about the way a prescription treatment program helped him.  How about this, keeping each sentence short enough that the listener will be able to make sense of all the concepts?  “Part of the presentation is by an ex-drinker who successfully quit. He had the help of a prescription treatment and its support resources.”

The workshop is supposedly worthwhile, and its next big draw is that the participants get to take home printed materials that will help them quit drinking.  However, the drafted sentence sounds merely patronizing: “You’ll even get your very own take-home materials to help jump-start your quit.”  My very own? So I can “jump-start” myself into quitting?  Why not just tell me: “Our workshop will also have take-home materials to guide participants through the recommended process.” (That could still be improved further, perhaps, but let’s move on….)

How does the caller sign up?  By asking about it, or visiting a web site.  Let’s say so, directly: “To learn more about enrolling in this free workshop, ask our receptionist, or visit www.blahblah.com.”

Let’s put it all back together and time it, reading it aloud:

“Welcome to our company.
Perhaps you know someone who would like to quit drinking.
Our office is hosting a free one-hour workshop about quitting that habit.
Part of the presentation is by an ex-drinker who successfully quit.
He had the help of a prescription treatment and its support resources.
Our workshop will also have take-home materials to guide participants through the recommended process.
To learn more about enrolling in this free workshop, ask our receptionist, or visit www.blahblah.com.
(2 second pause so the caller can mentally process what was just said)
Now, I am transferring you to the receptionist.”

We are now down to 30 seconds instead of 60, and we have (hopefully!) conveyed all the important information in a well-organized manner.  The first sentence tells the caller to keep paying attention if there is any interest in quitting drinking.  That’s some improvement.

Still, what happens to the callers who really don’t care? Do we want to waste 30 seconds of their lives as they wait impatiently to get to the reason they called?

Let’s put in a keystroke control.  Play the full advertisement to only the callers who have some interest in attending the workshop: the callers who press 1 to hear all the details after a short teaser.  Nobody else needs to hear about the web site or the workshop’s syllabus, do they?

“Welcome to our company.
Perhaps you know someone who would like to quit drinking.
Our office is hosting a free one-hour workshop about quitting that habit.
If you want to hear more about that free workshop, press 1.
(pause 2 seconds: for the caller to press 1 or do nothing)
Here’s the receptionist.”

We are down to 14 seconds!  For the callers who do press 1, we can make a new voice page that plays “Part of the presentation is by an ex-drinker”, etc etc through the end, and give it an option to repeat those details.  That will give the callers who care about the workshop an opportunity to write down some notes about the things they are hearing.

We can still do better than that. Let’s make the assumption that we should play the advertisement only the first time someone calls; if they’re calling back a second or third time, and didn’t press 1 to hear the details that first time, don’t waste the caller’s time playing any of the advertisement again!  Just go straight to the receptionist, or to a menu about other things!  We can be smarter than an answering machine.

How is that programmed?  Very easily!  This is less than 30 minutes of work in Angel.com’s Site Builder:

  • Write a row to a local data file at each call, saving the CallerID and a variable that saves a Y if 1 was pressed, or N if not (i.e. the caller wants to hear the advertisement, Y or N?).
  • Set that data file to purge itself of all rows older than 1 day, purging at some low traffic time such as 4:00am. It should only have rows for people who already called today.
  • On answering the call, check that data file for a match of the CallerID value and N (i.e. we already know that this particular caller doesn’t want to hear the ad).
  • If such a row exists, skip the voice page that plays the ad, and go directly to whatever the caller should hear next!
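For readers who think in code rather than in Site Builder pages, the same logic can be sketched in a few lines of Python. This is only an illustration of the four steps above, not Angel.com configuration: the in-memory dictionary stands in for the local data file, and every name in it is made up for the example.

```python
from datetime import datetime, timedelta

# Stand-in for the local data file: caller_id -> (pressed_1, time_of_call).
# Plain Python for illustration only; nothing here is a Site Builder API.
call_history = {}


def record_call(caller_id, pressed_1, now):
    """Save the Y/N answer (as a bool): did this caller press 1 for details?"""
    call_history[caller_id] = (pressed_1, now)


def purge_old_rows(now, max_age=timedelta(days=1)):
    """Drop rows older than one day, as the scheduled purge would."""
    stale = [c for c, (_, t) in call_history.items() if now - t > max_age]
    for caller_id in stale:
        del call_history[caller_id]


def should_play_ad(caller_id):
    """Skip the ad only when a row says this caller already declined it."""
    row = call_history.get(caller_id)
    declined = row is not None and row[0] is False
    return not declined
```

A first-time caller has no row, so the ad teaser plays. A caller who ignored the teaser earlier today has an N row, so the greeting goes straight to the receptionist. After the purge, everybody starts fresh.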

So: our caller who’s back for a second or third call in the day simply hears as greeting:  “Welcome to our company. Here’s the receptionist.” Delight!  Bliss! No sitting through an unwanted advertisement twice!  No sitting through the whole thing even once, but only the first few sentences of it!

Caller First. Do you want your callers to be a captive and squirming audience, annoyed by monologues every time they call, and already fuming before they get to talk to the receptionist?  Or, do you want them to get the information they truly need to know, quickly and respectfully?

You’ve purchased an IVR system that is much more resourceful than a 1980s answering machine.  Live it up!  Design it well, with an emphasis always on the caller’s point of view!

Take-home materials:

  • Shorter sentences rule.  No sentence may have 20 words or more.
  • Keep the prose simple and direct.  Get rid of any grammatical constructions that a 3rd grader couldn’t write.
  • This isn’t the radio.  Let your callers do something interactive as early in the call as possible.  No monologuing!
  • Create some delight: allow callers to bypass things they don’t care about.
  • Create even more delight: remember what each caller did or chose, the previous time that they called.  Use that knowledge to streamline the experience to their interests.
  • Someone who phones your company repeatedly doesn’t have the same needs as someone new.
  • The first draft is never the best, especially when designing IVR recordings.
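The first rule in that list is easy to check mechanically before anything gets recorded. Below is a rough sketch of a prompt checker: it flags sentences of 20 words or more, and gives a crude estimate of how long the prompt takes to read aloud. The sentence splitter is deliberately simple, and the speaking rate of 150 words per minute is an assumption, so time the real recording with a stopwatch too.

```python
import re


def split_sentences(prompt):
    """Split on sentence-ending punctuation that is followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]


def long_sentences(prompt, limit=20):
    """Return the sentences that have `limit` words or more."""
    return [s for s in split_sentences(prompt) if len(s.split()) >= limit]


def estimated_seconds(prompt, words_per_minute=150):
    """Rough reading time; the default speaking rate is an assumption."""
    return len(prompt.split()) * 60 / words_per_minute


draft = ("Welcome to our company. If you or someone you know is thinking about "
         "quitting drinking, you should know our office is hosting a free, "
         "hour-long workshop that can show you a fresh approach to quitting.")
rewrite = ("Welcome to our company. Perhaps you know someone who would like to "
           "quit drinking. Our office is hosting a free one-hour workshop about "
           "quitting that habit.")

print(long_sentences(draft))    # the marketing draft's opening gets flagged
print(long_sentences(rewrite))  # the rewrite passes
```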

Speech recognition: a fruit by “any other” name

This case study illustrates one of the many reasons why speech recognition is difficult, both in design and implementation.

The following behavior is very difficult to deliver:

  • “Tell me the name of a fruit that you like.”
  • If we hear Pineapple, Banana, Grape, Raspberry, or Orange, do X
  • If we hear any other fruit, do Y
  • If we hear anything that doesn’t sound enough like the name of a fruit, do Z for Failure

Basket X, by itself, is not very problematic. Each item is given a line for the recognizer to match, explicitly. One can add in alternate spellings or pronunciations to help it interpret what it “hears”, like this:

  • pineapple,pyneappul (do X1)
  • banana (do X2)
  • grape,grapes (do X3)
  • raspberry,razzberry (do X4)
  • orange,orrinj (do X5)

The recognizer receives an utterance (a series of sounds, considered together) from the human, and it scores that sequence against the list of fruit-name pronunciations that it “knows”. It assigns a Confidence Level, 0% up to 100%, for the one or several items on the list that appear to be the best match.

If the Confidence Level is above some assigned threshold, e.g. 45%, the recognizer returns that result as the best guess at what was said. But, if no item has received a score as high as the threshold, the returned value is No Match: the recognizer is not confident enough that the utterance matched anything on the list it is listening for. No Match? Do Z for Failure.

So far, that is not yet difficult. We always get back X1, X2, X3, X4, X5, or Z (No Match).
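In code, the decision for basket X is about this small. The sketch below assumes the recognizer has already done the hard part and handed back a score for each item on its list. The function and variable names are invented for the example and do not belong to any particular recognizer.

```python
NO_MATCH = "Z"

# Basket X: each item the recognizer listens for, and the behavior it triggers.
ACTIONS = {
    "pineapple": "X1",
    "banana": "X2",
    "grape": "X3",
    "raspberry": "X4",
    "orange": "X5",
}


def best_match(scores, threshold=0.45):
    """scores: {item: confidence from 0.0 to 1.0}, as a recognizer might report.

    Return the top-scoring item if it reaches the threshold, else NO_MATCH.
    """
    if not scores:
        return NO_MATCH
    item, confidence = max(scores.items(), key=lambda pair: pair[1])
    return item if confidence >= threshold else NO_MATCH


def action_for(scores, threshold=0.45):
    """Map the recognition result to X1..X5, or to Z for Failure."""
    return ACTIONS.get(best_match(scores, threshold), NO_MATCH)


print(action_for({"grape": 0.82, "orange": 0.10}))  # confident hit on "grape"
print(action_for({"grape": 0.30, "orange": 0.10}))  # nothing reaches 45%
```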

The system is ruined by the addition of the “any other fruit” basket, for behavior Y. The recognizer cannot know how to judge Y apart from Failure Z, unless it is given a comprehensive list of all the elements of Y that it might positively match. We must add all of these items:

  • kiwi (do Y1)
  • kumquat (do Y2)
  • watermelon (do Y3)
  • strawberry (do Y4)
  • tangelo,tanjelo (do Y5)
  • mango (do Y6)
  • black raspberry (do Y7)
  • apricot (do Y8)
  • peach (do Y9)
  • apple (do Y10)
  • …, ad infinitum, (do Yn)

Suddenly, our list has grown from five easily-distinguishable items into a much longer list, merely by trying to add one “all other fruits” basket.

  • Speech recognition works only on positive matches of known values. The recognizer cannot judge by any external attributes whether it sounded like the name of some fruit, through the meaning of the word. It does not even hear words. It hears only a sequence of meaningless sounds that it tries to match to a known list.
  • The recognizer has no way to know if “watermelon” or “water moccasin” are fruits, or any way to distinguish the two items from one another, unless one or both of them are on the list of sounds it is trying to match.
  • Again, we can’t build basket Y unless we have a complete list of all the individual fruits that should be in it.

With the addition of basket Y, the whole system begins to deliver results that humans would consider unacceptable failure:

  • The more one tries to “teach” all possible fruits to the recognizer, the less capable it gets at distinguishing any of them.
  • It becomes more difficult to say anything that will reliably return the No Match, Z, because we might accidentally hit something on the long list. If the human really said “beach”, a non-fruit, it is clearly not like anything in basket X; but, it hits too confidently on “peach” in basket Y, so we go to Y instead of Z.
  • The longer the list is, {X1, X2, X3, …, Xm, Y1, Y2, Y3, …, Yn}, the greater the chance that the recognizer will occasionally mis-recognize some ordinary things, because there are too many similar-sounding items to judge.
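These effects can be imitated with a toy model. A loud caveat: a real recognizer scores audio against pronunciations, not spellings. The sketch below swaps in plain spelling similarity (Levenshtein edit distance) as a stand-in for acoustic confidence, purely to show how a longer list changes the outcome. All the names are made up for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Toy 'confidence': 1.0 for identical spellings, lower as they diverge."""
    return 1 - edit_distance(a, b) / max(len(a), len(b))


def recognize(heard, grammar, threshold=0.45):
    """Return the best-scoring grammar item, or None for No Match."""
    best = max(grammar, key=lambda item: similarity(heard, item))
    return best if similarity(heard, best) >= threshold else None


BASKET_X = ["pineapple", "banana", "grape", "raspberry", "orange"]
BASKET_Y = ["kiwi", "kumquat", "watermelon", "strawberry", "tangelo",
            "mango", "black raspberry", "apricot", "peach", "apple"]

print(recognize("beach", BASKET_X))             # short list: No Match
print(recognize("beach", BASKET_X + BASKET_Y))  # long list: hits "peach"
print(recognize("apple", BASKET_X + BASKET_Y))  # "pineapple", cut off in transit
```

With only basket X on the list, “beach” falls below the threshold and correctly fails. Add basket Y, and the same utterance lands confidently on “peach”. Nothing about the caller changed, only the length of the list.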

Proper testing of the system becomes forbiddingly difficult, as well:

  • To ensure the system’s accuracy, all of that list for behavior Y needs to be tested individually (kiwi, kumquat, watermelon, strawberry, tangelo, mango, black raspberry, apricot, peach, apple, … [hundreds of them]), to be sure that each known but unwanted fruit hits basket Y instead of the general Failure, Z.
  • The fruits that we really care about most, in basket X, lose some of their valid hits: whenever the recognizer “hears” something in basket Y that returned a higher score than the one the human really said, in basket X. Perhaps the human said “pineapple”, and the system should have returned basket X, but part of the sound got cut off in transmission. Unknown to the human, the system heard only “-apple”, it found a match for “apple” in basket Y (with higher Confidence Level than “pineapple” in basket X), and returned an unexpected behavior. Stupid computer! I said “pineapple”, which doesn’t resemble an apple in any way! “I’m sorry, we don’t have that fruit today.” Huh? When did they run out of pineapple? (It’s not telling me that it really heard “apple”….)
  • We must also ensure that an utterance intended for basket Y does not generate mistaken hits into basket X! Let’s see: I really said “black raspberry”, but the system heard only the “raspberry” part, and it acted accordingly. Meanwhile, I as an intelligent human am absolutely certain that I actually said “black raspberry”. Furthermore, I am certain that I intended to say “black raspberry”, and that I really mean “black raspberry”, not “raspberry”. How could the computer not recognize my intentions or my meaning? It seemed human enough, in the other interactions we have had during this session…. My mental model of the intelligent computer crashes, suddenly. Why did the computer suddenly become incompetent at understanding me?
  • An attempt to improve the sensitivity of “pineapple” vs “apple” (or “black raspberry” vs “raspberry”) might not work, because we cannot predict or reproduce the transmission dropouts or noise that affected only that single experimental trial. The “pineapple” and “black raspberry” test cases certainly made the system seem broken, yes, but that was only one hit, each. It just happened to be on the tester’s first and only trial, setting the (perhaps mistaken) expectation that the whole system is not yet adequately accurate. Who wants to test the system 1000 times, with a representative set of humans and environmental conditions all properly controlled, just to be able to determine the proper experimental percentage of accuracy?

Angel.com Manages the Caller Registration for “Notify NYC” — IVR/SMS App for a City-Wide Citizen Warning System

Today, Angel.com issued a press release highlighting the company’s participation in the city-wide launch of the Notify NYC Program. The Notify NYC Program is a citizen warning system that was created to enhance New York City’s public communication channels by distributing critical text and voice messages directly via the Web, e-mail and SMS text messages. Angel.com will manage the Interactive Voice Response (IVR) system to streamline the registration process for those who want to receive notifications from the City and also facilitate internal communications amongst City employees and work groups.

“Notify NYC gives people a chance to prepare for emergencies before they actually happen. By using Angel.com’s IVR technology, callers can easily become a part of this program so they can be automatically informed of any critical event.”

This is an amazing program that we can see being rolled out in a number of cities across the US.  While residents can register for Notify NYC online, the IVR registration line allows people to sign up for this program immediately, wherever they hear about it.  This is one of the biggest benefits of such phone-based IVR applications: it brings immediacy to all consumers.  It is unlikely that a resident of NYC will hear about this program only while sitting at home with a laptop, or at work while a computer is handy.  The IVR registration allows residents who hear about it while reading the paper on the subway, or from a friend at the local coffee shop, to dial 3-1-1 and register on the spot, before they forget about it as they go about their day.

For the sake of speed and simplicity, a telephone number and a zip code are all that registrants need to enter in order to receive notifications in case of emergencies. Multiple account management features enable registrants to add, change or remove additional phone numbers, zip codes and notification types.

Angel.com “Spring Forward” Release - Designed to ‘Put the Caller First’

We put out our Spring Release 2009 this past weekend.  It is aptly named “Spring Forward” because it was designed to address some major features our customers were asking for, as well as to propel our platform toward even more major releases due later this year.

But the crux of the release was based around our desire to “put the caller first” in everything, and every voice solution, we do and build.  What does this mean?  Well, in the words of Dave Rennyson (President, Angel.com):

“We’re using this phrase ‘putting the caller first’ because if we put the caller first—for our direct customers—and we help them build better IVRs, we help the industry, we help people build better applications, and we help ensure that their customers are served better.”

For more information on our Spring Forward Release, please visit the following pages:

Spring Forward Press Release >>

Spring Forward Announcement >>

Spring Forward Interview with Dave Rennyson in Speech Technology Magazine >>

Simple editing of WAV files for your Angel.com phone system

Tips on handling sound files with your Angel.com account:

1. Because you can have “extra” phone numbers in your account for only a minimal charge, put one of them to regular use with your own staff:

  • running a development copy of your voice site before you release the changes to the public on your main number(s)
  • having a place to test changes
  • playing with and learning more features of SiteBuilder!

2. Test new and old phrase recordings yourself, over the phone; not only on your computer. Put them into a simple greeting page or question page on a test site, in your own Angel account, and listen through them all for tone and pacing. Be sure your phrases make the intended effect and are intelligible over a variety of phones and situations:

  • conventional phone in a noisy room
  • cell phone from a vehicle
  • speakerphone
  • cell phone from an area with bad service
  • caller distracted by something else, not paying 100% attention to your Angel application!

3. Some audio editing tasks are very easy: such as cutting off unwanted space, taking out a few words, adding a bit more breathing space between clauses, adjusting volume, or cloning half a sentence from another prompt. Excellent free editing tools are:

Rather than requesting fresh recordings (usually 5 to 10 business days) where recycled or slightly adjusted phrases would be sufficient…do it yourself with these tools! It is fun, and you might be able to deliver good results in 15 minutes, for free.

4. Run every set of recordings through Switch to be sure they are in the correct format, and optimized for best sound on your Angel.com phone system. If phrases sound static-y on the phone, wrong formatting is probably the culprit. Here are the proper settings:

  • Output Format / Wave Encoder Option: PCM Uncompressed, 8000 Hz, 16 bits, Mono
  • Options/Conversions tab, Audio Processing: Normalize files when converting, with peak level = 70%
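If you have a pile of recordings and want to find the wrongly formatted ones before callers do, the format half of those settings can be checked with a short script. This sketch uses Python’s standard `wave` module. It is not part of Switch or of Angel.com, and it checks only the sample rate, bit depth, and channel count, not the normalization level.

```python
import wave


def wav_format_problems(path):
    """Compare a WAV file against 8000 Hz, 16 bits, mono; list any mismatches."""
    problems = []
    # Note: wave.open raises wave.Error for files it cannot parse,
    # which includes some compressed formats.
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 8000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 8000")
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bits, expected 16")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected mono")
    return problems
```

An empty list means the file matches the format above. Anything else is a candidate for another pass through the converter.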

5. Whether it’s an IVR system, a musical performance, telling a joke, or getting a child’s attention: timing is extremely important in the delivery. Your chief weapons are fear, surprise, and ruthless efficiency…no, strike that, your chief weapon for IVR is a short silence.

Using Audacity, make a set of three “spacer” sound files that are nothing but silence, with lengths 500 milliseconds, 1000, and 2000.  Use these silences throughout your Angel.com voice site wherever a short pause would make your system’s delivery more easily understandable to your callers:

  • Before or after pronouncing data from a variable
  • Wherever you especially want the caller to pay attention to the phrase that comes next (grab the attention with a second of silence)
  • Wherever you want to give the callers a moment to think about or process what you just told them, such as a phone number or URL you want them to write down
  • At the beginning of a menu (1 second of silence is much more effective than inserting any cliched message begging for attention “as our menus have changed”)
  • Between the options within a menu, giving a moment where the callers can decide if that’s the one they want
  • Wherever the topic of your presentation is changing, such as a paragraph break within a Frequently Asked Questions message
  • More!
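
Audacity is the easy way to make the spacers, but if you prefer a script, something like the following should produce the same three files. It is a rough sketch with Python's standard `wave` module, assuming the 8000 Hz, 16-bit, mono format from tip 4; the function names and the output file names are my own invention.

```python
import wave

# Same format as the prompt recordings: 8000 Hz, 16-bit, mono PCM.
SAMPLE_RATE_HZ = 8000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def silence_frames(duration_ms):
    """Return raw PCM bytes of digital silence lasting `duration_ms` milliseconds."""
    frame_count = SAMPLE_RATE_HZ * duration_ms // 1000
    # In signed 16-bit PCM, silence is simply a run of zero-valued samples.
    return b"\x00" * (frame_count * SAMPLE_WIDTH_BYTES * CHANNELS)


def write_spacer(path, duration_ms):
    """Write a WAV file at `path` containing only silence."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(silence_frames(duration_ms))


# Example use (file names are hypothetical):
# for ms in (500, 1000, 2000):
#     write_spacer(f"silence_{ms}ms.wav", ms)
```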

Gain “confidence” in your callers – A “Put the Caller First” Feature Showcase

A client developer of ours made the following comment in an email about a product gap:  “No ‘Confirm If-Necessary’ ability.  Most speech offerings allow apps to confirm if the confidence level comes back in a middle zone between rejecting and accepting the utterance.  For [Client A], we either have to set a page to confirm always, or risk false accepts, which will eventually cause a concern.”

The truth of the matter is we do have this ability, but it’s just buried within the application on our “Question Pages”!  That led, inevitably and inexorably, to an educational feature series that I will be building around enabling our users to “Put the Caller First”.

Angel applications have an ASR (Automatic Speech Recognition) setting called confidence level, set to .45 (or 45%).  This means that the application assigns a confidence level to everything a caller says to it, and accepts only those responses where it is at least 45% confident that it knows what the caller said.

Imagine that you are in a noisy room, someone responds to a question you asked, and you are not sure you heard them correctly.  You would subconsciously assign a confidence level to what you heard, and might ask them to repeat what they said.  Or, instead of asking them to repeat it, you might just say, “I heard you say [response], is that right?”, building confidence in them that you are actively listening.

Angel can do the same thing: adjust the confidence level, turn on confirmation, and finally adjust the confidence threshold.  When you turn on confirmation, that threshold is set to 1 by default, so every response gets confirmed regardless of how confident you are in it.

Below is a simple guide on how to do this:

In this example we have an app where getting the response right is important but not 100% critical, and quality of the experience by speed is equally important.  If we are 75%+ confident in the response we just accept it and move on.  We only reject the response when we are less than 25% confident, forcing them to a no-match error.  Finally between 25-75% confidence, we politely let them know what we think they asked for, and ask them to confirm or deny that.  Here we are “Putting the Caller First”.
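
For readers who think better in code, the three zones in that example boil down to the small decision function below. To be clear, this is just an illustration of the logic in plain Python, not Angel's implementation or API; the names `Action`, `REJECT_BELOW`, `ACCEPT_FROM`, and `decide` are hypothetical, and treating exactly 25% as "confirm" and exactly 75% as "accept" is my own assumption about the boundaries.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"    # take the response and move on
    CONFIRM = "confirm"  # "I heard you say X, is that right?"
    REJECT = "reject"    # treat as a no-match error and re-prompt


# Thresholds from the example above (25% and 75%).
REJECT_BELOW = 0.25
ACCEPT_FROM = 0.75


def decide(confidence):
    """Map a recognition confidence between 0.0 and 1.0 to an action."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence < REJECT_BELOW:
        return Action.REJECT
    if confidence >= ACCEPT_FROM:
        return Action.ACCEPT
    # Middle zone: not sure enough to accept, not bad enough to reject.
    return Action.CONFIRM
```

So a response heard at 0.9 is accepted, one at 0.5 is read back for confirmation, and one at 0.1 is rejected. Moving the two thresholds closer together or further apart is how you trade speed against accuracy for a given question.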

In the future we will be exploring ways to capture confidence levels into variables to enable creative things (e.g. cool things with logic pages, or improved customer analysis and tuning through enterprise reporting).

How’s the front door?

How’s the front door of your company’s phone presence in the world?

Here’s a useful little test.  Every week, assign two or three people from your own company to call into your own phone line.  Not the same people every week.  Rotate it.

Take notes on the total user experience.

If there was a transfer to an agent, how long did it take?

Were there any obvious problems with the automated prompts that could be fixed with a common-sense approach?  Or, any more profound problems that might require some consulting?  (It is your company’s own front door, on the phone, so things “should” run correctly!)

If your own CEO leaves a voicemail message in the company’s system, how long does it take for the promised callback?

Did the call get dropped at any weird place?

Was there anything that a brand-new customer, or a potential customer, would find confusing or off-putting?

The customers might not be able to report problems to you, and might not bother to do so.  Competitors certainly won’t.

Have fun!  Break your own system and find any problems before your customers do!  :)

Awkward phrases in the auto-attendant

Does your phone attendant have any of these greatest hits?

“Please listen carefully, as our menu options have changed.”  — Who knew?  Who memorized them?  Who even cared, beyond the company serving them up?  What’s the punishment if the caller doesn’t listen?  Put yourself into a restaurant where your waitress says: “Please listen carefully, as our salad dressing options have changed.  For House, press 1.  For Ranch, press 2.  For French, press 3.  For Poppy Seed, press 4.  For Vinaigrette, press 5.”  No tip for you, one year!  (And is that House dressing flavored like bricks, wood, or is it musty carpet?  If you don’t give me any clues what it tastes like, but just give me your own cryptic keyword or title, how can I make any intelligent choice for or against it?)

Convoluted, impenetrable, obfuscatory, constipated bureaucratic verbiage – please just don’t.  Your company wants me off the phone, quickly.  I want to be served quickly, and off the phone.  We’re in this together.  Give me straightforward choices where I don’t have to use my doctorate to figure out the sentences.  Prompt me with English at a third-grade reading level, no higher.  I’m not looking at your sentences on paper; I’m hearing them sequentially on the phone (like a radio broadcast), and I can’t fast-forward with my eyes or ears.  Keep it simple.

Three nouns in a row, or three adjectives in a row, on the phone.  – Please prompt me with one noun and one verb.  Please greet me with one noun.  If you really have to add all that detail, put it into a second clause or sentence after the greeting…and you might realize that it’s expendable.  Don’t welcome me to “the state welfare agency support line office locator”, where five other nouns modify the truly important nouns: “office locator”.  (Truly important: what can this system do for me as the caller?)  It evidently belongs to the “state welfare agency”.  It’s probably a “support line” of some sort, by inference, since I’m calling it.

“Momentarily” and “shortly” – um, don’t you mean “soon”?  The word “momentarily” really means “only for a [fleeting] moment”; will the company representative speak in a blip and be gone?  “Shortly” takes twice as long to say as “soon”.  Get the junk out of there, and not a moment too soon.  Everybody knows that the agent will probably not be with them imminently, whether the system is trying to reassure them in that direction or not.

“Your call is important to us, so please stay on the line.” – What, you pulled me off the easily ignorable hold music just to tell me that you still don’t have a capable person available to pick up my call?  I put the phone back to my ear on hearing a voice, only to be told that I’m really still on hold for the indefinite future?  And that the company would rather have me wait forever on my own time/initiative than to bother any of their people?  If my call is important, if my time as your customer is in any way valuable, how about at least offering me a chance to leave a message now so the company can call me back on their time?

“For information on blah blah blah, press 1.” I press 1, and then it says, “For information on blah blah blah, please call:” and then a different company name and their phone number! – Now, why did the first prompt lead me to believe I’d find information here on this call? Please don’t make it the customer’s problem when merged companies can’t get their own acts together into a well-organized presentation.

“Eastern Standard Time”, as in: “Our business hours are 8:00 a.m. to 6:00 p.m., Eastern Standard Time” – this almost sounds OK…but it’s wrong for more than half the year.  Are you going to change your automated system every six months so it correctly says “Eastern Daylight Time” when appropriate?  How about just saying “Eastern Time”, leaving it that way all the time, and being done?

“Seven days a week” – strike that.  “Every day.”

“If you know your party’s extension, enter it now.” – I was invited to an extended party?  Cool!  Didn’t party lines on the phone go out sometime in the mid-1970s?  Why is this mock formal operator-ese, “your party”, still with us?  I’m calling the company to contact either a person or a department (or division).  How about: “If you know the phone extension of the person you’re calling, enter it now.”

“Visit us on the web at w-w-w, blah-blah-blah-blah” before the caller gets to make any choice. – No.  Please, no.  I called your company on the phone because your web site already didn’t give me what I needed.  I got your phone number from the web site, and I’m calling to follow up with a person or department on something I specifically need.  An automated phone call shouldn’t beg me to hang up right now and go away, even if that’s what (judging by behavior) the company really does want.  Furthermore, I’m not going to have a pen and paper handy to jot down your web address anyway, so why are you wasting my time with it?

“Please press” with every number, where the repeated “please” gets annoying. — It’s false politeness. We’re already dealing with a machine instead of a human. The “please” just sounds formulaic instead of sincere.  Please please please PLEEZZE PLEEEEEZZZZE condescend to push my buttons, saith the computer.

“Sorry, I didn’t get that.” – There is a separate essay about this stinker.  In summary: a computer is never sorry, and computers actually have less empathy (and inspire less empathy) than road kill does.  I don’t want to hear a computer apologize about its own inadequacies to serve me; I just want to be served through simple and direct questions about my needs, so I can follow the instructions and be done.

“For all other questions, including fruit bats and breakfast cereals, press 5.” – Look, doesn’t “all other questions” already catch everything I could possibly be calling about?  Why do the fruit bats and breakfast cereals need to be mentioned?  If they’re that important, shouldn’t they be their own options, and make the catch-all category be 6 or 7?  A pretty good rule of thumb is: a prompted option should probably never have the word “including” in it.  Even if you’re going to send options 5, 6, and 7 all to the same agent who handles “general” stuff, perhaps you could at least log or whisper the choice separately…and give less frustration to the caller hearing the menu, too.  “All other” means all other.

“For general information, press 1.  For information about dingo’s kidneys, press 2….” – Why is my “general” option in front of the list of specific options?   I don’t feel like listening all the way through the menu, to decide if I should have pressed 1 a long time ago.

“We are currently assisting other customers. Your call will be answered in the order in which it was received.” – If it has to say anything there, how about: “Please hold on, and someone will speak with you as soon as possible. Our people are still helping other earlier callers.” ?

“For more information, call 847-273-7502 during regular business hours.  Thank you.” – Impossible.  I have no warning that a phone number is going to be blurted in my general direction, no opportunity to write it down (even if I wanted to), and no information on their regular business hours…whoever it is.

There are easy ways around all of these problems.  Just think each of them through from the perspective of a caller who knows nothing about your company.

More to come.  See also a very long list of ideas….

Lily Tomlin’s Ernestine, and bad VUI confirmation/re-prompting

In the current issue of a bridge magazine, the ACBL Bulletin (of all places!), there is an article about badly-designed phone automation.

In her attempt to find an out-of-print book about the game, the caller tries to find the phone number of a bookstore.  The automated system gives her piles of conversational garbage and failed lookups…and then it still gives her the wrong phone number.

An excerpt:

“What city and state?”  “Fort Worth, Texas.”

“That’s Fort Worth, Texas, right?”  “Yes.”

“I’m sorry, I didn’t get that.  That’s Fort Worth, Texas, right?”  “Yes.”

“Okay, do you want residential or business?”  “Business.”

“I’m sorry, I didn’t….”  “BUSINESS.”

“Okay.  Please say the listing you want.”  “Half Price Books.”

“That’s Pentecostal Water of Life, right?”  ??????????? “No, it’s Half Price Books.”

“Please say the listing you want.”  “Okay.  HALF.  PRICE.  BOOKS.”

“What street?  It’s okay to say, ‘I don’t know.’”  “Hulen.”

“Okay.  You don’t know the street.”  (*#&@(*%&@#(*%&@#(*%&

“I’m sorry.  I didn’t get that.  What street again?”  “Hulen.”

“I think you said Cypress.  Is that correct?”  “Yes, Cypress.  That’s it.  Definitely Cypress.”

“I’m sorry.  I didn’t get that.  What street again?”  “I guess you DIDN’T get that, Miss Auto May Shun.  Hulen, but somehow it’s beginning to matter less and less.  I mean, half an hour ago I cared.  But it doesn’t seem important anymore, Hulen.  After all, the book I want is years old.  Bridge changes daily.  The basics, Hulen, might not be relevant in today’s hodgepodge of conventions and intricate twists and turns.  Clever insights are possibly being adopted as we speak, if you can call this speaking.  Hulen.”

“Okay, Hulen.  Is that right?”  “Yes.  Yes.  It is!  YES!”

“Okay, the number is 817-335-3902”.  (And the number was wrong; the author comments further….)

Let’s analyze that a bit.

The fundamental problem here is with the re-prompting strategy.  The computer apologizes “I’m sorry, I didn’t get that…” and the caller gets quickly agitated.  The agitation is not the caller’s fault.  It’s bad design.  The time wasted in the six words “I’m sorry, I didn’t get that”, along with the pretense of compassion, is enough to put a reasonable caller over the cliff on the second occurrence (or earlier!).

The system designer probably intended that the computer sound both deferential and polite, with such a phrase…but it’s counter-productive.  If every unparseable utterance leads to the computer acting sorry, the conversation falls apart.

When the computer tries to be too conversational, the caller (perceiving the thing as sort-of-human and using human speech/conversational patterns) volunteers extra words or sounds that a human would ignore.  The computer can’t ignore those extra sounds.  The human’s utterance is “out of grammar”…and the computer is “sorry” that it couldn’t figure it out.

And then, it immediately spirals into a feedback loop where the computer apologizes.   (But, IT’S NOT HUMAN!!!!!  IT’S NEVER SORRY!!!!!!!!!!  Cats don’t act sorry.  Why should computers?)   The human interrupts again with even more out-of-grammar speech (which is nonsense to the computer), and the conversation is dead.  The task never gets done accurately or efficiently in such situations.  All because the computer pretended to use, and to understand, human speech patterns.

The computer comes across as a bad human who is less capable of intelligent interaction than a one-year-old child.  Consequently, the caller gets understandably upset and then abusive.

And, in popular culture, automation ITSELF turns into the public enemy.  (“You *#$(%*& computer, why couldn’t you *&*#*&% hear me the first *#$*%& time?!?!?!?!?!?!??!?!  No!  Stop!  Stop  *&$*%&#%$ apologizing and shut the *&*%#$% up and listen to me!  No nono no nonono!  Stop!  I called YOU for *&*&#% HELP, because I need HELP, not a run-around…..”)

This bridge magazine article gives a great example.  The computer makes its wrong guesses at the caller’s request, tries to confirm things that are absolutely ludicrous (from the caller’s intelligent point of view), wastes its own speaking turns dwelling on the past, and that’s it. The conversation is dead.

There is one point in the conversation where this caller says something completely sarcastic, and the computer doesn’t get that either.  The computer is not programmed to interpret as “No” the utterance (with its desperately disparaging and sarcastic tone): “Yes, Cypress.  That’s it.  Definitely Cypress.”

It’s not the caller’s fault her emotions got riled up, to that destructive point.  It’s the system’s fault: for encouraging uncooperative behavior by the caller.  The system didn’t keep the necessary control of the conversation.  It would rather be sorry than accurate or efficient, apparently.

And the caller’s perspective is: Couldn’t the company afford to hire an intelligent person to answer the *&*#&%#% phone?  The company would rather waste the customer’s time instead of their own?  The company evidently cares most about keeping their own customers OFF the phone, either by providing a pointless and time-wasting run-around, or by begging the caller actively to go use the web?  That’s the perception.  That’s what bad service says to the customer.  The company would rather stick a clueless and unhelpful computer onto the line than pay an intelligent operator; too bad for the customers.  The company is too busy, or too self-centered, to help real people.

Remember Lily Tomlin’s character of the telephone operator Ernestine (“one ringy dingy, two ringy dingys”; “Is this the party to whom I am speaking?”)?

Ernestine sketch #1

Ernestine sketch #2

Ernestine was snotty, belligerent, self-centered, and presumptuous…but she was still easier to deal with than badly-done automation is.

In automated systems, everything must be done to keep the callers calm and focused on task.  The computer is never sorry.  The computer is never able to filter out extra noise or syllables as well as a toddler does.

More to the point: the confirmation/re-prompting strategy must keep the human saying easily understandable things (or pressing a small selection of buttons!), and volunteering NO extra sounds.

As soon as callers feel badly served, or not listened to, they’ll stop cooperating.  That’s human nature.  The computer doesn’t really care if the caller cooperates or not; it’s just cluelessly following its instructions.

The computer is not sorry.  An unparseable utterance, or even just a bunch of random noise or a digital phone dropout, happened in the “conversation”…and the computer couldn’t act on it.  Fine.  Time moves forward.  “The water is under the bridge.”  The computer must not apologize for being an inadequate conversational partner.  The computer must not speculate on the reason for the error, or blame anyone.  The past is the past.  The error is in the past.

The way out is very easy.  Errors will happen.  The way out is very easy.  The way out is very easy.

Initial statement of the question: “Fort Worth, Texas.  Is that right?”  “wekflkowhfpohf”   (error #1)

“Fort Worth, Texas.  Yes or No?”  “wejljlkwfhHJLhwekfhelsdkFJs”  (error #2, still didn’t get the Yes/No, or “Right”, or synonyms)

“Fort Worth, Texas.  Yes or No?”  “lwjelkfjh”  (error #3: give the caller a way around the side:)

“If that’s the city you want, press 1.  Otherwise, press 9.” 

Whether the error was an unrecognizable utterance or a timeout, the first two re-prompts are simply to say the question again as succinctly and directly as possible.  The third re-prompt gives the caller some unequivocal instructions NOT to speak the answer; for whatever reason, speaking wasn’t working.
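
Here is the same escalation written out as a tiny Python sketch, in case it helps to see how little logic is involved. Nothing here comes from a real IVR platform; the function name `next_prompt` and its parameters are made up, and the prompt wording is lifted from the example above.

```python
def next_prompt(initial, terse, keypad_fallback, error_count):
    """Pick what to say after `error_count` failed attempts (0 means first ask).

    No apology and no guessing at the cause: just restate the question,
    then switch to touch-tone instructions on the third error.
    """
    if error_count == 0:
        return initial
    if error_count < 3:
        # First two re-prompts: the question again, as short and direct as possible.
        return terse
    # Third error onward: speaking is not working, so offer a way around it.
    return keypad_fallback


# Prompts from the Fort Worth example.
INITIAL = "Fort Worth, Texas.  Is that right?"
TERSE = "Fort Worth, Texas.  Yes or No?"
FALLBACK = "If that's the city you want, press 1.  Otherwise, press 9."
```

Note that the function takes no argument describing why the attempt failed. That is deliberate: a timeout and an unrecognizable utterance get exactly the same treatment.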

The conversation is dead unless the caller can get past this point successfully.  The computer must therefore encourage the caller to cooperate in any way it will be able to understand.  Move forward and try to get a useful answer.  The past is gone.  Steer the future.

Longer junk such as “I’m sorry, I didn’t get that” encourages the caller to jump in with an interruption, miss the instructions again, editorialize, or worse.  It also encourages the caller to try to figure out and (speculatively) fix the CAUSE of the miscommunication, which is a pointless waste of time.  That utterance, whatever it was, is over and gone forever.  Try a new one.  (It also doesn’t work to say ONLY “I’m sorry, I didn’t get that” and not continue to the question; sometimes the error was caused by noise or by a caller interruption, cutting off part of the initial question, and now the context is lost.  The computer didn’t get WHAT?  I didn’t say anything.  Why did the question cut off?  What was the question?  What am I supposed to do now?  Did I kill it?)

Incidentally, this simple re-prompting strategy works well with small children, too.  Just restate the question in a calm and measured manner, making it clear that an answer is required.

“Do you want a banana, an apple, or a cookie?”  “Blah blah blah indecision indecision blah blah.”

“Banana, apple, or cookie?”  “lhklehfawefkjlawj”

“Banana, apple, or cookie?”  “Ummm…cookie!”

Ding!


Follow-up study to be presented at SpeechTEK

Susan Hura, one of the head organizers of SpeechTEK this year, just posted the following on the VUIDS Yahoogroups group:

For those of you coming to SpeechTEK next month, Tim Pearce from Dimension Data and Mike Bergelson from Cisco are going to present year 2 data from the Alignment Index at the conference. We’re kicking off the Business Goals track with this session, Monday, August 18, 10:15-11 AM. We’ll also be hearing about a similar study conducted in the EU by VoiceObjects.

Here is a link to the session.

Vendors vs. Users: Interesting Alignment Study

Just came across a fascinating study by Dimension Data (in collaboration with Cisco) on the perception gap between “vendors” and “consumers” of speech-enabled self service solutions. By “vendors” the study refers to platform developers, system integrators, voice application developers, and speech technology vendors. 128 such vendors were surveyed for the study. By “consumers” they refer to callers who have interacted with speech-enabled self-service applications. They surveyed 1,203 such consumers.

Misalignment

The key findings revolve around 6 questions:

(1) How often would you prefer to use a speech recognition system rather than a touch-tone system? 9% of vendors answered “As little as possible,” while 45% of users gave that answer. A huge disconnect. On the flip side, 47% of users gave a qualified “Yes” — that is, they would prefer speech under some circumstances (depending on time of day, where the caller is, etc.), which tells us that users are not necessarily reflexively rejecting speech-enabled automation under all circumstances.

(2) What do you think is the main reason organizations provide automated services in their call centers? 69% of vendors said “to save money” compared to 54% of users. In other words, callers are no dupes: they fully understand what motivates the deployment of these solutions.

(3) What do you think is the most important benefit of using an automated system when you phone a call center? 51% of vendors mentioned “to avoid wait time” while 49% of users mentioned “24 x 7 service” against 18% who mentioned “Avoid wait time”! A remarkable mis-alignment and a clear opportunity for marketers and designers to exploit for increasing adoption.

(4) In general, when you’ve used a speech recognition system, which of the following best describes how well it helped you deal with your query? 77% of vendors said that it “Partially addressed the reason I called” while only 43% of users did. Another large gap. 2% of vendors responded with, “Did nothing I needed,” while 13% of users gave that response. Another noticeable gap, and one that points to excessive optimism from vendors. On the other hand, only 8% of vendors responded with “Fully addressed the reason I called,” while 18% of users gave that answer. In other words, it seems that vendor answers are driven by mushy conservative wishful thinking rather than insight into actual user reception.

(5) Having used a speech recognition automated system, would you now…? 44% of vendors responded with, “Be neutral to use one again” vs. only 28% of users giving the same answer. What is noteworthy is that a greater proportion of users (36%) responded with “Be happy to use one again” vs. 32% of vendors giving that answer, and a greater proportion of users (also 36%) responded with “Be reluctant to use one again” vs. 24% from vendors. In other words, just like question 4, users are more opinionated and have a less neutral disposition than vendors.

(6) The thing that annoys or irritates me most about using an automated speech application is when…. 41% of vendors answered with “System didn’t understand me,” vendors’ number one answer, while users’ number one answer was, “Transfer to agent with no context.” This is a fascinating disconnect. Only 17% of users responded with, “System didn’t understand me.” Which simply means that it’s not speech recognition that users find annoying or irritating, but the experience with the application: an additional 16% of users said, “Can’t skip ahead” and 14% said, “No alternatives”. In other words, 67% of dissatisfaction revolves around the experience with the application. Vendors by contrast focused on technology, in this case ASR and CTI (“Transfer to agent with no context” receiving 38%). “Can’t skip” received 4% and “No alternative” a mere 1%.

The report gives a couple of general recommendations such as establishing “cross-functional engagement within organizations” and ensuring “contributions from non-technology stakeholders, e.g., marketing, customer services, and usability experts.” But that is no revelation to anyone who seriously engages in voice user interface design.

What would have made the study complete would have been including a third category of stakeholders: the companies that deploy these applications — i.e., the actual customers of the vendors. I suspect that since many of these customers are sold on the value of self-service applications by the very vendors surveyed in the study, a parallel mis-alignment between customer expectations and those of the ultimate users also holds.

The authors promise to run the survey year over year. Let’s keep our eyes open. Hopefully, vendors and customers will read the report and will begin to actually align their goals and values with those of end users.

Outrage Triggers for Callers When Dealing With an IVR System

Here is a short list of IVR failures that trigger in callers a feeling of outrage — or at least loathing and contempt!

Outrage

1. You are forced to start all over after giving the IVR several pieces of information. That’s right: nothing can make a person’s day like having them emulate Sisyphus while trying to reach customer support. Nice!

2. You are made to listen to several minutes of declamations, instructions, warnings, and general statements before you are offered anything that you care about. That’s because we all love to hear other people thump their chest about how they are the best and the brightest and the loveliest, and how maybe we should check their web site next time, and that our call is so important to them that they can’t stand it, and so forth. We all love to hear that stuff, especially when we are calling because we are pissed off and need help with their crappy product.

3. You are made to wait a long time only to be routed to voice mail. This is my favorite by far. OK — we thought about it and… we don’t think you are worth speaking with. So sorry. After the tone, start speaking or whatever…. Beeeep.

4. You are made to wait a long time, finally get to an agent, but you never get your problem resolved. This is worse than #3, because you force yourself to waste 10 minutes of your existence on earth talking to someone who wouldn’t know how to help you if your hair got caught on fire before you start feeling ashamed of yourself for indulging in such bottomless self-delusion.

5. You are asked by the agent to repeat information that you already provided to the IVR. This is of course the all time classic. (Once, an agent pretended that she needed me to repeat the information “just to make sure”. I smiled and repeated it. At least she cared enough to make the effort…)

6. You are transferred from one IVR system to another IVR system. This always makes me smile: if people are able to launch businesses and make a healthy living with this kind of utter thoughtlessness, I too will become rich and famous one day….

7. The IVR system asks you to call at a later time and then hangs up on you. You gotta respect a machine that can detect a meat head, swiftly decide that it has no tolerance for such density, and then cut its losses and move on. When you get your act together, buddy, give me a call and we can talk….
