Szymon Rozga, Practical Bot Development, https://doi.org/10.1007/978-1-4842-3540-9_4

4. Conversation Design

Szymon Rozga, Port Washington, New York, USA

Although the technology allows us to develop a bot that behaves in just about any way, that doesn’t mean we should. Users have certain expectations of their messaging communications, such as acknowledgment of message receipt, a quick response, and the ability to continue the conversation later. Although conversing with a bot is not the same as speaking with a human, messaging a friend is the closest analogous experience. Since users are still getting used to bots, it is reasonable to take those interactions as samples of how a bot should behave.

Successful bots can exhibit many types of behaviors, but there are some common patterns and flavors. That’s not to say innovation has stagnated; not at all! These use cases are based on commonly observed patterns in the space given technology and budget constraints. The space is ripe for innovation, and the only question is, what are the limits of our collective imagination?

These common use cases also follow certain rules as to how they communicate with users. During my career, it was essential for me to internalize that most technology users don’t use the technology the way that I do. I love the command line and its precision. As a non-native English speaker, I have found the ambiguity of natural language troubling. But bots give users the ability to use this ambiguous natural language. As a result, there is a certain amount of self-restraint that bot developers need to exercise. It is easy for a developer to put together a bot experience that is more reminiscent of using a command line than of having a conversation.

Considering the limitations of natural language processing (NLP) and user expectations, it then becomes more important than ever to be careful about how the bot behaves when it doesn’t understand things and when it’s asking for feedback from the user. With a careful approach and a conscious choice of the type of responses we send our users, creating a delightful experience is within reach.

Common Use Cases

Developers are creating all sorts of conversational experiences. We can experience bots that specialize in tasks such as selling items, answering questions about products, sending order statuses, answering inquiries about orders, provisioning cloud infrastructure, searching over multiple data sources, sharing cat GIFs, and doing millions of other things.

At a high level, we will split the bots into two larger categories: consumer and enterprise. There is of course overlap in the subcategories but also some clear dividing lines.

Common Consumer Cases

Consumer bots are typically available via channels such as Facebook Messenger, Slack, and the other public messaging apps; web chat; voice interfaces; or even custom mobile apps when a custom interface is required. On the lower end of the quality scale, they are no more than toys. On the higher end, they can be impressive feats of design and engineering. Because of the general AI and bot fever we discussed in Chapter 1, many companies are deploying a bot along with their products. Atlassian, for example, has a Slackbot for its JIRA product. Even Amazon has a chat bot integrated into its mobile shopping app. You will also find brands dipping their toes into bots via Facebook Messenger. Facebook Pages makes it easy for a company to have an outward-facing public channel to talk to its customers via either public posts or Messenger. If it is Messenger, a human agent needs to log into the page inbox and reply to each message. A first step for many companies is to deploy a Messenger bot that replies to a few types of user queries, with the rest simply left for humans to reply to. Utility-wise, we are still trying to answer the question, what makes the most sense for users? The variety of bots in the space certainly points to that. The following are some broad categories of effective approaches.

FAQ Bot

An FAQ bot is typically the first entry into the bot and NLP space by teams taking the technology for a test run. It is an easy use case: let’s take our existing FAQ and place it as a bot on Facebook Messenger or enterprise messaging. That way, the most typically asked questions can be caught by a bot before an employee spends time answering them. A simple text-based FAQ bot can turn into something quite interesting and aesthetically pleasing from a user perspective. An answer to a commonly asked question doesn’t simply have to be a block of boring text. The answer can include further content such as images, videos, and links to additional content.

For example, consider a financial services bot that can answer different types of questions about financial topics. Within its response, it can embed additional suggested topics of interest as buttons. At that point, the user can look at related terms and their definitions. If there are websites that visually represent a concept, for example, the iron condor option investment strategy, those links can be included in the response for the user to click to get more information. Of course, our conversation design needs to balance all that content with possible user overload. The sweet spot in between can be effective at providing the user with a pleasant experience with the bot. Figure 4-1 is an example of Child Fund International’s FAQ bot embedded into a web page.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig1_HTML.jpg — Figure 4-1
A basic FAQ bot in action

Task-Oriented Bot

A task-oriented bot is a virtual agent that can help users with a variety of tasks specific to a domain. These types of bots are sometimes called concierge bots. For example, JIRA’s Slackbot (Figure 4-2) is task oriented. It can create and assign tasks based on a conversation a team is having.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig2_HTML.jpg — Figure 4-2
JIRA Slackbot

I once worked on a diabetes coach chat bot, which could help users who have Type 2 diabetes ask for meal and exercise advice personalized according to previous conversations and other data about the users. There are also financial services bots that connect to a trading account and update the user on their account balances and positions and even trade, like the TD Ameritrade bot (Figure 4-3). The Calendar LUIS app we developed in Chapter 3 is the base for a calendar task bot.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig3_HTML.jpg — Figure 4-3
Trading stocks using the TD Ameritrade bot

Broadcast Bot

A broadcast bot is an interesting concept and is quite common. We can think of this as a bot that reaches out to the user without prompting, as opposed to the user contacting the bot first. In some bots, it is more a pattern to keep users engaged. For example, different news bots, like the CNN bot on Facebook Messenger, will reach out daily with the biggest stories of the day.

A subset and more nuanced version of this can be seen in some celebrity bot implementations. Typically, these types of bots exist for fun. They adopt the personality of a celebrity and can talk to users about topics of interest, products, and other ways of interacting with the celebrity’s brand. The bot can navigate you through a script of topics, send you videos and images, and maybe talk about products that the celebrity is endorsing. The conversation is almost entirely driven by the bot, instead of the user. It is an interesting storytelling device, but its success comes down to consistently fresh content. Figure 4-4 shows an example of Project Cali, a Snoop Dogg bot created for fun.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig4_HTML.jpg — Figure 4-4
Project Cali: a Snoop Dogg bot

E-commerce Bot

Although not yet big in North America, bots are slowly starting to sell products to consumers. It is not a terribly challenging task from a technical perspective; the bigger challenge is getting users to use messaging instead of apps or websites. The amount of e-commerce integration in these kinds of bots varies. For example, some bots provide the complete end-to-end shopping experience. Looking at clothing items (Figure 4-5) or flowers (Figure 4-6) through a bot is different from an online shopping experience. Some bots lean into this and provide quirky or innovative ways of figuring out what products to show the user to get the impulse buy!

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig5_HTML.jpg — Figure 4-5
Louis Vuitton bot

We also run into experiences where the bot is responsible only for broadcasting a receipt for a purchase and order status updates, with a limited set of bot functionality. Everything else gets automatically routed to a human customer support representative. Although this kind of experience is not fully integrated e-commerce, it is a great first step into that journey and into getting customers acquainted with bots. In short, companies are embracing what is being called the digitally driven consumer journey, and bots are part of this strategy.¹

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig6_HTML.jpg — Figure 4-6
The 1-800-Flowers.com Assistant

Different messaging platforms provide different levels of payment support. We could certainly create e-commerce via a bot by providing a custom checkout page where the user can enter their payment information. The conversation is paused at this point. Once the payment is processed, a message is sent to the bot to continue the conversation. On the other hand, Facebook Messenger provides deeper integration with systems such as Stripe and PayPal. In that version, the payment experience stays completely within the Facebook Messenger app. From a user perspective, the less friction the better. And as users begin placing more trust in messaging apps to store their payment information, we will see more and more payment integrations like this. Apple has released its Business Chat² product and you bet that Apple Pay payments are fully integrated.³

Common Enterprise Cases

Enterprise bots may be more specialized to a domain or subject matter. They are typically deployed using a web chat component or integrated into enterprise messaging systems, or even enterprise Call Center and Interactive Voice Response (IVR) systems such as Cisco's Unified Communications Center. They can also be deployed on e-mail endpoints. The bots may be integrated with single sign-on solutions, powerful existing enterprise back ends, and knowledge management databases. Depending on the enterprise’s practice, these will range from simple pilot bots to machine learning–driven large-scale deployments.

Self-Service Bots

One of the most common use cases in an enterprise scenario is incident self-servicing. Enterprises have large knowledge bases that internal help desk agents use to communicate possible solutions to the user and guide them through the process of troubleshooting issues. Many of these step-by-step troubleshooting directions can be communicated to the user by a bot. For example, one of the most common queries to internal help desks is password reset. Companies could cut through a lot of volume and, frankly, money, if they were to handle such requests automatically. You could imagine an appliance manufacturer releasing a chat bot to assist in diagnosis and fixing issues before involving a service engineer.

The idea behind these self-service bots is that they can provide a variety of self-service content for the users, especially for the most common queries, and can even automate some of the common work that the customer support team is doing. These bots are usually integrated with live chat systems so that the user may initially be chatting to a bot but can be quickly rerouted into a live chat or phone conversation with a human customer service agent in case the bot’s directions do not solve the problem.

Process Automation Bots

Robotic Process Automation (RPA) is a huge topic these days. Companies like IPsoft specialize in building bots and technology that can automate business and IT tasks.⁴ In this context, bots are not necessarily chat bots but rather computer agents that perform the automation. These tasks can include everything from account provisioning to website automation and business process automation. There are companies that focus on creating automation platforms, such as Automation Anywhere and UiPath. With machine learning these days being used for everything from contract analysis to skin cancer diagnoses, chat bots can serve as an excellent front end into these processes. In an RPA scenario, the chat bot is more of an orchestrator than an automator. In addition, these bots may integrate into ticketing systems such as Remedy and ServiceNow to track their work.

In other instances, the chat bot is less visible to the user. For example, Slack is a great platform for bots that listen in to a team conversation and surface data as the right natural language arises. Bots that simply listen in to some natural language input and provide answers are a type of automation bot. Say, for example, a team of medical experts looks through text descriptions of procedures and is charged with translating them to insurance codes. That process can be automated by a machine learning algorithm that observes the team’s behavior and results and can then take over the data.

Again, the actual brains behind the logic may not be inside the bot itself. There may be a separate system that implements the machine learning model for the insurance codes. Or, the automation code may be Python, PowerShell, or any other script. The bot serves as the front end to receive the natural language and orchestrate the automation (Figure 4-7).

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig7_HTML.jpg — Figure 4-7
A sample automation bot flow

Knowledge Management Bots

Another type of enterprise bot is one that can solve natural language search problems across a variety of data sources. Many firms have huge knowledge repositories across disparate sets of systems. Being able to integrate with all those sources using natural language is important. There are interesting choices to make in these bots about which content to display to the user, in what format, and how to collect feedback on which content was the most useful given a query.

The bigger problems of natural language search that these projects try to solve are fascinating and beyond the scope of this book. This type of bot can be extremely interesting in a group conversation context, where the bot is querying articles, reports, white papers, and case studies as the group is having a conversation about topics of interest. The group’s feedback to the bot during the search can further provide supervised learning data to improve the search experience even further.

Representing Conversations

How do we start developing a conversational chat bot? A good place is trying to graphically represent the conversation flow. What kinds of tasks can the chat bot handle? What intents and entities does it need to look for to achieve these goals? How does it help fill in missing data?

We will be referring to the conversation as a graph, which is a collection of nodes connected by edges. Figure 4-8 illustrates an undirected graph. Every node is connected to at least one other node in the graph. Each node represents a state of the conversation, and the edges represent a transition between states.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig8_HTML.jpg — Figure 4-8
An undirected graph

We will use arrows in the edges to show the direction of flow. This is referred to as a directed graph. We start with the root node. The root node is the state of the beginning of the conversation. Using our Calendar bot as a sample, we know that our bot should support adding new entries, editing existing entries, removing entries, checking availability, and providing a summary of our calendar or an event. We can represent the bot as shown in Figure 4-9.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig9_HTML.jpg — Figure 4-9
Representation of a calendar concierge bot conversation

Note that the conversation moves between states based on the user’s utterance, which resolves to a LUIS intent. Each node along the conversation has built-in logic to resolve the entities and execute the correct logic for a state. After a state is done executing its logic, the conversation transfers back to the root node.

The transitions between states can be invoked either programmatically or by user input. For example, say our bot supports creating calendar appointments. Recall in Chapter 3 that we created a LUIS application that allows us to pass either several or no entities as part of an utterance to add a calendar entry. If our Add New Entry dialog did not receive information about a subject and invitee, as for example in the utterance “meet tomorrow at 2pm,” we could elicit that information in another state. On the other hand, if the user uses an utterance that contains these entities, such as “meet with kim for coffee tomorrow at 2pm,” we do not need to elicit this extra information. This conditional state transition is illustrated in Figure 4-10.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig10_HTML.jpg — Figure 4-10
Conditionally transfer to state based on user input

The process of creating a conversation graph is usually referred to as intent and entity mapping; we model intent and entity combinations as transitions between state nodes.
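To make the graph idea concrete, here is a minimal sketch of the calendar conversation as a set of directed transitions. The intent names, state names, and the `next_states` helper are assumptions made for illustration; they are not taken from the LUIS application in Chapter 3, and a real bot would resolve intents and entities through its NLU service.

```python
# Hypothetical intent and state names, chosen only for this illustration.
ROOT = "Root"

# Directed edges leaving the root node: recognized intent -> state entered.
INTENT_TRANSITIONS = {
    "AddCalendarEntry": "AddNewEntry",
    "EditCalendarEntry": "EditEntry",
    "DeleteCalendarEntry": "RemoveEntry",
    "CheckAvailability": "CheckAvailability",
    "ShowCalendarSummary": "Summary",
}

# Entities the Add New Entry state needs before it can run its logic.
REQUIRED_FOR_ADD = ("subject", "invitee")


def next_states(intent, entities):
    """Return the states visited for one utterance, ending back at the root."""
    state = INTENT_TRANSITIONS.get(intent)
    if state is None:
        return [ROOT]  # unrecognized intent: the conversation stays at the root
    path = [state]
    if state == "AddNewEntry":
        # Conditional transition: visit an eliciting state per missing entity.
        missing = [name for name in REQUIRED_FOR_ADD if name not in entities]
        path += [f"Elicit:{name}" for name in missing]
    path.append(ROOT)  # once a state finishes, control returns to the root
    return path


# "meet tomorrow at 2pm" carries only a date and time, so two entities are elicited.
print(next_states("AddCalendarEntry", {"datetime": "tomorrow 2pm"}))
```

An utterance that already carries a subject and an invitee skips the eliciting states and goes straight from Add New Entry back to the root, which mirrors the conditional transition in Figure 4-10.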

Bot Responses

There are a variety of forms that a bot response to a user’s query may take. Understanding the different options and how to best leverage them is key to any bot design. In the following sections we will dig into a number of concepts found amongst the various channels.

Building Blocks

We now understand how we can take user input and map it to bot states and function. We also understand how we can organize our bot code into various conversation states. The next step in our design is to figure out what the bot sends to the user in return. Bots can respond in a variety of ways. By default, we think of text or speech output. Most typically, we simply send back plain text. Some messaging channels support something more complex like Markdown or HTML. Markdown is a plain-text formatting syntax.⁵ The following Markdown input translates to the formatted content in Figure 4-11:

# H1
## H2

Hello, my _name_ is **Szymon Rozga**

I like:

1. Bots
1. Dogs
1. Music

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig11_HTML.jpg — Figure 4-11
A formatted markdown document

Bot platforms can also support speech response. Many platforms also support the Speech Synthesis Markup Language (SSML) as a speech output format. SSML is a markup language that provides metadata about how speech should be constructed using elements such as pauses, breaks, changes of rate and pitch, and others. Here is a self-explanatory sample from the W3C Recommendation⁶:

<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                 http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
       xml:lang="en-US">
  That is a <emphasis> big </emphasis> car!
  That is a <emphasis level="strong"> huge </emphasis>
  bank account!
</speak>
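When a bot assembles SSML from dynamic text, building the document with an XML library rather than by string concatenation keeps the markup well formed and escapes characters such as `&`. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `build_ssml` helper and its parameters are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace behind xml:lang


def build_ssml(before, emphasized, after, level="strong", lang="en-US"):
    """Build a small SSML document with one emphasized phrase (hypothetical helper)."""
    # Serialize SSML elements without a prefix, as in the W3C sample.
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    speak.text = before  # text before the emphasized phrase
    emphasis = ET.SubElement(speak, f"{{{SSML_NS}}}emphasis", {"level": level})
    emphasis.text = emphasized
    emphasis.tail = after  # text after the emphasized phrase
    return ET.tostring(speak, encoding="unicode")


print(build_ssml("That is a ", "huge", " bank account!"))
```

Whether a given channel honors `emphasis` or any other SSML element depends on its speech engine, so the output is best checked against the target platform.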

Output to users does not always have to be text. We can use images and videos to communicate many ideas to our users. As part of any message sent back to the user, we may attach various content such as videos, audio files, and images. The specific supported formats will depend on the underlying operating system and channel. Some systems allow other file attachments as well, for instance XML files or a native format of some sort.

An alternative mechanism for presenting content to our users is the card. A card is typically a combination of an image, text, and optional buttons that serve as calls to action. Our YouTube Search bot from Chapter 1 (Figure 4-12) clearly displayed the video name, description, and a button to watch it in a set of cards.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig12_HTML.jpg — Figure 4-12
Horizontal list of cards; also called a carousel

This layout is called a carousel. It presents several cards side by side and gives the user the ability to swipe or scroll through the individual cards.

Buttons are typically sent as part of a card, but they can also be sent as stand-alone elements without an associated image. There are many types of buttons. The three most popular types are used to open web pages, send a message back to the bot (IM back), or post a message back to the bot (post back). The difference between an IM back and a post back is that a post back message will not appear in the message history, whereas an IM back message would. Not all channels support both approaches, but the overall spirit of sending a message to the bot via a button click is widely supported.
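The difference between the three button types can be sketched with a small model of a conversation. The `Button` and `Conversation` classes and the `kind` labels below are assumptions made for illustration, not any channel's actual API; the sketch only shows which messages reach the bot and which ones appear in the visible history.

```python
from dataclasses import dataclass, field


@dataclass
class Button:
    kind: str   # illustrative labels: "openUrl", "imBack", or "postBack"
    title: str  # label shown on the button
    value: str  # URL to open, or the message text delivered to the bot


@dataclass
class Conversation:
    visible_history: list = field(default_factory=list)  # what the user can scroll through
    bot_inbox: list = field(default_factory=list)        # what the bot receives

    def click(self, button):
        if button.kind == "openUrl":
            return  # opens a web page; nothing is sent to the bot
        self.bot_inbox.append(button.value)
        if button.kind == "imBack":
            # Only an IM back is echoed into the chat history.
            self.visible_history.append(button.value)


chat = Conversation()
chat.click(Button("imBack", "Watch", "watch video 1"))
chat.click(Button("postBack", "More", "show more results"))
print(chat.bot_inbox)        # both messages reach the bot
print(chat.visible_history)  # only the IM back is visible to the user
```

A post back suits payloads the user does not need to see, such as an internal item ID, while an IM back keeps the transcript readable as a conversation.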

Another type of button is a sign-in button. Sign-in buttons kick off an authentication or authorization flow via a login in a web view. Once the login is completed, the bot receives any necessary access tokens and can proceed with an authenticated session, as shown in Figure 4-13.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig13_HTML.jpg — Figure 4-13
Authenticated bot with suggested actions/quick replies

All the content described previously is kept within a user’s chat history. The carousels, the cards, the buttons, and, of course, all the text are available for the user to scroll through. There is one type of element that is displayed only in the context of the message that it is included in: suggested actions, also called quick replies. These buttons are presented at the bottom of the user interface until the user responds. They are clear calls to action and an indispensable tool for delightful conversational experiences. Figure 4-14 shows an example usage of suggested actions guiding users to video categories available in the TD Ameritrade bot.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig14_HTML.jpg — Figure 4-14
Video category suggested action in the TD Ameritrade bot

Authentication and Authorization in Bots

Let’s be honest, no one is going to be sending a username and password to a bot chat window. This is a security risk. We do not want Facebook or Slack or any other channel to have our users’ login credentials in their message history. At the end of the day, a bot is simply a web service, so using the standard OAuth or OpenID Connect flows is a natural fit.

The right approach is to utilize a sign-in card, which is a card that includes a button that opens a login web page for the user to enter their credentials (Figure 4-15).

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig15_HTML.jpg — Figure 4-15
A standard sign-in card

Typically, this login page will be an OAuth page (Figure 4-16).

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig16_HTML.jpg — Figure 4-16
OAuth authorization code flow

OAuth 2.0⁷ is a standard for token-based authorization over the Internet. There are several different types of authorization flows that are enabled by OAuth 2.0. A three-legged OAuth flow allows a resource owner (the user) to grant access to an application (the consumer) to an API (the service provider). In the context of a bot, it looks as follows:

The user clicks a button to open a login page to a service in a web view for the third party and enters their username/password combination. The URI for this login page typically includes a client ID and a redirect URI. The redirect URI is an endpoint on our bot web service.
Once the user logs in successfully, the service redirects the user back to the bot redirect URI. The bot redirect URI endpoint receives the authorization code. This is the user’s grant to the application to use the service. The bot exchanges the authorization code for an access token (and an optional refresh token) from the token endpoint.
The bot uses the access token when making requests to the service on behalf of the bot user.
Typically, the access token is short lived, and the refresh token is longer lived. At any point, the bot can request a new access token from the token endpoint by posting the refresh token.
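The steps above can be sketched with nothing but the standard library. The endpoint, client ID, and redirect URI below are placeholders, and the helper names are hypothetical; the parameter names (`response_type`, `client_id`, `redirect_uri`, `state`, `grant_type`, `code`, `refresh_token`) come from the OAuth 2.0 RFC. The `state` value is an addition not described in the text: the RFC recommends it so the bot can tie a callback to the login it started. Client authentication and the actual HTTP calls are omitted.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Placeholder values: a real service publishes its own endpoint and issues the client ID.
AUTHORIZE_ENDPOINT = "https://auth.example.com/oauth2/authorize"
CLIENT_ID = "my-bot-client-id"
REDIRECT_URI = "https://bot.example.com/oauth/callback"


def build_login_url(state):
    """Step 1: the URL the sign-in button opens in a web view."""
    query = urlencode({
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "state": state,
    })
    return f"{AUTHORIZE_ENDPOINT}?{query}"


def read_callback(callback_url, expected_state):
    """Step 2: extract the authorization code from the redirect the bot receives."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [""])[0] != expected_state:
        raise ValueError("state mismatch: callback does not match the login we started")
    return params["code"][0]


def token_request_body(code):
    """Form fields the bot posts to the token endpoint to obtain an access token."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
    }


def refresh_request_body(refresh_token):
    """Form fields the bot posts later to trade a refresh token for a new access token."""
    return {"grant_type": "refresh_token", "refresh_token": refresh_token}


print(build_login_url("abc123"))
```

In a real bot the `state` value would be random and stored alongside the conversation, so that the callback handler knows which user and conversation the resulting tokens belong to.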

There is substantial documentation around the specifics of this and other OAuth flows. The RFC is a great starting point.⁸ The key point is that a bot is a web service, and the complete OAuth flows can happen in an integrated manner. The only tricky part from a UX perspective is to ensure that the browser window automatically closes when the login is completed. The various channels approach this in slightly different ways. Although one can implement the entire flow manually, something we show off in Chapter 8, the Bot Framework does provide additional tools to facilitate this process.⁹

Specialized Cards

On platforms that support them, cards are a key component of the user experience. We covered the idea of generic cards. Some channels provide several specialized cards. For instance, a receipt card (Figure 4-17) can be sent to communicate a purchase receipt with information such as totals, tax, payment confirmation, and so forth.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig17_HTML.jpg — Figure 4-17
Messenger receipt template

In addition, Messenger gives developers the ability to utilize four air travel cards: itinerary, boarding pass (Figure 4-18), check-in, and flight update.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig18_HTML.jpg — Figure 4-18
Messenger boarding pass template

Tapping the boarding pass card shows a full-screen version with a QR code that can be utilized at the airport (Figure 4-19). Depending on which platform we target, there may be other templates for us to use. If they exist and match your use cases, use them. They provide a good, native user experience.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig19_HTML.jpg — Figure 4-19
Messenger boarding pass template details

Another form of specialized card is one in which you use custom graphics. A common approach is to generate the custom graphics on a web service as the bot processes user input. In Chapter 11, we will build a simple custom graphics renderer using Headless Chrome to show how easily we can begin building custom graphics using HTML and JavaScript.

Lastly, Microsoft has introduced a new card format called Adaptive Cards.¹⁰ Adaptive Cards, which we will explore in Chapter 11, are a platform-agnostic way to describe layouts of text, images, and input fields using a simple container-based layout engine. The Microsoft Bot channel connectors are then able to render the cards into platform-specific renderings. Adaptive Cards are a specialized version of the custom graphics approach integrated with logic to generate buttons and behavior in a card. It remains to be seen how many channels will end up supporting this format, but many of the Microsoft-owned channels already do.

Figure 4-20 shows an example of an HTML rendering of an Adaptive Card.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig20_HTML.jpg — Figure 4-20
Adaptive Card sample

Figure 4-21 shows the rendering of an input form card on Microsoft’s Teams app.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig21_HTML.jpg — Figure 4-21
Input form card sample

Other Functions

Bots may include several other interesting pieces of functionality that can really make a bot experience shine. Some of these pieces of functionality are the following:

Proactive messaging: A bot can reach out to a user asynchronously, triggered by an event other than an incoming message. If the bot stores a user’s address (combination of service URLs and conversation and user IDs), it can utilize it to communicate to the user.
Human handoff: In customer service scenarios and highly visible public-facing bot deployments, having a mechanism to seamlessly transfer the conversation from a bot to a human agent is a requirement for a successful bot.
Payments: More and more platforms are opening their payment systems for easy conversational integration. Facebook Messenger has its Payments program with easy Stripe/PayPal integration. Microsoft provides easy Stripe integration for payments across the entire Windows ecosystem and the Bot Framework.
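As a rough illustration of proactive messaging, the sketch below stores the address of each user the bot has seen and later uses it to build an outgoing message. The `Address` fields follow the description above (service URL, conversation ID, user ID); the class names, the dictionary layout of the outgoing message, and the in-memory storage are assumptions, and actually delivering the message is left to whatever connector the bot uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    service_url: str      # where the channel accepts outgoing messages
    conversation_id: str  # which conversation to post into
    user_id: str          # who the message is for


class AddressBook:
    """In-memory store; a real bot would persist addresses in a database."""

    def __init__(self):
        self._by_user = {}

    def remember(self, address):
        # Called whenever an incoming message arrives, so the address stays current.
        self._by_user[address.user_id] = address

    def proactive_message(self, user_id, text):
        """Build an outgoing message for a stored user, or None if we never saw them."""
        address = self._by_user.get(user_id)
        if address is None:
            return None
        return {
            "service_url": address.service_url,
            "conversation_id": address.conversation_id,
            "user_id": address.user_id,
            "text": text,
        }


book = AddressBook()
book.remember(Address("https://channel.example.com", "conv-1", "user-42"))
print(book.proactive_message("user-42", "Your order has shipped."))
```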

Conversational Experience Guidelines

There are some key guidelines that we should follow when developing a bot experience. Some of them may not apply to every type of bot or may be more relevant to consumer versus enterprise bots, but one should keep this list in mind at a minimum when designing bots.

Focus

As discussed in Chapters 1 and 2, there are limits to the technology and how intelligent a bot can be. Our bots should not try to get too clever; humans will always be able to break the bot in one way or another. For example, it is quite OK to handle greetings from the user like “hi” or “hello.” We do not want to go down the rabbit hole of handling every different type of greeting. Don’t start creating specialized responses for “what’s up?” versus “hi.” If you are reading this book, you most likely don’t have the budget that Microsoft or Google has (Figure 4-22). We are here to help with tasks, not general AI. It is OK to be honest about our bot’s limitations.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig22_HTML.jpg — Figure 4-22
Good advice for building bots

Don’t Pretend the Bot Is a Human

We do not want our bots to end up in the uncanny valley.¹¹ That is, as with most, if not all, human-like objects, real humans will feel that something is not quite right, leading to strange and eerie feelings (Figure 4-23). We do not want our users to get those feelings. This goes hand in hand with representing your bot with human likeness. If you are representing your bot via an avatar, use an icon that clearly suggests a nonhuman entity. Siri and Cortana do this very well.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig23_HTML.jpg — Figure 4-23
We’re definitely in the uncanny valley

Do Not Gender Bots

There is plenty of writing around this topic.¹² It is worth noting that even though Siri, Cortana, Alexa, and some of the older virtual assistants have female names, Google and Facebook have opted for Google Assistant and M. This trend of nongendered bots has continued in the industry. Adopting female personas can quickly get weird, as when taken to the extreme with the sexualization of AI in the movie Her.

Always Present the Next Best Action

Our bot should never leave a user hanging without the user knowing what to do next. The bot should have a welcome message introducing itself and its capabilities and offering the user some options to get started. When the user is confused and asks for help or the bot is unable to recognize the user’s input, the bot should suggest some options as well. The key point is that if at any point of the conversation the user is met by a blank message box with no suggested next steps, it becomes a confusing conversational experience. Facebook Messenger, Skype, and other channels have a contextual quick-reply feature that presents button options at the bottom of the chat interface (Figure 4-24). Presenting such suggestions is a great way to communicate our bot’s capabilities and limitations.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig24_HTML.jpg — Figure 4-24
Next best actions

Have a Consistent Tone

Bots typically will end up getting a name and personality. Although I don’t think gendered names make sense, your bot should have a personality and a consistent tone. Remember, these are brands speaking to your customers. Some bots are chatty. Others are less so. Some are formal. Others are more relaxed. Choose one for your bot and keep it consistent. And, although it is interesting technology, we should avoid using natural language generative models (machine learning algorithms that automatically generate responses) if we want to keep a brand-centric voice.

Utilize Rich Content

Bots provide us with the opportunity to utilize more than just text. We can format text and include images, videos, and audio files. We can render cards (Figure 4-25) and even create some custom graphics in our cards. We need to utilize those features to their fullest extent.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig25_HTML.jpg — Figure 4-25
Rich bot content is a good idea

Be Forgiving

Natural language is tricky. Expect user inputs to be vague. Our bots should have conversational paths to confirm information or elicit missing data. If the user is expected to enter a number, we should parse any possible input but also be clear about the bot’s expectations. If possible, use a quick-reply feature to suggest values the user could enter. The user will be pleased by such suggestions. There’s nothing more frustrating than not knowing, and not being told, how to communicate with a bot.
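As an illustration of lenient numeric input, the sketch below accepts digits embedded in a sentence as well as a handful of spelled-out numbers, and pairs the reprompt with suggested values. The function names, the word list, and the reply shape are assumptions for this example; a production bot would more likely lean on its language-understanding service or a dedicated recognizer library.

```javascript
// A deliberately small, hypothetical vocabulary of spelled-out numbers.
const NUMBER_WORDS = {
  zero: 0, one: 1, two: 2, three: 3, four: 4, five: 5,
  six: 6, seven: 7, eight: 8, nine: 9, ten: 10,
};

// Return a number found in the user's reply, or null if none is recognized.
function parseNumber(input) {
  const text = String(input).trim().toLowerCase();
  // Digits anywhere in the reply, ignoring thousands separators ("1,200").
  const digits = text.replace(/,/g, "").match(/-?\d+(\.\d+)?/);
  if (digits) return Number(digits[0]);
  // Otherwise look for a spelled-out number ("two tickets").
  for (const word of text.split(/[^a-z]+/)) {
    if (Object.prototype.hasOwnProperty.call(NUMBER_WORDS, word)) {
      return NUMBER_WORDS[word];
    }
  }
  return null;
}

// When parsing fails, state the expectation and offer example values.
function askAgain() {
  return {
    text: "How many would you like? You can reply with a number such as 2.",
    suggestions: ["1", "2", "3"],
  };
}
```

The point of the design is that parsing is generous while the reprompt is explicit: the bot accepts whatever it can and only then explains what it expects.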

Avoid Getting Stuck

At any point in the conversation, the user should be able to change the topic. Our bot should try not to get stuck in a conversation context unless absolutely necessary. For example, let’s assume a calendar bot is asking the user for a date. Our bot expects a string that resolves to a date. If the user enters “delete tomorrow’s 9am appointment,” our bot should handle the query gracefully instead of saying something like “I’m sorry that is not a date. Please enter a date in the format mm/dd/yyyy.”
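The calendar example can be sketched as a prompt handler that checks for a topic change before it reprompts. Everything here is hypothetical: detectIntent is a keyword stand-in for a real language-understanding call, parseDate only accepts mm/dd/yyyy to keep the example short, and the action names are invented for illustration.

```javascript
// Stand-in for a call to an NLU service; keyword matching is an assumption.
function detectIntent(text) {
  if (/\b(delete|cancel)\b/i.test(text)) return "DeleteAppointment";
  if (/\bhelp\b/i.test(text)) return "Help";
  return null;
}

// Simplified date parser; a real bot should accept far more formats.
function parseDate(text) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(text.trim());
  if (!match) return null;
  const [month, day, year] = match.slice(1).map(Number);
  if (month < 1 || month > 12 || day < 1 || day > 31) return null;
  return { month, day, year };
}

// Decide what to do with a reply received while the bot is asking for a date.
function handleDatePromptReply(text) {
  const date = parseDate(text);
  if (date) return { action: "continue", date };
  // Not a date: see whether the user is asking for something else entirely.
  const intent = detectIntent(text);
  if (intent) return { action: "switchTopic", intent };
  // Only reprompt when the reply is neither a date nor a known request.
  return {
    action: "reprompt",
    text: "Which date did you mean? You can also tell me what else you'd like to do.",
  };
}
```

With this ordering, “delete tomorrow’s 9am appointment” yields a switchTopic action rather than a complaint about the date format.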

Don’t Abuse Proactive Messaging

Bots give us the ability to reach out to a user at any time, even when the bot has not received a message from that user. Do not abuse that privilege. In messaging applications, users get a notification any time they receive a message. There isn’t an easier way to get removed from the messaging app than constantly sending reminders or trying to re-engage. Some channels have specific policies around this as well. Be a good citizen within the messaging channels.
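A simple guard against over-messaging is a per-user throttle that every proactive send must pass through. The sketch below is one possible approach under stated assumptions: the limit of one message per day is arbitrary, the state lives in memory, and createProactiveThrottle is a name invented here. It does not replace checking each channel's own policies.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns a canSend(userId, now) function that allows at most
// maxPerWindow proactive messages per user within windowMs milliseconds.
function createProactiveThrottle(maxPerWindow, windowMs) {
  const sentAt = new Map(); // userId -> timestamps of recent proactive sends
  return function canSend(userId, now) {
    // Keep only the sends that still fall inside the window.
    const recent = (sentAt.get(userId) || []).filter((t) => now - t < windowMs);
    if (recent.length >= maxPerWindow) {
      sentAt.set(userId, recent);
      return false;
    }
    // Record this send and allow it.
    recent.push(now);
    sentAt.set(userId, recent);
    return true;
  };
}

// Example policy (an assumption): one proactive message per user per day.
const canSend = createProactiveThrottle(1, DAY_MS);
```

A caller would check canSend(userId, Date.now()) before sending a reminder and silently skip the send when it returns false. Replies to messages the user initiated would not go through this check.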

Provide a Clear Path to Humans

If there is one thing that should be clear by now, it is that bots cannot understand everything. Even with a limited scope of functions, there are going to be questions and issues that the bot will not be able to handle. Our bots should have an ability to somehow connect users to a human agent, if relevant to the use case. Whether it is displaying a phone number with a case number or having seamless integration into a live chat system, our bots should be clear in how our users can speak to humans for help with their issues (Figure 4-26). For example, I once encountered a bot that could answer frequently asked questions. I read a press release about the bot, so I decided to try it. I started the conversation and got a message about clicking a button. There were no buttons. I asked, “What can I do?” I got redirected to a human agent. At this point, I couldn’t do anything until a human dealt with my case. I also had no indication of how long it would take. Was their call center even open? Once the agent came around, I spoke to them and got sent back to the bot. I had total silence, no buttons. I said “Test.” The next message I got was that I was getting transferred again. At this point, I just quit. Don’t make your users quit in frustration.

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig26_HTML.jpg — Figure 4-26
Clear path to talking to a human

Learn from Your Users

It is simple to use a conversational experience to collect data from users. It is also easy to use user input to resolve conflicting intents from LUIS and then use that data to train LUIS. Of course, the weight given to user input should be very different from the weight given to the utterances provided by a trainer. But if we have the data, we should use it to our advantage. Figure 4-27 shows an example of how we can implement such an approach. In the diagram, we store user feedback in an active learning data store, and our active learning process determines how much of the same feedback it should observe before using the data point to train LUIS. Be careful with automated training based on user input. You do not want to go the way of Tay.¹³

../images/455925_1_En_4_Chapter/455925_1_En_4_Fig27_HTML.jpg — Figure 4-27
Implementing active learning
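The thresholding step in this approach can be sketched as a small store that counts identical feedback and only surfaces a data point once enough users agree. This is an illustrative assumption, not the implementation behind the figure: the store is in memory, the threshold is arbitrary, and no LUIS API is called. The output is a list of candidates that a person or a separate process would still need to review before training.

```javascript
// Count how often users confirm that an utterance maps to a given intent.
function createActiveLearningStore(threshold) {
  const votes = new Map(); // JSON-encoded [utterance, intent] -> count
  return {
    // Record one piece of user feedback and return the running count.
    record(utterance, intent) {
      const key = JSON.stringify([utterance.trim().toLowerCase(), intent]);
      const count = (votes.get(key) || 0) + 1;
      votes.set(key, count);
      return count;
    },
    // List the data points seen at least `threshold` times.
    readyForTraining() {
      const ready = [];
      for (const [key, count] of votes) {
        if (count >= threshold) {
          const [utterance, intent] = JSON.parse(key);
          ready.push({ utterance, intent, count });
        }
      }
      return ready;
    },
  };
}
```

For example, with a threshold of 3, an utterance that three different users confirm as the same intent becomes a training candidate, while a one-off confirmation stays in the store. Keeping a review step between this list and the model is one way to respect the warning about automated training.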

There are more rules you may pick up on as you gain experience in this space across different messaging channels, but this list is a good starting point and something I suggest we follow on every chat bot project.

Conclusion

Conversation design is a rich field. We have numerous options for how we interact with users and how we communicate ideas in formats other than text. When developing bots, our approach should always be “do right by the user.” A user’s conversational experience can be very sensitive to tone, branding, verbosity, and overuse (you don’t need to use a card for everything). Although the space is still in its early phases, some key abstractions, such as cards, have developed to best handle bot-to-user interactions. As bots become more commonplace, these mechanisms will improve and increase in number. Microsoft’s Adaptive Cards project, for example, attempts to push the boundaries of what kind of functionality a bot can provide in conversation with users. My hope is that as bots become more and more commonplace, the messaging channels will support more and more types of behavior from bot cards.

We now have a good base understanding of the common operations bots perform and how they do so. The only remaining question is, how do we put all of this together in code? In the following chapter, we’ll do just that and put these ideas into practice.
