In recent years, chat bots and artificial intelligence (AI) have become a hot topic in the tech sector and the public imagination. Chat bots, computer programs that can communicate using natural language, are doing everything from ordering pizza to buying clothes to saving money on parking tickets1 to negotiating among themselves.2 Initially, developing a chat bot was tantamount to developing an integration with a messaging platform. There was no easy way to represent a conversation flow in code. When Microsoft created the Bot Framework and the Bot Builder SDK, this changed. Microsoft created a rich environment in which the developer was liberated from the concerns of integrating with individual channels and could focus on writing code that performed the conversational tasks a chat bot needs to accomplish. The Bot Builder SDK provided a generic approach to the development of conversational experiences. Microsoft’s Bot Connectors implemented the logic to translate from the generic format to channel-specific messages.
The result is that chat bot development has become significantly more accessible to millions of developers. Engineers no longer have to learn the ins and outs of integrating with something like Facebook’s Messenger APIs or Slack’s Web API. Instead, developers focus on core bot logic and the conversational experience. Microsoft worries about the rest.
The Bot Builder SDK is available for .NET and Node.js and is run as an open source MIT-licensed project on GitHub.3 The team is active in both development and responding to the various issues that development teams run into. And the team is friendly to boot!
In December 2017, Microsoft made both the Bot Framework and the Language Understanding Intelligent Service (LUIS) generally available. LUIS is Microsoft’s natural language service that will aid us in adding conversational intelligence to our bots. The Bot Framework is now also called the Azure Bot Service; the two refer to the same thing. As implied by the name, the Azure Bot Service is now a full-fledged part of Microsoft’s Azure cloud offering. Microsoft has also provided free tiers of the service so we can play with the framework to our heart’s content. All of the samples and techniques in the book can be experimented with at no cost!
Over the last few years, all the big tech companies like Microsoft, Facebook, and Google, as well as many smaller ones, have been taking a stab at creating the best and easiest-to-use chat bot development frameworks. The field is very dynamic. Frameworks come and go. Things seem to change daily. Despite the space’s dynamic nature, Microsoft’s Bot Framework remains the best platform for developing powerful, fast, and flexible chat bots. I am thrilled to take you on a journey through chat bot development using this tool.
The Expectations Game
For more than two years now, a substantial chunk of my conversations with customers has been spent on discussing chat bot capabilities, what they are, and, more importantly, what they are not. Our culture largely confounds chat bot abilities with artificial intelligence, and it is easy to see why. Some chat bots employ rich natural language capabilities, leading us to imagine there is more to them. Likewise, voice-based digital assistants such as Cortana, Alexa, and Google Assistant live in our homes and may be spoken to like real humans. Why wouldn’t chat bots display more intelligence?
The culture is additionally permeated with references to the likes of IBM’s Watson on Jeopardy,4 the New York Times’ feature on the Google Brain team5 and their feats in language translation using deep learning, self-driving cars, and AlphaZero destroying the world’s highest-rated chess-playing engine after only four hours of learning how to play chess.6
These and many other stories highlight the investment and interest in these techniques, foreshadowing the kind of AI-driven interactions with our devices that we are heading toward. Developments in the field of AI have changed the way we interact with, as well as what we expect from, our technology. Assigning human attributes and abilities to our devices is becoming more prevalent. Thinkers in the cognition and science-fiction spaces have long grappled with this possibility as popularized by Asimov’s Three Laws of Robotics, a set of rules that robots obey to ensure the robots don’t go after humans. And now that there are some clear and concrete AI examples in the real world, that kind of reality seems so much closer.
Yet, reality does not match the expectations set forth by AI’s successes in some very specific problem areas. Although we have made tremendous leaps in natural language processing, computer vision, emotion detection, and so forth, composing all of these pieces into a human-like intelligence, usually referred to as artificial general intelligence (AGI), is not yet within our grasp and is not a realistic target for chat bots. For every article that celebrates the tremendous achievements in the AI space, there’s a matching article downplaying the hype around the same technology and showing examples of why this type of AI is still far from perfect (think of the articles showing all the images that computer vision algorithms still can’t correctly classify). As with any technology that has been hyped up in the media, we must be reasonable with the expectations we set on it.
Are our bots going to be agents with human-level intelligence having conversations with our users? No. Given the technology and the tasks we want our bots to accomplish, can we make our bots perform those tasks very well? Absolutely. This book aims to equip the reader with the necessary skills to build compelling, engaging, and useful chat bots. It is up to you, the engineer, how many of the latest AI techniques you want to incorporate during this journey. Certainly, these techniques are not required for a great chat bot.
What Is a Chat Bot?
At the most basic level, a chat bot, also referred to simply as a bot throughout this book, is a computer program that can take user input in natural language and return text or rich media to the user. The user communicates with the chat bot via a messaging app, such as Facebook Messenger, Skype, Slack, and others, or via a voice-activated device such as the Amazon Echo, Google Home, or Harman Kardon’s Invoke powered by Microsoft’s Cortana.
Figure 1-1 illustrates our first bot built using Microsoft Bot Framework. This bot simply returns the same message to the user prefixed by the string “echo: ”. The logic that runs this experience on the Bot Framework is brain-dead simple.
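The echo behavior itself can be sketched without any framework. The following is a minimal illustration in plain JavaScript, not the Bot Builder code; the `echoReply` function and the `{ text }` message shape are assumptions made for this example.

```javascript
// Hypothetical message shape: { text: string }.
// Returns the reply an echo bot would send back to the user.
function echoReply(message) {
  return { text: 'echo: ' + message.text };
}

console.log(echoReply({ text: 'hello' }).text); // prints "echo: hello"
```

A framework adds the parts this sketch leaves out: receiving the message from a channel and delivering the reply back to it.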

A simple echo bot

Cats are OK

Dogs are way better!
The code for this one is shown next. We make a request to YouTube and translate the response from YouTube format to Bot Framework cards.
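As a rough sketch of the translation step only (the request to YouTube is omitted), the function below maps one search result item to a hero-card-style attachment. The item fields follow the YouTube Data API search result shape and the card follows the Bot Framework hero card JSON as I understand them; `toHeroCard`, the sample item, and the button title are hypothetical, so verify the field names against the official documentation before relying on them.

```javascript
// Converts one item from a YouTube search response into a
// hero-card-style attachment. Only the fields used here are read.
function toHeroCard(item) {
  const videoUrl = 'https://www.youtube.com/watch?v=' + item.id.videoId;
  return {
    contentType: 'application/vnd.microsoft.card.hero',
    content: {
      title: item.snippet.title,
      text: item.snippet.description,
      images: [{ url: item.snippet.thumbnails.default.url }],
      buttons: [{ type: 'openUrl', title: 'Watch', value: videoUrl }]
    }
  };
}

// Hand-written sample item; a real one would come from the API response.
const sampleItem = {
  id: { videoId: 'abc123' },
  snippet: {
    title: 'Cats being cats',
    description: 'A short clip.',
    thumbnails: { default: { url: 'https://example.com/thumb.jpg' } }
  }
};

const card = toHeroCard(sampleItem);
console.log(card.content.buttons[0].value);
// prints "https://www.youtube.com/watch?v=abc123"
```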

A simple example of utilizing AI to drive a conversation

Proactive user messaging

A simple calendar bot integrated with Google Calendar
Now things are starting to get a bit more interesting. We are starting to take natural language and to act on it.
Why Now?
Why are bots becoming such a big deal? Certainly, they have existed in all kinds of incarnations in old-school apps like IRC7 and AOL Instant Messenger.8 And these were not little experiments. IRC bots have been around for a long time. I remember interacting with quite a few bots over IRC. Being young and naïve when it came to technology, I initially thought there was an actual human responding to my messages. I quickly grasped the idea that there was a machine sitting somewhere responding to what I was writing. The more I interacted with IRC bots, the more I treated them like a command line. This, however, was all pretty niche technology at the time. The public wasn’t interacting with bots on a daily basis so there was no need to cater to natural language interactions.
Today, the way we interact with the technology around us is completely different, and it is driven by three forces: advancements in AI, the idea of messaging apps as a conversational intelligence platform, and voice-activated conversational interfaces.
Advancements in Artificial Intelligence
Throughout the 20th century, computer scientists, biologists, linguists, and economists made tremendous strides in the fields of cognition, artificial intelligence, artificial life, machine learning, and deep learning. The very concept of a computer program executing instructions (the Universal Turing Machine)9 and the idea of a computer architecture that can digitally store code and execute it, taking inputs and producing outputs (the Von Neumann architecture),10 are recent by the standards of human history, yet they are the underlying concepts that our work on computers is based on. The beginnings of the ideas around neural networks were first published in 1943 by McCulloch and Pitts in their paper “A logical calculus of the ideas immanent in nervous activity.”11 In 1950, Asimov included the Three Laws of Robotics in his book I, Robot.12 That same year, the first paper describing how a computer can play chess, “Programming a Computer for Playing Chess” by Claude Shannon, was published. Shannon is also credited with essentially inventing the field of information theory.13 From the 1960s onward, the amount of research and growth in the space has been mind-blowing; we see proof of this every day in media coverage of the latest AI applications.
Suffice it to say, since the 1960s, machine learning and the process of building our own models using a variety of algorithms have become better performing and more accessible. Libraries such as scikit-learn for Python and Google’s TensorFlow, among many others, are well documented with strong community support. The big technology firms have also invested enough in their computational capacity and power to be able to work on some of the most computationally intensive tasks in a reasonable time frame. Microsoft, Amazon, Google, IBM, and others are now involved in cloud platforms in one way or another. The next step has been to offer some of these machine learning algorithms on demand. If we simply examine Microsoft’s Cognitive Services as an example, we find 30 APIs at the time of this writing. These include computer vision tools like Face and Emotion Detection, Content Moderation, and OCR capabilities. They also include language tools such as Natural Language Processing, Linguistic and Text Analytics, and Natural Language Understanding. They even include search and knowledge tools such as Recommendations engines and Semantic Search. The availability of services that any developer can plug into at any time to access these powerful features at a reasonable cost is a significant reason why intelligent systems are becoming so much more prevalent in our lives and is one of the great pieces of infrastructure that our bots can take advantage of. We will look at Microsoft’s Cognitive Services in Chapter 10.
Messaging Apps as a Conversational Intelligence Platform
Mobile messaging apps have become all the rage in recent years. Snapchat, Slack, Telegram, iMessage, FB Messenger, WhatsApp, and WeChat are some of the most used apps on a mobile user’s phone. In fact, their usage has surpassed that of social networks such as Facebook. According to Business Insider, messaging apps overtook social networks in usage sometime around the first quarter of 2015, and the trend has continued since then. Although this book will not get into details around all the relevant players in the U.S. and global markets, the key point is that Asia-based messaging apps such as WeChat and LINE have figured out the best way to grow usage via chat apps and how to monetize that usage. The monetization trend has not yet fully caught up to the U.S. market, but firms like Apple, Twitter, and Facebook have been leading the way by allowing developers to create easy chat bot and even payment integrations. I do not mean to limit the discussion to said players; the trend of opening access to messaging platforms is prevalent across the board.
The ability to host these bots within an existing messaging platform opens brands up to significantly more customers. The user experience stays within the messaging application. The bot developer does not need to concern herself with things like animations and memory management as a mobile app developer might; the main concern is the conversation with the user. One of the interesting concepts that we will encounter throughout the book is that bots are not just text. They can include images, videos, and audio as well as buttons to invoke other commands. The creation of a conversational experience within the confines of an existing messaging application is an exercise in writing an app within an app; our bot is constrained by the native features supported by the messaging platform. The Bot Framework has the necessary facilities to take full advantage of all these features.
Voice-Activated Intelligent Assistants
Another factor significantly accelerating the development of conversational intelligence technologies is the development of voice-enabled hardware devices. One of the more significant modern virtual assistants, Siri, was introduced by Apple in 2011. Siri, now a household name, is powered underneath the hood by some of the technology behind one of the most well-known desktop voice recognition systems, Nuance’s speech-to-text product, Dragon NaturallySpeaking.
Siri was the first to market, seemingly encouraging many other companies to jump into the voice assistant game. Microsoft released its Cortana Assistant in 2014, the same year that the first Amazon Echo device was released. Cortana was initially limited to Windows Phone and the Windows desktop operating system but was later made available on mobile operating systems and even Xbox. Amazon’s Echo, featuring the Alexa voice assistant, was the first commercially successful stand-alone hardware device and has allowed Amazon to dominate the voice assistant market early on. In subsequent years, Facebook and Google have introduced M (shut down as of early 2018) and Google Assistant, respectively. Google is jumping into the voice device game with Google Home. Harman Kardon is bringing a product called Invoke into the market, a Microsoft Cortana–powered speaker. Many other players are expanding into the market, further encouraging innovation in the space.
This increased activity and competition have been accelerated by improvements in AI and speech recognition, natural language processing, and natural language understanding technologies. The significant build-up of these technologies has increased the activity in terms of standards, frameworks, and tools to create custom capabilities for these platforms. As we will soon see, these custom capabilities, or skills, can be backed by a chat bot.
Why Should We Create Bots?
Why would we want to write bots and use messaging apps as a platform? We could just as easily write mobile apps, publish to the app store, and be done with it, no? Not exactly. There are a variety of trends in user behavior that are making this approach less feasible.
When it comes to some of the bigger brand names, downloading their app is a simple task. I want to use Facebook? Fine, I’ll get the app. I want to check my e-mail; I’ll use the app. But, I want to talk to my local flower shop? I don’t need an app for that. I don’t want an app for that. Why should I download an app for every single business I have contact with? Ideally, I would just be able to call them or, really, just text them, right?
The moves that firms are making in the market are allowing users to talk to a business directly. Let’s take Facebook as an example. A local flower shop can have a Facebook page and enable messaging on the page. Business employees can respond to customer queries in one place. Twitter has a similar feature with its new Direct Message API. That offers a lot of value for businesses. The removal of the app download friction makes it so much easier for users to begin conversing with businesses. The next step, of course, is to automate some of that communication. This is where bots come in. The messaging platform takes care of numerous concerns such as user identity, authentication, overall app stability, and so forth.

Slack bot listing
Although Slack’s listing contains a specific category called Bots, the fact is that all of these apps are bots. Some of them might be more conversational, and others could have a more command-line feel to them; as far as we are concerned, a bot is simply listening to messages and acting upon them. For the heavy conversational kinds of chat bots, the topic of natural language understanding, the discipline concerned with understanding human language, is essential for a good user experience. As such, we dedicate Chapters 2 and 3 to the topic.
Bot Anatomy
Bot runtime
Natural language understanding engine
Conversation engine
Channel integrations
Bot Runtime

Message exchange between user, messaging platform, and a generic bot

Message exchange between the user, messaging platform, connector service, and bot using the Bot Framework
Since the bot runtime is simply a computer program listening to an HTTP endpoint, we can develop the bot using any technology that allows us to receive HTTP messages. We can use .NET, Node.js, Python, or PHP. In fact, we could simply use the Bot Framework to take advantage of the connectors and implement the HTTP endpoint using any approach we would like. If we did, however, we would lose out on the Bot Builder SDK. We will cover its benefits and reasons to use it in the “Conversation Engine” section later in this chapter.
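To make the idea of a bot as an HTTP endpoint concrete, here is a minimal sketch that uses only Node's built-in `http` module and no bot framework. The `handleMessage` and `createBotServer` names and the `{ text }` JSON payload are assumptions for illustration; a real channel or connector defines its own message format and its own way of delivering replies.

```javascript
const http = require('http');

// Hypothetical bot logic: takes a parsed incoming message, returns a reply.
function handleMessage(incoming) {
  return { text: 'You said: ' + incoming.text };
}

// Builds (but does not start) an HTTP server that accepts POSTed JSON messages.
function createBotServer() {
  return http.createServer((req, res) => {
    if (req.method !== 'POST') {
      res.writeHead(405); // only POST is accepted
      res.end();
      return;
    }
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      let incoming;
      try {
        incoming = JSON.parse(body);
      } catch (err) {
        res.writeHead(400); // body was not valid JSON
        res.end();
        return;
      }
      const reply = handleMessage(incoming);
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(reply));
    });
  });
}
```

Calling `createBotServer().listen(3978)` would start the endpoint; the port number is arbitrary. For simplicity this sketch answers in the HTTP response itself, which is not necessarily how a given messaging platform expects replies to be sent.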
Natural Language Understanding Engine
Writing a chat bot that reads and understands users’ utterances is challenging. Human language is unstructured input with flexible and inconsistent rules. And yet, our bots need to be able to take those inputs and figure out what the user is talking about. At a high level, natural language understanding engines solve two problems for the bot developer: intent classification and entity extraction.
A Sample Mapping of User Input to Intent, As Resolved by an NLU System
| Utterance | Intent | Entity |
|---|---|---|
| “Turn on” | TurnOn | none |
| “Power off” | TurnOff | none |
| “Set to 68 degrees” | SetTemperature | “68 degrees” Type: Temperature |
| “Set mode to cool” | SetMode | “cool” Type: Mode |
Clearly, it is easier for our code to perform logic based on the intent and entity values, as opposed to a raw user utterance.
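The kind of structured result an NLU engine hands back can be illustrated with a toy resolver. This sketch uses regular expressions as a stand-in for a trained model, so it only recognizes the four utterances from the table; the `rules` list, the `resolve` function, and the result shape are assumptions for this example and not the output format of any particular NLU service.

```javascript
// Toy stand-in for an NLU service: fixed patterns instead of a trained model.
const rules = [
  { intent: 'TurnOn', pattern: /^turn on$/i },
  { intent: 'TurnOff', pattern: /^power off$/i },
  { intent: 'SetTemperature', pattern: /^set to (\d+ degrees)$/i, entityType: 'Temperature' },
  { intent: 'SetMode', pattern: /^set mode to (\w+)$/i, entityType: 'Mode' }
];

// Returns { intent, entities } for an utterance, or the None intent
// when no rule matches.
function resolve(utterance) {
  for (const rule of rules) {
    const match = rule.pattern.exec(utterance.trim());
    if (match) {
      const entities = rule.entityType
        ? [{ type: rule.entityType, value: match[1] }]
        : [];
      return { intent: rule.intent, entities };
    }
  }
  return { intent: 'None', entities: [] };
}

console.log(JSON.stringify(resolve('Set to 68 degrees')));
// prints {"intent":"SetTemperature","entities":[{"type":"Temperature","value":"68 degrees"}]}
```

A real NLU engine generalizes to phrasings it has never seen, which is exactly what fixed patterns like these cannot do.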
There are several services a bot developer can utilize to gain this NLU functionality. In the current technology environment, there are plenty of cloud-based APIs available, such as LUIS, Wit.ai, and Dialogflow, among others. LUIS is the richest and best-performing of this group and is the subject of an NLU deep dive in Chapter 3.
Conversation Engine

A sample bot conversation design diagram
The workflow always starts with the bot listening for user utterances. An utterance spoken by the user will be resolved to the intents in Table 1-1. If the intent is TurnOn or TurnOff, the bot can execute the right logic and respond with a confirmation message. If we receive a SetTemperature intent, our bot can verify that the Temperature entity exists. If not, we ask the user for it. Once we receive it, we can execute the right logic and send a confirmation response. SetMode would work similarly to SetTemperature in that we would confirm the existence of the entity and elicit it if it does not exist.
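The workflow just described can be sketched as a single decision function. This is an illustrative outline under assumptions: the NLU result is taken to have the shape `{ intent, entities: [{ type, value }] }`, and the `nextStep` name, the action labels, and the reply wording (including the mode prompt) are invented for this example.

```javascript
// Decides the bot's next move from an NLU result of the assumed shape
// { intent: string, entities: [{ type, value }] }.
function nextStep(nlu) {
  const find = (type) => nlu.entities.find((e) => e.type === type);
  switch (nlu.intent) {
    case 'TurnOn':
      return { action: 'turnOn', reply: 'The thermostat is now on.' };
    case 'TurnOff':
      return { action: 'turnOff', reply: 'The thermostat is now off.' };
    case 'SetTemperature': {
      const temp = find('Temperature');
      // Entity present: act on it. Entity missing: ask the user for it.
      return temp
        ? { action: 'setTemperature', value: temp.value, reply: 'Temperature set to ' + temp.value + '.' }
        : { action: 'prompt', expect: 'Temperature', reply: 'What temperature would you like to set?' };
    }
    case 'SetMode': {
      const mode = find('Mode');
      return mode
        ? { action: 'setMode', value: mode.value, reply: 'Mode set to ' + mode.value + '.' }
        : { action: 'prompt', expect: 'Mode', reply: 'Which mode would you like?' };
    }
    default:
      return { action: 'none', reply: 'Sorry, I did not understand that.' };
  }
}

console.log(nextStep({ intent: 'SetTemperature', entities: [] }).reply);
// prints "What temperature would you like to set?"
```

Note that this function is stateless: after it returns a prompt, something else has to remember that the bot is waiting for a temperature.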
This description of what a bot does based on user inputs is a conversation. The activity of designing the types of inputs, the output, and the transitions is called conversational experience design. We cover this topic in depth in Chapter 4.
A conversation engine is the engine that tracks incoming messages, processes them, and executes the state transitions between the conversation diagram nodes (also referred to as dialogs). It does so separately for each user. The state of the conversation is stored so that when the next user message comes into the bot, the bot knows what the user’s current state is. The Bot Framework does a great job of providing the conversation engine via the Bot Builder SDK.
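A toy version of such an engine fits in a few lines. The sketch below is not how the Bot Builder SDK is implemented; the `stateByUser` map, the `dialogs` table, and the `onMessage` function are hypothetical names, and state is held in memory where a real engine would use durable storage.

```javascript
// In-memory per-user state; a real engine would persist this somewhere durable.
const stateByUser = new Map();

// Each dialog takes (state, text) and returns { reply, next }, where `next`
// names the dialog that should receive the user's following message.
const dialogs = {
  root: (state, text) => {
    if (/^set temperature$/i.test(text)) {
      return { reply: 'What temperature would you like to set?', next: 'askTemperature' };
    }
    return { reply: 'Try "set temperature".', next: 'root' };
  },
  askTemperature: (state, text) => {
    state.temperature = text; // remember the answer for this user
    return { reply: 'Temperature set to ' + text + '.', next: 'root' };
  }
};

// Routes a message to the user's current dialog and records the transition.
function onMessage(userId, text) {
  if (!stateByUser.has(userId)) {
    stateByUser.set(userId, { dialog: 'root' });
  }
  const state = stateByUser.get(userId);
  const result = dialogs[state.dialog](state, text);
  state.dialog = result.next;
  return result.reply;
}

console.log(onMessage('alice', 'set temperature')); // alice is now being asked for a temperature
console.log(onMessage('bob', '70 degrees'));        // bob is still at the root dialog
console.log(onMessage('alice', '70 degrees'));      // prints "Temperature set to 70 degrees."
```

The same text, "70 degrees", produces different replies for the two users because each user's conversation is in a different state.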
Aside: Intents, Entities, Actions, Slots, Oh My!
There are multiple approaches to developing bots, but they can be summarized into two: bot engine and what I call bot conversation as a service. The bot engine was described earlier: we run our bot as a web service, call into NLU platforms as necessary, and use a conversation engine to route messages to dialogs. The bot conversation as a service approach was popularized by the likes of Dialogflow. The approach implies that the NLU resolution, conversation mapping, state, and transitions occur in the cloud on Dialogflow’s infrastructure. Your bot is then called by Dialogflow to modify responses or integrate with other systems.
When a user’s utterance maps to an intent and a defined set of entities, it is called an action. An action has an intent and a set of parameters. Based on our thermostat bot, we could define an action named SetTemperatureAction. This action is the SetTemperature intent with a Temperature parameter. The type of the Temperature parameter is the Temperature entity. When Dialogflow resolves an action, it can call into your bot to fulfill the action. In this model, the bot logic is focused on the execution of logic based on the NLU service’s resolution logic; the conversation engine is outsourced to the NLU service.
Action Definition for Setting a Temperature in Our Thermostat Bot
| Action | Name | Type | Required? | Prompt |
|---|---|---|---|---|
| SetTemperature | Temperature | Temperature | Yes | What temperature would you like to set? |
A More Complex Action Based on a Flight-Booking Bot
| Action | Name | Type | Required? | Prompt |
|---|---|---|---|---|
| Book Flight | From | City | Yes | Departure city |
| | To | City | Yes | Destination city |
| | Date | Datetime | Yes | When are you traveling? |
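An action definition like the flight-booking one above is essentially data plus a rule for what to ask next. The following sketch shows that idea under assumptions: the `bookFlight` object and the `nextPrompt` function are hypothetical and do not reflect the configuration format of Dialogflow or any other service.

```javascript
// Action definition mirroring the flight-booking table above.
const bookFlight = {
  name: 'BookFlight',
  parameters: [
    { name: 'From', type: 'City', required: true, prompt: 'Departure city' },
    { name: 'To', type: 'City', required: true, prompt: 'Destination city' },
    { name: 'Date', type: 'Datetime', required: true, prompt: 'When are you traveling?' }
  ]
};

// Returns the prompt for the first required parameter that is still missing,
// or null when the action is ready to be fulfilled.
function nextPrompt(action, collected) {
  const missing = action.parameters.find(
    (p) => p.required && collected[p.name] === undefined
  );
  return missing ? missing.prompt : null;
}

console.log(nextPrompt(bookFlight, { From: 'Boston' })); // prints "Destination city"
```

Once `nextPrompt` returns `null`, every required parameter has a value and the service would call into the bot to fulfill the action.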

Typical bot conversation as a service flow
The conversation as a service approach can be good at getting something up and running in short order. Unfortunately, this comes at a loss of some control and flexibility. Using the Bot Framework gets around these issues by allowing us full control over the bot engine.
Channel Integration
Building bots means addressing multiple messaging platforms. Your boss asks you to write a Facebook Messenger bot. You release it, and your boss congratulates you for your great work. He then asks you, “Can we add this as a web chat to our FAQ page?” Your bot code is tied to the Messenger Webhooks and the Send API. You waffle around and figure you can isolate some of the logic that communicates to Messenger behind a transport interface. You create a second implementation of the same interface that talks to your chat bot through web sockets. Now you have created your own abstraction of an interface between your bot and a messaging platform.
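The homemade transport abstraction from this scenario might look something like the sketch below. The payload shapes are invented for illustration and are not the real Messenger or web chat formats; `botLogic`, the two transport objects, and `handle` are hypothetical names.

```javascript
// Bot logic: knows nothing about any particular channel.
function botLogic(text) {
  return 'echo: ' + text;
}

// Each (hypothetical) transport converts between a channel's payload format
// and the plain text that the bot logic works with.
const messengerTransport = {
  parse: (payload) => payload.message.text,
  format: (text) => ({ message: { text } })
};

const webChatTransport = {
  parse: (payload) => payload.body,
  format: (text) => ({ body: text })
};

// Runs the same bot logic behind whichever transport the message arrived on.
function handle(transport, payload) {
  return transport.format(botLogic(transport.parse(payload)));
}

console.log(JSON.stringify(handle(messengerTransport, { message: { text: 'hi' } })));
// prints {"message":{"text":"echo: hi"}}
console.log(JSON.stringify(handle(webChatTransport, { body: 'hi' })));
// prints {"body":"echo: hi"}
```

Every new channel means another transport to write and maintain, which is the work a framework with built-in channel integrations takes off our hands.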
We want our bot logic to be abstracted away from the individual messaging platforms as much as possible. The details of how to receive messages from the channel and send responses are details we don’t want to concern ourselves with too much, unless we are the professionals building connectors into the various platforms. I don’t think you would be reading this book if you were. You want to develop a bot, not the infrastructure. Lucky for us, the different bot frameworks in the market typically do all of this for us, as illustrated in Figure 1-12. The frameworks allow us to write a bot in a channel-agnostic manner and then connect to those channels by going through a few clicks and entering some data. These features are usually called channels or channel integrations.
As is the case with many generic frameworks, there are some edge cases that the framework does not support because the platform feature is either too new or platform specific. In such cases, the framework should allow us to communicate to the platform in its native format. The Bot Framework provides a mechanism for this.

Your bot should not be concerned with which channels it talks to. That should be abstracted away for you.
We will cover channel and custom channel integrations in Chapters 9 and 10.
Conclusion
In this chapter, we took a quick look below the surface of the different components available to build bots. In my work, the Bot Framework has clearly won out against competitors that use a conversation as a service approach. The flexibility and control that the Bot Framework provides are a requirement for many enterprise scenarios. The Bot Framework also provides better and richer abstractions, deeper connector integration, and an open and diverse community. The Bot Framework team has created an incredibly powerful suite that can be the foundation for any conversational bot. My team and I have been using the Bot Framework for almost two years and have found no reason to abandon the platform. In fact, the framework’s approach to conversational engines and the connector architecture have proven resilient to any use cases we have thrown its way.
For these and many other reasons, this book revolves around using Microsoft’s Bot Framework as the framework of choice. The framework is available for the C#/.NET and Node.js development platforms. For the purpose of this book, we will utilize the Node.js version. We will not utilize any additional tools like TypeScript or CoffeeScript. We simply use vanilla JavaScript to show how easy and straightforward it is to get started writing bots using the Bot Framework SDK for Node.js, aka Bot Builder.
Hype or not, the technology and techniques utilized to build bots are truly amazing. As part of this adventure, I want to make sure that we not only cover the basics of building bots but learn more about some of the underlying techniques and approaches. We will not be diving very deeply into these topics, but I’ll cover enough to give the reader an introductory level understanding of how the intelligence in bots can be implemented to feel comfortable exploring more complex scenarios. In the interest of overall book focus, when I cover such topics, I will provide links and information for additional reading material to complement the content. I am not a data scientist, but I have done my best in introducing the relevant machine learning (ML) concepts.
We are about to embark on an exciting journey through the world of conversational design, natural language understanding, and machine learning as applied to chat bots. As we cover these topics and build bots, keep in mind that these techniques apply to everything from chat bots to voice assistant skills. With natural language and voice interfaces becoming more and more prevalent both at home and in the workplace, I guarantee you will apply these concepts in both current projects and future natural language apps. Let’s get going!