Before I was even working in the fields of data and engineering, I have been intrigued as to how bots and automated personal assistants (the likes of Siri, Cortana and Google Now) work.
I now know enough to make a machine learning powered bot from scratch. But actually the best answer I have came later: does it matter?
In this post, I’ll dwelve into this answer and leave the interesting details of bots (the Machine Learning) for a future one.
Bots are not meeting expectations
First things first, let’s just acknowledge that the whole bots revolution has failed to live up to the high expectations many had.
The appeal was understandable for non-experts, on a superficial level. Unlike other machine learning applications, Siri and text bots are as close to everyone’s concept of AI (aka HAL 9000) as possible. Before using them, it was hard not to get excited, all skepticism aside. Even if covering only a fraction of what general AI can cover, it would still be quite useful.
WeChat seemed to be having enormous success in Asia and it relied on a conversational interface that was yet to be applied elsewhere, so it was easy to assume that it would be a matter of time before it did. So the bot gold rush began and by 2015 the hype was still at its highest. Dan Grover, then a Product Manager at WeChat, recalls in his brilliant article:
Conversations, writes WIRED, can do things traditional GUIs can’t. Matt Hartman equates the surge in text-driven apps as a kind of “hidden homescreen”. TechCrunch says “forget apps, now bots take over”. The creator of Fin thinks it’s a new paradigm all apps will move to. Dharmesh Shah wonders whether the rise of conversational UI will be the downfall of designers. (…)
Benedict Evans prophecized that the new lay of the land is “all messaging expands until it includes software.”
In practice, any interaction with most bots would quickly reveal just how far away we are from this reality. Alex Hern writes:
The problems with existing chatbots begin with how they actually work. Almost uniformly, the initial examples of Messenger bots are disastrous: unable to parse any instruction that doesn’t fit their (entirely undocumented) expectations, slow to respond when they are given the correct command, and ultimately useful only for tasks which are trivial to perform through the old apps or websites.
Mashable reports a bird’s eye view of the industry:
The Information reported last week that Facebook is rethinking its development efforts on all automation fronts after research showed bots failed to adequately address 70 percent of customer questions and requests. (…)
A recent Forrester survey found that only four percent of digital business professionals use any sort of chatbot. The research firm concluded in its most recent report on the state of the practice that its hype as a “world-changing technology” has led to a “peak of inflated expectations” that have gone largely unmet.
It seems indeed we have just left the peak and we are halfway down on the trough of disillusionment:
Whilst one can attribute exagerated expectations alone, these likely resulted from not realizing that:
- a conversational interface is only one of many interfaces, and it has a much greater friction than a GUI in many situations (regardless of the bots intelligence)
- limitations in Natural Language Processing further constrain the number of successful conversational interfaces we can develop. For now, they need to be simple and limited in scope to be accurate (more on this on the next post).
To these fundamental issues, some temporary ones might have worsened the situation:
- lack of experience in the field (design practices are/were not as established)
- a low barrier of entry that encourages numerous poor submissions
Whilst improvements on NLP are hard to predict, an answer to the first fundamental issue is just reaching the market. Or better yet, we are back to the starting point.
A pure conversational interface doesn’t suffice, but combine it with a more complex GUI and this hybrid model might just be good enough.
While WeChat is popular and supports bots, it also supports web views mixed with a conversational model to provide a much better user experience. And that is claimed to be one of the key reasons of its success, not the conversational aspect.
Dan Grover compares the experience of ordering a pizza in Microsoft’s demo bot versus Pizza Hut’s account on WeChat. In WeChat the user starts in Pizza Hut’s account (a bot), but the interface really becomes a web like interface.
73 vs 16 taps. And note how in WeChat’s version the user is provided a wealth of information on what’s available, whereas in the previous the user is left in the dark. Layer is taking one step further, where each message has the potential to be a mini application.
Perhaps more importantly, the most understated advantages of the WeChat model have less to do with chat than with the fact that users can immediately access a number of services in a central platform, not requiring installation of separate apps, accounts to be made and logged in, payment information to be entered, and more friends lists to manage.
Apple/Android Pay in mobile browsers, the evergrowing Facebook walled garden, if not improvements at OS level, might all evolve to provide the frictionless experience of WeChat, bots not required.
As Dan puts it:
This maybe a bit disheartening to hear since creating bots powered by AI sounds super cool and cutting edge, while making mobile optimized websites definitely does not.
Between starting this review a few years ago, and finally doing a post, there were enormous changes in the field. The most significant might just be the development ecosystem that sprung up and makes writing simple bots almost trivial.
From not requiring development skills, to doing only the machine learning bits (open-source or not), some service appears to cover it. See for instance Rasa (open-source), Wit.AI and Microsoft’s Bot Framework.
I won’t list all the good services I found just yet, but other people have covered most of them:
TL;DR: text interfaces are not enough, these serve a niche purpose but they can be combined with other interfaces providing more possibilities for developers to apply where appropriate. WeChat’s success was not likely due to the conversational aspect alone, but perhaps in great part due to their frictionless user experience to reach out to services (rather than needing separate apps with separate accounts and onboardings). Also, you don’t really need Machine Learning skills anymore to make bots.
Still, aren’t you curious as to how Siri finds the weather? Now that I’ve made clear it’s not that important, I’ll soon make a post about it, guilt free.