Building bot-ready knowledge bases
Our experimental initiative to prototype bot-ready information solutions using Google’s Dialogflow
Anna van Raaphorst and Dick Johnson, VR Communications LLC - 4/26/20



Structured writing and AI technologies... A synergistic approach


What is structured writing?
• Term coined by Robert E. Horn, visiting scholar at Stanford University’s Center for the Study of Language and Information
• Key concept in the information mapping method of analyzing, organizing, and displaying knowledge in print and online

Sample benefits of structured writing
• Provides a consistent and intuitive experience to users
• Targets content to varying audiences
• Organizes large amounts of material
• Ensures the completeness of documents and knowledge bases
• Promotes information reuse


Structured writing and AI technologies... A synergistic approach (continued)


Benefits of AI in document production, management, and delivery
• Applies business intelligence to the process (e.g., combines structured and unstructured docs, increases relevance, reduces redundancy)
• Streamlines document preparation
• Speeds up document delivery

How we combined structured writing and AI technologies to provide synergistic information solutions
• Used existing structured knowledge bases as models/templates
• Expanded existing structured knowledge bases with additional metadata to train and validate an information chatbot
• Prototyped solutions using Google’s Dialogflow


Initial prototype project: GROCERYbot


Simple, 7-topic DITA/XML project

Originally written to explain the structured writing principles of DITA (Darwin Information Typing Architecture)

Originally used to illustrate, test, and validate the processing features of DITA Open Toolkit

In this project, used synergistically with an information chatbot in Google’s Dialogflow


GROCERYbot as a Google Dialogflow Web Demo


Snippets from the DITA-based grocery shopping KB


Value of metadata in DITA files

• Short description explains the target audience and purpose of an article

• Index terms help readers understand and locate information
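
To make the two metadata elements concrete, here is a minimal sketch of how they could be read programmatically. It assumes a topic that stores its short description in `<shortdesc>` and its index terms in `<indexterm>` elements inside the prolog, which is standard DITA markup. The sample topic, its wording, and the `extract_metadata` helper are hypothetical and are not taken from the grocery shopping files.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA concept topic; the content is invented for illustration.
SAMPLE_TOPIC = """\
<concept id="canned_goods">
  <title>Canned goods</title>
  <shortdesc>Explains to shoppers which canned goods are available.</shortdesc>
  <prolog>
    <metadata>
      <keywords>
        <indexterm>canned goods</indexterm>
        <indexterm>black olives</indexterm>
      </keywords>
    </metadata>
  </prolog>
  <conbody><p>Canned goods are shelved in aisle 3.</p></conbody>
</concept>
"""


def extract_metadata(topic_xml: str) -> dict:
    """Return the id, title, short description, and index terms of a topic."""
    root = ET.fromstring(topic_xml)
    return {
        "id": root.get("id"),
        "title": root.findtext("title", default="").strip(),
        "shortdesc": root.findtext("shortdesc", default="").strip(),
        # iter() finds index terms at any depth, including inside the prolog.
        "indexterms": [(t.text or "").strip() for t in root.iter("indexterm")],
    }


metadata = extract_metadata(SAMPLE_TOPIC)
print(metadata["shortdesc"])
print(metadata["indexterms"])
```

The same dictionary could later feed teaser text or FAQ entries, which is one reason to keep this metadata complete at authoring time.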


GROCERYbot project objectives


• Chat with the user at a minimal level
• Answer from a set of prescribed questions and answers
• Refer users to KB articles for more information
• Defer if user queries are beyond the bot’s restricted domain


GROCERYbot: Dialogflow console

• Intents are shorthand labels that represent user requests to a chatbot

• We created intents related to entities in our grocery shopping KB articles

• Entities are keywords

• The Knowledge section is where we stored our DITA output files and FAQs that serve as natural-language training material for the bot
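
As a rough illustration of the "entities are keywords" idea, the sketch below maps several surface forms to one reference value and looks for them in a user utterance. This is a toy stand-in, not Dialogflow's own matching; the entity name, its values, and the `detect_entity` helper are all hypothetical.

```python
import re

# Hypothetical entity: each reference value maps to the synonyms that
# should resolve to it. These entries are illustrative only.
CANNED_GOOD_ENTITY = {
    "large black olives": ["large black olives", "big black olives"],
    "tomato soup": ["tomato soup", "canned tomato soup"],
}


def detect_entity(utterance, entity):
    """Return the reference value of the first synonym found, or None."""
    text = utterance.lower()
    for reference_value, synonyms in entity.items():
        for synonym in synonyms:
            # Word boundaries keep "soup" from matching inside "soupy".
            if re.search(r"\b" + re.escape(synonym) + r"\b", text):
                return reference_value
    return None


print(detect_entity("Do you carry big black olives?", CANNED_GOOD_ENTITY))
print(detect_entity("Where is the bakery?", CANNED_GOOD_ENTITY))
```

An intent would then pair a label with example phrases that contain such an entity, so the detected value can be passed on as a parameter.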


GROCERYbot: Knowledge component

• We added 8 files to our Dialogflow Knowledge Base:
  • 7 PDF output files, one for each DITA source file
  • 1 set of FAQs in CSV format

• The slider bar controls how strongly we prefer Knowledge results

• The slider bar can be adjusted in response to the bot’s performance during testing and debugging
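
For readers who have not built an FAQ file, the following is a minimal sketch of generating one. It assumes the two-column layout we understand Dialogflow to expect for FAQ documents (question in the first column, answer in the second, no header row); check the current Dialogflow documentation before relying on it. The question-and-answer pairs and the `faqs_to_csv` helper are hypothetical.

```python
import csv
import io

# Hypothetical question-and-answer pairs; not the authors' actual FAQ set.
FAQS = [
    ("Where are the canned goods?", "Canned goods are in aisle 3."),
    ("Do you sell black olives?", "Yes, in small, medium, and large cans."),
]


def faqs_to_csv(faqs) -> str:
    """Serialize (question, answer) pairs as a two-column CSV, no header."""
    buffer = io.StringIO()
    # QUOTE_ALL keeps commas inside answers from splitting a column.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for question, answer in faqs:
        writer.writerow([question, answer])
    return buffer.getvalue()


csv_text = faqs_to_csv(FAQS)
print(csv_text)
```

Writing to a string keeps the sketch self-contained; in practice the same text would be saved to a `.csv` file and uploaded.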


GROCERYbot: Desired behavior

• If a user query exactly or closely matches an FAQ question or defined intent, answer the query

• If a user query can’t be answered with an FAQ answer, but is in the scope of a KB article, answer the question based on article text, or suggest the article

• If the user query is outside the scope of the grocery shopping KB, defer
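
The three rules above amount to a fallback cascade. The sketch below expresses that cascade in plain Python, assuming each candidate answer arrives with a confidence score between 0 and 1. The thresholds, the `Match` type, and the deferral message are hypothetical; in Dialogflow the equivalent tuning is done through the console rather than in code like this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    text: str          # the answer or article suggestion to show the user
    confidence: float  # 0.0 (no match) to 1.0 (exact match)


# Hypothetical thresholds, chosen only for illustration.
FAQ_THRESHOLD = 0.8
ARTICLE_THRESHOLD = 0.5
DEFER_MESSAGE = "Sorry, I can only help with grocery shopping questions."


def choose_response(faq_match: Optional[Match],
                    article_match: Optional[Match]) -> str:
    """Prefer an FAQ or intent answer, then a KB article, then defer."""
    if faq_match and faq_match.confidence >= FAQ_THRESHOLD:
        return faq_match.text
    if article_match and article_match.confidence >= ARTICLE_THRESHOLD:
        return article_match.text
    return DEFER_MESSAGE


print(choose_response(Match("Canned goods are in aisle 3.", 0.92), None))
print(choose_response(None, Match("See the article 'Canned goods'.", 0.61)))
print(choose_response(Match("Canned goods are in aisle 3.", 0.30), None))
```

Raising or lowering the two thresholds plays a role similar to adjusting the preference for Knowledge results during testing.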


Correcting defects: Adding webhook fulfillment to GROCERYbot

• The original DITA files contained price tables, which have two major problems:
  • The information is static
  • The tables are difficult to display on the web


To solve the pricing issues, we added a webhook

• Webhook fulfillment allows the bot to “look up” dynamic pricing information from an external data repository and return it in response to a user query about price

• We added a new intent, UserAsksForAttributeOfCannedGood, that handles user queries like “tell me the price of large black olives”

• Every time the intent is selected, a call is made to the pricehook.php script running on an external web server

• The argument to the script is a JSON object that contains the entity parameter detected in the user input (e.g., “large black olives”)
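
The request-and-response exchange described above can be sketched as follows. This is a Python illustration, not the pricehook.php script itself. It assumes the Dialogflow ES webhook format, in which the detected parameters arrive under `queryResult.parameters` and the reply text is returned as `fulfillmentText`. The parameter name `CannedGood`, the price table, and the `handle_webhook` function are hypothetical.

```python
import json

# Hypothetical price data standing in for the external data repository.
PRICES = {
    "large black olives": 2.49,
    "tomato soup": 1.29,
}

PRICE_INTENT = "UserAsksForAttributeOfCannedGood"


def handle_webhook(request_body: str) -> str:
    """Build a JSON webhook response for one JSON webhook request."""
    request = json.loads(request_body)
    query_result = request.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    # "CannedGood" is an assumed parameter name, not taken from the source.
    product = query_result.get("parameters", {}).get("CannedGood", "")

    if intent_name != PRICE_INTENT:
        reply = "Sorry, I can't help with that request."
    elif product in PRICES:
        reply = f"The price of {product} is ${PRICES[product]:.2f}."
    else:
        reply = f"Sorry, I don't have a price for {product or 'that item'}."
    return json.dumps({"fulfillmentText": reply})


sample_request = json.dumps({
    "queryResult": {
        "queryText": "tell me the price of large black olives",
        "parameters": {"CannedGood": "large black olives"},
        "intent": {"displayName": PRICE_INTENT},
    }
})
print(handle_webhook(sample_request))
```

Because the prices live outside the KB articles, they can change without republishing the DITA files.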


Webhook example


Increasing “bot-readiness”: Turning the DITA metadata into a bot training kit

• In a staged effort, we added metadata to the DITA files and made it available to the bot as a programmatic training effort

• We reasoned that the source files should be as complete as possible, so that:
  • The most critical and definitive content and metadata are produced by the articles’ authors, editors, and owners at the time of creation

• Having a well-structured and relatively complete content collection in the early stages of the bot-building project means that time can be saved in training, testing, and putting the bot-based KB set into production


Original and additional metadata in the DITA files


Integrating GROCERYbot with Telegram


Objectives of the Telegram integration

• Based on a relevant user query, select one or more KB articles from the grocery shopping set

• Use the article title and short description as “teaser text”

• Provide the user with a link to an article that seems to satisfy their query
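
One simple way to picture these three objectives is the sketch below. It selects articles by counting words shared between the query and each article's index terms, then formats a teaser from the title, short description, and link. The word-overlap scoring is a deliberately naive stand-in for the bot's real matching, and the article data, URLs, and function names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    shortdesc: str
    url: str
    indexterms: tuple


# Hypothetical articles and URLs, invented for illustration.
ARTICLES = [
    Article("Canned goods", "Lists the canned goods we stock.",
            "https://example.com/grocery/canned_goods.html",
            ("canned goods", "olives", "soup")),
    Article("Produce", "Describes the fresh fruit and vegetables we stock.",
            "https://example.com/grocery/produce.html",
            ("fruit", "vegetables", "apples")),
]


def select_articles(query, articles, limit=2):
    """Return up to `limit` articles whose index terms share words with the query."""
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    scored = []
    for article in articles:
        term_words = {w for term in article.indexterms for w in term.lower().split()}
        score = len(query_words & term_words)
        if score:
            scored.append((score, article))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [article for _, article in scored[:limit]]


def teaser(article):
    """Format the title and short description as teaser text with a link."""
    return f"{article.title}: {article.shortdesc}\n{article.url}"


for hit in select_articles("Do you have olives?", ARTICLES):
    print(teaser(hit))
```

A query with no overlapping words returns an empty list, which corresponds to the case where the bot should defer.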


Steps to integrating GROCERYbot with Telegram

1. Using the oXygen editor, create the DITA-based grocery shopping files and publish them to HTML

2. Upload the HTML files to a temporary website

3. Create a Python data-wrangling script


Steps to integrating GROCERYbot with Telegram (continued)

4. Run the Python data-wrangling script to pull the relevant information out of the metadata in the HTML files and put it into a CSV file

5. In the Dialogflow console, test the new version of GROCERYbot

6. In the Telegram app, create a bot instance called SHOPbot

7. In the Dialogflow console, connect GROCERYbot with SHOPbot

8. In the Telegram app, test SHOPbot with GROCERYbot
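
Steps 3 and 4 can be sketched as follows. This is not the script we ran; it is a minimal illustration that assumes each published HTML page carries its topic title in `<title>` and its short description in a `<meta name="description">` tag. The base URL, the three-column CSV layout, and all function names are hypothetical, and the pages are passed in as strings so the sketch runs without any files.

```python
import csv
import io
from html.parser import HTMLParser

BASE_URL = "https://example.com/grocery/"  # hypothetical temporary website


class TopicMetadataParser(HTMLParser):
    """Collect the <title> text and the description <meta> content."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_row(filename, html_text):
    """Return [title, description, url] for one published page."""
    parser = TopicMetadataParser()
    parser.feed(html_text)
    parser.close()
    return [parser.title.strip(), parser.description.strip(), BASE_URL + filename]


def pages_to_csv(pages):
    """Turn a {filename: html} mapping into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title", "description", "url"])
    for filename, html_text in sorted(pages.items()):
        writer.writerow(extract_row(filename, html_text))
    return buffer.getvalue()


# Hypothetical page content, invented for illustration.
SAMPLE_PAGES = {
    "canned_goods.html": (
        "<html><head><title>Canned goods</title>"
        '<meta name="description" content="Lists the canned goods we stock." />'
        "</head><body><h1>Canned goods</h1></body></html>"
    ),
}

csv_text = pages_to_csv(SAMPLE_PAGES)
print(csv_text)
```

In a real run the `pages` mapping would be filled by reading the published HTML files from disk, and the column layout would follow whatever the bot's knowledge base expects.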


Take-aways and lessons learned

• We are convinced that our experimental initiative validated the potential synergy between DITA-based structured writing and AI technologies, but…
  • The Dialogflow tool is still lacking in important features
  • Effective project guidance and roadmaps are in short supply
  • Creativity and customization are required to achieve a viable solution

• We hope to hear about similar efforts between structured writing proponents and chatbot creators


Who are we?

• VR Communications LLC does information architecture; linguistic analysis; text annotation; and writing, editing, and scripting in Python and R. We generally work in the following information domains:

  • Technology
  • Science
  • Languages and linguistics
  • History
  • Education

• We freelance on a work-for-hire basis, or pro bono for worthy causes


Who are we? (continued)

• VR Communications LLC principals are Anna van Raaphorst-Johnson and Richard (Dick) Johnson, a collaborative team who work collectively or individually.

• Anna is our content specialist: information architect, linguistic analyst, text annotator, writer, and editor. Contact Anna at: [email protected]

• Dick is our technology specialist: software engineer, web developer, writer, and researcher. Contact Dick at: [email protected]
