Building bot-ready knowledge bases
Our experimental initiative to prototype bot-ready information solutions using Google’s Dialogflow
Anna van Raaphorst and Dick Johnson VR Communications LLC - 4/26/20
Structured writing and AI technologies... A synergistic approach
4/26/20 Building bot-ready knowledge bases 2
What is structured writing?
• Term coined by Robert E. Horn, visiting scholar at Stanford University’s Center for the Study of Language and Information
• Key concept in the information mapping method of analyzing, organizing, and displaying knowledge in print and online
Sample benefits of structured writing
• Provides a consistent and intuitive experience to users
• Targets content to varying audiences
• Organizes large amounts of material
• Ensures the completeness of documents and knowledge bases
• Promotes information reuse
Structured writing and AI technologies... A synergistic approach (continued)
Benefits of AI in document production, management, and delivery
• Applies business intelligence to the process (e.g. combines structured and unstructured docs, increases relevance, reduces redundancy)
• Streamlines document preparation
• Speeds up document delivery
How we combined structured writing and AI technologies to provide synergistic information solutions
• Used existing structured knowledge bases as models/templates
• Expanded existing structured knowledge bases with additional metadata to train and validate an information chatbot
• Prototyped solutions using Google’s Dialogflow
Initial prototype project: GROCERYbot
Simple, 7-topic DITA/XML project
Originally written to explain the structured writing principles of DITA (Darwin Information Typing Architecture)
Originally used to illustrate, test, and validate the processing features of DITA Open Toolkit
In this project, used synergistically with an information chatbot in Google’s Dialogflow
GROCERYbot as a Google Dialogflow Web Demo
Snippets from the DITA-based grocery shopping KB
Value of metadata in DITA files
• Short description explains the target audience and purpose of an article
• Index terms help readers understand and locate information
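To illustrate how this metadata can be read programmatically, here is a minimal Python sketch that pulls the title, short description, and index terms out of a DITA topic. The sample topic and the function name `extract_metadata` are hypothetical and are not taken from the actual grocery shopping KB; the sketch assumes the index terms sit in the topic's prolog.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA concept topic; the wording is illustrative only.
SAMPLE_TOPIC = """<concept id="canned_goods">
  <title>Shopping for canned goods</title>
  <shortdesc>Helps shoppers choose canned goods.</shortdesc>
  <prolog>
    <metadata>
      <keywords>
        <indexterm>canned goods</indexterm>
        <indexterm>olives</indexterm>
      </keywords>
    </metadata>
  </prolog>
  <conbody><p>Body text.</p></conbody>
</concept>"""


def extract_metadata(topic_xml: str) -> dict:
    """Return the title, short description, and index terms of one topic."""
    root = ET.fromstring(topic_xml)
    title = root.findtext("title", default="").strip()
    shortdesc = root.findtext("shortdesc", default="").strip()
    # iter() walks the whole tree, so index terms are found wherever they sit.
    index_terms = [(el.text or "").strip() for el in root.iter("indexterm")]
    return {"title": title, "shortdesc": shortdesc, "index_terms": index_terms}


if __name__ == "__main__":
    print(extract_metadata(SAMPLE_TOPIC))
```

A dictionary like this is one possible intermediate form for feeding article metadata to a bot.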
GROCERYbot project objectives
• Chat with the user at a minimal level
• Answer from a set of prescribed questions and answers
• Refer users to KB articles for more information
• Defer if user queries are beyond the bot’s restricted domain
GROCERYbot: Dialogflow console
• Intents are shorthand labels that represent user requests to a chatbot
• We created intents related to entities in our grocery shopping KB articles
• Entities are keywords that the bot detects in user input (e.g. “large black olives”)
• The Knowledge section is where we stored our DITA output files and FAQs that serve as natural-language training material for the bot
GROCERYbot: Knowledge component
• We added 8 files to our Dialogflow Knowledge Base:
  • 7 PDF output files, one for each DITA source file
  • 1 set of FAQs in CSV format
• The slider bar controls how strongly we prefer Knowledge results
• The slider bar can be adjusted in response to the bot’s performance during testing and debugging
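As a rough illustration of the FAQ file, the following Python sketch writes question-and-answer pairs to CSV text. It assumes a two-column layout (question, then answer) with no header row, which should be checked against the current Dialogflow documentation; the sample pair and the name `faq_to_csv` are hypothetical.

```python
import csv
import io

# Hypothetical FAQ pair; not taken from the actual GROCERYbot FAQ set.
FAQ_PAIRS = [
    ("Where can I read about canned olives?",
     "See the article on shopping for canned goods."),
]


def faq_to_csv(pairs) -> str:
    """Serialize (question, answer) pairs as two-column CSV without a header."""
    buffer = io.StringIO()
    # QUOTE_ALL keeps commas and quotes inside answers from breaking columns.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for question, answer in pairs:
        writer.writerow([question.strip(), answer.strip()])
    return buffer.getvalue()


if __name__ == "__main__":
    print(faq_to_csv(FAQ_PAIRS), end="")
```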
GROCERYbot: Desired behavior
• If a user query exactly or closely matches an FAQ question or defined intent, answer the query
• If a user query can’t be answered with an FAQ answer, but is in the scope of a KB article, answer the question based on article text, or suggest the article
• If the user query is outside the scope of the grocery shopping KB, defer
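The three-tier behavior above can be sketched in plain Python. This is only an illustration of the decision order, not how Dialogflow matches intents internally; the FAQ entry, article keywords, and the 0.8 similarity threshold are all hypothetical values chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical data standing in for the FAQ set and the KB articles.
FAQS = {
    "how do i choose canned goods": "Check the label, size, and price per unit.",
}
ARTICLES = {
    "Shopping for canned goods": {"canned", "olives", "beans"},
    "Shopping for produce": {"fruit", "vegetables", "produce"},
}
FAQ_THRESHOLD = 0.8  # assumed cutoff for "exactly or closely matches"
DEFER_MESSAGE = "Sorry, I can only help with grocery shopping questions."


def respond(query: str) -> str:
    """Answer from the FAQs, else suggest an article, else defer."""
    normalized = query.lower().strip(" ?!.")

    # Tier 1: answer when the query closely matches an FAQ question.
    best_question, best_score = None, 0.0
    for question in FAQS:
        score = SequenceMatcher(None, normalized, question).ratio()
        if score > best_score:
            best_question, best_score = question, score
    if best_question is not None and best_score >= FAQ_THRESHOLD:
        return FAQS[best_question]

    # Tier 2: suggest an article whose keywords overlap the query.
    words = set(normalized.split())
    for title, keywords in ARTICLES.items():
        if words & keywords:
            return f"See the article: {title}"

    # Tier 3: the query is out of scope, so defer.
    return DEFER_MESSAGE


if __name__ == "__main__":
    for q in ("How do I choose canned goods?", "Tell me about olives",
              "What is the weather today?"):
        print(q, "->", respond(q))
```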
Correcting defects: Adding webhook fulfillment to GROCERYbot
• The original DITA files contained price tables, which have two major problems:
  • The information is static
  • The tables are difficult to display on the web
To solve the pricing issues, we added a webhook
• Webhook fulfillment allows the bot to “look up” dynamic pricing information from an external data repository and return it in response to a user query about price
• We added a new intent, UserAsksForAttributeOfCannedGood, that handles user queries like “tell me the price of large black olives”
• Every time the intent is matched, Dialogflow calls the pricehook.php script running on an external web server
• The argument to the script is a JSON object that contains the entity parameter detected in the user input (e.g. “large black olives”)
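The request-and-response exchange described above can be sketched as follows. This is a Python illustration, not the authors' pricehook.php script. It assumes the Dialogflow ES webhook format, in which the request carries `queryResult.intent.displayName` and `queryResult.parameters` and the response returns `fulfillmentText`. The parameter name `canned_good` and the price table are hypothetical placeholders.

```python
import json

# Placeholder price data; a real webhook would query an external repository.
PRICES = {"large black olives": 2.49}

PRICE_INTENT = "UserAsksForAttributeOfCannedGood"


def handle_webhook(request_body: str) -> str:
    """Build a JSON fulfillment response for one webhook request body."""
    request = json.loads(request_body)
    query_result = request.get("queryResult", {})
    intent = query_result.get("intent", {}).get("displayName", "")
    # "canned_good" is an assumed parameter name for the detected entity.
    item = query_result.get("parameters", {}).get("canned_good", "")

    if intent != PRICE_INTENT:
        text = "Sorry, I can't help with that request."
    elif item in PRICES:
        text = f"The price of {item} is ${PRICES[item]:.2f}."
    else:
        text = f"Sorry, I don't have a price for {item or 'that item'}."
    return json.dumps({"fulfillmentText": text})


if __name__ == "__main__":
    sample = json.dumps({
        "queryResult": {
            "intent": {"displayName": PRICE_INTENT},
            "parameters": {"canned_good": "large black olives"},
        }
    })
    print(handle_webhook(sample))
```

Keeping the handler as a pure function of the request body makes it easy to test without a web server; any HTTP framework could wrap it.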
Webhook example
Increasing “bot-readiness”: Turning the DITA metadata into a bot training kit
• In a staged effort, we added metadata to the DITA files and made it available to the bot programmatically as training material
• We reasoned that the source files should be as complete as possible, so that:
  • The most critical and definitive content and metadata are produced by the articles’ authors, editors, and owners at the time of their creation
  • A well-structured and relatively complete content collection is available early in the bot-building project, which saves time in training, testing, and putting the bot-based KB set into production
Original and additional metadata in the DITA files
Integrating GROCERYbot with Telegram
Objectives of the Telegram integration
• Based on a relevant user query, select one or more KB articles from the grocery shopping set
• Use the article title and short description as “teaser text”
• Provide the user with a link to an article that seems to satisfy their query
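A minimal Python sketch of these objectives might look like the following. The article records, the example.com URLs, and the simple substring match on index terms are assumptions made for illustration; the actual selection in GROCERYbot is handled by Dialogflow.

```python
# Hypothetical article records built from KB metadata.
ARTICLES = [
    {
        "title": "Shopping for canned goods",
        "shortdesc": "How to choose canned goods.",
        "url": "https://example.com/canned_goods.html",
        "index_terms": ["canned goods", "olives"],
    },
    {
        "title": "Shopping for produce",
        "shortdesc": "How to choose fruit and vegetables.",
        "url": "https://example.com/produce.html",
        "index_terms": ["produce", "fruit", "vegetables"],
    },
]


def select_articles(query: str, articles: list) -> list:
    """Return the articles with at least one index term in the query."""
    lowered = query.lower()
    return [a for a in articles
            if any(term in lowered for term in a["index_terms"])]


def teaser(article: dict) -> str:
    """Format title and short description as teaser text, then the link."""
    return f'{article["title"]}: {article["shortdesc"]}\n{article["url"]}'


if __name__ == "__main__":
    for hit in select_articles("Do you have olives?", ARTICLES):
        print(teaser(hit))
```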
Steps for integrating GROCERYbot with Telegram
1. Using the oXygen editor, create the DITA-based grocery shopping files and publish them to HTML
2. Upload the HTML files to a temporary website
3. Create a Python data-wrangling script
Steps for integrating GROCERYbot with Telegram (continued)
4. Run the Python data-wrangling script to pull the relevant information out of the metadata in the HTML files and put it into a CSV file
5. In the Dialogflow console, test the new version of GROCERYbot
6. In the Telegram app, create a bot instance called SHOPbot
7. In the Dialogflow console, connect GROCERYbot with SHOPbot
8. In the Telegram app, test SHOPbot with GROCERYbot
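The data-wrangling script from steps 3 and 4 is not shown in the deck. As a hedged sketch of what such a script might do, the Python below reads the title and `<meta>` tags of a published HTML page and writes them to CSV. It assumes the HTML output carries `description` and `keywords` meta tags; the sample page, column names, and URL are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical published page; real output will differ in detail.
SAMPLE_HTML = """<html><head>
<title>Shopping for canned goods</title>
<meta name="description" content="How to choose canned goods.">
<meta name="keywords" content="canned goods, olives">
</head><body><p>Body text.</p></body></html>"""


class MetadataParser(HTMLParser):
    """Collect the <title> text and the named <meta> tags of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name"):
            self.meta[attributes["name"].lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def html_to_row(url: str, html_text: str) -> list:
    """Return [title, short description, keywords, url] for one page."""
    parser = MetadataParser()
    parser.feed(html_text)
    return [parser.title.strip(), parser.meta.get("description", ""),
            parser.meta.get("keywords", ""), url]


def rows_to_csv(rows: list) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title", "shortdesc", "keywords", "url"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    row = html_to_row("https://example.com/canned_goods.html", SAMPLE_HTML)
    print(rows_to_csv([row]), end="")
```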
Take-aways and lessons learned
• We are convinced that our experimental initiative validated the potential synergy between DITA-based structured writing and AI technologies, but…
  • The Dialogflow tool is still lacking in important features
  • Effective project guidance and roadmaps are in short supply
  • Creativity and customization are required to achieve a viable solution
• We hope to hear about similar efforts between structured writing proponents and chatbot creators
Who are we?
• VR Communications LLC does information architecture; linguistic analysis; text annotation; and writing, editing, and scripting in Python and R. We generally work in the following information domains:
  • Technology
  • Science
  • Languages and linguistics
  • History
  • Education
• We freelance on a work-for-hire basis, or pro bono for worthy causes
Who are we? (continued)
• VR Communications LLC principals are Anna van Raaphorst-Johnson and Richard (Dick) Johnson, a collaborative team who work collectively or individually.
• Anna is our content specialist: information architect, linguistic analyst, text annotator, writer, and editor. Contact Anna at: [email protected]
• Dick is our technology specialist: software engineer, web developer, writer, and researcher. Contact Dick at: [email protected]