A fall 2011 briefing for personnel at Forrester in Cambridge MA.
The Future of Text Analysis
Dr. Stuart Shulman
Texifter, LLC
Thursday, April 13, 2023
Briefing Agenda
• R&D in Annotation and Public Comments
• “The Future of Text Analysis” – The vision
• “What is DiscoverText?” – The software
• The Features – The basics
  – Capturing social media, importing other text
  – Creating archives, buckets and datasets
  – Coding a dataset or training a classifier
Dr. Stuart W. Shulman
Founder & CEO, Texifter, LLC
Assistant Professor, Department of Political Science
University of Massachusetts Amherst
Director, Qualitative Data Analysis Program (QDAP)
Associate Director, National Center for Digital Government
Editor, Journal of Information Technology & Politics
413-545-5375
[email protected]
people.umass.edu/stu/
Major Project Components
Credentials
The Future of Projects
Projects leverage users’ credentials to control access to documents, tools, and resources
Documents Peers
Advanced ‘Social’ Search
Tools for Tagging Shared Analysis
Metadata Networks Filtering
Qualitative & Quantitative Findings
The Future of Documents
Import & archive data from multiple sources into a single, searchable, unified repository
Files Web
The Future of Search
eDiscovery will search, merge, filter & classify unlimited amounts of text and other data
Filter
Search
Classify
Report
Well Worth Reading
The Future of Tools
• Duplicate & near duplicate detection
• Dynamic user-seeded tag clouds
• Adaptable, intuitive and reusable topic models & shared memos
• Sentiment detection, redaction & seamless adjudication
Text processing tools will enable quicker processing and more accurate results
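To make "duplicate & near duplicate detection" concrete, here is a minimal sketch in Python. It is not DiscoverText's actual method, which the slides do not describe. It assumes one common approach: compare sets of overlapping word triples (shingles) with Jaccard similarity. The function names and the 0.5 threshold are hypothetical choices for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles from lowercased text, ignoring punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Texts shorter than k words become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify_pair(doc_a, doc_b, threshold=0.5):
    """Label a pair of documents; the threshold is an assumed, tunable value."""
    sim = jaccard(shingles(doc_a), shingles(doc_b))
    if sim == 1.0:
        # Identical after normalization (case and punctuation are ignored).
        return "duplicate"
    if sim >= threshold:
        return "near-duplicate"
    return "distinct"
```

For example, two public comments that differ only in their final word would share most of their shingles and be labeled "near-duplicate", while copies that differ only in capitalization or punctuation would be labeled "duplicate".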
The Future of Peer Relations
Utilize trusted peers to scale your knowledge resources, increase productivity & lower total project costs
Peer Groups
Securely segment your peers into project groups by agency, firm, department, location, or affiliation, while controlling their access via credentials
Security & Credentials
Data will be encrypted, secure and accessible only by peers who are granted specific permissions via their credentials
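One way to picture credential-based access is sketched below in Python. This is an illustration only, not DiscoverText's implementation. The `Credential` class, the `can` function, and the permission names are all hypothetical. The sketch assumes a peer may act on a project only if the credential belongs to that project's group and explicitly grants the action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    """A hypothetical credential: who the peer is, their group, and what they may do."""
    peer: str
    group: str
    permissions: frozenset


def can(credential, action, project_group):
    """Allow an action only for peers in the project's group who hold that permission."""
    return credential.group == project_group and action in credential.permissions


# Example: a coder in the "agency-a" group may read and code, but not redact.
coder = Credential("pat", "agency-a", frozenset({"read", "code"}))
```

With this sketch, `can(coder, "code", "agency-a")` is true, while a request to redact, or any request against another group's project, is refused.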
Coding, Tagging or Labeling
Annotation enhances your analysis by applying human interpretation to machine results
Coding in Flexible Teams
Crowdsourcing
“This is really the biggest paradigm shift in innovation since the Industrial Revolution”
- MIT professor Eric von Hippel, specialist in innovation management
Crowdsourcing will bring widely distributed wisdom to the process of text analysis
Active Machine Learning
Search
Code
Validate
Share
Analyze
By utilizing information and decisions previously captured, we can enhance future machine-based decisions
Active Learning Loop
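The loop above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the DiscoverText classifier. It assumes a two-label naive Bayes model with add-one smoothing and equal priors, and uncertainty sampling as the rule for choosing which item a human codes next. All function names, the "pos"/"neg" labels, and the `oracle` callback (standing in for a human coder) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split lowercased text on whitespace."""
    return text.lower().split()


def train(labeled):
    """Build per-label word counts from (text, label) pairs; labels are 'pos' or 'neg'."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled:
        counts[label].update(tokenize(text))
    return counts


def prob_pos(model, text):
    """Naive Bayes estimate of P(pos | text) with add-one smoothing and equal priors."""
    vocab_size = len(set(model["pos"]) | set(model["neg"])) or 1
    n_pos = sum(model["pos"].values())
    n_neg = sum(model["neg"].values())
    log_odds = 0.0
    for word in tokenize(text):
        p = (model["pos"][word] + 1) / (n_pos + vocab_size)
        q = (model["neg"][word] + 1) / (n_neg + vocab_size)
        log_odds += math.log(p / q)
    return 1 / (1 + math.exp(-log_odds))


def most_uncertain(model, pool):
    """Uncertainty sampling: pick the item whose score is closest to 0.5."""
    return min(pool, key=lambda text: abs(prob_pos(model, text) - 0.5))


def active_learning_loop(labeled, pool, oracle, rounds):
    """Repeatedly train, ask the human (oracle) to code the least certain item, retrain."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)
        item = most_uncertain(model, pool)
        pool.remove(item)
        labeled.append((item, oracle(item)))
    return train(labeled), labeled, pool
```

Each round sends the human coder the item the model is least sure about, so previously captured decisions steer where the next coding effort goes.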
What is DiscoverText?
DiscoverText is a:
• personal or organizational archive in the cloud
• search engine for eDiscovery
• social media comment aggregator
• de-duplication and near duplicate clustering engine
• FOIA redaction toolkit
• coding, reporting and validation team workbench
• repository of human annotation (text about text), and
• customizable machine-learning classifier
– (beta launched April 2011)