bradley-bennet
Introduction to Semantic Technology for SharePoint Administrators: For organizations that use SharePoint, it is their primary means of collaboration. These organizations have invested a tremendous amount of time and money to connect their employees and their data to improve communication and workflow. Despite this investment, users still spend massive amounts of time searching for relevant data and content. Search is AN ABSOLUTE FAILURE for large SharePoint deployments. The bigger the farm, the less useful and less relevant search becomes. Expert System is the leader in the development of semantic software, used by organizations to make information management more efficient, and to gain strategic knowledge through the automatic comprehension of text.
An introduction to semantic technology for SharePoint administrators
20/05/2013
Who We Are
Leader in the development of semantic software, used by organizations to make information management more efficient, and to gain strategic knowledge through the automatic comprehension of text.
Our Customers
Why are we here?
Greater demand for information in the decision-making process
Ever increasing volumes of data to be considered every day (documents, emails, web pages, social media and “Big Data”)
Traditional technologies are increasingly expected to manage and process information
But most organizations are not taking advantage of all of their data.
Ultimately, we are here to create value from information
INCREASED SALES
REDUCED COSTS
Increase customer satisfaction
Increase competitive advantage through the monitoring of markets and innovations
Enhance brand value with targeted social media analysis
Simplify the organization and recovery of data
Improve internal knowledge sharing
More timely and effective customer interactions
Reduce the time and costs of traditional customer assistance
Increase sales and customers
So much data, so little time
For organizations that use SharePoint, it is their primary means of collaboration.
These organizations have invested a tremendous amount of time and money to connect their employees and their data to improve communication and workflow.
Despite this investment, users still spend massive amounts of time searching for relevant data and content.
Search is AN ABSOLUTE FAILURE for large SharePoint deployments. The bigger the farm, the less useful and less relevant search becomes.
WHY?
The majority of enterprise content is unstructured in the form of electronic documents, emails, forms, etc. Searching through the textual portion of unstructured content can be a daunting task as it is highly likely the search operation will return a large number of possible results.
Further, people are generally searching for content inside content and for inter-relationships between content, which complicates search even more.
Most companies are guilty of one or more of the following:
• Underutilization of features
• Lack of clear requirements or vision
• Not using metrics to gauge feature usage and adoption
• Too few staff to properly support the platform
The problems with unstructured data
Extraction and categorization are used to structure unstructured data and make the retrieval and management of information more effective
Taxonomy and text mining rules are often dependent on specific business needs and influenced by market sector and project objectives
Organizations need flexible solutions that are easily integrated and customizable, and capable of responding to specific requirements for extraction and categorization
SharePoint technologies for managing information
Technology is a key factor in managing unstructured information.
There are different approaches for managing unstructured information:
• Keyword-based plus statistical elements
• Shallow linguistics
• Semantic technology
Keyword Based
Text is divided into single words that are inserted in an alphabetical index, with no understanding of content:
Az IBM szokásosan nagy hangsúlyt helyez a továbbképzésre, így munkatársai évente számos szakmai tanfolyamon vesznek részt. Az elmúlt években a csoport több tagja is részt vett több hónapos, egyesült államokbeli, angliai illetve németországi projekt munkákban, melyek során nemzetközi csoportban végeztek fejlesztői tevékenységet.
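A keyword index of this kind can be sketched in a few lines of Python. This is only an illustrative toy, not how SharePoint or any specific product builds its index; the function name `build_keyword_index` and the two sample documents are assumptions made for the example.

```python
import re
from collections import defaultdict


def build_keyword_index(docs):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Split the text into single words; no linguistic analysis is applied.
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    # Return the words in alphabetical order, as in a classic keyword index.
    return dict(sorted(index.items()))


# Hypothetical sample documents.
docs = {
    "d1": "The jaguar is a big cat.",
    "d2": "He drives a Jaguar to work.",
}
index = build_keyword_index(docs)
print(index["jaguar"])  # both documents match the keyword
```

Because the index stores only character strings, a query for "jaguar" returns both documents and cannot tell the animal from the car.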
Shallow Linguistics
• Words in the text are recognized as either belonging to the dictionary, or not.
• Recognized words are linked to their basic headword and assigned a grammatical type.
• Some logical groupings are made.
• Indexes contain headwords and keywords.
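The steps above can be sketched as a dictionary lookup. This is a minimal sketch under stated assumptions: the tiny `DICTIONARY` and the function `shallow_analyze` are hypothetical, and a real shallow-linguistics engine would use a full lexicon.

```python
# Hypothetical mini-dictionary: surface form -> (headword, grammatical type).
DICTIONARY = {
    "rows": ("row", "noun/verb"),
    "tables": ("table", "noun"),
    "searching": ("search", "verb"),
    "searched": ("search", "verb"),
}


def shallow_analyze(text):
    """Mark each word as known or unknown and link known words to a headword."""
    results = []
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'")
        if not word:
            continue
        if word in DICTIONARY:
            headword, word_type = DICTIONARY[word]
            results.append((word, headword, word_type))
        else:
            # Words outside the dictionary are kept as plain keywords.
            results.append((word, None, None))
    return results


print(shallow_analyze("Searching tables, quickly."))
```

Note that "rows" stays tagged as "noun/verb": without deeper analysis, this approach cannot decide which reading applies in a given sentence.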
Semantics
• Simulate a human’s process of text analysis.
• Morphological analysis, parsing, sentence and semantic analysis allow the extraction of large amounts of information and work from a conceptual point of view (thanks to the semantic network).
• Document indexing creates a set of words, headwords, concepts, relationships, subjects and structures (cognitive/conceptual map).
Why Does Semantic Technology Excel?
The answer lies in the two primary measures of information management:
#1 – Precision (a measure of exactness)
Retrieving a high level of accurate results that are relevant to your search.
#2 – Recall (a measure of completeness)
Retrieving a high percentage of relevant documents. Locating what applies.
Keyword and statistics (math based) technology can achieve one, but not both.
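In standard information-retrieval terms, precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The sketch below computes both for one query; the document ids and the function name `precision_recall` are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given two collections of ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 3 of them relevant,
# out of 6 relevant documents in the whole farm.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d3", "d5", "d6", "d7"]
p, r = precision_recall(retrieved, relevant)
print(p, r)  # 0.75 0.5
```

Returning more documents tends to raise recall and lower precision, and returning fewer does the opposite, which is the trade-off described above.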
Why is Semantics Important?
...because language is too ambiguous.
Same word – Different meanings
Jaguar = car
jaguar = animal
Jaguars = football team
Different words – Same meaning
Disability Legislation = Equal Opportunity Law
Different words – Related meanings
Organization = company
Organization = trade union
Organization = charity
What Does Properly Analyzed Mean?
4 Requirements (requirement: definition, with an example)
• Morphological analysis: understand word forms. Example: "dog", "dog-catcher" and "doggy-bag" are closely related.
• Grammatical analysis: understand the parts of speech. Example: "There are 40 rows in the table" uses "rows" as a noun, whereas "She rows 5 times a week" uses "rows" as a verb.
• Logical analysis: understand how words relate to other words. Example: in "Davey Jones, represented by attorney Daniel Stanley, is married to Rebecca Carter", Rebecca is married to Davey, not Daniel.
• Semantic analysis (disambiguation): understand the context of key words. Example: "I used chicken broth for my soup stock" uses "stock" in the context of food, whereas "The company keeps lots of stock on hand" uses "stock" in the context of inventory.
Using Semantic Technology
• semantic intelligence
• morphological analysis
• parsing
• sentence analysis
• semantic analysis
• {concepts, domain ontologies, places, companies, products, people}
• linguistic analysis
Adding value to information with syncons and links
A syncon corresponds to a node of the semantic net; each is connected to other syncons by specific semantic relations (links) that form a hierarchical structure with inheritance.
This structure allows every node to inherit characteristics from nearby nodes, thus enriching itself with information.
Information inherently contains different kinds of links:
• hypernymy link (is a/type of)
• meronymy link (has a/part of)
• geographical link
• linguistic relations link (subject/verb, verb/object)
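The idea of nodes enriching themselves through links can be illustrated with a small graph. This is a hedged sketch, not Expert System's data model: the class `SemanticNet`, its method names and the sample characteristics are hypothetical, and inheritance is assumed here to flow only along hypernymy ("is a") links.

```python
from collections import defaultdict


class SemanticNet:
    """Toy semantic net: nodes (syncons) joined by typed links."""

    def __init__(self):
        self.links = defaultdict(list)      # (node, link type) -> target nodes
        self.properties = defaultdict(set)  # node -> its own characteristics

    def add_link(self, source, link_type, target):
        self.links[(source, link_type)].append(target)

    def add_property(self, node, prop):
        self.properties[node].add(prop)

    def inherited_properties(self, node):
        """Collect a node's own characteristics plus those of its hypernyms."""
        seen, stack, result = set(), [node], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            result |= self.properties[current]
            stack.extend(self.links[(current, "hypernymy")])
        return result


net = SemanticNet()
net.add_link("Irish terrier", "hypernymy", "dog")  # is a
net.add_link("dog", "hypernymy", "animal")         # is a
net.add_link("dog", "meronymy", "tail")            # has a
net.add_property("animal", "living")
net.add_property("dog", "barks")
print(sorted(net.inherited_properties("Irish terrier")))  # ['barks', 'living']
```

The node "Irish terrier" has no characteristics of its own in this sketch, yet it ends up with both "barks" and "living" by following its links upward.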
Ordering principles
Links, which identify the semantic relationships between syncons, are the ordering principles.
Syncons may contain:
• single headwords ('set', 'vacation', 'work', 'quick', 'more')
• compounds ('non-stop', 'abat-jour', 'policeman')
• collocations ('credit card', 'university degree', 'go forward')
A syncon has the following main elements:
• word class (noun, verb, adjective, adverb)
• semantic relations (links)
• gloss (explanation of meaning)
• domain, register and frequency
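The main elements listed above map naturally onto a record type. The sketch below is one possible representation; all field names, the sample gloss and the sample domain are assumptions made for illustration, not the actual structure used by any product.

```python
from dataclasses import dataclass, field


@dataclass
class Syncon:
    """One concept in the semantic net (field names are illustrative)."""
    lemmas: list     # headwords, compounds or collocations
    word_class: str  # "noun", "verb", "adjective" or "adverb"
    gloss: str       # explanation of meaning
    domain: str = ""
    register: str = ""
    frequency: str = ""
    links: dict = field(default_factory=dict)  # link type -> related concepts


# Hypothetical entry for the collocation 'credit card'.
credit_card = Syncon(
    lemmas=["credit card"],
    word_class="noun",
    gloss="a card that lets the holder buy on credit",
    domain="finance",
)
credit_card.links.setdefault("supernomen", []).append("card")
print(credit_card.links)  # {'supernomen': ['card']}
```

Using `default_factory=dict` gives every syncon its own `links` dictionary, so adding a link to one entry never leaks into another.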
Links
The supernomen/subnomen link concerns the relationship between a specific concept and a more general one.
A supernomen is the more general term: a word whose meaning is broader than that of the words that specialize it (the subnomens).
EXAMPLES
• Dog – hunting dog – Irish terrier
• Habitation – flat – two-roomed flat
• Computer – portable computer – palmtop
Links
The superverbum/subverbum link is one of the semantic relationships that link verb syncons together. It is the verb equivalent of the supernomen/subnomen link for nouns.
EXAMPLES
• Eat – nibble at, eat listlessly
• Sleep – doze, snooze
• Walk – limp
Links
The omninomen/parsnomen link is a “part/whole” semantic relationship. A parsnomen is a term that indicates a part of something (the omninomen).
EXAMPLES
• Limb – hand – finger
• House – bathroom – washbasin
• Tree – trunk – bark
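Part/whole links chain together, so a query can walk from a part up to every whole that contains it. The sketch below encodes the three example chains above; the dictionary `PART_OF` and the helper names are hypothetical, and each part is assumed to have a single whole with no cycles.

```python
# Each key is a parsnomen (part); each value is its omninomen (whole).
PART_OF = {
    "finger": "hand",
    "hand": "limb",
    "washbasin": "bathroom",
    "bathroom": "house",
    "bark": "trunk",
    "trunk": "tree",
}


def wholes_of(term):
    """Follow part-of links upward and return every whole containing the term."""
    chain = []
    while term in PART_OF:
        term = PART_OF[term]
        chain.append(term)
    return chain


def is_part_of(part, whole):
    """True if 'whole' is reachable from 'part' through part-of links."""
    return whole in wholes_of(part)


print(wholes_of("finger"))                # ['hand', 'limb']
print(is_part_of("washbasin", "house"))   # True
```

A search for content about a "house" could use such a chain to also surface documents that only mention a "washbasin".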
Parsing
Parsing is a complete morphological, grammatical and syntactical analysis of a sentence, quickly applying many thousands of rules.
Parsing identifies every element of a text, assigning each to the appropriate logical and grammatical function.
Disambiguating Text
For a human, the meaning of a word is clear because the surrounding elements help the reader understand the sense in which the word is used.
Software needs an unambiguous interpretation of each word, backed by a reference system that is equivalent to human experience of the world.
If correctly trained in human common sense, the computer can achieve a logical comprehension of the world and combine it with its own memory and computing power.
Disambiguating
Disambiguation analyzes single sentences or whole documents and finds the correct meaning for each element by resolving every ambiguity.
A “reasoning” step identifies the different meanings of all elements of a text and the reference context.
Examples of Disambiguation
Let’s look at some sentences using the term “bomb”:
In the first sentence, the disambiguator identifies “bomb” as a sports noun that means a long, high forward pass.
In the second sentence, “bomb” is still a noun, but in this case it means a commercial or artistic failure.
Examples of Disambiguation
If “bomb” appears in a sentence with the term “volcano”, it is interpreted as a lump of lava.
Finally, in the following sentence, the disambiguator interprets the term “bomb” as an explosive device.
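The four readings of “bomb” can be told apart, in a very simplified way, by counting how many context words in a sentence match each sense. The sketch below is a plain context-overlap heuristic and should not be read as the way a commercial disambiguator works; the cue words, the sample sentence and the function name `disambiguate` are all assumptions.

```python
import re

# Hypothetical cue words for each sense of "bomb" described above.
SENSES = {
    "long high forward pass": {"quarterback", "pass", "touchdown", "receiver", "threw"},
    "commercial or artistic failure": {"movie", "film", "critics", "box", "office", "flop"},
    "lump of lava": {"volcano", "lava", "eruption", "crater"},
    "explosive device": {"exploded", "police", "defused", "blast", "detonated"},
}


def disambiguate(sentence, senses=SENSES):
    """Return the sense whose cue words overlap most with the sentence, or None."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    best_sense, best_score = None, 0
    for sense, cues in senses.items():
        score = len(words & cues)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


print(disambiguate("The volcano hurled a bomb from its crater."))  # lump of lava
```

A real semantic engine would draw these cues from the semantic network and the parsed sentence rather than from a hand-written word list.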
Disambiguator: Text Map
Using semantics allows for the creation of a cognitive knowledge map, a graphic view of the text elements analyzed. We will use this internet biography of Edgar Allan Poe as an example:
Disambiguator: Classification
Classification recognizes the main categories in the text (literature, military, publishing, etc.), each with its corresponding percentage, and identifies the main concepts within each semantic domain (“allegory”, “book review”, “character” for literature; “Tamerlane”, “United States Army”, “West Point” for military; “book” for publishing; “epic poem”, “Baudelaire” for poetry, and so on).
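Once each concept has been assigned to a domain, the category percentages can be computed by simple counting. This is a minimal sketch assuming a hypothetical concept-to-domain table loosely based on the examples above; how a real engine weights concepts is not stated in the slides.

```python
from collections import Counter

# Hypothetical concept-to-domain assignments.
CONCEPT_DOMAINS = {
    "allegory": "literature",
    "book review": "literature",
    "character": "literature",
    "United States Army": "military",
    "West Point": "military",
    "book": "publishing",
    "epic poem": "poetry",
}


def classify(concepts):
    """Return {domain: percentage of recognized concepts}, largest share first."""
    counts = Counter(CONCEPT_DOMAINS[c] for c in concepts if c in CONCEPT_DOMAINS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: round(100 * n / total, 1) for domain, n in counts.most_common()}


found = ["allegory", "character", "West Point", "book", "unknown term"]
print(classify(found))  # {'literature': 50.0, 'military': 25.0, 'publishing': 25.0}
```

Concepts that are missing from the table, such as "unknown term" here, are simply ignored when the shares are computed.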
Disambiguator: Main Concepts
The main concepts in the text are listed; a colored bar indicates each concept's frequency in the analyzed document.
Semantic Analysis Means Disambiguation
Disambiguation is made possible by:
• A semantic network that contains the representation of concepts and relationships between them.
• A disambiguation engine that, based on knowledge from the semantic network, is able to associate every textual element with the meaning it represents.
What is a Semantic Network?
A semantic network is a lexical database structured by a conceptual framework.
That is, words are organized into groups of synonyms: words that are identical or similar in the meaning (concept) they express.
A concept in the language is called a syncon (synonymous congressus): a set of synonyms representing the same lexical concept.
Thank You!
Luca Scagliarini
VP, Strategy & Business Development
Expert System
www.expertsystem.net
For More Information