
Empower your Enterprise with language intelligence (Francisco Webber)


© cortical.io inc. 2015

Empower your enterprise with language intelligence

free access at

api.cortical.io

contact: [email protected]

who we are

• cortical.io inc.: science startup in Vienna, Austria

• result of the CEPT project (Cortical Engine for Processing Text)

• advances in brain theory guided us to a fundamentally new approach for natural language processing

• we are investor-backed and now in the second round

• we made semantic fingerprinting accessible, robust, scalable, intuitive and easy to use


big (text) data

• businesses, organizations and governments are threatened by the big data explosion.

• a substantial part of this data consists of text.

• computers ‘understand’ numbers but ignore the meaning of language


the downsides

existing semantic systems are…

…hard to build (sometimes impossible)

…inaccurate & fragile (in real-world use)

…expensive to buy (licenses & services)

…tricky to integrate (setup, tuning, training)

…laborious to run (metadata management)

…hard to maintain (dictionaries, ontologies)

Semantic Fingerprinting

• semantic fingerprinting bridges the gap between natural language processing and knowledge management

• language is represented using the same data format as found in the neocortex (mammalian brain)

• the cortical.io Retina behaves like a sensorial organ for language

• meaning is embodied in thousands of self-learned semantic features

Semantic Fingerprinting

[Figure: semantic fingerprints for the words organ, piano, church and liver]

• the cortical.io Retina converts every word into its semantic fingerprint

• the fingerprints allow direct semantic comparison of the meanings between words

• similar fingerprints have similar meanings
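One way to picture "similar fingerprints have similar meanings" is to treat a fingerprint as the set of its active bit positions and to count the positions two fingerprints share. The sketch below is an illustration only: the bit positions are invented, and the overlap ratio is an assumed measure, not necessarily the one the Retina uses.

```python
def similarity(a, b):
    """Shared active positions relative to the smaller fingerprint.

    Assumed measure for illustration; returns 0.0 if either input is empty.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Toy fingerprints: sets of active bit positions (invented values).
organ = frozenset({1, 2, 3, 10, 11, 20})
piano = frozenset({1, 2, 3, 4})
liver = frozenset({20, 21, 22, 23})

print(similarity(organ, piano))  # 0.75
print(similarity(organ, liver))  # 0.25
print(similarity(piano, liver))  # 0.0
```

In this toy data "organ" shares positions with both "piano" and "liver", while "piano" and "liver" share none, which mirrors the two senses of the word.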

Semantic Similarity

[Figure: fingerprints of "cat", "dog" and "cat+dog", with regions labeled home & family aspects, cat specific aspects, dog specific aspects and biology aspects; value shown: 38%]

word sense disambiguation

[Figure: senses of the word "apple" with their context terms]

sense 1 (rock): songwriter, vocals, spector, airplay, album

sense 2 (fruit): seeds, flowers, pollinators, pests, insects, trees

sense 2a (food): vegetables, berries, ingredients, sugar, diet

sense 2 …m

sense …n (computer): macintosh, microsoft, linux, software, hardware

Meaning Based Computing

jaguar - porsche = tiger
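The "jaguar - porsche = tiger" expression can be sketched with plain set operations: removing the positions that "jaguar" shares with "porsche" leaves the animal-related positions, and the closest remaining term is "tiger". The vocabulary, the bit positions and the `closest` helper are hypothetical and exist only for this illustration.

```python
# Toy fingerprints (invented positions): bits 1-6 stand for "car" contexts,
# bits 10-14 for "big cat" contexts.
VOCAB = {
    "porsche": {1, 2, 3, 4, 5},
    "bmw": {1, 2, 3, 6},
    "tiger": {10, 11, 12, 14},
}
jaguar = {1, 2, 3, 4, 10, 11, 12, 13}


def closest(fingerprint, vocab):
    """Return the vocabulary word sharing the most active positions."""
    return max(vocab, key=lambda word: len(fingerprint & vocab[word]))


# Subtract the car sense from the ambiguous word.
remainder = jaguar - VOCAB["porsche"]

print(closest(jaguar, VOCAB))     # porsche
print(closest(remainder, VOCAB))  # tiger
```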

Text Fingerprinting

• word fingerprints can be stacked together to form fingerprints of any piece of text.

• all semantic fingerprint properties remain: similar fingerprints mean similar texts.

• each fingerprint is represented through more than 16K features.

[Figure: the sentence "teens like to hear music on their mobile phones" is converted into a text fingerprint by aggregation + sparsification]
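A minimal sketch of "aggregation + sparsification", under stated assumptions: word fingerprints are stacked by counting how often each position is active, and the result is sparsified by keeping only the most frequently active positions. The function name `text_fingerprint`, the `max_bits` parameter and the keep-the-top-positions rule are assumptions made for this illustration; the slides do not specify how the Retina sparsifies.

```python
from collections import Counter


def text_fingerprint(word_fingerprints, max_bits):
    """Stack word fingerprints, then keep the max_bits most active positions."""
    counts = Counter()
    for fingerprint in word_fingerprints:
        counts.update(fingerprint)  # aggregation: count activations per position
    # Sparsification: highest counts first, lower position wins ties.
    ranked = sorted(counts, key=lambda pos: (-counts[pos], pos))
    return frozenset(ranked[:max_bits])


# Toy word fingerprints (invented positions).
words = {
    "teens": {1, 2, 3},
    "music": {2, 3, 4},
    "phones": {3, 4, 5},
}

print(sorted(text_fingerprint(words.values(), max_bits=3)))  # [2, 3, 4]
```

Because the output is again a set of active positions, it can be compared with any other word or text fingerprint in the same way.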

Text Similarity 1

"teens like playing good music with their mobile phones"

"you can also consume chart hits with your notebook"

similarity: 27%

Text Similarity 2

"teens like playing good music with their mobile phones"

"the fishermen are sailing out of the harbor"

similarity: 9%

NLP Functionality: Search

[Figure: search via the similarity engine]

query: example document

document index: compared against the query by the similarity engine

result set: most similar documents

ranking: ordered along the user's information need
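The search flow above (query document, document index, result set, ranking) can be sketched as an overlap ranking over stored fingerprints. The index contents, the document ids and the `rank` function are hypothetical; overlap count is an assumed scoring rule.

```python
def rank(query_fp, index, top_k=3):
    """Return up to top_k document ids ordered by overlap with the query."""
    scored = [(len(query_fp & fp), doc_id) for doc_id, fp in index.items()]
    # Highest overlap first; document id breaks ties deterministically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


# Toy document index: document id -> fingerprint (invented positions).
index = {
    "doc_music": {1, 2, 3, 4},
    "doc_phones": {3, 4, 5, 6},
    "doc_fishing": {20, 21, 22},
}
query = {1, 2, 3, 5}  # fingerprint of the example document

print(rank(query, index))  # ['doc_music', 'doc_phones']
```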

NLP Functionality: classification

[Figure: the words cow, elephant, dog, spider and frog matched against a classifier fingerprint; the most relevant matching area is highlighted]

Literally: “mammal or mammals or mammalian”

Demonstrations

Demos @ cortical.io

Evaluation

There are very few comparable algorithms: Google’s Word2Vec and a couple of academic ones that cannot readily be used for production purposes.

The MEN Test Collection: http://clic.cimec.unitn.it/~elia.bruni/MEN.html

The RG-65 Test Collection: http://www.aclweb.org/aclwiki/index.php?title=RG-65_Test_Collection_(State_of_the_art)

The WordSimilarity-353 Test Collection: http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/

Yu & Dredze 2014: http://arxiv.org/pdf/1411.4166.pdf

Distributed representations of words and phrases: http://papers.nips.cc/paper/5021-di

disciplines of language intelligence

• locate documents

• find web content

• match people

• identify products

• monitor competitors

• file business information

• discover new knowledge

• track customer satisfaction

• avoid duplication of work

• advertise on the Internet

• mine for evidence

• improve security


business applications

“Anything that can be expressed with text can be matched: - products with LinkedIn profiles, - tweets with Facebook timelines, - job descriptions with CVs …”

application: streaming text filter

[Figure: convert the twitter firehose into a stream of semantic fingerprints; a filter separates MATCH from not matching to generate a realtime content sub-stream]
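The streaming filter can be sketched as a generator that converts each incoming message into a fingerprint and passes it on only when it overlaps enough with the filter fingerprint. Everything named here is an assumption for illustration: `TOY_RETINA` and `toy_fingerprint` stand in for a real word-to-fingerprint lookup, and `min_overlap` is a hypothetical threshold.

```python
# Stand-in lookup table: word -> active positions (invented values).
TOY_RETINA = {
    "music": {1, 2},
    "album": {2, 3},
    "harbor": {10, 11},
    "fishermen": {11, 12},
}


def toy_fingerprint(text):
    """Union of the known word fingerprints in the text (illustrative only)."""
    bits = set()
    for word in text.lower().split():
        bits |= TOY_RETINA.get(word, set())
    return bits


def filter_stream(messages, filter_fp, min_overlap):
    """Yield only the messages whose fingerprint matches the filter."""
    for message in messages:
        if len(toy_fingerprint(message) & filter_fp) >= min_overlap:
            yield message  # MATCH: goes into the content sub-stream


stream = ["new album out", "fishermen leave the harbor", "music all day"]
music_filter = {1, 2, 3}

print(list(filter_stream(stream, music_filter, min_overlap=2)))
# ['new album out', 'music all day']
```

A generator fits the use case because messages are processed one at a time and nothing has to be buffered.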

creating filter fingerprints

• words: simple words, keywords

• text: text or text-documents of any size

• profile descriptions or message postings from social media

• the expression builder allows interactive design of boolean specifications like: jaguar - Porsche = tiger

• the fingerprint editor allows the “drawing” of fingerprints. The meaning of the resulting fingerprint can be monitored through the context terms

[Figure: each input is converted into the resulting filter fingerprint]

application: profile matching

• match people by their profiles

• no keyword or field based string matching limitations

• semantic similarity measure to compare professional profiles

• different profiles for professional, leisure, interests, sports etc.

[Figure: a profile fingerprint matched against an activity fingerprint]

application: product recommendation

• create fingerprints from product descriptions

• find similar products by matching description fingerprints

• create customer fingerprints from purchased products

[Figure: product description fingerprints are matched to yield similar products and product recommendations]
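The three steps above can be sketched as follows, assuming that a customer fingerprint is simply the union of the fingerprints of the purchased products. That union rule, the catalog contents and the function names are assumptions made for this illustration.

```python
def customer_fingerprint(purchased, catalog):
    """Combine the fingerprints of all purchased products (assumed: union)."""
    profile = set()
    for name in purchased:
        profile |= catalog[name]
    return profile


def recommend(purchased, catalog, top_k=2):
    """Rank products not yet purchased by overlap with the customer profile."""
    profile = customer_fingerprint(purchased, catalog)
    candidates = [name for name in catalog if name not in purchased]
    candidates.sort(key=lambda name: (-len(profile & catalog[name]), name))
    return [name for name in candidates[:top_k] if profile & catalog[name]]


# Toy product description fingerprints (invented positions).
catalog = {
    "headphones": {1, 2, 3},
    "speaker": {2, 3, 4},
    "mp3 player": {1, 3, 4},
    "guitar": {3, 4, 20},
    "fishing rod": {10, 11, 12},
}

print(recommend(["headphones", "speaker"], catalog))
# ['mp3 player', 'guitar']
```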

cortical.io advantages

simplicity

• no prior expertise in natural language processing or linguistics is needed.

• easy and intuitive definition of semantic filters or classifiers.

• all types of text (words, sentences, paragraphs, chapters, books, etc.) are processed in the same way using fingerprints.

• easy expansion to other languages by switching to any of the available language retinas.

• zero configuration and no parameter tweaking needed.


cortical.io advantages

efficiency

• semantic fingerprints are small binary vectors of about 2K bytes each.

• only binary operators are used - no floating point operations needed.

• linear scalability as the engine takes advantage of a parallel computing infrastructure (multicore, cluster, virtualization) to match any performance needed.

• high throughput as complex NLP operations are executed in a single step and are therefore much faster than with traditional statistical systems.
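A short sketch of why binary fingerprints are cheap to handle: if "more than 16K features" is read as 16,384 bit positions (an assumption), one fingerprint packs into 2,048 bytes, and comparisons reduce to AND, OR and bit counting. Python integers are used here as bit vectors purely for illustration; `pack` and `popcount` are hypothetical helper names.

```python
FEATURES = 16_384  # assumption: 2**14 bit positions per fingerprint

print(FEATURES // 8)  # 2048 bytes per fingerprint


def pack(positions):
    """Pack active positions into one integer used as a bit vector."""
    value = 0
    for position in positions:
        value |= 1 << position
    return value


def popcount(bits):
    """Number of set bits."""
    return bin(bits).count("1")


a = pack({1, 5, 9, 16_000})
b = pack({5, 9, 12})

print(popcount(a & b))   # 2  shared positions
print(popcount(a | b))   # 5  positions active in either
print(popcount(a & ~b))  # 2  positions in a but not in b
print(len(a.to_bytes(FEATURES // 8, "little")))  # 2048
```

No floating point arithmetic appears anywhere in the comparison, which is the property the efficiency bullets describe.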

cortical.io advantages

quality

• higher precision on NLP operations due to the large number of semantic features used (>16K).

• automatic disambiguation of human language due to the novel approach.

• full language independence: equally high quality results in all languages due to complete avoidance of any statistical language models.

• no unintended bias as no human input is needed as gold standard.

• automatic update as new words and concepts can be added continuously.

Web: www.cortical.io

Service: api.cortical.io

Videos: www.cortical.io/company_media.html

Contact: [email protected]