Random indexing: On space and meaning Simon Belak


Page 1: Random Indexing

Random indexing: On space and meaning

Simon Belak

Page 2: Random Indexing

Order of the day

• Meaning
  – Philosophy
  – Neuroscience
  – Computer science

• Space
  – Words as points in space
  – On dimensionality

• Random indexing

Page 3: Random Indexing

What’s the meaning of meaning?

Page 4: Random Indexing

Philosophers say:

“Meaning just is use.” – Wittgenstein

Page 5: Random Indexing

Neuroscientists say:

• Episodic memory → semantic memory (concrete event → abstract concept)

• Hebbian process

Page 6: Random Indexing

Computer scientists say:

• LSA

• Semantic networks

• HAL

• TLC

• SAM

• ACT-R

• Ontology

Page 7: Random Indexing

Projecting meaning into space

Page 8: Random Indexing

Adjacent words are closely related

Page 9: Random Indexing

Movement

• Co-occurrences

• Hebbian process
  – Self-organisation
  – Clustering

• Evolution of language
  – Coach (Kocs → carriage → train car)

Page 10: Random Indexing

Problem: homonyms

Table

1. a. An article of furniture supported by one or more vertical legs and having a flat horizontal surface.
   b. The objects laid out for a meal on this article of furniture.
2. The food and drink served at meals; fare: kept an excellent table.
3. The company of people assembled around a table, as for a meal.
4. A plateau or tableland.
5. a. A flat facet cut across the top of a precious stone.
   b. A stone or gem cut in this fashion.
6. Music
   a. The front part of the body of a stringed instrument.
   b. The sounding board of a harp.
7. Architecture
   a. A raised or sunken rectangular panel on a wall.
   b. A raised horizontal surface or continuous band on an exterior wall; a stringcourse.
8. A part of the human palm framed by four lines, analyzed in palmistry.
9. An orderly arrangement of data, especially one in which the data are arranged in columns and rows in an essentially rectangular form.
10. An abbreviated list, as of contents; a synopsis.
11. An engraved slab or tablet bearing an inscription or a device.
12. Anatomy: The inner or outer flat layer of bones of the skull separated by the diploë.

Page 11: Random Indexing

Solution: high dimensionality

• One dimension per word

• Table extends into food, furniture, music, ... dimensions
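One way to picture "one dimension per word" is a plain co-occurrence count, where each word's vector has one coordinate for every other word it has appeared near. The sketch below assumes a sliding window of two words on either side; the function name `cooccurrence_vectors`, the window size, and the toy sentences are illustrative choices, not something the slides specify.

```python
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter holding one dimension per neighbouring word."""
    vectors = defaultdict(Counter)
    for words in sentences:
        for i, word in enumerate(words):
            start = max(0, i - window)
            stop = min(len(words), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors


sentences = [
    "set the table for dinner".split(),
    "the violin table is spruce".split(),
]
vectors = cooccurrence_vectors(sentences)

# "table" now has weight along both meal-related and instrument-related dimensions.
print(sorted(vectors["table"].items()))
```

The number of dimensions grows with the vocabulary, which is the cost the later slides on reduced dimensionality address.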

Page 12: Random Indexing

Problem: synonyms

amazing, stupefying, staggering, awesome, awful, awe-inspiring, awing, astonishing, astounding

Page 13: Random Indexing

Solution: latent meaning

• Reduced dimensionality

• Closely related words fold into one

• “Higher-order” meaning

Page 14: Random Indexing

Random indexing

Page 15: Random Indexing

The idea

• A word is the sum of its contexts

• A context is the sum of its words

• Grounding?

Page 16: Random Indexing

The algorithm

1) Take a context of words

2) Generate a context index vector

3) Add the index vector to the vector of every word in the context

4) Go to 1)

Episodic memory (2) + Hebbian process (3)
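The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the names `make_index_vector` and `train`, the dimensionality `DIM = 512`, the `NONZERO = 8` non-zero entries per index vector, and the toy contexts are all assumptions made for the example.

```python
import random
from collections import defaultdict

DIM = 512      # assumed dimensionality of index and word vectors
NONZERO = 8    # assumed number of non-zero entries per index vector


def make_index_vector(rng, dim=DIM, nonzero=NONZERO):
    """Sparse ternary vector stored as {position: +1 or -1}, half of each sign."""
    positions = rng.sample(range(dim), nonzero)
    return {pos: (1 if i % 2 == 0 else -1) for i, pos in enumerate(positions)}


def train(contexts, seed=0, dim=DIM, nonzero=NONZERO):
    """Give every context a fresh index vector and add it to each of its words."""
    rng = random.Random(seed)
    word_vectors = defaultdict(lambda: [0] * dim)
    for context in contexts:                          # 1) take a context of words
        index = make_index_vector(rng, dim, nonzero)  # 2) generate its index vector
        for word in set(context):                     # 3) add it to each word vector
            vector = word_vectors[word]               #    (once per distinct word:
            for pos, sign in index.items():           #     an assumption of this sketch)
                vector[pos] += sign
    return dict(word_vectors)                         # 4) repeat until contexts run out


contexts = [
    ["the", "table", "was", "set", "for", "dinner"],
    ["dinner", "was", "served", "at", "the", "table"],
    ["the", "violin", "has", "a", "spruce", "table"],
]
vectors = train(contexts)
```

Here `violin` and `spruce` occur only in the third context, so they receive exactly the same vector, while `table` accumulates the index vectors of all three contexts.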

Page 17: Random Indexing

Dimensionality reduction

• Sparse high-dimensional ternary index

(a small number of randomly distributed +1s and -1s)

• Nearly orthogonal
  – Distances approximately preserved
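The near-orthogonality claim can be checked empirically. The sketch below draws a handful of sparse ternary vectors and measures the cosine between every pair; the sizes (1000 dimensions, 10 non-zero entries, 50 vectors), the seed, and the helper names are assumptions chosen for illustration.

```python
import random
from itertools import combinations


def sparse_ternary(rng, dim, nonzero):
    """Sparse ternary vector as {position: +1 or -1}; all other entries are 0."""
    positions = rng.sample(range(dim), nonzero)
    return {pos: rng.choice((1, -1)) for pos in positions}


def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u
    return sum(value * v.get(pos, 0) for pos, value in u.items())


def cosine(u, v):
    return dot(u, v) / (dot(u, u) * dot(v, v)) ** 0.5


rng = random.Random(42)
index_vectors = [sparse_ternary(rng, dim=1000, nonzero=10) for _ in range(50)]

cosines = [abs(cosine(u, v)) for u, v in combinations(index_vectors, 2)]
mean_abs_cosine = sum(cosines) / len(cosines)
max_abs_cosine = max(cosines)
print(mean_abs_cosine, max_abs_cosine)
```

Most pairs share no non-zero position at all, so their cosine is exactly 0; the few that overlap differ from orthogonal only by a small multiple of 1/10.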

Page 18: Random Indexing

The good

• Fast, scalable

• Trivially parallelised
  – Per word
  – Addition is associative, commutative

• Stable
  – Words are independent
  – Integer arithmetic

• Incremental
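Because the update is plain integer addition, partial results computed on separate shards of the corpus can be merged in any order and still match a single sequential pass. The sketch below illustrates this under one assumption that the slides do not state: each context's index vector is derived from a seed tied to a context id, so every worker regenerates the same vector. The names `index_vector`, `accumulate`, and `merge` are hypothetical.

```python
import random

DIM, NONZERO = 256, 6   # assumed sizes for the example


def index_vector(context_id):
    """Sparse ternary index vector, reproducible from the context id alone."""
    rng = random.Random(context_id)
    return {pos: rng.choice((1, -1)) for pos in rng.sample(range(DIM), NONZERO)}


def add_into(target, source):
    """Element-wise integer addition of sparse vectors, in place."""
    for pos, value in source.items():
        target[pos] = target.get(pos, 0) + value


def accumulate(numbered_contexts):
    """Build word vectors from (context_id, words) pairs."""
    vectors = {}
    for context_id, words in numbered_contexts:
        index = index_vector(context_id)
        for word in set(words):
            add_into(vectors.setdefault(word, {}), index)
    return vectors


def merge(left, right):
    """Combine two partial results without modifying either."""
    result = {word: dict(vector) for word, vector in left.items()}
    for word, vector in right.items():
        add_into(result.setdefault(word, {}), vector)
    return result


contexts = list(enumerate([
    ["the", "table", "was", "set"],
    ["dinner", "at", "the", "table"],
    ["the", "violin", "table"],
    ["a", "table", "of", "contents"],
]))

sequential = accumulate(contexts)
merged = merge(accumulate(contexts[:2]), accumulate(contexts[2:]))
merged_reversed = merge(accumulate(contexts[2:]), accumulate(contexts[:2]))
```

The same `merge` step also covers the incremental case: vectors built from new text can be added to an existing model without retraining.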

Page 19: Random Indexing

The bad

• Memory hungry
  – Caching (Zipf’s law)

Page 20: Random Indexing

Uses

• Comparing words to words
  – Query expansion

• Comparing documents to documents
  – Clustering
  – Search
  – Recommendations

• Comparing documents to words
  – Keyword extraction
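All three comparisons can be expressed with one similarity measure once a document is represented as the sum of its word vectors. The sketch below uses cosine similarity, a common choice that the slides do not name explicitly; the four-dimensional word vectors are made-up toy values, and the helper names are hypothetical.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def document_vector(words, word_vectors):
    """A document (context) as the sum of its known words' vectors."""
    dim = len(next(iter(word_vectors.values())))
    total = [0] * dim
    for word in words:
        for i, value in enumerate(word_vectors.get(word, ())):
            total[i] += value
    return total


def nearest_words(word, word_vectors, k=1):
    """Words to words: candidates for query expansion."""
    target = word_vectors[word]
    others = [w for w in word_vectors if w != word]
    others.sort(key=lambda w: cosine(target, word_vectors[w]), reverse=True)
    return others[:k]


def keywords(words, word_vectors, k=1):
    """Documents to words: the words closest to the document as a whole."""
    doc = document_vector(words, word_vectors)
    known = [w for w in dict.fromkeys(words) if w in word_vectors]
    known.sort(key=lambda w: cosine(doc, word_vectors[w]), reverse=True)
    return known[:k]


# Made-up toy vectors; real ones would come from training on a corpus.
word_vectors = {
    "table":  [2, -1, 0, 1],
    "desk":   [2, -1, 1, 0],
    "violin": [-1, 0, 2, 1],
    "cello":  [-1, 1, 2, 1],
}

furniture = document_vector(["table", "desk"], word_vectors)
office = document_vector(["desk", "table", "desk"], word_vectors)
strings = document_vector(["violin", "cello"], word_vectors)

print(nearest_words("table", word_vectors))
print(cosine(furniture, office), cosine(furniture, strings))
print(keywords(["table", "desk", "violin"], word_vectors))
```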

Page 21: Random Indexing

Key points

• Meaning is use

• Words in space

• Multiple meanings, multiple dimensions

• Random indexing
  – Cognitive rationale
  – Simple
  – Fast, scalable

Page 22: Random Indexing

Questions?

Page 23: Random Indexing

References

• http://www.sics.se/~mange/papers/KarlgrenSahlgren2001.pdf

• http://www.kfs.org/~jonathan/witt/tlph.html

• http://www.mtsu.edu/~sschmidt/Cognitive/semantic/semantic.html

• http://memory.syr.edu/marc/papers/HowaAddiJingKaha-LSAChap-doc.pdf

• http://memory.psych.upenn.edu/research/research_episodic_memory.php