TAMU CSCE 420 - slide10

Overview

• Distributed representation for natural language and analogy.
• Pentti Kanerva. Dual role of analogy in the design of a cognitive computer. In K. Holyoak, D. Gentner, and B. Kokinov, editors, Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences, pages 164-170. 1998.
• Pentti Kanerva. Large patterns make great symbols: An example of learning from example. NIPS*98 workshop, 1998.

Kanerva's Perspective

• Human intelligence and language are fundamentally analogical and figurative.
• There is danger in relying too much on the computer metaphor.
• Growth of the human mind is largely due to analogical perceiving and thinking.
• Imitation is a more advanced form of learning.
• Possibility of the coevolution of analogy and language.
• We must put figurative meaning and analogy at the center to design a new kind of "cognitive" computer.

Kanerva's Perspective: Continued

• Description vs. explanation: formalisms are good for describing, but may not be good enough to explain complex things like human thought.
• The main focus is on patterns generated by sensory input and new patterns that are derived from them.
• "Human mind conquers the unknown by making analogies to that which is known, it understands the new in terms of the old. In so doing, it creates rich networks of mental connections and becomes robust."

Binary Spatter-Code (Kanerva 1998)

  101100010101101010101000 · · · 1011010

• Space of representation: large N-dimensional binary vectors (where N is very large, 1,000 < N < 10,000). The vectors can represent:
  1. variables (roles),
  2. values (fillers),
  3. composed structures, and
  4. mappings between structures,
  all in the same space.
• Item memory (clean-up memory): vectors resulting from manipulations are not exact, so a lookup table of valid/known vectors is needed to correct the errors introduced during the manipulations.

Binding and Unbinding

Operators:

• Binding: basically a bit-wise XOR function:

  x ⊗ a, where (x ⊗ a)_i = x_i ⊗ a_i for all i = 1..N,

  where x and a are vectors, and the meaning of the operation is the variable assignment x = a.
• Unbinding: retrieve either the role or the filler:

  x ⊗ (x ⊗ a) = a (retrieves the filler)
  a ⊗ (x ⊗ a) = x (retrieves the role)

Binding/Unbinding Example

• Role and filler vectors:
  x = 1111011011
  a = 0110111001
• Binding:
  x ⊗ a = 1001100010
• Unbinding:
  a ⊗ (x ⊗ a) = 1111011011

You can use Matlab to try this:

  x = [ 1 1 1 1 0 1 1 0 1 1 ]
  a = [ 0 1 1 0 1 1 1 0 0 1 ]
  xor(x,a)
  xor(a,xor(x,a))

Merging

• Merging: superimposing, also known as bundling or chunking, through a normalized sum:

  ⟨G + H⟩,

  where each resulting bit is determined by a bit-wise majority rule for each bit position, with ties broken randomly (i.e., when the numbers of 0s and 1s are the same).
• Example: the relation r(A, B), represented with roles r1 = A and r2 = B:

  R = ⟨r + r1 ⊗ A + r2 ⊗ B⟩,

  where r is the name of the relation, r1 the first role, r2 the second role, and A and B the fillers.

Property of Merging

Property of the merging operator:

• The resulting vector is similar to all constituent vectors.
• Example: given 10,000-dimensional random vectors x, y, z, and r, we can calculate

  m = ⟨x + y + z⟩.

  The correlation coefficients of the resulting vector and the constituents are around 0.5:

  corr(x, m) = 0.50, corr(y, m) = 0.48, corr(z, m) = 0.51,

  while for another random vector r not merged into m,

  corr(r, m) = 0.01.

Example of Merging

  x = 1101010110
  y = 0000101110
  z = 1100010100
  x + y + z = 2201121320
  ⟨x + y + z⟩ = 1100010110

For example, in Matlab or Octave, try:

  x = (sign(rand(1,10000)-0.5*ones(1,10000))+ones(1,10000))/2;
  y = (sign(rand(1,10000)-0.5*ones(1,10000))+ones(1,10000))/2;
  m = (sign(x+y+rand(1,10000)-1.5*ones(1,10000))+ones(1,10000))/2;

and calculate corrcoef(x,m), etc. (The added rand term breaks ties randomly, as the majority rule requires when an even number of vectors is merged; without it, tied positions would always become 0.)

Distributivity

• Distributivity: the binding and unbinding operators can be distributed over the merging operator:

  x ⊗ ⟨G + H + I⟩ = ⟨x ⊗ G + x ⊗ H + x ⊗ I⟩

• This property is key in the analysis of BSC.

Probing

Given a representation R for the relation r(A, B), you want to find out what the first role r1 is bound to:

  R = ⟨r + r1 ⊗ A + r2 ⊗ B⟩

• Simply unbind R with r1:

  A′ = r1 ⊗ R = r1 ⊗ ⟨r + r1 ⊗ A + r2 ⊗ B⟩
  A′ = ⟨r1 ⊗ r + r1 ⊗ (r1 ⊗ A) + r1 ⊗ (r2 ⊗ B)⟩, by distributivity
  A′ = ⟨r1 ⊗ r + A + r1 ⊗ (r2 ⊗ B)⟩, by unbinding;

  thus, A′ is similar to A.
• The other constituent vectors r1 ⊗ r and r1 ⊗ (r2 ⊗ B) are not in the clean-up memory and are treated as noise.
• Among all potential filler vectors, we can find which one is most similar to A′.

Advanced Operations

• Multiple operations can be composed, e.g., multiple substitutions at once.
• Mapping can be done in many different ways:
  – mapping between things that share structures,
  – mapping between things that share objects, etc.
• Holistic mapping and simple analogical retrieval.

Example

• Let F be the holistic representation of France, defined as:

  F = ⟨ca ⊗ Pa + ge ⊗ WE + mo ⊗ fr⟩,

  where ca = capital, Pa = Paris, ge = geographical location, WE = Western Europe, mo = monetary unit, and fr = franc.
• We can probe it in many ways:

  F ⊗ Pa = ⟨ca + ge ⊗ WE ⊗ Pa + mo ⊗ fr ⊗ Pa⟩,

  from which we can get:

  F ⊗ Pa ≈ ca,

  i.e., the role of Paris in France is the capital.

Simple Analogy with Binary Spatter-Code

• Let S be the holistic representation of Sweden, defined as:

  S = ⟨ca ⊗ St + ge ⊗ Sc + mo ⊗ kr⟩,

  where ca = capital, St = Stockholm, ge = geographical location, Sc = Scandinavia, mo = monetary unit, and kr = krona.
• We can ask "what is the Paris of Sweden?":

  S ⊗ Pa,

  which results in nothing meaningful.
• However, we can ask "what is the equivalent of Paris in France, in the case of Sweden?":

  S ⊗ (F ⊗ Pa) ≈ S ⊗ ca ≈ St

Multiple Substitutions

• Multiple substitutions can be done at once:

  M = ⟨Pa ⊗ St + WE ⊗ Sc + fr ⊗ kr⟩

• Now, applying that to F:

  F ⊗ M,

  what would we expect?
• After multiple applications of distributivity and unbinding, we get:

  F ⊗ M ≈ ⟨ca ⊗ St + ge ⊗ Sc + mo ⊗ kr⟩ ≈ S.

Mapping of Relations

The power of BSC is that not only objects, but also relations, can be compared and manipulated using the same operators.

Example: the mother relation m(·, ·) and the parent relation p(·, ·).

• Let the mapping between relations m(A, B) → p(A, B) be represented as:

  M_AB = m_AB ⊗ p_AB, where
  m_AB = ⟨m + m1 ⊗ A + m2 ⊗ B⟩, and
  p_AB = ⟨p + p1 ⊗ A + p2 ⊗ B⟩.

Mapping of Relations: Continued

• Given a single example m(A, B) → p(A, B), i.e., M_AB, can we apply that to a new instance m(U, V) (i.e., m_UV) and derive p(U, V)?
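The binding, merging, clean-up, and analogy operations above can be sketched end-to-end in Python with NumPy (the slides' own snippets use Matlab/Octave). The France/Sweden vectors follow the slides' example with N = 10,000; `randvec`, `cleanup`, and the dictionary-based item memory are illustrative helpers, not part of Kanerva's formulation:

```python
# A minimal sketch of the Binary Spatter-Code operations, assuming
# random N-bit vectors, XOR binding, and majority-rule merging.
import numpy as np

rng = np.random.default_rng(0)
N = 10_000

def randvec():
    return rng.integers(0, 2, N, dtype=np.uint8)

def bind(x, a):
    # Binding and unbinding are both bit-wise XOR: x ⊗ (x ⊗ a) = a.
    return np.bitwise_xor(x, a)

def merge(*vs):
    # Bundling ⟨...⟩: bit-wise majority rule, ties broken randomly.
    s = np.sum(vs, axis=0)
    out = (2 * s > len(vs)).astype(np.uint8)
    ties = (2 * s == len(vs))
    out[ties] = rng.integers(0, 2, ties.sum())
    return out

def cleanup(v, memory):
    # Item/clean-up memory: return the known name nearest in Hamming distance.
    return min(memory, key=lambda name: np.count_nonzero(v != memory[name]))

# Roles and fillers from the slides' example.
ca, ge, mo = randvec(), randvec(), randvec()  # capital, geography, money
Pa, WE, fr = randvec(), randvec(), randvec()  # Paris, Western Europe, franc
St, Sc, kr = randvec(), randvec(), randvec()  # Stockholm, Scandinavia, krona

F = merge(bind(ca, Pa), bind(ge, WE), bind(mo, fr))  # France
S = merge(bind(ca, St), bind(ge, Sc), bind(mo, kr))  # Sweden

memory = {"Paris": Pa, "Stockholm": St, "capital": ca, "krona": kr}

# "What is the equivalent of Paris in France, in the case of Sweden?"
answer = bind(S, bind(F, Pa))        # S ⊗ (F ⊗ Pa) ≈ S ⊗ ca ≈ St
print(cleanup(answer, memory))       # Stockholm, with overwhelming probability
```

With three vectors merged, no bit position can tie, so the random tie-break only matters for even-sized bundles; the noisy result of each probe lands roughly 37.5% of the bits away from the correct item but 50% away from everything else, which is why the nearest-neighbor clean-up succeeds at this dimensionality.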

