Sensation, perception and computation

Alex McLean, 31 March 2009, rant

There’s often seen to be a fight between symbolic AI and artificial neural networks (ANNs). The difference is between modeling either within the grammar of a language, or through the training of a network of connections between cells. Both approaches have pros and cons, and you generally pick the approach that you think will serve you best. If you’re writing a database-backed website you’ll probably use symbolic computation in general, although it’s possible that you’ll use an ANN in something like a recommendation system.

There is a third approach though, one I’ve fallen in love with and which unifies the other two. It’s really simple, too — it’s geometry. Of course people use geometry in their software all the time, but the point is that if you see geometry as a way of modeling things, distinct from symbols and networks, then everything becomes beautiful and simple and unified. Well, maybe a little.

Here’s an example. I’m eating my lunch, and take a bite. Thousands of sensors on my tongue, my mouth and my nose measure various specialised properties of the food. Each sensor contributes its own dimension to the data sent towards the brain. This is mixed in with information from other modalities — for example sight and sound are also known to influence taste. You end up having to process tens of thousands of data measurements, producing datapoints existing in tens of thousands of dimensions. Ouch.

Somehow all these dimensions are boiled down into just a few dimensions, e.g. bitterness, saltiness, sweetness, sourness and umami. This is where models such as artificial neural networks thrive, in constructing low dimensional perception out of high dimensional mess.

The boiled-down dimensions of bitterness and saltiness exist in low dimensional geometry, where distance has meaning as dissimilarity. For example it’s easy to imagine placing a bunch of foods along a saltiness scale, and comparing them accordingly. This makes perfect sense — we know olives are saltier than satsumas not because we’ve learned and stored that as a symbolic relation, but because we’ve experienced their taste in the geometrical space of perception, and can compare our memories of the foods within that space (percepts as concepts, aha!).
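As a rough sketch of distance as dissimilarity, the Java below places three foods in a made-up three dimensional taste space. The class name `TasteSpace`, the axis order and every coordinate are assumptions invented for illustration, not measurements.

```java
// Hypothetical sketch: foods as points in a low dimensional taste space.
// Axis order and all coordinates are invented for illustration.
final class TasteSpace {
    static final int SALTINESS = 0, SWEETNESS = 1, SOURNESS = 2;

    static final double[] OLIVE   = {0.80, 0.1, 0.2};
    static final double[] SATSUMA = {0.05, 0.7, 0.4};
    static final double[] ORANGE  = {0.05, 0.6, 0.5};

    // Euclidean distance, read here as dissimilarity between two tastes.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Comparing along one axis needs no stored symbolic relation.
    static boolean isSaltier(double[] a, double[] b) {
        return a[SALTINESS] > b[SALTINESS];
    }

    public static void main(String[] args) {
        System.out.println("olive saltier than satsuma: " + isSaltier(OLIVE, SATSUMA));
        System.out.println("satsuma to orange: " + distance(SATSUMA, ORANGE));
        System.out.println("satsuma to olive:  " + distance(SATSUMA, OLIVE));
    }
}
```

With these invented numbers the satsuma sits much closer to the orange than to the olive, which is the whole point: similarity falls out of position rather than out of a stored fact.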

So that’s the jump from the high dimensional jumble of a neural network to a low dimensional, meaningful space of geometry. The next jump is via shape. We can say a particular kind of taste exists as a shape in low dimensional space. For example the archetypal taste of apple is the combination of particular sweetness, sourness, saltiness etc. Some apples are sharper than others, and so you get a range of values along each such dimension accordingly, forming a shape in that geometry.
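One simple way to picture a taste as a shape is a box with one range of values per dimension. The sketch below assumes exactly that; the `TasteRegion` class and all of its bounds are hypothetical, and a box is only the simplest shape that could be used.

```java
// Hypothetical sketch: a kind of taste as a box-shaped region, with one
// [min, max] range per dimension. All bounds are invented for illustration.
final class TasteRegion {
    private final double[] min;
    private final double[] max;

    TasteRegion(double[] min, double[] max) {
        this.min = min.clone();
        this.max = max.clone();
    }

    // A taste belongs to the shape if it lies within the range on every axis.
    boolean contains(double[] taste) {
        for (int i = 0; i < min.length; i++) {
            if (taste[i] < min[i] || taste[i] > max[i]) {
                return false;
            }
        }
        return true;
    }

    // Axis order: sweetness, sourness, saltiness.
    static final TasteRegion APPLE = new TasteRegion(
            new double[] {0.4, 0.2, 0.0},
            new double[] {0.8, 0.7, 0.1});

    public static void main(String[] args) {
        double[] sharpApple = {0.45, 0.65, 0.05};
        double[] olive = {0.10, 0.20, 0.80};
        System.out.println("sharp apple in region: " + APPLE.contains(sharpApple));
        System.out.println("olive in region: " + APPLE.contains(olive));
    }
}
```

A sharper apple moves along the sourness axis but stays inside the region, while an olive falls outside it on saltiness.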

So there we have it: three ways of representing an apple, either symbolically with the word “apple”, as a taste within the geometry of perception, or in the high dimensional jumble of sensory input. These are complementary levels of representation: if we want to remember to buy an apple we’ll just write down the word, and if we want to compare two apples we’ll do it using a geometrical dimension, as in “this apple is a bit sweeter than that one”.

Well, I think I’m treading a tightrope here between stating the obvious and being completely nonsensical; I’d be interested in hearing which way you think I’m falling. But I think this stuff is somehow really important for programmers to think about: how does your symbolic computation relate to the geometry of perception? I’ll try to relate this to computer music in a later blog post…

If you want to read more about this way of representing things, then please read Conceptual Spaces by Peter Gärdenfors, an excellent book which has much more detail than the summary here…

17 Comments

Kassen says:

31 March 2009 at 22:02

It’s not just you; I think most texts on AI tend to walk this tightrope between obviousness and being nonsensical (it made sense to me, BTW, though I have to admit much of it was obvious as well).

The approach I like is to focus on a given and clear problem instead of trying to say something generalised about intelligence (or methodology or…). Once we know what the problem is then “obviousness” becomes a wonderful property for steps towards a solution to have while “nonsensical” remarks probably indicate we don’t actually understand the problem yet. I’m sure this too sounds obvious but I think it’s a very important topic in AI.

I think that a geometry-based strategy may make a lot of sense for music, for example to figure out what type of sound or melody might go with a given fragment, *provided we can come up with a good mapping*. A relational, symbolic approach may also work. Much as real people will use different types of solving strategies depending on the occasion and their temperament, programs may use different strategies as well. We know that different kinds of intelligence exist and that different people will be stronger in different types of thinking. Some people are very strong in geometrical abstractions but have a weaker verbal intelligence, for example.

I’d say it’s probably a good idea to take inspiration from all of these methods and see how they might apply to programming situations, then pick one that suits our way of thinking, the language to be used and the problem at hand. Much like you I’m not sure how obvious or rambling my reply here is but I hope it’s of some use to you. I’m looking forward to your post on applying this to music; it would be good to look at an example that’s as concrete as possible, then see how these strategies might apply to it. I too am interested in “intelligence” but I don’t think either of us actually knows what it is (in a general sense); on the other hand, I do think we can tell what methods lead to good solutions in given cases.

Peter says:

31 March 2009 at 23:26

Something I’m quite keen on is the idea of tagging up different kinds of attributes semantically, so that one can get different geometric ‘projections’ in a sense by choosing how these should be weighted.
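One possible reading of a weighted “projection” is a weighted distance, where a weight of zero makes a dimension disappear from the comparison. The sketch below assumes that reading; the `WeightedDistance` class, the parameter order and all values are hypothetical.

```java
// Hypothetical sketch: weights choose which dimensions count in a comparison.
final class WeightedDistance {
    // Weighted Euclidean distance; a weight of 0 projects that dimension away.
    static double distance(double[] a, double[] b, double[] weights) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += weights[i] * d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Invented parameter order: cutoff, resonance, attack.
        double[] patchA = {0.2, 0.5, 0.1};
        double[] patchB = {0.2, 0.5, 0.9};
        double[] timbreOnly = {1.0, 1.0, 0.0};
        double[] dynamicsOnly = {0.0, 0.0, 1.0};
        System.out.println("timbre distance:   " + distance(patchA, patchB, timbreOnly));
        System.out.println("dynamics distance: " + distance(patchA, patchB, dynamicsOnly));
    }
}
```

The two invented patches are identical under the timbre weighting and far apart under the dynamics weighting, so the same pair of points gives a different geometry depending on the weights.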

To relate this to computer music: say you have a few synthesisers, which are making music under the control of some kind of algorithmic sequencer. You may have some synthesiser fields which relate to tone colour (filter parameters etc) while others relate to dynamics (envelopes). Both of those sets may in turn be subsets of ‘timbre’, while there may be another, possibly overlapping, set of dimensions for the musical grammar of the sequencer.

These can be simply specified in a language like Java or C# with custom attributes, something like (tending towards a loose taxonomy here):

@Tags("timbre, brightness")
double cutoff;

Where ‘@Tags’ is a type defined elsewhere, and subsequently reflected on in your system.

As well as potentially doing all kinds of highfalutin’ modelling based on dynamically weighted projections of the space, this has more mundane pragmatic applications, like being able to type a few characters to quickly filter the set of parameters that will be assigned to a control device; accessing a range of controls that traverses different orientations: for example, one moment controlling all parameters of the ‘bassline’ synthesiser, the next moment controlling the filters from each of the synthesisers… which is quite exciting in its own right, if you ask me.
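A minimal sketch of how such tags might be declared and reflected on in Java follows. Everything here is an assumption for illustration: the `Tags` annotation definition, the `BasslineSynth` class, its fields and the `TagFilter` helper are all hypothetical, and the annotation takes an array of strings rather than one comma-separated string so that no parsing is needed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical annotation, kept at runtime so reflection can see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Tags {
    String[] value();
}

// Hypothetical synthesiser with a few tagged parameters.
class BasslineSynth {
    @Tags({"timbre", "brightness"}) double cutoff;
    @Tags({"timbre", "brightness"}) double resonance;
    @Tags({"dynamics"}) double attack;
    double untagged;
}

final class TagFilter {
    // Names of the fields declared on 'type' that carry the given tag.
    static List<String> fieldsTagged(Class<?> type, String tag) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            Tags tags = field.getAnnotation(Tags.class);
            if (tags != null && Arrays.asList(tags.value()).contains(tag)) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(fieldsTagged(BasslineSynth.class, "timbre"));
        System.out.println(fieldsTagged(BasslineSynth.class, "dynamics"));
    }
}
```

Typing a tag name then selects a set of parameters across any class, which is the quick filtering described above. The order of the returned names follows `getDeclaredFields`, which Java does not guarantee.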

It’s easiest to see how this relates to variables, but I suspect that aspects of these principles may also be somewhat applicable to other things, like operations… I’m definitely going to start tending towards the nonsensical if I go on… if I haven’t already.

Ben Moran says:

1 April 2009 at 10:11

Are these representations really equivalent? In the continuous mixture of percepts we experience in the real world, can we know that ‘olives are saltier than satsumas’ without additionally having formed a concept of olives and satsumas?

It seems to me that perception alone isn’t enough to make this comparison, as you don’t yet have discrete objects to compare. You need to have some sort of models accounting for your perceptions and that’s where the discrete concepts come back in.

For instance, you might perform clustering on the raw data, and label clusters ‘olives’ and ‘satsumas’ separately; modelling each class in terms of their distribution on the low-dimensional taste manifold.

Then you can indeed compare the two models in terms of saltiness. But you’ve had to introduce an intermediate discretization step, and introduce some additional entities to do this, and that’s where the jump to something like ‘symbolic’ representations comes in – the symbols are labels of these models, which in turn seem to me to be much closer to ‘concepts’.
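The labelling step can be sketched as follows, assuming the clustering has already happened and each cluster has been given a label. The `ClassModel` class is hypothetical, it models a class by nothing more than the mean of its samples, and all sample values are invented.

```java
import java.util.Arrays;

// Hypothetical sketch: each labelled class is modelled by the mean of its
// samples in the low dimensional taste space. The clustering step is assumed
// to have happened already; all sample values are invented.
final class ClassModel {
    final String label;
    final double[] mean;

    ClassModel(String label, double[][] samples) {
        this.label = label;
        this.mean = new double[samples[0].length];
        for (double[] sample : samples) {
            for (int i = 0; i < mean.length; i++) {
                mean[i] += sample[i] / samples.length;
            }
        }
    }

    public static void main(String[] args) {
        final int saltiness = 0; // axis order: saltiness, sweetness
        ClassModel olives = new ClassModel("olives",
                new double[][] {{0.9, 0.1}, {0.7, 0.1}, {0.8, 0.2}});
        ClassModel satsumas = new ClassModel("satsumas",
                new double[][] {{0.1, 0.7}, {0.0, 0.8}});
        // The comparison is between two models, each addressed by its label.
        System.out.println(olives.label + " mean: " + Arrays.toString(olives.mean));
        System.out.println("olives saltier: "
                + (olives.mean[saltiness] > satsumas.mean[saltiness]));
    }
}
```

The label is the only symbolic part; the comparison itself still happens on a geometric axis.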

I haven’t read Gärdenfors but will take a look, sounds fascinating!

Rob Myers says:

1 April 2009 at 10:20

It’s the mixing in that is vital. The percepts of your senses are built on by cognition and the resulting structures are what is “parsed” by the mind. Both dualists and anti-dualists miss this and try to make aesthetics a matter either of pure sensuality or pure reason.

Hofstadter’s Copycat (I know I go on about it, but it is good…) as described in “Fluid Concepts and Creative Analogies” can be seen as having a strong element of shape and geometry. Concepts can be scaled or mirrored, and percepts are built as shapes of those concepts. It’s *messier* than a geometric solution should look, but it may be informative.

smallfried says:

1 April 2009 at 10:33

As in vision, the number of dimensions is directly related to our sense mechanics. For taste, this is five or six, depending if you include the newly speculated ‘fatty’ taste.
These dimensions are only about the taste of the apple, but there are millions of things associated to an apple, for instance size, form, texture, but also association with certain people, or experiences. Even if we look at just the taste of an apple, it has connections to all these things.
Now when you talk about the dimension of the geometric form, for an apple this should include the strengths of connections to all these other things. We do not limit ourselves to just a few dimensions; it just gets easier when comparing an apple to something else which has a strong connection to a certain thing. For instance, comparing a carrot to an apple on a level of redness is easy (as it is on a level of sweetness), but comparing it on a level of table-like-ness is hard.

Alex says:

1 April 2009 at 19:59

Wow, very interesting comments…

Ben, as I remember, Gärdenfors describes something rather similar to that. He sees a property like ‘bird’ as being a convex shape in a geometrical space. The centroid of that shape is its prototype. For example a robin would be closer to the centroid of the shape than an emu. I’m not sure what you mean by ‘model’; if you mean some relationship or shape in space then Gärdenfors would agree with you. For example he sees verbs as shapes in a space with a time dimension. Lawrence Barsalou has a related argument, seeing concepts as simulations of perceptions, also very interesting.
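The prototype idea can be sketched by taking the centroid of a handful of exemplars and measuring how far each exemplar sits from it. The two dimensions (body size, flying ability), the `Prototype` class and all values below are invented for illustration, and the centroid of a few sample points only approximates the centroid of a region.

```java
// Hypothetical sketch: a prototype as the centroid of exemplars in a made-up
// two dimensional space (body size, flying ability). All values are invented.
final class Prototype {
    static double[] centroid(double[][] points) {
        double[] c = new double[points[0].length];
        for (double[] p : points) {
            for (int i = 0; i < c.length; i++) {
                c[i] += p[i] / points.length;
            }
        }
        return c;
    }

    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sum);
    }

    static final double[] ROBIN = {0.10, 0.9};
    static final double[] EMU = {0.90, 0.0};
    // Exemplars of 'bird': the robin, three other small flying birds, the emu.
    static final double[][] BIRDS = {
        ROBIN, {0.10, 0.9}, {0.15, 0.9}, {0.20, 0.8}, EMU
    };

    public static void main(String[] args) {
        double[] bird = centroid(BIRDS);
        System.out.println("robin to prototype: " + distance(ROBIN, bird));
        System.out.println("emu to prototype:   " + distance(EMU, bird));
    }
}
```

With these made-up exemplars the robin lands near the centroid and the emu far from it, so typicality becomes a distance.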

Smallfried, yes, good points: we can perceive apples in all kinds of ways. Perhaps though we only attend to a few dimensions at a time. I *can* compare an apple to a table, or conjure up a distant memory of sharing an apple with a goat, but I don’t think I do that every time I eat one… Or even if I do thanks to some massively parallel cognitive processing, just a few of those dimensions will be weighted up to have any real influence over my reaction to the apple.

Kassen, thanks, a useful meditation on obviousness. I like to think though that it could be possible to find an approach that unifies symbolic, geometric and high dimensional representations. That seems to be the more human direction. For example when we speak we communicate not only symbolic words, but also geometric prosody in a space of pitch, time and timbre. The same goes with music.

Rob — I still haven’t read that book, I’ll look it up…

Peter — I’m not completely sure what you’re talking about, sounds good though!

Ben Moran says:

2 April 2009 at 00:06

That centroid step is indeed the sort of discretization I meant. After posting I found a review of Gärdenfors that describes it.

A purely geometrical, nearest-neighbour approach seems too limiting to me though. How would this approach manage with things like color invariance in changing illumination, or context-dependent completion of auditory gaps? You can get rid of them by changing the mapping from the high to the low dimensional space, but you need information about the category of the object to make the adjustment…

Barsalou’s “simulations” are examples of the models I was talking about. I was also thinking of probabilistic generative models like Hinton’s Helmholtz machine.

Alex says:

2 April 2009 at 11:30

I don’t see the problem in using both symbolic and geometric (conceptual) information to change the high-dimensional (sub-conceptual) mapping. Gärdenfors focuses on the geometric level as being the most important in concept formation, but the point is that the three levels of representation are complementary. But you’re probably right that Gärdenfors’ account doesn’t cover everything. I think his focus on geometry does add something to Barsalou’s account though. Barsalou does talk about comparing simulations of perceptions, and geometric comparisons would seem to be an obvious thing to do in a lot of cases.

Alex says:

2 April 2009 at 11:31

By the way we wrote a paper about conceptual space and music last year, might be interesting: http://doc.gold.ac.uk/isms/cspace/wp-content/uploads/2008/09/cc08.pdf

Kassen says:

2 April 2009 at 15:56

I think that the questions of equivalence (that Ben raised) and unification (that Alex touched on) are very closely related here. To me the underlying principle that will make or break these is defining our discussion domain very clearly.

We are, after all, working with analogies and descriptions here and those tend to break down when they don’t precisely line up with our needs. If I write a “drummer” class it may work very well but just because it’s an analogy for a drummer doesn’t mean I can feed it a beer if it doesn’t sound spontaneous during a recording session.

I think that when we create abstractions that precisely cover our discussion domain using these various methods then the methods are already equivalent (mathematically speaking, of course, one might be much easier to work with than another). I’m fairly certain of this, I strongly suspect it’s even obvious ;¬)

I also think that we can combine them and that we actually do so all the time, yet in a very informal and imprecise way (then we get bugs) as they will all be slightly inaccurate descriptions. To again point out something obvious (but non-trivial!): modern formal logic grew out of language philosophy.

What is it that you (Alex) are after here? Are you interested in how we think, how we creatively deal with these formal constructs in our code and looking to increase your effectiveness in thinking?

Pingback: Ancient History « Transfinite
Mike "Pomax" Kamermans says:

26 May 2009 at 12:32

It would, however, be great if we didn’t keep overloading words in the field of AI. Geometry is a mathematical field; as the word is used in AI at present, it basically serves to make it more obvious that you can do comparison both between and within dimensions, but that is not a new concept; it just got a new word.

With geometric algebra and regular geometry being relevant to a fair number of AI subfields, it’s maddeningly annoying to see yet another already established term being overloaded, instead of a more appropriate word or description being used.

Geometry is beautiful, and has very little to do with multidimensional representation of data as it’s used in AI (naturally, any dimensioned data space allows for geometry, but unless you can point at the various dimensions and show what their units are and how they interrelate, it’s not geometry… it’s abusing the word).

That said, technically this is just another symbolic representation, since “symbolic” just means we are describing the data itself. An apple can be symbolically represented by the word ‘apple’, but it can also be symbolically represented by a property/value set; that’s still symbolic representation.

Using property/value sets, or just property sets, is more functional for data processing in settings where natural language is hard or even meaningless, but the symbolic representation used only makes explicit what natural language assumes known. In effect, it’s not beautiful because it suddenly all makes sense, it *is* that sense, like a dictionary lets us make sense of complicated words. It’s the explicit description of any otherwise obfuscating description language.

Sadly, it can also lead to massive headaches, like when the idea is taken further, to frame descriptions for natural language (which take the idea much further) or higher order logic in general for reasoning in multidimensional data spaces.

AI’s pretty well defined by the fact that every problem allows for a plethora of approaches to solving it, but the major divide between operating on data (using symbolic representation) and operating on mappings (using neural nets) is still the same. Symbolic representation just falls apart into more approaches than going the neural net way.

(Plus, in symbolic approaches you can at least say what the algorithms mean, and thus explain why it works beyond showing mathematical validity. Using neural nets, that benefit is lost; you can show the mathematics behind the mapping is valid, but at the cost of not being able to say what the mapping itself represents)

lokori says:

1 June 2009 at 22:11

Though the ideas presented might seem pretty obvious, I think this little article has some novelty and definitely has value. Many good ideas often look “obvious” after someone writes them down in the right way.

This sounds interesting and good because I actually tried something like this a while ago. I was trying to evaluate poker opponents by putting their moves and playing patterns in a multi-dimensional space. Then by taking the distance between points in the space a “goodness” of a certain move could be evaluated. This experiment did seem promising, but didn’t work out that well after I tried it in practice. Perhaps the “shape” was missing and taking a distance between points was too straightforward 🙂

IMHO engineering-oriented programmers really benefit from this sort of writing floating around the internet. Reading through, and understanding, an awful lot of really complex AI research papers and books is not usually an option unless you are really into AI research. A lot of people just “need to get things done” and have a practical engineer’s view into the issue. Unless it’s certain that some complex writing really helps in “getting things done”, they won’t bother reading it carefully. Heck, I know that even for a researcher it’s a load of work to skim through piles of papers because most of them are actually useless crap 🙂

Though talking about “geometry” might be technically wrong here, I think it’s fine because it gets the message delivered. And that’s what counts if you step out of the researcher community. This is a blog after all, not a presentation or a research paper 🙂

Alex says:

6 June 2009 at 22:56

Hi Mike,

Thanks a lot for the thought provoking response.

Admittedly I am not a mathematician, and didn’t give any detail of geometrical properties and operations in this post, but I am talking about geometry here. The post is really about applying ideas from cognitive science to computer science/AI. In particular ideas about conceptual properties being convex regions (perhaps generated from concept prototypes as Voronoi points) within integral quality dimensions. The post was getting long, I ran out of steam so just dropped in a link to Gärdenfors’ fine book, where these ideas are lifted from.
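To make the Voronoi idea concrete, here is a small sketch in which each concept is the set of points nearer to its prototype than to any other. Under Euclidean distance those cells are convex. The `NearestPrototype` class, the labels and the prototype coordinates are all invented for illustration.

```java
// Hypothetical sketch: nearest-prototype classification, which carves the
// space into Voronoi cells, one convex cell per prototype.
final class NearestPrototype {
    static final String[] LABELS = {"apple", "olive", "satsuma"};
    // Axis order: sweetness, sourness, saltiness. Values are invented.
    static final double[][] PROTOTYPES = {
        {0.6, 0.4, 0.00},
        {0.1, 0.2, 0.80},
        {0.7, 0.5, 0.05},
    };

    // Returns the label of the prototype closest to the given taste.
    static String classify(double[] taste) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int p = 0; p < PROTOTYPES.length; p++) {
            double dist = 0.0;
            for (int i = 0; i < taste.length; i++) {
                double d = taste[i] - PROTOTYPES[p][i];
                dist += d * d; // squared distance preserves the ordering
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = p;
            }
        }
        return LABELS[best];
    }

    public static void main(String[] args) {
        System.out.println(classify(new double[] {0.15, 0.2, 0.7}));
        System.out.println(classify(new double[] {0.60, 0.4, 0.0}));
    }
}
```

The word returned is just a name for the cell the point falls in, which matches the idea that the symbolic level labels a region of the geometric one.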

“point[ing] at the various dimensions and show what their units are and how they interrelate” is what many people are trying to do in music psychology (including myself), and I hope results there can be applied usefully in the field of computational creativity.

Gärdenfors sets geometrical conceptual space as a level of representation separate from symbolic representation. I’m struggling to understand what you mean by “describing the data itself”. What’s the data? As I understand it Gärdenfors says the percept/concept exists as geometry and the symbolic level (i.e., a word) is just a name for it.

Pingback: meanderings » Blog Archive » Things I’ve liked: June 4th
ThinkAfrica says:

11 June 2009 at 11:46

Hi, I just found your site through Bad Astronomy writing on the Chiropocalypse. Have you read the book “Supersizing the Mind” by Andy Clark? I think that it might really be your figurative cup of tea. Rock on with the text music!

Alex says:

11 June 2009 at 12:24

No I haven’t, will have a look — thanks for the tip!
