Month: March 2009

Sensation, perception and computation

There’s often seen to be a fight between symbolic AI and artificial neural networks (ANNs). The difference is between modeling within the grammar of a language and modeling through training a network of connections between cells. Both approaches have pros and cons, and you generally pick the approach that you think will serve you best. If you’re writing a database-backed website you’ll probably use symbolic computation in general, although it’s possible that you’ll use an ANN in something like a recommendation system.

There is a third approach though, one I’ve fallen in love with and which unifies the other two.  It’s really simple, too — it’s geometry.  Of course people use geometry in their software all the time, but the point is that if you see geometry as a way of modeling things, distinct from symbols and networks, then everything becomes beautiful and simple and unified.  Well, maybe a little.

Here’s an example.  I’m eating my lunch, and take a bite.  Thousands of sensors on my tongue, my mouth and my nose measure various specialised properties of the food.  Each sensor contributes its own dimension to the data sent towards the brain.  This is mixed in with information from other modalities — for example sight and sound are also known to influence taste.  You end up having to process tens of thousands of data measurements, producing datapoints existing in tens of thousands of dimensions.  Ouch.

Somehow all these dimensions are boiled down into just a few dimensions, e.g. bitterness, saltiness, sweetness, sourness and umami. This is where models such as artificial neural networks thrive, in constructing low dimensional perception out of high dimensional mess.

The boiled-down dimensions of bitterness and saltiness exist in low dimensional geometry, where distance has meaning as dissimilarity.  For example it’s easy to imagine placing a bunch of foods along a saltiness scale, and comparing them accordingly.  This makes perfect sense — we know olives are saltier than satsumas not because we’ve learned and stored that as a symbolic relation, but because we’ve experienced their taste in the geometrical space of perception, and can compare our memories of the foods within that space (percepts as concepts, aha!).
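To make the saltiness scale concrete, here is a minimal sketch in Python. The foods, the choice of five dimensions and every number are invented for illustration (they are not measurements), and the helper names are hypothetical. It only shows the idea that comparing along one dimension, or measuring distance across all of them, needs no stored symbolic relation.

```python
import math

# Hypothetical taste coordinates on a 0-1 scale; the numbers are made up
# for illustration and are not measured values.
DIMENSIONS = ("sweetness", "sourness", "saltiness", "bitterness", "umami")

FOODS = {
    "olive":   (0.05, 0.20, 0.80, 0.40, 0.30),
    "satsuma": (0.70, 0.50, 0.02, 0.05, 0.02),
    "lemon":   (0.15, 0.95, 0.01, 0.10, 0.01),
}

def value(food, dimension):
    """Position of a food along one named dimension."""
    return FOODS[food][DIMENSIONS.index(dimension)]

def saltier(a, b):
    """Compare two foods by where they sit on the saltiness scale."""
    return value(a, "saltiness") > value(b, "saltiness")

def dissimilarity(a, b):
    """Euclidean distance between two foods in taste space."""
    return math.dist(FOODS[a], FOODS[b])

# Foods laid out along the saltiness scale, least salty first.
by_saltiness = sorted(FOODS, key=lambda food: value(food, "saltiness"))
```

Nothing here stores “olives are saltier than satsumas” as a fact; the comparison falls out of where the two foods sit in the space. Euclidean distance is just one possible choice of dissimilarity measure.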

So that’s the jump from the high dimensional jumble of a neural network to a low dimensional, meaningful space of geometry.  The next jump is via shape.  We can say a particular kind of taste exists as a shape in low dimensional space.  For example the archetypal taste of apple is the combination of particular sweetness, sourness, saltiness etc.  Some apples are sharper than others, and so you get a range of values along each such dimension accordingly, forming a shape in that geometry.
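As a rough sketch of a taste as a shape, the Python below builds the crudest shape I can think of, an axis-aligned box, around a few apple samples and tests whether a new taste falls inside it. The sample values and function names are invented for illustration, and a real perceptual region would surely be a subtler shape than a box.

```python
# Hypothetical apple samples as (sweetness, sourness, saltiness);
# the values are invented for illustration.
APPLE_SAMPLES = [
    (0.60, 0.40, 0.01),  # a mild eating apple
    (0.45, 0.70, 0.01),  # a sharper one
    (0.70, 0.30, 0.02),  # a sweeter one
]

def bounding_region(samples):
    """Smallest axis-aligned box containing every sample, as (lows, highs)."""
    lows = tuple(min(axis) for axis in zip(*samples))
    highs = tuple(max(axis) for axis in zip(*samples))
    return lows, highs

def inside(point, region):
    """True if the point lies within the box on every dimension."""
    lows, highs = region
    return all(lo <= x <= hi for lo, x, hi in zip(lows, point, highs))

APPLE = bounding_region(APPLE_SAMPLES)
```

With these made-up numbers, a taste of (0.55, 0.50, 0.01) lands inside the apple region, while something as salty as (0.05, 0.20, 0.80) does not.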

So there we have it: three ways of representing an apple, symbolically with the word “apple”, as a taste within the geometry of perception, or in the high dimensional jumble of sensory input. These are complementary levels of representation. If we want to remember to buy an apple we’ll just write down the word, and if we want to compare two apples we’ll do it using a geometrical dimension: “this apple is a bit sweeter than that one”.

Well, I think I’m treading a tightrope here between stating the obvious and being completely nonsensical, and I’d be interested in hearing which way you think I’m falling. But I think this stuff is somehow really important for programmers to think about: how does your symbolic computation relate to the geometry of perception? I’ll try to relate this to computer music in a later blog post…

If you want to read more about this way of representing things, then please read Conceptual Spaces by Peter Gärdenfors, an excellent book which has much more detail than the summary here…

Mary Hallock-Greenewalt

Hallock-Greenewalt at the sarabet

“Broadly, it is my desire to express emotions by means of timed variations of light and color in a manner analogous to that employed in the art of music. Such expression may either be for its own sake, or … as an accompaniment.”

In 1906, about 40 years after the invention of the commercial light bulb, Mary Hallock-Greenewalt (1871-1950) began work on her colour organ, the sarabet. She was an accomplished musician, but wanted to create an equivalent artform for colour, which she called nourathar (derived from the Arabic for the essence/flavour/influence of light). Interestingly though, she came to the conclusion that colour wasn’t as important as brightness:

“In this art they, the darknesses and brightnesses, constitute the woof of the play.  They carry a chief burden of the transmitted feeling.  They also tend to make a oneness out of more than one colour, or colours, simultaneously produced.”

She was also quick to dismiss the idea of direct mappings between music and colour.

“… there is no octave to color.  color has no harmonics. … Its pristine strength is such that no two colors can fit together as identical.”

While she composed nourathar pieces to accompany music, she was against cross-domain mappings in general.

“To seek to fasten the form of one art on the form of another art, is, on the face of it, a mistake, if not an impossibility.  They are organically different things.  They will speak in different ways.”

I’m not sure I agree with this strong claim; the human senses are integrated, after all. But I still think it insightful to reject the naive “colour scales” which others came up with. While synaesthetes can experience pitch as colour (or vice versa), I understand that no two synaesthetes experience the same scale.

An example of Hallock-Greenewalt's patented notation system

All the quotes in this post came from Hallock-Greenewalt’s book “Nourathar: The Fine Art of Light-Color Playing”, which is a joy to read. She was not the first colour organist, but from what I’ve seen she was the most insightful and interesting of the bunch. Sadly, however, no video recordings exist from back then, so we can only imagine what her performances could have been like, with only her notation to guide us.

I gave a quick dorkbot presentation about Mary Hallock-Greenewalt a couple of years ago, and one audience member jokingly accused me of inventing her to justify VJ culture with false history. Well, her work is well documented in her writing and patent applications, but she should certainly be better known. I recently saw a talk about colour organists which didn’t mention her, despite her huge contribution to the field.

I’ll finish this post with one last quote from the woman herself.

“Is there no expressing of fervor in the deepening of a rose to red? Can quality of ardor not be suggested in the quickness or slowness with which this transition is done?  Can zeal or eagerness not be expressed in the manner of change from blue to purple?  Are colors not “warm” or “cold”?  Is there not the fervid, the burning of intensity of feeling in the ray’s glowing into or embering back?  So much there is to choose from.”

Posted on the occasion of Ada Lovelace day 2009.


How we program

I’ve always wondered how we do programming. Code can be so clean and straight-faced, but when you step back and try to think about how you write it, a darkness descends. It’s tempting to think that your brain is working like a computer program, transforming a symbolic problem into a textual answer as sourcecode. But I don’t think that’s what is going on at all — if problems came specified in formal language, then programming would be a very different experience. We instead start with a mess, and try to find all the problems in it through the process of designing and writing code.

There’s a lovely paper called Mental imagery in program design and visual programming by Marian Petre and Alan F. Blackwell, with many great quotes from programmers trying to introspect on their work. Here are some tasters:

“ … it moves in my head … like dancing symbols … I can see the strings [of symbols] assemble and transform, like luminous characters suspended behind my eyelids … ”

Programming is a dance of symbols behind the eyelids. Write that into a QA standard.

“It buzzes … there are things I know by the sounds, by the textures of sound or the loudness … it’s like I hear the glitches, or I hear the bits that aren’t worked out yet … ”

This programmer is describing re-purposing their sense of hearing to produce computer software. Quick, strap them into an fMRI machine!

“values as graphs in the head … flip into a different domain … transform into a combined graph … (value against time; amplitude against frequency; amplitude against time) … ”

Hmm, programming as relationships within abstract spaces, and relating those spaces to one another. A nice model for thought in general, perhaps?

“It’s like describing all the dimensions of a problem in 2D, and in the third dimension you’re putting closeness to a solution.”

Another, rather different spatial approach, where goodness of solution is somehow represented by something like height.
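One way to picture this in code is as a landscape over a 2D grid, where the height at each point stands for closeness to a solution and you walk uphill. This is only my own loose reading of the quote, not something the programmer described, and the landscape function below is entirely made up.

```python
def closeness(x, y):
    """Made-up 'closeness to a solution': a single peak at (3, -2)."""
    return -((x - 3) ** 2 + (y + 2) ** 2)

def climb(x, y, height=closeness):
    """Step to the highest neighbouring grid cell until none is higher."""
    while True:
        # The 3x3 neighbourhood around (x, y), including (x, y) itself.
        neighbours = [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
        best = max(neighbours, key=lambda p: height(*p))
        if best == (x, y):
            return x, y
        x, y = best
```

With a single peak like this one, climb(0, 0) walks to (3, -2) wherever it starts. A messier landscape with several peaks could leave the walker stuck on a lower one.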

“ … oh, that happens over there … it’s on the horizon, so I can keep an eye on it, but I don’t really need to know … ”

Exasperating, but it sums things up nicely. This kind of introspection is just too hard, because so many of these thought processes are entirely subconscious. For example, you try for hours to solve a tricky problem and give up, then the answer pops into your head while you’re cycling home, otherwise thinking about dinner.

That said, while the above evidence is purely anecdotal, it gives some hints about what might be going on. I like to think that programmers tap into a general human ability to organise a messy world into far tidier problem spaces, and find their way around such spaces in much the same way as they do when bumping around in a pitch black room…

Happy old year

Hope your year has been good so far. I haven’t posted in a while, so here’s a quick update.

Got a nice gig coming up at strp festival in Eindhoven with Dave on the 10th April.

If you want evidence of me being alive then you can follow my twitter feed or my citeulike reading. Actually, I’ve been trying to summarise my reading on twitter too, although that can be tricky in 140 chars…

I’ve got a few things to turn into blog posts but that’ll do for the time being.