Monday, January 02, 2012

Homosexuals, women and automated natural language representation.

I have the utmost respect for individual people who achieve something creative and worthy. Gender and sexuality have very little to do with this since intellect is not generally influenced negatively by either. Most societies, however, view gender and sexuality as more important than they are. This certainly applies to homosexuals and women, who I believe have had a rough time at the hands of bigots and idiots for millennia.

This imbalance is being addressed most in progressive societies, and I feel lucky to live in one of the most advanced cultures in this respect. That's not to say that there are no problems left. Nor is it fair to deny that there are some differences between groups of humans based on gender and sexuality that are sometimes ignored. These can cause problems too, and ought to be addressed as openly as the fact that the intelligence and personality of humans are more important than what they look like, their chromosomes and what they find attractive in other humans.

Ada Lovelace was a Victorian woman who worked closely with Charles Babbage in the middle of the 19th century in England. Charles Babbage brilliantly developed the theory for a number of different computing engines. One of these was the Analytical Engine, which is widely regarded as a very early precursor to the modern general purpose computer (a PC for example). Babbage was no doubt a genius at mathematical problems and at theorising on computation. However, the leap from maths to a general computer is often attributed to Ada rather than Charles. Lovelace's notes on the Analytical Engine contain an explanation of a way to represent alphabetical letters within the mathematical computation and thereby to compute on language rather than only on numbers.

These notes were mostly forgotten for half a century, but contain the seeds of an idea later developed by Alan Turing in the 1930s. Alan Turing was a brilliant mathematician who developed a number of ideas between the wars that laid the foundation for modern computing. Working in mathematical theory only, he was able to develop the idea of the Turing Machine. This is a model of a computing system that applies a fixed table of rules to symbols (letters, for example) on a tape, reading and rewriting them one at a time to produce the required output. It was used by subsequent computer scientists to develop actual computing devices and is still used today for theoretical work in computer science.
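To make the Turing Machine idea a little more concrete, here is a minimal sketch in Python. It is only an illustration and not Turing's own notation: the function name `run_turing_machine`, the rule table `SWAP_RULES` and the choice of `_` as the blank symbol are all my own assumptions. The machine walks along a tape of letters and swaps every "a" for a "b" and vice versa.

```python
from collections import defaultdict

BLANK = "_"  # assumed symbol for an empty tape cell


def run_turing_machine(rules, tape, state="start", halt="halt", max_steps=1000):
    """Run a one-tape machine and return the non-blank part of the final tape.

    rules maps (state, symbol) to (symbol_to_write, move, next_state),
    where move is "L" or "R".
    """
    cells = defaultdict(lambda: BLANK, enumerate(tape))
    head = 0
    steps = 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        symbol = cells[head]
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    used = [i for i, s in cells.items() if s != BLANK]
    if not used:
        return ""
    return "".join(cells[i] for i in range(min(used), max(used) + 1))


# Hypothetical rule table: swap the letters a and b, stop at the first blank.
SWAP_RULES = {
    ("start", "a"): ("b", "R", "start"),
    ("start", "b"): ("a", "R", "start"),
    ("start", BLANK): (BLANK, "R", "halt"),
}

print(run_turing_machine(SWAP_RULES, "abba"))  # prints baab
```

The machine itself knows nothing about letters. All of the behaviour lives in the rule table, so a different table gives a different "program" on the same machinery.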

The modern computer is almost ubiquitous in the West, but very few people understand how it works and perhaps fewer still what it actually is! In essence a modern PC is a translation machine. It takes various inputs in binary that are representations of things that are important to humans and translates them via calculations (simple additions, subtractions, Boolean logic etc) into different representations that are important to humans. The computer never has any care or understanding of what the representations mean; it's only the fact that a human can attribute importance to the alphabetical (or graphical) representations that gives a computer any worth.
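A small Python sketch of this "translation" idea, under my own assumptions: the helper names `to_bits` and `shift_letters` are made up for illustration, and the letters are assumed to be plain ASCII capitals. The first function shows the binary numbers that stand in for each letter; the second does nothing but arithmetic on those numbers, yet a human reads the result as a different word.

```python
def to_bits(text):
    """Return the 8-bit binary pattern that stands for each ASCII character."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


def shift_letters(text, offset):
    """Add offset to each capital letter's number, wrapping around within A-Z."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + offset) % 26 + ord("A")))
        else:
            out.append(ch)  # leave anything that is not a capital letter alone
    return "".join(out)


print(to_bits("HAL"))           # ['01001000', '01000001', '01001100']
print(shift_letters("HAL", 1))  # IBM
```

Nothing in the code knows that "HAL" or "IBM" mean anything. The meaning is supplied entirely by whoever reads the output.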

This is the key idea that Lovelace documented over 150 years ago and that Turing mathematised over 75 years ago and later put into practice at Bletchley Park. We owe the modern world and computers to a particular woman and a particular homosexual man. This post is a celebration of the positive effect these figures had on the world despite their situation being considered substandard even today by some. In my opinion it's time their contributions were better known and appreciated.

I'm quite sure that there were many, many individuals who contributed to this development who were neither homosexual nor female, and I don't wish to forget their efforts either, but that can wait for another post.

As long as computers are representing things we care about in electrical circuits, they will always need attention to keep them up-to-date with what we find informative. The Turing Test is another legacy of Alan Turing and one which is close to my heart, since I have wanted to crack it most of my life. It's extremely difficult to fool a human being when it comes to natural language, since we are almost without exception experts in it and can innately recognise minute variations from "correctness" almost instantly. I'm not sure that I agree with the idea that the nut is close to being cracked (news item link). Apple's Siri is at the forefront of popular attempts at the Turing Test, and even when it can recognise the language itself (not often, it seems) it very often fails with "understanding". It's the understanding part that is difficult, and it's predicated on many things that come naturally to us as speakers and listeners: our world knowledge, the context of the situation, body language, how we "feel" at that moment and various complex language recognition mechanisms that we don't even realise we're using, such as anaphora resolution (working out what a word like "it" or "she" refers back to).

In my opinion, one avenue of research that might lead to solving the AI language problem involves the representation of ideas from Neurolinguistics and Cognitive Science. By combining the results of research into how human brains use natural language with logic systems from philosophy and maths, we can create representations that underlie the vagaries of context, grammar and language recognition. These can then be combined with other research to develop a computing device that "thinks" like a human rather than "translates" like a machine.

P.S. Britain's treatment of Alan Turing after WW2 is a disgrace in my view and an apology extremely long overdue. This being a year of celebration of Alan Turing might also be a starter for the development of a movement to further quell prejudice and harm to people based on gender or sexuality.
