Natural Language Understanding

“It was a true solar-plexus blow, and completely knocked out, Perkins staggered back against the instrument-board. His outflung arm pushed the power-lever out to its last notch, throwing full current through the bar, which was pointed straight up as it had been when they made their landing.”

My current research in AI, such as it is, is an attempt to build a system that’s capable of understanding the above quote.  It’s from the middle of a book, and it is much much harder to understand, fully, than you might think. What I intend to do here is to unravel the process by which someone reading the book could be said to understand it. Largely the concern is about what kind of mental structures are being built and what structures must have been built by reading the previous half of the book for the passage to do what it does in the mind of the reader.

Without further ado, let us jump into the quote, which starts at the beginning of a paragraph:

“It was a true solar-plexus blow,”

There are two sources for the comprehension of this clause. First is the preceding paragraph, where a fight is described. A scene and script have been built up, like a movie in the mind. In particular, one man is holding a girl (who is struggling to escape) and another is trying to tie her feet. She kicks the second man, and that is the blow that’s being referred to.

Unlike a cinematic movie, however, much that would be evident on the screen has been left out. The specific positions of the bodies, the clothing in some cases, and many aspects of the background have been left to the imagination. In other words, the “movie” is a sequence of abstractions.

It is in no sense simply a pile of predicates, however. When I read this, I come away with a semi-visual motion script, such as could be used to orchestrate a re-enactment by action-figure dolls, even though the text doesn’t come close to specifying the actual positions or motions I imagine.

The second source is the reader’s memories of pertinent experiences, either of watching fights or having been in them. In the multi-level abstraction structure that’s being built, by and large, at least in the hands of a skillful writer, the things that get mentioned are the things that you’d pay attention to if watching the scene. It’s well established, for example in studies of eyewitness accounts in criminology, that people confabulate what happens between such points in their memory of actual events, let alone in their memory of verbal stories. So to that extent, the structure of a story reflects that of memory.

If you’ve ever taken a hard blow to the solar plexus, you’ll have a much deeper understanding of this passage than someone who hasn’t. I have, and the sensation is unique; nothing else in my experience feels the same or has the same effects. If you have, note that among the few descriptions of clothing that were provided was that the girl was wearing riding boots.

At a higher level, the scene is part of an attempted abduction of the girl by the men. On this level, the reader is on tenterhooks to discover whether the abduction will succeed, given the girl’s spirited and at least partially efficacious resistance.

“and completely knocked out,”

Syntactically, this is a bit of a garden path; we expect it to be a conjunct of the previous predicate until we see the comma. It turns out to be a participle introducing the second clause. It appears to be for the benefit of those readers who have not experienced solar plexus blows. It describes the effect well enough to follow the action sensibly, but doesn’t really capture the experience.

This points out that there can be different amounts of actual understanding going on in different readers each of whom would claim to have understood the passage: there can be ties to emotions, sensations, memories, and/or mental models in various combinations and strengths.

“Perkins staggered back against the instrument-board.”

Perkins is the second man, and his staggering back is completely predictable from the description of the action so far. In fact, it’s predictable that staggering back is part of the process of his collapsing, which isn’t, and doesn’t need to be, stated explicitly. The ability to predict is one of the key elements of understanding, so we can propose that there is a model of collapsing after being knocked out (or struck in the solar plexus) that abstracts away from any particulars about the individual (or his specific position) that allows the extrapolation of the “movie,” if need be.

It’s the instrument-board that is the new item. In order to understand this, the reader has to pull into play a much broader background structure than heretofore. The action is taking place in the control room of a spaceship, and the “instrument-board” is its control panel. One is reminded of a programming language variable being looked up in a containing context after being found unbound in the local one.
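The variable-lookup analogy can be made concrete with a small sketch. The two dictionaries and their entries below are made up for illustration (they are not a claim about how a reader's mind stores anything); the point is only that a term unbound in the local scene gets resolved in the enclosing context built up by the rest of the book.

```python
from collections import ChainMap

# Hypothetical, hand-written contexts; a real reader's models are far richer.
scene = {
    "Perkins": "the second man, trying to tie the girl's feet",
    "blow": "the girl's kick",
}
book_so_far = {
    "setting": "control room of a spaceship",
    "instrument-board": "control panel of the spaceship",
}

# ChainMap searches the innermost context first, then falls back outward.
reader = ChainMap(scene, book_so_far)


def bound_in(term, chain):
    """Return the index of the first context that binds the term, or None."""
    for depth, context in enumerate(chain.maps):
        if term in context:
            return depth
    return None


print(bound_in("blow", reader))              # bound locally, depth 0
print(bound_in("instrument-board", reader))  # found one level out, depth 1
print(reader["instrument-board"])
```

A term that no context binds simply fails to resolve, which is roughly the position of a reader who opens the book at this paragraph.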

“His outflung arm pushed the power-lever out to its last notch,”

Now we see that there are multiple disconnected levels of abstraction, as well as disconnected items of physical description, that need interpolation. We hear about the arm and the lever, which is local and concrete. We can imagine an arm striking a lever and pushing it. We still don’t know anything about how big the board is, where the lever is on the board, whether the board is horizontal, vertical, or tilted. We don’t know where Perkins is with respect to the pilot’s (and co-pilot’s?) seat(s). On the other hand, we do know much higher-level things about control panels, and power levers (I, for example, call to mind typical airplane cockpits as well as the spaceship control rooms in SF movies I’ve seen.) Although various things about the spaceship have been mentioned before in the book, no good description of the control room has been given; we have to assemble this as some useful level of abstraction as we read this passage.

“throwing full current through the bar,”

This is the key development, not only of the passage, but of the entire book. Note that without the model that the reader will have built up by that point, the phrase means virtually nothing. You might well think that there is a place where drinks are served on the ship.

The book is, as the perspicacious reader will have guessed, E. E. Smith’s 1928 space opera classic Skylark of Space, and its premise is that a bar of copper plated with a (fictional) transuranic metal can convert its mass to pure kinetic energy with the application of a current. The bar is the drive motor of the spaceship. As far as I know, there’s no other context, even in science fiction, where the motor of any vehicle is referred to as a bar. It certainly doesn’t happen in reality, and will not have been the case in any reader’s experience. Many of us have seen bars of copper, and to that extent the symbol is “grounded;” but in the salient sense of the story it is not. Its meaning comes completely from the model that has been built up out of pure words. (In simple terms, no AI with a static, pre-programmed ontology will be able to understand this century-old kids’ SF adventure story at even the most pedestrian level.)

The really remarkable thing about this phrase is that it throws implications across virtually every level at which the book is to be understood.

At the physical level, it describes the closing of a circuit and the application of voltage to a piece of metal.

At the technological level, using the (fictional) understanding of space drives built up before, that means that the ship will be placed under very heavy acceleration.

Back at the physical bodies level in which we were understanding the fight before, it means that the parties will be thrown to the floor and unable to move.

The fight will be at least suspended. At something closer to the plot level, the parties, pinned to the floor by acceleration, will be unable to stop the ship until the fuel runs out, stranding them far away in space.

This converts them from abductors and victim, where the conflict is all interpersonal, to fellow lifeboat passengers facing a common doom. There is room for various character development as they adjust to the shift in circumstances.

In a larger context, from the outside it will seem as if the abduction succeeded. This will force the girl’s fiance to give chase in his own spaceship (already a hackneyed plot by 1928, to be sure, but thereby all the more predictable for the reader).

But the fact that the ship will fly at top acceleration until its fuel is exhausted implies that the succeeding action will be far-removed from the familiar terrestrial scenes it has taken place in so far. In fact it converts the story from one of personalities and struggle in familiar circumstances (a la the Iliad) to a true voyage of the imagination (like the Odyssey).

If you want to understand things deeply, you typically want to call in comparable things which can illuminate them by analogy. In this case the arm on the lever is like the tornado in Wizard of Oz (or of course the storm in the Odyssey); it not only throws the protagonists into a strange new world, but motivates their subsequent adventures with the quest to get home.

There is NO knowledge representation and inference scheme in the NLP field today that has even a snowball’s chance of capturing all this. But a human reader with a good grounding in the classics can see that this sentence is the turning point and spark of the whole book on something like five different levels simultaneously.

That’s quite a kick.

“… which was pointed straight up as it had been when they made their landing.”

In something of an anticlimax, Smith is keeping the reader up to date with the physical model of the ship, just in case someone wonders why, having gone down to land, it didn’t keep going down when the juice was turned back on. It had come down backwards, hanging on its thrust. Yet another model — and level of abstraction.

Graphene transistor roundup

Phaedon Avouris, winner of the Feynman Prize in 1999, is head of the nanoscale science and technology group at IBM, which has recently reported significant advances in synthesizing transistors from graphene using conventional lithography methods.

IBM Demonstrates Graphene Transistor Twice as Fast as Silicon

Graphene transistors promise 100GHz speeds

Graphene Transistors that Can Work at Blistering Speeds

Big Blue demos 100GHz chip

Nanoclast interviews Avouris

and the Science paper,

100-GHz Transistors from Wafer-Scale Epitaxial Graphene

What does this all mean?  Basically, they have overcome a couple of substantial hurdles on the way to a carbon-based electronics, namely the bandgap issue and the ability to fab at wafer scale.  They still have a way to go: they need to bring gate length down by a factor of 10 or so to be in the range of silicon, and probably a few more hurdles and a lot of just plain legwork as well.  But if the research goes through to development, and the development goes through to manufacturing, we’ll have chips that are about two-and-a-half times as fast as the corresponding ones in silicon.

The bottom line, for my money, is that Moore’s Law is safe (in the sense that it will continue to hold true) for another decade at least.  I don’t see this as being a huge spike ahead of Moore’s Law, since graphene has a lot of catch-up to play, but in the long run it probably has more upside potential in speed and size, especially if/when they can get those nanoribbons atomically precise.

The first AI blog

The first AI blog was written by a major, highly respected figure in the field. It consisted, as a blog should, of a series of short essays on various subjects relating to the central topic. It appeared in the mid-80s, just as the ARPAnet was transforming over into the internet.

The only little thing I forgot to mention was that it didn’t actually appear in blog form, which of course hadn’t been invented. The WWW didn’t appear until the next decade. It appeared in book form, albeit a somewhat unusual one since it was, as mentioned, a series of short essays, one to a page. It was, of course, Marvin Minsky’s Society of Mind.

Of course, you’re reading a blog about AI right now. The difference is that that was Minsky, and this is merely me. If you haven’t read SoM, put down your computer and go read it now.

Good. You’re back.  Here’s why SoM is relevant to our subject of whether and how soon AI is possible:

It remains a curious fact that the AI community has, for the most part, not pursued Society of Mind-like theories. It is likely that Minsky’s framework was simply ahead of its time, in the sense that in the 1980s and 1990s, there were few AI researchers who could comfortably conceive of the full scope of issues Minsky discussed—including learning, reasoning, language, perception, action, representation, and so forth. Instead the field has shattered into dozens of subfields populated by researchers with very different goals and who speak very different technical languages. But as the field matures, the population of AI researchers with broad perspectives will surely increase, and we hope that they will choose to revisit the Society of Mind theory with a fresh eye.  (Push Singh — further quotes from the same source)

In other words, here’s a comprehensive theory of what an AI architecture ought to look like that is the summary of the lifework of one of the founders and leaders of the field, and yet no one has seriously tried to implement it.  (When I say serious, I mean put as much effort into it as has gone into, say, Grand Theft Auto.)  (There has been a serious effort to implement the theoretical approach of the CMU wing of classical AI, namely SOAR.)

Part of the reason for this is that SoM is in some sense only half a theory:

Minsky sees the mind as a vast diversity of cognitive processes each specialized to perform some type of function, such as expecting, predicting, repairing, remembering, revising, debugging, acting, comparing, generalizing, exemplifying, analogizing, simplifying, and many other such ‘ways of thinking’. There is nothing especially common or uniform about these functions; each agent can be based on a different type of process with its own distinct kinds of purposes, languages for describing things, ways of representing knowledge, methods for producing inferences, and so forth.
To get a handle on this diversity, Minsky adopts a language that is rather neutral about the internal composition of cognitive processes. He introduces the term ‘agent’ to describe any component of a cognitive process that is simple enough to understand, and the term ‘agency’ to describe societies of such agents that together perform functions more complex than any single agent could.

… but SoM doesn’t have a lot to say about what the individual functions are or how implemented, outside a few examples.  Since AI has for the past few decades concentrated on immediate results, most of the work has been on parts of the problem that could be described as stuff that would be inside a single agent, or at most an agency.

A good example of this happened a few years ago with the winning of the DARPA Grand Challenge and thus the development of the self-driving car. A few months after that happened, I was having a conversation with an AI researcher at a conference.  I maintained that the difference between the results of the first and second races — nobody got more than a mile or so, and then a couple years later several cars finished the whole 130-mile course –  represented real progress.  He pooh-poohed the idea.  All the techniques used in the cars had been previously known and published, he said.  All that had happened was that they had been integrated together into a working system.

I think this attitude goes a long way to explaining the lack of work on SoM and other overall cognitive architecture theories.  But as I reasoned previously:

The difference was, the Wright brothers knew an extra Good Trick, which was how to control the plane in the air once it was flying.
So to develop a working AI, we need the power, which we don’t think is going to be a problem. We need the lift, which is the kind of techniques found in narrow AIs and discussed above. And finally we need the control.

SoM represents a theory of how the control might work.  Where does that leave us?  Can we simply take Minsky’s books and papers and build an AI with all the existing narrow skill programs acting as agents? Hardly.  There’s a lot of work to be done, and probably several new Good Tricks left to be found.

The bottom line, though, is that we are not facing a blank wall.  We are facing a corridor with a sign reading “This way to the egress.”  Indeed we are partway down the corridor already; robotics and self-driving cars have required the development of integrated cognitive architectures along the lines that will probably lead to success.  Note that Brooks’ subsumption architecture had a lot in common with SoM.

So there is at least a case to be made that we are into the home stretch.  Of course that’s where the race really heats up and all the excitement happens…

Analogical Quadrature

So far, in making my case that AI is (a) possible and (b) likely in the next decade or two, I’ve focused on techniques which are or easily could be part of a generally intelligent system, and which will clearly be enhanced by the two orders of magnitude increase in processing power we expect from Moore’s Law by 2020.  (Note — we certainly don’t have to wait till 2020 to find out.  Existing hardware is well into the usable range, probably for less than $1M.  But you don’t get too many researchers, and no hobbyists, doing their research on machines like that today. You will in 2020.)

To make a heavier-than-air airplane fly, you need an engine.  If you have an airframe with lift-to-drag ratio r, stall speed s, and weight w, and a propeller with thrust efficiency e, you need an engine with power p=sw/(re) to fly: the thrust must equal the drag w/r, and delivering that thrust at speed s through the propeller costs sw/(re). Power<p, no fly. Power>p, fly.
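As a rough worked example of the power threshold: in steady level flight the thrust has to equal the drag, which is the weight divided by the lift-to-drag ratio, and the shaft power is thrust times speed divided by propeller efficiency. The sketch below just evaluates that relation in SI units. The function names and the airframe numbers are invented for illustration and do not describe the Wright or Langley machines.

```python
def required_power(stall_speed, weight, lift_to_drag, prop_efficiency):
    """Minimum shaft power (W) to hold level flight at stall speed.

    stall_speed in m/s, weight in newtons; the other two are dimensionless.
    """
    thrust = weight / lift_to_drag  # thrust must balance drag
    return thrust * stall_speed / prop_efficiency


def can_fly(engine_power, stall_speed, weight, lift_to_drag, prop_efficiency):
    """True if the engine exceeds the threshold power."""
    needed = required_power(stall_speed, weight, lift_to_drag, prop_efficiency)
    return engine_power > needed


# Hypothetical airframe: 3000 N, stalls at 12 m/s, L/D of 8, 70% efficient prop.
needed = required_power(12.0, 3000.0, 8.0, 0.7)
print(f"threshold power: {needed:.0f} W")
print(can_fly(9000.0, 12.0, 3000.0, 8.0, 0.7))  # engine above the threshold
print(can_fly(5000.0, 12.0, 3000.0, 8.0, 0.7))  # engine below the threshold
```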

Both of the major American flying machine efforts understood this.  Langley spent huge effort developing light, powerful engines.  The brothers Wright built their own aeroengine from scratch in their bicycle shop.

The difference was, the Wright brothers knew an extra Good Trick, which was how to control the plane in the air once it was flying.

So to develop a working AI, we need the power, which we don’t think is going to be a problem. We need the lift, which is the kind of techniques found in narrow AIs and discussed above. And finally we need the control.

What I just said is an example of reasoning by analogy.  To an extent much greater than usually realized, most cognition and reasoning is based on analogy.  When you perform a physical skill, the specific sequence of sensory and motor signals is never exactly any of the ones that happened during practice; but they’re close enough that the mapping is straightforward.

This is something that is well-known to the AI mainstream:

But “the big feature of human-level intelligence is not what it does when it works but what it does when it’s stuck,” Minsky said. When faced with novelty, Minsky claims, human intelligence applies “reasoning by analogy” to make the most direct tap into the cognitive glue that fuses knowledge domains.
Reasoning by analogy is a way of adapting old knowledge, which almost never perfectly matches the present situation, by following a recipe of detecting differences and tweaking parameters. It all happens so quickly that no “thinking” seems to be involved.  (EE Times)

The particular kind of reasoning by analogy that would make an associative memory machine work well can be called analogical quadrature.  This is the form of problem done most famously by Melanie Mitchell’s Copycat program: you have three things A, B, and C, and you want to find a fourth D such that A:B::C:D.  In the associative memory scheme, you need to do not the actual action you did in the memory, but the action that fits the current situation the way the remembered action fit the remembered situation.

As a simple example, if the remembered action was done by someone else, the parallel could be mapping things so that the action is done by you this time. In other words, analogical quadrature enables imitation.

If you can somehow represent your concepts as points in an n-dimensional space, analogical quadrature is falling-down easy: D=C+B-A in ordinary vector algebra. Of course, sometimes the mapping into n-space is problematical, and we are thrown back on symbolic methods such as those of the FARGitecture.
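Here is a minimal sketch of the vector version, applied to the imitation case. The three-feature space (actor is self, object is a cup, hand is closed) and the named concepts are invented for the example; finding a good mapping into n-space is exactly the hard part and is assumed away here.

```python
def quadrature(a, b, c):
    """Return d such that a:b :: c:d, computed componentwise as d = c + b - a."""
    return [ci + bi - ai for ai, bi, ci in zip(a, b, c)]


def nearest(point, named_points):
    """Name of the stored concept closest to point (squared Euclidean distance)."""
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))
    return min(named_points, key=lambda name: sq_dist(point, named_points[name]))


# Hypothetical features: [actor_is_self, object_is_cup, hand_closed]
concepts = {
    "other near cup":   [0, 1, 0],
    "other grasps cup": [0, 1, 1],
    "self near cup":    [1, 1, 0],
    "self grasps cup":  [1, 1, 1],
}

# Remembered: "other near cup" led to "other grasps cup".
# Current situation: "self near cup". What is the corresponding action?
d = quadrature(concepts["other near cup"],
               concepts["other grasps cup"],
               concepts["self near cup"])
print(d, nearest(d, concepts))
```

The nearest-neighbour step matters in practice because C + B - A will rarely land exactly on a stored concept.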

Those have their own problems, essentially the same ones as any symbolic AI: the operations and ontology in, e.g., Copycat are all idiosyncratic and hand-coded, and there’s no clear way to build a learning machine that extends them automatically.

I’ll go out on a limb and guess that the ultimate solution will involve elements of both extremes.  Search will be needed both to find new operations for symbolic formulations, and to find appropriate mappings into n-space for the subsymbolic ones.  A few key insights — new Good Tricks — will be necessary to unify the known methods and give us a solid understanding of, and engine for, analogical quadrature.  That’ll be a huge step towards general AI.

Associative memories

AI researchers in the 80s ran into a problem: the more their systems knew, the slower they ran.  Whereas we know that people who learn more tend to get faster (and better in other ways) at whatever it is they’re doing.

The solution, of course, is: Duh. The brain doesn’t work like a von Neumann model with an active processor and passive memory.  It has, in a simplified sense, a processor per fact, one per memory.  If I hold up an object and ask you what it is, you don’t calculate some canonicalization of it as a key into an indexed database. You compare it simultaneously to everything you’ve ever seen (and still remember).  Oh, yeah, that’s that potted aspidistra that Aunt Suzie keeps in her front hallway, with the burn mark from the time she …

The processing power necessary to do that kind of parallel matching is high, but not higher than the kind of processing power that we already know the brain has.  It’s also not higher than the processing power we expect to be able to throw at the problem by 2020 or so.  Suppose it takes a million ops to compare a sensed object to a memory.  That’s 10 MIPS per memory to do it in a tenth of a second.  A modern workstation with 10 gigaops could handle 1000 concepts. A GPGPU with a teraops could handle 100K, which is still probably in the hypohuman range.  By 2020, a same priced GPGPU could do 10M concepts, which is right smack in the human range by my best estimate.
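The arithmetic is easy to check. The sketch below uses only the assumptions stated above (a million ops per comparison, every memory compared within a tenth of a second); the constant and function names are mine, and the 2020 figure simply assumes a hundredfold increase over a teraops.

```python
OPS_PER_COMPARISON = 10**6   # assumed cost of matching one memory
RETRIEVALS_PER_SECOND = 10   # one full match every tenth of a second


def concepts_supported(machine_ops_per_second):
    """How many memories a machine can match in parallel at that rate."""
    ops_per_concept_per_second = OPS_PER_COMPARISON * RETRIEVALS_PER_SECOND
    return machine_ops_per_second // ops_per_concept_per_second


for label, ops in [("workstation, 10 gigaops", 10**10),
                   ("GPGPU, 1 teraops", 10**12),
                   ("2020 GPGPU, 100 teraops", 10**14)]:
    print(f"{label}: {concepts_supported(ops):,} concepts")
```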

Associative memory gets you a lot.  You don’t have to parse an unknown object for algorithmic retrieval.  You don’t have to come up with some one-size-fits-all representation and/or classification scheme.  Indeed, each object in memory can have its own representation if necessary or useful.

It gets better.  The memories aren’t all, or even mostly, objects.  They’re typically actions.  Let’s suppose the actions are represented as situation-action-resulting situation triples — something like Minsky’s trans-frames.  Then we can use the associative memory to

  • recognize, as described above
  • predict: search on the situation and action; the prediction is the result in the best match
  • plan: match on situation and desired result; do the action from the best match
  • generalize: every time a was done, b happened
  • model: by chaining predictions, etc
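The first three uses in the list can be sketched in a few lines. Everything here is a toy under stated assumptions: situations are bare sets of feature strings, similarity is Jaccard overlap, and matching is a serial scan rather than the parallel hardware described above. The class and method names are hypothetical, not taken from Minsky or from any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    situation: frozenset
    action: str
    result: frozenset


def overlap(a, b):
    """Jaccard similarity between two feature sets (1.0 means identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


class AssociativeMemory:
    def __init__(self, triples):
        self.triples = list(triples)

    def recognize(self, situation):
        """Best-matching stored memory for the sensed situation."""
        return max(self.triples, key=lambda t: overlap(t.situation, situation))

    def predict(self, situation, action):
        """Result of the best match on situation among memories of this action."""
        candidates = [t for t in self.triples if t.action == action]
        if not candidates:
            return None
        return max(candidates, key=lambda t: overlap(t.situation, situation)).result

    def plan(self, situation, goal):
        """Action from the memory that best matches situation and desired result."""
        best = max(self.triples,
                   key=lambda t: overlap(t.situation, situation) + overlap(t.result, goal))
        return best.action


memory = AssociativeMemory([
    Triple(frozenset({"door", "closed"}), "push", frozenset({"door", "open"})),
    Triple(frozenset({"door", "open"}), "pull", frozenset({"door", "closed"})),
    Triple(frozenset({"cup", "full"}), "drink", frozenset({"cup", "empty"})),
])

seen = frozenset({"door", "closed", "red"})  # a door never seen in exactly this form
print(memory.recognize(seen).action)
print(memory.predict(seen, "push"))
print(memory.plan(frozenset({"cup", "full"}), frozenset({"cup", "empty"})))
```

Generalizing and modeling would sit on top of these primitives, for instance by counting how often an action is followed by a result, or by feeding each predicted result back in as the next situation.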

There was an attempt to do this kind of thing in mainstream AI under the name “case-based reasoning” a couple of decades ago, but it appears to have foundered for several reasons, not least of which was the inability to do heavy-duty parallel matching on extensive memory sets.

There are a number of things that need to be added to the scheme for it to be useful and robust, like embedding it in a hierarchical, multiagent architecture, the ability to do analogical quadrature, and the ability to find useful representations.  But that’s for another post.

Baytubes

Bayer (the same company that makes the aspirin) is now beginning to manufacture multi-walled carbon nanotubes in industrial quantities.  The pilot plant will produce 200 tons per year, and the market is expected to grow at 25% per year.

The MWCNTs are for materials use, meaning mostly fiber-reinforced composites, e.g. airplanes, tennis racquets, arrows, and the like.  The major advantages over conventional polymers / fibers are that the CNTs are stronger and conductive (both electrically and thermally) — producing a plastic that is more like metal in many ways, but still much lighter.  The conductivity is supposed to be comparable to copper, i.e. good enough to use as wiring in many applications.  Looking at the data for CNTs as a polymer additive, the major effect on mechanical properties was to make them less stretchy (and about 10% stronger), while having a major effect on conductivity properties.  Nobody has yet, as far as I know, managed to figure out how to make a composite that has the really high tensile strength possibilities of the raw nanotubes.  Alternatively, CNTs in light metal matrices such as aluminum or magnesium seem to have significant possibilities.  Time will tell — but there’s still a major advance to be made.

The individual CNTs in the mix are on average 8 or so walls, 15 nm diameter, and over a micron long (i.e. an aspect ratio of at least 60 and probably in the hundreds).

Learning and search

So we will take it as given, or at least observed in some cases and reasonably likely in general, that AI can, at the current state of the programming art, handle any particular well-specified task, given enough (human) programming effort aimed at that one task.

We can be a bit more specific about what “well-specified” means.  In general, if the task has a static ontology that can be laid out by the programmers, it’s within the scope of current practice.  A huge part of the progress of early AI was in fact simply building up (hand-made) ontologies.  An ontology includes, BTW, not just a list of concept names, but the semantics: code to recognize, predict, simulate, and perform whatever things we’d expect a person to be able to do who we would describe as “understanding” the concept.

The difference between this “static AI” and real human-level intelligence is that people learn new concepts constantly.  We will learn several words a day our entire lives (estimates range from 1 to 10 and of course this depends on individual intelligence and environment). Concepts are constantly changing and growing, splitting and merging, being half-forgotten and rediscovered.

Not only the ability to create new concepts, but the fluidity and adaptability of the ones we already have, enable the robustness of human intelligence.

There’s been a lot less research on how to build concepts than there has been involving the formalization of existing ones in static form.  There’s a bias toward the latter since you get a machine that can do something useful much quicker that way.

However, there has been research in creating new concepts and we can say something about it.  It seems to be the area where the high computational resources make a difference.  The most general approach we have is search, in various forms. Deep Blue invented startling new chess strategies on the fly.  These robots evolved a number of concepts through simulated evolution.

(ps — if you want your research paper to be picked up by the pop-sci news and blogosphere, simply include the words “robot” and “predator” in it :-) )

Going back to Lenat’s AM, it’s been understood that search, in various forms, is capable of the kind of learning we need, but also that it tends to run out of steam sooner rather than later.  In other words, it seems likely that a properly set-up search is capable of inventing a fairly sophisticated concept, but you need another setup for the next one.  It’s generally accepted that some sort of evolutionary search is going on in the brain, but the system that controls it, sets up the search spaces, defines the fitness functions, and so forth, is definitely not well understood.

Thus the key to understanding when and whether general AI can happen lies in the high-level organization that can guide the application of focused search to produce a growing set of concepts that work coherently together.

Steam balloons

The brothers Montgolfier invented the hot air balloon upon the observation that smoke rises, and thus they figured that if they could catch it in a bag, the bag would be pulled upward.

Hot air ballooning is quite popular today; people think of balloons as being quaint and pretty and natural, or at least more natural than airplanes.

Actually, a modern hot-air balloon uses more fuel than an airplane does to fly the same payload for the same time. The reason, of course, is that hot air needs to be hot, but the balloon needs to be light, so that the material needs to be thin, which means in practice that heat is lost through the balloon, and needs to be regenerated by burning fuel.

With nanotech we could make a fabric of diamond sheets for strength, with vacuum for insulation, and thin metallic films (or graphene sheets) to reflect thermal radiation. That means that we could have a balloon that was much lighter than woven nylon, and yet enormously better insulated.

The air in a balloon may be typically heated to around 100C, making it 0.93 kg/m^3 (compared with 1.2 at 20C). Call it roughly 0.25 kg/m^3 lifting capacity.

But if we can insulate it, we could fill it with steam instead (at the same temperature). (Steam would condense on the walls of an uninsulated bag.) Steam at 100C has a density of 0.59 kg/m^3, call it 0.6 kg/m^3 lifting capacity. Since we aren’t losing heat (much), we could superheat it and get some extra lift, say a few tenths of a kg/m^3, but there would likely be a tradeoff with insulation weight, energy rates to cover leakage, etc. Even without it, the balloon lifts its own weight, including the water in the steam (and probably 100 times that of just the balloon).

(Postscript: It’s occasionally assumed that diamond is strong enough to make air-buoyant vacuum-filled balloons. This doesn’t actually work. Hydrogen remains the champ, with a density of 0.09 kg/m^3, which is essentially negligible. But even so, a steam balloon would only have to have about twice the volume of a hydrogen balloon with the same lift, as compared to roughly 4 to 5 times for hot air.)
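For anyone who wants to check the comparison, here is a small sketch using the densities quoted above, with lift taken as the density difference against 20C air and envelope weight ignored. The names are mine. With the unrounded lift figures the hot-air ratio comes out a little over 4.

```python
AIR_20C = 1.2  # kg/m^3, ambient air density used in the post

# Lifting-gas densities in kg/m^3, as quoted in the post
DENSITY = {"hot air": 0.93, "steam": 0.59, "hydrogen": 0.09}


def lift(gas):
    """Gross lift in kg per cubic meter of gas, ignoring envelope weight."""
    return AIR_20C - DENSITY[gas]


def volume_vs_hydrogen(gas):
    """How many times the volume of a hydrogen balloon is needed for equal lift."""
    return lift("hydrogen") / lift(gas)


for gas in DENSITY:
    print(f"{gas}: lift {lift(gas):.2f} kg/m^3, "
          f"{volume_vs_hydrogen(gas):.1f}x hydrogen volume")
```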

Gada Prize update

We’ve had a fair amount of interest in the Kartik M. Gada Humanitarian Innovation Prizes, mostly from RepRap types. They pointed out that we had a slight incompatibility in the specification of the open source requirements with those of the RepRap community itself. We’ve changed the requirements to allow either BSD or GPL.

To make a donation to the Gada Prize fund, click thru to the prize page, click on the “Join Now” button to the right, go to the Donation section, and select “Gada Prize” from the project pulldown.

The Sigil of Scoteia

At the Foresight conference special-interest lunch on IQ tests for AI, Monica Anderson suggested a test involving separating text which had had spaces and punctuation removed, back into words.  As a somewhat whimsical version of the test, I suggested the Sigil of Scoteia:

[Image: the Sigil of Scoteia]

In case you’re unfamiliar with it, it’s the frontispiece of the novel The Cream of the Jest by James Branch Cabell.  Why does the Sigil make a good AI test?

One reason is that it requires a considerably more holonic interpretation process than just separating the text would.  It’s in a handwritten script in an extremely idiosyncratic font — you need to have a good guess what the word is to figure out what the letters are, and vice versa.  Information must flow down as well as up the interpretation stack.  It takes a few minutes to figure out the Sigil; you can read the jammed-up letters version straight off:

|IAMESBRAN
CHCABELLMADETHISB
OOKSOTHATHEWHOWILLSMAY
READTHESTORYOFMANSETERNA
LLYUNSATISFIEDHUNGERINSEAR
CHOFBEAUTY|ETTARRESTAYSINACCE
SSIBLEALWAYSANDHERLOVLINES
SISHISTOLOOKONONLYINHISDRE
AMS|ALLMENSHEMUSTEVADEAT
THELASTANDMANYARETHE
WAYSOFHEREVASION

(stumbling perhaps over the name of one of the characters near the middle).
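For a sense of what the bare word-separation step involves (the easy part, before any question of meaning), here is a minimal sketch of dictionary-based segmentation by dynamic programming. The function name `segment` and the toy vocabulary are hypothetical; a real test would need a full lexicon and some way to cope with names like the one mentioned above, which this sketch does not attempt.

```python
def segment(text, vocab):
    """Split `text` into words from `vocab`, preferring the fewest words.

    Returns a list of lowercase words, or None if no segmentation exists.
    """
    text = text.lower()
    n = len(text)
    max_len = max(len(w) for w in vocab)
    # best[i] holds the best segmentation found for text[:i], or None.
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            piece = text[j:i]
            if best[j] is not None and piece in vocab:
                candidate = best[j] + [piece]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return best[n]


# Toy vocabulary (an assumption for illustration), with a few distractors.
VOCAB = {
    "all", "men", "she", "must", "evade", "at", "the", "last", "and",
    "many", "are", "ways", "of", "her", "evasion",
    "a", "he", "man", "here",
}

if __name__ == "__main__":
    jammed = "ALLMENSHEMUSTEVADEATTHELASTANDMANYARETHEWAYSOFHEREVASION"
    print(" ".join(segment(jammed, VOCAB)))
```

Note that information here flows only one way, from known letters up to words. That is exactly what the handwritten Sigil does not allow, since there the letters themselves are uncertain until you have a guess at the word.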

But then, once you have the words, you’ve only gotten started.  The test isn’t “separate this into words” — it’s “what does this mean?”

You could work out the words but not be able to explain them in context. You might be able to tell what the Sigil was physically in the book but be completely clueless as to its emotional meaning.  I claim that the question “what does this mean?” has different answers at every point across the diahuman range of intelligence.  Cabell was an unexcelled master of cryptic, poetic, romantic fantasy, based on a very thorough knowledge of mythology and keen insight into human nature.  Think of him as Tolkien multiplied by James Joyce, filtered through the light touch of Wodehouse.

Thus actually to “get it” with Cabell, you have to be able to understand things on lots of different levels at once.

I often feel smug about the dumbness of people, in the sense that surely human-level AI must be a low bar: how hard could it be to beat that, whatever “that” happened to be?  But there are other times when, contemplating people like Cabell, I feel like giving up; it’s just too hard.

If you can work out the words in the Sigil, you’re at the hypo/diahuman border.  If you can write a book like Cream of the Jest, you’re at the dia/epihuman border.

AI: how close are we?

In the terminology I introduced in Beyond AI, all the AI we have right now is distinctly hypohuman:

The overall question we are considering, is AI possible, can be summed up essentially as “is diahuman AI possible?”  The range of things humans can do, done as flexibly as humans can do them, and learned the way humans learn them, is as reasonable a definition of intelligence as any. This is reflected in the “Wozniak Test” and the “Nilsson Test”, i.e. the ability to do human jobs.  (If nothing else, this obviates at least one other question, namely, at what point will AI have a major economic impact?)

The problem is, people have been claiming that their robots could do things like the Woz test for quite some time:

Robo-maid from 1930

From the marvelous Paleofuture blog, an advert for a robot maid in 1930!  (Not exactly, read the blog)

Today, these things are getting closer to reality:

Mahru-Z Korean robot maid

which is a lot closer to reality than the previous one — there’s a $3M/yr project behind it at the Korea Institute of Science and Technology.

Even so, I doubt that Mahru-Z or Willow Garage’s PR2 or any other existing robot could come close to passing the Woz test, much less the full Nilsson Test.  On the other hand, I think it’s pretty clear that over the past couple of decades there has been a very strong advance in robotic capabilities and, IMHO, it bids fair to make robots usable in another decade and skillful in another one after that.

How about thinking and learning?  This is really the crux of the issue; the Woz test simply sums up the necessary complexity and adaptability in a simple description.  Nobody is putting the processing power necessary to do serious AI into mobile robots.  What the robot example shows is that for specific skills, the state of the art in programming is pretty close to being able to program what a typical person could learn.

The structure of intelligence can be broken down into a set of skills, ranging from pouring coffee to doing integration by parts; meta-skills such as recognizing which skills are appropriate when, and planning with them; the ability to learn new skills, including meta-skills, both from imitation and by inventing them.  (Skills of course include recognizing and understanding things as well as doing things.)

Face Detection and Pose Estimation

Note that we’re well into the useful range if the AI can only learn by imitation or being taught, and never does anything particularly creative or original.  So for the lowest level of AI all we need is to program up all the basic skills we need and the ontologies — datastructures for knowledge representation — for the AI to learn some kinds of new things, or at least be reasonably adaptable.  It would clearly have a built-in “glass ceiling” over what kinds of thing it could learn, but then so do quite a few people.

One fairly good overview of the kinds of skills and meta-skills can be programmed with current techniques is the leading textbook, Russell and Norvig’s Artificial Intelligence: A Modern Approach. Just look thru the table of contents… If this thousand-page epic tome is light in any area, it would be the problems of inferring formalizations from unstructured data — but there’s a lot of work on that in the real world pursuits like data mining where people are trying to take advantage of the treasure trove represented by the internet.

Bottom line: I think we have the techniques now to build an AI at the hypo/dia border, equivalent to a dull but functional human.  It would have to run on a smallish supercomputer — say one rack full of servers stuffed with GPGPUs.  The problem is that it would take a huge, coordinated project to implement all the techniques and skills that are understood into a single integrated system, and AI in practice is a cottage industry.  Right now that’s not economically feasible, given the cost vs the economic value of one more dull human.  But those things will shift during the coming decade — the hardware will get cheaper, the software more sophisticated, and quite possibly by 2020 the economics will look different.  Then and only then will AI really take off.

A brief history of AI

  • 40s: Cybernetics, the notion the brain did logic in circuits, feedback
  • 50s: the computer, stored programs, Logic Theorist
  • 60s: LISP, semantic nets, GOFAI
  • 70s: SHRDLU, AM
  • 80s: AI winter, expert systems, neural nets
  • 90s: robots, machine learning
  • 00s: DARPA grand challenge level of competence

The main point of this post is to answer any objections of the form: you’ve been working on this so long, why don’t you have it yet?  (Or perhaps, AI is the technology of the future and always will be. :-) )

One key thing to note is that cybernetics was the original line of inquiry that was going to let us understand how the brain worked and allow us to build smart machines.  Many people assume that cybernetics failed, since it more or less disappeared as a discipline.  In fact it produced some key and useful insights, which formed the basis of control theory and neuroscience.  It fell apart due to personalities in its cadre (a veritable soap opera between Wiener and McCulloch and Pitts involving Wiener’s daughter) and political disfavor in the US involving Wiener’s antiauthoritarian stances.

So GOFAI was born with a built-in bias against some of the insights of cybernetics.  That has now been repaired; it was forced by the reintegration of control theory and the growing use of knowledge from neuroscience in the 90s, when AI robotics began to get serious.  There are reasons AI floundered in the 80s, and that’s one — another is a diversion from basic research to applications before it was really ready.

Another point that is rarely made is that AI, the small sub-discipline of CS, isn’t the major part of the 20th-century work that will have led to intelligent machines.  The major part is the invention of the computer itself, along with all the work that’s been done to bring us the processing power we need to do the job, and the software to manage it and the complexity of human-comparable systems.  And nobody could reasonably claim that that effort has been standing still, or has come to nothing, or anything even vaguely similar.

An AI will be a hardware/software network and system so complex and powerful that it will make the entire ARPANET of the 70s look like a toy — and it will have to manage its own internals completely automatically.  I personally think that it will need the internal robustness that can only come from incorporating feedback and automatic resource management into the basic fabric of its computing platform.  But that’s the kind of thing that can easily be done in a decade, once someone decides to do it.  And it will be useful for a lot of other applications as well!

Is AI really possible?

I’m about to start a series of posts on the topic of why I think AI is actually possible.  I realize that most of the readers here probably don’t need too much convincing on that subject, but you’d be surprised how many very smart people, many of them professors of computer science, are skeptical to some extent or another on that point.

To start off, though, I’m just soliciting comments on the subject to try and get some feel for where the readership is on the subject, and what are the issues anyone feels are important to the argument.

Start your comment off with an indication of when you think we’ll have human-level AI, and go from there:

  • in the next decade
  • in the 20s
  • 2030-2050
  • 2050-2100
  • thereafter
  • never

Feynman anniversary event to be held at University of South Carolina

h/t Nanowerk

In February 1960, the Caltech magazine Engineering & Science published Feynman’s “Plenty of Room”, and it has been re-published ten times since then. This has become one of the best-known papers in the history of nanotechnology.
The fiftieth anniversary of the initial publication of “Plenty of Room” presents us with an opportunity to reflect upon Richard Feynman’s legacy in nanotechnology. The University of South Carolina will convene a symposium to consider the talk, the man, and the field of nanotechnology during the past fifty years. The Symposium takes place at the University of South Carolina on Friday and Saturday, 12 and 13 February 2010.
A full program (in PDF format) is available.
Registration fee: $25; no charge for USC faculty, staff or students.

Note this USC is South Carolina, not Southern California :-)

Keeping computers from ending science’s reproducibility

From Ars Technica: Nobel Intent, a thought-provoking article on what the prevalence of computational science portends for reproducibility in science:

Victoria Stodden is currently at Yale Law School, and she gave a short talk at the recent Science Online meeting in which she discussed the legal aspects of ensuring that the code behind computational tools is accessible enough for reproducibility. The obvious answer is some sort of Creative Commons or open source license, and Stodden is exploring the legal possibilities in that regard. But she makes a forceful argument that some form of code sharing will be essential.

“You need the code to see what was done,” she told Ars. “The myriad computational steps taken to achieve the results are essentially unguessable—parameter settings, function invocation sequences—so the standard for revealing it needs to be raised to that of when the science was, say, lab-based experiment.” This sort of openness is also in keeping with the scientific standards for sharing of more traditional materials and results. “It adheres to the scientific norm of transparency but also to the core practice of building on each other’s work in scientific research,” she said. But the same worries that apply to more traditional data sharing—researchers may have a competitor use that data to publish first—also apply here. In the slides from her talk, she notes that a survey she conducted of computational scientists indicates that many are concerned about attribution and the potential loss of publications in addition to legal issues. (The biggest worry is the effort involved to clean up and document existing code.)

Still, this sort of disclosure, as with other open source software, should provide a key benefit: more interested parties able to evaluate and improve the code. “Not only will we clearly publish better science, but redesigned and updated code bases will be valuable scientific contributions,” Stodden said. “Over time, we won’t stagnate forever on one set of published code.”

via Ars Technica: Nobel Intent: Keeping computers from ending science’s reproducibility.

My slides from Foresight2010

Roadmaps to Nanotech and AGI

slides are here

Josh

[note -- we know about the permission problem, trying to get it fixed][should be fixed now]

“Lies don’t work as well as they used to…”

Glenn Reynolds, a past Foresight Director, writes some analysis of the recent special election in Mass.:

Of course, what the GOP apparat does is less important nowadays than it was. As I noted before, there’s a whole lot of disintermediation going on here — Scott Brown got money and volunteers via the Internet and the Tea Party movement, to a much greater degree than he got them from the RNC. Smart candidates will realize that, too.

And lies don’t work as well as they used to. Obama promised transparency and pragmatic good government, but delivered closed-door meetings and outrageous special-interest payoffs. This made people angry. If Republicans promise honesty and less-intrusive government, but go back to their old ways, the likelihood that the Tea Party will become a full-fledged third party is much greater. …

via Instapundit » Blog Archive » SO, BROWN WON. This is big news; while the White House is still in the healthcare bunker, things li….

We don’t deal with politics here but we are concerned with technological developments that improve social decision-making and governance.  The internet has clearly been such a technology.  As one very tiny part of the generation of computer scientists that built it, I will happily accept the plaudits of a grateful world in their behalf …

Of course, the Internet could be improved as a fact-finding device, and ought to be, as Eric Drexler notes:

We could benefit immensely from a medium that is as good at representing factual controversies as Wikipedia is at representing factual consensus.

What I mean by this is a social software system and community much like Wikipedia — perhaps an organic offshoot — that would operate to draw forth and present what is, roughly speaking, the best evidence on each side of a factual controversy. To function well would require a core community that shares many of the Wikipedia norms, but would invite advocates to present a far-from-neutral point of view. In an effective system of this sort, competitive pressures would drive competent advocates to participate, and incentives and constraints inherent in the dynamics and structure of the medium would drive advocates to pit their best arguments head-to-head and point-by-point against the other side’s best arguments. Ignoring or caricaturing opposing arguments simply wouldn’t work, and unsupported arguments would become more recognizable.

Success in such an innovation would provide a single place to look for the best arguments that support a point in a debate, and with these, the best counter-arguments — a single place where the absence of a good argument would be good reason to think that none exists.

Last day of free webcast of Foresight Conference on nanotech & AI

Today is the last day of the free webcast of the 2010 Foresight Conference being held now in Palo Alto.

The bandwidth coming out of the Sheraton is marginal, so the video may be low-res, but we will be posting high-res videos later, funds permitting (feel free to assist with this goal!).

You can also follow the conference on Twitter at #Foresight2010, and send in your questions in real time to the speakers that way.

Wish you all could be here with us today!  —Chris Peterson

This weekend: free webcast of Foresight Conference

There’s still time to register, but if you just can’t participate in person this year, check out the free webcast of the Foresight Conference being held this weekend in Palo Alto.

The bandwidth coming out of the Sheraton is marginal, so the video will be low-res, but we will be posting high-res videos later, funds permitting (feel free to assist with this goal!).

Unfortunately the Senior Associate Reception debate between Robin Hanson and Mencius Moldbug on futarchy will not be webcast.

You can also follow the conference on Twitter at #Foresight2010, and try sending in your questions in real time to the speakers that way.

Wish you all could be here with us this weekend!  —Chris Peterson

Is gravity an entropic spring?

Two nanoparticles connected by a polymer will tend to be drawn together at finite temperatures (though not at absolute zero) because as the polymer chain explores the states available to it, there are many more tangled and balled up ones than stretched-out straight ones — even though there is no overt force pulling the chain to any particular tangled state.  Such a situation is called an entropic spring, and behaviors like this are some of the more interesting aspects of physics at the microscale.
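The state-counting argument can be made concrete with a toy model. The sketch below treats the chain as a one-dimensional walk of unit links, each pointing left or right; this simplification, and the function names, are assumptions for illustration rather than a model of any real polymer. Counting configurations for each end-to-end extension shows how overwhelmingly the balled-up states outnumber the stretched ones.

```python
from math import comb, log


def count_states(n_links, extension):
    """Number of 1D chain configurations with the given end-to-end extension.

    Each of the n_links unit links points either +1 or -1; `extension`
    is the net displacement between the two ends of the chain.
    """
    if abs(extension) > n_links or (n_links + extension) % 2:
        return 0
    return comb(n_links, (n_links + extension) // 2)


def entropy(n_links, extension):
    """Configurational entropy in units of Boltzmann's constant: ln(states)."""
    return log(count_states(n_links, extension))


if __name__ == "__main__":
    n = 20
    for x in range(0, n + 1, 4):
        print(f"extension {x:2d}: {count_states(n, x):6d} states, "
              f"S/k = {entropy(n, x):.2f}")
```

At finite temperature the free energy includes a term of minus temperature times entropy, so the falling entropy at large extension acts like a restoring force even though no link energetically prefers any direction. At absolute zero that term vanishes, which matches the caveat above.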

An arXiv paper by physicist Erik P. Verlinde purports to show that gravitational effects have the same mathematical logic behind them (in a very broad analogical sense), arising from the holographic universe description of physics (a far-out variant of string theory).  Now I don’t come close to having the physics to evaluate the theory, but Verlinde appears to be a respectable physicist.  Czech physicist Luboš Motl blogged about it:

So I remain undecided whether or not there is a sharp insight waiting along the lines of Verlinde’s paper.

and then allowed Verlinde to guest-post a long explanatory comment.

The derivation of the Einstein equations (and of Newton’s law in the earlier sections) follows very similar reasonings that exist in the literature, in particular Jacobson’s. The connection with entropy and thermodynamics is made also there. But in those previous works it is not clear WHY gravity has anything to do with entropy. No explanation for this apparent connection between gravity and entropy has been given anywhere in the literature. I mean not the precise details, even the reason why there should be such a connection in the first place was not understood.

My paper is the first that gives a reason why. Inertia, and hence motion, is due to an entropic force when space is emergent. This is new, and the essential point. This means one HAS TO keep track of the amount of information. Differences in this amount of information is precisely what makes one frame an inertial frame, and another a non-inertial frame. Information causes motion.
This can be derived without assuming Newtonian mechanics.

“Space is emergent”???  Yep, in the holographic theory, 3D space is an emergent phenomenon of a 2D information pattern (see the link above).  Weird stuff, but no weirder than other forms of string theory.

As mentioned, I don’t claim to follow this at the technical level, but given how important the math of entropy is at the microscale, it’s fun to speculate about its being important at the most macro of macroscales as well.