Archive for the ‘computing’ Category

Statistical machine translation in New Scientist

Wednesday, February 23rd, 2005

New Scientist reports on statistical machine translation and the commercialization being done by Language Weaver.

Wolfram: free will as computational irreducibility

Sunday, January 23rd, 2005

Stephen Wolfram (Wikipedia article) is the child prodigy who went on to invent Mathematica, the ubiquitous software package for mathematical analysis. It’s now been three years since the publication of his A New Kind of Science (Wikipedia article) to much fanfare. The book’s main thesis is that complexity can emerge from extremely simple models, of the type that can be embodied in computer programs. He claims:

My purpose in this book is to initiate a transformation in science…making it possible to make progress on a remarkable range of fundamental issues that have never successfully been addressed by any of the existing sciences before.

The book is nearly 1200 pages of dense mathematics, diagrams, and discussion. The notes alone are over 300 pages, and the book is not cheap, so I’m not recommending people read it, but it is nonetheless thought-provoking, regardless of whether you accept his grandiose claims, which many people do not. For one thing, it’s never clear whether he’s claiming that his models might generate behavior which resembles the real world, or that they are the models governing the real world.

At one level, this book is a work of philosophy. So how does Wolfram approach the hoary old philosophical problem of free will? For him, free will is related to “computational irreducibility”, one of his key concepts, which basically means that there are some types of computation which don’t allow shortcuts. Such phenomena permit no predictions about what is going to happen until it actually does. There is no future until the universe has finished computing it.
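To make “no shortcuts” concrete, here is a minimal sketch of an elementary cellular automaton of the kind the book studies, using Rule 30 as the example. The function names, the grid width, and the wraparound edges are my own assumptions for illustration, not anything taken from the book. As far as anyone knows, the only way to find out what row 1,000 looks like is to compute all the rows before it.

```python
def step(cells, rule=30):
    """Apply one update of an elementary cellular automaton (wraparound edges)."""
    n = len(cells)
    return [
        # The three-cell neighborhood forms a number 0..7; the matching bit of
        # the rule number gives the cell's next state.
        (rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(width=31, steps=15, rule=30):
    """Start from a single live cell and return every row, including the first."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


if __name__ == "__main__":
    for row in run():
        print("".join("#" if c else "." for c in row))
```

Running it prints a triangle whose left edge is regular and whose interior looks random, all from an eight-entry lookup table.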

Wolfram says, “I believe that it is this kind of intrinsic process [complex, unpredictable behavior generated by simple rules] that is primarily responsible for the apparent freedom in the operation of our brains.” A novel definition of “freedom”: “free of obvious laws”, “freedom from predictability”.

In a word, Wolfram believes that free will vs. determinism is a false dichotomy. The world proceeds deterministically, but appears to be (is?) imbued with “freedom” due to its unpredictability.

(Students of language may find it interesting that for this book Wolfram invented a distinct new style of writing which he claims is specifically suited to its material. That style involves starting a large percentage of his sentences with conjunctions: “And” (to show a connected thought), “But” (to show a contrasting thought), or “For” (to show background or reason). He notes that this helps break up extremely long sentences. After a few hundred pages, however, this style becomes extremely irritating.)

Computational models of neurotheology

Wednesday, January 12th, 2005

When we talk about computational models of neurotheology, what do we mean?

What first springs to mind is modeling an individual brain, or more likely a brain/body system, to capture the biological processes associated with a religious experience. Modeling transcendence, if you will. But could we tell that what is being simulated is in fact a religious experience? Humans know they are having or have had religious experiences by being, at some level, conscious of them. But we can hardly build an entire mechanism of consciousness into our computer model. And even “pure” transcendent religious experiences have historical and social backgrounds, or, to put it another way, occur within the context of certain memories, which even Blue Brain could not model. All in all, a tough problem.

More tractable would be to integrate a coarse statistical model of individual religious experience with a sociological model. In other words, we would model religious experiences, large and small, but at the population level. Some percentage of religious experiences are at the breakthrough level that can jumpstart an entire new religion, whereas others might suffice to rejuvenate or sustain a religion, if experienced by enough adherents.

Once a religion has started, we would apply sociological modeling techniques to its spread and decline as it settles, as religions inevitably do, into a system of doctrines and ceremonies, leavened by periodic awakenings that inject new energy into the religion for some period of time.

The model involves two distinct categories of data. The first relates to the statistical frequency, intensity, and types of human religious experience. I’m not aware of any data on that topic. Our goal would be to derive hypotheses for those values, hopefully ones that could be cross-validated, either by working backward through the model from the sociological data mentioned below, or by running multiple scenarios to find one or more that are consistent with the sociological data.

The sociological data I am referring to, which should be relatively easy to capture, is primarily the distribution of sizes of religious groups over time, as well as other peripheral data such as conversion rates.
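To give a feel for what “running multiple scenarios” might look like, here is a toy sketch of the population-level idea. Every name and number in it is a hypothetical placeholder of mine (the experience rate, the breakthrough rate, the recruits per experience, the attrition rate); as noted above, I know of no data for any of them. Rare breakthrough experiences among the unaffiliated found new groups, ordinary experiences inside a group bring in recruits, and every group steadily loses members.

```python
import random


def count_hits(rng, n, p):
    """Number of successes in n independent trials with probability p."""
    return sum(rng.random() < p for _ in range(n))


def simulate(population=2000, years=100, p_experience=0.02,
             p_breakthrough=0.01, recruits=2, attrition=0.03, seed=0):
    """Return one snapshot per year: group sizes, largest first.

    All parameter values are illustrative assumptions, not measurements.
    """
    rng = random.Random(seed)
    groups = []
    history = []
    for _ in range(years):
        # Attrition: each member independently drifts away with some probability.
        groups = [size - count_hits(rng, size, attrition) for size in groups]
        groups = [size for size in groups if size > 0]
        unaffiliated = population - sum(groups)

        # Ordinary experiences inside a group sustain it by drawing in recruits.
        for i, size in enumerate(groups):
            experiences = count_hits(rng, size, p_experience)
            gained = min(experiences * recruits, unaffiliated)
            groups[i] += gained
            unaffiliated -= gained

        # Rare breakthrough experiences among the unaffiliated found new groups.
        founders = count_hits(rng, unaffiliated, p_experience * p_breakthrough)
        groups.extend([1] * founders)

        history.append(sorted(groups, reverse=True))
    return history


if __name__ == "__main__":
    final = simulate()[-1]
    print("groups after 100 years:", len(final), "largest:", final[:5])
```

Cross-validation would then mean sweeping the parameters and keeping the combinations whose size distributions over time resemble the real sociological record.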

A flavor of the sociological side of the model can be gained from Simulating the Emergence of New Religious Movements, a paper which crudely models the formation and growth of religions. I can’t agree with the premise that NRM (new religious movement) founders are “rational agents who obtain various social advantages such as reputation enhancement and increased respect from other utility maximizing rational agents who buy their solutions”, but the seeds of one half of the model I propose—the sociological side—are there.

I hereby name this particular approach computational socioneurotheology™.

Blue Brain: modelling the neocortex at the cellular level

Monday, January 10th, 2005

First Deep Blue. Then Blue Gene. Now, Blue Brain.

Solve chess, next tackle the wonders of the gene, then unravel the mysteries of the brain for an encore?

Sort of. Deep Blue is now in a museum, an ultimately unsatisfying technological tour de force that accomplished little more than demonstrating that the complexity parameters of chess put it within reach of your average supercomputer.

And Blue Gene is not really designed to do anything with genes, although it’s been used to do molecular simulations. It’s cleverly named to create the image of a family of supercomputing projects, but in fact has nothing to do with Deep Blue, and at heart is a massive science fair project to see how many teraflops you get when you string together 32K nodes.

Blue Brain is the catchy name of the latest project, a partnership with a Swiss university (EPFL) to use Blue Gene to model the human brain.

Although this project has been widely reported, most of the commentary has been at the level of calling the project a “virtual brain”, claiming for example:

the hope is that the virtual brain will help shed light on some aspects of human cognition, such as perception, memory and perhaps even consciousness.

Wow, a thinking computer that’s also conscious.

But readers of Numenware will want to understand the research plan in a bit more detail. The first project is a cellular-level model of a neocortical column. They’ll simulate 10,000 neurons and 100 million synapses (yes, there really are that many synapses on 10,000 neurons). They’re going to use 8,000 nodes, so it would seem obvious to have one node per neuron, but that doesn’t appear to be the approach. They say the simulation will run in “real time”, but shouldn’t it be able to go faster? Of course they’ll have snazzy visualization systems. Hey, can I go for a “walk” among your neocortical columns?
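The arithmetic behind those figures is worth a quick check. The sketch below uses only the numbers quoted above; the variable names are mine.

```python
neurons = 10_000
synapses = 100_000_000
nodes = 8_000

# Average number of synapses attached to each simulated neuron.
synapses_per_neuron = synapses // neurons

# How many neurons each node would carry if the work were split evenly.
neurons_per_node = neurons / nodes

print(synapses_per_neuron)  # 10000
print(neurons_per_node)     # 1.25
```

So each neuron averages ten thousand synapses, and an even split would put 1.25 neurons on each node, which may be one reason a strict one-node-per-neuron layout isn’t the plan.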

From there, the researchers hope to go down, and up. They’ll go down to the molecular level, and up to the level of the entire neocortex. To do the latter will require a simplified model of the neocortical columns, which they hope to be able to derive from the first project. They’ll eventually move on to the subcortical parts of the brain and, before you know it, your very own virtual brain.

It’s undoubtedly true that this is “one of the most ambitious research initiatives ever undertaken in the field of neuroscience,” in the words of EPFL’s Henry Markram, director of the project. But I wonder if the kind of knowledge we gather about brain functioning from this project will be the same kind of knowledge we gathered about chess from Deep Blue.

Markram has a very micro focus. For instance, he has sliced up thousands of rat brains and stained them and stimulated them and cataloged them. And this whole project has the same intensely micro focus. That’s extremely valuable, but it’s like building a supercomputer simulation of how gasoline ignites in order to understand how a car runs, when we don’t even understand the roles of the carburetor and fuel pump and combustion chamber, to borrow an overused analogy.

For instance, I’m sure Blue Brain will cast light on the mechanisms underlying memory, but when these guys say “memory” they mean synaptic plasticity. What I want to know is how I remember my beloved Shiba-ken, Wanda, who was hit and killed by a car in Kamakura.

It seems to me we don’t need supercomputers to model the brain, although I’m sure they’ll be useful; we need concepts to model. The actual model could be no bigger than Jay Forrester’s ground-breaking system dynamics model of the world’s socioeconomic system. The problem is not the technology for modeling; it’s what we model.

The same goes for neurotheology. We desperately need a computer model, but before that—we need a theory.

Putting Google in charge of your TCP/IP stacks

Wednesday, October 20th, 2004

I just noticed that installing Google Desktop injected Google code into my TCP/IP stacks. I love Google Desktop, but do we really want Google controlling the TCP/IP on our computers?

Performant

Saturday, October 9th, 2004

Is “performant” a word? I came across it in an article about Microsoft Visual Studio .NET 2005:

C++ is the easiest language to use for native interop and is often the most performant.

Why Microsoft creates buggy software inefficiently

Monday, August 2nd, 2004

We’ve all heard that Microsoft software is buggy. Bill G. says this is just because so many people use it they find all the bugs and so they get more attention, plus there are those who just enjoy pissing on MS. But I just had the experience, for the first time in a decade, of programming for Windows. The bottom line is that the entire environment is so buggy, flaky, and poorly documented that it’s a miracle anyone can write a program for Windows which runs at all.

Bill says that the open-source model can’t work because there’s no economic incentive to produce solid software or support it. What I realized is that he has it exactly backwards. The MS software development model can’t produce good software because it’s corrupted by the need to get the next version of the product out the door. Good software periodically and suddenly requires being rearchitected, or “refactored”, as they say, at inopportune times from the standpoint of the CFO. With the MS model, there is never a reason to take the time to go back and build the software right.

The elegance of the MS architecture has gone steadily down since Windows 3.1, which although kind of funky, at least had a weird predictability and consistency about it. Now, there are layers upon layers of additional libraries and wrappers on top of it, each documented more poorly than the last. The only way to use the MS docs is to search them using Google, and even then it is all too often the case that you just can’t find what you are looking for, or worse, it’s wrong.

Programming in the MS world uses the approach I call “throwing mud at the wall”. Basically, you throw mud at the wall and see what sticks. You can never figure out the best way or the right way to do something in advance so you just try all the ways and use the first one that works. It’s like playing “pin the tail on the donkey”.

A good architecture has the characteristic that it surprises you from time to time with the cool things you can do easily because of its superior design. I’ve yet to come across a single thing in the entire Windows architecture that gave me this feeling.

Just a couple of quick examples, all from the browser extension world, which is what I was working on.

  1. The MS docs refer to an API to manipulate your browser’s history list. They give the name of a header file you use to access the API. But this header file exists nowhere in the world, except in a non-MS version that you can find on the net that was reverse-engineered by some poor sap who had no choice.
  2. A key ATL library used for internet access is just missing the wide-character version of the interface needed to read a web page off the net, making the entire app fail to link. I finally found this info in the MS knowledge base, with no work-around given, of course.
  3. Using the technique suggested to take the user to a particular web page after running the installer—namely a one-line VB application—polluted the entire installer with “.NET-ness”, which persisted even when I removed the VB app, requiring me to rebuild the installer component from scratch.
  4. The HTML DOM API provides a W3C-compatible “text range” object to represent ranges of text within a web page. But it is so buggy that something as basic as moving an endpoint of a text range forward or backward by one character doesn’t even work. And operations on a text range corrupt the DOM.
  5. The DOM built by IE is not even well-formed to the extent it sometimes cannot be walked from start to end.
  6. The API uses multiple interfaces for the same thing—four different interfaces for windows, for example—and of course the documentation is structured so you can never find anything unless you knew what you were looking for before you started.

And so on, multiplied by ten or a hundred.

What’s surprising, then, is not that Microsoft’s software has as many bugs as it does but that it doesn’t have many, many more; not that their software is often late, sometimes by years, but that it gets released at all. And I suspect that the high levels of profitability deriving from Microsoft’s near-monopoly in many markets are hiding the fact that it is well behind even other commercial software companies in development productivity because of the abysmal state of its architecture.

VI geeks on the Gmail team

Wednesday, June 2nd, 2004

Looks like Google let their VI geeks loose on the Gmail product. Not only do they have the VI direction keys like “j” and “k” implemented, they’ve even implemented two-letter VI-like combinations, such as “g” “i” for “go” “inbox”.
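Handling two-letter combinations alongside single keys comes down to remembering a pending prefix. Here is a minimal sketch of that idea; the class, the action names, and the binding table are hypothetical (only the keys “j”, “k”, and “g” “i” come from Gmail), and this is certainly not Google’s actual code.

```python
class KeyDispatcher:
    """Turn a stream of key presses into actions, supporting multi-key sequences."""

    def __init__(self, bindings):
        self.bindings = bindings   # maps tuples of keys to action names
        self.pending = ()          # keys typed so far that may begin a sequence

    def press(self, key):
        """Feed one key; return an action name if a binding completes, else None."""
        candidate = self.pending + (key,)
        if candidate in self.bindings:
            self.pending = ()
            return self.bindings[candidate]
        if any(seq[:len(candidate)] == candidate for seq in self.bindings):
            self.pending = candidate   # a longer binding may still follow
            return None
        # Dead end: drop the prefix and, if there was one, retry this key alone.
        self.pending = ()
        if len(candidate) > 1:
            return self.press(key)
        return None


# Hypothetical binding table using vi conventions: j moves down, k moves up.
BINDINGS = {
    ("j",): "move_down",
    ("k",): "move_up",
    ("g", "i"): "go_inbox",
}
```

Feeding it “g” returns nothing, and a following “i” returns `go_inbox`; a “g” followed by “k” abandons the prefix and falls back to `move_up`.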

Will this finally be the impetus for people to realize that the keyboard is a useful way to navigate web pages? (Today I saw a guy working on a computer who evidently didn’t know that the tab key would get him from field to field in the form he was entering.) Unfortunately, Google has implemented keyboard control with a bizarre-o ActiveX control, which prevents you from seeing the source of the page, scripting it, or anything else, not to mention being highly browser-specific.

Hijacking IE’s user style sheet

Friday, April 23rd, 2004

A new, daring hijacking exploit was carried out on my computer—taking over my user style sheet.

The user style sheet is something used by the browser which most people haven’t ever heard of. It’s mainly an accessibility feature—you could set your own CSS styles to display everything in large type, for instance. (In IE, you set it with Tools | Internet Options… | General | Accessibility… | User style sheet.)

The devious thing about this exploit was that the user style sheet the malware stuck on my computer contained CSS property values computed using Microsoft’s proprietary expression feature for dynamically computing property values. Specifically, within an expression giving the value of some attribute for the BODY tag, it was looking up certain keywords within the META tag, and if it found them created a pop-up window which took over the entire screen!

I hear that the next release of XP has anti-malware features. It certainly seems like a no-brainer to disable the expression feature in user style sheets, to not allow pop-ups to be created from within CSS expressions anywhere, or, most basically, to not allow any changes to the user style sheet without the user’s express permission.

I guess this exploit is actually not that new. This article about it dates back to summer 2003.

Review of Eric Baum’s “What is Thought”

Sunday, March 21st, 2004

In What is Thought, Eric B. Baum claims that thinking is like a computer program running. Or, that humans are like computer programs. Or something like that, or maybe not, since it’s impossible to tell what he is really saying, or what he thinks “thought” is to start with.

One real good argument Baum has is that many computer scientists think the mind is like a computer. Well, I have a gardener who thinks that the mind is like a garden. Actually, sometimes he thinks it’s like a rake, he tells me.

Baum starts off saying that Turing’s model of computing was intended to capture the essence of thought, so that proves that humans are like computers, since Turing is God. Except that even Baum later admits that Turing was at best trying to model human theorem-proving behavior; certainly “thought” is more than just that.

I’m going to come back to some specific topical comments I have about this book, but first of all I just wanted to mention that reading this book I was seriously scared that my brain was going to rot away. That’s why I actually didn’t read the whole book. I worried that if I did the damage might have been too much to undo. For instance, take the following sentence, right in the first section where he’s laying out his basic ideas: “The execution of a computer program is always equivalent to pure syntax.” This isn’t merely stupid; it goes beyond that to just being completely meaningless. “Mind typically produces a computer program capable of behaving.” Huh??? Is mind producing the program, or is it the program, or what? “The mind exploits its understanding of the world in order to reason.” Except, apparently, in the writing of this book. “Mind is essentially inherent in the DNA, in some sense.” Yes, in a sense that we will never understand from this gibberish.

Baum’s writing gives new meaning to the word “circular”. He asserts that the mind has “subroutines”, and then that proves that it’s like a computer program. I guess he’s a few decades behind in his computer science, or else he would have said the mind is “object-oriented.” “Awareness is awareness of meaningful quantities.”

In a book like this, the author of course could not omit a discussion of neural networks. Baum thinks that neural networks are a “model of brain circuits”, which by the way is wrong: they’re a computing model vaguely inspired by brain physiology. He’s right when he says the collection of weights generated by training a neural network is in general completely opaque; humans cannot figure out how the neural network works. Nothing in a trained neural network corresponds to a “human” understanding of the problem the net has learned to solve. So if the brain is a neural network, how does this correspond to the “semantics” he talks about? If the mind is composed of subroutines, and evolution is a neural network training process, how does a neural network generate subroutines? More critically, a neural network has its initial topology defined by a human; is he saying that evolution can also evolve the appropriate network structure? He says “it is impossible to…evolve…code unless it is modular.” But trained neural networks are precisely not modular.

“Neural circuitry is akin to an executable. The DNA is more like the source code.” A cute analogy, which might work real well in the term paper the sophomore at MIT who took philosophy for his one required humanities course had to write. But what does this mean? Is the DNA what is being created by evolution? In that case, what is the equivalent of the compiler?

Baum doesn’t do much better with basic philosophy. He asks the big question: What are objects? Are they just in your mind or is there an outside physical reality to them? He then imagines he is somehow addressing this question by jumping to the question of how we know a cup is something to hold by its handle and then drink a liquid from. Sorry, Eric, saying that “the mind is an evolved program” (using a subroutine for the cup problem, of course) does not answer any questions about the nature of reality.

Baum goes on to talk about the process of individual humans learning as being the acquisition of new subroutines. This is weird. We have some built-in subroutines coming from our DNA or something and then we learn new subroutines? Are these subroutines we learn encoded in the same “language” as the ones coming from our DNA? How do they interface with them?

Now we take a big jump, to an agent-based model. There are lots of little agents running around each with their own agendas and utility functions. This model is sort of proved to be right by the fact that it’s also a model which can describe market economies. Taking a sudden right-wing detour, Baum posits that the agents work so well because they have “property rights” and try to conserve money. The agents compete and cooperate. But who set up the system within which these agents (which are also subroutines, I suppose) operate?

“Evolution has learned to search in semantically meaningful directions.” So now we have not only a learning process embodied within evolution, but a meta-learning process governing the process of evolution itself. Evolution evolves!

I’m a go player, and well-versed in the issues facing computer go. So I was particularly interested in Baum’s thoughts on this topic. I found them to be shallow, poorly informed, and lacking insight. Besides getting basic information wrong, such as who developed what program (always a bad sign), he offers tautologies such as “Go masters play remarkably strong Go.” First, he says that we have a pre-evolved “program” for “causal” analysis. Then, he says we have a large number of “computational modules…that may very well be directly coded into the human genome”, including topological concepts like “connected”, “surrounding”, and “inside”. Leaving aside the problem that these supposed pre-programmed modules have no connection with “causality”, his fundamental point is, frankly, absurd: that human strength at go comes from having these modules wired deep into our DNA, whereas computer programs have to calculate the same concepts in a “computationally expensive way”. If Baum cannot come to a more sophisticated understanding of the complexity of go, he should not write about it at all.

I’m having a very hard time understanding why people who should know better, like Nathan Myhrvold, the former Microsoft executive, would put their names on the back of this book.

Of course, we also need a theory of language. Baum has the answers here as well. Language is just “attaching labels to computational modules”. I see! He sums up his insights succinctly: “All that is needed is to attach a label to a computational module, and the particular module indicated will often be quite salient, because we share inherited inductive biases in the form of modular structure.”

“Evolution thus designed the mind for the purpose of making decisions leading to propagation of the DNA.” “I suggest that this picture will…qualitatively explain everything about our consciousness and our experience of living.” Thank God, I was afraid no-one was ever going to figure that out.

And a last bit of good news: Baum has also solved the age-old paradox concerning whether or not humans have free will. The answer is simple: DNA has evolved a mind which has free-will subroutines!