Monday, December 11, 2006

Outside In

The fundamental goal of all data modeling is to create a facsimile of the objective world in a semantic composite. A table of data does this by listing aspects of reality that matter to us (columns), then recording the data for these columns in a series of rows. The relationships between these columns are the important part, though, and these relationships are only really understandable when we view the data as a diagram. The problem here is that when we view the diagram, we hide the rows of observed data for each of the column values. Our diagrammed data model may become irrelevant as the data coming in changes, and we would not know it until it is too late.

We sympathize with the data diagram approach because that is how our brains see the world: as a series of fantastically complex interconnections. We don't store a whole lot of data in our brain. Instead, when we see an object, we make connections about various aspects of that object that link to other objects. In database terms, the human brain is a database where all the values are primary keys.

Our computer brethren, on the other hand, stink at drawing associations but are incredible storage devices. We rely on them constantly, minute to minute, to remember our music, phone numbers, movies, news, and whatever else will fit on our Blackberries and laptops. We would not, however, ask them for the shape of the data, or for a field that no one seems to want anymore, although the sophistication with which data design can do these things is increasing constantly.

Somehow the data needs to appear in the design diagram. Obviously it can't make a cameo in the flesh, because it would make a laughingstock of design. It requires some algorithm that can convert the pattern in the data into abstractions at the design layer. Perhaps the shapes could glow red when they haven't been selected from for over two years. Blinking red.
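
As a rough illustration of that last idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function name shape_colors, the table names, the timestamps, and the choice to approximate "over two years" as 730 days. It only shows how usage data might be reduced to a per-shape signal that a diagramming tool could then draw.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: the post's "over two years", approximated as 730 days.
STALE_AFTER = timedelta(days=730)

def shape_colors(last_selected, now):
    """Map each table name to a diagram color based on its last SELECT time."""
    colors = {}
    for table, last in last_selected.items():
        # A table with no recorded SELECT at all (None) is treated as stale too.
        stale = last is None or now - last > STALE_AFTER
        colors[table] = "red" if stale else "default"
    return colors

# Made-up usage log, for illustration only.
usage = {
    "customers": datetime(2006, 11, 30),
    "fax_numbers": datetime(2003, 5, 1),
    "pager_codes": None,
}
colors = shape_colors(usage, now=datetime(2006, 12, 11))
print(colors)
```

Where the last-selected timestamps would come from is left open here; that depends entirely on what the database in question records about its own usage.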
