Friday, March 21, 2008

Data base design

I'm working with a system I did not create. Now I have the reputation of being excessively critical of OPC (other people's code) so please bear with my tirade - you may find a grain of truth in this rant despite my nasty and critical nature. In this system there is an Order Header file called O1A, and Order Detail file called O1 and an Item/Warehouse file called I1B. No surprise there. Now here's where I get all bent out of shape. All three of these files have a field called Warehouse and all three use the same variable - the same name.

I guess I should be used to that approach. The Qantel/Comet file scheme, which is really different from the COBOL file organization scheme, lends itself to defining the same variable in more than one file.

In COBOL the data record is read into a memory block in the application program space. That space is mapped to variables: the first 6 bytes are the customer account number, the next 30 are the name, and so forth. In Comet the record is read into some invisible space in the OpSYS and then the variables are parsed into named variables.

The COBOL scheme makes it IMPOSSIBLE to use the same variable name in more than one record. It is simply not possible as the name points to a specific memory block. The same logic applies in Access, SQL, Oracle and, I expect, in all other databases.

The Qantel/Comet format scheme is a result of the small memory design of the 1970's and 1980's. Consider how much memory this approach saves. If you have the "same" data in more than one file, think Order Number, you need only one variable and you save all the assignment statements that would be required to set the Order Number in the many files where it exists.
But in the forty years since the Q machine came on the scene, memory has gotten so cheap that it is almost free. And there are problems with the common variable approach.

The Comet scheme results in hidden assignment statements; reading and writing records that are defined with the same variables causes these fields to be filled "behind the curtain." I hate this because I can't tell which program updates a variable and all the Seeks and Searches don't help. Magic assignment statements behind the curtain can be so frustrating.

And then there is the file integrity issue. I so clearly remember the time a program I wrote read an O1A record keyed by Order Number + "B" into the format that was defined as:
1511 Format ___

I LOST THE LITERAL "A". Comparisons to the constant, the literal, "A" failed. It took me a very long time to realize that I had clobbered the location containing the literal A with a B.

Since then I have seen data corruption problems migrate through a system changing order numbers almost at random. One corrupt record, with the wrong data in the Order Number field, can destroy a database in the most amazing way. And the best way to start this is to write a record with the wrong format. That'll do you really nicely.

I have more I want to say about database design and variable naming, but it will have to wait for another time.



Post a Comment

<< Home