Monday, June 30, 2008

READ and her friends

The recent conversation on the Signature web site about the usefulness of the READ statement started me thinking. There is no question that READ is a weak sister. Her newer siblings EXTRACT and INQUIRE are much better. Not only do they do more, they tell much more about the intentions of the coder. When you see an INQUIRE in a program you know you are dealing with a file that is not going to be changed; an EXTRACT alerts you that this is a file that will be changed.

The suggestion made on the web site in June 2008 was to change the Comet run-time so that READ would act like INQUIRE. After all, if all the coder wants to do is get a record, and did not use an EXTRACT, then why not let the system treat the READ just as it does an INQUIRE? The suggestion was roundly denounced. The consensus was to leave old code alone; "If it ain't broke, don't fix it." I could not agree more.

Not the run time, the compiler
My suggestion is less dangerous. Rather than fool with the run time, make the compiler a little friendlier, a little smarter. The compiler should warn the coder that the READ statement is bad news and that INQUIRE or EXTRACT are the preferred statements. This kind of change would affect only new programs and programs that are being updated. And, if all you got was a warning message then you could elect to leave well enough alone when making a minor change to an old working program.
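Here is a rough sketch of the kind of check I have in mind, written in Python rather than inside the real compiler. Everything in it is an assumption for illustration: the function name lint, the PREFERRED table, and the guess that a statement keyword starts the line and is followed by an opening parenthesis. The point is only that the check produces warnings, never errors, so an old working program still compiles.

```python
import re

# Hypothetical advice table: statement -> suggested replacements.
PREFERRED = {"READ": "INQUIRE or EXTRACT"}

# Assumption: the keyword starts the line (after optional blanks) and is
# followed by "(", as in READ(1,000) KEY="" ...
STATEMENT = re.compile(r"^\s*([A-Za-z]+)\s*\(")

def lint(source):
    """Return a list of warning strings; nothing here stops the compile."""
    warnings = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = STATEMENT.match(line)
        if not match:
            continue
        keyword = match.group(1).upper()
        if keyword in PREFERRED:
            warnings.append(
                f"line {number}: warning: {keyword} found; "
                f"consider {PREFERRED[keyword]}"
            )
    return warnings
```

Adding another entry to the table would extend the same gentle nagging to any other statement that has been superseded.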

Warning messages
And while we're talking about changing the compiler's warning messages, maybe we can eliminate the useless warning about duplicate definitions. No, not the error message that says the same variable is defined with different characteristics. That's a real error. I'm talking about the warning that tells you that CNBR$ is defined three times in the source. When I see that warning message filling up the compile screen I want to yell, "I know, I know, I know it's defined three times in three USEFILES and I don't dare to change them."

Warning you about non-errors that you can't change is not all that helpful. Now what I would like is a list of unreferenced statement labels and unreferenced variables. I like removing unreferenced statement labels so I can better see how the program logic works. And I like removing unreferenced variables for the same reason, BUT please don't tell me about unreferenced variables and statements in USEFILES. Please tell me about unreferenced statements and variables defined in the source program. Now that would be a useful warning message.
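To show that the label half of this wish is not rocket science, here is a small Python sketch. It is not how the Comet compiler works; it is a toy that makes several assumptions: a label is a name followed by a colon at the start of a line, a label is referenced only through EXCP=name or GOTO name, and "!" starts a comment (so a "!" inside a quoted string would fool it). Because it is handed only the text of the source program, USEFILES never enter the picture.

```python
import re

# Assumed shapes: "NAME:" defines a label; "EXCP=NAME" or "GOTO NAME" uses one.
LABEL_DEF = re.compile(r"^\s*([A-Za-z][A-Za-z0-9_.]*):")
LABEL_REF = re.compile(
    r"\b(?:EXCP\s*=|GOTO\s+)\s*([A-Za-z][A-Za-z0-9_.]*)", re.IGNORECASE
)

def unreferenced_labels(source):
    """Return (line number, label) pairs for labels nothing refers to."""
    defined = {}
    referenced = set()
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("!", 1)[0]  # drop the comment, if any
        match = LABEL_DEF.match(code)
        if match:
            defined.setdefault(match.group(1).upper(), number)
            code = code[match.end():]
        for name in LABEL_REF.findall(code):
            referenced.add(name.upper())
    return sorted(
        (number, name) for name, number in defined.items()
        if name not in referenced
    )
```

A numeric test such as EXCP=13 is not mistaken for a label reference because the pattern insists that a label start with a letter.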

How about WRITE?
WRITE suffers from the same limitations as READ. It has been supplanted by its big brothers INSERT and REWRITE just as READ has been passed by EXTRACT and INQUIRE. Why not smarten up the compiler to gently warn the developer that maybe WRITE is not the best choice?

I'm really hard pressed to find a good case for using a READ in new code but I can see at least one situation where I would use WRITE. I have designed some update programs to be restartable and in those programs I can WRITE and rewrite the new transaction file records repeatedly and do no damage. The transaction record key is unique, so the transaction record will be written once under normal circumstances, but in a restart situation it could be safely written over again. So a compiler warning message would be appropriate. I know that WRITE is not the best command to use, but in this program I know I want to use it.
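The restart argument is easier to see in a toy model. The Python below stands in for the transaction file with a dictionary keyed by the unique transaction key, and treats a WRITE as "add the record or overwrite it". That reading of WRITE is my assumption for the sketch, and the names post_transactions and fail_after are made up for the example.

```python
def post_transactions(batch, trans_file, fail_after=None):
    """Write every (key, record) pair; optionally simulate a crash mid-run."""
    for count, (key, record) in enumerate(batch):
        if fail_after is not None and count == fail_after:
            raise RuntimeError("simulated crash")
        # WRITE-like behavior (assumed): add the record or overwrite it.
        trans_file[key] = record

batch = [("T1", "first"), ("T2", "second"), ("T3", "third")]
trans_file = {}

try:
    post_transactions(batch, trans_file, fail_after=2)  # dies before T3
except RuntimeError:
    pass

# Restart from the top: T1 and T2 are simply written over again.
post_transactions(batch, trans_file)
```

After the restart the file holds exactly one copy of each transaction, which is the whole reason I would keep the WRITE in this program.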

Signature must be dealing with changing the compiler as they continue to add new, more powerful commands to the Comet environment. The suggestions I'm making here are probably much more difficult to implement than simply adding a new command to the language, but if you agree with me and add your comments to this blog, maybe the volume of our voices will encourage them to consider these improvements.

Thursday, April 17, 2008

Hungarian names, space travel and Martha Stewart

If you looked at the blog called Variable Names you know that I am not a fan of the Comet / Qantel notion of using the same name for variables in different data files. The classic examples are Customer Number and Order Number. These two variables appear ALL over a Solutions-based accounting system. The Order Number is created in Order Entry but it flows into the Order (Header) File and into all the ancillary files that comprise the Order File. It appears in the Order Detail file, as well as in all the many pointer files that drive the order processing system. In all these files, and in many others, the Order Number is not only a field in the file, it usually is part or all of the key of the file.

I'm here to say that each of these Order Numbers should have a unique name. When a program references order number it should be immediately clear which order number is meant. So, how to come up with meaningful names?

Really smart programmers have given this some thought. We're not the first ones to worry about how to name variables. Microsoft has a scheme it uses that the folks there call Hungarian.

If you follow the link, you'll see that Hungarian is not Polish! Simonyi's scheme requires putting a prefix on the real variable name and that prefix tells the coder something about the characteristics of the variable. In the C language family the type of variable - integer, boolean, floating point - is really important; there are lots and lots of rules about using and converting data from one data type to another data type.

We don't have the luxury (or the burden) of dealing with different data types. In IB there are really only two data types: strings and numbers. Of course, we often create our own sub-data types: arrays, dates, edit masks, strings that are really numbers, and more. Our programs only work properly when we keep these different formats clearly in mind. Despite the importance of data type, I really believe that the biggest source of confusion in IB programming is knowing the origin of each variable.

Is this the customer number from the customer master or the customer number from the order file? Is this the running balance accumulated as the program runs or is this the balance in the just read order file?

I am proposing that we add the file name as a prefix to the base variable name. So the Order Number in the Order Header would be o1aOrnbr$ while o1Ornbr$ is the variable in the Order Detail file.

I know. I know. Why not O1A.Ornbr$? Because UltraEdit does not like embedded periods. As nice as dots may be, and even if Brian pointed the way, names like cosNMHDR.LENGTH just don't agree with UltraEdit.
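The prefix rule is mechanical enough to write down. This little Python function is only an illustration of the convention as I have described it: lower-case file name, then the base name with its first letter capitalized, and no period in between. The function name prefixed_name is made up for the example.

```python
def prefixed_name(file_name, base_name):
    """Build a file-qualified variable name such as o1aOrnbr$."""
    prefix = file_name.lower()
    # Capitalize only the first letter of the base name; keep any "$" suffix.
    base = base_name[:1].upper() + base_name[1:].lower()
    return prefix + base

print(prefixed_name("O1A", "ORNBR$"))  # Order Header
print(prefixed_name("O1", "ORNBR$"))   # Order Detail
```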

Oh yeah, I promised you outer space. OK here it is. And Martha Stewart. And I know that Martha would approve of a nice, neat orderly naming convention.

Friday, March 28, 2008

ADD2QDIR

Think about where Internet Basic has been. Now let's think about where it could go.
I remember when we all coded:

READ(1,000) KEY="" EXCP=FUGETABOUTIT
FUGETABOUTIT:

The FILE statement makes it much easier and cleaner to code:
FILE (0) POS=BOF. No meaningless EXCP or pointless statement label is necessary.

Getting to the beginning of the file is really a special case. The more general statement is:

READ(OrdDet,0000) KEY=Ornbr$ EXCP=FUGETABOUTIT
FUGETABOUTIT:

The Exception is meaningless because the Order Detail file is keyed by Order Number and Line Number and the READ can never (should never?) succeed.

The POSITION statement makes this construction unnecessary.

POSITION (OrdDet) KEY = Ornbr$

I never code an EXCP on a POSITION statement even though the documentation shows one. The documentation does not specify what conditions could trigger an exception; the only one I can think of is File Not Open and I would want that to cause a crash.

The CLEARFILE statement comes close to addressing the complications of:

ERASE "FILENAME" EXCP=NEXTLINE
NEXTLINE:
CREATE "FILENAME" .....
OPEN (1) "FILENAME"

Sadly the CLEARFILE statement has its own limitations. Recently we learned that CLEARFILE, which does not include an EXCP argument, is oblivious to file contention. If the file being cleared is open by another user, Comet will chop that user off and clear the file. Signature recommends LOCKing the file before clearing it.

Recently I found myself doing a lot of work with files imported into Comet from Windows applications. Internet Basic has a way to add a file to a Comet QDIR from inside an application program, but the method is awkward and results in spaghetti code.


OPEN (1) "FOREIGN" EXCP=OPNERR
. process data here
..
..
OPNERR: !Open fails on file being imported
CREATE "FOREIGN",T,DIR="XXX",EXCP=CRERR
CRERR: !Create error
If EXCP=13 !Ah ha, the file exists
Goto ... !Now it's in the QDIR
Endif

You might be tempted, I was, to code an AGAIN statement to get back to the OPEN. After all, that's where all the stuff started. That won't work because it was the CREATE that tripped the Exception that triggered the pseudo-create. Sadly you need a label that is almost as useless as the statement labels we used to code for POSITIONs and still do for ERASE/CREATE pairs.

I want an ADD2QDIR that needs no EXCP. Then I can OPEN foreign files without jumping all over the place.





Friday, March 21, 2008

Data base design

I'm working with a system I did not create. Now I have the reputation of being excessively critical of OPC (other people's code) so please bear with my tirade - you may find a grain of truth in this rant despite my nasty and critical nature. In this system there is an Order Header file called O1A, an Order Detail file called O1 and an Item/Warehouse file called I1B. No surprise there. Now here's where I get all bent out of shape. All three of these files have a field called Warehouse and all three use the same variable - the same name.

I guess I should be used to that approach. The Qantel/Comet file scheme, which is really different from the COBOL file organization scheme, lends itself to defining the same variable in more than one file.

In COBOL the data record is read into a memory block in the application program space. That space is mapped to variables: the first 6 bytes are the customer account number, the next 30 are the name, and so forth. In Comet the record is read into some invisible space in the OpSYS and then the variables are parsed into named variables.

The COBOL scheme makes it IMPOSSIBLE to use the same variable name in more than one record. It is simply not possible as the name points to a specific memory block. The same logic applies in Access, SQL, Oracle and, I expect, in all other databases.

The Qantel/Comet format scheme is a result of the small memory design of the 1970's and 1980's. Consider how much memory this approach saves. If you have the "same" data in more than one file, think Order Number, you need only one variable and you save all the assignment statements that would be required to set the Order Number in the many files where it exists.
But in the forty years since the Q machine came on the scene, memory has gotten so cheap that it is almost free. And there are problems with the common variable approach.

The Comet scheme results in hidden assignment statements; reading and writing records that are defined with the same variables causes these fields to be filled "behind the curtain." I hate this because I can't tell which program updates a variable and all the Seeks and Searches don't help. Magic assignment statements behind the curtain can be so frustrating.
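A toy model makes the hidden assignment visible. In this Python sketch one dictionary plays the part of the program's variables, and each "format" is a list of variable names and widths. The field names, widths and sample records are all invented for the illustration; only the idea of two files sharing one variable name comes from the system I am describing.

```python
# One shared pool of program variables, as in the common-variable scheme.
variables = {}

def read_record(record, fmt):
    """Parse a fixed-width record into the shared variables."""
    offset = 0
    for name, width in fmt:
        variables[name] = record[offset:offset + width]
        offset += width

# Hypothetical formats: both files define the same WHSE$ variable.
ORDER_HEADER = [("ORNBR$", 6), ("WHSE$", 2)]
ITEM_WAREHOUSE = [("ITEM$", 8), ("WHSE$", 2)]

read_record("000123NY", ORDER_HEADER)
header_whse = variables["WHSE$"]           # warehouse from the order header

read_record("WIDGET01LA", ITEM_WAREHOUSE)  # no assignment statement in sight...
print(header_whse, variables["WHSE$"])     # ...yet WHSE$ has changed
```

No line of the program says WHSE$ = anything, which is exactly why a search for assignments to the variable comes up empty.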

And then there is the file integrity issue. I so clearly remember the time a program I wrote read an O1A record keyed by Order Number + "B" into the format that was defined as:
1511 Format ___
ORNBR$;__
"A";_

I LOST THE LITERAL "A". Comparisons to the constant, the literal, "A" failed. It took me a very long time to realize that I had clobbered the location containing the literal A with a B.

Since then I have seen data corruption problems migrate through a system changing order numbers almost at random. One corrupt record, with the wrong data in the Order Number field, can destroy a database in the most amazing way. And the best way to start this is to write a record with the wrong format. That'll do you really nicely.

I have more I want to say about database design and variable naming, but it will have to wait for another time.


Friday, March 7, 2008

Naming Variables

OK, tell me again why you are restricting variable names to eight characters. You've always done it that way. #FILES limits you to eight bytes. Long names are too hard to type correctly.

#CFILES, which is where you should be now, does not limit the IB name to eight bytes. You may wonder what the limit is. I know I did so I checked the documentation: http://www.signature.net/dbmanager.html. Of course I did not find out because there is no reference to the maximum length of the IB name, but it is surely longer than eight bytes.

And UltraEdit will help with entering those long variable names. The magic is the less than well documented AutoComplete feature in UE. You invoke it with Ctrl+Space. The full explanation can be found in the UE help by searching for Auto Completion. I found it by searching for Ctrl+Space. It's great but it's not perfect. Of course, UE does not look in USEFILEs or INCLUDE files. That's a bummer. And UE only looks back. In other words, if the first reference to a variable is below your current location in the file, the Auto Completion feature does not work. Now I'm still on version 10 but the promotional material on the UltraEdit site reads just like the current HELP text - UE looks backward through 50K for auto completion help.

UltraEdit has another quirk that you need to know. The program sees periods as an end-of-word marker. So, if you have a long variable name like MAX.STRING.LEN UltraEdit will not help you. Try it yourself. Start UE. Load up FILEFIND.INC from WDL. Type MAX and press CTRL+SPACE. Bummer.

So here's my rant. Use longer and more descriptive variable names. Don't punctuate the names with periods. Learn to use UltraEdit's not-so-fabulous Auto Completion. You'll be glad you did.

Friday, February 29, 2008

Enhancements to the Comet file system

I am proposing two enhancements to the Comet file system that would be invisible and completely compatible with existing Comet application software.

Add an extension to Comet Data Files: As everyone reading this blog knows all too well, the data portion of a Comet file doesn't have an extension. Unfortunately Windows DOES NOT LIKE filenames without extensions; does not like them one bit. I have found it impossible to send Comet files as attachments to emails. Somewhere, either at the sending side or the receiving side, a DAT extension gets stuck on the filename. So let Comet put the DAT extension on the data file name. The idea is not without precedent. Comet invisibly appends an OBJ extension to object files even though the system uniformly refers to "programs" by their root name and almost never refers to them by their real name. And if you try to CREATE a file with an OBJ extension, Comet will bark at you.

Append CR/LF to the end of every record: When a program CREATEs a file, Comet would invisibly add two bytes to the record length and when a record is written to that file Comet would invisibly append a Carriage Return and a Line Feed to the end of the record.

These two changes would not have any effect on existing programs. Existing CREATEs, READs and WRITEs would not be affected.

Comet file names and record lengths would not change. BUT you could look at Comet data files using a text editor and the Google desktop search tool would be able to look at Comet data files. These seem like big benefits for small changes.
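Here is a sketch of the CR/LF idea in Python, using a byte string in place of a real Comet data file. The record length of 10 and the function names are assumptions for the example. The program still thinks in 10-byte records; the two extra bytes live only in the file layer.

```python
RECORD_LENGTH = 10   # hypothetical length the program asked for on CREATE
CRLF = b"\r\n"       # the two bytes added invisibly to every record

def write_record(data, record):
    """Append one record, padded to length, followed by CR/LF."""
    padded = record.ljust(RECORD_LENGTH)[:RECORD_LENGTH]
    return data + padded.encode("ascii") + CRLF

def read_record(data, index):
    """Return record number index as the program sees it, without CR/LF."""
    stride = RECORD_LENGTH + len(CRLF)
    start = index * stride
    return data[start:start + RECORD_LENGTH].decode("ascii")

data = b""
data = write_record(data, "ACME")
data = write_record(data, "ZENITH CO")

print(repr(read_record(data, 1)))         # what the program sees
print(data.decode("ascii").splitlines())  # what a text editor sees
```

The program gets back its 10 bytes and nothing else, while a text editor sees one record per line.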

Monday, February 25, 2008

Focus

We steal words and use them for computer things that didn't exist when the words were first coined: word, string, branch. Focus is another of those words. A window has focus when the title bar is bright blue and the user HAS to answer the question in the box. Some things I've seen call these modal windows but I'm not sure these two concepts - modal and focus - are the same.

Meanwhile I want a no-focus (blurry?) window and I don't think Comet has such an animal. It all has to do with what you mean by Help System. Most help systems are designed to answer the user's request for assistance. The user selects a HELP button or presses a function key or does something else and a window opens with lots of text that can be searched to find answers.

I want something like tool tips. I want to put up an information window in Order Entry for example. The window would contain reminder information listing all the magic words that the operator can enter in the item number field. My information window would appear over the order header information, or over the blank space below line one. Once line one has been entered my program would close the window and it would not show again until the next order is created.

Oh yes, I'd like to be able to select the text font in this window so it looks very different from the 1980's Lucida Console. Italics anyone...