How To Code: Pascal

by

Stan Sieler

Allegro Consultants, Inc.


sieler@allegro.com


92/06/08

(revised 2000-05-03)

(revised 2009-06-05)

(revised 2010-06-09)

(revised 2015-06-12)

(Spanish)
(translated by adrian15)

(Belorussian)
(translated by Bohdan Zograf)

(Polish)
(translated by Mary Stefanova)

(Serbo-Croatian)

(Slovenian)
(translated by NextRanks)


0. Introduction

This paper is concerned with how to write quality Pascal code. The emphasis is on the coding aspects of writing Pascal code.

Sample program: roman.p.txt, a program to convert Roman numerals into Arabic numbers.

Sample bad code: trim, an actual real-world piece of bad code.

How to ask for programming help: here.

Note: this paper uses Pascal/iX, the Pascal on the HP 3000 computer, but nearly all points apply to any variant of Pascal, and most apply to other programming languages as well.

I want the discussions in this paper to cover the four P's of programming: philosophy, performance, problems, and portability. I considered trying to divide this paper into four chapters, one for each of the P's, but realized that for many topics that I would put in one chapter, equally good arguments could be raised for putting it into another one. So, instead of having one chapter per P, I've organized this paper into the chapters: Style, Coding Choices, Performance Problems, and Portability.

In each chapter, the 4 P's are used as the principles underlying a set of guidelines. These suggested rules should not be viewed as though they were the 10 (or %12 or $A) Commandments. They are my rules, the ones that I have used (in various forms for different languages) in the writing of over two million lines of code. Guidelines should grow, as do people. I can look at programs I wrote in 1970 and see changes in style in the intervening years. Despite the changes, the overall look has remained the same.

The primary purpose of this paper is to encourage readers to think about the various aspects of style and form their own coding guidelines. This is an important underpinning to writing quality programs ... in Pascal or any language.

I've written over a million lines of high-level language source code since 1970. This, of course, includes some blank lines, comment lines, and lines that are cloned from other files. Nonetheless, this background has taught me that the "look" (or appearance) of a program is critically important to understanding what the code does.


Table of Contents

The major sections are: Style, Coding Choices, Performance Problems, and Portability.

0.1 Throw-away code

Guidelines:

#1-1. There is no such thing as throwaway code. Therefore, your guidelines apply to everything you write. Especially the "throwaway" or "one-shot" programs.

Commentary:

#1-1. I've noticed that I almost always re-use my supposedly throwaway or one-shot programs. And, if I've cut corners on using my guidelines I always have to pay for it later.


1. Style

Style, or lack of it, can dramatically affect the quality of a program. Quality code must be written so that it is correct, efficient, and maintainable. Choosing, and following, a particular style can increase the likelihood that each of these three goals is attained. This section will first discuss variables and then the rest of a program.

1.1 Names : The Basics

One of the keys to a successful program is the proper choice of names.

The success of a Pascal program lies in a good understanding of the data the program manipulates. This understanding is reflected in the choice of data structures and variables used by the program.

Throughout this paper the word "names", when used without a qualifier, refers to the names of variables, constants, types, procedures, and functions.

Guidelines:

#1-11. Use nouns for variables, constants, and types; use verb phrases for procedures and functions.

#1-12. Types should usually have "_type" at the end of their name.

#1-13. Names should be entirely lowercase, with an underscore ("_") separating the words (e.g.: card_count).

Commentary:

#1-11. The following one-line example should suffice to show the importance of this guideline:

   if card_count = 52 then

Is "card_count" an integer variable or a function? The difference is critical to the understanding of a program. The only guideline the reader can possibly have is his/her understanding of English. (A capitalization rule would not have sufficed!)

It is important to remember that other people will be reading the code someday. ("Other" people includes you a day after you have written the code!) They will need "pointers" to help them correctly interpret the code they are reading.

According to guideline #1-11, "card_count" is a variable. If it was supposed to be a function, then its name should have been something like "count_the_cards".
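
A minimal sketch of the noun/verb distinction follows. The names "deck_type", "deck", and "deck_size" are invented for illustration and do not come from the sample program; the point is only that "card_count" reads as a variable and "count_the_cards" reads as a routine.

```pascal
program card_demo (output);

const
   deck_size               = 52;

type
   deck_type               = array [1..deck_size] of boolean;

var
   card_count              : integer;     {noun: a variable}
   deck                    : deck_type;   {true = card is present}
   k                       : integer;

function count_the_cards : integer;       {verb phrase: a routine}
   var
      inx                  : integer;
      total                : integer;
   begin
   total := 0;
   for inx := 1 to deck_size do
      if deck [inx] then
         total := total + 1;
   count_the_cards := total;
   end {count_the_cards proc};

begin
for k := 1 to deck_size do
   deck [k] := true;

card_count := count_the_cards;

if card_count = deck_size then
   writeln ('full deck')
else
   writeln ('cards missing: ', deck_size - card_count);
end.
```

At the call site, "card_count := count_the_cards" tells the reader which side does the work without looking up either declaration.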

#1-12. Some programmers prepend "t_", "typ_", or "type_". Others append "_t", "_typ", or "_type". A few misguided souls have been so enamored with this idea that they prepend "t_" and append "_type" to every type.

The advantage of appending, instead of prepending, is that variables related to the same thing will sort together in an alphabetic listing of variables.

The basic idea is to clearly differentiate the names of user-defined types from the names of variables and procedures (or functions).

#1-13. This guideline reflects the fact that we are English speakers. If We were German Speakers, our Programs should probably use Capital Letters at the Start of every Variable. Notice how much harder to read this is?

I can see three reasons for capitalizing the first letter (or all) of some types of names. None are worthwhile:

When writing in C, I do uppercase my macro names ("#define FAIL(s)..."), because that's the overwhelmingly most popular way of doing it. Still, it isn't without a pang of regret. You lose some flexibility when you follow that convention.

Now, if I change a simple variable like "lines_cached" to be a define (perhaps because I now keep track of the number of cached lines on a per-something basis), I can't just say:

   #define lines_cached cached_lines_count [clc_simple]

I have to say:

   #define LINES_CACHED cached_lines_count [clc_simple]

*and* do a global change of "lines_cached" to "LINES_CACHED".

In the all-lower-case days of my C programming, I could switch something from being a simple variable to being a more complex one with no observable change in the rest of the code. OTOH, now when I see LINES_CACHED, it's a reminder that it's a bit more complicated a concept than "lines_cached".

1.2 Names : More Details

Names for types should be descriptive, and should be clearly distinguishable as types so that they stand out when used in type coercion or "sizeof" expressions.

Guidelines:

#1-21. Generic types that are packed arrays of char should be named "pac" with the number of characters.

   type
      pac8        = packed array [1..8] of char;
      pac80       = packed array [1..80] of char;

#1-22. Types that are arrays should have "array" in their name. (Exception: pac## type names (see #1-21).)

#1-23. Types that are pointers should have "ptr" in their name. Types that are long pointers (64 bits) should have "lptr" in their name. (In Pascal/iX, ordinary pointers are 32 bits wide; a "long" pointer is a 64-bit pointer that can point anywhere in memory.)

#1-24. Records should have field names with a prefix that reflects the record they are part of.

#1-25. Simple types should not usually reflect their "size".

#1-26. Try to avoid the use of "true" and "false" for functional results. Use a type like "good_failed_type" instead.

Commentary:

#1-21. The Pascal language has special rules for variables that are packed arrays of char with a lower bound of 1. These special rules were added in an attempt to make standard Pascal usable for text manipulation. As a result, most implementations of Pascal refer to "PAC" as a variable (or type) which is a packed array of char, with lower bound of 1. Hence, it is convenient to reflect this standardization in our own types.

Note that this guideline said generic types. When I use TurboIMAGE (a DBMS on the HP 3000), and want to create a type that corresponds to an IMAGE item, I will try to give the type a name that is similar to the item's name:

         {IMAGE items:     }
         {   cust_name x20;}

   type
      cust_name_type          = packed array [1..20] of char;

In the above example, I used "20" in the type declaration only because it was a directly obvious value (given the comment about the IMAGE items). I would never use the constant of "20" later in the program. If I needed to refer to "20", I would either use the Pascal/iX construct "sizeof (cust_name_type)" or I would add a "const cust_name_len = 20" and use that constant to declare the type.
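
The two alternatives just described can be sketched as follows. This is an illustration only: the variable "cust_name" and the loop are invented, and "sizeof" is an extension (offered by Pascal/iX and some other compilers), not part of standard Pascal.

```pascal
program cust_name_demo (output);

const
   cust_name_len           = 20;          {matches IMAGE item x20}

type
   cust_name_type          = packed array [1..cust_name_len] of char;

var
   cust_name               : cust_name_type;
   k                       : integer;

begin
for k := 1 to cust_name_len do            {blank-fill the name}
   cust_name [k] := ' ';

cust_name [1] := 'A';
cust_name [2] := 'l';

         {neither line below repeats the literal 20}
writeln ('declared length: ', cust_name_len);
writeln ('sizeof:          ', sizeof (cust_name_type));
end.
```

If the IMAGE item ever changes to x30, only the "const" line needs editing.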

#1-24. This guideline makes the fields of a record immediately identifiable when they are used in a "with" statement. Example:

bad:

   type
      hpe_status  = record  {name matches MPE/iX's usage}
         info : shortint;
         subsys : shortint;
         end;
      hpe_status_ptr_type = ^ hpe_status;
      ...
   var
      hs_status_ptr : hpe_status_ptr_type;
   ...
   with hs_status_ptr^ do
      begin
      info := my_info;
      subsys := my_subsys;
      end;

good:

   type
      hpe_status              = record       {name matches MPE/iX's usage}
         hs_info              : shortint;
         hs_subsys            : shortint;
         end;

   ...
   with hs_status_ptr^ do
      begin
      hs_info := my_info;
      hs_subsys := my_subsys;
      end;

In the bad example, the reader who is unfamiliar with the code will not know that "info" and "subsys" are fields of the record that "hs_status_ptr" points to; they look like ordinary variables. In the good example, the "hs_" prefix makes it obvious.

#1-25. In SPL (an ALGOL-like programming language), it is quite common to use a suffix or prefix to denote the "type" of a variable. (Example: double ktr'd). With Pascal's restrictions on matching types this is much less necessary.

#1-26. "true" and "false" mean little by themselves. Examples:

bad:

   function open_files : boolean;
      ...
      open_files := false;
   ...
   if open_files then
      ...

good:

   type
      good_failed_type        = (failed, good);

   function open_files : good_failed_type;
      ...
      open_files := good;
   ...
   if open_files = failed then
      ...

In the bad example, the reader has no idea whether or not "open_files" returning a true value is good or bad.

This guideline is even more important when programming in a multi-language environment, because different languages (and different operating systems) have strange ideas about whether "0" (or "false") means good or bad.

In my C programming, I use:

   #ifdef GOOD
   #undef GOOD
   #endif

   #ifdef FAILED
   #undef FAILED
   #endif

   #define GOOD 1
   #define FAILED 0

The "undefs" are there because, with astoundingly bad decision making, someone involved in BSD or Mac development decided that the file /usr/include/protocols/talkd.h should have a macro called FAILED, which has the value 2. (Or, on HP-UX, the file /usr/include/sio/scsi_meta.h has GOOD as 0, and the file /usr/include/sys/netio.h has FAILED as 3.) (Other bad examples of GOOD and FAILED exist!)

Files provided by an OS vendor should NEVER co-opt "known" words like send, get, put, FAILED, or free. Such words are simply far too generic to have any meaning in and of themselves. Sadly, we're stuck with many such words already in use.

1.3 Ordering of Source

The ordering of the source code can dramatically affect the readability of a program. Ordering guidelines fall into three areas: types & constants, variables, and procedures & functions.

("include files" are harder to categorize ... sometimes they need to come first, sometimes they must be after all types/variables.)

Guidelines:

#1-31. Simple constants should be put at the start of the declaration area, followed by types, structured constants, and then by variables. Within each group, identifiers should be arranged in some order. If no other order presents itself, alphabetic order should be used. Example:

   const
      max_queues              = 10;

   type
      aardvark_quantity_type  = 0..9;
      queue_name_type         = packed array [1..8] of char;
      queue_names_type        = array [1..max_queues] of queue_name_type;
      zoo_fed_aardvarks_type  = array [aardvark_quantity_type] of integer;

#1-32. At most one instance each of "type" and "var" is needed, and up to two instances of "const" (one for simple constants, and one for structured constants).

#1-33. Identifiers in a declaration area (constants, types, variables) should be declared one per line, in some order. Alphabetic order is the default. If another ordering is used, it should be explained with a comment.

#1-34. Identifiers in a declaration area should be entered so that the "=" characters are aligned for constants and types, and the ":" characters are aligned for variables and fields in records.

#1-35. Record types that contain nested records should never be declared as "anonymous types". Examples of bad and good practices are:

bad:

   type payment_record_type = record
         pay_date : date_type;
         pay_time : time_type;
         pay_style : (paid_cash, paid_check, paid_charge);
         end;

good:

   type
      payment_style_type      = (paid_cash, paid_check, paid_charge);

      payment_record_type     = record
         pay_date             : date_type;
         pay_time             : time_type;
         pay_style            : payment_style_type;
         end;

#1-36. "Crunched" records should be avoided unless the need for tight packing of the fields overrides the performance loss they cause.

Pascal/iX supports the "crunched record", an extension beyond a simple "packed record". In a crunched record, there are *no* unused bits. If you declare a crunched record with a 1-bit field, followed by an 8-bit field, followed by a 1-bit field, then the entire record will occupy 10 bits, and the 8-bit field will require extra code to load/store.

#1-37. Large outer block variables (> 256 bytes) should be declared last, even though this violates guideline #1-33. (Similarly, large local variables should be declared first.) (Note: this is a Pascal/iX performance tip ... other compilers on other machines probably do things differently!)

#1-38. Procedures and functions are intermixed (i.e.: do not separate procedures from functions).

#1-39. Procedures (and functions) should be declared in some order (alphabetic is the default).

#1-40. More than one level of nested procedure should be avoided.

#1-41. Intrinsics should be declared once, at the outer level, after all constants, types, and variables, and before any "external", "forward", or actual procedures. (An "intrinsic" is a reference to a kind of pre-compiled "external" procedure declaration, supported by most languages on the HP 3000.)

#1-42. Intrinsics should be in alphabetic order, arranged by intrinsic files. Example:

   Function  ascii   : shortint;  intrinsic;
   Function  binary  : shortint;  intrinsic;
   Procedure quit;                intrinsic;

In the above example, two spaces were put after the word "Function", so that the names of all intrinsics would be aligned, regardless of whether they were functions or procedures. It is unfortunate that Pascal makes us prove to the compiler that we know the functional type of every intrinsic.

Note the uppercase "P" and "F" in the above example. This is one instance of a very useful coding discipline that is explained in guideline #1-45 below.

#1-43. All procedures and functions should be declared with "forward" declarations, which are in alphabetic order.

#1-44. Types that are "fillers" should be declared as "integer" or "byte" (where "byte" is declared as 0..255) instead of as "char" for their base types.

#1-45. "forward", "external", and "intrinsic" procedure (and function) declarations should capitalize the first letter of "Procedure" and "Function".

In my C programming, I have my one master include file ("stddefs.h"), which sets flags appropriate for the current operating system (e.g., #define HAVE_STRTOLL 1 if the platform has the function strtoll), followed by OS/vendor provided includes (e.g., stdlib.h), followed by my other includes (e.g., "sstypes.h"), then all my global types, enums, variables, defines of code fragments, 'forward' function declarations, then all my functions.

Commentary:

#1-34. I generally align the "=" for types and consts in column 31, the ":" for fields within a type also at column 31, and the ":" for variables in column 19. If the variables have sufficiently long names, I'll often simply align the ":" in column 31. I apply this alignment to the ":" used in parameter declarations, such that the variable name starts in column ??:

   procedure parse_gribbitz (
                  a_token     : str80;
              var status      : boolean;
           anyvar stuff       : char)
         option default_parms (
                  stuff       := nil);

In the above example, note that the parameter names are aligned. (Obviously, this isn't always possible, particularly if the parameter has $alignment$ information specified.) (Note: "anyvar" is seen again in #2-9 below.)

#1-35. Variables (and fields of records) that are anonymous types can never be passed by reference (as "var" parameters) to ordinary procedures. Indeed, anonymous records usually cannot be passed by value into procedures.
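
As a hedged illustration of why the named type matters, the sketch below passes a record field by reference. The routine "choose_style" and the field "pay_amount" are invented for this example. The "var" parameter can be declared only because "payment_style_type" has a name; had "pay_style" been declared with an anonymous enumeration, no formal parameter could have been given a matching type.

```pascal
program payment_demo (output);

type
   payment_style_type      = (paid_cash, paid_check, paid_charge);

   payment_record_type     = record
      pay_amount           : integer;
      pay_style            : payment_style_type;
      end;

var
   payment                 : payment_record_type;

procedure choose_style (
               amount      : integer;
           var style       : payment_style_type);
   begin
   if amount > 100 then
      style := paid_charge
   else
      style := paid_cash;
   end {choose_style proc};

begin
payment.pay_amount := 250;
choose_style (payment.pay_amount, payment.pay_style);

if payment.pay_style = paid_charge then
   writeln ('charged')
else
   writeln ('paid in cash');
end.
```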

#1-36. Accessing fields of "crunched" records can take up to three times the number of instructions that would have been required if the record was not crunched. Consider the following two types:

   type
      bad_type                = crunched record
         bad_misc             : shortint;    {Bytes 0, 1}
         bad_status           : integer;     {Bytes 2, 3, 4, 5}
         end;  {Total size: 6 bytes}

      good_type               = record
         good_misc            : shortint;    {Bytes 0, 1}
         good_status          : integer;     {Bytes 4, 5, 6, 7}
         end;  {Total size: 8 bytes}

When the "bad_status" field is accessed, Pascal/iX emits three instructions (LDH, LDH, and DEP). When the "good_status" field is accessed, Pascal/iX emits one instruction (LDW).

#1-37. Pascal/iX can efficiently access only the first 8,192 bytes of global or the *last* 8,192 bytes of local variables. Pascal/iX allocates variables in a (roughly) first-seen, first-allocated manner. Thus, for global variables, putting the small ones first and the large ones second tends to be more efficient. For local variables, putting the large ones first, and the small ones second tends to be more efficient.

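Because Pascal/iX allocates outer block ("global") variables in the opposite order from local variables, the rule to follow is:

   outer block variables: small ones first, large ones last;
   local variables: large ones first, small ones last.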

Consider the following examples:

bad:

   var                                 {outer-block variables}
      big_array   : array [0..9999] of integer;  {40,000 bytes}
      ktr : integer;
   ...
   procedure foo;
      var
         ktr2     : integer;
         big_array_2 : array [0..9999] of integer; {40K bytes}

good:

   var                                 {outer-block variables}
            {small variables...}
      ktr         : integer;

            {big variables...}
      big_array   : array [0..9999] of integer;
   ...
   procedure foo;
      var
               {big variables...}
         big_array_2 : array [0..9999] of integer; {40K bytes}

               {small variables...}
         ktr2     : integer;

In the bad example, Pascal/iX will use two instructions to access "ktr" and "ktr2". In the good example, Pascal/iX will use a single instruction to access "ktr" and "ktr2".

#1-38. Pascal's differentiation of functions versus procedures is unfortunate, at best. We shouldn't encourage language designers to perpetuate this flaw.

#1-39. Pascal/iX's "$locality" statement can be used to tell the linker to group specified procedures together regardless of their order in the source code.

#1-40. Pascal allows procedures to be declared within procedures that are declared within procedures that are...

Nested procedures pay a run-time performance penalty when they access variables that are local to the surrounding procedures.

Nesting procedures more than a total of two deep (i.e.: an "outer level" procedure and one inner procedure) usually implies that other design problems exist.

Some debuggers have difficulty setting breakpoints at nested procedures.

#1-43. This extra work often pays off well when writing a "module" that will be linked with other programs. I often put all of my "forward" declarations into a file, and then use the following QEDIT commands to generate an "external" declarations file for other modules to $include:

   t myfile.forward
   c "forward"(S)"external"@
   k myfile.external

(QEDIT is a widely used editor on the HP 3000.)

#1-44. Debug/iX's Format Virtual command (FV) will produce much more readable output for random data when the base type is numeric instead of character. (Debug/iX is the debugger bundled with MPE/iX on the HP 3000.)

#1-45. With this guideline, and an associated one in the next section, a QEDIT command like:

   l "procedure" (s)

will list the first line of every procedure declaration. Note that only the actual declaration will be listed, not the "forward" or "external" declarations, since they would have been declared with a capital "P" in "Procedure".

Note: do not tell your editor to automatically upshift all text you search for (e.g.: Set Window (UP) in QEDIT), as that will defeat the purpose of this guideline.
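
A small sketch of guidelines #1-43 and #1-45 working together (the routine names are invented for illustration). A case-sensitive search for lowercase "procedure" matches only the two actual declarations; the "forward" lines, with their capital "P", are skipped.

```pascal
program forward_demo (output);

Procedure show_footer;     forward;
Procedure show_header;     forward;

procedure show_footer;
   begin
   writeln ('--- end of report ---');
   end {show_footer proc};

procedure show_header;
   begin
   writeln ('--- start of report ---');
   end {show_header proc};

begin
show_header;
show_footer;
end.
```

The routines here take no parameters on purpose: compilers differ on whether the parameter list must be repeated or omitted when a "forward" routine is finally declared, so this form sidesteps the question.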

1.4 Executable Code

This section discusses the styles of coding for the executable code part of a Pascal program.

Guidelines:

#1-50. All code should be in lower case.

#1-51. Comments should be in English, and should be in mixed case, as is the practice in English.

#1-52. Comments should appear in one of two styles, depending on their size:

Example:

         {the following loop looks for a null character}

   null_index := -1;                   {-1 will mean "not found"}
   test_inx := 0;                      {index of first char}
   done := (len = 0);                  {don't loop if no data}

   while not done do
      begin
            {see if current character is null...}
      if buf [test_inx] = chr (0) then
         begin                         {found a null!}
         null_index := test_inx;       {remember location}
         done := true;                 {terminate loop}
         end

      else
         begin                         {incr inx, check end}
         test_inx := test_inx + 1;
         if test_inx >= len then       {inx is 0-based}
            done := true;
         end;
      end;                             {while not done}

#1-53. Multi-line comments may be written with "{" and "}" on every line, or with the "{" and "}" appearing only once, on lines by themselves.

#1-54. The "{" and "}" characters are used to start and terminate comments, never the "(*" and "*)" pairs.

#1-55. "{" is typically not followed by a space, nor is "}" typically preceded by a space (unless it is to align it with the prior line's "}").

#1-56. Lines should be no longer than 72 bytes, even though Pascal/iX allows longer input lines.

#1-57. Blank lines cost nothing at run time, and should be used liberally to separate sections of code. Example:

   if ktr > max_ktr then
      max_ktr := ktr;                  {remember new high water}

   done := false;
   while not done do
      begin
      ...

#1-58. The "end" statement does not have to have a comment on it. Pascal never checks that your comment matches reality, anyway.

#1-59. The basic unit of indentation is 3, not 4 or 2.

#1-60. Indent "begin"s 3 spaces more than the start of the preceding line. Code after a "begin" (up to and including the "end") is at the same level as the "begin".

#1-61. Continuation lines are indented 6 more than the start of the first line.

#1-62. The "then" of an "if/then" statement is usually on the same line as the rest of the boolean expression, not on the next line by itself (unless necessary for spacing, and then it is indented 6), and never on the same line as the statement following the "then".

#1-63. An "else if" construct may be treated as though it were a new Pascal construct: "elseif". (I.e.: the "if" follows the "else" on the same line.)

#1-64. A "goto 999" is an acceptable method of branching to the end of a procedure (or function).

#1-65. No other "goto"s are necessary.

#1-66. Try to keep procedures below five pages (300 lines).

#1-67. Never use the word "procedure" or "function" in a comment in exactly all lower case. Instead, use "routine" or "PRocedure" or "FUnction".

#1-68. Always terminate a procedure or function with a comment of the form:

      end {nameofroutine proc};

#1-69. Put a blank between the name of a procedure/function and the "(" of the parameter list.

#1-70. Put a blank after every comma in a parameter list.

#1-71. Put blanks around operators (e.g.: " := ", " + ", " - "), and in front of left brackets (" [").

Commentary:

#1-52. Aligned comments make the code look neater. This practice makes it possible for a reader to easily read the code or the comments.

#1-55. The practice of always following a "{" with a space and preceding a "}" with a space wastes valuable space on the line.

#1-56. Long lines will not list well on most terminals, nor are they acceptable to all editors.

#1-58. I do put a comment on the "end" statement when it is more than about 10 lines from the corresponding "begin".

#1-60. The value of "3" and the injunction against "double indentation" save space and make the result more readable. Consider the following two examples:

bad:

   for i := 1 to 10 do
       begin
           buf [i] := 0;
           foo [i] := 0;
       end;
   if ktr = 0 then
       begin
           if not done then
               begin
                   ...

good:

   for i := 1 to 10 do
      begin
      buf [i] := 0;
      foo [i] := 0;
      end;

   if ktr = 0 then
      begin
      if not done then
         begin
         ...

Many Pascal programmers have seen the "double indentation" style because the professor in charge of UCSD Pascal (in the early 1970s) used this style. Note that he was primarily a teacher, not a programmer.

#1-61. An example:

   if (card_count = max_card_count) and
         all_cards_accounted_for then
      begin
      ...

The goal of indenting continuation lines is to make it clear to the reader that the prior line has continued to the next line. If no extra indentation is used, then it becomes difficult to determine the difference between the next statement and the continuation of the current statement.

When I have a complex "and/or", I try to make it readable when doing continuation lines:

bad:

   if ((card_count = prior_card_count) and (number_of_cards_left
      > cards_for_book)) then
      begin

good:

   if (      (card_count = prior_card_count)
         and (number_of_cards_left > cards_for_book) ) then
      begin

In the bad example, notice how the "begin" is blurred by the continuation line starting in the same column.

#1-62. The word "then" is syntactic sugar: it fattens the listing, and has no redeeming value. When the reader sees an "if", he or she automatically knows that a "then" is coming, eventually. The indentation alone would suffice to tell the reader that the "then" statement has been found. Examples:

bad:

   if card_count = max_card_count
      then done := true
      else ...

good:

   if card_count = max_card_count then
      done := true
   else
      ...

In the above bad example, the reader has to mentally sift through the excess verbiage ("then") in front of the "done :=" in order to understand the effects of a "true" boolean expression. In the good example, reading the left side of the listing suffices.

#1-63. "else if" constructs are typically found in one of two situations: the "run-on" "if/then/else" and the nested "if/then/else".

An example of the "run-on" "if/then/else":

   if token_check ('EXIT') then
      wrapup
   else if token_check ('LIST') then
      do_list
   else if token_check ('PRINT') then
      do_print
   else
      writeln ('Unknown command: ', token);

Note: in the above style, I will often put 5 extra spaces before the first "token_check", so that a QEDIT command like LIST "token_check" will show all three "token_check" phrases nicely aligned.

An example of the nested "if/then/else":

   if card_count = max_card_count then
      if done then
         ...
      else
         discard_current_card
   else
      if current_card = joker then
         try_best_wildcard
      else
         calculate_score;

A style I recommend against is the following:

   if token_check ('EXIT') then
      wrapup
   else
   if token_check ('LIST') then
      do_list
   else
   ...

The above style has drastic readability consequences if an untimely "page eject" occurs in the listing between an "else" line and the following "if" line.

#1-64. Pascal lacks an "exit" statement. Both C and SPL have some form of "return from this procedure right now" statements. This is the only place I use a "goto" in Pascal.
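
A minimal sketch of the "goto 999" idiom, assuming a made-up routine "check_card_count". Label 999 marks the single exit point at the end of the routine, and each early failure branches straight to it. The "{$goto on}" line is an assumption about Free Pascal, which wants gotos enabled explicitly; compilers that do not treat "{$" specially see it as an ordinary comment.

```pascal
program goto_demo (output);

{$goto on}  {assumption: Free Pascal needs this; a comment elsewhere}

type
   good_failed_type        = (failed, good);

function check_card_count (
               card_count  : integer) : good_failed_type;
   label
      999;
   begin
   check_card_count := failed;            {assume the worst}

   if card_count < 0 then
      begin
      writeln ('negative card count');
      goto 999;
      end;

   if card_count > 52 then
      begin
      writeln ('too many cards');
      goto 999;
      end;

   check_card_count := good;

999:
   end {check_card_count proc};

begin
if check_card_count (10) = good then
   writeln ('10 cards: good');

if check_card_count (99) = failed then
   writeln ('99 cards: failed');
end.
```

Setting the result to "failed" first means every early exit returns a defined value without repeating the assignment.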

#1-67. This guideline means that editor "find" and "list" commands looking for "procedure" and "function" will never accidentally find comment lines instead. (See also #1-45).

#1-68. This makes it very easy to find the end of any (or all) procedure(s) with a "find" command.

#1-69/70/71. Blanks make code more readable, just as they make English more readable. Note the blank after the comma in the previous sentence. Example:

bad:

   fid:=fopen(filename,3,0);

good:

   fid := fopen (filename, 3, 0);



2. Coding Choices

This section deals with choices made in writing executable code.

Guidelines:

#2-1. Decide if your style is to have functions that return errors, or procedures that have status parameters (or both), and then stick to it.

#2-2. Don't use "with" for simple pointer dereferencing. Only use "with" if indexing into an array.

#2-3. Try to avoid "repeat" loops, using "while" loops instead.

#2-4. Try to avoid using "escape" outside the scope of a "try/recover" block.

#2-5. Avoid "string"s in favor of PACs. A PAC is a Packed Array of Char with a lower bound of 1.

#2-6. Use Pascal/iX extensions when possible, unless portability is a primary goal.

#2-7. The "try/recover" construct in Pascal/iX is very useful for catching errors: both unexpected and deliberately caused. (try/recover is an error catching mechanism somewhat similar to catch/throw found in some other languages.)

#2-8. Use $type_coercion 'representation'$. Never use the noncompatible level of type coercion.

#2-9. The "anyvar" parameter type is useful. Consider using it when you want to pass different types of variables to one procedure.

#2-10. The "uncheckable_anyvar" option for a procedure should be used whenever "anyvar" parameters are declared, unless you specifically want Pascal/iX to pass a hidden "actual size" parameter.

#2-11. When using "anyvar", be sure that the formal parameter type matches the alignment restrictions of the expected actual types. I.e.: if the formal parameter is declared as an "integer", then Pascal/iX will assume that all addresses passed into that parameter are a multiple of 4. Use "char" (or another byte-aligned type) as the formal parameter type if you want to pass any kind of addresses safely.

#2-12. The "default_parms" procedure option should be considered as a means of making long actual parameter lists shorter (for ordinary cases).

Commentary:

#2-1. Sometimes, I return a quick overall result with a "good_failed_type", and a detailed error in a status parameter. Example:

   function open_files (var status : hpe_status) : good_failed_type;
   ...
   if open_files (status) = failed then
      report_status (status);

#2-2. Pascal provides a "with" statement that can, in some instances, provide the compiler with a hint on how to optimize the instructions it emits for your code. Additionally, a "with" statement can save subsequent typing.

An example of a useless "with" is:

   var
      ptr : hpe_status_ptr_type;
   ...
   with ptr^ do
      begin
      hs_info := 0;
      hs_subsys := 0;
      end;

This is "useless" because the compiler & optimizer would probably have done just as good a job of emitting optimum code if we had said:

      ptr^.hs_info := 0;
      ptr^.hs_subsys := 0;

Note that the code took five lines using a "with" statement, and two lines without it.

Finally, the fact that "hs_info" and "hs_subsys" are actually fields of the record pointed to by "ptr" is somewhat obscured when the "with" is used.

An example of a useful "with" is:

   var
      statuses : array [0..9] of hpe_status_type;
   ...
   with statuses [k] do       {optimize hs_@ fields}
      begin
      hs_info := 0;
      hs_subsys := 0;
      end;

I consider this example "useful" because the optimizer would have had more work trying to compile optimum code for the equivalent non-with statements:

   statuses [k].hs_info := 0;
   statuses [k].hs_subsys := 0;

In the above examples, it is tempting to align the ":="s with the right-most ":=" of the block of assignment statements. I used to often do this when I had four or more similar assignments in a row.

The advantage is increased readability because we make it obvious that the data is related (because of the aligned ":="s). The disadvantage is that a simple QEDIT command designed to list where the "hs_info" field is changed (e.g.: LIST "hs_info :=") will fail. I found that the search-for-assignments capability outweighed the data-is-related benefit, for me.

#2-3. When a "repeat" loop is encountered, the reader won't know what the termination condition is until many more lines of program source are read. This means that he or she will not be able to check for proper setup of the termination condition. A "while" loop avoids this problem because the termination condition is clearly specified at the top of the loop.

A repeat loop can usually be changed into a while loop easily:

before:

   repeat
      begin
      ...
      end
   until
      buf [inx] = 0;

after:

   done := false;
   while not done do
      begin
      ...
      done := (buf [inx] = 0);
      end;          {while not done}
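
The conversion can be sketched as a complete program. This is only an illustration in standard-style Pascal: "find_zero", "buf_type", and "max_buf" are hypothetical names, and the bounds check on "inx" is my addition, not part of the fragments above.

```pascal
program while_demo;

const
   max_buf = 8;

type
   buf_type = array [1..max_buf] of integer;

var
   buf : buf_type;
   k   : integer;

function find_zero (var b : buf_type) : integer;
      {Purpose: return index of first 0 in b, or 0 if none}
   var
      done : boolean;
      inx  : integer;
   begin
   find_zero := 0;
   inx := 1;
   done := false;
   while not done do
      begin
      if b [inx] = 0 then
         begin
         find_zero := inx;
         done := true;
         end
      else if inx = max_buf then
         done := true                {ran off the end: give up}
      else
         inx := inx + 1;
      end;          {while not done}
   end {find_zero func};

begin
for k := 1 to max_buf do
   buf [k] := k;
buf [5] := 0;
writeln (find_zero (buf));           {first zero is at index 5}
end.
```

The reader sees "done" being set up immediately before the loop, and every way of leaving the loop goes through that one variable.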

#2-4. A "non-local escape" costs thousands of cycles of CPU time to execute. In short, never plan on using this construct as a normal method of returning from a procedure. Examples:

bad:

   procedure do_work;         {note:  no status parameter!}
      ...
      if problem then
         escape (i_failed);
      ...

good:

   procedure do_work (var status : hpe_status_type);
      label
         999;
      ...
      if problem then
         begin
         status := my_failure_status;
         goto 999;            {exit}
         end;
      ...
   999:

      end {do_work proc};

#2-5. Strings hide an immense amount of slow and somewhat incorrect compiler-generated code. String concatenation, in particular, can result in "memory leaks" where your process runs out of heap space. PACs are messier to deal with, but much more efficient. This is a performance versus esthetics trade off.
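
As a rough illustration of the "messier" bookkeeping, here is a minimal sketch of a PAC whose current length is kept in a separate variable. Keeping the length alongside the PAC is one common convention, assumed here; "pac20", "append_char", and "max_len" are hypothetical names.

```pascal
program pac_demo;

const
   max_len = 20;

type
   pac20 = packed array [1..max_len] of char;

var
   line     : pac20;
   line_len : integer;
   k        : integer;

procedure append_char (var buf : pac20;
                       var len : integer;
                           ch  : char);
      {Purpose: append ch to buf if room remains}
   begin
   if len < max_len then
      begin
      len := len + 1;
      buf [len] := ch;
      end;
   end {append_char proc};

begin
line_len := 0;
append_char (line, line_len, 'O');
append_char (line, line_len, 'K');

for k := 1 to line_len do            {print only the used part}
   write (line [k]);
writeln;
end.
```

Nothing here touches the heap: the PAC is a fixed-size variable, and the only cost of an append is one comparison and one store.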

#2-6. Pascal/iX is a very useful language precisely because it has a large body of extensions to standard Pascal. If you eschew using them, you would be better off programming in ANSI C or C++.

#2-7. "Try/recover" is not guaranteed to catch every error that occurs within it. Unexpected errors (e.g.: invalid index, bad virtual address) invoke an operating system routine called trap_handler which will "walk" back through your stack markers looking for the most recent "try/recover" block. This "walk" can fail if your stack is corrupted, and the "try/recover" will not be found. If this happens, and if an appropriate trap handler (e.g.: XCODETRAP) has not been armed, your process will be aborted.

#2-8. Type coercion is one of the best extensions in Pascal/iX. It provides a controlled way of overriding Pascal's type checking. The $type_coercion directive tells Pascal/iX what level of type coercion you want to allow in your program. About five different levels exist. The level I strongly recommend is representation. This level allows an expression of one type to be coerced into (treated as) another type if, and only if, the two types are exactly the same size (in units of bits, not bytes).

The noncompatible level tells Pascal/iX that there should be no restrictions on type coercion. This leads to interesting bugs in programs. Some MPE/iX system crashes can be traced to using this kind of type coercion incorrectly. The following example demonstrates how noncompatible can hide errors.

assuming:

   var
      big_ptr     : globalanyptr;
      my_address  : integer;
      small_ptr   : localanyptr;

bad:

   $type_coercion 'noncompatible'$
   my_address := integer (big_ptr);   {will get half of data!}
   my_address := integer (small_ptr); {will get 32 bit value}

good:

   $type_coercion 'representation'$
   my_address := integer (big_ptr);   {will get syntax error}
   my_address := integer (small_ptr); {will get 32 bit value}

In the bad example, the coercion of big_ptr results in setting my_address to the upper 32 bits of big_ptr (i.e.: the space id), silently losing the bottom 32 bits of the address. In the good example, Pascal/iX will generate a syntax error on the attempt to coerce a 64-bit expression (big_ptr) into a 32-bit value (integer).

#2-9. "anyvar" is a Pascal/iX extension of "var". When a formal parameter is declared as "anyvar", the compiler allows any variable to be passed as the actual parameter. Without such a feature, and without an object-oriented Pascal, you couldn't write a single procedure that would zero (erase) an arbitrary variable (see below for example).

By default, when a parameter is declared as anyvar, Pascal/iX will pass in the address of the actual parameter *and* a hidden integer-by-value which records the size of the actual parameter. The following table shows what is passed for a formal parameter like: "anyvar foo : integer", and the effect on a "sizeof (foo)" within the procedure:

   Actual parameter type           Hidden size field   sizeof (foo)
   char                            1                   1
   shortint                        2                   2
   integer                         4                   4
   longint                         8                   8
   real                            4                   4
   longreal                        8                   8
   hpe_status (see #1-24)          4                   4
   packed array [1..80] of char    80                  80

#2-10. I use "anyvar" fairly often. One example is:

   procedure zero_var (anyvar foo : char);
         {Purpose: zero every byte in the parameter}
      var
         bytes_left           : integer;
         byte_ptr             : ^char;

      begin

   $push, range off$
      byte_ptr := addr (foo);
      bytes_left := sizeof (foo);  {Note: gets actual size!}

      while bytes_left > 0 do
         begin                     {zero one byte}
         byte_ptr^ := chr (0);
         bytes_left := bytes_left - 1;
         byte_ptr := addtopointer (byte_ptr, 1);
         end;
   $pop$          {range}

      end {zero_var proc};

Note the comment on the $pop$: it lets me recall which options the $pop$ is supposedly restoring.

In Pascal/iX, $push$ saves the state of most compiler options, and $pop$ restores them. Thus, $push, range off$ ... $pop$ temporarily turns off the "range" option, and then restores it to the old state ... which is significantly different from simply turning it on when "done"!

Of course, Pascal/iX allows an even faster way of zeroing a variable, which happens to work well with checkable anyvar parameters. The entire code of the above procedure (between "begin" and "end") can be replaced by:

   fast_fill (addr (foo), 0, sizeof (foo));

#2-12. The following example, a procedure that most users would call with "print_with_cr" as the second parameter, shows the usefulness of "default_parms":

   const
      print_with_cr      = 0;
      print_without_cr   = 1;

   procedure print_msg (
                  msg                  : str80;
                  cr_or_no_cr          : integer)
         option default_parms (
                  cr_or_no_cr          := print_with_cr);

      var
         cctl_val   : shortint;

      begin

      if cr_or_no_cr = print_without_cr then
         cctl_val := octal ('320')
      else
         cctl_val := 0;

      print (msg, -strlen (msg), cctl_val);
            {Note: ignoring errors from print intrinsic}

      end {print_msg};
   ...
   print_msg ('Starting...', print_without_cr);
   print_msg ('Ending');               {does a CR/LF at end}

Note that I would not use "default_parms" for a parameter that is omitted less than about 75% of the time.



3. Performance Problems

The two biggest performance problems in Pascal/iX programs are using the built-in I/O, and using strings.

Guidelines:

#3-1. Avoid Pascal I/O. Use intrinsics instead.

#3-2. Avoid Pascal I/O. Use intrinsics instead. This is worth saying twice!

#3-3. Avoid strings in performance critical areas.

#3-4. Turn off range checking ($range off$) only when you are sure your program is running correctly.

#3-5. Use the Pascal/iX optimizer ($optimize on$).

Commentary:

#3-1. If you encapsulate your I/O calls, then their underlying implementation can be easily changed to use MPE intrinsics. This also aids portability across operating systems and across languages. The "print_msg" procedure in commentary #2-12 is an example.

The second reason for avoiding Pascal I/O constructs is efficiency. Pascal/iX I/O routines are extremely inefficient. This guideline is valid for most languages.
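
The encapsulation idea can be sketched in a few lines. This is a minimal illustration in standard-style Pascal, not the paper's own code: "put_line" and "msg_type" are hypothetical names, and the body uses ordinary Pascal output only so that the sketch compiles anywhere.

```pascal
program io_demo;

const
   max_msg = 8;

type
   msg_type = packed array [1..max_msg] of char;

var
   msg : msg_type;

procedure put_line (var buf : msg_type; len : integer);
      {Purpose: the only place in the program that does output}
   var
      k : integer;
   begin
   for k := 1 to len do
      write (buf [k]);     {body could later call an intrinsic}
   writeln;
   end {put_line proc};

begin
msg := 'Starting';         {literal is exactly max_msg chars}
put_line (msg, max_msg);
end.
```

Because every caller goes through "put_line", replacing the write/writeln pair with an intrinsic call later means changing one procedure body rather than every output statement in the program.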

#3-3. String expressions cause the compiler to emit a lot of calls to "helper" routines. Instead of allocating a single work area in your stack, these routines allocate (and deallocate) many work areas on your heap. This can be an expensive activity, and can lead to loss of heap space, and eventual process termination.

#3-5. If your program runs correctly unoptimized, and has a problem optimized, then it probably has one (or more) uninitialized variables. The second most common problem is using pointers in a way that the optimizer doesn't expect (especially when accessing local variables via pointers).



4. Portability

The portability of programs written in Pascal/iX can be enhanced with several techniques. Keep in mind, though, that most other Pascal implementations are not as rich as Pascal/iX. Delphi and Turbo Pascal (on IBM PC compatibles) provide some of the same features as Pascal/iX.

#4-1. Avoid these Pascal/iX extensions: extensible, readonly, anyvar, option, uncheckable_anyvar, default_parms, globalanyptr.

#4-2. Avoid most $ directives (e.g.: $extnaddr$).

#4-3. Use type coercion only for types of identical sizes. (A good tip even if you never intend to port your code!) Most PC-based Pascals have a form of type coercion. They may refer to it as "type casting".

#4-4. Avoid "crunched" records. Even most C languages do not have a functional equivalent. This includes avoiding "$HP3000_16$".

#4-5. Encapsulate "strange" constructs where possible.

#4-6. Encapsulate I/O calls. This is not a language portability issue so much as an operating system and/or performance issue.

#4-7. Keep your source code lines short (72 characters or less per line).

#4-8. Use user-defined types like "int16" and "int32" instead of "shortint" or "integer". Note: this is extremely important for C programmers!

#4-9. Be aware that fields in records may be packed differently on different machines.

#4-10. Standard Pascal does not allow pointers to point to variables. (They can only point into the heap.)
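
As an illustration of guideline #4-8, here is a minimal sketch that declares the sized types in one place. In Pascal/iX the two declarations could simply be aliases for "shortint" and "integer"; the subrange form below is an assumption chosen so that the sketch also compiles where "integer" is only 16 bits (as in Turbo Pascal). The "int32" range deliberately omits -2147483648 so the literal stays legal on compilers whose largest integer is 32 bits.

```pascal
program types_demo;

type
   int16 = -32768..32767;              {16-bit signed range}
   int32 = -2147483647..2147483647;    {32-bit range, see note}

var
   small_count : int16;
   big_count   : int32;

begin
small_count := 32767;
big_count := 100000;                   {too big for 16 bits}
big_count := big_count + small_count;
writeln (big_count);                   {132767}
end.
```

When the code is ported, only the two type declarations need to be revisited; every variable declared with "int16" or "int32" follows along.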



5. Summary

The shortest summary of this paper is probably: you can judge a book by its cover. If a program looks nice, it probably is nice.
