Coding Style

December 31, 2006

by
James R. Holten III, PhD
jholten@nmt.edu



I. Overview

Coding standards give programmers common conventions for writing their source code. They ensure that common techniques are in use and that code from numerous developers has some commonality in source code appearance, variable and function naming conventions, logic flow, and comments.

This paper presents a basic coding standard that has been used in numerous commercial projects and has its roots in speed of comprehension. A reader can easily see the structure of the code and quickly understand the programmer's intent by reading the source code.

Since C, C++, Perl, and Java are all structured languages with many common conventions and syntax commonalities, the standards given here should be applied to all four languages.

A. Background

Programmers have historically spanned the extremes from minimalist to excessively formal in the generation of their source code. The first extreme generates an encrypted collection of files and names that are often not maintainable by later programmers, or even by the same programmer after a short period of working on other code. The main contributors to code readability are "prettyprint" and associated conventions, naming of variables and functions, and comment usage and content.

1. Prettyprint

"Prettyprint" was created to reflect the logical structure of the code in the indentation of the code. This greatly enhanced the speed with which a reviewer or code maintainer could comprehend the logical structure of the code.

Kernighan and Ritchie [1] (aka K&R), via their examples in "The C Programming Language", presented a number of conventions that have been religiously followed by many programmers and have carried over into the use of other languages such as C++, Java, and Perl, to name just a few. These conventions included such awkwardnesses as the placement of the block-defining curly brackets, "{" and "}", in positions that made the start and end of the defined block difficult to identify. By placing the starting "{" on the end of the block "precursor" line (a conditional such as "if", "else", or "else if", or a looping command such as "while", "do", or "for"), the entire, often complex, conditional or looping control had to be scanned to find whether there was a block start or whether, with the "{" left out, the conditional or loop applied only to the next statement.

K&R also gave repeated examples of conditionals and loops whose single statement (in lieu of a block) followed on the same line, again forcing the reader to scan the entire line to see the logical structure.

Additionally there are many examples in K&R of comments which follow the code on the same line. This further obscures the actual code and code structure from the casual reader of the code.

These conventions, presented in examples by K&R, have propagated to the numerous languages that have similar structure and use the "{" and "}" block delimiters, and they contribute to the difficulty encountered in reading much of the code written today.
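
As an illustration of the habits described above, the following sketch is written in that style. It is not quoted from K&R; the function name CountPositive and its arguments are hypothetical, chosen only to show the opening bracket at the end of the control line, a single statement sharing the line with its conditional, and comments trailing the code.

```c
#include <stddef.h>

/* Hypothetical example in the style criticized above; not taken from K&R. */
int CountPositive(const int *values, size_t count) {
    int total = 0;
    size_t i;

    if (values == NULL) return 0;        /* statement shares the line */
    for (i = 0; i < count; i++) {        /* "{" sits at the end of the line */
        if (values[i] > 0) total++;      /* no block, same-line statement */
    }
    return total;
}
```

To find where the loop body starts and ends, the reader must scan to the end of the "for" line and then search for the matching "}".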

Some software packages were created to allow the automatic reformatting of lines of code to more readable forms. One of these was the UNIX utility "cb", meaning "C beautify", and several others are now available as open source on the web.

2. Naming Conventions

Use reasonable and relevant names.

3. Comment Usage and Content

Use comments where appropriate to provide a higher level view of what the code is supposed to be achieving. They should be indented to the level of the code to which they apply, and should contain complete thoughts, not just cryptic reminders, as others do not always share your context when reading your code.

B. Approach

This paper defines conventions to be followed for the clarity and maintainability of code in applicable projects, covering the subjects of prettyprint, naming conventions, and comments.

II. Prettyprint

Prettyprint includes several aspects of the appearance of the code. This appearance control includes:

  1. Code structure indentation,
  2. Placement of curly brackets for code blocks, and
  3. Placement of comments within the source code.

A. Indentation

Since each of the applicable languages is structured, the structure of the code logic may be easily mapped to indented subordinate blocks.

Indentation will NOT use tabs, as these may be set differently on each individual's source editor or viewer, often rendering the code structure indistinguishable when tabbed and non-tabbed lines are intermixed and viewed in editor configurations other than the original.
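
One way to check this rule mechanically is to scan the leading whitespace of each source line for a tab character. The following is a minimal sketch; the function name HasTabIndent is hypothetical and is not part of any existing tool.

```c
#include <stddef.h>

/*
* Function: HasTabIndent
*
* Description:  Returns 1 if the leading whitespace of the line
*               contains a tab character, and 0 otherwise.
*/
int HasTabIndent(const char *line)
   {
   int found_tab = 0;
   size_t index = 0;

   if (line != NULL)
      {
      /*
      *  Walk only the leading whitespace and stop at the first
      *  character that is neither a blank nor a tab.
      */
      while (line[index] == ' ' || line[index] == '\t')
         {
         if (line[index] == '\t')
            {
            found_tab = 1;
            }
         index++;
         }
      }
   return(found_tab);
   }
```

Tabs that appear after the first non-blank character are ignored here, since only the indentation is of concern.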

Each level of contained structure will be indented by three blanks beyond the indentation of the containing block. An example is:

int FunctionCall1(int dividend,
                  int divisor,
                  int *quotient,
                  int *remainder)
   {
   int error_flag = FALSE;

   if (quotient == NULL)
      {
      error_flag = TRUE;
      }
   else if (remainder == NULL)
      {
      error_flag = TRUE;
      }
   else if (divisor == 0)
      {
      error_flag = TRUE;
      }
   else
      {
      *quotient = dividend / divisor;
      *remainder = dividend % divisor;
      }
   return(error_flag);
   }

Similar indentation may be used in structure and class definitions:

typedef struct _MYSTRUCT   MYSTRUCT;
struct _MYSTRUCT
   {
   int first_value;
   float next_value;
   MYSTRUCT *prev;
   MYSTRUCT *next;
   };

B. Code Blocks and Curly Braces

In the applicable languages there are common and similar constructs for conditional execution of code blocks, looping over code blocks, and code blocks contained within functions. For each of these constructs, if the included code block contains one or more lines of code, then the code block should be enclosed in curly brackets.

The opening curly bracket should be placed on the first line after the preceding code line, and should be indented properly as the first line of code for the included block.

The closing curly bracket should be the first line following the last line of the included block and be indented the same as the included block. An example follows:

   if (do_block)
      {
      ExecuteFunctionCall();
      }
   else
      {
      do_other_code = 1;
      total += next_value;
      }
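
The same bracket placement applies to the looping constructs mentioned above. The following sketch assumes two hypothetical functions, SumValues and CountDigits; it shows a "for" loop and a "while" loop, each with its block bracketed even when the block is a single line.

```c
#include <stddef.h>

int SumValues(const int *values, size_t count)
   {
   int total = 0;
   size_t index;

   if (values != NULL)
      {
      for (index = 0; index < count; index++)
         {
         total += values[index];
         }
      }
   return(total);
   }

int CountDigits(unsigned int number)
   {
   int digits = 1;

   while (number >= 10)
      {
      number /= 10;
      digits++;
      }
   return(digits);
   }
```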

C. Comment Placement

Comments should start in the columns in which the current code block starts. File headers start in column 1, function and data structure declaration headers in column 1, and in-line code comments in the same column as the line of code which immediately follows the comment.

Only data declaration comments may be placed after the code and on the same line, and they should be separated from the code by an obvious blank interval.

Examples of these follow:

/*
* File: blah.c
*
* Description:  This file defines a function for the fun of it.
*/

#include <stdio.h>

/*
* Function: BlahBlah
*
* Description:  It outputs a nonsense message.
*
*/
void BlahBlah(char *message)
   {
   /*
   *  Test the message first, and only output it if it is okay.
   */
   if (message)
      {
      /*
      *  Write out the message.
      */
      fprintf(stderr, "%s\n", message);
      }
   else
      {
      /*
      *  Write out the default message.
      */
      fprintf(stderr, "Blah Blah Blah\n");
      }
   fflush(stderr);
   }
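
Data declaration comments are the one form not shown in the example above. A minimal sketch follows, assuming a hypothetical function AverageValue. Each trailing comment is separated from its declaration by a run of blanks; lining the comments up in one column is a choice made here, not a requirement stated above.

```c
#include <stddef.h>

/*
* Function: AverageValue
*
* Description:  Computes the integer average of an array of values.
*               Returns 0 if the array is missing or empty.
*/
int AverageValue(const int *values, size_t count)
   {
   long total = 0;                /* Running sum of the values.      */
   size_t index;                  /* Position in the values array.   */
   int average = 0;               /* Result returned to the caller.  */

   if (values != NULL && count > 0)
      {
      for (index = 0; index < count; index++)
         {
         total += values[index];
         }
      average = (int)(total / (long)count);
      }
   return(average);
   }
```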

III. Naming

Use reasonable and relevant names.

IV. Comment Contents

There are four forms of comments to be used:

  1. File header,
  2. Function or data structure header,
  3. Data declaration line(s) comment, and
  4. Code line(s) comment.

Comment contents should reflect a relevant level of abstraction of what the code is intended to perform. An example is:

   /*
   *  Find the insert point in the ordered linked list.
   *  This has no executable block, only the looping.
   */
   for (element = list->first;
        element != NULL && value <= element->value;
        element = element->next);
   /*
   *  See if the point was found.
   */
   if (element)
      {
      /*
      *  Insert before this element.
      */
      new_element->next = element;
      if (element->prev)
         {
         /*
         *  Just insert between the element and its predecessor.
         */
         new_element->prev = element->prev;
         element->prev->next = new_element;
         }
      else
         {
         /*
         *  Insert it at the head of the list.
         */
         new_element->prev = NULL;
         list->first = new_element;
         }
      element->prev = new_element;
      }
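   else
      {
      /*
      *  It is after the last element (if any).
      */
      new_element->next = NULL;
      new_element->prev = list->last;
      if (list->last)
         {
         /*
         *  There was a "last" element on the list.
         */
         list->last->next = new_element;
         }
      else
         {
         /*
         *  The list was empty, so this is also the first element.
         */
         list->first = new_element;
         }
      list->last = new_element;
      }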
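
The four comment forms listed at the start of this section can be seen together in one small file. The following is a sketch only; the file name, the ELEMENT type, and the ListFind function are hypothetical and were chosen to resemble the linked list used above.

```c
/*
* File: listfind.c
*
* Description:  This file defines a lookup function for a doubly
*               linked list. All names here are hypothetical.
*/

#include <stddef.h>

/*
* Structure: ELEMENT
*
* Description:  One node of the doubly linked list.
*/
typedef struct ELEMENT_S   ELEMENT;
struct ELEMENT_S
   {
   int value;                /* Value stored in this node.        */
   ELEMENT *prev;            /* Predecessor, or NULL at the head. */
   ELEMENT *next;            /* Successor, or NULL at the tail.   */
   };

/*
* Function: ListFind
*
* Description:  Returns the first element holding the given value,
*               or NULL if no element holds it.
*/
ELEMENT *ListFind(ELEMENT *first, int value)
   {
   ELEMENT *element;         /* Current position in the list.     */

   /*
   *  Walk the list until a match is found or the list ends.
   */
   element = first;
   while (element != NULL && element->value != value)
      {
      element = element->next;
      }
   return(element);
   }
```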