"C" Coding Style

January 18, 2009

by
James R. Holten III
jholten@nmt.edu



I. Overview

Coding standards give programmers common conventions to follow when writing source code. They ensure that common techniques are in use and that code from numerous developers has some commonality in appearance, variable and function naming conventions, logic flow, and comments.

This paper presents a basic coding standard that has been used in numerous commercial projects and has its roots in speed of comprehension. A reader can easily see the structure of the code and quickly understand the programmer's intent by reading the source code.

The standards given here may be applied to C, C++, Java, and Perl with very few modifications, some of which are noted below. This document is intended mainly to prescribe C coding standards.

A. Background

Programmers have historically spanned the extremes from minimalist to excessively formal in the generation of their source code. The first extreme generates an encrypted collection of files and names that are often not maintainable by later programmers, or even by the same programmer after a short period of working on other code. The main contributors to code readability are "prettyprint" and associated conventions, naming of variables and functions, and comment usage and content.

1. Prettyprint

"Prettyprint" was created to reflect the logical structure of the code in the indentation of the code. This greatly enhanced the speed with which a reviewer or code maintainer could comprehend the logical structure of the code.

Kernighan and Ritchie [1] (aka K&R), via their examples in "The C Programming Language", presented a number of conventions that have been religiously followed by many programmers and have carried over into other languages such as C++, Java, and Perl, to name just a few. These conventions included such awkwardnesses as placing the block-defining curly brackets, "{" and "}", in positions that made the start and end of the defined block difficult to identify. With the starting "{" placed at the end of the block "precursor" line (a conditional such as "if", "else", or "else if", or a looping command such as "while", "do", or "for"), the reader had to scan the entire, often complex, conditional or looping control to find out whether a block started there or whether, the "{" having been left out, the conditional or loop applied only to the next statement.

K&R also gave repeated examples of conditionals and loops whose single statement (in lieu of a block) followed on the same line, again forcing the reader to scan the entire line to see the logical structure.

Additionally there are many examples in K&R of comments which follow the code on the same line. This further obscures the actual code and code structure from the casual reader of the code.

These conventions, presented in examples by K&R, have propagated to the numerous languages that have similar structure and use the "{" and "}" block delimiters, and they contribute to the difficulty encountered in reading much of the code written today.

Some software packages were created to allow the automatic reformatting of lines of code to more readable forms. One of these was the UNIX utility "cb", meaning "C beautify", and several others are now available as open source on the web.

2. Naming Conventions

Much of the clarity of what code is intended to do can be controlled by using relevant names for variables, classes, functions, and files. Capitalization or segmentation via underscores can be used to make the individual words easily distinguishable. Similarly, specific capitalization or segmentation conventions may be used to identify whether the name is a variable, a class, or a function (or method).

3. Comment Usage and Content

Comments are the text which sets the context for what the code is supposed to be doing. Properly commented code can enhance the readability of the code, or, as with Javadoc, allow automatic generation of code usage documents.

B. Approach

This paper defines conventions to be followed for the clarity and maintainability of code on applicable projects, covering the subjects of prettyprint, naming conventions, and comments.

II. Prettyprint

Prettyprint includes several aspects of the appearance of the code. This appearance control includes:

  1. Code structure indentation,
  2. Placement of curly brackets for code blocks, and
  3. Placement of comments within the source code.

A. Indentation

Since each of the applicable languages is block structured, the structure of the code logic may be easily mapped to indented subordinate blocks.

Indentation will NOT use tabs, as these may be set differently on each individual's source editor or viewer, often rendering the code structure indistinguishable when tabbed and non-tabbed lines are intermixed and viewed in editor configurations other than the original.

Each level of contained structure will be indented by three blanks beyond the indentation of the containing block. An example is:

int FunctionCall1(int dividend,
                  int divisor,
                  int *quotient,
                  int *remainder)
   {
   int error_flag = FALSE;

   if (quotient == NULL)
      {
      error_flag = TRUE;
      }
   else if (remainder == NULL)
      {
      error_flag = TRUE;
      }
   else if (divisor == 0)
      {
      error_flag = TRUE;
      }
   else
      {
      *quotient = dividend / divisor;
      *remainder = dividend % divisor;
      }
   return(error_flag);
   }

Similar indentation may be used in structure and class definitions:

typedef struct _MYSTRUCT   MYSTRUCT;
struct _MYSTRUCT
   {
   int first_value;
   float next_value;
   MYSTRUCT *prev;
   MYSTRUCT *next;
   };

B. Code Blocks and Curly Braces

In the applicable languages there are common and similar constructs for conditional execution of code blocks, looping over code blocks, and code blocks contained within functions. For each of these constructs if the included code block contains one or more lines of code then the code block should be enclosed in curly brackets.

The opening curly bracket should be placed on the first line after the preceding code line, and should be indented properly as the first line of code for the included block.

The closing curly bracket should be the first line following the last line of the included block and be indented the same as the included block. An example follows:

   if (do_block)
      {
      ExecuteFunctionCall();
      }
   else
      {
      do_other_code = 1;
      total += next_value;
      }
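
The same bracket placement carries over to the looping constructs mentioned above. The following is a minimal sketch only; the function names and parameters are hypothetical, and the placement of the trailing "while" of a "do" loop at the level of the "do" is an assumption, as this standard does not state it.

```c
/*
* Function: SumUntilNegative (hypothetical, for illustration only)
*
* Description:  Totals the values that precede the first negative value.
*/
int SumUntilNegative(const int *values, int count)
   {
   int total = 0;
   int index = 0;

   while (index < count && values[index] >= 0)
      {
      total += values[index];
      index++;
      }
   return(total);
   }

/*
* Function: CountEven (hypothetical, for illustration only)
*
* Description:  Counts the even values in the array.
*/
int CountEven(const int *values, int count)
   {
   int even_count = 0;
   int index;

   for (index = 0; index < count; index++)
      {
      if ((values[index] % 2) == 0)
         {
         even_count++;
         }
      }
   return(even_count);
   }

/*
* Function: CountDigits (hypothetical, for illustration only)
*
* Description:  Counts the decimal digits needed to print the value.
*/
int CountDigits(unsigned int value)
   {
   int digit_count = 0;

   do
      {
      digit_count++;
      value /= 10;
      }
   while (value != 0);
   return(digit_count);
   }
```

In each case the block is bracketed even when it holds a single statement, so the extent of the loop can be read from the indentation alone.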

C. Comment Placement

Comments should start in the columns in which the current code block starts. File headers start in column 1, function and data structure declaration headers in column 1, and in-line code comments in the same column as the line of code which immediately follows the comment.

Only data declaration comments may be placed after the code and on the same line, and they should be separated from the code by an obvious blank interval.

Examples of these follow:

/*
* File: blah.c
*
* Description:  This file defines a function for the fun of it.
*/

#include <stdio.h>

/*
* Function: BlahBlah
*
* Description:  It outputs a nonsense message.
*
*/
void BlahBlah(char *message)
   {
   /*
   *  Test the message first, and only output it if it is okay.
   */
   if (message)
      {
      /*
      *  Write out the message.
      */
      fprintf(stderr, "%s\n", message);
      }
   else
      {
      /*
      *  Write out the default message.
      */
      fprintf(stderr, "Blah Blah Blah\n");
      }
   fflush(stderr);
   }
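
The example above does not include a data declaration comment. A minimal sketch of that form follows; the structure, field, and function names are hypothetical, and the column chosen for the trailing comments is only one way to provide the "obvious blank interval".

```c
#include <stddef.h>

/*
* Structure: _SAMPLE_POINT (hypothetical, for illustration only)
*
* Description:  One measured sample and its list link.
*/
typedef struct _SAMPLE_POINT SAMPLE_POINT;
struct _SAMPLE_POINT
   {
   int sample_id;                  /* Unique identifier of the sample.   */
   double sample_value;            /* Measured value, arbitrary units.   */
   SAMPLE_POINT *next;             /* Next sample in the list, or NULL.  */
   };

/*
* Function: SamplePointInit (hypothetical, for illustration only)
*
* Description:  Sets a sample to a known starting state.
*/
void SamplePointInit(SAMPLE_POINT *point, int sample_id)
   {
   double default_value = 0.0;     /* Value given to a new sample.       */

   /*
   *  Only fill in the sample if one was supplied.
   */
   if (point != NULL)
      {
      point->sample_id = sample_id;
      point->sample_value = default_value;
      point->next = NULL;
      }
   }
```

Only the declarations carry same-line comments; the executable lines are still commented on the lines before them.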

III. Naming

There are five main categories of things to be named in code:

  1. variables,
  2. classes (or structs),
  3. functions (or methods),
  4. macros (#define constants included), and
  5. file or directory names.

There are many permutations of how these are named, depending on whose preferences are driving the naming. To make a specific project consistent, and therefore the code easily understood by all on the project, it is useful to settle on one combination.

variables

These shall always be lower case, with words separated by underscores.

   my_index
   set_list
   graph_node
   current_graph_node

classes or structs

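These shall always be upper case, with words separated by underscores. For structs, it is useful to give each struct tag a name starting with an underscore, and then provide a typedef that gives it the same name without the underscore. Declaring the typedef before the struct body allows the struct to refer to itself (as in list links) by the typedef name, which would otherwise not yet be defined inside the struct's own definition.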

An example for a "C" structure would be a self-referential list element and the list header structure.

   typedef struct _LIST_HEAD LIST_HEAD;
   typedef struct _LIST_ELEMENT LIST_ELEMENT;

   struct _LIST_ELEMENT
      {
      char *element_value;
      LIST_ELEMENT *prev;
      LIST_ELEMENT *next;
      };
   struct _LIST_HEAD
      {
      LIST_ELEMENT *first;
      LIST_ELEMENT *last;
      };

This also gives the programmer a way to put a list of the typedefs in the start of a header file for easy reference to what is defined in the header later.

In both C and C++ programmers often define structures and classes in header (.h) files for inclusion in multiple code (.c) files. This allows the code that implements the class or structure methods to live in some code files and the code that uses them to live in others. By including the same header files they share the same structure or class definitions.

The capitalization is a clue that the name references something uniquely defined in a common header file.
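
One possible layout of such a header and a code file that uses it is sketched below. The file names "list.h" and "list.c", the include guard LIST_H, and the function ListIsEmpty are all hypothetical, and the two files are shown together so that the sketch compiles as a single file.

```c
/*
* File: list.h (hypothetical name)
*
* Description:  Shared list type definitions.
*/
#ifndef LIST_H
#define LIST_H

/*
*  Types defined in this header, listed first for easy reference.
*/
typedef struct _LIST_HEAD LIST_HEAD;
typedef struct _LIST_ELEMENT LIST_ELEMENT;

struct _LIST_ELEMENT
   {
   char *element_value;
   LIST_ELEMENT *prev;
   LIST_ELEMENT *next;
   };
struct _LIST_HEAD
   {
   LIST_ELEMENT *first;
   LIST_ELEMENT *last;
   };

int ListIsEmpty(const LIST_HEAD *list);

#endif

/*
* File: list.c (hypothetical name)
*
* Description:  In a real project this file would start with
*               #include "list.h" instead of following it directly.
*/
#include <stddef.h>

int ListIsEmpty(const LIST_HEAD *list)
   {
   return(list == NULL || list->first == NULL);
   }
```

Any other code file that includes the same header sees the same LIST_HEAD and LIST_ELEMENT definitions and the same ListIsEmpty declaration.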

functions

These shall always be mixed case, with each word capitalized. Some variation may be included by not capitalizing the first word, especially when it is an acronym for the module in which the function belongs.

   ListElementGetValue
   InvertMatrix
   srfGetSetList
   utilInsertElement
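
A short sketch of these function names in use alongside lower case variable names follows. Both functions are hypothetical, and "util" stands in for an assumed module acronym.

```c
/*
* Function: ClampValue (hypothetical, for illustration only)
*
* Description:  Limits the input value to the given bounds.
*/
int ClampValue(int input_value, int lower_bound, int upper_bound)
   {
   int clamped_value = input_value;

   if (clamped_value < lower_bound)
      {
      clamped_value = lower_bound;
      }
   else if (clamped_value > upper_bound)
      {
      clamped_value = upper_bound;
      }
   return(clamped_value);
   }

/*
* Function: utilClampPercent (hypothetical, for illustration only)
*
* Description:  Limits a percentage to the range 0 through 100.
*/
int utilClampPercent(int percent_value)
   {
   return(ClampValue(percent_value, 0, 100));
   }
```

The mixed case marks the function calls, while the underscores mark the variables, so the two are not confused when reading an expression.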

macros

These shall always be upper case, with words separated by underscores.

In both C and C++ the programmer can define constants and macros to be substituted during a preprocessing step before compiling. These are often defined in header (.h) files for inclusion in multiple code (.c) files. They are defined using the "#define" construct.


Definitions in header files:

   #define PI_SQUARED (M_PI * M_PI)
   #define NULL 0

   #define ELEM_REMOVE(head, element)                                         \
      {                                                                       \
      if (head == NULL)                                                       \
         {                                                                    \
         fprintf(stderr, "NULL header pointer.");                             \
         }                                                                    \
      else if (element == NULL)                                               \
         {                                                                    \
         fprintf(stderr, "NULL element pointer.");                            \
         }                                                                    \
      else                                                                    \
         {                                                                    \
         if (head->first == element)                                          \
            {                                                                 \
            head->first = element->next;                                      \
            }                                                                 \
         else if (element->prev == NULL)                                      \
            {                                                                 \
            fprintf(stderr, "Not the first element, but NULL prev pointer."); \
            }                                                                 \
         else if (element->prev->next != element)                             \
            {                                                                 \
            fprintf(stderr, "The element prev pointer is bad.");              \
            }                                                                 \
         else                                                                 \
            {                                                                 \
            element->prev->next = element->next;                              \
            }                                                                 \
         if (head->last == element)                                           \
            {                                                                 \
            head->last = element->prev;                                       \
            }                                                                 \
         else if (element->next == NULL)                                      \
            {                                                                 \
            fprintf(stderr, "Not the last element, but NULL next pointer.");  \
            }                                                                 \
         else if (element->next->prev != element)                             \
            {                                                                 \
            fprintf(stderr, "The element next pointer is bad.");              \
            }                                                                 \
         else                                                                 \
            {                                                                 \
            element->next->prev = element->prev;                              \
            }                                                                 \
         element->prev = NULL;                                                \
         element->next = NULL;                                                \
         }                                                                    \
      }                                                                       \

   #define LIST_SCAN_TO_COND(header, element, cond)                           \
      for (element = (header == NULL? NULL: header->first);                   \
           element != NULL && !(cond);                                        \
           element = element->next)                                           \


Usage in other macros in headers or in code files:

   ELEM_REMOVE(set_list, curr_set);
   curr_set = NULL;

   LIST_SCAN_TO_COND(set_list, temp_set, (temp_set->id == desired_id));

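   int flag = FALSE;
   desired_set = NULL;
   LIST_SCAN_TO_COND(set_list, temp_set, (flag))
      {
      if ((temp_set->id % 17) == 13)
         {
         /*
         *  Save the match here, as the loop advances temp_set
         *  before it tests the flag again.
         */
         desired_set = temp_set;
         flag = TRUE;
         }
      }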

Since there are many places macros may be used, sometimes like variables, and sometimes like functions or "for" loops, it is useful to be able to recognize where they may be defined by their capitalization.
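
A self-contained sketch of the scanning macro used as a "for" loop with no block is given below. The SET_LIST and SET_ELEMENT types and the SetListFindId function are hypothetical, chosen only so that the usage compiles on its own.

```c
#include <stddef.h>

typedef struct _SET_LIST SET_LIST;
typedef struct _SET_ELEMENT SET_ELEMENT;

struct _SET_ELEMENT
   {
   int id;
   SET_ELEMENT *next;
   };
struct _SET_LIST
   {
   SET_ELEMENT *first;
   };

#define LIST_SCAN_TO_COND(header, element, cond)                           \
   for (element = (header == NULL? NULL: header->first);                   \
        element != NULL && !(cond);                                        \
        element = element->next)

/*
* Function: SetListFindId (hypothetical, for illustration only)
*
* Description:  Returns the first element with the desired id, or NULL.
*/
SET_ELEMENT *SetListFindId(SET_LIST *set_list, int desired_id)
   {
   SET_ELEMENT *temp_set = NULL;

   /*
   *  This has no executable block, only the looping.  The condition is
   *  tested before each advance, so the scan stops on the match itself.
   */
   LIST_SCAN_TO_COND(set_list, temp_set, (temp_set->id == desired_id));
   return(temp_set);
   }
```

The upper case name is what tells the reader that the line is a macro expanding to a loop rather than an ordinary function call.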

IV. Comment Contents

There are four forms of comments to be used:

  1. File header,
  2. Function or data structure header,
  3. Data declaration line(s) comment, and
  4. Code line(s) comment.

Comment contents should reflect a relevant level of abstraction of what the code is intended to perform. An example is:

   /*
   *  Find the insert point in the ordered linked list.
   *  This has no executable block, only the looping.
   */
   for (element = list->first;
        element != NULL && value <= element->value;
        element = element->next);
   /*
   *  See if the point was found.
   */
   if (element)
      {
      /*
      *  Insert before this element.
      */
      new_element->next = element;
      if (element->prev)
         {
         /*
         *  Just insert between the element and its predecessor.
         */
         new_element->prev = element->prev;
         element->prev->next = new_element;
         }
      else
         {
         /*
         *  Insert it at the head of the list.
         */
         new_element->prev = NULL;
         list->first = new_element;
         }
      element->prev = new_element;
      }
   else
      {
      /*
      *  It is after the last element (if any).
      */
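      new_element->prev = list->last;
      new_element->next = NULL;
      if (list->last)
         {
         /*
         *  There was a "last" element on the list.
         */
         list->last->next = new_element;
         }
      else
         {
         /*
         *  The list was empty, so this is also the first element.
         */
         list->first = new_element;
         }
      list->last = new_element;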
      }
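
For reference, the fragment above can be wrapped into a complete function. The following is a sketch under stated assumptions: the VALUE_LIST and VALUE_ELEMENT types and the ValueListInsert name are hypothetical, the value is assumed to be an int held in the element, and the link updates are arranged slightly differently from the fragment so that every pointer is set in one place.

```c
#include <stddef.h>

typedef struct _VALUE_LIST VALUE_LIST;
typedef struct _VALUE_ELEMENT VALUE_ELEMENT;

struct _VALUE_ELEMENT
   {
   int value;
   VALUE_ELEMENT *prev;
   VALUE_ELEMENT *next;
   };
struct _VALUE_LIST
   {
   VALUE_ELEMENT *first;
   VALUE_ELEMENT *last;
   };

/*
* Function: ValueListInsert (hypothetical, for illustration only)
*
* Description:  Links the new element into the ordered list.
*/
void ValueListInsert(VALUE_LIST *list, VALUE_ELEMENT *new_element)
   {
   VALUE_ELEMENT *element = NULL;

   if (list != NULL && new_element != NULL)
      {
      /*
      *  Find the first element holding a smaller value.
      *  This has no executable block, only the looping.
      */
      for (element = list->first;
           element != NULL && new_element->value <= element->value;
           element = element->next);
      new_element->next = element;
      if (element)
         {
         /*
         *  Insert before the element that was found.
         */
         new_element->prev = element->prev;
         element->prev = new_element;
         }
      else
         {
         /*
         *  Insert after the last element (if any).
         */
         new_element->prev = list->last;
         list->last = new_element;
         }
      if (new_element->prev)
         {
         /*
         *  Link the predecessor forward to the new element.
         */
         new_element->prev->next = new_element;
         }
      else
         {
         /*
         *  There is no predecessor, so this heads the list.
         */
         list->first = new_element;
         }
      }
   }
```

With the scan condition used here, larger values stay nearer the head of the list, and a new element is placed after any existing elements of equal value.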