ANSI C Code Generation Guide

This document is a guide to generating C code from intermediate three-address instructions. The goal is to let the C compiler do the native code generation for us, once we have done the memory layout and control flow work ourselves, so that we can show that the three-address code logic is correct.

Variables

Three-address code has region:offset addresses, not variable names. You can declare global data as
   char global[globalsize];
and then address a given variable at an offset of k as
 (*(k's type *)(global+k)).
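
As a minimal sketch of this scheme, assume a hypothetical 16-byte global region that holds an int at offset 0 and a double at offset 8. The names GLOBALSIZE, set_globals and sum_globals are made up for illustration and do not come from this guide. The sketch also assumes that every offset is suitably aligned for its type and that the compiler tolerates this kind of type punning (with gcc, compiling with -fno-strict-aliasing is the cautious choice).

```c
#define GLOBALSIZE 16            /* hypothetical size of the global region */

char global[GLOBALSIZE];         /* one array holds every global variable */

/* Hypothetical layout: an int at offset 0, a double at offset 8. */
void set_globals(int i, double d)
{
    (*(int *)(global + 0)) = i;       /* store the int global */
    (*(double *)(global + 8)) = d;    /* store the double global */
}

/* Read both globals back through the same casts and add them as reals. */
double sum_globals(void)
{
    return (*(int *)(global + 0)) + (*(double *)(global + 8));
}
```

The cast names the type of the variable stored at that offset, so the C compiler picks the right load, store and arithmetic instructions even though the declaration is only an array of char.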

Constants

As a concession to what the assembler would do for us, immediates of type "int" are allowed. For anything else, you can declare and initialize a single variable for the entire constant data region as
   char constant[constantsize] = { .../* your bytes here */... } ;
and then address a given variable at a byte-offset of k as
 (*(k's type *)(constant+k)).
Most likely this would be solely for string data.
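
A small sketch of a constant region that holds only string data, under the assumption of two hypothetical strings laid out back to back. CONSTANTSIZE, the byte contents and the helper string_at are invented for illustration. Since a string is addressed by its first byte, the address of the string at byte-offset k is simply constant+k, with no cast needed.

```c
#define CONSTANTSIZE 12          /* hypothetical size of the constant region */

char constant[CONSTANTSIZE] = {
    'h', 'e', 'l', 'l', 'o', '\0',    /* offset 0: "hello" */
    'w', 'o', 'r', 'l', 'd', '\0'     /* offset 6: "world" */
};

/* Address of the NUL-terminated string that starts at byte-offset k. */
const char *string_at(int k)
{
    return constant + k;
}
```

A generated call such as puts(string_at(6)), or equivalently puts(constant+6), would then print the second string.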

Stack

TAC-C uses the regular C stack, but parameters and locals for all functions are given as a (sized) array of char. Locals must include space for both named locals and temporary variables. They look like
   char loc[regionsize];

Parameters are a bit more awkward; we can pass an array parameter, but someone (the caller) must allocate the parameter region. Where is a parameter region allocated? On the stack. How do we allocate it? As a temporary variable in the caller. Its size should be the maximum of the parameter region sizes needed by the functions that the caller calls directly. The general form of a function, then, is

returntype func(char par[]) {
   char loc[locsize+tmpsize+parmsize];
}
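
The general form above can be sketched as follows, assuming 4-byte ints and suitably aligned offsets. The functions add2, caller and start, the macro names, and all sizes and offsets are hypothetical. The caller reserves its parameter region at the end of loc, sized for the largest parameter list of any function it calls directly (here only add2, which needs 8 bytes).

```c
/* Callee: reads two int parameters from the region its caller allocated. */
int add2(char par[])
{
    char loc[4];                                  /* one int temporary at loc+0 */
    *(int *)(loc + 0) = *(int *)(par + 0) + *(int *)(par + 4);
    return *(int *)(loc + 0);
}

#define LOCSIZE  4                     /* named local x at loc+0 */
#define TMPSIZE  4                     /* temporary t at loc+4 */
#define PARMSIZE 8                     /* largest parameter region of any callee */
#define PARBASE  (LOCSIZE + TMPSIZE)   /* parameter region starts at loc+8 */

/* Caller: one int parameter of its own at par+0. */
int caller(char par[])
{
    char loc[LOCSIZE + TMPSIZE + PARMSIZE];
    *(int *)(loc + 0) = *(int *)(par + 0);               /* x := first parameter */
    *(int *)(loc + (PARBASE + 0)) = *(int *)(loc + 0);   /* param x */
    *(int *)(loc + (PARBASE + 4)) = 10;                  /* param 10 (int immediate) */
    *(int *)(loc + 4) = add2(loc + PARBASE);             /* call add2,2,t */
    return *(int *)(loc + 4);                            /* return t */
}

/* Hypothetical entry shim: builds the outermost parameter region. */
int start(int x)
{
    char par[4];
    *(int *)(par + 0) = x;
    return caller(par);
}
```

Calling start(5) passes 5 and 10 to add2 through the caller's parameter region and returns their sum.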

intermediate code
instruction               C equivalent                                              Comment
------------------------  --------------------------------------------------------  ----------------------------------------
x := y + z                *(int *)(loc+16) = *(int *)(loc+20) + *(int *)(loc+24);   Typical example of adding two numeric
(integer locals)                                                                    variables. If one of the operands is not
                                                                                    an integer, we generate code to add
                                                                                    operands as reals. One could treat
                                                                                    constants in the same way, or generate
                                                                                    optimized code for them.
x := - y                  *(int *)(loc+16) = - *(int *)(loc+20);
(local variables)
x := y                    *(int *)(loc+16) = *(int *)(loc+20);
x := &y                   *(int **)(loc+16) = (int *)(loc+20);
x := *y                   *(int *)(loc+16) = **(int **)(loc+20);
*x := y                   **(int **)(loc+16) = *(int *)(loc+20);
goto L100                 goto L100;
if x < y then             if (*(int *)(loc+16) < *(int *)(loc+20)) goto L100;
    goto L100
if x then goto L          if (*(int *)(loc+16)) goto L;
if !x then goto L         if (!*(int *)(loc+16)) goto L;
param x                   *(int *)(loc+(parbase+paroff)) = *(int *)(loc+16);
call p,n,x                *(int *)(loc+16) = p(loc+parbase);                        except built-ins, which are called
                                                                                    normally
return x                  return *(int *)(loc+16);
global x,n1,n2                                                                      No individual globals? References in
                                                                                    code generally use offsets.
proc x,n1,n2              int x(char par[n1]){
local x,n                                                                           No individual locals. References in
                                                                                    code generally use offsets.
label Ln                  Ln:
end                       }
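
To see how the rows above combine, here is a hedged sketch that translates two small, made-up procedures instruction by instruction. The procedures sum and ptrs, the helper call1, and every offset are hypothetical. The sketch assumes 4-byte ints, pointers of at most 8 bytes, and suitably aligned offsets. Each C statement carries the intermediate instruction it came from as a comment.

```c
/* proc sum,4,8 : parameter n at par+0; locals s at loc+0 and i at loc+4. */
int sum(char par[4])
{
    char loc[8];
    *(int *)(loc+0) = 0;                                   /* s := 0 */
    *(int *)(loc+4) = 1;                                   /* i := 1 */
L1:                                                        /* label L1 */
    if (*(int *)(par+0) < *(int *)(loc+4)) goto L2;        /* if n < i then goto L2 */
    *(int *)(loc+0) = *(int *)(loc+0) + *(int *)(loc+4);   /* s := s + i */
    *(int *)(loc+4) = *(int *)(loc+4) + 1;                 /* i := i + 1 */
    goto L1;                                               /* goto L1 */
L2:                                                        /* label L2 */
    return *(int *)(loc+0);                                /* return s */
}                                                          /* end */

/* proc ptrs,4,16 : parameter a at par+0; locals y at loc+0, z at loc+4,
   and pointer x at loc+8. */
int ptrs(char par[4])
{
    char loc[16];
    *(int *)(loc+0) = *(int *)(par+0);                     /* y := a */
    *(int **)(loc+8) = (int *)(loc+0);                     /* x := &y */
    *(int *)(loc+4) = **(int **)(loc+8);                   /* z := *x */
    *(int *)(loc+4) = *(int *)(loc+4) + 1;                 /* z := z + 1 */
    **(int **)(loc+8) = *(int *)(loc+4);                   /* *x := z */
    return *(int *)(loc+0);                                /* return y */
}                                                          /* end */

/* Hypothetical helper: builds a one-int parameter region and calls f. */
int call1(int (*f)(char *), int arg)
{
    char par[4];
    *(int *)(par+0) = arg;                                 /* param arg */
    return f(par);                                         /* call f,1,result */
}
```

Here call1(sum, 4) adds the integers 1 through 4, and call1(ptrs, 41) increments y through the pointer x before returning it.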

Clint Jeffery