CSE 423 Lab #6: Hashing and Symbol Tables

Turnin: on canvas. Due Sunday 3/9, 11:59pm

In this week's lab, you will implement a symbol table data type. Turn in as much as you have by the due date: a whole compiler with the symbol table files added, the makefile modified to use them, and so on. You will be graded as having done the lab as long as it looks like you did two or more hours of work towards a functioning symbol table.

Symbols

Write a syntax tree traversal function printsyms(struct tree *) that calls the following helper function for each identifier. You will use it in this lab only.
void printsymbol(char *s)
{
   printf("%s\n", s); fflush(stdout);
}
Integrate a call to printsyms into your compiler, calling it on the root of your syntax tree for each file. If you run this on all the source files of your compiler (*.[chly]), how many symbols occur? To check your work, you can pipe the results into UNIX wc -l for a quick count. Then pipe the results through sort and uniq (or sort -u) before wc -l to see how many unique symbols there are after removing duplicates; uniq by itself only collapses adjacent duplicate lines, so the sort is needed.
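One possible shape for the traversal is sketched below. The struct tree and struct token layouts, the MAXKIDS bound, and the IDENT category code are all assumptions made for illustration; substitute the field names and token codes from your own compiler. Returning a count from printsyms is also an addition, not a requirement of the lab; it just gives you a number to compare against wc.

```c
#include <stdio.h>
#include <stddef.h>

#define IDENT 258          /* hypothetical token category code for identifiers */
#define MAXKIDS 9          /* hypothetical upper bound on children per node */

struct token {             /* assumed token layout */
   int category;           /* terminal symbol code from the lexer */
   char *text;             /* lexeme */
};

struct tree {              /* assumed syntax tree layout */
   int prodrule;
   int nkids;              /* 0 for leaves */
   struct tree *kids[MAXKIDS];
   struct token *leaf;     /* non-NULL only for leaves */
};

void printsymbol(char *s)
{
   printf("%s\n", s); fflush(stdout);
}

/* Preorder walk: call printsymbol on every identifier leaf.
 * Returns the number of identifiers printed. */
int printsyms(struct tree *t)
{
   int i, count = 0;
   if (t == NULL) return 0;
   if (t->nkids == 0 && t->leaf != NULL && t->leaf->category == IDENT) {
      printsymbol(t->leaf->text);
      count = 1;
      }
   for (i = 0; i < t->nkids; i++)
      count += printsyms(t->kids[i]);
   return count;
}
```

Because every occurrence is printed, not just declarations, the same name will normally show up many times; that is what the sort and uniq step measures.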

Symbol Table Entries

A symbol table is a collection of symbol table entries. For today, each entry has an empty data payload: there is nothing in it except the symbol itself, and all it can be used for is to answer the yes/no question of whether the symbol is declared in a given scope.
typedef struct sym_entry {
/*   SymbolTable table;			/* what symbol table do we belong to*/
   char *s;				/* string */
   /* more symbol attributes go here for code generation */
   struct sym_entry *next;
   } *SymbolTableEntry;
Put this in a header file, symtab.h. Every module that #includes this file should list it in its makefile dependencies. For example, for foo.c:
foo.o: foo.c symtab.h
	gcc $(CFLAGS) -c foo.c

A Symbol table that is a linked list

You can build a symbol table as a linked list first and get it working, and then turn it into a hash table. Add the following to symtab.h:
typedef struct sym_table {
   int nEntries;			/* # of symbols in the table */
/*   struct sym_table *parent;		/* enclosing scope, superclass etc. */
   struct sym_entry *next;
   /* more per-scope/per-symbol-table attributes go here */
   } *SymbolTable;
Write a constructor function mksymtab() to create one of these symbol tables and return a pointer to it.
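As a starting point, a minimal sketch of the linked-list version might look like the following. Only mksymtab and the two struct layouts come from this handout; the helper names symtab_lookup and symtab_insert, and their return conventions, are hypothetical choices for illustration.

```c
#include <stdlib.h>
#include <string.h>

typedef struct sym_entry {
   char *s;                    /* string */
   struct sym_entry *next;
   } *SymbolTableEntry;

typedef struct sym_table {
   int nEntries;               /* # of symbols in the table */
   struct sym_entry *next;     /* head of the entry list */
   } *SymbolTable;

/* Allocate an empty table; returns NULL if malloc fails. */
SymbolTable mksymtab(void)
{
   SymbolTable st = malloc(sizeof(struct sym_table));
   if (st == NULL) return NULL;
   st->nEntries = 0;
   st->next = NULL;
   return st;
}

/* Hypothetical helper: return the entry for s, or NULL if absent. */
SymbolTableEntry symtab_lookup(SymbolTable st, char *s)
{
   SymbolTableEntry e;
   for (e = st->next; e != NULL; e = e->next)
      if (strcmp(e->s, s) == 0) return e;
   return NULL;
}

/* Hypothetical helper: insert a copy of s at the head of the list.
 * Returns 1 if inserted, 0 if already present, -1 if out of memory. */
int symtab_insert(SymbolTable st, char *s)
{
   SymbolTableEntry e;
   if (symtab_lookup(st, s) != NULL) return 0;
   e = malloc(sizeof(struct sym_entry));
   if (e == NULL) return -1;
   e->s = malloc(strlen(s) + 1);
   if (e->s == NULL) { free(e); return -1; }
   strcpy(e->s, s);
   e->next = st->next;
   st->next = e;
   st->nEntries++;
   return 1;
}
```

A return value of 0 from the insert is a natural place to hang a "redeclared variable" semantic error later on.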

Symbol Table Population

Walk through your tree, looking for nodes whose production rule indicates that they are variable declarations. Find the list of variable names and insert them into the current symbol table (scope).

This task may be much bigger than it looks, since it implies that you can find the current symbol table for each node in the syntax tree. Maybe first do a pass to create all the symbol tables: can you find the tree nodes where symbol table creation should occur? Then create a mechanism (either a semantic attribute, or a stack data structure) by which you track the current symbol table as you traverse the syntax tree.
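The stack option can be sketched using the parent pointer that is commented out in the struct above: each new scope points at its enclosing scope, and a single variable tracks the innermost one. The names enter_scope, exit_scope, and current_scope, and the use of a global variable, are assumptions for illustration; the entry list is omitted to keep the sketch short.

```c
#include <stdlib.h>

typedef struct sym_table {
   int nEntries;               /* # of symbols in the table */
   struct sym_table *parent;   /* enclosing scope; NULL for the outermost */
   } *SymbolTable;

/* Hypothetical global: the innermost scope currently open. */
static SymbolTable current = NULL;

/* Call when the traversal reaches a node that opens a scope.
 * The new table is pushed on top of the current one. */
SymbolTable enter_scope(void)
{
   SymbolTable st = malloc(sizeof(struct sym_table));
   if (st == NULL) return NULL;
   st->nEntries = 0;
   st->parent = current;
   current = st;
   return st;
}

/* Call when the traversal leaves that node. The table is not freed,
 * so a tree node can keep a pointer to it; the closed table is returned. */
SymbolTable exit_scope(void)
{
   SymbolTable closed = current;
   if (current != NULL) current = current->parent;
   return closed;
}

SymbolTable current_scope(void)
{
   return current;
}
```

With this arrangement, a lookup that fails in the current table can follow parent links outward until it either finds the symbol or runs out of scopes.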

A hash function

You can use this one, or anything better that you can invent or find, so long as you cite sources.
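int hash(SymbolTable st, char *s)
{
   unsigned int h = 0;		/* unsigned, so overflow wraps around */
   unsigned char c;
   while ((c = (unsigned char)*s++) != '\0') {
      h += c;
      h *= 37;
      }
   /* nBuckets is a field of the hash-table version of struct sym_table */
   return (int)(h % (unsigned int)st->nBuckets);
}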

Symbol Table that uses an Array of Linked Lists

typedef struct sym_table {
   int nBuckets;			/* # of buckets */
   int nEntries;			/* # of symbols in the table */
/*   struct sym_table *parent;		/* enclosing scope, superclass etc. */
   struct sym_entry **tbl;
   /* more per-scope/per-symbol-table attributes go here */
   } *SymbolTable;
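A minimal sketch of how the bucket array might be used follows. It assumes mksymtab takes the number of buckets as a parameter, which this handout does not specify, and the helper names symtab_lookup and symtab_insert are hypothetical. The hash function is a variant of the one above that uses unsigned arithmetic.

```c
#include <stdlib.h>
#include <string.h>

typedef struct sym_entry {
   char *s;                    /* string */
   struct sym_entry *next;     /* next entry in the same bucket */
   } *SymbolTableEntry;

typedef struct sym_table {
   int nBuckets;               /* # of buckets */
   int nEntries;               /* # of symbols in the table */
   struct sym_entry **tbl;     /* array of nBuckets list heads */
   } *SymbolTable;

int hash(SymbolTable st, char *s)
{
   unsigned int h = 0;
   unsigned char c;
   while ((c = (unsigned char)*s++) != '\0') {
      h += c;
      h *= 37;
      }
   return (int)(h % (unsigned int)st->nBuckets);
}

/* Assumed signature: the caller chooses the bucket count. */
SymbolTable mksymtab(int nBuckets)
{
   SymbolTable st;
   int i;
   if (nBuckets <= 0) return NULL;
   st = malloc(sizeof(struct sym_table));
   if (st == NULL) return NULL;
   st->tbl = malloc((size_t)nBuckets * sizeof(struct sym_entry *));
   if (st->tbl == NULL) { free(st); return NULL; }
   for (i = 0; i < nBuckets; i++) st->tbl[i] = NULL;
   st->nBuckets = nBuckets;
   st->nEntries = 0;
   return st;
}

/* Hypothetical helper: search only the bucket that s hashes to. */
SymbolTableEntry symtab_lookup(SymbolTable st, char *s)
{
   SymbolTableEntry e;
   for (e = st->tbl[hash(st, s)]; e != NULL; e = e->next)
      if (strcmp(e->s, s) == 0) return e;
   return NULL;
}

/* Hypothetical helper: 1 if inserted, 0 if already present, -1 if out of memory. */
int symtab_insert(SymbolTable st, char *s)
{
   SymbolTableEntry e;
   int h = hash(st, s);
   if (symtab_lookup(st, s) != NULL) return 0;
   e = malloc(sizeof(struct sym_entry));
   if (e == NULL) return -1;
   e->s = malloc(strlen(s) + 1);
   if (e->s == NULL) { free(e); return -1; }
   strcpy(e->s, s);
   e->next = st->tbl[h];       /* push onto the front of the bucket's list */
   st->tbl[h] = e;
   st->nEntries++;
   return 1;
}
```

Each bucket is just the linked list from the earlier version, so code that already works on one list carries over with the list head replaced by st->tbl[hash(st, s)].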

Wrapping Up: How do you tell you have your symbols in place?

You should print the tables out. How hard will that be?
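For the array-of-linked-lists version, printing is a pair of nested loops. The sketch below is one way to do it; the name printsymtab, the FILE * parameter, the output format, and the returned count are all assumptions for illustration.

```c
#include <stdio.h>

typedef struct sym_entry {
   char *s;                    /* string */
   struct sym_entry *next;
   } *SymbolTableEntry;

typedef struct sym_table {
   int nBuckets;               /* # of buckets */
   int nEntries;               /* # of symbols in the table */
   struct sym_entry **tbl;
   } *SymbolTable;

/* Print every symbol, bucket by bucket, one per line.
 * Returns the number of symbols printed, which should equal nEntries. */
int printsymtab(SymbolTable st, FILE *out)
{
   int i, n = 0;
   SymbolTableEntry e;
   if (st == NULL) return 0;
   for (i = 0; i < st->nBuckets; i++)
      for (e = st->tbl[i]; e != NULL; e = e->next) {
         fprintf(out, "  [%d] %s\n", i, e->s);
         n++;
         }
   return n;
}
```

Printing the bucket index alongside each symbol also shows how evenly the hash function spreads your identifiers.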