CSE 423 Lab #3: Bison

Turnin: on Canvas, in a single .zip file. Include your answers to questions 1-12 along with your makefile and .l/.y/.c files, but not the files generated by flex and bison.

0. Preliminaries

This lab assumes that you have read the required Bison reading (sections 1 and 3-6 of the GNU Bison Manual). If that's not enough, read the Bison chapters of the optional text (Flex and Bison) or the other optional text (Build Your Own Programming Language). There are also free websites on YACC and Bison that may be useful.

1. Bison Basics

Download (or copy paste) the following toy Flex and Bison specifications.

Flex nnws.l:

%option noyywrap
%%
[a-zA-Z]+ { return NAME; }
[0-9]+ { return NUMBER; }
[ \t\n]+ { }
. { fprintf(stderr, "bad char\n"); }

Bison ns.y:

%token NAME NUMBER
%%
sequence : pair sequence | ;
pair : NAME NUMBER { printf("got a name-number pair\n"); } ;

  1. When you run flex nnws.l, what output file does flex write?
  2. When you run gcc -c lex.yy.c, what do you get?
  3. When you run bison ns.y, what output file does Bison write?
  4. Devise a command line for compiling the bison output to a .o file. Fix warnings by adding extern prototypes for yylex() and yyerror().
  5. When you run bison -d ns.y, what header file does Bison generate?
  6. Add #include "ns.tab.h" to the header section of nnws.l. Such C code goes inside %{ ... %} marks. Re-run flex nnws.l and rerun gcc -c lex.yy.c
  7. Write a third module, main.c, that initializes yyin by opening whatever filename is given in argv[1] and then calls yyparse() and prints out the yyparse() return value to standard output with a message such as yyparse returns 0.
  8. Add a yyerror() function to main.c:
    int yyerror(char *s) {
       fprintf(stderr, "%s\n", s); exit(1);
    }
  9. Compile your main.c to main.o with gcc -c main.c
  10. Devise a command line for linking the lex and bison output .o files with main.o to make an executable named ns (short for "name sequence")
  11. Run your program on itself with ./ns nnws.l. What does it write out?
  12. Run your program on an input file (you can name this file whatever you want) containing
    DrJ 1 Evil 0
    
    What does it write out?
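
Steps 7 and 8 can be sketched roughly as follows. This is only one possible shape, not the required solution. The helper name run_parser is hypothetical; it takes the input-stream pointer and the parse function as parameters so that it compiles on its own, without the files generated by flex and bison.

```c
#include <stdio.h>

/* Hypothetical helper (the name is not part of flex or bison).
 * Opens `filename`, points the scanner's input stream at it, runs the
 * parser, and prints the parser's return value to standard output.
 * Returns the parser's return value, or -1 if the file cannot be opened. */
int run_parser(const char *filename, FILE **input, int (*parse)(void))
{
    FILE *f = fopen(filename, "r");
    if (f == NULL) {
        perror(filename);      /* report why the open failed */
        return -1;
    }
    *input = f;                /* in main.c this would be yyin */
    int rv = parse();          /* in main.c this would be yyparse() */
    printf("yyparse returns %d\n", rv);
    fclose(f);
    return rv;
}
```

In main.c, after declaring extern FILE *yyin; and int yyparse(void);, a call such as run_parser(argv[1], &yyin, yyparse) would do the work of step 7. Checking argc before using argv[1] is left to you.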

2. How to Translate Kotlin's Grammar Rules into Bison (group)

If you complete this lab, you should have your Flex homework wired up to a Kotlin-subset grammar. When it is running, you have a syntax checker for the K0 language. [Then you can proceed with the rest of HW#3, building syntax trees.]
  1. Create a new lab3/ subdirectory.
  2. Create your own k0gram.y file, maybe blank at first
  3. Declare the %token's per your .h file (copy paste and modify!)
  4. Look (like we did before) at the Kotlin syntax grammar. Drill down focusing on topLevelObject, functionDeclaration, functionValueParameters, type, typeConstraints, functionBody, block, statements, statement, declaration, assignment, loopStatement, expression, controlStructureBody.
  5. Each lab member should translate at least 7 Kotlin grammar rules into Bison format, so your team should produce at least 14-21 production rules in total (7 per member). Share and review with each other, and make suggestions and fixes as needed.
  6. Edit k0gram.y as needed! Lab goal: create enough Kotlin grammar to parse a "hello world" program.
  7. Copy in your k0lex.l flex specification from HW#2.
  8. Modify k0lex.l and k0gram.y until they have the same set of TERMINAL symbol names. Maybe this means renaming everything in one or the other.
  9. Run bison -d on k0gram.y to make a k0gram.tab.h.
  10. Modify your k0lex.l to #include "k0gram.tab.h" instead of whatever previous terminal symbol definitions it used in HW#2.
  11. Modify your HW2 main() function to call yyparse() one time in place of the while loop that called yylex() over and over again.
  12. Test your program (now a syntax checker) on both valid K0 inputs like a "hello world" program, as well as on some inputs with syntax errors.
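
As a starting point for steps 2 and 3, a k0gram.y skeleton might look something like the sketch below. The token names (FUN, IDENTIFIER, LPAREN, and so on) and the non-terminal names are assumptions for illustration; use the names from your own HW#2 header. The single function rule only accepts functions with no parameters and an empty body, so it is far from enough to parse "hello world".

```yacc
%{
/* Prototypes for functions defined in other modules (lex.yy.c, main.c). */
int yylex(void);
int yyerror(char *s);
%}

/* Hypothetical token names; replace with the ones from your .h file. */
%token FUN IDENTIFIER LPAREN RPAREN LCURL RCURL

%%

/* A file is a sequence of zero or more function declarations. */
program:
	  program function_declaration
	| { /* epsilon production */ }
	;

/* fun name() { }  with an empty parameter list and an empty body */
function_declaration:
	FUN IDENTIFIER LPAREN RPAREN LCURL RCURL
	;
```

Running bison -d on a file like this produces the k0gram.tab.h that step 9 refers to.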

Example

Let's translate one Kotlin grammar rule into Bison as an example.
statements:
	[statement {semis statement}] [semis]
Translate pieces from the inside first. The innermost part of this is
{semis statement}
This is a chain of zero or more occurrences of semis (semicolons) followed by a statement. In Bison, chains of zero or more times are translated by adding a new non-terminal with a recursive production and an epsilon production, like this:
semis_statement:
	  semis_statement semis statement
	| { /* epsilon production */ }
	;
The next largest bit is the [statement {semis statement}], which is an optional sequence of statements. This is translated by adding a new non-terminal with the content, or an epsilon production:
optional_statement_sequence:
	  statement semis_statement
	| { /* epsilon production */ }
	;
Optional semicolons denoted by [semis] are translated as follows:
optional_semis:
	  semis
	| { /* epsilon production */ }
	;
Thus, the translation of the original non-terminal statements into Bison is:
statements:
	optional_statement_sequence optional_semis
	;
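The same inside-out approach carries over to other EBNF shapes. As a sketch, an optional comma-separated list of the form [parameter {COMMA parameter}], a simplified version of what appears inside functionValueParameters, could be translated as shown below. The names parameter_list, optional_parameter_list, parameter, and COMMA are placeholders, not names taken from the Kotlin grammar or from your token header.

```yacc
/* parameter {COMMA parameter}: one or more parameters separated by commas.
   The non-terminal parameter is assumed to be defined elsewhere. */
parameter_list:
	  parameter
	| parameter_list COMMA parameter
	;

/* [ ... ]: the whole list may be absent. */
optional_parameter_list:
	  parameter_list
	| { /* epsilon production */ }
	;
```

Here the recursive rule has a non-empty base case (a single parameter) rather than an epsilon production, because the list inside the brackets needs at least one item; the brackets themselves are handled by the separate optional rule.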
Feel free to ask questions about how to translate other aspects of Kotlin syntax grammar as needed.