CSE 423 Lab #2: Lexical Testing
For today's lab, we are going to work on HW#2 as much as we can,
and think about how to test it sufficiently.
Lab Part 1: Team Up and Swap Notes (20 minutes)
- Break into small groups, preferably of size three. Zoom folks,
I guess we'll try one big Zoom group.
- Share with each other the non-trivial regexes that you have in your
k0lex.l so far. Discuss each.
- If you have questions about which regular expression is correct,
write those down as part of your lab submission.
- For each non-trivial regex, decide, either unanimously or by
voting, which regular expression is the most correct.
Lab Part 2: Lex Testing
- Write test cases for each token category, especially the non-trivial ones.
- Hello World is a test case (.kt file). It provides coverage for ~10 token categories.
- A brute-force lexical coverage test would include one of each token
category on the entire list: prove that your lexical analyzer can generate
all ~200 integer codes for all the unique terminal symbols in your ytab.h file.
- Subdivide this task amongst team members, share results.
- Add one or more tests for everything in Kotlin
- Test the boundary cases ("weird" and "bad" tokens):
  - Unterminated strings
  - Random binary garbage characters
  - Nested comments
  - ...etc.
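One way to check the brute-force coverage goal is to tally which category codes the lexer actually returned while running over all the test files. The sketch below is only an illustration: the names mark_seen, count_missing, and MAX_CATEGORY are hypothetical, and the range of codes to check is an assumption that should be taken from your own generated header.

```c
#include <stdio.h>

/* Assumed upper bound on category codes; adjust to match your header. */
#define MAX_CATEGORY 1024

/* seen[c] is 1 once the lexer has returned category code c. */
static unsigned char seen[MAX_CATEGORY];

/* Record one category code returned by the lexer. */
void mark_seen(int category)
{
    if (category >= 0 && category < MAX_CATEGORY)
        seen[category] = 1;
}

/* Count the codes in [first, last] that were never returned,
 * printing each one so the team knows which tests are still missing. */
int count_missing(int first, int last)
{
    int code, missing = 0;
    for (code = first; code <= last; code++) {
        if (code < 0 || code >= MAX_CATEGORY || !seen[code]) {
            printf("never generated: %d\n", code);
            missing++;
        }
    }
    return missing;
}
```

A test driver could call mark_seen() on every value yylex() returns until end of file, for each .kt test file in turn, and then call count_missing() once with the smallest and largest token codes from the header.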
What to Turn in
Lab submission: turn in a .zip containing:
For Part 1, a PDF document named README.pdf that provides
- your list of team members, and their participation scores
(0 for no participation, 1 for partial participation, 2 for
full participation)
- your list of which Kotlin tokens deserve non-trivial regexes
- your group's (unanimous or vote-based) winners for these regexes
- your list of questions (if any)
For Part 2: include in the .zip your set of lexical test cases, in .kt files.
You will receive full credit for the lab if it looks like you did at least
three hours' worth of lab work. I figure each group should have no problem
generating 30 or more test cases that collectively test over 100 of your
tokens. Probably teams can get closer to 100% coverage than that.
The tests do not have to "run" or "compile";
they only have to provide some lexical challenges relevant to Kotlin.
Answers to Lab Questions
- What do you mean, put the tokens in yylex()?
- I mean malloc() an instance of each token inside yylex() and
initialize its fields before you return the integer category.
The type definition for tokens should be in a .h file.
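As a rough sketch of the answer above, a helper along these lines could be called from each semantic action. The struct layout, the field names, and the function name alloc_token are all assumptions made for illustration; the real fields should follow the HW#2 specification.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical token record; this definition would live in a .h file. */
struct token {
    int category;     /* integer code that yylex() returns */
    char *text;       /* private copy of the lexeme (yytext) */
    int lineno;       /* line on which the token was found */
    char *filename;   /* private copy of the source file name */
};

/* Allocate a token and initialize its fields.
 * Returns NULL if any allocation fails. */
struct token *alloc_token(int category, const char *lexeme,
                          int lineno, const char *filename)
{
    struct token *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->category = category;
    t->lineno = lineno;
    t->text = malloc(strlen(lexeme) + 1);
    t->filename = malloc(strlen(filename) + 1);
    if (t->text == NULL || t->filename == NULL) {
        free(t->text);
        free(t->filename);
        free(t);
        return NULL;
    }
    strcpy(t->text, lexeme);       /* yytext is reused, so copy it */
    strcpy(t->filename, filename);
    return t;
}
```

In a flex action this might be used by storing the result of alloc_token(category, yytext, yylineno, name) in a global token pointer just before returning the category, where the global pointer and the file name variable are again hypothetical.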
- How do I de-escape the string literals for the svals as shown in the HW#2
example output?
- You will probably want to write a helper function that builds an sval
from yytext by doing a "transforming copy": most yytext characters are
copied into the sval unchanged, but each escape sequence in yytext maps
its 2+ characters down to a single character in the sval.
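The "transforming copy" described above might look roughly like the following. This is a minimal sketch, not the required solution: the function name deescape is hypothetical, it assumes the lexeme still carries its surrounding double quotes, and it handles only the simple two-character escapes (no \uXXXX sequences). Passing unknown escapes through unchanged is also an assumption; your lexer may prefer to report them as errors.

```c
#include <stdlib.h>
#include <string.h>

/* Build a de-escaped sval from a string-literal lexeme.
 * The caller owns the result and must free() it. */
char *deescape(const char *lexeme)
{
    size_t n = strlen(lexeme);
    size_t i = 0, j = 0;
    char *sval = malloc(n + 1);   /* sval is never longer than the lexeme */
    if (sval == NULL)
        return NULL;
    /* Skip the opening quote and stop before the closing quote. */
    if (n >= 2 && lexeme[0] == '"' && lexeme[n - 1] == '"') {
        i = 1;
        n--;
    }
    while (i < n) {
        char c = lexeme[i++];
        if (c == '\\' && i < n) {
            /* Two source characters collapse into one sval character. */
            switch (lexeme[i++]) {
            case 'n':  c = '\n'; break;
            case 't':  c = '\t'; break;
            case 'r':  c = '\r'; break;
            case 'b':  c = '\b'; break;
            case '\\': c = '\\'; break;
            case '"':  c = '"';  break;
            case '\'': c = '\''; break;
            case '$':  c = '$';  break;
            default:   c = lexeme[i - 1]; break;  /* unknown: keep the char */
            }
        }
        sval[j++] = c;
    }
    sval[j] = '\0';
    return sval;
}
```

Allocating strlen(lexeme) + 1 bytes is safe here because de-escaping only ever shrinks the text.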
- It seems like you want a lot of stuff to be done inside yylex()?
- Yes, probably in one or more helper functions called from the
semantic action code fragments in yylex().
- Multi-line strings?
- Kotlin allows multi-line strings. It was decided that k0 will do
multi-line strings. However, the tricky bits are negotiable.