e-x-a.org » Assignment B

Deadline: January 4th, 2019

Let’s create a very simple programming language called S# that works as follows:

  • All expressions are valued as unsigned 64bit integers, if you write a number 123 it means exactly a 64-bit unsigned 123.
  • You can combine the integers using operators +, -, *, /, %, <, >, ==, !=, prefix ~, && and ||, and group them using parentheses. E.g. (1+2)==(5+6) should be valued 0. All operators work just as in C except for ~ which means logical negation, booleans are represented as 0 and 1.
  • Several expressions can be connected using ; and grouped using braces {} to form a single expression. “Value” of the expression is the value of the last expression in the braces (similarly as in Scheme or R)
  • There is a conditional expression called if which you can use to run conditional code: if (expression) {...} {...}. The ‘else’ branch must always be present, and the parentheses and braces { } are mandatory.
  • There is a special expression write(expression) which prints out the value of an expression to the standard output
  • There is a special expression read() which reads a value from the standard input and returns it
  • You can define your own functions:
    • Definition looks like this: functionName paramA paramB {expression}. In the definition, values paramA and paramB are accessible as ‘variables’. (Their values are perfectly constant thorough the function execution, though — there is no assignment). Note that the braces { } around the function definition are mandatory.
    • Function call is done using the functionName just as with read or write — you put the expressions with the values for the parameters in parentheses behind the functionName and separate the parameters using a comma ,.
  • The program is, as usual in C-style languages, formed by a list of function definitions, and gets ‘executed’ by evaluating the parameter-less function main. Last value in main is used as a return value for the whole program.
  • All identifiers (function names and variable names) are made only of alphabetic characters (i.e. variable name param_1 is invalid for 2 different reasons).
  • Whitespace between separate tokens is ignored.

Your task: Write a transpiler of S# to C, i.e. a program that reads an S# program and outputs a program in C that does the same thing. You may also transpile to C++, since the languages are almost identical from this perspective — choose whichever you want.

The following rules apply:

  • It is recommended to wait with the implementation until we practice the recursive-descent parsing in the lab (which should make the assignment almost trivial). That should happen around the first lab in December.
  • You can use any standardized C++ library you want for the solution. That means — unlike in assignment A — that you should use smart pointers, STL containers and C++-style I/O, to create the possibly highest-level and shortest implementation you can. No low-level memory handling should be used (to avoid memory leaks).
  • In parsing, use any operator precedence you like, but make sure that multiplicative operations associate with higher precedence than additive operations.
  • Your program must always produce a valid C program — if there is a mistake in the input source, you should report an error without producing any output. The error messages are not required to be informative, just saying “error” to standard error output and returning non-zero value from main is a sufficient report. The errors that you should check for include:
    • Syntactic validity of the program
    • Using only defined variables
    • Using all functions with correct number of arguments
    • Missing or malformed main function
    • Name collisions of the defined functions and identifiers with each other, and with read, write and if
  • As usual, your program must never crash or leak memory.
  • You should use reasonable (neat and readable) code formatting, provide reasonable names to all functions and objects, and add comments that explain non-obvious parts of the code.
  • Simplification: You can use unbounded recursion both in your solution, and in the “compiled” C output code. This time, we will ignore the fact that both your transpiler and the resulting program may crash on adversarial input because of stack overflows.
Examples

S# program:

test u v {u+v+u*v}
fact x { x * if (x>1) {fact(x-1)} {1}}
main {
  write(test(read(),
             read()));
  write ( fact ( 5 ));
  0
}

may be translated to following C program:

#include <stdio.h>
#include <stdint.h> //for uint64_t
typedef uint64_t u;
u read() {u x; scanf("%lu", &x); return x;}
u test(u a, u b) { return (a+b)+(a*b); }
u fact(u x) { return x * (x>1?fact(x-1):1); }
int main() {
	printf("%lu\n", test(read(),read()));
	printf("%lu\n", fact(5));
	return 0;
}

(Note: you may use any renaming of variables and function names you want that satisfies the assignment conditions. In fact, your program output may be COMPLETELY different from this one, as long as it represents the same program.)

Another S# program:

fun x {write(x); if(x>0) {fun(x-1);} {0} }
main {fun(10)}

May be translated to following C++ program:

#include <iostream>
#include <cstdint>
uint64_t f1(uint64_t p1) { std::cout << p1 << std::endl; return (p1>0)?f1(p1-1):0; }
int main() { return f1((10)); }

The following S# program is invalid and should report an error (for any of the 7 different reasons):

func x1 x1 { func2(x2,x1) }
Main x {func(1,2,3)-}
Bonus

Notice that the following is a valid S# program that should output 3 and 4.

main {write({write(3);4})}

Precious hint: , has two different meanings in C.