Toy Compiler written in c++ using LLVM by Evian Concepción Peña
Combine an user friendly sintax and the speed of a compiled language, example: Mojo (uses MLIR) or Julia WARNING: Currently translation to machine code is commented, to develope faster I use the JIT engine of LLVM
- At parent directory -->
mkdir build - At parent directory -->
cd build - At build directory -->
cmake .. - At nicole directory -->
nicole.sh file.nc nicole.shwill compile and excute but only will compile those modified files- Still no options from command line so test/test1.nc is the program to be compiled
-
Lexer (sintax is up to the users since the parser only knows about token types and not raw information)
-
Parsing algorithm is up to users meaning it can easily be changed, for this project I just implement a Top-Down
-
Tree visualization
-
Literals Bool, Int, Float, Double, Char, String
-
Variable Declaration
-
Const Variable Declaration
-
Auto declaration (still not supported const auto but it is trivial to implement)
-
Unary operations -- (decrement), ++ (increment), -, ! (not)
-
Binary operations or, and, +, -, *, /, %, !=, ==, <, <=, >, >=
-
Operators +=, -=, *=, /=
-
Operator precedency and parenthesis expressions
-
Variable call
-
Function Declaration with return (type / void) and allowing recursion
-
Function calls
-
If conditional with else as optional
-
Ternary operator
-
Switch conditional with optional default (still gotta check that every case has the right type)
-
Loops while, do while, for
-
Every time a function, loop or conditional is being created it inherits the outer scope
-
Print in the terminal (by te moment just literal and not user types)
-
Import (still gotta check if every path works and more tests about dependency cycles)
-
Structures of data (can access either as a getter or to set the attributes after constructor, just missing methods and nested types)
-
Better error messages
-
Vectors / Array
-
Classes
-
Stop (equivalent to break in c++)
-
Pass (equivalent to continue in c++)
-
Pointers and null
-
Concatenation of expression, example: variable[index][index].method... Right now would require create an intermediate variable
-
Standar library
-
Type conversion
-
Type checking analysis
-
Optimization analysis (let the user decide between differnt levels, Default, None, Agressive...)
-
System calls
-
Threads (not sure)
-
Templates (not sure)
-
Polymorphism (not sure)
-
NameSpace (not sure)
-
Use MLIR -> Multi Level Intermediate Representation (not sure)
-
The .txt files next to the CMakeLists.txt are just meant for me to install correctly LLVM and the intellisense being able to find the library
-
Building system (cmake, in the futute might add Bazel)
-
inc for every header file (.h)
- The folders are divide by stages (since this is an early stage of the project it might not be accurate the file path)
-
- Lexical Analysis (has everything related to building a lexer)
-
- Parsing Analaysis (has everything related to de parsing, AST nodes, tables)
-
- Type checking (not implemnted)
-
- Optimization analysis (not implemented)
-
- Visitors (Code generation)
-
src for every source file (.cc)
-
test constains tests for the features of the compiler (expected to use google test library)
-
build where the binary is