Compiler construction
a tedious task with benefits
- What ?
- Why ?
- How ?
- Examples !
- Frontend
- User interaction
- Input handling
- Backend
- Doing the hard stuff
- No interaction with the user
- Scanner
- Tokenizer (Lexical Analysis)
- Preprocessing (Macros)
- Parser (Syntax Analysis)
- Validator (Semantic Analysis)
- Analyzer
- Optimizer
- Code generator
- Linker
- Runtime
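
As an illustration of the tokenizer stage listed above, here is a minimal sketch of a regex-based lexer in Python. The token kinds (`NUMBER`, `IDENT`, `OP`) and the tiny arithmetic language they describe are assumptions made for this example only; a real language would need its own token set.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token set for a tiny arithmetic language (an assumption).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),  # integer or decimal literal
    ("IDENT", r"[A-Za-z_]\w*"),    # identifier
    ("OP", r"[-+*/=()]"),          # single-character operators and brackets
    ("SKIP", r"\s+"),              # whitespace, dropped by the tokenizer
    ("ERROR", r"."),               # any other character is an error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from left to right, skipping whitespace."""
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        yield Token(kind, match.group())
```

Calling `list(tokenize("x = 3 + 4.5"))` yields five tokens; the parser stage would then consume this stream instead of raw characters.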
- Primary goal: Produce representation of code
- Code might be represented as an AST
- This tree can be modified and searched
- In the end it can be used for output generation
- Sometimes open for modification
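
The points above (a tree that can be searched, modified and finally used for output generation) can be sketched with a small AST in Python. The node classes `Number` and `BinaryOp` and the helper names are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Number:
    value: int


@dataclass
class BinaryOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Number, BinaryOp]


def contains_op(node: Node, op: str) -> bool:
    """Search the tree: is the operator `op` used anywhere?"""
    if isinstance(node, Number):
        return False
    return node.op == op or contains_op(node.left, op) or contains_op(node.right, op)


def to_source(node: Node) -> str:
    """Output generation: turn the tree back into fully bracketed text."""
    if isinstance(node, Number):
        return str(node.value)
    return f"({to_source(node.left)} {node.op} {to_source(node.right)})"


# Tree for 1 + 2 * 3
tree = BinaryOp("+", Number(1), BinaryOp("*", Number(2), Number(3)))
```

Because the nodes are plain mutable objects, a modification is just an assignment, for example `tree.right = Number(6)` replaces the product by a constant before output is generated.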
Statements and expressions
- Statements are closed blocks
- Expressions are open for connections
- Most statements require expressions
- Expressions carry instructions
- Statements are responsible for logic
- Hierarchy required
- Makes trivial parsing impossible
- Reverse Polish notation
- Needs to consider brackets
- Left-To-Right or Right-To-Left?
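
One common way to deal with operator hierarchy and brackets is to convert the input to Reverse Polish notation first. The following sketch uses the shunting-yard algorithm for that, which is a choice made for this example and not necessarily the approach intended here; the operator table and the assumption that all operators are left-associative are also illustrative.

```python
import operator

# Assumed operator table: higher number binds tighter, all left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def to_rpn(tokens):
    """Convert infix tokens to Reverse Polish notation (shunting-yard)."""
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators that bind at least as tightly (left-to-right).
            while stack and stack[-1] in PRECEDENCE and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise SyntaxError("unbalanced ')'")
            stack.pop()  # discard the matching "("
        else:
            output.append(tok)  # operand
    while stack:
        top = stack.pop()
        if top == "(":
            raise SyntaxError("unbalanced '('")
        output.append(top)
    return output


def eval_rpn(rpn):
    """Evaluate an RPN token list with a simple value stack."""
    stack = []
    for tok in rpn:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()
```

For instance, `to_rpn("( 1 + 2 ) * 3".split())` gives `["1", "2", "+", "3", "*"]`: the brackets are gone and the evaluation order is encoded purely by position.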
- One-pass compilers (e.g. Pascal)
- Multi-pass compilers (e.g. Java)
- Load-and-go compilers (e.g. Dartmouth BASIC)
- Optimizing compilers (e.g. C)
- JIT compilers (e.g. the .NET runtime, which compiles MSIL)
- Transpilers (e.g. TypeScript)
Classical compiler scheme
- Implement scripting
- Improve flexibility
- Handle input
- Import data
- Generate code
- Increase efficiency
- What's the goal?
- Individual specification
- Existing specification
- Extending existing languages
- What's the purpose?
- What's the audience?
Your own language, no more ...
Extending existing languages
- Simple precedence parser
- Recursive descent parser
- Pratt parser
- Operator-precedence parser (e.g. Precedence climbing)
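
To make the last entry more concrete, here is a minimal sketch of precedence climbing in Python. The operator table, the whitespace-separated input and the tuple-based tree `(op, left, right)` are all assumptions for this illustration; a real parser would work on a token stream from the lexer and build proper AST nodes.

```python
# Assumed operator table: (precedence, associativity).
BINARY = {
    "+": (1, "left"), "-": (1, "left"),
    "*": (2, "left"), "/": (2, "left"),
    "^": (3, "right"),
}


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_atom(self):
        """An atom is an operand or a bracketed sub-expression."""
        tok = self.advance()
        if tok == "(":
            node = self.parse_expression(1)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or tok in BINARY or tok == ")":
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    def parse_expression(self, min_prec=1):
        """Consume operators whose precedence is at least `min_prec`."""
        left = self.parse_atom()
        while True:
            op = self.peek()
            if op not in BINARY or BINARY[op][0] < min_prec:
                return left
            prec, assoc = BINARY[op]
            self.advance()
            # Left-associative operators must not re-enter at the same level.
            next_min = prec + 1 if assoc == "left" else prec
            right = self.parse_expression(next_min)
            left = (op, left, right)


def parse(source):
    """Parse a whitespace-separated expression into nested tuples."""
    parser = Parser(source.split())
    tree = parser.parse_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

With this table, `parse("1 + 2 * 3")` groups the product first, and `parse("2 ^ 3 ^ 2")` nests to the right because `^` is declared right-associative. Adding an operator only requires a new table entry, which is the main attraction of this technique over one grammar rule per precedence level.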
- Some programs generate lexers / parsers
- e.g. Yacc, Bison, ANTLR, ...
- They have some advantages, but introduce some problems
- Main problem: Dependency on the generator
- Any language can be used
- OOP languages certainly have advantages
- The performance of the compiler itself usually does not matter much
- Testability, robustness and flexibility are more important
- C++, Java and C# offer a lot of possibilities
- A simple document language that transpiles to LaTeX
- Describing data formats efficiently
- A straightforward scripting language
- A code generator, e.g. for a specialized HPC application
- HTML5 parser
- CSS3 parser
- Builds the DOM for HTML documents
- This representation has an API
- Can be used, for example, to transform HTML to LaTeX
- A numerical programming language
- Loosely based on YAMP
- Compiles to MSIL
- External types (e.g. MKL) can be injected
- Perfect for scripting applications (e.g. analysis)
At least for the people who send me mail about a new language that they're designing, the general advice is: do it to learn about how to write a compiler.
Dennis Ritchie