What are the different phases of the compiler?
What is the role of each phase of the compiler?
If you are reading this article, you are probably looking for answers to these questions. I explain each phase of compiler design in detail, with an example.
Let’s dive in…
A compiler is a software program that converts high-level language code into machine-level code, or into another form of code that the computer's processor can understand.
Converting code from one language to another involves several intermediate processes. These intermediate processes are grouped into 6 phases.
In this post, we look at the 6 phases of the compiler, with an example for each.
This picture is from Aniruddha's handwritten notes. It should shed some light on the structure of a compiler and make the design easier to understand.
The lexical analyzer reads the character stream of the source program and groups the characters into meaningful sequences called tokens. This phase is also called scanning. It enters the corresponding tokens into the symbol table and passes the tokens on to the next phase, the syntax analyzer.
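To make the scanning step concrete, here is a minimal sketch of a lexical analyzer in Python. The token categories (`NUMBER`, `ID`, `OP`, `SEMI`) and the `tokenize` function are assumptions made for this illustration, not part of any particular compiler.

```python
import re

# Hypothetical token categories for a tiny C-like language (assumed for this sketch).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),
    ("ERROR",  r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Group the character stream into (kind, lexeme) tokens and record identifiers."""
    tokens = []
    symbol_table = {}  # identifier -> entry number
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "ERROR":
            raise ValueError(f"unexpected character {lexeme!r} at position {match.start()}")
        if kind == "ID":
            # Enter each identifier into the symbol table the first time it is seen.
            symbol_table.setdefault(lexeme, len(symbol_table) + 1)
        tokens.append((kind, lexeme))
    return tokens, symbol_table

tokens, symbols = tokenize("a = b + c;")
print(tokens)
print(symbols)  # {'a': 1, 'b': 2, 'c': 3}
```

The token list, not the raw characters, is what the syntax analyzer receives.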
Syntax analysis is also called parsing. In this phase, the tokens received from the lexical analyzer are used to build an intermediate tree-like data structure called the syntax tree. Each interior node of the tree is an operator, and its child nodes are the operands of that operator. The purpose of this phase is to check that the syntax of the given statements is correct and follows the rules predefined for the language. If the syntax is incorrect, the syntax analyzer reports a syntax error.
For example, it will accept a = b + c; but it will reject b + c = a; and report an error for it.
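The accept/reject behaviour above can be sketched with a small recursive-descent parser. The grammar it assumes (`ID = operand {(+|-) operand} ;`) and the helper names are hypothetical and chosen only for this illustration. The syntax tree is returned as nested tuples of the form `(operator, left, right)`.

```python
import re

def lex(source):
    # Minimal tokenizer for this sketch; characters it does not recognize are ignored.
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+\-*/;]", source)

def parse_statement(source):
    """Parse `ID = expr ;` and return the syntax tree as nested tuples."""
    tokens = lex(source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(symbol):
        nonlocal pos
        if peek() != symbol:
            raise SyntaxError(f"expected {symbol!r}, found {peek()!r}")
        pos += 1

    def operand(pattern=r"\w+", what="operand"):
        nonlocal pos
        tok = peek()
        if tok is None or not re.fullmatch(pattern, tok):
            raise SyntaxError(f"expected {what}, found {tok!r}")
        pos += 1
        return tok

    def expression():
        nonlocal pos
        node = operand()
        while peek() in ("+", "-"):
            op = peek()
            pos += 1
            node = (op, node, operand())  # the operator becomes the parent of its operands
        return node

    target = operand(r"[A-Za-z_]\w*", "identifier")
    expect("=")
    value = expression()
    expect(";")
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r} after ';'")
    return ("=", target, value)

tree = parse_statement("a = b + c;")
print(tree)  # ('=', 'a', ('+', 'b', 'c'))

try:
    parse_statement("b + c = a;")
except SyntaxError as err:
    print("syntax error:", err)
```

The second statement is rejected as soon as the parser sees `+` where the grammar requires `=`.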
The semantic analyzer uses the syntax tree from the previous phase, along with the symbol table, to check whether the source code is semantically consistent, i.e. whether it conveys a valid meaning.
One of the most important tasks of the semantic analyzer is type checking. If the language permits certain type conversions, called type coercions, the semantic analyzer performs them as well. If there is a type mismatch and no coercion rule makes the operation valid, it reports a semantic error.
For example, if the code adds an integer and a string and the language defines no coercion for that combination, it is the job of the semantic analyzer to report a semantic error.
Consider the following statements:
float a = 10.5; float total = a * 10;
The semantic analyzer will convert the integer 10 to the float 10.0, using an int-to-float conversion, before the multiplication is performed.
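Here is a rough sketch of type checking with coercion for the example above. The node shapes (`("num", 10)`, `("var", "a")`, `("str", "hi")`), the `inttofloat` marker, and the `SemanticError` class are assumptions made for this illustration. The sketch also assumes a language in which arithmetic is defined only on `int` and `float`.

```python
class SemanticError(Exception):
    """Raised when an expression cannot be given a valid type."""

def analyze(node, symbol_table):
    """Return (type, annotated_tree), inserting int-to-float coercions where needed."""
    kind = node[0]
    if kind == "num":
        return ("float" if isinstance(node[1], float) else "int"), node
    if kind == "str":
        return "string", node
    if kind == "var":
        if node[1] not in symbol_table:
            raise SemanticError(f"undeclared variable {node[1]!r}")
        return symbol_table[node[1]], node

    # Anything else is treated as a binary operator: (op, left, right).
    op, left, right = node
    ltype, left = analyze(left, symbol_table)
    rtype, right = analyze(right, symbol_table)
    if ltype == rtype and ltype in ("int", "float"):
        return ltype, (op, left, right)
    if {ltype, rtype} == {"int", "float"}:
        # Coerce the int operand so that both sides are float.
        if ltype == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return "float", (op, left, right)
    raise SemanticError(f"cannot apply {op!r} to {ltype} and {rtype}")

symbols = {"a": "float"}  # from the declaration: float a = 10.5;
result_type, annotated = analyze(("*", ("var", "a"), ("num", 10)), symbols)
print(result_type)  # float
print(annotated)    # ('*', ('var', 'a'), ('inttofloat', ('num', 10)))

try:
    analyze(("+", ("num", 1), ("str", "hello")), symbols)
except SemanticError as err:
    print("semantic error:", err)
```

The first call mirrors `total = a * 10`: the integer operand is wrapped in a conversion node. The second call has no applicable coercion rule, so it is reported as an error.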
After syntax analysis and semantic analysis, many compilers generate an explicit low-level, machine-like code called intermediate code. This code is produced in the intermediate code generator phase.
This code has two essential properties: it should be easy to produce, and it should be easy to translate into the target machine code.
The main purpose of generating this code is ease of translation to machine code, which is why it closely resembles assembly language. Three-address instructions, for example, are one form of intermediate code.
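As an illustration of three-address instructions, the sketch below flattens a syntax tree (nested tuples of the form `(operator, left, right)`) into a list of instructions with at most one operator each. The function name `to_three_address` and the temporary names `t1`, `t2`, ... are assumptions made for this example.

```python
import itertools

def to_three_address(tree):
    """Flatten an expression tree into three-address instructions."""
    code = []
    temps = (f"t{i}" for i in itertools.count(1))  # fresh temporary names

    def emit(node):
        if not isinstance(node, tuple):
            return str(node)  # a leaf: variable name or constant
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        if op == "=":
            code.append(f"{left_name} = {right_name}")
            return left_name
        temp = next(temps)
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    emit(tree)
    return code

# total = a * (b + 10)
tac = to_three_address(("=", "total", ("*", "a", ("+", "b", 10))))
for line in tac:
    print(line)
# t1 = b + 10
# t2 = a * t1
# total = t2
```

Each instruction has at most three addresses (two operands and one result), which keeps the later translation to machine instructions simple.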
Once the code is known to be syntactically and semantically correct, the compiler works on optimizing it for better performance. The optimized code is then converted into the target language by the compiler. Let's look at the back-end phases of the compiler with an example.
Code optimization is an optional phase that attempts to improve the machine-independent intermediate code so that the resulting program uses less time and power. Usually, the code is made shorter and simpler by combining steps or removing unnecessary ones, which yields the optimized code.
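One common example of "removing unnecessary steps" is constant folding: an operation whose operands are already known at compile time is computed by the compiler instead of at run time. The sketch below assumes that instructions are stored as `(dest, op, arg1, arg2)` tuples, that temporaries are named `t1`, `t2`, ..., and that the code is a straight-line sequence without branches. All of these are assumptions made for this illustration.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def is_temp(name):
    return re.fullmatch(r"t\d+", name) is not None

def fold_constants(code):
    """Fold operations on known constants and propagate the results forward."""
    known = {}      # temporary name -> constant value computed at compile time
    optimized = []
    for dest, op, arg1, arg2 in code:
        arg1 = known.get(arg1, arg1)
        arg2 = known.get(arg2, arg2)
        foldable = op in OPS and all(isinstance(a, (int, float)) for a in (arg1, arg2))
        if foldable and is_temp(dest):
            known[dest] = OPS[op](arg1, arg2)  # no instruction is emitted
            continue
        if foldable:
            # A named variable must still be assigned, but with the folded value.
            op, arg1, arg2 = "=", OPS[op](arg1, arg2), None
        known.pop(dest, None)
        optimized.append((dest, op, arg1, arg2))
    return optimized

# seconds = hours * (60 * 60)
code = [
    ("t1", "*", 60, 60),
    ("t2", "*", "hours", "t1"),
    ("seconds", "=", "t2", None),
]
optimized = fold_constants(code)
print(optimized)
# [('t2', '*', 'hours', 3600), ('seconds', '=', 't2', None)]
```

The multiplication `60 * 60` disappears from the generated code, and the constant `3600` is substituted wherever `t1` was used.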
The code generator finally converts the intermediate code, or the optimized code, into the target language. Usually, the target language is machine code.
For this reason, memory locations and registers are selected and assigned during this phase. The code generated by this phase is what gets executed to take inputs and produce the desired outputs.
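The sketch below gives a rough idea of this translation for a made-up register machine. The instruction names (`LOAD`, `STORE`, `ADD`, `MUL`, ...), the single register `R1`, and the `(dest, op, arg1, arg2)` instruction format are all assumptions made for this example. A real code generator targets an actual instruction set and allocates registers far more carefully.

```python
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def operand(arg):
    # Constants become immediates (written #10); names refer to memory locations.
    return f"#{arg}" if isinstance(arg, (int, float)) else arg

def generate(code):
    """Translate (dest, op, arg1, arg2) tuples into instructions for a made-up machine."""
    asm = []
    in_r1 = None  # name of the variable whose value R1 currently holds
    for dest, op, arg1, arg2 in code:
        if arg1 != in_r1:
            asm.append(f"LOAD R1, {operand(arg1)}")   # skipped when R1 already holds arg1
        if op != "=":
            asm.append(f"{MNEMONIC[op]} R1, {operand(arg2)}")
        asm.append(f"STORE {dest}, R1")
        in_r1 = dest
    return asm

# total = a * 10.0
asm = generate([
    ("t1", "*", "a", 10.0),
    ("total", "=", "t1", None),
])
for line in asm:
    print(line)
# LOAD R1, a
# MUL R1, #10.0
# STORE t1, R1
# STORE total, R1
```

Even this tiny generator makes one register decision: it remembers what `R1` holds so that it can skip a redundant `LOAD`.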
The symbol table and the error handler interact with all the phases, because each phase discovers new details about the program and the symbol table is updated accordingly. Likewise, errors such as lexical errors, syntax errors, and semantic errors can be detected in different phases and are reported through the error handler.
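A minimal sketch of a symbol table that also collects errors might look like the following. The class name, the method names, and the fields stored for each entry are assumptions made for this illustration. Real compilers usually also track scopes, storage locations, and more.

```python
class SymbolTable:
    """Stores what the compiler knows about each name and collects reported errors."""

    def __init__(self):
        self.entries = {}
        self.errors = []

    def declare(self, name, type_, line):
        if name in self.entries:
            self.errors.append(f"line {line}: {name!r} is already declared")
            return
        self.entries[name] = {"type": type_, "line": line}

    def lookup(self, name, line):
        entry = self.entries.get(name)
        if entry is None:
            self.errors.append(f"line {line}: {name!r} is not declared")
        return entry

table = SymbolTable()
table.declare("a", "float", line=1)
table.declare("total", "float", line=2)
table.declare("a", "int", line=3)   # duplicate declaration
table.lookup("count", line=4)       # use of an undeclared name

print(table.lookup("a", line=5))    # {'type': 'float', 'line': 1}
print(table.errors)
```

Every phase can use the same table: the lexical analyzer enters names, the semantic analyzer reads and updates their types, and the code generator can look up where each name is stored.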
Compilers and interpreters are often confused with each other. You can read about the difference between a compiler and an interpreter to clear up any doubt.
I hope this helps you understand the structure of the compiler and all of its phases, with an example for each. If you have any doubts, feel free to comment below.