Language Processor (LP) 2018 · Prof. Imran Ahmad

Roll No:____

Name:__________________

Sem:_______Section______


CERTIFICATE

Certified that this file is submitted by

Shri/Ku.___________________________________________________________

Roll No.________a student of ________ year of the course __________________

______________________________________ as a part of PRACTICAL/ORAL as

prescribed by the Rashtrasant Tukadoji Maharaj Nagpur University for the

subject_____________________________________ in the laboratory of

___________________________________during the academic year

_________________________ and that I have instructed him/her for the said work,

from time to time and I found him/her to be satisfactorily progressive.

And that I have assessed the said work and I am satisfied that the same is up to the

standard envisaged for the course.

Date:-

Signature & Name of Subject Teacher          Signature & Name of HOD



Anjuman College of Engineering and Technology Vision

To be a centre of excellence for developing quality technocrats with moral and social

ethics, to face the global challenges for the sustainable development of society.

Mission

To create conducive academic culture for learning and identifying career goals.

To provide quality technical education, research opportunities and imbibe

entrepreneurship skills contributing to the socio-economic growth of the Nation.

To inculcate values and skills, that will empower our students towards development

through technology.

Vision and Mission of the Department

Vision:

To achieve excellent standards of quality education in the field of computer science

and engineering, aiming towards development of ethically strong technical experts

contributing to the profession in the global society.

Mission:

To create outcome based education environment for learning and identifying career

goals.

To provide the latest tools in a learning ambience to enhance innovation, problem solving

skills, leadership qualities, team spirit and ethical responsibilities.

Inculcating awareness through innovative activities in the emerging areas of

technology.



Program Educational Objectives (PEOs)

The graduates will have a strong foundation in mathematical, scientific and

engineering fundamentals necessary to formulate, solve and analyze engineering

problems in their career.

Graduates will be able to create and design computer support systems and impart

knowledge and skills to analyze, design, test and implement various software

applications.

Graduates will work productively as computer science engineers towards betterment

of society exhibiting ethical qualities.

Program Specific Outcomes (PSOs)

Foundation of mathematical concepts: To use mathematical methodologies and

techniques for computing and solving problem using suitable mathematical analysis,

data structures, database and algorithms as per the requirement.

Foundation of Computer System: The capability and ability to interpret and

understand the fundamental concepts and methodology of computer systems and

programming. Students can understand the functionality of hardware and software

aspects of computer systems, networks and security.

Foundations of Software development: The ability to grasp the software development

lifecycle and methodologies of software system and project development.



PROGRAM: CSE DEGREE: B.E

COURSE: Language Processor SEMESTER: VII CREDITS: 1

COURSE CODE: BECSE402T COURSE TYPE: REGULAR

COURSE AREA/DOMAIN: Programming

Languages/Theoretical Foundations of Computer

Science

CONTACT HOURS: 2 hours/Week.

CORRESPONDING LAB COURSE CODE :

BECSE402P

LAB COURSE NAME : Language Processor Lab

COURSE PRE-REQUISITES:

C.CODE | COURSE NAME | DESCRIPTION | SEM

BECSE211T | Theoretical Foundations of Computer Science | Finite Automata, Models of Computation, Context-Free Grammar, etc. | 4

BECSE202T | Advance “C” & Programming Logic Design | Basic programming knowledge | 3

LAB COURSE OBJECTIVES:

To introduce the major concept areas of language translation and compiler design.

To enrich the knowledge in various phases of compiler and its use, code optimization

techniques, machine code generation, and use of symbol table.

To extend the knowledge of parser by parsing LL parser and LR parser.

To provide practical programming skills necessary for constructing a compiler.

COURSE OUTCOMES: Language Processor

After completion of this course the students will be able -

SNO | DESCRIPTION | BLOOM’S TAXONOMY LEVEL

CO.1 | To understand fundamentals of compiler. | LEVEL 2

CO.2 | To apply the knowledge of patterns, tokens & regular expressions for solving a problem. | LEVEL 3

CO.3 | To design & conduct experiments for symbol table in compiler. | LEVEL 5

CO.4 | To design & conduct experiments for parser in compiler. | LEVEL 5

CO.5 | To acquire the knowledge of modern compiler & its features. | LEVEL 2

CO.6 | To apply the knowledge of lex tool & yacc tool to construct a scanner & parser. | LEVEL 3



Lab Instructions:

Make entry in the Log Book as soon as you enter the Laboratory.

All the students should sit according to their Roll Numbers.

All the students are supposed to enter the terminal number in the Log Book.

Do not change the terminal on which you are working.

Strictly observe the instructions given by the Faculty / Lab. Instructor.

Take permission before entering in the lab and keep your belongings in the

racks.

NO FOOD, DRINK, IN ANY FORM is allowed in the lab.

TURN OFF CELL PHONES! If you need to use it, please keep it in bags.

Avoid all horseplay in the laboratory. Do not misbehave in the computer

laboratory. Work quietly.

Save often and keep your files organized.

Don’t change settings and surf safely.

Do not reboot, turn off, or move any workstation or PC.

Do not load any software on any lab computer (without prior permission of

Faculty and Technical Support Personnel). Only Lab Operators and Technical

Support Personnel are authorized to carry out these tasks.

Do not reconfigure the cabling/equipment without prior permission.

Do not play games on systems.

Turn off the machine once you are done using it.

Violation of the above rules and etiquette guidelines will result in disciplinary

action.



Continuous Assessment Practical

Exp No | NAME OF EXPERIMENT | Date | Sign | Remark

1 | INTRODUCTION TO ‘C’ COMPILER.

2 | WRITE A PROGRAM TO ELIMINATE WHITE SPACES FROM AN INPUT FILE USING ARRAYS.

3 | WRITE A PROGRAM TO IDENTIFY KEYWORDS FROM AN INPUT FILE.

4 | WRITE A PROGRAM TO FIND WHETHER THE STRING IS IDENTIFIER OR NOT.

5 | WRITE A PROGRAM FOR IMPLEMENTING SYMBOL TABLE.

6 | WRITE A PROGRAM TO FIND FIRST() FUNCTION FROM A GIVEN GRAMMAR.

7 | WRITE A PROGRAM TO FIND FOLLOW() FUNCTION FROM A GIVEN GRAMMAR.

8 | WRITE A PROGRAM TO IMPLEMENT DIRECTED ACYCLIC GRAPH (DAG).

9 | STUDY OF LEX.

10 | STUDY OF YACC.



CONTENTS

Exp No | NAME OF EXPERIMENT

1 | INTRODUCTION TO ‘C’ COMPILER.

2 | WRITE A PROGRAM TO ELIMINATE WHITE SPACES FROM AN INPUT FILE USING ARRAYS.

3 | WRITE A PROGRAM TO IDENTIFY KEYWORDS FROM AN INPUT FILE.

4 | WRITE A PROGRAM TO FIND WHETHER THE STRING IS IDENTIFIER OR NOT.

5 | WRITE A PROGRAM FOR IMPLEMENTING SYMBOL TABLE.

6 | WRITE A PROGRAM TO FIND FIRST() FUNCTION FROM A GIVEN GRAMMAR.

7 | WRITE A PROGRAM TO FIND FOLLOW() FUNCTION FROM A GIVEN GRAMMAR.

8 | WRITE A PROGRAM TO IMPLEMENT DIRECTED ACYCLIC GRAPH (DAG).

9 | STUDY OF LEX.

10 | STUDY OF YACC.



EXPERIMENT NO – 1

Aim: INTRODUCTION TO ‘C’ COMPILER.

Theory:

‘C’ Compilers:

“Compilation” is the translation of a program written in a source language (here, a C

program) into a semantically equivalent program written in a target language.

OR

A compiler is a program that reads a program written in a high-level language,

converts it into machine or low-level language, and reports the errors present in the

program. It may convert the entire source code in one go or take multiple passes to

do so; in either case, the user finally gets compiled code that is ready to execute.

The compilation process is a sequence of various phases. Each phase takes input from

its previous stage, has its own representation of source program, and feeds its output to

the next phase of the compiler. Let us understand the phases of a compiler.

Phases of compiler:




Lexical Analysis

The first phase of the compiler works as a text scanner. This phase scans the source code

as a stream of characters and converts it into meaningful lexemes. The lexical analyzer

represents these lexemes in the form of tokens as: <token-name, attribute-value>
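The grouping of characters into lexemes can be illustrated with a short sketch. The token classes and the function name next_token below are assumptions made for this illustration only; a real C scanner recognizes many more classes (keywords, multi-character operators, string literals and so on).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token classes, used only in this sketch. */
enum TokenKind { TOK_END, TOK_ID, TOK_NUM, TOK_OP };

/* Reads one token starting at *src, copies its lexeme into buf
   (at most cap-1 characters, cap must be at least 1) and advances
   *src past the token. */
enum TokenKind next_token(const char **src, char *buf, size_t cap)
{
    const char *s = *src;
    size_t n = 0;
    enum TokenKind kind;

    while (isspace((unsigned char)*s))          /* skip whitespace between tokens */
        s++;

    if (*s == '\0') {
        kind = TOK_END;                         /* nothing left to scan */
    } else if (isalpha((unsigned char)*s) || *s == '_') {
        kind = TOK_ID;                          /* letter or '_' starts an identifier */
        while (isalnum((unsigned char)*s) || *s == '_') {
            if (n + 1 < cap) buf[n++] = *s;
            s++;
        }
    } else if (isdigit((unsigned char)*s)) {
        kind = TOK_NUM;                         /* run of digits */
        while (isdigit((unsigned char)*s)) {
            if (n + 1 < cap) buf[n++] = *s;
            s++;
        }
    } else {
        kind = TOK_OP;                          /* any other single character */
        if (n + 1 < cap) buf[n++] = *s;
        s++;
    }
    buf[n] = '\0';
    *src = s;
    return kind;
}
```

Calling next_token repeatedly on the text "sum = a1 + 42;" yields the lexemes sum, =, a1, +, 42 and ; in that order, followed by TOK_END.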

Syntax Analysis

The next phase is called the syntax analysis or parsing. It takes the tokens produced by

lexical analysis as input and generates a parse tree (or syntax tree). In this phase,

token arrangements are checked against the source code grammar, i.e. the parser

checks if the expression made by the tokens is syntactically correct.

Semantic Analysis

Semantic analysis checks whether the parse tree constructed follows the rules of the

language. For example, it checks that values are assigned between compatible data types

and reports errors such as adding a string to an integer.



Intermediate Code Generation

After semantic analysis the compiler generates an intermediate code of the source

code for the target machine. It represents a program for some abstract machine. It is

in between the high-level language and the machine language.

Code Optimization

The next phase does code optimization of the intermediate code. Optimization

removes unnecessary lines of code and rearranges the

sequence of statements in order to speed up program execution without wasting

resources (CPU, memory).

Code Generation

In this phase, the code generator takes the optimized representation of the

intermediate code and maps it to the target machine language.

Symbol Table

It is a data structure maintained throughout all the phases of a compiler. All the

identifiers' names along with their types are stored here.

Conclusion : We have successfully studied the practical.

Viva Voce Question

1. What is Compiler?

________________________________________________________________

________________________________________________________________

2. What is Token?

________________________________________________________________

________________________________________________________________

3. Justify why the code optimization phase is an optional phase.

________________________________________________________________

________________________________________________________________

________________________________________________________________

Signature of Subject Teacher



EXPERIMENT NO – 2

Aim: WRITE A PROGRAM TO ELIMINATE WHITE SPACES FROM AN INPUT

FILE USING ARRAYS.

Theory:

In computer programming, whitespace is any character or series of characters that

represent horizontal or vertical space in typography. When rendered, a whitespace

character does not correspond to a visible mark, but typically does occupy an area on a

page. For example, the common whitespace

symbol U+0020 SPACE (HTML &#32;), also ASCII 32, represents a blank

space punctuation character in text, used as a word divider in Western scripts.
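Besides the blank space, C treats tab, newline, vertical tab, form feed and carriage return as whitespace, and the standard isspace() function tests for all of them. The following is a minimal sketch, not the prescribed program, of removing every whitespace character from a character array in place; the function name strip_whitespace is an assumption made for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Removes every whitespace character (space, tab, newline, ...) from
   the string s in place and returns the new length. */
size_t strip_whitespace(char *s)
{
    size_t read = 0, write = 0;

    while (s[read] != '\0') {
        if (!isspace((unsigned char)s[read]))   /* keep only visible characters */
            s[write++] = s[read];
        read++;                                 /* read index always advances */
    }
    s[write] = '\0';                            /* terminate the shortened string */
    return write;
}
```

For the array " int\tx =\n 10 ;" the function leaves "intx=10;" in the array and returns 8.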

Program:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char aj[1000], mj[1000];
    int i = 0, j = 0;

    printf("\n\nEnter the string: ");
    if (fgets(aj, sizeof aj, stdin) == NULL)   /* fgets replaces the unsafe gets() */
        return 1;
    aj[strcspn(aj, "\n")] = '\0';              /* drop the trailing newline */

    while (aj[i] != '\0')                      /* until the string terminates */
    {
        if (aj[i] != ' ')                      /* if the char is not a white space */
            mj[j++] = aj[i];                   /* j is incremented only when a char is kept */
        i++;                                   /* i is incremented irrespective of the spaces */
    }
    mj[j] = '\0';

    printf("\n\nThe string after removing all the spaces is: %s", mj);
    printf("\n\n\t\t\tCoding is Fun !\n\n\n");
    return 0;
}

Output:



Viva Voce Question

1. What is whitespace? Give an example.

________________________________________________________________

________________________________________________________________

2. What is the ASCII code for the space character?

________________________________________________________________

________________________________________________________________

3. Explain the logic of this code ?

________________________________________________________________

________________________________________________________________

________________________________________________________________




EXPERIMENT NO – 3

Aim: WRITE A PROGRAM TO IDENTIFY KEYWORDS FROM AN INPUT FILE.

Theory: ANSI C has 32 keywords. Keywords are reserved words that are predefined in the language

and have a special meaning to the compiler. Because of this, a keyword cannot be used for any other

purpose in a program, for example as a variable name; doing so would leave the compiler unable to

tell the two uses apart.
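Because the keyword list is fixed, it can be kept in sorted order and searched with the standard bsearch() function instead of a linear scan. The sketch below is one possible approach, not the prescribed program; the names keywords, cmp_keyword and is_keyword are assumptions made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* The 32 ANSI C keywords, kept in strcmp() order so bsearch() can be used. */
static const char *const keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof", "static",
    "struct", "switch", "typedef", "union", "unsigned", "void", "volatile", "while"
};

/* bsearch() passes the search key first and a pointer to an array element second. */
static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns 1 if word is an ANSI C keyword, 0 otherwise. */
int is_keyword(const char *word)
{
    return bsearch(word, keywords, sizeof keywords / sizeof keywords[0],
                   sizeof keywords[0], cmp_keyword) != NULL;
}
```

Note that the comparison is case sensitive, as C itself is: "while" is a keyword, "While" is an ordinary identifier.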

Program:

#include <stdio.h>
#include <string.h>

int main(void)
{
    int i, flag = 0;
    /* the 32 keywords of ANSI C */
    char s[32][10] = {"if","else","goto","continue","return","auto","break","case",
                      "char","const","default","do","double","enum","extern","float",
                      "for","int","long","register","short","signed","sizeof","static",
                      "struct","switch","typedef","union","unsigned","void","volatile","while"};
    char st[32];

    printf("\n enter the string :");
    if (fgets(st, sizeof st, stdin) == NULL)   /* fgets replaces the unsafe gets() */
        return 1;
    st[strcspn(st, "\n")] = '\0';              /* drop the trailing newline */

    for (i = 0; i < 32; i++)
    {
        if (strcmp(st, s[i]) == 0)             /* input matches a keyword */
            flag = 1;
    }
    if (flag == 0)
        printf("\n it is not a keyword");
    else
        printf("\n it is a keyword");
    return 0;
}

Output:

Viva Voce Question

1. What is a keyword? Give an example.

________________________________________________________________

________________________________________________________________

2. What is ASCII code for keyword ?

________________________________________________________________

________________________________________________________________


________________________________________________________________

________________________________________________________________




EXPERIMENT NO – 4

Aim: WRITE A PROGRAM TO FIND WHETHER THE STRING IS IDENTIFIER

OR NOT.

Theory:

1. The first character must be a letter (uppercase or lowercase, i.e. A-Z, a-z) or an underscore (_).

2. The remaining characters may be letters, digits (0-9) or underscores (_);

special symbols (#, $, %, ^, & etc.) and spaces are not allowed.

3. A keyword cannot be used as an identifier.

Program:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* The 32 ANSI C keywords; a keyword is never a valid identifier.
   printf is a library function, not a keyword, so it is not listed. */
char keyword[][10] = {"auto","break","case","char","const","continue","default",
                      "double","else","enum","extern","float","for","goto","if","int","do",
                      "long","register","return","short","signed","sizeof","static","struct",
                      "switch","typedef","union","unsigned","void","volatile","while"};
#define SIZE 25

int main(void)
{
    char str[SIZE];
    int len, i, flag = 0;
    int nkeywords = sizeof keyword / sizeof keyword[0];

    printf("Enter The C Identifier :> ");
    if (fgets(str, SIZE, stdin) == NULL)       /* fgets replaces the unsafe gets() */
        return 1;
    str[strcspn(str, "\n")] = '\0';            /* drop the trailing newline */

    for (i = 0; i < nkeywords; i++)            /* rule 3: reject keywords */
    {
        if (strcmp(str, keyword[i]) == 0)
        {
            printf(" Given string is invalid identifier\n");
            return 0;
        }
    }
    if (str[0] == '_' || isalpha((unsigned char)str[0]))   /* rule 1: first character */
    {
        flag = 1;
        len = strlen(str);
        for (i = 1; i < len; i++)              /* rule 2: remaining characters */
        {
            if (str[i] == '_' || isalnum((unsigned char)str[i]))
                continue;
            flag = 0;
            break;
        }
    }
    if (flag == 1)
        printf("Given string is a valid C Identifier\n");
    else
        printf("Given string is invalid C identifier\n");
    return 0;
}

Output:

Viva Voce Question

1. What is an identifier? Give an example.

________________________________________________________________

________________________________________________________________

2. What is ASCII code for identifier ?


________________________________________________________________

________________________________________________________________


________________________________________________________________

________________________________________________________________




EXPERIMENT NO – 5

Aim: WRITE A PROGRAM FOR IMPLEMENTING SYMBOL TABLE.

Theory:

A symbol table is an important data structure created and maintained by compilers in order to store

information about the occurrence of various entities such as variable names, function names, objects,

classes, interfaces, etc. The symbol table is used by both the analysis and the synthesis parts of a

compiler.

A symbol table may serve the following purposes, depending upon the language in hand:

• To store the names of all entities in a structured form at one place.

• To verify if a variable has been declared.

• To implement type checking, by verifying that assignments and expressions in the source

code are semantically correct.

• To determine the scope of a name (scope resolution).

A symbol table is simply a table, which can be either a linear list or a hash table. It maintains an

entry for each name in the following format: <symbol name, type, attribute>

Implementation

If a compiler is to handle a small amount of data, then the symbol table can be implemented

as an unordered list, which is easy to code but suitable only for small tables. A

symbol table can be implemented in one of the following ways:

• Linear (sorted or unsorted) list

• Binary Search Tree

• Hash table
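The program in this experiment uses the linear list. For comparison, the hash table option can be sketched as follows. This is only an illustration under stated assumptions: the bucket count, the hash function, the field sizes and the names st_insert and st_lookup are all choices made for this example, and addr stands in for the "attribute" part of the <symbol name, type, attribute> entry.

```c
#include <stdlib.h>
#include <string.h>

#define BUCKETS 31                 /* assumed table size for this sketch */

struct Entry {
    char name[32];                 /* symbol name */
    char type[16];                 /* type, e.g. "int" */
    int addr;                      /* attribute: here, an address */
    struct Entry *next;            /* next entry in the same bucket */
};

static struct Entry *table[BUCKETS];

/* Simple multiplicative string hash, reduced to a bucket index. */
static unsigned hash(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Returns the entry for name, or NULL if it is not in the table. */
struct Entry *st_lookup(const char *name)
{
    struct Entry *e;
    for (e = table[hash(name)]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* Inserts a new symbol. Returns the new entry, or NULL if the name
   is already present or memory could not be allocated. */
struct Entry *st_insert(const char *name, const char *type, int addr)
{
    unsigned h = hash(name);
    struct Entry *e;

    if (st_lookup(name) != NULL)   /* duplicates are rejected */
        return NULL;
    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    strncpy(e->name, name, sizeof e->name - 1);
    e->name[sizeof e->name - 1] = '\0';
    strncpy(e->type, type, sizeof e->type - 1);
    e->type[sizeof e->type - 1] = '\0';
    e->addr = addr;
    e->next = table[h];            /* push onto the front of the bucket chain */
    table[h] = e;
    return e;
}
```

With a reasonable hash function, a lookup inspects only the few entries that share a bucket, whereas the linear list must be scanned from the beginning on every search.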

Program:

#include<stdio.h>

#include<conio.h>

#include<string.h>

#include<stdlib.h> /* declares malloc() and exit(); NULL is already defined by the standard headers */

int size=0;

void Insert();

void Display();

void Delete();

int Search(char lab[]);

void Modify();

struct SymbTab

{



char label[10],symbol[10];

int addr;

struct SymbTab *next;};

struct SymbTab *first,*last;

void main()

{

int op,y;

char la[10];

clrscr();

do

{

printf("\n\tSYMBOL TABLE IMPLEMENTATION\n");

printf("\n\t1.INSERT\n\t2.DISPLAY\n\t3.DELETE\n\t4.SEARCH\n\t5.MODIFY\n\t6.END\n");

printf("\n\tEnter your option : ");

scanf("%d",&op);

switch(op)

{

case 1:

Insert();

break;

case 2:

Display();

break;

case 3:

Delete();

break;

case 4:

printf("\n\tEnter the label to be searched : ");

scanf("%s",la);

y=Search(la);

printf("\n\tSearch Result:");

if(y==1)

printf("\n\tThe label is present in the symbol table\n");

else

printf("\n\tThe label is not present in the symbol table\n");

break;

case 5:

Modify();

break;

case 6:

exit(0);

}

}while(op<6);

getch();

}

void Insert()

{

int n;

char l[10];

printf("\n\tEnter the label : ");

scanf("%s",l);

n=Search(l);

if(n==1)

printf("\n\tThe label exists already in the symbol table\n\tDuplicate can't be inserted");

else

{

struct SymbTab *p;



p=malloc(sizeof(struct SymbTab));

strcpy(p->label,l);

printf("\n\tEnter the symbol : ");

scanf("%s",p->symbol);

printf("\n\tEnter the address : ");

scanf("%d",&p->addr);

p->next=NULL;

if(size==0)

{

first=p;

last=p;

}

else

{

last->next=p;

last=p;

}

size++;

printf("\n\tLabel inserted\n"); /* reported only when the label was really added */

}

}

void Display()

{

int i;

struct SymbTab *p;

p=first;

printf("\n\tLABEL\t\tSYMBOL\t\tADDRESS\n");

for(i=0;i<size;i++)

{

printf("\t%s\t\t%s\t\t%d\n",p->label,p->symbol,p->addr);

p=p->next;

}

}

int Search(char lab[])

{

int i,flag=0;

struct SymbTab *p;

p=first;

for(i=0;i<size;i++)

{

if(strcmp(p->label,lab)==0)

flag=1;

p=p->next;

}

return flag;

}

void Modify()

{

char l[10],nl[10];

int add,choice,i,s;

struct SymbTab *p;

p=first;

printf("\n\tWhat do you want to modify?\n");

printf("\n\t1.Only the label\n\t2.Only the address\n\t3.Both the label and address\n");

printf("\tEnter your choice : ");

scanf("%d",&choice);

switch(choice)



{

case 1:

printf("\n\tEnter the old label : ");

scanf("%s",l);

s=Search(l);

if(s==0)

printf("\n\tLabel not found\n");

else

{

printf("\n\tEnter the new label : ");

scanf("%s",nl);

for(i=0;i<size;i++)

{

if(strcmp(p->label,l)==0)

strcpy(p->label,nl);

p=p->next;

}

printf("\n\tAfter Modification:\n");

Display();

}

break;

case 2:

printf("\n\tEnter the label where the address is to be modified : ");

scanf("%s",l);

s=Search(l);

if(s==0)

printf("\n\tLabel not found\n");

else

{

printf("\n\tEnter the new address : ");

scanf("%d",&add);

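for(i=0;i<size;i++)

{

if(strcmp(p->label,l)==0) /* change only the matching entry */

p->addr=add;

p=p->next;

}

printf("\n\tAfter Modification:\n");

Display();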

}

break;

case 3:

printf("\n\tEnter the old label : ");

scanf("%s",l);

s=Search(l);

if(s==0)

printf("\n\tLabel not found\n");

else

{

printf("\n\tEnter the new label : ");

scanf("%s",nl);

printf("\n\tEnter the new address : ");

scanf("%d",&add);

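for(i=0;i<size;i++)

{

if(strcmp(p->label,l)==0) /* change only the matching entry */

{

strcpy(p->label,nl);

p->addr=add;

}

p=p->next;

}

printf("\n\tAfter Modification:\n");

Display();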

}

break;

}

}

void Delete()

{

int a;

char l[10];

struct SymbTab *p,*q;

p=first;

printf("\n\tEnter the label to be deleted : ");

scanf("%s",l);

a=Search(l);

if(a==0)

printf("\n\tLabel not found\n");

else

{

if(strcmp(first->label,l)==0)

first=first->next;

else if(strcmp(last->label,l)==0)

{

q=p->next;

while(strcmp(q->label,l)!=0)

{

p=p->next;

q=q->next;

}

p->next=NULL;

last=p;

}

else

{

q=p->next;

while(strcmp(q->label,l)!=0)

{

p=p->next;

q=q->next;

}

p->next=q->next;

}

size--;

printf("\n\tAfter Deletion:\n");

Display();

}

}



OUTPUT:

INSERTION:

DISPLAY:

DELETION:

SEARCH:


MODIFICATION:

OR

#include <iostream>

#include <fstream>

#include <stdio.h>

#include <string.h>

#include <stdbool.h>

using namespace std;

int main(){

char

key[32][10]={"break","auto","case","const","char","continue","default","do","double","else","enum","extern"

,"float","for","goto","if","int","long","register","return","short","signed","sizeof","static","struct","switch","ty

pedef","union","unsigned","void","volatile","while"};

char c;

int i,count=0,k,d;

char s[20],id[25],u[10];

ifstream fin("input.cpp"); // open the file to be scanned for reading

int index = 0,break_index = 0 , chunk_index = 0 ;

char temp[10000] = {0}; // zero-filled so that strlen(temp) always finds a terminator


int memory = 10245;

cout<<"Variable DataType Memory\n";

while(fin)

{

fin.get(c);

temp[break_index] = c;

if(c==' ' || c == '\n' || c == ',' || c == '}') {

//Clear the space

char newtemp[10000] = {0}; // zero-filled so that strlen(newtemp) always finds a terminator

int newIndex = 0;

for(int _i = 0 ; _i < strlen(temp) ; _i++){

if(temp[_i] != ' '){

newtemp[newIndex] = temp[_i];

newIndex++;

}

}

for(int __i = 0 ; __i < strlen(newtemp) ; __i++){

if(newtemp[__i] == ',' ||newtemp[__i] == ' ' || newtemp[__i] == ';' || newtemp[__i] == '('){

newtemp[__i] = '\0';

break;

}

}

static char lastDataType[10] = ""; // static: remembers the most recent keyword across tokens

for(int _i = 0 ; _i < 32; _i++){

if(strcmp(key[_i], newtemp) == 0){

strcpy(lastDataType,key[_i]);

break;

}

}

if(strcmp(newtemp,"int") == 0 || strcmp(newtemp,"char") == 0 || strcmp(newtemp,"float") == 0||

strcmp(newtemp,"double") == 0|| strcmp(newtemp,"long") == 0){

}

else if(newtemp[0] == '#'){

}else if(strcmp(newtemp,"main")==0 || strcmp(newtemp,"return")==0){

}

else if(newtemp[0] == ' '){ }

else if(newtemp[0] == '<' || newtemp[0] == '}' || newtemp[0] == '{'){

}

else if(newtemp[0] == '\n'){

}else if(strcmp(lastDataType,"int")==0){

memory += 2 ;

cout<< " "<<newtemp << " \t " << lastDataType << "\t " << memory << "\n";

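}else if(strcmp(lastDataType,"char")==0){

memory += 1 ;

cout<< " "<<newtemp << " \t " << lastDataType << "\t " << memory << "\n";

}else if(strcmp(lastDataType,"float")==0){

memory += 4 ;

cout<< " "<<newtemp << " \t " << lastDataType << "\t " << memory << "\n";

}else if(strcmp(lastDataType,"double")==0){

memory += 8 ; // a double occupies 8 bytes

cout<< " "<<newtemp << " \t " << lastDataType << "\t " << memory << "\n";

}else if(strcmp(lastDataType,"long")==0){

memory += 4 ; // a long occupies 4 bytes on the 16-bit target assumed by the 2-byte int above

cout<< " "<<newtemp << " \t " << lastDataType << "\t " << memory << "\n";

}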

break_index = 0 ;

//Clear the variable temp and new temp

for ( int i = 0 ; i< 40; i++){

newtemp[i] = ' ';

temp[i] = ' ';

}

}else{

}

break_index++;

}

}

Output:

Viva Voce Question

1. What is symbol table ?

________________________________________________________________

________________________________________________________________

2. What data structures can be used to implement a symbol table?

________________________________________________________________

________________________________________________________________


________________________________________________________________

________________________________________________________________




EXPERIMENT NO – 6

Aim: WRITE A PROGRAM TO FIND FIRST() FUNCTION FROM A GIVEN

GRAMMAR.

Theory:

Apply following rules:

1. If X is terminal, FIRST(X) = {X}.

2. If X → ε is a production, then add ε to FIRST(X).

3. If X is a non-terminal, and X → Y1 Y2 … Yk is a production, and ε is in all of FIRST(Y1), …,

FIRST(Yk), then add ε to FIRST(X).

4. If X is a non-terminal, and X → Y1 Y2 … Yk is a production, then add a to FIRST(X) if for some

i, a is in FIRST(Yi), and ε is in all of FIRST(Y1), …, FIRST(Yi-1).

Applying rules 1 and 2 is obvious. Applying rules 3 and 4 for FIRST(Y1 Y2 … Yk) can be done as

follows:

Add all the non-ε symbols of FIRST(Y1) to FIRST(Y1 Y2 … Yk). If ε ∈ FIRST(Y1), add all the non-ε

symbols of FIRST(Y2). If ε ∈ FIRST(Y1) and ε ∈ FIRST(Y2), add all the non-ε symbols of

FIRST(Y3), and so on. Finally, add ε to FIRST(Y1 Y2 … Yk) if ε ∈ FIRST(Yi), for all 1 ≤ i ≤ k.

Example:

Consider the following grammar.

E → E + T | T

T → T * F | F

F → (E) | id

Grammar after removing left recursion:

E → TX

X → +TX | ε

T → FY

Y → *FY | ε

F → (E) | id



For the above grammar, following the above rules, the FIRST sets could be computed as follows:

FIRST(E) = FIRST(T) = FIRST(F) = {(, id}

FIRST(X) = {+, ε}

FIRST(Y) = {*, ε}
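The rules above can be applied repeatedly until no FIRST set changes any more. The sketch below does this for the example grammar; it is an illustration under assumptions, not the prescribed program. Every grammar symbol is a single character, uppercase letters are non-terminals, 'i' stands for id, '#' stands for ε, and the names prods, compute_first and in_first are made up for this example.

```c
#include <ctype.h>
#include <string.h>

#define EPS '#'                              /* assumed marker for epsilon */

/* Example grammar after removing left recursion, one alternative per
   string, written as LHS=RHS with 'i' standing for id. */
static const char *prods[] = {
    "E=TX", "X=+TX", "X=#", "T=FY", "Y=*FY", "Y=#", "F=(E)", "F=i"
};
#define NPRODS ((int)(sizeof prods / sizeof prods[0]))

/* first[A - 'A'][c] is nonzero when symbol c is in FIRST(A). */
static char first[26][128];

/* Adds c to set; returns 1 if the set changed. */
static int add(char *set, int c)
{
    if (set[c]) return 0;
    set[c] = 1;
    return 1;
}

void compute_first(void)
{
    int changed = 1, p, c;

    memset(first, 0, sizeof first);
    while (changed) {                        /* repeat until nothing new is added */
        changed = 0;
        for (p = 0; p < NPRODS; p++) {
            char *fx = first[prods[p][0] - 'A'];
            const char *rhs = prods[p] + 2;  /* skip "X=" */
            int all_eps = 1;                 /* every symbol so far derives epsilon */

            for (; *rhs != '\0' && all_eps; rhs++) {
                if (*rhs == EPS)
                    continue;                /* X -> epsilon is handled after the loop (rule 2) */
                if (!isupper((unsigned char)*rhs)) {
                    changed |= add(fx, *rhs);            /* rule 1: terminal */
                    all_eps = 0;
                } else {
                    const char *fy = first[*rhs - 'A'];
                    for (c = 0; c < 128; c++)            /* rule 4: FIRST(Yi) minus epsilon */
                        if (fy[c] && c != EPS)
                            changed |= add(fx, c);
                    if (!fy[EPS])
                        all_eps = 0;
                }
            }
            if (all_eps)
                changed |= add(fx, EPS);     /* rules 2 and 3 */
        }
    }
}

/* Returns nonzero if sym is in FIRST(nt); call compute_first() first. */
int in_first(char nt, char sym)
{
    return first[nt - 'A'][(unsigned char)sym];
}
```

After compute_first() has run, in_first('E', '(') and in_first('X', '#') are nonzero while in_first('E', '+') is zero, which agrees with the sets listed above.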

Program:

#include<stdio.h>

#include<conio.h>

char array[10][20],temp[10];

int c,n;

void fun(int,int[]);

int fun2(int i,int j,int p[],int );

void main()

{

int p[2],i,j;

clrscr();

printf("Enter the no. of productions :");

scanf("%d",&n);

printf("Enter the productions :\n");

for(i=0;i<n;i++)

scanf("%s",array[i]);

for(i=0;i<n;i++)

{

c=-1,p[0]=-1,p[1]=-1;

fun(i,p);

printf("First(%c) : [ ",array[i][0]);

for(j=0;j<=c;j++)



printf("%c,",temp[j]);

printf("\b ].\n");

getch();

}

}

int fun2(int i,int j,int p[],int key)

{

int k;

if(!key)

{

for(k=0;k<n;k++)

if(array[i][j]==array[k][0])

break;

p[0]=i;p[1]=j+1;

fun(k,p);

return 0;

}

else

{

for(k=0;k<=c;k++)

{

if(array[i][j]==temp[k])

break;

}

if(k>c)return 1;

else return 0;

}

}

void fun(int i,int p[])

{



int j,k,key;

for(j=2;array[i][j] != '\0'; j++) /* compare with the string terminator, not the NULL pointer */

{

if(array[i][j-1]=='/')

{

if(array[i][j]>= 'A' && array[i][j]<='Z')

{

key=0;

fun2(i,j,p,key);

}

else

{

key = 1;

if(fun2(i,j,p,key))

temp[++c] = array[i][j];

if(array[i][j]== '@'&& p[0]!=-1) //taking '@' as null symbol

{

if(array[p[0]][p[1]]>='A' && array[p[0]][p[1]] <='Z')

{

key=0;

fun2(p[0],p[1],p,key);

}

else

if(array[p[0]][p[1]] != '/'&& array[p[0]][p[1]]!='\0')

{

if(fun2(p[0],p[1],p,key))

temp[++c]=array[p[0]][p[1]];

}

}

}



getch();

}

}

getch();

}

Output:

Viva Voce Question

1. What are the rules to calculate FIRST information?

________________________________________________________________

________________________________________________________________

2. Why does a parser require FIRST information?

________________________________________________________________

________________________________________________________________




________________________________________________________________

________________________________________________________________




EXPERIMENT NO – 7

Aim: WRITE A PROGRAM TO FIND FOLLOW() FUNCTION FROM A GIVEN

GRAMMAR.

Theory: Apply the following rules:

1. If $ is the input end-marker, and S is the start symbol, $ ∈ FOLLOW(S).

2. If there is a production, A → αBβ, then (FIRST(β) – ε) ⊆ FOLLOW(B).

3. If there is a production, A → αB, or a production A → αBβ, where ε ∈ FIRST(β), then

FOLLOW(A) ⊆ FOLLOW(B).

Note that unlike the computation of FIRST sets for non-terminals, where the focus is on what a non-

terminal generates, the computation of FOLLOW sets depends upon where the non-terminal

appears on the RHS of a production.

Example:

For the grammar of Experiment 6 (after removing left recursion), the FOLLOW sets can be computed by applying the above rules as follows.

FOLLOW(E) = {$, )}

FOLLOW(E) ⊆ FOLLOW(X) [in other words, FOLLOW(X) contains FOLLOW(E)]

Since there is no other rule applicable to FOLLOW(X),

FOLLOW(X) = {$, )}

FOLLOW(T) ⊆ FOLLOW(Y) …. (1)

(FIRST(X) – ε) ⊆ FOLLOW(T) i.e., {+} ⊆ FOLLOW(T) …. (2)

Also, since ε ∈ FIRST(X), FOLLOW(E) ⊆ FOLLOW(T)

i.e., {$, )} ⊆ FOLLOW(T) …. (3)

Putting (2) and (3) together, we get:

FOLLOW(T) = {$, ), +}

Since, there is no other rule applying to FOLLOW(Y), from (1), we get:

FOLLOW(Y) = {$, ), +}



Since ε ∈ FIRST(Y), FOLLOW(T) ⊆ FOLLOW(F) and FOLLOW(Y) ⊆ FOLLOW(F). Also,

(FIRST(Y) – ε) ⊆ FOLLOW(F). Putting all these together:

FOLLOW(F) = FOLLOW(T) ∪ FOLLOW(Y) ∪ (FIRST(Y) – ε) = {$, ), +, *}
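The three FOLLOW rules can also be applied repeatedly until no set grows any further. The sketch below first computes the FIRST sets in the same way and then the FOLLOW sets for the grammar used in this example. It is an illustration under assumptions, not the prescribed program: symbols are single characters, uppercase letters are non-terminals, 'i' stands for id, '#' for ε, '$' for the end marker, the first production's left side is taken as the start symbol, and all function and array names are made up for this example.

```c
#include <ctype.h>
#include <string.h>

#define EPS '#'                               /* assumed marker for epsilon */
#define END '$'                               /* input end-marker */

static const char *prods[] = {
    "E=TX", "X=+TX", "X=#", "T=FY", "Y=*FY", "Y=#", "F=(E)", "F=i"
};
#define NPRODS ((int)(sizeof prods / sizeof prods[0]))

static char first_set[26][128];               /* first_set[A-'A'][c]: c in FIRST(A) */
static char follow_set[26][128];              /* follow_set[A-'A'][c]: c in FOLLOW(A) */

static int add(char *set, int c)
{
    if (set[c]) return 0;
    set[c] = 1;
    return 1;
}

/* Adds FIRST(s) minus epsilon to dst, where s is a string of grammar symbols.
   Sets *nullable to 1 when s can derive epsilon (an empty s always can).
   Returns 1 if dst changed. */
static int add_first_of_string(char *dst, const char *s, int *nullable)
{
    int changed = 0, c;

    *nullable = 1;
    for (; *s != '\0' && *nullable; s++) {
        if (*s == EPS)
            continue;
        if (!isupper((unsigned char)*s)) {    /* terminal */
            changed |= add(dst, *s);
            *nullable = 0;
        } else {                              /* non-terminal */
            const char *fy = first_set[*s - 'A'];
            for (c = 0; c < 128; c++)
                if (fy[c] && c != EPS)
                    changed |= add(dst, c);
            if (!fy[EPS])
                *nullable = 0;
        }
    }
    return changed;
}

void compute_first_and_follow(void)
{
    int changed = 1, p, c, nullable;

    memset(first_set, 0, sizeof first_set);
    memset(follow_set, 0, sizeof follow_set);

    while (changed) {                         /* FIRST sets, to a fixed point */
        changed = 0;
        for (p = 0; p < NPRODS; p++) {
            char *fx = first_set[prods[p][0] - 'A'];
            changed |= add_first_of_string(fx, prods[p] + 2, &nullable);
            if (nullable)
                changed |= add(fx, EPS);
        }
    }

    add(follow_set[prods[0][0] - 'A'], END);  /* rule 1: $ follows the start symbol */
    changed = 1;
    while (changed) {                         /* FOLLOW sets, to a fixed point */
        changed = 0;
        for (p = 0; p < NPRODS; p++) {
            const char *fa = follow_set[prods[p][0] - 'A'];
            const char *s;
            for (s = prods[p] + 2; *s != '\0'; s++) {
                char *fb;
                if (!isupper((unsigned char)*s))
                    continue;                 /* only non-terminals have FOLLOW sets */
                fb = follow_set[*s - 'A'];
                /* rule 2: FIRST(beta) minus epsilon goes into FOLLOW(B) */
                changed |= add_first_of_string(fb, s + 1, &nullable);
                if (nullable)                 /* rule 3: FOLLOW(A) goes into FOLLOW(B) */
                    for (c = 0; c < 128; c++)
                        if (fa[c])
                            changed |= add(fb, c);
            }
        }
    }
}

/* Returns nonzero if sym is in FOLLOW(nt); call compute_first_and_follow() first. */
int in_follow(char nt, char sym)
{
    return follow_set[nt - 'A'][(unsigned char)sym];
}
```

For the example, FOLLOW(F) ends up containing $, ), + and *, and FOLLOW(E) contains only $ and ), matching the sets derived by hand.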

Program:

#include<stdio.h>

#include<conio.h>

#define max 10

#define MAX 15

void ffun(int,int);

void fun(int,int[]);

void follow(int i);

char array[max][MAX],temp[max][MAX];

int c,n,t;

int fun2(int i,int j,int p[],int key)

{

int k;

if(!key){

for(k=0;k<n;k++)

if(array[i][j]==array[k][0])

break;

p[0]=i;p[1]=j+1;

fun(k,p);

return 0;

}

else{

for(k=0;k<=c;k++){

if(array[i][j]==temp[t][k])



break;

}

if(k>c)return 1;

else return 0;

}

}

void fun(int i,int p[])

{

int j,k,key;

for(j=2;array[i][j]!='\0';j++) /* compare with the string terminator, not the NULL pointer */

{

if(array[i][j-1]=='/'){

if(array[i][j]>='A'&&array[i][j]<='Z'){

key=0;

fun2(i,j,p,key);

}

else{

key=1;

if(fun2(i,j,p,key))

temp[t][++c]=array[i][j];

if(array[i][j]=='@'&&p[0]!=-1){ //taking ,@, as null symbol.

if(array[p[0]][p[1]]>='A'&&array[p[0]][p[1]]<='Z'){

key=0;

fun2(p[0],p[1],p,key);

}

else

if(array[p[0]][p[1]]!='/'&&array[p[0]][p[1]]!='\0'){

if(fun2(p[0],p[1],p,key))

temp[t][++c]=array[p[0]][p[1]];



}

}

}

}

}

}

char fol[max][MAX],ff[max];int f,l,ff0;

void follow(int i)

{

int j,k;

for(j=0;j<=ff0;j++)

if(array[i][0]==ff[j])

return;

if(j>ff0)ff[++ff0]=array[i][0];

if(i==0)fol[l][++f]='$';

for(j=0;j<n;j++)

for(k=2;array[j][k]!='\0';k++)

if(array[j][k]==array[i][0])

ffun(j,k);

}

void ffun(int j,int k)

{

int ii,null=0,tt,cc;

if(array[j][k+1]=='/'||array[j][k+1]=='\0')

null=1;

for(ii=k+1;array[j][ii]!='/'&&array[j][ii]!='\0';ii++){

if(array[j][ii]<='Z'&&array[j][ii]>='A')

{



for(tt=0;tt<n;tt++)

if(temp[tt][0]==array[j][ii])break;

for(cc=1;temp[tt][cc]!='\0';cc++)

{

if(temp[tt][cc]=='@')null=1;

else fol[l][++f]=temp[tt][cc];

}

}

else fol[l][++f]=array[j][ii];

}

if(null)follow(j);

}

int main()

{

int p[2],i,j;

clrscr();

printf("Enter the no. of productions :");

scanf("%d",&n);

printf("Enter the productions :\n");

for(i=0;i<n;i++)

scanf("%s",array[i]);

for(i=0,t=0;i<n;i++,t++){

c=0,p[0]=-1,p[1]=-1;

temp[t][0]=array[i][0];

fun(i,p);

temp[t][++c]='\0';

printf("First(%c) : [ ",temp[t][0]);

for(j=1;j<c;j++)

printf("%c,",temp[t][j]);



printf("\b ].\n");

}

/* Follow Finding */

for(i=0,l=0;i<n;i++,l++)

{

f=-1;ff0=-1;

fol[l][++f]=array[i][0];

follow(i);

fol[l][++f]='\0';

}

for(i=0;i<n;i++)

{

printf("\nFollow[%c] : [ ",fol[i][0]);

for(j=1;fol[i][j]!='\0';j++)

printf("%c,",fol[i][j]);

printf("\b ]");

}

getch();

return 0;

}



Output:

Viva Voce Question

1. What are the rules to calculate FOLLOW information?

________________________________________________________________

________________________________________________________________

2. Why does a parser require FOLLOW information?

________________________________________________________________

________________________________________________________________


________________________________________________________________

________________________________________________________________




EXPERIMENT NO – 8

Aim: WRITE A PROGRAM TO IMPLEMENT DIRECTED ACYCLIC GRAPH (DAG).

Theory:

A Directed Acyclic Graph (DAG) is a tool that depicts the structure of a basic block, shows how values flow among the statements of the block, and supports optimization. Because an expression that is computed more than once can be represented by a single shared node, a DAG makes transformations on basic blocks easy to apply. The nodes of a DAG are read as follows:

Leaf nodes represent identifiers, names or constants.

Interior nodes represent operators.

Interior nodes also represent the results of expressions, or the identifiers/names in which the values are to be stored or assigned.

Example:

t0 = a + b

t1 = t0 + c

d = t0 + t1

[Figure: the DAG after each statement: t0 = a + b, then t1 = t0 + c, then d = t0 + t1]

Program:

#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
#define size 20

typedef struct node
{
    char data;
    struct node *left;
    struct node *right;
}btree;

btree *stack[size];
int top;

btree *create(char exp[]);
void push(btree *Node);
btree *pop(void);
void dag(btree *root);

int main(void)
{
    btree *root;
    char exp[80];
    printf("\nEnter the postfix expression:\n");
    if(scanf("%79s",exp)!=1)
        return 1;
    top=-1;
    root=create(exp);
    if(root==NULL)
        return 1;
    printf("\nThe tree is created.....\n");
    printf("\nInorder DAG is : \n\n");
    dag(root);
    printf("\n");
    return 0;
}

/* Build an expression tree from a postfix string; returns NULL on bad input. */
btree *create(char exp[])
{
    btree *temp;
    int pos;
    char ch;
    for(pos=0;(ch=exp[pos])!='\0';pos++)
    {
        if(!isalpha((unsigned char)ch) && ch!='+' && ch!='-' && ch!='*' && ch!='/')
        {
            printf("\n Invalid char Expression\n");
            return NULL;
        }
        temp=(btree*)malloc(sizeof(btree));
        if(temp==NULL)
            return NULL;
        temp->left=temp->right=NULL;
        temp->data=ch;
        if(!isalpha((unsigned char)ch))
        {
            /* operator: the two most recent subtrees become its operands */
            temp->right=pop();
            temp->left=pop();
            if(temp->left==NULL || temp->right==NULL)
                return NULL;
        }
        push(temp);
    }
    /* a well-formed postfix expression leaves exactly one tree on the stack */
    if(top!=0)
    {
        printf("\n Invalid char Expression\n");
        return NULL;
    }
    return pop();
}

void push(btree *Node)
{
    if(top+1>=size)
    {
        printf("Error:Stack is full\n");
        exit(1);
    }
    top++;
    stack[top]=Node;
}

btree *pop(void)
{
    if(top==-1)
    {
        printf("\nerror: stack is empty..\n");
        return NULL;
    }
    return stack[top--];
}

/* Print the tree in inorder (left subtree, node, right subtree). */
void dag(btree *root)
{
    if(root!=NULL)
    {
        dag(root->left);
        printf("%c",root->data);
        dag(root->right);
    }
}

Output:

Enter the postfix expression:ab+

Inorder DAG is :

a+b

Viva Voce Question

1. What is a directed acyclic graph (DAG)? Give an example.

________________________________________________________________

2. What are the rules to construct the DAG ?

________________________________________________________________

________________________________________________________________


________________________________________________________________

________________________________________________________________




EXPERIMENT NO – 9

Aim: STUDY OF LEX.

Theory:

Lex is a program generator designed for lexical processing of character input streams. It accepts a high-level, problem-oriented specification for character string matching and produces a program in a general-purpose language that recognizes regular expressions. The regular expressions are specified by the user in the source specification given to Lex. The code generated by Lex recognizes these expressions in an input stream and partitions the input stream into strings matching the expressions. At the boundaries between strings, program sections provided by the user are executed. The Lex source file associates the regular expressions with the program fragments. As each expression appears in the input to the program written by Lex, the corresponding fragment is executed.

• The main job of a lexical analyser (scanner) is to break up an input stream into tokens (to tokenize the input stream).

• Example: the input a = b + c * d; is tokenized as ID ASSIGN ID PLUS ID MULT ID SEMI

• Lex is a utility that helps you generate scanners rapidly.

• Structure of a Lex program: Lex source is separated into three sections by %% delimiters. The general format of Lex source is

{definitions}
%% (required)
{translation rules}
%% (optional)
{user code}

• Definitions: declarations of ordinary C variables, constants and libraries, enclosed in %{ and %}:

%{
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
%}

flex definitions have the form "name definition", for example the regular definition

DIGIT [0-9]

• Operators: " \ [ ] ^ - ? . * + | ( ) $ / { } % < >

If they are to be used as text characters, an escape should be used: \$ = "$", \\ = "\"

Every character except blank, tab (\t), newline (\n) and the operators listed above is always a text character.

• Translation rules: the form of a rule is

pattern { action }

The actions are C/C++ code.

[0-9]+ { return(Integer); } // RE (regular expression)
{DIGIT}+ { return(Integer); } // RD (regular definition)




Viva Voce Question

1. What is LEX ?

________________________________________________________________

2. Why is LEX required?

________________________________________________________________

________________________________________________________________

3. What are the rules for LEX ?

________________________________________________________________

________________________________________________________________




EXPERIMENT NO – 10

Aim: STUDY OF YACC.

Theory: Computer program input generally has some structure; in fact, every

computer program that does input can be thought of as defining an ``input language''

which it accepts. An input language may be as complex as a programming language,

or as simple as a sequence of numbers. Unfortunately, usual input facilities are

limited, difficult to use, and often are lax about checking their inputs for validity.

Yacc provides a general tool for describing the input to a computer program. The Yacc

user specifies the structures of his input, together with code to be invoked as each such

structure is recognized. Yacc turns such a specification into a subroutine that handles

the input process; frequently, it is convenient and appropriate to have most of the flow

of control in the user's application handled by this subroutine.

The input subroutine produced by Yacc calls a user-supplied routine to return the next

basic input item. Thus, the user can specify his input in terms of individual input

characters, or in terms of higher level constructs such as names and numbers. The

user-supplied routine may also handle idiomatic features such as comment and

continuation conventions, which typically defy easy grammatical specification.

Yacc is written in portable C. The class of specifications accepted is a very general

one: LALR(1) grammars with disambiguating rules.

In addition to compilers for C, APL, Pascal, RATFOR, etc., Yacc has also been used

for less conventional languages, including a phototypesetter language, several desk

calculator languages, a document retrieval system, and a Fortran debugging system.

Introduction

Yacc provides a general tool for imposing structure on the input to a computer

program. The Yacc user prepares a specification of the input process; this includes

rules describing the input structure, code to be invoked when these rules are

recognized, and a low-level routine to do the basic input. Yacc then generates a

function to control the input process. This function, called a parser, calls the user-

supplied low-level input routine (the lexical analyzer) to pick up the basic items

(called tokens) from the input stream. These tokens are organized according to the

input structure rules, called grammar rules; when one of these rules has been

recognized, then user code supplied for this rule, an action, is invoked; actions have

the ability to return values and make use of the values of other actions.

Yacc is written in a portable dialect of C[1] and the actions, and output subroutine, are

in C as well. Moreover, many of the syntactic conventions of Yacc follow C.

The heart of the input specification is a collection of grammar rules. Each rule

describes an allowable structure and gives it a name. For example, one grammar rule

might be



date : month_name day ',' year ;

Here, date, month_name, day, and year represent structures of interest in the input

process; presumably, month_name, day, and year are defined elsewhere. The comma

``,'' is enclosed in single quotes; this implies that the comma is to appear literally in the

input. The colon and semicolon merely serve as punctuation in the rule, and have no

significance in controlling the input. Thus, with proper definitions, the input

July 4, 1776

might be matched by the above rule.

An important part of the input process is carried out by the lexical analyzer. This user

routine reads the input stream, recognizing the lower level structures, and

communicates these tokens to the parser. For historical reasons, a structure recognized

by the lexical analyzer is called a terminal symbol, while the structure recognized by

the parser is called a nonterminal symbol. To avoid confusion, terminal symbols will

usually be referred to as tokens.

There is considerable leeway in deciding whether to recognize structures using the

lexical analyzer or grammar rules. For example, the rules

month_name : 'J' 'a' 'n' ;

month_name : 'F' 'e' 'b' ;

month_name : 'D' 'e' 'c' ;

might be used in the above example. The lexical analyzer would only need to

recognize individual letters, and month_name would be a nonterminal symbol. Such

low-level rules tend to waste time and space, and may complicate the specification

beyond Yacc's ability to deal with it. Usually, the lexical analyzer would recognize the

month names, and return an indication that a month_name was seen; in this case,

month_name would be a token.

Literal characters such as ``,'' must also be passed through the lexical analyzer, and are

also considered tokens.

Specification files are very flexible. It is relatively easy to add to the above example

the rule

date : month '/' day '/' year ;

allowing

7 / 4 / 1776

as a synonym for

July 4, 1776

In most cases, this new rule could be ``slipped in'' to a working system with minimal

effort, and little danger of disrupting existing input.

The input being read may not conform to the specifications. These input errors are

detected as early as is theoretically possible with a left-to-right scan; thus, not only is

the chance of reading and computing with bad input data substantially reduced, but the

bad data can usually be quickly found. Error handling, provided as part of the input



specifications, permits the reentry of bad data, or the continuation of the input process

after skipping over the bad data.

In some cases, Yacc fails to produce a parser when given a set of specifications. For

example, the specifications may be self contradictory, or they may require a more

powerful recognition mechanism than that available to Yacc. The former cases

represent design errors; the latter cases can often be corrected by making the lexical

analyzer more powerful, or by rewriting some of the grammar rules. While Yacc

cannot handle all possible specifications, its power compares favorably with similar

systems; moreover, the constructions which are difficult for Yacc to

handle are also frequently difficult for human beings to handle. Some users have

reported that the discipline of formulating valid Yacc specifications for their input

revealed errors of conception or design early in the program development.

The theory underlying Yacc has been described elsewhere.[2, 3, 4] Yacc has been

extensively used in numerous practical applications, including lint,[5] the Portable C

Compiler,[6] and a system for typesetting mathematics.[7]

The next several sections describe the basic process of preparing a Yacc specification;

Section 1 describes the preparation of grammar rules, Section 2 the preparation of the

user supplied actions associated with these rules, and Section 3 the preparation of

lexical analyzers. Section 4 describes the operation of the parser. Section 5 discusses

various reasons why Yacc may be unable to produce a parser from a specification, and

what to do about it. Section 6 describes a simple mechanism for handling operator

precedences in arithmetic expressions. Section 7 discusses error detection and

recovery. Section 8 discusses the operating environment and special features of the

parsers Yacc produces. Section 9 gives some suggestions which should improve the

style and efficiency of the specifications. Section 10 discusses some advanced topics,

and Section 11 gives acknowledgements. Appendix A has a brief example, and

Appendix B gives a summary of the Yacc input syntax. Appendix C gives an example

using some of the more advanced features of Yacc, and, finally, Appendix D describes

mechanisms and syntax no longer actively supported, but provided for historical

continuity with older versions of Yacc.

1: Basic Specifications

Names refer to either tokens or nonterminal symbols. Yacc requires token names to be

declared as such. In addition, for reasons discussed in Section 3, it is often desirable to

include the lexical analyzer as part of the specification file; it may be useful to include

other programs as well. Thus, every specification file consists of three sections: the

declarations, (grammar) rules, and programs. The sections are separated by double

percent ``%%'' marks. (The percent ``%'' is generally used in Yacc specifications as an

escape character.)

In other words, a full specification file looks like

declarations

%%

rules



%%

programs

The declaration section may be empty. Moreover, if the programs section is omitted,

the second %% mark may be omitted also;

thus, the smallest legal Yacc specification is

%%

rules

Blanks, tabs, and newlines are ignored except that they may not appear in names or

multi-character reserved symbols. Comments may appear wherever a name is legal;

they are enclosed in /* . . . */, as in C and PL/I.

The rules section is made up of one or more grammar rules. A grammar rule has the

form:

A : BODY ;

A represents a nonterminal name, and BODY represents a sequence of zero or more

names and literals. The colon and the semicolon are Yacc punctuation.

Names may be of arbitrary length, and may be made up of letters, dot ``.'', underscore

``_'', and non-initial digits. Upper and lower case letters are distinct. The names used

in the body of a grammar rule may represent tokens or nonterminal symbols.

A literal consists of a character enclosed in single quotes ``'''. As in C, the backslash

``\'' is an escape character within literals, and all the C escapes are recognized. Thus

'\n' newline

'\r' return

'\'' single quote ``'''

'\\' backslash ``\''

'\t' tab

'\b' backspace

'\f' form feed

'\xxx' ``xxx'' in octal

For a number of technical reasons, the NUL character ('\0' or 0) should never be used

in grammar rules.

If there are several grammar rules with the same left hand side, the vertical bar ``|'' can

be used to avoid rewriting the left hand side. In addition, the semicolon at the end of a

rule can be dropped before a vertical bar. Thus the grammar rules

A : B C D ;

A : E F ;

A : G ;

can be given to Yacc as

A : B C D

| E F

| G

;



It is not necessary that all grammar rules with the same left side appear together in the

grammar rules section, although it makes the input much more readable, and easier to

change.

If a nonterminal symbol matches the empty string, this can be indicated in the obvious

way:

empty : ;

Names representing tokens must be declared; this is most simply done by writing

%token name1 name2 . . .

in the declarations section. (See Sections 3, 5, and 6 for much more discussion.)

Every name not defined in the declarations section is assumed to represent a

nonterminal symbol. Every nonterminal symbol must appear on the left side of at least

one rule.

Of all the nonterminal symbols, one, called the start symbol, has particular

importance. The parser is designed to recognize the start symbol; thus, this symbol

represents the largest, most general structure described by the grammar rules. By

default, the start symbol is taken to be the left hand side of the first grammar rule in

the rules section. It is possible, and in fact desirable, to declare the start symbol

explicitly in the declarations section using the %start keyword:

%start symbol

The end of the input to the parser is signaled by a special token, called the endmarker.

If the tokens up to, but not including, the endmarker form a structure which matches

the start symbol, the parser function returns to its caller after the endmarker is seen; it

accepts the input. If the endmarker is seen in any other context, it is an error.


Viva Voce Question

1. What is YACC ?

________________________________________________________________

2. Why is YACC required?

________________________________________________________________

________________________________________________________________

3. What are the rules for YACC ?

________________________________________________________________

________________________________________________________________

