Summary
• Preprocessor
• include directive
• define directive
• Other Preprocessor Directives
• Macros
• Example
• Tips
Preprocessor
C is a concise language, so it relies on a companion tool, the preprocessor, to extend it. A preprocessor comes with every C compiler. It makes some changes in the code before compilation, and the compiler receives the modified source file. Normally we cannot see what the preprocessor has inserted. So far we have been using the #include preprocessor directive, as in #include <iostream.h>. What does #include actually do? When we write #include <somefile>, somefile is an ordinary text file of C code, and the line containing the #include directive is replaced by the text of that file. We do not see the included text in our source file, but when the compiler starts its work, it sees everything in that file. All preprocessor directives start with the # sign. There are two ways to use #include.
So far we have enclosed the file name in angle brackets, i.e. #include <somefile>. This form tells the compiler that the file exists in a particular folder (directory) and should be included from there. We have included iostream.h, stdlib.h, fstream.h, string.h and some other files this way, using angle brackets for all of them. These files are located in a specific directory. If you use the Dev-Cpp compiler, have a look at its directory structure. Open the Dev-Cpp folder in Windows Explorer and you will see many subfolders on the right side. One of these folders is 'include'. Expand it and you will see a lot of files. Usually the extension of these files is 'h', which stands for header. We normally include these files at the start of the program, which is why they are known as header files. We can include files anywhere in the code, but the position has to make logical sense.
include directive
As you know, we have been using functions in our programs. Before we call a function, its prototype must be declared. The compiler needs to know the name of the function, the arguments it expects and its return type. With this information, the first pass of compilation succeeds. The code of a function we use is added to our program at the time of linking. Library functions are available in compiled form, and the linker links them with our program. In its first pass, the compiler converts the source code into object code. Object code is machine code, but it is not yet an executable file. The object code of our program is combined with the object code of the functions that the program uses. Then some memory location information is added and we get the executable file. The linker performs this task, while the compiler only records the name and arguments of each function in the object code. To check that a function is used correctly, the compiler needs to see its definition or at least its prototype. We have both options for our own functions. We can define the function at the start of the program and use it in the main program. In this case, the definition serves as both prototype and definition. The compiler compiles the function and the main program, and then we can link and execute it. As the program grows, it becomes difficult to write the definitions of all the functions at the beginning. Sometimes we write the functions in a different file and compile it into an object file. We can then bring the prototypes of these functions into our program in different ways. One way is to write all the prototypes at the start, before the program itself. The better way is to make a header file (say myHeaderFile.h), write the prototypes of all the functions in it and save it as an ordinary text file. We then include it in our program with the #include directive. As this file is located in the same place as our source code, its name is not enclosed in angle brackets in the #include directive. It is written in quotation marks:
#include "myHeaderFile.h"
The preprocessor will search for the file "myHeaderFile.h" in the current directory, that is, the directory where our source file is located. Let's compare the two forms. When the file name is in angle brackets, the compiler looks in a specific set of directories. When the file name is in quotation marks, it looks in the current directory first; if the file is not found there, most compilers then fall back to the include directories. In the Dev-Cpp IDE, select Compiler Options from the Tools menu. In this dialog box, we can specify the directories for libraries and include files. When we use angle brackets with #include, the compiler looks in the directories listed under the include directories option. If we write our own header file and save it in the 'My Documents' folder, the header file should be included with quotation marks.
When we compile our source code, the preprocessor first looks for the include directives and processes them one by one. If the first directive is #include <iostream.h>, it searches for this file in the include directory and inserts the complete header file into our source code at the position where the include directive is written. If the second include directive names another file, that file is also inserted into the source code after iostream.h, and so on. The compiler receives this expanded source code for compilation. The expanded source code is not shown to us; we only get the executable file at the end.
Can we include a header file at a point other than the start of the program? Yes, there is no restriction. We can include a file wherever we want. Normally we do this at the start of the program, as these are header files. We do not usually write a portion of code in a different file and include that file somewhere in the middle of the code. This is legal but not common practice. So far we have discussed the include directive. Now we will discuss another important directive, the define directive.
define directive
We can define macros with the #define directive. A macro is a special name that is replaced in the code by its definition, so that we get expanded code. For example, suppose we are writing a program that uses the constant Pi. Pi is a universal constant with a value of approximately 3.1415926. Without a name for it, we have to write the value 3.1415926 wherever it is needed in the program. It is better to define Pi in one place and use the name instead of the actual value. We could use a variable, as in double Pi = 3.1415926, and employ Pi as a variable in the program. But as this is a variable, someone can assign it a new value. We want every occurrence of Pi to be replaced by its actual value, and we want to be sure that this value cannot be changed. With the define directive, we can define PI as:
#define PI 3.1415926
We write the name of the symbolic constant and its value, separated by a space. Normally we write symbolic constants in capitals so that they are easy to identify in the code. When we ask the compiler to compile this file, the preprocessor looks for the define directives and replaces every name defined by them with its value. So the compiler never sees PI: wherever we have used PI, it has been replaced with 3.1415926 before the compiler compiles the file.
A small program showing the usage of #define.
/* Program to show the usage of define */
#include <iostream.h>
#define PI 3.1415926 // Defining PI
int main()
{
    int radius = 5;
    cout << "Area of circle with radius " << radius << " = " << PI * radius * radius;
    return 0;
}
What is the benefit of using it? Suppose we have written a program that uses the value of PI as 3.14, i.e. to two decimal places. After checking the accuracy of the results, we find that we need PI as 3.1415926. If we have not defined PI, we have to search for 3.14 and replace it with 3.1415926 at each and every place in the source code. This 'search and replace' task can go wrong. We can miss some place or replace something else. Suppose that somewhere 3.14 represents something else, like a tax rate. We may accidentally change this value too, taking it for the value of PI. So we cannot run a blind search and replace and expect it to work. It is better to define PI at the start of the program and use PI instead of its value. Now if we want to change the value of PI, we change it in only one place and the complete program gets the new value. Whatever we define with the #define directive is substituted with its value before the compiler compiles the file. This gives us the control we need: the value is changed in one place and the complete program is updated.
We can also put the definition of PI in a header file. The benefit is that every program that takes the value of PI from this header file gets the updated value when the value in the header file is changed. For example, suppose five functions use PI and are defined in five different files. Without a header file, we need to define PI (i.e. #define PI 3.1415926) in all five source files. Instead, we can define it in one header file and include this header file in all the source files. Each function then gets the value of PI from the header file, and by changing the value there, all the functions are updated with the new value. As preprocessor directives are not C statements, we do not put a semicolon at the end of the line. A semicolon written after a #define becomes part of the replacement text and can cause a syntax error wherever the name is used, and a semicolon does not belong after #include either.
Other Preprocessor Directives
There are some other preprocessor directives. Here is the list of preprocessor directives.
• #include <filename>
• #include "filename"
• #define
• #undef
• #ifdef
• #ifndef
• #if
• #else
• #elif
• #endif
• #error
• #line
• #pragma
• #assert
All the preprocessor directives start with the sharp sign (#). We can also do conditional compilation with them. We have #if, #else and #endif, and #elif is used for 'else if'. We can also check whether a symbol that we have defined with #define is available. For this purpose, #ifdef is used. If we have defined PI, we can write:
#ifdef PI
… Then do something
#endif
This is an example of conditional compilation. If a symbolic constant is already defined, defining it again with a different value is an error (some compilers report it only as a warning). It is better to check whether it is already defined. If it is already defined and we want to give it some other value, it should be undefined first. The directive for this is #undef. First we undefine the symbol and then define it again with the new value. Conditional compilation is also useful while debugging. A common technique is to put output statements at various points in the program. These statements are used to check the values of different variables and to verify that the program is working correctly. It is extremely tedious to remove all these debugging statements afterwards. To overcome this problem, we can use conditional compilation. We can define a symbol at the start of the program as:
#define DEBUG
Here we have defined a symbol DEBUG with no value after it. The value is optional with the define directive. The output statements for debugging will be written as:
#ifdef DEBUG
cout << "Control is in the while loop of calculating average";
#endif
Now this statement is compiled into the program, and therefore executed, only if the DEBUG symbol is defined. Otherwise the preprocessor leaves it out and it is never executed.
Here is an example using the debug output statements:
// Program that shows the use of Define for debugging
// Comment the #define DEBUG and see the change in the output
#include <iostream.h>
#include <stdlib.h>
#define DEBUG
int main()
{
    int z;
    int arraySize = 100;
    int a[100];
    int i;
    // Initializing the array.
    for ( i = 0; i < arraySize; i++ )
    {
        a[i] = i;
    }
    // If the symbol DEBUG is defined then this code will be compiled and executed
#ifdef DEBUG
    for ( i = 0; i < arraySize; i++ )
        cout << "\t " << a[i];
#endif
    cout << " Please enter a positive integer ";
    cin >> z;
    int found = 0;
    // loop to search the number.
    for ( i = 0; i < arraySize; i++ )
    {
        if ( z == a[i] )
        {
            found = 1;
            break;
        }
    }
    if ( found == 1 )
        cout << " We found the integer at position " << i;
    else
        cout << " The number was not found ";
    return 0;
}
With preprocessor directives we can carry out conditional compilation and macro translation, that is, the replacement of a symbol by the value written after it. We cannot redefine a symbol without undefining it first. To undefine a symbol, #undef is used, e.g. the symbol PI can be undefined as:
#undef PI
From this point onward in the program, the symbol PI is not available. The compiler no longer knows this symbol and reports an error if we use it in the program after undefining it.
As an exercise, open some header files and read them. For example, we have used the header file conio.h (i.e. #include <conio.h>) for console input and output in our programs. It is a legacy of non-graphical systems. Dev-Cpp has two conio files, conio.h and conio.c (in the folder 'Dev-Cpp\include'). Open them and read them. Do not change anything, as that may cause problems. You now have enough knowledge to read them line by line. You will see many symbols starting with an underscore ( _ ), and lots of internal constants and symbolic names starting with a double underscore. For this reason we should not use variable names that start with an underscore. You can also find the declarations of different functions there, e.g. the function getche() (get character with echo) is declared in conio.h. If we try to use getche() without including conio.h, the compiler gives an error like 'the function getche() undeclared'. There is another interesting construct in conio.h:
#ifdef __cplusplus
extern "C" {
#endif
If the symbol __cplusplus is defined, the line 'extern "C" {' is included in the code. There is an opening brace here. Where is the closing brace? Go to the end of the same file. You will find the following:
#ifdef __cplusplus
}
#endif
This is an example of conditional compilation, i.e. if the symbol is defined, these lines are included in the code before compiling. Go through all the header files we have been using in our programs, so that you can see how professional programmers write code. If you have the Linux operating system, it comes free with its source code. The source code of Linux is written in the C language. You can see functions written by C programming gurus there, for example string manipulation functions such as string copy and string compare.
Macros
Macros fall into two categories, both written with #define. The first type is the simple macro. For example, the value of PI can be defined as:
#define PI 3.1415926
Here the symbol PI will be replaced with the actual value (i.e. 3.1415926) in the program. These are simple macros: symbolic names mapped to constants.
In contrast, the second type of macro takes arguments. It is also called a parameterized macro. Consider the following:
#define square(x) x * x
As this is not C code, it does not need a semicolon at the end. Before the compiler gets the file, the preprocessor replaces every occurrence of square(x) (for example square(i) or square(3)) with x * x, so square(i) becomes i * i and square(3) becomes 3 * 3. The compiler never sees square(x). It sees the expanded multiplication and produces the executable file from it. There is a problem with this macro definition, as the following statement shows.
square (i + j);
Here i + j takes the place of x in the macro definition. When the macro is replaced by its definition, we get the statement:
i + j * i + j
This is certainly not the square of i + j. Because of operator precedence, it is evaluated as i + (j * i) + j. How can we overcome this problem? Whenever you write a parameterized macro, it is necessary to put parentheses in the definition. First enclose the complete definition in parentheses, and then enclose each use of x in parentheses as well. The correct definition of the macro is:
#define square(x) ((x) * (x))
This macro works correctly. When the macro is replaced by its definition in the code, the parentheses are copied too, which makes the computation correct.
Here is a sample program showing the use of a simple square macro:
/* Program to show the use of macro */
#include <iostream.h>
// Definition of macro square
#define square(x) ((x) * (x))
int main()
{
    int x;
    cout << endl;
    cout << " Please enter the value of x to calculate its square ";
    cin >> x;
    cout << " Square of x = " << square(x) << endl;
    cout << " Square of x+2 = " << square(x+2) << endl;
    cout << " Square of 7 = " << square(7);
    return 0;
}
We could also write a function square(x) to calculate the square of a number. What is the difference between the square(x) macro and a square(x) function? Whenever we call a function, a lot of work has to be done during the execution of the program. Part of the machine's memory is used as a stack for the program. The state of the program (i.e. the values of its variables, the line that is currently executing and so on) is kept on the stack. Before calling the function, we write the arguments on the stack. Execution then stops at the calling point and jumps to the code of the function. The function picks up the values of its arguments from the stack, does some computation and returns control to the calling code, which continues with the next line. So every function call carries some overhead. If we call a function inside a loop, this overhead is paid on every iteration, that is, as many times as the loop executes, and it costs computer time and resources. There are many situations where we need functions, but in this simple example of calculating a square, a program that calls a square function 1000 times can waste a noticeable amount of time. If we define a square macro instead, the code written after the macro name is substituted at every place where we use the macro. The code is expanded before compilation and the compiler sees ordinary multiplication expressions. There is no function call involved, which makes the program run faster. We can also write complex parameterized macros. The advantage of macros is that there is no function call overhead and the program runs faster. The drawback is that every use of a macro is replaced by its full definition, so a program that uses many macros becomes bloated. The expanded source code grows, and the executable file grows with it. Sometimes it is better to write functions. For simple things like taking a square, it is nice to write macros that the preprocessor substitutes as a single line of code.
Take care of a few things while defining macros. There must be no space between the macro name and the opening parenthesis. If we put a space there, the definition is treated as a simple macro without parameters. A macro can take more than one argument, written as a comma-separated list. Argument names follow the same rules as ordinary variable names. After the arguments comes the closing parenthesis, and there is always a space before the definition of the macro starts.
Example
Suppose we have a program that uses the area of a circle many times. We will therefore write a macro for calculating the area of a circle. We know that the formula for the area of a circle is PI * r². This formula is substituted wherever we refer to the macro. PI is also a natural constant, so we define it first. Then we define the macro for the area of the circle. For visibility, it is good to write the name of the macro in capitals, as CIRCLEAREA. We do not need to pass PI to it as an argument. The only thing that needs to be passed as an argument is the radius. So the macro is written as CIRCLEAREA(X), with no space before the parenthesis. We write the formula for the area of the circle as:
#define CIRCLEAREA(X) (PI * (X) * (X))
Here is the complete code of the program:
/* A simple program using the area of circle formula as macro */
#include <iostream.h>
// Defining the macros
#define PI 3.14159
#define CIRCLEAREA(X) (PI * (X) * (X))
int main()
{
    float radius;
    cout << " Enter radius of the circle: ";
    cin >> radius;
    cout << " Area of circle is " << CIRCLEAREA(radius);
    return 0;
}
CIRCLEAREA will be replaced by the actual macro definition, including all the parentheses, before compilation. As we have used parentheses in the definition of the CIRCLEAREA macro, arguments that are expressions are handled correctly. The statement for finding the area of a circle with double the radius is:
CIRCLEAREA(2 * radius);
The above statement calculates the correct area. As the argument uses only multiplication, it would work even without the parentheses. But with addition or subtraction, as in CIRCLEAREA(radius + 2), a macro definition without the parentheses does not calculate the correct area. Therefore always use parentheses while writing macros that take arguments.
There is more to say about header files. As a proficient programmer, perhaps writing your own operating system, you will be using these things. Many operating systems are in use today: Windows is a popular one, DOS is another operating system for PCs, and there are Linux, the different varieties of Unix such as Sun Solaris, and mainframe operating systems. The majority of these operating systems have a C compiler available. C is a very elegant language for writing operating systems. It is very popular and available on practically every platform. By and large, the source code that we write in our programs does not change from machine to machine. What changes are the system header files. These files belong to the machine. The header files that we have written for our program travel with the source code, but the iostream, stdlib, stdio and string header files vary somewhat from machine to machine. Over the years, as the C language has evolved, the names of these header files have become standard. Some of you may have been using some other compiler, and you will have noticed that the same header files, such as iostream.h and conio.h, are available there. The same applies to operating systems. When we change operating systems, we use the local version of the C/C++ compiler, and the names of the header files remain the same. Therefore, if we port our code from one operating system to another, we usually do not need to change anything in it. It will automatically include the header files of that compiler. Compile it and run it. In almost every case it will run without any error. There may be some change in behavior: for example, the function getche() sometimes reads a character without the Enter key, and sometimes you have to type the character and press Enter. Nonetheless, these header files provide a lot of portability. You can write a program on one operating system and need not carry the system header files with the code to another operating system.
On the other hand, the header files of our own program also help with portability, in the sense that we have all the function prototypes, symbolic definitions, conditional compilations and macros in one place. While writing a lot of code, we start writing header files for ourselves because of the style in which we work. We define some common functions in our header files. When we change the operating system, this header file is ported with the source code. Similarly, on starting a new program, we include this header file because it contains utility functions that we have written.
Here is an interesting exercise with #define. If you think you are sharp, here is a challenge for you. Define your own vocabulary with #define, writing the C code for each word after it. One can then write a poem using this vocabulary, and the preprocessor will replace it with C code. All we need is to include one header file that contains the vocabulary. So an ordinary English poem is actually C code. Interesting things can be done using these techniques.
Tips
• All the preprocessor directives start with the # sign
• A symbol cannot be redefined without undefining it first
• The conditional compilation directives help in debugging the program
• Do not declare variable names starting with an underscore
• Always use parentheses while defining macros that take arguments