Summary
• String Handling
• String Manipulation Functions
• Character Handling Functions
• Sample Program
• String Conversion Functions
• String Functions
• Search Functions
• Examples
• Exercises
String Handling
We have briefly talked about strings in some of the previous lectures. In this lecture, you will see how a string may be handled. Before discussing the subject itself, it is pertinent to know how things were done before the concept of strings evolved.
When the C language and the UNIX operating system were being developed at Bell Laboratories, the scientists wanted to publish their articles. They needed a text editor, that is, some easy mechanism by which the articles could be formatted and published. We are talking about a time when PCs and word processors did not exist. This may seem strange to you, as today you can make characters bold or large, or format a paragraph, with the help of a word processor. Those scientists had no such facility available. The task of writing an article and turning it into publishable material was mainly done with the help of typewriters. These computer experts therefore decided to develop a program that could make text editing easy. Their efforts resulted in a program for editing text, and the process of editing text came to be called text processing. In-line commands were written as part of the text and were processed on output. Later, programs evolved in which a command was inserted for functions like making a character bold, and the effect of this command could be previewed and then modified if needed.
Returning to the topic of strings, we will now discuss in detail the built-in functions for handling them.
String Manipulation Functions
C language provides many functions to manipulate strings. To understand these functions, let us first consider the building block (or unit) of a string, i.e. a character. Characters are represented inside the computer as numbers. There is a code number for each character used by a computer. Most computers use the ASCII (American Standard Code for Information Interchange) code to store a character. This code is used in the computer memory for manipulation, and it is shown as a character on output. We can write a program to see the ASCII values.
We have a data type char to store a character. A character is anything that we can type with the keyboard: white space, comma, full stop, colon and so on are all characters. 0, 1 and 2 are also characters. As numbers they are treated differently, yet they are typed as characters. Another data type is int, which stores whole numbers. As characters are stored inside the computer as numbers, they can be manipulated in the same form. A character is stored in memory in one byte, i.e. 8 bits. This means that 2^8 (256) different values can be stored. We want to ascertain what number is stored when we press a key on the keyboard. In other words, we will see what character is displayed when we have a given number in memory. The code of the program which displays the characters and their corresponding integer values (ASCII codes) is given below.
In the program, the statement c = i ; has an integer value on the right hand side (as i is an int), while c holds its character representation. We display the values of i and c, which shows us the characters and their integer values.
// This program displays the ASCII code table
#include <iostream>
using namespace std;

int main()
{
    int i;
    char c;
    for (i = 0; i < 256; i++)
    {
        c = i;   // the integer code is stored as a character
        cout << i << "\t" << c << "\n";
    }
    return 0;
}
In the output of this program, we will see the integer numbers and their character representations. Take, for example, the white space character (which we use between two words). It has no visible shape and simply leaves a space. From the ASCII table, we can see that the values of a-z and of A-Z are contiguous. We can get the value of a letter by adding 1 to the value of the previous letter. So all we need to remember as a baseline is the value of ‘a’ and ‘A’.
Character Handling Functions
C language provides many functions to perform useful tests and manipulations of character data. These functions are declared in the header file ctype.h. A program that tests or manipulates character data must include this header file to avoid a compiler error. Each function in ctype.h receives a character (as an int) or EOF (end of file; a special value that does not correspond to any character) as its argument. ctype.h has many functions, which have self-explanatory names.
Of these, int isdigit (int c) takes a single character as its argument and returns true (a non-zero value) or false (zero). This function is like a question being asked: is this character a digit? The answer may be true or false. If the argument is a numeric character (digit), the function returns true, otherwise false. This is a useful function to test the input. To check for a letter of the alphabet, the function isalpha can be used. isalpha returns true for the letters a-z, both small and capital. For anything other than a letter, it returns false. The function isalnum (is alphanumeric) returns true if its argument is a digit or a letter, and false otherwise. The functions of ctype.h are shown in the following table with their descriptions.
Prototype
    Description
int isdigit( int c )
    Returns true if c is a digit and false otherwise.
int isalpha( int c )
    Returns true if c is a letter and false otherwise.
int isalnum( int c )
    Returns true if c is a digit or a letter and false otherwise.
int isxdigit( int c )
    Returns true if c is a hexadecimal digit character and false otherwise.
int islower( int c )
    Returns true if c is a lowercase letter and false otherwise.
int isupper( int c )
    Returns true if c is an uppercase letter and false otherwise.
int tolower( int c )
    If c is an uppercase letter, tolower returns c as a lowercase letter. Otherwise, tolower returns the argument unchanged.
int toupper( int c )
    If c is a lowercase letter, toupper returns c as an uppercase letter. Otherwise, toupper returns the argument unchanged.
int isspace( int c )
    Returns true if c is a white-space character, i.e. newline ('\n'), space (' '), form feed ('\f'), carriage return ('\r'), horizontal tab ('\t') or vertical tab ('\v'), and false otherwise.
int iscntrl( int c )
    Returns true if c is a control character and false otherwise.
int ispunct( int c )
    Returns true if c is a printing character other than a space, a digit or a letter, and false otherwise.
int isprint( int c )
    Returns true if c is a printing character including space (' ') and false otherwise.
int isgraph( int c )
    Returns true if c is a printing character other than space (' ') and false otherwise.
The functions tolower and toupper are conversion functions. The tolower function converts an uppercase letter argument into the corresponding lowercase letter. If its argument is anything other than an uppercase letter, it returns the argument unchanged. Similarly, the toupper function converts a lowercase letter argument into the corresponding uppercase letter. If its argument is anything other than a lowercase letter, it returns the argument unchanged.
Sample Program
Let’s consider the following example to further demonstrate the use of the functions of ctype.h. Suppose we write a program which prompts the user to enter a string. The string entered is then checked to count the different types of characters (digits, uppercase and lowercase letters, white space etc.). We keep a counter for each category of character entered. When the user ends the input, the number of characters entered in each category is displayed. In this example we use the function getchar(), instead of cin, to get the input. This function is declared in the header file stdio.h. It reads a single character from the input buffer or keyboard. It can read the new line character ‘\n’ (the ENTER key), so we run the input loop until the user presses the ENTER key. As soon as getchar() gets the ENTER key (i.e. the new line character ‘\n’), the loop is terminated. We know that every expression in C has a value. When we use an assignment (as in c = getchar() in our program), the value assigned to the left hand side variable is also the value of the whole expression. Thus (c = getchar()) yields the value that is assigned to c. This value is then compared with the new line character ‘\n’. If they are not equal, then inside the loop we apply the tests on c to check whether it is an uppercase letter, a lowercase letter, a digit etc. In this way, the whole string entered by the user is processed character by character.
Following is the code of this program.
// Example: analysis of text using <ctype.h>
#include <iostream>
#include <stdio.h>
#include <ctype.h>
using namespace std;

int main()
{
    int c;   // int rather than char, so that EOF can also be detected
    int lc = 0, uc = 0, dig = 0, ws = 0, pun = 0, oth = 0;
    cout << "Please enter a character string and then press ENTER: ";
    // Analyse text as it is input:
    while ((c = getchar()) != '\n' && c != EOF)
    {
        if (islower(c))
            lc++;
        else if (isupper(c))
            uc++;
        else if (isdigit(c))
            dig++;
        else if (isspace(c))
            ws++;
        else if (ispunct(c))
            pun++;
        else
            oth++;
    }
    // display the counts of different types of characters
    cout << "You typed:" << endl;
    cout << "lower case letters = " << lc << endl;
    cout << "upper case letters = " << uc << endl;
    cout << "digits = " << dig << endl;
    cout << "white space = " << ws << endl;
    cout << "punctuation = " << pun << endl;
    cout << "others = " << oth << endl;
    return 0;
}
A sample output of the program is given below.
Please enter a character string and then press ENTER: Sixty Five = 65.00
You typed:
lower case letters = 7
upper case letters = 2
digits = 4
white space = 3
punctuation = 2
others = 0
String Conversion Functions
The header file stdlib.h declares functions used for different conversions. When the input we get is of a type other than the type of the variable in which the value is to be stored, we need to convert it into the other type. These conversion functions take an argument of one type and return it after converting it into another type. The functions and their descriptions are given in the table below.
Prototype
    Description
double atof( const char *nPtr )
    Converts the string nPtr to double.
int atoi( const char *nPtr )
    Converts the string nPtr to int.
long atol( const char *nPtr )
    Converts the string nPtr to long int.
double strtod( const char *nPtr, char **endPtr )
    Converts the string nPtr to double.
long strtol( const char *nPtr, char **endPtr, int base )
    Converts the string nPtr to long.
unsigned long strtoul( const char *nPtr, char **endPtr, int base )
    Converts the string nPtr to unsigned long.
Use of these functions:
While writing main() in a program, we can put parameters inside its parentheses, written as int argc, char **argv. Here argc is the count of the arguments passed to the program, including the name of the program itself, while argv is a vector (an array) of strings. These are used when command line arguments are given to the program. The arguments on the command line are always character strings, so a number on the command line (for example 12.8 or 45) is stored as a string. To use such numbers in the program, we need these conversion functions.
Following is a simple program which demonstrates the use of the atoi function. This program prompts the user to enter an integer between 10 and 100, and checks whether a valid integer has been entered.
// This program demonstrates the use of the atoi function
#include <iostream>
#include <stdlib.h>
using namespace std;

int main()
{
    int anInteger;
    char myInt[20];
    cout << "Enter an integer between 10-100 : ";
    cin >> myInt;
    if (atoi(myInt) == 0)
        cout << "\nError : Not a valid input";   // could be non numeric
    else
    {
        anInteger = atoi(myInt);
        if (anInteger < 10 || anInteger > 100)
            cout << "\nError : only integers between 10-100 are allowed!";
        else
            cout << "\n OK, you have entered " << anInteger;
    }
    return 0;
}
The output of the program is as follows.
Enter an integer between 10-100 : 45.5
OK, you have entered 45
String Functions
We have already seen a program to guess a number stored in the computer. Similarly, to find a name (which is a character array) among many names in memory, we can compare two strings by comparing each character of the first string with the corresponding character of the second string. Before doing this, we can check the lengths of both strings. C provides functions to compare strings, to copy a string and for other string manipulations.
The following table shows the string manipulation functions and their descriptions. All these functions are declared in the header file string.h of the C library.
Function prototype
    Function description
char *strcpy( char *s1, const char *s2 )
    Copies string s2 into character array s1. The value of s1 is returned.
char *strncpy( char *s1, const char *s2, size_t n )
    Copies at most n characters of string s2 into array s1. The value of s1 is returned.
char *strcat( char *s1, const char *s2 )
    Appends string s2 to array s1. The first character of s2 overwrites the terminating null character of s1. The value of s1 is returned.
char *strncat( char *s1, const char *s2, size_t n )
    Appends at most n characters of string s2 to array s1. The first character of s2 overwrites the terminating null character of s1. The value of s1 is returned.
int strcmp( const char *s1, const char *s2 )
    Compares string s1 to s2. Returns a negative number if s1 < s2, zero if s1 == s2 or a positive number if s1 > s2.
int strncmp( const char *s1, const char *s2, size_t n )
    Compares up to n characters of string s1 to s2. Returns a negative number if s1 < s2, zero if s1 == s2 or a positive number if s1 > s2.
size_t strlen( const char *s )
    Determines the length of string s. The number of characters preceding the terminating null character is returned.
Let’s look at the string copy function, strcpy. The prototype of this function is
char *strcpy( char *s1, const char *s2 )
Here the first argument is a pointer to a character array s1 whereas the second argument is a pointer to a string s2. The string s2 is copied into s1, including its terminating null character, and a pointer to the resultant string is returned. The string s2 remains the same. We can describe s1 as the destination string and s2 as the source string. As the source remains the same during the execution of strcpy and the other string functions, the const keyword is used before the source string. The const keyword prevents any change to the source string (i.e. s2). The array s1 must be large enough to hold the copy.
If we want to copy a number of characters of a string instead of the entire string, the function strncpy is employed. The function strncpy takes a pointer to the destination string (s1), a pointer to the source string (s2) and a third argument, size_t n. Here n is the maximum number of characters that we want to copy from s2 into s1. The array s1 must be large enough to hold n characters. Note that if s2 has n or more characters, strncpy does not place a terminating null character in s1.
The next function is strcat (string concatenation). This function concatenates (joins) two strings. For example, suppose one string holds the first name of a student and another string holds the last name. We can concatenate these two strings to get a string which holds the first and the last name of the student. For this purpose, we use the strcat function. The prototype of this function is
char *strcat( char *s1, const char *s2 )
This function writes the string s2 (source) at the end of the string s1 (destination). The characters of s1 are not overwritten, except for its terminating null character, which is replaced by the first character of s2. The array s1 must be large enough to hold both strings. We can concatenate a limited number of characters of s2 to s1 by using the function strncat. Here we provide the function three arguments: a character pointer to s1, a character pointer to s2 and the maximum number of characters to be concatenated. The prototype of this function is written as
char *strncat( char *s1, const char *s2, size_t n )
Examples
Let’s consider some simple examples to demonstrate the use of the strcpy, strncpy, strcat and strncat functions. We begin with the functions strcpy and strncpy.
Example 1
// Program to display the operation of strcpy() and strncpy()
#include <iostream>
#include <string.h>
using namespace std;

int main()
{
    char string1[15] = "String1";
    char string2[15] = "String2";
    char string3[15] = "";
    cout << "Before the copy :" << endl;
    cout << "String 1:\t" << string1 << endl;
    cout << "String 2:\t" << string2 << endl;
    // copy the whole string
    strcpy(string2, string1);   // copy string1 into string2
    cout << "After the copy :" << endl;
    cout << "String 1:\t" << string1 << endl;
    cout << "String 2:\t" << string2 << endl;
    // copy three characters of string1 into string3
    strncpy(string3, string1, 3);
    string3[3] = '\0';   // strncpy does not add the terminating null here
    cout << "strncpy (string3, string1, 3) = " << string3 << endl;
    return 0;
}
Following is the output of the program.
Before the copy :
String 1: String1
String 2: String2
After the copy :
String 1: String1
String 2: String1
strncpy (string3, string1, 3) = Str
Example 2 (strcat and strncat)
The following example demonstrates the use of the functions strcat and strncat.
// Program to display the operation of strcat() and strncat()
#include <iostream>
#include <string.h>
using namespace std;

int main()
{
    char s1[ 40 ] = "Welcome to ";   // large enough to hold s1 followed by s2
    char s2[] = "Virtual University ";
    char s3[ 40 ] = "";
    cout << "s1 = " << s1 << endl << "s2 = " << s2 << endl << "s3 = " << s3 << endl;
    cout << "strcat( s1, s2 ) = " << strcat( s1, s2 ) << endl;
    cout << "strncat( s3, s1, 7 ) = " << strncat( s3, s1, 7 ) << endl;
    return 0;
}
The output of the program is given below.
s1 = Welcome to
s2 = Virtual University
s3 =
strcat( s1, s2 ) = Welcome to Virtual University
strncat( s3, s1, 7 ) = Welcome
Now we come to the function strcmp. This function compares two strings and returns an integer value depending upon the result of the comparison. The prototype of this function is
int strcmp( const char *s1, const char *s2 )
This function returns a number less than zero (a negative number) if s1 is less than s2. It returns zero if s1 and s2 are identical, and a positive number (greater than zero) if s1 is greater than s2. The strings are compared character by character using the character codes, so the first pair of characters that differ decides the result. The space character and the case of the letters are also considered while comparing two strings. So “Hello”, “hello” and “He llo” are three different strings; they are not identical.
Similarly, there is a function strncmp, which compares a number of characters of two strings. The prototype of this function is
int strncmp( const char *s1, const char *s2, size_t n )
Here s1 and s2 are two strings and n is the maximum number of characters of s1 and s2 that are compared. Its return type is also int. It returns a negative number if the first n characters of s1 are less than the first n characters of s2. It returns zero if the first n characters of s1 and s2 are identical, and a positive number if the first n characters of s1 are greater than those of s2.
Now we will talk about the function strlen (string length), which is used to determine the length of a character string. The prototype of this function is given below.
size_t strlen( const char *s )
This function determines the length of the string s. The number of characters preceding the terminating null character is returned. The return type size_t is an unsigned integer type.
Search Functions
C provides another set of functions relating to strings, called search functions. With the help of these functions, we can carry out different types of search in a string. For example, we can find the position at which a specific character exists. We can search for a character starting from any position in the string. We can find the part of a string that precedes or follows a specific position. We can find a string inside another string. These functions are given in the following table.
Function prototype
    Function description
char *strchr( const char *s, int c );
    Locates the first occurrence of character c in string s. If c is found, a pointer to c in s is returned. Otherwise, a NULL pointer is returned.
size_t strcspn( const char *s1, const char *s2 );
    Determines and returns the length of the initial segment of string s1 consisting of characters not contained in string s2.
size_t strspn( const char *s1, const char *s2 );
    Determines and returns the length of the initial segment of string s1 consisting only of characters contained in string s2.
char *strpbrk( const char *s1, const char *s2 );
    Locates the first occurrence in string s1 of any character in string s2. If a character from string s2 is found, a pointer to the character in string s1 is returned. Otherwise, a NULL pointer is returned.
char *strrchr( const char *s, int c );
    Locates the last occurrence of c in string s. If c is found, a pointer to c in string s is returned. Otherwise, a NULL pointer is returned.
char *strstr( const char *s1, const char *s2 );
    Locates the first occurrence in string s1 of string s2. If the string is found, a pointer to the string in s1 is returned. Otherwise, a NULL pointer is returned.
char *strtok( char *s1, const char *s2 );
    A sequence of calls to strtok breaks string s1 into “tokens” (logical pieces such as words in a line of text) separated by characters contained in string s2. The first call contains s1 as the first argument, and subsequent calls to continue tokenizing the same string contain NULL as the first argument. A pointer to the current token is returned by each call. If there are no more tokens when the function is called, NULL is returned.
Example 3
Here is an example, which shows the use of different string manipulation functions.
The code of the program is given below.
// A program which shows string manipulation using <string.h>
#include <iostream>
#include <string.h>
using namespace std;

int main()
{
    char s1[] = "Welcome to ";
    char s2[] = "Virtual University";
    char s3[] = "Welcome to Karachi";
    char city[] = "Karachi";
    char province[] = "Sind";
    char s[80];
    cout << "s1 = " << s1 << endl << "s2 = " << s2 << endl;
    cout << "s3 = " << s3 << endl;
    // function for string length
    cout << "The length of s1 = " << strlen(s1) << endl;
    cout << "The length of s2 = " << strlen(s2) << endl;
    cout << "The length of s3 = " << strlen(s3) << endl;
    strcpy(s, "Hyderabad");   // string copy
    cout << "The nearest city to " << city << " is " << s << endl;
    strcat(s, " and ");       // string concatenation
    strcat(s, city);
    strcat(s, " are in ");
    strcat(s, province);
    strcat(s, ".\n");
    cout << s;
    if (!(strcmp(s1, s2)))    // ! is used as zero is returned if s1 & s2 are equal
        cout << "s1 and s2 are identical" << endl;
    else
        cout << "s1 and s2 are not identical" << endl;
    if (!(strncmp(s1, s3, 7)))   // ! is used as zero is returned for equality
        cout << "First 7 characters of s1 and s3 are identical" << endl;
    else
        cout << "First 7 characters of s1 and s3 are not identical" << endl;
    return 0;
}
Following is the output of the program.
s1 = Welcome to
s2 = Virtual University
s3 = Welcome to Karachi
The length of s1 = 11
The length of s2 = 18
The length of s3 = 18
The nearest city to Karachi is Hyderabad
Hyderabad and Karachi are in Sind.
s1 and s2 are not identical
First 7 characters of s1 and s3 are identical
Exercises
1: Write a program that displays the ASCII code set in tabular form on the screen.
2: Write your own functions for different manipulations of strings.
3: Write a program, which uses different search functions.