1 © Janice Regan, CMPT 102, Sept. 2006 CMPT 102 Introduction to Scientific Computer Programming Strings

CMPT 102 Introduction to Scientific Computer Programming

Page 1: CMPT 102 Introduction to Scientific Computer Programming

CMPT 102 Introduction to Scientific Computer Programming

Strings

Page 2: CMPT 102 Introduction to Scientific Computer Programming

Simple and Composite Variables

• We have studied simple variables
    • A simple variable describes a single value
    • A simple variable has an identifier
    • A simple variable has a type that describes the properties of the value of the variable, the permissible operations for the variable, and the representation of the variable in computer memory
• We can also have composite variables
    • These variables describe a group of values
    • Arrays: all values in the group have the same type
    • Structures: different values in the group can have different types

Page 3: CMPT 102 Introduction to Scientific Computer Programming

Composite Variables

• Composite variables describe a group of values
    • 1-dimensional arrays of variables of a particular type (all entries must have the same type)
    • Multi-dimensional arrays of variables of a particular type (all entries must have the same type)
    • Structures containing groups of variables of different types
• Strings are another special type that builds on arrays
    • An array of characters
    • A set of special operations appropriate for text

Page 4: CMPT 102 Introduction to Scientific Computer Programming

One-Dimensional (1-D) Arrays

• An array is an indexed data structure
• All variables stored in an array are of the same data type
• An element of an array is accessed using the array name and an index or subscript
• The name of the array is the address of the first element and the subscript is the offset
• In C, the subscripts always start with 0 and increment by 1

Page 5: CMPT 102 Introduction to Scientific Computer Programming

Example String Declaration in C

    char list[10];

• Allocates memory for 10 characters
• Ten adjacent locations in memory are allocated
• Remember C does not perform any bounds checking on arrays

    list[0] list[1] list[2] list[3] list[4] list[5] list[6] list[7] list[8] list[9]

Page 6: CMPT 102 Introduction to Scientific Computer Programming

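Initializing 1-D Arrays

• Strings are not the same as 1-D character arrays
• You can specify individual values for each character in a 1-D character array

    /* put one character in each element of the array */
    char list[8] = {'h', 'e', 'l', 'l', 'o'};

• After initialization memory looks like

    list[0]  list[1]  list[2]  list[3]  list[4]  list[5]  list[6]  list[7]
    'h'      'e'      'l'      'l'      'o'      '\0'     '\0'     '\0'

• Elements that have no initializer (list[5] to list[7]) are set to 0, which is the character '\0'
• list[8] is not part of the array: the valid subscripts are 0 to 7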

Page 7: CMPT 102 Introduction to Scientific Computer Programming

Difference: String vs 1D character array

• Method of initialization
• A string always ends in a null termination character (\0)
    • This tells all the functions in the string library where the string ends
• Use of the null termination character allows strings of different lengths to be stored in a character array of a single length

Page 8: CMPT 102 Introduction to Scientific Computer Programming

Strings of different lengths

• Strings of different lengths can be stored in a character array
• The maximum number of characters in the string is the number of characters in the array minus one
• Blanks can be included in the string
• Blanks count as characters

    char list[8] = {"hello"};
    char list1[8] = {"hi jane"};

            [0]   [1]   [2]   [3]   [4]   [5]   [6]   [7]
    list    'h'   'e'   'l'   'l'   'o'   '\0'  '\0'  '\0'
    list1   'h'   'i'   ' '   'j'   'a'   'n'   'e'   '\0'

• Elements after the end of the string literal (list[6] and list[7]) are also set to '\0'
• Subscript 8 is past the end of both arrays

Page 9: CMPT 102 Introduction to Scientific Computer Programming

Avoid a common problem (1)

• C does not perform any bounds checking on arrays
• This means that you can accidentally change the values of other variables by changing a value you refer to as an element of the array, which is not actually part of the array
• For a string variable this is particularly easy
• You must remember that the character array mystring[20] holds a string of no more than 19 characters
    • "hello my friend" has 15+1 characters
    • "joe" has 3+1 characters
• REMEMBER THE NULL TERMINATION CHARACTER
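The "length plus one" rule can be checked in code before a string is stored. The following is a minimal sketch, not part of the original slides; the helper name fits_with_terminator is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if text and its terminating '\0'
   fit in a character array of array_size elements, 0 otherwise. */
int fits_with_terminator(const char *text, size_t array_size)
{
    /* strlen does not count the '\0', so one extra element is needed */
    return strlen(text) + 1 <= array_size;
}
```

With this helper, "hello my friend" (15+1 characters) fits in mystring[20], while "hello" (5+1 characters) does not fit in an array of 5 characters.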

Page 10: CMPT 102 Introduction to Scientific Computer Programming

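Avoid a common problem (2)

    int count = 3;
    char myArray[5] = {"hello"};

• After the first declaration memory looks like

    myArray[0]  myArray[1]  myArray[2]  myArray[3]  myArray[4]  count
    ?           ?           ?           ?           ?           3

• After the second declaration statement above

    myArray[0]  myArray[1]  myArray[2]  myArray[3]  myArray[4]  count
    'h'         'e'         'l'         'l'         'o'         3

• "hello" needs 5+1 characters, but myArray has room for only 5
• C accepts this initializer and stores the five letters without the \0, so myArray does not hold a valid string
• String library functions that look for the \0 will read past the end of myArray into whatever follows it in memory (pictured here as count)
• REMEMBER: Leave room for the \0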

Page 11: CMPT 102 Introduction to Scientific Computer Programming

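Avoid a common problem (3)

    int count[4] = {1, 2, 3, 5};
    char mySt[5] = {"my name"};

• After the first declaration memory looks like

    mySt[0]  mySt[1]  mySt[2]  mySt[3]  mySt[4]  count[0]  count[1]  count[2]  count[3]
    ?        ?        ?        ?        ?        1         2         3         5

• "my name" needs 7+1 characters, but mySt has room for only 5
• The compiler is required to report an initializer that is longer than the array; a compiler that only warns typically stores just the first 5 characters

    mySt[0]  mySt[1]  mySt[2]  mySt[3]  mySt[4]  count[0]  count[1]  count[2]  count[3]
    'm'      'y'      ' '      'n'      'a'      1         2         3         5

• mySt has no terminating \0, so the string library breaks
• Copying the same text into mySt at run time (for example with strcpy) does write past the end of the array and can corrupt neighbouring variables such as count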

Page 12: CMPT 102 Introduction to Scientific Computer Programming

Avoid a common problem (4)

• C does not perform any bounds checking on arrays
• By initializing or changing the contents of a string with a string that is longer than will fit into the character array associated with the string, it is possible to change the value of a completely different variable and to break the string library for the string being initialized
• It is imperative that you be very careful to avoid using strings longer than the allocated space

Page 13: CMPT 102 Introduction to Scientific Computer Programming

Arrays of strings

• Declare your array of strings

    #define NUMNAMES 20
    #define MAXNAMELEN 32
    char names[NUMNAMES][MAXNAMELEN];

• Declare and initialize your array

    #define NUMMONTHS 12
    #define MONTHNAMESIZE 10
    char month[NUMMONTHS][MONTHNAMESIZE] = { "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December" };

Page 14: CMPT 102 Introduction to Scientific Computer Programming

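Initializing an array of strings

    char month[12][10] = { "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December" };

    month[0]  'J'  'a'  'n'  'u'  'a'  'r'  'y'  '\0' '\0' '\0'
    month[1]  'F'  'e'  'b'  'r'  'u'  'a'  'r'  'y'  '\0' '\0'
    month[2]  'M'  'a'  'r'  'c'  'h'  '\0' '\0' '\0' '\0' '\0'
    month[3]  'A'  'p'  'r'  'i'  'l'  '\0' '\0' '\0' '\0' '\0'
    month[4]  'M'  'a'  'y'  '\0' '\0' '\0' '\0' '\0' '\0' '\0'
    month[5]  'J'  'u'  'n'  'e'  '\0' '\0' '\0' '\0' '\0' '\0'
    month[6]  'J'  'u'  'l'  'y'  '\0' '\0' '\0' '\0' '\0' '\0'
    month[7]  'A'  'u'  'g'  'u'  's'  't'  '\0' '\0' '\0' '\0'
    month[8]  'S'  'e'  'p'  't'  'e'  'm'  'b'  'e'  'r'  '\0'

• Each row holds one string; the elements after the end of each name are set to '\0'
• "September" (9 characters) plus its \0 fills its row completely
• The rows for October, November and December follow the same pattern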

Page 15: CMPT 102 Introduction to Scientific Computer Programming

Initializing a string

• You can initialize a simple variable either in the declaration statement, or using separate assignment statements following the declaration statements
• You can also set the value of a string with statements that follow the declaration statements (by assigning to individual characters, or by using a string library function such as strcpy for the whole string)
• When initializing a string remember to be sure that you do not put more characters in the string than there is space for

Page 16: CMPT 102 Introduction to Scientific Computer Programming

Initializing or using arrays of strings

    char list[10][8];   /* 10 strings, each with room for 7 characters + \0 */
    int i;

    /* Initialize each element of array list to "mystart" */
    for (i = 0; i < 10; i++)
    {
        strcpy(list[i], "mystart");
    }

Page 17: CMPT 102 Introduction to Scientific Computer Programming

Putting data into a 1-D Array

• Another common way of assigning values to strings or arrays of strings is to read data values from a file directly into the string or array of strings
    • Each value read from the file is assigned to a single string (for example names[6])
    • The string stored in the ith row of the array of strings names is referred to as names[i]
• Note that checks to determine that the file was opened correctly and that data was read correctly have been omitted from the example; they should not be omitted from your code

Page 18: CMPT 102 Introduction to Scientific Computer Programming

Array Input from a data file

    #define NUMPEOPLE 30
    #define NAMELEN 32

    char names[NUMPEOPLE][NAMELEN];
    char title[30];
    int ages[NUMPEOPLE];
    int k;
    FILE *registrants;

    registrants = fopen("registrants", "r");
    /* the widths 29 and 31 leave room for the \0 in title and names[k] */
    scanf("%29s", title);
    printf("%s\n", title);
    for (k = 0; k < NUMPEOPLE; k++)
    {
        fscanf(registrants, "%31s %d", names[k], &ages[k]);
        printf("%33s, %3d\n", names[k], ages[k]);
    }
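The example above leaves out the checks that the file was opened and that each record was read. One possible way to add them is sketched below. The function name read_registrants, its parameters, and the return convention (-1 when the file cannot be opened) are assumptions made for illustration, not part of the original slides.

```c
#include <stdio.h>

#define NAMELEN 32

/* Hypothetical helper: reads up to max "name age" pairs from the file
   at path. Returns the number of pairs read, or -1 if the file could
   not be opened. */
int read_registrants(const char *path, char names[][NAMELEN],
                     int ages[], int max)
{
    FILE *fp = fopen(path, "r");
    int k = 0;

    if (fp == NULL) {
        return -1;              /* the file was not opened correctly */
    }

    /* fscanf returns the number of values it converted; stop as soon
       as a full record (one string and one integer) is not read.
       %31s leaves room for the \0 in a row of NAMELEN characters. */
    while (k < max && fscanf(fp, "%31s %d", names[k], &ages[k]) == 2) {
        k++;
    }

    fclose(fp);
    return k;
}
```

The caller can compare the returned count with the number of records it expected and report an error if they differ.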

Page 19: CMPT 102 Introduction to Scientific Computer Programming

Notes on array input

• The string is read or written using %s
• When you write a string you write all characters in that string
    • The final character in the string is determined by the location of the null termination character \0
• When you read a string with %s, reading stops at the first whitespace character, so one word is read at a time
• When reading a string using scanf or fscanf there is no & before the name of the string
    • The name of the string is a reference to (the address of) the first element of the string

Page 20: CMPT 102 Introduction to Scientific Computer Programming

Strings as function parameters

• Arrays, or parts of arrays, can be passed as arguments to functions
• An element of a string can be used as a simple character variable parameter
    • It can be passed by value or by reference
• An entire string can be used as a parameter of a function
    • It can only be passed by reference using the name of the string (the name of the string is a reference to the location in memory of the first character in the string)
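A short sketch of both cases follows. The function names replace_char and is_blank are hypothetical and chosen only for illustration. The first receives the whole string by reference and changes the caller's array; the second receives a single element by value and works on a copy.

```c
#include <stddef.h>

/* Hypothetical helper: the whole string is passed by reference.
   text is the address of the first character, so the changes made
   here are visible in the caller's array. */
void replace_char(char text[], char from, char to)
{
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        if (text[i] == from) {
            text[i] = to;
        }
    }
}

/* Hypothetical helper: a single element is passed by value.
   The function receives a copy of the character. */
int is_blank(char c)
{
    return c == ' ';
}
```

For a string declared as char name[8] = "hi jane"; the call is_blank(name[2]) passes one element, and replace_char(name, ' ', '_') passes the whole string and leaves "hi_jane" in name.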

Page 21: CMPT 102 Introduction to Scientific Computer Programming

Strings as a data type

• Remember a data type has a group of objects (things) that can be combined in different ways using the operators for that data type
• The operators for strings are not those used for other data types (like +, -, = …)
• All operations on strings are performed using functions from the string library (other than reading and writing)
• To include the string library in your program

    #include <string.h>

Page 22: CMPT 102 Introduction to Scientific Computer Programming

Assigning strings

• You have seen that = can be used when assigning values to strings in a declaration
• = cannot be used to assign a string literal to a string. The following is not valid:

    mystring = "testinput";

• To copy one string to another the string library function strcpy is usually used

    strcpy(mystring, "testinput");
    strcpy(myCopiedString, myOriginalString);

Page 23: CMPT 102 Introduction to Scientific Computer Programming

Assigning strings

• To copy one string to another the string library function strcpy is usually used
• To copy part of a string (a substring), or to ensure you do not copy more characters into a string than it can hold, you can also use strncpy

    strncpy(mystring, "testinput", 5);

• Note that strncpy copies only the first 5 characters of "testinput" (testi) and, because the source is longer than 5 characters, does not add a \0 to the end of the copied string
• strcpy copies the entire string (even if it is longer than the available space!) including the \0
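Because strncpy may leave the destination without a \0, a common pattern is to copy one character fewer than the array holds and then write the terminator explicitly. The following is a minimal sketch of that pattern; the helper name copy_bounded is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dest_size - 1 characters of src
   into dest and always ends dest with '\0'. */
void copy_bounded(char dest[], size_t dest_size, const char src[])
{
    if (dest_size == 0) {
        return;                        /* no room even for the '\0' */
    }
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';        /* strncpy may not have added one */
}
```

With char mySt[5]; the call copy_bounded(mySt, sizeof mySt, "my name") stores "my n" followed by \0 and writes nothing outside mySt.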

Page 24: CMPT 102 Introduction to Scientific Computer Programming

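Avoid a common problem (3)

    int count[4] = {1, 2, 3, 5};
    char mySt[5];
    strcpy(mySt, "my name");

• After the declarations memory looks like

    mySt[0]  mySt[1]  mySt[2]  mySt[3]  mySt[4]  count[0]  count[1]  count[2]  count[3]
    ?        ?        ?        ?        ?        1         2         3         5

• After the strcpy statement above

    mySt[0]  mySt[1]  mySt[2]  mySt[3]  mySt[4]  count[0]  count[1]  count[2]  count[3]
    'm'      'y'      ' '      'n'      'a'      'm'       'e'       '\0'      5

• mySt has no terminating \0 in its array
• The last three characters ('m', 'e' and '\0') are written past the end of mySt, pictured here as overwriting part of count
• The picture is a simplification: which variable follows mySt in memory depends on the compiler, and the behaviour of the program is undefined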

Page 25: CMPT 102 Introduction to Scientific Computer Programming

Finding the length of a string

• To find the number of characters actually stored in a string use the string library function strlen

    int len;
    char mystring[30];
    strcpy(mystring, "testing");
    len = strlen(mystring);
    /* len now has a value 7 */

• strlen counts the number of characters in the string, not including the terminating \0
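To see how the \0 determines the length, the counting can be written out by hand. This is only an illustrative sketch with a hypothetical name (count_chars); in real programs use strlen from the string library.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters in text up to, but not
   including, the first '\0'. */
size_t count_chars(const char text[])
{
    size_t n = 0;

    while (text[n] != '\0') {
        n++;
    }
    return n;
}
```

For the array mystring above, count_chars(mystring) gives 7 even though the array has room for 30 characters.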

Page 26: CMPT 102 Introduction to Scientific Computer Programming

Concatenating Strings

• Combining two strings into a single string
• Use the string library functions strcat or strncat
• strcat and strncat take one string and append it to the end of another string
    • The terminating \0 is removed from the end of the first string before the second string is added
    • A terminating \0 is placed after the last character appended from the second string
• strcat and strncat can create a string too long to fit in the allocated string storage

Page 27: CMPT 102 Introduction to Scientific Computer Programming

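Example: using strcat

    char name1[10] = "marie";
    char name2[10] = "anne";
    strcat(name2, name1);

• Before the call to strcat

    name1  'm'  'a'  'r'  'i'  'e'  '\0' '\0' '\0' '\0' '\0'
    name2  'a'  'n'  'n'  'e'  '\0' '\0' '\0' '\0' '\0' '\0'

• After the call to strcat

    name1  'm'  'a'  'r'  'i'  'e'  '\0' '\0' '\0' '\0' '\0'
    name2  'a'  'n'  'n'  'e'  'm'  'a'  'r'  'i'  'e'  '\0'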

Page 28: CMPT 102 Introduction to Scientific Computer Programming

Concatenating Strings

    char mystring[20] = "start input: ";
    char mystring1[20] = "input1";
    char mystring2[20] = " and output";

    /* after the following strcat mystring1 contains */
    /* "input1 and output" */
    strcat(mystring1, mystring2);

    /* after the following strcat mystring would contain */
    /* "start input: input1 and output" */
    /* this string overflows the array mystring */
    strcat(mystring, mystring1);

Page 29: CMPT 102 Introduction to Scientific Computer Programming

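Example: using strncat

    char name1[10] = "marie";
    char name2[10] = "anne";
    strncat(name2, name1, 2);

• Before the call to strncat

    name1  'm'  'a'  'r'  'i'  'e'  '\0' '\0' '\0' '\0' '\0'
    name2  'a'  'n'  'n'  'e'  '\0' '\0' '\0' '\0' '\0' '\0'

• After the call to strncat

    name1  'm'  'a'  'r'  'i'  'e'  '\0' '\0' '\0' '\0' '\0'
    name2  'a'  'n'  'n'  'e'  'm'  'a'  '\0' '\0' '\0' '\0'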

Page 30: CMPT 102 Introduction to Scientific Computer Programming

Concatenating Strings

    #define STRLEN 20
    int len, added;
    char mystring[STRLEN] = "start input: ";
    char mystring1[STRLEN] = "input1 and output";

    /* To prevent overflow find the number of */
    /* characters that can be added to mystring */
    /* added (6) = STRLEN (20) - len (13) - 1 */
    len = strlen(mystring);
    added = STRLEN - len - 1;

    /* after the following strncat mystring contains */
    /* "start input: input1" */
    strncat(mystring, mystring1, added);
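The same calculation can be wrapped in a function so that it is not repeated at every call. This is a sketch under stated assumptions: the name append_bounded is hypothetical, and dest must already hold a valid string that fits in dest_size characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends as much of src as fits in dest, an
   array of dest_size characters, always leaving room for the '\0'.
   Assumes dest already holds a valid string shorter than dest_size. */
void append_bounded(char dest[], size_t dest_size, const char src[])
{
    size_t len = strlen(dest);

    if (len + 1 < dest_size) {
        /* strncat appends at most this many characters, then a '\0' */
        strncat(dest, src, dest_size - len - 1);
    }
}
```

With the declarations above, append_bounded(mystring, STRLEN, mystring1) leaves "start input: input1" in mystring, the same result as the explicit calculation.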

Page 31: CMPT 102 Introduction to Scientific Computer Programming

Comparing Strings

• To compare 2 strings usually use the string library function strcmp

    strcmp(mystring1, mystring2)

• strcmp returns an integer
    • If mystring1 is alphabetically before mystring2 a negative number will be returned
    • If the strings are identical 0 will be returned
    • If mystring2 is alphabetically before mystring1 a positive number will be returned

Page 32: CMPT 102 Introduction to Scientific Computer Programming

ASCII equivalents

• Each alphabetic character, digit, or other character (including whitespace characters) has an integer equivalent value
• These integer values are used by strcmp to determine the alphabetical ordering
    • All uppercase letters precede lowercase letters
    • All digits precede uppercase letters
    • If a string st1 contains only the first few characters of a longer string st2, st1 precedes st2 when compared
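The ordering rules above can be tried out with a small wrapper around strcmp. The sketch below assumes the ASCII character set, as the slide does; the helper name comes_before is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if st1 is ordered before st2 by
   strcmp, 0 otherwise. Only the sign of the strcmp result is used,
   because the exact value returned is not specified. */
int comes_before(const char *st1, const char *st2)
{
    return strcmp(st1, st2) < 0;
}
```

With ASCII values, comes_before("Zebra", "apple") is true because every uppercase letter precedes every lowercase letter, comes_before("9", "A") is true because digits precede uppercase letters, and comes_before("test", "testing") is true because the shorter string is a prefix of the longer one.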

Page 33: CMPT 102 Introduction to Scientific Computer Programming

Comparing Parts of Strings

• To compare the first n characters of 2 strings use the string library function strncmp

    strncmp(mystring1, mystring2, n)

• strncmp returns an integer
    • If the first n characters of mystring1 are alphabetically before those of mystring2 a negative number will be returned
    • If the first n characters of the strings are identical 0 will be returned
    • If the first n characters of mystring2 are alphabetically before those of mystring1 a positive number will be returned
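One common use of strncmp is to test whether a string begins with a given prefix. The following is a minimal sketch; the helper name starts_with is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if text begins with prefix,
   0 otherwise. Only the first strlen(prefix) characters are compared. */
int starts_with(const char *text, const char *prefix)
{
    return strncmp(text, prefix, strlen(prefix)) == 0;
}
```

For example, starts_with("testinput", "test") is true, while starts_with("test", "testinput") is false because the comparison reaches the \0 of the shorter string first.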

Page 34: CMPT 102 Introduction to Scientific Computer Programming

Conversion to and from strings

• Strings to numbers: sscanf
    • Works like fscanf but reads from a string
• Numbers to strings: sprintf
    • Works just like fprintf but writes into a string

    char mystring[20];
    int myvalue1 = 23, myvalue2 = 46;
    sprintf(mystring, "%s:%2d, %2d", "myvalues are", myvalue1, myvalue2);
    /* now mystring contains */
    /* myvalues are:23, 46 */
    /* 19 characters + \0: this exactly fills mystring */
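The slide shows sprintf only. A matching sketch for the other direction, sscanf, is given below. The helper name parse_two_ints and the input layout "23, 46" are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads two integers separated by a comma from
   text. Returns 1 if both values were read, 0 otherwise. */
int parse_two_ints(const char *text, int *first, int *second)
{
    /* sscanf returns the number of values successfully converted */
    return sscanf(text, "%d, %d", first, second) == 2;
}
```

As with fscanf, the return value should be checked: if the text does not contain two integers in the expected layout, the variables passed in may not have been assigned.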

Page 35: CMPT 102 Introduction to Scientific Computer Programming

Character analysis

• You can also analyze a string (or a character array) one character at a time
• The ctype library (#include <ctype.h>) includes functions for such analysis
• Each of the is... functions returns an integer value. The value is nonzero if the condition checked is true (0 if it is false)
• tolower and toupper return the converted character (or the original character if no conversion applies)

    int isalpha(int mychar);   /* is an alphabetic char */
    int isdigit(int mychar);   /* is a numeral */
    int ispunct(int mychar);   /* is a non whitespace punctuation character */
    int isspace(int mychar);   /* is a whitespace character */
    int tolower(int mychar);   /* converts an uppercase letter to lower case */
    int toupper(int mychar);   /* converts a lowercase letter to upper case */
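Analyzing a string one character at a time usually means looping until the \0 is reached. The sketch below shows two such loops; the function names to_upper_in_place and count_digits are hypothetical. The cast to unsigned char is a standard precaution, because the ctype functions expect a value in the range of unsigned char (or EOF).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: converts every lowercase letter in text to
   upper case; other characters are left unchanged. */
void to_upper_in_place(char text[])
{
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        text[i] = (char)toupper((unsigned char)text[i]);
    }
}

/* Hypothetical helper: counts the numerals in text. */
int count_digits(const char text[])
{
    int count = 0;
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        if (isdigit((unsigned char)text[i])) {
            count++;
        }
    }
    return count;
}
```

For char line[] = "hi jane 42"; count_digits(line) gives 2, and after to_upper_in_place(line) the array holds "HI JANE 42".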

Page 36: CMPT 102 Introduction to Scientific Computer Programming

The ctype and string Libraries

We have had an introduction to some of the functions in these libraries.

These libraries are much more flexible than this subset of functions indicates

You should be able to read the function descriptions for the other functions in the string library and then use those functions in your programs