Part 9 -- Strings and Memory Allocation
---------------------------------------
Hello and welcome to Part 9. In this section we will discuss some
string functions and memory allocation.
Before I begin, I will do a short review of strings and characters. If
you recall from previous sections a string is usually defined in an
array like so:
char astring[] = "Welcome to C!";
or,
char astring[14] = "Welcome to C!";
or,
char astring[14] = {'W', 'e', 'l', 'c', 'o', 'm', 'e', ' ', 't', 'o',
                    ' ', 'C', '!', '\0'};
Notice that the string "Welcome to C!" is 13 characters long.
Also take note that we have declared room for 14. The last character is
the null character: '\0'. Even though the null character is written with
two characters in source code (a backslash and a 0), it only takes up one
element in the array.
In the first example: char astring[] = "Welcome to C!" we didn't put an
array size inside the brackets. The compiler counts the number of
characters in the string and then places that number in the brackets
automagically. Therefore the first and second examples above are the
same. What about the third? Well, it too is also the same. I don't
know why someone would want to declare an array to hold a string this
way, though I am sure that someone, somewhere has a reason to do such a
thing. Normally I do not suggest you declare a char array this way.
It's just there for an example.
If you do use the third way (as seen above) of assigning characters to
an array, make sure you append a '\0' yourself, as the compiler won't do
it for you.
Why is the '\0' so important? Because that is how functions such as
printf know where a string ends. If you took the null character out of
that string and tried to printf it to the screen, you might get the
string printed or you may not. The behavior is undefined. Your program
might simply crash. Who knows what could happen.
I don't recall discussing characters in any of my tutorial sections.
Since it's a relatively simple subject, I'll discuss it right now.
A character is any single character, such as 'A', '4', '#'. These are
all character values. It just so happens that a character also has a
numeric code assigned to it. For example, the letter 'a' on my system
has the numeric equivalent of the number 97. Where do these numbers
come from? They come from the character set your system uses, which is
usually ASCII (though some systems use EBCDIC). Check your OS
documentation to see which code table your computer uses.
Let's take a look at a program that uses char values:
/*------------------------------------------------------------------
Program: CHAREX.C
Author: Ian Patterson
Date: Sep. 24, 2001
Description: This is a program that shows a usage of the
char data type.
-----------------------------------------------------------------*/
#include <stdio.h>
int main(void)
{
    char MyChar;
    MyChar = 'a';
    /* To display type char in printf, use %c */
    printf("The variable MyChar contains: %c\n", MyChar);
    printf("The numeric code corresponding to MyChar is: %d\n", MyChar);
    return 0;
}
The above program should print out the letter 'a' and the numeric value
assigned to it. Now back to our talk on strings...
There is another way to define strings. A way without arrays. You use
pointers. We discussed pointers briefly in the last chapter. Actually,
the only pointers we discussed were pointers to int. In this section we
will discuss pointers to char.
It's rather the same. You define a char pointer the same way you would
define an int pointer. Like so:
char *ptrString;
Ok. Now that we have a pointer, we should make it point to something.
Since we want to display a string with pointers, we can actually make
our variable point to our string, like so:
ptrString = "Hello and welcome to C.";
Note that there is no '*' in front of ptrString here. We are assigning
to the pointer itself, not to the char it points at. The pointer now
points at the first character of the string.
Here is a snippet of code that would print this onto the output device:
...
...
char *ptrString = "Hello and welcome to C.";
/* use the %s conversion specifier to display a string */
printf("Variable ptrString is: %s", ptrString);
...
...
I do believe I said something about memory allocation in the title, so
that will be our next topic.
We will start with a definition of the malloc function. The following
was copied from the n869 public draft of the C language standard:
7.20.3.3 The malloc function
Synopsis
#include <stdlib.h>
void *malloc(size_t size);
Description:
The malloc function allocates space for an object whose
size is specified by size and whose value is indeterminate.
Returns:
The malloc function returns either a null pointer or a
pointer to the allocated space.
From the above synopsis we can see that we must include the header file
<stdlib.h> for the function prototype. The malloc function accepts a
size of type size_t. The type "size_t" is new to you. I'll go over it
now. Once again I look to the n869 public draft. I'll paraphrase what
it says. In short it says this:
size_t:
size_t is an unsigned integer type defined in <stddef.h>.
Ok. Does that mean you have to include the header file stddef.h? No.
Luckily, the C standard guarantees that size_t is placed in scope by
including any one of <stddef.h>, <stdio.h>, <stdlib.h>, <string.h>,
<time.h> or <wchar.h>.
On return, malloc will return either a NULL pointer or a pointer to the
allocated space. What is a NULL pointer? It's different from the null
character you use to end a string. Don't mix these two nulls up. I
like to think of the null that ends a string as NIL or sometimes NUL.
You can make your own definition if you like. For example:
#define NUL '\0'
I've done that on occasion. Back to our NULL pointer. The NULL pointer
is fairly interesting. In fact, it's interesting enough for people to
write pages and pages on the subject. I recommend before you continue
on in this tutorial to read the following from Steve Summit's FAQ for
comp.lang.c. Go to http://www.eskimo.com/~scs/C-faq/top.html and find
the section on NULL pointers. I will give you a shortened version.
In short, if you have a pointer, let's say: char *MyPtr; and for some
reason you need it to be NULL, then you have some possibilities. The most
common are:
char *MyPtr = NULL; /* Most common */
char *MyPtr = 0;
char *MyPtr = ((void *)0);
In the end, all three do the same thing: a constant zero used where a
pointer is expected is converted to a NULL pointer.
See the FAQ above for some more explanations.
Next I will show you a program that uses malloc to allocate some space.
Here's the code:
/*-------------------------------------------------------------
Program: MALLOC.C
Author: Ian Patterson
Date: Sep. 25, 2001
Description: This is a program that demonstrates memory
allocation using malloc.
------------------------------------------------------------*/
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    char *mychar;   /* pointer to char: we are allocating room for chars */
    if ((mychar = malloc(100)) == NULL) {
        puts("Memory allocation error!");
        return EXIT_FAILURE;
    }
    puts("Memory allocated successfully!");
    free(mychar);
    return EXIT_SUCCESS; /* EXIT_SUCCESS & EXIT_FAILURE are found */
}                        /* in <stdlib.h> */
The above program allocates 100 bytes. We know that this is room for
exactly 100 chars because a char is guaranteed to be 1 byte in size.
The split-up version below spells this out with sizeof(char).
We do the allocation and the test on the same line -- inside the "if"
statement. This can be broken up, like so:
mychar = malloc(100 * sizeof(char)); /* space for 100 chars */
if (mychar == NULL) {
    printf("Memory not allocated.\n");
    return EXIT_FAILURE;
}
Either way is correct. Choose the way that is easiest for you to
understand. If the allocation fails, we return with EXIT_FAILURE, which
is defined in <stdlib.h>. You could also use the exit() function in the
event of an error such as this.
Near the end of our program we use free() to release the memory we
allocated with malloc. Its use is simple. Pass it the pointer that
malloc returned. We'll return to this function later on.
Needless to say, this program is rather pointless. I will show you
some string functions and then we will be able to do something more
useful with malloc.
We will begin with the strcpy() function. From the n869 public draft:
Synopsis:
#include <string.h>
char *strcpy(char * restrict s1, const char * restrict s2);
Description:
The strcpy function copies the string pointed to by s2
(including the terminating null character) into the array
pointed to by s1. If copying takes place between objects
that overlap, the behavior is undefined.
Returns:
The strcpy function returns the value of s1.
There you have the technical definition of the strcpy function. Notice
in the prototype the keyword "restrict". I will not be discussing what
the restrict keyword does in this tutorial, but for our usage, an
understanding of restrict is not necessary. Following is a program that
makes use of the strcpy function:
/*-------------------------------------------------------------
Program: STRCOPY.C
Author: Ian Patterson
Date: Sep. 29, 2001
Description: This is a program that demonstrates the
use of the strcpy function.
------------------------------------------------------------*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char *Str1, *Str2;
    Str1 = "Welcome to Earth.";
    Str2 = "Walking on the Sun.";
    printf("Str1 = \"%s\"\n", Str1);
    printf("Str2 = \"%s\"\n\n", Str2);
    if ((Str2 = malloc(strlen(Str1)+1)) == NULL) {
        puts("Memory allocation error!");
        return EXIT_FAILURE;
    }
    strcpy(Str2, Str1);
    printf("Str2 is now: \"%s\"\n", Str2);
    free(Str2);
    return EXIT_SUCCESS;
}
We need three header files for this program. <stdio.h> for printf,
<stdlib.h> for malloc and EXIT_SUCCESS plus <string.h> for strcpy.
The first thing we do is make some variables. We know that strcpy
requires two pointers to char. We therefore make two pointers. After
that, we initialize them with some strings. Str1 and Str2 each point to
the first character of their string, which in both cases happens to be
the letter "W". We then call printf to display these strings, using the
"%s" conversion specifier.
After we display the strings, we allocate enough memory to store the
string we are going to copy into "Str2". We do this by getting the
length of the string we are going to copy with strlen and adding 1 for
the terminating null character. From then on Str2 points to the new
memory instead of the string "Walking on the Sun."
We then use the strcpy function to do the copying of the string. Then
we display the output. Notice I use some escape sequences to display
the quotes. I used the macro EXIT_SUCCESS as the return value.
The next string function that we'll deal with is strlen. Sometimes, it
may be necessary for you to find the length of a string. In cases such
as that, you can use this function. The synopsis of strlen follows.
Synopsis:
#include <string.h>
size_t strlen(const char *s);
Description:
The strlen function computes the length of the string
pointed to by s.
Returns:
The strlen function returns the number of characters
that precede the terminating null character.
The function returns a type "size_t" which was discussed earlier in this
part of the tutorial. It's an unsigned type. A simple example follows.
/*-------------------------------------------------------------
Program: STRLEN.C
Author: Ian Patterson
Date: Sep. 29, 2001
Description: This is a program that demonstrates the
use of the strlen function.
------------------------------------------------------------*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char inputVar[80], *change;
    size_t len;
    char *ret_ptr; /* Used to check return of fgets() */
    printf("Enter some text: ");
    /* Assign to the pointer itself: no '*' in front of ret_ptr */
    ret_ptr = fgets(inputVar, sizeof(inputVar), stdin);
    printf("\n");
    if (ret_ptr != NULL){
        printf("You put: %s", ret_ptr);
        change = strchr(inputVar, '\n'); /* Change \n to \0 */
        if (change != NULL)
            *change = '\0';
        else{
            printf("Error during strchr()\n");
            return EXIT_FAILURE;
        }
        len = strlen(inputVar);
        printf("Length was: %lu\n", (unsigned long)len);
    }
    else{
        printf("Error processing fgets()\n");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
There is our complete program. I have used a function that we haven't
discussed yet, called strchr. I will discuss it momentarily. The
program flow is as follows:
1. It asks the user to enter text.
2. Using fgets (so we can avoid buffer overrun) we accept input.
3. We then output whatever the user has typed in.
4. We find the '\n' character and replace it with a '\0'.
5. We get the length of the string using strlen.
6. We print the length to the screen.
7. Return a portable value to the caller (the OS usually).
Now, the first 3 steps were covered in other parts of this tutorial.
If you want a refresher, I suggest you go back and read the parts again.
Step 4 is something new. We use strchr. The reason we use strchr is
because fgets retains the newline character. Therefore, if you type in
"Hi There!", the length returned would be 10, even though you only typed
nine characters. The answer to this problem is to replace the '\n' with
a '\0'.
The strchr function scans a string for the FIRST occurrence of a given
character. The null terminator counts as part of the string, so you can
search for it too. Its prototype is:
char *strchr(const char *s, int c);
It returns a pointer to the located character or a NULL pointer if the
character does not appear in the string.
Take a look at the section of the program where we used strchr. What
happens there is strchr finds the '\n' and returns a pointer to that
position. We store that pointer in the "change" variable. We then
assign a new value to that location using: "*change = '\0';". After we
do that we can easily get the correct length of our input string.
Step 5 is the strlen function discussed above. It places the return
value into "len".
Step 6 displays "len". Notice that we are displaying an unsigned long.
Here's why. size_t is an unsigned integer type, but exactly which one is
up to your compiler, and the original C standard gives printf no
conversion specifier for it.
We can't simply go: "printf("%size_t", len);" It would not work.
So we cast it to unsigned long, the largest unsigned type in that
standard, and print it with "%lu".
In the new standard (C99), they have made a special length modifier to
use instead of having to cast all the time. Let's say you have a
compiler that is compliant to the new C standard. You could do this
instead:
"printf("%zu", len);"
Neat-o eh? Saves me some typing.
That's all there is to it. There are many string functions. I suggest
that you read a good book to find some others, or read your compiler
documentation. I will go over other string functions as needed by the
programs we will write later on in this tutorial. For now however, this
is all I'm going to write for Part 9.
Let's move on to Part 10.