Intermediate C Programming

Writing and Testing Programs 75

./ prog inputs / input2 > outputs / output224

diff e xpected / expected 2 outputs / output225

test3 : prog27

./ prog inputs / input3 > outputs / output328

diff e xpected / expected 3 outputs / output329

test4 : prog31

./ prog inputs / input4 > outputs / output432

diff e xpected / expected 4 outputs / output433

clean :35

/ bin / rm -f *. o prog outputs /*36

5.3 Invalid Memory Access

In Section 2.3.5, I said “if an array has n elements, the valid indexes are 0, 1, 2, ..., n−1.”

What happens if we use an invalid index? The simple answer is the program’s behavior

is undeﬁned. That means anything could happen, and it will not be predictable. If the

index is incorrect, the program will access a memory address that does not belong to the

array. Remember that as a programmer, you have no control over which memory addresses

your program can use. We do not know what is stored at an address outside the range of

the array. Consider this program:

/*1

* wro n gindex .c2

*/3

#in clude < stdio .h >4

#in clude < stdlib .h >5

#in clude < string .h >6

int main ( i n t argc , char * * argv )7

int x = -2;9

int arr [] = {0 , 1, 2 , 3 , 4};10

int y = 15;11

printf (" & x = %p , & y = %p\ n" , & x , & y);12

printf (" & arr [0] = %p , & arr [4] = %p \ n" , & arr [0] ,13

& arr [4]) ;14

printf (" x = %d , y = %d\ n" , x , y);15

arr [ -1] = 7;16

arr [5] = -23;17

printf (" x = %d , y = %d\ n" , x , y);18

arr [6] = 108;19

printf (" x = %d , y = %d\ n" , x , y);20

arr [7] = -353;21

printf (" x = %d , y = %d\ n" , x , y);22

return EXIT _SUCCES S ;23

}24

76 Intermediate C Programming

An array is created at line 10 and it has 5 elements. The valid indexes are 0, 1, 2, 3,

and 4. Lines 12 and 13 print the addresses of x, y, and the array. Lines 16, 17, 19, and 21

use incorrect indexes. If we compile, link, and execute this program, we may ﬁnd that the

values of x or y are changed because we are using incorrect indexes. This is not guaranteed,

and the results will depend on the speciﬁc compiler. This is the output when I run this

program:

& x = 0x7fffcabf4e68, & y = 0x7fffcabf4e6c

& arr[0] = 0x7fffcabf4e50, & arr[4] = 0x7fffcabf4e60

x = -2, y = 15

x = 108, y = 15

x = 108, y = -353

As we can see, x has changed because of this assignment:

arr [6] = 108;19

Similarly, y is changed because of this assignment:

arr [7] = -353;21

In this example, the gcc compiler has reordered the local variables in the call stack. The

addresses of x and y are larger than the addresses of the array elements. Thus, x and y are

changed when the indexes are 6 and 7 respectively. The program uses addresses that are

given to it by the operating system. If we run the above program again, we will likely see

diﬀerent addresses for x and y. It is possible that neither x nor y, but something else, is

changed due to using the invalid indexes.

The real problem is that the program’s behavior is undeﬁned. What does this mean more

precisely? The eﬀects of wrong indexes may change when the program runs on diﬀerent

machines. Sometimes, it seems nothing is wrong, even though there is a pernicious error

in the code. Sometimes, the values of x or y may be changed. Sometimes, the program

stops with the message “Segmentation fault (core dumped).” This means that the program

intends to read from or write to a memory address that does not belong to the program.

Modern computers usually run many programs at once. Each program is given part of the

memory. If one program tries to read from or write to a wrong address, the operating system

may stop the program. This protects the other programs running on the same computer.

Be careful about the word “may” here. The operating system does not keep track of

every memory address. Instead, the operating system uses “pages” as a unit. The size of

a page of memory varies on the operating system; 4KB is common. If the wrong address

is within a page given to a program, then the operating system will not stop the program.

That means that a program may modify a variable within a valid page unintentionally

without causing a segmentation fault. In this example, x and y are changed. It is called

a segmentation fault, and not a page fault, because some processors organize memory by

variable sized “segments”. Page and segment may be used together: A segment may contain

multiple pages. In fact, “page fault” is already used in the context of virtual memory. Thus,

the name “segmentation fault” remains. If the wrong address is outside the program’s

segments, then the operating system will stop the program. Try replacing

arr [7] = -353;21

arr [7000] = -353;21

Writing and Testing Programs 77

compile, link, and run the program again. We will very likely see “Segmentation fault (core

dumped)”. The index 7000 is too large and probably outside the page given by the operating

system.

5.4 Using valgrind to Check Memory Access Errors

In Linux, a program called valgrind can help detect problems of accessing invalid

addresses. Please check whether valgrind has been installed on your computer. To use

valgrind, add

$ valgrind –tool=memcheck –verbose

before the command running the program. For example,

$ gcc -g -Wall -Wshadow wrongindex.c -o wrongindex

$ valgrind –tool=memcheck –verbose ./wrongindex

The ﬁrst line uses gcc to create the executable ﬁle called wrongindex. To execute

this program, type ./wrongindex. The second line adds valgrind --tool=memcheck

--verbose in front of ./wrongindex. This will cause the program to be run within

valgrind, which in turn carefully checks every memory access to make sure that it is

valid. The output will include the following message:

Invalid write of size 4

at 0x4005D5: main (wrongindex.c:20)

This message says that something is wrong at the 20th line of the program. Sometimes

valgrind prints a lot to the computer screen. It is useful to direct valgrind’s output to a

log ﬁle, like so:

$ valgrind –tool=memcheck –verbose –log-ﬁle=valgrindlog ./wrongindex

Typing this long command repeatedly is too much work, and the command can be put

in the Makefile. This is the new Makefile for the program that determines whether an

array has distinct elements:

GCC = gcc1

CFLAGS = -g - Wall - Wshadow2

VALGRIND = valgrind -- tool = memc h eck -- verbose --log - file3

prog : aredi s tinct .o main .o6

$( GCC ) $ ( CFLAGS ) aredist inct . o main . o -o prog # no -c7

aredis tinct . o: aredist inct . c9

$( GCC ) $ ( CFLAGS ) -c are d istinct .c10

main . o: main .c12

$( GCC ) $ ( CFLAGS ) -c main . c13

78 Intermediate C Programming

testall : test0 test1 test2 test3 test415

test0 : prog17

./ prog inputs / input0 > outputs / output018

diff e xpected / expected 0 outputs / output019

$( VALGRIND )= log0 ./ prog inputs / input0 > / dev / null20

test1 : prog22

./ prog inputs / input1 > outputs / output123

diff e xpected / expected 1 outputs / output124

$( VALGRIND )= log1 ./ prog inputs / input0 > / dev / null25

test2 : prog27

./ prog inputs / input2 > outputs / output228

diff e xpected / expected 2 outputs / output229

$( VALGRIND )= log2 ./ prog inputs / input0 > / dev / null30

test3 : prog32

./ prog inputs / input3 > outputs / output333

diff e xpected / expected 3 outputs / output334

$( VALGRIND )= log3 ./ prog inputs / input0 > / dev / null35

test4 : prog37

./ prog inputs / input4 > outputs / output438

diff e xpected / expected 4 outputs / output439

$( VALGRIND )= log4 ./ prog inputs / input0 > / dev / null40

clean :42

/ bin / rm -f *. o prog outputs /* log *43

The third line creates a symbol for the valgrind command. What is /dev/null in the

20th line? Running prog will produce the output “The elements are distinct.” or “The

elements are not distinct.” This output has already been stored in outputs/output0 in

line 18. Thus, in line 20, the output is discarded. In Linux, /dev/null is a special ﬁle that

simply discards everything put into this special ﬁle. It is the “black hole” in Linux. Line 20

says “ignore any output produced by running prog”. After making these changes we can

type

$ make testall

and a lot of commands will run. The outputs of valgrind are stored in the log ﬁles. We

can use the grep command to check whether any error has been detected by valgrind:

$ grep ERROR *log*

If the result is

ERROR SUMMARY: 0 errors from 0 contexts

valgrind has not detected any problems.

Even though valgrind is helpful identifying which lines cause problems, valgrind is

not perfect. Sometimes, a program has problems and valgrind fails to detect them. This

happens because valgrind itself has limitations. The limitations may occur when running

Writing and Testing Programs 79

certain system calls to talk directly to hardware. Please read the valgrind document for

more information on its limitations. For example, On x86 and amd64, there is no support

for 3DNow! instructions. ... Valgrind’s signal simulation is not as robust as it could be.

... Please understand that valgrind is another tool that can help, but not replace, good

software developers. In many cases, valgrind can detect memory problems. When valgrind

says that a program has no invalid memory accesses, it is still possible that the program

has problems not tested by the speciﬁc test cases. It is also possible that valgrind misses

an error because of its limitations. How can you prevent memory access errors? When you

write programs, be careful how the indexes are calculated. It is important to read your code

before testing it, because testing can only determine if a program is wrong.

Some programming languages, such as Java, check the index every time an array element

is read or written. If a wrong index is detected, an exception is thrown. This guarantees

that every invalid index is detected. However, checking indexes slows down the program.

C’s design principle is to do only what a program says and nothing more. This is a trade-oﬀ

in the designs of programming languages.

5.5 Test Coverage

Ideally, tests should check every line in the program. If some lines are not checked, it is

possible these lines contain mistakes. Checking every line means that every if statement

(and other conditions) is entered in some test cases. If a line is never tested, it is possible

that tests overlook the possible scenarios. It is also possible that the program has a defect

in its logic. Consider this example:

i f (( x < 0) && (x > 400) )1

vx = -vx ;3

This is part of a computer game of a bouncing ball in a court whose width is 400. The

intent of this code is to change the horizontal velocity vx when the ball hits the left wall as

x < 0 or hits the right wall as x > 400. What is wrong with this code? The intention is

i f (( x < 0) || (x > 400) )1

vx = -vx ;3

However, the mistake is using && (and) instead of || (or). Since it is impossible for x to be

smaller than zero and at the same time greater than 400,

vx = -vx ;3

is never executed.

A programmer needs to read code carefully and detect these types of errors. There are

tools that can help ﬁnd these types of problems, and one such tool examines test coverage.

It determines whether a particular line of code has been executed for a particular test input.

Here is an example:

/*1

coverage .c2

Previous Chapter

5. Writing and Testing Programs (2/4)

Next Chapter

5. Writing and Testing Programs (4/4)

Table of Contents for Intermediate C Programming

Table of Contents for
Intermediate C Programming