Chapter 13. Input and Output

Programs must be able to write data to files or to physical output devices such as displays or printers, and to read in data from files or input devices such as a keyboard. The C standard library provides numerous functions for these purposes. This chapter presents a survey of the part of the standard library that is devoted to input and output, which is often referred to as the I/O library. Further details on the individual functions can be found in Chapter 18. Apart from these library functions, the C language itself contains no input or output support at all.

All of the basic functions, macros, and types for input and output are declared in the header file stdio.h. The corresponding input and output function declarations for wide characters of the type wchar_t are contained in the header file wchar.h.

Tip

As alternatives to the traditional standard I/O functions, C11 introduces many new functions that permit more secure programming, in particular by checking the bounds of arrays when copying data. These alternative functions have names that end with the suffix _s (scanf_s(), for example).

Support for these “secure” functions is optional. The macro __STDC_LIB_EXT1__ is defined in implementations that provide them (see “Functions with Bounds-Checking”).
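A program can test for the bounds-checking functions at compile time. The following is a minimal sketch: it relies on the macro __STDC_LIB_EXT1__ named above, and on the C11 rule that a program requests the _s declarations by defining __STDC_WANT_LIB_EXT1__ as 1 before including a standard header. The helper name have_bounds_checking() is hypothetical and chosen only for illustration:

```c
#define __STDC_WANT_LIB_EXT1__ 1   // Request the _s declarations, if any.
#include <stdio.h>

// Return 1 if the implementation provides the bounds-checking
// functions, or 0 if it does not.
int have_bounds_checking( void )
{
#ifdef __STDC_LIB_EXT1__
  return 1;
#else
  return 0;
#endif
}
```

Code that calls a function such as gets_s() can be wrapped in the same #ifdef test, with a fallback to the traditional functions in the #else branch.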

Streams

From the point of view of a C program, all kinds of files and devices for input and output are uniformly represented as logical data streams regardless of whether the program reads or writes a character or byte at a time, or text lines, or data blocks of a given size. Streams in C can be either text or binary streams, although on some systems even this difference is nil. Opening a file by means of the function fopen() (or tmpfile()) creates a new stream, which then exists until closed by the fclose() function. C leaves file management up to the execution environment—in other words, the system on which the program runs. Thus, a stream is a channel by which data can flow from the execution environment to the program, or from the program to its environment. Devices, such as consoles, are addressed in the same way as files.

Every stream has a lock that the I/O library’s functions use for synchronization when several threads access the same stream. All stream I/O functions first obtain exclusive access to a stream before performing read or write operations, or querying and moving the stream’s file position indicator. Once the operation has been performed, the stream is released again for access by other threads. Exclusive stream access prevents “data races” and concurrent I/O operations. For more information about multithreaded programs, see Chapter 14.

Text Streams

A text stream transports the characters of a text that is divided into lines. A line of text consists of a sequence of characters ending in a newline character. A line of text can also be empty, meaning that it consists of a newline character only. The last line transported may or may not have to end with a newline character, depending on the implementation.

The internal representation of text in a C program is the same regardless of the system on which the program is running. Text input and output on a given system may involve removing, adding, or altering certain characters. For example, on systems that are not Unix-based, end-of-line indicators ordinarily have to be converted into newline characters when reading text files. On Windows systems, for instance, the end-of-line indicator is a sequence of two control characters, \r (carriage return) and \n (newline). Similarly, the control character ^Z (character code 26) in a text stream on Windows indicates the end of the stream.

As the programmer, you generally do not have to worry about the necessary adaptations, because they are performed automatically by the I/O functions in the standard library. However, if you want to be sure that an input function call yields exactly the same text that was written by a previous output function call, your text should contain only the newline and horizontal tab control characters, in addition to printable characters. Furthermore, the last line should end with a newline character, and no line should end with a space immediately before the newline character.

Binary Streams

A binary stream is a sequence of bytes that are transmitted without modification. That is, the I/O functions do not involve any interpretation of control characters when operating on binary streams. Data written to a file through a binary stream can always be read back unchanged on the same system. However, in certain implementations there may be extra zero-valued bytes appended at the end of the stream.

Binary streams are normally used to write binary data—for example, database records—without converting it to text. If a program reads the contents of a text file through a binary stream, then the text appears in the program in its stored form, with all the control characters used on the given system.

Tip

On common Unix systems, there is no difference between text streams and binary streams.

Files

A file represents a sequence of bytes. The fopen() function associates a file with a stream and initializes an object of the type FILE, which contains all the information necessary to control the stream. Such information includes a pointer to the buffer used; a file position indicator, which specifies a position to access in the file; and flags to indicate error and end-of-file conditions.

Each of the functions that open files—namely, fopen(), freopen(), and tmpfile()—returns a pointer to a FILE object for the stream associated with the file being opened. Once you have opened a file, you can call functions to transfer data and to manipulate the stream. Such functions have a pointer to a FILE object—commonly called a FILE pointer—as one of their arguments. The FILE pointer specifies the stream on which the operation is carried out.

The I/O library also contains functions that operate on the file system, and take the name of a file as one of their parameters. These functions do not require the file to be opened first. They include the following:

  • The remove() function deletes a file (or an empty directory). The string argument is the file’s name. If the file has more than one name, then remove() only deletes the specified name, not the file itself. The data may remain accessible in some other way, but not under the deleted filename.

  • The rename() function changes the name of a file (or directory). The function’s two string arguments are the old and new names, in that order. The remove() and rename() functions both have the return type int, and return zero on success, or a nonzero value on failure. The following statement changes the name of the file songs.dat to mysongs.dat:

    if ( rename( "songs.dat", "mysongs.dat" ) != 0 )
      fprintf( stderr, "Error renaming \"songs.dat\".\n" );

    Conditions that can cause the rename() function to fail include the following: no file exists with the old name; the program does not have the necessary access privileges; or the file is open. The rules for forming permissible filenames depend on the implementation.
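The remove() function can be checked in the same way as rename(). The following sketch creates an empty scratch file and then deletes it again. The function name create_and_remove() is hypothetical, and the sketch assumes that the caller passes the name of a file that may safely be overwritten:

```c
#include <stdio.h>

// Create an empty file with the given name, then delete it again.
// Return value: 0 on success, or -1 if any step fails.
int create_and_remove( const char *filename )
{
  FILE *fp = fopen( filename, "w" );   // Create (or truncate) the file.
  if ( fp == NULL )
    return -1;
  if ( fclose( fp ) != 0 )             // Close it before deleting it.
    return -1;
  if ( remove( filename ) != 0 )       // Nonzero means remove() failed.
  {
    fprintf( stderr, "Error removing \"%s\".\n", filename );
    return -1;
  }
  return 0;
}
```

The file is closed before remove() is called because the effect of removing an open file depends on the implementation.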

File Position

Like the elements of a char array, each character in an ordinary file has a definite position in the file. The file position indicator in the object representing the stream determines the position of the next character to be read or written.

When you open a file for reading or writing, the file position indicator points to the beginning of the file so that the next character accessed has the position 0. If you open the file in “append” mode, the file position indicator may point to the end of the file. Each read or write operation increases the indicator by the number of characters read from the file or written to the file. This behavior makes it simple to process the contents of a file sequentially. Random access within the file is achieved by using functions that change the file position indicator: fseek(), fsetpos(), and rewind(), which are discussed in detail in “Random File Access”.

Of course, not all files support changing access positions. Sequential I/O devices such as terminals and printers do not, for example.
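The way the file position indicator advances can be observed with ftell(). The following sketch writes a few characters to a temporary file, reads some of them back, and reports the resulting position. It uses a binary stream, for which ftell() returns the number of characters from the beginning of the file. The function name position_after_reading() and the sample text are hypothetical:

```c
#include <stdio.h>

// Write a short text to a temporary file, read back up to count
// characters, and return the resulting file position, or -1L on error.
long position_after_reading( int count )
{
  FILE *fp = tmpfile();                 // Binary stream, mode "wb+".
  if ( fp == NULL )
    return -1L;

  long pos = -1L;
  if ( fputs( "abcdefgh", fp ) != EOF )
  {
    rewind( fp );                       // Back to position 0.
    for ( int i = 0; i < count; ++i )
      if ( getc( fp ) == EOF )          // Each read advances the indicator.
        break;
    pos = ftell( fp );
  }
  fclose( fp );
  return pos;
}
```

Reading three characters leaves the indicator at position 3. Asking for more characters than the file contains leaves it at the end of the file, position 8 in this case.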

Buffers

In working with files, it is generally not efficient to read or write individual characters. For this reason, a stream has a buffer in which it collects characters, which are transferred as a block to or from the file. Sometimes you don’t want buffering, however. For example, after an error has occurred, you might want to write data to a file as directly as possible.

Streams are buffered in one of three ways:

Fully buffered

The characters in the buffer are normally transferred only when the buffer is full.

Line-buffered

The characters in the buffer are normally transferred only when a newline character is written to the buffer, or when the buffer is full. A stream’s buffer is also written to the file when the program requests input through an unbuffered stream, or when an input request on a line-buffered stream causes characters to be read from the host environment.

Unbuffered

Characters are transferred as promptly as possible.

You can also explicitly transfer the characters in the stream’s output buffer to the associated file by calling the fflush() function. The buffer is also flushed when you close a stream, and normal program termination flushes the buffers of all the program’s streams.

When you open an ordinary file by calling fopen(), the new stream is fully buffered. Opening interactive devices is different, however: such device files are associated on opening with a line-buffered stream. After you have opened a file, and before you perform the first input or output operation on it, you can change the buffering mode using the setbuf() or setvbuf() function.
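As a minimal sketch of changing the buffering mode, the following function switches a freshly opened stream to unbuffered operation by calling setvbuf() with the mode _IONBF. The wrapper name make_unbuffered() is hypothetical:

```c
#include <stdio.h>

// Switch an open stream to unbuffered mode. This must be done before
// the first read or write operation on the stream.
// Return value: 0 on success, nonzero on failure.
int make_unbuffered( FILE *fp )
{
  return setvbuf( fp, NULL, _IONBF, 0 );
}
```

The other two modes are selected with _IOFBF (fully buffered) and _IOLBF (line-buffered). With those modes, the second and fourth arguments can supply a buffer and its size, or a null pointer can be passed to let the library allocate the buffer.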

The Standard Streams

Three standard text streams are available to every C program on starting. These streams do not have to be explicitly opened. Table 13-1 lists them by the names of their respective FILE pointers.

Table 13-1. The standard streams

FILE pointer   Common name             Buffering mode
stdin          Standard input          Line-buffered
stdout         Standard output         Line-buffered
stderr         Standard error output   Unbuffered

stdin is usually associated with the keyboard, and stdout and stderr with the console display. These associations can be modified by redirection. Redirection is performed either by the program calling the freopen() function, or by the environment in which the program is executed.

Opening and Closing Files

To write to a new file or modify the contents of an existing file, you must first open the file. When you open a file, you must specify an access mode indicating whether you plan to read, to write, or some combination of the two. When you have finished using a file, close it to release resources.

Opening a File

The standard library provides the function fopen() to open a file (for special cases, the freopen() and tmpfile() functions also open files):

FILE *fopen( const char * restrict filename,
             const char * restrict mode );

This function opens the file whose name is specified by the string filename. The filename may contain a directory part, and must not be longer than the maximum length specified by the value of the macro FILENAME_MAX. The second argument, mode, is also a string, and specifies the access mode. The possible access modes are described in the next section. The fopen() function associates the file with a new stream.

FILE *freopen( const char * restrict filename,
               const char * restrict mode,
               FILE * restrict stream );

This function redirects a stream. Like fopen(), freopen() opens the specified file in the specified mode. However, rather than creating a new stream, freopen() associates the file with the existing stream specified by the third argument. The file previously associated with that stream is closed. The most common use of freopen() is to redirect the standard streams, stdin, stdout, and stderr.

FILE *tmpfile( void );

The tmpfile() function creates a new temporary file whose name is distinct from all other existing files, and opens the file for binary writing and reading (as if the mode string "wb+" were used in an fopen() call). If the program is terminated normally, the file is automatically deleted.

All three file-opening functions, fopen(), freopen() and tmpfile(), return a pointer to the opened stream if successful, or a null pointer to indicate failure.

Tip

If a file is opened for writing, the program should have exclusive access to the file to prevent simultaneous write operations by other programs. The traditional standard functions do not guarantee exclusive file access, but three of the new “secure” functions introduced by C11, fopen_s(), freopen_s() and tmpfile_s(), do provide exclusive access, if the operating system supports it.

Access Modes

The access mode specified by the second argument to fopen() or freopen() determines what input and output operations the new stream permits. The permissible values of the mode string are restricted. The first character in the mode string is always r for “read,” w for “write,” or a for “append,” and in the simplest case, the string contains just that one character. However, the mode string may also contain one or both of the characters + and b (in either order: +b has the same effect as b+).

A plus sign (+) in the mode string means that both read and write operations are permitted. However, a program must not alternate immediately between reading and writing. After a write operation, you must call the fflush() function or a positioning function (fseek(), fsetpos(), or rewind()) before performing a read operation. After a read operation, you must call a positioning function before performing a write operation.
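The rule about switching between writing and reading can be sketched as follows. The function writes text to a temporary file, which is opened for update, and then calls rewind() before reading the first line back. The function name write_then_read() is hypothetical:

```c
#include <stdio.h>

// Write text to a temporary file opened for update, then read the
// first line back into out (capacity n).
// Return value: 0 on success, or -1 on error.
int write_then_read( const char *text, char *out, int n )
{
  FILE *fp = tmpfile();                  // Opened as if with "wb+".
  if ( fp == NULL )
    return -1;

  int result = -1;
  if ( fputs( text, fp ) != EOF )
  {
    rewind( fp );                        // Required between write and read.
    if ( fgets( out, n, fp ) != NULL )
      result = 0;
  }
  fclose( fp );
  return result;
}
```

Without the rewind() call, the read operation would follow the write operation directly, which the standard does not permit on an update stream.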

A b in the mode string causes the file to be opened in binary mode—that is, the new stream associated with the file is a binary stream. If there is no b in the mode string, the new stream is a text stream.

If the mode string begins with r, the file must already exist in the file system. If the mode string begins with w, then the file will be created if it does not already exist. If it does exist, its previous contents will be lost, because the fopen() function truncates it to zero length in “write” mode.

C11 introduces the capability to open a file in exclusive write mode, if the operating system supports it. To specify exclusive access, you can use the suffix x in a mode string that begins with w, such as wx or w+bx. The file-opening function then fails—returning a null pointer—if the file already exists or cannot be created. Otherwise, the file is created and opened for exclusive access.
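As a brief sketch, assuming a standard library that implements the C11 x suffix, the following function creates a file only if no file with that name exists yet. The function name create_new_file() is hypothetical:

```c
#include <stdio.h>

// Create a new file for writing, but refuse to overwrite an existing one.
// Return value: the open stream, or NULL if the file already exists
// or cannot be created.
FILE *create_new_file( const char *filename )
{
  return fopen( filename, "wx" );        // C11 exclusive-create mode.
}
```

A second call with the same filename returns a null pointer, so the contents written after the first call are not truncated by accident.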

A mode string beginning with a (for append) also causes the file to be created if it does not already exist. If the file does exist, however, its contents are preserved, because all write operations are automatically performed at the end of the file. Here is a brief example:

#include <stdio.h>
#include <stdbool.h>
_Bool isReadWriteable( const char *filename )
{
  FILE *fp = fopen( filename, "r+" );  // Open a file to read and write.

  if ( fp != NULL )                    // Did fopen() succeed?
  {
    fclose(fp);              // Yes: close the file; no error handling.
    return true;
  }
  else                       // No.
    return false;
}

This example also illustrates how to close a file using the fclose() function.
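The append mode described earlier in this section can be sketched in a similar way. The following function adds one line to the end of a file, creating the file if it does not exist. The function name append_line() is hypothetical:

```c
#include <stdio.h>

// Append one line of text to the named file, creating it if necessary.
// Return value: 0 on success, or -1 on error.
int append_line( const char *filename, const char *text )
{
  FILE *fp = fopen( filename, "a" );     // Writes always go to the end.
  if ( fp == NULL )
    return -1;

  int result = 0;
  if ( fputs( text, fp ) == EOF || fputc( '\n', fp ) == EOF )
    result = -1;
  if ( fclose( fp ) != 0 )               // fclose() flushes the buffer.
    result = -1;
  return result;
}
```

Calling the function twice with the same filename leaves both lines in the file, in the order in which they were written.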

Closing a File

To close a file, use the fclose() function. The prototype of this function is:

int fclose( FILE *fp );

The function flushes any data still pending in the buffer to the file, closes the file, and releases any memory used for the stream’s input and output buffers. The fclose() function returns zero on success, or EOF if an error occurs.

When the program exits, all open files are closed automatically. Nonetheless, you should always close any file that you have finished processing. Otherwise, you risk losing data in the case of an abnormal program termination. Furthermore, there is a limit to the number of files that a program may have open at one time; the number of open files allowed is greater than or equal to the value of the constant FOPEN_MAX.

Reading and Writing

This section describes the functions that actually retrieve data from or send data to a stream. First, there is another detail to consider: an open stream can be used either for byte characters or for wide characters.

Byte-Oriented and Wide-Oriented Streams

In addition to the type char, C also provides a type for wide characters, named wchar_t. This type is wide enough to represent any character in the extended character sets that the implementation supports (see “Wide Characters and Multibyte Characters”). Accordingly, there are two complete sets of functions for input and output of characters and strings: the byte-character I/O functions and the wide-character I/O functions. Functions in the second set operate on characters with the type wchar_t. Each stream has an orientation that determines which set of functions is appropriate.

Immediately after you open a file, the orientation of the stream associated with it is undetermined. If the first file access is performed by a byte-character I/O function, then from that point on the stream is byte-oriented. If the first access is by a wide-character function, then the stream is wide-oriented. The orientation of the standard streams, stdin, stdout, and stderr, is likewise undetermined when the program starts.

You can call the function fwide() at any time to ascertain a stream’s orientation. Before the first I/O operation, fwide() can also set a new stream’s orientation. To change a stream’s orientation once it has been determined, you must first reopen the stream by calling the freopen() function.
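The following sketch queries a stream's orientation. Passing 0 as the second argument makes fwide() report the current orientation without changing it: the return value is positive for a wide-oriented stream, negative for a byte-oriented stream, and zero if the orientation is still undetermined. The function name orientation_name() is hypothetical:

```c
#include <stdio.h>
#include <wchar.h>

// Report the orientation of a stream without changing it.
const char *orientation_name( FILE *fp )
{
  int mode = fwide( fp, 0 );             // 0 means: query only.
  if ( mode > 0 )
    return "wide";
  else if ( mode < 0 )
    return "byte";
  else
    return "undetermined";
}
```

To set the orientation of a new stream, pass a positive value (wide) or a negative value (byte) as the second argument instead of 0.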

The wide characters written to a wide-oriented stream are stored as multibyte characters in the file associated with the stream. The read and write functions implicitly perform the necessary conversion between wide characters of type wchar_t and the multibyte character encoding. This conversion may be stateful. In other words, the value of a given byte in the multibyte encoding may depend on control characters that precede it, which alter the shift state or conversion state of the character sequence. For this reason, each wide-oriented stream has an associated object with the type mbstate_t, which stores the current multibyte conversion state. The functions fgetpos() and fsetpos(), which get and set the value of the file position indicator, also save and restore the conversion state for the given file position.

Error Handling

The I/O functions can use a number of mechanisms to indicate to the caller when they incur errors, including return values, error and EOF flags in the FILE object, and the global error variable errno. To find out which mechanisms are used by a given function, see the individual function descriptions in Chapter 18. This section describes the I/O error-handling mechanisms in general.

Return values and status flags

The I/O functions generally indicate any errors that occur by their return value. In addition, they also set an error flag in the FILE object that controls the stream if an error in reading or writing occurs. To query this flag, you can call the ferror() function. Here is an example:

(void)fputc( '*', fp );         // Write an asterisk to the stream fp.
if ( ferror(fp) )
  fprintf( stderr, "Error writing.\n" );

Furthermore, read functions set the stream’s EOF flag on reaching the end of the file. You can query this flag by calling the feof() function. A number of read functions return the value of the macro EOF if you attempt to read beyond the last character in the file. (Wide-character functions return the value WEOF.) A return value of EOF or WEOF can also indicate an error, however. To distinguish between the two cases, you must call ferror() or feof(), as the following example illustrates:

int i, c;
char buffer[1024];
/* ... Open a file to read using the stream fp ... */
i = 0;
while ( i < 1024 &&               // While there is space in the buffer
        (c = fgetc( fp )) != EOF) // ... and the stream can deliver
  buffer[i++] = (char)c;          // characters.
if ( i < 1024 && ! feof(fp) )
  fprintf( stderr, "Error reading.\n" );

The if statement in this example prints an error message if fgetc() returns EOF and the EOF flag is not set.

The error variable errno

Several standard library functions support more specific error handling by setting the global error variable errno to a value that indicates the kind of error that has occurred. Stream handling functions that set the errno variable include ftell(), fgetpos(), and fsetpos(). Depending on the implementation, other functions may also set the errno variable. errno is declared in the header errno.h with the type int (see Chapter 16). errno.h also defines macros for the possible values of errno.

The perror() function prints a system-specific error message for the current value of errno to the stderr stream:

long pos = ftell(fp);      // Get the current file position.
if ( pos < 0L )            // ftell() returns -1L if an error occurs.
  perror( "ftell()" );

The perror() function prints its string argument followed by a colon, the error message, and a newline character. The error message is the same as the string that strerror() would return if called with the given value of errno as its argument. In the previous example, the perror() function as implemented in the GCC compiler prints the following output to indicate an invalid FILE pointer argument:

ftell(): Bad file descriptor

The error variable errno is also set by functions that convert between wide characters and multibyte characters in reading from or writing to a wide-oriented stream. Such conversions are performed internally by calls to the wcrtomb() and mbrtowc() functions. When these functions are unable to supply a valid conversion, they return the value of -1 cast to size_t, and set errno to the value of EILSEQ (for “illegal sequence”).

Unformatted I/O

The standard library provides functions to read and write unformatted data in the form of individual characters, strings, or blocks of any given size. This section describes these functions, listing the prototypes of both the byte-character and the wide-character functions. The type wint_t is an integer type capable of representing at least all the values in the range of wchar_t, and the additional value WEOF. The macro WEOF has the type wint_t and a value that is distinct from all the character codes in the extended character set.

Tip

Unlike EOF, the value of WEOF is not necessarily negative.

Reading characters

Use the following functions to read characters from a file:

int fgetc( FILE *fp );
int getc( FILE *fp );
int getchar( void );
wint_t fgetwc( FILE *fp );
wint_t getwc( FILE *fp );
wint_t getwchar( void );

The fgetc() function reads a character from the input stream referenced by fp. The return value is the character read, or EOF if the end of the file was reached or an error occurred. The macro getc() has the same effect as the function fgetc(). The macro is commonly used because it is faster than a function call. However, if the argument fp is an expression with side effects (see Chapter 5), then you should use the function instead because a macro may evaluate its argument more than once. The macro getchar() reads a character from standard input. It is equivalent to getc(stdin).

fgetwc(), getwc(), and getwchar() are the corresponding functions and macros for wide-oriented streams. These functions set the global variable errno to the value EILSEQ if an error occurs in converting a multibyte character to a wide character.

Putting a character back

Use one of the following functions to push a character back into the stream from which it came:

int ungetc( int c, FILE *fp );
wint_t ungetwc( wint_t c, FILE *fp );

ungetc() and ungetwc() push the last character read, c, back onto the input stream referenced by fp. Subsequent read operations then read the characters put back, in LIFO (last in, first out) order—that is, the last character put back is the first one to be read. You can always put back at least one character, but repeated attempts might or might not succeed. The functions return EOF (or WEOF) on failure, or the character pushed onto the stream on success.
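A common use of ungetc() is to look at the next character without consuming it. The following is a minimal sketch of this idea. The function name peek_char() is hypothetical:

```c
#include <stdio.h>

// Return the next character in the stream without consuming it,
// or EOF if no character can be read.
int peek_char( FILE *fp )
{
  int c = getc( fp );
  if ( c != EOF )
    c = ungetc( c, fp );      // Push it back: c on success, EOF on failure.
  return c;
}
```

Because only one character is pushed back at a time, the sketch stays within the single character of pushback that is always available.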

Writing characters

The following functions allow you to write individual characters to a stream:

int fputc( int c, FILE *fp );
int putc( int c, FILE *fp );
int putchar( int c );
wint_t fputwc( wchar_t wc, FILE *fp );
wint_t putwc( wchar_t wc, FILE *fp );
wint_t putwchar( wchar_t wc );

The function fputc() writes the character value of the argument c to the output stream referenced by fp. The return value is the character written, or EOF if an error occurred. The macro putc() has the same effect as the function fputc(). If either of its arguments is an expression with side effects (see Chapter 5), then you should use the function instead because a macro might evaluate its arguments more than once. The macro putchar() writes the specified character to the standard output stream.

fputwc(), putwc(), and putwchar() are the corresponding functions and macros for wide-oriented streams. These functions set the global variable errno to the value EILSEQ if an error occurs in converting the wide character to a multibyte character.

The following example copies the contents of a file opened for reading, referenced by fpIn, to a file opened for writing, referenced by fpOut (both streams are byte-oriented):

_Bool error = 0;
int c;
rewind( fpIn );     // Set the file position indicator to the beginning
                    // of the file, and clear the error and EOF flags.
while (( c = getc( fpIn )) != EOF )  // Read one character at a time.
  if ( putc( c, fpOut ) == EOF )     // Write each character to the
  {                                  // output stream.
    error = 1; break;                // A write error.
  }
if ( ferror( fpIn ))                 // A read error.
  error = 1;

Reading strings

The following functions allow you to read a string from a stream:

char *fgets( char *buf, int n, FILE *fp );
wchar_t *fgetws( wchar_t *buf, int n, FILE *fp);
char *gets( char *buf);                     // Obsolete
char *gets_s(char *buf, size_t n);          // C11

The functions fgets() and fgetws() read up to n − 1 characters from the input stream referenced by fp into the buffer addressed by buf, appending a null character to terminate the string. If the functions encounter a newline character or the end of the file before they have read the maximum number of characters, then only the characters read up to that point are read into the buffer. The newline character '\n' (or, in a wide-oriented stream, L'\n') is also stored in the buffer if read.

gets() reads a line of text from standard input into the buffer addressed by buf. The newline character that ends the line is replaced by the null character that terminates the string in the buffer. fgets() is a preferable alternative to gets(), as gets() offers no way to limit the number of characters read. The C11 standard retires the function gets() and adds a further alternative to gets(), the new function gets_s(), in implementations that support bounds-checking interfaces.

All four functions return the value of their argument buf, or a null pointer if an error occurred, or if there were no more characters to be read before the end of the file.
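Because fgets() stores the newline character in the buffer, programs often strip it after reading. The following sketch wraps fgets() for that purpose. The function name read_line() is hypothetical:

```c
#include <stdio.h>
#include <string.h>

// Read one line into buf (capacity n) and strip the trailing newline.
// Return value: buf, or NULL at end of file or on error.
char *read_line( char *buf, int n, FILE *fp )
{
  if ( fgets( buf, n, fp ) == NULL )
    return NULL;
  buf[ strcspn( buf, "\n" ) ] = '\0';    // Cut the string at the newline, if any.
  return buf;
}
```

If a line is longer than n − 1 characters, fgets() stores no newline, and the rest of the line is delivered by the next call.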

Writing strings

Use the following functions to write a null-terminated string to a stream:

int fputs( const char *s, FILE *fp );
int puts( const char *s );
int fputws( const wchar_t *s, FILE *fp );

The three puts functions have some features in common as well as certain differences:

  • fputs() and fputws() write the strings to the output stream referenced by fp. The null character that terminates the string is not written to the output stream.

  • puts() writes the string s to the standard output stream, followed by a newline character. There is no wide-character function that corresponds to puts().

  • All three functions return EOF (not WEOF) if an error occurred, or a non-negative value to indicate success.

The function in the following example prints all the lines of a file that contain a specified string.

// Write to stdout all the lines containing the specified search
// string in the file opened for reading as fpIn.
// Return value: The number of lines containing the search string,
//               or -1 on error.
// ----------------------------------------------------------------
#include <stdio.h>
#include <string.h>
int searchFile( FILE *fpIn, const char *keyword )
{
  #define MAX_LINE 256
  char line[MAX_LINE] = "";
  int count = 0;

  if ( fpIn == NULL || keyword == NULL )
    return -1;
  else
    rewind( fpIn );

  while ( fgets( line, MAX_LINE, fpIn ) != NULL )
    if ( strstr( line, keyword ) != NULL )
    {
      ++count;
      fputs( line, stdout );
    }

  if ( !feof( fpIn ) )
    return -1;
  else
    return count;
}

Reading and writing blocks

The fread() function reads up to n objects whose size is size from the stream referenced by fp, and stores them in the array addressed by buffer:

size_t fread( void *buffer, size_t size, size_t n, FILE *fp );

The function’s return value is the number of objects transferred. A return value less than the argument n indicates that the end of the file was reached while reading, or that an error occurred.

The fwrite() function sends n objects whose size is size from the array addressed by buffer to the output stream referenced by fp:

size_t fwrite( const void *buffer, size_t size, size_t n, FILE *fp );

Again, the return value is the number of objects written. A return value less than the argument n indicates that an error occurred.

Because the fread() and fwrite() functions do not deal with characters or strings as such, there are no corresponding functions for wide-oriented streams. On systems that distinguish between text and binary streams, the fread() and fwrite() functions should be used only with binary streams.

The function in the following example assumes that records have been saved in the file records.dat by means of the fwrite() function. A key value of 0 indicates that a record has been marked as deleted. In copying records to a new file, the program skips over records whose key is 0:

// Copy records to a new file, filtering out those with the key 0.
// ---------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>

#define ARRAY_LEN 100       // Maximum number of records in the buffer.
// A structure type for the records:
typedef struct { long key;
                 char name[32];
                 /* ... other fields in the record ... */ } Record_t;

char inFile[ ]  = "records.dat",                // Filenames.
     outFile[ ] = "packed.dat";

// Terminate the program with an error message:
static inline void error_exit( int status, const char *error_msg )
{
  fputs( error_msg, stderr );
  exit( status );
}

int main()
{
  FILE *fpIn, *fpOut;
  Record_t record, *pArray;
  unsigned int i;

  if (( fpIn = fopen( inFile, "rb" )) == NULL )        // Open to read.
    error_exit( 1, "Error on opening input file." );

  else if (( fpOut = fopen( outFile, "wb" )) == NULL ) // Open to write.
    error_exit( 2, "Error on opening output file." );

  else                                            // Create the buffer.
   if ((pArray = malloc( ARRAY_LEN * sizeof(Record_t) )) == NULL )
    error_exit( 3, "Insufficient memory." );

  i = 0;                                // Read one record at a time:
  while ( fread( &record, sizeof(Record_t), 1, fpIn ) == 1 )
  {
    if ( record.key != 0L )             // If not marked as deleted ...
    {                                   // ... then copy the record:
       pArray[i++] = record;
       if ( i == ARRAY_LEN )                    // Buffer full?
       {                                        // Yes: write to file.
         if ( fwrite( pArray, sizeof(Record_t), i, fpOut) < i )
            break;
         i = 0;
       }
    }
  }
  if ( i > 0 && !ferror(fpOut) )        // Write the remaining records.
    fwrite( pArray, sizeof(Record_t), i, fpOut );

  if ( ferror(fpOut) )                               // Handle errors.
    error_exit( 4, "Error on writing to output file." );
  else if ( ferror(fpIn) )
    error_exit( 5, "Error on reading input file." );

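  free( pArray );                              // Release the buffer.
  fclose( fpIn );
  if ( fclose( fpOut ) != 0 )                  // Flushes buffered output.
    error_exit( 4, "Error on writing to output file." );
  return 0;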
}
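To try the program, you need an input file in the format it expects. The following is a minimal sketch, not part of the original program, that writes sample records with fwrite() and counts the records with a nonzero key that fread() returns. The function names write_records() and count_active() are hypothetical, and Record_t repeats the definition from the example without its elided fields:

```c
#include <stdio.h>

// Same layout as in the example above (illustrative fields only).
typedef struct { long key; char name[32]; } Record_t;

// Write n records to the binary stream fp; return the number written.
size_t write_records( FILE *fp, const Record_t *recs, size_t n )
{
  return fwrite( recs, sizeof(Record_t), n, fp );
}

// Read records from the current position to the end of the stream and
// count those not marked as deleted (key != 0).
size_t count_active( FILE *fp )
{
  Record_t rec;
  size_t count = 0;
  while ( fread( &rec, sizeof(Record_t), 1, fp ) == 1 )
    if ( rec.key != 0L )
      ++count;
  return count;
}
```

As in the main program, comparing the return value of fread() or fwrite() with the number of objects requested is the way to detect a short read or write.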

Formatted Output

C provides formatted data output by means of the printf() family of functions. This section illustrates commonly used formatting options with appropriate examples. A complete, tabular description of output formatting options is included in Part II; see the discussion of the printf() function in Chapter 18.

The printf() function family

The printf() function and its various related functions all share the same capabilities of formatting data output as specified by an argument called the format string. However, the various functions have different output destinations and ways of receiving the data intended for output. The printf() functions for byte-oriented streams are:

int printf( const char * restrict format, ... );

Writes to the standard output stream, stdout.

int fprintf( FILE * restrict fp, const char * restrict format, ... );

Writes to the output stream specified by fp. The printf() function can be considered to be a special case of fprintf().

int sprintf( char * restrict buf,
             const char * restrict format, ... );

Writes the formatted output to the char array addressed by buf, and appends a terminating null character.

int snprintf( char * restrict buf, size_t n,
              const char * restrict format, ... );

Like sprintf(), but never writes more than n bytes to the output buffer, including the terminating null character.

The ellipsis (...) in these function prototypes stands for more arguments, which are optional. Another subset of the printf() functions takes a pointer to an argument list, rather than accepting a variable number of arguments directly in the function call. The names of these functions begin with a v for “variable argument list”:

int vprintf( const char * restrict format, va_list argptr );
int vfprintf( FILE * restrict fp, const char * restrict format,
              va_list argptr );
int vsprintf( char * restrict buf, const char * restrict format,
              va_list argptr );
int vsnprintf( char * restrict buffer, size_t n,
               const char * restrict format, va_list argptr );

To use the variable argument list functions, you must include stdarg.h in addition to stdio.h.
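The v functions are mainly useful for writing your own printf()-style wrappers. The following sketch assumes a hypothetical helper named format_msg() that forwards its optional arguments to vsnprintf(). It relies on the fact that vsnprintf(), like snprintf(), returns the number of characters the complete output would have required, not counting the terminating null character:

```c
#include <stdarg.h>
#include <stdio.h>

// Format a message into buf (capacity n) in the style of snprintf().
// Returns 1 if the complete text fit, 0 if it was truncated or an
// encoding error occurred.
int format_msg( char *buf, size_t n, const char *format, ... )
{
  va_list args;
  va_start( args, format );
  int len = vsnprintf( buf, n, format, args );
  va_end( args );
  return len >= 0 && (size_t)len < n;
}
```

A call such as format_msg( buf, sizeof buf, "%d-%s", 12, "ab" ) stores 12-ab in buf. If the text does not fit, buf holds a truncated but still null-terminated string.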

There are counterparts to all of these functions for output to wide-oriented streams. The wide-character printf() functions have names containing wprintf instead of printf, as in vfwprintf() and swprintf(), for example. There is one exception: there is no snwprintf(). Instead, swprintf() corresponds to the function snprintf(), with a parameter for the maximum output length.

The C11 standard provides a new “secure” alternative to each of these functions. The names of these new functions end in the suffix _s (for example, fprintf_s()). The new functions test whether any pointer arguments they receive are null pointers.

The format string

One argument passed to every printf() function is a format string. This is a definition of the data output format, and contains some combination of ordinary characters and conversion specifications. Each conversion specification defines how the function should convert and format one of the optional arguments for output. The printf() function writes the format string to the output destination, replacing each conversion specification in the process with the formatted value of the corresponding optional argument.

A conversion specification begins with a percent sign % and ends with a letter, called the conversion specifier. (To include a percent sign in the output, there is a special conversion specification: %%. printf() converts this sequence into a single percent sign.)

Tip

The syntax of a conversion specification ends with the conversion specifier. Throughout the rest of this section, we use both these terms frequently in talking about the format strings used in printf() and scanf() function calls.

The conversion specifier determines the type of conversion to be performed, and must match the corresponding optional argument. Here is an example:

int score = 120;
char player[ ] = "Mary";
printf( "%s has %d points.\n", player, score );

The format string in this printf() call contains two conversion specifications: %s and %d. Accordingly, two optional arguments have been specified: a string, matching the conversion specifier s (for “string”), and an int, matching the conversion specifier d (for “decimal”). The function call in the example writes the following line to standard output:

Mary has 120 points.

All conversion specifications (with the exception of %%) have the following general format:

%[flags][field_width][.precision][length_modifier]specifier

The parts of this syntax that are indicated in square brackets are all optional, but any of them that you include must be placed in the order shown here. The permissible conversion specifications for each argument type are described in the sections that follow. Any conversion specification can include a field width. The precision does not apply to all conversion types, however, and its significance is different depending on the specifier.
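As an illustration of how the parts combine, here is a small sketch using snprintf(), so that the results can be inspected as strings. The function names format_signed() and format_left() are hypothetical. The first specification uses the flags + and 0, a field width of 8, and a precision of 3. The second uses the flag -, a field width of 6, and the length modifier l:

```c
#include <stdio.h>

// Format x with an explicit sign, zero padding, field width 8,
// and precision 3.
int format_signed( char *buf, size_t n, double x )
{
  return snprintf( buf, n, "%+08.3f", x );
}

// Format a long left-justified in a field of width 6, followed by
// a vertical bar that makes the padding visible.
int format_left( char *buf, size_t n, long value )
{
  return snprintf( buf, n, "%-6ld|", value );
}
```

With the argument 3.14159, format_signed() produces +003.142. With the argument 42L, format_left() produces 42 followed by four spaces and the bar.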

Field widths

The field width option is especially useful in formatting tabular output. If included, the field width must be a positive decimal integer (or an asterisk, as described momentarily). It specifies the minimum number of characters in the output of the corresponding data item. The default behavior is to position the converted data right-justified in the field, padding it with spaces to the left. If the flags include a minus sign (-), then the information is left-justified, and the excess field width padded with space characters to the right.

The following example first prints a line numbering the character positions to illustrate the effect of the field width option:

printf("1234567890123456\n");                // Character positions.
printf( "%-10s %s\n", "Player", "Score" );   // Table headers.
printf( "%-10s %4d\n", "John", 120 );        // Field widths: 10; 4.
printf( "%-10s %4d\n", "Mary", 77 );

These statements produce a little table:

1234567890123456
Player     Score
John        120
Mary         77

If the output conversion results in more characters than the specified width of the field, then the field is expanded as necessary to print the complete data output.

If a field is right-justified, it can be padded with leading zeros instead of spaces. To do so, include a 0 (that’s the digit zero) in the conversion specification’s flags. The following example prints a date in the format mm-dd-yyyy:

int month = 5, day = 1, year = 1987;
printf( "Date of birth: %02d-%02d-%04d\n", month, day, year );

This printf() call produces the following output:

Date of birth: 05-01-1987

You can also use a variable to specify the field width. To do so, insert an asterisk (*) as the field width in the conversion specification, and include an additional optional argument in the printf() call. This argument must have the type int, and must appear immediately before the argument to be converted for output. Here is an example:

char str[ ] = "Variable field width";
int width = 30;
printf( "%-*s!\n", width, str );

The printf statement in this example prints the string str at the left end of a field whose width is determined by the variable width. The results are as follows:

Variable field width          !

Notice the trailing spaces preceding the bang (!) character in the output. Those spaces are not present in the string used to initialize str[ ]. The spaces are generated by virtue of the fact that the printf statement specifies a 30-character width for the string.
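A typical use of a variable field width is a table whose first column should be just wide enough for the longest entry. The following sketch assumes two hypothetical helpers, longest() and format_row(). The width passed for the asterisk is computed at runtime from the data:

```c
#include <stdio.h>
#include <string.h>

// Return the length of the longest string in names[0..n-1].
int longest( const char *names[], int n )
{
  size_t max = 0;
  for ( int i = 0; i < n; ++i )
    if ( strlen( names[i] ) > max )
      max = strlen( names[i] );
  return (int)max;
}

// Write one table row into buf: the name left-justified in a field
// of the given width, followed by the score in a field of width 4.
int format_row( char *buf, size_t n, int width, const char *name, int score )
{
  return snprintf( buf, n, "%-*s %4d", width, name, score );
}
```

Because the width argument must have the type int, longest() converts the size_t result of strlen() explicitly before returning it.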

Printing characters and strings

The printf() conversion specifier for strings is s, as you have already seen in the previous examples. The specifier for individual characters is c (for char). They are summarized in Table 13-2.

Table 13-2. Conversion specifiers for printing characters and strings

Specifier   Argument types             Representation
c           int                        A single character
s           Pointer to any char type   The string addressed by the pointer argument

The following example prints a separator character between the elements in a list of team members:

char *team[ ] = { "Vivian", "Tim", "Frank", "Sally" };
char separator = ';';
for ( int i = 0;  i < sizeof(team)/sizeof(char *); ++i )
  printf( "%10s%c ", team[i], separator );
putchar( '\n' );

The argument represented by the specification %c can also have a narrower type than int, such as char. Integer promotion automatically converts such an argument to int. The printf() function then converts the int argument to unsigned char, and prints the corresponding character.

For string output, you can also specify the maximum number of characters of the string that may be printed. This is a special use of the precision option in the conversion specification, which consists of a dot followed by a decimal integer. Here is an example:

char msg[] = "Every solution breeds new problems.";
printf( "%.14s\n", msg );      // Precision: 14.
printf( "%20.14s\n", msg );    // Field width is 20; precision is 14.
printf( "%.8s\n", msg+6 );     // Print the string starting at the 7th
                               // character in msg, with precision 8.

These statements produce the following output:

Every solution
      Every solution
solution
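The precision can also be supplied as an int argument by writing an asterisk after the dot (%.*s), in the same way as a variable field width. This is handy for printing data that is delimited by a length rather than by a null character, because printf() reads no more bytes than the precision allows. A minimal sketch, with the hypothetical function name print_slice():

```c
#include <stdio.h>

// Print the first len bytes of data to fp, followed by a newline.
// The bytes need not be null-terminated, because the precision
// limits how many of them fprintf() reads.
int print_slice( FILE *fp, const char *data, int len )
{
  return fprintf( fp, "%.*s\n", len, data );
}
```

The return value is the number of characters written, including the newline.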

Printing integers

The printf() functions can convert integer values into decimal, octal, or hexadecimal notation. The conversion specifiers listed in Table 13-3 are provided for this purpose.

Table 13-3. Conversion specifiers for printing integers

Specifier   Argument types   Representation
d, i        int              Decimal
u           unsigned int     Decimal
o           unsigned int     Octal
x           unsigned int     Hexadecimal with lowercase a, b, c, d, e, f
X           unsigned int     Hexadecimal with uppercase A, B, C, D, E, F

The following example illustrates different conversions of the same integer value:

printf( "%4d %4o %4x %4X\n", 63, 63, 63, 63 );

This printf() call produces the following output:

  63   77   3f   3F

The specifiers u, o, x, and X interpret the corresponding argument as an unsigned integer. If the argument’s type is int and its value negative, the converted output is the positive number that corresponds to the argument’s bit pattern when interpreted as an unsigned int:

printf( "%d   %u   %X\n", -1, -1, -1 );

If int is 32 bits wide, this statement yields the following output:

-1   4294967295   FFFFFFFF

Because the arguments are subject to integer promotion, the same conversion specifiers can be used to format short and unsigned short arguments. For arguments with the type long or unsigned long, you must prefix the length modifier l (a lowercase L) to the d, i, u, o, x, and X specifiers. Similarly, the length modifier for arguments with the type long long or unsigned long long is ll (two lowercase Ls). Here is an example:

long bignumber = 100000L;
unsigned long long hugenumber = 100000ULL * 1000000ULL;
printf( "%ld   %llX\n", bignumber, hugenumber );

These statements produce the following output:

100000   174876E800
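Two further cases come up often in practice, although this section does not discuss them: C99 defines the length modifier z for arguments of the type size_t, and the header inttypes.h provides macros such as PRId64 that expand to the right conversion specification for the fixed-width integer types. The following sketch, with the hypothetical function name format_sizes(), uses both:

```c
#include <inttypes.h>
#include <stdio.h>

// Format a size_t value and a 64-bit integer portably.
int format_sizes( char *buf, size_t n, size_t count, int64_t total )
{
  return snprintf( buf, n, "%zu items, %" PRId64 " bytes", count, total );
}
```

The PRId64 macro is a string literal that the compiler concatenates with the adjacent parts of the format string, so the same source line works whether int64_t is defined as long or as long long.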

Printing floating-point numbers

Table 13-4 shows the printf() conversion specifiers to format floating-point numbers in various ways.

Table 13-4. Conversion specifiers for printing floating-point numbers

Specifier   Argument types   Representation
f           double           Decimal floating-point number
e, E        double           Exponential notation, decimal
g, G        double           Floating-point or exponential notation, whichever is shorter
a, A        double           Exponential notation, hexadecimal

The most commonly used specifiers are f and e (or E). The following example illustrates how they work:

double x = 12.34;
printf( "%f  %e  %E\n", x, x, x );

This printf() call generates the following output line:

12.340000  1.234000e+01  1.234000E+01

The e that appears in the exponential notation in the output is lowercase or uppercase, depending on whether you use e or E for the conversion specifier. Furthermore, as the example illustrates, the default output shows precision to six decimal places. The precision option in the conversion specification modifies this behavior:

double value = 8.765;
printf( "Value: %.2f\n", value );        // Precision is 2: output to
                                         // two decimal places.
printf( "Integer value:\n"
        " Rounded:     %5.0f\n"          // Field width 5; precision 0.
        " Truncated:   %5d\n", value, (int)value );

These printf() calls produce the following output:

Value: 8.77
Integer value:
 Rounded:         9
 Truncated:       8

As this example illustrates, printf() rounds floating-point numbers up or down in converting them for output. If you specify a precision of 0, the decimal point itself is suppressed. If you simply want to truncate the fractional part of the value, you can cast the floating-point number to an integer type.

The specifiers described can also be used with float arguments, because they are automatically promoted to double. To print arguments of type long double, however, you must insert the length modifier L before the conversion specifier, as in this example:

#include <math.h>
long double xxl = expl(1000);
printf( "e to the power of 1000 is %.2Le\n", xxl );
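The g specifier deserves a closer look, because its choice of notation follows a precise rule: with precision P (6 by default), it uses exponential notation if the decimal exponent of the value is less than -4 or greater than or equal to P, and decimal notation otherwise. It also removes trailing zeros. The following sketch, with the hypothetical function name format_g(), lets you inspect the results as strings:

```c
#include <stdio.h>

// Format x with the g specifier and the default precision of 6.
int format_g( char *buf, size_t n, double x )
{
  return snprintf( buf, n, "%g", x );
}
```

For example, 100000.0 is formatted as 100000, but 1000000.0 becomes 1e+06. Similarly, 0.0001 is formatted as 0.0001, but 0.00001 becomes 1e-05.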

Formatted Input

To read in data from a formatted source, C provides the scanf() family of functions. Like the printf() functions, the scanf() functions take as one of their arguments a format string that controls the conversion between the I/O format and the program’s internal data. This section highlights the differences between the uses of format strings and conversion specifications in the scanf() and printf() functions.

The scanf() function family

The various scanf() functions all process the characters in the input source in the same way. They differ in the kinds of data sources they read, however, and in the ways in which they receive their arguments. The scanf() functions for byte-oriented streams are:

int scanf( const char * restrict format, ... );

Reads from the standard input stream, stdin.

int fscanf( FILE * restrict fp, const char * restrict format, ... );

Reads from the input stream referenced by fp.

int sscanf( const char * restrict src,
            const char * restrict format, ... );

Reads from the char array addressed by src.

The ellipsis (…) stands for more arguments, which are optional. The optional arguments are pointers to the variables in which the scanf() function stores the results of its conversions.

Like the printf() functions, the scanf() family also includes variants that take a pointer to an argument list, rather than accepting a variable number of arguments directly in the function call. The names of these functions begin with the letter v for “variable argument list”: vscanf(), vfscanf(), and vsscanf(). To use the variable argument list functions, you must include stdarg.h in addition to stdio.h.

There are counterparts to all of these functions for reading wide-oriented streams. The names of the wide-character functions contain the sequence wscanf in place of scanf, as in wscanf() and vfwscanf(), for example.

The C11 standard provides a new “secure” alternative to each of the scanf() functions. The names of these new functions end in the suffix _s, as in fscanf_s(), for example. The new functions test whether the array bounds would be exceeded before reading a string into an array.

The format string

The format string for the scanf() functions contains both ordinary characters and conversion specifications that define how to interpret and convert the sequences of characters read. Most of the conversion specifiers for the scanf() functions are similar to those defined for the printf() functions. However, conversion specifications in the scanf() functions have no flags and no precision options. The general syntax of conversion specifications for the scanf() functions is as follows:

%[*][field_width][length_modifier]specifier

For each conversion specification in the format string, one or more characters are read from the input source and converted in accordance with the conversion specifier. The result is stored in the object addressed by the corresponding pointer argument. Here is an example:

int age = 0;
char name[64] = "";
printf( "Please enter your first name and your age:\n" );
scanf( "%s%d", name, &age );

Suppose that the user enters the following line when prompted:

Bob 27\n

The scanf() call writes the string Bob into the char array name, and the value 27 in the int variable age.

All conversion specifications, except those with the specifier c, the scanset specifier [ (described later in this section), or n, skip over leading whitespace characters. In the previous example, the user could type any number of space, tab, or newline characters before the first word, Bob, or between Bob and 27, without affecting the results.

The sequence of characters read for a given conversion specification ends when scanf() reads any whitespace character, or any character that cannot be interpreted under that conversion specification. Such a character is pushed back onto the input stream so that processing for the next conversion specification begins with that character. In the previous example, suppose the user enters this line:

Bob 27years\n

Then on reaching the character y, which cannot be part of a decimal numeral, scanf() stops reading characters for the conversion specification %d. After the function call, the characters years\n would remain in the input stream’s buffer.
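You can observe this behavior without a keyboard by applying sscanf() to a string. The following sketch assumes a hypothetical function parse_name_age(). It uses the n conversion specifier, which consumes no input but stores the number of characters read so far, to locate the unread remainder. The n specification does not count toward the return value:

```c
#include <stdio.h>

// Parse a name and an age from line. On success, stores in *rest a
// pointer to the first character that was not consumed and returns 2.
int parse_name_age( const char *line, char name[64], int *age,
                    const char **rest )
{
  int consumed = 0;
  int n = sscanf( line, "%63s%d%n", name, age, &consumed );
  if ( n == 2 )
    *rest = line + consumed;
  return n;
}
```

For the input line Bob 27years\n, the function stores Bob and 27, and *rest points to the string years\n.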

If, after skipping over any whitespace, scanf() doesn’t find a character that matches the current conversion specification, an error occurs and the scanf() function stops processing the input. We’ll show you how to detect such errors in a moment.

Often the format string in a scanf() function call contains only conversion specifications. If not, all other characters in the format string, except whitespace characters, must literally match characters in corresponding positions in the input source. Otherwise, the scanf() function quits processing and pushes the mismatched character back onto the input stream.
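Literal characters in the format string are convenient for input with fixed punctuation. As a minimal sketch, the hypothetical function parse_date() reads a date in the format mm-dd-yyyy, in which the hyphens in the format string must match hyphens in the input:

```c
#include <stdio.h>

// Parse a date of the form mm-dd-yyyy. Returns 1 if all three fields
// were converted, 0 otherwise.
int parse_date( const char *text, int *month, int *day, int *year )
{
  return sscanf( text, "%d-%d-%d", month, day, year ) == 3;
}
```

Given the string 05/01/1987, the function converts only the month: the slash does not match the hyphen in the format string, so sscanf() stops and returns 1.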

One or more consecutive whitespace characters in the format string matches any number of consecutive whitespace characters in the input stream. In other words, for any whitespace in the format string, scanf() reads past all whitespace characters in the data source up to the first non-whitespace character. Knowing this, what’s the matter with the following scanf() call?

scanf( "%s%d\n", name, &age );      // Problem?

Suppose that the user enters the following line:

Bob 27\n

In this case, scanf() doesn’t return after reading the newline character but instead continues reading more input—until a non-whitespace character comes along.

Sometimes you will want to read past any sequence of characters that matches a certain conversion specification without storing the result. You can achieve exactly this effect by inserting an asterisk (*) immediately after the percent sign (%) in the conversion specification. Do not include a pointer argument for a conversion specification with an asterisk.

The return value of a scanf() function is the number of data items successfully converted and stored. If everything goes well, the return value matches the number of conversion specifications, not counting any that contain an asterisk. The scanf() functions return the value of EOF if a read error occurs or they reach the end of the input source before converting any data items. Here is an example:

if ( scanf( "%s%d", name, &age ) < 2 )
  fprintf( stderr, "Bad input.\n" );
else
{  /* ... Test the values stored ... */  }
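To see the possible return values side by side, you can apply the same format string to different inputs with sscanf(). The function name scan_result() in this sketch is hypothetical:

```c
#include <stdio.h>

// Return the result of converting a name and an age from text.
int scan_result( const char *text )
{
  char name[64];
  int age;
  return sscanf( text, "%63s%d", name, &age );
}
```

The string Bob 27 yields the return value 2. The string Bob abc yields 1, because abc cannot be converted to an integer. An empty string yields EOF, because the input ends before any conversion takes place.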

Field widths

The field width is a positive decimal integer that specifies the maximum number of characters that scanf() reads for the given conversion specification. For string input, this item can be used to prevent buffer overflows:

char city[32];
printf( "Your city: ");
if ( scanf( "%31s", city ) < 1 )  // Never read in more than 31
                                  // characters!
  fprintf( stderr, "Error reading from standard input.\n" );
else
/* ... */

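Unlike printf(), which exceeds the specified field width whenever the output is longer than that number of characters, scanf() with the s conversion specifier never reads more characters than the number specified by the field width. Because scanf() also appends a terminating null character, the buffer must be at least one element larger than the field width, as in the example above (a field width of 31 for an array of 32 characters).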

Reading characters and strings

The conversion specifications %c and %1c read the next character in the input stream, even if it is a whitespace character. By specifying a field width, you can read that exact number of characters, including whitespace characters, as long as the end of the input stream does not intervene. When you read more than one character in this way, the corresponding pointer argument must point to a char array that is large enough to hold all the characters read. The scanf() function with the c conversion specifier does not append a terminating null character. Here is an example:

scanf( "%*5c" );

This scanf() call reads and discards the next five characters in the input source.

The conversion specification %s always reads just one word, as a whitespace character ends the sequence read. To read entire text lines, you can use the fgets() function.
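A common pattern is to read a whole line with fgets() and then, if necessary, analyze it with sscanf(). The following sketch assumes a hypothetical helper named read_line(), which also removes the newline character that fgets() stores at the end of the line:

```c
#include <stdio.h>
#include <string.h>

// Read one line from fp into buf (capacity n) and strip the trailing
// newline, if any. Returns buf, or NULL at end-of-file or on error.
char *read_line( char *buf, int n, FILE *fp )
{
  if ( fgets( buf, n, fp ) == NULL )
    return NULL;
  buf[ strcspn( buf, "\n" ) ] = '\0';
  return buf;
}
```

Unlike %s, this approach keeps the spaces in an entry such as New York City. A line longer than n-1 characters is returned in pieces by successive calls.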

The following example reads the contents of a text file word by word. Here we assume that the file pointer fp is associated with a text file that has been opened for reading:

char word[128];
while ( fscanf( fp, "%127s", word ) == 1 )
{
  /* ... process the word read ... */
}

In addition to the conversion specifier s, you can also read strings using the “scanset” specifier, which consists of an unordered set of characters between square brackets ([scanset]). The scanf() function then reads all characters, and saves them as a string (with a terminating null character), until it reaches a character that does not match any of those in the scanset. Here is an example:

char strNumber[32];
scanf( "%[0123456789]", strNumber );

If the user enters 345X67, then scanf() stores the string 345\0 in the array strNumber. The character X and all subsequent characters remain in the input buffer.

To invert the scanset—that is, to match all characters except those between the square brackets—insert a caret (^) immediately after the opening bracket. The following scanf() call reads all characters, including whitespace, up to a punctuation character that terminates a sentence, and then reads the punctuation character itself:

char ch, sentence[512];
scanf( "%511[^.!?]%c", sentence, &ch );

The following scanf() call can be used to read and discard all characters up to the end of the current line:

scanf( "%*[^\n]%*c" );
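One caveat: a scanset conversion must match at least one character. If the next character in the input is already a newline, the specification %*[^\n] fails, and scanf() returns without processing %*c, so the newline stays in the input stream. A simple loop with getc() avoids this special case. The function name discard_line() in the following sketch is hypothetical:

```c
#include <stdio.h>

// Read and discard the rest of the current line, including the
// newline character. Returns the last value read: '\n' or EOF.
int discard_line( FILE *fp )
{
  int c;
  while ( ( c = getc( fp ) ) != EOF && c != '\n' )
    ;
  return c;
}
```

Calling discard_line( stdin ) after a failed scanf() call is one way to remove the rest of an invalid input line before prompting the user again.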

Reading integers

Like the printf() functions, the scanf() functions offer the following conversion specifiers for integers: d, i, u, o, x, and X. These allow you to read and convert decimal, octal, and hexadecimal notation to int or unsigned int variables. Here is an example:

// Read a non-negative decimal integer:
unsigned int value = 0;
if ( scanf( "%u", &value ) < 1 )
  fprintf( stderr, "Unable to read an integer.\n" );
else
  /* ... */

For the specifier i in the scanf() functions, the base of the numeral read is not predefined. Instead, it is determined by the prefix of the numeric character sequence read, in the same way as for integer constants in C source code (see “Integer Constants”). If the character sequence does not begin with a zero, then it is interpreted as a decimal numeral. If it does begin with a zero and the second character is not x or X, then the sequence is interpreted as an octal numeral. A sequence that begins with 0x or 0X is read as a hexadecimal numeral.
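The following sketch illustrates the effect of the prefix. The hypothetical function parse_int_any_base() applies the i specifier to a string by means of sscanf():

```c
#include <stdio.h>

// Convert text with the i specifier; returns 1 and stores the value
// in *value on success, or returns 0 if no numeral was found.
int parse_int_any_base( const char *text, int *value )
{
  return sscanf( text, "%i", value ) == 1;
}
```

The strings 42, 052, and 0x2A all produce the value 42. If leading zeros in user input are meant to be decimal, as in 052 for fifty-two, use the d specifier instead.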

To assign the integer read to a short, char, long, or long long variable (or to a variable of a corresponding unsigned type), you must insert a length modifier before the conversion specifier: h for short, hh for char, l for long, or ll for long long. In the following example, the FILE pointer fp refers to a file opened for reading:

unsigned long position = 0;
if (fscanf( fp, "%lX", &position) < 1 )  // Read a hexadecimal integer.
  /* ... Handle error: unable to read a numeral ... */

Reading floating-point numbers

To process floating-point numerals, the scanf() functions use the same conversion specifiers as printf(): f, e, E, g, and G. Furthermore, C99 has added the specifiers a and A. All of these specifiers interpret the character sequence read in the same way. The character sequences that can be interpreted as floating-point numerals are the same as the valid floating-point constants in C; see “Floating-Point Constants”. scanf() can also convert integer numerals and store them in floating-point variables.

All of these specifiers convert the numeral read into a floating-point value with the type float. If you want to convert and store the value read as a variable of type double or long double, you must insert a length modifier: either l (a lowercase L) for double, or L for long double. Here is an example:

float x = 0.0F;
double xx = 0.0;
// Read in two floating-point numbers; convert one to float and the
// other to double:
if ( scanf( "%f %lf", &x, &xx ) < 2 )
  /* ... */

If this scanf() call receives the input sequence 12.3 7\n, then it stores the value 12.3 in the float variable x, and the value 7.0 in the double variable xx.

Random File Access

Random file access refers to the ability to read or modify information directly at any given position in a file. You do this by getting and setting a file position indicator, which represents the current access position in the file associated with a given stream.

Obtaining the Current File Position

The following functions return the current file access position. Use one of these functions when you need to note a position in the file to return to it later:

long ftell( FILE *fp );

ftell() returns the file position of the stream specified by fp. For a binary stream, this is the same as the number of characters in the file before this given position—that is, the offset of the current character from the beginning of the file. ftell() returns -1 if an error occurs.

int fgetpos( FILE * restrict fp, fpos_t * restrict ppos );

fgetpos() writes the file position indicator for the stream designated by fp to an object of type fpos_t, addressed by ppos. If fp is a wide-oriented stream, then the indicator saved by fgetpos() also includes the stream’s current conversion state (see “Byte-Oriented and Wide-Oriented Streams”). fgetpos() returns a nonzero value to indicate that an error occurred. A return value of zero indicates success.

The following example records the positions of all lines in the text file messages.txt that begin with the character #:

#define ARRAY_LEN 1000
long arrPos[ARRAY_LEN] = { 0L };
FILE *fp = fopen( "messages.txt", "r" );
if ( fp != NULL)
{
  int i = 0, c1 = '\n', c2;
  while ( i < ARRAY_LEN  && ( c2 = getc(fp) ) != EOF )
  {
    if ( c1 == '\n'  &&  c2 == '#' )
      arrPos[i++] = ftell( fp ) - 1;
    c1 = c2;
  }
  /* ... */
}

Setting the File Access Position

The following functions modify the file position indicator:

int fsetpos( FILE *fp, const fpos_t *ppos );

Sets both the file position indicator and the conversion state to the values stored in the object referenced by ppos. These values must have been obtained by a call to the fgetpos() function. If successful, fsetpos() returns 0 and clears the stream’s EOF flag. A nonzero return value indicates an error.

int fseek( FILE *fp, long offset, int origin );

Sets the file position indicator to a position specified by the value of offset and by a reference point indicated by the origin argument. The offset argument indicates a position relative to one of three possible reference points, which are identified by macro values. Table 13-5 lists these macros, as well as the numeric values that were used for origin before ANSI C defined them. The value of offset can be negative. The resulting file position must be greater than or equal to zero, however.

Table 13-5. The origin parameter in fseek()

Macro name   Traditional value of origin   Offset is relative to
SEEK_SET     0                             The beginning of the file
SEEK_CUR     1                             The current file position
SEEK_END     2                             The end of the file

When working with text streams—on systems that distinguish between text and binary streams—you should always use a value obtained by calling the ftell() function for the offset argument, and let origin have the value SEEK_SET. The function pairs ftell()–fseek() and fgetpos()–fsetpos() are not mutually compatible, because the fpos_t object used by fgetpos() and fsetpos() to indicate a file position may not have an arithmetic type.
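A frequent use of SEEK_END is to determine the size of a file that has been opened as a binary stream. The following is a sketch under stated assumptions rather than a fully portable technique: the C standard does not require binary streams to support SEEK_END meaningfully, although common systems do, and the result must fit in a long. The function name file_size() is hypothetical:

```c
#include <stdio.h>

// Return the size in bytes of the file associated with the binary
// stream fp, or -1L on error. The original position is restored.
long file_size( FILE *fp )
{
  long saved = ftell( fp );
  if ( saved == -1L || fseek( fp, 0L, SEEK_END ) != 0 )
    return -1L;
  long size = ftell( fp );
  if ( fseek( fp, saved, SEEK_SET ) != 0 )
    return -1L;
  return size;
}
```

The function notes the current position with ftell() first, so that the caller can continue reading or writing where it left off.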

If successful, fseek() clears the stream’s EOF flag and returns zero. A nonzero return value indicates an error. rewind() sets the file position indicator to the beginning of the file and clears the stream’s EOF and error flags:

void rewind( FILE *fp );

Except for the error flag, the call rewind(fp) is equivalent to:

(void)fseek(fp, 0L, SEEK_SET )

If the file has been opened for reading and writing, you can perform either a read or a write operation after a successful call to fseek(), fsetpos(), or rewind().
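The functions fgetpos() and fsetpos() are typically used as a pair to remember a position and return to it later. As a minimal sketch, the hypothetical function peek_char() looks at the next character without changing the file position. For a single character, ungetc() would also do, but the same pattern works for looking ahead by any amount:

```c
#include <stdio.h>

// Read the next character from fp without changing the file position.
// Returns the character, or EOF at end-of-file or on error.
int peek_char( FILE *fp )
{
  fpos_t pos;
  if ( fgetpos( fp, &pos ) != 0 )
    return EOF;
  int c = getc( fp );
  if ( fsetpos( fp, &pos ) != 0 )
    return EOF;
  return c;
}
```

Because a successful fsetpos() call clears the stream's EOF flag, a caller that needs to distinguish the end of the file from an error should test the return value of peek_char() rather than call feof() afterward.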

The following example uses an index table to store the positions of records in the file. This approach permits direct access to a record that needs to be updated:

// setNewName(): Finds a keyword in an index table
// and updates the corresponding record in the file.
// The file containing the records must be opened in
// "update mode"; i.e., with the mode string "r+b".
// Arguments: - A FILE pointer to the open data file;
//            - The key;
//            - The new name.
// Return value: A pointer to the updated record,
//               or NULL if no such record was found.
// ---------------------------------------------------------------
#include <stdio.h>
#include <string.h>
#include "Record.h"  // Defines the types Record_t, IndexEntry_t:
                     // typedef struct { long key; char name[32];
                     //                  /* ... */ } Record_t;
                     // typedef struct { long key, pos; } IndexEntry_t;

extern IndexEntry_t indexTab[];     // The index table.
extern int indexLen;                 // The number of table entries.

Record_t *setNewName( FILE *fp, long key, const char *newname )
{
  static Record_t record;
  int i;
  for ( i = 0; i < indexLen; ++i )
  {
    if ( key == indexTab[i].key )
      break;                         // Found the specified key.
  }
  if ( i == indexLen )
    return NULL;                     // No match found.
  // Set the file position to the record:
  if (fseek( fp, indexTab[i].pos, SEEK_SET ) != 0 )
    return NULL;                     // Positioning failed.
  // Read the record:
  if ( fread( &record, sizeof(Record_t), 1, fp ) != 1 )
    return NULL;                     // Error on reading.

  if ( key != record.key )           // Test the key.
    return NULL;
  else
  {                                  // Update the record:
    size_t size = sizeof(record.name);
    strncpy( record.name, newname, size-1 );
    record.name[size-1] = '\0';

    if ( fseek( fp, indexTab[i].pos, SEEK_SET ) != 0 )
      return NULL;                   // Error setting file position.
    if ( fwrite( &record, sizeof(Record_t), 1, fp ) != 1 )
      return NULL;                   // Error writing to file.

    return &record;
  }
}

The second fseek() call before the write operation could also be replaced with the following, moving the file position indicator back by one record relative to its current position:

if (fseek( fp, -(long)sizeof(Record_t), SEEK_CUR ) != 0 )
    return NULL;                     // Error setting file position.
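The example leaves open how the index table is filled. One possible approach, sketched here under the assumption that the file contains only Record_t objects written by fwrite(), is to read the file once and note the position of each record with ftell(). The function name build_index() is hypothetical, and the type definitions repeat those shown in the comments of the example, without the elided fields:

```c
#include <stdio.h>

typedef struct { long key; char name[32]; } Record_t;
typedef struct { long key, pos; } IndexEntry_t;

// Fill tab with the key and file position of each record in the
// binary stream fp, up to max entries. Returns the number of entries
// stored, or -1 if a position could not be determined.
int build_index( FILE *fp, IndexEntry_t tab[], int max )
{
  Record_t record;
  int n = 0;
  rewind( fp );
  while ( n < max )
  {
    long pos = ftell( fp );              // Position of the next record.
    if ( pos == -1L )
      return -1;
    if ( fread( &record, sizeof(Record_t), 1, fp ) != 1 )
      break;                             // End of file or read error.
    tab[n].key = record.key;
    tab[n].pos = pos;
    ++n;
  }
  return n;
}
```

The positions stored in this way are exactly the values that setNewName() passes to fseek() with the origin SEEK_SET.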