Broad Network


Accessing Files at the Disk in the C Computer Language

Part 42 of the Complete C Course


By: Chrysanthus Date Published: 22 Jun 2024

Introduction

The stdio library handles input to, and output from, a running program. Input comes from the keyboard, a disk file, or a network interface card, into the running (executing) program in memory. Output goes from the running program to the terminal (monitor), a disk file, or a network interface card. In this sense, input and output move as a sequence of bytes (8 bits each). A byte can hold one Western European character. This sequence of moving bytes is called a stream. Such a stream is controlled through an object of type FILE (similar to a struct object). A stream can be an input stream, or it can be an output stream.

Most streams pass through a buffer in memory before reaching their destinations. Different streams have their different buffers. A buffer is an array in memory. A buffer can be an output buffer or an input buffer. In many cases, a buffer has to get full, as the bytes of the stream are moving into it, before it is sent or flushed to its destination.

There are two types of streams: text streams and binary streams. A text stream is an ordered sequence of characters composed into lines. Each line consists of zero or more characters, plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined. In C source code, the new-line character is written '\n'. Characters within a line may have to be added, altered, or deleted on input and on output to a stream, to conform to differing conventions for representing text in the host environment. A file created with a text editor is an example of a text file.

Binary streams are used for sound files, image files, video files, and compiled executable program files. A binary stream is an ordered sequence of bytes. Data read in from a binary stream shall compare equal to the data that were earlier written out to that stream, under the same implementation. Such a stream may, however, have an implementation-defined number of null characters appended to the end of the stream. This part of the course deals with text files and not binary files.

Opening and Closing a File
Before a file on disk can be created or accessed, the file has to be opened. Opening a file means creating a stream (path) for the file and its buffer. This is done internally by C. All the programmer has to do is use the right function calls, from the stdio library, to open and close the file. Closing a file means releasing the stream and buffer, so that other programs can use such resources.

Opening a File
The synopsis to open a file is,

    #include <stdio.h>
    FILE * fopen(const char * restrict filename, const char * restrict mode);

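filename is a string giving the name of the file on disk. On most systems, a name without a directory part refers to the current working directory of the program. mode is explained below.

The fopen() function returns a pointer to the object (FILE object) controlling the stream. If the open operation fails, fopen() returns a null pointer. A null pointer compares equal to the integer constant 0 and to the macro NULL.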

Closing a File
Any file opened has to be closed. The synopsis to close a file is:

    #include <stdio.h>
    int fclose(FILE *stream);

The fclose() function returns zero if the stream was successfully closed, or EOF (see below) if any errors were detected. Whether or not the call succeeds, the stream is no longer associated with the file afterwards, so the pointer must not be used again.

mode

A mode is a string with special content. Thus:

Table 8.42.1 Modes and their Meanings for Text Files

    mode    Meaning
    wx      create new text file for writing (the open fails if the file already exists)
    w       truncate to zero length if file already exists, or create new text file for writing
    a       append; open or create text file for writing at end-of-file, and downwards
    r       open text file for reading
    r+      open text file for update (reading and/or writing anywhere within the file)


Writing a Sequence of Characters to File
The synopsis for writing a sequence of characters to an opened file is:

    #include <stdio.h>
    size_t fwrite(const void * restrict ptr, size_t size, size_t nmemb, FILE * restrict stream);

Here, ptr is a pointer to an array of chars, size is the size of one character (1), and nmemb is the number of chars to write (from the beginning of the array). This count excludes the newline character that has to be written to the file, in order to mark the end of a line. stream is the FILE object pointer for the stream. The string-terminating '\0' character should not be sent.

The fwrite() function returns the number of elements successfully written, which will be less than nmemb, only if a write error is encountered. If size or nmemb is zero, fwrite() returns zero and the state of the stream remains unchanged.

Writing only One Character to File
The synopsis for writing only one character to an opened file is:

    #include <stdio.h>
    int fputc(int c, FILE *stream);

The fputc function returns the character written. If a write error occurs, the error indicator for the stream is set and fputc returns EOF. EOF expands to (is replaced by) an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file; that is, no more input from the stream.

Writing Example
The following program creates a new text file named file1.txt, and writes three lines of text to it with the fwrite() function, each line ending with the new-line character. The file name has no directory part, so the file is created in the current working directory of the program (the home directory, if the program is run from there). The fputc() function adds the newline character (a single byte) to each line.

    #include <stdio.h>

    int main(int argc, char *argv[])
        {
            char str1[] = "I love you.";
            char str2[] = "Yes, I need you.";
            char str3[] = "Indeed, I want you.";
            
            FILE *strm = fopen("file1.txt", "wx");
            if (strm != 0) {
                fwrite(str1, 1, sizeof(str1)-1, strm); fputc('\n', strm);
                fwrite(str2, 1, sizeof(str2)-1, strm); fputc('\n', strm);
                fwrite(str3, 1, sizeof(str3)-1, strm); fputc('\n', strm);
                if (fclose(strm) != 0)
                    printf("Opened file could not be closed!");
            }
            
            return 0;
        }

The stream object type is "FILE" and not "file", because C is case sensitive. The sizeof operator, applied to a char array, gives the size of the array in bytes, which includes the terminating null character (\0). In order not to write the terminating null character, 1 is subtracted from the result of the sizeof operator. The fputc() function, as used here, sends the one-byte newline character (\n) at the end of each line. This is possible because, as characters are written, the file position indicator moves to the position of the next character to be written. The last '\n' character is written to the file like the others; a text editor simply does not show it as a visible character. If the file, file1.txt, is opened using the text editor of the operating system, the three lines would be seen. Because the mode is "wx", fopen() returns a null pointer if file1.txt already exists, so the program writes nothing when it is run a second time.

Note: The reader should not confuse '\n' with '\0'. '\n' is end of line, while '\0' is end of string. A very long string can have many '\n' within it. The "sizeof(string) - 1" will include them, but will not include '\0'.

Reading a Sequence of Characters From File
The synopsis for reading a sequence of characters from an opened file is:

    #include <stdio.h>
    size_t fread(void * restrict ptr, size_t size, size_t nmemb, FILE * restrict stream);

Up to nmemb elements, each of size bytes, are read into the array pointed to by ptr, which must already exist. For characters, size is 1 (the size of one char). stream is a pointer to the stream, opened by the fopen() function. The fread() function returns the number of elements successfully read, which may be less than nmemb if a read error or end-of-file is encountered. It does not add a terminating '\0' to the array. The idea is to read chunks of characters of the same size into the ptr array (then pass them on to their destinations), until the number of characters read is less than nmemb. As each chunk is read, the previous content of the array is overwritten (in the example below, each chunk is printed). The following program illustrates this with the file1.txt file:

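    #include <stdio.h>

    int main(int argc, char *argv[])
        {
            char arr[10];
            size_t n;    /* number of characters actually read into arr */
            
            FILE *strm = fopen("file1.txt", "r");
            if (strm != 0) {
                while ((n = fread(arr, sizeof(char), 10, strm)) > 0) {
                    /* arr is not '\0'-terminated, so print exactly n characters */
                    printf("%.*s", (int)n, arr);
                }
                
                if (fclose(strm) != 0)
                    printf("Opened file could not be closed!");
            }
            
            return 0;
        }

Note that the second argument for the fopen() function call is "r", for reading. The chunk size (nmemb) chosen was 10. The output is:

    I love you.
    Yes, I need you.
    Indeed, I want you.

The fread() function does not append '\0' to the characters it reads, and the last chunk may be shorter than the array (here the file has 49 characters, so the last chunk has only 9). That is why the return value n is kept, and why printf() is given the precision ".*" with n as the number of characters to print. Passing arr to a plain "%s" would be undefined behavior, because printf() would keep reading until it meets a '\0' that may not be there. With the gcc compiler, that mistake showed up as an extra last line containing 'w', the character left in the tenth element of the array by the previous chunk. Another way to read the file is one character at a time, until end-of-file (EOF) is reached. EOF is an integer with a negative value.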

Reading One Character at a Time
The synopsis for the fgetc() function is:

    #include <stdio.h>
    int fgetc(FILE *stream);

The fgetc() function reads and returns the next character pointed to by the file position indicator, and then advances the indicator to point to the next character, forward. When the end-of-file is reached, the fgetc() function returns EOF. The following program illustrates this:

    #include <stdio.h>

    int main(int argc, char *argv[])
        {
            int ch;    /* int, not char: fgetc() returns an int, so that EOF differs from every character */
            
            FILE *strm = fopen("file1.txt", "r");
            if (strm != 0) {
                while ((ch = fgetc(strm)) != EOF) {
                    printf("%c", ch);
                }
                
                if (fclose(strm) != 0)
                    printf("Opened file could not be closed!");
            }
            
            return 0;
        }

Note the parentheses around "ch = fgetc(strm)", to make sure it is executed first, before comparing the result with EOF. EOF is not in quotes. The output is:

    I love you.
    Yes, I need you.
    Indeed, I want you.

The '\n' at the end of the last line is read from the file like any other character; it is the one written by fputc() after the third line. So, no extra "printf("\n");" is needed towards the end of the program.

Append to a File
Append means add at the bottom. This means an existing text file with content has to be opened with "a" as the second argument of fopen(). The following program appends two extra lines to the file, file1.txt.

    #include <stdio.h>

    int main(int argc, char *argv[])
        {
            char str4[] = "This is the fourth line.";
            char str5[] = "This is the fifth line.";
            
            FILE *strm = fopen("file1.txt", "a");
            if (strm != 0) {
                fwrite(str4, 1, sizeof(str4)-1, strm); fputc('\n', strm);
                fwrite(str5, 1, sizeof(str5)-1, strm); fputc('\n', strm);
                if (fclose(strm) != 0)
                    printf("Opened file could not be closed!");
            }
            
            return 0;
        }

File Positioning Functions
A stream has a file position indicator. This indicator indicates where the next character will be read or written. There are four main file positioning functions in the stdio (standard input/output) library. The four functions are ftell(), fseek(), fsetpos() and fgetpos(). These functions are in pairs: ftell() and fseek() form one pair, and fsetpos() and fgetpos() form another pair. Only the pair, ftell() and fseek(), is explained here.

The ftell() Function Call
The synopsis for the ftell() function call is:

    #include <stdio.h>
    long int ftell(FILE *stream);

The ftell() function obtains the current value of the file position indicator, for the stream pointed to by stream.

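If successful, the ftell() function returns the current value of the file position indicator for the stream. On failure, the ftell() function returns -1L (L for long) and stores an implementation-defined positive value in errno (see later). The return type is "long int" for both text and binary files. For a binary stream, the value is the number of characters from the beginning of the file. For a text stream, the value is only guaranteed to be usable by fseek() to return to the same position, although on Linux and other POSIX systems it is also the number of bytes from the beginning of the file. For the file stream and file1.txt above, the statement would be: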

            long int pos = ftell(strm);

With the gcc compiler on 64-bit Linux, an int has a width of 4 bytes, while a long int has a width of 8 bytes; these widths are implementation-defined and differ on other platforms. The conversion specification for long int for the printf() function is %li (or %ld).

The fseek() Function Call
The synopsis for the fseek() function is:

    #include <stdio.h>
    int fseek(FILE *stream, long int offset, int whence);

The fseek() function sets the file position indicator for the stream pointed to by the stream pointer (e.g. strm above). If a read or write error occurs, the error indicator for the stream is set and fseek() fails (see return value below).

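whence can be SEEK_SET, SEEK_CUR or SEEK_END, each of which is an integer constant. SEEK_SET means the beginning of the file (the position of the first character). SEEK_CUR means the current value of the file position indicator. SEEK_END means end-of-file (just after the last character in the file). The offset is added to the chosen position, to give the new position. For a text stream, the C standard only guarantees the result when offset is zero, or when offset is a value returned by an earlier ftell() call on the same file and whence is SEEK_SET; other offsets work where text files are stored byte for byte, as on Linux.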

The fseek() function returns nonzero, only for a request that cannot be satisfied. Possible fseek() code segments for the above stream and file1.txt, are as follows:

                if ((fseek(strm, 0, SEEK_SET)) == 0) {
                    //read the first character
                }

                if ((fseek(strm, 0, SEEK_CUR)) == 0) {
                    //read the current character
                }

                if ((fseek(strm, 0, SEEK_END)) == 0) {
                    //file can be closed, because pointer points just after the last character of the file
                }

Editing a File

Editing (modifying) a file, means inserting a sequence of characters, or deleting a sequence of characters, or replacing a sequence of characters, anywhere within the file. In this case, the file stream has to be opened for update, with "r+" as the second argument of the fopen() function call.

The new-line character (\n) written at the end of the last line is not shown as a visible character by a text editor, but it is still in the file, and has to be taken into account, under certain conditions. At the moment, the content of the file1.txt file is:

    I love you.
    Yes, I need you.
    Indeed, I want you.
    This is the fourth line.
    This is the fifth line.

The C language does not have any predefined function to insert a sequence of characters or to delete a sequence of characters. The only actions easily achieved are replacement and appending. Appending has been discussed above.

To change the first line of the file1.txt file, from "I love you." to "I hate you.", the word, "love" has to be replaced by "hate". The zero based indexes of the characters of "love" have to be known. 'l' is at index 2, 'o' is at index 3. 'v' is at index 4. 'e' is at index 5. The following program changes the first line from "I love you." to "I hate you." :

    #include <stdio.h>

    int main(int argc, char *argv[])
        {
            FILE *strm = fopen("file1.txt", "r+");
            if (strm != 0) {
                if ((fseek(strm, 2, SEEK_SET)) == 0) {
                    fputc('h', strm);
                }
                if ((fseek(strm, 3, SEEK_SET)) == 0) {
                    fputc('a', strm);
                }
                if ((fseek(strm, 4, SEEK_SET)) == 0) {
                    fputc('t', strm);
                }
                if ((fseek(strm, 5, SEEK_SET)) == 0) {
                    fputc('e', strm);
                }
                
                if (fclose(strm) != 0)
                    printf("Opened file could not be closed!");
            }
            
            return 0;
        }

The characters in the second, third, fourth lines, etc. are at higher indexes. A program to read all the characters in the edited file, one-by-one has been given above.


