White Space in the C Computer Language
Part 22 of the Complete C Course
Foreword: White Space in the C Computer Language
By: Chrysanthus Date Published: 11 Jun 2024
Introduction
Types of Characters
Most people think of A as a character, B as another character, C as another, and so on. In C, a digit is also a character when it is written in single quotes: '5', '7' and '8' are characters, not numbers. The computer keyboard also has less commonly used characters, such as the asterisk, *. In many computer languages, a white space character is written in source code as two items: a backslash followed by a letter. Together, these two items stand for a single white space character. For example, \n is a white space character in a good number of computer languages; it has been mentioned before and it is described below. Such two-item notations are better called escape sequences. They are also called special characters. The term "white space" is also written as "whitespace".
The Horizontal Tab
When somebody writing with a pen on a piece of paper wants to start a new paragraph, he/she does not start at the left margin; he/she shifts a bit to the right (indents) before starting. That indentation can be considered a horizontal tab. In C, the escape sequence for it is \t: a backslash followed by a lowercase 't'. This special character is called the horizontal tab. Read and test the following program:
#include <stdio.h>

int main(int argc, char *argv[])
{
    /* \t at the start of the string is a single horizontal tab character */
    const char *str = "\tand the sentence begins";
    printf("%s\n", str);
    return 0;
}
In the output, there is a wide horizontal gap in front of the text. The width of the gap depends on the tab stops of the terminal or editor, which are commonly set every 8 columns. The horizontal tab character can be placed anywhere in a string, and a string can contain more than one of them. The horizontal tab (\t) can be used to line up text in columns, as in a table. Such formatting needs a fixed-width font. Today, though, input and output for the ordinary user are mostly done in graphical windows rather than in a text console.
The Vertical Tab
Just as the horizontal tab exists, a vertical tab also exists, acting in the vertical direction. In C, as in many languages, the escape sequence for the vertical tab is \v: a backslash followed by a lowercase 'v'. A program containing \v compiles normally with gcc, but the visible effect depends on the terminal or printer that receives the character, so the result may not be what is expected.
The Form Feed
A form feed is more of an instruction than a blank (white) space character. It is called a white space character because it can cause a blank space. Imagine a document of about ten lines of text with the escape sequence \f in the middle of it; \f is what many languages, including C, use as the form feed character. While the page is being printed (or displayed), when the printer (or screen) reaches this escape sequence, it should not print the rest of the text on the current page; it should advance to the next page, leaving the remainder of the current page blank, and then print the rest of the text there. Form feed means: feed in the next page (paper into the printer) and print the rest of the text on it. If the printer meets this character at the very end of the current page, no blank space is produced, because the rest of the text would be printed (or displayed) on the next page anyway.
Line Terminators
Two escape sequences are described below as white space characters, although they do not really produce blank spaces. Instead, they affect where the next line or text is printed or displayed.
The horizontal tab and vertical tab characters are blank spaces in themselves. The form feed character can produce a blank space, depending on its position in the current page; in itself, it is more of an instruction than a blank space character. The two escape sequences below are not blank space characters in themselves. They are actually line terminators, but in many forums they are called white space characters.
Carriage Return
Imagine that a line of text is to be displayed (printed) and there is the escape sequence \r in the middle of it. \r is known as the carriage return character in many computer languages. When the printer or screen reaches this point, it sends the ink (or light) back to the beginning of the current line. If printing continues after this, the beginning of the current line is written over by the part of the text that follows \r. The carriage return escape sequence is normally used in conjunction with the newline escape sequence (see next).
The Newline
Imagine that a line of text is to be displayed and there is the escape sequence \n in the middle of it. \n is known as the newline character in many languages. When the printer or screen reaches this point, it sends the ink (or light) to the next (to-be-displayed) line. Taken on its own, it is not clear whether the ink should go to the beginning, middle or end of the next line. If the programmer wants printing to continue at the beginning of the next line on such a device, he/she has to use both \r and \n together (i.e. \r\n) at the same point in the line of text. With some languages (compilers or interpreters), \n alone serves the purpose of both \r and \n. C is one of them: when a program writes to a text stream such as stdout, \n alone starts a new line, and the implementation converts it to whatever line ending the platform uses.
Read and test the following code:
#include <stdio.h>

int main(int argc, char *argv[])
{
    /* \r\n in the middle of the string ends the first line */
    const char *str = "This is the first sentence. \r\nThis is the second sentence.";
    printf("%s\n", str);
    return 0;
}
The output is:
This is the first sentence.
This is the second sentence.
The output consists of two lines: the first line has the first sentence and the second line has the second sentence, even though both sentences are on one line in the code. Note that the printf() format string, "%s\n", adds a final newline after the text of the string.
The Space Character Itself
Pressing the space bar on the keyboard produces one space character. That is a white space. A sequence of space characters produces a longer white space.
Note
The escape sequences for white space are not displayed as \t, \f, etc.; the user sees only their effects. The space, horizontal tab, vertical tab and form feed characters can be considered pure white space characters. The line terminators can be considered indirect white space characters.