URL Encoding
HTML Character Sets - Part 4
Foreword: In this part of my series, I talk about URL encoding.
By: Chrysanthus Date Published: 31 Jul 2012
Introduction
Note: If you cannot see the code or if you think anything is missing (broken link, image absent, etc.), just contact me at forchatrans@yahoo.com. That is, contact me for the slightest problem you have about what you are reading.
Description
What you type in the address bar of a browser is a URL. A URL can also be the value of the href attribute of an a element, and URLs are used in a few other places in a web page. When you type an address in the address bar of the web browser and click Go, the URL is sent across the Internet.
A form data set is also sent across the Internet in a similar way. If the value of the form's method attribute is "get", then the form information is sent as part of a long URL, something like,
http://fine-hosting.com?firstname=Juan+Mary&lastname=Jones
This URL sends the first name, “Juan Mary” and last name, “Jones” of a woman through the Internet to a server.
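As a rough illustration of how such a URL can be put together, here is a minimal sketch using Python's standard urllib.parse module. The host and field names come from the example above; the variable names (form_data, query, url, decoded) are assumptions made only for this illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Field names and values taken from the example URL above.
form_data = {"firstname": "Juan Mary", "lastname": "Jones"}

# urlencode() joins the name=value pairs with & and turns each space into +.
query = urlencode(form_data)
url = "http://fine-hosting.com?" + query
print(url)

# The receiving side can reverse the process and recover the values.
decoded = parse_qs(urlsplit(url).query)
print(decoded)
```

The first print shows the same URL as the example above; the second shows the names and values with the + turned back into a space.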
A URL is sent as ASCII characters (characters from the ASCII character set); in form data, any space is replaced by a + sign. All the characters in the above URL are in the ASCII character set. There are two problems. First, characters such as :, /, ?, = and & have special meanings in the URL; if these characters occur in the data area (e.g. a name) of the URL, then they have to be coded. Second, characters that are not in the ASCII character set can also be sent within the URL. Because of these two problems there is URL encoding, also called percent-encoding. It looks like a modified ASCII character set, but it is really an encoding scheme that uses only ASCII characters.
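To make the first problem concrete, here is a small sketch, again assuming Python's urllib.parse module. The data value (company) is hypothetical and was chosen only because it contains several characters that have special meanings in a URL.

```python
from urllib.parse import quote_plus, unquote_plus

# A hypothetical data value that contains URL delimiters.
company = "Smith & Sons: R&D/Sales?"

# quote_plus() is the form-style variant: a space becomes +,
# and the delimiters &, :, / and ? become %XX codes.
encoded = quote_plus(company)
print(encoded)

# Decoding restores the original value unchanged.
restored = unquote_plus(encoded)
print(restored)
```

Without the coding step, the & inside the value would be read as the start of a new name=value pair, and the value would be cut short.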
In URL encoding, a space is + (in form data) or %20; ? is %3F; a is %61; b is %62; c is %63. The special characters of the URL and the non-ASCII characters can be called unsafe characters. In a URL, unsafe characters must be coded; the normal ASCII characters, such as letters and digits, can remain uncoded. A character code begins with % followed by 2 hexadecimal digits. The following tables list the characters and their URL encoding. The codes from %80 to %FF are the single-byte values of the Windows-1252 character set; a page that uses UTF-8 sends such characters as two or more %XX codes instead.
Character | URL-encoding |
---|---|
space | %20 |
! | %21 |
" | %22 |
# | %23 |
$ | %24 |
% | %25 |
& | %26 |
' | %27 |
( | %28 |
) | %29 |
* | %2A |
+ | %2B |
, | %2C |
- | %2D |
. | %2E |
/ | %2F |
0 | %30 |
1 | %31 |
2 | %32 |
3 | %33 |
4 | %34 |
5 | %35 |
6 | %36 |
7 | %37 |
8 | %38 |
9 | %39 |
: | %3A |
; | %3B |
< | %3C |
= | %3D |
> | %3E |
? | %3F |
@ | %40 |
A | %41 |
B | %42 |
C | %43 |
D | %44 |
E | %45 |
F | %46 |
G | %47 |
H | %48 |
I | %49 |
J | %4A |
K | %4B |
L | %4C |
M | %4D |
N | %4E |
O | %4F |
P | %50 |
Q | %51 |
R | %52 |
S | %53 |
T | %54 |
U | %55 |
V | %56 |
W | %57 |
X | %58 |
Y | %59 |
Z | %5A |
[ | %5B |
\ | %5C |
] | %5D |
^ | %5E |
_ | %5F |
` | %60 |
a | %61 |
b | %62 |
c | %63 |
d | %64 |
e | %65 |
f | %66 |
g | %67 |
h | %68 |
i | %69 |
j | %6A |
k | %6B |
l | %6C |
m | %6D |
n | %6E |
o | %6F |
p | %70 |
q | %71 |
r | %72 |
s | %73 |
t | %74 |
u | %75 |
v | %76 |
w | %77 |
x | %78 |
y | %79 |
z | %7A |
{ | %7B |
\| | %7C |
} | %7D |
~ | %7E |
(delete, DEL) | %7F |
€ | %80 |
(unused) | %81 |
‚ | %82 |
ƒ | %83 |
„ | %84 |
… | %85 |
† | %86 |
‡ | %87 |
ˆ | %88 |
‰ | %89 |
Š | %8A |
‹ | %8B |
Œ | %8C |
(unused) | %8D |
Ž | %8E |
(unused) | %8F |
(unused) | %90 |
‘ | %91 |
’ | %92 |
“ | %93 |
” | %94 |
• | %95 |
– | %96 |
— | %97 |
˜ | %98 |
™ | %99 |
š | %9A |
› | %9B |
œ | %9C |
(unused) | %9D |
ž | %9E |
Ÿ | %9F |
(non-breaking space) | %A0 |
¡ | %A1 |
¢ | %A2 |
£ | %A3 |
¤ | %A4 |
¥ | %A5 |
¦ | %A6 |
§ | %A7 |
¨ | %A8 |
© | %A9 |
ª | %AA |
« | %AB |
¬ | %AC |
(soft hyphen) | %AD |
® | %AE |
¯ | %AF |
° | %B0 |
± | %B1 |
² | %B2 |
³ | %B3 |
´ | %B4 |
µ | %B5 |
¶ | %B6 |
· | %B7 |
¸ | %B8 |
¹ | %B9 |
º | %BA |
» | %BB |
¼ | %BC |
½ | %BD |
¾ | %BE |
¿ | %BF |
À | %C0 |
Á | %C1 |
 | %C2 |
à | %C3 |
Ä | %C4 |
Å | %C5 |
Æ | %C6 |
Ç | %C7 |
È | %C8 |
É | %C9 |
Ê | %CA |
Ë | %CB |
Ì | %CC |
Í | %CD |
Î | %CE |
Ï | %CF |
Ð | %D0 |
Ñ | %D1 |
Ò | %D2 |
Ó | %D3 |
Ô | %D4 |
Õ | %D5 |
Ö | %D6 |
× | %D7 |
Ø | %D8 |
Ù | %D9 |
Ú | %DA |
Û | %DB |
Ü | %DC |
Ý | %DD |
Þ | %DE |
ß | %DF |
à | %E0 |
á | %E1 |
â | %E2 |
ã | %E3 |
ä | %E4 |
å | %E5 |
æ | %E6 |
ç | %E7 |
è | %E8 |
é | %E9 |
ê | %EA |
ë | %EB |
ì | %EC |
í | %ED |
î | %EE |
ï | %EF |
ð | %F0 |
ñ | %F1 |
ò | %F2 |
ó | %F3 |
ô | %F4 |
õ | %F5 |
ö | %F6 |
÷ | %F7 |
ø | %F8 |
ù | %F9 |
ú | %FA |
û | %FB |
ü | %FC |
ý | %FD |
þ | %FE |
ÿ | %FF |
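The codes from %80 to %FF in the table above depend on the character set in use. The sketch below, assuming Python's urllib.parse module, encodes three characters from the table twice: once with the single-byte Windows-1252 code page (named cp1252 in Python) and once with UTF-8. The results dictionary is only a convenience for this illustration.

```python
from urllib.parse import quote, unquote

results = {}
for ch in ["é", "€", "ÿ"]:
    single_byte = quote(ch, encoding="cp1252")  # one byte, so one %XX code
    utf8 = quote(ch, encoding="utf-8")          # two or three bytes here
    results[ch] = (single_byte, utf8)
    print(ch, single_byte, utf8)

# Decoding must use the same character set that was used for encoding.
assert unquote("%E9", encoding="cp1252") == "é"
assert unquote("%C3%A9", encoding="utf-8") == "é"
```

With cp1252 the codes agree with the table (for example %E9 for é); with UTF-8 the same character becomes %C3%A9. A server therefore has to know which character set the sender used.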
The ASCII control characters (codes %00 to %1F) are encoded as follows:
ASCII Character | Description | URL-encoding |
---|---|---|
NUL | null character | %00 |
SOH | start of header | %01 |
STX | start of text | %02 |
ETX | end of text | %03 |
EOT | end of transmission | %04 |
ENQ | enquiry | %05 |
ACK | acknowledge | %06 |
BEL | bell (ring) | %07 |
BS | backspace | %08 |
HT | horizontal tab | %09 |
LF | line feed | %0A |
VT | vertical tab | %0B |
FF | form feed | %0C |
CR | carriage return | %0D |
SO | shift out | %0E |
SI | shift in | %0F |
DLE | data link escape | %10 |
DC1 | device control 1 | %11 |
DC2 | device control 2 | %12 |
DC3 | device control 3 | %13 |
DC4 | device control 4 | %14 |
NAK | negative acknowledge | %15 |
SYN | synchronize | %16 |
ETB | end transmission block | %17 |
CAN | cancel | %18 |
EM | end of medium | %19 |
SUB | substitute | %1A |
ESC | escape | %1B |
FS | file separator | %1C |
GS | group separator | %1D |
RS | record separator | %1E |
US | unit separator | %1F |
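The rule "% followed by 2 hexadecimal digits" can be sketched by hand as follows. This is only an illustration, not a replacement for a library routine: percent_encode is a hypothetical helper, and the set of characters left uncoded (letters, digits and - . _ ~) is an assumption made for this sketch. Text is converted to bytes with UTF-8 here, so a non-ASCII character would produce more than one code.

```python
# Characters left as-is in this sketch: letters, digits and - . _ ~
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

def percent_encode(text: str) -> str:
    """Hypothetical helper: write every byte outside UNRESERVED as %XX."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)                # safe character, kept uncoded
        else:
            out.append(f"%{byte:02X}")    # % plus two uppercase hex digits
    return "".join(out)

print(percent_encode("a b?"))
print(percent_encode("line1\nline2"))
```

The first call codes the space and the question mark; the second codes the line feed control character as %0A, matching the table above.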
Chrys