|
MIME stands for Multipurpose Internet Mail Extensions. It is a
standard developed originally to extend the capabilities of electronic
mail by allowing e-mail messages to include virtually any type of
data, not only plain text. However, the mechanisms of MIME proved so
useful and well-designed that they are now used in many other fields,
including HTTP and HTML. The core MIME specification can be found in
RFCs 2045 through 2049.
Traditional e-mail protocols, such as the Simple Mail Transfer
Protocol (SMTP) and Post Office Protocol 3 (POP3), were designed to
carry nothing but plain text in the body of a message. This means that a
message should contain only printable (noncontrol) characters of 7-bit
ASCII, and its lines should not exceed a fixed length (SMTP limits a
line to 1000 characters, including the terminating CRLF). To
overcome this limitation, MIME introduces methods to convert binary
data or text in 8-bit and multibyte encodings into "mail safe" plain
ASCII text.
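As a rough illustration of such a conversion, the sketch below uses Python's standard base64 and quopri modules to turn arbitrary bytes into 7-bit ASCII. Base64 and quoted-printable are the two transfer encodings defined by MIME; the function names and sample data are assumptions made for this example only.

```python
import base64
import quopri


def to_mail_safe_base64(data: bytes) -> bytes:
    """Encode arbitrary bytes as base64, wrapped into lines of at most 76 characters."""
    return base64.encodebytes(data)


def to_mail_safe_qp(data: bytes) -> bytes:
    """Encode mostly-textual bytes as quoted-printable (non-ASCII bytes become =XX)."""
    return quopri.encodestring(data)


# Arbitrary binary data: every possible byte value once.
binary = bytes(range(256))
encoded = to_mail_safe_base64(binary)

# The encoded form is pure 7-bit ASCII and decodes back to the original bytes.
assert all(byte < 128 for byte in encoded)
assert base64.decodebytes(encoded) == binary

# Text in an 8-bit encoding: only the non-ASCII byte is escaped.
text = "café".encode("iso-8859-1")
qp = to_mail_safe_qp(text)
print(qp)  # b'caf=E9'
```

Base64 suits truly binary data, while quoted-printable keeps mostly-ASCII text readable, which is why a sender would typically choose between them depending on the content.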
Special MIME header fields are added to such messages to specify
what conversion method was used (if any) and what the original
type of the data sent in the message was. For text messages, along with
other parameters, the character encoding (character set, or
charset) of the message body is specified. This mechanism is
important for us because it is used not only in e-mail messages but
also in HTTP headers for HTML documents transferred over the network.
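The following minimal sketch, assuming Python's standard email package, shows these header fields on a generated message. The subject and body text are made up for the illustration, and the exact header formatting produced by other mail software may differ.

```python
from email.message import EmailMessage

# Build a text message whose body uses an 8-bit character set.
msg = EmailMessage()
msg["Subject"] = "Greeting"
msg.set_content(
    "Grüße aus Köln\n",
    charset="iso-8859-1",      # original character encoding of the text
    cte="quoted-printable",    # conversion method used to make it mail safe
)

# The MIME header fields record both pieces of information.
print(msg.get_content_type())            # media type of the body
print(msg.get_content_charset())         # declared charset
print(msg["Content-Transfer-Encoding"])  # conversion method

# The serialized message contains only 7-bit ASCII bytes.
wire = msg.as_bytes()
assert all(byte < 128 for byte in wire)

# A receiver can reverse the conversion using the same header fields.
assert msg.get_content() == "Grüße aus Köln\n"
```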
The charset parameter is a part of the
Content-Type header field and takes the following form:
Content-Type: text/html; charset=ISO-8859-1
Here, text/html is the standard identifier of the "HTML source"
data type, and ISO-8859-1 indicates the character encoding
used by the text of the HTML document. Both these values are taken
from the official registries of content data types, character sets,
and other MIME-related classifiers maintained by IANA (Internet
Assigned Numbers Authority).
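A receiver can split such a header value into its parts before decoding the document body. The sketch below leans on Python's standard email.message.Message class to do so; the helper name parse_content_type and the sample body are hypothetical, chosen only for this example.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(value: str) -> Tuple[str, Optional[str]]:
    """Return (media type, charset) from a Content-Type header value.

    The charset is None when the header carries no charset parameter.
    Both parts are returned in lowercase.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("text/html; charset=ISO-8859-1")
print(media_type, charset)  # text/html iso-8859-1

# Use the declared charset to turn the raw bytes of the document into text.
body = b"<p>caf\xe9</p>"
print(body.decode(charset))  # <p>café</p>
```

When the charset parameter is absent, the helper returns None and the caller has to fall back on some default or on information inside the document itself.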
It is these official registries that make MIME so useful beyond the
e-mail realm. For our purposes, it is especially important that MIME
has developed a standard way of communicating the character encoding
of a document. The list of registered MIME charset values
can be obtained from IANA.
|