HTML Unleashed. Internationalizing HTML: MIME
MIME stands for Multipurpose Internet Mail Extensions. It is a standard originally developed to extend the capabilities of electronic mail by allowing e-mail messages to include virtually any type of data, not only plain text. However, the mechanisms of MIME proved so useful and well designed that they are now used in many other fields, including HTML. The latest MIME specification can be found in RFCs 2045 through 2049.
The existing e-mail transport systems, such as Simple Mail Transfer Protocol (SMTP) and Post Office Protocol 3 (POP3), do not accept anything but plain text in the body of a message. This means that a message should contain only printable (noncontrol) characters of 7-bit ASCII, and its lines should not exceed some reasonable length. To overcome this limitation, MIME introduces methods to convert binary data, or text in encodings that use more than seven bits per character, into "mail safe" plain ASCII text.
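As a rough illustration of such a conversion, the following Python sketch turns a short ISO-8859-1 string into 7-bit ASCII using the two transfer encodings defined in RFC 2045, base64 and quoted-printable. The text above does not name these methods. The helper name to_mail_safe and the sample string are assumptions made for this example only.

```python
import base64
import quopri


def to_mail_safe(data: bytes, method: str = "base64") -> bytes:
    """Convert arbitrary bytes to 7-bit ASCII (hypothetical helper)."""
    if method == "base64":
        # encodebytes breaks the output into lines of at most 76 characters
        return base64.encodebytes(data)
    if method == "quoted-printable":
        # bytes outside printable ASCII become =XX hexadecimal escapes
        return quopri.encodestring(data)
    raise ValueError(f"unsupported method: {method}")


# The letter e-acute is byte 0xE9 in ISO-8859-1, outside 7-bit ASCII.
text = "Caf\u00e9 au lait".encode("iso-8859-1")

qp = to_mail_safe(text, "quoted-printable")
b64 = to_mail_safe(text, "base64")

print(qp)
print(b64)
```

Quoted-printable leaves mostly-ASCII text readable, while base64 is the usual choice for arbitrary binary data. Both conversions are reversible, so the receiving side can recover the original bytes.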
Special MIME header fields are added to such messages to specify which conversion method was used (if any) and what the original type of the data was. For text messages, the character encoding (character set, or charset) of the message body is specified along with other parameters. This mechanism is important for us because it is used not only in e-mail messages but also in the HTTP headers of HTML documents transferred over the network. The charset parameter is part of the Content-Type header field and takes the following form:
Content-Type: text/html; charset=ISO-8859-1
Here, text/html is the standard identifier of the "HTML source" data type, and ISO-8859-1 indicates the character encoding used by the text of the HTML document. Both these values are taken from the official registries of content data types, character sets, and other MIME-related classifiers maintained by IANA (Internet Assigned Numbers Authority).
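A receiving program can split this header field into its parts before decoding the document. The sketch below uses the email package from the Python standard library to do so. The helper name parse_content_type and the sample body bytes are assumptions for illustration, not part of any specification.

```python
from email import message_from_string


def parse_content_type(header_value: str):
    """Return (media_type, charset) for a Content-Type value (hypothetical helper)."""
    # Build a minimal message consisting of this single header field.
    msg = message_from_string(f"Content-Type: {header_value}\n\n")
    # get_content_charset() returns the charset in lowercase,
    # or None when the header carries no charset parameter.
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("text/html; charset=ISO-8859-1")

# Sample body bytes (an assumption): byte 0xE9 is e-acute in ISO-8859-1.
body = b"<p>Caf\xe9</p>"
html = body.decode(charset)

print(media_type, charset)
print(html)
```

When the charset parameter is absent, the helper returns None for it, and the caller has to fall back on some default encoding of its own choosing.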
It is this official registry that makes MIME so useful beyond the e-mail realm. For our purposes, it is especially important that MIME has developed a standard way of communicating the character encoding of a document. The list of registered MIME charset values can be obtained from IANA.
Revised: Jun. 16, 1997