HTML Unleashed. Internationalizing HTML: Font Issues | WebReference



HTML Unleashed: Internationalizing HTML

Font Issues


Fonts lie on the boundary between visual presentation aspects of HTML documents and the problems of HTML internationalization.  It's of little use to have HTML supporting Unicode if you cannot display its character repertoire (or at least, the part of Unicode that your document makes use of).

Of course, most users interested in non-English Web content already have the necessary fonts installed on their systems.  Often these fonts are supplied with localized versions of operating systems or other software and use encodings that are popular for a particular language.  Common browsers such as Netscape Navigator let you use such fonts to view Web pages.

However, what's needed is a method to ensure the proper display of multilanguage data on any given system.  One solution might be creating and distributing a free (or inexpensive) multilanguage font pack or a single font with Unicode character layout.  A free Unicode font named Cyberbit is available from Bitstream.

The big downside to the single-font solution is that the file size of a typical Unicode font is several megabytes (even without the ideographs area).  Probably the most practical solution for the Web today is a glyph server: a proxy server that substitutes inline bitmaps for all non-ASCII characters on the page you're viewing.  Going through such a server is a quick way to read a foreign-language page without any font headaches.  Glyph servers are available now, including one that handles Japanese only.
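The substitution a glyph server performs can be sketched in a few lines of Python.  This is only an illustration, not the code of any actual glyph server: the function name `substitute_glyphs`, the `/glyphs/` URL prefix, and the GIF naming scheme are all assumptions, and the sketch treats its input as plain text rather than parsing HTML.

```python
# Hypothetical location of pre-rendered glyph bitmaps on the proxy.
GLYPH_URL_PREFIX = "/glyphs/"


def substitute_glyphs(text: str) -> str:
    """Replace every non-ASCII character with an inline image reference.

    ASCII characters pass through unchanged; each other character becomes
    an <img> tag whose file name is the character's Unicode code point.
    """
    pieces = []
    for ch in text:
        if ord(ch) < 128:
            pieces.append(ch)
        else:
            code = f"{ord(ch):04X}"
            pieces.append(
                f'<img src="{GLYPH_URL_PREFIX}{code}.gif" alt="U+{code}">'
            )
    return "".join(pieces)


print(substitute_glyphs("caf\u00e9"))
```

A real proxy would also have to leave markup and attribute values alone and render the bitmaps themselves; the sketch covers only the character-by-character replacement.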

The latest versions of popular browsers offer yet another solution to this problem.  Both Netscape Communicator 4 and Microsoft Internet Explorer 4 support variants of font embedding, a technique that lets you send, along with a Web page, the fonts needed to view it.  The primary goal of this extension is to improve the typographic quality of Web pages, although font embedding, as we'll see shortly, may also seriously affect internationalization.

Microsoft's implementation of font embedding uses the widely deployed TrueType font format.  Embedded fonts can be efficiently compressed, subranged (only characters used on the page are included), and protected from unauthorized distribution.  However, Internet Explorer relies heavily on the font display services of the operating system, which hurts portability.  Netscape's solution, called dynamic fonts, is based on the TrueDoc technology from Bitstream; it's more portable because the browser itself takes care of the display.
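Subranging amounts to finding out which characters a page actually uses and shipping only those glyphs.  A minimal sketch of that first step, assuming the page text is already available as a Python string; the function names `used_code_points` and `as_ranges` are hypothetical, and cutting down the font data itself is left to the embedding tool.

```python
from typing import List, Tuple


def used_code_points(page_text: str) -> List[int]:
    """Return the sorted, de-duplicated code points that occur in the text."""
    return sorted({ord(ch) for ch in page_text if not ch.isspace()})


def as_ranges(code_points: List[int]) -> List[Tuple[int, int]]:
    """Collapse sorted code points into inclusive (first, last) ranges."""
    ranges: List[Tuple[int, int]] = []
    for cp in code_points:
        if ranges and cp == ranges[-1][1] + 1:
            # Extend the current run by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Start a new run.
            ranges.append((cp, cp))
    return ranges


points = used_code_points("abc abd")
print(as_ranges(points))
```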

So what does font embedding mean to internationalization?  The possibility of ensuring the proper display of any characters in a document on any system capable of handling outline fonts is a big plus.  However, there are some dangerous pitfalls along this path.

First, being able to rely on supplied fonts, some HTML authors (as well as browser manufacturers) might go wild in the area of character set support.  In fact, the character encoding of a document then needs to comply only with that of the accompanying font, which makes nearly all the HTML internationalization provisions described in this chapter redundant and, as a result, puts them in danger of dying of neglect for lack of software support.  Of course, fonts for Web distribution may use Unicode, but there's no guarantee that this will always be the case.
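The risk is easy to demonstrate: without a declared character encoding, a byte has no fixed meaning, and only the font decides what appears on screen.  The sketch below decodes one byte under three encodings using Python's standard codecs; the choice of byte and of encodings is purely illustrative.

```python
RAW = bytes([0xE0])  # a single byte taken from some 8-bit document

# The same byte names a different character in each encoding.
for encoding in ("latin-1", "cp1251", "koi8_r"):
    char = RAW.decode(encoding)
    print(f"{encoding:8} -> U+{ord(char):04X}")
```

A page that depends on an embedded font laid out in one of these encodings, and carries no encoding label, cannot be indexed, searched, or displayed correctly by software that lacks that font.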

Second, font support in HTML puts additional emphasis on the visual presentation of a document, an aspect already overemphasized by the current proprietary HTML extensions.  Many documents on the Web today are created without any concern for portability or SGML compliance, and it's unlikely that font embedding in HTML documents, as implemented by the major browser companies, will improve the situation.

On the other hand, the urgent need for precise typographic control in HTML is obvious.  In view of that, the W3 Consortium has recently proposed its own alternative to the proprietary font embedding solutions.  In accordance with the ideology of separating structure from presentation, this solution is an extension of the CSS system providing for detailed description, intelligent matching, and format-independent downloading of fonts.  With this solution, user agents have four ways to select fonts for the presentation of HTML elements:

  • Exact matching of the font specified in the style sheet with one of the fonts installed in the system;
  • Intelligent matching of the specified font with a similar but different system font, if an exact match is unavailable;
  • Downloading of the font file over the network, if neither of the above options is available and if the font's URL is specified;
  • As a last resort, some user agents may perform font synthesis to create necessary fonts on the fly based on the style sheet's font description.
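
The four-step selection order listed above can be sketched as a simple fallback chain.  This is an illustration, not the algorithm from the W3C proposal: the `FontRequest` type, the `similar` and `download` callbacks, and the returned labels are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple


@dataclass
class FontRequest:
    family: str                 # font name given in the style sheet
    url: Optional[str] = None   # where the font file can be fetched, if given


def select_font(
    request: FontRequest,
    installed: Set[str],
    similar: Callable[[str, Set[str]], Optional[str]],
    download: Callable[[str], bool],
    can_synthesize: bool = False,
) -> Tuple[str, Optional[str]]:
    """Return (method, font name), trying the four options in order."""
    # 1. Exact match against the fonts installed on the system.
    if request.family in installed:
        return ("exact", request.family)
    # 2. Intelligent match with a similar installed font.
    match = similar(request.family, installed)
    if match is not None:
        return ("similar", match)
    # 3. Download, but only if the style sheet gives a URL for the font.
    if request.url is not None and download(request.url):
        return ("downloaded", request.family)
    # 4. Last resort: synthesize a font from the style sheet's description.
    if can_synthesize:
        return ("synthesized", request.family)
    # Nothing worked; the caller falls back to its default font.
    return ("default", None)
```

Passing the matching and downloading steps in as callbacks keeps the ordering logic separate from how a particular user agent compares or fetches fonts.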

The future of HTML internationalization is quite uncertain now.  The standards surveyed in this chapter have just been finalized, and their implementations in software are few.  Also, the big software companies are known for supporting official standards poorly and pursuing their own proprietary extensions instead.

However, most national webs are now growing much faster than the Web as a whole, so Internet-related products without at least some international support are likely to become rare very soon.  In view of this, the provisions of HTML 4.0 and related standards have, besides their thoughtful design, the clear advantage of being open, independent, and stable.


Created: Jun. 15, 1997
Revised: Jun. 16, 1997