|
Fonts lie on the boundary between visual presentation aspects of
HTML documents and the problems of HTML internationalization. It's of
little use to have HTML supporting Unicode if you cannot display its
character repertoire (or at least, the part of Unicode that your
document makes use of).
Of course, most users interested in non-English Web content already
have corresponding fonts installed on their systems. Often these fonts
are supplied with localized versions of operating systems or other
software and use encodings that are popular for a particular language.
Common browsers such as Netscape Navigator can use such fonts to
display Web pages.
However, what's needed is a method to ensure the proper display of
multilanguage data on any given system. One solution might be creating
and distributing a free (or inexpensive) multilanguage font pack or a
single font with Unicode character layout. A free Unicode font named
Cyberbit is available from Bitstream at
http://www.bitstream.com/cyberbit.htm.
The big downside to the single-font solution is that the file size of
a typical Unicode font is several megabytes (even without the ideographs
area). Probably the most practical solution for the Web today is a glyph
server, a proxy server that substitutes inline bitmaps for all non-ASCII
characters on the page you're viewing. Intermediation by such a server
is a quick way to read a foreign-language page without any font
headaches. Glyph servers now available include
http://www.lfw.org/shodouka/ (Japanese only) and http://baka.aubg.bg.
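The substitution a glyph server performs can be illustrated with a minimal Python sketch. It is not the implementation of either server mentioned above: the function name, the /glyphs URL prefix, and the .gif naming scheme are all hypothetical, and the sketch works on already decoded text rather than on a full HTML page.

```python
def substitute_glyphs(text, glyph_base_url="/glyphs"):
    """Replace every non-ASCII character with an inline image reference.

    glyph_base_url and the <code point>.gif naming are assumptions made
    for this illustration only.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            # ASCII characters pass through untouched.
            out.append(ch)
        else:
            # Name the bitmap after the character's code point in hex.
            code = f"{ord(ch):04X}"
            out.append(f'<img src="{glyph_base_url}/{code}.gif" alt="U+{code}">')
    return "".join(out)
```

A real proxy would also have to decode the page's character encoding and leave tags and attribute values alone; both steps are omitted here.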
The latest versions of popular browsers offer yet another solution to
this problem. Both Netscape Communicator 4 and Microsoft Internet
Explorer 4 support some variations of font embedding, the technique
allowing you to send, along with a Web page, the fonts needed to view
it. The primary goal of this extension is to improve the typographic
quality of Web pages, although font embedding, as we'll see shortly, may
also have serious consequences for internationalization.
Microsoft's implementation of font embedding (see
http://www.microsoft.com/typography/web/embedding/default.htm) uses the
widely deployed TrueType font format. Embedded fonts can be efficiently
compressed, subranged (only characters used on the page are included),
and protected from unauthorized distribution. However, Internet Explorer
relies heavily on the font display services of the operating system,
which limits portability. Netscape's solution, called dynamic fonts,
is based on the TrueDoc technology from Bitstream (see
http://www.bitstream.com/world/); it's more portable because the browser
itself takes care of the display.
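Subranging, mentioned above, starts from knowing which characters a page actually uses. The following Python sketch shows only that first step, under the assumption that the page text is already decoded; the function name is hypothetical, and the sketch does not reflect how any particular embedding tool works.

```python
def used_ranges(text):
    """Return the code points used in text as inclusive (start, end) ranges."""
    points = sorted({ord(ch) for ch in text})
    ranges = []
    for cp in points:
        if ranges and cp == ranges[-1][1] + 1:
            # Extend the current range by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Start a new range at this code point.
            ranges.append((cp, cp))
    return ranges
```

A font tool could then keep only the glyphs whose code points fall inside these ranges, which is why a subranged font can be much smaller than the complete one.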
So what does font embedding mean to internationalization? The
possibility of ensuring the proper display of any characters in a
document on any system capable of handling outline fonts is a big plus.
However, there are some dangerous pitfalls along this path.
First, because they can rely on supplied fonts, some HTML authors (as
well as browser manufacturers) might go wild in the area of character
set support. In fact, the character encoding of a document then needs to
match only that of the accompanying font, which makes nearly all
HTML internationalization provisions described in this chapter
redundant and, as a result, puts them in danger of dying from neglect
for lack of software support. Of course, fonts for Web distribution may use
Unicode, but there's no guarantee that this will always be the case.
Second, HTML font support puts additional emphasis on the visual
presentation of a document, an aspect that is already overemphasized
by the current proprietary HTML extensions. Many
documents on the Web today are created without any concern for
portability or SGML compliance, and it's not very likely that font
embedding in HTML documents, as implemented by the major browser
companies, will ever improve the situation.
On the other hand, the urgent need for precise typographic control in
HTML is obvious. In view of that, the W3 Consortium has recently proposed
its own alternative to the proprietary font embedding solutions. In
accordance with the principle of separating structure from presentation,
this solution (see http://www.w3.org/TR/WD-font) is an extension of the
CSS system providing for detailed description, intelligent matching, and
format-independent downloading of fonts. With this solution, user agents
have four ways to select fonts for the presentation of HTML elements:
- Exact matching of the font specified in the style sheet
with one of the fonts installed in the system;
- Intelligent matching of the specified font with a similar but
different system font, if an exact match is unavailable;
- Downloading of the font file over the network, if the two options
above are unavailable and if the font's URL is specified;
- As a last resort, some user agents may perform font
synthesis to create necessary fonts on the fly based on the style
sheet's font description.
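The order of these four options can be sketched as a simple fallback chain. The Python below is only an illustration of that ordering, not the matching algorithm defined in the W3C draft: the function name, its parameters, and the table of similar fonts are all hypothetical.

```python
def select_font(requested, installed, similar=None, font_url=None,
                can_synthesize=False):
    """Return a (strategy, value) pair describing how a font is obtained.

    installed is a set of system font names; similar maps a font name to
    a list of acceptable substitutes (an assumption of this sketch).
    """
    similar = similar or {}
    # 1. Exact match with an installed font.
    if requested in installed:
        return ("exact", requested)
    # 2. Intelligent match with a similar installed font.
    for candidate in similar.get(requested, []):
        if candidate in installed:
            return ("matched", candidate)
    # 3. Download, if the style sheet gives a URL for the font.
    if font_url is not None:
        return ("download", font_url)
    # 4. Synthesis from the font description, if the user agent supports it.
    if can_synthesize:
        return ("synthesize", requested)
    # Nothing applies: fall back to the user agent's default font.
    return ("default", None)
```

For example, select_font("Cyberbit", {"Times"}, font_url="http://example.org/cyberbit") skips the first two steps and returns the download option.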
The future of HTML internationalization is quite uncertain at present. The
standards surveyed in this chapter have just been finalized, and their
implementations in software are few. Also, the big software companies
are known for supporting official standards poorly and
pursuing their own proprietary extensions instead.
However, most national webs are now growing much faster than the Web
as a whole, so that Internet-related products without at least some
international support are likely to become rare very soon. In view of
this, the provisions of HTML 4.0 and related standards have, besides their
thoughtful design, the clear advantage of being open, independent, and
stable.
|