Yushi KOMACHI
<komachi@y-adagio.com>
MLIT-3 (3rd International Symposium on Standardization of Multilingual Information Technology), Hanoi, Vietnam
Oct. 6-7, 1998
1. Introduction
Internet services, in particular the Web and email, have made it possible to interchange documents at low cost and with worldwide coverage. Today we can interchange documents without regard to national boundaries, at least as far as the interchange operations are concerned. This has given rise to a new concept: a logical territory (the internet service area) in which multiple languages and cultures coexist.
In the early days of the internet, email users wrote their documents by default in English, using the ASCII coded character set. Once the internet was recognized as a useful infrastructure, users came to expect document interchange in the mother language of the creator or the recipient.
In response to those requirements, most of today's browsers claim "multilingual" support. Their "multilingual" functions, however, work by selecting an appropriate language from a menu, and should be called multi-localization rather than multilingual support. They cannot render a document that contains parts in different languages within a page or within the document itself.
Documents interchanged in the internet logical territory are often required to be a multilingual mixture, i.e., written in multiple languages within a paragraph, a page or a document. Here we call such documents real multilingual documents. A typical example is the participant list of an online meeting, in which each participant should be listed in his or her own language and script.
Real multilingual documents must be rendered and presented according to appropriate multilingual formatting[1]. This means that such documents should be created and processed with the following considerations:
- (1) coded character set including multilingual repertoires
- (2) font set required for multilingual rendering
- (3) style specification for multilingual rendering
This article focuses on item (2) and discusses user requirements on font technology for rendering multilingual mixtures and on font properties for multilingual document interchange. Some new properties for soft copy presentation[2] are then discussed.
Some research regarding those topics was carried out in 1997 by CICC (Center of the International Cooperation for Computerization) and the activities were reported[3].
2. Existing Font Resource Architecture
ISO/IEC JTC1/SC18 (since reorganized as JTC1/SC34) developed several standards for document processing/description languages and font information interchange. They are based on an operational model[4] for document creation and interchange. Its basic concepts are:
- system/implementation independent
- open document interchange
- natural language independent
- character independent
- separation between logical structure and formatting
As far as fonts are concerned, the architecture, interchange format and glyph shape representation of font resources are specified by ISO/IEC 9541 parts 1, 2 and 3 respectively[5],[6],[7]. Several countries have adopted them as national standards, either as they are or in translation, e.g., JIS X 4161, 4162 and 4163 respectively.
In general, the font standards can support multilingual font treatment. For example, they define several writing modes, alignment lines, typeface design groupings, and other font properties required for multilingual mixed formatting. In practice, however, they are not sufficient for detailed formatting in countries that use non-Latin scripts.
It was therefore proposed to include some additional font properties, and Amendment 2[8] to ISO/IEC 9541-1 was developed with the support of international font experts. The DAM has been approved and the final text has been forwarded to ISO for publication. The amendment can support some Kanji-specific formatting. However, the properties defined in AM2 are still insufficient for full support of multilingual mixtures.
3. User Requirements for Multilingual and Soft Copy Documents
Before the font properties are discussed, the user requirements for describing multilingual and soft copy documents are clarified from the point of view of font technology.
- (1) The documents include several parts, each written in a different language. For example, English, Japanese and other Asian languages should be supported.
- (2) The mixed formatting should be based on the formatting rules defined in the countries concerned.
- (3) The documents support colored text and colored backgrounds, which may be moving.
- (4) The documents should be openly interchangeable, i.e., described by internationally approved standards.
4. Additional Font Properties and their Related Values
All the properties defined in this clause are optional and should be used together with the required and optional properties of ISO/IEC 9541-1. The properties are described in extended Backus-Naur Form.
NOTE These properties should be treated only as a starting point for further discussion, since the discussions within CICC are not yet sufficient.
4.1 Properties for multilingual mixed formatting
4.1.1 Font Resource Combination Name (FONTRESCOMB)
FONTRESCOMB is a property-list that specifies a font resource combination typically used in formatting documents which contain several character repertoires.
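As a rough illustration of how a formatter might use such a combination, the following Python sketch splits a text into runs and assigns one font name of the combination to each run. The repertoire classifier, the `FONT_RESOURCE_COMBINATION` table and the font names are hypothetical and are not taken from ISO/IEC 9541-1 or from this proposal.

```python
import unicodedata

# Hypothetical font resource combination: one font name per repertoire.
# The font names are placeholders, not registered FONTNAME values.
FONT_RESOURCE_COMBINATION = {
    "LATIN": "Example-Latin-Roman",
    "KANA": "Example-Mincho",
    "CJK": "Example-Mincho",
    "OTHER": "Example-Fallback",
}


def repertoire_of(ch):
    """Classify a character coarsely by its Unicode character name."""
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "LATIN"
    if name.startswith(("HIRAGANA", "KATAKANA")):
        return "KANA"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "CJK"
    return "OTHER"


def split_into_font_runs(text, combination=FONT_RESOURCE_COMBINATION):
    """Return (font name, substring) pairs, merging adjacent characters
    that map to the same font of the combination."""
    runs = []
    for ch in text:
        font = combination[repertoire_of(ch)]
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

The classifier is deliberately coarse: spaces, digits and punctuation fall into the fallback entry, which a real formatter would resolve from the surrounding run instead.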
    fontrescomb-property ::= fontrescomb-name, fontrescomb-value-property-list
    fontrescomb-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//FONTRESCOMB
    fontrescomb-value-property-list ::= (fontname-property)+

4.1.2 Alignment Position (ALIGNPOSITION)
ALIGNPOSITION is a rel-rational, defining the alignment position relative to the height of the body size.
    alignposition-property ::= alignposition-name, alignposition-value
    alignposition-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//ALIGNPOSITION
    alignposition-value ::= REL-RATIONAL

NOTE This property can only be applied to a writing mode whose WRMODENAME has the value RIGHT-TO-LEFT or LEFT-TO-RIGHT.
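A minimal sketch of how ALIGNPOSITION could be used when runs from different font resources share one horizontal line: each run's alignment line is derived from its body size, and the runs are shifted so that those lines coincide. The function names are illustrative, and the choice of the bottom edge of the body as the origin is an assumption, since the definition only states that the value is relative to the height of the body size.

```python
from fractions import Fraction


def alignment_offset(body_size, align_position):
    """Distance of the alignment line from the bottom edge of the body.

    Assumption: ALIGNPOSITION is measured from the bottom edge; the
    proposal only says it is relative to the height of the body size.
    """
    return Fraction(body_size) * Fraction(align_position)


def shift_to_common_line(runs):
    """Given (body_size, align_position) pairs, return the vertical shift
    of each run so that all alignment lines meet the highest one."""
    offsets = [alignment_offset(size, pos) for size, pos in runs]
    target = max(offsets)
    return [target - off for off in offsets]
```

For example, a run with body size 10 and ALIGNPOSITION 1/2 placed next to a run with body size 12 and ALIGNPOSITION 1/4 gives offsets of 5 and 3, so the second run is raised by 2 units.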
4.1.3 Property-value Lists
For actual font substitution, some property values should be listed for typical font resources. The list should include the values of:
- FILLRATIO
- DSNAREAS
- AVRESC
The property-value list should be specified as an informative annex of font resource architecture standards.
4.2 Properties for interlinear formatting
The properties of interlinear objects are basically an issue of formatting rather than of fonts. In some systems, however, font properties that serve as formatting hints are effective. From this point of view, ISO/IEC 9541-1 introduced formatting hint properties such as Scores.
4.2.1 Ruby (RUBY)
RUBY is a property-list, specifying a sub-line description that associates the pronunciation or an explanation with words and phrases, which in most cases consist of Kanji characters.
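The following Python sketch shows how a formatter might interpret two of the ruby properties defined in this subclause: the RUBYSIZE values ("HALF", "ONE-THIRD") and the RUBYHORALIGN values for horizontal writing. It assumes full-width (square) glyphs, reads "TOP" as alignment with the start of the base string, and leaves "JUSTIFICATION" out; these readings and the function names are assumptions, not part of the proposal.

```python
from fractions import Fraction

# Scale factors for the two RUBYSIZE values named in the grammar.
RUBY_SIZE_FACTOR = {"HALF": Fraction(1, 2), "ONE-THIRD": Fraction(1, 3)}


def ruby_font_size(base_size, ruby_size="HALF"):
    """Ruby font size derived from the size of the base text."""
    return Fraction(base_size) * RUBY_SIZE_FACTOR[ruby_size]


def ruby_start_offset(base_chars, ruby_chars, base_size,
                      ruby_size="HALF", alignment="CENTER"):
    """Offset of the ruby string from the start of the base string along
    the writing direction, assuming full-width (square) glyphs.

    Assumption: "TOP" starts the ruby at the start of the base string and
    "CENTER" centers it; "JUSTIFICATION" is not covered by this sketch.
    """
    base_length = Fraction(base_size) * base_chars
    ruby_length = ruby_font_size(base_size, ruby_size) * ruby_chars
    if alignment == "TOP":
        return Fraction(0)
    if alignment == "CENTER":
        return (base_length - ruby_length) / 2
    raise ValueError("alignment not covered by this sketch: " + alignment)
```

With a base size of 10, two Kanji carrying three half-size ruby characters give a base length of 20 and a ruby length of 15, so centered ruby starts 5/2 units after the start of the base string.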
    ruby-property ::= ruby-name, ruby-value-property-list
    ruby-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RUBY
    ruby-value-property-list ::= (ruby-glyph-complement-property |
        ruby-font-size-property |
        ruby-formatting-type-property |
        ruby-typeface-property |
        ruby-horizontal-writing-direction-alignment-property |
        ruby-vertical-writing-direction-alignment-property |
        ruby-line-progression-direction-offset-property)+

    ruby-glyph-complement-property ::= ruby-glyph-complement-name, ruby-glyph-complement-value
    ruby-glyph-complement-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RUBYCOMP
    ruby-glyph-complement-value ::= "HIRAGANA-RUBY" | "KATAKANA-RUBY" | "DEFAULT"

    ruby-font-size-property ::= ruby-font-size-name, ruby-font-size-value
    ruby-font-size-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RUBYSIZE
    ruby-font-size-value ::= "HALF" | "ONE-THIRD" -- 1/2 | 1/3

    ruby-formatting-type-property ::= ruby-formatting-type-name, ruby-formatting-type-value
    ruby-formatting-type-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RUBYTYPE
    ruby-formatting-type-value ::= "2-CHARS-RUBY" | "3-CHARS-RUBY"

    ruby-typeface-property ::= ruby-typeface-name, ruby-typeface-value
    ruby-typeface-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RUBYTYPEFACE
    ruby-typeface-value ::= STRUCTURED-NAME

    ruby-horizontal-writing-direction-alignment-property ::= ruby-horizontal-writing-direction-alignment-name, ruby-horizontal-writing-direction-alignment-value
    ruby-horizontal-writing-direction-alignment-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RUBYHORALIGN
    ruby-horizontal-writing-direction-alignment-value ::= "TOP" | "CENTER" | "JUSTIFICATION"

    ruby-vertical-writing-direction-alignment-property ::= ruby-vertical-writing-direction-alignment-name, ruby-vertical-writing-direction-alignment-value
    ruby-vertical-writing-direction-alignment-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RUBYVERALIGN
    ruby-vertical-writing-direction-alignment-value ::= "CENTER" | "JUSTIFICATION"

    ruby-line-progression-direction-offset-property ::= ruby-line-progression-direction-offset-name, ruby-line-progression-direction-offset-value
    ruby-line-progression-direction-offset-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RUBYOFFSET
    ruby-line-progression-direction-offset-value ::= REL-RATIONAL

4.2.2 Return Mark (RETMARK)
RETMARK is a property-list, specifying a sub-line description that indicates the reading sequence for the Japanese interpretation of old Chinese documents.
    return-mark-property ::= return-mark-name, return-mark-property-list
    return-mark-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RETMARK
    return-mark-property-list ::= (return-mark-offset-x-property |
        return-mark-offset-y-property)*

    return-mark-offset-x-property ::= return-mark-offset-x-name, return-mark-offset-x-value
    return-mark-offset-x-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RETMARKXOFFSET
    return-mark-offset-x-value ::= REL-RATIONAL

    return-mark-offset-y-property ::= return-mark-offset-y-name, return-mark-offset-y-value
    return-mark-offset-y-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//RETMARKYOFFSET
    return-mark-offset-y-value ::= REL-RATIONAL

4.2.3 Added Kana (ADDEDKANA)
ADDEDKANA is a property-list, specifying a sub-line description that complements the pronunciation of Kanji for the Japanese interpretation of old Chinese documents.
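As a hedged sketch, the size and offset values of ADDEDKANA might be resolved by a formatter as follows. It assumes that "ABS" means a size already expressed in the units of the base text, that "RELATIVE" means a ratio of the base (Kanji) size, and that both offsets are ratios of the base size; the proposal itself only types these values as REL-RATIONAL, and the function names are illustrative.

```python
from fractions import Fraction


def added_kana_size(base_size, value_type, value):
    """Resolve ADDEDKANASIZE to a concrete size.

    Assumption: "ABS" values are already sizes in the units of base_size,
    and "RELATIVE" values are ratios of the base (Kanji) size.
    """
    if value_type == "ABS":
        return Fraction(value)
    if value_type == "RELATIVE":
        return Fraction(base_size) * Fraction(value)
    raise ValueError("unknown size value type: " + value_type)


def added_kana_origin(kanji_origin, base_size, offset_x, offset_y):
    """Position of the added kana relative to the Kanji origin, reading
    both offsets as ratios of the base size (an assumption)."""
    x, y = kanji_origin
    return (x + Fraction(base_size) * Fraction(offset_x),
            y + Fraction(base_size) * Fraction(offset_y))
```

The same offset arithmetic would apply to the x and y offsets of RETMARK under the same assumption.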
    added-kana-property ::= added-kana-name, added-kana-property-list
    added-kana-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//ADDEDKANA
    added-kana-property-list ::= (added-kana-font-size-property |
        added-kana-typeface-property |
        added-kana-offset-x-property |
        added-kana-offset-y-property)*

    added-kana-font-size-property ::= added-kana-font-size-name, added-kana-font-size-value-type, added-kana-font-size-value
    added-kana-font-size-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//ADDEDKANASIZE
    added-kana-font-size-value-type ::= "ABS" | "RELATIVE"
    added-kana-font-size-value ::= REL-RATIONAL

    added-kana-typeface-property ::= added-kana-typeface-name, added-kana-typeface-value
    added-kana-typeface-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//ADDEDKANATYPEFACE
    added-kana-typeface-value ::= STRUCTURED-NAME

    added-kana-offset-x-property ::= added-kana-offset-x-name, added-kana-offset-x-value
    added-kana-offset-x-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//ADDEDKANAXOFFSET
    added-kana-offset-x-value ::= REL-RATIONAL

    added-kana-offset-y-property ::= added-kana-offset-y-name, added-kana-offset-y-value
    added-kana-offset-y-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//ADDEDKANAYOFFSET
    added-kana-offset-y-value ::= REL-RATIONAL

4.3 Properties for color formatting
NOTE Color formatting needs more discussion, in particular on font resource specific issues.
4.3.1 Design Color (DSNCOLOR)
DSNCOLOR is a structured-name, specifying the recommended color in which the font resource is designed to be displayed, according to the judgment of the design source (DSNSOURCE).
    dsncolor-property ::= dsncolor-name, dsncolor-value
    dsncolor-name ::= STRUCTURED-NAME -- ISO/IEC 9541-1//DSNCOLOR
    dsncolor-value ::= STRUCTURED-NAME

5. Standardizing Strategy
There is a strong demand for the properties to be approved as an international standard for open interchange of multilingual documents. For international standardization, the following strategy is suggested:
- (1) The specification of the font properties is drafted by the CICC committee.
- (2) The draft specification is reviewed by font and formatting experts in Asian countries. The discussion will be carried out at conferences or meetings (e.g., CJK DOCP, MLIT).
- (3) After the review and discussions, a new work item proposal (or other appropriate procedure) will be submitted to ISO/IEC JTC1/SC34 with the support of AFSIT member countries. A subdivision scheme is preferable to the normal NP procedure.
- (4) PDAM and DAM texts will be drafted on the basis of the discussion among Asian experts.
6. Conclusion
New font properties for multilingual mixture and web documents are proposed to satisfy the user requirements for open interchange of multilingual documents in an internet environment. Since they are expected to be approved as an international standard, some procedures and a strategy are suggested.
Proposals and comments on font technology, in particular regarding country-specific font properties, are welcomed and appreciated.
References
- [1] Y. Komachi, Multilingual documents and their character information processing, Bit, 29, 12, pp.45-50, 1997-12.
- [2] FDPC, Annual report '95 of multimedia font committee, 1996-03.
- [3] CICC, Committee report of standard data for international information interchange, 9-CICC-CM01, 1998-03.
- [4] ISO/IEC PDTR 11585(JTC1/SC18 N3809), Operational model for document description and processing languages, 1992-09.
- [5] ISO/IEC 9541-1, Font information interchange - Architecture, 1991-09.
- [6] ISO/IEC 9541-2, Font information interchange - Interchange format, 1991-09.
- [7] ISO/IEC 9541-3, Font information interchange - Glyph shape representation, 1994-03.
- [8] ISO/IEC JTC1/WG4 N1986, AM2/9541-1: Minor Enhancements to the Architecture to Address Font Technology Advances, 1998-05.