Embedding Glyph Identifiers in XML Documents

Working Draft Jan. 16, 2002

Editor:
KAWAMATA Akira (Pie Dey CO.,Ltd.)
INSTAC XML WG2:
MURATA Makoto (Fuji Xerox Information Systems)
KOMACHI Yushi (Panasonic)
KAWAMATA Akira (Piedey)
UCHIYAMA Mitsukazu (Toshiba)
KAMIMURA Keisuke (GLOCOM)
IMAGO Satosi (RICOH)

Abstract

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is a subset of JIS TR X 0047:2001. It depends on the work of the Extended Kanji Processing Council, and is a modified version of the XKP GAIJI Exchange Specification.

Table of Contents

1 Scope

2 Glyph Reference Language
2.1 Attribute "name"

Appendices

A Examples
B References
B.1 Normative References
B.2 Informative References


1 Scope

This section is normative.

This Specification provides an XML-based language for embedding glyph identifiers in an XML document.

NOTE: "Glyph" is defined in ISO/IEC 9541-1 as "a recognizable abstract graphic symbol which is independent of any specific design."

A glyph identifier is registered through the procedure for glyphs in ISO/IEC 10036.

2 Glyph Reference Language

This section is normative.

Glyph Reference Language is a language for embedding glyph identifiers in XML documents.

The namespace name is "http://www.xml.gr.jp/xmlns/PRE/Reference". The attribute for specifying glyph identifiers (i.e. the attribute "name" defined below) belongs to this namespace.

2.1 Attribute "name"

<!ATTLIST AnyElement name CDATA #REQUIRED>

NOTE: This Specification uses the DTD syntax for convenience. In practice, namespace prefixes shall be attached to the attributes.

Elements carrying this attribute are meant to reference glyphs. The value of this attribute is a glyph identifier.

NOTE: The attribute 'name' is typically used for searching glyph identifiers.
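
As an illustration of such searching, the following is a minimal sketch of a search processor that collects every glyph identifier in a document. It is written in Python with the standard xml.etree.ElementTree module. The function name find_glyph_identifiers and the sample element names "doc" and "c" are hypothetical and are not defined by this Specification.

```python
import xml.etree.ElementTree as ET

# Namespace name as given in Section 2.
GLYPH_NS = "http://www.xml.gr.jp/xmlns/PRE/Reference"


def find_glyph_identifiers(xml_text, namespace=GLYPH_NS):
    """Return (element tag, glyph identifier) pairs in document order."""
    # ElementTree spells namespace-qualified names as "{uri}local".
    key = "{" + namespace + "}name"
    root = ET.fromstring(xml_text)
    return [(elem.tag, elem.attrib[key]) for elem in root.iter() if key in elem.attrib]


# Hypothetical sample; "doc", "c" and the prefix "g" are illustrative names only.
sample = (
    '<doc xmlns:g="http://www.xml.gr.jp/xmlns/PRE/Reference">'
    '<c g:name="ISO/IEC 10036/RA//Glyphs:10003290">吉</c>田茂'
    "</doc>"
)
print(find_glyph_identifiers(sample))
# [('c', 'ISO/IEC 10036/RA//Glyphs:10003290')]
```

Because the lookup is made on the namespace name rather than on the prefix, any prefix may be used in the document. An unprefixed attribute called "name" is not matched, since it does not belong to the namespace.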

Appendices

A Examples (Non-Normative)

Example 1

This XHTML document contains a special glyph of '吉'. The normal glyph for '吉' has a long upper line, while the special one has a short upper line. In ISO/IEC 10646-1, these two variations were unified into a single code point, but many Japanese people need to distinguish them. This example includes information that lets search processors distinguish the two variations, but no information for display or printing processors. Note that 吉田茂 (Yoshida Sigeru) served as Prime Minister of Japan between 1946 and 1954.

<html xmlns="http://www.w3.org/1999/xhtml">
<body xmlns:glyph="http://www.xml.gr.jp/PRE/Reference">
<p><span glyph:name="ISO/IEC 10036/RA//Glyphs:10003290"
>吉</span>田茂</p>
</body>
</html>

Example 2

Same as Example 1, but includes information for human readers: a human-readable comment has been inserted. Search processors ignore the content of span elements that carry the glyph:name attribute, so the comment is not used for search.

<html xmlns="http://www.w3.org/1999/xhtml">
<body xmlns:glyph="http://www.xml.gr.jp/PRE/Reference">
<p><span glyph:name="ISO/IEC 10036/RA//Glyphs:10003290"
>吉(The version of Short Upper Line)</span>田茂</p>
</body>
</html>
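
The behaviour described for Example 2 can be sketched as follows, assuming a search processor that flattens a document into a sequence of search tokens. The sketch is in Python; the function name search_tokens and the one-token-per-character model are assumptions made for illustration, and the namespace URI is the one written in the examples of this appendix.

```python
import xml.etree.ElementTree as ET

# Namespace URI as written in the examples of this appendix.
GLYPH_NAME = "{http://www.xml.gr.jp/PRE/Reference}name"


def search_tokens(elem):
    """Flatten an element into search tokens.

    Plain text contributes one token per character. An element carrying
    glyph:name contributes its glyph identifier only; its own content,
    including any human-readable comment, is skipped.
    """
    tokens = []
    if GLYPH_NAME in elem.attrib:
        tokens.append(elem.attrib[GLYPH_NAME])
    else:
        tokens.extend(elem.text or "")
        for child in elem:
            tokens.extend(search_tokens(child))
            tokens.extend(child.tail or "")  # text following the child element
    return tokens


# The p elements of Example 1 and Example 2, with namespaces declared locally.
DECLS = 'xmlns="http://www.w3.org/1999/xhtml" xmlns:glyph="http://www.xml.gr.jp/PRE/Reference"'
GLYPH = 'glyph:name="ISO/IEC 10036/RA//Glyphs:10003290"'
example_1 = f"<p {DECLS}><span {GLYPH}>吉</span>田茂</p>"
example_2 = f"<p {DECLS}><span {GLYPH}>吉(The version of Short Upper Line)</span>田茂</p>"

tokens_1 = search_tokens(ET.fromstring(example_1))
tokens_2 = search_tokens(ET.fromstring(example_2))
print(tokens_1 == tokens_2)  # True: the comment does not affect the search tokens
```

Under these assumptions both documents yield the glyph identifier followed by '田' and '茂', so a search for the short-upper-line glyph matches either of them.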

Example 3

Same as Example 1, but uses a GIF image to express the glyph variation of '吉' for display or printing. GIF is merely an example; any other graphics format is equally suitable for this purpose.

<html xmlns="http://www.w3.org/1999/xhtml">
<body xmlns:glyph="http://www.xml.gr.jp/PRE/Reference">
<p><img glyph:name="ISO/IEC 10036/RA//Glyphs:10003290"
src="http://www.mojikyo.gr.jp/gif/003/003290.gif"
alt="吉(The version of Short Upper Line)" />田茂</p>
</body>
</html>

Example 4

Same as Example 1, but includes a reference to a font file that contains a variation of '吉'. This example assumes that a font file exists at http://www.xxx.yyy/zzz and that it holds the glyph shape of the variation of '吉' at the code point of 'A'.

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<style type="text/css" media="screen, print">
      @font-face {
        font-family: "UpperShortYoshi";
        src: url("http://www.xxx.yyy/zzz")
      }
      .upperShortYoshi { font-family: "UpperShortYoshi" }
</style>
</head>
<body xmlns:glyph="http://www.xml.gr.jp/PRE/Reference">
<p><span class="upperShortYoshi" glyph:name="ISO/IEC 10036/RA//Glyphs:10003290"
>A</span>田茂</p>
</body>
</html>

Example 5

This example expresses the name 吉田茂 in SVG. Although this is a graphics file, a search processor can use the glyph:name attribute to detect which glyph the path shapes describe.

<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 9.0, SVG Export Plug-In  -->
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20000303 Stylable//EN"
 "http://www.w3.org/TR/2000/03/WD-SVG-20000303/DTD/svg-20000303-stylable.dtd" [
    <!ENTITY st0 "font-family:'MS-Gothic';">
    <!ENTITY st1 "font-size:12;">
    <!ENTITY st2 "fill-rule:nonzero;clip-rule:nonzero;stroke:#000000;
stroke-miterlimit:4;">
    <!ENTITY st3 "stroke:none;">
]>
<svg  width="35.825pt" height="11.953pt" viewBox="0 0 35.825 11.953"
xml:space="preserve"
xmlns:glyph="http://www.xml.gr.jp/PRE/Reference">
    <g id="_x0083__x008C__x0083_C_x0083__x0084__x0081__x005B__x0020_1"
 style="&st2;">
        <path style="&st3;" d="M0,4.139h4.609V2.592H0.922V1.701h3.688
v-1.5h1.031c0.313,0.063,0.328,0.188,0.047,0.375v1.125h3.734v0.891H5.688
v1.547h4.703v0.891H0V4.139z M1.422,6.248h7.547v4.547H7.891v-0.422H2.5
v0.516H1.422V6.248z M2.5,7.139v2.344h5.391V7.139H2.5
            z" glyph:name="ISO/IEC 10036/RA//Glyphs:10003290" />
        <text transform="matrix(1 0 0 1 11.8247 10.3125)"><tspan
 x="0" y="0" style="&st3; &st0; &st1;">田茂</tspan></text>
    </g>
</svg>

NOTE: All of the above examples contain the same glyph variation of '吉'. Search processors report that all of the above documents contain the same glyph, whereas display or printing processors simply ignore the glyph:name attribute, without understanding which glyph is represented.
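
The point of this NOTE can be sketched as a small index that maps each glyph identifier to the documents containing it. The sketch is in Python; the function name build_glyph_index, the document names, and the cut-down documents (stand-ins for Examples 1, 3 and 5) are hypothetical, and the namespace URI is the one written in the examples above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace URI as written in the examples of this appendix.
GLYPH_NAME = "{http://www.xml.gr.jp/PRE/Reference}name"
GLYPH_DECL = 'xmlns:glyph="http://www.xml.gr.jp/PRE/Reference"'
YOSHI = "ISO/IEC 10036/RA//Glyphs:10003290"

# Cut-down stand-ins for Examples 1, 3 and 5 (hypothetical document names).
documents = {
    "span.xhtml": f'<p {GLYPH_DECL}><span glyph:name="{YOSHI}">吉</span>田茂</p>',
    "img.xhtml": f'<p {GLYPH_DECL}><img glyph:name="{YOSHI}" alt="吉" />田茂</p>',
    "path.svg": f'<svg {GLYPH_DECL}><path d="M0,0z" glyph:name="{YOSHI}" /></svg>',
}


def build_glyph_index(docs):
    """Map each glyph identifier to the names of the documents that reference it."""
    index = defaultdict(list)
    for name, text in docs.items():
        for elem in ET.fromstring(text).iter():
            identifier = elem.attrib.get(GLYPH_NAME)
            # Record each document once per identifier, whatever element carries it.
            if identifier is not None and name not in index[identifier]:
                index[identifier].append(name)
    return dict(index)


index = build_glyph_index(documents)
print(index[YOSHI])
# ['span.xhtml', 'img.xhtml', 'path.svg']
```

The index treats a span, an img and a path alike, because only the glyph:name attribute is consulted and the element that carries it plays no part in the lookup.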

B References

B.1 Normative References

XML
Extensible Markup Language (XML) 1.0 (Second Edition), Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, 2000. W3C Recommendation available at: http://www.w3.org/TR/REC-xml.
namespace
Namespaces in XML, Tim Bray, Dave Hollander, Andrew Layman, 1999. W3C Recommendation available at: http://www.w3.org/TR/REC-xml-names.
ISO/IEC 9541
ISO/IEC 9541-1:1991, Information technology -- Font information interchange -- Part 1: Architecture, ISO (International Organization for Standardization), 1991
ISO/IEC 10036
ISO/IEC 10036:1996, Information Technology -- Font information interchange -- Procedures for registration of font-related identifiers, ISO (International Organization for Standardization), 1996

B.2 Informative References

ISO/IEC 9070
ISO/IEC 9070:1991, Information Technology -- SGML support facilities -- Registration procedures for public text owner identifiers, ISO (International Organization for Standardization), 1991