4.6 ISO/IEC N1946


This section translates part of the following Working Draft and presents an overview of it.

ISO/IEC JTC1/WG4 N1964

ISO/IEC JTC1/WG4

Document Description Languages

Title:
Information Processing -- Guidelines for accessing SGML-encoded data and metadata from databases, knowledge bases and search tools
Source:
ISO JTC1/WG4
Project:
SGML Rapporteur Group
Status of Document:
Liaison statement
Requested action:
Standards development groups interested in the storage of SGML data and metadata are invited to take part in a workshop to discuss how best to develop suitably generalized guidelines for the storage and retrieval of SGML documents.
Summary of major points:
SGML data and metadata are becoming vital for the communication of data over public and private networks. To date, storage of such information has typically been managed at file level. Increasingly, SGML data is being stored in databases, where access to the stored data is controlled at levels below that of a complete file/document/message. For interworking of data storage environments to be possible, techniques must be developed so that tools are able to identify, request and update subcomponents of SGML data stores in a way that is consistent across environments and does not depend on the type of storage system being used.
Distribution:
National bodies participating in or observing the activities of JTC1/WG4, relevant IETF and W3C working groups and IT standard development organizations that liaise with JTC1.

Information Processing -- Guidelines for accessing SGML-encoded data and metadata from databases, knowledge bases and search tools

The Standard Generalized Markup Language (SGML) defined in ISO 8879 has become a key factor in the development of markup languages for interconnecting the World Wide Web (WWW) of documents that are accessible through the Internet. The role of searching, and other forms of intelligent access facilities, for finding documents and smaller units of referenceable information has become widely recognized as another key factor in the success of the WWW. To date such facilities have tended to be developed in a somewhat piecemeal fashion, designed to solve one problem at a time. The result is that it is rarely possible to interconnect systems based on different web searching paradigms.

The Structured Query Language (SQL) defined in ISO 9075 is widely recognized as the most portable way of requesting information from relational databases, and is used as one of the main access routes for all such databases, and for certain classes of object-oriented database. For many types of object-oriented database, however, SQL, even in the extended forms offered by SQL3 and SQL/MM, is too limited. Various languages have been proposed for accessing object-oriented databases, the most portable of which is the Object Query Language (OQL). Whilst well designed for access to object sets where class inheritance and related facilities are required, OQL is not optimized for dealing with the types of hierarchical structures encountered in document databases.

The Standard Document Query Language (SDQL) defined in ISO 10179, which is based on facilities for the identification of nodes in SGML document structures defined in ISO 10744, has been tailor-made for identifying relationships within structured documents. SDQL does not, however, contain the mechanisms that are required to control access to, updating of, or interrogation of, structured document repositories. Such facilities form a vital part of standards such as SQL and OQL and other languages used to manage information repositories, and are essential in the development of repositories for hierarchically structured data sets.
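The distinction drawn above, between identifying nodes in a document structure and managing a repository of such documents, can be illustrated with a minimal sketch. It is not SDQL and not part of any of the standards named here. It assumes well-formed XML as a stand-in for SGML, uses the path syntax of Python's xml.etree.ElementTree for node selection, and the DocumentStore class, its method names and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; XML is used here as a stand-in for SGML.
SAMPLE = """<report>
  <chapter id="c1"><title>Scope</title><para>First paragraph.</para></chapter>
  <chapter id="c2"><title>Queries</title><para>Second paragraph.</para></chapter>
</report>"""


class DocumentStore:
    """Toy in-memory repository keyed by document name (hypothetical API)."""

    def __init__(self):
        self._docs = {}

    def put(self, name, text):
        # Parse and store a document under the given name.
        self._docs[name] = ET.fromstring(text)

    def select(self, name, path):
        # Node identification: return the subcomponents matching a path.
        return self._docs[name].findall(path)

    def update_text(self, name, path, new_text):
        # Repository management: replace the text of every matching node
        # and report how many nodes were changed.
        nodes = self.select(name, path)
        for node in nodes:
            node.text = new_text
        return len(nodes)


store = DocumentStore()
store.put("report", SAMPLE)

# Request subcomponents below the level of the whole document.
titles = [t.text for t in store.select("report", "./chapter/title")]

# Update a single subcomponent addressed by its position in the hierarchy.
changed = store.update_text("report", "./chapter[@id='c2']/para", "Revised paragraph.")
revised = store.select("report", "./chapter[@id='c2']/para")[0].text
```

In this sketch the caller sees only select and update_text, so the same requests could in principle be served by a file system, a relational database or an object database behind the interface. Access control, which the text also names as a missing facility, is deliberately left out.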

Because of the ever-increasing amount of information that is becoming accessible via the Internet, the use of intelligent agents to help identify the most relevant form of information for a particular user is becoming increasingly important. It is no longer enough to just use entered terms to search for relevant data. Today search engines also need to know which searches have previously been successful, both for a particular user and for similar users. Increasingly search systems are being based on the use of advanced "knowledge bases" and on statistical analysis of data use.

Initially used principally for large document sets, structured documents are now regularly encountered as part of ephemeral objects such as electronic mail messages and the returning of search results, and for the interchange of metadata associated with documents and their storage units. Because SGML is now commonly being used for messaging between computer systems, techniques are needed to allow the integration of data, metadata, message processing commands and other data management related operations. Such inter-system messages should be able to interact with the operating system within which they are being used, and must be able to interrogate their operating environment to determine what information should be supplied to what local processes.

A further problem exists in trying to archive data which contains references to information identified using advanced search techniques. Unless it can be guaranteed that the same result will be returned each time a particular search is made, a mechanism is needed to record which objects were returned by the original search in a form that will allow the same set of objects to be recalled whenever the referencing document is accessed. The Hypermedia/Time-based Structuring Language (HyTime) defined in ISO 10744 includes mechanisms for uniquely recording the location of stored objects, and mechanisms that can be used to link such addresses to particular points in the documents they were referenced from. It does not, however, include sufficient properties to ensure safe long-term management of such information.
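The archiving problem described above can be sketched as follows. This is an illustration only, not a HyTime mechanism: the repository, the object identifiers, and the snapshot and recall functions are all hypothetical, and a SHA-256 digest is assumed as one possible way of detecting that a recorded object has since changed.

```python
import hashlib
import json

# Hypothetical repository: object identifier -> stored content.
REPOSITORY = {
    "doc-1": "SGML and databases",
    "doc-2": "Query languages for SGML",
    "doc-3": "Relational storage",
}


def digest(content):
    # Content fingerprint used to detect later modification.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def search(term):
    # Naive live search; its result changes as the repository changes.
    return sorted(k for k, v in REPOSITORY.items() if term.lower() in v.lower())


def snapshot(term):
    # Record which objects the search returned, with a digest of each.
    return {
        "query": term,
        "hits": [{"id": k, "sha256": digest(REPOSITORY[k])} for k in search(term)],
    }


def recall(record):
    # Return the originally recorded identifiers, each paired with a flag
    # saying whether the object still exists with unchanged content.
    result = []
    for hit in record["hits"]:
        content = REPOSITORY.get(hit["id"])
        unchanged = content is not None and digest(content) == hit["sha256"]
        result.append((hit["id"], unchanged))
    return result


record = snapshot("sgml")
archived = json.dumps(record)  # form that could be stored with the referencing document

REPOSITORY["doc-4"] = "SGML in messaging"  # the repository grows after archiving

live_now = search("sgml")              # the live search now returns a different set
recalled = recall(json.loads(archived))  # the archived record still yields the original set
```

Re-running the search after the repository changes returns three objects, while the archived record still names the two that were originally found. The digest only detects change; keeping the original content retrievable over the long term is the management problem that the text says remains unsolved.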

A number of projects have looked at specific aspects of these problems, including:

There is a need to ensure that proposed mechanisms for accessing individual data objects stored in structured data sets can interwork. For this to be possible it is important that those developing processes dependent on database access develop a common set of guidelines for accessing SGML-encoded data and metadata from databases, knowledge bases and search tools. To start the discussions needed to develop such guidelines, JTC1/WG4 invites those groups interested in the subject to contribute to a workshop on user needs for such systems, to be held in Paris on 22nd May 1998, immediately following the SGML/XML Europe 98 conference and immediately preceding the W3C meetings to be held in the same city the following week. (This venue is felt to be the one at which the greatest number of SGML experts will be gathered during the first half of 1998.)

Following these initial discussions it is anticipated that an electronic discussion group will be set up to help to draft a set of sharable guidelines. When these have been agreed, and there is evidence that they are being implemented, it is proposed that the Guidelines should be published as an ISO technical report.

Those wishing to make a formal presentation to the workshop should contact Dr. James Mason, Convenor of JTC1/WG4, with a summary or abstract of their proposals by 10th April 1998. Where draft standards for processes involving access to SGML-encoded data and metadata from databases, knowledge bases and search tools already exist, WG4 would appreciate it if copies of the standard, or pointers to copies of the standard, could be submitted to Dr. Mason so that they can be made available to those attending the workshop through links placed on the WG4 home page (see http://www.ornl.gov/sgml/WG4/home.htm).

