Gendatam - A Genealogical Data Model
Some Implementation Considerations

Model Revision Date: 1st March 2005
Model First Publication Date: 25th September 2002
This Document published: 1st March 2005

Copyright (c)2002-2005 Graywork Products Ltd
All Rights Reserved

Author: Peter J. Seymour, Graywork Products Ltd
Email: gdm@gendatam.com (note: plain-text emails are preferred).

--------------------------------------------------------------------------

1. PURPOSE

This document is intended to provide guidance for software developers in
the implementation of the Gendatam model.

2. SCOPE

This is an informal working document and is not claimed to be
comprehensive. It will be added to as seems appropriate.

3. GUIDANCE

3.1 Treatment of unwanted or unrecognised data.

A program processing Gendatam data need not implement processing of all
record types or of all record subtypes. Where such a program only reads
the file, unwanted data may be discarded on input. However, where the
program both reads and re-writes the file, unwanted data must be retained
for output. In this latter case, the program must also retain for output
any data it does not recognise. This unrecognised data could for instance
be a record of unrecognised record type, a record of unrecognised subtype
or an unexpected number of records of known subtype.

- Robustness

Input data should be handled in a manner such that invalid or unexpected
data does not cause malfunction. How this is done is implementation
dependent. A defective input record that cannot be reliably repaired may
be marked as 'damaged' and must be treated as 'unrecognised'.

- End-of-line issues

On input, a processing program must deal correctly with any combination
of CR and LF control characters at line-end. On output, the recommended
line-end sequence is CRLF, although the execution platform scheme may be
used if difficult to avoid.
While it is probable that line-end marking will be done in a consistent
manner throughout a file, it should not be assumed that this will
necessarily be the case.

- File Encoding

The default file encoding system is UTF8 and this must be used until
others are officially allowed. UTF8, however, does not fully meet the
design objectives. Other existing schemes, including UTF7, will be
investigated. It is not intended to invent any new scheme.

- Record Id Values

  - It is intended that a user will not need to manipulate record ids (in
    the sense of needing to key in a record id value). Any processing of
    record ids should be 'computer-assisted' to this end.

  - Data objects are referenced by 'record id'. It is assumed that
    complete files will be loaded into main memory. During program
    execution, the objects will be loaded into a temporary
    implementation-defined table or database and accessed as required
    via the record id value. This temporary table or database may be
    partly or fully file-based if implementation convenience indicates
    this. A reference to a non-existent record id discovered during
    processing by the user should be reported to the user with a
    suggested corrective action.

  - Cross-file referencing is implemented using a 64-bit record id where
    the 64 bits may be viewed as two contiguous 32-bit fields. (This
    necessarily follows if record ids are 32-bit and file ids are to be
    given the same flexibility). The high-order (leftmost) 32 bits may be
    seen as a 31-bit signed integer giving the file id. The low-order
    (rightmost) 32 bits similarly give the record id. The index of file
    ids to file names is kept in the FILES.PRP configuration file.
    Externally, a comma is used as a separator to divide the two parts
    (eg fileid,recordid) of the reference. Where the record id consists
    only of an internal reference, the file id is omitted and the
    separator should also be omitted. The whole record id may be null.
    If there is a file id, there must be a separator and record id.
    It is not an error for there to be a leading separator with no file
    id. Cross-file referencing is potentially problematic if applied as
    a general data feature. Its main use is envisaged as, for instance,
    keeping Archive and Global Evidence records in separate files from
    the main data. The two parts of the 64-bit field should be
    understood as numeric values. Therefore, a typical value in external
    representation might be '12,1234'.

- Minimum screen size

What minimum screen size should be accommodated? The answer depends on
what screen space is required to accommodate some relevant area of the
program display.

- Colour and Font Themes

The Colour and Font themes used in the Gendatam Suite are subject to
further development.
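
Illustration for 'End-of-line issues' (not part of the model). The
following is a minimal Python sketch of reading lines with mixed
line-end marking and writing them back with CRLF. It assumes that CRLF,
a lone CR and a lone LF each mark exactly one line end; the guidance
does not say how other combinations should be counted. The names
split_lines and join_lines are hypothetical.

```python
import re
from typing import List

# Assumption: CRLF, a lone CR and a lone LF each mark exactly one line end.
# CRLF is listed first so that it is never counted as two line ends.
_LINE_END = re.compile(r"\r\n|\r|\n")


def split_lines(text: str) -> List[str]:
    """Split text into lines, accepting mixed line-end styles in one file."""
    lines = _LINE_END.split(text)
    # A line end at the very end of the text leaves a final empty string;
    # drop it so that it is not mistaken for an extra blank line.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def join_lines(lines: List[str]) -> str:
    """Join lines for output using the recommended CRLF line end."""
    return "".join(line + "\r\n" for line in lines)


if __name__ == "__main__":
    mixed = "one\r\ntwo\nthree\rfour"
    print(split_lines(mixed))
    print(repr(join_lines(split_lines(mixed))))
```

When the text comes from a file, the file would need to be opened so
that the platform does not translate line ends before this code sees
them, for example with open(path, encoding="utf-8", newline="") in
Python.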
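
Illustration for 'Record Id Values' (not part of the model). The
following is a minimal Python sketch of the external 'fileid,recordid'
form and of the 64-bit form with the file id in the high-order 32 bits.
All function names are hypothetical. It assumes that each part is a
non-negative decimal number that fits in 31 bits, and that a separator
with nothing after it is an error; neither point is stated in this
document. How an omitted file id is stored in the 64-bit form is also
not stated here, so pack_ref requires an explicit file id.

```python
from typing import Optional, Tuple

SEPARATOR = ","
MAX_FIELD = 0x7FFFFFFF  # assumption: values are non-negative and fit in 31 bits
MASK32 = 0xFFFFFFFF


def _field(text: str) -> int:
    """Convert one part of a reference to a number, checking its range."""
    value = int(text)  # raises ValueError if the text is not numeric
    if not 0 <= value <= MAX_FIELD:
        raise ValueError("value out of range: " + text)
    return value


def parse_record_ref(text: str) -> Optional[Tuple[Optional[int], int]]:
    """Parse 'fileid,recordid', ',recordid' or 'recordid'.

    Returns None for a null reference, otherwise (file_id, record_id),
    where file_id is None for an internal reference.
    """
    text = text.strip()
    if text == "":
        return None  # the whole record id may be null
    file_part, _sep, record_part = text.rpartition(SEPARATOR)
    if record_part == "":
        raise ValueError("separator must be followed by a record id")
    # A leading separator with no file id is accepted as an internal reference.
    file_id = _field(file_part) if file_part != "" else None
    return file_id, _field(record_part)


def format_record_ref(file_id: Optional[int], record_id: int) -> str:
    """Write a reference externally, omitting the separator if internal."""
    if file_id is None:
        return str(record_id)
    return str(file_id) + SEPARATOR + str(record_id)


def pack_ref(file_id: int, record_id: int) -> int:
    """Combine the two 32-bit fields: file id high, record id low."""
    return ((file_id & MASK32) << 32) | (record_id & MASK32)


def unpack_ref(value: int) -> Tuple[int, int]:
    """Split a 64-bit reference into (file_id, record_id)."""
    return (value >> 32) & MASK32, value & MASK32


if __name__ == "__main__":
    print(parse_record_ref("12,1234"))
    print(hex(pack_ref(12, 1234)))
```

A program following the guidance above would look up the parsed file id
in its index of file ids and report a reference to a non-existent
record to the user rather than failing.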