Difference between revisions of "GNO files"

From Creatures Wiki
(Rewrite file description closer to original format, not BNF)
(One intermediate revision by the same user not shown)
Line 7: Line 7:
 
This information applies to c2e (C3/DS) only. See the external links below for C1/C2.
 
This information applies to c2e (C3/DS) only. See the external links below for C1/C2.
  
All multi-byte values in the file are in [http://en.wikipedia.org/wiki/Little_endian#Little-endian little endian] notation.
+
All multi-byte values in the file are in [http://en.wikipedia.org/wiki/Little_endian#Little-endian little endian].
  
A [http://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form BNF] like description of the file format follows.
+
A description of the file format follows:
  
 
<pre>
 
<pre>
<Caption-File>    ::= <SVRule-Section> <Caption-Section> <Encoding-Hints>
+
GNO file:
+
  SHORT Number of bytes to be read next = 2
<SVRule-Section>  ::= <word-size> <count> count*<SV-Note> <zero-padding>
+
  SHORT Number of SV notes
+
  SVNoteStruct[Number of SV notes]
<Caption-Section> ::= <word-size> <count> count*<Caption-Note> <zero-padding>
+
  ZERO-PADDING
+
  SHORT Number of bytes to be read next = 2
<word-size> ::= 2-byte-integer
+
  SHORT Number of gene notes
<string>    ::= <count> count*<ASCII character>
+
  GeneNoteStruct[Number of gene notes]
<count>    ::= word-sized-integer
+
  ZERO-PADDING
+
  EncodingHint
 
<SV-Note> ::= <gene-type>,
 
              <gene-sub-type>,
 
              <unique-id>,
 
              <rule-number>
 
              <unknown>,
 
              <Annotation-List>,
 
              <unknown>,
 
              <general-note>
 
  
<Caption-Note> := <gene-type>,
+
SVNoteStruct:
                  <gene-sub-type>,
+
  SHORT Gene type
                  <unique-id>,
+
  SHORT Gene Sub-type
                  <unknown>
+
   SHORT Unique ID
                  <caption>,
+
  SHORT Rule Number
                  <rich-text-annotation>
+
  SHORT unknown always zero
+
  STRING Annotations[16]
+
   SHORT unknown always zero
<gene-type>      ::= word-sized-integer
+
   STRING General Notes
<gene-sub-type>   ::= word-sized-integer
 
<unique-id>      ::= word-sized-integer
 
<rule-number>    ::= word-sized-integer
 
 
<unknown>        ::= word-sized-integer | <string>
 
 
<Annotation-List> ::= 16*<string>
 
<general-note>    ::= <string>
 
    
 
<caption>        ::= <string>
 
    
 
<rich-text-annotation> ::= <string>
 
  
<zero-padding> ::= { '0x00' }
+
GeneNoteStruct:
 +
  SHORT Gene type
 +
  SHORT Gene Sub-type
 +
  SHORT Unique ID
 +
  SHORT unknown always zero
 +
  STRING Caption
 +
  STRING Rich Text Annotation
  
 +
STRING:
 +
  SHORT Number of bytes in string
 +
  BYTE[Number of bytes in string] Text string
  
<Encoding-Hints> ::= '0xDC050000'=1500,
+
EncodingHint:
                      '0x3F420F00'=999999,
+
  '0xDC050000'=1500
                      '0x0D00'=13
+
  '0x3F420F00'=999999
                      '0x0200'=2
+
  '0x0D00'=13
                      '0x0E00'=14
+
  '0x0200'=2
 +
  '0x0E00'=14
 
</pre>
 
</pre>
  
The <code>word-size</code> values are two&nbsp;(0x0200) in all observed cases.
+
The <code>gene type</code> and <code>gene sub type</code> are the same as in [[GEN files|genome files]].
  
The <code>gene-type</code> and <code>gene-sub-type</code> are the same as in [[GEN files|genome files]].
+
The <code>unique id</code> is value from 0 to 255, it is shown under the "G-ID" column in the [[Genetics Kit]].
 
 
The <code>unique-id</code> is value from 0 to 255, it is shown under the "G-ID" column in the [[Genetics Kit]].
 
 
It is the combination of gene type, sub-type, and this ID that uniquely identifies a caption and attaches the caption to a specific gene in the genome.
 
It is the combination of gene type, sub-type, and this ID that uniquely identifies a caption and attaches the caption to a specific gene in the genome.
  
The <code>rule-number</code> is either 0 or 1.
+
The <code>rule number</code> is either 0 or 1.
 
Zero is the initialisation rule and one is the update rule.
 
Zero is the initialisation rule and one is the update rule.
  
The sixteen annotations in an <code>SV-Note</code> are for each of the sixteen possible instructions in an [[SVRules|SVRule]].
+
The sixteen annotations in an <code>SVNoteStruct</code> are for each of the sixteen possible instructions in an [[SVRules|SVRule]].
  
The <code>Caption-Note</code>'s <code>caption</code> is shown under the "Description" column of the [[Genetics Kit]] and names the gene.
+
The <code>GeneNoteStruct</code>'s <code>caption</code> is shown under the "Description" column of the [[Genetics Kit]] and names the gene.
  
The <code>rich-text-annotation</code> is shown with the "Notes" button on the gene editing dialog of the [[Genetics Kit]].
+
The <code>rich text annotation</code> is shown with the "Notes" button on the gene editing dialog of the [[Genetics Kit]].
 
It is in Microsoft's [http://en.wikipedia.org/wiki/Rich_Text_Format RTF] format.
 
It is in Microsoft's [http://en.wikipedia.org/wiki/Rich_Text_Format RTF] format.
 
The original caption files provided with [[Creatures 3]] only contain a single
 
The original caption files provided with [[Creatures 3]] only contain a single
<code>rich-text-annotation</code>
+
<code>rich text annotation</code>
 
(for the chemical receptor gene captioned as "091 Belladonna poisoning - Receptor").
 
(for the chemical receptor gene captioned as "091 Belladonna poisoning - Receptor").
  
Line 88: Line 75:
 
When writing a program to parse caption files care should be taken if these bytes are non-zero
 
When writing a program to parse caption files care should be taken if these bytes are non-zero
 
since if the value is meant to be interpreted as a string then the size of the enclosing
 
since if the value is meant to be interpreted as a string then the size of the enclosing
<code>SV-Note</code> or <code>Caption-Note</code> will be different then expected.
+
<code>SVNoteStruct</code> or <code>GeneNoteStruct</code> will be different then expected.
  
 
===Padding and "encoding" used by [[Genetics Kit]]===
 
===Padding and "encoding" used by [[Genetics Kit]]===
  
The <code>zero-padding</code> appears to be the correct number of zeros such that if they were interpreted as
+
The <code>zero padding</code> appears to be the correct number of zeros such that if they were interpreted as
<code>SV-Note</code> or <code>Caption-Note</code> records
+
<code>SVNoteStruct</code> or <code>GeneNoteStruct</code> records
 
(with zero-length strings)
 
(with zero-length strings)
 
then there would be a total of 3001&nbsp;records (of each type).
 
then there would be a total of 3001&nbsp;records (of each type).
 
It is believed that this number of records is related to the values in the
 
It is believed that this number of records is related to the values in the
<code>Encoding-hints</code> bytes at the end of the file.
+
<code>Encoding hints</code> bytes at the end of the file.
  
 
Reverse engineering of the [[Genetics Kit]] indicates that when opening a caption file it reads the last ten&nbsp;(10) bytes of the file first.
 
Reverse engineering of the [[Genetics Kit]] indicates that when opening a caption file it reads the last ten&nbsp;(10) bytes of the file first.
Line 125: Line 112:
  
 
==External links==
 
==External links==
*[http://www.gamewareeurope.com/GWDev/cdn/cdn_more.php?CDN_article_id=10 Creature Labs' C3/DS GNO Technical Information]
+
*[https://web.archive.org/web/20170814231459/http://www.gamewareeurope.com/GWDev/cdn/cdn_more.php?CDN_article_id=10 Creature Labs' C3/DS GNO Technical Information]
 
*[http://double.co.nz/creatures/creatures2/gnoformat.htm C1/C2 GNO file format]
 
*[http://double.co.nz/creatures/creatures2/gnoformat.htm C1/C2 GNO file format]
 
[[Category:File formats]]
 
[[Category:File formats]]
 
[[Category:Genetics]]
 
[[Category:Genetics]]

Revision as of 23:43, 25 June 2020

.gno files contain optional notes to accompany GEN files for new breeds.

The C3/DS information on the Gameware site (linked to below) is incomplete and inaccurate. In particular, the actual files contain a couple of extra zero bytes in a few places, large amounts of zero padding, and some additional undocumented values at the end of the file.

File Format

This information applies to c2e (C3/DS) only. See the external links below for C1/C2.

All multi-byte values in the file are in little endian.

A description of the file format follows:

GNO file:
  SHORT Number of bytes to be read next = 2
  SHORT Number of SV notes
  SVNoteStruct[Number of SV notes]
  ZERO-PADDING
  SHORT Number of bytes to be read next = 2
  SHORT Number of gene notes
  GeneNoteStruct[Number of gene notes]
  ZERO-PADDING
  EncodingHint

SVNoteStruct:
  SHORT Gene type
  SHORT Gene Sub-type
  SHORT Unique ID
  SHORT Rule Number
  SHORT unknown always zero
  STRING Annotations[16]
  SHORT unknown always zero
  STRING General Notes

GeneNoteStruct:
  SHORT Gene type
  SHORT Gene Sub-type
  SHORT Unique ID
  SHORT unknown always zero
  STRING Caption
  STRING Rich Text Annotation

STRING:
  SHORT Number of bytes in string
  BYTE[Number of bytes in string] Text string

EncodingHint:
  '0xDC050000'=1500
  '0x3F420F00'=999999
  '0x0D00'=13
  '0x0200'=2
  '0x0E00'=14
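
The record layouts above can be read with a few small helpers. The following Python is a minimal sketch, not a reference implementation: the names `read_string`, `read_sv_note`, `read_gene_note`, `SVNote` and `GeneNote` are hypothetical, SHORT is assumed to be an unsigned 16-bit value, strings are kept as raw bytes, and the two "unknown" fields are skipped without being validated.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple


def read_string(data: bytes, pos: int) -> Tuple[bytes, int]:
    """Read a STRING (SHORT byte count, then the bytes); return (text, next offset)."""
    (length,) = struct.unpack_from("<H", data, pos)
    start, end = pos + 2, pos + 2 + length
    if end > len(data):
        raise ValueError(f"string at offset {pos} runs past the end of the data")
    return data[start:end], end


@dataclass
class SVNote:
    gene_type: int
    gene_subtype: int
    unique_id: int
    rule_number: int
    annotations: List[bytes]
    general_notes: bytes


@dataclass
class GeneNote:
    gene_type: int
    gene_subtype: int
    unique_id: int
    caption: bytes
    rich_text: bytes


def read_sv_note(data: bytes, pos: int) -> Tuple[SVNote, int]:
    """Read one SVNoteStruct starting at pos; return (note, next offset)."""
    # Five little-endian SHORTs; the fifth is the first "unknown" field.
    gene_type, gene_subtype, unique_id, rule_number, _unknown = struct.unpack_from(
        "<5H", data, pos
    )
    pos += 10
    annotations = []
    for _ in range(16):
        text, pos = read_string(data, pos)
        annotations.append(text)
    pos += 2  # second "unknown" SHORT, assumed here to be a plain integer
    general_notes, pos = read_string(data, pos)
    note = SVNote(gene_type, gene_subtype, unique_id, rule_number,
                  annotations, general_notes)
    return note, pos


def read_gene_note(data: bytes, pos: int) -> Tuple[GeneNote, int]:
    """Read one GeneNoteStruct starting at pos; return (note, next offset)."""
    # Four little-endian SHORTs; the fourth is the "unknown" field.
    gene_type, gene_subtype, unique_id, _unknown = struct.unpack_from("<4H", data, pos)
    pos += 8
    caption, pos = read_string(data, pos)
    rich_text, pos = read_string(data, pos)
    return GeneNote(gene_type, gene_subtype, unique_id, caption, rich_text), pos
```

Each reader returns the next offset so that a caller can loop over the record count from the section header without tracking sizes itself.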

The gene type and gene sub type are the same as in genome files.

The unique id is a value from 0 to 255; it is shown under the "G-ID" column in the Genetics Kit. It is the combination of gene type, sub-type, and this ID that uniquely identifies a caption and attaches the caption to a specific gene in the genome.
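
Because the three values together form the key, a lookup table keyed on the triple is a natural way to attach captions to genes. This is a sketch only: `GeneNote`, `build_caption_index` and `caption_for` are hypothetical names, and the numbers in the usage note below are made up for illustration.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple

# (gene type, gene sub-type, unique id)
GeneKey = Tuple[int, int, int]


class GeneNote(NamedTuple):
    gene_type: int
    gene_subtype: int
    unique_id: int
    caption: str


def build_caption_index(notes: Iterable[GeneNote]) -> Dict[GeneKey, str]:
    """Map each (type, sub-type, id) triple to its caption."""
    index: Dict[GeneKey, str] = {}
    for note in notes:
        index[(note.gene_type, note.gene_subtype, note.unique_id)] = note.caption
    return index


def caption_for(index: Dict[GeneKey, str], gene_type: int,
                gene_subtype: int, unique_id: int) -> Optional[str]:
    """Return the caption for a gene, or None if the file has no note for it."""
    return index.get((gene_type, gene_subtype, unique_id))
```

For example, after `index = build_caption_index([GeneNote(1, 0, 7, "example caption")])`, the call `caption_for(index, 1, 0, 7)` returns the caption, while a gene with the same type and ID but a different sub-type returns `None`.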

The rule number is either 0 or 1. Zero is the initialisation rule and one is the update rule.

The sixteen annotations in an SVNoteStruct are for each of the sixteen possible instructions in an SVRule.

The GeneNoteStruct's caption is shown under the "Description" column of the Genetics Kit and names the gene.

The rich text annotation is shown with the "Notes" button on the gene editing dialog of the Genetics Kit. It is in Microsoft's RTF format. The original caption files provided with Creatures 3 only contain a single rich text annotation (for the chemical receptor gene captioned as "091 Belladonna poisoning - Receptor").

The unknown fields are zero (0x0000) in all observed cases. It is unclear whether this is a reserved integer value or a reserved string value (of length zero). When writing a program to parse caption files, care should be taken if these bytes are non-zero: if the value is meant to be interpreted as a string, then the size of the enclosing SVNoteStruct or GeneNoteStruct will be different than expected.
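
One conservative way to handle this in a parser is to refuse to continue when an unknown field is non-zero, rather than guess whether it is an integer or a string length. The sketch below assumes that policy; `read_reserved_short` and `UnexpectedUnknownField` are hypothetical names.

```python
import struct


class UnexpectedUnknownField(ValueError):
    """Raised when a field that is zero in all observed files is non-zero."""


def read_reserved_short(data: bytes, pos: int) -> int:
    """Consume one "unknown" SHORT and return the offset just after it.

    A non-zero value might be an integer or the length of a string, and the
    two readings give different record sizes, so this sketch stops instead
    of picking one.
    """
    (value,) = struct.unpack_from("<H", data, pos)
    if value != 0:
        raise UnexpectedUnknownField(
            f"unknown field at offset {pos} is {value:#06x}, expected 0x0000"
        )
    return pos + 2
```

A more permissive tool could log the offset and try both interpretations, but that goes beyond what has been observed in real files.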

Padding and "encoding" used by Genetics Kit

The zero padding appears to be sized so that, if it were interpreted as SVNoteStruct or GeneNoteStruct records (with zero-length strings), there would be a total of 3001 records of each type. It is believed that this number of records is related to the values in the EncodingHint bytes at the end of the file.
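
Under that reading, the amount of padding follows from the size of an all-zero record. The arithmetic below is a sketch derived only from the struct layouts given earlier (each empty STRING is a single zero SHORT); the constant and function names are hypothetical, and the result has not been checked against every known file.

```python
RECORD_SLOTS = 3001

# SVNoteStruct with empty strings: 5 SHORTs + 16 empty STRINGs
# + 1 unknown SHORT + 1 empty STRING.
EMPTY_SV_NOTE_SIZE = 5 * 2 + 16 * 2 + 2 + 2

# GeneNoteStruct with empty strings: 4 SHORTs + 2 empty STRINGs.
EMPTY_GENE_NOTE_SIZE = 4 * 2 + 2 + 2


def padding_bytes(record_count: int, empty_record_size: int) -> int:
    """Zero bytes needed so that real plus empty records total RECORD_SLOTS."""
    if not 0 <= record_count <= RECORD_SLOTS:
        raise ValueError(f"record count {record_count} does not fit in {RECORD_SLOTS} slots")
    return (RECORD_SLOTS - record_count) * empty_record_size
```

An empty SVNoteStruct comes to 46 bytes and an empty GeneNoteStruct to 12 bytes, so a section holding a single SV note would be followed by 3000 × 46 = 138000 zero bytes.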

Reverse engineering of the Genetics Kit indicates that when opening a caption file it reads the last ten (10) bytes of the file first. It checks for the exact byte sequence of the integer value "999999" (as a magic number) and does some verification of the other values. The "13" and "14" values are thought to be some kind of record size indication; the Genetics Kit verifies that the two values differ by one. The "2" value is thought to be the word size or file format version number.

If the values of the last ten bytes check out, the Genetics Kit reads the four bytes immediately preceding them. This value is thought to be the array size and to relate to there being enough zero padding for 3001 records; the Genetics Kit compares the value it reads to exactly "1500". At this point the Genetics Kit repositions to the beginning of the file and either uses VBA routines to read arrays (if the EncodingHint values match its expectations) or falls back to reading the file record by record.
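
The checks described in the two paragraphs above can be approximated as follows. This is a sketch of the described behaviour, not the Genetics Kit's actual code: `has_expected_trailer` is a hypothetical name, and only the conditions named in the text are tested (the magic number, the two values differing by one, and the preceding value being 1500). Any further verification the Genetics Kit performs is unknown and is not modelled.

```python
import struct

MAGIC = 999999
EXPECTED_ARRAY_SIZE = 1500


def has_expected_trailer(data: bytes) -> bool:
    """Return True if the file ends with the EncodingHint values described above."""
    if len(data) < 14:
        return False
    # Last ten bytes: 4-byte magic number, then three 2-byte values.
    magic, size_a, _word, size_b = struct.unpack_from("<IHHH", data, len(data) - 10)
    if magic != MAGIC:
        return False
    if abs(size_b - size_a) != 1:
        return False
    # Four bytes immediately before the last ten.
    (array_size,) = struct.unpack_from("<I", data, len(data) - 14)
    return array_size == EXPECTED_ARRAY_SIZE
```

A reader could use the result to choose between a fixed-layout fast path and a record-by-record fallback, mirroring the two paths described for the Genetics Kit.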

Since all known caption files contain the padding, the record-by-record parsing method in the Genetics Kit is normally not used. It is therefore uncertain whether it is as robust or error-free as the normal routines used to parse the file. Software that writes caption files should take this into account when deciding whether to emulate the padding method and undocumented bytes.
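
A writer that chooses to emulate the padding and the trailing bytes might look like the sketch below. It is built from the layout on this page under stated assumptions: padding is sized as 3001 record slots minus the real records, multiplied by the size of an all-zero record (46 bytes for SVNoteStruct, 12 for GeneNoteStruct), and the EncodingHint bytes are written verbatim. All function names are hypothetical, and the output has not been verified against the Genetics Kit.

```python
import struct
from typing import Sequence, Tuple

RECORD_SLOTS = 3001
EMPTY_SV_NOTE_SIZE = 46     # 5 SHORTs + 16 empty STRINGs + SHORT + empty STRING
EMPTY_GENE_NOTE_SIZE = 12   # 4 SHORTs + 2 empty STRINGs
TRAILER = struct.pack("<IIHHH", 1500, 999999, 13, 2, 14)

SVNoteTuple = Tuple[int, int, int, int, Sequence[bytes], bytes]
GeneNoteTuple = Tuple[int, int, int, bytes, bytes]


def pack_string(text: bytes) -> bytes:
    """STRING: SHORT byte count followed by the bytes."""
    return struct.pack("<H", len(text)) + text


def pack_sv_note(gene_type, gene_subtype, unique_id, rule_number,
                 annotations, general_notes) -> bytes:
    if len(annotations) != 16:
        raise ValueError("an SV note needs exactly 16 annotations")
    out = struct.pack("<5H", gene_type, gene_subtype, unique_id, rule_number, 0)
    out += b"".join(pack_string(a) for a in annotations)
    out += struct.pack("<H", 0) + pack_string(general_notes)
    return out


def pack_gene_note(gene_type, gene_subtype, unique_id, caption, rich_text) -> bytes:
    out = struct.pack("<4H", gene_type, gene_subtype, unique_id, 0)
    return out + pack_string(caption) + pack_string(rich_text)


def build_gno(sv_notes: Sequence[SVNoteTuple],
              gene_notes: Sequence[GeneNoteTuple]) -> bytes:
    """Assemble a caption file with emulated zero padding and trailer."""
    out = bytearray()
    sections = (
        (sv_notes, pack_sv_note, EMPTY_SV_NOTE_SIZE),
        (gene_notes, pack_gene_note, EMPTY_GENE_NOTE_SIZE),
    )
    for records, packer, empty_size in sections:
        if len(records) > RECORD_SLOTS:
            raise ValueError("too many records for the emulated padding scheme")
        # Section header: "number of bytes to be read next" (2), then the count.
        out += struct.pack("<HH", 2, len(records))
        for record in records:
            out += packer(*record)
        out += bytes((RECORD_SLOTS - len(records)) * empty_size)
    out += TRAILER
    return bytes(out)
```

Calling `build_gno([], [(1, 0, 7, b"example caption", b"")])` produces a file with no SV notes and one gene note; the gene type, sub-type and ID in that call are placeholder values.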

Related links

External links