AMOS file formats

From ExoticA
Revision as of 07:55, 2 August 2007 by Kyz

AMOS is the name given to a number of BASIC-like programming languages created for the Amiga by François Lionet, who is also known for the BASIC-like language STOS for the Atari ST and Klik & Play for the IBM PC. AMOS languages include "AMOS The Creator", "Easy AMOS" and "AMOS Professional".

This article attempts to cover all file formats defined by the AMOS software itself. File formats invented in third party software shall not be included. This is a first draft and many formats are not included.

All multi-byte integers are big-endian unless otherwise specified.

AMOS source code file format

AMOS source code is normally stored in a file with the extension ".AMOS". It begins with 16 bytes of ASCII text from the following list:

Text Tested? Saved from which AMOS?
"AMOS Pro101V\0\0\0\0" Yes AMOS Professional
"AMOS Basic V134 " Yes AMOS Pro, but AMOS 1.3 compatible
"AMOS Basic V1.3 " Yes AMOS The Creator v1.3
"AMOS Basic V1.00" Yes AMOS The Creator v1.0 - v1.2
"AMOS Pro101v\0\0\0\0" No AMOS Professional
"AMOS Basic v134 " No AMOS Pro, but AMOS 1.3 compatible
"AMOS Basic v1.3 " No AMOS The Creator v1.3
"AMOS Basic v1.00" No AMOS The Creator v1.0 - v1.2

As the table shows, an upper-case "V" indicates that the source code has been tested, and a lower-case "v" indicates that it has not. "Tested" means that the AMOS interpreter has performed a sanity test on all lines of code and found no syntax errors.

After the 16 byte header is a 4-byte 32-bit unsigned integer stating the number of bytes of tokenised BASIC code. This is immediately followed by the BASIC code itself, for the length given.
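The header check described above can be sketched in C as follows. This is only an illustration: the names `AMOS_BE32` and `amos_read_header` are made up for this example, and it relies on the observation that the "V"/"v" character sits at byte offset 11 in every header string listed in the table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reads a big-endian 32-bit unsigned integer. */
#define AMOS_BE32(p) (((unsigned long)(p)[0] << 24) | ((unsigned long)(p)[1] << 16) | ((unsigned long)(p)[2] << 8) | (unsigned long)(p)[3])

/* Returns 1 if the source is marked as tested (upper-case 'V'),
 * 0 if it is marked as untested (lower-case 'v'),
 * or -1 if the 16-byte header is not one of the known strings.
 * On success, *code_len receives the tokenised code length in bytes. */
static int amos_read_header(const unsigned char *buf, size_t len,
                            unsigned long *code_len) {
  static const char known[4][17] = {
    "AMOS Pro101V\0\0\0\0",
    "AMOS Basic V134 ",
    "AMOS Basic V1.3 ",
    "AMOS Basic V1.00"
  };
  unsigned char head[16];
  int i;

  if (len < 20) return -1;                 /* 16-byte header + 4-byte length */
  if (buf[11] != 'V' && buf[11] != 'v') return -1;

  memcpy(head, buf, 16);
  head[11] = 'V';                          /* compare ignoring the tested flag */
  for (i = 0; i < 4; i++) {
    if (memcmp(head, known[i], 16) == 0) {
      *code_len = AMOS_BE32(buf + 16);
      return buf[11] == 'V';
    }
  }
  return -1;
}
```

The tokenised code then starts at offset 20 and runs for `*code_len` bytes.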

Finally, the 4-byte ASCII identifier "AmBs" is given, followed by a 2-byte 16-bit unsigned integer with the number of memory banks to follow. This is followed by the banks themselves, individually sized. Each bank can either be a sprite bank, an icon bank or a regular memory bank. There is no more data in the source code file after this.

If a sprite bank is given, it always occupies bank 1 and there must not be another sprite bank or regular memory bank with a bank number of 1. If an icon bank is given, it always occupies bank 2 and there must not be another icon bank or regular memory bank with a bank number of 2.

Tokenised BASIC code format

The tokenised BASIC code is a stream of tokenised lines. Each tokenised line has the following format:

  • 1 byte: The length of this line in words (2 bytes), including this byte. To get the length of the line in bytes, double this value.
  • 1 byte: The indent level of this line. AMOS automatically indents lines to show program structure. If printing this line as ASCII text, you should print {indent level + 1} space characters at the beginning of the line, or no spaces if the value is less than 2.
  • many bytes: a sequence of tokens. Each token is at least two bytes, and all tokens are rounded to a multiple of two bytes. Each token is individually sized. The tokens always end with a compulsory null token.
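Because each line stores its own length, the tokenised code can be walked line by line without decoding any tokens. The following is a minimal sketch of that walk; the function name `amos_count_lines` is hypothetical, and the check for a minimum of 4 bytes assumes that every line holds at least the two header bytes plus the 2-byte null token.

```c
#include <stddef.h>

/* Counts the tokenised lines in a block of tokenised BASIC code.
 * Returns the number of lines, or -1 if a line length is implausible. */
static long amos_count_lines(const unsigned char *code, size_t len) {
  size_t pos = 0;
  long lines = 0;

  while (pos < len) {
    /* first byte: line length in 2-byte words, including the header */
    size_t line_bytes = (size_t)code[pos] * 2;

    /* a line needs the length byte, the indent byte and a null token */
    if (line_bytes < 4 || line_bytes > len - pos) return -1;

    /* code[pos + 1] is the indent level; tokens start at code[pos + 2] */
    pos += line_bytes;
    lines++;
  }
  return lines;
}
```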

AMOS considers each token as a signed 16-bit number. Token values between 0x0000 and 0x004E are specially printed and have differing sizes; all others are simply a signed offset into AMOS's internal token table. The text of the token in the internal token table is what should be printed. Some of these tokens have special size rules; all others are 2 bytes in size.

Specially printed tokens

Token Type Interpretation
0x0000 null token Marks the end of line. Always 2 bytes long.
0x0006 Variable reference e.g. Print XYZ
  • 2 bytes: token (0x0006, 0x000C, 0x0012 or 0x0018)
  • 2 bytes: unknown purpose
  • 1 byte: length of the ASCII string for the variable or label name
  • 1 byte: flags, for tokens 0x0006, 0x0012 and 0x0018:
    • bit 1 set: this is a floating point reference, e.g. "XYZ#"
    • bit 2 set: this is a string reference, e.g. "XYZ$"
  • many bytes: the ASCII string, with the above-given length.

The ASCII string is null terminated and its length is rounded up to a multiple of two.

0x000C Label e.g. XYZ: or 190 at the start of a line
0x0012 Procedure call reference e.g. XYZ["hello"]
0x0018 Label reference e.g. Goto XYZ
0x0026 String with double quotes e.g. "hello"
  • 2 bytes: token (0x0026 or 0x002E)
  • 2 bytes: the length of the string
  • many bytes: the ASCII string, with the above given length

The ASCII string is null terminated and its length is rounded up to a multiple of two.

0x002E String with single quotes e.g. 'hello'
0x001E Binary integer value e.g. %100101
  • 2 bytes: token (0x001E, 0x0036 or 0x003E)
  • 4 bytes: the integer value
0x0036 Hexadecimal integer value e.g. $80FAA010
0x003E Decimal integer value e.g. 1234567890
0x0046 Floating point value e.g. 3.14
  • 2 bytes: token (0x0046)
  • 4 bytes: the single-precision floating point value.
    • bits 31-8: mantissa (24 bits)
    • bit 7: sign bit. Positive if 0, negative if 1
    • bits 6-0: exponent

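An exponent of 0 means 0.0, regardless of mantissa. Otherwise, counting from MSB (23) to LSB (0), each bit set in the mantissa contributes 2^(mantissa_bit + exponent - 88) to the value, and the value is the sum of these contributions, negated if the sign bit is set.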
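Summing 2^(mantissa_bit + exponent - 88) over every set mantissa bit is the same as multiplying the 24-bit mantissa, taken as an integer, by 2^(exponent - 88). A small sketch of the conversion under that reading follows; the function name `amos_float_to_double` is an invention for this example, and the scaling is done with a plain loop to avoid depending on the maths library.

```c
/* Converts the 4-byte AMOS floating point value (already read as a
 * big-endian 32-bit integer) into a C double. */
static double amos_float_to_double(unsigned long raw) {
  unsigned long mantissa = (raw >> 8) & 0xFFFFFFUL; /* bits 31-8 */
  int negative = (int)((raw >> 7) & 1);             /* bit 7     */
  int exponent = (int)(raw & 0x7F);                 /* bits 6-0  */
  int shift;
  double value;

  if (exponent == 0) return 0.0;  /* zero, regardless of mantissa */

  /* value = mantissa * 2^(exponent - 88) */
  value = (double)mantissa;
  shift = exponent - 88;
  while (shift > 0) { value *= 2.0; shift--; }
  while (shift < 0) { value /= 2.0; shift++; }

  return negative ? -value : value;
}
```

For instance, a mantissa with only bit 23 set and an exponent of 65 gives 2^(23 + 65 - 88) = 2^0 = 1.0.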

0x004E Extension command
  • 2 bytes: token (0x004E)
  • 1 byte: extension number (1 to 26)
  • 1 byte: unused
  • 2 bytes: signed 16-bit offset into extension's token table

Specially sized tokens

Token Type Interpretation
0x064A Rem Print the remark string in addition to the remark token.
  • 2 bytes: token (0x064A or 0x0652)
  • 1 byte: unused
  • 1 byte: length of remark string
  • many bytes: the ASCII remark string, with the above-given length.

The ASCII string is null terminated and its length is rounded up to a multiple of two.

0x0652 Rem type 2
0x023C For
  • 2 bytes: token (0x023C, 0x0250, 0x0268, 0x027E, 0x02BE, 0x02D0 or 0x0404)
  • 2 bytes: unknown purpose
0x0250 Repeat
0x0268 While
0x027E Do
0x02BE If
0x02D0 Else
0x0404 Data
0x0290 Exit If
  • 2 bytes: token (0x0290, 0x029E or 0x0316)
  • 4 bytes: unknown purpose
0x029E Exit
0x0316 On
0x0376 Procedure
  • 2 bytes: token (0x0376)
  • 4 bytes: number of bytes to corresponding End Proc line
(start of line + 8 + above = start of End Proc line)
(start of line + 8 + 6 + above = line after End Proc line)
  • 2 bytes: part of seed for encryption
  • 1 byte: flags
    • bit 7: if set, procedure is folded
    • bit 6: if set, procedure is locked and shouldn't be unfolded
    • bit 5: if set, procedure is currently encrypted
    • bit 4: if set, procedure contains compiled code and not tokens
  • 1 byte: part of seed for encryption
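Taken together, the two tables above are enough to step from one token to the next. The sketch below is one possible reading of them, not a definitive implementation: the name `amos_token_size` is hypothetical, the "On" token is taken to be 0x0316 as in the token list, and string lengths are simply rounded up to an even number of bytes.

```c
#include <stddef.h>

/* Hypothetical helper: reads a big-endian 16-bit unsigned integer. */
#define AMOS_BE16(p) (((unsigned)(p)[0] << 8) | (unsigned)(p)[1])

/* Returns the size in bytes of the token starting at t. */
static size_t amos_token_size(const unsigned char *t) {
  size_t n;

  switch (AMOS_BE16(t)) {
  case 0x0006: case 0x000C: case 0x0012: case 0x0018:
    n = t[4];                   /* name length byte */
    return 6 + n + (n & 1);     /* token, unknown, length, flags, name */
  case 0x0026: case 0x002E:
    n = AMOS_BE16(t + 2);       /* string length word */
    return 4 + n + (n & 1);
  case 0x064A: case 0x0652:
    n = t[3];                   /* remark length byte */
    return 4 + n + (n & 1);
  case 0x001E: case 0x0036: case 0x003E:  /* integers */
  case 0x0046:                            /* floating point */
  case 0x004E:                            /* extension command */
    return 6;
  case 0x023C: case 0x0250: case 0x0268: case 0x027E:
  case 0x02BE: case 0x02D0: case 0x0404:
    return 4;                   /* For, Repeat, While, Do, If, Else, Data */
  case 0x0290: case 0x029E: case 0x0316:
    return 6;                   /* Exit If, Exit, On */
  case 0x0376:
    return 10;                  /* Procedure: token + 4 + 2 + 1 + 1 */
  default:
    return 2;                   /* null token and all ordinary tokens */
  }
}
```

A line printer would call this repeatedly, starting two bytes into the line, until it reaches the null token.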

Procedure decryption source code

If you find a procedure (0x0376) token with the "is encrypted" bit set, the following C function will decrypt the contents of the procedure in place. Pass it a pointer to the start of the tokenised line that contains the procedure token.

#include <stdint.h>

/* fetches a 4-byte integer in big-endian format */
#define EndGetM32(a)  ((((uint32_t)(a)[0])<<24)|(((uint32_t)(a)[1])<<16)|(((uint32_t)(a)[2])<<8)|((uint32_t)(a)[3]))
/* fetches a 2-byte integer in big-endian format */
#define EndGetM16(a)  ((((a)[0])<<8)|((a)[1]))

void decrypt_procedure(unsigned char *src) {
  unsigned char *line, *next, *endline;
  /* the keys must be exactly 32 bits wide for the rotate below to work */
  uint32_t key, key2, key3, size;

  /* ensure src is a pointer to a line with the PROCEDURE token on it */
  if (EndGetM16(&src[2]) != 0x0376) return;

  /* do not operate on compiled procedures */
  if (src[10] & 0x10) return;

  /* size+8+6 is the start of the line after ENDPROC */
  size = EndGetM32(&src[4]);
  endline = &src[size+8+6];
  line = next = &src[src[0] * 2];

  /* initialise encryption keys */
  key = (size << 8) | src[11];
  key2 = 1;
  key3 = EndGetM16(&src[8]);

  while (line < endline) {
    line = next;
    next = &line[line[0] * 2];

    /* decrypt one line */
    for (line += 4; line < next;) {
      *line++ ^= (key >> 8) & 0xFF;
      *line++ ^=  key       & 0xFF;
      key  += key2;
      key2 += key3;
      key = (key >> 1) | (key << 31);
    }
  }
  src[10] ^= 0x20; /* toggle "is encrypted" bit */
}

AMOS Sprite and Icon bank formats

A sprite bank and an icon bank share very similar attributes. They define graphic data which can be drawn onscreen.

  • 4 bytes: "AmSp" for sprites (bank 1) or "AmIc" for icons (bank 2)
  • 2 bytes: the number of sprites or icons to follow
  • many bytes: the above-counted sprites or icons
  • 64 bytes: a 32-entry colour palette. Each entry has the Amiga COLORx hardware register format, which is 0x0RGB, where R, G and B represent red, green and blue colour components and are between 0x0 (minimum) and 0xF (maximum).
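Each 0x0RGB palette entry holds 4 bits per component. A common way to expand a 4-bit component to the 8-bit range is to multiply it by 17, so that 0x0 maps to 0 and 0xF maps to 255. The sketch below shows that conversion; the function name `amos_colour_to_rgb` is made up for this example, and the multiply-by-17 scaling is a choice of this sketch rather than anything AMOS defines.

```c
/* Expands one 0x0RGB palette entry into 8-bit red, green and blue. */
static void amos_colour_to_rgb(unsigned int colour, unsigned char rgb[3]) {
  rgb[0] = (unsigned char)(((colour >> 8) & 0xF) * 17); /* red   */
  rgb[1] = (unsigned char)(((colour >> 4) & 0xF) * 17); /* green */
  rgb[2] = (unsigned char)((colour & 0xF) * 17);        /* blue  */
}
```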

Each sprite or icon has this format:

  • 2 bytes: width, in 16-bit words
  • 2 bytes: height, in raster lines
  • 2 bytes: depth, in bitplanes (1 to 5)
  • 2 bytes: hot-spot X co-ordinate
  • 2 bytes: hot-spot Y co-ordinate
  • many bytes: width*height*depth*2 bytes of planar graphic data
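Since sprites and icons are stored back to back, a reader has to compute the size of each entry to find the next one. A minimal sketch, using the hypothetical name `amos_sprite_size`, that returns the full size of one entry, including its 10-byte header:

```c
/* Hypothetical helper: reads a big-endian 16-bit unsigned integer. */
#define AMOS_BE16(p) (((unsigned)(p)[0] << 8) | (unsigned)(p)[1])

/* Returns the total size in bytes of the sprite or icon starting at s:
 * five 2-byte header fields plus width*height*depth*2 bytes of data. */
static unsigned long amos_sprite_size(const unsigned char *s) {
  unsigned long width  = AMOS_BE16(s);      /* in 16-bit words  */
  unsigned long height = AMOS_BE16(s + 2);  /* in raster lines  */
  unsigned long depth  = AMOS_BE16(s + 4);  /* in bitplanes     */

  return 10 + width * height * depth * 2;
}
```

For example, a sprite that is 1 word (16 pixels) wide, 16 lines high and 2 bitplanes deep has 1 * 16 * 2 * 2 = 64 bytes of graphic data, so the whole entry is 74 bytes.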

AMOS Memory Bank formats

An AMOS Memory bank is simply a named block of data. AMOS allows for 15 such banks in an AMOS program, and they can also be loaded and saved at runtime using the "Load" and "Save" commands. Each bank has a standard 20-byte header. Note that the "length" field in this header counts the 8-byte "name" field as part of the bank data rather than as part of the header. Each bank can be located in "chip" memory, which is accessible to the Amiga's custom graphics and sound processors, or it can be located in "fast" memory, which is only accessible to the CPU. The header format is as follows:

  • 4 bytes: the ASCII identifier "AmBk"
  • 2 bytes: the bank number (1-15)
  • 2 bytes: 0 for chip memory bank, 1 for fast memory bank
  • 4 bytes: bank length, but only bits 27 to 0. Bits 28 and 29 are undefined, not part of the length field. Bit 30 means "try chip memory", bit 31 means "try fast memory" if set.
  • 8 bytes: the bank name. It is always an unterminated ASCII string which is padded with spaces.

The header is followed by the bank data, which is {bank length - 8} bytes long.
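The header layout above can be read as in the following sketch. The struct and function names (`amos_bank_header`, `amos_read_bank_header`) are hypothetical; the code masks the length field down to bits 27 to 0 and subtracts the 8 bytes of the name, as described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers: read big-endian unsigned integers. */
#define AMOS_BE16(p) (((unsigned)(p)[0] << 8) | (unsigned)(p)[1])
#define AMOS_BE32(p) (((unsigned long)(p)[0] << 24) | ((unsigned long)(p)[1] << 16) | ((unsigned long)(p)[2] << 8) | (unsigned long)(p)[3])

struct amos_bank_header {
  unsigned int number;     /* bank number, 1-15 */
  int fast;                /* 0 = chip memory, 1 = fast memory */
  unsigned long data_len;  /* bytes of bank data after the header */
  char name[9];            /* 8-byte name plus a terminator we add */
};

/* Parses the 20-byte "AmBk" header at p. Returns 0 on success, -1 on error. */
static int amos_read_bank_header(const unsigned char *p, size_t len,
                                 struct amos_bank_header *out) {
  unsigned long length;

  if (len < 20 || memcmp(p, "AmBk", 4) != 0) return -1;

  out->number = AMOS_BE16(p + 4);
  out->fast = AMOS_BE16(p + 6) != 0;

  length = AMOS_BE32(p + 8) & 0x0FFFFFFFUL; /* keep bits 27 to 0 only */
  if (length < 8) return -1;                /* must cover the name field */
  out->data_len = length - 8;

  memcpy(out->name, p + 12, 8);             /* space padded, unterminated */
  out->name[8] = '\0';
  return 0;
}
```

The bank data then occupies `data_len` bytes starting at offset 20.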

AMOS Music Bank format

This bank has the name "Music" and is created with various conversion utilities shipped with AMOS. It is played back with the Music extension. See AMOS Music Bank format for more details.

AMOS AMAL Bank format

This bank has the name "Amal". It contains instructions in AMOS Animation Language format. This will be described in more detail in a later draft of this document.

AMOS Menu Bank format

This bank has the name "Menu". It contains pull-down menu definitions. This will be described in more detail in a later draft of this document.

AMOS Data Bank format

This bank has the name "Datas". It is created in AMOS using the "Reserve As Data" command, and has no specific format.

AMOS Work Bank format

This bank has the name "Work". It is created in AMOS using the "Reserve As Work" command, and has no specific format. As a Work bank, it is not saved as part of the source code, unlike normal data banks.

AMOS Asm Bank format

This bank has the name "Asm". It contains Amiga machine code that was loaded into a bank using the "Pload" command and has no specific format (other than containing MC680x0 binary code).

AMOS Picture Bank format

This bank has the name "Pac.Pic." and is created with the Compact extension's "Pack" command. See AMOS Pac.Pic. format for more details.

AMOS Samples Bank format

This bank has the name "Samples" and is created with the Sample Bank Editor shipped with AMOS. The samples can be played back with the Music extension. The format of the bank is as follows:

  • 20 bytes: the regular bank header, as described above
  • 2 bytes: the number of samples in this bank
  • many bytes: A list of offsets, each 4 bytes long, to each sample. The offset is relative to the location of the "number of samples" field above.

The format of each sample is as follows:

  • 8 bytes: the name for the sample, in ASCII.
  • 2 bytes: the frequency of the sample in hertz.
  • 4 bytes: the length of the sample in bytes.
  • many bytes: the sample data itself, a stream of two's complement signed 8-bit PCM samples
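Locating a sample means following its offset from the "number of samples" field. The sketch below assumes that `body` points at that field (that is, just past the 20-byte bank header) and that samples are indexed from 0; the names `amos_sample` and `amos_get_sample` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers: read big-endian unsigned integers. */
#define AMOS_BE16(p) (((unsigned)(p)[0] << 8) | (unsigned)(p)[1])
#define AMOS_BE32(p) (((unsigned long)(p)[0] << 24) | ((unsigned long)(p)[1] << 16) | ((unsigned long)(p)[2] << 8) | (unsigned long)(p)[3])

struct amos_sample {
  char name[9];               /* 8-byte name plus a terminator we add */
  unsigned int hz;            /* playback frequency in hertz */
  unsigned long length;       /* sample length in bytes */
  const unsigned char *data;  /* signed 8-bit PCM data */
};

/* Looks up sample number index (0-based). Returns 0 on success, -1 on error. */
static int amos_get_sample(const unsigned char *body, size_t body_len,
                           unsigned int index, struct amos_sample *out) {
  unsigned long count, offset;
  const unsigned char *s;

  if (body_len < 2) return -1;
  count = AMOS_BE16(body);
  if (index >= count || body_len < 2 + 4 * count) return -1;

  /* offsets are relative to the "number of samples" field itself */
  offset = AMOS_BE32(body + 2 + 4 * (unsigned long)index);
  if (offset > body_len || body_len - offset < 14) return -1;

  s = body + offset;
  memcpy(out->name, s, 8);
  out->name[8] = '\0';
  out->hz = AMOS_BE16(s + 8);
  out->length = AMOS_BE32(s + 10);
  if (out->length > body_len - offset - 14) return -1;
  out->data = s + 14;
  return 0;
}
```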