AMOS file formats

From ExoticA
Revision as of 00:39, 8 September 2007 by Kyz

AMOS is the name given to a number of BASIC-like programming languages created for the Amiga by François Lionet, who is also known for the BASIC-like language STOS for the Atari ST and Klik & Play for the IBM PC. AMOS languages include "AMOS The Creator", "Easy AMOS" and "AMOS Professional".

AMOS comes with its own integrated development environment. As this is a self-contained system, AMOS uses its own invented file formats for everything from source code to graphics and sound.

This article attempts to cover all file formats defined by the AMOS software itself. File formats invented in third party software shall not be included. All multi-byte integers are big-endian (the Amiga's native format) unless otherwise specified. For example, the value 0x12345678 is stored as the bytes 0x12 (at the lowest memory location), then 0x34, then 0x56, finally 0x78 in the highest memory location.
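As a minimal sketch of the byte order described above, the following C helpers read big-endian values from a byte buffer. The names read_be16 and read_be32 are illustrative and not part of AMOS.

```c
#include <stdint.h>

/* Read a big-endian 16-bit value: the first byte is the most significant. */
static uint16_t read_be16(const unsigned char *p) {
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Read a big-endian 32-bit value: the first byte is the most significant. */
static uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Given the bytes 0x12, 0x34, 0x56, 0x78 in that order, read_be32 yields 0x12345678 on any host, whatever its native byte order.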

Introduction to AMOS

AMOS is an interpreted, BASIC-like language. You work in AMOS's built-in source code editor. Your code is stored in memory as tokens; each BASIC keyword is represented by a 2-byte number. You edit each line in your program as text, but once you finish editing a line and move on to the next, the line is (re-)converted to tokens.

Just because your code can be tokenised doesn't mean that it works. Whenever you run a program, AMOS first "tests" it, checking for syntax errors and any other errors that can be detected before the program runs. If an error is found, AMOS jumps to the offending line and shows you the error, rather than running the program. You can also run this check at any time, without running the program, by pressing the "Test" button. You can save your source code to disk regardless of whether it has passed the test. It is saved as a collection of tokens. So that other programs, particularly the AMOS Compiler, can avoid repeating the same syntax check, AMOS records in the saved file whether the source was tested or not.

Extensions

Besides the core language, AMOS supports "extensions". These are written in assembly code, not AMOS, and they add more commands to the AMOS language. Each extension is intended to be loaded into a specific "slot": AMOS has 26 "slots" for extensions. Some extensions can work in any slot, but you should stick to the recommended slot, because when you use extension instructions, the slot number is saved into your source code. If you move the extension to another slot, your source code stops working. If you look at the source code in AMOS without the extension loaded, your code just shows "Extension M" or "Extension L" or another letter of the alphabet where your extension-specific command once was. When you try to test or run your program, AMOS reports "Extension Not Loaded".

Sometimes two extensions want to use the same slot. They cannot share it, so you can only load one or the other. Your choice of extensions, along with all other global settings, is stored in AMOS's config file. This is "AMOS1_3_PAL.env", "AMOS1_3_NTSC.env" or "AMOSPro_Interpreter_Config", depending on your AMOS version.

Banks

In order to work with multimedia such as pictures and music, AMOS has this concept of a "memory bank". You can have up to 15 memory banks. For example, you can load several pieces of music into different memory banks, and then identify which one you want to play by a number: "Track Play 5" will play music in bank 5. Or you could load a packed picture into bank 4 and say "Unpack 4 to 0".

While you can load anything into any bank, some of the commands can only take their data from specific bank numbers. Bank 1 is used for Sprites, which are controlled with instructions beginning "Sprite" or "Bob". Bank 2 is for Icons, which are controlled with instructions beginning "Icon". Bank 3 is used for music in AMOS's native music format.

If you have memory banks in use while saving your source code, the contents of the banks are saved along with it. This makes it easy to bundle your code with the data it works on. The exception to this is banks created with the "Reserve As Work" instruction.

AMOS source code file format

AMOS source code is normally stored in a file with the extension ".AMOS". It begins with 16 bytes of ASCII text from the following list:

Text                      Tested?  Saved from which AMOS?
"AMOS Pro101V\0\0\0\0"    Yes      AMOS Professional
"AMOS Pro101v\0\0\0\0"    No       AMOS Professional
"AMOS Basic V134 "        Yes      AMOS Pro, but AMOS 1.3 compatible
"AMOS Basic v134 "        No       AMOS Pro, but AMOS 1.3 compatible
"AMOS Basic V1.3 "        Yes      AMOS The Creator v1.3
"AMOS Basic v1.3 "        No       AMOS The Creator v1.3
"AMOS Basic V1.00"        Yes      AMOS The Creator v1.0 - v1.2
"AMOS Basic v1.00"        No       AMOS The Creator v1.0 - v1.2

As can be seen from the table, the 12th character in the text is either "V", which means "tested", or "v", which means "not tested". "Tested" in this case refers to whether the AMOS interpreter has performed a syntax check on all lines of code, and found no syntax errors. While you can save AMOS source code to disk at any time, you can only run it or compile it if it has been tested first.

After the 16-byte header is a 32-bit (4-byte) unsigned integer stating the number of bytes of tokenised BASIC code. This is immediately followed by the BASIC code itself, for the length given.

Finally, after the BASIC code, a 4-byte ASCII identifier "AmBs" is given, followed by a 16-bit (2-byte) unsigned integer with the number of memory banks to follow. This is followed by the banks themselves, individually sized. Each bank can either be a sprite bank, an icon bank or a regular memory bank. There is no more data in the source code file after this. If a sprite bank is given, it always occupies bank 1 and there must not be another sprite bank or regular memory bank with a bank number of 1. If an icon bank is given, it always occupies bank 2 and there must not be another icon bank or regular memory bank with a bank number of 2.
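The overall file layout described above can be sketched in C as follows. This is an illustration under stated assumptions rather than a reference implementation: the struct amos_source_info and the function amos_parse_source_header are hypothetical names, the header is only checked for the "AMOS " prefix and the V/v character, and a banks_offset of 0 is used to mean that no "AmBs" identifier was found.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct amos_source_info {
    int      tested;        /* 1 if the 12th header character is 'V', 0 if 'v' */
    uint32_t code_length;   /* number of bytes of tokenised BASIC code */
    size_t   code_offset;   /* always 20: 16-byte header + 4-byte length */
    size_t   banks_offset;  /* offset of "AmBs", or 0 if it was not found */
    unsigned bank_count;    /* number of banks announced after "AmBs" */
};

/* Returns 0 on success, -1 if the buffer does not look like AMOS source. */
static int amos_parse_source_header(const unsigned char *buf, size_t len,
                                    struct amos_source_info *out) {
    if (len < 20 || memcmp(buf, "AMOS ", 5) != 0) return -1;
    if (buf[11] != 'V' && buf[11] != 'v') return -1;

    out->tested = (buf[11] == 'V');
    out->code_length = ((uint32_t)buf[16] << 24) | ((uint32_t)buf[17] << 16) |
                       ((uint32_t)buf[18] << 8)  |  (uint32_t)buf[19];
    out->code_offset = 20;
    out->banks_offset = 0;
    out->bank_count = 0;

    if (out->code_length > len - 20) return -1;  /* code is truncated */

    size_t pos = 20 + (size_t)out->code_length;
    if (len - pos >= 6 && memcmp(buf + pos, "AmBs", 4) == 0) {
        out->banks_offset = pos;
        out->bank_count = ((unsigned)buf[pos + 4] << 8) | buf[pos + 5];
    }
    return 0;
}
```

The banks themselves begin at banks_offset + 6 and must be walked one at a time, since each is individually sized.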

Tokenised BASIC code format

The tokenised BASIC code is a stream of tokenised lines. Each tokenised line has the following format:

  • 1 byte: The length of this line in words (2 bytes), including this byte. To get the length of the line in bytes, double this value.
  • 1 byte: The indent level of this line. AMOS automatically indents lines to show program structure. If printing this line as ASCII text, you should print {indent level - 1} space characters at the beginning of the line, or no spaces if the value is less than 2.
  • many bytes: a sequence of tokens. Each token is at least two bytes, and all tokens are rounded up to a multiple of two bytes. Each token is individually sized. The tokens always end with a compulsory null token.
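Walking the stream of lines only needs the length byte. The sketch below visits each line in a block of tokenised code and rejects obviously malformed data. The names amos_line_fn and amos_walk_lines are hypothetical, and the minimum line size of 4 bytes (length byte, indent byte and the compulsory null token) is an inference from the layout above, not a documented rule.

```c
#include <stddef.h>

/* Hypothetical callback: receives the line start, its size in bytes,
 * the raw indent byte, and a user pointer. */
typedef void (*amos_line_fn)(const unsigned char *line, size_t bytes,
                             unsigned indent, void *user);

/* Visits every tokenised line in code[0..code_len).
 * Returns the number of lines visited, or -1 if a line is malformed. */
static long amos_walk_lines(const unsigned char *code, size_t code_len,
                            amos_line_fn fn, void *user) {
    size_t pos = 0;
    long count = 0;

    while (pos < code_len) {
        if (code_len - pos < 2) return -1;      /* no room for length + indent */
        size_t bytes = (size_t)code[pos] * 2;   /* length is stored in words */
        if (bytes < 4 || bytes > code_len - pos) return -1;
        if (fn) fn(code + pos, bytes, code[pos + 1], user);
        pos += bytes;
        count++;
    }
    return count;
}
```

The tokens of a line start at offset 2 within the line and run up to the null token at its end.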

AMOS treats each token as a signed 16-bit number. Token values between 0x0000 and 0x004E are specially printed and have differing sizes; all others are simply a signed offset into AMOS's internal token table, and the text of that table entry is what should be printed. Some of these table tokens have special size rules; all others are 2 bytes in size.

Specially printed tokens

Token Type Interpretation
0x0000 null token Marks the end of line. Always 2 bytes long.
0x0006 Variable reference, e.g. Print XYZ
0x000C Label, e.g. XYZ: or 190 at the start of a line
0x0012 Procedure call reference, e.g. XYZ["hello"]
0x0018 Label reference, e.g. Goto XYZ

These four tokens share one layout:

  • 2 bytes: token (0x0006, 0x000C, 0x0012 or 0x0018)
  • 2 bytes: unknown purpose
  • 1 byte: length of the ASCII string for the variable or label name
  • 1 byte: flags, for tokens 0x0006, 0x0012 and 0x0018:
    • bit 1 set: this is a floating point reference, e.g. "XYZ#"
    • bit 2 set: this is a string reference, e.g. "XYZ$"
  • many bytes: the ASCII string, with the above-given length.

The ASCII string is null terminated and its length is rounded up to a multiple of two.
0x0026 String with double quotes, e.g. "XYZ"
0x002E String with single quotes, e.g. 'XYZ'

These two tokens share one layout:

  • 2 bytes: token (0x0026 or 0x002E)
  • 2 bytes: the length of the string
  • many bytes: the ASCII string, with the above-given length

The ASCII string is null terminated and its length is rounded up to a multiple of two.
0x001E Binary integer value, e.g. %100101
0x0036 Hexadecimal integer value, e.g. $80FAA010
0x003E Decimal integer value, e.g. 1234567890

These three tokens share one layout:

  • 2 bytes: token (0x001E, 0x0036 or 0x003E)
  • 4 bytes: the integer value
0x0046 Floating point value, e.g. 3.1452
  • 2 bytes: token (0x0046)
  • 4 bytes: the single-precision floating point value.
    • bits 31-8: mantissa (24 bits)
    • bit 7: sign bit. Positive if 0, negative if 1
    • bits 6-0: exponent

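An exponent of 0 means 0.0, regardless of mantissa. Otherwise, numbering the mantissa bits from MSB (23) to LSB (0), each set bit contributes 2^(mantissa_bit + exponent - 88) to the value; the result is the sum of these contributions, negated if the sign bit is set.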
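Since every set mantissa bit n contributes 2^(n + exponent - 88), the whole value is simply mantissa × 2^(exponent - 88). A minimal C sketch of that conversion follows; amos_float_to_double is a hypothetical name, and the function takes the 4 value bytes already read as a big-endian 32-bit integer.

```c
#include <stdint.h>

/* Convert the 4-byte AMOS floating point value (already read as a
 * big-endian 32-bit integer) to a C double. */
static double amos_float_to_double(uint32_t raw) {
    uint32_t mantissa = raw >> 8;          /* bits 31-8 */
    int negative = (int)((raw >> 7) & 1);  /* bit 7 */
    int exponent = (int)(raw & 0x7F);      /* bits 6-0 */

    if (exponent == 0) return 0.0;         /* zero, regardless of mantissa */

    /* Scale by 2^(exponent - 88). Repeated doubling or halving is exact
     * for this range in a double and avoids depending on the maths library. */
    double value = (double)mantissa;
    int shift = exponent - 88;
    while (shift > 0) { value *= 2.0; shift--; }
    while (shift < 0) { value /= 2.0; shift++; }

    return negative ? -value : value;
}
```

For example, the raw value 0x80000041 has mantissa 0x800000 (only bit 23 set) and exponent 0x41, giving 2^(23 + 65 - 88) = 1.0.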

0x004E Extension command
  • 2 bytes: token (0x004E)
  • 1 byte: extension number (1 to 26)
  • 1 byte: unused
  • 2 bytes: signed 16-bit offset into extension's token table

Specially sized tokens

Token Type Interpretation
0x064A Rem
0x0652 Rem type 2

Print the remark string in addition to the remark token. These two tokens share one layout:

  • 2 bytes: token (0x064A or 0x0652)
  • 1 byte: unused
  • 1 byte: length of remark string
  • many bytes: the ASCII remark string, with the above-given length.

The ASCII string is null terminated and its length is rounded up to a multiple of two.

0x023C For
0x0250 Repeat
0x0268 While
0x027E Do
0x02BE If
0x02D0 Else
0x0404 Data

These seven tokens share one layout:

  • 2 bytes: token (0x023C, 0x0250, 0x0268, 0x027E, 0x02BE, 0x02D0 or 0x0404)
  • 2 bytes: unknown purpose
0x0290 Exit If
0x029E Exit
0x0316 On

These three tokens share one layout:

  • 2 bytes: token (0x0290, 0x029E or 0x0316)
  • 4 bytes: unknown purpose
0x0376 Procedure
  • 2 bytes: token (0x0376)
  • 4 bytes: number of bytes to corresponding End Proc line
(start of line + 8 + above = start of End Proc line)
(start of line + 8 + 6 + above = line after End Proc line)
  • 2 bytes: part of seed for encryption
  • 1 byte: flags
    • bit 7: if set, procedure is folded
    • bit 6: if set, procedure is locked and shouldn't be unfolded
    • bit 5: if set, procedure is currently encrypted
    • bit 4: if set, procedure contains compiled code and not tokens
  • 1 byte: part of seed for encryption
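The size rules from the two tables above can be gathered into a single function, which is what a line printer or scanner needs in order to step from one token to the next. The sketch below is hedged: amos_token_size and round_up_even are hypothetical names, stored string lengths are assumed to be possibly odd and are therefore rounded up to an even number of bytes, and On (0x0316) is treated as sharing the Exit If layout because the table groups it with those tokens.

```c
#include <stddef.h>

/* Round a byte count up to a multiple of two. */
static size_t round_up_even(size_t n) {
    return (n + 1) & ~(size_t)1;
}

/* Size in bytes of the token starting at p, including the 2-byte
 * token value itself. The caller must ensure p stays inside the line. */
static size_t amos_token_size(const unsigned char *p) {
    unsigned token = ((unsigned)p[0] << 8) | p[1];

    switch (token) {
    case 0x0006: case 0x000C: case 0x0012: case 0x0018:
        /* token, 2 unknown bytes, length byte, flags byte, padded name */
        return 6 + round_up_even(p[4]);
    case 0x0026: case 0x002E:
        /* token, 16-bit string length, padded string */
        return 4 + round_up_even(((size_t)p[2] << 8) | p[3]);
    case 0x001E: case 0x0036: case 0x003E: case 0x0046:
        return 6;   /* token + 4-byte integer or floating point value */
    case 0x004E:
        return 6;   /* token, extension number, unused byte, 16-bit offset */
    case 0x064A: case 0x0652:
        /* token, unused byte, length byte, padded remark text */
        return 4 + round_up_even(p[3]);
    case 0x023C: case 0x0250: case 0x0268: case 0x027E:
    case 0x02BE: case 0x02D0: case 0x0404:
        return 4;   /* token + 2 bytes of unknown purpose */
    case 0x0290: case 0x029E: case 0x0316:
        return 6;   /* token + 4 bytes of unknown purpose */
    case 0x0376:
        return 10;  /* token, 4-byte distance, 2-byte seed, flags, seed byte */
    default:
        return 2;   /* null token and all ordinary table tokens */
    }
}
```

Starting at offset 2 of a line and repeatedly adding amos_token_size until the null token is reached visits every token on that line.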

Procedure decryption source code

If you should find a procedure (0x0376) token with the "is encrypted" bit set, run this C function on the code and it will decrypt the contents of the procedure.

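#include <stdint.h>

/* fetches a 4-byte integer in big-endian format; the casts keep the
 * shifts unsigned so a top byte of 0x80 or more is handled correctly */
#define EndGetM32(a)  ((((uint32_t)(a)[0])<<24)|(((uint32_t)(a)[1])<<16)|(((uint32_t)(a)[2])<<8)|((uint32_t)(a)[3]))
/* fetches a 2-byte integer in big-endian format */
#define EndGetM16(a)  ((((a)[0])<<8)|((a)[1]))

void decrypt_procedure(unsigned char *src) {
  unsigned char *line, *next, *endline;
  /* these must be exactly 32 bits wide: the key is rotated with (key << 31) */
  uint32_t key, key2, key3, size;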

  /* ensure src is a pointer to a line with the PROCEDURE token on it */
  if (EndGetM16(&src[2]) != 0x0376) return;

  /* do not operate on compiled procedures */
  if (src[10] & 0x10) return;

  /* size+8+6 is the start of the line after ENDPROC */
  size = EndGetM32(&src[4]);
  endline = &src[size+8+6];
  line = next = &src[src[0] * 2];

  /* initialise encryption keys */
  key = (size << 8) | src[11];
  key2 = 1;
  key3 = EndGetM16(&src[8]);

  while (line < endline) {
    line = next;
    next = &line[line[0] * 2];

    /* decrypt one line */
    for (line += 4; line < next;) {
      *line++ ^= (key >> 8) & 0xFF;
      *line++ ^=  key       & 0xFF;
      key  += key2;
      key2 += key3;
      key = (key >> 1) | (key << 31);
    }
  }
  src[10] ^= 0x20; /* toggle "is encrypted" bit */
}

AMOS Sprite and Icon bank formats

A sprite bank and an icon bank share nearly the same format. Both define graphic data which can be drawn onscreen.

  • 4 bytes: "AmSp" for sprites (bank 1) or "AmIc" for icons (bank 2)
  • 2 bytes: the number of sprites or icons to follow
  • many bytes: the above-counted sprites or icons
  • 64 bytes: a 32-entry colour palette. Each entry has the Amiga COLORx hardware register format, which is 0x0RGB, where R, G and B represent red, green and blue colour components and are between 0x0 (minimum) and 0xF (maximum).

Each sprite or icon has this format:

  • 2 bytes: width, in 16-bit words
  • 2 bytes: height, in raster lines
  • 2 bytes: depth, in bitplanes (1 to 5)
  • 2 bytes: hot-spot X co-ordinate
  • 2 bytes: hot-spot Y co-ordinate
  • many bytes: width*height*depth*2 bytes of planar graphic data
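To step through the sprites in a bank, you need the size of each one's planar data, and to display them you need the palette in a more common form. The sketch below shows both. The names amos_sprite_data_size and amos_colour_to_rgb8 are hypothetical, and expanding each 4-bit component to 8 bits by repeating the nibble (0xF becomes 0xFF) is a common convention assumed here, not something AMOS defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes of planar data that follow a sprite's 10-byte header.
 * hdr points at the sprite's width field. */
static size_t amos_sprite_data_size(const unsigned char *hdr) {
    size_t width_words = ((size_t)hdr[0] << 8) | hdr[1];  /* 16-bit words */
    size_t height      = ((size_t)hdr[2] << 8) | hdr[3];  /* raster lines */
    size_t depth       = ((size_t)hdr[4] << 8) | hdr[5];  /* bitplanes */
    return width_words * height * depth * 2;
}

/* Expand one 0x0RGB palette entry to 8 bits per component. */
static void amos_colour_to_rgb8(uint16_t colour, unsigned char rgb[3]) {
    rgb[0] = (unsigned char)(((colour >> 8) & 0xF) * 0x11);  /* red */
    rgb[1] = (unsigned char)(((colour >> 4) & 0xF) * 0x11);  /* green */
    rgb[2] = (unsigned char)(( colour       & 0xF) * 0x11);  /* blue */
}
```

The next sprite starts 10 + amos_sprite_data_size(hdr) bytes after the current one, and the 64-byte palette follows the last sprite.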

AMOS Memory Bank formats

An AMOS memory bank is simply a named block of data. AMOS allows for 15 such banks in an AMOS program, and they can also be loaded and saved at runtime using the "Load" and "Save" commands. Each bank has a standard 20-byte header. Note that the "length" field in this header counts the 8-byte "name" field as part of the bank data rather than as part of the header. Each bank can be located in "chip" memory, which is accessible to the Amiga's custom graphics and sound processors, or it can be located in "fast" memory, which is only accessible to the CPU. The header format is as follows:

  • 4 bytes: the ASCII identifier "AmBk"
  • 2 bytes: the bank number (1-15)
  • 2 bytes: 0 for chip memory bank, 1 for fast memory bank
  • 4 bytes: bank length, but only bits 27 to 0. Bits 28 and 29 are undefined, not part of the length field. Bit 30 means "try chip memory", bit 31 means "try fast memory" if set.
  • 8 bytes: the bank name. It is always an unterminated ASCII string which is padded with spaces.

The header is followed by the bank data, which is {bank length - 8} bytes long.
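A minimal C sketch of reading this header follows. The struct amos_bank_header and the function amos_parse_bank_header are hypothetical names for illustration. The sketch masks off the top four bits of the length field and subtracts the 8 bytes of the name to get the size of the data that follows.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of the 20-byte "AmBk" header. */
struct amos_bank_header {
    unsigned number;       /* bank number, 1-15 */
    int      fast_memory;  /* 0 = chip memory bank, 1 = fast memory bank */
    uint32_t data_length;  /* bytes of bank data following the header */
    char     name[9];      /* 8-character name plus a terminating NUL */
};

/* Returns 0 on success, -1 if buf does not hold a valid bank header. */
static int amos_parse_bank_header(const unsigned char *buf, size_t len,
                                  struct amos_bank_header *out) {
    if (len < 20 || memcmp(buf, "AmBk", 4) != 0) return -1;

    out->number = ((unsigned)buf[4] << 8) | buf[5];
    out->fast_memory = ((((unsigned)buf[6] << 8) | buf[7]) == 1);

    uint32_t raw = ((uint32_t)buf[8] << 24) | ((uint32_t)buf[9] << 16) |
                   ((uint32_t)buf[10] << 8) |  (uint32_t)buf[11];
    uint32_t length = raw & 0x0FFFFFFFu;   /* keep only bits 27 to 0 */
    if (length < 8) return -1;             /* must at least cover the name */
    out->data_length = length - 8;

    memcpy(out->name, buf + 12, 8);        /* space padded, unterminated */
    out->name[8] = '\0';
    return 0;
}
```

The name is kept with its space padding, so a music bank compares equal to "Music   " rather than "Music".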

AMOS Music Bank format

This bank has the name "Music" and is created with various conversion utilities shipped with AMOS. It is played back with the Music extension. See AMOS Music Bank format for more details.

AMOS AMAL Bank format

This bank has the name "Amal". It contains instructions in AMOS Animation Language format. This will be described in more detail in a later draft of this document.

AMOS Menu Bank format

This bank has the name "Menu". It contains pull-down menu definitions. This will be described in more detail in a later draft of this document.

AMOS Data Bank format

This bank has the name "Datas". It is created in AMOS using the "Reserve As Data" command, and has no specific format.

AMOS Work Bank format

This bank has the name "Work". It is created in AMOS using the "Reserve As Work" command, and has no specific format. As a Work bank, it is not saved as part of the source code, unlike normal data banks.

AMOS Asm Bank format

This bank has the name "Asm". It contains Amiga machine code that was loaded into a bank using the "Pload" command and has no specific format (other than containing MC680x0 binary code).

AMOS Picture Bank format

This bank has the name "Pac.Pic." and is created with the Compact extension's "Pack" command. See AMOS Pac.Pic. format for more details.

AMOS Samples Bank format

This bank has the name "Samples" and is created with the Sample Bank Editor shipped with AMOS. The samples can be played back with the Music extension. The format of the bank is as follows:

  • 20 bytes: the regular bank header, as described above
  • 2 bytes: the number of samples in this bank
  • many bytes: A list of offsets, each 4 bytes long, to each sample. The offset is relative to the location of the "number of samples" field above.

The format of each sample is as follows:

  • 8 bytes: the name for the sample, in ASCII.
  • 2 bytes: the frequency of the sample in hertz.
  • 4 bytes: the length of the sample in bytes.
  • many bytes: the sample data itself, a stream of two's complement signed 8-bit PCM samples
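Looking up one sample from the bank can be sketched as below. The names amos_sample and amos_get_sample are hypothetical. The function expects a pointer to the "number of samples" field (that is, 20 bytes into the bank, just past the regular header), because the offsets in the list are relative to that field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded view of one sample. */
struct amos_sample {
    char     name[9];            /* 8-character name plus a terminating NUL */
    unsigned frequency_hz;       /* playback frequency in hertz */
    uint32_t length;             /* number of bytes of sample data */
    const unsigned char *data;   /* two's complement signed 8-bit PCM */
};

static uint32_t sample_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* body points at the "number of samples" field; body_len is the number
 * of bytes available from there. Returns 0 on success, -1 on error. */
static int amos_get_sample(const unsigned char *body, size_t body_len,
                           unsigned index, struct amos_sample *out) {
    if (body_len < 2) return -1;
    unsigned count = ((unsigned)body[0] << 8) | body[1];
    if (index >= count) return -1;

    size_t entry = 2 + (size_t)index * 4;      /* position in offset list */
    if (body_len < entry + 4) return -1;

    size_t off = sample_be32(body + entry);    /* relative to body */
    if (off > body_len || body_len - off < 14) return -1;

    const unsigned char *s = body + off;
    memcpy(out->name, s, 8);
    out->name[8] = '\0';
    out->frequency_hz = ((unsigned)s[8] << 8) | s[9];
    out->length = sample_be32(s + 10);
    if (out->length > body_len - off - 14) return -1;  /* data truncated */
    out->data = s + 14;
    return 0;
}
```

To use the data with a signed PCM API, cast each byte to signed char; the stored bytes are unchanged.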