Newsgroups: rec.games.corewar
From: DURHAM@ricevm1.rice.edu (Mark A. Durham)
Subject: Intro to Redcode Part I
Organization: Rice University, Houston, TX
Date: Thu, 14 Nov 1991 09:41:37 GMT

Introduction to Redcode
-----------------------

    I. Preface - Reader Beware!     { Part I  }
   II. Notation                     { Part I  }
  III. MARS Peculiarities           { Part I  }
   IV. Address Modes                { Part II }
    V. Instruction Set              { Part II }

----------------------------------------------------------------------

I. Preface - Reader Beware!

   The name "Core War" arguably can be claimed as public domain.
Thus, any program can pass itself off as an implementation of Core
War.  Ideally, one would like to write a Redcode program on one system
and know that it will run in exactly the same manner on every other
system.  Alas, this is not the case.

   Basically, Core War systems fall under one of four categories:
Non-ICWS, ICWS'86, ICWS'88, or Extended.  Non-ICWS systems are usually
a variant of Core War as described by A. K. Dewdney in his "Computer
Recreations" articles appearing in Scientific American.  ICWS'86 and
ICWS'88 systems conform to the standards set out by the International
Core War Society in their standards of 1986 and 1988, respectively.
Extended systems generally support ICWS'86, ICWS'88, and proprietary
extensions to those standards.  I will frequently discuss common
extensions as if they were available on all Extended systems (which
they most certainly are not).

   I will not describe Non-ICWS systems in this article.  Most
Non-ICWS systems will be easily understood if you understand the
systems described in this article, however.  Although called
"standards", ICWS'86 and ICWS'88 (to a lesser extent) both suffer from
ambiguities and extra-standard issues which I will try to address.

   This is where the reader should beware.  Because almost any
interpretation of the standard(s) is as valid as any other, I
naturally prefer MY interpretation.
I will try to point out other common interpretations when ambiguities
arise, though, and I will clearly label interpretation (mine or
otherwise) as such.  You have been warned!

----------------------------------------------------------------------

II. Notation

   "86:" will indicate an ICWS'86 feature.  "88:" will indicate an
ICWS'88 feature.  "X:" will indicate an Extended feature.  "Durham:"
will indicate my biased interpretation.  "Other:" will indicate
interpretations adhered to by others.  "Commentary:" is me explaining
what I am doing and why.  "Editorial:" is me railing for or against
certain usages.  Items without colon-suffixed prefaces can be
considered universal.

   Redcode consists of assembly language instructions of the form

   <label> <opcode> <A-mode><A-field>, <B-mode><B-field> <comment>

An example Redcode program:

; Imp
; by A. K. Dewdney
;
imp     MOV imp, imp+1      ; This program copies itself ahead one
        END                 ; instruction and moves through memory.

   The <label> is optional.

86: <label> begins in the first column, is one to eight characters
    long, beginning with an alphabetic character and consisting
    entirely of alphanumerals.  Case is ignored ("abc" is equivalent
    to "ABC").

88: <label> as above, except length is not limited and case is not
    addressed.  Only the first eight characters are considered
    significant.

X:  <label> can be preceded by any amount of whitespace (spaces,
    tabs, and newlines), consists of any number of significant
    alphanumerals but must start with an alphabetic, and case is
    significant ("abc" is different from "ABC").

Commentary: I will always use lowercase letters for labels to
    distinguish labels from opcodes and family operands.

   The <opcode> is separated from the <label> (if there is one) by
whitespace.  Opcodes may be entered in either uppercase or lowercase.
The case does not alter the instruction.  DAT, MOV, ADD, SUB, JMP,
JMZ, JMN, DJN, CMP, SPL, and END are acceptable opcodes.

86: SPACE is also recognized as an opcode.
88: SLT and EQU are recognized as opcodes.  SPACE is not.

X:  All of the above are recognized as opcodes as well as XCH and
    PCT, plus countless other extensions.

Commentary: END, SPACE, and EQU are known as pseudo-ops because they
    really indicate instructions to the assembler and do not produce
    executable code.  I will always capitalize opcodes and pseudo-ops
    to distinguish them from labels and text.

   The <A-mode> and <A-field> taken together are referred to as the
A-operand.  Similarly, the <B-mode><B-field> combination is known as
the B-operand.  The A-operand is optional for some opcodes.  The
B-operand is optional for some opcodes.  Only END can go without at
least one operand.

86: Operands are separated by a comma.

88: Operands are separated by whitespace.

X:  Operands are separated by whitespace and/or a comma.  Lack of a
    comma can lead to unexpected behaviour for ambiguous constructs.

Commentary: The '88 standard forces you to write an operand without
    whitespace, reserving whitespace to separate the operands.  I
    like whitespace in my expressions, therefore I prefer to separate
    my operands with a comma and will do so here for clarity.

   <mode> is # (Immediate Addressing), @ (Indirect Addressing), or <
(86: Auto-Decrement Indirect, 88: Pre-Decrement Indirect).  A missing
mode indicates Direct Addressing.

86: $ is an acceptable mode, also indicating Direct Addressing.

88: $ is not an acceptable mode.

X:  $ is an acceptable mode as in 86:.

Commentary: The distinction between Auto-Decrement Indirect
    Addressing and Pre-Decrement Indirect Addressing is semantic, not
    syntactic.

   <field> is any combination of labels and integers separated by the
arithmetic operators + (addition) and - (subtraction).

86: Parentheses are explicitly forbidden.  "*" is defined as a
    special label symbol meaning the current statement.

88: Arithmetic operators * (multiplication) and / (integer division)
    are added.  "*" is NOT allowed as a special label as in 86:.
X:  Parentheses and whitespace are permitted in expressions.

Commentary: The use of "*" as meaning the current statement may be
    useful in some real assemblers, but is completely superfluous in
    a Redcode assembler.  The current statement can always be
    referred to as 0 in Redcode.

   <comment> begins with a ; (semicolon), ends with a newline, and can
have any number of intervening characters.  A comment may appear on a
line by itself with no instruction preceding it.

88: Blank lines are explicitly allowed.

   I will often use "A" to mean any A-operand and "B" to mean any
B-operand (capitalization is important).  I use "a" to mean any
A-field and "b" to mean any B-field.  For this reason, I never use "a"
or "b" as an actual label.

   I enclose sets of operands or instructions in curly braces.  Thus
"A" is equivalent to "{ a, #a, @a, <a }".  I use "???" to mean any
opcode and "x" or "label" as an arbitrary label.  Thus, the complete
family of acceptable Redcode statements can be represented as

   x ??? A, B ; This represents all possible Redcode statements.

"???" is rarely used as most often we wish to discuss the behaviour of
a specific opcode.  I will often use labels such as "x-1" (despite its
illegality) for the instruction before the instruction labelled "x",
for the logically obvious reason.  "M" always stands for the integer
with the same value as the MARS memory size.

----------------------------------------------------------------------

III. MARS Peculiarities

   There are two things about MARS which make Redcode different from
any other assembly language.  The first of these is that there are no
absolute addresses in MARS.  The second is that memory is circular.

   Because there are no absolute addresses, all Redcode is written
using relative addressing.  In relative addressing, all addresses are
interpreted as offsets from the currently executing instruction.
Address 0 is the currently executing instruction.
Address -1 was the previously executed instruction (assuming no jumps
or branches).  Address +1 is the next instruction to execute (again
assuming no jumps or branches).

   Because memory is circular, each instruction has an infinite number
of addresses.  Assuming a memory size of M, the current instruction
has the addresses { ..., -2M, -M, 0, M, 2M, ... }.  The previous
instruction is { ..., -1-2M, -1-M, -1, M-1, 2M-1, ... }.  The next
instruction is { ..., 1-2M, 1-M, 1, M+1, 2M+1, ... }.

Commentary: MARS systems have historically been made to operate on
    object code which takes advantage of this circularity by
    insisting that fields be normalized to non-negative integers
    between 0 and M-1, inclusive.  Since memory size is often not
    known at the time of assembly, a loader in the MARS system (which
    does know the memory size) takes care of field normalization in
    addition to its normal operations of code placement and task
    pointer initialization.

Commentary: Redcode programmers often want to know what the memory
    size of the MARS is ahead of time.  This is not always possible.
    Since normalized fields can only represent integers between 0 and
    M-1 inclusive, we cannot represent M in a normalized field.  The
    next best thing?  M-1.  But how can we write M-1 when we do not
    know the memory size?  Recall from above that -1 is equivalent to
    M-1.  Final word of caution: -1/2 is assembled as 0 (not as M/2)
    since the expression is evaluated within the assembler as -0.5
    and then truncated.

86: Only two assembled-Redcode programs (warriors) are loaded into
    MARS memory (core).

88: Core is initialized to (filled with) DAT 0, 0 before loading any
    warriors.  Any number of warriors may be loaded into core.

Commentary: Tournaments almost always pit warrior versus warrior with
    only two warriors in core.

   MARS is a multi-tasking system.  Warriors start as just one task,
but can "split" off additional tasks.  When all of a warrior's tasks
have been killed, the warrior is declared dead.
When there is a sole warrior still executing in core, that warrior is
declared the winner.

86: Tasks are limited to a maximum of 64 for each warrior.

88: The task limit is not set by the standard.

----------------------------------------------------------------------
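Commentary: The field normalization and the -1/2 pitfall described in
    Section III can be sketched in a few lines of Python.  This is
    only an illustration under stated assumptions, not part of either
    standard and not taken from any particular MARS: the function
    names normalize and eval_division are hypothetical, the core size
    of 8000 is an assumed value, and division is assumed to truncate
    toward zero as described above.

```python
def normalize(field: int, core_size: int) -> int:
    """Map any integer field onto the range 0..core_size-1.

    Hypothetical helper: models what a loader that knows the core
    size might do to each assembled field.
    """
    # Python's % already yields a non-negative result for a positive modulus.
    return field % core_size


def eval_division(numerator: int, denominator: int) -> int:
    """Integer division truncating toward zero (assumed assembler behaviour)."""
    quotient = abs(numerator) // abs(denominator)
    same_sign = (numerator < 0) == (denominator < 0)
    return quotient if same_sign else -quotient


M = 8000  # assumed core size, chosen only for the example

# -1 and M-1 name the same cell, so -1 is a portable way to write M-1.
print(normalize(-1, M))                    # 7999

# 0, M, and -2M all refer to the current instruction.
print(normalize(0, M), normalize(M, M), normalize(-2 * M, M))  # 0 0 0

# The division happens in the assembler, before normalization:
# -1/2 truncates to 0, so the field becomes 0, not something near M/2.
print(normalize(eval_division(-1, 2), M))  # 0
```

    The order of operations is the point of the last line: the
    expression is reduced to a plain integer first, and only that
    integer is normalized against the core size.

----------------------------------------------------------------------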