Maz is a Z80 macro assembler, which is currently under development. I wouldn't advise using it until it's at version 1, or at least until I change this message.
If you have npm 5.2 or later (which bundles npx), you can run Maz using npx: 'npx maz ...'. The command line parameters are:
Option (short and long) | Description |
---|---|
-s filename --src filename | The source file to assemble |
-o filename --out filename | The binary file to output to |
-l filename --list filename | The listing file to create |
-b --brief | Show brief error messages |
-u --undoc | Show warnings when undocumented instructions are used |
-p path --path path | Specify a directory to search in when looking for files to include. This option can be specified multiple times. |
Numbers are decimal unless one of the following applies:
- Numbers which start with $ or end with h are in hex
- Numbers which end with o are in octal
- Numbers which start with % or end with b are in binary
Numbers can contain _ to improve readability, e.g. 1101_0100_1010_1111b
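The rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not Maz's actual parser: the function name `parse_number` is made up, and the case-insensitive handling of the suffixes and the order in which they are checked are assumptions.

```python
def parse_number(text: str) -> int:
    """Parse a numeric literal using the prefix/suffix rules listed above."""
    # Underscores are only there for readability, so drop them first.
    # Lower-casing the literal is an assumption, not documented behaviour.
    s = text.replace("_", "").lower()
    if s.startswith("$"):      # $ prefix: hex
        return int(s[1:], 16)
    if s.startswith("%"):      # % prefix: binary
        return int(s[1:], 2)
    if s.endswith("h"):        # h suffix: hex (checked before b, so 0bh is hex)
        return int(s[:-1], 16)
    if s.endswith("o"):        # o suffix: octal
        return int(s[:-1], 8)
    if s.endswith("b"):        # b suffix: binary
        return int(s[:-1], 2)
    return int(s, 10)          # anything else is decimal


print(hex(parse_number("1101_0100_1010_1111b")))
```

The prefixes are tested before the suffixes so that a hex literal such as $1b is not mistaken for a binary number.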
Source files are expected to be encoded in UTF-8, and so all strings are processed as UTF-8 strings. Characters in the range 00-7F are the same as ASCII characters. Example:
68 db "h"
c3 9f db "ß"
e2 84 a2 db "™"
f0 9f 98 81 db "😁"
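If you want to check which bytes a string will produce, any UTF-8 encoder gives the same answer. A small Python sketch (the helper name `db_bytes` is hypothetical and has nothing to do with Maz itself):

```python
def db_bytes(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated hex."""
    return text.encode("utf-8").hex(" ")


# Print each example character next to its encoded bytes.
for ch in ["h", "ß", "™", "😁"]:
    print(db_bytes(ch), "db", repr(ch))
```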
Strings can be entered just like JavaScript strings. They can be enclosed in double or single quotes, and support the following escape sequences:
Code | Output |
---|---|
\0 | a NULL character (0) |
\' | a single quote |
\" | a double quote |
\\ | a backslash |
\n | a newline (10) |
\r | a carriage return (13) |
\v | a vertical tab (11) |
\t | a tab (9) |
\b | a backspace (8) |
\f | a formfeed (12) |
\xXX | a Latin-1 character |
\uXXXX | a Unicode character |
Note that although you can enter a Latin-1 character such as \xFF, the string is encoded as UTF-8, so it will actually be made up of two bytes: C3 BF.
One- and two-character strings can be used as numbers, where the first byte is the low-order byte and the second is the high-order byte. For example, "ab" is equivalent to $6261. (By "two-character strings" I really mean strings that are two bytes long when encoded in UTF-8.)
You can also repeat a string using the multiplication operator: "ho! " * 3 results in "ho! ho! ho! ". (However, "ho" * 3 produces a number, because a two-byte string is treated as a number.)
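The string-to-number rule is a little-endian read of the UTF-8 bytes. A minimal Python sketch of that rule, assuming the behaviour is exactly as described above (`string_value` is a hypothetical name, and raising an error for longer strings is my choice for the sketch, not necessarily what Maz does):

```python
def string_value(text: str) -> int:
    """Numeric value of a one- or two-byte string: first byte is the low byte."""
    data = text.encode("utf-8")
    if not 1 <= len(data) <= 2:
        raise ValueError("only one- or two-byte strings can be used as numbers")
    return int.from_bytes(data, "little")


print(hex(string_value("ab")))     # 'a' = $61 is the low byte, 'b' = $62 the high byte
print("\xff".encode("utf-8"))      # the Latin-1 escape still becomes two UTF-8 bytes
```

Under this model a single character such as "\xFF" would count as a two-byte string, since its UTF-8 encoding is C3 BF.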
Labels can contain any of the following characters: a-z, A-Z, 0-9, _
$ can be used to refer to the address of the current statement. In the case of DBs, it refers to the address of the start of the DBs.
When defining a label in a block, you can prefix it with an at symbol (@) to make it public. The label is then declared at the top level instead of in the block's scope.
Macros may have the same name as a label.
There are no reserved words, so you may have labels which are the same as registers, and macros which are the same as instructions. This will be very confusing though, and will cause you to end up with ridiculous bugs in your code.
When assembling the code, two addresses are tracked: the output address and the code address. The output address defines where the generated bytes go in memory and in the output file. The code address is initially the same as the output address, but it can be changed using the PHASE directive. The code is then assembled as if it were going to sit in memory at the code address, but it is still placed at the output address. This is useful if the assembled code is going to be copied to a different location before execution. The output and code addresses both start at 0 until set by ORG or PHASE.
Some example code, including the output on the left (output address, followed by bytes). Notice how the second jump instruction jumps to $200, despite the fact that the 'two' label is output at location $105.
.org $100
0100 3e01 one: ld a,1
0102 c30001 jp one
.phase $200
0105 3e01 two: ld a,1
0107 c30002 jp two
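The bookkeeping behind that listing can be sketched in Python. This is a toy model for illustration only, not Maz's implementation: the `Addresses` class and its method names are made up, and it assumes that ORG sets both addresses (which is what the listing above suggests).

```python
class Addresses:
    """Toy model of the output address and the code address."""

    def __init__(self):
        self.output = 0     # where bytes land in the output file
        self.code = 0       # the address labels and $ resolve to
        self.labels = {}

    def org(self, address):
        # Assumption: ORG moves both addresses together.
        self.output = address
        self.code = address

    def phase(self, address):
        # PHASE changes only the code address.
        self.code = address

    def endphase(self):
        # ENDPHASE/DEPHASE resets the code address to match the output address.
        self.code = self.output

    def label(self, name):
        self.labels[name] = self.code

    def emit(self, size):
        """Advance both addresses by size bytes; return where the bytes were written."""
        start = self.output
        self.output += size
        self.code += size
        return start


a = Addresses()
a.org(0x100)
a.label("one")
a.emit(2)                   # ld a,1
a.emit(3)                   # jp one
a.phase(0x200)
a.label("two")
written_at = a.emit(2)      # ld a,1
a.emit(3)                   # jp two
print(hex(a.labels["two"]), hex(written_at))
```

In this model the label 'two' resolves to $200 while its bytes are written at $105, matching the listing.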
Any directive shown below without a leading full stop (period) may also be written with a leading full stop. The full stop is not optional if it is shown.
- db bytes
- defb bytes
- Declare bytes. Strings are converted into their UTF-8 values.
Only the low byte of any numeric value is used.
db 0,1,100,$12,"hello\n",$1234 00 01 64 12 68 65 6c 6c 6f 0a 34
- dw words
- defw words
- Declare words. Strings are converted into their UTF-8 values.
Words are stored low-byte first. If the string has an odd length, an extra zero is stored.
dw 0,$1234,"hello",$12345 00 00 34 12 68 65 6c 6c 6f 00 45 23
- ds expression
- defs expression
- Declare storage. The expression is the number of bytes of storage to reserve. The bytes will be initialised to zeroes.
ds 12
- equ expression
- Sets the value of a label. This must be prefixed with a label (or more than one label).
- org expression
- Set origin. Sets the address at which the following instructions are assembled and the following bytes are stored.
org $100
- .phase expression
- Set phase. Sets the code address without changing the output address.
- .endphase
- .dephase
- End phase. Ends phased compilation and resets the code address to match the output address.
- macro macroname labels
- Define a macro. The parameters can be treated the same as EQUates until the macro ends.
macro add a,b
- endm
- End a macro definition.
- macroname arguments
- Call a macro.
- .block
- Start a block. Any labels declared within the block are only visible within that block. Blocks can be nested, so labels are also visible to blocks within the block.
- .endblock
- End a block.
- .align expression
- Align the next byte so that its address modulo expression is zero. (If we're in phased compilation, it is the code address set by .phase that is aligned.)
- .include filename
- Include a file and process it as if it were in the current file. The filename is relative to the current file. If you include a file from an included file, that filename will also be relative to the file the include statement is in.
.include "something/routines.z80"
- .if expression
- Assembles the code following the .if statement if the expression evaluates to true, up to the next .endif or .else statement.
Note that the if expression must be evaluable on the first pass of assembly. Also, weird things might happen if an .if-.else-.endif block crosses a macro boundary.
- .else
- If an .if statement's expression evaluated to false, the code following the .else statement is assembled instead, up to the next .endif statement.
- .endif
- Marks the end of a conditional assembly block.
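To make the db, dw and .align rules above concrete, here is a rough Python model of them. It is a sketch under stated assumptions rather than Maz's code: the function names are hypothetical, and truncating dw numbers to their low 16 bits is inferred from the dw example above, where $12345 is stored as 45 23.

```python
def encode_db(*values) -> bytes:
    """db: strings become their UTF-8 bytes, numbers keep only their low byte."""
    out = bytearray()
    for value in values:
        if isinstance(value, str):
            out += value.encode("utf-8")
        else:
            out.append(value & 0xFF)
    return bytes(out)


def encode_dw(*values) -> bytes:
    """dw: numbers are stored low byte first, odd-length strings get a zero pad."""
    out = bytearray()
    for value in values:
        if isinstance(value, str):
            data = value.encode("utf-8")
            if len(data) % 2:
                data += b"\x00"
            out += data
        else:
            # Assumption: only the low 16 bits are kept.
            out += (value & 0xFFFF).to_bytes(2, "little")
    return bytes(out)


def align_padding(address: int, boundary: int) -> int:
    """Number of bytes to skip so that address modulo boundary becomes zero."""
    return -address % boundary


print(encode_db(0, 1, 100, 0x12, "hello\n", 0x1234).hex(" "))
print(encode_dw(0, 0x1234, "hello", 0x12345).hex(" "))
```

Both calls reproduce the byte sequences shown in the db and dw examples above.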
Maz supports expressions with proper precedence, and various different syntaxes for operators. The operators are, in order of precedence (highest first):
Operators | Description |
---|---|
( ) function() | ( ) → brackets function() → various functions (see below) |
! ~ + - | Unary operators ! → logical not ~ → bitwise not + → unary plus - → unary minus |
* / % mod | * → multiply / → divide % or mod → modulus |
+ - | + → add - → subtract |
<< shl >> shr | << or shl → shift left >> or shr → shift right |
< lt > gt <= le >= ge | < or lt → less than > or gt → greater than <= or le → less than or equal >= or ge → greater than or equal |
& and | & or and → bitwise and |
^ xor | ^ or xor → bitwise xor |
\| or | \| or or → bitwise or |
&& | && → logical and |
\|\| | \|\| → logical or |
?: | ?: → ternary operator |
Textual operators must be followed by either some whitespace or an open bracket.
There are some functions which you can use in expressions:
Function | Description |
---|---|
min(x, y, ...) | Returns the smallest value |
max(x, y, ...) | Returns the largest value |
swp(x) | Swaps high and low order bytes of number x |
cat(x, y, ...) | Concatenates the strings |
rpt(s, n) | Repeat string s, n times |
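As a rough guide to what these functions compute, here are Python equivalents. They are a sketch only: treating swp as a swap of the two bytes of a 16-bit value is an assumption, and min and max are simply Python's built-ins standing in for Maz's.

```python
def swp(x: int) -> int:
    """Swap the high and low bytes, assuming x is a 16-bit value."""
    return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)


def cat(*strings: str) -> str:
    """Concatenate any number of strings."""
    return "".join(strings)


def rpt(s: str, n: int) -> str:
    """Repeat string s, n times."""
    return s * n


print(hex(swp(0x1234)))
print(cat("ab", "cd", "ef"))
print(repr(rpt("ho! ", 3)))
print(min(3, 1, 2), max(3, 1, 2))
```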
Maz supports all undocumented Z80 instructions. See this website for a list of undocumented instructions: http://clrhome.org/table/
The Z80 instruction syntax is a little inconsistent around some 8-bit instructions: for example, ADD A,B includes the accumulator, but SUB B doesn't. When using Maz the "A," is optional for all of the following instructions: ADD, ADC, SUB, SBC, AND, XOR, OR, CP.