65k Design Notes

This page describes some discussion that I went through while doing the programming specification for the 65k. I started this process in 2010, but explicitely decided to leave it there until my hardware pipeline had run dry (i.e. after the USB and Blitter boards being done).

During this time from time to time I thought about the 65k design. For example when I had to program some 6502 code I though how I could benefit from the 65k writing this piece of code. From this for example comes the "2s-complement" opcode, but also the prefix bit determining when to expand the opcode (so you can do a 64bit addition with a byte immediate operand). On the other I also thought about how to implement it, where I found the most complexity in the decode area, that also has to work with different bus width options.

Some of the decisions made here may very well be premature optimizations. I will revisit these and may change them in the further process.

News:

2011-12-23 Addded some discussion about the ZE/EE and LE prefixes
2011-05-01 Started this page

Table of content

Interrupt Compatibility
Extension Prefixes
Placing opcodes in the opcode map
- Placing opcodes on the Extension page
- Absolute y-indexed addressing
Elimination of Redundancy
- (Quick) Add/Substract to E/B
Convenience Opcodes
- 2s complement
- Shift X and Y

Interrupt Compatibility

TODO

Extension Prefixes

When providing wider registers than before, of course there must be a way to access those registers in a wide way. This is given with the RS prefix. However, what is the best way to manage transfers between different register widths?

There should be a way to load for example a byte and extend it to a wider size. Or the result of an operation should be automatically extended. And how can a value be truncated? What about loading X and Y with short values when they are always used full width in the addressing modes?

The ZE prefix

The first attempt was the ZE prefix. Without the prefix all stores to a register would be zero-extended. I.e. when, after a load or another operation, the value was written to a register, the value was zero-extended to full width. This accommodated for the X/Y loads fine, and provided some automation for AC. As each operation was sized by RS anyway, zero-extending them was not a problem. However, to add an 8-bit value to AC with a full width operation, the ADC would still need to load the full width value, even though the value was only 8 bit.

The EE prefix

Therefore the EE prefix - Early Extension means that the value is loaded on RS width, and the zero-extension is done "early": before the operation, that then happens on full width. This solves the problem of loading a narrow value for a wide operation. However ...

The LE prefix

... there is only the ability to extend with zeros. And can a register not be extended after having been set with a separate opcode? So now there is an LE prefix, that extends a value when loaded from memory (or another register). I.e. instead of extending a value when it is written like in the original approach, it is now extended when loaded. With the EXT opcode it can then be truncated to the relevant number of bits - or the value could just be stored to memory with a narrow STA instead of a full-width one.

There are two ways of behaviour with the LE prefix.

Load/Transfer operations: Here the value is zero-extended by default, flags are set from the original width given by RS. This allows to simply clear an index register with LDY #$00, no matter the processor width and no matter what was left over in Y from previous code. As flags are set from the RS-width value, this is also compatible with the 6502. Also it provides more information, as for example the sign of the value would be masked by a zero-extension would the flags be taken from the full width value.
Other operations: In other operations the extension takes place before the operation. I.e. RS determines the load width, but the operation is done on full width. The flags are taken from the result of the operation - which is full width in case of an extension. So flags represent the true result of the operation. However, to avoid accidently extending to a full operation - which is most certainly not wanted in e.g. ADC - the default here is to not extend.

The default operation on a load of X and Y thus is to zero-extend the registers, as they are always used full-width in the addressing modes. Flags would be set from the original size. This implements the "least surprise" strategy. Not-extending could be done with an explicit LE prefix saying so.

For the AC it is less important, as every operation on AC is explicitely sized (using RS) or 8-bit only - so upper bits do not matter on such operations and can be left by default. Therefore AC is not extended by default.

Placing opcodes in the opcode map

When thinking about the opcode decoding I learned that in the FPGA you can save "chip space" when you can combine opcodes. I.e. if you have a large case switch for opcodes, it pays out to say, select all "LDA" opcodes in a single case, and not have a case switch for each addressing mode.

Therefore, for the decoding, I decided to implement a two-stage process: first decode the opcode into something like "Operation" and "Addressing Mode" and "Options" bits, then in the second step do a micro-sequence for the addressing mode and in the execution phase continue with the operation micro-sequence. This way many logic parts can be shared - between same opcodes with different addressing modes and between different opcodes with the same addressing mode.

The consequences of this decision are described below:

Placing opcodes on the Extension page

You might have noticed that the LEA and PEA opcodes are distributed over the extension page. This is done with the above decision in mind. If you look closely you will see that the LEA and PEA addressing modes mirror those of the LDA and STA opcodes in the standard opcode page!

In the original opcode map, column $8 (Low opcode nibble) holds opcodes without operand, i.e. one-byte opcodes like CLC. I have mirrored this in the extension page.

Absolute y-indexed addressing

Some of the opcodes in the original 6502 show an asymmetry in that compared to similar opcodes even though they have an absolute x-indexed addressing mode, they don't have the absolute y-indexed addressing mode. Examples are ASL, ROL, LSR, ROR but also DEC, INC or STZ. STX would also profit from an absolute y-indexed addressing mode. So I have put them into colum $f, at their "logical" places in the opcode map. I.e. when ASL abs,X is opcode $1E, then ASL abs,Y is opcode $1F.

Elimination of Redundancy

Another point when designing the opcode set is to not introduce redundancy, i.e. two opcodes that basically do the same. Thinking about such redundancy reduction may even lead to new features...

(Quick) Add/Substract to E/B

While revisiting the opcode set I found that I had two opcodes that basically do the same, INE (quick increment by 1-8) and ADE immediate, plus similar combinations for substract/decrement and the B register.

Some comments on E register suggest that it is always used full-size. I.e. an immediate add of an 8-bit value would overflow up to the full size if necessary. So instead of a quick INE #8 an equivalent ADE #8 could be used (although the latter has 3 instead of 2 bytes total length).

So the first thing to do is to remove the INE/INB/DEE/DEB quick opcodes.

But wouldn't it be nice if I could add an 8-bit value (like from an immediate addressing mode or when read from an 8-bit variable) to a 32-bit AC without having to do clever tricks? With only the RS prefix bits the "register size" also determines the width of the data in memory, so it is not possible to have an 8-bit variable and use it in a wider operation.

Here another prefix bit would probably be useful to implement the "early" extension.

Convenience Opcodes

This section describes some convenient opcodes that I found missing during my recent 6502 programming.

2s complement

When I programmed my USB stack for the 6502 I found that I had to substract two values, but that the one to substract was in the AC, not the one to substract from. So I could not just use SEC/SBC to do the substraction. Instead I found I had to invert - in the sense of doing the 2s-complement - AC, then add to the second value, to substract the first from the second value.

To make this process easier, I added the INV opcode.

Shift X and Y

When I programmed access to the data structure in the USB driver, I was wondering how to access tabular structure data using the index registers. I had to actually compute an index by multiplying with the structure size.

To make this process easier, I added the shift left/right x/y register quick opcodes (SLX/SRX/SLY/SRY). These allow to quickly compute at least power-of-2 offsets for aligned data structures.

Return to Homepage

Last modified: 2011-12-24