All times are UTC




Posted: Tue Mar 29, 2022 8:01 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1932
Location: Sacramento, CA, USA
You heard incorrectly. That's an illegal, disgusting and sadistic use of force. I said that they were inbred, not me.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Fri Apr 01, 2022 2:25 pm

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
Thinking about the relative strengths of the 6502 always returns focus to the zero page.

Within the zero page, the 6502's indexed modes behave like the 6800's indexed-with-displacement mode, with dramatic results. Taking the liberty of assuming that the list need not be preserved after traversal, the list is built and summed in batches small enough to fit in the zero page.

Code:
                          00004 ; 328611 ($503A3) cycles
                          00005 ; 256134 ($3E886) cycles after inlining subroutines
                          00006
 0000 08                  00007 Free     fcb    Heap
 0001 4B                  00008 J        fcb    75        ; Inner loop runs 75 times
 0002 0000 0000           00009 Total    fdb    0,0
 0006 0BB7                00010 I        fdb    2999
                          00011
 0008                     00012 Heap:
                          00013
 0200                     00014          org    $200
                          00015
                          00016 ;
                          00017 ; Allocate a node
                          00018 ;
 0200                     00019 Alloc:
 0200 A5 00           [3] 00020          lda    Free      ; Allocate a node
 0202 AA              [2] 00021          tax              ; Return it in X
                          00022
 0203 18              [2] 00023          clc              ; Point to free memory
 0204 69 03           [2] 00024          adc    #3
 0206 85 00           [3] 00025          sta    Free
                          00026
 0208 60              [6] 00027          rts
                          00028
                          00029 ;
                          00030 ; Prepend a node to the list
                          00031 ;
 0209                     00032 Prepend:
 0209 20 0200         [6] 00033          jsr    Alloc     ; Allocate a new node
                          00034
 020C 94 00           [4] 00035          sty    0,X       ; Point next of new node to Root
                          00036
 020E 8A              [2] 00037          txa              ; This is now the first node
 020F A8              [2] 00038          tay
                          00039
 0210 A5 06           [3] 00040          lda    I         ; Store value in node
 0212 95 01           [4] 00041          sta    1,X
 0214 A5 07           [3] 00042          lda    I+1
 0216 95 02           [4] 00043          sta    2,X
                          00044
 0218 60              [6] 00045          rts
                          00046
                          00047 ;
                          00048 ; Traverse the list and sum the values
                          00049 ;
 0219                     00050 NoCarry:
 0219 B9 0000       [4/5] 00051          lda    0,Y       ; Point to the next node
                          00052
 021C F0 1A (0238)  [2/3] 00053          beq    Summed    ; Until end of list
                          00054
 021E A8              [2] 00055          tay
                          00056
 021F                     00057 Sum:
 021F 18              [2] 00058          clc              ; Add value of node to sum
 0220 A5 02           [3] 00059          lda    Total
 0222 79 0001       [4/5] 00060          adc    1,Y
 0225 85 02           [3] 00061          sta    Total
 0227 A5 03           [3] 00062          lda    Total+1
 0229 79 0002       [4/5] 00063          adc    2,Y
 022C 85 03           [3] 00064          sta    Total+1
 022E 90 E9 (0219)  [2/3] 00065          bcc    NoCarry
                          00066
 0230 E6 04           [5] 00067          inc    Total+2
                          00068
 0232 D0 E5 (0219)  [2/3] 00069          bne    NoCarry
                          00070
 0234 E6 05           [5] 00071          inc    Total+3
                          00072
 0236 D0 E1 (0219)  [2/3] 00073          bne    NoCarry   ; Always branches
                          00074
 0238                     00075 Summed:
 0238 60              [6] 00076          rts
                          00077
                          00078 ;
                          00079 ;
                          00080 ;
 0239                     00081 RepeatOuterLoop:
 0239 20 021F         [6] 00082          jsr    Sum       ; Traverse the list and add them up
                          00083
 023C A0 00           [2] 00084          ldy    #0        ; Clear list
                          00085
 023E A9 08           [2] 00086          lda    #Heap     ; Reset the heap
 0240 85 00           [3] 00087          sta    Free
                          00088
 0242 A9 4B           [2] 00089          lda    #75       ; Run inner loop 75 times
 0244 85 01           [3] 00090          sta    J
                          00091
 0246 D0 06 (024E)  [2/3] 00092          bne    InnerLoop ; Always branches
                          00093
 0248                     00094 NoBorrow:
 0248 C6 06           [5] 00095          dec    I
                          00096
 024A C6 01           [5] 00097          dec    J
 024C F0 EB (0239)  [2/3] 00098          beq    RepeatOuterLoop
                          00099
 024E                     00100 Main:
 024E                     00101 InnerLoop:
 024E 20 0209         [6] 00102          jsr    Prepend
                          00103
 0251 A5 06           [3] 00104          lda    I         ; Low byte of count zero?
 0253 D0 F3 (0248)  [2/3] 00105          bne    NoBorrow  ; No, no borrow
                          00106
 0255 C6 07           [5] 00107          dec    I+1       ; Borrow from upper byte
 0257 10 EF (0248)  [2/3] 00108          bpl    NoBorrow  ; Done if it goes negative
                          00109
 0259 20 021F         [6] 00110          jsr    Sum       ; Traverse the list and add them up
                          00111
 025C 00              [7] 00112          brk
                          00113
 024E                     00114          end    Main
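
For readers who would rather not trace the listing, here is a rough Python model of the batching idea. It is a sketch, not a translation of the code above: the names `memory`, `prepend`, `traverse_and_sum` and `run` are invented for illustration. It assumes, as the listing does, 3-byte nodes (a one-byte link plus a 16-bit value), a heap starting at address 8, batches of 75 nodes, values counting down from 2999 to 0, and 0 as the end-of-list marker.

```python
HEAP = 8          # first free zero-page address, as in the listing
NODE_SIZE = 3     # 1-byte link + 2-byte value
BATCH = 75        # nodes per batch; 8 + 3*75 = 233 stays below 256

memory = bytearray(256)   # stand-in for the zero page


def prepend(root, free, value):
    """Carve a node at `free`, link it ahead of `root`; return (new_root, new_free)."""
    node = free
    memory[node] = root                 # one-byte link, 0 means end of list
    memory[node + 1] = value & 0xFF     # low byte of the value
    memory[node + 2] = value >> 8       # high byte of the value
    return node, free + NODE_SIZE


def traverse_and_sum(root):
    """Follow the one-byte links and add up the 16-bit values."""
    total = 0
    node = root
    while node != 0:
        total += memory[node + 1] | (memory[node + 2] << 8)
        node = memory[node]
    return total


def run(count=3000):
    """Build the list in batches, summing and discarding each batch in turn."""
    total = 0
    root, free = 0, HEAP
    for i, value in enumerate(range(count - 1, -1, -1), start=1):
        root, free = prepend(root, free, value)
        if i % BATCH == 0 or i == count:
            total += traverse_and_sum(root)
            root, free = 0, HEAP        # drop the batch and reuse the memory
    return total


print(run())
```

Because every pointer is a single byte, a whole batch has to fit in 256 bytes. That is why the list is summed and thrown away every 75 nodes rather than kept whole.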


What is good for the goose is good for the gander and the gosling. The door is now open for the others to exploit their zero pages in the same manner.

Unfortunately, the 6800 cannot load the index register from a single byte. Its minor gain does not save it from last place.

Code:
                          00004 * 366717 ($5987D) cycles
                          00005 * 288067 ($46543) cycles after inlining subroutines
                          00006
 0000 000D                00007 Free     fdb    Heap
 0002 0000                00008 Root     fdb    0
 0004 0000                00009 New      fdb    0
 0006 0000 0000           00010 Total    fdb    0,0
 000A 3C                  00011 J        fcb    60        ; Inner loop runs 60 times
 000B 0BB7                00012 I        fdb    2999
                          00013
 000D                     00014 Heap     rmb    4*60
                          00015
                          00016 *
                          00017 * Allocate a node
                          00018 *
 00FD                     00019 Alloc
 00FD D6 01           [3] 00020          ldab   Free+1    ; Allocate a node
 00FF D7 05           [4] 00021          stab   New+1     ; Return it in New
                          00022
 0101 CB 04           [2] 00023          addb   #4        ; Point to free memory
 0103 D7 01           [4] 00024          stab   Free+1
                          00025
 0105 39              [5] 00026          rts
                          00027
                          00028 *
                          00029 * Prepend a node to the list
                          00030 *
 0106                     00031 Prepend
 0106 8D F5 (00FD)    [8] 00032          bsr    Alloc     ; Allocate a new node
                          00033
 0108 DE 04           [4] 00034          ldx    New       ; Point to the new node
                          00035
 010A D6 03           [3] 00036          ldab   Root+1    ; Point next of new node to Root
 010C E7 01           [6] 00037          stab   1,X
 010E 6F 00           [7] 00038          clr    ,X
                          00039
 0110 A7 03           [6] 00040          staa   3,X       ; Store value in node
 0112 D6 0B           [3] 00041          ldab   I
 0114 E7 02           [6] 00042          stab   2,X
                          00043
 0116 DF 02           [5] 00044          stx    Root      ; This is now the first node
                          00045
 0118 39              [5] 00046          rts
                          00047
                          00048 *
                          00049 * Traverse the list and sum the values
                          00050 *
 0119                     00051 Sum
 0119 DE 02           [4] 00052          ldx    Root
 011B 96 08           [3] 00053          ldaa   Total+2
 011D D6 09           [3] 00054          ldab   Total+3
                          00055
 011F                     00056 SumLoop
 011F EB 03           [5] 00057          addb   3,X       ; Add value of node to sum
 0121 A9 02           [5] 00058          adca   2,X
 0123 24 08 (012D)    [4] 00059          bcc    NoCarry
                          00060
 0125 7C 0007         [6] 00061          inc    Total+1
                          00062
 0128 26 03 (012D)    [4] 00063          bne    NoCarry
                          00064
 012A 7C 0006         [6] 00065          inc    Total
                          00066
 012D                     00067 NoCarry
 012D EE 00           [6] 00068          ldx    ,X        ; Point to the next node
 012F 26 EE (011F)    [4] 00069          bne    SumLoop   ; Until end of list
                          00070
 0131 97 08           [4] 00071          staa   Total+2
 0133 D7 09           [4] 00072          stab   Total+3
                          00073
 0135 39              [5] 00074          rts
                          00075
                          00076 *
                          00077 *
                          00078 *
 0136                     00079 RepeatOuterLoop
 0136 97 0C           [4] 00080          staa   I+1       ; Save low byte of count
                          00081
 0138 8D DF (0119)    [8] 00082          bsr    Sum       ; Traverse the list and add them up
                          00083
 013A DF 02           [5] 00084          stx    Root      ; Clear list
                          00085
 013C 86 0D           [2] 00086          ldaa   #Heap     ; Reset the heap
 013E 97 01           [4] 00087          staa   Free+1
                          00088
 0140 86 3C           [2] 00089          ldaa   #60       ; Run inner loop 60 times
 0142 97 0A           [4] 00090          staa   J
                          00091
 0144                     00092 Main
 0144                     00093 InnerLoop
 0144 96 0C           [3] 00094          ldaa   I+1       ; Get low byte of node value
                          00095
 0146 20 06 (014E)    [4] 00096          bra    IntoLoop
                          00097
 0148                     00098 NoBorrow
 0148 4A              [2] 00099          deca
                          00100
 0149 7A 000A         [6] 00101          dec    J
 014C 27 E8 (0136)    [4] 00102          beq    RepeatOuterLoop
                          00103
 014E                     00104 IntoLoop
 014E 8D B6 (0106)    [8] 00105          bsr    Prepend
                          00106
 0150 4D              [2] 00107          tsta             ; Low byte of count zero?
 0151 26 F5 (0148)    [4] 00108          bne    NoBorrow  ; No, no borrow
                          00109
 0153 7A 000B         [6] 00110          dec    I         ; Borrow from upper byte
 0156 2A F0 (0148)    [4] 00111          bpl    NoBorrow  ; Done if it goes negative
                          00112
 0158 8D BF (0119)    [8] 00113          bsr    Sum       ; Traverse the list and add them up
                          00114
 015A 3F             [12] 00115          swi
                          00116
 0144                     00117          end    Main


Even though the 8080 architecture does not directly favor the first 256 bytes of its memory space, single-byte addresses present a huge opportunity given its clumsy addressing capabilities. Some tricky coding keeps the 8080 solidly in second place.

Code:
                          00004 ; 757274 (0B8E1Ah) cycles
                          00005 ; 594194 (091112h) cycles after inlining subroutines
                          00006
 0000 0000 0000           00007 Total   dw      0,0
 0004 0BB7                00008 I       dw      2999
                          00009
 0006                     00010 Heap    ds      3*75
                          00011
                          00012 ;
                          00013 ; Allocate a node
                          00014 ;
 00E7                     00015 Alloc:
 00E7 6B              [5] 00016         mov     L,E             ; Allocate a node to return in HL
                          00017
 00E8 1C              [5] 00018         inr     E               ; Point to free memory
 00E9 1C              [5] 00019         inr     E
 00EA 1C              [5] 00020         inr     E
                          00021
 00EB C9             [10] 00022         ret
                          00023
                          00024 ;
                          00025 ; Prepend a node to the list
                          00026 ;
 00EC                     00027 Prepend:
 00EC CD 00E7        [17] 00028         call    Alloc           ; Allocate a new node
                          00029
 00EF 72              [7] 00030         mov     M,D             ; Point next of new node to Root
                          00031
 00F0 55              [5] 00032         mov     D,L             ; This is the new first node
                          00033
 00F1 23              [5] 00034         inx     H               ; Store value in node
 00F2 71              [7] 00035         mov     M,C
 00F3 23              [5] 00036         inx     H
 00F4 3A 0005        [13] 00037         lda     I+1
 00F7 77              [7] 00038         mov     M,A
                          00039
 00F8 C9             [10] 00040         ret
                          00041
                          00042 ;
                          00043 ; Traverse the list and sum the values
                          00044 ;
 00F9                     00045 Sum:
 00F9 6A              [5] 00046         mov     L,D             ; Point HL to first node
 00FA EB              [4] 00047         xchg
 00FB 2A 0000        [16] 00048         lhld    Total           ; Load cumulative total
 00FE EB              [4] 00049         xchg
                          00050
 00FF                     00051 SumLoop:
 00FF 23              [5] 00052         inx     H               ; Add value of node to sum
 0100 4E              [7] 00053         mov     C,M
 0101 23              [5] 00054         inx     H
 0102 46              [7] 00055         mov     B,M
 0103 EB              [4] 00056         xchg
 0104 09             [10] 00057         dad     B
 0105 EB              [4] 00058         xchg
 0106 D2 010E        [10] 00059         jnc     NoCarry
                          00060
 0109 7D              [5] 00061         mov     A,L             ; Carry to high word
 010A 2E 02           [7] 00062         mvi     L,Total+2
 010C 34             [10] 00063         inr     M
 010D 6F              [5] 00064         mov     L,A
                          00065
 010E                     00066 NoCarry:
 010E 2B              [5] 00067         dcx     H               ; Point to the next node
 010F 2B              [5] 00068         dcx     H
 0110 6E              [7] 00069         mov     L,M
                          00070
 0111 7D              [5] 00071         mov     A,L
 0112 B5              [4] 00072         ora     L               ; Until end of list
 0113 C2 00FF        [10] 00073         jnz     SumLoop
                          00074
 0116 EB              [4] 00075         xchg
 0117 22 0000        [16] 00076         shld    Total
 011A EB              [4] 00077         xchg                    ; Leave 0 in HL
                          00078
 011B C9             [10] 00079         ret
                          00080
                          00081 ;
                          00082 ;
                          00083 ;
 011C                     00084 Main:
 011C 26 00           [7] 00085         mvi     H,0
                          00086
 011E 3A 0004        [13] 00087         lda     I
 0121 4F              [5] 00088         mov     C,A
                          00089
 0122 C3 012A        [10] 00090         jmp     IntoOuterLoop
                          00091
 0125                     00092 RepeatOuterLoop:
 0125 C5             [11] 00093         push    B
                          00094
 0126 CD 00F9        [17] 00095         call    Sum             ; Traverse the list and add them up
                          00096
 0129 C1             [10] 00097         pop     B
                          00098
 012A                     00099 IntoOuterLoop:
 012A 06 4B           [7] 00100         mvi     B,75
                          00101
 012C 11 0006        [10] 00102         lxi     D,Heap          ; Reset the heap and clear list
                          00103
 012F C3 0137        [10] 00104         jmp     IntoLoop
                          00105
 0132                     00106 NoBorrow:
 0132 0D              [5] 00107         dcr     C
                          00108
 0133 05              [5] 00109         dcr     B
 0134 CA 0125        [10] 00110         jz      RepeatOuterLoop
                          00111
 0137                     00112 IntoLoop:
 0137 CD 00EC        [17] 00113         call    Prepend
                          00114
 013A 79              [5] 00115         mov     A,C             ; Low byte of count zero?
 013B B1              [4] 00116         ora     C
 013C C2 0132        [10] 00117         jnz     NoBorrow        ; No, no borrow
                          00118
 013F 2E 05           [7] 00119         mvi     L,I+1           ; Borrow from upper byte
 0141 35             [10] 00120         dcr     M
 0142 F2 0132        [10] 00121         jp      NoBorrow        ; Done if it goes negative
                          00122
 0145 CD 00F9        [17] 00123         call    Sum             ; Traverse the list and add them up
                          00124
 0148 76              [7] 00125         hlt
                          00126
 011C                     00127         end     Main


Posted: Sat Apr 02, 2022 5:18 am

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
A change in register usage narrows the gap but does not alter the standings.

Code:
                          00007 * 360323 ($57F83) cycles
                          00008 * 281673 ($44C49) cycles after inlining subroutines
                          00009
 0000 000D                00010 Free     fdb    Heap
 0002 0000                00011 Root     fdb    0
 0004 0000                00012 New      fdb    0
 0006 0000 0000           00013 Total    fdb    0,0
 000A 3C                  00014 J        fcb    60        ; Inner loop runs 60 times
 000B 0BB7                00015 I        fdb    2999
                          00016
 000D                     00017 Heap     rmb    4*60
                          00018
                          00019 *
                          00020 * Allocate a node
                          00021 *
 00FD                     00022 Alloc
 00FD 97 05           [4] 00023          staa   New+1     ; Return it in New
                          00024
 00FF 8B 04           [2] 00025          adda   #4        ; Point to free memory
                          00026
 0101 39              [5] 00027          rts
                          00028
                          00029 *
                          00030 * Prepend a node to the list
                          00031 *
 0102                     00032 Prepend
 0102 8D F9 (00FD)    [8] 00033          bsr    Alloc     ; Allocate a new node
                          00034
 0104 DE 04           [4] 00035          ldx    New       ; Point to the new node
                          00036
 0106 E7 03           [6] 00037          stab   3,X       ; Store value in node
 0108 D6 0B           [3] 00038          ldab   I
 010A E7 02           [6] 00039          stab   2,X
                          00040
 010C D6 03           [3] 00041          ldab   Root+1    ; Point next of new node to Root
 010E E7 01           [6] 00042          stab   1,X
 0110 6F 00           [7] 00043          clr    ,X
                          00044
 0112 DF 02           [5] 00045          stx    Root      ; This is now the first node
                          00046
 0114 39              [5] 00047          rts
                          00048
                          00049 *
                          00050 * Traverse the list and sum the values
                          00051 *
 0115                     00052 Sum
 0115 DE 02           [4] 00053          ldx    Root
 0117 96 08           [3] 00054          ldaa   Total+2
 0119 D6 09           [3] 00055          ldab   Total+3
                          00056
 011B                     00057 SumLoop
 011B EB 03           [5] 00058          addb   3,X       ; Add value of node to sum
 011D A9 02           [5] 00059          adca   2,X
 011F 24 08 (0129)    [4] 00060          bcc    NoCarry
                          00061
 0121 7C 0007         [6] 00062          inc    Total+1
                          00063
 0124 26 03 (0129)    [4] 00064          bne    NoCarry
                          00065
 0126 7C 0006         [6] 00066          inc    Total
                          00067
 0129                     00068 NoCarry
 0129 EE 00           [6] 00069          ldx    ,X        ; Point to the next node
 012B 26 EE (011B)    [4] 00070          bne    SumLoop   ; Until end of list
                          00071
 012D 97 08           [4] 00072          staa   Total+2
 012F D7 09           [4] 00073          stab   Total+3
                          00074
 0131 39              [5] 00075          rts
                          00076
                          00077 *
                          00078 *
                          00079 *
 0132                     00080 RepeatOuterLoop
 0132 8D E1 (0115)    [8] 00081          bsr    Sum       ; Traverse the list and add them up
                          00082
 0134 DF 02           [5] 00083          stx    Root      ; Clear list
                          00084
 0136 86 3C           [2] 00085          ldaa   #60       ; Run inner loop 60 times
 0138 97 0A           [4] 00086          staa   J
                          00087
 013A                     00088 Main
 013A 86 0D           [2] 00089          ldaa   #Heap     ; Reset the heap
                          00090
 013C D6 0C           [3] 00091          ldab   I+1       ; Load low byte of count
                          00092
 013E 20 08 (0148)    [4] 00093          bra    IntoLoop
                          00094
 0140                     00095 NoBorrow
 0140 5A              [2] 00096          decb
 0141 D7 0C           [4] 00097          stab   I+1
                          00098
 0143 7A 000A         [6] 00099          dec    J
 0146 27 EA (0132)    [4] 00100          beq    RepeatOuterLoop
                          00101
 0148                     00102 IntoLoop
 0148 8D B8 (0102)    [8] 00103          bsr    Prepend
                          00104
 014A D6 0C           [3] 00105          ldab   I+1       ; Low byte of count zero?
 014C 26 F2 (0140)    [4] 00106          bne    NoBorrow  ; No, no borrow
                          00107
 014E 7A 000B         [6] 00108          dec    I         ; Borrow from upper byte
 0151 2A ED (0140)    [4] 00109          bpl    NoBorrow  ; Done if it goes negative
                          00110
 0153 8D C0 (0115)    [8] 00111          bsr    Sum       ; Traverse the list and add them up
                          00112
 0155 3F             [12] 00113          swi
                          00114
 013A                     00115          end    Main


Posted: Thu Sep 08, 2022 9:10 am

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
In another thread, barrym95838 wrote:
barrym95838 wrote:
The 6809 designers had P.I. code directly in mind when they designed it


To which I wrote:
BillG wrote:
I have written one 6809 program using PIC and it was no walk in the park, piece of cake or bed of roses.

That program is the resident part of a system extension to add command recall, aka history, to FLEX. The 6800 and 6502 versions use a relocation bitmap, similar to how MOVCPM worked on an 8080: assemble the code at two addresses a multiple of 256 bytes apart, and the bytes which differ are the ones that must be adjusted to relocate the code. The drawback is that the code can only be placed at addresses differing from the original location by whole pages.
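
A minimal Python sketch of that two-assembly trick, for illustration only. The helper names (`build_bitmap`, `relocate`) and the six-byte toy images are made up here; this is not the actual FLEX extension or MOVCPM code. It assumes the two images were assembled exactly `delta_pages` pages apart, so the only bytes that differ are high bytes of absolute addresses. It also keeps the "bitmap" as a plain list of booleans rather than packed bits.

```python
def build_bitmap(image_a, image_b, delta_pages):
    """Mark the bytes that differ between two assemblies of the same code.

    image_b is assumed to be assembled delta_pages * 256 bytes above image_a,
    so every differing byte should be the high byte of an absolute address.
    """
    if len(image_a) != len(image_b):
        raise ValueError("images must be the same length")
    bitmap = []
    for a, b in zip(image_a, image_b):
        differs = a != b
        if differs and ((b - a) & 0xFF) != (delta_pages & 0xFF):
            raise ValueError("unexpected difference; not a page relocation")
        bitmap.append(differs)
    return bitmap


def relocate(image, bitmap, pages):
    """Return a copy of image moved by a whole number of 256-byte pages."""
    return bytes((byte + pages) & 0xFF if flag else byte
                 for byte, flag in zip(image, bitmap))


# Hypothetical 6800 fragment: ldx #$1005 / jmp $1000, assembled at $1000 and $1100.
at_1000 = bytes([0xCE, 0x10, 0x05, 0x7E, 0x10, 0x00])
at_1100 = bytes([0xCE, 0x11, 0x05, 0x7E, 0x11, 0x00])

bitmap = build_bitmap(at_1000, at_1100, 1)
moved = relocate(at_1000, bitmap, 0x20)     # move up 32 pages, to $3000
print(moved.hex())
```

Only the high bytes are patched, which is exactly why the placement granularity is a whole page.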

I thought that PIC would allow me to put the code anywhere and not waste up to a page of memory. It did, but one programming idiom was not easy to implement.

In absolute 6800 code, I can write:
Code:
    cpx     #TheEnd

to compare register X with the address of the end of a data structure.

In PIC 6809 code I cannot write:
Code:
    cmpx     #TheEnd,PCR


I can write:
Code:
    leay    TheEnd,PCR

to load the address into a register, but I cannot easily compare it with another register. I had to push it onto the stack and compare it there:
Code:
    leay    TheEnd,PCR
    pshs    Y
    cmpx    ,S++

which is substantially larger and slower.

If only Motorola had given me four "compare effective address" instructions: ceax, ceay, ceau and ceas...

To which BigEd wrote:
BigEd wrote:
That doesn't seem so bad - you found an idiom which works and you could wrap it in a macro.

To which I wrote:
BillG wrote:
The initial intended target of the 6809 version of the program is a Peripheral Technologies PT69-5 single board computer.

It is based on a 6809 processor running at 2 MHz. Unlike the 6502, evolution of the 6809 stopped when Motorola moved on to bigger things, the 680x0 family. FLEX on a 6809 officially supports up to 56 KBytes of RAM. I specifically chose to target the PT69-5 because it has an additional 4 KBytes, much of which is otherwise unused; my code and the history buffer reside there so that the amount of memory available for program use remains the same.

Instead of the push and compare sequence in my last post, I ended up computing the effective address once during program initialization, storing the result in a variable, and comparing pointers against that variable later, which is much smaller and faster. The point is that position independent code on the 6809 does not come free, though it is easier than on most other architectures.

To which Dr Jefyll wrote:
Dr Jefyll wrote:
BillG wrote:
Unlike the 6502, evolution of the 6809 stopped when Motorola moved on to bigger things
Maybe Motorola moved on, but there's definitely some further-evolved 6809 DNA out there. :) Back in the day, Hitachi out-6809ed Motorola by an embarrassing margin with their backward compatible 6309. "To the 6809 specifications, it adds higher clock rates, enhanced features, new instructions, and additional registers. Most of the new instructions were added to support the additional registers, as well as up to 32-bit math, hardware division, bit manipulations, and block transfers."

Also noteworthy are the MC9S12 and HCS12 series. These guys have "only" three stackpointer/index regs (U got turfed) but otherwise seem very 6809-like. Unfortunately this resemblance includes the 6809's very high prevalence of dead cycles, but on the plus side the MC9S12 and HCS12 are in current production by NXP and presumably clock a great deal faster.

-- Jeff


The Hitachi 6309 is to the 6809 as the 65816 is to the 6502: it adds more instructions, registers and addressing modes along with a "native" mode. Even though it put its resources into the 68000 and not into further evolving the 6809, Motorola applied legal pressure to keep Hitachi from documenting the "improvements" they made. It took the hacker community to do that.

The evolution of the Motorola 8-bit line is more like

6800 --> 6801 --> HC11 --> HC12 --> HC16
|
+--> 6809

Unfortunately, when they went to the 6809, Motorola dropped many of the efficient inherent instructions in favor of generalized versions. Among those lost were TAB, TBA, ABA, SBA, CBA, INX, DEX, PSHA, PSHB, PULA, PULB. The generalized instructions were bigger and slower. Some dropped instructions were not replaced; the recommended workarounds were seriously inefficient in some cases.

One of the 6809 designers later wrote a master's thesis looking back on the design decisions:

https://www.cs.utexas.edu/ftp/techreports/tr81-206a.pdf

Interesting reading. He said that adding the direct page register was a mistake. I disagree; though I have not used that capability, it apparently did not cost much to implement.

To add to what I previously said:

BillG wrote:
If only Motorola had given me four "compare effective address" instructions: ceax, ceay, ceau and ceas...


I would also add "lead" to load into, and "cead" to compare with, the combined accumulator D. It would have been really nice to be able to do things like:
Code:
    cead    7,X
    ceax    D,Y


Posted: Thu Sep 08, 2022 11:08 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10818
Location: England
Quote:
One of the 6809 designers later wrote a master's thesis looking back on the design decisions


That's a good find, thanks! There's a condensed version of some of the same ideas here
https://dl.acm.org/doi/pdf/10.1145/1500676.1500739
although it's not the same document. (But it does have a text layer so it's easier to copy and paste!)

Quote:
we wanted to prove that it was possible to produce an inexpensive microprocessor that was also easy to program. We felt that too many of the existing microprocessors were needlessly difficult to program. We suspected that the reason was not that it was impossible to make a microcomputer that was easy to program, but, rather, that the architects of the early microprocessors were generally more hardware oriented than software oriented.


Quote:
We wanted the architecture to efficiently support modern block-structured high-level languages. Features such as stack addressing were included for this purpose. We also wanted to better support assembly language with the ability to write recursive, reentrant programs and position-independent programs.


Edit: I notice Joel Boney's thesis is split into 50-page chunks:
https://www.cs.utexas.edu/ftp/techreports/tr81-206a.pdf
https://www.cs.utexas.edu/ftp/techreports/tr81-206b.pdf
https://www.cs.utexas.edu/ftp/techreports/tr81-206c.pdf


Posted: Fri Sep 09, 2022 8:04 am

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
Thanks. I did not realize that.

I have more reading to do...


PostPosted: Fri Sep 09, 2022 10:24 am 

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
I quickly skimmed the last two documents. A more detailed read will have to wait for the weekend.

My initial impressions about his two sets of papers:

* His samples were too small. Granted, the studies were done before Al Gore invented the Internet and the whole open source phenomenon arose. The choice of the 6839 floating point library ROM is going to be biased toward position-independent code and away from extended and direct addressing. The monitor, as system software, will avoid the use of the direct page so that application programs can have it.

* He said that lbsr, leax, pshs were the top three most frequently appearing opcodes.

680x programmers are used to coding BSR rather than JSR to call a subroutine because it is both smaller and faster. On the 6800, if the assembler said that a branch was out of range, you changed it to JSR. On the 6809, it is much easier to add 'L' to make it LBSR even though it is a cycle slower than JSR. Some programmers may have gotten into the habit of always coding "LBSR" instead of "BSR" or "JSR". I do not believe, as he does, that a majority of programmers strove for position-independent code.
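
To put rough numbers on that habit, here is a sketch of the three ways to call the same routine on the 6809. Sub is a made-up label, and the sizes and timings should match the MC6809 data sheet (extended addressing for the JSR) but are worth verifying:
Code:
    bsr     Sub       ; 2 bytes, 7 cycles, target must be within -128..+127 bytes
    lbsr    Sub       ; 3 bytes, 9 cycles, 16-bit relative, reaches anywhere
    jsr     Sub       ; 3 bytes, 8 cycles, absolute address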

Likewise, it is much easier to change an out-of-range short conditional branch to a long one than to insert and branch around an absolute jump, as was the case on the 6800. Long conditional branches are good, and not because they are position independent.
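
A sketch of the difference, assuming a target Far that is out of short-branch range (both labels are hypothetical). On the 6800 the test has to be inverted to hop over an absolute jump:
Code:
    bne     Skip      ; inverted test, 2 bytes
    jmp     Far       ; absolute jump, 3 bytes
Skip:

On the 6809 the original test stays as written:
Code:
    lbeq    Far       ; 4 bytes, 5 cycles (6 if the branch is taken)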

By getting rid of INX and DEX, what did he think was going to happen? He did not filter out "LEAX 1,X" and "LEAX -1,X" for special consideration.
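
For reference, the replacements in question look like this. The counts should match the data sheets (6800 INX and DEX are 1 byte and 4 cycles each) but are worth verifying:
Code:
    leax    1,X       ; stands in for INX: 2 bytes, 5 cycles
    leax    -1,X      ; stands in for DEX: 2 bytes, 5 cycles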

PSHA and PULA were very commonly used on the 6800 with PSHB and PULB slightly less so. Those were moved from the 6800 "inherent" bucket to the 6809 pshs/puls one. With more than one index register, there is nowhere near the need for pushing and pulling X as was the case on the 6800.
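
A sketch of how those 6800 one-byte instructions map onto the 6809. The timings assume the usual 5 cycles plus one per byte moved, which should be checked against the data sheet:
Code:
    pshs    a         ; was PSHA: now 2 bytes, 6 cycles
    puls    a         ; was PULA: now 2 bytes, 6 cycles
    pshs    b,x       ; B and X saved by one 2-byte instruction, 8 cycles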

* He said that the indexed addressing mode was more heavily used on the 6809 than the 6800.

Again, by getting rid of INX and DEX, usage was moved from the 6800 "inherent" bucket to the 6809 "indexed" one.

* He said that indirect addressing modes were seldom used.

His survey was done before the use of high-level programming languages, especially C, significantly displaced assembly language on 8-bit microprocessors. Things like "LDA [6,S]" are likely common in code generated by compilers.
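
As a hedged illustration, suppose a compiler has left a pointer argument at offset 6 from S (the offset is simply the one used in the example above). The indirect form fetches through it in one instruction, where the non-indirect version needs an index register:
Code:
    lda     [6,S]     ; A = byte addressed by the pointer stored at 6,S

    ldx     6,S       ; same result in two steps,
    lda     ,X        ; at the cost of clobbering X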

Something like "LDA [6,S+]" is not very useful. It would be different if it were the address at "6,S" instead of the S register that was being incremented, but that is not supported.

* The answer for ABA, SBA and CBA not being orthogonal is to add AAB, SAB and CAB. Those are very useful instructions to have. The 6809 has around thirty unused single-byte opcodes; I would have also kept the single-byte forms of INX, DEX, PSHA, PULA, PSHB, PULB as the 6801, HC11, etc. did.


PostPosted: Sat Sep 10, 2022 2:23 pm 

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
As I read about the limited amount of software available to him for analysis, I keep getting a feeling of "lost opportunity."

Southwest Technical Products (SWTPC) was less than a hundred miles down the road in San Antonio. The SWTPC 6800 computer was the most common computer built around the 6800 processor; it was mentioned quite often in magazines such as Popular Electronics.

SWTPC offered a good selection of software at very inexpensive prices including BASIC interpreters as well as a native text editor and assembler.

When the time came to do his "post mortem" analysis, SWTPC was selling a 6809 computer running FLEX and an even larger selection of software.

At that time, Motorola was making huge bucks selling processors to be used in vehicle engine controllers so they may have intentionally chosen to ignore "toy" computers.


PostPosted: Sat Sep 10, 2022 10:14 pm 

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
BillG wrote:
In another thread, barrym95838 wrote:
barrym95838 wrote:
The 6809 designers had P.I. code directly in mind when they designed it


Parts 2 and 3 of the Byte articles:
https://archive.org/details/byte-magazi ... ew=theater
https://archive.org/details/byte-magazi ... ew=theater

One of the things which bug me about programming the 6809 is that the test memory instruction TST takes one more cycle than the instruction to load the byte into a register. The processor treats it like a read-modify-write instruction without the write.


PostPosted: Sun Sep 11, 2022 1:59 am 

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3362
Location: Ontario, Canada
BillG wrote:
the test memory instruction TST takes one more cycle than the instruction to load the byte into a register.
And a whole raft of the simplest, most often-used instructions take one more cycle on 6809 than the equivalent instruction on 6502. :roll:

    LDA direct-page (and similar examples) 4 cycles instead of 3
    LDA absolute (and similar examples) 5 cycles instead of 4
    INC direct-page (and similar examples) 6 cycles instead of 5
    INC absolute (and similar examples) 7 cycles instead of 6

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


PostPosted: Sun Sep 11, 2022 6:35 am 

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
In section 6.1 of https://www.cs.utexas.edu/ftp/techreports/tr81-206b.pdf, he mentions that the 6801 was internally faster than the 6809 and the instructions which could have been made a cycle faster:

* load effective address (lea*)
* subroutine calls (jsr, bsr, lbsr)
* pushes and pulls
* direct and extended addressing modes
* possibly indexed mode
* an additional cycle off of 5-bit offset indexing (these two would have really helped the new forms of INX and DEX)

I suspect that they faced enormous pressure to ship the 6809 before the upcoming 68000 made it less relevant. The 68000 did not arrive in quantity until late 1980, so I wish they had taken the time to do that optimization.

I am still chapped that they did not base TST on the LDA or LDB logic instead of the INC or DEC.


PostPosted: Sun Sep 11, 2022 11:49 am 

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3362
Location: Ontario, Canada
BillG wrote:
I suspect that they faced enormous pressure to ship the 6809 before the upcoming 68000 made it less relevant.
Could be, I guess. But the MC9S12 and HCS12 are plagued with dead cycles just like their 6809 ancestor. :|

Hitachi, however, was highly successful in eliminating dead cycles (on the 6309, I mean).

-- Jeff



PostPosted: Thu Sep 22, 2022 11:47 am 

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
There is an ambiguity in the 6809 I finally got around to testing on real hardware.

If I do this:
Code:
    ldx     #0
    leax    ,-X

X obviously contains $FFFF.

But if I do this:
Code:
    ldx     #0
    leax    ,X+

does X contain 0 or 1?

It would depend upon whether the load is done first or the increment.

I will reveal the answer later while you guess and discuss...


PostPosted: Thu Sep 22, 2022 3:06 pm 

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1932
Location: Sacramento, CA, USA
I ran into that conundrum while designing my own 65m32a architecture (a decades-long work in progress), and decided that the increment should be after the potential read of the effective address but before the load (or store), resulting in X=1. I still don't know if that decision is going to bite me in the future.

[Edit: actually I don't believe I thought this particular case through, because it (LDX #,X+) is not allowed in 65m32a assembly. STX ,X+ is allowed, and would result in a one being stored in location zero.]

For the 65xx, here's a loosely related puzzle that almost certainly has a stable solution (disregarding an untimely interrupt):
Code:
    org $0100
    ldx #5
    txs
    jsr $1234
    brk
Where did we jsr? Was it $1234 or somewhere else? Maybe $0134?



PostPosted: Fri Sep 23, 2022 10:48 am 

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 693
Location: North Tejas
barrym95838 wrote:
I ran into that conundrum while designing my own 65m32a architecture (a decades-long work in progress), and decided that the increment should be after the potential read of the effective address but before the load (or store), resulting in X=1. I still don't know if that decision is going to bite me in the future.

[Edit: actually I don't believe I thought this particular case through, because it (LDX #,X+) is not allowed in 65m32a assembly. STX ,X+ is allowed, and would result in a one being stored in location zero.]

Did you ever get around to implementing it, either in simulation or actual hardware?

barrym95838 wrote:
For the 65xx, here's a loosely related puzzle that almost certainly has a stable solution (disregarding an untimely interrupt):
Code:
    org $0100
    ldx #5
    txs
    jsr $1234
    brk
Where did we jsr? Was it $1234 or somewhere else? Maybe $0134?

Interesting. Someone with single stepping capability will have to try it on real hardware.

My simulator, which I in no way claim is authoritative, jumps to $1234 with the instruction changed to jsr $105.
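
For anyone following along, this is how the snippet assembles (standard 6502 opcodes; the addresses follow from the org):
Code:
    org $0100
    ldx #5            ; $0100: A2 05
    txs               ; $0102: 9A        S = $05, so the next push goes to $0105
    jsr $1234         ; $0103: 20 34 12  operand low byte at $0104, high byte at $0105
    brk               ; $0106: 00

The two return-address bytes, $01 then $05, land on $0105 and $0104, which is exactly the operand of the jsr. The destination therefore depends on whether the processor reads the high operand byte before or after the pushes. Published cycle-by-cycle descriptions of the NMOS 6502 put that read after the pushes, which would give $0134; treat that as an assumption until it is confirmed on real hardware.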

As for
BillG wrote:
Code:
    ldx     #0
    leax    ,X+

does X contain 0 or 1?

the result is 0.

I had no idea going in which it was going to be. My guess at this point is that the indexing logic (which is agnostic as to the type of instruction being executed) loads the effective address into an internal register for use by the ALU, incrementing X at the end of the process. That internal register is copied to X in the final "write back" stage of the instruction.

It is interesting that the 68000 does not allow
Code:
    lea    (A1)+,A1

