This is more of a JFYI, and me being annoying because I can't resist when I see Z80 code... sorry. :)
MULADD:
or a
jr z, DONE ; weight=0: skip entirely
jp m, NEG ; weight<0: subtract
; weight=+1: add activation
ld hl, (ACC)
add hl, de
ld (ACC), hl
ret
NEG:
cp 0FFh
jr z, NEG1 ; weight=-1
; weight=-2: subtract twice
ld hl, (ACC)
sbc hl, de
sbc hl, de
ld (ACC), hl
ret
NEG1:
; weight=-1: subtract once
ld hl, (ACC)
sbc hl, de
ld (ACC), hl
ret
I believe you can speed it up a bit on machines with stock Zilog instruction timing, like the ZX Spectrum (I haven't tested the code, but it "should" work). On machines like the CPC, where machine cycles are forced to be multiples of 4T, some paths will be slightly slower, but overall it should still be faster:
(On the ZX: -3T for weight +1, -11T for weight 0, -10T for weight -1 and -9T for weight -2, if I'm counting it correctly in my head.)
MULADD:
or a
ret z ; weight=0: skip entirely
ld hl,ACC
jp m, NEG ; weight<0: subtract
; weight=+1: add activation
ld a,e
add a,(hl)
ld (hl),a
inc hl ; can be `inc l` if ACC is ALIGN 2
ld a,d
adc a,(hl)
ld (hl),a
ret
NEG:
inc a
jr z, NEG1 ; weight=-1: subtract once
rl e ; CF=0 from `or a`
rl d ; weight=-2: subtract twice (DE*=2)
NEG1:
ld a,(hl) ; load the ACC byte first, so the result is ACC-DE (not DE-ACC)
sub e
ld (hl),a
inc hl ; can be `inc l` if ACC is ALIGN 2
ld a,(hl)
sbc a,d
ld (hl),a
ret
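The timing estimate above was counted from memory, so here is a small sketch for re-checking it. It sums the documented Zilog T-states along each path of the two listings exactly as shown here. Two assumptions of mine: `DONE` is a bare `ret`, and the byte-wise add/subtract body is one 4T register op plus two 7T `(hl)` ops per byte. The names `T`, `OLD`, `NEW` and `cost` are just for illustration.

```python
# Standard Zilog Z80 T-states (no CPC-style stretching of machine cycles).
T = {
    "or a": 4, "inc a": 4, "cp n": 7,
    "jr cc taken": 12, "jr cc not taken": 7,
    "jp cc": 10,
    "ret": 10, "ret cc taken": 11, "ret cc not taken": 5,
    "ld hl,nn": 10, "ld hl,(nn)": 16, "ld (nn),hl": 16,
    "add hl,de": 11, "sbc hl,de": 15,
    "rl r": 8, "inc hl": 6,
    "reg op": 4,    # ld a,r / sub r / sbc a,r
    "(hl) op": 7,   # ld a,(hl) / add a,(hl) / sub (hl) / ld (hl),a
}

# Byte-wise 16-bit add or subtract through HL (low byte, inc hl, high byte).
BYTEWISE = ["reg op", "(hl) op", "(hl) op", "inc hl",
            "reg op", "(hl) op", "(hl) op"]

# Original routine; assumption: DONE is just `ret`.
OLD = {
    0:  ["or a", "jr cc taken", "ret"],
    1:  ["or a", "jr cc not taken", "jp cc",
         "ld hl,(nn)", "add hl,de", "ld (nn),hl", "ret"],
    -1: ["or a", "jr cc not taken", "jp cc", "cp n", "jr cc taken",
         "ld hl,(nn)", "sbc hl,de", "ld (nn),hl", "ret"],
    -2: ["or a", "jr cc not taken", "jp cc", "cp n", "jr cc not taken",
         "ld hl,(nn)", "sbc hl,de", "sbc hl,de", "ld (nn),hl", "ret"],
}

# Proposed routine.
HEAD = ["or a", "ret cc not taken", "ld hl,nn", "jp cc"]
NEW = {
    0:  ["or a", "ret cc taken"],
    1:  HEAD + BYTEWISE + ["ret"],
    -1: HEAD + ["inc a", "jr cc taken"] + BYTEWISE + ["ret"],
    -2: HEAD + ["inc a", "jr cc not taken", "rl r", "rl r"] + BYTEWISE + ["ret"],
}

def cost(path):
    """Total T-states of one execution path."""
    return sum(T[op] for op in path)

deltas = {w: cost(NEW[w]) - cost(OLD[w]) for w in OLD}

for w in (1, 0, -1, -2):
    print(f"weight {w:+d}: old {cost(OLD[w])}T, new {cost(NEW[w])}T, delta {deltas[w]:+d}T")
```

Under these assumptions the tally comes out as -11T for weight 0, +7T for weight +1, 0T for weight -1 and +1T for weight -2. Only the weight-0 figure matches the head-count above, so the numbers are worth re-checking against the real generated code. Using `inc l` (4T) instead of `inc hl` (6T) would take another 2T off each non-zero path.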
BTW, for weight -2 you are doing `sbc` twice, so the carry (borrow) from the first subtraction is fed into the second one and ends up in the accumulator from the bottom. I don't think that's intended, but it probably doesn't matter for the NN calculation, as it will skew the results only negligibly, if at all. (I guess the subtraction often doesn't overflow at all, and generally shouldn't, otherwise the accumulator would need more bits to hold the result properly. You may collect a few more of those stray overflow bits if the total oscillates around zero with several -2 weights.)
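To make the carry effect concrete, here is a minimal Python model of the weight -2 path of the original routine. It is a sketch, not a cycle-exact emulation, and the helper names (`sbc16`, `original_weight_minus2`, `intended_weight_minus2`) are made up for illustration. As far as I can tell, `cp 0FFh` with A = 0FEh also leaves CF set (0FEh < 0FFh), so the first `sbc hl,de` already starts with a borrow; the model includes that.

```python
MASK = 0xFFFF

def sbc16(hl, de, carry):
    """Model of Z80 `sbc hl,de`: returns (16-bit result, carry out)."""
    full = hl - de - carry
    return full & MASK, 1 if full < 0 else 0

def to_signed(value):
    """Interpret a 16-bit value as two's complement."""
    return value - 0x10000 if value & 0x8000 else value

def original_weight_minus2(acc, de):
    """ACC after the original `sbc hl,de` / `sbc hl,de` sequence."""
    carry = 1  # left by `cp 0FFh` when A = 0FEh
    hl, carry = sbc16(acc & MASK, de & MASK, carry)
    hl, carry = sbc16(hl, de & MASK, carry)
    return to_signed(hl)

def intended_weight_minus2(acc, de):
    """What the path presumably means to compute: ACC - 2*DE."""
    return to_signed((acc - 2 * de) & MASK)

for acc, de in [(100, 10), (5, 10), (100, -10)]:
    print(acc, de, original_weight_minus2(acc, de), intended_weight_minus2(acc, de))
```

For ACC=100, DE=10 the model gives 79 instead of 80, and for ACC=5, DE=10 (the result crosses zero, so the first `sbc` borrows) it gives -17 instead of -15. So the error stays at one or two counts per -2 weight, which fits the "negligible" guess above.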
And looking at the loop that calls MULADD, it could be sped up quite a bit further, but I find the code generator syntax very tiresome to read. I wish you would use an external assembler like sjasmplus and have the Python generate regular Z80 syntax code snippets as strings, so they can be read and edited in common syntax... :D (But that's my personal bias/preference.)
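Roughly what I have in mind, as a hypothetical sketch only: the function name `emit_add_path`, the label `ADD_ACT` and the `aligned` switch are invented for illustration and are not taken from the project's generator. The Python side just builds plain Z80 source text, which could then be written to an `.asm` file and handed to an external assembler.

```python
def emit_add_path(acc_label: str = "ACC", aligned: bool = False) -> str:
    """Return plain Z80 source text for the 'add activation' path.

    aligned=True assumes the accumulator never straddles a 256-byte page,
    so `inc l` can replace `inc hl`.
    """
    step = "inc l" if aligned else "inc hl"
    lines = [
        "ADD_ACT:",
        f"    ld hl,{acc_label}",
        "    ld a,e",
        "    add a,(hl)      ; low byte",
        "    ld (hl),a",
        f"    {step}",
        "    ld a,d",
        "    adc a,(hl)      ; high byte plus carry",
        "    ld (hl),a",
        "    ret",
    ]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(emit_add_path("ACC", aligned=True))
```

The generated text stays readable and editable by hand, and the generator only needs string formatting for the parts that actually vary (labels, addresses, alignment choices).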