My journey with ATTiny4313 (part 4)

Part 4: Dealing with interrupts

Managing the /STROBE line

The protocol between the driver and the Soundpool MO4

The communication between the Atari and the Soundpool MO4 is achieved via the parallel port (or printer port, sometimes called "Centronics") and a specific driver; I've already detailed the connection in this article.

What is /STROBE used for?

Since the parallel port is asynchronous, /STROBE tells the MO4 that a new byte is available.
/STROBE is also used to reset the MO4: when it is pulsed 4 times in a row, BUSY is expected to go high in less than 2 microseconds.
The driver requests a reset when it starts, and again after each batch of data has been sent.
The driver always sends data to the 4 outputs of the MO4, from output 1 to output 4. For each output, it first sends one byte indicating how many bytes will follow, then the data bytes themselves:

  1. Send data to the MO4:
    1. Send data for output 1
      1. Write the quantity of bytes to follow
      2. Pulse /STROBE: the MO4 reads the quantity
      3. Write byte #1 to the parallel port
      4. Pulse /STROBE: the MO4 reads byte #1 and sends it to MIDI
      5. Write byte #2 to the parallel port
      6. Pulse /STROBE: the MO4 reads byte #2 and sends it to MIDI
      7. ...
    2. Repeat for output 2
    3. Repeat for output 3
    4. Repeat for output 4
  2. Reset the MO4
    1. Write 0 to the parallel port
    2. Pulse /STROBE: 0 bytes to follow for output 1
    3. Pulse /STROBE: 0 bytes to follow for output 2
    4. Pulse /STROBE: 0 bytes to follow for output 3
    5. Pulse /STROBE: 0 bytes to follow for output 4
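
The sequence listed above can be sketched in C as a function that builds the list of values the MO4 latches, one entry per /STROBE pulse. This is only an illustration of the framing as I described it, not the driver's actual code: the `mo4_chunk` type and the `mo4_build_sequence` function are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MO4_OUTPUTS 4

/* Hypothetical container: the bytes queued for one MIDI output. */
typedef struct {
    const uint8_t *data;  /* may be NULL when len is 0 */
    uint8_t len;          /* quantity byte, sent before the data */
} mo4_chunk;

/*
 * Fill `out` with the values latched by the MO4, one entry per /STROBE
 * pulse: for each output, the quantity byte followed by the data bytes,
 * then the four zero-valued pulses of the reset.
 * Returns the number of pulses, or 0 if `out` is too small.
 */
size_t mo4_build_sequence(const mo4_chunk chunks[MO4_OUTPUTS],
                          uint8_t *out, size_t cap)
{
    size_t n = 0;

    assert(chunks != NULL && out != NULL);

    for (int i = 0; i < MO4_OUTPUTS; i++) {
        if (n + 1u + chunks[i].len > cap)
            return 0;
        out[n++] = chunks[i].len;               /* "how many bytes follow" */
        if (chunks[i].len > 0) {
            memcpy(&out[n], chunks[i].data, chunks[i].len);
            n += chunks[i].len;
        }
    }

    if (n + MO4_OUTPUTS > cap)
        return 0;
    for (int i = 0; i < MO4_OUTPUTS; i++)
        out[n++] = 0;                           /* reset: data lines held at 0 */

    return n;
}
```

With 3 bytes queued for output 1 and 1 byte for output 3, the function produces 12 pulses: 8 for the data phase and 4 for the reset.
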
(Assembler MC68000)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Driver write routines, called by MROS.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
driver_write_port1:
    ...
    bra.b  driver_write
driver_write_port2:
    ...
    bra.b  driver_write
driver_write_port3:
    ...
    bra.b  driver_write
driver_write_port4:
    ...
driver_write:
    ...
    bsr.w  reset_hardware
    ...
    bne.b  .dowrite
    ...
    rts
    
.dowrite:
    ...
    bsr.w   write_byte_hardware   ; Actually write the data to the hardware.
    ...
    rts

Timing issue

The tight timing doesn't allow the micro-controller to properly manage the signal (i.e. counting pulses in a time frame and raising BUSY accordingly). To overcome this issue, I decided to delegate the counting to an external chip, the decade counter 74LS90, in a divide-by-five configuration with CLKA connected to /STROBE and Qd connected to BUSY.
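
To give an idea of how tight this is, here is a rough cycle budget in C. It assumes the ATTiny runs at 8 MHz and uses the 2 microsecond BUSY deadline mentioned earlier; the 4-cycle figure is the minimum interrupt response time of an AVR core as far as I know, and the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: minimum interrupt response time of an AVR core, in cycles. */
#define AVR_IRQ_RESPONSE_CYCLES 4u

/* Number of whole clock cycles that fit in a deadline given in nanoseconds. */
uint32_t cycles_within(uint32_t f_cpu_hz, uint32_t deadline_ns)
{
    assert(f_cpu_hz > 0u);
    return (uint32_t)(((uint64_t)f_cpu_hz * deadline_ns) / 1000000000u);
}

/* Cycles left for the handler once the core has entered the interrupt. */
uint32_t cycles_left_in_isr(uint32_t f_cpu_hz, uint32_t deadline_ns)
{
    uint32_t total = cycles_within(f_cpu_hz, deadline_ns);
    return total > AVR_IRQ_RESPONSE_CYCLES
         ? total - AVR_IRQ_RESPONSE_CYCLES
         : 0u;
}
```

Under these assumptions, `cycles_within(8000000, 2000)` gives 16 cycles, of which at most 12 remain once the interrupt has been entered, before saving any register or counting any pulse.
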

Each low-to-high transition of /STROBE increments the counter. The outputs Qb and Qd are used to detect the changes: a change on Qb triggers PCINT11, which starts Timer 1, and Qd triggers INT1, which stops the timer. Otherwise, that is if the timer expires before Qd goes high, the ATTiny must reset the decade counter by pulsing R1 and R2 (pins 2 & 3) for at least 15 ns, which brings Qb, Qc and Qd back to their initial state.

Chronograms

Instructions and chronogram when data is sent

Note: on the Atari, the parallel port is managed via the sound chip, the Yamaha YM-2149. Two of its registers, which are not needed for sound generation, are used here: register #14 (port A) and register #15 (port B). As the code below shows, port A carries the /STROBE line (bit 5) and port B carries the data byte.

(Assembler MC68000)
write_byte_hardware:
    (...)
    lea     ym_base.w,a1 ; a1 contains the address of YM-2149
    lea     2(a1),a2     ; a2 = ym_data.

    move.b	#15,(a1)     ; Select the port B of YM-2149
    move.b	port_bitfield_copy(a5),(a2)	; Write port bitfield.
.loop:
    move.b	#14,(a1)     ; Select the port A of YM-2149
    move.b	d1,(a2)      ; Pulse strobe (bit 5)
    move.b	d0,(a2)
    move.b	#15,(a1)     ; Select the port B of YM-2149
    move.b	(a3)+,(a2)   ; Write data.
    dbf	d3,.loop

    move.b	#14,(a1)     ; Select the port A of YM-2149
    move.b	d1,(a2)      ; Pulse strobe.
    move.b	d0,(a2)
    
    move.b	mfp_gpio_reg.w,d0
    move.b	#$FE,interrupt_pending_reg.w ; Clear interrupt.
    and.w	#1,d0                        ; Set hardware_ready if busy line was 0.
    subq.w	#1,d0
    or.w	d0,hardware_ready(a5)

    move.w	#$2400,sr                    ; Interrupts on.
    movem.l	(a7)+,d0-d3/a0-a3
    rts
Given the timing sheet of the MC68000, this gives the following chronogram on /STROBE:

Instructions and chronogram when a reset is requested

(Assembler MC68000)
reset_hardware:
    (...)
    move.b	#15,ym_base.w    ; Select the port B of YM-2149
    move.b	#0,(a5)          ; Set data lines to 0.
    (...)
    move.b	d1,(a5)          ; Pulse strobe low-high four times.
    move.b	d0,(a5)          ; This gets the hardware back to a state where
    move.b	d1,(a5)          ; it's expecting a port bitfield, regardless
    move.b	d0,(a5)          ; of what state it's currently in.
    move.b	d1,(a5)
    move.b	d0,(a5)
    move.b	d1,(a5)
    move.b	d0,(a5)          ; End with strobe high.
    
    move.b	mfp_gpio_reg.w,d0            ; Fetch busy line status.
    move.b	#$FE,interrupt_pending_reg.w ; Clear busy interrupt.
    lea 	vars(pc),a5
    and.w	#1,d0                        ; Extract the bit 0 (busy input)
    eori.w	#1,d0                        ; Invert the bit: 0 = busy high = not ready, 1 = busy low = ready
    move.w	d0,hardware_ready(a5)        ; Conditionally mark hardware ready.

    move.w	(a7)+,sr
    movem.l	(a7)+,d0-d1/a5
    rts
The overall schematic is the following.

And now?

Speed

Understanding the protocol between the Atari and the MO4 better raised a question: is the ATTiny4313 still the best fit for this project? The decade counter (74LS90), which I chose somewhat by chance, turned out to be a real help since it drives the BUSY signal faster than the ATTiny could, but it also added a bit of complexity.

Originally, the Soundpool MO4 is built upon a CPLD, the ispLSI1016 from Lattice. For simple logic, CPLDs and FPGAs are faster than an 8-bit CPU clocked at 8 MHz. The ATTiny is nonetheless very efficient: most of its instructions execute in 1 or 2 clock cycles. For example, LDI Rd,K (load the immediate value K into register Rd) executes in 1 cycle on the ATTiny; the equivalent on the Atari's CPU, the Motorola MC68000 (move.b #xx,Dn), takes 8 cycles. I think this efficiency is due to the architecture: the ATTiny is based on a Harvard architecture, while the MC68000 is based on a Von Neumann architecture.

A better option?

In my opinion, adding a piece of hardware should be avoided until the alternatives have been explored. So, would it be possible to get rid of the decade counter? Now that I understand the protocol better, I think another approach could work: a state machine.

Continued in the next part: Implementation of a State Machine.

