V9958 – “The WAIT” – investigation of the CPU/VDP /WAIT interface

… on the way back to Munich, we had some time for a little code review of our gfx library. Thinking about the CPU to video chip timings, we read the well-known datasheets of the V9938/V9958 once again. Suddenly I had a flash of insight, and we came to the following conclusion.

As described in the V9958 datasheet (V9958-Technical-manual_v1.0.pdf), different timings are given for different kinds of writes. As far as we understand, the timings are as follows:

  1. The first 2 bytes sent to the VDP during a write are always register writes, which require a short delay of at least 2µs between the two bytes.
  2. The write of the 3rd byte (after the 2nd) requires a delay of 8µs. Any further “single byte transfer” during a VRAM write also requires the 8µs delay. The same is true if we want to initiate a register write directly after a VRAM write.
  3. The 3rd to n-th byte written to port #3 (index register port) during a bulk register write requires only 2µs between each byte.

With this in mind, we can optimize our library a little bit by using different “nop slides” for address setup and vram writes.

We extended our vdp.inc with two macros which provide the different delays we need.

.macro vdp_wait_s
  jsr vdp_nopslide_2m ; 2m for 2µs wait
...

.macro vdp_wait_l
  jsr vdp_nopslide_8m ; 8m for 8µs wait
...
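
As a rough sketch of where the two delays would go according to the list above, a VRAM fill and a bulk register write might look like the following. The port addresses, the routine names `vdp_fill_page` and `vdp_init_regs`, and the table contents are placeholders for illustration and not taken from our library. The sketch also assumes the target address lies within the first 16K of VRAM (R#14 already set), that `vdp_wait_s` and `vdp_wait_l` come from vdp.inc, and that the nop slide they call is instantiated somewhere in the program.

```asm
.include "vdp.inc"          ; assumed to provide vdp_wait_s / vdp_wait_l

; hypothetical port addresses, adjust to the real memory map
PORT_VRAM = $0220           ; port #0: VRAM data
PORT_REG  = $0221           ; port #1: address / register setup
PORT_REGI = $0223           ; port #3: indirect register access

; fill 256 bytes of VRAM with the value in A
; in: A = fill value, X = address bits A7..A0, Y = address bits A13..A8
vdp_fill_page:
    pha                     ; keep the fill value
    stx PORT_REG            ; 1st byte: low address byte
    vdp_wait_s              ; >= 2µs between the two setup bytes
    tya
    ora #$40                ; bit 6 set = VRAM write
    sta PORT_REG            ; 2nd byte: high address bits + write flag
    vdp_wait_l              ; >= 8µs before the 3rd byte (first data byte)
    pla
    ldx #0                  ; 0 wraps around, so the loop runs 256 times
@loop:
    sta PORT_VRAM           ; the VDP increments the VRAM address itself
    vdp_wait_l              ; >= 8µs between single byte transfers
    dex
    bne @loop
    rts

; write 8 bytes from a table to R#0..R#7 through port #3
vdp_init_regs:
    lda #0                  ; start at R#0, bit 7 clear = auto increment
    sta PORT_REG            ; 1st byte: value for R#17
    vdp_wait_s
    lda #(17 | $80)         ; 2nd byte: bit 7 set = register write to R#17
    sta PORT_REG
    vdp_wait_s
    ldx #0
@next:
    lda vdp_init_tab,x
    sta PORT_REGI           ; 3rd to n-th byte: only 2µs needed
    vdp_wait_s
    inx
    cpx #8
    bne @next
    rts

vdp_init_tab:
    .byte 0, 0, 0, 0, 0, 0, 0, 0    ; placeholder values only
```

The loop instructions themselves add a few cycles on top of each delay, so the waits in this sketch are slightly longer than strictly required.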

steckSchwein runs at 8MHz, so we also defined some constants and used ca65 macros to build our nop slides.

.define CLOCK_SPEED_MHZ 8

; long delay: 6µs of nops here + 2µs from the short slide (below) = 8µs
MAX_NOPS_8M = (6 * 1000 / (1000 / CLOCK_SPEED_MHZ)) / 2 
; 8MHz, 125ns per cycle, wait 6µs = 6000ns 
; = 6000ns / 125ns = 48 cycles, 2 cycles per nop => 24 NOP 

; short delay with 2µs wait
MAX_NOPS_2M = (2 * 1000 / (1000 / CLOCK_SPEED_MHZ) -12) / 2 
; -12 => jsr/rts = 2 * 6 cycles = 12 cycles, which must be subtracted

.macro m_vdp_nopslide
vdp_nopslide_8m:
   ; long delay: 6µs of nops, then fall through into the 2µs slide below
   .repeat MAX_NOPS_8M
      nop
   .endrepeat
vdp_nopslide_2m:	
   .repeat MAX_NOPS_2M
      nop
   .endrepeat
   rts
.endmacro
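
To double check the arithmetic, the cycle counts of both slides can be summed up and verified at assembly time. The following is only a sketch: it repeats the constants from above so that it assembles on its own, and the names `CYCLE_NS`, `CYCLES_2M` and `CYCLES_8M` are made up for this example.

```asm
.define CLOCK_SPEED_MHZ 8

CYCLE_NS    = 1000 / CLOCK_SPEED_MHZ             ; 125ns at 8MHz
MAX_NOPS_8M = (6 * 1000 / CYCLE_NS) / 2          ; 48 cycles / 2 = 24 nops
MAX_NOPS_2M = (2 * 1000 / CYCLE_NS - 12) / 2     ; (16 - 12) cycles / 2 = 2 nops

; cycles per call: jsr (6) + rts (6) + 2 cycles per nop
CYCLES_2M = 12 + 2 * MAX_NOPS_2M                 ; 16 cycles
; the long slide falls through into the short one
CYCLES_8M = CYCLES_2M + 2 * MAX_NOPS_8M          ; 64 cycles

; stop the build if a slide turns out shorter than required
.assert CYCLES_2M * CYCLE_NS >= 2000, error, "short delay is below 2us"
.assert CYCLES_8M * CYCLE_NS >= 8000, error, "long delay is below 8us"
```

At 8MHz this gives 16 cycles * 125ns = 2µs and 64 cycles * 125ns = 8µs. The formulas rely on the clock being fast enough that the 12 cycles of jsr/rts take less than 2µs.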

Another interesting question is how /WAIT behaves in this situation. The assumption here is that /WAIT behaves as specified, so /WAIT will go low no later than 130ns after CSW. To hand over the /RDY handling to the VDP via the /WAIT pin, we therefore only have to apply 1 wait state from our wait state generator (WS-Gen). After that one wait state, we can release /RDY from the WS-Gen so that the VDP's /WAIT can drive /RDY as needed.

Back home, Thomas did the test and changed the waitstate generator firmware for the GAL16V8.

The equation was

W2 = ROM * UART * SND * /VDP 
W1 = W2 
     + /ROM * UART * VDP

and was changed to

W2 = /SND
W1 = W2
     + /ROM 			; /ROM wait state if ROM is cs
     + /VDP			; /VDP wait state if VDP is cs

So finally, we only need one wait state from the wait state generator to access the VDP. If the VDP requires more time, as it surely will during a video memory access, it will drive /WAIT low for as long as needed. So after the explicit 1WS from our wait state generator, we now hand over /RDY control to the VDP. How our /RDY and /WAIT really work together is the subject of one of our next sessions, where we’re going to measure things with a logic analyzer and an oscilloscope. Nevertheless, it works this way, and it works exactly as specified in the datasheet.
