The newly manufactured boards have made their way from China to Munich. From now on, the multi-board version of the Steckschwein is made up of three boards: CPU/Memory, IO/UART and V9958-OPL2.
It’s time for another hardware upgrade. Since we really want to get our single-board Steckschwein done, we are going for higher integration of our multi-board prototype. After integrating the UART onto the IO board, we have now integrated the OPL2 sound part onto the V9958 video board, so the current multi-board incarnation of the Steckschwein is reduced to three boards.
We postponed our plan to upgrade the sound to OPL3 because Daniel Illgen, whom we met at VCFb, convinced us with some awesome OPL2 tunes that OPL2 is still cool. We also save the extra oscillator, since the OPL2 can be clocked from the CPUCLK pin of the V9958, which happens to provide 3.58MHz.
We did, however, upgrade the video RAM. The first prototype had Bank 0 and Bank 1, maxing out VRAM at 128k. This time we decided to include the extended memory bank too, giving the V9958 an extra 64k that can be accessed using the blitter command functions. Why not?
Also, to make the connector side more compact, we decided to drop the RCA jacks for RGB in favour of an 8-pin DIN jack, which also carries the audio signal. Hooking up a 1084 monitor or a TV therefore requires only a single cable. We use the same DIN jack and pinout as the NeoGeo, so there are even ready-made cables available.
… On the way back to Munich, we had some time for a little code review of our gfx library. We thought about the CPU-to-video-chip timings and once again read the well-known datasheets of the V9938/V9958. Suddenly I had a moment of enlightenment, and we came to the following conclusion.
As described in the datasheet of the V9958 (V9958-Technical-manual_v1.0.pdf), different timings are given for different kinds of writes. As far as we understand, the following timings apply:
- the first 2 bytes sent to the VDP during a write are always register writes, which require a short delay of at least 2µs between the bytes
- the write of the 3rd byte (after the 2nd) requires a delay of 8µs. Any further “single byte transfer” during a VRAM write also requires the 8µs delay. The same is true if we want to initiate a register write directly after a VRAM write.
- the 3rd and n-th byte written to port #3 (index register port) during a bulk register write require only the 2µs between each byte
With this in mind, we can optimize our library a little bit by using different “nop slides” for address setup and vram writes.
We enhanced our vdp.inc and built two macros which provide the different delays we need.
.macro vdp_wait_s
        jsr vdp_nopslide_2m     ; 2m for 2µs wait
...
.macro vdp_wait_l
        jsr vdp_nopslide_8m     ; 8m for 8µs wait
...
The Steckschwein runs at 8MHz, so we also defined some equations and used ca65 macros to build our nop slides.
.define CLOCK_SPEED_MHZ 8

; long delay with 6µ+2µs (below)
MAX_NOPS_8M = (6 * 1000 / (1000 / CLOCK_SPEED_MHZ)) / 2
; 8Mhz, 125ns per cycle, wait 6µs = 6000ns
; = 6000ns / 125ns = 48cl / 2 => 24 NOP

; short delay with 2µs wait
MAX_NOPS_2M = (2 * 1000 / (1000 / CLOCK_SPEED_MHZ) -12) / 2
; -12 => jsr/rts = 2 * 6cl = 12cl must be subtract

.macro m_vdp_nopslide
vdp_nopslide_8m:                ; long delay with 6+2 2µs wait
        .repeat MAX_NOPS_8M
                nop
        .endrepeat
vdp_nopslide_2m:
        .repeat MAX_NOPS_2M
                nop
        .endrepeat
        rts
.endmacro
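As a sanity check of the arithmetic, the following Python sketch mirrors the two constant expressions using integer division and adds up the cycles each entry point of the slide costs. It assumes 2 cycles per NOP and 6 cycles each for jsr and rts, as the comment in the listing states. The helper `slide_delay_ns` and the Python names are ours and not part of vdp.inc.

```python
CLOCK_SPEED_MHZ = 8
NS_PER_CYCLE = 1000 // CLOCK_SPEED_MHZ   # 125ns per cycle at 8MHz
NOP_CYCLES = 2                           # a 6502 NOP takes 2 cycles
JSR_RTS_CYCLES = 12                      # jsr + rts = 2 * 6 cycles

# Same expressions as MAX_NOPS_8M / MAX_NOPS_2M, with integer division.
MAX_NOPS_8M = (6 * 1000 // NS_PER_CYCLE) // NOP_CYCLES
MAX_NOPS_2M = (2 * 1000 // NS_PER_CYCLE - JSR_RTS_CYCLES) // NOP_CYCLES


def slide_delay_ns(entry):
    """Total delay of one jsr to the given entry point ("2m" or "8m")."""
    nops = MAX_NOPS_2M
    if entry == "8m":
        # The long slide falls through into the short one.
        nops += MAX_NOPS_8M
    return (nops * NOP_CYCLES + JSR_RTS_CYCLES) * NS_PER_CYCLE


print(MAX_NOPS_8M, MAX_NOPS_2M)                     # 24 2
print(slide_delay_ns("2m"), slide_delay_ns("8m"))   # 2000 8000
```

The long slide only needs 6µs worth of NOPs because it falls through into the short slide, which contributes the remaining 2µs including the jsr/rts overhead. For clock speeds that do not divide 1000 evenly, the integer division would round and the result would have to be checked by hand.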
Another interesting question is how /WAIT behaves in this situation. The assumption here is that /WAIT behaves as specified, so /WAIT will go low after at least 130ns from CSW. To hand over the /RDY handling to the VDP via the /WAIT pin, we therefore only have to apply 1 wait state from our wait state generator (WS-Gen). After that one wait state, we can release the /RDY low from our WS-Gen so that the VDP’s /WAIT can drive /RDY as needed.
Back home, Thomas did the test and changed the waitstate generator firmware for the GAL16V8.
The equation was
W2 = ROM * UART * SND * /VDP
W1 = W2 + /ROM * UART * VDP
and was changed to
W2 = /SND
W1 = W2
   + /ROM ; /ROM wait state if ROM is cs
   + /VDP ; /VDP wait state if VDP is cs
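To see what the change means for each chip select, the following Python sketch evaluates both sets of equations. It rests on several assumptions: the old equations read W2 = ROM * UART * SND * /VDP and W1 = W2 + /ROM * UART * VDP, the select lines are active low (False = selected), "/" is negation, "*" is AND and "+" is OR as in common GAL equation syntax, and only one device is selected at a time. How W1 and W2 are turned into actual wait cycles is not modelled here.

```python
DEVICES = ("rom", "uart", "snd", "vdp")


def old_equations(rom, uart, snd, vdp):
    w2 = rom and uart and snd and not vdp
    w1 = w2 or (not rom and uart and vdp)
    return int(w1), int(w2)


def new_equations(rom, uart, snd, vdp):
    w2 = not snd
    w1 = w2 or not rom or not vdp
    return int(w1), int(w2)


def select(name):
    """Active-low select lines with only `name` selected (False = selected)."""
    return {device: device != name for device in DEVICES}


# Print (W1, W2) for the old and the new equations, one device at a time.
for device in DEVICES:
    lines = select(device)
    print(device, old_equations(**lines), new_equations(**lines))
```

Under these assumptions, selecting the VDP asserts both W1 and W2 in the old equations but only W1 in the new ones, which matches the idea of applying a single wait state and leaving the rest to /WAIT.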
So finally, we only need one wait state from the wait state generator to access the VDP. If the VDP requires more time, as it surely does during a video memory access, it will drive /WAIT low for as long as needed. After the explicit 1 WS from our wait state generator, we now hand over /RDY control to the VDP. How our /RDY and /WAIT really work together is the subject of one of our next sessions, where we’re going to measure things with a logic analyzer and an oscilloscope. Nevertheless, it works this way, and it works exactly as specified in the datasheet.
VCF 2018 in Berlin was great! We met interesting people there and got a handshake from Scot W. Stevenson, who for(th)ced us to use his TaliForth2 😉
Later on Saturday, Daniel Illgen, maintainer of the Adlib Tracker II for Linux, decided to honor us with his OPL2 knowledge on his way out. He advised us to keep the OPL2 sound chip on the Steckschwein, because the OPL2 is still unsurpassed. We had doubts at first, but then we got to listen to OPL2 with so-called “software low frequency oscillation” (soft LFO), and the drums and bass sound great!
Besides the exhibition at the VCF, there were talks about demos and the history of the demoscene then and now. There were two interesting and awesome talks given by “SvOlli” about the demoscene and demo coding on the Atari VCS (Stella).
Here are the slides of our talks and links to the livestream from Saturday, 13.10.2018.
- Steckschwein – The history and why it’s called “Steckschwein”
- Steckschwein – 6502 Test Driven Development and Continuous Integration
- Livestream: https://vcfb.de/2018/
Many thanks to Dr. Stefan Höltgen and his team around the VCFB, who made it possible for us to take part in this cool event!
We are very excited that we will be exhibiting the Steckschwein at the Vintage Computing Festival Berlin. Also, we will be giving a talk about everything Steckschwein on Saturday, 13.10., at 10:30.
The Vintage Computing Festival Berlin will again be taking place at the “Deutsches Technikmuseum Berlin”, the German technical museum, which is a very interesting place to go by itself.
This is going to be good.
The woz monitor, also known as WOZMON, is a pretty simple memory monitor and was the system software located in the 256 byte PROM on the Apple I.
Wozmon is used to inspect and modify memory contents or to execute programs already located in memory. Steve Wozniak managed to squeeze all that functionality into 256 bytes. That’s right, bytes. Not megabytes, not kilobytes. Bytes.
We had already attempted to port wozmon to our Steckschwein, but had not succeeded so far. That might have been because the wozmon code is a little hard to read and relies on some Apple I specifics that we were not aware of, since we do not have any expertise with the Apple I.
Fortunately, we got asked by Neil Franklin to proof-read his in-depth article about the Apple I and wozmon, which provided us with the missing background knowledge. So, as a proof of correctness and helpfulness of his article, it was time for a new porting attempt. As it turned out, there had to be 2 bigger changes, one while a char is input, and one while a char is output, and a few smaller but important considerations.
We started off using Jeff Tranter’s Version, because Jeff saved us some grunt work by having adapted the code to ca65 syntax.
So here is our version of wozmon, adapted to run on top of SteckOS, in all its glory:
; The WOZ Monitor for the Apple 1
; Written by Steve Wozniak in 1976
Credit where credit is due!
.include "common.inc"
.include "../kernel/kernel.inc"
.include "../kernel/kernel_jumptable.inc"
.include "appstart.inc"

appstart $1000
Our standard SteckOS includes. The appstart macro takes care of creating a Commodore-style file “header” with the load address in the first 2 bytes of the file.
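For illustration, here is a minimal Python sketch of such a header. It assumes the Commodore convention of storing the load address little-endian (low byte first). The helper names are hypothetical, and the actual appstart macro may do more than this.

```python
import struct


def add_load_header(code: bytes, load_address: int) -> bytes:
    """Prepend a 2-byte little-endian load address to the program bytes."""
    return struct.pack("<H", load_address) + code


def split_load_header(image: bytes):
    """Return (load_address, program bytes) of a file with such a header."""
    (load_address,) = struct.unpack_from("<H", image, 0)
    return load_address, image[2:]


# $D8 = CLD, $58 = CLI: the first two opcodes of the RESET routine.
image = add_load_header(b"\xd8\x58", 0x1000)
print(image.hex())   # 0010d858
```

A loader can then read the first two bytes, copy the rest of the file to that address and jump there.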
; Page 0 Variables
XAML = $24 ; Last "opened" location Low
XAMH = $25 ; Last "opened" location High
STL  = $26 ; Store address Low
STH  = $27 ; Store address High
L    = $28 ; Hex value parsing Low
H    = $29 ; Hex value parsing High
YSAV = $2A ; Used to see if hex value is given
MODE = $2B ; $00=XAM, $7F=STOR, $AE=BLOCK XAM
Nothing changed here.
; Other Variables
IN = $0300 ; Input buffer to $037F
; KBD = $D010 ; PIA.A keyboard input
; KBDCR = $D011 ; PIA.A keyboard control register
; DSP = $D012 ; PIA.B display output register
; DSPCR = $D013 ; PIA.B display control register

; .org $FF00
; .export RESET
We need to move the input buffer somewhere else, because its original location ($0200 to $027F) collides with our I/O area. $0300 should be fine.
We do not use a PIA for I/O, so we can get rid of those labels. Also, the start address is already defined above, and we won’t need to export the RESET label.
RESET:  CLD                     ; Clear decimal arithmetic mode.
        CLI
        LDY #$7F                ; Mask for DSP data direction register.
;       STY DSP                 ; Set it up.
        LDA #$A7                ; KBD and DSP control register mask.
;       STA KBDCR               ; Enable interrupts, set CA1, CB1, for
;       STA DSPCR               ; positive edge sense/output mode.
No need to initialize the PIA chip which we don’t have, but we still need to initialize the A and Y registers.
NOTCR:
;       CMP #'_'                ; "_"?
        CMP #$08 + $80
        BEQ BACKSPACE           ; Yes.
Backspace is $08 on the Steckschwein, not “_”.
        CMP #$9B                ; ESC?
        BEQ ESCAPE              ; Yes.
        INY                     ; Advance text index.
        BPL NEXTCHAR            ; Auto ESC if > 127.
ESCAPE: LDA #'\' + $80          ; "\".
        JSR ECHO                ; Output it.
GETLINE:
        LDA #$8A                ; CR.
        JSR ECHO                ; Output it.
        LDY #$01                ; Initialize text index.
BACKSPACE:
        DEY                     ; Back up text index.
        BMI GETLINE             ; Beyond start of line, reinitialize.
NEXTCHAR:
;       LDA KBDCR               ; Key ready?
;       BPL NEXTCHAR            ; Loop until ready.
;       LDA KBD                 ; Load character. B7 should be ‘1’.
        keyin
        toupper
        ORA #$80
Here is our first major code change. Keyboard input is handled using SteckOS means. Then we convert the received character to uppercase, since the Apple I uses uppercase only, hence wozmon does not handle lowercase.
Also, and most important, the Apple I keyboard generated ASCII with bit 7 set to “1”. We need to emulate that.
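The effect of the keyin / toupper / ORA #$80 sequence on a single character can be sketched in Python as follows. This is only a model under the assumption that the key arrives as plain 7-bit ASCII. The function name `apple1_key` is made up for this example.

```python
def apple1_key(ch: str) -> int:
    """Uppercase an ASCII key and set bit 7, as the Apple I keyboard did."""
    code = ord(ch.upper()) & 0x7F   # toupper, plain 7-bit ASCII
    return code | 0x80              # ORA #$80


print(hex(apple1_key("a")))    # 0xc1, same as 'A' + $80
print(hex(apple1_key("\r")))   # 0x8d, the CR value wozmon compares against
print(hex(apple1_key("\b")))   # 0x88, matches CMP #$08 + $80 for backspace
```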
With these modifications, the bulk of the code can remain as is.
        STA IN,Y                ; Add to text buffer.
        JSR ECHO                ; Display character.
        CMP #$8D                ; CR?
        BNE NOTCR               ; No.
        LDY #$FF                ; Reset text index.
        LDA #$00                ; For XAM mode.
        TAX                     ; 0->X.
SETSTOR:
        ASL                     ; Leaves $7B if setting STOR mode.
SETMODE:
        STA MODE                ; $00=XAM $7B=STOR $AE=BLOK XAM
BLSKIP: INY                     ; Advance text index.
NEXTITEM:
        LDA IN,Y                ; Get character.
        CMP #$8D                ; CR?
        BEQ GETLINE             ; Yes, done this line.
        CMP #'.' + $80          ; "."?
        BCC BLSKIP              ; Skip delimiter.
        BEQ SETMODE             ; Yes. Set STOR mode.
        CMP #':' + $80          ; ":"?
        BEQ SETSTOR             ; Yes. Set STOR mode.
        CMP #'R' + $80          ; "R"?
        BEQ RUN                 ; Yes. Run user program.
        STX L                   ; $00-> L.
        STX H                   ; and H.
        STY YSAV                ; Save Y for comparison.
NEXTHEX:
        LDA IN,Y                ; Get character for hex test.
        EOR #$B0                ; Map digits to $0-9.
        CMP #$0A                ; Digit?
        BCC DIG                 ; Yes.
        ADC #$88                ; Map letter "A"-"F" to $FA-FF.
        CMP #$FA                ; Hex letter?
        BCC NOTHEX              ; No, character not hex.
DIG:    ASL
        ASL                     ; Hex digit to MSD of A.
        ASL
        ASL
        LDX #$04                ; Shift count.
HEXSHIFT:
        ASL                     ; Hex digit left, MSB to carry.
        ROL L                   ; Rotate into LSD.
        ROL H                   ; Rotate into MSD’s.
        DEX                     ; Done 4 shifts?
        BNE HEXSHIFT            ; No, loop.
        INY                     ; Advance text index.
        BNE NEXTHEX             ; Always taken. Check next char for hex.
NOTHEX: CPY YSAV                ; Check if L, H empty (no hex digits).
        BEQ ESCAPE              ; Yes, generate ESC sequence.
        BIT MODE                ; Test MODE byte.
        BVC NOTSTOR             ; B6=0 STOR 1 for XAM & BLOCK XAM
        LDA L                   ; LSD’s of hex data.
        STA (STL,X)             ; Store at current ‘store index’.
        INC STL                 ; Increment store index.
        BNE NEXTITEM            ; Get next item. (no carry).
        INC STH                 ; Add carry to ‘store index’ high order.
TONEXTITEM:
        JMP NEXTITEM            ; Get next command item.
RUN:    JMP (XAML)              ; Run at current XAM index.
NOTSTOR:
        BMI XAMNEXT             ; B7=0 for XAM, 1 for BLOCK XAM.
        LDX #$02                ; Byte count.
SETADR: LDA L-1,X               ; Copy hex data to
        STA STL-1,X             ; ‘store index’.
        STA XAML-1,X            ; And to ‘XAM index’.
        DEX                     ; Next of 2 bytes.
        BNE SETADR              ; Loop unless X=0.
NXTPRNT:
        BNE PRDATA              ; NE means no address to print.
        LDA #$8A                ; CR.
        JSR ECHO                ; Output it.
        LDA XAMH                ; ‘Examine index’ high-order byte.
        JSR PRBYTE              ; Output it in hex format.
        LDA XAML                ; Low-order ‘examine index’ byte.
        JSR PRBYTE              ; Output it in hex format.
        LDA #':' + $80          ; ":".
        JSR ECHO                ; Output it.
PRDATA: LDA #$A0                ; Blank.
        JSR ECHO                ; Output it.
        LDA (XAML,X)            ; Get data byte at ‘examine index’.
        JSR PRBYTE              ; Output it in hex format.
XAMNEXT:
        STX MODE                ; 0->MODE (XAM mode).
        LDA XAML
        CMP L                   ; Compare ‘examine index’ to hex data.
        LDA XAMH
        SBC H
        BCS TONEXTITEM          ; Not less, so no more data to output.
        INC XAML
        BNE MOD8CHK             ; Increment ‘examine index’.
        INC XAMH
MOD8CHK:
        LDA XAML                ; Check low-order ‘examine index’ byte
        AND #$07                ; For MOD 8=0
        BPL NXTPRNT             ; Always taken.
PRBYTE: PHA                     ; Save A for LSD.
        LSR
        LSR
        LSR                     ; MSD to LSD position.
        LSR
        JSR PRHEX               ; Output hex digit.
        PLA                     ; Restore A.
PRHEX:  AND #$0F                ; Mask LSD for hex print.
        ORA #'0' + $80          ; Add "0".
        CMP #$BA                ; Digit?
        BCC ECHO                ; Yes, output it.
        ADC #$06                ; Add offset for letter.
ECHO:
;       BIT DSP                 ; bit (B7) cleared yet?
;       BMI ECHO                ; No, wait for display.
;       STA DSP                 ; Output character. Sets DA.
        pha
        and #$7F
        jsr krn_chrout
        pla
        RTS                     ; Return.
Second important change. We got rid of the Apple I specific routine to output characters and use the chrout-routine of the SteckOS-kernel. But in order not to output garbage, we need to unset bit 7. Since all comparisons afterwards still rely on bit 7 being set, we save the A register to the stack and restore it afterwards.
A faster way would be to just set bit 7 again by doing an ORA #$80 before the RTS, but what the heck.
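Both variants can be compared in a small Python model. This is a sketch only: `krn_chrout` here is a stand-in that records the character instead of calling the real SteckOS kernel routine, and the two echo functions are hypothetical names for the pha/pla version and the ORA #$80 alternative.

```python
printed = []


def krn_chrout(code: int) -> None:
    """Stand-in for the kernel routine: just record the character."""
    printed.append(chr(code))


def echo_pha_pla(a: int) -> int:
    saved = a               # pha
    krn_chrout(a & 0x7F)    # and #$7F / jsr krn_chrout
    return saved            # pla


def echo_ora(a: int) -> int:
    a &= 0x7F               # and #$7F
    krn_chrout(a)           # jsr krn_chrout
    return a | 0x80         # ora #$80


print(hex(echo_pha_pla(0xC1)), printed[-1])   # 0xc1 A

# Both variants return the same A whenever bit 7 was set on entry.
same = all(echo_pha_pla(a) == echo_ora(a) for a in range(0x80, 0x100))
print(same)   # True
```

The model assumes that the kernel routine leaves A untouched, which the real routine may or may not do. The pha/pla variant does not depend on that.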
; BRK ; unused
; BRK ; unused
We don’t need those.
; Interrupt Vectors
; .WORD $0F00 ; NMI
; .WORD RESET ; RESET
; .WORD $0000 ; BRK/IRQ
We don’t need those either since we are not using wozmon as system software. Interrupt handling is still done by the SteckOS kernel.
For the first time in the history of the Steckschwein, we are not exhibiting at the VCFe this year.
It is still worth coming, though: this time the VCFe takes place at the Leibniz-Rechenzentrum (Leibniz Supercomputing Centre) of the Bavarian Academy of Sciences and Humanities in Garching.