After a two-month absence, I can reveal that the CPC2 has another component almost ready for prime time. The past few weeks have been spent writing the Verilog RTL code for the uPD765 floppy disk controller from scratch. It was one of the more interesting and sophisticated functions that I’ve written. Take a look at this “proof of life”:
All of the best practices I learned in earlier modules came into play for this work: keeping logic constrained to its own clock domain, settling logic signals on alternate clock edges, and simulating thoroughly before moving to silicon testing.
In the simulation below I added two CPUs to represent the CPC and support CPU and wrote test harnesses in assembly to test the interaction through the floppy controller.
After weeks of simulation, surprisingly, there was only one fault when moving from the simulator to silicon. In the simulator, I’d ‘wired’ the CPC and support CPUs to their 4MHz and 48MHz clocks respectively, and simulation worked perfectly. In translating to silicon, I’d used the 16MHz clock for the FDC instead of 4MHz. Oops. It produced some seemingly random failures that were pretty hard to diagnose. I had broken the JTAG connector in an earlier process, so I couldn’t attach a USB Blaster to run a logic check. I spent a week on trial and error trying to diagnose the problem, until I realised with a groan that the fault was not in the new uPD765 module but in the wiring connecting it to the CPC. Once I connected the 4MHz clock to the FDC instead of the 16MHz one, the rest was “just software”!
So, how did I do it? Here’s the beginning of a design document. The rest of the CPC has not required a design to date due to the straightforward architecture and the abundance of technical documentation, but the FDC was definitely more complex. The general premise is that the FDC is a state machine that duplicates the function of the uPD765 controller. It is not timing compatible, which makes it considerably easier to use. For example, reading data from the ‘disk’ is not timing critical: the data will wait until the CPC is ready, unlike on a real CPC, which can ‘miss’ data as it passes under the read head.
One of the challenges that I had to overcome is that even the official datasheet for the uPD765 doesn’t clearly describe the operation of the status registers. I needed to reverse engineer the function from the CPC emulator, WinApe. This was slow going, but there were not that many conditions to check for.
After painstakingly checking each condition and documenting the resulting codes, I had a pretty good summary of the process. Building the finite state machine (FSM) was fairly simple as each operation goes through all steps in the process:
IDLE->PARAM->READ->EXEC->WRITE->RESULT->REPEAT
Some of the steps require no action, but fortunately timing is not critical, so the FSM can go through each step and skip immediately to the next. For example, “Sense.Int” requires no parameters, has no exec stage, and returns only two result bytes. However, going through each stage simplifies the FSM and allows the support CPU to track and evaluate every instruction.
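As a rough illustration of the "visit every stage, skip the empty ones" idea, here is a minimal sketch. It is not the code from fdc.v: the module name, the port names (cmd_written, params_done, results_done) and the state encodings are all made up for the example, and REPEAT is treated as a plain return to IDLE, which is a guess at its role.

```verilog
// Hypothetical sketch only: state names follow the post, everything else
// (ports, encodings, what REPEAT does) is assumed for illustration.
module fdc_fsm_sketch (
    input  wire       clk,
    input  wire       reset,
    input  wire       cmd_written,   // assumed: CPC has written a command byte
    input  wire       params_done,   // assumed: all parameter bytes received
    input  wire       results_done,  // assumed: all result bytes read back
    output reg  [2:0] state
);
    localparam IDLE   = 3'd0,
               PARAM  = 3'd1,
               READ   = 3'd2,
               EXEC   = 3'd3,
               WRITE  = 3'd4,
               RESULT = 3'd5,
               REPEAT = 3'd6;

    always @(posedge clk) begin
        if (reset)
            state <= IDLE;
        else case (state)
            IDLE:    if (cmd_written)  state <= PARAM;
            // A command with no parameters already has params_done high,
            // so PARAM lasts a single clock and falls straight through.
            PARAM:   if (params_done)  state <= READ;
            READ:    state <= EXEC;    // stages with no work take one clock
            EXEC:    state <= WRITE;
            WRITE:   state <= RESULT;
            RESULT:  if (results_done) state <= REPEAT;
            REPEAT:  state <= IDLE;
            default: state <= IDLE;
        endcase
    end
endmodule
```

Because every command walks the same path, the only per-command variation is how long each state is held, which is what keeps the case statement this short.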
The READ and WRITE states each send a signal to the supervisor CPU and pause until the supervisor has checked the parameters, read or written data through the FIFO buffers, set the response flags and signalled the FDC that processing is complete. The FDC then moves through the remaining states and reads out the result bytes before returning to idle.
Fortunately, the CPC uses only a few instructions (seek a track, read a sector, write a sector), and it provides a pretty serviceable floppy interface. As an example, here’s the log file for the video above.
If you’re interested in the Verilog code, take a look at fdc.v in the GitHub repository.
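The pause-until-the-supervisor-is-done behaviour described above can be sketched as a simple request/done handshake. The signal names (enter_stage, sup_request, sup_done, stage_busy) are hypothetical, and the sketch assumes sup_done has already been synchronised into the FDC clock domain, which a real design crossing from the 48MHz supervisor would need to take care of.

```verilog
// Hypothetical sketch only: all names are invented for illustration, and
// sup_done is assumed to be synchronous to clk (no synchroniser shown).
module fdc_wait_sketch (
    input  wire clk,
    input  wire reset,
    input  wire enter_stage,  // assumed: FSM has just reached READ or WRITE
    input  wire sup_done,     // assumed: supervisor says "processing complete"
    output reg  sup_request,  // assumed: request line to the supervisor CPU
    output wire stage_busy    // high while the FSM should hold its state
);
    // The FSM stays in READ/WRITE for as long as the request is outstanding.
    assign stage_busy = sup_request;

    always @(posedge clk) begin
        if (reset)
            sup_request <= 1'b0;
        else if (sup_done)
            sup_request <= 1'b0;   // supervisor finished: release the FSM
        else if (enter_stage)
            sup_request <= 1'b1;   // ask the supervisor to service this stage
    end
endmodule
```

The FSM would then only advance out of READ or WRITE when stage_busy is low, so the CPC side simply sees the controller as busy for however long the supervisor takes.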
One interesting construct that I used in this module was a function instead of an unwieldy if-then-else. The characteristics of each command are recorded in a table (param_table). This defines how many parameters the command takes, whether it has a read or write stage, how many results it returns, etc. The same construct is also used for looking up the system registers during the result stage. The only thing to watch out for here is that if you’re using wires or registers to populate your values in a constant assignment, then this won’t simulate properly in iVerilog. An example construct is:
// Case entry
4'd7: {4'd0, wire_type, register_type};
When you update register_type or wire_type, the change won’t be reflected through a constant assignment. However, an edge-triggered assignment will pick up the latest values. These functions are the equivalent of:
reg <= (regx == 4'd7) ? {4'd0, wire_type, register_type} : (other condition) ? {some other value.....
However, the function should be much more readable and maintainable. Any thoughts?
The example supervisor code simply uses some of the support processor memory to store the sector information for this proof of concept, but this is extremely limited. I really need to get the new PCB built with the eMMC and SDRAM to expand the DISK/RAM/ROM capability.
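For anyone who wants to try the function-as-table idea, here is a minimal sketch. It is not lifted from fdc.v: the module name, the port widths and the table values other than the 4'd7 entry are invented. It also passes wire_type and register_type in as function arguments, which is one way to sidestep the simulation problem, assuming the cause is that the signals were read from inside the function body rather than passed in.

```verilog
// Hypothetical sketch only: widths and table contents are made up, apart
// from the 4'd7 entry, which mirrors the example in the post.
module param_lookup_sketch (
    input  wire       clk,
    input  wire [3:0] command,
    input  wire [1:0] wire_type,      // stand-in for the post's wire_type
    input  wire [1:0] register_type,  // stand-in for the post's register_type
    output reg  [7:0] entry
);
    // Every signal the table depends on is an argument, so any caller
    // re-evaluates the lookup whenever one of them changes.
    function [7:0] param_table;
        input [3:0] cmd;
        input [1:0] wt;
        input [1:0] rt;
        begin
            case (cmd)
                4'd7:    param_table = {4'd0, wt, rt};
                4'd8:    param_table = 8'h02;  // invented entry
                default: param_table = 8'h00;  // unknown command
            endcase
        end
    endfunction

    // Edge-triggered assignment, which the post notes picks up the
    // latest values on each clock.
    always @(posedge clk)
        entry <= param_table(command, wire_type, register_type);
endmodule
```

Adding a command then means adding one case line, rather than threading another branch through a nested ternary.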
Stay tuned for that soon!