Another update and another dead end. In an attempt to add the much-needed storage mentioned in my last post, I managed to damage the board so that the JTAG connection stopped working. I added the EPCQ configuration flash chip to the board, only to find that I’d wired data in and data out back to front, so the flash can’t be read from the FPGA. I looked at direct access through the ASMI connection, but I couldn’t get this working. I guess it doesn’t expect the chip to be wired in ‘backwards’! In a futile recovery attempt, I tried to solder in the spare 16G eMMC card, but managed to short out the power pins again. While desoldering the eMMC I must have damaged something, because the JTAG connection stopped working. The chip could still be programmed through the supervisor connection on the fast-passive-parallel port, so the damage was not extensive, but the JTAG connection was pretty essential to efficient RTL development. For now the board is relegated to the ‘post-project-review’ bin. Developing the hardware, RTL and software all in parallel creates too much inefficiency, so I decided to finish the development on this board instead, the Terasic Cyclone V GX Starter:
Time for a quick update. I’ve integrated the SDRAM controller, the byte cache, and created some logic to map some of the SDRAM address space to the CPC ROM enable line. I also created some logic to allow the support CPU to push data into the SDRAM. This means that the support CPU can alter the ROM configuration of the CPC2 based on a user-set configuration.
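As a rough picture of the ROM mapping idea, here is a minimal Python sketch (a software model, not the RTL). It assumes 16 KB ROM images stored back to back in SDRAM; `ROM_BASE`, `rom_sdram_address` and `load_rom` are hypothetical names and values chosen for illustration, not taken from the CPC2 design.

```python
ROM_SIZE = 0x4000    # CPC ROM images are 16 KB each
ROM_BASE = 0x100000  # hypothetical SDRAM offset reserved for ROM images


def rom_sdram_address(rom_slot: int, cpu_address: int) -> int:
    """Translate a CPU address inside a ROM window to an SDRAM address.

    Only the low 14 bits of the CPU address select a byte within the ROM,
    so the same function works for the lower and upper ROM windows.
    """
    offset = cpu_address & (ROM_SIZE - 1)
    return ROM_BASE + rom_slot * ROM_SIZE + offset


def load_rom(sdram: bytearray, rom_slot: int, image: bytes) -> None:
    """Model the support CPU pushing a ROM image into SDRAM."""
    if len(image) > ROM_SIZE:
        raise ValueError("ROM image is larger than 16 KB")
    start = ROM_BASE + rom_slot * ROM_SIZE
    sdram[start:start + len(image)] = image


# Usage: load a tiny image into slot 2, then read it back the way the
# CPC would when the ROM enable line is active for an upper-ROM access.
sdram = bytearray(ROM_BASE + 4 * ROM_SIZE)
load_rom(sdram, 2, bytes([0x01, 0x02, 0x03]))
value = sdram[rom_sdram_address(2, 0xC001)]
```

Changing which image sits in which slot is then just a matter of the support CPU writing a different image to the same SDRAM region.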
To test the set-up, I created an example ROM that, when booted by the CPC, copies itself to address 0x4000 and then dumps 64 bytes of memory starting at 0x4000. This exercises the SDRAM controller, the cache and the cache replacement algorithm. Here’s the output.
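As a rough illustration of what that test does, here is a small Python model of the copy-and-dump step. It is not the Z80 code from the actual ROM; the function names, the dump layout and the 16-bytes-per-row width are assumptions made for the sketch.

```python
def copy_rom(memory: bytearray, rom: bytes, dest: int = 0x4000) -> None:
    """Copy the ROM image into RAM at dest, as the test ROM does to itself."""
    if dest + len(rom) > len(memory):
        raise ValueError("ROM image does not fit at the destination")
    memory[dest:dest + len(rom)] = rom


def hex_dump(memory: bytearray, start: int = 0x4000,
             count: int = 64, width: int = 16) -> list[str]:
    """Return count bytes starting at start as rows of hex text."""
    end = start + count
    rows = []
    for address in range(start, end, width):
        chunk = memory[address:min(address + width, end)]
        rows.append(f"{address:04X}: " + " ".join(f"{b:02X}" for b in chunk))
    return rows


# Usage: a 64 KB address space and a dummy 256-byte "ROM" of counting bytes.
memory = bytearray(0x10000)
copy_rom(memory, bytes(range(256)))
for row in hex_dump(memory):
    print(row)
```

If the dumped bytes match the ROM image, every byte has made the round trip through the controller and cache intact.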
Following the summary of the timing closure challenges in my last post, here are a few more lessons learned from trying to get my SDRAM and DMA controller to run at their fastest possible speed.
A lot of my timing closure process involved changing the RTL code and checking the effect on the timing. It’s a slow and laborious process, so here’s a list of my findings so you can save the hours of compilation time that it took me to test these.
Hello CPC fans! Has it really been 4 months since my last post? How time flies. Thanks to codepainters for the prompt to get going on my next post, and for reminding me that there are readers following the progress of this project.
I spent the first 3 months of this year refining the caching SDRAM controller and chasing timing closure issues. The SDRAM controller and its byte cache worked absolutely fine in simulation but failed to work reliably when I put the design in silicon at high speed. The sort of random and unpredictable behaviour I saw is usually indicative of timing issues (especially as it worked at lower speeds), so I turned my attention to the timing reports from Quartus. This was new ground for me. I knew that IO timing was something that I should pay attention to, but until now my designs generally ran at sub-50MHz speeds, where there’s enough timing slack to not notice the issues. However, on every design I’ve ever produced, at least one or two of the TimeQuest reports were in red, meaning that there were timing violations and hidden problems waiting to rear up when I least wanted them to.
I’ll start with a confession. While I’ve had some success with FPGAs, I managed this despite not understanding some of the basics, such as system design constraints, particularly in Quartus. This post covers both my research into system design/timing constraints and the result of the byte cache that sits on top of the SDRAM controller. I’ll break with my tradition and save the screen grab for the end.
Cache (Screen) Grab
This screen grab shows one of the key process steps in the caching controller, the cache line replacement. The red bracket indicates the cache I/O ports and some key internal state variables, and the blue bracket indicates the data cache I/O ports on the dual-port RAM.
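For readers unfamiliar with cache line replacement, here is a minimal Python sketch of the general technique. It assumes a direct-mapped, write-back cache with 8-byte lines; the actual organisation, line size and write policy of the CPC2 byte cache are not stated here, so treat every name and parameter below as hypothetical.

```python
class ByteCache:
    """Toy direct-mapped, write-back byte cache in front of a backing store."""

    def __init__(self, backing: bytearray, line_size: int = 8, num_lines: int = 4):
        self.backing = backing            # stands in for the SDRAM
        self.line_size = line_size
        self.num_lines = num_lines
        self.tags = [None] * num_lines    # which memory line each slot holds
        self.dirty = [False] * num_lines  # slot modified since it was filled?
        self.data = [bytearray(line_size) for _ in range(num_lines)]
        self.line_fills = 0               # counts fills, including replacements

    def _locate(self, address: int):
        line_number = address // self.line_size
        index = line_number % self.num_lines
        tag = line_number // self.num_lines
        return index, tag, address % self.line_size

    def _ensure_line(self, index: int, tag: int) -> None:
        if self.tags[index] == tag:
            return  # hit: nothing to replace
        # Miss: write the old line back first if it holds modified data.
        if self.tags[index] is not None and self.dirty[index]:
            old_base = (self.tags[index] * self.num_lines + index) * self.line_size
            self.backing[old_base:old_base + self.line_size] = self.data[index]
        # Then fetch the requested line from the backing store.
        new_base = (tag * self.num_lines + index) * self.line_size
        self.data[index][:] = self.backing[new_base:new_base + self.line_size]
        self.tags[index] = tag
        self.dirty[index] = False
        self.line_fills += 1

    def read(self, address: int) -> int:
        index, tag, offset = self._locate(address)
        self._ensure_line(index, tag)
        return self.data[index][offset]

    def write(self, address: int, value: int) -> None:
        index, tag, offset = self._locate(address)
        self._ensure_line(index, tag)
        self.data[index][offset] = value & 0xFF
        self.dirty[index] = True


# Usage: addresses 5 and 37 map to the same slot, forcing a replacement.
backing = bytearray(range(256))
cache = ByteCache(backing)
cache.write(5, 0xAA)       # fills slot 0, marks it dirty
evicted = cache.read(37)   # conflict: slot 0 is written back, then refilled
```

The write-back before the refill is the step that matters: skip it and the modified byte at address 5 would be silently lost.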