Retro CPC Dongle – Part 31

I’ll start with a confession. While I’ve had some success with FPGAs, I managed this despite not understanding some of the basics, such as system design constraints, particularly in Quartus. This post covers both my research into system design/timing constraints and the result of the byte cache that sits on top of the SDRAM controller. I’ll break with my tradition and save the screen grab for the end.

Synopsys Design Constraints (SDC)

Firstly, if you thought that the Quartus TimeQuest Timing Analyzer just analyzed timing, then you’d be in good company, but you’d be wrong. To my astonishment, the SDC file is not just used to check whether timing is met; the fitter also uses it to adjust the logic paths so that timing is met, if that’s possible.

I’ll give you two examples. The first is my HDMI controller in the last build. It worked fine most of the time, but occasionally, upon recompiling the unchanged HDMI code, the timing was affected by other logic taking up the preferred paths and I’d get noise in the signal. The noise was caused by the data changing inside the window around the rising clock edge, between the set-up time (tSU) before the edge and the hold time (tH) after it. This happened because the clock did not have a defined relationship with the data signals. Most of the time it worked, but sometimes it didn’t, and higher clock frequencies would be less likely to work.

The second example is the SDRAM controller of the recent builds. For some reason, Quartus SignalTap showed that the DQ lines were up to 2.5 clocks off the data! The simulation worked absolutely fine. The data changed on the falling edge and was registered on the rising edge. On my 100MHz SDRAM clock, there was easily 5ns between transition and registration. The chip I was using should be good for 500MHz logic, so something was clearly wrong.
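The 5ns figure is simply half of the 100MHz clock period. As a quick sanity check, here is a minimal Python sketch; the function name is made up for illustration and it assumes an ideal 50% duty-cycle clock with no skew or jitter.

```python
def half_period_ns(freq_mhz: float) -> float:
    """Time from a falling edge to the next rising edge, in ns.

    Assumes an ideal 50% duty-cycle clock (no skew, no jitter).
    """
    period_ns = 1000.0 / freq_mhz  # 1 MHz corresponds to a 1000 ns period
    return period_ns / 2.0


if __name__ == "__main__":
    print(half_period_ns(100.0))  # 5.0
```

In real hardware the usable margin is smaller than this, because routing delay inside the FPGA eats into it, which is exactly what the rest of this post is about.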

I spent nearly 2 weeks trying to work out where my logic was wrong, only to conclude that the logic wasn’t wrong. In fact, it wasn’t possible to rearrange the logic to alter the timing predictably. The Quartus fitter just did its thing and the logic paths would be selected ‘randomly’.

Have a look here, here, here, here and here for the documents that I used to work out what was going on. There’s really no substitute for reading the documentation to understand something as fundamental and important as the timing constraints.

To prevent the ‘random’ behaviour, the solution is to define a relationship between the output clock and the signal lines. The fitter can then sensibly select routing paths within the FPGA so that the signals arrive at the output at the same time.

First off, I needed to define the SDRAM_CLK as a virtual clock. This is taken straight from the 100MHz PLL output, and as it is routed on the global clock lines, this is an extremely fast path to the output. However, the logic that sets up the address lines (ADR), the control lines (WR/RD/CS) and the data lines (DQ) takes a very convoluted path through the logic and, in my design, ended up taking 20ns longer than the clock to be output. So I defined the relationship between the new ‘virtual’ clock SDRAM_CLK and the other control/data lines. The linked documents above describe how to do this and it’s very dependent upon the design, so generic descriptions here probably won’t help you. However, here is my SDC. While the SDC looks complex, the TimeQuest GUI guides you through creating the syntax.

The most important bits are:

  • DRAM_CLK in the ‘Create Clock’ section, creating a virtual clock to synchronise the other signals
  • The Input Delay and Output Delay constraints force the relationship between the virtual clock and the input/output signals.
  • The false paths – these prevent Quartus thrashing around trying to find routing paths that satisfy routes that are not important or not related. An example is the relationship between the 100MHz clock and the 4MHz clock, which I have declared as false paths. The sample Z80 code I use creates a multi-cycle setup and hold period, so a constraint is unnecessary as the Z80 will always wait many, many 100MHz cycles between setup and sample.
  • Most importantly, don’t ignore the unconstrained paths in the TimeQuest reports. These are the signals that may shift in relation to the rest of the logic and even change between compilations to give different results.
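For the Input Delay and Output Delay values, a common way to arrive at the numbers is to combine the SDRAM datasheet timings with the board trace delays. The Python sketch below shows that arithmetic for the case where the FPGA drives the clock to the SDRAM. Every number and every name in it is a hypothetical placeholder rather than a value from my SDC or from any particular datasheet, so treat it as an illustration of the calculation only.

```python
from dataclasses import dataclass


@dataclass
class SdramTiming:
    """Datasheet timings of the external device, in ns (hypothetical values)."""
    t_su: float      # data set-up time before the clock edge
    t_h: float       # data hold time after the clock edge
    t_co_min: float  # minimum clock-to-output (output hold)
    t_co_max: float  # maximum clock-to-output (access time)


@dataclass
class BoardDelays:
    """PCB trace delays in ns (hypothetical values)."""
    clk_min: float
    clk_max: float
    data_min: float
    data_max: float


def output_delays(dev: SdramTiming, brd: BoardDelays) -> tuple[float, float]:
    """Return (min, max) output delay for signals driven from FPGA to SDRAM."""
    d_max = brd.data_max + dev.t_su - brd.clk_min
    d_min = brd.data_min - dev.t_h - brd.clk_max
    return d_min, d_max


def input_delays(dev: SdramTiming, brd: BoardDelays) -> tuple[float, float]:
    """Return (min, max) input delay for data read back from the SDRAM."""
    d_max = brd.clk_max + dev.t_co_max + brd.data_max
    d_min = brd.clk_min + dev.t_co_min + brd.data_min
    return d_min, d_max


if __name__ == "__main__":
    dev = SdramTiming(t_su=1.5, t_h=0.8, t_co_min=2.7, t_co_max=5.4)
    brd = BoardDelays(clk_min=0.3, clk_max=0.5, data_min=0.3, data_max=0.5)
    print("output delay (min, max):", output_delays(dev, brd))
    print("input delay  (min, max):", input_delays(dev, brd))
```

The resulting min/max pairs are the kind of values that end up in the Output Delay and Input Delay entries, referenced to the virtual clock.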

Two Weeks Later

After properly setting up the relationship between the DQ signal and the Clock, I was getting a reliable interface and both input and output signals on the bi-directional port DQ were arriving correctly.

Take a look below at the sdram:dram|Dq_out and DRAM_DQ signals. These were arriving at different times before the constraints. Now they arrive as the OE becomes active for the tristate buffer.

Properly Constrained Interface

 

The combination of the SDRAM, bus arbiter, and byte cache was still not reliable, however. With the SignalTap analysis showing that the data was being stored and retrieved correctly, it suggested there was a logic or design issue. And indeed there is. The markers labelled -93, +25, +50 and +75 mark the Z80 T1, T2, T3 timing cycles. The capture above shows where the timing misses the sampling window for the Z80 data, after the T3 cycle. This is because the Auto Refresh is in progress when the request is made, so this has to be completed before the flush takes place and the read happens.

I made some rough calculations and I estimate that the timing will make the window if the interface runs at 160MHz rather than 100MHz, but it will still be tight. If there is a read in progress for the video RAM, this will add another 12 or so clocks to the whole cycle and may still cause problems.
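The shape of that rough calculation can be sketched in a few lines of Python. The numbers here are assumptions chosen only to illustrate the reasoning, not measurements: a window of three 250ns T-states on a 4MHz Z80, and a hypothetical worst case of 110 SDRAM clocks for refresh, flush and read.

```python
WINDOW_NS = 750.0        # assumption: three T-states at 4MHz (3 x 250 ns)
WORST_CASE_CYCLES = 110  # hypothetical: refresh + flush + read, in SDRAM clocks
VIDEO_READ_CYCLES = 12   # the "12 or so" extra clocks for a video RAM read


def latency_ns(cycles: int, freq_mhz: float) -> float:
    """Convert a cycle count at freq_mhz into nanoseconds."""
    return cycles * 1000.0 / freq_mhz


def makes_window(cycles: int, freq_mhz: float, window_ns: float = WINDOW_NS) -> bool:
    """True if the whole access finishes inside the Z80 sampling window."""
    return latency_ns(cycles, freq_mhz) <= window_ns


if __name__ == "__main__":
    for freq in (100.0, 160.0):
        for cycles in (WORST_CASE_CYCLES, WORST_CASE_CYCLES + VIDEO_READ_CYCLES):
            print(f"{cycles} clocks @ {freq:.0f}MHz: "
                  f"{latency_ns(cycles, freq)} ns, ok={makes_window(cycles, freq)}")
```

With these placeholder figures the access misses the window at 100MHz, fits at 160MHz, and misses again once the video read is added, which is the same pattern as the estimate above.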

The solution for this test is simply to tie the data_valid signal to the wait_n signal of the Z80. This will force the Z80 to wait until the data is available, but may present timing authenticity issues for the CPC2 and is not preferred. However, the nature of the SDRAM can’t be changed, so a refresh may always come at an inconvenient time and will have to be worked around.

Well, it’s been a lot of head scratching, but I’d encourage anyone who is serious about high speed interfaces in FPGAs to read the linked documents above. The timing analyser isn’t just for analysis!

This week, I’ll be putting the new double-sided build together. It’s a nervous time, as a lot of time, energy and money has gone into this so far and this will hopefully be the last build I will need. I’ll let you all know how the double sided build goes based on my theory!

