Please forgive me, I'm at my wit's end here. Any suggestions welcome.
I've got a very dense Cyclone III design (~90% utilization) with a very large number of DDR I/Os (>250 pins), using all four PLLs (in order to spread out the power consumption by phase-shifting unrelated logic).
Surprisingly, it works great. No errors, no bit-flips, no PLL lock lost.
The Cyclone and the peripherals it talks to are on a card; eight of these cards plug into a backplane which supplies power, a 10 MHz base clock, and JTAG. With only one card installed, I can use any card in any slot and things work great.
If I so much as plug in a second card (never mind programming it!), all four PLLs on the first card immediately lose lock (the "locked" output glitches low) and misbehavior begins. There are no high-speed signals between the cards and the backplane (the fastest toggle is the 10 MHz base clock).
This baffles me because the unprogrammed card draws hardly any power, and nothing on it is switching. I've probed the shared 10 MHz clock, VCCA, VCCINT, and VCCIO on a scope, and nothing budges by more than 5 mV, which is less than 1%. In particular the clock waveform is the nice smooth curve you'd expect. Everything is decoupled out the wazoo (excessively, perhaps... over 1200 uF of bulk capacitance per FPGA on each of VCCINT and VCCIO, plus two dozen 1 uF + 0.1 uF ceramics and a 47 uF tantalum for good luck).
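For anyone who wants to check the "less than 1%" figure, here is a quick sketch of the arithmetic. The rail voltages are assumptions, not measurements from this board: 1.2 V for VCCINT and 2.5 V for VCCA are the usual Cyclone III nominals, and the 2.5 V for VCCIO is only a placeholder since it depends on the I/O standard in use.

```python
# Worst-case ripple observed on the scope, in volts (5 mV, from the post).
RIPPLE_V = 0.005

# ASSUMED nominal rail voltages; substitute the real values for your board.
NOMINAL_RAILS_V = {
    "VCCINT": 1.2,  # assumed Cyclone III core supply
    "VCCA": 2.5,    # assumed PLL analog supply
    "VCCIO": 2.5,   # placeholder; depends on the I/O standard
}


def ripple_percent(ripple_v, nominal_v):
    """Return the ripple as a percentage of the nominal rail voltage."""
    return 100.0 * ripple_v / nominal_v


ripple_pct = {
    name: ripple_percent(RIPPLE_V, nominal)
    for name, nominal in NOMINAL_RAILS_V.items()
}

for name, pct in ripple_pct.items():
    print(f"{name}: {pct:.2f}% of nominal")
```

Under those assumed nominals the worst case is the 1.2 V rail, at roughly 0.4%, so the "less than 1%" claim holds on every rail listed.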
I'm at a loss here. I've gone through the list of reasons for PLL lock loss and none of them apply -- if any of them were the cause, then one card wouldn't work in isolation -- but it works great!
Very puzzled. Is there any other advice out there on troubleshooting PLL lock loss besides the checklist at this link?
https://www.altera.com/support/suppo...loss-lock.html
Edit: I should also add that it isn't the "plugging" action that's responsible -- if I power up the system with two cards already physically installed, then attempt to program one of them, I get the same behavior, with the PLLs frequently losing lock. So electrical noise caused by the physical act of inserting the second card into the slot can't explain the failures.