Assembling the trigger crossbar board over lunch.
Not thrilled with the paste print quality, very inconsistent. The top left corner was way too thick as the board flexed during printing, the middle BGA skipped some pads, and the WLCSP in the bottom right was near perfect.
These big boards bend too much in my paste fixture, I need to find a way to prevent that before I do any more boards of this scale.
It was reworkable at least. Which is good as I didn't have enough paste left for a second print.
The FPGA came in an absolutely ludicrous amount of packaging as is usual.
Continuing to assemble the trigger crossbar. It's a pretty packed board on the top side so going slow.
Expecting to need some rework in the top left IO area despite my best attempts to clean up the solder paste prints but whatever, nothing I can't handle.
Slowly coming together.
Most of the high density analog stuff is done, just have to do the power supply and some stuff around the MCU.
Last few IO channels done, all of the non coaxial SMT connectors on. Still a bunch of small passives but it's getting close!
All the power stuff is done, just a bit more in the bottom right corner plus the front row of SMPMs.
But time to be a dad for a bit.
Ok bedtime routine is over, board fully populated and reflowed. Time to check for problems and then stuff the PTH parts.
Initial visual inspection findings: seven bridges, all on DFNs in the top left corner due to excessive paste volume from the bad print.
The main MCU also looks a touch northwest of centered on the silkscreen. Hopefully it's just silk misalignment and the BGA is in the right spot, we'll know when I try to power it up.
Found the first mistake in the board. The SFP+ cage is slightly too far inboard and the front EMI shield fingers are pushing it up.
The module still mates fine but it's angled a tad away from the board instead of parallel.
All done and ready to start bringup!
The FPGA is getting a heatsink eventually but I'm holding off for bringup as it's hard to remove once installed. If I need to do any rework I'd rather leave it off.
Ok, first step is going to be floorplanning and cable management. There's going to be a lot of dongles and test leads involved.
The IBC is going to be on the left side, with 48V input power going to it.
I'll be making some firmware tweaks to the IBC (adding a communications interface so the main board can do remote on/off, query voltage and current and temperature, etc), so I need SWD and UART to it.
Then the main board needs the big 12V power cable to the IBC plus the control cable with I2C, output enable, and 3.3V standby.
Supervisor MCU on main board needs SWD and UART.
Main MCU needs JTAG and UART.
And then the FPGA needs JTAG.
IBC is the first thing to set up, I think. The current firmware turns on the 12V output automatically and for bringup I don't want it on immediately.
Next problem: the IBC power cable is very stiff. Fine inside an enclosure if bolted down but annoying if just taped to a bench.
And the 100mm data cable is a bit short as a result. Ordering some 150mm and 300mm ones to get more flexibility in bench layout.
Aaaand ok, bigger problem. The data cable is wired in a straight line, but the PCB assumes pin 1 on one side maps to pin 1 of the other (not inverted, as it's wired).
Let's see if there's any way I can re-arrange the contacts in the cable as a temporary fix.
Mid term I'll spin a tiny adapter board that mirrors the pinout, and of course on the next board design I'll get it right.
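A rough sketch of what the mirroring adapter has to do, assuming a single-row connector where the straight-wired cable lands pin 1 on pin N. The pin count here is hypothetical and the function names are made up for illustration.

```python
def mirrored_pin(pin: int, pin_count: int) -> int:
    """Far-end pin that a straight-wired flat cable lands on (1-indexed)."""
    if not 1 <= pin <= pin_count:
        raise ValueError(f"pin {pin} out of range 1..{pin_count}")
    return pin_count + 1 - pin


def adapter_map(pin_count: int) -> dict[int, int]:
    """Mapping the adapter board must apply to undo the cable's inversion."""
    return {pin: mirrored_pin(pin, pin_count) for pin in range(1, pin_count + 1)}


if __name__ == "__main__":
    PIN_COUNT = 8  # hypothetical; the real connector size isn't stated
    for src, dst in adapter_map(PIN_COUNT).items():
        # Cable inverts once, adapter inverts again, so the net result is pin-to-pin.
        print(f"cable pin {src} -> adapter routes to pin {dst}")
```

Mirroring twice is the identity, which is why either an adapter board or re-pinning one end of the cable fixes it.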
Welp, lesson learned.
https://github.com/azonenberg/pcb-checklist/commit/9799c35c4195de7278accb1f8b85a39ea2c60c3d
Looking at the cable it doesn't seem like it will be too hard to remove the contacts and reinsert them. I have three and only one needs to survive the process so fingers crossed.
Crisis averted. This was actually less work than I was afraid it would be.
Progress. Supervisor MCU is alive enough to respond over SWD when powered from the IBC.
Main 12V rail (and thus everything downstream of it) is still off.
After a brief bit of confusion swapping TX and RX on the header to the FTDI (because come on, who doesn't do that the first time you bring up a board) the supervisor is talking to me on the UART.
Now to try and turn on the 12V rail, energizing the inputs to all of the DC-DCs across the board. This is the first point in the bringup where there's a nontrivial risk of magic smoke (since the 3V3_SB rail is on a super low current LDO).
Starting to bring up power rails. As usual for my recent prototypes there's a microcontroller that has all of the EN/PGOOD signals and controls all of the rail sequencing.
12V0 came up fine and measures 11.988V, perfectly normal.
The (unswitched) 5V0 analog rail came up fine, measuring 5.0371V.
The first switched rail was 1V0, measuring 0.99195V. Also fine.
1V8 is where it gets interesting. The rail comes up just fine and seems to stabilize, but PGOOD never goes high (or at least, the supervisor doesn't think it does). So it panics, thinking that there's a short on the rail, and e-stops every power rail on the board.
Welp, I measure 100 milliohms to ground from 1V8_PGOOD. Sounds like a solder defect. Time to rip the whole setup apart and find the short...
Aaand I damaged the bodge cable to the power supply in the process. And I don't have a spare.
Hopefully this little bit of plastic isn't too important and I can tape/glue it in place until the other ones I ordered arrive...
Anyway, nothing visibly shorted on the board. I guess the next step is to pull the DC-DC module and see if there's a bridge under it?
I give up. Blasted the thing with hot air for IDK how long and it's not working. Just too much thermal mass on ground planes I guess.
I'm just gonna live without PGOOD, add a fixed 5ms timer, and hope it doesn't short in the future.
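For anyone following along, here's a minimal sketch of the sequencing logic in Python. This is not the actual supervisor firmware, just the general shape: enable a rail, wait for PGOOD with a timeout, and shut everything down if it never asserts. A rail with no usable PGOOD falls back to the fixed 5 ms delay. The `Rail` class, the 50 ms timeout, and the poll interval are all assumptions for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rail:
    name: str
    enable: Callable[[bool], None]        # drives the rail's EN pin
    pgood: Optional[Callable[[], bool]]   # reads PGOOD, or None if unusable
    fixed_delay_s: float = 0.005          # 5 ms fallback when PGOOD is unavailable


def bring_up(rails: list[Rail], timeout_s: float = 0.05) -> bool:
    """Enable rails in order; on a PGOOD timeout, disable everything started so far."""
    started: list[Rail] = []
    for rail in rails:
        rail.enable(True)
        started.append(rail)
        if rail.pgood is None:
            # No PGOOD to watch: wait a fixed time and assume the rail is up.
            time.sleep(rail.fixed_delay_s)
            continue
        deadline = time.monotonic() + timeout_s
        while not rail.pgood():
            if time.monotonic() >= deadline:
                # Treat a missing PGOOD as a fault and e-stop in reverse order.
                for r in reversed(started):
                    r.enable(False)
                return False
            time.sleep(0.0005)
    return True
```

The tradeoff with the fixed timer is that a genuinely shorted rail no longer trips the e-stop.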
Continuing down the list of rails: 1V8 measures 1.7903V.
3V3 measures 3.29110V.
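Quick sanity check on the readings so far, as a small Python sketch. The measured values are the ones above. The nominal voltages are inferred from the rail names, and the 3% tolerance is an arbitrary placeholder, not a number from any datasheet.

```python
# (nominal V, measured V); nominals inferred from rail names
MEASURED = {
    "12V0": (12.0, 11.988),
    "5V0": (5.0, 5.0371),
    "1V0": (1.0, 0.99195),
    "1V8": (1.8, 1.7903),
    "3V3": (3.3, 3.29110),
}

TOLERANCE_PCT = 3.0  # assumption for illustration only


def deviation_pct(nominal: float, measured: float) -> float:
    """Signed deviation from nominal, in percent."""
    return 100.0 * (measured - nominal) / nominal


if __name__ == "__main__":
    for name, (nominal, measured) in MEASURED.items():
        err = deviation_pct(nominal, measured)
        status = "ok" if abs(err) <= TOLERANCE_PCT else "CHECK"
        print(f"{name}: {measured:.4f} V ({err:+.2f}%) {status}")
```

All five readings land within about 1% of nominal.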
FPGA and (I think) main MCU are now responsive over JTAG.
1V2 only comes up to around 500 mV so something is shorted there, will investigate later. Pretty sure that rail is only used for the gigabit Ethernet PHY so it should be an easy fix once I find the short. Assuming it's not under the DC-DC itself which is frighteningly possible considering the state of the 1V8 PGOOD.
And 3V0_N isn't working either.
FPGA is, at least, alive and running a blinky (but without the SERDES rails being functional).
And the main MCU responds over JTAG but I haven't tried running code on it.
Sooo basically we have no real communications capabilities (1V2 is needed for baseT ethernet, GTX rails needed for SFP+) but most of the rest of the board seems alive.
3V0_N only powers one external trigger input so if that doesn't work it'll be annoying but not the end of the world.
Off to bed for now. Tomorrow, I think I'll pull the board again then see if maybe IR bottom preheat plus hot air on top is sufficient to get these DC-DC modules reworked.
When I say "most of the rest of the board" I mean power and major digital ICs, of course.
The whole trigger I/O and relay driver subsystem, aside from being not shorted, is completely untested.
Nap time which means more board bringup work.
Leaving all of the malfunctioning rails switched off until I have more time to troubleshoot and rework (probably meaning tonight after bedtime).
In the meantime I'll work on the output subsystem.
Good progress. After spending WAY too much time fighting with the STM32H7 SPI controller over a stupid typo in a bitfield, I have the DAC controlling VCCO on the output ports working.
Output channels 0-3 (unbuffered 1.8V FPGA outputs) are working as expected, as are 4-7 (buffered level shifted outputs with DAC-controlled VCCIO).
Next up is verifying the bidirectional ports in output mode, which means writing some code to control the relays.
I guessed the polarity wrong at first (the relay datasheet didn't give any super obvious indication as to which of the two directions of the twin-coil relay was going to one port or the other).
But on the second FPGA bitstream I got all of the relays switched to output mode and I have nice happy pulses coming out of them.
So now that's all 12 output channels (4x unbuffered 1.8V output, 4x buffered output, 4x buffered bidir) confirmed working.
That's probably it for tonight since I have some other stuff I wanted to get done outside the lab, but good progress.
Tomorrow evening my focus will be on verifying the input ports. At that point I'll have all of the building blocks I need for a minimal version of the core trigger-crossbar functionality controlled over serial console.
Then I can start thinking about debugging the power issues more.
To recap, I have four rails showing signs of trouble which are currently disabled in firmware on the supervisor while I focus on bringing up other parts of the design.
1) 1V2: powers MGTXAVTT for the FPGA transceivers plus the 1000baseT PHY. It powers up but only to ~500 mV. Unsure if problem with the feedback network or shorted so far.
2) GTX_1V0: powers MGTXAVCC for the FPGA transceivers, acting unstable and jumping all over the place.
3) GTX_1V8: powers MGTXAVCCAUX for the FPGA transceivers, not coming up at all
4) 3V0_N: regulated from +3V3, powers a single comparator for high speed CDR trigger. Not coming up at all.
Supplies 1-3 are Murata MYMGK00504ERSR, 4 is an LMR70503.
Suspecting solder defects as the root cause on all of them.
Kid is in bed and I have more time to work on debugging this.
I think I'm going to work on 1V2 next as it's on the most critical paths: it's VCCIO for the comparator on the CDR trigger, it's Vtt for the transceivers, and it's the only rail on the 1000baseT PHY that isn't up yet.
So it's a precursor to any kind of high speed comms interface.
The rail is *not* actually shorted, on closer inspection. It measures 408 ohms to ground. But it wasn't coming up properly.
Before I pull the DC-DC module let's rule out all other possible faults.
The 0R in the remote sense path (R12) is present, soldering looks good, and measures 0 ohms end to end.
Feedback path resistors look good, one of them is definitely a 0R. I can't easily measure the 10K in circuit due to other stuff attached but the reading I got was at least vaguely plausible.
Nothing else seems obviously wrong on the board.
Sooo I guess that means I'm out of options, time to pull the module and hope I clear a short under it.
And hope with bottom side IR preheat I can actually get it off.
Pulled and reinstalled U3. No idea if I got it back on right, but I tried.
Also heated up U7, the GTX_1V0 regulator, but didn't get it off. This board is a massive heatsink...
Maybe I need to be even more aggressive on the bottom side preheat. I'll try and add a thermal sensor or something to the bottom for the next round of rework.
But first let's see if it acts any different.
Well, that did something lol.
1V2 now isn't even trying to start, but PGOOD is floating high. Probably not making contact on some of the pins.
And GTX_1V0 is coming up to a stable-ish voltage but with a lot of ripple. I think maybe the filtering network is the problem, gonna try swapping that with a 0R.
Progress! After replacing the 1V2 DC-DC module outright (looked like I damaged a pad when I desoldered it) and swapping the ferrite on the GTX_1V0 output with a shunt, both rails come up.
Ripple is borderline for the 10 mV p-p tolerance for the GTX, but (barely) within spec.
So now in theory I should be able to bring up the 1000baseT PHY, and the only thing stopping me from bringing up the GTXes is now the GTX_1V8 rail.
Then I can look at 3V0_N later.
Very crowded bench. I had even more probes on it before but right now since I know I still have to rework GTX_1V8, I'm only connecting the minimum I need for console, power, and JTAG.
Success! RGMII PHY came out of reset fine, negotiated a link at gigabit with my upstream switch, and the FPGA is getting an RGMII RX clock.
Didn't take it all the way to passing packets but that's a lot more signs of life than I had when the power rail wasn't turning on :)
Time to take a break and stretch for a bit. This is excellent progress.
All of the functionality needed for a minimum viable system now appears to be operational other than the trigger inputs (not tested yet, but I see no cause for concern at this point).
1000baseT is more than enough for SCPI remote control of stuff plus SSH debug console, plus letting me build out a whole bunch of functionality in my general embedded instrument firmware stack.
The GTXes were really just included on this board as "the FPGA has them and I feel bad not hooking them up to connectors".
That, and I'm low on baseT drops to this bench so running the board off a fiber drop would be convenient. And a CDR trigger would be a nice stretch goal to have eventually.
OK, back to GTX_1V8 work. This is the last rail blocking me from using the FPGA transceivers.
It's not even attempting to turn on, I see no motion whatsoever on the rail when I scope it.
Conjecture: GTX_1V8_EN not making contact or shorted to ground under the DC-DC due to bad solder paste print (seeing a pattern here?)
First rework attempt: Swapped the output side ferrite with a shunt and heated the DC-DC up enough that I saw stray solder balls extruding out the side (indicating outer lands reflowing and rejecting excess solder) but didn't fully remove it.
We'll see if that did anything before I go all the way to full removal (difficult and a bit risky especially given proximity to the FPGA)
Nope, that didn't do it. Back to plan A, pull the whole module.
If this doesn't work there is a fallback option. GTX_1V8 doesn't need a whole lot of power and I think I have enough capacity on the 1V8 rail to do what I need.
So I could just remove FB4 (now a shunt) and bodgewire GTX_1V8 straight to the regular 1V8 rail.
GTX_1V8 is now coming up but outputting 0.598V.
This is a very interesting number, because it's extremely close to the nominal reference voltage on the DC-DC converter.
Which suggests that Rtrim isn't connected properly.
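The reasoning, sketched with a generic resistor-divider feedback model in Python. This is not the Murata module's actual trim equation (check the datasheet for that). The 0.6 V reference and the resistor values are assumptions picked only to show the effect: if the bottom leg of the divider is open, the output collapses to the reference voltage.

```python
import math


def vout(vref: float, r_top: float, r_bottom: float) -> float:
    """Generic adjustable-regulator output: Vout = Vref * (1 + Rtop / Rbottom)."""
    return vref * (1.0 + r_top / r_bottom)


if __name__ == "__main__":
    VREF = 0.6      # assumed nominal reference, volts
    R_TOP = 10e3    # hypothetical internal/top resistor, ohms
    R_TRIM = 5e3    # hypothetical trim resistor that would give ~1.8 V

    print(f"trim connected: {vout(VREF, R_TOP, R_TRIM):.3f} V")
    # An open trim connection behaves like an infinite resistance.
    print(f"trim open:      {vout(VREF, R_TOP, math.inf):.3f} V")
```

So a reading of almost exactly the reference voltage points at the trim connection rather than at a short or a dead module.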
Back side looks good so I guess the only reasonable option is to squirt some flux under and heat up the board again?
And it's now alive. The ripple is more than I'd like (16.4 mV p-p) which is out of spec, but close enough I'll probably get away with it for now.
Realistically, I probably shouldn't have used a SMPS for this rail given how much current it actually pulls. But hey, one of the goals of this board was to validate power stuff for future FPGA projects. Negative results are still results.
I need to wrap up soon because tomorrow is a work day but hoping to get some signs of life out of the GTX first.
GTX bringup isn't happening before bed. I always forget how many clocks and how much general infrastructure are needed even to get a PRBS out.
But I have everything cabled up for tomorrow and got a small amount (not nearly enough) of much needed bench tidying done.
3V0_N still isn't coming up, I'll figure that out later on. It's not needed for anything but the single ended front panel CDR trigger input. Losing that (when I already have two differential front panel SERDES inputs) isn't a huge deal even if I never get it working.
For now, 99% of the board functionality seems to be ready to go so I can start writing firmware and gateware tomorrow.
At some point I'll probably do some more extensive defluxing of the reworked areas but it doesn't have to be in the next day or two. Saving the heatsink for later just in case I find more stuff that needs bodging near the FPGA.
Also, life hack for anyone else who has a PicoScope 6000E series oscilloscope: the logic analyzer pods are connected to the scope via a standard SFF-8087 mini-SAS cable, you can find the Molex part number right on the supplied cable. But you don't have to only use that cable!
I had some longer ones than the stock one sitting around and they work fine, at least at lower speeds (you might have trouble working to full rated bandwidth due to cable losses). But this gives me much more freedom of bench layout since I can put the pod a lot further from the scope than with the stock cable.
Took a few minutes to get initial proof of life out of the GTXes.
Here's a PRBS-15 out of the front panel "sync" port. Cable losses de-embedded to the SMPM reference plane, but losses in the DC block and SMA - 2.92mm adapter are not included in the de-embed.
After the PCB channel (about 100mm of TU872SLK), once I spent a few minutes tweaking TX FFE settings, the waveform comfortably passes the XFI eye mask.
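For reference, PRBS-15 is the maximal-length sequence from the polynomial x^15 + x^14 + 1, which repeats every 32767 bits. Below is a minimal software model in Python, assuming a Fibonacci LFSR with an all-ones seed. Hardware pattern generators may invert the output or start from a different state, so treat this as a sketch rather than a bit-exact match for what the transceiver sends.

```python
from itertools import islice
from typing import Iterator

PRBS15_PERIOD = (1 << 15) - 1  # 32767 bits


def prbs15(seed: int = 0x7FFF) -> Iterator[int]:
    """Yield PRBS-15 bits from a 15-bit Fibonacci LFSR (x^15 + x^14 + 1)."""
    state = seed & 0x7FFF
    if state == 0:
        raise ValueError("seed must be nonzero or the LFSR locks up")
    while True:
        # Feedback is the XOR of the two oldest bits (stages 15 and 14).
        new_bit = ((state >> 14) ^ (state >> 13)) & 1
        state = ((state << 1) | new_bit) & 0x7FFF
        yield new_bit


def first_bits(count: int, seed: int = 0x7FFF) -> list[int]:
    """Convenience helper: the first `count` bits of the sequence."""
    return list(islice(prbs15(seed), count))


if __name__ == "__main__":
    print("".join(str(b) for b in first_bits(64)))
```

A checker on the receive side can run the same LFSR and compare bit by bit once it has locked onto the incoming stream.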