A3V3 for the baseT PHY. Also an analog rail filtered off of 3V3 (last 3.3V power domain on the board, I promise! at least other than the ones for the SFP+ which don't have test points)
This one is sagging a bit due to resistive losses in the filter network, down to 3.284V. Still well within tolerance for the PHY.
Ripple is 6.941 mV p-p, 0.934 mV RMS. I'm happy with that.
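Aside: every "X mV p-p / Y mV RMS" in this thread is just two statistics over the captured samples - peak-to-peak, and RMS taken about the mean (i.e. AC RMS). In Python terms (my own sketch, not the scope's code):

```python
import math

def ripple_stats(samples):
    """Peak-to-peak and AC RMS ripple of a rail capture (volts in, volts out)."""
    mean = sum(samples) / len(samples)
    pp = max(samples) - min(samples)
    # RMS about the mean, so DC offset doesn't inflate the number
    rms = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
    return pp, rms
```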
VSMPS for the STM32H7 (main MCU). This is the nominally 1.8V output of the MCU's integrated DC-DC converter which is then LDO'd down to run the various internal power domains with dynamic voltage scaling.
1.801V so right where it should be. Ripple is 12.00 mV p-p, 0.70 mV RMS, so no cause for concern. This is clean enough I don't think it's the source of the junk on 3V3.
Even far dirtier power would be fine here, seeing as this rail is LDO'd before use. So no need to revisit it.
1V8. This is VCCAUX for the FPGA plus VCCIO for the first 4 output channels, as well as running the FPGA input buffers for all of the LVDS trigger inputs.
Very good: 1.793V average. Ripple is 5.49 mV p-p / 0.64 mV RMS. Well within spec.
Finally the last 1.8V power domain: GTX_1V8. This is the VCCAUX supply for the SERDES quad on the FPGA.
Average is right where I want, 1.796V.
Using a MYMGK with its 4A rated output for this was probably overkill, and the very light load is probably why the ripple is slightly out of spec (14.81 mV p-p / 1.63 mV RMS; the datasheet wants no more than 10 mV p-p).
That said, I might still be OK since this is the ripple measured on the plane close to the DC-DC and I actually have a 4.7 uF cap across the via going to the FPGA ball that will attenuate some of the sharper spikes.
At some point I might try to take a differential measurement across that cap but for now I think I'll call this "not great, but probably tolerable". The main impact of a bit more ripple here would be degraded jitter performance on the GTX (since this is the PLL supply) and I'm seeing eyes that look perfectly fine.
Moving on down the list to lower voltage domains, we get MCU Vcore, the output of the on-die LDO for the STM32H7.
Not a whole lot going on here. I might poke more when I have a slightly busier firmware running but I'm not worried.
I think around the 7.5us mark we might be seeing CPU activity increase when an Ethernet frame arrives? There looks to be an activity spike in the spectrogram at the 256 MHz AHB bus frequency, but it's really hard to see if you don't know what to look for.
Average voltage is 1.364V which is right where it should be. Ripple is 5.92 mV p-p / 0.68 mV RMS, totally fine.
Next, 1V2. This is the 1.2V domain that runs the Ethernet PHY, as well as the MGTAVTT rail for the FPGA transceivers.
We can see some 250 MHz activity from the PHY as well as spiky broadband noise around 500-600 MHz from the switching spikes.
This is where it should be voltage wise (1.1996V) but noisier than I want: 31.63 mV p-p / 0.96 mV RMS. Spec for the GTX is, again, 10 mV p-p.
But actually, looking more closely at the datasheet, the ripple requirement is specified from 10 kHz to 80 MHz. And this switching noise is much higher frequency (presumably within the range where the substrate bypass caps would notch most of it out).
With a FIR LPF applied to the measured waveform to cut off everything past 80 MHz, I'm only seeing about 2 mV of ripple.
And with a 200 MHz frontend bandwidth limiter on the scope I'm seeing 4.80 mV p-p / 0.26 mV RMS ripple. So this rail actually *is* in spec.
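If you want to replicate the band-limited number offline instead of using the scope's frontend limiter, a windowed-sinc FIR is all it takes. A minimal pure-Python sketch (my illustration, not the actual filter the scope software applied):

```python
import math

def lowpass_fir(samples, fs, cutoff, numtaps=101):
    """Hamming-windowed-sinc FIR low-pass; keeps content below `cutoff` Hz."""
    m = numtaps - 1
    fc = cutoff / fs  # normalized cutoff, cycles per sample
    taps = []
    for n in range(numtaps):
        k = n - m / 2
        # ideal sinc low-pass impulse response, shifted to be causal
        h = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / m)  # Hamming window
        taps.append(h * w)
    gain = sum(taps)
    taps = [t / gain for t in taps]  # normalize to unity DC gain
    # direct convolution; first/last ~numtaps samples have edge effects
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for n, t in enumerate(taps):
            if 0 <= i - n < len(samples):
                acc += t * samples[i - n]
        out.append(acc)
    return out
```

Filter the exported waveform with cutoff=80e6, then take p-p/RMS on the result (trimming the edge-effect region) to get the in-band ripple.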
Backing up to GTX_1V8, it's also (barely) in spec with the bandwidth limit: 8.97 mV p-p / 1.62 mV RMS.
I don't like the high frequency spikes but it does meet the datasheet noise limits.
Next, D1V2. This is the digital core of the KSZ9031 PHY, filtered off of 1V2 to prevent noise from the Ethernet subsystem leaking into the transceivers.
1.180V, ripple is 14.20 mV p-p and 3.57 mV RMS. Noise is dominated by 125 MHz and harmonics thereof. Gee, I wonder why an Ethernet PHY would be doing that...
Next, A1V2 on the KSZ9031, the 1.2V filtered analog rail. Decidedly quieter, 9.043 mV p-p / 1.44 mV RMS.
Some spikes at expected harmonics of 125 MHz and also one at 1.125 GHz which is interesting. Maybe that's what the internal PLL runs at?
Anyway, looks fine to me.
This one is interesting. A1V2_PLL on the KSZ9031. Higher than the other filtered rails at 1.191V, probably because of lower I*R loss across the ferrite (the PLL doesn't pull much current). Ripple is almost nonexistent and dominated by low frequency from the power supply: 6.98 mV p-p / 0.84 mV RMS.
But the FFT picks up a lot of stuff that's too weak to see in the time domain view.
Big spike at the 125 MHz core clock, then a nice comb pretty much every 25 MHz from there on out.
Super quiet overall and well within acceptable limits.
On to the FPGA's digital core rail. For being the core supply of an FPGA, this rail is shockingly quiet.
I mean, I don't have THAT much stuff going on in the FPGA but I was ping flooding the thing and it was buffering a bunch of input clocks to various outputs etc.
996.3 mV (nominal 1.0), ripple is 6.13 mV p-p / 0.42 mV RMS. One of the quietest rails on the board despite being digital.
Almost done, I promise. This is power domain 15 for anyone keeping count.
GTX_1V0, the digital core of the SERDES block. 999.901 mV average on a nominal 1.0V, absolutely nailed it! Ripple is 7.22 mV p-p / 0.85 mV RMS without any bandwidth limit. This is far enough inside the spec that I didn't even bother measuring within the specified 10 kHz - 80 MHz band.
And finally, power domain 16. The one I was saving for last because it was both the last numerically, and because I knew it would be horrible from previous testing.
3V0_N is good on average at -2.984V.
But the ripple is absolutely atrocious, ranging from -2.75 to -3.35V (596 mV p-p, 7.7 mV RMS).
Also of note is that the 3V0_N rail has negative going spikes exactly when the 3V3 rail has positive spikes and vice versa. Perfectly phase aligned.
So it definitely looks like the 3V0_N supply is the source of all the junk on the 3V3 rail. Fix that and it should clean up both rails at once.
Also, I now have 2.2 GB of waveform data saved (almost, but not exclusively, from the rail validation work). Will be useful to be able to go back and look at saved waveforms and do A/B testing showing the impact of rework.
So now the question is, where do I go from here?
The SMPS is fairly straightforward, basically reference schematic for the LMR70503.
Layout wise, I see one potential issue: the bulk caps (at least the nearest one) connect to the chip on the top layer on the 3V3 side, but the ground connection is only through vias to layer 2. So that adds a bit of loop inductance.
Not much, but some.
For initial rework, I'm thinking of shoving a 0.47 uF 0402 X7R across the C2/C3 GND/3V3 vias. Unless anybody else has better ideas?
Actually an 0402 wouldn't fit on the top side. I could do one on the back side, which would add via inductance, but anything I put on the top would have to be an 0201.
I don't work with 0201s all that much.
My options in inventory look to be CL03A104KQ3NNNH (0.1 uF) or GRM033R60J474KE90D (0.47 uF).
Both are 6.3V rated and would be operating at 3.3V.
The Samsung 0.1 uF loses 50% of its capacitance at 3.3V, thus making it 50 nF.
The Murata 0.47 uF loses 62% at 3.3V (the higher capacitance and thinner dielectric lead to higher field strength) but starts with much more capacitance - it's still 178 nF, about 3.6x the Samsung, so I'll go with that.
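The derating math, spelled out (the percent-loss figures are read off the vendors' DC bias curves):

```python
# Effective capacitance after DC bias derating at 3.3 V
samsung_0201 = 0.1e-6 * (1 - 0.50)   # CL03A104KQ3NNNH: 0.1 uF, -50% at 3.3 V
murata_0201 = 0.47e-6 * (1 - 0.62)   # GRM033R60J474KE90D: 0.47 uF, -62% at 3.3 V

print(samsung_0201)                # -> 50 nF
print(murata_0201)                 # -> ~179 nF
print(murata_0201 / samsung_0201)  # -> ~3.6x
```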
Here's the area in question under the microscope.
I need to bodge this capacitor between the two vias at the south side of the WLCSP (U26) at the center of the image, in the gap between it and C154 (a massive 0805).
It's an annoying angle but I have plenty of space. I might almost be able to fit an 0402 in there if I tried hard but let's see what the 0201 does first.
Soldermask removed, test fitting the part.
An 0402 might fit but it'd be annoying to solder because it would be pretty much LGA style mounting with no free pad to get the iron onto.
I think I'll put the 0201 here and if I want more capacitance put an 0402 on the back side of the same via.
I'm gonna say *that* did something.
602 -> 380 mV p-p ripple reduction from that one cap.
I can't easily fit much more on the top so let's see what happens if I add an 0402 on the bottom straddling the same vias. Hopefully the extra ESL won't matter TOO much.
Whoops. I bodged the wrong vias.
That's going from 3V0_N to ground, not 3V3 to ground.
That explains why the input ripple wasn't impacted much. Let's see if I can squeeze another cap right next to it...
And after the second cap the 3V3 spikes are massively reduced and the output ripple is down to 286 mV p-p.
I think that's it for tonight as it's getting late. Tomorrow I'll see about adding some back side caps and see if I can cut these spikes down even further.
Clearly on the right track, though.
Added ground strap on the top layer from the main ground via to the north end of C154. This actually *increased* the p-p ripple to 335 mV. Maybe because more high freq switching noise is getting onto the ground plane due to the lower ESL?
I think I'll keep it though, and just add more capacitance.
Second 0201 on the input doesn't seem to have helped. Maybe I need to add a bulk cap on the bottom now?
Ok I think I've hit the limit of what I can do with input capacitance.
Going from nothing, to an 0201 on the output, an 0201 on the input, a ground strap on top, a second 0201 on top, and an 0402 on the bottom:
VOUT p-p ripple: 596, 380, 286, 333, 328, 319 mV.
VIN p-p ripple: 62.6, 66.4, 43.7, 42.6, 39.7, 34.9 mV.
So let's see what happens if I bodge another cap on the output side next.
Top side 0201 on the output didn't change output ripple at all (319 mV) but dropped input to the lowest it's ever been (32.6). Huh.
Let's try a 0.47 uF 0402 on the top, right near the output cap.
That helped a lot. Output ripple now 236 mV. Let's see if I can find space for another.
The capacitive Tower of Babel is growing. I'm about to add my third cap to the second story.
And with that, ripple is down to 171 mV p-p.
Still not *great* but I'm tempted to stop soon. It seems like I'm hitting diminishing returns of what I can do without a re-layout of the supply.
Maybe one or two more, though. After lunch.
Nope, two more 0402s on the output piggybacked above the bulk cap did zilch. Probably too much inductance for them to help.
I guess I'll have to live with a 3.4x reduction from the original ripple. Improved, if not as far as I'd like.
This is the last planned rework.
So I'm gonna pull the board and give it a nice scrub in some SWAS to deflux, then set it back up on the bench and focus on firmware dev from here on out.
Have a few chores to do but when I'm done it's cleanup time.
This is the best flux remover I've found for no-clean fluxes, but it's not particularly good for you (10-30% tetrahydrofurfuryl alcohol).
The stuff is stinky enough that after rereading the SDS I've decided to stop using it outside the fume hood.
Board is cleaned up and back on the bench.
Tried out the CDR trigger input for the first time and I'm getting flatlined output. Not sure what happened there.
But it's a pretty low priority for debug because I have so many other things to do, and I already have a differential CDR trigger capability in hardware (no firmware to implement it, but it's there when I have time). The single-ended one was really just a bonus.
I'm not having good luck with comparators on this project though...
Aaaand I'm not getting signals on several of the input channels and the DAC Vref looks bad.
Thinking back, I don't think I ever solved the problem of DAC0 not giving valid output.
So cleaning the board might have been a bit premature, the DAC might still need rework...
In the meantime, let me at least validate all of the other stuff I can before decabling *again*.
FPGA and MCU firmware seems to be at MVP level (that's "minimum viable prototype", as in all core functionality needed for operation has been demonstrated but not necessarily fully debugged).
In particular, I can plug the device into a 10/100/1000baseT LAN (the SFP+ side still needs more work but it's functional without that) and it comes up on a hard coded IP address because I haven't got around to adding UART console commands to change that. It's reachable via SSH using a hard coded username and password and gives you a shell equivalent to what you get from the UART.
There's a SCPI server on port 5025 that provides commands to change direction of bidirectional ports, set threshold of inputs, and set swing of outputs.
Oh, and the crossbar itself is done and working and you can set mux selectors via SCPI.
The CDR trigger needs more hardware debug. The SFP+ links up and should be able to receive frames, but the arbitration logic for deciding whether to send frames to the SFP+ or the RGMII MAC isn't done yet so right now I hard code sending to RGMII.
The BERT/pattern generator ports are working and spitting out hardcoded PRBS15, but I don't have any firmware/gateware to control swing, pattern, inversion, FFE taps, or the receiver yet.
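For reference, PRBS15 is just a 15-bit maximal-length LFSR (x^15 + x^14 + 1 per ITU-T O.150, period 32767 bits). A bit-level model is handy for checking captures against; a sketch (function name is mine):

```python
def prbs15_bits(n, seed=0x7FFF):
    """First n bits of PRBS15 (polynomial x^15 + x^14 + 1, period 2^15 - 1).

    Standard ITU-T O.150 polynomial; seed is any nonzero 15-bit value.
    """
    state = seed & 0x7FFF
    bits = []
    for _ in range(n):
        # feedback is the XOR of taps 15 and 14 (bit indices 14 and 13 here)
        fb = ((state >> 14) ^ (state >> 13)) & 1
        bits.append(state >> 14)  # output the MSB each cycle
        state = ((state << 1) | fb) & 0x7FFF
    return bits
```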
As far as hardware goes, since this board seems to have had a lot of solder issues, let's just go through and validate every single one of the IO ports.
IN0...IN7 are nonfunctional due to the DAC being flatlined; all testing on them will have to be tabled until I rework the DAC, which I somehow forgot to do earlier. I swear I had gotten to it... this is what happens when you're doing bringup in between parenting a toddler, I guess lol. I should keep better notes.
All of the unbuffered outputs (OUT0-3) work.
All of the buffered unidirectional outputs (OUT4-7) work.
All of the bidir IOs (IO8-11) work as both inputs and outputs.
I strongly suspect the inputs on 0...7 will work fine once I fix the DAC. Guessing just bad soldering since the SPI bus to it looks fine. Just mad I didn't do that already as I thought I had finished the rework...
There are some strange hangs in the MCU firmware where it sometimes takes longer than expected to process a SCPI command, and it seems to take a while to start responding to ping when first powered up. So there's still debugging to do on that front too.
Reflowed the DAC and now it's definitely improved.
IN2/3 is flatlined despite having valid input and Vref looking correct.
Probably a solder defect - these are ones I reworked early on, and while they looked good visually, I never did a functionality check because I didn't have the necessary firmware done yet.
Also definitely need to debug why the firmware suddenly stops responding when I'm sending it SCPI commands, then is fine when I reconnect.
So much for thinking I was done with rework, lol.
But I think this actually *will* be the last rework considering I've now validated pretty much everything hardware wise.
Famous last words.
Anyway it's getting late enough I don't want to do any fine soldering but I am going to start work on getting the BERT functionality up.
I'm somewhat impressed. These two random 36" KF047 cables I had lying around are phase matched tightly enough that the ML4039-BTP BERT can send a 10.3125 Gbps PRBS15, with totally guesstimated FFE settings, to the Kintex-7 GTX with a very low BER (zero errors in the minute or so that I felt like waiting).
I guess now I have to implement all of the glue needed to support proper BERT operation. Like runtime configurable rates (ideally DRP to dynamically tweak PLL configuration but we'll see how that goes), eye scan, inversion, etc.
Pretty soon I'm going to be ready to start building a scopehal driver for this thing, the firmware feature set is coming together.
Still some instability on the Ethernet or TCP/IP side (not yet sure if FPGA or MCU is at fault) but that's orthogonal to the application firmware and I'm on a roll here so I think I'll keep it up.
And I think that's TX-side BERT functionality done, other than rate control (it's fixed 10.3125 Gbps right now... sub-rate to divide by 2/4/8/16 will be straightforward but hitting nice round lower rates will require some DRP fun since the QPLL is fixed 10.3125 for 10GbE in the same quad).
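For reference, the sub-rates that fall straight out of dividing the fixed line rate:

```python
full_rate = 10.3125e9  # line rate fixed by the 10GbE QPLL, in bits/s
for div in (2, 4, 8, 16):
    print(f"1/{div} sub-rate: {full_rate / div / 1e9} Gbps")
```

The /8 case gives the 1.28906 Gbps figure that comes up again later with the DC block lower-cutoff issue.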
This is probably complete enough feature set wise that I can start building the scopehal driver. There's no RX over SCPI yet, but that will come later.
Ok so it looks like the slow startup after the link comes up is because my IP stack never sends an ARP query for the default gateway.
So it just sits there not sending any packets until a gratuitous ARP or broadcast from the gateway's IP shows up.
That should get fixed.
And the random socket drops were due to the ARP cache entries for the destination host timing out and me not sending a query to refresh the cache before it expires.
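The fix is the usual policy: treat an entry as stale some margin before its TTL and send a refresh query then, rather than waiting for it to expire mid-connection. A sketch of the logic (timings and names are illustrative, not from my actual stack):

```python
ARP_TTL = 300.0        # seconds an entry stays valid (assumed value)
REFRESH_MARGIN = 30.0  # send a refresh query this long before expiry

class ArpEntry:
    def __init__(self, mac, learned_at):
        self.mac = mac
        self.learned_at = learned_at

def is_expired(entry, now):
    return now - entry.learned_at >= ARP_TTL

def needs_refresh(entry, now):
    # stale-but-usable window: refresh proactively so traffic never stalls
    return (not is_expired(entry, now)
            and now - entry.learned_at >= ARP_TTL - REFRESH_MARGIN)
```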
Well, that's fixed now lol. Also tinkered a bit more and found some FFE coefficients I like better than the ones I was using; this is a beautiful eye (10.3125 Gbps PRBS15, cable de-embedded to the SMPM reference plane).
So now after kiddo goes to bed I can start writing the scopehal driver for real lol.
Came back from troubleshooting some other issues to find that I may have made the deepest eye pattern I've ever done with ngscopeclient: 23 billion UIs with a mask hit rate of 1.07e-8.
And I suspect it'd be better if not for the fact that the CDR PLL seems to be locking ever so slightly left of center (note left side of the center mask is hitting while right has clearance).
I feel bad closing it but I need to start doing driver dev...
Making good progress. The libscopehal driver for the crossbar now has channel nodes for all of the inputs and outputs.
The crossbar ports are being mostly ignored for now as I need to do more GUI work to define the thresholding and such, that will happen in a few days.
The TX side of the BERT is now almost done. I need to add a "transmit disable" flag to set TXINHIBIT on the GTX (FPGA, MCU, and driver work involved), implement the driver-side code to set FFE coefficients, and then add query commands to the MCU firmware so a newly connected ngscopeclient session is able to correctly recognize the existing instrument setup.
Then I can start work on the RX.
And at some point I still have to work on timebase controls.
FFE control and readback now working. Output enable is now the only thing left to have full support for a TX-only BERT (aka a serial pattern generator).
Error insertion will come later but I don't have support for that in the API yet (the MultiLane BERT has support in the SDK but I don't think I ever added it in the scopehal driver).
I'll probably do some minimal timebase controls (just 1/2, 1/4, 1/8, 1/16 sub-rate modes without any PLL reconfiguration) next, and probably custom pattern support, before moving on to building the receiver.
Also need to add APIs to scopehal to say "don't show refclk controls because this BERT doesn't have ext ref in/out ports".
The "real" UltraScale+ based dedicated BERT I'm building later on will of course have this and more. The crossbar BERT is more of a minimalistic "as long as I have SERDES on the FPGA let me pin them out" deal.
Hooked up some glue logic in the FPGA to access the transceiver DRPs over QSPI from the MCU (not yet tested).
Currently fighting with sub-rate modes - the link is running at full rate no matter what I do. Not sure why.
There's always the option of changing the link speed over DRP too.
And I'm trying to figure out how I can make the transceiver work using the QPLL by default, but still have the option to configure the CPLL. I see a control signal that looks like it does what I want but I can't get the Vivado IP wrapper to let me actually see it.
Maybe I should just instantiate the raw GTXE2_CHANNEL primitive (which honestly I probably should have been doing all along).
Got sub-rate modes working, and configurable via SCPI (but not yet in the scopehal driver).
But there's an interesting reflection visible in the eye in sub-rate mode.
I currently have two suspects.
1) The Mini-Circuits BLK-18-S+ SMA DC blocks I have on the signal path, but mated to a 2.92mm connector. It's about the right length to be causing the issue, and is deliberately not torqued down because of the mismatched connector types. I have a pair of BLK-K44+ 2.92mm DC blocks inbound that I wanted to have anyway. If the issue goes away tomorrow when they get here I'll know.
2) Differential via on the PCB without a return current via near it (board design bug that slipped through design review, I have return vias on every high speed signal via except this pair and the one next to it).
Both of these diffpairs are missing return vias.
The right pair is the one I'm currently probing, and the best return path it's got is through the ground via 2.6mm to the right (which is also attached to an SMPS... but that's a different issue entirely!)
Swapped out all of the SMA parts, now it's SMPM on the DUT and all 2.92 on the scope side (properly torqued). Reflection didn't go away.
So it's probably the PCB, and the via transition is my lead suspect. Not enough to impair functionality (or even close to it), but something I can't unsee now.
At some point I'll compare TX1 (shown here), TX0 (left via in the above screenshot), and SYNC (TX-only channel with different layout that has return vias).
If SYNC doesn't have this artifact and TX0 does, I'll say case closed. If all three have it, then I'm more inclined to blame the SMPM launch.
Looks like the 10 MHz lower cutoff on my DC blocks (Mini-Circuits BLK-K44+) isn't low enough for some of the sub-rate testing I'm doing.
Here's a PRBS15 at 1.28906 Gbps (10.3125 / 8) with a long run of zeroes. The baseline wander is pretty clear.
Ordered some Centric RF C0140s which go down to 9 kHz.
Latest bench top organizational experiment. Little adhesive backed magnetic steel plates that mate to magnetic standoffs. Easy removal of the DUT for rework without messing with tape or screws, holds more securely than the tape setup I had been using.
Tradeoff is that moving things around means moving or replacing the steel plates, and they are conductive so you need to be careful about dangling parts on the PCB underside.
Starting to think about what I'm going to hang off that "front panel" connector.
Current WIP schematic has an indicator LED for each trigger in/out channel (eventually to be driven as a pulse-stretched activity LED), direction indicators for each of the bidir ports, an LCD that will eventually display IP/MAC address and other config settings, and power/reset buttons.
Still have to figure out all of the logistics of how to go from bare board to pretty front panel (light pipes, display bezels, etc). I've never had a project get that "done" before!
Speaking of front panels, I'm starting to noodle around in ProtoCase Designer to see how I'm going to actually make a chassis for this.
I'm sure there's cheaper services (and more F/OSS friendly tools) but for a one-off it'll get the job done without much fuss.
Initial observation: the standard "rackmount enclosure" they have expects the back and bottom panels to screw together, which blocks that area from being usable for PCB standoffs.
My boards all have mounting holes 5mm from the PCB edge. This is a problem.
Luckily it looks like they have an alternate enclosure design (1U extruded aluminum "type F") that has aluminum extrusions for the sides and flat rear panels, which should let me get by with tighter clearances.
But I definitely need to change that for my next design!
Ok so it looks like it's not going to be super practical to flush-mount the power board. But that's a single Mini-Fit Jr connector so it should be straightforward to rig up some kind of bodge cable to a panel-mounted connector.
On to the logic board for now...
After some more digging it looks like I actually *will* be able to flush mount the board in the extruded aluminum chassis. The connector is a bit recessed but I'll live with that for this board rev, oversizing the cutout slightly to provide clearance for the mating connector.
Next IBC board spin I'll make it stick out more.