Starting to think about GPU acceleration of protocol decodes (rather than just basic math blocks) in ngscopeclient.
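To give a feel for what would map cleanly onto the GPU, here's a minimal, hypothetical sketch of the per-symbol table-lookup portion of an 8B10B decode written as a data-parallel loop. The names (`DecodedSymbol`, `BuildCodeGroupTable`, `DecodeSymbols`) are made up for illustration and are not the libscopehal API; populating the real 8b/10b mapping is omitted for brevity.

```cpp
// Hypothetical sketch: the per-symbol portion of an 8B10B decode expressed as a
// data-parallel kernel. Illustrative only, not the actual libscopehal code.
#include <cstdint>
#include <vector>

struct DecodedSymbol
{
    uint8_t data;       // decoded byte (Dx.y) or control code (Kx.y)
    bool    control;    // true if this was a K character
    bool    error;      // true if the 10-bit code group is invalid
};

// 1024-entry table mapping every possible 10-bit code group to a decoded value.
// A real table marks valid Dx.y / Kx.y groups; here everything is flagged as an
// error just so the sketch compiles.
std::vector<DecodedSymbol> BuildCodeGroupTable()
{
    return std::vector<DecodedSymbol>(1024, DecodedSymbol{0, false, true});
}

// Each output symbol depends only on its own 10-bit code group, so this loop is
// trivially parallel and could be dispatched as one GPU thread per symbol
// (e.g. a Vulkan compute shader doing the same table lookup).
void DecodeSymbols(
    const std::vector<uint16_t>& codeGroups,    // one 10-bit group per element
    const std::vector<DecodedSymbol>& table,
    std::vector<DecodedSymbol>& out)
{
    out.resize(codeGroups.size());
    for(size_t i = 0; i < codeGroups.size(); i++)
        out[i] = table[codeGroups[i] & 0x3ff];
}
```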
Here's the filter graph I'm thinking of using as a benchmark: dual-lane QSGMII, 20M points per channel.
The filter graph takes 944 ms to run end to end on my box (2x Xeon 6144 + 2080 Ti).
Major time consumers:
* Eye pattern (~345 ms)
* QSGMII (~337 ms)
* CDR PLL (~310 ms)
* 8B10B (~160 ms)
* SGMII (~120 ms)
Note that the displayed times do not add up to the total wall clock execution time, because the filter graph scheduler runs blocks in parallel whenever possible to maximize performance.
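To illustrate what that means, below is a minimal sketch of a dependency-driven, wave-by-wave scheduler. It assumes an acyclic graph and is illustrative only, not ngscopeclient's actual scheduler; `FilterNode` and `RunFilterGraph` are hypothetical names.

```cpp
// Minimal sketch of a level-by-level parallel filter graph scheduler, assuming
// each filter knows its upstream dependencies and the graph is acyclic.
#include <functional>
#include <set>
#include <thread>
#include <vector>

struct FilterNode
{
    std::function<void()>    refresh;    // runs the filter (e.g. CDR PLL, 8B10B)
    std::vector<FilterNode*> inputs;     // upstream filters this one depends on
};

// Executes the graph in topological "waves": every filter whose inputs have all
// completed runs concurrently with the others in the same wave. Filters on
// independent paths (e.g. the two QSGMII lanes) therefore overlap, which is why
// the per-filter times sum to more than the wall-clock total.
void RunFilterGraph(std::vector<FilterNode*>& nodes)
{
    std::set<FilterNode*> done;
    while(done.size() < nodes.size())
    {
        // Collect all not-yet-run filters whose dependencies are satisfied
        std::vector<FilterNode*> ready;
        for(auto* n : nodes)
        {
            if(done.count(n))
                continue;
            bool ok = true;
            for(auto* dep : n->inputs)
                if(!done.count(dep))
                    ok = false;
            if(ok)
                ready.push_back(n);
        }

        // Run the whole wave in parallel, then wait for it to finish
        std::vector<std::thread> workers;
        for(auto* n : ready)
            workers.emplace_back([n]{ n->refresh(); });
        for(auto& t : workers)
            t.join();

        for(auto* n : ready)
            done.insert(n);
    }
}
```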