azonenberg,
@azonenberg@ioc.exchange avatar

Reworked all the bridges on the ESD diodes that I found during initial visual inspection, and tidied up a few bulk caps.

Did continuity tests to sanity check on each power rail and nothing is shorted.

Gonna start populating the front side after the little one goes to sleep. Should go faster than the back since it's mostly large ICs not hundreds of 0402s.

ftg,
@ftg@mastodon.radio avatar

@azonenberg
This whole thread was a delight to read.
Very interesting to see the bring up on a rather complex homebrew project.
The well structured approach does make it look far less daunting than I previously considered.

azonenberg,
@azonenberg@ioc.exchange avatar

@ftg Yeah you have to take a systematic approach for design and testing of something this complex.

This is my biggest, most expensive, and most complex board design by quite a few metrics. So pushing my own limits in terms of design, assembly, etc capability.

azonenberg,
@azonenberg@ioc.exchange avatar

Starting front side assembly. Paste print looks a lot nicer.

FPGA paste print
Ethernet PHY paste print

azonenberg,
@azonenberg@ioc.exchange avatar

I usually begin top side assembly with large but flat components like BGAs so I don't risk knocking tiny stuff around while placing them. Then smaller passives and ICs, and tall capacitors and connectors last.

This FPGA is the single most expensive component I've ever put on a board. Shipping an entire tray for one chip might be slight overkill though...

Closeup of the FPGA
BGA tray containing a single part

azonenberg,
@azonenberg@ioc.exchange avatar

All the BGAs and most of the big QFNs done. Still tons of tiny components left, but nowhere near as many as the back had!

ckfinite,

@azonenberg What does the Super GPIO LED(?) indicate?

azonenberg,
@azonenberg@ioc.exchange avatar

@ckfinite GPIOs on the supervisor MCU (little M0 which runs reset and power rail sequencing). Not to be confused with the main MCU, a much beefier M7.

azonenberg,
@azonenberg@ioc.exchange avatar

Probably about half done. Time to take a stretch break.

j,
@j@j-w.au avatar

@azonenberg just in awe of this (despite having missed what Latentpink is!)

azonenberg,
@azonenberg@ioc.exchange avatar

@j It's a 14x 10/100/1000baseT + 1x 10G SFP+ managed Ethernet switch. Fully open source switch engine on the FPGA, not using a closed switch ASIC.

This is a prototyping platform and testbed for LATENTRED, a planned 1U 24+1 or 24+2 port switch based on the same FPGA and code.

I've been interested in building an open hardware switch for over a decade at this point and things kept on delaying it for one reason or another. But I'm closer than I've ever been!

j,
@j@j-w.au avatar

@azonenberg oh, nice!! Thanks for the overview!

claudius,

@azonenberg are those antenna connectors as Test-Points?

azonenberg,
@azonenberg@ioc.exchange avatar

@claudius U.FL for power rail test points specifically.

The Teledyne LeCroy RPxxxx series probes (I have an RP4030) are designed to interface to U.FL to provide high bandwidth ripple/noise measurements.

This board has almost 30 power domains hence the numerous U.FL's :)

claudius,

@azonenberg neat :-D

azonenberg,
@azonenberg@ioc.exchange avatar

Getting closer. Mostly just power supply stuff left. The lab is getting to be a bit of a mess with component bins covering every bit of bench and floor space.

Part bins on the floor
The growing pile of empty SMT component tape

azonenberg,
@azonenberg@ioc.exchange avatar

But the board is starting to look pretty nice! Definitely less work than the back side.

Partially populated LATENTPINK board

azonenberg,
@azonenberg@ioc.exchange avatar

Here goes... Hope this works.

Board in the oven

Nukular,
@Nukular@chaos.social avatar

@azonenberg waitasec, you can just solder the IDC connectors for JTAG using paste in the oven? That's amazing 😍, it never even occurred to me that this was possible!

azonenberg,
@azonenberg@ioc.exchange avatar

@Nukular Pin in paste / thru hole reflow is a thing, yes.

You need suitable connectors (not all are able to get their plastic parts this hot) so check the datasheet carefully. For example the RJ45s I use on this board are almost certainly not reflowable, while the JTAG connectors are listed as reflowable in the datasheet.

It can also be tricky to get adequate solder volume. You can actually buy SMT chip component shaped "bricks" or "preforms" made of solder, on tape, that can be placed next to a thru hole component's lead to provide additional metal without adding more flux. I've been thinking of getting some.

Nukular,
@Nukular@chaos.social avatar

@azonenberg I love pin in paste and actually used it for USB connectors a few times. I'm just blown away that it's possible for the IDC connectors. I always shy away from using them because they can be a total pain to hand solder if you have many of them, and the SMD ones always feel like they're going to rip off way too easily.

azonenberg,
@azonenberg@ioc.exchange avatar

@Nukular The specific one I'm using here is Molex 0878311420.

The entire Milli-Grid connector series is specified as reflowable and Molex even has some test data and profile guidance on page 12-13 of https://www.japanese.molex.com/content/dam/molex/molex-dot-com/products/automated/en-us/productspecificationpdf/877/87761/PS-87761-100-001.pdf.

ccoetzer,

@azonenberg quick, before the wife wakes up and notices you are using her oven.

azonenberg,
@azonenberg@ioc.exchange avatar

@ccoetzer Lol this oven is for lab use only. No way am I letting someone get grease and crumbs all over it :)

azonenberg,
@azonenberg@ioc.exchange avatar

Out of the oven, BGAs all look good under side view optical microscopy (best I can do without X-ray).

Two 0402s needed touchup with an iron due to poor wetting; they were 33 ohm resistors from a reel I've had since 2014 so they might be starting to oxidize too much for my ROL0 flux to handle.

Tomorrow I'll populate the through hole connectors then start the bringup process.

azonenberg,
@azonenberg@ioc.exchange avatar

All soldered up and ready to start bringup!

Later today after my little lab assistant goes to bed, that is. She's still a year or two from being ready to take readings off test points for me... Being able to speak in full sentences is probably a prerequisite.

These are just quick phone pics, I'll do some beauty shots with the A7R and macro lens later.

Assembled board seen from the front
Assembled board seen from the bottom

azonenberg,
@azonenberg@ioc.exchange avatar

Fit testing the thermal solution. Looks mostly good, but not permanently mounting it yet. If I find problems early on it'll be easier to rework without a heatsink in the way.

I provisioned for two fans but we'll start with one and see how it goes.

The QDR-II+ heatsink is somewhat sheltered by the RS232 jack and probably won't see much airflow, but heatsinking it was more of a "just in case" vs the FPGA and main PHY which will definitely need it. So I think I'll be OK.

wackyvorlon,

@azonenberg Okay, this is probably a stupid question, what are you building?

azonenberg,
@azonenberg@ioc.exchange avatar

@wackyvorlon 14x 10/100/1000baseT + 1x 10Gbase-R SFP+ managed Ethernet switch. FPGA based fabric (so fully open source packet datapath) rather than using a switch ASIC.

This is a scaled down prototype for a planned 24 port 1U version.

wackyvorlon,

@azonenberg Wow. That is really cool. Why are you building it?

azonenberg,
@azonenberg@ioc.exchange avatar

@wackyvorlon Because I can? Lol. That, and my Cisco 24 port edge switches are getting a bit long in the tooth and I need more baseT ports on my main lab bench.

A homebrew switch has been on my projects list since 2012 or so. My first board was a 4 port switch built around a 25k cell Spartan-6 but a combination of board design issues, lack of FPGA space, lack of experience, and lack of proper test equipment meant it never worked.

Over the years I've built various bits and bobs, got better at FPGA design, and got better test equipment so I can actually test it.

azonenberg,
@azonenberg@ioc.exchange avatar

@wackyvorlon And my backorders from 2021 arrived a few months ago so I finally had the parts I needed to build it.

wackyvorlon,

@azonenberg Good enough reason for me! That stuff is very much a dark art to me. It’s an amazing project, I’m incredibly impressed. Also I’m so curious, how much was that super expensive FPGA?

azonenberg,
@azonenberg@ioc.exchange avatar

@wackyvorlon https://www.digikey.com/en/products/detail/amd/XC7K160T-2FBG484C/3911021

Currently sells for $435 although I ordered it back in 2021 so price may have been a bit different then.

Putting it in the toaster oven was still a bit nerve wracking no matter how many times I double checked.

wackyvorlon,

@azonenberg Holy shit. That must be able to do some incredible stuff.

wackyvorlon,

@azonenberg Just ran across this FPGA. I had no idea ones this expensive even existed…

https://www.digikey.ca/en/products/detail/amd/XC7K410T-2FFG900I/3543163

azonenberg,
@azonenberg@ioc.exchange avatar

@wackyvorlon You think that's big? Look up the XCVU9P lol.

Needless to say I won't be using one in a design any time soon

wackyvorlon,

@azonenberg Oh my GOD. What on earth are those used for?!

azonenberg,
@azonenberg@ioc.exchange avatar

@wackyvorlon ASIC R&D mostly. If you're spending many $M on a new SoC or GPU design, spending a few bucks on prototyping hardware is not going to break the bank.

Also, Digikey markups for expensive chips in low volume are insane. The devkit for that $50K chip is like $8K so I have to assume they charge big clients even less than that per chip. Expensive, but if you're Qualcomm or Apple or Nvidia you can easily afford racks full of them to support a big project.

manawyrm,
@manawyrm@chaos.social avatar

@azonenberg @wackyvorlon XCVU57P goes brrrr 😬

„Hey, eh, Boss, that strange new BGA chip, I broke that while soldering, is that an issue?“ 😹

azonenberg,
@azonenberg@ioc.exchange avatar

@wackyvorlon Yeah it's the biggest FPGA I've used in a design.

I own larger ones (an XC7A200T on an AC701 dev board, an XC7A200T in inventory that I haven't made a board for, and a pair of XCAU25P's also in inventory), and have used an XCVU9P on somebody else's dev board, but haven't done a board design for any of those yet.

Spec wise the XC7K160T has 101400 LUT6s, 202800 flipflops, 600 DSP slices (25x18 bit multiplier plus some extra stuff for efficient FIR filter implementation), 325 36 Kbit SRAM blocks, sixteen PLLs, and up to eight 12.5 Gbps SERDES lanes and 400 GPIOs. The package I'm using on this board "only" pins out four of the SERDES (max 10 Gbps in this speed grade), and 285 GPIOs.

Not at all enormous by professional FPGA standards, but roughly 20x the logic capacity of, and significantly faster than, an ice40hx8k.

azonenberg,
@azonenberg@ioc.exchange avatar

Kid is asleep so it's back to the lab for me.

After a bit of cable management we've got the first signs of life out of the board.

Applied 12V power to the input and it's drawing 3.6 mA. This is normal and expected, as all power rails are supposed to be off at this point other than the raw input and the 3.3V standby rail driven by an LDO to power the supervisor.

Next step is to put some code on the supervisor and start bringing up more power rails.

Bench setup showing switch board and support cabling

azonenberg,
@azonenberg@ioc.exchange avatar

Supervisor is alive enough to respond to SWD. That's a good sign.

azonenberg,
@azonenberg@ioc.exchange avatar

Spent a little while updating my STM32 peripheral library for the L031 (this was my first design using it) but I now have the PLL active and a blinky running at 16 MHz from flash.

Now to get a serial console up so I can get some more debug output besides a single LED...

azonenberg,
@azonenberg@ioc.exchange avatar

grumbles and resets "days since last wasted time chasing bug caused by a datasheet errata" counter to zero

jpm,
@jpm@aus.social avatar

@azonenberg I love those. The SAMG51 errata that says “these dozen pins should only be configured as inputs in GPIO mode” is a fave of mine

azonenberg,
@azonenberg@ioc.exchange avatar

Ok, UART is alive. Next step is to bring up a timer, then I'll have enough stuff working on the supervisor that I can begin actual power rail testing.

Can you tell I spend a lot of time in IDA? :P

azonenberg,
@azonenberg@ioc.exchange avatar

Timer and logging framework are up. Ready to actually move forward with bringup.

So far the only rails that are active are 12V0_RAW (unregulated 12V prior to the main load switch) and 3V3_SB (3.3V standby for the supervisor), which is very in spec - averaging 3.30027V.

azonenberg,
@azonenberg@ioc.exchange avatar

Next rail is 12V0, the core 12V power feed for all of the other DC-DC converters. This is driven by a load switch which limits slew rate so that I don't pull too much inrush current.

This is the first rail that's under software control from the supervisor.

It came up just fine and measures 11.9979V. Total power draw from the input climbed to 17 mA which doesn't sound unreasonable for five big DC-DC bricks.

azonenberg,
@azonenberg@ioc.exchange avatar

Next is 1V0, the core power supply for the FPGA, QSGMII PHY, and SGMII PHYs. This is a big one with a lot of load on it, so lots of room for something to go wrong.

It came up perfectly as well, sitting at about 1.00015V. Overall input current draw is around 100 mA at 12V, so an extra 83 mA. Assuming 90% conversion efficiency this means the board is pulling about 896 mA on 1V0 at idle!
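The 896 mA figure can be reproduced with a few lines of arithmetic. This is only a sketch in Python (the function name is made up for illustration), using the 17 mA and 100 mA input readings and the assumed 90% efficiency quoted in the posts:

```python
def estimate_rail_current(delta_input_a, v_in, v_rail, efficiency):
    """Estimate the load current on a rail from the rise in input current."""
    input_power_w = delta_input_a * v_in       # extra power drawn at the input
    rail_power_w = input_power_w * efficiency  # what survives the DC-DC converter
    return rail_power_w / v_rail               # current delivered at the rail voltage


# Input current rose from 17 mA to about 100 mA at 12 V when 1V0 came up.
i_1v0 = estimate_rail_current(0.100 - 0.017, v_in=12.0, v_rail=1.0, efficiency=0.90)
print(f"{i_1v0 * 1000:.0f} mA")  # prints "896 mA"
```

The result is only as good as the efficiency guess; a converter running at light load may be well below 90%.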

In the interests of limiting potential damage to the expensive prototype if there's a short, the supervisor is pretty aggressive with timing and rail monitoring. If it commands a rail to come up and it fails to give PGOOD after 5 ms, it will automatically panic and shut down all power, then print a diagnostic message to the UART.

Code snippet showing the rail monitoring logic
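The snippet itself was posted as an image, so the real firmware is not reproduced here. As a rough illustration of the behavior described (enable a rail, poll PGOOD until a deadline, panic and drop everything on timeout), here is a small Python simulation. `SimRail`, `Supervisor`, the polling interval, and the PGOOD delays are all hypothetical stand-ins, not the author's code:

```python
from dataclasses import dataclass


@dataclass
class SimRail:
    """Stand-in for a regulator: PGOOD asserts a fixed time after enable."""
    name: str
    pgood_delay_us: int
    enabled: bool = False
    enabled_at_us: int = 0

    def enable(self, now_us):
        self.enabled = True
        self.enabled_at_us = now_us

    def disable(self):
        self.enabled = False

    def pgood(self, now_us):
        return self.enabled and (now_us - self.enabled_at_us) >= self.pgood_delay_us


class Supervisor:
    def __init__(self, timeout_us, poll_us=100, log=print):
        self.timeout_us = timeout_us
        self.poll_us = poll_us
        self.log = log
        self.now_us = 0    # simulated clock, integer microseconds
        self.active = []   # rails enabled so far

    def panic_shutdown(self, reason):
        # Drop every rail immediately, with no sequencing delays.
        for rail in self.active:
            rail.disable()
        self.active.clear()
        self.log(f"PANIC: {reason}, all rails shut down")

    def start_rail(self, rail):
        rail.enable(self.now_us)
        self.active.append(rail)
        deadline = self.now_us + self.timeout_us
        while self.now_us <= deadline:
            if rail.pgood(self.now_us):
                self.log(f"{rail.name} good after {self.now_us - rail.enabled_at_us} us")
                return True
            self.now_us += self.poll_us
        self.panic_shutdown(f"{rail.name} gave no PGOOD within {self.timeout_us} us")
        return False


messages = []
sup = Supervisor(timeout_us=2000, log=messages.append)
ok_1v0 = sup.start_rail(SimRail("1V0", pgood_delay_us=800))   # made-up delay
ok_1v2 = sup.start_rail(SimRail("1V2", pgood_delay_us=3000))  # made-up slow soft start
```

With these invented delays the second rail trips a 2 ms timeout and everything is shut down; rerunning with `timeout_us=5000` lets both rails come up.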

azonenberg, (edited )
@azonenberg@ioc.exchange avatar

Unfortunately, the streak has come to an end with 1V2 which failed to come up within the (admittedly aggressive) 2 ms timeout. The automatic shutdown did its job and I don't think anything fried.

Next step: toss some probes down and see what's going on with that rail.

azonenberg,
@azonenberg@ioc.exchange avatar

It's definitely not a dead short as I saw several hundred mV on the multimeter that was monitoring the rail. So maybe it's just soft-starting slower than I expected?

azonenberg,
@azonenberg@ioc.exchange avatar

Time to break out the sillyscope and see what's going on...

azonenberg,
@azonenberg@ioc.exchange avatar

Looks like the 2ms timeout might just be too aggressive. It seems like the rail (blue) is coming up just fine then the MCU gets antsy and shuts it down before it's come up all the way.

But hey, it was a good test of my protections!

azonenberg,
@azonenberg@ioc.exchange avatar

Yep, 2ms was too tight. With a 5ms timeout it comes up fine and is putting out 1.193V.

azonenberg,
@azonenberg@ioc.exchange avatar

1V8 is next. This is the core rail for the QDR-II+ SRAM and also runs (through a load switch which is currently off) most of the single ended digital I/Os on the board.

It came up fine, measures 1.792V, and the board is pulling 135 mA from the 12V input.

Still performing nominally but I need to be up early-ish tomorrow to do family weekend things so this is probably as far as I'm going to get.

Tomorrow I need to bring up the 1V8_IO, 2V5, 3V3, and Vref/Vtt rails and verify that all of the analog rails filtered off the core ones have correct voltages.

Then I can hook up to the JTAG on the main MCU and the FPGA, load some blinkies, and begin the fun part of the bringup process!

azonenberg,
@azonenberg@ioc.exchange avatar

But first, time to open a support case with STMicro for the six datasheet errata I found while bringing up the supervisor firmware.

Just once I want to do a design with a new digital chip of nontrivial complexity and not have to do this. Plz?

emeb,
@emeb@society.oftrolls.com avatar

@azonenberg Which ST part and what errors did you spot?

azonenberg,
@azonenberg@ioc.exchange avatar

@emeb STM32L031.

Mostly reserved blocks not being the size the datasheet said they were (important, because I use these sizes for padding blocks in linker scripts to make sure each SFR is at the right offset).
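To show why a wrong reserved-block size matters, here is a small sketch using Python's `ctypes` instead of a linker script or C++ struct. The peripheral, register names, and offsets are invented for illustration and do not come from the STM32L031 documentation:

```python
import ctypes


class GoodPeriph(ctypes.Structure):
    _fields_ = [
        ("CR", ctypes.c_uint32),             # 0x00
        ("SR", ctypes.c_uint32),             # 0x04
        ("reserved0", ctypes.c_uint32 * 2),  # 0x08..0x0F, padding only
        ("DR", ctypes.c_uint32),             # 0x10
    ]


class BadPeriph(ctypes.Structure):
    _fields_ = [
        ("CR", ctypes.c_uint32),
        ("SR", ctypes.c_uint32),
        ("reserved0", ctypes.c_uint32 * 1),  # one word short, as a doc error might cause
        ("DR", ctypes.c_uint32),
    ]


# Offsets the (hypothetical) register map says each register should have.
EXPECTED_OFFSETS = {"CR": 0x00, "SR": 0x04, "DR": 0x10}


def check_layout(struct_type, expected):
    """Return (field, actual_offset, expected_offset) for every mismatch."""
    return [
        (name, getattr(struct_type, name).offset, want)
        for name, want in expected.items()
        if getattr(struct_type, name).offset != want
    ]


good_errors = check_layout(GoodPeriph, EXPECTED_OFFSETS)
bad_errors = check_layout(BadPeriph, EXPECTED_OFFSETS)
```

A single mis-sized reserved block shifts every register after it, which is why this kind of documentation error shows up as a peripheral that simply does not respond.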

azonenberg,
@azonenberg@ioc.exchange avatar
emeb,
@emeb@society.oftrolls.com avatar

@azonenberg OK, that makes sense. I'm curious what are the advantages of putting your peripheral register definitions into the linker script vs using initialized pointers the way ST's CMSIS headers do it?

emeb,
@emeb@society.oftrolls.com avatar

@azonenberg (actually not "initialized pointers" but more structures with addresses defined by preprocessor macros)

azonenberg,
@azonenberg@ioc.exchange avatar

@emeb I find this a lot more readable.

azonenberg,
@azonenberg@ioc.exchange avatar

Time to bring up another rail. 1V8_IO is slightly lower than 1V8 (1.7886V) due to voltage drop across the load switch, but this is well within acceptable limits.

Pulling 188.9 mA (2.3W) on the 12V input.

azonenberg,
@azonenberg@ioc.exchange avatar

Tried to bring up Vref / Vtt for the QDR-II+ but I'm seeing 1.8V instead of 900 mV which isn't good.

This is <= VCCIO so I don't think I damaged any of the input buffers on the RAM. (The FPGA is definitely fine since these pins aren't even configured as Vref inputs yet and the bank is powered by 1.8V).

But I either have a PCB assembly problem or something wrong in the schematic. Time to do some digging...

azonenberg,
@azonenberg@ioc.exchange avatar

Not great, seeing 1.8V on Vtt even with the Vtt regulator disabled (but with 1V8_IO on).

With 1V8_IO disabled but 1V8 on, I'm also seeing 1.8V on Vtt. But Vref isn't showing much of anything in that state.

azonenberg,
@azonenberg@ioc.exchange avatar

OH, I think I see the problem. AVIN of the LP2996 is fed by 3V3, which hasn't come up yet. So we probably get weird behavior from the regulator if it's given a signal on PVIN (1V8) before AVIN comes up...

azonenberg,
@azonenberg@ioc.exchange avatar

Yep, LP2996 datasheet says that AVIN is supposed to come up first.

This is a bit of a conundrum because the FPGA has VCCAUX driven by 1V8 and VCCO driven by 3V3.

And it wants VCCAUX to come up before VCCO.

But we're allowed to do the opposite (VCCO > VCCAUX + 2.625V) for up to TVCCO2VCCAUX (300-800ms depending on temperature) per power cycle, with a total of 240K power cycles. This might lead to some glitching on 3.3V GPIOs on the FPGA but I think that will be OK in this use case.

Yay for fixing hardware problems in software! I'll just bring up 3.3V before 1.8 and we should be OK.

For LATENTRED I'll switch AVIN to run on 3V3_SB at which point everything should be OK.
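One way to keep this kind of ordering rule from being forgotten is to write it down as data and check the power-up sequence against it. The sketch below is hypothetical: the function name is invented, and the full rail order is an assumption pieced together from the thread, not the supervisor's actual sequence table.

```python
def check_sequence(order, must_precede):
    """Return the (first, second) constraints that `order` violates."""
    position = {rail: i for i, rail in enumerate(order)}
    return [(a, b) for a, b in must_precede if position[a] > position[b]]


# LP2996: AVIN (3V3 on this board) should be up before PVIN (1V8).
constraints = [("3V3", "1V8")]

# Assumed orders before and after the software fix.
original = ["12V0", "1V0", "1V2", "1V8", "1V8_IO", "3V3", "2V5"]
fixed = ["12V0", "1V0", "1V2", "3V3", "1V8", "1V8_IO", "2V5"]

original_violations = check_sequence(original, constraints)
fixed_violations = check_sequence(fixed, constraints)
```

Note that the fixed order deliberately goes against the FPGA's preferred VCCAUX-before-VCCO order and relies on the datasheet allowance described above, so that preference is not listed as a hard constraint here.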

azonenberg,
@azonenberg@ioc.exchange avatar

With this sequencing fix, 3V3 comes up fine (slightly low, 3.2804V, but that's acceptable) and the board is now drawing 207 mA / 2.5W at the input.

And Vtt is now showing zero volts with the regulator disabled, which is what we expect.

With the regulator enabled (and 1V8_IO enabled) we show 900.39 mV on Vtt and 897.33 mV on Vref, while drawing 227 mA (2.7W) at the input.

This is a bit more of a Vref-Vtt delta than I'd like but it shouldn't be enough to cause problems.

azonenberg,
@azonenberg@ioc.exchange avatar

Final power rail is 2V5, which runs a lot of analog stuff in the PHY.

This came up fine as well, although also a bit low: 2.4834V.

Now pulling 293 mA (3.5W) at the input.

This is all of the core power rails done. Now I just have to add a few lines of code to release the FPGA and MCU resets and I'll be ready to start bringup of them.

azonenberg,
@azonenberg@ioc.exchange avatar

Main MCU is alive enough to respond to SWD! Always a good sign.

azonenberg,
@azonenberg@ioc.exchange avatar

MCU VSMPS is 1.3705V, VCORE is 1.0108V. I think that sounds right for default state with no configuration?

azonenberg,
@azonenberg@ioc.exchange avatar

After a bunch of driver fighting to try and get Vivado and OpenOCD to each open one (and only one) of the two Digilent HS2 JTAG dongles I had plugged into the same computer, we have the FPGA responding to JTAG and giving good rail voltages off the XADC!

azonenberg,
@azonenberg@ioc.exchange avatar

This was a close shave. Almost couldn't fit both JTAG cables next to each other.

I verified non-interference of the board side male connector but forgot the female IDC connector overhung on the sides.

azonenberg,
@azonenberg@ioc.exchange avatar

Bringup is going pretty well I think.

Maybe could use a bit more kapton tape?

jpm,
@jpm@aus.social avatar

@azonenberg a crap-ton of kapton

azonenberg,
@azonenberg@ioc.exchange avatar

Gradually bringing up firmware on the main MCU. UART, uptime timer, and config variable database are running, about to work on the link to the FPGA.

But first I need to do a bit of independent testing on the FPGA.

azonenberg,
@azonenberg@ioc.exchange avatar

Switch PCB beauty shots as promised before I cover up all the pretty laser markings on the FPGA with ugly blue thermal pads and heatsinks :P

Angled view of PCB top side from the rear
Top down view of FPGA, MCU, and main PHY

azonenberg,
@azonenberg@ioc.exchange avatar

As I continue bringup, the next issue I'm finding - not a bug per se, more of a lack of forethought - is that the main MCU doesn't have a connection to the DONE pin on the FPGA. (The supervisor does, but there's no data bus between them, just two GPIOs intended to be used for requesting a power cycle or soft shutdown).

I can probably hack something up by putting a pulldown on the quad SPI CS# line and waiting for it to be driven high by the FPGA, or something. But that's still a bit annoying.

azonenberg, (edited )
@azonenberg@ioc.exchange avatar

And now the heatsinks are on. Accidentally used a slightly too large thermal pad on the FPGA (it overhangs by 0.5mm or so) which I might trim eventually, but it's nonconductive silicone so shouldn't hurt anything to leave on. Just ugly.

nanographs,

@azonenberg what compound is the thermal pad?

azonenberg,
@azonenberg@ioc.exchange avatar

@nanographs T-Global Technology TG-A6200-25-25-0.5.

Here's the specs https://www.tglobaltechnology.com/uploads/files/tds/TG-A6200.pdf

I had ordered a smaller 15x15mm pad for the FPGA (24x24mm package, but the die is around 10x14mm and the substrate doesn't need cooling) and intended this for the 26x26mm Ethernet PHY.

Accidentally opened the wrong package and it "looked right" so I put it on and didn't realize it was too big until I had already pushed the pins in.

azonenberg,
@azonenberg@ioc.exchange avatar

Just loaded a test bitstream on the FPGA and verified the LEDs all work. And the supervisor is able to see when the FPGA is up.

Next step, I think, will be getting the MCU and FPGA to talk to each other.

azonenberg,
@azonenberg@ioc.exchange avatar

Got a stripped down version of the base FPGA bitstream running.

It's super nice having all of the data from different instrumentation all coming to one place in ngscopeclient so I can have a single dashboard to look at everything.

azonenberg,
@azonenberg@ioc.exchange avatar

And here's the filter graph to go with it (plus an extra bonus calculating total power drawn by the DUT)

azonenberg,
@azonenberg@ioc.exchange avatar

After fixing a few PEBKAC issues, MCU and FPGA are talking over quad SPI.

But the data coming back is shifted by a nibble or two from what I expect. Not yet sure if timing or logic issue.

Should have put test points on the QSPI bus but silly me thought that since it worked last time, I'd be fine with PHY layer stuff and could just use an ILA on the FPGA...

azonenberg,
@azonenberg@ioc.exchange avatar

And they're talking properly! That's it for tonight, I have to be awake in five hours...

I'll probably work on thermal stuff after work, since that affects the health of the rest of the board. The tachometer output of the fan goes to the FPGA (for... reasons) so I need to implement a speed monitoring block and make it output RPM values over QSPI to the MCU.

Then I need to add a PWM generator on the MCU, and bring up an I2C bus to poll the four temperature sensors around the PCB.

azonenberg,
@azonenberg@ioc.exchange avatar

Also I found a new design oversight.

I have monitors for the supervisor on every regulator PGOOD pin so I can detect and shut down if a rail starts sagging due to overcurrent etc.

But I don't have an ADC pin on the 12V input so I can't detect a failure of input power and sequence rails off properly. All I can do is wait until one rail trips out of regulation then panic shutdown the rest (without proper sequencing delays since this is indistinguishable from a short).

jpm,
@jpm@aus.social avatar

@azonenberg can you bodge it in?

azonenberg,
@azonenberg@ioc.exchange avatar

@jpm There's nowhere to bodge it to: the dinky little 32 pin qfn is 100% utilized except for one gpio led that I'm pretty sure isn't on an ADC pin.

I've found a few missing things on the supervisor so I think for LATENTRED I'll bump up to a larger package as well as adding some kind of proper data bus (i2c/spi) between it and the main micro.

anotherandrew,

@jpm @azonenberg you could bodge a comparator output going to that pin as a “12V failing” signal I guess?

azonenberg,
@azonenberg@ioc.exchange avatar

@anotherandrew @jpm Yeah it's definitely possible. For a prototype that I hopefully won't be power cycling too often, probably not worth the effort.

But something of the sort will definitely be designed into the full scale switch.

jpm,
@jpm@aus.social avatar

@azonenberg damn! QFN48 will give you a bit more breathing room, probably end up finding a few extra bits to hook up too

azonenberg,
@azonenberg@ioc.exchange avatar

Cool observation from high resolution power rail monitoring: the STM32 internal voltage reference drifts slightly with temperature, so as the die warms up after having been off all night the regulated core voltage slowly increases.

azonenberg,
@azonenberg@ioc.exchange avatar

I2C4 isn't happy. Trying to read the MAC address EEPROM and getting hung up sending an I2C start bit. The register is supposed to be self cleared in hardware and I'm not seeing it ever clear.

So either there's a peripheral setup issue (nothing jumps out at me in a quick register dump) or something is wrong in hardware (SDA or SCL stuck/open).

Unfortunately this bus is on internal and back side routing exclusively (again, should have put a top side test point on... Derp). So I'm gonna have to rip off some tape and invert the board when I get home from work and see what's really going on.

azonenberg,
@azonenberg@ioc.exchange avatar

Started a google doc with a live "things to do better next time" list. So far all are minor annoyances or things I can work around without having to bodge the board. (Anyone have a self hosted, lightweight suggestion for this kind of thing? Etherpad or something?)

https://docs.google.com/document/d/10j4HWuMBLfLvX5Notvezs26lcIxuNnWbeJlv_JciUEA/edit?usp=drivesdk

The I2C4 issue smells like a soldering issue so far, but I'll know more when I get home and land probes on the bus.

My main bench scope is out for service still so I'll need to use the 16 GHz monster to troubleshoot my I2C. Miiiiiight be slight overkill...

(I could also use the PicoScope but it's on the other side of the bench, not sure if probes will reach all the way over here)

cr1901,
@cr1901@mastodon.social avatar

@azonenberg I use TiddlyWiki for my personal notes, but I wouldn't necessarily call that lightweight.

azonenberg,
@azonenberg@ioc.exchange avatar

@cr1901 I'm specifically thinking for realtime collaborative editing vs an explicit edit/save single user paradigm like a wiki.

cr1901,
@cr1901@mastodon.social avatar

@azonenberg I wonder if hackmd had any user-hosted alternatives...

jix,

@azonenberg I quite like https://hedgedoc.org/ (formerly known as CodiMD/HackMD CE)

It's a realtime collaborative markdown editor, so not WYSIWYG but with a side-by-side realtime preview. I consider that an advantage for most of my use cases. Getting formatted and/or structured content out of an etherpad lite into something else used to be somewhat annoying, but is trivial with markdown.

No experience with hosting it myself, but I also haven't heard any complaints from people I know that host one (who mostly moved away from etherpad lite).

f4grx,
@f4grx@chaos.social avatar

@azonenberg framapads

azonenberg,
@azonenberg@ioc.exchange avatar

Back from work and debugging the I2C issues.

I2C1 (temp sensors) is giving NAKs to any bus access while I2C4 (mac addr eeprom) hangs trying to send a start bit.

Probing I2C1 at the pins of the temp sensors shows SDA stuck at 0 while SCL is floating high as expected. Wonder if I have a bad solder connection on the pullups?

Time to pull some tape and cables off the board and get it back under the microscope.

azonenberg,
@azonenberg@ioc.exchange avatar

Spaghetti situation is not improving. And I'm even more confused. I think I'm closer to the issue, but I don't know what it is yet.

azonenberg,
@azonenberg@ioc.exchange avatar

OK, that explains everything.

Misread the alt function table and had PB6-PB9 set to AF4.

Turns out that while AF4 is I2C4 on some other pins, on PB8/PB9 it's... I2C1.

So I had two sets of pins muxed to the same peripheral and Bad Things(tm) happened, including traffic going out the wrong pins (gee, I wonder why it never got acked...)

azonenberg,
@azonenberg@ioc.exchange avatar

After changing PB8/PB9 to AF6, the correct location for I2C4, both buses are now happy!
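This class of mistake (two pins muxed to the same peripheral signal) is easy to catch with a table lookup. The sketch below is illustrative only: the table holds just the entries implied by the thread, the SCL/SDA assignment per pin is an assumption, and the real alternate function table in the MCU datasheet should be treated as the authority.

```python
from collections import defaultdict

# (pin, alternate function number) -> (peripheral, signal). Partial and assumed.
AF_TABLE = {
    ("PB6", 4): ("I2C1", "SCL"),
    ("PB7", 4): ("I2C1", "SDA"),
    ("PB8", 4): ("I2C1", "SCL"),
    ("PB9", 4): ("I2C1", "SDA"),
    ("PB8", 6): ("I2C4", "SCL"),
    ("PB9", 6): ("I2C4", "SDA"),
}


def find_duplicate_signals(pin_config):
    """Map each (peripheral, signal) routed to more than one pin to those pins."""
    users = defaultdict(list)
    for pin, af in pin_config.items():
        users[AF_TABLE[(pin, af)]].append(pin)
    return {signal: pins for signal, pins in users.items() if len(pins) > 1}


broken = {"PB6": 4, "PB7": 4, "PB8": 4, "PB9": 4}  # everything on AF4
fixed = {"PB6": 4, "PB7": 4, "PB8": 6, "PB9": 6}   # PB8/PB9 moved to AF6

broken_conflicts = find_duplicate_signals(broken)
fixed_conflicts = find_duplicate_signals(fixed)
```

Running a check like this over the whole pin configuration at build time would flag the double-muxed I2C1 signals before any hardware debugging is needed.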

azonenberg,
@azonenberg@ioc.exchange avatar

Yep, this looks more sane.

The FPGA -> MCU QSPI link probably needs some timing tweaks still; it works at 25.6 MHz but when I try to bump it up to 32 or 42.6 MHz I start seeing results shifted by a nibble.

Will troubleshoot that later, I don't need more than 100 Mbps of MCU-FPGA throughput now (if ever).

Next step will be building the fan tachometer in the FPGA, I think.

azonenberg,
@azonenberg@ioc.exchange avatar

Tachometer core on the FPGA builds OK but is giving values that are way off the ~5k RPM I measured for the fan with a scope.

Not yet sure why. The tach block integrates N (currently 16) cycles of the waveform, measuring period against a stable reference clock, then converts frequency from Hz to RPM.

I have a dead time (currently 1000 clocks at 187.5 MHz, so 5.3 us) after each toggle for debouncing which might be too short. Or maybe it's a math error converting from Hz to RPM. I'll find out tomorrow.

azonenberg,
@azonenberg@ioc.exchange avatar

Turns out that while I did have a small math error (two pulses per revolution on the green wire, not two toggles per revolution), the main error was actually in my bit-serial divider IP.
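For reference, the period-to-RPM conversion can be sketched as below. This is a Python model with invented names, not the FPGA block: it assumes the tach counts reference clock ticks across 16 full pulses, a 187.5 MHz reference, and two pulses per revolution as described in the posts.

```python
def tach_rpm(ref_clocks, cycles=16, ref_hz=187_500_000, pulses_per_rev=2):
    """Convert a period measurement into RPM using integer math.

    ref_clocks: reference clock ticks counted across `cycles` tach pulses.
    """
    if ref_clocks <= 0:
        raise ValueError("ref_clocks must be positive")
    # pulse frequency = cycles * ref_hz / ref_clocks; RPM = freq * 60 / pulses_per_rev
    return (cycles * ref_hz * 60) // (ref_clocks * pulses_per_rev)


# A 5000 RPM fan gives about 166.7 pulses per second, so 16 pulses span
# 18,000,000 ticks of a 187.5 MHz clock.
rpm = tach_rpm(18_000_000)
rpm_wrong_factor = tach_rpm(18_000_000, pulses_per_rev=1)  # off by 2x
```

Getting the pulses-per-revolution factor wrong scales the reading by a constant, which is the "small math error" kind of bug rather than the wildly wrong values a broken divider would produce.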

Which I had written back in grad school for my thesis, and it worked great on that CPU because I happened to have the inputs stable from when a divide was issued until it retired. The interface spec called for the divider to register the inputs on the first cycle, but one line of code used the unregistered value instead. Oops!
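The interface rule in question (capture the operands on the first cycle, then work only from the captured copies) looks roughly like this in a software model. This is a generic restoring divider written for illustration, not the author's IP; the class and method names are invented.

```python
class BitSerialDivider:
    """Unsigned restoring divider producing one quotient bit per step()."""

    def __init__(self, width=32):
        self.width = width
        self.busy = False
        self.quotient = 0
        self.remainder = 0

    def start(self, dividend, divisor):
        if divisor == 0:
            raise ZeroDivisionError("divisor must be nonzero")
        # Register both operands now. Every later step uses these copies,
        # so the caller is free to change its inputs mid-division.
        self._dividend = dividend & ((1 << self.width) - 1)
        self._divisor = divisor
        self.quotient = 0
        self.remainder = 0
        self._bit = self.width - 1
        self.busy = True

    def step(self):
        """Advance one clock cycle; returns True once the result is valid."""
        if not self.busy:
            return True
        # Shift the next dividend bit into the partial remainder.
        self.remainder = (self.remainder << 1) | ((self._dividend >> self._bit) & 1)
        self.quotient <<= 1
        if self.remainder >= self._divisor:
            self.remainder -= self._divisor
            self.quotient |= 1
        self._bit -= 1
        if self._bit < 0:
            self.busy = False
        return not self.busy


def divide(dividend, divisor, width=32):
    div = BitSerialDivider(width)
    div.start(dividend, divisor)
    while not div.step():
        pass
    return div.quotient, div.remainder
```

In hardware the bug arises when one expression reads the live input port instead of the registered copy; it stays hidden as long as the caller happens to hold its inputs stable for the whole operation.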

Anyway, I now have working fan tachometers (no PWM outputs yet, so they're always at max RPM), plus I can read the FPGA sensors using the XADC, and the I2C sensors scattered around the board.

The STM32 also has an on-die temp sensor which I'm not using yet, but I think that's the only missing bit.

azonenberg,
@azonenberg@ioc.exchange avatar

None of the Ethernet PHYs or power supply components have die temperature sensors on them to my knowledge. The SFP+ may have a sensor on its I2C bus, but I haven't brought that up yet (that will come much later).

Also tweaked a few timing settings on the quad SPI and I'm now getting reliable performance at 42.66 MHz (170.64 Mbps). That's as fast as I can go without either changing my FPGA-side QSPI IP to not require 4x oversampling, or moving it out of the RAM controller clock domain into something faster (which would then necessitate a lot more CDC blocks on the core fabric SFRs).
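The numbers above follow from simple arithmetic: quad SPI moves four bits per clock, and a receiver that oversamples 4x needs a sampling clock at least four times the bus clock. A sketch, with invented function names and a sampling clock value that is purely an assumption (the thread does not state the RAM controller clock frequency):

```python
def qspi_throughput_mbps(sck_mhz, lanes=4):
    """Raw throughput of a multi-lane SPI link, ignoring command overhead."""
    return sck_mhz * lanes


def max_sck_mhz(sample_clock_mhz, oversampling=4):
    """Highest bus clock a receiver with N-times oversampling can follow."""
    return sample_clock_mhz / oversampling


throughput = qspi_throughput_mbps(42.66)  # about 170.64 Mbps, as quoted
ceiling = max_sck_mhz(187.5)              # 187.5 MHz is a guess, gives 46.875 MHz
```

Under that assumed sampling clock, the next step up from 42.66 MHz would have to stay below roughly 47 MHz, which is consistent with the post's remark that going faster needs either less oversampling or a faster clock domain.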

While the sensors are brought up in that they work and I have functions that read them, there's no commands in the CLI to read them later on (yet). So for now all you can get is single-point measurements during boot.

azonenberg,
@azonenberg@ioc.exchange avatar

So now there's a few directions I can go for what to bring up next:

  • PWM outputs for the fans
  • Warm reboot request between main MCU and supervisor
  • RGMII management interface
  • SFP+ uplink
  • SGMII edge ports
  • QSGMII edge ports
  • QDR-II+ SRAM

I'm thinking the RAM might be good to do next since it's fairly self contained and easy to test in isolation.

j,
@j@j-w.au avatar

@azonenberg I can't get over how quickly you're powering through this work, how neat your work and work area are, and how you're also making the time to take us with you. It's inspirational!

In the meantime, I can't get a DPI panel to work with a Raspberry Pi. Our projects are BASICALLY equivalent, yes indeed!

azonenberg,
@azonenberg@ioc.exchange avatar

@j You're seeing the one bench I cleared off (with carefully chosen camera angles to not show the [redacted] from $DAYJOB's client on the adjacent bench).

I assure you there's other parts of the lab that are less pretty right now. With a toddler at home and work keeping me busier than usual, I haven't been keeping up with my usual weekly/monthly maintenance as much as I'd like. So I've been focusing on anything that impacts safety or gets in the way of the stuff I'm actively working on.

I have three GPUs sitting on another bench waiting for me to schedule a maintenance shutdown of the VM server to put them in for ngscopeclient CI testing. They've been sitting there since like April.

azonenberg,
@azonenberg@ioc.exchange avatar

@j If this makes you feel better about yourself... Here you go lol.

Messy lab bench

j,
@j@j-w.au avatar

@azonenberg I know that game: I have a 3yo boy and a 5yo girl... life changes a fair bit, hey? Still though, it's really inspiring that you're getting so much done — and I'm excited to try actually taping test leads to the bench... the thought somehow never occurred to me! (Plus, you know, the ambition of your project itself is incredible and I'm loving the ride)

azonenberg,
@azonenberg@ioc.exchange avatar

@j Nobody can ever accuse me of not dreaming big :)

This project has been in the making since 2012. Many of my other big projects (like ngscopeclient/libscopehal and the probes) originally started as me realizing I needed better tooling to develop/debug it.

The extreme number of networking and networking-adjacent protocol decodes in libscopehal (baseT autonegotiation, 10baseT, 100baseTX, 1000baseX, 10GbaseR, SGMII, QSGMII, RGMII, MDIO, and probably more that escape me off the top of my head) is not an accident.

azonenberg,
@azonenberg@ioc.exchange avatar

While waiting for a RAM test bitstream, wired up a test fixture for sniffing and verifying traffic on the SFP+.

It's just two back to back optics connected through 6 dB RF splitters with the other leg of each going to the scope.

azonenberg,
@azonenberg@ioc.exchange avatar

And it's a good thing I checked.

Apparently this wall port is spitting out 1000base-X traffic, not 10Gbase-R.

Time to go fix that before I think about bringing up the 10GbE on this board!

azonenberg,
@azonenberg@ioc.exchange avatar

Aha, that would do it. PP4/34 is connected via an obviously temporary patch cable to a 1000base-SX optic on one of my 1G switches. And there's a cable coming off my 10G core switch dangling right next to it.

I must have needed a 1000baseX test signal a while back and forgot to reconnect the cable.

Closeup of the temporary connection

azonenberg,
@azonenberg@ioc.exchange avatar

And getting nice looking 10Gbase-R idles coming off the switch now.

The line coming off the LATENTPINK board is flatlined, which is unsurprising as the FPGA design loaded on it doesn't yet bring up any of the transceivers.

azonenberg,
@azonenberg@ioc.exchange avatar

It seems all of my simulation testing paid off, possibly? My homebrewed QDR-II+ controller seems to have worked on the first attempt in real hardware!

It uses a fair bit of juice (unsurprisingly, given all of the SSTL signals). Power consumption jumped from 5.5W to 8.2W (2.7W delta) when I loaded the new bitstream, but everything is still happy (FPGA Tj is at 39.5C and seems to be stable).

This is running the RAM at 375 MHz (750 MT/s), comfortably less than the 450 MHz (900 MT/s) speed grade limit. But that's all I need to get 24 Gbps of throughput, which is the requirement for this board to saturate 14x 1 Gbps + 1x 10 Gbps links.
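As a back-of-the-envelope check on those numbers (a sketch only; the data bus width is not stated in this thread, so the code just solves for the minimum width per data port that would cover the requirement at a given clock):

```python
EDGE_PORTS = 14      # 1 Gbps edge ports
EDGE_GBPS = 1
UPLINK_GBPS = 10     # single 10 Gbps uplink


def link_budget_gbps():
    """Aggregate line rate the packet buffer has to keep up with."""
    return EDGE_PORTS * EDGE_GBPS + UPLINK_GBPS


def min_data_bits(required_gbps, clock_mhz):
    """Smallest data width (bits) that meets the rate at double data rate."""
    required_bps = required_gbps * 10**9
    transfers_per_s = clock_mhz * 10**6 * 2      # two transfers per clock
    return -(-required_bps // transfers_per_s)   # integer ceiling division


print(link_budget_gbps())
print(min_data_bits(link_budget_gbps(), 375))
```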

No MIG, no PHASERs, no weird MEMORY_QDR mode on the ISERDES to sample on CQ and CQ# rising edges.

Just using IDDR's clocked by a 90 degree PLL shifted version of CQ/CQ# fed to a single IBUFDS.

Next step will be to write a full BIST core so I can get more confidence than "I poked two addresses in the VIO and it seemed OK".

Vivado VIO screenshot showing successful write and readback of a memory address

azonenberg,
@azonenberg@ioc.exchange avatar

And the SRAM BIST passes! This is one of the parts of the design I was most worried about (it's not super fast, but it was a custom controller for a type of memory I had never used before) and it was one of the least painful to bring up.

azonenberg,
@azonenberg@ioc.exchange avatar

Started bringing up the SFP+ interface.

The MCU now correctly detects optic insertion/removal and toggles TX_DISABLE a short time after the optic is inserted.

So far RX_LOS is ignored and I don't do anything with the RS pins. The DOM logging is just a test, I won't actually dump all the sensors every time an optic is inserted long term. That will be under "show interface transceiver" or similar (along with lots more details).
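For reference, decoding the basic DOM readings looks something like the sketch below. It assumes an internally calibrated optic and the SFF-8472 layout of the diagnostics page (the one at I2C address A2h), with the real-time values in bytes 96 to 105. `parse_dom` is a hypothetical helper, not the firmware's actual function.

```python
import struct


def parse_dom(a2_page):
    """Decode real-time diagnostics from an SFF-8472 A2h page dump."""
    if len(a2_page) < 106:
        raise ValueError("need at least bytes 0..105 of the A2h page")
    # Big-endian: signed temperature, then four unsigned 16-bit fields
    temp, vcc, bias, tx_pwr, rx_pwr = struct.unpack_from(">hHHHH", a2_page, 96)
    return {
        "temperature_c": temp / 256.0,   # 1/256 degC per LSB
        "vcc_v": vcc * 100e-6,           # 100 uV per LSB
        "tx_bias_ma": bias * 2e-3,       # 2 uA per LSB
        "tx_power_mw": tx_pwr * 1e-4,    # 0.1 uW per LSB
        "rx_power_mw": rx_pwr * 1e-4,    # 0.1 uW per LSB
    }


# Synthetic page for illustration only (not data from a real optic)
page = bytearray(256)
struct.pack_into(">hHHHH", page, 96, 0x2200, 33000, 3000, 5000, 4000)
print(parse_dom(bytes(page)))
```

Externally calibrated modules need the slope/offset constants from elsewhere in the same page applied first, which this sketch skips.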

jpm,
@jpm@aus.social avatar

@azonenberg ooh neat, got the DOM data via I2C pretty easily by the looks of it?

azonenberg,
@azonenberg@ioc.exchange avatar

@jpm Yeah. There's a ton of EEPROM and monitoring fields I'm not logging yet, this is just a start to sanity check that I can talk to it and get plausible values.

My focus at this stage is board bringup not feature-complete firmware development. Once I verify a subsystem isn't obviously broken I move on to the next.

jpm,
@jpm@aus.social avatar

@azonenberg yep makes perfect sense, and very cool to see your firmware talking to a SFP+ module getting basic diagnostic data out of the thing

azonenberg,
@azonenberg@ioc.exchange avatar

But something is wrong, the transmit data seems very unstable and I'm not seeing anything that makes sense.

I think this might just be the optic sending noise with the FPGA either not transmitting at all, or transmitting gibberish. My logic analyzer in the FPGA fabric is failing to arm because Vivado isn't seeing a clock.

azonenberg,
@azonenberg@ioc.exchange avatar

Well that explains the implementation warning I was getting about an "invalid clock configuration" that I had been chasing for a while but never found the root cause of.

The transceiver quad PLL had a typo in one setting so it wasn't locking. That explains a lot.

Now linking up and seeing broadcasts on the sandbox network.

SFP+ link/activity LEDs on the board don't currently do anything, so that will probably be the next TODO item.

Note that the eye patterns in the screenshot are taken off the SFP+ mid-span tap, so while they can be used as a reasonable proxy for jitter in the actual waveform, they won't show small reflections or vertical eye closure present on the actual DUT. At some point I'll probably land probes on the actual differential pairs on the PCB, but for the moment it looks to be clean enough I doubt there's any problem there.

Vivado screenshot showing a frame coming off the MAC
ngscopeclient screenshot showing power usage dipping during loading of a new bitstream, then climbing to slightly above the previous value as the quad PLL starts
ngscopeclient screenshot showing good 10Gbase-R data coming off both sides of the fiber pair

azonenberg,
@azonenberg@ioc.exchange avatar

Ok so, the obvious next step is to tie up a few loose ends around the SFP+ uplink (make sure all the low speed control signals are tied off, maybe add some logic to check TX_FAULT, make the link up signal on the FPGA drive the link LED, and add a pulse stretcher for the activity LED).

After that, I think I'll work on bringing up the RGMII PHY on the management port and finish the remaining bits of glue for shoving Ethernet frames over the quad SPI bus so that the STM32 can actually be reached over the network via ping / SSH.

azonenberg,
@azonenberg@ioc.exchange avatar

Bringup is proceeding nicely. Got the SFP+ indicator LEDs working and cabled up the baseT management interface.

azonenberg,
@azonenberg@ioc.exchange avatar

RGMII management port came up on the first try with no fuss. Yet another painless bringup step.

I haven't had to bodge the board at all (although I did solder probes to the I2C pullups thanks to the lack of a designed-in test point on them) which is a slight surprise.

I either did a really good job designing and verifying this board, or there's a catastrophic error lurking right around the corner in one of the subsystems I haven't looked at yet. We'll find out soon!

Vivado ILA screenshot showing successful read of a PHY MDIO register
Vivado ILA screenshot showing an incoming Ethernet frame on the RGMII bus

azonenberg,
@azonenberg@ioc.exchange avatar

It's now pulling 8.8W and while temperatures are gradually increasing as I load down the board, they're all well within safe limits:

  • SGMII PHY area (both PHYs idle): 25C
  • RGMII PHY / 1.2V regulator area (linked up at 1 Gbps, no traffic): 28C
  • MCU / 3.3V regulator area: 30C
  • SFP+ optic (linked up at 10 Gbps, no traffic): 34C
  • QDR-II+ SRAM area: 34C
  • FPGA die: 42C

I kinda expected the RGMII PHY to run hotter but right now the fan cooling the FPGA is blowing over it first, so I guess that's helping.

The heatsink on the FPGA seems to be doing its job so far. This is my first board that I designed a thermal solution into (vs running cool, or having one bodged on ex post facto), so good to see it's at least somewhat functional.

azonenberg,
@azonenberg@ioc.exchange avatar

That's it for tonight since I have to be up for work in the morning.

Next step is going to be building out more firmware and gateware around the management interface:

  • Make the MDIO bus accessible over QSPI from the MCU, rather than just a JTAG debug core
  • Finish the FIFO logic and interface code on both MCU and FPGA side, so I can send and receive Ethernet frames from the MCU
  • Verify SSH over real Ethernet

azonenberg,
@azonenberg@ioc.exchange avatar

Decided to try bringing up the SGMII ports first.

Hmm, I wonder why port g12 wasn't responding? Easy fix with 30 seconds of hot air. Another failed solder joint from that same reel of 2013 era 33Ω resistors. Might be time to retire the reel?

Now it links up fine at gigabit speed, and the SGMII link is up.

But it's not responding over MDIO which is a bit of a head scratcher. I tried bruteforcing the entire 5 bit PHY address space (in case there was a problem with address straps) and got no response at any address.
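For context, the address sweep amounts to something like this sketch. It models the clause 22 read header the station drives and a scan of all 32 addresses. `read_reg` is a hypothetical callable standing in for whatever actually drives the bus, and the PHY ID values in the fake bus are placeholders, not real device IDs.

```python
def clause22_read_header(phy_addr, reg_addr):
    """Bits driven by the station for a clause 22 read, MSB first."""
    bits = [1] * 32                                         # preamble
    bits += [0, 1]                                          # ST: start of frame
    bits += [1, 0]                                          # OP: read
    bits += [(phy_addr >> i) & 1 for i in range(4, -1, -1)]  # PHYAD
    bits += [(reg_addr >> i) & 1 for i in range(4, -1, -1)]  # REGAD
    return bits


def scan_bus(read_reg):
    """Probe every 5-bit PHY address; return {addr: (id1, id2)} for responders."""
    found = {}
    for phy in range(32):
        id1 = read_reg(phy, 2)    # PHY identifier register 1
        id2 = read_reg(phy, 3)    # PHY identifier register 2
        # With only the pullup driving MDIO, every data bit reads as 1
        if (id1, id2) != (0xFFFF, 0xFFFF):
            found[phy] = (id1, id2)
    return found


# Fake bus for illustration: one PHY at address 5 with placeholder IDs
def fake_read(phy, reg):
    ids = {2: 0x1234, 3: 0x5678}
    return ids.get(reg, 0) if phy == 5 else 0xFFFF


print(scan_bus(fake_read))
```

An empty result from a sweep like this is what "no response at any address" means here: all 32 reads came back as all ones.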

azonenberg, (edited )
@azonenberg@ioc.exchange avatar

I'm not sure what's going on here. The PHY is obviously right way round on the PCB, getting power and a clock, and not in reset or power down if all the other functions work.

There's no issue with the FPGA soldering, PCB traces, or RTL; I threw probes at the PHY pins and saw well formed MDIO traffic.

Failed soldering on only MDIO and MDC, of both PHYs, seems unlikely.

MDC has a good clock so it's fine; MDIO is clearly not shorted/open since it's got well formed headers. It's got a pullup and is sitting at VCCIO during the idle period (when a PHY should respond).

MDC frequency is 2.5 MHz (same as I use for the KSZ9031, but a different bus) which is well below the 25 MHz Fmax for the DP83867.

azonenberg,
@azonenberg@ioc.exchange avatar

Tweaked drive strength on MDIO and MDC in case I was overdriving and causing ringing or something (hard to see with this suboptimal probing setup) but no luck.

azonenberg,
@azonenberg@ioc.exchange avatar

Touched up every pin on the PHY, one by one with microscope inspection, in case I had a bad solder joint.

No change in behavior.

azonenberg,
@azonenberg@ioc.exchange avatar

Still confuzzled. Tried a few more things (hooking up the INT/PWRDN pin to the FPGA with an on die pullup in case having the pin unused in the bitstream did something weird), verified relative timing of MDC and MDIO were sane.

I can't understand how the PHY can be happy enough that it links up 1000baseT to my laptop, has an estimated e-12 BER for the SGMII link to the FPGA (based on 8b10b error and total symbols performance counters), and yet is unresponsive over MDIO.

The only explanation I could think of was a soldering problem that happened to affect those two pins but I specifically resoldered them.

And that wouldn't explain why the second PHY is equally unresponsive.

whitequark,
@whitequark@mastodon.social avatar

@azonenberg wrong straps, maybe?

azonenberg,
@azonenberg@ioc.exchange avatar

@whitequark All of the other straps are doing what they should (it's coming up in SGMII mode vs RGMII, etc).

And a bad strap would still make it show up at some MDIO address. I bruteforced the entire address space (only 5 bits) and nothing is talking to me.

whitequark,
@whitequark@mastodon.social avatar

@azonenberg this is a silly thing but have you tried doing 802.3ah accesses?

azonenberg,
@azonenberg@ioc.exchange avatar

@whitequark Is that clause 45 mode? I haven't tried, my current MDIO controller doesn't implement it - only 22.

whitequark,
@whitequark@mastodon.social avatar

@azonenberg I'm asking because of this odd clause in the datasheet

whitequark,
@whitequark@mastodon.social avatar

@azonenberg it supports both clause 22 and clause 45 which is why this is silly, but just in case?

azonenberg,
@azonenberg@ioc.exchange avatar

@whitequark Yeah I can try if fixing the GPIO_0 strap doesn't work I guess.

whitequark,
@whitequark@mastodon.social avatar

@azonenberg oh I deciphered what's on the screenshot

this basically means "it always needs a 32-bit quiet time after reset"

azonenberg,
@azonenberg@ioc.exchange avatar

@whitequark Yeah that's how I interpreted it.

And it's OK if MDC is toggling as long as MDIO is high during that time.

azonenberg,
@azonenberg@ioc.exchange avatar

Found and fixed a power rail sequencing issue (the DP83867 wants its 2.5V analog rail stable prior to the 1.8V analog rail, and I was ramping in reverse order). There is (explicitly stated, not assumed) no sequencing requirement for these rails vs the 1.0V digital core and the VCCIO rail.

No change in behavior with that fixed.

azonenberg,
@azonenberg@ioc.exchange avatar

https://e2e.ti.com/support/interface-group/interface/f/interface-forum/762033/dp83867e-after-resetn-is-cleared-mdio-bit-goes-low-and-stays-low

Innnnteresting. Apparently GPIO_0 is a strap?? Let me try pulling that low and see what happens.

azonenberg,
@azonenberg@ioc.exchange avatar

Tried a new bitstream with explicit pulldowns on GPIO_0 and GPIO_1.

Reading a bit more, it seems that if GPIO_0 is strapped wrong it will pull MDIO to VCC/2. Which is not what I'm seeing here.

Instead, I'm seeing MDIO tristated and floating high (as if the PHY isn't even attempting to talk to me).

azonenberg,
@azonenberg@ioc.exchange avatar

Bootup delay from RST# (blue) going high to MDC (yellow) beginning to toggle. 867 us, datasheet only requires 195 us.

And plenty of toggles on MDC before activity on MDIO (green).

Note that the actual PHY I/O signals are LVCMOS18; I'm probing MDIO at the PHY pins but the FPGA mirrors MDC and RST# to a 3.3V GPIO connector since it's tricky to get too many probes on a little QFN.

I did probe separately to confirm that MDC is reaching the actual PHY pins, and since it's linking up RST# is obviously clearing OK.

azonenberg,
@azonenberg@ioc.exchange avatar

Hmmmm, interesting. The VSC8512 isn't responding over MDIO either.

I wonder if it's something about my FPGA-side MDIO controller (weird timing or something) and the KSZ9031 is more forgiving? It's the only PHY I recall having used with it in the past.

Anybody have ideas?

ngscopeclient screenshot showing relative timing of MDIO and MDC
ngscopeclient screenshot showing relative timing of MDIO and MDC

azonenberg,
@azonenberg@ioc.exchange avatar

Every time I look there's more spaghetti.

azonenberg,
@azonenberg@ioc.exchange avatar

OK, scratch that theory. I just found an old ngscopeclient dataset from my previous experiments with the DP83867.

I had MDIO working successfully using this same controller on it. In that particular case I was running at LVCMOS25 levels rather than the LVCMOS18 I'm using here, but it was the same FPGA IP.

So clearly the DP83867 is able to work with my controller. Which makes me lean back towards a hardware issue again.

But I'm still at a loss as to what could make MDIO fail but literally everything else work.

azonenberg,
@azonenberg@ioc.exchange avatar

Actually no, the PHY was running at LVCMOS18 levels. The FPGA was using LVCMOS25 and I had a level shifter.

So almost every single thing was the same on that board vs here. What changed??

cibyr,

@azonenberg double-check the pinout?

azonenberg,
@azonenberg@ioc.exchange avatar

Awake for the day and troubleshooting more.

The remainder of the PHY is working fine, for sure. The fabric CDR block is locked and I'm getting valid 8b10b symbols.

After letting it run overnight, I have 5.1e12 symbols received without error on g13 (baseT link down) and 5.12e12 symbols with 71 errors on g12 (baseT link up). This is unsurprising as with the link up there's more power consumption, noise, variability to the data, etc.

But this gives a real world symbol error rate of roughly 1.4e-11 (71 errors in 5.12e12 symbols). Given that symbols are 10 bits long, we extrapolate a BER of 1.4e-12. This is a slight underestimate since it doesn't catch errors that turn one valid 8b10b symbol into another valid one with correct disparity, but it's good enough as an OOM level approximation, and e-12 BER sounds plausible for a fairly short link on the same PCB.

azonenberg,
@azonenberg@ioc.exchange avatar

Resetting the PHY and probing strap pins one at a time to verify actual voltage during reset:

RX_CTRL = 441 mV = 0.245x VCCIO, comfortably in the middle of Mode 3 (autonegotiation enabled).

GPIO_0 = 10 mV = 0.005x VCCIO, very much mode 0 (RX0 clock skew = 0)

GPIO_1 = 10 mV, RX2/RX1 clock skew = 0

1V8_IO = 1.784V, all good

1V0 = 1.002V, all good

RST# is a nice clean rising edge from 0 to 1.78V, looks good there.

PWRDN# / INT# is 1.77V, no concerns there. (Even if the PHY was in power-down mode I'm pretty sure the MDIO interface would be up)

LED2 is at 4 mV, mode 1, RGMII TX1/TX0 clock skew = 0

LED1 is at 14 mV, mode 1, ANEG_SEL=0 (advertise all modes including 10baseT), TX2 clock skew = 0

LED0 is at 308 mV, 0.17x VDD, mode 2. Mirror disabled, SGMII enabled.

This is correct config for g12 which is what I'm probing; g13 is wired identically but should have mirror mode enabled (but I can also configure this via MDIO so not a big deal).

A1V8 is 1.79V, happy.

azonenberg,
@azonenberg@ioc.exchange avatar

So that's the entire south side of the PHY verified correct levels.

Now let's check the west side where MDIO addressing is configured.

RX_D2/SGMII_RX_P = 561 mV = 0.31x Vdd. That's wrong, it's between the mode 3 and mode 4 strap ranges.

I have no strap resistors on this pin and it's AC coupled to the FPGA (so any biasing coming from the FPGA shouldn't affect it, I'm probing at the PHY side of the coupling cap).

Per datasheet it's supposed to have a 9 kΩ pulldown in strap mode, and be max 0.098x VDDIO if left floating (strap config for mode 1, which is what I want).

azonenberg,
@azonenberg@ioc.exchange avatar

Scoping with a longer time scale: it looks like after reset is asserted the pulldown starts discharging the coupling capacitor, but it takes a long time to do so.

So I either need to add an external pulldown (not relying on "open circuit = mode 0") or just assert reset longer so the on-die pulldown has time to do its thing. The latter seems to be easy enough, let's try that...
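A rough way to size the reset pulse is to treat the pin as a single R-C discharging through the on-die pulldown. This is a sketch under loud assumptions: the 100 nF coupling cap value is hypothetical (the thread doesn't give it), and anything on the FPGA side of the cap is ignored.

```python
import math

R_PULLDOWN_OHM = 9e3    # strap-mode pulldown quoted from the datasheet
C_COUPLING_F = 100e-9   # ASSUMED coupling cap value, not from the thread


def discharge_time_s(v_start, v_target, r_ohm=R_PULLDOWN_OHM,
                     c_farad=C_COUPLING_F):
    """Time for an R-C node to decay from v_start down to v_target."""
    if not 0 < v_target < v_start:
        raise ValueError("need 0 < v_target < v_start")
    return r_ohm * c_farad * math.log(v_start / v_target)


# Extra reset time to get from the measured 561 mV down to the
# 0.098 x VDDIO limit, taking VDDIO as the measured 1.784 V
extra = discharge_time_s(0.561, 0.098 * 1.784)
print(f"{extra * 1e3:.2f} ms beyond the original reset pulse")
```

The useful part is the shape of the answer: the required time scales with R times C and only logarithmically with the voltage ratio, so a few time constants of extra reset are enough.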

azonenberg,
@azonenberg@ioc.exchange avatar

Well that took a lot longer to find than I expected!

With a 4x increased reset pulse duration, the AC coupling caps on the SGMII have enough time to fully discharge and we get correct strap values on all of the SGMII pins.

I was barking up the wrong tree for a while assuming that incorrect MDIO address straps would lead to the device coming up at a different, unintended address, but it would always respond somewhere. Since it didn't show up at any address I assumed the problem was elsewhere.

Turns out the ranges don't overlap (in particular mode 3 is up to 0.284x VDDIO and mode 4 is above 0.694x) and I guess if you're in that middle ground it won't work at all, vs coming up in one mode or the other.

Now it comes up and is detected with a valid PHY ID so I can continue the full bringup cycle.

The VSC8512 isn't responding over MDIO either but I'll address that issue separately once I get to it. Probably totally unrelated problem.

ngscopeclient screenshot showing R-C falloff of strap / RXD pin (green) after reset (blue) is asserted
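The dead zone between strap ranges can be sketched as a simple check. Only the three thresholds quoted in this thread are used, so the result is deliberately coarse; the boundaries between modes 1, 2 and 3 would need to come from the datasheet, and `strap_region` is a made-up helper name.

```python
MODE1_MAX = 0.098   # floating pin must be at or below this ratio
MODE3_MAX = 0.284   # top of the mode 3 range
MODE4_MIN = 0.694   # bottom of the mode 4 range


def strap_region(pin_v, vddio_v):
    """Coarsely classify a strap voltage as a fraction of VDDIO."""
    ratio = pin_v / vddio_v
    if ratio <= MODE1_MAX:
        return "mode 1"
    if ratio <= MODE3_MAX:
        return "mode 1-3 region (exact boundaries not quoted here)"
    if ratio < MODE4_MIN:
        return "undefined (between mode 3 and mode 4)"
    return "mode 4 region"


print(strap_region(0.561, 1.784))   # the SGMII pin before the fix
print(strap_region(0.441, 1.784))   # RX_CTRL
print(strap_region(0.010, 1.784))   # GPIO_0
```

A check like this flags the 561 mV reading as invalid rather than rounding it to the nearest mode, which matches how the PHY actually behaved.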

azonenberg,
@azonenberg@ioc.exchange avatar

Anyway, I guess I'll go back to the original plan of getting SSH up and building out more CLI commands (things like printing out low level phy debug info).

That will probably take the rest of the evening since I have some errands and family weekend stuff to do too.

azonenberg,
@azonenberg@ioc.exchange avatar

Incoming Ethernet frames are now buffered in the FPGA and read out by the MCU for processing. I might tune the buffer size, it's pretty small for now, but it's usable.

Here's some broadcast traffic on my sandbox network. So far it's just being printed to the UART and not actually being processed by the IP stack, that's the next step.

Then on to transmits.

azonenberg,
@azonenberg@ioc.exchange avatar

Hooked up the IP stack and added some logging hooks to indicate when it tries to transmit.

For now, all outbound frames are dropped because there's no code on the FPGA to actually operate the transmit path (and I haven't even defined the registers for the MCU to send a frame yet).

So that's next.

azonenberg,
@azonenberg@ioc.exchange avatar

And it's now pingable!

When I try to SSH to it, I hang after sending the SSH2_MSG_KEXINIT message. Unsure if this is an IP stack issue, a crypto driver issue on the STM32, or something else. Will troubleshoot tomorrow.

There seem to be some bugs where it'll enter a bad state and stop responding to pings as well.

azonenberg,
@azonenberg@ioc.exchange avatar

The high latency, if you're curious, is mostly because of some blocking UART debug prints in the packet processing path. It doesn't take a whole lot of text at 115.2 Kbps to add 10ms of latency.

So that will speed up a lot in the future.
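The arithmetic, as a quick sketch (assuming 8N1 framing, so 10 bit times per character; the thread only gives the baud rate):

```python
def uart_tx_time_s(n_chars, baud=115200, bits_per_char=10):
    """Time a blocking UART write spends on the wire."""
    return n_chars * bits_per_char / baud


def chars_for_delay(delay_s, baud=115200, bits_per_char=10):
    """How many whole characters fit into a given delay."""
    return int(delay_s * baud / bits_per_char)


print(chars_for_delay(0.010))            # characters in 10 ms
print(uart_tx_time_s(80) * 1e3, "ms")    # one 80 column log line
```

So a couple of log lines per packet is already enough to dominate the round trip time.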

ignaloidas,
@ignaloidas@not.acu.lt avatar

@azonenberg no non-blocking debug? setting up a buffer and a DMA transfer usually isn't that much work

azonenberg,
@azonenberg@ioc.exchange avatar

@ignaloidas Been on my todo for a while but haven't got to it.

I have an interrupt driven RX FIFO so I don't drop bytes if I'm busy doing something; a TX version would be simple enough to build.

azonenberg,
@azonenberg@ioc.exchange avatar

Found one of the problems: I cleared the FPGA IRQ line latch after reading the interrupt status register. But I only read a single Ethernet frame per IRQ assertion.

So if two frames show up before I've read the first one, the second one won't get read until a third one shows up, etc.

Eventually enough frames will get forgotten that the buffer fills up and all traffic stops flowing.

Got a trivial fix (don't latch IRQ, it's asserted nonstop if there's data in the buffer) and am building a new bitstream with it.

But soooomeone wanted to go to the park so it's gonna be a while before I'll know if the fix worked...
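Here's a toy model of the failure mode, comparing a latched IRQ against a level-sensitive one. It is a sketch, not the real gateware or firmware: time is in abstract steps, the MCU services at most one IRQ per step and reads one frame each time, and the FIFO depth of 4 is arbitrary.

```python
def simulate(arrivals, level_sensitive, fifo_depth=4):
    """Return (frames_read, frames_left_in_fifo, frames_dropped)."""
    fifo = 0          # frames currently buffered
    latch = False     # latched IRQ: set on arrival, cleared on service
    read = dropped = 0
    for n in arrivals:                 # n frames arrive during this step
        for _ in range(n):
            if fifo < fifo_depth:
                fifo += 1
                latch = True
            else:
                dropped += 1
        irq = (fifo > 0) if level_sensitive else latch
        if irq:                        # MCU services the IRQ once
            latch = False
            fifo -= 1                  # and reads a single frame
            read += 1
    return read, fifo, dropped


bursts = [2, 0] * 8                    # two frames back to back, then a gap
print("latched:", simulate(bursts, level_sensitive=False))
print("level:  ", simulate(bursts, level_sensitive=True))
```

With the latch, every back-to-back pair strands one frame until the buffer is full and frames start getting dropped. With the level-sensitive line the MCU keeps getting interrupted until the buffer is empty.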

azonenberg,
@azonenberg@ioc.exchange avatar

So at this point I've done at least preliminary "hardware isn't catastrophically busticated" bringup on just about everything other than the QSGMII PHY, which was unresponsive on MDIO during a 30 second first attempt but I didn't do any more extensive debug.

Power: voltages all in range, but no ripple measurements yet

Supervisor: working fine, firmware basically done except for watchdog/warm reboot functionality

Main MCU: no issues

Thermal: everything working fine with fans at max RPM. Haven't tried to PWM them yet but they're so quiet at max RPM I may not even bother

10G SFP+: links up and can read EEPROM and DOM

RGMII PHY: fully functional and passing traffic

SGMII PHYs: SGMII links up with e-11 BER and seems to work fine. MDIO alive. BaseT side seems fine on g12, but g13 only links up at 100mbit speeds. Suspecting solder issue but have not investigated yet.

QDR-II+ SRAM: fully functional

azonenberg,
@azonenberg@ioc.exchange avatar

Took out the debug prints and fixed the IRQ issue, as well as a hardware crypto engine issue.

Now I can ping it with 250us RTT (through a router and two or three switch hops).

And I get further when trying to SSH to it. Now the device kicks me off after the SSH2_MSG_NEWKEYS message. Wonder why?

azonenberg,
@azonenberg@ioc.exchange avatar

Hmmm. I'm seeing AES-GCM decryption fail, but when I dump out the calculated keys I get the same keys that OpenSSH is trying to use clientside.

Which makes me think it's some subtle difference in behavior between the STM32F7 crypto engine (which I originally wrote this code for) and the STM32H7 crypto engine (which I'm now using). This is going to be fuuuun.

azonenberg,
@azonenberg@ioc.exchange avatar

Looking at the datasheets side by side it seems key endianness is different. But that wasn't enough to make it work so there's something else going on too.

brouhaha,
@brouhaha@mastodon.social avatar

@azonenberg For added security, the byte ordering is randomized for every key exchange.

azonenberg,
@azonenberg@ioc.exchange avatar

Hexdumping more stuff it seems I'm getting valid decryption of the message now (a SSH_MSG_SERVICE_REQUEST with service of "ssh-userauth").

But the GCM tag is way off.

This is the frustrating thing about debugging crypto. If you get any of the magic incantations even slightly wrong, you get random gibberish with no clue about what went wrong.

azonenberg,
@azonenberg@ioc.exchange avatar

Progress: it seems I had to endian-swap the length fields in the last GCM block vs the STM32F7. Now it's getting all the way up to the point of the client seeing a SSH_MSG_CHANNEL_SUCCESS that I sent after successful password authentication, but the contents of the packet seem garbled so it aborts.
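For anyone unfamiliar with that last block: in standard GCM the final GHASH input is the AAD length followed by the ciphertext length, each as a 64-bit big-endian bit count. The sketch below builds that block and shows what a per-word byte swap does to it. Exactly which ordering each STM32 crypto peripheral expects is not spelled out in this thread, so `swap32` is only a hypothetical illustration of the kind of change involved.

```python
import struct


def gcm_length_block(aad_len_bytes, ct_len_bytes):
    """Final GHASH block: len(AAD) || len(ciphertext), in bits, big-endian."""
    return struct.pack(">QQ", aad_len_bytes * 8, ct_len_bytes * 8)


def swap32(block):
    """Byte-swap each 32-bit word (hypothetical peripheral quirk)."""
    return b"".join(block[i:i + 4][::-1] for i in range(0, len(block), 4))


# Example sizes only: 4 bytes of AAD and 32 bytes of ciphertext
blk = gcm_length_block(4, 32)
print(blk.hex())
print(swap32(blk).hex())
```

Get this block wrong and the plaintext still decrypts fine while the tag comes out as garbage, which matches the symptom of valid decryption with a bad GCM tag.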

azonenberg,
@azonenberg@ioc.exchange avatar

It lives!

Somehow I was sending replies to SSH_MSG_CHANNEL_REQUEST packets by writing into the incoming packet buffer, not the reply.

And by dumb luck I guess whatever uninitialized garbage was in the reply buffer happened to resemble a valid SSH_MSG_CHANNEL_SUCCESS message before, but not now? Lol.

Anyway, this is a great success. Kid is up from her nap so that's it for a while, tonight after bedtime I'll hook up the Curve25519 accelerator on the FPGA to speed session creation a bit, then work on a bunch of CLI commands to dump PHY information and such.

azonenberg,
@azonenberg@ioc.exchange avatar

Bumped the optimization level on my firmware up from -O0 to -O2 because creating a SSH session was too slow.

But the FPGA curve25519 accelerator is still over 48x faster than the software implementation. Pretty happy with that :)

azonenberg,
@azonenberg@ioc.exchange avatar

Ephemeral ECDH key generation and shared secret calculation now use the FPGA accelerator and SSH session creation now feels about as fast as it does when logging into a regular PC.

I can probably extend the same accelerator block (with some minimal tweaks) to also support the public key side of signing, but for now crypto_sign() is still being done entirely in software and only the two crypto_scalarmult() calls in the SSH session creation are accelerated.

Still a massive improvement in responsiveness, it cut about 400ms of latency off session creation.

azonenberg,
@azonenberg@ioc.exchange avatar

Added some code to poll PHY MDIO register state (not using irq pins yet) and the SGMII PHYs seem less happy. One refuses to report link up over MDIO (even if the LEDs are on and the link partner says it's up), the other reports link up but flaps.

Just when I thought that was working...

azonenberg,
@azonenberg@ioc.exchange avatar

So I guess the first question is if I'm actually addressing the correct PHY and if it's in fact reporting link flaps. And how the MDIO link state compares to the SGMII autonegotiation state.

azonenberg,
@azonenberg@ioc.exchange avatar

After some soldering, i think I'm ready to start debugging!


azonenberg,
@azonenberg@ioc.exchange avatar

So, here's the basic setup.

Blue and green wires go to the MDIO bus, which is slow enough (2.5 Mbps with very low drive strength) that I'm not worried about reflections off a few inches of flying wire. Standard 10x passive probes clip to the other end of each.

The two black probes are Teledyne LeCroy QuickLink solder in probe tips. One is going to a D420-A and the other to a D1330; both have way more bandwidth than I need to see SGMII clearly.

I'd use my own AKL-PT5 probes for this measurement (well within their capabilities) except that I'd need to AC couple the measurement and somehow I only have one SMA DC block on the shelf right now. That will be rectified by the end of the week.

azonenberg,
@azonenberg@ioc.exchange avatar

Initial observations: The SGMII RX waveform looks decent enough and passes the eye mask required for the FPGA to decode it. I've seen better, but I'm in no hurry to rework the board because of this.

Swing and drive strength on the TX seem a bit excessive, I should probably turn it down. The eye is wide open but the PHY could probably hear the FPGA from the next room over!

Valid MDIO traffic is present. This particular waveform has two packets at the start, then four, then two more.

The MCU reads four registers per polling cycle: basic control and status of PHY 0, then of PHY 1. After polling each PHY, it checks for a link state change and logs a message to the UART.

The long delay after the last packet suggests this is being detected as a link state change.

ngscopeclient screenshot showing bursts of activity on the MDIO bus and SGMII eye patterns

azonenberg,
@azonenberg@ioc.exchange avatar

The SGMII link appears to be up the whole time, and at no point does it fall back into the negotiation state. So the link is probably not actually flapping; this smells like a bug on the microcontroller side.

We expect g13 (phy address 1) to be down, and g12 (phy address 0) to be up at 1 Gbps.

Looking at the actual MDIO bus traffic in this capture, we have:

  • PHY 1 ctl: 1G/full
  • PHY 1 stat: Down
  • PHY 0 ctl: 1G/full
  • PHY 0 stat: Up
  • PHY 1 ctl: 1G/full
  • PHY 1 stat: Down
  • PHY 0 ctl: 1G/full
  • PHY 0 stat: Up

Nothing seems obviously wrong here.

ngscopeclient protocol decode dump showing MDIO transactions (see main toot for discussion)

azonenberg,
@azonenberg@ioc.exchange avatar

OK, this is starting to smell like an FPGA issue.

We know the actual MDIO traffic on the wire is fine, but sometimes we're reading 0x7949 for the basic control register on port 12.

Interestingly, this is the same value we just read from the basic status register on port 13.

And then at 22.420, we read the basic control register for port 13 as 0x116d. 0x6d is the value we just read from the basic status register on port 12.

So I think there's some kind of bug in the FPGA MDIO-to-QSPI bridge where sometimes it will return a previous value instead of what was actually read.
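To make the mixed-up reads above easier to spot, here is a minimal Python sketch that decodes the two IEEE 802.3 Clause 22 registers involved (basic control is register 0, basic status is register 1). The bit positions come from the standard; the function names are invented for illustration and are not taken from the switch firmware. It also flags nonzero values in bits 4:0 of the basic control register, which are reserved, as a hint that a status value may have leaked into a control read.

```python
# Clause 22 basic control register (register 0) bits
BMCR_SPEED_LSB = 1 << 13
BMCR_AN_ENABLE = 1 << 12
BMCR_FULL_DUPLEX = 1 << 8
BMCR_SPEED_MSB = 1 << 6
BMCR_RESERVED = 0x001F  # bits 4:0 are reserved

# Clause 22 basic status register (register 1) bits
BMSR_AN_COMPLETE = 1 << 5
BMSR_LINK_UP = 1 << 2


def decode_bmcr(value):
    """Decode a basic control register value into a small dict."""
    speed_bits = (bool(value & BMCR_SPEED_MSB), bool(value & BMCR_SPEED_LSB))
    speeds = {(False, False): "10M", (False, True): "100M", (True, False): "1G"}
    return {
        "speed": speeds.get(speed_bits, "reserved"),
        "duplex": "full" if value & BMCR_FULL_DUPLEX else "half",
        "autoneg": bool(value & BMCR_AN_ENABLE),
        # Reserved bits set: this probably is not a real control register value
        "suspicious": bool(value & BMCR_RESERVED),
    }


def decode_bmsr(value):
    """Decode the link-related bits of a basic status register value."""
    return {
        "link_up": bool(value & BMSR_LINK_UP),
        "an_complete": bool(value & BMSR_AN_COMPLETE),
    }


if __name__ == "__main__":
    print(decode_bmsr(0x7949))  # status value seen on port 13
    print(decode_bmcr(0x116D))  # "control" value with a stale low byte
    print(decode_bmcr(0x7949))  # status value returned for a control read
```

Both bogus control reads from the log trip the reserved-bit check, which is consistent with the theory that the bridge is handing back data from a previous transaction.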

azonenberg,
@azonenberg@ioc.exchange avatar

New FPGA bitstream with some additional debug logic, as well as changing the FPGA output buffer from DIFF_HSTL_I_DCI_18 to LVDS.

TX data eye measured at the PHY side is still reasonably open, but way lower amplitude than before. I'll double check the spec later but this should be plenty open enough.

azonenberg,
@azonenberg@ioc.exchange avatar

And here's the bug, caught red handed.

We start with the MDIO transceiver being busy with a read of address 0x00. The read data register is still 0x7949, the previous value, because the read is still in progress.

At T=7862 the MCU begins a 4-word burst read of REG_DP_MDIO (0x004c). This is a 32-bit little endian register with the read value in the low 16 bits, a bunch of write-only configuration, and a busy flag in the MSB.

By T=7887 when we read SPI address 0x004f (where the busy flag is) the read has just finished.

So the MCU thinks it's successfully read the whole register.

The fix is pretty simple: latch the busy flag when address 0x4c is read (the entire 32-bit register has to be read in one go, byte access is not supported). The MCU will then read {busy, 0x7949} just like it did on the previous poll, then read the correct value on the subsequent polling cycle.
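The race and the fix can be modeled in a few lines of Python. This is only a behavioral sketch under stated assumptions: the class and function names are hypothetical, the register is reduced to a 16-bit read value plus the busy flag in bit 31, and the new register value 0x1140 is a made-up placeholder. The real bridge is FPGA logic, not Python.

```python
BUSY_BIT = 1 << 31  # busy flag lives in the MSB of the 32-bit register


class MdioBridgeModel:
    """Toy model of the REG_DP_MDIO register as seen over SPI."""

    def __init__(self, latch_busy, stale_data=0x7949):
        self.latch_busy = latch_busy   # True models the fixed bitstream
        self.read_data = stale_data    # value left over from the last read
        self.busy = True               # an MDIO read is in progress
        self._latched_busy = True

    def finish_mdio_read(self, value):
        """The MDIO transaction completes and updates the register."""
        self.read_data = value
        self.busy = False

    def read_byte(self, offset):
        """Return byte `offset` (0..3) of the little endian register."""
        if offset == 0:
            # Snapshot the busy flag at the start of the burst
            self._latched_busy = self.busy
        busy = self._latched_busy if self.latch_busy else self.busy
        word = (BUSY_BIT if busy else 0) | self.read_data
        return (word >> (8 * offset)) & 0xFF


def burst_read(bridge, finish_before_offset=None, new_value=0x1140):
    """4-byte burst read; the MDIO read may finish partway through."""
    word = 0
    for offset in range(4):
        if offset == finish_before_offset:
            bridge.finish_mdio_read(new_value)
        word |= bridge.read_byte(offset) << (8 * offset)
    return word


if __name__ == "__main__":
    buggy = MdioBridgeModel(latch_busy=False)
    print(hex(burst_read(buggy, finish_before_offset=3)))

    fixed = MdioBridgeModel(latch_busy=True)
    print(hex(burst_read(fixed, finish_before_offset=3)))
    print(hex(burst_read(fixed)))
```

In the unlatched model the MCU sees busy clear alongside the stale 0x7949. With the latch it sees busy set with 0x7949, retries, and gets the new value on the next poll.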

azonenberg,
@azonenberg@ioc.exchange avatar

Yay, no more flapping!

Tomorrow's problem: while g12 links up fine at gigabit speed, last time I tried g13 would struggle a bit then come up at 100 Mbps (verified by link partner).

That's probably a hardware issue of some sort since g12 and g13 are supposed to be identically configured. The RJ45 pinout is mirrored because of the tab-up vs tab-down jacks, but that should (famous last words) be fine because the DP83867 has a register to enable ABCD -> DCBA mirroring, which I think I've set correctly.

I made a quick pass over the schematic and nothing seemed otherwise different, it was largely copy-pasted other than the PHYADDR strap pins.

Serial console screenshot showing g12 and mgmt0 linking up

azonenberg,
@azonenberg@ioc.exchange avatar

Yet another command that I wish "real" switches had.

There will of course be fancy commands that include nice detailed decodes of port state. But sometimes there's no substitute for getting close to the metal.

leifdavisson,
@leifdavisson@ioc.exchange avatar

deleted_by_author

  • azonenberg,
    @azonenberg@ioc.exchange avatar

    @leifdavisson My own open hardware design. See the rest of the thread for updates on board bringup and initial firmware development.

    https://github.com/azonenberg/latentpacket/tree/master/latentpink

    leifdavisson,
    @leifdavisson@ioc.exchange avatar

    deleted_by_author

  • azonenberg,
    @azonenberg@ioc.exchange avatar

    @leifdavisson I don't think you can strip FreeBSD down enough to run on this.

    The main processor is a Cortex-M7 with no MMU, half a megabyte of SRAM, and a megabyte of flash. It runs a bare metal SSH server and serial console that pokes registers on the FPGA over SPI.

    No OS, just a trivial event loop and a couple of interrupt handlers. The intent is that with a few hours of fuzzing you can gain very high confidence that the platform is pretty safe against network attacks coming from the management interface.

    The fabric side has no way to even talk to the CPU (management is a physically separate RJ45 on the back of the unit) so there's ~zero attack surface coming from the front panel or uplink ports.

    The actual switch fabric is a fixed function (old school ASIC style datapath design) FPGA based shared bus architecture. Incoming frames are pushed into a FIFO in QDR-II+ SRAM, looked up against the MAC table, then forwarded out the correct destination port(s).

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @leifdavisson To give you an idea of how lightweight things are, the current firmware including the necessary libc functions, hardware drivers, TCP/IP, SSH, and the management interface compiles to about 70 kB with gcc -O2.

    That's the whole firmware, no other external dependencies.

    leifdavisson,
    @leifdavisson@ioc.exchange avatar

    deleted_by_author

  • azonenberg,
    @azonenberg@ioc.exchange avatar

    @leifdavisson Yep. The whole point of this project was to make a switch with modern SSH cipher suites (ssh-ed25519, aes-gcm), modern performance and hardware capabilities (10G uplinks), but extreme simplicity and no kitchen sink full of bells, whistles, attack surface, and bugs.

    I just need the ability to turn ports on and off, force to specific speeds, do port VLANs and 802.1q trunks, TDR testing on copper interfaces, and move a lot of data.

    That's about it.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Well, that was a slightly larger yak than I originally expected but it's been thoroughly shaved.

    SSH clients on the switch can now see log messages. For now this is enabled by default, although long term I might have this controlled by a per-unit configuration setting or off by default with a Cisco-style "terminal monitor" command to start seeing log messages.

    During development I want ALL the logs so I'll leave it like this for now.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Next step will be to implement some of the commands I copied over (commented out) from the Ethernet tap board, and make any tweaks needed to support the additional PHY chipsets on the board.

    In particular, I want to be able to send test patterns out both DP83867's to check for soldering issues before I debug the 100mbit-only link issue further.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Ok, I should sleep...

    But on the plus side, I have the code to send test patterns working (including the three special test patterns that the DP83867 specifies in addition to the IEEE-defined ones).

    Won't be able to actually debug the g13 100mbit issue until tomorrow after work but I should have all the groundwork laid now.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Oh I'm sorry you wanted less cable spaghetti? I swear you said you wanted more. I even bought a new roll of ESD tape to wrangle it all.

    Got the baseT test fixture cabled up so I can troubleshoot g13's link issues after work, but didn't have time to collect any data yet.

    If you haven't seen it before, this is a handy dandy little gizmo consisting of two RJ45 jacks connected back to back by dual directional couplers.

    This gives me 16 SMA outputs with 10 dB attenuated views of each of the 8 wires in the twisted pair cable, seen from both directions. I'm using an 8 channel PicoScope 6824E to look at all 8 lines coming out of the DP83867, ignoring the inbound data from the other side.

    Test fixture with two RJ45 connectors and eight blue coaxial cables coming off one side and feeding into an oscilloscope
    Closeup of the test fixture. Two RJ45 jacks connect to each other with bare copper PCB traces. At each end of the path are eight directional couplers to tap off the signal going to the oscilloscope
    Closeup of the bench setup showing the switch prototype surrounded by dozens of wires carefully taped to the bench to prevent anything from moving

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg when rf voodoo meets computers!

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @f4grx PicoScope 6824E. 500 MHz, USB3 attached, and can push about 2.5 Gbps of sample data to the host PC. My only complaint is there's not more bandwidth.

    But it's great for when you need to look at lots of lower speed signals. It's my go-to for most embedded debug work these days. Pico sent it to me as a dev unit for scopehal work and I loved it.

    https://www.picotech.com/oscilloscope/6000/picoscope-6000-features

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg picoscopes are good, we have a two channel unit at dayjob. Quite nice to use.

    gsuberland,
    @gsuberland@chaos.social avatar

    @f4grx @azonenberg I got a 3206B direct from them on eBay for super cheap when they were clearing out old revision stock. really nice scope.

    mmu_man,
    @mmu_man@m.g3l.org avatar

    @azonenberg this makes me want to throw a cat at it! 🐱

    anotherandrew,

    @azonenberg I thought that looked kind of familiar…

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @anotherandrew Yep this is the next generation version that can split bidir signalling.

    jaseg,
    @jaseg@chaos.social avatar

    @azonenberg Those are some amazing bench pictures!

    mmu_man,
    @mmu_man@m.g3l.org avatar

    @azonenberg that's something for @unbinare !

    dynode,
    @dynode@mas.to avatar

    @azonenberg Madness! 🤪

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Hmmm.

    Set up a test pattern on g13 and expected to see it coming out all pairs of the link, but only seeing it on pair A.

    Thought this pointed to a soldering issue, except I'm seeing it on g12 as well (which links up just fine).

    So I guess I need to read the datasheet and see if there's a test pattern mux register or something I'm missing...

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Yep, there is. MMD 0x1f register 0x25, TMCH_CTRL, defaults to only sending the test pattern out pair A.

    With that fixed, on g12 I'm seeing the test pattern on all pairs. So we know our register config is correct there.
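For readers who have not poked extended PHY registers before, here is a minimal sketch of the standard IEEE 802.3 indirect MMD access sequence through Clause 22 registers 13 and 14, which TI's datasheet calls REGCR and ADDAR. The `mdio_write` callback and the placeholder value are assumptions for illustration. The actual TMCH_CTRL setting needed to enable all four pairs is in the DP83867 datasheet and is deliberately not reproduced here, and this is not necessarily how the firmware in this thread does it.

```python
REGCR = 13  # MMD access control register
ADDAR = 14  # MMD access address/data register

FUNC_ADDRESS = 0x0000            # next ADDAR access sets the register address
FUNC_DATA_NO_INCREMENT = 0x4000  # next ADDAR access is data, no post-increment


def mmd_write(mdio_write, phy_addr, devad, reg, value):
    """Write `value` to MMD `devad`, register `reg`, using only Clause 22 writes.

    `mdio_write(phy_addr, reg, value)` is a caller-supplied function that
    performs one plain Clause 22 register write.
    """
    mdio_write(phy_addr, REGCR, FUNC_ADDRESS | devad)
    mdio_write(phy_addr, ADDAR, reg)
    mdio_write(phy_addr, REGCR, FUNC_DATA_NO_INCREMENT | devad)
    mdio_write(phy_addr, ADDAR, value)


if __name__ == "__main__":
    log = []

    def fake_mdio_write(phy_addr, reg, value):
        # Record the transaction instead of touching hardware
        log.append((phy_addr, reg, value))

    PLACEHOLDER_VALUE = 0x0000  # hypothetical; NOT the real TMCH_CTRL setting
    mmd_write(fake_mdio_write, phy_addr=1, devad=0x1F, reg=0x25,
              value=PLACEHOLDER_VALUE)
    for entry in log:
        print(entry)
```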

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And now here's what we see on g13. One of these is not like the other.

    Probably a solder defect but I'll need to pull the board to investigate. Decabling this will take a while...

    azonenberg,
    @azonenberg@ioc.exchange avatar

    I wish it was a solder defect. The truth is worse.

    Not sure how this got through design review...

    Datasheet screenshot showing D+ on pin 8 and center tap on pin 9

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Looking at the layout, bodging this is going to be fuuuun.

    g10, g8, and g4 have pair D routed on layer 6 of 8. Getting to them (assuming I come from the back of the PCB to avoid desoldering the connector) will mean drilling down 250 μm - annoying but not too bad.

    g13, g6, g2, and g0 all have pair D routed on layer 3 of 8. Getting to this from the back side will mean drilling down almost 1.3 mm. That will be decidedly less fun.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    The good news is that I have almost 1mm of width and as much length as I need to play with. There's basically nothing on other layers that I'm likely to hit.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And worst case, this isn't a fatal issue for a prototype. Having half the ports only run in 100baseTX mode, or even not work at all, would surely be annoying. But it wouldn't prevent me from using the board as a development platform for the full scale 24 port switch, which was the real goal.

    But I'd like to make it fully functional if I can.

    Not happening tonight, though. I've got too much else on my plate with time constraints.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    One good thing about this bodge is that it's going to be hard (I know better than to say impossible) to screw the board up more.

    There's no other signals in that area that I could potentially damage, so as long as I don't drop the board while I'm fixturing it or something, it will be no less broken post bodge than before I started. And if all goes well, it'll work better.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Actually I might try some fixturing work and a preliminary cut while waiting for stuff to run on another project.

    My microscope ring light was too fat to clear so I bodged up an LED headlamp with some tape.

    Endmill descending over PCB
    View through microscope showing area of the planned rework with an 0.25mm endmill hovering just above the board

    azonenberg,
    @azonenberg@ioc.exchange avatar

    First test cut. Through layers 8 (back) and 7 (ground plane). There's an LED trace on layer 6 we might get close to, but if it's damaged not a huge deal, plenty of other places to reconnect if required. Layers 5 and 4 are power planes we need to not short, then 3 is where the actual bodge will happen.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Down to layer 5.

    gnarf,
    @gnarf@hachyderm.io avatar

    @azonenberg Oh boy, been there, done that. I had to drill down to layer 4 of 8 to fix 2 swapped control lines for DDR3 RAM.

    Hope all goes well, this can be quite sweat-inducing!

    gnarf,
    @gnarf@hachyderm.io avatar

    @azonenberg There goes all my precious impedance control :(

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @gnarf At least control lines are slower than DQ's.

    Did it end up working?

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @gnarf Also you're the first person I've seen on Mastodon doing this kind of high end inner layer rework.

    You obviously used a mill of some sort for your edit given the nice clean outline and flat hole bottom. What was the tooling setup like? CNC or manual?

    gnarf,
    @gnarf@hachyderm.io avatar

    @azonenberg It did work! Though I had to clock it at 800Mb/s instead of 1066Mb/s or else there'd be sporadic errors. I hope it's just due to the extra wire and not any other issue, rev2 will be ordered next week.

    gnarf,
    @gnarf@hachyderm.io avatar

    @azonenberg Thanks! Yeah, that was a CNC of a friend, though he built/modified it himself, so unfortunately I don't have a model number for you. He has a little shop with multiple mills and CNCs, though this one is his most precise. We just loaded the gerbers into his CAD and even used the mill to cut through the trace

    I really appreciate your work! What's your setup? I have to drive 200km to visit my friend, so something compact at home would be appreciated :)

    image/jpeg

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @gnarf Oh cool! Mine is a Sherline 5200 mini mill with a random cheap import stereo microscope attached to it.

    They have a CNC option but I'm not sure it has the Z axis accuracy to repeatably hit traces without cutting them (especially given variables like the thickness of soldermask, board planarity, etc). I prefer to work optically, closing the loop by eye.

    Works great for shallow cuts but for high aspect ratio edits (e.g. hitting layer 3 of 8 from the underside of the board) it gets difficult: you need to make a big triangular cut towards your line of sight in order to go deep, otherwise you can't see the bottom of the hole.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    First connector (on the DP83867s) bodged. Not attempting the rest (on the VSC8512) until I've brought it up.

    Ended up milling all the way down and cutting the track then reconnecting on the surface. There's a small stub off a via which isn't great but it'll probably be fine on a prototype.

    I'll save the other six for later. If the phy doesn't work, no point spending time reworking the RJ45s.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Looks like that fixed it at least.

    Jitter test waveform showing good signal on g13

    pa,
    @pa@hachyderm.io avatar

    deleted_by_author

  • azonenberg,
    @azonenberg@ioc.exchange avatar

    @pa That's only I think 125 MHz? Not super fast. I have 10 Gbps signals elsewhere on the board.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Initial signs of life out of the QSGMII PHY!

    It's responding to MDIO with the correct address, but twice (?) and at 8 addresses (this is a 12 port PHY). Suspecting a timing issue related to the level shifters on the MDIO bus, but not sure yet. Dropping the MDIO clock frequency by 10x from 2.5 MHz to 250 kHz didn't fix it.

    The actual PHY side seems OK, it links up with my laptop on every port I've tried (aside from the known pair D issue on the upper row of ports).

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Also whoops I misspoke. The Ethernet test fixture is 16 dB couplers not 10. The directional coupler I use for TDR stuff is 10 dB and I mixed them up.

    Too much RF hardware :p

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Reading the programming guide in the VSC8512 datasheet.

    Why??? IEEE has a perfectly well defined way to access up to 2^16 extended registers. You don't need to roll your own way to do it.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Loaded an FPGA bitstream that instantiates the QSGMII transceivers on the FPGA.

    Power consumption climbed to 12.7W and the FPGA die temperature is up to 48.5C.

    The 1V0 rail for the GTXes is sagging to 975.5 mV under load, since it's just pi filtered off of the main FPGA 1V0 rail without an independent remote sense. This is within spec... barely. But definitely something I will want to work on in the future. The full LATENTRED switch (with eight transceivers) will definitely need a dedicated SERDES power rail with independent regulation.

    The FPGA 1V0 rail is doing just fine, 1.0015V at the test point and 0.996V measured by the on die ADC.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    The thermal pad and heatsink pressure seem fine. Heatsink surface temperature is only 5C below die temperature so not much of a gradient there.

    Thermal image of FPGA heatsink showing 43.6C surface temperature

    azonenberg,
    @azonenberg@ioc.exchange avatar

    FPGA logic reports none of the QSGMII links are up.

    Not entirely surprising since I've never actually tested the QSGMII block in hardware, but still a bit annoying.

    I think that's it for today. Tomorrow I'll decable the whole setup (again), and probably try to bodge one or more of the VSC8512 RJ45s as long as I have it off the bench.

    Then get test leads on the VSC8512 MDIO bus (to see if anything funky is happening with timing there, I still can only talk to 8 of the 12 PHYs... might be a register misconfiguration too though), and probably land a high BW probe on one or more of the QSGMII lanes to see what's happening with that.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Quick handheld probe measurement off the QSGMII TX line from the FPGA.

    Definitely some logic bugs, we're supposed to have K28.1 in lane 0 and all I'm seeing is K28.5.
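As a quick reference for the code groups mentioned here, this small sketch computes the pre-encoding byte for an 8b/10b code group named Dx.y or Kx.y and checks a decoded lane 0 symbol stream for the K28.1-versus-K28.5 problem. The helper names and the `(is_k, byte)` tuple format are assumptions made for illustration, not the ngscopeclient decode API.

```python
def code_group_byte(x, y):
    """Return the 8-bit value of code group Dx.y / Kx.y (before 8b/10b encoding)."""
    if not (0 <= x < 32 and 0 <= y < 8):
        raise ValueError("x must be 0..31 and y must be 0..7")
    return (y << 5) | x


K28_1 = code_group_byte(28, 1)
K28_5 = code_group_byte(28, 5)


def lane0_commas_ok(symbols):
    """Check that every comma seen on lane 0 is K28.1 rather than K28.5.

    `symbols` is a list of (is_k, byte) tuples from a decoded lane 0 stream.
    Returns False if no commas were seen at all.
    """
    commas = [b for is_k, b in symbols if is_k and b in (K28_1, K28_5)]
    return bool(commas) and all(b == K28_1 for b in commas)


if __name__ == "__main__":
    print(hex(K28_1), hex(K28_5))
    d16_2 = code_group_byte(16, 2)
    print(lane0_commas_ok([(True, K28_5), (False, d16_2)]))  # the bug seen here
    print(lane0_commas_ok([(True, K28_1), (False, d16_2)]))  # expected behavior
```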

    The eye (measured at the PHY side of the coupling capacitor) is pretty wide open, but I will definitely want to tweak driver settings given the closure in the right half. Need to check this against the QSGMII eye mask but I don't have the specs for that in ngscopeclient yet (also a job for tomorrow).

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Seems like drive on my QSGMII TX is just a little bit over the top. Left eye has the transmitter mask, right has the receiver.

    This is a mid-channel measurement (at the AC coupling cap) so we need to be better than the RX mask but don't need to pass the TX.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Back to the lab for the evening and continuing switch bringup.

    Double checking pins on the VSC8512 and so far not seeing any issues.

    I did notice that the thermal diode is tied off to ground, which is in retrospect a mistake. I should have provided a means to monitor it externally. Now I have no way to tell if the PHY is overheating other than by pointing a FLIR camera at the heatsink and adding a couple of degrees to the reading.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Signal integrity tweaking on the QSGMII.

    Took initial measurements with an AKL-PT5 and a D1330, then cross checked the PT5 measurements against a D1605.

    Long shot of very crowded lab bench covered in cables and probes
    Closeup of AKL-PT5 on QSGMII TX pair

    azonenberg,
    @azonenberg@ioc.exchange avatar

    After some tweaking, the QSGMII TX waveform isn't overshooting.

    But when I soldered an AKL-PT5 on, I saw a huge dip around T=25ps that I don't remember seeing in the handheld probe view (maybe it didn't have enough BW to show it?)

    I repeated the same measurement with a D1605 (shown here) just in case it was an artifact of the PT5. Other than a bit less noise, the eye looked identical.

    Need to check and see if the remaining QSGMII lanes have similar issues or if this is the only one, or what. It technically passes the QSGMII eye mask so it should work but I wouldn't want to field it looking like this!

    RX drive strength is a bit higher than spec, but the FPGA will happily eat it so I'm not concerned.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Looking at the QSGMII link state, it seems that the FPGA is sending autonegotiation codeword 0x4001 (SGMII mode, no remote fault etc, no next page).

    The PHY is sending K28.5 D16.2 which is IDLE 2, so I think this means it's waiting for the FPGA to go "ok, link is up"?

    Reading register 19E3 from the PHY (link partner clause 37 ability) shows 0x4001, the same thing the FPGA is sending. This means that the PHY is seeing my autonegotiation traffic and decoding it correctly.

    Register 17E3 is 0x0409: no SGMII alignment error or remote fault, no full duplex advertised by MAC (seems wrong), no half duplex advertised by MAC, link partner AN capable, link not connected, AN not complete, signal present.

    But... bit 5 of the AN advertisement (which means full duplex capable) is reserved, must be zero in SGMII mode. So I'm not sure if this is a problem or not.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Well here's a problem. My SGMII MAC isn't properly dropping ordered sets when the RX FIFO fills up.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Fixed a bunch of bugs in the SGMII block, the QSGMII-SGMII bridge, and even in ngscopeclient.

    And the TX eye still isn't very pretty, I need to investigate that more.

    But the QSGMII links are now alive! Let's see if I can actually pass traffic...

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And it looks like the PHY is able to receive traffic! Haven't tested if it decodes properly in the FPGA etc, but the PHY is sending well formed QSGMII, the FPGA sees the link as up, and the decode in libscopehal is making sense of it.

    Not sending anything yet. A lot more work needed on the switch logic in the FPGA to make that happen.

    Screenshot of ngscopeclient showing the filter graph for the QSGMII analysis

    anotherandrew,

    @azonenberg out of curiosity, why do you feed the channel into the clock recovery PLL but the data through a threshold filter? Why not threshold the clock? Does it help to (maintain) lock even with weaker signals?

    azonenberg, (edited )
    @azonenberg@ioc.exchange avatar

    @anotherandrew The CDR PLL filter does internal thresholding on analog inputs with sub sample interpolation (currently linear but may switch to cubic eventually) to find zero crossings with high accuracy.

    While it can work with a digital input if necessary, jitter performance degrades due to the lack of interpolation (essentially the phase detector block has its input rounded to the nearest integer sample index). With 5 Gbps data and a 40 Gsps sample rate, you only have eight samples per UI so that sub sample precision makes a significant difference in stability of the recovered clock.

    The protocol decode block just needs a digital waveform you can sample on the edges of the recovered clock.
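To illustrate why the sub sample interpolation matters, here is a minimal Python sketch (not the libscopehal implementation) that locates threshold crossings with linear interpolation and compares them with what you get from an already-thresholded digital waveform, where an edge can only land on an integer sample index. The test signal is a sine with 16 samples per period, i.e. eight samples per UI for an alternating bit pattern, with crossings deliberately placed 0.3 samples off the sampling grid.

```python
import math


def interpolated_crossings(samples, threshold=0.0):
    """Fractional sample indices where the waveform crosses the threshold."""
    crossings = []
    for i in range(len(samples) - 1):
        a = samples[i] - threshold
        b = samples[i + 1] - threshold
        if (a < 0) != (b < 0):
            # Linear interpolation between the two samples around the crossing
            crossings.append(i + a / (a - b))
    return crossings


def digital_crossings(samples, threshold=0.0):
    """Edge positions after thresholding: first sample index in the new state."""
    bits = [s >= threshold for s in samples]
    return [i + 1 for i in range(len(bits) - 1) if bits[i] != bits[i + 1]]


if __name__ == "__main__":
    # True crossings are at sample indices 0.3 and 8.3
    wave = [math.sin(2 * math.pi * (n - 0.3) / 16) for n in range(12)]
    print(interpolated_crossings(wave))
    print(digital_crossings(wave))
```

With only eight samples per UI, the digital version is off by a large fraction of a sample on every edge, which shows up as extra jitter on the recovered clock.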

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Continuing switch bringup work.

    All ports (except the four VSC8512 interfaces which aren't responding over MDIO) have link state/speed working and queriable via the MCU.

    Something is wonky with the basic status register, it's saying the link is half duplex even though it's negotiated to full duplex (in fact, only advertising full duplex). Not sure if this is a bug or what. Might have something to do with the 8051 microcode patch I haven't yet applied?

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg because of course it has a 8051.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @f4grx It also has a MIPS, which is inactive. And an entire Ethernet switch engine.

    Because the 12-port PHY is (as far as I can tell) a neutered 12-port switch ASIC.

    gsuberland,
    @gsuberland@chaos.social avatar

    @f4grx @azonenberg whoops! all 8051

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Spent a while today debugging on live hardware and finally reproduced the issue in simulation.

    Packets more than 32 128-bit words in length will max out the prefetch FIFO but I never continue to fetch traffic after that point. There's a big giant TODO comment I never implemented. Oops.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Found and fixed a few more bugs (including one that hadn't bit me yet, but would have become bad under heavier network traffic). Timing is getting a bit tricky, this one path (basically arbitration to decide which input FIFO to pop into the shared bus) is going to have to get reworked before I scale up to 24 ports.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Did a bunch of timing fixes and added some more pipeline stages. Latency is higher than I'd like now and I'll definitely want to work on reducing it, but it should do for a starting point.

    Also did some per-link power estimates: about 13.3W in the current test configuration (management port, SFP+ uplink, and two VSC8512 edge ports active at 1 Gbps, no packet traffic).

    This climbs to about 13.8W (+0.5W, so 0.25W per interface) if looping back two DP83867 interfaces, and 14W (+0.7W, so 0.35W per interface) looping back two VSC8512 interfaces.

    With all links up, I thus project that the total board power consumption would climb to about 17.3W. This would likely increase a bit further with heavy traffic due to increased toggles on the SRAM bus etc.
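The 17.3W projection can be reproduced with a few lines of arithmetic. This sketch assumes the port counts mentioned elsewhere in the thread: two DP83867 ports (g12 and g13) and a 12-port VSC8512, two of whose ports are already up in the 13.3W baseline. The constant names are made up for illustration.

```python
BASELINE_W = 13.3              # mgmt + SFP+ uplink + two VSC8512 ports up
DP83867_W_PER_PORT = 0.5 / 2   # +0.5W measured for two interfaces
VSC8512_W_PER_PORT = 0.7 / 2   # +0.7W measured for two interfaces

DP83867_PORTS = 2              # assumed: g12 and g13
VSC8512_PORTS = 12             # assumed: 12-port PHY
VSC8512_ALREADY_UP = 2         # already counted in the baseline


def projected_power_w():
    """Projected board power with every link up and no packet traffic."""
    extra_dp = DP83867_PORTS * DP83867_W_PER_PORT
    extra_vsc = (VSC8512_PORTS - VSC8512_ALREADY_UP) * VSC8512_W_PER_PORT
    return BASELINE_W + extra_dp + extra_vsc


if __name__ == "__main__":
    print(f"{projected_power_w():.1f} W")
```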

    Not too bad for a ~16 port switch (counting management and uplink ports). I've also put zero effort into optimizing the FPGA design for power to date, so there's probably things I can do to improve there.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Off the top of my head:

    • If an entire group of four baseT links is down or disabled, I can shut down the QSGMII SERDES
    • If there's no traffic on the read side of the SRAM bus, I can disable the input terminations
    • If there's no traffic on the write side of the SRAM bus, I might be able to tristate the bus except for control signals
    • It might be possible to consolidate/optimize PLL configuration to use fewer PLLs
    • There's definitely work to be done to use fewer long range high fanout clocks on the FPGA
    • Improve gating of unused signals on wide buses etc to avoid propagation of toggles that don't do useful work

    azonenberg, (edited )
    @azonenberg@ioc.exchange avatar

    Always a fun day when you have to write code like this...

    Hopefully this will give me a trigger condition that will let me figure out why my switch fabric is deadlocking trying to forward a packet without actually doing anything to it.

    gsuberland,
    @gsuberland@chaos.social avatar

    @azonenberg my favourite nomenclature for this pattern is ++fuse == blown :D

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Welp. Somehow I'm trying to start forwarding from port 15.

    Except I only have 15 ports (14 plus the uplink) and port numbers are zero based.

    Looks like I was incrementing the round robin counter but forgot to add the "mod portcount" bit.

    And apparently whatever logic Vivado synthesizes for accessing the 16th element of a 15-element vector resulted in the arbiter thinking it had data to send, entering the busy state, but then never getting a done signal.
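A tiny model of the bug, for illustration only: the real arbiter is HDL, and the 4-bit counter width here is an assumption chosen because it is the smallest width that can hold 15 port indices.

```python
PORT_COUNT = 15  # 14 edge ports plus the uplink, numbered 0..14


def next_port_buggy(current):
    """Free-running 4-bit counter: wraps at 16, so it can select port 15."""
    return (current + 1) & 0xF


def next_port(current, port_count=PORT_COUNT):
    """Round robin with the missing 'mod portcount' step added."""
    return (current + 1) % port_count


if __name__ == "__main__":
    print(next_port_buggy(14))  # out of range: there is no port 15
    print(next_port(14))        # wraps back to port 0
```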

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And after a few more fixes, it's working!

    Here an ARP frame shows up on port 0 (g0), is received via QSGMII, transferred to the core clock domain, processed through the SRAM FIFO (all offscreen).

    Then at T=32 it's looked up in the MAC address table. At T=35 the table returns "not found", which makes sense since the destination is a layer 2 broadcast.

    At T=39 a forwarding decision is made: the frame should be broadcast to all of VLAN 99 except for g0, where the frame came from. In this example config that's ports 5 (g5) and 14 (xg0).
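The forwarding decision described above can be sketched as a small Python function. This is a software illustration under stated assumptions (a table keyed by VLAN and MAC address, hypothetical names throughout), not the FPGA datapath, which does this in pipelined logic.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def forwarding_decision(mac_table, vlan_members, src_port, vlan, dst_mac):
    """Return the set of ports a frame should be sent out of.

    mac_table:    dict mapping (vlan, mac) -> port, i.e. learned addresses
    vlan_members: dict mapping vlan -> set of member ports
    """
    port = mac_table.get((vlan, dst_mac))
    if port is not None:
        # Known unicast: one destination, never back out the source port
        return {port} - {src_port}
    # Lookup miss (including broadcast): flood the VLAN except the source port
    return set(vlan_members.get(vlan, ())) - {src_port}


if __name__ == "__main__":
    vlan_members = {99: {0, 5, 14}}  # g0, g5 and xg0 as in the example config
    # ARP broadcast arriving on g0: not in the table, so it floods VLAN 99
    print(sorted(forwarding_decision({}, vlan_members, 0, 99, BROADCAST)))
```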

    Then at T=41 after some pipeline latency, data begins flowing.

    It ends up in /dev/null for now because there's no exit queues between the frame_* control signals and the TX-side MAC IPs. But that's the only missing piece to make this a fully functional, if very basic, switch!

    Vivado logic analyzer screenshot showing control signals for forwarding a packet. See toot text for discussion.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    FPGA resource usage is growing, but things are still looking good in terms of being able to finish the job - and hopefully fit a full 24 port design in the same FPGA.

    Current total fabric usage including the logic analyzer IP is 34% LUT, 23% FF, 39% BRAM, 6% DSP, 100% SERDES (duh), 65% IO, 53% global clocks, 25% MMCM/PLL.

    One big unknown is how to scale the architecture up to 24 ports, since the current shared bus architecture is running close to its max performance with 14 ports and assumes a single memory channel. Refactoring this to work with a dual channel RAM controller will be interesting.

    One "easy" option is to have essentially two independent sub-switches and a high bandwidth interconnect between them. But that might mean duplicating resources like the MAC address table.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Added exit queues and it's getting fuller. 38% LUT, 25% FF, 48% BRAM, 6% DSP, 100% SERDES, 65% IO, 53% BUFG, 25% MMCM / PLL.

    Still missing VLAN tag insertion for outbound trunk ports (and some other logic to propagate VLAN tag information to support that) but in theory it should be capable of switching between access ports now. About to try in hardware, wish me luck!

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And no go. My pings aren't being seen and I'm seeing no transmit activity on the QSGMII link.

    But at least I have some idea of where to add on-chip debug probes to troubleshoot further.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Ok, turns out there is transmit activity but it's gibberish. Skipping data bytes or something.

    Upon closer inspection it seems I had incorrect TX clock configuration (feeding TXUSRCLK with 156.25 MHz instead of 125) due to some confusing GTX configuration. Hopefully this will fix it...

    azonenberg,
    @azonenberg@ioc.exchange avatar

    It's alive!! First light on the switch passing packets!

    When I ping flooded through it, it locked up and stopped forwarding traffic until I reloaded the FPGA. Probably related to one of the dozens of FIFO-full error handling code paths I haven't tested or fully implemented.

    Still lots more work to do: VLAN tag insertion on outbound trunk interfaces, 10/100 support in the SGMII MAC, performance counters, tons of error handling, lots of CLI commands, investigating SI on the QSGMII TX diffpair, figuring out why g8-g11 aren't responding on MDIO, power integrity validation...

    jpm,
    @jpm@aus.social avatar

    @azonenberg absolutely amazing progress, congratulations!

    anotherandrew,

    @azonenberg congratulations! I am following your progress with considerable interest, one of my favourite things to do here on mastodon.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Found a few more thermometers on the board. Turns out in addition to the externally pinned out thermal diode on the VSC8512 (which I didn't hook up to anything) there is an (undocumented, but used in some example code I dug up) internal digital temperature sensor.

    There's also one on the STM32.

    jpm,
    @jpm@aus.social avatar

    @azonenberg is that just a coincidence that the SFP+ is the same temperature to 2 decimal places?

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @jpm Yes. Actual sensor resolution is less than this (I might end up printing only one digit) and so there's some collisions when normalizing.

    Internally I convert all temperatures from the multitude of different formats to 8.8 fixed point and from there to decimal degrees C.
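
    As a rough illustration of that normalization, here is a minimal sketch assuming the 8.8 value is stored as a 16-bit two's complement integer. The function names and the two example sensor readings are hypothetical, not taken from the firmware.

```python
def to_fixed_8_8(degrees_c: float) -> int:
    """Encode a temperature as signed 8.8 fixed point (16-bit two's complement)."""
    raw = round(degrees_c * 256)
    if not -32768 <= raw <= 32767:
        raise ValueError("temperature out of range for 8.8 fixed point")
    return raw & 0xFFFF


def from_fixed_8_8(raw: int) -> float:
    """Decode a 16-bit two's complement 8.8 value to degrees C."""
    if raw & 0x8000:
        raw -= 0x10000
    return raw / 256


def format_temp(raw: int, digits: int = 2) -> str:
    """Render an 8.8 value as decimal degrees C with a fixed number of digits."""
    return f"{from_fixed_8_8(raw):.{digits}f}"


# Hypothetical sensor with 1/16 degree steps reporting 48.4375 C
coarse = to_fixed_8_8(48.4375)
# Hypothetical sensor with 1/256 degree steps reporting 48.44140625 C
fine = to_fixed_8_8(48.44140625)
print(format_temp(coarse), format_temp(fine))
```

    Both readings print as 48.44 even though the raw values differ by one LSB, which is the kind of collision you get when printing more digits than the sensors resolve.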

    jpm,
    @jpm@aus.social avatar

    @azonenberg cool (daaaaaad joke). Thought it looked weird having 2 temperatures the same to 2 decimal places. Probably not even worth showing any decimal places, does it really matter if the SFP is 48.0°C or 48.45°C ?

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Fixed a bunch of bugs and reduced latency of the QDR-II+ controller. End to end latency from read request to full burst data in hand - including PCB trace delays and clock domain crossing but not the additional pipeline stage for ECC - is now down to nine clocks at 187.5 MHz (48 ns). Probably more room to improve further on that but it's already way better than the 11-17 cycles I was seeing before with a less efficient CDC structure.
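
    For reference, the cycle-to-time conversion behind those numbers is just cycles divided by clock frequency. A tiny sketch (the helper name is made up for illustration):

```python
def cycles_to_ns(cycles: int, clock_mhz: float) -> float:
    """Convert a cycle count at a given clock frequency to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


CLOCK_MHZ = 187.5  # controller clock quoted above

for cycles in (9, 11, 17):
    print(f"{cycles} cycles -> {cycles_to_ns(cycles, CLOCK_MHZ):.1f} ns")
```

    So the old 11-17 cycle range works out to roughly 59-91 ns, against 48 ns now.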

    It no longer falls over instantly when ping flooded, however sustained floods (especially with preload) still make it start corrupting packets. So I've fixed the easiest-to-trigger bug and there's still more.

    Debating how much time I want to spend chasing bugs in the current fabric architecture since I know it won't scale to 24 ports and barely makes timing as-is. Might just blow away everything between the input FIFOs and the MAC table and redo it clean slate.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Welp, seems I have a new bug: I'm reading a frame out of the input FIFO that's shifted by one word.

    The first word of the packet (src/dest MAC address, ethertype, and first 4 bytes of payload) is gone (sent as part of the previous packet), then there's another word that I assume is the start of the subsequent packet at the end.

    Seems to be triggered by heavy traffic like ping floods, but haven't caught it happening on the write side yet.

    So far not sure if fifo pointers are getting desynced or if I'm writing bad data out of the CDC.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Nope, the SRAM FIFO is fine. Garbage in, garbage out.

    So the problem is happening earlier on, in the CDC or maybe as I'm filling buffers to be written to SRAM?

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Yeeep, it's something in the CDC FIFO (or the logic interfacing with it).

    When the packet that actually goes sideways starts, there's already six words of data in the CDC buffer. But all of the other state - most notably packet metadata with length, vlan ID, etc - is missing, so that data gets ignored and isn't popped until more data shows up, at which point you get a hodgepodge of both packets.

    Still don't know which clock domain the actual bug is in so this will be fun...

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Oops it's 3:30 AM and I have to be awake for work tomorrow... But I think I found the bug.

    If I'm right it's one of those "how did this ever work" moments. Very confused as to how ping flooding makes it fail, it seems like it should always fail with packets of a certain length mod 16.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Nope, that wasn't it. But it put me on the trail of the actual bug.

    Not one but two packets before SHTF, something goes wrong. There's nothing in the metadata fifo, there's nothing visible on the read side of the data fifo, but the write side of the data fifo shows 506 free words, out of a capacity of 512.

    Meaning something pushed six words into it, then (for at least the few hundred clocks I have data captured for), never asserted the "commit" flag.

    This CDC FIFO has a commit/rollback mechanism intended to be used for store-and-forward packet processing; the write side maintains a private write pointer that is only pushed to the read side when you hit "commit". Until then, the available space is decreased but the read side still shows empty.

    The intent is to commit on end of packet with valid FCS and roll back on end of packet with invalid FCS, or if the FIFO fills prior to the end of a packet. Having stale data in the buffer that never gets committed/rolled back SHOULD be impossible...
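
    To make the commit/rollback behaviour concrete, here is a toy single-clock software model. It is only a sketch of the idea described above, not the actual RTL: the class and method names are hypothetical, and the real FIFO crosses clock domains, which this ignores.

```python
class CommitFifo:
    """Toy model of a FIFO whose writes only become visible on commit."""

    def __init__(self, depth: int):
        self.depth = depth
        self.mem = [None] * depth
        self.rd_ptr = 0         # read pointer
        self.committed_wr = 0   # write pointer visible to the read side
        self.private_wr = 0     # write-side private pointer (uncommitted)

    def free_words(self) -> int:
        # Space as seen by the writer: uncommitted words already count as used.
        return self.depth - (self.private_wr - self.rd_ptr)

    def readable_words(self) -> int:
        # The reader only sees words below the committed pointer.
        return self.committed_wr - self.rd_ptr

    def push(self, word) -> bool:
        if self.free_words() == 0:
            return False
        self.mem[self.private_wr % self.depth] = word
        self.private_wr += 1
        return True

    def commit(self) -> None:
        self.committed_wr = self.private_wr

    def rollback(self) -> None:
        self.private_wr = self.committed_wr

    def pop(self):
        if self.readable_words() == 0:
            raise IndexError("FIFO empty")
        word = self.mem[self.rd_ptr % self.depth]
        self.rd_ptr += 1
        return word


def receive(fifo: CommitFifo, words, fcs_ok: bool) -> bool:
    """Store-and-forward write: commit on good FCS, roll back otherwise."""
    for w in words:
        if not fifo.push(w):
            fifo.rollback()     # FIFO filled before end of packet
            return False
    if fcs_ok:
        fifo.commit()
    else:
        fifo.rollback()
    return fcs_ok


fifo = CommitFifo(512)
for w in range(6):
    fifo.push(w)                # six words pushed, never committed or rolled back
print(fifo.free_words(), fifo.readable_words())
```

    The last lines reproduce the symptom seen in the capture: 506 free words on the write side while the read side still reports empty.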

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And here's the root cause: https://github.com/azonenberg/latentpacket/commit/15a9c4359809ae00801205d9f1fa73a02463f06d

    The VLAN tag removal logic on the input side, between the MAC and the CDC FIFO, was failing to forward the "drop" flag. So any time a packet had a FCS failure, the metadata would be discarded and the packet content would be prepended to the next valid packet.

    This solves the "ping -f" hang; I just did a test of 100K pings with only 25 drops and it was still working fine after that.

    This now raises two new questions:

    1. Why did I still lose 25 packets? Judging by the previous bug, at least some are getting FCS errors. Is this signal integrity on the QSGMII link, a logic bug in the MAC, or something else?

    2. When I ping flood with preload, i.e. ping -f -l 50, the switch still hard locks up pretty quickly. So I have a second, likely unrelated bug caused by a lot of packets in quick succession.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Looks like the incoming data is occasionally (25 of 100K packets in my last test) getting corrupted somewhere between the upstream switch MAC and my 32 bit MAC data bus.

    In between:

    • Switch PHY
    • On rack patch cable
    • Plant cable
    • Bench patch cable
    • Magjack and PCB
    • VSC8512
    • QSGMII link to 7 series GTX
    • My QSGMII to SGMII demux
    • My SGMII PCS
    • My GMII MAC

    Suspecting something in the serdes/QSGMII region, but not sure yet.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Closing in on this bug.

    The data coming off the PHY is fine, verified by sniffing and protocol decoding the QSGMII link.

    The data entering the decode side of the PCS (after elastic buffer shifting from SERDES clock domain to MAC clock domain) is wrong.

    First guess: something in that buffer is borked and it's filling up, rather than dropping idles between packets when it gets too full like it's supposed to. If the remote side of the link has a clock a few ppm faster than the FPGA, the FPGA will have to occasionally drop idles to rate match. If that logic is broken we'll just see random bytes of data not show up when they should.

    ngscopeclient screenshot showing received data sequence 1d 1e 1f 20 21... coming from PHY to FPGA

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Hmmmm. It helps if your elastic buffer drops extra idle ordered sets when it's almost full.

    Not when almost empty. 🤦‍♂️
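
    A small simulation of why the direction of that check matters. This is a sketch under loud assumptions: the clock mismatch is hugely exaggerated (one extra symbol every 200 reader ticks rather than a few ppm), the buffer depth and thresholds are invented, the buffer starts half full, and the reader is modeled as simply outputting nothing when the buffer is empty.

```python
from collections import deque

IDLE = None
EOF = object()


def make_stream(n_packets: int, pkt_len: int, gap: int):
    """Yield numbered data words separated by runs of idle symbols."""
    seq = 0
    for _ in range(n_packets):
        for _ in range(pkt_len):
            yield seq
            seq += 1
        for _ in range(gap):
            yield IDLE


def run(drop_when: str, depth: int = 16, extra_every: int = 200):
    """Simulate a writer slightly faster than the reader.

    drop_when: "almost_full" (intended) or "almost_empty" (the bug).
    Returns (data words delivered, data words lost to overflow).
    """
    buf = deque([IDLE] * (depth // 2))   # start half full
    stream = make_stream(n_packets=200, pkt_len=64, gap=12)
    delivered = lost = tick = 0
    done = False
    while not done or buf:
        tick += 1
        n_writes = 2 if tick % extra_every == 0 else 1
        for _ in range(n_writes):
            sym = next(stream, EOF)
            if sym is EOF:
                done = True
                break
            fill = len(buf)
            if drop_when == "almost_full":
                drop = sym is IDLE and fill >= depth * 3 // 4
            else:
                drop = sym is IDLE and fill <= depth // 4
            if drop:
                continue            # rate matching: discard an idle
            if fill >= depth:
                if sym is not IDLE:
                    lost += 1       # overflow: a data word vanishes
                continue
            buf.append(sym)
        if buf and buf.popleft() is not IDLE:
            delivered += 1
    return delivered, lost


print("drop when almost full: ", run("almost_full"))
print("drop when almost empty:", run("almost_empty"))
```

    With the intended check the buffer hovers around the high-water mark and every data word arrives. With the inverted check nothing is ever dropped once the buffer is above the low-water mark, so it fills up and data words start going missing mid-packet.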

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And fixed. Now I can start chasing the "switch fails when ping flooded with preload" bug.

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg congrats, that's a relief I imagine!

    azonenberg,
    @azonenberg@ioc.exchange avatar

    OK, this one is interesting.

    The switch is forwarding packets that are completely correct except for the first 16 bytes, which at first glance appear to be gibberish.

    The 16 byte size is a clue, since most of the fabric and the external packet buffer SRAM are using a 128-bit datapath, while the MAC/PCS blocks are narrower (8-32 bits at various spots).

    So the problem here is likely a lot closer to the core than the previous bug.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    When your 16K entry FIFO has 16388 free spots in it, that's awesome!

    It's a TARDIS or something, bigger on the inside than the outside. ... right?

    mupuf,
    @mupuf@fosstodon.org avatar

    @azonenberg Off by 4, that's rare!

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @mupuf I think I know where it's going wrong (increments meant for one port are being directed to another). Just not why yet.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @mupuf This makes sense wrt it being triggered by high traffic. Two packets on different ports need to arrive very close together then some state gets jumbled.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Fixed that one with a complete rewrite of the FIFO pop logic, but there's still a (rarer) bug somewhere else. Lovely.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Switch fabric reliability is improving! I'm now needing heavier and heavier loads and triggering less frequent bugs.

    The one I'm chasing now involves a port getting stuck in the PREFETCH state, indicating it's asked for data from external RAM but it got less data than it expected.

    I'm actually getting up to a pretty decent link utilization with this ping flood. Far from saturating the pipe, but looks like maybe 20-30% ish?

    ngscopeclient screenshot showing QSGMII data capture and protocol decode with dozens of ping packets at various points in the forwarding process

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Pretty sure I have a root cause on this one already. Just took a few P&R runs to get probes on the right signals.

    I cleared the prefetch-in-progress combinatorially on the last cycle of a prefetch to enable gap-free transitioning to a second prefetch on a different port.

    But when I started a prefetch I'd also start a read request to the RAM that cycle. So if this happened the second prefetch would steal the bus cycle from the first.

    The fix is simply to not do that, and wait until next cycle to fetch the next word. As a bonus, this eliminates a critical path I was worried about.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Yep, that was the bug. Seemed to fix the other packet corruption problem I had been chasing as well.

    So at this point there are no known bugs in the fabric and it's time to work on building other stuff.

    I still need a bazillion performance counters to evaluate how things are going as I push the fabric to heavier loads, plus a lot of debug features for things like printing PHY status registers in human readable form.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Adding performance counters and a bunch of other debug features is gradually increasing FPGA resource usage to a concerning level.

    Fitting the rest of LATENTPINK is not going to be a problem, but there won't be a whole lot of free space.

    I could probably... probably... shoehorn a full 24+1/24+2 port LATENTRED design into the 7k160t if I really squeezed. But I'd have to start cutting features and I'd have no room for e.g. potential layer 3 processing or ACLs in the future.

    The question then becomes, what do I replace it with? I want "comfortably more" than the 100K LUTs of the 7k160t, enough high performance IO for two channels of QDR-II+, and at least eight transceivers.

    The XC7K325T is out, I want to stay with free Vivado for F/OSS friendliness reasons (and to avoid increasing the already significant project budget by another $3K), so there's no path forward using 7 series.

    Assuming I stay Xilinx, that means UltraScale or UltraScale+.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And if I limit myself to parts supported by free Vivado, that leaves five options: XCAU25P, XCKU025, XCKU035, XCKU3P, XCKU5P.

    The AU25P is by far the least expensive (XCAU25P-1FFVB676E is $427 at Digikey) and I have two in inventory already. It's got 40% more LUT capacity than the 7k160t, but slightly less block RAM, and a lot less IO: 208 HP and 96 HD. I'd need 196 HP for the RAM, leaving 12: enough for clock and Vref and that's about it.

    Which leaves me HD pins for interfacing with the MCU, maybe driving some indicator LEDs, and boot flash. But for a 24+2 port design I only need 6 GTs for QSGMII and 2 for 10G, so I'd have four extras.

    Which is good because RGMII would really be pushing limits for HD I/O, and free GTs would let me use a SGMII PHY instead.

    So as long as I can get by with 300 BRAMs (I'm using 157 in LATENTPINK including the management engine and MAC table which don't scale with interface count, so should be doable?) I think I've got a good shot.
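
    The headroom arithmetic from this post, tabulated as a quick sketch. All figures are the ones quoted above; the GT total of 12 is inferred from "8 needed, four extras", and the block RAM "needed" figure is only today's LATENTPINK usage, not an estimate for the 24-port design.

```python
# Resource budget for a 24+2 port design on the AU25P, using numbers from the post.
budget = {
    "HP I/O": {"available": 208, "needed": 196},          # dual QDR-II+ channels
    "GT transceivers": {"available": 12, "needed": 6 + 2},  # 6 QSGMII + 2 10G
    "Block RAM": {"available": 300, "needed": 157},        # current LATENTPINK usage
}


def headroom(entry: dict) -> int:
    """Return how many units are left after the stated requirement."""
    return entry["available"] - entry["needed"]


for name, entry in budget.items():
    pct = 100 * entry["needed"] / entry["available"]
    print(f"{name:16s} {entry['needed']:4d}/{entry['available']:4d} "
          f"({pct:4.1f}% used, {headroom(entry)} spare)")
```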

    azonenberg,
    @azonenberg@ioc.exchange avatar

    The XCKU025 is a lot pricier (XCKU025-1FFVA1156C is $1288 at Digikey). 45% bigger than the 7k160t, so almost the same size as the au25p, but has 360 BRAMs - a nice increase over the AU25P.

    It also has 208 HP IOs, but has 104 HR IOs (which should have no trouble doing RGMII for the management port) instead of the slow UltraScale+ HD IOs.

    Fabric performance might actually be a little slower than the AU+ since it's 20nm rather than 16nm, but both should be comfortably faster than the 28nm 7k160t.

    Also, the AU25P is the biggest AU+ device so there's no upgrade path if I outgrow it, while the KU025 FFVA1156 package is pin compatible with the KU035.

    Interestingly, though, the KU025 is not offered in any of the lower pin count packages like FBVA676. So if I went with the Kintex UltraScale route I'd need a PCB with enough layer capacity to fan out an 1156 ball package.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    The XCKU3P is even more expensive (XCKU3P-1FFVB676E is $1491 at Digikey), and 60% larger than the 7k160t, also with 360 BRAMs (same capacity as the KU025), but it also has 48 UltraRAMs so the total usable on-die memory capacity is more than doubled.

    Most interestingly, the FFVB676 package is pin compatible with the XCAU25P if I'm reading the docs correctly (but has only 72 HD IOs vs 96, so if I wanted the PCB to be compatible I'd need to avoid the last 24 sites).

    But this leaves open the possibility that I could design LATENTRED with the intention of using the AU25P, with potential to scale up to the KU3P or even KU5P if I ran out of fabric resources without having to respin the PCB.

    dlharmon,
    @dlharmon@chaos.social avatar

    @azonenberg I have 2 gray market KU5Ps on my desk, should know if they are any good in about a month. Only real concern would be the potential of a programmed encryption key. Each came from a different source in China for under $200. They don't appear to be reballs, near identical appearance to a Digikey sourced AU20P. I wasn't impressed with packaging from the first seller but the second was fine. Both need a bake. I checked a few ESD diodes, GTY terminations, fine.

    dlharmon,
    @dlharmon@chaos.social avatar

    @azonenberg I also did verify the AU20P has all 16 GTY bonded so they are using not only the same die but the same package substrate as KU3/5P.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @dlharmon I saw 16 GTYs in the Vivado floorplanner.

    So you're saying AU20P, AU25P, KU3P, KU5P are all the same die and just software limited/fused?

    That seems to suggest it should be possible to push the AU25P to near 100% utilization, right? Since a 100% utilized AU25P would only be about 70% of actual die sites used, there will be lots of flexibility wrt floorplanning, congestion, etc?

    dlharmon,
    @dlharmon@chaos.social avatar

    @azonenberg I'm confident of that. Floorplanner, quiescent current, bitstream size are identical.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @dlharmon In that case, the AU25P sounds like an excellent choice for the switch since it'll be trivial to swap in the bigger chip if I run out of space.

    I won't even need to redo my floorplan constraints.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Well that was weird. Something I did apparently resulted in Vivado unplacing all of my I/O pins?? Never had that happen before.

    I have all of the old pinout constraints in Git so it's not a huge deal, but wasted a P&R run finding it out after bitstream generation failed.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Thinking more, I might still have a shot at fitting everything into the 7k160t. I have two of them sitting around earmarked for this project and have no other near term use for them, so I'd like to try and make it work if I possibly can. Layer 3 functionality in the edge switch isn't a huge deal, that was going to be a 10G core switch thing anyway.

    New plan is to get LATENTPINK closer to completion, then attempt scaling the fabric up to 24+2 ports in simulation and build an FPGA design for a notional pinout of it. See what happens and if I can make timing.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Had a sudden realization while playing with the little one in the backyard: 7 series DSP48 slices have 48 bit counters/accumulators, 48 bits should be big enough for my perf counters, and I have 566 unused DSP slices on this FPGA right now.
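
    A back-of-the-envelope sketch of what 48 bits buys you. It assumes the counters simply wrap like any fixed-width accumulator, and the two event rates (bytes at gigabit line rate, minimum-size frames at gigabit line rate) are illustrative worst cases picked here, not numbers from the design.

```python
COUNTER_BITS = 48
MASK = (1 << COUNTER_BITS) - 1


def accumulate(count: int, increment: int) -> int:
    """Add to a 48-bit counter, wrapping the way a fixed-width accumulator would."""
    return (count + increment) & MASK


def days_to_wrap(events_per_second: float) -> float:
    """Days until a 48-bit counter wraps at a constant event rate."""
    return (1 << COUNTER_BITS) / events_per_second / 86400


GIGABIT_BYTES_PER_S = 1e9 / 8
# Minimum frame on the wire: 64 byte frame + 8 byte preamble/SFD + 12 byte gap
MIN_FRAME_ON_WIRE = 64 + 8 + 12
GIGABIT_MAX_FRAMES_PER_S = GIGABIT_BYTES_PER_S / MIN_FRAME_ON_WIRE

print(f"byte counter at 1G line rate:  {days_to_wrap(GIGABIT_BYTES_PER_S):.1f} days")
print(f"frame counter at 1G line rate: {days_to_wrap(GIGABIT_MAX_FRAMES_PER_S):.0f} days")
```

    Frame counters last for years even at line rate; a byte counter on a fully saturated gigabit port would wrap after a few weeks, so how comfortable 48 bits is depends on what is being counted and how often it gets read out.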

    Let's see how much space I can save by doing this (and a few other optimizations I had in mind for the counters)!

    Here's the baseline: 37831 LUT, 56959 FF.

    I'll do the conversion in a couple of stages to compare the area improvements at each step and verify I didn't break anything.

    First (running now) is to reduce the 64-bit counters to 48 bits, and remove some redundant pipeline registers in the CDC paths.

    In this layout screenshot MAC/PCS logic is dark blue, input buffering and CDCs are green, exit queues are pink, debug logic is light pink, the MAC address table and forwarding decision logic is brown, and the crypto accelerator is cyan. Performance counters, which I'm trying to massively shrink, are yellow.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    That initial cleanup cut 2500 flipflops and 700 LUTs off the area.

    Now the counters are all 48 bits, which should make them the right size to absorb into DSP48s.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    And now almost 4000 FFs and 100 LUTs chopped off that, at the cost of 80 DSPs. Seems like a good deal if you ask me.

    Still room for optimization on the readout muxing and CDC. And I can now move the perf counters closer to their parent logic since there's plenty of unused DSP slices in the MAC/PCS area.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    After some more tweaking I managed to pack things in even tighter.

    Now CLOCKREGION_X1Y0, X1Y4, and X0Y4 are completely empty.

    The SGMII logic in the right half of CLOCKREGION_X1Y1 won't be needed if I move to dual VSC8512s for LATENTRED.

    The QSGMII logic (blue, top right) will need to be replicated into the vacant space above it to hook up another 12 ports, then the exit queues (magenta at 3 o'clock position) will need to be replicated as well - perhaps into the top left area.

    Then I'll need to find somewhere for more ingress queues (green), and a second channel of RAM and controller (red) along the right side of the chip. Moving some of the packet metadata into the external RAM might help free up block RAM.

    I think this might actually be doable as 24 ports with dual SFP+ uplinks in a 7k160t.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Here's what I'm thinking. Green X denotes logic to be removed (not needed in 24 port design).

    Green arrow(s) denotes logic to be replicated from its current location to the new location.

    claudius,

    @azonenberg this looks like one of the american football strategy charts to me.

    jpm,
    @jpm@aus.social avatar

    @azonenberg iperf test next?

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @jpm Iperf will happen once I'm ready to stress it to the max.

    Ping is easier for debug since the packets are serialized and I get nice feedback as to which ones didn't make it, which I can cross-check against scope/LA captures to figure out where things went bad.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @jpm This is also a single port pair test (upstream -> g2, laptop -> g0) with no other ports participating.

    For a more proper stress test I'll make a bunch of vlans and add daisy-chain cables so a frame might come in g0, out g2, in g4, out g6, in g8, out g10, in g12, out g14. This will create a lot more load on the fabric without me having to hook up a dozen separate machines running separate iperf servers etc.

    But I still can't run the fabric beyond 50% load until I finish reworking all of the odd-numbered ports (in the upper row) to fix the pin swaps. I did one as a test to confirm that this was the only problem, but still have to do the other six.

    mwick83,

    @azonenberg
    My rule of thumb, evolved over the years: it's always the FIFO (sigh), but on the other hand CDC is a pretty close 2nd ;)

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @mwick83 The CDC is a FIFO too.

    It was gonna be the fifo one way or another, just a question of which one :)

    The overall switch fabric architecture is roughly one small input CDC FIFO per port, one big FIFO in external RAM, one small exit CDC FIFO per port.

    Right now I'm pointing fingers somewhere between the input MAC and the data written to the external RAM FIFO but haven't tracked down exactly where yet.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Still just DC voltage measurements, not looking at ripple yet. That will come later.

    jpm,
    @jpm@aus.social avatar

    @azonenberg that’s not a bodge wire it’s an antenna now! Seriously good bodge though

    mkarliner,
    @mkarliner@mastodon.modern-industry.com avatar

    @azonenberg

    You aren't really trying.
    I can see the desk.

    jpm,
    @jpm@aus.social avatar

    @azonenberg same cable moved between the 2 ports? I’ve seen plenty of marginal patch cables (and structured cable runs) fail at 1G then autoneg down to 100M

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @jpm Yes same cable.

    Per lane eye measurements, plus TDRing the cable, are pending and should rule that out as a possibility.

    biggestsonicfan,
    @biggestsonicfan@digipres.club avatar

    @azonenberg This has turned into organized chaos and I am here for it!

    jpm,
    @jpm@aus.social avatar

    @azonenberg amazing progress on this, I’m enjoying the journey!

    emeb,
    @emeb@society.oftrolls.com avatar

    @azonenberg Nice work - that was a subtle bug.

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg nice find. That was hard!

    raivis,

    deleted_by_author

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @raivis Possible? It was a Silego GreenPAK4 devkit that I abused into being a level shifter. Big pile of rainbow flywire spaghetti and slow IO cells.

    But with MDIO at 2.5 MHz and a datasheet Fmax of 25 MHz, I can't imagine I've got any kind of timing errors.

    raivis,

    deleted_by_author

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @raivis Yeah.

    And this isn't exactly a strong driver, I'm using a Xilinx 7 series LVCMOS18 output buffer with 4 mA drive and slow slew rate.

    To be clear, at this stage I only know that one of the two DP83867s is fully alive. It's entirely possible there's a hardware issue with the second one as I haven't thoroughly tested it (verifying SGMII proof of life is next on my TODO). But it shouldn't be failing in such a way that it blocks the bus only when the other PHY wants to transmit but doesn't block me sending the preamble or address headers etc.

    I can't do a link-up test easily with the second PHY because it's wired with pairs mirrored (A-D -> D-A). There's a MDIO register to swap the pair ordering but that assumes you have MDIO operational...

    anotherandrew,

    @azonenberg this veneer of cautious optimism over systemic dread is so, so familiar to me. Experience has taught me that my job as an embedded systems engineer is not to eliminate mistakes, but rather to ensure the mistakes are as small/trivial as possible. There’ll always be some on any prototype.

    anotherandrew,

    @azonenberg it’s that same experience that the greenhorns scoff at when I tell them that I look at engineering as a modern kind of voodoo; practically every design bringup requires a blood sacrifice. Doesn’t have to be much: a scratch from an xacto blade, prick from an 0.100” header or minor soldering burn is enough and ensures things will work out.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @anotherandrew Lol no blood sacrifice yet.

    And there have been a few mistakes in the board (the circular dependency in power rail sequencing, and a bunch of things I should have put test points on but didn't).

    But nothing requiring me to break out the blue wire yet.

    claudius, (edited )

    @azonenberg
    me, in the beginning: "why is he taping every cable to the table?"

    me, now: "oh, that's why!"

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @claudius Yeah. Now imagine a bunch more CAT5 coming off the right side and some differential probes hitting high speed test points on the middle of the board...

    It's only gonna get worse.

    claudius,

    @azonenberg I think I've said this before, but it bears repeating: I tremendously enjoy this whole process.

    My personal ability is limited to very simple (low frequency, 2-layer, simple MCU) PCB designs. And I imagine this is what amateur sports enthusiasts get out of watching pro sports :-D

    I kinda understand what you're doing, but at the same time this is ~10 years of experience above my skills.

    lambda,
    @lambda@chaosfurs.social avatar

    @azonenberg O kapton, my kapton!

    jpm,
    @jpm@aus.social avatar

    @azonenberg likely the SFP+ optic will, along with all the other cool info that is exposed (eg module voltage, laser Tx and Rx power) via SFF-8472.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @jpm Yeah it's https://www.fs.com/products/11552.html?attribute=71429&id=2062755 or something very similar.

    It's got DOM, but I haven't read through the relevant part of the spec to get that up yet. I figured I'd do that as part of the broader SFP+ bringup (including the SERDES IP on the FPGA and all of the other stuff).

    jpm,
    @jpm@aus.social avatar

    @azonenberg yep I’ve got similar ones from FS and they are fully supported for DOM.

    SFF-8472 looks pretty easy, no more difficult than any other I2C device, and at first glance mostly looks like mapping bits and bytes to descriptive strings or numbers with little calculation involved
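
    To show how little calculation is involved, here is a sketch of decoding the real-time diagnostics block. It assumes an internally calibrated module and the usual SFF-8472 layout (bytes 96-105 of the page at I2C address A2h); the function name is hypothetical and the example page is synthetic, not read from a real optic. Check offsets and units against the spec before relying on them.

```python
import struct


def decode_dom(page_a2: bytes) -> dict:
    """Decode temperature, Vcc, bias and optical power from the A2h page.

    Assumes internal calibration, so raw values are already in spec units.
    """
    if len(page_a2) < 106:
        raise ValueError("need at least 106 bytes of the A2h page")
    temp, vcc, bias, tx_pwr, rx_pwr = struct.unpack_from(">hHHHH", page_a2, 96)
    return {
        "temperature_c": temp / 256,      # signed, 1/256 degree C per LSB
        "vcc_v": vcc * 100e-6,            # 100 uV per LSB
        "tx_bias_ma": bias * 2e-3,        # 2 uA per LSB
        "tx_power_mw": tx_pwr * 0.1e-3,   # 0.1 uW per LSB
        "rx_power_mw": rx_pwr * 0.1e-3,   # 0.1 uW per LSB
    }


# Synthetic example page with made-up values
page = bytearray(256)
struct.pack_into(">hHHHH", page, 96, 0x3073, 33000, 3000, 5000, 4000)
for key, value in decode_dom(bytes(page)).items():
    print(f"{key:14s} {value:.3f}")
```

    Note that the temperature field is itself a signed 8.8 fixed point value, so it drops straight into the normalization described earlier in the thread.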

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg stm32 i2c is known to be a pita. did you have a look at how other i2c drivers do it, for inspiration?

    philpem,
    @philpem@digipres.club avatar

    @azonenberg I think some of them have an internal die temperature sensor too. The STM32L151 springs to mind.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @philpem Yep, not sure about the H735 in particular but it wouldn't surprise me if it had one.

    The 500 uV shift in Vcore is cool to see but not surprising at all.

    philpem,
    @philpem@digipres.club avatar

    @azonenberg I was about to say "if it's in the microvolts it's probably a non-issue" until I realised that's half a millivolt and if that's the ADC reference too - hmm. "Impact depends on application" as they say.

    anotherandrew,

    @azonenberg “Super UART” makes me smile. I know what your silkscreen means, but all I seem to be able to think of is

    Animated picture of Super Grover from Sesame Street flying

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg Your microkvs is interesting. I've done something similar myself at dayjob. Based on external eeprom for data survivability, and just a log, parsed at boot. I did that for data security because write and erases can be interrupted and the flash/eeprom can be left in an unstable state.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @f4grx Yeah, microkvs was designed specifically to ensure that interrupted writes/erases would always leave things in a valid state (marking block as used, then writing data, then writing checksum) etc. In theory any aborted write should be seamlessly rolled back and the previously written value will take priority.

    The bank-swap operation is similar: it erases the stale bank, then copies the latest revision of each object over, and only then updates the rev number in the block header so that the new bank is marked active.

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg very cool. Many people forget that flash operations are not instantaneous and will not always succeed. This is critical to have in mind for successful usage of raw flash in a product.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @f4grx Yeah. It hasn't had a whole lot of actual field testing so bugs are definitely possible, but that was an explicit design goal and I think i got close.

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg it's actually hard, you're close. The crc is not enough. You need to write one more "flag" after writing the crc to confirm full write completion. Without that, you can't be sure that the crc was written completely. If the crc is valid, you could be reading metastable bits that will be wrong at the next read. If the flag is just "not erased" (even if incomplete) you know that the crc was written completely. If the flag is still erased, your crc was not written completely.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @f4grx So what I do in KVS::StoreObject() is:

    • Write header data including CRC to reserve log entry
    • Write and verify object content
    • Write and verify object name

    If the log write is interrupted you might have a corrupted log entry but it'll be marked as "in use" next boot (i.e. at least one bit will hopefully not be metastable so it won't be considered blank, and the next write will use the subsequent log block).

    If the object write is interrupted, you've got the space marked as in-use by the header, but since the object name hasn't been written it'll be considered a placeholder object with name of 0xff..ff and nobody will ever read it.

    If the object name write is interrupted, the next scan of the log will fail to recognize it and it will fall back to the previous revision of the object.

    Worst case scenario here I think is inconsistency where you sometimes consider rev N valid and other times N-1 valid (i.e. it may or may not have actually committed).
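
    A toy model of that write ordering, to show why an interrupted store falls back to the previous revision. This is emphatically not microkvs code or its on-flash layout: the class, field names and `fail_after` knob are invented for illustration, each step is treated as atomic, and partially programmed (metastable) bytes are not modeled at all.

```python
import zlib

NAME_LEN = 8
ERASED_NAME = b"\xff" * NAME_LEN


class ToyLogKvs:
    """Append-only log where each entry is written in three separate steps."""

    def __init__(self, slots: int = 8):
        # header None means the log entry is still erased (free).
        self.log = [{"header": None, "name": ERASED_NAME, "data": b""}
                    for _ in range(slots)]

    def _first_free(self) -> int:
        for i, entry in enumerate(self.log):
            if entry["header"] is None:
                return i
        raise RuntimeError("log full (a real store would swap banks here)")

    def store(self, name: bytes, data: bytes, fail_after: int = 3) -> None:
        """Write an object; fail_after < 3 simulates power loss after that step."""
        entry = self.log[self._first_free()]
        # Step 1: header (length + CRC) reserves the log entry.
        entry["header"] = (len(data), zlib.crc32(data))
        if fail_after < 2:
            return
        # Step 2: object content.
        entry["data"] = data
        if fail_after < 3:
            return
        # Step 3: object name, written last, makes the entry findable.
        entry["name"] = name.ljust(NAME_LEN, b"\0")

    def load(self, name: bytes):
        """Return the newest complete revision of an object, or None."""
        key = name.ljust(NAME_LEN, b"\0")
        for entry in reversed(self.log):
            if entry["header"] is None or entry["name"] != key:
                continue
            length, crc = entry["header"]
            if len(entry["data"]) == length and zlib.crc32(entry["data"]) == crc:
                return entry["data"]
        return None


kvs = ToyLogKvs()
kvs.store(b"ip", b"10.0.0.1")
kvs.store(b"ip", b"10.0.0.2", fail_after=1)   # power lost after the header
kvs.store(b"ip", b"10.0.0.3", fail_after=2)   # power lost after the content
print(kvs.load(b"ip"))                        # still the first revision
kvs.store(b"ip", b"10.0.0.4")
print(kvs.load(b"ip"))
```

    The interrupted stores burn a log slot each but stay invisible, because lookups only match entries whose name was written.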

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @f4grx But this will self-resolve when you ping-pong the KVS to the other flash block (it will pick one version or the other to migrate to the new block).

    f4grx,
    @f4grx@chaos.social avatar

    @azonenberg clever. The header/data separation seems to serve as flagging. That should work, but to be really certain you would need pretty serious systematic power interruption testing. Your power supervisor might help if you ever want to do that.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @f4grx Yeah I don't want to put that kind of flash wear on a board I'll be using for real.

    Makes more sense as a dedicated PCB just for that test.

    emeb,
    @emeb@society.oftrolls.com avatar

    @azonenberg So very tidy!

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @emeb As you start adding more cables it's impossible to manage the spaghetti any other way.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @emeb Once I have an RS232 console cable, a couple of baseT links, a fiber, and some scope probes on the board it'll be serious spaghetti land.

    biggestsonicfan,
    @biggestsonicfan@digipres.club avatar

    @azonenberg As previously discussed, tape is cheap, spinning boards and parts is not! 😄

    Unixbigot,
    @Unixbigot@aus.social avatar

    @azonenberg thank you for this thread I am learning many useful factinos

    azonenberg,
    @azonenberg@ioc.exchange avatar

    Continuity check does not show any shorts from 1V8 or 1V8_IO to Vtt, innnteresting.

    The Vref and Vtt rails are coming from a LP2996A (U18).

    Here's the relevant schematic.

    mupuf,
    @mupuf@fosstodon.org avatar

    @azonenberg Thanks a lot for the writeup, I deeply enjoyed it!

    Looking forward to the follow-ups, after a well-deserved break with your family :)

    blinken,
    @blinken@hachyderm.io avatar

    deleted_by_author

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @blinken I'm using a Teledyne LeCroy RP4030 for ripple and stability measurements. I also have a chain of adapters from U.FL to 4mm banana that I use for DC level measurements with a 5 3/4 digit multimeter.

    blinken,
    @blinken@hachyderm.io avatar

    deleted_by_author

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @blinken It was actually pretty cheap compared to the D1330 and D1605 differential probes, lol.

    And an open hardware power rail probe based on that architecture has been on my todo list for a while, but I'm too busy designing four other probes to add more to the list until at least one is finalized and ready for production.

    blinken,
    @blinken@hachyderm.io avatar

    deleted_by_author

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @blinken Hey, sorry I didn't reply earlier. This is still on my "to look at" list but I haven't had a chance to give it a proper read.

    biggestsonicfan,
    @biggestsonicfan@digipres.club avatar

    @azonenberg The amount of polyimide tape used here speaks to me on a don't-you-dare-move-during-testing level.

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @biggestsonicfan I do not need something to slip or get snagged and yeet a >$2K prototype I spent three days assembling across the lab and onto a concrete floor.

    Tape is cheap.

    Also, once I start doing SI verification I'm going to have multiple very expensive differential probes soldered to it.

    biggestsonicfan,
    @biggestsonicfan@digipres.club avatar

    @azonenberg 100% with you there. Godspeed, can't wait to see this project continue to progress! 😤

    biggestsonicfan,
    @biggestsonicfan@digipres.club avatar

    @azonenberg Watching the transformation from bare PCB to this within 24 hours has left me in awe, incredible work! Hope all goes well and the magic smoke stays where it's supposed to!

    Oakdevtech,

    @azonenberg I love watching the progress and build up of your designs. Hoping this works with no smoke :)

    dlharmon,
    @dlharmon@chaos.social avatar

    @azonenberg Pin in paste on those headers?

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @dlharmon Yep. It usually works pretty well although I occasionally need to add a tiny bit extra with an iron.

    Been thinking about adding solder preforms to them just to make sure, but haven't found a source for them in less than full reel volume (expensive and takes up more shelf space than I'd like).

    arclight,

    @azonenberg That looks like a ton of length-matched traces

    azonenberg,
    @azonenberg@ioc.exchange avatar

    @arclight Yep it's a dual port 36 bit SRAM. So 72 data lines plus address, control, clock, etc.

    And 14 channels of Ethernet (56 diff pairs), two of SGMII (six diff pairs), three of QSGMII (another six diff pairs), XFI (two pairs). Lots of fast stuff here.

    arclight,

    @azonenberg Board design is way above my head but I've watched enough repair & maker videos to start noticing details.
