Low-latency ways to twiddle pins using computers

I’ve been looking for high-bandwidth, low-latency ways to control GPIOs - 100 MHz parallel buses, for example - directly with an x86 computer.

Why would you want to do this? There are a few reasons. Maybe you want to soak in a lot of data from a detector - the Red Pitaya, much loved by nuclear detector enthusiasts for this purpose, sidesteps this issue by having an FPGA and an ARM core share the same silicon fabric on an SoC.

Or consider LinuxCNC/EMC2, where Linux with the PREEMPT_RT patch acts as a full 3+ axis motion controller. The parallel port is used to directly generate servo step/direction pulses. It works staggeringly well. There’s a real convenience to this that’s hard to emulate if you add a separate motion control system - at work, dealing with myriad vendors’ interfaces to motion controllers seems to cause a lot of headaches. Sure, you can often use a Teensy or an Arduino or your favorite USB GPIO system to offload realtime stuff, but you sacrifice much in software development and elegance. You have to worry about the speed, robustness, and latency of the interface, and you need two sets of build toolchains for different targets.
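To make the step/direction idea concrete, here is a minimal sketch of what "twiddling" the parallel port data register might look like from userspace. It is not how LinuxCNC does it (LinuxCNC runs this kind of loop in a realtime thread); it only illustrates the bit pattern. The pin assignments (D0 = STEP, D1 = DIR), the 0x378 base address, and the function names are all assumptions for illustration. Writing through /dev/port is Linux-only and needs root, and time.sleep gives nothing close to realtime timing.

```python
import time

# Hypothetical wiring: which data-register bits drive which signal.
STEP_BIT = 0x01        # D0 -> STEP (assumed)
DIR_BIT = 0x02         # D1 -> DIR (assumed)
LPT_DATA_PORT = 0x378  # traditional LPT1 base address (assumed; check your machine)


def step_dir_bytes(steps):
    """Return the sequence of data-register bytes for `steps` pulses.

    A negative count sets the DIR bit; each step is a high byte then a low byte.
    """
    direction = DIR_BIT if steps < 0 else 0
    pattern = [direction]  # settle DIR before the first STEP edge
    for _ in range(abs(steps)):
        pattern.append(direction | STEP_BIT)  # rising edge on STEP
        pattern.append(direction)             # falling edge on STEP
    return pattern


def write_pattern(pattern, half_period_s=50e-6, port=LPT_DATA_PORT, dev="/dev/port"):
    """Write each byte to the I/O port through /dev/port (Linux, root only)."""
    with open(dev, "r+b", buffering=0) as f:
        for value in pattern:
            f.seek(port)             # the file offset selects the I/O port address
            f.write(bytes([value]))
            time.sleep(half_period_s)  # coarse delay, not realtime


if __name__ == "__main__":
    # Only prints the pattern; call write_pattern() yourself on real hardware.
    print(step_dir_bytes(3))
```

Keeping the pattern generation separate from the port write means the bit logic can be checked on any machine, with the hardware access isolated in one small function.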

It seems like such a waste to have this incredibly versatile processor, with 2+ GHz clocks, compatible with all your existing toolchains - and have no way to harness that speed in hardware.

Some modern processors do still have GPIO, but performance is not superb. (Some companies like Versalogic sell M.2 GPIO expander cards, but these also don’t have good performance.)

The Raspberry Pi is pretty good at this. The parallel Broadcom SMI interface is very fast, as you can see in these excellent posts. Unfortunately, I haven’t generally found Pis to be suitable for primary devices; memory card longevity being only one of the many reasons. Hence my interest in adding performant GPIO to conventional x86 motherboards.

It’s interesting that, though transfer speeds have increased, communication with the processor has perhaps become ‘de-democratized’ into a walled garden. Whereas one used to be able to operate a turtle robot or program a BASIC Stamp directly from the parallel port, now only a few companies produce the silicon middleware required to interface with the processors of the day; and, for FPGA interfacing, the IP required to connect to modern buses is costly. PCI and ISA are simple, wide parallel buses; by contrast, despite there being great projects offering open-source PCIe cores, creating a PCIe module still seems to be a heavyweight task. The same goes for USB-C or any other modern I/O.

Of course, I can’t complain about where progress has taken us - fiddling with hardware is now more accessible than ever. Arduinos, cheap AVR programmers, $10 logic analyzers - there’s a much lower barrier to entry for nearly all hardware projects. This just seems to be one area where we lost out.

If you’re not afraid of using vendor IP, Joelw has compiled a fantastic list of FPGA boards, including those with PCIe interfaces. The M.2 LiteFury and NiteFury look great, especially with the LVDS pairs; but there aren’t enough GPIO pins to make a parallel bus, and I’m not aware of an open-source toolchain that targets these yet.

Mesa cards have a really interesting reconfigurable FPGA - http://wiki.linuxcnc.org/cgi-bin/wiki.pl?HostMot2.


The IDE port is something of a holdout: a 16-bit wide, half-duplex bus with an 83 MHz maximum clock and a strikingly simple handshake. The PIO mode is practically designed for GPIO. Whereas parallel SCSI died early, commodity PCIe-to-IDE bridge hardware is still manufactured and relatively inexpensive.

The standards 1

There are projects that use the IDE bus for fast directly-programmed GPIO.



Original PCI


Dragon board

The FT600/ FT601

The FT242