A Guide to List-Mode Data Streams in Pixie-16 Hardware
Introduction
This post is the first in a series that answers customer questions we see about Pixie-16 data pipelines. It describes general aspects of data flow in the Pixie-16 hardware.
Data streams in the Pixie-16 Module
Data movement within the Pixie-16 hardware begins at the analog front end. Incoming signals pass through analog conditioning electronics and are then digitized by the ADCs. The ADC data flows to a Signal Processing FPGA (FIPPI), which analyzes the signals in real time. The FIPPI calculates each signal's energy, arrival time, and QDC sums; these values make up the record's header. The FIPPI can also store a copy of the incoming data stream, called the trace. Combining the header and the trace produces a complete record.
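To make the record structure concrete, here is a minimal sketch in C of how a decoded record might be represented in host memory. The field names and layout are hypothetical and purely illustrative; the actual Pixie-16 list-mode format packs the header into 32-bit words as described in the user manual.

```c
#include <stdint.h>
#include <stddef.h>

/* A simplified, illustrative view of a decoded list-mode record.
 * The real Pixie-16 format packs these fields into 32-bit header
 * words; the names here are hypothetical, for illustration only. */
typedef struct {
    uint16_t channel;   /* channel number within the module */
    uint16_t slot;      /* module slot ID */
    uint64_t timestamp; /* arrival time computed by the FIPPI */
    uint16_t energy;    /* filtered energy computed by the FIPPI */
    uint32_t qdc[8];    /* QDC sums computed by the FIPPI */
} record_header_t;

typedef struct {
    record_header_t header;   /* the computed quantities above */
    uint16_t *trace;          /* optional copy of the ADC samples */
    size_t trace_length;      /* number of ADC samples in the trace */
} list_mode_record_t;
```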
During a list-mode data run, the DSP moves complete records to the System FPGA's External FIFO over the DSP bus. The DSP bus has a maximum throughput of approximately 50 MB/s, and all channels in the module share this bandwidth. A 16-channel module therefore has a maximum of roughly 3 MB/s per channel. At the maximum DSP data rate, you will fill the External FIFO every 10 milliseconds.
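The arithmetic behind those figures is worth spelling out. This small C program reproduces the per-channel bandwidth and the FIFO fill time from the numbers above (the 0.5 MB FIFO capacity is discussed below).

```c
#include <stdio.h>

/* Back-of-the-envelope numbers from the text: the DSP bus moves
 * ~50 MB/s shared by all 16 channels of a module, and the
 * External FIFO holds 0.5 MB. */
int main(void) {
    const double dsp_bus_mb_per_s = 50.0; /* shared DSP bus throughput */
    const int channels = 16;              /* channels per module */
    const double fifo_mb = 0.5;           /* External FIFO capacity */

    double per_channel = dsp_bus_mb_per_s / channels;       /* ~3.1 MB/s */
    double fill_time_ms = fifo_mb / dsp_bus_mb_per_s * 1e3; /* ~10 ms */

    printf("Per-channel bandwidth: %.1f MB/s\n", per_channel);
    printf("FIFO fill time at max rate: %.0f ms\n", fill_time_ms);
    return 0;
}
```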
To prevent data loss, ensure that the total data rate across all channels in a module does not exceed ~50 MB/s.
The External FIFO holds a maximum of 0.5 MB of data. The System FPGA monitors the amount of data in the External FIFO in real time, and the DSP checks the space available against the record length (header plus trace). When the DSP detects that a full record will not fit in the External FIFO, it turns on the front panel's error light. Records then back up in the FIPPIs, which may cause data loss.
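In practice, the host prevents this by polling the FIFO and reading it out before it fills. Below is a hedged sketch using the legacy Pixie16 API calls Pixie16CheckExternalFIFOStatus and Pixie16ReadDataFromExternalFIFO; check your SDK release for the exact names and signatures, and note that production code needs more thorough error handling.

```c
#include <stdio.h>
#include <stdlib.h>
/* Legacy Pixie16 API header; function names and signatures may
 * differ between SDK versions, so treat this as a sketch. */
#include "pixie16app_export.h"

/* Drain the External FIFO of one module often enough that the
 * DSP never finds it full. Buffer management and error handling
 * are reduced to the bare minimum here. */
int drain_external_fifo(unsigned short mod_num) {
    unsigned int num_words = 0;

    /* Ask the System FPGA how many 32-bit words are waiting. */
    if (Pixie16CheckExternalFIFOStatus(&num_words, mod_num) < 0)
        return -1;

    if (num_words == 0)
        return 0; /* nothing to read yet */

    unsigned int *buffer = malloc(num_words * sizeof(unsigned int));
    if (!buffer)
        return -1;

    /* Move the words across the PCI interface into host memory. */
    int rc = Pixie16ReadDataFromExternalFIFO(buffer, num_words, mod_num);

    /* ... parse records / write to disk here ... */

    free(buffer);
    return rc;
}
```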
Data movement in the PCI Framework
Users read data out of the External FIFO via the PCI interface on the modules. The most common Pixie-16 crate is the 14-slot Wiener PXI model. In 2020, we replaced these crates with a 14-slot Wiener PXIe model. All 14-slot crates contain two PCI buses, which we'll call Bus A (Slots 2-7) and Bus B (Slots 8-14). The relationship between these buses and the crate controller introduces bottlenecks in the system.
The PXI crates daisy-chain the two PCI buses through a bridge: data on Bus B always passes through Bus A on its way out of the crate. When you read from a single module at a time, this doesn't matter much, since that module gets 100% of the available bandwidth. Parallel reads have a different profile: a module with an excessive rate on Bus A may leave little bandwidth for the modules on Bus B.
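A toy model makes the effect visible. The 100 MB/s controller link below is an assumed round number for illustration, not a measured PXI specification.

```c
#include <stdio.h>

/* Toy model of the bridged PXI topology: everything read from
 * Bus B must also cross Bus A, so a busy module on Bus A eats
 * into the bandwidth left over for Bus B. The link figure is
 * an assumption for illustration only. */
int main(void) {
    const double link_mb_per_s = 100.0; /* assumed controller link */
    const double bus_a_demand = 80.0;   /* busy module on Bus A */
    const int bus_b_modules = 6;        /* modules behind the bridge */

    double left_for_bus_b = link_mb_per_s - bus_a_demand;
    double per_bus_b_module = left_for_bus_b / bus_b_modules;

    printf("Bandwidth left for Bus B: %.0f MB/s\n", left_for_bus_b);
    printf("Per Bus B module: %.1f MB/s\n", per_bus_b_module);
    return 0;
}
```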
PXIe crates improve on the PCI design by letting the buses operate in parallel. Each bus has a dedicated bridge to the crate controller, so you can read data from Bus A and Bus B at the same time, at roughly 100 MB/s per bus. This effectively doubles the throughput of the crate for parallel operations. Single-module reads are still capped at an average of 50 MB/s by the bottleneck at the DSP bus.
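One natural way to exploit the parallel buses is to dedicate a reader thread to each one. The sketch below uses POSIX threads and a hypothetical slot-to-module mapping; drain_external_fifo() stands in for the FIFO readout sketch shown earlier.

```c
#include <pthread.h>
#include <stdio.h>

/* Stand-in for the FIFO-drain sketch shown earlier. */
static int drain_external_fifo(unsigned short mod_num) {
    printf("reading module %u\n", mod_num);
    return 0;
}

typedef struct {
    const unsigned short *modules; /* module numbers on this bus */
    int count;
} bus_args_t;

/* Each thread sweeps the modules on its own bus. On a PXIe
 * crate, the two sweeps proceed in parallel. */
static void *read_bus(void *arg) {
    bus_args_t *bus = arg;
    for (int i = 0; i < bus->count; i++)
        drain_external_fifo(bus->modules[i]);
    return NULL;
}

int main(void) {
    /* Hypothetical numbering: modules 0-5 on Bus A (Slots 2-7),
     * modules 6-12 on Bus B (Slots 8-14). */
    unsigned short bus_a[] = {0, 1, 2, 3, 4, 5};
    unsigned short bus_b[] = {6, 7, 8, 9, 10, 11, 12};
    bus_args_t a = {bus_a, 6}, b = {bus_b, 7};

    pthread_t ta, tb;
    pthread_create(&ta, NULL, read_bus, &a);
    pthread_create(&tb, NULL, read_bus, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}
```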
What does this mean for me?
With a single module or a sequential readout framework, you may never use the maximum PCI throughput, because the DSP bus cannot transfer data fast enough to reach it. In the next post, we'll see how sequential readout methods increase the likelihood of data loss, and how to handle such a situation.