What are market data gaps and how to deal with them

In the world of high-performance trading systems, accurate and timely market data is paramount. Yet even in the most robust infrastructures, market data gaps—missing or out-of-sequence messages—are an unavoidable reality. Understanding how they occur, how to detect them, and how to recover is critical for ensuring data integrity and making reliable trading decisions.

Market Data Distribution: Multicast with A/B Redundancy

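Most modern exchanges distribute market data using UDP multicast, which delivers real-time data to many subscribers with minimal overhead. The trade-off is that UDP offers no delivery or ordering guarantees: a lost packet is simply gone unless the receiver notices and recovers it.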

Each market data feed is typically delivered via two redundant multicast channels, known as the A and B feeds. These feeds carry identical data but are sent on different multicast groups and usually on different routes. Their purpose is simple: redundancy. If one side misses a packet due to network congestion or a hardware issue, the other side can fill the gap.
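As a minimal sketch of what subscribing to both sides can look like, the Python snippet below opens one UDP socket per feed and joins its multicast group. The group addresses, port, and helper names (FEED_A, FEED_B, make_mreq, open_feed) are hypothetical placeholders; real values come from the exchange's connectivity documentation.

```python
import socket

# Hypothetical group addresses and port, for illustration only.
FEED_A = ("239.1.1.1", 30001)
FEED_B = ("239.1.1.2", 30001)

def make_mreq(group: str, iface_ip: str) -> bytes:
    """Build the 8-byte ip_mreq value: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)

def open_feed(group: str, port: int, iface_ip: str) -> socket.socket:
    """Open a UDP socket bound to the feed's port and join its multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_mreq(group, iface_ip))
    return sock
```

Calling open_feed once for FEED_A and once for FEED_B, ideally with different local interface addresses, yields the two independent packet streams that the arbitration logic later in this article merges.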

Why Do Gaps Occur?

Despite best efforts, packet loss happens. Gaps typically originate in one of three places:

1. NIC buffer overflow

2. Switch and network congestion or packet loss

3. Software gaps

How to Detect Gaps

1. Gaps on your NIC (Network Interface)

The most common place where gaps occur, and the first place you should check, is your NIC. Every NIC has a limited number of RX buffers; if your software can't consume them fast enough, they overflow and the NIC drops packets.

A. On standard Linux interfaces:

You can use the netstat -i command to display NIC statistics.

Look for the RX-ERR, RX-DRP, and RX-OVR counters. If these increase while your market data feed is live, packets are being dropped before they even reach your application.
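For continuous monitoring it can be easier to read the same counters programmatically. The sketch below samples the per-interface statistics files that Linux exposes under /sys/class/net/<interface>/statistics/ and reports which ones grew between two samples. The function names and the choice of counters are illustrative assumptions, and the root parameter exists only so the code can be pointed at a test directory.

```python
from pathlib import Path

# Receive-side counters exposed by the Linux kernel in sysfs.
RX_COUNTERS = ("rx_errors", "rx_dropped", "rx_fifo_errors", "rx_missed_errors")

def read_rx_counters(iface: str, root: str = "/sys/class/net") -> dict:
    """Return the current value of each RX counter that exists for iface."""
    stats_dir = Path(root) / iface / "statistics"
    counters = {}
    for name in RX_COUNTERS:
        path = stats_dir / name
        if path.exists():
            counters[name] = int(path.read_text().strip())
    return counters

def rx_deltas(before: dict, after: dict) -> dict:
    """Return only the counters that increased between two samples."""
    return {name: after[name] - before.get(name, 0)
            for name in after
            if after[name] > before.get(name, 0)}
```

Sampling once before the session and then periodically while the feed is live gives a simple alert condition: any non-empty result from rx_deltas means packets were lost on the host.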

B. Solarflare NICs (Onload / EFVI stack)

When using Solarflare EFVI or Onload for kernel bypass, packet drops won't appear in netstat. Instead, check:

  • evq->rx_drops counters (exposed via EFVI)
  • ethtool -S <interface> for interface-specific stats
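The ethtool -S output is a list of "name: value" lines, so it is straightforward to scan for non-zero drop counters. The sketch below is one way to do that; counter names differ between drivers and firmware versions, so it matches on the substring "drop" rather than on specific names, and the helper names are hypothetical.

```python
import subprocess

def parse_ethtool_stats(output: str) -> dict:
    """Parse `ethtool -S` output ("name: value" lines) into a dict of ints."""
    stats = {}
    for line in output.splitlines():
        name, sep, value = line.strip().rpartition(":")
        # Skip the header line and anything without a numeric value.
        if sep and value.strip().lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats

def drop_counters(stats: dict) -> dict:
    """Keep non-zero counters whose name suggests dropped packets."""
    return {k: v for k, v in stats.items() if "drop" in k.lower() and v > 0}

def ethtool_stats(iface: str) -> dict:
    """Run `ethtool -S <iface>` and parse the result (ethtool must be installed)."""
    result = subprocess.run(["ethtool", "-S", iface],
                            capture_output=True, text=True, check=True)
    return parse_ethtool_stats(result.stdout)
```

A monitoring job could call drop_counters(ethtool_stats("eth0")) on a schedule, where "eth0" stands in for the market data interface, and alert when the result changes.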

2. Switch-Level or Network Drops

Switches generally don't expose drop statistics to receivers directly, but once NIC drops have been ruled out you can infer network problems by watching for:

  • identical sequence number gaps on both the A and B feeds at the same time, which suggests the drop happened before the NIC, likely at a switch or further upstream
  • identical sequence number gaps on two separate machines connected to the same switch
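The inference described above can be sketched as a small heuristic over the sets of missing sequence numbers each observer recorded. Checking the NIC counters first is an assumption added here, since a NIC overflow during a burst can also drop the same packets from both feeds. The function names and return labels are illustrative only.

```python
def shared_gaps(*gap_sets) -> set:
    """Sequence numbers that are missing for every one of the given observers."""
    sets = [set(g) for g in gap_sets]
    return set.intersection(*sets) if sets else set()

def likely_drop_location(gaps_a: set, gaps_b: set, nic_drops: bool) -> str:
    """Rough guess at where a loss occurred; a hint for investigation, not proof."""
    if nic_drops:
        return "host NIC"
    if shared_gaps(gaps_a, gaps_b):
        return "switch or upstream"
    if gaps_a or gaps_b:
        return "one feed path only"
    return "no loss"
```

The same shared_gaps call works for the two-machine check: pass the gap sets recorded on each host, and a non-empty result points at something the hosts have in common, such as the switch.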

3. Software Gaps

Software gaps occur inside the application that consumes and parses the raw data. If the NIC and network checks above find no drops, the loss is most likely happening in your software.

The most common software issues are:

  1. Queue overrun: the raw packet was read from the NIC into an application queue, but the queue overflowed before the packet was processed.
  2. Wrongful discard: the packet was dropped by mistake, for example filtered out by application-level logic.
  3. Parsing error: there was no gap on the wire, but your feed handler failed to parse the packet correctly.
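Software losses are much easier to diagnose when every drop point counts what it discards. As a minimal sketch, the bounded queue below records each overrun instead of losing packets silently. The class and attribute names are hypothetical, and the code is single-threaded for clarity; a real feed handler would more likely use a lock-free ring buffer between the receive and parsing threads.

```python
from collections import deque

class CountingQueue:
    """Bounded FIFO between the receive path and the parser that counts overruns."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = deque()
        self.overruns = 0          # packets lost inside the application

    def push(self, packet) -> bool:
        """Enqueue a packet; return False and count it if the queue is full."""
        if len(self.items) >= self.capacity:
            self.overruns += 1
            return False
        self.items.append(packet)
        return True

    def pop(self):
        """Return the oldest packet, or None if the queue is empty."""
        return self.items.popleft() if self.items else None
```

The same idea extends to the other two failure modes: a counter next to each filter and each parse failure shows which stage removed the packet.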

Sequence Numbers and Arbitration

Each multicast message from the exchange carries a sequence number. This is your first line of defense. A typical handling loop checks:

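if (seqNo > lastSeqNo + 1) {
    // Gap detected: lastSeqNo + 1 through seqNo - 1 are missing
} else if (seqNo <= lastSeqNo) {
    // Duplicate (already seen, e.g. on the other feed): discard
}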

When a gap is detected, your recovery logic should arbitrate between A and B:

  1. Cache incoming packets from either side
  2. Wait a bounded time (a short timeout, or until the cache reaches a certain size) for the missing packet to arrive from the other feed
  3. Use the first valid continuation of the expected sequence number
  4. Iterate over the cached packets and process each one as it becomes the next expected sequence number
  5. Resume normal processing once the gap is filled

If the missing message is not recovered within the timeout window, the gap must be escalated.
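The arbitration steps above can be sketched as a small class that accepts packets from either feed and releases them strictly in order. This is an illustrative sketch, not a production design: the names (Arbiter, on_packet, gap_to_escalate), the 5 ms timeout, and the cache limit are all assumptions, and it treats one packet as one sequence number.

```python
class Arbiter:
    """Merge the A and B feeds into one in-order stream, caching early packets."""

    def __init__(self, first_seq: int = 1, timeout: float = 0.005,
                 max_cached: int = 1024):
        self.expected = first_seq
        self.timeout = timeout        # seconds to wait for the other feed
        self.max_cached = max_cached  # escalate early if the cache grows too large
        self.cache = {}               # seq -> payload, only for seq > expected
        self.gap_since = None         # time the current gap was first noticed

    def on_packet(self, seq: int, payload, now: float) -> list:
        """Feed one packet from either side; return payloads deliverable in order."""
        if seq < self.expected or seq in self.cache:
            return []                         # duplicate from the other feed
        if seq > self.expected:
            self.cache[seq] = payload         # early: hold until the gap fills
            if self.gap_since is None:
                self.gap_since = now
            return []
        out = [payload]
        self.expected += 1
        while self.expected in self.cache:    # drain cached continuations
            out.append(self.cache.pop(self.expected))
            self.expected += 1
        # Anything still cached sits behind a further gap; restart the timer.
        self.gap_since = now if self.cache else None
        return out

    def gap_to_escalate(self, now: float):
        """Return (first_missing, last_missing) once the gap exceeds its limits."""
        if self.gap_since is None:
            return None
        timed_out = now - self.gap_since >= self.timeout
        if not timed_out and len(self.cache) < self.max_cached:
            return None
        return (self.expected, min(self.cache) - 1)
```

Both receive loops call on_packet with whatever arrives; whichever feed delivers the expected sequence number first wins and the later copy is discarded as a duplicate. A periodic call to gap_to_escalate tells the caller when to move on to snapshot or retransmission recovery.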

When Both A and B Drop the Same Packet

If both A and B feeds missed the same sequence number, you have two main options depending on the exchange:

1. Clear Book and Request a Snapshot

If your platform is not in a recoverable state, clear your internal order book and:

  1. Request or subscribe to a snapshot (if the exchange supports one)
  2. Rebuild state from that snapshot
  3. Resume processing from the next valid sequence number

For example, CME MDP allows snapshot recovery via its dedicated snapshot multicast line.
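The snapshot procedure can be sketched as follows, assuming the snapshot states the last incremental sequence number it already reflects and that incremental updates received while waiting were buffered. The toy Book class, the (seq, price, qty) message shape, and the function name are hypothetical; real snapshot formats are exchange-specific.

```python
class Book:
    """Toy one-sided order book: price -> quantity."""

    def __init__(self):
        self.levels = {}
        self.last_seq = 0

    def apply(self, seq: int, price: int, qty: int) -> None:
        """Apply one incremental update; a quantity of 0 removes the level."""
        if qty == 0:
            self.levels.pop(price, None)
        else:
            self.levels[price] = qty
        self.last_seq = seq

def recover_from_snapshot(book, snapshot_levels, snapshot_seq, buffered) -> bool:
    """Rebuild the book from a snapshot, then replay newer buffered updates.

    snapshot_seq is the last incremental sequence number the snapshot reflects;
    buffered holds (seq, price, qty) updates received while waiting.
    Returns False if the buffered updates themselves contain a gap.
    """
    book.levels = dict(snapshot_levels)       # clear and rebuild
    book.last_seq = snapshot_seq
    for seq, price, qty in sorted(buffered):
        if seq <= book.last_seq:
            continue                          # already included in the snapshot
        if seq != book.last_seq + 1:
            return False                      # new gap: wait for another snapshot
        book.apply(seq, price, qty)
    return True
```

Buffering the live feed while the snapshot arrives is what makes the resume step safe: updates older than the snapshot are skipped, and the first one applied is exactly the next sequence number.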

2. Request Retransmission

Some exchanges offer retransmission channels:

  1. Send a retransmission request specifying the missing sequence number(s)
  2. Reinsert the retransmitted packets into your processing queue
  3. Resume only after the missing data is recovered and the feed is gap-free.

Retransmission logic must handle:

  • Duplicates (in case the message arrives on both the feed and retransmission)
  • Delays (so you may need to pause updates temporarily)
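A minimal sketch of the bookkeeping around a retransmission request is shown below: one helper works out which sequence ranges to ask for, and another merges the retransmitted packets with those cached from the live feed while dropping duplicates. The request itself is exchange-specific and is not modeled; all names here are hypothetical.

```python
def missing_ranges(expected: int, cached_seqs) -> list:
    """Collapse the holes between expected and the highest cached seq into ranges."""
    ranges, start = [], expected
    for seq in sorted(s for s in cached_seqs if s >= expected):
        if seq > start:
            ranges.append((start, seq - 1))   # inclusive (first, last) to request
        start = seq + 1
    return ranges

def merge_retransmitted(expected: int, cached: dict, retransmitted: dict):
    """Combine live and retransmitted packets into an in-order run.

    Returns (payloads in order, next expected seq, packets still waiting).
    A sequence number present in both sources is delivered once.
    """
    pending = {**retransmitted, **cached}     # live copy wins on duplicates
    out = []
    while expected in pending:
        out.append(pending.pop(expected))
        expected += 1
    return out, expected, pending
```

While the request is outstanding, live packets keep going into the cache rather than the book, which is the pause the second bullet describes.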

Summary: Best Practices for Gap Handling

  • Always compare A and B sequence numbers
  • Track last sequence number per channel
  • Process a message only when seqNo == lastSeqNo + 1; hold later messages until the gap is filled
  • Cache out-of-order packets briefly to allow recovery
  • Log all gaps for diagnostics
  • Monitor packet drops at NIC and switch
  • Have a fallback plan for unrecovered gaps: snapshot recovery or retransmission requests

Robust gap handling separates a resilient trading system from a fragile one. By carefully tracking sequence numbers, maintaining failover logic, and integrating with exchange-provided recovery channels, you can protect the integrity of your order book even in the face of network turbulence.