Network Taps are a crucial part of the toolkit used by engineers responsible for network security, performance, and capacity management. Taps are robust devices that rarely exhibit issues, but as with any network equipment, it may occasionally be necessary to troubleshoot them. In some cases this is part of a multi-layered process of elimination, but in certain circumstances the process might begin with the Tap itself. The following procedures are presented with Datacom Systems fiber Taps* in mind, but the general principles are applicable to all brands of Taps.
We will begin with a general overview of Fiber Tap design, and then review the manner in which they are used, which will provide the context required for troubleshooting.
*Datacom Systems fiber SINGLEstream Taps utilize internal fiber Taps. As such, the troubleshooting procedure will have some commonalities, but the chipsets used in SINGLEstream Taps add a layer of troubleshooting complexity, which will be addressed in a future article.
How Do Fiber Taps Work?
Fiber Taps are Layer 1 devices. Each Tap has two “Network Port” pairs, which connect to the optical splitter assembly. (Multi-link fiber Taps have multiple assemblies, each of which operates independently of the others, so these troubleshooting steps can be used for any single channel of a multi-link Tap.) The port pairs are typically LC connector bulkheads, which feed the light through internal splitter assemblies. There is one splitter assembly for each side of the fiber pair that comprises the tapped link. The data copies are borrowed by splitting the light that passes through the Tap assemblies. A standard split is 50/50, with 50% of the light passing from one endpoint directly through the splitter assembly to the other endpoint. The remaining 50% of the light travels out the “tap leg” to the corresponding Monitor port. (A 70/30 split is also available, which keeps 70% of the light on the tapped link and sends 30% to the Monitor port.) There are two Monitor ports, typically labeled A and B, with data copies of the traffic from one end of the tapped link being sent out one Monitor port, and data copies of traffic from the other end of the link sent out the other Monitor port.
The process of borrowing a portion of the light for the Monitor ports introduces “insertion loss,” whereby the strength of the signal passing on the link is reduced. In some instances this insertion loss characteristic will be an important metric in troubleshooting.
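The relationship between split ratio and loss can be sketched with a little arithmetic. The snippet below is an illustration only: it computes the ideal loss implied by the split ratio alone, and the function name `ideal_split_loss_db` is hypothetical. Real Taps add some excess loss from connectors and the coupler itself, so the figures on a vendor datasheet will be higher than these.

```python
import math


def ideal_split_loss_db(fraction: float) -> float:
    """Ideal loss in dB for a path that receives `fraction` of the input light."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be greater than 0 and at most 1")
    return -10.0 * math.log10(fraction)


# Compare the two split ratios mentioned in the text (network % / monitor %).
for network_pct, monitor_pct in ((50, 50), (70, 30)):
    network_loss = ideal_split_loss_db(network_pct / 100)
    monitor_loss = ideal_split_loss_db(monitor_pct / 100)
    print(
        f"{network_pct}/{monitor_pct} split: "
        f"network path {network_loss:.2f} dB, monitor path {monitor_loss:.2f} dB"
    )
```

The 70/30 split costs the tapped link roughly half as many dB as the 50/50 split, at the price of a weaker signal on the Monitor port.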
Fiber Taps are truly passive devices and do not require power to split the light. The light traveling on each fiber is split by passive couplers. Therefore, the process is identical regardless of the speed of the link. Note: fiber type/grade and manufacturing tolerances dictate that certain Tap models are appropriate for speeds only up to 10G, whereas other models will be compatible with higher speeds, as high as 100G and 400G in some cases. In all cases, Tap models certified as supporting up to a certain speed will also support use in lower speed links. The exception to this is special multiplexed topologies such as SR-4, LR-4, etc. In those instances, specific Taps that support those topologies must be used.
Symptoms Indicating That Troubleshooting Is Required
- Monitor ports are not sending out data copies, but link is functioning normally
- Link has “bounced” for unknown reasons
- Link is down and not passing traffic
- High error rate detected by one or both endpoints of the tapped link
- Monitor ports are sending data but monitoring tool indicates that some traffic (packets) is missing/dropped
Steps to Troubleshooting
Always commence troubleshooting by validating the integrity of the patch cables used to connect the link endpoints to the Tap, and also check the cables used to connect the Monitor ports to the tool, or to the Network Packet Broker if one is being used for aggregation and replication of the traffic copies produced by the Tap(s). It is also crucial to verify that the core diameter of the fiber in the Tap matches the fiber type used in the tapped link. Taps are available for legacy 62.5 micron multimode links, 50 micron multimode links, and 9 micron single-mode links.
A digital light power meter is an invaluable part of troubleshooting fiber links. If contacting Datacom Systems or any other vendor for support, the light levels (in dBm) seen at the point where light enters the Network ports of the Tap may be requested, as well as the light levels measured from the Monitor ports. The SFP and SFP+ transceiver modules of most equipment may allow the user to query the modules and ascertain the approximate light levels being sent and received, but exact measurements at the Tap can be obtained only with a digital light power meter.
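As a rough illustration of how meter readings translate into a loss figure, the sketch below subtracts two readings taken in dBm. The readings and the names `measured_loss_db` and `fraction_delivered` are hypothetical and chosen only for the example; they are not values from any Datacom Systems datasheet.

```python
def measured_loss_db(power_in_dbm: float, power_out_dbm: float) -> float:
    """Loss in dB between two optical power readings taken in dBm."""
    return power_in_dbm - power_out_dbm


def fraction_delivered(loss_db: float) -> float:
    """Fraction of the input light that remains after a given loss in dB."""
    return 10 ** (-loss_db / 10)


# Hypothetical meter readings, in dBm, for one direction of a tapped link.
readings = {
    "into_network_port": -3.0,
    "out_of_far_network_port": -6.6,
    "out_of_monitor_port": -6.9,
}

through_loss = measured_loss_db(
    readings["into_network_port"], readings["out_of_far_network_port"]
)
monitor_loss = measured_loss_db(
    readings["into_network_port"], readings["out_of_monitor_port"]
)

print(f"through path: {through_loss:.1f} dB "
      f"({fraction_delivered(through_loss):.0%} of the light delivered)")
print(f"monitor path: {monitor_loss:.1f} dB "
      f"({fraction_delivered(monitor_loss):.0%} of the light delivered)")
```

Comparing measured losses like these against the loss figures published for the Tap model is one way to tell a dirty connector or damaged cable apart from normal splitter behavior.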
- Monitor ports are not sending out data copies, but link is functioning normally:
Fiber Taps are directional devices. If the TX side of each fiber pair coming from an endpoint device is not oriented properly, so that it is not connected to the RX side of the LC connector pair on the Tap, then the light will not be split correctly. This results in 50% of the light (or 70% in the case of a 70/30 Tap) remaining on the link, but only reflections being sent out the “tap leg” to the Monitor port. In such instances, a crossover cable must be used, or the user must unclip the end of each fiber cable connecting to the Tap and “roll over” the fibers so that TX from each endpoint goes into the RX port of the Tap. Note: on Datacom Systems fiber Taps, the RX side of the LC connector pair is always on the right-hand side of the pair, and the TX on the left-hand side. Every LC connector pair is clearly labeled with an upward-facing triangle below the RX side of the pair and a downward-facing triangle below the TX side. In the case of Monitor ports, both sides of the pair are TX, requiring that simplex cables or special “Y” cables be used to connect the Monitor ports to tools or Packet Brokers.
- Link has “bounced” for unknown reasons:
Check the port status and historical records of the two endpoint devices of the tapped link. If either port is currently “flapping,” or has been in the past, the link will intermittently stop passing traffic. This typically excludes the Tap as the cause.
- High error rate detected by one or both endpoints of the tapped link:
The transceiver modules used on the ports of the endpoint devices should be investigated first for possible intermittent failure. When in doubt, replace the suspect module with a comparable module that is known to be good.
Another possible cause is low light levels on the link itself. When a link that is relatively close to the allowable length for that topology is tapped, the insertion loss from a 50/50 Tap may cause the light reaching each endpoint to drop to or below the minimum level the receiver requires. In some cases this will manifest itself as errors on the link. The solution is to shorten the physical length of the link or deploy a 70/30 Tap instead. Note: for links that are between devices in the same rack, a 50/50 Tap is generally regarded as acceptable. When tapping links that traverse between racks in a data center or between floors in a multi-story building, there is greater risk of insertion loss affecting link integrity.
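The trade-off between link length and Tap split ratio can be sketched as a simple link budget. Everything numeric below is an assumption made for illustration (transmit power, receiver sensitivity, fiber attenuation, connector loss), and the Tap losses are the ideal values implied by the split ratio alone. The `LinkBudget` class is hypothetical; real planning should use the figures from the transceiver and Tap datasheets.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LinkBudget:
    tx_power_dbm: float          # transmitter launch power (assumed)
    rx_sensitivity_dbm: float    # minimum power the receiver needs (assumed)
    fiber_loss_db_per_km: float  # fiber attenuation (assumed)
    length_km: float             # end-to-end fiber length
    connector_loss_db: float     # total loss across all mated connectors (assumed)
    tap_loss_db: float           # loss added on the network path by the Tap

    def received_power_dbm(self) -> float:
        """Power arriving at the far-end receiver."""
        fiber_loss = self.fiber_loss_db_per_km * self.length_km
        return (self.tx_power_dbm - fiber_loss
                - self.connector_loss_db - self.tap_loss_db)

    def margin_db(self) -> float:
        """Positive margin means the receiver still gets enough light."""
        return self.received_power_dbm() - self.rx_sensitivity_dbm


# Illustrative link: 500 m of multimode fiber between racks, untapped.
untapped = LinkBudget(
    tx_power_dbm=-5.0,
    rx_sensitivity_dbm=-11.0,
    fiber_loss_db_per_km=3.0,
    length_km=0.5,
    connector_loss_db=2.0,
    tap_loss_db=0.0,
)
with_50_50 = replace(untapped, tap_loss_db=3.01)  # ideal 50/50 network-path loss
with_70_30 = replace(untapped, tap_loss_db=1.55)  # ideal 70/30 network-path loss

for label, budget in (("untapped", untapped),
                      ("50/50 Tap", with_50_50),
                      ("70/30 Tap", with_70_30)):
    status = "OK" if budget.margin_db() > 0 else "below receiver minimum"
    print(f"{label}: margin {budget.margin_db():+.2f} dB ({status})")
```

With these assumed numbers the 50/50 Tap pushes the link below the receiver minimum while the 70/30 Tap leaves a small positive margin, which mirrors the remedy described above.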
- Some traffic from tapped link is visible but some packets appear to be missing:
This symptom has multiple possible causes, none related to the Tap itself. If a Network Packet Broker or other device for aggregating data from multiple Taps is being used, then a likely culprit is oversubscription within the Packet Broker itself. Note: fiber Taps are non-aggregating devices. In other words, the Tap sends out separate copies of the A and B sides of the link, always allows the link to operate at up to 100% of its capacity, and sends 100% of the data copy traffic out the Monitor ports. There is zero risk of oversubscription or dropped packets being caused by the Tap itself.
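To make the oversubscription risk concrete, the sketch below adds up the traffic offered by several Monitor feeds and compares it with the capacity of a single aggregated output port. The function names and the utilization figures are hypothetical. The example relies on the point made above: a tapped full-duplex link produces two separate Monitor streams, each of which can run at up to the full line rate.

```python
def offered_load_gbps(feeds: list[tuple[float, float]]) -> float:
    """Sum the traffic offered by monitor feeds.

    Each feed is (line_rate_gbps, utilization), with utilization from 0.0 to 1.0.
    """
    return sum(rate * utilization for rate, utilization in feeds)


def is_oversubscribed(feeds: list[tuple[float, float]], egress_gbps: float) -> bool:
    """True when the combined feeds exceed what the output port can carry."""
    return offered_load_gbps(feeds) > egress_gbps


# Monitor A and Monitor B from one tapped 10G link, aggregated to one 10G tool port.
quiet_link = [(10.0, 0.3), (10.0, 0.4)]
busy_link = [(10.0, 0.6), (10.0, 0.7)]

for label, feeds in (("quiet link", quiet_link), ("busy link", busy_link)):
    load = offered_load_gbps(feeds)
    verdict = "oversubscribed" if is_oversubscribed(feeds, 10.0) else "fits"
    print(f"{label}: {load:.1f} Gbps offered to a 10 Gbps port ({verdict})")
```

Any traffic beyond the output port's capacity has to be dropped by the aggregating device, which the monitoring tool then reports as missing packets.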
An even more likely possibility is limitations with the monitoring/capture tool itself. It’s important to note that, even when equipped with robust internal components, a conventional PC/server with 1G or 10G capture interfaces is not capable of capturing high utilization levels of traffic for a sustained duration of time. Although turnkey monitoring tools supplied by network security and visibility vendors are typically more robust, even these tools have distinct limitations to their sustained throughput capability.
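A quick calculation shows why sustained capture is demanding. The sketch below converts a link rate and utilization into the write rate and storage a capture tool would need for one direction of traffic. The function names are hypothetical, and the arithmetic ignores capture-file overhead such as per-packet headers, so real requirements will be somewhat higher.

```python
def capture_rate_mb_per_s(link_gbps: float, utilization: float) -> float:
    """Sustained write rate in megabytes per second for one direction of a link."""
    bits_per_second = link_gbps * 1e9 * utilization
    return bits_per_second / 8 / 1e6


def capture_storage_tb(link_gbps: float, utilization: float, hours: float) -> float:
    """Storage in terabytes needed to hold `hours` of captured traffic."""
    bytes_per_second = link_gbps * 1e9 * utilization / 8
    return bytes_per_second * hours * 3600 / 1e12


for link_gbps in (1.0, 10.0):
    rate = capture_rate_mb_per_s(link_gbps, utilization=1.0)
    storage = capture_storage_tb(link_gbps, utilization=1.0, hours=1.0)
    print(f"{link_gbps:g}G at full utilization: "
          f"{rate:.0f} MB/s sustained, {storage:.2f} TB per hour")
```

A tool that captures both Monitor ports of a tapped link sees two such streams, so the combined figures can be up to double these.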
Independent consultant Chris Greer of Packet Pioneer (https://packetpioneer.com/) said this in a 2017 interview:
“The ubiquitous use of Wireshark on a laptop translates to a high degree of packet loss which can significantly decrease the accuracy of any analytics run on that data. In a 1 Gb data stream, I had 82 percent packet loss,” Greer says. “One gig went out and I was only able to capture 18 percent of the packets. As I turned down the volume, I had to go down to about 50 Mb per second on my laptop to capture all the packets. Fifty megabits per second, are you kidding me? We should never be using a laptop in a data center to capture packets. We’re going to drop packets and that’s going to affect our analysis. If we just take Wireshark and install it on a laptop, and put it on a link, we can’t capture forever,” Greer says. “We can ring buffer, we can get captures, to the limit that the laptop can capture. But today we need long-term capture because problems are intermittent. We need to be able to catch it in the act. I think that’s the most difficult thing, just being there, on the link, when the problem is occurring. How many times have you heard of a problem, someone complains of something, by the time you run out there to analyze it, the problem’s gone? For me, an organization has flown me up and they’ve had an issue. They fly me up and the problem disappears. I leave and the problem comes back.
Long-term, stream-to-disk packet capture is what we need.”*
*(Datacom Systems does not presently sell or recommend specific brands or models of long-term stream to disk data capture storage solutions, but it’s crucial for all users to recognize where the potential bottlenecks for data capture might appear in visibility solutions.)
In a future installment of this series, we will explore troubleshooting processes targeted at aggregation style Taps that utilize integrated fiber taps.