Klebanov is a technical solutions architect with Cisco Systems. He has 15 years of network industry experience. In recent years he has been closely involved with architecting data center solutions. He can be reached at [email protected] or on LinkedIn at http://www.linkedin.com/in/davidklebanov
Most organizations today use dedicated storage networks in the data center, but the concept of leveraging converged network infrastructure to provide organizational storage services is gaining steam.
In most cases NAS storage traffic already rides existing network infrastructure in a converged manner, so in this article we are going to focus on converging SAN traffic.
Traditional SANs define the concept of a fabric where initiators (servers) are connected to targets (storage arrays) through a purpose-built fiber infrastructure. That infrastructure consists of SAN switches that use the Fibre Channel protocol to encapsulate and forward SCSI traffic hop-by-hop and end-to-end. One way to apply the converged network concept is to encapsulate SCSI traffic in a protocol that can be transported over existing IP networks.
TCP/IP provides a very convenient over-the-top way of deploying a SAN environment in this manner. The SCSI protocol transported over TCP/IP is called iSCSI.
Using iSCSI has several main drawbacks:
• An iSCSI-capable storage array is required. An alternative is to use a storage switch that can terminate iSCSI TCP/IP connections, unwrap the SCSI portion and forward it in native format toward a Fibre Channel connected storage array. In this case the storage array is not required to support iSCSI.
• Over-the-top behavior of iSCSI prevents storage administrators from enforcing per-hop storage characteristics and controls, which is an inherent part of the vast majority of SAN deployments.
• The use of TCP makes iSCSI traffic susceptible to the TCP slow-start mechanism, because TCP cannot differentiate between data and storage traffic. iSCSI can leverage switching infrastructure that supports Data Center Bridging (more on that later), which can apply selective back-pressure, pausing iSCSI traffic and preventing it from being indiscriminately dropped during times of congestion.
Things like firewalls, IDS/IPS and application optimization can also influence the delivery of iSCSI traffic end-to-end. On the upside, iSCSI allows us to extend the reach of a storage network as far as our IP infrastructure goes, which may be well beyond the limits of a dedicated traditional Fibre Channel SAN environment.
Fibre Channel over Ethernet
Another method of delivering SCSI connectivity between initiators and targets in a converged network fashion is to encapsulate it in the Fibre Channel protocol and use Ethernet frames for transport. This technology is called Fibre Channel over Ethernet (FCoE). FCoE I/O and control traffic is differentiated from regular data traffic by means of the EtherType values in the Ethernet frame header.
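As a rough illustration of that differentiation, the sketch below classifies a raw Ethernet frame by its EtherType. The values 0x8906 (FCoE) and 0x8914 (FCoE control traffic) are the registered EtherTypes; the function names and the `build_frame` helper are hypothetical and exist only to make the example self-contained.

```python
import struct
from typing import Optional

ETHERTYPE_FCOE = 0x8906  # FCoE: encapsulated Fibre Channel frames (storage I/O)
ETHERTYPE_FIP = 0x8914   # FCoE control traffic
ETHERTYPE_VLAN = 0x8100  # 802.1Q VLAN tag


def classify_frame(frame: bytes) -> str:
    """Return 'fcoe', 'fip' or 'data' for a raw Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # The EtherType follows the two 6-byte MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_VLAN:
        # Skip the 4-byte 802.1Q tag to reach the real EtherType.
        if len(frame) < 18:
            raise ValueError("truncated 802.1Q tag")
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    if ethertype == ETHERTYPE_FCOE:
        return "fcoe"
    if ethertype == ETHERTYPE_FIP:
        return "fip"
    return "data"


def build_frame(ethertype: int, vlan: Optional[int] = None) -> bytes:
    """Hypothetical helper: build a minimal frame with zeroed MAC addresses."""
    header = bytes(12)  # destination MAC + source MAC
    if vlan is not None:
        header += struct.pack("!HH", ETHERTYPE_VLAN, vlan & 0x0FFF)
    return header + struct.pack("!H", ethertype) + b"payload"


if __name__ == "__main__":
    print(classify_frame(build_frame(ETHERTYPE_FCOE, vlan=100)))
    print(classify_frame(build_frame(0x0800)))  # IPv4 is ordinary data
```

A real switch makes this decision in hardware; the point here is only that the split between storage and data traffic happens at the Ethernet header, before any IP processing.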
One might ask, isn't that another form of over-the-top behavior we described with iSCSI? After all, iSCSI uses TCP/IP and FCoE uses Ethernet, so what's the difference?
The fundamental difference is that, unlike iSCSI, which appears to the underlying network as ordinary TCP/IP traffic, FCoE introduces the concept of a Fibre Channel Forwarder (FCF), a logical Fibre Channel entity within the Ethernet switch. The switch forwards all FCoE traffic to the FCF for processing. This allows per-hop Fibre Channel behavior and other controls to be enforced, rather than simply switching Ethernet-encapsulated storage I/O and control traffic between initiators and targets using a MAC address table for reachability.
This Ethernet network awareness of the storage traffic is a significant difference between over-the-top behavior of iSCSI and per-hop behavior of FCoE. It also gives birth to a concept of Unified Fabric, in which data and storage traffic share the same infrastructure, while keeping their behavioral characteristics separate.
With Unified Fabrics, however, you have to consider which Ethernet network topologies can be used to transport FCoE traffic. Not every topology is suitable because one of the fundamental principles of building resilient storage area networks is maintaining separation of Side A and Side B. FCoE traffic forwarding over a Unified Fabric network must comply with this behavior.
Distributed port channel technologies, such as vPC or MLAG, can break SideA/SideB separation and are generally excluded from the storage traffic path. Exceptions are the links between end nodes and first hop Unified Fabric switches. Here distributed port channels provide increased redundancy and bandwidth to the data traffic, while storage traffic still utilizes traditional multipathing.
Other Layer 2 technologies, such as VPLS, OTV and TRILL/FabricPath, have an any-to-any connectivity model, which makes it difficult to maintain side separation. They also cannot guarantee lossless delivery and as such are not good candidates for carrying FCoE traffic, at least not in their current form.
Living on the edge
Let us now see how Unified Fabric is used at the edge of our network.
Many modern data centers have adopted top of rack switching principles. For organizations deploying FCoE, top of rack switches have become part of the Unified Fabric network. Enforcing full Fibre Channel features on those edge switches introduces an administrative burden and does not scale well, especially since the number of top of rack Unified Fabric switches might be significantly larger than the number of edge switches in a traditional storage network.
Traditional Fibre Channel SANs addressed this by employing the N_Port Virtualization (NPV) feature on the edge storage switches. NPV allows the edge storage switch to appear as an N_Port type host port to the upstream SAN aggregation or core switches, while proxying Fabric Logins (FLOGIs) from the connected servers as Fabric Discovery (FDISC) messages. Storage switches operating in NPV mode require minimal SAN configuration and do not consume Domain ID resources, which are limited to 239 in any single fabric. For NPV to work, the upstream SAN switch must support the N_Port ID Virtualization (NPIV) feature.
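The NPV/NPIV relationship can be sketched as a toy model. Everything here is an assumption made for illustration: the class names, the WWPN strings and the simplified address assignment (Domain ID in the top byte, a running counter below it) are not taken from any vendor implementation. The sketch only shows that the edge switch logs in once as a host would, relays each server FLOGI upstream as an FDISC, and never owns a Domain ID.

```python
from dataclasses import dataclass, field

MAX_DOMAIN_IDS = 239  # per-fabric Domain ID limit cited in the text


@dataclass
class NpivCoreSwitch:
    """Upstream switch: owns a Domain ID and hands out addresses."""
    domain_id: int
    npiv_enabled: bool = True
    logins: dict = field(default_factory=dict)  # WWPN -> assigned address
    next_port: int = 0

    def __post_init__(self):
        if not 1 <= self.domain_id <= MAX_DOMAIN_IDS:
            raise ValueError("Domain ID must be in the range 1-239")

    def handle(self, message: str, wwpn: str) -> int:
        if message not in ("FLOGI", "FDISC"):
            raise ValueError("unsupported message: " + message)
        if message == "FDISC" and not self.npiv_enabled:
            raise RuntimeError("upstream switch must support NPIV")
        # Simplified address: core's Domain ID in the top byte.
        address = (self.domain_id << 16) | self.next_port
        self.next_port += 1
        self.logins[wwpn] = address
        return address


@dataclass
class NpvEdgeSwitch:
    """Edge switch in NPV mode: no Domain ID of its own."""
    uplink: NpivCoreSwitch
    wwpn: str
    sent: list = field(default_factory=list)  # messages sent upstream

    def __post_init__(self):
        # The edge switch itself logs in once, like a host N_Port.
        self.uplink.handle("FLOGI", self.wwpn)
        self.sent.append("FLOGI")

    def server_flogi(self, server_wwpn: str) -> int:
        # A server's FLOGI is relayed upstream as an FDISC.
        self.sent.append("FDISC")
        return self.uplink.handle("FDISC", server_wwpn)


if __name__ == "__main__":
    core = NpivCoreSwitch(domain_id=0x21)
    edge = NpvEdgeSwitch(uplink=core, wwpn="20:00:00:00:00:00:00:01")
    address = edge.server_flogi("10:00:00:00:00:00:00:aa")
    print(hex(address), edge.sent)
```

In this model every server address carries the core switch's Domain ID, which is why adding more NPV edge switches does not eat into the 239-domain budget.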
Unified Fabric mimics this behavior for two cases, single hop FCoE and multihop FCoE.
Single hop FCoE
In this topology, the first hop edge FCoE switch is connected through a traditional Fibre Channel interface to an upstream SAN, as well as through an Ethernet interface to an upstream IP network. When an FCoE-attached server sends traffic, the Fibre Channel encapsulated portion is forwarded to the FCF for processing based on the destination MAC address of the Ethernet frame (FCF-MAC) and the FCoE VLAN ID. Non-FCoE Ethernet traffic is forwarded toward the data network.
In essence, in a single hop FCoE topology, top of rack FCoE switches act as splitters sending storage traffic towards the SAN environment and data traffic toward the IP network. They can also operate in NPV mode, just like traditional edge storage switches.
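The splitter behavior can be sketched as a single forwarding decision. The FCF-MAC and FCoE VLAN values below are hypothetical placeholders, and the sketch covers only the two cases described above: FCoE frames addressed to the FCF, and everything else.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    dst_mac: str
    vlan_id: int


# Hypothetical values, chosen only for illustration.
FCF_MAC = "00:05:73:aa:bb:01"
FCOE_VLAN = 1002


def egress_for(frame: Frame, fcf_mac: str = FCF_MAC,
               fcoe_vlan: int = FCOE_VLAN) -> str:
    """Decide where a top of rack FCoE switch sends a frame."""
    # Storage traffic: on the FCoE VLAN and addressed to the FCF-MAC.
    if frame.vlan_id == fcoe_vlan and frame.dst_mac.lower() == fcf_mac:
        return "fcf"        # processed by the FCF, then out toward the SAN
    return "ip-uplink"      # ordinary data traffic toward the IP network


if __name__ == "__main__":
    print(egress_for(Frame("00:05:73:AA:BB:01", 1002)))
    print(egress_for(Frame("00:1b:21:00:00:10", 10)))
```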
This model works best as an initial step toward building a wider Unified Fabric network. It also preserves the investment in Fibre Channel technology in the SAN aggregation and core layers, as well as in Fibre Channel attached storage arrays. On the server side, 1Gb and 10Gb Network Interface Cards make way for 10Gb Converged Network Adapters.
Multihop FCoE
In this topology, the first hop edge FCoE switch is connected through a Unified Fabric Ethernet interface to an upstream FCoE switch, effectively creating a multihop FCoE topology. To realize the administrative and scale advantages of NPV-NPIV operation, FCoE employs similar methods, called FCoE-NPV and FCoE-NPIV respectively. The first hop FCoE switch operating in FCoE-NPV mode proxies server FLOGIs to an upstream FCoE switch, which acts as FCoE-NPIV.
FCoE switches operating in FCoE-NPV mode do not act as full Fibre Channel Forwarders and require minimal configuration effort. They also do not consume a Domain ID, which makes this a very elegant model for top of rack deployment in a multihop FCoE Unified Fabric environment.
The use of FCoE-NPV and FCoE-NPIV technologies allows building FCoE networks beyond the reach of a single hop. However, this is not the only method of doing so.
Another method is to employ FIP snooping technology, which allows edge switches to snoop FCoE control messages sent using the FCoE Initialization Protocol (FIP) and provide added value services, specifically securing the initiator-FCF relationship and protecting it from man-in-the-middle attacks. FCoE-NPV supersedes FIP snooping functionality and provides a more comprehensive set of services.
Now, what if you wanted to build an even longer reach multihop FCoE network? It is possible to cascade FCoE-NPV switches and, while it is also possible to cascade FIP snooping bridges, there are operational and functional implications which could raise concerns with storage network designers.
Cascading FCoE-NPV switches works, but it is not an advisable deployment model. In real world scenarios, long reach (more than two hops) multihop FCoE mimics the behavior of traditional Fibre Channel networks by employing Virtual E_Ports, or VE_Ports, to create Ethernet InterSwitch Links (ISLs).
VE_Ports can also be interconnected using a TCP/IP overlay, which is what Fibre Channel over IP (FCIP) does. However, there are no mainstream practical models of extending FCoE using FCIP, even though such deployments are technically possible.
FCIP allows extending ISLs as far as the reach of our IP network. However, such solutions, again, rely on masking storage traffic on intermediate nodes. As such, FCIP comes with similar disclaimers as iSCSI.
An alternative to using multihop FCoE at the edge of a Unified Fabric network is to use 802.1BR port extender technology.
The IEEE has almost ratified the standard and some pre-standard 802.1BR products are already shipping. It is a matter of debate whether the use of port extenders even falls into the multihop category or whether it should be considered a single hop design.
Enhancing Ethernet for Unified Fabric
Earlier in this article we touched on the detrimental impact that unfavorable network conditions can have on a Unified Fabric environment. Traditional Ethernet does not provide guaranteed or assured delivery, differentiated flow control mechanisms or adequate bandwidth allocation control, which raises concerns for supporting Fibre Channel traffic. To address these concerns, Ethernet had to be enhanced, resulting in a set of extensions called Data Center Bridging (DCB). DCB capable switches employ several IEEE standards defined by the 802.1 Data Center Bridging Task Group. The key standards are:
• Priority Flow Control, defined under 802.1Qbb, expands on 802.3x Flow Control by using PAUSE frames per 802.1p COS value, rather than per physical link. FCoE traffic is normally marked with a COS value of 3, so using PFC allows DCB switches to apply selective back-pressure and pause Ethernet-encapsulated storage frames before they are dropped due to a shortage of queuing buffers.
• Enhanced Transmission Selection defined under 802.1Qaz outlines the principles behind bandwidth allocation between different traffic classes.
• Data Center Bridging Exchange is based on LLDP protocol defined under 802.1AB. It is used for discovery, as well as capabilities and settings negotiation between neighboring switches or switches and hosts. DCBX is also important for configuration consistency across the DCB environment.
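To make the difference between link-level PAUSE and Priority Flow Control concrete, here is a minimal sketch that tracks pause state per COS value on one link. The `Link` class and its method names are hypothetical; the only facts taken from the text are that PFC pauses per 802.1p COS and that FCoE normally uses COS 3.

```python
FCOE_COS = 3  # COS value FCoE traffic is normally marked with


class Link:
    """Pause state of one Ethernet link, tracked per 802.1p COS value."""

    def __init__(self):
        self.paused = [False] * 8  # one flag per COS value 0-7

    def pfc_pause(self, cos: int) -> None:
        """PFC: pause a single traffic class."""
        self.paused[cos] = True

    def pfc_resume(self, cos: int) -> None:
        self.paused[cos] = False

    def link_pause(self) -> None:
        """802.3x-style PAUSE: stops every class on the physical link."""
        self.paused = [True] * 8

    def can_send(self, cos: int) -> bool:
        return not self.paused[cos]


if __name__ == "__main__":
    link = Link()
    link.pfc_pause(FCOE_COS)
    # Storage frames wait; data frames in other classes keep flowing.
    print(link.can_send(FCOE_COS), link.can_send(0))
```

With plain 802.3x flow control, pausing to protect storage frames would also stall every other class on the link, which is the behavior PFC is designed to avoid.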
Another condition common in data networks is the fan-out effect, where traffic arriving from multiple ingress switch ports is to be sent out one or a few egress switch ports. With high volume ingress traffic, egress port buffers can be overrun causing eventual traffic drops. Such behavior is less than ideal for storage traffic.
Mature DCB implementations employ a Virtual Output Queuing (VOQ) mechanism, which uses a concept similar to traditional Fibre Channel buffer-to-buffer credits. With VOQ, a central arbiter allocates transmit credits for frames traversing the switch forwarding fabric. Credits are allocated based on egress port buffer availability. If the egress port buffer cannot accommodate the frame, a credit is not issued and the frame is queued on the ingress port. Once the ingress port buffer can no longer accommodate additional frames, Priority Flow Control is triggered to issue a PAUSE frame down the link.
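The credit sequence described above can be walked through with a toy model. It is a sketch under simplifying assumptions, not a description of any shipping switch: buffers are counted in whole frames, there is one ingress and one egress port, and PAUSE is signalled exactly when the ingress queue fills (real implementations pause at a threshold that leaves room for frames already in flight). The class and method names are hypothetical.

```python
from collections import deque


class VoqSwitch:
    """Toy VOQ model: one ingress queue feeding one egress buffer."""

    def __init__(self, egress_slots: int, ingress_slots: int):
        self.egress_slots = egress_slots
        self.egress_used = 0
        self.ingress_slots = ingress_slots
        self.ingress_queue = deque()
        self.pause_sent = False

    def _grant_credit(self) -> bool:
        """Arbiter: issue a credit only if the egress buffer has room."""
        if self.egress_used < self.egress_slots:
            self.egress_used += 1
            return True
        return False

    def receive(self, frame) -> str:
        # Frames already waiting keep their order, so only try for a
        # credit when the ingress queue is empty.
        if not self.ingress_queue and self._grant_credit():
            return "forwarded"
        if len(self.ingress_queue) >= self.ingress_slots:
            return "dropped"  # should not happen once the sender is paused
        self.ingress_queue.append(frame)
        if len(self.ingress_queue) == self.ingress_slots:
            self.pause_sent = True  # PFC PAUSE goes back down the link
            return "queued+pause"
        return "queued"

    def egress_transmit(self) -> None:
        """Egress port sends one frame, freeing a buffer slot."""
        if self.egress_used == 0:
            return
        self.egress_used -= 1
        if self.ingress_queue and self._grant_credit():
            self.ingress_queue.popleft()
            self.pause_sent = False  # ingress has room again


if __name__ == "__main__":
    sw = VoqSwitch(egress_slots=2, ingress_slots=2)
    print([sw.receive(n) for n in range(4)])
    sw.egress_transmit()
    print(len(sw.ingress_queue), sw.pause_sent)
```

The frame is held at ingress rather than dropped at egress, and the sender is paused only as a last step, which is the ordering the paragraph above describes.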
Granular availability of port buffers, VOQ mechanism and PFC are all class-based behaviors, which can be selectively applied to FCoE traffic, making sure that it has the least chance of being dropped during the times of network congestion. The combination of those behaviors is an extremely powerful tool in deploying lossless Ethernet networks needed for unified data and storage services.
FCoE and Unified Fabric truly open a new page in converged network delivery, allowing economic deployment models and new service offerings. Going forward, FCoE and Unified Fabric will redefine how our data center networks deliver storage and data connectivity.