
The goal of this blog post is to give a clear picture of the Flex-10 port mappings that HP uses to provide its blades with NICs, with a special focus on VMware ESX/vSphere.

First we start off by looking at the “NIC to Interconnect” mappings. These are pretty straightforward and should be familiar to all HP c-Class administrators.
In our example we use HP BL460c G6 blades with 4 Flex-10 NICs (two onboard and two provided via a Dual Port Mezzanine Card).

Please note that the connections that are drawn below are hardwired connections on the Backplane of the HP c7000 Enclosure.

HP Blade Connections toward Interconnect Modules

(We use Mezzanine Slot 2 instead of Slot 1 because other servers in the enclosure already have a connection via Mezzanine Slot 1.)

So, our VMware vSphere host is physically equipped with four 10Gb NICs, so you would expect to see four vmnics in ESX, right?… Wrong!
The HP Virtual Connect domain virtualizes each 10Gb NIC and carves it up into 4 FlexNICs. After doing some math ;) we can conclude that we will get 16 vmnics in our ESX host.

The image below shows that we get 4 FlexNICs per port and how these FlexNICs correspond to a vmnic within ESX.

FlexNIC to vmnic mapping

In the image above we see that, for example, Port 1 of the onboard adapter is divided into 4 FlexNICs: 1A, 1B, 1C and 1D.
The PCI numbering (and thus the order in which the vmnics are numbered within ESX) follows 1A (onboard), 2A (onboard), 1B (onboard), 2B (onboard), 1C (onboard), 2C (onboard) and so on.

Notice that the first 8 vmnics come from the onboard adapter and the second 8 vmnics come from the Mezzanine Card.
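To make that ordering concrete, here is a minimal Python sketch that simply reproduces the FlexNIC-to-vmnic numbering described above (the “LOM” and “MZ2” labels are my own shorthand for the onboard adapter and Mezzanine Slot 2, not names that ESX or Virtual Connect will show you):

# Sketch of the vmnic ordering, assuming the enumeration described in the text.
ADAPTERS = ["LOM", "MZ2"]        # onboard adapter first, then Mezzanine Slot 2
FLEXNICS = ["A", "B", "C", "D"]  # four FlexNICs per physical 10Gb port
PORTS = [1, 2]                   # two physical ports per adapter

vmnic = 0
for adapter in ADAPTERS:          # all onboard FlexNICs enumerate before the mezzanine ones
    for flexnic in FLEXNICS:      # per adapter: 1A, 2A, 1B, 2B, 1C, 2C, 1D, 2D
        for port in PORTS:
            print(f"{adapter}:{port}-{flexnic} -> vmnic{vmnic}")
            vmnic += 1

# Prints LOM:1-A -> vmnic0, LOM:2-A -> vmnic1, LOM:1-B -> vmnic2, ... and MZ2:1-A -> vmnic8.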

From within HP Virtual Connect Manager we can divide the available 10Gb of bandwidth over those 4 FlexNICs. For example, we can give 1A (vmnic0) 1Gb, 1B (vmnic2) 7Gb and 1C (vmnic4) 1Gb, which leaves us with 1Gb to hand out to 1D (vmnic6).
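As a quick sanity check on such a split, the small Python helper below (purely illustrative, not anything Virtual Connect exposes) confirms that the allocations on one physical port add up to at most 10Gb:

PORT_SPEED_GBIT = 10  # each physical Flex-10 port offers 10Gb to divide over its FlexNICs

def remaining_bandwidth(allocations):
    """Return the bandwidth (in Gb) still free on one port, given FlexNIC allocations."""
    total = sum(allocations.values())
    if total > PORT_SPEED_GBIT:
        raise ValueError(f"over-subscribed: {total}Gb allocated on a {PORT_SPEED_GBIT}Gb port")
    return PORT_SPEED_GBIT - total

# The example from the text: 1Gb + 7Gb + 1Gb leaves exactly 1Gb for FlexNIC 1D (vmnic6).
print(remaining_bandwidth({"1A": 1, "1B": 7, "1C": 1}))  # -> 1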

Bandwidth Allocation

Since vSphere has much better iSCSI performance than ESX 3.5 did, we decided to use the full 10Gb of bandwidth to connect to the LeftHand iSCSI storage. Technically this means that we give one FlexNIC 10Gb, which leaves 0Gb to share among the remaining 3 FlexNICs on that port.

The image below shows how the technical design looks now:

FlexNIC to vmnic mapping with 10Gb iSCSI

From a Virtual Connect Manager perspective we used the following settings in the attached Server Profile (see the image below).

Virtual Connect Server Profile

Please note that we defined all 16 NICs and left 6 of them “Unassigned”.

The “Unassigned” ones are the FlexNICs from Mezzanine Slot 2 that didn’t get any bandwidth assigned to them, as you can see in the “Allocated Bandwidth” column.
So for iSCSI we selected MZ2:1-A and MZ2:2-A as the two links with 10Gb allocated, leaving 0Gb for MZ2:1-B, MZ2:2-B, and so on.

The final picture from a vSwitch perspective looks like this, where we separated:

- Service Console (1Gb – vSwitch0)
- VMotion (7Gb – vSwitch1)
- Fault Tolerance (1Gb – vSwitch2)
- VM Networks (1Gb – vSwitch3)

and gave the full 10Gb to the iSCSI storage (vSwitch4).
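Expressed as data, the resulting layout looks roughly like the Python sketch below. The vmnic pairs follow from the numbering sketch earlier (the A FlexNICs become vmnic0/vmnic1, the B FlexNICs vmnic2/vmnic3, and so on); which of the two 1Gb onboard pairs carries Fault Tolerance and which carries the VM Networks is my assumption, so treat this as an illustration of this particular design rather than a generic rule:

# One entry per vSwitch: (uplink vmnics, allocated bandwidth per uplink in Gb) -- assumed pairing.
design = {
    "vSwitch0 (Service Console)": (["vmnic0", "vmnic1"], 1),   # LOM 1-A / 2-A
    "vSwitch1 (VMotion)":         (["vmnic2", "vmnic3"], 7),   # LOM 1-B / 2-B
    "vSwitch2 (Fault Tolerance)": (["vmnic4", "vmnic5"], 1),   # LOM 1-C / 2-C (assumed)
    "vSwitch3 (VM Networks)":     (["vmnic6", "vmnic7"], 1),   # LOM 1-D / 2-D (assumed)
    "vSwitch4 (iSCSI Storage)":   (["vmnic8", "vmnic9"], 10),  # MZ2 1-A / 2-A
}

for vswitch, (uplinks, gbit) in design.items():
    print(f"{vswitch}: {' + '.join(uplinks)} at {gbit}Gb each")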

vSwitch Perspective

Please note that the above design contains two single points of failure: whenever the onboard adapter fails my whole front-end fails (and the same goes for the Mezzanine Card; in that case all my storage connectivity is lost).
Customer constraints, however, kept me from doing it the way displayed in the image below, which obviously is the better technical design since it also covers a hardware failure of either the onboard adapter or the Mezzanine Card.

Without the Single Points of Failure

Now that I’ve explained the mappings from Virtual Connect (FlexNICs) to ESX (vmnics), let’s take a look at the rest of the Virtual Connect domain configuration.

There are two Shared Uplink Sets (SUS) created:

- FRONTEND, which carries the COS (Service Console), VMotion, VM Networks and Fault Tolerance traffic;
- STORAGE, which carries the physically separated iSCSI storage LAN.

The “FRONTEND” SUS is connected via four 10Gb uplinks to two Cisco 6509s (20Gb active / 20Gb passive).
The “STORAGE” SUS is connected via four 10Gb uplinks to two Cisco Nexus 5000s (20Gb active / 20Gb passive).

Virtual Connect Shared Uplink Sets

Word of advice: it’s recommended to enable PortFast on the switch ports that terminate the Shared Uplink Set connections. While doing failover tests we noticed that our networking department hadn’t turned on PortFast as we had requested, which resulted in spanning tree kicking in whenever we powered on a Virtual Connect module.

Word of advice: the next issue we ran into was a number of CRC errors in the Virtual Connect statistics (while the Cisco switches didn’t register any CRC errors). These errors disappeared when we set the Shared Uplink Sets to a static 10Gb speed instead of “auto”.

Last word of advice: when implementing a technical environment like this it’s crucial to test every possible failure scenario, from a single ESX host down to all the separate components. I’ve written very detailed test documents for this, and they helped us discover a very strange technical problem which I’m currently investigating.
