L2Bridge without MAC learning?

  • #3683
    Fraida Fund
    Participant

      Hello,

      It is not mentioned in the Network Services in FABRIC article, but it seems as if packets are filtered by MAC learning on the L2Bridge type network.

      Is it possible to get an L2Bridge network (Ethernet service connecting multiple interfaces in a single site) without MAC learning?

      #3689
      yoursunny
      Participant

        it seems as if packets are filtered by MAC learning on the L2Bridge type network

        What observation led you to this conclusion?

        What are you trying to do, how did it behave, and how did you expect it to behave?

        #3694
        Fraida Fund
        Participant

          Well, one observation is that Paul said so elsewhere in the forum.

          L2Bridge: These bridges are like a local network switch/bridge that connects any number of local nodes within a single site. These local bridges are directly connected to the nodes, so your bandwidth will be limited by the maximum bandwidth of the NICs that you are using (i.e. ConnectX_6 NICs will provide 100Gbps). This bridge is not programmable and only performs simple MAC learning. The key point about these bridges is that they can only connect nodes within a single FABRIC site.

          Another observation: suppose I create 4 VMs with a basic NIC on each, and connect each basic NIC to an L2Bridge-type network. I capture traffic on each of the NICs with tcpdump. A frame sent by host 1 with host 2’s address as the destination MAC only appears at the NIC on host 2, and not on host 3 or host 4.

          Another observation: suppose I create a Linux switch connecting multiple hosts, using an L2Bridge between my Linux switch and each of the connected hosts (as in, e.g., this example). Non-broadcast frames sent from the hosts don’t make it to the Linux switch interfaces, so the bridge does not work.

          What I am trying to do: I am trying to connect multiple basic NIC interfaces with a network link, so that any frame sent by any NIC on the link appears at every other NIC on the link. Like the way an Ethernet segment behaves, or Ethernet segments connected by a hub, or Ethernet segments connected by a switch with MAC learning disabled.
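
To make the requested behavior concrete, here is a minimal Python sketch (not FABRIC code) contrasting a hub, which repeats every frame to all other ports, with a switch that does simple MAC learning. The class names, port numbers, and MAC addresses are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


class Hub:
    """Repeats every frame to all ports except the one it arrived on."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports

    def forward(self, in_port: int, src: str, dst: str) -> List[int]:
        return [p for p in range(self.num_ports) if p != in_port]


class LearningSwitch:
    """Learns source MAC -> port and forwards known unicast to one port only."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        self.table: Dict[str, int] = {}

    def forward(self, in_port: int, src: str, dst: str) -> List[int]:
        self.table[src] = in_port  # learn where the sender lives
        out = self.table.get(dst)
        if dst == BROADCAST or out is None:
            # Broadcast or unknown destination: flood like a hub.
            return [p for p in range(self.num_ports) if p != in_port]
        return [] if out == in_port else [out]


# Hypothetical addresses for host 1 (port 0) and host 2 (port 1).
H1, H2 = "02:00:00:00:00:01", "02:00:00:00:00:02"

hub, sw = Hub(4), LearningSwitch(4)
sw.forward(1, H2, BROADCAST)   # the switch learns that H2 is on port 1
print(hub.forward(0, H1, H2))  # [1, 2, 3]
print(sw.forward(0, H1, H2))   # [1]
```

Once host 2's address has been learned, the learning switch delivers the unicast frame only to port 1, which matches the tcpdump observation above. The hub delivers it to every other port, which is the behavior being asked for.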

          #3695
          yoursunny
          Participant

            NIC_Basic is a Virtual Function (VF) on the ConnectX-6 Ethernet adapter. The hardware Ethernet adapter is shared among many VFs, and it determines which VF shall receive an incoming packet by matching the destination address. Therefore, NIC_Basic cannot receive Ethernet frames whose destination address differs from its own address.
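
As a rough illustration of the matching described here, the following Python sketch models the adapter choosing which VFs receive an incoming frame. It is a simplification under stated assumptions: the function names and MAC addresses are hypothetical, broadcast is assumed to reach every VF, and multicast and any other hardware filters are ignored.

```python
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


def vf_accepts(vf_mac: str, dst_mac: str) -> bool:
    """A VF sees a frame only if it is broadcast or addressed to the VF itself."""
    dst = dst_mac.lower()
    return dst == BROADCAST or dst == vf_mac.lower()


def deliver(dst_mac: str, vfs: Dict[str, str]) -> List[str]:
    """Return the names of the VFs (name -> MAC) that receive the frame."""
    return [name for name, mac in vfs.items() if vf_accepts(mac, dst_mac)]


# Hypothetical VFs sharing one physical adapter.
vfs = {"vf0": "02:00:00:00:00:0a", "vf1": "02:00:00:00:00:0b"}

print(deliver("02:00:00:00:00:0b", vfs))  # ['vf1']
print(deliver(BROADCAST, vfs))            # ['vf0', 'vf1']
print(deliver("02:00:00:00:00:99", vfs))  # []
```

The last call is the case that matters in this thread: a frame addressed to some other host's MAC is delivered to no VF at all, so a VM trying to act as a switch never sees it.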

            #3696
            Paul Ruth
            Keymaster

              Fraida,

              Coincidentally, I ran into this issue recently when putting together an example that I intend to share with you in our meeting with Kate this week.  I have a working prototype that looks like your example that uses a 5th VM to run a software OVS switch (https://witestlab.poly.edu/blog/basic-ethernet-switch-operation/).

              There are actually a couple of issues going on here that I had to work around… and it’s super impressive that Yoursunny identified the trickiest part.

              The main issue is the one that Yoursunny pointed out related to the Basic NICs being SRIOV virtual functions on a ConnectX-6.

              You can think of the ConnectX-6 as a mini-switch that uses its physical port(s) as trunks between itself and the bigger dataplane switch. The mini-switch then has several access ports (i.e. SRIOV virtual functions) that are passed through to the various VMs. The traffic on each of these access ports is basically a “pseudo wire” going through the ConnectX-6 between the VM and the dataplane switch. The problem is that the ConnectX-6 “mini-switch” is also doing MAC learning on the “pseudo wires” and is filtering the traffic. I think this is an unforeseen problem with our SRIOV configuration that just needs to be changed in the future. We are working on this.

              The effect this has on your example is that an OVS VM that is using 4 Basic NICs connected to 4 other hosts will not see traffic sent directly from one host to another. The ARP request will go through because it is a broadcast, but the ARP reply is filtered by the ConnectX-6 “mini-switch”. Without the ARPs, we don’t get very far.

              My workaround is to use dedicated ConnectX-5s for the OVS switch VM (the hosts can be Basic NICs). The dedicated NICs are on access ports connected directly to the dataplane switch, so there is no “mini-switch” filtering packets in between. This isn’t a great solution because it limits the degree of your OVS switch and uses a much more scarce resource type. The better long-term solution is for us to turn off MAC learning on the ConnectX-6 “mini-switches”.
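
The ARP failure and the dedicated-NIC workaround can be traced with a small Python sketch. It is only a model of the behavior described in this post, not a reproduction of the actual hardware logic: the MAC addresses, link numbers, and the `reaches_ovs` helper are hypothetical, and `filtering=False` stands in for an OVS VM interface with no “mini-switch” in the path.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

# Hypothetical MAC addresses, for illustration only.
HOST_MAC = {1: "02:00:00:00:00:01", 2: "02:00:00:00:00:02"}
OVS_VF_MAC = {1: "02:00:00:00:01:01", 2: "02:00:00:00:01:02"}


def reaches_ovs(link: int, dst_mac: str, filtering: bool = True) -> bool:
    """Does a frame sent by the host on `link` reach the OVS VM's port on that link?

    With filtering on (Basic NIC on the OVS VM), only broadcast frames and
    frames addressed to the OVS VM's own VF get through.
    """
    if not filtering:
        return True
    return dst_mac in (BROADCAST, OVS_VF_MAC[link])


# Host 1 broadcasts an ARP request; host 2 answers host 1 directly.
print(reaches_ovs(1, BROADCAST))                     # True: request gets in
print(reaches_ovs(2, HOST_MAC[1]))                   # False: reply is dropped
print(reaches_ovs(2, HOST_MAC[1], filtering=False))  # True: no filter in between
```

In this model the OVS VM never sees the unicast ARP reply on a Basic NIC, so it has nothing to forward back to host 1, while an unfiltered port receives it normally.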

              I can tell you more about this later this week when we talk with Kate.

              Paul

              #3697
              Fraida Fund
              Participant

                Yes, let’s discuss further. I can think of a bunch of scenarios where we would want the interfaces to be in “promiscuous mode” and in some of them, it will not be practical to use dedicated interfaces (we need too many interfaces in “promiscuous mode”).

                #3701
                Ilya Baldin
                Participant

                  We will open an internal ticket about it. The VFs are created on the worker node at boot and then given out by the Control Framework to the virtual machines, so we need to check what options are set on them at creation time (typically they cannot be changed once created).

                  @yoursunny may be right and it may or may not be possible for us to change this behavior – we will report here once we know more. Thank you all for your feedback.

                  #4011
                  Fraida Fund
                  Participant

                    Hi! I wanted to follow up on this, since this functionality is used in educational materials. I am working to transition those materials ahead of the imminent retirement of InstaGENI, and I need to consider what platform to transition them to.

                    Is this issue expected to be fix-able? If yes, is there a rough timeline? (Is it likely to be fixed before InstaGENI is retired?)

                    #4020
                    Paul Ruth
                    Keymaster

                      Ezra and I looked at this a while back, and it seemed that the Mellanox cards were not handling these frames the way we expected given the config options that we used. We’ll need to revisit this. I’ll get back to you about this.

                      thanks,

                      Paul

                      #5114
                      Ilya Baldin
                      Participant

                        Just to bring this back up: we are working with NVIDIA/Mellanox engineering support on this. Their engineers are able to reproduce the problem (which is good news). They are trying to figure out the difference between a working setup and a non-working setup.

                        #5115
                        Fraida Fund
                        Participant

                          Thanks, I appreciate the update!

                          #5330
                          Ezra Kissel
                          Participant

                            Fraida – after some back-and-forth with NVIDIA/Mellanox, it appears the desired bridged virtual function forwarding behavior is not something currently supported. We are continuing to discuss alternatives with them to see if there is a solution we can support. Apologies for this issue dragging out for so long.

                            #5332
                            Fraida Fund
                            Participant

                              Thanks for keeping me informed!
