Paul Ruth

Forum Replies Created

Viewing 15 posts - 256 through 270 (of 271 total)
  • in reply to: Replication of customized configurations #944
    Paul Ruth
    Keymaster

      Fu Shen,

      There isn’t really a way to save the state of a slice. We recommend scripting the configuration of your experiments so that you can easily bring the experiment up/down.  Some configurations can take a long time to build.  In these cases, you might want to create containers to package your software so that it can easily be deployed on the VMs.

      I am working on some examples using DockerHub to deploy software.  One example that is not quite done yet but could be useful is a slice that deploys a P4 software switch using DockerHub.

      An initial version of the notebook can be found here: https://github.com/fabric-testbed/jupyter-examples/blob/pruth-utils/fabric_examples/beta_functionality/fabric_p4_bmv2/fabric_p4_simple_router.ipynb

      Note that this example uses some utility functions to copy scripts into the VMs and then run them.  The scripts run “docker run …” and other configuration steps.  Building the software in this Docker container takes a couple of hours, but installing the pre-built container takes about a minute.  This streamlines the creation of the experiment and makes it very easy to share the experiment with other users.
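As a rough illustration of the scripted approach described above, the sketch below generates a small shell script that pulls and starts a pre-built container. The image name `example/bmv2-router:latest` and the helper `build_deploy_script` are hypothetical and are not taken from the notebook; copying the script into the VM and running it is left to whatever utility functions you already use.

```python
import shlex


def build_deploy_script(image, container_name, extra_args=()):
    """Return a bash script that pulls a pre-built image and starts it.

    All arguments are illustrative; substitute your own image and options.
    """
    run_cmd = ["docker", "run", "-d", "--name", container_name, *extra_args, image]
    lines = [
        "#!/bin/bash",
        "set -e  # stop at the first failing command",
        "docker pull " + shlex.quote(image),
        " ".join(shlex.quote(part) for part in run_cmd),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical image and container names, for illustration only.
script = build_deploy_script(
    "example/bmv2-router:latest", "router", ["--network", "host"]
)
print(script)
```

Keeping the script generation in one function makes it easy to reuse the same deployment on every VM in the slice and to share it with other users.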

      Paul

      in reply to: JupyterHub: 403 : Forbidden Access not allowed #935
      Paul Ruth
      Keymaster

        I just approved Jason. His account was still pending but should work now. I need to check on Muhammad’s account.

         

        in reply to: JupyterHub: 403 : Forbidden Access not allowed #922
        Paul Ruth
        Keymaster

          Ok, have him try one more time. I fixed another attribute on the backend.

          in reply to: Node image types #919
          Paul Ruth
          Keymaster

            Good question.  For now, the only images available are “default_ubuntu_20” and “default_centos_8”.  We plan to enable custom images at a later date.  Let us know if you have a specific experiment that needs a different OS flavor.

            Generally, custom images are useful if you have specific Linux flavor, kernel, or device driver requirements.  Otherwise, we recommend packaging software in Docker containers. An experiment can start from a base VM image and then pull a Docker container from a service such as Docker Hub.  Stay tuned for a software P4 switch example that I will release soon. It loads the experimental software using Docker Hub.

             

            in reply to: JupyterHub: 403 : Forbidden Access not allowed #915
            Paul Ruth
            Keymaster

              Can you try again, but make sure that all cookies related to FABRIC and CILogon are deleted first? There may be some weird state left over from the previous attempts. I know you said he tried with a different browser, but I just want to make sure he had never tried with the new browser before.

              in reply to: JupyterHub: 403 : Forbidden Access not allowed #913
              Paul Ruth
              Keymaster

                I added JupyterHub access to his account. Please try again.

                Paul

                Paul Ruth
                Keymaster

                  I agree this would be ideal. The issue is that the forums/learn site is WordPress under the covers. We have looked into single sign-on across our custom portal and WordPress, but there are other priorities at the moment.

                  Thanks for the feedback. Streamlining login is something we will be working toward.

                  in reply to: Errors logging into portal with CI Logon / CoManage #905
                  Paul Ruth
                  Keymaster

                    Joe, I tried to fix it on the backend. Can you try again, please?

                    If that doesn’t work I will need to elevate this to Michael Stealey.

                    Paul

                    • This reply was modified 2 years, 6 months ago by Paul Ruth.
                    in reply to: LBNL Maintenance Today (Aug 20) #764
                    Paul Ruth
                    Keymaster

                      Maintenance is complete.

                      in reply to: Failed when adding two SharedNIC_ConnectX_6 to one node #746
                      Paul Ruth
                      Keymaster

                        You definitely can add multiple ConnectX-6 cards.  Given our very limited dev capacity, it may be that there are no nodes left that have two ConnectX-6 cards in the same host.

                        What is the error you see in the sliver status for that failed VM?

                        (see this example to get sliver status: https://github.com/fabric-testbed/jupyter-examples/blob/master/fabric_examples/basic_examples/get_slivers.ipynb)

                        • This reply was modified 2 years, 8 months ago by Paul Ruth.
                        in reply to: Questions about node ports and bandwidth between sites #537
                        Paul Ruth
                        Keymaster

                          Yes, that is all correct. The only additional thing to think about is whether you add VLAN tags to the interfaces.  If there are VLAN tags, you need to create the virtual interfaces inside the VM.

                          Also, if you are using a L2P2P you must use a VLAN tag.  If you are not seeing traffic this could be why.
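For reference, creating a tagged virtual interface inside a Linux VM usually comes down to a few `ip` commands. The sketch below only builds the command strings; the interface name `eth1`, VLAN ID 200, and the address `192.168.10.1/24` are example values, and the helper `vlan_setup_commands` is hypothetical. The commands themselves need root privileges when run in the VM.

```python
def vlan_setup_commands(parent, vlan_id, ip_cidr=None):
    """Build the iproute2 commands that create a VLAN sub-interface.

    parent   -- name of the physical/experiment interface (e.g. "eth1")
    vlan_id  -- 802.1Q VLAN ID, valid range 1-4094
    ip_cidr  -- optional address to assign, e.g. "192.168.10.1/24"
    """
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")

    sub = f"{parent}.{vlan_id}"
    cmds = [
        f"ip link add link {parent} name {sub} type vlan id {vlan_id}",
        f"ip link set dev {parent} up",
        f"ip link set dev {sub} up",
    ]
    if ip_cidr:
        cmds.append(f"ip addr add {ip_cidr} dev {sub}")
    return cmds


# Example values only; use the interface and tag from your own slice.
for cmd in vlan_setup_commands("eth1", 200, "192.168.10.1/24"):
    print(cmd)
```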

                          Paul

                          in reply to: How to view components of active slice #517
                          Paul Ruth
                          Keymaster
                            in reply to: Questions about node ports and bandwidth between sites #516
                            Paul Ruth
                            Keymaster

                              The best examples will be in the JupyterHub environment. We have pre-installed a suite of example notebooks from the following GitHub repo. The notebooks are currently in development and will improve over time. You may need to do a git pull on the repo to get the newest example notebooks.

                              https://github.com/fabric-testbed/jupyter-examples

                              Specifically, the examples in this folder will be useful:

                              https://github.com/fabric-testbed/jupyter-examples/tree/master/fabric_examples/basic_examples

                              These notebooks are very new and may be incomplete. I will try to update the notebooks today with the most current information.

                              More generally, there are 3 types of layer2 “Network Services” on FABRIC (layer3 services are still in development):

                              1. L2Bridge: These bridges are like a local network switch/bridge that connects any number of local nodes within a single site.  These local bridges are directly connected to the nodes, so your bandwidth will be limited by the maximum bandwidth of the NICs that you are using (e.g. ConnectX_6 NICs will provide 100Gbps). This bridge is not programmable and only performs simple MAC learning. The key limitation of these bridges is that they can only connect nodes within a single FABRIC site.
                              2. L2P2P (Peer-to-peer):  These are peer-to-peer layer2 circuits that connect exactly 2 nodes. These nodes must be on different FABRIC sites.  These circuits will have user-specified QoS. QoS is currently in development and is not yet available.  Once QoS is available, users will be able to request dedicated bandwidth on these circuits.
                              3. L2S2S (Site-to-site):  Site-to-site is a hybrid of L2Bridge and L2P2P with some limitations. With S2S a user gets a pair of L2Bridges on different FABRIC sites that have a circuit connecting them.  Any number of nodes on either of the sites can be connected to the single S2S Network Service.  All nodes connected to the S2S service are on the same L2 network and can use the same layer3 subnet. One big limitation of S2S is that the wide-area circuit will be best effort and will not have any guaranteed QoS.  If you want guaranteed QoS, you will need to use L2P2P and set up your own routing or switching on each end.
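The rules in the list above can be summarized as a small decision helper. This is only a sketch of the reasoning, not part of the FABRIC API: the function `choose_service` is hypothetical and returns plain strings rather than any library type. Note that dedicated QoS on L2P2P is described above as still in development.

```python
def choose_service(node_sites, need_qos=False):
    """Pick a layer2 service type from the site of each node to connect.

    node_sites -- list with one site name per node, e.g. ["UKY", "LBNL"]
    need_qos   -- True if guaranteed bandwidth on the wide-area link is wanted
    """
    if len(node_sites) < 2:
        raise ValueError("need at least two nodes to connect")

    sites = set(node_sites)
    if len(sites) == 1:
        # Any number of nodes, all within one site.
        return "L2Bridge"
    if len(sites) > 2:
        raise ValueError("none of the three services spans more than two sites")

    # Exactly two sites from here on.
    if len(node_sites) == 2:
        # Exactly two nodes on different sites.
        return "L2P2P"
    if need_qos:
        raise ValueError(
            "S2S is best effort; use L2P2P between one node per site "
            "and set up your own routing or switching on each end"
        )
    return "L2S2S"


print(choose_service(["UKY", "UKY", "UKY"]))       # L2Bridge
print(choose_service(["UKY", "LBNL"], True))       # L2P2P
print(choose_service(["UKY", "UKY", "LBNL"]))      # L2S2S
```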

                               

                              Although the notebooks include examples of how to configure and use the data plane networks, the context below might be helpful.

                              When you add an interface to a network service you have a couple of options. The interfaces can be tagged or untagged with VLANs.  By default, adding an interface to a network service will result in an untagged interface.

                              • untagged interfaces: These behave like an access port. In other words, any VLAN tags assigned by FABRIC will be stripped from the layer2 traffic before it is passed to the user’s node.  The user’s node will not need to process VLAN tags.

                               

                              • tagged interfaces: These behave like a trunk port. VLAN tags are left on the layer2 traffic that enters the node. In this case, the user must process the VLAN tags within the node. FABRIC allows the user to specify the VLAN tag that should be on the traffic as it enters the node. There are separate namespaces for these tags per interface, so users are free to use any VLAN tag they wish and there will be no conflict with other users.

                              Tagged example that applies VLAN tag 200:

                              
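                              # Assumes t is the experiment topology and n1, n2 are nodes already added to it.
                              # Add a NIC to each node.
                              n1.add_component(model_type=ComponentModelType.SmartNIC_ConnectX_6, name='n1-nic1')
                              n2.add_component(model_type=ComponentModelType.SmartNIC_ConnectX_5, name='n2-nic1')

                              # Take the first interface on each node.
                              n1_iface = n1.interface_list[0]
                              n2_iface = n2.interface_list[0]

                              # Connect the two interfaces with a layer2 circuit.
                              t.add_network_service(name='ptp1', nstype=ServiceType.L2PTP,
                                                    interfaces=[n1_iface, n2_iface])

                              # Tag n1's interface with VLAN 200.
                              if_labels = n1_iface.get_property(pname="labels")
                              if_labels.vlan = "200"
                              n1_iface.set_properties(labels=if_labels)

                              # Tag n2's interface with the same VLAN.
                              if_labels = n2_iface.get_property(pname="labels")
                              if_labels.vlan = "200"
                              n2_iface.set_properties(labels=if_labels)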

                              Let me know if this helps.

                              in reply to: Questions about node ports and bandwidth between sites #478
                              Paul Ruth
                              Keymaster

                                Update:

                                I tried this myself and was able to get ~6Gbps, but only after tuning as suggested by the ESnet site.  I did this with VMs at UKY and LBNL. Both VMs were bigger than the default (32 cores, 64 GB RAM… this is probably bigger than necessary).

                                I also found that jumbo frames are not yet possible between these sites. We are working on making this possible soon.

                                in reply to: Questions about node ports and bandwidth between sites #477
                                Paul Ruth
                                Keymaster

                                  Generally, eth0 is used as a management interface. This is the network that you use when you ssh to the node from the Internet.  You should avoid using this network for experiments.

                                  The interfaces numbered eth1 (or higher) will be the ones associated with the network component(s) that you have added to your node. These are the ones you should use for experiments.

                                  Re: the slow performance of the experimental network.  Our initial deployment does not yet use the dedicated L1 circuits that we will switch to as they become available. Instead it uses I2 AL2S.  However, even with AL2S you should be able to get much higher bandwidth.  I would expect you could get over 10Gbps (maybe even as much as 100Gbps).  There are a couple of possible issues:

                                  1. Our network deployment may not be configured/tuned correctly. This is such low bandwidth that I suspect something in the path is dropping packets.  What slice configuration did you use? I assume you have one node at UKY and one at LBNL, is this true? Also, which components did you include on the nodes?
                                  2. Your end hosts need to be tuned for high-latency, high-bandwidth data transfers.  From the ifconfig info I can see that your nodes are not using jumbo frames.  There are probably some other tuning optimizations you can make. ESnet has a great resource for learning about this:  https://fasterdata.es.net/host-tuning/linux/ 
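To see why host tuning matters on a high-latency, high-bandwidth path, it helps to estimate the bandwidth-delay product (BDP), which is roughly the amount of data a single TCP connection must keep in flight to fill the link. The sketch below is a back-of-the-envelope calculation; the 50 ms round-trip time is an assumed example value, not a measurement between UKY and LBNL, and `bdp_bytes` is a hypothetical helper.

```python
def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in bytes.

    bandwidth_bps -- link rate in bits per second (integer)
    rtt_ms        -- round-trip time in milliseconds (integer)
    """
    # bits in flight = rate * RTT; divide by 8 for bytes and by 1000 for ms.
    return bandwidth_bps * rtt_ms // (8 * 1000)


# Assumed example: a 10 Gbps path with a 50 ms round-trip time.
needed = bdp_bytes(10_000_000_000, 50)
print(needed)  # 62500000 bytes, i.e. about 62.5 MB of TCP buffer
```

If the maximum TCP buffer size on the hosts is well below this figure, a single flow cannot fill the path no matter how fast the NICs are, which is the kind of setting the ESnet guide covers.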

                                  Please let us know which components you are using in the VMs. I would like to try this myself and see if I have the same issues.

                                  Paul

                                   

                                   
