
Komal Thareja

Forum Replies Created

in reply to: FailedPostStartHook Error when launching Jupyter Notebook #4885
    Komal Thareja
    Participant

      Hi Sarah,

When you have the Bleeding Edge (or Beyond Bleeding Edge) container running, could you please share the output of the following commands?


      ls -lrt /home/fabric/work/
      ls -lrt /home/fabric/work/fabric_config

Also, could you please share the warning message that you were unable to upload? You can email it to kthare10@renci.org.

      Thanks,
      Komal

      in reply to: Long running slice stability issue.  #4811
      Komal Thareja
      Participant

Hi Fengping,

Node1: ens7 maps to NIC3. It was configured as below.

NOTE: the prefixlen is set to 128 instead of 64, so the node treats no other address as on-link and cannot reach the gateway.

ens7: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 2602:fcfb:1d:2::2  prefixlen 128  scopeid 0x0<global>
        inet6 fe80::7f:aeff:fe44:cbc9  prefixlen 64  scopeid 0x20<link>
        ether 02:7f:ae:44:cb:c9  txqueuelen 1000  (Ethernet)
        RX packets 28126  bytes 2617668 (2.6 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2581  bytes 208710 (208.7 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

        I brought this interface down and re-configured the IP address using the following command:

        ip -6 addr add 2602:fcfb:1d:2::2/64 dev ens7

        After this I can ping the gateway as well as other nodes.

        root@node1:~# ping 2602:fcfb:1d:2::4
        PING 2602:fcfb:1d:2::4(2602:fcfb:1d:2::4) 56 data bytes
        64 bytes from 2602:fcfb:1d:2::4: icmp_seq=1 ttl=64 time=0.186 ms
        ^C
        --- 2602:fcfb:1d:2::4 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.186/0.186/0.186/0.000 ms
        root@node1:~# ping 2602:fcfb:1d:2::1
        PING 2602:fcfb:1d:2::1(2602:fcfb:1d:2::1) 56 data bytes
        64 bytes from 2602:fcfb:1d:2::1: icmp_seq=1 ttl=64 time=0.555 ms
        ^C
        --- 2602:fcfb:1d:2::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.555/0.555/0.555/0.000 ms

Node2: The IP was configured on ens7; however, the MAC address for NIC3 (02:15:60:C2:7A:AD) maps to ens9.
I configured ens9 with the command ip -6 addr add 2602:fcfb:1d:2::3/64 dev ens9 and can now ping the gateway and other nodes.

        root@node2:~# ping 2602:fcfb:1d:2::1
        PING 2602:fcfb:1d:2::1(2602:fcfb:1d:2::1) 56 data bytes
        64 bytes from 2602:fcfb:1d:2::1: icmp_seq=1 ttl=64 time=0.948 ms
        64 bytes from 2602:fcfb:1d:2::1: icmp_seq=2 ttl=64 time=0.440 ms
        ^C
        --- 2602:fcfb:1d:2::1 ping statistics ---
        2 packets transmitted, 2 received, 0% packet loss, time 1007ms
        rtt min/avg/max/mdev = 0.440/0.694/0.948/0.254 ms
        root@node2:~# ping 2602:fcfb:1d:2::2
        PING 2602:fcfb:1d:2::2(2602:fcfb:1d:2::2) 56 data bytes
        64 bytes from 2602:fcfb:1d:2::2: icmp_seq=1 ttl=64 time=0.146 ms
        64 bytes from 2602:fcfb:1d:2::2: icmp_seq=2 ttl=64 time=0.082 ms
        ^C
        --- 2602:fcfb:1d:2::2 ping statistics ---
        2 packets transmitted, 2 received, 0% packet loss, time 1010ms
        rtt min/avg/max/mdev = 0.082/0.114/0.146/0.032 ms

Please configure the IPs on the other interfaces, or share the IPs and I can help configure them.

        Thanks,
        Komal

        Komal Thareja
        Participant

          Hi Elie,

          Could you please share the output of the following commands from your container?

pip list | grep fabric

          cat ~/work/fabric_config/requirements.txt

If you have any entries for fabrictestbed-extensions in ~/work/fabric_config/requirements.txt, please remove them and restart your container via File -> Hub Control Panel -> Stop My Server, followed by Start My Server.

          Thanks,

          Komal

          in reply to: Long running slice stability issue.  #4802
          Komal Thareja
          Participant

            Hi Fengping,

I have rebooted both Node1 and Node2. They should be accessible now. Please set up the IPs as per the MAC addresses shared above. Please do let me know if anything else is needed from my side.

            Thanks,

Komal

            in reply to: Long running slice stability issue.  #4786
            Komal Thareja
            Participant

You can confirm the interfaces for Node1 and Node2 via their MAC addresses (a small fablib sketch for printing them follows the list):

Node1

02:7F:AE:44:CB:C9 => NIC3
06:E3:D6:00:5B:06 => NIC2
02:BC:A6:3F:C7:CB => NIC1

Node2

02:15:60:C2:7A:AD => NIC3
02:1D:B9:31:E7:23 => NIC2
02:B5:53:89:2C:E6 => NIC1
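
If it helps, here is a minimal fablib sketch (the slice name is a placeholder) that prints each interface's MAC address so it can be matched against the output of "ip link" inside the VMs:

from fabrictestbed_extensions.fablib.fablib import FablibManager

fablib = FablibManager()
slice = fablib.get_slice(name="MySlice")  # placeholder slice name

# Print node, interface, and MAC for matching against "ip link" output.
for node in slice.get_nodes():
    for iface in node.get_interfaces():
        print(node.get_name(), iface.get_name(), iface.get_mac())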

              Thanks,

              Komal

              in reply to: Long running slice stability issue.  #4785
              Komal Thareja
              Participant

                Hi Fengping,

I think ens7 -> net1, ens8 -> net3, and ens9 -> net2. Please let me know once you get public access back; I can help figure out the interfaces.

                Thanks,

                Komal

                in reply to: Long running slice stability issue.  #4780
                Komal Thareja
                Participant

                  Hello Fengping,

I have re-attached the PCI devices for the VMs node1 and node2. You will need to reassign the IP addresses on them for your links to work. Please let us know whether the links work as expected after configuring the IP addresses.

                  Thanks,

                  Komal

                  Komal Thareja
                  Participant

Maintenance is complete. The testbed is open for use.

                    in reply to: Jupyter Hub Outage – Cluster Issues #4635
                    Komal Thareja
                    Participant

The GKE cluster issues are resolved and Jupyter Hub is back online. Apologies for the inconvenience!

                      Thanks,

                      Komal

                      in reply to: A public IP for the Fabric node #4603
                      Komal Thareja
                      Participant

                        @yoursunny – Thank you for sharing the example scripts. Appreciate it!

@Xusheng – You can use the FabNetv4Ext or FabNetv6Ext services as explained here.

Also, we have two example notebooks, one each for FabNetv4Ext and FabNetv6Ext, available via start_here.ipynb (a minimal sketch follows the list below):

                        • FABNet IPv4 Ext (Layer 3): Connect to FABRIC’s IPv4 internet with external access (manual)
                        • FABNet IPv6 Ext (Layer 3): Connect to FABRIC’s IPv6 internet with external access (manual)
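
For reference, a rough fablib sketch of the FabNetv4Ext flow; the slice, node, and site names below are placeholders, and the publicly routable address is requested after the slice is active:

from fabrictestbed_extensions.fablib.fablib import FablibManager

fablib = FablibManager()

# Request a node whose basic NIC is attached to a FABNetv4Ext network.
slice = fablib.new_slice(name="ext-demo")
node = slice.add_node(name="node1", site="STAR")
iface = node.add_component(model="NIC_Basic", name="nic1").get_interfaces()[0]
net = slice.add_l3network(name="net-ext", interfaces=[iface], type="IPv4Ext")
slice.submit()

# Once the slice is active, ask for one of the allocated addresses to be
# made publicly routable, then resubmit the slice to apply the change.
net = slice.get_network(name="net-ext")
net.make_ip_publicly_routable(ipv4=[str(net.get_available_ips()[0])])
slice.submit()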

                        Thanks,
                        Komal

                        in reply to: IPv6 on FABRIC: A hop with a low MTU #4580
                        Komal Thareja
                        Participant

@yoursunny Thank you for sharing your script. We have updated the MTU settings across sites and were also able to use your script for testing. However, with the latest fablib performance improvements the script needed a few small adjustments; I am sharing the updated script here.

                          Thanks,
                          Komal

                          in reply to: manual cleanup needed? #4579
                          Komal Thareja
                          Participant

                            Hi Fengping,

Thank you so much for reporting this issue. There was a bug that caused the same subnet to be allocated to multiple slices, so when a second slice was given your subnet, traffic on your slice stopped working.

I have applied the fix for the bug on production. Could you please delete your slice and recreate it? Apologies for the inconvenience.

I appreciate your help in making the system better.

                            Thanks,
                            Komal

                            in reply to: manual cleanup needed? #4575
                            Komal Thareja
                            Participant

Please try this to create 12 VMs; it should let you use almost the entire worker with respect to cores. I will keep you posted about the flavor details.

                              
                              
from ipaddress import IPv4Network

# fablib, slice_name, network_name, and site are assumed to be
# defined earlier in the notebook.

# Create slice
slice = fablib.new_slice(name=slice_name)

# L2 network shared by all nodes
net1 = slice.add_l2network(name=network_name, subnet=IPv4Network("192.168.1.0/24"))

node_name = "Node"
number_of_nodes = 12
for x in range(number_of_nodes):
    # The first VM gets the large disk; the rest get 500 GB.
    disk = 4000 if x == 0 else 500
    node = slice.add_node(name=f'{node_name}{x}', site=site, cores='62', ram='128', disk=disk)
    iface = node.add_component(model='NIC_Basic', name='nic1').get_interfaces()[0]
    iface.set_mode('auto')
    net1.add_interface(iface)

# Submit slice request (the trailing semicolon suppresses notebook output)
slice.submit();
                              

                              Thanks,
                              Komal

                              in reply to: manual cleanup needed? #4572
                              Komal Thareja
                              Participant

                                With the current flavor definition, I would recommend requesting VMs with the configuration:

                                cores='62', ram='384', disk='2000'

Anything bigger than this maps to fabric.c64.m384.d4000, and only one of the workers, cern-w1, can accommodate 4TB disks; the rest of the workers can accommodate at most 2TB disks. I will discuss this internally to work on providing a better flavor to accommodate your slice.
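
In fablib, that request would look something like the following sketch (the node name is a placeholder, and the site is assumed to be CERN per the workers mentioned above):

# Largest request that still fits the 2TB-disk workers.
node = slice.add_node(name="node1", site="CERN", cores='62', ram='384', disk='2000')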

                                Thanks,

                                Komal

P.S.: I was able to successfully create a slice with the above configuration.

                                in reply to: manual cleanup needed? #4570
                                Komal Thareja
                                Participant

I looked at the instance types; please try setting cores='62', ram='384', disk='100'.

FYI: https://github.com/fabric-testbed/InformationModel/blob/master/fim/slivers/data/instance_sizes.json might be useful for VM sizing.
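
As an illustration (this helper is not part of fablib), the flavor names in that file follow a fabric.c<cores>.m<ram>.d<disk> convention, so they can be decoded directly:

import re

def parse_flavor(name: str) -> dict:
    # Decode e.g. "fabric.c64.m384.d4000" -> cores=64, ram=384 GB, disk=4000 GB.
    m = re.fullmatch(r"fabric\.c(\d+)\.m(\d+)\.d(\d+)", name)
    if not m:
        raise ValueError(f"unrecognized flavor name: {name}")
    cores, ram, disk = map(int, m.groups())
    return {"cores": cores, "ram": ram, "disk": disk}

print(parse_flavor("fabric.c64.m384.d4000"))
# {'cores': 64, 'ram': 384, 'disk': 4000}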

                                  Thanks,

                                  Komal
