Komal Thareja

Forum Replies Created

Viewing 15 posts - 226 through 240 (of 319 total)
  • in reply to: Jupyter Hub Outage – Cluster Issues #4635
    Komal Thareja
    Participant

      The GKE cluster issues are resolved, and Jupyter Hub is back online. Apologies for the inconvenience!

      Thanks,

      Komal

      in reply to: A public IP for the Fabric node #4603
      Komal Thareja
      Participant

        @yoursunny – Thank you for sharing the example scripts. Appreciate it!

        @Xusheng – You can use FabNetv4Ext or FabNetv6Ext services as explained here.

        We also have two example notebooks, one each for FabNetv4Ext and FabNetv6Ext, available via start_here.ipynb:

        • FABNet IPv4 Ext (Layer 3): Connect to FABRIC’s IPv4 internet with external access (manual)
        • FABNet IPv6 Ext (Layer 3): Connect to FABRIC’s IPv6 internet with external access (manual)

        Thanks,
        Komal

        in reply to: IPv6 on FABRIC: A hop with a low MTU #4580
        Komal Thareja
        Participant

          @yoursunny Thank you for sharing your script. We have updated the MTU settings across sites and were able to use your script for testing as well. However, because of the latest fablib changes for performance improvements, the script needed a small adjustment. Sharing the updated script here.

          Thanks,
          Komal

          in reply to: manual cleanup needed? #4579
          Komal Thareja
          Participant

            Hi Fengping,

            Thank you so much for reporting this issue. There was a bug that led to the same subnet being allocated to multiple slices. When a second slice was allocated the same subnet as yours, traffic stopped working for your slice.

            I have applied the fix for the bug on production. Could you please delete your slice and recreate it? Apologies for the inconvenience.
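To illustrate why a shared subnet is a problem, here is a minimal Python sketch that flags slices whose subnets overlap. The slice names and subnet values are hypothetical, and this is not the allocator's actual code.

```python
from ipaddress import ip_network
from itertools import combinations


def find_subnet_conflicts(allocations):
    """Return pairs of slice names whose subnets overlap.

    `allocations` maps a slice name to a subnet in CIDR notation.
    """
    conflicts = []
    for (name_a, net_a), (name_b, net_b) in combinations(allocations.items(), 2):
        if ip_network(net_a).overlaps(ip_network(net_b)):
            conflicts.append((name_a, name_b))
    return conflicts


# Hypothetical allocations: slice-a and slice-b were handed the same subnet.
allocations = {
    "slice-a": "10.128.1.0/24",
    "slice-b": "10.128.1.0/24",
    "slice-c": "10.128.2.0/24",
}
print(find_subnet_conflicts(allocations))
```

Any pair reported here would be competing for the same addresses, which matches the symptom of traffic stopping once the second slice came up.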

            Appreciate your help with making the system better.

            Thanks,
            Komal

            in reply to: manual cleanup needed? #4575
            Komal Thareja
            Participant

              Please try the following to create 12 VMs. It should let you use almost the entire worker with respect to cores. I will keep you posted about the flavor details.

              
              
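              # Assumes slice_name, network_name, and site are defined earlier in the notebook.
              from ipaddress import IPv4Network

              from fabrictestbed_extensions.fablib.fablib import FablibManager

              fablib = FablibManager()

              # Create the slice
              slice = fablib.new_slice(name=slice_name)

              # Layer 2 network shared by all nodes
              net1 = slice.add_l2network(name=network_name, subnet=IPv4Network("192.168.1.0/24"))

              node_name = "Node"
              number_of_nodes = 12
              for x in range(number_of_nodes):
                # The first node gets a larger disk than the rest
                disk = 4000 if x == 0 else 500
                node = slice.add_node(name=f'{node_name}{x}', site=site, cores=62, ram=128, disk=disk)
                iface = node.add_component(model='NIC_Basic', name='nic1').get_interfaces()[0]
                iface.set_mode('auto')
                net1.add_interface(iface)

              # Submit the slice request
              slice.submit()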
              

              Thanks,
              Komal

              in reply to: manual cleanup needed? #4572
              Komal Thareja
              Participant

                With the current flavor definition, I would recommend requesting VMs with the configuration:

                cores='62', ram='384', disk='2000'

                Anything bigger than this maps to fabric.c64.m384.d4000, and only one of the workers, cern-w1, can accommodate 4 TB disks; the rest of the workers can accommodate at most 2 TB disks. I will discuss this internally so that we can provide a better flavor to accommodate your slice.

                Thanks,

                Komal

                P.S: I was able to successfully create a slice with the above configuration.

                in reply to: manual cleanup needed? #4570
                Komal Thareja
                Participant

                  I looked at the instance types. Please try setting cores='62', ram='384', disk='100'.

                  FYI, this file might be useful for VM sizing: https://github.com/fabric-testbed/InformationModel/blob/master/fim/slivers/data/instance_sizes.json

                  Thanks,

                  Komal

                  in reply to: manual cleanup needed? #4569
                  Komal Thareja
                  Participant

                    I looked at your slices and found that you have 2 Dead slices and 6 Closing slices. All of them request VMs on a single site, CERN, and each requests either 120 or 60 cores. Regardless of the requested disk size, the requested cores/RAM determine the flavor, as listed below. Because other slices are also running at CERN, your slice cannot be accommodated by the CERN site alone. Please consider either spanning your slice across multiple sites or reducing the VM size, not only the disk but also the cores/RAM.

                    We currently have only a limited number of flavors, and your core/RAM request is being mapped to a flavor with a huge disk:

                    cores: 120, ram: 480 G ==> fabric.c64.m384.d4000

                    cores: 60, ram: 360 G ==> fabric.c60.m384.d2000

                    NOTE: No manual cleanup is needed; the software is behaving as designed.
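As a rough illustration of how a core/RAM request can end up with a large disk, here is a minimal Python sketch. It assumes a simple rule (pick the smallest flavor that covers the request, else fall back to the largest) and knows only the two flavors named above; the real catalog has more flavors and the actual mapping logic may differ.

```python
import re

FLAVOR_RE = re.compile(r"fabric\.c(\d+)\.m(\d+)\.d(\d+)")

# Only the two flavors named in this thread; the real catalog has more.
FLAVORS = ["fabric.c60.m384.d2000", "fabric.c64.m384.d4000"]


def parse_flavor(name):
    """Split 'fabric.c60.m384.d2000' into (cores, ram_gb, disk_gb)."""
    match = FLAVOR_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized flavor name: {name}")
    return tuple(int(group) for group in match.groups())


def map_to_flavor(cores, ram, flavors=FLAVORS):
    """Pick the smallest flavor covering the core/ram request (assumed rule).

    Disk is deliberately ignored: the disk size follows from the chosen
    flavor. If no flavor is large enough, fall back to the largest one.
    """
    ordered = sorted(flavors, key=parse_flavor)
    for name in ordered:
        flavor_cores, flavor_ram, _ = parse_flavor(name)
        if flavor_cores >= cores and flavor_ram >= ram:
            return name
    return ordered[-1]


for cores, ram in [(60, 360), (120, 480)]:
    flavor = map_to_flavor(cores, ram)
    disk = parse_flavor(flavor)[2]
    print(f"cores: {cores}, ram: {ram} G ==> {flavor} (disk {disk} G)")
```

Under this assumed rule, lowering only the disk in the request changes nothing, because the flavor is selected from cores and RAM.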

                    Thanks,

                    Komal

                    in reply to: manual cleanup needed? #4566
                    Komal Thareja
                    Participant

                      Hi Fengping,

                      Your second slice failed with the error "Insufficient resources", as shown below. Please note that slice deletion is not synchronous; it may take some time for all the resources associated with a slice to be released. Please consider adding a short delay between successive slice creation attempts when both slices request resources from the same site, since the first slice may not have released its resources yet.

                      Resource Type: VM Notices: Reservation 113cd41c-26df-461e-8dc9-f93ed92fcebf (Slice ServiceXSlice(66a78e70-ecf2-41e7-be12-740561904991) Graph Id:cc871ebc-e290-4b44-ab36-046d3cd2da00 Owner:fengping@uchicago.edu) is in state (Closed,None_) (Last ticket update: Insufficient resources : ['disk'])

                       

                      For the second slice, you can view the failure reasons in the portal by selecting the checkbox 'Include Dead/Closed Slices'.

                      Please try creating the slice again and let us know if you still see errors.
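One way to add such a delay is a small retry helper. This is only a sketch: `create_slice` is a hypothetical callable standing in for your own slice creation code, and the attempt count and delay are assumed values that you would tune for your site.

```python
import time


def create_with_retry(create_slice, attempts=3, delay_seconds=60, sleep=time.sleep):
    """Call `create_slice` until it succeeds, pausing between attempts.

    `create_slice` is a hypothetical zero-argument callable that raises an
    exception when the request fails (for example, on insufficient resources).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return create_slice()
        except Exception as error:  # broad on purpose for this sketch
            last_error = error
            if attempt < attempts:
                # Give the previous slice time to release its resources.
                sleep(delay_seconds)
    raise RuntimeError(f"slice creation failed after {attempts} attempts") from last_error


# Demo with a stand-in that fails once and then succeeds (no real delay).
calls = []


def fake_create():
    calls.append(1)
    if len(calls) < 2:
        raise RuntimeError("Insufficient resources : ['disk']")
    return "slice-created"


print(create_with_retry(fake_create, delay_seconds=0))
```

The `sleep` parameter exists only so the wait can be stubbed out when trying the helper; in a notebook you would leave it at its default.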

                       

                      Thanks,

                      Komal

                      in reply to: Unable to run old Jupyter notebooks #4563
                      Komal Thareja
                      Participant

                        Sorry for not mentioning this earlier. You can restart the container from File -> Hub Control Panel -> Stop My Server, and then log out and log back in.

                        in reply to: Unable to run old Jupyter notebooks #4561
                        Komal Thareja
                        Participant

                          Could you please remove the entries for fabrictestbed-extensions from ~/work/fabric_config/requirements.txt and restart your container? Please try your notebooks after the restart.
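If you prefer to script the edit, the following Python sketch removes matching lines from a requirements file. The function name and the prefix-matching rule are my own assumptions; please review the removed lines before restarting.

```python
from pathlib import Path


def drop_package(requirements_path, package="fabrictestbed-extensions"):
    """Remove lines that pin `package` from a requirements file.

    Returns the lines that were removed so they can be reviewed.
    """
    path = Path(requirements_path).expanduser()
    kept, removed = [], []
    for line in path.read_text().splitlines():
        # Normalize so 'fabrictestbed_extensions==1.4.6' also matches.
        normalized = line.strip().lower().replace("_", "-")
        if normalized.startswith(package):
            removed.append(line)
        else:
            kept.append(line)
    path.write_text("\n".join(kept) + ("\n" if kept else ""))
    return removed


# Example call (uncomment inside the Jupyter Hub container):
# print(drop_package("~/work/fabric_config/requirements.txt"))
```

The match is a simple prefix check, so commented-out lines are left untouched.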

                          Thanks,

                          Komal

                          in reply to: Unable to run old Jupyter notebooks #4559
                          Komal Thareja
                          Participant

                            Hello Acheme,

                            Could you please share the output of the following commands? Also, which container are you using?

                            pip list | grep fabric

                            cat ~/work/fabric_config/requirements.txt

                            Thanks,

                            Komal

                            in reply to: Unable to create slices. #4553
                            Komal Thareja
                            Participant

                              Hello Thushari,

                              We are looking into this and have found that, in a race condition when list_resources times out, you may see the output you shared. We are debugging this and will work on a fix. In the meantime, could you please explicitly pass a site name when adding a node to your slice and try again?

                              Also, please try both the 1.4.6 and 1.5.1 containers and let us know whether either of them works.

                              Thanks,

                              Komal

                              Komal Thareja
                              Participant

                                It looks like an older version of fablib is running in your container. Could you please ensure that there are no entries for fabrictestbed-extensions in fabric_config/requirements.txt? Please restart your Jupyter Hub (JH) container after that and try the notebook again.

                                The versions should look something like the following for the 1.4.6 and 1.5.1 container options.

                                For 1.4.6 container:

                                $ pip list | grep fabric
                                fabric 3.1.0
                                fabric-credmgr-client 1.3.2
                                fabric-fim 1.4.14
                                fabric-fss-utils 1.4.0
                                fabric-orchestrator-client 1.4.7
                                fabrictestbed 1.4.7
                                fabrictestbed-extensions 1.4.6

                                For 1.5.1 container:


                                $ pip list | grep fabric
                                fabric 3.1.0
                                fabric-credmgr-client 1.5.0
                                fabric_fim 1.5.2
                                fabric_fss_utils 1.5.0
                                fabric-orchestrator-client 1.5.1
                                fabrictestbed 1.5.1
                                fabrictestbed-extensions 1.5.1
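The same check can be run from a notebook cell instead of a terminal. This is a minimal sketch using only the Python standard library; the function name is my own.

```python
from importlib.metadata import distributions


def fabric_packages():
    """Return {name: version} for installed distributions with 'fabric' in the name."""
    found = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if name and "fabric" in name.lower():
            found[name] = dist.version
    return found


for name, version in sorted(fabric_packages().items()):
    print(name, version)
```

It reports what the running kernel can import, which is useful if the terminal and the notebook happen to use different environments.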

                                 

                                Thanks,

                                Komal

                                Komal Thareja
                                Participant

                                  Hello Nagmat,

                                  Could you please share the output of the following command, run from a terminal window in your Jupyter Hub (JH) container?

                                  pip list | grep fabric

                                   

                                  Thanks,

                                  Komal
