Nishanth Shyamkumar

Forum Replies Created

Viewing 15 posts - 1 through 15 (of 18 total)
  • in reply to: How to use long-lived tokens in experiments #7283
    Nishanth Shyamkumar
    Participant

      Thanks Komal, I tested it and it is working without any issues after the update.

      in reply to: No candidate nodes found error #7244
      Nishanth Shyamkumar
      Participant

        Thanks for the info. Is there a way to get the maintenance status of a site through an API, or must users keep track of it through forum updates?

        in reply to: Slice resubmit fails with already configured error. #7243
        Nishanth Shyamkumar
        Participant

          Hi Komal,

          Here is a code snippet. It’s a bit complex, since there are some design mechanisms at play here. However, the essential part is:
          There is a while loop that attempts to set up the slice and request port mirror resources by invoking setup_slice(). If it fails, the failed slice is deleted, and in the next attempt the number of VMs requested is reduced and the slice creation is requested once again.

          import random
          import time

          # port_reduce_count matches the call in the retry loop below
          # (assumed to be consumed inside the elided block).
          def setup_slice(port_reduce_count):
              # …
              # A block of code that checks for available SmartNICs and assigns a VM to each.
              # Splits the total available switch ports on one site into N groups, where N is the number of VMs.
              # Specifies other resources like CPU, RAM, etc.
              # …
              pmnet = {}
              num_pmservices = {}     # Track mirrored port count per VM
              listener_pmservice_name = {}
              ports_mirrored = {}     # Track mirrored port count per site
              random.seed(None, 2)
              for listener_site in listener_sites:
                  pmnet[listener_site] = []
                  # To keep track of ports mirrored on each site, within the port list
                  ports_mirrored[listener_site] = 0
                  j = 0
                  max_active_ports = port_count[listener_site]
                  for listener_node in listener_nodes[listener_site]:
                      k = 0
                      listener_interface_idx = 0
                      listener_pmservice_name[listener_node] = []
                      node_name = listener_node_name[listener_site][j]
                      avail_port_node_maxcnt = len(mod_port_list[listener_site][node_name])  # Each node (VM) monitors an assigned fraction of the total available ports.
                      for listener_interface in listener_interfaces[node_name]:
                          # print(f'listener_interface = {listener_interface}')
                          if listener_interface_idx % 2 == 0:
                              random_index = random.randint(0, int(avail_port_node_maxcnt / 2 - 1))   # first listener interface of NIC randomizes within the first half
                          else:
                              random_index = random.randint(int(avail_port_node_maxcnt / 2), avail_port_node_maxcnt - 1)  # second listener interface randomizes within the second half
                          listener_interface_idx += 1
                          if ports_mirrored[listener_site] < max_active_ports:
                              listener_pmservice_name[listener_node].append(f'{listener_site}_{node_name}_pmservice{ports_mirrored[listener_site]}')
                              pmnet[listener_site].append(pmslice.add_port_mirror_service(name=listener_pmservice_name[listener_node][k],
                                                    mirror_interface_name=mod_port_list[listener_site][node_name][random_index],
                                                    receive_interface=listener_interface,
                                                    mirror_direction=listener_direction[listener_site]))
                              # The with block closes the log file on exit
                              with open(startup_log_file, "a") as slog:
                                  slog.write(f"{listener_site}# mirror interface name: {mod_port_list[listener_site][node_name][random_index]} mirrored to {listener_interface}\n")
                              ports_mirrored[listener_site] = ports_mirrored[listener_site] + 1
                              k = k + 1
                          else:
                              with open(startup_log_file, "a") as slog:
                                  slog.write("No more ports available for mirroring\n")
                              break   # stop iterating over this node's listener interfaces
                      j = j + 1
                      num_pmservices[listener_node] = k

          # Submit Slice Request
          port_reduce_count = 0
          retry = 0
          while retry != 1:
              try:
                  setup_slice(port_reduce_count)
                  pmslice.submit(progress=True, wait_timeout=2400, wait_interval=120)
                  if pmslice.get_state() == "StableError":
                      raise Exception("Slice state is StableError")
                  retry = 1
              except Exception as e:
                  # Delete the failed slice before the next attempt
                  if pmslice.get_state() == "StableError":
                      fablib.delete_slice(listener_slice_name)
                  else:
                      pmslice.delete()
                  time.sleep(120)
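
The retry pattern described in this post (submit, delete on failure, request fewer resources, try again) can be sketched independently of fablib. This is only a minimal sketch under stated assumptions: `submit_with_reduction`, `build_and_submit`, `cleanup`, and `max_reduce` are hypothetical names for illustration, not part of fablib or of my actual script, and the step that increases the reduction count between attempts is written out explicitly here.

```python
import time


def submit_with_reduction(build_and_submit, cleanup, max_reduce,
                          delay_s=120.0, sleep=time.sleep):
    """Call build_and_submit(reduce_count) with a growing reduce_count.

    build_and_submit: hypothetical callable that builds and submits the
        request and raises an exception on failure.
    cleanup: hypothetical callable that receives the exception and
        deletes whatever the failed attempt left behind.
    Returns the reduce_count that succeeded, or None if every attempt failed.
    """
    for reduce_count in range(max_reduce + 1):
        try:
            build_and_submit(reduce_count)
            return reduce_count
        except Exception as exc:
            # Remove the failed request, then wait before asking for less.
            cleanup(exc)
            sleep(delay_s)
    return None


if __name__ == "__main__":
    def fake_submit(reduce_count):
        # Stand-in for a real submission: succeeds once two units are dropped.
        if reduce_count < 2:
            raise RuntimeError("insufficient resources")

    print(submit_with_reduction(fake_submit, lambda exc: None,
                                max_reduce=5, delay_s=0))
```

Bounding the loop with `max_reduce` avoids retrying forever when no reduced request can succeed.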


          in reply to: How to use long-lived tokens in experiments #7207
          Nishanth Shyamkumar
          Participant

            Hi Komal,

            I tried this and it still does not work. Here are the fabric packages in my environment:

            [code]

            pip list | grep fab
            fabric-credmgr-client 1.6.1
            fabric_fim 1.6.1
            fabric_fss_utils 1.5.1
            fabric-orchestrator-client 1.6.1
            fabrictestbed 1.6.9
            fabrictestbed-extensions 1.6.5

            [/code]

            fabrictestbed is at 1.6.9, yet slice_manager.py, and specifically __load_tokens, still has the refresh token exception check.
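
The same check can be done from inside the Python environment that actually runs the experiment, which helps rule out a mismatch between the shell's `pip` and the interpreter in use. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the import name `fabrictestbed` in the usage part is an assumption about how the package is laid out.

```python
from importlib.metadata import PackageNotFoundError, version
from importlib.util import find_spec


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def module_origin(module_name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = find_spec(module_name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print(installed_versions(["fabrictestbed", "fabrictestbed-extensions"]))
    # Assumed import name; shows which copy of the source is really in use.
    print(module_origin("fabrictestbed"))
```

If the reported path points somewhere unexpected, the file being inspected may not be the one the interpreter imports.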

            in reply to: How to use long-lived tokens in experiments #7125
            Nishanth Shyamkumar
            Participant

              Hi Komal,

              Looking at the source code, the required change in slice_manager.py is not present on the main branch. It is available in the other branches: adv-res, llt and 1.7.
              Should I use one of these branches to use the long-lived tokens?
              Essentially:
              pip install git+https://github.com/fabric-testbed/fabrictestbed@1.7

              in reply to: How to use long-lived tokens in experiments #7093
              Nishanth Shyamkumar
              Participant

                Hi Komal,

                I am using fablib from within a Python program. Can you let me know which branch of fabrictestbed-extensions I should use to get this change? Is it the main branch, or branch 1.7?

                pip install git+https://github.com/fabric-testbed/fabrictestbed-extensions@main

                in reply to: TACC always failing with insufficient resources:Disk# #7059
                Nishanth Shyamkumar
                Participant

                  Thanks, so it does indeed stand for disk space.

                  When I look at the graphical stats on the FABRIC Portal, it shows that TACC has 103263/107463 GB free (it may not be the latest info, but I don’t think it varies by much). How can I ask FABRIC to place my VM on an underlying server that has enough disk space?

                  in reply to: Multi-day FABRIC maintenance (January 1-5, 2024) #6223
                  Nishanth Shyamkumar
                  Participant

                    “These 4 sites will be placed in pre-maintenance mode several days in advance so that no new experiments can be created after the indicated date. We apologize for any inconvenience this may cause.”

                    Which date is the indicated date mentioned here? Is it the date on which these sites go into pre-maintenance, or is it Jan 1st? In other words, can I create new slivers on these sites until Jan 1st?

                    in reply to: Lack of space in Server Filesystem #6220
                    Nishanth Shyamkumar
                    Participant

                      Thanks Ilya. This solution worked. It requires a bit of manual intervention, but is still mostly scriptable. It can possibly be fully scripted using the Fablib APIs, and I will look into that when time permits.

                      in reply to: Timeout while creating slice #6209
                      Nishanth Shyamkumar
                      Participant

                        Thanks Paul. I looked into the source code and saw that the ‘main’ branch includes a change that propagates the wait_timeout parameter to the self.wait() function call. However, that change is available only on ‘beyond bleeding’, while I was testing on the ‘bleeding’ framework.

                        I am sticking with the ‘bleeding’ framework for now, because the code is structured so that wait_timeout propagates to self.wait() when progress=True (which is the default). I was testing earlier with progress=False, since I didn’t want the overhead of the GUI representation of the data, but I have to use progress=True for now at least.

                        I tested with progress=True and wait_timeout=2400, and it works for now. The slice submission takes between 1000 and 1500 seconds to complete, but it does succeed in the end.
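
For readers unfamiliar with how a timeout and a polling interval interact, here is a generic sketch of the idea. It is not fablib's implementation of wait(); `wait_until`, `check`, `clock`, and `sleep` are hypothetical names, and the clock and sleep functions are injectable only so the logic can be exercised without real delays.

```python
import time


def wait_until(check, timeout_s, interval_s,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() every interval_s seconds until it is true or time runs out.

    Returns True if check() succeeded before the deadline, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated clock: the condition becomes true after 1200 "seconds".
    now = {"t": 0.0}
    ok = wait_until(lambda: now["t"] >= 1200,
                    timeout_s=2400, interval_s=120,
                    clock=lambda: now["t"],
                    sleep=lambda s: now.__setitem__("t", now["t"] + s))
    print(ok, now["t"])
```

With a submission that needs 1000 to 1500 seconds, any timeout shorter than that makes the wait give up even though the request would eventually succeed.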

                        in reply to: Port mirroring issue for Bundle-Ether ports #6206
                        Nishanth Shyamkumar
                        Participant

                          Thanks for the information.

                          in reply to: pmservice issue for multiple uplink ports #6169
                          Nishanth Shyamkumar
                          Participant

                            Xi, following your guidance, I double-checked the ports. AMST is using 2 VLANs on the same port, and my code was treating them as 2 different uplink ports, which is where the duplication happened. I fixed it on my side and now AMST is being provisioned successfully.

                            I was unable to recreate the above issue in SEAT at the moment. If I see this issue in SEAT again, I will post an update on this thread. Thanks for the help.

                            in reply to: Download file to local system from Jupyter notebook #6154
                            Nishanth Shyamkumar
                            Participant

                              Thanks, I agree with you, Ilya. I have seen some of the APIs in the link you shared, and they can delete files, shut down the kernel, etc., which is a huge security risk.
                              It’s probably more pragmatic to download using the right-click option in the UI.

                              in reply to: Download file to local system from Jupyter notebook #6152
                              Nishanth Shyamkumar
                              Participant

                                Thanks Ilya for the very useful answer. I was able to download my compressed file with a GET request via curl using this method.

                                Some additional information for others:
                                1) To access the Hub Control Panel in the Jupyter notebook, click on ‘File’ -> ‘Hub Control Panel’.

                                2) From here, click on ‘Token’ -> ‘Request New API token’ and save the generated token, which can be used in the curl request as a header (-H).

                                I still have to look into generating tokens without manual intervention, which would close the automation loop. If there are any updates I will post them here.
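
For anyone scripting this in Python rather than curl, the request could be built roughly as follows. This is a sketch under assumptions: the `Authorization: token <value>` header is the usual Jupyter convention, but the `/files/<path>` URL layout, the example hub address, and the helper name `build_download_request` are assumptions for illustration and should be checked against your own hub.

```python
from urllib.parse import quote
from urllib.request import Request


def build_download_request(base_url, file_path, token):
    """Build a GET request for a file, passing the API token as a header.

    base_url: address of your single-user server (assumed layout).
    file_path: path of the file relative to the server root.
    token: the API token generated from the Hub Control Panel.
    """
    url = f"{base_url.rstrip('/')}/files/{quote(file_path)}"
    return Request(url, headers={"Authorization": f"token {token}"})


if __name__ == "__main__":
    # Hypothetical address and token; nothing is sent over the network here.
    req = build_download_request("https://hub.example.org/user/me",
                                 "work/results.tar.gz", "YOUR_TOKEN")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the download; the token should be kept out of any shared notebook.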

                                in reply to: Bmv2 max performance in FABRIC #5493
                                Nishanth Shyamkumar
                                Participant

                                  Hi,

                                  I was able to get ~1 Gbps performance with TCP. However, there is always a slight mismatch between packets sent and received as reported by iperf3, with a couple of retries during the first second.
                                  I used a UDP connection instead, and I consistently see a noticeable amount of loss in the first second across runs.
                                  I am using the same notebook as mentioned above, but replaced the client command with the following:
                                  iperf3 -c 192.168.2.10 -u -l 1300 -b 600M

                                  I have pasted the results for multiple runs with the above values. Is there an explanation for this drop, which occurs mostly in the first second?

                                  Accepted connection from 192.168.1.10, port 36772
                                  [ 5] local 192.168.2.10 port 5201 connected to 192.168.1.10 port 33996
                                  [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
                                  [ 5] 0.00-1.00 sec 69.7 MBytes 585 Mbits/sec 0.044 ms 501/56731 (0.88%)
                                  [ 5] 1.00-2.00 sec 71.5 MBytes 600 Mbits/sec 0.052 ms 0/57691 (0%)
                                  [ 5] 2.00-3.00 sec 71.5 MBytes 600 Mbits/sec 0.052 ms 0/57696 (0%)
                                  [ 5] 3.00-4.00 sec 71.5 MBytes 600 Mbits/sec 0.049 ms 0/57690 (0%)
                                  [ 5] 4.00-5.00 sec 71.5 MBytes 599 Mbits/sec 0.052 ms 58/57694 (0.1%)
                                  [ 5] 5.00-6.00 sec 71.5 MBytes 600 Mbits/sec 0.063 ms 0/57691 (0%)
                                  [ 5] 6.00-7.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57694 (0%)
                                  [ 5] 7.00-8.00 sec 71.5 MBytes 600 Mbits/sec 0.055 ms 0/57691 (0%)
                                  [ 5] 8.00-9.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57693 (0%)
                                  [ 5] 9.00-10.00 sec 71.5 MBytes 600 Mbits/sec 0.018 ms 0/57709 (0%)
                                  [ 5] 10.00-10.02 sec 1.12 MBytes 600 Mbits/sec 0.004 ms 0/901 (0%)
                                  - - - - - - - - - - - - - - - - - - - - - - - - -
                                  [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
                                  [ 5] 0.00-10.02 sec 715 MBytes 598 Mbits/sec 0.004 ms 559/576881 (0.097%) receiver
                                  -----------------------------------------------------------
                                  Server listening on 5201
                                  -----------------------------------------------------------
                                  Accepted connection from 192.168.1.10, port 46278
                                  [ 5] local 192.168.2.10 port 5201 connected to 192.168.1.10 port 39296
                                  [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
                                  [ 5] 0.00-1.00 sec 69.6 MBytes 584 Mbits/sec 0.058 ms 598/56731 (1.1%)
                                  [ 5] 1.00-2.00 sec 71.5 MBytes 600 Mbits/sec 0.055 ms 0/57692 (0%)
                                  [ 5] 2.00-3.00 sec 71.5 MBytes 600 Mbits/sec 0.053 ms 0/57693 (0%)
                                  [ 5] 3.00-4.00 sec 71.5 MBytes 599 Mbits/sec 0.112 ms 0/57634 (0%)
                                  [ 5] 4.00-5.00 sec 71.6 MBytes 601 Mbits/sec 0.041 ms 5/57751 (0.0087%)
                                  [ 5] 5.00-6.00 sec 71.5 MBytes 600 Mbits/sec 0.064 ms 0/57690 (0%)
                                  [ 5] 6.00-7.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57694 (0%)
                                  [ 5] 7.00-8.00 sec 71.5 MBytes 600 Mbits/sec 0.043 ms 0/57690 (0%)
                                  [ 5] 8.00-9.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57695 (0%)
                                  [ 5] 9.00-10.00 sec 71.5 MBytes 600 Mbits/sec 0.063 ms 0/57692 (0%)
                                  [ 5] 10.00-10.02 sec 1.13 MBytes 610 Mbits/sec 0.013 ms 0/914 (0%)
                                  - - - - - - - - - - - - - - - - - - - - - - - - -
                                  [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
                                  [ 5] 0.00-10.02 sec 714 MBytes 598 Mbits/sec 0.013 ms 603/576876 (0.1%) receiver
                                  -----------------------------------------------------------
                                  Server listening on 5201
                                  -----------------------------------------------------------
                                  Accepted connection from 192.168.1.10, port 42888
                                  [ 5] local 192.168.2.10 port 5201 connected to 192.168.1.10 port 36183
                                  [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
                                  [ 5] 0.00-1.00 sec 70.3 MBytes 590 Mbits/sec 0.063 ms 0/56730 (0%)
                                  [ 5] 1.00-2.00 sec 71.5 MBytes 600 Mbits/sec 0.052 ms 0/57694 (0%)
                                  [ 5] 2.00-3.00 sec 71.5 MBytes 600 Mbits/sec 0.061 ms 0/57691 (0%)
                                  [ 5] 3.00-4.00 sec 71.5 MBytes 600 Mbits/sec 0.054 ms 0/57693 (0%)
                                  [ 5] 4.00-5.00 sec 71.5 MBytes 600 Mbits/sec 0.051 ms 0/57693 (0%)
                                  [ 5] 5.00-6.00 sec 71.5 MBytes 600 Mbits/sec 0.063 ms 0/57691 (0%)
                                  [ 5] 6.00-7.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57692 (0%)
                                  [ 5] 7.00-8.00 sec 71.5 MBytes 600 Mbits/sec 0.048 ms 0/57695 (0%)
                                  [ 5] 8.00-9.00 sec 71.5 MBytes 600 Mbits/sec 0.049 ms 0/57690 (0%)
                                  [ 5] 9.00-10.00 sec 71.5 MBytes 600 Mbits/sec 0.062 ms 0/57691 (0%)
                                  [ 5] 10.00-10.02 sec 1.07 MBytes 577 Mbits/sec 0.053 ms 0/867 (0%)
                                  - - - - - - - - - - - - - - - - - - - - - - - - -
                                  [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
                                  [ 5] 0.00-10.02 sec 715 MBytes 599 Mbits/sec 0.053 ms 0/576827 (0%) receiver
                                  -----------------------------------------------------------
                                  Server listening on 5201
                                  -----------------------------------------------------------
                                  Accepted connection from 192.168.1.10, port 53714
                                  [ 5] local 192.168.2.10 port 5201 connected to 192.168.1.10 port 46178
                                  [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
                                  [ 5] 0.00-1.00 sec 69.1 MBytes 580 Mbits/sec 0.056 ms 965/56730 (1.7%)
                                  [ 5] 1.00-2.00 sec 71.5 MBytes 600 Mbits/sec 0.045 ms 0/57695 (0%)
                                  [ 5] 2.00-3.00 sec 71.5 MBytes 600 Mbits/sec 0.050 ms 0/57689 (0%)
                                  [ 5] 3.00-4.00 sec 71.5 MBytes 600 Mbits/sec 0.049 ms 0/57693 (0%)
                                  [ 5] 4.00-5.00 sec 71.5 MBytes 600 Mbits/sec 0.048 ms 0/57692 (0%)
                                  [ 5] 5.00-6.00 sec 71.5 MBytes 600 Mbits/sec 0.045 ms 0/57692 (0%)
                                  [ 5] 6.00-7.00 sec 71.5 MBytes 600 Mbits/sec 0.042 ms 0/57695 (0%)
                                  [ 5] 7.00-8.00 sec 71.5 MBytes 600 Mbits/sec 0.048 ms 0/57688 (0%)
                                  [ 5] 8.00-9.00 sec 71.5 MBytes 600 Mbits/sec 0.051 ms 0/57693 (0%)
                                  [ 5] 9.00-10.00 sec 71.5 MBytes 600 Mbits/sec 0.054 ms 0/57692 (0%)
                                  [ 5] 10.00-10.02 sec 1.07 MBytes 574 Mbits/sec 0.054 ms 0/867 (0%)
                                  - - - - - - - - - - - - - - - - - - - - - - - - -
                                  [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
                                  [ 5] 0.00-10.02 sec 714 MBytes 598 Mbits/sec 0.054 ms 965/576826 (0.17%) receiver
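
To compare runs without reading the tables by eye, the Lost/Total column can be pulled out of each row with a small parser. This is a sketch that assumes the receiver-side row format shown above; the function names are hypothetical. It also includes the arithmetic for the expected datagram rate: 600 Mbit/s with 1300-byte payloads works out to roughly 57692 datagrams per second, which agrees with the per-second totals in the steady-state rows.

```python
import re

# Matches rows such as:
# "[ 5] 0.00-1.00 sec 69.7 MBytes 585 Mbits/sec 0.044 ms 501/56731 (0.88%)"
ROW = re.compile(r"(\d+\.\d+)-(\d+\.\d+)\s+sec\s.*?\s(\d+)/(\d+)\s+\(")


def parse_row(line):
    """Return (start_s, end_s, lost, total) for a result row, else None."""
    match = ROW.search(line)
    if match is None:
        return None
    start, end = float(match.group(1)), float(match.group(2))
    lost, total = int(match.group(3)), int(match.group(4))
    return start, end, lost, total


def loss_percent(lost, total):
    """Percentage of datagrams lost; 0.0 when nothing was counted."""
    return 100.0 * lost / total if total else 0.0


def expected_datagrams_per_second(bitrate_bps, payload_bytes):
    """Datagram rate implied by a target bitrate and a payload size."""
    return bitrate_bps / (payload_bytes * 8)


if __name__ == "__main__":
    row = parse_row("[ 5] 0.00-1.00 sec 69.7 MBytes 585 Mbits/sec "
                    "0.044 ms 501/56731 (0.88%)")
    print(row, round(loss_percent(row[2], row[3]), 2))
    print(round(expected_datagrams_per_second(600e6, 1300)))
```

Summary rows (0.00-10.02 sec) match the same pattern, so filter on the interval length if only the per-second rows are wanted.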
