
Nishanth Shyamkumar

Forum Replies Created

Viewing 15 posts - 1 through 15 (of 20 total)
  • in reply to: bastion key fails authentication #8252
    Nishanth Shyamkumar
    Participant

      Hi,

      I generated a fresh keypair and it works now. Thanks.

      in reply to: Infrastructure-metrics queries #7888
      Nishanth Shyamkumar
      Participant

        Hi,

        A follow-up question on this:
        1) Does this mean that the HC counter always holds the correct value?
        2) What happens to a non-HC counter when it exceeds 32 bits? Does it get set to 2^32 - 1, or does it overflow so that we see the remainder (true value % 2^32) in this field?
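To make the two possibilities in question 2 concrete, here is a small sketch of what a reader of the counter would observe in each case. The function names are hypothetical, and the sketch does not claim which behaviour the non-HC counters actually follow; it only illustrates the difference, plus how a delta could be recovered if the counter wraps.

```python
COUNTER32_MAX = 2**32 - 1  # largest value a 32-bit unsigned counter can hold


def saturating_read(true_value: int) -> int:
    """Hypothesis A: the counter sticks at 2^32 - 1 once it is exceeded."""
    return min(true_value, COUNTER32_MAX)


def wrapping_read(true_value: int) -> int:
    """Hypothesis B: the counter overflows and shows true_value % 2^32."""
    return true_value % (2**32)


def wrapped_delta(previous: int, current: int) -> int:
    """Difference between two samples of a wrapping 32-bit counter,
    assuming at most one wrap happened between the samples."""
    return (current - previous) % (2**32)
```

If the counter wraps, rates computed from consecutive samples stay correct as long as the polling interval is short enough that it cannot wrap twice between samples.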

        in reply to: How to use long-lived tokens in experiments #7283
        Nishanth Shyamkumar
        Participant

          Thanks Komal, I tested it and it is working without any issues after the update.

          in reply to: No candidate nodes found error #7244
          Nishanth Shyamkumar
          Participant

            Thanks for the info. Is there a way to get the maintenance status of a site through an API, or must the user keep track of it through forum updates?

            in reply to: Slice resubmit fails with already configured error. #7243
            Nishanth Shyamkumar
            Participant

              Hi Komal,

              Here is a code snippet. It’s a bit complex since there are some design mechanisms at play here. However, the essential part is:
              There is a while loop that attempts to set up the slice and request port mirror resources by invoking setup_slice(). If it fails, the failed slice is deleted, and on the next attempt the number of VMs requested is reduced and the slice creation is requested again.

               

               

               

              import random
              import time

              def setup_slice(port_reduce_count):
                  # …
                  # A block of code that checks for available SmartNICs and assigns a VM to each.
                  # Splits the total available switch ports on one site into N groups, where N is the number of VMs.
                  # Specifies other resources such as CPU, RAM, etc.
                  # …
                  pmnet = {}
                  num_pmservices = {}     # Track mirrored port count per VM
                  listener_pmservice_name = {}
                  ports_mirrored = {}     # Track mirrored port count per site
                  random.seed(None, 2)
                  for listener_site in listener_sites:
                      pmnet[listener_site] = []
                      # Keep track of ports mirrored on each site, within the port list
                      ports_mirrored[listener_site] = 0
                      j = 0
                      max_active_ports = port_count[listener_site]
                      for listener_node in listener_nodes[listener_site]:
                          k = 0
                          listener_interface_idx = 0
                          listener_pmservice_name[listener_node] = []
                          node_name = listener_node_name[listener_site][j]
                          # Each node (VM) monitors an assigned fraction of the total available ports.
                          avail_port_node_maxcnt = len(mod_port_list[listener_site][node_name])
                          for listener_interface in listener_interfaces[node_name]:
                              # print(f'listener_interface = {listener_interface}')
                              if listener_interface_idx % 2 == 0:
                                  # First listener interface of the NIC randomizes within the first half
                                  random_index = random.randint(0, int(avail_port_node_maxcnt / 2 - 1))
                              else:
                                  # Second listener interface randomizes within the second half
                                  random_index = random.randint(int(avail_port_node_maxcnt / 2), avail_port_node_maxcnt - 1)
                              listener_interface_idx += 1
                              if ports_mirrored[listener_site] < max_active_ports:
                                  listener_pmservice_name[listener_node].append(f'{listener_site}_{node_name}_pmservice{ports_mirrored[listener_site]}')
                                  pmnet[listener_site].append(pmslice.add_port_mirror_service(name=listener_pmservice_name[listener_node][k],
                                                        mirror_interface_name=mod_port_list[listener_site][node_name][random_index],
                                                        receive_interface=listener_interface,
                                                        mirror_direction=listener_direction[listener_site]))
                                  # The with block closes the log file; no explicit close() is needed
                                  with open(startup_log_file, "a") as slog:
                                      slog.write(f"{listener_site}# mirror interface name: {mod_port_list[listener_site][node_name][random_index]} mirrored to {listener_interface}\n")
                                  ports_mirrored[listener_site] += 1
                                  k += 1
                              else:
                                  with open(startup_log_file, "a") as slog:
                                      slog.write("No more ports available for mirroring\n")
                                  break
                          j += 1
                          num_pmservices[listener_node] = k
              # Submit slice request
              port_reduce_count = 0
              retry = 0
              while retry != 1:
                  try:
                      setup_slice(port_reduce_count)
                      pmslice.submit(progress=True, wait_timeout=2400, wait_interval=120)
                      if pmslice.get_state() == "StableError":
                          raise Exception("Slice state is StableError")
                      retry = 1
                  except Exception as e:
                      # Delete the failed slice before the next attempt
                      if pmslice.get_state() == "StableError":
                          fablib.delete_slice(listener_slice_name)
                      else:
                          pmslice.delete()
                      time.sleep(120)
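The retry-and-reduce pattern described at the top of this post can be isolated from the FABRIC-specific calls. The following is only a minimal sketch of that pattern, not the actual experiment code: `build_and_submit` and `cleanup` are hypothetical callables standing in for "set up and submit the slice" and "delete the failed slice", and `max_reduce` is an assumed upper bound on how far the request may shrink.

```python
import time


def submit_with_reduction(build_and_submit, cleanup, max_reduce, delay_s=120):
    """Retry slice creation, requesting fewer resources after each failure.

    build_and_submit(reduce_by) and cleanup() are hypothetical callables.
    Returns the reduction at which submission succeeded, or None if every
    attempt up to max_reduce failed.
    """
    for reduce_by in range(max_reduce + 1):
        try:
            build_and_submit(reduce_by)
            return reduce_by
        except Exception as exc:
            print(f"attempt with reduce_by={reduce_by} failed: {exc}")
            cleanup()  # remove the failed slice before retrying
            if reduce_by < max_reduce:
                time.sleep(delay_s)  # give the testbed time to release resources
    return None
```

Bounding the loop with `max_reduce` avoids retrying forever when the site simply cannot satisfy even the smallest request.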

               

               

               

               

              in reply to: How to use long-lived tokens in experiments #7207
              Nishanth Shyamkumar
              Participant

                Hi Komal,

                I tried this and it still does not work. Here are the fabric packages in my environment:

                [code]

                pip list | grep fab
                fabric-credmgr-client 1.6.1
                fabric_fim 1.6.1
                fabric_fss_utils 1.5.1
                fabric-orchestrator-client 1.6.1
                fabrictestbed 1.6.9
                fabrictestbed-extensions 1.6.5

                [/code]

                fabrictestbed is at 1.6.9, yet slice_manager.py, and specifically __load_tokens, still has the refresh token Exception check.
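The same version check can be done from inside the Python program that uses fablib, which helps confirm that the interpreter running the experiment sees the same packages as the shell. This is a small sketch using only the standard library; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(
        ["fabrictestbed", "fabrictestbed-extensions"]
    ).items():
        print(f"{name}: {version}")
```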

                in reply to: How to use long-lived tokens in experiments #7125
                Nishanth Shyamkumar
                Participant

                  Hi Komal,

                  Looking at the source code, the required change in slice_manager.py is not present on the main branch. It is available in other branches: adv-res, llt and 1.7.
                  Should I use one of these branches to use the long-lived tokens?
                  Essentially:
                  pip install git+https://github.com/fabric-testbed/fabrictestbed@1.7

                  in reply to: How to use long-lived tokens in experiments #7093
                  Nishanth Shyamkumar
                  Participant

                    Hi Komal,

                    I am using fablib from within a Python program. Can you let me know which branch of fabrictestbed-extensions I should use to get this change? Is it the main branch, or branch 1.7?

                    pip install git+https://github.com/fabric-testbed/fabrictestbed-extensions@main

                    in reply to: TACC always failing with insufficient resources:Disk# #7059
                    Nishanth Shyamkumar
                    Participant

                      Thanks, so it does indeed stand for disk space.

                      When I look at the graphical stats on the FABRIC Portal, it says that TACC has 103263/107463 GB free (it may not be the latest info, but I don’t think it varies by much). How can I ask FABRIC to place my VM on an underlying server that has enough disk space?

                      in reply to: Multi-day FABRIC maintenance (January 1-5, 2024) #6223
                      Nishanth Shyamkumar
                      Participant

                        “These 4 sites will be placed in pre-maintenance mode several days in advance so that no new experiments can be created after the indicated date. We apologize for any inconvenience this may cause.”

                        Which is the indicated date mentioned here? Is it the date that these sites go into pre-maintenance or is it Jan 1st? In other words, can I create new slivers on these sites until Jan 1st?

                        in reply to: Lack of space in Server Filesystem #6220
                        Nishanth Shyamkumar
                        Participant

                          Thanks Ilya. This solution worked. It requires a slight bit of manual intervention, but is still mostly scriptable. It can possibly be fully scripted as well using the Fablib APIs, and I will look into that when time permits.

                          in reply to: Timeout while creating slice #6209
                          Nishanth Shyamkumar
                          Participant

                            Thanks Paul. I looked into the source code for this and saw that the ‘main’ branch includes a change that propagates the wait_timeout parameter to the self.wait() function call. However, it is available only on the ‘beyond bleeding’ framework, while I was testing on ‘bleeding’.

                            I am sticking with the ‘bleeding’ framework for now because the code is structured such that if I set progress=True (which is the default), wait_timeout does propagate to self.wait(). I was testing earlier with progress=False, since I didn’t want the overhead of the GUI representation of the data, but I have to use it for now at least.

                            I tested with progress=True and wait_timeout=2400 and it works for now. The slice submission takes between 1000 and 1500 seconds to complete, but it does succeed in the end.
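For readers unfamiliar with how a timeout and a polling interval interact, here is a generic polling loop. It is an illustration only and not fablib's implementation; `get_state` is a hypothetical callable, and `timeout_s` and `interval_s` are assumed to play the roles that wait_timeout and wait_interval play in the submit call.

```python
import time


def wait_for_state(get_state, target, timeout_s, interval_s, sleep=time.sleep):
    """Poll get_state() every interval_s seconds until it returns target.

    Returns True on success, False if timeout_s elapses first.
    The sleep function is injectable so the loop can be tested quickly.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

With a submission that takes 1000 to 1500 seconds, any timeout shorter than that would return before the slice is ready, which matches the need for a larger value such as 2400.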

                            in reply to: Port mirroring issue for Bundle-Ether ports #6206
                            Nishanth Shyamkumar
                            Participant

                              Thanks for the information.

                              in reply to: pmservice issue for multiple uplink ports #6169
                              Nishanth Shyamkumar
                              Participant

                                Xi, following your guidance, I double-checked the ports. For AMST, two VLANs are in use on the same port, and my code was treating them as two different uplink ports, which is where the duplication happened. I fixed it on my side and now AMST is being provisioned successfully.

                                I was unable to recreate the above issue in SEAT at the moment. If I see this issue in SEAT again, I will update with a message on this thread. Thanks for the help.
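The kind of fix described for AMST amounts to collapsing VLAN sub-interfaces onto their parent port before building the list of ports to mirror. The sketch below is an assumption about how that could look, not the actual fix: it supposes interface names of the hypothetical form "<port><separator><vlan>", which may not match the real naming on the switches.

```python
def unique_physical_ports(interface_names, vlan_separator="."):
    """Collapse VLAN sub-interfaces onto their parent port, keeping order.

    Assumes (hypothetically) names shaped like "<port><separator><vlan>";
    names without the separator are treated as plain physical ports.
    """
    seen = set()
    ports = []
    for name in interface_names:
        port = name.split(vlan_separator, 1)[0]
        if port not in seen:
            seen.add(port)
            ports.append(port)
    return ports
```

For example, `unique_physical_ports(["portA.100", "portA.200", "portB"])` yields one entry for `portA` and one for `portB`, so the same physical port is not requested twice.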

                                in reply to: Download file to local system from Jupyter notebook #6154
                                Nishanth Shyamkumar
                                Participant

                                  Thanks, I agree with you, Ilya. I have seen some of the APIs in the link you shared; they can delete files, shut down the kernel, and so on, which is a huge security risk.
                                  It’s probably more pragmatic to download using the right-click option in the UI.

                                FABRIC invites nominations for four awards recognizing innovative uses of FABRIC resources—Best Published Paper, Best FABRIC Matrix, Best FABRIC Experiment, and Best Classroom Use of FABRIC — submissions due by **Monday, February 24 at 11:59 PM ET**, and winners announced at KNIT10. [>>>Submit Form](https://docs.google.com/forms/d/e/1FAIpQLSeTp3i2iDhB7bHgN8ryMxZci8ya87yjeQd7_JMZImUodNinVA/viewform)

                                KNIT10 Call for Demos Now Open! Submit your demo by **February 24**. [>>>Submit Demo](https://docs.google.com/forms/d/e/1FAIpQLScRIWqHliNP3DFWBCnalYN_fBXJXVM0PpP9YWWJdSebC95TvA/viewform)