Komal Thareja

Forum Replies Created

Viewing 15 posts - 46 through 60 (of 501 total)
  • Komal Thareja
    Participant

      Hi Nirmala/Peter,

      fabrictestbed-extensions==1.9.1 has been pushed to PyPI and is available in the Bleeding Edge JH container.

      This version contains the fix for get_interfaces.
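A quick way to confirm that a container picked up the fixed release is to compare the installed package version against 1.9.1. The sketch below is only illustrative: the helper names (`parse_version`, `has_fix`, `installed_fablib_version`) are made up for this example, and it assumes the distribution name is `fabrictestbed-extensions` as written in the post.

```python
from importlib import metadata

REQUIRED = "1.9.1"  # release named in the post as containing the fix


def parse_version(text):
    """Turn '1.9.1' into (1, 9, 1); anything after the leading digits of a part is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(installed, required=REQUIRED):
    """Return True when the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def installed_fablib_version():
    """Look up the installed version, or None if the package is absent."""
    try:
        return metadata.version("fabrictestbed-extensions")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_fablib_version()
    if found is None:
        print("fabrictestbed-extensions is not installed in this environment")
    else:
        print(found, "has the fix" if has_fix(found) else "is older than " + REQUIRED)
```

The comparison is deliberately simple and ignores pre-release suffixes; for anything stricter, a dedicated version-parsing library would be a better fit.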

      Thanks,

      Komal

      Komal Thareja
      Participant

        Hi Nirmala,

        Could you please try using the fablib from this branch: https://github.com/fabric-testbed/fabrictestbed-extensions/tree/rel1.9.1 ?

        nw.get_interfaces() should work now.

        I plan to push this to the main branch on Monday. I’ll keep you posted.

        Please let me know if this resolves the issue.

        Thanks,

        Komal

        Komal Thareja
        Participant

          Hi Philip,

          Thanks for pointing that out; noted! I’ll discuss internally to see if we can support specifying bandwidth in Mbps via the API.

          In the meantime, you could consider using tools like tc to shape traffic at a more granular level to meet your 30 Mbps requirement.
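As a rough illustration of the tc suggestion, the sketch below builds a token-bucket-filter (tbf) command that caps an interface at a given rate. The helper name `tbf_shape_command`, the interface name `eth1`, and the burst and latency values are all assumptions made for this example and would need tuning for a real experiment.

```python
def tbf_shape_command(iface, rate_mbit, burst_kb=64, latency_ms=50):
    """Build a tc command that rate-limits egress traffic on `iface`.

    burst_kb and latency_ms are illustrative defaults, not recommended values.
    """
    if rate_mbit <= 0:
        raise ValueError("rate_mbit must be positive")
    return (
        f"sudo tc qdisc replace dev {iface} root tbf "
        f"rate {rate_mbit}mbit burst {burst_kb}kb latency {latency_ms}ms"
    )


if __name__ == "__main__":
    # "eth1" is a hypothetical interface name; check the real one with `ip link`.
    print(tbf_shape_command("eth1", 30))
    # On a FABRIC node the string could then be passed to node.execute(...).
```

Note that a root tbf qdisc shapes outgoing traffic only, so a limit in both directions would need the command applied on both endpoints.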

          Best regards,
          Komal

          in reply to: channel 0: open failed: connect failed: No route to host #8759
          Komal Thareja
          Participant

            Hi Ajay,

            We recommend running node.os_reboot() only if you are doing CPU pinning or NUMA tuning. The call failed because your VM was already in the shutoff state. If the intent is just to reboot the VM, please run sudo reboot via node.execute().

            Also, what kind of workload is your application/experiment running? We are noticing some kernel-level CPU locks on the host where your VM is running, and we want to investigate whether something in your experiment is triggering them. Could you please share more details about the workload being executed on this VM?
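A minimal sketch of the plain-reboot path, assuming `node` is a fablib node object obtained from your slice. The wrapper name `reboot_from_guest` is hypothetical, and the exception handling is an assumption: the SSH session is often cut off while the guest goes down, so an error from the call does not necessarily mean the reboot failed.

```python
def reboot_from_guest(node, command="sudo reboot"):
    """Issue a reboot from inside the guest OS through node.execute()."""
    try:
        node.execute(command)
    except Exception as exc:
        # Assumption: a dropped session here is expected during shutdown.
        return f"reboot issued; session ended with: {exc}"
    return "reboot issued"
```

After the call, allow the VM some time to come back before running further commands on it.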

            Appreciate your help with this!

            Thanks,

            Komal

            in reply to: channel 0: open failed: connect failed: No route to host #8757
            Komal Thareja
            Participant

              Hi Ajay,

              Your VM was in a shutoff state, which I’ve now restored. Could you please share the notebook that outlines the type of workload you’re running on this VM? We’ve observed similar instances with your slices in the past, so having this information would help us identify the root cause of your VMs shutting down.

              Thanks,
              Komal

              in reply to: permission for Bastion Key “too open” #8748
              Komal Thareja
              Participant

                Hi Nirmala,

                Could you please change the permissions of the key, as indicated in the error message, by running the command below in a terminal in your JH container? This should fix the issue.

                chmod 600 /home/fabric/work/fabric_config/Nirmala
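The same change can be made from a notebook cell instead of a terminal. The sketch below is an equivalent of the chmod command using only the Python standard library; the helper name `restrict_key_permissions` is made up for this example, and the path in the usage line is the one quoted in the post.

```python
import os
import stat


def restrict_key_permissions(path):
    """Set the file mode to 0600 (owner read/write only) and return the resulting mode bits."""
    os.chmod(path, 0o600)
    return stat.S_IMODE(os.stat(path).st_mode)


# Usage, with the key path from the post:
# restrict_key_permissions("/home/fabric/work/fabric_config/Nirmala")
```

The return value makes it easy to confirm the change: it should equal `0o600` when the call succeeds.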

                Thanks,

                Komal

                in reply to: Error when creating a slice #8745
                Komal Thareja
                Participant

                  Hi Garegin,

                  Thank you for sharing your observation. There was a performance fix to improve how interfaces are handled in fablib, and I suspect it may have introduced this issue. I will investigate and post a fix once it is ready.

                  For now, please use the workaround as you suggested. Apologies for the inconvenience.

                  Thanks,

                  Komal

                  Komal Thareja
                  Participant

                    We have scheduled maintenance from July 28 to early August. This feature is planned to be rolled out during that period and should be available afterward.

                    Best regards,
                    Komal
                    in reply to: Error Creating Slices #8697
                    Komal Thareja
                    Participant

                      Hi Suhib,

                      The issue has been resolved, and slice provisioning is now functioning correctly. Could you please try your slice again?

                      Thanks,

                      Komal

                      Komal Thareja
                      Participant

                        The Network AM has been restored, and slice provisioning is now functioning correctly.

                        in reply to: I lost control on my slice. #8690
                        Komal Thareja
                        Participant

                          Hi Jiri,

                          Thank you for bringing this to our attention. We’ve identified an issue with our Network AM and are actively investigating it. Apologies for the inconvenience.

                          Your slice is currently in the Dead state, and all associated resources have been released. You can toggle the view on the portal to hide Dead/Closing slices if needed.

                          Thanks,

                          Komal

                          in reply to: Error logging into Nodes #8686
                          Komal Thareja
                          Participant

                            Posting an update here:

                            The component responsible for pushing SSH keys to the bastion host encountered an error due to a network event at RENCI. This component has been restored, and keys should work now.

                            Geoff confirmed that his keys are working. @Suhib – Please try your slice and keys again, and let us know if you are still running into this error.

                            Thank you Geoff and Suhib for bringing this to our attention! Appreciate it!

                            Best,

                            Komal

                            in reply to: Error logging into Nodes #8683
                            Komal Thareja
                            Participant

                              Hi,

                              For JH environment:

                              Could you please try running the notebook: jupyter-examples-rel1.8.*/configure_and_validate/configure_and_validate.ipynb and share the output observed?

                              Also, after this please try running the Hello Fabric notebook and share your observation.

                              For your local setup:

                              Could you please try setting up the environment as suggested here: https://learn.fabric-testbed.net/knowledge-base/advanced-jupyter-hub/#running-fabric-containers-locally ?

                              Best,

                              Komal

                              in reply to: Interconnection Details Between Hosts at the Same Site #8659
                              Komal Thareja
                              Participant

                                Hi Fatih,

                                Could you please share your Slice ID and tell us what kind of NICs you are using in your slice?

                                Thanks,

                                Komal

                                in reply to: Lost network interface after rebooting of vm3 in a cluster #8623
                                Komal Thareja
                                Participant

                                  Hi Ajay,

                                  Thanks for reaching out. Could you please share any details about what may have caused the VM to crash? This information will help us better understand the root cause.

                                  It appears that the PCI devices were detached from your VM during the crash. I’ve restored the VM, so you should now be able to access it and use the GPUs as expected.

                                  Please let me know if you continue to face any issues.

                                  Best,
                                  Komal Thareja
