Komal Thareja

Forum Replies Created

Viewing 15 posts - 46 through 60 (of 557 total)
  • in reply to: Help Recovering Slice State to StableOK from StableError #9350
    Komal Thareja
    Moderator

      Hi Fatih,

      I hope you are doing well too, and thank you for reaching out with the detailed description.

      The slice is currently in a StableError state because some slivers encountered failures during the earlier modification attempt and were subsequently closed. This behavior is intentional: FABRIC reports the slice as StableError to preserve visibility into past sliver failures, even after the problematic resources have been cleaned up.

      At this point, since the affected network service has already been closed and no longer appears in the slice, there is no further action required to delete it. Your remaining active resources should continue to function normally, and their operation is not impacted by the slice being in StableError. In other words, this state is informational rather than blocking.

      If you would like to proceed with a clean state, the recommended option is to create a new slice with the desired topology. Otherwise, you may continue using the current slice as-is if the active slivers meet your needs.

      Please let us know if you have any additional questions or if you’d like help recreating the slice or network service.

      Best regards,
      Komal

      in reply to: Slice gone and token issues #9341
      Komal Thareja
      Moderator

        Hi Ilya,

        Happy New Year to you as well!

        The slice is still up and running; however, your project has expired, which is preventing the CM from issuing tokens.

        I’ve requested Michael to extend your project, and that should resolve the issue shortly.

        Best regards,
        Komal

        in reply to: Maintenance Started Tuesday, January 06 – 9:00 AM EST #9338
        Komal Thareja
        Moderator

          Maintenance has been completed! The testbed is open for use!

          Best,

          Komal

          in reply to: CPU model and frequency #9322
          Komal Thareja
          Moderator

            Hi YoursSunny,

            Thank you for reaching out! This information is not currently exposed through the API. However, it is documented here and may be helpful:
            https://learn.fabric-testbed.net/knowledge-base/fabric-site-hardware-configurations/

            I’ll also raise this with our team to discuss whether we can extend the API to support this in the future.

            Best regards,
            Komal

            Komal Thareja
            Moderator

              Hi,

              I’ve fixed NS6 as well. Please try to update your experiment scripts to avoid overwriting the authorized_keys file in the future.

              Best,
              Komal

              in reply to: Issue with L2PTP Tunnels #9270
              Komal Thareja
              Moderator

                Hi Fatih,

                Just wanted to check whether you were able to acquire resources for a longer duration. Please let us know if we can help in any way.

                Best,

                Komal

                Komal Thareja
                Moderator

                  Hello Danilo,

                  I’ve restored the keys used by the Control Framework. You should now be able to add your keys via POA.

                  Please be careful not to overwrite any existing keys, and make sure to take a backup of your data beforehand.

                  @yoursunny — great suggestion. So far, we’ve avoided building our own images to reduce additional effort, but we’ll explore ways to either avoid this altogether or introduce a new user without requiring custom OS images.

                  Best regards,
                  Komal

                  in reply to: Issue with L2PTP Tunnels #9237
                  Komal Thareja
                  Moderator

                    Hi Fatih,

                    Apologies for the delayed response. Most likely the links you are requesting have been reserved in advance, which caused your renew to fail. I will look at the other reservations today and work with the other users to see if we can keep your slice up for a longer duration. I will keep you posted!

                    Best,

                    Komal

                    in reply to: Date format error when extending slice #9236
                    Komal Thareja
                    Moderator

                      Hi Xavier,

                      Until we resolve this on the portal, you can also extend your slice from JupyterHub. Check out this example there: fabric_examples/fablib_api/renew_slice/renew_slice.ipynb
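For reference, here is a minimal sketch of what a fablib-based renewal can look like. It assumes that `slice.renew()` accepts an end-date string in the `%Y-%m-%d %H:%M:%S %z` format; the helper names `renewal_end_date` and `renew_slice` and the slice name `MySlice` are made up for illustration. The notebook above remains the place to check the exact calls for your fablib version.

```python
from datetime import datetime, timedelta, timezone


def renewal_end_date(days, now=None):
    """Return an end-date string `days` from now (UTC).

    The "%Y-%m-%d %H:%M:%S %z" format is an assumption about what
    slice.renew() expects; verify it against the renew_slice notebook.
    """
    now = now or datetime.now(timezone.utc)
    return (now + timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S %z")


def renew_slice(slice_obj, days):
    """Ask the given slice object to extend its lease by `days` days."""
    end_date = renewal_end_date(days)
    slice_obj.renew(end_date)
    return end_date


# Usage on JupyterHub (needs FABRIC credentials, so it is left commented out):
# from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
# fablib = fablib_manager()
# slice = fablib.get_slice(name="MySlice")  # "MySlice" is a placeholder
# renew_slice(slice, days=6)
```

Building the date string in code sidesteps any date-format handling in the portal form, since the format is fixed in one place.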

                      Best,

                      Komal

                      Komal Thareja
                      Moderator

                        Hi Sourya,

                        Is this still an issue?

                        Best,

                        Komal

                        in reply to: Issue with L2PTP Tunnels #9195
                        Komal Thareja
                        Moderator

                          Hi Fatih,

                          I see that the following three slivers are currently in a Closed state. Please note that a renew is not a single-shot operation.

                          When you renew a slice, it transitions into the Configuring state and reports which individual slivers were successfully extended and which were not. You can verify this in the Portal by viewing the slice topology, or—if you are renewing from JupyterHub—fablib will explicitly report which slivers failed to renew.

                          You can also check this programmatically:

                          slice = fablib.get_slice(slice_name)
                          slice.list_slivers()
                          

                          Here are the affected reservations:

                          • Reservation ID: 990127bd-aa06-4992-8847-c76654faf0e8
                            State: Closed
                            Reason: Insufficient resources — No path available with the requested QoS
                          • Reservation ID: 30dd426f-9ddc-424b-bec7-ca8631540ea4
                            State: Closed
                            Reason: Insufficient resources — No path available with the requested QoS
                          • Reservation ID: cb4372e4-fb05-454e-8662-f53e297689f8
                            State: Closed
                            Reason: Insufficient resources — No path available with the requested QoS

                          These slivers were not able to secure a viable path during renewal, which is why they are now in a closed state.

                          To re-add these network services, you can modify the slice as follows:

                          1. Fetch the current slice topology, remove the closed network services, and submit the slice.
                          2. Fetch the updated topology, add the required network services again, and submit once more.
                          3. You can refer to this example for guidance on modifying an existing slice (adding/removing resources):
                            fabric_examples/fablib_api/modify_slice/modify-add-node-network.ipynb
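The two modify passes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's code: the helper names, the service name `net1`, and the node and interface names are hypothetical, and the L2PTP services in your slice may need extra settings (VLANs, bandwidth) that are not shown here.

```python
def remove_closed_services(slice_obj, service_names):
    """Step 1: delete the closed network services, then submit the slice."""
    for name in service_names:
        slice_obj.get_network(name=name).delete()
    slice_obj.submit()


def readd_l2ptp_service(slice_obj, name, endpoints):
    """Step 2: re-create one L2PTP service, then submit the slice.

    `endpoints` is a list of (node_name, interface_name) pairs taken
    from the freshly fetched topology.
    """
    interfaces = [
        slice_obj.get_node(name=node_name).get_interface(name=iface_name)
        for node_name, iface_name in endpoints
    ]
    slice_obj.add_l2network(name=name, interfaces=interfaces, type="L2PTP")
    slice_obj.submit()


# Usage on JupyterHub (all names below are placeholders):
# slice = fablib.get_slice(name=slice_name)
# remove_closed_services(slice, ["net1"])
# slice = fablib.get_slice(name=slice_name)  # fetch the updated topology
# readd_l2ptp_service(slice, "net1",
#                     [("node1", "node1-nic1-p1"), ("node2", "node2-nic1-p1")])
```

Fetching the slice again between the two submits matters, because the second pass should start from the topology the first submit produced.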

                          Please let me know if you’d like help with the modify workflow or with re-submitting the network services.

                          Best,
                          Komal

                          in reply to: Issue with L2PTP Tunnels #9193
                          Komal Thareja
                          Moderator

                            Hi Fatih,

                            Thanks for reaching out.

                            I looked into your slice, and it appears that the two network services associated with VLAN 300 and VLAN 600 are currently in a Closed state. Both reservations show the same ticket update:

                            “Insufficient resources: No path available with the requested QoS.”

                            Here are the details:

                            Reservation ID: 8a83db0f-03f1-44b0-843f-c6e0c2664cfe
                            Slice ID: fdf2fd5b-b1b0-46ef-b51a-4d55e0fd5c47
                            Resource Type: L2PTP
                            State: Closed
                            Reason: No path available with requested QoS

                            Reservation ID: 257fae2a-28ca-4430-bb85-77864b3d5c25
                            Slice ID: fdf2fd5b-b1b0-46ef-b51a-4d55e0fd5c47
                            Resource Type: L2PTP
                            State: Closed
                            Reason: No path available with requested QoS

                            This indicates that the system was unable to allocate a viable path for these two tunnels during your most recent renewal window, which is why they are not active now.

                            If you would like, you can try the following:

                            • Re-declare or re-submit these two network services in your slice.
                            • Lower the QoS requirement temporarily to see if a path becomes available.

                            Please feel free to reach out if you need help updating the slice or if you would like us to investigate further.

                            Best regards,
                            Komal

                            Komal Thareja
                            Moderator

                              Hi Danilo,

                              I found that the authorized_keys file on both NS1 and NS5 was empty, which is why SSH, whether through the admin key or the Control Framework, was failing, and that in turn caused the POA/addKey failure. It seems this may have happened unintentionally as part of the experiment.

                              I’ve manually restored SSH access so the Control Framework should now function properly, including POA. Could you please try adding your keys to these VMs again using POA? That should re-establish your SSH access.

                              Please be careful not to remove or overwrite the authorized_keys file in the process.
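If an experiment script needs to add a key by hand, one way to avoid clobbering the file is to append rather than redirect with `>`. The sketch below only builds the shell command; the helper name `append_key_command` is hypothetical, and it assumes the file already exists at the default path.

```python
import shlex


def append_key_command(public_key, path="~/.ssh/authorized_keys"):
    """Build a shell command that backs up authorized_keys and appends
    the key only if that exact line is not already present.

    `path` is left unquoted on purpose so the shell expands the tilde.
    """
    key = shlex.quote(public_key.strip())
    return (
        f"cp -p {path} {path}.bak && "
        f"(grep -qxF -- {key} {path} || echo {key} >> {path})"
    )


# Usage from a notebook (the key string is a placeholder):
# stdout, stderr = node.execute(append_key_command("ssh-ed25519 AAAA... user@host"))
```

The `>>` redirection keeps the existing Control Framework and admin keys in place, and the `.bak` copy gives you something to restore from if a later step goes wrong.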

                              Best,

                              Komal

                              in reply to: Bluefield DPL pull failing due to timeout #9170
                              Komal Thareja
                              Moderator

                                I tried running docker pull manually on DALL and SEAT, and it worked fine on both. The artifact also ran successfully on SEAT with the following changes. The issue appears to be related to the Docker installation via docker.io.

                                I have also passed this to the artifact author so they can make the required updates.

                                I made the following changes to get the artifact working:

                                • Changed the image to docker_ubuntu_24.
                                • Updated Step 34 to remove docker.io from the installation commands.
                                stdout, stderr = node1.execute('sudo apt-get update', quiet=True)
                                stdout, stderr = node1.execute('sudo apt-get install -y build-essential python3-pip net-tools', quiet=True)
                                stdout, stderr = node2.execute('sudo apt-get update', quiet=True)
                                stdout, stderr = node2.execute('sudo apt-get install -y build-essential python3-pip net-tools', quiet=True)
                                stdout, stderr = node1.execute('sudo pip3 install meson ninja', quiet=True)
                                stdout, stderr = node2.execute('sudo apt install -y python3-scapy', quiet=True)
                                

                                Best,
                                Komal

                                in reply to: Bluefield DPL pull failing due to timeout #9169
                                Komal Thareja
                                Moderator

                                  Hi Nishanth,

                                  I tried on UTAH, MICH, and MASS, and docker pull seems to work there.

                                  Could you please try nslookup nvcr.io and then try the docker pull command?
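If it is easier to run this check from the notebook, something along these lines may help. It is only a sketch: the helper names are made up, `image` stands in for whatever image the artifact pulls, and it relies on `node.execute()` returning `(stdout, stderr)`.

```python
def resolves(node, host="nvcr.io"):
    """Run nslookup on the node and report whether the host resolved."""
    cmd = f"nslookup {host} >/dev/null 2>&1 && echo RESOLVED || echo FAILED"
    stdout, _ = node.execute(cmd, quiet=True)
    return "RESOLVED" in stdout


def pull_if_resolvable(node, image, host="nvcr.io"):
    """Attempt the docker pull only when DNS resolution works first.

    Returns None when the lookup fails, otherwise the (stdout, stderr)
    pair from the pull command.
    """
    if not resolves(node, host):
        return None
    return node.execute(f"sudo docker pull {image}", quiet=True)


# Usage (the image name is a placeholder for the one in the artifact):
# result = pull_if_resolvable(node1, "<image from the artifact>")
# print("DNS lookup failed" if result is None else result[0])
```

Separating the lookup from the pull makes it clearer whether a timeout comes from name resolution or from the registry transfer itself.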

                                  I will also check with Mert/Hussam to see if we have any known issues on SEAT and DALL.

                                  Best,

                                  Komal
