Komal Thareja

Forum Replies Created

Viewing 15 posts - 76 through 90 (of 501 total)
  • in reply to: Unable to SSH into my Nodes #8519
    Komal Thareja
    Participant

      Could you please check your VM again?

      All PCI devices had been disconnected. I have reconnected them to your VM. Please check it.

      Also, could you please share the sequence of operations that led your VM to this state?

      It would help us determine whether anything needs to be fixed in our control software.

      Thanks,

      Komal

      in reply to: Unable to SSH into my Nodes #8517
      Komal Thareja
      Participant

        Please share your slice ID and also the output of the command: ifconfig -a

        Thanks,

        Komal

        in reply to: L2Bridge not forwarding packets in SALT #8512
        Komal Thareja
        Participant

          Thank you, Alex, for sharing this observation! I temporarily assigned IP addresses to these interfaces on the r3 and r4 nodes, and ping does not work between them.

          The network service as provisioned looks OK. I am reaching out to the network team and will keep you posted.

          Thanks,

          Komal

          in reply to: Unable to SSH into my Nodes #8511
          Komal Thareja
          Participant

            Hi Ajay,

            You can use the following code snippet to reboot the node:

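            from fabrictestbed_extensions.fablib.fablib import FablibManager
            fablib = FablibManager()
            slice = fablib.get_slice(slice_name)  # slice_name: the name of your slice
            node = slice.get_node(node_name)      # node_name: the node to reboot
            node.os_reboot()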

            Also, please share your slice ID so we can take a look at it.

            Thanks,

            Komal

            in reply to: FPGA valid sites for Esnet toolchain #8499
            Komal Thareja
            Participant

              Thank you for your question.

              What I meant is that once an FPGA is initially flashed with a provided bitstream, users can reflash it with a different bitstream of their choice—as long as the PCIe interface remains unchanged. Because of this flexibility, the actual state of the FPGA at a given site may differ from what’s shown in the shared sheet, depending on whether a user has reprogrammed it.

              Best,

              Komal

              in reply to: Testing BitTorrent and IPFS #8498
              Komal Thareja
              Participant

                Thank you for your feedback, Philip!

                You’re absolutely right—node.add_fabnet() attaches the FabNetV4 service to the node, enabling communication with other nodes over FABRIC’s data plane network via the FabNetV4 interface.

                In addition, all VMs provisioned in FABRIC are assigned a Management IP for administrative purposes. This interface allows inbound SSH access and supports outbound connections, including those required for operations like docker pull. However, please note that the management network is actively monitored and any torrent or insecure traffic may be flagged. Such activity can lead to enforcement actions, including possible slice termination. As a best practice, we recommend not using the management network for experimental traffic.

                Best,

                Komal

                in reply to: Testing BitTorrent and IPFS #8493
                Komal Thareja
                Participant

                  Thank you for your inquiry Philip.

                  You are welcome to conduct experiments involving IPFS or BitTorrent on FABRIC, particularly for evaluating peer discovery and data transfer between FABRIC nodes. This type of testing is permissible as long as it is confined to FABnet or a custom Layer 2 network within the FABRIC infrastructure.

                  We kindly request that your experiment not initiate connections to external BitTorrent or IPFS servers outside the FABRIC environment.
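One way to keep such an experiment inside FABRIC is to filter candidate peers by address before connecting to them. The sketch below uses only Python's standard `ipaddress` module. The helper name `peers_inside`, the peer list, and the `10.128.0.0/10` range are assumptions for illustration, so please substitute the subnet actually assigned to your slice.

```python
import ipaddress

def peers_inside(peers, allowed_cidr):
    """Return only the peer addresses that fall inside allowed_cidr."""
    network = ipaddress.ip_network(allowed_cidr)
    kept = []
    for peer in peers:
        addr = ipaddress.ip_address(peer)
        # An address of a different IP version is never inside the network.
        if addr.version == network.version and addr in network:
            kept.append(peer)
    return kept

# Hypothetical peer list; 10.128.0.0/10 is an assumed FABnet IPv4 range.
candidates = ["10.129.1.5", "10.130.7.9", "93.184.216.34"]
internal_peers = peers_inside(candidates, "10.128.0.0/10")
```

Only the addresses in `internal_peers` would then be handed to the BitTorrent or IPFS client as bootstrap peers.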

                  Please feel free to reach out if you need any assistance with the experiment setup or have further questions.

                  Best regards,

                  Komal

                  in reply to: FPGA valid sites for Esnet toolchain #8478
                  Komal Thareja
                  Participant

                    Hi Nishanth,

                    Please find enclosed the most recent known status. Kindly note that users have the ability to flash their own binaries, so the actual state of the infrastructure may differ from what is captured in the attached sheet. As a first step toward addressing this, we are working to include notebook and Control Framework support in Release 1.9, enabling users to flash FPGAs within their workflows directly.

                    Thanks,

                    Komal

                    in reply to: Slice showing as StableOK but is actually closed #8462
                    Komal Thareja
                    Participant

                      Hi Anthony,

                      Regarding your slice: a5d2fff2-84fc-48d9-8d67-5ff96e120273
                      Start: 2025-04-18 14:53:43 +0000
                      End: 2025-05-02 14:53:42 +0000

                      A renew operation was attempted for this slice, but it failed for the VM due to insufficient resources: ['core'].

                      Please note that we now support advance reservations, which allow users to reserve resources ahead of time. As a result, a renew request may fail if it conflicts with an existing advance reservation — which appears to be the case here.
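To illustrate why a renew can fail in this situation, here is a minimal sketch of an interval-overlap check. It is not the Control Framework's scheduling logic. The lease end comes from the slice above, while the requested extension and the other user's advance reservation are hypothetical values.

```python
from datetime import datetime, timedelta, timezone

def intervals_overlap(start_a, end_a, start_b, end_b):
    """True when half-open intervals [start_a, end_a) and [start_b, end_b) overlap."""
    return start_a < end_b and start_b < end_a

# Lease end taken from the slice above; everything below it is hypothetical.
lease_end = datetime(2025, 5, 2, 14, 53, 42, tzinfo=timezone.utc)
renew_until = lease_end + timedelta(days=14)  # hypothetical renewal request

# Hypothetical advance reservation holding the same cores.
other_start = datetime(2025, 5, 5, 0, 0, 0, tzinfo=timezone.utc)
other_end = datetime(2025, 5, 9, 0, 0, 0, tzinfo=timezone.utc)

# The extension window overlaps the reservation, so the cores are not free.
renew_conflicts = intervals_overlap(lease_end, renew_until, other_start, other_end)
```

In this sketch an extension that ends before the other reservation starts would not conflict, which is why a shorter renewal can sometimes succeed where a longer one fails.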

                      It’s unclear how the renew was initiated, but if it was done through JupyterHub, the error would have been reported to the user. We suspect there may be a bug on the portal side where this error is not being surfaced correctly, and we will investigate and address that.

                      Unfortunately, the only available option at this point is to re-create the slice. We apologize for the inconvenience.

                      Thanks,

                      Komal

                      in reply to: Tofino bf_switchd process gets killed. #8460
                      Komal Thareja
                      Participant

                        Hi Nishanth,

                        Thank you for sharing this.

                        Please note that the current implementation of execute_thread maintains the process only for the duration of the specified timeout. As you correctly observed, for longer-running processes, directly accessing the switch via SSH allows you to manually launch switchd.
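The effect is similar to running a child process with a timeout in plain Python. The sketch below is only an analogy built on the standard `subprocess` module, not the implementation of execute_thread. The helper name `run_with_timeout` is made up for this example.

```python
import subprocess
import sys

def run_with_timeout(seconds_to_run, timeout):
    """Start a child that sleeps for seconds_to_run and report how it ended."""
    cmd = [sys.executable, "-c", f"import time; time.sleep({seconds_to_run})"]
    try:
        subprocess.run(cmd, timeout=timeout, check=True)
        return "finished"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return "killed at timeout"

short_job = run_with_timeout(0.1, timeout=5)  # ends well within the timeout
long_job = run_with_timeout(30, timeout=1)    # outlives the timeout and is killed
```

A long-running daemon such as switchd behaves like `long_job` here: it is terminated when the timeout expires, regardless of whether it was still doing useful work.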

                        We will work on enhancing execute_thread to better support this use case and will keep you informed once the update is available.

                        Thanks,

                        Komal

                         

                        in reply to: refrsh token issue inside jupyter notebook #8443
                        Komal Thareja
                        Participant

                          This error typically occurs due to an expired token. Please try the following steps:
                          Go to File → Hub Control Panel → Stop My Server, then select Start Server to generate a new token.

                          Thanks,

                          Komal

                          P.S: https://learn.fabric-testbed.net/knowledge-base/using-the-jupyter-hub/#frequently-asked-questions

                          in reply to: Node has no valid management IP. #8440
                          Komal Thareja
                          Participant

                            Hi Philip,

                            I can confirm that your slice is up and running. Could you please verify it from the Portal via Experiments -> My Slices?

                            With respect to Jupyter Hub, could you please re-run the notebook jupyter-examples-rel1.8.*/configure_and_validate/configure_and_validate.ipynb?

                            After this, please try deleting your slice and recreating it via the Hello Fabric notebook.

                            Thanks,

                            Komal

                            in reply to: Planned Outage Jupyter Hub – 11:00 – 11:30 AM EST #8436
                            Komal Thareja
                            Participant

                              The maintenance is complete.

                              Thanks,

                              Komal

                              in reply to: Availability of DPU-powered SmartNICs #8431
                              Komal Thareja
                              Participant

                                Hi Plabon,

                                We’re in the process of procuring BlueField DPUs and are planning to integrate them into the FABRIC infrastructure. While the timeline isn’t finalized yet, we’re tentatively looking at Summer or Fall 2025. Stay tuned for updates!

                                Thanks,
                                Komal

                                Komal Thareja
                                Participant

                                  Hi Sadat,

                                  Could you please provide the following information?

                                  • Slice ID
                                  • The status of the slice as shown in the Portal via Experiments -> My Slices
                                  • Any errors observed in /tmp/fablib/fablib.log in the JH container
                                  • Whether your Bastion SSH keys have expired
                                    • Check via the Portal: Experiments -> Manage SSH Keys -> Bastion Key
                                    • In the JH container, run jupyter-examples-rel1.8*/configure_and_validate.ipynb; this will renew expired keys
                                    • If your Bastion keys had expired, please try your slice again after renewing them.
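For the log check in the list above, a small script run inside the JH container can pull out the relevant lines. This is a minimal sketch. The log path is the one mentioned above, but the helper name `find_error_lines` and the assumption that error entries contain the text "ERROR" are mine, so please adjust the marker if the log uses a different format.

```python
from pathlib import Path

def find_error_lines(log_path, marker="ERROR"):
    """Return (line_number, text) pairs for lines that contain marker."""
    path = Path(log_path)
    if not path.exists():
        return []
    hits = []
    with path.open(errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if marker in line:
                hits.append((number, line.rstrip("\n")))
    return hits

# Path from the checklist above; "ERROR" is an assumed marker for error entries.
errors = find_error_lines("/tmp/fablib/fablib.log")
for number, text in errors[-20:]:  # show only the most recent matches
    print(f"{number}: {text}")
```

Pasting the printed lines into your reply, together with the slice ID, is usually enough for us to start looking.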

                                  Thanks,

                                  Komal

                                  • This reply was modified 7 months, 3 weeks ago by Komal Thareja.