Komal Thareja

Forum Replies Created

Viewing 15 posts - 121 through 135 (of 489 total)
  • in reply to: Issue Connecting via SSH to Specific Node in Topology #8100
    Komal Thareja
    Participant

      Hi Yuanjun,

      Your slice is already in a Dead state, meaning all associated resources have been released.

      Please try creating your slice again and let us know if the issue persists. If you encounter this issue again, consider extending your slice’s lifetime so that we can investigate before it expires.


      Slice Name: byteps_8node_GPN_lamb
      Slice ID: 0e99c5ea-76d2-4189-ba2e-817a80fa8d29
      Project ID: 34a45f8f-be0e-4efc-a91c-38358ce4ca29
      Project Name: Ensemble Inference
      Graph ID: 070d665f-5fcc-467e-9afa-d1d9f2c2f11c
      Slice owner: { name: orchestrator, guid: orchestrator-guid, oidc_sub_claim: 82e78849-be30-4290-a225-50040c065e4e, email: yuanjun.dai@case.edu}
      Slice state: Dead
      Lease time: 2025-01-31 02:15:43+00:00

      Thanks,

      Komal

      in reply to: L2 Interfaces on my slice transitioning to DOWN State #8097
      Komal Thareja
      Participant

        Subject: Network Configuration Issue on Slice VMs

        Hi Prateek,

        I checked your Slice. Could you share the VMs and sites where the network configuration was lost?

        The WASH and STAR site workers were rebooted due to another issue, which may have caused this disruption. Please note that, in the current version, fablib configures interfaces using ip commands, which are not persistent across reboots.

        We are working on making this configuration persistent across reboots. In the meantime, please consider using NetworkManager or netplan to configure the interfaces so that the settings survive a reboot.
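As a rough sketch of the netplan option, the snippet below renders a minimal netplan file for one statically addressed interface. The interface name `enp7s0`, the address `192.168.10.2/24`, and the helper `render_netplan` are hypothetical placeholders, not values taken from any slice; substitute the names and addresses your VM actually uses.

```python
def render_netplan(iface: str, cidr: str) -> str:
    """Return a minimal netplan (version 2) YAML document for one static interface."""
    lines = [
        "network:",
        "  version: 2",
        "  ethernets:",
        f"    {iface}:",
        "      dhcp4: false",
        "      addresses:",
        f"        - {cidr}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical interface name and address, for illustration only.
    print(render_netplan("enp7s0", "192.168.10.2/24"), end="")
```

The output would typically be saved as a `.yaml` file under `/etc/netplan/` on the VM and activated with `sudo netplan apply`, after which the address should be restored on boot.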

        Additionally, we are addressing the underlying issue that required the worker node reboots.

        Apologies for the inconvenience, and thank you for your patience!

        Best,
        Komal

        in reply to: Kali machine failing on post_boot_config #8091
        Komal Thareja
        Participant

          Hi Nirmala,

          Could you please share your slice id?

          Thanks,
          Komal

          in reply to: Error allocating reousrce in RUTG site #8088
          Komal Thareja
          Participant

            Hi Yuanjun,

            There was leaked configuration on the switch, which has now been cleared with help from the Network Team. Could you please try your slice again? Please let us know if you still see the issue.

            Thanks,
            Komal

            in reply to: upload_file does not work at all #8085
            Komal Thareja
            Participant

              User has confirmed in another post that this issue has been resolved.

              SSH Connection Error: ChannelException(2, ‘Connect failed’)

              Komal Thareja
              Participant

                User has confirmed that this is no longer an issue.


                Hi Komal,

                The problem has been solved. I believe we do not need the meeting today.

                Thank you so much for your help!

                Best Regards,
                Yuanjun Dai

                Komal Thareja
                Participant

                  Subject: API Behavior Verification

                  Hi Yuanjun,

                  I have verified this, and the API is functioning as intended.

                  fablib.list_sites() retrieves resource information from the testbed. Once fetched, users can display data for a specific site using fablib.show_site(). Please note that fablib.show_site() only presents site information that was previously retrieved via fablib.list_sites().

                  To refresh the resource information, you need to call fablib.list_sites(update=True), followed by fablib.show_site().
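A minimal sketch of that call order is shown below. The helper name `show_fresh_site` is hypothetical, `fablib` is assumed to be an already initialized fablib manager object, and passing the site name as the argument to `show_site()` is an assumption about how the site is selected.

```python
def show_fresh_site(fablib, site_name: str):
    """Refresh the cached resource information, then show a single site.

    `fablib` is assumed to be an initialized fablib manager object.
    """
    # Re-fetch resource information from the testbed instead of reusing old data.
    fablib.list_sites(update=True)
    # show_site() only presents what list_sites() retrieved, so it is called second.
    return fablib.show_site(site_name)
```

Calling `show_site()` on its own would keep presenting whatever the last `list_sites()` call fetched, which is why the refresh comes first.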

                  Please check this in your code, and let us know if you continue to experience any issues.

                  Thanks,
                  Komal

                  Komal Thareja
                  Participant

                    Hi Yuanjun,

                    Both this issue and the upload_file issue point to a problem with SSH access to the VMs. The cause could be expired bastion keys or a configuration issue.
                    Would a quick Zoom meeting be possible to resolve this? Please let me know if that works.

                    Thanks,
                    Komal

                    in reply to: STAR probably needs reflashing to P4 workflow #8072
                    Komal Thareja
                    Participant

                      Hi Ilya,

                      I checked the host and see similar output. I will check with Mert regarding the reboot.


                      0000:25:00.0 Network controller: Xilinx Corporation Device 903f
                      Subsystem: Xilinx Corporation Device 0007
                      Physical Slot: 2-1
                      Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
                      Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-

                      Thanks,
                      Komal

                      in reply to: WASH FPGA slice issue #8071
                      Komal Thareja
                      Participant

                        Hi Ilya,

                        Hussam helped recover the WASH worker. Could you please try your slice again on WASH?

                        Thanks,

                        Komal

                        in reply to: upload_file does not work at all #8065
                        Komal Thareja
                        Participant

                          Could you please check the /tmp/fablib/fablib.log file for any error messages logged when upload_file() is invoked? If possible, please share this file.

                          This file should exist on the system where fablib is being invoked.
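If the log is long, a small helper like the one below can pull out the lines worth sharing. This is only a sketch: the function name `recent_log_problems` is hypothetical, and the markers it searches for (`ERROR`, `Exception`, `Traceback`) are assumptions about what the log contains, so adjust them if the file uses different wording.

```python
from pathlib import Path


def recent_log_problems(log_path: str = "/tmp/fablib/fablib.log", limit: int = 20) -> list[str]:
    """Return the last `limit` log lines that look like errors."""
    path = Path(log_path)
    if not path.exists():
        return []
    # Assumed markers; the real log format may differ.
    markers = ("ERROR", "Exception", "Traceback")
    with path.open(errors="replace") as handle:
        hits = [line.rstrip("\n") for line in handle if any(m in line for m in markers)]
    return hits[-limit:]


if __name__ == "__main__":
    for line in recent_log_problems():
        print(line)
```

Running it right after a failed upload_file() call should surface the most recent matching entries, if any were logged.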

                          Thanks,

                          Komal

                          Komal Thareja
                          Participant

                            Thank you for sharing your observation. I will investigate this further and share an update.

                            Thanks,

                            Komal

                            in reply to: upload_file does not work at all #8061
                            Komal Thareja
                            Participant

                              Could you run fablib.show_config() to display the log file in use?

                              Additionally, you can specify the log file location in the fabric_rc file, which may assist in debugging. Upload failures often occur if the bastion keys have expired, so please verify that your bastion keys are still valid.

                              Thanks,

                              Komal

                              Komal Thareja
                              Participant

                                Hi Yuanjun,

                                This behavior is intentional; resource information is retrieved from cached data. Due to the scale of the system, updating it after each create or delete operation would be resource-intensive. Instead, the cache is refreshed at regular intervals, currently every 30 minutes.
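To illustrate why a freshly created or deleted slice may not show up in the resource information right away, here is a simplified model of an interval-refreshed cache. It is not the testbed's actual implementation: the class `IntervalCache` is hypothetical, and it reloads lazily on read rather than on a background timer.

```python
import time


class IntervalCache:
    """Serve a cached value and reload it only after `ttl_seconds` have passed."""

    def __init__(self, loader, ttl_seconds: float = 30 * 60, clock=time.monotonic):
        self._loader = loader        # function that fetches fresh data
        self._ttl = ttl_seconds      # 30 minutes by default, matching the interval above
        self._clock = clock          # injectable clock, handy for testing
        self._value = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        # Reload on first use or once the cached value is older than the interval.
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
        return self._value
```

Between reloads every caller sees the same snapshot, so a change made a few minutes ago only appears after the next refresh.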

                                Thanks,

                                Komal

                                in reply to: upload_file does not work at all #8057
                                Komal Thareja
                                Participant

                                  The /tmp/fablib/fablib.log log file exists on the JH container. Could you please check it and share it?

                                  Thanks,

                                  Komal
