Komal Thareja

Forum Replies Created

Viewing 15 posts - 121 through 135 (of 455 total)
  • in reply to: Multiple GPUs on a node? #7902
    Komal Thareja
    Participant

      Hi Abdulhadi,

      The GPU count you are referring to represents the total number of GPUs available at a site.

      No single host at a site has more than 3 GPUs. In fact, only a few hosts are equipped with 3 GPUs. To check the per-host resource details, you can use the notebook: jupyter-examples-main/fabric_examples/fablib_api/sites_and_resources/list_all_resources.ipynb.

      For convenience, the following code snippet can also be used:


      from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager

      # Create the manager and print the active configuration.
      fablib = fablib_manager()
      fablib.show_config()

      # List per-host GPU capacities (one row per host).
      fields = ['name', 'tesla_t4_capacity', 'rtx6000_capacity', 'a30_capacity', 'a40_capacity']
      output_table = fablib.list_hosts(fields=fields)
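
To make the difference between a site total and a per-host count concrete, here is a minimal sketch that works on plain dictionaries shaped like the rows requested above. The host names and counts are invented sample values, not real FABRIC inventory, and the helper names are hypothetical. It assumes each row exposes the capacity fields as numbers.

```python
# GPU capacity fields taken from the list_hosts() call above.
GPU_FIELDS = ["tesla_t4_capacity", "rtx6000_capacity", "a30_capacity", "a40_capacity"]


def gpus_on_host(host):
    """Sum GPU capacities for one host row; missing or empty fields count as 0."""
    return sum(int(host.get(field) or 0) for field in GPU_FIELDS)


def summarize_site(hosts):
    """Return the site-wide GPU total and the largest per-host GPU count."""
    per_host = {host["name"]: gpus_on_host(host) for host in hosts}
    return {
        "site_total": sum(per_host.values()),
        "max_per_host": max(per_host.values(), default=0),
        "per_host": per_host,
    }


# Hypothetical sample rows (invented for illustration only).
sample_hosts = [
    {"name": "example-w1", "tesla_t4_capacity": 2, "rtx6000_capacity": 1},
    {"name": "example-w2", "a30_capacity": 2},
    {"name": "example-w3"},
]

summary = summarize_site(sample_hosts)
print(summary["site_total"], summary["max_per_host"])
```

The site total adds up every host, while the per-host maximum is what limits how many GPUs a single VM can be given.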

      Thanks,

      Komal

      in reply to: Slice stuck in ‘Configuring’ on extend #7900
      Komal Thareja
      Participant

        Hi Ilya,

        I looked into your slice and found that it was partially renewed, with the VM on STAR not renewing completely.

        This appears to be a side effect of the Kafka maintenance we conducted yesterday, which impacted STAR. During this time, renewal messages were not processed because the Kafka consumer had stopped. I’ve resolved the issue, and future renewals should now work as expected.

        Thank you for bringing this to our attention and helping us identify and fix the problem.

        Best regards,
        Komal

        P.S. Another user also ran into this: https://learn.fabric-testbed.net/forums/topic/not-able-to-renew-the-slice/

        in reply to: Not able to renew the slice #7899
        Komal Thareja
        Participant

          Hi Sankalpa,

          Both your slices were partially renewed. Each slice included a VM on STAR, where the renewal process was stuck.

          We use a Kafka messaging bus, and there was a brief maintenance yesterday that impacted STAR. As a result, renewal messages were not processed because the Kafka consumer had stopped. I have resolved this issue, and all the slivers in your slices have been successfully renewed. Your slice is now in the StableOK state.
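
A partial renewal like this shows up as slivers in the same slice carrying different lease end times. The sketch below illustrates that check on plain dictionaries. The keys `name`, `site`, and `lease_end` are hypothetical stand-ins rather than the fablib data model, and the sample values are invented.

```python
from datetime import datetime, timezone


def find_unrenewed(slivers, requested_end):
    """Return names of slivers whose lease ends before the requested end time."""
    return [s["name"] for s in slivers if s["lease_end"] < requested_end]


# Invented sample data: one sliver kept its old lease end.
requested_end = datetime(2025, 1, 15, tzinfo=timezone.utc)
slivers = [
    {"name": "node1", "site": "SITE-A", "lease_end": datetime(2025, 1, 15, tzinfo=timezone.utc)},
    {"name": "node2", "site": "SITE-B", "lease_end": datetime(2025, 1, 8, tzinfo=timezone.utc)},
]

stuck = find_unrenewed(slivers, requested_end)
print("partially renewed" if stuck else "fully renewed", stuck)
```

An empty result means every sliver reached the requested lease end; any name in the list points at the site where the renewal did not go through.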

          Thank you for reporting this and helping us identify and address the problem.

          Best regards,
          Komal

          in reply to: Not able to renew the slice #7897
          Komal Thareja
          Participant

            Please share the slice ID. The slice ID can be copied from the Portal as well as from JH.

            Portal -> Experiments -> My Slices -> Copy the Slice ID.

            Also, how are you renewing the slices – Portal or JH?

            Thanks,

            Komal

            in reply to: Not able to renew the slice #7895
            Komal Thareja
            Participant

              Hi,

              Could you please share your slice ID?

              Thanks,

              Komal

              in reply to: Slice stuck in ‘Configuring’ on extend #7893
              Komal Thareja
              Participant

                Hi Ilya,

                Thank you for reporting this issue. It seems to be a bug, and I’m in the process of debugging it. In the meantime, I’ve closed your slice, so it should no longer show up as “Configuring.”

                Best regards,
                Komal

                in reply to: Permission denied for in-slice port mirroring #7889
                Komal Thareja
                Participant

                  Hi Vaneshi,

                  The updated permissions will be rolled out with Release 1.8 in January.

                  Thanks,

                  Komal

                  in reply to: Cant Access ‘classifier’ node in Slice #7873
                  Komal Thareja
                  Participant

                    Hi Sourya,

                    It looks like the authorized_keys file is not correct. I am not even able to log in with the Nova SSH keys.

                    Could you please confirm whether you see a key that ends with Generated-by-Nova in /home/ubuntu/.ssh/authorized_keys?

                    Also, please share the output of the command ls -ltr /home/ubuntu/.ssh/
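
The same check can be scripted. This is a rough sketch that looks for a key line ending with Generated-by-Nova; the function names are hypothetical, and the default path is the one mentioned above.

```python
from pathlib import Path

NOVA_SUFFIX = "Generated-by-Nova"


def has_nova_key(authorized_keys_text):
    """Return True if any non-comment line ends with the Nova marker."""
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and line.endswith(NOVA_SUFFIX):
            return True
    return False


def check_file(path="/home/ubuntu/.ssh/authorized_keys"):
    """Read the file if it exists and report whether the Nova key is present."""
    p = Path(path)
    return p.is_file() and has_nova_key(p.read_text())


if __name__ == "__main__":
    print("Nova key present:", check_file())
```

A result of False means either the file is missing or no key line carries the Nova marker, which would match the login failure described here.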

                    Thanks,

                    Komal

                    in reply to: GPU + Connectx6 SmartNIC node #7866
                    Komal Thareja
                    Participant

                      Correction: Both the CERN and CIEN racks have ConnectX-6 and GPU available on the same host. However, the CIEN rack is currently under maintenance as it is being transported back from SC.

                      You can proceed with your experiment on the CERN rack, subject to its availability.

                      Additionally, here’s a Fablib code snippet to help you check for specific resources on hosts:


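                      from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
                      fablib = fablib_manager()
                      fields = ['name', 'nic_connectx_6_capacity', 'nic_connectx_5_capacity', 'tesla_t4_capacity', 'rtx6000_capacity', 'a30_capacity', 'a40_capacity']
                      output_table = fablib.list_hosts(fields=fields)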
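
To pick out hosts that have both a ConnectX-6 and a GPU, a small predicate over the rows is enough. The sketch below runs on plain dictionaries using the field names from the snippet above. The rows are invented for illustration and the function name is hypothetical.

```python
GPU_FIELDS = ["tesla_t4_capacity", "rtx6000_capacity", "a30_capacity", "a40_capacity"]


def has_cx6_and_gpu(host):
    """True when a host row reports at least one ConnectX-6 and at least one GPU."""
    cx6 = int(host.get("nic_connectx_6_capacity") or 0)
    gpus = sum(int(host.get(field) or 0) for field in GPU_FIELDS)
    return cx6 > 0 and gpus > 0


# Invented rows for illustration only.
rows = [
    {"name": "example-w1", "nic_connectx_6_capacity": 2, "a30_capacity": 1},
    {"name": "example-w2", "nic_connectx_5_capacity": 2, "rtx6000_capacity": 2},
    {"name": "example-w3", "nic_connectx_6_capacity": 2},
]

matches = [row["name"] for row in rows if has_cx6_and_gpu(row)]
print(matches)
```

If your installed fablib version lets list_hosts take a filter_function argument, as list_sites does, a predicate like this could be passed there. Treat that as an assumption to verify, since the keys seen by the filter may differ from the field names used here.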

                      Thanks,
                      Komal

                      in reply to: GPU + Connectx6 SmartNIC node #7865
                      Komal Thareja
                      Participant

                        Hi Tanay,

                        As you mentioned, the current infrastructure supports GPUs with ConnectX-5. Unfortunately, GPUs with ConnectX-6 are not a feasible option at this time. I hope the available setup works well for your experiment.

                        Thanks,
                        Komal

                        in reply to: FPGA coming up with only 1 port #7854
                        Komal Thareja
                        Participant

                          I shared the bitfile flash status with Nishanth over email. He confirmed that FPGA slices with ESnet sites are working as expected.

                          Thanks,
                          Komal

                          in reply to: Permission denied for in-slice port mirroring #7853
                          Komal Thareja
                          Participant

                            Hi Vaneshi,

                            We are working on updating the permissions, but in the current release you need the Net.PortMirroring permission for in-slice port mirroring to work.

                            Thanks,
                            Komal

                            in reply to: Permission denied for in-slice port mirroring #7850
                            Komal Thareja
                            Participant

                              Hi Vaneshi,

                              Your project needs the Net.PortMirroring permission for this to work. Could you please check whether your project has this permission? If not, please ask your Project Owner or Lead to request it from the portal.
                              More details for requesting the permissions can be found here.

                              Thanks,
                              Komal

                              in reply to: FPGA coming up with only 1 port #7844
                              Komal Thareja
                              Participant

                                Hi Nishanth,

                                I will check other sites too. It would be helpful if you could share the sites you tried.
                                Could you please delete your slice and try again?

                                It might be related to which bitfile was used to flash the FPGA. For instance, INDI is flashed with a bitfile compatible with the NEU workflow.

                                Thanks,
                                Komal

                                in reply to: What is the maximum RAM and Disk space can allocated ? #7841
                                Komal Thareja
                                Participant

                                  Hi Yuanjun,

                                  Details about the VM profiles are available here. For specific flavor information, please refer to the GitHub link. Feel free to reach out if you have any questions or concerns.

                                  Thanks,

                                  Komal
