Komal Thareja

Forum Replies Created

Viewing 15 posts - 151 through 165 (of 478 total)
  • in reply to: Cant Access ‘classifier’ node in Slice #7873
    Komal Thareja
    Participant

      Hi Sourya,

      It looks like the authorized_keys file is not correct. I am not even able to log in using the Nova SSH keys.

      Could you please confirm whether you see a key ending with Generated-by-Nova in /home/ubuntu/.ssh/authorized_keys?

      Also, please share the output of the command ls -ltr /home/ubuntu/.ssh/.

      Thanks,

      Komal

      in reply to: GPU + Connectx6 SmartNIC node #7866
      Komal Thareja
      Participant

        Correction: Both the CERN and CIEN racks have ConnectX-6 and GPU available on the same host. However, the CIEN rack is currently under maintenance as it is being transported back from SC.

        You can proceed with your experiment on the CERN rack, subject to its availability.

        Additionally, here’s a Fablib code snippet to help you check for specific resources on hosts:


        fields=['name','nic_connectx_6_capacity','nic_connectx_5_capacity','tesla_t4_capacity','rtx6000_capacity', 'a30_capacity', 'a40_capacity']
        output_table = fablib.list_hosts(fields=fields)
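
To narrow that table down to hosts that have both a ConnectX-6 and a GPU, one option is to filter the host records in plain Python. The sketch below is only an illustration: it assumes each host is available as a dictionary keyed by the field names above with numeric capacities, and the helper name `hosts_with_cx6_and_gpu` and the sample records are made up for the example.

```python
# Capacity fields taken from the snippet above.
CX6_FIELD = "nic_connectx_6_capacity"
GPU_FIELDS = ["tesla_t4_capacity", "rtx6000_capacity", "a30_capacity", "a40_capacity"]


def hosts_with_cx6_and_gpu(hosts):
    """Return names of hosts that report at least one ConnectX-6 and one GPU.

    `hosts` is assumed to be a list of dicts keyed by the field names above.
    Missing or None capacities are treated as 0.
    """
    matches = []
    for host in hosts:
        cx6 = host.get(CX6_FIELD) or 0
        gpus = sum((host.get(field) or 0) for field in GPU_FIELDS)
        if cx6 > 0 and gpus > 0:
            matches.append(host["name"])
    return matches


# Hypothetical records, for illustration only (not real testbed data).
sample_hosts = [
    {"name": "host-a", "nic_connectx_6_capacity": 2, "a30_capacity": 1},
    {"name": "host-b", "nic_connectx_6_capacity": 2},
    {"name": "host-c", "nic_connectx_5_capacity": 2, "rtx6000_capacity": 2},
]
print(hosts_with_cx6_and_gpu(sample_hosts))
```

With the made-up records above, only `host-a` is returned, since it is the only one with both a ConnectX-6 and a GPU.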

        Thanks,
        Komal

        in reply to: GPU + Connectx6 SmartNIC node #7865
        Komal Thareja
        Participant

          Hi Tanay,

          As you mentioned, the current infrastructure supports GPUs with ConnectX-5. Unfortunately, GPUs with ConnectX-6 are not a feasible option at this time. I hope the available setup works well for your experiment.

          Thanks,
          Komal

          in reply to: FPGA coming up with only 1 port #7854
          Komal Thareja
          Participant

            I shared the bitfile flash status with Nishanth over email. He confirmed that FPGA slices with the ESnet sites are working as expected.

            Thanks,
            Komal

            in reply to: Permission denied for in-slice port mirroring #7853
            Komal Thareja
            Participant

              Hi Vaneshi,

              We are working on updating the permissions, but in the current release you need the Net.PortMirroring permission for in-slice port mirroring to work.

              Thanks,
              Komal

              in reply to: Permission denied for in-slice port mirroring #7850
              Komal Thareja
              Participant

                Hi Vaneshi,

                Your project needs the Net.PortMirroring permission for this to work. Could you please check whether your project has this permission? If not, please ask your Project Owner or Lead to request it from the portal.
                More details for requesting the permissions can be found here.

                Thanks,
                Komal

                in reply to: FPGA coming up with only 1 port #7844
                Komal Thareja
                Participant

                  Hi Nishanth,

                  I will check other sites too. It would be helpful if you could share the sites you tried.
                  Could you please delete your slice and try again?

                  It might be related to which bitfile was used to flash the FPGA. For instance, INDI is flashed with a bitfile compatible with the NEU workflow.

                  Thanks,
                  Komal

                  in reply to: What is the maximum RAM and Disk space can allocated ? #7841
                  Komal Thareja
                  Participant

                    Hi Yuanjun,

                    Details about the VM profiles are available here. For specific flavor information, please refer to the GitHub link. Feel free to reach out if you have any questions or concerns.

                    Thanks,

                    Komal

                    in reply to: Unable to delete nodes at sites due to ModifyError #7838
                    Komal Thareja
                    Participant

                      Hi Sourya,

                      I can confirm that both the node MICH_3D and the network interconnect4 have been deleted and are in the Closed state.
                      To move the slice from the ModifyError state to the StableError state, please execute the following block of code:


                      slice = fablib.get_slice(name="Tailscale_Mesh_VPN")
                      slice.modify_accept()

                      This should allow you to do any additional modifications.
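
If you would like to guard the call so that it only runs when the slice is actually stuck, a minimal sketch could look like the following. `get_slice` and `modify_accept` are the calls shown above; the use of `get_state()` returning the state name as a string is an assumption, and `accept_if_modify_error` is a hypothetical helper name.

```python
def accept_if_modify_error(fablib, slice_name):
    """Call modify_accept() only if the slice is in ModifyError.

    `fablib` is an already-initialised FablibManager-like object.
    Assumes slice.get_state() returns the state name as a string.
    Returns True if modify_accept() was called, False otherwise.
    """
    slice_obj = fablib.get_slice(name=slice_name)
    state = str(slice_obj.get_state())
    if state != "ModifyError":
        print(f"Slice {slice_name} is in state {state}; nothing to accept.")
        return False
    slice_obj.modify_accept()
    return True
```

In a notebook where `fablib` is already set up, this would be called as `accept_if_modify_error(fablib, "Tailscale_Mesh_VPN")`.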

                      Thanks,
                      Komal

                      in reply to: Unable to reserve FPGAs on KANS or WASH #7835
                      Komal Thareja
                      Participant

                        Hi Ilya,

                        I’ve confirmed that the FPGA is currently allocated to a slice belonging to another user. While the FPGA may have been flashed with your bitfile, our current allocation system does not reserve FPGAs for projects based on bitfile flash requests. If the FPGAs are not linked to a specific slice, they remain available for other users to request and use.

                        We are actively working on enhancing our allocation system and quota management, and I will take your experience as valuable feedback for these improvements. Apologies for any inconvenience this may have caused.

                        Thanks,
                        Komal

                        in reply to: Unable to reserve FPGAs on KANS or WASH #7832
                        Komal Thareja
                        Participant

                          KANS and LOSA both have the FPGA allocated.
                          WASH seems to have the FPGA available, but the slice might have been rejected based on the cores/RAM/disk requested.
                          Snapshot for WASH:

                          in reply to: Issue with SmartNIC Configuration on nodes #7830
                          Komal Thareja
                          Participant

                            I worked with Hemil over a Zoom meeting, and we were able to resolve the issue by renaming the bastion key in fabric_rc and re-executing the configure_and_validate.ipynb notebook.

                            Thanks,
                            Komal

                            in reply to: Issue with SmartNIC Configuration on nodes #7826
                            Komal Thareja
                            Participant

                              Hi Hemil,

                              Could you please run the jupyter-examples-*/configure_and_validate/configure_and_validate.ipynb notebook?
                              This should resolve any SSH key issues by renewing the bastion keys if they have expired.

                              Try your setup.sh script after that and let us know if you still see this error.
                              In addition, could you please try to SSH to the VMs using the command shown in the SSH Command column?

                              Regarding auto configuring the IP addresses, please specify the subnet when creating a network and set mode to auto for the interfaces at slice creation. Please refer to one of the following examples for more details.

                              https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/fablib_api/create_l2network_basic/create_l2network_basic_auto.ipynb

                              https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/fablib_api/create_l2network_wide_area/create_l2network_wide_area_auto.ipynb

                              https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/fablib_api/create_l3network_fabnet_ipv4/create_l3network_fabnet_ipv4_auto.ipynb
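
As a rough outline of what those notebooks do, the subnet is given when the network is created and each interface is switched to auto mode before it is attached. The sketch below is written from memory of the linked examples, so please treat the call names (`add_l2network(..., subnet=...)`, `add_node`, `add_component`, `get_interfaces`, `set_mode("auto")`, `add_interface`) as assumptions to check against the notebooks. The helper name, site name, node names, and subnet are placeholders.

```python
from ipaddress import IPv4Network


def add_auto_l2_network(slice_obj, site="SITE1", node_names=("node1", "node2"),
                        subnet="192.168.1.0/24"):
    """Attach one NIC per node to an L2 network with auto-assigned IPs.

    `slice_obj` is assumed to be a fablib slice created with
    fablib.new_slice(...). Site, node names, and subnet are placeholders.
    """
    # Give the network a subnet so addresses can be handed out from it.
    net = slice_obj.add_l2network(name="net1", subnet=IPv4Network(subnet))
    for node_name in node_names:
        node = slice_obj.add_node(name=node_name, site=site)
        iface = node.add_component(model="NIC_Basic",
                                   name="nic1").get_interfaces()[0]
        # Auto mode requests an address from the network's subnet.
        iface.set_mode("auto")
        net.add_interface(iface)
    return net
```

After calling this helper on a new slice, the slice would still need to be submitted as usual.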

                              Please let us know if you still run into errors or questions.

                              Thanks,
                              Komal

                              in reply to: Unable to reserve FPGAs on KANS or WASH #7800
                              Komal Thareja
                              Participant

                                Hi Ilya,

                                Could you please try your slice again? There were leaked slivers. I have cleared them, so slice provisioning should work now.

                                Thanks,
                                Komal

                                Komal Thareja
                                Participant

                                  Hi Ali,

                                  We only provide 1 GB of storage to users on Jupyter containers, in the /home/fabric/work directory. Could you please share a screenshot of the df -h output from the terminal in your container, and also describe how you are trying to upload the files?

                                  I was able to upload an 800 MB file to my container using the Jupyter Hub upload interface without any issues.
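
If it is easier than a screenshot, the free space can also be checked from a notebook cell. This is only a sketch using Python's standard `shutil.disk_usage`; whether that figure reflects the per-user limit on the container is an assumption, so the `df -h` output remains the reference. The helper names `free_space_mb` and `fits` are made up for the example.

```python
import shutil


def free_space_mb(path="/home/fabric/work"):
    """Return free space, in MB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / (1024 * 1024)


def fits(file_size_mb, path="/home/fabric/work"):
    """True if a file of `file_size_mb` MB fits in the free space at `path`."""
    return file_size_mb <= free_space_mb(path)
```

For example, `fits(800)` checks whether an 800 MB upload would fit in the work directory.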

                                  Thanks,
                                  Komal
