Forum Replies Created
-
AuthorPosts
-
closing.
Any help would be appreciated here! Thank you
I also tried the following:
slice = fablib.get_slice(name="CEPH_DOCA_POC")
slice.show()
# slice.delete()
DPU_NODE_NAME = "node3-dpu"
node = slice.get_node(name=DPU_NODE_NAME)
node.show()
node.execute("ip addr")
node.execute("sudo ip addr add 192.168.50.2/24 dev enp8s0")
Fabric returned this error:
File /opt/conda/lib/python3.11/site-packages/paramiko/transport.py:1130, in Transport.open_channel(self, kind, dest_addr, src_addr, window_size, max_packet_size, timeout)
   1128 if e is None:
   1129     e = SSHException("Unable to open channel.")
-> 1130 raise e
ChannelException: ChannelException(2, 'Connect failed')
I believe the instance lost power and restarted, and somehow some of the important networking config was lost. Any help would be really appreciated!!
Hi Mert,
Just a suggestion – it would be great to have a DOCA SNAP tutorial (like the artifacts we have for P4 and compression): https://docs.nvidia.com/doca/archive/2-9-1/doca+snap-4+service+guide/index.html#src-3453016610_id-.DOCASNAP4ServiceGuidev2.9.1-Hot-plugFirmwareConfiguration
I don't think this is a BlueField problem; it is most likely a host problem.
Steps:
1. To view the current configuration – sudo mlxconfig -d /dev/mst/mt41692_pciconf0 -e query
2. To change a configuration value (in this case we change PCI_SWITCH_EMULATION_ENABLE and NVME_EMULATION_ENABLE from 0 to 1) – sudo mlxconfig -d /dev/mst/mt41692_pciconf0 set PCI_SWITCH_EMULATION_ENABLE=1 NVME_EMULATION_ENABLE=1
3. Based on the DOCA documentation, perform a system reboot to apply the configuration changes – sudo mlxfwreset -d 03:00.0 -y -l 3 --sync 1 r
4. After the reboot, sudo mlxconfig -d /dev/mst/mt41692_pciconf0 -e query should display the updated values.
“Hotplug is not guaranteed to work on AMD machines.” – I did think that could be one of the reasons, but unfortunately I can't find any relevant logs at all. I will continue troubleshooting and let you know. I will also post an issue on the DOCA devzone to see if NVIDIA has any clues about this.
Thanks again Mert!
EDIT: There should be no differences in the commands for BlueField-2 or 3. 'mt41692' changes based on your device.
Use 'sudo mst start' and 'sudo mst status -v' inside the DPU to find that out.
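For anyone scripting the steps above, here is a minimal Python sketch that only assembles the four command strings in order (query, set, reset, query again). The function name firmware_config_commands and its parameters are made up for illustration; the device path, PCI address, and flags are copied from the steps in this post and will differ on other devices.

```python
import shlex


def firmware_config_commands(mst_device, pci_address, settings):
    """Build the command strings for the steps above, in order.

    mst_device  -- MST device path, e.g. the one reported by 'sudo mst status -v'
    pci_address -- PCI address passed to mlxfwreset
    settings    -- mapping of firmware parameter name to the new value
    (hypothetical helper: it builds strings only and runs nothing)
    """
    assignments = [f"{name}={value}" for name, value in settings.items()]
    query = ["sudo", "mlxconfig", "-d", mst_device, "-e", "query"]
    set_cmd = ["sudo", "mlxconfig", "-d", mst_device, "set", *assignments]
    reset = ["sudo", "mlxfwreset", "-d", pci_address,
             "-y", "-l", "3", "--sync", "1", "r"]
    # Query before the change, apply it, reset, then query again to verify.
    return [shlex.join(cmd) for cmd in (query, set_cmd, reset, query)]


if __name__ == "__main__":
    commands = firmware_config_commands(
        "/dev/mst/mt41692_pciconf0",
        "03:00.0",
        {"PCI_SWITCH_EMULATION_ENABLE": 1, "NVME_EMULATION_ENABLE": 1},
    )
    for command in commands:
        print(command)
```

Each string could then be run by hand on the DPU, or passed to node.execute(...) in the same way as the fablib calls elsewhere in this thread.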
This reply was modified 1 week, 5 days ago by
Tanay Maheshwari.
Hi Mert,
Unfortunately it didn't update the firmware configurations. I am trying to figure out what the blocker is here.
This is what I use to check whether the firmware configurations have been applied. They still remain the same.
sudo mlxconfig -d /dev/mst/mt41692_pciconf0 -e query (as seen in the screenshot, anything with an asterisk * is to be changed on reboot. It never does, though.)
In my local setup with a BlueField-2, a simple reboot or the above-mentioned mlxfwreset command is sufficient to apply changes. A power cycle is not required.
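The asterisk check described in this post can also be scripted. Below is a rough Python sketch, assuming the query output is available as text and that each pending line starts with a standalone * followed by the parameter name; that layout is an assumption based on the description above, and pending_parameters is a made-up helper name.

```python
def pending_parameters(query_output):
    """Return the names of parameters whose lines are flagged with '*'.

    Assumes the asterisk is its own whitespace-separated token at the
    start of the line and that the parameter name is the next token.
    """
    pending = []
    for line in query_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "*":
            pending.append(fields[1])
    return pending
```

If the returned list is still non-empty after the reset or reboot, the changes have not been applied yet.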
Thank you for taking the effort in helping me with this!
This reply was modified 1 week, 5 days ago by
Tanay Maheshwari.
Hi Mert,
Is it possible to do a cold reboot on the HAWI DPU to see if that applies the firmware configurations?
Hi Mert,
Apologies, but I had to delete the slice since I couldn't get anything to work there anymore. Also, I was unable to create a DPU slice in SEAT (it seems the DPU is still shut down).
I created a new DPU slice on HAWI, and this command worked there with no timeout.
sudo mlxfwreset -d 03:00.0 -y -l 3 --sync 1 r
However, the firmware configuration refuses to update, even after running that command and doing a manual reboot.
Slice Details:
ID: f761a02e-dae0-4122-b0a1-40b6cffc84e6
Name: CEPH_DOCA_POC
Lease Expiration (UTC): 2026-03-02 01:00:29 +0000
Lease Start (UTC): 2026-02-25 00:53:21 +0000
Project ID: 42b3494b-982f-4fe8-b160-26f28c3e33c0
State: StableOK
Email: mahesh88@purdue.edu
UserId: 14e40626-117b-43fe-a9dd-89b0063d126d
Would love some guidance here.
Thanks,
Tanay
Hi Komal, are the BF3s available for testing now?
output_table = fablib.list_sites()
I used this to list all the resources, but I couldn't find any.
Thanks!
October 31, 2025 at 1:52 pm in reply to: Bluefield NICs | FABRIC Webinar — November 11 at 3 PM ET #9132
Are the DPUs available for use already?
Hi Komal,
Any updates on the BlueField integration? I checked the Fall updates and they don't mention any BlueFields!
Thanks,
Tanay
Hi Komal,
Any updates on the DPU integration?
Hi Komal,
Just wanted to check if there is a timeline for integrating the BlueField DPUs.
Thanks,
Tanay
Hi Komal,
It's still unresolved. Do we have any updates?
Thanks,
Tanay
Hello, any ETA on this?
Thanks,
Tanay