Unable to SSH to one node in a 3-node slice
- This topic has 3 replies, 1 voice, and was last updated 2 days, 16 hours ago by
Tanay Maheshwari.
March 5, 2026 at 2:16 pm #9562
Hello, I have the following slice:
Slice
ID: f761a02e-dae0-4122-b0a1-40b6cffc84e6
Name: CEPH_DOCA_POC
Lease Expiration (UTC): 2026-03-14 00:00:29 +0000
Lease Start (UTC): 2026-02-25 00:53:21 +0000
Project ID: 42b3494b-982f-4fe8-b160-26f28c3e33c0
State: StableOK
Email: mahesh88@purdue.edu
UserId: 14e40626-117b-43fe-a9dd-89b0063d126d

It has 3 nodes. I am able to SSH into 2 of them, but not into the third one.
Node details:
Node
ID: 4cdae64a-1527-49fb-8be4-564216a16102
Name: node3-dpu
Cores: 16
RAM: 16
Disk: 100
Image: default_ubuntu_24
Image Type: qcow2
Host: hawi-w3.fabric-testbed.net
Site: HAWI
Username: ubuntu
Management IP: 2607:f278:1:202:f816:3eff:fe8b:7638
State: Active
Error:
SSH Command: ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@2607:f278:1:202:f816:3eff:fe8b:7638
Public SSH Key File: /home/fabric/work/fabric_config/slice_key.pub
Private SSH Key File: /home/fabric/work/fabric_config/slice_key

I was able to SSH with no issues a day ago. I am wondering what went wrong.
Thanks,
Tanay

EDIT: I am also able to SSH into the DPU inside node3, which means the node is definitely up and working. I believe there is some networking issue here? Maybe the node lost its IP 192.168.50.2 on the NIC?
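One quick way to narrow down the 192.168.50.2 question is to look at what kind of address each one is. Below is a minimal sketch using only Python's standard `ipaddress` module and the two addresses quoted in this post; the helper name `describe` is hypothetical. It shows that the SSH command targets a public IPv6 management address, while 192.168.50.2 is a private IPv4 address. Assuming the two sit on different interfaces (not confirmed anywhere in this thread), losing 192.168.50.2 alone would not be expected to break SSH to the management IP.

```python
import ipaddress


def describe(addr: str) -> dict:
    """Summarize an IP address: its version and whether it is in a private range."""
    ip = ipaddress.ip_address(addr)
    return {"address": str(ip), "version": ip.version, "private": ip.is_private}


# Management IP from the node table (the target of the SSH command).
mgmt = describe("2607:f278:1:202:f816:3eff:fe8b:7638")
# Address the poster suspects was lost on the NIC.
data = describe("192.168.50.2")

print(mgmt)
print(data)
```

If the two addresses differ in family and scope like this, it may be worth checking the management interface separately from the NIC that carries 192.168.50.2.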
-
This topic was modified 3 days, 14 hours ago by
Tanay Maheshwari.
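March 5, 2026 at 4:04 pm #9566
I also tried the following:
# FABlib manager setup, added so the snippet is self-contained
from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
fablib = fablib_manager()

slice = fablib.get_slice(name="CEPH_DOCA_POC")
slice.show()
# slice.delete()
DPU_NODE_NAME = "node3-dpu"
node = slice.get_node(name=DPU_NODE_NAME)
node.show()
# Inspect the interfaces, then try to re-add the address
node.execute("ip addr")
node.execute("sudo ip addr add 192.168.50.2/24 dev enp8s0")
Fabric returned this error: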
File /opt/conda/lib/python3.11/site-packages/paramiko/transport.py:1130, in Transport.open_channel(self, kind, dest_addr, src_addr, window_size, max_packet_size, timeout)
   1128 if e is None:
   1129     e = SSHException("Unable to open channel.")
-> 1130 raise e
ChannelException: ChannelException(2, 'Connect failed')
I believe the instance lost power and restarted, and somehow some of the important networking config was lost. Any help would be really appreciated!!
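The `ChannelException(2, 'Connect failed')` in the traceback carries a numeric reason code that can help localize the failure. As a rough sketch for reading it: the codes below are the channel-open failure reasons defined in RFC 4254, section 5.1, while the function name `explain_channel_error` and the regular expression are illustrative and not part of paramiko. Code 2 means the SSH server that was asked to open the channel could not connect onward to the target. That this server is a jump host in front of the node is an assumption here, based on the `-F ssh_config` option in the SSH command.

```python
import re

# Channel-open failure reason codes from RFC 4254, section 5.1.
REASONS = {
    1: "administratively prohibited",
    2: "connect failed",
    3: "unknown channel type",
    4: "resource shortage",
}


def explain_channel_error(message: str) -> str:
    """Extract the reason code from a ChannelException message and name it."""
    match = re.search(r"ChannelException\((\d+),", message)
    if match is None:
        return "no ChannelException code found"
    code = int(match.group(1))
    return f"code {code}: {REASONS.get(code, 'unknown reason')}"


# The last line of the traceback posted above.
result = explain_channel_error("ChannelException: ChannelException(2, 'Connect failed')")
print(result)
```

Read this way, the error suggests the connection attempt to the node's management address failed, which is a different symptom from an authentication or key problem.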
March 6, 2026 at 9:29 am #9567
Any help would be appreciated here! Thank you
March 6, 2026 at 12:01 pm #9569
Closing.
- The topic ‘Unable to SSH to one node in a 3-node slice’ is closed to new replies.