I cannot access some of my nodes
- This topic has 6 replies, 2 voices, and was last updated 4 days, 3 hours ago by
Fatih Berkay Sarpkaya.
May 1, 2026 at 12:52 am #9735
Dear FABRIC team,
I hope you are doing well. I suddenly lost access to some of my nodes. I was previously able to connect, but now SSH fails with a “No route to host” error through the bastion.
Could you please check whether there is any issue with these nodes?
Slice ID: d7ac84d6-d791-423f-88a0-90ccacd7880a
node name: r-2-1 Node ID: 7b4c35dd-c7d1-4d29-9ca0-c71d21e6089e
node name: r-2-3 Node ID: c834417a-7393-4cae-bd62-722358b6451f
Thank you.
Best regards,
Fatih Berkay Sarpkaya
May 1, 2026 at 12:03 pm #9739
Both VMs had crashed. I’m attaching the console outputs.
console.7b4c35dd-c7d1-4d29-9ca0-c71d21e6089e-r-2-1
console.c834417a-7393-4cae-bd62-722358b6451f-r-2-3
I restarted them, and they are back online. I also attached their PCI devices (IP addresses need to be re-assigned).
May 1, 2026 at 1:04 pm #9740
Thank you so much. I checked the nodes, and they are working now. However, I lost connection to r-4-1 this time. Could you please also check this node?
Slice ID: d7ac84d6-d791-423f-88a0-90ccacd7880a
node name: r-4-1 Node ID: 33186378-c0a9-48de-a382-0e78cb209d6b
Thank you for your time.
Best regards,
Fatih Berkay Sarpkaya
May 1, 2026 at 1:22 pm #9741
Same situation. I rebooted the VM and attached its devices.
I’m not sure what is causing this. The worker node is not under extreme load, but inside the VMs there seem to be Mellanox driver issues. If you can share some context about the experiment and the traffic it generates or exchanges, we can try to understand the cause and find a way to keep it running reliably. Otherwise, I don’t have any leads right now. You can also reach out to me directly if you prefer.
May 1, 2026 at 1:37 pm #9742
Thank you so much for your help.
We are running a multi-AS IPv6 routing experiment with multiple router VMs organized into 6 ASes and a few endpoint VMs. The routers run FRR for BGP and OSPFv3, and we use SRv6 to steer some flows along specific paths. The crashes seem to happen when we push routing/SRv6 configuration changes across all routers at once. So far, three different routers have crashed in this way: r-2-1, r-2-3, and now r-4-1.
The console output you sent seems to point to kernel-level issues; one looked like a Mellanox driver issue, and another looked like a possible SRv6 kernel bug. We are using Ubuntu 22.04 with kernel 5.15.0-143-generic.
I am not sure whether this is something we can reliably fix from our side, but please let me know if you have any suggestions.
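One way to test whether the simultaneous push is the trigger would be to apply the changes one router at a time with a pause in between. The following is only a minimal sketch under assumptions: `apply_config` is a hypothetical callable standing in for however the configuration is actually delivered to a router (it is not part of FRR or FABRIC), and the delay value is arbitrary.

```python
import time
from typing import Callable, Iterable, List


def staggered_push(
    routers: Iterable[str],
    apply_config: Callable[[str], None],
    delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Apply a config change to one router at a time, pausing between routers.

    `apply_config` is a hypothetical hook supplied by the caller.
    Returns the routers configured before the first failure
    (all of them if nothing failed).
    """
    done: List[str] = []
    for index, name in enumerate(routers):
        if index > 0:
            # Give the previous router time to settle before touching the next.
            sleep(delay_s)
        try:
            apply_config(name)
        except Exception as exc:
            # Stop early so a failing router is easy to identify.
            print(f"stopping at {name}: {exc}")
            break
        done.append(name)
    return done
```

If a router still crashes under a staggered push, the last name printed narrows down which change was being applied at that moment, which may help when comparing against the console output.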
Thank you.
Best regards,
Fatih Berkay Sarpkaya
May 1, 2026 at 3:10 pm #9743
For node “r-4-1”, which you have just restored, could you please check again whether its connections are correct? It was directly connected to “r-1-4”, but I currently cannot ping between these nodes, so I suspect there may be an issue with the L2 link between them.
Thank you for your time.
Best regards,
Fatih Berkay Sarpkaya
May 1, 2026 at 3:28 pm #9744
Hi,
Sorry, this could be my mistake. After the reboot, the Linux interface assignment may have changed. I can now see the connection through a different interface than before the crash.
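Since interface names may change across a reboot, looking interfaces up by MAC address is one way to avoid this confusion. The sketch below assumes the MAC address of each link's NIC is known (for example, noted before the crash) and reads the standard Linux `/sys/class/net/<iface>/address` files; the function name and the example MAC in the usage note are illustrative only.

```python
from pathlib import Path
from typing import Dict


def interfaces_by_mac(sysfs_net: str = "/sys/class/net") -> Dict[str, str]:
    """Return a mapping of MAC address -> current Linux interface name."""
    mapping: Dict[str, str] = {}
    for dev in sorted(Path(sysfs_net).iterdir()):
        addr_file = dev / "address"
        if not addr_file.is_file():
            # Skip entries that are not network devices.
            continue
        mac = addr_file.read_text().strip().lower()
        # Loopback reports an all-zero address; ignore it.
        if mac and mac != "00:00:00:00:00:00":
            mapping[mac] = dev.name
    return mapping
```

For example, `interfaces_by_mac().get("02:aa:bb:cc:dd:01")` would return whichever name the kernel currently assigns to that NIC (the MAC here is made up), so IP addresses can be re-assigned to the right link after a restart.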
Thank you.
Kind regards,
Fatih Berkay Sarpkaya