Perhaps one of the bastion hosts is out
- This topic has 16 replies, 2 voices, and was last updated 2 weeks, 4 days ago by
Hussam Nasir.
January 19, 2026 at 9:33 pm #9391
Today when I try to log in to my VMs, once in a while I hit dead air (the connection hangs while connecting to the bastion). When I try again, it works. Maybe a dead bastion host? I haven’t spent any time debugging.
January 20, 2026 at 7:49 am #9392
Hello Ilya,
We have added two new bastions recently and modified one of the pre-existing ones. Can you please post the result of
“nslookup bastion.fabric-testbed.net” from the machine where this failed?
January 20, 2026 at 9:50 am #9393
This is what I see (this is from my home on Google Fiber):
nslookup bastion.fabric-testbed.net
Server: 192.168.1.1
Address: 192.168.1.1#53

Non-authoritative answer:
Name: bastion.fabric-testbed.net
Address: 128.163.180.149
Name: bastion.fabric-testbed.net
Address: 23.134.235.242
Name: bastion.fabric-testbed.net
Address: 141.142.140.10
Name: bastion.fabric-testbed.net
Address: 152.54.15.12

I also noticed that some commands sent to VMs over SSH via my laptop-local notebook don’t happen or are very delayed, which I suspect is part of the same issue. Strangely all these are reachable via ssh.
January 20, 2026 at 10:16 am #9396
One thing that stands out is that I don’t see any IPv6 addresses for the bastion host in your name lookup. We had been seeing issues with IPv6 from home networks, but we believe the workaround we put in place has worked, as other users have reported. I would also like to see the fablib.log file. Also, could you provide your source IP? It’s possible that one of the bastions has banned it.
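To check from a client machine which address families the bastion name resolves to, a short script can group the resolver's answers into IPv4 and IPv6. This is only a sketch: the helper names `group_by_family` and `lookup` are made up for illustration, and whether IPv6 records appear can depend on the local resolver and network setup.

```python
import socket
from collections import defaultdict


def group_by_family(addrinfo):
    """Group socket.getaddrinfo() results into {'IPv4': [...], 'IPv6': [...]}."""
    groups = defaultdict(set)
    for family, _type, _proto, _canonname, sockaddr in addrinfo:
        if family == socket.AF_INET:
            groups["IPv4"].add(sockaddr[0])
        elif family == socket.AF_INET6:
            groups["IPv6"].add(sockaddr[0])
    # Sort for stable output; a family with no addresses is simply absent.
    return {name: sorted(addrs) for name, addrs in groups.items()}


def lookup(host="bastion.fabric-testbed.net", port=22):
    """Resolve `host` through the system resolver (performs a real DNS query)."""
    return group_by_family(
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    )


# Example (needs network access):
#   for family, addrs in lookup().items():
#       print(family, addrs)
```

If the result has no "IPv6" key, the client will only ever try the IPv4 bastion addresses.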
January 20, 2026 at 10:26 am #9398
My IP is 136.61.60.222
I do not have any IPv6 on my home network so it isn’t surprising. I’m using a DNS proxy, but even if I ask 8.8.8.8 directly I get:
$ dig @8.8.8.8 bastion.fabric-testbed.net

; <<>> DiG 9.10.6 <<>> @8.8.8.8 bastion.fabric-testbed.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15505
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;bastion.fabric-testbed.net. IN A

;; ANSWER SECTION:
bastion.fabric-testbed.net. 3600 IN A 23.134.235.242
bastion.fabric-testbed.net. 3600 IN A 128.163.180.149
bastion.fabric-testbed.net. 3600 IN A 141.142.140.10
bastion.fabric-testbed.net. 3600 IN A 152.54.15.12
The log is full of the following messages
[21:02:48] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:02:48] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:02:48] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:02:48] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:09:13] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[21:09:13] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:14:12] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[21:14:12] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:43:49] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[21:43:49] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[21:43:49] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:43:49] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:47:35] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[21:47:35] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[21:47:35] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[21:47:35] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[21:47:35] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:47:35] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:47:35] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[21:47:35] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection timed out: Connect failed
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[22:11:24] {/Users/baldin/venv/fabric/lib/python3.12/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
January 20, 2026 at 10:34 am #9399
The fablib logs do not say which bastion was tried. Enabling verbose debug logging might show it. The other thing we can try is to SSH directly to the bastions one by one to see which one fails to connect (every login attempt will be refused, since direct SSH is not allowed, but a bastion that hangs should stand out).
Here is the list of all the bastions: https://learn.fabric-testbed.net/knowledge-base/frequently-asked-starter-questions/ (last question in the FAQ)
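The one-by-one check suggested above can be roughly approximated with a TCP probe of port 22 on each bastion address. This is a sketch under assumptions: `probe` and `probe_all` are illustrative names, the addresses in the usage comment are the four A records from the nslookup output earlier in the thread, and the probe only tests whether the bastion's SSH port accepts a connection. It says nothing about the onward hop from the bastion to a VM.

```python
import socket
import time


def probe(address, port=22, timeout=5.0):
    """Attempt a TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # covers timeouts, refusals, unreachable hosts
        return False, time.monotonic() - start, str(exc)


def probe_all(addresses, port=22, timeout=5.0):
    """Probe each address in turn and return {address: probe result}."""
    return {addr: probe(addr, port, timeout) for addr in addresses}


# Example (needs network access):
#   bastions = ["128.163.180.149", "23.134.235.242",
#               "141.142.140.10", "152.54.15.12"]
#   for addr, (ok, secs, err) in probe_all(bastions).items():
#       print(addr, "ok" if ok else f"FAILED ({err})", f"{secs:.2f}s")
```

Because the hang reported here is intermittent, running the probe repeatedly over a longer period is more likely to catch a failure than a single pass.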
January 20, 2026 at 10:37 am #9400
Yeah strangely I can connect to all of them right now, so it must be intermittent. I may change my fabric ssh config to use a specific bastion and see if that changes how things work. I’ll update the debug level to see if I can catch it in the act also.
This reply was modified 2 weeks, 5 days ago by Ilya Baldin.
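While waiting to catch a failure with debug logging enabled, the entries already in fablib.log can be tallied by timestamp to see when the failures cluster. The sketch below assumes the line layout seen in the excerpt posted earlier in the thread (`[HH:MM:SS] {path:line} LEVEL - message`); `tally` is an illustrative name, and the paths in the sample are abbreviated placeholders, not real log content.

```python
import re
from collections import Counter

# Matches "[21:09:13] {some/path.py:123} ERROR - " and captures time and level.
LINE_RE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\] \{[^}]*\} (WARNING|ERROR) - ")


def tally(log_text):
    """Count (timestamp, level) pairs in fablib-style log text."""
    return Counter(LINE_RE.findall(log_text))


# Abbreviated sample in the same layout as the log excerpt (paths shortened).
sample = (
    "[21:09:13] {x/paramiko/transport.py:1944} ERROR - Secsh channel 0 open "
    "FAILED: Connection timed out: Connect failed\n"
    "[21:09:13] {x/fablib/node.py:1600} WARNING - Attempt 1 failed: "
    "ChannelException(2, 'Connect failed')\n"
)

for (stamp, level), count in sorted(tally(sample).items()):
    print(stamp, level, count)
```

The pattern does not depend on line breaks, so it also works on log text that has been pasted as a single run.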
January 20, 2026 at 10:28 pm #9404
I experimentally determined (by manually specifying which bastion to use) that it is bastion-star-1 that is hanging for me.
January 21, 2026 at 8:53 am #9406
I do see successful logins from your ID in the bastion-star logs, even from early this morning.
January 21, 2026 at 8:56 am #9407
That’s suspect. (a) I was not doing anything this morning, and (b) if I configure bastion-star-1 as my bastion host I still cannot log in to my slice; it works if I configure e.g. bastion-renc-1.
January 21, 2026 at 9:33 am #9408
Can you post the full ssh command? The issue may be the connection between STAR and the destination rack.
January 21, 2026 at 9:44 am #9409
So for one of the nodes I do something like that:
ssh -i /path/to/slice_key -F ~/path/to/fabric_config ubuntu@2001:400:a100:3030:f816:3eff:fe07:665e
and my fabric_config looks something like this:
UserKnownHostsFile /dev/null
StrictHostKeyChecking no
ServerAliveInterval 120

Host bastion-star-1.fabric-testbed.net
    User username
    ForwardAgent yes
    Hostname %h
    IdentityFile ~/.ssh/mykey
    IdentitiesOnly yes

Host * !bastion-star-1.fabric-testbed.net
    ProxyJump username@bastion-star-1.fabric-testbed.net:22
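Pinning a specific bastion, as done earlier in the thread to isolate bastion-star-1, amounts to writing the same config with a different host name. A small generator makes it easy to produce one config per bastion and compare them. This is a sketch: `render_ssh_config` is an illustrative helper, and the fully qualified name `bastion-renc-1.fabric-testbed.net` in the usage comment is an assumption based on the short name mentioned in the thread.

```python
def render_ssh_config(bastion, username, key_path="~/.ssh/mykey"):
    """Return ssh config text that jumps through exactly one bastion host."""
    lines = [
        "UserKnownHostsFile /dev/null",
        "StrictHostKeyChecking no",
        "ServerAliveInterval 120",
        "",
        f"Host {bastion}",
        f"    User {username}",
        "    ForwardAgent yes",
        "    Hostname %h",
        f"    IdentityFile {key_path}",
        "    IdentitiesOnly yes",
        "",
        # Every other host is reached through the pinned bastion.
        f"Host * !{bastion}",
        f"    ProxyJump {username}@{bastion}:22",
        "",
    ]
    return "\n".join(lines)


# Example: write one config file per bastion to test (host name is assumed).
#   text = render_ssh_config("bastion-renc-1.fabric-testbed.net", "username")
#   with open("fabric_config_renc", "w") as fh:
#       fh.write(text)
```

Each generated file can then be passed to ssh with `-F`, the same way as the config above.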
January 21, 2026 at 10:45 am #9412
The good news is that I have narrowed the issue down to your VM at STAR. Does the issue also occur with VMs at other sites?
January 21, 2026 at 10:47 am #9413
Fascinating. I do not have slices at other sites. I have one slice and all of it is in STAR and as far as I can tell all nodes have this problem.
Slice ID is 16c49677-636b-4d3c-b71d-7fff7a75db09
January 21, 2026 at 10:58 am #9415
I see the issue on your VM. I believe you are using FABNETv4 EXT and FABNETv6 EXT. While configuring them, you may have accidentally set that NIC to be used for the default route. As a result, the system has two default routes going out of two different interfaces.
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:07:66:5e brd ff:ff:ff:ff:ff:ff
    inet 10.30.6.168/23 metric 100 brd 10.30.7.255 scope global dynamic enp3s0
       valid_lft 58217sec preferred_lft 58217sec
    inet6 2001:400:a100:3030:f816:3eff:fe07:665e/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 86383sec preferred_lft 14383sec
    inet6 fe80::f816:3eff:fe07:665e/64 scope link
       valid_lft forever preferred_lft forever
3: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 2a:77:c8:60:d3:bf brd ff:ff:ff:ff:ff:ff
    inet 10.129.130.253/24 scope global enp7s0
       valid_lft forever preferred_lft forever
    inet 23.134.235.195/28 scope global enp7s0
       valid_lft forever preferred_lft forever
    inet6 2602:fcfb:101::3/28 scope global
       valid_lft forever preferred_lft forever
    inet6 2602:fcfb:101:0:2877:c8ff:fe60:d3bf/64 scope global dynamic mngtmpaddr
       valid_lft 2591807sec preferred_lft 604607sec
    inet6 fe80::2877:c8ff:fe60:d3bf/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether ca:03:f6:60:d5:34 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

root@SenderSTAR:~# ip -6 route show
::1 dev lo proto kernel metric 256 pref medium
2001:400:a100:3030::/64 dev enp3s0 proto ra metric 100 expires 86371sec pref medium
2001:400:a300::/48 via 2602:fcfb:101::1 dev enp7s0 metric 1024 pref medium
2602:fcfb:101::/64 dev enp7s0 proto kernel metric 256 expires 2591914sec pref medium
2602:fcf0::/28 dev enp7s0 proto kernel metric 256 pref medium
fe80::a9fe:a9fe via fe80::f816:3eff:fe79:edec dev enp3s0 proto ra metric 100 expires 271sec pref medium
fe80::/64 dev enp3s0 proto kernel metric 256 pref medium
fe80::/64 dev enp7s0 proto kernel metric 256 pref medium
default via fe80::f816:3eff:fe79:edec dev enp3s0 proto ra metric 100 expires 271sec mtu 9000 pref medium
default via fe80::c28b:2aff:fe82:6d02 dev enp7s0 proto ra metric 1024 expires 1714sec hoplimit 64 pref medium

As soon as I disabled the enp7s0 NIC using ip link set dev enp7s0 down, the VM started working via SSH.
It is possible that there is a routing issue when FABNETv6 is used at STAR with STAR Bastion. I will ask for this use case to be investigated.
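The condition described in this post, two default routes leaving through two different interfaces, can be spotted automatically from the output of `ip -6 route show`. The sketch below parses that text; `default_routes` and `has_conflicting_defaults` are illustrative names, and `SAMPLE` holds three lines copied from the route table above. It only reports the condition and makes no claim about which route the kernel will pick.

```python
SAMPLE = """\
2001:400:a100:3030::/64 dev enp3s0 proto ra metric 100 expires 86371sec pref medium
default via fe80::f816:3eff:fe79:edec dev enp3s0 proto ra metric 100 expires 271sec mtu 9000 pref medium
default via fe80::c28b:2aff:fe82:6d02 dev enp7s0 proto ra metric 1024 expires 1714sec hoplimit 64 pref medium
"""


def _value_after(fields, key):
    """Return the token following `key`, or None if `key` is absent."""
    if key in fields:
        idx = fields.index(key)
        if idx + 1 < len(fields):
            return fields[idx + 1]
    return None


def default_routes(route_output):
    """Return (gateway, device, metric) for each 'default' route line."""
    routes = []
    for line in route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        metric = _value_after(fields, "metric")
        routes.append((
            _value_after(fields, "via"),
            _value_after(fields, "dev"),
            int(metric) if metric is not None else None,
        ))
    return routes


def has_conflicting_defaults(route_output):
    """True when default routes leave through more than one interface."""
    devices = {dev for _via, dev, _metric in default_routes(route_output)}
    return len(devices) > 1


for via, dev, metric in default_routes(SAMPLE):
    print(f"default via {via} dev {dev} metric {metric}")
print("conflict:", has_conflicting_defaults(SAMPLE))
```

On a VM, the same functions could be fed the live output, for example via `subprocess.run(["ip", "-6", "route", "show"], capture_output=True, text=True).stdout`.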