3/4 nodes in slice not accessible via SSH
- This topic has 2 replies, 2 voices, and was last updated 3 weeks ago by
Pete Stenger.
March 5, 2025 at 11:52 pm #8335
When I try to connect to my setup, I get an output like:
VM Name: wgclient
VM IP: 192.168.1.2
Reservation Active
SSH working? True
VM Name: wgnet-1
VM IP: 192.168.1.3
Reservation Active
SSH working? False
VM Name: wgnet-2
VM IP: 192.168.1.4
Reservation Active
SSH working? False
VM Name: wgnet-3
VM IP: 192.168.1.5
Reservation Active
SSH working? False
Project ID: “2aaaea18-5cf9-497a-ade0-b4f51112a34d”
I am running locally using a token in my token.json that I generated through the credential manager (not using JupyterHub). The configuration is shown in the screenshot below.
These are the created nodes / slices:
This is the creation code:
This is the code that checks whether the reservations are active and SSH is working. I also tried variants of node.os_reboot() and node.config().
# `slice` is the fablib slice object created by the code above
for vm_name in ["wgclient", "wgnet-1", "wgnet-2", "wgnet-3"]:
    print('VM Name:', vm_name)
    node = slice.get_node(vm_name)
    # IP address of the node's first interface
    print(f"VM IP: {node.get_interfaces()[0].get_ip_addr()}")
    print('Reservation', node.get_reservation_state())
    print('SSH working?', node.test_ssh())
I’m not sure exactly what’s going on, but I can only access the “wgclient” VM via ssh, not my “wgnet-*” VMs.
When I recreate the slice and try to connect, all 4 VMs show as available, but ~30 minutes later only the “wgclient” VM is. I run a script on each VM that installs some tooling, which takes ~7 minutes per VM. Once the tooling is installed on all of them, I can only connect to the “wgclient” VM via SSH.
My guess is that the way I set up the network interfaces is somehow incorrect. I also made sure that my Bastion and Sliver SSH keys aren’t expired on the site under my profile > SSH Keys.
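Since SSH works right after slice creation and breaks only after the install script has run, one way to narrow this down is to check SSH after every install step and stop at the first step that breaks it. The following is a minimal sketch, not fablib code: `first_breaking_step` is a hypothetical helper, and the steps and the SSH check are passed in as plain callables.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_breaking_step(
    steps: Iterable[Tuple[str, Callable[[], None]]],
    ssh_ok: Callable[[], bool],
) -> Optional[str]:
    """Run labelled steps in order and return the label of the first step
    after which ssh_ok() returns False, or None if SSH survives them all."""
    if not ssh_ok():
        # SSH was already broken before any step ran.
        return "<before any step>"
    for label, run in steps:
        run()
        if not ssh_ok():
            return label
    return None
```

With fablib, `ssh_ok` could plausibly be `node.test_ssh` and each step a small lambda that runs one command of the install script on the node (for example through `node.execute(...)`, assuming that method is available in your fablib version). The returned label then names the command to look at.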
Full stack trace in /tmp/fablib/fablib.log
[22:51:05] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/fabrictestbed_extensions/fablib/fablib.py:1158} INFO - orchestrator_host=orchestrator.fabric-testbed.net,credmgr_host=cm.fabric-testbed.net,core_api_host=uis.fabric-testbed.net,am_host=artifacts.fabric-testbed.net,project_id=b24ba048-5b54-4034-b49f-16f8fbf3e35f,token_location=/home/retep/repos/cs538-project/local/token.json,initialize=True,scope='all'
[22:51:06] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/fabrictestbed/token_manager/token_manager.py:164} INFO - Project Id/Name not specified, trying to determine it from the token
[22:51:16] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/fabrictestbed_extensions/fablib/fablib.py:955} INFO - Fetching User's information
[22:51:17] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/fabrictestbed_extensions/fablib/fablib.py:987} INFO - User: peteras4@illinois.edu bastion key is valid!
[22:51:31] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1944} ERROR - Secsh channel 0 open FAILED: Connection refused: Connect failed
[22:51:31] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: ChannelException(2, 'Connect failed')
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1944} ERROR - Exception (client): Error reading SSH protocol banner
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - Traceback (most recent call last):
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py", line 2369, in _check_banner
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - buf = self.packetizer.readline(timeout)
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/packet.py", line 395, in readline
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - buf += self._read_timeout(timeout)
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/packet.py", line 665, in _read_timeout
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - raise EOFError()
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - EOFError
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR -
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - During handling of the above exception, another exception occurred:
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR -
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - Traceback (most recent call last):
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py", line 2185, in run
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - self._check_banner()
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py", line 2373, in _check_banner
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - raise SSHException(
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - paramiko.ssh_exception.SSHException: Error reading SSH protocol banner
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR -
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/fabrictestbed_extensions/fablib/node.py:1600} WARNING - Attempt 1 failed: Error reading SSH protocol banner
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1944} ERROR - Exception (client): Error reading SSH protocol banner
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - Traceback (most recent call last):
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py", line 2369, in _check_banner
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - buf = self.packetizer.readline(timeout)
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/packet.py", line 395, in readline
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - buf += self._read_timeout(timeout)
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/packet.py", line 665, in _read_timeout
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - raise EOFError()
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - EOFError
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR -
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - During handling of the above exception, another exception occurred:
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR -
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - Traceback (most recent call last):
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py", line 2185, in run
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - self._check_banner()
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - File "/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py", line 2373, in _check_banner
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - raise SSHException(
[22:51:32] {/home/retep/repos/cs538-project/local/.venv/lib/python3.11/site-packages/paramiko/transport.py:1942} ERROR - paramiko.ssh_exception.SSHException: Error reading SSH protocol banner
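Most of the log above is paramiko's traceback repeated; the informative lines are the WARNING entries from node.py. As a rough sketch, assuming every entry starts with `[HH:MM:SS] {path:line} LEVEL -` as in the excerpt, the log text can be split into entries and the failed attempts counted like this (`split_entries` and `failed_attempts` are hypothetical helper names, not part of fablib):

```python
import re
from collections import Counter

# Header of one log entry, e.g. "[22:51:31] {/path/to/node.py:1600} WARNING -"
ENTRY = re.compile(r"\[(\d{2}:\d{2}:\d{2})\] \{([^}]+):(\d+)\} (\w+) -")


def split_entries(text):
    """Split log text into (time, file, line, level, message) tuples.

    The message of an entry is everything up to the next entry header,
    so this also works when the log has been flattened onto one line.
    """
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = text[m.end():end].strip()
        entries.append((m.group(1), m.group(2), int(m.group(3)), m.group(4), message))
    return entries


def failed_attempts(text):
    """Count WARNING messages that start with 'Attempt'."""
    return Counter(
        message
        for _, _, _, level, message in split_entries(text)
        if level == "WARNING" and message.startswith("Attempt")
    )
```

Calling `failed_attempts(open("/tmp/fablib/fablib.log").read())` would then give a short summary of which connection errors occurred and how often, which is easier to read than the raw traceback.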
-
This topic was modified 3 weeks, 4 days ago by
Pete Stenger.
March 6, 2025 at 8:37 am #8341
Hi,
I looked at the logs on one of the failed nodes and found that the last command run before the node failed was
“sudo /usr/sbin/ldconfig /home/ubuntu/openssl/build/lib64/”
This command breaks the sshd daemon running on the machine, which causes you to lose your SSH connection. A reboot would fix SSH, because the library you built is then not loaded.
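One way to avoid this class of problem, assuming the custom OpenSSL build is only needed by your own tooling, is to leave the system-wide linker cache alone and expose the library directory to individual commands through `LD_LIBRARY_PATH`. The sketch below is illustrative only; `env_with_library_path` and `run_with_library` are hypothetical helpers meant to run on the VM itself, and the directory is the one from the command above.

```python
import os
import subprocess


def env_with_library_path(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir prepended to
    LD_LIBRARY_PATH. The original environment is not modified."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


def run_with_library(cmd, lib_dir):
    """Run cmd (a list of arguments) so that only this process and its
    children see lib_dir; sshd and other system services are unaffected."""
    return subprocess.run(
        cmd,
        env=env_with_library_path(lib_dir),
        check=True,
        capture_output=True,
        text=True,
    )
```

For example, `run_with_library(["/home/ubuntu/openssl/build/bin/openssl", "version"], "/home/ubuntu/openssl/build/lib64/")` would use the custom libraries for that one command; the binary path here is a guess at where the build places its executable.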
March 10, 2025 at 2:30 am #8349
Thank you!
- The topic ‘3/4 nodes in slice not accessible via SSH’ is closed to new replies.