Forum Replies Created
Since you’re able to log in to this problematic VM from other sources, you can check and make sure the right SSH key is inside the VM (a sketch of the check is below). I placed my SSH key in it and could log in properly. Please let us know the status after you check the SSH key, and I will take a further look.
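For reference, a minimal sketch of that check, assuming the default Ubuntu image (login user ubuntu); the key file name is a placeholder:

# On the VM, from a session that still works:
cat ~/.ssh/authorized_keys
# Compare the entries against the public half of the sliver key you use locally, e.g.:
cat ~/.ssh/slice_key.pub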
If I understand the problem correctly from the description (“manual connect”), you’re trying to connect to the VM(s) from a terminal on your computer/laptop and getting the error. If that’s the case, you need to set up your SSH client configuration file and SSH keys properly (on your computer/laptop) and then connect; a minimal sketch is included below. This page can be helpful -> https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/
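For reference, a minimal sketch of such a setup, following the pattern described on the linked page; <bastion_username>, the key file names, and the config file location are placeholders you need to adapt to your own account:

# ~/.ssh/fabric_ssh_config (location is a placeholder)
Host bastion.fabric-testbed.net
    User <bastion_username>
    IdentityFile ~/.ssh/fabric_bastion_key

Host * !bastion.fabric-testbed.net
    ProxyJump <bastion_username>@bastion.fabric-testbed.net

# Connect with your sliver key (the login user depends on the image, e.g. ubuntu):
ssh -F ~/.ssh/fabric_ssh_config -i ~/.ssh/slice_key ubuntu@<vm_management_ip>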
If my understanding is wrong and the problem is something different, please disregard the info above.
Thank you, Komal, for the information.
Khawar, can you please describe the directory where your “critical data” resides on the VM?
I checked your VM and found it in a crashed state. Without digging into the logs I can’t say when or how it crashed or was rebooted, but the worker node it’s running on (star-w2) is fully occupied with VMs, and we will look into possible out-of-memory issues on the hypervisor. It would be good to re-create this VM on another worker node at STAR, or to use a smaller flavor if you keep it on star-w2.
Hello Khawar,
Your VM was shut down by the hypervisor and I started it now. Please let us know if you have any other issues. We will be investigating the main cause of this shut down internally.
Best regards,
Mert
Issue with the bastion host traffic is resolved. You can try creating your slices with the standard bastion host settings (with bastion.fabric-testbed.net).
There is a problem with upstream connectivity affecting one of the bastion hosts, causing intermittent interruptions. We are working on the issue. In the meantime, you can set a specific bastion host in your fabric_rc file (e.g. FABRIC_BASTION_HOST=bastion-renc-1.fabric-testbed.net; a sketch of this change appears below). We will post updates on the status of the underlying issue.
This maintenance is completed. UCSD is open for experiments.
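A minimal sketch of the fabric_rc change from the bastion post above, assuming your fabric_rc follows the stock shell-export format:

# In your fabric_rc file:
export FABRIC_BASTION_HOST=bastion-renc-1.fabric-testbed.net
# Reload the environment afterwards (e.g. restart your Jupyter kernel or
# re-source the file) so that fablib picks up the new value.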
Maintenance is completed. CLEM node is available for experiments.
December 12, 2025 at 5:18 pm in reply to: Cannot SSH into NS2 and NS4 nodes, need to preserve data (PhD simulations) #9263
Hello Danilo,
I checked both VMs (NS2 and NS4) and they are up and online. However, there seem to be some changes to the SSH key(s) injected by the system, and I cannot log in to the VMs. I’m not sure about the root cause of the exceptions you posted; they indicate that the FABRIC orchestration system cannot perform actions on the VMs, and the issue with the SSH keys may be the main cause. We need to learn from you what might have changed on the VMs.
I’m posting the reservation info here. Both VM reservations are valid until 12/25.
{ "sliver_id": "be7426fd-7ebe-4e6b-bc65-56e30d7e8e50", "slice_id": "53cfa2bd-5110-420d-8bdb-053c1af45801", "type": "VM", "notices": "Reservation be7426fd-7ebe-4e6b-bc65-56e30d7e8e50 (Slice NS2(53cfa2bd-5110-420d-8bdb-053c1af45801) Graph Id:2aced1b1-3843-4809-b9a2-9ed9e2ff0317 Owner:daniloassis@utfpr.edu.br) is in state (Active,None_) ", "start": "2025-08-02 12:44:10 +0000", "end": "2025-12-25 18:11:58 +0000", "requested_end": "2025-12-25 18:11:58 +0000", "units": 1, "state": 4, "pending_state": 11, "sliver": { "Name": "NS2", "Type": "VM", "Capacities": "{\"core\": 16, \"disk\": 1000, \"ram\": 16}", "CapacityHints": "{\"instance_type\": \"fabric.c16.m16.d1000\"}", "CapacityAllocations": "{\"core\": 16, \"disk\": 1000, \"ram\": 16}", "LabelAllocations": "{\"instance\": \"instance-000040ec\", \"instance_parent\": \"star-w1.fabric-testbed.net\"}", "ReservationInfo": "{\"reservation_id\": \"be7426fd-7ebe-4e6b-bc65-56e30d7e8e50\", \"reservation_state\": \"Active\"}", "NodeMap": "[\"e2b1e451-45b4-4691-9527-aae18cad3b19\", \"BBQSH63\"]", "StitchNode": "false", "ImageRef": "default_ubuntu_22,qcow2", "MgmtIp": "2001:400:a100:3030:f816:3eff:fed9:14ee", "Site": "STAR" } } { "sliver_id": "527af509-fb62-41ae-a0c0-191e3c7f6525", "slice_id": "059939f1-19be-49ee-ab5b-bf4504639c13", "type": "VM", "notices": "Reservation 527af509-fb62-41ae-a0c0-191e3c7f6525 (Slice NS4(059939f1-19be-49ee-ab5b-bf4504639c13) Graph Id:14a3d7f7-e797-4e75-bafb-00282ea63896 Owner:daniloassis@utfpr.edu.br) is in state (Active,None_) ", "start": "2025-08-02 12:44:10 +0000", "end": "2025-12-25 18:12:40 +0000", "requested_end": "2025-12-25 18:12:40 +0000", "units": 1, "state": 4, "pending_state": 11, "sliver": { "Name": "NS4", "Type": "VM", "Capacities": "{\"core\": 16, \"disk\": 1000, \"ram\": 16}", "CapacityHints": "{\"instance_type\": \"fabric.c16.m16.d1000\"}", "CapacityAllocations": "{\"core\": 16, \"disk\": 1000, \"ram\": 16}", "LabelAllocations": "{\"instance\": \"instance-00000cf6\", \"instance_parent\": \"toky-w2.fabric-testbed.net\"}", "ReservationInfo": "{\"reservation_id\": \"527af509-fb62-41ae-a0c0-191e3c7f6525\", \"reservation_state\": \"Active\"}", "NodeMap": "[\"e2b1e451-45b4-4691-9527-aae18cad3b19\", \"FW696S3\"]", "StitchNode": "false", "ImageRef": "default_ubuntu_22,qcow2", "MgmtIp": "133.69.160.21", "Site": "TOKY" } }December 5, 2025 at 5:14 pm in reply to: Maintenance on STAR, WASH, DALL, SALT, LOSA, KANS – on Dec 5 at 9AM EST #9252Maintenance is completed.
Maintenance completed. FABRIC-AMST is back online. All current VMs have been started.
Network outage is resolved. BRIST node is available for new slices.
FABRIC-BRIST node is back online.
August 18, 2025 at 3:06 pm in reply to: NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. #8829
Ajay,
The sliver you reported the problem on was at FABRIC-FIU and has expired. I can see that you have recent slivers on the FIU node. Can you please let us know whether you still encounter the problem?