Forum Replies Created
-
AuthorPosts
-
December 12, 2025 at 10:27 pm in reply to: Cannot SSH into NS1 and NS5 nodes, need to preserve data (PhD simulations) #9267
Cannot SSH into NS1 and NS5 nodes, need to preserve data (PhD simulations)
I found that the authorized_keys file on both NS1 and NS5 was empty, which is why SSH, whether through the admin key or the Control Framework, was failing and resulting in the POA/addKey failure. It seems this may have happened unintentionally as part of the experiment.
Please be careful not to remove or overwrite the authorized_keys file in the process.
Given this is a commonly occurring user error, maybe the OS images should include a separate account for Control Framework / POA access?
I have important simulation data stored on these nodes, and I cannot lose this data.
While I hope you can get the data back, you should set up automated backups for important data. FABRIC and CloudLab machines should be considered ephemeral and are not suitable for important storage.
I learned the importance of full backups during my PhD simulations: while I downloaded both the program code and the outcome files, I neglected to save the parameters used to launch the program. After a disk failure, I had to spend multiple weeks reconstructing the input parameters and command lines.
September 5, 2025 at 6:13 pm in reply to: BSD images cause error- channel 0: open failed: connect failed: No route to host #8897
I haven’t used BSD in a decade, but I recall some BSD systems do not have IPv6 enabled by default or their SSH server isn’t listening on IPv6.
You can confirm this hypothesis by trying to create the nodes on a site that has IPv4 management addresses.
August 22, 2025 at 8:49 am in reply to: JupyterLab Save Error: No Space Left on Device in FABRIC Slice #8840
You can specify the disk size with the disk= parameter of slice.add_node().
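A minimal sketch of passing the disk size, assuming the FABlib calls FablibManager(), new_slice(), add_node() and submit() behave as in the linked documentation. The helper name add_node_with_disk, the slice name, the node name, the site and the 100 GB figure are made up for illustration, and the unit (GB) is my reading of the docs rather than something stated in this reply.

```python
def add_node_with_disk(slice_obj, name, site, disk_gb):
    """Add a node to a FABlib slice, requesting disk_gb of disk (hypothetical helper)."""
    if disk_gb <= 0:
        raise ValueError("disk_gb must be positive")
    # disk= is the add_node parameter mentioned in the reply.
    return slice_obj.add_node(name=name, site=site, disk=disk_gb)


def demo_with_fablib():
    """Not called here: needs FABlib installed and a configured FABRIC account."""
    from fabrictestbed_extensions.fablib.fablib import FablibManager

    fablib = FablibManager()
    new_slice = fablib.new_slice(name="big-disk-demo")  # illustrative slice name
    add_node_with_disk(new_slice, name="node1", site="STAR", disk_gb=100)
    new_slice.submit()
```

The helper takes the slice object as an argument so it can be tried without a FABRIC account; demo_with_fablib() shows where a real slice would come from.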
https://fabric-fablib.readthedocs.io/en/latest/slice.html#fabrictestbed_extensions.fablib.slice.Slice.add_node
As said in the quoted notebook:
In this example, the switch daemon automatically terminates after 5 minutes, which may cause the ping to stop working beyond this duration. This is expected behavior.
The timeout is passed as a parameter to execute_thread:
("sleep infinity", r"bf-sde>", 300)
This tuple sends the “sleep infinity” command to the switch and waits up to 300 seconds for the “bf-sde>” prompt. Since the prompt never appears, the timeout expires and the SSH connection is shut down.
In Unix, a disconnected SSH connection triggers SIGHUP, hence the process is killed.
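The SIGHUP behaviour can be reproduced locally with a small sketch. It does not involve SSH or the switch daemon: it starts a sleeping child process and sends SIGHUP by hand, which stands in for the hangup delivered when the SSH session goes away. The function name and the delays are illustrative, and it assumes a Unix-like system.

```python
import signal
import subprocess
import sys


def _default_sighup():
    # Make sure the child uses the default SIGHUP action (terminate),
    # even if this script itself was started with SIGHUP ignored (e.g. nohup).
    signal.signal(signal.SIGHUP, signal.SIG_DFL)


def run_and_hang_up(delay_seconds=0.5):
    """Start a long-sleeping child, send it SIGHUP, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        preexec_fn=_default_sighup,
    )
    try:
        # Give the child a moment to start; it should still be running afterwards.
        child.wait(timeout=delay_seconds)
    except subprocess.TimeoutExpired:
        pass
    child.send_signal(signal.SIGHUP)
    try:
        return child.wait(timeout=10)
    finally:
        if child.poll() is None:
            child.kill()  # clean up if the child somehow survived


status = run_and_hang_up()
# On POSIX, a negative return code means "killed by that signal number".
print("killed by SIGHUP:", status == -signal.SIGHUP)
```

A process that should outlive the connection would need to ignore SIGHUP or be detached from the session, for example with nohup or inside tmux.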
April 30, 2025 at 3:33 pm in reply to: How to automate a script that creates a slice, deploys 2 VMs on different sites #8454
Steps 1-5 are easy to do.
I’d like this script to run automatically once or twice per week
This is the difficult part.
I believe it’s impossible within JupyterHub, because the container shuts down after an hour.
It should be possible to install FABlib on your own server and invoke a script that does steps 1-5 via crontab.
This requires you to obtain a long-lived authentication token, and then it’s set-and-forget until the token expires.
I’m trying out the new docker_ubuntu_24 OS image and the updated docker_ubuntu_22 OS image, and noticed three issues:
- docker build is using the legacy Docker builder, which has been deprecated. Package docker-buildx-plugin should be included in the image.
- The docker compose command is missing. Compose is commonly used in Docker based applications, including some fablib examples (they are currently using docker_rocky_8 or manually installing Compose). Package docker-compose-plugin should be included in the image.
- For the docker_ubuntu_22 image, the ubuntu user is not added to the docker group, so the Docker socket is inaccessible without sudo. The user should be added to the group, and then fablib examples that contain sudo docker should be revised.
it requires my UEs to send certain files to an apache server running on some other node
Host the Apache server within FABRIC, as part of your slice.
Or, make reverse port forwarding from your laptop:
- In the UPF node, edit /etc/ssh/sshd_config to have GatewayPorts yes.
- From the laptop, SSH into the UPF node with -R 8000:apache.example.net:8000 flag.
- The UEs can then access http://10.30.6.48:8000 (use the internal IP address of enp3s0 interface) to reach the Apache server.
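The laptop-side command from the steps above can be assembled as follows. This is only a sketch: the helper name, the login "ubuntu" and the host "upf.example.net" are placeholders, and the -N flag (no remote command, just forwarding) is an optional addition of mine. On FABRIC you would also need whatever bastion or ssh_config options you normally use to reach the node.

```python
import shlex


def reverse_forward_command(gateway, remote_port, target_host, target_port, user=None):
    """Build an ssh argv that makes gateway:remote_port reach target_host:target_port.

    The connection to target_host is made from the machine running ssh (the laptop).
    """
    spec = f"{remote_port}:{target_host}:{target_port}"
    destination = f"{user}@{gateway}" if user else gateway
    # -R requests remote (reverse) forwarding; with GatewayPorts yes on the
    # server, the forwarded port is reachable from other hosts such as the UEs.
    return ["ssh", "-N", "-R", spec, destination]


cmd = reverse_forward_command(
    gateway="upf.example.net",  # placeholder for the UPF node
    remote_port=8000,
    target_host="apache.example.net",
    target_port=8000,
    user="ubuntu",  # placeholder login name
)
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be passed to subprocess.run(cmd) to hold the tunnel open from a script.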
I’ve tested 5G software, both on and off FABRIC. I don’t see the necessity to have Internet access for the 5G network. Typically, I run traffic generators (iperf3 etc) between UEs and UPFs, to measure the performance of 5G network.
- Add a FABNetv4Ext interface and a FABNetv6Ext interface to the node that runs UPF.
- Assign RFC1918 and ULA addresses to UEs.
- Set up NAT on the UPF machine, to reach the Internet via the FABNetv4Ext and FABNetv6Ext interfaces.
IPv6 NAT sucks, but FABNetv6Ext doesn’t offer routed subnets, so it’s either NAT or NDP proxy.
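A rough sketch of the NAT step, assuming plain iptables/ip6tables masquerading on a Linux UPF host. The UE prefixes (10.45.0.0/16, fd00:45::/64) and interface names (enp7s0, enp8s0) are made up; the function only checks the prefixes and prints the commands, it does not run them.

```python
import ipaddress

# Private ranges mentioned in the steps: RFC1918 for IPv4, ULA (fc00::/7) for IPv6.
RFC1918 = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
ULA = ipaddress.ip_network("fc00::/7")


def nat_commands(ue_v4, ue_v6, ext_v4_if, ext_v6_if):
    """Return shell commands that masquerade UE traffic out of the external interfaces."""
    v4 = ipaddress.ip_network(ue_v4)
    v6 = ipaddress.ip_network(ue_v6)
    if not any(v4.subnet_of(block) for block in RFC1918):
        raise ValueError(f"{v4} is not an RFC1918 prefix")
    if not v6.subnet_of(ULA):
        raise ValueError(f"{v6} is not a ULA prefix")
    return [
        # Let the UPF host forward packets between interfaces.
        "sysctl -w net.ipv4.ip_forward=1",
        "sysctl -w net.ipv6.conf.all.forwarding=1",
        # Rewrite the UE source addresses to the external interface's address.
        f"iptables -t nat -A POSTROUTING -s {v4} -o {ext_v4_if} -j MASQUERADE",
        f"ip6tables -t nat -A POSTROUTING -s {v6} -o {ext_v6_if} -j MASQUERADE",
    ]


cmds = nat_commands("10.45.0.0/16", "fd00:45::/64", "enp7s0", "enp8s0")
for line in cmds:
    print(line)
```

The printed commands would be run as root on the UPF machine after substituting the real UE prefixes and the actual FABNetv4Ext/FABNetv6Ext interface names.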
This reply was modified 1 year, 2 months ago by yoursunny.
Powder Testbed has USRP devices.
You can make a facility port on FABRIC and communicate with Powder nodes via Ethernet.
Where’s the proxy?
A switchport mirror is a network instrumentation technique. It doesn’t involve any proxy software.
One possible idea:
- Create a FABNetv4Ext or FABNetv6Ext network.
- Insert IP routes so that traffic to the Internet goes through the FABNetv4Ext/FABNetv6Ext interface, instead of the management interface.
- Set up switchport mirroring.
- Capture the traffic on the mirror port.
I wonder how you determined “interfaces for L2 connections on some of my nodes gets deleted”?
On the IPv4 sites, number of VMs that can be provisioned is limited with the available IPv4 addresses in the subnet.
Can “management IP address” show up as a resource on Fabric Portal – Resources page?
This would allow the experimenter to avoid this limitation.
Oversubscription support – EDC and EDUKY sites have been enabled to support CPU over subscription.
I remember the CPU core capacity of STAR site was 384.
It’s now 768.
Did this site receive new hardware or is it oversubscription?