yoursunny

Forum Replies Created

Viewing 15 posts - 1 through 15 (of 61 total)
  • yoursunny
    Participant

      Cannot SSH into NS1 and NS5 nodes, need to preserve data (PhD simulations)

      I found that the authorized_keys file on both NS1 and NS5 was empty, which is why SSH—whether through the admin key or the Control Framework—was failing resulting in POA/addKey failure. It seems this may have happened unintentionally as part of the experiment.

      Please be careful not to remove or overwrite the authorized_keys file in the process.

      Given this is a commonly occurring user error, maybe the OS images should include a separate account for Control Framework / POA access?

       

      I have important simulation data stored on these nodes, and I cannot lose this data.

      While I hope you can get the data back, you should set up automated backups for important data. FABRIC and CloudLab machines should be considered ephemeral and are not suitable for storing important data.

      I learned the importance of full backups during my PhD simulations: while I downloaded both the program code and the outcome files, I neglected to save the parameters used to launch the program. After a disk failure, I had to spend multiple weeks reconstructing the input parameters and command lines.
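
One way to avoid losing launch parameters is to write them next to the outcome files at the start of every run, so any backup of the results also carries the command line. This is a minimal sketch, not a complete backup solution; `write_run_manifest`, the `run_manifest.json` file name, and the parameter names are hypothetical.

```python
import json
import shlex
import time
from pathlib import Path


def write_run_manifest(out_dir, argv, params):
    """Save the command line and parameters next to the outcome files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = {
        # shlex.join quotes arguments so the line can be pasted back into a shell
        "command": shlex.join(argv),
        "params": params,
        "started_at": time.strftime("%Y-%m-%dT%H:%M:%S%z"),
    }
    path = out / "run_manifest.json"
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return path
```

Calling `write_run_manifest("results/run-001", sys.argv, {"seed": 42})` at the top of a simulation script would leave a manifest in the same directory that gets copied off the node.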

      yoursunny
      Participant

        I haven’t used BSD in a decade, but I recall some BSD systems do not have IPv6 enabled by default or their SSH server isn’t listening on IPv6.

        You can confirm this hypothesis by trying to create the nodes on a site that has IPv4 management addresses.

          in reply to: Tofino bf_switchd process gets killed. #8461
          yoursunny
          Participant

            As said in the quoted notebook:

            In this example, the switch daemon automatically terminates after 5 minutes, which may cause the ping to stop working beyond this duration. This is expected behavior.

            The timeout is passed as a parameter to execute_thread:

            
            
            ("sleep infinity", r"bf-sde>", 300)

            This tuple sends the “sleep infinity” command to the switch and waits 300 seconds for the “bf-sde>” prompt. Since the prompt never appears, the timeout expires and the SSH connection is shut down.

            In Unix, when an SSH connection is closed, the processes attached to its terminal receive SIGHUP, hence the process is killed.
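
The SIGHUP behavior can be illustrated outside the notebook. This is a minimal sketch, assuming a Linux node where a long-running program is started from Python inside an SSH session: `start_new_session=True` makes the child call `setsid()`, so it has no controlling terminal and is not hung up when the SSH session ends. `start_detached` and the log path are hypothetical names, and the sketch does not change how the notebook's execute_thread call behaves.

```python
import subprocess


def start_detached(argv, log_path):
    """Start argv in its own session, detached from the SSH terminal."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,   # no terminal input
            stdout=log,                 # keep output in a file instead of the tty
            stderr=subprocess.STDOUT,
            start_new_session=True,     # child calls setsid() before exec
        )
    finally:
        # The child has inherited the descriptor; the parent copy can be closed.
        log.close()
```

The shell tools `nohup` and `setsid` serve a similar purpose when a command is typed directly into the session.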

            yoursunny
            Participant

              Steps 1-5 are easy to do.

              I’d like this script to run automatically once or twice per week

              This is the difficult part.
              I believe it’s impossible within JupyterHub, because the container shuts down after an hour.

              It should be possible to install FABlib on your own server and invoke a script that does steps 1-5 via crontab.
              This requires you to obtain a long-lived authentication token; after that, it’s set-and-forget until the token expires.

              in reply to: FABRIC is back online with exciting new features! #8041
              yoursunny
              Participant

                I’m trying out the new docker_ubuntu_24 OS image and the updated docker_ubuntu_22 OS image, and noticed three issues:

                • docker build uses the legacy Docker builder, which has been deprecated. The docker-buildx-plugin package should be included in the image.
                • The docker compose command is missing. Compose is commonly used in Docker-based applications, including some fablib examples (they currently use docker_rocky_8 or install Compose manually). The docker-compose-plugin package should be included in the image.
                • In the docker_ubuntu_22 image, the ubuntu user is not in the docker group, so the Docker socket is inaccessible without sudo. The user should be added to the group, and the fablib examples that contain sudo docker should then be revised.
                in reply to: Reaching the Internet from a FabNet node #7605
                yoursunny
                Participant

                  it requires my UEs to send certain files to an apache server running on some other node

                  Host the Apache server within FABRIC, as part of your slice.

                  Or, set up reverse port forwarding from your laptop:

                  1. In the UPF node, edit /etc/ssh/sshd_config to have GatewayPorts yes.
                  2. From the laptop, SSH into the UPF node with the -R 8000:apache.example.net:8000 flag.
                  3. The UEs can then access http://10.30.6.48:8000 (use the internal IP address of the enp3s0 interface) to reach the Apache server.
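
The laptop side of step 2 can be scripted. This is a minimal sketch that only builds the ssh command line; `reverse_forward_argv` and the `ubuntu@upf-node` login are hypothetical, and any bastion or jump-host options your SSH configuration needs are left out.

```python
import subprocess


def reverse_forward_argv(login, bind_port, target_host, target_port):
    """Build an ssh command that forwards bind_port on the remote node
    to target_host:target_port as seen from the local machine."""
    spec = f"{bind_port}:{target_host}:{target_port}"
    # -N: do not run a remote command, only keep the forwarding open
    return ["ssh", "-N", "-R", spec, login]


if __name__ == "__main__":
    argv = reverse_forward_argv("ubuntu@upf-node", 8000, "apache.example.net", 8000)
    print(" ".join(argv))
    # To actually open the tunnel, uncomment the next line:
    # subprocess.run(argv, check=True)
```

With GatewayPorts yes in effect on the UPF node, the forwarded port listens on all of the node's addresses instead of only the loopback address, which is what lets the UEs reach it.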

                   

                  in reply to: Reaching the Internet from a FabNet node #7603
                  yoursunny
                  Participant

                    I’ve tested 5G software, both on and off FABRIC. I don’t see the necessity of Internet access for the 5G network. Typically, I run traffic generators (iperf3, etc.) between UEs and UPFs to measure the performance of the 5G network.

                    in reply to: Reaching the Internet from a FabNet node #7600
                    yoursunny
                    Participant
                      1. Add a Fabnetv4Ext interface and a Fabnetv6Ext interface to the node that runs UPF.
                      2. Assign RFC1918 and ULA addresses to UEs.
                      3. Set up NAT on the UPF machine, to reach the Internet via the Fabnetv4Ext and Fabnetv6Ext interfaces.

                      IPv6 NAT sucks, but Fabnetv6Ext doesn’t offer routed subnets, so it’s either NAT or NDP proxy.

                      • This reply was modified 1 year, 2 months ago by yoursunny.
                      in reply to: Integration of USRPs with FABRIC #7595
                      yoursunny
                      Participant

                        Powder Testbed has USRP devices.

                        You can make a facility port on FABRIC and communicate with Powder nodes via Ethernet.

                        in reply to: Traffic traces from middle of the network #7453
                        yoursunny
                        Participant

                          Where’s the proxy?

                          A switchport mirror is a network instrumentation technique. It doesn’t involve any proxy software.

                          in reply to: Traffic traces from middle of the network #7450
                          yoursunny
                          Participant

                            One possible idea:

                            1. Create a FABNetv4Ext or FABNetv6Ext network.
                            2. Insert IP routes so that the traffic to the Internet goes through the FABNetv4Ext/FABNetv6Ext interface, instead of the management interface.
                            3. Set up switchport mirroring.
                            4. Capture the traffic on the mirror port.
                            in reply to: Network interfaces deleted automatically from nodes. #7341
                            yoursunny
                            Participant

                              I wonder how you determined “interfaces for L2 connections on some of my nodes gets deleted”?

                              in reply to: failed lease update – all units failed in priming #6856
                              yoursunny
                              Participant

                                On the IPv4 sites, number of VMs that can be provisioned is limited with the available IPv4 addresses in the subnet.

                                Can “management IP address” show up as a resource on the FABRIC Portal Resources page?
                                This would allow the experimenter to avoid this limitation.

                                in reply to: Fabric Testbed is open and ready for use! #6270
                                yoursunny
                                Participant

                                  Oversubscription support – EDC and EDUKY sites have been enabled to support CPU over subscription.

                                  I remember the CPU core capacity of STAR site was 384.
                                  It’s now 768.
                                  Did this site receive new hardware or is it oversubscription?
