Increase space on jupyterhub

Viewing 6 posts - 1 through 6 (of 6 total)
  • #7418
    Prateek Jain
    Participant

      I have run experiments on my slice, and their output is too large: I am unable to download the output files from my VM to JupyterHub for analysis. I have already deleted whatever I can from my JupyterHub storage.

      Is there any way to increase the storage capacity for files on JupyterHub?

      #7419
      Komal Thareja
      Participant

        Hi Prateek,

        The /home/fabric/work directory in the JupyterHub environment serves as persistent storage for code, notebooks, scripts, and other materials related to configuring and running experiments, including the addition of extra Python modules. However, it is not designed to handle large datasets.
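        If you are unsure what is filling the persistent area, a quick check from a JupyterHub terminal can help. This is only a sketch; the fallback to the current directory is there so the commands also run outside JupyterHub:

        ```shell
        # Sketch: report usage of the persistent work area and its largest entries.
        # /home/fabric/work is the JupyterHub work directory described above.
        WORKDIR="${WORKDIR:-/home/fabric/work}"
        [ -d "$WORKDIR" ] || WORKDIR="$PWD"   # fall back when run outside JupyterHub
        du -sh "$WORKDIR"                     # total usage
        du -sk "$WORKDIR"/* 2>/dev/null | sort -nr | head -n 10   # largest entries first
        ```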

        **Transfer via the Bastion Host**
        All virtual machines (VMs) are connected to a management network that you use for SSH access. You may have noticed that accessing your VMs requires jumping through a bastion host for enhanced security.
        The simplest method for transferring data to and from a VM is by using scp (or a similar tool). However, you must still route through the bastion host. To do this, you will need to configure your external machine (such as your laptop) to use the bastion host for standard SSH connections.
        You can find SSH configuration details here: [Logging into FABRIC VMs](https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/).
        The key takeaway from that document is to create (or modify) your ~/.ssh/config file to include the bastion host configuration. A basic setup might look like this:


        ```
        #### Bastion
        UserKnownHostsFile /dev/null
        StrictHostKeyChecking no
        ServerAliveInterval 120

        Host bastion-?.fabric-testbed.net
            User <your bastion username>
            ForwardAgent yes
            Hostname %h
            IdentityFile /path/to/bastion/private/key
            IdentitiesOnly yes

        Host bastion-*.fabric-testbed.net
            User <your bastion username>
            IdentityFile /path/to/bastion/private/key
        ```

        After configuring, you should be able to SSH with the following command:


        ```
        ssh -i /path/to/vm/private/key [username]@[VM_IP]
        ```

        For SCP, you can use the command:


        ```
        scp -i /path/to/vm/private/key [username]@[VM_IP]:file.txt .
        ```

        Keep in mind that if the VM address is IPv6, the colons can confuse scp's host:path parsing, so the address must be wrapped in square brackets. Depending on your shell, the brackets themselves may also need escaping (this is necessary on macOS's default shell but not in Bash; the escaped form works in both). Here’s an example:


        ```
        scp -i /path/to/vm/private/key ubuntu@[2001:400:a100:3030:f816:3eff:fe93:a1a0]:file.txt .
        ```

        Thanks,
        Komal

        #7420
        Komal Thareja
        Participant

          Sharing the SSH and SCP commands again; the ones above were missing the jump through the bastion host:

          ```
          ssh -i /path/to/vm/private/key -J bastion.fabric-testbed.net [username]@[VM_IP]

          scp -i /path/to/vm/private/key -o "ProxyJump bastion.fabric-testbed.net" [username]@[VM_IP]:file.txt .

          scp -i /path/to/vm/private/key -o "ProxyJump bastion.fabric-testbed.net" ubuntu@[2001:400:a100:3030:f816:3eff:fe93:a1a0]:file.txt .
          ```
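          When the results are large, it can also help to bundle them into a single compressed archive on the VM before copying, so scp transfers one file instead of many. A sketch reusing the placeholder paths above (`results/` is a hypothetical output directory):

          ```
          ssh -i /path/to/vm/private/key -J bastion.fabric-testbed.net [username]@[VM_IP] "tar czf results.tar.gz results/"

          scp -i /path/to/vm/private/key -o "ProxyJump bastion.fabric-testbed.net" [username]@[VM_IP]:results.tar.gz .
          ```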

          #7422
          Prateek Jain
          Participant

            Thanks for the prompt reply!
            I can definitely use this setup and these commands to download my experiment results to my local machine, but what I was trying to achieve was: run the experiments on FABRIC, download the results from all my nodes to my JupyterHub, and then analyze the data in a Jupyter notebook. That way the results can be plotted as soon as the experiments finish.
            This works fine for short experiments, but for long-duration experiments the data size grows and I can no longer download it to my JupyterHub for analysis. Is there any workaround other than downloading the results to my local machine and running the Jupyter notebook locally?

            #7423
            Komal Thareja
            Participant

              Hi Prateek,

              I recommend setting up your own FABRIC environment on your laptop or machine to run your experiments. This approach will allow you to capture more data and reduce your reliance on JupyterHub. Consider configuring a local Python environment for the FABRIC API as described here, and run the notebooks locally.
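              A minimal local setup might look like the sketch below (the package name `fabrictestbed-extensions` is from PyPI; see the install guide mentioned above for the authoritative steps, and copy your fabric_rc configuration and SSH keys over from JupyterHub):

              ```
              python3 -m venv fabric-env
              source fabric-env/bin/activate
              pip install --upgrade pip
              pip install fabrictestbed-extensions jupyterlab
              jupyter lab
              ```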

              Thanks,

              Komal

              #7424
              Prateek Jain
              Participant

                Thanks, I will try that.
