Increase space on jupyterhub

Viewing 6 posts - 1 through 6 (of 6 total)
  • #7418
    Prateek Jain
    Participant

      I have run experiments on my slice, and their output is too large: I am unable to download the output files from my VM to JupyterHub for analysis. I have already deleted whatever I can from my JupyterHub storage.

      Is there any way to increase the storage capacity for files on JupyterHub?

      #7419
      Komal Thareja
      Participant

        Hi Prateek,

        The /home/fabric/work directory in the JupyterHub environment serves as persistent storage for code, notebooks, scripts, and other materials related to configuring and running experiments, including the addition of extra Python modules. However, it is not designed to handle large datasets.
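        If you are unsure what is filling the persistent area, a quick check from a JupyterHub terminal can help. This is only a sketch; the fallback to the current directory is there so the commands also run outside JupyterHub:

        ```shell
        # Sketch: report usage of the persistent work area and its largest entries.
        # /home/fabric/work is the JupyterHub work directory described above.
        WORKDIR="${WORKDIR:-/home/fabric/work}"
        [ -d "$WORKDIR" ] || WORKDIR="$PWD"   # fall back when run outside JupyterHub
        du -sh "$WORKDIR"                     # total usage
        du -sk "$WORKDIR"/* 2>/dev/null | sort -nr | head -n 10   # largest entries first
        ```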

        **Transfer via the Bastion Host**
        All virtual machines (VMs) are connected to a management network that you use for SSH access. You may have noticed that accessing your VMs requires jumping through a bastion host for enhanced security.
        The simplest method for transferring data to and from a VM is by using scp (or a similar tool). However, you must still route through the bastion host. To do this, you will need to configure your external machine (such as your laptop) to use the bastion host for standard SSH connections.
        You can find SSH configuration details here: [Logging into FABRIC VMs](https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/).
        The key takeaway from that document is to create (or modify) your ~/.ssh/config file to include the bastion host configuration. A basic setup might look like this:


        ```
        #### Bastion
        UserKnownHostsFile /dev/null
        StrictHostKeyChecking no
        ServerAliveInterval 120

        Host bastion-?.fabric-testbed.net
            User <your bastion username>
            ForwardAgent yes
            Hostname %h
            IdentityFile /path/to/bastion/private/key
            IdentitiesOnly yes

        Host bastion-*.fabric-testbed.net
            User <your bastion username>
            IdentityFile /path/to/bastion/private/key
        ```

        After configuring, you should be able to SSH with the following command:


        ```
        ssh -i /path/to/vm/private/key [username]@[VM_IP]
        ```

        For SCP, you can use the command:


        ```
        scp -i /path/to/vm/private/key [username]@[VM_IP]:file.txt .
        ```

        Keep in mind that if the VM address is IPv6, the colons can confuse scp's host:path parsing, so the address must be wrapped in square brackets. Depending on your shell, the brackets themselves may also need escaping (this is necessary on macOS's default shell but not in Bash; the escaped form works in both). Here’s an example:


        ```
        scp -i /path/to/vm/private/key ubuntu@[2001:400:a100:3030:f816:3eff:fe93:a1a0]:file.txt .
        ```

        Thanks,
        Komal

        #7420
        Komal Thareja
        Participant

          Sharing the SSH and SCP commands again; the ones above were missing the jump through the bastion host:

          ```
          ssh -i /path/to/vm/private/key -J bastion.fabric-testbed.net [username]@[VM_IP]

          scp -i /path/to/vm/private/key -o "ProxyJump bastion.fabric-testbed.net" [username]@[VM_IP]:file.txt .

          scp -i /path/to/vm/private/key -o "ProxyJump bastion.fabric-testbed.net" ubuntu@[2001:400:a100:3030:f816:3eff:fe93:a1a0]:file.txt .
          ```
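          When the results are large, it can also help to bundle them into a single compressed archive on the VM before copying, so scp transfers one file instead of many. A sketch reusing the placeholder paths above (`results/` is a hypothetical output directory):

          ```
          ssh -i /path/to/vm/private/key -J bastion.fabric-testbed.net [username]@[VM_IP] "tar czf results.tar.gz results/"

          scp -i /path/to/vm/private/key -o "ProxyJump bastion.fabric-testbed.net" [username]@[VM_IP]:results.tar.gz .
          ```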

          #7422
          Prateek Jain
          Participant

            Thanks for the prompt reply!
            I can definitely use this setup and these commands to download my experiment results to my local machine, but what I was trying to achieve was: run the experiments on FABRIC, download the results from all my nodes to my JupyterHub, and then analyze the data in a Jupyter notebook. That way the results can be plotted as soon as the experiments finish.
            This works fine for short experiments, but for long-duration experiments the data size grows and I can no longer download it to my JupyterHub for analysis. Is there any workaround other than downloading the results to my local machine and running the Jupyter notebook locally?

            #7423
            Komal Thareja
            Participant

              Hi Prateek,

              I recommend setting up your own FABRIC environment on your laptop or machine to run your experiments. This approach will allow you to capture more data and reduce your reliance on JupyterHub. Consider configuring a local Python environment for the FABRIC API as described here, and run the notebooks locally.
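              A minimal local setup might look like the sketch below (the package name `fabrictestbed-extensions` is from PyPI; see the install guide mentioned above for the authoritative steps, and copy your fabric_rc configuration and SSH keys over from JupyterHub):

              ```
              python3 -m venv fabric-env
              source fabric-env/bin/activate
              pip install --upgrade pip
              pip install fabrictestbed-extensions jupyterlab
              jupyter lab
              ```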

              Thanks,

              Komal

              #7424
              Prateek Jain
              Participant

                Thanks, I will try that.
