Forum Replies Created
-
AuthorPosts
-
September 27, 2024 at 11:16 am in reply to: getting ChannelException: ChannelException(2, ‘Connect failed’) Error #7570
Hi Tejas,
It looks like the SSH connection to the bastion host is failing. Could you please re-run this notebook:
jupyter-examples-rel1.7.*/configure_and_validate.ipynb
Please retry your notebook after that. Please let us know if the issue persists.
Thanks,
Komal
Hi Sepideh,
Disk usage of your container is at 100%. You seem to have output_file, which is taking up the majority of the space.
The /home/fabric/work directory (1GB) in the JupyterHub environment serves as persistent storage for code, notebooks, scripts, and other materials related to configuring and running experiments, including the addition of extra Python modules. However, it is not designed to handle large datasets or output files.
Please consider removing unneeded files or moving output_file elsewhere to avoid this error.
Additionally, if you need more disk space, I recommend setting up your own FABRIC environment on your laptop or machine to run your experiments. This approach will allow you to capture more data and reduce reliance on JupyterHub. Consider configuring a local Python environment for the FABRIC API as described here, and run the notebooks locally.
fabric@spring:work-100%$ du -sh *
20K 5_clients_1_server.ipynb
60K fabric_config
228K hipft.ipynb
95M jupyter-examples-rel1.5.5
96M jupyter-examples-rel1.6.1
28K lost+found
686M output_file
82M rel1.7.0.tar.gz4q_e8dq5.tmp
0 rel1.7.0.tar.gzceav1uzw.tmp
0 rel1.7.0.tar.gzds9a6279.tmp
0 rel1.7.0.tar.gzgmqadnvv.tmp
0 rel1.7.0.tar.gziuc6xzxa.tmp
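If it is easier to check this from inside a notebook, a minimal Python sketch that produces a listing similar to the du output above might look like the following. The helper names (entry_size, largest_entries) are illustrative only and are not part of any FABRIC library; the sketch sums apparent file sizes, so the numbers can differ slightly from what du reports.

```python
import os
from pathlib import Path


def entry_size(path: Path) -> int:
    """Return the size in bytes of a file, or of all files under a directory."""
    if path.is_symlink() or path.is_file():
        return path.lstat().st_size
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += (Path(root) / name).lstat().st_size
            except OSError:
                # Skip files that disappear or cannot be read during the scan.
                pass
    return total


def largest_entries(directory: str, top: int = 10) -> list[tuple[str, int]]:
    """List the top-level entries of `directory`, largest first."""
    sizes = [(p.name, entry_size(p)) for p in Path(directory).iterdir()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]


# /home/fabric/work is the persistent directory mentioned above.
work_dir = Path("/home/fabric/work")
if work_dir.is_dir():
    for name, size in largest_entries(str(work_dir)):
        print(f"{size / 1024**2:8.1f} MiB  {name}")
```

Leftover .tmp files such as the ones in the listing above will show up here as well, which makes them easy to spot.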
Thanks,
Komal
Hi Ilya,
Yes, it’s possible to pin vCPUs to physical cores. The following APIs on the node class may be of interest:
– node.get_cpu_info() provides information about the VM’s CPU in relation to the host.
– You can pin specific vCPUs to physical cores using node.poa(operation="cpupin", vcpu_cpu_map=vcpu_cpu_map). In this case, vcpu_cpu_map is a dictionary mapping each vCPU to the desired physical core.
For more details, please refer to the documentation here. Let us know if you have any questions or encounter any issues!
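As a rough illustration of the pinning step, the sketch below builds and sanity-checks a vCPU-to-core mapping before it is handed to node.poa. The helper build_vcpu_cpu_map is hypothetical, the core numbers are made up, and the plain-dict shape is an assumption based only on the description above; the exact structure that poa expects should be checked against the fablib documentation.

```python
def build_vcpu_cpu_map(vcpus: int, physical_cores: list[int]) -> dict[int, int]:
    """Map vCPU i to physical_cores[i], rejecting short or duplicate inputs.

    Hypothetical helper: the dict shape is an assumption, not the confirmed
    input format of node.poa.
    """
    if vcpus > len(physical_cores):
        raise ValueError("need at least one physical core per vCPU")
    chosen = physical_cores[:vcpus]
    if len(set(chosen)) != len(chosen):
        raise ValueError("each vCPU should be pinned to a distinct core")
    return {vcpu: core for vcpu, core in enumerate(chosen)}


# Example core numbers only; real values would come from the host's CPU layout.
vcpu_cpu_map = build_vcpu_cpu_map(4, [32, 33, 34, 35])
print(vcpu_cpu_map)

# With a fablib node in hand, the call quoted above would then be:
# node.poa(operation="cpupin", vcpu_cpu_map=vcpu_cpu_map)
```

Validating the mapping first keeps an obviously wrong request (two vCPUs on one core, or too few cores) from ever reaching the testbed.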
Thanks,
Komal
Hi Khawar,
Ubuntu 18.04 LTS reached the end of its standard support on May 31, 2023, and is no longer available on FABRIC. Thank you for bringing the list of images to our attention. We will update it to reflect this change.
Thanks,
Komal
September 2, 2024 at 9:45 am in reply to: Subject: Issues with SSH Access to Fabric Nodes for Slice IDs: 34c41dad-c9f3-431 #7506
Hi Yuanjun,
I suspect the bastion keys have expired. These keys exist only on your JH container and are not pushed to your VMs. Sliver keys (i.e., VM keys) should not be affected. The SSH failure reported above indicates that login to the bastion server was denied, hence the suggestion.
The error observed in the multi-processing pool cleanup can be ignored. We will address that error but it should not impact regeneration of the keys. Could you please see if you are able to SSH to your VMs?
Bastion key expiry can also be verified from the portal via: Experiments -> Manage SSH Keys.
Thanks,
Komal
Hi,
Please refer to the site details to view the available GPU models. For instance, the STAR site offers RTX6000 and Tesla T4 GPUs. You can verify this information at [STAR site details](https://portal.fabric-testbed.net/sites/STAR).
Thanks,
Komal
September 2, 2024 at 8:11 am in reply to: Subject: Issues with SSH Access to Fabric Nodes for Slice IDs: 34c41dad-c9f3-431 #7502
Hi Yuanjun,
I suspect your bastion keys have expired. Could you please re-run this notebook to regenerate your bastion keys:
jupyter-examples-rel1.7.0/configure_and_validate.ipynb
Please let us know if you still run into errors.
Thanks,
Komal
Hi Tianrui,
Users do not have sudo access on the JupyterHub container. To install Python packages, please use the command:
pip install <package name> --user
Thanks,
Komal
August 29, 2024 at 12:56 pm in reply to: Unable to (consistently) reach FABNetv4Ext addresses from outside FABRIC #7483
Please consider checking the following examples on how to use these services:
Feel free to reach out in case you run into issues or have queries.
Thanks,
Komal
August 29, 2024 at 12:25 pm in reply to: Unable to (consistently) reach FABNetv4Ext addresses from outside FABRIC #7480
Hi Sourya,
You will need permission for FabNetv*Ext services to enable public connections to your VMs. This request can be made by your Project Lead. For more details, please check here!
Thanks,
Komal
Hi Prateek,
This looks like a bug: we have a race condition that is preventing updates to the Slice Graph Model. I will work on addressing this. For now, as a workaround, you can determine the IP addresses via slice-commander using show commands. Refer to https://learn.fabric-testbed.net/knowledge-base/using-slicecommander-with-fabric/ for slice-commander usage.
Alternatively, you can also get the sliver information via:
for s in slice.get_slivers():
    print(s._sliver)  # This is a JSON object that provides the needed information
Thanks,
Komal
Hi Prateek,
Could you please share your slice ID?
Thanks,
Komal
Hi Kriti,
This looks like a version mismatch for fabrictestbed-extensions. Are you running into this in any of the JH containers? If so, please ensure you have no entries in /home/fabric/work/fabric_config/requirements.txt and then restart your JH container via File -> Hub Start Control Panel -> Stop My Server -> Start My Server.
If you are running into this in your local environment, please consider updating to the latest fablib via
pip install fabrictestbed-extensions==1.7.3
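To confirm which version is actually installed before and after the update, a small check like the one below can help. It is only a sketch using Python's standard importlib.metadata; the helper name installed_version is illustrative and not part of fablib.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("fabrictestbed-extensions")
print(f"fabrictestbed-extensions: {found or 'not installed'}")
```

Running this in the same kernel as the failing notebook shows whether that kernel picked up the version you expect.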
Thanks,
Komal
Thank you, @yoursunny, for the excellent suggestion!
@Jestus,
In addition to @yoursunny’s point, the Project Lead for your project will need to request permissions to enable FabNetv*Ext services. You can find more details here.
Additionally, we recommend avoiding the use of the management interface for data transfer and instead utilizing the FABNetv4Ext/FABNetv6Ext services. Please feel free to reach out if you have any questions or comments.
Thanks,
Komal
Just closing the loop here as the issue was resolved over a short Zoom call.
The error message “The token is not yet valid (iat)” indicates that the token’s “issued at” (iat) timestamp is in the future relative to the current time. This can happen if there is a clock discrepancy between the system where the token was generated and the system validating the token.
In this case, the time on the server was behind the current time; syncing the clock with an NTP server resolved the error.
Thanks,
Komal