Forum Replies Created
-
AuthorPosts
-
@Robin, could you please try restarting your JH container via File -> Hub Control Panel -> Stop My container -> Start My container when you see this error?
Also, the path to replace the token is /home/fabric/.tokens.json. We will fix this in the documentation if it's incorrect. Could you also let us know which JH container you are using?
Thanks,
Komal
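Before replacing the token, it can help to confirm that the file at that path exists and parses. The following is only a minimal sketch: it assumes the token file is a JSON object (suggested by the .json extension, not confirmed in this thread), and `check_token_file` is a hypothetical helper name, not part of any FABRIC tool.

```python
import json
from pathlib import Path

# Path taken from the reply above.
TOKEN_PATH = Path("/home/fabric/.tokens.json")


def check_token_file(path=TOKEN_PATH):
    """Return (ok, message) describing whether the token file is usable JSON.

    Assumption: the file holds a JSON object. The field names inside it are
    not inspected, because they are not stated in this thread.
    """
    path = Path(path)
    if not path.is_file():
        return False, f"{path} does not exist"
    try:
        data = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        return False, f"{path} is not valid JSON: {exc}"
    if not isinstance(data, dict) or not data:
        return False, f"{path} parsed, but is empty or not a JSON object"
    return True, f"{path} is a JSON object with keys: {sorted(data)}"


if __name__ == "__main__":
    ok, message = check_token_file()
    print(message)
```

If the check fails, replacing the token at that path and restarting the container as described above would be the next thing to try.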
@Lyod – The Fabnetv4Ext notebook has a bug and configures the route incorrectly. We will fix the notebook; in the meantime, the route changes needed are below. Hope this helps!
In the Configure Node1 cell of the notebook, change the route as below so that it goes via the EXT gateway; ping should then work.
stdout, stderr = node1.execute(f'sudo ip route add 0.0.0.0/0 via {network1.get_gateway()}')
In the Configure Node2 cell of the notebook, change the route in the same way so that it goes via the EXT gateway; ping should then work.
stdout, stderr = node2.execute(f'sudo ip route add 0.0.0.0/0 via {network2.get_gateway()}')
Thanks,
Komal
- This reply was modified 10 months, 2 weeks ago by Komal Thareja.
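Since the same route command is applied on both nodes, one option is to build it in a small helper that also rejects a missing gateway before anything runs on the node. This is a sketch only; `default_route_command` is a hypothetical name, and the command string itself is the one from the reply above.

```python
import ipaddress


def default_route_command(gateway):
    """Build the route command from the reply above for a given gateway.

    IPv4Address raises ValueError for anything that is not an IPv4 address,
    so a gateway of None or an empty string fails here instead of producing
    a malformed command on the node.
    """
    gw = ipaddress.IPv4Address(str(gateway))
    return f"sudo ip route add 0.0.0.0/0 via {gw}"


# Usage inside the notebook cells (node1/network1 come from the notebook):
#   stdout, stderr = node1.execute(default_route_command(network1.get_gateway()))
#   stdout, stderr = node2.execute(default_route_command(network2.get_gateway()))
```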
Hi Shams,
Could you please remove the <> enclosing the project id in /home/fabric/work/fabric_config/fabric_rc and restart your JH container via File -> Hub Control Panel -> Stop Container followed by Start Container?
Please try your notebook again and let us know if you still observe this error.
Thanks,
Komal
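The bracket removal described above can also be sketched in code. This assumes that fabric_rc consists of shell-style `export NAME=value` lines; the variable name used in the example comment is illustrative, and `strip_angle_brackets` is a hypothetical helper, so check the result against your own file before overwriting it.

```python
import re

# Matches "NAME=<value>" (optionally prefixed by "export") and captures the
# assignment prefix and the value separately, without the angle brackets.
_BRACKETED = re.compile(r"^(\s*(?:export\s+)?\w+=)<([^<>]*)>\s*$")


def strip_angle_brackets(text):
    """Remove <> only where they enclose an entire assigned value."""
    fixed = []
    for line in text.splitlines():
        match = _BRACKETED.match(line)
        fixed.append(match.group(1) + match.group(2) if match else line)
    result = "\n".join(fixed)
    # Preserve a trailing newline if the input had one.
    return result + "\n" if text.endswith("\n") else result


# Example with an assumed variable name and a placeholder value:
#   strip_angle_brackets("export FABRIC_PROJECT_ID=<my-project-id>\n")
#   gives "export FABRIC_PROJECT_ID=my-project-id\n"
```

Lines that do not have a fully bracketed value are left untouched, which keeps the edit narrow.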
The STAR site has 6 worker nodes with 128 cores each, for a total of 768 cores. This is the same as in the previous release.
Oversubscription is not enabled on STAR.
January 8, 2024 at 3:09 pm in reply to: How to access the files from my older username on the same project? #6261
@Nagmat – Your backup should be available in your new JH container as fabric_bkp.tgz. Please start your container and let us know if you face any issues accessing the data.
Thanks,
Komal
January 8, 2024 at 2:56 pm in reply to: How to access the files from my older username on the same project? #6259
@Nagmat – Could you please stop your JH container? I took a backup of your old files and will copy it to your new container.
Minor correction to the version above: please update fablib using the command:
pip install fabrictestbed-extensions
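After running the command, you can confirm which version is actually installed in the container. The sketch below uses only the Python standard library; `installed_version` is a hypothetical helper name, and the distribution name is the one from the pip command above.

```python
from importlib import metadata


def installed_version(dist_name="fabrictestbed-extensions"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version())
```

A result of None would suggest that pip installed into a different environment than the one the notebook kernel uses.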
@Kriti – the hypervisor on wash-w3 was down this morning and has been recovered. Issues on WASH should clear now. I also verified that TACC is working. Please try your slices again and let us know if you still face errors.
@Nagmat – there was a leaked service due to a timeout from the TACC switch. I have cleaned up the leaked services, so your slice provisioning should work as well. Please let us know if you still face errors.
December 11, 2023 at 5:06 pm in reply to: Maintenance on Network AM – 12/11/2023 (3:30pm-4:30pm EST) #6182
Maintenance has been completed!
Hi Kriti,
There was an issue on new-y2, where your VMs were being provisioned, as it had some leaked VMs. We rebooted the worker node, so your slices should now work on NEWY. We will also check STAR and WASH.
Thanks,
Komal
Hello,
Could you please check if the file exists at the specified path using the command:
ls /home/fabric/work/re_vit/notebooks/animal-blur-canine-551628.jpg
Thanks,
Komal
November 12, 2023 at 3:33 pm in reply to: Maintenance on Network AM – 11/12/2023 (3:00pm-4:00pm EST) #6089
The maintenance is complete!
You are right Greg, this depends entirely on how much memory is available on the NUMA node of the host where your VM is launched at that time.
@yoursunny
What happens if there are multiple components that are on distinct NUMA sockets?
If you have multiple components, we try to pin the memory for the VM to both NUMA nodes.
Example: your VM has a ConnectX-5 and a GPU on different sockets. Invoking numa_tune would pin the memory to both sockets, provided that the combined available memory on both sockets >= requested VM RAM.
Is it possible to specify how much RAM to pin to each NUMA socket?
In the current version, this is not supported. We may also be limited here by the underlying OS API, but we will explore improving this.
If we pin a CPU core or certain amount of RAM onto a NUMA socket, does it prevent other VMs from using the same CPU core or RAM capacity?
Yes, if you have pinned CPUs/memory to a specific NUMA socket, other VMs cannot use the same cores/memory on that socket. For CPU pinning, you can explicitly specify how many cores to pin to a NUMA node.
Thanks,
Komal
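The condition in the example above (combined available memory on the sockets >= requested VM RAM) can be written out as a small check. This is an illustration of the stated rule only, not the testbed's actual implementation; the function name and the numbers in the comments are hypothetical.

```python
def can_pin_memory(requested_gb, available_gb_per_node):
    """True if the NUMA nodes' combined free memory covers the requested RAM.

    available_gb_per_node lists the free memory on each NUMA node that hosts
    one of the VM's components (for example, one node for the NIC and one
    for the GPU).
    """
    return sum(available_gb_per_node) >= requested_gb


# Hypothetical numbers: two nodes with 40G and 30G free.
#   can_pin_memory(64, [40, 30]) is True  (70 >= 64)
#   can_pin_memory(64, [20, 30]) is False (50 < 64)
```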
No. Requesting less memory, or deploying on a relatively less-used site, would give a better chance of success. I checked on the portal, and GPN seems to be very sparsely used. Please consider requesting the VM there and try with 32G of RAM.
A VM connected to only one component maps to a single NUMA node, so its upper limit is that node's memory. The maximum memory for a NUMA node is 64G, so exceeding that limit would not work.
Adding more flexibility to this API would help alleviate this issue. Will definitely work on that and keep you updated once that is available.
Thanks,
Komal
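For the single-component case described in this last reply, the limit can be sketched as follows. The 64G ceiling is the figure quoted in the reply; the function name and the free-memory values are hypothetical, and the real scheduler may apply further constraints.

```python
NUMA_NODE_MAX_GB = 64  # per-node ceiling quoted in the reply


def fits_single_node(requested_gb, node_free_gb):
    """True if a single-component VM's pinned RAM can fit on one NUMA node.

    The request must not exceed the per-node ceiling, nor the memory that is
    currently free on that node.
    """
    return requested_gb <= min(NUMA_NODE_MAX_GB, node_free_gb)


# Hypothetical numbers:
#   fits_single_node(32, 48) is True   (32G fits in 48G free)
#   fits_single_node(96, 64) is False  (exceeds the 64G ceiling)
#   fits_single_node(32, 16) is False  (only 16G free right now)
```

This is why a 32G request on a sparsely used site is more likely to succeed than a larger one.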