Forum Replies Created
August 2, 2023 at 10:40 am in reply to: FailedPostStartHook Error when launching Jupyter Notebook #4885
Hi Sarah,
When you have the bleeding edge/beyond bleeding edge container running, could you please share the output of the following commands?
ls -lrt /home/fabric/work/
ls -lrt /home/fabric/work/fabric_config
Also, could you please email the warning message you couldn’t upload to kthare10@renci.org?
Thanks,
Komal
Hi Fengping,
Node1: ens7 maps to NIC3. It was configured as below. NOTE: the prefixlen is set to 128 instead of 64.
ens7: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 2602:fcfb:1d:2::2 prefixlen 128 scopeid 0x0
inet6 fe80::7f:aeff:fe44:cbc9 prefixlen 64 scopeid 0x20
ether 02:7f:ae:44:cb:c9 txqueuelen 1000 (Ethernet)
RX packets 28126 bytes 2617668 (2.6 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2581 bytes 208710 (208.7 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
I brought this interface down and re-configured the IP address using the following command:
ip -6 addr add 2602:fcfb:1d:2::2/64 dev ens7
After this I can ping the gateway as well as other nodes.
root@node1:~# ping 2602:fcfb:1d:2::4
PING 2602:fcfb:1d:2::4(2602:fcfb:1d:2::4) 56 data bytes
64 bytes from 2602:fcfb:1d:2::4: icmp_seq=1 ttl=64 time=0.186 ms
^C
--- 2602:fcfb:1d:2::4 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.186/0.186/0.186/0.000 ms
root@node1:~# ping 2602:fcfb:1d:2::1
PING 2602:fcfb:1d:2::1(2602:fcfb:1d:2::1) 56 data bytes
64 bytes from 2602:fcfb:1d:2::1: icmp_seq=1 ttl=64 time=0.555 ms
^C
--- 2602:fcfb:1d:2::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.555/0.555/0.555/0.000 ms
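To illustrate why the prefix length matters here, below is a minimal Python sketch using only the standard `ipaddress` module. The helper name `on_link` is hypothetical and not part of any FABRIC tooling; it only checks whether a peer address falls inside the subnet implied by the interface's address and prefix length. With /128 the interface's subnet contains only its own address, so the gateway and the other nodes are not treated as on-link.

```python
from ipaddress import IPv6Address, IPv6Interface


def on_link(interface_cidr: str, peer: str) -> bool:
    """Return True if `peer` lies inside the subnet implied by `interface_cidr`.

    Hypothetical helper for illustration only.
    """
    iface = IPv6Interface(interface_cidr)
    return IPv6Address(peer) in iface.network


gateway = "2602:fcfb:1d:2::1"
other_node = "2602:fcfb:1d:2::4"

# Original configuration: prefixlen 128, so the subnet holds a single address.
print(on_link("2602:fcfb:1d:2::2/128", gateway))     # False
# Corrected configuration: prefixlen 64, so gateway and peers are on-link.
print(on_link("2602:fcfb:1d:2::2/64", gateway))      # True
print(on_link("2602:fcfb:1d:2::2/64", other_node))   # True
```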
Node2: IP was configured on ens7. However, the MAC address for NIC3, 02:15:60:C2:7A:AD, maps to ens9.
I configured ens9 with the command
ip -6 addr add 2602:fcfb:1d:2::3/64 dev ens9
and can now ping the gateway and other nodes.
root@node2:~# ping 2602:fcfb:1d:2::1
PING 2602:fcfb:1d:2::1(2602:fcfb:1d:2::1) 56 data bytes
64 bytes from 2602:fcfb:1d:2::1: icmp_seq=1 ttl=64 time=0.948 ms
64 bytes from 2602:fcfb:1d:2::1: icmp_seq=2 ttl=64 time=0.440 ms
^C
--- 2602:fcfb:1d:2::1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1007ms
rtt min/avg/max/mdev = 0.440/0.694/0.948/0.254 ms
root@node2:~# ping 2602:fcfb:1d:2::2
PING 2602:fcfb:1d:2::2(2602:fcfb:1d:2::2) 56 data bytes
64 bytes from 2602:fcfb:1d:2::2: icmp_seq=1 ttl=64 time=0.146 ms
64 bytes from 2602:fcfb:1d:2::2: icmp_seq=2 ttl=64 time=0.082 ms
^C
--- 2602:fcfb:1d:2::2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1010ms
rtt min/avg/max/mdev = 0.082/0.114/0.146/0.032 ms
Please configure the IPs on other interfaces or share the IPs and I can help configure them.
Thanks,
Komal
July 24, 2023 at 12:30 pm in reply to: Label exception: Unable to set field numa of labels, no such field available #4804
Hi Elie,
Could you please share the output of the following commands from your container?
pip list|grep fabric
cat ~/work/fabric_config/requirements.txt
If you have any entries for fabrictestbed-extensions in ~/work/fabric_config/requirements.txt, please remove them and restart your container via File -> Hub Control Panel -> Stop My Server followed by Start My Server.
Thanks,
Komal
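If it helps, here is a rough Python sketch of the requirements.txt check described above. It is not part of fablib; the function names and the sample file contents (including the version pin) are made up for illustration. It compares package names case-insensitively and treats `-`, `_` and `.` as equivalent, which is how pip normalizes names.

```python
import re


def normalize(name: str) -> str:
    """Normalize a package name the way pip does (lowercase, -_. runs become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_entries(requirements_text: str, package: str = "fabrictestbed-extensions"):
    """Return (line_number, line) pairs whose requirement name matches `package`."""
    target = normalize(package)
    hits = []
    for lineno, raw in enumerate(requirements_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match and normalize(match.group(0)) == target:
            hits.append((lineno, raw.strip()))
    return hits


# Hypothetical file contents, for illustration only.
sample = """\
# hypothetical requirements.txt contents
numpy
fabrictestbed-extensions==1.0
fabrictestbed_extensions
"""

for lineno, line in find_entries(sample):
    print(f"line {lineno}: {line}")
```

To run it against the real file, one could pass `Path("~/work/fabric_config/requirements.txt").expanduser().read_text()` (with `from pathlib import Path`) instead of `sample`.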
Hi Fengping,
I have rebooted both Node1 and Node2. They should be accessible now. Please set up the IPs as per the MAC addresses shared above. Please do let me know if anything else is needed from my side.
Thanks,
Komal
You can confirm the interfaces for Node1 and Node2 via the MAC addresses:
Node1
02:7F:AE:44:CB:C9 => NIC3
06:E3:D6:00:5B:06 => NIC2
02:BC:A6:3F:C7:CB => NIC1
Node2
02:15:60:C2:7A:AD => NIC3
02:1D:B9:31:E7:23 => NIC2
02:B5:53:89:2C:E6 => NIC1
Thanks,
Komal
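One way to look up which OS interface carries a given MAC address from inside the VM is sketched below. This is an illustrative helper, not a FABRIC tool: the function name `iface_for_mac` is hypothetical, and it assumes a Linux guest where each interface exposes its MAC in `/sys/class/net/<name>/address`.

```python
from pathlib import Path
from typing import Optional


def iface_for_mac(mac: str, sys_net: Path = Path("/sys/class/net")) -> Optional[str]:
    """Return the interface name whose MAC equals `mac`, or None if not found.

    Assumes the Linux sysfs layout /sys/class/net/<iface>/address.
    """
    wanted = mac.strip().lower()
    for addr_file in sorted(sys_net.glob("*/address")):
        try:
            if addr_file.read_text().strip().lower() == wanted:
                return addr_file.parent.name
        except OSError:
            continue  # interface vanished or is unreadable; skip it
    return None


# On Node2 this would be expected to print ens9 for NIC3, per the mapping above;
# on any other machine it simply prints None.
print(iface_for_mac("02:15:60:C2:7A:AD"))
```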
Hi Fengping,
I think ens7 -> net1, ens8 -> net3, and ens9 -> net2.
Please let me know once you get the public access back. I can help figure out the interfaces.
Thanks,
Komal
Hello Fengping,
I have re-attached the PCI devices for the VMs node1 and node2. You would need to reassign the IP addresses on them for your links to work. Please let us know if the links are working as expected after configuring the IP addresses.
Thanks,
Komal
July 12, 2023 at 10:12 am in reply to: Maintenance on FABRIC-Network AM – 07/12/2023 (9:00am-10:00am EST) #4666
Maintenance is completed. Testbed is open for use.
GKE cluster issues are resolved, Jupyter Hub is back online. Apologies for the inconvenience!
Thanks,
Komal
@yoursunny – Thank you for sharing the example scripts. Appreciate it!
@Xusheng – You can use FabNetv4Ext or FabNetv6Ext services as explained here.
Also, we have two example notebooks, one each for FabNetv4Ext and FabNetv6Ext, available via start_here.ipynb:
- FABNet IPv4 Ext (Layer 3): Connect to FABRIC’s IPv4 internet with external access (manual)
- FABNet IPv6 Ext (Layer 3): Connect to FABRIC’s IPv6 internet with external access (manual)
Thanks,
Komal
@yoursunny Thank you for sharing your script. We have updated the MTU setting across sites and were able to use your script for testing as well. However, with the latest fablib changes for performance improvements, the script needed to be adjusted a little. Sharing the updated script here.
Thanks,
Komal
Hi Fengping,
Thank you so much for reporting this issue. There was a bug that led to the same subnet being allocated to multiple slices, so when a second slice was allocated the same subnet, traffic stopped working for your slice.
I have applied the fix for the bug on production. Could you please delete your slice and recreate it? Apologies for the inconvenience.
Appreciate your help with making the system better.
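For anyone curious what this kind of conflict looks like, here is a small Python sketch that flags slices holding overlapping subnets. It is only an illustration and not the actual fix applied on production; the slice names and the documentation-range prefixes below are made up.

```python
from ipaddress import ip_network
from itertools import combinations


def overlapping_allocations(allocations: dict) -> list:
    """Return pairs of slice names whose allocated subnets overlap."""
    conflicts = []
    for (name_a, net_a), (name_b, net_b) in combinations(sorted(allocations.items()), 2):
        a, b = ip_network(net_a), ip_network(net_b)
        # overlaps() is only defined for networks of the same IP version
        if a.version == b.version and a.overlaps(b):
            conflicts.append((name_a, name_b))
    return conflicts


# Hypothetical allocations using the IPv6 documentation prefix 2001:db8::/32.
allocations = {
    "slice-a": "2001:db8:1::/64",
    "slice-b": "2001:db8:2::/64",
    "slice-c": "2001:db8:1::/64",  # same subnet as slice-a
}

print(overlapping_allocations(allocations))  # [('slice-a', 'slice-c')]
```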
Thanks,
Komal
Please try this to create 12 VMs; this should let you use almost the entire worker with respect to cores. I will keep you posted about the flavor details.
from ipaddress import IPv4Network

# Create Slice (fablib, slice_name, network_name and site are assumed to be defined earlier)
slice = fablib.new_slice(name=slice_name)

# Network
net1 = slice.add_l2network(name=network_name, subnet=IPv4Network("192.168.1.0/24"))

node_name = "Node"
number_of_nodes = 12
for x in range(number_of_nodes):
    # The first node gets the larger disk (4000); all others get 500
    disk = 500
    if x == 0:
        disk = 4000
    node = slice.add_node(name=f'{node_name}{x}', site=site, cores=62, ram=128, disk=disk)
    iface = node.add_component(model='NIC_Basic', name='nic1').get_interfaces()[0]
    iface.set_mode('auto')
    net1.add_interface(iface)

# Submit Slice Request
slice.submit()
Thanks,
Komal
With the current flavor definition, I would recommend requesting VMs with the configuration:
cores='62', ram='384', disk='2000'
Anything bigger than this maps to fabric.c64.m384.d4000, and only one of the workers, i.e. cern-w1, can accommodate 4TB disks; the rest of the workers can accommodate at most 2TB disks. I will discuss this internally to work on providing a better flavor to accommodate your slice.
Thanks,
Komal
P.S: I was able to successfully create a slice with the above configuration.
- This reply was modified 1 year, 7 months ago by Komal Thareja.
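As a rough illustration of the flavor mapping described above, the Python sketch below picks the smallest flavor that fits a request. It assumes flavor names follow the `fabric.c<cores>.m<ram>.d<disk>` pattern seen in `fabric.c64.m384.d4000`, and that a request maps to the smallest flavor that is at least as large in every dimension. Both are assumptions, not confirmed testbed behaviour. The catalogue is hypothetical apart from `fabric.c64.m384.d4000`.

```python
import re
from typing import NamedTuple, Optional


class Flavor(NamedTuple):
    name: str
    cores: int
    ram: int
    disk: int


_FLAVOR_RE = re.compile(r"^fabric\.c(\d+)\.m(\d+)\.d(\d+)$")


def parse_flavor(name: str) -> Flavor:
    """Parse a name like 'fabric.c64.m384.d4000' into its capacities."""
    match = _FLAVOR_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized flavor name: {name!r}")
    cores, ram, disk = (int(g) for g in match.groups())
    return Flavor(name, cores, ram, disk)


def smallest_fit(cores: int, ram: int, disk: int, catalogue) -> Optional[Flavor]:
    """Return the smallest flavor covering the request, or None if nothing fits."""
    fits = [
        f for f in map(parse_flavor, catalogue)
        if f.cores >= cores and f.ram >= ram and f.disk >= disk
    ]
    return min(fits, key=lambda f: (f.cores, f.ram, f.disk), default=None)


# Hypothetical catalogue: only fabric.c64.m384.d4000 is named in this thread.
catalogue = ["fabric.c64.m384.d2000", "fabric.c64.m384.d4000"]

print(smallest_fit(62, 384, 2000, catalogue).name)  # fabric.c64.m384.d2000
print(smallest_fit(62, 384, 2001, catalogue).name)  # fabric.c64.m384.d4000
```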
I looked at the instance types. Please try setting cores='62', ram='384', disk='100'.
FYI, this might be useful for VM sizing: https://github.com/fabric-testbed/InformationModel/blob/master/fim/slivers/data/instance_sizes.json
Thanks,
Komal