Forum Replies Created
-
AuthorPosts
-
Hi Pilar,
Could you please check whether your bastion keys have expired on the Portal -> Experiments -> Manage SSH Keys -> Bastion Keys? If so, please re-run the notebook jupyter-examples-rel1.6.1/configure_and_validate.ipynb. This should renew your bastion keys. Please try creating your slice again after this.
Thanks,
Komal
Hi Nishant,
Installing fablib from the main branch should work. It uses the fabrictestbed==1.5.9 dependency, which has the fix. fabrictestbed==1.5.9 is built from the llt branch. I will work on merging this branch to main as well.
Thanks,
Komal
Hi Fraida,
Yes, FABRIC now supports Slices using OVS Bridges using NIC_Basic. An example notebook can be found here: https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/complex_recipes/openvswitch/openvswitch.ipynb
We do have the following constraint for this to work, though.
Host Considerations:
Because of constraints imposed by NVIDIA/Mellanox, when using NIC_Basic for an OVS bridge experiment, it is advisable to deploy the VM running the bridge on a separate host from the VMs linked to the bridge. Note that this condition does not apply to NIC_ConnectX_5 and NIC_ConnectX_6 configurations.
Thanks,
Komal
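P.S. To make the NIC_Basic host consideration for OVS bridges concrete, here is a minimal Python sketch that checks a planned placement before a slice is submitted. It does not call FABlib: the function name, the placement dictionary, and the host names are hypothetical, and only the NIC model names come from the note above.

```python
def colocated_with_bridge(bridge_vm, attached_vms, placement, nic_model="NIC_Basic"):
    """Return the attached VMs that share a host with the bridge VM.

    placement maps a VM name to the host it is planned on (hypothetical input).
    The separate-host advice applies to NIC_Basic only, so other models
    (NIC_ConnectX_5, NIC_ConnectX_6) always yield an empty list.
    """
    if nic_model != "NIC_Basic":
        return []
    bridge_host = placement[bridge_vm]
    return [vm for vm in attached_vms if placement[vm] == bridge_host]


# Hypothetical placement: host names are made up for illustration.
placement = {"bridge": "host-a", "node1": "host-a", "node2": "host-b"}
conflicts = colocated_with_bridge("bridge", ["node1", "node2"], placement)
print(conflicts)
```

Any VM returned by the check would be a candidate to move to a different host than the bridge VM.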
Thank you for sharing this, Sunjay. We will fix this in the next version.
Thanks,
Komal
Posting an update to close the loop.
Had a quick Zoom meeting with Laura to resolve this. The issue seemed to be the bastion keys; removing the contents of fabric_config and re-running configure_and_validate.ipynb resolved it.
@Laura – Please let us know if you run into any issues!
Thanks,
Komal
June 10, 2024 at 9:43 am in reply to: Created 2 nodes with smartnics. Not able to create a network connection #7083
Hi Shoaib,
Which network are you trying to set up – layer2 or layer3? Please share your slice ID to help us investigate this further.
In addition, please take a look at the examples available from start_here.ipynb.
All the networking examples have three configurations:
- Auto – FABLIB automatically configures IPs and routes
- Manual – User explicitly configures IPs and routes
- Config – User explicitly specifies the IPs/subnets to choose and FABLIB automatically configures IPs and routes
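As a rough illustration of the Config option, the sketch below uses only Python's standard ipaddress module to show what specifying a subnet and having addresses chosen from it amounts to. It does not call FABLIB; the subnet, the node names, and the helper function are hypothetical.

```python
import ipaddress


def assign_addresses(subnet, node_names):
    """Pair each node name with the next usable host address in the subnet."""
    network = ipaddress.ip_network(subnet)
    hosts = iter(network.hosts())  # usable host addresses, in order
    return {name: str(next(hosts)) for name in node_names}


# Hypothetical subnet and node names.
assignments = assign_addresses("192.168.1.0/24", ["node1", "node2"])
print(assignments)
```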
Please let us know if you still have questions or concerns.
Thanks,
Komal
- This reply was modified 5 months, 2 weeks ago by Komal Thareja.
There was a missing configuration on RUTG due to the maintenance being lifted from one of the hosts there. Please check again now; the interfaces should be visible on your VMs.
Thank you for reporting this issue and allowing us to address the misconfiguration.
Thanks,
Komal
Hi Nirmala,
Are you a member of multiple projects? If so, could you please try the following and see if this helps?
From the portal, go to Experiments -> Projects and Slices, choose the specific project, and click on Slices under that project.
Please let us know if this is still an issue.
Alternatively, you could renew the slices from Jupyter Hub with one of the following options:
Option A: Slice commander
- Open a terminal, type slice-commander
- type ls to list your slices
- cd to your slice and then type renew <days>
Option B: Notebook
- List your slices using notebook Start Here -> List All Slices (available under Managing Slices)
- Renew your slices using notebook Start Here -> Extending a Slice Reservation (available under Managing Slices)
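For Option B, a renewal needs a new end date. A minimal sketch of computing one with the standard library is below. The helper name is hypothetical, and the timestamp format is an assumption modeled on the Start/End timestamps the Control Framework prints (for example 2024-06-06 17:55:23 +0000); please follow the notebook for the exact renew call.

```python
from datetime import datetime, timedelta, timezone


def renewal_end_date(days, now=None):
    """Return a UTC timestamp `days` after `now` (default: the current time)."""
    now = now or datetime.now(timezone.utc)
    # Assumed format: date, time, and UTC offset separated by spaces.
    return (now + timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S %z")


# Fixed starting point so the result is reproducible.
start = datetime(2024, 6, 5, 17, 55, 23, tzinfo=timezone.utc)
end_date = renewal_end_date(1, now=start)
print(end_date)
```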
Thanks,
Komal
Could you please share your slice ID? The ID you shared earlier is your Project ID.
Thanks,
Komal
Hi Garegin,
I suspect you are using an Ubuntu image for your VMs. Please note that on Ubuntu, the interfaces are not up by default.
Please install net-tools using the following command:
apt install net-tools
You can then verify the interfaces via the command:
ifconfig -a
Thanks,
Komal
Hi Laura,
Please check your slice on the Portal. I can confirm that all the resources requested by your slice are provisioned and in the Active state. I suspect /home/fabric/work/fabric_config/ssh_config is not set up correctly.
Could you please check if you see any errors in /tmp/fablib/fablib.log?
Also, please try the following steps:
- Remove the file /home/fabric/work/fabric_config/ssh_config
- Run the notebook jupyter-examples-1.6.1/configure_and_validate.ipynb
- Recreate your slice
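The log check and the first step above can be sketched in Python as follows. The two paths are the ones mentioned in this post. The helper names are hypothetical, and the idea that problem lines in fablib.log contain the word ERROR is an assumption about the log format.

```python
from pathlib import Path


def find_error_lines(log_path, marker="ERROR"):
    """Return lines of the log that contain `marker` (assumed error tag)."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if marker in line]


def remove_if_present(file_path):
    """Delete the file if it exists; return True when something was removed."""
    path = Path(file_path)
    if path.is_file():
        path.unlink()
        return True
    return False


if __name__ == "__main__":
    for line in find_error_lines("/tmp/fablib/fablib.log"):
        print(line)
    # First step from the list above; the remaining steps are done in the notebook.
    removed = remove_if_present("/home/fabric/work/fabric_config/ssh_config")
    print("ssh_config removed:", removed)
```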
Thanks,
Komal
Hi Sunjay,
As you correctly pointed out, only the contents of the work directory persist across container restarts. I would recommend setting up your local Jupyter environment for any customized experience, as indicated here: https://learn.fabric-testbed.net/knowledge-base/install-the-python-api/#install-jupyter-in-the-virtual-environment
We are also working on providing containerized access to Jupyter, where users can launch the container on their desktop/laptop and use that. This should be available soon and would enable user environment customization.
Thanks,
Komal
The Portal currently displays the overall disk usage for the TACC site, which is the combined disk space across all the hosts at TACC. The Control Framework determines the candidate hosts for your VM on the basis of the resources requested.
Your slice is requesting a VM with a CX5, which at TACC is only available on tacc-w4. The Control Framework therefore tries to allocate the VM on tacc-w4 but fails because there is no disk space there. I would recommend using a different site than TACC.
Also, we are working on improving the Resource Usage display to show per-host information.
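To illustrate why the site-level number on the Portal can look sufficient while the request still fails, here is a small Python sketch. The per-host figures are invented for illustration and are not actual TACC capacities, tacc-w1 is a hypothetical host name, and the selection logic is a simplification rather than the Control Framework's actual algorithm.

```python
def candidate_hosts(hosts, component, disk_needed):
    """Return hosts that offer `component` and have enough free disk (GB)."""
    return [
        name
        for name, info in hosts.items()
        if component in info["components"] and info["free_disk"] >= disk_needed
    ]


# Invented numbers: only tacc-w4 offers the ConnectX-5 and its disk is full.
hosts = {
    "tacc-w1": {"components": {"NIC_Basic"}, "free_disk": 500},
    "tacc-w4": {"components": {"NIC_Basic", "NIC_ConnectX_5"}, "free_disk": 0},
}

# Site-wide free disk, which is what an aggregate display would show.
site_free_disk = sum(info["free_disk"] for info in hosts.values())
print(site_free_disk)

# Hosts that can actually take a VM needing a ConnectX-5 and 10 GB of disk.
eligible = candidate_hosts(hosts, "NIC_ConnectX_5", 10)
print(eligible)
```

In this toy setup the site total is non-zero, yet no host qualifies, because the only host with the requested component has no free disk.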
Thanks,
Komal
Hi Nishanth,
The error Insufficient resources : [disk] implies that there is not enough disk available on the host on which your VM is being requested. Looking at your slice, the following VM requesting a ConnectX5 is being rejected as it maps to tacc-w4. There is not enough disk available on tacc-w4 to accommodate your VM, hence the failure.
Reservation ID: 478b2a91-5a02-4cf0-9bcd-de04c3b873ea Slice ID: 30f9fb42-37be-420f-899e-082a41bfb735
Resource Type: VM Notices: Reservation 478b2a91-5a02-4cf0-9bcd-de04c3b873ea (Slice Traffic Listening Demo TACC(30f9fb42-37be-420f-899e-082a41bfb735) Graph Id:a58b7bc7-55d6-42e9-b457-5a8a32ebebc9 Owner:nshyamkumar@iit.edu) is in state (Closed,None_) (Last ticket update: Insufficient resources : ['disk'])
Start: 2024-06-05 17:55:24 +0000 End: 2024-06-06 17:55:23 +0000 Requested End: 2024-06-06 17:55:23 +0000
Units: 1 State: Closed Pending State: None_
Predecessors
Sliver: {'node_id': '9a579143-79b2-44fb-bacb-e6a5db4da3bf', 'capacities': '{ core: 2 , ram: 8 G, disk: 1 G}', 'capacity_hints': '{ instance_type: fabric.c2.m8.d10}', 'image_ref': 'default_ubuntu_20', 'image_type': 'qcow2', 'name': 'TACC_node4', 'reservation_info': '{"reservation_id": "478b2a91-5a02-4cf0-9bcd-de04c3b873ea", "reservation_state": "Closed"}', 'site': 'TACC', 'type': 'VM', 'user_data': '{"fablib_data": {"instantiated": "False", "run_update_commands": "False", "post_boot_commands": [], "post_update_commands": []}}'}
Component: {'node_id': '670d117f-19ac-477b-bff7-36ac4e90107a', 'details': 'Mellanox ConnectX-5 Dual Port 10/25GbE', 'model': 'ConnectX-5', 'name': 'TACC_node4-pmnic_2', 'type': 'SmartNIC', 'user_data': '{}'}
NS: {'node_id': 'adeede90-a808-45a6-8e1e-8c8de7a4ee6e', 'layer': 'L2', 'name': 'TACC_node4-TACC_node4-pmnic_2-l2ovs', 'site': 'TACC', 'type': 'OVS'}
IFS: {'node_id': '2f9a52b4-3108-48f2-b0f9-e0ccd7716cdc', 'capacities': '{ bw: 25 Gbps, unit: 1 }', 'labels': '{ local_name: p1}', 'name': 'TACC_node4-pmnic_2-p1', 'type': 'DedicatedPort', 'user_data': '{"fablib_data": {"mode": "config"}}'}
IFS: {'node_id': 'b6c42c3e-a570-4ed1-b633-607e90777f34', 'capacities': '{ bw: 25 Gbps, unit: 1 }', 'labels': '{ local_name: p2}', 'name': 'TACC_node4-pmnic_2-p2', 'type': 'DedicatedPort', 'user_data': '{"fablib_data": {"mode": "config"}}'}
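If you want to pull the failure reason out of a notice like the one above programmatically, a minimal Python sketch follows. The regular expression is written against the text shown in this post only, so other notices may be formatted differently, and the function name is hypothetical.

```python
import re


def insufficient_resources(notice):
    """Return the resource names listed after 'Insufficient resources :'."""
    match = re.search(r"Insufficient resources\s*:\s*\[([^\]]*)\]", notice)
    if match is None:
        return []
    # Split the bracket contents on commas and strip quotes and spaces.
    items = (item.strip(" '\"") for item in match.group(1).split(","))
    return [item for item in items if item]


notice = "is in state (Closed,None_) (Last ticket update: Insufficient resources : ['disk'])"
reasons = insufficient_resources(notice)
print(reasons)
```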
Thanks,
Komal