Forum Replies Created
Hi Nirmala,
This looks like a bug in fablib, which doesn’t have the update regarding the username for default_kali. Could you please try using the user kali to access the VM?
I will work to provide a fix for this soon. Apologies for the inconvenience!
Thanks,
Komal
Sorry for the confusion. Users can still validate their keys against the bastion host using the command above, as indicated in the learn article.
Thanks,
Komal
Hi,
Could you please try running this notebook?
jupyter-examples-rel1.6.1/configure_and_validate.ipynb
This will validate your configuration and update it if needed.
Please try running the Hello FABRIC notebook after that. This notebook creates a VM and also displays the SSH command to use to log in to the VM. Please check whether the SSH command works and let us know if the issue persists.
NOTE: The bastion host is only used as a jump box; we do not allow login to bastion nodes.
Thanks,
Komal
- This reply was modified 4 months ago by Komal Thareja.
Hi Nishanth,
The fix for this issue is now available with fabrictestbed-extensions==1.7.2.
Thanks,
Komal
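To check whether an environment already has this release, one option is to compare the installed package version against 1.7.2. The following is a minimal sketch using only the Python standard library. The helper names parse_version, has_fix and installed_version are hypothetical and not part of FABlib, and the comparison assumes plain dotted release versions such as 1.7.2.

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGE = "fabrictestbed-extensions"
FIXED_IN = "1.7.2"  # release named in the reply above


def parse_version(text):
    """Turn a plain dotted version such as '1.7.2' into a tuple of ints.

    Assumption: numeric release versions only; pre-release suffixes
    (for example '1.7.2b1') raise ValueError.
    """
    return tuple(int(part) for part in text.split("."))


def has_fix(installed, fixed_in=FIXED_IN):
    """Return True if the installed version is at least the fixed release."""
    return parse_version(installed) >= parse_version(fixed_in)


def installed_version(package=PACKAGE):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_version()
    if current is None:
        print(f"{PACKAGE} is not installed")
    else:
        try:
            print(f"{PACKAGE} {current}: fix included = {has_fix(current)}")
        except ValueError:
            print(f"{PACKAGE} {current}: version format not handled by this sketch")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "1.10.0" sorting before "1.7.2".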
Hi Nishanth,
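We just added support for users to query host-level information as well. Examples for this will be available in a couple of days. I am sharing a snippet below showing how this can be done:
# Assumes the usual FablibManager setup from the FABRIC example notebooks
from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
fablib = fablib_manager()
# List the hosts across all sites
fablib.list_hosts()
# Look up a single site, then its hosts
resources = fablib.get_resources()
site = resources.get_site("MAX")
all_hosts = site.get_hosts()
host = site.get_host(name="max-w1.fabric-testbed.net")
print(host)
Please check out our documentation at the links below for more details, and let us know if you have more questions or concerns.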
https://fabric-fablib.readthedocs.io/en/latest/resources.html
https://fabric-fablib.readthedocs.io/en/latest/site.html
Thanks,
Komal
July 23, 2024 at 10:03 am in reply to: MFLib question – querying the entire fabric environment #7278
Hi Bjoern,
We just added support for users to query host-level information as well. Examples for this will be available in a couple of days. I am sharing a snippet below showing how this can be done:
# Assumes the usual FablibManager setup from the FABRIC example notebooks
from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
fablib = fablib_manager()
# List the hosts across all sites
fablib.list_hosts()
# Look up a single site, then its hosts
resources = fablib.get_resources()
site = resources.get_site("MAX")
all_hosts = site.get_hosts()
host = site.get_host(name="max-w1.fabric-testbed.net")
print(host)
Please check out our documentation at the links below for more details, and let us know if you have more questions or concerns.
https://fabric-fablib.readthedocs.io/en/latest/resources.html
https://fabric-fablib.readthedocs.io/en/latest/site.html
Thanks,
Komal
Hi Natty,
Apologies for the inconvenience! I looked at your slice and can confirm from the logs that the deletion was triggered by the user, via either FABlib or the Portal.
2024-07-10 21:30:15,216 - CFEL Slice event slc:54466474-bcc2-45ef-a86a-4696b7d1bc5e create by prj:073ee843-2310-45bd-a01f-a15d808827dc usr:4d5326ac-b002-444f-bb44-3b406a038be5:nm3833@nyu.edu:49c50f84eab437103fa7f7863bfbbd5bbe4e303ea017b2912ce0df418b61a3df compute vms:5,cores:48,p4s:0; sites GATECH; components SmartNIC:1,SharedNIC:4; services L2Bridge:0,L2Bridge:0; vmdetails C32/R256/D25:1,C4/R32/D10:4
2024-07-11 14:25:52,610 - CFEL Slice event slc:54466474-bcc2-45ef-a86a-4696b7d1bc5e delete by prj:073ee843-2310-45bd-a01f-a15d808827dc usr:4d5326ac-b002-444f-bb44-3b406a038be5:nm3833@nyu.edu:49c50f84eab437103fa7f7863bfbbd5bbe4e303ea017b2912ce0df418b61a3df
This may have happened because of the bug reported here. With the maintenance next week, the delete bug will be addressed.
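For anyone reading similar slice event lines, here is a minimal sketch of pulling out the timestamp, slice ID and action with the Python standard library. The pattern is inferred only from the two lines quoted above and is an assumption about the log format, not a documented one. The function name parse_slice_event is hypothetical, and the sample lines are shortened copies with the email replaced by a placeholder.

```python
import re
from datetime import datetime

# Pattern inferred from the two quoted log lines; trailing fields are ignored.
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(?P<ms>\d{3}) - "
    r"CFEL Slice event slc:(?P<slice_id>[0-9a-f-]+) "
    r"(?P<action>\w+) by prj:(?P<project_id>[0-9a-f-]+) "
    r"usr:(?P<user_id>[0-9a-f-]+):(?P<email>[^:\s]+)"
)

# Shortened sample lines: placeholder email, remaining fields dropped.
CREATE_LINE = (
    "2024-07-10 21:30:15,216 - CFEL Slice event "
    "slc:54466474-bcc2-45ef-a86a-4696b7d1bc5e create by "
    "prj:073ee843-2310-45bd-a01f-a15d808827dc "
    "usr:4d5326ac-b002-444f-bb44-3b406a038be5:user@example.org"
)
DELETE_LINE = (
    "2024-07-11 14:25:52,610 - CFEL Slice event "
    "slc:54466474-bcc2-45ef-a86a-4696b7d1bc5e delete by "
    "prj:073ee843-2310-45bd-a01f-a15d808827dc "
    "usr:4d5326ac-b002-444f-bb44-3b406a038be5:user@example.org"
)


def parse_slice_event(line):
    """Return a dict of the matched fields, or None if the line does not match."""
    match = EVENT_RE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Merge the seconds and milliseconds parts into one datetime value.
    stamp = f"{event.pop('ts')}.{event.pop('ms')}"
    event["timestamp"] = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    return event


if __name__ == "__main__":
    created = parse_slice_event(CREATE_LINE)
    deleted = parse_slice_event(DELETE_LINE)
    print(created["action"], created["timestamp"])
    print(deleted["action"], deleted["timestamp"])
    print("slice lifetime:", deleted["timestamp"] - created["timestamp"])
```

Matching the create and delete events on the same slice ID shows how long the slice existed before the delete was issued.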
Thanks,
Komal
- This reply was modified 4 months, 2 weeks ago by Komal Thareja.
Hi Nishant,
It appears that a network service has leaked. In a distributed system like our testbed, encountering some leaked resources is not unusual. We plan to deploy updates in the coming week to address this issue. In the meantime, I recommend introducing a delay between deletion and recreation, as the resources are distributed across the testbed.
For now, I have cleaned up the leaked services, so provisioning should work.
Also, if possible, could you please share the notebook or code snippet that might help reproduce this state? It would be very helpful for debugging and addressing this issue. Appreciate your help with this!
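The delay between deletion and recreation recommended above could be sketched as follows. This is only an illustration: delete_then_recreate is a hypothetical helper, not a FABlib function, and the 120-second default is an assumed value, since no specific duration is given here. With FABlib, delete_fn and create_fn would wrap whatever calls your notebook already uses to delete and resubmit the slice.

```python
import time


def delete_then_recreate(delete_fn, create_fn, delay_s=120.0, sleep=time.sleep):
    """Call delete_fn, wait delay_s seconds, then call create_fn.

    delay_s is an assumed value; tune it for your own experiment.
    The sleep parameter exists so the wait can be replaced in tests.
    Returns whatever create_fn returns.
    """
    delete_fn()
    # Give the distributed testbed time to release the old resources.
    sleep(delay_s)
    return create_fn()


if __name__ == "__main__":
    # Stand-in callables; replace them with your own delete and create steps.
    result = delete_then_recreate(
        delete_fn=lambda: print("deleting old slice"),
        create_fn=lambda: "new slice submitted",
        delay_s=1.0,
    )
    print(result)
```

Passing the two steps in as callables keeps the waiting logic separate from the slice code, so the same helper works however the slice is built.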
Best regards,
Komal
We are deploying 1.7 next week, which will contain a fix for this issue. Apologies for the inconvenience!
Thanks,
Komal
Hi Nishant,
The VM requested on GATECH, identified by ID 3d425fc6-0f44-4e98-a0cc-d9ee9358cb8f, cannot be allocated. It looks like you are requesting a CX6 there; these are only available on GATECH-w3, which is currently under maintenance. Hence, CF is unable to find any nodes to serve this reservation.
Hope this helps!
Thanks,
Komal
Hi Nirmala,
The FABlib API supports three modes of configuration for the VMs.
– Manual: Manual configuration does not require any additional steps before the slice request is submitted.
– Auto: Automatic configuration requires specifying a subnet for the network and setting the interface’s mode to auto using the iface1.set_mode('auto') function before submitting the request. With automatic configuration, FABlib will allocate an IP from the network’s subnet and configure the device during the post-boot configuration stage. Optionally, you can add routes to the node before submitting the request.
– User Defined (config): User-defined configuration requires specifying a subnet for the network and specifying the IP to use for each interface before the request is submitted. You can signal FABlib to configure the user-defined IPs by setting the interface’s mode to config using the iface1.set_mode('config') function before submitting the request. With user-defined configuration, FABlib will use the IP defined by the user and configure the device during the post-boot configuration stage. Optionally, you can add routes to the node before submitting the request.
Examples for each mode of configuration are available via Start Here:
Assuming the manual configuration is done via ip addr commands, it does not persist across reboots, and the onus lies on the user to save and re-apply the configuration after a reboot.
For the auto and config modes, FABlib keeps the IP address information in the metadata for each VM, in the UserData JSON object saved in the FABRIC Information Model for the VM. In both of these modes, the configuration can be fetched and re-applied using the following code block:
slice = fablib.get_slice(slice_name)
for n in slice.get_nodes():
    n.config()
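To make the difference between the auto and config modes more concrete, here is a small sketch of what allocating addresses from a subnet means, using only Python's ipaddress module. This is not FABlib's implementation. The function allocate_ips, the 192.168.1.0/24 subnet and the reserved gateway address are all illustrative assumptions. In config mode you would pick the addresses yourself instead of having them chosen for you.

```python
import ipaddress


def allocate_ips(subnet, count, reserved=()):
    """Return the first `count` usable host addresses in `subnet`.

    Addresses listed in `reserved` are skipped. Raises ValueError if the
    subnet does not contain enough free addresses.
    """
    network = ipaddress.ip_network(subnet)
    taken = {ipaddress.ip_address(addr) for addr in reserved}
    allocated = []
    for host in network.hosts():
        if len(allocated) == count:
            break
        if host not in taken:
            allocated.append(host)
    if len(allocated) < count:
        raise ValueError(f"{subnet} has fewer than {count} free addresses")
    return allocated


if __name__ == "__main__":
    # Illustrative values: one address per interface, gateway kept free.
    for ip in allocate_ips("192.168.1.0/24", 2, reserved=["192.168.1.1"]):
        print(ip)
```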
Hope this helps! Please let me know if you have any feedback.
Thanks,
Komal
- This reply was modified 4 months, 3 weeks ago by Komal Thareja.
- This reply was modified 4 months, 3 weeks ago by Komal Thareja.
Hi Vaiden,
I’m not sure how your slice was set up. If your interfaces were configured in auto mode, you should be able to do the following to re-apply the config.
slice = fablib.get_slice(slice_name)
for n in slice.get_nodes():
    n.config()
Thanks,
Komal
- This reply was modified 4 months, 4 weeks ago by Komal Thareja.
Hi Vaiden,
I have checked the logs and don’t see any errors. Looks like your slice was modified multiple times.
Some of the resources closed on 2024-05-18 18:27:07 +0000, while all the others had their expiry date set to 2024-11-02 19:45:33 +0000. Is it possible that you triggered the slice deletion by accident?
According to the logs, I have no evidence of the software deleting it due to expiry.
Could you please share if there were any recent actions taken on the slice? Trying to see if this can be recreated.
Appreciate your help with this.
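For reference, the gap between the two timestamps mentioned above can be computed with the standard library. This is a small sketch; the variable names are illustrative only.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S %z"

# Timestamps quoted above: when some resources closed, and the expiry date.
closed_at = datetime.strptime("2024-05-18 18:27:07 +0000", FMT)
expires_at = datetime.strptime("2024-11-02 19:45:33 +0000", FMT)

# How far ahead of the scheduled expiry the resources were closed.
early_by = expires_at - closed_at
print(f"Closed {early_by.days} days before the expiry date")
```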
Thanks,
Komal
Hi Vaiden,
Unfortunately, it is not possible to recover the slivers once deleted. I will look at why the slice was closed before the end date. However, I do suspect that Extend/Renew may have failed for certain slivers. I will look more and share details here.
Thanks,
Komal
Good morning Fraida,
I’ve pushed a fix to address this issue. Could you please try again using the Beyond Bleeding Edge container on JH and let us know if the problem persists? Your help is greatly appreciated!
Thanks,
Komal