Komal Thareja

Forum Replies Created

Viewing 15 posts - 16 through 30 (of 445 total)

Author

Posts
May 28, 2025 at 4:59 am in reply to: Prometheus/Grafana/Node Exporter example not working? #8539
Komal Thareja
Participant
Sorry, wrong post!
- This reply was modified 1 month, 2 weeks ago by Komal Thareja.
May 28, 2025 at 4:48 am in reply to: Reserve bandwidth for a slice #8538
Komal Thareja
Participant
Hi Philips,

At the moment, we do not support guaranteed QoS. This feature will be available soon. In the meantime, you can use tools such as tc to manage bandwidth on the VMs.
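As a rough illustration of the tc approach, the sketch below builds a token bucket filter (tbf) command that caps egress traffic on one interface. This shapes traffic on the VM; it is not a bandwidth guarantee. The interface name `enp7s0` and the burst and latency values are placeholders chosen for the example, so adjust them for your own node. The resulting string could be run on the VM over SSH.

```python
import shlex


def build_tbf_command(iface: str, rate_mbit: int,
                      burst_kbit: int = 32, latency_ms: int = 400) -> str:
    """Return a tc command that caps egress on `iface` using a token bucket filter.

    The default burst and latency values are illustrative, not tuned.
    """
    if rate_mbit <= 0:
        raise ValueError("rate_mbit must be positive")
    # "replace" installs the qdisc or overwrites an existing root qdisc.
    return (
        f"sudo tc qdisc replace dev {shlex.quote(iface)} root "
        f"tbf rate {rate_mbit}mbit burst {burst_kbit}kbit latency {latency_ms}ms"
    )


# Hypothetical interface name; check the real one on your VM first.
cmd = build_tbf_command("enp7s0", 100)
print(cmd)
```

To remove the limit again, `sudo tc qdisc del dev <iface> root` restores the default queueing discipline.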

Thanks,
Komal
May 21, 2025 at 4:03 pm in reply to: FPGA valid sites for Esnet toolchain #8522
Komal Thareja
Participant
Hi Nishanth,

Please find my responses inline below:

Once a user has reserved a slice with an FPGA, that resource is locked and cannot be acquired or modified by other users until the slice is released.

You’re correct: if the FPGA has been flashed with a workflow other than the ESnet workflow, it may fail.

However, we cannot guarantee the validity or state of the bitstream that was previously flashed by another user before you acquired the slice. This may leave the FPGA in an inconsistent or unusable state. In our experience, reflashing the FPGA with a known good (golden) image typically restores it to a usable state.

We are planning to share this golden image along with the notebook with users soon, so they can perform the reflash themselves when needed. In the meantime, if you’re currently blocked, please let me know the specific site you’re working with—I’ll check whether we can assist with reflashing the FPGA for you.

Thanks,

Komal
May 21, 2025 at 1:45 pm in reply to: L2Bridge not forwarding packets in SALT #8520
Komal Thareja
Participant
Hi Alex,

The network team reviewed the configuration and found no issues on the switch side. However, they observed that the MAC addresses for these interfaces have not been learned by the switch.

As a next step, they recommend removing the L2Bridge service and connecting both interfaces directly to FabNetV4 to verify if the network connectivity is restored.

Please perform this change using slice modify, so the same VMs and interfaces can be reused for validation. This helps us rule out the possibility that recreating the VMs might inadvertently resolve the issue.

Refer to this notebook for guidance on how to modify the slice.

Thanks,

Komal
May 21, 2025 at 1:14 pm in reply to: Unable to SSH into my Nodes #8519
Komal Thareja
Participant
Could you please check your VM again?

All PCI devices had been disconnected. I have reconnected them to your VM. Please check it.

Also, could you please share the sequence of operations that led your VM to this state?

It would help us determine whether anything needs to be fixed in our control software.

Thanks,

Komal
May 21, 2025 at 12:40 pm in reply to: Unable to SSH into my Nodes #8517
Komal Thareja
Participant
Please share your slice ID and also the output of the command: ifconfig -a

Thanks,

Komal
May 21, 2025 at 6:11 am in reply to: L2Bridge not forwarding packets in SALT #8512
Komal Thareja
Participant
Thank you, Alex, for sharing this observation! I temporarily assigned IP addresses to these interfaces on nodes r3 and 4, and ping does not work between them.

The network service as provisioned looks OK. I am reaching out to the network team and will keep you posted.

Thanks,

Komal
May 21, 2025 at 5:54 am in reply to: Unable to SSH into my Nodes #8511
Komal Thareja
Participant
Hi Ajay,

You can use the following code snippet to reboot the node:
slice = fablib.get_slice(slice_name)
node = slice.get_node(node_name)
node.os_reboot()

Also, please share your slice ID so we can take a look at it.

Thanks,

Komal
May 16, 2025 at 9:11 am in reply to: FPGA valid sites for Esnet toolchain #8499
Komal Thareja
Participant
Thank you for your question.

What I meant is that once an FPGA is initially flashed with a provided bitstream, users can reflash it with a different bitstream of their choice—as long as the PCIe interface remains unchanged. Because of this flexibility, the actual state of the FPGA at a given site may differ from what’s shown in the shared sheet, depending on whether a user has reprogrammed it.

Best,

Komal
May 16, 2025 at 9:06 am in reply to: Testing BitTorrent and IPFS #8498
Komal Thareja
Participant
Thank you for your feedback, Philip!

You’re absolutely right—node.add_fabnet() attaches the FabNetV4 service to the node, enabling communication with other nodes over FABRIC’s data plane network via the FabNetV4 interface.

In addition, all VMs provisioned in FABRIC are assigned a Management IP for administrative purposes. This interface allows inbound SSH access and supports outbound connections, including those required for operations like docker pull. However, please note that the management network is actively monitored and any torrent or insecure traffic may be flagged. Such activity can lead to enforcement actions, including possible slice termination. As a best practice, we recommend not using the management network for experimental traffic.

Best,

Komal
May 13, 2025 at 4:47 pm in reply to: Testing BitTorrent and IPFS #8493
Komal Thareja
Participant
Thank you for your inquiry Philip.

You are welcome to conduct experiments involving IPFS or BitTorrent on FABRIC, particularly for evaluating peer discovery and data transfer between FABRIC nodes. This type of testing is permissible as long as it is confined to FABnet or a custom Layer 2 network within the FABRIC infrastructure.

We kindly request that your experiment not initiate connections to external BitTorrent or IPFS servers outside the FABRIC environment.
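One way to keep an experiment inside FABRIC is to filter candidate peers by address before connecting. The sketch below assumes that FabNetV4 addresses fall within 10.128.0.0/10; please verify that range against your own slice, and extend the check if you use a custom Layer 2 subnet. The peer addresses are made up for the example.

```python
import ipaddress

# Assumption: FabNetV4 addresses are allocated from this range.
FABNETV4_SUBNET = ipaddress.ip_network("10.128.0.0/10")


def is_fabnet_peer(addr: str) -> bool:
    """Return True if `addr` is an IPv4 address inside the assumed FabNetV4 range."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip in FABNETV4_SUBNET


# Hypothetical peer list, e.g. candidates returned by peer discovery.
peers = ["10.130.1.2", "8.8.8.8", "10.200.0.1"]
internal = [p for p in peers if is_fabnet_peer(p)]
print(internal)
```

Only the peers in `internal` would then be handed to the BitTorrent or IPFS client as bootstrap or tracker addresses.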

Please feel free to reach out if you need any assistance with the experiment setup or have further questions.

Best regards,

Komal
May 12, 2025 at 9:48 am in reply to: FPGA valid sites for Esnet toolchain #8478
Komal Thareja
Participant
Hi Nishanth,

Please find enclosed the most recent known status. Kindly note that users have the ability to flash their own binaries, so the actual state of the infrastructure may differ from what is captured in the attached sheet. As a first step toward addressing this, we are working to include notebook and Control Framework support in Release 1.9, enabling users to flash FPGAs within their workflows directly.

Thanks,

Komal
May 2, 2025 at 1:56 pm in reply to: Slice showing as StableOK but is actually closed #8462
Komal Thareja
Participant
Hi Anthony,

Regarding your slice: a5d2fff2-84fc-48d9-8d67-5ff96e120273
Start: 2025-04-18 14:53:43 +0000
End: 2025-05-02 14:53:42 +0000

A renew operation was attempted for this slice, but it failed for the VM due to insufficient resources: ['core'].

Please note that we now support advance reservations, which allow users to reserve resources ahead of time. As a result, a renew request may fail if it conflicts with an existing advance reservation — which appears to be the case here.
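For reference, a renew request needs a new end time. The sketch below computes one from the current lease end using the same timestamp layout shown above. Passing the result to a call such as `slice.renew(new_end)` is an assumption about your fablib version, so please check it before relying on it, and wrap the call so that any error (such as insufficient resources) is visible.

```python
from datetime import datetime, timedelta

# Timestamp layout matching the Start/End values shown above.
LEASE_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def extended_end(current_end: str, days: int) -> str:
    """Return `current_end` pushed out by `days`, in the same timestamp layout."""
    end = datetime.strptime(current_end, LEASE_FORMAT)
    return (end + timedelta(days=days)).strftime(LEASE_FORMAT)


# End time taken from the slice above; the 14-day extension is illustrative.
new_end = extended_end("2025-05-02 14:53:42 +0000", 14)
print(new_end)
```

Renewing well before the lease end leaves time to react if the request is rejected.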

It’s unclear how the renew was initiated, but if it was done through JupyterHub, the error would have been reported to the user. We suspect there may be a bug on the portal side where this error is not being surfaced correctly, and we will investigate and address that.

Unfortunately, the only available option at this point is to re-create the slice. We apologize for the inconvenience.

Thanks,

Komal
May 2, 2025 at 1:47 pm in reply to: Tofino bf_switchd process gets killed. #8460
Komal Thareja
Participant
Hi Nishanth,

Thank you for sharing this.

Please note that the current implementation of execute_thread maintains the process only for the duration of the specified timeout. As you correctly observed, for longer-running processes, directly accessing the switch via SSH allows you to manually launch switchd.
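Until then, a common workaround for long-running processes is to detach them from the SSH session so they outlive it. The sketch below only builds such a shell command; the script name and log path are hypothetical, and whether switchd itself runs correctly without an attached terminal has not been verified here.

```python
import shlex


def detached_command(command: str, log_path: str) -> str:
    """Wrap `command` so it keeps running after the launching session ends.

    Output goes to `log_path`; stdin is detached so the shell can return at once.
    """
    return (
        f"nohup sh -c {shlex.quote(command)} "
        f"> {shlex.quote(log_path)} 2>&1 < /dev/null &"
    )


# Hypothetical long-running script and log file.
cmd = detached_command("./long_running_task.sh --verbose", "/tmp/task.log")
print(cmd)
```

The log file can then be inspected later with `tail -f` from a separate SSH session.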

We will work on enhancing execute_thread to better support this use case and will keep you informed once the update is available.

Thanks,

Komal
April 24, 2025 at 8:45 am in reply to: refrsh token issue inside jupyter notebook #8443
Komal Thareja
Participant
This error typically occurs due to an expired token. Please try the following steps:
Go to File → Hub Control Panel → Stop My Server, then select Start Server to generate a new token.

Thanks,

Komal

P.S: https://learn.fabric-testbed.net/knowledge-base/using-the-jupyter-hub/#frequently-asked-questions