Forum Replies Created
-
AuthorPosts
-
What is the exception at the bottom of the stack trace? Can you post the whole of the pink box?
A very common exception is something like: “invalid token”. If that is the case, you likely need to restart the notebook’s kernel. At the top of the open notebook, click the circular arrow button. Then click “restart” in the popup. Then run the notebook from the beginning.
I’m wondering if your bastion key expired. They expire after 6 months.
Can you confirm that your bastion key still works? There are directions at the bottom of this article: https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/
@Manas I would strongly encourage you to avoid the native API and, instead, use FABlib. FABlib does not require Jupyter Notebooks and can be used on its own.
You should be able to install it with pip using the following command: python3 -m pip install fabrictestbed-extensions
You will need Python 3.9.
Adam’s config for Anaconda likely works too.
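If it helps, here is a minimal sketch for checking the install from plain Python, outside Jupyter. It only uses the standard library. Treating 3.9 as a minimum version (rather than the exact required version) is my assumption, and the function name is just for illustration.

```python
import sys
from importlib import metadata


def fablib_install_report(package="fabrictestbed-extensions", min_python=(3, 9)):
    """Report the interpreter version and whether the pip package is installed.

    Assumption: Python 3.9 is treated as a minimum, not an exact requirement.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None  # pip has not installed the package in this environment
    return {
        "python": sys.version_info[:3],
        "python_ok": sys.version_info[:2] >= min_python,
        "package_version": installed,
    }


if __name__ == "__main__":
    print(fablib_install_report())
```

If package_version comes back as None, the pip command above was probably run against a different Python environment than the one you are using.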
- This reply was modified 2 years, 2 months ago by Paul Ruth.
Are you able to ssh to the node on the command line, or run node.execute() with a simple command?
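Something like the sketch below is what I mean by a simple command. It assumes node is a FABlib node object whose execute() returns a (stdout, stderr) pair; the helper name and the slice and node names in the comments are hypothetical.

```python
def node_responds(node, command="hostname"):
    """Run a trivial command through node.execute() and report the outcome.

    Assumption: execute() returns a (stdout, stderr) tuple.
    Returns (True, stdout) on success or (False, error description) on failure.
    """
    try:
        stdout, stderr = node.execute(command)
    except Exception as exc:  # e.g. ssh or bastion failures surface here
        return False, f"execute() raised {type(exc).__name__}: {exc}"
    return True, stdout.strip()


# Usage against a real slice (needs FABRIC credentials, names are hypothetical):
# from fabrictestbed_extensions.fablib.fablib import FablibManager
# fablib = FablibManager()
# node = fablib.get_slice(name="MySlice").get_node(name="Node1")
# print(node_responds(node))
```

If this fails but ssh from the command line works, the problem is more likely in the notebook environment than on the node.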
Are you intentionally sending packets with bad checksums as part of your experiment or are they being corrupted by something FABRIC is doing?
Either way is fine, I’m just trying to figure out what we need to look at.
I’m not really sure what is happening. I suspect our underlying switches might be filtering the bad packets. I’ll have someone who knows more about those switches look at this forum thread and maybe respond.
Yes, this is the solution. There are some low-level capabilities that are not possible with NIC_Basic. Try the dedicated NICs.
The original is a docker image built with the docker file found here: https://github.com/fabric-testbed/fabric-docker-images/tree/master/slice-vm-p4-bmv2/0.0.1
I think the deployed version is one I updated for a demo but probably only has small changes. You can probably copy the Dockerfile from that git repo and customize it to your needs. If you put together something useful and want to share it, we can add it as an example.
Let me know if this works.
Paul
- This reply was modified 2 years, 2 months ago by Paul Ruth.
Did you figure this out?
Thanks for letting us know about this. We will try to get it working correctly.
In general, I would suggest not relying on the examples directly. Instead, use them as starting points for building your own experiments. I suspect that any interesting experiment will need some customization, and relying directly on our examples will cause problems if/when we update them.
You might just need to restart the notebook kernel to get this working. Restarting the kernel clears out any old variables from the Python session. Are you doing this?
@Adam I think you need to set the FABRIC_PROJECT_ID env var too. You can get the value from the projects page in the FABRIC portal.
This was new in a recent FABRIC update so it might not be in the docs you are working from. Which docs are you following? I would like to update them.
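A quick way to confirm whether the variable is visible to Python is sketched below. It only checks FABRIC_PROJECT_ID, since that is the one in question here; the function name is illustrative, and it does not validate that the value is a real project ID.

```python
import os


def project_id_status(env=None):
    """Describe whether FABRIC_PROJECT_ID is present in the environment."""
    env = os.environ if env is None else env
    value = env.get("FABRIC_PROJECT_ID", "").strip()
    if not value:
        return ("FABRIC_PROJECT_ID is not set; copy the project ID "
                "from the projects page in the FABRIC portal.")
    return f"FABRIC_PROJECT_ID is set ({len(value)} characters)."


if __name__ == "__main__":
    print(project_id_status())
```

Run it in the same shell or kernel that runs your experiment, since a variable exported elsewhere will not be visible.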
August 19, 2022 at 2:30 pm in reply to: Pip fabrictestbed-extensions 1.2.4 – “Exception ignored in” #2657
There is some threading functionality available in FABlib. I think this error is just a missing dependency for the thread pool. I suspect we need to add a pip requirement.
thanks,
Paul
- This reply was modified 2 years, 3 months ago by Paul Ruth.
I’m not sure about how to use GRO. There is nothing in particular we are doing to prevent you from using it. You have full access to the PCI NIC and full control over the operating system. It might just be that you need to install more tools or update a kernel. I played with it a bit but didn’t get any further than you. Have you made GRO work on machines outside of FABRIC?
UDP can be tricky. Have you tried the suggestions here? https://fasterdata.es.net/host-tuning/linux/udp-tuning
August 12, 2022 at 12:51 pm in reply to: L2 network between sites often have nodes that cannot reach one another #2638
Is this slice still up? I’m seeing if someone can look at it.
@Chengyi — We do not currently have usage info about the network links. We have a monitoring framework that is in the process of being built but is not quite ready yet. Also, currently WAN bandwidth is shared and best-effort. Eventually you will be able to reserve bandwidth and it will be easy to know your target max bandwidth for these tests.
I’m glad to see you are getting better bandwidths!
100G is quite difficult to achieve, especially across wide-area links. 56G
What size are the VMs that you are using? You probably need the biggest one available: 64 cores, 385G RAM.
Look at your dropped packets. Getting 100G probably requires zero dropped packets. I suspect the connection is not that clean yet. Also, 100G will be impossible if other users are using any bandwidth at all.
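One way to look at drops from inside the VM is to read the per-interface counters in /proc/net/dev before and after a test run. The sketch below parses that file using the standard Linux column layout. It is an assumption on my part that these kernel counters are the relevant ones for your setup; the NIC may keep additional hardware counters that do not show up here.

```python
def parse_drops(proc_net_dev_text):
    """Extract rx/tx drop counters per interface from /proc/net/dev text."""
    drops = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        # Receive columns come first: bytes packets errs drop ...
        # Transmit columns start at index 8: bytes packets errs drop ...
        drops[name.strip()] = {
            "rx_drop": int(fields[3]),
            "tx_drop": int(fields[11]),
        }
    return drops


if __name__ == "__main__":
    with open("/proc/net/dev") as handle:
        for iface, counts in parse_drops(handle.read()).items():
            print(iface, counts)
```

Compare the numbers on both the sender and the receiver; any increase during a run means the path was not clean for that test.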
Also, I suspect we need to set CPU affinity to ensure the streams run in the same NUMA domain as the 100G NIC. We haven’t looked into that yet.
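For anyone who wants to experiment with this, here is an untested sketch of the general Linux approach, not something we have validated on FABRIC. It reads the NIC's NUMA node from sysfs, looks up the CPUs on that node, and pins the current process to them. The function names are illustrative, and the interface name in the usage comment is hypothetical. Inside a VM the kernel may report -1 for the NUMA node, in which case nothing is pinned.

```python
import os
from pathlib import Path


def parse_cpulist(text):
    """Turn a sysfs cpulist such as '0-3,8-11' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def nic_local_cpus(iface, sysfs=Path("/sys")):
    """Return the CPUs on the NIC's NUMA node, or None if no locality is reported."""
    node = int((sysfs / "class/net" / iface / "device/numa_node").read_text())
    if node < 0:
        return None  # kernel reports no NUMA affinity for this device
    cpulist = sysfs / "devices/system/node" / f"node{node}" / "cpulist"
    return parse_cpulist(cpulist.read_text())


def pin_to_nic(iface):
    """Pin the calling process to the CPUs local to the given interface (Linux only)."""
    cpus = nic_local_cpus(iface)
    if cpus:
        os.sched_setaffinity(0, cpus)
    return cpus


# Usage (interface name is hypothetical): pin_to_nic("enp7s0")
```

Child processes started afterwards inherit the affinity, so a test tool launched from the same script would stay on those CPUs.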