Ilya Baldin

Forum Replies Created

Viewing 15 posts - 166 through 180 (of 285 total)

Author

Posts
April 8, 2023 at 12:36 am in reply to: Local NAS storage/VM #4062
Ilya Baldin
Participant
Praveen,

NAS and persistent storage are the same thing. The portal expects the volume name to match the name of the volume that was created for you. Provisioning fails because you do not have volumes ‘s1’ or ‘s2’ on the site you are using.
April 6, 2023 at 10:30 am in reply to: STAR and MAX unavailable [RESOLVED (we think)] #4055
Ilya Baldin
Participant
STAR and MAX are available again. They are under watch to see if the vendor bug shows up again, but experimenters should feel free to use them and report any problems they see.
April 6, 2023 at 10:23 am in reply to: Local NAS storage/VM #4054
Ilya Baldin
Participant
Praveen,

The NAS is the persistent storage. There is no other option currently available. If you need persistent storage at more sites, please request it.
April 5, 2023 at 9:37 am in reply to: Local NAS storage/VM #4045
Ilya Baldin
Participant
Praveen,

Persistent storage does work and is in use by a number of users. We will check what is going on with the portal provisioning. Just be sure to use the correct sites and volume names: we allocate persistent storage ahead of time on the specific sites you request via the ticket system. If you try to use a volume on a site where it hasn’t been allocated, the provisioning will fail.
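
As a rough illustration of the rule above, here is a minimal Python sketch of a pre-submission check. It is not part of the portal or of fablib: the `allocated` table, the volume name `myvolume` and the helper `volume_problem` are hypothetical, and you would fill the table in by hand from what was allocated through your ticket.

```python
from typing import Dict, Optional, Set


def volume_problem(site: str, volume: str,
                   allocated: Dict[str, Set[str]]) -> Optional[str]:
    """Return a description of the problem, or None if the request looks fine.

    `allocated` maps a site name to the volume names allocated there
    (hypothetical, hand-maintained; not read from FABRIC itself).
    """
    if site not in allocated:
        return f"no persistent storage has been allocated on site {site}"
    if volume not in allocated[site]:
        known = ", ".join(sorted(allocated[site]))
        return f"volume '{volume}' is not allocated on {site} (allocated: {known})"
    return None


# Hypothetical example: one volume named 'myvolume' allocated on STAR only.
allocated = {"STAR": {"myvolume"}}

for site, volume in [("STAR", "myvolume"), ("STAR", "s1"), ("MAX", "myvolume")]:
    problem = volume_problem(site, volume, allocated)
    print(site, volume, "->", problem or "ok")
```

Only the first request passes the check. The other two are the cases where provisioning would be expected to fail: a wrong volume name, and a site with no allocation.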
March 10, 2023 at 5:04 pm in reply to: Fabric Port Mirroring Service #3954
Ilya Baldin
Participant
Sean,

You are correct that your project needs the Net.PortMirror tag in order to access this service (the project owner needs to request it through the portal).

In general we need to understand specifically what your use case is. The PortMirror service is obviously quite powerful in that it allows you to mirror traffic on any port into another port. Only the port you are mirroring *to* has to be in your slice (it is expected to be a dedicated 10/25G or 100G port); the port you are mirroring from can be any port on the switch within a given site. Before you start mirroring traffic belonging to others we need to understand the purpose and the scope (and also have you test port mirroring on your own slices first).

The port mirroring service is not yet well integrated into fablib. It is available as a lower-level library call like so (presumably myinterface is the interface of a dedicated card):
```
myinterface.get_slice().get_fim_topology().add_port_mirror_service(name=name, from_interface_name=port_name, to_interface=myinterface.get_fim_interface())
```
We really do need to understand your use case before we proceed, though, to make sure you have the right tools.
March 7, 2023 at 11:34 am in reply to: Obtain the NICs’ CRC specification #3935
Ilya Baldin
Participant
My best suggestion is to check the output of the lspci command once you have a slice with the card, to get the hardware and firmware versions, and then to look through the Mellanox documentation on their website.
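
A minimal Python sketch of that first step, assuming it runs inside the VM that has the card attached. The helper names `read_lspci` and `mellanox_lines` are made up for this example, and the sample text is only an illustration of what lspci output can look like, not output captured from a FABRIC node.

```python
import subprocess
from typing import List


def read_lspci() -> str:
    """Run lspci and return its text output (requires pciutils in the VM)."""
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=True)
    return result.stdout


def mellanox_lines(lspci_output: str) -> List[str]:
    """Keep only the lspci lines that mention Mellanox devices."""
    return [line.strip() for line in lspci_output.splitlines()
            if "mellanox" in line.lower()]


# Illustrative sample text; on a real VM use mellanox_lines(read_lspci()).
sample = (
    "00:03.0 Ethernet controller: Red Hat, Inc. Virtio network device\n"
    "00:07.0 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]\n"
)
for line in mellanox_lines(sample):
    print(line)
```

The model name in the matching line is what to search for in the vendor documentation. If plain lspci does not show the firmware version, `ethtool -i <interface>` normally reports it in its `firmware-version` field.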
February 22, 2023 at 1:23 pm in reply to: ALERT : SRI/RUTG Site in Acceptance Testing – AVOID #3879
Ilya Baldin
Participant
These sites are in maintenance mode and should not be usable, i.e. they produce errors when anyone not authorized to perform acceptance testing tries to use them. We are adding a feature to fablib that will automatically avoid sites in maintenance.
February 21, 2023 at 11:51 am in reply to: Unable to allocate resources after the updates/maintenance. #3875
Ilya Baldin
Participant
Praveen (and the team), just to close the loop and post a version of my private reply:

Individual FABRIC sites are not as large as CloudLab. They typically have between 3 and 6 worker nodes. Each worker has 64 cores. If you ask for VMs of more than 32 cores, at most one VM can be accommodated by a worker node. For your storage requirements I suspect you should rely on persistent storage in some cases: not every worker’s internal storage is the same, so some combinations of core/RAM/disk are possible only on some workers, not all. We can create multiple persistent volumes for you on each site if required.
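
The core arithmetic above can be sketched in a few lines of Python. This is a simplification that counts cores only: it assumes all 64 cores of a worker are available to VMs and ignores RAM, disk and any cores reserved for the host. The function names are made up for this example.

```python
def max_vms_per_worker(vm_cores: int, worker_cores: int = 64) -> int:
    """How many VMs of `vm_cores` cores fit on one worker, by cores alone."""
    if vm_cores <= 0:
        raise ValueError("vm_cores must be positive")
    return worker_cores // vm_cores


def max_vms_per_site(vm_cores: int, workers: int, worker_cores: int = 64) -> int:
    """Upper bound for a whole site with `workers` identical worker nodes."""
    return workers * max_vms_per_worker(vm_cores, worker_cores)


for cores in (16, 32, 33, 64):
    print(f"{cores}-core VMs: {max_vms_per_worker(cores)} per worker, "
          f"{max_vms_per_site(cores, workers=3)} on a 3-worker site")
```

A 32-core VM still fits twice on a worker, but at 33 cores the count drops to one per worker, so a 3-worker site holds at most three such VMs.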

Another alternative is to use a combination of resources from FABRIC and other testbeds. Chameleon@Chicago is already reachable and we will be shortly adding access to Chameleon@TACC (a much larger installation) as well as CloudLab@Utah, Wisconsin and Clemson locations.
February 9, 2023 at 1:09 pm in reply to: Updating the Default VM Images #3830
Ilya Baldin
Participant
Yes! The challenge of updating images is that we should not remove or significantly change the images under existing labels, so some form of versioning is necessary with a history of versions going back for some predetermined period of time. This way if you created an experiment with image ubuntu_20_ver_1.0, that image is immutable for the duration of its lifetime (with the exception of mandatory security updates, which must be applied to preserve facility security).

This is exactly why we have so far not rolled out this feature as it requires some thinking and careful deployment.
February 9, 2023 at 11:06 am in reply to: Updating the Default VM Images #3824
Ilya Baldin
Participant
Brandon,

This is an excellent point. We are discussing within the team both the question of keeping the existing images updated and allowing experimenters to provide their own images. There are of course many pitfalls with the latter, as we test the images to make sure they boot properly and remote debugging of boot issues is difficult. That said we have this in our sights.

At the very least we plan to get on a regular cadence with updating the images we host (we’ve just been too busy to do it) and potentially we will start allowing experimenters to supply their own images as well.
January 31, 2023 at 6:07 pm in reply to: File save error and Load file error #3766
Ilya Baldin
Participant
Just by way of explanation: we host the Jupyter Hub in Google Cloud, which costs real $$s allocated to us from NSF via a project named CloudBank. We are still evaluating the true costs of running it in its current configuration (so we can more accurately project future costs). We may revise the amount of disk space and other resources each notebook server gets. However, we are constrained by the budget, and this is not a decision we will be making in the near term.

In general the Hub is not intended as a place to park or transfer large files.
January 30, 2023 at 2:17 pm in reply to: L2Bridge without MAC learning? #3701
Ilya Baldin
Participant
We will open an internal ticket about it. The VFs are created on the worker node at boot and then given out by the Control Framework to the virtual machines and we need to check what options are set on them at creation time (typically they cannot be changed once created).

@yoursunny may be right and it may or may not be possible for us to change this behavior – we will report here once we know more. Thank you all for your feedback.
January 30, 2023 at 2:14 pm in reply to: Broken get_ssh_command #3698
Ilya Baldin
Participant
Fraida,

We do basically two types of changes:

1. Underlying control framework changes (which generally bring forward new features, but they aren’t available to experimenters until the second change type happens), which are installed on our infrastructure and may affect the look/feel of the portal.

2. FABlib changes to make CF features from above available to users – they generally get installed into a new version of a notebook container image. They affect how the notebooks are run (although we try to keep the changes backward compatible as much as we can).

The change we did last week was of Type 1 as it were, and thus wasn’t going to impact anything you were already doing. The problem you saw is likely a coincidence with another change of FABlib (type 2) that happened earlier. In the coming weeks Paul will be bringing updated version(s) of FABlib that support the features of CF 1.4 and there will be separate announcements about it.
January 18, 2023 at 10:34 am in reply to: Is it possible to compile p4 program and do experiments with programmable switch #3618
Ilya Baldin
Participant
We are working on it. We are port-constrained on our dataplane switches in a number of desirable locations. Once we are able to resolve those constraints we should be able to ship the switches out and add the necessary code support in the control framework to enable working with them.

I’m assuming you or your professor have signed the SLACA with Intel and have access to their compiler tools. This is not something we will be providing. We will be providing P4 switches with a runtime that allows you to load the bytecode remotely, but compiling the code using Intel-licensed tools will be the user’s responsibility.
January 17, 2023 at 10:03 pm in reply to: Is it possible to compile p4 program and do experiments with programmable switch #3616
Ilya Baldin
Participant
Also, I moved this topic to the General Questions and Discussion forum.