Fraida Fund

Forum Replies Created

Viewing 15 posts - 1 through 15 (of 32 total)


Author

Posts
December 9, 2024 at 1:09 pm in reply to: Setting up Kubernetes cluster on FABRIC #7926
Fraida Fund
Participant
Hi, you can use this example: https://github.com/teaching-on-testbeds/k8s

I just tested it: the playbook failed on the first run but succeeded on a second attempt. When it succeeds, the recap shows failed=0 for every host, like this:
```
PLAY RECAP *********************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
node-0                     : ok=715  changed=35   unreachable=0    failed=0    skipped=1252 rescued=0    ignored=1   
node-1                     : ok=610  changed=26   unreachable=0    failed=0    skipped=1109 rescued=0    ignored=1   
node-2                     : ok=502  changed=20   unreachable=0    failed=0    skipped=779  rescued=0    ignored=1   
```
(I developed that example for teaching this material, if you want to see how it is used: https://github.com/teaching-on-testbeds/k8s-ml)
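To check the recap without reading it by eye, something like the following might help. It is only a sketch: the function names `parse_recap` and `hosts_needing_rerun` are made up for illustration, and it assumes the recap text has the same layout as the output shown above.

```python
import re

# Matches one host line of a PLAY RECAP, e.g.
# "node-0 : ok=715 changed=35 unreachable=0 failed=0 ..."
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(text):
    """Return {host: {counter: value}} for every host line in the recap text."""
    results = {}
    for line in text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (pair.split("=") for pair in match.group("counters").split())
            results[match.group("host")] = {name: int(value) for name, value in pairs}
    return results


def hosts_needing_rerun(text):
    """List hosts that report any failed or unreachable tasks (hypothetical helper)."""
    recap = parse_recap(text)
    return sorted(
        host
        for host, counters in recap.items()
        if counters.get("failed", 0) or counters.get("unreachable", 0)
    )


SAMPLE = """\
PLAY RECAP *********************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-0                     : ok=715  changed=35   unreachable=0    failed=0    skipped=1252 rescued=0    ignored=1
"""

print(hosts_needing_rerun(SAMPLE))  # an empty list means no host reported failures
```

If the list is not empty, running the playbook again is the first thing to try, since a second attempt was enough in the test described above.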
July 24, 2024 at 10:18 am in reply to: SSH Key authenticating error #7310
Fraida Fund
Participant
Re:

NOTE: Bastion host is only used as a jump box, we do not allow login to Bastion Nodes.

The knowledge base says we can test the bastion host login using:
```
ssh -i ~/.ssh/fabric_bastion -C2T -D 14000 -M -N username_0123456789@bastion.fabric-testbed.net
```
Is this no longer the current guidance?
December 7, 2023 at 4:20 pm in reply to: Cannot SSH to VMs on newy-w2.fabric-testbed.net #6176
Fraida Fund
Participant
They’re back up. Thanks!
September 18, 2023 at 12:11 pm in reply to: L2Bridge without MAC learning? #5332
Fraida Fund
Participant
Thanks for keeping me informed!
September 8, 2023 at 12:18 pm in reply to: Project member I did not specify is being added to my project #5256
Fraida Fund
Participant
The project did not have any members before.

I realized that the CSV file had an “Email” header at the top. It appears that the first existing FABRIC user with “email” in their email address was matched to this line. You can reproduce this with a file containing only the text
```
Email
```
in it. Similarly, if I use this CSV file:
```
nyu
```
it tries to add the first FABRIC user with “nyu” in their email address.

It seems to match on a partial string instead of the entire string, so I guess if there were two users, “xx@email.org” and “axx@email.org”, and I uploaded a CSV with “xx@email.org”, it might match “axx@email.org” instead of “xx@email.org”.
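To illustrate the difference, here is a minimal sketch of the two matching strategies. This is not the portal's actual code, which I have not seen: `match_partial` and `match_exact` are hypothetical functions, the user list is made up, and the case-insensitive comparison is an assumption based on the “Email” header matching an address containing “email”.

```python
def match_partial(query, emails):
    """Return the first email that contains the query (substring match)."""
    needle = query.strip().lower()
    for email in emails:
        if needle in email.lower():
            return email
    return None


def match_exact(query, emails):
    """Return the email that equals the query in full, or None."""
    needle = query.strip().lower()
    for email in emails:
        if email.lower() == needle:
            return email
    return None


# Hypothetical user list; "axx@email.org" happens to come first.
users = ["axx@email.org", "xx@email.org"]

print(match_partial("Email", users))         # axx@email.org (the header row matches a user)
print(match_partial("xx@email.org", users))  # axx@email.org (wrong user)
print(match_exact("xx@email.org", users))    # xx@email.org
print(match_exact("Email", users))           # None (the header row matches nobody)
```

With exact matching, a stray header row or a fragment like “nyu” would add no one instead of adding an unrelated user.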
August 30, 2023 at 11:53 am in reply to: “Expired refresh token” when starting a new JupyterHub server after timeout #5192
Fraida Fund
Participant
Perhaps the instructions at https://learn.fabric-testbed.net/knowledge-base/obtaining-and-using-fabric-api-tokens/#using-tokens-within-the-jupyter-hub can be updated. Currently, they say to generate a new token and upload it to JH when you get a “Refresh Token: (invalid grant)” error. But at least for this instance of that error, that doesn’t work (my students say that solution has not worked for them either when they encounter this error): whatever is not initialized properly fails even with a new token. It only works if the JH server is stopped and restarted from the Hub Control Panel.
August 30, 2023 at 11:24 am in reply to: “Expired refresh token” when starting a new JupyterHub server after timeout #5187
Fraida Fund
Participant
Following up on this to share more info –

In Step 4 above, the first JH server I start after the timeout does have a new refresh token. The contents of .tokens.json show that the token is created when I start the JH server:
```
{
    "refresh_token": "XXX",
    "created_at": "2023-08-30 15:13:26"
}
```
but when I try to use fablib I get that token error, and no ID token.

After I stop the JH server from the Hub Control Panel and start it again, it gets another new refresh token. .tokens.json has:
```
{
    "refresh_token": "XXX",
    "created_at": "2023-08-30 15:16:47"
}
```
and this one works. When attempting to use fablib, I get an ID token and no error.

It is not clear why the first refresh token does not work, even though it is new.
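For anyone comparing token files while debugging this, a small sketch like the one below reads the `created_at` field from two copies of the file contents and reports how far apart they are. The helper names are hypothetical, and it assumes the file always has the two fields and the timestamp format shown above.

```python
import json
from datetime import datetime

# Timestamp layout as it appears in the .tokens.json contents shown above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def token_created_at(tokens_json_text):
    """Parse the created_at field without touching the token value itself."""
    record = json.loads(tokens_json_text)
    return datetime.strptime(record["created_at"], TIMESTAMP_FORMAT)


def seconds_between(first_json_text, second_json_text):
    """Seconds elapsed between the creation of two refresh tokens."""
    delta = token_created_at(second_json_text) - token_created_at(first_json_text)
    return delta.total_seconds()


first = '{"refresh_token": "XXX", "created_at": "2023-08-30 15:13:26"}'
second = '{"refresh_token": "XXX", "created_at": "2023-08-30 15:16:47"}'

print(seconds_between(first, second))  # 201.0
```

Here the two tokens were created 201 seconds apart, so both are fresh; the first one is not failing because of its age.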
August 29, 2023 at 10:41 am in reply to: “Expired refresh token” when starting a new JupyterHub server after timeout #5174
Fraida Fund
Participant
Thanks, the part that I consider a “bug” is that when I log in again and start a new server in Step 4, it does not get a new “good” token. Is that behavior expected?
August 24, 2023 at 10:38 am in reply to: L2Bridge without MAC learning? #5115
Fraida Fund
Participant
Thanks, I appreciate the update!
June 6, 2023 at 9:20 pm in reply to: Fail to launch “default” JupyterHub server for brand new account #4487
Fraida Fund
Participant
(Related question: is that jupyter_startup.py anywhere in the fabric-testbed GitHub? I thought it should be in https://github.com/fabric-testbed/jupyternb-setup, but that hasn’t been updated in a while, and neither branch matches what’s currently on the “default” server.)
April 21, 2023 at 3:59 pm in reply to: Adding large number of members to a project #4138
Fraida Fund
Participant
Thanks for following up! I look forward to trying this feature.
March 31, 2023 at 11:53 am in reply to: L2Bridge without MAC learning? #4011
Fraida Fund
Participant
Hi! I wanted to follow up on this because this functionality is used in educational materials. I am working to transition those materials ahead of the imminent retirement of InstaGENI, and I need to decide which platform to move them to.

Is this issue expected to be fixable? If yes, is there a rough timeline? (Is it likely to be fixed before InstaGENI is retired?)
March 14, 2023 at 5:32 pm in reply to: Bandwidth on FABRIC links #3962
Fraida Fund
Participant
Thanks. Did I get this right –
- A dedicated ConnectX-6/5 has its full bandwidth within a site (even in a hypothetical situation where the site has high utilization)
- A dedicated ConnectX-6/5 is currently best effort between sites, but eventually we’ll be able to reserve bandwidth on these links between sites.
- Basic NICs have (and only ever will have) best effort, with a 780 Mbps minimum in the hypothetical where the site has high utilization.
March 14, 2023 at 4:39 pm in reply to: Bandwidth on FABRIC links #3960
Fraida Fund
Participant
Thanks! Could you clarify this point –

Basic NICs: The existing Basic NICs are implemented as SR-IOV virtual functions on a 100Gbps ConnectX-6. The only limitation is that the bandwidth is shared with the other Basic NICs on that port.

Does this mean that 100 Gbps is divided among all of the Basic NICs on that port, and that the port may be shared by Basic NICs in my slice and also in other users’ slices? Hypothetically, if all 128 SR-IOV VFs on the port are used, then the bandwidth could max out at ~780 Mbps? (And I don’t have any visibility into how many SR-IOV VFs are on the port.)
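The arithmetic behind the ~780 Mbps figure, as a quick sketch. It assumes the port bandwidth is split evenly among active VFs, which is a simplification (real sharing depends on who is sending at the time), and `per_vf_share_mbps` is just an illustrative name.

```python
def per_vf_share_mbps(port_gbps, active_vfs):
    """Even share of port bandwidth per VF, in Mbps (assumes a perfectly fair split)."""
    return port_gbps * 1000 / active_vfs


# Hypothetical worst case: all 128 SR-IOV VFs on a 100 Gbps port are busy at once.
print(per_vf_share_mbps(100, 128))  # 781.25

# With fewer active VFs, each one's share is larger.
print(per_vf_share_mbps(100, 8))    # 12500.0
```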
February 9, 2023 at 1:00 pm in reply to: Updating the Default VM Images #3829
Fraida Fund
Participant
As an experimenter, I would prefer if images were not updated so that I could develop experiments against a hosted image, and I wouldn’t have to keep updating experiments to reflect the latest software versions.

(Keeping the images stable gives me two choices: I could update to latest software versions, or I could choose not to. Updating the images means I have no choice.)

This is especially a concern for e.g. education experiments, where we may also record video materials to go along with each experiment, and it’s very time intensive to prepare. I have a strong interest in those experiments staying stable.

Maybe there could be one hosted image for each OS that is updated (e.g. “default_ubuntu_latest” is kept updated, with an announcement so we know when it is updated), but also some stable images (e.g. “default_ubuntu_20” is not updated except for security updates).