Forum Replies Created
-
AuthorPosts
-
Hi, you can use this example: https://github.com/teaching-on-testbeds/k8s
I just tested it. The playbook failed on the first run but succeeded on a second attempt. If it is successful, you should see failed=0 for every host in the recap, like this:
PLAY RECAP *********************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
node-0 : ok=715 changed=35 unreachable=0 failed=0 skipped=1252 rescued=0 ignored=1
node-1 : ok=610 changed=26 unreachable=0 failed=0 skipped=1109 rescued=0 ignored=1
node-2 : ok=502 changed=20 unreachable=0 failed=0 skipped=779 rescued=0 ignored=1
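If you would rather check the recap with a script than by eye, a rough sketch like the one below might help. It assumes you have saved the playbook output as text. The names `recap_counters` and `failed_hosts` are my own and are not part of Ansible. I also treat unreachable hosts as failures, which is my choice.

```python
import re

# One recap entry looks like "node-0 : ok=715 changed=35 ... failed=0 ...".
# The pattern captures the host name and the run of counter=value pairs after it.
HOST_ENTRY = re.compile(r"(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)")

def recap_counters(text):
    """Return {host: {counter_name: value}} for every recap entry found in text."""
    result = {}
    for match in HOST_ENTRY.finditer(text):
        pairs = (item.split("=") for item in match.group("counters").split())
        result[match.group("host")] = {name: int(value) for name, value in pairs}
    return result

def failed_hosts(text):
    """Return the hosts whose recap shows a nonzero failed or unreachable count."""
    return [
        host
        for host, counters in recap_counters(text).items()
        if counters.get("failed", 0) or counters.get("unreachable", 0)
    ]
```

An empty list from `failed_hosts(...)` corresponds to the "zero failed" case described above.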
(I developed that example for teaching this material: https://github.com/teaching-on-testbeds/k8s-ml, in case you want to see how it is used.)
Re:
NOTE: Bastion host is only used as a jump box, we do not allow login to Bastion Nodes.
The knowledge base says we can test the bastion host login using
ssh -i ~/.ssh/fabric_bastion -C2T -D 14000 -M -N username_0123456789@bastion.fabric-testbed.net
Is this no longer current guidance?
They’re back up. Thanks!
Thanks for keeping me informed!
September 8, 2023 at 12:18 pm in reply to: Project member I did not specify is being added to my project #5256
The project did not have any members before.
I realized that the CSV file had an “Email” header at the top. It appears that the first existing FABRIC user with “email” in their email address was matched to this line. You can reproduce this with a file that contains just the text
Email
in it. Similarly, if I use this CSV file:
nyu
it tries to add the first FABRIC user with “nyu” in their email address.
It seems to be matching on partial string instead of the entire string, so I guess if there were two users “xx@email.org” and “axx@email.org”, and I upload a CSV with “xx@email.org”, it might match to “axx@email.org” instead of “xx@email.org”.
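To illustrate what I think is happening, here is a small sketch. I do not know how the portal actually implements the lookup. The user list and both function names are made up, and the code only models the behavior I observed (first user whose address contains the CSV line) next to the exact match I expected.

```python
# Hypothetical user list, in the order the lookup would see it.
users = ["axx@email.org", "xx@email.org", "student@nyu.edu"]

def first_partial_match(query, emails):
    """Model of the observed behavior: first address that contains the query."""
    needle = query.strip().lower()
    return next((e for e in emails if needle in e.lower()), None)

def exact_match(query, emails):
    """Model of the expected behavior: only an identical address matches."""
    needle = query.strip().lower()
    return next((e for e in emails if e.lower() == needle), None)

# A header line such as "Email" matches a real user under partial matching,
# but matches nobody under exact matching.
header_partial = first_partial_match("Email", users)  # "axx@email.org"
header_exact = exact_match("Email", users)            # None

# The ambiguous case: "xx@email.org" is contained in "axx@email.org".
wrong_user = first_partial_match("xx@email.org", users)  # "axx@email.org"
right_user = exact_match("xx@email.org", users)          # "xx@email.org"
```

With exact matching, the “Email” header line and the “nyu” line would simply match nobody instead of adding an unrelated user.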
August 30, 2023 at 11:53 am in reply to: “Expired refresh token” when starting a new JupyterHub server after timeout #5192
Perhaps the instructions at https://learn.fabric-testbed.net/knowledge-base/obtaining-and-using-fabric-api-tokens/#using-tokens-within-the-jupyter-hub can be updated. Currently, it says to generate a new token and upload to JH when you get a “Refresh Token: (invalid grant)” error. But at least for this instance of that error it doesn’t work (and, my students say that solution also has not worked for them when they encounter this error) – whatever is not initialized properly fails even with a new token. It only works if the JH is stopped and restarted from the Hub Control Panel.
August 30, 2023 at 11:24 am in reply to: “Expired refresh token” when starting a new JupyterHub server after timeout #5187
Following up on this to share more info:
In Step 4 above, the first JH server I start after the timeout does have a new refresh token. The contents of
.tokens.json
show the token is created when I start the JH server:
{ "refresh_token": "XXX", "created_at": "2023-08-30 15:13:26" }
but when I try to use fablib I get that token error, and no ID token.
After stopping the JH server from the Hub Control Panel and starting it again, it gets another new refresh token. Now
.tokens.json
has:
{ "refresh_token": "XXX", "created_at": "2023-08-30 15:16:47" }
and this one works. When attempting to use fablib, I get an ID token and no error.
Not clear why the first refresh token does not work, even though it is new.
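For anyone who wants to confirm when their own token was issued, here is a minimal sketch. It assumes the file has the same two fields shown above and that `created_at` uses the `YYYY-MM-DD HH:MM:SS` format. The function name `token_created_at` is mine and is not part of fablib.

```python
import json
from datetime import datetime

def token_created_at(tokens_json_text):
    """Parse the created_at field from the text of a .tokens.json file."""
    data = json.loads(tokens_json_text)
    return datetime.strptime(data["created_at"], "%Y-%m-%d %H:%M:%S")

# The two files quoted above, with the token value redacted as in the post.
first = token_created_at('{ "refresh_token": "XXX", "created_at": "2023-08-30 15:13:26" }')
second = token_created_at('{ "refresh_token": "XXX", "created_at": "2023-08-30 15:16:47" }')

# Time between the token that failed and the token that worked.
gap_seconds = (second - first).total_seconds()
```

To use it on a real server, read the file first, for example with `pathlib.Path(path_to_tokens_file).read_text()`, where `path_to_tokens_file` is wherever `.tokens.json` lives in your environment. In my case the two tokens were created only a few minutes apart, so the age of the token does not explain the failure.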
August 29, 2023 at 10:41 am in reply to: “Expired refresh token” when starting a new JupyterHub server after timeout #5174
Thanks, the part that I consider a “bug” is that when I log in again and start a new server in Step 4, it does not get a new “good” token. Is that behavior expected?
Thanks, I appreciate the update!
June 6, 2023 at 9:20 pm in reply to: Fail to launch “default” JupyterHub server for brand new account #4487
(Related question: is that jupyter_startup.py anywhere in the fabric-testbed GitHub? I thought it should be https://github.com/fabric-testbed/jupyternb-setup but that hasn’t been updated in a while, and neither branch matches what’s currently on the “default” server.)
Thanks for following up! I look forward to trying this feature.
Hi! I wanted to follow up on this. This functionality is used in educational materials that I am working to transition ahead of the imminent retirement of InstaGENI, and I need to decide which platform to transition them to.
Is this issue expected to be fixable? If yes, is there a rough timeline? (Is it likely to be fixed before InstaGENI is retired?)
Thanks. Did I get this right –
- A dedicated ConnectX-6/5 has its full bandwidth within a site (even in a hypothetical situation where the site has high utilization)
- A dedicated ConnectX-6/5 is currently best effort between sites, but eventually we’ll be able to reserve bandwidth on these links between sites.
- Basic NICs have (and only ever will have) best effort, with a 780 Mbps minimum in the hypothetical where the site has high utilization.
Thanks! Could you clarify this point –
Basic NICs: The existing Basic NICs are implemented as SR-IOV virtual functions on a 100Gbps ConnectX-6. The only limitation is that the bandwidth is shared with the other Basic NICs on that port.
Does this mean that the 100 Gbps is divided among all of the Basic NICs on that port, and that the port may be shared by Basic NICs in my slice and also in other users’ slices? Hypothetically, if all 128 SR-IOV VFs on the port are in use, could the bandwidth available to each Basic NIC be as low as ~780 Mbps? (And I don’t have any visibility into how many SR-IOV VFs are on the port.)
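For reference, this is the arithmetic behind my ~780 Mbps figure. It assumes the port's bandwidth is split evenly across the VFs that are active, which is my simplification and not something I know about how the NIC actually schedules traffic. The function name is just for illustration.

```python
def per_vf_share_mbps(port_gbps=100, active_vfs=128):
    """Even split of a port's bandwidth across active VFs, in Mbps."""
    return port_gbps * 1000 / active_vfs

worst_case = per_vf_share_mbps()                  # all 128 VFs busy: 781.25 Mbps
lightly_used = per_vf_share_mbps(active_vfs=4)    # only 4 VFs busy: 25000.0 Mbps
```

So the ~780 Mbps number only applies when every VF on the port is in use and sending at once. With fewer active Basic NICs the share would be much larger.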
As an experimenter, I would prefer if images were not updated so that I could develop experiments against a hosted image, and I wouldn’t have to keep updating experiments to reflect the latest software versions.
(Keeping the images stable gives me two choices: I could update to latest software versions, or I could choose not to. Updating the images means I have no choice.)
This is especially a concern for e.g. education experiments, where we may also record video materials to go along with each experiment, and it’s very time intensive to prepare. I have a strong interest in those experiments staying stable.
Maybe there could be one hosted image for each OS that is updated (e.g. “default_ubuntu_latest” is kept updated, with an announcement so we know when it is updated), alongside some stable images (e.g. “default_ubuntu_20” is not updated except for security updates).