Paul Ruth

Forum Replies Created

Viewing 15 posts - 46 through 60 (of 271 total)
  • Author
    Posts
  • in reply to: Layer 3 IPv6 connection with error #3772
    Paul Ruth
    Keymaster

      There is an image called “default_ubuntu_22” that you can use.

      However, I tried both Ubuntu images and something in their IPv6 configuration isn’t working quite right. It looks like, on Ubuntu, the interfaces are not being brought ‘up’ correctly for IPv6.

      It will work if you add the following line to each of the “Configure NodeX” cells of the JupyterExample:

      stdout, stderr = node1.execute(f'sudo ip -6 link set dev {node1_iface.get_os_interface()} up')
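      If you have several nodes, the same line can be applied in a loop. This is only a minimal sketch: the helper names `ipv6_link_up_command` and `bring_up_ipv6` are hypothetical, and it assumes you have already collected your own (node, interface) pairs from the slice. Only `execute()` and `get_os_interface()` are taken from the line above.

```python
def ipv6_link_up_command(os_iface: str) -> str:
    """Build the same shell command as above for one OS interface name."""
    return f"sudo ip -6 link set dev {os_iface} up"


def bring_up_ipv6(nodes_and_ifaces):
    """Run the command on each (node, iface) pair.

    Assumes each node has execute(cmd) and each iface has
    get_os_interface(), as in the notebook line above.
    Returns whatever execute() returned for each pair, in order.
    """
    results = []
    for node, iface in nodes_and_ifaces:
        cmd = ipv6_link_up_command(iface.get_os_interface())
        results.append(node.execute(cmd))
    return results
```

      For example, `bring_up_ipv6([(node1, node1_iface), (node2, node2_iface)])` would run the command once per node.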

       

      in reply to: upload_file error: No such file or directory #3771
      Paul Ruth
      Keymaster

        That looks correct.

        Are you able to open and read the file directly in the code?

        It looks like you are running this on a Mac.  Are you running it in a virtual environment on the Mac?  In some cases a virtual environment will not be able to access files outside of the virtual environment.

        Try opening and reading the file directly.
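        One way to run that check is a small sketch like the following. The function name `check_readable` is hypothetical and not part of any FABRIC library; pass it the same local path you gave to upload_file.

```python
from pathlib import Path


def check_readable(path_str: str) -> str:
    """Return a short diagnosis of whether the file at path_str can be read."""
    path = Path(path_str).expanduser()
    if not path.exists():
        return f"missing: {path}"
    if not path.is_file():
        return f"not a regular file: {path}"
    try:
        # Reading a single byte is enough to surface permission problems.
        with path.open("rb") as f:
            f.read(1)
    except OSError as err:
        return f"unreadable: {path} ({err})"
    return f"ok: {path}"
```

        If this reports "missing" for a path that you can see in your file browser, the Python process is probably running with a different working directory or inside an environment that cannot see that location.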

        in reply to: “stdio forwarding failed” issue #3765
        Paul Ruth
        Keymaster

          For clarity, I say “slice/sliver key” because we are a bit inconsistent in our use of terms.   “Slice key” and “sliver key” are often used to mean the same thing.  This is really just the key that is in the VM (as opposed to the key that is in the bastion host).

          The important thing to know is that the slice key you use in the portal and the slice key you use in the JupyterHub are not necessarily the same.  The slice key that is pushed to the VM will be the slice key that is used when you submit the slice request.  That is the slice key you will need to use to access it, regardless of where you access it from.

          So, if you create a slice in the portal and want to access it from the jupyterhub, you will need to have that slice key in your jupyterhub.  The reverse is also true.

          You don’t need to have your keys match, you just need to know which key you used when you created the slice.

          in reply to: Failed to Login into FABRIC VM #3761
          Paul Ruth
          Keymaster

            This has something to do with the ESnet IEP for your account and we need to escalate it to a proper ticket.

            Can you create an account help ticket here: https://portal.fabric-testbed.net/help

            Paul

            in reply to: Failed to Login into FABRIC VM #3756
            Paul Ruth
            Keymaster

              This should work.  Can you confirm the following?

              • You replaced “username_0123456789” with your bastion user name from the portal.
              • You replaced “~/.ssh/fabric_bastion” with the actual path to your bastion private key.
              • Your bastion key is not more than 6 months old.

              Paul
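              The last two checks can be roughly approximated locally with a sketch like this. It is an assumption-heavy illustration: the helper `bastion_key_problems` is hypothetical, “6 months” is approximated as 183 days, and the file modification time is used as a stand-in for when the key was created, which may not match what the portal records.

```python
import time
from pathlib import Path

# Assumption: "6 months" approximated as 183 days.
SIX_MONTHS_SECONDS = 183 * 24 * 3600


def bastion_key_problems(key_path, now=None):
    """Return a list of likely problems with the bastion private key file.

    Uses the file's modification time as a proxy for the key's age,
    which is only an approximation.
    """
    path = Path(key_path).expanduser()
    if not path.is_file():
        return [f"no key file at {path}"]
    problems = []
    current = time.time() if now is None else now
    if current - path.stat().st_mtime > SIX_MONTHS_SECONDS:
        problems.append("key file is older than about 6 months")
    return problems
```

              An empty list from `bastion_key_problems("~/.ssh/fabric_bastion")` only means the file exists and looks recent; it does not confirm the key is the one registered in the portal.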

              in reply to: Unable to allocate resources after the updates/maintenance. #3753
              Paul Ruth
              Keymaster

                @Manas –

                Can you try using the NVMe drives?  They are 1 TB each and you can have multiple per VM.  Like all the other components, you can only create VMs composed of components that are on the same physical host. So, just because a site has 10+ NVMe drives does not mean you can put them all on one VM.  Two NVMe drives in a VM are possible on most sites.  The other bonus of the NVMe drives is that they are very fast.

                Also, you might try using large persistent volumes.  These can be very large; they are mounted across a network, but within a single site.  You would need to pick a few sites where we can create the volumes. Then you can mount them on VMs at those sites.  The bonus with these volumes is that the data is persistent.  So, if you shut down a slice and come back tomorrow or next week, the data will still be there.

                Paul

                 

                • This reply was modified 1 year, 3 months ago by Paul Ruth.
                in reply to: “stdio forwarding failed” issue #3750
                Paul Ruth
                Keymaster

                  We looked into this and this is not an issue with being banned.

                  From your error, you have made it through the bastion host but are failing authorization at the VM.  This is likely caused by using the wrong VM username or the wrong key.  Keep in mind, the key you use in the portal and the key you use in the JupyterHub are likely different.  You can make them the same but you would need to manually do that.  Are you sure you are using the correct slice/sliver key?

                  in reply to: File save error and Load file error #3726
                  Paul Ruth
                  Keymaster

                    What are you using it for? Generally, the JupyterHub is a good place for code/scripts/docs (i.e. smaller things). Do you need space for large data sets? If so, we can create a persistent storage volume in the testbed itself.

                    in reply to: “stdio forwarding failed” issue #3721
                    Paul Ruth
                    Keymaster

                      I think what you have will probably work once the ban is lifted.  We did make a small change in order to load balance across the bastion hosts.  There is now a single bastion name, “bastion.fabric-testbed.net”.  You might try making your ssh_config look something like this (although I think it would work the way you have it):

                      Host bastion.fabric-testbed.net
                          User pruth_0031379841
                          ForwardAgent yes
                          Hostname %h
                          IdentityFile /home/fabric/work/fabric_config/fabric_bastion_key
                          IdentitiesOnly yes
                      
                      Host * !bastion.fabric-testbed.net
                          ProxyJump pruth_0031379841@bastion.fabric-testbed.net:22
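                      To adapt the config above to your own account, a sketch like the following fills in your bastion username and key path. The function name `render_bastion_ssh_config` and the example values are hypothetical; the directives themselves are copied from the config above.

```python
def render_bastion_ssh_config(bastion_user: str, key_path: str,
                              bastion_host: str = "bastion.fabric-testbed.net") -> str:
    """Return ssh_config text matching the layout above for one user."""
    return (
        f"Host {bastion_host}\n"
        f"    User {bastion_user}\n"
        "    ForwardAgent yes\n"
        "    Hostname %h\n"
        f"    IdentityFile {key_path}\n"
        "    IdentitiesOnly yes\n"
        "\n"
        # Every other host is reached by jumping through the bastion.
        f"Host * !{bastion_host}\n"
        f"    ProxyJump {bastion_user}@{bastion_host}:22\n"
    )
```

                      For example, `print(render_bastion_ssh_config("username_0123456789", "~/.ssh/fabric_bastion"))` prints a config you can compare against your own file.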
                      • This reply was modified 1 year, 3 months ago by Paul Ruth.
                      in reply to: “stdio forwarding failed” issue #3719
                      Paul Ruth
                      Keymaster

                        Oh, actually this won’t work for you right now.  There is still something wrong with your ssh setup but even if you correct it, you have triggered our security policy about failed ssh retries and your IP has been temporarily banned.

                        Are you able to try this from a different IP?

                        in reply to: “stdio forwarding failed” issue #3717
                        Paul Ruth
                        Keymaster

                          Ok, my next thought is that the ssh_config file might be wrong or not at the path you specified.

                          Can you confirm the ssh_config file is in the local dir and post the parts related to the fabric bastion host?

                          in reply to: File save error and Load file error #3714
                          Paul Ruth
                          Keymaster

                            It looks like you filled your disk allocation on our JupyterHub.  Do you have old files that you can clean up?
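                            To find candidates for cleanup, a small sketch like this lists the largest files under a directory. The helper `largest_files` is hypothetical, and the directory to scan is up to you (for example, your JupyterHub work directory).

```python
import os


def largest_files(root: str, top_n: int = 10):
    """Return up to top_n (size_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                continue  # skip files that vanish or cannot be stat'ed
    sizes.sort(reverse=True)
    return sizes[:top_n]
```

                            Calling `largest_files(".")` from a notebook cell returns the ten largest files below the current directory.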

                            in reply to: “stdio forwarding failed” issue #3713
                            Paul Ruth
                            Keymaster
                              in reply to: “stdio forwarding failed” issue #3712
                              Paul Ruth
                              Keymaster

                                I think you need to use the private key in that ssh command rather than the public one.

                                Paul

                                in reply to: L2Bridge without MAC learning? #3696
                                Paul Ruth
                                Keymaster

                                  Fraida,

                                  Coincidentally, I ran into this issue recently when putting together an example that I intend to share with you in our meeting with Kate this week.  I have a working prototype that looks like your example that uses a 5th VM to run a software OVS switch (https://witestlab.poly.edu/blog/basic-ethernet-switch-operation/).

                                  There are actually a couple of issues going on here that I had to work around… and it’s super impressive that Yoursunny identified the trickiest part.

                                  The main issue is the one that Yoursunny pointed out related to the Basic NICs being SRIOV virtual functions on a ConnectX-6.

                                  You can think of the ConnectX-6 as a mini-switch that uses its physical port(s) as trunks between itself and the bigger dataplane switch.  The mini-switch then has several access ports (i.e. SRIOV virtual functions) that are passed through to the various VMs.  The traffic on each of these access ports is basically a “pseudo wire” going through the ConnectX-6 between the VM and the dataplane switch.   The problem is that the ConnectX-6 “mini-switch” is also doing MAC learning on the “pseudo wires” and is filtering the traffic.  I think this is an unforeseen problem with our SRIOV configuration and just needs to be changed in the future.  We are working on this.

                                  The effect this has on your example is that an OVS VM that is using 4 Basic NICs connected to 4 other hosts will not see traffic sent directly from one host to another. The ARP request will go through because it is a broadcast, but the ARP reply is filtered by the ConnectX-6 “mini-switch”.  Without the ARPs, we don’t get very far.

                                  My workaround is to use dedicated ConnectX-5s for the OVS switch VM (the hosts can use Basic NICs).  The dedicated NICs are on access ports connected directly to the dataplane switch, so there is no “mini-switch” filtering packets in between.  This isn’t a great solution because it limits the degree of your OVS switch and uses a much scarcer resource type.  The better long-term solution is for us to turn off MAC learning on the ConnectX-6 “mini-switches”.

                                  I can tell you more about this later this week when we talk with Kate.

                                  Paul

                                   

                                   

                                   
