
Ilya Baldin

Forum Replies Created

Viewing 15 posts - 91 through 105 (of 285 total)
  • in reply to: Resource issue – it’s not you it’s me … probably #5002
    Ilya Baldin
    Participant

      Right, we need to do a better job of explaining this. Max-w1 is a valid worker node name; workers are all named <site>-w<index>.fabric-testbed.net.

      That said, I don’t think you need this. Experimenters only rarely need to place a VM on a specific worker. Most of the time you want the placement algorithm to do it for you, so don’t specify host.
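To make the naming pattern concrete, here is a minimal sketch in plain Python. The helper names `worker_hostname` and `node_request` are hypothetical and are not part of fablib. Lowercasing the site name and starting the index at 1 are assumptions of this sketch. The resulting dictionary is only meant to illustrate which arguments you would hand to node creation: `host` appears only when you deliberately pin a worker.

```python
from typing import Optional


def worker_hostname(site: str, index: int) -> str:
    """Build a worker name following <site>-w<index>.fabric-testbed.net."""
    if index < 1:
        # Assumption of this sketch: worker indices start at 1.
        raise ValueError("worker index is assumed to start at 1")
    # Assumption of this sketch: the site part is written in lowercase.
    return f"{site.lower()}-w{index}.fabric-testbed.net"


def node_request(name: str, site: str, worker_index: Optional[int] = None) -> dict:
    """Collect node arguments, adding 'host' only when a worker is pinned."""
    request = {"name": name, "site": site}
    if worker_index is not None:
        request["host"] = worker_hostname(site, worker_index)
    return request


if __name__ == "__main__":
    # Usual case: no host, so the placement algorithm picks the worker.
    print(node_request("node1", "MAX"))
    # Rare case: pin the VM to a specific worker.
    print(node_request("node1", "MAX", worker_index=1))
```

In the usual case the request carries no `host` key at all, which matches the advice above to leave placement to the testbed.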

      in reply to: Resource issue – it’s not you it’s me … probably #5000
      Ilya Baldin
      Participant

        Can you please post the pointer, we will take a look.

        in reply to: Resource issue – it’s not you it’s me … probably #4998
        Ilya Baldin
        Participant

          Remove the host parameter. I think you are treating it as the hostname of your VM, but it is actually the name of the worker node you are asking the VM to land on, and that worker does not exist. The name of the VM is, I think, the name of the node combined with its sliver ID.

          in reply to: Takes long time for complete the Fablib API #4987
          Ilya Baldin
          Participant

            Just updating here for completeness:

            Reachability issues between JH and FABRIC infrastructure

             

            in reply to: Reachability issues between JH and FABRIC infrastructure #4986
            Ilya Baldin
            Participant

              Dear experimenters,

              We believe this problem has been addressed. It was traced to an incorrect configuration of UNC routers on the path, which has now been corrected. We may still run some tests, but you should not experience any more persistent problems.

              in reply to: Takes long time for complete the Fablib API #4974
              Ilya Baldin
              Participant

                Thank you for your analysis. We are about where you are – there is either route flapping or, perhaps, some kind of non-trivial packet loss specific to the path from JH to RENCI (we have been testing with just curl to various hosts at RENCI and the results are the same). We have notified the MCNC NOC as well as UNC ITS and are waiting to see what they say.

                The problem does not appear to manifest itself from the worker nodes hosting JH, only from the Docker containers inside, so we are thinking that perhaps a middlebox somewhere is dropping some connections from JH-originating IPs because they have a high rate of transactions to our infrastructure compared to the background of regular IPs. But this is just a theory. We will continue our investigation, and we apologize for the inconvenience.

                in reply to: Related to the PINGing problem #4964
                Ilya Baldin
                Participant

                  We do not recommend using the portal for anything but the simplest experiments and for visualizing topologies. The portal does not support the full workflow of an experiment – it only creates topologies, leaving everything else (i.e. experiment configuration) as a manual step.

                  Regarding IPv6 – this was not a choice for us – the amount of available IPv4 space is extremely limited at the hosting locations, and at many of them IPv6 was the only option. Our systems deal with this transparently; however, communicating with the outside world can sometimes be problematic because, despite the fact that it is 2023, GitHub still doesn’t have an IPv6 presence. Most other large sites and services do, and as we go forward we expect to have fewer problems with this issue.

                  in reply to: Related to the PINGing problem #4960
                  Ilya Baldin
                  Participant

                    Regarding PSC – the site suffered a power outage on Sunday, and we are putting it back together.

                    in reply to: Related to the PINGing problem #4958
                    Ilya Baldin
                    Participant

                      By far the easiest way to connect VMs to each other is to use the FABNetv4 (or FABNetv6) services. These will not allow the VMs to see the outside world through those interfaces, but they will ‘just work’ regardless of which sites you are on. You really do not need to rely on L2 services as much as you did in GENI, as FABRIC L3 services require far less configuration.

                      You are showing the screenshot from the portal, but I am assuming you are using the notebooks to build your slices – working from the portal requires a lot of manual steps to get things running.

                      The problem with connectivity to the outside world that you are referring to is most likely due to ending up on sites that have IPv6 management network connectivity. In that case many sites (such as yum/deb repos and GitHub) are not reachable directly, and you need to modify your DNS configuration to allow the use of NAT64 (there is a notebook about it, and this article also discusses it).
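For readers wondering why a DNS change helps: a DNS64 resolver answers queries for IPv4-only sites with synthesized IPv6 addresses that embed the IPv4 address under a NAT64 prefix, and the NAT64 gateway translates traffic sent to those addresses. The sketch below shows that embedding with Python's standard `ipaddress` module. It assumes the well-known prefix 64:ff9b::/96 from RFC 6052; the resolvers recommended for FABRIC may use a different prefix, and `synthesize_nat64` is a hypothetical helper name used only for illustration.

```python
import ipaddress

# Well-known NAT64 prefix from RFC 6052. Assumption: the resolver you
# actually use may synthesize addresses under a different prefix.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_nat64(
    ipv4: str,
    prefix: ipaddress.IPv6Network = WELL_KNOWN_PREFIX,
) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address, used here as a stand-in
    # for an IPv4-only site.
    print(synthesize_nat64("192.0.2.1"))
```

A VM with only IPv6 management connectivity can reach such a synthesized address, which is why pointing the VM at a DNS64-capable resolver makes IPv4-only sites reachable.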

                      in reply to: Takes long time for complete the Fablib API #4951
                      Ilya Baldin
                      Participant

                        These calls can sometimes take a longish time (the results of calls are cached, so the first caller gets a delay, but others do not for a while). However, for the past couple of days we have been seeing some connectivity issues between our Jupyter Hub, hosted in Google, and the rest of the testbed, manifesting as various connection retries that can cause additional delays. We are investigating the reasons for it. It appears to be specific to the Jupyter Hub environment.
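The caching behavior described above (the first caller pays the delay, later callers do not for a while) can be illustrated with a small time-to-live cache. This is a generic sketch and not the testbed's actual implementation; the class name `TTLCache`, the 10-second lifetime, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache results for a fixed time; recompute once an entry expires."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh entry: no delay for this caller
        value = compute()  # first caller, or expired entry: pay the cost
        self._entries[key] = (now, value)
        return value


if __name__ == "__main__":
    def slow_query() -> str:
        time.sleep(0.2)  # stands in for a slow remote call
        return "resource listing"

    cache = TTLCache(ttl_seconds=10)
    print(cache.get("resources", slow_query))  # slow: computes the value
    print(cache.get("resources", slow_query))  # fast: served from the cache
```

Once the lifetime elapses, the next caller again triggers the slow call and refreshes the entry, which is why the delay reappears from time to time.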

                        in reply to: Getting up and running #4934
                        Ilya Baldin
                        Participant

                          Note that the latest ‘Bleeding Edge’ container has an example of a notebook that shows how to push extra SSH public keys into a slice at creation time.

                          in reply to: Can not ping other nodes (but can in notebook) #4932
                          Ilya Baldin
                          Participant

                            I guess I’m still not sure what you are experiencing. Let’s please break it down:

                            1. Which notebook are you using to set up the experiment? (please be sure to specify the version of the notebooks, the type of the container you are using in Jupyter Hub and the reported version of fablib – reported as part of the table in the first cell of the notebook)

                            2. Are you saying it works in the notebook, but when you issue (presumably the same) commands from the console they do not work?

                            3. Which host are you trying to ping from which other host using what command? (please provide output of ip addr list and ip route list for each host). Please also provide the slice ID.

                             

                            in reply to: Can not ping other nodes (but can in notebook) #4930
                            Ilya Baldin
                            Participant

                              Can you please show some command outputs and describe which addresses you are trying to ping?

                              This article may help explain which interfaces you should expect in your VM.

                              in reply to: Recent regression: no geographical locations? #4913
                              Ilya Baldin
                              Participant

                                Thank you for reporting this – there has been a change in the underlying Nominatim API we use to convert addresses to latitude/longitude that we failed to notice. We will update the Fall 2023 and the Bleeding Edge containers with the updated version of the underlying library that tracks this change. The 1.4.6 container is already end-of-life, so we will not update it; code in it will continue to report 0 values for lat and lon.

                                in reply to: Recent regression: no geographical locations? #4912
                                Ilya Baldin
                                Participant

                                  We’ll look into it. Certainly not anything we did on purpose.
