Hussam Nasir

Forum Replies Created

Viewing 15 posts - 1 through 15 (of 135 total)
  • in reply to: node.execute() hangs in FABRIC notebook #9535
    Hussam Nasir
    Participant

      Which site was this VM on? Also, could you please share the notebook you used when you encountered the issue? You can email the notebook to help@fabric-testbed.net.

      in reply to: node.execute() hangs in FABRIC notebook #9533
      Hussam Nasir
      Participant

        May I ask which OS is running on your nodes? We are trying to narrow down the issue.

        in reply to: Bluefield3 external connectivity issue #9456
        Hussam Nasir
        Participant

          You are most likely running into the IPv6 vs IPv4 issue. The site you are trying to reach, linux.mellanox.com, is an IPv4-only site. FIU is an IPv4 site, which is why it works seamlessly there, whereas DALL and SEAT are IPv6-only sites. Since you are using IP forwarding rules, as you mentioned, you may have to put those rules in ip6tables.

          The FABRIC VMs use the FABRIC DNS server, which has the NAT64 capability on these racks. So I believe the issue may lie with your forwarding rules.
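
For readers unfamiliar with NAT64, here is a minimal sketch of how an IPv4-only destination gets an IPv6 address that an IPv6-only VM can reach. It assumes the RFC 6052 well-known prefix 64:ff9b::/96; the prefix actually configured on the FABRIC racks is not stated in this thread, and the function name is made up for illustration.

```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix. The prefix used on a given
# rack may differ; pass a different /96 network if so.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_nat64(ipv4, prefix=WELL_KNOWN_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


# 192.0.2.33 is a documentation address (RFC 5737), not a real host.
print(synthesize_nat64("192.0.2.33"))  # 64:ff9b::c000:221
```

Traffic to such a synthesized address is IPv6 on the VM side, which is why forwarding rules written only for IPv4 would not match it.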

          in reply to: Perhaps one of the bastion hosts is out #9418
          Hussam Nasir
          Participant

            I just did.

            in reply to: Perhaps one of the bastion hosts is out #9415
            Hussam Nasir
            Participant

              I see the issue on your VM. I believe you are using FABNETv4 EXT and FABNETv6 EXT. During configuration, you may have accidentally added the NIC to be used in the default route. This caused the system to have two default routes going out of two different interfaces.

              2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc fq_codel state UP group default qlen 1000
              link/ether fa:16:3e:07:66:5e brd ff:ff:ff:ff:ff:ff
              inet 10.30.6.168/23 metric 100 brd 10.30.7.255 scope global dynamic enp3s0
              valid_lft 58217sec preferred_lft 58217sec
              inet6 2001:400:a100:3030:f816:3eff:fe07:665e/64 scope global dynamic mngtmpaddr noprefixroute
              valid_lft 86383sec preferred_lft 14383sec
              inet6 fe80::f816:3eff:fe07:665e/64 scope link
              valid_lft forever preferred_lft forever
              3: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
              link/ether 2a:77:c8:60:d3:bf brd ff:ff:ff:ff:ff:ff
              inet 10.129.130.253/24 scope global enp7s0
              valid_lft forever preferred_lft forever
              inet 23.134.235.195/28 scope global enp7s0
              valid_lft forever preferred_lft forever
              inet6 2602:fcfb:101::3/28 scope global
              valid_lft forever preferred_lft forever
              inet6 2602:fcfb:101:0:2877:c8ff:fe60:d3bf/64 scope global dynamic mngtmpaddr
              valid_lft 2591807sec preferred_lft 604607sec
              inet6 fe80::2877:c8ff:fe60:d3bf/64 scope link
              valid_lft forever preferred_lft forever
              4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
              link/ether ca:03:f6:60:d5:34 brd ff:ff:ff:ff:ff:ff
              inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
              valid_lft forever preferred_lft forever

              root@SenderSTAR:~# ip -6 route show
              ::1 dev lo proto kernel metric 256 pref medium
              2001:400:a100:3030::/64 dev enp3s0 proto ra metric 100 expires 86371sec pref medium
              2001:400:a300::/48 via 2602:fcfb:101::1 dev enp7s0 metric 1024 pref medium
              2602:fcfb:101::/64 dev enp7s0 proto kernel metric 256 expires 2591914sec pref medium
              2602:fcf0::/28 dev enp7s0 proto kernel metric 256 pref medium
              fe80::a9fe:a9fe via fe80::f816:3eff:fe79:edec dev enp3s0 proto ra metric 100 expires 271sec pref medium
              fe80::/64 dev enp3s0 proto kernel metric 256 pref medium
              fe80::/64 dev enp7s0 proto kernel metric 256 pref medium
              default via fe80::f816:3eff:fe79:edec dev enp3s0 proto ra metric 100 expires 271sec mtu 9000 pref medium
              default via fe80::c28b:2aff:fe82:6d02 dev enp7s0 proto ra metric 1024 expires 1714sec hoplimit 64 pref medium

              As soon as I disabled the enp7s0 NIC using ip link set dev enp7s0 down, the VM started working via SSH.

              It is possible that there is a routing issue when FABNETv6 is used at STAR with STAR Bastion. I will ask for this use case to be investigated.
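
A rough way to spot this condition from a script is to count the default entries in the routing table output. The sketch below parses text in the format printed by ip -6 route show; the function name is hypothetical, and the two sample lines are the default routes from the output above.

```python
def default_routes(route_output):
    """Return one dict per 'default' line in `ip route` style output."""
    routes = []
    for line in route_output.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] != "default":
            continue
        entry = {"via": None, "dev": None, "metric": None}
        # Adjacent tokens form key/value pairs: "via <gw>", "dev <nic>", ...
        for key, value in zip(tokens[1:], tokens[2:]):
            if key in entry and entry[key] is None:
                entry[key] = value
        if entry["metric"] is not None:
            entry["metric"] = int(entry["metric"])
        routes.append(entry)
    return routes


SAMPLE = """\
default via fe80::f816:3eff:fe79:edec dev enp3s0 proto ra metric 100 expires 271sec mtu 9000 pref medium
default via fe80::c28b:2aff:fe82:6d02 dev enp7s0 proto ra metric 1024 expires 1714sec hoplimit 64 pref medium
"""

routes = default_routes(SAMPLE)
devices = sorted({r["dev"] for r in routes})
if len(devices) > 1:
    print("multiple default routes via:", ", ".join(devices))
    # prints: multiple default routes via: enp3s0, enp7s0
```

On a live VM the input could come from running ip -6 route show and passing its output to the function; more than one device in the result is a hint to review which interface should carry the default route.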

              in reply to: Perhaps one of the bastion hosts is out #9412
              Hussam Nasir
              Participant

                The good news is that I have narrowed the issue down to your VM at STAR. Does the issue also occur when using VMs at other sites?

                in reply to: Perhaps one of the bastion hosts is out #9408
                Hussam Nasir
                Participant

                  Can you post the full ssh command? The issue may be the connection between STAR and the destination rack.

                  in reply to: Perhaps one of the bastion hosts is out #9406
                  Hussam Nasir
                  Participant

                    I do see successful logins from your ID in the STAR bastion logs, even from early this morning.

                    in reply to: Perhaps one of the bastion hosts is out #9399
                    Hussam Nasir
                    Participant

                      The fablib logs do not tell which bastion it tried. Enabling verbose debug output might help. The other thing we can try is to ssh directly to each bastion, one by one, to see which one fails to connect (all of them will refuse the SSH session, since direct SSH is not allowed).

                      Here is the list of all the bastions https://learn.fabric-testbed.net/knowledge-base/frequently-asked-starter-questions/ (last question in the FAQ)

                      in reply to: Bastion problems #9397
                      Hussam Nasir
                      Participant

                        Hello Jiri,

                        The problem you show is different. Your attempt fails at authenticating to the bastion, which indicates you are using an incorrect bastion key. Others could not even connect to the bastion. Please make sure you are using the right bastion key in your configuration.

                        in reply to: Perhaps one of the bastion hosts is out #9396
                        Hussam Nasir
                        Participant

                          One thing that stands out is that I don't see any IPv6 addresses for the bastion host in your name lookup. We had been seeing issues on IPv6 from home networks, but we believe that the workaround we put in place for that has worked, as reported by other users. I would also like to see the fablib.log file. Also, could you provide your source IP, as it’s possible that one of the bastions may have banned it?
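
To check from Python whether a name resolves to IPv4 addresses, IPv6 addresses, or both, a small sketch using only the standard library could look like this. The helper names are made up for illustration, and the result reflects whatever the local resolver returns, so it may differ from nslookup output on another machine.

```python
import socket


def group_by_family(addrinfo):
    """Split getaddrinfo() results into sorted IPv4 and IPv6 address lists."""
    v4 = sorted({info[4][0] for info in addrinfo if info[0] == socket.AF_INET})
    v6 = sorted({info[4][0] for info in addrinfo if info[0] == socket.AF_INET6})
    return v4, v6


def lookup(host, port=22):
    """Resolve host with the system resolver; return (ipv4_list, ipv6_list)."""
    results = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return group_by_family(results)
```

For example, lookup("bastion.fabric-testbed.net") returns two lists; an empty second list would match the symptom described above, where no IPv6 addresses show up in the name lookup.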

                          in reply to: Perhaps one of the bastion hosts is out #9392
                          Hussam Nasir
                          Participant

                            Hello Ilya,

                            We have added two new bastions recently and modified one of the pre-existing ones. Can you please post the result of

                            “nslookup bastion.fabric-testbed.net” from the machine where this failed?

                            in reply to: FABRIC Services currently down #9262
                            Hussam Nasir
                            Participant

                              The issue has been resolved. Thank you for your patience.

                              in reply to: can’t see nvidia card though VM shows component assigned #9027
                              Hussam Nasir
                              Participant

                                You should be able to see the Nvidia card in your VM now.

                                in reply to: permission for Bastion Key “too open” #8747
                                Hussam Nasir
                                Participant

                                  Hello Nirmala,

                                  The permission has to be 600 for ssh private keys.

                                  chmod 600 /home/fabric/work/fabric_config/Nirmala
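
The same check and fix can be done from a notebook cell. This is a minimal sketch with hypothetical helper names; it treats a key as "too open" when group or others have any permission bits set, which is an approximation of the check ssh performs.

```python
import os
import stat


def is_private_enough(path):
    """True when group and others have no permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def restrict_to_owner(path):
    """Equivalent of `chmod 600 <path>`: owner read/write only."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Calling restrict_to_owner() on the bastion key path from the command above, and then is_private_enough() on the same path, should return True.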
