Komal Thareja

Forum Replies Created

Viewing 15 posts - 196 through 210 (of 511 total)
  • in reply to: Issue with SmartNIC Configuration on nodes #7826
    Komal Thareja
    Participant

      Hi Hemil,

      Could you please run the jupyter-examples-*/configure_and_validate/configure_and_validate.ipynb notebook?
      This should resolve any SSH key issues by renewing the bastion keys if they have expired.

      Try your setup.sh script after that and let us know if you still see this error.
      In addition, could you please try to SSH to the VMs using the command shown in the SSH Command column?

      To have the IP addresses configured automatically, please specify the subnet when creating the network and set the mode to auto on the interfaces at slice creation. Please refer to one of the following examples for more details.

      https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/fablib_api/create_l2network_basic/create_l2network_basic_auto.ipynb

      https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/fablib_api/create_l2network_wide_area/create_l2network_wide_area_auto.ipynb

      https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/fablib_api/create_l3network_fabnet_ipv4/create_l3network_fabnet_ipv4_auto.ipynb
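The two steps described above (a subnet on the network, auto mode on each interface) can be sketched roughly as follows. This is a minimal sketch modeled on the linked notebooks, not a definitive recipe: the slice name, node names, NIC name, and the choice of site are placeholders invented for illustration, and the function only builds the request without submitting it.

```python
from ipaddress import IPv4Network


def build_auto_l2_slice(fablib, slice_name="auto-ip-demo", site="CLEM",
                        subnet="192.168.1.0/24"):
    """Build (but do not submit) a two-node slice with auto-configured IPs.

    `fablib` is expected to be a FablibManager instance. The slice name,
    site, node names and NIC name are illustrative placeholders.
    """
    slice_ = fablib.new_slice(name=slice_name)

    # Step 1: give the network a subnet so addresses can be drawn from it.
    net = slice_.add_l2network(name="net1", subnet=IPv4Network(subnet))

    for node_name in ("node1", "node2"):
        node = slice_.add_node(name=node_name, site=site)
        iface = node.add_component(model="NIC_Basic", name="nic1").get_interfaces()[0]
        # Step 2: set the interface mode to auto before attaching it.
        iface.set_mode("auto")
        net.add_interface(iface)

    return slice_


# Typical use inside a FABRIC notebook (requires fabrictestbed_extensions):
#   from fabrictestbed_extensions.fablib.fablib import FablibManager
#   build_auto_l2_slice(FablibManager()).submit()
```

Passing the manager in as an argument keeps the sketch importable outside a FABRIC Jupyter container; the linked notebooks remain the authoritative reference.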

      Please let us know if you still run into errors or have questions.

      Thanks,
      Komal

      in reply to: Unable to reserve FPGAs on KANS or WASH #7800
      Komal Thareja
      Participant

        Hi Ilya,

        Could you please try your slice again? There were leaked slivers, which I have now cleared, so slice provisioning should work.

        Thanks,
        Komal

        Komal Thareja
        Participant

          Hi Ali,

          We provide only 1 GB of storage to users on Jupyter containers, in the /home/fabric/work directory. Could you please share a screenshot of the df -h output from the terminal in your container, and also describe how you are trying to upload the files?

          I was able to upload an 800 MB file to my container using the JupyterHub upload interface without issues.
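If a screenshot of df -h is inconvenient, roughly the same numbers can be read from a notebook cell. The sketch below uses only the Python standard library; the helper names are made up for illustration, and it reports filesystem-level usage, which may not reflect a per-user quota if one is enforced separately.

```python
import shutil


def free_space_report(path="/home/fabric/work"):
    """Return total, used and free space in MiB for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return {
        "total_mib": usage.total // mib,
        "used_mib": usage.used // mib,
        "free_mib": usage.free // mib,
    }


def fits(path, upload_bytes):
    """Return True if an upload of `upload_bytes` fits in the free space at `path`."""
    return shutil.disk_usage(path).free >= upload_bytes
```

For example, `fits("/home/fabric/work", 800 * 1024 * 1024)` checks whether an 800 MB upload would fit before attempting it.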

          Thanks,
          Komal

          in reply to: Unable to delete slice from expired project #7792
          Komal Thareja
          Participant

            Slice has been deleted.

            Thanks,
            Komal

            in reply to: Unable to delete slice from expired project #7789
            Komal Thareja
            Participant

              Hello,

              Your slice is set to close by November 20th and is currently in the StableOK state. If needed, you can request to renew your project to continue using this slice. Alternatively, I can delete the slice if that is your intention.

              Slice Name: new_remote_attestation
              Slice ID: a4caf0d7-49b0-41c8-904f-e8ed64ab8f5d
              Project ID: a93b8d1a-a9dd-480d-b1f1-23c3889a7e17
              Project Name: Tutorial on using Alveos on FABRIC as part of F23 CS595 at Illinois Tech
              Graph ID: e9577750-7a40-430c-9872-2ff856d061e2
              Slice owner: { name: orchestrator, guid: orchestrator-guid, oidc_sub_claim: 7baac318-48b4-43b3-bc3e-ac3dfd23d7bc, email: hbang3@hawk.iit.edu}
              Slice state: StableOK
              Lease time: 2024-11-20 05:12:54+00:00

              Thanks,
              Komal

              in reply to: Maintenance Network AM – 11/05/2024 – 2:00 pm – 3:00 pm #7771
              Komal Thareja
              Participant

                Closing the thread!

                in reply to: Maintenance Network AM – 11/05/2024 – 2:00 pm – 3:00 pm #7770
                Komal Thareja
                Participant

                  Maintenance is complete and testbed is available for use again!

                  Thanks,
                  Komal

                  Komal Thareja
                  Participant

                    Hi Luca,

                    Not much luck with it! I can reproduce what you are observing on CLEM but haven’t found a resolution yet. The probe reports a few drops as soon as I start pktgen, but after that it just keeps reporting all 0s.

                    However, I did notice that once pktgen is started and all the containers are up and running, the following error keeps appearing in the container sn-stack-ubuntu-smartnic-cfg-1:

                    Checking for FPGA readiness ... FPGA ready.
                    Starting server: sn-cfg-agent server --tls-cert-chain=/etc/letsencrypt/fullchain.pem --tls-key=/etc/letsencrypt/privkey.pem 0000:1f:00.0
                    --- PCI bus IDs:
                    ------> 0000:1f:00.0
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    --- UTC start time: 2024-10-24 20:33:02 +0000 [1729801982s.278712702ns]
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    agent_server_run: Serving on [::]:50100
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error
                    ERROR(cms_mailbox_post)[5 (Input/output error)]: packet error

                    Thanks,
                    Komal

                    Komal Thareja
                    Participant

                      Hey Luca,

                      I was looking at your slice on SRI and noticed that two containers, sn-stack-ubuntu-smartnic-cfg-1 and sn-stack-ubuntu-smartnic-p4-1, keep restarting. I suspect that could be the reason for the traffic issue.

                      Your DALL slice has expired, so I could not check there.

                      The logs in both containers suggest the FPGA is not ready.

                      ================================================================================
                      Created self-signed TLS certificate.
                      issuer=CN = localhost
                      subject=CN = localhost
                      notBefore=Oct 23 18:59:34 2024 GMT
                      notAfter=Oct 23 18:59:34 2025 GMT
                      X509v3 Subject Alternative Name:
                      DNS:smartnic-p4, DNS:localhost, DNS:localhost, IP Address:127.0.0.1, DNS:ip6-localhost, IP Address:0:0:0:0:0:0:0:1
                      ================================================================================
                      Checking for FPGA readiness ... FPGA not ready.


                      CONTAINER ID   IMAGE                               COMMAND                  CREATED         STATUS                          PORTS      NAMES
                      388c54840920   smartnic-dpdk-docker:ubuntu-dev     "/bin/bash -c -e -o …"   3 minutes ago   Up 2 minutes                               sn-stack-ubuntu-smartnic-dpdk-1
                      76f7a24df81d   esnet-smartnic-fw:ubuntu-dev        "/bin/bash -c -e -o …"   3 minutes ago   Up 2 minutes (healthy)                     sn-stack-ubuntu-smartnic-devbind-1
                      b5cca620505d   esnet-smartnic-fw:ubuntu-dev        "/usr/local/sbin/sn-…"   3 minutes ago   Restarting (1) 59 seconds ago              sn-stack-ubuntu-smartnic-cfg-1
                      9dbb7262d5d6   esnet-smartnic-fw:ubuntu-dev        "/bin/bash -c -e -o …"   3 minutes ago   Up 2 minutes                               sn-stack-ubuntu-smartnic-fw-1
                      380fcc8ad614   esnet-smartnic-fw:ubuntu-dev        "/usr/local/sbin/sn-…"   3 minutes ago   Restarting (1) 59 seconds ago              sn-stack-ubuntu-smartnic-p4-1
                      a3972a1c0ce9   xilinx-labtools-docker:ubuntu-dev   "/entrypoint.sh /bin…"   3 minutes ago   Up 3 minutes (healthy)                     sn-stack-ubuntu-smartnic-hw-1
                      352f70e7da43   xilinx-labtools-docker:ubuntu-dev   "/entrypoint.sh /bin…"   3 minutes ago   Up 3 minutes (healthy)          3121/tcp   sn-stack-ubuntu-xilinx-hwserver-1
                      4295b131ccd8   esnet-smartnic-fw:ubuntu-dev        "/bin/bash -c -e -o …"   3 minutes ago   Up 3 minutes                               sn-stack-ubuntu-smartnic-unpack-1
                      e9a84041e44e   esnet-smartnic-fw:ubuntu-dev        "/bin/bash -c -e -o …"   3 minutes ago   Up 3 minutes                               sn-stack-ubuntu-xilinx-sc-console-1

                      Dev bind is successful:

                      No 'Regex' devices detected
                      ===========================
                      + lspci -D -kvm -s 0000:1f:00.0
                      + grep '^Driver: vfio-pci'
                      Driver: vfio-pci
                      + lspci -D -kvm -s 0000:1f:00.1
                      + grep '^Driver: vfio-pci'
                      Driver: vfio-pci
                      + touch /status/ok
                      + sleep infinity
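To spot restarting containers quickly in a docker ps listing like the one above, the output can be filtered with a few lines of Python. This is a minimal sketch, assuming the listing has been captured as text and that the container name is the last whitespace-separated field on each row; the function name is hypothetical.

```python
import re

# Matches the STATUS text docker ps prints for a container in a restart loop,
# e.g. "Restarting (1) 59 seconds ago"; the number is the last exit code.
RESTARTING = re.compile(r"Restarting \((\d+)\)")


def restarting_containers(ps_output):
    """Return (container_name, exit_code) for every row marked as Restarting."""
    found = []
    for line in ps_output.splitlines():
        match = RESTARTING.search(line)
        if match:
            # Assumption: NAMES is the final column, so take the last token.
            found.append((line.split()[-1], int(match.group(1))))
    return found
```

Applied to the listing in this post, it would flag sn-stack-ubuntu-smartnic-cfg-1 and sn-stack-ubuntu-smartnic-p4-1, both with exit code 1.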

                      Komal Thareja
                      Participant

                        Correction: FIU has been flashed with a bitfile compatible with the XDMA shell, so it may not work with the ESnet workflow.

                        Komal Thareja
                        Participant

                          Thank you Luca for sharing the slice information. I will investigate this further and keep you posted.

                          Could you please extend the slices by at least a week so they don’t expire?

                          As a first check, the FPGAs at GATECH, FIU, and SRI seem to be flashed with a bitfile compatible with the ESnet workflow. I will check on KANS and LOSA and confirm.

                          Thanks,

                          Komal

                          Komal Thareja
                          Participant

                            Hi Luca,

                            Could you share the sites where you encountered this issue? I tried CLEM, and it worked fine.

                            As mentioned here, we collaborate with the experimenter to flash the FPGA with the initial bitstream. We’d like to rule out whether a different bitstream (other than ESnet) was used for flashing the FPGA at the sites where you experienced the problem. Also, if you have the slice up where you see the error, please share your slice ID with us!

                            Thanks,
                            Komal

                            in reply to: Unable to access slice #7672
                            Komal Thareja
                            Participant

                              Hi Kriti,

                              It appears that there may be a configuration issue with your experiment. I recommend checking your settings. Based on your slice, there’s a Layer 2 network established between ipnode-1 (192.168.14.1) and n6 (192.168.14.254).

                              I can confirm that pinging from ipnode-1 to n6 is successful, which indicates that the underlying Layer 2 network is functioning properly. While you do have a route on ipnode-1 to direct traffic through n6, it seems that the subnet you’re trying to access is not reachable from n6. This issue seems to be specific to your experiment’s configuration, so you may need to troubleshoot on your end.


                              [root@ipnode-1 ~]# ip route list
                              10.30.6.0/23 dev eth0 proto kernel scope link src 10.30.6.233 metric 100
                              169.254.169.254 via 10.30.6.11 dev eth0 proto dhcp src 10.30.6.233 metric 100
                              192.168.0.0/16 via 192.168.14.254 dev eth1
                              192.168.14.0/24 dev eth1 proto kernel scope link src 192.168.14.1
                              [root@ipnode-1 ~]# traceroute 192.168.28.1
                              traceroute to 192.168.28.1 (192.168.28.1), 30 hops max, 60 byte packets
                              1 192.168.14.254 (192.168.14.254) 0.422 ms !N 0.386 ms !N *
                              [root@ipnode-1 ~]# traceroute 192.168.14.254
                              traceroute to 192.168.14.254 (192.168.14.254), 30 hops max, 60 byte packets
                              1 192.168.14.254 (192.168.14.254) 0.454 ms 0.434 ms 0.423 ms
                              [root@ipnode-1 ~]#
                              [root@ipnode-1 ~]#
                              [root@ipnode-1 ~]#
                              [root@ipnode-1 ~]# ping -c 5 192.168.14.254
                              PING 192.168.14.254 (192.168.14.254) 56(84) bytes of data.
                              64 bytes from 192.168.14.254: icmp_seq=1 ttl=64 time=0.069 ms
                              64 bytes from 192.168.14.254: icmp_seq=2 ttl=64 time=0.062 ms
                              64 bytes from 192.168.14.254: icmp_seq=3 ttl=64 time=0.102 ms
                              64 bytes from 192.168.14.254: icmp_seq=4 ttl=64 time=0.066 ms
                              64 bytes from 192.168.14.254: icmp_seq=5 ttl=64 time=0.076 ms

                              n6:

                              [root@n6 ~]# ip route list
                              10.30.6.0/23 dev eth0 proto kernel scope link src 10.30.6.69 metric 100
                              169.254.169.254 via 10.30.6.11 dev eth0 proto dhcp src 10.30.6.69 metric 100
                              192.168.1.0/24 proto ospf metric 20
                                  nexthop via 192.168.8.2 dev eth3 weight 1
                                  nexthop via 192.168.12.1 dev eth1 weight 1
                              192.168.2.0/24 via 192.168.8.2 dev eth3 proto ospf metric 20
                              192.168.3.0/24 proto ospf metric 20
                                  nexthop via 192.168.8.2 dev eth3 weight 1
                                  nexthop via 192.168.12.1 dev eth1 weight 1
                              192.168.4.0/24 via 192.168.12.1 dev eth1 proto ospf metric 20
                              192.168.5.0/24 via 192.168.8.2 dev eth3 proto ospf metric 20
                              192.168.6.0/24 proto ospf metric 20
                                  nexthop via 192.168.8.2 dev eth3 weight 1
                                  nexthop via 192.168.12.1 dev eth1 weight 1
                              192.168.7.0/24 via 192.168.12.1 dev eth1 proto ospf metric 20
                              192.168.8.0/24 dev eth3 proto kernel scope link src 192.168.8.1
                              192.168.9.0/24 via 192.168.8.2 dev eth3 proto ospf metric 20
                              192.168.10.0/24 via 192.168.8.2 dev eth3 proto ospf metric 20
                              192.168.11.0/24 via 192.168.12.1 dev eth1 proto ospf metric 20
                              192.168.12.0/24 dev eth1 proto kernel scope link src 192.168.12.2
                              192.168.13.0/24 via 192.168.12.1 dev eth1 proto ospf metric 20
                              192.168.14.0/24 dev eth2 proto kernel scope link src 192.168.14.254
                              192.168.15.0/24 via 192.168.8.2 dev eth3 proto ospf metric 20
                              192.168.16.0/24 via 192.168.12.1 dev eth1 proto ospf metric 20

                              [root@n6 ~]# traceroute 192.168.28.1
                              traceroute to 192.168.28.1 (192.168.28.1), 30 hops max, 60 byte packets
                              connect: Network is unreachable
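The behavior in the two outputs above follows from longest-prefix matching against each node's routing table. The sketch below replays that lookup with Python's standard ipaddress module. The route lists are transcribed from this post (the n6 list is abbreviated to a few representative entries, and the 169.254.169.254 host route is omitted); the function and variable names are hypothetical.

```python
from ipaddress import ip_address, ip_network


def lookup(routes, destination):
    """Return the most specific (prefix, next_hop) covering `destination`, or None."""
    dest = ip_address(destination)
    matches = [(ip_network(prefix), hop) for prefix, hop in routes
               if dest in ip_network(prefix)]
    if not matches:
        return None  # no matching route: "Network is unreachable"
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), hop


# Routes on ipnode-1: directly connected links show the device as next hop.
IPNODE1_ROUTES = [
    ("10.30.6.0/23", "eth0"),
    ("192.168.0.0/16", "192.168.14.254"),
    ("192.168.14.0/24", "eth1"),
]

# Abbreviated routes on n6; the full table only covers
# 192.168.1.0/24 through 192.168.16.0/24 and has no default route.
N6_ROUTES = [
    ("10.30.6.0/23", "eth0"),
    ("192.168.8.0/24", "eth3"),
    ("192.168.12.0/24", "eth1"),
    ("192.168.14.0/24", "eth2"),
    ("192.168.16.0/24", "192.168.12.1"),
]

print(lookup(IPNODE1_ROUTES, "192.168.28.1"))  # forwarded to n6 via the /16 route
print(lookup(N6_ROUTES, "192.168.28.1"))       # None: n6 has nowhere to send it
```

On ipnode-1 the destination matches 192.168.0.0/16 and is handed to 192.168.14.254, but n6 has neither a route covering 192.168.28.0/24 nor a default route, which is consistent with the !N replies seen in the traceroute.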

                              Thanks,

                              Komal

                              in reply to: Unable to access slice #7668
                              Komal Thareja
                              Participant

                                Hi Kriti,

                                I created slices on MASS with a Layer 2 network and with a Layer 3 network. Both slices were able to pass traffic.

                                This may be something specific to your slice or configuration. Please share your slice ID. We can help take a look, but we would also recommend checking the configuration on your side.

                                Thank you for letting us know about the portal. We will work on addressing that as well.

                                Thanks,

                                Komal

                                in reply to: Unable to extend slice #7664
                                Komal Thareja
                                Participant

                                  Hi Kriti,

                                  Could you please open the calendar by clicking on the small square shown next to Lease End at, choose the timestamp, and then click on Extend?

                                  You can also extend your slice using either of the following methods:

                                  • Notebook accessible from Start_here -> Extend Slice Reservation
                                  • slice-commander command line utility available in JH container


                                  fabric@fall:system-29%$ slice-commander
                                  Usage: renew <days> [SliceName1, SliceName2, ...]
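For the notebook route, extending a slice programmatically looks roughly like the sketch below. It assumes a FablibManager instance is passed in and that its `get_slice` and slice `renew` calls accept a slice name and an end-date string as in the Extend Slice Reservation notebook; the helper names and the default handling are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def lease_end_string(days, now=None):
    """Format a lease end `days` from now as 'YYYY-MM-DD HH:MM:SS +0000' (UTC)."""
    now = now or datetime.now(timezone.utc)
    return (now + timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S %z")


def renew_slice(fablib, slice_name, days):
    """Ask fablib to renew `slice_name` so that it ends `days` from now."""
    slice_ = fablib.get_slice(name=slice_name)
    end_date = lease_end_string(days)
    slice_.renew(end_date)
    return end_date


# Typical use inside a FABRIC notebook (requires fabrictestbed_extensions):
#   from fabrictestbed_extensions.fablib.fablib import FablibManager
#   renew_slice(FablibManager(), "MySlice", days=7)
```

The slice name "MySlice" is a placeholder. Whether a renewal is granted still depends on the project's limits and expiry.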

                                  Please let us know if you run into issues or errors.

                                  Thanks,

                                  Komal
