Komal Thareja

Forum Replies Created

Viewing 15 posts - 121 through 135 (of 515 total)
  • in reply to: GPU node is not available on MAX site. #8238
    Komal Thareja
    Participant

      Hi Yuanjun,

      Unfortunately, the MAX resources you’re requesting are currently in use. Please try again later or consider scheduling your slices in advance using the notebook.

      Additionally, could you post your inquiry in the FABRIC General Questions and Discussion forum?

      Thanks,
      Komal

      in reply to: Slice submit via Jupyter get’s stuck #8211
      Komal Thareja
      Participant

        Glad to hear that worked! We will work on addressing this and add support for interrupting the call or returning a meaningful error in such cases.

        Thanks,

        Komal

        in reply to: Unable to reserve slice #8208
        Komal Thareja
        Participant

          Hi Kriti,

          Could you please re-run the notebook jupyter-examples-rel1.8.1/configure_and_validate/configure_and_validate.ipynb?
          This should renew any expired keys. Please try your slice again after this; I want to rule out any SSH errors. If you continue to see the error, please share the file /tmp/fablib/fablib.log
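If the full log is large, sharing only its most recent lines is often enough. The following is a minimal sketch of one way to extract them in a notebook cell; the helper name `tail_lines` and the default of 200 lines are assumptions for illustration and are not part of fablib.

```python
from collections import deque
from pathlib import Path


def tail_lines(path, count=200):
    """Return the last `count` lines of a text file as a list of strings."""
    with Path(path).open("r", encoding="utf-8", errors="replace") as handle:
        # A deque with maxlen keeps only the most recent lines while streaming.
        return list(deque(handle, maxlen=count))


if __name__ == "__main__":
    log_path = Path("/tmp/fablib/fablib.log")  # path taken from the post above
    if log_path.exists():
        print("".join(tail_lines(log_path, count=200)))
    else:
        print(f"{log_path} not found")
```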

          Thanks,
          Komal

          in reply to: Slice submit via Jupyter get’s stuck #8207
          Komal Thareja
          Participant

            An "Authentication failed" error would explain the SSH errors you are observing. Could you please re-run the notebook jupyter-examples-rel1.8.1/configure_and_validate/configure_and_validate.ipynb?
            This should renew any expired keys. Please try your slice again after this.

            Thanks,
            Komal

            in reply to: HAWI SITE PROBLEM #8205
            Komal Thareja
            Participant

              Hi,

              Please try the following snippet. Please note that list_sites() should be invoked before show_site().

              We will fix this to return a more meaningful error in the next version.


              from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
              fablib = fablib_manager()
              fablib.list_sites()
              fablib.get_resources().show_site("HAWI")

              Thanks,
              Komal

              in reply to: Slice submit via Jupyter get’s stuck #8204
              Komal Thareja
              Participant

                Thank you Justas! I haven’t been able to reproduce this, even on the JH Stable 1.8 container. Could you please share the /tmp/fablib/fablib.log file from your container?

                Also, please share the slice ID of your new slice.

                Thanks,

                Komal

                in reply to: Slice submit via Jupyter get’s stuck #8201
                Komal Thareja
                Participant

                  Could you please try this with the Beyond Bleeding Edge container? I wasn’t able to reproduce this issue there. I am trying it with the 1.8 Stable container now.

                  Thanks,

                  Komal

                  in reply to: Unable to reserve slice #8199
                  Komal Thareja
                  Participant

                    Hey Kriti,

                    Could you please share which JH container you were using when you noticed this issue?

                    Thanks,

                    Komal

                    in reply to: Slice submit via Jupyter get’s stuck #8198
                    Komal Thareja
                    Participant

                      Also, could you please share which JH container you are using?

                      Thanks,

                      Komal

                      in reply to: Slice submit via Jupyter get’s stuck #8197
                      Komal Thareja
                      Participant

                        Hi,

                        Both of your slices are in a Stable state. It looks like a bug or a race condition in fablib is causing it to think the slice is still Configuring.

                        I am trying to reproduce this on my end and will work on a fix. Apologies for the inconvenience!

                        As a workaround, could you please run the following?


                        slice=fablib.get_slice(slice_name)
                        slice.post_boot_config()
                        slice.list_nodes();
                        slice.list_interfaces();


                        Slice Name: FRR-losa Slice ID: 0367f6f3-1331-49dc-9399-722616237a5b Project ID: a57c7715-d871-4369-82e6-408c9a57a6e7 Project Name: UCSD-FABRIC test
                        Graph ID: 071abcd4-f292-449d-a69a-da4768780546
                        Slice owner: { name: orchestrator, guid: orchestrator-guid, oidc_sub_claim: 91f5ecc3-16ff-4f09-95ac-dfeee0c3b1e3, email: jbalcas@es.net}
                        Slice state: StableOK
                        Lease time: 2025-02-07 14:24:38+00:00
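Until a fix is available, a caller can guard against an indefinite wait by polling with a timeout. The following is a generic sketch only, not fablib's own wait logic: `wait_for_state`, its parameters, and the state names passed to it are assumptions for illustration, and `get_state` is any callable you supply that returns the current slice state as a string.

```python
import time


def wait_for_state(get_state, done_states, timeout=600.0, interval=10.0):
    """Poll get_state() until it returns a value in done_states.

    Raises TimeoutError if no such state is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in done_states:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still in state {state!r} after {timeout} seconds")
        time.sleep(interval)
```

With this in place, a stuck "Configuring" state surfaces as a `TimeoutError` instead of a cell that never returns, at which point the workaround above can be applied.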

                        Thanks,

                        Komal

                        in reply to: Site to Site Connection Issue #8192
                        Komal Thareja
                        Participant

                          Hi Raghav,

                          Please set up your JH environment by running the notebook: jupyter-examples-rel1.8.1/configure_and_validate/configure_and_validate.ipynb

                          This should set up all the required configuration files and SSH keys. After that, please try the Wide Area Link notebook or Hello Fabric to confirm that your configuration works. Please let us know if you run into issues.

                          Thanks,

                          Komal

                          in reply to: Error message: strptime() argument 1 must be str, not None #8188
                          Komal Thareja
                          Participant

                            The issue was resolved over a Zoom meeting; the cause was the token file.

                            The token file contained only the id_token instead of the entire token contents. Downloading the complete token file and using that resolved the issue.
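A quick sanity check on the token file can catch this before fablib fails with the strptime() error. The following is a minimal sketch that assumes the token file is JSON; apart from `id_token`, which is mentioned above, the field names in `EXPECTED_FIELDS` are assumptions and may not match the real file, and `missing_token_fields` is a hypothetical helper, not a fablib function.

```python
import json

# Hypothetical field names; only "id_token" is confirmed by the post above.
EXPECTED_FIELDS = ("id_token", "refresh_token", "created_at")


def missing_token_fields(path, expected=EXPECTED_FIELDS):
    """Return the expected keys that are absent or empty in a JSON token file."""
    with open(path, "r", encoding="utf-8") as handle:
        token = json.load(handle)
    if not isinstance(token, dict):
        # Not a JSON object at all, so every expected field is missing.
        return list(expected)
    return [name for name in expected if not token.get(name)]
```

An empty list means every expected field is present; anything else suggests the file was copied partially and should be downloaded again.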

                            Please let us know if you run into any other issues. I have made a note to return a more user-friendly error. We will address this in the next release.

                            Thanks,

                            Komal

                            in reply to: Error message: strptime() argument 1 must be str, not None #8183
                            Komal Thareja
                            Participant

                              I reached out to Vaneshi via email to request a meeting to resolve this.

                              Thanks,

                              Komal

                              in reply to: Site to Site Connection Issue #8182
                              Komal Thareja
                              Participant

                                Hi Raghav,

                                The data plane interfaces on your VMs connected via L2STS do not have IP addresses configured.

                                The enp3s0 interface on your VMs is designated as the management interface and should be used solely for SSH access. For your experiment, please use the data plane interfaces, which are enp7s0 on both VMs.

                                I recommend exploring the JH example Wide Area Link (Layer 2), which covers manual, auto, and user-defined configurations and demonstrates how IP addresses should be set up. Please let us know if you encounter any further issues.

                                Snapshot from the VMs:


                                root@4f3a79fa-6e29-454e-9ec4-d1bfbda81a17-bapi-v2:~# ifconfig -a
                                enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
                                        inet 10.30.6.167 netmask 255.255.254.0 broadcast 10.30.7.255
                                        inet6 fe80::f816:3eff:fe82:7b9 prefixlen 64 scopeid 0x20
                                        inet6 2001:400:a100:3070:f816:3eff:fe82:7b9 prefixlen 64 scopeid 0x0
                                        ether fa:16:3e:82:07:b9 txqueuelen 1000 (Ethernet)
                                        RX packets 51778 bytes 150077282 (150.0 MB)
                                        RX errors 0 dropped 0 overruns 0 frame 0
                                        TX packets 25537 bytes 2608566 (2.6 MB)
                                        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

                                enp6s0: flags=4098<BROADCAST,MULTICAST> mtu 1500
                                        ether 06:b7:27:d2:b5:0b txqueuelen 1000 (Ethernet)
                                        RX packets 0 bytes 0 (0.0 B)
                                        RX errors 0 dropped 0 overruns 0 frame 0
                                        TX packets 0 bytes 0 (0.0 B)
                                        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

                                lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
                                        inet 127.0.0.1 netmask 255.0.0.0
                                        inet6 ::1 prefixlen 128 scopeid 0x10
                                        loop txqueuelen 1000 (Local Loopback)
                                        RX packets 178 bytes 23663 (23.6 KB)
                                        RX errors 0 dropped 0 overruns 0 frame 0
                                        TX packets 178 bytes 23663 (23.6 KB)
                                        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

                                root@bd65ee61-46a2-4cb2-b89e-c6b385052336-bapi-vm1:~# ifconfig -a
                                enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
                                        inet 10.20.5.38 netmask 255.255.254.0 broadcast 10.20.5.255
                                        inet6 fe80::f816:3eff:fe55:c84f prefixlen 64 scopeid 0x20
                                        ether fa:16:3e:55:c8:4f txqueuelen 1000 (Ethernet)
                                        RX packets 15231 bytes 146475806 (146.4 MB)
                                        RX errors 0 dropped 0 overruns 0 frame 0
                                        TX packets 13258 bytes 1020159 (1.0 MB)
                                        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

                                enp7s0: flags=4098<BROADCAST,MULTICAST> mtu 1500
                                        ether 16:8a:89:5e:75:97 txqueuelen 1000 (Ethernet)
                                        RX packets 0 bytes 0 (0.0 B)
                                        RX errors 0 dropped 0 overruns 0 frame 0
                                        TX packets 0 bytes 0 (0.0 B)
                                        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

                                lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
                                        inet 127.0.0.1 netmask 255.0.0.0
                                        inet6 ::1 prefixlen 128 scopeid 0x10
                                        loop txqueuelen 1000 (Local Loopback)
                                        RX packets 238 bytes 37767 (37.7 KB)
                                        RX errors 0 dropped 0 overruns 0 frame 0
                                        TX packets 238 bytes 37767 (37.7 KB)
                                        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
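Interfaces with no IPv4 address, like the data plane interfaces in the snapshot above, can also be spotted programmatically. The following is a minimal sketch that scans `ifconfig -a` text for interfaces without an `inet` line; the function name `interfaces_without_ipv4` is made up for illustration, and the parsing assumes the net-tools output layout shown above.

```python
import re

# An interface block starts with a line such as "enp7s0: flags=4098<...> mtu 1500".
HEADER = re.compile(r"^(\S+): flags=")


def interfaces_without_ipv4(ifconfig_text):
    """Return names of non-loopback interfaces that have no 'inet' line."""
    has_ipv4 = {}
    current = None
    for raw in ifconfig_text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            has_ipv4[current] = False
        elif current is not None and line.startswith("inet "):
            # "inet6 ..." lines do not match because of the trailing space.
            has_ipv4[current] = True
    return [name for name, found in has_ipv4.items() if not found and name != "lo"]
```

Any interface this returns still needs an address assigned before traffic can flow over the L2STS link.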

                                Thanks,

                                Komal

                                in reply to: Unable to reserve slice #8166
                                Komal Thareja
                                Participant

                                  Hi Kriti,

                                  I think the attachment got lost. Could you please email it to me directly at kthare10@renci.org?

                                  Thanks,

                                  Komal
