yoursunny

Forum Replies Created

Viewing 15 posts - 16 through 30 (of 60 total)
  • in reply to: Multi-day FABRIC maintenance (January 1-5, 2024) #6188
    yoursunny
    Participant

      Will the FABNetv4Ext peering point (located in WASH) be affected?

      yoursunny
      Participant

        channel 0: open failed: connect failed: No route to host

        This error typically means the VM is turned off or deleted, or is otherwise unreachable via its management network interface.

        The fastest solution is simply deleting the experiment slice and creating a new one.

        in reply to: error when attempting to numa_tune #6040
        yoursunny
        Participant

          Upper limit for a VM connected with only one component would map to a single Numa Node.

          What happens if there are multiple components that are on distinct NUMA sockets?
          Is it possible to specify how much RAM to pin to each NUMA socket?

          Max limit on memory for a numa node is 64G so exceeding that limit would not work.

          If we pin a CPU core or certain amount of RAM onto a NUMA socket, does it prevent other VMs from using the same CPU core or RAM capacity?

          in reply to: Communication between nodes on same site #5434
          yoursunny
          Participant

            Is there a way where I can convert my rspec into a FABRIC native object?

            This would make a nice hackathon project. It should be feasible to make a converter that covers the topology, links, IP assignments, and startup scripts.

            It would be even better if the FABRIC platform could directly accept a Request RSpec and return a Manifest RSpec, so that existing tooling for Emulab can still work.

            in reply to: Communication between nodes on same site #5431
            yoursunny
            Participant

              Management IP addresses cannot communicate with each other; this is blocked to prevent abuse.

              You should create a FABNetv4 or FABNetv6 network service in each slice. They can all communicate with each other, regardless of whether the slices are on the same site or on different sites.

              in reply to: FABNetv4/FABNetv6 gateway is not IP address #5425
              yoursunny
              Participant

                NetworkService in the affected slice:

                    <node id="73" labels=":GraphNode:NetworkService">
                      <data key="d0">:GraphNode:NetworkService</data>
                      <data key="d1">["bbf6a0a7-8981-4613-b797-0960e7e8ea9d", "node+amst-data-sw:ip+192.168.42.3-ipv4-ns"]</data>
                      <data key="d21">L3</data>
                      <data key="d8">{"fablib_data": {"instantiated": "False", "mode": "manual"}}</data>
                      <data key="d22">{"ipv4": "10.145.7.1", "ipv4_subnet": "10.145.7.0/24"}</data>
                      <data key="d3">{"error_message": "", "reservation_id": "4c9b702b-1346-4fe5-b61e-f5cb7790e75f", "reservation_state": "Active"}</data>
                      <data key="d15">NetworkService</data>
                      <data key="d10">false</data>
                      <data key="d11">AMST</data>
                      <data key="d13">FABNetv4</data>
                      <data key="d14">8</data>
                      <data key="d9">net4</data>
                      <data key="d16">be2e2e72-5bdd-4301-98aa-bb9e3fe23a56</data>
                      <data key="d17">98452967-6246-4517-a030-7d76d7044d05</data>
                    </node>

                NetworkService in a “normal” slice:

                    <node id="18" labels=":GraphNode:NetworkService">
                      <data key="d0">:GraphNode:NetworkService</data>
                      <data key="d4">{"fablib_data": {"instantiated": "True", "mode": "manual", "subnet": {"subnet": "10.138.131.0/24", "allocated_ips": ["10.138.131.1"], "gateway": "10.138.131.1"}}}</data>
                      <data key="d22">{"ipv4": "10.138.131.1", "ipv4_subnet": "10.138.131.0/24"}</data>
                      <data key="d2">["bbf6a0a7-8981-4613-b797-0960e7e8ea9d", "node+atla-data-sw:ip+192.168.33.3-ipv4-ns"]</data>
                      <data key="d3">{"error_message": "", "reservation_id": "17d077ce-b66e-48e0-aafb-a48033a02ff1", "reservation_state": "Active"}</data>
                      <data key="d10">LAN</data>
                      <data key="d11">false</data>
                      <data key="d12">ATLA</data>
                      <data key="d21">L3</data>
                      <data key="d13">FABNetv4</data>
                      <data key="d15">NetworkService</data>
                      <data key="d16">c0950a90-1312-4ac4-a7e9-a87aa52d8fda</data>
                      <data key="d17">553597ac-26ea-4729-97cd-f9f1b6a23ea9</data>
                      <data key="d14">5</data>
                    </node>

                I think instantiated: False is the problem. network.get_gateway() would not return the IP in this case.
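The comparison above can be automated. The following is a minimal sketch, not part of fablib, that scans a GraphML document for NetworkService nodes and reports the `instantiated` flag next to the IPv4 gateway stored in the layer-3 data. The function name `summarize_network_services` and the trimmed `SAMPLE` document are illustrative assumptions; the key IDs (`d4`, `d8`, `d22`) differ between files, so the sketch inspects every JSON-valued `<data>` element instead of relying on them.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed-down sample built from the two nodes quoted above (assumption:
# a real GraphML file wraps these in <graphml>/<graph> with a namespace).
SAMPLE = """<graph>
  <node id="73" labels=":GraphNode:NetworkService">
    <data key="d8">{"fablib_data": {"instantiated": "False", "mode": "manual"}}</data>
    <data key="d22">{"ipv4": "10.145.7.1", "ipv4_subnet": "10.145.7.0/24"}</data>
  </node>
  <node id="18" labels=":GraphNode:NetworkService">
    <data key="d4">{"fablib_data": {"instantiated": "True", "mode": "manual"}}</data>
    <data key="d22">{"ipv4": "10.138.131.1", "ipv4_subnet": "10.138.131.0/24"}</data>
  </node>
</graph>"""


def local_name(tag):
    """Strip an XML namespace such as '{http://...}node' down to 'node'."""
    return tag.rsplit("}", 1)[-1]


def summarize_network_services(graphml_text):
    """Map node id -> {'instantiated': ..., 'ipv4': ...} for NetworkService nodes."""
    root = ET.fromstring(graphml_text)
    summary = {}
    for node in root.iter():
        if local_name(node.tag) != "node":
            continue
        if "NetworkService" not in node.get("labels", ""):
            continue
        info = {"instantiated": None, "ipv4": None}
        for data in node:
            text = (data.text or "").strip()
            if not text.startswith("{"):
                continue  # not a JSON object, e.g. a plain string property
            try:
                value = json.loads(text)
            except json.JSONDecodeError:
                continue
            if "fablib_data" in value:
                info["instantiated"] = value["fablib_data"].get("instantiated")
            if "ipv4" in value:
                info["ipv4"] = value["ipv4"]
        summary[node.get("id")] = info
    return summary


if __name__ == "__main__":
    for node_id, info in summarize_network_services(SAMPLE).items():
        print(node_id, info)
```

A node that has an `ipv4` value but `instantiated` set to `"False"` matches the symptom described here: the address exists in the model but is not surfaced as the gateway.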

                in reply to: FABNetv4/FABNetv6 gateway is not IP address #5423
                yoursunny
                Participant

                  GraphML file here: https://cdn1.frocdn.ch/oq8kXrhxQRTtAR4.graphml

                  I can definitely see the IP addresses in the GraphML file, but they are not showing up in the list_node_and_networks.ipynb notebook or in the result of the net.get_gateway() function call.

                  I’m using the “default 08/22/2023” JupyterHub environment, with these package versions:

                  fabric@fall:work-10%$ pip list | grep fabric
                  fabric 3.2.2
                  fabric-credmgr-client 1.5.2
                  fabric_fim 1.5.5
                  fabric_fss_utils 1.5.1
                  fabric-orchestrator-client 1.5.5
                  fabrictestbed 1.5.6
                  fabrictestbed-extensions 1.5.4
                  fabrictestbed-mflib 1.0.3
                  
                  • This reply was modified 1 year, 12 months ago by yoursunny.
                  in reply to: STAR site power loss, connectivity losses #5352
                  yoursunny
                  Participant

                    FABNetv4Ext establishment is working, but I’m seeing connectivity issues to many destinations.

                    ubuntu@v4gateway:~$ mtr -4bwz -c4 --tcp -P 6363 hobo.cs.arizona.edu
                    Start: 2023-09-20T16:06:38+0000
                    HOST: v4gateway                                                                     Loss%   Snt   Last   Avg  Best  Wrst StDev
                      1. AS398900 23.134.233.81                                                          0.0%     4    0.5   0.5   0.5   0.5   0.0
                      2. AS???    10.133.0.141                                                           0.0%     4   13.2  13.2  13.1  13.2   0.0
                      3. AS11537  hundredge-0-0-0-28.1000.core1.wash.net.internet2.edu (198.71.45.162)   0.0%     4   15.4  15.1  14.5  15.7   0.5
                      4. AS???    ???                                                                   100.0     4    0.0   0.0   0.0   0.0   0.0
                    
                    ubuntu@v4gateway:~$ mtr -4bwz -c4 --tcp -P 5201 ash.speedtest.clouvider.net
                    Start: 2023-09-20T16:07:49+0000
                    HOST: v4gateway              Loss%   Snt   Last   Avg  Best  Wrst StDev
                      1. AS398900 23.134.233.81   0.0%     4    0.5   0.5   0.5   0.5   0.0
                      2. AS???    10.133.0.141    0.0%     4   13.4  13.2  13.1  13.4   0.1
                      3. AS???    ???            100.0     4    0.0   0.0   0.0   0.0   0.0
                    

                    Maybe some routing adjustment is needed too?

                    in reply to: STAR site power loss, connectivity losses #5347
                    yoursunny
                    Participant

                      The STAR outage seems to be affecting the creation of FABNetv4Ext networks. It seems that the control software is trying to access the STAR switch and it times out. This occurs even if the node is in WASH site where the FABNetv4Ext peering connection exists.

                      Slice Exception: Slice Name: v4gateway@1695137544, Slice ID: f20f1cff-11b0-4db9-9ffb-5b265c3653b6: Slice Exception: Slice Name: v4gateway@1695137544, Slice ID: f20f1cff-11b0-4db9-9ffb-5b265c3653b6: Node: gateway, Site: PSC, State: Active,
                      Slice Exception: Slice Name: v4gateway@1695137544, Slice ID: f20f1cff-11b0-4db9-9ffb-5b265c3653b6: Slice Exception: Slice Name: v4gateway@1695137544, Slice ID: f20f1cff-11b0-4db9-9ffb-5b265c3653b6: Node: gateway, Site: PSC, State: Active,

                      failed lease update- all units failed priming: Exception during modify for unit: 5a8383f3-30aa-41d8-9874-46b61ebbe621 Playbook has failed tasks: NSO commit returned JSON-RPC error: type: rpc.method.failed, code: -32000, message: Method failed, data: message: Failed to connect to device star-data-sw: connection refused: NEDCOM CONNECT: The kexTimeout (20000 ms) expired. in new state, internal: jsonrpc_tx_commit357#all units failed priming: Exception during modify for unit: 5a8383f3-30aa-41d8-9874-46b61ebbe621 Playbook has failed tasks: NSO commit returned JSON-RPC error: type: rpc.method.failed, code: -32000, message: Method failed, data: message: Failed to connect to device star-data-sw: connection refused: NEDCOM CONNECT: The kexTimeout (20000 ms) expired. in new state, internal: jsonrpc_tx_commit357#

                      The control software should choose alternate paths to reach the peering port. The control software should skip switches in maintenance, and attempt to re-apply the configuration when the maintenance mode is lifted.
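To illustrate the path-selection part of this suggestion, here is a minimal sketch of a breadth-first search that ignores switches flagged as being in maintenance. It is not how the FABRIC control software works; the function `find_path`, the link list, and the site names `ALT1` and `ALT2` are hypothetical, and only PSC, STAR, and WASH come from the report above. Re-applying deferred configuration once maintenance ends is not covered.

```python
from collections import deque


def find_path(links, src, dst, in_maintenance=frozenset()):
    """Return the shortest hop-count path from src to dst, or None.

    links is an iterable of undirected (site_a, site_b) pairs.
    Sites listed in in_maintenance are never traversed.
    """
    if src in in_maintenance or dst in in_maintenance:
        return None
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    previous = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        current = queue.popleft()
        if current == dst:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        # sorted() keeps the result deterministic between runs
        for neighbor in sorted(adjacency.get(current, ())):
            if neighbor in previous or neighbor in in_maintenance:
                continue
            previous[neighbor] = current
            queue.append(neighbor)
    return None


# Hypothetical topology: a short path through STAR and a longer detour.
LINKS = [
    ("PSC", "STAR"), ("STAR", "WASH"),
    ("PSC", "ALT1"), ("ALT1", "ALT2"), ("ALT2", "WASH"),
]

if __name__ == "__main__":
    print(find_path(LINKS, "PSC", "WASH"))
    print(find_path(LINKS, "PSC", "WASH", in_maintenance={"STAR"}))
```

With STAR excluded, the search falls back to the longer detour instead of failing the whole request.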

                      yoursunny
                      Participant

                        Instead of having users add a hosts entry (which would require changes at every level, including inside containers), can the DNS64 server be configured to return this IP?

                        • This reply was modified 2 years ago by yoursunny.
                        yoursunny
                        Participant

                          I’m seeing “Unable to establish SSL connection” error when trying to download from GitHub releases:

                          ubuntu@N0:~$ wget --timeout=10s -v https://github.com/TomWright/dasel/releases/download/v2.3.4/dasel_linux_amd64
                          --2023-09-06 17:24:18-- https://github.com/TomWright/dasel/releases/download/v2.3.4/dasel_linux_amd64
                          Resolving github.com (github.com)... 2600:2701:5000:5001::8c52:7104, 140.82.113.4
                          Connecting to github.com (github.com)|2600:2701:5000:5001::8c52:7104|:443... connected.
                          HTTP request sent, awaiting response... 302 Found
                          Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/297615696/dfe35302-5ee7-42cf-939d-345b67a2091d?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230906%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230906T172418Z&X-Amz-Expires=300&X-Amz-Signature=cdb822adb0af2026b86b8fae886e28358b27bb48551182c5ee95e03a946b4353&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=297615696&response-content-disposition=attachment%3B%20filename%3Ddasel_linux_amd64&response-content-type=application%2Foctet-stream [following]
                          --2023-09-06 17:24:18-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/297615696/dfe35302-5ee7-42cf-939d-345b67a2091d?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230906%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230906T172418Z&X-Amz-Expires=300&X-Amz-Signature=cdb822adb0af2026b86b8fae886e28358b27bb48551182c5ee95e03a946b4353&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=297615696&response-content-disposition=attachment%3B%20filename%3Ddasel_linux_amd64&response-content-type=application%2Foctet-stream
                          Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 2600:2701:5000:5001::b9c7:6d85, 2600:2701:5000:5001::b9c7:6e85, 2600:2701:5000:5001::b9c7:6c85, ...
                          Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|2600:2701:5000:5001::b9c7:6d85|:443... connected.
                          Unable to establish SSL connection.

                          Is the NAT64 gateway being blocked by GitHub releases download server?

                          tcpdump of the transaction: https://cdn1.frocdn.ch/JTeh94VJIxkXv6P.pcap

                          • This reply was modified 2 years ago by yoursunny.
                          yoursunny
                          Participant

                            I found an unintended consequence of enabling NAT64:

                            1. I sometimes want multiple slices to communicate with each other, while each slice can be re-deployed independently.
                            2. To do so, I’m using FABNetv4 network service, paired with an external domain name that supports dynamic updates.
                            3. When a “server” slice is re-deployed, it updates the domain name to point to its new FABNetv4 IP address.
                            4. Previously, this worked well: the “client” slice could find the “server” slice by resolving the domain name.
                            5. Now that NAT64 is deployed, the “client” slice resolves both A and AAAA records on the domain name.
                            6. If the “client” software tries to connect to the IPv6 address in the AAAA records, it cannot reach the FABNetv4 destination.

                            My suggestion is to configure the DNS64 server so that it does not return AAAA records if the domain name resolves to an IPv4 address that is part of FABNetv4 or another RFC 1918 private range.
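The suggested policy can be sketched in a few lines of Python. This is only an illustration of the decision logic, not a DNS64 server configuration. The function names are hypothetical, and the /96 prefix is an assumption inferred from the synthesized addresses seen in the `wget` output elsewhere in this thread (`2600:2701:5000:5001::8c52:7104` for `140.82.113.4`).

```python
import ipaddress

# RFC 1918 private ranges; the FABNetv4 addresses seen in this thread
# (for example 10.138.131.1) fall inside 10.0.0.0/8.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Assumed NAT64 prefix, inferred from observed AAAA answers.
NAT64_PREFIX = ipaddress.ip_network("2600:2701:5000:5001::/96")


def is_private_v4(addr):
    """True if addr is an IPv4 address inside an RFC 1918 range."""
    ip = ipaddress.IPv4Address(addr)
    return any(ip in net for net in RFC1918_NETWORKS)


def synthesize_aaaa(a_records, prefix=NAT64_PREFIX):
    """Return synthesized AAAA answers, or [] if any A record is private.

    Private destinations are not reachable through NAT64, so the client
    should only ever see the A records for such names.
    """
    if any(is_private_v4(a) for a in a_records):
        return []
    base = int(prefix.network_address)
    return [
        str(ipaddress.IPv6Address(base | int(ipaddress.IPv4Address(a))))
        for a in a_records
    ]


if __name__ == "__main__":
    print(synthesize_aaaa(["140.82.113.4"]))  # public: AAAA is synthesized
    print(synthesize_aaaa(["10.138.131.1"]))  # FABNetv4: no AAAA returned
```

Until something like this is in place on the resolver, a client can work around the issue by requesting IPv4 results only, for example with `socket.getaddrinfo(host, port, family=socket.AF_INET)`.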

                            yoursunny
                            Participant

                              Please post your experiment script or notebook, as well as any commands you typed into SSH console.

                              Please describe what you expect to happen in a certain operation, and what actually happened.

                              Please post commands, outputs, error messages in textual format, not as pictures.

                              in reply to: A public IP for the Fabric node #4601
                              yoursunny
                              Participant

                                Yes, you can request a public IPv4 or IPv6 address with the FABNetv4Ext or FABNetv6Ext network service:

                                Network Services in FABRIC

                                There are some examples in my FABRIC scripts repository:

                                https://github.com/yoursunny/fabric

                                in reply to: Why is NDN packets not going through my network #4598
                                yoursunny
                                Participant

                                  If you think OpenVSwitch is causing problems, do not enable it.

                                  NFD alone is capable of forwarding traffic between different nodes.
