yoursunny

Forum Replies Created

Viewing 15 posts - 1 through 15 (of 48 total)

Author

Posts
March 21, 2024 at 9:41 am in reply to: failed lease update – all units failed in priming #6856
yoursunny
Participant
Mert Cevik wrote:

On the IPv4 sites, number of VMs that can be provisioned is limited with the available IPv4 addresses in the subnet.

Can “management IP address” show up as a resource on the Resources page of the FABRIC Portal?
This would allow experimenters to avoid hitting this limitation.
January 9, 2024 at 8:27 am in reply to: Fabric Testbed is open and ready for use! #6270
yoursunny
Participant
Oversubscription support – EDC and EDUKY sites have been enabled to support CPU over subscription.

I remember the CPU core capacity of STAR site was 384.
It’s now 768.
Did this site receive new hardware or is it oversubscription?
January 2, 2024 at 7:13 pm in reply to: Not able to create slice #6236
yoursunny
Participant
The entire FABRIC is red this week: https://portal.fabric-testbed.net/resources/all

See notice here: https://learn.fabric-testbed.net/forums/topic/multi-day-fabric-maintenance-january-1-5-2024/

Look at this part:

On other sites, slices will continue running, and slivers will be accessible during the maintenance, however, we will place the testbed in maintenance mode between Jan 1-5, therefore it will not be possible to perform slice operations (create, extend, delete).
December 13, 2023 at 10:42 am in reply to: Multi-day FABRIC maintenance (January 1-5, 2024) #6188
yoursunny
Participant
Will the FABNetv4Ext peering point (located in WASH) be affected?
November 9, 2023 at 12:06 pm in reply to: “channel 0: open failed: connect failed: No route to host” Error #6080
yoursunny
Participant
channel 0: open failed: connect failed: No route to host

This error typically means the VM is turned off or deleted, or is otherwise unreachable via the management network interface.

The fastest solution is simply deleting the experiment slice and creating a new one.
November 7, 2023 at 11:33 am in reply to: error when attempting to numa_tune #6040
yoursunny
Participant
Komal Thareja wrote:

Upper limit for a VM connected with only one component would map to a single Numa Node.

What happens if there are multiple components that are on distinct NUMA sockets?
Is it possible to specify how much RAM to pin to each NUMA socket?

Komal Thareja wrote:

Max limit on memory for a numa node is 64G so exceeding that limit would not work.

If we pin a CPU core or certain amount of RAM onto a NUMA socket, does it prevent other VMs from using the same CPU core or RAM capacity?
September 25, 2023 at 8:19 am in reply to: Communication between nodes on same site #5434
yoursunny
Participant
Is there a way where I can convert my rspec into a FABRIC native object?

This would make a nice hackathon project. It should be feasible to make a converter that covers the topology, links, IP assignments, and startup scripts.

It would be even better if the FABRIC platform could directly accept a Request RSpec and return a Manifest RSpec, so that existing tooling for Emulab can still work.
September 25, 2023 at 4:54 am in reply to: Communication between nodes on same site #5431
yoursunny
Participant
Management IP addresses cannot communicate with each other; this is to prevent abuse.

You should create a FABNetv4 or FABNetv6 network service in each slice. They can all communicate with each other, regardless of whether the slices are on the same or different sites.
September 22, 2023 at 6:32 pm in reply to: FABNetv4/FABNetv6 gateway is not IP address #5425
yoursunny
Participant
NetworkService in the affected slice:
```
    <node id="73" labels=":GraphNode:NetworkService">
      <data key="d0">:GraphNode:NetworkService</data>
      <data key="d1">["bbf6a0a7-8981-4613-b797-0960e7e8ea9d", "node+amst-data-sw:ip+192.168.42.3-ipv4-ns"]</data>
      <data key="d21">L3</data>
      <data key="d8">{"fablib_data": {"instantiated": "False", "mode": "manual"}}</data>
      <data key="d22">{"ipv4": "10.145.7.1", "ipv4_subnet": "10.145.7.0/24"}</data>
      <data key="d3">{"error_message": "", "reservation_id": "4c9b702b-1346-4fe5-b61e-f5cb7790e75f", "reservation_state": "Active"}</data>
      <data key="d15">NetworkService</data>
      <data key="d10">false</data>
      <data key="d11">AMST</data>
      <data key="d13">FABNetv4</data>
      <data key="d14">8</data>
      <data key="d9">net4</data>
      <data key="d16">be2e2e72-5bdd-4301-98aa-bb9e3fe23a56</data>
      <data key="d17">98452967-6246-4517-a030-7d76d7044d05</data>
    </node>
```
NetworkService in a “normal” slice:
```
    <node id="18" labels=":GraphNode:NetworkService">
      <data key="d0">:GraphNode:NetworkService</data>
      <data key="d4">{"fablib_data": {"instantiated": "True", "mode": "manual", "subnet": {"subnet": "10.138.131.0/24", "allocated_ips": ["10.138.131.1"], "gateway": "10.138.131.1"}}}</data>
      <data key="d22">{"ipv4": "10.138.131.1", "ipv4_subnet": "10.138.131.0/24"}</data>
      <data key="d2">["bbf6a0a7-8981-4613-b797-0960e7e8ea9d", "node+atla-data-sw:ip+192.168.33.3-ipv4-ns"]</data>
      <data key="d3">{"error_message": "", "reservation_id": "17d077ce-b66e-48e0-aafb-a48033a02ff1", "reservation_state": "Active"}</data>
      <data key="d10">LAN</data>
      <data key="d11">false</data>
      <data key="d12">ATLA</data>
      <data key="d21">L3</data>
      <data key="d13">FABNetv4</data>
      <data key="d15">NetworkService</data>
      <data key="d16">c0950a90-1312-4ac4-a7e9-a87aa52d8fda</data>
      <data key="d17">553597ac-26ea-4729-97cd-f9f1b6a23ea9</data>
      <data key="d14">5</data>
    </node>
```
I think instantiated: False is the problem. network.get_gateway() would not return the IP in this case.
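The difference between the two nodes can be checked mechanically. Below is a minimal sketch that parses one `<node>` fragment with the Python standard library and reports the fablib metadata next to the IP stored in the model. The function name `summarize_network_service` and the keys of the returned dict are hypothetical, chosen for illustration; the sketch matches on JSON content rather than on `key` ids, because the ids differ between the two slices (`d8` vs `d4`).

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the affected NetworkService node quoted above.
AFFECTED_NODE_XML = """
<node id="73" labels=":GraphNode:NetworkService">
  <data key="d21">L3</data>
  <data key="d8">{"fablib_data": {"instantiated": "False", "mode": "manual"}}</data>
  <data key="d22">{"ipv4": "10.145.7.1", "ipv4_subnet": "10.145.7.0/24"}</data>
  <data key="d10">false</data>
  <data key="d13">FABNetv4</data>
</node>
"""


def summarize_network_service(node_xml: str) -> dict:
    """Collect fablib metadata and the model-level IPv4 address of one node."""
    node = ET.fromstring(node_xml.strip())
    summary = {"instantiated": None, "fablib_gateway": None, "model_ipv4": None}
    for elem in node.iter():
        # Match <data> elements with or without an XML namespace prefix.
        if not elem.tag.endswith("data") or not elem.text:
            continue
        try:
            value = json.loads(elem.text)
        except json.JSONDecodeError:
            continue  # plain strings such as "L3" or "FABNetv4"
        if not isinstance(value, dict):
            continue  # JSON scalars and lists such as "false" or the d1 list
        if "fablib_data" in value:
            fablib = value["fablib_data"]
            summary["instantiated"] = fablib.get("instantiated")
            summary["fablib_gateway"] = fablib.get("subnet", {}).get("gateway")
        if "ipv4" in value:
            summary["model_ipv4"] = value["ipv4"]
    return summary


if __name__ == "__main__":
    print(summarize_network_service(AFFECTED_NODE_XML))
```

For the affected node, the model carries an IPv4 address while the fablib metadata has no gateway, which is consistent with the guess above.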
September 22, 2023 at 6:25 pm in reply to: FABNetv4/FABNetv6 gateway is not IP address #5423
yoursunny
Participant
GraphML file here: https://cdn1.frocdn.ch/oq8kXrhxQRTtAR4.graphml

I can definitely see the IP addresses in the GraphML file, but they are not showing up in the list_node_and_networks.ipynb notebook or in the net.get_gateway() function call.

I’m using the “default 08/22/2023” JupyterHub environment, with these package versions:
```
fabric@fall:work-10%$ pip list | grep fabric
fabric 3.2.2
fabric-credmgr-client 1.5.2
fabric_fim 1.5.5
fabric_fss_utils 1.5.1
fabric-orchestrator-client 1.5.5
fabrictestbed 1.5.6
fabrictestbed-extensions 1.5.4
fabrictestbed-mflib 1.0.3
```
- This reply was modified 7 months, 2 weeks ago by yoursunny.
September 20, 2023 at 12:10 pm in reply to: STAR site power loss, connectivity losses #5352
yoursunny
Participant
FABNetv4Ext establishment is working, but I’m seeing connectivity issues to many destinations.
```
ubuntu@v4gateway:~$ mtr -4bwz -c4 --tcp -P 6363 hobo.cs.arizona.edu
Start: 2023-09-20T16:06:38+0000
HOST: v4gateway                                                                     Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS398900 23.134.233.81                                                          0.0%     4    0.5   0.5   0.5   0.5   0.0
  2. AS???    10.133.0.141                                                           0.0%     4   13.2  13.2  13.1  13.2   0.0
  3. AS11537  hundredge-0-0-0-28.1000.core1.wash.net.internet2.edu (198.71.45.162)   0.0%     4   15.4  15.1  14.5  15.7   0.5
  4. AS???    ???                                                                   100.0     4    0.0   0.0   0.0   0.0   0.0

ubuntu@v4gateway:~$ mtr -4bwz -c4 --tcp -P 5201 ash.speedtest.clouvider.net
Start: 2023-09-20T16:07:49+0000
HOST: v4gateway              Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS398900 23.134.233.81   0.0%     4    0.5   0.5   0.5   0.5   0.0
  2. AS???    10.133.0.141    0.0%     4   13.4  13.2  13.1  13.4   0.1
  3. AS???    ???            100.0     4    0.0   0.0   0.0   0.0   0.0
```
Maybe some routing adjustment is needed too?
September 19, 2023 at 11:59 am in reply to: STAR site power loss, connectivity losses #5347
yoursunny
Participant
The STAR outage seems to be affecting the creation of FABNetv4Ext networks. It seems that the control software is trying to access the STAR switch and timing out. This occurs even if the node is at the WASH site, where the FABNetv4Ext peering connection exists.

Slice Exception: Slice Name: v4gateway@1695137544, Slice ID: f20f1cff-11b0-4db9-9ffb-5b265c3653b6: Slice Exception: Slice Name: v4gateway@1695137544, Slice ID: f20f1cff-11b0-4db9-9ffb-5b265c3653b6: Node: gateway, Site: PSC, State: Active,
Slice Exception: Slice Name: v4gateway@1695137544, Slice ID: f20f1cff-11b0-4db9-9ffb-5b265c3653b6: Slice Exception: Slice Name: v4gateway@1695137544, Slice ID: f20f1cff-11b0-4db9-9ffb-5b265c3653b6: Node: gateway, Site: PSC, State: Active,

failed lease update- all units failed priming: Exception during modify for unit: 5a8383f3-30aa-41d8-9874-46b61ebbe621 Playbook has failed tasks: NSO commit returned JSON-RPC error: type: rpc.method.failed, code: -32000, message: Method failed, data: message: Failed to connect to device star-data-sw: connection refused: NEDCOM CONNECT: The kexTimeout (20000 ms) expired. in new state, internal: jsonrpc_tx_commit357#all units failed priming: Exception during modify for unit: 5a8383f3-30aa-41d8-9874-46b61ebbe621 Playbook has failed tasks: NSO commit returned JSON-RPC error: type: rpc.method.failed, code: -32000, message: Method failed, data: message: Failed to connect to device star-data-sw: connection refused: NEDCOM CONNECT: The kexTimeout (20000 ms) expired. in new state, internal: jsonrpc_tx_commit357#

The control software should choose alternate paths to reach the peering port. The control software should skip switches in maintenance, and attempt to re-apply the configuration when the maintenance mode is lifted.
September 6, 2023 at 2:04 pm in reply to: FABRIC Nat64 solution obviates the need for custom DNS in IPv6 sites #5233
yoursunny
Participant
Instead of having users add a hosts entry (which would require changes at every level, including inside containers), can the DNS64 server be configured to return this IP?
- This reply was modified 8 months ago by yoursunny.
September 6, 2023 at 1:40 pm in reply to: FABRIC Nat64 solution obviates the need for custom DNS in IPv6 sites #5230
yoursunny
Participant
I’m seeing an “Unable to establish SSL connection” error when trying to download from GitHub releases:
```
ubuntu@N0:~$ wget --timeout=10s -v https://github.com/TomWright/dasel/releases/download/v2.3.4/dasel_linux_amd64
--2023-09-06 17:24:18-- https://github.com/TomWright/dasel/releases/download/v2.3.4/dasel_linux_amd64
Resolving github.com (github.com)... 2600:2701:5000:5001::8c52:7104, 140.82.113.4
Connecting to github.com (github.com)|2600:2701:5000:5001::8c52:7104|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/297615696/dfe35302-5ee7-42cf-939d-345b67a2091d?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230906%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230906T172418Z&X-Amz-Expires=300&X-Amz-Signature=cdb822adb0af2026b86b8fae886e28358b27bb48551182c5ee95e03a946b4353&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=297615696&response-content-disposition=attachment%3B%20filename%3Ddasel_linux_amd64&response-content-type=application%2Foctet-stream [following]
--2023-09-06 17:24:18-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/297615696/dfe35302-5ee7-42cf-939d-345b67a2091d?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230906%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230906T172418Z&X-Amz-Expires=300&X-Amz-Signature=cdb822adb0af2026b86b8fae886e28358b27bb48551182c5ee95e03a946b4353&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=297615696&response-content-disposition=attachment%3B%20filename%3Ddasel_linux_amd64&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 2600:2701:5000:5001::b9c7:6d85, 2600:2701:5000:5001::b9c7:6e85, 2600:2701:5000:5001::b9c7:6c85, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|2600:2701:5000:5001::b9c7:6d85|:443... connected.
Unable to establish SSL connection.
```
Is the NAT64 gateway being blocked by the GitHub releases download server?

tcpdump of the transaction: https://cdn1.frocdn.ch/JTeh94VJIxkXv6P.pcap
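
The AAAA answers in the wget output look like DNS64-synthesized addresses that carry the real IPv4 address in their low 32 bits. The sketch below decodes them, assuming the translation prefix is `2600:2701:5000:5001::/96`; that prefix is inferred from the output above, not taken from FABRIC documentation, and the helper name `embedded_ipv4` is hypothetical.

```python
import ipaddress
from typing import Optional

# Assumption: /96 translation prefix inferred from the wget output above.
NAT64_PREFIX = ipaddress.ip_network("2600:2701:5000:5001::/96")


def embedded_ipv4(addr: str) -> Optional[ipaddress.IPv4Address]:
    """Return the IPv4 address embedded in a synthesized IPv6 address.

    Returns None when the address is outside the assumed NAT64 prefix.
    """
    ip6 = ipaddress.IPv6Address(addr)
    if ip6 not in NAT64_PREFIX:
        return None
    # With a /96 prefix, the low 32 bits are the original IPv4 address.
    return ipaddress.IPv4Address(int(ip6) & 0xFFFFFFFF)


if __name__ == "__main__":
    for text in ("2600:2701:5000:5001::8c52:7104",
                 "2600:2701:5000:5001::b9c7:6d85"):
        print(text, "->", embedded_ipv4(text))
```

The first address decodes to 140.82.113.4, the same A record that wget printed for github.com, so the failing TLS connection is going through the NAT64 path rather than native IPv6.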
- This reply was modified 8 months ago by yoursunny.
September 5, 2023 at 12:01 pm in reply to: FABRIC Nat64 solution obviates the need for custom DNS in IPv6 sites #5222
yoursunny
Participant
I found an unintended consequence of enabling NAT64:
1. I sometimes want multiple slices to communicate with each other, while each slice can be re-deployed independently.
2. To do so, I’m using FABNetv4 network service, paired with an external domain name that supports dynamic updates.
3. When a “server” slice is re-deployed, it updates the domain name to point to its new FABNetv4 IP address.
4. Previously, this worked well: the “client” slice could find the “server” slice by resolving the domain name.
5. Now that NAT64 is deployed, the “client” slice resolves both A and AAAA records on the domain name.
6. If the “client” software tries to connect to the IPv6 address in the AAAA records, it cannot reach the FABNetv4 destination.
My suggestion is to configure the DNS64 server so that it does not return AAAA records if the domain name resolves to an IPv4 address that is part of FABNetv4 or another RFC 1918 range.
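
The suggested policy can be sketched as a small decision function. This is only an illustration of the rule, not a configuration for any particular DNS64 server; the function name `should_synthesize_aaaa` is hypothetical, and the sketch treats FABNetv4 addresses as covered by the RFC 1918 ranges, which holds for the 10.x addresses seen in my slices.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def should_synthesize_aaaa(a_records):
    """Decide whether DNS64 should synthesize AAAA records for a name.

    a_records: IPv4 addresses (as strings) from the name's A records.
    Returns False if any of them is an RFC 1918 address, because such a
    destination cannot be reached through the NAT64 gateway.
    """
    for text in a_records:
        addr = ipaddress.IPv4Address(text)
        if any(addr in net for net in RFC1918_NETWORKS):
            return False
    return True


if __name__ == "__main__":
    print(should_synthesize_aaaa(["140.82.113.4"]))  # public address
    print(should_synthesize_aaaa(["10.145.7.1"]))    # FABNetv4-style address
```

Until something like this is in place on the server side, a client can sidestep the problem by resolving with `socket.getaddrinfo(host, port, family=socket.AF_INET)`, which requests IPv4 results only.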