IPv6 on FABRIC: A hop with a low MTU
Tagged: ipv6 hops mtu packet_size
- This topic has 15 replies, 6 voices, and was last updated 1 year, 7 months ago by Komal Thareja.
March 22, 2022 at 12:30 pm #1535
We deployed our application for a few minutes and found that our packets were being dropped over IPv6 when the packet size was around 8300 bytes. With smaller packets (~1400 bytes), it works. Our conclusion is that a hop between our nodes is dropping the larger packets.
The nodes themselves do not have a restrictive MTU (it is set to 9000, I believe), so it is likely a hop between the two nodes that is dropping the packets. Is this a known issue, and is it a concern?
I can of course limit the packet size in our application (and that does work), but it would be nice to use larger packets. I was told by other people that FABRIC supports super-size packets or something? Perhaps this can be sorted out through our slice configuration?
Kind regards,
Justin
March 22, 2022 at 3:06 pm #1536
What are the source and destination sites/hosts for this test?
There might be a configuration error in a switch somewhere. Thanks for helping us find it.
Paul
March 22, 2022 at 3:54 pm #1537
I have two of the IPv6 node addresses, if you can find the sites from them (only my partner can access the slice):
- [2001:1948:417:7:f816:3eff:fe94:5413]
- [2001:400:a100:3030:f816:3eff:feef:dba1]
If I find out the sites, I will reply again. To add, we have only tested two sites, so there very well could be more hops that limit packet sizes. We will let you know as we test more! Thank you.
Kind regards,
Justin
- This reply was modified 2 years, 10 months ago by Justin Presley.
March 23, 2022 at 9:13 am #1542
We are experiencing a low-MTU switch when communicating between sites “UTAH” and “STAR”.
March 23, 2022 at 9:35 am #1543
I think I have narrowed this down to STAR and UTAH. I am able to use jumbo frames between these sites on a FABRIC L2 data plane network, but not using the management network (i.e. the public Internet).
What is your data plane network configuration? Which network/IPs are you using for your application? I’m wondering if you are trying to use jumbo frames across the management network. If you use the management network to connect nodes from different sites, your traffic will go over the public Internet. We probably can’t fix MTU issues on the public Internet.
You can test by trying the following command. You can find the MTU by increasing the packet size until the ping starts failing.
ping -M do -s <packet size> <destination IP>
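The manual search described above can be automated. The following is a minimal sketch, not an official FABRIC tool. It assumes Linux iputils `ping` (for the `-M do`, `-c`, `-W`, and `-s` options) and assumes that if a payload of some size passes, every smaller payload passes too. The function names `ping_passes` and `largest_passing_payload` are made up for illustration.

```python
import subprocess
from typing import Callable


def ping_passes(dest: str, payload: int) -> bool:
    """Return True if one unfragmentable ping with `payload` bytes gets a reply."""
    # -M do: forbid fragmentation; -c 1: one probe; -W 2: wait up to 2 s for a reply.
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "1", "-W", "2", "-s", str(payload), dest],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_passing_payload(
    probe: Callable[[int], bool], lo: int = 0, hi: int = 8972
) -> int:
    """Binary-search the largest payload in [lo, hi] for which probe() succeeds.

    Returns -1 if even the smallest payload fails.
    """
    if not probe(lo):
        return -1
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

On a node this could be called as `largest_passing_payload(lambda n: ping_passes("<destination IP>", n))`. A single probe per size can be fooled by random packet loss, so repeating a failed probe before trusting the result is probably wise.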
- This reply was modified 2 years, 10 months ago by Paul Ruth.
April 6, 2023 at 6:15 pm #4059
I’m seeing MTU issues on the data plane network between SALT and UTAH.
The scenario is using NIC_ConnectX_5 NICs and L2PTP network service. I increased the MTU of VLAN netifs to 9000, and assigned IPv4 addresses to both ends.
The maximum ICMP ping size that can pass through is 1472.
ubuntu@9bf529e4-3efc-4988-a9f8-5089ccfa08af-nb:~$ ping -M do -c 4 -s 1472 192.168.8.1
PING 192.168.8.1 (192.168.8.1) 1472(1500) bytes of data.
1480 bytes from 192.168.8.1: icmp_seq=1 ttl=64 time=0.348 ms
1480 bytes from 192.168.8.1: icmp_seq=2 ttl=64 time=0.225 ms
1480 bytes from 192.168.8.1: icmp_seq=3 ttl=64 time=0.227 ms
1480 bytes from 192.168.8.1: icmp_seq=4 ttl=64 time=0.192 ms
--- 192.168.8.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3051ms
rtt min/avg/max/mdev = 0.192/0.248/0.348/0.059 ms
ubuntu@9bf529e4-3efc-4988-a9f8-5089ccfa08af-nb:~$ ping -M do -c 4 -s 1473 192.168.8.1
PING 192.168.8.1 (192.168.8.1) 1473(1501) bytes of data.
--- 192.168.8.1 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3074ms
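For readers mapping `ping -s` values to MTUs: the `1472(1500)` that ping prints is the ICMP payload followed by the full IP packet size. A small sketch of that arithmetic, assuming an 8-byte ICMP echo header, a 20-byte IPv4 header without options, and a 40-byte IPv6 header without extension headers (the helper names are illustrative only):

```python
ICMP_HEADER = 8   # ICMP / ICMPv6 echo header, in bytes
IPV4_HEADER = 20  # IPv4 header without options
IPV6_HEADER = 40  # fixed IPv6 header, no extension headers


def mtu_from_payload(payload: int, ipv6: bool = False) -> int:
    """IP packet size produced by `ping -s payload`."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return payload + ICMP_HEADER + ip_header


def payload_for_mtu(mtu: int, ipv6: bool = False) -> int:
    """Largest `ping -s` value that fits in one packet of size `mtu`."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ICMP_HEADER - ip_header


print(mtu_from_payload(1472))            # 1500
print(payload_for_mtu(9000))             # 8972
print(payload_for_mtu(9000, ipv6=True))  # 8952
```

So a path that passes `-s 1472` but drops `-s 1473` over IPv4 behaves like a path with a 1500-byte MTU.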
April 12, 2023 at 1:31 pm #4080
MTU issue is discovered between MASS and STAR on the experiment network.
I increased MTU of every netif to 9000, but the largest IPv4 ping that can pass through is 1424.
Slice ID: 3b8d1e30-8c17-45b2-9e78-4e59f69cfc3e
ubuntu@NA:~$ ping -M do -c 4 -s 1424 192.168.8.2
PING 192.168.8.2 (192.168.8.2) 1424(1452) bytes of data.
1432 bytes from 192.168.8.2: icmp_seq=1 ttl=64 time=26.8 ms
1432 bytes from 192.168.8.2: icmp_seq=2 ttl=64 time=26.7 ms
1432 bytes from 192.168.8.2: icmp_seq=3 ttl=64 time=26.7 ms
1432 bytes from 192.168.8.2: icmp_seq=4 ttl=64 time=26.7 ms
--- 192.168.8.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 26.659/26.695/26.765/0.041 ms
ubuntu@NA:~$ ping -M do -c 4 -s 1425 192.168.8.2
PING 192.168.8.2 (192.168.8.2) 1425(1453) bytes of data.
--- 192.168.8.2 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3050ms
April 13, 2023 at 5:45 pm #4083
The connection to MASS unfortunately goes through providers that do not support jumbo frames. It is concatenated from two L2 services from regional providers, and no other options exist. We will start updating the topology advertisements you see to indicate the MTU a given link can support; for now, feel free to ask us.
April 13, 2023 at 5:49 pm #4084
We will look into UTAH-SALT. That is an L1 connection, so this is surprising. Thank you for letting us know, @yoursunny.
April 14, 2023 at 5:43 pm #4094
@yoursunny, we looked into SALT-UTAH and you are correct: the MTU was incorrectly set on the switch on the 25Gbps ports (for 100G it seems to be correct). We will remedy this next week and will check settings on other sites so the experimenters can get as consistent an experience as possible. Thank you for reporting this.
April 17, 2023 at 4:35 pm #4106
@yoursunny I tested SALT to UTAH and this is what I have as the max MTU:
# Assumes `slice`, `node1_name`, and `node2_addr` are already defined earlier in the notebook.
try:
    node1 = slice.get_node(name=node1_name)
    stdout, stderr = node1.execute(f'ping -M do -c 5 -s 8928 {node2_addr}')
except Exception as e:
    print(f"Exception: {e}")

PING 192.168.1.2 (192.168.1.2) 8928(8956) bytes of data.
8936 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.268 ms
8936 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.214 ms
8936 bytes from 192.168.1.2: icmp_seq=3 ttl=64 time=0.225 ms
8936 bytes from 192.168.1.2: icmp_seq=4 ttl=64 time=0.239 ms
8936 bytes from 192.168.1.2: icmp_seq=5 ttl=64 time=0.246 ms
--- 192.168.1.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4118ms
rtt min/avg/max/mdev = 0.214/0.238/0.268/0.023 ms
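When a ping is run from a script like this, the captured output can be checked instead of read by eye. A minimal sketch, assuming the summary line has the iputils format shown in this thread (`N packets transmitted, M received, ...`); `ping_succeeded` is a hypothetical helper and is not part of fablib:

```python
import re

# Matches the iputils summary line, e.g. "5 packets transmitted, 5 received, ..."
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) received")


def ping_succeeded(stdout: str) -> bool:
    """Return True only if every transmitted probe was answered."""
    match = SUMMARY_RE.search(stdout)
    if match is None:
        return False  # no summary line: treat as failure
    transmitted = int(match.group(1))
    received = int(match.group(2))
    return transmitted > 0 and received == transmitted
```

With the snippet above, `ping_succeeded(stdout)` would be true for the `-s 8928` run shown here and false for a run that reports 100% packet loss.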
April 18, 2023 at 2:58 pm #4112
MTU is good now (except MASS).
I made a slice in every available location with FABNetv4 network service, tested ping with a few MTUs (256, 1280, 1420, 1500, 8900, 8948, 9000).
They can all support MTU 8948 (IPv4 ping -s 8920), but not MTU 9000 (IPv4 ping -s 8972).
IPv4 ping MTU and RTT (each cell: MTU in bytes, RTT in ms)
src\dst | CERN | UCSD | DALL | NCSA | CLEM | TACC | MAX | WASH | GPN | INDI | FIU | MICH | MASS | UTAH | SALT | STAR
---------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------
CERN | 9000 0 | 8948 148 | 8948 122 | 8948 105 | 8948 105 | 8948 128 | 8948 91 | 8948 88 | 8948 156 | 8948 107 | 8948 115 | 8948 108 | 1420 101 | 8948 133 | 8948 133 | 8948 102
UCSD | 8948 148 | 9000 0 | 8948 43 | 8948 48 | 8948 76 | 8948 49 | 8948 62 | 8948 59 | 8948 37 | 8948 50 | 8948 86 | 8948 50 | 1420 72 | 8948 14 | 8948 14 | 8948 45
DALL | 8948 122 | 8948 43 | 9000 0 | 8948 22 | 8948 50 | 8948 5 | 8948 36 | 8948 34 | 8948 51 | 8948 24 | 8948 60 | 8948 25 | 1420 46 | 8948 28 | 8948 28 | 8948 19
NCSA | 8948 105 | 8948 48 | 8948 22 | 9000 0 | 8948 32 | 8948 28 | 8948 19 | 8948 16 | 8948 56 | 8948 7 | 8948 43 | 8948 7 | 1420 29 | 8948 33 | 8948 33 | 8948 2
CLEM | 8948 105 | 8948 76 | 8948 50 | 8948 32 | 9000 0 | 8948 56 | 8948 19 | 8948 16 | 8948 84 | 8948 35 | 8948 43 | 8948 35 | 1420 28 | 8948 61 | 8948 61 | 8948 30
TACC | 8948 128 | 8948 49 | 8948 5 | 8948 28 | 8948 56 | 9000 0 | 8948 42 | 8948 39 | 8948 57 | 8948 30 | 8948 66 | 8948 30 | 1420 52 | 8948 34 | 8948 34 | 8948 25
MAX | 8948 91 | 8948 62 | 8948 36 | 8948 19 | 8948 19 | 8948 42 | 9000 0 | 8948 2 | 8948 70 | 8948 21 | 8948 29 | 8948 22 | 1420 15 | 8948 47 | 8948 47 | 8948 17
WASH | 8948 88 | 8948 59 | 8948 34 | 8948 16 | 8948 16 | 8948 39 | 8948 2 | 9000 0 | 8948 67 | 8948 18 | 8948 26 | 8948 19 | 1420 12 | 8948 44 | 8948 44 | 8948 14
GPN | 8948 156 | 8948 37 | 8948 51 | 8948 56 | 8948 84 | 8948 57 | 8948 70 | 8948 67 | 9000 0 | 8948 58 | 8948 94 | 8948 58 | 1420 80 | 8948 22 | 8948 23 | 8948 53
INDI | 8948 107 | 8948 50 | 8948 24 | 8948 7 | 8948 35 | 8948 30 | 8948 21 | 8948 18 | 8948 58 | 9000 0 | 8948 45 | 8948 9 | 1420 31 | 8948 35 | 8948 35 | 8948 4
FIU | 8948 115 | 8948 86 | 8948 60 | 8948 43 | 8948 43 | 8948 66 | 8948 29 | 8948 26 | 8948 94 | 8948 45 | 9000 0 | 8948 46 | 1420 39 | 8948 71 | 8948 71 | 8948 40
MICH | 8948 108 | 8948 50 | 8948 25 | 8948 7 | 8948 35 | 8948 30 | 8948 22 | 8948 19 | 8948 58 | 8948 9 | 8948 46 | 9000 0 | 1420 31 | 8948 36 | 8948 35 | 8948 5
MASS | 1420 101 | 1420 72 | 1420 46 | 1420 29 | 1420 28 | 1420 52 | 1420 15 | 1420 12 | 1420 80 | 1420 31 | 1420 39 | 1420 31 | 9000 0 | 1420 57 | 1420 57 | 1420 26
UTAH | 8948 133 | 8948 14 | 8948 28 | 8948 33 | 8948 61 | 8948 34 | 8948 47 | 8948 44 | 8948 22 | 8948 35 | 8948 71 | 8948 36 | 1420 57 | 9000 0 | 8948 0 | 8948 30
SALT | 8948 133 | 8948 14 | 8948 28 | 8948 33 | 8948 61 | 8948 34 | 8948 47 | 8948 44 | 8948 23 | 8948 35 | 8948 71 | 8948 35 | 1420 57 | 8948 0 | 9000 0 | 8948 30
STAR | 8948 102 | 8948 45 | 8948 19 | 8948 2 | 8948 30 | 8948 25 | 8948 17 | 8948 14 | 8948 53 | 8948 4 | 8948 40 | 8948 5 | 1420 26 | 8948 30 | 8948 30 | 9000 0
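A matrix like this can be reduced to two numbers of interest: the largest MTU that works between every pair of sites, and the sites responsible when that value is low. The sketch below is illustrative only. It uses a hand-copied subset of the table (CERN, MASS, STAR, UTAH), and `common_mtu` and `sites_below` are hypothetical helper names, not part of the script that produced the table.

```python
from collections import Counter
from typing import Dict, Tuple

Pair = Tuple[str, str]

# Subset of the measured MTUs above, one entry per unordered site pair.
measured: Dict[Pair, int] = {
    ("CERN", "MASS"): 1420,
    ("CERN", "STAR"): 8948,
    ("CERN", "UTAH"): 8948,
    ("MASS", "STAR"): 1420,
    ("MASS", "UTAH"): 1420,
    ("STAR", "UTAH"): 8948,
}


def common_mtu(results: Dict[Pair, int]) -> int:
    """Largest MTU that works between every measured pair."""
    return min(results.values())


def sites_below(results: Dict[Pair, int], target: int) -> Counter:
    """Count, per site, how many of its pairs fall below `target`."""
    counts: Counter = Counter()
    for (src, dst), mtu in results.items():
        if mtu < target:
            counts[src] += 1
            counts[dst] += 1
    return counts


print(common_mtu(measured))                        # 1420
print(sites_below(measured, 8948).most_common(1))  # [('MASS', 3)]
```

A site that appears in every low-MTU pair, as MASS does here, is the likely bottleneck; a single low pair would point to one link instead.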
April 18, 2023 at 3:03 pm #4113
Very nice! We are discussing internally the change we need to make to allow full 9000 MTU. We plan to implement the change after the workshop next week.
May 7, 2023 at 9:48 pm #4168
@yoursunny Thanks for the nice testing table/results. We do have some links in the network which need to be set higher to allow hosts to use MTU of 9000. We need to do some more testing for all the links in the network to see if we can find a single value that works everywhere. We are also adding many new wide area links over the next month. So we will probably update the MTU settings across the network sometime in June. We will report back here on the forum with more details once we have made this change. For now, hopefully you can use MTU 8948 as you show in your table, until we make the change.
The MASS site/links should now support a host MTU of 8948. Please let us know if you still have issues with MASS node.
May 9, 2023 at 3:32 pm #4183
“We need to do some more testing for all the links in the network to see if we can find a single value that works everywhere.”
Use my script:
https://github.com/yoursunny/fabric/blob/5d434c3117314730a9ab38ffd4eefcab70f13779/util/mtu.py