Bmv2 max performance in FABRIC
Tagged: Bmv2, Performance, Throughput
September 18, 2023 at 11:46 am #5331
Hi,
I was looking at the following FABRIC notebook: https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/complex_recipes/p4_labs_bmv2/lab1_creating_a_slice_with_a_P4_switch.ipynb
There is a section about high performance towards the end, and it mentions that a speed of ~1 Gbps can be achieved using BMv2. However, when I tried it on my FABRIC instance, I can only get up to ~160 Mbps before packets start to drop.
As the doc suggests, I disabled all logging while building the BMv2 switch binary, ran the 'disable-offload.sh' script, and verified that the offloads have been turned off.

A. Could you please elaborate on how a throughput close to 1 Gbps was achieved? I would like to recreate this performance in my FABRIC setup.
1) What kind of network connections were used (L2STS, etc.)?
2) How many cores and how much main memory were assigned to the switch VM?
3) Was it using a shared NIC or a smart NIC? How many hardware queues?
4) UDP or TCP payload?
5) Size of the packets?
6) Any process scheduling alterations (e.g., nice values or CPU pinning)?

B. Could you explain how disabling hardware offloads can increase the speed? It seems counter-intuitive. Is it because of the VM hypervisor, or is there some other dependency?
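For reference, the offload-disabling step I did boils down to ethtool commands along these lines (a sketch using fablib's execute(); the interface names are placeholders, and the notebook's actual disable-offload.sh may differ):

# Sketch only: turn off segmentation/receive offloads on the switch VM's
# dataplane interfaces. ens7/ens8 are placeholder names; some offloads may
# not be togglable on every NIC.
for iface in ["ens7", "ens8"]:
    switch.execute(f"sudo ethtool -K {iface} tso off gso off gro off lro off")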
September 18, 2023 at 1:14 pm #5333

I tried the same example; after tuning the servers and the switch, I am getting a max of 3.17 Gbits/sec and an average of 709 Mbits/s.
Server configs:

server1 = slice.add_node(name="server1",
                         site=site1,
                         cores=8,
                         ram=16,
                         disk=500,
                         image='default_ubuntu_20')

server2 = slice.add_node(name="server2",
                         site=site3,
                         cores=8,
                         ram=16,
                         disk=500,
                         image='default_ubuntu_20')

Switch config:

switch = slice.add_node(name="switch",
                        site=site2,
                        cores=32,
                        ram=16,
                        disk=40,
                        image='default_ubuntu_20')

The results are attached.
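(For context, the nodes are wired through the switch with two point-to-point L2 networks, roughly as sketched below; the NIC model and component names are assumptions, not the notebook's exact code.)

# Sketch only: server1 <-> switch <-> server2 over two L2 networks.
# NIC_Basic and the component names are illustrative.
net1 = slice.add_l2network(name="net1", interfaces=[
    server1.add_component(model="NIC_Basic", name="nic1").get_interfaces()[0],
    switch.add_component(model="NIC_Basic", name="nic1").get_interfaces()[0]])
net2 = slice.add_l2network(name="net2", interfaces=[
    server2.add_component(model="NIC_Basic", name="nic1").get_interfaces()[0],
    switch.add_component(model="NIC_Basic", name="nic2").get_interfaces()[0]])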
Kind regards,
Nagmat
September 18, 2023 at 10:25 pm #5341

Hi Nishanth,
I tried running the notebook you mentioned, and I got a throughput close to 1 Gbps. Note that I changed the sites since NCSA, STAR, and UMich are under maintenance. I used the following sites:

site1 = 'MAX'
site2 = 'MASS'
site3 = 'NEWY'

I just executed the cells sequentially and got high throughput, as shown in the figure. Can you try changing the sites and repeating the experiment?
Regards,
Elie.
September 18, 2023 at 10:28 pm #5343

Hi Nagmat,
I see from your screenshot that for the first few seconds the throughput was zero, and then it went up. I believe that if you run iperf for a longer time, the average will go up to 1 Gbps.
You can use the -t option to specify how long the iperf3 test should run (e.g., -t 60 will run the test for 60 seconds).
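For example, roughly like this via fablib (a sketch; the address follows the notebook's 192.168.x.x plan):

# Start iperf3 as a daemon on the receiving server, then run a
# 60-second test from the sender.
server2.execute("iperf3 -s -D")
stdout, stderr = server1.execute("iperf3 -c 192.168.2.10 -t 60")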
Regards,
Elie.
September 20, 2023 at 3:16 pm #5360

Dear Elie,
After executing with -t 60, I was getting the 1 Gbit/s speed.
I then tried to add some information to the header and execute the main1.p4 program (attached) again.
For some reason, it didn't work for me. Since logging was disabled, I couldn't debug the program.
I executed a similar program in my other experiments using fabric_testbed and it was working fine.
What may be the reason?
Kind regards,
Nagmat
September 20, 2023 at 3:29 pm #5361

Hi Nagmat,
Good to hear that you were able to reach the 1 Gbps speed.
You did not attach the main1.p4 program. I would suggest running BMv2 with logging enabled; you can refer to the lab2_P4_program_building_blocks.ipynb notebook to see how to enable logging.
The typical workflow is to start with logging enabled so that you can verify the behavior of your P4 program. Afterwards, you can move your program to the high-performance BMv2.
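For instance, a debug run looks roughly like the following (a sketch; the interface names and compiled JSON file name are placeholders, and lab2 shows the exact commands):

# Sketch only: run BMv2 with console logging so each packet's parse/match/
# deparse steps are printed. ens7/ens8 and main1.json are illustrative.
switch.execute("sudo simple_switch -i 0@ens7 -i 1@ens8 --log-console main1.json")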
Regards,
Elie.
September 27, 2023 at 11:03 am #5493

Hi,
I was able to get ~1 Gbps performance with TCP; however, there is always a slight mismatch between packets sent and received as reported by iperf3, with a couple of retries during the first second.
I switched to a UDP connection instead, and I am seeing a consistent pattern across runs: a noticeable amount of loss in the first second.
I am using the same notebook as mentioned above, but replaced the client command with the following:

iperf3 -c 192.168.2.10 -u -l 1300 -b 600M

I have pasted the results of multiple runs with the above values. Is there an explanation for this drop, mostly in the first second?
Accepted connection from 192.168.1.10, port 36772
[ 5] local 192.168.2.10 port 5201 connected to 192.168.1.10 port 33996
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-1.00 sec 69.7 MBytes 585 Mbits/sec 0.044 ms 501/56731 (0.88%)
[ 5] 1.00-2.00 sec 71.5 MBytes 600 Mbits/sec 0.052 ms 0/57691 (0%)
[ 5] 2.00-3.00 sec 71.5 MBytes 600 Mbits/sec 0.052 ms 0/57696 (0%)
[ 5] 3.00-4.00 sec 71.5 MBytes 600 Mbits/sec 0.049 ms 0/57690 (0%)
[ 5] 4.00-5.00 sec 71.5 MBytes 599 Mbits/sec 0.052 ms 58/57694 (0.1%)
[ 5] 5.00-6.00 sec 71.5 MBytes 600 Mbits/sec 0.063 ms 0/57691 (0%)
[ 5] 6.00-7.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57694 (0%)
[ 5] 7.00-8.00 sec 71.5 MBytes 600 Mbits/sec 0.055 ms 0/57691 (0%)
[ 5] 8.00-9.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57693 (0%)
[ 5] 9.00-10.00 sec 71.5 MBytes 600 Mbits/sec 0.018 ms 0/57709 (0%)
[ 5] 10.00-10.02 sec 1.12 MBytes 600 Mbits/sec 0.004 ms 0/901 (0%)
– – – – – – – – – – – – – – – – – – – – – – – – –
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.02 sec 715 MBytes 598 Mbits/sec 0.004 ms 559/576881 (0.097%) receiver
———————————————————–
Server listening on 5201
———————————————————–
Accepted connection from 192.168.1.10, port 46278
[ 5] local 192.168.2.10 port 5201 connected to 192.168.1.10 port 39296
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-1.00 sec 69.6 MBytes 584 Mbits/sec 0.058 ms 598/56731 (1.1%)
[ 5] 1.00-2.00 sec 71.5 MBytes 600 Mbits/sec 0.055 ms 0/57692 (0%)
[ 5] 2.00-3.00 sec 71.5 MBytes 600 Mbits/sec 0.053 ms 0/57693 (0%)
[ 5] 3.00-4.00 sec 71.5 MBytes 599 Mbits/sec 0.112 ms 0/57634 (0%)
[ 5] 4.00-5.00 sec 71.6 MBytes 601 Mbits/sec 0.041 ms 5/57751 (0.0087%)
[ 5] 5.00-6.00 sec 71.5 MBytes 600 Mbits/sec 0.064 ms 0/57690 (0%)
[ 5] 6.00-7.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57694 (0%)
[ 5] 7.00-8.00 sec 71.5 MBytes 600 Mbits/sec 0.043 ms 0/57690 (0%)
[ 5] 8.00-9.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57695 (0%)
[ 5] 9.00-10.00 sec 71.5 MBytes 600 Mbits/sec 0.063 ms 0/57692 (0%)
[ 5] 10.00-10.02 sec 1.13 MBytes 610 Mbits/sec 0.013 ms 0/914 (0%)
– – – – – – – – – – – – – – – – – – – – – – – – –
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.02 sec 714 MBytes 598 Mbits/sec 0.013 ms 603/576876 (0.1%) receiver
———————————————————–
Server listening on 5201
———————————————————–
Accepted connection from 192.168.1.10, port 42888
[ 5] local 192.168.2.10 port 5201 connected to 192.168.1.10 port 36183
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-1.00 sec 70.3 MBytes 590 Mbits/sec 0.063 ms 0/56730 (0%)
[ 5] 1.00-2.00 sec 71.5 MBytes 600 Mbits/sec 0.052 ms 0/57694 (0%)
[ 5] 2.00-3.00 sec 71.5 MBytes 600 Mbits/sec 0.061 ms 0/57691 (0%)
[ 5] 3.00-4.00 sec 71.5 MBytes 600 Mbits/sec 0.054 ms 0/57693 (0%)
[ 5] 4.00-5.00 sec 71.5 MBytes 600 Mbits/sec 0.051 ms 0/57693 (0%)
[ 5] 5.00-6.00 sec 71.5 MBytes 600 Mbits/sec 0.063 ms 0/57691 (0%)
[ 5] 6.00-7.00 sec 71.5 MBytes 600 Mbits/sec 0.057 ms 0/57692 (0%)
[ 5] 7.00-8.00 sec 71.5 MBytes 600 Mbits/sec 0.048 ms 0/57695 (0%)
[ 5] 8.00-9.00 sec 71.5 MBytes 600 Mbits/sec 0.049 ms 0/57690 (0%)
[ 5] 9.00-10.00 sec 71.5 MBytes 600 Mbits/sec 0.062 ms 0/57691 (0%)
[ 5] 10.00-10.02 sec 1.07 MBytes 577 Mbits/sec 0.053 ms 0/867 (0%)
– – – – – – – – – – – – – – – – – – – – – – – – –
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.02 sec 715 MBytes 599 Mbits/sec 0.053 ms 0/576827 (0%) receiver
———————————————————–
Server listening on 5201
———————————————————–
Accepted connection from 192.168.1.10, port 53714
[ 5] local 192.168.2.10 port 5201 connected to 192.168.1.10 port 46178
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-1.00 sec 69.1 MBytes 580 Mbits/sec 0.056 ms 965/56730 (1.7%)
[ 5] 1.00-2.00 sec 71.5 MBytes 600 Mbits/sec 0.045 ms 0/57695 (0%)
[ 5] 2.00-3.00 sec 71.5 MBytes 600 Mbits/sec 0.050 ms 0/57689 (0%)
[ 5] 3.00-4.00 sec 71.5 MBytes 600 Mbits/sec 0.049 ms 0/57693 (0%)
[ 5] 4.00-5.00 sec 71.5 MBytes 600 Mbits/sec 0.048 ms 0/57692 (0%)
[ 5] 5.00-6.00 sec 71.5 MBytes 600 Mbits/sec 0.045 ms 0/57692 (0%)
[ 5] 6.00-7.00 sec 71.5 MBytes 600 Mbits/sec 0.042 ms 0/57695 (0%)
[ 5] 7.00-8.00 sec 71.5 MBytes 600 Mbits/sec 0.048 ms 0/57688 (0%)
[ 5] 8.00-9.00 sec 71.5 MBytes 600 Mbits/sec 0.051 ms 0/57693 (0%)
[ 5] 9.00-10.00 sec 71.5 MBytes 600 Mbits/sec 0.054 ms 0/57692 (0%)
[ 5] 10.00-10.02 sec 1.07 MBytes 574 Mbits/sec 0.054 ms 0/867 (0%)
– – – – – – – – – – – – – – – – – – – – – – – – –
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.02 sec 714 MBytes 598 Mbits/sec 0.054 ms 965/576826 (0.17%) receiver

September 27, 2023 at 11:17 am #5494

Hi Nishanth,
From the results you're sharing, I think this is a minor performance degradation at the beginning of the test. It may be related to a burst of traffic before the 600 Mbps rate stabilizes.
If the switch is dropping the packets, you can also try increasing the queue size on the BMv2 switch by using the set_queue_depth command in the simple_switch_CLI tool. You can refer to Lab 5 for how to interact with the switch at runtime.
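For example (a sketch; it assumes simple_switch's default Thrift port 9090, and the depth value here is arbitrary):

# Sketch only: raise the egress queue depth (in packets) at runtime
# through the BMv2 runtime CLI.
switch.execute('echo "set_queue_depth 10000" | simple_switch_CLI --thrift-port 9090')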
Another suggestion is to try using nuttcp for UDP tests; ESnet suggests nuttcp instead of iperf3 for UDP testing (https://fasterdata.es.net/performance-testing/network-troubleshooting-tools/nuttcp/).
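An equivalent UDP run with nuttcp would look roughly like this (a sketch; the flags follow the ESnet page, and the notebook's addressing is assumed):

# Sketch only: UDP at 600 Mbps, 1300-byte datagrams, 10 seconds,
# one-second reports. -S starts the nuttcp server on the receiver.
server2.execute("nuttcp -S")
stdout, stderr = server1.execute("nuttcp -u -R600m -l1300 -T10 -i1 192.168.2.10")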
Regards,
Elie.
September 27, 2023 at 11:50 am #5495

Hi,
I think the minor performance degradation is related to the sender reaching the specified rate limit (600M).
After tuning the servers and the switch, I was getting nearly 4 Gbits/s using the BMv2 switch.
[ 5] 55.00-56.00 sec 449 MBytes 3.77 Gbits/sec 1751 67.4 MBytes
[ 5] 56.00-57.00 sec 428 MBytes 3.59 Gbits/sec 1930 62.7 MBytes
[ 5] 57.00-58.00 sec 471 MBytes 3.95 Gbits/sec 2035 69.3 MBytes
[ 5] 58.00-59.00 sec 431 MBytes 3.62 Gbits/sec 1863 51.0 MBytes
[ 5] 59.00-60.00 sec 470 MBytes 3.94 Gbits/sec 1384 40.1 MBytes
– – – – – – – – – – – – – – – – – – – – – – – – –
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.00 sec 26.2 GBytes 3.74 Gbits/sec 121474 sender
[ 5] 0.00-60.07 sec 25.8 GBytes 3.69 Gbits/sec receiver
I tried the same example with UDP as well, but ended up with only ~1 Gbit/s with UDP packets.
[ 5] 54.00-55.00 sec 456 MBytes 3.82 Gbits/sec 183850
[ 5] 55.00-56.00 sec 455 MBytes 3.82 Gbits/sec 183609
[ 5] 56.00-57.00 sec 454 MBytes 3.81 Gbits/sec 183204
[ 5] 57.00-58.00 sec 454 MBytes 3.81 Gbits/sec 183261
[ 5] 58.00-59.00 sec 456 MBytes 3.83 Gbits/sec 183899
[ 5] 59.00-60.00 sec 457 MBytes 3.84 Gbits/sec 184449
– – – – – – – – – – – – – – – – – – – – – – – – –
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-60.00 sec 26.8 GBytes 3.83 Gbits/sec 0.000 ms 0/11050971 (0%) sender
[ 5] 0.00-60.07 sec 9.98 GBytes 1.43 Gbits/sec 0.012 ms 6929044/11050832 (63%) receiver
What may be the reason that UDP reaches only ~1 Gbit/s?
September 27, 2023 at 11:58 am #5496

Hi Nagmat,
Reaching ~4Gbps on BMv2 is not common. What kind of tuning did you do? Would you mind sharing your notebook?
Regards,
Elie.
September 27, 2023 at 12:30 pm #5497
How can I share it?
September 27, 2023 at 12:52 pm #5498