Myri-10G
10-Gigabit Ethernet
Performance Measurements
We report performance measurements for Myri-10G NICs using our 10-Gigabit Ethernet driver, Myri10GE, on Linux, Windows, Solaris, MacOSX, and FreeBSD.
Linux | Windows | Solaris GLDv2 | Solaris GLDv3 | MacOSX | FreeBSD
| Benchmark: | netperf version 2.4.5 |
| OS: | CentOS 5 x86_64, 2.6.18-128.1.16.el5 kernel |
| NICs: | Myri-10G 10G-PCIE-8B |
| Driver: | Myri10GE version 1.5.0 |
| Interrupt Coalescing: | 75 µs |
| TCP Segmentation Offload (TSO): | enabled |
| Large Receive Offload (LRO): | enabled |
| Hosts: | Asus RS500-E6-PS4 systems with dual Intel quad-core 2.93GHz Xeon X5570s (8 2.93GHz Nehalem cores) |
| Topology: | point-to-point (switchless) |
For these Linux tests, TCP buffer sizes were increased and TCP timestamps were disabled as recommended in the Performance Tuning section of the Linux Myri10GE README, and the netserver was run without options. Performance is measured using 9000-byte (jumbo) frames and 1500-byte (standard) frames, and bandwidth (BW) is measured in Megabits/second.
Netperf Results, MTU 9000
Commands:
$ netperf -H asus02-m -t TCP_STREAM -C -c -l 60
$ netperf -H asus02-m -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinuz-2.6.18-128.1.16.el5
$ netperf -H asus02-m -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 4M -S 4M
Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 9000 9910.33 4.52 2.84
TCP_SENDFILE 9000 9910.32 2.71 2.82
UDP_STREAM_TX 9000 9924.70 5.73 0.00
UDP_STREAM_RX 9000 9924.70 0.00 3.66
Netperf Results, MTU 1500
Commands:
$ netperf -H asus02-m -t TCP_STREAM -C -c -l 60
$ netperf -H asus02-m -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinuz-2.6.18-128.1.16.el5
$ netperf -H asus02-m -t UDP_STREAM -l 60 -C -c -- -m 1472 -s 4M -S 4M
Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 1500 9477.10 4.62 5.57
TCP_SENDFILE 1500 9452.54 2.56 5.63
UDP_STREAM_TX 1500 9249.00 12.51 0.00
UDP_STREAM_RX 1500 9249.00 0.00 11.59
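For the TCP tests, most of the gap between the reported bandwidth and the nominal 10,000 Mbit/s can be accounted for by framing overhead. The Python sketch below is not part of the original measurements. It estimates the payload-throughput ceiling of 10-Gigabit Ethernet for each MTU, assuming IPv4 without options, no VLAN tag, and a 20-byte TCP header (consistent with TCP timestamps being disabled). The function and constant names are illustrative. It also shows where the UDP message sizes 8972 and 1472 used in the commands above come from.

```python
# Per-frame bytes on the wire that are not counted in the MTU:
# preamble + start-of-frame delimiter (8), Ethernet header (14),
# frame check sequence (4), and inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12
IPV4_HEADER = 20     # assumption: no IP options
UDP_HEADER = 8
TCP_HEADER = 20      # assumption: no TCP options (timestamps disabled)
LINK_MBPS = 10_000.0  # nominal 10-Gigabit Ethernet line rate


def udp_payload(mtu: int) -> int:
    """Largest UDP message that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER


def max_goodput_mbps(mtu: int, l4_header: int) -> float:
    """Upper bound on payload throughput for back-to-back full-size frames."""
    payload = mtu - IPV4_HEADER - l4_header
    wire_bytes = mtu + ETH_OVERHEAD
    return LINK_MBPS * payload / wire_bytes


for mtu in (9000, 1500):
    print(
        f"MTU {mtu}: UDP message size {udp_payload(mtu)}, "
        f"TCP ceiling {max_goodput_mbps(mtu, TCP_HEADER):.1f} Mbit/s, "
        f"UDP ceiling {max_goodput_mbps(mtu, UDP_HEADER):.1f} Mbit/s"
    )
```

Under these assumptions the ceilings come to roughly 9914 (TCP) and 9927 (UDP) Mbit/s at MTU 9000, and roughly 9493 (TCP) and 9571 (UDP) Mbit/s at MTU 1500. The Linux TCP_STREAM results of 9910.33 and 9477.10 Mbit/s are therefore within about 0.2% of the estimated limit.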
| Benchmark: | ntttcps and ntttcpr (from the Windows 2003 DDK) |
| OS: | Windows Server 2003 x64 SP1 Edition |
| NICs: | Myri-10G 10G-PCIE-8A |
| Driver: | Myri10GE AMD64 version 1.0.1 |
| Interrupt Coalescing: | 25 µs |
| TCP Segmentation Offload (TSO): | enabled |
| Checksum Offload: | enabled |
| Flow Control: | enabled |
| Hosts: | Sender: Tyan S2895 motherboard with AMD single-core dual-processor 2.6GHz Opteron; Receiver: Dell PowerEdge 2950 |
| Topology: | point-to-point (switchless) |
For these Windows tests, no registry entries were added to the Windows 2003-based machines. Bandwidth (BW) is measured in Megabits/second.
A single ntttcps process on one Windows host sent data to a single ntttcpr process on the other Windows host.
Ntttcp Results, MTU 9000
Commands:
Sender: ntttcps -m 1,1,10.0.130.50 -l 1048576 -n 100000 -w -v -a 8
Receiver: ntttcpr -m 1,1,10.0.130.50 -l 1048576 -rb 2097152 -n 1000000 -w -v -a 8
Results on the Sender:
-----------------------------------------------------------------
| Estimated Time to Complete Test at line speed (seconds) |
-----------------------------------------------------------------
1000 Base-T 622 OC-12(ATM) 155 OC-3(ATM) 100 Base-T 10 Base-T
=========== ============== ============= ========== =========
419 369 1408 2128 25000
------------------------------------------------------
| Output Summary |
------------------------------------------------------
Thread Realtime(s) Throughput(KB/s) Throughput(Mbit/s)
====== =========== ================ ==================
0 85.500 1226404.678 9811.237
Total Bytes(MEG) Realtime(s) Average Frame Size Total Throughput(Mbit/s)
================ =========== ================== ========================
104857.600000 85.500 60667.263 9811.237
Total Buffers Throughput(Buffers/s) Pkts(sent/intr) Intr(count/s) Cycles/Byte
============= ===================== =============== ============= ===========
100000.000 1169.591 1 23467.10 0.5
Packets Sent Packets Received Total Retransmits Total Errors Avg. CPU %
============ ================ ================= ============ ==========
1728405 281845 2 0 10.70
Results on the Receiver:
-----------------------------------------------------------------
| Estimated Time to Complete Test at line speed (seconds) |
-----------------------------------------------------------------
1000 Base-T 622 OC-12(ATM) 155 OC-3(ATM) 100 Base-T 10 Base-T
=========== ============== ============= ========== =========
419 369 1408 2128 25000
------------------------------------------------------
| Output Summary |
------------------------------------------------------
Thread Realtime(s) Throughput(KB/s) Throughput(Mbit/s)
====== =========== ================ ==================
0 85.735 1223043.098 9784.345
Total Bytes(MEG) Realtime(s) Average Frame Size Total Throughput(Mbit/s)
================ =========== ================== ========================
104857.600000 85.735 8959.587 9784.345
Total Buffers Throughput(Buffers/s) Pkts(recv/intr) Intr(count/s) Cycles/Byte
============= ===================== =============== ============= ===========
100000.000 1166.385 29 4610.68 2.7
Packets Sent Packets Received Total Retransmits Total Errors Avg. CPU %
============ ================ ================= ============ ==========
281837 11703396 0 0 27.27
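The ntttcp summary fields can be cross-checked against one another. The Python sketch below is an illustration added here, not part of the ntttcp tool. It recomputes the derived fields from the buffer count, buffer length (-l 1048576), elapsed time, and packet count. Two points are inferred from the numbers rather than taken from ntttcp documentation: the tool appears to use decimal units (1 KB = 1000 bytes, 1 MEG = 10^6 bytes), and "Average Frame Size" appears to be total bytes divided by packets sent (on the sender) or packets received (on the receiver). The function name and dictionary keys are hypothetical.

```python
def ntttcp_summary(buffers: int, buffer_bytes: int, seconds: float, packets: int) -> dict:
    """Recompute ntttcp's derived summary fields from its raw counters."""
    total_bytes = buffers * buffer_bytes
    return {
        "total_meg": total_bytes / 1e6,                 # Total Bytes(MEG)
        "kb_per_s": total_bytes / seconds / 1e3,        # Throughput(KB/s)
        "mbit_per_s": total_bytes * 8 / seconds / 1e6,  # Throughput(Mbit/s)
        "buffers_per_s": buffers / seconds,             # Throughput(Buffers/s)
        "avg_frame_bytes": total_bytes / packets,       # Average Frame Size
    }


# Counters copied from the sender and receiver output above.
sender = ntttcp_summary(100_000, 1_048_576, 85.500, 1_728_405)
receiver = ntttcp_summary(100_000, 1_048_576, 85.735, 11_703_396)

for name, result in (("sender", sender), ("receiver", receiver)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

The recomputed values agree with the reported 9811.237 and 9784.345 Mbit/s. The sender's average frame size of about 60 KB is far larger than the 9000-byte MTU, which is presumably because TSO is enabled and the packet count is taken before the NIC segments the data.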
Notes:
If you're using Windows 2000, XP, or 2003, you will need to add the following two registry entries:
HKLM\System\CurrentControlSet\Services\Tcpip\Parameters:
For a detailed list of Performance Tuning Guidelines for Windows Server 2003 and 2008, refer to this FAQ entry.
| Benchmark: | netperf version 2.4.5 |
| OS: | OpenSolaris 2008.11 (snv_101b_rc2) |
| NICs: | Myri-10G 10G-PCIE-8B |
| Driver: | Myri10GE version AMD64 1.0.4 |
| Interrupt Coalescing: | 30 µs |
| Large Receive Offload (LRO): | enabled |
| Hosts: | Asus RS500-E6-PS4 systems with dual Intel quad-core 2.93GHz Xeon X5570s (8 2.93GHz Nehalem cores) |
| Topology: | point-to-point (switchless) |
For these Solaris GLDv2 tests, the netserver was run without options. Performance is measured using 9000-byte (jumbo) frames and 1500-byte (standard) frames, and bandwidth (BW) is measured in Megabits/second.
Netperf Results, MTU 9000
Commands:
$ netperf -H asus2-m -t TCP_STREAM -C -c -l 60 -T loc,remote -- -s 512K -S 512K
$ netperf -H asus2-m -t TCP_SENDFILE -F/var/tmp/scratch -C -c -l 60 -T loc,remote -- -s 512K -S 512K
$ netperf -H asus2-m -t UDP_STREAM -l 60 -C -c -T loc,remote -- -m 8972 -s 512K -S 512K
Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 9000 9877.62 9.91 10.08
TCP_SENDFILE 9000 9887.49 11.83 10.34
UDP_STREAM_TX 9000 9880.90 17.51 00.00
UDP_STREAM_RX 9000 9880.90 00.00 17.93
Netperf Results, MTU 1500
Commands:
$ netperf -H asus2-m -t TCP_STREAM -C -c -l 60 -T loc,remote -- -s 1M -S 1M
$ netperf -H asus2-m -t TCP_SENDFILE -F/var/tmp/scratch -C -c -l 60 -T loc,remote -- -s 1M -S 1M
$ netperf -H asus2-m -t UDP_STREAM -l 60 -C -c -T loc,remote -- -m 1472 -s 1M -S 1M
Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 1500 7787.70 17.59 19.51
TCP_SENDFILE 1500 5775.41 24.65 17.16
UDP_STREAM_TX 1500 5291.70 15.05 00.00
UDP_STREAM_RX 1500 5165.40 00.00 28.63
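This is the only table in which the UDP send-side and receive-side bandwidths differ. Because UDP has no retransmission, the difference is data that was sent but never delivered to the receiving netserver. The Python sketch below is an added illustration with a hypothetical function name. It expresses that difference as a fraction of the send rate.

```python
def udp_drop_fraction(tx_mbps: float, rx_mbps: float) -> float:
    """Fraction of sent UDP data that the receiver did not report."""
    if tx_mbps <= 0:
        raise ValueError("send-side bandwidth must be positive")
    return (tx_mbps - rx_mbps) / tx_mbps


# Solaris GLDv2, MTU 1500: UDP_STREAM_TX and UDP_STREAM_RX from the table above.
print(f"MTU 1500: {udp_drop_fraction(5291.70, 5165.40):.2%} not received")

# Solaris GLDv2, MTU 9000: both sides report the same bandwidth.
print(f"MTU 9000: {udp_drop_fraction(9880.90, 9880.90):.2%} not received")
```

For the MTU 1500 run this comes to about 2.4% of the sent data, while the MTU 9000 run shows no measurable loss.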
| Benchmark: | netperf version 2.4.5 |
| OS: | OpenSolaris 2008.11 (snv_101b_rc2) |
| NICs: | Myri-10G 10G-PCIE-8B |
| Driver: | Myri10GE version AMD64 1.4.5gldv3 |
| Interrupt Coalescing: | 125 µs |
| TCP Segmentation Offload (TSO): | enabled |
| Large Receive Offload (LRO): | enabled |
| Hosts: | Asus RS500-E6-PS4 systems with dual Intel quad-core 2.93GHz Xeon X5570s (8 2.93GHz Nehalem cores) |
| Topology: | point-to-point (switchless) |
For these Solaris GLDv3 tests, the netserver was run without options. Performance is measured using 9000-byte (jumbo) frames and 1500-byte (standard) frames, and bandwidth (BW) is measured in Megabits/second.
Netperf Results, MTU 9000
Commands:
$ netperf -H asus2-m -t TCP_STREAM -C -c -l 60 -- -s 512K -S 512K
$ netperf -H asus2-m -t TCP_SENDFILE -F/var/tmp/scratch -C -c -l 60 -- -s 512K -S 512K
$ netperf -H asus2-m -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 512K -S 512K
Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 9000 9868.72 9.29 8.91
TCP_SENDFILE 9000 9866.15 11.96 8.94
UDP_STREAM_TX 9000 9925.20 9.33 00.00
UDP_STREAM_RX 9000 9925.20 00.00 9.04
Netperf Results, MTU 1500
Commands:
$ netperf -H asus2-m -t TCP_STREAM -C -c -l 60 -- -s 512K -S 512K
$ netperf -H asus2-m -t TCP_SENDFILE -F/var/tmp/scratch -C -c -l 60 -- -s 512K -S 512K
$ netperf -H asus2-m -t UDP_STREAM -l 60 -C -c -- -m 1472 -s 512K -S 512K
Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 1500 9345.75 7.75 20.32
TCP_SENDFILE 1500 9285.96 9.15 20.98
UDP_STREAM_TX 1500 5978.60 12.55 00.00
UDP_STREAM_RX 1500 5978.60 00.00 24.20
| Benchmark: | netperf version 2.4.3; iperf version 2.0.2 |
| OS: | MacOSX 10.5 |
| NICs: | Myri-10G 10G-PCIE-8A |
| Driver: | Myri10GE version 1.1.0 |
| Interrupt Coalescing: | 75 µs |
| Large Receive Offload (LRO): | enabled |
| Hosts: | MacPro with Intel dual-core dual-processor 2.6GHz Xeons |
| Topology: | point-to-point (switchless) |
For these MacOSX tests, LRO was enabled as recommended in the Performance Tuning section of the MacOSX Myri10GE README, and the netserver was run without options. The iperf server was run with the same window (-w) and buffer length (-l) arguments as the client. Performance is measured using 9000-byte (jumbo) frames and 1500-byte (standard) frames, and bandwidth (BW) is measured in Megabits/second.
Netperf Results, MTU 9000
Commands:
$ netperf -H macpro01-m -t TCP_STREAM -C -c -l 60 -- -s 768K -S 768K -m 256K
$ netperf -H macpro01-m -t UDP_STREAM -l 60 -C -c -- -m 32K -s 512K -S 512K
$ iperf -c macpro01-m -w 768k -l 256k -P 2 -f m -t 60
Single-Stream Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 9000 9661.82 41.38 36.74
UDP_STREAM_TX 9000 6867.00 28.08 00.00
UDP_STREAM_RX 9000 6867.00 00.00 39.26
Dual-Stream TCP Results (2 netperf processes):
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 9000 9692.00 54.72 47.36
Dual-Stream TCP Results (2 iperf threads):
Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
iperf 9000 9825.00 65 58
Netperf Results, MTU 1500
Commands:
$ netperf -H macpro01-m -t TCP_STREAM -C -c -l 60 -- -s 768K -S 768K -m 256K
$ netperf -H macpro01-m -t UDP_STREAM -l 60 -C -c -- -m 32K -s 512K -S 512K
$ iperf -c macpro01-m -w 512k -l 256k -P 2 -f m -t 60
Single-Stream Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 1500 4782.15 41.70 39.15
UDP_STREAM_TX 1500 3310.40 27.85 00.00
UDP_STREAM_RX 1500 3310.40 00.00 39.24
Dual-Stream TCP Results (2 netperf processes):
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 1500 4367.00 42.29 43.75
Dual-Stream TCP Results (2 iperf threads):
Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
iperf 1500 6417.00 76 65
| Benchmark: | netperf version 2.4.5 |
| OS: | FreeBSD/amd64 7.2-RELEASE |
| NICs: | Myri-10G 10G-PCIE-8B |
| Driver: | if_mxge |
| Interrupt Coalescing: | 30 µs |
| TCP Segmentation Offload (TSO): | enabled |
| Large Receive Offload (LRO): | enabled |
| Hosts: | Asus RS500-E6-PS4 systems with dual Intel quad-core 2.93GHz Xeon X5570s (8 2.93GHz Nehalem cores) |
| Topology: | point-to-point (switchless) |
For these FreeBSD tests, the kern.ipc.maxsockbuf tunable was increased to 16777216, and the netserver was run without options. Performance is measured using 9000-byte (jumbo) frames and 1500-byte (standard) frames, and bandwidth (BW) is measured in Megabits/second.
Netperf Results, MTU 9000
Commands:
$ netperf -H asus02-m -t TCP_STREAM -C -c -l 60
$ netperf -H asus02-m -t TCP_SENDFILE -l 60 -C -c -F /boot/kernel/kernel
$ netperf -H asus02-m -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 128K -S 128K
Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 9000 9887.91 8.22 7.73
TCP_SENDFILE 9000 9887.31 6.33 7.50
UDP_STREAM_TX 9000 9926.00 13.85 0.00
UDP_STREAM_RX 9000 9926.00 0.00 6.77
Netperf Results, MTU 1500
Commands:
$ netperf -H asus02-m -t TCP_STREAM -C -c -l 60
$ netperf -H asus02-m -t TCP_SENDFILE -l 60 -C -c -F /boot/kernel/kernel
$ netperf -H asus02-m -t UDP_STREAM -l 60 -C -c -- -m 16256 -s 128K -S 128K
Results:
Netperf Test MTU BW TX_CPU % RX_CPU %
------------ ---- ------- -------- --------
TCP_STREAM 1500 9361.92 8.26 10.07
TCP_SENDFILE 1500 9390.04 5.90 10.21
UDP_STREAM_TX 1500 9243.90 14.18 0.00
UDP_STREAM_RX 1500 9243.90 0.00 14.69
Last updated: 25 August 2009