***********************************
CHANGES file for MX
***********************************

MX 1.2.15 (Nov 9, 2011)
-----------------------------
BUG FIXES:
  1. Fix regression in FMA introduced in 1.2.14.

MX 1.2.14 (Oct 2011)
-----------------------------
ENHANCEMENTS:
  1. Linux: Add support for linux kernels up to kernel 3.0.
  2. Add support for "8C2" nics.
  3. Windows: added event logging to better aid debugging.
BUG FIXES:
  1. Linux: Fix issue on systems with more than one myricom nic that use
     another driver (i.e. myri10ge) in conjunction with mx, where the other
     driver was being unloaded incorrectly either during starting or
     stopping of mx.
  2. Linux: Fix issues with the transparent huge page feature added in
     vanilla linux kernel 2.6.38, and backported into RHEL6 kernels.
  3. Fix rare issue concerning the IDT chip used by the "8B2" nics.
  4. Windows: Fix problem with mapper service not shutting down properly.

MX 1.2.12 (May 27, 2010):
------------------------------
BUG FIXES:
  1. Fix a corner case under heavy load that could lead to firmware/driver
     crash
  2. Fix a corner case where sram-parity recovery would fail leading to
     driver abort
  3. Avoid a deadlock in an uncommon pattern involving large-messages where
     typically more than 128 receives are posted in reverse order compared
     to sending order.
  4. Fix a bug for apps using unidirectional mx_connect() that would prevent
     reconnects.
  5. Make sure new libmyriexpress can still work with older driver (a bug in
     libmyriexpress 1.2.11 caused a runaway thread to consume one CPU core
     if mixed with older driver <= 1.2.10)
  6. Make sure multiple parity-recoveries are supported when needed.
  7. Fix firmware occasionally crashing after a parity-recovery
  8. Fix a bug where the myrix interfaces could stop receiving packets in
     the mx_ether_rx_frags=0 case.
  9. Fix possible firmware race when configuring the myrix interface up and
     down
 10.
     Fix theoretical bug (never observed) possibly leading to application
     failure because of lib/mcp event-queue overflow
ENHANCEMENTS:
  1. Add counter to monitor link bit-error-rate independently of traffic
     (the category of decoding errors that can be detected even on idle
     links).
  2. Linux: support using/registering buffers allocated by third-party
     drivers.
  3. Linux: new module option mx_mac= to only attach driver to a specific
     NIC (similar in spirit to mx_bus, but selection by MAC-address instead
     of pci bus)
  4. Linux: support for kernel version up to 2.6.33
  5. Application with messages >= 2M can now survive a
     sram-parity-recovery event

MX 1.2.11 (November 5, 2009):
------------------------------
BUG FIXES:
  1. Fix two cases where sram parity error was not recovered.
  2. Windows ndis6: prevent simultaneous init and halt of devices.
  3. Fix endpoint/resource leak race when endpoint is killed while a
     shared-memory communication is in progress.

MX 1.2.10 (October 1st, 2009):
------------------------------
BUG FIXES:
  1. Detect and fix inconsistent MSI configuration in nvidia chipset
  2. Fix possible race leading to "Timed out waiting for MCP to close
     endpoint": scenario was closing endpoints with hung communication to
     "lost" node
  3. mx-kernel-lib mode: avoid a spurious "out of memory" affecting lustre
  4. Avoid possible crash at init time on chipsets which do not accept ecrc
     packets, by only enabling ecrc when appropriate
  5. Fix a panic that can occur with some -RT kernels
ENHANCEMENTS:
  1. support for linux kernel 2.6.30 and 2.6.31
  2. support for parity recovery for 10g
  3. Add --enable-opteron-iommu option to help workaround an opteron hw bug
     on some platforms
  4. Support for FreeBSD 8
  5. Add NIC ability to answer by itself to fma scout
  6. Support for 64-bit kernel in MacOSX 10.6

MX 1.2.9 (April 17th, 2009):
------------------------------
BUG FIXES:
  1. Handle power6 platform unusual MSI behavior
  2. Fix possible endpoint leak on MacOSX (typically with openmpi)
  3.
     Linux: Fix rare possible race if the source process of shared-mem comm
     is killed while the message is transferred.
  4. Fix MXoM gateway handling for ethernet aggregations whose size is not
     a power of two.
ENHANCEMENTS:
  1. support for linux kernel 2.6.29
  2. Myrinet-Ethernet/gateway-cache support on Solaris/FreeBSD/MacOSX
  3. support MacOSX 10.6
  4. support for 8B*-C NICs

MX 1.2.8 (December 23rd, 2008):
------------------------------
BUG FIXES:
  1. Solaris: Fix the time-reference used for retransmit on Solaris with AMD
     processors (could cause unwarranted communication failures with
     "Unreachable" status)
  2. Linux: Fix a bug that was causing ethernet packets, going from a MXoM
     node across a Myrinet-Ethernet bridge, to be broadcast on all Myrinet
     nodes. Bug was present when using mx_ether_rx_frags=1 (which was the
     default for kernels >= 2.6.18).
  3. Make sure mx_gw_cache is compiled/installed by default.
  4. Guarantee minimum progression speed for ethernet in the presence of
     heavy native MX traffic.
  5. Altix: add mmiowb() barrier in Ethernet xmit to guarantee correct
     operation.
  6. Linux: Fix ethernet bug causing driver crash for network buffers with
     many frags on architectures with page-size > 4K (ia64, some ppc64, ...).
  7. Fix a bug that might cause some unwarranted communication failures
     (unreachable status) on some specific pattern of communications.
ENHANCEMENTS:
  1. Support for new Z8ES-based NICs: 10G-PCIE*-8B*-*.
  2. MSI support for Solaris.
  3. Performance improvements on Windows.
  4. mx_bug_report additions.
  5. Try to support all possible combinations of
     --{enable/disable}-{32b/64b} in mixed 32b/64b environments.
  6. Remove a GPL dependency on some 2.6.27 kernels.
  7. Allow automatically redirecting linux kernel messages to a 10g NIC log
     that is available across warm resets (mx_console=1).
  8. Allow specifying several NICs with the mx_bus=<..> option.
  9.
     Add --{enable/disable}-wc64 compile-time option to optimize latency
     depending on whether the architecture supports 64byte write-combined
     requests or not (default is enabled on x86, disabled otherwise).
 10. Add debugging diagnostic MCPs in default tarball
 11. Uses MSI when available on non-x86 archs.
 12. Support Altix with default tarball.
 13. Use pcie transactions with relaxed-ordering for better performance on
     some archs (Cell,...).
 14. Support host with 64k page-size without patch.

MX 1.2.7 (August 1st, 2008):
------------------------------
BUG FIXES:
  1. Make progression-thread more aggressive to avoid spurious "Unreachable
     errors"
  2. Fix a rare interrupt overflow problem that would cause the
     driver/firmware to abort.
  3. Fix IP bug after 2**32 packets on linux kernels >= 2.6.18 and with
     default mx_ether_rx_frags setting (was causing a kernel "segfault" in
     the networking stack)
  4. Leave MSI-X capability intact on unload (only matters if myri10ge
     driver is used afterwards)
ENHANCEMENTS:
  1. Improve handling of removed/exchanged NICs in peer table
     - new NICs have priority when using hostname resolution
     - add new mx_set_peer_name tool to rename entries in the peer table
       (for instance mx_set_peer_name -n 00:60:dd:11:22:33 -S REMOVED)
  2. "/etc/init.d/mx start" is more aggressive in cleaning up a previous
     incomplete startup (but unlike "restart", it won't touch a working MX
     setup).

MX 1.2.6 (May 27th, 2008):
------------------------------
BUG FIXES:
  1. Fix bug causing "MCP endpoint error" kernel message and job abort.
  2. Fix pcie interoperability bug (FC credits mgmt) causing firmware/NIC
     crash with HT2000 on specific workloads.
  3. Fix bug that could cause hang/crash for some workloads on congested
     networks.
  4. Fix initialization of 10G-PCIE-8AE-R card (express module form factor)
ENHANCEMENTS:
  1. With MX in Myrinet-mode, distinguish a link-down from a link wrongly
     plugged in an ethernet port (mx_info/kernel-messages)
  2. Add mx_gw_cache tool to dump gateway cache
  3.
     Better route dispersion to maximize aggregate network throughput
     (probing method)
  4. Linux: distinguish consistent/relax-order dma mappings (to get normal
     performance for big Altix machines)
  5. Specific support for 2K Read-request-size, for SGI Altix gen1 pcie
     chipset
  6. Add low-level diagnostic tools for 10g
     (mx_lmesg/mx_sram_dump/mx_ze_scan)
MAPPING:
  includes fma/fms-1.3.1rc3

MX 1.2.5 (March 20th, 2008):
------------------------------
BUG FIXES:
  1. Fix bug in mx_connect(): if a connection was already established
     between two endpoints, subsequent calls to mx_connect() could return a
     garbage endpoint_addr.
  2. Fix firmware problem, typically happening based on some resend
     condition, or on temporary link-down situations.
  3. Powerpc/Linux: fix IOMMU detection for recent Linux kernels.
  4. Fix possible kernel oops when using kernel with 64K page size.
ENHANCEMENTS:
  1. Make overlap easier to use by generally completing large send or recv
     earlier and without depending on the other side to complete.
  2. Support for linux kernel 2.6.24 and 2.6.25rc
  3. MSI support for FreeBSD.
  4. Better route dispersion to maximize aggregate network throughput.
  5. Add LRO to Linux IPoM driver.
  6. Update fma (see fma/CHANGES)
MAPPING:
  includes fma/fms-1.3.1rc2 (see fma/CHANGES)

MX 1.2.4 (November 9th, 2007):
------------------------------
BUG FIXES:
  1. Fix race in hostname resolution query timeout that could generate a
     fatal error in the NIC firmware in some cases.
  2. Fix for E-cards: link status in mx_info was incorrectly reported for
     the second port since mx-1.2.2.
ENHANCEMENTS:
  1. Substantially improve bidirectional bandwidth, especially when
     receiving more than 63 large sends at the same time (useless OSU
     benchmark).
  2. Prevent network contention due to two senders sending large messages
     to the same receiver at the same time.
  3. MacOSX: Add registration-cache capability (with MX_RCACHE=1).
  4.
     Make MX_CSUM=1 debugging option compatible with all MX software (rather
     than being limited to mpich*-mx).
MAPPING:
  includes fma/fms-1.3.0 (see fma/CHANGES)

MX 1.2.3 (October 1st, 2007):
-----------------------------
BUG FIXES:
  1. Correctly categorize Parity errors by NIC unit on 10G cards.
  2. Linux: Update driver to properly support kernel >= 2.6.22
  3. Linux: Workaround a bug in kernel 2.6.21/2.6.22 related to MSI.
  4. Solaris: support enabling 10g and 2g at the same time
  5. Fix possible NMI or machine crash caused by speculative reads from the
     processor in a "write-only" NIC region.
ENHANCEMENTS:
  1. Support for MIPS processor.
  2. Performance improvements for processors without write-combining
     support.
  3. Upgrade fma/fms to version 1.3.0
MAPPING:
  includes fma/fms-1.3.0 (see fma/CHANGES)

MX 1.2.2 (August 17, 2007):
---------------------------
BUG FIXES:
  1. Linux: fix support for VLAN inside "IP" driver.
  2. Fix a bug when using mx_isend "auto-forget-req" mode (i.e. passing a
     NULL pointer for the request).
  3. Close race between large message resend and memory deregistration that
     could lead to Read DMA on a deregistered page, which would lead to an
     IOMMU error (the potentially stale data was always dropped on receive
     side).
ENHANCEMENTS:
  1. Detect network mismatch between 10g ethernet and 10g Myrinet
     automatically.
  2. Do kernel mapping with write-combining for "IP" and mx-kernel-lib
     benefit.
  3. Make the debug lib less intrusive by not including the
     "MX_MATTER_DEBUG" checks by default.
  4. Macosx: start Leopard support.
  5. Add more debugging info when a bad session is detected.
  6. Add sanity checks in the net send list operations.
MAPPING:
  includes fma/fms-1.2.5a (see fma/CHANGES)

MX 1.2.1 (June 12, 2007):
-------------------------
BUG FIXES:
  1. Fix bug where sometimes Ethernet mode did not work on certain SPARC
     architectures.
  2.
     Fix a bug where small buffers created above 4GB would lose the upper
     32 bits of their DMA address due to a double conversion from
     MX_HIGHPART_TO_U32().
  3. Fix Ethernet checksums when checksum offset is not on segment boundary.
  4. Fix linked list termination bug when sending a lot of NACKs.
  5. Handle very slow interrupt handler.
ENHANCEMENTS:
  1. Acks at the lib level for 2G (already in 10G).
  2. Transparent SRAM Parity Error recovery for 2G NICs.
  3. New FMA (see fma/CHANGES)
  4. Support for MX-over-Ethernet (MXoE) for 10G NICs with the link in
     Ethernet mode, connected to an Ethernet switch.
  5. MXoM/MXoE wire interoperability through Myrinet/Ethernet Bridges.
  6. Report number of Ethernet PAUSE packets in counters.
  7. Make MX_RCACHE=1 the default on Linux
MAPPING:
  includes fma/fms-1.2.5a (see fma/CHANGES)

===============================================================================

MX-10G 1.2.0j (May 8, 2007):
----------------------------
BUG FIXES:
  1. Fix problem with mx_wait possibly timing out earlier than requested.
ENHANCEMENTS:
  1. Improve ethernet trunking for situations with two Myrinet fabrics
     linked through an ethernet gateway.
  2. Add possibility of passing NULL request to mx_isend/mx_irecv as an
     alternative to mx_forget_request()

MX-10G 1.2.0i (February 15, 2007):
----------------------------------
ENHANCEMENTS:
  1. Limit pcie wdma max-payload-size to 256 on HT2100, which seems to give
     better WDMA performance than the typical 512 BIOS default.
  2. Add an install-only target to prevent permission problem with "make
     install" from a "root_squashed" nfs build directory.
  3. Add new mx_get_info() keys MX_NET_TYPE, MX_NUMA_NODE, MX_LINE_SPEED

MX-10G 1.2.0h (January 25, 2007):
---------------------------------
BUG FIXES:
  1. Fix a problem under heavy traffic where some medium messages might fail
     spuriously with a "bad session" error.
  2. Fix support for more than 4 endpoints (all requests of endpoints above
     4 would hang forever)
  3.
     Fix possible message corruption under specific loads.
  4. Remove usage of "might_sleep" ethernet gateway-cache mutex in interrupt
     mode.
  5. Fix possible deadlock when many large messages are under transmission
  6. Solaris: fix possible crash or corruption when the OS tries to swap out
     large messages.
ENHANCEMENTS:
  1. Streamline "endpoint closed" error messages.
  2. Print specific type of Parity error for ZE chip.

MX-10G 1.2.0g (December 18, 2006):
----------------------------------
BUG FIXES:
  1. Fix ethernet over Myrinet *major* performance problem introduced in
     1.2.0f, upgrade from 1.2.0f is mandatory for anybody using ethernet
     over Myrinet.
ENHANCEMENTS:
  1. MacOSX/Intel "beta" support
  2. Support ethernet bridging or connectivity across 2Z linecards.
  3. DESTDIR=... in "make install" now has its GNU semantics, old behavior
     (overriding installation dir) is done with the more "standard"
     make install prefix=<...>

MX-10G 1.2.0f (November 27, 2006):
----------------------------------
BUG FIXES:
  1. Correctly propagate nacks to application (to avoid all communication
     errors being reported as "timed-out").
  2. Fix a rare case of possible message corruption occurring
     - either if interrupt delivery is delayed by the system for a very
       long time.
     - or in case of network reordering with a very specific combination of
       buffer alignments
ENHANCEMENTS:
  1. Solaris release.
  2. Debugging functionality for MacOSX
  3. Preliminary MacOSX/Intel support
  4. Linux: use PAT extensions to enable write-combining
  5. New FMS 1.2.1 (see fma/CHANGES)

MX-10G 1.2.0e (October 02, 2006):
---------------------------------
BUG FIXES:
  1. Correctly propagate nacks to application (to avoid all communication
     errors being reported as "timed-out").
ENHANCEMENTS:
  1. Solaris release.
  2. Debugging functionality for MacOSX

MX-10G 1.2.0d (September 08, 2006):
-----------------------------------
BUG FIXES:
  1. Better reporting of communication errors.
ENHANCEMENTS:
  1. experimental support for: mx_iconnect(), mx_disconnect()
  2.
     better handling of client-server situations where one peer reboots or
     disappears/reappears.
  3. ppc: use altivec for 10g case, remove "guarded" bit from IO mappings.
  4. performance optimization for huge pages
  5. 2.6.18 support

MX-10G 1.2.0c (July 28, 2006):
------------------------------
BUG FIXES:
  1. Fix first 8 byte corruption of small messages, occurring with N->1
     stream pattern or heavy contention.
ENHANCEMENTS:
  1. Upgrade to fma-1.1.4 (see fma/CHANGES)

MX-10G 1.2.0b:
--------------
ENHANCEMENTS:
  1. Better overlap between computation and communication

MX-10G 1.2.0a:
--------------
BUG FIXES:
  1. Fix rare problem in mx_connect

MX-10G 1.2.0 (June 10, 2006):
-----------------------------
ENHANCEMENTS:
  1. Major rework of firmware and lib
  2. New API function mx_test_any() and mx_wait_any()
  3. available on 10g NICs

===============================================================================

MX-2G 1.1.8 (June 07, 2007):
----------------------------
BUG FIXES:
  1. Fix a PCIX timing problem leading to possible event or message
     corruption, often manifesting itself with a "Unknown mcp event type",
     or <> error message. Typically happens on platform with
     PCIe-to-PCI-X bridge and PCI-X at 133MHz.
  2. Fix fms segfault with some configurations.
  3. Fix compilation flags on MacOSX/Intel

MX-2G 1.1.7 (May 08, 2007):
---------------------------
BUG FIXES:
  1. Fix some crash problems on Solaris.
  2. Fix bug where sometimes Ethernet mode did not work on certain SPARC
     architectures under Solaris.
  3. Fix problem with mx_wait possibly timing out earlier than the requested
     timeout.
ENHANCEMENTS:
  1. Support for Linux up to 2.6.21-git9
  2. Distinguish inactive/active peers in mx_info with a 'D' flag (as in
     "Doesn't answer mapping packets"), always print all peers.
  3. Add support for Macintel
  4. Enable write-combining for powerpc/linux
  5. Update FMA, see fma/CHANGES for details

MX-2G 1.1.6 (November 11, 2006):
--------------------------------
BUG FIXES:
  1.
     Fix a case of possible message corruption, if interrupt delivery is
     delayed by the system for a very long time.
ENHANCEMENTS:
  1. Support for Linux >= 2.6.19
  2. Integrate FMA 1.2.1, see fma/CHANGES for details

MX-2G 1.1.5 (October 18, 2006):
-------------------------------
BUG FIXES:
  1. Make sure SRAM parity errors are correctly reported as such in mx_info.
ENHANCEMENTS:
  1. Support for Linux >= 2.6.18
  2. Integrate FMA 1.2.0, see fma/CHANGES for details
  3. Solaris improvements, support Sun cc as well as gcc.

MX-2G 1.1.4 (August 02, 2006):
------------------------------
ENHANCEMENTS:
  1. Upgrade FMA to version 1.1.4, see fma/CHANGES for details.

MX-2G 1.1.3 (June 01, 2006):
----------------------------
BUG FIXES:
  1. Fix bug in FMS
  2. Fix PT2PT support in FMA.
ENHANCEMENTS:
  1. Linux: Add support for hugepages.
  2. Linux: Allow read-only data when sending large messages.

MX-2G 1.1.2 (May 17, 2006):
---------------------------
BUG FIXES:
  1. Increase watchdog timeout to avoid false positive on ppc64.
  2. Better PCI parity error detection.
  3. Fix interrupt error path to dump the queue only once.
  4. Fix missing mx_cancel and mx_register_unexp_callback in the kernel lib.
  5. Fix MSI.
  6. Improve error reporting in the library.
ENHANCEMENTS:
  1. Upgrade FMA to version 1.1.1, see fma/CHANGES for details.
  2. Support for Linux kernel version 2.6.16.
  3. Use a bigger portion of the IOMMU (when not shared) on ppc64.
  4. Do not display inactive peers in mx_info by default, use -a to get all
     peers (previous behavior).
  5. Document environment variables in the README.
  6. Support Solaris 10 on both Sparc and amd64 platforms.

MX-2G 1.1.1 (January 13, 2006):
-------------------------------
BUG FIXES:
  1. Fix MCP race causing Lanai Memory error.
  2. Make mx_cancel/mx_register_unexp_callback available in the kernel.
  3. Fix accounting associated with the MX_PARAM_UNEXP_QUEUE_MAX option when
     used with self/shmem messages.
  4.
     Fix mx_probe()/mx_register_unexp_callback-invocation reporting
     incorrect length for unexp messages.
  5. Fix compilation with 2.6.15 linux kernels.
  6. Support for some xserve w/o IOMMU.

MX-2G 1.1.0 (November 09, 2005):
--------------------------------
BUG FIXES:
  1. Prevent spurious interrupts on x86/x86_64 machines (was typically
     causing the interrupt line to become disabled by the OS on power5
     machines).
  2. Prevent a rogue DMA-read possibly causing problems on IOMMU platforms
     (powerpc64), only occurring in the presence of
     crc-errors/lost-packets.
MAJOR ENHANCEMENTS:
  1. Introduction of the FMS: http://www.myri.com/scs/fms/ (old mapper still
     usable with --disable-fms option).
API ADDITIONS:
  1. Add new primitive mx_register_unexp_callback() allowing an
     "active-message" like programming model.
ENHANCEMENTS:
  1. MX_NOMYRINET=1 allows running simple MX programs without a Myrinet
     card.
  2. Support MSI interrupts.
  3. Add an option MX_CSUM=1 (requires --enable-debug) to checksum messages.

===============================================================================

MX-2G 1.0.3 (October 05, 2005):
-------------------------------
BUG FIXES:
  1. Fix a typo which prevented the endpoint close timeout from actually
     being increased to 10 seconds.
  2. Fix a compilation error affecting IA64 when using some Linux 2.4
     kernels.

MX-2G 1.0.2 (September 21, 2005):
---------------------------------
BUG FIXES:
  1. Fix a typo which would send an Ethernet broadcast packet with an MX
     header, resulting in firmware crashes on remote nodes.
  2. Fix a priority problem which could cause general slowdowns, and delays
     when closing endpoints on heavily loaded NICs.
  3. Increase the time the driver will wait to close an endpoint to 10
     seconds, to ensure that the driver does not prematurely declare the
     firmware dead.
  4. Fix driver bug which could result in a kernel oops on Linux/ppc64.
  5.
     Eliminated a race which could lead to data corruption when bringing the
     myri interface down and then up again on a heavily loaded network.
  6. Fix linux symbol versioning to ensure dlsym() will pick the right
     function pointer.
  7. Disabled experimental SRAM parity error recovery code until it is
     completed in a future release.
ENHANCEMENTS:
  1. Improved debugging support.
  2. Support for Linux kernel version >= 2.6.11 on powerpc64.

MX-2G 1.0.1 (August 22, 2005):
------------------------------
BUG FIXES:
  1. Fix a bug in the route-updates mechanism which was causing a range of
     firmware problems.
  2. Fix a shmem bug in Linux (causing "mx__direct failed" error message).
  3. Fix a bug causing mx_issend to always use the network, possibly causing
     a message-ordering problem if mx_isend was using shared-mem
     communications.
  4. Don't rely on or install mx_auto_config.h to avoid potential conflicts
     with the use of define PACKAGE_xxx for applications also using
     autoconf.
  5. Fix buffer leaks in the ethernet/ip linux driver, which were causing
     connections to stall.
  6. Fix a bug related to the use of the MX_PARAM_UNEXP_QUEUE_MAX option to
     mx_open_endpoint(), when sending messages unidirectionally with a slow
     receiver.
  MacOSX updates:
  7. MacOSX will work correctly on machines with more than 4GB.
  8. Fixed a bug where the memory size on MacOSX 10.3 was detected as 2GB
     for all memory sizes 2GB and larger.
  9. Worked around limitations in MacOSX which could lead to data corruption
     when using MX shmem to exchange messages larger than 256MB, or messages
     which crossed a 256MB boundary in the address space of the receiver.
 10. Make more efficient use of the IOMMU.
 11. Other MacOSX bug fixes
ENHANCEMENTS:
  1. Windows port.
  2. Support point-2-point routes:
     * don't require a network for same-NIC communications
     * properly distinguish pt2pt routes from no route
  3. Support for newer/upcoming linux kernels.
  4. Addition of a mx_bug_report script to help report system information on
     linux platforms.
  5.
     Shared-memory performance improvements.
  6. Add ability to build binary rpms in the makefile.
MX API CHANGES/ADDITIONS:
  1. Add MX_PORT_COUNT key to mx_get_info()

MX-2G 1.0.0 (June 14, 2005):
----------------------------
Initial release.

===============================================================================