Intel® Math Kernel Library 7.0 for Linux*
Release Notes

Overview
New in Intel® MKL 7.0
System Requirements
Installation
Directory Structure
Known Limitations
Technical Support and Feedback
Related Products and Services
Copyright and Legal Information

Overview

The Intel® Math Kernel Library (Intel® MKL) provides developers of scientific, engineering, and financial software with linear algebra routines, discrete Fourier transforms, vectorized math functions, and random number generators, all optimized for the latest Intel® Pentium® 4 processors, the Intel® Pentium® M processor component of Intel® Centrino™ mobile technology, and Intel® Xeon™ and Intel® Itanium® 2 processors.

Intel MKL provides linear algebra functionality through LAPACK (solvers and eigensolvers) and through level 1, 2, and 3 BLAS, which offer the vector, vector-matrix, and matrix-matrix operations needed for complex mathematical software. For solving sparse systems of equations, Intel MKL now provides a direct sparse solver with two interfaces: the PARDISO interface and the DSS interface. Intel MKL offers multidimensional discrete Fourier transforms (1D, 2D, 3D) with mixed-radix support, so transform sizes are not limited to powers of 2.

Intel MKL also includes a set of vectorized transcendental functions, the Vector Math Library (VML), which offers both greater performance and excellent accuracy compared to the scalar libm functions on most processors. The Vector Statistical Library (VSL) offers high-performance, hand-tuned vectorized random number generators for a number of probability distributions. Intel MKL is a fully thread-safe library and also offers multi-threading support using OpenMP*.
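As a rough illustration of the three BLAS levels mentioned in the overview, the sketch below writes out one representative operation per level in plain Python. These are reference definitions only and are not Intel MKL code. The function names echo the standard BLAS routines DAXPY, DGEMV, and DGEMM, but the simplified signatures (no strides, no transposition flags, no alpha/beta scaling for the matrix routines) are assumptions made for brevity.

```python
def daxpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def dgemv(a, x):
    """Level 2 (matrix-vector): return A*x, with A given as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dgemm(a, b):
    """Level 3 (matrix-matrix): return A*B, with both given as lists of rows."""
    cols_b = list(zip(*b))  # columns of B, so each entry is a row-by-column dot product
    return [[sum(aik * bkj for aik, bkj in zip(row, col)) for col in cols_b]
            for row in a]
```

The amount of arithmetic per data element grows with the level, which is why the level 3 routines usually benefit most from an optimized library.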

Version 7.0 of Intel MKL introduces:

Direct sparse solver (PARDISO)
New Vector Statistical random number generator functions

For detailed information on these features, please refer to the "New in Intel® MKL 7.0" section below.

The original versions of the BLAS from which that part of Intel MKL was derived can be obtained from http://www.netlib.org/blas/index.html. The original versions of LAPACK from which that part of Intel MKL was derived can be obtained from http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen.

New in Intel® MKL 7.0

Functionality
- Sparse solver with PARDISO and DSS interfaces.
- Implemented 2D DFT for real data types
- Added two quasi-random basic generators (Sobol and Niederreiter), a Poisson distribution generator with varying mean (vector Poisson generator), and a multivariate (correlated) normal distribution generator. Please see the VSL Notes (file vslnotes.pdf) and the manual for further information.
Performance improvements since Intel® MKL 6.1.1
- Improvements for the Intel® Itanium® 2 processor
  - BLAS
    - Improved DGEMM performance by 150-200% on small sizes up to 40 and 5-15% on other sizes.
  - DFT
    - Improved performance by up to 100% on the 1D DFT for small to medium sizes (up to 256)
- Improvements for the Intel® Pentium® 4 processor
  - BLAS
    - Improved DGEMM performance by 5-50% for small sizes of M dimension (up to 200) and by 2% on big sizes.
    - Improved DAXPY and DSCAL performance by 8-70% on sizes up to 1000 when data is in cache.
    - Improved DDOT and DASUM performance by 7-50% on sizes up to 3000 when data is in cache.
  - DFT
    - Improved performance by up to 100% on the 1D DFT for small to medium sizes (up to 256)
  - VML
    - Improved performance of the vdSin, vdCos, vdTan, vdSinCos, vdTanh, vsSqrt, vsAsin, vsPow, and vdErf functions for the Intel® Pentium® 4 processor with SSE3 extensions. See the VML Notes document for details.
  - VSL
    - Improved performance of the MRG32k3a basic random number generator (BRNG) by 70% for Intel® Pentium® 4 processors with SSE3 extensions. Please note that the performance data in the VSL Notes document is not up to date for this BRNG.
Other improvements
- DFT source code examples have been introduced.
- Fixed issues with accuracy in SDOT for the Itanium 2 processor
- New configuration parameter DFTI_NUMBER_OF_USER_THREADS introduced to specify the number of application threads that will call DFT computational routines with the same descriptor. See the MKL manual for more details (file: mklman.pdf).
- Fixed performance drop of DBDSQR and other LAPACK functions which intensively use (D,S)LARTG and (D,S)LAMCH
- Corrected behavior of floating point comparisons containing NaNs in BLAS and LAPACK functions for IA-32
- Documentation updated: new section on the PARDISO direct sparse solver.

System Requirements

Recommended hardware: a PC, workstation or server, with Intel® Xeon™ processor, Pentium 4 processor, or Itanium® 2 processor.

Software requirements

Red Hat* Linux* version 9.0 (on IA-32 systems only)
Red Hat* EL 2.1
Red Hat* EL 3.0
SuSE* Linux* 8.2 (on IA-32 systems only)
SuSE* Linux* Enterprise Server 8

Intel® Fortran Compiler versions 7.1 and 8.0
Intel® C++ Compiler versions 7.1 and 8.0
GNU compiler collection

Note: Some parts of Intel MKL have Fortran interfaces and Fortran data structures, while other parts have C interfaces and C data structures. The user notes file (mkluse.htm in the doc directory) contains advice on how to link to Intel MKL with different compilers.

Installation

To install the Intel MKL package on Linux*, use the following instructions. The installation software installs the full Intel MKL file set for all supported processors. See the Intel MKL website for updates, when available.

Use the tar command to extract the Intel MKL package in a directory to which you have write access (e.g., tar -xvf package.tar).
Become the root user and execute the install script in the directory where the tar file was extracted by typing "./install.sh".
- The use of rpm necessitates root access to your system. If you do not have root access, contact customer support for direct access to the RPM package and for workaround information.
The Intel® Performance Libraries products already installed will be listed, followed by a menu of products to install which includes:
- Intel® Math Kernel Library Version 7.0
Select a package to install. All packages needed to use the product will also be installed. The default RPM options [-ivh --force] are recommended to force the update of existing files. The recommended (default) installation directory is /opt/intel. In the directory you choose, a directory named mkl70 will be created and all files will be installed there. Any previous installation, including Intel MKL 6.0 and Intel MKL 6.1, may remain installed when installing Intel MKL 7.0, but you will be required to remove Intel MKL 7.0 Beta if you have it installed. If you keep multiple versions installed, be sure to update your build scripts to point to the desired version of Intel MKL.
- The Intel MKL installation program uses RPM as the installation vehicle. Some versions of RPM do not allow redirection of installation. If the install program detects that you have a version of RPM that does not allow redirection, you will be required to install to the default directory.
After installation, the packages installed will be redisplayed, followed by a redisplay of the install menu. Enter 'x' to exit the install script.

Two files, mklvars32.sh and mklvars64.sh, will be placed in the tools/environment directory. These files can be used to set the INCLUDE and LD_LIBRARY_PATH environment variables in the current user shell.
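The effect of these scripts can be approximated as follows. This is only a sketch: the release notes do not list the exact values the scripts assign, so the paths below are inferred from the Directory Structure section and the default installation directory, and the function name mkl_environment is hypothetical.

```python
import os


def mkl_environment(root="/opt/intel/mkl70", arch="32", base=None):
    """Return a copy of `base` with INCLUDE and LD_LIBRARY_PATH pointing at an MKL tree.

    Assumption: include files live in <root>/include and libraries in
    <root>/lib/<arch>, as listed under Directory Structure.
    """
    if arch not in ("32", "64"):
        raise ValueError("arch must be '32' (IA-32) or '64' (Itanium 2)")
    env = dict(os.environ if base is None else base)
    additions = {
        "INCLUDE": f"{root}/include",
        "LD_LIBRARY_PATH": f"{root}/lib/{arch}",
    }
    for name, path in additions.items():
        old = env.get(name)
        # Prepend so this MKL version is found before any other installed version.
        env[name] = path if not old else f"{path}{os.pathsep}{old}"
    return env
```

Prepending rather than overwriting keeps any existing search paths intact, which matters when several Intel MKL versions remain installed side by side.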

Intel MKL uses Macrovision's* FLEXlm* electronic licensing technology. License management should be transparent, but if you have any problems during installation, please make sure a current license file (*.lic) is located in the same directory as the install file. If you still have problems, please submit an issue to Intel® Premier Support. See the "Technical Support and Feedback" section of this document for details.

Directory Structure

The information below indicates the high level structure for Intel MKL.

mkl70                        Main directory
    mklnotes.htm             Release notes (this file)
    mkllic.htm               Intel MKL license
    redist.txt               List of redistributable files
mkl70/doc                    Directory for documents
    index.htm                Index to the Intel MKL documentation
    mklman.pdf               Intel MKL manual
    mkluse.htm               User notes for Intel MKL
    vmlnotes.htm             General discussion of VML
    vslnotes.pdf             General discussion of VSL
mkl70/examples               Source and data for examples
mkl70/include                Contains include files for both library routines and test and example programs
mkl70/tests                  Source and data for tests
mkl70/lib/32                 Contains static libraries and shared objects for IA-32 applications
mkl70/lib/64                 Contains static libraries and shared objects for the Itanium® 2 processor
mkl70/tools/environment      Contains shell scripts to set environment variables in the user shell
mkl70/tools/support          Contains a utility for reporting package ID and license key information to Intel® Premier Support

Known Limitations

Limitations to the sparse solver in Intel MKL 7.0:

The default number of threads (when OMP_NUM_THREADS is not set) is equal to the number of processors in the system. This differs from the default OpenMP mode in the rest of Intel MKL, where the number of threads defaults to one.
Only statically linkable sparse solver library files are available with this release.
Enhanced precision accumulation is implemented in long doubles (10-byte real precision).
Statistics output is not implemented (msglvl=1 will not deliver statistics).
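The difference in default thread counts described in the first item above can be modeled as follows. This is a sketch of the stated behavior, not of how Intel MKL actually reads its settings; the function name, the component label "sparse_solver", and the assumption that OMP_NUM_THREADS holds a plain positive integer are all illustrative.

```python
import os


def default_thread_count(component, environ=None, cpu_count=None):
    """Model the default thread count described in these release notes.

    component: "sparse_solver" for the PARDISO/DSS solver; any other
               value stands for the rest of the library (hypothetical labels).
    """
    environ = os.environ if environ is None else environ
    value = environ.get("OMP_NUM_THREADS")
    if value:
        # Assumption: the variable holds a plain positive integer.
        return max(1, int(value))
    if component == "sparse_solver":
        # Sparse solver default: one thread per processor in the system.
        return cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Default for the rest of Intel MKL 7.0: a single thread.
    return 1
```

Setting OMP_NUM_THREADS explicitly is the simplest way to get the same thread count from the sparse solver and from the other threaded routines.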

There are a number of limitations in the current implementation of the set of DFT functions:

The function DftiCopyDescriptor is not implemented.
The function DftiGetValue is implemented with the following restriction: the DFTI_FORWARD_ORDERING and DFTI_BACKWARD_ORDERING parameters are not yet supported.
Complex data is stored using the Fortran data type; real and imaginary parts are adjacent.
Modes DFTI_INITIALIZATION_EFFORT, DFTI_WORKSPACE, and DFTI_TRANSPOSE are implemented only for the default case. DFTI_FORWARD_SIGN can have the default value only and is not changeable by the DftiSetValue function.
DFTI_PRECISION, DFTI_DIMENSION, and DFTI_LENGTHS are settable only through the DftiCreateDescriptor function and are not changeable by the DftiSetValue function.
Mode DFTI_FORWARD_DOMAIN cannot have the value DFTI_CONJUGATE_EVEN.
3D real DFT is not currently implemented.
Modes DFTI_REAL_STORAGE and DFTI_CONJUGATE_EVEN_STORAGE can have the default value only and are not changeable by the DftiSetValue function (i.e., DFTI_REAL_STORAGE = DFTI_REAL_REAL and DFTI_CONJUGATE_EVEN_STORAGE = DFTI_COMPLEX_REAL).
Mode DFTI_COMPLEX_STORAGE can have the default value only and is not changeable by the DftiSetValue function. In other words, DFTI_COMPLEX_STORAGE is always DFTI_COMPLEX_COMPLEX.
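The storage restriction above (complex data with adjacent real and imaginary parts, i.e. DFTI_COMPLEX_COMPLEX) corresponds to an interleaved memory layout. The sketch below shows that layout using plain Python lists; the helper names interleave and deinterleave are hypothetical and are not part of the DFT interface.

```python
def interleave(values):
    """Flatten complex numbers into [re0, im0, re1, im1, ...]."""
    flat = []
    for z in values:
        z = complex(z)
        flat.extend((z.real, z.imag))
    return flat


def deinterleave(flat):
    """Rebuild complex numbers from an interleaved [re, im, re, im, ...] list."""
    if len(flat) % 2:
        raise ValueError("interleaved data must have an even number of entries")
    return [complex(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]
```

Data kept in separate real and imaginary arrays would need to be converted to this form before being passed to the DFT functions in this release.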

When using the DFTs in Intel MKL it may be necessary to explicitly link 'libm'. Please include '-lm' on your link line after any reference to Intel MKL library files.
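For build scripts that assemble the link line programmatically, one simple way to honor this ordering is to force '-lm' to the very end. The sketch below is a minimal illustration; the function name is hypothetical, and '-lmkl_example' in the usage line is a placeholder rather than a real Intel MKL library name (see mkluse.htm for the actual libraries to link).

```python
def with_libm_last(link_args):
    """Return link_args with a single '-lm' moved to, or added at, the end."""
    args = [arg for arg in link_args if arg != "-lm"]
    args.append("-lm")  # last position is after every other library reference
    return args


# Placeholder library name, for illustration only.
example = with_libm_last(["cc", "main.o", "-lm", "-lmkl_example"])
```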

Hyperthreading is especially effective when each thread is performing different types of operations and when there are under-utilized resources on the processor. Intel MKL fits neither of these criteria as the threaded portions of the library execute at high efficiencies (using most of the available resources) and perform identical operations on each thread. You may obtain higher performance when using Intel MKL without hyperthreading enabled.

DFT, VML, and VSL functions cannot be used with Fortran 77 compilers.

Memory Allocation: In order to achieve better performance, memory allocated by Intel MKL is not released. This behavior is by design and is a one time occurrence for Intel MKL routines that require workspace memory buffers. Even so, the user should be aware that some tools may report this as a memory leak. Should the user wish, memory can be released by the user program through use of a function (MKL_FreeBuffers()) made available in Intel MKL or memory can be released after each call by setting an environment variable (MKL_DISABLE_FAST_MM) (see technical user notes in the doc directory for more details). Using one of these methods to release memory will not necessarily stop programs from reporting memory leaks, and in fact may increase the number of such reports should you make multiple calls to the library thereby requiring new allocations with each call. Memory not released by one of the methods described will be released by the system when the program ends. The maximum number of buffers allocated in each thread is 32. To avoid this restriction disable memory management as described above.

On Red Hat* Enterprise Linux 3.0, the environment variable LD_ASSUME_KERNEL must be set to ensure that the correct support libraries are linked. For example: 'export LD_ASSUME_KERNEL=2.4.1'
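Two of the notes in this section rely on environment variables: MKL_DISABLE_FAST_MM and LD_ASSUME_KERNEL. When an application is launched from a script, both can be set for that application only, as sketched below. The function name is hypothetical, and the value "1" for MKL_DISABLE_FAST_MM is an assumption, since these notes say only that the variable must be set.

```python
import os


def workaround_env(disable_fast_mm=False, assume_kernel=None, base=None):
    """Return a copy of `base` (default: the current environment) with the
    requested workaround variables added."""
    env = dict(os.environ if base is None else base)
    if disable_fast_mm:
        # Assumption: any value enables the setting; the notes do not specify one.
        env["MKL_DISABLE_FAST_MM"] = "1"
    if assume_kernel is not None:
        # For example "2.4.1" on Red Hat Enterprise Linux 3.0, as noted above.
        env["LD_ASSUME_KERNEL"] = assume_kernel
    return env
```

The result can be passed as the env argument of subprocess.run when starting the application, which leaves the environment of the calling shell unchanged.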

Technical Support and Feedback

Self Help and User Forums

A rich repository of self-help product information such as tutorials, getting started tips, known product issues, product errata, compatibility information, and answers to frequently asked questions can be found at the Intel® Software Development Products Technical Support site. It's a great place to find answers quickly or to gain insight into using our products effectively.

The Intel MKL User Forum is the place to ask questions of and share information with other users of Intel® MKL.

Submitting Issues

Your feedback is very important to us. To receive technical support and product updates for the tools provided in this product you need to register at the Intel® Registration Center and click on "Create New Account".

For information about Intel® MKL, including FAQs, tips and tricks, and other support information, please visit: http://support.intel.com/support/performancetools/libraries/mkl

Note: If you are having trouble registering or are unable to access your Premier Support account, contact developer.support@intel.com. Please do not email your technical issue to developer.support@intel.com as it is not a secure medium.

To submit an issue via the Intel® Premier Support website, please perform the following steps:

Ensure that Java* and JavaScript* are enabled in your browser.
Go to https://premier.intel.com/.
Type in your Login and Password. Both are case-sensitive.
Click the "Submit Issues" button.
Read the Confidentiality Statement and click the "I Accept" button.
Click on the "Go" button next to the "Product" drop-down list.
Click on the "Submit Issue" link in the left navigation bar.
Choose "Development Environment (tools,SDV,EAP)" from the "Product Type" drop-down list.
If this is a software or license-related issue choose "Intel(R) MKL for Linux*" from the "Product Name" drop-down list.
Enter your question and complete the fields in the windows that follow to successfully submit the issue.

Please follow these guidelines when forming your problem report or product suggestion:

Describe your difficulty or suggestion.
For problem reports please be as specific as possible (e.g., including compiler and link command line options), so that we may reproduce the problem. Please include a small test case if possible.
Describe your system configuration information.
Be sure to include specific information that may be applicable to your setup: operating system, name and version number of installed applications, and anything else that may be relevant to helping us address your concern.

Related Products and Services

Information on Intel® software development products is available at http://www.intel.com/software/products. Some of the related products include:

The Intel® Software College provides interactive tutorials, documentation, and code samples that teach Intel® architecture and software optimization techniques.
The VTune™ Performance Analyzer allows you to evaluate how your application is utilizing the CPU and helps you determine if there are modifications you can make to improve your application's performance.
The Intel® C++ and Fortran Compilers are an important part of making software run at top speeds and fully support the latest Intel IA-32 and Itanium processors.
The Intel® Performance Library Suite provides a set of routines optimized for various Intel® processors. The Intel® Math Kernel Library provides developers of scientific and engineering software with a set of linear algebra, fast Fourier transform, and vector math functions optimized for the latest Intel Pentium and Intel Itanium processors. The Intel® Integrated Performance Primitives consists of cross-platform tools to build high-performance software for several Intel architectures and several operating systems.

Copyright and Legal Information

Celeron, Dialogic, i386, i486, iCOMP, Intel, Intel Centrino, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Inside, Intel Inside logo, Intel NetBurst, Intel NetStructure, Intel Xeon, Intel XScale, Itanium, MMX, MMX logo, Pentium, Pentium II Xeon, Pentium III Xeon, and VTune are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
* Other names and brands may be claimed as the property of others.
