
I want to test and compare the performance of NumPy matrix multiplication and eigendecomposition with and without Intel MKL.

I installed MKL using pip install mkl (Windows 10, 64-bit, Python 3.8).

I then used examples from here for matmul and eigen decompositions.

How do I now enable and disable MKL in order to check numpy performance with MKL and without it?

Reference code:

import numpy as np
from time import time

def matrix_mul(size, n=100):
    # reference: https://markus-beuckelmann.de/blog/boosting-numpy-blas.html
    np.random.seed(112)
    a, b = np.random.random((size, size)), np.random.random((size, size))
    t = time()
    for _ in range(n):
        np.dot(a, b)
    delta = time() - t
    print('Dotted two matrices of size %dx%d in %0.4f ms.' % (size, size, delta / n * 1000))


def eigen_decomposition(size, n=10):
    np.random.seed(112)
    a = np.random.random((size, size))
    t = time()
    for _ in range(n):
        np.linalg.eig(a)
    delta = time() - t
    print('Eigen decomposition of size %dx%d in %0.4f ms.' % (size, size, delta / n * 1000))

#Obtaining computation times: 

for i in range(20): 
    eigen_decomposition(500)

for i in range(20): 
    matrix_mul(500)

2 Answers

You can use separate environments to compare NumPy with and without MKL. In each environment, install the package you need (NumPy with MKL or without) using a package installer, then run your program in each environment and compare the timings.

NumPy doesn’t depend on any other Python packages, however, it does depend on an accelerated linear algebra library - typically Intel MKL or OpenBLAS.

  • The NumPy wheels on PyPI, which is what pip installs, are built with OpenBLAS.

  • In the conda defaults channel, NumPy is built against Intel MKL. MKL is a separate package that will be installed in the users' environment when they install NumPy.

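  • In the conda-forge channel, NumPy is built against a dummy “BLAS” package. When a user installs NumPy from conda-forge, that BLAS package then gets installed together with the actual library. This defaults to OpenBLAS, but it can also be MKL (from the defaults channel), or even BLIS or reference BLAS.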

Please refer to this link for details on installing NumPy.

You can create two different environments to compare the NumPy performance with MKL and without it. In the first environment install the stand-alone NumPy (that is, the NumPy without MKL) and in the second environment install the one with MKL.

To create an environment that uses NumPy without MKL:

conda create -n <env_name_1> python=<version>
conda activate <env_name_1>
pip install numpy

Depending on your OS (Windows in particular), however, a distribution without MKL might not be available:

On Windows, we have always been linking against MKL. However, with the Anaconda 2.5 release we separated the MKL runtime into its own conda package, in order to do things uniformly on all platforms.

In general you can create a new env:

conda create -n wheel_based python
conda activate wheel_based
pip install numpy-1.13.3-cp36-none-win_amd64.whl  # or whatever the file is named

In the other environment, install NumPy with MKL using the commands below:

conda create -n <env_name_2> python=<version>
conda activate <env_name_2>
pip install intel-numpy

You can then run your program separately in <env_name_1> and <env_name_2> to compare the performance of NumPy without MKL and with MKL, respectively.
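
Running the same script in both environments can be automated. The following is only a sketch, assuming that conda is on your PATH, that the benchmark is saved as benchmark.py, and that the environments are called numpy_openblas and numpy_mkl (all of these names are hypothetical placeholders). It relies on the conda run subcommand to execute the script with each environment's interpreter.

```python
import shutil
import subprocess
import sys


def conda_run_command(env_name, script="benchmark.py", conda="conda"):
    """Build the argv that runs `script` with the Python of a conda environment."""
    return [conda, "run", "-n", env_name, "python", script]


def run_in_env(env_name, script="benchmark.py"):
    """Run the script in the given environment and return its captured stdout."""
    # shutil.which also resolves conda.bat / conda.exe on Windows.
    conda = shutil.which("conda")
    if conda is None:
        raise RuntimeError("conda was not found on PATH")
    result = subprocess.run(
        conda_run_command(env_name, script, conda),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    env_names = sys.argv[1:]
    if not env_names:
        # Dry run with placeholder names: only show what would be executed.
        for name in ("numpy_openblas", "numpy_mkl"):
            print(" ".join(conda_run_command(name)))
    else:
        for name in env_names:
            print(f"--- {name} ---")
            print(run_in_env(name))
```

Called without arguments, the script only prints the commands it would run; pass your real environment names as arguments to execute the benchmark in each of them.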

5 Comments

This looks like copy paste from the numpy docs...
@djvg That's how intel answers in their forum, rare exceptions put aside.
@Olórin please note my comments applied to the answer before revision.
@djvg I wasn't qualifying your comment with mine, but the answer. ;)

Installing NumPy with MKL support

NumPy has to be compiled against MKL, so you can't simply switch BLAS/LAPACK libraries at runtime. If you are happy to use conda, you can install NumPy with MKL support from conda-forge or the Anaconda defaults channel. The other answer already covers that route in detail.
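
Because the backend is fixed at build time, it can help to confirm which library a given environment's NumPy was linked against before benchmarking. A minimal sketch: it captures the text printed by np.show_config() and looks for well-known library names. The helper names and the substring matching are my own assumptions, not a NumPy API, and the heuristic can misfire (for example, if a build path in the output happens to contain one of the markers).

```python
import contextlib
import io


def classify_blas(config_text):
    """Heuristically name the BLAS backend mentioned in a build-config dump."""
    text = config_text.lower()
    markers = (
        ("mkl", "MKL"),
        ("openblas", "OpenBLAS"),
        ("blis", "BLIS"),
        ("accelerate", "Accelerate"),
    )
    # The first marker found wins; this is a plain substring check.
    for marker, label in markers:
        if marker in text:
            return label
    return "unknown"


def numpy_config_text():
    """Capture what np.show_config() prints for the active environment."""
    import numpy as np  # imported lazily so classify_blas has no dependencies

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    backend = classify_blas(numpy_config_text())
    print("NumPy appears to be linked against:", backend)
```

If the result is "unknown", reading the full np.show_config() output by hand is the safer check.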

If you want to stick to pip, you can install them from, e.g., https://github.com/urob/numpy-mkl/ using the following command:

pip install numpy --extra-index-url https://urob.github.io/numpy-mkl

MKL vs BLAS/LAPACK

To compare performance between the different BLAS/LAPACK implementations you can create two virtual environments.

Here I slightly modify the benchmark script from the question. The modified script runs the benchmark for multiple problem sizes and replaces time with timeit.Timer (which disables garbage collection during timing, making the measurements more comparable).

import inspect
import sys
from timeit import Timer

import numpy as np

fmt = "{name:<17}: {time:.4f}s ({loops:6d} loops)"


class Benchmark:
    def run(self):
        for size, loops in zip(self.size, self.loops):
            best = self.run_single(size, loops)
            name = self.__class__.__name__ + f"({size}x{size})"
            print(fmt.format(name=name, time=best, loops=loops))

    def run_single(self, size, loops, repeat=5):
        self.setup(size)
        self.time_it()  # Warm-up call so one-time setup costs are not timed

        t = Timer(self.time_it)
        return min(t.repeat(repeat=repeat, number=loops))


class Matmul(Benchmark):
    size = (10, 100, 1000)
    loops = (100_000, 10_000, 100)

    def setup(self, k):
        rng = np.random.default_rng(1)
        self.a = rng.random((k, k))
        self.b = rng.random((k, k))

    def time_it(self):
        _ = np.dot(self.a, self.b)


class Eigen(Benchmark):
    size = (10, 100, 1000)
    loops = (10_000, 100, 5)

    def setup(self, k):
        rng = np.random.default_rng(1)
        self.a = rng.random((k, k))

    def time_it(self):
        _ = np.linalg.eig(self.a)


def get_benchmarks():
    cls = inspect.getmembers(sys.modules[__name__], inspect.isclass)
    return [c for _, c in cls if issubclass(c, Benchmark) and c is not Benchmark]


if __name__ == "__main__":
    for b in get_benchmarks():
        b().run()

Benchmark results

# MKL
python -m venv mkl && source mkl/bin/activate
pip install numpy --extra-index-url https://urob.github.io/numpy-mkl
python benchmark.py

Eigen(10x10)     : 0.3221s ( 10000 loops)
Eigen(100x100)   : 0.3209s (   100 loops)
Eigen(1000x1000) : 4.0278s (     5 loops)
Matmul(10x10)    : 0.1164s (100000 loops)
Matmul(100x100)  : 0.1922s ( 10000 loops)
Matmul(1000x1000): 0.9143s (   100 loops)

# OpenBLAS/LAPACK
python -m venv nomkl && source nomkl/bin/activate
pip install numpy
python benchmark.py

Eigen(10x10)     : 0.4455s ( 10000 loops)
Eigen(100x100)   : 0.8334s (   100 loops)
Eigen(1000x1000) : 3.9544s (     5 loops)
Matmul(10x10)    : 0.0999s (100000 loops)
Matmul(100x100)  : 0.3058s ( 10000 loops)
Matmul(1000x1000): 1.0825s (   100 loops)

Interestingly, the biggest performance differences can be seen for mid-sized problems, where MKL is significantly faster than OpenBLAS/LAPACK (more than twice as fast for eig). For small and large problems, the differences are smaller (and in fact OpenBLAS is even faster than MKL for matrix multiplication of very small matrices and when computing eigenvalues for very large matrices).
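
To make the comparison explicit, the totals above can be converted into per-call times and OpenBLAS/MKL ratios. This is only a small sketch: the numbers are copied from the two runs reported above, while the helper names are hypothetical.

```python
# Best total times in seconds, copied from the two benchmark runs above.
mkl = {
    "Eigen(10x10)": 0.3221,
    "Eigen(100x100)": 0.3209,
    "Eigen(1000x1000)": 4.0278,
    "Matmul(10x10)": 0.1164,
    "Matmul(100x100)": 0.1922,
    "Matmul(1000x1000)": 0.9143,
}
openblas = {
    "Eigen(10x10)": 0.4455,
    "Eigen(100x100)": 0.8334,
    "Eigen(1000x1000)": 3.9544,
    "Matmul(10x10)": 0.0999,
    "Matmul(100x100)": 0.3058,
    "Matmul(1000x1000)": 1.0825,
}
# Number of calls behind each total, as printed by the benchmark script.
loops = {
    "Eigen(10x10)": 10_000,
    "Eigen(100x100)": 100,
    "Eigen(1000x1000)": 5,
    "Matmul(10x10)": 100_000,
    "Matmul(100x100)": 10_000,
    "Matmul(1000x1000)": 100,
}


def per_call_ms(total_seconds, n_loops):
    """Convert a total time over n_loops calls into milliseconds per call."""
    return total_seconds / n_loops * 1000


def speedups(baseline, candidate):
    """Return baseline/candidate per benchmark; values above 1 favor candidate."""
    return {name: baseline[name] / candidate[name] for name in baseline}


if __name__ == "__main__":
    ratios = speedups(openblas, mkl)
    for name in mkl:
        mkl_ms = per_call_ms(mkl[name], loops[name])
        openblas_ms = per_call_ms(openblas[name], loops[name])
        print(
            f"{name:<17}: MKL {mkl_ms:10.4f} ms/call, "
            f"OpenBLAS {openblas_ms:10.4f} ms/call, "
            f"OpenBLAS/MKL = {ratios[name]:.2f}"
        )
```

A ratio above 1 means MKL was faster in that run, and a ratio below 1 means OpenBLAS was faster.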
