A library like OpenBLAS (Basic Linear Algebra Subprograms) is well known to those involved in machine learning systems and computer vision. OpenBLAS is written in C and is used everywhere where work with matrices is needed. It also has several alternative implementations such as Eigen and two closed implementations from Intel and Apple. All of them are written in C / C ++.
Currently, OpenBLAS is used in matrix manipulations in languages such as Julia and Python (NumPy). OpenBLAS is extremely well optimized and much of it is generally
written in assembly language.
However, is pure C so good for computing, as is commonly believed?
')
Meet
Mir GLAS ! Native implementation of the library of linear algebra on pure D without a single insert in an assembler!
To compile the Mir GLAS library we need an
LDC compiler (LLVM D Compiler). The DMD compiler is not officially supported. It does not support AVX and AVX2 instructions.
Test configuration will consist of:
CPU | 2.2 GHz Core i7 (I7-4770HQ) |
L3 cache | 6 MB |
Ram | 16 GB of 1600 MHz DDR3L SDRAM |
Model Identifier | MacBookPro11,2 |
OS | OS X 10.11.6 |
Mir GLAS | 0.18.0, single thread |
Openblas | 0.2.18, single thread |
Eigen | 3.3-rc1, single thread (sequential configurations) |
Intel MKL | 2017.0.098, single thread (sequential configurations) |
Apple accelerate | OS X 10.11.6, single thread (sequential configurations) |
»The code of the test itself can be obtained
here .
»Mir GLAS based on
mir.ndslice library
Mir GLAS can be easily used in any language that supports C ABI. This is done elementary:
For comparison, OpenBLAS will need to write the following code:
void cblas_sgemm ( const CBLAS_LAYOUT layout, const CBLAS_TRANSPOSE TransA, const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K, const float alpha, const float *A, const int lda, const float *B, const int ldb, const float beta, float *C, const int ldc)
During the test, the following values of variables were established:
openBLAS | OPENBLAS_NUM_THREADS = 1 |
Accelerate (Apple) | VECLIB_MAXIMUM_THREADS = 1 |
Intel MKL | MKL_NUM_THREADS = 1 |
Eigen compiled with the flags `EIGEN_TEST_AVX` and` EIGEN_TEST_FMA`:
mkdir build_dir cd build_dir cmake -DCMAKE_BUILD_TYPE=Release -DEIGEN_TEST_AVX=ON -DEIGEN_TEST_FMA=ON .. make OpenBLAS
Results (more is better):








Results:
- Mir GLAS significantly ahead of OpenBLAS and Apple Accelerate in all respects.
- Mir GLAS is almost twice as fast as Eigen and Apple Accelerate when working with matrices.
“Mir GLAS turns out to be comparable to proprietary Intel MKL, which is the fastest of its kind.
“Thanks to its design, Mir GLAS can easily be adapted for new architectures.
PS At the moment, the computer vision system
DCV is actively developing on the basis of GLAS.
PPS The original author is present in the comments under the nickname
9ilThe original article is located on the
author’s blog.