Squash commit of GitHub wiki
* Created Installation Guide (markdown)
* Updated quick installation (markdown)
* Updated Home (markdown)
* Updated Document (markdown)
* Updated Document (markdown)
* Updated Document (markdown)
* Created Installation Guide (markdown)
* Created Home (markdown)
* Init version
* Updated OpenBLAS Wiki (markdown)
* Updated OpenBLAS Wiki (markdown)
* Updated OpenBLAS Wiki (markdown)
* Updated Document (markdown)
* Updated Installation Guide (markdown)
* Updated Installation Guide (markdown)
* Created Download (markdown)
* Created Faq (markdown)
* Updated Faq (markdown)
* Updated FAQ
* Created How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated Document (markdown)
* Updated Faq (markdown)
* Updated Faq (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated Faq (markdown)
* Updated OpenBLAS Wiki (markdown)
* Updated Home (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Created How to generate import library for MingW (markdown)
* Updated Document (markdown)
* Updated Faq (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Build instrunctions for FreeBSD
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated Installation Guide (markdown)
* Updated Faq (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* minor edits
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated Faq (markdown)
* Installation instructions for Windows
* Updated Faq (markdown)
* G77 conventions no longer needed with GCC 4.7+
* Updated Home (markdown)
* Document why issue 168 occurred.
* Updated Home (markdown)
* Created Publications (markdown)
* Updated Home (markdown)
* Updated Document (markdown)
* Updated Faq (markdown)
* Updated Download (markdown)
* Updated Publications (markdown)
* Updated Faq (markdown)
* Updated Document (markdown)
* Revert 7580d38ffad37e6613e6304707aaaa681f3d78c2 ... b1bd4ff37d2106bbd5c4730a08dbb789cc44e7d4
* Created Mailing List (markdown)
* Updated Mailing List (markdown)
* Updated Mailing List (markdown)
* Updated Home (markdown)
* Updated Document (markdown)
* Updated Publications (markdown)
* Updated Download (markdown)
* Updated Faq (markdown)
* Updated Home (markdown)
* Updated Faq (markdown)
* Updated Home (markdown)
* Revert b69f1417cdf8820be046cc27a2b96b42a25bc3a3 ... 90a227c317c3572ced943461ac3a252c40790f44 on Home
* Updated Home (markdown)
* Updated Publications (markdown)
* Updated Faq (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* We already ensure the stack alignment in Makefile.system for Win32.
* Updated Faq (markdown)
* Updated Faq (markdown)
* Updated Publications (markdown)
* Created Donation (markdown)
* Updated Home (markdown)
* Updated Document (markdown)
* Updated Faq (markdown)
* Updated Publications (markdown)
* Updated Download (markdown)
* Updated Mailing List (markdown)
* Updated Donation (markdown)
* Updated Download (markdown)
* Updated Donation (markdown)
* Updated Donation (markdown)
* Updated Donation (markdown)
* Updated Donation (markdown)
* Updated Home (markdown)
* Updated Faq (markdown)
* Updated Download (markdown)
* Updated Home (markdown)
* Updated Home (markdown)
* Add new entry for static linking and pthread.
* Fix named anchors (see http://stackoverflow.com/questions/5319754/cross-reference-named-anchor-in-markdown/7335259#7335259)
* Created Related packages that use OpenBLAS (markdown)
* Updated Related packages that use OpenBLAS (markdown)
* Updated Related packages that use OpenBLAS (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated Document (markdown)
* Created To-do List (markdown)
* Updated To do List (markdown)
* Updated Fixed optimized kernels To do List (markdown)
* Fix English idiom
* Remove trailing whitespace
* Updated Fixed optimized kernels To do List (markdown)
* Updated Fixed optimized kernels To do List (markdown)
* Updated Fixed optimized kernels To do List (markdown)
* Updated Fixed optimized kernels To do List (markdown)
* Updated Faq (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated Related packages that use OpenBLAS (markdown)
* Updated Related packages that use OpenBLAS (markdown)
* Created Machine List (markdown)
* Updated Document (markdown)
* Updated Installation Guide (markdown)
* Created User Manual (markdown)
* Updated User Manual (markdown)
* Updated Document (markdown)
* Updated User Manual (markdown)
* Updated User Manual (markdown)
* Updated User Manual (markdown)
* Updated User Manual (markdown)
* Updated Related packages that use OpenBLAS (markdown)
* Updated Faq (markdown)
* Updated Related packages that use OpenBLAS (markdown)
* Updated Machine List (markdown)
* Updated Related packages that use OpenBLAS (markdown)
* Updated Related packages that use OpenBLAS (markdown)
* Add a note about building in QEMU
* Updated Home (markdown)
* Updated Faq (markdown)
* update for allocating too many meory error.
* Updated Faq (markdown)
* Updated Faq (markdown)
* Updated Installation Guide (markdown)
* Updated Faq (markdown)
* Init function doc
* Updated Document (markdown)
* Updated User Manual (markdown)
* Updated User Manual (markdown)
* Created How to build OpenBLAS for Android (markdown)
* Updated How to build OpenBLAS for Android (markdown)
* Updated Home (markdown)
* Part of the description is really no clear, I add some more information, so it would be easier for VS user to fix the problems facing them.
* Created Developer manual (markdown)
* Updated Document (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* a typo, download ** frome -> download from
* Updated Faq (markdown)
* English (minor edit)
* Updated Developer manual (markdown)
* Updated Developer manual (markdown)
* Updated Developer manual (markdown)
* Updated Machine List (markdown)
* Updated Developer manual (markdown)
* Updated Developer manual (markdown)
* Updated How to build OpenBLAS for Android (markdown)
* Updated How to build OpenBLAS for Android (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* issue#842
* Updated How to build OpenBLAS for Android (markdown)
* Updated How to build OpenBLAS for Android (markdown)
* Updated How to build OpenBLAS for Android (markdown)
* Updated How to build OpenBLAS for Android (markdown)
* Added FC for building with Fortran
* Change link for the Intel MKL documentation
* Updated User Manual (markdown)
* Updated User Manual (markdown)
* Added MIPS build instructions from OpenMathLib#949
* use TARGET_CFLAGS and TARGET_LDFLAGS instead of CFLAGS and LDFLAGS for linking OpenBLAS on ARMv7
* Add Windows updates (msys2,mingw/w64 merger), Android/MIPS pointers, qemu hint
* Building libs & netlib targets to prevent errors in tests
* Recipes not targets (for make)
* Making only libs, not netlib (which also contains link/run tests...)
* Copied from instructions by Ivan Ushakov, originally posted in OpenMathLib#569
* Updated How to build OpenBLAS for iPhone iOS (markdown)
* Updated Faq (markdown)
* Created How to build OpenBLAS for iPhone iOS (markdown)
* error code (0xc000007b) was missing a character
* Updated How to build OpenBLAS for iPhone iOS (ARMv8) (markdown)
* Updated How to build OpenBLAS for iPhone iOS (ARMv8) (markdown)
* Revert 7e9dd0ebf079e002e3aa831fa671fde3e8cfad81...8d105c7be8cd447482f61e0295c0c146f5314eb5 on How to build OpenBLAS for iPhone iOS
* Add guide on how to reversibly supplant Ubuntu LTS libblas.so.3
* typo
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated User Manual (markdown)
* Updated Faq (markdown)
* Updated Download (markdown)
* Add perl to pacman package list
* Fixed formatting on general questions
* Copied from issue OpenMathLib#1136
* Added instructions for building for Windows UWP.
* To clear confusions vs super-fat-binaries that dont exist.
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Update for 0.2.20 (full builds, ARMv7 softfp support, newer NDKs using CLANG)
* Updated How to build OpenBLAS for Android (markdown)
* Fix some formatting issues
* Updated How to build OpenBLAS for Android (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to build OpenBLAS for Android (markdown)
* Created Precompiled installation packages (markdown)
* Updated Precompiled installation packages (markdown)
* Example - debian?
* Mention (and link to) distribution-specific packages
* Updated Installation Guide (markdown)
* OpenSuSE (13.2, SLE included)
* Updated Precompiled installation packages (markdown)
* Updated Precompiled installation packages (markdown)
* Make it look consistent.
* Fedora+EPEL // maybe rpmbuild is too heavy
* Updated Precompiled installation packages (markdown)
* Updated Precompiled installation packages (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated Precompiled installation packages (markdown)
* fix toolchain argument in armv8 clang build as per OpenMathLib#1337
* add note about stdio.h not found error
* Add flang instructions
* Use the SVG Travis badge
* homebrew option for OSX
* Promote native MSVC builds with LLVM
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Direct people to the appropriate instructions
* Add link to the Goto paper
* Add CMAKE_BUILD_TYPE
* Add note about having to specify AR on a Mac, from issue 1435
* Mention requirement to build a standalone toolchain in the clang section as well
* added 'perl' to conda install command
* homebrew/science was deprecated. This tap is now empty as all its formulae were migrated.
* Added hint for "expected identifier" error message to mingw section following OpenMathLib#1503
* Revert 9161c3b54281131e892dec739d888f35e6c59cf3...03f879be0c9e6a55705bc7efd5ee193299e04029 on How to use OpenBLAS in Microsoft Visual Studio
* Revert to recommending mingw-w64 from sf.net and add note about issue 1503
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Update MSVC installation procedure with info from OpenMathLib#1521
* Add downgrade option for msys2 mingw compiler issue as suggested by econwang in OpenMathLib#1503
* Add note about static linking bug with NDK 16 and API>22
* Updated Precompiled installation packages (markdown)
* Updated Precompiled installation packages (markdown)
* Updated Faq (markdown)
* OBS is renamed and deep link format changed. Apparently recent SLE includes rpm by default too.
* Add links to Conda-Forge and to staticfloat's builds for Julia
* Mention _64 suffix appended to Julia builds with INTERFACE64 (issue 1617)
* Fix unwanted markdown italicization
* Add instruction to change to the generic sgemmkernel implementation from issue 1531
* Added hint about stack size requirements for running lapack-test from PR 1645; fixed markup of section headings
* Add link to RvdG's publications page as a non-paywalled source of the "Goto paper"
* Add section about non-suitability of the IBM XL compiler on POWER8
* Mention cmake version requirement in view or recent issues with link failures in utest etc.
* Replace outdated entry for Sandybridge support with more general section on AVX512, Ryzen and GPU
* Mention Apple Accelerate here as iOS build issue tickets usually die as soon as someone points out this option to the questioner.
* Add section about unexpectedly using an older pre-installed version of the shared library (issue 1822)
* fix markup of new entry
* Mention perl and C compiler as prerequisites on the build host
* Save WIP page
* Updated Notes on parallelism and OpenBLAS (markdown)
* Updated Notes on parallelism and OpenBLAS (markdown)
* Updated Notes on parallelism and OpenBLAS (markdown)
* Updated [WIP] Notes on parallelism and OpenBLAS (markdown)
* Updated [WIP] Notes on parallelism and OpenBLAS (markdown)
* Updated [WIP] Notes on parallelism and OpenBLAS (markdown)
* Destroyed [WIP] Notes on parallelism and OpenBLAS (markdown)
* Updated Faq (markdown)
* Add small note on AVX512 for CentOS/RHEL section.
* document the extension functions
* formatting
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated Download (markdown)
* Add brief general usage information from issue 1925
* Add link to Pete Warden blog article on GEMM rather than just deep-linking to a diagram from it
* Document some of the less useful parameters from param.h
* Updated Installation Guide (markdown)
* Done with OpenMathLib#2089
* Add note about changed library names for update-alternatives on Debian/Ubuntu
* Updated Home (markdown)
* Add note about using OpenBLAS with CUDA_HPL 2.3 from issue OpenMathLib#909
* Fix typos in previous commit
* Add pdb instructions fir cross-builds
* Add note about generic QEMU CPUID clashing with existing P2(MMX)
* typo
* typo
* C code syntax highlight
* Updated multithreading section to introduce option USE_LOCKING (issue 2164)
* Updated How to build OpenBLAS for iPhone iOS (ARMv8) (markdown)
* Updated How to build OpenBLAS for iPhone iOS (ARMv8) (markdown)
* Clarify Miniconda/cmake install instructions and redact outdated note about msys2
* Document cmake install step
* Updated How to build OpenBLAS for Android (markdown)
* Add solution for programs that look for libblas.so/liblapack.so
* Add entry for powersaving modes on ARM boards (from issue 2540)
* Add suggestion for speed problems on big.little systems from issue 2589
* Convert the ARMV8 big.little tidbit to a separate topic and update it with more details from the issue ticket
* Add entry about problems caused by using the raw cblas.h (issue 2593)
* complete quote symbol around CPATH environment variable
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Add note about running conda activate when working in a VS window (from issue 2637)
* Add note about (not) compiling with -fbounds-check (ticket 2657)
* Add entry about compile-time NUM_THREADS setting (issue 2678)
* Added some sketchy description of adding cpuids for autodetection, adding targets and architectures
* Markup and typo fixes
* Add openblas_set_affinity from PR 2547
* Created _Footer (markdown)
* Destroyed _Footer (markdown)
* Add LAPACK-like SHGEMM to document the "official" status of the SH prefix
* fix formatting of latest addition
* Move outdated instructions for gcc-based NDK versions to the bottom, add hint about x86 builds
* Add help for cpuid recognition failure
* Update source tree layout & mention extraneous cpu paramerts
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Explain why pure VS builds are slower, and highlight that they do not support DYNAMIC_ARCH
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Mention fortran requirement and incompatibility of ifort with msvc
* preliminary page for understanding the build system, needs a lot more work and input from more knowledgeable people than me
* Updated Build system overview (markdown)
* Updated WIP   Build system overview (community made) (markdown)
* add information for HOSTCC, HOST_CFLAGS
* Added alternative script which was tested on OSX with latest NDK
* added link to targets list
* Updated WIP   Build system overview (community made) (markdown)
* Updated WIP   Build system overview (community made) (markdown)
* Updated WIP   Build system overview (community made) (markdown)
* Updated WIP   Build system overview (community made) (markdown)
* Updated WIP   Build system overview (community made) (markdown)
* Updated WIP   Build system overview (community made) (markdown)
* added script for x86_64 architecture
* Updated WIP   Build system overview (community made) (markdown)
* Updated WIP   Build system overview (community made) (markdown)
* updated link to FLAME publications list
* Created How to use OpenBLAS on Cortex-M (markdown)
* Updated How to use OpenBLAS on Cortex M (markdown)
* Updated Precompiled installation packages (markdown)
* Updated How to use OpenBLAS on Cortex M (markdown)
* Updated How to use OpenBLAS on Cortex M (markdown)
* Updated How to use OpenBLAS on Cortex M (markdown)
* Update source layout graph and start a short section on benchmarking to collect various pointers from the issue tracker
* Add workaround for building with CMAKE on OSX
* Use actual small headings to fix... weird bullet indent shit
* Oops
* Updated Faq (markdown)
* Updated Faq (markdown)
* Updated How to generate import library for MingW (markdown)
* Updated How to generate import library for MinGW (markdown)
* Updated How to generate import library for MinGW (markdown)
* Updated How to generate import library for MinGW (markdown)
* Updated How to generate import library for MinGW (markdown)
* Updated How to generate import library for MinGW (markdown)
* Updated How to generate import library for MinGW (markdown)
* Updated How to generate import library for MinGW (markdown)
* Updated How to generate import library for MinGW (markdown)
* explicitly set CMAKE_MT to replace the new cmake default llvm-mt (failing)
* Add -Wl,-rpath,/your_path/OpenBLAS/lib option to gcc linker line in "Link shared library" section + explanation for why it is needed/can be omitted. Also make note that -lgfortran not needed if only making LAPACKE calls.
* Add note explaining that build flags passed to make should also be passed to make install
* give example of install error
* Describe how to build openblas library for win/arm64 targets
* Add Xen to the existing entry for QEMU/KVM based on issue 3445
* Updated Download (markdown)
* Updated Installation Guide (markdown)
* Updated Installation Guide (markdown)
* Revert b8da0e8523b898a2206d1e2fe99dbfb4ebb0ffa8...bc55aade759d2f925689b000828da249e1fc6a1a on Installation Guide
* Revert b0c9a2ee060b8dd0b46b4c58375ef2a743c0363a...cecf8cf67963bd77a0bb97086e3a457a4cee11ff on Download
* Revert bc55aade759d2f925689b000828da249e1fc6a1a...134894a0f09a0e92eef1b9a5c9e63f459d2db55e on Installation Guide
* Add NDK23B example
* Makes iOS build more robust
* Double -isysroot
* Bump up required devtoolset version for AVX-512 intrinsics.
* Updated Installation Guide (markdown)
* Updated How to build OpenBLAS for Windows on ARM64 (markdown)
* Revert b8da0e8523b898a2206d1e2fe99dbfb4ebb0ffa8...75bba70832f8765faee693931c4a9e3eb6c84d98 on Installation Guide
* Revert 75bba70832f8765faee693931c4a9e3eb6c84d98...d171e711a5cd8026b2eb507b249b5e51fa28b2a2 on Installation Guide
* restore Windows link after malicious edit
* Revert 1bcb03dcef85c675aace7f0a755d5aa36ec46eca...f732906434146b1a1ee82abe944a6d51d8f43b81 on Installation Guide
* restore Windows link after malicious edit
* Updated Installation Guide (markdown)
* Bump up AVX-512 devtoolset because of identified packaging issues
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* n-dash html entity instead of -
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Add the bfloat16 functions
* mention AXPBY
* Update building for Apple M1
* Updated How to build OpenBLAS for Windows on ARM64 (markdown)
* Created How to build OpenBLAS for macOS M1 / arm64 (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Add NO_AVX2 build hint for OSX Docker Desktop/xhyve (issues 2194 and 2244)
* Mention the ELF offset/address bug from binutils 2.38 ld
* moved issue 665 (sparse matrix/vector support) to a faq entry
* Update and simplify based on CI experience and 3741
* Updated Download (markdown)
* Updated How to build OpenBLAS for Windows on ARM64 (markdown)
* Revert 0dcee87d486028fbd88c603853cdcae810e025c6...bf3d15e74d42b0b01618b4beb7b9d658fb905118 on Download
* Revert a02f9e470f8e26eda1b8d8601ad2486557721ccf...c862aeb3492c29b487858d43c93676855b60a1f2 on How to build OpenBLAS for Windows on ARM64
* Updated How to use OpenBLAS in Microsoft Visual Studio (markdown)
* Revert 9db97d11d88c801e8c5e9b8d6cc85fb44e5bca61...d2eb48810f3ecc1680900581473005f79c394ca4 on How to use OpenBLAS in Microsoft Visual Studio
* start with the smallest configs, Appveyor and Cirrus
* Updated CI jobs overview (markdown)
* Add Azure CI
* Add github workflows
* Add the crossbuild parts of the dynamic_arch workflow
* remove trailing separator
* Add FreeBSD/Cirrus
* Add ILP64 jobs on Cirrus
* Add C910V and the OSUOSL Jenkins jobs (currently configured for my fork)
* Updated Installation Guide (markdown)
* Expand section on precompiled windows binaries to mention INTERFACE64=0 option
* Remove reference to buildbot (domain reregistered to someone else, issue 4148
honno committed Aug 4, 2023
1 parent c2f4bdb commit 09f99ae
Showing 25 changed files with 1,772 additions and 0 deletions.
3 changes: 3 additions & 0 deletions .md
@@ -0,0 +1,3 @@
# Installation Guide

test
54 changes: 54 additions & 0 deletions CI-jobs-overview.md
@@ -0,0 +1,54 @@
| Arch|Target CPU|OS|Build system|XComp to|C Compiler|Fortran Compiler|threading|DYN_ARCH|INT64|Libraries| CI Provider| CPU count|
| ------------|---|---|-----------|-------------|----------|----------------|------|------------|----------|-----------|----------|-------|
| x86_64 |Intel 32bit|Windows|CMAKE/VS2015| -|mingw6.3| - | pthreads | - | - | static | Appveyor| |
| x86_64 |Intel |Windows|CMAKE/VS2015| -|mingw5.3| - | pthreads | - | - | static | Appveyor| |
| x86_64 |Intel |Centos5|gmake | -|gcc 4.8 |gfortran| pthreads | + | - | both | Azure | |
| x86_64 |SDE (SkylakeX)|Ubuntu| CMAKE| - | gcc | gfortran | pthreads | - | - | both | Azure | |
| x86_64 |Haswell/ SkylakeX|Windows|CMAKE/VS2017| - | VS2017| - | | - | - | static | Azure | |
| x86_64 | " | Windows|mingw32-make| - |gcc | gfortran | | list | - | both | Azure | |
| x86_64 | " |Windows|CMAKE/Ninja| - |LLVM | - | | - | - | static | Azure | |
| x86_64 | " |Windows|CMAKE/Ninja| - |LLVM | flang | | - | - | static | Azure | |
| x86_64 | " |Windows|CMAKE/Ninja| - |VS2022| flang* | | - | - | static | Azure | |
| x86_64 | " |macOS11|gmake | - | gcc-10|gfortran| OpenMP | + | - | both | Azure | |
| x86_64 | " |macOS11|gmake | - | gcc-10|gfortran| none | - | - | both | Azure | |
| x86_64 | " |macOS12|gmake | - | gcc-12|gfortran|pthreads| - | - | both | Azure | |
| x86_64 | " |macOS11|gmake | - | llvm | - | OpenMP | + | - | both | Azure | |
| x86_64 | " |macOS11|CMAKE | - | llvm | - | OpenMP | no_avx512 | - | static | Azure | |
| x86_64 | " |macOS11|CMAKE | - | gcc-10| gfortran| pthreads | list | - | shared | Azure | |
| x86_64 | " |macOS11|gmake | - | llvm | ifort | pthreads | - | - | both | Azure | |
| x86_64 | " |macOS11|gmake |arm| AndroidNDK-llvm | - | | - | - | both | Azure | |
| x86_64 | " |macOS11|gmake |arm64| XCode 12.4 | - | | + | - | both | Azure | |
| x86_64 | " |macOS11|gmake |arm | XCode 12.4 | - | | + | - | both | Azure | |
| x86_64 | " |Alpine Linux(musl)|gmake| - | gcc | gfortran | pthreads | + | - | both | Azure | |
| arm64 |Apple M1 |OSX |CMAKE/XCode| - | LLVM | - | OpenMP | - | - | static | Cirrus | |
| arm64 |Apple M1 |OSX |CMAKE/Xcode| - | LLVM | - | OpenMP | - | + | static | Cirrus | |
| arm64 |Apple M1 |OSX |CMAKE/XCode|x86_64| LLVM| - | - | + | - | static | Cirrus | |
| arm64 |Neoverse N1|Linux |gmake | - |gcc10.2| -| pthreads| - | - | both | Cirrus | |
| arm64 |Neoverse N1|Linux |gmake | - |gcc10.2| -| pthreads| - | + | both | Cirrus | |
| arm64 |Neoverse N1|Linux |gmake |- |gcc10.2| -| OpenMP | - | - | both |Cirrus | 8 |
| x86_64 | Ryzen| FreeBSD |gmake | - | gcc12.2|gfortran| pthreads| - | - | both | Cirrus | |
| x86_64 | Ryzen| FreeBSD |gmake | - | gcc12.2|gfortran| pthreads| - | + | both | Cirrus | |
| x86_64 |GENERIC |QEMU |gmake| mips64 | gcc | gfortran | pthreads | - | - | static | Github | |
| x86_64 |SICORTEX |QEMU |gmake| mips64 | gcc | gfortran | pthreads | - | - | static | Github | |
| x86_64 |I6400 |QEMU |gmake| mips64 | gcc | gfortran | pthreads | - | - | static | Github | |
| x86_64 |P6600 |QEMU |gmake| mips64 | gcc | gfortran | pthreads | - | - | static | Github | |
| x86_64 |I6500 |QEMU |gmake| mips64 | gcc | gfortran | pthreads | - | - | static | Github | |
| x86_64 |Intel |Ubuntu |CMAKE| - | gcc-11.3 | gfortran | pthreads | + | - | static | Github | |
| x86_64 |Intel |Ubuntu |gmake| - | gcc-11.3 | gfortran | pthreads | + | - | both | Github | |
| x86_64 |Intel |Ubuntu |CMAKE| - | gcc-11.3 | flang-classic | pthreads | + | - | static | Github | |
| x86_64 |Intel |Ubuntu |gmake| - | gcc-11.3 | flang-classic | pthreads | + | - | both | Github | |
| x86_64 |Intel |macOS12 | CMAKE| - | AppleClang 14 | gfortran | pthreads | + | - | static | Github | |
| x86_64 |Intel |macOS12 | gmake| - | AppleClang 14 | gfortran | pthreads | + | - | both | Github | |
| x86_64 |Intel |Windows2022 | CMAKE/Ninja| - | mingw gcc 13 | gfortran | | + | - | static | Github | |
| x86_64 |Intel |Windows2022 | CMAKE/Ninja| - | mingw gcc 13 | gfortran | | + | + | static | Github | |
| x86_64 |Intel 32bit|Windows2022 | CMAKE/Ninja| - | mingw gcc 13 | gfortran | | + | - | static | Github | |
| x86_64 |Intel |Windows2022 | CMAKE/Ninja| - | LLVM 16 | - | | + | - | static | Github | |
| x86_64 |Intel | Windows2022 |CMAKE/Ninja| - | LLVM 16 | - | | + | + | static | Github | |
| x86_64 |Intel | Windows2022 |CMAKE/Ninja| - | gcc 13| - | | + | - | static | Github | |
| x86_64 |Intel| Ubuntu |gmake |mips64|gcc|gfortran|pthreads|+|-|both|Github| |
| x86_64 |generic|Ubuntu |gmake |riscv64|gcc|gfortran|pthreads|-|-|both|Github| |
| x86_64 |Intel|Ubuntu |gmake |mips32|gcc|gfortran|pthreads|-|-|both|Github | |
| x86_64 |Intel|Ubuntu |gmake |ia64|gcc|gfortran|pthreads|-|-|both|Github| |
| x86_64 |C910V|QEMU |gmake |riscv64|gcc|gfortran|pthreads|-|-|both|Github| |
|power |pwr9| Ubuntu |gmake | - |gcc|gfortran|OpenMP|-|-|both|OSUOSL| |
|zarch |z14 | Ubuntu |gmake | - |gcc|gfortran|OpenMP|-|-|both|OSUOSL| |
141 changes: 141 additions & 0 deletions Developer-manual.md
@@ -0,0 +1,141 @@
## Source code layout

```
OpenBLAS/
├── benchmark Benchmark codes for BLAS
├── cmake CMakefiles
├── ctest Test codes for CBLAS interfaces
├── driver Implemented in C
│   ├── level2
│   ├── level3
│   ├── mapper
│   └── others Memory management, threading, etc
├── exports Generate shared library
├── interface Implement BLAS and CBLAS interfaces (calling driver or kernel)
│   ├── lapack
│   └── netlib
├── kernel Optimized assembly kernels for CPU architectures
│   ├── alpha Original GotoBLAS kernels for DEC Alpha
│   ├── arm ARMV5,V6,V7 kernels (including generic C codes used by other architectures)
│   ├── arm64 ARMV8
│   ├── generic General kernel codes written in plain C, parts used by many architectures.
│   ├── ia64 Original GotoBLAS kernels for Intel Itanium
│   ├── mips
│   ├── mips64
│   ├── power
│   ├── riscv64
│   ├── simd Common code for Universal Intrinsics, used by some x86_64 and arm64 kernels
│   ├── sparc
│   ├── x86
│   ├── x86_64
│   └── zarch
├── lapack Optimized LAPACK codes (replacing those in regular LAPACK)
│   ├── getf2
│   ├── getrf
│   ├── getrs
│   ├── laswp
│   ├── lauu2
│   ├── lauum
│   ├── potf2
│   ├── potrf
│   ├── trti2
│   ├── trtri
│   └── trtrs
├── lapack-netlib LAPACK codes from netlib reference implementation
├── reference BLAS Fortran reference implementation (unused)
├── relapack Elmar Peise's recursive LAPACK (implemented on top of regular LAPACK)
├── test Test codes for BLAS
└── utest Regression test
```

The call tree for `dgemm` is as follows:

```
interface/gemm.c
driver/level3/level3.c
gemm assembly kernels at kernel/
```

To find the kernel currently used for a particular supported cpu, please check the corresponding `kernel/$(ARCH)/KERNEL.$(CPU)` file.

Here is an example for `kernel/x86_64/KERNEL.HASWELL`

```
...
DTRMMKERNEL = dtrmm_kernel_4x8_haswell.c
DGEMMKERNEL = dgemm_kernel_4x8_haswell.S
...
```
According to the above `KERNEL.HASWELL`, the OpenBLAS Haswell dgemm kernel file is `dgemm_kernel_4x8_haswell.S`.
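
The same lookup can be scripted. The following is only a sketch under stated assumptions: `kernel_line_lookup` and `kernel_file_lookup` are hypothetical helper names, not part of OpenBLAS, and they only understand the plain `NAME = value` form shown above. KERNEL files are Makefile fragments and may contain other constructs that this sketch ignores.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of OpenBLAS): parse one line of the form
 * "NAME = value". If NAME equals `key`, copy the value into `out` and
 * return 1; otherwise return 0 and leave `out` untouched. */
int kernel_line_lookup(const char *line, const char *key,
                       char *out, size_t outsz)
{
    assert(line && key && out);
    size_t klen = strlen(key);
    const char *p = line;

    while (isspace((unsigned char)*p)) p++;      /* skip leading blanks */
    if (strncmp(p, key, klen) != 0) return 0;    /* a different variable */
    p += klen;
    while (isspace((unsigned char)*p)) p++;
    if (*p != '=') return 0;                     /* key was only a prefix */
    p++;
    while (isspace((unsigned char)*p)) p++;

    size_t n = strcspn(p, " \t\r\n#");           /* value ends at blank or comment */
    if (n == 0 || n >= outsz) return 0;
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}

/* Scan a KERNEL file and report the last plain assignment to `key`. */
int kernel_file_lookup(const char *path, const char *key,
                       char *out, size_t outsz)
{
    char line[512];
    int found = 0;
    FILE *f = fopen(path, "r");
    if (!f) return 0;
    while (fgets(line, sizeof line, f))
        if (kernel_line_lookup(line, key, out, outsz)) found = 1;
    fclose(f);
    return found;
}
```

Calling `kernel_file_lookup("kernel/x86_64/KERNEL.HASWELL", "DGEMMKERNEL", buf, sizeof buf)` from the top of the source tree would then fill `buf` with the kernel file name, provided the assignment uses the simple form assumed here.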

## Optimizing GEMM for a given hardware

Read the Goto paper to understand the algorithm.

Goto, Kazushige; van de Geijn, Robert A. (2008). ["Anatomy of High-Performance Matrix Multiplication"](http://delivery.acm.org/10.1145/1360000/1356053/a12-goto.pdf?ip=155.68.162.54&id=1356053&acc=ACTIVE%20SERVICE&key=A79D83B43E50B5B8%2EF070BBE7E45C3F17%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&__acm__=1517932837_edfe766f1e295d9a7830812371e1d173). ACM Transactions on Mathematical Software 34 (3): Article 12
(The above link is available only to ACM members, but this and many related papers are also available on the pages
of van de Geijn's FLAME project, http://www.cs.utexas.edu/~flame/web/FLAMEPublications.html )

`driver/level3/level3.c` is the implementation of Goto's algorithm. For a simple starting point, look at `kernel/generic/gemmkernel_2x2.c`, which is a naive gemm kernel in C with `2x2` register blocking.

Then,
* Write optimized assembly kernels. Consider the instruction pipeline, the available registers, and memory/cache access patterns.
* Tune the cache block sizes `Mc`, `Kc`, and `Nc`.
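
To make the roles of the block sizes and the register-blocked kernel concrete, here is a minimal sketch in plain C. It is not the code in `driver/level3/level3.c` or `kernel/generic/gemmkernel_2x2.c`: the names `blocked_dgemm` and `micro_kernel_2x2` are made up for illustration, the block sizes are arbitrary small values, the matrices are assumed to be column-major with even `m` and `n`, and the packing of the A and B blocks into contiguous buffers that Goto's algorithm relies on is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative block sizes only; real values are tuned per CPU.
 * MC and NC must be even so that every block splits into 2x2 tiles. */
enum { MC = 4, KC = 4, NC = 4 };

/* 2x2 register-blocked micro-kernel: C[2x2] += A[2 x kc] * B[kc x 2].
 * All matrices are column-major with leading dimensions lda/ldb/ldc. */
static void micro_kernel_2x2(size_t kc, const double *a, size_t lda,
                             const double *b, size_t ldb,
                             double *c, size_t ldc)
{
    /* The four accumulators are meant to stay in registers. */
    double c00 = 0.0, c10 = 0.0, c01 = 0.0, c11 = 0.0;
    for (size_t p = 0; p < kc; p++) {
        double a0 = a[0 + p * lda], a1 = a[1 + p * lda];
        double b0 = b[p + 0 * ldb], b1 = b[p + 1 * ldb];
        c00 += a0 * b0; c10 += a1 * b0;
        c01 += a0 * b1; c11 += a1 * b1;
    }
    c[0 + 0 * ldc] += c00; c[1 + 0 * ldc] += c10;
    c[0 + 1 * ldc] += c01; c[1 + 1 * ldc] += c11;
}

static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/* C (m x n) += A (m x k) * B (k x n), column-major, m and n even. */
void blocked_dgemm(size_t m, size_t n, size_t k,
                   const double *A, const double *B, double *C)
{
    assert(m % 2 == 0 && n % 2 == 0);
    for (size_t jc = 0; jc < n; jc += NC) {          /* Nc-wide panel of B and C */
        size_t nc = min_sz(NC, n - jc);
        for (size_t pc = 0; pc < k; pc += KC) {      /* Kc-deep slice of A and B */
            size_t kc = min_sz(KC, k - pc);
            for (size_t ic = 0; ic < m; ic += MC) {  /* Mc-tall block of A */
                size_t mc = min_sz(MC, m - ic);
                for (size_t j = 0; j < nc; j += 2)
                    for (size_t i = 0; i < mc; i += 2)
                        micro_kernel_2x2(kc,
                            &A[(ic + i) + pc * m], m,
                            &B[pc + (jc + j) * k], k,
                            &C[(ic + i) + (jc + j) * m], m);
            }
        }
    }
}
```

In a tuned implementation the three outer loops decide which blocks live in which cache level, and the innermost tile is replaced by an assembly kernel with a larger register block.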

Note that not all of the CPU-specific parameters in `param.h` are actively used in algorithms. `DNUMOPT` only appears as a scale factor in the profiling output of the level3 `syrk` interface code, while its counterpart `SNUMOPT` (aliased as `NUMOPT` in `common.h`) is not used anywhere at all.
`SYMV_P` is only used in the generic kernels for the `symv` and `chemv`/`zhemv` functions. At least some of those are usually overridden by CPU-specific implementations, so if you start by cloning the existing implementation for a related CPU, check its `KERNEL` file to see whether tuning `SYMV_P` would have any effect at all.
`GEMV_UNROLL` is only used by some older x86_64 kernels, so not all sections in `param.h` define it.
Similarly, not all of the CPU parameters like L2 or L3 cache sizes are necessarily used in current kernels for a given model; by all indications the CPU identification code was originally imported from some other project.

## Run OpenBLAS Test

We use the netlib BLAS tests, the CBLAS tests, and the LAPACK tests. In addition, we use [BLAS-Tester](https://github.com/xianyi/BLAS-Tester), a modified test tool from ATLAS.

* Run `test` and `ctest` in the OpenBLAS source tree, e.g. `make test` or `make ctest`.
* Run the regression test `utest` in the OpenBLAS source tree.
* Run the LAPACK tests, e.g. `make lapack-test`.
* Clone [BLAS-Tester](https://github.com/xianyi/BLAS-Tester), which can compare OpenBLAS results with the netlib reference BLAS.
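
The idea of comparing an optimized result against a reference result can be sketched in a few lines of C. This is not BLAS-Tester's code: `max_rel_diff` and `results_match` are hypothetical names, and both the error measure and the tolerance passed in are assumptions that a real tester would choose per routine and precision.

```c
#include <assert.h>
#include <stddef.h>

static double dabs(double v) { return v < 0.0 ? -v : v; }

/* Largest element-wise difference between `got` and `ref`, scaled by
 * max(1, |ref[i]|) so that large reference values are compared relatively
 * and values near zero are compared absolutely. */
double max_rel_diff(size_t n, const double *got, const double *ref)
{
    assert(got && ref);
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double scale = dabs(ref[i]) > 1.0 ? dabs(ref[i]) : 1.0;
        double d = dabs(got[i] - ref[i]) / scale;
        if (d > worst) worst = d;
    }
    return worst;
}

/* Returns 1 if every element agrees with the reference within `tol`. */
int results_match(size_t n, const double *got, const double *ref, double tol)
{
    return max_rel_diff(n, got, ref) <= tol;
}
```

An exact comparison is usually too strict here, because an optimized kernel typically sums in a different order than the reference and so rounds differently.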

The project makes use of several Continuous Integration (CI) services, conveniently interfaced with GitHub, to automatically check compilability on a number of platforms.
Lastly, the test suites included with "numerically heavy" projects like Julia, NumPy, Octave or QuantumEspresso can be used for regression testing.

## Benchmarking

Several simple C benchmarks for performance testing individual BLAS functions are available in the `benchmark` folder, and its `scripts` subdirectory contains corresponding versions for Python, Octave and R.
Other options include

* https://github.com/RoyiAvital/MatlabJuliaMatrixOperationsBenchmark (various matrix operations in Julia and Matlab)
* https://github.com/mmperf/mmperf/ (single-core matrix multiplication)

## Adding autodetection support for a new revision or variant of a supported cpu

A new cpu model may be a "refresh" (die shrink and/or different number of cores) within an existing
model family without significant changes to its instruction set. This is especially relevant for x86_64 (e.g. Intel Skylake, Kaby Lake etc. are still fundamentally Haswell,
and low-end Goldmont etc. are Nehalem). In this case, compilation with the appropriate older TARGET will already lead to a satisfactory build.

To achieve autodetection of the new model, its CPUID (or an equivalent identifier) needs to be added in the `cpuid_<architecture>.c`
relevant for its general architecture, with the returned name for the new type set appropriately. For x86 which has the most complex
cpuid file, there are two functions that need to be edited - get_cpuname() to return e.g. CPUTYPE_HASWELL and get_corename() for the (broader)
core family returning e.g. CORE_HASWELL. (This information ends up in the Makefile.conf and config.h files generated by `getarch`. Failure to
set either will typically lead to a missing definition of the GEMM_UNROLL parameters later in the build, as `getarch_2nd` will be unable to
find a matching parameter section in param.h.)

For architectures where "DYNAMIC_ARCH" builds are supported, a similar but simpler code section for the corresponding runtime detection of the cpu exists in `driver/others/dynamic.c` (for x86) and `driver/others/dynamic_<arch>.c` for other architectures.
Note that for x86 the CPUID is compared after splitting it into its family, extended family, model and extended model parts, so the single decimal
number returned by Linux in /proc/cpuinfo for the model has to be converted back to hexadecimal before splitting it into its constituent
digits, e.g. 142 = 0x8E, which translates to extended model 8, model 14.

## Adding dedicated support for a new cpu model

Usually it will be possible to start from an existing model, clone its KERNEL configuration file to the new name to use for this TARGET, and eventually replace individual kernels with versions better suited to the peculiarities of the new cpu model. In addition, it is necessary to add
(or clone at first) the corresponding section of GEMM_UNROLL parameters in the toplevel param.h, and possibly to add definitions such as USE_TRMM
(governing whether TRMM functions use the respective GEMM kernel or a separate source file) to the Makefiles (and CMakeLists.txt) in the kernel
directory. The new cpu name needs to be added to TargetLists.txt, and the cpu autodetection code used by the `getarch` helper program (contained in
the `cpuid_<architecture>.c` file) needs to be amended to include the required processing of the CPUID (or equivalent) information (see preceding section).

## Adding support for an entirely new architecture

This endeavour is best started by cloning the entire support structure for 32-bit ARM, and within that the ARMV5 cpu in particular, as this is implemented through plain C kernels only. An example providing a convenient "shopping list" can be seen in pull request #1526.
27 changes: 27 additions & 0 deletions Document.md
@@ -0,0 +1,27 @@
<hr noshade="noshade">
<center>
<a href="Home"> [Home]</a>
<a href="Document"> [Document]</a>
<a href="faq"> [FAQ]</a>
<a href="publications"> [Publications]</a>
<a href="download"> [Download]</a>
<a href="Mailing-List">[Mailing List]</a>
<a href="Donation">[Donation]</a>
</center>
<hr noshade="noshade">

[Installation Guide](Installation-Guide)

[User Manual](User-Manual)

[Developer Manual](Developer-manual)

[Use OpenBLAS in MS Visual Studio](How-to-use-OpenBLAS-in-Microsoft-Visual-Studio)

[Generate import library for MingW](How-to-generate-import-library-for-MingW)

[OpenBLAS Extensions](OpenBLAS-Extensions)

[Related packages that use OpenBLAS](Related-packages-that-use-OpenBLAS)

[Machine List](Machine-List)
31 changes: 31 additions & 0 deletions Donation.md
@@ -0,0 +1,31 @@
<hr noshade="noshade">
<center>
<a href="Home"> [Home]</a>
<a href="Document"> [Document]</a>
<a href="faq"> [FAQ]</a>
<a href="publications"> [Publications]</a>
<a href="download"> [Download]</a>
<a href="Mailing-List">[Mailing List]</a>
<a href="Donation">[Donation]</a>
</center>
<hr noshade="noshade">

Thank you for the support.

You can read the OpenBLAS statement of receipts, disbursements, and cash balance in this [Google doc](https://docs.google.com/spreadsheet/ccc?key=0AghkTjXe2lDndE1UZml0dGpaUzJmZGhvenBZd1F2R1E&usp=sharing).

## Fundraiser

* [2013.8] [Testbed for OpenBLAS project](https://www.bountysource.com/fundraisers/443-testbed-for-openblas-project)

* completed.

Here is the [backer list](https://github.com/xianyi/OpenBLAS/blob/develop/BACKERS.md).

## For Personal Donation

You can create a bounty for an issue on [bountysource.com](https://www.bountysource.com/trackers/69691-openblas).

## For Hardware Vendors

We welcome hardware donations, including the latest CPUs and boards.
23 changes: 23 additions & 0 deletions Download.md
@@ -0,0 +1,23 @@
<hr noshade="noshade">
<center>
<a href="Home"> [Home]</a>
<a href="Document"> [Document]</a>
<a href="faq"> [FAQ]</a>
<a href="publications"> [Publications]</a>
<a href="download"> [Download]</a>
<a href="Mailing-List">[Mailing List]</a>
<a href="Donation">[Donation]</a>
</center>
<hr noshade="noshade">

## Binary Packages

We provide binary packages for the following platforms.

* Windows x86/x86_64
* ARM

You can download them from [file hosting on sourceforge.net](http://sourceforge.net/projects/openblas/files/).

## Source
Download the latest [stable version](https://github.com/xianyi/OpenBLAS/releases) from the release page.

0 comments on commit 09f99ae
