CANN Support Ascend310P to accelerate F32 and F16 LLM Model #10216

leo-pony · 2024-11-08T09:29:56Z

This PR adds CANN support for Ascend310P to accelerate F32/F16 model inference. The corresponding issue is #10160. Q8 and Q4 will be implemented next.

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

Function is normal:

feichenchina · 2024-11-11T06:31:30Z

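I used the code from the ascend310PAdapt branch (https://github.com/leo-pony/llama.cpp/blob/ascend310PAdapt/ggml/src/ggml-cann/kernels/quantize_f16_q8_0.cpp) to run inference with the Qwen2.5-7b-fp16.gguf model on a 310P, and the output is garbled. Is this model not supported yet, or is there some other cause?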

leo-pony · 2024-11-11T06:41:24Z

The build must be configured with the -DSOC_TYPE option, for example:
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=debug -DSOC_TYPE=Ascend310P3
cmake --build build --config debug

feichenchina · 2024-11-11T07:19:23Z

I have already changed ggml/src/ggml-cann/kernels/CMakeLists.txt so that SOC_TYPE is automatically set to ascend310p3 when it is not set:

if (NOT SOC_TYPE)
    set (SOC_TYPE "ascend310p3")
endif()

Hakstar · 2024-11-18T02:33:47Z

Hello! I hit an error while building and installing from https://github.com/leo-pony/llama.cpp.

Environment:
[System platform]: Linux worker-12 4.19.90-23.8.v2101.ky10.x86_64
[Hardware]: Ascend 310I Pro (310P3)
[CANN]: 7.5.T11.0.B081:8.0.RC3.alpha003
[NPU driver]: 24.1.rc2
Build commands:

cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++
cmake --build build --config debug

Main error output:

Building CXX object CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_get_row_q4_0.cpp.o
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_get_row_q4_0.cpp:7:
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/ggml/src/ggml-cann/kernels/get_row_q4_0.cpp:1:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/kernel_operator.h:27:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/interface/kernel_operator_intf.h:48:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/interface/kernel_operator_vec_vconv_intf.h:28:
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/impl/dav_m200/kernel_operator_vec_vconv_impl.h:455:5: error: no matching function for call to 'CastIntrinsicsImpl'
    CastIntrinsicsImpl(dst, src, roundMode, 1, repeatParams);
    ^~~~~~~~~~~~~~~~~~
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_vec_vconv_intf.cppm:131:5: note: in instantiation of function template specialization 'AscendC::CastImpl<half, AscendC::IntegerSubType<4, true>>' requested here

leo-pony · 2024-11-18T02:50:04Z

It seems that -DSOC_TYPE=Ascend310P3 did not take effect. Please check that your code matches this PR: in this PR there is no Cast call on the 310P3 path. If SOC_TYPE is set to Ascend310PX, the macro ASCEND_310P is defined. The relevant files are:

ggml/src/ggml-cann/kernels/CMakeLists.txt

ggml/src/ggml-cann/aclnn_ops.cpp

ggml/src/ggml-cann/kernels/dup.cpp
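One way to check whether -DSOC_TYPE actually reached the configure step is to look it up in the CMake cache of the build directory. The sketch below is a hypothetical helper (the name cached_soc_type is not part of llama.cpp). It assumes the value was passed on the cmake command line, in which case CMake records it in build/CMakeCache.txt as a line of the form SOC_TYPE:<TYPE>=<value>.

```shell
# Hypothetical helper: print the SOC_TYPE value recorded in a CMakeCache.txt.
cached_soc_type() {
    local cache_file="$1"
    # Cache entries look like NAME:TYPE=VALUE; strip the prefix, keep the value.
    sed -n 's/^SOC_TYPE:[A-Z]*=//p' "$cache_file" | head -n 1
}

# Usage: report the cached value if a build directory already exists.
if [ -f build/CMakeCache.txt ]; then
    echo "SOC_TYPE in cache: $(cached_soc_type build/CMakeCache.txt)"
fi
```

An empty result would suggest that the option never made it into the cache, for instance because cmake was run against a different build directory.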

Hakstar · 2024-11-19T00:51:05Z

After switching to the ascend310PAdapt branch and re-running the build commands, I get a new error. The key error output is:

[ 51%] Built target test-model-load-cancel
Consolidate compiler generated dependencies of target test-autorelease
[ 51%] Linking CXX executable ../bin/test-autorelease
[ 52%] Built target test-autorelease
Consolidate compiler generated dependencies of target test-json-schema-to-grammar
[ 53%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 54%] Built target test-json-schema-to-grammar
Consolidate compiler generated dependencies of target test-c
[ 55%] Linking C executable ../bin/test-c
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::random_device::_M_getentropy() const@GLIBCXX_3.4.25'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_addref()@CXXABI_1.3.13'
collect2: error: ld returned 1 exit status
gmake[2]: *** [tests/CMakeFiles/test-c.dir/build.make:99: bin/test-c] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2490: tests/CMakeFiles/test-c.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
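The unresolved symbols carry version tags such as GLIBCXX_3.4.29 and CXXABI_1.3.13, which libstdc++ introduced with GCC 11. One possible explanation (an assumption, not confirmed anywhere in this thread) is that the link step resolved against an older system libstdc++ rather than the one shipped with /opt/gcc-11.2.0. A minimal sketch for checking which version tags a given libstdc++.so.6 contains follows; the function names are hypothetical.

```shell
# List the GLIBCXX/CXXABI version tags embedded in a shared library.
list_symbol_versions() {
    # -a treats the binary as text, -o prints only the matching tags.
    grep -aoE '(GLIBCXX|CXXABI)_[0-9]+(\.[0-9]+)*' "$1" | sort -u
}

# Succeed only if the library contains exactly the given version tag.
has_symbol_version() {
    list_symbol_versions "$1" | grep -xF "$2" > /dev/null
}
```

For example, `has_symbol_version /opt/gcc-11.2.0/lib64/libstdc++.so.6 GLIBCXX_3.4.29 && echo present` would check the newer toolchain's runtime; that path is only a guess at where the library lives on the reporter's machine.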

leo-pony · 2024-11-19T03:49:20Z

It looks like there are stale files in the build tree. Deleting the build directory and retrying may resolve this problem.

Hakstar · 2024-11-19T06:01:50Z

After rebuilding, I get the same problem. Here are the full commands I ran:

git clone https://github.com/leo-pony/llama.cpp.git
cd llama.cpp
git checkout ascend310PAdapt
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++
cmake --build build --config debug

After running

cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++

the output was:

-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/gcc-11.2.0/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/gcc-11.2.0/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.27.0") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- OpenMP found
-- Using llamafile
-- Using AMX
-- CANN: updated CANN_INSTALL_DIR from ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
-- Compile for Ascend310P.
-- CANN: CANN_INCLUDE_DIRS =  /usr/local/Ascend/ascend-toolkit/latest/include;/usr/local/Ascend/ascend-toolkit/latest/include/aclnn;/usr/local/Ascend/ascend-toolkit/latest/acllib/include
-- CANN: CANN_LIBRARIES =  ascendcl;nnopbase;opapi;acl_op_compiler;ascendc_kernels
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: /home/xxx/llama.cpp/build

Running

cmake --build build --config debug

fails; the main errors are:

[ 48%] Built target test-backend-ops
[ 48%] Building CXX object tests/CMakeFiles/test-rope.dir/test-rope.cpp.o
[ 48%] Building CXX object tests/CMakeFiles/test-rope.dir/get-model.cpp.o
[ 49%] Linking CXX executable ../bin/test-rope
[ 49%] Built target test-rope
[ 50%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/test-model-load-cancel.cpp.o
[ 50%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/get-model.cpp.o
[ 51%] Linking CXX executable ../bin/test-model-load-cancel
[ 51%] Built target test-model-load-cancel
[ 51%] Building CXX object tests/CMakeFiles/test-autorelease.dir/test-autorelease.cpp.o
[ 52%] Building CXX object tests/CMakeFiles/test-autorelease.dir/get-model.cpp.o
[ 52%] Linking CXX executable ../bin/test-autorelease
[ 52%] Built target test-autorelease
[ 53%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/test-json-schema-to-grammar.cpp.o
[ 53%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/get-model.cpp.o
[ 54%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 54%] Built target test-json-schema-to-grammar
[ 54%] Building C object tests/CMakeFiles/test-c.dir/test-c.c.o
[ 55%] Linking C executable ../bin/test-c
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::random_device::_M_getentropy() const@GLIBCXX_3.4.25'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_addref()@CXXABI_1.3.13'
collect2: error: ld returned 1 exit status
gmake[2]: *** [tests/CMakeFiles/test-c.dir/build.make:99: bin/test-c] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2490: tests/CMakeFiles/test-c.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2

leo-pony · 2024-11-19T06:45:33Z

Please check whether you can build basic CANN C++ applications:
https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC3alpha003/apiref/aolapi/context/common/编译与运行样例.md

hipudding · 2024-11-21T06:48:08Z

#9560
#10160

Hakstar · 2024-11-22T02:25:51Z

After I fixed that and could build the sample successfully, I pulled the latest 310PAdapt branch, and the build failed with a new error:

-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/ggml/src/ggml-cann/kernels/ascendc_kernels_aic_device-prefix/src/ascendc_kernels_aic_device-build
[ 15%] Performing build step for 'ascendc_kernels_aic_device'
[ 12%] Building CXX object CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp.o
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp:10:
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/ggml/src/ggml-cann/kernels/dup.cpp:1:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/kernel_operator.h:28:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_intf.h:25:
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_dump_tensor_intf.cppm:138:24: error: (8th) const argument is in address space __gm__, but parameter must be in Local Memory
__aicore__ inline void AssertImpl(__gm__ const char* fmt, Args&&... args)
                       ^
1 error generated.
gmake[5]: *** [CMakeFiles/device_aic_obj.dir/build.make:76: CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp.o] Error 1
gmake[4]: *** [CMakeFiles/Makefile2:85: CMakeFiles/device_aic_obj.dir/all] Error 2
gmake[3]: *** [Makefile:91: all] Error 2
gmake[2]: *** [ggml/src/ggml-cann/kernels/CMakeFiles/ascendc_kernels_aic_device.dir/build.make:86: ggml/src/ggml-cann/kernels/ascendc_kernels_aic_device-prefix/src/ascendc_kernels_aic_device-stamp/ascendc_kernels_aic_device-build] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:1860: ggml/src/ggml-cann/kernels/CMakeFiles/ascendc_kernels_aic_device.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2

leo-pony · 2024-11-22T03:53:23Z

We don't have a Kylin OS + x86 environment, so we can't reproduce your problem. Please check whether the 310P is fully supported in your environment.
Other options you could try:
Option 1: look into what is going wrong yourself, for example by deleting the assert code in dup.cpp.
Option 2: try openEuler or Ubuntu with the community versions of the Ascend firmware, driver, and CANN toolkit.

Hakstar · 2024-11-25T01:17:52Z

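What test environment are you using? Also, I learned from Huawei that there are two kinds of 310P: a single-chip 310P and a dual-chip one. The difference shows up in the output of npu-smi info:
if the NPU chips are grouped, the card is dual-chip, whereas output like the following (npu chip):

means it is single-chip. Which one are you using? Could it be that the single-chip variant is not supported by this PR?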

leo-pony · 2024-11-25T01:52:56Z

Here is what I am using:

Hakstar · 2024-11-25T02:39:33Z

OK, thank you for the answer. It looks like you are using the dual-chip 310P3, which should correspond to the 310I Duo in Huawei's product line.

hipudding · 2024-11-25T03:25:48Z

I think this PR supports both of them.

Hakstar · 2024-11-26T00:55:34Z

After encountering the above issues, I consulted Huawei staff. Their response was that the biggest difference between the 310I Pro (single-core 310P) and the 310I Duo (dual-core 310P) is that the former cannot perform LLM inference. If it is convenient, could you please verify this PR on the 310I Pro (single-core 310P)? Thank you very much!

hipudding · 2024-11-26T01:08:39Z

Sorry, we only have the 310I Duo currently.

leo-pony · 2024-11-27T01:56:45Z

Could you share the contact information of the person who told you that the 310I Pro (single-core 310P) does not support LLM inference? We would like to know more details.

…0216)
* CANN Support Ascend310P to accelerate F32 and F16 Model
* Add compile option soc type macro ASCEND_310P to ggml-cann lib
* Remove unused code
* Remove the ascend soc_type hard code compile option in CMakelist.txt

hipudding self-requested a review November 8, 2024 09:30

hipudding added the enhancement and Ascend NPU labels Nov 8, 2024

leo-pony mentioned this pull request Nov 11, 2024

Bug: CANN: Inference result garbled #10252

Open

hipudding reviewed Nov 18, 2024

ggml/src/ggml-cann/kernels/CMakeLists.txt (outdated)
ggml/src/ggml-cann/aclnn_ops.cpp (outdated)
ggml/src/ggml-cann/aclnn_ops.cpp (outdated)
ggml/src/ggml-cann/kernels/dup.cpp

hipudding assigned leo-pony Nov 18, 2024

leo-pony force-pushed the ascend310PAdapt branch from b0700ae to 7f5efeb Compare November 20, 2024 03:03

CANN Support Ascend310P to accelerate F32 and F16 LLM Model

6327369

gitlawr mentioned this pull request Nov 20, 2024

Unable to load NPU gpustack/gpustack#573

Open

leo-pony force-pushed the ascend310PAdapt branch from 7f5efeb to 6327369 Compare November 20, 2024 03:41

Add compile option soc type macro ASCEND_310P to ggml-cann lib

1ee8d72

leo-pony marked this pull request as ready for review November 20, 2024 07:57

Remove unused code

4201656

Remove the ascend soc_type hard code compile option in CMakelist.txt

be29da9

hipudding self-requested a review November 22, 2024 06:06

hipudding approved these changes Nov 22, 2024

hipudding merged commit c18610b into ggerganov:master Nov 22, 2024
54 checks passed
