CANN Support Ascend310P to accelerate F32 and F16 LLM Model #10216

Merged on Nov 22, 2024 (4 commits)

@leo-pony (Contributor) commented Nov 8, 2024

This PR adds CANN support for Ascend310P to accelerate F32/F16 model inference. The corresponding issue is #10160. Q8 and Q4 support will be implemented next.

The feature works as expected:
[image]
[image]

@hipudding hipudding self-requested a review November 8, 2024 09:30
@hipudding hipudding added enhancement and Ascend NPU labels Nov 8, 2024
@feichenchina

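I used the code from the ascend310PAdapt branch (https://github.com/leo-pony/llama.cpp/blob/ascend310PAdapt/ggml/src/ggml-cann/kernels/quantize_f16_q8_0.cpp) to run inference with the Qwen2.5-7b-fp16.gguf model on a 310P, and the output is garbled. Is this model not supported yet, or is there some other cause?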

@leo-pony
Contributor Author

You need to pass the -DSOC_TYPE compile option, for example:
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=debug -DSOC_TYPE=Ascend310P3
cmake --build build --config debug

@feichenchina

feichenchina commented Nov 11, 2024 via email

@Hakstar

Hakstar commented Nov 18, 2024

Hello! I ran into an error when compiling and installing from https://github.com/leo-pony/llama.cpp.

  • Environment:
    [System platform]: Linux worker-12 4.19.90-23.8.v2101.ky10.x86_64
    [Hardware]: Ascend 310I Pro (310P3)
    [CANN]: 7.5.T11.0.B081:8.0.RC3.alpha003
    [NPU driver]: 24.1.rc2
  • Build commands:
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++
cmake --build build --config debug
  • Main error messages:
Building CXX object CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_get_row_q4_0.cpp.o
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_get_row_q4_0.cpp:7:
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/ggml/src/ggml-cann/kernels/get_row_q4_0.cpp:1:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/kernel_operator.h:27:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/interface/kernel_operator_intf.h:48:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/interface/kernel_operator_vec_vconv_intf.h:28:
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/impl/dav_m200/kernel_operator_vec_vconv_impl.h:455:5: error: no matching function for call to 'CastIntrinsicsImpl'
    CastIntrinsicsImpl(dst, src, roundMode, 1, repeatParams);
    ^~~~~~~~~~~~~~~~~~
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_vec_vconv_intf.cppm:131:5: note: in instantiation of function template specialization 'AscendC::CastImpl<half, AscendC::IntegerSubType<4, true>>' requested here

@leo-pony
Contributor Author

It seems -DSOC_TYPE=Ascend310P3 did not take effect. Please check that your code matches this PR; in this PR there is no Cast call for 310P3. If SOC_TYPE is set to Ascend310PX, the macro ASCEND_310P is defined.
image
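
One way to check whether `-DSOC_TYPE` was picked up is to look for the `-- Compile for Ascend310P.` status line that the configure step prints (it appears in a configure log posted further down this thread). The following is a minimal sketch, assuming that line is printed only when the 310P path is selected. The function name `soc_type_applied` and the log file name `configure.log` are hypothetical and not part of llama.cpp.

```shell
#!/bin/sh
# Hypothetical helper (not part of llama.cpp): succeed only if a saved CMake
# configure log contains the "-- Compile for Ascend310P." status line.
soc_type_applied() {
    log_file="$1"
    # "--" ends grep's options so the pattern may itself start with dashes.
    grep -q -- '-- Compile for Ascend310P\.' "$log_file"
}

# Usage sketch (assumes the configure output was saved beforehand):
#   cmake -B build -DGGML_CANN=on -DSOC_TYPE=Ascend310P3 2>&1 | tee configure.log
#   soc_type_applied configure.log && echo "SOC_TYPE applied"
```

If the check fails, the sources being built probably do not match this PR, which is the situation described above.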

@Hakstar

Hakstar commented Nov 19, 2024

After switching to the ascend310PAdapt branch and rerunning the build commands, a new error appeared. The key error messages are:

[ 51%] Built target test-model-load-cancel
Consolidate compiler generated dependencies of target test-autorelease
[ 51%] Linking CXX executable ../bin/test-autorelease
[ 52%] Built target test-autorelease
Consolidate compiler generated dependencies of target test-json-schema-to-grammar
[ 53%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 54%] Built target test-json-schema-to-grammar
Consolidate compiler generated dependencies of target test-c
[ 55%] Linking C executable ../bin/test-c
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::random_device::_M_getentropy() const@GLIBCXX_3.4.25'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_addref()@CXXABI_1.3.13'
collect2: error: ld returned 1 exit status
gmake[2]: *** [tests/CMakeFiles/test-c.dir/build.make:99:bin/test-c] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2490:tests/CMakeFiles/test-c.dir/all] Error 2
gmake: *** [Makefile:146:all] Error 2

@leo-pony
Contributor Author

It seems there are stale files in the build tree. Deleting the build directory and retrying may resolve this problem.
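
A clean rebuild along these lines could look like the sketch below. The helper name `clean_build_dir` is hypothetical. The cmake commands in the usage comment are the ones given earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: remove a stale CMake build directory so the next
# configure starts from scratch. Refuses empty or root paths as a safety net.
clean_build_dir() {
    build_dir="$1"
    case "$build_dir" in
        ""|"/") echo "refusing to remove '$build_dir'" >&2; return 1 ;;
    esac
    rm -rf -- "$build_dir"
}

# Usage sketch, reusing the commands from earlier in this thread:
#   clean_build_dir build
#   cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=debug -DSOC_TYPE=Ascend310P3
#   cmake --build build --config debug
```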

@Hakstar

Hakstar commented Nov 19, 2024

The same problem occurs after rebuilding. Here are the complete commands I ran:

git clone https://github.com/leo-pony/llama.cpp.git
cd llama.cpp
git checkout ascend310PAdapt
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++
cmake --build build --config debug

After running

cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++

the output was:

-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/gcc-11.2.0/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/gcc-11.2.0/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.27.0") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- OpenMP found
-- Using llamafile
-- Using AMX
-- CANN: updated CANN_INSTALL_DIR from ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
-- Compile for Ascend310P.
-- CANN: CANN_INCLUDE_DIRS =  /usr/local/Ascend/ascend-toolkit/latest/include;/usr/local/Ascend/ascend-toolkit/latest/include/aclnn;/usr/local/Ascend/ascend-toolkit/latest/acllib/include
-- CANN: CANN_LIBRARIES =  ascendcl;nnopbase;opapi;acl_op_compiler;ascendc_kernels
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: /home/xxx/llama.cpp/build

Running

cmake --build build --config debug

then fails. The main errors are:

[ 48%] Built target test-backend-ops
[ 48%] Building CXX object tests/CMakeFiles/test-rope.dir/test-rope.cpp.o
[ 48%] Building CXX object tests/CMakeFiles/test-rope.dir/get-model.cpp.o
[ 49%] Linking CXX executable ../bin/test-rope
[ 49%] Built target test-rope
[ 50%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/test-model-load-cancel.cpp.o
[ 50%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/get-model.cpp.o
[ 51%] Linking CXX executable ../bin/test-model-load-cancel
[ 51%] Built target test-model-load-cancel
[ 51%] Building CXX object tests/CMakeFiles/test-autorelease.dir/test-autorelease.cpp.o
[ 52%] Building CXX object tests/CMakeFiles/test-autorelease.dir/get-model.cpp.o
[ 52%] Linking CXX executable ../bin/test-autorelease
[ 52%] Built target test-autorelease
[ 53%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/test-json-schema-to-grammar.cpp.o
[ 53%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/get-model.cpp.o
[ 54%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 54%] Built target test-json-schema-to-grammar
[ 54%] Building C object tests/CMakeFiles/test-c.dir/test-c.c.o
[ 55%] Linking C executable ../bin/test-c
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::random_device::_M_getentropy() const@GLIBCXX_3.4.25'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_addref()@CXXABI_1.3.13'
collect2: error: ld returned 1 exit status
gmake[2]: *** [tests/CMakeFiles/test-c.dir/build.make:99:bin/test-c] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2490:tests/CMakeFiles/test-c.dir/all] Error 2
gmake: *** [Makefile:146:all] Error 2

@leo-pony
Contributor Author

Please check whether you can build basic CANN C++ applications:
https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC3alpha003/apiref/aolapi/context/common/编译与运行样例.md
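
The `@GLIBCXX_3.4.29`-style suffixes in the link errors reported above usually mean that the `libstdc++` found at link time is older than the one the objects were compiled against (here GCC 11.2.0 under `/opt`). This is an interpretation and is not confirmed in this thread. As a rough diagnostic sketch, the hypothetical helper `required_glibcxx` below lists the version tags that the linker output asks for, so they can be compared with what the system library provides. It assumes GNU `grep` and `sort`.

```shell
#!/bin/sh
# Hypothetical helper: read linker output on stdin and print each distinct
# GLIBCXX_x.y.z version tag it mentions, one per line, in version order.
required_glibcxx() {
    grep -o 'GLIBCXX_[0-9][0-9.]*' | sort -u -V
}

# Usage sketch:
#   cmake --build build --config debug 2>&1 | required_glibcxx
# Compare against the tags offered by the libstdc++.so.6 that the linker
# actually picks up (the path below is a placeholder and is system-specific):
#   strings /path/to/libstdc++.so.6 | grep -o 'GLIBCXX_[0-9][0-9.]*' | sort -u -V
```

If the highest tag required is missing from the library the linker resolves, the build is mixing two different C++ runtimes.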

@github-actions github-actions bot added documentation, build, script, testing, Nvidia GPU, nix, examples, python, devops, server, ggml, and SYCL labels Nov 19, 2024
@hipudding hipudding removed examples, python, devops, server, ggml, SYCL, Apple Metal, and Kompute labels Nov 19, 2024
@leo-pony leo-pony marked this pull request as ready for review November 20, 2024 07:57
@hipudding
Collaborator

#9560
#10160

@Hakstar

Hakstar commented Nov 22, 2024

After I fixed that and could build the sample normally, I pulled the latest 310PAdapt branch and built it, and a new error appeared:

-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/ggml/src/ggml-cann/kernels/ascendc_kernels_aic_device-prefix/src/ascendc_kernels_aic_device-build
[ 15%] Performing build step for 'ascendc_kernels_aic_device'
[ 12%] Building CXX object CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp.o
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp:10:
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/ggml/src/ggml-cann/kernels/dup.cpp:1:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/kernel_operator.h:28:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_intf.h:25:
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_dump_tensor_intf.cppm:138:24: error: (8th) const argument is in address space __gm__, but parameter must be in Local Memory
__aicore__ inline void AssertImpl(__gm__ const char* fmt, Args&&... args)
                       ^
1 error generated.
gmake[5]: *** [CMakeFiles/device_aic_obj.dir/build.make:76:CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp.o] Error 1
gmake[4]: *** [CMakeFiles/Makefile2:85:CMakeFiles/device_aic_obj.dir/all] Error 2
gmake[3]: *** [Makefile:91:all] Error 2
gmake[2]: *** [ggml/src/ggml-cann/kernels/CMakeFiles/ascendc_kernels_aic_device.dir/build.make:86:ggml/src/ggml-cann/kernels/ascendc_kernels_aic_device-prefix/src/ascendc_kernels_aic_device-stamp/ascendc_kernels_aic_device-build] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:1860:ggml/src/ggml-cann/kernels/CMakeFiles/ascendc_kernels_aic_device.dir/all] Error 2
gmake: *** [Makefile:146:all] Error 2

@leo-pony
Contributor Author

We don't have a Kylin OS + x86 environment, so we can't reproduce your problem. Please check whether the 310P is fully supported in your environment.
Other options you can try:
Option 1: Try to find what is wrong yourself, for example by deleting the assert code in dup.cpp.
Option 2: Try openEuler or Ubuntu with the community versions of the Ascend firmware, driver, and CANN toolkit.

@hipudding hipudding self-requested a review November 22, 2024 06:06
@hipudding hipudding merged commit c18610b into ggerganov:master Nov 22, 2024
54 checks passed
@Hakstar

Hakstar commented Nov 25, 2024

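May I ask what test environment you are using? I also learned from Huawei that the 310P comes in two variants, a single-core 310P and a dual-core one. The difference shows up in the output of npu-smi info:
[image]
If the NPU chips are grouped, the card is dual-core; if the output looks like the following (npu chip):
[image]
it is single-core. Which one are you using? Is it possible that the single-core variant is not supported by this PR?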

@leo-pony
Contributor Author

This is what I am using:
image

@Hakstar

Hakstar commented Nov 25, 2024

OK, thanks for your answer. It looks like you are using the dual-core 310P3, which should correspond to the 310I Duo in Huawei's product line.

@hipudding
Collaborator

I think this PR supports both of them.

@Hakstar

Hakstar commented Nov 26, 2024

After encountering the above issues, I consulted Huawei staff. Their response was that the biggest difference between the 310I Pro (single-core 310P) and the 310I Duo (dual-core 310P) is that the former cannot run LLM inference. If it is convenient, could you please verify this PR on the 310I Pro (single-core 310P)? Thank you very much!

@hipudding
Collaborator

Sorry, we only have the 310I Duo at the moment.

@leo-pony
Contributor Author

Could you share the contact information of the person who told you that the 310I Pro (single-core 310P) does not support LLM inference? We would like to learn more details.

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024
…0216)

* CANN Support Ascend310P to accelerate F32 and F16 Model

* Add compile option soc type macro ASCEND_310P to ggml-cann lib

* Remove unused code

* Remove the ascend soc_type hard code compile option in CMakelist.txt