Appendix G: CANN 8.5 Kernel Coverage — 998 Kernels

This appendix documents the coverage of CANN 8.5 built-in kernels by the ascendc-to-rs transpiler.

  • 998 CANN kernel names — the real operator batch that feeds ascendc-to-rs; each kernel below is a distinct ops_<category>__<name>.rs produced by the transpiler.
  • Two fidelity tiers:
    • Transpiled (real compute body): 247/998 (25%). The Rust body contains at least one compute intrinsic beyond the alloc / load / pipe_barrier / store boilerplate (e.g. ascend_add_f32, ub_reduce_max, tile_matmul_f16).
    • Registered (identity stub): 751/998 (75%). The body is load → barrier → store only. The transpiler parsed the C++ signature and produced a kernel that passes the compile gate, but has not yet lowered the compute intrinsics. Shape, dtype, and kernel ABI are real; the body is a placeholder.
  • This is compile-gate coverage: every kernel produces a valid kernel.acl.o through Rust → MLIR → AscendC → bisheng on Ascend 910B2. Numerical correctness against the reference CANN implementation is a separate (ongoing) gate.
  • Reproducible: the interactive browser below is regenerated from the in-repo transpiled corpus at benchmarks/cann_kernels/ops_*__*.rs by blog/mdbook/scripts/appg_build_cbdata.py. Re-run that script after any re-transpile to refresh both the per-category table and the embedded CB_DATA in one step.
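
The two-tier rule above can be sketched as a small classifier. This is only an illustration of the definition, not the logic used by appg_build_cbdata.py: the boilerplate call names in BOILERPLATE and the two sample bodies are assumptions, and the call-site regex is a crude heuristic that expects the kernel body only (no fn signature, macros, or method calls).

```python
import re

# Assumed boilerplate call names; the real names in the corpus may differ.
BOILERPLATE = {"alloc", "load", "pipe_barrier", "store"}

# Matches an identifier immediately followed by "(", i.e. a call site.
CALL_RE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def classify(rust_body: str) -> str:
    """Return 'transpiled' if the body calls anything beyond boilerplate."""
    calls = set(CALL_RE.findall(rust_body))
    return "transpiled" if calls - BOILERPLATE else "registered"


# Hypothetical kernel bodies, written only to exercise the rule.
stub_body = "let b = alloc(n); load(b, src); pipe_barrier(); store(dst, b);"
real_body = (
    "let b = alloc(n); load(b, src); pipe_barrier(); "
    "ascend_add_f32(b, b, b); store(dst, b);"
)

print(classify(stub_body))
print(classify(real_body))
```

A set difference keeps the rule aligned with the wording of the definition: a body counts as transpiled as soon as a single non-boilerplate call appears.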

Milestone — 2026-04-20: all 998/998 kernels in the real ascendc-to-rs batch produce a valid kernel.acl.o (compile-gate pass). 247/998 of these carry non-identity bodies; the remaining 751/998 are identity stubs awaiting intrinsic-lowering work in rustc_codegen_mlir. Tag: ascendc-to-rs-998-working.

Note on category scheme: this appendix uses the real batch categories emitted by the ascendc-to-rs pipeline (ops_cv, ops_legacy, ops_math, ops_nn, ops_oam, ops_transformer). Earlier drafts showed a synthetic 8-category catalog (with categories such as ops_index, ops_optimizer, ops_reduce, and ops_resize) that had no kernels in common with the tested set; it was replaced on 2026-04-20.
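
Because each kernel file follows the ops_<category>__<name>.rs pattern, the category can be recovered from the filename alone. A minimal sketch, assuming the category itself never contains a double underscore; the helper name parse_kernel_filename and the sample filenames are hypothetical, not taken from the corpus.

```python
from collections import Counter


def parse_kernel_filename(filename: str) -> tuple[str, str]:
    """Split 'ops_<category>__<name>.rs' into ('ops_<category>', '<name>')."""
    if not filename.endswith(".rs"):
        raise ValueError(f"not a Rust kernel file: {filename!r}")
    stem = filename[: -len(".rs")]
    # Split on the first double underscore only.
    category, sep, name = stem.partition("__")
    if not sep or not category.startswith("ops_") or not name:
        raise ValueError(f"unexpected kernel filename: {filename!r}")
    return category, name


# Hypothetical filenames for illustration.
files = ["ops_math__exp_f32.rs", "ops_math__sin_f16.rs", "ops_nn__layer_norm.rs"]
per_category = Counter(parse_kernel_filename(f)[0] for f in files)
print(per_category)
```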


G.1 Kernel Inventory by Category

| Category | Total | Transpiled | Registered | Description |
|---|---|---|---|---|
| ops_cv | 41 | 5 | 36 | Computer-vision primitives (resize, colour convert, background replace, custom blends) |
| ops_legacy | 343 | 106 | 237 | Element-wise unary/binary ops across the CANN legacy library (exp, abs, add, mul, logical, per-dtype variants) |
| ops_math | 155 | 52 | 103 | Math / special functions (trig, hyperbolic, erf, gamma, power, per-dtype variants) |
| ops_nn | 306 | 81 | 225 | Neural-network ops (activations, norms, pooling, loss, optimizers, indexing, reductions, resize) |
| ops_oam | 3 | 0 | 3 | Operator-Adapter (OAM) bridge kernels |
| ops_transformer | 150 | 3 | 147 | Attention, matmul, flash-attention, MoE, MLA, quantized-linear variants |
| Total | 998 | 247 | 751 | |
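
The table can be cross-checked with a few lines of arithmetic. The sketch below copies the per-category counts from the table and confirms that each row and the totals are consistent with the 247/998 (25%) and 751/998 (75%) figures quoted earlier; the variable names are illustrative.

```python
# (total, transpiled, registered) per category, copied from the table.
rows = {
    "ops_cv": (41, 5, 36),
    "ops_legacy": (343, 106, 237),
    "ops_math": (155, 52, 103),
    "ops_nn": (306, 81, 225),
    "ops_oam": (3, 0, 3),
    "ops_transformer": (150, 3, 147),
}

# Every row must split cleanly into the two tiers.
for category, (total, transpiled, registered) in rows.items():
    assert transpiled + registered == total, category

grand_total = sum(t for t, _, _ in rows.values())
grand_transpiled = sum(tr for _, tr, _ in rows.values())
grand_registered = sum(rg for _, _, rg in rows.values())

pct_transpiled = round(100 * grand_transpiled / grand_total)
pct_registered = round(100 * grand_registered / grand_total)
print(grand_total, grand_transpiled, grand_registered)
print(pct_transpiled, pct_registered)
```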

“Transpiled” = body contains compute intrinsics beyond alloc/load/barrier/store. “Registered” = body is an identity stub (load → barrier → store) that passes the compile gate but does not yet express the original C++ compute.

Among the larger categories, ops_transformer is the furthest from full fidelity (3/150 transpiled; only the three-kernel ops_oam category, at 0/3, is lower) because its kernels have complex inner loops (attention softmax, flash-attention tiling, matmul) that the transpiler does not yet lower. The legacy / math / nn categories fare better because their element-wise bodies already lower through today’s intrinsics. Closing the remaining gap is a rustc_codegen_mlir intrinsic-lowering task, not a transpiler-frontend one.
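
To make the per-category comparison concrete, the transpiled fraction can be computed and ranked directly from the table counts. This is a small sketch using only the numbers in G.1; the variable names are illustrative.

```python
# (total, transpiled) per category, copied from the G.1 table.
counts = {
    "ops_cv": (41, 5),
    "ops_legacy": (343, 106),
    "ops_math": (155, 52),
    "ops_nn": (306, 81),
    "ops_oam": (3, 0),
    "ops_transformer": (150, 3),
}

# Fraction of kernels in each category with a real compute body.
fidelity = {cat: done / total for cat, (total, done) in counts.items()}

# Lowest fidelity first: these categories need the most lowering work.
ranking = sorted(fidelity, key=fidelity.get)
for cat in ranking:
    print(f"{cat:16s} {fidelity[cat]:6.1%}")
```

Ranking by fraction rather than by absolute stub count puts the tiny ops_oam category first; ranking by remaining stubs would instead put ops_legacy (237) and ops_nn (225) at the top.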

G.2 Interactive Kernel Browser

Select a category and a kernel to view the AscendC C++ source alongside the transpiled Rust code. Click the buttons to open a kernel in the Playground.

998 kernels cataloged. Green = transpiled, grey = registered (source pending).

Back to Chapter 9: Automated Transpilation