Appendix E: Complete Kernel Inventory
This appendix is auto-generated by scripts/generate_kernel_appendix.sh. Run bash scripts/generate_kernel_appendix.sh to regenerate it.
Summary
| Metric | Count |
|---|---|
| Compiletest kernels | 486 |
| Deployable kernels | 19 |
| Total kernels | 505 |
| MultiKernelBench coverage | 300/300 (100%) |
| MKB categories covered | 15/15 (100%) |
| Memory safety vulnerability patterns | 6 classes (with attack examples) |
Vulnerability Pattern Legend
| ID | Vulnerability | C++ Root Cause | Rust Prevention | Attack Example |
|---|---|---|---|---|
| V1 | Type erasure | GM_ADDR erases all type info | Function signature encodes element type | case1 |
| V2 | Buffer overflow | GetValue(i) unchecked indexing | Buffer-ID API with explicit count | case2 |
| V3 | Integer overflow | Silent u32 wrap in offset calc | wrapping_mul makes overflow explicit | case6 |
| V4 | Use-after-free | FreeTensor() then stale access | No manual free in API | case3 |
| V5 | Double free | FreeTensor() called twice | No free operation exists | case5 |
| V6 | Missing sync | Forgotten pipe_barrier() | kernel_ops composites embed barriers | case4 |
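To make the V3 row concrete, here is a minimal host-side sketch in plain Rust. It is not ascend-rs kernel code, and the function names `tile_offset_wrapping` and `tile_offset_checked` are hypothetical. It contrasts an offset calculation that wraps by explicit request with one that reports overflow to the caller.

```rust
/// Hypothetical helper: offset of a tile in elements.
/// `wrapping_mul` wraps modulo 2^32, but the wraparound is spelled out
/// at the call site instead of happening silently.
fn tile_offset_wrapping(tile_idx: u32, tile_len: u32) -> u32 {
    tile_idx.wrapping_mul(tile_len)
}

/// Hypothetical helper: the same calculation, but overflow is returned
/// as `None` so the caller has to decide what to do about it.
fn tile_offset_checked(tile_idx: u32, tile_len: u32) -> Option<u32> {
    tile_idx.checked_mul(tile_len)
}
```

With `tile_idx = 65_536` and `tile_len = 65_536` the true product is 2^32, so the wrapping version yields 0 and the checked version yields `None`.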
Kernel Inventory by Category
Activation (17 kernels)
Applicable vulnerability patterns: V1 (type erasure), V2 (unchecked index), V6 (missing sync)
MKB reference: reference_kernels/activation/
Architecture (77 kernels)
Applicable vulnerability patterns: V1, V2, V3 (offset overflow), V6
MKB reference: reference_kernels/architecture/
Attention (23 kernels)
Applicable vulnerability patterns: V1, V2, V3, V6 (multi-stage sync)
MKB reference: reference_kernels/attention/
Broadcast (12 kernels)
Applicable vulnerability patterns: V1 (type erasure), V2 (bounds), V5 (double free)
MKB reference: reference_kernels/broadcast/
| Kernel Function | Source File | MKB Reference | 910B3 Status |
|---|---|---|---|
| add_bias | tests/compiletest/ui/broadcast_ops_kernel.rs | add_bias.py | PASS |
| elementwise_mul | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_mul.py | PASS |
| elementwise_div | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_div.py | PASS |
| elementwise_sub | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_sub.py | PASS |
| elementwise_max | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_max.py | PASS |
| clamp | tests/compiletest/ui/broadcast_ops_kernel.rs | — | PASS |
| elementwise_min | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_min.py | PASS |
| elementwise_square | tests/compiletest/ui/broadcast_ops_kernel.rs | — | PASS |
| where_broadcast | tests/compiletest/ui/broadcast_ext_kernel.rs | — | PASS |
| logic_and_broadcast | tests/compiletest/ui/broadcast_ext_kernel.rs | logic_and_broadcast.py | PASS |
| power_broadcast | tests/compiletest/ui/broadcast_ext_kernel.rs | power_broadcast.py | PASS |
| scalar_mul | tests/compiletest/ui/scalar_mul_kernel.rs | scalar_mul.py | PASS |
Convolution (34 kernels)
Applicable vulnerability patterns: V2 (nested loop OOB), V3 (stride*index overflow)
MKB reference: reference_kernels/convolution/
Fuse (120 kernels)
Applicable vulnerability patterns: V1, V2, V4 (use-after-free in chain), V6 (inter-op sync)
MKB reference: reference_kernels/fuse/
Index (12 kernels)
Applicable vulnerability patterns: V2 (gather/scatter OOB), V3 (index calc overflow)
MKB reference: reference_kernels/index/
| Kernel Function | Source File | MKB Reference | 910B3 Status |
|---|---|---|---|
| argmax | tests/compiletest/ui/index_ops_kernel.rs | argmax.py | PASS |
| argmin | tests/compiletest/ui/index_ops_kernel.rs | argmin.py | PASS |
| gather | tests/compiletest/ui/index_ops_kernel.rs | gather.py | PASS |
| scatter | tests/compiletest/ui/index_ops_kernel.rs | scatter.py | PASS |
| scatter_add | tests/compiletest/ui/index_ops_kernel.rs | scatter_add.py | PASS |
| index_select | tests/compiletest/ui/index_ops_kernel.rs | index_select.py | PASS |
| index_copy | tests/compiletest/ui/index_ops_kernel.rs | index_copy.py | PASS |
| index_add | tests/compiletest/ui/index_ops_kernel.rs | index_add.py | PASS |
| embedding | tests/compiletest/ui/index_ops_kernel.rs | embedding.py | PASS |
| masked_fill | tests/compiletest/ui/index_ops_kernel.rs | masked_fill.py | PASS |
| inplace_update | tests/compiletest/ui/index_ops_kernel.rs | inplace_update.py | PASS |
| take_along_dim | tests/compiletest/ui/index_ops_kernel.rs | take_along_dim.py | PASS |
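The gather/scatter out-of-bounds pattern (V2) noted for this category can be illustrated with a host-side reference in plain Rust. This is only a sketch under assumptions: `gather_checked` is a hypothetical name, and the code says nothing about how the kernels in index_ops_kernel.rs are actually written. The point is that every index is validated against the source length before it is used.

```rust
/// Hypothetical host-side reference for a 1-D gather.
/// Returns the gathered values, or `Err(i)` carrying the first index
/// that falls outside `src`.
fn gather_checked(src: &[f32], indices: &[usize]) -> Result<Vec<f32>, usize> {
    indices
        .iter()
        // `get` returns None for an out-of-range index instead of reading
        // past the end of the slice.
        .map(|&i| src.get(i).copied().ok_or(i))
        .collect()
}
```

For a three-element source, indices `[2, 0]` succeed, while `[1, 3]` is rejected with the offending index 3.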
Loss (6 kernels)
Applicable vulnerability patterns: V1, V2, V6 (reduction sync)
MKB reference: reference_kernels/loss/
| Kernel Function | Source File | MKB Reference | 910B3 Status |
|---|---|---|---|
| mse_loss | tests/compiletest/ui/loss_ops_kernel.rs | mse_loss.py | PASS |
| huber_loss | tests/compiletest/ui/loss_ops_kernel.rs | huber_loss.py | PASS |
| hinge_loss | tests/compiletest/ui/loss_ops_kernel.rs | hinge_loss.py | PASS |
| cosine_similarity | tests/compiletest/ui/loss_ops_kernel.rs | cosine_similarity.py | PASS |
| cross_entropy_loss | tests/compiletest/ui/loss_ops_kernel.rs | cross_entropy_loss.py | PASS |
| kl_div_loss | tests/compiletest/ui/loss_ops_kernel.rs | kl_div_loss.py | PASS |
Math (5 kernels)
Applicable vulnerability patterns: V2 (cumulative bounds), V3 (offset overflow)
MKB reference: reference_kernels/math/
| Kernel Function | Source File | MKB Reference | 910B3 Status |
|---|---|---|---|
| matrix_scalar_mul | tests/compiletest/ui/math_ops_kernel.rs | matrix_scalar_mul.py | PASS |
| cumprod | tests/compiletest/ui/math_cumulative_kernel.rs | cumprod.py | PASS |
| cumsum | tests/compiletest/ui/math_cumulative_kernel.rs | cumsum.py | PASS |
| cumsum_exclusive | tests/compiletest/ui/math_cumulative_kernel.rs | cumsum_exclusive.py | PASS |
| cumsum_reverse | tests/compiletest/ui/math_cumulative_kernel.rs | cumsum_reverse.py | PASS |
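For readers unfamiliar with the cumulative variants listed above, the following host-side sketch shows the semantics their names suggest. These definitions are an assumption inferred from the kernel names, not taken from math_cumulative_kernel.rs or the MKB reference scripts: inclusive prefix sum, exclusive prefix sum (shifted by one, starting at 0), and suffix sum.

```rust
/// Inclusive prefix sum: out[i] = x[0] + ... + x[i].
fn cumsum(x: &[f32]) -> Vec<f32> {
    let mut acc = 0.0;
    x.iter().map(|&v| { acc += v; acc }).collect()
}

/// Exclusive prefix sum (assumed meaning): out[i] = x[0] + ... + x[i-1],
/// so out[0] is 0.
fn cumsum_exclusive(x: &[f32]) -> Vec<f32> {
    let mut acc = 0.0;
    x.iter().map(|&v| { let prev = acc; acc += v; prev }).collect()
}

/// Reverse cumulative sum (assumed meaning): out[i] = x[i] + ... + x[n-1].
fn cumsum_reverse(x: &[f32]) -> Vec<f32> {
    let reversed: Vec<f32> = x.iter().rev().copied().collect();
    let mut out = cumsum(&reversed);
    out.reverse();
    out
}
```

On the input `[1, 2, 3, 4]` these produce `[1, 3, 6, 10]`, `[0, 1, 3, 6]`, and `[10, 9, 7, 4]` respectively.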
Matmul (23 kernels)
Applicable vulnerability patterns: V1 (type erasure f16/f32), V2 (tile bounds), V3 (dim overflow), V6 (cube sync)
MKB reference: reference_kernels/matmul/
Normalization (10 kernels)
Applicable vulnerability patterns: V1, V2, V6 (reduce-normalize sync)
MKB reference: reference_kernels/normalization/
| Kernel Function | Source File | MKB Reference | 910B3 Status |
|---|---|---|---|
| rms_norm | tests/compiletest/ui/norm_ops_kernel.rs | rms_norm.py | PASS |
| l1_norm | tests/compiletest/ui/norm_ops_kernel.rs | l1_norm.py | PASS |
| l2_norm | tests/compiletest/ui/norm_ops_kernel.rs | l2_norm.py | PASS |
| l2_normalize | tests/compiletest/ui/norm_ops_kernel.rs | l2_normalize.py | PASS |
| layer_norm | tests/compiletest/ui/norm_ops_kernel.rs | layer_norm.py | PASS |
| batch_norm | tests/compiletest/ui/norm_extended_kernel.rs | — | PASS |
| group_norm | tests/compiletest/ui/norm_extended_kernel.rs | group_norm.py | PASS |
| instance_norm | tests/compiletest/ui/norm_extended_kernel.rs | instance_norm.py | PASS |
| frobenius_norm | tests/compiletest/ui/norm_extended_kernel.rs | frobenius_norm.py | PASS |
| layernorm | tests/compiletest/ui/layernorm_kernel.rs | layernorm.py | PASS |
Optimizer (6 kernels)
Applicable vulnerability patterns: V1, V2 (param bounds), V4 (in-place update UAF)
MKB reference: reference_kernels/optimizer/
| Kernel Function | Source File | MKB Reference | 910B3 Status |
|---|---|---|---|
| sgd_update | tests/compiletest/ui/optimizer_ops_kernel.rs | sgd_update.py | PASS |
| sgd_momentum | tests/compiletest/ui/optimizer_ops_kernel.rs | sgd_momentum.py | PASS |
| adagrad_update | tests/compiletest/ui/optimizer_ops_kernel.rs | adagrad_update.py | PASS |
| rmsprop_update | tests/compiletest/ui/optimizer_ops_kernel.rs | rmsprop_update.py | PASS |
| adam_update | tests/compiletest/ui/optimizer_ops_kernel.rs | adam_update.py | PASS |
| lamb_update | tests/compiletest/ui/optimizer_ext_kernel.rs | lamb_update.py | PASS |
Pooling (12 kernels)
Applicable vulnerability patterns: V2 (window OOB), V3 (stride overflow)
MKB reference: reference_kernels/pooling/
Reduce (5 kernels)
Applicable vulnerability patterns: V1, V2, V6 (reduction pipeline sync)
MKB reference: reference_kernels/reduce/
| Kernel Function | Source File | MKB Reference | 910B3 Status |
|---|---|---|---|
| reduce_max | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_max.py | PASS |
| reduce_min | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_min.py | PASS |
| reduce_sum | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_sum.py | PASS |
| reduce_mean | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_mean.py | PASS |
| reduce_prod | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_prod.py | PASS |
Resize (15 kernels)
Applicable vulnerability patterns: V2 (interpolation OOB), V3 (coordinate overflow)
MKB reference: reference_kernels/resize/
Tiled (16 kernels)
Applicable vulnerability patterns: V2 (tile boundary OOB), V6 (tile-boundary sync)
| Kernel Function | Source File | 910B3 Status |
|---|---|---|
| relu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| sigmoid_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| gelu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| tanh_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| swish_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| exp_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| vec_add_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| vec_mul_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| elu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| mish_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| layernorm_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| softmax_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| selu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| leaky_relu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| hardswish_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| rmsnorm_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
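The tile-boundary OOB pattern (V2) listed for this category typically arises when the input length is not a multiple of the tile length and the last tile runs past the end. A minimal host-side sketch of the usual remedy, clamping the final tile, might look like the following. The names `tile_ranges` and `relu_tiled_host` are hypothetical and the code is plain Rust on slices, not the contents of tiled_kernel.rs.

```rust
/// Hypothetical helper: split `n` elements into half-open (start, end)
/// ranges of at most `tile_len` elements, clamping the last tile to `n`.
fn tile_ranges(n: usize, tile_len: usize) -> Vec<(usize, usize)> {
    assert!(tile_len > 0, "tile_len must be non-zero");
    (0..n)
        .step_by(tile_len)
        .map(|start| (start, (start + tile_len).min(n)))
        .collect()
}

/// Hypothetical host-side ReLU that walks the input tile by tile.
fn relu_tiled_host(input: &[f32], tile_len: usize) -> Vec<f32> {
    let mut out = vec![0.0; input.len()];
    for (start, end) in tile_ranges(input.len(), tile_len) {
        // `end` never exceeds input.len(), so the tail tile stays in bounds.
        for i in start..end {
            out[i] = input[i].max(0.0);
        }
    }
    out
}
```

For 10 elements and a tile length of 4, the ranges are `(0, 4)`, `(4, 8)`, and a shortened tail `(8, 10)`.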
Multiblock (16 kernels)
Applicable vulnerability patterns: V2 (block partition OOB), V6 (cross-block sync)
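Block-partition OOB usually comes from dividing the work evenly across blocks and forgetting that the last blocks may receive fewer elements, or none. The sketch below shows one way to compute a per-block range safely. It is an illustration in plain Rust with a hypothetical name, `block_range`, and does not reflect the actual partitioning code of the multiblock kernels.

```rust
/// Hypothetical helper: half-open element range handled by `block_idx`
/// when `n` elements are split across `num_blocks` blocks.
fn block_range(n: usize, num_blocks: usize, block_idx: usize) -> (usize, usize) {
    assert!(num_blocks > 0 && block_idx < num_blocks);
    // Ceiling division, so that all elements are covered.
    let per_block = (n + num_blocks - 1) / num_blocks;
    // Clamp both ends so trailing blocks get a short or empty range
    // instead of an out-of-bounds one.
    let start = (block_idx * per_block).min(n);
    let end = (start + per_block).min(n);
    (start, end)
}
```

With 10 elements over 4 blocks, the last block receives `(9, 10)`; with 2 elements over 4 blocks, blocks 2 and 3 receive the empty range `(2, 2)`.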
F16 (14 kernels)
Applicable vulnerability patterns: V1 (f16/f32 type confusion)
| Kernel Function | Source File | 910B3 Status |
|---|---|---|
| relu_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| sigmoid_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| abs_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| exp_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| ln_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| sqrt_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| rsqrt_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| reciprocal_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| vec_add_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| vec_sub_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| vec_mul_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| vec_div_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| reduce_max_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| reduce_sum_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
Unary_math (8 kernels)
Applicable vulnerability patterns: V1, V2
| Kernel Function | Source File | 910B3 Status |
|---|---|---|
| exp_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| ln_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| sqrt_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| rsqrt_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| reciprocal_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| negate_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| square_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| cube_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
Deployable Kernels (with host code)
Memory Safety Case Studies
Each case pairs a vulnerable C++ kernel with a structurally safe Rust kernel.
| Case | Vulnerability | C++ File | Rust File |
|---|---|---|---|
| 1 | Type confusion (GM_ADDR type erasure) | vulnerable.cpp | safe.rs |
| 2 | Buffer overflow (unchecked indexing) | vulnerable.cpp | safe.rs |
| 3 | Use-after-free (FreeTensor then access) | vulnerable.cpp | safe.rs |
| 4 | Missing sync (forgotten pipe_barrier) | vulnerable.cpp | safe.rs |
| 5 | Double free (repeated FreeTensor) | vulnerable.cpp | safe.rs |
| 6 | Integer overflow (silent offset wrap) | vulnerable.cpp | safe.rs |
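As a rough illustration of what "structurally safe" can mean for cases 3 and 5, the sketch below models a buffer handle that has no free method at all: release happens once, when the owner goes out of scope, and a moved handle can no longer be touched. `LocalBuf`, `consume`, and the release counter are hypothetical constructs for this example; the use of `Drop` here is an assumption and the actual safe.rs files may be organized differently.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for an on-chip buffer handle.
/// There is deliberately no `free` method.
struct LocalBuf {
    data: Vec<f32>,
    releases: Rc<Cell<u32>>, // counts how often the buffer is released
}

impl LocalBuf {
    fn new(len: usize, releases: Rc<Cell<u32>>) -> Self {
        LocalBuf { data: vec![0.0; len], releases }
    }

    fn len(&self) -> usize {
        self.data.len()
    }
}

impl Drop for LocalBuf {
    // Runs exactly once, when the owning binding goes out of scope.
    fn drop(&mut self) {
        self.releases.set(self.releases.get() + 1);
    }
}

/// Takes ownership of the buffer; it is released when this function returns.
fn consume(buf: LocalBuf) -> usize {
    buf.len()
}

/// Creates, uses, and releases one buffer, then reports the release count.
fn release_count_after_use() -> u32 {
    let releases = Rc::new(Cell::new(0));
    let buf = LocalBuf::new(8, Rc::clone(&releases));
    let n = consume(buf);
    // Calling `buf.len()` here would not compile: `buf` was moved into
    // `consume`, so a stale access (case 3) is rejected at compile time.
    assert_eq!(n, 8);
    releases.get()
}
```

Because the only release path is `Drop`, the counter ends at exactly 1: there is no call a programmer could issue twice (case 5).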
Performance Comparison (in progress)
| Kernel | ascend-rs Time | AscendC C++ Time | Ratio | Notes |
|---|---|---|---|---|
| softmax (256) | 0.077 ms | 0.078 ms | 0.99x | Zero overhead |
| softmax (16384) | 0.087 ms | 0.089 ms | 0.98x | Zero overhead |
| relu | — | — | — | Pending |
| matmul | — | — | — | Pending |
| layernorm | — | — | — | Pending |
| conv2d | — | — | — | Pending |
Performance benchmarking experiments are in progress. This table will be updated as results become available.
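The Ratio column appears to be the ascend-rs time divided by the AscendC C++ time, rounded to two decimals; both measured rows are consistent with that reading. A small sketch of that calculation follows, with `format_ratio` as a hypothetical helper name rather than something taken from the benchmarking scripts.

```rust
/// Hypothetical helper: ratio of ascend-rs time to AscendC C++ time,
/// formatted the way the table shows it (two decimals plus "x").
/// Values below 1.00x mean the ascend-rs kernel was faster.
fn format_ratio(ascend_rs_ms: f64, ascendc_ms: f64) -> String {
    format!("{:.2}x", ascend_rs_ms / ascendc_ms)
}
```

Applied to the two softmax rows, `format_ratio(0.077, 0.078)` gives `0.99x` and `format_ratio(0.087, 0.089)` gives `0.98x`.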
This appendix was auto-generated by bash scripts/generate_kernel_appendix.sh.
Kernel counts: 486 compiletests + 19 deployable = 505 total.