xe: reduction: limit active channels to inner dim size
Simonsays095 authored and karturov committed Nov 5, 2024
1 parent 32d8660 commit a4fcef9
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion src/gpu/intel/ocl/reduction/combined_reduction.cl
@@ -210,7 +210,8 @@ combined_reduce(
     const int red_off_sg = (inner_idx_start + sglid) / INNER_DIM_SIZE;
     const int red_off_tg = red_off_sg + sgid * red_per_sg;

-    const int active_channels = min(SUBGROUP_SIZE, red_per_sg * INNER_DIM_SIZE);
+    const int active_channels = min(
+            SUBGROUP_SIZE, red_per_sg * (INNER_DIM_SIZE - inner_idx_start));
     ASSUME(active_channels == SUBGROUP_SIZE || !WITH_BLOCK_READ);

     const int loop_stride = _SRC_OFF(0, other_reductions, 0);