vulkan: Optimize binary ops (#10270)
Reuse the index calculations across all of src0/src1/dst. Add a shader variant for when src0/src1 are the same dimensions and additional modulus for src1 aren't needed. Div/mod are slow, so add "fast" div/mod that have a fast path when the calculation isn't needed or can be done more cheaply.
This commit is contained in:
parent
66798e42fb
commit
af148c9386
9 changed files with 117 additions and 52 deletions
|
@ -3,6 +3,8 @@
|
|||
#include "types.comp"
|
||||
#include "generic_binary_head.comp"
|
||||
|
||||
layout(local_size_x = 512, local_size_y = 1, local_size_z = 1) in;
|
||||
|
||||
void main() {
|
||||
const uint idx = gl_GlobalInvocationID.z * 262144 + gl_GlobalInvocationID.y * 512 + gl_GlobalInvocationID.x;
|
||||
const int dim = p.param3;
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue