GPU Kernels

VTensor tensors can be used in GPU kernels in two ways.

Raw pointer

You can still pass the tensor's raw pointer to a kernel and index the data directly. This only works if the tensor's memory is contiguous, because the kernel indexes the buffer linearly and knows nothing about strides.

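#include <iostream>
#include <lib/vtensor.hpp>

// Each thread increments one element; linear indexing assumes contiguous storage.
__global__ void kernel(float* tensor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard against threads past the end of the buffer
        tensor[i] += 1;
    }
}

int main() {
    auto tensor = vt::arange(12).reshape(4, 3);  // 12 elements viewed as 4 x 3
    kernel<<<1, 12>>>(tensor.raw_ptr(), 12);     // one thread per element
    cudaError_t err = cudaDeviceSynchronize();   // wait for the kernel to finish
    if (err != cudaSuccess) {
        std::cerr << "CUDA error: " << cudaGetErrorString(err) << std::endl;
        return 1;
    }
    return 0;
}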
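
The contiguity requirement can be illustrated on the host without any GPU. The following is a minimal sketch, not VTensor's implementation: the helpers strided_offset, is_row_major_contiguous and pack_contiguous are hypothetical names introduced here, and strides are assumed to be measured in elements. It shows that a transposed view of a 4 x 3 buffer is not contiguous, so linear indexing through the raw pointer would read elements in the wrong order unless the view is first packed into a contiguous buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of VTensor): offset of element (row, col)
// in a 2-D view whose strides are given in elements.
inline std::size_t strided_offset(std::size_t row, std::size_t col,
                                  std::size_t row_stride, std::size_t col_stride) {
    return row * row_stride + col * col_stride;
}

// Hypothetical helper: simplified row-major contiguity check for a 2-D view.
// Only the strides and the column count matter for this check.
inline bool is_row_major_contiguous(std::size_t rows, std::size_t cols,
                                    std::size_t row_stride, std::size_t col_stride) {
    (void)rows;
    return col_stride == 1 && row_stride == cols;
}

// Hypothetical helper: copy a strided rows x cols view into a new
// row-major contiguous buffer that is safe to index linearly.
inline std::vector<float> pack_contiguous(const std::vector<float>& storage,
                                          std::size_t rows, std::size_t cols,
                                          std::size_t row_stride, std::size_t col_stride) {
    assert(rows > 0 && cols > 0);
    // The last element of the view must lie inside the backing storage.
    assert(strided_offset(rows - 1, cols - 1, row_stride, col_stride) < storage.size());
    std::vector<float> out;
    out.reserve(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out.push_back(storage[strided_offset(r, c, row_stride, col_stride)]);
        }
    }
    return out;
}
```

For a 12-element buffer holding 0 through 11, the 4 x 3 view with strides (3, 1) is contiguous, while its 3 x 4 transpose with strides (1, 3) is not: the second element of the transposed view is storage[3], not storage[1]. A kernel that indexes the raw pointer linearly would therefore need the packed copy to see the transposed layout.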