currybab's blog

PMPP Lecture 17: Sparse Matrix Computation (ELL and JDS) Summary

Source: Lecture 17 - Sparse Matrix Computation (ELL and JDS)

Today

ELLPACK Format (ELL)

[Figures: ELL concept; ELL array]

SpMV/ELL

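    struct ELLMatrix {
        unsigned int numRows;
        unsigned int numCols;
        unsigned int maxNNZPerRow;  // padded width: nonzero count of the longest row
        unsigned int* nnzPerRow;    // actual nonzero count of each row
        unsigned int* colIdxs;      // column-major, padded: maxNNZPerRow * numRows entries
        float* values;              // same layout as colIdxs
    };
    // One thread per matrix row.
    __global__ void spmv_ell_kernel(ELLMatrix ellMatrix, const float* inVector, float* outVector) {
        unsigned int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row < ellMatrix.numRows) {
            float sum = 0.0f;
            for (unsigned int iter = 0; iter < ellMatrix.nnzPerRow[row]; iter++) {
                // Column-major index: consecutive threads (rows) read consecutive addresses.
                unsigned int i = iter * ellMatrix.numRows + row;
                unsigned int col = ellMatrix.colIdxs[i];
                float value = ellMatrix.values[i];
                sum += value * inVector[col];
            }
            outVector[row] = sum;
        }
    }
    // Compared with the CSR version, the better-coalesced accesses improve kernel time (0.386 ms -> 0.295 ms).
    // Copy time gets longer, because padding makes the stored matrix larger.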
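The kernel above indexes with `iter * numRows + row`, which implies a padded, column-major layout. As a hedged illustration of that layout, here is a minimal host-side C++ sketch (it should also compile as CUDA host code) that pads CSR data into ELL arrays and runs a CPU reference SpMV. `HostELL`, `csrToEll`, and `spmvEllHost` are hypothetical names that are not from the lecture, and starting from CSR arrays is an assumption about where the data comes from.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side mirror of ELLMatrix (hypothetical; std::vector instead of raw pointers).
struct HostELL {
    unsigned int numRows = 0;
    unsigned int numCols = 0;
    unsigned int maxNNZPerRow = 0;
    std::vector<unsigned int> nnzPerRow;
    std::vector<unsigned int> colIdxs;  // column-major, padded
    std::vector<float> values;          // column-major, padded
};

// Pad CSR data (rowPtrs has numRows + 1 entries) into column-major ELL arrays.
HostELL csrToEll(unsigned int numRows, unsigned int numCols,
                 const std::vector<unsigned int>& rowPtrs,
                 const std::vector<unsigned int>& csrColIdxs,
                 const std::vector<float>& csrValues) {
    assert(rowPtrs.size() == numRows + 1);
    HostELL ell;
    ell.numRows = numRows;
    ell.numCols = numCols;
    ell.nnzPerRow.resize(numRows);
    for (unsigned int row = 0; row < numRows; ++row) {
        ell.nnzPerRow[row] = rowPtrs[row + 1] - rowPtrs[row];
        ell.maxNNZPerRow = std::max(ell.maxNNZPerRow, ell.nnzPerRow[row]);
    }
    // Every row gets maxNNZPerRow slots; unused slots stay as padding (0).
    const std::size_t total = static_cast<std::size_t>(ell.maxNNZPerRow) * numRows;
    ell.colIdxs.assign(total, 0u);
    ell.values.assign(total, 0.0f);
    for (unsigned int row = 0; row < numRows; ++row) {
        for (unsigned int iter = 0; iter < ell.nnzPerRow[row]; ++iter) {
            const unsigned int i = iter * numRows + row;  // same index as the kernel
            ell.colIdxs[i] = csrColIdxs[rowPtrs[row] + iter];
            ell.values[i] = csrValues[rowPtrs[row] + iter];
        }
    }
    return ell;
}

// CPU reference with the same loop structure as spmv_ell_kernel.
std::vector<float> spmvEllHost(const HostELL& ell, const std::vector<float>& x) {
    assert(x.size() == ell.numCols);
    std::vector<float> y(ell.numRows, 0.0f);
    for (unsigned int row = 0; row < ell.numRows; ++row) {
        float sum = 0.0f;
        for (unsigned int iter = 0; iter < ell.nnzPerRow[row]; ++iter) {
            const unsigned int i = iter * ell.numRows + row;
            sum += ell.values[i] * x[ell.colIdxs[i]];
        }
        y[row] = sum;
    }
    return y;
}
```

For a matrix whose rows hold 2, 3, 2, and 1 nonzeros, this sketch allocates 3 * 4 = 12 slots for 8 values, which is the padding overhead mentioned in the comment above.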

ELL Tradeoffs

Hybrid ELL + COO

[Figure: hybrid ELL + COO]
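A minimal host-side sketch of the hybrid idea, under stated assumptions: each row keeps at most `ellWidth` nonzeros in the padded ELL part, and whatever does not fit spills into COO triples. The names `HybridEllCoo`, `csrToHybrid`, `spmvHybridHost`, and the `ellWidth` parameter are hypothetical, and how the cap is chosen is left open here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: a capped ELL part plus a COO overflow part.
struct HybridEllCoo {
    unsigned int numRows = 0;
    unsigned int numCols = 0;
    unsigned int ellWidth = 0;                // slots per row in the ELL part
    std::vector<unsigned int> ellNnzPerRow;
    std::vector<unsigned int> ellColIdxs;     // column-major: iter * numRows + row
    std::vector<float> ellValues;
    std::vector<unsigned int> cooRowIdxs;     // overflow elements as (row, col, value)
    std::vector<unsigned int> cooColIdxs;
    std::vector<float> cooValues;
};

// Split CSR data: the first ellWidth nonzeros of each row go to ELL, the rest to COO.
HybridEllCoo csrToHybrid(unsigned int numRows, unsigned int numCols,
                         const std::vector<unsigned int>& rowPtrs,
                         const std::vector<unsigned int>& csrColIdxs,
                         const std::vector<float>& csrValues,
                         unsigned int ellWidth) {
    assert(rowPtrs.size() == numRows + 1);
    HybridEllCoo h;
    h.numRows = numRows;
    h.numCols = numCols;
    h.ellWidth = ellWidth;
    h.ellNnzPerRow.resize(numRows);
    const std::size_t total = static_cast<std::size_t>(ellWidth) * numRows;
    h.ellColIdxs.assign(total, 0u);
    h.ellValues.assign(total, 0.0f);
    for (unsigned int row = 0; row < numRows; ++row) {
        const unsigned int nnz = rowPtrs[row + 1] - rowPtrs[row];
        h.ellNnzPerRow[row] = std::min(nnz, ellWidth);
        for (unsigned int iter = 0; iter < nnz; ++iter) {
            const unsigned int src = rowPtrs[row] + iter;
            if (iter < ellWidth) {
                const unsigned int i = iter * numRows + row;
                h.ellColIdxs[i] = csrColIdxs[src];
                h.ellValues[i] = csrValues[src];
            } else {
                h.cooRowIdxs.push_back(row);
                h.cooColIdxs.push_back(csrColIdxs[src]);
                h.cooValues.push_back(csrValues[src]);
            }
        }
    }
    return h;
}

// CPU reference: ELL pass first, then add the COO overflow contributions.
std::vector<float> spmvHybridHost(const HybridEllCoo& h, const std::vector<float>& x) {
    assert(x.size() == h.numCols);
    std::vector<float> y(h.numRows, 0.0f);
    for (unsigned int row = 0; row < h.numRows; ++row) {
        for (unsigned int iter = 0; iter < h.ellNnzPerRow[row]; ++iter) {
            const unsigned int i = iter * h.numRows + row;
            y[row] += h.ellValues[i] * x[h.ellColIdxs[i]];
        }
    }
    // In a GPU kernel with one thread per COO element, this update would need
    // atomicAdd, because several overflow elements can target the same row.
    for (std::size_t k = 0; k < h.cooValues.size(); ++k) {
        y[h.cooRowIdxs[k]] += h.cooValues[k] * x[h.cooColIdxs[k]];
    }
    return y;
}
```

With `ellWidth = 2` on a matrix whose rows hold 2, 3, 2, and 1 nonzeros, the ELL part shrinks from 12 slots to 8 and a single element moves to COO.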

ELL + COO Tradeoffs

Jagged Diagonal Storage (JDS)

[Figure: JDS row collect]

[Figure: JDS row sort]

[Figure: JDS column major]

[Figure: JDS storage]

SpMV/JDS kernel
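As a rough host-side sketch of the steps listed under JDS (sort rows by length, store column-major without padding, then multiply), the following C++ builds JDS arrays from CSR input and runs a CPU reference SpMV. It is not the lecture's kernel: `HostJDS`, `csrToJds`, `spmvJdsHost`, `rowPerm`, and `iterPtrs` are hypothetical names, and the CSR input is an assumption. The outer loop over sorted rows corresponds to what one GPU thread per row would do.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical JDS container.
struct HostJDS {
    unsigned int numRows = 0;
    unsigned int numCols = 0;
    std::vector<unsigned int> rowPerm;   // rowPerm[r] = original index of sorted row r
    std::vector<unsigned int> iterPtrs;  // start offset of each jagged diagonal
    std::vector<unsigned int> colIdxs;   // column-major over sorted rows, no padding
    std::vector<float> values;
};

HostJDS csrToJds(unsigned int numRows, unsigned int numCols,
                 const std::vector<unsigned int>& rowPtrs,
                 const std::vector<unsigned int>& csrColIdxs,
                 const std::vector<float>& csrValues) {
    assert(rowPtrs.size() == numRows + 1);
    HostJDS jds;
    jds.numRows = numRows;
    jds.numCols = numCols;
    auto nnzOf = [&](unsigned int row) { return rowPtrs[row + 1] - rowPtrs[row]; };
    // Sort row indices by nonzero count, longest first (stable keeps ties in order).
    jds.rowPerm.resize(numRows);
    std::iota(jds.rowPerm.begin(), jds.rowPerm.end(), 0u);
    std::stable_sort(jds.rowPerm.begin(), jds.rowPerm.end(),
                     [&](unsigned int a, unsigned int b) { return nnzOf(a) > nnzOf(b); });
    const unsigned int maxNNZ = numRows > 0 ? nnzOf(jds.rowPerm[0]) : 0u;
    jds.iterPtrs.push_back(0u);
    for (unsigned int iter = 0; iter < maxNNZ; ++iter) {
        // Only rows that still have an element at position iter contribute,
        // and after sorting those rows form a prefix of the sorted order.
        for (unsigned int r = 0; r < numRows && nnzOf(jds.rowPerm[r]) > iter; ++r) {
            const unsigned int src = rowPtrs[jds.rowPerm[r]] + iter;
            jds.colIdxs.push_back(csrColIdxs[src]);
            jds.values.push_back(csrValues[src]);
        }
        jds.iterPtrs.push_back(static_cast<unsigned int>(jds.values.size()));
    }
    return jds;
}

std::vector<float> spmvJdsHost(const HostJDS& jds, const std::vector<float>& x) {
    assert(x.size() == jds.numCols);
    std::vector<float> y(jds.numRows, 0.0f);
    const unsigned int numIters = static_cast<unsigned int>(jds.iterPtrs.size()) - 1u;
    for (unsigned int r = 0; r < jds.numRows; ++r) {  // one GPU thread per sorted row
        float sum = 0.0f;
        for (unsigned int iter = 0; iter < numIters; ++iter) {
            const unsigned int rowsInIter = jds.iterPtrs[iter + 1] - jds.iterPtrs[iter];
            if (r >= rowsInIter) break;  // this sorted row has no more elements
            const unsigned int i = jds.iterPtrs[iter] + r;
            sum += jds.values[i] * x[jds.colIdxs[i]];
        }
        y[jds.rowPerm[r]] = sum;  // write back to the original row position
    }
    return y;
}
```

For rows holding 2, 3, 2, and 1 nonzeros, this stores exactly 8 values with no padding. The extra cost is the sort and the `rowPerm` indirection when writing the output.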

JDS tradeoffs

#blog #cuda #gpu #pmpp