currybab's blog

PMPP Lecture 05: Memory and Tiling (Summary)

source: Lecture 05 - Memory and Tiling

Today

Performance Metrics

Performance Bounds

Example: Vector Addition

    z[i] = x[i] + y[i];
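As a rough illustration of why vector addition tends to be memory-bound, the sketch below wraps the statement above in a host-side C loop and computes its operations-per-byte ratio. The function names (`vec_add`, `flops_per_byte`, `vecadd_intensity`) are hypothetical, and the count assumes 4-byte floats and considers only the two loads per element, not the store.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side version of the vector addition: one add per element. */
void vec_add(const float *x, const float *y, float *z, size_t n) {
    for (size_t i = 0; i < n; i++) {
        z[i] = x[i] + y[i];
    }
}

/* Hypothetical helper: floating-point operations per byte of memory traffic. */
double flops_per_byte(double flops, double bytes) {
    assert(bytes > 0.0);
    return flops / bytes;
}

/* Per element: 1 add, and 2 loads (x[i] and y[i]). Stores are not counted here. */
double vecadd_intensity(void) {
    const double flops = 1.0;
    const double bytes_loaded = 2.0 * sizeof(float); /* 8 bytes if float is 4 bytes */
    return flops_per_byte(flops, bytes_loaded);
}
```

Under these assumptions the ratio comes out to 1 operation per 8 bytes, so the attainable throughput is limited by memory bandwidth long before the arithmetic units are saturated.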

Example: Matrix-Matrix Multiplication

    float sum = 0.0f;  // accumulator for the output element at (row, col)
    for (unsigned int i = 0; i < N; i++) {
        sum += A[row*N + i] * B[i*N + col];
    }
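For context, here is a minimal host-side sketch of the full computation that this inner loop belongs to. It assumes square, row-major N x N matrices; the function name `matmul_naive` and the outer `row`/`col` loops are assumptions for illustration (in a CUDA kernel, `row` and `col` would typically be derived from the thread and block indices instead).

```c
#include <assert.h>
#include <stddef.h>

/* Naive C = A * B for row-major N x N matrices. */
void matmul_naive(const float *A, const float *B, float *C, unsigned int N) {
    assert(A && B && C);
    for (unsigned int row = 0; row < N; row++) {
        for (unsigned int col = 0; col < N; col++) {
            float sum = 0.0f;
            /* Each iteration: 1 multiply + 1 add, and 2 loads (one from A, one from B). */
            for (unsigned int i = 0; i < N; i++) {
                sum += A[row*N + i] * B[i*N + col];
            }
            C[row*N + col] = sum;
        }
    }
}
```

Counting per inner iteration, that is 2 floating-point operations for 8 bytes loaded (assuming 4-byte floats), or 0.25 operations per byte. That is better than vector addition but still low, which is why reusing loaded values matters for this kernel.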

Reuse in Matrix-Matrix Multiplication

Memory in the GPU Architecture

Memory in the CUDA Programming Model

CUDA Type Qualifiers

Reuse in Matrix-Matrix Multiplication again

Tiled Matrix-Matrix Multiplication

Boundary Conditions

Tiling on the CPU
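A minimal sketch of what loop tiling (blocking) could look like on the CPU, assuming row-major N x N matrices. The function name `matmul_tiled_cpu`, the helper `min_u`, and the `tile` parameter are hypothetical and not taken from the lecture. On the CPU there is no explicit shared memory, so the idea is only to restructure the loops so that each tile stays in cache while it is reused. Clamping the tile bounds to N plays the role of the boundary checks needed when N is not a multiple of the tile size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smaller of two unsigned values. */
unsigned int min_u(unsigned int a, unsigned int b) {
    return a < b ? a : b;
}

/* Tiled C = A * B for row-major N x N matrices; `tile` is the tile width. */
void matmul_tiled_cpu(const float *A, const float *B, float *C,
                      unsigned int N, unsigned int tile) {
    assert(A && B && C);
    assert(tile > 0);

    /* C accumulates partial sums across tiles, so it must start at zero. */
    for (size_t k = 0; k < (size_t)N * N; k++) {
        C[k] = 0.0f;
    }

    for (unsigned int rowTile = 0; rowTile < N; rowTile += tile) {
        for (unsigned int colTile = 0; colTile < N; colTile += tile) {
            for (unsigned int iTile = 0; iTile < N; iTile += tile) {
                /* Clamp tile ends so partial tiles at the edges stay in bounds. */
                unsigned int rowEnd = min_u(rowTile + tile, N);
                unsigned int colEnd = min_u(colTile + tile, N);
                unsigned int iEnd = min_u(iTile + tile, N);

                for (unsigned int row = rowTile; row < rowEnd; row++) {
                    for (unsigned int i = iTile; i < iEnd; i++) {
                        float a = A[row*N + i]; /* reused across the whole col loop */
                        for (unsigned int col = colTile; col < colEnd; col++) {
                            C[row*N + col] += a * B[i*N + col];
                        }
                    }
                }
            }
        }
    }
}
```

A reasonable tile width depends on the cache sizes of the target machine, so `tile` is left as a parameter here rather than fixed. Note that the summation order differs from the naive loop, so floating-point results may differ slightly in the last bits for general inputs.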

#blog #cuda #gpu #pmpp