Fréchet View  1.6.0
A Tool for Exploring Fréchet Distance Algorithms
clm4rm_multiplication.cpp File Reference
#include <clm4rm.h>
#include <stdlib.h>
#include <qdebug.h>


Functions

void print_event_info (cl_event event)
 
void clm4rm_mul_block (clmatrix_t *C, clmatrix_t *A, clmatrix_t *B, int r0, int r1, cl_command_queue queue, clm4rm_conditions *cond)
 
void clm4rm_mul (clmatrix_t *C, clmatrix_t *A, clmatrix_t *B, cl_command_queue queue, clm4rm_conditions *cond)
 Boolean matrix multiplication on the GPU using the method of the Four Russians. C := A * B.
 
void clcubic_mul_enqeue (clmatrix_t *C, clmatrix_t *A, clmatrix_t *B, int tile_n, int tile_m, size_t work_offset[2], size_t work_size[2], int uptri, cl_command_queue queue, clm4rm_conditions *cond)
 
void clcubic_mul (clmatrix_t *C, clmatrix_t *A, clmatrix_t *B, size2_t max_tile, cl_command_queue queue, clm4rm_conditions *cond)
 Boolean matrix multiplication on the GPU using nested loops. C := A*B.
 
void clutri_mul (clmatrix_t *C, clmatrix_t *A, clmatrix_t *B, size2_t max_tile, cl_command_queue queue, clm4rm_conditions *cond)
 Boolean matrix multiplication on the GPU using nested loops. C := A*B. Assumes the matrices are upper triangular.
 
void printb (uint32_t x, int k)
 
void print3 (uint32_t x, int k)
 
void create_index_tables (uint32_t k)
 

Variables

cl_kernel clm4rm_mul_kernel
 OpenCL kernel for Four-Russians matrix multiplication.
 
cl_kernel clcubic_mul_kernel [MAX_TILE_M+1]
 OpenCL kernels for cubic matrix multiplication, one kernel per tile size. The actual tile sizes are injected as macros.
 
cl_kernel clutri_mul_kernel [MAX_TILE_M+1]
 OpenCL kernels for cubic upper-triangle matrix multiplication, one kernel per tile size. The actual tile sizes are injected as macros.
 

Function Documentation

◆ clcubic_mul()

void clcubic_mul ( clmatrix_t * C,
clmatrix_t * A,
clmatrix_t * B,
size2_t  max_tile,
cl_command_queue  queue,
clm4rm_conditions * cond
)

Boolean matrix multiplication on the GPU using nested loops. C := A*B.

Matrix is partitioned into three parts:

       +----------------------------------+----+
       |                                  |    |
       |      |      |      |      |      |    |
       |                                  |    |
       | - - -+ - - -+ - - -+ - - -+ - - -|    |
       |                                  |    |
       |      |      |      |      |      |    |
       |                                  |    |
       | - - -+ - - -+ - - -+ - - -+ - - -|    |
       |                                  |REST|
       |      |      |      |      |      |RIGHT
       |                                  |    |
       | - - -+ - - -+ - - -+ - - -+ - - -|    |
       |                                  |    |
       |      |      |      |      |      |    |
       |                                  |    |
       | - - -+ - - -+ - - -+ - - -+ - - -|    |
       |                                  |    |
       |      |      |      |      |      |    |
       |                                  |    |
       |------+------+------+------+------+----|
       |   REST BOTTOM                         |
       | - - -+ - - -+ - - -+ - - -+ - - -+ - -+

Definition at line 132 of file clm4rm_multiplication.cpp.
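
The three-part partition drawn above can be sketched on the host side as follows. This is only an illustration, not the code in clm4rm_multiplication.cpp: the names `Region`, `Partition` and `partition_matrix` are hypothetical, and sizes are counted in abstract units, whereas the real implementation may count packed machine words instead. The sketch assumes the tiled area holds only complete tiles, the right rest covers the leftover columns beside it, and the bottom rest covers the leftover rows across the full width.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper types; not part of clm4rm.
struct Region { std::size_t row0, col0, rows, cols; };
struct Partition { Region tiled, rest_right, rest_bottom; };

// Split a rows x cols matrix into an area made of complete tiles,
// a strip of leftover columns on the right, and a strip of leftover
// rows at the bottom that spans the full width.
Partition partition_matrix(std::size_t rows, std::size_t cols,
                           std::size_t tile_rows, std::size_t tile_cols)
{
    assert(tile_rows > 0 && tile_cols > 0);
    const std::size_t full_rows = (rows / tile_rows) * tile_rows;
    const std::size_t full_cols = (cols / tile_cols) * tile_cols;

    Partition p;
    p.tiled       = {0, 0, full_rows, full_cols};
    p.rest_right  = {0, full_cols, full_rows, cols - full_cols};
    p.rest_bottom = {full_rows, 0, rows - full_rows, cols};
    return p;
}
```

The three regions are disjoint and together cover every entry exactly once, so each could in principle be handed to a separately sized kernel launch.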

◆ clcubic_mul_enqeue()

void clcubic_mul_enqeue ( clmatrix_t * C,
clmatrix_t * A,
clmatrix_t * B,
int  tile_n,
int  tile_m,
size_t  work_offset[2],
size_t  work_size[2],
int  uptri,
cl_command_queue  queue,
clm4rm_conditions * cond
)

Definition at line 241 of file clm4rm_multiplication.cpp.

◆ clm4rm_mul()

void clm4rm_mul ( clmatrix_t * C,
clmatrix_t * A,
clmatrix_t * B,
cl_command_queue  queue,
clm4rm_conditions * cond
)

Boolean matrix multiplication on the GPU using the method of the Four Russians. C := A * B.

The function returns immediately; the operation is only scheduled for asynchronous execution on the GPU. Use the post-condition events to wait for the operation to complete.

Parameters
C      a matrix structure; receives the result
A      a matrix structure
B      a matrix structure
queue  OpenCL command queue
cond   keeps track of pre-conditions and newly created post-conditions

Definition at line 30 of file clm4rm_multiplication.cpp.
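
For readers unfamiliar with the method of the Four Russians, the following is a minimal CPU sketch of the idea. It is not the OpenCL kernel used by clm4rm_mul. It assumes the Boolean semiring (OR of ANDs), matrices of at most 64 columns with each row packed into one 64-bit word, and a hypothetical function name `four_russians_mul`. For every block of k inner indices it precomputes the OR of each subset of the corresponding k rows of B, then looks up one table entry per row of A.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Row = std::uint64_t;  // one packed row; bit j is column j

// Boolean product C = A * B. A has B.size() columns (at most 64).
// k is the block width; each table has 2^k entries.
std::vector<Row> four_russians_mul(const std::vector<Row>& A,
                                   const std::vector<Row>& B, int k)
{
    assert(k >= 1 && k <= 16);
    const int m = static_cast<int>(B.size());  // inner dimension
    assert(m <= 64);

    std::vector<Row> C(A.size(), 0);
    std::vector<Row> table(std::size_t{1} << k);

    for (int c0 = 0; c0 < m; c0 += k) {
        const int width = (m - c0 < k) ? (m - c0) : k;

        // table[mask] = OR of the rows B[c0 + j] for every bit j set in mask.
        table[0] = 0;
        for (std::uint32_t mask = 1; mask < (1u << width); ++mask) {
            int low = 0;
            while (((mask >> low) & 1u) == 0) ++low;  // lowest set bit
            table[mask] = table[mask & (mask - 1)] | B[c0 + low];
        }

        // One table lookup per row of A replaces up to k row operations.
        const Row select = (Row{1} << width) - 1;
        for (std::size_t i = 0; i < A.size(); ++i) {
            const Row bits = (A[i] >> c0) & select;
            C[i] |= table[bits];
        }
    }
    return C;
}
```

Each table is built with one OR per entry by reusing the entry that has the lowest set bit cleared, which is what makes the precomputation cheaper than combining the rows again for every row of A.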

◆ clm4rm_mul_block()

void clm4rm_mul_block ( clmatrix_t * C,
clmatrix_t * A,
clmatrix_t * B,
int  r0,
int  r1,
cl_command_queue  queue,
clm4rm_conditions * cond
)

Definition at line 44 of file clm4rm_multiplication.cpp.

◆ clutri_mul()

void clutri_mul ( clmatrix_t * C,
clmatrix_t * A,
clmatrix_t * B,
size2_t  max_tile,
cl_command_queue  queue,
clm4rm_conditions * cond
)

Boolean matrix multiplication on the GPU using nested loops. C := A*B. Assumes the matrices are upper triangular.

Matrix is partitioned into three parts:
       +----------------------------------+----+
       |                                  |    |
       |      |      |      |      |      |    |
       |                                  |    |
       | - - -+ - - -+ - - -+ - - -+ - - -|    |
       |                                  |    |
       |      |      |      |      |      |    |
       |                                  |    |
       |      + - - -+ - - -+ - - -+ - - -|    |
       |                                  |REST|
       |             |      |      |      |RIGHT
       |                                  |    |
       |             + - - -+ - - -+ - - -|    |
       |                                  |    |
       |                    |      |      |    |
       |   (empty)                        |    |
       |                    + - - -+ - - -|    |
       |                                  |    |
       |                           |      |    |
       |                                  |    |
       | - - - - - - + - - - - - - + - - -+    |
       |   (empty)                        |    |
       | - - -+ - - -+ - - -+ - - -+ - - -+ - -+

Definition at line 339 of file clm4rm_multiplication.cpp.
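
The saving suggested by the diagram, namely that tiles lying entirely below the diagonal need not be computed, can be illustrated with a small host-side sketch. It assumes a square n x n matrix and square, grid-aligned tiles, which may be simpler than the tiling clutri_mul actually uses; the function name `upper_triangle_tiles` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Return the (row, column) origin of every complete tile that contains
// at least one entry on or above the main diagonal.
std::vector<std::pair<std::size_t, std::size_t>>
upper_triangle_tiles(std::size_t n, std::size_t tile)
{
    assert(tile > 0);
    std::vector<std::pair<std::size_t, std::size_t>> tiles;
    for (std::size_t r0 = 0; r0 + tile <= n; r0 += tile) {
        for (std::size_t c0 = 0; c0 + tile <= n; c0 += tile) {
            // The tile's last column is c0 + tile - 1 and its first row is r0.
            // If that column lies left of that row, every entry is below
            // the diagonal and the tile is skipped.
            if (c0 + tile > r0)
                tiles.emplace_back(r0, c0);
        }
    }
    return tiles;
}
```

With a grid of t x t tiles, this keeps t(t+1)/2 tiles instead of t*t, roughly halving the work for large matrices.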

◆ create_index_tables()

void create_index_tables ( uint32_t  k)

Definition at line 453 of file clm4rm_multiplication.cpp.

◆ print3()

void print3 ( uint32_t  x,
int  k 
)

Definition at line 439 of file clm4rm_multiplication.cpp.

◆ print_event_info()

void print_event_info ( cl_event  event)

Definition at line 228 of file clm4rm_multiplication.cpp.

◆ printb()

void printb ( uint32_t  x,
int  k 
)

Definition at line 433 of file clm4rm_multiplication.cpp.

Variable Documentation

◆ clcubic_mul_kernel

cl_kernel clcubic_mul_kernel[MAX_TILE_M+1]

OpenCL kernels for cubic matrix multiplication, one kernel per tile size. The actual tile sizes are injected as macros.

Definition at line 71 of file clm4rm.cpp.
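
One common way to inject tile sizes as macros is to pass `-D` definitions in the options string given to `clBuildProgram` when the program for a given tile size is compiled. The sketch below only builds such a string. The macro names `TILE_N` and `TILE_M` and the function `tile_build_options` are assumptions made for illustration; the names actually used by clm4rm are not stated here.

```cpp
#include <cassert>
#include <string>

// Build an OpenCL compiler options string that defines the tile sizes.
// TILE_N and TILE_M are hypothetical macro names.
std::string tile_build_options(int tile_n, int tile_m)
{
    assert(tile_n > 0 && tile_m > 0);
    return "-D TILE_N=" + std::to_string(tile_n) +
           " -D TILE_M=" + std::to_string(tile_m);
}
```

Compiling the same kernel source once per tile size lets the OpenCL compiler treat the tile dimensions as constants, for example to unroll loops, at the cost of keeping one compiled kernel per size.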

◆ clm4rm_mul_kernel

cl_kernel clm4rm_mul_kernel

OpenCL kernel for Four-Russians matrix multiplication.

Definition at line 68 of file clm4rm.cpp.

◆ clutri_mul_kernel

cl_kernel clutri_mul_kernel[MAX_TILE_M+1]

OpenCL kernels for cubic upper-triangle matrix multiplication, one kernel per tile size. The actual tile sizes are injected as macros.

Definition at line 72 of file clm4rm.cpp.