Automatic vectorization through superword level parallelism with associative chain re-ordering and loop shifting
ROGERS, STEPHEN
Single instruction, multiple data (SIMD) is a class of parallel computing that involves executing a single operation across multiple pieces of data. A common type of SIMD is vector processing, which involves executing a single instruction across one-dimensional arrays of data called vectors. A category of compiler optimization called automatic vectorization has been developed since the introduction of vector processing to allow 'vectorizing compilers' to target such processor capabilities without direct intervention from application programmers. Convolution is a fundamental concept in image processing. It applies a matrix called a kernel to compute, for every pixel in an image, a weighted sum of that pixel and its adjacent pixels. This process is used to perform tasks such as image blurring, edge detection and noise reduction. In this thesis, we explore the challenges of automatically vectorizing image convolutions implemented in C and C++. We describe the fundamentals of vectorization and image convolutions and propose an approach for the effective vectorization of these convolutions. Our approach combines vectorization through Superword Level Parallelism (SLP) with tentative loop unrolling, loop shifting, and the reordering of associative and commutative chains of instructions. Most modern optimizing compilers are capable of vectorizing 3x3 image convolutions, but tend to fail at vectorizing larger convolutions, such as 5x5. The vectorizer we describe in this thesis, with the aid of its combined optimizations, is designed to vectorize such larger convolutions. With this combination of optimizations, we measured performance improvements for 5x5, 7x7, and 9x9 image convolutions: between 2.01x and 6.97x for convolutions operating on integer data types, and between 2.19x and 5.34x for floating-point types.
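For readers unfamiliar with the operation, the following is a minimal scalar sketch in C of the kind of 5x5 convolution loop nest the abstract refers to. It is an illustration only, not code from the thesis: the function name `convolve5x5`, the constant `K`, the row-major `float` image layout, and the choice to skip border pixels are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { K = 5 }; /* kernel width and height (assumed 5x5 for illustration) */

/* Hypothetical scalar 5x5 convolution over a row-major float image.
 * `kernel` holds K*K weights in row-major order.
 * Border pixels (closer than K/2 to an edge) are left untouched. */
void convolve5x5(const float *in, float *out, size_t width, size_t height,
                 const float *kernel)
{
    assert(width >= K && height >= K);
    const size_t r = K / 2; /* kernel radius */

    for (size_t y = r; y < height - r; ++y) {
        for (size_t x = r; x < width - r; ++x) {
            float sum = 0.0f;
            /* Weighted sum of the pixel and its neighbours. */
            for (size_t ky = 0; ky < K; ++ky) {
                for (size_t kx = 0; kx < K; ++kx) {
                    sum += kernel[ky * K + kx] *
                           in[(y + ky - r) * width + (x + kx - r)];
                }
            }
            out[y * width + x] = sum;
        }
    }
}
```

Each output pixel here is produced by a chain of 25 multiply-accumulate steps, and adjacent output pixels read overlapping input windows. A loop nest of this shape is the sort of input that a vectorizing compiler would need to transform in order to use SIMD instructions.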
Keyword(s): automatic vectorization; SLP; software pipelining; compiler optimisation
Publication Date: 2018
Type: Master thesis (research)
Peer-Reviewed: Yes
Language(s): English
Institution: Trinity College Dublin
Citation(s): ROGERS, STEPHEN, Automatic vectorization through superword level parallelism with associative chain re-ordering and loop shifting, Trinity College Dublin. School of Computer Science & Statistics, 2018
Publisher(s): Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science
Supervisor(s): Gregg, David
First Indexed: 2018-12-16 06:41:28
Last Updated: 2018-12-16 06:41:28