Hardware aware tiling optimization for multi-core systems

Dominik Adamski, Grzegorz Jabłoński


This paper presents a proposition of the new tool which improves tiling efficiency
for given hardware architecture. This article also describes the correlation
between changing hardware architecture and methods of software optimization.
First chapter includes short description of the change in hardware architecture
which has occurred in recent 10 years. The second chapter provides an overview
of tools which will be used in further research. The consecutive sections contain
description of proposed hardware-aware tool for optimal tiling.


LLVM; Tiling; Data Locality; Polyhedral Model

Full Text:



Intel Xeon PhiTM System Software Developers Guide. Document available on webpage http://www.intel.com/ content/dam/www/public/us/en/documents/product-briefs/ xeon-phi-coprocessor-system-software-developers-guide.pdf.

Deest, G., Estibals, N., Yuki, T., Derrien, S., & Rajopadhye, S.: Towards Scalable and Efficient FPGA Stencil Accelerators. Article presented during 6th International Workshop on Polyhedral Compilation Techniques 2016 http://impact.gforge.inria.fr/impact2016/papers/ impact2016-deest.pdf.

J., Danalis et. al.: PAPI Library. Project description available on webpage http://icl. cs.utk.edu/papi/index.html.

Bastoul C.: Code Generation in the Polyhedral Model Is Easier Than You Think. In: Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, PACT ’04, pp. 7–16. IEEE Computer Society, Washington, DC, USA, 2004. ISBN 0-7695-2229-7. URL http: //dx.doi.org/10.1109/PACT.2004.11.

Carvalho C.: The Gap between Processor and Memory Speed. In: Proceedings of the Internal Conference on Computer Architecture, pp. 27–34. 2002.

C.Lattner: LLVM. Figure available on webpage http://www.aosabook.org/en/ llvm.html.

Frigo M., Leiserson C.E., Prokop H., Ramachandran S.: Cache-Oblivious Algorithms. In: Proceedings of the 40th Annual Symposium on Foundations of Computer Science, FOCS ’99, pp. 285–. IEEE Computer Society, Washington, DC, USA, 1999. ISBN 0-7695-0409-4. URL http://dl.acm.org/citation.cfm?id= 795665.796479.

Grosser T., Cohen A., Holewinski J., Sadayappan P., Verdoolaege S.: Hybrid Hexagonal/Classical Tiling for GPUs. In: Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO ’14, pp. 66:66–66:75. ACM, New York, NY, USA, 2014. ISBN 978-1-4503-2670-4. URL http://dx.doi.org/10.1145/2544137.2544160.

Grosser T., Größlinger A., Lengauer C.: Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation. In: Parallel Processing Letters, vol. 22(4), 2012. URL http://dx.doi.org/10.1142/S0129626412500107.

Grosser T., Zheng H., A R., Simburger A., Grosslinger A., Pouchet L.N.: Polly Polyhedral optimization in LLVM. In: First International Workshop on Polyhedral Compilation Techniques (IMPACT’11). Chamonix, France, 2011.

Hennessy J.L., Patterson D.A.: Computer Architecture, Fifth Edition: A Quantative Approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 5th ed., 2011. ISBN 012383872X, 9780123838728.

Kong M., Pop A., Pouchet L.N., Govindarajan R., Cohen A., Sadayappan P.: Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs. In: ACM Trans. Archit. Code Optim., vol. 11(4), pp. 61:1–61:30, 2015. ISSN 1544-3566. URL http://dx.doi.org/10.1145/2687652.

Lattner C., Adve V.: LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In: Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, CGO ’04, pp. 75–. IEEE Computer Society, Washington, DC, USA, 2004. ISBN 0-7695-2102-9. URL http://dl.acm.org/citation.cfm?id=977395.977673.

Malik A.M.: Optimal Tile Size Selection Problem Using Machine Learning. In: Proceedings of the 2012 11th International Conference on Machine Learning and Applications - Volume 02, ICMLA ’12, pp. 275–280. IEEE Computer Society, Washington, DC, USA, 2012. ISBN 978-0-7695-4913-2. URL http://dx.doi.org/10.1109/ICMLA.2012.214.

Puchet L.N.: Polybench. Benchmark available on webpage http://web.cse.ohio-state.edu/~pouchet/software/polybench/.

Rahman M., Pouchet L.N., Sadayappan P.: Neural Network Assisted Tile Size Selection. In: International Workshop on Automatic Performance Tuning (IWAPT’2010). Springer Verlag, Berkeley, CA, 2010.

Rupp K., M.Horovitz, Labonte F., Shacham O., K.Olukotun, L.Hammond, C.Batten: 40 Years of Microprocessor Trend Data. Figure available on webpage http://www.karlrupp.net/wp-content/uploads/2015.06/40-years-processor-trend.png.

Shirako J., Sharma K., Fauzia N., Pouchet L.N., Ramanujam J., Sadayappan P., Sarkar V.: Analytical Bounds for Optimal Tile Size Selection. In: Proceedings of the 21st International Conference on Compiler Construction, CC’12, pp. 101–121. Springer-Verlag, Berlin, Heidelberg, 2012. ISBN 978-3-642-28651-3. URL http://dx.doi.org/10.1007/978-3-642-28652-0_6.

T.Grosser: Speedup of Polly tiling optimization in comparison to gcc -O3. Figure available on webpage http://polly.llvm.org/performance.html.

Verdoolaege S.: isl: An Integer Set Library for the Polyhedral Model. In: K. Fukuda, J. Hoeven, M. Joswig, N. Takayama, eds., Mathematical Software (ICMS’10), LNCS 6327, pp. 299–302. Springer-Verlag, 2010.

DOI: https://doi.org/10.7494/csci.2017.18.2.145


  • There are currently no refbacks.