Exploiting Loop-Carried Stream Reuse for Scientific Computing Applications on the Stream Processor
Weixia XU, Qiang DOU, Ying ZHANG, Gen LI, Xuejun YANG
DOI: 10.4236/ijcns.2010.31003


Compared with other stream applications, scientific stream programs are usually bound by memory accesses. Reusing streams across loop iterations, i.e. loop-carried stream reuse, improves locality in the stream register file (SRF) and thus greatly reduces memory accesses. In this paper, after analyzing scientific computing applications, we first present an algorithm that identifies loop-carried stream reuse and an algorithm that exploits it. We then run several representative microbenchmarks and scientific stream programs, with and without our optimization, on Isim, a cycle-accurate stream processor simulator. Experimental results show that our algorithms effectively exploit loop-carried stream reuse and thus greatly improve the performance of memory-bound scientific stream programs.
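To make the idea of loop-carried stream reuse concrete, here is a minimal Python sketch. It is not the identification or exploitation algorithm presented in the paper; it is a toy model under stated assumptions. Each loop iteration is assumed to consume a sliding window of three consecutive streams, a bounded `deque` stands in for the SRF, and the names `kernel`, `run_without_reuse`, and `run_with_reuse` are hypothetical. The sketch only counts how many streams are fetched from memory in each variant.

```python
from collections import deque


def kernel(window):
    """Toy kernel (an assumption): element-wise sum of the input streams."""
    return [sum(values) for values in zip(*window)]


def run_without_reuse(memory, width=3):
    """Reload every input stream from memory on each iteration."""
    iterations = len(memory) - width + 1
    if iterations <= 0:
        return [], 0
    loads = 0
    results = []
    for i in range(iterations):
        window = []
        for j in range(width):
            window.append(memory[i + j])  # one memory load per stream
            loads += 1
        results.append(kernel(window))
    return results, loads


def run_with_reuse(memory, width=3):
    """Keep streams shared with the next iteration in a small buffer.

    The deque models the SRF: streams loaded in earlier iterations stay
    resident, so each iteration fetches only the one new stream.
    """
    iterations = len(memory) - width + 1
    if iterations <= 0:
        return [], 0
    srf = deque(maxlen=width)
    loads = 0
    # Prime the buffer with the streams the first iteration shares
    # with later ones.
    for j in range(width - 1):
        srf.append(memory[j])
        loads += 1
    results = []
    for i in range(iterations):
        srf.append(memory[i + width - 1])  # only the new stream is loaded
        loads += 1
        results.append(kernel(list(srf)))
    return results, loads


if __name__ == "__main__":
    # Eight streams of four elements each (made-up data).
    streams = [[float(r * 10 + c) for c in range(4)] for r in range(8)]
    baseline, baseline_loads = run_without_reuse(streams)
    reused, reused_loads = run_with_reuse(streams)
    print(baseline == reused, baseline_loads, reused_loads)
```

With eight streams and a window of three, the baseline performs 18 stream loads while the reuse variant performs 8, and both produce identical results. The saving in this toy model comes entirely from keeping streams resident between iterations, which is the effect the abstract attributes to improved SRF locality.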

Share and Cite:

W. XU, Q. DOU, Y. ZHANG, G. LI and X. YANG, "Exploiting Loop-Carried Stream Reuse for Scientific Computing Applications on the Stream Processor," International Journal of Communications, Network and System Sciences, Vol. 3 No. 1, 2010, pp. 32-37. doi: 10.4236/ijcns.2010.31003.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.