ABSTRACT
We propose a new machine model in which load operations can be performed in parallel with arithmetic operations by two separate functional units. For this model, the evaluation of expression trees is considered. An efficient algorithm to produce an optimal order of evaluation is described and analyzed. For a tree with n vertices the algorithm runs in time Ο(n log2n). If the arithmetic operations have at most two arguments, the complexity goes down to Ο(n logn).
- [1] Aho, A. V., Hopcroft, J. E., and Ullman, J. D., Data structures and algorithms, Addison-Wesley, Reading, MA, 1983. Google ScholarDigital Library
- [2] Aho, A. V., and Jonhson, S. C., "Optimal code generation for expression trees", JACM 23, 3 (Jul. 1976), 488-501. Google ScholarDigital Library
- [3] Even, S., Graph algorithms, Computer Science Press, Rockville, MD, 1979. Google ScholarDigital Library
- [4] Knuth, D. E., The art of computer programming: Sorting and Searching (Vol. 31), Addison-Wesley, Reading, MA, 1973.Google Scholar
- [5] Li, H. F., "Scheduling trees in parallel /pipelined processing environments", IEEE transactions on computers, C-26, 11 (Nov. 1977), 1101-1112.Google ScholarDigital Library
- [6] Lloyd, E. L., Scheduling task systems with resources, Ph. D. dissertation, MIT-LCS-TR- 236, May 1980. Google ScholarDigital Library
- [7] Sethi, R., and Ullman, J. D., "The generation of optimal code for arithmetic expressions", JACM 17, 4 (Oct. 1970), 715-728. Google ScholarDigital Library
- [8] Smith, J. E., "Decoupled access/execute computer architectures", Proceedings of the 9th Symposium on Computer Architecture , (April 1982), 112-119. Google ScholarDigital Library
Index Terms
- Optimal scheduling of arithmetic operations in parallel with memory access (preliminary version)
Recommendations
Parallel Heap Operations on an EREW PRAM
We consider parallel heap operations on the exclusive-read exclusive-write parallel random-access machine. We first present an O(n/p + log p) time parallel algorithm to construct a heap of n elements using p processors, which is optimal for p (n/log n). ...
Nearly Optimal Parallel Algorithms for Longest Increasing Subsequence
SPAA '23: Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and ArchitecturesThe paper presents parallel algorithms for multiplying implicit simple unit-Monge matrices (Krusche and Tiskin, PPAM 2009) of size n x n in the EREW PRAM model. We show implicit simple unit-Monge matrices multiplication of size n x n can be achieved by a ...
Optimal parallel selection
We present an optimal parallel selection algorithm on the EREW PRAM. This algorithm runs in O(log n) time with n/log n processors. This complexity matches the known lower bound for parallel selection on the EREW PRAM model. We therefore close this ...
Comments