ABSTRACT
We propose a new machine model in which load operations can be performed in parallel with arithmetic operations by two separate functional units. For this model, the evaluation of expression trees is considered. An efficient algorithm to produce an optimal order of evaluation is described and analyzed. For a tree with n vertices the algorithm runs in time Ο(n log2n). If the arithmetic operations have at most two arguments, the complexity goes down to Ο(n logn).
- [1] Aho, A. V., Hopcroft, J. E., and Ullman, J. D., Data structures and algorithms, Addison-Wesley, Reading, MA, 1983. Google ScholarDigital Library
- [2] Aho, A. V., and Jonhson, S. C., "Optimal code generation for expression trees", JACM 23, 3 (Jul. 1976), 488-501. Google ScholarDigital Library
- [3] Even, S., Graph algorithms, Computer Science Press, Rockville, MD, 1979. Google ScholarDigital Library
- [4] Knuth, D. E., The art of computer programming: Sorting and Searching (Vol. 31), Addison-Wesley, Reading, MA, 1973.Google Scholar
- [5] Li, H. F., "Scheduling trees in parallel /pipelined processing environments", IEEE transactions on computers, C-26, 11 (Nov. 1977), 1101-1112.Google ScholarDigital Library
- [6] Lloyd, E. L., Scheduling task systems with resources, Ph. D. dissertation, MIT-LCS-TR- 236, May 1980. Google ScholarDigital Library
- [7] Sethi, R., and Ullman, J. D., "The generation of optimal code for arithmetic expressions", JACM 17, 4 (Oct. 1970), 715-728. Google ScholarDigital Library
- [8] Smith, J. E., "Decoupled access/execute computer architectures", Proceedings of the 9th Symposium on Computer Architecture , (April 1982), 112-119. Google ScholarDigital Library
Index Terms
- Optimal scheduling of arithmetic operations in parallel with memory access (preliminary version)
Recommendations
Parallel Heap Operations on an EREW PRAM
We consider parallel heap operations on the exclusive-read exclusive-write parallel random-access machine. We first present an O(n/p + log p) time parallel algorithm to construct a heap of n elements using p processors, which is optimal for p (n/log n). ...
Nearly Optimal Parallel Algorithms for Longest Increasing Subsequence
SPAA '23: Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and ArchitecturesThe paper presents parallel algorithms for multiplying implicit simple unit-Monge matrices (Krusche and Tiskin, PPAM 2009) of size n x n in the EREW PRAM model. We show implicit simple unit-Monge matrices multiplication of size n x n can be achieved by a ...
Feasible Time-Optimal Algorithms for Boolean Functions on Exclusive-Write Parallel Random-Access machines
It was shown some years ago that the computation time for many important Boolean functions of $n$ arguments on concurrent-read exclusive-write parallel random-access machines (CREW PRAMs) of unlimited size is at least $arphi (n) \approx 0.72\log_...
Comments