High performance computing for vision on distributed-memory machines

October 1996

Author:
Cho-Li Wang
Univ. of Southern California

Publisher:
University of Southern California
Computer Science Dept., 200 University Park, Los Angeles, CA
United States

Order Number: UMI Order No. GAX96-25040

Abstract

Computer vision has been identified as a Grand Challenge application by the High Performance Computing and Communication initiative. With advances in microprocessor and network technology, current massively parallel machines can achieve hundreds of Gigaflops. These parallel machines have a distributed-memory architecture, so they can scale to large system sizes. Examples include the TMC CM-5, IBM SP-2, Intel Paragon, Meiko CS-2, and Cray T3D, among others. These high-performance computing platforms seem to have opened new avenues to meet the computational challenge of vision.
Even though many "Gigaflops" machines have become available, straightforward approaches to parallelizing vision applications on these architectures do not yield satisfactory performance. In a distributed-memory architecture, communication operations incur considerable overhead. Because communication in intermediate- and high-level vision algorithms is irregular, this overhead can grow with the size of the parallel system, leading to poor performance. As a consequence, the algorithms do not scale to large system sizes. It is therefore necessary to develop efficient algorithmic techniques for various vision processes to achieve larger speed-ups.
The focus of our work is to develop scalable and portable parallel algorithms for computer vision tasks on distributed-memory machines. We propose a computational model for distributed-memory machines that accounts for the cost of data communication through the communication startup cost and the data transmission rate. To illustrate our algorithms and implementations, we parallelize vision tasks in a building detection system and in an object recognition system. Based on the model, we show scalable algorithms for several key steps in the building detection system, including a linear feature extraction task and a perceptual grouping task, as well as a high-level task in an object recognition system.
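The abstract names only the two ingredients of the communication model (a startup cost and a data transmission rate), not its exact form. A minimal sketch of one common way to combine them, assuming a linear cost per message, is given below in C. The type `comm_model`, the functions `message_time` and `transfer_time`, and all field names are hypothetical and are not taken from the thesis.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-parameter communication model (names are assumptions). */
typedef struct {
    double startup_s;   /* fixed cost to initiate one message, in seconds */
    double per_byte_s;  /* inverse of the transmission rate, seconds per byte */
} comm_model;

/* Estimated time to send a single message of `bytes` bytes. */
double message_time(const comm_model *m, size_t bytes) {
    return m->startup_s + m->per_byte_s * (double)bytes;
}

/* Estimated time to move `total_bytes` split into `num_messages` messages.
 * Every message pays the startup cost once; the payload cost is unchanged. */
double transfer_time(const comm_model *m, size_t total_bytes,
                     size_t num_messages) {
    assert(num_messages > 0);
    return (double)num_messages * m->startup_s
         + m->per_byte_s * (double)total_bytes;
}
```

Under this assumed form, sending the same payload as many small messages costs more than sending it as one aggregated message, because the startup term is paid once per message. This is one way to read the abstract's remark that irregular communication overhead can grow with system size.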
For portability, our codes are written in C with the MPI message-passing standard, and they run on several high-performance platforms. Currently, they have been ported to the CM-5, SP-2, and T3D. These implementations achieve fast execution of the vision tasks. For example, given a 2048 x 2048 image, the extraction of linear features on a 512-node CM-5 can be completed in 1.118 seconds. The same task takes more than 8 minutes on a state-of-the-art Sun Sparcstation.
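As a rough reading of the timing figures above, taking "more than 8 minutes" as a lower bound of 480 seconds, the ratio between the two reported times works out as follows. This is a comparison across two different machines, not a measure of parallel efficiency on the CM-5.

```latex
\[
\frac{T_{\text{Sparcstation}}}{T_{\text{CM-5}}}
  > \frac{8 \times 60\ \text{s}}{1.118\ \text{s}}
  = \frac{480}{1.118}
  \approx 429
\]
```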

Contributors

Cho-Li Wang
The University of Hong Kong

Index Terms

High performance computing for vision on distributed-memory machines
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
