Studies in Prolog architectures
Publisher:
  • Stanford University
  • 408 Panama Mall, Suite 217
  • Stanford
  • CA
  • United States
Order Number: UMI Order No. GAX87-23099
Abstract

This dissertation addresses the problem of how logic programs can be made to execute at high speeds. Prolog, chosen as a representative logic programming language, differs from procedural languages in that it is applicative and nondeterminate and uses unification as its primary operation. Program performance is directly related to memory performance: high-speed processors are ultimately limited by memory bandwidth, so architectures that require less bandwidth have greater potential for high performance. This dissertation reports the dynamic data- and instruction-referencing characteristics of both sequential and parallel Prolog architectures and the corresponding uniprocessor and multiprocessor memory-hierarchy performance tradeoffs.

Initially, a family of canonical architectures, corresponding closely to Prolog, is defined from the principles of ideal machine architectures of Flynn, and is then refined into the realizable Warren Abstract Machine (WAM) architecture. The memory-referencing behavior of these architectures is examined by tracing memory references during emulation of a set of Prolog benchmarks. Measurements of the canonical architectures indicate the upper memory-performance bounds of sequential execution. Measurements of the WAM provide frequencies of memory references and indicate that the WAM approaches the performance of the canonical Prolog architectures on current hosts.

Two-level memory hierarchies for both sequential (WAM) and parallel (PWAM) Prolog architectures are modeled. PWAM is the Restricted-AND Parallel architecture of Hermenegildo. Local memory designs are simulated using memory traces, whereas main memory designs are analyzed with queueing models. The results show that small buffers (256 words or fewer) can significantly reduce Prolog's memory bandwidth requirement, primarily by capturing shallow backtracking information. Larger, more general local memories, such as caches, are necessary in high-performance systems to further reduce memory traffic. Local memory consistency protocols for a shared-memory PWAM multiprocessor are analyzed. Measurements indicate that the memory-referencing overheads of exploiting Restricted-AND Parallelism are minor. The results also show, however, that as few as eight high-performance processing elements can saturate a shared bus. With emerging bus technology and properly interleaved shared memory, limited-size multiprocessors of this type have great potential for cost-effective speedups. This dissertation provides previously unavailable information concerning the memory-referencing characteristics of logic programming languages executing on hierarchical memory organizations, thus contributing to processor memory design.
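
As a rough illustration of what trace-driven simulation of a local memory involves, the following Python sketch replays a list of word addresses through a small buffer and reports the fraction of references that still go to main memory. It is not the dissertation's simulator: the function name, the fully associative LRU replacement policy, the word-granularity entries, and the sample trace are all assumptions made for this example, and reads and writes are not distinguished.

```python
from collections import OrderedDict


def simulate_lru_buffer(trace, capacity_words):
    """Replay a memory trace through a hypothetical LRU buffer.

    trace          -- iterable of word addresses, in reference order
    capacity_words -- number of words the buffer can hold (assumed model)

    Returns the miss ratio: references that reach main memory divided by
    total references. An empty trace yields 0.0.
    """
    buffer = OrderedDict()  # keys are resident addresses, oldest first
    misses = 0
    total = 0
    for address in trace:
        total += 1
        if address in buffer:
            # Hit: mark this word as most recently used.
            buffer.move_to_end(address)
        else:
            # Miss: fetch the word and evict the least recently used one
            # if the buffer is now over capacity.
            misses += 1
            buffer[address] = True
            if len(buffer) > capacity_words:
                buffer.popitem(last=False)
    return misses / total if total else 0.0


if __name__ == "__main__":
    # Made-up trace: a four-word working set revisited several times,
    # followed by two references elsewhere.
    sample_trace = [0, 1, 2, 3] * 4 + [100, 101]
    for size in (2, 4):
        print(size, simulate_lru_buffer(sample_trace, size))
```

With this made-up trace, a four-word buffer holds the whole working set and only the first touch of each address misses, while a two-word buffer thrashes and every reference misses. A real study would replace the sample trace with references recorded during emulation of benchmark programs and would model the specific buffer or cache organization under evaluation.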

