Abstract
Probability theory is mathematically the best understood paradigm for modeling and manipulating uncertain information. Probabilities of complex events can be computed from those of basic events on which they depend, using any of a number of strategies. Which strategy is appropriate depends very much on the known interdependencies among the events involved. Previous work on probabilistic databases has assumed a fixed and restrictivecombination strategy (e.g., assuming all events are pairwise independent). In this article, we characterize, using postulates, whole classes of strategies for conjunction, disjunction, and negation, meaningful from the viewpoint of probability theory. (1) We propose a probabilistic relational data model and a genericprobabilistic relational algebra that neatly captures various strategiessatisfying the postulates, within a single unified framework. (2) We show that as long as the chosen strategies can be computed in polynomial time, queries in the positive fragment of the probabilistic relational algebra have essentially the same data complexity as classical relational algebra. (3) We establish various containments and equivalences between algebraic expressions, similar in spirit to those in classical algebra. (4) We develop algorithms for maintaining materialized probabilistic views. (5) Based on these ideas, we have developed a prototype probabilistic database system called ProbView on top of Dbase V.0. We validate our complexity results with experiments and show that rewriting certain types of queries to other equivalent forms often yields substantial savings.
- BARBARA, D., GARCIA-MOLINA, H., AND PORTER, D. 1992. The management of probabilistic data. IEEE Trans. Knowl. Data Eng. 4, 487-502. Google Scholar
- BELL, C., NERODE, A., NG, R., AND SUBRAHMANIAN, V.S. 1994. Mixed integer programming methods for computing non-monotonic deductive databases. J. ACM 41, 6 (Nov.), 1178- 1215. Google Scholar
- BLAKELEY, J., COBURN, N., AND LARSON, P.-A. 1989. Updating derived relations: Detecting irrelevant and autonomously computable updates. ACM Trans. Database Syst. 14, 3, 369-400. Google Scholar
- BOOLE, G. 1854. The Laws of Thought. Macmillan, London.Google Scholar
- CAVALLO, R. AND PITTARELLI, M. 1987. The theory of probabilistic databases. In Proceedings of the Conference on Very Large Data Bases (Brighton, England, Sept. 1-4), 71-81. Google Scholar
- DAYAL, U. 1989. Queries and views in a object-oriented databases. In International Workshop on Database Programming Languages (Glenden Beach, OR, June 4-8), 80-102. Google Scholar
- DAYAL, V. AND HWANG, H. 1984. View definition and generalization for database integration in a multi-database system. IEEE Trans. Softw. Eng. SE-IO, 6, 628-644.Google Scholar
- DUBOIS, D. AND PRADE, H. 1988. Default reasoning and possibility theory. Artif. Intell. 35, 243-257. Google Scholar
- DUMAIS, S. 1993. LSI meets TREC: A status report. In Proceedings First Text Retrieval Conference (Gaithersburg, MD), NIST Special Publication 500-207, 137-152.Google Scholar
- DYRESON, C. E. AND SNODGRASS, R.T. 1993. Valid-time indeterminacy. In Proceedings of the International Conference on Data Engineering (Vienna, April), 335-343. Google Scholar
- FENSTAD, J. E. 1980. The structure of probabilities defined on first-order languages. In Studies in Inductive Logic and Probabilities, Vol. 2, R. C. Jeffrey, Ed. University of California Press, Berkeley, CA, 251-262.Google Scholar
- GADIA, S. 1988. A homogeneous relational model and query languages for temporal databases. ACM Trans. Database Syst. 13, 4, 418-448. Google Scholar
- GAREY, M. R. AND JOHNSON, D.S. 1979. A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, New York. Google Scholar
- GUNTZER, V., KIESLING, W., AND THONE, H. 1991. New directions for uncertainty reasoning in deductive databases. In Proceedings 1991 ACM SIGMOD (Denver, CO, May), 178-187. Google Scholar
- GUPTA, A., MUMICK, I. S., AND SUBRAHMANIAN, V.S. 1993. Maintaining views incrementally. In Proceedings 1993 ACM SIGMOD Conference on Management of Data (Washington, DC), 157-166. Google Scholar
- HILLIER, F. AND LIEBERMAN, G. 1974. Operations Research. Holden-Day, San Francisco, CA.Google Scholar
- IMIELINSKI, T. AND LIPSKI, W. 1984. Incomplete information in relational databases. J. ACM 31, 4 (Oct.). Google Scholar
- IYENGAR, S. S., PRASAD, L., AND MIN, H. 1995. Advances in Distributed Sensor Technology. Prentice-Hall, Englewood Cliffs, NJ. Google Scholar
- HAN, J., CAI, Y., AND CERCONE, N. 1992. Knowledge discovery in databases: An attributeoriented approach. In Proceedings of the 18th Conference on Very Large Data Bases (Vancouver, Canada, Aug. 23-27), 547-559. Google Scholar
- HARRISON, J. V. AND DIETRICH, S. 1992. Maintenance of materialized views in a deductive database: An update propagation approach. In Workshop on Deductive Databases, JICSLP (Washington, D.C., Nov. 1992).Google Scholar
- KEMPER, A., KILGER, C., AND MOERKOTTE, G. 1994. Function materialization in object bases: Design, realization, and evaluation. IEEE Trans. Knowl. Data Eng. 6, 4 (Aug.). Google Scholar
- KIESSLING, W., THONE, H., AND GUNTZER, V. 1992. Database support for problematic knowledge. In Proceedings EDBT-92 (Vienna), Springer LNCS Vol. 580, 421-436. Google Scholar
- KIFER, M. AND LI, A. 1988. On the semantics of rule-based expert systems with uncertainty. In Second International Conference on Database Theory, M. Gyssens, J. Paredaens, D. Van Gucht, Eds., Springer Verlag, (LNCS 326), 102-117. Google Scholar
- KOLMOGOROV, A.N. 1956. Foundations of the Theory of Probability. Chelsea Publishing Co., New York.Google Scholar
- LAKSHMANAN, L. V. S. 1994. An epistemic foundation for logic programming with uncertainty. In Proceedings of the 14th Conference on the Foundations of Software Technology and Theoretical Computer Science (Madras, India, Dec. 1994). Google Scholar
- LAKSHMANAN, L. V. S. AND SADRI, F. 1994a. Probabilistic deductive databases. In Proceedings of the International Logic Programming Symposium, (Ithaca, NY, Nov.), MIT Press, Cambridge, MA, 254-268. Google Scholar
- LAKSHMANAN, L. V. S. AND SADRI, F. 1994b. Modeling uncertainty in deductive databases. In Proceedings International Conference on Database Expert Systems and Applications (DEXA '94) (Athens, Greece, Sept.), LNCS 856, Springer-Verlag. Google Scholar
- LAKSHMANAN, L. V. S., LEONE, N., ROSS, R., AND SUBRAHMANIAN, V. S. 1996. ProbView: A flexible probabilistic database system. Tech. Rep. Concordia University and University of Maryland, Dec. (Available by http : //www. cs . umd. edu/users/vs/papers/probpapers. html. ).Google Scholar
- Lu, J., MOERKOTTE, G., SCHUE, J., AND SUBRAHMANIAN, V.S. 1995. Efficient maintenance of materialized mediated views. In Proceedings 1995 ACM SIGMOD Conference on Management of Data (San Jose, CA, May). Google Scholar
- NG, R. AND SUBRAHMANIAN, V.S. 1993. Probabilistic logic programming. Inf. Cornput. 101, 2, 150-201. Google Scholar
- NG, R. AND SUBRAHMANIAN, V.S. 1995. Stable semantics for probabilistic deductive databases. Inf. Cornput. 110, 1, 42-83. Google Scholar
- RAJU, K. S. V. S. V. N. AND MAJUMDAR, A. 1988. Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems. ACM Trans. Database Syst. 13, 2 (June). Google Scholar
- SCHMIDT, H., KIESSLING, W., GUNTZER, U., AND BAYER, R. 1987. Combining deduction by uncertainty with the power of magic. In Proceedings DOOD-89 (Kyoto, Japan), 205-224.Google Scholar
- SILBERSCHATZ, n., STONEBRAKER, M., AND ULLMAN, J.D. 1991. Database systems: Achievements and opportunities. Cornmun. ACM 34, 10, 110-119. Google Scholar
- THONE, H., KIESSLING, W., AND GUNTZER, U. 1995. On cautious probabilistic inference and default detachment. Ann. Oper. Res. 55, 195-224.Google Scholar
- VARDI, M.Y. 1985. Querying logical databases. In Proceedings of the Fourth ACM SIGACT- SIGMOD Symposium on Principles of Database Systems, 57-65. Google Scholar
Index Terms
- ProbView: a flexible probabilistic database system
Recommendations
A data model and algebra for probabilistic complex values
We present a probabilistic data model for complex values. More precisely, we introduce probabilistic complex value relations, which combine the concept of probabilistic relations with the idea of complex values in a uniform framework. We elaborate a ...
A relational database model and algebra integrating fuzzy attributes and probabilistic tuples
AbstractAlthough there have been many fuzzy or probabilistic relational database models proposed for representing and handling imprecise and uncertain information of objects in real-world applications, models combining the relevance and ...
An Algebra for Probabilistic Databases
An algebra is presented for a simple probabilistic data model that may be regarded as an extension of the standard relational model. The probabilistic algebra is developed in such a way that (restricted to /spl alpha/-acyclic database schemes) the ...
Comments