DOI: 10.1145/2996913.2996935
demonstration
Simba: spatial in-memory big data analysis

Published: 31 October 2016

ABSTRACT

We present the Simba (Spatial In-Memory Big data Analytics) system, which offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. Simba natively extends the Spark SQL engine to support rich spatial queries and analytics through both SQL and the DataFrame API. It enables the construction of indexes over RDDs inside the engine in order to work with big spatial data and complex spatial operations. Simba also comes with an effective query optimizer that leverages its indexes and novel spatial-aware optimizations to achieve both low latency and high throughput in big spatial data analysis. This demonstration proposal describes key ideas in the design of Simba and presents a demonstration plan.
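
To make the role of indexing over partitioned data concrete, the following is a minimal, single-machine sketch in plain Python. It is not Simba's API or implementation: the uniform-grid partitioning, the `Partition` class, and the `build_partitions` and `range_query` helpers are all hypothetical names chosen for illustration. The sketch only shows the general idea that per-partition bounding boxes let a range query skip partitions that cannot contain results.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Partition:
    """One in-memory partition of points (hypothetical stand-in for an RDD partition)."""
    points: List[Point] = field(default_factory=list)

    def bounds(self) -> Box:
        # Minimum bounding rectangle of the points held by this partition.
        xs = [p[0] for p in self.points]
        ys = [p[1] for p in self.points]
        return (min(xs), min(ys), max(xs), max(ys))


def build_partitions(points: List[Point], cell: float) -> List[Partition]:
    """Group points by uniform grid cell (an assumed scheme, not Simba's partitioner)."""
    cells: Dict[Tuple[int, int], Partition] = {}
    for x, y in points:
        key = (int(x // cell), int(y // cell))
        cells.setdefault(key, Partition()).points.append((x, y))
    return list(cells.values())


def intersects(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes overlap (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(parts: List[Partition], query: Box) -> Tuple[List[Point], int]:
    """Return the points inside `query` and the number of partitions scanned."""
    xmin, ymin, xmax, ymax = query
    hits: List[Point] = []
    scanned = 0
    for part in parts:
        # Pruning step: skip partitions whose bounding box misses the query.
        if not intersects(part.bounds(), query):
            continue
        scanned += 1
        hits.extend(
            p for p in part.points
            if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax
        )
    return sorted(hits), scanned


if __name__ == "__main__":
    data = [(1.0, 1.0), (2.0, 2.0), (11.0, 11.0), (12.0, 3.0), (25.0, 25.0)]
    parts = build_partitions(data, cell=10.0)
    result, scanned = range_query(parts, (0.0, 0.0, 5.0, 5.0))
    print(result, "partitions scanned:", scanned, "of", len(parts))
```

In this toy setup the query touches only the partitions whose bounding boxes overlap the query rectangle; the remaining partitions are never scanned, which is the kind of saving an index-aware optimizer aims for at cluster scale.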

    • Published in

      SIGSPATIAL '16: Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
      October 2016
      649 pages
      ISBN: 9781450345897
      DOI: 10.1145/2996913

      Copyright © 2016 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      SIGSPATIAL '16 Paper Acceptance Rate: 40 of 216 submissions, 19%
      Overall Acceptance Rate: 220 of 1,116 submissions, 20%
