skip to main content
10.1145/2103656.2103674acmconferencesArticle/Chapter ViewAbstractPublication PagespoplConference Proceedingsconference-collections
research-article

Symbolic finite state transducers: algorithms and applications

Published:25 January 2012Publication History

ABSTRACT

Finite automata and finite transducers are used in a wide range of applications in software engineering, from regular expressions to specification languages. We extend these classic objects with symbolic alphabets represented as parametric theories. Admitting potentially infinite alphabets makes this representation strictly more general and succinct than classical finite transducers and automata over strings. Despite this, the main operations, including composition, checking that a transducer is single-valued, and equivalence checking for single-valued symbolic finite transducers are effective given a decision procedure for the background theory. We provide novel algorithms for these operations and extend composition to symbolic transducers augmented with registers. Our base algorithms are unusual in that they are nonconstructive, therefore, we also supply a separate model generation algorithm that can quickly find counterexamples in the case two symbolic finite transducers are not equivalent. The algorithms give rise to a complete decidable algebra of symbolic transducers. Unlike previous work, we do not need any syntactic restriction of the formulas on the transitions, only a decision procedure. In practice we leverage recent advances in satisfiability modulo theory (SMT) solvers. We demonstrate our techniques on four case studies, covering a wide range of applications. Our techniques can synthesize string pre-images in excess of 8,000 bytes in roughly a minute, and we find that our new encodings significantly outperform previous techniques in succinctness and speed of analysis.

Skip Supplemental Material Section

Supplemental Material

popl_2b_2.mp4

mp4

205 MB

References

  1. R. Alur and P. Cerný. Streaming transducers for algorithmic verification of single-pass list-processing programs. In POPL'11, pages 599--610. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. D. E. Bakke, R. Parameswaran, D. M. Blough, A. A. Franz, and T. J. Palmer. Data obfuscation: Anonymity and desensitization of usable data sets. IEEE Security and Privacy, Apr. 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. D. Balzarotti, M. Cova, V. Felmetsger, N. Jovanovic, E. Kirda, C. Kruegel, and G. Vigna. Saner: Composing static and dynamic analysis to validate sanitization in web applications. In IEEE Oakland Security and Privacy, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Bek. http://research.microsoft.com/bek.Google ScholarGoogle Scholar
  5. M. Benedikt, C. Ley, and G. Puppis. Automata vs. logics on data words. In CSL, volume 6247 of LNCS, pages 110--124. Springer, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. er and Veanes(2011)}BV11N. Bjørner and M. Veanes. Symbolic transducers. Technical Report MSR-TR-2011--3, Microsoft Research, January 2011.Google ScholarGoogle Scholar
  7. M. Boja'nczyk, A. Muscholl, T. Schwentick, L. Segoufin, and C. David. Two-variable logic on words with data. In phLICS, pages 7--16. IEEE, 06. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. A. Brush, J. Krumm, and J. Scott. Exploring end user preferences for location obfuscation, location-based services, and the value of location. In UbiComp, September 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. K. Z. Chen, G. Gu, J. Nazario, X. Han, and J. Zhuge. WebPatrol: Automated collection and replay of web-based malware scenarios. In ASIACCS, March 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. B. Courcelle and P. Franchi-Zannettacchi. Attribute grammars and recursive program schemes. Theoretical Computer Science, 17: 163--191, 1982.Google ScholarGoogle ScholarCross RefCross Ref
  11. M. Cova, C. Kruegel, and G. Vigna. Detection and analysis of drive-by-download attacks and malicious JavaScript code. In WWW Conference, Raleigh, NC, April 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. K. Culic and J. Karhumaki. The equivalence of finite-valued transducers (on HDTOL languages) is decidable. Theoretical Computer Science, 47: 71--84, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. C. Curtsinger, B. Livshits, B. Zorn, and C. Seifert. Zozzle: Low-overhead mostly static javascript malware detection. In Proceedings of the Usenix Security Symposium, Aug. 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. G. B. Dantzig and B. C. Eaves. Fourier-Motzkin elimination and its dual. Journal of Combinatorial Theory (A), 14: 288--297, 1973.Google ScholarGoogle ScholarCross RefCross Ref
  15. L. de Moura and N. Bjørner. Z3: An Efficient SMT Solver. In TACAS'08, LNCS, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. A. J. Demers, C. Keleman, and B. Reusch. On some decidable properties of finite state translations. Acta Informatica, 17: 349--364, 1982.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. P. Eckersley. How unique is your web browser? In Privacy Enhancing Technologies, pages 1--18, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Engelfriet. Some open questions and recent results on tree transducers and tree languages. In R. V. Book, editor, Formal Language Theory, pages 241--286. Academic Press, 1980.Google ScholarGoogle ScholarCross RefCross Ref
  19. J. Engelfriet and S. Maneth. A comparison of pebble tree transducers with macro tree transducers. Acta Informatica, 39: 2003, 2003.Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Z. Esik. Decidability results concerning tree transducers. Acta Cybernetica, 5: 1--20, 1980.Google ScholarGoogle Scholar
  21. Z. Fülöp and H. Vogler. Syntax-Directed Semantics: Formal Models Based on Tree Transducers. EATCS. Springer, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. T. L. Gall and B. Jeannet. Lattice automata: A representation for languages on infinite alphabets, and some applications to verification. In SAS 2007, volume 4634 of LNCS, pages 52--68, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. T. Griffiths. The unsolvability of the equivalence problem for lambda-free nondeterministic generalized machines. J. ACM, 15: 409--413, 1968. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. P. Hooimeijer and M. Veanes. An evaluation of automata algorithms for string analysis. In VMCAI'11, LNCS. Springer, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. P. Hooimeijer and W. Weimer. A decision procedure for subset constraints over regular languages. In Proceedings of the Conference on Programming Language Design and Implementation, pages 188--198, New York, NY, USA, 2009. ACM. ISBN 978--1--60558--392--1. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. P. Hooimeijer and W. Weimer. Solving string constraints lazily. In ASE, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. P. Hooimeijer, B. Livshits, D. Molnar, P. Saxena, and M. Veanes. Fast and precise sanitizer analysis with bek. In Proceedings of the USENIX Security Symposium, August 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. O. Ibarra. The unsolvability of the equivalence problem for Efree NGSM's with unary input (output) alphabet and applications. SIAM Journal on Computing, 4: 524--532, 1978.Google ScholarGoogle ScholarCross RefCross Ref
  29. M. Kaminski and N. Francez. Finite-memory automata. In 31st Annual Symposium on Foundations of Computer Science (FOCS 1990), volume 2, pages 683--688. IEEE, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. A. Kiezun, V. Ganesh, P. J. Guo, P. Hooimeijer, and M. D. Ernst. HAMPI: a solver for string constraints. In ISSTA, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. N. Kobayashi, N. Tabuchi, and H. Unno. Higher-order multi-parameter tree transducers and recursion schemes for program verification. In POPL, pages 495--508. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. J. Krumm. A survey of computational location privacy. Personal Ubiquitous Comput., 13: 391--399, August 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. A. Maletti, J. Graehl, M. Hopkins, and K. Knight. The power of extended top-down tree transducers. SIAM J. Comput., 39: 410--430, June 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. T. Milo, D. Suciu, and V. Vianu. Typechecking for XML transformers. In Proc. 19th ACM Symposium on Principles of Database Systems (PODS'2000), pages 11--22. ACM, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Y. Minamide. Static approximation of dynamically generated web pages. In WWW '05: Proceedings of the 14th International Conference on the World Wide Web, pages 432--441, 2005. ISBN 1--59593-046--9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. K. Mowery, D. Bogenreif, S. Yilek, and H. Shacham. Fingerprinting information in javascript implementations. In Proceedings of Web 2.0 Security and Privacy 2011 (W2SP), May 2011.Google ScholarGoogle Scholar
  37. F. Neven, T. Schwentick, and V. Vianu. Finite state machines for strings over infinite alphabets. ACM Trans. CL, 5: 403--435, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. G. V. Noord and D. Gerdemann. Finite state transducers with predicates and identities. Grammars, 4: 263--286, 2001.Google ScholarGoogle ScholarCross RefCross Ref
  39. C.-H. L. Ong and S. J. Ramsay. Verifying higher-order functional programs with pattern-matching algebraic data types. In POPL'11, pages 587--598. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. J. R. Parker. Algorithms for Image Processing and Computer Vision. Wiley and Sons, 2006.Google ScholarGoogle Scholar
  41. P. Ratanaworabhan, B. Livshits, and B. Zorn. Nozzle: A defense against heap-spraying code injection attacks. In Proceedings of the Usenix Security Symposium, Aug. 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. P. Saxena, D. Akhawe, S. Hanna, S. McCamant, F. Mao, and D. Song. A symbolic execution framework for javascript. In IEEE Security and Privacy, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. M. P. Schützenberger. Sur les relations rationnelles. In GI Conference on Automata Theory and Formal Languages, volume 33 of LNCS, pages 209--213, 1975. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. L. Segoufin. Automata and logics for words and trees over an infinite alphabet. In Z. Ésik, editor, CSL, volume 4207 of LNCS, pages 41--57, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. H. Seidl. Equivalence of finite-valued tree transducers is decidable. Math. Systems Theory, 27: 285--346, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. M. Veanes and N. Bjørner. Symbolic tree transducers. In Perspectives of System Informatics (PSI'11), 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. M. Veanes, N. Bjørner, and L. de Moura. Symbolic automata constraint solving. In C. Fermüller and A. Voronkov, editors, LPAR-17, volume 6397 of LNCS, pages 640--654, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. M. Veanes, P. de Halleux, and N. Tillmann. Rex: Symbolic Regular Expression Explorer. In ICST'10. IEEE, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. G. Wassermann, D. Yu, A. Chander, D. Dhurjati, H. Inamura, and Z. Su. Dynamic test input generation for web applications. In ISSTA, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. A. Weber. Decomposing finite-valued transducers and deciding their equivalence. SIAM Journal on Computing, 22 (1): 175--202, February 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. F. Yu, T. Bultan, and O. H. Ibarra. Relational string verification using multi-track automata. In Proceedings of the 15th international conference on Implementation and application of automata, CIAA'10, pages 290--299, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Symbolic finite state transducers: algorithms and applications

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in
            • Published in

              cover image ACM Conferences
              POPL '12: Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
              January 2012
              602 pages
              ISBN:9781450310833
              DOI:10.1145/2103656
              • cover image ACM SIGPLAN Notices
                ACM SIGPLAN Notices  Volume 47, Issue 1
                POPL '12
                January 2012
                569 pages
                ISSN:0362-1340
                EISSN:1558-1160
                DOI:10.1145/2103621
                Issue’s Table of Contents

              Copyright © 2012 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 25 January 2012

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • research-article

              Acceptance Rates

              Overall Acceptance Rate824of4,130submissions,20%

              Upcoming Conference

              POPL '25

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader