Research School

Collection Research School

00:00:00 / 00:00:00
40 65

Semistructured data, logic, and automata – part 1

By Diego Figueira

Also appears in collection : Ecole de Printemps d'Informatique Théorique (EPIT) 2019 - Données, logique et automates / Spring school on Theoretical Computer Science (EPIT) - Databases, Logic and Automata

Semistructured data is an umbrella term encompassing data models which are not logically organized in tables (i.e., the relational data model) but rather in hierarchical structures using markers such as tags to separate semantic elements and data fields in a ‘self-describing’ way. In this lecture we survey some of the multiple connections between formal language theory and semi-structured data, in particular concerning the XML format. We will cover ranked and unranked tree automata, and its connections to Monadic Second Order logic, First Order logic, and XPath. The aim is to take a glimpse at the landscape of closure properties, algorithms and expressiveness results for these formalisms.

Information about the video

Citation data

  • DOI 10.24350/CIRM.V.19518803
  • Cite this video Figueira, Diego (09/04/2019). Semistructured data, logic, and automata – part 1. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19518803
  • URL https://dx.doi.org/10.24350/CIRM.V.19518803

Bibliography

  • Rajeev Alur and P. Madhusudan. Visibly pushdown languages. In Symposium on Theory of Computing (STOC), pages 202–211. ACM Press, 2004. - http://doi.org/10.1145/1007352.1007390
  • Saguy Benaim, Michael Benedikt, Witold Charatonik, Emanuel Kieronski, Rastislav Lenhardt, Filip Mazowiecki, and James Worrell. Complexity of two-variable logic on finite trees. ACM Transactions on Computational Logic, 17(4):32:1–32:38, 2016. - https://doi.org/10.1145/2996796
  • Mikołaj Bojańczyk and Thomas Colcombet. Tree-walking automata cannot be determinized. Theoretical Computer Science, 350(2-3):164–173, 2006 - https://doi.org/10.1016/j.tcs.2005.10.031
  • Mikołaj Bojańczyk and Thomas Colcombet. Tree-walking automata do not recognize all regular languages. SIAM J. Comput., 38(2):658–701, 2008 - https://doi.org/10.1137/050645427
  • Mikołaj Bojańczyk, Claire David, Anca Muscholl, Thomas Schwentick, and Luc Segoufin. Two-variable logic on data words. ACM Transactions on Computational Logic, 2010
  • Mikołaj Bojańczyk, Anca Muscholl, Thomas Schwentick, and Luc Segoufin. Two-variable logic on data trees and XML reasoning. Journal of the ACM, 56(3):1–48, 2009. - https://doi.org/10.1145/1516512.1516515
  • Mikołaj Bojańczyk and Paweł Parys. XPath evaluation in linear time. Journal of the ACM, 58(4):17:1–17:33, 2011 - https://doi.org/10.1145/1989727.1989731
  • Mikołaj Bojańczyk, Mathias Samuelides, Thomas Schwentick, and Luc Segoufin. Expressive power of pebble automata. In International Colloquium on Automata, Languages and Programming (ICALP), volume 4051 of Lecture Notes in Computer Science, pages 157–168. Springer, 2006. - https://doi.org/10.1007/11786986_15.
  • J Richard Büchi. Weak second-order arithmetic and finite automata. Mathematical Logic Quarterly, 6(1-6):66–92, 1960 - https://doi.org/10.1002/malq.19600060105
  • Balder ten Cate and Luc Segoufin. Transitive closure logic, nested tree walking automata, and XPath. Journal of the ACM, 57(3):18, 2010 - https://doi.org/10.1145/1706591.1706598
  • Ashok K. Chandra, Dexter Kozen, and Larry J. Stockmeyer. Alternation. Journal of the ACM, 28(1):114–133, 1981 - https://doi.org/10.1145/322234.322243
  • H. Comon, M. Dauchet, R. Gilleron, C. Löding, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. Tree automata techniques and applications (TATA) - [http://tata.gforge.inria.fr, 2007](http://tata.gforge.inria.fr, 2007)
  • Stephen A. Cook. An observation on time-storage trade off. Journal of Computer and System Sciences (JCSS), 9(3):308–316, 1974 - https://doi.org/10.1016/S0022-0000(74)80046-2
  • Julien Cristau, Christof Löding, and Wolfgang Thomas. Deterministic automata on unranked trees. In Fundamentals of Computation Theory, volume 3623 of Lecture Notes in Computer Science, pages 68–79. Springer, 2005. - https://doi.org/10.1007/11537311_7
  • Wojciech Czerwiński, Wim Martens, Matthias Niewerth, and Paweł Parys. Minimization of tree patterns. Journal of the ACM, 65(4):26:1–26:46, 2018 - https://doi.org/10.1145/3180281
  • Stéphane Demri and Ranko Lazić. LTL with the freeze quantifier and register automata. ACM Transactions on Computational Logic, 10(3), 2009. - https://doi.org/10.1145/1507244.1507246.
  • Diego Figueira. Alternating register automata on finite data words and trees. Logical Methods in Computer Science (LMCS), 8(1), 2012. - https://doi.org/10.2168/LMCS-8(1:22)2012
  • Diego Figueira. Decidability of downward XPath. ACM Transactions on Computational Logic, 13(4), 2012. - https://doi.org/10.1145/2362355.2362362.
  • Diego Figueira. On XPath with transitive axes and data tests. In ACM Symposium on Principles of Database Systems (PODS), pages 249–260. ACM Press, 2013 - https://doi.org/10.1145/2463664.2463675
  • Diego Figueira. Satisfiability of XPath on data trees. SIGLOG News, 5(2):4–16, 2018. - https://doi.org/10.1145/3212019.3212021
  • Diego Figueira and Luc Segoufin. Bottom-up automata on data trees and vertical XPath. Logical Methods in Computer Science (LMCS), 13(4), 2017. - https://doi.org/10.23638/LMCS-13(4:5)2017
  • Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2006. - https://doi.org/10.1007/3-540-29953-X
  • Georg Gottlob and Christoph Koch. Monadic datalog and the expressive power of languages for Web information extraction. Journal of the ACM, 51(1):74–113, 2004 - https://doi.org/10.1145/962446.962450
  • Georg Gottlob, Christoph Koch, and Reinhard Pichler. Efficient algorithms for processing XPath queries. ACM Transactions on Database Systems (TODS), 30(2):444–491, 2005. - https://doi.org/doi:10.1145/1071610.1071614
  • Georg Gottlob, Christoph Koch, Reinhard Pichler, and Luc Segoufin. The complexity of XPath query evaluation and XML typing. Journal of the ACM, 52(2):284–335, 2005 - https://doi.org/10.1145/1059513.1059520
  • Marcin Jurdziński and Ranko Lazić. Alternating automata on data trees and XPath satisfiability. ACM Transactions on Computational Logic, 12(3):19, 2011. - [https://doi.org/10.1145/ 1929954.1929956](https://doi.org/10.1145/ 1929954.1929956)
  • Michael Kaminski and Nissim Francez. Finite-memory automata. Theoretical Computer Science, 134(2):329–363, 1994 - https://doi.org/10.1016/0304-3975(94)90242-9
  • Leonid Libkin. Logics for unranked trees: An overview. Logical Methods in Computer Science (LMCS), 2(3), 2006. - https://doi.org/10.2168/LMCS-2(3:2)2006
  • Wim Martens and Stijn Vansummeren. Automata and logic on trees. ESSLLI 2017 summer school. Course slides. - [https://www.theoinf.uni-bayreuth.de/pool/ documents/Other/ESSLLI07.zip](https://www.theoinf.uni-bayreuth.de/pool/ documents/Other/ESSLLI07.zip)
  • Maarten Marx. Conditional XPath. ACM Transactions on Database Systems (TODS), 30(4):929–959, 2005 - https://doi.org/10.1145/1114244.1114247.
  • Maarten Marx and Maarten de Rijke. Semantic characterizations of navigational XPath. SIGMOD Record, 34(2):41–46, 2005 - https://doi.org/10.1145/1083784.1083792
  • Gerome Miklau and Dan Suciu. Containment and equivalence for a fragment of XPath. Journal of the ACM, 51(1):2–45, 2004 - https://doi.org/10.1145/962446.962448.
  • Frank Neven. Automata, logic, and XML. In EACSL Annual Conference on Computer Science Logic (CSL), volume 2471 of Lecture Notes in Computer Science, pages 2–26. Springer, 2002 - https://doi.org/10.1007/3-540-45793-3_2
  • Frank Neven. Automata theory for XML researchers. SIGMOD Record, 31(3):39–46, 2002. - http://doi.org/10.1145/601858.601869
  • Frank Neven and Thomas Schwentick. Query automata over finite trees. Theoretical Computer Science, 275(1-2):633–674, 2002 - https://doi.org/10.1016/S0304-3975(01)00301-2
  • Klaus Reinhardt. The complexity of translating logic to finite automata. In Automata, Logics, and Infinite Games: A Guide to Current Research [outcome of a Dagstuhl seminar, February 2001], volume 2500 of Lecture Notes in Computer Science, pages 231–238. Springer, 2001. - https://doi.org/10.1007/3-540-36387-4_13
  • Thomas Schwentick. Formal methods for XML: Algorithms and complexity. EDBT 2004 summer school - [https://ls1-www.cs.tu-dortmund.de/de/ vortraege-thomas-schwentick.](https://ls1-www.cs.tu-dortmund.de/de/ vortraege-thomas-schwentick.)
  • Thomas Schwentick. Automata for XML – A survey. Journal of Computer and System Sciences (JCSS), 73(3):289–315, 2007. - https://doi.org/10.1016/j.jcss.2006.10.003
  • Thomas Schwentick. Foundations of XML based on logic and automata: A snapshot. In Foundations of Information and Knowledge Systems (FoIKS), volume 7153 of Lecture Notes in Computer Science, pages 23–33. Springer, 2012. - https://doi.org/10.1007/978-3-642-28472-4_2
  • Luc Segoufin. Automata and logics for words and trees over an infinite alphabet. In EACSL Annual Conference on Computer Science Logic (CSL), volume 4207 of Lecture Notes in Computer Science, pages 41–57. Springer, 2006 - http://dx.doi.org/10.1007/11874683_3
  • Luc Segoufin. Static analysis of XML processing with data values. SIGMOD Record, 36(1):31–38, 2007. - https://doi.org/10.1145/1276301.1276308
  • Larry J. Stockmeyer and Albert R. Meyer. Word problems requiring exponential time: Preliminary report. In Symposium on Theory of Computing (STOC), pages 1–9. ACM Press, 1973 - https://doi.org/10.1145/800125.804029
  • Larry Joseph Stockmeyer. The complexity of decision problems in automata theory and logic. PhD thesis, Massachusetts Institute of Technology, 1974 - http://hdl.handle.net/1721.1/15540
  • Dan Suciu. Typechecking for semistructured data. In Database Programming Languages (DBPL), volume 2397 of Lecture Notes in Computer Science, pages 1–20. Springer, 2001 - https://doi.org/10.1007/3-540-46093-4_1
  • Ivan Hal Sudborough. On the tape complexity of deterministic context-free languages. Journal of the ACM, 25(3):405–414, 1978. - https://doi.org/10.1145/322077.322083
  • Balder ten Cate. The expressivity of XPath with transitive closure. In ACM Symposium on Principles of Database Systems (PODS), pages 328–337. ACM Press, 2006 - https://doi.org/10.1145/1142351.1142398
  • Moshe Y. Vardi. The complexity of relational query languages (extended abstract). In Symposium on Theory of Computing (STOC), pages 137–146. ACM Press, 1982 - https://doi.org/10.1145/800070.802186
  • Moshe Y. Vardi. Reasoning about the past with two-way automata. In International Colloquium on Automata, Languages and Programming (ICALP), volume 1443 of Lecture Notes in Computer Science, pages 628–641. Springer, 1998. - https://doi.org/10.1007/BFb0055090
  • János Virágh. Deterministic ascending tree automata I. Acta Cybernetica, 5(1):33–42, 1980. - http://cyber.bibl.u-szeged.hu/index.php/actcybern/article/view/3203

Last related questions on MathOverflow

You have to connect your Carmin.tv account with mathoverflow to add question

Ask a question on MathOverflow




Register

  • Bookmark videos
  • Add videos to see later &
    keep your browsing history
  • Comment with the scientific
    community
  • Get notification updates
    for your favorite subjects
Give feedback