Model for Dynamic and Hierarchical Data Repository in Relational Database

Authors

  • Mateusz Piech, AGH University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Department of Computer Science, Krakow, Poland
  • Wojciech Frącz, AGH University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Department of Computer Science, Krakow, Poland
  • Wojciech Turek, AGH University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Department of Computer Science, Krakow, Poland
  • Marek Kisiel-Dorohinicki, AGH University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Department of Computer Science, Krakow, Poland
  • Jacek Dajda, AGH University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Department of Computer Science, Krakow, Poland
  • Aleksander Byrski, AGH University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Department of Computer Science, Krakow, Poland

DOI:

https://doi.org/10.7494/csci.2018.19.4.3088

Keywords:

JSON, Relational Databases, EAV, CQRS, PostgreSQL, Open Schema Model

Abstract

The aim of this research is to build an open schema model for a repository of digital sources in a relational database. This required us to develop several techniques. The first was to store and maintain the hierarchical structure of data pushed into the repository. The second was to create constraints, at any level of the hierarchy, that enforce data integrity and consistency. The resulting solution is based mainly on JSON as a native column type, which was designed for holding open schema documents. In this paper, we present a model for any repository that uses hierarchical dynamic data. Additionally, we include a structure for normalizing the input and for describing the data so that all of the model's assumptions hold. We compared our solution with a well-known open schema model, Entity-Attribute-Value (EAV), in terms of saving data and of querying the relationships and content of the structure. The results show improvements in both performance and disk space usage, even though our model adds a few features that the EAV model does not include. The techniques developed in this research can be applied in any domain where hierarchical dynamic data is required, as demonstrated by the digital book repository that we present.
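
To make the contrast drawn in the abstract concrete, the following is a minimal sketch of the two storage layouts, not the schema used in the paper. It assumes hypothetical table names (`eav`, `node`), a hypothetical `parent_id` column for the hierarchy, and made-up sample data, and it uses SQLite from the Python standard library in place of PostgreSQL, so the JSON document is kept as text and parsed in Python rather than stored in a native JSON column.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- EAV layout: one row per (entity, attribute, value) triple.
    CREATE TABLE eav (
        entity_id INTEGER NOT NULL,
        attribute TEXT NOT NULL,
        value     TEXT
    );
    -- Document layout (hypothetical): one row per node, with all attributes
    -- kept together as JSON and the hierarchy expressed through parent_id.
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),
        kind      TEXT NOT NULL,
        doc       TEXT NOT NULL  -- JSON text here; a native JSON type in PostgreSQL
    );
""")


def save_eav(entity_id, attrs):
    """Store each attribute as its own row; values are flattened to text."""
    conn.executemany(
        "INSERT INTO eav VALUES (?, ?, ?)",
        [(entity_id, key, str(value)) for key, value in attrs.items()],
    )


def save_node(node_id, parent_id, kind, attrs):
    """Store all attributes of one node as a single JSON document."""
    conn.execute(
        "INSERT INTO node VALUES (?, ?, ?, ?)",
        (node_id, parent_id, kind, json.dumps(attrs)),
    )


def subtree(root_id):
    """Return the root and all of its descendants using a recursive query."""
    rows = conn.execute(
        """
        WITH RECURSIVE tree(id, kind, doc, depth) AS (
            SELECT id, kind, doc, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.kind, n.doc, t.depth + 1
            FROM node AS n JOIN tree AS t ON n.parent_id = t.id
        )
        SELECT id, kind, doc, depth FROM tree ORDER BY depth, id
        """,
        (root_id,),
    ).fetchall()
    return [(i, kind, json.loads(doc), depth) for i, kind, doc, depth in rows]


# Made-up sample data: a book containing one chapter.
book = {"title": "Example Book", "year": 2018}
chapter = {"title": "Chapter 1", "pages": 12}

save_eav(1, book)
save_eav(2, chapter)
save_node(1, None, "book", book)
save_node(2, 1, "chapter", chapter)

eav_rows = conn.execute("SELECT COUNT(*) FROM eav").fetchone()[0]
node_rows = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
tree = subtree(1)
```

In this toy example the same two objects occupy four rows in the EAV table and two rows in the document table, and the document layout keeps value types such as integers intact. This illustrates only the structural difference between the layouts; it says nothing about the performance or disk usage figures reported in the paper.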

Published

2018-11-25

Section

Articles

How to Cite

Model for Dynamic and Hierarchical Data Repository in Relational Database. (2018). Computer Science, 19(4). https://doi.org/10.7494/csci.2018.19.4.3088
