Extracting class diagram from hidden dependencies in data set
DOI: https://doi.org/10.7494/csci.2020.21.2.3483
Keywords: conceptual model, class diagram, UML, data retrieval, raw data, csv
2019/10/28
Abstract
A conceptual model is a high-level, graphical representation of a specific domain, presenting its key concepts and the relationships between them. In particular, these dependencies can be inferred from concepts' instances being a part of big raw data files. The paper aims to propose a method for constructing a conceptual model from data frames encompassed in data files. The result is presented in the form of a class diagram. The method is explained with several examples and verified by a case study in which the real data sets are processed. It can also be applied for checking the quality of the data set.
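As a rough illustration of how dependencies hidden in raw data might surface, the sketch below checks which columns of a small CSV sample functionally determine others. It is not the method proposed in the paper, which the abstract does not detail. The sample data, the column names, and the helper names determines and functional_dependencies are all assumptions made for this example.

```python
import csv
import io
from itertools import permutations


def determines(rows, a, b):
    """Return True if each value of column a maps to exactly one value of column b."""
    seen = {}
    for row in rows:
        # setdefault stores the first b-value observed for this a-value
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


def functional_dependencies(rows):
    """List the (a, b) column pairs in which a functionally determines b."""
    columns = list(rows[0].keys())
    return [(a, b) for a, b in permutations(columns, 2) if determines(rows, a, b)]


# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """order_id,customer_id,customer_name
1,c1,Ann
2,c1,Ann
3,c2,Bob
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
deps = functional_dependencies(rows)

for a, b in deps:
    print(f"{a} -> {b}")
```

In this sample, customer_id and customer_name determine each other, which suggests they could be attributes of one class. order_id determines customer_id but not the reverse, which hints at a many-to-one association between two classes. A dependency that holds in a sample may still be accidental, so results of this kind are candidates for review rather than confirmed model elements.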