Extracting class diagram from hidden dependencies in data set


  • Bogumiła Hnatkowska Wrocław University of Science and Technology, Faculty of Computer Science and Management https://orcid.org/0000-0003-1706-0205
  • Zbigniew Huzar
  • Lech Tuzinkiewicz




conceptual model, class diagram, UML, data retrieval, raw data, csv
2019/10/28


A conceptual model is a high-level, graphical representation of a specific domain, presenting its key concepts and the relationships between them. In particular, these dependencies can be inferred from the concepts' instances that are part of big raw data files. The paper proposes a method for constructing a conceptual model from data frames contained in data files. The result is presented in the form of a class diagram. The method is explained with several examples and verified by a case study in which real data sets are processed. It can also be applied to check the quality of a data set.
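As a rough illustration of what inferring dependencies from raw data can look like, the sketch below checks which columns of a small CSV sample functionally determine which others. It is not the method of the paper: the sample data, the column names, and the helper functions `load_rows`, `determines`, and `find_dependencies` are all assumptions made for this example.

```python
import csv
import io
from itertools import permutations

# Hypothetical sample data; the column names are illustrative only.
SAMPLE_CSV = """order_id,customer_id,customer_name
1,10,Ann
2,10,Ann
3,11,Bob
4,12,Bob
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def determines(rows, src, dst):
    """Return True if every value of column src maps to exactly one value of dst."""
    seen = {}
    for row in rows:
        # setdefault stores the first dst value seen for this src value
        # and returns the stored one on later rows.
        if seen.setdefault(row[src], row[dst]) != row[dst]:
            return False
    return True


def find_dependencies(rows):
    """List all ordered column pairs (a, b) such that a determines b in the sample."""
    columns = list(rows[0].keys())
    return [(a, b) for a, b in permutations(columns, 2) if determines(rows, a, b)]


rows = load_rows(SAMPLE_CSV)
dependencies = find_dependencies(rows)
for src, dst in dependencies:
    print(f"{src} -> {dst}")
```

In this sample, `customer_id` determines `customer_name` but not the other way round, which hints at a many-to-one association between two candidate classes. A dependency found this way holds only for the rows examined, so it is a hypothesis about the domain rather than a proven rule.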








How to Cite

Hnatkowska, B., Huzar, Z., & Tuzinkiewicz, L. (2020). Extracting class diagram from hidden dependencies in data set. Computer Science, 21(2). https://doi.org/10.7494/csci.2020.21.2.3483