INTEGRATION OF DATA FROM HETEROGENEOUS SOURCES USING ETL TECHNOLOGY.

Authors

  • Marek Macura, AGH University of Science and Technology

DOI:

https://doi.org/10.7494/csci.2014.15.2.109

Keywords:

data integration, integration approaches, ETL technology, knowledge discovery from data, business intelligence

Abstract

Data integration is a crucial issue in environments with heterogeneous data sources, and such heterogeneity is becoming increasingly widespread. Whenever we want to gain useful information and knowledge from various data sources, we must first solve the data integration problem so that appropriate analytical methods can be applied to comprehensive and uniform data. This activity is known as the knowledge discovery from data process. Approaches to the data integration problem are therefore of considerable interest and bring us closer to the "age of information". The paper presents an architecture that implements the knowledge discovery from data process. The solution combines ETL technology with the wrapper layer known from mediated systems. It also provides semantic integration through a mechanism of connections between data elements. The solution allows any data sources to be integrated and analytical methods to be implemented in one environment. The proposed environment is verified by applying it to data sources from the foundry industry.
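
As a rough illustration of the idea summarized in the abstract, the sketch below puts a thin wrapper in front of each source, maps source fields to shared attributes, and then runs an extract-transform-load step into a single store. It is not the architecture from the paper: the source contents, the field names, the `CONNECTIONS` mapping, and every function name are assumptions made up for this example, and an in-memory SQLite table stands in for the integrated environment.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sources: the same kind of measurement in two different formats.
CSV_SOURCE = "alloy,temp_c\nA1,1450\nB2,1390\n"
JSON_SOURCE = '[{"material": "C3", "temperature": 1510}]'

# Hypothetical "connections": source field name -> unified attribute name.
CONNECTIONS = {
    "alloy": "material",
    "material": "material",
    "temp_c": "temperature_c",
    "temperature": "temperature_c",
}


def csv_wrapper(text):
    """Wrapper for a CSV source: expose its rows as plain dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def json_wrapper(text):
    """Wrapper for a JSON source: expose its items as plain dicts."""
    return json.loads(text)


def transform(record):
    """Rename fields via CONNECTIONS and normalize the value type."""
    unified = {CONNECTIONS[key]: value for key, value in record.items()}
    unified["temperature_c"] = float(unified["temperature_c"])
    return unified


def run_etl(conn):
    """Extract through the wrappers, transform, and load into one table."""
    conn.execute(
        "CREATE TABLE measurements (material TEXT, temperature_c REAL)"
    )
    extracted = csv_wrapper(CSV_SOURCE) + json_wrapper(JSON_SOURCE)
    rows = [transform(record) for record in extracted]
    conn.executemany(
        "INSERT INTO measurements VALUES (:material, :temperature_c)", rows
    )
    conn.commit()
    return conn.execute(
        "SELECT material, temperature_c FROM measurements ORDER BY material"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

Once loaded, the unified table can be queried by any analytical method without knowledge of the original formats. Adding a further source in this sketch would mean writing one more wrapper and extending the mapping.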

Author Biography

  • Marek Macura, AGH University of Science and Technology
    The Faculty of Computer Science, Electronics and Telecommunications, PhD student.

Published

2014-03-14

Section

Articles

How to Cite

INTEGRATION OF DATA FROM HETEROGENEOUS SOURCES USING ETL TECHNOLOGY. (2014). Computer Science, 15(2), 109. https://doi.org/10.7494/csci.2014.15.2.109