Data Mining Terminology: A Glossary of Key Terms
This glossary covers common data mining terms and is aimed at beginners in the field.
The following table lists the terms with a short description of each.
Term | Description |
---|---|
Data Mining | The extraction of useful information and patterns from large volumes of data available on the web or in databases. Its applications include market analysis, customer retention, fraud detection, scientific exploration, and disease analysis. |
Data Mining Engine | The core component of a data mining system. It performs tasks such as association, classification, characterization, prediction, and cluster analysis. |
Knowledge Base | A store of domain knowledge and previously found patterns that guides the search and helps evaluate how interesting the resulting patterns are. Like a cache in a computer network, it helps return results quickly when similar patterns are searched for in the future. |
Knowledge Discovery | The overall process of extracting useful knowledge from data. Its steps include data cleaning, data selection, data integration, data transformation, data mining, and pattern evaluation. |
Data Warehouse | A repository built by integrating data from multiple heterogeneous sources. It supports tasks such as analytical reporting, ad hoc and structured queries, and decision-making. |
User Interface | The interface between the user and the data mining system. It lets the user specify queries and tasks, returns information relevant to the search pattern, visualizes patterns in various forms, and presents data drawn from different databases in the desired format. |
Data Cleaning | The process of removing noisy data and correcting inconsistencies in the data. It is applied before data warehousing or data storage so that the resulting datasets are correct. |
Data Selection | The process of retrieving relevant data from databases for analysis purposes is known as data selection. |
Data Integration | The process of collecting and combining appropriate data, often from multiple sources, into a single coherent dataset. |
Data Transformation | The process of changing the form or syntax of the data into one suitable for mining, for example by normalization or aggregation. |
Clusters | A cluster is a group of similar objects. Objects within one cluster are similar to each other and dissimilar to objects in other clusters. |
OLAP | Online Analytical Processing |
OLTP | Online Transaction Processing |
OLAM | Online Analytical Mining |
KDD | Knowledge Discovery in Databases |
MDDB | Multidimensional Database |
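
To make the preprocessing terms in the table more concrete, here is a minimal Python sketch of data integration, cleaning, selection, and transformation applied to a few in-memory records. The record fields (`customer`, `region`, `amount`), the helper function names, and the sample data are all hypothetical and chosen only for illustration; min-max scaling is just one possible transformation, and a real system would use whatever steps its data calls for.

```python
# Illustrative sketch only: field names, helpers, and data are hypothetical.

def integrate(*sources):
    """Data integration: combine records from several sources into one list."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged

def clean(records):
    """Data cleaning: drop records with a missing amount, tidy customer names."""
    cleaned = []
    for r in records:
        if r.get("amount") is None:
            continue  # treat a missing amount as noisy data
        cleaned.append({**r, "customer": r["customer"].strip().title()})
    return cleaned

def select(records, region):
    """Data selection: keep only the records relevant to the analysis."""
    return [r for r in records if r["region"] == region]

def transform(records):
    """Data transformation: min-max scale amounts into the range [0, 1]."""
    if not records:
        return []
    amounts = [r["amount"] for r in records]
    lo, hi = min(amounts), max(amounts)
    span = (hi - lo) or 1  # avoid division by zero when all amounts are equal
    return [{**r, "amount_scaled": (r["amount"] - lo) / span} for r in records]

# Two hypothetical heterogeneous sources.
store_a = [
    {"customer": " alice ", "region": "north", "amount": 120.0},
    {"customer": "bob", "region": "south", "amount": None},
]
store_b = [
    {"customer": "carol", "region": "north", "amount": 80.0},
    {"customer": "dave", "region": "north", "amount": 100.0},
]

prepared = transform(select(clean(integrate(store_a, store_b)), "north"))
for row in prepared:
    print(row["customer"], row["amount_scaled"])
```

The `prepared` list is the kind of input a data mining engine would then work on, for example to group customers into clusters.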