How to Turn Data into Knowledge
Part Two of a Two-Part Series

July 3, 2000 (SmartPros)

Perhaps you have spent thousands on the development of your Web site. There is a return on that investment. Each visitor leaves a trail of data that can be captured and can add significant value to marketing opportunities.

In an organization with more than just a few people, it is likely that some degree of fraud is taking place. Individuals who commit fraud will likely leave some evidence of their act. This data (evidence) could help organizations predict, detect, and reduce the risk of future fraud by helping to strengthen internal controls.

Successful organizations know that you cannot pay bills with net income. In other words, "Cash is King." As organizations operate, they will probably experience bad debts from regular trade receivables or other financing arrangements. Most organizations record a great deal of information about the customer. Perhaps that data contains hidden patterns that may help predict the default of consumer financing arrangements.

These are just a few examples of the opportunities of data mining. Organizations collect and process staggering amounts of data every day, yet most do not use that data as well as they could. This is good news for competitors that do employ data mining techniques. They know that if they use these techniques properly, they will have a better understanding of consumer behavior.

Data mining tools and techniques are still in their infancy, and it takes a considerable amount of business and technical experience to become proficient with them and to identify meaningful results. With proper use and management, data mining can give organizations opportunities to improve operations and increase cash flow.

What is Data Mining?

Data mining is not a magic bullet, nor will it transform business operations and profitability overnight.
However, over time these techniques can help an organization gain greater understanding and a competitive advantage. It is unlikely that any data mining tool will replace professionals in our lifetime, since no analysis technique can replace experience. Data mining assists as a tool for finding patterns and relationships. The professional must determine the value of these findings and verify them in the real world. Therefore, it is critical that the professional uses the correct data and has a thorough understanding of it.

Data Mining and OLAP

OLAP (online analytical processing) is commonly referred to as multi-dimensional reporting and includes a variety of user-driven techniques. Although data mining and OLAP are two very different tools, they can complement each other.

OLAP is a way to validate a hypothesis (i.e., a suspected pattern or relationship) against data. OLAP tools are decision support tools that allow users to run a series of queries against data and check whether a hypothesis is true. The problem with this deductive analysis is that finding valuable hypotheses is difficult and time consuming.

Data mining, on the other hand, is not as time consuming and uses an inductive, data-driven process. The professional does not feed the data mining tool a hypothesis to validate; the tool uses the given data to uncover patterns and relationships.

OLAP tools are widely used and are an important element of decision-making. OLAP can complement data mining by identifying a good place to start mining. For example, if you wish to know what took place in the past, such as what the sales were by country, by month, etc., then start with an OLAP tool. Once what happened in the past is identified, many will wonder why those events happened and what is likely to happen in the future. Knowledgeable and well-equipped professionals will use data mining tools and techniques for these types of questions.
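The user-driven, "what happened" kind of question described above (sales by country, by month) can be sketched as a small roll-up. This is only an illustration and is not taken from any particular OLAP product: the sales records, the `roll_up` helper, and the dimension names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records: (country, month, amount). Illustrative data only.
SALES = [
    ("US", "2000-01", 120.0),
    ("US", "2000-01", 80.0),
    ("US", "2000-02", 150.0),
    ("DE", "2000-01", 60.0),
    ("DE", "2000-02", 90.0),
]

# Position of each dimension within a record (an assumption of this sketch).
DIMENSIONS = {"country": 0, "month": 1}


def roll_up(records, *dims):
    """Sum the amount column, grouped by the dimensions the user chooses."""
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[DIMENSIONS[d]] for d in dims)
        totals[key] += record[2]
    return dict(totals)


# The user decides which view of the past to look at.
by_country_month = roll_up(SALES, "country", "month")
by_country = roll_up(SALES, "country")

for key, total in sorted(by_country_month.items()):
    print(key, total)
```

Here the user picks the grouping, which is what makes the analysis hypothesis-driven. A data mining tool would instead search the same records for patterns without being told which groupings to examine.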

Data Mining Process

Before beginning any type of data mining process, remember that a successful project always has two ingredients. First, know your objective: what are you trying to solve? Second, use the appropriate data.

There are several phases in the data mining process, and it all begins with the data collected. Taken together, the phases form an iterative process, because the work requires hopping back and forth between phases until reaching a comfort level. The seven distinct phases of data mining are:
As professionals iterate between the different phases, they will obtain new knowledge, resulting in new questions and a clearer focus. As more data mining occurs, their processes should, in theory, provide some degree of benefit to future data miners.

Organization Understanding

Data Understanding

Data Preparation

Modeling

All of the available techniques will provide an answer. However, the quality of the results depends directly on the technique applied. Although it is outside the scope of this article, it is critical to understand the available techniques so that they can be aligned with the goals. Some of the techniques used in data mining include the following:
Evaluation

Deployment and Monitoring

Since whatever is being modeled is likely to change over time, it is important to have internal controls that constantly revalidate the models. That a model was valid yesterday does not mean it will be valid in the future.

Conclusion

Even though data mining is an exciting technology with significant payback possibilities, keep in mind that without adequate training and experience, the process could lead to deficient findings, invalid conclusions, and costly organizational decisions.

2000, Smartpros Ltd. All Rights Reserved.