International Journal of Management, Knowledge and Learning, 1(1), 27–43

A Composite Strategy for the Legal and Ethical Use of Data Mining

Dinah Payne
University of New Orleans, USA

Brett J. L. Landry
University of Dallas, USA

An increasingly popular business practice, data mining provides for the extraction of information from existing data to identify trends such as consumer purchasing practices and can foster greater efficiency in companies’ marketing efforts. There are corresponding costs associated with data mining, as well. The most difficult issue surrounding data mining is that of individual privacy rights and the costs associated with the potential alteration of ‘traditional’ privacy rights. This paper seeks to review basic definitional information on data mining and provide a strategy for companies’ successful, meaningful and ethical use of data mining as presented for meaningful knowledge generation.

Keywords: management, learning, knowledge, data mining, text mining, privacy, ethics, strategy

Introduction

Historically and anecdotally, the ‘village gossip’ was the best source of a wide variety of information, although not always an accurate one. Currently, society has two much better, more efficient and accurate sources of a much wider spectrum of information: data mining and its natural extension, text mining. The information potentially available to anyone with the right resources ranges from people’s name, address and telephone number to their personal financial and medical information (Montana, 2001). ‘Seeing’ these data is becoming a major component in decision support and the formation of organizational strategies.
The ‘Age of Information’ has transformed society’s understanding of the concept of information from its present form of ‘something that we ought to know’ into its evolving form of ‘something we could otherwise never have known or used effectively.’ A clear sign that data-recording and evaluation is becoming central to society’s operations is the substantiation of a 1999 prediction by professionals at the University of California at Berkeley. They predicted that by the year 2004, the amount of stored information would be double that of the 1999 storage rate; that trend has been confirmed (O’Harrow, 2004).

This paper sets out to define data mining, its purposes, stakeholders and common usages, and then focus on the ethical issues and burdens of data mining. While much of the literature has focused on the techniques used and the benefits gained from data mining, careful consideration must be given to how that data has been used and cared for. The paper ties existing ethical frameworks to data mining strategies that organizations can use in concert with data usage and technical strategies.

www.issbs.si/press/ISSN/2232-5697/1_27-43.pdf

Data Mining: Definitions and Functions

Data mining ‘[. . .] attempts to extract even more information from existing data by finding a correlation or trend in the existing data. It is also called knowledge discovery [. . .] because data miners do not know specifically what they are looking for before they find it. They are seeking to discover new insights from the data in their databases [. . .]’ (Cary, Wen, & Mahatanankoon, 2003). Text mining is a more refined type of data mining, one that allows for more intelligent, refined and efficient searches of textual information. The focus of this paper is on the broader search model of data mining.

Data mining can be defined broadly as the analysis of database information.
By its simplest definition, data mining is the set of activities used to find new, hidden, or unexpected patterns in data (Marakas, 1999). Its purposes can be described in a variety of ways. First, ‘data mining is the analysis of data to establish relationships and identify patterns’ (FindLaw, 2012). Database information mining can also be used to identify specific product information and codes. In addition, the purposes of both cleansing and re-formatting data for future use are served by data mining. Further, data mining is used as an information extraction activity which has the goal of discovering hidden facts contained in databases. Using a combination of machine learning, statistical analysis, modeling techniques, and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results. This information can then be used to increase organizational learning and the overall knowledge base.

According to a Gartner Magic Quadrant report for customer data mining applications, SAS and SPSS remain leading vendors, with Portrait Software and Angoss Software listed as challengers, ThinkAnalytics as a visionary, and Infor CRM Epiphany, Viscovery, and Unica as niche players (Herschel, 2008). This is not an exhaustive list; Gartner selected the firms it reviewed on the basis of functionality, including pulling data from a variety of dissimilar data sources and the ability to ‘support common CRM decisions such as customer segmentation, cross-sell or customer churn prevention, with data mining-driven-insights’ (Herschel, 2008). However, CRM systems are not the only use for data mining applications. Olson (2007) describes how data mining can serve a variety of operational needs, including banking and credit card management, fraud detection, and risk analysis.
It can also be used for analyzing market segmentation, customer profiling, evaluation of retail promotions, and credit risk analysis (Two Crows Consulting, 2012).

This ability to analyze a variety of data sources allows data mining to produce different outputs than surveying customers, suppliers, and others, for a variety of reasons. The first of these, as discussed, is that data mining produces unexpected results. Surveys, questionnaires, and other types of instruments address a hypothesis or research question that is already known. This is not a bad thing, and it provides answers for questions and issues that are known. Data mining can answer existing questions and find patterns that will create new questions. It can also generate predictions based on previous patterns (Meidan, 2011). An example of the predictive capability of data mining is bankruptcy prediction using a variety of statistical methods (Olson, 2007).

According to Marakas (1999), data mining processes may be classified by the functions they perform or by the class of application they can be used in. Four categories emerge: classification, association, sequence, and cluster. The most broadly applicable technique is that of classification. This application discovers rules about whether an item or event belongs to a particular subset or class of data. For example, the response of customers to a particular direct mail campaign can be predicted by designating the appropriate parameters in the firm’s information database from which the direct mailing campaign information will be drawn.

A second function of data mining is the association function. This approach employs linkage analysis of transactions that have a high probability of repetition. For example, the purchase of two products from the grocery store by the same person at the same time can be associated, such as skim milk and yogurt, or peanut butter and jelly.

The sequence function of data mining relates events in time.
A very good example can be found in the mail catalog industry: when a customer uses a credit card to buy clothes from a certain type of store, data mining allows other, similar companies to be apprised of the purchase. The firm that mined the data about the purchase will then immediately target that buyer. A rational extrapolation of the first purchase is that the consumer may be likely to make similar purchases from other merchants. This function of data mining appears to be the most important reason that businesses use data mining. It is used as a tool for identifying consumers to target in direct marketing campaigns (Cary et al., 2003).

Finally, the fourth function of data mining, the cluster function, is simply to identify sets of objects grouped together by virtue of their similarity or proximity to each other. This approach might be used to mine credit card purchase data to discover that meals charged on a business card are typically purchased on weekdays and have a value greater than $250, whereas meals purchased using a personal credit card occur on weekends and have a value less than $175.

Volume 1, Issue 1, 2012

Common Usage and Stakeholders

There are many uses to which data mining can be put. Additionally, there are many stakeholders who both use and are affected by data mining. Data mining tools for these uses include a wide range of analytical activities, including data profiling, data warehousing, online analytical processing and enterprise analytical applications (Agosta, 2004). The most common uses of data mining efforts can be categorized into four areas: efficiency, security, customer service, and product innovation. It can be used to increase efficiency and enhance security via its ability to detect fraud, waste and abuse; the discovery of patterns of monetary disbursement can reveal both inefficient spending patterns, as well as unauthorized patterns of spending.
An example of fraud detection is the use of data mining by forensic investigators within accounting audits (Meidan, 2011). Additionally, efficiency is enhanced by the use of data mining via the better management of human resources; for example, employers may be able to choose better employees for specific jobs than they were able to without the knowledge base provided by data mining. Data mining can be used to improve service performance for customers; patterns of customer choice and spending can help the firm better provide more satisfactory goods and services for customers. The final category of use of data mining is enhanced product innovation. Like the use of information to provide better customer service, data mining can provide better analysis of scientific and research data. For example, drug and medical research can be greatly enhanced by data mining techniques.

Today, thanks to data mining technology, business is easily able to collect volumes of information, store it and access it at any time to mine for the necessary information. Also, this technology has enabled businesses to perform data mining on many types of data, including those in structured, textual, web, or multimedia forms.

In addition, data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources and can be integrated with new products and systems as they are brought on-line. When implemented on high performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to questions such as ‘Which consumers are most likely to be receptive to our advertisements?’ or ‘Which industrial processes are more likely to be most efficient?’ (Cary et al., 2003). Clearly, the ability to answer questions such as these more effectively and in a more timely fashion can only be beneficial to the firm.
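A question such as ‘Which consumers are most likely to be receptive to our advertisements?’ can be illustrated with a very small sketch. The Python example below is not taken from Cary et al. (2003) or from any commercial tool; it assumes a hypothetical list of past campaign records, each with an invented segment label and a flag showing whether the customer responded, and it simply ranks segments by their historical response rate. Real data mining tools would rely on far richer models and data.

```python
from collections import defaultdict

# Hypothetical past campaign records: (customer segment, responded to mailing?)
# The segment names and outcomes are invented for illustration only.
past_campaign = [
    ("urban_under_35", True),
    ("urban_under_35", False),
    ("urban_under_35", True),
    ("rural_over_50", False),
    ("rural_over_50", False),
    ("suburban_family", True),
    ("suburban_family", False),
]


def response_rates(records):
    """Return {segment: fraction of mailed customers who responded}."""
    mailed = defaultdict(int)
    responded = defaultdict(int)
    for segment, did_respond in records:
        mailed[segment] += 1
        if did_respond:
            responded[segment] += 1
    return {seg: responded[seg] / mailed[seg] for seg in mailed}


def rank_segments(records):
    """List segments from most to least receptive, based on past responses."""
    rates = response_rates(records)
    return sorted(rates, key=rates.get, reverse=True)


if __name__ == "__main__":
    rates = response_rates(past_campaign)
    for segment in rank_segments(past_campaign):
        print(segment, round(rates[segment], 2))
```

The ranking only restates what happened in earlier campaigns; whether it predicts future behavior depends on the quality and representativeness of the underlying records, a concern the paper returns to when discussing the costs of data mining.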
In order to understand the benefits and burdens of data mining more fully, one must recognize those who are most closely affected by its use. In this review, it is imperative that the interests of stakeholders be considered; an ethical analysis of data mining must begin with the identification of the key stakeholders affected by data mining practices and a description of how the stakeholders may be positively or negatively affected. ‘A stakeholder is any individual, group, organization, or institution that can affect, as well as be affected by, an individual’s, group’s, organization’s, or institution’s policy or policies’ (Wood-Harper, Corder, Wood, & Watson, 1996, p. 9). These include, but are not limited to, customers/clients of the data warehouse, data warehouse management, subjects of information searches, shareholders of all companies affected by the firm’s behavior, employees of the various firms involved, the community, society at large, and current and future financial backers of the firm. Other stakeholders include professional associations, governmental regulatory agencies, competitors, and information suppliers.

In the consideration of data mining, several constituents are more important than others (Payne & Landry, 2005). Some stakeholders are more important than others because the positive or negative effects of actions done upon various stakeholders can differ greatly. The magnitude of harm that might be felt by the stakeholder if the data mining process was not executed in an ethical fashion could be great. For example, the public image of the firm collecting the data can be significantly harmed if consumers feel that the collection of the data was an egregious violation of their privacy.
On the other hand, stakeholders such as professional associations monitoring the use of data mining or future financial backers of a project may not suffer the same degree of harm: in this case, the data miner might be less likely to be perceived as the bad guy in data mining efforts.

Business Benefits and Ethical Burdens of Data Mining

There are large numbers of uses to which business entities put data mining. These uses correspond to the many benefits accruing therefrom. Commercial information services have been in existence for decades, providing services regarding financial information (O’Harrow, 2004). Use of data mining in this area has arguably reduced the number of financial failures among unsuitable borrowers. Another benefit of data mining is that derived from better marketing practices. ‘Data mining and direct marketing are beneficial to the business community because they enable businesses to identify more accurately the target audience for their product or service, thereby reducing marketing costs’ (Morse & Morse, 2002, p. 77). Corporations can use mathematical and statistical techniques to determine salient behavior patterns that were previously hidden in large databases compiled by the business (Markoff, 1999; Morse & Morse, 2002).

Figure 1 Pressures Felt by Stakeholders of Data Mining Techniques (adapted from Payne and Landry, 2005). Figure labels: users of data mining; the community; society at large; current and future financial backers of the project; professional associations; governmental regulatory agencies; suppliers of information; competitors; shareholders of all firms involved; personal standards of employers/marketing professionals; customers and clients; employees of all firms involved; management of firms using data mining techniques.

According to Morse and Morse (2002), there are two major benefits that accrue as a result of data mining.
First, the discovery of relevant data and behavior patterns allows marketers to better learn and understand the interests and purchasing behavior of consumers. This benefit is the foundation of the second benefit: the more accurate knowledge marketers have about their consumers, the less money they will have to spend either in identifying the consumer or in identifying the consumer’s consumption habits or patterns. ‘[B]usinesses can save money on marketing while increasing their customer base substantially’ (Cary et al., 2003, p. 158).

While businesses are using data mining for improving service, detecting fraud, analyzing scientific information, and, inter alia, managing human resources, it can be concluded that vast amounts of data, including personal information, can be collected, organized, and manipulated easily in the firm’s efforts to uncover hidden consumption patterns and predict future trends and results. Thus, although there are many significant benefits to be derived from data mining, there are also serious drawbacks to unlimited or unregulated use of data mining. Incorrect conclusions can be drawn from data, data can be used for other than the original purposes for which they were collected and privacy rights can be violated. Additionally, data mining can create inferences that reveal information that the subject of the information does not want or choose to reveal: information that could be harmful if known to the wrong person. Further, costs of data mining must include those attributable to the improper or even incorrect collection and use of the data. ‘Data mining is not carried out with scientific rigor. The quality or randomness of the original data is not strictly verified and therefore the significance of inferences drawn from the data must be in question (Cary et al., 2003, p.
161).’

The potential costs of data mining and text mining can be substantial and unexpected. These costs can be both tangible and intangible and result from consumer opposition to data mining practices that may be seen as unethical. However, this is not a new issue. For example, Lotus Corporation had to abandon a potentially lucrative product called Households when public outrage about privacy right violations caused their stock to fall. Households was designed to search for customers’ profiles according to data provided by Equifax, a credit reporting agency. The public forced the company to scrap the product because they believed that their privacy rights were being violated (Invasion of privacy, 1993).

Concerns over privacy rights violation in general give rise to two more potentially costly problems of data mining. First, the subject of the information search does not have control over his own data, yet it is he that will suffer from the inaccurate or incorrect assessment of the data. Second, the gatherer of the information must give heed to his responsibility to keep the gathered information secure. ‘If companies are going to gather and infer sensitive data about individuals they must make a reasonable effort to protect it from unauthorized access and from unethical use by employees or outside agents (Cary et al., 2003, p. 162).’ This problem is compounded by the use of mobile devices to connect to data repositories within the enterprise and by more sophisticated attacks from around the world (Higgins, 2007), both of which provide conduits to steal data. These problems are encompassed in the larger question of privacy rights of those about whom the information is gathered. This raises the question: how can the technology of data mining be used without violating privacy rights?

The increased use of data mining raises many concerns regarding privacy.
There is a growing concern among consumers that the right to privacy is being eroded by the increased sophistication of data collection and mining practices by both corporations and government entities. Exacerbating the consumers’ concern is the question of ownership of the consumers’ personal data. These ethical questions need to be identified and addressed by businesses and the government whenever they create new applications of data mining. This essay focuses on business rights and responsibilities surrounding data mining, rather than governmental uses of data mining. Business is confronted by many ethical questions in the use of data mining. These questions should be considered at all stages of the process, from when the original data is collected to when the insights gained from data mining are put to use.

Companies today are gaining more of the consumer’s personally identifiable information (PII) without the average consumer even being aware of the collection or transfer of this data. An efficient method of data collection is through the use of the Internet. Technology has made it possible to electronically monitor a person’s interests, beliefs, purchasing habits, the kind of people they talk to and the type of lifestyle they lead. One method of data collection on the Internet uses cookies to assign customer identification numbers. This identification number is linked automatically with the company database. A cookie is placed in each individual’s computer every time he visits a website. Cookies track website destination and the frequencies with which sites are visited. Additionally, at all purchase sites, including the tangible ones at the mall, electronic databases record and track purchases, consumer names, addresses, credit card information and so on. Registration forms are being used through the Internet as well in the search for information.
The information provided by the consumer in the registration form is kept, stored, retrieved and used as deemed needful by the data warehouse or purchaser of the information from the data warehouse (Morse & Morse, 2002).

Ethical concerns arise through the use of these data collection methods. Depending on the method of data collection, the individual is not aware that he is subjecting himself to being monitored, he has not been asked for his consent to the search nor does he know where this information will go. Some companies collect information only for the government and some sell the information to other companies. To further intensify the problem, the individual is not given any choice about future uses of the data that he provides.

Several ethical issues are inherent in the collection and mining of personal data: privacy, consent, ownership, and security. Many consumers feel that their privacy is violated when the provision of information is a requirement for purchasing and that information is utilized in ways to which they did not explicitly consent. The companies claim that the information they are gathering is a public good gathered in a public sphere and that therefore privacy is not being violated (Cary et al., 2003). However, the question is whether the information derived from the data is private. Such information about the customer is not actually supplied by the customer (Wahlstrom & Roddick, 2000). In addition, the information may have been required in order for the customer to make purchases and that practice alone raises privacy issues. Further, firms do not make a sufficient effort to inform the user of the current or future uses of his data. The user or customer is not provided the opportunity to provide informed consent. Consumers may not be aware that the company may combine the approved information with public information and prior information gathered to create a profile of the customer.
Another concern is in the type of data being collected. Some types of personal information are seen as being more sensitive than others (Cary et al., 2003; Wahlstrom & Roddick, 2000). Many consumers are unaware that credit history, financial information, employment history, and possibly some medical information are routinely sold. They would be equally surprised to know that they had unknowingly abdicated their ownership rights by allowing cookies to reside in their computers or by completing required registration forms. Privacy rights have been linked to many other rights deemed to be essential to the development of a well-balanced individual and society (Levine, 2003). These rights then must be given the appropriate level of respect. Again, the question remains as to what actually is the appropriate level of respect.

One final ethical issue surrounding data mining is the security of the data collected and mined. If anyone is going to gather and infer sensitive data about individuals, they must make a reasonable effort to protect it from unauthorized access and from unethical use by employees or outside agents. The questions that arise here are: how well are the privacy laws written and how well is the law actually followed?

Ethical Data Mining Suggestions and Strategies

There are two categories of tools that can be used to assure the appropriate use of data mining: technological tools and managerial tools of business ethics. Technological tools include things like anonymity tools and security measures like data encryption that prevent or secure information from being used without proper consent. This paper focuses on the other category of measures that can be taken to make sure that data are mined and used in an ethically and societally appropriate manner, the managerial tools.
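Although technological tools fall outside the managerial focus of this paper, a brief sketch may make the idea of an anonymity tool more concrete. The Python example below is an assumption-laden illustration, not a method described in the sources cited here: it replaces a customer identifier with a keyed hash (HMAC-SHA256 from the standard library) and drops direct identifiers before a record is passed on for mining. The field names, the sample record and the key are hypothetical, and keyed hashing of this kind is pseudonymization rather than full anonymity, since anyone holding the key can recompute the mapping.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be generated randomly and
# stored separately from the data being mined.
SECRET_KEY = b"replace-with-a-randomly-generated-key"


def pseudonymize(customer_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(key, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_identity(record: dict) -> dict:
    """Return a reduced copy of a purchase record for analysis.

    Direct identifiers (name, address) are dropped; the customer id is
    replaced by a pseudonym so purchases by one person can still be linked.
    """
    return {
        "customer": pseudonymize(record["customer_id"]),
        "item": record["item"],
        "amount": record["amount"],
    }


if __name__ == "__main__":
    # Invented sample record, for illustration only.
    raw = {
        "customer_id": "C-1001",
        "name": "Jane Doe",
        "address": "1 Example St",
        "item": "yogurt",
        "amount": 3.49,
    }
    print(strip_identity(raw))
```

Such a measure addresses only part of the security concern raised above; the purchase details that remain may still allow inferences about an individual, which is why the managerial tools discussed next are still needed.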
Three proposals to preserve privacy rights are reviewed here. The first proposal, Montana’s (2001), is more legalistic in nature, guiding firms in the development of policies designed to prevent legal or public image problems associated with privacy violations. The second (Cary et al., 2003) and third (Raiborn & Payne, 1990) approaches are more general in a managerial sense, providing guidelines which, if followed, should not only preserve the legal and public image integrity of the firm, but should also enhance the firm’s ability to defend its actions ethically.

All three approaches towards fairly and legally using data mining as a business tool can ultimately be described as having three main thrusts: customer orientation, adherence to sound ethical principles, and adherence to sound legal principles. These thrusts are relatively straightforward. The customer orientation stresses the importance of keeping the consumer happy by preserving his legal and perceived privacy rights. The adherence to sound ethical and legal principles provides the firm the foundation to defend itself, again, both ethically and legally, in the event of some question of its use of data mining. Finally, drawing on the proposals to legally and ethically mine data, this paper presents a code of ethics that should be applicable in any situation, including those fraught with questions of privacy rights.

Privacy was a sensitive issue long before the advent of computers. Concerns have been magnified, however, by the existence and widespread use of large computer databases that make it easy to compile a dossier about an individual from many different data sources. Privacy issues are further exacerbated (by how easy it has become) for new data to be automatically collected and added to databases (Cranor, 1999, p. 29).
Providing a pragmatic approach to the privacy problems generated by the use of data mining techniques, Montana (2001) suggests that the firm follow a strategy consisting of five actions to avoid legal trouble or public image damage. First, particularly in the customer orientation, the firm considering using data mining should consider the expectations of the persons whose information they are collecting and/or using. Cavalier disregard for the consuming public’s expectations is likely to lead to dissatisfaction, a growing refusal to be used in this manner and a backlash of public policy in the form of the development of new and more restrictive laws governing the collection and/or use of data. Montana’s second suggestion is one of adherence to sound ethical principles. It is for the firm to develop a privacy policy that clearly and immediately explains to the user whose information is collected what the information will be used for and what it will not be used for. This suggestion incorporates advice about the collection of consent from the user to use the information: it should not be such a draconian consent or registration form that the reader is likely to give consent to a use he does not expect without even realizing it. The third suggestion put forth by Montana has attributes of adherence to both ethical and legal standards. A firm should not resort to complicated, legal-Latin-style language and policies about the collection and use of the data. Confusing the consumer with legalese designed to provide loopholes for the firm is again likely to lead to an unpleasant backlash of consumer anger. Montana’s final two suggestions are grounded in adherence to sound legal principle. Montana’s fourth suggestion is that the firm fully understand its responsibility to abide by the law as it relates to the nature of the information collected and/or used.
For example, there are stringent laws in place protecting the privacy of medical and financial information; the firm must respect the person’s privacy rights as legally protected privacy rights. Finally, it is suggested that firms collecting and/or using data maintain constant vigilance with regard to the law and public opinion. Legally and managerially, compliance with what is legislatively and societally mandated is a business necessity. Montana’s five suggestions are shown in Table 1.

Table 1 Montana’s Suggestions for Fair and Legal Use of Data Mining

Principle/orientation   Suggestions
Customer                Consider consumer expectation:
                        • Respect consumer view of what is proper or improper use of personal data.
Ethics                  Develop a privacy policy:
                        • Clearly and immediately state what the firm will do with the information collected
                        • Do not use draconian measures to create legal loopholes
                        Follow internal practices honestly and without ‘splitting legal hairs:’
                        • Arcane legal machinations will annoy consumers
Legal                   If information used is the subject of legal protection, adhere strictly to the privacy law:
                        • Personal medical and financial information is inviolate
                        Follow changes in law and public opinion:
                        • Law and attitude change rapidly

Cary et al. (2003, p. 163) have developed a strategy containing ten practices that would aid in the legal and ethical development and use of data mining systems:

The power and sensitivity of public opinion [. . .] dictates that corporations act to self-regulate their practices related to the handling and use of personal data. Following existing laws and regulations alone will not be enough to protect a corporation from the risk of damage from a negative public perception of their practices.
The authors here clearly understand that compliance with the letter of the law alone is not sufficient to prevent consumer dissatisfaction at the least and legal action at the most. The first strategic category for the development of an ethical data mining system is that pertaining to the customer. The first suggestion in this category is that the firm consider the expectations of the customer when beginning a new project that requires data collection or use. It is immaterial whether the firm is actually complying with the law if the consumer feels that his privacy rights have been violated. The development of a customer-oriented privacy policy is a closely related suggestion: consumers believe that to divulge private information is their choice and, as such, their wishes with regard to privacy should be respected. The firm should alert the consumer as to the uses to which the information will be put so that the consumer can then make a better informed choice about disclosing the information. Finally, the customer driven principle requires that the firm give more control to the consumer over what happens with the information collected. To achieve this, full disclosure and honesty are vital when gathering the information itself and when obtaining the consumer’s consent to use the information subsequently.

Cary et al. (2003) also present suggestions that fall into the strategic category or principle of ethics. There are several of these suggestions. First, the spirit of the privacy policy must be followed, not just the letter of the policy or the letter of the law. ‘[R]egardless of the depth or breadth of a legal(istic) code, every immoral or illegal behavior cannot be proscribed. Thus, the spirit of the law is always broader than the letter of the law’ (Raiborn & Payne, 1990, p. 17).
Second, the quality of the source data should be checked: data mining of wrong or inaccurate data can cause serious harm to reputations that could lead to other types of harm. As a custodian of the information, the firm arguably has a fiduciary duty not to disseminate incorrect or false information. Additionally, a corporate code of conduct should be developed to establish appropriate standards for practices and treatment of consumers. For example, such a code can prevent potential harm from accruing in the first place from the dissemination of inaccurate information. Finally, the firm should perform an ethical audit of the uses to which its data is put. This audit can help identify any ethical or legal concerns that may arise when data is used in new or questionable ways. It can protect the firm from public outrage or legal action by providing proactive guidance to prevent problems from happening.

There are three legal principles projected by Cary et al. (2003) as shown in Table 2. First, the firm should research and understand all laws that may pertain to its activities, especially the law concerning information that may be considered sensitive, like financial or medical information. Additionally, legal procedural matters should be on the firm’s radar screen: there may be federal, state, local, and even international law that impacts the legality of the use of certain information in data mining operations. A second suggestion with regard to the legal principle of data mining policy is the requirement that the firm stay current on new legal and public policy developments, as well as new attitudes towards the collection and use of data. Finally, access to the data warehouse is of paramount interest here.
Although, after the terrorist attacks of 2001, the application of the Fourth Amendment’s search and seizure provisions has shifted to reflect a greater need of the government to know of criminal activity, there are still well-established and recognized legal protections for privacy. Thus, the security of the data storage should be under constant surveillance.

Table 2 Suggestions by Cary et al. for Fair and Legal Use of Data Mining
Principle/orientation and suggestions

Customer
  Consider consumer expectation:
  • The company should be able to predict consumer perception about potential privacy violations
  Develop a customer-oriented privacy policy:
  • Provide the consumer with a privacy policy meant to adequately inform the consumer about the current and future uses of data
  Give consumers more control over their information:
  • Consumers with more control over their data will perceive a greater control over and respect for their privacy

Ethics
  Follow the spirit of the privacy policy:
  • Abiding by the spirit of the policy engenders a relationship of honesty and trust important to the consumer
  Evaluate the quality of the source data:
  • Use of accurate information is vital to the correct dissemination of data
  • Intentional harm cannot accrue as a result of the dissemination of correct information
  Develop a corporate code of conduct:
  • Acceptable standards for use of information and treatment of consumers are accessible for use
  Perform an ethics audit to identify new uses of data mining:
  • Review potential risks of new uses of data mining with regard to legal and ethical ideals

Legal
  Research and understand laws/legal procedures surrounding data mining:
  • Privacy laws must be followed
  • Federal, state, local and international legal procedure should be honored

The third approach utilized here is that designed by Raiborn and Payne (1990); they designed a methodology for creating a corporate code of ethics that was comprehensive, clear, and enforceable. Using the standards of behavior and the values they suggested, it is possible to create a workable code of ethics. It should also work to help establish a good data mining policy. The four values presented in their model are integrity, justice, competence, and utility. The value of integrity implies that one will act with sincerity, good faith, and honesty. The value of justice requires that fairness and equity are incorporated into the decision-making process. Competence imposes the responsibility to be capable, knowledgeable, and competent in the execution of one’s actions. Finally, the value of utility requires that one have a complete understanding of the elements involved in decision-making and that social utility is a consideration.

The four standards of behavior are the theoretical, the practical, the currently attainable, and the basic standard. The theoretical standard reflects the highest standard of ethical behavior: this is the spirit of morality. The second level of ethical behavior is the practical level; it is the acknowledgement that the highest level of ethical attainment may not be possible in the world we live in. It reflects the use of extreme diligence in ethical decision-making: the decision-maker should be as ethical as possible in the circumstances. The level of currently attainable ethical behavior recognizes that society, through the idea of public policy, has certain minimum requirements for morality; here, the decision-maker does not strive to achieve heights of moral behavior but merely seeks to satisfy the basic societal moral standard of behavior. Finally, the basic standard of behavior is that of the basic legal standard of behavior; in this instance, the spirit of the law is not a part of the potential solution. Table 3 depicts the manner in which the Raiborn and Payne model can be adapted to the issues of data mining.

Table 3 Raiborn and Payne’s (1990) Codal Provisions Adapted to the Categories of Consumer, Ethical and Legal Strategies for Ethical Data Mining
Principle/orientation and suggestions

Customer
  • The data collected, stored and shared should be accurate

Ethics
  Integrity at the theoretical or practical level:
  • Customers should be fully informed that the data mined on them is being mined
  • Customers should knowingly and willingly consent to such data mining
  Utility at the theoretical or practical level:
  • The data mined/utilized should be utilized for reasonable purposes
  • The data mined/utilized should be utilized for the purposes for which it was intended to be used

Legal
  Justice at the theoretical or practical level:
  • Fairness and equity are to be incorporated into adherence to the law

Impacts on Knowledge and Learning

Organizations need to be faithful stewards of data by acknowledging the benefits and issues of creating and maintaining knowledge repositories such as data mining. The data needs to be collected in an honest and ethical manner. The data must then be maintained in such a way that its integrity is preserved and it is safe from leakage. However, just having the data collected in an honest and ethical manner and protected from prying eyes is not enough.
Organizations need to have procedures in place where organizational learning can take place and data and information can be synthesized into knowledge. This synthesis can take many forms, such as learning who our customers are, why our customers do business with us and why they stop doing business with us. It can also examine and learn from the behaviors of customers and others in the data repositories. Table 4 provides a summary of the three ethical frameworks discussed so far. The three principles or orientations examine the Customer, Ethics, and Legal implications for data mining.

Table 4 A Composite Strategy for the Legal and Ethical Use of Data Mining
Principle/orientation, with the suggestions of each framework

Customer
  Montana: Consider consumer expectation
  Cary et al.: Consider consumer expectation; develop a customer-oriented privacy policy; give consumers more control over their information
  Raiborn and Payne: Competence at the theoretical or practical level

Ethics
  Montana: Develop a privacy policy; follow internal practices honestly
  Cary et al.: Follow the spirit of the privacy policy; evaluate the quality of the source data; develop a corporate code of conduct; perform an ethics audit to identify new uses of data mining
  Raiborn and Payne: Integrity at the theoretical or practical level; utility at the theoretical or practical level

Legal
  Montana: Adhere strictly to the privacy law; follow changes in law and public opinion
  Cary et al.: Research and understand laws/legal procedures surrounding data mining
  Raiborn and Payne: Justice at the theoretical or practical level

Are there things that the organization can do, or behaviors that the organization may engage in that could persuade customers to give more meaningful data? The opposite is also true. There may be behaviors that the firm may engage in that cause customers not to give meaningful data.
The key for the firm is to learn what these activities are and position data gathering activities in such a way that meaningful data are collected. Once this knowledge is acquired, it must be shared across the organization and permeate the organizational culture, establishing what are and are not best practices for data mining.

Conclusion

It is not enough just to develop data mining strategies that include the informational benefits and the techniques used. The strategy should also ensure that both legal compliance and ethical behavior in the collection and use of data include consideration of three main principles: a customer orientation, adherence to the spirit of the law, and adherence to the letter of the law. Businesses and consumers can both have their interests served, and served well and with integrity, through the appropriate use of data mining techniques. The organization must also examine who the stakeholders are and how they are affected by the data collection and usage within the system. The strategies reviewed in this paper, the Montana model, the Cary et al. model, and the Raiborn and Payne model, all suggest that, within the proper legal, ethical, and managerial structures, data mining and the information collected thereby can be used in a way that is legal, respectful of people’s privacy and advantageous to the business firm using the information.

References

Agosta, L. (2004). Data mining is dead – long live predictive analytics! Information Management, 14(1), 37.
Cary, C., Wen, H. J., & Mahatanankoon, P. (2003). Data mining: Consumer privacy, ethical policy, and systems development practices. Human Systems Management, 22(4), 157–168.
Cranor, L. F. (1999). Internet privacy. Communications of the ACM, 42(2), 28–31.
FindLaw. (2012). Data mining. http://dictionary.findlaw.com/definition/data-mining.html
Herschel, G. (2008). Magic quadrant for customer data-mining applications.
http://www.spss.com.hk/PDFs/Gartner_Magic_Quadrant.pdf
Higgins, K. J. (2007, November 5). Zombies, bots take a bite out of sensitive business data. InformationWeek, p. 38.
Invasion of privacy: When is access to customer information foul – or fair? (1993). Harvard Business Review, 71(5), 154–155.
Levine, P. (2003). Information technology and the social construction of information privacy: Comment. Journal of Accounting and Public Policy, 22(3), 281–285.
Marakas, G. M. (1999). Decision support systems in the 21st century. Upper Saddle River, NJ: Prentice Hall.
Markoff, J. (1999). The privacy debate: Little brother and the buying and selling of consumer data. Upside, 11(4), 94–106.
Meidan, A. (2011). Data mining for forensic investigators. Internal Auditing, 26(1), 26–29.
Montana, J. C. (2001). Data mining: A slippery slope. Information Management Journal, 35(4), 50–54.
Morse, J., & Morse, S. (2002). Teaching temperance to the ‘Cookie Monster:’ Ethical challenges to data mining and direct marketing. Business and Society Review, 107(1), 76–97.
O’Harrow, R., Jr. (2004, October 15). Privacy eroding, bit by byte. Washington Post, p. E1.
Olson, D. L. (2007). Data mining in business services. Service Business, 1(3), 181–193.
Payne, D., & Landry, B. J. L. (2005). Similarities in business and IT professional ethics: The need for and development of a comprehensive code of ethics. Journal of Business Ethics, 62(1), 73–85.
Raiborn, C. A., & Payne, D. (1990). Corporate codes of conduct: A collective conscience and continuum. Journal of Business Ethics, 9, 879–889.
Two Crows Consulting. (2012). Glossary of data mining terms. http://twocrows.com/data-mining/dm-glossary/
Wahlstrom, K., & Roddick, J. F. (2000, November). On the impact of knowledge discovery and data mining.
Paper presented at the Second Australian Institute of Computer Ethics Conference, AiCE2000, Canberra, Australia.
Wood-Harper, A. T., Corder, S., Wood, J. R. G., & Watson, H. (1996). How we profess: The ethical systems analyst. Communications of the ACM, 39(3), 69–78.

Dinah Payne, Professor of Management, has been licensed by the Louisiana Bar Association since 1986 and has been at the University of New Orleans since 1988. Her teaching and research interests are in the fields of business ethics, domestic and international law and management. She has participated in many international seminars both as speaker and student. She has received teaching, research and service awards. She has been published in many journals, including the Journal of Business Ethics, Communications of the ACM, Labor Law Journal, the Journal of Corporate Accounting and Finance, the Journal of Developmental Entrepreneurship and Global Focus. dmpayne@uno.edu

Brett J. L. Landry is the Ellis Endowed Chair of Technology Management, Associate Professor and Director of the Center for Cybersecurity Education at the University of Dallas. Landry joined the University of Dallas in the Fall of 2006 following six years of teaching at the University of New Orleans. Over the last twenty years, he has worked in the area of information security in the public and private sectors and has published numerous journal articles, conference proceedings, and book chapters on IT, higher education, and cybersecurity. Landry also holds numerous industry security certifications such as Certified Information Systems Security Professional (CISSP), Certified Ethical Hacker (CEH), Certified Information Systems Auditor (CISA) and Certified in Risk and Information Systems Control (CRISC). blandry@udallas.edu

This paper is published under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) License (http://creativecommons.org/licenses/by-nc-nd/3.0/).

Volume 1, Issue 1, 2012