Recent Advances in Clustering: A Brief Survey
WSEAS Transactions on Information Science and Applications, 2004
Cited by 12 (0 self)
Abstract: Unsupervised learning (clustering) deals with instances that have not been pre-classified in any way and so have no class attribute associated with them. The aim of applying clustering algorithms is to discover useful but unknown classes of items. Unsupervised learning is an approach to learning in which instances are automatically placed into meaningful groups based on their similarity. This paper introduces the fundamental concepts of unsupervised learning and surveys recent clustering algorithms. It also describes recent advances in unsupervised learning, such as ensembles of clustering algorithms and distributed clustering.
A Decision-Theoretic Approach to Data Mining
2003
Cited by 5 (1 self)
In this paper, we develop a decision-theoretic framework for evaluating data mining systems that employ classification methods, in terms of their utility in decision-making. The decision-theoretic model provides an economic perspective on the value of "extracted knowledge," in terms of its payoff to the organization, and suggests a wide range of decision problems that arise from this point of view. The relation between the quality of a data mining system and the amount of investment that the decision maker is willing to make is formalized. We propose two ways by which independent data mining systems can be combined and show that the combined data mining system can be used in the decision-making process of the organization to increase payoff. Examples are provided to illustrate the various concepts, and several ways by which the proposed framework can be extended are discussed.
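The abstract does not give the paper's formulas, so the following is only a generic illustration of valuing a classifier by payoff rather than by accuracy. It assumes a hypothetical joint probability table `joint[p][s]` (system predicts `p`, true state is `s`) and a hypothetical payoff table `payoff[a][s]` (action `a` taken in state `s`). The function name and both tables are illustrative and are not taken from the paper.

```python
def expected_payoff(joint, payoff):
    """Return (total expected payoff, best action for each prediction).

    joint[p][s]  : probability that the system predicts p and the true state is s
    payoff[a][s] : payoff of taking action a when the true state is s
    Both tables are hypothetical inputs chosen for illustration.
    """
    total = 0.0
    policy = []
    for row in joint:
        # Payoff of each action for this prediction, weighted by the joint probabilities.
        scores = [sum(pr * pay for pr, pay in zip(row, action_row))
                  for action_row in payoff]
        best = max(range(len(scores)), key=scores.__getitem__)
        policy.append(best)
        total += scores[best]
    return total, policy


# Two states, two actions: action 0 = act, action 1 = do nothing.
payoff = [[10, -5], [0, 0]]
with_system, policy = expected_payoff([[0.4, 0.1], [0.1, 0.4]], payoff)
# Without the system there is a single "prediction" carrying only the prior.
without_system, _ = expected_payoff([[0.5, 0.5]], payoff)
value_of_system = with_system - without_system
```

Under these made-up numbers, the difference between the two totals is one way to read "the value of extracted knowledge": it bounds what a decision maker in this toy setting would rationally invest in the system.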
Modeling the Dermoscopic Structure Pigment Network Using a Clinically Inspired Feature Set
Cited by 3 (3 self)
Abstract. We present a method to detect and classify the dermoscopic structure pigment network which may indicate early melanoma in skin lesions. We locate the network as darker areas constituting a mesh, as well as lighter areas representing the ‘holes’ which the mesh surrounds. After identifying the lines and holes, 69 features inspired by the clinical definition are derived and used to classify the network into one of two classes: Typical or Atypical. We validate our method over a large, inclusive, real-world dataset consisting of 436 images and achieve an accuracy of 82% discriminating between three classes (Absent, Typical or Atypical) and an accuracy of 93% discriminating between two classes (Absent or Present).
Data Mining: How Research Meets Practical Development?
Knowl. Inform. Syst., 1998
Cited by 2 (0 self)
At the 2001 IEEE International Conference on Data Mining, held in San Jose, California, from November 29 to December 2, 2001, there was a panel discussion on how data mining research meets practical development. One of the motivations for organizing the panel was to provide useful advice to practitioners in industry who are exploring directions for data mining development.
Evaluating the Utility of Web-Based Consumer Support Tools Using Rough Sets
In Proc. International Conference on Conceptual Structures (ICCS), Rough Set and Data Mining Workshop, 2007
Cited by 1 (1 self)
Abstract. Many popular e-commerce sites provide decision support tools to assist potential customers. Preliminary research indicates that web usage mining analyses may help to assess the utility of these tools and highlight possible areas for improvement. This paper describes a new procedure for assessing the utility of web-based support tools using techniques in rough set theory. The authors evaluated this procedure in a study of two such support tools, one developed by the US-EPA and the other developed by one of the authors. Results provided interesting insights on the utility of both tools and indicated that both tools could be improved. Details of the new procedure, results obtained from its evaluation, and its implications for future work are described.
A Survey of Open Source Data Mining Systems
Cited by 1 (1 self)
Abstract. Open source data mining software represents a new trend in data mining research, education and industrial applications, especially in small and medium enterprises (SMEs). With open source software an enterprise can easily initiate a data mining project using the most current technology. Often the software is available at no cost, allowing the enterprise to instead focus on ensuring their staff can freely learn the data mining techniques and methods. Open source ensures that staff can understand exactly how the algorithms work by examining the source code, if they so desire, and can also fine tune the algorithms to suit the specific purposes of the enterprise. However, diversity, instability, scalability and poor documentation can be major concerns in using open source data mining systems. In this paper, we survey open source data mining systems currently available on the Internet. We compare 12 open source systems against several aspects such as general characteristics, data source accessibility, data mining functionality, and usability. We discuss advantages and disadvantages of these open source data mining systems.
Neural Network Tool for Data Mining: SOM Toolbox
2000
The Self-Organizing Map (SOM) is an unsupervised neural network that combines vector quantization and vector projection, which makes it a powerful visualization tool. SOM Toolbox implements the SOM in the Matlab 5 computing environment. In this paper, the computational complexity of the SOM and the applicability of the Toolbox are investigated. The Toolbox is easily applicable to small data sets (under 10000 records) but can also be applied to medium-sized data sets. The prime limiting factor is map size: the Toolbox is mainly suitable for training maps with 1000 map units or less.
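To make the vector-quantization idea concrete, here is a minimal sketch of online SOM training in plain Python. It is not the SOM Toolbox (which is Matlab code) and does not reproduce its API. It assumes a 1-D chain of units, a Gaussian neighborhood, and linearly decaying learning rate and radius; all function names, parameters, and defaults are illustrative.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, n_units=5, epochs=20, lr0=0.5, radius0=2.0, seed=0):
    """Train a 1-D SOM (a chain of units) with online updates.

    Every name and default here is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    # Initialize each unit from a randomly chosen data sample.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = 1.0 - t / steps
            lr = lr0 * frac                     # decaying learning rate
            radius = max(radius0 * frac, 0.5)   # shrinking neighborhood
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Units near the winner on the chain move more strongly toward x.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
            t += 1
    return weights


data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
som = train_som(data)
```

In this sketch every training step visits every unit, so the cost of one pass over the data grows with the number of map units times the data dimension. That is consistent with map size being a practical limit, although the Toolbox's actual training algorithms may differ.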
A Proposed Architecture for Implementing a Knowledge Management System in the Brazilian National Cancer Institute
Rio de Janeiro, RJ, Brazil. Because their services are based decisively on the collection, analysis and exchange of clinical information or knowledge, within and across organizational boundaries, knowledge management has exceptional application and importance to health care organizations. This article proposes a conceptual framework for a knowledge management system, which is expected to support both hospitals and the oncology network in Brazil. In this holistic single-case study, multiple sources of data were triangulated by means of archival records, documents and participant observation, as two of the authors were serving as INCA staff members, thus gaining access to the event and its documentation and being able to perceive reality from an insider's point of view. The benefits derived so far from the ongoing implementation are: (i) faster cancer diagnosis and enhanced quality of both diagnosis and data used in epidemiological studies; (ii) reduction in treatment costs; (iii) relief of INCA's labor shortage; (iv) improved management performance; (v) better use of installed capacity; (vi) easier large-scale transfer of explicit knowledge among the members of the network; and (vii) increased organizational capacity for knowledge retention (institutionalization of procedures). Key words: knowledge management; information system; health care; hospital management.
IMDC: An Image-Mapped Data Clustering Technique for Large Datasets
Abstract: In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Hence, the algorithm avoids exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces similar quality results and outperforms them in execution time and storage requirements.
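The abstract does not say which image processing techniques IMDC uses, so the sketch below only illustrates the general idea of mapping data to a binary image and clustering on the image. It assumes 2-D points, a uniform grid with a hypothetical `cell` size, and connected-component labeling as the image operation. It does not implement the boundary-based reduction the abstract mentions, and `grid_cluster` is an illustrative name, not the authors' algorithm.

```python
from collections import deque


def grid_cluster(points, cell=1.0):
    """Cluster 2-D points by rasterizing them and labeling connected cells.

    Illustrative sketch only; the grid size and the use of 8-connected
    component labeling are assumptions, not details from the paper.
    """
    # Step 1: map each point to a cell of a binary image (occupied or empty).
    cells = {}
    for idx, (x, y) in enumerate(points):
        key = (int(x // cell), int(y // cell))
        cells.setdefault(key, []).append(idx)

    # Step 2: label 8-connected components of occupied cells with a BFS.
    labels = [-1] * len(points)
    seen = set()
    next_label = 0
    for start in cells:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            # Every point in this cell inherits the component's label.
            for idx in cells[(cx, cy)]:
                labels[idx] = next_label
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in cells and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        next_label += 1
    return labels


points = [(0.2, 0.3), (1.1, 0.9), (1.8, 1.5), (10.0, 10.0), (10.5, 11.2)]
labels = grid_cluster(points)
```

The labeling pass touches occupied cells rather than comparing pairs of points, which is the sense in which an image-based approach can avoid an exhaustive search over the data.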

