Shared farthest neighbor approach to clustering of high dimensionality, low cardinality data
Cited by 2 (2 self)
Clustering algorithms are routinely used in biomedical disciplines and are a basic tool in bioinformatics. Depending on the task at hand, the two most popular options are central partitional techniques and Agglomerative Hierarchical Clustering techniques and their derivatives. These methods are well studied and well established. However, both categories have drawbacks, related to data dimensionality (for partitional algorithms) and to the bottom-up structure (for hierarchical agglomerative algorithms). To overcome these limitations, and motivated by the problem of gene expression analysis with DNA microarrays, we present a hierarchical clustering algorithm based on a completely different principle: the analysis of shared farthest neighbors. We present a framework for clustering using ranks and indexes, and introduce the Shared Farthest Neighbors clustering criterion. We illustrate the properties of the method and present experimental results on different data sets, evaluating the clusterings against extrinsic knowledge given by class labels.
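The abstract above does not spell out the Shared Farthest Neighbors criterion, so the following is only a rough sketch of the general idea under stated assumptions: each point ranks the others by decreasing Euclidean distance, and two points are compared by how many of their k farthest points they have in common. The function names, the choice of Euclidean distance, the parameter k and the toy data are all hypothetical and are not taken from the paper.

```python
import math

def farthest_neighbors(points, k):
    """Return, for each point, the set of indices of its k farthest points."""
    result = []
    for i, p in enumerate(points):
        # Rank all other points by decreasing Euclidean distance from p.
        ranked = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
            reverse=True,
        )
        result.append(set(ranked[:k]))
    return result

def shared_farthest_counts(points, k):
    """Count the farthest neighbors that each pair of points has in common."""
    fn = farthest_neighbors(points, k)
    n = len(points)
    return {(i, j): len(fn[i] & fn[j]) for i in range(n) for j in range(i + 1, n)}

# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
counts = shared_farthest_counts(points, k=2)
```

On this toy data, points from the same group tend to share farthest neighbors, while points from different groups share none. A full clustering procedure would still need a rule for turning these counts into a hierarchy, which this sketch does not attempt.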
Determining patient similarity in medical social networks
in Proc. MedEx Workshop
Cited by 1 (0 self)
In social networks, people's primary concern is to find others who share similar interests. For medical systems, this means finding people who have similar symptoms or comparable diseases. Here, a simple matching of variables would yield very few identical cases, and determining similarity would usually fail because most factors are categorical. Such problems arise in particular for cancer patients. We have developed a system that can determine similarity in terms of the survival time distribution. Through a similarity-based search, our approach makes it possible to identify related patients, and thus to recommend contacts of interest. We present the theoretical foundation as well as a use case scenario with existing data mining software.
unknown title
In this paper, a CBR approach that deals with missing data is presented. In the conversational ISOR system, different knowledge sources are involved, including medical experts. The case base stores rules and formulae that support the restoration of numerous missing values. The task is to restore missing values in an observed medical data set. The presented method is applied to a set of physiological and biochemical measurements of dialysis patients. The measurements were taken at four time points during a year in which the patients participated in a specially developed physical training program. Restoring the missing values is necessary before the obtained data can be analysed.
Operational Support in Fish Farming through Case-based Reasoning
Farmed fish is the third biggest export in Norway (around NOK 30 billion/€3.82 billion/US$5.44 billion in 2010), and large fish farms have biomass worth around NOK 150 million/€19.38 million/US$26.72 million. Several processes are automated (e.g. the feeding system), and sensory logging systems are becoming ubiquitous. Still, the key to successful management of a site is the operational knowledge possessed by the fish farmers. In most cases, this information is not stored formally. A more systematic way to capture, store and reuse this knowledge is called for. We present a system that employs case-based reasoning (CBR) for such knowledge management, combined with sensor data and numerical models. The CBR system will ultimately be the core part of a decision support system for regional managers surveying fish farming sites. Data is acquired from multiple fish farms, spanning several years. We present recent results in testing how well the CBR system finds similar cases. An important part of this test is the evaluation of three different methods for case retrieval (kNN, linear programming for setting feature weights,
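The abstract names kNN as one of the case retrieval methods evaluated. As an illustration only, a minimal weighted kNN retrieval over a case base might look like the sketch below. The feature names (water temperature, oxygen level), the case records, the weights and the helper names are hypothetical and are not taken from the paper.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def retrieve(case_base, query, weights, k):
    """Return the k stored cases whose features are closest to the query."""
    ranked = sorted(
        case_base,
        key=lambda case: weighted_distance(case["features"], query, weights),
    )
    return ranked[:k]

# Hypothetical cases: (water temperature in °C, oxygen level in mg/L).
case_base = [
    {"name": "A", "features": (8.0, 9.5)},
    {"name": "B", "features": (12.0, 7.0)},
    {"name": "C", "features": (8.5, 9.0)},
]
nearest = retrieve(case_base, query=(8.2, 9.4), weights=(1.0, 1.0), k=2)
nearest_names = [case["name"] for case in nearest]
```

Here the weights are fixed by hand. Learning them, for instance with the linear programming approach the abstract mentions, would replace the constant `weights` tuple.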
GRANULAR SUPPORT VECTOR MACHINES BASED ON GRANULAR COMPUTING, SOFT COMPUTING AND STATISTICAL LEARNING
2006
With the emergence of biomedical informatics, Web intelligence, and E-business, new challenges are arising for knowledge discovery and data mining modeling problems. In this dissertation, a framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with a specific focus on binary classification problems. In general, GSVM works in three steps. Step 1 is granulation, which builds a sequence of information granules from the original dataset or from the original feature space. Step 2 is modeling Support Vector Machines (SVM) in some of these information granules when necessary. Finally, Step 3 is aggregation, which consolidates the information in these granules at a suitable level of abstraction. A good granulation method that finds suitable granules is crucial for modeling a good GSVM. Under this framework, many different granulation algorithms including the GSVM-CMW (cumulative margin width) algorithm, the GSVM-AR (association rule mining)
unknown title
Interest in health sciences applications of case-based reasoning (CBR) continues to expand, not only in the traditional CBR in medicine domain, but also in enabling home health care technologies, CBR integration, and synergies between CBR and other artificial intelligence (AI) methodologies. This workshop on CBR in the Health Sciences is the fifth in a series of exciting workshops, the first four of which were held at ICCBR-03, in Trondheim, Norway, at
Article URL
2009
ANMM4CBR: a case-based reasoning method for gene expression data classification
Case-Based Reasoning in a System Architecture for Intelligent Fish Farming
Fish farmers manage assets of considerable value on a daily basis. Many aspects of the daily operation are automated in some way, such as the feeding system. Sensory equipment is steadily becoming cheaper and more ubiquitous, yielding data that can be used by automated systems and for post-processing (i.e. data mining) to discover hidden trends in the data. However, a lot of information is only known informally by the fish farmers themselves, through years of experience. Companies that can store this information and reuse it will have an advantage; even more so if high-level human expertise can be linked to low-level sensor data. This paper presents early developments of a system that stores this informal knowledge using case-based reasoning, combined with corresponding sensor data.
Incremental Development of an Explanation Model for Exceptional Dialyse Patients
Our starting points are situations where neither a well-developed theory nor reliable knowledge nor a proper case base is available. So, instead of reliable theoretical knowledge and intelligent experience, we have just some theoretical hypotheses and a set of measurements. In this paper, we propose combining CBR with another method; for our specific problem, this is a statistical model. We use CBR to explain those cases that do not fit the statistical model. The case base has to be set up incrementally. It contains the exceptional cases, and their explanations are the solutions, which can be used to help explain further exceptional cases.
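The abstract does not say which statistical model is used, so the following sketch simply assumes an ordinary least-squares line as a stand-in and flags as exceptional any case whose residual exceeds a chosen threshold. The data, the threshold and the function names are hypothetical. The flagged cases are the ones that would then be added to the case base together with their explanations.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def exceptional_cases(xs, ys, threshold):
    """Return indices of cases whose absolute residual exceeds the threshold."""
    slope, intercept = fit_line(xs, ys)
    return [
        i
        for i, (x, y) in enumerate(zip(xs, ys))
        if abs(y - (slope * x + intercept)) > threshold
    ]

# Hypothetical measurements: y follows 2 * x except for one deviating case.
xs = [float(x) for x in range(10)]
ys = [2.0 * x for x in xs]
ys[5] = 30.0

flagged = exceptional_cases(xs, ys, threshold=5.0)

# The case base grows incrementally: each flagged case awaits an explanation.
case_base = {i: {"x": xs[i], "y": ys[i], "explanation": None} for i in flagged}
```

The threshold is a free parameter in this sketch. In practice it would have to be chosen with domain experts, since it decides which patients count as exceptional.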