Results 1–10 of 13
A Survey on Transfer Learning
Abstract

Cited by 181 (19 self)
A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding many expensive data-labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression, and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multi-task learning, and sample selection bias, as well as covariate shift. We also explore some potential future issues in transfer learning research.
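The covariate-shift setting named in this abstract is commonly handled by importance weighting: each source example is reweighted by the ratio of target to source density before training. The following is a minimal illustrative sketch (not from the survey itself), assuming hypothetical one-dimensional Gaussian source and target distributions:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def importance_weights(xs, src=(0.0, 1.0), tgt=(1.0, 1.0)):
    """Weight each source example by p_target(x) / p_source(x).
    src and tgt are (mean, std) of the two assumed Gaussians."""
    return [gaussian_pdf(x, *tgt) / gaussian_pdf(x, *src) for x in xs]

# Source examples that look more like target data get larger weights.
weights = importance_weights([0.0, 1.0, 2.0])
```

With these toy densities the weight grows monotonically toward the target mean, so a weighted learner effectively trains on data that resembles the target distribution.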
Modelling gene expression data using dynamic Bayesian networks
, 1999
Abstract

Cited by 157 (1 self)
Recently, there has been much interest in reverse engineering genetic networks from time series data. In this paper, we show that most of the proposed discrete time models — including the Boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of Weaver et al. [WWS99] — are all special cases of a general class of models called Dynamic Bayesian Networks (DBNs). The advantages of DBNs include the ability to model stochasticity, to incorporate prior knowledge, and to handle hidden variables and missing data in a principled way. This paper provides a review of techniques for learning DBNs. Keywords: Genetic networks, Boolean networks, Bayesian networks, neural networks, reverse engineering, machine learning.
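For intuition about the claim that Boolean networks are a special case of DBNs: a Boolean network is the degenerate DBN in which every transition distribution P(X_t | X_{t-1}) puts all its mass on a single next state. A toy sketch with a hypothetical three-gene network (the update rules are invented for illustration, not taken from the paper):

```python
def boolean_network_step(state, rules):
    """One synchronous update of a Boolean network: each gene's next value
    is a deterministic function of the current state. This is the DBN
    special case where P(X_t | X_{t-1}) is a point mass."""
    return tuple(rule(state) for rule in rules)

# Hypothetical rules: g0 <- NOT g2,  g1 <- g0 AND g2,  g2 <- g0 OR g1
rules = [
    lambda s: int(not s[2]),
    lambda s: int(s[0] and s[2]),
    lambda s: int(s[0] or s[1]),
]
state = boolean_network_step((1, 0, 0), rules)
```

A stochastic DBN would replace each deterministic rule with a conditional probability table over the parent genes, which is what lets the general model absorb noise and hidden variables.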
Mapping and revising Markov logic networks for transfer learning
 In Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI)
, 2007
Abstract

Cited by 39 (6 self)
Transfer learning addresses the problem of how to leverage knowledge acquired in a source domain to improve the accuracy and speed of learning in a related target domain. This paper considers transfer learning with Markov logic networks (MLNs), a powerful formalism for learning in relational domains. We present a complete MLN transfer system that first autonomously maps the predicates in the source MLN to the target domain and then revises the mapped structure to further improve its accuracy. Our results in several real-world domains demonstrate that our approach successfully reduces the amount of time and training data needed to learn an accurate model of a target domain, compared to learning from scratch.
Symbolic Interpretation of Artificial Neural Networks
, 1996
Abstract

Cited by 33 (1 self)
Hybrid Intelligent Systems that combine knowledge-based and artificial neural network systems typically have four phases: domain knowledge representation, mapping of this knowledge into an initial connectionist architecture, network training, and rule extraction. The final phase is important because it can provide a trained connectionist architecture with explanation power and validate its output decisions. Moreover, it can be used to refine and maintain the initial knowledge acquired from domain experts. In this paper, we present three rule extraction techniques. The first technique extracts a set of binary rules from any type of neural network. The other two techniques are specific to feedforward networks with a single hidden layer of sigmoidal units. Technique 2 extracts partial rules that represent the most important embedded knowledge with an adjustable level of detail, while the third technique provides a more comprehensive and universal approach. A rule eval...
Tractable probabilistic models for intention recognition based on expert knowledge
 In Intl. Conf. Intel. Rob. Sys
, 2007
Abstract

Cited by 9 (0 self)
Intention recognition is an important topic in human-robot cooperation that can be tackled using probabilistic model-based methods. A popular instance of such methods is Bayesian networks, where the dependencies between random variables are modeled by means of a directed graph. Bayesian networks are very efficient for treating networks with conditionally independent parts. Unfortunately, such independence sometimes has to be constructed by introducing so-called hidden variables with an intractably large state space. An example is human actions, which depend on human intentions and on other human actions. Our goal in this paper is to find models for intention-action mapping with a reduced state space in order to allow for tractable online evaluation. We present a systematic derivation of the reduced model and experimental results of recognizing the intention of a real human in a virtual environment.
Self-aware services: Using Bayesian networks for detecting anomalies in internet-based services
 Northwestern University and Stanford University
, 2001
Abstract

Cited by 8 (3 self)
Keywords: service management, anomaly detection, Bayesian networks, online learning, fault and performance management.
We propose a general architecture and implementation for the autonomous assessment of the health of arbitrary service elements, as a necessary prerequisite to self-control. We describe a health engine, the central component of our proposed ‘Self-Awareness and Control’ architecture. The health engine combines domain-independent statistical analysis and probabilistic reasoning technology (Bayesian networks) with domain-dependent measurement collection and evaluation methods. The resultant probabilistic assessment enables open, non-hierarchical communications about service element health. We demonstrate the validity of our approach using HP's corporate email service, detecting email anomalies: mail loops and a virus attack. We also present the results of applying online machine learning to this architecture and quantify the benefits of the Bayesian network layer.
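As a rough illustration of the kind of domain-independent statistical scoring a health engine like this might perform (a sketch under invented assumptions, not the paper's actual method), anomalies can be flagged by their negative log-likelihood under a learned baseline model, here a hypothetical Poisson rate for a per-minute message count:

```python
import math

def anomaly_score(count, baseline_rate):
    """Surprise of an observed event count under a Poisson baseline:
    -log P(count | rate). Higher scores suggest an unhealthy element."""
    log_p = -baseline_rate + count * math.log(baseline_rate) - math.lgamma(count + 1)
    return -log_p

# A burst of 50 messages against a baseline of 5/minute scores far higher
# than an on-baseline observation; a threshold on this score raises alerts.
burst = anomaly_score(50, 5.0)
normal = anomaly_score(5, 5.0)
```

In a full system such per-metric scores would feed evidence nodes of the Bayesian network, which fuses them into an overall health assessment.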
Integrating Abduction and Induction in Machine Learning
 In Working Notes of the IJCAI-97 Workshop on Abduction and Induction in AI
, 1997
Abstract

Cited by 7 (0 self)
This paper discusses the integration of traditional abductive and inductive reasoning methods in the development of machine learning systems. In particular, the paper discusses our recent work in two areas: 1) The use of traditional abductive methods to propose revisions during theory refinement, where an existing knowledge base is modified to make it consistent with a set of empirical data; and 2) The use of inductive learning methods to automatically acquire from examples a diagnostic knowledge base used for abductive reasoning.
Improving Learning of Markov Logic Networks using Transfer and Bottom-Up Induction
Abstract
Statistical relational learning (SRL) algorithms combine ideas from rich knowledge representations, such as first-order logic, with those from probabilistic graphical models, such as Markov networks, to address the problem of learning from multi-relational data. One challenge posed by such data is that individual instances are frequently very large and include complex relationships among the entities. Moreover, because separate instances do not follow the same structure and contain varying numbers of entities, they cannot be effectively represented as a feature vector. SRL models and algorithms have been successfully applied to a wide variety of domains such as social network analysis, biological data analysis, and planning, among others. Markov logic networks (MLNs) are a recently developed SRL model that consists of weighted first-order clauses. MLNs can be viewed as templates that define Markov networks when provided with the set of constants present in a domain. MLNs are therefore very powerful because they inherit the expressivity of first-order logic. At the same time, MLNs can flexibly deal with noisy or uncertain data to produce probabilistic predictions for a set of propositions. MLNs have also been shown to subsume several other popular SRL models. The expressive power of MLNs comes at a cost: structure learning, or learning the first-order clauses ...
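The "template" view of MLNs described above can be made concrete: grounding substitutes every domain constant for every variable in each weighted clause, and the resulting ground clauses become the features of the induced Markov network. A minimal sketch with a hypothetical clause and constants (illustrative only, not code from the paper):

```python
from itertools import product

def ground_clause(clause_template, variables, constants):
    """Instantiate a first-order clause for every assignment of constants
    to its variables, as an MLN does when building its ground Markov net."""
    groundings = []
    for assignment in product(constants, repeat=len(variables)):
        binding = dict(zip(variables, assignment))
        groundings.append(clause_template.format(**binding))
    return groundings

# Hypothetical weighted clause: Smokes(x) => Cancer(x)
grounded = ground_clause("Smokes({x}) => Cancer({x})", ["x"], ["Anna", "Bob"])
```

Each grounding would share the clause's weight; the number of groundings grows as |constants|^|variables|, which is one reason MLN structure learning is costly.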
Maximizing Theory Accuracy Through Selective Reinterpretation
, 2000
Abstract
Existing methods for exploiting flawed domain theories depend on the use of a sufficiently large set of training examples for diagnosing and repairing flaws in the theory. In this paper, we offer a "universal" method of theory reinterpretation that makes only marginal use of training examples. The idea is as follows: Often a small number of flaws in a theory can completely destroy the theory's classification accuracy. Yet it is clear that valuable information is available even from such flawed theories. For example, an instance with several independent proofs in a slightly flawed theory is certainly more likely to be correctly classified as positive than an instance with only a single proof. This idea can be generalized to a numerical notion of "degree of provedness" which measures the robustness of proofs or refutations for a given instance. This "degree of provedness" can be easily computed using a "soft" interpretation of the theory. Given a ranking of instances based on ...
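One plausible reading of the "degree of provedness" idea sketched in this abstract (a sketch of the general concept, not the paper's exact formula) is a noisy-OR combination of independent proof strengths: several partial proofs jointly score higher than any one of them alone.

```python
def degree_of_provedness(proofs):
    """Soft OR over independent proofs: treat each proof's strength in
    [0, 1] as a probability and combine with noisy-OR, so an instance
    with several partial proofs scores above one with a single proof."""
    score = 1.0
    for p in proofs:
        score *= (1.0 - p)
    return 1.0 - score
```

Ranking instances by this score then lets a threshold classify them even when the underlying theory is slightly flawed.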