Results 1–10 of 32
Multitask Learning
 Machine Learning
, 1997
"... Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task ..."
Abstract

Cited by 481 (6 self)
 Add to MetaCart
Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better. This paper reviews prior work on MTL, presents new evidence that MTL in backprop nets discovers task relatedness without the need for supervisory signals, and presents new results for MTL with k-nearest neighbor and kernel regression. In this paper we demonstrate multitask learning in three domains. We explain how multitask learning works, and show that there are many opportunities for multitask learning in real domains. We present an algorithm and results for multitask learning with case-based methods like k-nearest neighbor and kernel regression, and sketch an algorithm for multitask learning in decision trees. Because multitask learning works, can be applied to many different kinds of domains, and can be used with different learning algorithms, we conjecture there will be many opportunities for its use on real-world problems.
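The shared-representation mechanism described in the abstract can be sketched in a few lines: a toy network with one hidden layer shared by two related regression tasks, where the error signals from both task heads update the shared weights jointly. The data, layer sizes, and learning rate below are invented for illustration; this is a minimal sketch of the idea, not the paper's experimental setup.

```python
import math
import random

random.seed(0)

# Toy data: two related regression tasks defined over the same inputs.
data = [(random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)) for _ in range(40)]
targets = [(x0 + x1, x0 - x1) for x0, x1 in data]

# Shared input->hidden weights W; one linear head v[t] per task.
W = [[random.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(2)]  # 2 inputs x 3 hidden
v = [[random.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(2)]  # 2 tasks x 3 hidden
lr = 0.1

def predict(x):
    h = [math.tanh(sum(x[i] * W[i][j] for i in range(2))) for j in range(3)]
    return h, [sum(v[t][j] * h[j] for j in range(3)) for t in range(2)]

def mse():
    total = 0.0
    for x, y in zip(data, targets):
        _, p = predict(x)
        total += sum((p[t] - y[t]) ** 2 for t in range(2))
    return total / len(data)

before = mse()
for _ in range(300):
    for x, y in zip(data, targets):
        h, p = predict(x)
        e = [p[t] - y[t] for t in range(2)]
        # The shared layer receives error signals from BOTH tasks ...
        dh = [sum(e[t] * v[t][j] for t in range(2)) for j in range(3)]
        # ... while each head is updated only by its own task's error.
        for t in range(2):
            for j in range(3):
                v[t][j] -= lr * e[t] * h[j]
        for i in range(2):
            for j in range(3):
                W[i][j] -= lr * dh[j] * (1 - h[j] ** 2) * x[i]
after = mse()
```

Training on both tasks at once drives the shared layer toward features useful to each, which is the inductive-transfer effect the abstract describes.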
On Local Optima in Learning Bayesian Networks
, 2003
"... This paper proposes and evaluates the kgreedy equivalence search algorithm (KES) for learning Bayesian networks (BNs) from complete data. The main characteristic of KES is that it allows a tradeoff between greediness and randomness, thus exploring different good local optima when run repeatedly. W ..."
Abstract

Cited by 17 (4 self)
 Add to MetaCart
This paper proposes and evaluates the k-greedy equivalence search algorithm (KES) for learning Bayesian networks (BNs) from complete data. The main characteristic of KES is that it allows a trade-off between greediness and randomness, thus exploring different good local optima when run repeatedly.
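The greediness/randomness trade-off can be illustrated with a generic k-greedy local search over a toy score function, which stands in here for a Bayesian-network scoring metric. The sampling rule and landscape below are assumptions for illustration only, not the exact KES procedure over equivalence classes of BN structures.

```python
import random

random.seed(1)

def score(state):
    # Toy landscape with many local optima: reward set bits,
    # penalize adjacent set pairs.
    return sum(state) - 2 * sum(state[i] and state[i + 1]
                                for i in range(len(state) - 1))

def neighbours(state):
    # Single-bit flips play the role of local structure moves.
    for i in range(len(state)):
        s = list(state)
        s[i] = 1 - s[i]
        yield tuple(s)

def k_greedy_search(k, n=8):
    """Among the score-improving neighbours, sample a fraction k of them
    and move to the best of the sample. k=1.0 recovers greedy search;
    small k injects randomness so repeated runs can reach different
    local optima."""
    state = tuple(random.randint(0, 1) for _ in range(n))
    while True:
        better = [s for s in neighbours(state) if score(s) > score(state)]
        if not better:
            return state, score(state)          # local optimum reached
        sample = random.sample(better, max(1, int(k * len(better))))
        state = max(sample, key=score)

# Repeated randomized runs can end in different local optima.
optima = {k_greedy_search(k=0.3)[1] for _ in range(20)}
```

Each move strictly improves the bounded score, so the search always terminates in a local optimum; varying k trades exploitation against exploration.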
Knowledge discovery in telecommunication services data using Bayesian Models
 In Proceedings of the First International Conference on Knowledge Discovery (KDD-95)
, 1993
"... Fraud and uncollectible debt are multibillion dollar problems in the telecommunications industry. Because it is difficult to know which accounts will go bad, we are faced with the difficult knowledgediscovery task of characterizing a rare binary outcome using large amounts of noisy, highdimension ..."
Abstract

Cited by 17 (1 self)
 Add to MetaCart
Fraud and uncollectible debt are multi-billion-dollar problems in the telecommunications industry. Because it is difficult to know which accounts will go bad, we are faced with the difficult knowledge-discovery task of characterizing a rare binary outcome using large amounts of noisy, high-dimensional data. Binary characterizations may be of interest but will not be especially useful in this domain. Instead, proposing an action requires an estimate of the probability that a customer or a call is uncollectible. This paper addresses the discovery of predictive knowledge bearing on fraud and uncollectible debt using a supervised machine learning method that constructs Bayesian network models. The new method is able to predict rare event outcomes and cope with the quirks and copious amounts of input data. The Bayesian network models it produces serve as an input module to a normative decision-support system and suggest ways to reinforce or redirect existing efforts in the problem area. We compare the performance of several conditionally independent models with the conditionally dependent models discovered by the new learning system using real-world datasets of 46 million records and 603 800 million bytes.
Text Analysis for Constructing Design Representations
 Artificial Intelligence in Engineering
, 1997
"... Abstract. An emerging model in concurrent product design and manufacturing is the federation of workgroups across traditional functional “silos. ” Along with the benefits of this concurrency comes the complexity of sharing and accessing design information. The primary challenge in sharing design inf ..."
Abstract

Cited by 15 (2 self)
 Add to MetaCart
An emerging model in concurrent product design and manufacturing is the federation of workgroups across traditional functional “silos.” Along with the benefits of this concurrency comes the complexity of sharing and accessing design information. The primary challenge in sharing design information across functional workgroups lies in reducing the complex expressions of associations between design elements. Collaborative design systems have addressed this problem from the perspective of formalizing a shared ontology or product model. We share the perspective that the design model and ontology are an expression of the “meaning” of the design and provide a means by which information sharing in design may be achieved. However, in many design cases, formalizing an ontology before the design begins, establishing the knowledge sharing agreements, or mapping out the design hierarchy is potentially more expensive than the design itself. This paper introduces a technique for inducing a representation of the design based upon the syntactic patterns contained in the corpus of design documents. The association between the design and the representation for the design is captured by basing the representation on terminological patterns in the design text. In the first stage, we create a “dictionary” of noun phrases found in the text corpus based upon a measurement of the content-carrying power of the phrase. In the second stage, we cluster the words to discover inter-term dependencies and build a Bayesian belief network which describes a conceptual hierarchy specific to the domain of the design. We integrate the design document learning system with an agent-based collaborative design system for fetching design information based on the “smart drawings” paradigm.
Bayesian belief network model for the safety assessment of nuclear computer-based systems. Second year report part 2, Esprit Long Term Research Project 20072 DeVa
, 1998
"... The formalism of Bayesian Belief Networks (BBNs) is being increasingly applied to probabilistic modelling and decision problems in a widening variety of fields. This method provides the advantages of a formal probabilistic model, presented in an easily assimilated visual form, together with the read ..."
Abstract

Cited by 12 (5 self)
 Add to MetaCart
The formalism of Bayesian Belief Networks (BBNs) is being increasingly applied to probabilistic modelling and decision problems in a widening variety of fields. This method provides the advantages of a formal probabilistic model, presented in an easily assimilated visual form, together with the ready availability of efficient computational methods and tools for exploring model consequences. Here we formulate one BBN model of a part of the safety assessment task for computer- and software-based nuclear systems important to safety. Our model is developed from the perspective of an independent safety assessor who is presented with the task of evaluating evidence from disparate sources: the requirement specification and verification documentation of the system licensee and of the system manufacturer; the previous reputation of the various participants in the design process; knowledge of commercial pressures; information about tools and resources used; and many other sources. Based on these multiple sources of ...
Bayesian Belief Networks for Data Mining
 University of Magdeburg
, 1996
"... In this paper we present a novel constraint based structural learning algorithm for causal networks. A set of conditional independence and dependence statements (CIDS) is derived from the data which describes the relationships among the variables. Although we implicitly assume that there exist ..."
Abstract

Cited by 11 (0 self)
 Add to MetaCart
In this paper we present a novel constraint-based structural learning algorithm for causal networks. A set of conditional independence and dependence statements (CIDS) is derived from the data which describes the relationships among the variables. Although we implicitly assume that there exists a perfect map for the true, yet unknown, distribution, there does not need to be a perfect map for the CIDSs derived from the limited data. The reason is that the distribution of limited data might differ from the true probability distribution due to sampling noise. We derive a necessary condition for the existence of a perfect map given a set of CIDSs and utilize it to check for inconsistencies. If an inconsistency is detected, the algorithm finds all Bayesian networks with a minimum number of edges such that a maximum number of CIDSs is represented in each of the multiple solutions. The advantages of our approach are illustrated using the ALARM network data set.
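One common way such (in)dependence statements are derived from data is by estimating conditional mutual information from counts and thresholding it. The synthetic data, threshold, and estimator below are illustrative assumptions, not the paper's specific test.

```python
import random
from collections import Counter
from math import log

random.seed(2)

def noisy(bit, flip=0.1):
    # Copy a bit, flipping it with small probability.
    return bit ^ (random.random() < flip)

# Synthetic samples where, by construction, X and Y depend on each
# other only through Z, while W depends on X directly.
samples = []
for _ in range(3000):
    z = random.randint(0, 1)
    x = noisy(z)
    y = noisy(z)
    w = noisy(x)
    samples.append((x, y, z, w))

def cmi(i, j, k):
    """Empirical conditional mutual information I(V_i; V_j | V_k) in nats."""
    n = len(samples)
    c_xyz, c_xz, c_yz, c_z = Counter(), Counter(), Counter(), Counter()
    for s in samples:
        x, y, z = s[i], s[j], s[k]
        c_xyz[(x, y, z)] += 1
        c_xz[(x, z)] += 1
        c_yz[(y, z)] += 1
        c_z[z] += 1
    return sum(nxyz / n * log(nxyz * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)]))
               for (x, y, z), nxyz in c_xyz.items())

# Near zero for "X independent of Y given Z"; clearly positive for X and W.
independent_cmi = cmi(0, 1, 2)
dependent_cmi = cmi(0, 3, 2)
```

Sampling noise keeps the "independent" estimate slightly above zero, which is exactly why a set of CIDSs read off finite data need not admit a perfect map.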
Learning from Aggregate Views
"... In this paper, we introduce a new class of data mining problems called learning from aggregate views. In contrast to the traditional problem of learning from a single table of training examples, the new goal is to learn from multiple aggregate views of the underlying data, without access to the una ..."
Abstract

Cited by 6 (1 self)
 Add to MetaCart
In this paper, we introduce a new class of data mining problems called learning from aggregate views. In contrast to the traditional problem of learning from a single table of training examples, the new goal is to learn from multiple aggregate views of the underlying data, without access to the unaggregated data. We motivate this new problem, present a general problem framework, develop learning methods for RFA (Restriction-Free Aggregate) views defined using COUNT, SUM, AVG and STDEV, and offer theoretical and experimental results that characterize the proposed methods.
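As one illustrative instance of the setting (not the paper's method): per-class COUNT, AVG, and STDEV views happen to be sufficient statistics for a Gaussian naive Bayes classifier, so a predictive model can be fit without ever touching the unaggregated rows. The view contents below are made up.

```python
import math

# Hypothetical aggregate views, one row per class:
# class -> (COUNT, [AVG per feature], [STDEV per feature])
views = {
    "good": (900, [1.0, 5.0], [0.5, 1.0]),
    "bad":  (100, [3.0, 2.0], [0.8, 1.5]),
}
total = sum(count for count, _, _ in views.values())

def log_gauss(x, mu, sd):
    # Log-density of a normal distribution with mean mu and std dev sd.
    return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def classify(x):
    # Gaussian naive Bayes: class prior from COUNT, per-feature
    # likelihoods from AVG and STDEV -- all read from the views.
    def log_post(cls):
        count, avgs, sds = views[cls]
        return math.log(count / total) + sum(
            log_gauss(xi, mu, sd) for xi, mu, sd in zip(x, avgs, sds))
    return max(views, key=log_post)
```

A point near the "good" aggregates, e.g. `classify([1.1, 4.8])`, is assigned to "good" despite the model never seeing a single raw record.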
A Bayesian network scoring metric that is based on globally uniform parameter priors
 Proceedings of Uncertainty in Artificial Intelligence
, 2002
"... We introduce a new Bayesian network (BN) scoring metric called the Global Uniform (GU) metric. This metric is based on a particular type of default parameter prior. Such priors may be useful when a BN developer is not willing or able to specify domainspecific parameter priors. The GU parameter prio ..."
Abstract

Cited by 6 (1 self)
 Add to MetaCart
We introduce a new Bayesian network (BN) scoring metric called the Global Uniform (GU) metric. This metric is based on a particular type of default parameter prior. Such priors may be useful when a BN developer is not willing or able to specify domain-specific parameter priors. The GU parameter prior specifies that every prior joint probability distribution P consistent with a BN structure S is considered to be equally likely. Distribution P is consistent with S if P includes just the set of independence relations defined by S. We show that the GU metric addresses some undesirable behavior of the BDeu and K2 Bayesian network scoring metrics, which also use particular forms of default parameter priors. A closed-form formula for computing GU for special classes of BNs is derived. Efficiently computing GU for an arbitrary BN remains an open problem.
Performance Evaluation of Compromise Conditional Gaussian Networks for Data Clustering
, 2001
"... This paper is devoted to the proposal of two classes of compromise conditional Gaussian networks for data clustering as well as to their experimental evaluation and comparison on synthetic and realworld databases. According to the reported results, the models show an ideal tradeoff between eciency ..."
Abstract

Cited by 5 (3 self)
 Add to MetaCart
This paper is devoted to the proposal of two classes of compromise conditional Gaussian networks for data clustering as well as to their experimental evaluation and comparison on synthetic and real-world databases. According to the reported results, the models show an ideal trade-off between efficiency and effectiveness, i.e., a balance between the cost of the unsupervised model learning process and the quality of the learnt models. Moreover, the proposed models are very appealing due to their closeness to human intuition and computational advantages for the unsupervised model induction process, while preserving a rich enough modelling power.
Learning Dynamic Bayesian Networks from Multivariate Time Series with Changing Dependencies
 In Proc. 5th Intelligent Data Analysis Conference (IDA 2003)
, 2003
"... Abstract. Many examples exist of multivariate time series where dependencies between variables change over time. If these changing dependencies are not taken into account, any model that is learnt from the data will average over the different dependency structures. Paradigms that try to explain unde ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
Many examples exist of multivariate time series where dependencies between variables change over time. If these changing dependencies are not taken into account, any model that is learnt from the data will average over the different dependency structures. Paradigms that try to explain underlying processes and observed events in multivariate time series must explicitly model these changes in order to allow non-experts to analyse and understand such data. In this paper we have developed a method for generating explanations in multivariate time series that takes into account changing dependency structure. We make use of a dynamic Bayesian network model with hidden nodes. We introduce a representation and search technique for learning such models from data and test it on synthetic time series and real-world data from an oil refinery, both of which contain changing underlying structure. We compare our method to an existing EM-based method for learning structure. Results are very promising for our method and we include sample explanations, generated from models learnt from the refinery dataset.