Symbolic knowledge extraction from trained neural networks: A sound approach
, 2001
Cited by 44 (6 self)
Although neural networks have shown very good performance in many application domains, one of their main drawbacks lies in the incapacity to provide an explanation for the underlying reasoning mechanisms. The "explanation capability" of neural networks can be achieved by the extraction of symbolic knowledge. In this paper, we present a new method of extraction that captures nonmonotonic rules encoded in the network, and prove that such a method is sound. We start by discussing some of the main problems of knowledge extraction methods. We then discuss how these problems may be ameliorated. To this end, a partial ordering on the set of input vectors of a network is defined, as well as a number of pruning and simplification rules. The pruning rules are then used to reduce the search space of the extraction algorithm during a pedagogical extraction, whereas the simplification rules are used to reduce the size of the extracted set of rules. We show that, in the case of regular networks, the extraction algorithm is sound and complete. We proceed to extend the extraction algorithm to the class of nonregular networks, the general case. We show that nonregular networks always contain regularities in their subnetworks. As a result, the underlying extraction method for regular networks can be applied, but now in a decompositional fashion. In order to combine the sets of rules extracted from each subnetwork into the final set of rules, we use a method whereby we are able to keep the soundness of the extraction algorithm. Finally, we present the results of an empirical analysis of the extraction system, using traditional examples and real-world application problems. The results have shown that a very high fidelity between the extracted set of rules and the network can be achieved....
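The pruning idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single threshold unit with non-negative weights and binary inputs in {0, 1}, and the function names are hypothetical. Under those assumptions the unit's output is monotone in the componentwise ordering of input vectors, so once a vector fires, every larger vector is known to fire and need not be queried.

```python
from itertools import product


def unit_fires(weights, bias, x):
    """Threshold unit: fires when the weighted sum plus bias is positive."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0


def leq(a, b):
    """Componentwise partial order on input vectors."""
    return all(ai <= bi for ai, bi in zip(a, b))


def extract_minimal_firing_inputs(weights, bias):
    """Return the minimal firing input vectors and the number of queries made.

    Assumption (not from the source): all weights are non-negative, so firing
    is monotone in the partial order and supersets of a firing vector can be
    pruned without querying the unit.
    """
    minimal = []
    queries = 0
    # Visit vectors from fewest to most active inputs, so minimal ones come first.
    for x in sorted(product((0, 1), repeat=len(weights)), key=sum):
        if any(leq(m, x) for m in minimal):
            continue  # pruned: implied by a smaller vector that already fires
        queries += 1
        if unit_fires(weights, bias, x):
            minimal.append(x)
    return minimal, queries


minimal, queries = extract_minimal_firing_inputs([1, 1, 1], -1.5)
print(minimal, queries)
```

Each minimal vector can be read as a rule: with inputs named a, b, c, the vector (0, 1, 1) corresponds to "b and c imply the output". In this toy case the unit is queried on 7 of the 8 possible inputs; the saving grows when small vectors already fire.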
Decision Graphs - An Extension of Decision Trees
, 1993
Cited by 35 (1 self)
In this paper, we examine Decision Graphs, a generalization of decision trees. We present an inference scheme to construct decision graphs using the Minimum Message Length Principle. Empirical tests demonstrate that this scheme compares favourably with other decision tree inference schemes. This work provides a metric for comparing the relative merit of the decision tree and decision graph formalisms for a particular domain.
1 Introduction
In this paper, we examine the problem of inferring a decision procedure from a set of examples. We examine the decision graph [5, 1, 16, 15, 14], a generalization of the decision tree [3, 18], and propose a method to construct decision graphs based upon Wallace's Minimum Message Length Principle (MMLP) [24, 10, 25]. The MMLP is related to Rissanen's Minimum Description Length Principle (MDLP) [21, 22, 20]. For the reader unfamiliar with minimum encoding methods (MML and MDL), a good introduction to the area is given by Georgeff [10]. We formalize ...
Combining Prior Symbolic Knowledge And Constructive Neural Network Learning
 Connection Science
, 1993
Cited by 24 (8 self)
The concepts of knowledge-based systems and machine learning are combined by integrating an expert system and a constructive neural network learning algorithm. Two approaches are explored: embedding the expert system directly and converting the expert system rule base into a neural network. This initial system is then extended by constructively learning additional hidden units in a problem-specific manner. Experiments performed indicate that generalization of a combined system surpasses that of each system individually.
Authors: Justin Fletcher (jfletche@eecs.wsu.edu), Zoran Obradović (zoran@eecs.wsu.edu), School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164-2752.
Contact: Dr. Zoran Obradović, (509) 335-6601, FAX: (509) 335-3818.
The Connectionist Inductive Learning and Logic Programming System
, 1999
Cited by 22 (6 self)
This paper presents the Connectionist Inductive Learning and Logic Programming System (CIL²P). CIL²P is a new massively parallel computational model based on a feedforward Artificial Neural Network that integrates inductive learning from examples and background knowledge, with deductive learning from Logic Programming. Starting with the background knowledge represented by a propositional logic program, a translation algorithm is applied generating a neural network that can be trained with examples. The results obtained with this refined network can be explained by extracting a revised logic program from it. Moreover, the neural network computes the stable model of the logic program inserted in it as background knowledge, or learned with the examples, thus functioning as a parallel system for Logic Programming. We have successfully applied CIL²P to two real-world problems of computational biology, specifically DNA sequence analyses. Comparisons with the results obtained by some of the main neural, symbolic, and hybrid inductive learning systems, using the same domain knowledge, show the effectiveness of CIL²P.
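To make "computes the stable model of the logic program" concrete, here is a minimal symbolic sketch of the fixed-point computation that such a network is described as performing. It is not the CIL²P translation algorithm and builds no neural network. It assumes a propositional program whose rules are given as (head, positive body, negated body) triples and that is acyclic, so that repeatedly applying the immediate-consequence operator settles on a fixed point; the names `tp_step` and `iterate_tp` are hypothetical.

```python
def tp_step(program, interp):
    """One application of the immediate-consequence operator T_P.

    A rule fires when all its positive body atoms are true in `interp`
    and none of its negated body atoms are.
    """
    return frozenset(
        head
        for head, pos, neg in program
        if pos <= interp and not (neg & interp)
    )


def iterate_tp(program, max_steps=100):
    """Iterate T_P from the empty interpretation until it stops changing.

    Assumption: the program is acyclic, so a fixed point is reached.
    The step limit guards against programs that oscillate instead.
    """
    interp = frozenset()
    for _ in range(max_steps):
        nxt = tp_step(program, interp)
        if nxt == interp:
            return interp
        interp = nxt
    raise RuntimeError("no fixed point reached within max_steps")


# Example program:  a <- b, not c.   b <- .   d <- c.
program = [
    ("a", {"b"}, {"c"}),
    ("b", set(), set()),
    ("d", {"c"}, set()),
]
print(sorted(iterate_tp(program)))
```

For this program the iteration passes through {b} and then settles on {a, b}: c is never derived, so the negated literal in the first rule holds and d stays false.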
Extraction of logical rules from backpropagation networks
 Neural Processing Lett
, 1998
"... networks ..."
CN2-MCI: A Two-Step Method for Constructive Induction
, 1994
Cited by 12 (1 self)
Methods for constructive induction perform automatic transformations of description spaces if representational shortcomings deteriorate the quality of learning. In the context of concept learning and propositional representation languages, feature construction algorithms have been developed in order to improve the accuracy and to decrease the complexity of hypotheses. Particularly, so-called hypothesis-driven constructive induction (HCI) algorithms construct new attributes based upon the analysis of induced hypotheses. A new method for constructive induction, CN2-MCI, is described that applies a single, new constructive operator (o ) in the usual HCI framework to achieve a more fine-grained analysis of decision rules. o uses a cluster algorithm to map selected features into a new binary feature. Given training examples as input, CN2-MCI computes an inductive hypothesis expressed in terms of the transformed representation. Although this paper presents work in progress, early empirica...
Improved phishing detection using model-based features
 In Fifth Conference on Email and Anti-Spam, CEAS
, 2008
Cited by 11 (2 self)
Phishing emails are a real threat to internet communication and the web economy. Criminals try to convince unsuspecting online users to reveal passwords, account numbers, social security numbers or other personal information. Filtering approaches using blacklists are not completely effective because a new phishing scam is created about every minute. We investigate the statistical filtering of phishing emails, where a classifier is trained on characteristic features of existing emails and is subsequently able to identify new phishing emails with different contents. We propose advanced email features generated by adaptively trained Dynamic Markov Chains and by novel latent Class-Topic Models. On a publicly available test corpus, classifiers using these features are able to reduce the number of misclassified emails by two thirds compared to previous work. Using a recently proposed, more expressive evaluation method, we show that these results are statistically significant. In addition, we successfully tested our approach on a non-public email corpus with a real-life composition.
Learning a Neural Tree
 Proceedings International Joint Conference on Neural Networks
, 1992
Cited by 10 (1 self)
A method to learn neural trees is proposed in this paper. Not only the weights of the network connections but also the structure of the whole network, including the number of neurons and the interconnections among the neurons, are learned from the training set by our method. Issues about the optimization and pruning of the generated networks are investigated. Initial test results, comparisons with other learning methods and several possible applications are also discussed.
1 Introduction
It is well known that neural networks can learn connection weights from examples. However, is it possible to learn a neural network structure from examples? Structures of neural networks are usually designed by human experts. It is quite tricky to choose a good structure to fit the learning task at hand. This makes the advantages of connectionist learning much less attractive. Furthermore, it turns out that the structure of a neural network is closely related to its final performance in som...
Compression-Based Feature Subset Selection
 In Proceedings of the IJCAI-95 Workshop on Data Engineering for Inductive Learning
, 1995
Cited by 9 (0 self)
Irrelevant and redundant features may reduce both predictive accuracy and comprehensibility of induced concepts. Most common Machine Learning approaches for selecting a good subset of relevant features rely on cross-validation. As an alternative, we present the application of a particular Minimum Description Length (MDL) measure to the task of feature subset selection. Using the MDL principle allows taking into account all of the available data at once. The new measure is information-theoretically plausible and yet still simple and therefore efficiently computable. We show empirically that this new method for judging the value of feature subsets is more efficient than and performs at least as well as methods based on cross-validation. Domains with both a large number of training examples and a large number of possible features yield the biggest gains in efficiency. Thus our new approach seems to scale up better to large learning problems than previous methods. 1 This research is spons...
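The abstract does not state the MDL measure itself, so the following is only an illustrative two-part score of the same general kind, not the paper's measure. It assumes discrete features, treats a feature subset as a lookup table from value combinations to classes, and charges bits for the table (model cost) plus bits for the class labels given the table (data cost). All names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log2


def description_length(rows, labels, subset):
    """Two-part code length in bits for a feature subset (illustrative only).

    Model cost: one class entry per distinct value combination of the subset.
    Data cost: empirical entropy of the labels within each combination,
    multiplied by the number of examples falling into it.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row[i] for i in subset)][label] += 1
    model_bits = len(groups) * log2(len(set(labels)))
    data_bits = 0.0
    for counts in groups.values():
        total = sum(counts.values())
        data_bits -= sum(c * log2(c / total) for c in counts.values())
    return model_bits + data_bits


def best_subset(rows, labels):
    """Exhaustively pick the subset with the shortest description length."""
    n = len(rows[0])
    candidates = (s for k in range(n + 1) for s in combinations(range(n), k))
    return min(candidates, key=lambda s: description_length(rows, labels, s))


# Feature 0 determines the class; feature 1 is irrelevant.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 2
labels = [0, 0, 1, 1] * 2
print(best_subset(rows, labels))
```

Every example contributes to the score at once, with no held-out folds, which is the contrast with cross-validation that the abstract draws. The exhaustive search is only feasible for a handful of features; a practical system would pair the score with a heuristic search.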
CIPF 2.0: A Robust Constructive Induction System
 Proceedings of ML-COLT'94
, 1994
Cited by 7 (1 self)
We describe CIPF 2.0, a propositional constructive learner which is able to cope with both noise and representation mismatch in training examples simultaneously. CIPF 2.0's abilities stem from coupling the robust selective learner C4.5 (and its production rule generator) with a sophisticated constructive induction component. An important new general constructive operator incorporated into CIPF 2.0 is the simplified Kramer operator which abstracts combinations of two attributes into a single new boolean attribute. The so-called Minimum Description Length (MDL) principle acts as a powerful control heuristic guiding the search in the possibly vast representation space.
1 INTRODUCTION
When learning concept descriptions from pre-classified examples, simple concept learners typically make strong assumptions about the way these examples are represented. For a concept to be learnable, its examples must populate one or a few regions of the hypothesis space expressible in the description languag...