Instance-based learning: Nearest neighbor with generalization (1995)

by B Martin

Results 1 - 8 of 8

Applications of Data Mining in Constraint-Based Intelligent Tutoring Systems

by Karthik Nilakant (supervisor: Dr A. Mitrovic), 2004
Cited by 2 (0 self)
This report describes an investigation into the use of data mining processes, with respect to student interaction with Intelligent Tutoring Systems (ITSs). In particular, a framework for the analysis of constraint-based tutors is developed. The framework, which involves three phases (collection, transformation and analysis), is implemented to extract patterns in student interaction with an ITS called SQL-Tutor. This investigation identifies alternative techniques in each of the three phases, and discusses the advantages and disadvantages of each approach. The report also highlights a number of key knowledge areas in which the mining process can be used to find rules, relationships and patterns. A range of typical findings from an existing tutoring system are used to illustrate each of these knowledge areas. It is envisaged that the knowledge that is extracted using these techniques will be used to improve the tutoring systems.

Steady State Contingency analysis of electrical networks using machine learning techniques

by Dimitrios Semitekos, Nikolaos Avouris
Steady state contingency analysis aims at the assessment of the risk certain contingencies may pose to an electrical network. This is a particularly important task for network operators, especially as network stability issues become of prime importance in the current era of electricity deregulation. The article focuses on the analysis of experimental data that are produced through operating point simulation, contingency application, and machine-learning cross validation (based on pre-contingency network index selection algorithms) to point out the “nature” of given contingencies. Experimental statistical results of contingency prediction and selected network state indicators are translated to electric network data in an effort to further interpret the “nature” of each contingency and produce effective predicting algorithms that support operators.

Early Drift Detection Method

by Manuel Baena-García, José del Campo-Ávila, Raúl Fidalgo, Albert Bifet, Ricard Gavaldà, Rafael Morales-Bueno
An emerging problem in Data Streams is the detection of concept drift. This problem is aggravated when the drift is gradual over time. In this work we define a method for detecting concept drift, even in the case of slow gradual change. It is based on the estimated distribution of the distances between classification errors. The proposed method can be used with any learning algorithm in two ways: using it as a wrapper of a batch learning algorithm or implementing it inside an incremental and online algorithm. The experimental results compare our method (EDDM) with a similar one (DDM). The latter uses the error rate instead of the distance-error rate.
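The idea of monitoring the distances between classification errors can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the test statistic, so the score used here (running mean plus two standard deviations of the error distance, compared against its best value so far), the thresholds 0.95 and 0.90, the 30-error warm-up, and the class name `ErrorDistanceMonitor` are all assumptions.

```python
import math


class ErrorDistanceMonitor:
    """Watches the number of examples between consecutive classification errors.

    Assumed rule (not taken from the abstract): when errors start arriving
    closer together, mean + 2 * std of the distance shrinks relative to the
    largest value seen so far, and that ratio signals warning or drift.
    """

    def __init__(self, warning=0.95, drift=0.90, min_errors=30):
        self.warning = warning        # assumed warning threshold
        self.drift = drift            # assumed drift threshold
        self.min_errors = min_errors  # assumed warm-up before testing
        self.since_last_error = 0
        self.num_errors = 0
        self.mean = 0.0
        self.m2 = 0.0                 # sum of squared deviations (Welford)
        self.best = 0.0               # largest score observed so far

    def update(self, is_error):
        """Feed one prediction outcome; returns 'stable', 'warning' or 'drift'."""
        self.since_last_error += 1
        if not is_error:
            return "stable"

        # Distance (in examples) since the previous error.
        distance = self.since_last_error
        self.since_last_error = 0

        # Online update of the mean and variance of the distances.
        self.num_errors += 1
        delta = distance - self.mean
        self.mean += delta / self.num_errors
        self.m2 += delta * (distance - self.mean)
        std = math.sqrt(self.m2 / self.num_errors)

        score = self.mean + 2.0 * std
        self.best = max(self.best, score)
        if self.num_errors < self.min_errors:
            return "stable"

        ratio = score / self.best
        if ratio < self.drift:
            return "drift"
        if ratio < self.warning:
            return "warning"
        return "stable"
```

Used as a wrapper, a learner would call `update(prediction != label)` after each example and retrain (or reset the monitor) once `"drift"` is returned; that retraining policy is likewise an assumption.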

Towards the Development of a Problem Solver for the Monitoring and Control of Instrumentation in a Grid Environment

by Tatiana Kalganova, Russell Taylor, Mujtaba Alsaif
This paper considers the issues involved in developing a generic problem solver to be used within a grid environment for the monitoring and control of instrumentation. The specific feature of such an environment is that the type of data to be processed, as well as the problem, is not always known in advance. Therefore, it is necessary to develop a problem solver architecture that will address this issue. We propose to analyze the performance of the problem solving algorithms available within the WEKA toolkit and determine a decision tree of the best performing algorithm for a given type of data. For this purpose the algorithms have been tested using 51 datasets either drawn from publicly available repositories or generated in a grid-enabled environment.

Categorizing Blogger’s Interests Based on Short Snippets of Blog Posts

by Jiahui Liu, Larry Birnbaum, Bryan Pardo
Blogs have become an important medium for people to express opinions and share information on the web. Predicting the interests of bloggers can be beneficial for information retrieval and knowledge discovery in the blogosphere. In this paper, we propose a two-layer classification model to categorize the interests of bloggers based on a set of short snippets collected from their blog posts. Experiments were conducted on a list of bloggers collected from blog directories, with their snippets collected from Google Blog Search. The results show that the proposed method is robust to errors in the lower level and achieves satisfactory performance in categorizing bloggers’ interests.

Comparative Analysis of Artificial Intelligence Techniques for Goods Classification

by I. Fern, D. Gonzalez, A. Gomez, P. Priore, J. Puente, J. Parreno
In this paper, different methods of inventory classification are compared. The classical ABC methodology discriminates the articles to be classified according to two variables: unitary cost and yearly demand. This paper proposes different methodologies that broaden the analysis over more attributes: Genetic Algorithms, Neural Networks, Tabu Search and several techniques included in the WEKA program developed by the University of Waikato. To check the reliability of the models, the results are compared to the heuristic classification that an expert made of a set of 189 pharmaceutical products considering five input attributes. In addition, an Inventory Generator Program has been used to create five inventories that have been classified by the different algorithms, so that the results obtained by the algorithms could be compared. Keywords: Metaheuristic, genetic algorithm, Tabu Search, Neural Network, WEKA.
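The classical ABC baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: it follows the common convention of ranking items by annual usage value (unit cost times yearly demand) and cutting the cumulative share at 80% and 95%. The function name `abc_classify`, the cut-offs, and the sample items are hypothetical and not taken from the paper.

```python
def abc_classify(items, a_cut=0.80, b_cut=0.95):
    """Assign each item to class 'A', 'B' or 'C'.

    items maps an item name to a (unit_cost, yearly_demand) pair.
    a_cut and b_cut are assumed cumulative-value cut-offs.
    """
    # Annual usage value of each item.
    usage = {name: cost * demand for name, (cost, demand) in items.items()}
    total = sum(usage.values())
    if total == 0:
        # Nothing has any usage value, so nothing is worth prioritising.
        return {name: "C" for name in items}

    labels = {}
    cumulative = 0.0
    # Walk the items from the most to the least valuable.
    for name in sorted(usage, key=usage.get, reverse=True):
        # Share of total value covered by the items ranked above this one.
        share_before = cumulative / total
        if share_before < a_cut:
            labels[name] = "A"
        elif share_before < b_cut:
            labels[name] = "B"
        else:
            labels[name] = "C"
        cumulative += usage[name]
    return labels


# Hypothetical inventory: name -> (unit cost, yearly demand).
sample = {"p1": (70, 100), "p2": (20, 100), "p3": (7, 100), "p4": (3, 100)}
classes = abc_classify(sample)
```

Because the class depends only on the product of the two variables, items that differ in criticality or lead time but share a usage value are indistinguishable, which is the limitation the multi-attribute methods in the paper aim to address.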

On-board Evolutionary Algorithm and Off-line Rule Discovery for Column Formation in Swarm Robotics

Author manuscript, published in "IEEE/ACM/WIC International Conference on Intelligent Agent Technology (2011)"

by Asuki Kouno, Jean-Marc Montanier, Shigeru Takano, Nicolas Bredeche, Einoshin Suzuki, 2011
This paper aims at building autonomous controllers for swarm robots, specifically aimed at enforcing a given shape formation, here a column formation. The proposed approach features two main characteristics. Firstly, a state-of-the-art evolutionary setting is used to achieve the on-board optimization of the controller, avoiding any simulator-based approach. Secondly, as the cost of physical experiments might be prohibitively high for plain evolutionary approaches, a data mining approach is applied on top of evolution; rule discovery is used to discover the most promising regions in the controller search space. The merits of the approach are experimentally validated using a 5 robot formation, showing that the hybrid evolutionary learning process outperforms evolution alone in terms of swarm speed and shape quality.

GA-stacking: Evolutionary stacked generalization

by Agapito Ledezma, Ricardo Aler, Araceli Sanchis, Daniel Borrajo, 2009
Stacking is a widely used technique for combining classifiers and improving prediction accuracy. Early research in Stacking showed that selecting the right classifiers, their parameters and the meta-classifiers was a critical issue. Most of the research on this topic hand picks the right combination of classifiers and their parameters. Instead of starting from these initial strong assumptions, our approach uses genetic algorithms to search for good Stacking configurations. Since this can lead to overfitting, one of the goals of this paper is to empirically evaluate the overall efficiency of the approach. A second goal is to compare our approach with the current best Stacking building techniques. The results show that our approach finds Stacking configurations that, in the worst case, perform as well as the best techniques, with the advantage of not having to manually set up the structure of the Stacking system.
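
The search described in this abstract can be illustrated with a toy genetic algorithm. This sketch is not the GA-stacking encoding from the paper: it assumes a genome that is simply a bit mask over a hypothetical list of candidate base classifiers, and the `fitness` callable is a stand-in for whatever evaluation (for example, validation accuracy of the resulting Stacking system) a real setup would use. The function name `evolve` and all parameter values are assumptions.

```python
import random


def evolve(fitness, n_bits, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Search bit-mask genomes with a simple elitist genetic algorithm.

    fitness: callable scoring a genome (list of 0/1); higher is better.
    n_bits: genome length, e.g. one bit per candidate base classifier.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        # Elitism: the two best genomes survive unchanged.
        next_gen = ranked[:2]
        while len(next_gen) < pop_size:
            # Parents are drawn from the better half of the population.
            mother, father = rng.sample(ranked[:pop_size // 2], 2)
            # One-point crossover.
            cut = rng.randrange(1, n_bits)
            child = mother[:cut] + father[cut:]
            # Bit-flip mutation.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen

    return max(population, key=fitness)
```

In a real experiment the fitness would be measured on data held out from the final evaluation, since, as the abstract notes, searching over configurations can itself lead to overfitting.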