
Predicting Equity Returns from Securities Data (1995)

by C. Apte, Se June Hong

Results 1 - 10 of 20

From data mining to knowledge discovery in databases

by Usama Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth - AI Magazine, 1996
Abstract - Cited by 215 (0 self)
Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular real-world applications, specific data-mining techniques, challenges involved in real-world applications of knowledge discovery, and current and future research directions in the field. Across a wide variety of fields, data are...

Extensibility in Data Mining Systems

by Stefan Wrobel, Dietrich Wettschereck, Edgar Sommer, Werner Emde, 1996
Abstract - Cited by 24 (1 self)
The successful application of data mining techniques ideally requires both system support for the entire knowledge discovery process and the right analysis algorithms for the particular task at hand. While there are a number of successful data mining systems that support the entire mining process, they usually are limited to a fixed selection of analysis algorithms. In this paper, we argue in favor of extensibility as a key feature of data mining systems, and discuss the requirements that this entails for system architecture. We identify the points at which existing data mining systems fail to meet these requirements, and then describe a new integration architecture for data mining systems that addresses these problems based on the concept of "plug-ins". KEPLER, our data mining system built according to this architecture, is presented and discussed.
Keywords: data mining, system architecture, extensibility, KEPLER
Introduction: Data Mining, or Knowledge Discovery in Databases (KDD) aims...

R-MINI: An Iterative Approach for Generating Minimal Rules from Examples

by Se June Hong, 1997
Abstract - Cited by 18 (3 self)
Generating classification rules or decision trees from examples has been a subject of intense study in the pattern recognition community, the statistics community and the machine learning community of the artificial intelligence area. We pursue a point of view that minimality of rules is important, perhaps above all other considerations (biases) that come into play in generating rules. We present a new minimal rule generation algorithm called R-MINI (Rule-MINI) that is an adaptation of a well-established heuristic switching function minimization technique, MINI. The main mechanism that reduces the number of rules is repeated application of generalization and specialization operations to the rule set while maintaining completeness and consistency. R-MINI results on some benchmark cases are also presented.
I. Introduction: There are many approaches to generating Disjunctive Normal Form (DNF) rules from examples. The Aq family of rule generation and other approaches [1-4] incrementally c...

Data Mining with Decision Trees and Decision Rules

by Chidanand Apte, Sholom Weiss - Future Generation Computer Systems, 1997
Abstract - Cited by 15 (3 self)
This paper describes the use of decision tree and rule induction in data mining applications. Of methods for classification and regression that have been developed in the fields of pattern recognition, statistics, and machine learning, these are of particular interest for data mining since they utilize symbolic and interpretable representations. Symbolic solutions can provide a high degree of insight into the decision boundaries that exist in the data, and the logic underlying them. This aspect makes these predictive mining techniques particularly attractive in commercial and industrial data mining applications. We present here a synopsis of some major state-of-the-art tree and rule mining methodologies, as well as some recent advances.

Mortgage Data Mining

by George H. John, Ying Zhao - Stanford University, 1997
Abstract - Cited by 12 (0 self)
This paper reports a preliminary investigation of the use of modern data mining tools for mortgage scoring. Using IBM's Intelligent Miner (a data mining toolbox), we built a model of serious delinquency on a sample of data from Mortgage Information Corporation's Loan Performance System, which contains over 20 million loans with a volume of over $1.6 trillion. Currently, two technologies prevail in mortgage scoring: logistic regression, a very old and very simple method, and neural networks, newer and more complex types of models that can be extremely difficult to interpret. The radial basis function (RBF) algorithm in Intelligent Miner combines the mathematical complexity and generality of neural networks with a comprehensible visualization that explains the RBF model. Due to the performance and understandability of the RBF model, as well as other unique technologies not described here, the Intelligent Miner should be a useful tool for mortgage bankers, facilitating development of cus...

RAMP: Rules Abstraction for Modeling and Prediction

by Chidanand Apte, Se June Hong, Jorge Lepre, Seema Prasad, Barry Rosen - IBM Research Division, T. J. Watson Research Center, Yorktown Heights, NY, 1995
Abstract - Cited by 10 (3 self)
IBM Research Division Technical Report RC-20271, January 12, 1996. Generating accurate and robust models is crucial to the successful use and deployment of classifiers on a large scale. Rule induction, i.e., generating decision rule models from data, is often a preferred approach to classification modeling and prediction, due to the enhanced explanatory capability and interpretability of decision rules. The RAMP system for rules abstraction and modeling is evolving with accuracy and robustness as primary goals. The system provides the following key capabilities: 1) feature analysis and selection based upon contextual merits technique, 2) "optimal" discretization of numerical features, 3) generation of m...

Temporal Pattern Recognition in Noisy Non-stationary Time Series Based on Quantization into Symbolic Streams: Lessons Learned from Financial Volatility Trading

by Peter Tino, Christian Schittenkopf, Georg Dorffner - URL http://citeseer.nj.nec.com/tino00temporal.html. (URL accessed on March 30, 2000
Abstract - Cited by 7 (1 self)
In this paper we investigate the potential of the analysis of noisy non-stationary time series by quantizing it into streams of discrete symbols and applying finite-memory symbolic predictors. The main argument is that careful quantization can reduce the noise in the time series to make model estimation more amenable given limited numbers of samples that can be drawn due to the non-stationarity in the time series. As a main application area we study the use of such an analysis in a realistic setting involving financial forecasting and trading. In particular, using historical data, we simulate the trading of straddles on the financial indexes DAX and FTSE 100 on a daily basis, based on predictions of the daily volatility differences in the underlying indexes. We propose a parametric, data-driven quantization scheme which transforms temporal patterns in the series of daily volatility changes into grammatical and statistical patterns in the corresponding symbolic streams. As sy...

Discretization Oriented to Decision Rules Generation

by R. Giraldez, J.S. Aguilar-Ruiz, J.C. Riquelme, F.J. Ferrer-Troyano, D.S. Rodriguez-Baena - Frontiers in Artificial Intelligence and Applications 82, 2002
Abstract - Cited by 6 (3 self)
Many supervised learning algorithms only work with spaces of discrete attributes. Some of the methods proposed in the literature focus on discretization oriented towards the generation of decision rules. This work provides a new discretization algorithm called USD (Unparametrized Supervised Discretization), which transforms the infinite space of the values of the continuous attributes into a finite group of intervals with the purpose of using these intervals in the generation of decision rules, in such a way that these rules do not lose accuracy or goodness. Notably, contrary to other methods, USD doesn't need parameterization.
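
To make the notion of discretization concrete, here is a minimal sketch that maps continuous values onto a finite set of intervals once cut points are given. It does not implement USD, whose procedure for choosing cuts is not described in this abstract; the cut points and the function name `discretize` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

def discretize(value, cuts):
    """Return the index of the interval that contains `value`.

    `cuts` must be sorted ascending; k cuts define k + 1 intervals:
    (-inf, cuts[0]), [cuts[0], cuts[1]), ..., [cuts[-1], +inf).
    """
    return bisect_right(cuts, value)

cuts = [2.5, 5.0]                      # hypothetical cut points, not from the paper
values = [1.0, 2.5, 3.7, 5.0, 9.2]
intervals = [discretize(v, cuts) for v in values]
```

A rule learner would then test membership in an interval index rather than comparing raw continuous values.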

The Benefit of Information Reduction for Trading Strategies

by Christian Schittenkopf, Peter Tino, Georg Dorffner - Applied Financial Economics, 2000
Abstract - Cited by 5 (1 self)
Motivated by previous findings that discretization of financial time series can effectively filter the data and reduce the noise, this experimental study compares the trading performance of predictive models based on different modelling paradigms in a realistic setting. Different methods ranging from real-valued time series models to predictive models on a symbolic level are applied to predict the daily change in volatility of two major stock indices. The predicted volatility changes are interpreted as trading signals for buying or selling a straddle portfolio on the underlying stock index. Profits realized by this trading strategy are tested for statistical significance taking into account transactions costs. The results indicate that symbolic information processing is a promising approach to financial prediction tasks undermining the hypothesis of efficient capital markets.
1 Introduction: In informationally efficient capital markets, market prices are supposed to reflect ...

A Symbolic Dynamics Approach to Volatility Prediction

by Peter Tino, Christian Schittenkopf, Georg Dorffner, Engelbert J. Dockner - in Computational Finance (Proceedings of the Sixth International Conference on Computational Finance), Leonard N. Stern School of Business, 1999
Abstract - Cited by 4 (3 self)
We consider the problem of predicting the direction of daily volatility changes in the Dow Jones Industrial Average (DJIA). This is accomplished by quantizing a series of historic volatility changes into a symbolic stream over 2 or 4 symbols. We compare predictive performance of the classical fixed-order Markov models with that of a novel approach to variable memory length prediction (called prediction fractal machine, or PFM) which is able to select very specific deep prediction contexts (whenever there is sufficient support for such contexts in the training data). We learn that daily volatility changes of the DJIA only exhibit rather shallow finite memory structure. On the other hand, a careful selection of quantization cut values can strongly enhance predictive power of symbolic schemes. Results on 12 non-overlapping epochs of the DJIA strongly suggest that PFMs can outperform both traditional Markov models and (continuous-valued) GARCH models in the task of predicting volatility o...
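
The quantize-then-predict idea in this abstract can be illustrated with a minimal sketch of the fixed-order Markov baseline. This is not the authors' PFM, nor their quantization scheme: the single cut value at 0, the two symbols `U` and `D`, the model order, the sample data, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def quantize(changes, cut=0.0):
    """Map each real-valued change to a symbol: 'U' if above `cut`, else 'D'."""
    return "".join("U" if c > cut else "D" for c in changes)

def fit_markov(stream, order=2):
    """Count which symbol follows each context of length `order`."""
    counts = defaultdict(Counter)
    for i in range(order, len(stream)):
        counts[stream[i - order:i]][stream[i]] += 1
    return counts

def predict(counts, context, default="D"):
    """Return the most frequent successor of `context`, or `default` if unseen."""
    followers = counts.get(context)
    if not followers:
        return default
    return followers.most_common(1)[0][0]

# Made-up volatility changes, used only to exercise the functions.
changes = [0.4, -0.2, 0.3, -0.1, 0.5, -0.3, 0.2, -0.4]
stream = quantize(changes)             # two-symbol stream
model = fit_markov(stream, order=2)
next_symbol = predict(model, stream[-2:])
```

A variable-memory predictor such as the PFM would differ by choosing the context length per prediction instead of fixing `order` in advance.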