
## Towards comprehensive foundations of computational intelligence (2007)


### Download Links

- [cogprints.org]
- [www.fizyka.umk.pl]
- DBLP

### Other Repositories/Bibliography

Venue: In: Duch W, Mandziuk J, Eds, Challenges for Computational Intelligence

Citations: 20 (13 self)

### Citations

6462 | Neural Networks for Pattern Recognition
- Bishop
- 1996
Citation Context: ...mations of multidimensional mappings by neural networks require flexibility that may be provided only by networks with a sufficiently large number of parameters. This leads to the bias-variance dilemma [14, 63, 174], since a large number of parameters contributes to a large variance of neural models and a small number of parameters increases their bias. Regularization techniques may help to avoid overparameterization ...
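The bias-variance dilemma in this excerpt can be illustrated with a minimal sketch (illustrative setup, not from the paper): fit a low-degree polynomial (few parameters, high bias) and a high-degree one (many parameters, high variance) to repeated noisy samples of a sine target, then compare the variance of their predictions at a fixed test point.

```python
# Minimal bias-variance sketch; the target, noise level, and degrees
# are illustrative assumptions, not taken from the cited work.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
x_test = 0.5
preds = {1: [], 7: []}                     # degree -> predictions at x_test

for _ in range(300):                       # repeated noisy training sets
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)
    for deg in preds:
        coeffs = np.polyfit(x, y, deg)     # least-squares polynomial fit
        preds[deg].append(np.polyval(coeffs, x_test))

var_low = np.var(preds[1])                 # simple model: lower variance
var_high = np.var(preds[7])                # flexible model: higher variance
print(var_low < var_high)                  # typically True for this setup
```

The flexible model tracks the noise in each training set, so its prediction at the same point varies far more across resampled training sets.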

2320 | An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge Univ
- Cristianini, Shawe-Taylor
- 2000
Citation Context: ...the input space into hyperrectangles. Multivariate decision trees provide several hyperplanes at high computational cost. Support Vector Machines use one kernel globally optimized for a given dataset [27]. All these systems may be called “homogeneous” since they search for a solution providing the same type of elements, the same type of decision borders in the whole feature space. Committees of the hom...

1353 | Swarm Intelligence: From Natural to Artificial Systems
- Bonabeau, Dorigo, et al.
- 1999
Citation Context: ...iration led to the introduction of evolutionary programming (Fogel 1966, Goldberg 1989), and later also other biologically-inspired optimization approaches, such as ant, swarm, and immunological system algorithms [16, 105, 30], that can be used for optimization of adaptive parameters in neural and neurofuzzy systems. Although the algorithms based on these diverse biological inspirations are used for similar applications th...

794 | The cascade-correlation learning architecture
- Fahlman, Lebiere
- 1990
Citation Context: ...selects the most promising function from a pool of candidates, adding a new node to the transformation, has been introduced in [45, 94, 59]. Other constructive algorithms, such as the cascade correlation [65], may also be used for this purpose. Each candidate node using a different transfer function should be trained and the most useful candidate added to the network. The second approach starts from transfo...

717 | G.: Solving multiclass learning problems via ECOCs
- Dietterich, Bakiri
- 1995
Citation Context: ...al single-output functions. In K-class classification problems the number of outputs is usually K − 1, with zero output for the default class. In the Error Correcting Output Codes (ECOC) approach [32] learning targets that are easier to distinguish are defined, setting a number of binary targets that define prototype “class signature” vectors. The final transformation then compares the distance ...
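The ECOC scheme in this excerpt can be sketched directly: each class gets a binary codeword, each binary classifier predicts one bit, and the final decision picks the codeword nearest in Hamming distance to the predicted bit string. The 4-class, 6-bit code below is an illustrative assumption, not taken from the paper.

```python
# Hypothetical 4-class ECOC codebook; rows are "class signature" vectors.
# Minimum pairwise Hamming distance is 4, so any single bit error is corrected.
codebook = {
    "A": (0, 0, 0, 1, 1, 1),
    "B": (0, 1, 1, 0, 0, 1),
    "C": (1, 0, 1, 0, 1, 0),
    "D": (1, 1, 0, 1, 0, 0),
}

def ecoc_decode(bits):
    """Return the class whose codeword is nearest in Hamming distance."""
    hamming = lambda a, b: sum(x != y for x, y in zip(a, b))
    return min(codebook, key=lambda c: hamming(codebook[c], bits))

# One flipped bit is corrected: (0,0,0,1,1,0) is still nearest to "A".
print(ecoc_decode((0, 0, 0, 1, 1, 0)))  # -> A
```

Decoding by nearest codeword is what makes the targets "easier to distinguish": class signatures are spread apart so individual classifier errors do not change the final decision.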

693 | An empirical comparison of voting classification algorithms: Bagging, boosting, and variants,
- Bauer, Kohavi
- 1999
Citation Context: ...X, therefore the solution is done in the same way as before. After renormalization, P(Ci|X; M) / Σj P(Cj|X; M) gives the final probability of classification. In contrast to AdaBoost and similar procedures [10], explicit information about competence, or quality of classifier performance in different feature space areas, is used here. Many variants of committee or boosting algorithms with competence are possi...
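The renormalization step in this excerpt can be sketched as a competence-weighted committee (an assumed form for illustration; the function name, class labels, and numbers are hypothetical): each model contributes its class probabilities scaled by a competence factor F(X; Ml), and the weighted sums are renormalized to yield the final posteriors.

```python
# Sketch of a competence-weighted committee; models, classes, and
# competence values below are illustrative, not from the cited work.
def committee_posterior(model_probs, competences):
    """model_probs: list of dicts {class: P(class|X, model)};
    competences: list of F(X; M_l) weights for the same models."""
    combined = {}
    for probs, f in zip(model_probs, competences):
        for c, p in probs.items():
            combined[c] = combined.get(c, 0.0) + f * p
    z = sum(combined.values())           # renormalization constant
    return {c: p / z for c, p in combined.items()}

# The competent model (F=0.9) dominates the incompetent one (F=0.1):
post = committee_posterior(
    [{"c1": 0.8, "c2": 0.2}, {"c1": 0.3, "c2": 0.7}],
    [0.9, 0.1],
)
print(post)  # -> {'c1': 0.75, 'c2': 0.25}
```

Unlike AdaBoost's fixed per-model weights, F(X; Ml) can vary with X, so a model only influences decisions in regions where it has proven competent.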

452 | Kernel independent component analysis
- Bach, Jordan
- 2002
Citation Context: ...designing various kernels that evaluate similarity between complex objects [27, 155]. Although kernels are usually designed for SVM methods, for example in text classification [118] or bioinformatics [171, 7, 169] applications, they may be directly used in the SBM framework, because kernels are specific (dis)similarity functions. In particular, positive semidefinite kernels used in the SVM approach correspond to ...

399 | Spiking Neuron Models
- Gerstner, WM
- 2002
Citation Context: ...ms can only be solved with a different type of neurons that include at least one phase-sensitive parameter [111], or with spiking neurons [173]. Computational neuroscience is based on spiking neurons [68], and although the mathematical characterization of their power has been described [119], their practical applications are still limited. On the other hand feedforward artificial neural networks found wide...

376 | Algorithmic Information Theory
- Chaitin
- 1987
Citation Context: ...o may have to restructure many existing associations to accommodate it. The semantic information measure, introduced in [56], is proportional to the change of algorithmic (Chaitin-Kolmogorov) information [20] that is needed to describe the whole system, and therefore measures relative complexity, depending on the knowledge already accumulated. Algorithmic information or the relative complexity of an objec...

315 | Exploratory projection pursuit
- Friedman
- 1987
Citation Context: ...ng T2(¹X; ¹W) based on some specific criterion. Many interesting mappings are linear and define transformations equivalent to those provided by the Exploratory Projection Pursuit Networks (EPPNs) [98, 66]. Quadratic cost functions used for optimization of linear transformations may lead to formulation of the problem in terms of linear equations, but most cost functions or optimization criteria are non...

309 | Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition
- Cover
- 1965
Citation Context: ...maximization is certainly a good strategy. Consider for example a two-class case. In m-dimensional space the expected maximum number of separable vectors randomly assigned to one of the classes is 2m [24, 85]. For k-bit strings there are n = 2^k vectors and 2^n Boolean functions that may be separated in a space with n/2 dimensions with high probability. In the case of k-bit Boolean problems localized kernel...
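The 2m capacity quoted in this excerpt comes from Cover's counting function, which can be checked numerically (a sketch of the standard result, not code from the paper): C(n, m) = 2 · Σ_{k=0}^{m−1} C(n−1, k) counts the linearly separable dichotomies of n points in general position in m dimensions, and at n = 2m exactly half of all 2^n dichotomies are separable.

```python
# Cover's counting function for homogeneously linearly separable
# dichotomies of n points in general position in m dimensions.
from math import comb

def cover_count(n, m):
    """C(n, m) = 2 * sum_{k=0}^{m-1} binom(n-1, k)."""
    return 2 * sum(comb(n - 1, k) for k in range(m))

m = 10
n = 2 * m
fraction = cover_count(n, m) / 2 ** n    # fraction of separable dichotomies
print(fraction)  # -> 0.5 at the capacity point n = 2m
```

Below capacity (n ≤ m) every dichotomy is separable; the fraction drops through 1/2 at n = 2m, which is why 2m is the expected maximum number of separable random assignments.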

254 | Syntactic pattern recognition and applications
- Fu
- 1982
Citation Context: ...of objects is a result of temporal evolution, a series of transformations describing the formative history of these objects. This is more ambitious than the syntactic approach in pattern recognition [67], where objects are composed of atoms, or basic structures, using specific rules that belong to some grammar. In the evolving transformation system (ETS) object structure is a temporal recording of st...

249 | Nonlinear multivariate analysis
- Gifi
- 1990
Citation Context: ...pendent Component Analysis, with each node computing one independent component [90, 23]. – Linear factor analysis, computing common and unique factors from data [75]. – Canonical correlation analysis [70]. – KL, or Kullback-Leibler networks with orthogonal or non-orthogonal components; networks maximizing mutual information [168] are a special case here, with product vs. joint distribution of classes/...

233 | The Helmholtz machine
- Dayan, GE, et al.
- 1995
Citation Context: ...from partial observations [151], but they proved to be very inefficient because the stochastic training algorithm needs time that grows exponentially with the size of the problem. Helmholtz machines [28], and the recently introduced multi-layer restricted Boltzmann machines and deep belief networks [87], have been used only for pattern recognition problems so far. These models are based on stochastic algo...

207 | Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications
- Cichocki, Amari
- 2002
Citation Context: ...minatory Analysis (FDA), with each node computing a canonical component using one of many FDA algorithms [174, 165]. – Independent Component Analysis, with each node computing one independent component [90, 23]. – Linear factor analysis, computing common and unique factors from data [75]. – Canonical correlation analysis [70]. – KL, or Kullback-Leibler networks with orthogonal or non-orthogonal components; ...

166 | Application of spreading activation techniques in information retrieval
- Crestani
- 1997
Citation Context: ...ll with the size of the network. Processing sequential information by simpler mechanisms, such as spreading activation in appropriately structured networks, is more suitable for information retrieval [26]. The challenge here is to create large semantic networks with overall structure and weighted links that facilitate associations and reproduce priming effects. Wordnet (http://wordnet.princeton.edu) a...

153 | Task clustering and gating for Bayesian multitask learning
- Bakker, Heskes
Citation Context: ...versified models, a Bayesian framework for dynamic selection of the most competent classifier [69], regional boosting [122], confidence-rated boosting predictions [154], the task clustering and gating approach [9], or stacked generalization [183]. A committee may be built as a network of networks, or a network where each element has been replaced by a very complex processing element made from individual networ...

142 | Neurocomputing: Foundations of Research
- Anderson, Rosenfeld
- 1988
Citation Context: ...earch in artificial neural networks (ANNs) grew out of attempts to drastically simplify biophysical neural models, and for a long time was focused on logical and graded response (sigmoidal) neurons [5] used for classification, approximation, association, and vector quantization approaches used for clustering and self-organization [106]. Later any kind of basis function expansion, used since a lo...

72 | Adaptive Resonance Theory
- Carpenter, Grossberg
- 2003
Citation Context: ...physiological and psychophysical results about the visual system, but it has also been used to develop new algorithms for image processing. In practical applications a version of Adaptive Resonance Theory [19] called LAMINART [145] has been used. So far this is the most advanced approach to perception that will certainly play a very important role in the growing field of autonomous mental development and ...

70 | The search for simplicity: A fundamental cognitive principle. The Quarterly
- Chater
- 1999
Citation Context: ...address it are described below. 2.2 Computing and cognition as compression. Neural information processing in perception and cognition is based on the principles of economy, or information compression [22]. In computing these ideas have been captured by such concepts as minimum (message) length encoding, minimum description length, or general algorithmic complexity [117]. An approach to information...

69 | Feature Space Mapping as a universal adaptive system
- Duch, Diercksen
- 1995
Citation Context: ...s that encode new information in terms of the known information are certainly not new. They include constructive neural networks that add new nodes only if the current approximation is not sufficient [85, 50], similarity-based systems that accept a new reference vector after first checking that it is not redundant, and decision trees that are pruned to increase their generalization and ignore data that are already corr...

66 | Theory of Approximation
- Achieser
- 1956
Citation Context: ...for Gaussian uncertainties. The radial basis functions became a synonym for all basis function expansions, although in approximation theory many such expansions were already considered in the classical book of Achieser published in 1956 [2]. RBF networks are equivalent to fuzzy systems only in special cases, for example when Gaussian membership functions are used [103], but in the literature ...

63 | An adaptive neural network: the cerebral cortex. Masson Editeur
- Burnod
- 1990
Citation Context: ...anisms that can influence neuronal states and thus can be interpreted as performing computations and learning. A useful approximation to microcircuit dynamics may be provided by finite state automata [18, 29, 34]. The Liquid State Machine (LSM) model aims at better approximation at the microscopic level, based on “liquid” high-dimensional states of neural microcircuits that change in real time. In [34] anothe...

60 | A new methodology of extraction, optimization and application of crisp and fuzzy logical rules
- Duch, Adamczak, et al.
- 2001
Citation Context: ...er bias for given data is very important. Some real-world examples showing the differences between RBF and MLP networks that are mainly due to the transfer functions used were presented in [57] and [46, 47]. The simplest transformation that has a chance to discover the appropriate bias for complex data may require several different types of elementary functions. Heterogeneous adaptive systems (HAS) introd...

49 | An attractor model of lexical conceptual processing: Simulating semantic priming
- Cree, McRae
- 1999
Citation Context: ..., is one of the most popular subjects of investigation [126]. How can this priming process be approximated? An attractor network model has been created to explain results of psychological experiments [25]. However, such dynamical models are rather complex and do not scale well with the size of the network. Processing sequential information by simpler mechanisms, such as spreading activation in appropr...

49 | Introduction to the Special Issue on Meta-Learning
- Giraud-Carrier, Vilalta, et al.
- 2004
Citation Context: ...ced by Michalski [64]. Learning of a single model may be sufficiently difficult, therefore to be feasible the search in the space of many possible models should be heuristically guided. The Metal project [71] tried to collect information about data characteristics and correlate it with the methods that performed well on given data. A system recommending classification methods for given data has been b...

47 | Dynamic classifier selection based on multiple classifier behaviour
- Giacinto, Roli
- 2001
Citation Context: ...ny variants of committee or boosting algorithms with competence are possible [110], focusing on generation of diversified models, a Bayesian framework for dynamic selection of the most competent classifier [69], regional boosting [122], confidence-rated boosting predictions [154], the task clustering and gating approach [9], or stacked generalization [183]. A committee may be built as a network of networks, or ...

39 | The curse of highly variable functions for local kernel machines
- Bengio, Delalleau, et al.
- 2006
Citation Context: ...s and network architectures, it is clear that selection of transfer functions is decisive for the speed of convergence in approximation and classification problems. As already shown in [57] (see also [11, 12]) some problems may require O(n²) parameters using localized functions and only O(n) parameters when non-local functions are used. The n-parity problem may be trivially solved using a periodic funct...
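The n-parity observation in this excerpt can be sketched in a few lines (an illustrative construction, assuming the standard periodic-projection trick rather than any specific network from the paper): project the n-bit input onto w = (1, …, 1) and apply a periodic transfer function; cos(π · Σxᵢ) is positive for even parity and negative for odd, so a single periodic node handles what would take many localized units.

```python
# One periodic "neuron" solving n-parity: sign of cos(pi * w.x)
# with the fixed projection w = (1, ..., 1). Illustrative sketch.
from itertools import product
from math import cos, pi

def parity_node(bits):
    """Return +1 for even parity, -1 for odd parity."""
    return 1 if cos(pi * sum(bits)) > 0 else -1

n = 6
ok = all(
    parity_node(bits) == (1 if sum(bits) % 2 == 0 else -1)
    for bits in product((0, 1), repeat=n)
)
print(ok)  # -> True over all 2^6 inputs
```

A localized (e.g. Gaussian) basis would need a unit per cluster of the projected data, whereas the periodic function folds all clusters of equal parity onto the same output, which is the O(n) vs O(n²) contrast the excerpt describes.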

38 | New neural transfer functions - Duch, Jankowski - 1997

37 | Similarity based methods: A general framework for classification, approximation and association
- Duch
- 2000
Citation Context: ...CI programs that could adjust themselves in a deeper way, beyond parameter optimization, to the problem analyzed. This idea has been partially implemented in the similarity-based meta-learning scheme [35, 53], and is also in accord with evolving programs and connectionist systems [101] that to a limited degree change their structure. Most CI algorithms have very limited goals, such as prediction (using ap...

35 | Boosted mixture of experts: An ensemble learning scheme
- Avnimelech, Intrator
- 1999
Citation Context: ...e frequently unstable [17], i.e. quite different models are created as a result of repeated training (if the learning algorithm contains stochastic elements) or if the training set is slightly perturbed [6]. Although brains are massively parallel computing devices, attention mechanisms are used to inhibit parts of the neocortex that are not competent in analysis of a given type of signal. All sensory inp...

31 | Measurement of membership functions: theoretical and empirical work
- Bilgic, Turksen
- 1997
Citation Context: ...is universe belong to the set F. This degree should not be interpreted as probability [109] and in fact at least four major interpretations of the meaning of membership functions may be distinguished [13]. One natural interpretation is based on the degree to which all elements X ∈ X are similar to the typical elements (that is, those with χF(X) ≈ 1) of F. From this point of view fuzzy modeling seems to b...

30 | New developments in the feature space mapping model
- Adamczak, Duch, et al.
- 1997
Citation Context: ...sion [110]. It is not clear why so much research has been devoted to the RBF networks while neural networks based on separable functions are virtually unknown: the Feature Space Mapping (FSM) network [50, 60, 3, 44] seems to be the only existing implementation of the Separable Basis Function networks so far. A general framework for similarity-based methods (SBMs) has been formulated using the concept of similari...

30 | Basis functions for object-centered representations, Neuron 37
- Deneve, Pouget
- 2003
Citation Context: ...ogic using liquid states. One good algorithm for k-separability combines projections with discovery of local clusters. This is essentially what is needed for object-centered representations in vision [31] and has been used to model the outputs of parietal cortex neurons [141, 142]. Any continuous sensorimotor transformation may be approximated in this way [153]. Although precise neural implementation ...

27 | Training a support vector machine
- Chapelle
- 2007
Citation Context: ...alysis algorithms. In particular the idea of maximizing margins, not only minimizing errors, used in SVM algorithms based on solutions to the regularized least squares problem in the primal or dual space [21], has not yet been used for prototype optimization or selection [95, 79]. Any hyperplane defined by its bias and normal vector (W0, W) is equivalent to a minimal distance rule for two prototypes P, P...
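The hyperplane/prototype equivalence in this excerpt can be checked numerically (a sketch under the standard construction; the prototype vectors are illustrative): the minimal-distance rule for two prototypes P1, P2 classifies X by the sign of W·X + W0 with W = P1 − P2 and W0 = (|P2|² − |P1|²)/2, i.e. the perpendicular bisector hyperplane.

```python
# Verify that a nearest-prototype rule and the corresponding
# hyperplane (W, W0) give identical decisions. Illustrative sketch.
import random

def nearest_prototype(x, p1, p2):
    """+1 if x is closer to p1, -1 if closer to p2 (Euclidean)."""
    d1 = sum((a - b) ** 2 for a, b in zip(x, p1))
    d2 = sum((a - b) ** 2 for a, b in zip(x, p2))
    return 1 if d1 < d2 else -1

def hyperplane(x, p1, p2):
    """Equivalent linear rule: sign of W.x + W0."""
    w = [a - b for a, b in zip(p1, p2)]
    w0 = (sum(v * v for v in p2) - sum(v * v for v in p1)) / 2
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + w0 > 0 else -1

random.seed(1)
p1, p2 = [1.0, 2.0, 0.5], [-1.0, 0.0, 1.5]
agree = all(
    nearest_prototype(x, p1, p2) == hyperplane(x, p1, p2)
    for x in ([random.uniform(-3, 3) for _ in range(3)] for _ in range(1000))
)
print(agree)  # -> True: the two rules give identical decisions
```

Expanding the squared distances shows d1 < d2 reduces exactly to W·X + W0 > 0, which is why margin-maximizing hyperplane methods could in principle be reused for prototype optimization, as the excerpt suggests.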

24 | The Separability of Split Value criterion
- Grąbczewski, Duch
- 2000
Citation Context: ...transformation embedding input vectors in a space where distances are preserved [138]. – Linear approximations to multidimensional scaling [138]. – Separability criterion used on orthogonalized data [76]. Non-linearities may be introduced in transformations in several ways: either by adding non-linear functions to linear combinations of features, or using distance functions, or transforming component...

23 | Computational intelligence methods for understanding of data
- Duch, Setiono, Zurada
Citation Context: ...s been described [119], their practical applications are still limited. On the other hand feedforward artificial neural networks found wide applications in data analysis [144] and knowledge extraction [62]. Better understanding of mathematical foundations brought extensions of neural techniques towards statistical pattern recognition models, such as the Support Vector Machines (SVMs) [155] for supervis...

21 | Concept Learning - Michalski - 1990

20 | Bias-variance, regularization, instability and stabilization
- Breiman
- 1998
Citation Context: ...a solution providing the same type of elements, the same type of decision borders in the whole feature space. Committees of the homogeneous systems are frequently used to improve and stabilize results [17]. Combining systems of different types in a committee is a step towards heterogeneous systems that use different types of decision borders, but such models may become quite complex and difficult to un...

20 | Uncertainty of data, fuzzy membership functions, and multilayer perceptrons
- Duch
- 2005
Citation Context: ...elligent behavior at a higher, psychological level, rather than the elementary neural level. Sets of fuzzy rules have a natural graphical representation [116], and are deeply connected to neural networks [37]. Fuzzy rules organized in a network form may be tuned by adaptive techniques used in neural networks, therefore they are called neurofuzzy systems [133, 136]. Thus fuzzy and neural systems are at the...

20 | Extraction of logical rules from backpropagation networks
- Duch, Adamczak, et al.
- 1998
Citation Context: ...er bias for given data is very important. Some real-world examples showing the differences between RBF and MLP networks that are mainly due to the transfer functions used were presented in [57] and [46, 47]. The simplest transformation that has a chance to discover the appropriate bias for complex data may require several different types of elementary functions. Heterogeneous adaptive systems (HAS) introd...

19 | Fuzzy rule-based systems derived from similarity to prototypes
- Duch, Blachnik
- 2004
Citation Context: ...a basis for computational intelligence methods [138]. For additive similarity measures, models based on similarity to prototypes are equivalent to models based on fuzzy rules and membership functions [48]. Similarity functions may be related to distance functions by many transformations, for example the exponential transformation S(X, Y) = exp(−D(X, Y)). Additive distance functions D(X, Y) are then conver...
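The distance-to-similarity transformation quoted in this excerpt, S(X, Y) = exp(−D(X, Y)), can be sketched directly (vectors below are illustrative): for an additive distance D = Σᵢ dᵢ the similarity factorizes into a product of per-feature terms, exp(−Σ dᵢ) = Π exp(−dᵢ), which is the link between prototype rules and products of fuzzy membership functions.

```python
# Sketch: exponential transformation turns an additive distance into
# a multiplicative similarity. Feature values are illustrative.
from math import exp, isclose

def additive_distance(x, y):
    """City-block distance: a sum of per-feature contributions."""
    return sum(abs(a - b) for a, b in zip(x, y))

def similarity(x, y):
    """S(X, Y) = exp(-D(X, Y))."""
    return exp(-additive_distance(x, y))

x, y = (1.0, 2.0, 4.0), (0.5, 2.5, 3.0)
product_form = 1.0
for a, b in zip(x, y):
    product_form *= exp(-abs(a - b))      # per-feature "membership"

print(isclose(similarity(x, y), product_form))  # -> True
```

Each factor exp(−|aᵢ − bᵢ|) plays the role of a one-dimensional membership function centered on the prototype coordinate, which is the duality the surrounding text refers to.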

19 | Transfer functions: hidden possibilities for better neural networks
- Duch, Jankowski
Citation Context: ...eneous adaptive systems (HAS) introduced in [51] provide different types of decision borders at each stage of building the data model, enabling discovery of the most appropriate bias for the data. Neural [59, 94, 45], decision tree [51, 78] and similarity-based systems [53, 177, 178] of this sort have been described, finding for some data the simplest and most accurate models known so far. Heterogeneous neural algori...

18 | Classification, association and pattern completion using neural similarity based methods
- Duch, Adamczak, et al.
- 2000
Citation Context: ...ems to be the only existing implementation of the Separable Basis Function networks so far. A general framework for similarity-based methods (SBMs) has been formulated using the concept of similarity [43, 35]. This framework includes typical feedforward neural network models (MLP, RBF, SBF), some novel networks (Distance-Based Multilayer Perceptrons, D-MLPs [41]) and the nearest neighbor or minimum-dist...

17 | The Unified Learning Paradigm: A Foundation for AI
- Goldfarb, Nigam
- 1994
Citation Context: ...h reasoning (higher level cognitive functions). A very interesting approach to representation of objects as evolving structural entities/processes has been developed by Goldfarb and his collaborators [74, 72, 73]. Structure of objects is a result of temporal evolution, a series of transformations describing the formative history of these objects. This is more ambitious than the syntactic approach in pattern r...

16 | Coloring black boxes: visualization of neural network decisions
- Duch
- 2003
Citation Context: ...y researchers working at NASA, IBM and other research-oriented companies. Visualization of data transformations performed by CI systems, including analysis of perturbation data, are very useful tools [36], although still rarely used. Cognitive robotics may be an ultimate challenge for computational intelligence. Various robotic platforms that could be used for testing new ideas in semi-realistic situa...

14 | The PDF projection theorem and the class-specific method
- Baggenstoss
- 2003
Citation Context: ...reate K probabilities, so the final dimensionality after this transformation may reach at least K². A more sophisticated approach to class-specific use of features has been presented by Baggenstoss [8]. It is based on estimation of probability density functions (PDFs) in a reduced low-dimensional feature space selected separately for each class, and mapping these PDFs back to the original input s...

14 | Non-local estimation of manifold structure
- Bengio, Monperrus, et al.
- 2006
Citation Context: ...s and network architectures, it is clear that selection of transfer functions is decisive for the speed of convergence in approximation and classification problems. As already shown in [57] (see also [11, 12]) some problems may require O(n²) parameters using localized functions and only O(n) parameters when non-local functions are used. The n-parity problem may be trivially solved using a periodic funct...

14 | Neural minimal distance methods
- Duch
Citation Context: ...rk includes typical feedforward neural network models (MLP, RBF, SBF), some novel networks (Distance-Based Multilayer Perceptrons, D-MLPs [41]) and the nearest neighbor or minimum-distance networks [33, 43], as well as many variants of the nearest neighbor methods, improving upon the traditional approach by providing more flexible decision borders. This framework has been designed to enable meta-learni...

13 | Platonic model of mind as an approximation to neurodynamics
- Duch
- 1997
Citation Context: ...anisms that can influence neuronal states and thus can be interpreted as performing computations and learning. A useful approximation to microcircuit dynamics may be provided by finite state automata [18, 29, 34]. The Liquid State Machine (LSM) model aims at better approximation at the microscopic level, based on “liquid” high-dimensional states of neural microcircuits that change in real time. In [34] anothe...

12 | Neurocomputing 2
- Anderson, Pellionisz, et al.
- 1993
Citation Context: ...zation and self-organization [106]. Later any kind of basis function expansion, long used in approximation theory [143] and quite common in pattern recognition, became “a neural network” [4], with radial basis function (RBF) networks [140] becoming a major alternative to multilayer perceptron networks (MLPs). However, as pointed out by Minsky and Papert [131], there are some problems that...

10 | Evolution of functional link networks
- Sierra, Macias, et al.
- 2001
Citation Context: ...measures on subsets of input variables. Non-linear feature transformations, such as tensor products of features, are particularly useful, as Pao noted when introducing functional link networks [137, 1]. Rational function neural networks [85] in signal processing [115] and other applications use ratios of polynomial combinations of features; a linear dependence on a ratio y = x1/x2 is not easy to ap...

10 | Creativity and the … (2007a)
- Duch
Citation Context: ...nt with all knowledge assumed to be true at a given stage. The interplay between left and right hemisphere representations leads to generalization of constraints that help to reason at the meta-level [38]. These ideas may form a basis for an associative machine that could reason using both perceptions (observations) and a priori knowledge. In this way pattern recognition (lower level cognitive functio...

9 | Heterogeneous adaptive systems
- Duch, Grąbczewski
- 2002
Citation Context: ...neural algorithms can also be presented in this framework. Heterogeneous constructive systems of this type are especially useful and have already discovered some of the simplest descriptions of data [78, 51]. A framework based on transformations, presented in this paper for the first time, is even more general, as it includes all kinds of pre-processing and unsupervised methods for initial data transform...

9 | Meta-learning via search combined with parameter optimization
- Duch, Grudziński
- 2002
Citation Context: ...CI programs that could adjust themselves in a deeper way, beyond parameter optimization, to the problem analyzed. This idea has been partially implemented in the similarity-based meta-learning scheme [35, 53], and is also in accord with evolving programs and connectionist systems [101] that to a limited degree change their structure. Most CI algorithms have very limited goals, such as prediction (using ap...

9 | Committees of undemocratic competent models
- Duch, Itert
- 2003
Citation Context: ...er models should be maintained. A committee based on competent models, with various factors determining regions of competence (or incompetence), may be used to integrate decisions of individual models [54, 55]. The competence factor should reach F(X; Ml) ≈ 1 in all areas where the model Ml has worked well and F(X; Ml) ≈ 0 near the training vectors where errors were made. A number of functions may be use...

9 | Survey of Neural Transfer Functions. Neural Computing Surveys (submitted)
- Duch, Jankowski
- 1999
Citation Context: ...r techniques for unsupervised learning. Most networks are composed of elements that perform very simple functions, such as squashed weighted summation of their inputs, or some distance-based function [57, 58]. Connectionist modeling in psychology [151] introduced nodes representing whole concepts, or states of network subconfigurations, although their exact relations to neural processes were never elucida...

9 | Heterogeneous Forests of Decision Trees
- Grąbczewski, Duch
- 2002
Citation Context: ...or optimal models in meta-learning should explore many different models. Models that are close to the Pareto front [130] should be retained and evaluated by domain experts. A forest of decision trees [77] and heterogeneous trees [78] is an example of a simple meta-search in a model space restricted to decision trees. Heterogeneous trees use different types of rule premises, splitting the branches not ...

8 | Factor Analysis. Erlbaum
- Gorsuch
- 1983
Citation Context: ...many FDA algorithms [174, 165]. – Independent Component Analysis, with each node computing one independent component [90, 23]. – Linear factor analysis, computing common and unique factors from data [75]. – Canonical correlation analysis [70]. – KL, or Kullback-Leibler networks with orthogonal or non-orthogonal components; networks maximizing mutual information [168] are a special case here, with pro...

7 | Selection of prototypes rules: Context searching via clustering
- Blachnik, Duch, et al.
- 2006
Citation Context: ...rse be more accurately modeled using hierarchical fuzzy systems. Selection of prototypes and features together with similarity measures offers a new, so far unexplored alternative to neurofuzzy methods [49, 178, 15]. Duality between similarity measures and membership functions allows for generation of propositional rules based on individual membership functions, but there are significant differences. Fuzzy rules...

7 | Search and global minimization in similarity-based methods
- Duch, Grudziński
- 1999
Citation Context: ...thms are generated by applying admissible extensions to the existing algorithms, and the most promising are retained and extended further. Training is performed using parameter optimization techniques [52, 53]. Symbolic values used with probabilistic distance functions make it possible to avoid ad hoc procedures for replacing them with numerical values. To understand the structure of the data, prototype-based interpretati...

7 | What is a structural representation? Fifth variation
- Goldfarb, Gay
- 2005
Citation Context ...h reasoning (higher level cognitive functions). A very interesting approach to representation of objects as evolving structural entities/processes has been developed by Goldfarb and his collaborators [74, 72, 73]. The structure of objects is the result of temporal evolution, a series of transformations describing the formative history of these objects. This is more ambitious than the syntactic approach in pattern r...

6 | G.H.F.: Distance-based multilayer perceptrons
- Duch, Adamczak, et al.
- 1999
Citation Context ...lated using the concept of similarity [43, 35]. This framework includes typical feedforward neural network models (MLP, RBF, SBF), some novel networks (Distance-Based Multilayer Perceptrons (D-MLPs, [41]) and the nearest neighbor or minimum-distance networks [33, 43]), as well as many variants of the nearest neighbor methods, improving upon the traditional approach by providing more flexible decision...

6 | G.: Constructive density estimation network based on several different separable transfer functions
- Duch, Adamczak, et al.
- 2001
Citation Context ...eneous adaptive systems (HAS) introduced in [51] provide different types of decision borders at each stage of building the data model, enabling discovery of the most appropriate bias for the data. Neural [59, 94, 45], decision tree [51, 78] and similarity-based systems [53, 177, 178] of this sort have been described, finding for some data the simplest and most accurate models known so far. Heterogeneous neural algori...

6 | Competent undemocratic committees
- Jankowski, Duch, et al.
- 2002

6 | Feature Space Mapping: a neurofuzzy network for system identification, Engineering Applications of Neural Networks
- Duch, Adamczak, et al.
- 1995
Citation Context ...sion [110]. It is not clear why so much research has been devoted to the RBF networks while neural networks based on separable functions are virtually unknown: the Feature Space Mapping (FSM) network [50, 60, 3, 44] seems to be the only existing implementation of the Separable Basis Function networks so far. A general framework for similarity-based methods (SBMs) has been formulated using the concept of similari...
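The separable basis functions contrasted with RBF nodes in this context can be illustrated in a few lines: a separable node is a product of one-dimensional factors, one per feature, which reduces to the radial Gaussian when all factors are Gaussians of equal width but otherwise yields feature-wise receptive fields a single radial kernel cannot express. This is a toy sketch, not the FSM implementation; the names and values are assumptions.

```python
import numpy as np

def separable_gaussian(x, centers, widths):
    """Separable basis function node: product of 1-D Gaussian factors,
    each feature with its own center and width."""
    return np.prod(np.exp(-((x - centers) / widths) ** 2), axis=-1)

def radial_gaussian(x, center, width):
    """Classic RBF node: one Gaussian of the Euclidean distance."""
    return np.exp(-np.sum((x - center) ** 2, axis=-1) / width ** 2)

x = np.array([0.5, -1.0, 2.0])
c = np.zeros(3)

# With equal widths the separable product equals the radial Gaussian ...
same = np.isclose(separable_gaussian(x, c, np.ones(3)),
                  radial_gaussian(x, c, 1.0))
print(same)  # True

# ... but per-feature widths shape the receptive field independently
# along each dimension.
val = separable_gaussian(x, c, np.array([0.5, 2.0, 1.0]))
```

Separability also makes marginal activations cheap to compute (drop the factors of missing features), which is one practical argument for such nodes in density-estimation networks.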

5 | Filter methods
- Duch
- 2006
Citation Context ...at this stage is very large. Feature selection techniques [82], and in particular filter methods wrapped around algorithms that search for interesting feature transformations (called “filtrappers” in [39]), may be used to quickly evaluate the usefulness of proposed transformations. The challenge is to provide a single framework for systematic selection and creation of interesting transformations in a ...
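The filter idea described here — scoring candidate features or transformations cheaply, without training the final model — can be sketched with a plug-in mutual-information ranking over discrete features. This is a hypothetical toy sketch (invented data and function names), not the filters of [39]:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of MI between two discrete sequences, in bits."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(c / n * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def rank_features(X, y):
    """Filter: score each feature column by MI with the target, best first."""
    scores = [(mutual_information([row[j] for row in X], y), j)
              for j in range(len(X[0]))]
    return sorted(scores, reverse=True)

# Invented toy data: feature 0 copies the class, feature 1 is nearly noise.
X = [(0, 1), (0, 0), (1, 1), (1, 0), (0, 1), (1, 0)]
y = [0, 0, 1, 1, 0, 1]
print(rank_features(X, y))  # feature 0 ranked first, with MI = 1.0 bit
```

In a "filtrapper" setting the same cheap score would be recomputed for each candidate transformation produced by the search, keeping only those that raise the ranking.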

3 | Quo Vadis Computational Intelligence
- Duch, Mandziuk
- 2004
Citation Context ...d these networks become modules that are used to build next-level supernetworks, functional equivalents of larger brain areas. The principles on which models should be based at each level are similar [61]: networks of interacting modules should adjust to the flow of information (learn), changing their internal knowledge and their interactions with other modules. Efficient algorithms for learning are kn...

3 | What is a structural representation? A proposal for a representational formalism,” Univ
- Goldfarb, Gay, et al.
- 2006
Citation Context ...h reasoning (higher level cognitive functions). A very interesting approach to representation of objects as evolving structural entities/processes has been developed by Goldfarb and his collaborators [74, 72, 73]. The structure of objects is the result of temporal evolution, a series of transformations describing the formative history of these objects. This is more ambitious than the syntactic approach in pattern r...

1 | Feature space mapping neural network applied to structure-activity relationship problems
- Duch, Adamczak, et al.
- 2000
Citation Context ...sion [110]. It is not clear why so much research has been devoted to the RBF networks while neural networks based on separable functions are virtually unknown: the Feature Space Mapping (FSM) network [50, 60, 3, 44] seems to be the only existing implementation of the Separable Basis Function networks so far. A general framework for similarity-based methods (SBMs) has been formulated using the concept of similari...

1 | Complex systems, information theory and neural networks
- Duch, Jankowski
- 1994
Citation Context ...a way that is not yet fully understood. Information compression and encoding of new information in terms of old have been used to define the measure of syntactic and semantic information introduced in [56]. This information is based on the size of the minimal graph representing a given data structure or knowledge-base specification, thus it goes beyond alignment of sequences. A chunk of information has...