## A `Microscopic' Study of Minimum Entropy Search in Learning Decomposable Markov Networks (1995)

Venue: Machine Learning

Citations: 23 (18 self)

### BibTeX

```bibtex
@ARTICLE{Xiang95microscopic,
  author  = {Y. Xiang and S.K.M. Wong and N. Cercone},
  title   = {A `Microscopic' Study of Minimum Entropy Search in Learning Decomposable Markov Networks},
  journal = {Machine Learning},
  year    = {1995}
}
```

### Abstract

Several scoring metrics are used in different search procedures for learning probabilistic networks. We study the properties of cross entropy in learning a decomposable Markov network. Although entropy and related scoring metrics have been widely used, their `microscopic' properties and asymptotic behavior in a search have not been analyzed. We present such a `microscopic' study of a minimum entropy search algorithm, and show that it learns an I-map of the domain model when the data size is large. Search procedures that modify a network structure one link at a time have been commonly used for efficiency. Our study indicates that a class of domain models cannot be learned by such procedures. This suggests that prior knowledge about the problem domain, together with a multi-link search strategy, would provide an effective way to uncover many domain models.

### Citations

7074 |
Probabilistic Reasoning in Intelligent Systems
- Pearl
- 1988
Citation Context ...t in a multi-link lookahead search. Keywords: Inductive learning, reasoning under uncertainty, knowledge acquisition, Markov networks, probabilistic networks. 1 Introduction A probabilistic network [28, 26, 14, 4] combines a qualitative graphic structure, which encodes domain dependencies, with a quantitative probability distribution, which encodes the strength of the dependencies. The network structure can be... |

1477 |
Information Theory and Reliable Communication
- Gallager
- 1968
Citation Context ...If Ind(A, ∅, C) holds in M, then IM(A, C) = 0. Since Ind(A, ∅, B) does not hold in M, we have H2(N) − H1(N) = IM(A, B) > 0. On the other hand, if Ind(A, B, C) holds in M, then IM(A, B) = IM(A, C) + IM(A, B|C) [12] (equation 2.3.18), where I(A, B|C) is the average conditional mutual information between A and B given C, I(A, B|C) = ∑_ACB P(ACB) log(P(A|CB)/P(A|C)). Hence, IM(A, B) − IM(A, C) = IM(A, B|C) ≥ 0, with eq... |
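The identity quoted in this context, IM(A, B) = IM(A, C) + IM(A, B|C) when A is independent of C given B, follows from the chain rule of mutual information and can be checked numerically. A minimal sketch: the joint distribution below is a made-up example built as P(A, B, C) = P(B)·P(A|B)·P(C|B), so A ⊥ C | B holds by construction; all probability values are illustrative assumptions, not from the paper.

```python
# Numerical check of I(A;B) = I(A;C) + I(A;B|C) under A ⊥ C | B.
import itertools
import math

def joint():
    # Hypothetical P(A,B,C) = P(B) P(A|B) P(C|B), which forces A ⊥ C | B.
    pB = {0: 0.4, 1: 0.6}
    pA_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}  # P(A|B)
    pC_B = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.5, 1: 0.5}}  # P(C|B)
    return {(a, b, c): pB[b] * pA_B[b][a] * pC_B[b][c]
            for a, b, c in itertools.product((0, 1), repeat=3)}

def marg(p, keep):
    # Marginalize the joint onto the coordinate indices in `keep`.
    out = {}
    for k, v in p.items():
        kk = tuple(k[i] for i in keep)
        out[kk] = out.get(kk, 0.0) + v
    return out

def mi(p, x, y):
    # I(X;Y); x and y are tuples of coordinate indices into (A,B,C).
    pxy, px, py = marg(p, x + y), marg(p, x), marg(p, y)
    return sum(v * math.log(v / (px[k[:len(x)]] * py[k[len(x):]]))
               for k, v in pxy.items() if v > 0)

def cmi(p, x, y, z):
    # I(X;Y|Z) = I(X; Y,Z) - I(X;Z)  (chain rule)
    return mi(p, x, y + z) - mi(p, x, z)

p = joint()
A, B, C = (0,), (1,), (2,)
lhs = mi(p, A, B)
rhs = mi(p, A, C) + cmi(p, A, B, C)
assert abs(lhs - rhs) < 1e-12  # the identity holds under A ⊥ C | B
```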

1290 | Local computation with probabilities on graphical structures and their application to expert systems - Lauritzen, Spiegelhalter - 1988 |

1250 |
On information and sufficiency
- Kullback, Leibler
- 1951
Citation Context ...robabilistic model M over a set N of variables, we would like to learn a DMN (G, P) that is an approximation of M. To measure the closeness of (G, P) to M, we adopt the Kullback-Leibler cross entropy [26], K(PM, P) = ∑_v PM(v) log(PM(v)/P(v)), where PM is the true jpd defined by M and v is a configuration of N. A DMN that minimizes K(PM, P) will be considered as the best approximation of M. It has been... |
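The Kullback-Leibler cross entropy in this snippet can be computed directly from two distributions over the same configurations. A minimal sketch, with made-up example distributions (dicts mapping configurations to probabilities):

```python
# K(P_M, P) = sum_v P_M(v) log(P_M(v) / P(v))  -- zero iff P matches P_M.
import math

def kl(p_m, p):
    """Kullback-Leibler cross entropy between the true jpd p_m and approximation p."""
    return sum(pm * math.log(pm / p[v]) for v, pm in p_m.items() if pm > 0)

# Hypothetical true jpd over two binary variables, and a uniform approximation.
true_jpd = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
approx   = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

assert kl(true_jpd, true_jpd) == 0.0  # perfect approximation scores zero
assert kl(true_jpd, approx) > 0.0     # any mismatch scores positive
```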

1134 |
Algorithmic Graph Theory and Perfect Graphs
- Golumbic
- 1980
Citation Context ...at the same level, otherwise the next higher level of search starts. The following analyzes the worst case time complexity of the algorithm. Testing the chordality of G can be performed in O(|N|) time [13]. A JT can be computed by a maximal spanning tree algorithm [19]. A maximal spanning tree of a graph with v nodes and e links can be computed in O((v+e) log v) time [25]. Since a complete graph has O(v... |

1079 | Bayesian method for the induction of probabilistic networks from data
- Cooper, Herskovits
- 1992
Citation Context ...by minimizing the entropy of the distribution defined by the BN. Their method starts with an empty graph (no links) and adds one link at each pass during search. Later, they proposed the K2 algorithm [8] that learns a BN based on a Bayesian method which selects a BN with the highest posterior probability given a database. A similar algorithm was independently developed by Buntine [2]. Recently, Hecke... |

905 | Learning Bayesian networks: the combination of knowledge and statistical data
- Heckerman, Geiger, et al.
- 1995
Citation Context ...s a BN based on a Bayesian method which selects a BN with the highest posterior probability given a database. A similar algorithm was independently developed by Buntine [2]. Recently, Heckerman et al. [16] applied the Bayesian method to learning a BN by combining prior knowledge and statistical data. Spirtes and Glymour [31] developed the PC algorithm that learns a BN by deleting links from a complete... |

830 | Reversible jump Markov chain Monte Carlo computation and Bayesian model determination
- Green
- 1995
Citation Context ...arch. We present in Section 6 a multi-link lookahead algorithm based on the minimum entropy search. Experimental results are presented in Section 7. However, other methods such as stochastic search [30, 17] may not have this problem. We consider learning processes that infer dependencies contained in the domain model. As we have no direct access to the model, we must infer dependencies from the data g... |

637 | Approximating discrete probability distributions with dependence trees
- Chow, Liu
- 1968
Citation Context ...of probabilistic networks have been amply demonstrated in many artificial intelligence domains [4], many researchers turn their attention to automatic learning of such networks from data. Chow and Liu [7] pioneered learning of probabilistic networks. They developed an algorithm to approximate a joint probability distribution (jpd) by a tree-structured BN. Rebane and Pearl [29] extended their method to... |

597 | K.: Irrelevant features and the subset selection problem
- John, Kohavi, et al.
- 1994
Citation Context ...wbacks of a single-link lookahead search which is commonly used in learning probabilistic networks. It is well known that parity functions cause failure of many decision tree learning algorithms (see [33, 25] for example). We show that a class of probabilistic domain models, that forms a generalization of parity functions, cannot be learned by a single-link lookahead search procedure. Although our obser... |
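The parity failure mode mentioned in this context is easy to reproduce: with C = A XOR B and A, B uniform, every pairwise mutual information is exactly zero, so a search that scores one link at a time sees no entropy decrease for any single link, even though the three variables are jointly dependent. The sketch below is a hypothetical three-variable instance, not the paper's general class of models.

```python
# With C = A xor B, all pairwise mutual informations vanish despite joint dependence.
import itertools
import math

# Joint distribution: A, B uniform and independent, C = A xor B.
p = {}
for a, b in itertools.product((0, 1), repeat=2):
    p[(a, b, a ^ b)] = 0.25

def project(joint, i, j):
    # Marginalize the 3-variable joint onto coordinates i and j.
    out = {}
    for k, v in joint.items():
        out[(k[i], k[j])] = out.get((k[i], k[j]), 0.0) + v
    return out

def mi(pairs):
    # Mutual information between the two coordinates of a pairwise joint.
    pxy, px, py = {}, {}, {}
    for (x, y), v in pairs.items():
        pxy[(x, y)] = pxy.get((x, y), 0.0) + v
        px[x] = px.get(x, 0.0) + v
        py[y] = py.get(y, 0.0) + v
    return sum(v * math.log(v / (px[x] * py[y]))
               for (x, y), v in pxy.items() if v > 0)

# Every pair looks independent: a single-link lookahead scores nothing.
for i, j in [(0, 1), (0, 2), (1, 2)]:
    assert abs(mi(project(p, i, j))) < 1e-12
```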

538 |
Theory of Statistics
- Schervish
- 1995
Citation Context ...tional independencies (see Figure 4). Therefore, the method of Fung and Crawford cannot be used under these circumstances. We have instead applied the test of goodness-of-fit for composite hypotheses [9]. We describe the method below. Recall from Section 6 that the replacement of F0 by F2 causes the maximum amount of decrease of entropy. The CI test is performed for deciding if F0 should be rejected... |

432 |
The Theory of Relational Databases
- Maier
- 1983
Citation Context ...in P but not in M. Now since X′ = Y′ = ∅, cliques of G1 are identical to those of G except cliques covered by G0 are unioned into a single clique X′ ∪ Y′ ∪ Z in G′1. If we apply to G1 Graham reduction [24], which recursively removes leaf cliques of a graph, we will end up with an empty graph (Graham reduction succeeds). This is because we can follow the same sequence of leaf clique removal that leads u... |

374 | Structuring in Belief Networks
- Pearl, “Fusion
- 1986
Citation Context ...ork (BN) structure is a directed acyclic graph and a decomposable Markov network (DMN) structure is an undirected chordal graph. As many effective probabilistic inference techniques have been developed [27, 17, 23, 20, 36] and the applicability of probabilistic networks has been amply demonstrated in many artificial intelligence domains [4], many researchers turn their attention to automatic learning of such networks f... |

298 | Learning Bayesian Networks: The combination of knowledge and statistical data
- Heckerman, Geiger, et al.
- 1995
Citation Context ...e BN. Instead of learning a BN, Fung and Crawford [11] developed the Constructor algorithm that learns a DMN. A more extensive review of literature for learning probabilistic networks can be found in [18, 8, 3, 15]. In this paper we consider learning a DMN from a database. Pearl [28] showed that directionality makes BNs a richer language in expressing dependencies. For instance, an induced dependency can be exp... |

287 |
Bayesian updating in causal probabilistic networks by local computation
- Jensen, Lauritzen, et al.
- 1990
Citation Context ...ork (BN) structure is a directed acyclic graph and a decomposable Markov network (DMN) structure is an undirected chordal graph. As many effective probabilistic inference techniques have been developed [27, 17, 23, 20, 36] and the applicability of probabilistic networks has been amply demonstrated in many artificial intelligence domains [4], many researchers turn their attention to automatic learning of such networks f... |

265 | Model selection and accounting for model uncertainty in graphical models using Occam's window
- Madigan, Raftery
- 1994
Citation Context ...eloped the Constructor algorithm that learns a DMN. Dawid and Lauritzen [11] study ‘hyper Markov laws’ in learning numerical parameters of a DMN with a given decomposable graph. Madigan and Raftery [29] proposed algorithms for learning a set of acceptable models expressed as BNs or DMNs. A more extensive review of literature for learning probabilistic networks can be found in [22, 10, 5, 19]. In thi... |

248 | Operations for learning with graphical model
- Buntine
- 1994
Citation Context ...e BN. Instead of learning a BN, Fung and Crawford [11] developed the Constructor algorithm that learns a DMN. A more extensive review of literature for learning probabilistic networks can be found in [18, 8, 3, 15]. In this paper we consider learning a DMN from a database. Pearl [28] showed that directionality makes BNs a richer language in expressing dependencies. For instance, an induced dependency can be exp... |

238 |
The ALARM Monitoring System: A Case Study with Two Probabilistic Inference Techniques for Belief Networks
- Beinlich, Suermondt, et al.
- 1989
Citation Context ...η constraints are involved in the assignment, and ensure that the new assignment conforms to the constraints. We start by assigning the single configuration in GP0: P(0, ..., 0) = 0.5^{η−1} q, where q ∈ [0, 1] and q ≠ 0.5. This assignment does not violate any constraints. We then assign a configuration in GP1: P(0, ..., 0, 1) = P(X1,0, ..., Xη−1,0) − P(X1,0, ..., Xη−1,0, Xη,0) = 0.5^{η−1}(1 − q). Note that this ass... |

235 | Bayesian Networks without Tears
- Charniak
- 1991
Citation Context ...ve way to uncover many domain models. Keywords: Inductive learning, reasoning under uncertainty, knowledge acquisition, Markov networks, probabilistic networks. 1 Introduction A probabilistic network [35, 32, 18, 6] combines a qualitative graphic structure which encodes domain dependencies, with a quantitative probability distribution which encodes the strength of the dependencies. The network structure can be a... |

227 | Bayesian graphical models for discrete data
- Madigan, York
- 1995
Citation Context ...arch. We present in Section 6 a multi-link lookahead algorithm based on the minimum entropy search. Experimental results are presented in Section 7. However, other methods such as stochastic search [30, 17] may not have this problem. We consider learning processes that infer dependencies contained in the domain model. As we have no direct access to the model, we must infer dependencies from the data g... |

202 |
Boolean feature discovery in empirical learning
- Pagallo, Haussler
- 1990
Citation Context ...wbacks of a single-link lookahead search which is commonly used in learning probabilistic networks. It is well known that parity functions cause failure of many decision tree learning algorithms (see [33, 25] for example). We show that a class of probabilistic domain models, that forms a generalization of parity functions, cannot be learned by a single-link lookahead search procedure. Although our obser... |

184 |
Propagating uncertainty in Bayesian networks by probabilistic logic sampling
- Henrion
- 1988
Citation Context ...k (BN) structure is a directed acyclic graph and a decomposable Markov network (DMN) structure is an undirected chordal graph. As many effective probabilistic inference techniques have been developed [34, 21, 28, 24, 45] and the applicability of probabilistic networks has been amply demonstrated in many artificial intelligence domains [6], many researchers turn their attention to automatic learning of such network... |

182 | Theory refinement on Bayesian networks
- Buntine
- 1991
Citation Context ...ould be connected, then we can focus the multi-link lookahead search based on the resultant network. We leave such an investigation to future work. Related work on beam search can be found in Buntine [4]. 7 Experimental Results Algorithm 1 was implemented and a set of experiments were performed to (1) test if an I-map reasonably close to a control model can be learned given a reasonably large databas... |

151 |
Introduction to Algorithms: A Creative Approach
- Manber
- 1989
Citation Context ...can be performed in O(|N|) time [13]. A JT can be computed by a maximal spanning tree algorithm [19]. A maximal spanning tree of a graph with v nodes and e links can be computed in O((v+e) log v) time [25]. Since a complete graph has O(v^2) links, a maximal spanning tree can be computed in O(v^2 log v) time. Equivalently, computation of a JT of a chordal graph with k nodes and v cliques takes O(v^2 log... |
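The junction-tree construction referred to in this context (a maximal spanning tree over cliques [19]) can be illustrated with Prim's algorithm, weighting each candidate edge by the size of the clique intersection. A toy sketch with made-up cliques; the naive double loop makes the O(v^2)-per-step cost visible rather than optimizing it:

```python
# Build a junction tree as a maximal spanning tree over cliques,
# where edge weight = |intersection| and each tree edge carries its separator.
def junction_tree(cliques):
    n = len(cliques)
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                w = len(cliques[i] & cliques[j])
                if best is None or w > best[0]:
                    best = (w, i, j)
        _, i, j = best
        in_tree.add(j)
        edges.append((i, j, cliques[i] & cliques[j]))  # separator on the edge
    return edges

# Cliques of a small chordal graph over variables a..d (a chain a-b-c-d).
cliques = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
jt = junction_tree(cliques)
assert [sep for _, _, sep in jt] == [{"b"}, {"c"}]
```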

150 |
Probabilistic Reasoning in Expert Systems
- Neapolitan
- 1990
Citation Context ...t in a multi-link lookahead search. Keywords: Inductive learning, reasoning under uncertainty, knowledge acquisition, Markov networks, probabilistic networks. 1 Introduction A probabilistic network [28, 26, 14, 4] combines a qualitative graphic structure, which encodes domain dependencies, with a quantitative probability distribution, which encodes the strength of the dependencies. The network structure can be... |

119 |
Hyper Markov laws in the statistical analysis of decomposable graphical models. Annals of Statistics
- Dawid, Lauritzen
- 1993
Citation Context ...f its own encoding length and the encoding length of the data given the BN. Instead of learning a BN, Fung and Crawford [14] developed the Constructor algorithm that learns a DMN. Dawid and Lauritzen [11] study ‘hyper Markov laws’ in learning numerical parameters of a DMN with a given decomposable graph. Madigan and Raftery [29] proposed algorithms for learning a set of acceptable models expressed a... |

82 |
An algorithm for fast recovery of sparse causal graphs
- Spirtes, Glymour
- 1991
Citation Context ...lgorithm was independently developed by Buntine [2]. Recently, Heckerman et al. [16] applied the Bayesian method to learning a BN by combining prior knowledge and statistical data. Spirtes and Glymour [31] developed the PC algorithm that learns a BN by deleting links from a complete graph. Lam and Bacchus [22] applied the minimal description length (MDL) principle to learning a BN, which evaluates a BN... |

79 | Multiply sectioned Bayesian networks and junction forests for large knowledgebased systems
- Xiang, Jensen, et al.
- 1993
Citation Context ...es is concerned, a DMN is equally expressive as a BN. Jensen et al.'s method can be extended to probabilistic inference with multiply sectioned Bayesian networks in a single agent oriented system [36, 35] as well as in a multiagent distributed interpretation system [34]. The run time representation is a set of DMNs (in terms of a set of JTs). It has been shown [33] that computation of posterior margin... |

59 |
KUTATO: An Entropy-Driven System for Construction of Probabilistic Expert Systems from Databases
- Herskovits, Cooper
- 1990
Citation Context ...y real world domain models cannot be represented adequately with a tree-structured network. The following algorithms are all applicable to learning a multiply connected network. Herskovits and Cooper [22] developed the Kutato algorithm to learn a BN from a database of cases by minimizing the entropy of the distribution defined by the BN. Their method starts with an empty graph (no links) and adds one... |

48 |
A fast procedure for model search in multidimensional contingency tables
- Edwards, Havranek
- 1985
Citation Context ...data on six probable risk factors for coronary thrombosis [37]. With κ = 2 and δh = 0.004, the DMN structure in Figure 6 was obtained. Our result is consistent with the models learned by other methods [12, 29]. We then tested the algorithm using the ALARM model [1] with 37 variables. A database of 30000 cases, generated from the BN, was used in the learning. A control DMN was obtained by converting the ori... |

42 | Constructor: A system for the induction of probabilistic models
- Fung, Crawford
- 1990
Citation Context ...e to learning a BN, which evaluates a BN as the best if it has the minimal sum of its own encoding length and the encoding length of the data given the BN. Instead of learning a BN, Fung and Crawford [11] developed the Constructor algorithm that learns a DMN. A more extensive review of literature for learning probabilistic networks can be found in [18, 8, 3, 15]. In this paper we consider learning a D... |

41 |
Properties of Bayesian belief network learning algorithms
- Bouckaert
- 1994
Citation Context ...MSs. Finally, as BNs and DMNs are so closely related, knowledge gained in learning one of them will benefit the learning of the other. It has been shown that learning probabilistic networks is NP-hard [1, 6]. Therefore, it is justified to use heuristic search in learning. Many algorithms developed use a scoring metric and a search procedure. The scoring metric evaluates the goodness-of-fit of a structure t... |

41 | A probabilistic framework for cooperative multiagent distributed interpretation and optimization of communication
- Xiang
- 1996
Citation Context ...he method can be extended to probabilistic inference with multiply sectioned Bayesian networks in a single agent oriented system [45, 44] as well as in a multi-agent distributed interpretation system [43]. The run time representation is a set of DMNs (in terms of a set of JTs). It has been shown [42, 40] that computation of posterior probabilities of a BN can be performed using an extended relational... |

35 |
Uncertain Information Processing in Expert Systems
- Hájek, Havránek, et al.
- 1992
Citation Context ...ve way to uncover many domain models. Keywords: Inductive learning, reasoning under uncertainty, knowledge acquisition, Markov networks, probabilistic networks. 1 Introduction A probabilistic network [35, 32, 18, 6] combines a qualitative graphic structure which encodes domain dependencies, with a quantitative probability distribution which encodes the strength of the dependencies. The network structure can be a... |

32 |
Decomposition of maximum likelihood in mixed graphical interaction models
- Frydenberg, Lauritzen
- 1989
Citation Context ...implies that G1 is chordal. Projecting PM to G1, we obtain a new DMN (G1, P1). From the discussion above, G is augmented into G1 through one of the three cases of Theorem 6. Frydenberg and Lauritzen [10] (p. 553) proved that, given two chordal graphs with one being the subgraph of the other, there is an increasing sequence of chordal graphs between them that differ by exactly one link. Our result here i... |

31 |
Junction tree and decomposable hypergraphs
- Jensen
- 1988
Citation Context ...rts. The following analyzes the worst case time complexity of the algorithm. Testing the chordality of G can be performed in O(|N|) time [13]. A JT can be computed by a maximal spanning tree algorithm [19]. A maximal spanning tree of a graph with v nodes and e links can be computed in O((v+e) log v) time [25]. Since a complete graph has O(v^2) links, a maximal spanning tree can be computed in O(v^2 log... |

29 | A method for implementing a probabilistic model as a relational database - Wong, Butz, et al. - 1995 |

27 | Classifiers: a theoretical and empirical study
- Buntine
- 1991
Citation Context ...he K2 algorithm [10] that learns a BN based on a Bayesian method which selects a BN with the highest posterior probability given a database. A similar algorithm was independently developed by Buntine [3]. Recently, Heckerman et al. [20] applied the Bayesian method to learning a BN by combining prior knowledge and statistical data. Spirtes and Glymour [39] developed the PC algorithm that learns a BN by... |

27 |
Learning Bayesian networks: an approach based on the MDL principle
- Lam, Bacchus
- 1994
Citation Context ...ethod to learning a BN by combining prior knowledge and statistical data. Spirtes and Glymour [31] developed the PC algorithm that learns a BN by deleting links from a complete graph. Lam and Bacchus [22] applied the minimal description length (MDL) principle to learning a BN, which evaluates a BN as the best if it has the minimal sum of its own encoding length and the encoding length of the data give... |

21 | Construction of a Markov network from data for probabilistic inference
- Wong, Xiang
- 1994
Citation Context ...nd selects the best based on the evaluation. Out of many possible scoring metrics, Bayesian metrics, description length metrics and entropy metrics have been used and studied by several researchers [22, 3, 10, 27, 29, 20, 2, 41]. In many common cases, a Bayesian metric can be constructed that is equivalent to a description length metric, or at least approximately equal. See for instance [7, 38] for a more detailed discussion... |

19 |
On Information and Sufficiency
- Kullback, Leibler
- 1951
Citation Context ...ach originally presented in [32]. Given M over N, we would like to learn a DMN (G, P) that is an approximation of M. To measure the closeness of (G, P) to M, we adopt the Kullback-Leibler cross entropy [21]: K(PM, P) = ∑_x PM(x) log(PM(x)/P(x)), where PM is the true jpd defined by M and x is a configuration of N. A DMN that minimizes K(PM, P) will be considered as the best approximation of M. Since K(PM, P) = H... |

13 | Representation of bayesian networks as relational databases - Wong, Xiang, et al. |

8 |
Learning Bayesian networks: search methods and experimental results
- Chickering, Geiger, et al.
- 1995
Citation Context ...MSs. Finally, as BNs and DMNs are so closely related, knowledge gained in learning one of them will benefit the learning of the other. It has been shown that learning probabilistic networks is NP-hard [1, 6]. Therefore, it is justified to use heuristic search in learning. Many algorithms developed use a scoring metric and a search procedure. The scoring metric evaluates the goodness-of-fit of a structure t... |

8 |
The recovery of causal poly-trees from statistical data
- Rebane, Pearl
- 1987
Citation Context ...s from data. Chow and Liu [9] pioneered learning of probabilistic networks. They developed an algorithm to approximate a joint probability distribution (jpd) by a tree-structured BN. Rebane and Pearl [36] extended their method to learn a polytree-structured BN. However, many real world domain models cannot be represented adequately with a tree-structured network. The following algorithms are all appli... |

6 |
Overview of model selection
- Cheeseman
- 1993
Citation Context ...Out of many possible scoring metrics, the Bayesian metric, the description length metric and the entropy metric have been used and studied by several researchers [18, 2, 8, 22, 16, 1, 32]. Cheeseman [5] showed that the Bayesian metric and the description length metric are equivalent subject to a constant difference. Lam and Bacchus [22] showed that in applying the MDL principle to learning a BN, the enco... |

6 |
Small-sample and large-sample statistical model selection criteria
- Sclove
- 1994
Citation Context ...automatic balance of model complexity and fitness of data in MDL and Bayesian approaches, the level here acts as a user-controlled lever to balance the two. The necessity of such balance is discussed in [30]. In Section 8, we further illustrate experimentally the leverage that can be provided by the CI test. 8 Experimental Results A set of ten DMNs were randomly generated to serve as the control PMs. We... |

6 | Distributed multi-agent probabilistic reasoning with Bayesian networks
- Xiang
- 1994
Citation Context ...l's method can be extended to probabilistic inference with multiply sectioned Bayesian networks in a single agent oriented system [36, 35] as well as in a multiagent distributed interpretation system [34]. The run time representation is a set of DMNs (in terms of a set of JTs). It has been shown [33] that computation of posterior marginal probabilities of a BN can be performed using an extended relati... |

2 |
Prognostic significance of the risk profile in the prevention of coronary heart disease. Bratisl. Lek. Listy
- Reinis, Pokorny, et al.
- 1981
Citation Context ...r; and (3) test if the multi-link lookahead is necessary and effective to learn an embedded PI submodel. The algorithm was first run with the data on six probable risk factors for coronary thrombosis [37]. With κ = 2 and δh = 0.004, the DMN structure in Figure 6 was obtained. Our result is consistent with the models learned by other methods [12, 29]. We then tested the algorithm using the ALARM model [... |

1 |
Classifiers: a theoretical and empirical study. In R. Lopez de Mantaras and
- Buntine
- 1991
Citation Context ...the K2 algorithm [8] that learns a BN based on a Bayesian method which selects a BN with the highest posterior probability given a database. A similar algorithm was independently developed by Buntine [2]. Recently, Heckerman et al. [16] applied the Bayesian method to learning a BN by combining prior knowledge and statistical data. Spirtes and Glymour [31] developed the PC algorithm that learns a BN by... |

1 |
Uncertain Information Processing in Expert Systems
- Hájek, Havránek, et al.
- 1992
Citation Context ...t in a multi-link lookahead search. Keywords: Inductive learning, reasoning under uncertainty, knowledge acquisition, Markov networks, probabilistic networks. 1 Introduction A probabilistic network [28, 26, 14, 4] combines a qualitative graphic structure, which encodes domain dependencies, with a quantitative probability distribution, which encodes the strength of the dependencies. The network structure can be... |