## Hinge-loss Markov random fields and probabilistic soft logic (2015)

Citations: | 6 - 4 self |

### Citations

13209 | Statistical Learning Theory
- Vapnik
- 1998
Citation Context ... (2004) connected SP and PGMs by showing how to train MRFs with large-margin estimation, a generalization of the large-margin objective for binary classification used to train support vector machines (=-=Vapnik, 2000-=-). Large-margin learning is a well-studied approach to train structured predictors because it directly incorporates the structured loss function into a convex upper bound on the true objective: the reg... |

1366 | Fuzzy Sets and Fuzzy Logic. Theory and Applications
- Klir, Yuan
- 1995
Citation Context ...ly continuous information, such as similarity, vague or fuzzy concepts, and real-valued data. Instead of interpreting the clauses C using Boolean logic, we can interpret them using Łukasiewicz logic (=-=Klir and Yuan, 1995-=-), which extends Boolean logic to infinite-valued logic in which the propositions x can take truth values in the continuous interval [0, 1]. Extending truth values to a continuous domain enables them ... |

997 | Distributed optimization and statistical learning via the alternating direction method of multipliers
- Boyd, Parikh, et al.
Citation Context ...rder to divide problem (45) into independent subproblems that are easier to solve, using the alternating direction method of multipliers (ADMM) (Glowinski and Marrocco, 1975; Gabay and Mercier, 1976; =-=Boyd et al., 2011-=-). The first step is to form the augmented Lagrangian function for the problem. Let $\alpha = (\alpha_1, \ldots, \alpha_{m+r})$ be a concatenation of vectors of Lagrange multipliers. Then the augmented Lagrangian is L(yL,α... |

901 | Combinatorial Optimization: Polyhedra and Efficiency
- Schrijver
- 2003
Citation Context ...ctured dependencies, models with submodular potential functions, models encoding bipartite matching problems, and those with nand potentials and perfect graph structures (Wainwright and Jordan, 2008; =-=Schrijver, 2003-=-; Jebara, 2009; Foulds et al., 2011). Researchers have also studied performance guarantees of other subclasses of the first-order local consistency relaxation. Kleinberg and Tardos (2002) and Chekuri ... |

815 | Markov Logic Networks
- Richardson, Domingos
- 2006
Citation Context ...w probabilistic programming language that makes HL-MRFs easy to define and use for large, relational data sets. This idea has been explored for other classes of models, such as Markov logic networks (=-=Richardson and Domingos, 2006-=-) for discrete MRFs, relational dependency networks (Neville and Jensen, 2007) for dependency networks, and probabilistic relational models (Getoor et al., 2002) for Bayesian networks. We build on the... |

796 | Reducing the dimensionality of data with neural networks - Hinton, Salakhutdinov - 2006 |

660 | Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms.
- Collins
- 2002
Citation Context ...f the log-likelihood with respect to a parameter $W_q$ is $\frac{\partial \log P(y|x)}{\partial W_q} = E_W[\Phi_q(y,x)] - \Phi_q(y,x)$, (61) where $E_W$ is the expectation under the distribution defined by $W$. The voted perceptron algorithm (=-=Collins, 2002-=-) optimizes $W$ by taking steps of fixed length in the direction of the gradient, then averaging the points after all steps. Any step that is outside the feasible region is projected back before continu... |

624 | Large margin methods for structured and interdependent output variables
- Tsochantaridis, Joachims, et al.
- 2005
Citation Context ...uctured loss function. The objective is therefore encoded as a norm minimization problem subject to many linear constraints, one for each possible prediction in the structured space. Structured SVMs (=-=Tsochantaridis et al., 2005-=-) extend large-margin estimation to a broad class of structured predictors and admit a tractable cutting-plane learning algorithm. This algorithm will terminate in a number of iterations linear in the... |

612 | Learning probabilistic relational models
- Friedman, Getoor, et al.
- 1999
Citation Context ...and SRL encompass many approaches. One broad area of work—of which PSL is a part—uses first-order logic and other relational formalisms to specify templates for PGMs. Probabilistic relational models (=-=Friedman et al., 1999-=-) define templates for BNs in terms of a database schema, and they can be grounded out over instances of that schema to create BNs. Relational dependency networks (Neville and Jensen, 2007) template R... |

603 | Max-Margin Markov Networks - Taskar, Guestrin, et al. - 2003 |

533 | Inductive logic programming: theory and methods. J Log Program. - Muggleton, Raedt - 1994 |

488 | Convergent tree-reweighted message passing for energy minimization - Kolmogorov |

471 | Some simplified NP-complete graph problems
- Garey, Johnson, et al.
- 1976
Citation Context ...or easily defining rich, structured models for a wide range of problems, there is a new challenge: finding a most probable assignment to the variables, i.e., MAP inference, is NP-hard (Shimony, 1994; =-=Garey et al., 1976-=-). This means that (unless P=NP) our only hope for performing tractable inference is to perform it approximately. Observe that MAP inference for an MRF defined by C is the integer linear program arg m... |

385 | Statistical analysis of non-lattice data
- Besag
- 1975
Citation Context ... structured perceptron. 6.2 Maximum Pseudolikelihood Estimation Since exact maximum likelihood estimation is intractable in general, we can instead perform maximum-pseudolikelihood estimation (MPLE) (=-=Besag, 1975-=-), which maximizes the likelihood of each variable conditioned on all other variables, i.e., $P^*(y|x) = \prod_{i=1}^{n} P^*(y_i \mid \mathrm{MB}(y_i), x)$ (62) $= \prod_{i=1}^{n} \frac{1}{Z_i(W, y, x)} \exp\left[-f_w^i(y_i, y, x)\right]$; (63) Z(w, yi) = ∫ yi ex... |

376 | Eigentaste: A Constant Time Collaborative Filtering Algorithm - Goldberg, Roeder, et al. - 2001 |

339 | Efficient clustering of high-dimensional data sets with application to reference matching
- McCallum, Nigam, et al.
- 2000
Citation Context ... number of entities. If handled naively, this could make scaling to large data sets difficult, but this problem is often handled by constructing blocks (e.g., Newcombe and Kennedy, 1962) or canopies (=-=McCallum et al., 2000-=-) over the entities, so that a limited subset of all possible links are actually considered. Blocking partitions the entities so that only links among entities in the same partition element, i.e., blo... |

321 | Cutting-plane training of structural SVMs. - Joachims, Finley, et al. - 2009 |

309 | Adaptive subgradient methods for online learning and stochastic optimization - Duchi, Hazan, et al. - 2011 |

299 | The Probabilistic Method, Wiley-Interscience Series in Discrete Mathematics and Optimization
- Alon, Spencer
- 2000
Citation Context ... One simple example of such a function is $p_i = \frac{1}{2}\hat{y}^\star_i + \frac{1}{4}$. (8) In this way, objective (7) leads to an expected .75 approximation of the MAX SAT solution. The method of conditional probabilities (=-=Alon and Spencer, 2008-=-) can find a single Boolean assignment that achieves at least the expected score from a set of rounding probabilities, and therefore at least .75 of the MAX SAT solution when objective (7) and functio... |

282 | A dual algorithm for the solution of nonlinear variational problems via finite-element approximations
- Gabay, Mercier
- 1976
Citation Context ... $y_{(L,\hat{i})} = y_{(C,\hat{i})}$ in order to divide problem (45) into independent subproblems that are easier to solve, using the alternating direction method of multipliers (ADMM) (Glowinski and Marrocco, 1975; =-=Gabay and Mercier, 1976-=-; Boyd et al., 2011). The first step is to form the augmented Lagrangian function for the problem. Let $\alpha = (\alpha_1, \ldots, \alpha_{m+r})$ be a concatenation of vectors of Lagrange multipliers. Then the augmented L... |

230 | Learning structured prediction models: a large margin approach - Taskar, Chatalbashev, et al. |

228 | Deep Boltzmann machines
- Salakhutdinov, Hinton
- 2009
Citation Context ...on and Domingos (2011) and compare against the results they report, which include tests using sum product networks, deep belief networks (Hinton and Salakhutdinov, 2006), and deep Boltzmann machines (=-=Salakhutdinov and Hinton, 2009-=-). We train HL-MRFs and discrete MRFs with all three learning methods: maximum likelihood estimation (MLE), maximum pseudolikelihood estimation (MPLE), and large-margin estimation (LME). When appropria... |

197 | Approximation algorithms for classification problems with pairwise relationships: Metric labeling and markov random fields. - Kleinberg, Tardos - 1999 |

189 | Bayesian probabilistic matrix factorization using Markov chain Monte Carlo
- Salakhutdinov, Mnih
- 2008
Citation Context ... For link prediction for preference prediction, a task that is inherently continuous and nontrivial to encode in discrete logic, we compare against Bayesian probabilistic matrix factorization (BPMF) (=-=Salakhutdinov and Mnih, 2008-=-). Finally, for image completion, we run the same experimental setup as Poon and Domingos (2011) and compare against the results they report, which include tests using sum product networks, deep belie... |

184 | BLOG: Probabilistic Models with Unknown Objects
- Milch, Marthi, et al.
- 2005
Citation Context ...ons. In this paper we emphasize designing algorithms that are flexible enough to support the full class of HL-MRFs. Examples of probabilistic programming languages include IBAL (Pfeffer, 2001), BLOG (=-=Milch et al., 2005-=-), ProbLog (De Raedt et al., 2007), Church (Goodman et al., 2008), Figaro (Pfeffer, 2009), and FACTORIE (McCallum et al., 2009). 7.2 Inference Whether viewed as MAP inference for an MRF or SP without ... |

182 | Using linear programming to decode binary linear codes - Feldman, Wainwright, et al. - 2005 |

178 | Collective classification in network data - Sen, Namata, et al. |

177 | Incremental parsing with the perceptron algorithm.
- Collins, Roark
- 2004
Citation Context ...nents of the structure, so that each one-dimension prediction problem can be conditioned on the most useful information. Examples of learn-to-search methods include incremental structured perceptron (=-=Collins and Roark, 2004-=-), SEARN (Daumé III et al., 2009), DAgger (Ross et al., 2011), and AggreVaTe (Ross and Bagnell, 2014). In this paper we focus on SP methods that perform joint prediction directly. Better understandin... |

174 | A linear programming approach to max-sum problem: A review. - Werner - 2007 |

167 | Sur l'approximation par éléments finis d'ordre un, et la résolution par pénalisation-dualité d'une classe de problèmes de Dirichlet non linéaires, Rev. Française d'Aut
- Glowinski, Marrocco
- 1975
Citation Context ...relax the equality constraints $y_{(L,\hat{i})} = y_{(C,\hat{i})}$ in order to divide problem (45) into independent subproblems that are easier to solve, using the alternating direction method of multipliers (ADMM) (=-=Glowinski and Marrocco, 1975-=-; Gabay and Mercier, 1976; Boyd et al., 2011). The first step is to form the augmented Lagrangian function for the problem. Let $\alpha = (\alpha_1, \ldots, \alpha_{m+r})$ be a concatenation of vectors of Lagrange multipli... |

160 | Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations.
- Globerson, Jaakkola
- 2007
Citation Context ... dual decomposition (DD) (Sontag et al., 2011), which solves a problem dual to the LCR objective. Many DD algorithms use coordinate descent, such as TRW-S (Kolmogorov, 2006), MSD (Werner, 2007), MPLP (=-=Globerson and Jaakkola, 2007-=-), and ADLP (Meshi and Globerson, 2011). Other DD algorithms use subgradient-based approaches (e.g., Jojic et al., 2010; Komodakis et al., 2011; Schwing et al., 2012). Another approach to solving the ... |

156 | Minima of Functions of Several Variables with Inequalities as Side Constraints - Karush - 1939 |

144 | ProbLog: A probabilistic Prolog and its application in link discovery. - Raedt, Kimmig, et al. - 2007 |

141 | Church: a language for generative models.
- Goodman, Mansinghka, et al.
- 2008
Citation Context ...flexible enough to support the full class of HL-MRFs. Examples of probabilistic programming languages include IBAL (Pfeffer, 2001), BLOG (Milch et al., 2005), ProbLog (De Raedt et al., 2007), Church (=-=Goodman et al., 2008-=-), Figaro (Pfeffer, 2009), and FACTORIE (McCallum et al., 2009). 7.2 Inference Whether viewed as MAP inference for an MRF or SP without probabilistic semantics, searching over a structured space to fi... |

140 | Finding MAPs for belief networks is NP-hard
- Shimony
- 1994
Citation Context ...have a method for easily defining rich, structured models for a wide range of problems, there is a new challenge: finding a most probable assignment to the variables, i.e., MAP inference, is NP-hard (=-=Shimony, 1994-=-; Garey et al., 1976). This means that (unless P=NP) our only hope for performing tractable inference is to perform it approximately. Observe that MAP inference for an MRF defined by C is the integer ... |

130 | Learning probabilistic models of link structure - Getoor, Friedman, et al. |

113 | Relational dependency networks - Neville, Jensen |

112 | Tightening LP relaxations for MAP using message passing.
- Sontag, Meltzer, et al.
- 2008
Citation Context ...i and Globerson, 2011), which is the primal analog of ADLP. Like AD3, it uses ADMM to optimize the objective. Other approaches to approximate inference include tighter linear programming relaxations (=-=Sontag et al., 2008-=-, 2012). These tighter relaxations enforce local consistency on variable subsets that are larger than individual variables, which makes them higher-order local consistency relaxations. Mezuman et al. ... |

105 | MRF energy minimization and beyond via dual decomposition.
- Komodakis, Paragios, et al.
- 2011
Citation Context ...RW-S (Kolmogorov, 2006), MSD (Werner, 2007), MPLP (Globerson and Jaakkola, 2007), and ADLP (Meshi and Globerson, 2011). Other DD algorithms use subgradient-based approaches (e.g., Jojic et al., 2010; =-=Komodakis et al., 2011-=-; Schwing et al., 2012). Another approach to solving the LCR objective uses message-passing algorithms to solve the problem directly in its primal form. One well-known algorithm is that of Ravikuma... |

93 | Predicting Structured Data - Bakir, Hofmann, et al. - 2007 |

89 | Factorie: Probabilistic programming via imperatively defined factor graphs
- McCallum, Schultz, et al.
- 2009
Citation Context ...s of probabilistic programming languages include IBAL (Pfeffer, 2001), BLOG (Milch et al., 2005), ProbLog (De Raedt et al., 2007), Church (Goodman et al., 2008), Figaro (Pfeffer, 2009), and FACTORIE (=-=McCallum et al., 2009-=-). 7.2 Inference Whether viewed as MAP inference for an MRF or SP without probabilistic semantics, searching over a structured space to find the optimal prediction is an important but difficult task. ... |

87 | Efficient weight learning for Markov logic networks.
- Lowd, Domingos
- 2007
Citation Context ...the feasible region is projected back before continuing. For a smoother ascent, it is often helpful to divide the q-th component of the gradient by the number of groundings |tq| of the q-th template (=-=Lowd and Domingos, 2007-=-), which we do in our experiments. Computing the expectation is intractable, so we use a common approximation: the values of the potential functions at the most probable setting of y with the current ... |

86 | IBAL: A probabilistic rational programming language.
- Pfeffer
- 2001
Citation Context ... SRL for the same reasons. In this paper we emphasize designing algorithms that are flexible enough to support the full class of HL-MRFs. Examples of probabilistic programming languages include IBAL (=-=Pfeffer, 2001-=-), BLOG (Milch et al., 2005), ProbLog (De Raedt et al., 2007), Church (Goodman et al., 2008), Figaro (Pfeffer, 2009), and FACTORIE (McCallum et al., 2009). 7.2 Inference Whether viewed as MAP inferenc... |

81 | New 3/4-approximation algorithms for the maximum satisfiability problem
- Goemans, Williamson
- 1994
Citation Context ...g both discrete and continuous structured data. On the road to deriving HL-MRFs, we unify three different approaches to scalable inference in structured models: (1) randomized algorithms for MAX SAT (=-=Goemans and Williamson, 1994-=-), (2) local consistency relaxation (Wainwright and Jordan, 2008) for discrete Markov random fields defined using Boolean logic, and (3) reasoning about continuous information with fuzzy logic. We sho... |

73 | Sum-product networks: A new deep architecture. - Poon, Domingos - 2011 |

67 | A reduction of imitation learning and structured prediction to no-regret online learning.
- Ross, Gordon, et al.
- 2011
Citation Context ...em can be conditioned on the most useful information. Examples of learn-to-search methods include incremental structured perceptron (Collins and Roark, 2004), SEARN (Daumé III et al., 2009), DAgger (=-=Ross et al., 2011-=-), and AggreVaTe (Ross and Bagnell, 2014). In this paper we focus on SP methods that perform joint prediction directly. Better understanding the differences and relative advantages of joint-prediction... |

66 | Search-based structured prediction. - Langford, John, et al. - 2009 |

62 | Message-passing for graph-structured linear programs: Proximal methods and rounding schemes. - Ravikumar, Agarwal, et al. - 2010 |

61 | Temporal collaborative filtering with bayesian probabilistic tensor factorization. - Xiong, Chen, et al. - 2010 |

57 | Record linkage: making maximum use of the discriminating power of identifying information
- Newcombe, Kennedy
- 1962
Citation Context ...ssible links grows quadratically with the number of entities. If handled naively, this could make scaling to large data sets difficult, but this problem is often handled by constructing blocks (e.g., =-=Newcombe and Kennedy, 1962-=-) or canopies (McCallum et al., 2000) over the entities, so that a limited subset of all possible links are actually considered. Blocking partitions the entities so that only links among entities in t... |

53 | Quadratic programming relaxations for metric labeling and Markov random field MAP estimation. - Ravikumar, Lafferty - 2006 |

49 | Using weighted MAX-SAT engines to solve MPE. - Park - 2002 |

48 | Statistical predicate invention. - Kok, Domingos - 2007 |

45 | Approximating MAPs for belief networks is NP-hard and other theorems.
- Abdelbar, Hedetniemi
- 1998
Citation Context ... is relative to this score, the bound is loosened for the original problem the larger the constant added to the weights is. This is to be expected, since even approximating MAP is NP-hard in general (=-=Abdelbar and Hedetniemi, 1998-=-). We have described how general structural dependencies can be modeled with the logical rules of PSL. It is possible to represent arbitrary logical relationships with them. The process for converting... |

44 | A linear programming formulation and approximation algorithms for the metric labeling problem. - Chekuri, Khanna, et al. - 2004 |

44 | Convexity arguments for efficient minimization of the Bethe and Kikuchi free energies. - Heskes - 2006 |

38 | Learning efficiently with approximate inference via dual losses. In: ICML. - Meshi, Sontag, et al. - 2010 |

37 | Introduction to Statistical Relational Learning
- Getoor, Taskar
- 2007
Citation Context ...tly. Examples include social networks, biological networks, the Web, natural language, computer vision, sensor networks, and so on. Machine learning subfields such as statistical relational learning (=-=Getoor and Taskar, 2007-=-), inductive logic programming (Muggleton and De Raedt, 1994), and structured prediction (BakIr et al., 2007) all seek to represent both the relational structure and dependencies in ∗Computer Science ... |

37 | Accelerated dual decomposition for MAP inference. - Jojic, Gould, et al. - 2010 |

37 | Guided local search for solving SAT and weighted MAX-SAT problems - Mills, Tsang - 2000 |

36 | Probabilistic similarity logic - Broecheler, Getoor - 2009 |

33 | The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems
- An, Tao
- 2005
Citation Context ...ause any distance-based loss to be concave, which require the separation oracle to solve a non-convex objective. For interior ground truth values, we use the difference of convex functions algorithm (=-=An and Tao, 2005-=-) to find a local optimum. Since the concave portion of the loss-augmented inference objective pivots around the ground truth value, the subgradients are 1 or −1, depending on whether the current valu... |

33 | The interior-point revolution in optimization: History, recent developments and lasting consequences - Wright - 2004 |

31 | Discrete Lagrangian-based search for solving MAX-SAT problems - Shang, Wah - 1997 |

29 | Max-margin weight learning for Markov logic networks - Huynh, Mooney - 2009 |

28 | Hinge-loss Markov Random Fields: Convex Inference for Structured Prediction - Bach, Huang, et al. - 2013 |

25 | Solving Markov random fields using second order cone programming relaxations. - Kumar, Torr, et al. - 2006 |

19 | Tight Approximation Algorithms for - Fleischer, Goemans, et al. |

17 | Scaling MPE inference for constrained continuous Markov random fields with consensus optimization. - Bach, Broecheler, et al. - 2012 |

17 | MAP estimation, message passing, and perfect graphs.
- Jebara
- 2009
Citation Context ...es, models with submodular potential functions, models encoding bipartite matching problems, and those with nand potentials and perfect graph structures (Wainwright and Jordan, 2008; Schrijver, 2003; =-=Jebara, 2009-=-; Foulds et al., 2011). Researchers have also studied performance guarantees of other subclasses of the first-order local consistency relaxation. Kleinberg and Tardos (2002) and Chekuri et al. (2005) ... |

16 | A scalable framework for modeling competitive diffusion in social networks. - Broecheler, Shakarian, et al. - 2010 |

16 | Probabilistic programming
- Gordon, Henzinger, et al.
- 2014
Citation Context ...expressivity and accuracy of their discrete counterparts. In addition, HL-MRFs and PSL can reason directly about continuous data. PSL is part of a broad family of probabilistic programming languages (=-=Gordon et al., 2014-=-). The goals of probabilistic programming and SRL often overlap. Probabilistic programming seeks to make constructing probabilistic models easy for the end user, and separate model specification from ... |

16 | Knowledge graph identification.
- Pujara
- 2013
Citation Context ...ling both discrete and continuous data. The effectiveness of HL-MRFs and PSL has also been demonstrated in other publications on many problem domains, including automatic knowledge base construction (=-=Pujara et al., 2013-=-), high-level computer vision (London et al., 2013b), drug discovery (Fakhraei et al., 2014), natural language semantics (Beltagy et al., 2014; Sridhar et al., 2015), automobile-traffic modeling (Chen ... |

13 | Figaro: An Object-Oriented Probabilistic Programming Language. Charles River Analytics
- Pfeffer
- 2009
Citation Context ...full class of HL-MRFs. Examples of probabilistic programming languages include IBAL (Pfeffer, 2001), BLOG (Milch et al., 2005), ProbLog (De Raedt et al., 2007), Church (Goodman et al., 2008), Figaro (=-=Pfeffer, 2009-=-), and FACTORIE (McCallum et al., 2009). 7.2 Inference Whether viewed as MAP inference for an MRF or SP without probabilistic semantics, searching over a structured space to find the optimal predictio... |

11 | A flexible framework for probabilistic models of social trust.
- Huang, Kimmig, et al.
- 2013
Citation Context ...overy (Fakhraei et al., 2014), natural language semantics (Beltagy et al., 2014; Sridhar et al., 2015), automobile-traffic modeling (Chen et al., 2014), and user attribute (Li et al., 2014) and trust (=-=Huang et al., 2013-=-; West et al., 2014) prediction in social networks. The ability to easily incorporate latent variables into HL-MRFs and PSL has enabled innovative applications, including modeling latent topics in tex... |

11 | Efficiently searching for frustrated cycles in MAP inference. - Sontag, Choe, et al. - 2012 |

10 | Learning latent engagement patterns of students in online courses. - Ramesh, Goldwasser, et al. - 2014 |

9 | Fast learning of relational kernels - Landwehr, Passerini, et al. - 2010 |

9 | Collective stability in structured prediction: Generalization from one example. - London, Huang, et al. - 2013 |

9 | Globally convergent dual MAP LP relaxation solvers using Fenchel-Young margins.
- Schwing, Hazan, et al.
- 2012
Citation Context ... MSD (Werner, 2007), MPLP (Globerson and Jaakkola, 2007), and ADLP (Meshi and Globerson, 2011). Other DD algorithms use subgradient-based approaches (e.g., Jojic et al., 2010; Komodakis et al., 2011; =-=Schwing et al., 2012-=-). Another approach to solving the LCR objective uses message-passing algorithms to solve the problem directly in its primal form. One well-known algorithm is that of Ravikumar et al. (2010), which... |

8 | Probabilistic soft logic for semantic textual similarity. - Beltagy, Erk, et al. - 2014 |

8 | Structured learning via logistic regression. - Domke - 2013 |

7 | Exploiting the Power of MIP Solvers in MaxSAT
- Davies, Bacchus
- 2013
Citation Context ...ne such approach, that of Wah and Shang (1997), is essentially a type of DD formulated for MAX SAT. A more recent approach blends convex programming and discrete search via mixed integer programming (=-=Davies and Bacchus, 2013-=-). Additionally, Huynh and Mooney (2009) introduced a linear programming relaxation for MLNs inspired by MAX SAT relaxations, but the relaxation of general Markov logic provides no known guarantees on... |

7 | Revisiting MAP estimation, message passing and perfect graphs.
- Foulds, Navaroli, et al.
- 2011
Citation Context ...h submodular potential functions, models encoding bipartite matching problems, and those with nand potentials and perfect graph structures (Wainwright and Jordan, 2008; Schrijver, 2003; Jebara, 2009; =-=Foulds et al., 2011-=-). Researchers have also studied performance guarantees of other subclasses of the first-order local consistency relaxation. Kleinberg and Tardos (2002) and Chekuri et al. (2005) considered the metric... |

7 | Exploiting social network structure for person-to-person sentiment analysis - West, Paskov, et al. - 2014 |

6 | Computing marginal distributions over continuous Markov networks for statistical relational learning. - Broecheler, Getoor - 2010 |

6 | Network-based drug-target interaction prediction with probabilistic soft logic
- Fakhraei, Huang, et al.
Citation Context ...n demonstrated in other publications on many problem domains, including automatic knowledge base construction (Pujara et al., 2013), high-level computer vision (London et al., 2013b), drug discovery (=-=Fakhraei et al., 2014-=-), natural language semantics (Beltagy et al., 2014; Sridhar et al., 2015), automobile-traffic modeling (Chen et al., 2014), and user attribute (Li et al., 2014) and trust (Huang et al., 2013; West et ... |

6 | Tighter linear program relaxations for high order graphical models. - Mezuman, Tarlow, et al. - 2013 |

3 | Approximating Weighted MaxSAT Problems by Compensating for Relaxations, CP - Choi, Standley, et al. - 2009 |

3 | Inferring User Preferences by Probabilistic Logical Reasoning over Social Networks. Retrieved April 6, 2015, from http://arxiv.org/abs/1411.2679
- Li, Ritter, et al.
- 2014
Citation Context ...on et al., 2013b), drug discovery (Fakhraei et al., 2014), natural language semantics (Beltagy et al., 2014; Sridhar et al., 2015), automobile-traffic modeling (Chen et al., 2014), and user attribute (=-=Li et al., 2014-=-) and trust (Huang et al., 2013; West et al., 2014) prediction in social networks. The ability to easily incorporate latent variables into HL-MRFs and PSL has enabled innovative applications, includin... |

3 | Collective activity detection using hinge-loss Markov random fields. - London, Khamis, et al. - 2013 |

2 | Unifying local consistency and MAX SAT relaxations for scalable inference with rounding guarantees
- Bach, Huang, et al.
- 2015
Citation Context ...that the hinge-loss potentials of HL-MRFs can also be motivated by two different approximate inference methods for discrete models: randomized algorithms for MAX SAT and local consistency relaxation (=-=Bach et al., 2015-=-). Now, in this paper, we present the full motivation for and derivation of HL-MRFs, including unifying three approaches to scalable inference. We also present the PSL language, algorithms for inferen... |

2 | Road traffic congestion monitoring in social media with hinge-loss Markov random fields
- Chen, Chen, et al.
- 2014
Citation Context ... 2013), high-level computer vision (London et al., 2013b), drug discovery (Fakhraei et al., 2014), natural language semantics (Beltagy et al., 2014; Sridhar et al., 2015), automobile-traffic modeling (=-=Chen et al., 2014-=-), and user attribute (Li et al., 2014) and trust (Huang et al., 2013; West et al., 2014) prediction in social networks. The ability to easily incorporate latent variables into HL-MRFs and PSL has ena... |

2 | Latent topic networks: A versatile probabilistic programming framework for topic models.
- Foulds, Kumar, et al.
- 2015
Citation Context ...est et al., 2014) prediction in social networks. The ability to easily incorporate latent variables into HL-MRFs and PSL has enabled innovative applications, including modeling latent topics in text (=-=Foulds et al., 2015-=-), and improving student outcomes in massive open online courses (MOOCs) by modeling latent information about students and their communications (Ramesh et al., 2014, 2015). Researchers have also studi... |

2 | Paired-dual learning for fast training of latent variable hinge-loss MRFs - Bach, Huang, et al. |

1 | FoxPSL: An extended and scalable PSL implementation
- Magliacane, Stutz, et al.
- 2015
Citation Context ...udents and their communications (Ramesh et al., 2014, 2015). Researchers have also studied how to make HL-MRFs and PSL even more scalable by developing distributed implementations (Miao et al., 2013; =-=Magliacane et al., 2015-=-). That they are already being widely applied indicates HL-MRFs and PSL directly address an open need of the machine learning community. The paper is organized as follows. In Section 2, we first consi... |

1 | A hypergraph-partitioned vertex programming approach for large-scale consensus optimization
- Miao, Liu, et al.
- 2013
Citation Context ...nformation about students and their communications (Ramesh et al., 2014, 2015). Researchers have also studied how to make HL-MRFs and PSL even more scalable by developing distributed implementations (=-=Miao et al., 2013-=-; Magliacane et al., 2015). That they are already being widely applied indicates HL-MRFs and PSL directly address an open need of the machine learning community. The paper is organized as follows. In ... |

1 | Weakly supervised models of aspect-sentiment for online course discussion forums - Ramesh, Kumar, et al. - 2015 |

1 | Reinforcement and Imitation Learning via Interactive No-Regret Learning
- Ross, Bagnell
- 2014
Citation Context ...seful information. Examples of learn-to-search methods include incremental structured perceptron (Collins and Roark, 2004), SEARN (Daumé III et al., 2009), DAgger (Ross et al., 2011), and AggreVaTe (=-=Ross and Bagnell, 2014-=-). In this paper we focus on SP methods that perform joint prediction directly. Better understanding the differences and relative advantages of joint-prediction methods and learn-to-search methods is a... |

1 | Joint models of disagreement and stance in online debate
- Sridhar, Foulds, et al.
- 2015
Citation Context ...tomatic knowledge base construction (Pujara et al., 2013), high-level computer vision (London et al., 2013b), drug discovery (Fakhraei et al., 2014), natural language semantics (Beltagy et al., 2014; =-=Sridhar et al., 2015-=-), automobile-traffic modeling (Chen et al., 2014), and user attribute (Li et al., 2014) and trust (Huang et al., 2013; West et al., 2014) prediction in social networks. The ability to easily incorpora... |