Figaro 4.0 has just been released, available from http://www.cra.com/figaro. The headline new feature is called “Structured Factored Inference”, or SFI. So I’d like to take this opportunity to explain what SFI is about and why we’re pursuing it.
First of all, as the name implies, it’s a framework for doing inference, so you first need to understand what inference is. To help with this, I’ll pull up a figure from my book that describes the gist of probabilistic reasoning:
You have a probabilistic model that expresses your general knowledge about the situation you’re interested in, like general knowledge about corner kicks in soccer. You have evidence about a particular situation, such as information about the skill of the goalie for a particular corner kick. Your queries express what you would like to know, such as a prediction of whether a goal will be scored. Inference is the process of using the model to answer the queries in light of the evidence, and it is accomplished by an inference algorithm.
As implementers of probabilistic programming systems, one of the things we think about the most is how to do inference efficiently. We’ve encoded a variety of inference algorithms, each of which is good in a variety of situations. For example, there are sampling algorithms that work by imagining lots of different possible states of the world and there are factored algorithms that work with algebraic operations using a data structure called factors. In my book, I go into detail into how these algorithms work, when you should choose a particular algorithm, and how to make your algorithm work well.
In general, there are various algorithms to choose from and it can be hard for you to know exactly what to use, unless you are a probabilistic programming expert. So we have set ourselves the goal to make that choice automatically in an intelligent way. But we actually want to go much further than that. We’ve realized that in some complex models, different parts of the model should be solved by different algorithms. We’ve developed a general mathematical formulation of inference in probabilistic programs that enables you to break up models and mix and match algorithms for different parts of models. We call this framework SFI.
There’s an analogy here to relational databases. In a relational database, you typically specify the data and a query and the relational database system figures out how to answer this query using a query optimizer. The query optimizer operates over an algebraic representation of the query-answering task to look for the best way to answer the query. Similarly, our SFI algorithm operates over an algebraic representation of the probabilistic inference task to find the best way to answer the query.
The essence of SFI is to:
- Identify decomposition points in a probabilistic model and break off parts of the model into subproblems by the decomposition points.
- Represent each subproblem using a set of factors specific to the subproblem.
- Solve each subproblem separately using an algorithm that operates on factors. The result of this operation is a new factor that represents the solution of the subproblem.
- Combine the solution factors for each of the subproblems together to solve the larger problem.
While this scheme is conceptually simple, there are lots of questions to answer. What are the decomposition points? How do you break a problem into subproblems by the decomposition points? How do you represent each subproblem using factors? How do you select an algorithm to solve a subproblem? What algorithms can you use? How do you take the solution of a subproblem and use it in the higher level problem?
The Figaro development team has come up with some answers to these questions, but this is an ongoing research project. Although this is a factored algorithm framework, that doesn’t preclude us from using sampling algorithms as well, as long as we can make them operate on the factor data structure and return a new factor as the solution. In fact, we’ve implemented a sampling algorithm called Gibbs sampling to operate on factors and it is one of the choices available for solving a subproblem.
We’ve obtained encouraging results using this approach on a number of problems. For example, we’ve run SFI on a medical diagnosis problem, where we compared ordinary variable elimination (VE) and belief propagation (BP) to a hybrid SFI algorithm where we choose between VE and BP for each subproblem. VE is an exact algorithm, meaning that if it runs to completion, it gets you exactly the answer you’re looking for. However, on this problem, VE takes exponential time, so it’s not a practical solution. BP is often used on problems like this because it runs reasonably quickly, but it can be quite inaccurate. Our SFI algorithm, is actually faster than BP and two orders of magnitude more accurate!
The really hard part of all of this is figuring out which algorithm to use. It can be hard to predict how an algorithm is going to perform on a problem. Also, performance might depend on the specific machine configuration of a user. We’re looking at using machine learning techniques to learn how to predict the performance of an algorithm based on problem characteristics and then adapt those predictions to a target user’s machine. This technique has been used in other domains, such as satisfiability, where SATzilla has spawned a line of research on automated algorithm selection. With SFI, we now get to make these selections for probabilistic programming on a fine-grained level. Our work so far is just a beginning and we hope to continue towards our goal of fully automated probabilistic programming inference.