Probabilistic Ontology: A Simple Example
Bayesian Networks
Bayesian
networks have been successfully applied to create consistent
probabilistic
representations of uncertain knowledge in a wide range of applications.
Bayesian
networks (BNs) provide a means of
parsimoniously expressing joint probability distributions over many
interrelated hypotheses. A Bayesian network consists of a directed
acyclic
graph (DAG) and a set of local distributions. Each node in the graph
represents
a random variable. A random variable denotes an attribute, feature, or
set of
hypotheses about which we may be uncertain. Each random variable has a
set of
mutually exclusive and collectively exhaustive possible values. That
is,
exactly one of the possible values is or will be the actual value, and
we are
uncertain about which one it is. The graph represents direct
qualitative
dependence relationships; the local distributions represent
quantitative
information about the strength of those dependencies. The graph and the
local
distributions together represent a joint probability distribution over
the
random variables denoted by the nodes of the graph.
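In symbols, the joint distribution is the product of the local distributions, one factor per node. The notation below is introduced here for illustration and is not part of the original text: X_1, ..., X_n stand for the random variables and pa(X_i) for the parents of X_i in the DAG.

```latex
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)
```

The parsimony comes from each factor conditioning only on a node's parents rather than on every other variable in the network.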
Figure
1 shows an example of a BN representing part of a highly simplified
ontology
for wines and pizzas. In this toy example, inspired by the wine
and pizza
ontologies,
we assume that domain knowledge about gastronomy was gathered from
sources such
as statistical data collected from restaurants and expert judgment of
sommeliers
and pizzaiolos. The resulting knowledge base expresses a probability
distribution relating features of the pizzas
ordered by customers (i.e. type of base and topping) and
characteristics of the
wines ordered to accompany the pizzas. Figure 1a shows prior
probability distributions based on the background information in the
knowledge base. Figure 1b represents
a situation in which a customer requests a pizza with cheese topping
and
a thin and crispy base. Using the probability distribution stored in
the BN of Figure 1, the waiter can apply Bayes' rule to infer the best
type of wine to offer the customer given the customer's pizza
preferences and the body of statistical and expert information
previously linking features of pizza to wines. A Bayesian network
provides a parsimonious way to express the joint distribution
and a computationally efficient way to implement Bayes' rule.
The result of Bayesian inference is shown in Figure 1b, where evidence
of the customer's
order points to Beaujolais as the
most likely wine the customer would order, followed by Cabernet
Sauvignon, and
so on. We can see that knowledge of the customer's pizza choice has
increased the probability of Beaujolais and decreased the probability of
Chardonnay and Bordeaux.
Figure 1: Bayesian Network for Pizza and Wine (a: Prior Probabilities; b: Posterior Probabilities Given Base and Topping)
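The update from prior to posterior can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the network structure (base -> body -> wine <- color <- topping), the value sets, and every probability are hypothetical and are not taken from Figure 1. It also uses brute-force enumeration of the joint distribution, not the efficient algorithms a BN engine would use.

```python
from itertools import product

# Hypothetical toy network (NOT the structure or numbers of Figure 1):
#   base -> body -> wine <- color <- topping
VALUES = {
    "base": ["thin", "thick"],
    "topping": ["cheese", "meat"],
    "body": ["light", "full"],
    "color": ["red", "white"],
    "wine": ["Beaujolais", "CabernetSauvignon", "Chardonnay"],
}
PARENTS = {
    "base": (),
    "topping": (),
    "body": ("base",),
    "color": ("topping",),
    "wine": ("body", "color"),
}
# Local distributions: CPT[node][parent values] -> {value: probability}.
# All numbers are made up for illustration.
CPT = {
    "base": {(): {"thin": 0.5, "thick": 0.5}},
    "topping": {(): {"cheese": 0.5, "meat": 0.5}},
    "body": {
        ("thin",): {"light": 0.8, "full": 0.2},
        ("thick",): {"light": 0.3, "full": 0.7},
    },
    "color": {
        ("cheese",): {"red": 0.8, "white": 0.2},
        ("meat",): {"red": 0.6, "white": 0.4},
    },
    "wine": {
        ("light", "red"): {"Beaujolais": 0.70, "CabernetSauvignon": 0.20, "Chardonnay": 0.10},
        ("full", "red"): {"Beaujolais": 0.20, "CabernetSauvignon": 0.70, "Chardonnay": 0.10},
        ("light", "white"): {"Beaujolais": 0.10, "CabernetSauvignon": 0.10, "Chardonnay": 0.80},
        ("full", "white"): {"Beaujolais": 0.05, "CabernetSauvignon": 0.15, "Chardonnay": 0.80},
    },
}


def joint(assignment):
    """Probability of one full assignment: product of the local distributions."""
    p = 1.0
    for node, parents in PARENTS.items():
        parent_values = tuple(assignment[q] for q in parents)
        p *= CPT[node][parent_values][assignment[node]]
    return p


def posterior(query, evidence=None):
    """P(query | evidence), by summing the joint over consistent assignments."""
    evidence = evidence or {}
    names = list(VALUES)
    scores = {value: 0.0 for value in VALUES[query]}
    for combo in product(*(VALUES[n] for n in names)):
        assignment = dict(zip(names, combo))
        if all(assignment[k] == v for k, v in evidence.items()):
            scores[assignment[query]] += joint(assignment)
    total = sum(scores.values())  # probability of the evidence
    return {value: s / total for value, s in scores.items()}


prior = posterior("wine")
post = posterior("wine", {"base": "thin", "topping": "cheese"})
for wine in VALUES["wine"]:
    print(wine, round(prior[wine], 3), round(post[wine], 3))
```

With these made-up numbers, Beaujolais rises from about 0.36 a priori to about 0.50 once the order is known, while Chardonnay falls from 0.31 to 0.24.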
Although this is just a toy example, it demonstrates how
incomplete information about a domain can be used to improve our
decisions. In an ontology without uncertainty, there would not be
enough information for a logical reasoner to infer a good choice of
wine to offer the customer, and the decision would have to be made
without optimal use of all the information available.
As
Bayesian
networks have grown in popularity, their shortcomings in expressiveness
for many real-world applications have become increasingly apparent.
More specifically, Bayesian networks assume a simple attribute-value
representation: each problem instance involves reasoning about the same
fixed number of attributes, with only the evidence values changing from
problem instance to problem instance. In the pizza and wine example, the
PizzaTopping random variable conveys general information about the class
of pizza toppings (i.e., the types of topping for a given pizza and how
they relate to preferences over wine flavor and color), but the BN in
Figure 1 is valid only for pizzas with one topping. To deal with more
elaborate pizzas it is necessary to build a specific BN for each
configuration, each one with a distinct probability distribution. For
example, Figure 2 depicts a BN for a 3-topping pizza
with a specific customer preference displayed. Also, the information
conveyed by the BNs (i.e., those for 1-topping, 2-topping, etc. pizzas)
relates to the class of pizza toppings, and not to specific instances of
that class. Therefore, the BN in Figure 2 cannot be used for a situation in
which the customer asks for two 3-topping pizzas. This type of
representation is inadequate for many problems of practical importance.
Similarly, these BNs cannot be used to reason about a situation in
which a customer orders several bottles of wine that may be of
different varieties. Many domains require reasoning about varying
numbers of related entities of different types, where the numbers,
types and relationships among entities usually cannot be specified in
advance and may have uncertainty in their own definitions. For these
types of problem, a more expressive probabilistic language is needed.
Multi-Entity Bayesian Networks
In recent years, languages have appeared that extend the expressiveness
of probabilistic graphical models in various ways. This trend
reflects the need for probabilistic tools with more
representational power to meet the demands of real world problems.
Here, we consider one such language, called Multi-Entity
Bayesian Networks,
or MEBN. MEBN represents probabilistic knowledge as a collection of
Bayesian network fragments called MFrags. As an illustration of the
expressiveness of a first-order probabilistic logic, Figure 3a presents
a graphical depiction of MFrags for the wine and pizza toy
example. It conveys both the structural relationships
(implied by
the arcs) among the nodes and the numerical probabilities (represented
by the local distributions, and not depicted in the figure). The
MFrags contain three kinds of random variables.
- Context
random variables, shown in
yellow, represent conditions that must be satisfied for the probability
distributions encoded by the MFrags to apply. For example, the Pizza
Base MFrag relates the flavor and body of a wine w to the base of a
pizza p.
The context random variables specify that w must be a wine, p must be a pizza,
and w must
be served with p.
- Resident
random
variables represent random variables whose distributions are defined by
the MFrag. The parents of a resident random variable are the random
variables with arcs pointing into it. The MFrag assigns a
probability distribution to the resident random variable for each
combination of values for its parents.
- Input random
variables,
shown in gray, represent random variables which influence the random
variables in the MFrag (i.e., have arcs emanating from them into
resident random variables) but whose distributions are defined in some
other MFrag.
Filling in the placeholders w,
p, and t
with specific entities of type Wine, Pizza and Topping results in a
situation-specific Bayesian network, or SSBN. Figure 3b shows
an SSBN for a three-topping pizza. Notice that this is the same
Bayesian network as Figure 2, except for a renaming of the nodes.
This example shows that an expressive language like MEBN
allows
us to represent a wide variety of specific situations involving
different numbers of wines, pizzas, and toppings.
Figure 3: MEBN Representation of Wine and Pizza Example (a: MFrags for Wine and Pizza Example; b: SSBN for 3-Topping Pizza)
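The step from MFrags to an SSBN can be sketched as template instantiation. The Python below is a rough illustration, not MEBN's actual construction algorithm: the template format, the node names (PizzaBase, WineBody, ToppingType, WineColor), the has_topping relation, and the helper names are all hypothetical. It builds only the arcs of the network and leaves the local distributions out.

```python
from itertools import product


def served_with(binding, facts):
    """Context condition: wine w is served with pizza p."""
    return (binding["w"], binding["p"]) in facts["served_with"]


def topping_on_served_pizza(binding, facts):
    """Context condition: w is served with p, and t is a topping of p."""
    return (served_with(binding, facts)
            and (binding["p"], binding["t"]) in facts["has_topping"])


# Hypothetical MFrag-like templates: typed placeholders, a context test,
# and one parent -> child arc written with the placeholders.
TEMPLATES = [
    {"vars": {"w": "Wine", "p": "Pizza"},
     "context": served_with,
     "parent": "PizzaBase({p})", "child": "WineBody({w})"},
    {"vars": {"w": "Wine", "p": "Pizza", "t": "Topping"},
     "context": topping_on_served_pizza,
     "parent": "ToppingType({t})", "child": "WineColor({w})"},
]


def build_ssbn_arcs(entities, facts):
    """Fill the placeholders with entities and keep bindings whose context holds."""
    arcs = []
    for template in TEMPLATES:
        names = list(template["vars"])
        pools = [entities[template["vars"][n]] for n in names]
        for combo in product(*pools):
            binding = dict(zip(names, combo))
            if template["context"](binding, facts):
                arcs.append((template["parent"].format(**binding),
                             template["child"].format(**binding)))
    return arcs


# One wine served with one three-topping pizza.
entities = {"Wine": ["w1"], "Pizza": ["p1"], "Topping": ["t1", "t2", "t3"]}
facts = {
    "served_with": {("w1", "p1")},
    "has_topping": {("p1", "t1"), ("p1", "t2"), ("p1", "t3")},
}
arcs = build_ssbn_arcs(entities, facts)
for parent, child in arcs:
    print(parent, "->", child)
```

The same two templates yield a different network when the entities and facts change (for instance, two pizzas each served with its own wine), which is the point of separating the templates from the situation.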
Of course, this example is oversimplified, and does not represent many
of the relationships we would want to consider in a more realistic
problem. However, it suffices to illustrate how ontologies can be
combined with Bayesian networks to represent and reason with uncertainty.
MEBN Semantics
There are three kinds of random variables in MEBN: logical random
variables, non-logical random variables, and finding random variables.
Logical random variables correspond to predicates in
first-order
logic, and non-logical random variables correspond to functions in
first-order logic. The logical random variables have possible
values in the set {T, F, ⊥}, and the non-logical random variables take
on values in the set Ω∪{⊥}, where T is the value assigned to a logical
statement that is true; F is the value assigned to a logical statement
that is false; Ω is a countable set of distinct entity identifiers;
and ⊥ is a special value denoting a meaningless statement. There are special
logical random variables corresponding to the usual logical connectives
and quantified statements. Finding random variables are used to
represent evidence about particular situations, such as the toppings a
specific customer has ordered.
A MEBN theory consists of a set of MFrags satisfying a set of
consistency constraints ensuring the existence of a joint probability
distribution on the possible values of its random variables. A MEBN
theory represents a probability distribution on interpretations of an
associated first-order logic theory in the set Ω of
entity
identifiers. We can construct a probability distribution on
interpretations of the theory in any domain Δ by mapping the
entity identifiers to elements of Δ.
PR-OWL
To represent an MFrag, we need to specify:
- Context
random variables. The conditions under which the
distributions in the MFrag apply.
- Resident
random variables. The names, arguments (ordinary
variables), and possible values of the resident random variables of the
MFrag.
- Input random
variables. The random variables that influence the
resident random variables, but whose distributions are defined in other
MFrags.
- Local
distributions. A
function mapping configurations of values of a resident random
variable's parents to probability distributions on the random
variable's possible values.
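The four ingredients listed above can be pictured as a small record type. The Python sketch below is purely illustrative: it is not the PR-OWL vocabulary (PR-OWL stores this information using OWL classes and properties), and the class name MFragSpec, its field names, the context strings, and the numbers in the example local distribution are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

Distribution = Dict[str, float]
# A local distribution maps the parents' values to a distribution
# over the resident random variable's possible values.
LocalDistribution = Callable[[Sequence[str]], Distribution]


@dataclass
class MFragSpec:
    """Hypothetical container for the four parts of an MFrag."""
    name: str
    context: List[str]               # conditions under which the MFrag applies
    residents: Dict[str, List[str]]  # resident RV (with arguments) -> possible values
    inputs: List[str]                # RVs whose distributions are defined elsewhere
    local_distributions: Dict[str, LocalDistribution] = field(default_factory=dict)


def wine_color_given_toppings(topping_types: Sequence[str]) -> Distribution:
    """Made-up rule: the larger the share of cheese toppings, the likelier red."""
    if not topping_types:
        return {"red": 0.7, "white": 0.3}
    cheese_share = sum(t == "cheese" for t in topping_types) / len(topping_types)
    p_red = 0.6 + 0.2 * cheese_share
    return {"red": p_red, "white": 1.0 - p_red}


topping_mfrag = MFragSpec(
    name="PizzaTopping",
    context=["isA(w, Wine)", "isA(p, Pizza)", "isA(t, Topping)",
             "servedWith(w, p)", "hasTopping(p, t)"],
    residents={"WineColor(w)": ["red", "white"]},
    inputs=["ToppingType(t)"],
    local_distributions={"WineColor(w)": wine_color_given_toppings},
)

rule = topping_mfrag.local_distributions["WineColor(w)"]
print(rule(["cheese"]))
print(rule(["cheese", "meat", "meat"]))
```

Writing the local distribution as a function of a variable-length sequence is one way to let a single specification cover pizzas with any number of toppings; other encodings are possible.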
PR-OWL is an
upper ontology for
building probabilistic ontologies based on MEBN logic. The MFrags
depicted in Figure 3 form a consistent set that can be used to reason
probabilistically about the wine-and-pizza domain. These MFrags can be
stored in an OWL file using the classes and properties defined in the
PR-OWL upper ontology. The MFrags can be used to instantiate
situation-specific Bayesian networks to answer queries about the domain of
application being modeled. In other words, a PR-OWL probabilistic
ontology consists of both deterministic and probabilistic information
about the domain of discussion (e.g. wines and pizzas). This
information is stored in an OWL file and can be used for answering
specific queries for any configuration of the instances given the
evidence at hand.
_______________________________________
©2007 Paulo C. G. Costa and Kathryn B. Laskey