BayesiaLab is a powerful desktop application (Windows/Mac/Unix) with a sophisticated graphical user interface, which provides scientists a comprehensive “laboratory” environment for machine learning, knowledge modeling, diagnosis, analysis, simulation, and optimization. With BayesiaLab, Bayesian networks have become practical for gaining deep insights into problem domains. BayesiaLab leverages the inherently graphical structure of Bayesian networks for exploring and explaining complex problems.
BayesiaLab is the result of nearly twenty years of research and software development by Dr. Lionel Jouffe and Dr. Paul Munteanu. In 2001, their research efforts led to the formation of Bayesia S.A.S., headquartered in Laval in northwestern France. Today, the company is the world’s leading supplier of Bayesian network software, serving hundreds of major corporations and research organizations around the world.
BayesiaLab’s Methods, Features, and Functions
BayesiaLab is designed around a prototypical workflow with a Bayesian network model at the center. BayesiaLab supports the research process from model generation to analysis, simulation, and optimization. The entire process is fully contained in a uniform “lab” environment, which provides scientists with flexibility in moving back and forth between different elements of the research task.
Knowledge Modeling
Subject matter experts often express their causal understanding of a domain in the form of diagrams, in which arrows indicate causal directions. This visual representation of causes and effects has a direct analog in the network graph in BayesiaLab. Nodes (representing variables) can be added and positioned on BayesiaLab’s Graph Panel with a mouse click, and arcs (representing relationships) can be “drawn” between nodes. The causal direction can be encoded by orienting the arcs from cause to effect.
The quantitative nature of relationships between variables, plus many other attributes, can be managed in BayesiaLab’s Node Editor. In this way, BayesiaLab facilitates the straightforward encoding of one’s understanding of a domain. Simultaneously, BayesiaLab enforces internal consistency, so that impossible conditions cannot be encoded accidentally.
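To make the idea of consistency checking concrete, here is a minimal sketch in Python. It assumes that "internal consistency" covers at least two things: arcs must not form a directed cycle, and every row of a conditional probability table must sum to one. The class `TinyNetwork` and its methods are hypothetical names chosen for illustration; they are not part of BayesiaLab.

```python
from math import isclose


class TinyNetwork:
    """Hypothetical toy structure for illustration; not BayesiaLab's data model."""

    def __init__(self):
        self.parents = {}  # node name -> list of parent (cause) names
        self.cpt = {}      # node name -> {parent-state tuple: [probabilities]}

    def add_node(self, name):
        self.parents.setdefault(name, [])

    def _reaches(self, start, goal):
        # Depth-first search that follows arcs from cause to effect.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(c for c, ps in self.parents.items() if node in ps)
        return False

    def add_arc(self, cause, effect):
        self.add_node(cause)
        self.add_node(effect)
        # If the effect already leads back to the cause, the new arc closes a cycle.
        if self._reaches(effect, cause):
            raise ValueError(f"arc {cause} -> {effect} would create a cycle")
        self.parents[effect].append(cause)

    def set_cpt(self, node, table):
        # Each row must be a valid probability distribution.
        for row in table.values():
            if any(p < 0 for p in row) or not isclose(sum(row), 1.0):
                raise ValueError(f"invalid CPT row for {node}: {row}")
        self.cpt[node] = table
```

With this sketch, adding the arcs Smoking -> Cancer and Cancer -> Cough succeeds, while a subsequent attempt to add Cough -> Smoking raises a `ValueError` because it would close a cycle.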
In addition to having individuals directly encode their explicit knowledge in BayesiaLab, the Bayesia Expert Knowledge Elicitation Environment (BEKEE) is available for acquiring the probabilities of a network from a group of experts. BEKEE offers a web-based interface for systematically eliciting explicit and tacit knowledge from multiple stakeholders.
Discrete, Nonlinear, and Nonparametric Modeling
BayesiaLab stores all “parameters” describing probabilistic relationships between variables in conditional probability tables (CPTs), which means that no functional forms are utilized. Given this nonparametric, discrete approach, BayesiaLab can conveniently handle nonlinear relationships between variables. However, this CPT-based representation requires a preparation step for dealing with continuous variables, namely discretization. This consists of defining, manually or automatically, a discrete representation of all continuous values. BayesiaLab offers several tools for discretization, which are accessible in the Data Import Wizard, in the Node Editor, and in a standalone Discretization function. In this context, univariate, bivariate, and multivariate discretization algorithms are available.
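As an illustration of what univariate discretization involves, the following Python sketch implements two generic textbook schemes, equal-width and equal-frequency binning. These are shown as assumptions for the sake of example; the text above does not say which algorithms BayesiaLab uses, and the function names are hypothetical.

```python
from bisect import bisect_right


def equal_width_thresholds(values, bins):
    """Split the observed range into `bins` intervals of equal width."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    return [lo + step * i for i in range(1, bins)]


def equal_frequency_thresholds(values, bins):
    """Place thresholds so each interval holds roughly the same number of records."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // bins] for i in range(1, bins)]


def discretize(values, thresholds):
    """Map each value to a state index: the count of thresholds <= value."""
    return [bisect_right(thresholds, v) for v in values]


# Hypothetical data with one outlier.
incomes = [1, 2, 3, 4, 5, 6, 7, 8, 100]
by_width = discretize(incomes, equal_width_thresholds(incomes, 3))
by_frequency = discretize(incomes, equal_frequency_thresholds(incomes, 3))
```

On this sample, equal-width binning puts every value except the outlier into the first state, whereas equal-frequency binning yields three states of three records each. This is one reason the choice of discretization method matters before learning a network.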
Machine Learning with BayesiaLab
BayesiaLab features a comprehensive array of highly optimized learning algorithms that can quickly uncover structures in datasets. The optimization criteria in BayesiaLab’s learning algorithms are based on information theory (e.g. the Minimum Description Length). With that, no assumptions regarding the variable distributions are made. These algorithms can be used for all kinds and all sizes of problem domains, sometimes including thousands of variables with millions of potentially relevant relationships.
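To give a feel for a Minimum Description Length criterion, the sketch below scores a candidate structure as the sum of two parts: the bits needed to describe the parameters and the bits needed to describe the data given the model. It assumes a common BIC-style encoding (half of log2 N bits per free parameter) and maximum-likelihood estimates; BayesiaLab's exact scoring is not specified here, and all function names are hypothetical.

```python
from collections import Counter
from math import log2


def n_states(data, variable):
    return len({row[variable] for row in data})


def data_bits(data, child, parents):
    """Negative log2-likelihood of `child` given `parents` (ML estimates)."""
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    context = Counter(tuple(r[p] for p in parents) for r in data)
    return -sum(n * log2(n / context[ctx]) for (ctx, _), n in joint.items())


def param_bits(data, child, parents):
    """Assumed BIC-style cost: 0.5 * log2(N) bits per free parameter."""
    free = n_states(data, child) - 1
    for p in parents:
        free *= n_states(data, p)
    return 0.5 * log2(len(data)) * free


def mdl_score(data, structure):
    """structure maps each node to a tuple of its parents; lower is better."""
    return sum(param_bits(data, c, ps) + data_bits(data, c, ps)
               for c, ps in structure.items())


# Hypothetical dataset in which Y usually copies X.
records = ([{"X": 0, "Y": 0}] * 40 + [{"X": 1, "Y": 1}] * 40
           + [{"X": 0, "Y": 1}] * 10 + [{"X": 1, "Y": 0}] * 10)
with_arc = mdl_score(records, {"X": (), "Y": ("X",)})
without_arc = mdl_score(records, {"X": (), "Y": ()})
```

Here the structure with the arc X -> Y obtains the lower score, because the savings in describing the data outweigh the cost of the extra parameter. On data where X and Y are independent, the extra parameter would not pay for itself and the simpler structure would win.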
Unsupervised Structural Learning
In statistics, “unsupervised learning” is typically understood to be a classification or clustering task. To make a very clear distinction, we place emphasis on “structural” in “Unsupervised Structural Learning,” which covers a number of important algorithms in BayesiaLab.
Unsupervised Structural Learning means that BayesiaLab can discover probabilistic relationships between a large number of variables, without having to specify input or output nodes. One might say that this is a quintessential form of knowledge discovery, as no assumptions are required to perform these algorithms on unknown datasets.
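A simple way to see how relationships can be found without naming a target is to rank all variable pairs by their mutual information. The Python sketch below does only that; it is a generic information-theoretic illustration with hypothetical function names, not a reproduction of any BayesiaLab learning algorithm.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(rows, a, b):
    """Mutual information (in bits) between columns a and b."""
    n = len(rows)
    count_a = Counter(r[a] for r in rows)
    count_b = Counter(r[b] for r in rows)
    count_ab = Counter((r[a], r[b]) for r in rows)
    return sum((c / n) * log2(c * n / (count_a[x] * count_b[y]))
               for (x, y), c in count_ab.items())


def rank_pairs(rows):
    """Return all variable pairs, strongest relationship first."""
    variables = sorted(rows[0])
    scores = {(a, b): mutual_information(rows, a, b)
              for a, b in combinations(variables, 2)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical records: B always equals A, while C is unrelated to both.
rows = [{"A": a, "B": a, "C": c} for a in (0, 1) for c in (0, 1)] * 5
ranking = rank_pairs(rows)
```

In this toy dataset the pair (A, B) comes out on top with one bit of shared information, and both pairs involving C score zero. No variable had to be declared as input or output to obtain this ranking.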
Supervised Learning
Supervised Learning in BayesiaLab has the same objective as many traditional modeling methods, i.e. to develop a model for predicting a target variable. Note that numerous statistical packages also offer “Bayesian Networks” as a predictive modeling technique. However, in most cases, these packages are restricted in their capabilities to one type of network, i.e. the Naive Bayes network. BayesiaLab offers a much greater number of Supervised Learning algorithms to search for the Bayesian network that best predicts the target variable while also taking into account the complexity of the resulting network.
We should highlight the Markov Blanket algorithm for its speed, which is particularly helpful when dealing with a large number of variables. In this context, the Markov Blanket algorithm can serve as an efficient variable selection algorithm. The closely related Augmented Markov Blanket algorithm is also available for Supervised Learning.
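For readers unfamiliar with the term, the Markov blanket of a node consists of its parents, its children, and the other parents of its children. Given the blanket, the node is independent of all remaining variables, which is why it works as a variable selection device. The sketch below reads the blanket off a graph that is already known; learning it from data, as BayesiaLab does, is a different and harder task. The function name and example graph are hypothetical.

```python
def markov_blanket(parents, target):
    """parents maps each node to the set of its parents in a directed acyclic graph."""
    children = {node for node, ps in parents.items() if target in ps}
    spouses = {p for child in children for p in parents[child]} - {target}
    return set(parents.get(target, ())) | children | spouses


# Hypothetical graph: A -> T, T -> C, B -> C, C -> D, and an isolated node E.
graph = {
    "A": set(),
    "B": set(),
    "T": {"A"},
    "C": {"T", "B"},
    "D": {"C"},
    "E": set(),
}
blanket = markov_blanket(graph, "T")
```

For the target T, the blanket contains A (parent), C (child), and B (the other parent of C). Nodes D and E are excluded, so a predictive model for T would not need them once the blanket variables are observed.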
Clustering
Clustering in BayesiaLab covers both Data Clustering and Variable Clustering. The former applies to the grouping of records (or observations) in a dataset; the latter performs a grouping of variables according to the strength of their mutual relationships.
A third variation of this concept is of particular importance in BayesiaLab: Multiple Clustering can be characterized as a kind of nonlinear, nonparametric, and nonorthogonal factor analysis. Multiple Clustering often serves as the basis for developing Probabilistic Structural Equation Models with BayesiaLab (see Chapter 8 in our book, Bayesian Networks & BayesiaLab).
Inference: Diagnosis, Prediction, and Simulation
The inherent ability of Bayesian networks to explicitly model uncertainty makes them suitable for a broad range of real-world applications. In the Bayesian network framework, diagnosis, prediction, and simulation are identical computations. They all consist of observational inference conditional upon evidence:
Inference from effect to cause: diagnosis or abduction.
Inference from cause to effect: simulation or prediction.
This distinction, however, only exists from the perspective of the researcher, who would presumably see the symptom of a disease as the effect and the disease itself as the cause. Hence, carrying out inference based on observed symptoms is interpreted as “diagnosis.”
Observational Inference
One of the central benefits of Bayesian networks is that they compute inference “omni-directionally.” Given an observation with any type of evidence on any of the network’s nodes (or a subset of nodes), BayesiaLab can compute the posterior probabilities of all other nodes in the network, regardless of arc direction. Both exact and approximate observational inference algorithms are implemented in BayesiaLab.
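The claim that diagnosis and prediction are the same computation can be checked on a two-node network, Disease -> Symptom. The Python sketch below uses brute-force enumeration of the joint distribution, which is only feasible for tiny networks and is not how a production tool would do it. All probabilities are made-up numbers and the function names are hypothetical.

```python
from itertools import product

# Hypothetical parameters for the network Disease -> Symptom.
P_DISEASE = {True: 0.01, False: 0.99}
P_SYMPTOM_GIVEN_DISEASE = {
    True: {True: 0.90, False: 0.10},
    False: {True: 0.05, False: 0.95},
}
INDEX = {"disease": 0, "symptom": 1}


def joint():
    """Joint probability of every (disease, symptom) combination."""
    return {(d, s): P_DISEASE[d] * P_SYMPTOM_GIVEN_DISEASE[d][s]
            for d, s in product((True, False), repeat=2)}


def posterior(query, evidence):
    """P(query is True | evidence), whichever way the arc points."""
    consistent = {state: p for state, p in joint().items()
                  if all(state[INDEX[name]] == value
                         for name, value in evidence.items())}
    total = sum(consistent.values())
    return sum(p for state, p in consistent.items()
               if state[INDEX[query]]) / total


prediction = posterior("symptom", {"disease": True})  # cause -> effect
diagnosis = posterior("disease", {"symptom": True})   # effect -> cause
```

Both queries call the same function; only the roles of query and evidence are swapped. With these numbers, observing the symptom raises the probability of the disease from 0.01 to roughly 0.15.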
Model Utilization
BayesiaLab provides a range of functions for systematically utilizing the knowledge contained in a Bayesian network. They make a network accessible as an expert system that can be queried interactively by an end user or through an automated process.
The Adaptive Questionnaire function provides guidance in terms of the optimum sequence for seeking evidence. BayesiaLab determines dynamically, given the evidence already gathered, the next best piece of evidence to obtain, in order to maximize the information gain with respect to the target variable, while minimizing the cost of acquiring such evidence. In a medical context, for instance, this would allow for the optimal “escalation” of diagnostic procedures, from “low-cost/small-gain” evidence (e.g. measuring the patient’s blood pressure) to “high-cost/large-gain” evidence (e.g. performing an MRI scan). The Adaptive Questionnaire is discussed in the context of tumor classification in Chapter 6 of our book, Bayesian Networks & BayesiaLab.
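The selection logic can be sketched as follows. This Python example assumes that the trade-off between information gain and cost is expressed as a simple ratio (expected entropy reduction per unit of cost); the text above does not specify how BayesiaLab combines the two. The test names, likelihoods, and costs are invented for illustration.

```python
from math import log2


def entropy(dist):
    return -sum(p * log2(p) for p in dist if p > 0)


def expected_gain(prior, likelihood):
    """Expected entropy reduction of the target after seeing one test result.

    prior[t] is P(target = t); likelihood[t][r] is P(result = r | target = t).
    """
    gain = entropy(prior)
    for r in range(len(likelihood[0])):
        p_r = sum(prior[t] * likelihood[t][r] for t in range(len(prior)))
        if p_r == 0:
            continue
        post = [prior[t] * likelihood[t][r] / p_r for t in range(len(prior))]
        gain -= p_r * entropy(post)
    return gain


def next_question(prior, tests):
    """tests maps a name to (likelihood, cost); assumed criterion: gain / cost."""
    return max(tests,
               key=lambda name: expected_gain(prior, tests[name][0]) / tests[name][1])


# Hypothetical target: tumor present / absent, with two candidate tests.
prior = [0.3, 0.7]
tests = {
    "blood_pressure": ([[0.7, 0.3], [0.4, 0.6]], 1.0),
    "mri": ([[0.95, 0.05], [0.05, 0.95]], 50.0),
}
first = next_question(prior, tests)
```

Under these assumptions the cheap blood-pressure measurement is requested first, even though the MRI is far more informative, because its gain per unit of cost is higher. After each answer, the prior would be replaced by the updated posterior and the ranking recomputed.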
The WebSimulator is a platform for publishing interactive models and Adaptive Questionnaires via the web, which means that any Bayesian network model built with BayesiaLab can be shared privately with clients or publicly with a broader audience. Once a model is published via the WebSimulator, end users can try out scenarios and examine the dynamics of that model.
Batch Inference is available for automatically performing inference on a large number of records in a dataset. For example, Batch Inference can be used to produce a predictive score for all customers in a database. With the same objective, BayesiaLab’s optional Export function can translate predictive network models into static code that can run in external programs. Modules are available that can generate code for R, SAS, PHP, VBA, and JavaScript.
Developers can also access many of BayesiaLab’s functions—outside the graphical user interface—by using the Bayesia Engine APIs. The Bayesia Modeling Engine allows constructing and editing networks. The Bayesia Inference Engine can access network models programmatically for performing automated inference, e.g. as part of a real-time application with streaming data. The Bayesia Engine APIs are implemented as pure Java class libraries (jar files), which can be integrated into any software project.
Knowledge Communication
While generating a Bayesian network, either by expert knowledge modeling or through machine learning, is all about a computer acquiring knowledge, a Bayesian network can also be a remarkably powerful tool for humans to extract or “harvest” knowledge. Given that a Bayesian network can serve as a high-dimensional representation of a real-world domain, BayesiaLab allows us to interactively—even playfully—engage with this domain to learn about it. Through visualization, simulation, and analysis functions, plus the graphical nature of the network model itself, BayesiaLab becomes an instructional device that can effectively retrieve and communicate the knowledge contained within the Bayesian network. As such, BayesiaLab becomes a bridge between artificial intelligence and human intelligence.
Licensing Options
Academic Edition License |
Single-User/Single-Machine License |
Continental Token License |
Available exclusively to students and faculty of accredited academic institutions. |
The most common license type for individual researchers. |
For workgroups, it's ideal to share one or more tokens. |
Functionally equivalent to the Single-User/Single-Machine license. |
Permanent local installation. |
Unlimited number of installations within the organization. |
Restricted to non-commercial use. |
Tied to one user on one machine. |
Unlimited number of registered users. |
Can be used entirely offline. |
Each user can have BayesiaLab installed on multiple machines. |
Also available as a perpetual license. |
User management via control panel. |
"Borrow Token" function allows offline use, e.g. while traveling. |