Data

Open Data Source

This menu item allows opening the file or the database selector and then starts the Data Import Wizard.

  • Text file: Once the file has been read and pre-processed, a fully unconnected network is created in a new graph window, with one node per attribute. The Bayesian network learning methods then become available.

  • Database: Once the database table has been loaded and pre-processed, a fully unconnected network is created in a new graph window, with one node per attribute. The Bayesian network learning methods then become available.

  • Recent databases: Keeps a list of the recently opened databases. The Data Import Wizard opens directly on the selected file. The size of this list can be modified through the Menus settings.

Associate Data Source

This menu item allows opening the Data association wizard in order to associate data from a text file or a database with an existing Bayesian network.

  • Recent databases: Keeps a list of the recently opened databases. The Data association wizard opens directly on the selected file. The size of this list can be modified through the Menus settings.

When the network structure is modified during the association (addition of nodes or states), the conditional probability tables are automatically recomputed from the database. If the structure remains unmodified, the conditional probability tables are left unchanged.

Associate Dictionary

This menu item allows defining the properties of the active Bayesian network by means of text files. These properties concern arcs, nodes and states:

  • Arc:

    • Arcs: allows associating a set of arcs with the network. The indicated arcs can be added to or removed from the network. Arc removals are always performed before arc additions. Before an arc is added, all the constraints belonging to the Bayesian network, as well as the arc constraints and the temporal indices, are checked. If a constraint is not satisfied, the arc is not added.

    • Forbidden Arcs: allows associating a set of forbidden arcs with the network.

    • Arc Comments: allows associating a set of arc comments with the network.

    • Arc Colors: allows associating with the network a set of colors on the arcs.

    • Fixed Arcs: allows defining if some arcs are fixed or not.

  • Node:

    • Node Renaming: allows renaming each node with a new name. These new names must be, of course, all different.

    • Comments: allows associating a comment with each node that is in the file.

    • Classes: allows organizing nodes into subsets called classes. A node can belong to several classes at the same time. Classes make it possible to apply a property to all the nodes of a class at once, and to define constraints on arc creation during learning.

    • Colors: allows associating colors with the nodes or classes that are in the file. Colors are written as Red Green Blue with 8 bits per channel in hexadecimal (web) format. For example, red is 255 red, 0 green, 0 blue, which gives FF0000; green gives 00FF00, yellow gives FFFF00, and so on.

    • Images: allows associating images with the nodes or classes that are in the file. Each image is given by its path relative to the directory containing the dictionary.

    • Costs: allows associating a cost with each node. A node without a cost is called not observable.

    • Temporal Indices: allows associating temporal indices with the nodes that are in the file. These temporal indices are used by BayesiaLab's learning algorithms to enforce constraints on the probabilistic relations, for example forbidding arcs from future nodes to past nodes. The rule used to add an arc from node N1 to node N2 is:

    • If the temporal index of N1 is positive or zero, then the arc from N1 to N2 is only possible if the temporal index of N2 is greater than or equal to the index of N1.

    • Local Structural Coefficients: allows setting the local structural coefficient of each specified node or each node of each specified class.

    • State Virtual Numbers: allows setting the state virtual number of each specified node or each node of each specified class.

    • Locations: allows setting the position of each node.

  • State:

    • State Renaming: allows renaming each state of each node with a new name.

    • State Values: allows associating a numerical value with each state of each node.

    • State Long Names: allows associating with each state of each node a long name that is more explicit than the default state name. This name can be used when exporting a database, in the HTML reports and in the monitors.

    • Filtered States: allows defining one state of each node as a filtered state.
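Two of the conventions in the list above, the hexadecimal color format and the temporal-index rule, can be sketched in a few lines of Python. This is only an illustration: the function names `rgb_to_hex` and `arc_allowed` are hypothetical and not part of BayesiaLab. The text states no restriction when the index of N1 is negative, so the sketch assumes the arc is allowed in that case.

```python
def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Encode an RGB triple (8 bits per channel) as six hexadecimal digits."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return f"{red:02X}{green:02X}{blue:02X}"


def arc_allowed(index_n1: int, index_n2: int) -> bool:
    """Apply the temporal-index rule to a candidate arc N1 -> N2.

    Assumption: when the index of N1 is negative the rule does not apply,
    so the arc is treated as allowed.
    """
    if index_n1 >= 0:
        return index_n2 >= index_n1
    return True


print(rgb_to_hex(255, 0, 0))    # FF0000 (red)
print(rgb_to_hex(255, 255, 0))  # FFFF00 (yellow)
print(arc_allowed(0, 1))        # True: past -> future
print(arc_allowed(2, 1))        # False: future -> past
```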

As indicated by the syntax, the name of a node, class or state in the text file cannot contain equal, space or tab characters. If the names in the network contain such characters, each of them must be preceded by a \ (backslash) character in the text file: for example, the node named Visit Asia is written Visit\ Asia in the file.

To distinguish a name that is shared by a class, a node and a state, you must add a suffix at the end of the name: "c" for a class, "n" for a node and "s" for a state.

If your network contains non-ASCII characters, you must save your dictionaries with UTF-8 (Unicode) encoding. For example, in MS Excel, choose "Save As" and select "Text Unicode (*.txt)" as the file type. In Notepad, choose "Save As" and select "UTF-8" as the encoding. If your file contains only ASCII characters, you can keep the default encoding (which depends on the platform), but it is strongly recommended to use UTF-8 (Unicode) encoding so that dictionary files do not depend on the user's platform. For example, a Chinese dictionary can then be read by a German user without any problem, whatever platforms are used. If you are not sure how to save a file with UTF-8 encoding, export a dictionary with BayesiaLab, modify and save it (with any text editor), and load it back into BayesiaLab.
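As an illustration of this escaping rule, the following Python sketch prefixes every equal, space and tab character with a backslash. The helper name `escape_name` is hypothetical, and how a literal backslash inside a name should be handled is not described here, so the sketch leaves backslashes untouched.

```python
def escape_name(name: str) -> str:
    """Prefix equal, space and tab characters with a backslash."""
    escaped = []
    for ch in name:
        if ch in ("=", " ", "\t"):
            escaped.append("\\")
        escaped.append(ch)
    return "".join(escaped)


print(escape_name("Visit Asia"))  # Visit\ Asia
```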
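When a dictionary is produced by a script rather than a text editor, the encoding can be set explicitly. The sketch below, in Python, writes and reads lines as UTF-8 regardless of the platform default. The helper names and the sample lines are placeholders, and the lines are treated as opaque text: no particular dictionary syntax is assumed.

```python
def save_text_utf8(path, lines):
    """Write lines as UTF-8, whatever the platform's default encoding is."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")


def load_text_utf8(path):
    """Read a UTF-8 text file back as a list of lines."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read().splitlines()


if __name__ == "__main__":
    # Placeholder content containing non-ASCII characters.
    save_text_utf8("dictionary_example.txt", ["Größe", "节点"])
    print(load_text_utf8("dictionary_example.txt"))
```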

Export Dictionary

This menu item allows exporting the different kinds of dictionaries in text files.

The dictionary files are saved with UTF-8 (Unicode) encoding in order to support any character of any language. An option in the Import and Associate preferences, Save Format, allows saving or omitting the BOM (Byte Order Mark) at the beginning of the file. The BOM improves compatibility with Microsoft applications. On other platforms such as Unix, Linux or Mac OS X, the BOM is not necessary and, in some cases, is treated as extra characters at the beginning of the file.
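The effect of the BOM can be seen when an exported file is processed by a script. The Python sketch below uses the standard `utf-8-sig` codec, which writes the BOM when encoding and removes it when decoding. The function names are hypothetical, and the sketch says nothing about how BayesiaLab itself reads or writes the file.

```python
import codecs


def encode_text(text: str, with_bom: bool) -> bytes:
    """Encode text as UTF-8, optionally preceded by the byte order mark."""
    return text.encode("utf-8-sig" if with_bom else "utf-8")


def decode_text(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading BOM if there is one."""
    return data.decode("utf-8-sig")


with_bom = encode_text("abc", with_bom=True)
print(with_bom.startswith(codecs.BOM_UTF8))  # True
# A plain UTF-8 decoder keeps the BOM as an extra leading character.
print(repr(with_bom.decode("utf-8")))        # '\ufeffabc'
print(repr(decode_text(with_bom)))           # 'abc'
```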

Associate an Evidence Scenario File

This menu item allows associating an evidence scenario file with the network.

Export an Evidence Scenario File

This menu item allows exporting into a text file an evidence scenario file associated with the network.

Save Data

This menu item allows saving the database associated with the network, including the results of the various pre-processing steps carried out in the Data Import Wizard (discretization, aggregation, filtering, etc.). If the imported database still contains missing values and the algorithm selected to process them is one of the two imputation algorithms (static or dynamic), this option lets you carry out all your imputation tasks by saving a database without any missing values. Each missing value is replaced by taking into account its conditional probability distribution, returned by the Bayesian network, given all the known values of the row. If the database contains both test data and learning data, the user can choose which data to save: only the learning data, only the test data, or all the data. It is also possible to save only the data corresponding to the selected nodes.

The states' long names can be saved instead of the states' names. The numerical values in the database associated with the continuous nodes can be saved if they exist. If there are no numerical values associated with the database and the option is checked, the numerical values are created by randomly generating a value in each interval concerned. If the database contains weights, they are saved as the first column of the output file.
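The replacement of a missing value from its conditional distribution can be pictured with the Python sketch below. It is a simplified illustration under stated assumptions: `posterior_for` is a hypothetical callable standing in for the Bayesian network's inference, missing values are represented by `None`, and the value is drawn at random from the distribution. Whether BayesiaLab draws a value or selects one in another way depends on the imputation mode and is not specified here.

```python
import random


def impute_row(row, posterior_for, rng=None):
    """Fill the missing values (None) of one row.

    posterior_for(node, known) is assumed to return a dict mapping each
    state of `node` to its probability given the known values of the row.
    """
    rng = rng or random.Random()
    known = {node: value for node, value in row.items() if value is not None}
    completed = dict(row)
    for node, value in row.items():
        if value is None:
            distribution = posterior_for(node, known)
            states = list(distribution)
            weights = [distribution[state] for state in states]
            # Draw one state according to the conditional probabilities.
            completed[node] = rng.choices(states, weights=weights, k=1)[0]
    return completed


def toy_posterior(node, known):
    # Stand-in for network inference: a fixed, made-up distribution.
    return {"yes": 0.8, "no": 0.2}


print(impute_row({"Smoker": "yes", "Cancer": None}, toy_posterior))
```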
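Generating a numerical value inside an interval can be sketched as follows in Python. The text does not state which distribution is used within the interval, so the sketch assumes a uniform draw; the function name `value_in_interval` is hypothetical.

```python
import random


def value_in_interval(lower: float, upper: float, rng=None) -> float:
    """Draw a value between the bounds of a discretization interval.

    Assumption: the draw is uniform over [lower, upper].
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    rng = rng or random.Random()
    return rng.uniform(lower, upper)


# A continuous node discretized into [0, 10] and [10, 25], for illustration.
print(value_in_interval(0.0, 10.0))
print(value_in_interval(10.0, 25.0))
```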

Imputation

Allows the imputation of the missing values of the associated database according to the mode selected in the following dialog box:

The data will be saved in the specified file, and the long names of the states will be used if specified. If the database contains both test data and learning data, the user can choose the data on which to perform imputation: only the learning data, only the test data, or all the data. The states' long names can be saved instead of the states' names. The numerical values in the database associated with the continuous nodes can be saved if they exist. If there are no numerical values associated with the database and the option is checked, the numerical values are created by randomly generating a value in each interval concerned. However, if there are numerical values in the database, the missing numerical values are generated from the distribution function of each interval. If the database contains weights, they are saved as the first column of the output file.

Graphs

Opens the graph editor if a database is associated with the current network.





Copyright © 2024 Bayesia S.A.S., Bayesia USA, LLC, and Bayesia Singapore Pte. Ltd. All Rights Reserved.