Glossary
P
Parent node and child node (in decision trees)
These terms are relative in nature. Any node situated below another node is typically referred to as a child node or sub-node, while the node preceding these child nodes is commonly known as the parent node.
Parents and children (in Bayesian networks)
In a Bayesian network, the parents of a node are the nodes that directly influence it. The children of a node are the nodes that are directly influenced by it.
Partitioning clustering
Partitioning clustering divides the data into non-overlapping partitions or clusters. It directly assigns each data point to a specific cluster.
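As a minimal sketch of direct assignment, the snippet below gives each point the label of its nearest cluster center. The one-dimensional data, the two centers, and the nearest-center rule (the rule used by k-means style methods) are assumptions for illustration only; the definition above does not name a specific algorithm.

```python
# Hypothetical 1-D data and cluster centers, chosen only for illustration.
points = [1.0, 1.5, 2.0, 8.0, 9.0, 9.5]
centers = [2.0, 9.0]

def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: abs(p - centers[k]))
        for p in points
    ]

labels = assign(points, centers)

# Each point receives exactly one label, so the clusters cannot overlap.
clusters = {
    k: [p for p, lab in zip(points, labels) if lab == k]
    for k in range(len(centers))
}

print(labels)    # [0, 0, 0, 1, 1, 1]
print(clusters)
```

Because every point carries a single label, the cluster sizes always add up to the number of points.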
Pie chart
A pie chart is a circular chart that represents the proportions of different categories in a whole, with each category represented by a slice of the pie.
Population
The population is the entire target group whose data patterns we want to examine and about which we want to draw conclusions.
Posterior probability
The posterior probability represents the updated probability of a variable given observed evidence or data. It is calculated by combining the prior probability with the likelihood of the observed data using Bayes’ theorem.
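For a binary hypothesis H and evidence E, Bayes' theorem reads P(H | E) = P(E | H) P(H) / P(E). The sketch below applies it with made-up numbers; the function name and all three input probabilities are hypothetical.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(H | E) via Bayes' theorem for a binary hypothesis H.

    prior                -- P(H)
    likelihood           -- P(E | H)
    likelihood_given_not -- P(E | not H)
    """
    # Total probability of the evidence, P(E).
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical inputs: P(H) = 0.01, P(E | H) = 0.9, P(E | not H) = 0.05
p = posterior(prior=0.01, likelihood=0.9, likelihood_given_not=0.05)
print(round(p, 4))
```

With these inputs the posterior is much larger than the prior of 0.01, yet still well below 0.5, because the evidence also occurs fairly often when H is false.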
Prediction accuracy
Prediction accuracy is a measure of how accurately a model predicts the target variable, counting both actual positives and actual negatives that are predicted correctly.
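For a binary target, one common way to compute accuracy is (true positives + true negatives) / total cases. A minimal sketch, with hypothetical labels:

```python
def accuracy(actual, predicted):
    """Share of cases where the prediction matches the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical labels: 1 = positive, 0 = negative.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(actual, predicted)
print(acc)  # 0.75 (3 true positives + 3 true negatives out of 8 cases)
```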
Prediction errors
Prediction errors measure the extent to which the trained model misfits the data. The lower the error, the more accurate the model.
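The entry above does not name a specific error metric. As an illustration only, the sketch below computes two widely used ones for a numeric target, the mean absolute error (MAE) and the root mean squared error (RMSE), on hypothetical values.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average size of the misses."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical actual and predicted values.
actual_values    = [3.0, 5.0, 7.0, 9.0]
predicted_values = [2.0, 5.0, 8.0, 11.0]

mae_value = mae(actual_values, predicted_values)
rmse_value = rmse(actual_values, predicted_values)
print(mae_value)   # 1.0
print(rmse_value)
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few predictions miss badly.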
Predictive modeling
Predictive modeling is the process of using machine learning algorithms to develop a model that predicts future or unseen events or outcomes based on data.
Predictive power
Predictive power is the ability of a model to accurately capture and represent the underlying patterns, relationships, or trends present in the data and generalize the results to other datasets.
Predictive research
Predictive research looks into the future. In other words, organizations analyze a large amount of data to predict new potential safety problems. By mining big data, they can develop a predictive model that anticipates an incident or accident before it happens.
Principal components
Principal components are the eigenvectors that correspond to the highest eigenvalues of the covariance matrix. They are ordered in terms of their importance, where the first principal component explains the maximum variance in the dataset, followed by the second principal component, and so on.
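A minimal sketch of this definition, restricted to two variables so that the eigen decomposition of the covariance matrix has a closed form. The data are hypothetical, and the helper assumes the two variables are correlated (nonzero covariance); in practice a numerical library would handle the general case.

```python
import math

# Hypothetical two-variable dataset.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 6.0]

def cov(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def eigen_2x2_symmetric(a, b, c):
    """Eigenpairs of [[a, b], [b, c]], largest eigenvalue first."""
    if b == 0.0:
        raise ValueError("zero covariance: the axes are already the components")
    mid = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    pairs = []
    for lam in (mid + radius, mid - radius):
        vec = (b, lam - a)                 # unnormalized eigenvector for lam
        norm = math.hypot(vec[0], vec[1])
        pairs.append((lam, (vec[0] / norm, vec[1] / norm)))
    return pairs

var_x, cov_xy, var_y = cov(x, x), cov(x, y), cov(y, y)
(lam1, pc1), (lam2, pc2) = eigen_2x2_symmetric(var_x, cov_xy, var_y)

# Share of the total variance explained by the first principal component.
explained = lam1 / (lam1 + lam2)
print(pc1, pc2, round(explained, 3))
```

The eigenvalues sum to the total variance of the two variables, which is why their ratio can be read as a share of variance explained.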
Prior probability
The prior probability represents the initial belief or knowledge about a variable before any evidence or data is observed. It is typically specified as part of the Bayesian network modeling process.
Proactive research
Proactive research studies the present. In other words, organizations examine contributing factors to an incident/accident from various aspects of hazardous conditions and organizational processes and see how they are related to the incident or accident.
Probabilistic inference
Bayesian networks allow for probabilistic inference, which means computing the probability of specific events or variables given evidence (observed data or values) from other variables in the network. Inference is based on Bayes’ theorem.
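As a hedged illustration, the sketch below performs inference by enumeration in a tiny three-node network in which Rain and Sprinkler are both parents of WetGrass. The network structure and every probability in it are hypothetical.

```python
from itertools import product

# Hypothetical prior probabilities for the two parent nodes.
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}

# Hypothetical conditional table: P(wet = True | rain, sprinkler).
p_wet = {
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.80,
    (False, False): 0.00,
}

def joint(rain, sprinkler, wet):
    """Joint probability from the network factorization."""
    p_w = p_wet[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_w if wet else 1.0 - p_w)

def prob_rain_given_wet():
    """P(rain = True | wet = True), summing out the sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(
        joint(r, s, True) for r, s in product((True, False), repeat=2)
    )
    return numerator / evidence

p_rain_wet = prob_rain_given_wet()
print(round(p_rain_wet, 4))
```

Observing wet grass raises the probability of rain well above its prior of 0.2. Enumeration like this grows exponentially with the number of variables, so it only suits very small networks.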
Probabilistic relationship
A probabilistic relationship refers to the existence of a connection or association between two or more variables, where the relationship is characterized by uncertainty and is described using probabilities.
Probability sampling
Probability sampling is a method of selecting individuals at random in such a way that each individual in the sampling frame has a known, nonzero probability of being chosen. In simple random sampling, that probability is equal for every individual.
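A minimal sketch of the equal-probability case (simple random sampling without replacement), using Python's standard `random` module. The sampling frame, the sample size, and the seed are hypothetical.

```python
import random

# Hypothetical sampling frame of 100 individuals.
sampling_frame = [f"person_{i:03d}" for i in range(1, 101)]

# A seeded generator makes the draw reproducible.
rng = random.Random(42)

# Every individual has the same chance, k / N = 10 / 100, of being selected.
sample = rng.sample(sampling_frame, k=10)
print(sample)
```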
Promax
Promax is an oblique rotation method that extends the advantages of Varimax rotation while also accounting for possible correlations between the rotated components. It is considered a compromise between the simplicity of orthogonal rotation and the flexibility of oblique rotation.
Pruning
Removing the sub-nodes of a parent node is called pruning. It is a technique used to simplify the decision tree by removing nodes that do not significantly improve its performance on the validation set. A tree is grown through splitting and shrunk through pruning.
Pseudo R-squared
Pseudo R-squared is a metric used in logistic regression to assess the model fit. Unlike the traditional R-squared used in linear regression, which measures the proportion of variance explained by the model, pseudo R-squared measures the proportion of the deviance explained by the model.
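Several pseudo R-squared variants exist. As one example, McFadden's version compares the log-likelihood of the fitted model with that of an intercept-only (null) model: 1 - LL_model / LL_null. The sketch below assumes this variant; the outcomes and fitted probabilities are hypothetical, not the output of an actual fit.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y under probabilities p."""
    return sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

def mcfadden_r2(y, p_model):
    """McFadden's pseudo R-squared: 1 - LL_model / LL_null."""
    base_rate = sum(y) / len(y)            # the null model predicts the mean
    ll_null = log_likelihood(y, [base_rate] * len(y))
    ll_model = log_likelihood(y, p_model)
    return 1.0 - ll_model / ll_null

# Hypothetical observed outcomes and fitted probabilities.
y = [1, 0, 1, 1, 0, 0]
p_model = [0.8, 0.3, 0.7, 0.9, 0.2, 0.4]

r2 = mcfadden_r2(y, p_model)
print(round(r2, 3))
```

Since the deviance is -2 times the log-likelihood, this ratio is the same as one minus the model deviance divided by the null deviance.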

