Glossary


There are currently 25 names in this directory beginning with the letter P.

Parent node and child node (in decision trees)

These terms are relative in nature. Any node situated below another node is typically referred to as a child node or sub-node, while the node preceding these child nodes is commonly known as the parent node.

Parents and children (in Bayesian networks)

In a Bayesian network, the parents of a node are the nodes that directly influence it. The children of a node are the nodes that are directly influenced by it.
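
As a minimal sketch of this definition, the parent and child sets can be read off a list of directed edges. The network below (Rain, Sprinkler, WetGrass, Traffic) and the helper names are hypothetical and serve only as an illustration.

```python
# Hypothetical network: each edge points from a parent to its child.
edges = [("Rain", "WetGrass"), ("Sprinkler", "WetGrass"), ("Rain", "Traffic")]

def parents(node, edges):
    """Nodes with an edge pointing directly into `node`."""
    return sorted(src for src, dst in edges if dst == node)

def children(node, edges):
    """Nodes that `node` points to directly."""
    return sorted(dst for src, dst in edges if src == node)

wet_grass_parents = parents("WetGrass", edges)
rain_children = children("Rain", edges)
```

Here WetGrass has two parents (Rain and Sprinkler), while Rain has two children (Traffic and WetGrass).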

Partitioning clustering

Partitioning clustering divides the data into non-overlapping partitions or clusters. It directly assigns each data point to a specific cluster.
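
The assignment step can be sketched as follows, assuming squared Euclidean distance and cluster centers that are already known (as in one iteration of k-means). The points, centers, and function name are made up for illustration.

```python
def assign_to_clusters(points, centers):
    """Give each point the index of its nearest center (one cluster per point)."""
    labels = []
    for p in points:
        # Squared Euclidean distance from the point to every center.
        distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical data: two obvious groups and two assumed centers.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers = [(1.0, 1.5), (8.5, 9.0)]
labels = assign_to_clusters(points, centers)
```

Every point receives exactly one label, so the resulting clusters do not overlap.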

Pie chart

A pie chart is a circular chart that represents the proportions of different categories in a whole, with each category represented by a slice of the pie.

Population

The population is the entire group of individuals or units whose data patterns we want to examine and about which we want to draw conclusions.

Posterior probability

The posterior probability represents the updated probability of a variable given observed evidence or data. It is calculated by combining the prior probability with the likelihood of the observed data using Bayes’ theorem.
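
A small worked example of Bayes' theorem for a binary hypothesis may help. The numbers (a 1% prior, a 90% chance of the evidence when the hypothesis is true, a 5% chance when it is false) and the function name are assumptions chosen for illustration.

```python
def posterior(prior, likelihood, likelihood_if_false):
    """P(H | E) for a binary hypothesis H via Bayes' theorem."""
    # Total probability of the evidence, summed over H true and H false.
    evidence = likelihood * prior + likelihood_if_false * (1 - prior)
    return likelihood * prior / evidence

# Assumed values: P(H) = 0.01, P(E | H) = 0.9, P(E | not H) = 0.05.
p = posterior(prior=0.01, likelihood=0.9, likelihood_if_false=0.05)
```

With these inputs the posterior is roughly 0.15: the evidence raises the probability well above the 1% prior, but it remains far from certain because the prior was so low.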

Prediction

A general term for the act of predicting.

Prediction accuracy

Prediction accuracy is a measure of how accurately a model predicts the target variable: the proportion of all cases, both actual positives and actual negatives, that the model classifies correctly.
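
For a classification model, accuracy can be computed as the share of predictions that match the actual labels. The labels below are invented, and the function name is a placeholder rather than part of any library.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label equals the actual label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical actual and predicted labels (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc = accuracy(y_true, y_pred)  # 6 of 8 correct, so 0.75
```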

Prediction errors

Prediction errors measure the extent to which the trained model misfits the data. The lower the error, the more accurate the model.
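
For a continuous target, two common ways to summarize these errors are the mean absolute error (MAE) and the root mean squared error (RMSE). The sketch below uses invented values; the choice of these two metrics is an assumption, since the entry does not name a specific one.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average size of the errors, ignoring their sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error; penalizes large misses more."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)

# Hypothetical actual and predicted values.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 3.0, 8.0]
mae = mean_absolute_error(actual, predicted)       # 0.625
rmse = root_mean_squared_error(actual, predicted)  # 0.75
```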

Prediction model

See Predictive model

Predictive model

A model that predicts either a continuous or a categorical target variable.

Predictive modeling

A process of using machine learning algorithms to develop a model that predicts future or unseen events or outcomes based on data.

Predictive power

Predictive power is the ability of a model to accurately capture and represent the underlying patterns, relationships, or trends present in the data and generalize the results to other datasets.

Predictive research

Predictive research looks into the future. In other words, organizations analyze large amounts of data to anticipate potential new safety problems. By mining big data, they can develop a predictive model that predicts an incident or accident before it happens.

Predictors

Inputs used to predict the output.

Principal components

Principal components are the eigenvectors that correspond to the highest eigenvalues of the covariance matrix. They are ordered in terms of their importance, where the first principal component explains the maximum variance in the dataset, followed by the second principal component, and so on.
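
The computation can be sketched with NumPy: center the data, form the covariance matrix, and sort its eigenvectors by eigenvalue. The small two-variable dataset and the function name are illustrative assumptions.

```python
import numpy as np

def principal_components(X):
    """Eigenvalues and eigenvectors of the covariance matrix, largest first."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

# Hypothetical dataset: 10 observations of 2 correlated variables.
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])
eigenvalues, components = principal_components(X)

# Share of total variance explained by each component, in order.
explained_ratio = eigenvalues / eigenvalues.sum()
```

Each column of `components` is one principal component, and `explained_ratio[0]` is the share of variance captured by the first one.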

Prior probability

The prior probability represents the initial belief or knowledge about a variable before any evidence or data is observed. It is typically specified as part of the Bayesian network modeling process.

Proactive research

Proactive research studies the present. In other words, organizations examine contributing factors to an incident/accident from various aspects of hazardous conditions and organizational processes and see how they are related to the incident or accident.

Probabilistic inference

Bayesian networks allow for probabilistic inference, which means computing the probability of specific events or variables given evidence (observed data or values) from other variables in the network. Inference is based on Bayes’ theorem.
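
One simple way to carry out such inference on a tiny network is enumeration: sum the joint probability over all unobserved variables, then normalize by the probability of the evidence. The three-node network (Rain and Sprinkler both influencing WetGrass) and all probabilities below are hypothetical.

```python
from itertools import product

# Assumed probabilities for a toy network: Rain -> WetGrass <- Sprinkler.
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {  # P(WetGrass = True | Rain, Sprinkler)
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.05,
}

def joint(rain, sprinkler, wet):
    """Joint probability of one full assignment, factored along the network."""
    p_r = P_RAIN if rain else 1 - P_RAIN
    p_s = P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
    p_w = P_WET[(rain, sprinkler)] if wet else 1 - P_WET[(rain, sprinkler)]
    return p_r * p_s * p_w

def prob_rain_given_wet():
    """P(Rain = True | WetGrass = True) by enumerating the hidden variable."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return numerator / evidence

p_rain_given_wet = prob_rain_given_wet()
```

Observing wet grass raises the probability of rain from the assumed prior of 0.2 to about 0.65 under these numbers. Enumeration is only practical for small networks; larger ones typically need more efficient inference algorithms.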

Probabilistic relationship

A probabilistic relationship refers to the existence of a connection or association between two or more variables, where the relationship is characterized by uncertainty and is described using probabilities.

Probability sampling

Probability sampling is the method of selecting individuals randomly in such a way that each individual in the sampling frame has a known, non-zero probability of being chosen. In simple random sampling, that probability is equal for every individual.
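
Simple random sampling, in which every individual has the same chance of selection, is one form of probability sampling. The sketch below draws such a sample without replacement using Python's standard library; the sampling frame and the function name are made up for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals, each with the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame of 20 individuals.
sampling_frame = [f"person_{i}" for i in range(1, 21)]
sample = simple_random_sample(sampling_frame, n=5, seed=42)
```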

Promax

Promax is an oblique rotation method that extends the advantages of Varimax rotation while also accounting for possible correlations between the rotated components. It is considered a compromise between the simplicity of orthogonal rotation and the flexibility of oblique rotation.

Pruning

Removing the sub-nodes of a parent node is called pruning. It is a technique used to simplify the decision tree by removing nodes that do not significantly improve its performance on the validation set. A tree is grown through splitting and shrunk through pruning.
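
One way to implement this idea is reduced-error pruning: working bottom-up, replace a parent node with a leaf whenever doing so does not lower accuracy on the validation set. The sketch below is a simplified illustration, not a standard library routine. The tree layout (nested dictionaries with a stored `majority` label at each split), the example tree, and the validation data are all assumptions.

```python
def predict(node, x):
    """Follow the splits until a leaf (a node with a 'label') is reached."""
    while "label" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

def tree_accuracy(tree, data):
    return sum(predict(tree, x) == y for x, y in data) / len(data)

def prune(node, tree, validation):
    """Collapse `node` into a leaf if validation accuracy does not drop."""
    if "label" in node:
        return
    prune(node["left"], tree, validation)
    prune(node["right"], tree, validation)
    before = tree_accuracy(tree, validation)
    saved = dict(node)
    node.clear()
    node["label"] = saved["majority"]  # leaf predicts the node's majority class
    if tree_accuracy(tree, validation) < before:
        node.clear()
        node.update(saved)  # pruning hurt, so restore the split

# Hypothetical tree: the right-hand split is assumed to be overfitted.
tree = {
    "feature": 0, "threshold": 5.0, "majority": 0,
    "left": {"label": 0},
    "right": {"feature": 1, "threshold": 2.0, "majority": 1,
              "left": {"label": 1}, "right": {"label": 0}},
}
validation = [((1.0, 1.0), 0), ((2.0, 3.0), 0), ((7.0, 1.0), 1), ((8.0, 3.0), 1)]
prune(tree, tree, validation)
```

After pruning, the right-hand sub-tree has been collapsed into a single leaf, while the root split is kept because removing it would lower validation accuracy.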

Pseudo R-squared

Pseudo R-squared is a metric used in logistic regression to assess the model fit. Unlike the traditional R-squared used in linear regression, which measures the proportion of variance explained by the model, pseudo R-squared measures the proportion of the deviance explained by the model.
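
Several pseudo R-squared variants exist. The sketch below uses McFadden's version, which compares the log-likelihood of the fitted model with that of a null model that always predicts the overall base rate. The labels, predicted probabilities, and function names are assumptions for illustration.

```python
import math

def log_likelihood(y_true, probs):
    """Log-likelihood of binary outcomes given predicted P(y = 1)."""
    return sum(math.log(p) if y == 1 else math.log(1 - p)
               for y, p in zip(y_true, probs))

def mcfadden_r2(y_true, probs):
    """McFadden's pseudo R-squared: 1 - LL(model) / LL(null)."""
    base_rate = sum(y_true) / len(y_true)
    ll_null = log_likelihood(y_true, [base_rate] * len(y_true))
    ll_model = log_likelihood(y_true, probs)
    return 1 - ll_model / ll_null

# Hypothetical outcomes and predicted probabilities from a logistic model.
y_true = [1, 0, 1, 1, 0, 0]
probs = [0.9, 0.2, 0.8, 0.7, 0.3, 0.1]
r2 = mcfadden_r2(y_true, probs)
```

A model that does no better than the base rate scores 0, and values closer to 1 indicate a better fit. Because deviance for binary outcomes is -2 times the log-likelihood, this ratio is the same as the proportion of null deviance explained.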

Purposive sampling

Purposive sampling is a non-probabilistic sampling method where researchers deliberately select specific individuals or units based on pre-defined criteria to best represent certain characteristics or traits of interest.