From: Landslide susceptibility assessment of South Korea using stacking ensemble machine learning
Symbol | Model | Description | References |
---|---|---|---|
ADA | Adaptive boosting | Basic boosting algorithm that starts from a weak learner and sequentially reweights the samples it misclassifies | Freund and Schapire (1997) |
CatBoost | Categorical boosting | Combines boosting with 'target encoding' to handle categorical data, a weakness of traditional boosting algorithms | Prokhorenkova et al. (2018) |
DT | Decision tree | Splits variables into nodes based on classification criteria and repeats the splitting process recursively | – |
Dummy | Dummy classifier | Makes trivial predictions that ignore the input features; provides a baseline for model comparison rather than actual prediction | – |
ET | Extremely randomized tree | Follows the structure of RF but uses the entire dataset without bagging and generates node splits at random | Geurts et al. (2006) |
GBC | Gradient boosting | Basic boosting algorithm in which each new weak learner sequentially fits the residuals of the previous step | Friedman (2001) |
KNN | K nearest neighbors | Classifies a sample by majority vote among its k nearest neighbors | – |
LDA | Linear discriminant analysis | Assumes all classes share the same covariance matrix, giving linear decision boundaries, and applies Bayes' rule to assign each sample to the class with the highest posterior probability | – |
lightGBM | Light gradient boosting | While other GBC algorithms grow trees level-wise to limit depth, lightGBM grows only the most promising branches leaf-wise, minimizing loss while shortening training time | Ke et al. (2017) |
NB | Naive Bayes | Applies Bayes' rule under the simplifying assumption that all features are mutually independent | Lewis (1998) |
QDA | Quadratic discriminant analysis | Unlike LDA, assumes each class has its own covariance matrix, so the decision boundary takes a quadratic form | – |
RF | Random forest | Samples data via the bootstrap process, then aggregates the predictions of multiple decision trees; also allows the importance of each feature to be measured | Breiman (2001) |
Ridge | Ridge classifier | Based on linear regression but adds 'L2 regularization' to avoid overfitting | – |
SVM | Support vector machine | Separates classes with a hyperplane; regression is also possible on the same basis. Both the basic 'linear model' and the 'RBF kernel model for multidimensional data' are used | Cortes and Vapnik (1995) |
XGBoost | Extreme gradient boosting | Improves on traditional GBC through regularization, pruning, and missing-value handling | Chen and Guestrin (2016) |
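As a minimal sketch of how the base learners above can be combined in a stacking ensemble, the following uses scikit-learn's `StackingClassifier` with a handful of the listed models. The synthetic data, the subset of base learners, and the logistic-regression meta-learner are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical stacking-ensemble sketch; data and model choices are
# illustrative, not the study's actual configuration.
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,           # ADA
    ExtraTreesClassifier,         # ET
    GradientBoostingClassifier,   # GBC
    RandomForestClassifier,       # RF
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier  # KNN

# Synthetic binary data standing in for landslide / non-landslide samples
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base_learners = [
    ("ada", AdaBoostClassifier(random_state=0)),
    ("et", ExtraTreesClassifier(random_state=0)),
    ("gbc", GradientBoostingClassifier(random_state=0)),
    ("rf", RandomForestClassifier(random_state=0)),
    ("knn", KNeighborsClassifier()),
]

# Each base learner is fitted on the training data; a logistic-regression
# meta-learner is then trained on their cross-validated predictions.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_tr, y_tr)
print(f"Stacking test accuracy: {stack.score(X_te, y_te):.3f}")
```

In the same way, any of the other tabulated models (e.g. CatBoost, lightGBM, XGBoost via their own packages) can be dropped into the `base_learners` list, provided they follow the scikit-learn estimator interface.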