From: Landslide susceptibility assessment of South Korea using stacking ensemble machine learning
Symbol | Model | Description | References |
---|---|---|---|
ADA | Adaptive boosting | Basic boosting algorithm that starts from a weak learner and sequentially reweights the samples it misclassifies | Freund and Schapire (1997) |
CatBoost | Categorical boosting | Combines boosting with 'target encoding' to handle categorical data, a weakness of traditional boosting algorithms | Prokhorenkova et al. (2018) |
DT | Decision tree | Splits variables into nodes based on classification criteria and repeats the splitting process recursively | – |
Dummy | Dummy classifier | Makes trivial predictions that ignore the input features; provides a baseline for model comparison rather than actual prediction | – |
ET | Extremely randomized tree | Follows the structure of RF but uses the entire dataset without bagging and generates node splits at random | Geurts et al. (2006) |
GBC | Gradient boosting | Basic boosting algorithm in which each new weak learner sequentially fits the residuals of the previous step | Friedman (2001) |
KNN | K nearest neighbors | Classifies a sample by majority vote among its k nearest neighbors | – |
LDA | Linear discriminant analysis | Assumes all classes share the same covariance matrix, giving linear decision boundaries, and applies Bayes' rule to assign each sample to the class with the highest posterior probability | – |
lightGBM | Light gradient boosting | While other GBC algorithms grow trees level-wise to limit depth, lightGBM grows only the most promising branches leaf-wise, minimizing loss while shortening training time | Ke et al. (2017) |
NB | Naive Bayes | Applies Bayes' rule under the simplifying assumption that all features are mutually independent | Lewis (1998) |
QDA | Quadratic discriminant analysis | Unlike LDA, assumes each class has its own covariance matrix, so the decision boundary takes a quadratic form | – |
RF | Random forest | Samples data via the bootstrap process, then aggregates the predictions of multiple decision trees; also allows the importance of each feature to be measured | Breiman (2001) |
Ridge | Ridge classifier | Based on linear regression but adds 'L2 regularization' to avoid overfitting | – |
SVM | Support vector machine | Separates classes with a hyperplane; regression is also possible on the same basis. Both the basic 'linear model' and the 'RBF kernel model for multidimensional data' are used | Cortes and Vapnik (1995) |
XGBoost | Extreme gradient boosting | Improves on traditional GBC through regularization, pruning, and missing-value handling | Chen and Guestrin (2016) |
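As a minimal sketch of how the base learners above can be combined in a stacking ensemble, the following uses scikit-learn's `StackingClassifier` with a handful of the listed models. The synthetic data, the subset of base learners, and the logistic-regression meta-learner are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical stacking-ensemble sketch; data and model choices are
# illustrative, not the study's actual configuration.
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,           # ADA
    ExtraTreesClassifier,         # ET
    GradientBoostingClassifier,   # GBC
    RandomForestClassifier,       # RF
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier  # KNN

# Synthetic binary data standing in for landslide / non-landslide samples
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base_learners = [
    ("ada", AdaBoostClassifier(random_state=0)),
    ("et", ExtraTreesClassifier(random_state=0)),
    ("gbc", GradientBoostingClassifier(random_state=0)),
    ("rf", RandomForestClassifier(random_state=0)),
    ("knn", KNeighborsClassifier()),
]

# Each base learner is fitted on the training data; a logistic-regression
# meta-learner is then trained on their cross-validated predictions.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_tr, y_tr)
print(f"Stacking test accuracy: {stack.score(X_te, y_te):.3f}")
```

In the same way, any of the other tabulated models (e.g. CatBoost, lightGBM, XGBoost via their own packages) can be dropped into the `base_learners` list, provided they follow the scikit-learn estimator interface.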