Linus Nilsson

An introduction to data mining methods CART and MARS with applications in industrial monitoring

ABSTRACT: We introduce CART (Classification And Regression Trees) and MARS (Multivariate Adaptive Regression Splines), together with a methodology for model-based fault detection in industrial plants using these methods. Factors complicating the analysis of data from industrial control systems include non-linear, dynamic relationships, high dimensionality, large numbers of observations, limited a priori knowledge, and strict interpretability requirements. CART and MARS can naturally handle many of these problems. MARS models can be seen as generalized CART models, and this generalization gives them continuity properties that are more in line with the underlying physics. CART, on the other hand, is the more scalable method, especially when high-order interactions are present. The two methods also differ in how easily they can make use of additional a priori knowledge. These differences are investigated by applying both methods to real data from a thermal plant. Remarkably precise prediction of many of the controlled variables in this complex industrial plant turns out to be possible using only limited physical knowledge. The results suggest that MARS is the better of the two techniques for this purpose.
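
The continuity difference between the two model classes can be sketched in a few lines of Python. The sketch assumes the usual formulation in which a CART fit is built from step (indicator) functions of the form 1[x > t], while a MARS fit is built from hinge functions max(0, x - t). The function names and the knot value are illustrative assumptions, not taken from the text.

```python
def step_basis(x, knot):
    """CART-style indicator: 1 above the split point, 0 otherwise."""
    return 1.0 if x > knot else 0.0


def hinge_basis(x, knot):
    """MARS-style hinge: max(0, x - knot), which is continuous at the knot."""
    return max(0.0, x - knot)


def jump_at_knot(basis, knot, eps=1e-9):
    """Absolute change in the basis function across a tiny interval around the knot."""
    return abs(basis(knot + eps, knot) - basis(knot - eps, knot))


KNOT = 2.0  # hypothetical split point / knot location

step_jump = jump_at_knot(step_basis, KNOT)    # stays at 1 however small eps is
hinge_jump = jump_at_knot(hinge_basis, KNOT)  # shrinks towards 0 with eps

print(f"step basis jump at knot:  {step_jump:.3g}")
print(f"hinge basis jump at knot: {hinge_jump:.3g}")
```

The step function changes by a full unit across the split however narrow the interval, whereas the hinge changes only in proportion to the interval width. A sum of hinge terms is therefore continuous, which is the property that suits smoothly varying physical quantities.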