Scaling Up Unbiased Search-based Symbolic Regression

Paul Kahlmeyer, Joachim Giesen, Michael Habeck, Henrik Voigt

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 4264-4272. https://doi.org/10.24963/ijcai.2024/471

In a regression task, a function is learned from labeled data to predict the labels at new data points, with the goal of achieving small prediction errors. In symbolic regression, the goal is more ambitious, namely, to learn an interpretable function that makes small prediction errors. This additional goal largely rules out the standard approach used in regression, that is, reducing the learning problem to optimizing the parameters of an expansion in basis functions. Instead, symbolic regression methods search for a good solution in a space of symbolic expressions. To cope with the typically vast search space, most symbolic regression methods make implicit, or sometimes even explicit, assumptions about its structure. Here, we argue that the only obvious structure of the search space is that it contains small expressions, that is, expressions that can be decomposed into a few subexpressions. We show that systematically searching spaces of small expressions finds solutions that are more accurate and more robust against noise than those obtained by state-of-the-art symbolic regression methods. In particular, systematic search outperforms state-of-the-art symbolic regressors in its ability to recover the true underlying symbolic expressions on established benchmark data sets.
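To make the idea of systematically searching a space of small expressions concrete, here is a minimal sketch in Python. It is not the authors' implementation: the grammar, the node budget, and the helper names (enumerate_expressions, best_expression) are illustrative assumptions, and numeric constants are fixed rather than fitted.

```python
# Minimal sketch: exhaustive (unbiased) search over small expression trees.
# All names and design choices here are illustrative assumptions, not the
# authors' actual system.
import numpy as np

UNARY = {"sin": np.sin, "cos": np.cos, "exp": np.exp}
BINARY = {"+": np.add, "-": np.subtract, "*": np.multiply, "/": np.divide}

def enumerate_expressions(max_nodes):
    """Yield expression trees with at most max_nodes nodes.
    Some algebraically equivalent trees may be yielded more than once;
    that is harmless for a sketch."""
    if max_nodes >= 1:
        yield "x"   # the input variable
        yield "c"   # a constant; fixed to 1.0 here instead of being fitted
    if max_nodes >= 2:
        for sub in enumerate_expressions(max_nodes - 1):
            for op in UNARY:
                yield (op, sub)
    if max_nodes >= 3:
        for k in range(1, max_nodes - 1):
            for left in enumerate_expressions(k):
                for right in enumerate_expressions(max_nodes - 1 - k):
                    for op in BINARY:
                        yield (op, left, right)

def evaluate(expr, x):
    """Evaluate an expression tree on a numpy array x."""
    if expr == "x":
        return x
    if expr == "c":
        return np.ones_like(x)
    if len(expr) == 2:           # unary node: (op, child)
        return UNARY[expr[0]](evaluate(expr[1], x))
    op, left, right = expr       # binary node: (op, left, right)
    return BINARY[op](evaluate(left, x), evaluate(right, x))

def best_expression(x, y, max_nodes=5):
    """Search all expressions up to max_nodes, return the lowest-MSE one."""
    best, best_err = None, np.inf
    for expr in enumerate_expressions(max_nodes):
        with np.errstate(all="ignore"):  # tolerate division by zero etc.
            pred = evaluate(expr, x)
        err = np.mean((pred - y) ** 2)
        if np.isfinite(err) and err < best_err:
            best, best_err = expr, err
    return best, best_err

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-2, 2, 200)
    y = np.sin(x) + x            # ground truth: sin(x) + x
    expr, err = best_expression(x, y, max_nodes=5)
    print(expr, err)             # finds a tree equivalent to sin(x) + x, error ~0
```

A real system would fit numeric constants (for example by least squares or nonlinear optimization) instead of fixing them, canonicalize trees to avoid re-evaluating algebraically equivalent expressions, and handle multiple input variables; the sketch only illustrates the core claim that an unbiased, exhaustive search over small expressions is feasible.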
Keywords:
Machine Learning: ML: Symbolic methods
Machine Learning: ML: Explainable/Interpretable machine learning
Machine Learning: ML: Regression
Search: S: Search and machine learning