Particle physics experiments, like the Large Hadron Collider in Geneva, can generate thousands of data points listing detected particle reactions. An important learning task is to analyze the reaction data for evidence of conserved quantities and hidden particles. This task involves latent structure in two ways: first, hypothesizing hidden quantities whose conservation determines which reactions occur, and second, hypothesizing the presence of hidden particles. We model this problem in the classic linear algebra framework of automated scientific discovery due to Valdes-Perez, Zytkow and Simon, where both reaction data and conservation laws are represented as matrices. We introduce a new criterion for selecting a matrix model for reaction data: find hidden particles and conserved quantities that rule out as many interactions among the nonhidden particles as possible. A polynomial-time algorithm for optimizing this criterion is based on the new theorem that hidden particles are required if and only if the Smith Normal Form of the reaction matrix R contains entries other than 0 or 1. To our knowledge this is the first application of Smith matrix decomposition to a problem in AI. Using data from particle accelerators, we compare our algorithm to the main model of particles in physics, known as the Standard Model: our algorithm discovers conservation laws that are equivalent to those in the Standard Model, and indicates the presence of a hidden particle (the electron antineutrino) in accordance with the Standard Model.

Oliver Schulte