Semantic Representation of Data Science Programs

Evan Patterson, Ioana Baldini, Aleksandra Mojsilović, Kush R. Varshney

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence

Your computer is continuously executing programs, but does it really understand them? Not in any meaningful sense. That burden falls upon human knowledge workers, who are increasingly asked to write and understand code. They would benefit greatly from intelligent tools that reveal the connections between their code and its subject matter. Towards this prospect, we present an AI system that forms semantic representations of computer programs, using techniques from knowledge representation and program analysis. These representations are created through a novel algorithm for the semantic enrichment of dataflow graphs. We illustrate its workings with examples from the field of data science. The algorithm is undergirded by a new ontology language for modeling computer programs and a new ontology about data science, written in this language.
Keywords:
Knowledge Representation, Reasoning, and Logic: Knowledge Representation Languages
Multidisciplinary Topics and Applications: Knowledge-based Software Engineering
Machine Learning: Interpretability