MATcH
BERT Is Not The Count: Learning to Match Mathematical Statements with Proofs
1University of Edinburgh
2Univ. Grenoble Alpes
Abstract
We introduce a task consisting in matching a proof to a given mathematical statement. The task fits well within current research on Mathematical Information Retrieval and, more generally, mathematical article analysis. We present a dataset for the task (the MATcH dataset) consisting of over 180k statement-proof pairs extracted from modern mathematical research articles.
While pre-trained language models perform strongly with a mean reciprocal rank of 73.7, our analysis shows they rely on shallow symbolic cues. MATcH highlights the need for deeper semantic reasoning in mathematical article analysis.
Dataset



Results


BibTeX
@inproceedings{li-etal-2023-bert,
title = {BERT Is Not The Count: Learning to Match Mathematical Statements with Proofs},
author = {Li, Weixian Waylon and Ziser, Yftah and Coavoux, Maximin and Cohen, Shay B.},
booktitle = {Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics},
year = {2023},
address = {Dubrovnik, Croatia},
publisher = {Association for Computational Linguistics},
url = {https://aclanthology.org/2023.eacl-main.260/},
doi = {10.18653/v1/2023.eacl-main.260},
pages = {3581--3593}
}