Smoothing Entailment Graphs with Language Models

Mark Johnson, Nick McKenna, Tianyi Li, Mark Steedman

31 October 2023

The diversity and Zipfian frequency distribution of natural language predicates in corpora lead to sparsity in Entailment Graphs (EGs) built by Open Relation Extraction (ORE). EGs are theoretically founded and computationally efficient, but as symbolic models for natural language inference, they fail if a novel premise or hypothesis vertex is missing at test time. We introduce a theory of optimal graph smoothing to overcome vertex sparsity by constructing transitive chains. We then demonstrate an efficient, open-domain smoothing method using an off-the-shelf Language Model to find approximations of missing premise predicates, improving recall by 25.1 and 16.3 percentage points on two difficult directional entailment datasets while raising average precision. Further, in a recent QA task, we show that EG smoothing is most useful for answering questions with less supporting text, where missing predicates are more costly. Finally, in controlled experiments with WordNet we show that hypothesis smoothing is difficult, but possible in principle.
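To make the premise-smoothing idea concrete, the following is a minimal sketch, not the authors' implementation: when a premise predicate has no vertex in the entailment graph, an off-the-shelf encoder retrieves the nearest in-graph predicate, and entailment is scored through a transitive chain via that substitute vertex. The toy graph, the dictionary layout, the `smooth_premise` function, the similarity threshold, and the choice of `sentence-transformers` model are all illustrative assumptions.

```python
# Sketch of premise smoothing against a toy entailment graph (hypothetical
# data structures; only the back-off-to-nearest-vertex idea is from the paper).
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

model = SentenceTransformer("all-MiniLM-L6-v2")  # any off-the-shelf encoder

# Toy entailment graph: premise vertex -> {entailed vertex: edge score}.
graph = {
    "person purchases thing": {"person owns thing": 0.9},
    "person owns thing": {"person has thing": 0.8},
}

vertices = list(graph)
vertex_emb = model.encode(vertices, normalize_embeddings=True)

def smooth_premise(premise: str, hypothesis: str, sim_threshold: float = 0.6) -> float:
    """Score premise -> hypothesis, backing off to the nearest in-graph
    premise vertex when the premise itself is missing from the graph."""
    if premise in graph:  # exact vertex present: no smoothing needed
        return graph[premise].get(hypothesis, 0.0)
    # Back off: approximate the missing premise with its most similar vertex.
    q = model.encode([premise], normalize_embeddings=True)[0]
    sims = vertex_emb @ q  # cosine similarity (embeddings are normalized)
    best = int(np.argmax(sims))
    if sims[best] < sim_threshold:
        return 0.0  # no sufficiently similar substitute vertex exists
    # Transitive chain: premise ~ vertices[best] -> hypothesis.
    return float(sims[best]) * graph[vertices[best]].get(hypothesis, 0.0)

# "person buys thing" is absent from the graph, so it is smoothed onto
# "person purchases thing" and scored through that vertex's outgoing edge.
print(smooth_premise("person buys thing", "person owns thing"))
```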


Venue: IJCNLP-AACL 2023, The 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (http://www.ijcnlp-aacl2023.org/)