Measuring and reducing non-multifact reasoning in multi-hop question answering

Published in arXiv preprint arXiv:2005.00789, 2020