Download a PDF of the paper titled Are Machine Rationales (Not) Useful to Humans? Measuring and Improving Human Utility of Free-Text Rationales, by Brihi Joshi and 8 other authors

Download PDF

Abstract: Among the remarkable emergent capabilities of large language models (LMs) is free-text rationalization; beyond a certain scale, large LMs are capable of generating seemingly useful rationalizations, which, in turn, can dramatically enhance their performance on leaderboards. This phenomenon raises a question: can machine-generated rationales also be useful to humans, especially when lay humans try to answer questions based on those machine rationales? We observe that the human utility of existing rationales is far from satisfactory and is expensive to estimate with human studies. Task performance of the LM generating the rationales, and similarity between generated and gold rationales, are not good indicators of their human utility. While we observe that certain properties of rationales, such as conciseness and novelty, are correlated with their human utility, estimating them without human involvement is challenging. We show that, by estimating a rationale's helpfulness in answering similar unseen instances, we can measure its human utility to a better extent. We also translate this finding into an automated score, GEN-U, that we propose, which can help improve LMs' ability to generate rationales with better human utility while maintaining most of their task performance.