
Can LLMs keep track of very long conversations? We evaluate the 'conversational memory' of LLMs via 3 tasks on our dataset of multi-session, multimodal dialogs --> LLMs struggle to remember, reason over history, and draw long-range temporal/causal connections. arxiv.org/abs/2402.17753 🧵