Fede_Ranaldi
@FedeRanaldi

121 posts
I'm Federico Ranaldi, PhD in Data Science at @unitorvergata. My research focuses on LLMs handling Formal Domains spanning from Logic to Code.

Frascati, Rome · Joined March 2023
182 Following · 89 Followers
Yu Wang
Yu Wang@__YuWang__·
We just released a new survey of Agent Memory! We frame agent memory along three orthogonal axes:
• Substrate — where memory lives
• Cognitive mechanism — what role it plays (episodic/semantic/procedural)
• Subject — who it serves (user/agent)
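The three axes above can be sketched as a small data model. All type and field names below are illustrative choices of mine, not terminology taken from the survey:

```python
from dataclasses import dataclass
from enum import Enum

class Substrate(Enum):            # where memory lives
    CONTEXT_WINDOW = "context_window"
    EXTERNAL_STORE = "external_store"
    MODEL_WEIGHTS = "model_weights"

class Mechanism(Enum):            # what role memory plays
    EPISODIC = "episodic"         # specific past events
    SEMANTIC = "semantic"         # general facts
    PROCEDURAL = "procedural"     # skills and how-to knowledge

class Subject(Enum):              # who the memory serves
    USER = "user"
    AGENT = "agent"

@dataclass
class MemoryItem:
    content: str
    substrate: Substrate
    mechanism: Mechanism
    subject: Subject

# Example: a user preference stored outside the context window.
item = MemoryItem("User prefers concise answers",
                  Substrate.EXTERNAL_STORE, Mechanism.SEMANTIC, Subject.USER)
```

The point of the three-axis framing is that these dimensions vary independently: the same external store can hold episodic agent memory or semantic user memory.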
Fede_Ranaldi reposted
Oren Sultan
Oren Sultan@oren_sultan·
Can LLMs reliably predict program termination? We evaluate frontier LLMs in the International Competition on Software Verification (SV-COMP) 2025, directly competing with state-of-the-art verification systems. @AIatMeta @HebrewU @Bloomberg @imperialcollege @ucl @jordiae @pascalkesseli @jvanegue @HyadataLab @adiyossLC @PeterOHearn12 Paper: arxiv.org/pdf/2601.18987 Website: orensultan.com/llms_halting_p… 🧵👇 1/n
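As a reminder of why termination prediction is hard even for tiny programs (this example is mine, not from the paper): the Collatz iteration below is conjectured, but not proven, to reach 1 for every positive integer, so even a one-line loop can defeat termination analysis.

```python
from typing import Optional

def collatz_steps(n: int, max_steps: int = 10_000) -> Optional[int]:
    """Count steps for n to reach 1, or None if the budget runs out.

    Whether this loop terminates for *every* positive n is the open
    Collatz conjecture, which illustrates how hard deciding halting
    can be, for verifiers and LLMs alike.
    """
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None  # budget exhausted; termination undecided
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

print(collatz_steps(27))  # 111: a famously long trajectory for a small start
```

SV-COMP benchmarks pose exactly this kind of question over C programs, which is what makes a head-to-head comparison between LLMs and dedicated verifiers interesting.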
Kingsley Uyi Idehen
Kingsley Uyi Idehen@kidehen·
@JayaGup10 Semantic Web == Context / Knowledge Graphs. And yes, it has always been a 1 Trillion+ opportunity waiting for its perfect complement in the form of LLMs 😀
Jaya Gupta
Jaya Gupta@JayaGup10·
Before LLMs, Palantir was competing with Snowflake and Databricks. Post-LLMs, they do not believe they have any competitors. Why?

Snowflake/Databricks optimized for SQL and query throughput: get raw data into tables, run fast analytical reads, ship dashboards and models on top.

Palantir made a different bet: an ontology, a world model where data is represented the way humans actually reason about it (objects, relationships, properties; nouns/verbs/adjectives). Back then, that was built for government analysts trying to make sense of messy, interdependent systems.

Then LLMs arrived, and the ontology suddenly looked like the perfect interface, because models don’t want a trillion rows. They want a structured, language-shaped substrate: named entities, typed relationships, constraints, and “what interacts with what”, something you can linearize into a coherent prompt, traverse, and act on.

The bigger implication for decision traces is that the “context graph” problem we wrote about has multiple architectural solutions:

Platform-first (example: Palantir): prescribe the unified world model upfront. Pay the integration + ontology + embedded-team tax (months of use-case discovery / workflow decomposition / “process mining”), and in return you get a substrate that can connect data to decisions, because everything now lives inside the same model, for an extremely steep price.

Workflow-first (decision traces): don’t start by rebuilding the world. Instrument the moments where the world changes. Capture decision receipts at commit surfaces: inputs referenced, policy/constraints, exception path, approvals, action taken, outcome. Over time (not day 1), that write-time provenance becomes its own world model, learned from trajectories rather than imposed upfront (there will be many different methods here).

And importantly: this is still an ontology approach, just a different kind. Palantir prescribes the ontology first. Our take is that startups can learn it bottom-up from traces.
You start by capturing what people actually do at the decision surface: what evidence is referenced, which approvals happen, what exceptions recur, what actions are taken, what outcomes follow. Over time, you infer the minimal set of entities + relations that explain those trajectories. The missing piece is decision traces: without them, you have state, but not the legible “why”! Cc @akoratana
Jaya Gupta@JayaGup10

x.com/i/article/2003…

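The "decision receipt" in the workflow-first option can be sketched as a minimal record type. The field names below are my reading of the categories the thread lists (inputs referenced, policy/constraints, exception path, approvals, action, outcome), not an actual product schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class DecisionReceipt:
    """Write-time provenance captured at a commit surface."""
    inputs_referenced: List[str]          # evidence consulted
    policy: str                           # constraint or rule applied
    approvals: List[str]                  # who signed off
    action: str                           # what was actually done
    exception_path: Optional[str] = None  # set when the normal flow was bypassed
    outcome: Optional[str] = None         # filled in after the fact
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Hypothetical receipt for a payments workflow.
receipt = DecisionReceipt(
    inputs_referenced=["invoice_4411", "vendor_risk_score"],
    policy="payments over $10k require two approvals",
    approvals=["alice", "bob"],
    action="release_payment",
)
```

The bottom-up ontology argument is then that a large collection of such receipts, rather than an upfront model, is what you mine for the entities and relations that explain the observed trajectories.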
𝖋𝖎𝖗𝖔𝖟
𝖋𝖎𝖗𝖔𝖟@firozkhxn_·
KGs + LLMs are exploding—hard to track every new idea. One repo collects 200+ recent papers, surveys & code on marrying knowledge graphs with large language models. Saves weeks of literature crawl. github.com/zjukg/KG-LLM-P…
Fede_Ranaldi reposted
Stella Li
Stella Li@StellaLisy·
🤔💭What even is reasoning? It's time to answer the hard questions! We built the first unified taxonomy of 28 cognitive elements underlying reasoning Spoiler—LLMs commonly employ sequential reasoning, rarely self-awareness, and often fail to use correct reasoning structures🧠
Fede_Ranaldi
Fede_Ranaldi@FedeRanaldi·
@WikiResearch Hi! Sharing our work, in which we analyze how LLMs leverage the parametric knowledge acquired during pretraining to recall structured information, such as pieces of Knowledge Graphs, and to solve downstream tasks like #TextToSPARQL. arxiv.org/abs/2505.15501
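For readers unfamiliar with the task, a generic Text-to-SPARQL pair looks like the following. This is a DBpedia-style example of my own, not drawn from the paper, and the property names are illustrative:

```python
# Input: a natural-language question over a knowledge graph.
question = "Which rivers flow through Rome?"

# Target: a SPARQL query the model must produce for that question.
sparql = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?river WHERE {
  ?river a dbo:River ;
         dbo:city dbr:Rome .
}
"""
```

The interesting question the paper asks is whether the model can produce such queries by recalling graph structure it memorized during pretraining, rather than only from schema given in the prompt.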
Haider.
Haider.@slow_developer·
important research paper from google...

"LLMs don't just memorize, they build a geometric map that helps them reason"

according to the paper:
– builds a global map from only local pairs
– plans full unseen paths when knowledge is in weights; fails in context
– turns a many-step path into a 1-step pick
– comes from a natural training bias; room to make memory more geometric
Harish Tayyar Madabushi
Is In-Context Learning (ICL) in LLMs Memorisation? Emergence? Some Algorithmic Capability? 🤔 📢New work exploring ICL in LLMs: arxiv.org/abs/2505.11004 💡Key Finding: ICL capabilities are linked to token frequency 🤨 Strap in for the unexpected 🤯 A 🧵👇 #NLProc #LLMs
Fede_Ranaldi reposted
Adi Simhi
Adi Simhi@AdiSimhi·
🤔What happens when LLM agents choose between achieving their goals and avoiding harm to humans in realistic management scenarios? Are LLMs pragmatic or prefer to avoid human harm? 🚀 New paper out: ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs🚀🧵
Fede_Ranaldi
Fede_Ranaldi@FedeRanaldi·
@denizbayazit Good work! We ran a more static analysis to locate typological features in monolingual BERTs. Our method tests whether syntactic and morphological differences between languages are reflected in their monolingual models. aclanthology.org/2023.findings-…
Deniz Bayazit
Deniz Bayazit@denizbayazit·
1/🚨 New preprint How do #LLMs’ inner features change as they train? Using #crosscoders + a new causal metric, we map when features appear, strengthen, or fade across checkpoints—opening a new lens on training dynamics beyond loss curves & benchmarks. #interpretability
Anton Tsitsulin
Anton Tsitsulin@graph_·
𝐋𝐞𝐭 𝐘𝐨𝐮𝐫 𝐆𝐫𝐚𝐩𝐡 𝐃𝐨 𝐭𝐡𝐞 𝐓𝐚𝐥𝐤𝐢𝐧𝐠: 𝐄𝐧𝐜𝐨𝐝𝐢𝐧𝐠 𝐒𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞𝐝 𝐃𝐚𝐭𝐚 𝐟𝐨𝐫 𝐋𝐋𝐌𝐬 Don't know what to do with your graphs in 2024? Shove them to an LLM, of course, and let LLM figure out what to do! arxiv.org/abs/2402.05862 Short thread (1/5):
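A common baseline in this line of work is to linearize a graph into text before prompting an LLM. The toy edge-list encoder below is my sketch of that general idea, not the encoding method the paper proposes:

```python
from typing import List, Tuple

def linearize(edges: List[Tuple[str, str]]) -> str:
    """Turn an undirected edge list into a plain-text graph description."""
    nodes = sorted({v for edge in edges for v in edge})
    lines = [f"Nodes: {', '.join(nodes)}."]
    lines += [f"{u} is connected to {v}." for u, v in edges]
    return "\n".join(lines)

# A graph prompt an LLM could then be asked questions about.
prompt = linearize([("Alice", "Bob"), ("Bob", "Carol")])
print(prompt)
```

Papers in this area compare many such encodings (edge lists, adjacency descriptions, learned graph embeddings) precisely because LLM performance turns out to be sensitive to the choice.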
François Chollet
François Chollet@fchollet·
I believe that program synthesis will solve reasoning. And I believe that deep learning will solve program synthesis (by guiding a discrete program search process). But I don't think you can go all that far with just prompting a LLM to generate end-to-end Python programs (even with a verification step and many samples). That won't scale to very long programs.
Casper Hansen
Casper Hansen@casper_hansen_·
i don't think you understand

you train on Text-to-SQL
it gets 73.7% on Spider

you train on Text-to-Cypher
it gets 5.5% exact match

you realize they share abstractions
schema grounding, joins, filtering

you train on both
Spider jumps to 76.5%
Text2Cypher hits 6.2%

you add Chain-of-Thought
it starts explaining its queries

you add reinforcement learning
it learns from execution feedback

you design a topology-aware reward
it understands graph edit distance

you realize SQL helps Cypher and Cypher helps SQL
cross-formalism transfer is real

you test on MongoDB Query Language
never trained on it
it still works

you test on table QA
never trained on it
62.5% exact match

you test on knowledge graph QA
never trained on it
86.3% accuracy

you realize you didn't train a SQL model
you didn't train a Cypher model
you trained structured reasoning itself

someone says "but it's just 32B parameters"
you point to QwQ-32B-trained-Both
it beats o3 on Text2Cypher

they say "but the datasets overlap"
you show the ablations
even single-task training transfers

they say "but it's not real understanding"
your model generalizes to unseen query languages
they go quiet

you realize you didn't build a parser
you built a bridge between natural language and every structured formalism that will ever exist

tell me you hate this type of post, but do read the paper🔽 arxiv.org/abs/2506.21575
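To make the "shared abstractions" point concrete, here is the same question expressed in SQL and in Cypher over an invented schema (my example, not from the paper): both need schema grounding, a join/traversal, and a filter, which is the common structure the thread argues transfers across formalisms.

```python
question = "Names of employees who work in the Sales department"

# Relational formalism: the relationship lives in a foreign-key JOIN.
sql = """
SELECT e.name
FROM employees e
JOIN departments d ON e.dept_id = d.id
WHERE d.name = 'Sales';
"""

# Graph formalism: the same relationship is an explicit typed edge.
cypher = """
MATCH (e:Employee)-[:WORKS_IN]->(d:Department)
WHERE d.name = 'Sales'
RETURN e.name;
"""
```

Seen side by side, the filter (`d.name = 'Sales'`) and the projection (`e.name`) are identical; only the traversal syntax differs, which is why training on one formalism can plausibly help the other.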
Fede_Ranaldi
Fede_Ranaldi@FedeRanaldi·
@TomSheffer17807 @l2m2_workshop @aclmeeting Thank you, Tom. Our research certainly deserves to be extended on several levels, such as studying internal representations or applying adaptation techniques through mechanistic interpretability methods.
Fede_Ranaldi
Fede_Ranaldi@FedeRanaldi·
@taskinfatih @l2m2_workshop @aclmeeting Thank you for the exchange of ideas. I consider it quite appropriate to invest time in formalizing task instances from a theoretical perspective, rather than running experiments and evaluating model capabilities just by looking at input-output behavior.
Fede_Ranaldi
Fede_Ranaldi@FedeRanaldi·
@taskinfatih @l2m2_workshop @aclmeeting Hi Fatih. Our idea is related to Chollet's program synthesis and is specifically adapted to reasoning over #knowledgegraphs. We adopt a neuro-symbolic approach aimed at evaluating models' real capabilities and detecting potential benchmark contamination.