Dr Steven McDermott

16.6K posts

@soci

Leeds · Joined February 2007
1.5K Following · 1.3K Followers
Dr Steven McDermott
Dr Steven McDermott@soci·
Explanation of "Accurate Predictions on Small Data with a Tabular Foundation Model" (Hollmann et al., 2025). This work addresses a critical challenge in machine learning: achieving high prediction accuracy on small tabular datasets. bohrium.dp.tech/ai-search/shar…
0 replies · 0 reposts · 0 likes · 90 views
Dr Steven McDermott retweeted
Simon Wardley
Simon Wardley@swardley·
It amazes me that the most important metrics in development are the two that are never discussed, let alone measured (unlike lines of code, story points, cycle time, and devex satisfaction): mean time to answer (mttA) and mean time to question (mttQ).
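mttA and mttQ have no standard formula; as a rough illustration only, here is a minimal stdlib sketch of what measuring mean time to answer could look like, over a made-up log of question/answer timestamp pairs:

```python
from datetime import datetime

# Hypothetical event log: (question asked, question answered) timestamps.
events = [
    ("2024-01-01 09:00", "2024-01-01 09:45"),
    ("2024-01-01 10:00", "2024-01-01 12:00"),
    ("2024-01-02 14:00", "2024-01-02 14:30"),
]

def mean_time_to_answer(pairs, fmt="%Y-%m-%d %H:%M"):
    """Average delay, in minutes, between a question and its answer."""
    deltas = [
        (datetime.strptime(a, fmt) - datetime.strptime(q, fmt)).total_seconds() / 60
        for q, a in pairs
    ]
    return sum(deltas) / len(deltas)

print(mean_time_to_answer(events))  # mean of 45, 120 and 30 minutes -> 65.0
```

mttQ would be the mirror image: the delay between starting work and surfacing the question in the first place.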
8 replies · 36 reposts · 188 likes · 26.6K views
Dr Steven McDermott retweeted
Rohan Paul
Rohan Paul@rohanpaul_ai·
Local models now protect your privacy while still accessing powerful LLM capabilities. Chain small and large LLMs to get the best performance while keeping data private.

🔍 Original Problem: Users share sensitive personal information with proprietary LLMs during inference, raising privacy concerns. While local open-source models help with privacy, they perform worse than proprietary models.

🛠️ Solution in this Paper:
• PAPILLON: a multi-stage pipeline where local models act as privacy-conscious proxies
• Uses DSPy prompt optimization to find optimal prompts for privacy preservation
• Two key components:
  - Prompt Creator: generates privacy-preserving prompts
  - Information Aggregator: combines responses while protecting PII
• Created the PUPA benchmark with 901 real-world user-LLM interactions containing PII

💡 Key Insights:
• Simple redaction significantly lowers LLM response quality
• Privacy-conscious delegation can balance privacy and performance
• Smaller local models can effectively leverage larger models while protecting privacy
• Prompt optimization improves both quality and privacy metrics

📊 Results:
• Maintains 85.5% response quality compared to proprietary models
• Restricts privacy leakage to only 7.5%
• Outperforms simple redaction approaches
• Shows consistent improvement across different model sizes
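The "simple redaction" baseline the paper compares against can be sketched in a few lines. The regex redactor below is only an illustration of that baseline idea, not of PAPILLON itself, which delegates to DSPy-optimized local models rather than pattern matching:

```python
import re

# Sketch of a naive redaction baseline: strip obvious PII patterns
# from a prompt before it is sent to a remote proprietary LLM.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.\w+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text):
    """Replace each matched PII pattern with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Email jane.doe@example.com or call 555-123-4567 about my refund."
print(redact(prompt))  # Email [EMAIL] or call [PHONE] about my refund.
```

The paper's point is precisely that this kind of blanket redaction degrades response quality, which is why a local model that rewrites the prompt intelligently does better.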
11 replies · 51 reposts · 282 likes · 31.8K views
Dr Steven McDermott retweeted
Rohan Paul
Rohan Paul@rohanpaul_ai·
Fantastic open-source tool for RAG, for chatting with your documents with open-source LLMs. ✨ It trended at Number 1 on GitHub for quite some time. A clean, customizable RAG UI for chatting with your documents.

→ Open-source RAG UI for document QA
→ Supports local LLMs and API providers
→ Hybrid RAG pipeline with full-text & vector retrieval
→ Multi-modal QA with figures & tables support
→ Advanced citations with in-browser PDF preview
→ Complex reasoning with question decomposition
→ Configurable settings UI
→ Extensible Gradio-based architecture

Key features: 👇
🌐 Host your own RAG web UI with multi-user login
🤖 Organize LLM & embedding models (local & API)
🔎 Hybrid retrieval + re-ranking for quality
📚 Multi-modal parsing and QA across documents
💡 Detailed citations with relevance scores
🧩 Question decomposition for complex queries
🎛️ Adjustable retrieval & generation settings
🔌 Customizable UI and indexing strategies
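The hybrid full-text + vector retrieval idea can be sketched with toy scorers. Both the keyword-overlap score and the hand-written three-dimensional vectors below are stand-ins for a real BM25 index and learned embeddings:

```python
from math import sqrt

# Toy corpus: doc id -> (text, pretend embedding vector).
docs = {
    "a": ("install the app with pip", [1.0, 0.0, 0.2]),
    "b": ("the app supports local LLMs", [0.1, 1.0, 0.9]),
}

def keyword_score(query, text):
    """Fraction of query words present in the document (BM25 stand-in)."""
    q, t = set(query.split()), set(text.split())
    return len(q & t) / len(q)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def hybrid_rank(query, query_vec, alpha=0.5):
    """Fuse the two scores with a weight and rank doc ids best-first."""
    scored = {
        doc_id: alpha * keyword_score(query, text)
                + (1 - alpha) * cosine(query_vec, vec)
        for doc_id, (text, vec) in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)

print(hybrid_rank("local LLMs", [0.0, 1.0, 1.0]))  # ['b', 'a']
```

Real hybrid pipelines then pass this fused candidate list to a re-ranker, as the feature list above mentions.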
19 replies · 140 reposts · 1.1K likes · 102.1K views
Dr Steven McDermott retweeted
Sanyam Bhutani
Sanyam Bhutani@bhutanisanyam1·
NotebookLlama: An Open Source version of NotebookLM 🙏 A complete tutorial on building a PDF-to-Podcast flow using Llama:
- 1B to pre-process the PDF
- 70B to convert it to a podcast transcript
- 8B to make it more dramatic
- Parler and Suno models for TTS
github.com/meta-llama/lla…
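The chain of models is just staged function composition. The stub stages below only mimic the shape of the flow (pre-process → transcript → dramatize → hand off to TTS), not the actual Llama models:

```python
# Each function stands in for one model stage of the pipeline.
def preprocess_pdf(raw):            # stand-in for the 1B pre-processing step
    return raw.strip().replace("\n", " ")

def to_transcript(text):            # stand-in for the 70B transcript step
    return f"Host: Today we discuss: {text}"

def dramatize(transcript):          # stand-in for the 8B rewrite step
    return transcript + " Stay tuned!"

def run_pipeline(raw_pdf_text):
    """Thread the text through every stage in order."""
    out = raw_pdf_text
    for stage in (preprocess_pdf, to_transcript, dramatize):
        out = stage(out)
    return out  # the real flow would hand this string to a TTS model

print(run_pipeline("Attention Is All\nYou Need"))
```

The appeal of the design is that each stage is swappable: a bigger or smaller model can replace any step without touching the rest of the chain.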
29 replies · 259 reposts · 1.5K likes · 125K views
Dr Steven McDermott retweeted
Towards Data Science
Towards Data Science@TDataScience·
"After reading this blog you’ll know where to start and how to select the most appropriate Bayesian techniques for causal discovery for your use case." An Extensive Starters Guide For Causal Discovery using Bayesian Modeling by Erdogan Taskesen towardsdatascience.com/an-extensive-s…
1 reply · 3 reposts · 29 likes · 3.8K views
Dr Steven McDermott retweeted
elvis
elvis@omarsar0·
Huge efforts to improve LLMs for tool use, computer use, reasoning, and long-context understanding. Here are a few interesting papers for the weekend:

1). Agentic Information Retrieval
Provides an introduction to agentic information retrieval, which is shaped by the capabilities of LLM agents. Discusses different types of cutting-edge applications of agentic information retrieval and their challenges. arxiv.org/abs/2410.09713

2). A Theoretical Understanding of CoT
Finds that adding correct and incorrect reasoning paths in demonstrations improves the accuracy of intermediate steps and CoT. The proposed method, Coherent CoT, significantly improves performance on several benchmarks. On the Tracking Shuffled Objects dataset, Gemini Pro shows a 6.60% improvement (from 58.20% to 64.80%), and on Penguins in a Table, DeepSeek 67B demonstrates an increase of 6.17% (from 73.97% to 80.14%). arxiv.org/abs/2410.16540

3). LongRAG
Enhances RAG's understanding of long-context knowledge, covering both global information and factual details. Consists of a hybrid retriever, an LLM-augmented information extractor, a CoT-guided filter, and an LLM-augmented generator; these key components enable the RAG system to mine global long-context information and effectively identify factual details. LongRAG outperforms long-context LLMs (up by 6.94%), advanced RAG (up by 6.16%), and vanilla RAG (up by 17.25%). arxiv.org/abs/2410.18050

4). Reasoning Patterns of OpenAI's o1 Model
When compared with other test-time compute methods, o1 achieved the best performance across most datasets. The authors observe that the most commonly used reasoning patterns in o1 are divide-and-conquer and self-refinement, and that o1 uses different reasoning patterns for different tasks: for commonsense reasoning tasks it tends to use context identification and emphasize constraints, while for math and coding tasks it mainly relies on method reuse and divide-and-conquer. arxiv.org/abs/2410.13639

5). A Survey on Data Synthesis and Augmentation for LLMs
Provides a comprehensive summary of data generation techniques across the lifecycle of LLMs, with discussions of data preparation, pre-training, fine-tuning, instruction-tuning, preference alignment, and applications. arxiv.org/abs/2410.12896

6). Beyond Browsing: API-Based Web Agents
Researchers demonstrate that AI agents using both web APIs and browsing capabilities outperform traditional web-only agents by 20% on the WebArena benchmark, achieving a state-of-the-art 35.8% success rate for task-agnostic agents. arxiv.org/abs/2410.16464
5 replies · 113 reposts · 523 likes · 50.9K views
Dr Steven McDermott retweeted
Sumanth
Sumanth@Sumanth_077·
Harvard University is offering Free world class education in Python, Data Science, Machine Learning, Data Preprocessing, Visualization & Statistics:
1 reply · 54 reposts · 237 likes · 18.8K views
Dr Steven McDermott retweeted
Akshay 🚀
Akshay 🚀@akshay_pachaar·
100% Local OpenAI Swarm Agents!🐝 OpenAI Swarm is an educational framework that explores ergonomic, lightweight multi-agent orchestration. It's fairly easy to integrate with locally running LLMs through Ollama. ______ Find me → @akshay_pachaar ✔️ And stay tuned for more on multi-agent tutorials coming soon!
4 replies · 71 reposts · 371 likes · 31.8K views
Dr Steven McDermott retweeted
Shubham Saboo
Shubham Saboo@Saboo_Shubham_·
Run any LLM like Llama 3.2 directly from Hugging Face using Ollama on your local computer (100% free and without internet).
10 replies · 45 reposts · 366 likes · 33.5K views
Dr Steven McDermott retweeted
Rohan Paul
Rohan Paul@rohanpaul_ai·
NetworkX is one of THE most popular Python graph analytics libraries, with ~15K GitHub stars and 80M monthly downloads. The library is for working with networks and graphs: it helps analyze connections between things, like social networks, computer networks, or any system where objects are connected to each other. And NetworkX just got massively accelerated through its backend integration with NVIDIA's cuGraph. ✨ Up to 500x speedups on large graph workloads in NetworkX, with zero code changes.

📌 cuGraph is NVIDIA's GPU-accelerated graph analytics library within the RAPIDS ecosystem. It provides fast graph algorithms on GPUs, supporting property graphs, remote operations, and graph neural networks (GNNs). It works with GPU DataFrames (cuDF) and integrates smoothly with the NetworkX-like API.

📌 The traditional bottleneck of NetworkX's pure-Python implementation becomes apparent when processing graphs larger than 100K nodes and 1M edges.

📌 cuGraph solves this by offloading supported algorithms to the GPU: PageRank, Louvain community detection, betweenness centrality, and about 60 other algorithms get instant acceleration.

📌 This acceleration enables previously impractical use cases. Fraud detection systems can now process massive transaction networks in real time, recommendation engines handle millions of user-item interactions efficiently, and social network analysis scales to an entire platform's worth of data on a single machine. @NVIDIAAIDev
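For intuition about what gets offloaded, here is PageRank as a pure-Python power iteration, exactly the kind of interpreter-bound loop that becomes the bottleneck on large graphs. The three-node adjacency-list graph is made up for illustration:

```python
# Toy directed graph as adjacency lists: node -> list of out-neighbors.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}

def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank over a dict-of-lists graph."""
    nodes = list(graph)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Everyone starts with the teleport share, then receives
        # a damped share of each in-neighbor's current rank.
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for node, neighbors in graph.items():
            share = damping * rank[node] / len(neighbors)
            for neighbor in neighbors:
                new[neighbor] += share
        rank = new
    return rank

ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # "c": it receives links from both "a" and "b"
```

On a GPU backend the same nested loops become a sparse matrix-vector product, which is why the speedups grow with graph size.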
10 replies · 155 reposts · 921 likes · 63K views
Dr Steven McDermott retweeted
Heather Cooper
Heather Cooper@HBCoop_·
🔹STORM🔹 Stanford University’s free, public app automates comprehensive research and report generation using AI to create Wikipedia-style articles with citations from web sources. STORM is incredible. I asked it to write a research paper about the use of AI in DNA analysis and predicting genetic traits - got a comprehensive article with citations. It takes about 3 minutes and you can download the PDF:
34 replies · 249 reposts · 1.7K likes · 176.2K views
Dr Steven McDermott retweeted
Santiago
Santiago@svpino·
Data pipelines will put you in the top 1% of the market. If you could only learn one skill for the next decade, I can't think of anything more critical than learning to move and process data at scale.

I like to tell people I'm a Machine Learning Engineer, but in reality, 90% of the value I produce comes from my ability to move data around consistently and correctly. In the field, we like to use the term "orchestration" when talking about coordinating workflows that move and process data. At a high level, there are three main steps you need to worry about:

1. Getting the data from its source
2. Processing and cleaning that data
3. Delivering the cleaned data to the right place

You might have also heard of "ETL" (Extract, Transform, Load). That's how most people refer to the process above. Of course, building a simple ETL system isn't complex; most developers can do it without too much trouble. The problem is designing resilient, scalable, and fault-tolerant systems. You can't code your way to a production-ready orchestration platform (ask me how I know).

I started with Airflow and eventually moved to @kestra_io because of its event-driven architecture. Event-driven means you can kick off a workflow automatically based on different triggers: for instance, when somebody uploads a new file to a folder, an app updates a database table, or there's a new message in a queue.

It's hard to summarize everything you get from Kestra, but here are some of the highlights:
• Kestra is free and open-source
• You install it from a Docker container
• Workflows as Code using YAML <--- this is awesome
• Scales to millions of executions
• It integrates with every cloud platform you've seen
• Language agnostic (but I still like Python the most)

Here is a link to their GitHub repository: shortclick.link/4ls02n

Here are the three things I recommend:
1 - Take a look at the live demo in their GitHub repo
2 - Build a simple workflow (it will take 5 minutes)
3 - Talk to your boss. Where can you plug this into your company?

I started using Kestra at the height of the pandemic. It's an awesome tool, and I'm proud that they are sponsoring my writing. I hope you find it helpful as well.
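As a rough sketch of the "Workflows as Code using YAML" idea, a minimal Kestra-style flow might look like the following. The `hello_pipeline` id, `demo` namespace, and messages are invented, and the task type identifier should be checked against the Kestra documentation for the version you run:

```yaml
id: hello_pipeline
namespace: demo

tasks:
  - id: extract
    type: io.kestra.plugin.core.log.Log
    message: "Pretend we pulled data from the source here"
  - id: transform
    type: io.kestra.plugin.core.log.Log
    message: "Pretend we cleaned and delivered it here"
```

The point of the format is that the extract/transform/load steps from the list above become declarative, versionable task entries rather than imperative glue code.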
34 replies · 150 reposts · 1.1K likes · 90.4K views
Dr Steven McDermott retweeted
Santiago
Santiago@svpino·
Here is a cookbook on how to test your LLM applications. This cookbook uses Ragas and Comet Opik, an open-source platform for evaluating, testing, and monitoring LLM applications. You can use Opik to:
• Detect hallucinations
• Evaluate RAG applications
• Determine answer relevance
• Measure context recall
• Create and store test cases
• Integrate it with your CI/CD pipeline using Pytest
Cookbook: comet.com/docs/opik/cook…
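The CI integration idea can be sketched as an ordinary pytest-style check. The keyword-overlap `relevance` function below is a hypothetical stand-in for the real Ragas/Opik metrics, which score answers with LLM-based judges rather than word counting:

```python
# Placeholder "answer relevance" metric: fraction of reference words
# that also appear in the answer. A real pipeline would call a
# Ragas/Opik metric here instead.
def relevance(answer, reference):
    a, r = set(answer.lower().split()), set(reference.lower().split())
    return len(a & r) / len(r)

def test_answer_is_relevant():
    """The kind of assertion a CI pipeline would run on each release."""
    answer = "Paris is the capital of France"
    reference = "the capital of France is Paris"
    assert relevance(answer, reference) >= 0.8

test_answer_is_relevant()
```

Wiring checks like this into pytest means a regression in answer quality fails the build the same way a broken unit test would.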
9 replies · 67 reposts · 429 likes · 39.8K views
Dr Steven McDermott retweeted
Luiza Jarovsky, PhD
Luiza Jarovsky, PhD@LuizaJarovsky·
🚨 [AI RESEARCH] If you're interested in AI ETHICS, the paper "Reconstructing AI Ethics Principles: Rawlsian Ethics of AI" by Salla Westerstrand is a MUST-READ. These are the 'Rawlsian ethics guidelines for fair AI' proposed:

1️⃣ "Developers and deployers of an AI system must ensure that the AI system does not threaten the basic liberties of any individual.
➵ AI systems should not endanger but support the freedom of thought and liberty of conscience.
➵ AI systems should not compromise but support political liberties and freedom of association, such as the right to vote and to hold public office.
➵ AI systems should not harm but support the liberty and integrity of the person, including freedom from psychological oppression and physical assault and dismemberment.
➵ All AI systems should be aligned with the principle of rule of law.

2️⃣ The use and development of AI systems should not negatively impact people's opportunities to seek income and wealth. If an AI system is used in the distribution of advantageous positions, such as recruitment, performance evaluation, or access to education, it needs to be ensured that:
➵ The tool is trained with non-biased training data, or appropriate tools are used to mitigate the biases in the final product if no non-biased training data is available (data bias mitigation),
➵ The outcome of the use of the tool includes an explanation of the grounds for the outcome it produces (explainability), and
➵ The algorithms used shall encourage neither biased results nor the systematic repetition and amplification thereof in, e.g., the feedback loops of a machine learning system (algorithmic bias mitigation).
*If these conditions cannot be met, AI should not be used in the process.

3️⃣ All inequalities affected by AI systems, such as acquiring a position of power or accumulation of wealth, must be to the greatest benefit of the least advantaged members of society."

╰┈➤ This is a fascinating paper, especially for those familiar with Rawls' theory of justice. As AI development advances and AI agents become more prevalent, AI ethics is more important than ever, including in the context of supporting AI regulation and policy efforts.

╰┈➤ Download & read the paper below.

🏛️ STAY UP TO DATE: AI governance is moving fast. Join 37,200+ people in 150+ countries who subscribe to my newsletter on AI policy, compliance & regulation, including outstanding research papers (link below).
5 replies · 28 reposts · 117 likes · 7.9K views