Dr Steven McDermott

16.6K posts

@soci

Leeds · Joined February 2007
1.5K Following · 1.3K Followers
Dr Steven McDermott
Dr Steven McDermott@soci·
Explanation of "Accurate Predictions on Small Data with a Tabular Foundation Model" (Hollmann et al., 2025). This work addresses a critical challenge in machine learning: achieving high prediction accuracy on small tabular datasets. bohrium.dp.tech/ai-search/shar…
0 replies · 0 reposts · 0 likes · 90 views
Dr Steven McDermott retweeted
Simon Wardley
Simon Wardley@swardley·
It amazes me that the most important metrics in development are the two that are never discussed, let alone measured (unlike lines of code, story points, cycle time, and devex satisfaction): mean time to answer (mttA) and mean time to question (mttQ).
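mttA and mttQ have no standard formula; as a rough illustration only, here is a minimal stdlib sketch of what measuring mean time to answer could look like, over a made-up log of question/answer timestamp pairs:

```python
from datetime import datetime

# Hypothetical event log: (question asked, question answered) timestamps.
events = [
    ("2024-01-01 09:00", "2024-01-01 09:45"),
    ("2024-01-01 10:00", "2024-01-01 12:00"),
    ("2024-01-02 14:00", "2024-01-02 14:30"),
]

def mean_time_to_answer(pairs, fmt="%Y-%m-%d %H:%M"):
    """Average delay, in minutes, between a question and its answer."""
    deltas = [
        (datetime.strptime(a, fmt) - datetime.strptime(q, fmt)).total_seconds() / 60
        for q, a in pairs
    ]
    return sum(deltas) / len(deltas)

print(mean_time_to_answer(events))  # mean of 45, 120 and 30 minutes -> 65.0
```

mttQ would be the mirror image: the delay between starting work and surfacing the question in the first place.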
8 replies · 36 reposts · 188 likes · 26.6K views
Dr Steven McDermott retweeted
Rohan Paul
Rohan Paul@rohanpaul_ai·
Local models now protect your privacy while still accessing powerful LLM capabilities. Chain small and large LLMs to get the best performance while keeping data private.

🔍 Original Problem: Users share sensitive personal information with proprietary LLMs during inference, raising privacy concerns. While local open-source models help with privacy, they perform worse than proprietary models.

🛠️ Solution in this Paper:
• PAPILLON: a multi-stage pipeline where local models act as privacy-conscious proxies
• Uses DSPy prompt optimization to find optimal prompts for privacy preservation
• Two key components:
  - Prompt Creator: generates privacy-preserving prompts
  - Information Aggregator: combines responses while protecting PII
• Created the PUPA benchmark with 901 real-world user-LLM interactions containing PII

💡 Key Insights:
• Simple redaction significantly lowers LLM response quality
• Privacy-conscious delegation can balance privacy and performance
• Smaller local models can effectively leverage larger models while protecting privacy
• Prompt optimization improves both quality and privacy metrics

📊 Results:
• Maintains 85.5% response quality compared to proprietary models
• Restricts privacy leakage to only 7.5%
• Outperforms simple redaction approaches
• Shows consistent improvement across different model sizes
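The "simple redaction" baseline the paper compares against can be sketched in a few lines. The regex redactor below is only an illustration of that baseline idea, not of PAPILLON itself, which delegates to DSPy-optimized local models rather than pattern matching:

```python
import re

# Sketch of a naive redaction baseline: strip obvious PII patterns
# from a prompt before it is sent to a remote proprietary LLM.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.\w+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text):
    """Replace each matched PII pattern with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Email jane.doe@example.com or call 555-123-4567 about my refund."
print(redact(prompt))  # Email [EMAIL] or call [PHONE] about my refund.
```

The paper's point is precisely that this kind of blanket redaction degrades response quality, which is why a local model that rewrites the prompt intelligently does better.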
11 replies · 51 reposts · 282 likes · 31.8K views
Dr Steven McDermott retweeted
Rohan Paul
Rohan Paul@rohanpaul_ai·
Fantastic open-source tool for RAG, for chatting with your documents with open-source LLMs. ✨ It trended at Number 1 on GitHub for quite some time. A clean, customizable RAG UI for chatting with your documents.

→ Open-source RAG UI for document QA
→ Supports local LLMs and API providers
→ Hybrid RAG pipeline with full-text & vector retrieval
→ Multi-modal QA with figures & tables support
→ Advanced citations with in-browser PDF preview
→ Complex reasoning with question decomposition
→ Configurable settings UI
→ Extensible Gradio-based architecture

Key features: 👇
🌐 Host your own RAG web UI with multi-user login
🤖 Organize LLM & embedding models (local & API)
🔎 Hybrid retrieval + re-ranking for quality
📚 Multi-modal parsing and QA across documents
💡 Detailed citations with relevance scores
🧩 Question decomposition for complex queries
🎛️ Adjustable retrieval & generation settings
🔌 Customizable UI and indexing strategies
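The hybrid full-text + vector retrieval idea can be sketched with toy scorers. Both the keyword-overlap score and the hand-written three-dimensional vectors below are stand-ins for a real BM25 index and learned embeddings:

```python
from math import sqrt

# Toy corpus: doc id -> (text, pretend embedding vector).
docs = {
    "a": ("install the app with pip", [1.0, 0.0, 0.2]),
    "b": ("the app supports local LLMs", [0.1, 1.0, 0.9]),
}

def keyword_score(query, text):
    """Fraction of query words present in the document (BM25 stand-in)."""
    q, t = set(query.split()), set(text.split())
    return len(q & t) / len(q)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def hybrid_rank(query, query_vec, alpha=0.5):
    """Fuse the two scores with a weight and rank doc ids best-first."""
    scored = {
        doc_id: alpha * keyword_score(query, text)
                + (1 - alpha) * cosine(query_vec, vec)
        for doc_id, (text, vec) in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)

print(hybrid_rank("local LLMs", [0.0, 1.0, 1.0]))  # ['b', 'a']
```

Real hybrid pipelines then pass this fused candidate list to a re-ranker, as the feature list above mentions.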
19 replies · 140 reposts · 1.1K likes · 102.1K views
Dr Steven McDermott retweeted
Sanyam Bhutani
Sanyam Bhutani@bhutanisanyam1·
NotebookLlama: An Open Source version of NotebookLM 🙏 A complete tutorial on building a PDF-to-Podcast flow using Llama:
- 1B to pre-process the PDF
- 70B to convert it to a podcast transcript
- 8B to make it more dramatic
- Parler and Suno models for TTS
github.com/meta-llama/lla…
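The chain of models is just staged function composition. The stub stages below only mimic the shape of the flow (pre-process → transcript → dramatize → hand off to TTS), not the actual Llama models:

```python
# Each function stands in for one model stage of the pipeline.
def preprocess_pdf(raw):            # stand-in for the 1B pre-processing step
    return raw.strip().replace("\n", " ")

def to_transcript(text):            # stand-in for the 70B transcript step
    return f"Host: Today we discuss: {text}"

def dramatize(transcript):          # stand-in for the 8B rewrite step
    return transcript + " Stay tuned!"

def run_pipeline(raw_pdf_text):
    """Thread the text through every stage in order."""
    out = raw_pdf_text
    for stage in (preprocess_pdf, to_transcript, dramatize):
        out = stage(out)
    return out  # the real flow would hand this string to a TTS model

print(run_pipeline("Attention Is All\nYou Need"))
```

The appeal of the design is that each stage is swappable: a bigger or smaller model can replace any step without touching the rest of the chain.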
29 replies · 259 reposts · 1.5K likes · 125K views
Dr Steven McDermott retweeted
Towards Data Science
Towards Data Science@TDataScience·
"After reading this blog you’ll know where to start and how to select the most appropriate Bayesian techniques for causal discovery for your use case." An Extensive Starters Guide For Causal Discovery using Bayesian Modeling by Erdogan Taskesen towardsdatascience.com/an-extensive-s…
1 reply · 3 reposts · 29 likes · 3.8K views
Dr Steven McDermott retweeted
elvis
elvis@omarsar0·
Huge efforts to improve LLMs for tool use, computer use, reasoning, and long-context understanding. Here are a few interesting papers for the weekend:

1). Agentic Information Retrieval
Provides an introduction to agentic information retrieval, which is shaped by the capabilities of LLM agents. Discusses different types of cutting-edge applications of agentic information retrieval and their challenges. arxiv.org/abs/2410.09713

2). A Theoretical Understanding of CoT
Finds that adding correct and incorrect reasoning paths in demonstrations improves the accuracy of intermediate steps and CoT. The proposed method, Coherent CoT, significantly improves performance on several benchmarks. On the Tracking Shuffled Objects dataset, Gemini Pro shows a 6.60% improvement (from 58.20% to 64.80%), and on Penguins in a Table, DeepSeek 67B demonstrates an increase of 6.17% (from 73.97% to 80.14%). arxiv.org/abs/2410.16540

3). LongRAG
Enhances RAG's understanding of long-context knowledge, covering both global information and factual details. Consists of a hybrid retriever, an LLM-augmented information extractor, a CoT-guided filter, and an LLM-augmented generator; these key components enable the RAG system to mine global long-context information and effectively identify factual details. LongRAG outperforms long-context LLMs (up by 6.94%), advanced RAG (up by 6.16%), and vanilla RAG (up by 17.25%). arxiv.org/abs/2410.18050

4). Reasoning Patterns of OpenAI's o1 Model
When compared with other test-time compute methods, o1 achieved the best performance across most datasets. The authors observe that the most commonly used reasoning patterns in o1 are divide-and-conquer and self-refinement, and that o1 uses different reasoning patterns for different tasks: for commonsense reasoning tasks it tends to use context identification and emphasize constraints, while for math and coding tasks it mainly relies on method reuse and divide-and-conquer. arxiv.org/abs/2410.13639

5). A Survey on Data Synthesis and Augmentation for LLMs
Provides a comprehensive summary of data generation techniques across the lifecycle of LLMs, with discussions of data preparation, pre-training, fine-tuning, instruction-tuning, preference alignment, and applications. arxiv.org/abs/2410.12896

6). Beyond Browsing: API-Based Web Agents
Researchers demonstrate that AI agents using both web APIs and browsing capabilities outperform traditional web-only agents by 20% on the WebArena benchmark, achieving a state-of-the-art 35.8% success rate for task-agnostic agents. arxiv.org/abs/2410.16464
5 replies · 113 reposts · 523 likes · 50.9K views
Dr Steven McDermott retweeted
Sumanth
Sumanth@Sumanth_077·
Harvard University is offering Free world class education in Python, Data Science, Machine Learning, Data Preprocessing, Visualization & Statistics:
1 reply · 54 reposts · 237 likes · 18.8K views
Dr Steven McDermott retweeted
Akshay 🚀
Akshay 🚀@akshay_pachaar·
100% Local OpenAI Swarm Agents!🐝 OpenAI Swarm is an educational framework that explores ergonomic, lightweight multi-agent orchestration. It's fairly easy to integrate with locally running LLMs through Ollama. ______ Find me → @akshay_pachaar ✔️ And stay tuned for more on multi-agent tutorials coming soon!
4 replies · 71 reposts · 371 likes · 31.8K views
Dr Steven McDermott retweeted
Shubham Saboo
Shubham Saboo@Saboo_Shubham_·
Run any LLM like Llama 3.2 directly from Hugging Face using Ollama on your local computer (100% free and without internet).
10 replies · 45 reposts · 366 likes · 33.5K views
Dr Steven McDermott retweeted
Rohan Paul
Rohan Paul@rohanpaul_ai·
NetworkX is one of THE most popular Python graph analytics libraries, with ~15K GitHub stars and 80M monthly downloads. The library is for working with networks and graphs: it helps analyze connections between things, like social networks, computer networks, or any system where objects are connected to each other. And NetworkX just got massively accelerated through its backend integration with NVIDIA's cuGraph. ✨ Up to 500x speedups on large graph workloads in NetworkX, with zero code changes.

📌 cuGraph is NVIDIA's GPU-accelerated graph analytics library within the RAPIDS ecosystem. It provides fast graph algorithms on GPUs, supporting property graphs, remote operations, and graph neural networks (GNNs). It works with GPU DataFrames (cuDF) and integrates smoothly with the NetworkX-like API.

📌 The traditional bottleneck of NetworkX's pure-Python implementation becomes apparent when processing graphs larger than 100K nodes and 1M edges.

📌 cuGraph solves this by offloading supported algorithms to the GPU: PageRank, Louvain community detection, betweenness centrality, and about 60 other algorithms get instant acceleration.

📌 This acceleration enables previously impractical use cases. Fraud detection systems can now process massive transaction networks in real time, recommendation engines handle millions of user-item interactions efficiently, and social network analysis scales to an entire platform's worth of data on a single machine. @NVIDIAAIDev
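For intuition about what gets offloaded, here is PageRank as a pure-Python power iteration, exactly the kind of interpreter-bound loop that becomes the bottleneck on large graphs. The three-node adjacency-list graph is made up for illustration:

```python
# Toy directed graph as adjacency lists: node -> list of out-neighbors.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}

def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank over a dict-of-lists graph."""
    nodes = list(graph)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Everyone starts with the teleport share, then receives
        # a damped share of each in-neighbor's current rank.
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for node, neighbors in graph.items():
            share = damping * rank[node] / len(neighbors)
            for neighbor in neighbors:
                new[neighbor] += share
        rank = new
    return rank

ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # "c": it receives links from both "a" and "b"
```

On a GPU backend the same nested loops become a sparse matrix-vector product, which is why the speedups grow with graph size.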
10 replies · 155 reposts · 921 likes · 63K views
Dr Steven McDermott retweeted
Heather Cooper
Heather Cooper@HBCoop_·
🔹STORM🔹 Stanford University’s free, public app automates comprehensive research and report generation using AI to create Wikipedia-style articles with citations from web sources. STORM is incredible. I asked it to write a research paper about the use of AI in DNA analysis and predicting genetic traits - got a comprehensive article with citations. It takes about 3 minutes and you can download the PDF:
34 replies · 249 reposts · 1.7K likes · 176.2K views
Dr Steven McDermott retweeted
Santiago
Santiago@svpino·
Data pipelines will put you in the top 1% of the market. If you could only learn one skill for the next decade, I can't think of anything more critical than learning to move and process data at scale.

I like to tell people I'm a Machine Learning Engineer, but in reality, 90% of the value I produce comes from my ability to move data around consistently and correctly. In the field, we like to use the term "orchestration" when talking about coordinating workflows that move and process data. At a high level, there are three main steps you need to worry about:

1. Getting the data from its source
2. Processing and cleaning that data
3. Delivering the cleaned data to the right place

You might have also heard of "ETL" (Extract, Transform, Load). That's how most people refer to the process above. Of course, building a simple ETL system isn't complex; most developers can do it without too much trouble. The problem is designing resilient, scalable, and fault-tolerant systems. You can't code your way to a production-ready orchestration platform (ask me how I know).

I started with Airflow and eventually moved to @kestra_io because of its event-driven architecture. Event-driven means you can kick off a workflow automatically based on different triggers: for instance, when somebody uploads a new file to a folder, an app updates a database table, or there's a new message in a queue.

It's hard to summarize everything you get from Kestra, but here are some of the highlights:
• Kestra is free and open-source
• You install it from a Docker container
• Workflows as Code using YAML <--- this is awesome
• Scales to millions of executions
• It integrates with every cloud platform you've seen
• Language agnostic (but I still like Python the most)

Here is a link to their GitHub repository: shortclick.link/4ls02n

Here are the three things I recommend:
1 - Take a look at the live demo in their GitHub repo
2 - Build a simple workflow (it will take 5 minutes)
3 - Talk to your boss. Where can you plug this into your company?

I started using Kestra at the height of the pandemic. It's an awesome tool, and I'm proud that they are sponsoring my writing. I hope you find it helpful as well.
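As a rough sketch of the "Workflows as Code using YAML" idea, a minimal Kestra-style flow might look like the following. The `hello_pipeline` id, `demo` namespace, and messages are invented, and the task type identifier should be checked against the Kestra documentation for the version you run:

```yaml
id: hello_pipeline
namespace: demo

tasks:
  - id: extract
    type: io.kestra.plugin.core.log.Log
    message: "Pretend we pulled data from the source here"
  - id: transform
    type: io.kestra.plugin.core.log.Log
    message: "Pretend we cleaned and delivered it here"
```

The point of the format is that the extract/transform/load steps from the list above become declarative, versionable task entries rather than imperative glue code.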
34 replies · 150 reposts · 1.1K likes · 90.4K views
Dr Steven McDermott retweeted
Santiago
Santiago@svpino·
Here is a cookbook on how to test your LLM applications. This cookbook uses Ragas and Comet Opik, an open-source platform for evaluating, testing, and monitoring LLM applications. You can use Opik to:
• Detect hallucinations
• Evaluate RAG applications
• Determine answer relevance
• Measure context recall
• Create and store test cases
• Integrate it with your CI/CD pipeline using Pytest
Cookbook: comet.com/docs/opik/cook…
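The CI integration idea can be sketched as an ordinary pytest-style check. The keyword-overlap `relevance` function below is a hypothetical stand-in for the real Ragas/Opik metrics, which score answers with LLM-based judges rather than word counting:

```python
# Placeholder "answer relevance" metric: fraction of reference words
# that also appear in the answer. A real pipeline would call a
# Ragas/Opik metric here instead.
def relevance(answer, reference):
    a, r = set(answer.lower().split()), set(reference.lower().split())
    return len(a & r) / len(r)

def test_answer_is_relevant():
    """The kind of assertion a CI pipeline would run on each release."""
    answer = "Paris is the capital of France"
    reference = "the capital of France is Paris"
    assert relevance(answer, reference) >= 0.8

test_answer_is_relevant()
```

Wiring checks like this into pytest means a regression in answer quality fails the build the same way a broken unit test would.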
9 replies · 67 reposts · 429 likes · 39.8K views
Dr Steven McDermott retweeted
Luiza Jarovsky, PhD
Luiza Jarovsky, PhD@LuizaJarovsky·
🚨 [AI RESEARCH] If you're interested in AI ETHICS, the paper "Reconstructing AI Ethics Principles: Rawlsian Ethics of AI" by Salla Westerstrand is a MUST-READ. These are the 'Rawlsian ethics guidelines for fair AI' proposed:

1️⃣ "Developers and deployers of an AI system must ensure that the AI system does not threaten the basic liberties of any individual.
➵ AI systems should not endanger but support the freedom of thought and liberty of conscience.
➵ AI systems should not compromise but support political liberties and freedom of association, such as the right to vote and to hold public office.
➵ AI systems should not harm but support the liberty and integrity of the person, including freedom from psychological oppression and physical assault and dismemberment.
➵ All AI systems should be aligned with the principle of rule of law.

2️⃣ The use and development of AI systems should not negatively impact people's opportunities to seek income and wealth. If an AI system is used in the distribution of advantageous positions, such as recruitment, performance evaluation, or access to education, it needs to be ensured that:
➵ The tool is trained with non-biased training data, or appropriate tools are used to mitigate the biases in the final product if no non-biased training data is available (data bias mitigation),
➵ The outcome of the use of the tool includes an explanation of the grounds for the outcome it produces (explainability), and
➵ The algorithms used shall encourage neither biased results nor the systematic repetition and amplification thereof in, e.g., the feedback loops of a machine learning system (algorithmic bias mitigation).
*If these conditions cannot be met, AI should not be used in the process.

3️⃣ All inequalities affected by AI systems, such as acquiring a position of power or accumulation of wealth, must be to the greatest benefit of the least advantaged members of society."

╰┈➤ This is a fascinating paper, especially for those familiar with Rawls' theory of justice. As AI development advances and AI agents become more prevalent, AI ethics is more important than ever, including in the context of supporting AI regulation and policy efforts.

╰┈➤ Download & read the paper below.

🏛️ STAY UP TO DATE: AI governance is moving fast. Join 37,200+ people in 150+ countries who subscribe to my newsletter on AI policy, compliance & regulation, including outstanding research papers (link below).
5 replies · 28 reposts · 117 likes · 7.9K views