Baselight

927 posts

@BaselightDB

Everyone should be a data analyst. Turn questions into insights instantly with AI and billions of rows of data on crypto, sports, and everything in between📊

Learn more here 👉
Joined July 2024
22 Following · 1.8K Followers

Pinned post
Baselight @BaselightDB
Finding it hard to analyze data with SQL? That changes today with Baselight AI, your data analyst copilot! Baselight AI lets you extract insights from Baselight datasets (and your own uploads!), audit reasoning, and verify the data used to reach conclusions. We're launching today on @ProductHunt!
Baselight @BaselightDB
If you’re working on politics, economics, or investigative journalism - this is a goldmine. Trace donations, uncover patterns, and connect funding flows across decades. Ask your own questions with Baselight AI: baselight.app
Baselight @BaselightDB
+23 billion rows this week - led by the addition of Federal Election Commission (FEC) data. Follow the money. All of it.

Baselight Weekly Update (last 7 days)
• 451,979,054,166 rows (+23B)
• 454,627 tables (+1K)
• 69,696 datasets (+283)

FEC datasets now live - and the scale is massive
• $15.4B in individual contributions (2024)
• Up from $14.6B (2020) and ~3× vs 2018 ($5.5B)

From individual donors to PACs, electioneering to lobbying: 20 datasets • 352M+ rows • 1980 -> today

Who funds American elections? Explore it on Baselight.
Baselight @BaselightDB
The US addiction crisis isn’t one story. It’s many - and the data makes that impossible to ignore. We’ve been exploring newly added SAMHSA TEDS-A 2023 datasets on Baselight... and the geographic patterns are striking.

-> The “Meth Belt” is a myth
Meth isn’t just a rural Midwest issue.
- Hawaii: 60% of all treatment admissions
- Utah: 56%
- California: 52%
The coasts have a meth crisis too - it’s just less talked about.

-> Heroin has a clear geographic fingerprint
Top states by heroin share:
- Maine: 39%
- New Jersey: 37%
- Massachusetts: 32%
Every top state is in the Northeast.

-> Synthetic opioids = the next wave
- Rhode Island: 37.5%
- New Mexico: 36.7%
- Tennessee: 30.6%
These states are early signals of the fentanyl-driven shift.

-> Unexpected leaders: Marijuana
In several states (NC, IN, OH, AL, LA), marijuana is the #1 treatment substance. Even as legalization spreads, treatment demand isn’t disappearing.

-> Wildest stat of all
New York has 74,459 cocaine treatment admissions: 40% of all admissions in the state. That’s more than double the next state. NYC is in a league of its own.

Why does this matter? Addiction in the US isn’t a single epidemic - it’s region-specific systems of risk. Policies, funding, and interventions need to reflect that reality.

At Baselight, we’re making datasets like this queryable, explorable, and AI-ready - so anyone can uncover insights like these in seconds.
Baselight @BaselightDB
21 billion new rows added to Baselight this week. Public data is growing fast - and we're working to make it easier to explore. Over the past 7 days, the Baselight catalog expanded significantly:

Platform Scale
• 428,770,708,355 rows (+21B this week)
• 453,493 tables (+14K)
• 69,413 datasets (+515)

Highlights from this week’s additions

🏠 US Department of Housing and Urban Development (HUD)
Housing Affordability (CHAS) and Income Limits (IL) datasets - key indicators for analyzing housing stress and regional affordability.

🧠 SAMHSA (Substance Abuse and Mental Health Services Administration)
Large public health datasets covering mental health services, treatment programs, and behavioral health infrastructure.

📊 US Census expansion
Census data now available at town/place-level resolution, enabling much more granular geographic analysis.

The Baselight catalog continues to grow toward a simple goal: making the world’s structured data accessible and queryable in one place. You can explore the datasets or ask questions directly with Baselight AI. Links in the comments.

Curious to hear from the community: If you had access to all this data in one place, what question would you ask first?
Baselight @BaselightDB
New on Baselight: 21 years of official HUD Income Limits data is now live - FY2005 through FY2025.

The national average Area Median Family Income has nearly doubled in 20 years, and it determines who qualifies for Section 8, public housing, and more:
- 2005: $49,887
- 2025: $94,985 (+90%)

The gap is staggering:
- The highest Area Median Family Income in the U.S. (2025): $195,200
- 8x higher than the lowest: $24,100

Which states saw the biggest jump in HUD Area Median Family Income from 2024 to 2025?
1) Colorado: +9.0%
2) Hawaii: +8.8%
3) Idaho: +8.2%
4) Puerto Rico: +8.2%
5) South Dakota: +8.0%

4,764 areas. 56 states & territories. All queryable in seconds.
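For readers who want to check the headline figures in the HUD post, here is a minimal sketch that recomputes the 20-year growth and the highest-to-lowest ratio from the four dollar amounts quoted above. The variable and function names are illustrative only and are not part of any Baselight API.

```python
# Dollar figures as quoted in the HUD Income Limits post (USD).
amfi_2005 = 49_887       # national average Area Median Family Income, FY2005
amfi_2025 = 94_985       # national average Area Median Family Income, FY2025
highest_2025 = 195_200   # highest area value, FY2025
lowest_2025 = 24_100     # lowest area value, FY2025


def pct_change(old: float, new: float) -> float:
    """Return the percentage change from old to new."""
    return (new - old) / old * 100


growth = pct_change(amfi_2005, amfi_2025)
ratio = highest_2025 / lowest_2025

print(f"2005 -> 2025 growth: {growth:.1f}%")
print(f"Highest vs lowest area: {ratio:.1f}x")
```

Both results line up with the rounded "+90%" and "8x" in the post.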
Baselight @BaselightDB
Another week of growth across the Baselight catalog - +3 billion new rows and hundreds of new datasets covering sports analytics, disaster response, and public health.

Platform Scale
- Rows: 407,793,294,092 (+3B this week)
- Tables: 439,321 (+898)
- Datasets: 68,898 (+382)

Highlights

Ultimate Soccer Dataset expansion
Now 278M+ rows, covering 273K+ matches across 90+ competitions. Historical results now go back to the 1990s, enabling long-term analysis of league and team performance.

FEMA Disaster Data
Datasets from the Federal Emergency Management Agency (FEMA) are now available on Baselight, providing a comprehensive view of U.S. disaster declarations, assistance programs, and response activity.

CDC / ATSDR Social Vulnerability Index (SVI)
This dataset ranks U.S. census tracts using 16 social indicators grouped into four themes: Socioeconomic Status, Household Characteristics, Racial & Ethnic Minority Status, and Housing Type & Transportation.

Explore the data and start asking questions - links in the comments.
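The three weekly updates above quote exact platform totals, so the rounded "+23B" and "+21B" deltas can be cross-checked by subtraction. The sketch below assumes the three posts are consecutive weekly snapshots, which the feed suggests but does not state. The labels and the helper name are hypothetical.

```python
# (label, rows, tables, datasets) as quoted in the three weekly updates,
# ordered oldest to newest. Labels are illustrative, not official names.
snapshots = [
    ("soccer/FEMA week", 407_793_294_092, 439_321, 68_898),
    ("HUD/SAMHSA week", 428_770_708_355, 453_493, 69_413),
    ("FEC week", 451_979_054_166, 454_627, 69_696),
]


def weekly_deltas(snaps):
    """Return (label, d_rows, d_tables, d_datasets) for each consecutive pair."""
    return [
        (later[0], later[1] - earlier[1], later[2] - earlier[2], later[3] - earlier[3])
        for earlier, later in zip(snaps, snaps[1:])
    ]


deltas = weekly_deltas(snapshots)
for label, d_rows, d_tables, d_datasets in deltas:
    print(f"{label}: +{d_rows:,} rows, +{d_tables:,} tables, +{d_datasets:,} datasets")
```

Under that assumption, the computed differences match the rounded deltas given in the two newer posts.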
Baselight @BaselightDB
For those curious about what you can do with the data, here are a few ready-to-use insights, queries, and dashboards built on top of the Ultimate Soccer Dataset:

Queries
• Cristiano Ronaldo scoring rate by age: baselight.app/u/pjsousa/quer…
• Messi vs Ronaldo scoring rate by age: baselight.app/u/pjsousa/quer…
• Top scorers across major soccer leagues: baselight.app/u/pjsousa/quer…
• Referees that show the most cards: baselight.app/u/pjsousa/quer…

Dashboards
• World Soccer Scoreboard: baselight.app/u/pjsousa/dash…
• Premier League 2025/2026: baselight.app/u/pjsousa/dash…
• Premier League Insights (2015–2025): baselight.app/u/pjsousa/dash…
Baselight @BaselightDB
Building the World’s Most Complete Soccer Dataset - Major Expansion.

We've just significantly expanded the Ultimate Soccer Dataset with:
- Historical match results extended back to the 1990s for major leagues
- New league coverage across multiple countries, including: Austria, China, Denmark, Greece, Ireland, Japan, Mexico, Norway, Romania, Scotland, Turkey

The Ultimate Soccer Dataset is a large, structured collection of global football data compiled by the Baselight team and partners. It includes standardized information on:
- Competitions
- Seasons
- Matches
- Teams
- Players
- Goals & assists
- Lineups
- Transfers
- Betting odds
- and more ...

The dataset spans national leagues, international tournaments, and club competitions worldwide, with all data normalized across competitions and time to make querying and analysis easy.

It’s designed for:
- statistical analysis
- machine learning models
- scouting insights
- historical research

We welcome contributions, corrections, and suggestions from the community - help us make this the most comprehensive football dataset available.
Baselight @BaselightDB
Dive into billions of rows of research, funding, and public data: baselight.app
Baselight @BaselightDB
Grounded AI starts with grounded data. This week, Baselight added 7 new high-impact public data sources - expanding coverage across research, innovation funding, public spending, and global standards.

New sources now live:
• OpenAIRE Research Graph
• US Grants
• Small Business Innovation Research (SBIR)
• National Institutes of Health (NIH)
• National Science Foundation (NSF)
• USAspending
• ISO reference datasets

Platform scale continues to accelerate:
• 404.6B+ rows (▲ +4B this week)
• 438K+ tables
• 68.5K+ datasets

From biomedical research funding to startup grants and federal spending flows - it’s all structured, queryable, and fully traceable.
Baselight @BaselightDB
Explore the analysis yourself. We made the full NIH funding datasets queryable in Baselight, where you can:
• explore funded projects and institutions
• analyze funding distribution and trends
• verify results directly from the source data
• run your own questions using AI over structured datasets

Everything is grounded in real, traceable data with full provenance: baselight.app
Baselight @BaselightDB
Where does $36 billion of biomedical research funding actually go?

Every year, the U.S. National Institutes of Health (NIH) funds tens of thousands of research projects - quietly shaping the future of medicine, healthcare, and biotechnology. We analyzed NIH funding data for FY2025 using Baselight to understand where the largest investments are happening and how funding is distributed.

Top funded research areas across ~60,000 projects:
• Cancer: $4.9B (7,884 projects)
• Infectious diseases: $4.7B (6,563 projects)
• Aging: $3.8B (4,484 projects)
• Heart, lung & blood: $3.4B (5,956 projects)

These four areas alone account for more than $16B of NIH funding.

We made NIH funding data fully queryable in Baselight - with AI-powered analysis grounded in real, traceable data you can inspect, verify, and explore yourself.
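The "more than $16B" claim can be reproduced from the four bullet figures alone. This is a rough sketch that uses only the rounded numbers quoted in the post, treating $36B and ~60,000 projects as the totals. It does not query the underlying NIH datasets, and the dictionary layout is just one convenient way to hold the figures.

```python
# (funding in millions of USD, project count) as quoted in the post.
# Integer millions avoid floating-point noise when summing.
areas = {
    "Cancer": (4_900, 7_884),
    "Infectious diseases": (4_700, 6_563),
    "Aging": (3_800, 4_484),
    "Heart, lung & blood": (3_400, 5_956),
}

budget_musd = 36_000      # headline "$36 billion", rounded
approx_projects = 60_000  # headline "~60,000 projects", rounded

total_musd = sum(funding for funding, _ in areas.values())
total_projects = sum(projects for _, projects in areas.values())

share_of_budget = total_musd / budget_musd * 100
share_of_projects = total_projects / approx_projects * 100

print(f"Top four areas: ${total_musd / 1000:.1f}B across {total_projects:,} projects")
print(f"Share of budget: {share_of_budget:.1f}%, share of projects: {share_of_projects:.1f}%")
```

Because the inputs are rounded, the shares are approximate. The categories may also overlap in the source data, which this sketch cannot detect.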
Baselight retweeted
adlrocha @adlrocha
Building one of these agents with @BaselightDB would literally take minutes (actually, connecting Baselight's MCP to any of the existing coding agents could get you this out of the box). Context is all you need! We are currently missing the datasets required to implement this exact one, but I am really excited about this work and similar use cases that are surfacing for niche agents. Soon we will incentivise devs to distribute their datasets for Baselight to build the knowledge base for agents so any niche agent can be built out of the box :) Huge fan of this one @tarikjmoody 🙌 Thanks!

Tarik Moody @tarikjmoody
I'm building the Claude Code for real music lovers, DJs, music journalists, and radio programmers (inspired by @virattt's Dexter Finance CLI). Crate, built on the Claude Agent SDK, is a personal research assistant AND music workspace for DJs, collectors, music journalists, and serious listeners.

Here's the problem it solves: If you want to know who played bass on a specific album, what year a rare pressing came out, how much it's worth, or how two artists are connected, you'd normally have to bounce between five or six different websites, cross-referencing everything manually. It's tedious. And most people just give up.

Crate does all of that digging for you in seconds. You ask a question in plain language - "Who produced the original pressing of this record?" or "Show me everything this session musician played on in 1974" - and Crate searches across multiple music databases simultaneously (Discogs, MusicBrainz, Genius, Bandcamp, and more), then brings back a single, connected answer.

Think of it as having a record store clerk with an encyclopedic memory, a music librarian, and a pricing guide - all rolled into one AI-powered tool you can talk to from your terminal.

Baselight retweeted
Paulo Sousa @pjsousa
What’s the most cited scientific publication ever? We just made it possible to answer questions like this instantly. OpenAIRE Graph is now available in @BaselightDB - one of the world’s largest open scholarly knowledge graphs, providing a 360° view of global research.
Baselight @BaselightDB
You can explore and query the data directly.

Try the Baselight AI chat: baselight.app
Access all EU Funding & Tenders datasets directly: baselight.app/u/eufundingten…

Includes:
• Open grants and tenders
• Historical grants and tenders
• Grant updates
• Funding & Tenders FAQs

Curious to hear your feedback - and which datasets we should onboard next.
Baselight @BaselightDB
We’ve just crossed another major milestone in our mission to organize the world’s structured data.

This week’s scale:
• 400.9 billion rows (+5B)
• 437,747 tables (+1K)
• 68,445 datasets (+272)

We also added a new high-impact public data source: the EU Funding & Tenders Portal - making funding and research data more accessible and queryable.

Several major sources are already prepared and ready to go next: OpenAIRE Research Graph, US Gov Grants, SBIR / STTR programs.

Step by step, we’re building a global data infrastructure where anyone can discover, query, and trust structured data.