Collin Paran

2.5K posts

@CollinParan

Put the first #AI #LLM in #Space | #Veteran | Early #Dogecoin #XLM adopter. This is my personal account; views are my own.

Denver, CO · Joined July 2012
5.4K Following · 5.1K Followers
Pinned Tweet
Collin Paran
Collin Paran@CollinParan·
@pnickdurham Okay let's go fund it, there are companies that are already doing that. They just need to scale.
1
0
3
670
Collin Paran retweeted
Vaishnavi
Vaishnavi@_vmlops·
Microsoft built a tool that converts literally anything into clean Markdown for your LLM: PDFs, Word docs, Excel, PowerPoint, audio, YouTube URLs. One pip install and your AI pipeline stops choking on raw files forever. No custom parsers, no broken layouts, no garbled text. Just clean, structured Markdown your LLM can actually read. github.com/microsoft/mark…
80
517
5K
789.3K
Collin Paran retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki; I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents, and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui) and, more often, hand off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually; it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
2.8K
6.8K
56.8K
20.1M
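The "small and naive search engine over the wiki" Karpathy mentions could be sketched as an inverted index over a directory of .md files; the file layout, tokenization, and ranking here are invented for illustration, not his actual tool:

```python
# Naive wiki search sketch: build an inverted index over .md files,
# then rank matching files by simple query-term frequency.
import os
import re
from collections import Counter, defaultdict

def build_index(wiki_dir):
    """Map each lowercase token to a Counter of {file path: occurrences}."""
    index = defaultdict(Counter)
    for root, _dirs, files in os.walk(wiki_dir):
        for name in files:
            if not name.endswith(".md"):
                continue
            path = os.path.join(root, name)
            with open(path, encoding="utf-8") as f:
                tokens = re.findall(r"[a-z0-9]+", f.read().lower())
            for tok in tokens:
                index[tok][path] += 1
    return index

def search(index, query, limit=5):
    """Score each file by total query-term frequency; best matches first."""
    scores = Counter()
    for tok in re.findall(r"[a-z0-9]+", query.lower()):
        scores.update(index.get(tok, Counter()))
    return [path for path, _score in scores.most_common(limit)]
```

At the ~100-article scale he describes, this kind of term-frequency lookup is often enough for an LLM agent calling it as a CLI tool; no embedding store is required.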
Collin Paran retweeted
OpenClaw🦞
OpenClaw🦞@openclaw·
🦞🛡️ OpenClaw × VirusTotal: every ClawHub skill now auto-scanned for malware 🔍 AI Code Insight catches reverse shells, crypto miners & exfiltration ⚡ ~30s verdicts 🚦 Benign/Suspicious/Malicious tiers 🔄 Daily re-scans This is not a silver bullet, but it is another layer to the shell 🦞openclaw.ai/blog/virustota…
305
429
4.4K
496.6K
Collin Paran retweeted
Boris Cherny
Boris Cherny@bcherny·
We just open sourced the code-simplifier agent we use on the Claude Code team. Try it:
claude plugin install code-simplifier
Or from within a session:
/plugin marketplace update claude-plugins-official
/plugin install code-simplifier
Ask Claude to use the code simplifier agent at the end of a long coding session, or to clean up complex PRs. Let us know what you think!
Boris Cherny tweet media
348
1.1K
12.9K
1.8M
Collin Paran retweeted
Max Blumenthal
Max Blumenthal@MaxBlumenthal·
This crystal clear video of ICE shooting a US citizen in the head is what Trump’s “Golden Age” looks like: A regime of Terror Capitalism imposed on subjects across the Americas to protect economic plunder by a decadent class of Zionist tech plutocrats
960
3.8K
11.3K
303.1K
Collin Paran
Collin Paran@CollinParan·
@AskPerplexity What counts as first “high-powered inference” in Space?
1
0
0
416
Computer
Computer@AskPerplexity·
This November, history changes. An NVIDIA H100 GPU—100 times more powerful than any GPU ever flown in space—launches to orbit. It will run Google's Gemma—the open-source version of Gemini. In space. For the first time. First AI training in orbit. First model fine-tuning in space. First high-powered inference beyond Earth. And the CEO just said: "Within 10 years, almost all new datacenters will be built in space." This is Starcloud-1. Here's why it matters.
217
551
4.1K
1M
Collin Paran retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
Love this project: nanoGPT -> recursive self-improvement benchmark. Good old nanoGPT keeps on giving and surprising :)

- First I wrote it as a small little repo to teach people the basics of training GPTs.
- Then it became a target and baseline for my port to a direct C/CUDA re-implementation in llm.c.
- Then that was modded (by @kellerjordan0 et al.) into a (small-scale) LLM research harness. People iteratively optimized the training so that e.g. reproducing GPT-2 (124M) performance takes not 45 min (original) but now only 3 min!
- Now the idea is to use this process of optimizing the code as a benchmark for LLM coding agents. If humans can speed up LLM training from 45 to 3 minutes, how well do LLM agents do, under different kinds of settings (e.g. with or without hints etc.)? (spoiler: in this paper, as a baseline and right now, not that well, even with strong hints).

The idea of recursive self-improvement has of course been around for a long time. My usual rant on it is that it's not going to be this thing that didn't exist and then suddenly exists. Recursive self-improvement began a long time ago and is underway today in a smooth, incremental way. First, even basic software tools (e.g. coding IDEs) fall into the category because they speed up programmers in building the N+1 version. Any of our existing software infrastructure that speeds up development (google search, git, ...) qualifies. And even if you insist on AI as something special and distinct, most programmers now already routinely use LLM code completion or code diffs in their own programming workflows, collaborating in increasingly larger chunks of functionality and experimentation. This amount of collaboration will continue to grow.

It's worth also pointing out that nanoGPT is a super simple, tiny educational codebase (~750 lines of code) covering only the pretraining stage of building LLMs. Production-grade codebases are *significantly* (100-1000X?) bigger and more complex. But for the current level of AI capability, it is imo an excellent, interesting, tractable benchmark that I look forward to following.
Minqi Jiang@MinqiJiang

Recently, there has been a lot of talk of LLM agents automating ML research itself. If Llama 5 can create Llama 6, then surely the singularity is just around the corner. How can we get a pulse check on whether current LLMs are capable of driving this kind of total self-improvement?

Well, we know humans are pretty good at improving LLMs. In the NanoGPT speedrun challenge, created by @kellerjordan0, human researchers iteratively improved @karpathy's GPT-2 replication, slashing the training time (to the same target validation loss) from 45 minutes to under 3 minutes in just under a year (!). Surely, a necessary (but not sufficient) ability for an LLM that can automatically improve frontier techniques is the ability to *reproduce* known innovations on GPT-2, a tiny language model from over 5 years ago. 🤔

So we took several of the top models and combined them with various search scaffolds to create *LLM speedrunner agents*. We then asked these agents to reproduce each of the NanoGPT speedrun records, starting from the previous record, while providing them access to different forms of hints that revealed the exact changes needed to reach the next record.

The results were surprising: not because we thought these agents would ace the benchmark, but because even the best agent failed to recover even half of the speed-up of human innovators on average in the easiest hint mode, where we show the agent the full pseudocode of the changes to the next record.

We believe The Automated LLM Speedrunning Benchmark provides a simple eval for measuring the lower bound of LLM agents' ability to reproduce scientific findings close to the frontier of ML. Beyond scientific reproducibility, this benchmark can also be run without hints, transforming into an automated *scientific innovation* benchmark. When run in "innovation mode," this benchmark effectively extends the NanoGPT speedrun to AI participants!

While initial results here indicate that current agents seriously struggle to match human innovators beyond just a couple of records, benchmarks have a tendency to fall. This one is particularly exciting to watch, as a new state of the art here by definition implies a form of *superhuman innovation*.

93
622
4.3K
466.8K
Collin Paran retweeted
Jamie
Jamie@JamieTerbeest·
I just created a Cursor rule called ask-a-friend.mdc that manually calls up Claude Code when Cursor and all its LLMs are stuck. I think it's going to be a great tool to call when the going gets tough.
2
2
5
442
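A Cursor project rule like the ask-a-friend.mdc Jamie describes might look roughly like this; the frontmatter fields and the instruction text are assumptions for illustration, not Jamie's actual file:

```markdown
---
description: Ask Claude Code for a second opinion when the current model is stuck
alwaysApply: false
---

If you have attempted the same fix twice without success, stop. Run
Claude Code from the terminal (`claude`), paste in the failing context
and error output, and incorporate its suggestion before trying again.
```

Cursor loads .mdc rule files from the project's rules directory, so a rule like this acts as a standing escalation policy rather than something the user has to remember to type.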
Collin Paran retweeted
el.cine
el.cine@EHuanglu·
this new AI agent is incredible.. it can analyze real-time market data 24/7 and suggest when to buy/sell at the right moment, it's crazy. here's how it works (it's still free for now):
99
339
3.4K
571.7K
Collin Paran retweeted
Elon Musk
Elon Musk@elonmusk·
How much time is spent doing pointless “online training”? Sounds pretty bad. Even I have to do some of this stuff.
MandatoryFunDay@MonotoneMustang

Please @elonmusk do not eliminate the very important online training we do in the military every year.

6.8K
5.2K
77.4K
12.3M
Collin Paran
Collin Paran@CollinParan·
@elonmusk FYSA USASpending uses SQL. If @DOGE needs a SQL guy, I know a few.
0
0
0
30
Collin Paran retweeted
MandatoryFunDay
MandatoryFunDay@MonotoneMustang·
Please @elonmusk do not eliminate the very important online training we do in the military every year.
1.8K
2.5K
37.2K
13.8M
Collin Paran
Collin Paran@CollinParan·
@DOGE You should consider using SQL like USASpending
Collin Paran tweet media
0
0
0
28
Department of Government Efficiency
DOGE website is live! doge.gov

Initial site:
1. X feed posts
2. Consolidated government org chart - enormous manual effort consolidating 16,000+ offices
3. Summary of the massive regulatory state, including the Unconstitutionality Index (ratio of rules written by unelected bureaucrats to laws passed by Congress)

Coming soon (targeting Valentines Day):
1. Description/amount of each cost reduction (w/ receipts where applicable)
2. Overall savings scorecard

We will constantly be working to maximize the site's utility and transparency. Please let us know what else you want to see!
4.4K
13.4K
67.6K
20.5M
Collin Paran
Collin Paran@CollinParan·
@elonmusk Since you are all about transparency, can @DOGE show us the SQL query used to find these "150 year olds"? They can mask the actual data; an experienced person can figure it out. Chances are it is an Oracle DB or MSSQL server anyway.
0
0
0
29
Elon Musk
Elon Musk@elonmusk·
Just learned that the social security database is not de-duplicated, meaning you can have the same SSN many times over, which further enables MASSIVE FRAUD!! Your tax dollars are being stolen.
Elon Musk@elonmusk

To be clear, what the @DOGE team and @USTreasury have jointly agreed makes sense is the following:

- Require that all outgoing government payments have a payment categorization code, which is necessary in order to pass financial audits. This is frequently left blank, making audits almost impossible.
- All payments must also include a rationale for the payment in the comment field, which is currently left blank. Importantly, we are not yet applying ANY judgment to this rationale, but simply requiring that SOME attempt be made to explain the payment more than NOTHING!
- The DO-NOT-PAY list of entities known to be fraudulent or people who are dead or are probable fronts for terrorist organizations or do not match Congressional appropriations must actually be implemented and not ignored. Also, it can currently take up to a year to get on this list, which is far too long. This list should be updated at least weekly, if not daily.

The above super obvious and necessary changes are being implemented by existing, long-time career government employees, not anyone from @DOGE. It is ridiculous that these changes didn't exist already!

Yesterday, I was told that there are currently over $100B/year of entitlements payments to individuals with no SSN or even a temporary ID number. If accurate, this is extremely suspicious. When I asked if anyone at Treasury had a rough guess for what percentage of that number is unequivocal and obvious fraud, the consensus in the room was about half, so $50B/year or $1B/week!! This is utterly insane and must be addressed immediately.

10.7K
43K
193.3K
61.3M
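Mechanically, the de-duplication check described in the tweet above is a simple GROUP BY/HAVING query. A toy illustration against an in-memory SQLite database; the table name, columns, and rows are invented, not the actual Social Security schema:

```python
# Toy duplicate-SSN check: find SSN values that appear more than once.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE beneficiaries (ssn TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO beneficiaries VALUES (?, ?)",
    [("123-45-6789", "A. Smith"),
     ("123-45-6789", "A. Smith"),   # duplicate SSN
     ("987-65-4321", "B. Jones")],
)

# SSNs stored more than once, with their occurrence counts:
dupes = conn.execute(
    "SELECT ssn, COUNT(*) FROM beneficiaries"
    " GROUP BY ssn HAVING COUNT(*) > 1"
).fetchall()
print(dupes)  # -> [('123-45-6789', 2)]
```

In practice a production database would enforce uniqueness with a constraint (e.g. a unique index on the SSN column) so duplicates are rejected at insert time rather than found after the fact.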
Collin Paran retweeted
unusual_whales
unusual_whales@unusual_whales·
The U.S. Department of Justice (DOJ) has argued that confiscating $50,000 from a small business did not infringe the business' right to private property because money is not property, per reason. "Money is not necessarily 'property' for constitutional purposes," the government's brief declared
unusual_whales tweet media
531
1.7K
8.3K
693K