hisham bedri

275 posts

@ultimate_afro

Technologist working on reality-capture and creative tools. Tiktok: @ultrafro2

Joined November 2015
143 Following · 194 Followers
Andrej Karpathy
Andrej Karpathy@karpathy·
LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searchers), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
2.9K
7.2K
59.3K
21.2M
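A rough illustration of the "small and naive search engine over the wiki" idea mentioned in the post above. This is only a sketch under stated assumptions, not the tool the post describes: the function names (`tokenize`, `build_index`, `search`) and the plain term-count scoring are hypothetical choices made here, and the only detail taken from the post is that the wiki is a directory tree of .md files.

```python
import re
from collections import Counter
from pathlib import Path


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(wiki_dir: Path) -> dict[str, Counter]:
    """Map each .md file (path relative to wiki_dir) to its term counts.

    Assumption: the wiki is a directory tree of UTF-8 markdown files.
    Files with other extensions are ignored.
    """
    index: dict[str, Counter] = {}
    for path in sorted(wiki_dir.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        index[path.relative_to(wiki_dir).as_posix()] = Counter(tokenize(text))
    return index


def search(index: dict[str, Counter], query: str, top_k: int = 5) -> list[tuple[str, int]]:
    """Rank files by the total number of query-term occurrences.

    Hypothetical scoring: a raw term-frequency sum, with no IDF weighting
    or stemming. Ties are broken alphabetically by file name.
    """
    terms = tokenize(query)
    scored = []
    for name, counts in index.items():
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((name, score))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]
```

Something like `search(build_index(Path("wiki")), "gaussian splatting")` would return the best-matching file names with their scores, which could then be wrapped in a small CLI and handed to an LLM agent as a tool. The `wiki` directory name is a placeholder.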
hisham bedri
hisham bedri@ultimate_afro·
@trq212 Claude code is so addicting, it feels like you're still making games :)
2
0
4
220
hisham bedri retweeted
Aura
Aura@Aura_Assistant·
Aura is an AI agent that can make logic graphs, write and compile flawless Unreal C++, make assets, and perform any action you can in Engine. Runs on frontier models like Claude Opus 4.5. Here's 10 min showing how it works: Dropping 25 invites with $40 of usage in the reply. Let us know what you think.
11
8
48
23.3K
hisham bedri retweeted
Thariq
Thariq@trq212·
if the government can own 10% of Intel, I feel like a few grocery stores are fine
14
6
161
16K
hisham bedri
hisham bedri@ultimate_afro·
@cohere i think i'm seeing a long time on embedding generation as of an hour ago
0
0
0
24
hisham bedri
hisham bedri@ultimate_afro·
please donate to his family here:
hisham bedri tweet media
0
0
0
57
hisham bedri
hisham bedri@ultimate_afro·
@trq212 congratulations to you and a big congratulations to Anthropic!!
0
0
1
65
hisham bedri retweeted
Thariq
Thariq@trq212·
I joined Anthropic last week! It’s obvious to me that we’re bottlenecked not by model capabilities, but by creativity and understanding. I’ll be building demos and prototypes that highlight new capabilities and share what we learn about building using these models at Anthropic. Excited to build with the best team and the best models.
84
56
1.5K
165.2K
hisham bedri
hisham bedri@ultimate_afro·
@BenTelAviv Quite a claim about Zohran. What's your evidence that he hates Jews?
0
0
0
21
Ben Badejo
Ben Badejo@BenjaminBadejo·
New York City is the city with the greatest number of Jews outside of Israel. There is no other place outside of Israel with as many Jews, whether in America or elsewhere. The Jew-hating and intifada-supporting piece of trash called Zohran Mamdani will never be New York’s mayor.
38
46
394
7.4K
Metaplane by Datadog
Metaplane by Datadog@metaplane·
Metaplane is joining @datadoghq! 🚀 After 5 years helping data teams build trust, we're bringing data observability to 30,000+ Datadog customers and beyond. Together, we're excited to connect software and data teams with complete visibility across the entire data lifecycle.
Metaplane by Datadog tweet media
7
5
20
5K
Thariq
Thariq@trq212·
I made a quick, free tool to get AI copyediting with just a copy & paste! I find myself using it basically every day, including for this tweet lol.
4
0
14
747
hisham bedri
hisham bedri@ultimate_afro·
@paulg those of you calling Paul a hypocrite instead of addressing the logic in his point are not pushing the conversation forward. There are better and more fulfilling causes to put software engineering effort into than enshrining the ruling class.
0
0
1
87
hisham bedri retweeted
Thariq
Thariq@trq212·
✨ New AI Interfaces powered by Interpretability

I'm excited to share LatentLit, the result of my applied AI research fellowship with @GoodfireAI

Mechanistic interpretability isn’t just important for AI safety, it also gives us new ways to steer and interact with LLMs.
41
56
575
67.9K
hisham bedri retweeted
MrNeRF
MrNeRF@janusch_patas·
Wonderland: Navigating 3D Scenes from a Single Image

Contributions:
• First, we introduce a representation for controllable 3D generation by leveraging the generative priors from camera-guided video diffusion models. Unlike image models, video diffusion models are trained on extensive video datasets. This enables them to capture comprehensive spatial relationships within scenes across multiple views and embed a form of "3D awareness" in their latent space, which allows us to maintain 3D consistency in novel view synthesis.
• Second, to achieve controllable novel view generation, we empower video models with precise control over specified camera motions. We introduce a novel dual-branch conditioning mechanism that effectively incorporates desired diverse camera trajectories into the video diffusion model. This enables expansion of a single image into a multi-view consistent capture of a 3D scene with precise pose control.
• Third, to achieve efficient 3D reconstruction, we directly transform video latents into 3DGS. We propose a novel latent-based large reconstruction model (LaLRM) that lifts video latents to 3D in a feed-forward manner. With this design, during inference, our model directly predicts 3DGS from a single input image, effectively aligning the generation and reconstruction tasks, and bridging image space and 3D space, through the video latent space. Compared with reconstructing scenes from images, the video latent space offers a 256× spatial-temporal reduction while retaining essential and consistent 3D structural details. Such a high degree of compression is crucial, as it allows the LaLRM to handle a wider range of 3D scenes within the reconstruction framework, with the same memory constraints.
15
95
607
52.8K
hisham bedri
hisham bedri@ultimate_afro·
now with some head-motion
0
0
5
74