मनीष तिवारी

1.7K posts


@compmanish

Cowards suffer sorrow forever; the earth is enjoyed by the brave. | CS Engineer | Indic Wing

Joined June 2024
364 Following · 45 Followers
Pinned Tweet
मनीष तिवारी
The soil of Haldighati has drunk as much blood as there is water held back in the Bhakra dam! ~Duryodhan Rana
Hindi · 1 reply · 0 reposts · 1 like · 296 views
Harsh Kaptan
Harsh Kaptan@harshkaptan06·
Not part of Super 30 3.0, but attended the orientation today. Worth it.
• Build multiple good projects
• Stay interview-ready
• 2026 jobs need 2015 grind
“Don’t tell me what you can do, show me what you have done.” @kirat_tw
Harsh Kaptan tweet media
English · 3 replies · 1 repost · 33 likes · 2.3K views
मनीष तिवारी
@Bawla_Scientist Yes, agreed, but it was their first iteration of a foundational model, and they were mostly on the product-building side; there was a TL base which was their main target. Waiting for their trillion-parameter models. That said, I use their APIs for other things, like docs and STT, and loved them.
English · 0 replies · 0 reposts · 1 like · 14 views
Harsh ris
Harsh ris@Bawla_Scientist·
Personally, I didn't like the results from any of the Sarvam models. What Sarvam truly wins at is projecting itself as a competing leader in the domain of global AI labs. The move might feel quite finicky if HCL were to acquire a stake in Sarvam, but I regard this as a move in a good direction: others would follow, and this might give rise to an entirely new ecosystem in India, where we see Indian IT giants betting big on startups and creating more new jobs than ever before.
Keshav Lohia@Keshav_Lohiaaa

Never thought I’d see this day… an Indian IT major leading the funding round of India’s top foundational-model startup. Imagine the upheaval if Accenture led Anthropic’s major funding round. I don’t know what this signals about how Indian VCs think about Sarvam

English · 2 replies · 0 reposts · 3 likes · 259 views
Shubham Tuteja • Jade
Shubham Tuteja • Jade@ShubhamTotu·
today I’ve made the difficult decision to increase the size of @c_engines by ~14%. actively hiring for these roles (in-office, blr):
> video editors and designers, mid-senior.
> mlx/os developers and researchers.
> technical writers and smm leads.
> interns + agents.
general [conscious] applications - dm. the official job board is available on our website.
English · 15 replies · 4 reposts · 98 likes · 6.3K views
मनीष तिवारी
@BrahminReich @SwarajyaMag In 8 months, without great capital or a large number of GPUs, building the best in the world is quite hard. They're not, as of now, saying they're going for best in the world; they're going for at least putting India onto that AI map. Yes, India has good enough
English · 0 replies · 0 reposts · 1 like · 9 views
Swarajya
Swarajya@SwarajyaMag·
The Indian conversation on AI has settled into a fatalistic mood. We are told the country fell behind, that the frontier is now a closed shop between Silicon Valley and Hangzhou. Sarvam AI's Vivek Raghavan, who built Aadhaar's biometric stack, disagrees. 🧵
English · 6 replies · 44 reposts · 311 likes · 19.6K views
Ankit Jxa
Ankit Jxa@kingofknowwhere·
Doing another interview today and this guy remembers the api endpoint URL too. He must be so smart.
Ankit Jxa tweet media
English · 8 replies · 0 reposts · 130 likes · 15K views
मनीष तिवारी
😋😋😍😍😍😱😱
Pixxel@PixxelSpace

Today, we’re taking a step toward truly galactic-scale capabilities. 🚀 We’re partnering with @SarvamAI to bring sovereign AI into orbit aboard India’s first orbital data centre satellite, a pathfinder mission bringing datacenter-class GPUs and high-performance remote sensing together in space. Built and operated by Pixxel, with Sarvam providing the AI backbone, the demonstrator marks a step toward making orbital data centres real, operational, and scalable from India. May the 4th be with us all! ✨

0 replies · 0 reposts · 0 likes · 65 views
_Tej_
_Tej_@Tej_Intel·
DRDO is offering roughly $290 per month for internships. But the expectation is - Innovate. Contribute. Defend. No doubt the nation has to borrow everything from foreigners.
_Tej_ tweet media
English · 15 replies · 39 reposts · 226 likes · 5.7K views
मनीष तिवारी
@Tej_Intel Internship..... no one's going to do something groundbreaking and build a rocket that goes to Jupiter and comes back. It's an internship, where students are supposed to work on their ideas in a lab under supervision.
English · 0 replies · 0 reposts · 1 like · 236 views
Siddharth Sharma
Siddharth Sharma@Siddharth_shar·
@pramochanyaan What is even there to love about ISRO? Bad production rate, worst camera quality, improper code of conduct; they can't even speak English properly; they look like they have just woken up from bed. Talks about a 2040 Moon mission while they haven't even sent an uncrewed capsule to space.
English · 3 replies · 1 repost · 1 like · 453 views
Harsh ris
Harsh ris@Bawla_Scientist·
Cooked another SOTA model just using internet data, beating benchmarks and speed. Been working on it for the past 2-3 weeks and now in crisis mode of "what if". Never gets old; still got it in me. Not thinking of releasing it for now.
English · 2 replies · 0 reposts · 3 likes · 206 views
प्रमोचन यान 🇮🇳
Most people commenting about ISRO give zero ducks about manufacturing, funding, scale, technology, market, capabilities, indigenous tech, context, timing, etc. They just believe: Elon Musk best, ISRO bad, solid rocket bad. Open- and closed-cycle engines have nothing in common.
English · 11 replies · 5 reposts · 66 likes · 1.4K views
मनीष तिवारी
@kartikktwt These are very few, but open source doesn't hate Indians; most of the programs are largely filled by Indians, and they do great work. Picking out only the bad side is a personal choice, and it shows the thought process.
English · 1 reply · 0 reposts · 1 like · 36 views
Karthik
Karthik@kartikktwt·
This is why open source hates Indian contributors: raising a ton of trivial issues like this in a prestigious project like moby??? The maintainers of moby are full-time engineers; they won't entertain this. FYI, he is a student from India.
Karthik tweet media
English · 8 replies · 1 repost · 21 likes · 2.4K views
मनीष तिवारी reposted
GalaxEye
GalaxEye@GalaxEye·
Separation Confirmed! The world's first OptoSAR Satellite is now in space. Made in India for the world. Go Drishti! Go @GalaxEye! Go India!
GalaxEye tweet media
English · 239 replies · 1.5K reposts · 6.2K likes · 407.7K views
arnav sonavane
arnav sonavane@w2sgarnav·
aiming for 5 research topics for the upcoming few months; if y'all want to join in, please do so. GPU shortage won't be there (hopefully). (Worked on these problem statements a bit previously, and have run a few experiments on each.) Find them below:

ps 1 : Process Reward Models Beyond Outcome Supervision
Without the need for human-labeled trajectories, we provide a completely automated approach for training Process Reward Models (PRMs) that either meet or surpass the quality of gold step-level annotations. We create dense Monte-Carlo Tree Search (MCTS) rollouts with depth d ≥ 32 and branching factor b = 8, starting from a base policy π_θ trained via SFT on chain-of-thought data. Each intermediate step is scored using an ensemble of outcome verifiers (ORMs) bootstrapped from self-consistency and LLM-as-judge signals under temperature T = 0.7. A process-DPO variation with step-wise Bradley-Terry losses weighted by MCTS visit counts and calibrated via Platt scaling on a short held-out verification set is introduced to reduce verifier noise. By simultaneously optimising the PRM and policy under a single RLVR goal that alternates between process-level preference optimisation and outcome-level PPO updates, with adaptive mixing ratio λ_t planned via cosine annealing, our method closes the annotation gap. Our auto-annotated PRM delivers +14.7% pass@1 over outcome-only RM baselines at 7B scale and transfers to code and scientific reasoning domains with 3% deterioration following LoRA adaptation on 2k domain-specific trajectories, according to extensive ablation on GSM8K, MATH, and HumanEval. We present the multi-domain PRM benchmark, the distilled verifier weights, and the whole MCTS annotation program, offering the first production-ready recipe for frontier-scale process supervision.
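The heart of ps 1, scoring intermediate steps by Monte-Carlo completion rather than human labels, can be sketched in a few lines. This is a minimal sketch, assuming a caller-supplied `rollout` oracle; the function names and the toy rollout are illustrative, and the thread's actual pipeline (depth-32 MCTS, verifier ensembles, process-DPO) is far richer:

```python
import random

def mc_step_values(steps, rollout, n_rollouts=200, seed=0):
    """Estimate a value for each reasoning-step prefix as the fraction
    of sampled continuations (rollouts) that reach a correct final
    answer; these soft step-level labels replace human PRM annotations."""
    rng = random.Random(seed)
    values = []
    for k in range(1, len(steps) + 1):
        prefix = steps[:k]
        wins = sum(rollout(prefix, rng) for _ in range(n_rollouts))
        values.append(wins / n_rollouts)
    return values

# Toy stand-in for "continue the chain of thought and verify the answer":
# any prefix containing the flawed step is rarely rescued downstream.
def toy_rollout(prefix, rng):
    return rng.random() < (0.1 if "flawed step" in prefix else 0.9)

vals = mc_step_values(["sound step", "flawed step", "sound step"], toy_rollout)
```

The estimated value drops sharply at the flawed step; that drop is exactly the dense supervision signal a PRM is trained to reproduce.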
ps 2 : Computer-Use Agents and GUI Grounding In addition to introducing a large-scale synthetic data engine that uses Playwright + Android Emulator instrumentation to generate 500k grounded interaction traces across web, mobile, and desktop environments, we formalise GUI grounding failures through a tripartite decomposition: perception (pixel-to-semantic mapping), planning (high-level action sequence), and execution (low-level mouse/keyboard trajectories). Pixel-level segmentation masks, accessibility tree annotations, and oracle action sequences obtained via deterministic UI state diffing are linked with each trace. Using a hybrid loss that combines contrastive screen embedding alignment (using InfoNCE on cropped UI elements), autoregressive action token prediction, and auxiliary bounding-box regression heads that function at 4× downsampled resolution to maintain fine-grained OCR and icon semantics, we train a multimodal VLA policy on top of a Qwen2-VL-7B backbone. A domain-adversarial training objective that aligns screen embeddings across platforms while maintaining task-specific action distributions is combined with test-time adaptation using a lightweight 256M adapter that conditions on platform-specific accessibility trees to achieve cross-platform zero-shot transfer. Our model decreases end-to-end grounding error from 48% (Claude-3.5 baseline) to 19% on the recently released GUI-Grounding-Bench (which includes 12k actual jobs from WebArena, AndroidWorld, and OSWorld), with the biggest improvements in perception-heavy mobile UIs. We provide the cross-platform VLA checkpoint, the failure atlas taxonomy, and the complete synthetic trace generator, creating the first reproducible benchmark and recipe for reliable computer-use agents. 
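The contrastive screen-embedding alignment in ps 2 boils down to an InfoNCE loss over (UI-crop, element-label) embedding pairs: the matching pair sits on the diagonal of a similarity matrix and must beat every mismatch in the batch. A minimal stdlib sketch under assumed shapes (a batch of nonzero row vectors); the function name and temperature are illustrative:

```python
import math
import random

def info_nce(crop_emb, text_emb, temperature=0.07):
    """Row-wise InfoNCE: for each crop i, treat text i as the positive
    and every other text in the batch as a negative."""
    def cos(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(y * y for y in v))
        return dot / (nu * nv)

    loss = 0.0
    for i, u in enumerate(crop_emb):
        logits = [cos(u, v) / temperature for v in text_emb]
        log_z = math.log(sum(math.exp(l) for l in logits))
        loss += -(logits[i] - log_z)  # negative log-softmax at the positive
    return loss / len(crop_emb)

rng = random.Random(0)
emb = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
aligned = info_nce(emb, emb)         # correct pairing, low loss
shuffled = info_nce(emb, emb[::-1])  # positives scrambled, high loss
```

Scrambling the pairing sends the loss up, which is the property the thread relies on to pull crop embeddings toward their accessibility-tree labels.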
ps 3 : Agent Memory Architectures Beyond RAG We present TypedAgentMemory, a modular memory substrate controlled by a differentiable memory controller trained end-to-end with the agent policy that explicitly distinguishes episodic semantic (dense vector summaries with SAE-derived concept tags), procedural, and working (short-term KV cache compression) memories. A 128-dim uncertainty head that thresholds epistemic uncertainty from an ensemble of forward passes gates memory writes. The controller uses a hierarchical policy over four memory operations: write, consolidate (graph-based merging with GNN message passing), forget (learned eviction via eligibility traces and recency + relevance scores), and retrieve (hybrid dense + symbolic query routing). Explicit memory consolidation every 50 steps is used to evaluate long-horizon tasks on τ-bench, WebArena, and GAIA. This results in a 2.3× decrease in context length and a 31% improvement in success rate over flat vector-store RAG baselines. Per-memory-type differential privacy approaches, such as homomorphic encryption for procedural skill graphs, concept-level k-anonymity on semantic features, and ε = 0.5 noise injection on episodic writing, are used to ensure privacy. Ablations show that typed memory facilitates effective cross-task transfer through procedural memory reuse and prevents catastrophic forgetting on 200-step agent trajectories. We provide the first rational substitute for monolithic RAG for production-grade autonomous agents by making the whole TypedAgentMemory library (based on LangGraph + FAISS + Neo4j), the long-horizon evaluation harness, and pretrained memory controllers for Llama-3.1-8B and Qwen2.5-72B open-source. 
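The typed-store idea in ps 3 (episodic / semantic / procedural stores, uncertainty-gated writes, recency-plus-relevance eviction) can be sketched without any of the learned machinery. A toy sketch: the class name, thresholds, and substring "retrieval" are placeholders for the trained controller and hybrid dense + symbolic search described in the thread:

```python
class TypedMemory:
    """Toy typed memory substrate: separate episodic / semantic /
    procedural stores, uncertainty-gated writes, and eviction by a
    recency-plus-relevance score once a store exceeds capacity."""

    def __init__(self, capacity=3, write_threshold=0.5):
        self.stores = {"episodic": [], "semantic": [], "procedural": []}
        self.capacity = capacity
        self.write_threshold = write_threshold
        self.clock = 0  # logical time, advanced on every accepted write

    def write(self, kind, item, relevance, uncertainty):
        # Gate: only commit items the agent is genuinely uncertain about,
        # i.e. information not already predicted by the policy.
        if uncertainty < self.write_threshold:
            return False
        self.clock += 1
        store = self.stores[kind]
        store.append({"item": item, "relevance": relevance, "t": self.clock})
        if len(store) > self.capacity:
            # Evict the entry with the worst recency + relevance score.
            store.sort(key=lambda e: e["relevance"] + e["t"] / self.clock)
            store.pop(0)
        return True

    def retrieve(self, kind, query):
        # Substring match stands in for dense / symbolic hybrid retrieval.
        return [e["item"] for e in self.stores[kind] if query in e["item"]]
```

Writes below the uncertainty threshold are dropped outright, which is what keeps context growth sublinear in agent steps; the thread's version learns both the gate and the eviction policy end-to-end with the agent.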
ps 4: SAE Universality Across Model Families By training 128k-feature JumpReLU SAEs (expansion factor 64, k = 32) on residual streams of Llama-3.1-8B, Qwen2.5-72B, Gemma-2-27B, Mistral-Large-2, and DeepSeek-V3 with the same hyperparameters and reconstruction aims, we perform the first extensive cross-family SAE universality investigation. A bipartite matching that quantifies pairwise overlap at both neuron-level (cosine similarity > 0.85) and concept-level (via automated interpretation pipelines using 512 probe prompts per feature) is obtained by performing feature matching via optimal transport with Sinkhorn algorithm on normalised decoder weight matrices. By grouping similar features from different families into 4.2k platonic ideas and annotating each concept with activation data, downstream steering efficacy, and causal mediation scores calculated via route patching, we further build a universal feature library. Steering vectors created from the universal library outperform within-family SAEs on out-of-distribution tasks and enhance zero-shot generalisation on MMLU-Pro, GPQA, and LiveCodeBench by an average of 9.4% when transferred between families, according to downstream transfer studies. We make available the whole SAE training software, the universal concept library with 4.2k interpreted features, the cross-family matching dataset (which includes optimum transport plans), and a plug-and-play steering toolkit that works with Hugging Face Transformers and vLLM. In order to facilitate transfer learning, model merging, and safety interventions within the existing frontier model ecosystem, this study offers the first rigorous atlas and infrastructure for mechanistic universality. 
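The cross-family matching step in ps 4 can be approximated before reaching for optimal transport: pair decoder directions whose cosine similarity clears the 0.85 threshold, one feature to at most one partner. A simplified stand-in (greedy matching rather than Sinkhorn; names are illustrative):

```python
import math

def cosine(u, v):
    """Cosine similarity between two decoder direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u)) / math.sqrt(sum(b * b for b in v))

def match_features(dec_a, dec_b, threshold=0.85):
    """Greedy one-to-one matching of SAE decoder directions across two
    model families: take candidate pairs in descending order of cosine
    similarity and keep each feature in at most one pair above threshold."""
    pairs = sorted(
        ((cosine(u, v), i, j)
         for i, u in enumerate(dec_a)
         for j, v in enumerate(dec_b)),
        reverse=True)
    used_a, used_b, matches = set(), set(), []
    for c, i, j in pairs:
        if c < threshold:
            break  # everything after this is below the cutoff
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j, round(c, 3)))
    return matches
```

Greedy matching is optimal here only when similarities are well separated; the Sinkhorn-based optimal transport the thread proposes handles the ambiguous, soft-assignment cases.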
ps 5 : Synthetic Data Generation Without Mode Collapse We provide an iterated synthetic data pipeline that explicitly characterises the collapse threshold ρ*(q) as a function of generator quality q (as determined by the activation entropy of the SAE feature and the entropy of the output distribution H_π). Using temperature-annealed sampling (T=1.0 → 0.7) supplemented with SAE-guided rejection sampling, we create synthetic corpora at different mixing ratios ρ ∈ {0, 0.1,…, 1.0} starting from a 7B base policy π_θ trained on 200B tokens of FineWeb-Edu. At each generation, we train a 128k-feature JumpReLU SAE (expansion factor 64, k=32) on the residual stream of the current model and filter synthetic samples whose top-activating features show activation entropy below a calibrated threshold τ derived from the real-data reference distribution. Our experiments provide the first empirical collapse-threshold map ρ*(q) at 1.3B–7B scale, demonstrating that SAE-guided diversity sampling extends the safe mixing ratio by 2.3× compared to persona-conditioned or temperature-only baselines, while generator entropy H_π ≥ 4.2 nats delays the onset of measurable perplexity degradation on a held-out real validation set until generation 7 under accumulation (versus generation 3 under pure replacement). A closed-form constraint on variance contraction rate under synthetic mixing is derived theoretically, connecting the number of safe iterations before tail probability mass falls below 10^{-3} to the spectral gap of the generator's transition kernel.
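The SAE-guided rejection step in ps 5 reduces to: measure each sample's activation entropy and drop anything below a threshold τ calibrated on real data. A minimal sketch over explicit probability vectors; in the thread's pipeline the distributions come from each sample's top-activating JumpReLU SAE features, and τ is derived from the real-data reference distribution:

```python
import math

def entropy(probs):
    """Shannon entropy (nats) of a discrete activation distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def calibrate_tau(real_dists, quantile=0.1):
    """Set the rejection threshold tau at a low quantile of the entropies
    observed on real data, so only unusually collapsed samples fail it."""
    ents = sorted(entropy(d) for d in real_dists)
    return ents[int(quantile * (len(ents) - 1))]

def filter_synthetic(synth_dists, tau):
    """Keep synthetic samples whose activation entropy clears tau; a
    low-entropy activation pattern is the early signature of collapse."""
    return [d for d in synth_dists if entropy(d) >= tau]

real = [[0.25, 0.25, 0.25, 0.25], [0.3, 0.3, 0.2, 0.2], [0.4, 0.3, 0.2, 0.1]]
tau = calibrate_tau(real)
kept = filter_synthetic([[0.97, 0.01, 0.01, 0.01], [0.25, 0.25, 0.25, 0.25]], tau)
```

The near-one-hot sample is rejected while the diverse one survives, which is how the pipeline extends the safe mixing ratio ρ before collapse sets in.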
arnav sonavane tweet media
English · 17 replies · 14 reposts · 163 likes · 15.6K views
मनीष तिवारी reposted
PIB India
PIB India@PIB_India·
Great Nicobar Project: FAQs

The Great Nicobar Project is a strategic initiative to strengthen India's presence in the Andaman Sea. It seeks to balance port-led growth with calibrated environmental safeguards. Protection of indigenous communities remains central to its planning. The project combines strategic, economic, and ecological priorities. This ensures that development is sustainable, inclusive, and aligned with national interests. The following FAQs provide an understanding of the key aspects of the project:

▶️ Does the Great Nicobar Island Project serve a clear strategic and national purpose?
The Great Nicobar Island Project is a project of strategic, defence, and national importance, undertaken after due diligence and careful consideration. It is of critical national security and strategic significance. The project will substantially strengthen India's presence in the Andaman Sea and Southeast Asia, enhance maritime and defence capabilities, and integrate the island with global trade and logistics networks. It will also establish a major international transshipment terminal with distinct locational advantages over competing ports in the Bay of Bengal region, positioning India as a key economic and strategic hub.

▶️ Is the project well-planned, feasible, and designed with long-term impact in mind?
The environmental impact of the project has been assessed in a detailed and multi-tiered manner in accordance with the Environment Impact Assessment Notification, 2006 and Coastal Regulation Zone Notification, 2019, wherein due consideration has been given to the ecological sensitivity and biodiversity value of the island. Based on such assessment, a robust Environmental Management Plan, along with stringent and enforceable conditions, has been prescribed to avoid, minimise and mitigate any potential impacts.

🔗 static.pib.gov.in/WriteReadData/…
Read more: pib.gov.in/PressReleasePa…
English · 76 replies · 1.3K reposts · 4.4K likes · 96.2K views
मनीष तिवारी reposted
Justin Schroeder
Justin Schroeder@jpschroeder·
We’re announcing: VibeBench, a new benchmark for what actually matters: how models feel when used on real work by experienced software engineers. But we need your help.

Here’s how it works:
1. An initial cohort of 1000 qualified software engineers (join: vibebench.standardagents.ai)
2. Groups of 250 evaluate new models for 2 days on real work.
3. Participants subjectively rank the model relative to other models they have experience with.
4. On day 4 a report is released with objective results derived from the subjective tests.

How you can help:
1. We all need this benchmark to exist, but for it to become a reality, we need an initial cohort of 1000 qualified software engineers. If that’s you, please join! vibebench.standardagents.ai
2. Repost this! We need to reach as many qualified engineers as we can find.
3. Share this initiative with everyone on your engineering teams.

Together we can make this benchmark a reality for all of us.
Justin Schroeder tweet media
English · 11 replies · 37 reposts · 142 likes · 44.4K views