Pinned Tweet
Kartikey

@dikshatwt Solution:
Just make a to-do list (the bare minimum you think you can do) and tick off everything by end of day.

How to ruin yourself:
1. Stay on your phone all day.
2. Feel sad for no clear reason.
3. Stop eating well and ignore your studies.
4. Sleep super late and wake up in the afternoon.
5. Let sadness take over everything.
6. Always look at others' lives and feel yours isn't enough.
7. Keep blaming yourself for the past but never try to let it go.
8. Compare your progress with people who started years before you.
9. Get stuck imagining outcomes instead of creating them.
10. Keep waiting for motivation instead of building discipline.

@carlcamilleri @brankopetric00 While scaling, this becomes a bottleneck on CPUs, which is why libraries like FAISS ship both CPU and GPU versions. Vector indexing and search on a CPU is more precise than the same operation on a GPU, but CPUs are slower, and we can't just keep burning compute, hence the GPU.

@brankopetric00 "Convert search into numbers" and "turn text into numbers": how computationally expensive are these steps? Do they remain feasible for a fast-changing dataset compared to traditional Lucene-based lexical search?

Vector databases explained for people who just want to understand.
You have 10,000 product descriptions. User searches for "comfortable outdoor furniture."
Traditional database:
- Searches for exact word matches
- Finds products containing "comfortable" OR "outdoor" OR "furniture"
- Misses "cozy patio seating" even though it's the same thing
- Keyword matching is stupid
Vector database approach:
- Convert search into numbers representing meaning: [0.2, 0.8, 0.1, 0.9, ...]
- Convert every product description to similar numbers
- Find products with similar number patterns
- Returns "cozy patio seating" because the numbers are close
- Meaning matching is smart
How it works:
Step 1: Turn text into vectors (arrays of numbers)
- "comfortable chair" becomes [0.2, 0.7, 0.1, 0.4, ...]
- "cozy seat" becomes [0.3, 0.8, 0.2, 0.5, ...]
- Similar meanings = similar numbers
- Uses AI models like OpenAI embeddings
Step 2: Store vectors efficiently
- Traditional database: Stores text
- Vector database: Stores arrays of numbers per item
- Indexes them for fast similarity search
- Optimized for "find similar" not "find exact"
Step 3: Search by similarity
- User query: "outdoor furniture"
- Convert to vector: [0.3, 0.6, 0.2, 0.8, ...]
- Find closest vectors using math (cosine similarity)
- Returns items ranked by similarity score
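The three steps above can be sketched in plain Python. The vectors here are hand-picked toy values standing in for a real embedding model (actual embeddings have hundreds of dimensions), so only the ranking logic is real:

```python
import math

# Toy 4-dimensional "embeddings" -- a real model (e.g. an OpenAI
# embedding model) would produce hundreds of dimensions per text.
catalog = {
    "cozy patio seating":      [0.3, 0.8, 0.2, 0.5],
    "comfortable chair":       [0.2, 0.7, 0.1, 0.4],
    "stainless steel blender": [0.9, 0.1, 0.8, 0.1],
}

def cosine_similarity(a, b):
    """1.0 = same direction (same meaning), near 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vector, k=2):
    """Rank every stored item by similarity to the query vector."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Pretend this is the embedding of the query "outdoor furniture".
query = [0.3, 0.6, 0.2, 0.8]
print(search(query))  # the two furniture items rank above the blender
```

Note the search never compares words at all, only vectors, which is why "cozy patio seating" can match a query that shares none of its keywords.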
Use cases:
- Product search that understands intent
- Documentation search that finds relevant answers
- Recommendation engines
- Chatbots that find similar questions
- Anomaly detection
Popular vector databases:
- Pinecone: Managed, easy, expensive
- Weaviate: Open source, feature-rich
- Milvus: Fast, scalable, complex
- pgvector: Postgres extension, simple
- Qdrant: Fast, Rust-based
Controversial take: You don't need a vector database for most projects. Start with Postgres + pgvector extension.
Vector databases are great for scale. For under 1 million vectors, your regular database with a vector extension works fine.
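To ground the pgvector suggestion, here is a minimal schema sketch. Table and column names are illustrative; `<=>` is pgvector's cosine-distance operator, and the dimension is shrunk to 3 for readability:

```sql
-- Enable the extension and store one vector per product.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE products (
    id          bigserial PRIMARY KEY,
    description text,
    embedding   vector(3)  -- real embeddings: hundreds of dimensions
);

-- Optional ANN index; plain sequential scans are fine below ~1M rows.
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops);

-- Nearest neighbours to a query embedding, by cosine distance.
SELECT description
FROM products
ORDER BY embedding <=> '[0.3, 0.6, 0.2]'
LIMIT 5;
```

This keeps vectors next to the rest of your relational data, so you can filter, join, and do similarity search in one query.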

@westoque @brankopetric00 Training the model isn't part of the vector database, imo. We generally use a pre-trained model. Not sure on this, though.

@brankopetric00 you forgot the training step. someone actually needs to associate word meanings that are close to each other, and that is no easy feat. what makes a better search is how precise the training data is. arguably this is where a lot of the competition is now.

@mohit__kulhari @brankopetric00 Maintaining the indexes, rebuilding them (in some cases) when inserting data, and preserving precision all become difficult as we scale to millions or billions of vectors. Though libraries like FAISS and JVector help a lot and are built to mitigate this issue.

@brankopetric00 what are the biggest challenges when scaling vector databases to millions of records?

@DataSpeeder @brankopetric00 Think of populating the database as inserting vectors into the DB.
A pre-trained model like OpenAI's embedding models or Llama-3, when given the words "cozy" or "comfortable", gives similar embeddings. In simple words, it gives vectors that are close to each other in space.

@brankopetric00 What is populating the vector database, and how does it determine that "cozy" and "comfortable" are close?

Networking online is powerful, but real connections happen in replies. Drop a hi and let’s connect. #Networking

If you could write a letter to your future self, what would you say? #SelfReflection

Would you rather build wealth fast or slow and steady? #MoneyTalk

Who’s the most inspiring person you follow here? #TwitterCommunity
