Secludy AI

12 posts

@Secludy

Privacy-Guaranteed Synthetic Data for Training AI Models

Joined September 2024
27 Following · 21 Followers
Secludy AI retweeted
Ben Cerchio (@ben_cerchio)
We're proud to be launching @Secludy today with $4M in seed funding. @ImpressionVC led the round. @LAUNCH and The Syndicate, a venture firm and angel investing group led by @Jason Calacanis, also joined, along with Wedbush Ventures, @PrecursorVC, @HustleFundVC, @scriptcapital, Mana Ventures, Chispa VC, and an amazing group of angel investors.

Banks and fintechs are sitting on incredibly valuable data but can't use it to train AI models. Privacy laws, contracts, and cross-border rules keep it locked down. So my co-founder @Dr_Mingze_He and I got to work.

Secludy unlocks your most sensitive data while keeping the utility intact. You can train custom models and do vendor evaluations without ever putting your sensitive data at risk or giving up AI model performance. Moving on to other regulated industries next...

Also, special thank you to @sososazesh who wrote our first check and to all of our other existing investors. Read more in the comments.
Secludy AI retweeted
Lauren Wagner (@typewriters)
The solutions already exist, they're just unevenly distributed (/think tanks are not aware of what startups are building to solve these problems) See: @Secludy
Tao Burga (@taoburr)

1/7 High-quality data is a major bottleneck to AI progress. But while recent LLMs were trained on ~hundreds of TBs of data, the world has digitized 180 zettabytes of it. A billion times more. The problem is access. In a new essay for The Launch Sequence, @iamtrask and @lace31692 lay out a possible solution…

Secludy AI (@Secludy)
👀
Rohan Paul (@rohanpaul_ai)

LLMs leak up to 27.5% of the PII in their sensitive training data (Personally Identifiable Information such as emails, SSNs, VINs, and Bitcoin wallets). @Secludy makes it easy to generate privacy-guaranteed synthetic data that is a near replica of the original unstructured dataset but better.

How? They use privacy-protected LLMs, adding carefully controlled noise to the model weights before generating synthetic data for AI model fine-tuning and evaluations. They just released a technical report that demonstrates their approach.

🧵1/n LLMs are known to memorize and expose sensitive information. Even when trained on masked unstructured datasets, they can still retain and regurgitate personal data, which is a major privacy risk.

Secludy AI (@Secludy)
5/ Ready to secure your data? Our PII Leakage Testing Tool helps you audit leakage in fine-tuned LLMs.
Secludy AI (@Secludy)
4/ The European Data Protection Board (EDPB) now wants companies to run these audits as proof of resistance to re-identification (EDPB Opinion 28/2024).
Secludy AI (@Secludy)
We just launched our PII Leakage Testing Tool on AWS Marketplace! We put data masking to the test. A thread 🧵👇
Secludy AI retweeted
Ben Cerchio (@ben_cerchio)
Training AI models on sensitive medical data poses big risks, as this TechCrunch article highlights. The same goes for unstructured text. Privacy-preserving techniques like training on differentially private synthetic data can ensure utility while protecting privacy 🛡️ techcrunch.com/2024/11/19/psa…
Secludy AI retweeted
Ben Cerchio (@ben_cerchio)
1/ My hot take on the State of AI Report 2024 by Air Street Capital 🔥 Synthetic data isn’t just for big AI labs anymore. It’s about to become essential for nearly ALL companies fine-tuning their models
Secludy AI (@Secludy)
@ajhodls @kepano Absolutely, training on differentially private synthetic data is the way to go. It provides privacy guarantees, so you can protect users’ sensitive data while still learning from your most valuable data.
kepano (@kepano)
if your data is stored in a database that a company can freely read and access (i.e. not end-to-end encrypted), the company will eventually update their ToS so they can use your data for AI training — the incentives are too strong to resist