Benoît Quartier
126 posts

Benoît Quartier retweeted

The State of Generative AI in the Enterprise and what it says to me! Last week, @MenloVentures released its 2024 report, which surveyed 600 U.S. IT decision-makers. Here is what it says to me: 👀
Open-source AI is slowly taking over. Enterprises want to build their own AI solutions. Infrastructure spending has increased 8x, deployment investment has increased 3.8x, and build vs. buy strategies have shifted from 20% to 53%. Deploying your own AI models is more complex and takes more time than the faster, simpler option of using APIs. Combine this with open models approaching closed-model performance and @OpenAI losing market share, and the direction seems clear.
I was expecting fine-tuning to decrease, not because it is not needed, but because, with better models, it moves to a later stage in the lifecycle of AI applications. That's why we see an increase in RAG. As a company, you should always try to make things work first and then optimize. Focusing on UX/DX and reliability (evaluation, tracing, logging) is more important for most companies in the current state. Fine-tuning will help these companies drive down costs significantly later.
Workflow automation will be the biggest growth area in the next two years in terms of use case adoption.
Vector databases likely won’t win long-term in enterprises. PostgreSQL and MongoDB's vector capabilities are good enough and already present in most companies.
The real challenge isn't about model capabilities anymore. We're in a similar situation as with cloud adoption in 2010. My advice is simple: build, learn, and iterate. Enterprises that invest in understanding these technologies will have significant advantages in the future.

Benoît Quartier retweeted

OCR-2.0 is coming, and Generative AI and multimodal LLMs will power it! 🔍 GOT (General OCR Theory) is a 580M end-to-end OCR-2.0 model that outperforms all existing methods.
GOT consists of a vision encoder that converts images into tokens and a decoder that generates OCR outputs in various formats (e.g., plain text, markdown, Mathpix). GOT is designed to handle complex inputs like sheet music, formulas, and geometric shapes.
Implementation
1️⃣ Vision Encoder Pre-training: The encoder (VitDet) is trained on scene text and document OCR data to recognize both slice and whole-page inputs.
2️⃣ Joint-Training: The encoder is connected to the decoder (Qwen-0.5B) and both are trained on more general OCR tasks (e.g., formulas, sheet music, geometry).
3️⃣ Post-Training: The model is fine-tuned on specific tasks, such as fine-grained OCR, multi-page PDFs, and dynamic resolution, using new synthetic datasets.
Insights
🧠 Encoder-Decoder with 80M (VitDet) Encoder and 500M (Qwen2) Decoder with 8k context
🥇 Achieves a 0.035 edit distance and 0.972 BLEU score on plain OCR
📊 Outperforms LLaVA-NeXT and Qwen-VL-Max in document and scene text OCR
🧮 Can extract LaTeX formulas from Arxiv and convert them to Mathpix format
📃 Supports dynamic resolution and multi-page OCR
🖼️ Input resolutions up to 1024x1024
Paper: huggingface.co/papers/2409.01…
Github: github.com/Ucas-HaoranWei…
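For readers unfamiliar with the edit distance metric quoted above, here is a minimal Python sketch of a character-level, length-normalized Levenshtein distance. It is an illustration only: the function names are hypothetical, and dividing by the longer string's length is an assumption, not necessarily the exact normalization used in the GOT paper. Lower is better, and 0.0 means the OCR output matches the reference exactly.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep on a match)
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return levenshtein(prediction, reference) / longest
```

Under this definition, an OCR output of "kitten" against a reference of "sitting" needs 3 edits over 7 characters, so the score is 3/7.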

Benoît Quartier retweeted

Elasticsearch (and Kibana) are Open Source, Again! Soooo excited. Read more in the blog I wrote elastic.co/blog/elasticse…
Benoît Quartier retweeted

Apple is joining the public AI game with 4 new models on the Hugging Face hub! huggingface.co/collections/ap…
Benoît Quartier retweeted

New short course: Open Source Models with Hugging Face 🤗, taught by @mariaKhalusova, @_marcsun, and Younes Belkada! @huggingface has been a game changer by letting you quickly grab any of hundreds of thousands of already-trained open source models to assemble into new applications. This course teaches you best practices for building this way, including how to search and choose among models.
You’ll learn to use the Transformers library and walk through multiple models for text, audio, and image processing, including zero-shot image segmentation, zero-shot audio classification, and speech recognition. You'll also learn to use multimodal models for visual question answering, image search, and image captioning. Finally, you’ll learn how to demo what you build locally, on the cloud, or via an API using Gradio and Hugging Face Spaces.
You can sign up here: deeplearning.ai/short-courses/…
Benoît Quartier retweeted

Only 3 weeks until AMLD EPFL 2024! Check out the event schedule with 28 workshops, 43 tracks, poster sessions, an exhibition and inspiring keynote speakers. buff.ly/3IkaMNd
🗓 March 23 to 26
🎟 Tickets buff.ly/4c02veP
🇨🇭 SwissTech Convention Center, Lausanne

@brinilo2 Where did Prof. Eckerle ask for "isolating" children or "taking away their lifes?"
If you think your children can't be harmed by the virus because they have a strong immune system, that's your call. What about other parents who are concerned and do want protection and vaccination?

@EckerleIsabella What do you think of this study: twitter.com/apsmunro/statu…
Isn't it reassuring?
Alasdair Munro@apsmunro
The best data by far on #LongCovid is out from the ONS. For kids, the news is incredibly reassuring: parents' minds should be put to rest. Rates of common symptoms after #COVID19 at 12 weeks for kids are extremely low (0% to 1.7%) compared to controls ons.gov.uk/peoplepopulati… 1/

@MartiniGuyYT Ask him where the tokens sacrificed to PulseChain go and what they will be used for. Are they just adding to his personal wealth?

Today I get to speak to @RichardHeartWin about $HEX on a livestream at 3pm BST
If you have a question for him, ask it below ⬇️⬇️
Benoît Quartier retweeted

Digital sovereignty? Switzerland isn't even pretending to care about it. The recent strategic decisions by the Confederation and Swisscom leave a very bitter aftertaste
letemps.ch/economie/souve… by @Anouch
Benoît Quartier retweeted

@SwissBorgNation @Boonzht @swissborg That's wrong (or I misunderstood the question).
The $CHSB that are locked for premium access do not yield interest.

🥁 We've reached an all-time high for our Community Index: 9.8! Each week we'll calculate the Community Index and share it on Wednesdays along with the $CHSB Yield of the day. The CHSB #Yield program aims to empower our community. Join us: swissborg.com/smart-yield-ac…

Join tomorrow for the Data+AI Online Meetup on Encoding multi-layered Vega-Lite #COVID19 Geodata visualizations with #MLflow Model Registry meetup.com/data-ai-online…


@jeremyphoward Where did you get the "50% of those sick have no symptoms."?
I only found two studies that show around 20% asymptomatic cases (forbes.com/sites/brucelee…).

"World Health Organization officials Monday said they still recommend people not wear face masks unless they are sick"
50% of those sick have no symptoms. So this recommendation means #Masks4All is needed, I guess?
CNN@CNN
World Health Organization officials say they still recommend people not wear face masks unless they are sick with Covid-19 or caring for someone who is sick cnn.it/2UMWNGn




