Marc Sloan
@MarcCSloan
221 posts

Product Manager leading Dev Mode design to code at @figma. Founder of @ContextScout and @digit_allies

London · Joined October 2021
644 Following · 282 Followers

Pinned Tweet
Marc Sloan @MarcCSloan
Find out more about me and my award-winning charity work creating Covid Tech Support, and have a play with an interactive demo of the in-browser AI I spent 5 years working on at @ContextScout, at marcsloan.com
[image]
0 replies · 0 retweets · 6 likes · 0 views
Marc Sloan retweeted
Figma @figma
Figma MCP server, now with design context anywhere you work:
→ Remote access
→ Connection with Figma Make
→ New Code Connect UI components
39 replies · 210 retweets · 2.1K likes · 272.9K views
Marc Sloan retweeted
Figma @figma
New Code Connect update: Expanded web framework support to include design systems written in Angular, Vue, and HTML
[GIF]
12 replies · 38 retweets · 375 likes · 36.4K views
Marc Sloan retweeted
Sharky_Bricks @Sharky_Bricks
🚀 Exciting news! The Moon Lunar Landscape has been featured in an article on @SPACEdotcom! 🌕✨ space.com/lego-ideas-spa… Bringing this lunar vision to life in LEGO bricks has been an incredible journey. Grateful for the support of this amazing community!
0 replies · 1 retweet · 1 like · 109 views
Marc Sloan @MarcCSloan
Excited to join @figma as a Product Manager, working on Dev Mode with a focus on Design to Code. Looking forward to advancing AI-empowered design-to-code workflows with Figma’s amazing tools. Thrilled to be part of such a talented team! #ProductManagement #AI #DesignToCode
[image]
0 replies · 1 retweet · 1 like · 133 views
Marc Sloan retweeted
Sharky_Bricks @Sharky_Bricks
We did it! 10k supporters! 🎉 A big thank you to everyone for helping to get it this far 😊 #LEGOIdeas
[image]
0 replies · 1 retweet · 3 likes · 96 views
Marc Sloan retweeted
Sharky_Bricks @Sharky_Bricks
You, Me and the Moon 🌙👩‍❤️‍👨🌃 My final entry in the Lego Ideas Picture Perfect Memories challenge. A brick built Polaroid of the moon hanging over a city skyline at night during a first date. If you like it, head over to the Lego Ideas page and comment 😊 ideas.lego.com/challenges/02f…
[images]
0 replies · 1 retweet · 3 likes · 239 views
Josh Miller @joshm
a little Sunday surprise for you... meet @browsercompany's 2nd product: 🔍Arc Search🔎
it's a default browser for your iPhone ...that BROWSES FOR YOU
the origin story is a bit unusual so I wanted to give you the full backstory:
[GIF]
387 replies · 437 retweets · 4K likes · 1M views
Marc Sloan @MarcCSloan
We're one step closer to autonomous agents browsing the web on our behalf!
Yu Su @ysu_nlp

Generalist web agents may get here sooner than we thought---introducing SeeAct, a multimodal web agent built on GPT-4V(ision).

What's this all about?
> Back in June 2023, when we released Mind2Web (osu-nlp-group.github.io/Mind2Web/) and envisioned a generalist web agent, a language agent that can work out of the box on any given website, my projection was that it would take at least several years to see such an agent that is anywhere near usable in practice.
> Why wouldn't I? The most powerful LLM at the time (perhaps still is today), GPT-4, was pretty terrible at this---its end-to-end success rate was around 2% (!!). The HTML of modern websites is too long and noisy for LLMs. It's like finding a needle in a haystack. And a long-horizon task can take 10+ actions, so an LLM needs to successfully find 10+ "needles" in a row (!!!) to complete a task.

What's changed in just a few months?
> Large multimodal models. The end of 2023 marked a major milestone for LMMs, with GPT-4V, Gemini, and many good OSS LMMs released.
> Multimodal web agents. Websites are designed to be visually rendered and consumed. Visuals are much cleaner and more intuitive than HTML, and 10x more efficient in terms of token counts. Plus, a pretty unique property of websites is that we have the correspondences between visual elements and HTML code! Such perfectly aligned multimodality is a gold mine for modeling.
> Online evaluation. The final piece of the secret recipe is online evaluation on live websites. Mind2Web initially only supported offline eval on cached websites. We developed a new tool to support running and evaluating web agents on live websites. Both LLMs and LMMs get a big boost, because now they don't have to follow exactly the reference plan in offline eval but are rather free to explore alternative plans to achieve the same goal.

SeeAct
> SeeAct is a generalist web agent built on LMMs like GPT-4V. Specifically, given a task on any website (e.g., "Compare iPhone 15 Pro Max with iPhone 13 Pro Max" on the Apple homepage), the agent first performs action generation to produce a textual description of the action at each step towards completing the task (e.g., "Navigate to the iPhone category"), and then performs action grounding to identify the corresponding HTML element (e.g., "[button] iPhone") and operation (e.g., CLICK, TYPE, or SELECT) on the webpage.

Main results
> SeeAct can successfully complete up to 50% of tasks on live websites, substantially outperforming GPT-4 (20%) and FLAN-T5 (18%), if oracle action grounding is provided.
> However, grounding is still a major challenge. It turns out that GPT-4V can often accurately describe in text what action should be taken, but has trouble grounding the action to the exact HTML element and operation on the webpage. Existing grounding strategies like set-of-mark prompting turn out not to be very effective for web agents. Our best grounding strategy leverages the correspondences between visuals and HTML.
> SeeAct w/ GPT-4V shows many interesting capabilities such as speculative planning, world knowledge (e.g., airport codes), and some sort of "world model" (for websites at least): it can correctly predict the state transitions on a website (e.g., what would happen if I click this button).

Fun fact
Initially we were hoping to show that even GPT-4V would still be insufficient for generalist web agents and we may still need fine-tuning, but we kept getting blown away by its incredible capability as a web agent. Such pleasant surprises are why I enjoy doing AI research so much these days. I also look forward to testing Gemini Ultra to see whether its strong performance on MMMU transfers.

Conclusion
Practically useful web agents could be coming soon. Buckle up and start thinking about what new applications will be enabled.

📌 Website: osu-nlp-group.github.io/SeeAct/
📌 Paper: github.com/OSU-NLP-Group/…
📌 Code: github.com/OSU-NLP-Group/…

Work led by my amazing students @boyuan__zheng @BoyuGouNLP from @osunlp, joint with Jihyung Kil and @hhsun1. Hire them for internships!
0 replies · 0 retweets · 0 likes · 67 views
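The two-stage loop the thread describes (action generation, then action grounding) can be sketched roughly as below. This is a minimal illustration, not SeeAct's actual code: the function names, the `GroundedAction` type, and the keyword-matching heuristic are all assumptions, with the heuristic standing in for the real LMM calls and visual/HTML-correspondence grounding.

```python
from dataclasses import dataclass

@dataclass
class GroundedAction:
    element: str    # e.g. "[button] iPhone"
    operation: str  # "CLICK", "TYPE", or "SELECT"
    value: str = ""

def generate_action(task: str, step_history: list) -> str:
    """Stage 1 (action generation): an LMM such as GPT-4V would describe
    the next step in text. Stubbed here with a fixed description."""
    return "Navigate to the iPhone category"

def ground_action(description: str, html_elements: list) -> GroundedAction:
    """Stage 2 (action grounding): map the textual description to a
    concrete HTML element and operation. A trivial keyword heuristic
    stands in for SeeAct's grounding strategy."""
    words = {w.lower().strip(".,") for w in description.split()}
    for el in html_elements:
        if any(w in el.lower() for w in words if len(w) > 3):
            return GroundedAction(element=el, operation="CLICK")
    raise ValueError("no matching element found")

# One iteration: describe the next step, then ground it to the page.
elements = ["[link] Mac", "[button] iPhone", "[link] Support"]
step = generate_action("Compare iPhone 15 Pro Max with iPhone 13 Pro Max", [])
action = ground_action(step, elements)
```

In the real system both stages call the multimodal model against a screenshot, and the thread notes that grounding (the second stage) is where GPT-4V still struggles.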
Marc Sloan @MarcCSloan
Required reading for PMs further down the food chain in a corporate hierarchy looking to get things done: "Switch: How to change things when change is hard" by Chip and Dan Heath. Time to herd some elephants... 🐘 #ProductManagement
[image]
0 replies · 0 retweets · 1 like · 197 views
Marc Sloan @MarcCSloan
Does anyone else use ChatGPT for rubber duck debugging? Articulating my issue to it often leads me to the solution, plus I get GPT's insights 🦆💡 #ChatGPT #ProblemSolving
[image]
0 replies · 0 retweets · 0 likes · 72 views
Marc Sloan retweeted
Conrad Godfrey @conradgodfrey
GPT-4V "Describe this image" 🔃 Dall-E 3 "Generate this image" Recursive loop
118 replies · 581 retweets · 4.5K likes · 1.6M views
Marc Sloan @MarcCSloan
Just finished reading "The Build Trap" by @lissijean. It made a big impact on me, with practical insights I'll actually use. Highly recommend for anyone looking to rethink their own and their org's approach to #ProductManagement 👏
[image]
0 replies · 2 retweets · 6 likes · 1.5K views