Jiri Simsa
@jsimsa

145 posts

Working on data processing and analysis infrastructure for ML @ Google.

California, USA · Joined September 2015
0 Following · 165 Followers

Jiri Simsa @jsimsa:
If you are interested in advancing infrastructure that provides large scale data analysis and processing for ML workloads across Google, my team is hiring: linkedin.com/jobs/view/2905…
Jiri Simsa retweeted

Google @Google:
Five years ago, we open sourced @TensorFlow, our machine learning framework that's now the most popular machine learning library in the world. 🌎 To celebrate, we're sharing a few interactive demos and tutorials you can try, no experience required → goo.gle/3nz22Xh
Jiri Simsa retweeted

James Bradbury @jekbradbury:
In 2016, when I was working on machine translation, it took me more than a week on a multi-GPU machine to train a competitive system on WMT English-German. Today, JAX on a TPU v3 supercomputer can train a better model on the same data in 16 seconds! cloud.google.com/blog/products/…
Jiri Simsa retweeted

👩‍💻 Paige Bailey @DynamicWebPaige:
👉 tf.data supports *any* machine learning framework (JAX, @TensorFlow, PyTorch, more!), and is a great way to speed up your data input pipelines. Be sure to try out our new features for tf.data, available in TF 2.3: github.com/tensorflow/ten…
Ong Chin Hwee 🐼 @ongchinhwee:

1. Start with TF Data
2. Enable non-deterministic ordering
3. Cache data
4. Turn on experimental optimizations
5. Autotune parameter values
--> >10% performance improvement! 🤯 #EuroPython
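The five steps in the quoted tweet can be sketched as a single tf.data pipeline. This is a minimal illustration, not Google's actual input pipeline: the dataset contents and `expensive_transform` are made up, and `options.deterministic` is the TF 2.6+ spelling of the older `options.experimental_deterministic`.

```python
import tensorflow as tf


def expensive_transform(x):
    # Stand-in for a costly per-element preprocessing step.
    return x * 2


# 1. Start with a tf.data dataset.
ds = tf.data.Dataset.range(1000)

# 2. Enable non-deterministic ordering: elements may be produced
#    out of order, which unlocks extra parallelism.
options = tf.data.Options()
options.deterministic = False
ds = ds.with_options(options)

# 5. Autotune parameter values: let tf.data pick the degree of
#    parallelism for the map.
ds = ds.map(expensive_transform, num_parallel_calls=tf.data.AUTOTUNE)

# 3. Cache the transformed elements after the first epoch.
ds = ds.cache()

# Batch and prefetch so input production overlaps with training.
ds = ds.batch(32).prefetch(tf.data.AUTOTUNE)
```

The ordering matters: caching after the map avoids recomputing `expensive_transform` on every epoch, while batching and prefetching come last so the cache stores unbatched elements.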
Jiri Simsa retweeted

TensorFlow @TensorFlow:
🔍Inside TensorFlow: tf.data + tf.distribute In this presentation, Jiri Simsa showcases best practices. You’ll learn about the input pipeline, parallel extraction, distributed training, and more. Watch here → goo.gle/2wYGEG7
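As a rough sketch of the "parallel extraction" pattern the talk covers: read multiple input shards concurrently with `interleave`, then prefetch so the input pipeline overlaps with training. The shard files and record contents below are made up so the example is self-contained.

```python
import os
import tempfile

import tensorflow as tf

# Write two tiny TFRecord shards to a temp directory (illustrative data).
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(2):
    path = os.path.join(tmpdir, f"shard-{i}.tfrecord")
    with tf.io.TFRecordWriter(path) as writer:
        for j in range(3):
            writer.write(f"record-{i}-{j}".encode())
    paths.append(path)

# Parallel extraction: interleave reads across the shards, letting
# tf.data autotune how many files are read concurrently.
ds = tf.data.Dataset.from_tensor_slices(paths)
ds = ds.interleave(tf.data.TFRecordDataset,
                   num_parallel_calls=tf.data.AUTOTUNE)

# Prefetch so record reading overlaps with the training step.
ds = ds.prefetch(tf.data.AUTOTUNE)
```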
Jiri Simsa retweeted

Josh Gordon @random_forests:
If your dataset is small, use an in-memory cache: ds = ds.cache()
If large, create an on-disk cache: ds = ds.cache("my_file")
Afterwards, you can call ds.batch() and ds.shuffle() as always. Complete example: tensorflow.org/tutorials/load…
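Both caching variants from the tweet above, end to end. The `"my_file"` prefix follows the tweet; the dataset contents and temp directory are illustrative. Note that the argument to `cache()` is a filename *prefix*: tf.data writes the cache files during the first full epoch and reads them back on later epochs.

```python
import os
import tempfile

import tensorflow as tf

ds = tf.data.Dataset.range(100)

# Small dataset: keep elements in memory after the first full pass.
small = ds.cache()

# Large dataset: spill the cache to disk via a filename prefix.
cache_prefix = os.path.join(tempfile.mkdtemp(), "my_file")
large = ds.cache(cache_prefix)

# Shuffle and batch afterwards, as usual; shuffling after the cache
# keeps the element order fresh each epoch.
small = small.shuffle(buffer_size=100).batch(32)
large = large.shuffle(buffer_size=100).batch(32)
```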
Jiri Simsa retweeted

Jeff Dean @JeffDean:
Not only are TPUs fast for doing machine learning, but they are also more energy efficient than alternative platforms, so you can feel great as you train that language model on scientific articles about climate change. twitter.com/GCPcloud/statu…
Google Cloud Tech @GoogleCloudTech:

Our Cloud TPUs are designed with energy efficiency in mind, specifically to accelerate deep learning workloads at higher teraflops per watt compared to general purpose processors → blog.google/topics/google-… #EarthDay

Jiri Simsa retweeted

Brennan Saeta @bsaeta:
Today in #CloudTPU announcements: (1) @TensorFlow 1.8 now available with a slew of perf improvements (2.7k to 3.2k images/sec on ResNet-50, aka 12.5 hours is now 9 hours to fully train), and (2) we have opened up a new zone (us-central1-b) for HA & load balancing.
Jiri Simsa retweeted

Jeff Dean @JeffDean:
We just posted new DAWNBench results for ImageNet classification training time and cost using Google Cloud TPUs+AmoebaNet (architecture learned via evolutionary search). You can train a model to 93% top-5 accuracy in <7.5 hours for <$50. Results: dawn.cs.stanford.edu/benchmark/