Nick Becker

40 posts

@becker_data

Building an accelerated data science and data engineering ecosystem with @rapidsai

New York, NY · Joined January 2015
41 Following · 70 Followers
Nick Becker
Nick Becker@becker_data·
@DaPiePiece We've generally focused on enabling cuML with other HPO tools, but we're always looking to better understand user needs. Could you file a GitHub issue on cuML describing how built-in HPO tooling would affect your work? github.com/rapidsai/cuml/…
English
1
0
0
22
Pai Coffee
Pai Coffee@DaPiePiece·
@becker_data Oooh alright, I'll give it a try, sounds good. Btw, any plans on native HPO in cuML? I'm a big fan of how cuML is practically a superset of sklearn, makes migrating sklearn code extremely easy.
English
2
0
0
35
Pai Coffee
Pai Coffee@DaPiePiece·
God I despise PyTorch, who made this garbage? The literal only reason I'm using it over sklearn or cuML is cause sklearn doesn't run on a GPU and cuML doesn't have hyperparameter optimisation
English
2
0
0
118
Nick Becker
Nick Becker@becker_data·
@DaPiePiece SVMs are a perfect example of this scenario (results will vary based on data size, CPU, and GPU). In this example, my CPU is relatively weak so I can get benefits even with just 10K rows. gist.github.com/beckernick/ff0…
English
0
0
0
12
Nick Becker
Nick Becker@becker_data·
@DaPiePiece Using scikit-learn's GridSearchCV with cuML models should generally give you a large speedup if the model training itself is the bottleneck. If the cuML speedup is > the number of parallel models you can run via n_jobs without sacrificing performance, you'll come out ahead.
English
1
0
0
19
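The break-even rule in the reply above can be checked with some quick arithmetic. This is only a rough sketch under idealized assumptions: the function name, the fit counts, the timings, and the speedup factors are all hypothetical, CPU fits are assumed to scale perfectly across `n_jobs` workers, and GPU fits are assumed to run one at a time.

```python
def grid_search_wall_time(n_fits, seconds_per_cpu_fit, n_jobs, gpu_speedup):
    """Return (cpu_seconds, gpu_seconds) for a grid search.

    Idealized model (an assumption, not a benchmark):
    - CPU: n_jobs fits run in parallel with perfect scaling.
    - GPU: fits run sequentially, each gpu_speedup times faster than one CPU fit.
    """
    cpu_seconds = n_fits * seconds_per_cpu_fit / n_jobs
    gpu_seconds = n_fits * seconds_per_cpu_fit / gpu_speedup
    return cpu_seconds, gpu_seconds


# Hypothetical numbers: 120 fits, 60 s per CPU fit, 8 parallel CPU workers.
# GPU speedup (20x) exceeds n_jobs (8), so the GPU run finishes first.
cpu_fast_gpu, gpu_fast_gpu = grid_search_wall_time(120, 60, n_jobs=8, gpu_speedup=20)

# GPU speedup (4x) is below n_jobs (8), so the parallel CPU run finishes first.
cpu_slow_gpu, gpu_slow_gpu = grid_search_wall_time(120, 60, n_jobs=8, gpu_speedup=4)

print(cpu_fast_gpu, gpu_fast_gpu)
print(cpu_slow_gpu, gpu_slow_gpu)
```

In this model the two times are equal exactly when the speedup matches `n_jobs`, which is the threshold described in the reply. Real runs will deviate because CPU parallelism rarely scales perfectly and results vary with data size and hardware.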
Pai Coffee
Pai Coffee@DaPiePiece·
@becker_data Hi yeah my issue is that my data set is massive (10000*64 10x10 images) so kinda need to pass this onto a GPU, spent 6+ hours trying to GridSearchCV using sklearn to no avail, and passing between GPU cuML data to CPU sklearn data is, from what I've heard, slower
English
2
0
0
24
Nick Becker retweeted
James Dolezal
James Dolezal@JamesDolezal·
🚀 Slideflow 2.0 is here! Take your #digitalpathology research to the next level with extended MIL support, expanded feature generators, and enhanced stain normalization. Deploy & visualize models with Slideflow Studio. 🔬💻 Check it out at slideflow.dev
English
0
4
15
4.6K
Nick Becker
Nick Becker@becker_data·
@bilzrd @bilzard It looks like this may work smoothly in the cuDF 23.04 nightlies. Please let us know if things work for you!
English
0
0
0
15
Nick Becker
Nick Becker@becker_data·
@bilzrd @bilzard which version does wandb require? cuDF has relaxed its protobuf constraint to >=4.21.6,<4.22 in the 23.04 nightly packages
English
1
0
0
33
Nick Becker
Nick Becker@becker_data·
@StonewrightAI @RAPIDSai Hi @StonewrightAI, we've just updated the RAPIDS website and it includes new installation instructions and advanced install resources. You can learn more at rapids.ai/#quick-start
English
1
0
1
8
Stonewrot
Stonewrot@Stonewrot·
@RAPIDSai hello! Why am I having a hard time installing cudf? I believe I have compatible versions of python, CUDA, cupy, cuDNN, zlib, and more. Can you please point me toward installation resources? Thank you.
English
1
0
0
58
Nick Becker
Nick Becker@becker_data·
@rblourenco You can now pip install cuDF, cuML, and cugraph on Colab, which should be much faster!
English
1
0
1
42
Ricardo Barros Lourenço
Ricardo Barros Lourenço@rblourenco·
It took 29 minutes. Now able to work. But every time I restart, I'll need to reinstall 😭😭😭
English
1
0
0
133
Nick Becker
Nick Becker@becker_data·
@PtrPomorski @BenHarlander XGBoost now supports categorical features via optimal partitioning. If you haven't revisited categorical handling in XGBoost in a while, it's worth a look: developer.nvidia.com/blog/categoric…
English
0
0
0
28
Piotr Pomorski
Piotr Pomorski@PtrPomorski·
Truth be told, currently there’s no significant performance difference between xgboost, random forest and lightgbm, so use whatever (just optimise hyperparams). Only when you have categorical features I highly recommend catboost, it beats them all.
English
2
0
1
445
Nick Becker
Nick Becker@becker_data·
@zaialamm You can use a GPU by clicking "Runtime -> Change runtime type" from the dropdown menu. You can then install and use XGBoost and other GPU-accelerated ML libraries like cuML rapids.ai/pip.html
English
1
0
0
32
Zai
Zai@zaialamm·
@becker_data I know. But I use Google Colab for modelling and I think it doesn't support CUDA, does it? 🤔
English
1
0
0
26
Zai
Zai@zaialamm·
Why is training with XGBoost very sloooww? I hate waiting ffs 😩
English
1
0
0
106
Nick Becker retweeted
Jacob Tomlinson
Jacob Tomlinson@_JacobTomlinson·
Here are the 5 CLI commands it takes to get a multi-node multi-GPU Data Science cluster running in the cloud with RAPIDS and Dask. 1/7
Jacob Tomlinson tweet media
English
1
5
24
5K
Nick Becker retweeted
RAPIDS AI
RAPIDS AI@RAPIDSai·
Working with BERTopic and not meeting your New Year's wait loss goals? This blog by @MaartenGron and the RAPIDS team on speeding up BERTopic on CPU, and going the next mile with GPU, will meet you where you are at to keep those resolutions strong. medium.com/rapids-ai/fast…
English
0
3
20
4.9K
Nick Becker
Nick Becker@becker_data·
@SNbarbat Are you experiencing a specific error or issue?
English
0
0
0
7
devx
devx@SantiDevX·
Does anyone know how to install rapids #cuML on an #AWS p3.2xlarge? Using the Deep Learning PyTorch image and Ubuntu 20.20
English
2
0
1
161
Nick Becker
Nick Becker@becker_data·
@tom_gxt @RAPIDSai @PyTorch You should now be able to smoothly combine the standard RAPIDS and PyTorch installation commands using the CUDA Toolkit version PyTorch wants. We now run tests every RAPIDS release to verify creating a joint environment works. This blog is separate, exciting new functionality!
English
0
0
0
11
Nick Becker retweeted
RAPIDS AI
RAPIDS AI@RAPIDSai·
Anyone interested in topic modeling or #NLP should check out BERTopic. The upcoming release will include deeper support for #RAPIDS and cuML to help you get results faster and process larger datasets (among other amazing updates)!
Maarten Grootendorst@MaartenGr

Preview #5: Heavyweight BERTopic! Last week was a lightweight update... this week a heavyweight💪 Better and more native integration of cuML's HDBSCAN in BERTopic! Use the full force of GPU acceleration to scale topic modeling to millions of documents. A preview thread👇🧵

English
0
3
9
2.4K
Nick Becker retweeted
Maarten Grootendorst
Maarten Grootendorst@MaartenGr·
Preview #5: Heavyweight BERTopic! Last week was a lightweight update... this week a heavyweight💪 Better and more native integration of cuML's HDBSCAN in BERTopic! Use the full force of GPU acceleration to scale topic modeling to millions of documents. A preview thread👇🧵
Maarten Grootendorst tweet media
English
4
14
101
14.5K