collapse

195 posts

@collapse_R

A C/C++-based package for advanced data transformation and statistical computing in R. Account managed by the author. #rcollapse

CRAN and GitHub · Joined June 2020
29 Following · 954 Followers
collapse (@collapse_R)
Recording of my talk on {collapse} and the {fastverse} at the Bank of Portugal's workshop on "Speeding up Empirical Research: Tools and Techniques for Fast Computing" in December is now online: youtu.be/qO5dHIPsfK8?si… #Rstats #DataScience
collapse (@collapse_R)
Updated Windows benchmarks for in-memory database-like operations by Adrian Antico show that {collapse} still leads on lagging and casting benchmarks (not covered in DuckDB benchmarks) and remains very competitive overall: github.com/AdrianAntico/B… #Rstats #DataScience
collapse (@collapse_R)
I've released a new package {flownet} for efficient transport modeling and graph manipulation: sebkrantz.github.io/flownet/. It builds on 7 {fastverse} libraries, most notably {collapse}, from which it imports 60 functions. Thus, another great learning resource for developers #rstats
collapse (@collapse_R)
@JosiahParry It's a never-ending hackathon, unfortunately. But it is still good to have rigorous C/C++ checks done for free on many platforms. It creates more robust software.
collapse (@collapse_R)
The {collapse} arXiv paper has just been updated - following extensive revision: arxiv.org/abs/2403.05038. I believe it is a great resource for anyone doing scientific computing with #rstats.
collapse (@collapse_R)
There is now a #fastverse benchmark wiki (github.com/fastverse/fast…) where users can freely contribute benchmarks. If you have benchmarks involving {fastverse} packages ({collapse}, {data.table}, etc., including extensions) please contribute them (takes 1 min) #rstats #DataScience
collapse (@collapse_R)
I just improved the vignette a bit further, adding some detailed benchmarks and a section on Global Options. I needed to correct myself: it is not true that {collapse} global options should never be invoked in packages - they just need to be reversed like #rstats global options.
collapse (@collapse_R)
It's nice to see an increasing number of #rstats packages using {collapse}. A developer-focused vignette was long planned and now it is here, with modest advice on writing efficient R package code in general and using {collapse} in particular: sebkrantz.github.io/collapse/artic…
collapse (@collapse_R)
@MislavSagovac @r_data_table Of course not. But {collapse} can do many things, particularly complex statistical things, that cannot be done with {data.table} or are more difficult to do or do efficiently with {data.table} - as shown in this post and further in the docs/resources: sebkrantz.github.io/collapse/artic…
collapse reposted
Data Table (@r_data_table)
Check out the latest package to be granted the Seal of Approval: {collapse} by Sebastian Krantz! {collapse} is a partner package that implements various data transformation and statistical analysis tasks using ultra-fast C/C++ implementations. rdatatable-community.github.io/The-Raft/posts…
collapse (@collapse_R)
{collapse} v2.0.15, with fast aggregation pivots, has just reached CRAN. A minor but neat feature worth pointing out in this release is enhanced join verbosity. In addition to the join success rates, the join relationship is now determined and reported - at no extra cost #rstats
collapse (@collapse_R)
@statquant @JosiahParry Agreed, it's not more apples to apples, but equally valid. DuckDB benchmarks max out all frameworks on a large Linux cloud server. This one compares performance on a local Windows system through IDEs and is thus closer to many users. For the 100M data they are quite similar…
statquant (@statquant)
@collapse_R @JosiahParry Why is it more apples to apples than other benchmarks? Always surprised by benchmarks; it's more difficult to write benchmarks than to write the library! In this family I recommend the kdb/shakti bench where they say they are 100x faster than polars: shakti.com (about)
collapse (@collapse_R)
New independent benchmark by Adrian Antico: github.com/AdrianAntico/B…
Setup:
- large local Windows machine
- real data
- broad range of tasks
- scripts executed inside RStudio and VS Code
-> shows that {collapse} is an absolute top performer in this setting #rstats #DataScience
collapse (@collapse_R)
@Yann_the3rd @JosiahParry Thanks, but I doubt that {collapse} is buggy. Many people are using it and most issues I get are feature requests. It simply does not support tidyselect syntax. across() inside fmutate() works fine. Please carefully read the docs and report any issues that you find on GitHub.
collapse (@collapse_R)
@JosiahParry No, not at the moment, but it uses pointers to access their contents.