Haoyu Cheng

60 posts

@ChengChhy

Joined October 2018
260 Following · 377 Followers
Haoyu Cheng@ChengChhy·
New preprint on hifiasm (ONT)! We can now achieve near T2T human genome assembly using only ONT Simplex reads—in just half a day, with or without ultra-long sequencing. biorxiv.org/content/10.110…
Mike Vella@vellamike

Telomere-to-telomere de novo assembly from standard ONT reads (LSK114, Simplex). A really exciting advance—makes high-quality assembly practical for population-scale sequencing! Preprint from @ChengChhy, @lh3lh3 and colleagues biorxiv.org/content/10.110…

Haoyu Cheng@ChengChhy·
@XLR @lh3lh3 @vellamike @nanopore The new version of hifiasm performs phasing, uses improved full-length base-level alignment rather than window-based alignment, and considers base quality.
Haoyu Cheng retweeted
Heng Li@lh3lh3·
The latest hifiasm can directly assemble standard @nanopore simplex R10 reads, without HERRO correction or other preprocessing, to phased contigs of contiguity comparable to HiFi assembly. Like before, you can further add ultra-long, Hi-C or trio data for better assembly.
Mike Vella@vellamike

Exciting news! The latest hifiasm release from @ChengChhy and @lh3lh3 adds beta support for @nanopore simplex R10 reads. Initial results look very promising. 🚀 Check it out: github.com/chhylp123/hifi…

Haoyu Cheng@ChengChhy·
@mprous1 Then I have no idea. Probably just give it a try. Hifiasm won't take too much time.
Marko Prous@mprous1·
@ChengChhy I've got some datasets with a very poor N50, 2 kb for example, so dorado correct is not an option. Assemblies would certainly not be very good in these cases, but would hifiasm be better than flye? Or would hifiasm not work at all with such short reads?
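For readers unfamiliar with the N50 figures quoted in this thread: N50 is the length L such that reads of length at least L together contain at least half of all sequenced bases. A minimal sketch of that standard definition follows; the function name `n50` and the toy lengths are illustrative assumptions, not taken from hifiasm or from any dataset mentioned here.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    Illustrative helper only: walks the lengths from longest to shortest
    and returns the first length at which the running total reaches
    half of all bases. Returns 0 for an empty input.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Hypothetical toy read lengths in bases (not real data).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

An N50 of 2 kb versus 30 kb therefore says that half of the bases sit in reads of at least 2 kb versus at least 30 kb, which is why the two datasets discussed here may behave very differently.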
Haoyu Cheng@ChengChhy·
Hifiasm 0.21.0 has been released. It now has a beta module for direct assembly of ONT R10 simplex reads. Initial tests with regular simplex reads show very promising results! github.com/chhylp123/hifi…
Haoyu Cheng@ChengChhy·
@mprous1 We tested several datasets with about 30kb N50, and hifiasm worked well.
Haoyu Cheng@ChengChhy·
My new lab at @YaleBIDS is looking for a couple of postdocs, students and RAs in bioinformatics, genomics, machine learning and related fields (postdocs.yale.edu/postdoctoral-a…). Heartfelt thanks to my mentor @lh3lh3 and collaborators for their incredible support!
Haoyu Cheng retweeted
Heng Li@lh3lh3·
Preprint on "BWT construction and search at the terabase scale". We can compress 100 human genomes to 11GB in 21 hours, find SMEMs with it, do affine-gap alignment and retrieve similar local haplotypes. 7.3Tb commonly sequenced bacterial genomes ⇒ 30GB arxiv.org/abs/2409.00613
Sebastian Beier@basti_beier·
@ChengChhy @HumanPangenome I was speaking about this part, see picture. There you specify -m1000000 for 100kb, but the command would actually filter out all reads shorter than 1Mb, not 100kb. (Similarly with -m500000 and the 50kb limit.)
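The unit mismatch raised here comes down to simple arithmetic: 100 kb is 100,000 bases, while 1,000,000 bases is 1 Mb. The tweet does not name the filtering tool, so the following is only a plain-Python sketch of a minimum-length filter; `filter_by_min_length` and the toy reads are hypothetical and do not reproduce the commands in the preprint's supplement.

```python
KB = 1_000       # bases per kilobase
MB = 1_000_000   # bases per megabase


def filter_by_min_length(reads, min_len):
    """Keep (name, sequence) pairs whose sequence has at least min_len bases.

    Hypothetical helper for illustration; it is not the tool from the thread.
    """
    return [(name, seq) for name, seq in reads if len(seq) >= min_len]


# Hypothetical toy reads of 60 kb, 120 kb and 1.2 Mb.
reads = [
    ("r1", "A" * 60_000),
    ("r2", "A" * 120_000),
    ("r3", "A" * 1_200_000),
]

# A 100 kb threshold corresponds to a minimum length of 100000.
kept_100kb = [name for name, _ in filter_by_min_length(reads, 100 * KB)]

# A minimum length of 1000000 is a 1 Mb threshold and drops the 120 kb read.
kept_1mb = [name for name, _ in filter_by_min_length(reads, 1 * MB)]
```

With these toy reads, the 100 kb threshold keeps `r2` and `r3`, while the 1 Mb threshold keeps only `r3`.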
Haoyu Cheng@ChengChhy·
Excited to share our new t2t assembly algorithm for diploid and polyploid genomes! Using 132 assembled haplotypes from the @HumanPangenome, we show that our approach is cost-efficient, robust, and could achieve t2t assemblies without high-coverage reads arxiv.org/abs/2306.03399
Haoyu Cheng@ChengChhy·
@basti_beier @HumanPangenome We actually used the UL reads >=50kb/100kb for the assembly. Only a very small fraction of UL reads are longer than 500kb/1Mb.
Sebastian Beier@basti_beier·
@ChengChhy @HumanPangenome Very nice preprint, just went through it and stumbled upon the settings in the supplement for filtering ultra-long reads. It looks like the commands there would display reads above i) 1 Mb and ii) 500 kb, instead of 100 kb / 50 kb. Is this just a typo?
Erik Garrison@erikgarrison·
@XLR @nomad421 @lh3lh3 @ChengChhy The problem is possibly because of a failure to use the correct set of SIMD instructions. Are you using multiple dispatch or some kind of fat binaries to deal with this?
𝕐@nomad421·
Most frustrating bioconda issue encountered thus far; program builds cleanly on conda under linux and MacOS, when you pull it down it runs normally on the data but produces (very) wrong results ... wtaf?!
Haoyu Cheng@ChengChhy·
@jpelbers @lh3lh3 We would be happy to give it a try if there is UL data for the metagenome, but it sounds like UL data is not easy to produce...
Haoyu Cheng@ChengChhy·
New hifiasm with the ultra-long integration is released! We tested it with four diploid human samples and got many T2T chromosomes. Any feedback will be much appreciated @lh3lh3. Source code: github.com/chhylp123/hifi…
Haoyu Cheng@ChengChhy·
@subgenomes @lh3lh3 @PacBio @nanopore It is compatible with the old bin files, but it would still be better to rerun the whole workflow from the raw reads. We would also be interested in whether it works for polyploid genomes (although some parameters might need tuning).
Heng Li@lh3lh3·
Hifiasm HiFi+UL integration is ready for beta testing. This new mode takes @PacBio HiFi and @nanopore ultra-long reads as input and produces longer phased contigs. It also works with trio or Hi-C for chromosome-long phasing. Add option --ul to provide UL reads.
Haoyu Cheng@ChengChhy

New hifiasm with the ultra-long integration is released! We tested it with four diploid human samples and got many T2T chromosomes. Any feedback will be much appreciated @lh3lh3. Source code: github.com/chhylp123/hifi…

alex rubinsteyn@iskander·
@lh3lh3 @PacBio @nanopore Would this also act as a form of error correction on nanopore reads? E.g., with sufficient coverage, can I think of the phased contigs as reads with lower indel error?
Haoyu Cheng@ChengChhy·
@jpelbers @lh3lh3 Not sure if the ultra-long sequencing could work for the metagenome sequencing…