Haoyu Cheng

Telomere-to-telomere de novo assembly from standard ONT reads (LSK114, Simplex). A really exciting advance—makes high-quality assembly practical for population-scale sequencing! Preprint from @ChengChhy, @lh3lh3 and colleagues biorxiv.org/content/10.110…

Haoyu Cheng@ChengChhy·18 Apr

New preprint on hifiasm (ONT)! We can now achieve near T2T human genome assembly using only ONT Simplex reads—in just half a day, with or without ultra-long sequencing. biorxiv.org/content/10.110…

Mike Vella@vellamike

Haoyu Cheng@ChengChhy·28 Nov

@XLR @lh3lh3 @vellamike @nanopore The new version of hifiasm performs phasing, uses improved full-length base-level alignment rather than window-based alignment, and considers base quality.

Armin Töpfer@XLR·28 Nov

@lh3lh3 @vellamike @ChengChhy @nanopore Sorry. Brain fu, yes I meant hifiasm. Thank you.

Mike Vella@vellamike·27 Nov

Exciting news! The latest hifiasm release from @ChengChhy and @lh3lh3 adds beta support for @nanopore simplex R10 reads. Initial results look very promising. 🚀 Check it out: github.com/chhylp123/hifi…

Haoyu Cheng retweeted

Heng Li@lh3lh3·28 Nov

The latest hifiasm can directly assemble standard @nanopore simplex R10 reads, without HERRO correction or other preprocessing, to phased contigs of contiguity comparable to HiFi assembly. Like before, you can further add ultra-long, Hi-C or trio data for better assembly.

Haoyu Cheng@ChengChhy·27 Nov

@XLR @vellamike @lh3lh3 @nanopore It needs FASTQ files but has no other requirements.

Armin Töpfer@XLR·27 Nov

@vellamike @ChengChhy @lh3lh3 @nanopore What was required to get it working? The commit message is not rich

Haoyu Cheng@ChengChhy·27 Nov

@mprous1 Then I have no idea. Probably just give it a try. Hifiasm won't take too much time.

Marko Prous@mprous1·27 Nov

@ChengChhy I've got some datasets with very poor N50, 2 kb for example, so dorado correct is not an option. Assemblies would certainly not be very good in these cases, but would hifiasm be better than flye? Or would hifiasm not work at all with such short reads?
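
For readers unfamiliar with the read N50 figures quoted in this thread (2 kb here, about 30 kb in the reply), the following is a minimal sketch of how a read-length N50 is usually computed: the length L such that reads of length L or longer hold at least half of all sequenced bases. The function name `read_n50` and the toy lengths are illustrative assumptions, not taken from hifiasm or from the thread.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths (0 for empty input)."""
    ordered = sorted(lengths, reverse=True)  # longest reads first
    half_total = sum(ordered) / 2            # half of all bases
    running = 0
    for length in ordered:
        running += length
        # The first read that pushes the running sum past half defines N50.
        if running >= half_total:
            return length
    return 0


# Toy read lengths in bases (hypothetical, for illustration only).
toy_lengths = [30_000, 20_000, 10_000, 5_000, 2_000, 2_000, 1_000]
print(read_n50(toy_lengths))
```

A dataset dominated by short reads pulls this value down even if a few long reads are present, which is why a 2 kb N50 signals a very different input from the roughly 30 kb N50 datasets mentioned in the reply.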

Haoyu Cheng@ChengChhy·27 Nov

Hifiasm 0.21.0 has been released. It now has a beta module for direct assembly of ONT R10 simplex reads. Initial tests with regular simplex reads show very promising results! github.com/chhylp123/hifi…

Haoyu Cheng@ChengChhy·27 Nov

@mprous1 We tested several datasets with about 30kb N50, and hifiasm worked well.

Marko Prous@mprous1·27 Nov

@ChengChhy What's the minimum read length?

Haoyu Cheng@ChengChhy·6 Sep

My new lab at @YaleBIDS is looking for a couple of postdocs, students and RAs in bioinformatics, genomics, machine learning and related fields (postdocs.yale.edu/postdoctoral-a…). Heartfelt thanks to my mentor @lh3lh3 and collaborators for their incredible support!

Haoyu Cheng retweeted

Heng Li@lh3lh3·4 Sep

Preprint on "BWT construction and search at the terabase scale". We can compress 100 human genomes to 11GB in 21 hours, find SMEMs with it, do affine-gap alignment and retrieve similar local haplotypes. 7.3Tb commonly sequenced bacterial genomes ⇒ 30GB arxiv.org/abs/2409.00613

Haoyu Cheng@ChengChhy·27 Jul

@basti_beier @HumanPangenome Oh yes, that is a typo that should be fixed… Thanks for pointing that out!

Sebastian Beier@basti_beier·27 Jul

@ChengChhy @HumanPangenome I was speaking about this part, see picture. There you specify -m1000000 for 100kb, but the command would actually filter out all reads shorter than 1Mb, not 100kb (similarly with -m500000 and the 50kb limit).
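
The unit mismatch described here can be illustrated with a small sketch. It assumes only that the `-m` value is a minimum read length in bases, as the tweet implies; the filtering tool itself is not named in the thread, so `filter_by_min_length`, the constants, and the toy reads below are hypothetical stand-ins rather than the command from the supplement.

```python
KB = 1_000
MB = 1_000_000


def filter_by_min_length(reads, min_len):
    """Keep (name, sequence) pairs whose sequence has at least min_len bases."""
    return [(name, seq) for name, seq in reads if len(seq) >= min_len]


# Toy reads with placeholder sequences of 60 kb, 120 kb and 1.2 Mb.
reads = [
    ("r1", "A" * (60 * KB)),
    ("r2", "A" * (120 * KB)),
    ("r3", "A" * (1_200 * KB)),
]

# Intended threshold: 100 kb is 100000 bases.
intended = [name for name, _ in filter_by_min_length(reads, 100 * KB)]

# Threshold as written in the supplement: 1000000 bases is 1 Mb.
as_written = [name for name, _ in filter_by_min_length(reads, 1 * MB)]

print(intended)
print(as_written)
```

With the 1 Mb threshold only the single longest toy read survives, which matches the concern that the printed command would discard nearly all ultra-long reads rather than only those below 100 kb.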

Haoyu Cheng@ChengChhy·7 Jun

Excited to share our new t2t assembly algorithm for diploid and polyploid genomes! Using 132 assembled haplotypes from the @HumanPangenome, we show that our approach is cost-efficient, robust, and could achieve t2t assemblies without high-coverage reads arxiv.org/abs/2306.03399

Haoyu Cheng@ChengChhy·27 Jul

@basti_beier @HumanPangenome We actually used the UL reads >=50kb/100kb for the assembly. Only a very small fraction of UL reads is longer than 500kb/1Mb.

Sebastian Beier@basti_beier·27 Jul

@ChengChhy @HumanPangenome Very nice preprint, just went through it and stumbled upon the settings in the supplements of filtering ultra-long reads. Looks like the commands there would display i) reads above 1 Mb ii) 500 kb instead of 100 kb / 50 kb. Is this just a typo?

Haoyu Cheng@ChengChhy·7 Jun

Many thanks to those who made this work possible! @asri_mobin Julian Lucas @sergekoren @lh3lh3

Haoyu Cheng@ChengChhy·9 Feb

@erikgarrison @XLR @nomad421 @lh3lh3 I checked the conda recipes: github.com/bioconda/bioco…, but it seems there is nothing to set

Erik Garrison@erikgarrison·9 Feb

@XLR @nomad421 @lh3lh3 @ChengChhy The problem is possibly because of a failure to use the correct set of SIMD instructions. Are you using multiple dispatch or some kind of fat binaries to deal with this?

𝕐@nomad421·9 Feb

Most frustrating bioconda issue encountered thus far; program builds cleanly on conda under linux and MacOS, when you pull it down it runs normally on the data but produces (very) wrong results ... wtaf?!

Haoyu Cheng@ChengChhy·23 Nov

@jpelbers @lh3lh3 We would be happy to give it a try if there is UL data for the metagenome, but it sounds like UL data is not easy to produce...

Jean Elbers@jpelbers·22 Nov

@ChengChhy @lh3lh3 I guess I really only looked at the abstract here of this article - ncbi.nlm.nih.gov/pmc/articles/P… from 2019.

Haoyu Cheng@ChengChhy·18 Nov

New hifiasm with the ultra-long integration is released! We tested it with four diploid human samples and got many T2T chromosomes. Any feedback will be much-appreciated @lh3lh3. Source code: github.com/chhylp123/hifi…

Haoyu Cheng@ChengChhy·23 Nov

@subgenomes @lh3lh3 @PacBio @nanopore It is compatible with the old bin files, but it would still be better to rerun the whole workflow from the raw reads. We would also be interested in whether it works for a polyploid genome (although some parameters might need tuning).

Daniela Miller@subgenomes·22 Nov

@lh3lh3 @PacBio @nanopore Big news! Can it incorporate UL reads to a preexisting HiFi assembly or need to assemble raw? I'm working with a polyploid genome that I'd love to test this out on @ChengChhy @lh3lh3

Heng Li@lh3lh3·18 Nov

Hifiasm HiFi+UL integration is ready for beta testing. This new mode takes @PacBio HiFi and @nanopore ultra-long reads as input and produces longer phased contigs. It also works with trio or Hi-C for chromosome-long phasing. Add option --ul to provide UL reads.
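
As a rough illustration of the `--ul` option mentioned above, the sketch below assembles a hifiasm command line in Python without running it. Only `--ul` comes from this tweet; `-o` (output prefix) and `-t` (threads) follow hifiasm's usual command-line usage, and the helper name `hifiasm_ul_command` and all file names are hypothetical placeholders.

```python
import shlex


def hifiasm_ul_command(hifi_reads, ul_reads, prefix, threads=32):
    """Build (but do not run) a hifiasm argument list for HiFi + UL input."""
    return [
        "hifiasm",
        "-o", prefix,         # output prefix (placeholder value)
        "-t", str(threads),   # number of threads
        "--ul", ul_reads,     # ultra-long reads, per the tweet above
        hifi_reads,           # HiFi reads as the positional input
    ]


# Placeholder file names; substitute real paths before running anything.
cmd = hifiasm_ul_command("hifi.fq.gz", "ul.fq.gz", "sample.asm")
print(shlex.join(cmd))
```

Building the argument list separately from execution makes it easy to inspect the command first; it could then be passed to `subprocess.run(cmd, check=True)` on a machine where hifiasm is installed.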

Haoyu Cheng@ChengChhy

Haoyu Cheng@ChengChhy·22 Nov

@iskander @lh3lh3 @PacBio @nanopore Yes, it is similar to the error correction in theory.

alex rubinsteyn@iskander·21 Nov

@lh3lh3 @PacBio @nanopore Would this also act as a form of error correction on nanopore reads (eg with sufficient coverage can I think of the phased contigs as reads with lower indel error?)
