Wasim A Iqbal, PhD

1.7K posts

@waz_147

Newcastle Upon Tyne, England · Joined August 2011
392 Following · 304 Followers
Pinned Tweet
Wasim A Iqbal, PhD @waz_147
I am excited to share a new paper. It demonstrates that machine learning can predict the kinetics of numerous plant Rubiscos. This will ease bioengineering efforts and may allow species-specific parameterization of global photosynthesis models. 1/5 @JXBot doi.org/10.1093/jxb/er…
Wasim A Iqbal, PhD @waz_147
Alhamdulillah starting my first permanent role tomorrow. Excited to work on projects helping to reduce inequality in Newcastle using A.I.
Wasim A Iqbal, PhD retweeted
Dr. Ryan Thompson @RyanMicroBio
Very happy to have submitted my PhD thesis today; it would not have been possible without the support of my wonderful supervisor @MonteroCalasanz. Likewise, very pleased to have been awarded three months of post-submission funding by @UniofNewcastle to continue my doctoral research.
Wasim A Iqbal, PhD @waz_147
If you summarized the boomers using a single photo
ₕₐₘₚₜₒₙ @hamptonism
The universal approximation theorem states that a feed-forward neural network with a single hidden layer can approximate any continuous function on a compact set to any desired precision.
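A minimal sketch of what this looks like in practice (not part of the original tweet; the hidden width, target function, and training settings are arbitrary illustrative choices): a single hidden layer with a tanh activation fit to a continuous function on a compact interval.

```python
# Illustrative sketch: a one-hidden-layer network approximating sin(x) on the
# compact interval [-pi, pi]. Width, learning rate, and step count are
# arbitrary; widening the hidden layer drives the error down further.
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.linspace(-torch.pi, torch.pi, 1024).unsqueeze(1)  # compact set
y = torch.sin(x)                                            # continuous target

model = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

print(f"final MSE on the grid: {loss.item():.6f}")
```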
Rohan Paul @rohanpaul_ai
Google's tuning playbook is a really comprehensive guide to systematically maximizing the performance of deep learning models.
Wasim A Iqbal, PhD @waz_147
@predict_addict Pretty shitty move. Maybe release a dissertation against this work instead of publicly ousting someone's hard work.
Valeriy M., PhD, MBA, CQF @predict_addict
Never ask a woman her age, a man his salary, or a Cambridge machine learning department why it wastes taxpayer funds on frameworks that neither work nor scale, like Gaussian processes or Bayesian deep nets. #bayesianism
Cameron R. Wolfe, Ph.D. @cwolferesearch
Recently, I’ve run hundreds of instruction tuning experiments with LoRA/QLoRA, and I wanted to share some (basic) code and findings that might be useful…

The code (see replies) contains an instruction tuning script using LoRA/QLoRA and the Alpaca dataset, as well as evaluation code that uses the test set from Vicuna. The repo contains scripts for both training and observing model output. Most of my experiments were run with Mistral-7B using a 2x3090 GPU workstation (the full training script takes a few hours to complete).

When running instruction tuning experiments with LoRA, I started to observe some practical takeaways that I found to be (relatively) useful and generalizable across several models and datasets…

1. Using too high of a rank for LoRA typically leads to overfitting. A low rank (r=8/16) seems to be sufficient in most cases.
2. Adding dropout to the LoRA adapters didn’t do much to prevent overfitting in my experience.
3. For both LoRA and QLoRA, adding LoRA adapters to all linear layers in the model seemed to yield the best performance.
4. Given a properly tuned learning rate, the best performance was typically achieved using a constant learning rate schedule with a short (e.g., 2%-5% of iterations) warmup period. Using a cosine decay schedule for the learning rate did not improve performance much and led to worse overfitting in certain cases.
5. Adding a small weight decay (e.g., 1e-4) helps with overfitting.
6. Performing two training epochs can yield better performance in certain cases, but going beyond two epochs (e.g., three epochs) nearly always causes overfitting.
7. Sufficiently large batch sizes (e.g., around 64 or 128) are important for training stability (if you don’t have enough GPU memory, just use gradient accumulation!). Batch sizes of 8 or 16 led to chaotic training curves and prevented convergence in some cases.
8. In general, observing model outputs on a variety of evaluation sets (e.g., held-out Alpaca examples, the Vicuna evaluation set, or hand-written prompts) was way more informative than tracking training/evaluation metrics.

One other interesting observation is that finetuning with LoRA (as opposed to QLoRA) is not always simple on consumer GPUs (e.g., 3090s), even with smaller LLMs. When finetuning Mistral-7B on the Alpaca dataset, I had to use a reduced sequence length (64-128 tokens) during training to avoid running out of memory (and I still hit sporadic OOMs). I’m not sure if other packages (e.g., LitGPT) better manage memory, but I was personally surprised that LoRA finetuning was non-trivial for a 7B model in bfloat16 on a 3090 GPU.
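A hypothetical sketch, not the author's actual script, of how takeaways 1-7 might map onto a Hugging Face peft LoraConfig and transformers TrainingArguments; the model name, learning rate, and exact hyperparameter values are illustrative assumptions, and the dataset pipeline and Trainer call are omitted.

```python
# Hypothetical sketch: mapping the takeaways above onto peft/transformers
# configuration objects. All specific values are illustrative assumptions.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.bfloat16
)

# Takeaways 1-3: low rank, no adapter dropout, adapters on all linear layers.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)

# Takeaways 4-7: constant LR with a short warmup, small weight decay,
# two epochs, and an effective batch size of ~64 via gradient accumulation.
args = TrainingArguments(
    output_dir="lora-alpaca",
    learning_rate=2e-4,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.03,
    weight_decay=1e-4,
    num_train_epochs=2,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,  # 8 x 8 = effective batch of 64
    bf16=True,
)
```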
Isomorphic Labs @IsomorphicLabs
We're excited to announce #AlphaFold 3 with @GoogleDeepMind in @Nature: our new AI model for predicting biomolecule structures with unprecedented breadth and accuracy. Expanding beyond proteins to tackle DNA, RNA, small molecules to fuel advances in biology & drug design 🧵
Valeriy M., PhD, MBA, CQF @predict_addict
KAN is awesome and works exactly as described in the paper. MLPs struggle to approximate many functions, while KAN by design combines Kolmogorov-Arnold ideas with the best of what MLPs can offer. The result is awesome. Colab in the original post by @milos_ai #KAN
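For illustration only (this is not the paper's pykan implementation): a Kolmogorov-Arnold-style layer puts a learnable univariate function on every edge and sums the results, instead of applying a fixed nonlinearity at every node. The sketch below parameterizes each edge function with a small Fourier basis as a simplification of the learnable B-splines used in the paper.

```python
# Illustrative KAN-style layer: one learnable univariate function per edge,
# parameterized here by Fourier features (an assumption; the paper uses
# learnable B-splines on a grid).
import torch
import torch.nn as nn

class KANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, n_freq=8):
        super().__init__()
        # One set of basis coefficients per (output, input) edge.
        self.coef = nn.Parameter(torch.randn(out_dim, in_dim, 2 * n_freq) * 0.1)
        self.register_buffer("freqs", torch.arange(1, n_freq + 1).float())

    def forward(self, x):                        # x: (batch, in_dim)
        arg = x.unsqueeze(-1) * self.freqs       # (batch, in_dim, n_freq)
        basis = torch.cat([torch.sin(arg), torch.cos(arg)], dim=-1)
        # phi_{o,i}(x_i) = sum_k coef[o, i, k] * basis_k(x_i); then sum over i.
        return torch.einsum("bif,oif->bo", basis, self.coef)

# A two-layer KAN-style network, analogous to width=[2, 5, 1] in the paper's notation.
model = nn.Sequential(KANLayer(2, 5), KANLayer(5, 1))
print(model(torch.rand(32, 2)).shape)  # torch.Size([32, 1])
```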
dr. jack morris @jxmnop
one of the most important things I know about deep learning I learned from this paper: "Pretraining Without Attention"

this is what I found so surprising: these people developed an architecture very different from Transformers called BiGS, spent months and months optimizing it and training different configurations, only to discover that at the same parameter count, a wildly different architecture produces identical performance to transformers

this may imply that as long as there are enough parameters, and things are reasonably well-conditioned (i.e., a decent number of nonlinearities and connections between the pieces), then it really doesn't matter how you arrange them, i.e., any sufficiently good architecture works just fine

i feel there's something really deep here, and we may already be very close to the upper bound of how well we can approximate a given function with a certain amount of compute. so we should spend more time thinking about other questions, such as what that function should actually look like (what data? which objective function?) and how to make it more efficient