Parameter Efficient Pre-Training: Comparison of ReLoRA and GaLore
Can we use parameter-efficient training methods to achieve PEFT-like efficiency gains during the pre-training stage too?
Text Machine Laboratory is a research group in machine learning for NLP, led by Anna Rumshisky. Our home is the University of Massachusetts Lowell. Head over to the group website for more information about the current group members, a full list of our projects, and contact information. Here you will find blog posts accompanying our recent papers, as well as other related musings.
Models of the BERT family are overall robust to pruning, but they have an Achilles heel: the outlier dimensions, without which the quality of the model drops significantly.
The world is filled with data. Can we learn from this data to generate something new?
QuAIL is a challenging new NLP benchmark that combines reading comprehension and commonsense reasoning.
BERT and its Transformer-based cousins are still ahead on all NLP leaderboards. But how much do they actually understand about language?