Microsoft LASERs away LLM inaccuracies



During the January Microsoft Research Forum, Dipendra Misra, a senior researcher at Microsoft Research Lab NYC and AI Frontiers, explained how Layer-Selective Rank Reduction (or LASER) can make large language models more accurate. 

With LASER, researchers can “intervene” in a model and replace one of its weight matrices with a smaller approximation. Weights are the contextual connections a model makes; the heavier the weight, the more the model relies on it. So does swapping a matrix that holds more correlations and context for a smaller approximation make the model less accurate? Based on the team’s test results, the answer, surprisingly, is no.
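To make the idea of swapping a weight matrix for a smaller approximation concrete, here is a minimal sketch. The article does not say how the approximation is computed, so this assumes a truncated singular value decomposition, a standard way to build a low-rank approximation. The function name `low_rank_approx`, the matrix sizes, and the chosen rank are all hypothetical and are not taken from Misra's talk.

```python
import numpy as np


def low_rank_approx(weight: np.ndarray, rank: int) -> np.ndarray:
    """Approximate `weight` by keeping only its `rank` largest singular values.

    Assumption: truncated SVD stands in for the "approximate smaller" matrix
    the article describes; the source does not spell out the method.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Scale the kept left singular vectors by their singular values,
    # then project back with the kept right singular vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# A random stand-in for one layer's weight matrix (hypothetical sizes).
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))

W_approx = low_rank_approx(W, rank=8)

# Relative error introduced by discarding the smaller singular values.
rel_error = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(W_approx.shape, np.linalg.matrix_rank(W_approx), rel_error)
```

The approximation has the same shape as the original matrix but much lower rank, so it can be stored as two thin factors instead of one full matrix. In a real model, the choice of which layer to modify and how far to reduce its rank would have to be made by measuring the model's loss or accuracy after each intervention, as the results below suggest.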

“We are doing intervention using LASER on the LLM, so one would expect that the model loss should go up as we are doing more approximation, meaning that the model is going to perform bad, right, because we are throwing out information from an LLM, which is trained on large amounts of data,” Misra said. “But to our surprise, we find that if the right type of LASER intervention is performed, the model loss doesn’t go up but actually goes down.”

Misra said his team successfully used LASER on three different open-source models: RoBERTa, Llama 2, and Eleuther’s GPT-J. He said accuracy at times improved by 20 to 30 percentage points. For example, GPT-J’s accuracy at predicting gender from biographies went from 70.9 percent to 97.5 percent after a LASER intervention, a gain of 26.6 percentage points.
