Update - Microsoft AI Research 02.05.2025 reported
Low-bit quantization reduces model size, enabling efficient execution on edge devices (phones) and reducing energy consumption (battery life) ... 2-bit 7B Llama.
https://www.microsoft.com...661d39138dab32f86767 Next up? Microsoft has squeezed an LLM down to 2 bits. If Kroger's "Lower than low" commercial had its say, we'd soon be looking at a 1-bit (sign bit) LLM, a retro nod to the sign-bit processing of 1973 making a futuristic comeback!
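To put those bit counts in perspective, here is a back-of-envelope sketch of weight storage alone for a nominal 7B-parameter model. The function name and the decimal-gigabyte convention are illustrative assumptions; activations, caches, and any per-block scale factors that real quantization schemes store are ignored.

```python
# Weights-only storage estimate at different bit widths.
# Assumption: exactly `bits_per_weight` bits per parameter, no metadata overhead.

def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


params_7b = 7e9  # nominal parameter count for a "7B" model
for bits in (16, 8, 4, 2, 1):
    print(f"{bits:>2}-bit: {weight_footprint_gb(params_7b, bits):.3f} GB")
```

Under these assumptions the weights take about 14 GB at 16-bit, 1.75 GB at 2-bit, and under 1 GB at 1-bit, which is the gap that makes phone-class devices plausible targets.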
Bits and Pieces history - recall that the IBM-360 / 370 mainframes used to process seismic data had 16-bit; not until the IBM-390 could we finally handle full 32-bit floating-point data. For true amplitudes and amplitude-versus-offset analysis to infer rock properties, that is a must.
LLM quantization vs sign-bit* processing comparison - I can't help but make the following comparison when it comes to the number of bits needed to make LLMs run faster and cheaper today, just as in seismic data processing in 1973.
A) On one hand, here is a post on LLM infrastructure optimization 08.23.2024 from Google.
https://cloud.google.com/...r-llm-serving-on-gke Quantization uses fewer bits: "newer model checkpoints are already published in 16-bit precision," and some even drop down to 4-bit. "Recommendation - 1: Use quantization to save memory and cost. If you use less than 8-bit precision, do so only after evaluating model accuracy."
B) On the other hand, here is a paper on using sign bits in GEOPHYSICS vol. 38, issue 6, December 1973.
https://library.seg.org/doi/10.1190/1.1440394 "The statistical properties of sign‐bit semblance were such that this system could do a velocity analysis and an interpretation with no human intervention. In this mode of operation it yielded state‐of‐the‐art accuracy at greatly increased speed and with greatly reduced storage requirements."
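For readers who have not met semblance before, the idea in B) can be sketched on synthetic data. The code below uses the standard semblance ratio (stacked energy over N times total energy) and compares it on full-precision traces and on the same traces reduced to their sign bits. The synthetic gather, trace counts, and noise level are assumptions chosen for illustration; this is not a reconstruction of the 1973 paper's system.

```python
import math
import random


def semblance(traces):
    """Standard semblance: stacked energy / (N * total energy), in [0, 1]."""
    n = len(traces)
    num_samples = len(traces[0])
    stacked_energy = 0.0
    total_energy = 0.0
    for t in range(num_samples):
        column = [trace[t] for trace in traces]
        stacked_energy += sum(column) ** 2
        total_energy += sum(v * v for v in column)
    return stacked_energy / (n * total_energy)


def to_sign_bits(traces):
    """Keep only polarity: +1 for non-negative samples, -1 otherwise."""
    return [[1 if v >= 0 else -1 for v in trace] for trace in traces]


def make_gather(num_traces, num_samples, signal_amp, rng):
    """Synthetic gather: a shared sinusoid plus independent Gaussian noise."""
    return [
        [signal_amp * math.sin(2 * math.pi * t / 25) + rng.gauss(0.0, 1.0)
         for t in range(num_samples)]
        for _ in range(num_traces)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
aligned = make_gather(24, 200, 1.0, rng)     # coherent event present
noise_only = make_gather(24, 200, 0.0, rng)  # no coherent event

for name, gather in (("aligned", aligned), ("noise only", noise_only)):
    full = semblance(gather)
    sign = semblance(to_sign_bits(gather))
    print(f"{name:>10}: full precision {full:.3f}, sign-bit {sign:.3f}")
```

In this toy setup the sign-bit semblance still separates the coherent gather from pure noise, even though each sample is stored in one bit. For pure noise both versions sit near 1/N, the value expected for N uncorrelated traces.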
Q: So can we use sign bits for LLM quantization in all manner of GenAI training?
That will be one for the IMAGE audience to take on.
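As a possible starting point for that experiment, here is a minimal sketch comparing naive per-tensor uniform quantization at 8 and 4 bits with a sign-bit version that keeps only polarity plus one shared scale (the mean magnitude, which minimizes squared error for a sign-only code). The Gaussian stand-in weights and the function names are assumptions for illustration, not any published LLM method.

```python
import math
import random


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization with one scale per tensor (bits >= 2)."""
    if bits < 2:
        raise ValueError("use quantize_sign_bit for 1-bit")
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit, 7 for 4-bit
    step = max(abs(w) for w in weights) / levels
    return [round(w / step) * step for w in weights]


def quantize_sign_bit(weights):
    """1-bit code: sign of each weight times the mean magnitude."""
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights]


def relative_error(original, approx):
    """Root-sum-square error relative to the norm of the original vector."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, approx)))
    den = math.sqrt(sum(a * a for a in original))
    return num / den


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # stand-in for one layer

for bits in (8, 4):
    err = relative_error(weights, quantize_uniform(weights, bits))
    print(f"{bits}-bit uniform: relative error {err:.3f}")
print(f"sign-bit: relative error "
      f"{relative_error(weights, quantize_sign_bit(weights)):.3f}")
```

For Gaussian-distributed values the sign-bit reconstruction error comes out around 0.6 of the weight norm, far above the 8-bit and 4-bit figures. Whether a model tolerates that is exactly the accuracy evaluation the Google recommendation asks for.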
*Copilot assist - a sign bit is a specific use of a one-bit representation, but not all one-bit representations necessarily serve as sign bits.
Corroborating reading - readers of "Lack of bits and bytes is what could make or break your AI/ML project" [1] may find one bit [2] an enticing experiment.
[1] https://www.techradar.com...ak-your-aiml-project
[2] Samuel Allen - https://wiki.seg.org/wiki/Samuel_Allen Sam Allen (past president of SEG) co-founded a new company, Geophysical Systems Corp., whose existence is based on an entirely new recording concept, sign bit recording. This revolutionary technique, conceived and implemented by Sam and his company, is making 1000-channel recording practical.