Llama 3 debuted on April 18th. It feels like a year's worth of progress has happened since. There was a 4-bit bnb quant out within 4 hours of the release. Meta and the model wall… what's the point? The second one person gets access, the model is essentially public…
- The Q4_K_M quant produces excellent chat/chat-instruct output.
- Far superior conversational skills to GPT-4 in my limited testing. More personable and "casual" feeling, even at low temps.
- Q4_K_M fits on 2×4090s with room to spare.
- Very large context sizes and fine-tunes are already well represented on HF (< 2 weeks in).
- Papers have already been written about the quantization quality!
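A quick back-of-envelope sanity check on the 2×4090 claim (assumptions: the 70B parameter count, ~4.85 bits per weight as a rough average for Q4_K_M, and ignoring KV cache and activation overhead):

```python
# Rough VRAM estimate for a 70B model quantized to Q4_K_M.
# ~4.85 bits/weight is an assumed average, not an exact figure.
PARAMS = 70e9
BITS_PER_WEIGHT = 4.85

weights_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9   # bits -> bytes -> GB
total_vram_gb = 2 * 24                            # two RTX 4090s, 24 GB each
headroom_gb = total_vram_gb - weights_gb          # left over for KV cache etc.

print(f"weights ~ {weights_gb:.1f} GB, headroom ~ {headroom_gb:.1f} GB")
```

So the weights alone land around 42 GB, leaving a few GB of the 48 GB total for context, which matches the "room to spare" observation.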