As many of you might have heard, the weights of what is rumoured to be an experimental Llama-3 34B base model were leaked yesterday. Let’s go over what we know: it implements the BitNet architecture (https://arxiv.org/abs/2310.11453) and, according to some speculation, the leaked model was trained on somewhere between 10% and 40% of the training data.
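For anyone who hasn't read the BitNet paper, here is a rough sketch of the 1-bit weight idea as I understand it: weights are zero-centered, reduced to +1/-1 by sign, and the output is rescaled by the mean absolute weight. This is plain Python for illustration only. The function names `binarize_weights` and `bitlinear_forward` are made up for this example, the activation quantization and normalization steps from the paper are left out, and none of this is taken from the leaked weights or from llama.cpp.

```python
def binarize_weights(weights):
    """Binarize a 2D weight matrix (list of rows) to +1/-1.

    Illustrative sketch of BitNet-style weight binarization; the names
    here are hypothetical, not from any real library.
    Returns (binary_matrix, beta), where beta is the mean absolute value
    of the original weights, used to rescale the output.
    """
    flat = [w for row in weights for w in row]
    alpha = sum(flat) / len(flat)                 # mean, used to zero-center
    beta = sum(abs(w) for w in flat) / len(flat)  # scaling factor
    # Sign with the convention that values <= 0 map to -1.
    binary = [[1 if w - alpha > 0 else -1 for w in row] for row in weights]
    return binary, beta


def bitlinear_forward(x, binary, beta):
    """Multiply input vector x by the binarized matrix, then rescale by beta.

    Activation quantization is omitted here to keep the sketch short.
    """
    return [beta * sum(b * xi for b, xi in zip(row, x)) for row in binary]


if __name__ == "__main__":
    w = [[0.5, -0.3], [0.1, -0.7]]
    binary, beta = binarize_weights(w)
    print(binary, beta)
    print(bitlinear_forward([1.0, 2.0], binary, beta))
```

The point is that the matrix multiply only involves additions and subtractions of the inputs, with a single floating-point scale applied at the end.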
Luckily, I got my hands on the weights before the Twitter post with the magnet link was taken down, and got the model working on llama.cpp with some major tweaks. In my opinion, this model is amazing at logic and math (dare I say comparable to GPT-4), but I won’t hype it up too much before I finish my official benchmark tests. I quickly put together a Discord chatbot so people can try chatting with it. Even though this is speculated to be a base model, it is flawless at chatting too.
Anyways, I haven’t slept in like 24 hours so I gotta go take a nap. You can access the Discord bot that I mentioned here:
submitted by /u/AIEchoesHumanity