Is the tokenizer the same? It may "work" without actually working optimally until llama.cpp patches in support for it.
And the instruct model was just uploaded.