Hacker News

Mistral opened their weights only for a very small LLaMA-like model.



Mistral just released the most powerful open weight model in the history of humanity.

How did they weaken their commitment to open weights?


No, they’ve released the weights for Mistral-small. They haven’t released the weights for Mistral-medium.

The raw weights are here: https://docs.mistral.ai/models/

Exactly, nice work BTW. And no hate for Mistral, they're doing great work, but let's not confuse weights-available with fully open models.

It's weird that more than a day after the weights dropped, there still isn't a proper announcement from Mistral with a model card. Nor is it available on Mistral's own platform.

Mistral would be the base model

Perhaps I'm missing something, but what's the USP of Mistral? As far as I can see, their models aren't competitive.

If you go to GroqChat (which is like a demo app), they offer Gemma, Mistral, and LLaMa. These are all open-weights models.

I think it does, as Mistral usually trains their own models; besides, they couldn't fully commercialise Mistral Medium if it were Llama 2 based.

Thank you. I thought it was weird for them to release a 7B model and not mention Mistral in their release.

> Mistral just released the most powerful open weight model in the history of humanity.

Well, yeah, it's very welcome, but 'history of humanity' is hyperbole given ChatGPT isn't even two years old.

> How did they weaken their commitment to open weights?

Before https://web.archive.org/web/20240225001133/https://mistral.a... versus after https://web.archive.org/web/20240227025408/https://mistral.a... the Microsoft partnership announcement:

> Committing to open models.

to

> That is why we started our journey by releasing the world’s most capable open-weights models

There were similar changes on their "About the company" page.


Announcing two new non-open-source models, and they won't even release the previous Mistral Medium? I did not expect... well, I did expect this, but I did not think they would pivot so soon.

To commemorate the change, their website appears to have changed too. Their title used to be "Mistral AI | Open-Weight models" a few days ago[0].

It is now "Mistral AI | Frontier AI in your hands." [1]

[0]https://web.archive.org/web/20240221172347/https://mistral.a...

[1]https://mistral.ai/


This. Mistral AI was also an underdog and released Mistral 7B and Mixtral 8x7B, but as soon as they got traction, they closed their models (e.g., Mistral Medium).

I had the wrong assumption that Mistral was built "on top of" Llama. Then again, I still come across sentences like "Mistral's models are based off on Meta's Llama".

I’ve been studying and tinkering with open weight LLMs since the original llama weights leaked. I’ve very recently become convinced that the true data and compute requirements needed to fine tune and produce an “unsafe” model are orders of magnitude less than what’s needed today. We are no more than a year away from anyone with a 4090 being able to fine tune their own mistral. The cat is out of the bag on this one.
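A rough back-of-envelope calculation shows why adapter methods like LoRA put fine-tuning within reach of a single consumer GPU. The sketch below assumes a 7B-parameter model with hidden size 4096 and 32 layers, and treats the four attention projections as square 4096x4096 matrices for simplicity (this ignores details like Mistral's grouped-query attention, so the numbers are illustrative, not exact):

```python
# Back-of-envelope: LoRA trainable parameters vs. full fine-tune
# for a Mistral-7B-sized model. Assumed shapes: hidden=4096,
# 32 layers, 4 attention projections treated as 4096x4096 each.
hidden = 4096
layers = 32
projections = 4   # q, k, v, o
rank = 16         # a commonly used LoRA rank

# Each LoRA adapter replaces a d_out x d_in weight update with two
# low-rank factors of shapes (d_out x rank) and (rank x d_in).
lora_params = layers * projections * rank * (hidden + hidden)
full_params = 7_000_000_000

print(f"LoRA trainable params: {lora_params:,}")      # ~16.8M
print(f"Full model params:     {full_params:,}")
print(f"Trainable fraction:    {lora_params / full_params:.4%}")
```

With these assumptions, the trainable footprint drops to roughly 0.24% of the full model, which is why a 24 GB card like a 4090 can handle it when the frozen base weights are also quantized.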

It depends on what's being evaluated, but from what I've read, Mistral is also fairly competitive at a much smaller size.

One of the biggest problems right now is that there isn't really a great way to evaluate the performance of models, which (among other issues) results in every major foundation model release claiming to be competitive with the SOTA.


Huh! Never mind then! I take it back. It would be interesting to see what kind of tuning they did, and to pit the model head-to-head with LLaMA-2-7B-chat. It seems they did just instruction tuning but not RLHF? So I assume Mistral won't refuse to answer, etc., and probably doesn't have many safety guardrails (I guess that's desirable for some!)

Mistral-medium has not been released yet.

Right, but they can just use Llama/Mistral for free, instead of their inferior models, which I'm sure take quite a bit of resources to train in the first place.
