DeepSeek launched a free, open-source large language model in late December, claiming it was developed in just two months at a cost of under $6 million.
Not small but… smaller than you would expect.
Most companies aren’t, and shouldn’t be, training their own models, especially with stuff like RAG, where you can pair an already-trained model with your proprietary offline data and take only a minimal performance hit.
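For anyone who hasn't seen RAG up close, here is a rough toy sketch of the idea, not how any real product does it. Everything in it is an assumption for illustration: the `DOCS` list stands in for your proprietary data, and plain word-count cosine similarity stands in for a real embedding model and vector store. The point is only that the model itself is never retrained; you look up the relevant text and hand it over in the prompt.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a company's proprietary offline data.
DOCS = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "The VPN requires a hardware token issued by the IT department.",
    "Quarterly sales reports are due on the first Monday after quarter end.",
]


def vectorize(text):
    """Turn text into a bag-of-words count vector (toy substitute for embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = vectorize(query)
    ranked = sorted(docs, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=1):
    """Build the prompt that would be sent to an off-the-shelf model."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("How long do refunds take?", DOCS))
```

The string from `build_prompt` is what you would pass to whatever pretrained model you already use, which is why the only GPU cost here is inference.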
What matters is inference and accuracy/validity. Inference is ridiculously cheap (the reason AI/ML got so popular), and accuracy is a whole different can of worms that industry and researchers don’t want you to think about (in part because “correct” might still be blatant lies, since it is based on human data, which is often blatant lies, but…).
And for the companies that ARE going to train their own models? They make enough bank that ordering the latest Box from Jensen is a drop in the bucket.
That said, this DOES open the door back up for tiered training and the like, where someone might use a cheaper commodity GPU to enhance an off-the-shelf model with local data or preferences. But it is unclear how much industry cares about that.