It’s easy to be doom-and-gloom about big businesses building big data-scraping AI to consolidate even more control and wealth, given that the odds are unfortunately in their favor.
However, AI presents yet another technological turning point for the public, much like mass mechanical automation did. So how might communities build their own AI to help themselves? How might they use these tools to retain and make more use of their own accumulated data, rather than continue to give it up to big businesses?
I don’t think it is possible, yet.
AI is still at the big money, big technical investment stage. It will be a decade or more before what you are talking about will be possible.
Aren’t there already a few free and open source tools available though? That’s a part of what inspired this question tbh.
The codebases are free, but the training sets are not. To have intelligence like you see in GPT-4 you need a lot of training data that is expensive to put together.
Honestly, if I, the underdog, want to utilize AI for my goals, my best bet is to pay $20/mo for the AI from OpenAI.
I thought the main obstacle was the computing power needed to update 175 billion parameters against large datasets. You could probably train a decent LLM just on Wikipedia, but I think it requires a room full of expensive video cards to do.
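To put some rough numbers on that, here’s a back-of-envelope sketch using the common “FLOPs ≈ 6 × parameters × training tokens” rule of thumb from the scaling-law literature. The parameter and token counts below are the widely reported GPT-3-scale figures, and the GPU throughput is an optimistic assumption, not a measurement:

```python
# Back-of-envelope training compute for a GPT-3-scale model.
# Rule of thumb: total training FLOPs ~= 6 * N (parameters) * D (tokens).

params = 175e9   # GPT-3-scale parameter count
tokens = 300e9   # tokens GPT-3 was reportedly trained on

flops = 6 * params * tokens  # ~3.15e23 FLOPs total

# Assume a single high-end GPU sustaining ~100 teraFLOP/s in mixed
# precision (optimistic; real-world utilization is lower).
gpu_flops_per_sec = 100e12
gpu_seconds = flops / gpu_flops_per_sec
gpu_years = gpu_seconds / (3600 * 24 * 365)

print(f"total training FLOPs: {flops:.2e}")
print(f"single-GPU years at 100 TFLOP/s: {gpu_years:.0f}")
```

Even under these generous assumptions, that works out to on the order of a century of single-GPU time, which is why training at this scale means either a large cluster or a much smaller model.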
Isn’t training data simply data? If a community were to agree to pool their data together to enable the AI, wouldn’t that bypass the cost issue? Or is this one of those situations where the amount of data required thoroughly demonstrates how much businesses have arguably stolen from the public, and in turn no community may produce sufficient data to enable their AI tools to the same degree?