Ethical Sourcing of Datasets for LLMs

One of the key issues with LLMs (large language models) is the ethical sourcing of the datasets used for training. This topic is especially tough given how Bay Area VC-funded startups operate through regulatory entrepreneurship: they find a lucrative market, break the law, then get the law changed afterward. We have seen this with Uber, Airbnb, and other startups, and now with OpenAI.
The Coming Insurgency against LLMs

A battle is coming over the unauthorized use of content. Here are some of the emerging ideas. This article will be updated frequently.
Background

My background is unusually suited to the LLM problem: I currently work in the AI domain outside of Big Tech, so I can hold any view I want without worrying about censorship. I also have film credits and spent years in technical roles on big-budget Hollywood films and live TV.