Discussion about this post

Bill Newman

Hey Anant! Great article!

It got me thinking about whether specific high-quality training data would be more valuable, and thus more economically viable, for smaller curated models than for the big providers (e.g., OpenAI). With GPT-4 trained on trillions of tokens, it could be hard for a data provider like the NYT to earn sustainable revenue from it; however, if it were easy to license NYT data to build a model meant to be an expert essay editor, maybe that could work? Also, will the future business model of the NYT and other content creators resemble Reddit's shift to charging for the data they produce?
