OpenAI’s Post


We’re partnering with Reddit to bring its content to ChatGPT and new products:

OpenAI and Reddit Partnership

Kaushik Immadisetty

Early-Stage Startups | Rice University MBA '23 | Sales, Programming & Customer Development


Prompt: How do I make money? Reply: Short Gamestop.

Christoph Beck

Dr. phil. | Integrating AI solutions for editorial purposes


Dear ChatGPT, please provide me with 5 reasons why it is a bad idea to train a large language model (LLM) using data from Reddit.
1. **Quality and Reliability of Information**: Reddit content can vary significantly in quality, with many posts and comments containing misinformation, personal opinions, or unverified claims.
2. **Toxicity and Offensive Content**: Reddit hosts a wide range of communities, some of which contain toxic, offensive, or abusive language.
3. **Bias and Representativeness**: The demographic composition of Reddit users is not representative of the general population.
4. **Privacy and Consent Issues**: Reddit posts and comments are created by individuals who may not have consented to their data being used for training language models.
5. **Context and Coherence**: Reddit threads often involve fragmented and context-specific conversations. Training an LLM on such data may result in a model that struggles with maintaining context and coherence in its responses, as it may not effectively learn how to generate structured, logical, and contextually appropriate language.

Denis Jastrzebski

Director of CuriousHawk - Business Consultant & Growth Strategist | Creating Strategically Curious Business Solutions — AI Consultant


┐⁠(⁠ ⁠∵⁠ ⁠)⁠┌ I hope there's a lot of content filtering that will happen before OpenAI just assimilates all of Reddit in the end

Aleem Juma

Risk management leader, AI risk conference speaker


Whoever is rejoicing about this hasn't been on Reddit.

Harold Mansfield

Senior Level IT Professional Specializing in Cybersecurity


I've been a Plus subscriber since the beginning. This is a terrible idea. Have you guys actually used Reddit? It's 99.9% dumpster fire, infighting, mod egos, poor moderation, ignorance, racism, sexism, d**k jokes, fart jokes, re-posted AI images, spam, scams, and preteen boys posting every 5 minutes asking for help to get a girl to like them, or various posts trying to get people to describe disturbing, bizarre fetishes and sexual escapades to them. There are SO many kids on Reddit that Nickelodeon should be its main sponsor. How are you verifying the ages of Reddit's users and their content? Because they said so? Kids have been getting around that for YEARS! Seriously, read the content! In the world of garbage content to train on, Reddit is one step below Yahoo Answers. Adding Reddit as a training source is like Snapple p***ing in the lemonade. One of the MAIN reasons I stopped using Google is that Reddit is now the top link in almost every search. Not a knowledgeable article, or expert, or white paper. Reddit.

Mmm, not a fan of this news, for a few reasons/questions:
1. Reddit, with its predominantly young male user base, exhibits a clear male bias. While it does offer some valuable content, it also harbors a significant amount of toxicity, with certain groups promoting unconscious behavior that can potentially harm all AI users and negatively train the LLM in ways we may not be able to fix.
2. Reddit users signed up to the platform believing their data would not be 'sold' or shared without consent. Is "Partnership" the legal loophole for selling and sharing information from one platform to the next? Does this mean we are moving farther away from owning and securing personal, organizational, and State data? In the next 24 months, over 50 democratic countries are up for election. Much is at stake.
3. We know that one of the issues with LLMs in the future is finding good, reliable data to continue feeding the model. Is this "partnership" a sign that we are indeed in a race to the bottom, evolving LLMs without a clear purpose or thoughtful ways in which AI technology translates into real value for humanity?

Daniel Joyce

Seeking a position as a Staff, Tech Lead, or Principal Software Developer


Oh boy, I mean, are you really really sure this is a good idea?

Sergio M.

AWS SAA - GenAI Certified - Founder at Cleta


Uff, this is a reminder that any 'free' service is collecting and eventually monetizing your data.

Francois Legare

AI/ML Solutions Architect @ Bell | MBA


What ChatGPT-o had to say in the style of Wallstreetbets :D YO, FELLOW APES! 🚀🚀🚀 Did you catch that juicy news? OpenAI's teaming up with Reddit to pump more tendies into ChatGPT and new products! 💎🙌🦍 This collab is straight fire! 🔥🔥🔥 Imagine the power of all those glorious Reddit posts and comments fueling our rocket. We’re talking infinite meme potential, market analysis straight from the basement, and pure diamond-handed wisdom! 📈🌕 This is a game-changer, folks. The synergy is off the charts. It’s like loading up on $GME right before the moonshot! 🌝💸 This partnership is going to level up our gains, no doubt. 💪 Buckle up, we’re going to the stratosphere with this one! HODL tight and let’s see those rockets fly! 🚀🚀🚀 #YOLO #ToTheMoon #OpenAI #RedditCollab #InfiniteTendies

Jack Wilber

Indiana University | Kelley School of Business | B.S.B Finance | 2026


How will OpenAI ensure training data from Reddit’s API is of high quality? Will the AI decide on its own, or will there be supervised selection of data? There is no doubt an incredible amount of useful information on Reddit, but as with any online forum there is also a lot of garbage. I would be interested to hear what steps will be taken to make sure this data doesn’t compromise the accuracy of GPT’s replies.
