Censorship is Killing ChatGPT
Also in today's edition: Jensen Huang: The Steve Jobs of Chip World, Life Transforming Mantras from Tech Leaders & Microsoft Minus OpenAI = ?
What happens when you try to integrate social values into chatbots? They break down. RLHF (Reinforcement Learning from Human Feedback), the technique used to align chatbots with human values, has stymied the platform's true potential. This phenomenon is known as the alignment tax of AI models.
A paper titled ‘Scaling Laws for Reward Model Overoptimization’ delves into this phenomenon: optimising a model against a learned RLHF reward model eventually produces biased outputs, where the proxy reward keeps climbing even as true quality falls. Hence, each round of RLHF fine-tuning can erode an LLM's underlying capabilities and cost it performance.
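The overoptimisation effect can be illustrated with a toy sketch (the reward functions below are invented for illustration and are not from the paper): a "gold" reward stands in for true human preference, while an imperfect learned proxy systematically over-rewards one direction. Optimising the proxy past a point lowers the gold reward it was meant to approximate.

```python
# Toy illustration of reward-model overoptimisation (Goodhart's law).
# Assumption: both reward functions are invented for this sketch;
# they are not taken from the paper.

def gold_reward(x):
    # "True" human preference: peaks at x = 1.0
    return -(x - 1.0) ** 2

def proxy_reward(x):
    # Imperfect learned reward model: systematically over-rewards large x
    return gold_reward(x) + 0.5 * x

def optimize(reward, x=0.0, lr=0.1, steps=200):
    # Simple finite-difference gradient ascent on the given reward
    eps = 1e-5
    for _ in range(steps):
        grad = (reward(x + eps) - reward(x - eps)) / (2 * eps)
        x += lr * grad
    return x

x_proxy = optimize(proxy_reward)  # overshoots to ~1.25
x_gold = optimize(gold_reward)    # settles near 1.0

# Pushing harder on the proxy lowers the gold reward it stood for
print(gold_reward(x_proxy) < gold_reward(x_gold))
```

In this sketch the proxy's optimum sits at x = 1.25 rather than 1.0, so the harder the policy chases the proxy score, the further it drifts from what humans actually prefer, mirroring the alignment tax described above.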
The Belamy | Weekly dose of best Tech stories