Censorship is Killing ChatGPT
Also in today's edition: Jensen Huang: The Steve Jobs of Chip World, Life Transforming Mantras from Tech Leaders & Microsoft Minus OpenAI = ?
What happens when you try to integrate social values into chatbots? They break down. RLHF (Reinforcement Learning from Human Feedback), the technique used to align chatbots with human values, has stymied the platform's true potential. This phenomenon is known as the alignment tax of AI models.
A paper titled ‘Scaling Laws for Reward Model Overoptimization’ delves into this phenomenon: optimising a model against a learned RLHF reward model eventually produces biased outputs, where the proxy reward keeps climbing even as true quality falls. Hence, each round of RLHF fine-tuning can erode an LLM's underlying capabilities and cost it performance.
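The overoptimisation effect can be illustrated with a toy sketch (the reward functions below are invented for illustration and are not from the paper): a "gold" reward stands in for true human preference, while an imperfect learned proxy systematically over-rewards one direction. Optimising the proxy past a point lowers the gold reward it was meant to approximate.

```python
# Toy illustration of reward-model overoptimisation (Goodhart's law).
# Assumption: both reward functions are invented for this sketch;
# they are not taken from the paper.

def gold_reward(x):
    # "True" human preference: peaks at x = 1.0
    return -(x - 1.0) ** 2

def proxy_reward(x):
    # Imperfect learned reward model: systematically over-rewards large x
    return gold_reward(x) + 0.5 * x

def optimize(reward, x=0.0, lr=0.1, steps=200):
    # Simple finite-difference gradient ascent on the given reward
    eps = 1e-5
    for _ in range(steps):
        grad = (reward(x + eps) - reward(x - eps)) / (2 * eps)
        x += lr * grad
    return x

x_proxy = optimize(proxy_reward)  # overshoots to ~1.25
x_gold = optimize(gold_reward)    # settles near 1.0

# Pushing harder on the proxy lowers the gold reward it stood for
print(gold_reward(x_proxy) < gold_reward(x_gold))
```

In this sketch the proxy's optimum sits at x = 1.25 rather than 1.0, so the harder the policy chases the proxy score, the further it drifts from what humans actually prefer, mirroring the alignment tax described above.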
The Belamy | Weekly dose of best Tech stories