Evaluating Feature Steering in AI to Mitigate Social Biases

Anthropic has published new research on "feature steering," a technique for mitigating social biases in AI models. The method is intended to help AI developers and businesses deploy more ethical and socially responsible systems in areas such as hiring, education, and law enforcement.