🃏Joker@sh.itjust.works to Artificial Intelligence@lemmy.world · English · 14 hours ago

Study: Some language reward models exhibit political bias

Research from the MIT Center for Constructive Communication finds some language reward models exhibit political bias, even when the models are trained on factual data.

Is it possible to train reward models to be both truthful and politically unbiased?

This is the question that the CCC team, led by PhD candidate Suyash Fulay and Research Scientist Jad Kabbara, sought to answer. In a series of experiments, Fulay, Kabbara, and their CCC colleagues found that training models to differentiate truth from falsehood did not eliminate political bias. In fact, they found that reward models optimized this way consistently showed a left-leaning political bias, and that this bias becomes greater in larger models. “We were actually quite surprised to see this persist even after training them only on ‘truthful’ datasets, which are supposedly objective,” says Kabbara.

vzq@lemmy.world · 14 hours ago
Maybe it’s because a certain end of the political spectrum JUST LIES ALL THE TIME?
