Conversation
knotteye (knotteye@waldn.net)'s status on Wednesday, 09-Dec-2020 00:59:28 JST knotteye @Moon @georgia Thank you! I'm very interested in both AI and ethics so I really like talking about it.
Infected Moomin (moon@shitposter.club)'s status on Wednesday, 09-Dec-2020 00:59:29 JST Infected Moomin @knotteye @georgia You made a nuanced and thoughtful argument, the kind I think an AI ethicist would make; unfortunately it's not the one the authors made.
knotteye (knotteye@waldn.net)'s status on Wednesday, 09-Dec-2020 00:59:30 JST knotteye @georgia @Moon The AI’s ‘fluency’ in the kind of language used by MeToo and BLM activists is the same thing as under-represented language to the model though. It’s a small subset of the data that can either be emphasized or ignored. Training the model involves making decisions about what sets of data are allowed to matter already, unless you’re fine with feeding the thing poisoned data (which involves determining who is poisoning it and who is acting normally) and spam.
There’s no neutral position here (except for the one that perfectly matches my views, of course), just positions people have convinced themselves are neutral. Everyone training an AI makes decisions about what kinds of data get to matter to that AI. It’s better to be honest about that.
georgia (georgia@netzsphaere.xyz)'s status on Wednesday, 09-Dec-2020 00:59:31 JST georgia @knotteye @Moon yeah those are both good reasons. but that's not prescriptivism, it's better descriptivism
knotteye (knotteye@waldn.net)'s status on Wednesday, 09-Dec-2020 00:59:32 JST knotteye @georgia @Moon But there are perfectly legitimate reasons for stressing the importance of an under-represented dataset to a training model; you can't just assume that something is good because it's the majority. The very next paragraph points out a good reason for doing so:

> It will also fail to capture the language and the norms of countries and peoples that have less access to the internet and thus a smaller linguistic footprint online. The result is that AI-generated language will be homogenized, reflecting the practices of the richest countries and communities.

And later on, an example of AI being deficient in under-represented languages causing problems:

> The dangers are obvious: AI models could be used to generate misinformation about an election or the covid-19 pandemic, for instance. They can also go wrong inadvertently when used for machine translation. The researchers bring up an example: In 2017, Facebook mistranslated a Palestinian man’s post, which said “good morning” in Arabic, as “attack them” in Hebrew, leading to his arrest.
georgia (georgia@netzsphaere.xyz)'s status on Wednesday, 09-Dec-2020 00:59:33 JST georgia @Moon funny how linguistic prescriptivism is bigoted until it isn't
Infected Moomin (moon@shitposter.club)'s status on Wednesday, 09-Dec-2020 00:59:34 JST Infected Moomin https://www.technologyreview.com/2020/12/04/1013294/google-ai-ethics-research-paper-forced-out-timnit-gebru/

> shifts in language play an important role in social change; the MeToo and Black Lives Matter movements, for example, have tried to establish a new anti-sexist and anti-racist vocabulary. An AI model trained on vast swaths of the internet won’t be attuned to the nuances of this vocabulary and won’t produce or interpret language in line with these new cultural norms.

"we should be allowed to force top-down changes in language but AI based on how people actually talk doesn't let us do that"