Colloquium Polaris 04/18/2024

April 18, 2024 at 2:00 pm

Speaker: Philippe Gambette

Using and enriching Wikisource, Wikidata and Wikipedia for open and inclusive science.

The Wikisource collaborative digital library can be used, alongside other electronic libraries such as Project Gutenberg, as a source of texts for research projects in natural language processing or digital humanities. If it is to be used as a source of textual data for creating ‘corpora of convenience’, we need to be aware of possible biases in the platform’s content, particularly gender bias among its authors. We will show how, using approaches similar to those put in place by the sans pagEs collective on Wikipedia, it is possible to assess these biases, in particular by using the Wikidata collaborative database, and then to remedy them. We will present several initiatives undertaken as part of research projects at Gustave Eiffel University, in partnership with the association Le deuxième texte, to enrich the corpus with texts written by women. Finally, we will illustrate how Wikidata can also be used to make research data available, by serving as a pivot database, as part of an open science approach.

Amphi Ircica – 50 avenue Halley – Haute Borne – Villeneuve d’Ascq
