Natural Language Processing and Digital Humanities

TAL Journal: special issue NLP and Digital Humanities (60-3)


TAL Journal: 2019, Volume 60, Number 3

Editors: Jean-Gabriel Ganascia, Francesca Frontini

Call for papers

Digital Humanities (DH) is today a rapidly expanding field. While its boundaries are at times difficult to identify and are constantly redefined (Dacos and Mounier, 2015; Terras et al., 2013; Ganascia, 2015), its impact on the humanities, i.e. the disciplines that study human culture and/or human achievements, cannot be overstated. Indeed, easy access to digital resources, and in particular the digitization of content and the way computers process it, is transforming the humanities and leading to the emergence of new scholarly practices. Since much of this content, whether in literature, philosophy, archaeology or history, comes in textual form, Natural Language Processing (NLP) techniques are potentially of great benefit to the Digital Humanities.

DH and present-day NLP research both stem from a common tradition, that of “Literary and Linguistic Computing” (Hockey, 2004). Indeed, most researchers trace the origins of DH to Roberto Busa’s Index Thomisticus, a seminal project started in 1949 that aimed to use computers to automatically create an index of Thomas Aquinas' Summa Theologica. Today, the area that we may call “text-based Digital Humanities” still constitutes a large subfield of DH.

However, while current NLP research typically develops around well-identified tasks of varying degrees of complexity (such as syntactic labelling, lemmatization, stemming, named entity recognition, syntactic parsing, information extraction, question answering, text summarization, ...), DH applies NLP techniques and methods as scholarly tools and uses them in complex research scenarios, which may range from the acquisition to the annotation and analysis of texts, and may involve unstructured collections as well as highly encoded digital editions. Therefore, while progress in NLP is expected to have positive implications for humanities research, the ultimate challenge from a DH perspective is not only to improve the performance of NLP tools per se, but also to use them for innovative research that can truly advance disciplinary knowledge in the different fields of the humanities. In addition, corpus size may differ considerably in DH, from large digitized libraries of hundreds of thousands of books, unfortunately too often noisy, to tiny sets of tens to hundreds of texts.

Alongside these differences in goals, a further problem lies in the wide variety and complexity of the texts to be processed. While NLP research does not ignore the need to adapt tools and methods to different text types, registers and genres, the kinds of texts commonly treated in DH research often pose, by their very nature, an additional challenge for current tools and algorithms. In particular, historical documents, which record older varieties of language, and literary texts may pose problems both linguistically and because of the complexity of their content.

Despite, or rather thanks to, the aforementioned issues, DH applications can serve as an ideal test bed for evaluating the latest advances in Natural Language Processing.

This special issue of the TAL journal will be devoted to original contributions at the crossroads between DH and NLP, with a special focus on projects in which NLP tools are developed and/or applied to annotating, processing and studying textual content for the purpose of humanities research.

The disciplines covered will include all fields of the humanities, from literature and philosophy to anthropology and history. All aspects and levels of analysis in written text processing may be involved, such as:

corpus creation, digitization, transcription
automatic enrichment and annotation
advanced corpus querying and exploration
automatic text analysis

Contributions may concern the following areas (non-exhaustive list):

mono- or multilingual text alignment
identification of text similarities, authorship attribution, text clustering
annotation of references to works, individuals or fictional characters
extraction and annotation of themes and topics
extraction of recurring linguistic patterns and traits for the purpose of linguistic and stylistic analysis
detection of borrowings or re-uses
adaptation of NLP tools to historical texts and languages
automatic knowledge extraction for the purpose of creating domain ontologies in any field of humanities
tools for textual genetics
exploration of large quantities of text for the purpose of exploring intertextuality or linguistic variation
exploration of large quantities of text for the identification of cultural and/or historical trends
...

Theoretical and perspective articles will be considered, provided that they are based on the authors' previous research and projects or on existing experience, and that they clearly show their contribution to NLP and DH.

TO NOTE

IMPORTANT DATES

Communication of intention to submit: April 15, 2019
Submission deadline: May 15, 2019
Notification to the authors after first review: July 15, 2019
Notification to the authors after second review: October 31, 2019
Final version: November 30, 2019
Publication: beginning of 2020

THE JOURNAL

TAL (Traitement Automatique des Langues / Natural Language Processing) is an international journal published by ATALA (French Association for Natural Language Processing, http://www.atala.org) since 1959 with the support of CNRS (National Centre for Scientific Research). It has moved to an electronic mode of publication, with printing on demand.
