Mapping the Journeys of Australian World War I Soldiers using Named Entity Recognition


DHA 2025: Short Talk


Abstract:

During World War I, Australian soldiers trained in either Egypt or England before serving at Gallipoli, on the Western Front, or in the Middle East. Despite these shared locations and a common path to war, each soldier's journey was individual. It depended on a variety of factors, including assignment to different units or roles, leave, injury, being taken prisoner, and death. These individual journeys are recorded in the soldiers' war diaries. The aim of this work is to automatically map each soldier’s journey through the war using named entity recognition (NER). The diaries used in this analysis come from the State Library of New South Wales. The initial step is to extract locations using NER and then clean them. This is difficult for several reasons, including misspellings, shortened location names (e.g. “Aus” instead of “Australia”), generic locations (such as “town”), and the extraction of non-locations by the NER model. This paper will discuss our work to date in using NER and dealing with these issues. We will also present initial ideas for geocoding the extracted locations. Geocoding gives us the coordinates of a location so that it can be mapped easily. However, it can be difficult because many locations share names (e.g. there is a Liverpool in both Australia and the UK). Furthermore, diarists may have mentioned locations they did not visit. Here, we hope that the temporal information in the diaries can help us correctly identify each location.
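
To illustrate the cleaning problems listed above, the following is a minimal sketch of how location strings returned by an NER step might be normalised. It is not the pipeline used in this work. The lookup tables (`ABBREVIATIONS`, `GENERIC_TERMS`, `KNOWN_PLACES`), the function names, and the similarity cutoff are all hypothetical, and the sketch assumes the NER output is already available as a list of raw strings. Misspellings are handled here with fuzzy matching from Python's standard `difflib` module, which is only one of several possible approaches.

```python
from difflib import get_close_matches

# Hypothetical lookup tables for illustration only; real lists would be
# built from the diaries and a gazetteer.
ABBREVIATIONS = {"aus": "Australia", "eng": "England"}
GENERIC_TERMS = {"town", "camp", "village", "hospital", "front"}
KNOWN_PLACES = ["Australia", "England", "Egypt", "Gallipoli", "Cairo", "Liverpool"]


def clean_location(raw, cutoff=0.8):
    """Return a normalised place name, or None if the mention should be dropped."""
    # Trim whitespace and trailing punctuation left over from extraction.
    text = raw.strip().strip(".,;:").strip()
    if not text:
        return None
    key = text.lower()

    # Generic locations such as "town" cannot be mapped, so discard them.
    if key in GENERIC_TERMS:
        return None

    # Expand shortened names such as "Aus".
    if key in ABBREVIATIONS:
        return ABBREVIATIONS[key]

    # Exact match against known places, ignoring case.
    lowered = {place.lower(): place for place in KNOWN_PLACES}
    if key in lowered:
        return lowered[key]

    # Fuzzy match to catch likely misspellings (assumed cutoff of 0.8).
    matches = get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    if matches:
        return lowered[matches[0]]

    # Unknown names are kept unchanged so they can be reviewed manually.
    return text


def clean_locations(raw_mentions):
    """Clean a list of raw NER location strings, dropping discarded mentions."""
    cleaned = (clean_location(mention) for mention in raw_mentions)
    return [place for place in cleaned if place is not None]
```

For example, `clean_locations(["Aus", "town", "Galipoli"])` would expand the abbreviation, drop the generic term, and correct the misspelling. Non-locations extracted in error and places that share a name (such as the two Liverpools) are not addressed by this sketch; those would need the diary's context, for instance its dates.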