Date: 2023-02-22

The scientists set out to train a model on a dataset of georeferenced shapes of all known sites scattered throughout the southern Mesopotamian floodplain.

The model was built on pre-trained semantic segmentation models, fine-tuned on satellite images and site-shape masks drawn from a dataset of almost 5,000 examples. But the team encountered several problems.

“The dataset, while [it] may be considered a very large one for Near Eastern archaeology with its almost 5,000 sites, is hardly sufficient for training a model as large as the state-of-the-art ones we see in use today and, perhaps more significantly, contains many cases that are visible only on certain old imagery,” explained the researchers.

Using a human-in-the-loop approach

To solve these issues, the team used a human-in-the-loop approach, bringing in domain expertise during the training and evaluation phases of their experiments. That proved crucial in improving the dataset and, in turn, the model.

The outcome of this iterative process was a model that achieved a detection accuracy of around 80 percent.