Two days after the Russian invasion, Kijas, the music librarian at Tufts University, called for a “data rescue session.” She had already planned to attend a virtual conference on March 5 for digital humanists and saw an opportunity to help Ukrainians.
Kijas wondered what she could do as a librarian familiar with digital tools. She had a list of specific libraries and archives in Ukraine with unique music collections. She decided she could help find these items and digitally archive the web pages.
Kijas’ tweet caught the attention of Dombrowski and his colleague Majstorovic. The two approached Kijas with an idea: what if they saved more than music collections – and did it at a faster rate?
Initially, the team planned to use the Internet Archive’s Wayback Machine to record information, but the Wayback Machine has a “shallow crawl of web pages,” Kijas said. It can record only the first or second layer of a website, so more complex functionality, such as interactive features and scripted code, may be lost.
To counter this problem, Majstorovic suggested the team use Webrecorder, an open-source tool that can drill down seven or eight layers into a website and capture some of the most complex media. The open-source nature of the tool also meant that volunteers could directly edit the code and modify the tool to fit their specific needs.
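To make the idea of crawl "layers" concrete, here is a minimal sketch of a depth-limited crawl. It assumes a layer means one link hop away from the start page, and it runs on a small made-up site map rather than the live web. The `crawl` function, the `SITE` dictionary, and the page names are all hypothetical; this is not how the Wayback Machine or Webrecorder is actually implemented.

```python
from collections import deque


def crawl(site, start, max_depth):
    """Return the pages reachable from `start` within `max_depth` link hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in site.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen


# Hypothetical site map: each page links one level deeper than the last.
SITE = {
    "home": ["collections"],
    "collections": ["music"],
    "music": ["scores"],
    "scores": ["score-viewer"],
    "score-viewer": [],
}

shallow = crawl(SITE, "home", max_depth=2)  # stops two hops from the home page
deep = crawl(SITE, "home", max_depth=8)     # reaches every page in this small site
```

In this toy site, the shallow crawl never reaches the `scores` or `score-viewer` pages, which is the kind of loss the team was trying to avoid.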
“From there, once we started promoting our project and getting a lot of volunteers, people started asking, ‘Okay, what other tools can we use?’” Kijas said. “We don’t have a budget either, do we? So we only do this as volunteers, and everyone who helps is a volunteer.”
Browsertrix Crawler is another popular tool among volunteers and is particularly useful for capturing websites with more advanced features such as 3D tours and timelines. Bucknell University librarian Carrie Pirmann, who has described herself as “reasonably tech-savvy,” installed the tool on her home and work computers in early March. At the office, she often runs the crawler in the background while she works on other tasks. She has since crawled more than 300 websites.
“That’s kind of where my library and detective skills really came into play,” Pirmann said. “I don’t speak Ukrainian. I don’t read Cyrillic.”
Pirmann is also part of the Situation Monitoring team, where she uses Google Maps and library directories to identify heritage websites that may need archiving. One of her favorite websites among those she has archived belongs to the public library in Bohodukhiv, a small town in Ukraine that has a sister-city relationship with Pirmann’s hometown of Boyertown, Pennsylvania.
“I kind of started calling them digital scrapbooks,” Pirmann said, explaining that these library websites hold not only information about the libraries and their book collections but also extensive photo albums of library events for children. Pirmann hopes to one day return the archived website to the library.
“Maybe people can find pictures of their children. It may be one of the few places where if your apartment or house was bombed and you lost everything, you might still be able to go back to that archive and find a picture or two of your children,” said Pirmann.
Web crawling is only one part of the archiving process; producing metadata is another. Kim Martin, assistant professor of history and cultural and technological studies at the University of Guelph, works on the metadata team. When an item is uploaded to the Internet Archive, the upload produces basic metadata such as a title, description, and URL. Useful metadata, however, goes beyond these basics.
“So we not only needed the URL of a JPEG image, but we needed to show which page it was linked from, we needed to give it context,” Martin explained. The team now has 16 metadata fields that they populate regularly, making conscious decisions about whether to fill in each field in English, Ukrainian, or Russian.
“We all learn from each other and really take on tasks and move forward as a group, which is quite interesting because none of us knew each other [before],” said Martin.
From Lewisburg, Pennsylvania, to Vienna, Austria, SUCHO volunteers coordinate all of their work through Slack. There’s never a dull moment there, with channels ranging from those for rapid response teams to one just for sending memes.
“Someone found a Wild West restaurant in Ukraine that had the most fanciful description of cheese sticks as sheriff’s snacks, and almost like a micro-story in the description of the cheese sticks menu,” Dombrowski recalled. It is these little moments that help sustain SUCHO volunteers in their work.