Computer scientists at Tel Aviv University are using artificial intelligence to gather the fragments of the world’s largest collection of medieval documents, the legendary Cairo Genizah, to tell the story of 1,000 years of Jewish history and culture. They have reconstructed more than 1,000 documents from 350,000 individual items found in the Cairo storage room: more in a few months than in 110 years of conventional scholarship.
They have decades to go before they are finished.
“The Genizah contains information about every single Jewish subject in the world — all learning,” said Rabbi Reuven Rubelow, manager of the Friedberg Genizah Project, which funds the research. “If it is holy, they kept it in this room.”
In some ways, the contents of the Cairo Genizah are more important than the Dead Sea Scrolls, several scholars believe. While the Dead Sea Scrolls were the religious literature of a small sect that lived in the desert for a few years, the Cairo Genizah tells the story of a millennium of Jewish life in its day-to-day details, from the mundane to the magnificent.
“What we have learned about Jewish culture and history… in the Muslim world in a century of research is unparalleled,” said Mark Cohen, professor of Near Eastern studies at Princeton University. That is especially true of the day-to-day life of the Jews.
“It’s like looking through a trash can outside your home,” said Phillip Lieberman, assistant professor of Jewish studies and law at Vanderbilt University. “I can tell a great deal about your life from what I find.”
What the Tel Aviv researchers are doing will revolutionize that search. While some of the archive includes complete letters, manuscripts and documents, much of it consists of fragments, some containing only a few words, or pages out of context. The fragments are spread out through 70 different libraries and museums around the world. One page of a letter could be in Oslo and another in Philadelphia.
Nachum Dershowitz and Lior Wolf of TAU’s Blavatnik School of Computer Science are taking the digitized documents and feeding them into computers to rejoin the parts.
Until now, researchers had to rely on serendipity to put fragments together: they would look at a document and remember that it resembled something they had seen somewhere else, Cohen said. Now computers can learn from their own experience which fragments fit with which. The more documents the computer sees, the better the algorithm gets, an attribute that A.I. scientists call machine learning. The project uses A.I. techniques that were developed over the past decade for other purposes and only recently brought to bear on the Cairo Genizah.
Although a genizah has been described as a “holy trash” dump, the word actually comes from the Persian, meaning “hoard” or “hidden treasure.” The practice of storing documents in a genizah derives from the Jewish idea that letters, like people, are alive and sacred. When they wear out, or “die,” they are to be treated with respect, especially if, like the Torah, they contain the words of God. They are eventually either buried or, as in the case of the Cairo Genizah, allowed to decay on their own.
Eventually, genizot became neutral receptacles for any community documents. The one in Cairo is by far the oldest and largest.