Our project, Artists’ Books Holdings, is an attempt to analyze and visualize data about artists’ books holdings on an international scale. It is a work in progress created for LIS 664-Programming for Cultural Heritage (PFCH) at Pratt Institute. It illustrates our ability to work with data programmatically and to create visualizations that present that data in a more human-readable form.
Our project currently consists of four phases: scraping data, using the “request module” to get more information about the scraped records, cleaning the data, and creating visualizations based on the results. The first phase involves gathering the author names and titles of artists’ books. We knew beforehand that the NYARC consortium, particularly MoMA, has sizable holdings of artists’ books. As of November 2015, according to Arcade, there are 15,486 holdings in the libraries of the museums in the NYARC consortium. We used Beautiful Soup to parse the text of Arcade’s web pages, look for each author and title, and write that information out as JSON. The next step involved using the authors and titles of 14K+ artists’ books to request the holdings number and OCLC number from Classify, an API tool created by OCLC. This information was written out to a JSON file. The third step involved cleaning that data and putting it into a CSV file, which we could then work with to create data visualizations.
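To make the cleaning step more concrete, here is a minimal sketch of turning JSON records into a CSV file using only the Python standard library. It is not our actual script: the field names (`author`, `title`, `oclc`, `holdings`), the sample records, and the cleaning rules (trim whitespace, skip records with no OCLC number, drop repeated OCLC numbers) are all assumptions made for illustration.

```python
import csv
import io
import json

# Placeholder records standing in for the JSON written by the Classify step.
# The field names and values are made up for illustration only.
SAMPLE_JSON = """
[
  {"author": " Author A ", "title": "Title One", "oclc": "0001", "holdings": "12"},
  {"author": "Author A", "title": "Title One", "oclc": "0001", "holdings": "12"},
  {"author": "Author B", "title": "Title Two", "oclc": null, "holdings": null}
]
"""

FIELDS = ["author", "title", "oclc", "holdings"]


def clean_records(records):
    """Trim text fields, skip records without an OCLC number, drop repeats."""
    seen = set()
    cleaned = []
    for rec in records:
        oclc = rec.get("oclc")
        if not oclc or oclc in seen:
            continue  # no match was found, or this edition was already kept
        seen.add(oclc)
        cleaned.append({
            "author": (rec.get("author") or "").strip(),
            "title": (rec.get("title") or "").strip(),
            "oclc": str(oclc).strip(),
            "holdings": int(rec.get("holdings") or 0),
        })
    return cleaned


def write_csv(records, handle):
    """Write the cleaned records to any file-like object as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


cleaned = clean_records(json.loads(SAMPLE_JSON))
buffer = io.StringIO()  # an open file would work the same way
write_csv(cleaned, buffer)
csv_text = buffer.getvalue()
```

Writing to a file-like object keeps the sketch easy to test; in practice the same function could be handed a file opened with `open("holdings.csv", "w", newline="")`, where the file name is again only an example.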
We plan to run the OCLC numbers through WorldCat to find out the actual holding locations of the artists’ books. Our Classify script gives us the OCLC number for each edition as well as the holdings numbers, but not the locations. We are planning to work on another script that will run the OCLC numbers through WorldCat and then create a map visualization illustrating where the hubs for artists’ book collections are around the world.
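Since that script does not exist yet, the following is only a rough sketch of the aggregation a map visualization might start from: counting how many distinct OCLC numbers are held at each location. The input rows are placeholders, and the idea that each row pairs an OCLC number with a location string is an assumption about what the WorldCat step would eventually return.

```python
from collections import Counter

# Placeholder rows of (OCLC number, holding location). Real rows would come
# from the planned WorldCat script; these values are invented.
holdings_rows = [
    ("0001", "New York, US"),
    ("0001", "London, GB"),
    ("0002", "New York, US"),
    ("0003", "New York, US"),
    ("0003", "Berlin, DE"),
    ("0003", "Berlin, DE"),  # repeated row, counted once
]


def count_by_location(rows):
    """Count distinct OCLC numbers per location, ignoring repeated rows."""
    unique_pairs = set(rows)
    return Counter(location for _, location in unique_pairs)


location_counts = count_by_location(holdings_rows)
top_hubs = location_counts.most_common(1)  # the single largest hub
```

Counts like these could then be joined to coordinates for each location and passed to whatever mapping tool we end up choosing.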