SF Nexus is an open access project. Data and resources provided here are free for everyone, including:
- Extracted Features: Disaggregated feature sets from copyrighted literature, available for research purposes
- Python Notebooks: Custom Jupyter notebooks in Google Colab environments for easy exploration of our data
- Documentation: Descriptions of pipelines used to digitize and analyze our dataset, from OCR cleaning to topic modeling and visualization
- Visualizations: Output generated from analyses of our dataset, including topic modeling and word embeddings
Overview of the SF Nexus
The SF Nexus comprises a collaborative network of research and public libraries with collections of SF, dedicated to making science fiction available online, including as data. While the SF Nexus project is based at Temple University’s Charles Library, we are committed to growing our collaborations with an SF-focused collective research community. This project presents a prototype of what could be developed as a large-scale collaborative digitization effort among the dozens of science fiction collections across England and North America, including but not limited to the members of the Science Fiction Collecting Libraries Consortium.
The current phase of this website is a demonstration project showing how libraries can digitize their copyrighted cultural collections and make them available as data. Our focus so far has been on sharing extracted features of the data, as well as documenting the corpus’ ingestion and curation in the HathiTrust Research Center. Additional projects at Temple Libraries involve developing localized data capsules for confidential computing access to copyrighted corpora, as well as novel ways of digitizing corpora under controlled circumstances.
The Core SF Archive: The Paskow Science Fiction Collection
Temple University Libraries, Special Collections Research Center
The Loretta C. Duckworth Scholars Studio has partnered with Temple University Libraries’ Special Collections Research Center (SCRC) and Digital Library Initiatives (DLI) to build a digitized corpus of copyrighted science fiction literature. Besides its voluminous Urban Archives, the SCRC also houses a significant collection of science-fiction literature. The Paskow Science Fiction Collection was originally established in 1972, when Temple acquired 5,000 science fiction paperbacks from a Temple alumnus, the late David C. Paskow.
Subsequent donations, including troves of fanzines and the papers of such sci-fi writers as John Varley and Stanley G. Weinbaum, expanded the collection over the last few decades, both in size and in the range of genres. SCRC staff and undergraduate student workers recently performed the usual comparison of gift titles against cataloged books, removing science fiction items that were exact duplicates of existing holdings. A refocusing of the SCRC’s collection development policy for science fiction de-emphasized fantasy and horror titles, so some titles in those genres were removed as well. From this set of deduplicated science fiction, a subset has been digitized over the last few years.
Digitizing the New Wave
The digitized SF books have been ingested into HathiTrust and can be viewed at Temple University’s collection page.
To see an exhibit of selected SF book covers from the collection with related metadata, and for more information on the history of this digitization project since it began in 2017, visit our Omeka S site, Digitizing the New Wave.
For an overview of our approach to data curation of literary texts, see Alex Wermer-Colan and James Kopaczewski’s article, “The New Wave of Digital Collections: Speculating on the Future of Library Curation” (2022).
Ultimately, the SF Nexus seeks to build and share a comprehensive dataset of science fiction literature. Due to limitations imposed by copyright, this project explores speculative approaches to data curation that can make elements of each book (extracted features) available to scholars seeking to engage in large-scale analysis of text as data.
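As a rough illustration of what "extracted features" can mean in practice, the sketch below reduces each page of a text to alphabetically sorted word counts, so that the readable word order is discarded while counts remain available for analysis. This is only a minimal sketch under stated assumptions: the function name `extract_features`, the output field names, and the sample pages are hypothetical and do not describe the project's actual pipeline or file format.

```python
import re
from collections import Counter


def extract_features(pages):
    """Reduce page texts to per-page word counts (hypothetical helper).

    Word order is discarded: each page keeps only its tokens, sorted
    alphabetically, with the number of times each one occurs.
    """
    features = []
    for number, text in enumerate(pages, start=1):
        # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts = dict(sorted(Counter(tokens).items()))
        features.append(
            {"page": number, "token_count": len(tokens), "tokens": counts}
        )
    return features


# Invented sample pages, not drawn from the collection.
sample_pages = ["The ship left the dock.", "The stars were silent."]
features = extract_features(sample_pages)
```

Counts of this kind can still support analyses such as topic modeling, even though the original sentences cannot be read back from them.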