Considering Scholarly Readers
In our interviews with readers, our main objective was to listen to them free of assumptions, understand how they work, hear their concerns and frustrations, and identify ways that digital reading technology might effectively support their academic pursuits.
Scholarly readers are those whose reading is done in pursuit of new knowledge, primarily, but not exclusively, in professional academic settings. We focused on professors and graduate students who engage in research, teaching, and publishing.
We were quickly reminded that scholarly readers don’t read or annotate for a single purpose. An interviewee explained,
“For me, it’s more a question of the function the reading that I’m currently doing has for me, and I have sort of three layers of engagement. There is—in print—there’s a reading for teaching, there is a reading for understanding and things that I need to use for my research. And then there is reading digitally, which I usually actually use if I really need a text for a research project.”
Scholars engage in a wide range of activities with scholarly materials. According to our interviews, existing digital solutions address most of these activities poorly and, to some extent, even hinder them.
Print vs. Digital
We discovered that scholars don’t favour print or digital per se, but rather use a mix of formats depending on the type of reading they are doing. For deep reading, most interviewees chose print. For one interviewee, this was just a matter of habit:
“I never embraced the digital note-taking phenomenon. It may well be that I was just too old when it came along. Sure it looks cool, but I never saw the value added. So, I guess the theory is you have a PDF, you can write notes on the PDF, you can save those written notes. I guess it does make sense, I just never adopted it. And I think it’s just because when that came around, by that time, I was at least thirty-five years old, and I had been a graduate student for ten years. And you continue with something you feel comfortable with, right?”
However, others would choose digital in certain contexts. For instance, digital was preferred when travelling, as scholars could take their collections with them; for skimming content during a shallow read; for searching keywords to quickly assess the relevance of a text; for locating quotations when writing; and, most importantly, for keeping a copy on hand for quick reference while writing. While there was a consensus that “there is something about the printed text that’s really wonderful,” there was also agreement on the value of digital formats, during the writing process in particular. One interviewee described their process, saying,
“So, up until this point, up until the Ph.D., I really sided with electronic forms of printed materials, or of literature. And that was specifically because I was really mining from various sources, and I needed to— I realized that how I put together my ideas was in kind of gathering quotations from here and there, and the easiest way to do that without completely transcribing a book was to be able to copy and paste.”
To our surprise, PDF emerged as the favoured digital format among all the interviewees. One explained:
“When it comes to digital, I use PDFs. I never use EPUBs, I never use, you know, Kindle, and so on. Largely, because this problem that I think then, that the location gets saved in some digital page number. But I do read quite a bit, in PDF form.”
The preference for PDF might stem from the similarity of the PDF layout to that of the printed book, or from the fact that the majority of digital and digitized monographs are available only in this format. The advantages of EPUB, let alone networked HTML books, are not apparent, partly, we surmise, because the tools for reading digital texts are so limited.
Not Just Monographs
While our research was focused primarily on scholarly monographs, when it came to investigating reading behaviours, the conversations and responses encompassed reading in all sorts of formats, not just books. Monographs certainly figured, but so did primary sources, essays, criticism, reviews, articles, and journals, and often only individual chapters of books were used.
“We have materials that are very hard to find, manuscripts. Manuscripts, bits of manuscripts, letters, or something like that.”
What’s more, these sources aren’t always in a typical typeset format, such as PDF. They may be flat PDFs from scans and photocopies, or even photos.
“People tend to like scan or photocopy or take [photos] with their iPhone [of] thousands of primary sources and then, figure out what is important when you get back home.”
The nature of these materials can also influence a researcher’s choice between print and digital formats. One researcher explained their shift from digital-first to a more hybrid approach once they began their Ph.D.:
“So if there were [digital] scans of books, then already in circulation on the Internet, then I would run it through an OCR [Optical Character Recognition] and be able to kind of use that in a more dynamic way than as an image… Now, I’m at a point where the amount of material that I must contend with—that model that I’ve just described is no longer appropriate. The amount of work, the amount of hands-on labor that I would need to do for myself is just— it puts that model to shame. It makes it obsolete (emphasis added).”
Readers we spoke to did not make much, if any, distinction between monographs and other materials, in contrast to how we had intended to frame the conversations. Any system of scholarly knowledge management would then naturally need to handle texts of all sorts and all formats (at the very least PDF, EPUB/Web, and images). This was a critical observation and has helped to shape our understanding of user needs.
Access to Content
Curious to see the interplay between various groups in the publishing ecosystem, we asked scholars how they accessed the content needed for their research. It became apparent that access to content was by no means equitable or uniform.
For nearly all interviewees, the library was the primary point of access to content, followed by the Internet and search engines:
“Immediately I go to the library. I go the library, I go to Google Books, right? … I would do it normally in that order, library, Google Books, I see whatever. I just Google the author, and see where copies are available online and again, it’s that. And in 95% of the cases I’ll be able to get it.”
Others find that materials essential to their field are difficult to access for other reasons. They may have to travel to access archives, be allowed only a limited time with them, or even be restricted from taking anything but handwritten notes. Yet these materials are still a vital part of the researcher’s corpus.
“Let’s say that I go to a library in [city redacted], right? I go to a library in [city redacted], they have a manuscript, they don’t allow me to even to take a photo of it, nothing. No citations. I just copy out passages by hand. Nevertheless, I want to be able to cite it, I want to know how much of my notes I’ve used, you know, something, that’s a very extreme example. But there can be many examples in the middle, of books that if they ever become digitized, it’d take five years or whatever.”
“I’m now having to contend with texts that are really old, or are [in] archives that I cannot scan, or they’re not accessible to me at all times. I only have them for a certain period of time, like a couple of hours sometimes, right?”
Our interviewees were quick to remind us that, even for more typical published materials, not everything they need is accessible through the libraries at their institutions:
“[The] most recent scholarship is generally available in digital form, through the library system. That having been said… a lot of the texts that I do work on… they haven’t been digitized at all, and sometimes I have to use the ILL [inter-library loan system] to actually request a physical book from somewhere else, just because it just doesn’t exist in digital form.”
As a result, scholars in particular disciplines have resorted to sharing materials, such as manuscripts, letters, and photos of archived texts, in private groups. They are wary of copyright infringement but keen to help their peers.
Others pointed out that some libraries hold more subscriptions and have access to larger databases than others. An interviewee explained:
“I’ve always been in an elite university, and that’s relevant because I have had access to about as wide a range of secondary sources to read and then, later, I’ve been in universities that are so big and wealthy that they can buy the whole range of consortium. So, when the digital age came about, I could have access to any journal, any journal article and then later, when ebooks started to be included, I could have digital access to more or less any monograph that was a secondary source.”
Adjunct scholars affiliated with elite universities may be located far from campus, meaning that they cannot browse the library’s physical collections. Instead, they have to resort to other means to get materials – such as asking library staff to scan relevant works and send them by email.
Inequities in access to content are further evident in the case of scholars who are affiliated with universities that have less comprehensive access to resources. They must resort to paying exorbitant rates to access content. In many cases, these scholars will contact their peers with institutional access to “restricted” materials behind paywalls, and request a copy. For those academics who are unable to access materials critical to a research project, the effect can be detrimental to their careers.
“[It’s a] convoluted system, where you can’t make yourself attractive on the tenure-track market unless you have an affiliation….If not, you’ll have no access to the journals that you need….It’s a vicious, vicious cycle.”
Workflows and Workarounds
Among the scholars interviewed and surveyed, none referred to any standard, formalized, or taught process or workflow for managing their reading-related activities. However, a pattern emerged of improvised systems, combining different tools and developed over years of research, each remarkably different from the others.
There were some commonalities, with four major areas emerging: file management, annotations and other text markup, note taking, and finally, writing. Beyond this, however, each researcher we spoke to had created their own system from scratch, piecing together different tools and methods available to them to create something functional.
For file management, some used cloud storage services such as Dropbox or Google Drive for digital files. Others stored files on their computers, developing an often complicated, idiosyncratic folder structure. Some used a spreadsheet or other document to keep track of readings related to a given project, including things like reading statuses, priorities, availability, and format (print from library, need inter-library loan, PDF on computer, etc.):
“What I do is when I start a project, I usually write a bibliography of possible sources and I also use that to track what I’ve seen and what I haven’t seen. What I’ve ordered through inter-library loan, and what I haven’t. What I’ve gotten a PDF of, and what I haven’t. And I simply— I make a link in that as a Word file. I make a link in the Word file to the PDF, if I have it…. And if it’s a physical text, I put in the call number, the library, or I mean, if it’s— if I have it on my overflowing shelf, I have a lot of books, I really need to buy more shelves. Then, I know of course that I have it, usually. It’s really getting to the point where I don’t know what’s there.”
In one case, a researcher also included a system of accountability:
“I have a separate list, actually, for while I’m on project holding myself accountable to actually reading these texts. So, I make a subsection [of] which texts I have to read through, which project, I put them in parentheses, and these are just like plain-text files… where [I] just put in parentheses the final number, the total number of pages. And track myself, where I am in that text, because it forces me to read it, but the same time, where I have a subsection at the bottom, where I just keep accumulating the texts that I’ve read. Sometimes for a specific project but also overall for the texts that I’ve read in the year, just because it gives me a sense of what I’ve read, what I’ve achieved.”
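The plain-text accountability list described above can be sketched in a few lines. This is only an illustration of the idea, not the interviewee’s actual file: the `Reading` class, its field names, the sample titles, and the exact line layout are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One text on the list. All field names are illustrative assumptions."""
    title: str
    project: str
    total_pages: int
    current_page: int = 0

    @property
    def finished(self) -> bool:
        # A text counts as read once the current page reaches the page total.
        return self.current_page >= self.total_pages


def render(readings: list[Reading]) -> str:
    """Render a plain-text list: texts in progress first, finished texts at the bottom."""
    lines = ["TO READ"]
    for r in readings:
        if not r.finished:
            # Progress and page total in parentheses, as the interviewee describes.
            lines.append(f"{r.title} [{r.project}] ({r.current_page}/{r.total_pages})")
    lines.append("")
    lines.append("READ")
    for r in readings:
        if r.finished:
            lines.append(f"{r.title} [{r.project}] ({r.total_pages})")
    return "\n".join(lines)


# Hypothetical sample entries, for illustration only.
readings = [
    Reading("Hypothetical Monograph A", "dissertation", 320, 120),
    Reading("Hypothetical Article B", "conference paper", 24, 24),
]
print(render(readings))
```

Keeping the output as plain text mirrors the interviewee’s preference for simple files; the finished section at the bottom plays the role of the running record of what has been read.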
Annotations and Notes
Annotations and other text markup were done either on the page, in the case of print, or with a number of available tools for ebook and PDF annotation. We spoke to scholars to find out different ways that they write, annotate, or interact with texts. These included: notes or comments on paper, marginalia, index cards, blurbs, quote bubbles, highlights, sticky notes, conceptual maps, drawings, and conversation. More interesting was the language of colours, symbols, and other markers that each researcher developed to capture their thoughts as they read, and communicate significance, connections, project associations, and myriad other nuances of their interpretation of a text. Our interviewees explained:
“[I have] three levels of notations of one that I’m reacting to, one that’s like straight author’s points, main points. And then, my own reaction, or sorry, my own kind of question or discussion point that I want to bring up to the class.”
“I think in pictures and I think in lists. I think in like little blurbs and quotes, quote bubbles, key words, and then I’ll write where that thing is. …This is a key word that is really kind of coming up. And so, then, the process of me drawing all of this out, that becomes the conceptual map within which I operate to write….drawing becomes my way of pre-writing.”
“I wish I could annotate and draw at the same time. I wish that the iPad, for example, screen were maybe five times this size, right? So like have four, right? What if it was this big? With the text still that big, right? And I could use the extra margins in a way where I could start to do this more…”
It seemed at times that the remarkable freedom of writing freeform allowed these languages to form, but it was difficult, if not impossible, to replicate that freedom on available digital tools. Printing out articles or chapters of interest and annotating them with pen or pencil is still seen as the way to go by many. Having physical copies on hand also makes management easier, since physical space lends itself naturally to arranging things, e.g.: “The pile on the right contains my primary sources; on the left are things I’ve flagged as potentially interesting and to revisit.” Interviewees often mentioned using digital editions for quick consultation and search, but print versions for in-depth reading and annotation. Most collect important works in print.
While some note taking did take place alongside annotation, each of our researchers would reach a point where they needed to take the texts they had read and turn the notes, quotes, and other takeaways into something they could then begin to incorporate into their writing. Again, the approaches to this varied widely, and depended on the tools used initially. Some would take handwritten annotations and highlighting and type them into a word processor. Others would export annotations from tools in whatever way was available to them, though an arduous copy-paste process was more common. One interviewee wrote his own script to extract his notes from a PDF annotation tool:
“What I do, is in any PDF I mark it up, using iAnnotate, usually using the note function, I… type a note into iAnnotate, and then, what I do is I programmed a script essentially, which grabs all of the notes from the PDF file and puts them in order. And then, it produces my Word file for me, so for that that’s very useful.”
At this point, there is no clear end to the reading and annotating process as writing begins. Rather, the two coexist and continue in tandem.
“So, the problem is, if you leave it at the end, and you write, and you’re like, ‘Oh, I should have looked at this and that.’ And then it’s too late, right? So, you write very early in the process. And the writing process and the research process is iterative.”
Reading to Write
Over the course of the interviews it became clear that the acts of reading and writing were inseparable in the minds of researchers. Every journal article, chapter in an edited volume, or monograph is read with a specific output in mind.
“The writing process is not something that’s done at the end… The writing process is something that starts, and this is something I emphasize with my doctoral students, is that the writing process literally starts right at the beginning. You write drafts, with the expectation that those are going to be revised, maybe thirty or forty times. But you start off with a draft and it’s the process of writing, which itself is an intellectual activity, which exposes other things you need to research.”
Annotations and highlights make their way into notes and indexes, eventually to be massaged into whole new paragraphs, quotations, citations, and bibliographies. Scholarly reading is the continuous transformation of read material into newly written material.
This lack of separation between reading and writing sets scholarly reading apart from leisure reading. One of our interviewees explained the reason for it:
“You’re providing an original perspective, original research. But you need to engage with an existing body of knowledge [to do so].”
We learned that, for academics, reading is not simply deciphering the words in a single text. Rather, scholarly reading is reading writ large: a set of activities that includes simple consumption and comprehension of a text, and also a number of related activities, including collecting and managing a large corpus of texts, annotating, note-taking, and writing.
Academics read for a number of purposes, and toward multiple goals – reading to write a journal article or monograph, reading to include a text in a course, reading to prepare a conference presentation, or reading to stay informed about a particular topic. And as they read and write, they discover new avenues:
“It’s just completely a snowballing effect, it really is. So, every article that I read itself has let’s say fifty references …. [T]he first thing we do, as researchers, is actually not even look at the content, as much. We’re looking at the sources that they’re using. Not just the primary sources, but the second, everyone’s sort of like, ‘Hey, I haven’t read that. I haven’t read that article.’”
What’s more, any given reading project is likely to have multiple written outputs offshooting from a central focus. One interviewee explained,
“Authors will work through their ideas in journal articles, before they lead up to the big honking book. And so, it’s those are usually like the breadcrumbs that we follow until the book comes out, and then we all switch over to citing that book.”
Ultimately, scholarly reading cannot be separated from scholarly writing. This means that, given our focus on the former, we must expand our understanding of what it means to read for research. Scholarly reading is neither passive nor linear. As one interviewee put it, very succinctly,
“It’s a creative process itself.”
Avenues for Improvement
Across the range of readers we spoke to, it appears that while each had largely developed a system that worked for them, it was a case of making the best of what was available. All ran up against limitations in the tools they chose, or were frustrated by not being able to do something quite how they’d like. Where an action is not supported—and in some cases not “permitted”—scholars resorted to manual workarounds which could potentially be automated if the technology were more open.
PDF clutter and convoluted file folder arrangements were common complaints, as was the difficulty of managing both print and digital resources.
“What I find increasingly frustrating is when you have different research projects, but at the same time, sort of a general repository of texts. Sometimes, I end up saving new texts that I should be adding to the repository to a specific subject folder for a research project. And then, a month later, I’m incapable of finding it without putting fifty units of research on my computer, restricting around or using the search function….But it would be useful if there was a general system where you have a double access to the same file, through the folders.”
“And so I’m now straddling these two worlds, where I either have to keep renewing books because I’m not quite done with them yet. Or if they get recalled, then I just have to click on that button two weeks from now, so that I can get it back, right? Or everything is on my computer and I just have to keep a mental inventory of where that document is. And I really wish I didn’t have to expend my mental energy on that.”
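The “double access to the same file” wished for above could, in principle, be met by storing each file once and letting projects refer to it, rather than copying it into project folders. The following is a minimal sketch of that idea under stated assumptions: the `Repository` class, its method names, and the sample paths are hypothetical and do not describe any existing tool.

```python
from collections import defaultdict


class Repository:
    """Holds each file path once and lets any number of projects point to it."""

    def __init__(self) -> None:
        self._files: set[str] = set()
        self._projects: dict[str, set[str]] = defaultdict(set)

    def add(self, path: str, *projects: str) -> None:
        """Register a file in the general repository and link it to zero or more projects."""
        self._files.add(path)
        for project in projects:
            self._projects[project].add(path)

    def all_files(self) -> list[str]:
        """Everything in the general repository, regardless of project."""
        return sorted(self._files)

    def in_project(self, project: str) -> list[str]:
        """Files linked to one project; an unknown project yields an empty list."""
        return sorted(self._projects.get(project, set()))

    def projects_of(self, path: str) -> list[str]:
        """Projects that refer to a given file."""
        return sorted(p for p, files in self._projects.items() if path in files)


# Hypothetical paths and project names, for illustration only.
repo = Repository()
repo.add("texts/hypothetical-article.pdf", "project-a", "project-b")
repo.add("texts/hypothetical-chapter.pdf")
print(repo.all_files())
print(repo.in_project("project-a"))
print(repo.projects_of("texts/hypothetical-article.pdf"))
```

Because a text saved for one project is always registered in the general repository as well, it can be found a month later from either direction.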
Many were also frustrated by the cost of stringing together all the different tools they needed. While some are offered through libraries, others are not, and paying for cloud storage, for example, was cited by many as an annoyance, and is certain to be a barrier for others. But, given the sheer volume of content a researcher accumulates, storage is essential.
“In my academic career in speaking with different people, attending conferences and liaising, I very frequently get PDF texts like books, monographs, in email attachments. And I can’t just search for them in my email and delete only those things. And so, it became a, ‘Okay, do I spend a week clearing out all thirteen years of my email? Or do I just buy my sanity and just shell out?’ And so that’s what I did. I probably will have to do something similar to Dropbox also.”
Overall, it is clear that workflows require time and patience to figure out and hone over years, largely without support or guidance, even from institutions.
“[There is] very minimal support on the institution’s side, mostly because the assumption is that at this point in your career, you should have figured that out by now. That is the baseline is, in order to have gotten to this point you have to have had figured that out.”
The end result is generally still a very manual, sometimes clunky process. Yet the development of a personal workflow is considered in many ways a critical part of developing one’s skills as a researcher over time.
“It’s something that I’m developing that just continues to develop. I think this also might be interesting, how the methods for people who are still doing graduate studies or undergraduate studies with reading and writing are very different from somebody who’s finished their graduate education [and is] now [a] professor. Because you have five to ten years, to slowly develop that process and come to something, arrive at something that you’re comfortable with. And that’s a part and parcel of the academic training and the reading that you do.”
The overall impression from our interviews with readers is that their methodologies of reading, writing, managing collections, and finding content are deeply personalized. While there are clear commonalities, there are no agreed-upon standards, best practices, or widely accepted models that a researcher could adopt or adapt.
As mentioned at the start of this report, readers are the driving force behind the scholarly ecosystem, so any software and tools must work for them if they are to have any value at all. These interviews made clear that personal library or collections management software must meet certain requirements in order to be a meaningful addition to the array of tools available to readers. These include the ability to:
- manage a scholar’s entire corpus or collection of texts
- accommodate multiple formats and materials (e.g. books, images, videos, audio, journal articles, reviews, archival material, letters, personal notes, or documents)
- remain flexible enough to accommodate disparate workflows
- support the development of personalized workflows, without restricting or mandating behaviours
Ultimately, the value our tool could offer will lie in managing an entire collection of texts, not just a single reading experience, and in the capacity to accommodate user-driven workflows. Neither of these criteria is met by any software currently on the market, and there is a real opportunity for one to emerge and, perhaps for the first time, properly support readers in their academic pursuits.