Reference Question Data Mining: A Systematic Approach to Library Outreach

Joshua Finnell and Walt Fontane

This exploratory study investigated the feasibility of using reference questions as an important tool in the construction of study guides, instructional outreach, and collection development at a small, four-year university in Lake Charles, Louisiana. The premise for the study was based on the assumption that the content of the reference question and class from which the question came provide more valuable information than the metadata normally captured within reference classification systems (e.g., directional, research). Reference question subjects received at the reference desk were recorded over six months by the reference staff. The authors then analyzed and classified the data to discover patterns in collection use. The resulting report was then disseminated to the reference, collection development, and instructional outreach departments. The findings reveal that this method of reference data classification and timely reporting provides an excellent reference for planning in these library departments.

A 2002 survey conducted by the Association of Research Libraries (ARL) revealed a steady decline in reference transactions (125,103 in 1991 to 67,697 in 2002).1 Faced with concrete numbers elucidating this trend, academic librarians and college administrators are expressing a need for new modes of displaying the usefulness of reference services in a continually changing environment. The ARL report also revealed a general lack of confidence on the part of librarians in current data collection. Traditional data-collection techniques (e.g., classifying reference questions into categories such as directional, ready-reference, specific search, and research intensive) allow librarians to analyze workflows and compare the institution to peer organizations. However, an essential component of the data-capturing process has been left out: the subject matter of the question. In 1981, White quipped, “If librarians go to the trouble of keeping such statistics, they ought to get something out of them locally.”2

McNeese State University is an ideal size for a study like this. It is largely a commuter school serving about eighty-one hundred students. It offers seventy-five associate, baccalaureate, and graduate degree programs. The Reference Department at Frazar Memorial Library includes five librarians who staff the reference desk for seventy-seven hours per week. Library instructional outreach at McNeese has led to a large increase in information literacy sessions in the past two years. The bulk of the instruction sessions are completed shortly after midterm exams, allowing ample time for the reference librarians and the Library Instruction Committee to analyze assignments, classes, and reference questions to discover patterns, problems, and trends, as well as plan future actions.

The current study began during the development of subject-specific resource guides, which included famous events and persons broken down by decade. Subsequently, the investigators decided to record all research-oriented questions with the twofold purpose of (1) collecting data to target specific courses for information literacy instruction and (2) identifying gaps or weak spots in library resources. As the goals of the study changed, we also changed the method to better capture the reference questions and mine them for data in pursuit of the study’s three goals: study guide creation, instructional outreach, and collection development. At the time of writing, the current method has been in place for more than six months and supports these three goals.

Literature Review

There is a paucity of library literature dealing with systematic approaches to capturing category-specific reference questions. An analysis of the 2002 collected reference logs compiled by the ARL revealed that only ten out of seventy-six institutions surveyed (13 percent) record the subject category of each question.3 One possible reason for this may be the time required to analyze reference transactions; time and human resources are precious commodities. Additionally, very few studies made an attempt to link the subject of each reference transaction to the course in which the student is enrolled.

Classification from Course Catalogs and Professors

Traditionally, collection development has relied upon faculty and subject bibliographers to identify gaps in the collection. In 1969, Durand of the University of Southwestern Louisiana embarked upon a project to classify the university catalog to ascertain the needs of the curriculum and identify gaps in the collection. Several catalogers correlated the courses listed in the university catalog with the Library of Congress (LC) classes. Major obstacles to success were the poor course descriptions and the subsequent interpretation. As Durand noted, “No numbers were assigned to a subject not explicit in that course description. This, of course, meant that many large blocks of class numbers would not appear in our final list.”4 In 1974, a similar study was conducted at the Gene Eppley Library at the University of Nebraska at Omaha (UNO), and the researchers discovered that often “the LC schedules did not contain specific classification numbers for the subject treated in the UNO courses.”5

In 1980, Whaley, then reference bibliographer at the University of North Carolina, Charlotte, suggested having individual professors—not librarians—assign LC classes that reflect the materials covered in their classes. Bibliographers then analyzed the professor-assigned LC classes against the general collection and identified gaps and made suggestions for purchase. To supplement the deficiencies of both course descriptions and the LC classes, professors were asked to use “free terms” to describe their courses and supplement the LC classes. These “free terms” were then translated into the closest LC subject headings (LCSH). Again, the bibliographer compared the subject headings against the collection and located gaps for purchase suggestion. Whaley concluded that this procedure, which included input from both professors and librarians, helped to establish and maintain a useful collection that serves the needs of its patrons.6

Classification from Syllabi

In 1982, Rambler underscored the importance of syllabus study as a proactive approach to academic librarianship. Examining The Pennsylvania State University’s syllabi for library-related projects, she noticed that the overwhelming majority of courses come from the upper divisions. “These findings suggest that the development of systematic library instructional programs should be explored for students other than entering freshman and transfer students.”7 In 1985, Sayles extended the outcomes of syllabus study to include collection development, subject guides, anticipatory reference, instruction, and overall course improvement.8 Syllabus studies are still a popular method of proactive library services.9 However, both Rambler and Sayles noted issues with the methodology. Rambler writes, “Since there is a gap between knowledge about the requirement stated in any syllabus and the actual activity a student will follow to complete the requirement, an argument could be made that this study is arbitrarily assigning library use.”10 Much like the earlier studies conducted on course descriptions, course syllabus studies leave out the most essential element: student input. The collaboration of faculty and librarians is a necessary, though not sufficient, element in improving academic library service. As Sayles points out, “There is a difference between what students need and what librarians think they need.”11

Classification from Random Sampling

In a 2006 meta-analysis of libraries using the statistical sampling method to capture reference statistics, a process that involves collecting reference statistics at various times throughout the year, only Central Michigan University (CMU) classified the subject of each reference transaction by course.12 Using Statistical Package for the Social Sciences (SPSS) software, reference librarians at CMU coded several variables of all reference transactions for the 1988–89 fiscal year (gender, request date, subject, status of inquirer, and so on). They concluded that the inclusion of subject matter can assist in collection development and the inclusion of courses can assist in program marketing and long-term planning.13

In 2007, Henry and Neville conducted a study of classification systems, with the understanding that library “resources” include various electronic and technological items. They concluded that a “skill or strategy-based approach, rather than a system based on resources used or time allocated per question, leads to more consistent classification and provides a more accurate reflection of today’s reference desk activity.”14 Following this trend, the current study suggests a classification system based on the subject of each reference question.

The Reference Department is uniquely situated to collect student input. Transactions at the reference desk reflect the information needs of students at an academic institution. As can be seen from this literature review, classifying course descriptions and syllabi has been the means through which collection development, subject guides, and outreach have been and are being executed. Given the aforementioned trend of reference statistics providing useful feedback for reference librarians to improve the service of the library to patrons, a new classification of reference data that adapts and builds upon the methodology of syllabi and course description studies is needed.


We asked Reference staff, including paraprofessionals and librarians from other departments assisting at the reference desk, to briefly summarize (1) the subject matter of the question, (2) the class to which the question applied, or (3) the professor of the class to which the question applied. Because of the varying degrees of technological expertise among the reference staff, data were recorded using a notepad and pencil. If the librarian did not record a class, or was interacting with a nonstudent patron (administrator, faculty, or public), they would record the class as “community.” We considered other labels for this vague category, including “public,” “other,” and “faculty-staff”; however, these labels could lead to theoretical and practical problems. We felt uncomfortable classifying patrons as “other” or “unknown.” We chose “community” because everyone who approaches the reference desk shares basic social bonds through location (reference desk, Frazar Library, and McNeese State University), interests (scholarship), and events (requiring assistance). On the practical side, this study did not seek to correlate patrons with resources, but rather to determine which resources merited strengthening through increased expenditure of the library’s money, personnel, and time.

Ideally, the information recorded for each interview would result in subject terms and class designations or course numbers. At the start of each month the investigators met to review the previous month’s entries. We looked for patterns that might identify weaker areas of the collection, high-use areas (collection development), classes that may require an instruction session on library resources (instructional outreach), and areas of general confusion for both librarians and patrons (study guides).

The reference staff were responsible for recording the subject matter of each reference transaction succinctly, similar to Whaley’s idea of “free terms.” These keywords aided the analysis, which sought patterns in reference questions that indicated certain areas of library resources (especially the general collection) that receive frequent use. Analysis was made easier by assigning Library of Congress Subject Headings (LCSHs) to many of the free terms. Reference shares the monthly reports of this study with the circulation and collection management departments, whose staff are frequently evaluating the general collection. Upon viewing the monthly reports from this study, those departments can focus their evaluations on specific areas that are in high demand.
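The monthly tally described above can be illustrated with a short sketch. The study itself recorded entries by hand with notepad and pencil, so the following is only a hypothetical digital equivalent and not the authors' procedure. The `Transaction` record, the `FREE_TERM_TO_HEADING` lookup table, the function names, and the sample course numbers are all assumptions introduced for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One reference-desk entry (hypothetical record layout)."""
    month: str      # e.g. "2008-09"
    free_term: str  # staff member's short summary of the subject
    course: str     # course number; empty when no class was recorded


# Illustrative lookup from normalized free terms to subject headings.
# A real table would be built by the investigators during monthly review.
FREE_TERM_TO_HEADING = {
    "civil war": "United States--History--Civil War, 1861-1865",
    "nursing ethics": "Nursing ethics",
}


def normalize(term: str) -> str:
    """Lowercase a free term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def monthly_report(transactions):
    """Count (heading, course) pairs per month.

    Free terms with no mapped heading are kept as normalized free terms.
    Entries with no course are filed under "community", mirroring the
    label the study used for nonstudent or unrecorded patrons.
    """
    report = defaultdict(Counter)
    for t in transactions:
        term = normalize(t.free_term)
        heading = FREE_TERM_TO_HEADING.get(term, term)
        course = t.course.strip() or "community"
        report[t.month][(heading, course)] += 1
    return report


def top_pairs(report, month, n=5):
    """Return the n most frequent (heading, course) pairs for a month."""
    return report[month].most_common(n)


if __name__ == "__main__":
    sample = [
        Transaction("2008-09", "Civil War", "HIST 201"),
        Transaction("2008-09", " civil  war ", "HIST 201"),
        Transaction("2008-09", "genealogy", ""),
    ]
    for pair, count in top_pairs(monthly_report(sample), "2008-09"):
        print(pair, count)
```

Counting by heading and course together, rather than by heading alone, keeps the link between a subject and the class that generated it, which is the information the study relied on for targeting instruction sessions as well as for collection evaluation.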
