
Digitization is not infringement

Digitizing even copyrighted works for the specific purposes of enabling better, faster, easier searches, giving access to the visually impaired, and creating back-up copies for owners of the originals has been held to be fair use under the U.S. copyright statute by U.S. District Court Judge Harold Baer, Jr., in New York.

The case, The Authors Guild et al. v. HathiTrust et al.,1 was brought by the Authors Guild on behalf of all copyright holders to stop HathiTrust, a consortium of university libraries, from using Google’s services to digitize the libraries’ book collections.

The ruling, handed down late on Wednesday, did not reach the most controversial issue — the digitizing of so-called orphan works (copyrighted works whose author has not been and, perhaps, cannot be located) — an issue the judge said wasn’t ripe for judicial decision since the program never actually went into effect. The decision was otherwise a total victory for HathiTrust.

The facts

The court opinion described the facts this way:

Defendants have entered into agreements with Google, Inc. (“Google”), that allow Google to create digital copies of works in the Universities’ libraries in exchange for which Google provides digital copies to Defendants (the “Mass Digitization Project” or “MDP”). The HathiTrust partnership is in the process of creating “a shared digital repository that already contains almost 10 million digital volumes, approximately 73% of which are protected by copyright.” After digitization, Google retains a copy of the digital book that is available through Google Books, an online system through which Google users can search the content and view “snippets” of the books. Google also provides a digital copy of each scanned work to the Universities, which includes scanned image files of the pages and a text file from the printed work. According to Plaintiffs, this process effectively creates two reproductions of the original. After Google provides the Universities with digital copies of their works, the Universities then “contribute” these digital copies to the HathiTrust Digital Library (“HDL”). The Complaint alleges that in total, twelve unauthorized digital copies are created during this digitization process. Google’s use of the digital works is the subject of a separate lawsuit.

For works with known authors, Defendants use the works within the HDL in three ways: (1) full-text searches; (2) preservation; and (3) access for people with certified print disabilities. The full-text search capabilities allow users to search for a particular term across all the works within the HDL. For works that are not in the public domain or for which the copyright owner has not authorized use, the full-text search indicates only the page numbers on which a particular term is found and the number of times the term appears on each page.2

The Guild’s position was that digitizing the works violated the authors’ copyrights regardless of how the digitized copies were later used: the mere scanning of the works and their digital storage was a copyright violation, and, of course, any online distribution of any part of the copies was not permitted.

HathiTrust took the position that everything it wanted to do (minus the orphan work project, which it suspended because of difficulties in checking the copyright status of the works to be included) was within the fair use exception to copyright protection.3

Both sides agreed that there weren’t any facts that required a trial and that the case could be decided as a matter of law (in legal parlance, this was decided based on competing motions for summary judgment).

The court ruling

Judge Baer agreed with HathiTrust that the fair use doctrine protected the copying.

First, the fair use doctrine examines the “purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.” 17 U.S.C. § 107(1). As to this factor, the judge concluded that while preservation of the original alone might not be enough, in combination with making access available to the visually impaired and permitting full-text searches that return page numbers where the search term is found, HathiTrust’s uses were transformative (meaning a new use, different from the original), favoring a finding of fair use.4

The second fair use factor, the nature of the work copied, would ordinarily tilt towards the authors because the record showed much of the material copied was fiction. But the judge held the factor was not dispositive because “the use is transformative, intended to facilitate key-word searches or access for print-disabled individuals.”5

The third fair use factor considers how much of a copyrighted work is copied compared to the purpose of the copying. Here, Judge Baer said:

“[T]he extent of permissible copying varies with the purpose and character of the use.” The question is whether “no more was taken than necessary.” Sometimes it is necessary to copy entire works. “Intermediate” copies may not be infringing when that copying is necessary for fair use. Here, entire copies were necessary to fulfill Defendants’ purposes of facilitation of searches and access for print-disabled individuals. … Plaintiffs argue that Defendants did not need to retain copies to facilitate searches; however, the maintenance of an electronic copy was necessary to provide access for print-disabled individuals.6

The last fair use factor is the effect on the market for the copied works. To the authors’ argument that “[e]ach digital copy of a book that Defendants created . . . rather than [purchased] through lawful channels, represents a lost sale,” the judge noted that “purchase of an additional copy would not have allowed either full-text searches or access for the print-disabled individuals, two transformative uses that are central” to the copying program.7

The judge rejected as conjecture the argument that the digitization prevented the authors from developing a licensing system, and held that copyright holders “cannot preempt a transformative market.”8 In other words, the fact that the authors might want to create some licensing scheme down the road can’t block this transformative use now.

The bottom line, according to the court, was that the “copyright law’s ‘goal of promoting the Progress of Science . . . would be better served by allowing the use than by preventing it.’” Therefore, “[t]he enhanced search capabilities that reveal no in-copyright material, the protection of Defendants’ fragile books, and, perhaps most importantly, the unprecedented ability of print-disabled individuals to have an equal opportunity to compete with their sighted peers … protect the copies made by Defendants as fair use…”9

The judge concluded: “Although I recognize that the facts here may on some levels be without precedent, I am convinced that they fall safely within the protection of fair use … I cannot imagine a definition of fair use that would not encompass [these] transformative uses … and would require that I terminate this invaluable contribution to the progress of science and cultivation of the arts …”10

What this means for us

So… what does this mean for genealogists?

To the extent that digitization allows full-text searching of millions of resources held in libraries around the country, it’s a wonderful thing. Being able to find out what books contain references to specific people and specific places should help us target our research in a way that hasn’t been possible before.

Because the HathiTrust search approved in this decision doesn’t show even snippets of copyrighted texts online11 (only the page numbers where a search term appears), it should have a minimal effect on the market for genealogy and other reference books.

It may stop us from buying books based on nothing more than the hope that a particular title will have a reference to a person or place we want, and that may well affect the bottom line of small-volume publishers. For all our sakes, since those publishers are important to us, we can hope that it also sparks sales of books where we’ll now know there is a reference we’re interested in.

Me, I’ve found that when I can see a snippet, or when I know that a book names a person or place I’m researching, I’m far more likely to buy the book than when it just has an interesting title.

So… tentatively (and remember: this decision can and likely will be appealed), my take is:

Right on, Judge Baer. Right on.


  1. See Complaint, The Authors Guild et al. v. HathiTrust et al., 11 CV 6351, U.S. District Court for the Southern District of New York; online at Justia ( : accessed 4 Sep 2012).
  2. The Authors Guild et al. v. HathiTrust et al., — F. Supp. 2d — (S.D.N.Y. Oct. 10, 2012) (slip op. at 2-3) (internal citations omitted); online at United States District Court, Southern District of New York ( : accessed 11 Oct 2012).
  3. The doctrine appears in the statute at 17 U.S.C. §107.
  4. The Authors Guild et al. v. HathiTrust et al. (slip op. at 15-18).
  5. Ibid. at 18.
  6. Ibid. at 18-19 (internal citations omitted).
  7. Ibid. at 19.
  8. Ibid. at 19-21.
  9. Ibid. at 21.
  10. Ibid. at 22.
  11. That issue is also being litigated, but in a separate suit by the Authors Guild against Google. See Complaint, The Authors Guild et al. v. Google, Inc., 05 CV 8136, U.S. District Court for the Southern District of New York, online at Justia ( : accessed 4 Sep 2012).