To jump back on an old hobby horse: I read an interesting post at digitallibrarian.org this morning on federated searching versus Google Scholar:
So, why is Google able to do this, and do it in a relatively short time span, while libraries haven’t? An argument could be made that Google has a greater amount of resources at its disposal, and because it is Google, can work out agreements with database providers which allow for the harvesting of their metadata (and full text) for the purpose of providing search results (but at this time, not the full-text directly). Most likely, there is at least some truth to this argument. But I don’t believe all of the credit goes to Google; a lot of the credit also goes to the Library community for being passive in its approach towards information providers. We now rent our information instead of buying it; we subscribe to journals and databases without assurance that, if we eventually cancel a subscription, we will retain access to the information for the years to which we duly paid. We accept these terms, and because we do, our technology and our services are limited by them.
This is an interesting question, and one that makes librarians very uncomfortable: why was Google able to come up with a search engine that worked when it had (as far as I know) no librarians at all on its team? Why are the efforts of librarians largely ignored by the technorati while a group of young guys in California was able to change the world? There’s a seething whisper coming out of librarianship when it comes to Google and Google Scholar and the grand digitization project: it should have been us. Those interlopers, they were just kids and they turned our world upside down. Why could they do this when we could not?
I think I have at least a short answer for this. And this circles back to the Gorman affair, of course. Librarians as a group have not attracted enough of the paradigm-shifters of the technological world. When someone like the Stanford graduate students who started Google thinks about building something to harness the power of the internet, they don’t come out of a library science background, nor do they (generally) consider librarianship as a career path. (What tech-savvy person would read the recent words of the ALA’s president-elect and think that librarianship was the right fit?) People with the ideas and the knowledge to do the things we wish we were doing are coming out of other, more profitable and more technically focused fields. At this point, it isn’t enough to understand the life of information, or to know the difference between the universe of knowledge and the bibliographic universe, or the intricacies of AACR2. You need to understand the technology and what it’s capable of.
There is nothing as inspiring as really understanding how something works. An architect who understands the principles of construction will be more adept at twisting and bending those principles to create something new and interesting. Knowing what’s possible is a springboard to creating meaningful and useful change. Google Scholar was surely created because one of the Google staff saw that the metadata allowed for searches to be modified by type in just such a way as to produce results useful for academics; did we know that was possible? Did it even occur to us? Why should it have? We’re not experts on the internet. We like to pride ourselves on being experts on organizing and ranking information sources, but we (for the most part) wouldn’t know an algorithm if it zipped to our homes and organized our underwear drawers for us. In order to deconstruct, we need to at least understand the construction.
Librarians did a very brave thing at one time. Librarians sought to organize information with the understanding that Google was impossible and would never exist. Librarians tried to create order and reason where there was only a morass of paper and ink. Without their efforts we would have been stuck looking at an idiosyncratic pile of looseleaf. If there is no order, there can be no searching or finding. But that’s no longer the case.
Librarians are like communists: they assume the best in people, and they presume that any thinking person would rather learn the controlled vocabulary than get 20 extra (useless) hits. Google came in and did the opposite. Google presumes that most people are stupid and allows them to be.
“rogers high speed internet is a piece of shit”
“search the web for erotic stories”
“need ideas visual presentation film monster”
“Alice Walker feminist view on By the light of my father’s smile”
Would Melvil Dewey have ever considered building a classification and retrieval system that allowed users to plug in searches like these?
Not to say that Dewey’s ideas were wrong. Organizing information by subject is a good idea, and if anyone doubts that, they should talk to someone who runs a bookstore. As soon as there’s a profit motive, you get to see what really works and what doesn’t when it hits the floor running. Putting things with like things means that users can find more of what they’re looking for (and buy more). Browsability is important. Librarians are good at organizing physical information (i.e., books). It appears that we’ve struggled to move out of the card catalogue.
I spent a lot of time in cataloguing class talking about how digital information is actually no different from non-digital information. Whenever something new comes along everyone wants to separate it out; I have written several papers on the topic of digital exceptionalism and how it’s the plague of librarianship. But now I must offer a somewhat altered thesis. A change has happened; the world is not made up of information we can line up on a shelf. A card catalogue is not a search engine, and neither is a library OPAC.
From the digitallibrarian.org post:
So, what should we do? We should seek to emulate what Google is doing; not necessarily try to emulate Google Scholar (though we could and have done worse), but seek to work out agreements where we are allowed a copy of the data to which we are providing access. If the folks at Google can work out terms which were acceptable to content providers, I’m sure libraries can as well. Maybe, just maybe, if librarians, who are quite good at organizing and working with indexed information, could start to play with the databases, indexes, and metadata provided by our major information vendors, then perhaps we can start to explore new access tools which our users actually want to adopt and use. Otherwise, instead of being second (after Google) in the information search food chain our users consume, we may start to drop to third (after Google Scholar), or worse…
Librarians feel threatened by Google. As a new librarian, I’m not as invested in the way things were, so it’s easy for me to point fingers. But I don’t think emulating Google is the right move. After all, Google already exists. Being a cheap knock-off isn’t going to help anyone. I think we need to reconsider our role.
We missed the digital information organization boat. We are not going to be the kings of categorization in this universe. But what we can do is get to know the technology, and see where we can contribute in ways that Google can’t. We can work with Google to get the end result that we want.
Mistake #1: when we started creating metadata, we stopped at the monograph level. This is exactly the problem that the Google digitization project is trying to fix, and exactly the reason why we have to rent information. If we had entered every journal article, every essay in a collection, every segment of every book into our catalogue, we wouldn’t need to buy some for-profit publisher’s wares. We should have added journal titles to our catalogue. You should have been able to do an author search and get a citation for every damn piece of writing that person has created, be it a book, a book chapter, a conference paper, a book review, a letter to the editor, or a journal article. But our catalogues don’t work that way, so Google Scholar will always be better.
Unless we offer to do something Google can’t do. And when we do it, we do it for free. We do it for the good of our patrons and of patrons worldwide. We do it because everyone should have access to information. Don’t compete with Google; you’ll never win. Technology is not where our competence is.
Who created a bibliographic universe where salaried academics who write, edit, and peer review for free need to have their work bought back from for-profit publishers in order to assign it to their students? We did.
If we could fix that, we’d be an equal partner with Google, not a competitor. They come up with the interface and the algorithm; we make sure it has good content. A match made in heaven.
To-do list: start revolution.