Thursday, October 27, 2011

Data Harvesting (IRLS675-Unit 10)

Review the list of service providers at http://www.openarchives.org/service/listproviders.html and also http://gita.grainger.uiuc.edu/registry/services/. Review at least three or four of your choosing. Try some searches, and also see if you can identify the collections they draw from (there's usually a link somewhere to the list of contributing collections).

I chose to harvest the UCLA Archive of Popular American Music (6,082 records), the Kansas City Sheet Music Collection (887 records), and the University of Illinois Library at Urbana-Champaign's Online U.S. Sheet Music Database (94,304 records, omg!). Each one appears to draw only from its own institution's holdings: I didn't find a list of contributing collections for any of them, and the item links just pointed back to the institution's own pages.

In your blog, discuss what you think makes a good (useful) federated collection and why and how the service providers you selected did (or did not) create a good (useful) service. Some sites index metadata from a very large number of repositories. Why might this be a good thing? Or, why might this not be such a good thing?

There are both advantages and drawbacks to large federated collections. On the plus side, if you are researching a particular subject, being able to search many repositories at once saves you from visiting each one separately. However, lots of data brings lots of metadata, and that metadata doesn't always match up across collections. If a collection does not index by subject (or uses a different subject vocabulary), its records may be overlooked in a subject search. A very large result set can also be hard to sift through to find exactly what you're looking for. Solid metadata (the more the merrier) and controlled vocabularies make for better federated collections. For example, the UCLA records were very complete and offered additional information in the description field. The other two databases offered little beyond title, creator, and date, which made them less useful for searching.
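To make the subject-search problem concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the three collections "A", "B", and "C", their records, the subject terms, and the two search helpers are all made up and are not taken from the services I looked at. The field names loosely follow Dublin Core.

```python
# Hypothetical records from three collections merged into one federated index.
# The collection labels and subject terms are invented for illustration only.
records = [
    {"collection": "A", "title": "Maple Leaf Rag", "creator": "Joplin, Scott",
     "date": "1899", "subject": ["Ragtime music"]},
    # Collection B uses a different subject vocabulary for the same kind of item.
    {"collection": "B", "title": "The Entertainer", "creator": "Joplin, Scott",
     "date": "1902", "subject": ["Rags (Piano)"]},
    # Collection C supplies only title, creator, and date (no subject field).
    {"collection": "C", "title": "Elite Syncopations", "creator": "Joplin, Scott",
     "date": "1902"},
]


def subject_search(records, term):
    """Return records whose subject terms contain `term` (case-insensitive)."""
    term = term.lower()
    return [
        r for r in records
        if any(term in s.lower() for s in r.get("subject", []))
    ]


def creator_search(records, term):
    """Return records whose creator field contains `term` (case-insensitive)."""
    term = term.lower()
    return [r for r in records if term in r.get("creator", "").lower()]


if __name__ == "__main__":
    by_subject = subject_search(records, "ragtime")
    by_creator = creator_search(records, "joplin")
    print("subject search hits:", [r["collection"] for r in by_subject])
    print("creator search hits:", [r["collection"] for r in by_creator])
```

In this toy index, a subject search for "ragtime" only finds the record from collection A, even though all three records are rags by the same composer. A creator search finds all three, because that is the one field every collection filled in the same way.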
