Thursday, October 27, 2011

Data Harvesting (IRLS675-Unit 10)

Review the list of service providers at http://www.openarchives.org/service/listproviders.html and also http://gita.grainger.uiuc.edu/registry/services/. Review at least three or four of your choosing. Try some searches, and also see if you can identify the collections they draw from (there's usually a link somewhere to the list of contributing collections).

I chose to harvest the UCLA Archive of Popular American Music (6082 records), the Kansas City Sheet Music Collection (887 records), and the University of Illinois Library at Urbana-Champaign's Online U.S. Sheet Music Database (94304 records, omg!). Each appears to draw only from collections housed at its own university: I didn't find a list of providers for any of them, and the items just pointed to the university pages.

In your blog, discuss what you think makes a good (useful) federated collection and why and how the service providers you selected did (or did not) create a good (useful) service. Some sites index metadata from a very large number of repositories. Why might this be a good thing? Or, why might this not be such a good thing?

There are both advantages and drawbacks to large federated collections. For instance, if you are doing a project on a certain subject, it could be useful to search across databases to get the information you need. However, with lots of data comes lots of metadata that may not always match up across collections. If a collection does not index by subject (or uses a different subject vocabulary), it may be overlooked in a subject search. A very large result set is also hard to sift through to find exactly what you're looking for. Solid metadata (the more the merrier) and controlled vocabularies make for better federated collections. For example, the UCLA database was very complete and offered additional information in the description field. The other two databases, however, offered little beyond title, creator, and date, which made them less useful for searching.
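To make the vocabulary problem concrete, here is a minimal sketch of a subject search across three collections. Everything in it is made up for illustration: the records, the subject terms, and the small crosswalk table are assumptions, not data from the three databases I looked at.

```python
# Hypothetical records from three collections (titles and subjects are invented).
RECORDS = [
    {"collection": "A", "title": "Song one", "subjects": ["Ragtime music"]},
    {"collection": "B", "title": "Song two", "subjects": ["Rags (Music)"]},
    {"collection": "C", "title": "Song three", "subjects": []},  # no subject indexing
]

# Assumed crosswalk that maps a local subject term onto one shared term.
CROSSWALK = {"rags (music)": "ragtime music"}


def normalize(term):
    """Lowercase a subject term and map it through the crosswalk if listed."""
    key = term.strip().lower()
    return CROSSWALK.get(key, key)


def subject_search(records, query, use_crosswalk=False):
    """Return titles whose subject terms match the query exactly."""
    clean = normalize if use_crosswalk else (lambda t: t.strip().lower())
    wanted = clean(query)
    return [
        rec["title"]
        for rec in records
        if wanted in [clean(s) for s in rec["subjects"]]
    ]


plain_hits = subject_search(RECORDS, "Ragtime music")
mapped_hits = subject_search(RECORDS, "Ragtime music", use_crosswalk=True)
print(plain_hits)
print(mapped_hits)
```

Without the crosswalk, only the collection that happens to use the same term is found. With it, the second collection shows up too, but the third, which has no subject terms at all, stays invisible to a subject search either way.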

Tuesday, October 18, 2011

Catalog Record Difficulties (IRLS675-Unit 9)

Creating a catalog record is expensive -- some estimates range from $50 to well over $100 per record. It's also not easy to create good metadata that is consistent enough so that queries across repositories (or even across different catalogers) return precise results. Discuss briefly the challenges you are having cataloging your items in terms of subject listings, key words and tags, categories and other facets. How are you approaching the problem of consistency (or are you)?


Cataloguing seems easiest in Drupal so far since I can easily customize the metadata and use some folksonomic tagging as well. My experiences with DSpace and EPrints have been less than satisfactory because their customization systems are so complex. I feel that most of my collection will be easily searchable when I have my final product, although with tags there is always the risk of misspellings and inconsistent variants. While I'm the sole manager of the repository, that will be fine, but it would be a lot more complex if it were open to the public. This is an area I need to think about more before creating my final repository.
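One way I could approach the misspelling risk is to check each new tag against the tags already in use before saving it. This is only a sketch using Python's standard `difflib` module; the function name `normalize_tag`, the sample tags, and the 0.8 similarity cutoff are my own assumptions, and it is not something Drupal, DSpace, or EPrints does for you.

```python
import difflib


def normalize_tag(tag, known_tags, cutoff=0.8):
    """Return the closest existing tag, or the cleaned-up tag if none is close.

    known_tags is assumed to hold lowercase tags already used in the repository.
    """
    cleaned = tag.strip().lower()
    matches = difflib.get_close_matches(cleaned, known_tags, n=1, cutoff=cutoff)
    return matches[0] if matches else cleaned


existing = ["sheet music", "ragtime"]          # hypothetical tags already in use
fixed = normalize_tag("Sheet Musc", existing)  # typo snaps to the existing tag
new = normalize_tag("waltz", existing)         # genuinely new tag is kept as-is
print(fixed, new)
```

A cutoff that is too low would merge tags that really are different, so in a public repository I would probably want a person to confirm each suggested match rather than apply it automatically.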

Wednesday, October 12, 2011

EPrints Install (IRLS675-Unit 8)

I thought the EPrints install was pretty straightforward. It was faster than Drupal, but slower than DSpace for me. I successfully branded my EPrints site using the second method (I tried the first and it didn't work for some reason), where I edit the main file to point to my logo. I also installed the "glass" theme, but it doesn't look noticeably different, so I'm not sure if it didn't work or if it's just not very distinctive. So far, EPrints seems difficult to customize, mostly because of the lack of good tutorials. It seems that the EPrints online community is not as large or as dedicated to helping out new users.

Tuesday, October 11, 2011

Tech Savvy (IRLS675-Unit 7)

This week, while trying to customize my DSpace repository, I realized how much more work I have to do to become tech savvy in this environment. I've learned a crazy amount of stuff so far in the DigIn courses, but I'm realizing now that I need to spend more time beyond/after the course to get into the advanced skills and culture of these programs. For instance, I need to brush up on my HTML and take it to the next level in order to create custom forms for inputting my data. I can see that this path in librarianship will be a lifelong learning experience.

Monday, October 3, 2011

Installing DSpace (IRLS675-Unit 6)

The DSpace install went super smoothly after I re-installed it using the proper version. I had attempted to install the older version with a different password and it messed up, so when I installed the newer version, I just went with the default passwords and it was fine. In looking over alternative DSpace installation instructions, the DuraSpace wiki seemed a bit confusing, and several people in the comments section pointed out errors in the command-line code. The SunScholar wiki, however, seemed very well organized and easy to follow. I think I could probably figure out how to do this on my own, but it's very helpful to have the instructions tailored to my specific use in this course since it was a no-brainer to just follow the professor's code.