Monday, October 25, 2010

Robot4583 at your service...beep beep

Cataloguing was a big concern for me in Eprints. This is because I had already entered every item in other repositories: some in Drupal, some in Dspace, a few in both. So, in the interest of being consistent with those previous entries, I looked into importing the metadata from those CMSs into Eprints.

In Drupal, this simply did not seem possible because I didn't use Views for my collection. It could be that if I created a view for it, I could export it, and I may try that in the future. I expect there are modules to make this all work better but when you are evaluating systems on the time span that I am, you don't necessarily ever reach this step.

In Dspace I was able to export my collection to a zipped folder, but I couldn't get the hang of batch importing into Eprints, if that is even possible. (It should be.) So instead I manually uploaded all 4 or 5 files from each Dspace item in that bulk export. One of them was classified as "other" and therefore required a description, which I made DspaceBundleDescription for each item. Then I manually added title (copied and pasted from the dublin core xml doc), abstract, creators, and status. All this saved me from having to enter details. I also started adding keywords (listed as subject in the Dspace-exported dublin_core xml doc). And I selected Plain Text as the format for the DspaceBundleDescription "Other" doc for each item, and also selected "Additional Metadata" for the dublin core xml docs.

Nonetheless I still had to then detail the record, adding at the very least: Title, Creators, and Status (I selected Submitted). All then had to be approved by me.

Eprints gave me many areas where I had to repeat identical metadata in this collection and this definitely led to human error. I minimized this as best I could by opening the Dspace-exported dublin_core.xml doc for each record and pasting from it and also constantly referencing it as I described the item in Eprints. But this felt like it should be unnecessary, and was certainly less foolproof than a computerized entry system.
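
The copy-and-paste part of this is the sort of thing a short script could probably take over. Here is a minimal sketch in Python, assuming each exported dublin_core.xml holds a list of dcvalue entries with element and qualifier attributes, as Dspace item exports normally do. The function name read_dublin_core, the dictionary keys, and the sample record are made up for illustration, and the sketch only reads the Dspace side; getting the result into Eprints is not covered.

```python
import xml.etree.ElementTree as ET


def read_dublin_core(xml_text):
    """Pull out the fields that were being pasted by hand.

    Hypothetical helper: the returned keys are illustrative and are not
    Eprints field names.
    """
    root = ET.fromstring(xml_text)
    record = {"title": None, "abstract": None, "creators": [], "keywords": []}

    for dcvalue in root.iter("dcvalue"):
        element = dcvalue.get("element")
        qualifier = dcvalue.get("qualifier")
        text = (dcvalue.text or "").strip()
        if not text:
            continue

        if element == "title" and qualifier in (None, "none"):
            # Keep the first unqualified title only.
            if record["title"] is None:
                record["title"] = text
        elif element == "description" and qualifier == "abstract":
            record["abstract"] = text
        elif element == "creator" or (
            element == "contributor" and qualifier == "author"
        ):
            record["creators"].append(text)
        elif element == "subject":
            # Dspace lists keywords under "subject".
            record["keywords"].append(text)

    return record


# Made-up sample record, shaped like an exported dublin_core.xml file.
SAMPLE = """<dublin_core>
  <dcvalue element="title" qualifier="none">A Sample Item</dcvalue>
  <dcvalue element="contributor" qualifier="author">Doe, Jane</dcvalue>
  <dcvalue element="description" qualifier="abstract">A short abstract.</dcvalue>
  <dcvalue element="subject" qualifier="none">repositories</dcvalue>
  <dcvalue element="subject" qualifier="none">metadata</dcvalue>
</dublin_core>"""

if __name__ == "__main__":
    print(read_dublin_core(SAMPLE))
```

Run over every item folder in the export, something like this would at least put the title, abstract, creators, and keywords for each record in one place, so the same values don't have to be re-read and re-pasted for every field.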

One day, they'll sort out repositories so that humans only have to do what humans are good at: enter new concepts. For now, though, this is Robot4583, signing off.
