Some time ago I bought a barcode scanner. For a short time after I got it I played around with scanning anything I could find that had a barcode. Only now, almost a year later, have I finally put it to use cataloging all of my novels.
Yesterday I spent an hour or so assembling a Perl script to:
- Accept barcode as input
- Attempt to convert Bookland EAN barcode to ISBN
- Scrape title and author from the page on the Dymocks website for the ISBN
- Insert the obtained title, author, and ISBN into a simple database table
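The conversion step in that list can be sketched roughly as follows. This is a Python illustration rather than the Perl script itself, and the function name `ean13_to_isbn10` is made up for the example. It assumes the scanner delivers the barcode as a plain string of 13 digits: a Bookland EAN for an existing ISBN starts with 978, the next nine digits are the ISBN body, and the ISBN-10 check digit has to be recalculated because the two schemes use different checksums.

```python
import re
from typing import Optional


def ean13_to_isbn10(ean: str) -> Optional[str]:
    """Convert a 978-prefixed Bookland EAN to an ISBN-10, or return None.

    Hypothetical helper for illustration only; None means the input is not
    a valid 978-prefixed EAN-13 (for example a 12 digit UPC).
    """
    digits = ean.strip().replace("-", "")
    if not re.fullmatch(r"978[0-9]{10}", digits):
        return None

    # EAN-13 check: weight the first 12 digits 1, 3, 1, 3, ...
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    if (10 - total % 10) % 10 != int(digits[12]):
        return None

    # ISBN-10 check: weight the nine body digits 10 down to 2, modulo 11.
    body = digits[3:12]
    weighted = sum(int(d) * w for d, w in zip(body, range(10, 1, -1)))
    check = (11 - weighted % 11) % 11
    return body + ("X" if check == 10 else str(check))


if __name__ == "__main__":
    print(ean13_to_isbn10("9780306406157"))
```

Returning `None` instead of raising keeps a scanning loop simple: anything that does not convert can just be set aside for manual entry.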
This allowed me to scan all of my novels in about two and a half hours and have just over three hundred and fifty of them (i.e. the ones that Dymocks has records for) entered into the database. The reason I got a scanner with a USB connection was so that I could run it from my PowerBook and connect over wireless to one of my Linux boxes, where the script was running.
At this point I copied and then modified some existing PHP code to give me a simple CRUD interface so I could manually enter the remaining books, which consisted of:
- Three dozen with Bookland EAN barcodes whose resulting ISBN wasn’t found on the Dymocks website
- One dozen with UPC barcodes but an ISBN printed on the book. I was able to look up around half of these on the Dymocks website.
- A dozen and a half with no barcode (older books that I had bought second hand) but an ISBN printed on the back or inside the front cover. Three of these ISBNs were on the Dymocks website.
- Another dozen and a half with no barcode and no ISBN (even older books) that I have not yet entered into the database.
Apart from that last group I now have an electronic listing of my books. Unfortunately there are two problems: first, the data I scraped from the Dymocks site wasn’t the best, with inconsistencies in author names; second, there was no representation of sets or of the book order within a set. I have begun entering this information manually, with the side effect of identifying which books I am missing…
While I was manually entering books I started to think about how to handle them long term. If I were to print out barcodes (thanks to a free barcode font) to stick on these books, it would be a simple matter of just scanning them in the future. This would also allow me to simply scan the books with UPC barcodes.
One thing that I may change is that I am storing the ten digit ISBN in the database. As ISBNs are transitioning to thirteen digits (equivalent to the Bookland EAN for existing numbers), I may switch over to storing that…
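Migrating the stored values would be mechanical. Here is a rough Python sketch of the conversion, with a made-up function name `isbn10_to_isbn13`: prefix the first nine digits with 978 and compute a fresh EAN-13 check digit. It assumes the stored ISBN-10 is already well formed and does not re-verify its old check digit.

```python
import re


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to the equivalent 978-prefixed ISBN-13.

    Hypothetical helper for illustration; the old ISBN-10 check digit is
    dropped without being verified.
    """
    core = isbn10.replace("-", "").replace(" ", "").upper()
    if not re.fullmatch(r"[0-9]{9}[0-9X]", core):
        raise ValueError(f"not an ISBN-10: {isbn10!r}")

    stem = "978" + core[:9]
    # EAN-13 check digit: weights alternate 1, 3, 1, 3, ... across 12 digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(stem))
    return stem + str((10 - total % 10) % 10)


if __name__ == "__main__":
    print(isbn10_to_isbn13("0306406152"))
```

Since the thirteen digit form is exactly what the scanner reads off a Bookland barcode, storing it would also remove the conversion step for newly scanned books.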