Creative Communities of the World Forums

The peer-to-peer support community for media production professionals.


  • checksum in archival workflow?

    Posted by Paul Dougherty on July 18, 2016 at 2:57 pm

I’m trying to help a colleague with a fairly large archiving/digitizing project covering a career’s worth of work. As producers go he is pretty technical, but not a hands-on techie per se. We can use ShotPut Pro to copy all files and provide checksum verification (and a checksum value for future reference) every time an asset is copied. I don’t think his scale and needs warrant LTO tape.

Most likely the digitized NTSC “master” files will get stored on a minimum of two drives, each in a different location. Good-quality access copies, aka screeners, will get stored in the cloud. Once we accomplish this, it’s hard to know what my parting words (or memo) about re-verification and auditing of his archives should be.

I know there are rules of thumb that files should get migrated to new drives every so many years. But trickier still for me is suggesting a re-verification regime for a non-techie. Suggestions?

For years I have used CDFinder (now NeoFinder) to ride herd on file collections spanning many hard drives. I don’t have the latest NeoFinder, but it offers a FileCheck feature that seems to address this issue (see below). It seems like it might be a great fit, but I would love to hear from others.

    Thanks in advance for any suggestions.

    Paul

    https://www.cdfinder.de/en/en/filecheck.html

From their site: “If you verify the FileCheck values for an entire catalog, CDFinder will even show you a window containing all files that did NOT pass the check, so you know exactly which files are damaged and need to be replaced. Of course, CDFinder also displays the actual MD5 value for every file in the Inspector.”
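A minimal sketch of the kind of re-verification regime being asked about, using plain MD5 manifests rather than any particular product (the drive path is hypothetical, and this assumes GNU coreutils’ `md5sum`, which on a Mac would come from Homebrew or similar):

```shell
# One-time setup, right after the verified copy lands on the archive drive:
cd /Volumes/Archive_A
find . -type f ! -name checksums.md5 -exec md5sum {} + > checksums.md5

# Periodic audit (say, every 6-12 months): re-hash everything and
# compare against the stored manifest; any mismatch is listed by name.
md5sum -c --quiet checksums.md5 && echo "all files verified OK"
```

Because the manifest lives on the drive next to the files, a non-techie only has to remember one command per drive for the audit step.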

  • 5 Replies
  • Tim Jones

    July 18, 2016 at 5:10 pm

    Hi Paul,

    Are you strictly working with disk or is LTO coming into this process?

    Tim

    Tim Jones
    CTO – TOLIS Group, Inc.
    https://www.tolisgroup.com
    BRU … because it’s the RESTORE that matters!

  • Paul Dougherty

    July 18, 2016 at 7:04 pm

Hi Tim, though I can’t absolutely guarantee it forever, I’d say no LTO on this project. And even if that should change, I expect to work with clients who have small collections and will never employ LTO. So I’d still be seeking an answer for a no-LTO scenario.

    Thanks,

    Paul

  • Tim Jones

    July 18, 2016 at 7:17 pm

In that case, I would recommend using something like rsync or rsyncX (rsync with a GUI), since they perform checksumming automatically at the time of the file copy to the destination. It’s equivalent to running an MD5 on the source end, copying the files to the destination, re-running the MD5 on the destination copy, and then comparing the results.

Any other mechanism would involve manual processes to generate the checksums, and that can lead to the loss of the sidecar MD5 values, leaving you with no way to verify the copied files.

    As an aside, you could buy an LTO Thunderbolt solution and use it to provide the LTO side of the equation as a service for the users regardless of their size. LTO-6 tapes are only around $30 each, so there’s a potential for a new service offering for your customers.

    Tim


  • Paul Dougherty

    July 18, 2016 at 8:30 pm

    Thanks Tim,

I have to admit that the advantages of the rsyncX suggestion went over my head, especially as compared to ShotPut Pro.

    Best,

    Paul

  • Tim Jones

    July 18, 2016 at 9:50 pm

    Sorry – I totally missed that you were using ShotPut PRO for the offloading and copies (I’m still trying to get used to the new Cow forum look and workflow 🙂 ).

The difference is that rsync/rsyncX do the checksumming transparently, and any errors are recognized at the point of copy rather than after the fact via a separate sidecar checksum database/file.

    Tim

