Creative Communities of the World Forums

The peer to peer support community for media production professionals.

  • Finding duplicates among 7000+ videos

    Posted by Xavier Paredes on September 26, 2017 at 7:08 pm

    Dear forum members,

    We have a rather large and daunting potential project. Our client has 7000+ video files on their Brightcove.com account and needs us to find duplicate videos so they can be taken down. The client has downloaded all 7000+ videos from the account via the API. When searching for dupes, we cannot use file names, because each time a video is uploaded, Brightcove assigns a 13-digit number to the renditions, and that number becomes the file name when you download it. So right now we have a hard drive with 7000+ MP4s with file names such as 2591587223001.MP4, which is all we have to work with.

    We are hoping to find an application or service that can find possible duplicate videos by analyzing the content of each and create some kind of report that we can then use to make it easier to manually identify the *definite* duplicates.

    We understand that no tool will be able to find duplicates with 100% accuracy. Nevertheless, such a tool would help us narrow the list down to a more manageable size.

    Hoping to hear feedback from anyone who can point us in the right direction or who has some experience with such a task.

    Thank you in advance!

    Xavier

  • 4 Replies
  • Doug Metz

    September 26, 2017 at 7:28 pm

    Can they get a metadata dump from Brightcove?

    Doug Metz

    Anode

  • Noah Kadner

    September 26, 2017 at 7:32 pm

    One possible solution that uses machine learning to identify video contents is Cantemo’s Iconik, which is in beta:

    https://www.streamingmedia.com/Articles/Editorial/Featured-Articles/Google-Video-Intelligence-Analyzes-Images-in-Videos-119488.aspx

    That’s assuming that you don’t have an easier-to-sort metric such as identical file sizes or runtimes. Or you could always go old-school and get a couple of interns to sift through the footage. I’d guess that’s maybe 80 hours of actual work, or two interns for a week.
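For anyone who wants to try the file-size idea first, here is a minimal Python sketch. It assumes all the downloaded MP4s sit in one folder; the function name `group_by_size` is just illustrative. Matching sizes only flag candidates, since two different videos can happen to share a size and two renditions of the same video usually will not.

```python
from collections import defaultdict
from pathlib import Path


def group_by_size(folder):
    """Map file size in bytes -> sorted list of MP4 paths sharing that size.

    Only sizes shared by two or more files are returned.
    """
    by_size = defaultdict(list)
    for path in Path(folder).iterdir():
        # Accept .mp4 and .MP4 alike; the downloads look like 2591587223001.MP4.
        if path.is_file() and path.suffix.lower() == ".mp4":
            by_size[path.stat().st_size].append(path)
    return {size: sorted(paths) for size, paths in by_size.items() if len(paths) > 1}
```

Calling `group_by_size("/path/to/downloads")` (the path is a placeholder) gives a short list of size groups to check by eye.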

    Noah

    FCPWORKS – FCPX Workflow
    FCP Exchange – FCPX Workshops
    XinTwo – FCPX Training

  • Jason Brown

    September 26, 2017 at 9:30 pm

    A basic sort by duration could probably get you a good chunk of the way there.

    If the exact frame duration is the same, it’s likely the same video.
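A rough Python sketch of the duration sort, assuming `ffprobe` (part of FFmpeg) is installed and on the PATH. The 30 fps default used to turn seconds into a frame count is an assumption, not something known about these files, and the function names are illustrative. Durations that land near a rounding boundary can end up in neighbouring groups, so treat the output as candidates only.

```python
import subprocess
from collections import defaultdict


def probe_duration(path):
    """Return the container duration in seconds, as reported by ffprobe."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(path)],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())


def group_by_duration(paths, get_duration=probe_duration, fps=30.0):
    """Group paths whose durations round to the same frame count at `fps`.

    `get_duration` is injectable so the grouping can be tried without ffprobe.
    Returns a list of groups, each with two or more paths, shortest first.
    """
    groups = defaultdict(list)
    for path in paths:
        frames = round(get_duration(path) * fps)
        groups[frames].append(path)
    return [sorted(group) for _, group in sorted(groups.items()) if len(group) > 1]
```

Passing a list of file paths returns only the groups with matching frame counts, which is what would go to a human for a visual check.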

    We use Brightcove, and there is definitely a way to get a metadata report. I’ve worked directly with that team and they are responsive and interested in solving problems. I would expect them to be super willing to help out.
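If a metadata report is available, something like the following Python sketch could group entries that share a title and duration. It assumes the report is a CSV with columns named `id`, `name`, and `duration`; those column names are a guess and should be changed to match whatever the actual export contains.

```python
import csv
from collections import defaultdict


def duplicate_candidates(csv_file, key_columns=("name", "duration"), id_column="id"):
    """Group video ids by the values in `key_columns`.

    `csv_file` is any open text file (or file-like object) with a header row.
    Column names are assumptions about the export format. Returns only the
    groups that contain two or more ids.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Normalise case and whitespace so "Spring Promo" matches "spring promo ".
        key = tuple(row[col].strip().lower() for col in key_columns)
        groups[key].append(row[id_column])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Because the ids in the report should match the 13-digit file names on the drive, each group maps straight back to the MP4s to compare.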

  • Joe Marler

    September 26, 2017 at 9:45 pm

    [Xavier Paredes] “We are hoping to find an application or service that can find possible duplicate videos by analyzing the content of each and create some kind of report that we can then use to make it easier to manually identify the *definite* duplicates…”

    Gemini 2 will do this. It will first report “no duplicates found”, then if you click on the report you’ll see suspected duplicates based on some criteria (I think file size). It then allows you to delete the duplicates. For each duplicate you can examine a thumbnail to verify this before deleting them, or you can have it just delete the duplicates without inspecting them. If the file date/time has been altered there’s also an app preference to check for similar files having a date/time within a certain window. There are other tools for this but Gemini is what I’ve used:

    https://itunes.apple.com/us/app/gemini-2-the-duplicate-finder/id1090488118?mt=12
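As a scripted alternative, byte-identical files can be found by hashing their contents. This Python sketch is not a description of how Gemini 2 works internally, and the function names are illustrative. It only catches exact copies: the same video encoded twice will hash differently, so it complements the size and duration checks rather than replacing them.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicates(paths):
    """Return groups (lists of Path objects) of files with identical contents."""
    by_hash = defaultdict(list)
    for path in paths:
        by_hash[sha256_of(path)].append(Path(path))
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Hashing 7000+ video files reads every byte on the drive, so it is worth running only on files that already share a size.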
