Two thoughts, mostly conceptual. First, I would start with the synchronized moments. Find those, and the rest can fall wherever it falls.
Second, though this isn’t the way you would deliver it, I would resize and position each of the 10 video layers so that I could see them all on one screen (or even render that view out to reflect on).
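If it helps to work out the numbers for that all-on-one-screen view, here is a rough sketch in Python. It is not tied to any particular editing application; the function name `grid_layout`, the 1920x1080 screen, and the 1920x1080 source size are all assumptions for illustration. It just computes a scale factor and a top-left position for each layer so they tile into a near-square grid.

```python
import math

def grid_layout(n_layers, screen_w=1920, screen_h=1080, src_w=1920, src_h=1080):
    """Return (scale, tiles) where tiles is a list of (x, y, w, h) per layer.

    Assumes every layer has the same source size (src_w x src_h).
    """
    # Near-square grid: as many columns as needed, then enough rows to fit.
    cols = math.ceil(math.sqrt(n_layers))
    rows = math.ceil(n_layers / cols)

    cell_w = screen_w / cols
    cell_h = screen_h / rows

    # Uniform scale so a layer fits inside its cell without distortion.
    scale = min(cell_w / src_w, cell_h / src_h)
    tile_w = round(src_w * scale)
    tile_h = round(src_h * scale)

    tiles = []
    for i in range(n_layers):
        row, col = divmod(i, cols)
        # Center the scaled layer inside its grid cell.
        x = round(col * cell_w + (cell_w - tile_w) / 2)
        y = round(row * cell_h + (cell_h - tile_h) / 2)
        tiles.append((x, y, tile_w, tile_h))
    return scale, tiles

if __name__ == "__main__":
    scale, tiles = grid_layout(10)
    print(f"scale each layer to {scale:.0%}")
    for i, (x, y, w, h) in enumerate(tiles, start=1):
        print(f"layer {i}: top-left=({x}, {y}) size={w}x{h}")
```

With those assumed sizes, 10 layers land in a 4-by-3 grid with each layer scaled to 25% (480x270), which leaves two empty cells in the last row. You would then type the scale and position values into whatever transform controls your editor gives you.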