Unfortunately I don’t think you can merge a multiclip with an audio file, so it’s best to merge video and audio before creating multiclips. I’ve cut many projects shot on the 5D with a separate audio source. I usually do this:
1) Drop Cam A video into a timeline without camera audio. Drop the Zoom audio into the same sequence and sync them up. Command+L to link. Now drag this clip from the timeline into a new bin in the browser to create a new master clip with video and Zoom audio only. Set an IN on the sync point.
2) Set your IN on the Cam B sync point. Now create a multiclip with Cams A and B. Your Cam B audio will still be married into the multiclip, but you can just keep the audio on Cam A only while editing video into the timeline.
Or this:
1) Set an IN on the sync points for Cam A, Cam B and Zoom Audio A. In the browser, highlight Cam A and Audio A, Control+click, then select “Merge Clips”. This will create your new master clip with the good audio (and the camera audio).
2) Highlight the new merged clip and Cam B, then Control+click and “create multiclip”. Your new multiclip will have a bunch of audio tracks, but just make sure to only patch the good audio in the timeline when inserting.
There might be a better way, but I can usually knock these out pretty quick. PluralEyes helps if there isn’t a slate. Also, this can get messy quick, so I usually make a few temp bins to avoid clutter (“Multiclips”, “temp”, “new master clips”, etc.). You can delete your original Cam A, Cam B and Audio from the FCP browser once you create these new multiclips.
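If it helps to picture what lining clips up on their IN points amounts to, here is a rough Python sketch of the arithmetic. It is only an illustration, not how FCP does it internally. The function names and the timecodes are made up, and it assumes plain non-drop-frame timecode at a whole-number frame rate (24 here), which is a simplification of what the 5D actually records.

```python
def timecode_to_frames(tc: str, fps: int = 24) -> int:
    """Convert a non-drop-frame 'HH:MM:SS:FF' timecode to a frame count.

    Assumes an integer frame rate; real footage at 23.976 or 29.97 fps
    would need more care.
    """
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def sync_offset(in_a: str, in_b: str, fps: int = 24) -> int:
    """Frames to slide clip B so its sync point lands on clip A's sync point."""
    return timecode_to_frames(in_a, fps) - timecode_to_frames(in_b, fps)


# Hypothetical sync points (for example, the slate clap) in each source.
cam_a_in = "00:00:12:10"   # clap as seen in the Cam A clip
zoom_in = "00:00:03:02"    # clap as heard in the Zoom recording

offset = sync_offset(cam_a_in, zoom_in)
print(f"Slide the Zoom audio {offset} frames later to match Cam A")
```

A positive result means the second clip has to move later relative to the first, and a negative one means earlier. Setting the IN on each sync point saves you from doing this by hand.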