There are two parts to this. The first is how to fix it. The second is what I believe is going wrong.
1. You’ll have to use Apple Compressor to transcode your footage into the SAME format. That is, the frame size, Vid Rate, Compressor, and Aud Rate all need to be the same. Since most of your footage is DVCPRO HD 1080p30 @ 23.98 fps, I suggest transcoding the 29.97 fps footage to that vid rate, simply because it looks like that will take less time. Then make sure your sequence settings (frame size, vid rate, etc.) match the transcoded footage.
2. I believe the error you’re seeing is caused by placing mismatched footage into a sequence. I assume that also means you’re having to render in the sequence. A good rule of thumb: if you bring footage into a sequence and you have to render, something is wrong, so stop and figure it out. FCP Studio 2 is cool because it will “auto pick” your sequence presets. (Why they didn’t do this with 4.0, when I started FCP, I’ll never know!) But that won’t solve your problem, because your vid rates differ.
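To make the “everything must match” rule concrete, here’s a minimal Python sketch. It is not something FCP or Compressor provides; the ClipSettings class, the clip names, and all the values are made up for illustration. It just compares each clip’s frame size, vid rate, compressor, and aud rate against the sequence and lists the clips that would need transcoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipSettings:
    """Hypothetical container for the four settings that have to match."""
    frame_size: tuple   # (width, height) in pixels
    vid_rate: float     # frames per second
    compressor: str     # codec name as shown in the browser
    aud_rate: int       # audio sample rate in Hz


def clips_to_transcode(sequence, clips):
    """Return the names of clips whose settings differ from the sequence."""
    return [name for name, settings in clips.items() if settings != sequence]


# Placeholder values, not taken from any real project.
sequence = ClipSettings((1920, 1080), 23.98, "DVCPRO HD 1080p30", 48000)
clips = {
    "interview_a": ClipSettings((1920, 1080), 23.98, "DVCPRO HD 1080p30", 48000),
    "interview_b": ClipSettings((1920, 1080), 29.97, "DVCPRO HD 1080p30", 48000),
}

print(clips_to_transcode(sequence, clips))  # ['interview_b']
```

Anything this prints is footage I’d run through Compressor before it goes anywhere near the sequence.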
Vid Rate is a trippy thing. Imagine the first frame as being 0, followed by the second frame as 1, then 2, 3, and so on up to 23. Then where does it go? Well, if you’re in a 23.98 sequence, the count rolls over to the next second. If you’re in a 29.97 sequence, it keeps going to 24, 25, 26, 27, 28, 29 before it rolls over. Since the footage itself is 23.98 and the sequence is 29.97 (or vice versa), the computer has to recount the old frames. But it doesn’t “transcode” the vid rate. Instead, as far as I can tell, it creates a render file which tells the sequence what the new vid rate is. It’s terribly complicated.
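Here’s a small Python sketch of that counting, just to show where the two counters disagree. The frame_label function is my own made-up helper, and the last part is a naive “repeat the nearest earlier frame” mapping that I’m assuming purely for illustration. I’m not claiming it is what FCP does when it renders.

```python
def frame_label(frame_index, timebase):
    """Label a frame as seconds:frames for a counter that wraps at `timebase`."""
    seconds, frames = divmod(frame_index, timebase)
    return f"{seconds:02d}:{frames:02d}"


# The frame right after number 23 (index 24, counting from 0):
print(frame_label(24, 24))  # 01:00 -> a 23.98 sequence rolls over to the next second
print(frame_label(24, 30))  # 00:24 -> a 29.97 sequence keeps counting up to 29

# Naive illustration only: fill 30 sequence slots from 24 source frames
# by picking, for each slot, the source frame at the same point in time.
source_for_slot = [slot * 24 // 30 for slot in range(30)]
repeats = len(source_for_slot) - len(set(source_for_slot))
print(repeats)  # 6 -> six slots per second have to reuse a source frame
```

Those six leftover slots every second are the reason the sequence can’t just play the clip as-is.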
2:3:3:2 is way different. It’s about video FIELDS, not frames. There’s lots more to say on that, but I’ll skip it. Anyway, when we edit in FCP we do not make our edits on fields, but on frames, so the fancy math creates a frame out of fields. Two fields sitting next to each other won’t always come from the same original frame, so the fields must be interleaved to create a frame. If you assume it takes two fields to make one frame, then you’ll notice 2:3:3:2 means something like “the first frame gets two fields, the second frame gets three fields, the third frame gets three fields, the fourth frame gets two fields.” Add the number of fields up: 2+3+3+2=10. That means 4 of the original 23.98 frames are spread across 10 fields. At 2 fields per frame, those 10 fields make 5 frames of 29.97 video. That extra fifth frame is what the “pull down” adds, and it’s the one that has to come back out to recover the original 4 frames. Here’s a bit of video blasphemy: this “pull down” all by itself won’t make your video look like film 🙂
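If it helps, here’s the 2:3:3:2 cadence as a minimal Python sketch. The pulldown function and the frame names A through D are made up for illustration, and I’m ignoring which field is upper and which is lower to keep it short.

```python
def pulldown(frames, cadence=(2, 3, 3, 2)):
    """Spread source frames across fields, then pair fields into video frames."""
    fields = []
    for frame, count in zip(frames, cadence):
        fields.extend([frame] * count)   # hold each source frame for `count` fields
    # Every two consecutive fields are interleaved into one video frame.
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]


video_frames = pulldown("ABCD")
print(video_frames)
# [('A', 'A'), ('B', 'B'), ('B', 'C'), ('C', 'C'), ('D', 'D')]

# Only the middle frame mixes fields from two different source frames.
# Dropping it leaves the four original frames.
clean = [pair for pair in video_frames if pair[0] == pair[1]]
print(len(video_frames), len(clean))  # 5 4
```

Four frames go in, five come out, and only one of the five is a mix of two originals.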
Well, that’s my best guess on what’s going wrong, and I hope someone checks this post for accuracy!