Hi Rick,
The results are a lot cleaner when you eliminate every other frame (kicking 59.940 down to 29.970) than when you try to create every other frame (kicking 29.970 up to 59.940), so I would tend to produce your combined source media at 29.970 fps.
With that said, your project and your source media may be an exception. If I had the majority at 59.940 with maybe 5% at 29.970, and most of the 59.940 was fast-action, sports-type footage, I might opt to produce at the higher frame rate, because fast action shot at 59.940 usually presents a smoother view than the slower 29.970.
To turn 29.970 into 59.940, the program can simply duplicate each frame to fill in (not very good results), or it can interpolate: it builds what it thinks the new frame should look like by comparing the previous frame to the following frame and guessing the in-between with mathematics. Some programs and methods are better at this than others, and some source media subjects work better than others. It is up to you, the viewer, to decide whether you like the results and whether they are acceptable for your purpose.
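To make the difference concrete, here is a rough Python sketch of the three operations: dropping every other frame, duplicating frames, and a very simple interpolation. Everything in it is an assumption for illustration only. A "frame" is just a list of pixel brightness values, the function names are made up, and the interpolation is a plain average of the two neighboring frames, which is far cruder than the motion-based methods a real editing program may use.

```python
def drop_every_other(frames):
    """59.940 -> 29.970: keep frames 0, 2, 4, ... and discard the rest."""
    return frames[::2]


def duplicate_frames(frames):
    """29.970 -> 59.940: show every frame twice."""
    doubled = []
    for frame in frames:
        doubled.append(frame)
        doubled.append(list(frame))  # exact copy used as the fill-in frame
    return doubled


def blend_frames(frames):
    """29.970 -> 59.940: fill in with the average of each frame and the next.

    The last frame has no following frame, so it is averaged with itself,
    which amounts to duplicating it.
    """
    doubled = []
    for i, frame in enumerate(frames):
        following = frames[i + 1] if i + 1 < len(frames) else frame
        doubled.append(frame)
        doubled.append([(a + b) / 2 for a, b in zip(frame, following)])
    return doubled


# A bright dot that moves two pixels to the right between two frames.
source = [
    [100, 0, 0, 0, 0],
    [0, 0, 100, 0, 0],
]

print(duplicate_frames(source)[1])  # [100, 0, 0, 0, 0]
print(blend_frames(source)[1])      # [50.0, 0.0, 50.0, 0.0, 0.0]
```

In this toy case the duplicated frame leaves the dot where it was, and the averaged frame shows two half-bright dots rather than one dot halfway along. That is the kind of stutter or ghosting that makes built-up frames look worse than frames that were actually shot.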
So rather than follow a general rule, I like to approach it on a case-by-case basis, although I tend to shoot most of my action-type videos meant for something like YouTube at 59.940 and other stuff at 24 fps. I also decide before I start a project what will be best for viewing and for the project, and I try to keep all the source media matched (not always easy to do!).
So, most likely you already know everything I have said!
Best Regards……George