How much delay (if any) a particular application needs depends on many details of the specific situation. It is correct that the KanexPro AUA2DCV only does analog-to-digital conversion and has no delay function of any kind.
I was only offering my first-hand experience that the KanexPro AUA2DCV produces a digital signal compatible with the ATEM TVS.
Certainly there are many options for audio delay, some of which include A/D conversion as well. But the last time we put a new TVS into service, we discovered that the Behringer Ultramatch Pro SRC2496 was out of stock at all the usual suspects, I suspect because of the popularity of the BMD ATEM TVS. That is why I tried using the Kanex unit. It is a fraction of the size of a full 1U rack device (like the SRC2496) and a fraction of the cost as well.
Note that the ATEM TVS ignores and discards any audio coming in via the digital video inputs, so I don’t really understand your question about the “camera mics.”