I think the word "truncation" is typically used when reducing word length, say from 24-bit to 16-bit. This is where dithering comes into play. 24-bit audio is preferred for tracking and mixing because of the available headroom and the resulting reduced need to record at or very close to full-code.
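To make the truncation-versus-dither distinction concrete, here is a minimal Python sketch. The function name `reduce_24_to_16` and its parameters are made up for illustration, and the TPDF dither shown is just one common choice, not necessarily what any particular DAW or converter does.

```python
import random

def reduce_24_to_16(sample_24, dither=True, rng=random):
    """Convert one signed 24-bit integer sample to a signed 16-bit integer.

    Hypothetical helper for illustration only.
    """
    if not dither:
        # Plain truncation: drop the 8 least significant bits.
        return sample_24 >> 8
    # TPDF dither: the sum of two uniform values, each spanning one
    # 16-bit LSB (256 units at 24-bit resolution), added before rounding.
    noise = rng.uniform(-128, 128) + rng.uniform(-128, 128)
    value = round((sample_24 + noise) / 256)
    # Clamp to the 16-bit range in case the noise pushes a peak over.
    return max(-32768, min(32767, value))
```

With truncation, the detail below the 16th bit is simply lost. With dither, any single sample is a little noisier, but the average of many conversions still tracks the original 24-bit value.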
When converting high sample rates to lower rates, the "extra" samples are essentially discarded, leaving the D/A's filters to reconstruct an accurate analog output from sparser data. In digital audio's early days, these filters were a source of brittleness and harshness, hence the desire for higher sample rates, which allowed a gentler filter to be used when reconstructing the analog output. With today's oversampling converters and very accurate brickwall filters, most folks, if not all, would be hard-pressed to identify any sonic difference between rates of 192k and 44.1k.
If you convert each track independently, I believe you'd be able to use almost any SRC without trouble, as any errors introduced will be masked by the other tracks. By using one of the recommended SRCs prior to mixing, including the Apple or current MOTU SRC, I believe you are assured a transparent and completely satisfactory result. In all honesty, you could probably do the conversion post-mixing (SRC the mix track) and still have happy customers, but then you'd have to deal with mixing at 192!
Knock 'em dead with your mixes, FM!
For additional reading, here's an excerpted discussion from Wikipedia:
Example
CDs are sampled at 44.1 kHz, but a Digital Audio Tape, or DAT is usually sampled at 48 kHz. How can material be converted from one sample rate to the other?
First, note that 44.1 and 48 are in the ratio 147/160. If the original audio signal had been recorded at a 7.056 MHz sampling rate, the process would be simple. Since 7.056 MHz is 160 x 44.1 kHz, and also 147 x 48 kHz, all we would need to do is take every 160th sample to get a 44.1 kHz sampling rate, and every 147th sample to get a 48 kHz sampling rate. Taking every Nth sample like this preserves the content, provided the information (the audio signal) does not have any content above half the lowest sampling rate used (22.05 kHz in this case).
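The arithmetic above is easy to check. This small Python sketch (the helper name `common_rate` is made up for illustration) finds the lowest rate that both 44.1 kHz and 48 kHz divide evenly, and the "take every Nth sample" step for each:

```python
from fractions import Fraction
from math import gcd

def common_rate(rate_a, rate_b):
    """Return (common rate, step for rate_a, step for rate_b), rates in Hz."""
    common = rate_a * rate_b // gcd(rate_a, rate_b)  # least common multiple
    return common, common // rate_a, common // rate_b

common, step_cd, step_dat = common_rate(44100, 48000)
ratio = Fraction(44100, 48000)  # reduces to 147/160

print(common, step_cd, step_dat, ratio)
```

Here `common` is 7056000 Hz (7.056 MHz), `step_cd` is 160 and `step_dat` is 147, matching the figures in the excerpt.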
So now the problem is how to generate the 7.056 MHz sampled signal, given that the original has only 1/160 of the samples needed. The somewhat surprising answer is to replace the missing samples with zeros. So if the original audio samples were ..,a,b,c,.., then the 7.056 MHz sequence is ..,a,0,0,0,...0,0,b,0,0...0,0,c,.., with 159 zeros between each original sample. This too will create extra high frequency content (in fact it is worse in this respect than linear interpolation) but at least the frequency response is flat. Then the digital filter removes the unwanted high frequency content. The work of this digital filter is also much easier if zeros are inserted, since the filter is basically an average and almost all of the samples are known to be zero.
So inserting the zeros, then running the digital filter (almost always an FIR filter since these can be designed to have no phase distortion) gives the needed signal - sampled at 7.056 MHz, but with no content above 24 kHz. Then just taking every 147th sample gives the desired output. Which sample to start with does not matter - any set will work as long as they are 147 samples apart.
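The three steps described above (insert zeros, low-pass filter, keep every Nth sample) can be sketched in plain Python. This is only an illustration under some assumptions: the names `design_lowpass`, `resample` and `taps_per_phase` are made up, the filter is a simple Hamming-windowed sinc that I picked for brevity, and the cutoff is placed at the lower of the two Nyquist limits (22.05 kHz for 44.1k to 48k). Real converters use far more efficient polyphase designs that never store the zeros.

```python
import math

def design_lowpass(num_taps, cutoff):
    """Hamming-windowed sinc FIR low-pass (symmetric, so linear phase).

    cutoff is a fraction of the sampling rate (0 to 0.5); num_taps is odd.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC

def resample(samples, up, down, taps_per_phase=16):
    """Resample by the ratio up/down, e.g. up=160, down=147 for 44.1k to 48k."""
    # Step 1: insert (up - 1) zeros after every original sample.
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0.0] * (up - 1))

    # Step 2: low-pass at the lower Nyquist limit, relative to the high rate.
    cutoff = 0.5 / max(up, down)
    num_taps = 2 * taps_per_phase * max(up, down) + 1
    taps = design_lowpass(num_taps, cutoff)
    delay = (num_taps - 1) // 2  # compensate the filter's group delay

    # Step 3: evaluate the filter only at every `down`-th high-rate sample.
    out = []
    for center in range(0, len(stuffed), down):
        acc = 0.0
        for k, h in enumerate(taps):
            idx = center + delay - k
            if 0 <= idx < len(stuffed):
                acc += h * stuffed[idx]
        out.append(up * acc)  # restore the level lost to the inserted zeros
    return out
```

Calling `resample(samples, 160, 147)` on a list of 44.1 kHz samples gives an approximation of the same audio at 48 kHz; the first and last few output samples are less accurate because the filter runs off the ends of the input.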