In my opinion performance should be ignored (not really, but micro optimizations should) until you have a reason for that. Without some hard requirements (this is in a tight loop that takes most of the CPU, the actual implementations of the interface member functions is very small…) it would be very hard if not impossible to notice the difference.
So I would focus on a higher design level. Does it make sense that all types used in UseA
share a common base? Are they really related? Is there a clear is-a relationship between the types? Then the OO approach might work. Are they unrelated? That is, do they share some traits but there is no direct is-a relationship that you can model? Go for the template approach.
The main advantage of the template is that you can use types that don’t conform to a particular and exact inheritance hierarchy. For example, you can store anything in a vector that is copy-constructible (move-constructible in C++11), but an int
and a Car
are not really related in any ways. This way, you reduce the coupling between the different types used with your UseA
type.
One of the disadvantages of templates is that each template instantiation is a different type that is unrelated to the rest of the template instantiations generated out of the same base template. This means that you cannot store UseA<A>
and UseA<B>
inside the same container, there will be code-bloat (UseA<int>::foo
and UseA<double>::foo
both are generated in the binary), longer compile times (even without considering the extra functions, two translation units that use UseA<int>::foo
will both generate the same function, and the linker will have to discard one of them).
Regarding the performance that other answers claim, they are somehow right, but most miss the important points. The main advantage of choosing templates over dynamic dispatch is not the extra overhead of the dynamic dispatch, but the fact that small functions can be inlined by the compiler (if the function definition itself is visible).
If the functions are not inlined, unless the function takes just very few cycles to execute, the overall cost of the function will trump the extra cost of dynamic dispatch (i.e. the extra indirection in the call and the possible offset of the this
pointer in the case of multiple/virtual inheritance). If the functions do some actual work, and/or they cannot be inlined they will have the same performance.
Even in the few cases where the difference in performance of one approach from the other could be measurable (say that the functions only take two cycles, and that dispatch thus doubles the cost of each function) if this code is part of the 80% of the code that takes less than 20% of the cpu time, and say that this particular piece of code takes 1% of the cpu (which is a huge amount if you consider the premise that for performance to be noticeable the function itself must take just one or two cycles!) then you are talking about 30 seconds out of 1 hour program run. Checking the premise again, on a 2GHz cpu, 1% of the time means that the function would have to be called over 10 million times per second.
All of the above is hand waving, and it is falling on the opposite direction as the other answers (i.e. there are some imprecisions could make it seem as if the difference is smaller than it really is, but reality is closer to this than it is to the general answer dynamic dispatch will make your code slower.