I'd argue the limit is bandwidth. A hologram means generating the full image across a 3-D surface, including the feet, sides, and back, and transmitting all of it to another ship. If you try to be clever and just project a 2-D image onto a surface, you have to transmit the surface as well, including every change caused by the person talking or moving their head, hands, etc. The file length also varies (to account for the various configurations), so the bandwidth requirement changes from moment to moment. You'll have to raise the bandwidth you budget for so it can handle the larger files, with the shorter files making up the difference.
I.e. assume you have the bandwidth to transmit 10 'units' of data per 'cycle' of time. You might transmit data in packets of size 12, 10, 7, 11, 9, 10. That averages to just under 10 units per cycle (59/6 ≈ 9.8), so in this case your budget was sufficient. But if the transmitter (for some reason) sends packets of 13, 11, 8, 12, 10, 11, the average is about 10.8, the 10 units are insufficient, and you get lag effects (either reduced resolution, or the speech slowing down or going out of sync with the lips). So you'd have to reserve extra bandwidth to allow for the potential extra data. The extra could come from something as simple as someone handing the speaker a PADD: the data for its shape, orientation, coloring, and display has to be sent as well. Admittedly most of the PADD will remain the same, but the rest of it still has to be sent.
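To make the arithmetic above concrete, here is a minimal sketch that pushes both packet sequences through a link that can carry 10 units per cycle. The function name `simulate` and the simple "leftover data waits for the next cycle" queue model are my own assumptions for illustration, not anything from a real protocol.

```python
def simulate(packet_sizes, capacity=10):
    """Return (average demand, backlog after each cycle) for a fixed-capacity link.

    Assumed model: anything that does not fit in a cycle waits in a queue
    and is sent in later cycles, ahead of new data.
    """
    backlog = 0
    history = []
    for size in packet_sizes:
        # Unsent data carries over; spare capacity cannot be saved up.
        backlog = max(0, backlog + size - capacity)
        history.append(backlog)
    average = sum(packet_sizes) / len(packet_sizes)
    return average, history


light = [12, 10, 7, 11, 9, 10]   # first sequence from the text
heavy = [13, 11, 8, 12, 10, 11]  # second sequence from the text

for name, sizes in (("light", light), ("heavy", heavy)):
    average, history = simulate(sizes)
    print(f"{name}: average {average:.2f} units/cycle, backlog per cycle {history}")
```

Under this model the first sequence falls briefly behind but catches up, while the second never clears its queue and ends 5 units behind after six cycles. That growing queue is the lag described above.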
Compare that with a 2-D image transmitted at a known width and height (the viewscreen). It needs much less bandwidth because the image area is smaller (no 3-D effect needed), and the data length is fixed, so bandwidth use is constant (compression will make it shorter still). This makes 2-D much more useful in combat or other situations where data limits exist. Best of all would be an audio transmission that includes a text copy of what is being said, so that even if most of the message is garbled, the receiving computer can put together a text display of it for the receiver.
Here is some data for VoIP as a comparison. Assuming 33 packets per second (packets of ~30 ms each), each packet can be up to 320 bytes long. Assume you talk fast and can say 8 words of 5 letters each per second, i.e. 40 letters; counting the space after each word ('right' is 5 letters, plus the space at the end), that is 48 characters to transmit. Each character is 2 bytes (or smaller), so that is at most 96 bytes of data per second. So replacing one packet out of the 33 each second provides the full text of what the person is saying. Since one packet holds a little over 3 seconds' worth of text (320 bytes / 96 bytes ≈ 3.3), you could even use it as a form of overlap: the packet in second 1 sends the text from second 1, the packet in second 2 sends the text from seconds 1 and 2, the packet in second 3 sends the text from seconds 1-3, the packet in second 4 sends the text from seconds 2-4, and so on. Return packets can confirm which packets were received and which were missed (similar to current Internet protocols).
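The overlap scheme above can be sketched roughly as follows. The constants come from the figures in the text; the function names `build_text_packets` and `recover`, and the idea of keying each packet by the second it was sent in, are hypothetical choices of mine for illustration.

```python
BYTES_PER_PACKET = 320   # VoIP packet size quoted above
CHARS_PER_SECOND = 48    # 8 words x (5 letters + 1 space)
BYTES_PER_CHAR = 2       # upper bound used above

BYTES_PER_SECOND_OF_TEXT = CHARS_PER_SECOND * BYTES_PER_CHAR  # 96 bytes
WINDOW = BYTES_PER_PACKET // BYTES_PER_SECOND_OF_TEXT         # whole seconds per packet


def build_text_packets(text_by_second):
    """Packet k carries the text of the last WINDOW seconds up to and including k."""
    packets = {}
    for k in range(1, len(text_by_second) + 1):
        first = max(1, k - WINDOW + 1)
        packets[k] = {s: text_by_second[s - 1] for s in range(first, k + 1)}
    return packets


def recover(packets, lost):
    """Merge the payloads of every packet that arrived; keys are seconds."""
    received = {}
    for k, payload in packets.items():
        if k not in lost:
            received.update(payload)
    return received


spoken = [f"text for second {s}" for s in range(1, 7)]
packets = build_text_packets(spoken)

# Lose every other text packet (half the transmission).
got = recover(packets, lost={1, 3, 5})
print("seconds recovered:", sorted(got))
```

With three seconds of text per packet, losing packets 1, 3, and 5 still leaves every second covered by packets 2, 4, and 6. The sketch also shows the limit: three lost packets in a row leave a gap, and the most recent second only exists in one packet until the next ones arrive.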
So even if you lose half your transmission, you can still get the full message. If the computer is programmed correctly, it automatically switches from audio to text transmission as transmission problems arise, so the message still gets through. You lose the background audio as a result, so you can miss extra shouts from their bridge crew (though these could also be added as additional text fragments).
Hmm, there is more data on that page; I may have to update these figures (or someone else who actually knows what they are talking about will do it faster and more clearly).
The only use I can think of for the holoprojector is for an admiral to 'personally' appear on every ship of a fleet, delivering a motivational speech and gestures to each bridge crew so they do their jobs right during a desperate battle. The file would be relayed from ship to ship, instead of the admiral's ship trying to transmit to all of them.