The best explanation I’ve seen online of how convolution transpose works is here.
I’ll give my own short description: it applies a convolution with a fractional stride. In other words, it spaces out the input values (with zeroes in between) so that the filter is applied over a region of the original input that’s potentially smaller than the filter size.
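To make the "spacing out with zeroes" idea concrete, here is a rough 1D sketch in NumPy. The function names and the example values are my own, made up for illustration, and this is not how any particular framework implements it. It computes the result two ways: by scattering a scaled copy of the filter for each input value, and by inserting zeroes between the inputs and then running an ordinary full convolution. The two should agree.

```python
import numpy as np

def conv_transpose_1d_scatter(x, k, stride):
    """Each input value stamps a scaled copy of the filter into the output."""
    n, ksz = len(x), len(k)
    y = np.zeros((n - 1) * stride + ksz)
    for i in range(n):
        y[i * stride : i * stride + ksz] += x[i] * k
    return y

def conv_transpose_1d_zero_insert(x, k, stride):
    """Insert (stride - 1) zeroes between inputs, then do a full convolution."""
    z = np.zeros((len(x) - 1) * stride + 1)
    z[::stride] = x
    # np.convolve defaults to mode="full" and flips the kernel (true convolution)
    return np.convolve(z, k)

# Illustrative values only
x = np.array([1.0, 2.0, 3.0])
k = np.array([1.0, 0.5, 0.25])

a = conv_transpose_1d_scatter(x, k, stride=2)
b = conv_transpose_1d_zero_insert(x, k, stride=2)
print(a)
print(b)
```

With a stride of 2 and a filter of length 3, the three inputs become seven outputs, so the output is longer than the input, which is the point.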
As for why one would want to use it: it can be used as a sort of upsampling with learned weights, as opposed to bilinear interpolation or some other fixed form of upsampling.
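One way to see the connection to fixed upsampling is that, in 1D, a stride-2 transposed convolution with the hand-picked filter `[0.5, 1, 0.5]` reproduces plain linear interpolation away from the borders. The sketch below (again NumPy, with a helper name and values I made up for illustration) checks this against `np.interp`. A learned layer simply lets training pick those filter values instead of fixing them.

```python
import numpy as np

def conv_transpose_1d(x, k, stride):
    """Minimal 1D transposed convolution: scatter scaled copies of the filter."""
    y = np.zeros((len(x) - 1) * stride + len(k))
    for i in range(len(x)):
        y[i * stride : i * stride + len(k)] += x[i] * k
    return y

# Illustrative values only
x = np.array([1.0, 2.0, 3.0])
linear_kernel = np.array([0.5, 1.0, 0.5])  # fixed weights, not learned

y = conv_transpose_1d(x, linear_kernel, stride=2)

# Drop the first and last border samples and compare to linear interpolation
interior = y[1:-1]
expected = np.interp(np.linspace(0, 2, 5), [0, 1, 2], x)
print(interior)
print(expected)
```

The border samples differ because there is no neighbour to blend with there; frameworks usually deal with that through padding or cropping.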