Inverse neural network

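What I can figure out so far: the outputs should be able to regenerate the input, even though the problem being inverted is not necessarily a one-to-one mapping (the network itself is built so that every layer can be undone exactly). Something like a scale and a shift is used to make the transformation more expressive. I still don't fully understand it, but it is being used alongside diffusion models, FNO (Fourier Neural Operator), and WNO (Wavelet Neural Operator).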

They also do something clever: splitting the input into two parts (I still don't know exactly why it is clever, but it gives the vibe that it is). The two parts are what an affine coupling layer operates on, which is also where the scale and shift come from.
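To make the split concrete for myself, here is a minimal NumPy sketch of one affine coupling layer, assuming the standard RealNVP-style form: the first half passes through unchanged and is only used to compute a scale and a shift for the second half. The names `scale_and_shift`, `W_s`, and `W_t` are made up for illustration, and the tiny linear maps stand in for whatever network a real model would use there.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the small network inside the coupling layer.
W_s = rng.normal(size=(2, 2)) * 0.1
W_t = rng.normal(size=(2, 2)) * 0.1

def scale_and_shift(x1):
    # Any function of x1 works here; it never has to be inverted.
    log_s = np.tanh(x1 @ W_s)   # bounded log-scale for numerical stability
    t = x1 @ W_t                # shift
    return log_s, t

def coupling_forward(x):
    x1, x2 = np.split(x, 2, axis=-1)        # split the input into two halves
    log_s, t = scale_and_shift(x1)
    y2 = x2 * np.exp(log_s) + t             # scale and shift the second half
    log_det = log_s.sum(axis=-1)            # log |det Jacobian| of this layer
    return np.concatenate([x1, y2], axis=-1), log_det

def coupling_inverse(y):
    y1, y2 = np.split(y, 2, axis=-1)
    log_s, t = scale_and_shift(y1)          # y1 == x1, so same scale and shift
    x2 = (y2 - t) * np.exp(-log_s)          # undo the shift, then the scale
    return np.concatenate([y1, x2], axis=-1)

x = rng.normal(size=(5, 4))
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
print(np.allclose(x, x_rec))
```

This seems to be the clever part: because the first half is copied through untouched, the inverse can recompute exactly the same scale and shift, so the layer is invertible no matter how complicated `scale_and_shift` is. The Jacobian is triangular, so its log-determinant is just the sum of the log-scales.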

The base equation is this:

What is diffusion, and what is an inverse (invertible) neural network? How are they related, where does the ELBO loss come into all this, and what is the usual loss function?