Summary: M4 Max is about 15-20% faster than M3 Max on most models. M4 Pro runs at about 55-60% of M4 Max speed, or around two-thirds of M3 Max.
All are slower than a 4090, as long as the model fits within its memory. Only the Max chips can run the 72B model at a reasonable speed, around 9 tokens per second on the M4 Max.
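
As a quick sanity check, the two M4 Pro figures in the summary can be chained together to see whether they agree. This is only a sketch using the rounded ratios quoted above, not new measurements; the variable names are made up for illustration.

```python
# Relative throughput ratios quoted in the summary (rounded, not new data).
m4_max_vs_m3_max = (1.15, 1.20)  # M4 Max is about 15-20% faster than M3 Max
m4_pro_vs_m4_max = (0.55, 0.60)  # M4 Pro is about 55-60% of M4 Max

# Chain the ratios:
# (M4 Pro / M4 Max) * (M4 Max / M3 Max) = M4 Pro / M3 Max
low = m4_pro_vs_m4_max[0] * m4_max_vs_m3_max[0]
high = m4_pro_vs_m4_max[1] * m4_max_vs_m3_max[1]

print(f"M4 Pro is roughly {low:.0%} to {high:.0%} of M3 Max")
```

The resulting range brackets two-thirds, so the "55-60% of M4 Max" and "two-thirds of M3 Max" figures are consistent with each other.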