Abstract Extreme floods pose escalating risks in a changing climate, yet forecasting remains challenging due to peak flow underestimation and high uncertainty. We introduce a diffusion-based runoff model (DRUM), a probabilistic deep learning (DL) approach that advances extreme flood forecasting across representative basins in the contiguous United States. DRUM outperforms state-of-the-art benchmarks, enhancing nowcasting skill for the top 1‰ of flows in 72.3% of studied basins. Under operational scenarios, DRUM extends reliable lead times by nearly a full day for 20- and 50-year floods. When evaluated with measured precipitation, an ideal condition, recall improves by 0.3–0.4 and the early warning window extends by 2.3 days for 50-year floods. The enhancement potential varies regionally, with precipitation-driven flood zones in the eastern and northwestern US benefiting most, gaining 3–7 days in lead time. These findings highlight the transformative potential of diffusion models as a cutting-edge generative AI technique for advancing hydrology and broader Earth system sciences.