Abstract Hydrologic forecasts are essential for mitigating water‐related risks. However, little work has explored whether operational ensemble forecasts have improved over time, particularly with respect to probabilistic performance and in cases with limited data. This study contributes a retrospective analysis of short‐ and medium‐range (1–14 days ahead) streamflow forecasts issued by the California Nevada River Forecast Center (RFC) at 97 sites for water years 2014–2025, using the National Weather Service (NWS) Hydrologic Ensemble Forecast Service (HEFS). We develop a novel and generalizable hierarchical Bayesian model that partially pools data across sites to quantify regional trends in deterministic and probabilistic forecast performance. Results suggest improved performance for moderate and high flow events, potentially linked to meteorological model upgrades and enhanced data assimilation. However, the degree of improvement depends on the performance metric and lead time, with stronger trends for deterministic performance at shorter lead times and only weak evidence for improvements in attributes of ensemble spread.