Emission metrics are used to compare the climate effect of emissions of different species, such as carbon dioxide (CO2) and methane (CH4). The most common metrics use linear impulse response functions (IRFs) derived from a single more complex model. There is currently little understanding of how IRFs vary across models, and how the model variation propagates into the metric values. In this study, we first derive CO2 and temperature IRFs for a large number of complex models participating in different intercomparison exercises, synthesizing the results in distributions representing the variety in behaviour. The derived IRF distributions differ considerably, which is partially related to differences among the underlying models, and partially to the specificity of the scenarios used (experimental setup). In a second part of the study, we investigate how differences among the IRFs impact the estimates of global warming potential (GWP), global temperature change potential (GTP) and integrated global temperature change potential (iGTP) for time horizons between 20 and 500 yr. Within each derived CO2 IRF distribution, underlying model differences give spreads on the metrics in the range of −20 to +40% (5–95% spread), and these spreads are similar among the three metrics. GTP and iGTP metrics are also impacted by variation in the temperature IRF. For GTP, this impact depends strongly on the lifetime of the species and the time horizon. The GTP of black carbon shows spreads of up to −60 to +80% for time horizons up to 100 yr, and even larger spreads for longer time horizons. For CH4 the impact from variation in the temperature IRF is still large, but it becomes smaller for longer-lived species. The impact from variation in the temperature IRF on iGTP is small and falls within a range of ±10% for all species and time horizons considered here.
We have used the available data to estimate the IRFs, but we suggest the use of intercomparison projects tailored specifically to IRFs for emission metrics. Intercomparison projects are an effective means to derive an IRF and its model spread for use in metrics, but more detailed analysis is required to explore a wider range of uncertainties. Further work can reveal which parameters in each IRF lead to the largest uncertainties, and this information may be used to reduce the uncertainty in metric values.
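
To illustrate how IRFs enter the metrics discussed above, the following minimal sketch computes GWP and GTP for a short-lived species relative to CO2, using the standard definitions (GWP as a ratio of time-integrated forcing, GTP as a ratio of the temperature response at the time horizon). It assumes that each IRF is a sum of exponentials. All coefficients, timescales, and radiative efficiencies are hypothetical placeholders chosen for readability and are not the values derived in this study. The helper names (`exp_sum`, `agwp`, `agtp`) are likewise illustrative. iGTP, which integrates the absolute GTP over the time horizon, is omitted for brevity.

```python
import math


def integrate(f, a, b, n=2000):
    """Composite trapezoid rule for f on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


def exp_sum(coeffs, timescales, constant=0.0):
    """Build an IRF of the form constant + sum_i c_i * exp(-t / tau_i)."""
    def irf(t):
        return constant + sum(
            c * math.exp(-t / tau) for c, tau in zip(coeffs, timescales)
        )
    return irf


# Hypothetical placeholder parameters (NOT the values derived in the study).
irf_co2 = exp_sum([0.3, 0.3, 0.2], [300.0, 30.0, 4.0], constant=0.2)
irf_short = exp_sum([1.0], [12.0])  # single-lifetime species, 12 yr
irf_temp = exp_sum([0.6 / 8.0, 0.4 / 400.0], [8.0, 400.0])
A_CO2 = 1.0      # radiative efficiency of CO2, arbitrary units
A_SHORT = 100.0  # radiative efficiency of the short-lived species, same units


def agwp(efficiency, irf, horizon):
    """Absolute GWP: forcing integrated from 0 to the time horizon."""
    return efficiency * integrate(irf, 0.0, horizon)


def agtp(efficiency, irf, temp_irf, horizon):
    """Absolute GTP: forcing convolved with the temperature IRF at the horizon."""
    return efficiency * integrate(
        lambda t: irf(t) * temp_irf(horizon - t), 0.0, horizon
    )


def gwp(horizon):
    """GWP of the short-lived species relative to CO2."""
    return agwp(A_SHORT, irf_short, horizon) / agwp(A_CO2, irf_co2, horizon)


def gtp(horizon):
    """GTP of the short-lived species relative to CO2."""
    return (agtp(A_SHORT, irf_short, irf_temp, horizon)
            / agtp(A_CO2, irf_co2, irf_temp, horizon))


for horizon in (20.0, 100.0, 500.0):
    print(f"H = {horizon:5.0f} yr  GWP = {gwp(horizon):8.3f}  GTP = {gtp(horizon):8.3f}")
```

Repeating the calculation with IRF parameters drawn from a distribution, rather than a single fixed set, is one way to propagate model spread into the metric values; the temperature IRF appears only in `agtp`, which is why it affects GTP but not GWP.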