A couple of months ago, I reported mixed first-winter results from our heat pump installation: Good comfort, but a refrigerant leak which wiped out several years of potential environmental benefits from the pump. That much stands.
But I went further and quoted my own estimate of the combined seasonal efficiency of our pumps at 154% and concluded that my pumps were an operating net negative from a greenhouse gas perspective even without the leak. I got a lot of feedback on that estimate, including some expert analysis of our home’s data, and I did some more analysis on my own. I now conclude that the true efficiency of our heat pumps cannot be ascertained with confidence from the data we have before us. I am uncertain, but I am hopeful that (a) if there are no further major refrigerant leaks, next winter the performance of the downstairs pump may well be in the 200% to 250% range which available research suggests is typical (and good enough to be modestly GHG positive) and (b) while the performance of the upstairs pump may remain depressed because it is over-sized, we may be able to manage it to improve efficiency. While I acknowledged in the first post that “studies show better average results than we have achieved,” each number matters in the climate conversation. So, I feel it is important to review the weaknesses of my first analysis.
Measuring heat pump performance
The best way to test a heat pump’s performance is to attach multiple sensors to the pump and directly measure how much heat it is pumping over several seasons. Direct measurement is expensive, but more and more robust studies using direct measurement are coming out.
An alternative is to exhaustively model energy use in the building, using building science methods to estimate heat loss through each component of the building envelope and ventilation systems. One must also model the various internal sources of heat that offset those losses: water heaters, clothes dryers, stoves, office equipment, televisions, human metabolism, etc. Internal sources of heat gain can balance heat loss through the envelope in the same way that a person can stay warm with only their body heat in a sleeping bag. Heating systems kick in only when heat losses exceed heat gains and drive the internal temperature below the thermostat setting. Internal sources of heat gain may be negligible relative to heat loss in a poorly insulated structure, but can be significant in a well-insulated home like ours. If one had a complete energy model, one could compute the efficiency of a heat pump over a season (known as the seasonal coefficient of performance or SCOP) as the ratio of the modeled heating load (Btu converted to kWh) to the electrical energy (kWh) used by the heat pump.
Detailed modeling of energy balance is empirically difficult, especially as to internal gains. A common alternative is to estimate heat load based on known fuel use over a baseline heating period. This is the method that I used in my first post (the computations were detailed in an attached spreadsheet). I had kept good records of our gas use over a five-winter period from 2010 to 2015, and one of those winters was nearly identical to last winter based on “heating degree days.” Assuming that other factors were likely equal, I simply divided our gas use during that winter (2011-12) by the energy used by our heat pumps. I did not adjust for imperfect efficiency of the gas burner or the increased use of fans by the heat pumps. Both of those adjustments were small and would have made the heat pumps look slightly worse. Otherwise, there was nothing mathematically wrong with that computation, but it was simplistic.
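The simple ratio described above can be sketched in a few lines. This is only an illustration, not the spreadsheet computation from the first post: the function name, the `burner_efficiency` parameter, and the sample figures are hypothetical, and the only hard fact used is the unit conversion (1 therm = 100,000 Btu, roughly 29.3 kWh).

```python
# 1 therm = 100,000 Btu and 1 kWh = 3,412.14 Btu, so 1 therm is about 29.3071 kWh.
KWH_PER_THERM = 29.3071


def estimate_scop(baseline_gas_therms, heat_pump_kwh, burner_efficiency=1.0):
    """Estimate a seasonal COP as heat delivered (kWh) / electricity used (kWh).

    baseline_gas_therms: gas burned in a comparable baseline winter.
    heat_pump_kwh: electricity used by the heat pumps over the comparison winter.
    burner_efficiency: fraction of the gas energy that became useful heat.
        The default of 1.0 means no adjustment for burner losses.
    """
    if heat_pump_kwh <= 0:
        raise ValueError("heat_pump_kwh must be positive")
    delivered_heat_kwh = baseline_gas_therms * burner_efficiency * KWH_PER_THERM
    return delivered_heat_kwh / heat_pump_kwh


# Hypothetical inputs for illustration only (not our actual usage data).
unadjusted = estimate_scop(baseline_gas_therms=500, heat_pump_kwh=9000)
adjusted = estimate_scop(baseline_gas_therms=500, heat_pump_kwh=9000,
                         burner_efficiency=0.9)
print(round(unadjusted, 2), round(adjusted, 2))
```

Setting `burner_efficiency` below 1.0 lowers the estimate, which is why leaving that adjustment out flatters the heat pumps slightly.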
Abode Energy, the group that had helped us initially choose our heat pumps, asked one of their HVAC design consultants to look at my analysis. That consultant happened to have worked with one of the contractors who had bid on our heat pump installation and had access to several years of our heating data, which I had provided. He did a rough re-analysis and reached the conclusion that our overall SCOP (combining downstairs and upstairs) was roughly 2.0, instead of my 1.5. He observed that the downstairs heating system carries most of the load for the building and concluded that the reason for the relatively poor performance of the upstairs pump was that it was over-sized as compared to the light load it had to carry.
The expert was generous with his explanation of his methodology and I take his points well: My first computation was simplistic in that it relied too heavily on a single winter. I failed to appreciate how noisy the fuel use data is. Using linear regression, one can take advantage of all the available winters of data to develop a more robust baseline. Additionally, I erred in assuming that the single winter was a good comparison just by looking at the heating degree days from a base of 65. (The base is the temperature from which outdoor temperatures are subtracted to count heating degree days — more explanation of heating degree day bases here.) Just because two winters have the same total heating degree days from base 65 does not mean they will have the same heating degree days from base 50. As a rough way of factoring in internal heat gains, one needs to try regressions of baseline fuel use against degree days from multiple bases and use the HDD base that fits the fuel use data best.
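A small sketch may make the point about bases concrete. It assumes the simplest common definition of heating degree days (the sum, over days, of the amount by which the daily mean temperature falls below the base); commercial degree-day data may be computed with finer-grained methods. The two ten-day "winters" are made up for illustration.

```python
def heating_degree_days(daily_mean_temps_f, base_f):
    """Sum of (base - daily mean temperature) over days colder than the base."""
    return sum(max(0.0, base_f - temp) for temp in daily_mean_temps_f)


# Two made-up ten-day periods: one steadily cool, one split between
# a cold snap and mild weather.
steady_cool = [55.0] * 10
cold_snap = [45.0] * 5 + [65.0] * 5

for base in (65, 50):
    print(base,
          heating_degree_days(steady_cool, base),
          heating_degree_days(cold_snap, base))
```

Both periods have 100 degree days from base 65, but from base 50 the steady period has none while the cold-snap period has 25. A home whose heating only comes on below about 50°F would burn fuel in one period and not the other.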
The expert did not have access to our complete baseline fuel use data and chose not to do a full set of regression runs from alternative heating degree day bases measured from the nearest weather stations. He had enough professional experience to be confident in his rough analysis and practical judgment that the over-sizing of the upstairs pump was the main problem and that otherwise the pump performance was not too far off typical. He felt that further analysis would not likely add confidence to that basic conclusion. I accept that conclusion, although I come to it in a different way as discussed further below.
For a learning experience and just to make sure we weren’t missing anything, I went back and redid the regression analysis of monthly gas use vs. degree-days using the full period of available baseline data and implementing the consultant’s methodological pointers. I purchased the degree-day measurements from both Logan and Bedford airports at all bases from 39 to 71 degrees. Since we are roughly equidistant from those two weather stations, I averaged the degree counts between them. I then tested regressions of the full five winters of measured baseline fuel use against all HDD bases between 39 and 71. I found that the best fit regression (R-squared of .967) for our combined two units was with HDD base 49.
Best fit model (HDD49) for heating energy use during base line period (R-squared = .967)
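The base-selection procedure can be sketched as follows. This is a minimal pure-Python illustration, not the spreadsheet analysis itself: the helper names are hypothetical, and the monthly figures are made up so that base 49 fits exactly.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary to fit a line")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1.0 - ss_res / syy


def average_stations(hdd_a, hdd_b):
    """Average the degree-day counts from two weather stations, month by month."""
    return [(a + b) / 2 for a, b in zip(hdd_a, hdd_b)]


def best_hdd_base(hdd_by_base, monthly_gas_therms):
    """Try each base and keep the regression with the highest R-squared.

    hdd_by_base maps a base temperature to monthly HDD totals aligned
    with monthly_gas_therms. Returns (base, slope, intercept, r_squared).
    """
    best = None
    for base, monthly_hdd in hdd_by_base.items():
        try:
            slope, intercept, r_squared = fit_line(monthly_hdd, monthly_gas_therms)
        except ValueError:
            continue  # e.g. a base so low that every month has zero HDD
        if best is None or r_squared > best[3]:
            best = (base, slope, intercept, r_squared)
    return best


# Made-up monthly data: gas use is exactly 0.1 therms per HDD at base 49.
hdd_by_base = {45: [20, 1, 0, 35], 49: [28, 6, 2, 46], 53: [39, 15, 8, 58]}
monthly_gas_therms = [2.8, 0.6, 0.2, 4.6]
base, slope, intercept, r_squared = best_hdd_base(hdd_by_base, monthly_gas_therms)
print(base, round(slope, 3), round(r_squared, 3))
```

With real data the differences in R-squared between neighboring bases can be small, so the best-fit base is better read as an approximate balance point than as a precise one.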
During the baseline fuel period, our home was reasonably well-occupied — my retired parents were mostly at home, did a lot of cooking and cleaning, and used home office equipment heavily. Degreedays.net offers the “very rough estimate” that a “well-insulated building with more people and a fair amount of home or office equipment might have an average internal heat gain of 5°C or so (9°F or so).” So, in our maximally insulated and well-occupied pair of homes a combined balance point of 49°F seems plausible — perhaps 13°F below our thermostat settings (including night set-backs).
Note: As another check on the plausibility of 49°F, based on the modeled line slope (0.1), 13°F of additional degree days per day translates into 2708 Btu/hour for each home (0.1 model slope (therms/HDD) * 13 degrees * 30 days/month * (1/720) months/hour * 100000 (Btu/therm) * 1/2 units/structure). 2708 Btu/hour is close to the Manual J allowance for a two-bedroom apartment: according to Green Building Advisor, “At the low end, the Manual J process assigns about 2500 Btu/hr of internal gains for a two-bedroom house or apartment.” This is not a fair direct comparison because the real application of internal gains in Manual J is to estimate peak cooling loads as opposed to average heating loads, but it offers a comparison in the same general ballpark.
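The arithmetic in the note can be checked with a short sketch. The function and parameter names are hypothetical; the inputs (slope of 0.1 therms per HDD, a 13°F gap, two units) come from the note above. The note's factor of 30 days/month times 1/720 months/hour is simply 1/24 days/hour.

```python
BTU_PER_THERM = 100_000
HOURS_PER_DAY = 24


def internal_gain_btu_per_hour(slope_therms_per_hdd, balance_gap_f, units=2):
    """Convert a balance-point gap into an implied average internal gain per unit.

    Each day, a gap of balance_gap_f degrees is that many degree days that
    never show up as fuel use, so the regression slope prices them in therms.
    """
    therms_per_day = slope_therms_per_hdd * balance_gap_f
    btu_per_hour = therms_per_day * BTU_PER_THERM / HOURS_PER_DAY
    return btu_per_hour / units


print(round(internal_gain_btu_per_hour(0.1, 13)))  # 2708
```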
Using this more rigorous regression approach, I nonetheless ended up fairly close to my first estimate as to performance — a combined performance of 1.65 (formerly 1.54), with the downstairs pump at 1.92 and the upstairs pump at 1.21. This spreadsheet details that more complete analysis.
Special circumstances in our installation
Given that the reanalysis did not change the bottom line much, I might have continued to view the performance of my heat pumps as mysteriously disappointing and continued to speculate about possible design flaws (too much pressure in the ductwork?). But I did some additional analysis that (a) exposes a larger impact on the downstairs pump’s performance from the refrigerant leak that we fixed in March, and (b) suggests other dynamics that call the whole before vs. after estimating methodology into question.
I had previously viewed the leak as not affecting the whole winter, just a negligible period. But when I did a daily regression of the power use of the downstairs pump against HDD (at base 51) and arrayed the residual errors chronologically, a clear trend jumped out: The model based on the whole winter projects lower than actual power use in the late season and higher than actual power in the early season — consistent with the efficiency of the pump declining through the winter. We do not know when the leak started, but the pumps were installed the previous June, so the leak may have depressed performance even from the beginning of the winter.
Residual error in whole-season daily regression model increasing — suggests leak taking increasing performance toll through the winter (leak fixed March 14, Day 164)
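The residual check can be sketched as below. This is an illustration under stated assumptions, not the actual analysis: the function names are hypothetical, the residual is defined as actual minus predicted power use, and the eight days of data are synthetic, built so that power use per degree day creeps up by 5% of its starting value each day.

```python
def fit_line(x, y):
    """Ordinary least squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_trend(daily_hdd, daily_kwh):
    """Fit kWh against HDD for the whole season, then look for drift over time.

    Returns the chronological residuals (actual minus predicted) and the
    slope of those residuals against the day number. A positive slope means
    the pump used progressively more power than the whole-season model
    predicts, which is what declining efficiency would look like.
    """
    slope, intercept = fit_line(daily_hdd, daily_kwh)
    residuals = [kwh - (intercept + slope * hdd)
                 for hdd, kwh in zip(daily_hdd, daily_kwh)]
    day_numbers = list(range(len(residuals)))
    trend_per_day, _ = fit_line(day_numbers, residuals)
    return residuals, trend_per_day


# Synthetic data: weather alternates between mild and cold days, while
# power use per degree day grows steadily through the period.
daily_hdd = [10, 20, 10, 20, 10, 20, 10, 20]
daily_kwh = [hdd * (1 + 0.05 * day) for day, hdd in enumerate(daily_hdd)]
residuals, trend_per_day = residual_trend(daily_hdd, daily_kwh)
print(trend_per_day > 0)
```

A trend like this does not by itself identify the cause; a leak is one explanation, and a change in how the home is used through the season would look similar.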
If, to minimize the impact of the refrigerant leak, one computes the performance of the pumps over just the first three months of the winter, one gets a combined coefficient of performance of 1.82, with the downstairs pump at 2.4 and the upstairs pump at 1.1. Consistent with the consultant’s view of the situation, it seems possible that the downstairs pump could perform acceptably if properly charged with refrigerant. But three months of data — likely still to some extent distorted by the leak — is not enough to support a real estimate.
An additional insight emerged in the re-analysis: It is clear that our energy usage patterns have changed substantially since my parents passed away, casting doubt on the whole before-and-after-comparison methodology. First, the relative shares of the two units have changed. During the baseline period, the downstairs unit where my parents resided accounted for 73% of the heating gas use. In the single season of gas use that we have after their deaths in 2021, the downstairs unit accounted for only 52% of the heating gas use. After the heat pump retrofit, the downstairs unit accounted for 62% of the electric energy for heating, but that share may be inflated by operating inefficiency caused by the leak. The shift in load share could contribute to the low estimate of the upstairs heat pump’s performance; conversely, it could inflate the estimated performance of the downstairs pump.
Shifting shares of total heating energy use
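| % of heating energy use | Baseline 2010 to 2015 (gas) | Post-occupancy change (gas) | Post retrofit (kWh) |
|---|---|---|---|
| Downstairs unit | 73% | 52% | 62% |
| Upstairs unit | 27% | 48% | 38% |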
A subtler indicator of changed energy use patterns is that the regressions on the fuel use for the units separately yield different best fit balance points and a swap in the relative balance points for the two units before and after retrofit.
Best fit regression balance-points for units separately
|Best fit HDD base
Certainly these changes make any estimate of an individual pump’s performance invalid. And there is one more data point that raises a question about the relevance of the baseline even for both pumps together: We did have one season of gas use (2021-2022) after my parents passed away. Total use in that season was considerably above what the baseline model from 2010 to 2015 predicted, even though we would have expected it to drop as a result of lower thermostat settings. One theory is that the loss of their heavy occupancy of the downstairs unit reduced internal gains. But that theory is belied by the preceding table in two ways: (a) the overall balance point for both units didn’t change; and (b) the downstairs balance point actually went down. [Note added January 2024: An additional unknown is the extent and direction of change in use of the energy recovery ventilators. A drop in ERV use would lower the balance point of a unit and reduce energy use. If operated 100% of the time, the ERV in one unit would account for roughly 3 MMBtu of load per season, so the unknown operating time percentage is a significant variable given that total heating load for both units is only 20 to 30 MMBtu. See ERV loss estimate spreadsheet.] We have crude indicators pointing in opposite directions and one season does not support a good inference anyway, but if load has in fact somehow increased above baseline, then that would depress the efficiency of both pumps as it appears through our methodology.
Possible unexplained upwards drift in heating energy consumption
|Predicted heating gas use for 2021-22 season based on model from 2010 to 2015 baseline (HDD49)
|Actual heating gas use in 2021-22 season
The indicators in this case point in conflicting directions, and we shouldn’t attempt to compute efficiency numbers from them. The leak alone is enough to discard the first season’s results. I’ll provisionally accept the wisdom of the experienced professional that the downstairs pump is probably fine when not leaking (likely below rated efficiency, but only to the typical degree) and that the upstairs pump is likely oversized. As to greenhouse gas benefit, we’ll see if we can manage performance to achieve results next winter that would allow us to claim operating benefits with confidence.