Category: pftblog

  • Looking for help with an orphaned system

    I was contacted recently by a reader who is trying to resurrect an older test system.  It is a Spirotech S600, manufactured by Graseby-Andersen and outfitted for spirometry, helium dilution lung volumes and single-breath DLCO.  It seems to be in reasonably good shape but he cannot test the components or use it in any way because he does not have the software for it.  He is working on a shoestring budget and does not have funds to purchase a newer system.

    If you have version 4 of the Spirotech S600 software (most likely on 5-1/4″ floppy disks), or if you have a manual or schematic for this system please contact Gabriel at [email protected].

    I don’t know if Spirotech was absorbed by a larger company or went out of business, but the last Spirotech spirometers that I know of were manufactured around 20 years ago. Many of us are fortunate enough to be able to occasionally replace older equipment (although this is usually only when the older equipment does not work and can’t be repaired, and even then you have to jump through hoops to justify a capital purchase). I have been in the position of trying to keep an older system functioning numerous times. The problem has always been that once a given test system has been replaced by a manufacturer’s newer model, finding parts for the older model becomes difficult at best. More often it becomes impossible.

    I don’t blame our equipment manufacturers for this.  I’ve watched pulmonary function equipment evolve for over forty years and every model was built with the technology available at the time.  But time moves on and trying to keep an older technology alive is always a lot more expensive than adapting to newer technology (and that’s even presuming that it was worthwhile to keep the older technology alive in the first place).  Still, there is a fair amount of older equipment out there that is at least potentially capable of functioning.  3D printing may be a partial solution to missing parts but I think the bigger problem is not so much physical parts as computer software.  Computers and computer software have been evolving incredibly rapidly, and even if the software for an older test system were located, more than likely the computer hardware it was written for no longer exists.  But there’s always eBay, so even this problem can potentially be overcome.

    So, hold onto the disks and manuals that came with your test systems.  Even if you no longer have the test systems they were intended for they may still be able to help somebody else who is trying to make the best they can of a nonexistent budget.

  • Assessing FEV1 trends, re-visited

    There are a couple of different ways to assess changes in FEV1 from one patient visit to another. For several decades my lab has used a change of >=10% and >=200 ml as the threshold for a significant change. Recently the ATS released standards for occupational spirometry that included an age-adjusted change in FEV1 of >=15% as the threshold for significant change. For the time being we have continued to use the 10% threshold when comparing results that are relatively close in time and are using the 15% threshold when they are separated by a much longer period. Since we haven’t actually gotten around to defining what is recent and what isn’t, there is still a bit of uncertainty in how we apply this, but even though there are differences in thresholds and how the numbers are calculated, both approaches are essentially numerical. Recently a couple of reports crossed my desk that have caused me to wonder whether a qualitative change should also be a consideration.
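    For what it’s worth, the arithmetic behind the two numerical approaches can be sketched in a few lines of code. This is only an illustration; in particular the age adjustment shown here (scaling the baseline FEV1 by the ratio of predicted values so the expected aging decline is removed) is one plausible reading of the occupational standard, not the official ATS worksheet:

    ```python
    def fev1_change(baseline_l, current_l):
        """Absolute (L) and percent change in FEV1 between two visits."""
        delta = current_l - baseline_l
        percent = 100.0 * delta / baseline_l
        return delta, percent

    def significant_fixed(baseline_l, current_l):
        """The long-standing criterion: a decrease of >=10% and >=200 ml."""
        delta, percent = fev1_change(baseline_l, current_l)
        return percent <= -10.0 and delta <= -0.200

    def significant_age_adjusted(baseline_l, current_l,
                                 pred_baseline_l, pred_current_l):
        """One way to age-adjust (an assumption, not the official method):
        scale the baseline FEV1 by the ratio of predicted values so the
        expected aging decline is removed, then apply the 15% threshold."""
        expected_now = baseline_l * (pred_current_l / pred_baseline_l)
        percent = 100.0 * (current_l - expected_now) / expected_now
        return percent <= -15.0, percent
    ```

    Note that the fixed criterion requires both conditions: a 10% drop in a small FEV1 may still be under 200 ml and therefore not flagged.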

    Trend Table 1

    In the 14 years between these two tests the FEV1 has decreased by 0.56 L or 12.6%. By the 10% criterion this is a significant change, but I think that 14 years is a reasonably long period of time and the age-adjusted change is only 5.1%, which indicates this change is not significant.

    Trend Table 2

    In the year between these two tests the FEV1 has decreased by 0.22 L or 7.0%, which doesn’t meet either criterion for a significant change.

    But what has changed between these tests is that in both instances the spirometry went from normal to showing mild obstruction. This is a qualitative change and I think it is likely significant.

    Spirometry is a “noisy” measurement and this applies both to a patient’s ability to perform a repeatable effort and for a spirometer to accurately measure it. My lab’s 10% threshold is based on studies from several decades ago but is probably not too far off the mark. A relatively recent study that looked at the normal distribution of change indicates that a decrease greater than approximately 8% within a year is likely not normal. This still doesn’t make a 7% absolute or 5% age-adjusted change clinically significant and in fact leaves them within the normal range of variability.

    However, when a patient’s best spirometry effort is reported it is this result that is used to categorize the patient as being normal or as having airway obstruction (we’ll ignore restriction for the time being since spirometry cannot diagnose it). Obviously this categorization has significant implications for the patient’s clinical management so it would seem to me that when a patient goes from “normal” to “mild obstruction” (or vice versa) this should be considered a significant clinical change.

    Note: This raises an interesting (and critical) point, and that is where do you draw the line between “normal” and “obstruction”? The ATS-ERS advocates the use of the LLN but this has been far from universally adopted. My lab (and others) uses an FEV1/FVC ratio less than 95% of predicted and other labs I know of use the predicted FEV1/FVC ratio minus 5. A patient who is considered to be obstructed at one institution can be considered normal at another. My personal opinion is that airway obstruction is multi-factorial and that reliance on any single standard will inevitably end up mis-categorizing at least some patients (at the moment I’m particularly curious about patients with elevated peak flows and a normal-ish FEV1). I suspect that more labs than not follow the ATS-ERS standard but I (and many others) have reservations about this approach since it is primarily statistical in nature. My medical director and I would be interested in hearing from any lab that uses different criteria than those outlined above or from anybody that has their own opinions about this point.

    The problem is that a difference of just 1% can lead to placing results in either category. This admittedly is an artifact of the need to create dividing lines between categories. We may all publicly agree that these dividing lines are really fuzzy in nature, but that doesn’t stop us from taking a numerical result and firmly placing it in one category or another; that is simply human psychology at work. This doesn’t mean that there isn’t a lot of value in deciding what is normal and what isn’t, but it still seems to me that a 1% change isn’t significant.

    What does seem to be significant would be when an FEV1 is “reasonably” normal and changes to “reasonably” obstructed (or vice versa) even if it doesn’t meet any of the numerical criteria for a significant change. I don’t have a strong notion of where “reasonable” is for this purpose but I think that both of the examples I’ve presented meet this criterion.

    The reason I think this is worthwhile is that at present the only criterion for alerting an ordering physician about a significant change is numerically based and does not alert the physician to a change in categorization. Of course an interpretation will include the categorization of the current results, but when a comparison is made with prior results, the statement “there has been no significant change in FEV1” does not and cannot indicate that there has been a change from “normal” to “mild obstruction” (or vice versa). Saying “no significant change” is not necessarily misleading, since it is a true statement that meets ATS-ERS standards, but at the same time I think it is not being as clinically informative as it could be.

    If I were setting up standards for this, I think that a good starting point would be a category change between “normal” and “mild obstruction” (and vice versa, of course) that was accompanied by an age-adjusted numerical change of at least 5%. When this was present, I think that a comment like “although the change in FEV1 is not statistically significant, results were previously categorized as normal/obstructed” should be made.
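    A sketch of what that reporting rule might look like in software. The category labels and the LLN-based cutoff are stand-ins here; as the note above makes clear, labs draw the normal/obstruction line differently:

    ```python
    def categorize(fev1_fvc_ratio, ratio_lln):
        """Binary normal/obstruction split; the cutoff itself is debatable."""
        return "obstruction" if fev1_fvc_ratio < ratio_lln else "normal"

    def category_change_comment(prior_category, current_category,
                                age_adjusted_percent_change):
        """Flag a normal <-> obstruction flip accompanied by an
        age-adjusted FEV1 change of at least 5%, per the proposal above."""
        if (prior_category != current_category
                and abs(age_adjusted_percent_change) >= 5.0):
            return ("although the change in FEV1 is not statistically "
                    "significant, results were previously categorized as "
                    + prior_category)
        return None
    ```

    The point of the second function is that the flag fires on the category flip, not on the magnitude of the change alone.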

    At the moment I think that the change between normal and mild obstruction is the only one that really matters, both because it is the most critical and because it is well studied and characterized. Although the changes from mild to moderate to severe obstruction are important in their own way and may deserve mention, the boundaries between them are far more arbitrary and are differences in degree rather than kind.

    It could easily be argued that this is not a valid approach towards assessing change and I would agree that there are good statistical reasons it isn’t. Nevertheless, I think that the fact that non-significant changes can cause spirometry results to be categorized differently and that this difference has clinical significance means the change is significant.


    References:

    Brusasco V, Crapo R, Viegi G, et al. ATS/ERS Task Force: Standardisation of Lung Function Testing. Interpretive strategies for lung function testing. Eur Respir J 2005; 26: 948-968.

    Kangalee KM, Abboud RT. Interlaboratory and intralaboratory variability in pulmonary function testing. A 13-year study using a normal biologic control. Chest 1992; 101: 88-92.

    Redlich CA, Tarlo SM, Hankinson JL, Townsend MC, Eschenbacher WL, Von Essen SG, Sigsgaard T, Weissman DN. Official American Thoracic Society Technical Standards: Spirometry in the occupational setting. Am J Respir Crit Care Med 2014; 189: 984-994.

    Wang ML, Petsonk EL. Repeated measures of FEV1 over six to twelve months: what change is abnormal? J Occup Environ Med 2004; 46: 591-595.

    Creative Commons License
    PFT Blog by Richard Johnston is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

  • What’s abnormal about FRC?

    I’ve had a number of reports across my desk in the last couple of weeks with both elevated and reduced FRCs that were associated with a more-or-less normal TLC. I reviewed the raw data from all of these tests (I review the raw data from all lung volume tests) and in only a few instances did I make any corrections to the report. This made me think, however: what, if anything, is an abnormal FRC trying to tell us?

    The answers to that question range from “a whole bunch” to “not much” to “darned if I know”. When you measure lung volumes, TLC is really the only clinically important result. RV can be useful at times, but although the other lung volume subdivisions may play a role in the measurement process, they have only limited diagnostic value. All lung volume measurements start with FRC, however, and if you don’t know you have an accurate FRC how do you know that TLC is accurate?

    FRC is a balance point of opposing forces in the lung and thorax. Lung tissue wants to collapse, the rib cage wants to spring open and the diaphragm wants to do whatever muscle tone, gravity and the abdomen allows it to do. All of these forces are to one extent or another dynamic and can change over time. These changes can occur both slowly and rapidly, and are the primary reason why isolated changes in FRC don’t tend to have a lot of clinical significance. For all lung volume measurements however, one primary assumption is that FRC does not change during the test and this isn’t necessarily true.

    One of the simplest ways to affect FRC is by changes in body posture or by changes in ventilatory rate and volume. All of the lung volume test procedures attempt to control this first by having the subject sit upright and not change position during the test, and second by having the subject breathe quietly and regularly (at least during the times that are meant to establish FRC). We’ve always taught the importance of these things to our technicians but I’ll never forget the time I walked into one of our testing rooms during an FRC test only to find the patient slouched way back in the chair with one leg crossed over the other knee because the technician was letting the patient be “comfortable”. I’d like to hope this was an isolated incident (and it probably really was) but when I review a report there’s no way for me to know what position the patient was in, and I have to wonder if some of the change in FRC between tests (we always try to perform lung volume tests more than once) is due to differences in the patient’s posture.

    Trying to control a patient’s ventilation is in many ways a more difficult proposition. FRC tends to rise when a person hyperventilates (although the amount it increases is probably related to the degree of airway obstruction). Many patients are scared or anxious when they come to the PFT lab and this can affect their “normal” breathing pattern. Even the calmest patient can become hyper-conscious of their own breathing and adopt an unusual breathing pattern. In one sense, a non-normal breathing pattern doesn’t matter as long as the patient breathes the same way throughout the test, since a change in FRC due to a change in breathing pattern does not by itself affect the TLC or RV. I’ve often seen that once a patient has done their first test, they relax significantly during the second. When this happened the FRC was often quite different between the two tests but TLC and RV didn’t really change.

    The helium dilution technique is probably most susceptible to error from short-term changes in FRC because it tends to take the longest amount of time to make a measurement and because the patient needs to breathe at FRC during most of the test. For both the N2 washout and plethysmographic techniques, FRC is determined at the beginning of the test during a short period of quiet breathing. This short period would seem to limit the exposure to possible changes in FRC but it also means that the measured FRC comes from only a few breaths, and that in itself increases the potential for error.

    Realistically though, I consider the effect of posture and ventilation to be part of the background noise of lung volume testing. It’s always going to be there, but unless there is something particularly unusual occurring with either of these the amount of error they introduce into the lung volume measurement is probably small and more likely to affect the FRC than the TLC or RV. When I see a reduced or elevated FRC I am far more inclined to worry about testing errors, but which errors I look for depend on which test was used to measure lung volumes.

    Both the helium dilution and N2 washout techniques are sensitive to leaks and leaks always lead to an overestimation of FRC. Large leaks are relatively obvious since they usually prevent the gas analyzer readings from stabilizing. Intermittent leaks (which usually occur when the patient adjusts their lips around the mouthpiece or tries to swallow) can show up as abrupt spikes or drops in analyzer readings. Smaller leaks though, can be very difficult to detect since they don’t necessarily prevent the patient from equilibrating but instead just slow the process down. Test length is related to the degree of airway obstruction so when I see a helium dilution or N2 washout test that seems too long in a patient with a normal-ish FEV1 I become suspicious.
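    The dilution arithmetic itself makes it clear why a leak always inflates FRC: helium lost from the circuit looks exactly like helium diluted into a larger lung. A minimal sketch (fractional concentrations, hypothetical numbers, with dead-space and BTPS corrections deliberately omitted):

    ```python
    def frc_helium_dilution(v_system_l, he_initial, he_final):
        """Closed-circuit helium dilution.  Helium is conserved, so
        V_sys * He_i = (V_sys + FRC) * He_f, which rearranges to the
        familiar FRC equation.  A leak drags He_f below its true
        equilibration value, and a lower He_f can only raise the result."""
        return v_system_l * (he_initial - he_final) / he_final
    ```

    With a 6 L circuit starting at 10% helium and equilibrating at 7%, the computed FRC is about 2.57 L; if a small leak drags the final reading down to 6.5% instead, the same equation reports roughly 3.23 L.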

    A leak during plethysmography can lead to either over- or under-estimation of FRC. Leaks most often occur during the closed-valve panting phase of the test and tend to shift the baseline of the tidal breathing following it. FRC is determined by the tidal breathing before panting, and since the SVC maneuver is performed after the panting, an upwards shift will cause TLC to be overestimated and a downwards shift will cause it to be underestimated. Leaks during tidal breathing are relatively easy to detect because they cause the tidal baseline to angle upwards or downwards. After the panting phase has ended and the valve has re-opened, however, I’ve seen patients take a deep breath that shifts their tidal baseline upwards, and it can take a half dozen breaths or longer for them to return to their original FRC baseline; it can be hard to differentiate between this and a leak.

    Since FRC (TGV really) is measured from the angle of the loops made during the closed-valve panting phase of plethysmography, the wrong angle will of course cause it to be inaccurate. Despite the best efforts of technicians and patients, loops can have odd contours and determining the angle can be difficult. I acknowledge this, but after each test our lab software displays the loops as a composite (all loops in one graph) and the technician has to right-click and select an option from a context menu in order to show the loops from the individual pants. All too often I’ve seen where the composite loop looked okay but when the individual loops were inspected, one or more of them was either suboptimal and shouldn’t have been used or just had the wrong angle calculated for it. Our technicians know they are supposed to inspect the individual loops, so I’ll be kind and say they occasionally overlook this step because the composite loops look fine, but this is one reason why I inspect the raw data for each loop from all plethysmograph tests.
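    Because TGV comes straight from the loop angle by way of Boyle’s law, a badly chosen angle propagates directly into the reported volume. A simplified sketch (the slope is assumed to be expressed as box shift volume per unit mouth pressure, in L/mmHg; airway and equipment corrections omitted):

    ```python
    def tgv_from_loop(p_barometric_mmhg, dv_dp_l_per_mmhg, p_h2o_mmhg=47.0):
        """Boyle's-law TGV from one closed-valve panting loop.  The dv/dp
        slope is the loop angle; 47 mmHg is water vapor pressure at body
        temperature, subtracted to get dry gas pressure."""
        return (p_barometric_mmhg - p_h2o_mmhg) * dv_dp_l_per_mmhg

    def mean_tgv(p_barometric_mmhg, loop_slopes):
        """Average the per-pant TGVs -- and return the individual values,
        since inspecting each loop (not just the composite) is the point."""
        tgvs = [tgv_from_loop(p_barometric_mmhg, s) for s in loop_slopes]
        return sum(tgvs) / len(tgvs), tgvs
    ```

    Returning the per-loop values alongside the mean makes a bad loop obvious: one outlying slope in an otherwise consistent set is exactly the case where the composite graph can hide the problem.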

    About the only way I know for helium dilution or N2 washout measurements to underestimate FRC is for the test to end too soon. This should be a no-brainer but all too often I’ve seen our technicians rely on the testing system to say when the analyzer readings have stabilized, and I know that the software algorithm for determining this is simplistic. A company service technician recently chided one of our technicians for not ending a helium dilution test at the 90-second mark because the software said it had reached equilibration. I disagree with this, first because it’s been my experience that an average patient with normal lungs usually takes around 2-1/2 minutes to equilibrate, but more importantly because I’ve also seen that the end-of-test algorithm is often fooled by a process I call overshoot. Overshoot is usually caused by the test system adding too much oxygen to the breathing circuit too quickly. In this instance FRC will be overestimated if the test is ended when the software says it should, because there is a period where the analyzer reading stabilizes at a low value before returning to its true equilibration point, which is at a higher helium concentration.

    When all is said and done however, and you are reasonably certain that FRC has been accurately measured and it’s still higher or lower than it “should” be, is this telling you anything clinically about the patient? This is where the answer of “not much” comes into play. There is an association between obesity and a reduced FRC, but not all patients that are morbidly obese have a reduced FRC. There are also plenty of patients with a normal BMI that have a reduced FRC so you certainly can’t say that a reduced FRC is solely due to obesity. Unless TLC is reduced I think that commenting on a reduced FRC in the presence of obesity is not particularly meaningful, and if it is then I think the comment should be about the TLC instead of the FRC anyway.

    There is an association between COPD and an elevated FRC, but again not all patients with moderate to severe airway obstruction have an elevated FRC and there are lots of patients with an elevated FRC that don’t have any airway obstruction. To my mind, in COPD it’s not gas trapping or hyperinflation if the RV is not elevated as well as the FRC and even then RV can be overestimated by a suboptimal SVC maneuver so you need to be sure all your ducks are in a row before making a comment in that direction.

    It is possible that an abnormal FRC seen in association with a normal TLC, normal spirometry and normal DLCO is trying to tell us something about the patient. It’s also possible that changes in FRC from one visit to another are also saying something. I’ve not seen this particular question addressed in any research study I’ve read, however, so “darned if I know”.

    There’s not a lot of clinical significance to FRC. It’s dynamic and can change significantly simply because of changes in posture or breathing. Population studies that have developed reference equations for lung volumes have a higher standard deviation and coefficient of error for FRC than they do for TLC and RV. The primary importance of an elevated or reduced FRC is to alert us to possible errors in the lung volume measurements but the fact is that I’m skeptical about all lung volume measurements until I’ve seen the raw test data and even then I can be hard to convince.

     


  • DMCO, Vc and 1/theta

    Roughton and Forster’s seminal paper from the 1950s showed that DLCO was a function of two resistances: the alveolar-capillary membrane and the rate of CO uptake by red blood cells. This relationship is shown by:

    Formula 1: 1/DLCO = 1/DMCO + 1/(theta × Vc)

    Roughton and Forster also showed that the membrane diffusing capacity (DMCO) and pulmonary capillary blood volume (Vc) could be calculated by performing the DLCO test at different oxygen concentrations and then plotting the results.

    Modified from: Pulmonary Function Testing Guidelines and Controversies, published 1982, Jack Clausen ed., page 166.

    Since the 1950s, DMCO and Vc have been measured fairly often for research. I first performed this test around 30 years ago, mostly because I was interested in the technical aspects. I’ve tried to keep current with the research using DMCO and Vc ever since and have come to realize that there are several important details with a significant effect on how this test is performed and calculated.

    From the formula it can be seen that DMCO and Vc depend on how theta is calculated. Theta (or more specifically 1/theta) is the rate of CO uptake by red blood cells. The original and most commonly used formula for calculating 1/theta was:

    Formula 2 Theta 1

    where:

    PcO2 = pulmonary capillary partial pressure of O2

    Hgb = hemoglobin (grams/deciliter)

    The constants in this equation mostly come from laboratory bench-top research that measured the reaction rates of CO and hemoglobin, but they also depend on some assumptions. One of these is that CO has to travel past the exterior membrane of the red blood cell and then through its interior before encountering a hemoglobin molecule. The ratio between these two resistances was originally measured as 2.5, but when this value was used the DMCO and Vc results did not match what was known about lung physiology. If it was assumed that there was no cell membrane resistance at all, the constants for the 1/theta formula produced physiologically consistent results. A zero resistance is unlikely to be true, but it has been used because it appears to match reality better. Some researchers, however, have insisted on using the laboratory-measured ratio of 2.5 instead, and when this is used the equation becomes:

    Formula 3 theta 2_5

    Laboratory research on the CO-hemoglobin reaction rate has been re-performed and at different times and places researchers have since used a variety of constants:

    Formula 4 theta multiple

    Unfortunately there is a distinct lack of any consensus regarding which set of constants should be used. There are valid arguments that can be (and have been) made for and against each set. The fact that there is a certain level of discrepancy between reaction rates determined on a laboratory bench and those indicated by real-world in-vivo measurements complicates the ability to choose one set over another.

    Because carbon monoxide and oxygen compete for binding sites on the hemoglobin molecule, a second critical factor is that 1/theta (and DLCO) varies with the O2 concentration. The pulmonary capillary O2 concentration cannot be measured directly, however, and must be estimated. There are several different approaches for doing this. The simplest is to subtract a constant (usually 10) from the alveolar PO2 (although I’ve seen that at least one researcher added 5 to PAO2 instead) to derive PcO2. A more rigorous approach is to calculate PcO2 from:

    Formula 6 PcO2

    Most researchers, however, have simply substituted PAO2 for PcO2, but frustratingly, how PAO2 was determined is often not reported. When it is reported, it has most frequently been determined from the alveolar sample used to measure DLCO, which does make a certain amount of sense. Less frequently it has been estimated using the alveolar air equation or measured from end-tidal PO2.

    Although using PAO2 instead of PcO2 simplifies the measurement process greatly, PAO2 is not PcO2, and it is not clear to me what effect using it has on the derived DMCO and Vc. This is in a sense a core problem with Roughton and Forster’s original equation for DLCO’s serial resistances. PcO2 is more a concept than an actual thing since, strictly speaking, it is different for each alveolus and capillary. Arterial PO2 is the closest we can come to actually measuring it, and since there are known differences between PAO2 and PaO2 that depend both on disease states and measurement conditions (like during exercise), it would seem the same applies to PcO2.

    Note: While reviewing articles on DMCO and Vc I found that the constants used for the 1/theta calculations were reported relatively often but by no means all of the time. As mentioned, the method by which PAO2 was determined was reported somewhat infrequently. Most frustrating, however, is that many research papers only stated that DMCO and Vc were determined by the method of Roughton and Forster (or some other prior study, usually in an obscure or out-of-print journal) and gave no details at all.

    It’s not particularly clear what level of accuracy or precision can be expected from DMCO and Vc measurements. The ATS-ERS statement on DLCO testing asks for a repeatability of 10%. Because DMCO is determined from the intercept of the 1/DLCO axis and Vc from the slope of the regression line small differences in the DLCO measurements can make a significant difference in the calculated DMCO and Vc. An error bar of 10% in individual DLCO measurements likely translates into a much larger error bar for DMCO and Vc.
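    The slope/intercept relationship, and its sensitivity to small DLCO errors, is easy to demonstrate. A sketch with made-up numbers: the x-axis is 1/theta (from whichever constants one prefers), the y-axis is 1/DLCO, the intercept gives 1/DMCO and the slope gives 1/Vc:

    ```python
    def dmco_vc(inv_theta, inv_dlco):
        """Ordinary least-squares fit of 1/DLCO against 1/theta.
        Per Roughton-Forster: intercept = 1/DMCO, slope = 1/Vc."""
        n = len(inv_theta)
        mean_x = sum(inv_theta) / n
        mean_y = sum(inv_dlco) / n
        sxx = sum((x - mean_x) ** 2 for x in inv_theta)
        sxy = sum((x - mean_x) * (y - mean_y)
                  for x, y in zip(inv_theta, inv_dlco))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        return 1.0 / intercept, 1.0 / slope   # DMCO, Vc
    ```

    Perturbing a single 1/DLCO point by a few percent moves the fitted intercept and slope by considerably more, which is the point made above: a 10% error bar on individual DLCO measurements translates into a much larger error bar on DMCO and Vc.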

    I was an advocate for performing DMCO and Vc testing on and off for years, but to be honest that had more to do with the technical challenge than with any clinical relevance. I think that DMCO and Vc testing still has an important place in research (although I’d certainly like a better consensus and more clarity in reporting details). For routine clinical testing in a Pulmonary Function Lab, likely not so much. DMCO and Vc have been studied in COPD, asthma, sarcoidosis, sleep apnea, pulmonary fibrosis, pulmonary hypertension and various forms of heart disease. These studies have improved our understanding of the physiology of these conditions but I don’t see where they have improved our ability to either diagnose or monitor them. At this time I can’t see that DMCO and Vc are any more clinically significant than just DLCO, and given the uncertainties in how these results are derived, probably less.

    References:

    Bosisio E, Grisetti GC, Panzutti F, Sergi M. Pulmonary diffusing capacity and its components (DM and Vc) in young, healthy smokers. Respiration 1980; 40: 307-310.

    Ceridon ML, Beck KC, Olson TP, Bilezikian JA, Johnson BJ. Calculating alveolar capillary conductance and pulmonary capillary blood volume: comparing the multiple- and single-inspired oxygen tension method. J Appl Physiol 2010; 109: 643-653.

    Georges R, Saumon G, Loiseau A. The relationship of age to pulmonary membrane conductance and capillary blood volume. Am Rev Resp Dis 1978; 117: 1069-1078.

    Jain BP, Pande JN, Guleria JS. Membrane diffusing capacity and pulmonary capillary blood volume in chronic obstructive pulmonary disease. Am Rev Resp Dis 1972; 105: 900-907.

    Kleerup EC, Koyal SN, Marques-Magallanes JA, Goldman MD, Tashkin DP. Chronic and acute effects of “crack” cocaine on diffusing capacity, membrane diffusion and pulmonary capillary blood volume in the lung. Chest 2002; 122: 629-638.

    Lamberto C, Nunes H, Le Toumelin P, Duperron F, Valeyre D, Clerici C. Membrane and capillary blood components of diffusion capacity of the lung for carbon monoxide in pulmonary sarcoidosis: relation to exercise gas exchange. Chest 2004; 125: 2061-2068.

    Overbeek MJ, Groepenhoff H, Voskuyl AE, Smit EF, Peeters JWL, Vonk-Noordegraaf A, Spreeuwenberg MD, Dijkmans BC, Boonstra A. Membrane diffusion- and capillary blood volume measurements are not useful as screening tools for pulmonary arterial hypertension in systemic sclerosis: a case control study. Respiratory Research 2008; 9: 68.

    Pande JN, Gupta SP, Guleria JS. Clinical significance of the measurement of membrane diffusing capacity and pulmonary capillary blood volume. Respiration 1975; 32: 317-324.

    Roughton FJW, Forster RE. Relative importance of diffusion and chemical reaction in determining rate of exchange of gases in the human lung with special reference to true diffusing capacity of pulmonary membrane and volume of blood in the pulmonary capillaries. J Appl Physiol 1957; 11: 291-302.

    Sansores RH, Pare P, Abboud RT. Effect of smoking cessation on pulmonary carbon monoxide diffusing capacity and capillary blood volume. Am Rev Resp Dis 1992; 146: 959-964.

    Zanen P, van der Lee I, van der Mark T, van den Bosch JMM. Reference values for alveolar membrane diffusion capacity and pulmonary capillary blood volume. Eur Respir J 2001; 18: 764-769.


  • When back-extrapolation goes astray

    A spirometry report that looked very questionable came across my desk recently. The flow-volume loop was misshapen and the technician’s notes indicated that the results had been highly variable and to “interpret with caution”. I pulled up the raw test results and saw a series of test efforts with flow-volume loops that were all somewhat flattened and with no consistency in either the loops or the numerical results.

    This kind of inconsistency can be an indication of poor patient effort but can also occur because of airway problems. The cardio-thoracic surgeons at my hospital have an active airway stenting program and so we see a fair number of patients with tracheomalacia. One hallmark of tracheomalacia is flow limitation, which usually shows up as a flat expiratory plateau in the flow-volume loop. These loops had peak flow-ish humps, but the humps seemed to appear in different locations in every loop and they seemed to have a relatively high-frequency flutter.

    Back_extrapolation_04_redacted

    Back_extrapolation_06_redacted

    One plausible explanation for the inconsistent results is vocal cord dysfunction (VCD). VCD is characterized by the paradoxical closure of the vocal cords that results in wheezing or stridor and shortness of breath. The gold standard for diagnosing it is laryngoscopy while the patient is symptomatic but it can be difficult to make a definitive diagnosis since symptoms can often come and go. VCD can mimic asthma but patients usually don’t respond to bronchodilators and have negative challenge tests. Spirometry results like these can only be suggestive, however.

    The real problem though, was that the spirometry effort that had been selected for reporting indicated the patient had moderately severe airway obstruction (FEV1 56% of predicted) and there were several efforts that had a significantly higher FEV1. When I checked the numerical values it was apparent that this effort had been selected because it was the effort with the highest FEV1 whose back-extrapolation met ATS-ERS criteria.

    Back-extrapolation is a technique for standardizing the beginning of exhalation during a forced spirometry effort.

    Taken from the ATS-ERS Standardisation of spirometry, page 324.

    Specifically, the ATS-ERS statement on spirometry says “…the back-extrapolation method traces back from the steepest slope on the volume-time curve.” Interestingly, this steepest slope coincides with peak flow. This approach to standardizing the measurement of time makes sense, and it rests on the assumption that it corrects for a “soft” start to exhalation, where the actual beginning of the effort is somewhat indeterminate.
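    The procedure the statement describes can be sketched numerically. This is a minimal illustration, not any manufacturer's implementation: the sampled curve, the point-to-point slope estimate of peak flow and the linear interpolation are all simplifying assumptions.

    ```python
    def back_extrapolate(time, volume):
        """Locate the corrected time zero for a forced exhalation.

        Finds the steepest point-to-point slope on the volume-time curve
        (which coincides with peak flow), projects that tangent back to
        zero volume to get the new time zero, and reads the volume already
        exhaled at that instant off the actual curve -- the back-extrapolated
        volume (EV) that the acceptability check is applied to.
        Returns (new_time_zero, extrapolated_volume).
        """
        slopes = [(volume[i + 1] - volume[i]) / (time[i + 1] - time[i])
                  for i in range(len(volume) - 1)]
        k = max(range(len(slopes)), key=lambda i: slopes[i])
        peak_flow = slopes[k]
        t0 = time[k] - volume[k] / peak_flow        # tangent crosses V = 0 here
        if t0 <= time[0]:
            return t0, 0.0                          # sharp start: nothing exhaled yet
        j = max(i for i in range(len(time) - 1) if time[i] <= t0)
        frac = (t0 - time[j]) / (time[j + 1] - time[j])
        ev = volume[j] + frac * (volume[j + 1] - volume[j])
        return t0, ev


    def start_is_acceptable(ev, fvc):
        """ATS-ERS acceptability: EV below the greater of 5% of FVC or 0.15 L."""
        return ev < max(0.05 * fvc, 0.15)
    ```

    With a curve that starts slowly and then turns sharply steep, the tangent at peak flow crosses zero volume partway into the slow phase, and whatever was exhaled before that crossing becomes the extrapolated volume.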

    When I looked at the other efforts, there was a test with a higher FEV1 but whose back-extrapolated volume was too high. When I looked closely at that effort, what I saw was that the start of the effort was actually quite good but that the peak flow occurred quite late in the effort.

    Back_extrapolation_01_redacted

    This means that the back-extrapolation was taken from a slope in the volume-time curve that occurred after more than 50% of the volume had already been exhaled.

    Effort back extrapolated

    I don’t think this is how the back-extrapolation technique was supposed to be performed. I also think that the effort actually meets expectations for a rapid start and that the computerized back-extrapolation technique is miscalculating the true start of the test. I pulled the volume-time curve into a graphics program with a ruler and found that if the start of the effort was taken from the “real” start of the test, the FEV1 was actually 0.22 L less than reported.

    Effort back extrapolated 02

    Even taking this into consideration, the re-calculated FEV1 from this effort was almost a liter greater than that of the effort that had been selected and was therefore a more accurate representation of what the patient was capable of, so I selected this effort to be reported. In the end, both the FVC and FEV1 were reported to be within normal limits (WNL).

    Was this the correct choice? Realistically, all of the test efforts were flawed for one reason or another and none of them could be considered to have acceptable quality. For this reason alone no choice could be the “right” choice. The effort that had been originally selected however, was chosen simply because it met the ATS-ERS criteria for back-extrapolation but it was actually quite flawed in other ways.

    Original selection

    I’ve tried to decide whether this situation indicates a degree of failure in our training program or not. We try to teach new technicians the criteria for selecting spirometry results, and even in this case I think we’ve been somewhat successful (if for no other reason than the note to “interpret with caution”), but I’m at a bit of a loss as to how we can teach when the selection criteria need to be ignored. I think this is only something that can come from experience and for the moment I’m going to have to leave it at that.

    However, this problem also points out some significant limitations of the testing software. Out of curiosity I let the software select the “best” spirometry effort for reporting and it went back to the original selection. It is apparent that the software has been designed to reject any effort that does not meet the ATS-ERS criteria for back-extrapolation. To some extent I understand this but in this case there was another effort with an FEV1 that was larger by almost a liter. Back-extrapolation or not, this is an indication there is something wrong with the automatic selection process.

    This also leads me to some concern about the results from office spirometry systems. I suspect that the staff performing tests on these systems rely heavily on the software to select the correct efforts. The manufacturers of these systems can say with complete honesty that they meet the ATS-ERS standards but how many times is a spirometry effort automatically rejected because it is only a little bit outside these guidelines despite being substantially better overall?

    There are established criteria for spirometry test quality. There are valid reasons for all of these criteria, but when test quality is poor it becomes necessary to understand the difference between “should” meet criteria and “must” meet criteria. At the moment there appears to be a mostly binary [ meets criteria / does not meet criteria ] approach to decision making. Since each of the criteria has a different degree of importance, it seems to me that weighting the criteria, along with how far a test effort is from meeting a specific criterion, would give a more nuanced approach.

    It would be easy to say that none of this patient’s spirometry efforts should have been reported since none of them came close to meeting the criteria for test quality. Sometimes you have to work with what you get, though, and in this case I think that not reporting would have been a worse choice than reporting what were admittedly flawed results.
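    The weighted-criteria idea could be sketched something like this. To be clear, the criteria, weights and penalty scaling below are invented purely for illustration; they are not from any ATS-ERS document or any real scoring system.

    ```python
    # Hypothetical sketch: instead of a binary pass/fail, each acceptability
    # criterion contributes a graded penalty scaled by its importance.
    # The weights and scaling are invented for illustration only.

    def quality_score(effort, weights=None):
        """Return a 0-100 quality score for a spirometry effort.

        `effort` is a dict with back-extrapolated volume (bev, L), fvc (L),
        expiratory time (fet, s) and a cough flag. Each criterion yields a
        0-1 'degree of failure' which is multiplied by its weight.
        """
        weights = weights or {"bev": 40, "fet": 30, "cough": 30}
        # Back-extrapolation: limit is the greater of 5% of FVC or 0.15 L;
        # the penalty grows with how far past the limit the effort is.
        limit = max(0.05 * effort["fvc"], 0.15)
        bev_fail = min(max(effort["bev"] - limit, 0.0) / limit, 1.0)
        # Expiratory time: full credit at >= 6 s, graded penalty below that.
        fet_fail = min(max(6.0 - effort["fet"], 0.0) / 6.0, 1.0)
        cough_fail = 1.0 if effort["cough"] else 0.0
        penalty = (weights["bev"] * bev_fail
                   + weights["fet"] * fet_fail
                   + weights["cough"] * cough_fail)
        return 100.0 - penalty
    ```

    The point of a score like this is that an effort slightly over one limit but otherwise excellent out-ranks an effort that barely passes every check while being poor overall, which is exactly the comparison a binary approach cannot make.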

    References:

    Brusasco V, Crapo R, Viegi G, eds. ATS/ERS Task Force: Standardisation of lung function testing. Standardisation of spirometry. Eur Respir J 2005; 26: 319-338.

    Morris MJ, Christopher KL. Diagnostic criteria for the classification of vocal cord dysfunction. Chest 2010; 138: 1213-1223.

    Wilson MA, King CS, Holley AB, Greenburg DL, Mikita JA. Clinical and lung-function variables associated with vocal cord dysfunction. Respir Care 2009; 54(4): 467-473.


  • Mixing doesn’t always make a match

    Recently I was reviewing a spirometry report and noticed that the FVC was below normal. A low FVC can suggest a restrictive lung disease but the reported expiratory time was only about 4 seconds. I took a look at the graphics included with the report and the volume-time curve showed the effort ended well before 6 seconds so my first thought was that the reduced FVC was more likely because of suboptimal patient effort than anything else.

    I always try to review spirometry results whenever there is anything questionable, so I pulled up the raw test results and immediately saw that the reported FVC was actually a composite. The ATS-ERS statements on spirometry and interpretation say that the highest VC, regardless of which test it came from (which even includes the slow vital capacity from lung volume measurements and the inspiratory volume from a DLCO), should be used when reporting spirometry results. In this case the FVC came from one effort while the FEV1 and everything else came from a different effort. The interesting thing was that the effort the FVC came from was about 10 seconds long, which shows it actually was an adequate effort. The FEV1 effort, on the other hand, was only about 4 seconds long and showed an abrupt and early termination of exhalation.

    The technician who performed the tests selected the correct efforts to make a composite. The patient had made five spirometry efforts and the selected FVC was significantly larger than that of all the other efforts, but the FEV1 from the same effort was significantly lower than several others. Our criteria for selecting the FEV1 do not just go by the largest value; we also look at the peak flow (PEF) and whether there has been any back extrapolation, and the effort the FEV1 came from had the highest peak flow and no back extrapolation. So, a good choice had been made on both efforts.

    When it comes to selecting values from different spirometry efforts there are only a limited number of results that our lab software allows us to mix and match. The FVC, the FEV1 and the graphics (flow-volume loop and volume-time curve which are linked to each other) can all be selected individually, but everything else, which includes the expiratory time, PEF, FEF25-75, MEF50 etc. etc. can only be selected as a group.

    Our policy when selecting the FVC from one effort and the FEV1 from another is that the flow-volume loop and all the ancillary values go along with the FEV1, mainly because these include the values that indicate the quality of the FEV1. This makes sense for PEF and (within limits) the flow-volume loop, but less so for all of the other values. The most obvious mismatch is that the expiratory time belongs with the FVC, not the FEV1. For many, if not most, of the other ancillary values it is not clear at all which effort they should be selected from.
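    One plausible encoding of this composite-selection policy is sketched below. The field names, the acceptability fallback and the FEV1/PEF tie-break are all illustrative assumptions, not how our lab software (or any vendor's) actually does it.

    ```python
    # Sketch of a composite-selection policy: FVC from the largest-FVC
    # effort, FEV1 from the best FEV1 effort, with the ancillary values
    # (here just PEF) travelling with the FEV1 effort per lab policy.

    def select_composite(efforts):
        """Pick the reported FVC and FEV1 from a list of effort dicts."""
        fvc_effort = max(efforts, key=lambda e: e["fvc"])
        # Prefer efforts with an acceptable start (back-extrapolated volume
        # within the greater of 5% of FVC or 0.15 L); if none pass, fall
        # back to all efforts rather than reporting nothing.
        acceptable = [e for e in efforts
                      if e["bev"] <= max(0.05 * e["fvc"], 0.15)]
        pool = acceptable or efforts
        # Highest FEV1, using PEF as the tie-breaker.
        fev1_effort = max(pool, key=lambda e: (e["fev1"], e["pef"]))
        return {"fvc": fvc_effort["fvc"],
                "fev1": fev1_effort["fev1"],
                # Ancillary values follow the FEV1 effort.
                "pef": fev1_effort["pef"]}
    ```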

    For example, the FEF25-75 will almost always be overestimated when it is taken from an effort with a short exhalation time. This is because the 25% and 75% points are based on the vital capacity, and when a spirometry effort is shorter than it should have been, these points occur earlier in the exhalation when flow rates are higher.

    FEF25-75_graph2
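    This effect can be demonstrated with a toy volume-time curve. The exponential exhalation below is a simplified model chosen only to make the point; the flows are identical in both cases and only the reported FVC differs.

    ```python
    # Why FEF25-75 is overestimated when an effort is cut short: the 25%
    # and 75% points are fractions of whatever FVC was exhaled, so a
    # truncated effort places them earlier in the blow, where flows are
    # higher. The exponential volume-time curve is a toy model.

    import math

    def fef25_75(time, volume):
        """FEF25-75 (L/s): mean flow between 25% and 75% of the exhaled FVC."""
        fvc = volume[-1]
        def t_at(v):
            # Linear interpolation of the time at which `v` litres were exhaled.
            for i in range(len(volume) - 1):
                if volume[i] <= v <= volume[i + 1]:
                    f = (v - volume[i]) / (volume[i + 1] - volume[i])
                    return time[i] + f * (time[i + 1] - time[i])
            return time[-1]
        t25, t75 = t_at(0.25 * fvc), t_at(0.75 * fvc)
        return 0.5 * fvc / (t75 - t25)

    # Toy exhalation: volume approaches a 5 L FVC exponentially (tau = 1 s).
    time = [i * 0.01 for i in range(1001)]             # 0..10 s
    volume = [5.0 * (1 - math.exp(-t)) for t in time]

    full = fef25_75(time, volume)                      # full 10-second effort
    short = fef25_75(time[:401], volume[:401])         # same blow cut off at 4 s
    # The truncated effort reports a higher FEF25-75 for identical flows.
    ```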

    The same consideration applies to MEF25, MEF50, MEF75, MIF25, MIF50 and MIF75, since these values are also determined in relation to the vital capacity. That being said, it doesn’t mean that the FEF25-75 and the other ancillary values should be taken from the effort with the largest FVC either. All of these values reflect expiratory flow rates and are presumably the highest values the patient can attain. There is nothing that says a spirometry effort with the largest vital capacity was also performed with maximal flow rates.

    The ATS-ERS statement on spirometry says the FEF25-75 should be taken from the effort with the highest combined FVC and FEV1. I understand the intent but I also think this recommendation was made without consideration of those times when there are significant differences in FVC and FEV1 from test to test, or when the VC comes from a test other than a forced vital capacity. Even so, if this guideline were followed, it would be possible to report a composite effort with the FVC from one effort, the FEV1 from another and the FEF25-75 from yet another. Interestingly, although the ATS-ERS statements discuss the performance of peak flow and flow-volume loops, there are no specific guidelines for selecting them when a composite is made.

    My particular quandary is that we cannot select the FEF25-75, peak flow and expiratory time separately. They come as a group and any choice is going to be a compromise. Fortunately, since we don’t routinely report the FEF25-75 or any of the other ancillary flow values, this simplifies the process somewhat. My lab’s decision has been that since PEF is used as part of the selection process for the FEV1 and is also a reported value that is used to compare against a patient’s personal peak flow readings, it trumps expiratory time, and the reported value (along with everything else, like it or not) therefore comes from the FEV1 effort. For similar reasons, the flow-volume loop is associated more with the FEV1 and peak flow, so it also comes from the FEV1 effort.

    Our inability to select what we think are the most appropriate test results when making a composite is primarily due to limitations in our lab’s testing software, but I suspect this is a similar problem for all systems that allow the FVC and FEV1 to be selected from different spirometry efforts. The ATS-ERS statements offer no guidelines about selecting FEF25-75, peak flow and expiratory time when composite spirometry efforts are created so it’s up to each equipment manufacturer to decide what can and can’t be selected other than the FVC and the FEV1.

    I think there are good reasons why PEF and flow-volume loop should go along with the FEV1. I also think that expiratory time should come from the largest FVC (ignoring of course those times when it comes from lung volume or DLCO tests). When it comes to FEF25-75 and the other ancillary flow values, I don’t think there can be a clear answer, and that in fact any selection at all is likely going to result in reporting values that are in one way or another incorrect (and is another reason for not reporting them in the first place).

    One final issue I have with this whole process is that there is nothing whatsoever on our final report that indicates that the FVC and FEV1 came from different efforts. If I hadn’t decided to pull up the raw test data I wouldn’t have known. This information doesn’t necessarily need to be on a final report, but it’s not even an option for us. Since the final report (in PDF format) is the primary tool for reviewing (and eventually will also be for signing) tests for my lab, I think this is a problem.

    The ATS-ERS statements on spirometry and interpretation say that the largest vital capacity should be used to report the FEV1/VC ratio. I am in complete agreement with this since I think it is the best way to implement the intent of the forced vital capacity maneuver, which is to determine the presence of airway obstruction. The devil is in the details, however, since a spirometry test is not just about VC and FEV1, but also about peak flow, FEF25-75, expiratory time, the flow-volume loop, the volume-time curve and other values. The lack of guidelines and certain inconsistencies in the selection process for these other values make it likely that incorrect or misleading results can and will be reported.

    I would like to hope that these issues (as well as many others) will be addressed in the next ATS-ERS statements on Spirometry and Interpretation, but we’ll have to wait and see. In the meantime reviewers need to remember that the criteria used to select FVC and FEV1 may be inconsistent with those needed to select the other reported values; that limitations on the selection process may be imposed by testing software; that the graphics don’t necessarily match the numerical values; and finally the fact that a composite effort was created at all may be hidden from view by reporting limitations.

    References:

    Brusasco V, Crapo R, Viegi G. ATS/ERS Task Force: Standardisation of lung function testing. Standardisation of spirometry. Eur Respir J 2005; 26: 319-338.

    Brusasco V, Crapo R, Viegi G. ATS/ERS Task Force: Standardisation of Lung Function Testing: Interpretive strategies for lung function tests. Eur Respir J 2005; 26: 948-968.

     


  • Seeing shouldn’t always be believing

    Although the numerical results are of course important, visual inspection of the volume-time curve and flow-volume loop from a spirometry test is a critical part of interpretation. Spirometry quality and performance issues that don’t show up in the numbers are often highly evident in the graphs. The choices we make in creating and configuring reports, however, can hide important visual details and have the potential to decrease interpretation quality.

    Recently I was inspecting the results of a spirometry test. There wasn’t anything particularly unusual about the numbers or the graphics on the report; I just like to make spot-checks on spirometry quality and wanted to make sure the best results had been selected. When I pulled up the raw test data on my computer screen I noticed an unusual wavering pattern in the volume-time curve. I don’t remember seeing a volume-time curve like this before, and when I checked, all of the patient’s efforts showed similar oscillations.

    VT_Curve_waver_redacted

    The end-exhalation portion of the flow-volume loop looked a lot like cough artifacts, but coughs usually look different on the volume-time curve, so out of curiosity I measured the rate at which the volume oscillated and it was about 1 Hz. Oscillations are not that uncommon in flow-volume loops but they usually occur throughout the loop, and the oscillation frequencies usually start around 4 Hz and can go up to 20 Hz or higher. Lower frequencies are usually related to neuromuscular issues, but a 1 Hz oscillation was a bit on the unusual side and there was nothing in the patient’s diagnosis or history that indicated this might be an issue.

    Interesting as they might be, there wasn’t anything overtly diagnostic about the oscillations so I went back to the report. I gave it another quick look-over and was about to put it in my Out box when I noticed that the oscillations weren’t visible on the report.

    VT_Curve_waver_2_redacted

    When we configure a report for our testing software there are always certain design considerations. The hospital’s medical records department has very strict requirements for fonts and font sizes, for the placement and size of the hospital’s logo, and for the order and placement of certain pieces of patient and test identification. This means that up to 2 inches of each page of the report have been spoken for before we even start to fit in the rest of the patient’s demographics, test results, technician notes, interpreting physician notes, graphics and trends. Theoretically there is no reason we couldn’t make separate pages for the flow-volume and volume-time graphics in every report, but more realistically this would be wasteful of paper and would probably make the report harder, not easier, to read.

    For every report format we’ve created we’ve had to make trade-offs in the size of the graphics. The spirometry reports are able to devote almost an entire page to the flow-volume loop and volume-time curve, but on all of the other reports, these graphics have had to share space with other report elements and this has limited how large they can be printed.

    When graphics are displayed on a computer screen or printed on a report, the lines that make them up can never be smaller than a single pixel. This means that how much detail a line can show depends on the scale used to display or print it. When you squeeze a graphic down to a smaller size, something will always be lost along the way.

    The volume-time and flow-volume loop graphics on the report are about 1/4 the size of those on my computer screen. In order to make the graphics fit the space available for them on the report, the software simplified them. This removed some of the details from the curves, which in this case included the oscillations.
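    A toy example makes the mechanism concrete. The signal, plot heights and ripple amplitude below are invented for illustration: a 1 Hz, ±15 mL ripple riding on a 5 L volume-time curve spans several pixels on a tall plot but quantises to a flat line once the plot is squeezed to report size.

    ```python
    # Illustration of how shrinking a graph hides detail: once a feature
    # is smaller than one pixel at the rendered scale, it disappears.

    import math

    def rasterize(values, vmax, height_px):
        """Map each sample to the integer pixel row it would occupy."""
        return [round(v / vmax * (height_px - 1)) for v in values]

    t = [i * 0.01 for i in range(1001)]                     # 10 s at 100 Hz
    base = [5.0 * (1.0 - math.exp(-ti)) for ti in t]        # exhalation to ~5 L
    curve = [b + 0.015 * math.sin(2 * math.pi * ti)         # 1 Hz, +/-15 mL ripple
             for b, ti in zip(base, t)]

    screen = rasterize(curve, 5.1, 800)    # full-screen plot, 800 px tall
    report = rasterize(curve, 5.1, 60)     # thumbnail-sized report graphic

    # Pixel span of the plateau (the last 2 s of the effort):
    span_screen = max(screen[800:]) - min(screen[800:])   # ripple still visible
    span_report = max(report[800:]) - min(report[800:])   # ripple flattened away
    ```

    Nothing in the data changed between the two renderings; only the number of pixels available to draw it did.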

    I’ve been involved with computers and graphics for close to 40 years and in the early years screen and printer resolution was always a significant issue so it’s not like this is a new problem. Computer screens and printers have improved dramatically and we are probably inclined to believe this kind of problem is not an issue any more, but the fact is that when we configure reports (or use the reports configured for us by the manufacturer) the problem never really went away.

    There is no particular reason that high-resolution graphics need to be on a final report, any more than every single patient effort needs to be reported; some simplification is always going to be needed to create a readable report. What is important, however, is that whoever is responsible for reviewing and interpreting test results must be able to review them at a high enough resolution that test quality is easily evident, and this is where I think there may be a problem.

    I have a 24″ LCD display on my computer with 1920 x 1200 pixels. That’s about as large a display as is commonly available (yes, they make larger displays with more pixels, but the price curve gets very steep very rapidly and only a few graphics cards will support them anyway). This works well when I review results on my computer, but I usually see reports as PDFs, which are more-or-less the same as their printed versions. It’s only when I pull test results up in the lab’s testing software that I can see anything at a higher resolution, and although I do this frequently I don’t do it for all reports and all tests. Moreover, I suspect that PFTs are most often reviewed and interpreted as paper reports, not on a computer screen, and that most reviewers don’t have access to higher-resolution graphics. This makes it important that we be aware of the consequences of the resolution of the graphics we choose to place on our reports. Small graphics save space, but only at the cost of losing information.

    Single page reports for spirometry seem to be popular, particularly for office spirometry systems. I think that the limited quality of the graphics on these reports is often overlooked. I understand that handling multiple-page reports is likely a burden and therefore a problem for clinics and physician offices but at the same time a certain implication of this seems to be that those involved with office spirometry do not have the need (or the ability?) to assess test quality.

    Many equipment manufacturers are moving towards the web-based on-line review and electronic signing of PFT reports. I’d like to think that this will make high-resolution graphics (and raw test data) available to a reviewer, and will be simple and straightforward to use. My lab is moving in this direction but the details we’ve received so far indicate that it will be the PDF version of the report that will be accessed via a web browser, not the raw data or the original graphics. I am disappointed about this and if this is actually the case we will probably have to re-configure some of our reports in order to show higher-resolution graphics, even if it moves us to longer reports.

    Looking well down the road, I can see a time when all PFT reports, regardless of where they are performed and regardless of which manufacturer’s equipment they come from, are on-line and come with full graphics and the raw data embedded, and can be pulled up, displayed and trended whenever needed. I see the beginning of this in the establishment of inter-hospital communication mandates and standards, but since I see no movement towards this goal by the standards organizations that govern our field, nor any apparent desire for collaboration between PFT equipment manufacturers, for the time being this will have to remain a pipe dream on my part.

    The resolution and size of the graphics that are part of PFT test results needs to be a consideration when configuring reports. There needs to be balance between the amount of space they are allotted and the amount of information that needs to be conveyed. A picture may be worth a thousand words but without some care the message it provides can be misleading.

  • Single-breath DLCO Breath-holding time (BHT)

    The single-breath DLCO maneuver can rightly be criticized as being an artificial maneuver that bears little resemblance to normal breathing. It is only by standardizing the maneuver that clinically relevant and reproducible results can be obtained. One important aspect of this standardization is the breath-holding period.

    The single-breath DLCO maneuver begins with the subject exhaling to RV, followed by an inhalation of the test gas mixture to TLC, then a 10-second breath-holding period, and ends with an exhalation during which a sample of alveolar air is collected. The initial choice of a 10-second breath-hold period was largely arbitrary; it was selected to strike a balance between being short enough that most patients could hold their breath, yet long enough to minimize the influence of the inspiratory and expiratory phases and to allow a sufficiently measurable amount of carbon monoxide to be taken up.

    During the inspiratory phase of the DLCO maneuver, carbon monoxide uptake does not begin until the inhaled gas has passed both the test system’s and the subject’s anatomic dead space and reached the first functional alveolar-capillary unit. The full rate of carbon monoxide uptake will not occur until the diffusing gas mixture has reached all available alveolar-capillary units and these units have reached their maximum surface area. The rate of carbon monoxide uptake therefore increases throughout inhalation and reaches a maximum near TLC.

    During the exhalation phase, carbon monoxide uptake continues even as the alveolar sample is being taken. For this reason the concentration of carbon monoxide at the beginning of the sampling period tends to be higher than at the end of the sampling period. The size of the washout volume and the alveolar sample volume, which to some extent determines how long a patient has to exhale before the acquisition of an alveolar sample is complete, will also have an effect on exhaled gas concentrations.

    Because the point at which carbon monoxide uptake starts and the point at which it ends are to some degree indeterminate, several methods for standardizing the measurement of the single-breath DLCO breath-hold period have been developed. Of these, the Ogilvie method starts measuring the breath-hold period at the very beginning of inhalation and stops at the beginning of the alveolar sampling period. The Epidemiology Standardization Project (ESP) method, on the other hand, also stops at the beginning of the alveolar sampling period but instead starts measuring at 50 percent of the inhaled volume. Finally, the Jones-Meade method starts measuring at 30 percent of the inspiratory time and stops in the middle of the alveolar sampling period.

    DLCO_03_03_BHT_Graph
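    The three timing conventions can be sketched directly from the timestamps of the maneuver. This is an illustrative sketch: the variable names are invented, and the ESP start point is approximated as the time midpoint of inspiration, which assumes roughly constant inspiratory flow (the ESP definition is actually 50% of the inhaled volume).

    ```python
    # Sketch of the three breath-hold timing conventions.

    def breath_hold_times(t_insp_start, t_insp_end, t_sample_start, t_sample_end):
        """Return the Ogilvie, ESP and Jones-Meade breath-hold times (s).

        Ogilvie:     start of inspiration        -> start of alveolar sample
        ESP:         50% of inspired volume      -> start of alveolar sample
                     (approximated here as the midpoint of inspiration)
        Jones-Meade: 30% of inspiratory time     -> middle of alveolar sample
        """
        t_insp_mid = t_insp_start + 0.5 * (t_insp_end - t_insp_start)
        t_insp_30 = t_insp_start + 0.3 * (t_insp_end - t_insp_start)
        t_sample_mid = t_sample_start + 0.5 * (t_sample_end - t_sample_start)
        return {
            "ogilvie": t_sample_start - t_insp_start,
            "esp": t_sample_start - t_insp_mid,
            "jones_meade": t_sample_mid - t_insp_30,
        }
    ```

    For a typical maneuver (inspiration taking 2.5 s, the alveolar sample taking 1 s), the ESP time comes out shortest, the Ogilvie time longest and the Jones-Meade time in between, matching the ranking described below.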

    As a reminder, single-breath DLCO is calculated by:

    DL_VA_Fourmula_1

    A DLCO test measures the rate of CO uptake, which is expressed in mL/min/mmHg. For the same inspired and expired CO concentrations, a shorter BHT implies a faster uptake of CO and a longer BHT a slower one. When these different breath-hold measurement methods are applied to the same test results, the ESP method, which generates the shortest breath-hold time, yields the largest calculated DLCO, while the Ogilvie method, which generates the longest breath-hold time, yields the smallest. The Jones-Meade method tends to fall in between the ESP and Ogilvie methods in patients with normal lungs.
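    The calculation behind this can be sketched in code. This follows the standard single-breath equation (alveolar volume times the logarithmic CO disappearance over the breath-hold, with the initial alveolar CO worked out from tracer-gas dilution), but treat the details such as the STPD conversion and units as an illustrative sketch rather than any particular system's implementation.

    ```python
    # Sketch of the standard single-breath DLCO calculation.

    import math

    def dlco_single_breath(va_ml_stpd, bht_s, pb_mmhg,
                           fi_co, fa_co, fi_tr, fa_tr):
        """Single-breath DLCO in mL CO/min/mmHg.

        va_ml_stpd  : alveolar volume (mL, STPD)
        bht_s       : breath-hold time (s), by whichever convention is used
        pb_mmhg     : barometric pressure; 47 mmHg water vapour is subtracted
        fi_co/fa_co : inspired / exhaled (alveolar sample) CO fractions
        fi_tr/fa_tr : inspired / exhaled tracer-gas fractions, used to work
                      out the initial alveolar CO concentration by dilution
        """
        # CO fraction at the start of the breath-hold, from tracer dilution.
        fa_co_0 = fi_co * (fa_tr / fi_tr)
        return (va_ml_stpd * 60.0
                / (bht_s * (pb_mmhg - 47.0))
                * math.log(fa_co_0 / fa_co))
    ```

    Because BHT sits in the denominator, reporting a shorter breath-hold time for the same measured gas concentrations directly produces a larger calculated DLCO, which is why the choice of timing method matters.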

    Both the Ogilvie and the ESP methods tend to overestimate DLCO when airway obstruction is present. This is because with airway obstruction it takes longer to get an alveolar sample. Both the Ogilvie and ESP methods stop measuring time at the beginning of the alveolar sampling period, even though some CO uptake may still be occurring. For this reason they tend to generate a shorter breath-hold time than may be appropriate. The Jones-Meade method however, includes some of the alveolar sampling time in the measurement of breath-hold time and therefore tends to have the least overestimation of DLCO in the presence of airway obstruction. It is for this reason that the ATS-ERS Statement on DLCO testing recommends the use of the Jones-Meade method.

    Both the Ogilvie and Jones-Meade methods for measuring the beginning of the breath-hold period are time based, so it is important to know when inspiration begins and ends. Ideally the onset of inspiration should be rapid and easily recognizable. When the beginning of inspiration is less easy to determine, the ATS recommends using the standard back-extrapolation technique to establish the onset of inspiration. The end of inspiration should also be clear and easily recognizable. When it is not, the ATS recommends using the point at which the inspiratory volume equals 90% of the patient’s vital capacity as the end of inspiration.

    Regardless of the method used to measure breath-hold time, the periods of inhalation and exhalation should be kept as short as possible in order to minimize measurement inaccuracies. The formula used to calculate DLCO does not by itself differentiate between the inhalation phase, breath-hold period and the exhalation phase; all are considered to be part of the breath-hold period. Several researchers have devised techniques that calculate and then integrate DLCO for each of these phases. This approach is based on physiological models and is known as the 3-equation method. It is available on some test systems but is not recommended by the ATS or ERS at this time.

    As mentioned previously, the choice of a 10-second breath-hold period was somewhat arbitrary. In subjects with normal lungs, DLCO tends to decrease when breath-holding lasts longer than 10 seconds. A variety of potential causes for this have been proposed, including the build-up of CO back pressure and changes in circulation, but regardless there seems to be little reason to propose a longer breath-holding period. A shorter breath-holding period may be acceptable, since at least one study has shown that breath-holding periods of 6 or 8 seconds give reasonably equivalent results. The problem with shorter breath-holding periods is that inspiration and expiration become a significant part of the overall time period.

    These findings apply primarily to subjects with normal lungs however. There is general agreement that patients with COPD show increases in DLCO with longer breath-holding times. For these patients a longer breath-hold time allows more time for axial diffusion and therefore increased ventilation of poorly ventilated lung units as well as a better estimation of VA.

    It should also be remembered that the conditions under which the breath-hold period is conducted can affect the measured DLCO. During the breath-hold period, a Valsalva maneuver (forcible exhalation against a closed airway) raises the intrathoracic pressure, thereby decreasing both the pulmonary blood volume and the DLCO. A Muller maneuver (forcible inhalation against a closed airway), on the other hand, decreases intrathoracic pressure, thereby increasing pulmonary blood volume and the DLCO. For these reasons the ATS-ERS statement on DLCO testing recommends that the patient avoid excessive positive (Valsalva) or negative (Muller) pressure maneuvers during the breath-hold period. Determining whether these have occurred, however, can be difficult. Most DLCO test systems have a valve arrangement that prevents exhalation during the breath-hold period. Some of the test systems my lab uses monitor airway pressure during the breath-hold period, but the usefulness of this measurement assumes the patient keeps their airway open during the breath-hold and my impression is that most patients close their glottis instead.

    There are many approaches towards measuring gas exchange, but the single-breath DLCO has become the primary way of making this measurement. The single-breath DLCO test is an artificial maneuver that nevertheless provides clinically significant information about gas exchange. How the BHT is measured has implications for DLCO calculations and should also be a consideration when choosing DLCO reference equations. Because the BHT includes some or all of the inspiratory and expiratory components of the maneuver it necessarily simplifies a complex situation. The Jones-Meade approach to measuring BHT attempts to minimize these components and is probably a good compromise. Although the choice of a 10 second BHT was fairly arbitrary it also appears to strike a good balance between physiology, patient abilities and test equipment limitations.

    REFERENCES:

    Beck KC, Offord KP, Scanlon PD. Comparison of Four Methods for Calculating Diffusing Capacity by the Single Breath Method. Chest 1994; 105:594-600

    Brusasco V, Crapo R, Viegi G, editors. ATS/ERS Task Force: Standardisation of Lung Function Testing. Standardisation of the single-breath determination of carbon monoxide uptake in the lung. Eur Respir J 2005; 26: 720-735.

    Dressel H, Filser L, Fischer R, de la Motte D, Steinhaeusser W, Huber RM, Nowak D, Jorres RA. Lung diffusing capacity for nitric oxide and carbon monoxide: Dependence on breath-hold time. Chest 2008; 133:1149-1154

    Ferris BG, ed. Epidemiology Standardization Project. Am Rev Resp Dis 1978; 118(6, Part 2):1-120

    Graham BL, Mink JT, Cotton DJ. Overestimation of the Single-Breath Carbon Monoxide Diffusing Capacity in Patients with Air-Flow Obstruction. Am Rev Resp Dis 1984; 129:403-408

    Graham BL, Mink JT, Cotton DJ. Effect of breath-hold time on DLCO(SB) in patients with airway obstruction. J Appl Physiol 1985; 58:1319-1325

    Jones RS, Meade FA. Pulmonary Diffusing Capacity: an improved single-breath method. Lancet 1:94-95

    Lawson WH. Effect of drugs, hypoxia and ventilatory maneuvers on lung diffusion for CO in man. J Appl Physiol 1972; 32:788

    Leech JA, Martz L, Liben A, Becklake M. Diffusing Capacity for Carbon Monoxide: The Effects of Different Derivations of Breathold Time and Alveolar Volume and of Carbon Monoxide Back Pressure on Calculated Results. Am Rev Resp Dis 1985; 132:1127-1129

    Ogilvie CM, Forster RE, Blakemore WS, Morton JW. A Standardized Breath Holding Technique For The Clinical Measurement Of The Diffusing Capacity Of The Lung For Carbon Monoxide. J Clin Invest 1957; 36:1-17

    Creative Commons License
    PFT Blog by Richard Johnston is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

  • Assessing change over a long period of time

    Because our lab database goes back 24 years, we've started to see, fairly often, patients who were last seen ten or even twenty years ago. For this reason I've been thinking about what constitutes a clinically significant change over that long a time period. The guidelines my lab uses for interpreting change in test results came from a consensus among the department's pulmonary physicians close to twenty years ago. As usual there are some discrepancies between our guidelines and those the ATS-ERS have published.

    Test    %Change    Minimum Change
    FVC     >=10%      >=200 ml
    FEV1    >=10%      >=200 ml
    TLC     >=10%      ?
    DLCO    >=10%      >=2 ml/min/mmHg

    Our criteria came primarily from the standards for the repeatability of test results. The ATS-ERS guidelines for interpretation take repeatability into consideration, but also what appears to be the minimum statistically significant clinical change. For year-to-year changes these are:

    Test    %Change
    FVC     >=15%
    FEV1    >=15%
    DLCO    >=10%

    Interestingly, a significant change in TLC is not discussed, but since I've searched the literature and have been unable to find any longitudinal studies of lung volumes in normal subjects, I am not surprised that it was not included.

    I think that both of these standards are based on the assumption that patients are seen on a relatively regular basis and that when results are compared they usually come from the more-or-less recent past, not from a decade or more ago. A change of 10 or 15 percent in FVC, FEV1 and DLCO is not unusual over a 10- or 20-year period, however (I'm less certain about TLC). When comparing results over a long time period there are at least a couple of issues that should be addressed, the first of which is the normal age-associated decline in test results.

    The ATS recently released its standards for Occupational Spirometry, which include a discussion of assessing changes in FEV1 over time. The basic idea is to compare percent predicted values, not the actual test results. Doing this adjusts the results for age, and the recommendation was that a decline in FEV1 of 15% is significant. This threshold had been suggested in the prior ATS/ERS statement on interpretation but was formalized in the Occupational Spirometry standard with algorithms for calculating the change. This is an important guideline for assessing longitudinal changes in an individual but was notably limited solely to FEV1 because “it is less affected by technical factors than the FVC”.

    In light of this standard my lab has decided to add a comment on an age-adjusted 15% decrease in FEV1 from any prior test whenever it is seen to occur (but only when there has been no significant decline since the last time spirometry was performed; otherwise the comment would be somewhat redundant). Despite the specific exclusion of FVC, it would seem that this standard is at least a starting point for assessing significant decreases in FVC, TLC and DLCO.

    An important aspect of aging was not addressed in either ATS-ERS standard, however, and that is the decrease in height. My lab is fairly obsessive-compulsive about measuring patient height at every visit. When a patient hasn't been seen in a while it is common to see a significant change in their height (which I can relate to, since I've lost an inch and a half since I was twenty). The height decrease, particularly over a long time period, is occasionally large enough that if height had not been re-measured, the age-adjusted decrease in FVC, FEV1, TLC or DLCO would have appeared significant when, with the change in height taken into account, it was not.

    Should decreases in height be ignored when making comparisons over long periods of time? There are relatively few longitudinal studies of changes in lung function over time. This is not surprising given the difficulties involved in following a group of people over a prolonged period of time but it does leave a significant gap in our knowledge. I was able to find and review about a half dozen longitudinal studies of spirometry and DLCO but in half of them only the original height was reported and in the others changes in height were noted, but not included in any statistical analysis of change.

    A pulmonary physician I worked with at one time said that a patient’s height should not be updated because the percent predicted should always be compared to their original height. My counter argument was that reference equations are developed using a subject’s current height, not their height at some time in the past and that if the patient had not been seen previously their report would also be interpreted in light of their current height, not their prior height. For these reasons I am going to say that height should be updated and comparisons, even over long periods of time, should be based on the percent predicted of height and age at the time of the test.

    Since I think that comparing age- and height-adjusted percent predicted values is the best approach for assessing changes over long periods of time, the question then becomes whether a 15% threshold for FVC, TLC and DLCO is too high, too low, or just right. As I mentioned previously, 10% is roughly the threshold for the repeatability of these tests, but repeatability really applies to within-session testing, not between sessions. In addition, a critical component when comparing results over time has to be test quality. Over the years I've become all too aware of the myriad problems involved in obtaining accurate FVC, TLC and DLCO test results. These tests cannot be interpreted accurately in the first place without an assessment of their quality, and comparing test quality, particularly for tests that occurred years apart and were performed with different (and often long-since replaced) equipment and technicians, is difficult. Having said that, ongoing calibrations and quality control should have kept these differences to a minimum (otherwise what's the point of trending results in the first place?), so a threshold of 15% is probably reasonable. I would be concerned that a threshold less than 15% would produce too many false positives and a higher one too many false negatives, but this is admittedly a guess. Nevertheless, when a significant change is detected in any of these results at least some attempt to assess test quality should also be made.
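    The comparison described above reduces to computing each visit's percent predicted (from the age and height measured at that visit) and flagging a decline of 15 percentage points or more. A minimal sketch, with invented test values and illustrative function names (the predicted values would come from whatever reference equations the lab uses):

```python
def significant_decline(observed_then, predicted_then,
                        observed_now, predicted_now, threshold=15.0):
    """Compare percent predicted values between two visits, each computed
    from the patient's age and height at that visit, and flag a decline
    at or beyond the threshold. Returns (change in percentage points, flag)."""
    pct_then = 100.0 * observed_then / predicted_then
    pct_now = 100.0 * observed_now / predicted_now
    change = pct_then - pct_now
    return change, change >= threshold

# Invented example: FEV1 of 3.10 L against a predicted 3.60 L twenty
# years ago, and 2.05 L against a predicted 2.95 L today
change, flag = significant_decline(3.10, 3.60, 2.05, 2.95)
print(round(change, 1), flag)  # 16.6 True
```

    Note that because the current predicted value already reflects the patient's current age and height, the raw decline in FEV1 (here about a liter) is automatically discounted for the decline expected from aging alone.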

    Computerized pulmonary function testing has been around for decades and lab databases have the potential to extend well into the past. Even if a lab decides to limit the size of its database, hospital information systems now collect patient records over extended periods of time, and at some point guidelines for assessing changes over a prolonged period of time need to be made. The relatively small number of longitudinal studies limits our ability to accurately assess clinically significant changes, but an age- and height-adjusted threshold of 15% seems to be a good starting point.

    References:

    Brusasco V, Crapo R, Viegi G, et al. ATS/ERS Task Force: Standardisation of Lung Function Testing. Interpretive strategies for lung function testing. Eur Respir J 2005; 26: 948-968.

    Redlich CA, Tarlo SM, Hankinson JL, Townsend MC, Eschenbacher WL, Von Essen SG, Sigsgaard T, Weissman DN. Official American Thoracic Society Technical Standards: Spirometry in the occupational setting. Amer J Respir Crit Care Med 2014; 189: 984-994.


  • DLO2/Qc, SaO2 and CPETs

    There are a number of simple observations that can be made during a cardio-pulmonary exercise test (CPET) that can point you immediately in a specific diagnostic direction. Recently I was reminded of this while reviewing the CPET results on a patient with a complicated medical history whose test had been requested as part of a pre-operative assessment.

    Most patients that are candidates for cardio-thoracic surgery do not need to have a CPET and that’s because it is usually straightforward to determine who is high risk and who is low risk from other routine tests. When risk is hard to determine or equivocal, the cardio-thoracic surgeons will order a CPET. They are primarily interested in the VO2 max and Ve-VCO2 slope since there are a number of widely accepted pre-op assessment algorithms that use these values. Even if the CPET results indicate the patient is high risk, the test details can help determine whether there is anything that can be done to improve the patient’s odds.

    The patient whose report I was reviewing had moderately severe airway obstruction (FEV1 57% of predicted), mild restriction (TLC 77% of predicted) and a moderate gas exchange defect (DLCO 51% of predicted). This would normally pre-dispose me to look for a pulmonary vascular or pulmonary mechanical exercise limitation but there was a single test value that told me the limitation was going to be cardiovascular instead. That test value was the SaO2 at peak exercise which was 99%.

    The reason that SaO2 can differentiate between pulmonary and cardiac limitations can be expressed in an equation:

    DLO2 > Qc

    This isn’t a “real” equation, however, and exists only to highlight the relationship between oxygen transfer and cardiac output. Specifically, what it is saying is that as long as the lung’s ability to transfer oxygen is greater than the blood flow rate through the lung, then SaO2 will be normal.

    In this particular CPET, despite what the low resting DLCO says about gas transfer, cardiac output had to be even lower. It turned out that the patient had chronotropic incompetence (likely due in part to the beta-blocker metoprolol). The patient also had a mildly reduced O2 pulse.

    O2 pulse can be a good indicator of stroke volume. A low O2 pulse is a relatively common occurrence in COPD. Part of this is because individuals with COPD are usually deconditioned. Part can also be because COPD patients often desaturate and when this occurs O2 pulse decreases because of the lower O2 content in arterial blood. In this case, however, there was no desaturation so the low O2 pulse most likely indicates a low stroke volume. The combination of chronotropic incompetence and a low O2 pulse is a pretty strong indication of a low cardiac output in this individual.
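    The reasoning behind the O2 pulse can be made explicit with a little arithmetic: O2 pulse is VO2 divided by heart rate, and by the Fick principle (VO2 = HR × SV × C(a-v)O2) it equals stroke volume times the arteriovenous O2 content difference. A short sketch with invented numbers (these are illustrative values, not the patient's actual results):

```python
def o2_pulse(vo2_ml_min, hr_bpm):
    """O2 pulse in ml O2 per beat: VO2 divided by heart rate.
    By the Fick principle VO2 = HR x SV x C(a-v)O2, so the O2 pulse
    equals stroke volume times the arteriovenous O2 content difference."""
    return vo2_ml_min / hr_bpm

# Illustrative peak-exercise values: VO2 of 1400 ml/min at a blunted
# heart rate of 115 bpm
pulse = o2_pulse(1400, 115)
print(round(pulse, 1))  # 12.2

# With an assumed peak C(a-v)O2 of 0.12 ml O2 per ml blood, the
# implied stroke volume is O2 pulse / C(a-v)O2 (in ml per beat)
print(round(pulse / 0.12))  # 101
```

    The implied stroke volume is only an estimate, of course, since the arteriovenous O2 content difference at peak exercise is assumed rather than measured, but it shows why a low O2 pulse in the absence of desaturation points toward a low stroke volume.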

    I was still bothered by the fact that there was no desaturation during exercise even though the patient’s DLCO was 51% of predicted. After looking the report over carefully I reminded myself that a DLCO measured at rest does not predict a DLCO measured during exercise.

    It has long been noted that DLCO increases during exercise. There are several reasons for this. First, an increased cardiac output increases pulmonary capillary blood volume so there is more blood at any one moment that is available for gas transfer. Second, the increased blood flow also means that within the same period more blood is available for gas transfer. Third, and possibly most important, is that ventilation/perfusion mismatching is often a primary reason for reduced DLCO measurements and when cardiac output increases during exercise, under-perfused regions of the lung can see increased blood flow and overall gas transfer can increase dramatically. For any or all of these reasons it is possible that the patient’s resting DLCO was underestimating their true capacity for gas exchange.

    Regardless of whether the patient’s “real” DLCO was underestimated, the fact is that their cardiac output was even lower and despite their significant pulmonary disease their primary limitation to exercise was cardiovascular.
