“If you repeat something long enough people will begin to believe it’s the truth.”
How can I set training zones based on power? Are they accurate? Are they based on evidence? These are just some of the basic questions asked by those looking to base their training on power. One of the most widely used and accepted methods of setting up training zones is based on what is known as the “Functional Threshold Power (FTP) test”. We see it used by coaches and listed in the magazines, and it has now proliferated into online virtual training and virtual reality training platforms such as Zwift and TrainerRoad.
However, what is the scientific basis of the FTP test? Does it measure or reflect lactate threshold? What are its limitations? Are there better options? In this blog I want to take a real look at the limitations of this suggested method of setting up a training program and why I believe it’s not all it is suggested to be.
*A note in advance: there is a little bit of physiology and some discussion of studies in the following blog. This sometimes breaks up the flow of a discussion, but try to stick with it as it should help explain my views on FTP.
**November 2017 BLOG UPDATE: Please see comments and clarification following the publication with Dr Coggan, regarding FTP tests.
FTP or Critical Power
The basis of FTP and other measures of so-called ‘threshold testing’ is defining that point between energy being primarily supplied by the aerobic system (i.e. sustainable over a long time) and the anaerobic system (sustainable over a short period of time).
According to Dr Andrew Coggan, one of the main academics behind the FTP test, “FTP is the highest power that a rider can maintain in a quasi-steady state without fatiguing for approximately one hour.” In addition, it is suggested that the best predictor of performance is performance itself, so a 60-minute time trial is just that: a great predictor of a 60-minute time trial. Because 60 minutes is often very difficult (especially for the relatively untrained), Coggan suggests that a 20-minute test can be used instead, with the 20-minute average power reduced by 5% to estimate the 60-minute value. On this basis the 20-minute test is suggested as a means of determining FTP. This is interesting as a description of the test, but what is the scientific basis? Why should we use it (or not) to develop training zones?
The underlying basis of the FTP test is touted as being twofold: 1. that it is representative of lactate threshold (see Figure 1), and 2. the mathematical concept of critical power (CP). So let’s take a look at both of these with reference to the FTP test.
Figure 1. Here we see a test of lactate threshold, with a subject working at increasing power and lactate levels rising at a relatively low rate until a threshold (LT) is reached, beyond which any additional increase in power output results in an almost exponential increase in lactate. [From Coggan AR. Training and racing using a power meter: an introduction. 2003. Accessed online at: www.ipmultisport.com/ref_lib/Coggan_Power_Meter.pdf].
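To make the 20-minute shortcut concrete, here is a minimal sketch of how the 5% adjustment is usually applied: the FTP estimate is taken as 95% of the average power held for the 20-minute test. The function name and the 300 W rider are my own illustrative assumptions, not anything taken from Coggan's material.

```python
def estimate_ftp_from_20min(avg_power_20min_w, correction=0.95):
    """Estimate 60-minute power (FTP) from a 20-minute test average.

    The 0.95 factor is the commonly quoted 5% reduction. It is a
    rule of thumb applied to everyone, not an individually measured value.
    """
    if avg_power_20min_w <= 0:
        raise ValueError("average power must be positive")
    return avg_power_20min_w * correction


# Hypothetical rider who averaged 300 W for the 20-minute test
ftp = estimate_ftp_from_20min(300)
print(f"Estimated FTP: {ftp:.0f} W")
```

Note that the only input is a single performance number, which is exactly why the result depends on how well (and how willingly) the rider paced that one effort.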
Lactate threshold and FTP
One of the main studies cited as supporting the 60-minute FTP test as being reflective of lactate threshold, and as a pragmatic approach to non-lab-based testing, is that by Coyle et al. In this study 14 male endurance athletes were used. The cycling lactate threshold test was based on testing at 5 different intensities and looked for a 1 mmol (a blood measure of lactate) rise in blood lactate above baseline as representing the balance between lactate production and use.
The performance test was cycling until fatigue at 88% of maximum oxygen uptake (VO2max). The study split the subjects into 2 groups: one (HL) that could work at a higher percentage of maximum at lactate threshold (72-86%) and one (LL) at a lower percentage (59-71%). During the test the LL group was working at 34% above threshold and the HL group at 3% below threshold. The results in terms of time to fatigue were as follows.
Time to fatigue in the HL group was 60 minutes and in the LL group was 29 minutes.
So how can we state that a 60-minute FTP performance test can be related to this study and to lactate threshold when the LL group did not work at lactate threshold but 34% above it? Similarly, although the HL group lasted 60 minutes on average, when we look at individual subjects we have one lasting 75 minutes and another only 51 minutes before fatigue. That’s a possible variation of 24 minutes between subjects. As such, we cannot use the results of this study to support any assumption that the FTP test is reflective of any type of lactate threshold.
Given that subjects were not aware of the elapsed time during the test, this perhaps speaks to the inherent variability and weakness of the FTP test, i.e. how motivated are you to perform? The real question is when lactate threshold occurs.
Therefore, I am not convinced that a 60-minute test can accurately predict where the lactate threshold is, or the power at lactate threshold (or at least not without possibly significant variability). Although there is no doubt a relationship between lactate threshold and time to exhaustion, that does not mean that time to exhaustion or the maximum power produced over 60 minutes is an accurate value from which to determine training zones.
The concept of critical power (CP)
The critical power (CP) test was in many ways the mathematical basis of FTP, but when we look at what the CP test involves it is not merely a 20- or 60-minute performance test.
The relationship between power output and fatigue was initially introduced by Hill (1927). However, it was Monod and Scherrer (1965) who coined the term ‘critical power’. These researchers investigated the relationship between power output and time to exhaustion during multiple bouts of exercise on specific, isolated muscle groups. They then derived a mathematical equation that defined the relationship between power output and time to fatigue. This test involved 4-5 bouts lasting between 2 and 24 minutes, with the data then entered into the equation to define CP.
We can already see limitations to this work. As they say, ‘no muscle is an island’: testing a single muscle group would not be reflective of the physiological stress brought about during cycling, where we see the modern-day application of FTP. So what about looking at more relevant studies?
One of the primary papers referenced as underpinning the suggestion that CP is representative of maximum lactate steady state (MLSS), or sits just above it, is that by Poole et al. (MLSS lies just below lactate threshold, where there is a balance between the rate of lactate production and the rate of lactate removal, and primarily represents the aerobic system.) In this trial a cycling test was used to assess the relationship between power and MLSS. Similarly, we see other studies referenced to demonstrate a relationship.
However, although there may be a relationship, that does not mean it is accurate. For example, I might say that driving when the ground is icy may result in a 60% chance of a crash, but 60%, although significant, does not predict that it will happen. In assessing the accuracy of such a relationship, last year Maturana and colleagues demonstrated that CP (calculated in tests over 1-20 minutes) overestimated MLSS by 20 W (based on subjects with a threshold of about 255 W). That’s an 8% overestimation, and although it may not sound like much, if you cycle 20 W above MLSS it will result in a continual rise in blood lactate, ending in subjects fatiguing before the end of the test. Similar results have been reported in studies from the likes of Bull et al., which demonstrated that CP overestimates the power output that can be maintained over 60 minutes.
Finally, the calculation of CP is highly dependent on the mathematics employed to identify it, as well as on the training status of subjects and pedalling frequency (higher cadence resulting in lower CP and FTP).
As such, the general view that CP and FTP are representative of lactate threshold is clearly flawed, and at best controversial among scientists. Therefore, care should be taken before basing any type of training program on the assumption that FTP or CP is an accurate representation of an athlete’s true threshold.
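For anyone who wants to see what "entering the data into the equation" looks like, below is a rough sketch using the linear work-time form of the model, where total work done in a bout is W' + CP x time, so CP is the slope of a straight line through the (time, work) points. The function name and the four bouts are hypothetical, and the data are deliberately constructed to sit exactly on the model (CP of 250 W), which real test data would not do.

```python
def fit_critical_power(durations_s, powers_w):
    """Fit the linear work-time model: work = w_prime + cp * time.

    Returns (cp_watts, w_prime_joules) from an ordinary least-squares
    line through (time, work) points, where work = power * time.
    """
    if len(durations_s) != len(powers_w) or len(durations_s) < 2:
        raise ValueError("need at least two matching duration/power pairs")
    work_j = [p * t for p, t in zip(powers_w, durations_s)]
    n = len(durations_s)
    mean_t = sum(durations_s) / n
    mean_w = sum(work_j) / n
    sxx = sum((t - mean_t) ** 2 for t in durations_s)
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(durations_s, work_j))
    cp = sxy / sxx                  # slope: critical power in watts
    w_prime = mean_w - cp * mean_t  # intercept: finite work capacity in joules
    return cp, w_prime


# Hypothetical exhaustive bouts of 2, 5, 12 and 20 minutes (made-up numbers)
durations_s = [120, 300, 720, 1200]
powers_w = [400, 310, 275, 265]
cp, w_prime = fit_critical_power(durations_s, powers_w)
print(f"CP: {cp:.0f} W, W': {w_prime / 1000:.1f} kJ")
```

The point of the sketch is that CP comes out of several maximal efforts of different lengths fitted together, which is a different thing from a single 20- or 60-minute effort.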
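The 8% figure is simply the 20 W gap expressed relative to the roughly 255 W threshold. A tiny sketch of that arithmetic (the function name is my own):

```python
def percent_overestimate(overestimate_w, reference_w):
    """Express an absolute overestimate in watts as a percentage of the reference."""
    if reference_w <= 0:
        raise ValueError("reference power must be positive")
    return 100.0 * overestimate_w / reference_w


# 20 W overestimate on a threshold of about 255 W, as quoted above
pct = percent_overestimate(20, 255)
print(f"Overestimation: {pct:.1f}%")
```

This comes out a little under 8%, which rounds to the figure quoted.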
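To illustrate how the choice of mathematics alone can shift the answer, here is a sketch that fits the same set of efforts with two common linear forms of the CP model: work against time (CP is the slope) and power against 1/time (CP is the intercept). Both describe the same underlying curve, but a least-squares fit weights the points differently in each case, so noisy data give different CP values. The data and function names are invented for illustration, and the size of the gap here says nothing about the size of the gap in the published studies.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cp_work_time(durations_s, powers_w):
    """Work-time model: work = W' + CP * t, so CP is the slope."""
    work_j = [p * t for p, t in zip(powers_w, durations_s)]
    cp, _w_prime = fit_line(durations_s, work_j)
    return cp


def cp_inverse_time(durations_s, powers_w):
    """Power vs 1/time model: P = CP + W' * (1/t), so CP is the intercept."""
    inv_t = [1.0 / t for t in durations_s]
    _w_prime, cp = fit_line(inv_t, powers_w)
    return cp


# Hypothetical, slightly noisy efforts of 2, 5, 12 and 20 minutes
durations_s = [120, 300, 720, 1200]
powers_w = [400, 320, 285, 275]
cp_a = cp_work_time(durations_s, powers_w)
cp_b = cp_inverse_time(durations_s, powers_w)
print(f"work-time model CP: {cp_a:.1f} W")
print(f"1/time model CP:    {cp_b:.1f} W")
```

Same rider, same efforts, two different "thresholds", before we even consider training status or cadence.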
What else does FTP testing not tell us as athletes?
An important factor in developing an effective training program is to know what our physiological strengths and weaknesses are. As part of determining where the weaknesses lie we need to look at factors such as aerobic or anaerobic capacity, or how economical an athlete may be (the oxygen cost of cycling at certain intensities). What we get from FTP testing is one value, ‘a performance measure over one hour’. We do not get a measure of oxygen cost (or oxygen cost per watt, i.e. economy), lactate threshold, or similar measures that are independent of the psychological motivation to complete a test to full exhaustion. In fact, in most lab-based tests of aerobic capacity most athletes can generate a value well before physical exhaustion.
Another important factor is the assessment of fuel use across a given range of exercise intensities. What I mean by this is how much fat (grams/min) and carbs (grams/min) are you burning to maintain a given effort (say 200w vs. 250w). You may ask why is this important?
Well, for any event exceeding 2.5-3 hours in duration it can be massively important, as the results from sub-maximal and max testing can give an indication of how many carbs we would need to take on board (given stored carbs of circa 400-500 g) to get us through an event. For Ironman-based events such information can be vital to effectively determine pacing and nutritional (carb) intake requirements.
So what about the practicalities of getting testing carried out in a lab (no, I don’t do such testing)? A submax test (a check of the body’s response to aerobic work up to threshold), a max test (anaerobic capacity and maximum oxygen uptake) and an LT test carried out for cycling and running may cost in the region of £300-400 in the UK. For cyclists only needing a bike test, or runners needing a run test, it’s going to be half this cost. When you think about the money spent on a new wheel, helmet or the latest watch, such costs spread over the course of a year should not break the bank for most. The data from such testing should not be underestimated and can be massively important in tracking fitness, but more importantly in identifying how a training program should be structured and how much time should be dedicated to base, build and competition-specific periods.
So whilst FTP tests are great as a performance measure, and I do believe performance is the best measure of performance, they are limited as a tool for accurately setting up training zones. Moreover, few of us compete only in 20-minute or even 60-minute time trials. As such, I would rather base my performance on a performance trial that is closer to what I would experience in a race. The problem is I do Ironman, and other than jumping into a half Ironman I don’t think any performance test would be appropriate.
FTP tests repeated over time can help as a measure of improvement in fitness/performance once any learning effects are overcome (i.e. the first time you do an FTP test you may go out too hard and burn out; the next time you will pace better, spreading the effort over the 20 minutes). However, what I am discussing in this blog is the data in the scientific literature. Maybe tomorrow a new study will find some other reason why the 20-minute FTP test is accurate as a measure of threshold; however, until I see that evidence I can only base my views on what I have read so far.
For setting training zones I want to know how my body is reacting internally: how much oxygen, carbs and fat I am using at a given intensity (heart rate, power, or velocity) and how much lactate I am producing. Psychologically, I cannot significantly control my lactate response or the amount of oxygen my muscles consume for a given power, yet I can control how hard I feel I am pushing for the FTP test.
I am sure many coaches would swear that FTP is a great way to monitor athletes and set up training zones, but is this because they don’t have access to other forms of testing? Is it because FTP is quick and easy, needing limited equipment? Have they actually looked at the other options? The bro-science response of “well, my athlete did X or qualified for Y using FTP” is not a response to the limitations discussed above. Maybe if they used other ways to set up training their athletes would have achieved their goals earlier, or perhaps many of their athletes don’t achieve but they just pull out those that have as a defence.
In conclusion, FTP has its limitations, and if it works for you as a coach or athlete and you are improving year on year then keep on using it. However, don’t do it blindly. Always consider why you are doing something. What are the limitations? Is it based on real evidence? I will in later blogs look at the other measures I mention above, such as lactate threshold, VO2max etc., but for now I hope you find this blog useful.
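As a rough illustration of why the fuel-use numbers matter, the sketch below takes a measured carb burn rate at race intensity, multiplies it by the event duration, and subtracts the stored carbs to see what would have to be taken on board. The burn rate, the duration and the function name are hypothetical, and treating the whole 400-500 g store as usable is a simplification on my part.

```python
def carbs_to_ingest_g(burn_rate_g_per_min, duration_min, stored_carbs_g=450):
    """Estimate carbs (grams) that must be eaten or drunk during an event.

    Assumes a constant burn rate and that all stored carbs are available,
    both of which are simplifications for illustration only.
    """
    if burn_rate_g_per_min < 0 or duration_min < 0:
        raise ValueError("burn rate and duration must be non-negative")
    total_burned_g = burn_rate_g_per_min * duration_min
    return max(0.0, total_burned_g - stored_carbs_g)


# Hypothetical athlete burning 2.5 g/min of carbs over a 5-hour bike leg
duration_min = 300
needed_g = carbs_to_ingest_g(2.5, duration_min)
per_hour_g = needed_g / (duration_min / 60)
print(f"Carbs to take on board: {needed_g:.0f} g (about {per_hour_g:.0f} g per hour)")
```

Drop the intensity so the burn rate falls and the required intake falls with it, which is the pacing and nutrition trade-off that a single FTP number cannot show you.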
Keep training and best of luck for 2017!
PS. I asked some of the key authors behind the FTP test for comment on what I feel are the limitations before writing this blog, but received no response.
- Allen H, Coggan A. (2006) Training and racing with a power meter. VeloPress, Colorado USA.
- Ibid, pg.51
- Coyle EF, Coggan AR, Hopper MK, Walters TJ. Determinants of endurance in well-trained cyclists. J. Appl. Physiol. 64:2622-2630, 1988.
- Hill AV (1927). Speed and energy requirement. In Muscular Movement in Man, pp. 41–44. McGraw-Hill, New York.
- Monod H & Scherrer J (1965). The work capacity of a synergic muscular group. Ergonomics 8, 329–338.
- Poole DC, Ward SA, Whipp BJ. The effects of training on the metabolic and respiratory profile of high-intensity cycle ergometer exercise. Eur J Appl Physiol. 1990;59:421–9.
- Pringle JSM, Jones AM. Maximal lactate steady state, critical power and EMG during cycling. Eur J Appl Physiol. 2002;88:214–26.
- Maturana FM, Keir DA, McLay KM, Murias JM. Can measures of critical power precisely estimate the maximal metabolic steady-state? Appl Physiol Nutr Metab. 2016;41:1197–1203.
- Ibid n8, pg 218, 222
- Bull AJ, Housh TJ, Johnson GO, Perry SR. Effect of mathematical modeling on the estimation of critical power. Med Sci Sport & Ex. 2000; 32 (2), 526–530
- Barker T, Poole DC, Noble ML, Barstow TJ. Human critical power – oxygen uptake relationship at different pedaling frequencies. Exp Physiol. 2006;91(3):621–632.