# Chapter 22 Successes and limitations of PLS regression analysis

## Learning goals for this lesson

- Learn about the mixed success of applying PLS regression in various contexts
- Understand important limitations of PLS regression

## 22.1 PLS regression

We learned about Projection-to-Latent-Structures (PLS) regression (also known as Partial Least Squares regression) in the previous lesson on Delineating temperature response phases with PLS regression. In the context of phenology analysis, we can use this method to correlate high-resolution temperature data (e.g. daily data) with low-resolution (annual) data on the timing of phenology events. We realized already, however, that in the case of pears at Klein-Altendorf we were only really able to recognize the forcing period (where warm conditions advance bloom), while the chilling phase remained obscure. This was a bit disappointing, because the two dormancy phases had emerged quite clearly in the study on walnut leaf emergence in California. Let’s look at a few more examples to understand where and when this works - and to try to figure out why.

## 22.2 PLS examples

### 22.2.1 Grasslands on the Tibetan Plateau

In one of our first applications of the PLS methodology, we evaluated the temperature responses of grasslands on the Tibetan Plateau. Specifically, we looked at how the beginning of the growing season has responded to climate change. When we just look at the trend over time, the pattern that emerges is rather confusing, with a fairly clear advancing trend until the late 1990s, followed by a surprising delay in ‘green up’ dates.

Similar to what we found for walnuts in California, we detected a conspicuous relationship between warm temperatures in winter and delayed beginning of the growing season in spring.

We later added a spatial component to this analysis, investigating vegetation responses to temperature on a pixel-by-pixel basis.

In principle, the temperature response pattern of grasslands is thus similar to what we’ve seen for walnuts in California. The mechanisms at work are probably quite different, so we should not jump to conclusions without adequate knowledge of grassland ecology (which I don’t have). These findings are concerning, however, because our initial expectation would probably have been that increasing temperature allows vegetation to get going earlier in the year. Failure of the vegetation to keep up with increasingly available thermal resources indicates a possible mismatch of the established ecosystems with future climatic conditions. Such a mismatch is usually not sustainable, and it may open opportunities for invasive species that are better able to exploit the climatic ‘resources’ that will be available in the future. Well, since I don’t know much about what’s going on here ecologically, I’ll stop speculating. Let’s rather turn our focus back to deciduous trees.

### 22.2.2 Deciduous trees

In many of the early PLS analyses of tree phenology, I collaborated with Guo Liang, who was then a PhD student at the Kunming Institute of Botany in China (working in the group of Xu Jianchu, who also runs the regional office of World Agroforestry that is responsible for East and Central Asia). Guo Liang has since become a Full Professor and now runs his own group at Northwest A&F University in China.

In his first analysis, Guo Liang looked at the phenology of [Chinese chestnuts](https://en.wikipedia.org/wiki/Castanea_mollissima) grown in Beijing, China. Here are the findings:

Once again, we can quite clearly see the forcing period - the long period of consistent negative model coefficients from January to May. The chilling period is also somewhat visible, but model coefficients are much less consistent, with many ‘unimportant’ values and even some interruptions.

A similar analysis of cherry phenology from Campus Klein-Altendorf produced quite similar results:

Also here, we see the pronounced forcing phase, which follows a chilling period that is difficult to delineate.

A common pattern that emerges here is that the forcing phase is clearly visible, while the chilling phase is hard to see. This is disappointing after the very clear pattern we found earlier in California:

### 22.2.3 Why we’re not seeing the chilling phase

Does failure of the chilling phase to show up in the output of the PLS regression indicate that the method isn’t as useful for this purpose as we initially thought? Well, let’s not give up so easily, but rather look at what exactly PLS is sensitive to.

In the spider mite example, PLS regression was sensitive to the quantity of reflected radiation that reached the sensor, with greater reflectance at certain wavelengths and lower reflectance at other wavelengths indicating mite damage severity. In detecting the forcing phase, PLS responded to temperature, with higher temperatures indicating greater heat accumulation, which was in turn related to early bloom.

In all of these cases, changes in the response variable were monotonically related to changes in the signal, i.e. a stronger signal consistently meant a greater (or consistently a smaller) response. The following figure illustrates why this doesn’t work for chill accumulation. Let’s look at the temperature ranges that the chill models respond to and compare them with the temperature ranges that we can observe at the three study locations during the winter months.
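To make the point about monotonic relationships a bit more tangible, here is a small toy calculation in base R. Both response curves are made up for illustration (they are not real heat or chill models): `toy_heat` rises with every degree above an assumed threshold of 4 °C, while `toy_chill` is a hump with an assumed optimum at 5 °C. A method that looks for linear relationships picks up the first signal easily, but has little to work with for the second.

```r
# Toy illustration with made-up response curves (NOT real chill or heat models)
temps <- seq(-10, 20, by = 1)  # constant temperatures in °C, centered on 5 °C

# Hypothetical monotonic "forcing" response: more heat for every degree above 4 °C
toy_heat <- pmax(0, temps - 4)

# Hypothetical hump-shaped "chilling" response: optimum at 5 °C, zero beyond +/- 8 °C
toy_chill <- pmax(0, 1 - abs(temps - 5) / 8)

# Linear correlation between temperature and each response
cor_heat <- cor(temps, toy_heat)
cor_chill <- cor(temps, toy_chill)

round(c(heat = cor_heat, chill = cor_chill), 2)
```

The monotonic response is strongly correlated with temperature, whereas the hump-shaped response, which is symmetric around the center of the temperature range in this toy setup, shows essentially no linear correlation at all, even though it depends entirely on temperature.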

To determine the range of effective temperatures for the various chill models we’ve already worked with, let’s see how much chill they produce at various levels of constant temperatures (I’m omitting chill days here, because this model doesn’t work with constant temperatures):

```
library(chillR)
library(dormancyR)
library(ggplot2)
library(kableExtra)
library(patchwork)
library(tidyr) # provides pivot_longer() and the %>% pipe used below

# Chill models that take a vector of hourly temperatures
hourly_models <-
  list(
    Chilling_units = chilling_units,
    Low_chill = low_chill_model,
    Modified_Utah = modified_utah_model,
    North_Carolina = north_carolina_model,
    Positive_Utah = positive_utah_model,
    Chilling_Hours = Chilling_Hours,
    Utah_Chill_Units = Utah_Model,
    Chill_Portions = Dynamic_Model)

# Chill models that take a data.frame of daily temperatures
daily_models <-
  list(
    Rate_of_Chill = rate_of_chill,
    Exponential_Chill = exponential_chill,
    Triangular_Chill_Haninnen = triangular_chill_1,
    Triangular_Chill_Legave = triangular_chill_2)

# Column names and plot labels, in the same order
metrics <- c(names(daily_models),
             names(hourly_models))

model_labels <- c("Rate of Chill",
                  "Exponential Chill",
                  "Triangular Chill (Hänninen)",
                  "Triangular Chill (Legave)",
                  "Chilling Units",
                  "Low-Chill Chill Units",
                  "Modified Utah Chill Units",
                  "North Carolina Chill Units",
                  "Positive Utah Chill Units",
                  "Chilling Hours",
                  "Utah Chill Units",
                  "Chill Portions")

# Chill accumulated after 1000 time steps at each constant temperature
# (the loop variable is not called T, since R uses T as shorthand for TRUE)
for(temp_c in -20:30)
{
  hourly <- sapply(hourly_models,
                   function(x)
                     x(rep(temp_c, 1000))
                   )[1000,]

  temp_frame <- data.frame(Tmin = rep(temp_c, 1000),
                           Tmax = rep(temp_c, 1000),
                           Tmean = rep(temp_c, 1000))

  daily <- sapply(daily_models,
                  function(x)
                    x(temp_frame)
                  )[1000,]

  if(temp_c == -20)
    sensitivity <- c(T = temp_c,
                     daily,
                     hourly) else
      sensitivity <- rbind(sensitivity,
                           c(T = temp_c,
                             daily,
                             hourly))
}

# Scale each model's output by its maximum, so that models are comparable
sensitivity_normal <-
  as.data.frame(cbind(sensitivity[,1],
                      sapply(2:ncol(sensitivity),
                             function(x)
                               sensitivity[,x]/max(sensitivity[,x]))))

colnames(sensitivity_normal) <- colnames(sensitivity)

# Long format for ggplot2: one row per temperature and model
sensitivity_gg <-
  sensitivity_normal %>%
  pivot_longer(Rate_of_Chill:Chill_Portions)

# Hide temperatures at which a model produces (almost) no chill
sensitivity_gg$value[sensitivity_gg$value <= 0.001] <- NA

# Fix the factor levels to the order of 'metrics'; otherwise the levels are
# sorted alphabetically and no longer line up with 'model_labels'
chill <-
  ggplot(sensitivity_gg,
         aes(x = T,
             y = factor(name, levels = metrics),
             size = value)) +
  geom_point(col = "light blue") +
  scale_y_discrete(labels = model_labels) +
  ylab("Chill model") +
  xlab("Temperature (assumed constant, °C)") +
  xlim(c(-30, 40)) +
  theme_bw(base_size = 15) +
  labs(size = "Chill \nWeight")
```
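If the loop above looks a bit opaque, the following base-R sketch may help. It applies the same constant-temperature idea to a single model only. Note that `toy_chilling_hours` is a hypothetical helper written for this illustration, not a `chillR` function; I'm assuming the classic Chilling Hours definition of one unit for every hour with a temperature between 0 and 7.2 °C.

```r
# Simplified stand-in for a Chilling Hours model (assumption: one unit for
# every hour between 0 and 7.2 °C); this is NOT the chillR implementation
toy_chilling_hours <- function(hourly_temps) {
  cumsum(hourly_temps >= 0 & hourly_temps <= 7.2)
}

# Chill accumulated after 1000 hours at each constant temperature
constant_temps <- -20:30
toy_sensitivity <- sapply(constant_temps, function(temp_c) {
  toy_chilling_hours(rep(temp_c, 1000))[1000]
})

# Normalize by the maximum and keep temperatures with a non-negligible weight
toy_normal <- toy_sensitivity / max(toy_sensitivity)
effective_temps <- constant_temps[toy_normal > 0.001]

range(effective_temps)
```

For this toy model, only a narrow band of temperatures contributes any chill, and everything colder or warmer counts for nothing. That is exactly the kind of non-monotonic response discussed in this section.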

Now let’s summarize winter temperatures at the three locations for which we’ve seen phenology responses above: Klein-Altendorf (Germany), Beijing (China) and Davis (California). You can use the following buttons to download the temperature data. If you save them in the `data` subfolder of your working directory, all the code below should work well.