TREND MODEL OF SEA EXTREMES

The confidence interval for the return level is re-examined. A new method is proposed that employs an implicit link function, a plug-in estimation, and a gamma distribution whose shape parameter is given by the degree of experience. The method is applied to a trend model of sea extremes. A notable finding is that the passage of time widens the confidence interval even for the stationary model.


Introduction
The statistical analysis of extreme values has become widely applied and has been demonstrated in many practical studies. Bernardara et al. (2010), Burzel et al. (2010), Dong and Ji (2010), Goda et al. (2010), Jonkman et al. (2010), Kawai et al. (2010), Kortenhaus and Oumeraci (2010), Mendez et al. (2010), Mendoza et al. (2010), Mertens et al. (2010), Mudersbach and Jansen (2010), Naulin et al. (2010), Yasuda et al. (2010) and Wahl et al. (2010) appear among the papers of this conference. These studies discuss regional analysis, multivariate extreme value models based on copulas, joint distributions of waves and surges, probability distributions obtained by numerical simulation with stochastic typhoon models, parameter estimation by the L-moments method, flood risk analysis, non-stationary models with covariates, and so on. Given these advanced applications, the theoretical procedures of extreme value analysis deserve a fundamental re-examination.
Extreme value analysis has long been troubled by two major problems: (1) extrapolation and (2) the symmetric confidence interval conventionally assigned to the return level. (1) Extrapolation is essential and inevitable in extreme value analysis. The fitted line can apparently be extended as far as we like toward the target return period. Prudent engineers may believe that there is a limit to this extension, but no one has discussed it quantitatively or clarified where the limit lies. Some argue that displaying the confidence intervals for various return levels lets us check the statistical variability, instead of considering the limit of extending the return level. However, the confidence intervals are only relative and mutually comparable: the interval for a longer return period is wider than that for a shorter one. They do not give the limit. Moreover, the confidence interval is not suitable for comparing estimation errors with one another. (2) This is because the confidence interval for the return level should be skewed, whereas the conventional procedure makes it symmetric about the point estimate. Some prudent engineers suspect that the confidence interval is not symmetric, considering where the data lie relative to the return level and its point estimate. Both problems originate in the treatment of estimation errors.
In this study we introduce the degree of experience to solve these essential problems in extreme value statistics. For simplicity, we focus on the Gumbel-type case. Finally, we apply the method to a non-stationary extreme value model with a temporal trend.

Occurrence rate, return period and the return level
First, we employ the following definition of the return period:
$$ R = \frac{1}{\lambda(x)} \qquad (1) $$
where λ(x) is the occurrence rate of exceeding the level x, described in detail later. It differs from the usual definition of the return period:
$$ R = \frac{1}{1 - F(x)} \qquad (2) $$
where F(x) is the cumulative distribution function of the annual maximum level x. The return period is the expected time interval between successive exceedances of the level. Simply put, it is the mean period, and its reciprocal, the frequency, is the occurrence rate per unit time. The difference between the two definitions lies in the unit time for the occurrence. For the usual definition of Eq. 2, the unit time is one year and the event is the annual maximum, so the annual maximum exceeds the level either once or not at all in a year, never twice. The definition of Eq. 1, on the other hand, takes several occurrences per year into account. It can be regarded as the return period for Peaks Over Threshold (POT). This definition may be used even when the data are recorded not as POT but as the annual maximum or the annual largest values of sea extremes (wave heights, sea levels and so on).
From Eq. 1, the return level x_R is linked to the occurrence rate and to the parameters µ and σ, shown here for the case of a Gumbel distribution:
$$ x_R = \mu + \sigma \log R = \mu - \sigma \log \lambda \qquad (3) $$
Moreover, we can build statistical models that describe the parameters µ and σ in terms of covariates, for example a trend in the location parameter with time t:
$$ \mu(t) = \mu_0 + \mu' t \qquad (4) $$
which is employed later as the trend model for the sea levels of Venice. We introduce here the occurrence rate as the "implicit" link function:
$$ \lambda(x) = \exp\left( -\frac{x - \mu}{\sigma} \right) \qquad (5) $$
from which the return level of Eq. 3 is recovered by solving the following equation:
$$ \lambda(x_R) = \frac{1}{R} \qquad (6) $$
To treat the rate as a statistical variable in the next steps, we consider the Poisson distribution with the occurrence rate λ per unit time (= 1 year):
$$ \Pr(j; \lambda) = \frac{\lambda^{j} e^{-\lambda}}{j!} \qquad (7) $$
where j denotes the number of occurrences per year exceeding a given threshold. Putting j = 0 (no occurrence) and λ = λ(x), the resulting probability is the cumulative (non-exceedance) probability for the level x, that is,
$$ P(x) = \exp\{ -\lambda(x) \} = \exp\left[ -\exp\left( -\frac{x - \mu}{\sigma} \right) \right] \qquad (8) $$
As a consequence, a Gumbel distribution P(x) is derived naturally. According to the relation between the cumulative probability and the rate shown in Eq. 8, when the conventional definition of the return period is needed, it is enough to replace Eq. 6 by
$$ \lambda(x_R) = -\log\left( 1 - \frac{1}{R} \right) \qquad (9) $$
The definition of Eq. 1 differs from that of Eq. 2 only for smaller return periods, because the following relation holds for larger return periods,
$$ -\log\left( 1 - \frac{1}{R} \right) \approx \frac{1}{R} \qquad (10) $$
so that Eq. 6 is satisfied approximately.
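The relations between the rate, the return level and the two definitions of the return period can be checked numerically. The following is a minimal sketch, assuming the Gumbel-type link function λ(x) = exp(−(x − µ)/σ) and the non-exceedance probability exp(−λ); the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def occurrence_rate(x, mu, sigma):
    """Implicit link function: rate of exceeding level x (Gumbel type)."""
    return math.exp(-(x - mu) / sigma)

def return_level(R, mu, sigma):
    """Return level for return period R under R = 1 / lambda."""
    return mu + sigma * math.log(R)

def gumbel_cdf(x, mu, sigma):
    """Poisson probability of no exceedance in one year, exp(-lambda(x))."""
    return math.exp(-occurrence_rate(x, mu, sigma))

def conventional_return_period(lam):
    """Annual-maximum return period 1 / (1 - F) with F = exp(-lambda)."""
    return 1.0 / (1.0 - math.exp(-lam))

mu, sigma = 100.0, 10.0  # hypothetical location and scale, for illustration only
for R in (1.0, 2.0, 10.0, 100.0):
    x = return_level(R, mu, sigma)
    lam = occurrence_rate(x, mu, sigma)
    print(f"R={R:6.1f}  x_R={x:7.2f}  1/lambda={1.0 / lam:7.2f}  "
          f"1/(1-F)={conventional_return_period(lam):7.2f}")
```

In this sketch the two return periods are clearly different near R = 1 and become nearly equal for large R, which is the behaviour described in the text.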
The rate is also called the intensity of occurrence and is closely related to the point process of extremes. The point process characterization of extremes leads to the likelihood functions for the threshold excess model and the r largest order statistic model; see Chapter 7 of the textbook by Coles (2001) for a detailed description.

Bayesian inference
The occurrence rate for a certain level is the target of inference. In Bayesian statistical inference, the occurrence rate is treated as a random variable that follows a gamma distribution:
$$ f(\lambda) = \frac{\nu^{K}}{\Gamma(K)} \lambda^{K-1} e^{-\nu \lambda} \qquad (11) $$
which is the natural conjugate distribution for the Poisson distribution. Examples are shown in Fig. 1. With K = 18 exceedances over a threshold during ν = 30 years, we estimate the occurrence rate as λ̂ = 0.6 (per year) with a moderate error, whereas with K = 3 exceedances during ν = 5 years we can only guess the rate vaguely, with a very wide range of error. In contrast, with K = 90 exceedances during ν = 150 years, our belief becomes stronger and more concentrated, and the probable range of the estimate becomes narrower. The shape parameter of the gamma distribution thus plays an important role in expressing "the degree of experience" in counting the number of exceedances. The gamma distribution of Eq. 11 has the mean and variance
$$ \mathrm{E}[\lambda] = \frac{K}{\nu}, \qquad \mathrm{Var}[\lambda] = \frac{K}{\nu^{2}} \qquad (12) $$
Rearranging these gives the shape parameter of the gamma distribution,
$$ K = \frac{(\mathrm{E}[\lambda])^{2}}{\mathrm{Var}[\lambda]} = \frac{1}{\mathrm{CV}[\lambda]^{2}} \qquad (13) $$
which takes a simple form in terms of the coefficient of variation (CV) of the rate. The parameters µ and σ of the Gumbel distribution are estimated from the data set of sea extremes. Then the statistical variability is obtained by a kind of chain rule through the implicit link function of Eq. 5 (Eq. 14), where ∇ is the differential operator, written as
$$ \nabla = \left( \frac{\partial}{\partial \mu}, \frac{\partial}{\partial \sigma} \right)^{T} \qquad (15) $$
in the stationary Gumbel-type model with only the two parameters µ and σ.
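The three cases quoted in the text (K = 3, 18 and 90 exceedances in ν = 5, 30 and 150 years) can be compared with a short calculation. This is a sketch only, assuming a gamma distribution with shape K and rate ν, so that the mean is K/ν and the coefficient of variation is 1/√K. The helper names are hypothetical, and the quantiles are found with a plain power series and bisection so that only the standard library is needed.

```python
import math

def gamma_cdf(x, shape, rate):
    """Regularized lower incomplete gamma P(shape, rate * x), by power series."""
    z = rate * x
    if z <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    for n in range(1, 100000):
        term *= z / (shape + n)
        total += term
        if term < 1e-16 * total:
            break
    value = total * math.exp(shape * math.log(z) - z - math.lgamma(shape))
    return min(1.0, value)

def gamma_quantile(p, shape, rate):
    """Quantile of the gamma distribution by bisection on the CDF."""
    lo, hi = 0.0, (shape + 10.0 * math.sqrt(shape) + 10.0) / rate
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def summarize(K, nu):
    """Mean, coefficient of variation and 95 % interval of the rate."""
    mean = K / nu
    cv = 1.0 / math.sqrt(K)
    return mean, cv, gamma_quantile(0.025, K, nu), gamma_quantile(0.975, K, nu)

for K, nu in ((3, 5), (18, 30), (90, 150)):
    mean, cv, lower, upper = summarize(K, nu)
    print(f"K={K:3d}  nu={nu:4d}  mean={mean:.2f}  CV={cv:.3f}  "
          f"95% interval=({lower:.3f}, {upper:.3f})")
```

All three cases share the same mean rate of 0.6 per year; only the shape parameter K, that is, the degree of experience, changes the width and the skewness of the interval.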
The value obtained by Eq. 13 is called the degree of experience; it expresses the degree of belief regarding the statistical error of the occurrence rate. Non-Bayesian derivations for the threshold model are given in Kitano et al. (2008) and Kitano et al. (2009). For the stationary Gumbel model of the r largest order statistics, where r is the number of available orders, N is the record length in years and rN is the total number of extreme values, Fig. 2 shows the contour lines of K/N. The profiles for the available order numbers r = 1, 3 and 10 are shown in Fig. 3, and the profiles normalized by the total data size, K/(rN), are shown in Fig. 4. It is clear there that the maximum value of K almost agrees with the total data size rN and is attained at the return period R = 1/(r + 0.5). This return level corresponds naturally to the lowest level of the r largest order statistic model, because r + 0.5 lies between the last employed order r and the next order r + 1, which is not employed. Therefore, the degree of experience K can be regarded as the effective number of data employed for the estimation. The lowest level is estimated from all of the data, the degree of experience decreases toward the record maximum, and a return level extrapolated to a large return period is effectively estimated from a very small number of data.
The 95 % confidence intervals for various values of the degree of experience (K = 50, 5, 2, 1.5 and 1) are drawn in Fig. 5, in comparison with those for an occurrence rate λ following the normal and the log-normal distributions. The normally distributed occurrence rate is named the "rustic" case, for a reason that will become clear in the later discussion of the return level. Since the rate cannot take negative values, the Gaussian density is truncated and the lower confidence bound breaks down. The gamma density with a shape parameter less than 1 remains open (it does not fall to zero) at zero occurrence rate, which is not a truncation, whereas the log-normal density always passes through the origin. Note that both the upper and the lower ends of the confidence interval of the gamma distributed occurrence rate appear moderate, lying between those of the normal case and those of the log-normal case.
Since the occurrence rate and the return period are related by Eq. 1, the density and the confidence interval of the return period are obtained by a straightforward change of variable, as shown in Fig. 6. Every density function has a peak.
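The change of variable can be written out explicitly. As a sketch, assuming the rate follows a gamma distribution with shape K and rate ν and that R = 1/λ, the density of the return period is the rate density evaluated at 1/R and divided by R². Setting the derivative of its logarithm to zero gives a single peak at R = ν/(K + 1) for any K > 0. The function names below are hypothetical.

```python
import math

def rate_density(lam, K, nu):
    """Gamma density of the occurrence rate (shape K, rate nu)."""
    log_f = K * math.log(nu) + (K - 1.0) * math.log(lam) - nu * lam - math.lgamma(K)
    return math.exp(log_f)

def return_period_density(R, K, nu):
    """Density of R = 1 / lambda by change of variable: f(1/R) / R**2."""
    return rate_density(1.0 / R, K, nu) / (R * R)

def return_period_mode(K, nu):
    """Location of the peak of the return-period density."""
    return nu / (K + 1.0)

for K, nu in ((18.0, 30.0), (1.0, 5.0)):
    mode = return_period_mode(K, nu)
    print(f"K={K}  nu={nu}  peak at R={mode:.3f}  "
          f"density={return_period_density(mode, K, nu):.4f}")
```

The second case (K = 1) illustrates that the return-period density has an interior peak even when the density of the rate itself has none.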

Inconclusive judgment
A problem arises concerning the cut-off value: a rule of thumb giving the minimum required degree of experience would be convenient. Such a value could probably be sought mathematically, but here we look for it in human intuition. Among the many Japanese proverbs, a suitable one is "What happens twice will happen three times," which can be interpreted to mean that an event occurring at least twice is worth discussing. On the other hand, a discussion of an event occurring fewer than two times should be regarded as inconclusive and suspended until more information on the occurrences, that is, enough data, has been collected. Other languages have similar sayings, for instance in Italian and in Spanish (thanks to Dr. Minguez, S. R.). We apply this rule to the degree of experience for extremes. For example, a record length of at least 20 years is required for estimating the level of the 50-year return period. As seen in Fig. 2, the degree of experience per year K/N is 0.108 for the return period of 50 years when the annual largest values up to the third order are available, which gives 0.108 × 20 ≈ 2.2 (> 2.0).
If the degree of experience is lower than 2, we should wait until a sufficient length of observation has accumulated, or turn to regional frequency analysis, "trading space for time"; see, for example, van Gelder et al. (2000).
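The rule of thumb amounts to a one-line check. The sketch below uses the value K/N = 0.108 quoted in the text for the 50-year return period with r = 3; the function names are hypothetical, and the cut-off of 2 is the one proposed above.

```python
import math

K_MIN = 2.0  # cut-off proposed in the text

def degree_of_experience(k_per_year, years):
    """Degree of experience K from the per-year value K/N and the record length N."""
    return k_per_year * years

def is_conclusive(k_per_year, years, k_min=K_MIN):
    """True when the degree of experience reaches the cut-off."""
    return degree_of_experience(k_per_year, years) >= k_min

def minimum_record_length(k_per_year, k_min=K_MIN):
    """Smallest whole number of years for which K reaches the cut-off."""
    return math.ceil(k_min / k_per_year)

k_per_year = 0.108  # read from Fig. 2 in the text: R = 50 years, r = 3
print(degree_of_experience(k_per_year, 20), is_conclusive(k_per_year, 20))
print(minimum_record_length(k_per_year))
```

Under this reading the cut-off is first reached after 19 years, in line with the roughly 20 years quoted in the text.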

Skewed confidence interval for the return level by "plug-in" estimation
For the next step, we consider the translation from the occurrence rate to the return level. In the previous section we obtained the confidence interval of the occurrence rate. By substituting its bounds λ̂_lower and λ̂_upper for λ̂ on the right-hand side of the following equation,
$$ \exp\left( -\frac{\hat{x}_R - \hat{\mu}}{\hat{\sigma}} \right) = \hat{\lambda} \qquad (17) $$
whose left-hand side is Eq. 5 with the estimated parameters µ̂ and σ̂ plugged in, we finally obtain the "plug-in" estimates x̂_R.

Figure 8. Confidence intervals of return level by the method of employing the gamma and log-normal distributions
Fig. 7 shows the confidence intervals of the return level obtained by plugging in the rustic estimate of the occurrence rate. It also shows the conventional, stereotyped confidence intervals of the return level:
$$ \hat{x}_R \pm 1.96 \sqrt{\mathrm{Var}(\hat{x}_R)} \qquad (18) $$
where the values ±1.96 are the standard normal quantiles for the probabilities 0.025 and 0.975. Note that the upper bound of the confidence interval of the return level by the rustic estimation diverges to infinity, because the lower bound of the occurrence rate takes negative values. This is why it is named "rustic". Fig. 8 shows the confidence intervals obtained with the log-normal and the gamma distributed occurrence rates. Surprisingly, those by the log-normal are equivalent to the conventional stereotyped ones. The reason is clear from the relation that holds for the Gumbel-type model (Eq. 19). The plug-in bound of x̂_R in the log-normal case is obtained through Eq. 20, which, by using Eq. 3, can also be expressed as Eq. 21. Then, by applying Eq. 19 to the right-hand side of Eq. 21, we can confirm the equivalence of the confidence intervals given by both methods.
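The behaviour of the rustic and the log-normal cases can be illustrated with a small calculation. This is a sketch under stated assumptions, not the derivation of the paper: the coefficient of variation of the rate is taken as 1/√K, the log-normal case is assumed to have standard deviation 1/√K on the log scale, and the bounds of the rate are converted to levels by x = µ̂ − σ̂ log λ. The function names and the parameter values are hypothetical.

```python
import math

Z = 1.96  # standard normal quantile for a 95 % interval

def plug_in_level(lam, mu, sigma):
    """Level for a given rate; infinite when the rate bound is not positive."""
    return mu - sigma * math.log(lam) if lam > 0.0 else math.inf

def rustic_interval(R, K, mu, sigma):
    """Normally distributed rate: symmetric in the rate, plugged into the level."""
    lam, cv = 1.0 / R, 1.0 / math.sqrt(K)
    lam_lower, lam_upper = lam * (1.0 - Z * cv), lam * (1.0 + Z * cv)
    # The upper rate bound gives the lower level bound, and vice versa.
    return plug_in_level(lam_upper, mu, sigma), plug_in_level(lam_lower, mu, sigma)

def lognormal_interval(R, K, mu, sigma):
    """Log-normally distributed rate (assumed log-scale sd of 1 / sqrt(K))."""
    lam, s = 1.0 / R, 1.0 / math.sqrt(K)
    lam_lower, lam_upper = lam * math.exp(-Z * s), lam * math.exp(Z * s)
    return plug_in_level(lam_upper, mu, sigma), plug_in_level(lam_lower, mu, sigma)

mu, sigma, R = 100.0, 10.0, 50.0  # hypothetical values, for illustration only
for K in (50.0, 5.0, 2.0):
    print(K, rustic_interval(R, K, mu, sigma), lognormal_interval(R, K, mu, sigma))
```

With these assumptions the log-normal interval is symmetric about the point estimate with half-width 1.96 σ̂/√K, while the rustic upper bound becomes infinite as soon as 1.96/√K reaches 1, that is, for K below about 3.84. The gamma case would use the gamma quantiles of the rate in place of the bounds computed here.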

Application for a trend model and the stationary model
In this study we consider a trend model whose location parameter is given by Eq. 4. The degree of experience is then obtained by the same procedure with the differential operator (Eq. 22). After some rearrangement, a formula for the degree of experience K of the trend model is derived (Eq. 23), where K_0 is the degree of experience for the stationary parameters (Eq. 24), N is the total number of years of observation, and h_ii is the leverage term of the covariate (the temporal variable in the trend model), given by Eq. 25, in which i is the ordinal number of the year in the observation. At the mid-time of the observation, i = (N + 1)/2, we have h_ii = 1/(rN) and then K = K_0, which means that the trend does not affect the degree of experience at all. Moreover, it is notable that the trend parameter µ′ appears neither in the leverage h_ii nor in the degree of experience K for the trend model. In other words, the magnitude of the trend does not change the width of the confidence intervals.
The question is whether or not the passage of time is taken into account, which plays an important role in the discussion below. The annual maximum sea levels in Venice have been examined with a trend model by Smith (1986) and Coles (2001). Fig. 9 shows the confidence intervals for the return levels of the 1, 10 and 100 year return periods. They are skewed, wide on the upper side and narrow on the lower side. In the gray regions the degree of experience is less than 2.0. Those regions are far enough from the range covered by the data. From this point of view, the same holds for the stationary case (that is, no trend). Indeed, in the theory of the degree of experience developed here, the magnitude of the trend does not appear at all in Eq. 23. We take only the passage of time into consideration, through the leverage term h_ii of Eq. 25. Fig. 10 shows an example of the confidence intervals of the 1-year return level for a simulated stationary annual maximum data set of 20 years. A shelf life is a kind of deadline for suitable consumption, not only of food, drink and medicine: it also applies to data sets of extreme values. To keep old data sets effective, we should continue to observe sea extremes so as to control the growth of the leverage. This is very natural, but easily forgotten.
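The growth of the leverage with time can be illustrated with the standard leverage of a straight-line regression. This is an assumption made for the sketch and not necessarily the exact form used in the paper; it is chosen because it equals 1/(rN) at the mid-time of the record, as stated in the text, and because it does not contain the slope. The function name is hypothetical.

```python
def leverage(i, N, r=1):
    """Leverage of year i for a straight-line fit with r values in each of years 1..N.

    Assumed form: 1/(rN) + (i - mid)^2 / (r * sum_j (j - mid)^2).
    The year i may lie beyond N, i.e. after the end of the record.
    """
    if N < 2:
        raise ValueError("at least two years are needed to fit a trend")
    mid = (N + 1) / 2.0
    sxx = r * N * (N * N - 1) / 12.0  # r * sum over j = 1..N of (j - mid)^2
    return 1.0 / (r * N) + (i - mid) ** 2 / sxx

N, r = 20, 1
for year in (10.5, 20, 30, 40):
    print(f"year {year:>5}: leverage = {leverage(year, N, r):.4f}")
```

In this sketch the leverage is smallest at the middle of the record and keeps growing for years after the last observation, whatever the size of the trend, which is the sense in which an unextended data set has a shelf life.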

Conclusions
We newly introduced the degree of experience for sea extremes with a temporal trend. In particular, it was shown that the occurrence rate can be regarded as the implicit link function between the return level and the parameters of the extreme value distribution, and that the gamma distribution is suitable for the estimation error of the occurrence rate in Bayesian inference. We then showed how to construct the confidence intervals of the return level by plug-in estimation. A formula for the degree of experience in the Gumbel-type trend model was derived, and its applicability even to stationary data sets was discussed. It should be easy to extend it to the general case (the generalized extreme value distribution) with only a little modification.

Figure 1. Concentrating degree of experience by increasing number of excesses

Figure 2. Contour lines of the degree of experience per year K/N

Figure 7. Confidence intervals of return level by the rustic method of using the normal distribution in comparison with those by the conventional method

Figure 9. Confidence intervals of Venice sea levels with the trend

References
[...]: A coastal storms intensity scale and induced coastal hazards for the NW Mediterranean, Proceedings of 32nd International Conference on Coastal Engineering, ASCE, in press.
Mertens, T., T. Verwaest, R. Delgado, K. Trouw and L. D. Nocker (2010): Coastal management and disaster planning on the basis of flood risk calculations, Proceedings of 32nd International Conference on Coastal Engineering, ASCE, in press.
Mudersbach C. and J. Jansen (2010): An advanced statistical extreme value model for evaluating storm surge heights considering systematic records and climate scenarios, Proceedings of 32nd International Conference on Coastal Engineering, ASCE, in press.
Naulin, M., A. Kortenhaus and H. Oumeraci (2010): Failure probability of flood defence structures/systems in risk analysis for extreme storm surges, Proceedings of 32nd International Conference on Coastal Engineering, ASCE, in press.
Smith, R. L. (1986): Extreme value theory based on the r largest annual events, Jour. of Hydrology, pp.27-43.
van Gelder, P. H. A. J. M., J. De Ronde, N. W. Neykov, and P. Neytchev (2000): Regional frequency analysis of extreme wave heights: trading space for time, Proceedings of 27th International Conference on Coastal Engineering, Sydney, ASCE, pp.1099-1112.
Wahl, T., J. Jansen and C. Mudersbach (2010): A multivariate statistical model for advanced storm surge analyses in the North Sea, Proceedings of 32nd International Conference on Coastal Engineering, ASCE, in press.
Yasuda, T., H. Mase, S. Kunitomi, N. Mori and Y. Hayashi (2010): Stochastic typhoon model and its application to future typhoon projection, Proceedings of 32nd International Conference on Coastal Engineering, ASCE, in press.