\documentclass[article, nojss]{jss}
\author{Ostap Okhrin \\Technische Universit\"{a}t Dresden \And
Alexander Ristig}
\title{Hierarchical Archimedean Copulae: The \pkg{HAC} Package}
\Plainauthor{Ostap Okhrin, Alexander Ristig}
\Plaintitle{Hierarchical Archimedean Copulae: The HAC Package}
\Abstract{
This paper presents the \proglang{R} package \pkg{HAC}, which
provides user friendly methods for dealing with hierarchical
Archimedean copulae (HAC). Computationally efficient estimation
procedures allow to recover the structure and the parameters of HAC
from data. In addition, arbitrary HAC can be constructed to sample
random vectors and to compute the values of the corresponding
cumulative distribution plus density functions. Accurate graphics of
the HAC structure can be produced by the \code{plot} method
implemented for these objects.
}
\Keywords{copula, \proglang{R}, hierarchical Archimedean copula, HAC}
\Plainkeywords{copula, R, hierarchical Archimedean copula, HAC}
\Address{
Ostap Okhrin\\
Chair of Econometrics and Statistics esp. Transportation\\
Institute of Economics and Transport\\
Faculty of Transportation\\
Technische Universit\"{a}t Dresden\\
01069 Dresden, Germany\\
E-mail: \email{ostap.okhrin@tu-dresden.de}\\
URL: \url{https://tu-dresden.de/bu/verkehr/ivw/osv/die-professur/inhaber-in}
}
\usepackage{enumerate}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{multirow}
\usepackage{amsthm}
\usepackage{rotating}
\newtheorem{algorithm}{Algorithm}
\usepackage[english, ngerman]{babel}
\selectlanguage{english}
\newsavebox\verbboxone
\newsavebox\verbboxtwo
%\VignetteIndexEntry{HAC}
%\VignetteDepends{HAC}
%\VignetteDepends{copula}
\begin{document}
<>=
options(prompt = "R> ", continue = "+ ", width = 70, useFancyQuotes = FALSE)
library("HAC")
@
\section[Introduction]{Introduction}\label{sec:Intro}
The use of copulae in applied statistics began in the end of the
90ies, when \citet{embrechts_mcneil_straumann_1999} introduced copula
to empirical finance in the context of risk management. Nowadays,
quantitative orientated sciences like biostatistics and hydrology use
copulae to attempt measuring the dependence of random variables, e.g.,
\citet{lakhal_2010, acar_craiu_yao_2011, bardossy_2006,
genest_favre_2007, bardossy_li_2008}. In finance, copulae became a
standard tool, explicitly on value at risk ($\operatorname{VaR}$)
measurement and in valuation of structured credit portfolios, see
\citet{mendes_souza_2004, junker_may_2005} and \citet{li_2000}. This
paper aims at providing the necessary tools for academics and
practitioners for simple and effective use of hierarchical Archimedean
copulae (HAC) in their statistical analysis.
A copula is the function splitting a multivariate distribution into
its margins and a pure dependency component. Formally, copulae are
introduced in \citet{sklar_1959} stating that if $F$ is an arbitrary
$d$-dimensional continuous distribution function of the random vector
$X=(X_1,\dots,X_d)^{\top}$, then the associated copula is unique and
defined as the continuous mapping $C:[0,1]^d\rightarrow[0,1]$ which
satisfies the equality
\begin{align*}
C(u_1,\dots,u_d)&=F\{F_1^{-1}(u_1),\dots,F_d^{-1}(u_d)\},\quad
u_1,\ldots,u_d\in[0,1],
\end{align*}
where $F_1^{-1}(\cdot),\ldots,F_d^{-1}(\cdot)$ are the quantile
functions of the corresponding continuous marginal distribution
functions $F_1(x_1),\ldots,F_d(x_d)$. Accordingly, a $d$-dimensional
density $f(\cdot)$ can be split in the copula density $c(\cdot)$ and
the product of the marginal densities. For an overview and recent
developments of copulae we refer to \citet{nelsen_2006},
\citet{cherubini_luciano_vecchiato_2004}, \citet{joe_1997} and
\citet{jaworski_durante_haerdle_2013}. If $F(\cdot)$ belongs to the
class of elliptical distributions, then $C(\cdot)$ is an elliptical
copula, which in most cases cannot be given explicitly because the
distribution function $F(\cdot)$ and the inverse marginal
distributions $F_j(\cdot)$ usually have integral representations. One
of the classes that overcomes this drawback of elliptical copulae is
the class of Archimedean copulae, which, however, is very restrictive
yet for moderate dimensions. Among other \proglang{R} \citep{R}
packages dealing with Archimedean copula \citep[see for
example][]{task}, we would like to mention the \pkg{copula} and the
\pkg{fCopulae} package, c.f.~\cite{yan_2007, kojadinovic_yan_2010,
hofert_maechler_2011, copula} and \cite{fCopulae}.
HAC generalize the concept of simple Archimedean copulae by
substituting (a) marginal distribution(s) by a further HAC. This class
is thoroughly analyzed in \citet{embrechts_lindskog_mcneil_2003,
whelan_2004, savu_trede_2010, hofert_11,
okhrin_okhrin_schmid_2013b}. The first sampling algorithms for
special HAC structures were provided by the \pkg{QRMlib} package of
\citet{QRMlib_2011}, which is not updated anymore, but several
functions were ported to the \pkg{QRM} package
\citep[see][]{QRM_2012}. \citet{nacopula} presented the comprehensive
\pkg{nacopula} package which, among other features, allows sampling
from arbitrary HAC and was integrated into the package \pkg{copula}
from version 0.8-1 on. The central contribution of the \pkg{HAC} package
\citep{hac} is the estimation of the parameter and the structure for
this class of copulae, as discussed in
\citet{okhrin_okhrin_schmid_2013a}, including a simple and intuitive
representation of HAC as \proglang{R} objects of the class
`\code{hac}'. We provide several maximum likelihood (ML) based estimation procedure, which determine the
parameter and the structure simultaneously as well as procedures for a predetermined structure estimating only the parameters. The asymptotic properties of the procedures
are rigorously studied in the literature, e.g., see \citet{okhrin_okhrin_schmid_2013a} and \citet{gorecki_hofert_holena_2014}. Besides, the package offers
functions for producing graphics of the copula's structure, for
sampling random vectors from a given copula and for computing values
of the corresponding distribution and density. An earlier version of this paper, for \pkg{HAC} 1.0-0, has been published as \citet{okhrin_ristig_2014}.
The paper is organized as follows. Section~\ref{sec:PHAC} describes
shortly the theoretical aspects of HAC and its
estimation. Section~\ref{sec:AHAC} presents the functions of the
\pkg{HAC} package and Section~\ref{sec:sim} a simulation
study. Section \ref{sec:Con}~concludes.
\section[Hierarchical Archimedean copulae]{Hierarchical Archimedean copulae}\label{sec:PHAC}
As mentioned above, the large class of copulae, which can describe
tail dependency, non-ellipticity, and, most importantly, has close
form representation
\begin{align}\label{eqn:m_arch}
C(u_1,\ldots,u_d; \theta)&=\phi_{\theta}\left\{\phi_{\theta}^{-1}(u_1)+\dots+\phi_{\theta}^{-1}(u_d)\right\},\quad
u_1,\ldots,u_d\in[0,1],
\end{align}
where $\phi_{\theta}(\cdot)\in\mathfrak{L} =
\{\phi_{\theta}:[0;\infty)\rightarrow [0,1]\,|\,
\phi_{\theta}(0)=1,\,\phi_{\theta}(\infty)=0;\,(-1)^j\phi_{\theta}^{(j)}\geq0;\,j
\in \mathbb{N} \}$ and $(-1)^{j}\phi_{\theta}^{(j)}(x)$ being
non-decreasing and convex on $[0,\infty)$, for $x>0$, is the class of
Archimedean copulae. The function $\phi(\cdot)$ is called the
generator of the copula and commonly depends on a single parameter
$\theta$. For example, the Gumbel generator is given by
$\phi_{\theta}(x) = \exp(-x^{1/\theta})$ for $0\leq x<\infty,\ 1\leq
\theta< \infty$. Detailed reviews of the properties of Archimedean
copulae can be found in \citet{mcneil_neslehova_2008} and in
\citet{joe_1997}.
A disadvantage of Archimedean copulae is the fact that the
multivariate dependency structure is very restricted, since it
typically depends on a single parameter of the generator function
$\phi(\cdot)$. Moreover, the rendered dependency is symmetric with
respect to the permutation of variables, i.e., the distribution is
exchangeable. HAC (also called nested Archimedean copulae) overcome
this problem by considering the compositions of simple Archimedean
copulae. For example, the special case of four-dimensional fully
nested HAC can be given by
\begin{align}\label{eqn:fully_nested_a}
C(u_1, u_2, u_3, u_4) &= C_3\{C_2(u_1, u_2, u_3), u_4\}\\
&= \phi_3\{\phi^{-1}_3\circ C_2(u_1,u_2,u_3)+\phi^{-1}_3(u_4)\},\nonumber
\end{align}
where $C_j(u_1, \ldots, u_{j+1})=\phi_j[\phi^{-1}_j
\{C_{j-1}(u_1,\ldots,u_{j})\}+\phi^{-1}_j(u_{j+1})]$,
$j=2,\ldots,d-1$, and $C_1 =
\phi_1\{\phi^{-1}_1(u_1)+\phi^{-1}_1(u_2)\}$. The functional form of
$C_j(\cdot)$ indicates that the composition can be applied
recursively. A different segmentation of the variables leads naturally
to more complex HAC. In the following, let $d$-dimensional HAC be
denoted by $C(u_1,\dots,u_d; s,\pmb{\theta})$, where $\pmb{\theta}$ denotes the
vector of feasible dependency parameters and $s=(\ldots(i_g
i_k)i_{\ell}\ldots)$ the structure of the entire HAC, where
$i_m\in\{1,\ldots,d\}$ is a reordering of the indices of the variables
with $m=1\ldots,d$, and $g,k,\ell\in\{1,\ldots,d: g\neq
k\neq\ell\}$. Structures of subcopulae are denoted by $s_j$ with
$s=s_{d-1}$. For instance, the structure according to
Equation~\ref{eqn:fully_nested_a} is $s=(s_2)4$ with
$s_j=(s_{j-1}(j+1))$, $j=2,3$, for the sucopulae and $s_1=(12)$. A
clear definition of the structure is essential, as $s$ is in fact a
parameter to estimate. Thus, Equation~\ref{eqn:fully_nested_a} can be
rewritten as
\begin{align*}
C(u_1, u_2, u_3, u_4; s=(((12)3)4), \pmb{\theta}) &=
C\{u_1, u_2, u_3, u_4; (s_{2}4),(\theta_1, \theta_2, \theta_3)^\top\}\\
& = \quad \phi_{\theta_{3}}(\phi^{-1}_{\theta_{3}}\circ C_2\{u_1, u_2, u_3; (s_{1}(3)),
(\theta_1,\theta_{2})^\top\}+\phi^{-1}_{\theta_{3}}(u_4)).
\end{align*}
Figure~\ref{fig:4dimHAC} presents the four-dimensional fully and
partially nested Archimedean copula.
<>=
Obj1 = hac.full(type = 1, y = c("u4", "u3", "u2", "u1"), theta = c(2, 3, 4))
par(mai = c(0, 0, 0, 0))
plot(Obj1, index = TRUE, l = 1.6)
@
<>=
Obj2 = hac(type = 1, tree = list(list("u4", "u3", 3), list("u1", "u2", 4), 2))
par(mai = c(0, 0, 0, 0))
plot(Obj2, index = TRUE, l = 1.6)
@
\begin{figure}[t!]
\begin{minipage}[t]{0.5\textwidth}
\centering
\includegraphics[width=2in, trim=0 25 0 0, clip]{HAC-fourfully.pdf}
\end{minipage}
\begin{minipage}[t]{0.5\textwidth}
\centering
\includegraphics[width=2in, trim=0 30 0 0, clip]{HAC-fourpartially.pdf}
\end{minipage}
\caption{Fully and partially nested Archimedean copulae of dimension $d=4$ with structures $s=(((12)3)4)$ on the left and $s=((43)(12))$ on the right.}
\label{fig:4dimHAC}
\end{figure}
HAC can adopt arbitrarily complex structures $s$. This makes it a very
flexible and simultaneously parsimonious distribution model. The
generators $\phi_{\theta_j}(\cdot)$ within a single nested Archimedean
copula can come either from a single generator family or from
different generator families. If the $\phi_{\theta_j}(\cdot)$'s belong
to the same family, then the required complete monotonicity of
$\phi_{\theta_{i+j}}^{-1}(\cdot)\circ\phi_{\theta_{j}}(\cdot)$ usually
imposes some constraints on the parameters
$\theta_1,\dots,\theta_{d-1}$. Theorem 4.4 of \citet{mcneil_2008}
provides sufficient conditions on the generator functions to guarantee
that $C(\cdot)$ is a copula. It holds that if
$\phi_{\theta_{j}}(\cdot)\in \mathfrak{L}$, for $j=1,\dots,d-1$, and
$\phi_{\theta_{j+1}}^{-1}(\cdot)\circ\phi_{\theta_{j}}(\cdot)$ have
completely monotone derivatives, then $C(\cdot)$ is a copula for
$d\geq2$. Weaker conditions are stated in \citet{rezapour_2015}. For the majority of generators feasible HAC require
decreasing parameters from the highest to the lowest hierarchical
level. However, in the case of different families within a single HAC,
the condition of complete monotonicity is not always fulfilled, see
\citet{hofert_11}. In our study, we consider HAC with generators from
the same family only. If we use the same single-parameter generator
function on each level, but with a different value of $\theta$, we may
specify the whole distribution with at most $d-1$ parameters. From
this point of view, the HAC approach can be seen as an alternative to
covariance driven models. Nevertheless, for HAC not only the
parameters are unknown, but also the structure has to be
determined. One possible procedure for estimating both the parameters
and the structure is to enumerate all possible structures and to
estimate at first the parameters only. Next, the optimal structure can
be determined by a suitable goodness-of-fit test. This approach is,
however, unrealistic in practice because the variety of different
structures is enormously large even in moderate
dimensions. \citet{okhrin_okhrin_schmid_2013a} suggest computationally
efficient procedures, which allow to estimate HAC recursively. The
\pkg{HAC} package provides methods for estimating the parameters
and structure in a user-friendly way.
\subsection[Estimation of HAC]{Estimation of HAC}\label{sec:EHAC}
The recursive procedure can be described as follows. At the
first iteration step we fit a bivariate copula to every couple of the
variables. The couple of variables with the strongest dependency is
selected. We denote the respective estimator of the parameter at the
first level by $\hat{\theta}_1$ and the set of indices of the
variables by $I_1$. The selected couple is joined together to define
the pseudo-variable $ Z_{I_1} \overset{\operatorname{def}}{=}
C\{(I_1);\hat{\theta}_1, \phi_1\}$. At the next step, we proceed in
the same way by considering the remaining variables and the new
pseudo-variable as the new set of variables. This procedure allows us
to determine the estimated structure of the copula. As the
restrictions on the parameters are always fulfilled due to shortening
the parameter space, the procedure leads to a feasible copula funciton
with $d-1$ parameters. Nevertheless, if the true copula is not
binary, the procedure might return a slightly misspecified
structure. Despite a difference in the structures, the difference in
the distribution functions is in general minor. To allow more
sophisticated structures, we aggregate the variables of the estimated
copula afterwards. This is possible if the absolute value of the
difference of two successive nodes is smaller than a fixed small
threshold, i.e., $\theta_{1} - \theta_{2} < \epsilon $, with $
\theta_{1} > \theta_{2} $, as suggested by
\citet{okhrin_okhrin_schmid_2013a}.
For better understanding, let us consider a three-dimensional example
with $U_j$, $j=1,2,3$, being uniformly distributed on $[0,1]$. All
possible pairs $C_{(12)}(u_1,u_2,\hat{\theta}_{(12)})$,
$C_{(13)}(u_1,u_3,\hat{\theta}_{(13)})$ and
$C_{(23)}(u_2,u_3,\hat{\theta}_{(23)})$ are estimated by regular ML,
see \citet{franke_haerdle_hafner_2011}. To compare the strengths of
the fit one can use computationally complicated goodness-of-fit tests,
which do not necessarily lead to a function which will be a copula on
the final level of aggregation due to the restrictions on
$\pmb{\theta}$. For that reason we compare simply the estimates
$\hat\theta_{(12)}$, $\hat\theta_{(13)}$ and $\hat\theta_{(23)}$. This
is due to the fact that for most Archimedean copulae, the larger the
parameter the stronger is the dependency (the larger the parameter the
larger is Kendall's correlation coefficient). Let the strongest
dependence be in the first pair
$\hat\theta_{1}\overset{\operatorname{def}}{=}
\hat\theta_{(12)}=\max\{\hat\theta_{(12)}, \hat\theta_{(13)},
\hat\theta_{(23)}\}$, then $I_1 = \{1,2\}$ and we introduce the
pseudo-variable $Z_1 \overset{\operatorname{def}}{=}
C_1(I_1;\hat\theta_1) = C_1(u_1,u_2;\hat\theta_{(12)})$. At the next
and final step for this example we join together $u_3$ and $Z_1$. The
theoretical validation is also reported by Proposition 1 of
\citet{okhrin_okhrin_schmid_2013b} stating that HAC can be uniquely
recovered from the marginal distribution functions and all bivariate
copula functions. Crucially for the superior recursive ML estimation
procedure, pseudo-variables are regarded as functions of
underlying random variables $U_1,\ldots,U_d$ and are not explicitly
computed.
In practice, the marginal distributions $F_j$, $j=1,\ldots,d$, are
either parametrically $\widehat{F}_j(\cdot)
=F_j(\cdot,\hat{\pmb{\alpha}}_j)$, where $\pmb{\alpha}_j$ denotes the vector of
parameters of the $j$th margin, or non-parametrically
\begin{align}\label{eqn:nonp}
\widehat{F}\left( x \right) &= \left( n+1 \right)^{-1} \sum_{i=1}^{n}\mathbf{I} \left( X_i \leq x \right)
\end{align}
estimated in advance. Accordingly, the marginal densities
$\hat{f}_j(\cdot)$, $j,\ldots,d,$ are estimated by an appropriate
kernel density estimator or using a parametric density.
Following \citet{okhrin_okhrin_schmid_2013a}, the estimation of the
copula parameters at each step of the iteration can be sketched as
follows: at first stage, we estimate the parameter of the copula at
the first hierarchical level assuming that the marginal distributions
are known. At further stages, the next level copula parameter is
estimated assuming that the margins as well as the copula parameters
at lower levels are known. Let $\mathbf{X}=\{X_{ij}\}^{\top}$ be the
respective sample, for $i=1,\ldots,n$, $j=1,\ldots,d$, and
$\pmb{\theta}=(\theta_1,\ldots,\theta_{d-1})^\top$ be the parameters of
the copula starting with the lowest up to the highest level. The
recursive multi-stage ML estimator $\widehat{\pmb{\theta}}$ solves the
system
\begin{align}\label{eqn:MLp}
\left(\frac{\partial \mathcal{L}_1}{\partial \theta_1}\right.,
\left.\dots,\frac{\partial \mathcal{L}_{d-1}}{\partial
\theta_{d-1}}\right)^\top&=\mathbf{0}, \\
%
\intertext{where for $j=1,\ldots,d-1$}
%
\mathcal{L}_j&=\sum_{i=1}^n l_j(\mathbf{X}_i), \nonumber \\
%
\intertext{with for $i=1\ldots,n$}
%
l_{j}(\mathbf{X}_i) &= \log \, \left\{c_j\big[ \{\widehat F_m(X_{im})\}_{m\in s_j};\,s_j, \theta_j\big]\prod_{m\in s_j}\hat
f_m(X_{im})\right\}, \nonumber
\end{align}
where $s_j$ refers to the (pseudo)-variables considered at the
$j$th estimation stage. \citet{chen_fan_2006} and
\citet{okhrin_okhrin_schmid_2013a} provide asymptotic behaviour of the
estimates. At the moment, there are four different ways to estimate
HAC:
\begin{enumerate}[(i)]
\item Ordinary (full) ML estimation, also denoted by FML, which is
based on the complete log-likelihood and hence on a predetermined
structure.
\item The ML setup is based on realized pseudo-variables, e.g., the
pseudo-variable for the variables $U_k$ and $U_{\ell}$ are
computed as $Z_{k\ell}\overset{\operatorname{def}}{=}\phi\left[2\phi^{-1}\left\{\max(U_k, U_{\ell})\right\}\right]$, so that the bivariate density is maximized with respect to
the copula parameter at each step of the procedure. \citet{gorecki_hofert_holena_2014} show that $Z_{k\ell}$ is unformily so that classical ML estimation of Archimedean copula is used at each step of the procedure, where the parameter space is subsequently shortend. This method is very robust and hence, selected as default method.
\item More precise results can be obtained by the recursive ML (RML)
procedure discussed in \citet{okhrin_okhrin_schmid_2013a}. The
difference between the ML method and the recursive ML procedure
results from the maximized log-likelihood. While the bivariate
log-likelihood is considered at each estimation step of the ML
method, the log-likelihood of the recursive ML procedure corresponds
at each estimation step to the full log-likelihood for the marginal
HAC regarded at that step. Compared to the full ML approach, the
log-likelihood is only optimized with respect to the parameter at
the root node taken the estimated parameter(s) at lower hierarchical
levels as given, so that the final HAC being a copula is ensured by
shortening the feasible parameter interval from above. From this
point of view, the computational challenge is to build the
log-likelihood as for full ML estimation. The $d$-dimensional density is, however, difficult to compute in a high-dimensional setting, see Section \ref{sec:cdf}.
\item The penalized ML (PML) estimation procedure aims at determining a data driven sequence $\epsilon_n$ in order to fuse subsequent nodes, if $\hat\theta_{\ell}-\hat\theta_{k(\ell)}<\epsilon_n$, where the estimate of parameter at the higher hierarchical level is denoted by $\hat\theta_{k(\ell)}$ and that at the lower level by $\hat\theta_{\ell}$. This method merges advantages of the ML and RML procedure. The proposed method is computationally intense, as the data driven thresholding parameter $\epsilon_n$ depends on tuning parameters arising from a non-concave penalization of the log-likelihood function, c.f., \citet{fan_li_2001}. These tuning parameters have to be determined at each stage of the estimation procedure. We refer the interested reader to \citet{okhrin_ristig_sheen_trueck_2015} for details.
\end{enumerate}
\section[Applications of HAC]{Applications of \pkg{HAC}}\label{sec:AHAC}
Core of the \pkg{HAC} package is the function \code{estimate.copula}
estimating the parameter and determining the structure for given
data. Let us consider the dataset \code{finData} included in the
\pkg{HAC} package. It contains the residuals of the filtered daily
log-returns of four oil corporations: Chevron Corporation
(\code{CVX}), Exxon Mobil Corporation (\code{XOM}), Royal Dutch Shell
(\code{RDSA}) and Total (\code{FP}), covering $n = 283$ observations
from 2011-02-02 to 2012-03-19. Intertemporal dependence is removed by
usual ARMA-GARCH models, whose standardized residuals are plotted in
Figure~\ref{fig:scatter1} and used in the subsequent analysis:
<<>>=
library("HAC")
data("finData")
system.time(result <- estimate.copula(finData, margins = "edf"))
result
@
The returned object \code{result} is of class `\code{hac}', whose
properties are explored below. A practical illustration of the
mechanism of \code{estimate.copula} related to the previous real data
example is presented in Table~\ref{tab:est_proc}.
<>=
par(mai = c(0, 0, 0, 0))
pairs(finData, pch = 20, cex.axis = 1.5)
@
\begin{figure}[t!]
\centering
\includegraphics[width=0.7\textwidth]{HAC-Scatter1.pdf}
\vspace{-0.75cm}
\caption{Scatterplot of the sample \code{finData}.}
\label{fig:scatter1}
\end{figure}
At the lowest hierarchical level, the parameters of all
bivariate copulae are estimated. The couple
$(X_{\tiny{\verb/CVX/}},X_{\tiny{\verb/XOM/}})$ produces the strongest
dependency, hence the best fit. Then, the pseudo-variable
\begin{eqnarray}\label{eqn:Z}
Z_{(\tiny{\verb/CVX.XOM/})}&\overset{\operatorname{def}}{=}&
\phi_{\hat{\theta}_{(\tiny{\verb/CVX.XOM/}})}\left[\phi_{\hat{\theta}_{(\tiny{\verb/CVX.XOM/}})}^{-1}\left\{\widehat{F}_{\tiny{\verb/XOM/}}\left(X_{\tiny{\verb/XOM/}}\right)\right\}+\phi_{(\hat{\theta}_{\tiny{\verb/CVX.XOM/}})}^{-1}\left\{\widehat{F}_{\tiny{\verb/CVX/}}\left(X_{\tiny{\verb/CVX/}}\right)\right\}\right]
\end{eqnarray}
is defined, whose values are however not computed in practice, as the
recursive ML procedure (\code{method = 3}) is used by default. At the
next nesting level the parameters of all bivariate subsets are
estimated and the variables $X_{\tiny{\verb/FP/}}$ and
$X_{\tiny{\verb/RDSA/}}$ exhibit the best fit. Finally, the
realizations of the remaining random variables
$Z_{(\tiny{\verb/CVX.XOM/})}$ and $Z_{(\tiny{\verb/FP.RDSA/})}$ are
grouped at the highest level of the hierarchy, where
$Z_{(\tiny{\verb/FP.RDSA/})}$ is defined analogously to
$Z_{(\tiny{\verb/CVX.XOM/})}$.
\begin{sidewaystable}
\centering
\begin{tabular}{l|lll|lll}
\hline
&&&&&&\\
&
\multicolumn{3}{c}{$z_{i,
(\tiny{\verb"CVX.XOM"})}\overset{\operatorname{def}}{=}\widehat{C}\{\widehat{F}_{\tiny{\verb"CVX"}}(x_{i,
\tiny{\verb"CVX"}}),\widehat{F}_{\tiny{\verb"XOM"}}(x_{i, \tiny{\verb"XOM"}})\}$}
& \multicolumn{3}{|c}{$z_{i,
(\tiny{\verb"FP.RDSA"})}\overset{\operatorname{def}}{=}\widehat{C}\{\widehat{F}_{\tiny{\verb"FP"}}(x_{i,
\tiny{\verb"FP"}}),\widehat{F}_{\tiny{\verb"RDSA"}}(x_{i, \tiny{\verb"RDSA"}})\}$}
\\
\begin{tabular}{ccl}
\\
(\verb/CVX.FP/) & $\rightsquigarrow$ & $\hat{\theta}_{(\tiny{\verb/CVX.FP/})}$\\
(\verb/CVX.XOM/) & $\rightsquigarrow$ & $\hat{\theta}_{(\tiny{\verb/CVX.XOM/})}$\\
(\verb/FP.RDSA/) & $\rightsquigarrow$ & $\hat{\theta}_{(\tiny{\verb/FP.RDSA/})}$\\
(\verb/FP.XOM/) & $\rightsquigarrow$ & $\hat{\theta}_{(\tiny{\verb/FP.XOM/})}$\\
(\verb/RDSA.XOM/) & $\rightsquigarrow$ & $\hat{\theta}_{(\tiny{\verb/RDSA.XOM/})}$\\
\\
\end{tabular}
& \rotatebox[origin=c]{90}{best fit}
& \rotatebox[origin=c]{90}{(\code{CVX.XOM})}\;\;$\Rightarrow$ &
\begin{tabular}{ccl}
\\
\\
(\verb/CVX.XOM/)\verb/FP/ & $\rightsquigarrow$ & $\hat{\theta}_{\tiny{(\verb/CVX.XOM/)\verb/FP/}}$\\
(\verb/CVX.XOM/)\verb/RDSA/ & $\rightsquigarrow$ & $\hat{\theta}_{\tiny{(\verb/CVX.XOM/)\verb/RDSA/}}$\\
(\verb/FP.RDSA/) & $\rightsquigarrow$ & $\hat{\theta}_{\tiny{(\verb/FP.RDSA/)}}$\\
\\
\\
\end{tabular}
& \rotatebox[origin=c]{90}{best fit}
& \rotatebox[origin=c]{90}{(\code{FP.RDSA})}\;\;$\Rightarrow$
\begin{tabular}{c}
\\
((\code{CVX.XOM})(\code{FP.RDSA}))\\
$\rightsquigarrow \hat{\theta}_{((\tiny{\verb/CVX.XOM/})(\tiny{\verb/FP.RDSA/}))}$\\
\end{tabular}
\\
\hline
\end{tabular}
\caption{The estimation procedure in practice.}
\label{tab:est_proc}
\vspace*{2cm}
\begin{tabular}{llcc}
\hline
Generator & Parameter & $\phi\left(u; \theta \right)$&$\tau \left(\theta\right)$\\
\hline
AMH & $\theta\in[0,1)$ & $(1-\theta)/\{\exp(u)-\theta\}$ & $1-2/(3 \theta^2)\left\{\theta+(1-\theta)^2\log(1-\theta)\right\}$\\
Clayton & $\theta\in(0,\infty)$& $\left(u+1\right)^{-1/\theta}$ & $\theta/(\theta+2)$\\
Frank & $\theta\in(0,\infty)$& $-\log\left[1-\left\{1-\exp(-\theta)\right\}\exp(-u)\right]/\theta$ & $1+4/\theta\left\{D(\theta)-1\right\}$\\
Gumbel & $\theta\in[1,\infty)$& $\exp\left(-u^{1/ \theta}\right)$ &$1-1/\theta$\\
Joe & $\theta\in[1,\infty)$ & $1-\left\{1-\exp(-u)\right\}^{1/\theta}$ & $1-4\sum_{\ell=1}^{\infty}\left[\ell(\theta\ell+2)\left\{2+\theta(\ell-1)\right\}\right]^{-1}$\\
\hline
\end{tabular}
\caption{Generator functions and the relations between the copula
parameter and Kendall's $\tau(\cdot)$. The Debye function
$D(\cdot)$ involved in $\tau(\cdot)$ for the family Frank is given
by $D(\theta)=1/\theta\int_0^{\theta}u/\{\exp(u)-1\}du$.}
\label{tab:Basics}
\end{sidewaystable}
In general, \code{estimate.copula} includes the following arguments:
<<>>=
names(formals(estimate.copula))
@
The whole procedure is divided in three (optional) computational
blocks. First, the margins are specified. Secondly, the copula
parameter, $\pmb{\theta}$, is estimated and finally the HAC is checked
for aggregation possibilities. The \code{margins} of the $(n \times
d)$ data matrix, \code{X}, are assumed to follow the standard uniform
distribution by default, i.e., \code{margins = NULL}, but the function
also permits non-uniformly distributed data as input if the argument
\code{margins} is specified. The marginal distributions can be
determined non-parametrically, \verb/margins = "edf"/, or in a
parametric way, e.g., \verb/margins = "norm"/. Following the latter
approach, the log-likelihood of the marginal distributions is
optimized with respect to the first (and second) parameter(s) of the
density \code{dxxx}. Based on these estimates, the values of the
univariate margins are computed. If the argument is defined as scalar,
all margins are computed according to this specification. Otherwise,
different margins can be defined, e.g.,
\verb/margins = c("norm", "t", "edf")/ for a three-dimensional
sample. Except the uniform distribution, all continuous distributions
of the \pkg{stats} package \citep[see \code{?Distributions};][]{R} are
available: \verb/"beta"/, \verb/"cauchy"/, \verb/"chisq"/,
\verb/"exp"/, \verb/"f"/, \verb/"gamma"/, \verb/"lnorm"/,
\verb/"norm"/, \verb/"t"/ and \verb/"weibull"/. The values of
non-parametrically estimated distributions are computed according to
Equation~\ref{eqn:nonp}.
Inappropriate usage of this argument might lead to misspecified
margins, e.g., \linebreak \verb/margins = "exp"/ although the sample
contains negative values. Even though the margins might be assumed to
follow parametric distributions if \code{margins != NULL}, no joint
log-likelihood is maximized, but the margins are estimated in
advance. As the asymptotic theory works well for parametric and
nonparametric estimation of margins, for the univariate analysis we
refer to other built-in packages. In practice, the column names of
\code{X} should be specified, as the default names \code{X1, X2, ...}
are given otherwise.
A further optional argument of \code{estimate.copula} determines the
estimation \code{method}. As discussed above, we present three
procedures: ML (\code{method = 1}), which is based on the bivariate
density, full ML (\code{method = 2}) and recursive ML (\code{method =
3}) respectively. The routines of the \pkg{copula} package are
imported if a simple Archimedean copula is fitted to the data, see
\cite{yan_2007, kojadinovic_yan_2010, hofert_maechler_2011}.
At the final computational step of the procedure the binary HAC is
checked for aggregation possibilities, if \code{epsilon > 0}. The new
dependency parameter is computed according to the specification
\code{agg.method}, i.e., the \verb/"min"/, \verb/"max"/ or
\verb/"mean"/ of the original parameters. To emphasize this point,
recall the four-dimensional binary HAC
\begin{eqnarray}\label{eqn:agg_b}
C(u_1,\dots,u_4; (((12)3)4), \pmb{\theta}) &= \phi_{\theta_{3}}\left\{\phi^{-1}_{\theta_{3}}\circ C\{u_1,\dots,u_{3}; ((12)3), (\theta_1,\theta_{2})^\top\}+\phi^{-1}_{\theta_{3}}(u_4)\right\},
\end{eqnarray}
from Section~\ref{sec:PHAC}. If we assume additionally $ \theta_{1}
\approx \theta_{2}$, such that $ \theta_{1} - \theta_{2} < \varepsilon
$, the copula $C(\cdot)$ can be approximated by
\begin{eqnarray}\label{eqn:agg_a}
C^{*}(u_1,\dots,u_4; ((123)4), \pmb{\theta}) &= \phi_{\theta_3}\left\{\phi^{-1}_{\theta_3}\circ C\{u_1, \ldots, u_{3}; (123), \theta^{*}\}+\phi^{-1}_{\theta_3}(u_4)\right\},
\end{eqnarray}
where $\theta^{*} = (\theta_1+\theta_2)/2$ for instance. This is
referred to as the associativity property of Archimedean copulae, see
Theorem 4.1.5 of \citet{nelsen_2006}. If the variables of two nodes
are aggregated, the new copula is checked for aggregation
possibilities as well. Beside this threshold approach, the realized
estimates $\hat{\theta}_1$ and $\hat{\theta}_2$ can obviously be used
to test $H_0: \, \theta_1 - \theta_2 = 0 $, since the asymptotic
distribution is known. On the other hand, this approach is extremely
expensive computationally. The estimation results for the
non-aggregated and the aggregated cases are presented in the
following:
<>=
result.agg = estimate.copula(finData, method = 1, margins = "edf", epsilon = 0.3)
par(mai = c(0, 0, 0, 0))
plot(result.agg, circles = 0.3, index = TRUE, l = 1.7)
@
<>=
par(mai = c(0, 0, 0, 0))
plot(result, circles = 0.3, index = TRUE, l = 1.7)
@
<>=
result.agg = estimate.copula(sample, margins = "edf", epsilon = 0.3)
plot(result, circles = 0.3, index = TRUE, l = 1.7)
plot(result.agg, circles = 0.3, index = TRUE, l = 1.7)
@
\begin{figure}[t!]
\begin{minipage}[t]{0.5\textwidth}
\centering
\includegraphics[width=2in, trim=0 45 0 0, clip]{HAC-result.pdf}
\end{minipage}
\begin{minipage}[t]{0.5\textwidth}
\centering
\includegraphics[width=2in, trim=0 45 0 0, clip]{HAC-result-agg.pdf}
\end{minipage}
\caption{Plot of \code{result} on the left and \code{result.agg} on the right hand side.}
\label{fig:result}
\end{figure}
\subsection[The `hac' object]{The `\code{hac}' object}\label{sec:hac_object}
`\code{hac}' objects can be constructed by the general function
\code{hac}, with the same name as the object it creates, and its
simplified version \code{hac.full} for building fully nested HAC. For
instance, consider the construction of a four-dimensional fully nested
HAC with Gumbel generator, i.e.,
<<>>=
G.cop = hac.full(type = 1,
y = c("X4", "X3", "X2", "X1"),
theta = c(1.1, 1.8, 2.5))
G.cop
@
where \code{y} denotes the vector of variables of class
`\code{character}' and \code{theta} denotes the vector of dependency
parameters. The parameters should be in ascending order, so that the
first parameter, \code{1.1}, refers to the initial node of the
HAC and the last parameter, \code{2.5}, corresponds to the first
hierarchical level with variables \verb/"X1"/ and \verb/"X2"/. The
vector \code{y} has to contain one element more than the vector
\code{theta}.
The \proglang{S}3 \code{print} method for `\code{hac}' objects gives
an output structured in three lines: (i) the object's \code{Class},
(ii) the \code{Generator} family and (iii) the HAC structure $s$. The
structure can also be produced by the supplementary function
\code{tree2str}. Variables, grouped at the same node are separated by
a dot ``\code{.}'' and the dependency parameters are printed within
the curly parentheses.
Partially nested Archimedean copulae are constructed by \code{hac}
with the main argument \code{tree}. For a better understanding let us
first consider a four-dimensional simple Archimedean copula with
dependency parameter $\theta = 2$:
<<>>=
hac(type = 1, tree = list("X1", "X2", "X3", "X4", 2))
@
The copula \code{tree} is constructed by a \code{list} consisting of
four \code{character} objects, i.e., \linebreak
\verb/"X1", "X2", "X3", "X4"/, and a number, which denotes the
dependency parameter of the Archimedean copula. According to the
theoretical construction of HAC in Section \ref{sec:PHAC}, we can
induce structure by substituting margins through a subcopula. The four
variables \verb/"X1"/, \verb/"X2"/, \verb/"X3"/, \verb/"X4"/ can, for
example, be structured by
<<>>=
hac(type = 1, tree = list(list("X1", "X2", 2.5), "X3", "X4", 1.5))
@
where the nested component, \verb/list("X1", "X2", 2.5)/, is the
subcopula at the lowest hierarchical level. Note that the nested
component is of the same general form \code{list(..., numeric(1))} as
the simple Archimedean copula, where \code{numeric(1)} denotes the
dependency parameter and ``\code{...}'' refers to arbitrary variables
and subcopulae, which may contain subcopulae as well, like shown
in the following:
<<>>=
HAC = hac(type = 1, tree = list(list("Y1", list("Z3", "Z4", 3), "Y2", 2.5),
list("Z1", "Z2", 2), list("X1", "X2", 2.4),
"X3", "X4", 1.5))
HAC
@
We cannot avoid the notation becoming more cumbersome for higher
dimensions, but the principle stays the same for arbitrary dimensions,
i.e., variables are substituted by lists of the general form
\code{list(..., numeric(1))}. The function \code{hac} provides a
further argument for specifying the \code{type} of the HAC.
\subsection[Graphics]{Graphics}\label{sec:graphics}
As the string representation of the structure becomes more unclear as
dimension increases, the package allows to produce graphics of
`\code{hac}' objects using the \proglang{S}3 \code{plot} method for
these objects. Figure~\ref{fig:HAC} illustrates for example the
dependence structure of the already defined object \code{HAC}.
<>=
par(mai = c(0, 0, 0, 0))
plot(HAC, cex = 0.8, circles = 0.35)
@
<>=
plot(HAC, cex = 0.8, circles = 0.35)
@
\begin{figure}
\centering
\includegraphics[width=0.7\textwidth]{HAC-HAC.pdf}
\vspace{-1cm}
\caption{Plot of the object \code{HAC}.}
\label{fig:HAC}
\end{figure}
The explanatory power of these plots can be enhanced by several of the
usual \code{plot} parameters:
<<>>=
names(formals(plot.hac))
@
The optional, boolean argument \code{theta} determines whether the
dependency parameter of the copula $\theta$ or Kendall's $\tau$ is
printed, whereby Kendall's $\tau$ cannot be easily interpreted in the
usual way for more than two dimensions. The supplementary function
\code{theta2tau} computes Kendall's rank correlation coefficient based
on the value of the dependency parameter, whereas \code{tau2theta}
corresponds to the inverse function, see Table~\ref{tab:Basics}. If
\code{index = TRUE}, strings illustrating the subcopulae of the nodes
are used as subscripts of the dependency parameters. If, additionally,
\code{numbering = TRUE}, the parameters are numbered, such that the
subscripts correspond to the estimation stages if the non-aggregated
output of \code{estimate.copula} is plotted. The radius of the
\code{circles}, the width \code{l} and the height \code{h} of the
rectangles and the specific colors of the lines and the text can be
adjusted. Further arguments ``\code{...}'' can, for example, be used
to modify the font size \code{cex} or to include a subtitle
\code{sub}.
\subsection[Random sampling]{Random sampling}\label{sec:RS}
To be in line with other \proglang{R} packages providing tools for
different univariate and multivariate distributions we provide: (i)
\code{dHAC} for computing the values of the copula density, (ii)
\code{pHAC} for the cumulative distribution function and (iii)
\code{rHAC} for simulations. Sampling methods are imported from the
\pkg{copula} package and rely on the algorithm suggested in
\citet{hofert_maechler_2011}, who summarize the sampling procedure as
follows:
\begin{algorithm}\label{Algo1}
Let $C(\cdot)$ be a nested Archimedean copula with root copula
$C_{0}(\cdot)$ generated by $\phi_{0}$. Let $U$ be a vector of the
same dimension as $C_0(\cdot)$.
\begin{enumerate}
\item Sample from inverse Laplace transform $\mathcal{LS}^{-1}$ of
$\phi_0(\cdot)$, i.e., $V_{0} \sim
F_{0}(\cdot)\overset{\operatorname{def}}{=}\mathcal{LS}^{-1}\left\{\phi_{0}(\cdot)\right\}$.
\item For all components $u$ of $C_{0}(\cdot)$ that are nested
Archimedean copulae do:
\begin{enumerate}
\item Set $C_{1}(\cdot)$ with generator $\phi_{1}(\cdot)$ to the
nested Archimedean copula $u$.
\item Sample $V_{01} \sim
F_{01}(\cdot)\overset{\operatorname{def}}{=}
\mathcal{LS}^{-1}\left\{\phi_{01} \left(\cdot; V_{0} \right)
\right\}$.
\item Set $ C_{0}(\cdot) \overset{\operatorname{def}}{=}
C_1(\cdot), \phi_{0}(\cdot) \overset{\operatorname{def}}{=}
\phi_{1}(\cdot)$, and $V_{0} \overset{\operatorname{def}}{=}
V_{01}$ and continue with 2.
\end{enumerate}
\item For all other components $u$ of $C_{0}(\cdot)$ do:
\begin{enumerate}
\item Sample $R \sim Exp(1)$.
\item Set the component of $U$ corresponding to $u$ to
$\phi_{0}\left(R/V_{0}\right)$.
\end{enumerate}
\item Return $U$.
\end{enumerate}
\end{algorithm}
The function \code{rHAC} requires only two arguments: (i) the sample
size \code{n} and (ii) an object of the class `\code{hac}' specifying
the characteristics of the underlying HAC, e.g.,
<>=
set.seed(1)
sim.data = rHAC(500, G.cop)
par(mai = c(0, 0, 0, 0))
pairs(sim.data, pch = 20, cex.axis = 1.75)
@
<>=
sim.data = rHAC(500, G.cop)
pairs(sim.data, pch = 20)
@
In particular, the contributions of \citet{mcneil_2008},
\citet{hofert_2008} and \citet{hofert_11} provide the theoretical
foundations to sample computationally efficient random vectors from
HAC. Algorithm~\ref{Algo1} exploits the recursively determined
structure of HAC and samples from $F_{0}$ and $F_{01}$, which are
comprehensively discussed in \citet{hofert_11} and
\citet{hofert_maechler_2011}.
\begin{figure}[t!]
\centering
\includegraphics[width=4in, trim=0 20 0 20, clip]{HAC-Scatter2.pdf}
\caption{Scatterplot of the sample \code{sim.data}, which is
simulated from \code{G.cop} associated with a four dimensional
HAC-based Gumbel copula.}
\label{fig:scatter2}
\end{figure}
\subsection[The CDF and density]{The CDF and density}\label{sec:cdf}
The arguments for \code{pHAC} are a `\code{hac}' object and a sample
\code{X}, whose column names should be identical to the variables'
names of the `\code{hac}' object, e.g.,
<<>>=
probs = pHAC(X = sim.data, hac = G.cop)
@
As the copula density is defined as $d$th derivative of the copula
$C(\cdot)$ with respect to the arguments $u_j$, $j=1,\ldots,d$,
c.f.~\citet{savu_trede_2010}, the explicit form of the density varies
with the structure of the underlying HAC. Hence, including the
explicit form of all possible $d$-dimensional copula densities is
absolutely unrealistic. Our function \code{dHAC} derives an analytical
expression of the density for a given `\code{hac}' object, which can be
instantaneously evaluated if \code{eval = TRUE}. The analytical
expression of the density is found by subsequently using the \code{D}
function to differentiate the algebraic form of the copula
``symbolically'' with respect to the variables of the inserted
`\code{hac}' object. Although the derivation and evaluation of the
density is computationally and numerically demanding, \code{dHAC}
provides a flexible way to work with HAC densities in practice,
because they do not need to be manually derived or numerically
approximated. Since the densities of the two-dimensional Archimedean
copulae are frequently called during the pseudo multi-stage estimation
procedure (\code{1}), their closed form expressions are given
explicitly.
\subsection[Empirical copula]{Empirical copula}\label{sec:empirical}
As long as our package does not cover goodness-of-fit tests, which are
difficult to implement in general and involve computational intensive
techniques via bootstrapping, see
\citet{genest_remillard_beaudoin_2009}, it might be difficult to
justify the choice of a parametric assumption. However, the values of
\code{probs} can be compared to those of the empirical copula, i.e.,
\begin{eqnarray}\label{nparcop}
\widehat{C} \left(u_{1}, \dots, u_{d} \right) &=& n^{-1} \sum_{i=1}^{n} \prod_{j=1}^{d} \mathbf{I}\left\{\widehat{F}_{j} \left( X_{ij} \right) \leq u_{j} \right\},
\end{eqnarray}
where $\widehat{F}_{j}(\cdot)$ denotes the estimated marginal
distribution function of variable $X_{j}$. Figure~\ref{fig:pp}
suggests a proper fit of the empirical copula computed by
<>=
probs.emp = emp.copula.self(sim.data, proc = "M")
plot(probs, probs.emp, pch = 20, xlab = "True Probabilities", ylab = "Empirical Probabilites", asp = 1, cex.lab = 1.125, cex.axis = 1.125)
grid(lwd = 1.5)
box(lwd = 1)
points(probs, probs.emp, pch = 20)
lines(c(0,1), c(0,1), col = "darkred", lwd = 2)
@
\begin{figure}[t!]
\centering
\includegraphics[width=0.6\textwidth, trim=0 15 0 45, clip]{HAC-pp.pdf}
\caption{The values of \code{probs} on the $x$-axis against the
values of \code{probs.emp}.}
\label{fig:pp}
\end{figure}
<>=
probs.emp = emp.copula.self(sim.data, proc = "M")
@
There are two functions which can be used for computing the empirical
copula:
<>=
emp.copula(u, x, proc = "M", sort = "none", margins = NULL,
na.rm = FALSE, ...)
emp.copula.self(x, proc = "M", sort = "none", margins = NULL,
na.rm = FALSE, ...)
@
The difference between the arguments of these functions is that
\code{emp.copula} requires a matrix \code{u}, at which the estimated
function is evaluated. This can, in particular, be helpful, when the
sample is decomposed to evaluate the out-of-sample performance as the
empirical copula can be regarded as natural benchmark. In contrast,
\code{emp.copula.self} evaluates $\widehat{C}(\cdot)$ at the sample
\code{x} used for the estimation and thus, the returned values can be
considered as in-sample fit. The argument \code{proc} enables the user
to choose between two computational methods. We recommend to use the
default method, \verb/proc = "M"/, which is based on matrix
manipulations, because its computational time is just a small fraction
of the time taken by method \verb/"A"/, which is based on
\code{apply}, see Figure~\ref{fig:speed}. However, method \verb/"M"/
is sensitive with respect to the size of the working memory and
therefore inapplicable for very large datasets. Note that standard
applications, e.g., measuring the VaR of a portfolio, are based on
$250$ or $500$ observations. Figure~\ref{fig:speed} illustrates
rapidly increasing computational times of the matrix-based
method for more than $5000$ observations until the method
collapses. In contrast, the runtimes of the alternative method
\verb/proc = "A"/ are more robust against an increasing sample
size. The computational times are less sensitive with respect to the
dimension and we recommend using the default method up to $d=100$ for
non-large sample sizes. Another possibility to deal with large
datasets is specifying the matrix \code{u} manually in order to reduce
the number of vectors which are to be evaluated.
\begin{lrbox}{\verbboxone} \verb|proc = "M"| \end{lrbox}
\begin{lrbox}{\verbboxtwo} \verb|proc = "A"| \end{lrbox}
\begin{figure}[t!]
\centering
\includegraphics[width=0.7\textwidth, trim=0 15 0 50, clip]{HAC-speed2.pdf}
\caption{The plot shows the computational times for an increasing
sample size but a fixed dimension $d=5$ on a log-log scale. The
solid line refers to \usebox{\verbboxone} and the dashed line
to \usebox{\verbboxtwo}.}
\label{fig:speed}
\end{figure}
\section[Simulation study]{Simulation study}\label{sec:sim}
While the first part of this section deals with the performance of the ML and RML procedure, the second part is about the more sophisticated PML procedure. Summary statistics can only be computed if the correct structure has been detected. Hence, we sample $m$ estimators as long as $n=1000$ structures are correctly identified.
\subsubsection[Quasi and recursive ML]{Quasi and recursive ML}
To illustrate the accuracy of the proposed methods, we generate random
data from six copula models of different dimension
$C_{i}^{j}\overset{\operatorname{def}}{=}C^{j}\left(\cdot;s_{i},\pmb{\theta}_i\right)$,
for $i=1,2,3$ and $j=\text{C},\text{G}$, and show that the estimates
almost coincide with the true model specification. Here, $j$ denotes
the copula family (Clayton or Gumbel) and the structures are given by
$((U_1,U_2)_{\tau^{-1}(2/3)},U_3)_{\tau^{-1}(1/3)}$, $((((U_1,U_2)_{\tau^{-1}(7/9)},U_3)_{\tau^{-1}(5/9)},U_4)_{\tau^{-1}(3/9)},$ $U_5)_{\tau^{-1}(1/9)}$ and $((U_1,U_2)_{\tau^{-1}(6/9)},(U_3,U_4)_{\tau^{-1}(3/9)},U_5)_{\tau^{-1}(1/9)}$. An overview of functions $\tau(\cdot)$ is tabulated in Table~\ref{tab:Basics}. The values
of $\pmb{\theta}_i$ in terms of are presented in Tables~\ref{tab:simulation1} and
\ref{tab:simulation3}. They are chosen to obtain a similar strength of
dependence by the Clayton and Gumbel based models.
Summary statistics are presented Tables~\ref{tab:simulation1} and
\ref{tab:simulation3}. As \code{estimate.copula} approximates the true structure,
we set \code{epsilon = 0.15} for $C_3^{\text{G}}$ and \code{epsilon = 0.20} for $C_3^{\text{C}}$, which are not based on a binary
structure and employ the RML procedure. Note that the
RML procedure attempts at aggregating the copula tree after
each estimation step. The simulated samples for the copula estimation
consist of $250$ observations for the copula types in order to
illustrate the finite sample properties of the
procedures. Tables~\ref{tab:simulation1} and \ref{tab:simulation3}
indicate, that the estimation procedure works properly for the
suggested models, as the estimates are on average consistent with the
true parameters. Nevertheless, a few points deserve being mentioned:
(i) The ML procedure detects the true structure for the
binary HAC in $n/m=100\%$ and the recursive ML procedure for the
non-binary HAC in at least $n/m=99\%$ of the cases as long as the
parameters exhibit the imposed distance and the permutation symmetry
of the variables at the same node is taken into consideration. (ii)
The estimates at lower hierarchical levels show a higher volatility
than the estimates close to the initial node and the estimates for the
Clayton models are more volatile than the estimates of the Gumbel
based HAC. (iii) All estimated models indicate more imprecise
estimates for higher nesting levels, but the gains from full ML
estimation regarding the precision are only observable for the
estimates at the root node of $C_3^{\text{G}}$, see
Table~\ref{tab:simulation3}. However, this minor improvement is costly
since the results are based on a preestimated structure. (iv) These
observations justify choosing different values of \code{epsilon} for
$C_3^{\text{G}}$ and $C_3^{\text{C}}$, as the tuning parameter should
reflect the variability of the parameters. A different \code{epsilon} for each aggregation of the structure is determined in the penalized ML method taking the parameter variability into account. If the
parameters are closer and/or the value of \code{epsilon} is chosen
smaller, the amount of correctly classified structures declines. On
the other hand, larger sample sizes permit smaller values of
\code{epsilon} as the parameters are more precisely estimated.
\begin{table}[t!]
\centering
\begin{tabular}{cc|ccccc}
\hline
& & \multicolumn{5}{c}{Statistics}\\
Model & $\pmb{\theta}$ & $\operatorname{min}$ & $\operatorname{median}$ & $\operatorname{mean}$ &
$\operatorname{max}$ &
$\operatorname{sd}$\\
\hline
\multirow{2}{*}{$C_1^{\text{G}}$} & $\theta_2=1.500$ & 1.29 & 1.50 & 1.50 & 1.79 & 0.07\\
& $\theta_1=3.000$ & 2.55 & 3.00 & 3.00 & 3.60 & 0.16\\
\hline
\multirow{2}{*}{$C_1^{\text{C}}$} & $\theta_2=1.000$ & 0.62 & 1.00 & 1.00 & 1.55 & 0.12\\
& $\theta_1=4.000$ & 3.26 & 4.01 & 4.01 & 5.00 & 0.27\\
\hline
\multirow{4}{*}{$C_2^{\text{G}}$} & $\theta_4=1.125$ & 1.00 & 1.13 & 1.13 & 1.29 & 0.05\\
& $\theta_3=1.500$ & 1.25 & 1.50 & 1.50 & 1.86 & 0.08\\
& $\theta_2=2.250$ & 1.95 & 2.25 & 2.25 & 2.62 & 0.12\\
& $\theta_1=4.500$ & 3.79 & 4.51 & 4.52 & 5.51 & 0.24\\
\hline
\multirow{4}{*}{$C_2^{\text{C}}$} & $\theta_4=0.250$ & 0.02 & 0.26 & 0.26 & 0.57 & 0.09\\
& $\theta_3=1.000$ & 0.62 & 1.00 & 1.00 & 1.50 & 0.12\\
& $\theta_2=2.500$ & 1.95 & 2.52 & 2.52 & 3.19 & 0.20\\
& $\theta_1=7.000$ & 5.95 & 7.00 & 7.02 & 8.40 & 0.43\\
\hline
\end{tabular}
\caption{The models for the Gumbel family $C_1^{\text{G}}$, $C_2^{\text{G}}$ and for the Clayton family $C_1^{\text{C}}$, $C_2^{\text{C}}$, where $\pmb{\theta}$ denotes the true copula parameters.}
\label{tab:simulation1}
\end{table}
\begin{table}[t!]
\centering
\begin{tabular}{cc|rccccc}
\hline
& & \multicolumn{6}{c}{Statistics for recursive ML}\\
Model & $\pmb{\theta}$ & $\bar{s}\qquad\qquad$ & $\operatorname{min}$ & $\operatorname{median}$ &
$\operatorname{mean}$ &
$\operatorname{max}$ & $\operatorname{sd}$\\
\hline
\multirow{3}{*}{$C_3^{\text{G}}$} & $\theta_3=1.125$ & \multirow{2}{*}{$(((34)5)(12))=0.50\%$} & 1.01 & 1.10 & 1.11 & 1.22 & 0.03\\
& $\theta_2=1.500$ & \multirow{2}{*}{$((534)(12))=0.10\%$} & 1.28 & 1.50 & 1.50 & 1.87 & 0.07\\
& $\theta_1=3.000$ & & 2.58 & 2.99 & 3.00 & 3.64 & 0.16\\
\hline
\multirow{3}{*}{$C_3^{\text{C}}$} & $\theta_3=0.250$ & \multirow{2}{*}{$(((34)5)(12))=0.40\%$} & 0.09 & 0.25 & 0.25 &
0.46 & 0.06\\
& $\theta_2=1.000$ & \multirow{2}{*}{$(((12)(34))5)=0.30\%$} & 0.62 & 1.00 & 1.01 & 1.42 & 0.13\\
& $\theta_1=4.000$ & & 3.08 & 4.00 & 4.01 & 5.05 & 0.29\\
\hline
& & \multicolumn{6}{c}{Statistics for full ML}\\
\hline
\multirow{3}{*}{$C_3^{\text{G}}$} & $\theta_3=1.125$ & \multirow{3}{*}{$-\qquad\qquad$} & 1.05 & 1.13 & 1.13 & 1.23 &
0.03\\
& $\theta_2=1.500$ & & 1.29 & 1.50 & 1.50 & 1.86 & 0.07\\
& $\theta_1=3.000$ & & 2.58 & 2.99 & 3.00 & 3.65 & 0.16\\
\hline
\multirow{3}{*}{$C_3^{\text{C}}$} & $\theta_3=0.250$ & \multirow{3}{*}{$-\qquad\qquad$} & 0.13 & 0.25 & 0.25 & 0.42 &
0.05\\
& $\theta_2=1.000$ & & 0.60 & 1.00 & 1.01 & 1.42 & 0.13\\
& $\theta_1=4.000$ & & 3.10 & 4.00 & 4.01 & 5.05 & 0.29\\
\hline
\end{tabular}
\caption{The model for the Gumbel family $C_3^{\text{G}}$ and for the Clayton family $C_3^{\text{C}}$, where $\pmb{\theta}$ denotes the true copula parameters and the column $\bar{s}$ refers to the percentage of incorrectly classified structures based on $n=1000$ replications.}
\label{tab:simulation3}
\end{table}
\subsubsection[Penalized ML]{Penalized ML}
In order to compare the results of the simulation study among Archimedean families (Clayton, Frank, Gumbel, Joe), let $\tau:\Theta\rightarrow[0,1]$ transform the parameter $\theta\in\Theta$ into Kendall's correlation coefficient. We illustrate the performance of the estimation procedure for two types of models:
(i) $5$-dimensional HAC with $((U_3,U_4,U_5)_{\theta_{1}},U_1,U_2)_{\theta_{2}}$, where $\theta_{1}=\tau^{-1}(0.7)$ and $\theta_{2}=\tau^{-1}(0.3)$ refer to the group specific dependence parameter of the random vectors $(U_3,U_4,U_5)^{\top}$ and $(\widehat U_{1},U_1,U_2)$ respectively, where $\widehat U_{1}=\phi_{\theta_{1}}[3\phi^{-1}_{\theta_{1}}\{\max(U_3,U_4,U_5)\}]$. The parameters in terms of Kendall's $\tau(\cdot)$ are chosen as in \citet[Section 8.1]{segers_uyttendaele_2014} for 4-variate HAC. The sample size is $250$. Table~\ref{tab:sim_d5} summarizes the results of the $5$-dimensional setup. The true structure is found in more than $82\%$ of the cases and the parameters are unbiasedly estimated with a small empirical standard deviation. If the true structure is not identified, HAC are constructed from $3$ parameters in most of the cases as shown in the last column of Table~\ref{tab:sim_d5}. Note that a correct classification of the structure requires several aggregation steps, which enlarges the room for mistakes. We would like to emphasize that the results are sensitive with respect to the selection of the tuning parameters of the penalty function. In practice, these parameters are computed by a global stochastic optimization algorithm namely simulated annealing. In our experiments, the longer the simulated annealing algorithm iterates the more precise are the estimation results with respect to a correct specification of the structure. However, more iterations make the entire procedure more computationally intensive. Furthermore, the PML procedure depends on the diagonal element of an estimate of the Fisher information matrix. In our experiments, the estimtors are not sensitive with respect to the estimator of the information matrix, i.e., the outer score product based estimator performs as good as the Hessian based estimator.
\begin{table}[t!]
\begin{center}
\begin{tabular}{l|ccc|c}
\hline\hline
Family&$s(\hat\theta)=s(\theta_0)$&$\tau(\hat\theta_1)\;(\text{sd})$&$\tau(\hat\theta_2)\;(\text{sd})$&$\#\{\hat\theta\}$\\
\hline
Clayton&0.82&0.70 (0.01)&0.30 (0.02)&3.04\\
Frank&0.85&0.70 (0.01) & 0.30 (0.02)&3.03\\
Gumbel&0.85&0.70 (0.01) & 0.30 (0.02)&3.02\\
Joe&0.88&0.70 (0.01) & 0.30 (0.02)&3.04\\
\hline\hline
\end{tabular}
\caption{$s(\hat\theta)=s(\theta_0)$ reports the fraction of correctly specified structures, $\tau(\hat\theta_k)\;(\text{sd})$, $k=1,2$, refers to the sample average of Kendall's $\tau(\cdot)$ evaluated at the estimates and sd to the sample standard deviation thereof. If the structure is misspecified, $\#\{\hat\theta\}$ gives the number of parameters on average included in the misspecified HAC. Monte Carlo sample size is $250$.}
\label{tab:sim_d5}
\end{center}
\end{table}
(ii) As $\tau(\cdot)$ is a non-linear transformation, we investigate differences in the aggregation performance for different strength of dependence. In detail, we consider $3$-dimensional HAC $((U_2,U_3)_{\theta_{\ell}},U_1)_{\theta_{k(\ell)}}$, where $\theta_{\ell}$ refers to the dependence between $(U_2,U_3)^{\top}$ and $\theta_{k(\ell)}$ to the dependence of between $(U_{\ell}, U_1)^{\top}$, with $U_{\ell}=\phi_{\theta_{\ell}}[2\phi^{-1}_{\theta_{\ell}}\{\max(U_2,U_3)\}]$. The parameters are chosen such that $\theta_{\ell}=\tau^{-1}(\omega_{\ell})$, with $\omega_{\ell}\in\{0.9,0.7,0.5,0.3,0.1\}$, and $\theta_{k(\ell)}=\tau^{-1}(\omega_{k(\ell)})$, with $k(\ell)=\ell,\ldots,5$. The sample size is $100$. The major findings of the $3$-dimensional setting are presented in Table~\ref{tab:sim_d3}. In contrast to the previous simulation study, there is only one possible error source to obtain a misspecified structure. The penalized estimator is overall unbiased and the correct structure is detected in most of the cases, especially if the distance between $\tau(\theta_{k(\ell),0})$ and $\tau(\theta_{\ell,0})$ is large. Nevertheless, the low classification rate of the $3$-dimensional Archimedean Clayton copula $(U_1,U_2,U_3)_{\tau^{-1}(0.1)}$ should be mentioned as well as the bias of the estimator of the lower parameter of the Frank copula $((U_2,U_3)_{\tau^{-1}(0.9)},U_1)_{\tau^{-1}(0.3)}$.
\begin{table}[!t]
\hspace{-0.5cm}
\begin{tabular}{cc|ccc|ccc}
\hline\hline
&& \multicolumn{3}{c|}{Clayton} & \multicolumn{3}{c}{Frank}\\
$\tau(\theta_{k(\ell),0})$ & $\tau(\theta_{\ell,0})$ & $s(\hat\theta)=s(\theta_0)$ & $\tau(\hat\theta_{k(\ell)})$ (sd) & $\tau(\hat\theta_{\ell})$ (sd)& $s(\hat\theta)=s(\theta_0)$ & $\tau(\hat\theta_{k(\ell)})$ (sd) & $\tau(\hat\theta_{\ell})$ (sd)\\
\hline
0.10 & 0.10 & 0.31 & 0.12 (0.03) & 0.12 (0.03) & 0.83 & 0.10 (0.09) & 0.10 (0.09)\\
0.10 & 0.30 & 0.93 & 0.10 (0.05) & 0.30 (0.05) & 0.77 & 0.09 (0.14) & 0.31 (0.05)\\
0.10 & 0.50 & 1.00 & 0.10 (0.05) & 0.50 (0.03) & 1.00 & 0.11 (0.14) & 0.50 (0.03)\\
0.10 & 0.70 & 1.00 & 0.10 (0.05) & 0.70 (0.02) & 1.00 & 0.10 (0.15) & 0.70 (0.01)\\
0.10 & 0.90 & 1.00 & 0.10 (0.05) & 0.90 (0.01) & 1.00 & 0.11 (0.13) & 0.91 (0.00)\\
\hline
0.30 & 0.30 & 0.88 & 0.30 (0.03) & 0.30 (0.03) & 0.88 & 0.30 (0.04) & 0.30 (0.04)\\
0.30 & 0.50 & 0.98 & 0.30 (0.04) & 0.50 (0.03) & 0.93 & 0.30 (0.05) & 0.50 (0.02)\\
0.30 & 0.70 & 1.00 & 0.30 (0.05) & 0.70 (0.02) & 1.00 & 0.30 (0.06) & 0.70 (0.01)\\
0.30 & 0.90 & 1.00 & 0.30 (0.05) & 0.90 (0.01) & 1.00 & 0.25 (0.06) & 0.92 (0.00)\\
\hline
0.50 & 0.50 & 0.89 & 0.05 (0.02) & 0.05 (0.02) & 0.88 & 0.50 (0.02) & 0.50 (0.02)\\
0.50 & 0.70 & 1.00 & 0.50 (0.03) & 0.70 (0.02) & 1.00 & 0.50 (0.03) & 0.70 (0.01)\\
0.50 & 0.90 & 1.00 & 0.50 (0.03) & 0.90 (0.01) & 1.00 & 0.47 (0.03) & 0.91 (0.00)\\
\hline
0.70 & 0.70 & 0.90 & 0.70 (0.02) & 0.70 (0.02) & 0.90 & 0.70 (0.01) & 0.70 (0.01)\\
0.70 & 0.90 & 1.00 & 0.70 (0.02) & 0.90 (0.01) & 1.00 & 0.69 (0.01) & 0.90 (0.00)\\
\hline
0.90 & 0.90 & 0.84 & 0.90 (0.01) & 0.90 (0.01) & 0.86 & 0.90 (0.00) & 0.90 (0.00)\\
\hline
&& \multicolumn{3}{c|}{Gumbel} & \multicolumn{3}{c}{Joe}\\
\hline
0.10 & 0.10 & 0.87 & 0.10 (0.01) & 0.10 (0.01) & 0.87 & 0.10 (0.02) & 0.10 (0.01)\\
0.10 & 0.30 & 0.85 & 0.09 (0.01) & 0.31 (0.02) & 0.91 & 0.09 (0.02) & 0.31 (0.02)\\
0.10 & 0.50 & 1.00 & 0.10 (0.02) & 0.50 (0.02) & 1.00 & 0.10 (0.02) & 0.50 (0.02)\\
0.10 & 0.70 & 1.00 & 0.10 (0.02) & 0.70 (0.02) & 1.00 & 0.10 (0.02) & 0.70 (0.02)\\
0.10 & 0.90 & 1.00 & 0.10 (0.02) & 0.90 (0.01) & 1.00 & 0.11 (0.02) & 0.90 (0.01)\\
\hline
0.30 & 0.30 & 0.90 & 0.30 (0.01) & 0.30 (0.01) & 0.89 & 0.30 (0.02) & 0.30 (0.02)\\
0.30 & 0.50 & 0.95 & 0.30 (0.02) & 0.50 (0.02) & 0.98 & 0.30 (0.03) & 0.50 (0.02)\\
0.30 & 0.70 & 1.00 & 0.30 (0.02) & 0.70 (0.02) & 1.00 & 0.30 (0.03) & 0.70 (0.02)\\
0.30 & 0.90 & 1.00 & 0.30 (0.02) & 0.90 (0.01) & 1.00 & 0.30 (0.03) & 0.90 (0.01)\\
\hline
0.50 & 0.50 & 0.90 & 0.50 (0.01) & 0.50 (0.02) & 0.90 & 0.50 (0.02) & 0.50 (0.02)\\
0.50 & 0.70 & 1.00 & 0.50 (0.02) & 0.70 (0.01) & 1.00 & 0.50 (0.02) & 0.70 (0.01)\\
0.50 & 0.90 & 1.00 & 0.50 (0.02) & 0.90 (0.01) & 1.00 & 0.50 (0.02) & 0.90 (0.01)\\
\hline
0.70 & 0.70 & 0.90 & 0.70 (0.01) & 0.70 (0.01) & 0.90 & 0.70 (0.01) & 0.70 (0.01)\\
0.70 & 0.90 & 1.00 & 0.70 (0.02) & 0.90 (0.01) & 1.00 & 0.70 (0.02) & 0.90 (0.01)\\
\hline
0.90 & 0.90 & 0.88 & 0.90 (0.01) & 0.90 (0.01) & 0.86 & 0.90 (0.01) & 0.90 (0.01)\\
\hline\hline
\end{tabular}
\caption{The columns $\tau(\theta_{k(\ell),0})$ and $\tau(\theta_{\ell,0})$ refer to parameter values in terms of Kendall's $\tau(\cdot)$ at the lower and higher hierarchical level, respectively. $s(\hat\theta)=s(\theta_0)$ reports the fraction of correctly specified structures, $\tau(\hat\theta_{k(\ell)})$ (sd) and $\tau(\hat\theta_{\ell})$ (sd) refer to the sample averages of Kendall's $\tau(\cdot)$ evaluated at the estimate of the higher and lower hierarchical level and sd to the sample standard deviation thereof. Monte Carlo sample size is $100$.}
\label{tab:sim_d3}
\end{table}
\section[Conclusion]{Conclusion}\label{sec:Con}
The \pkg{HAC} package focuses on the computationally efficient
estimation of hierarchical Ar\-chi\-me\-de\-an copulae, which are
based on grouping binary structures within a recursive multi-stage ML
procedure and on a predetermined structure respectively. In particular, the recursive procedures are computationally tractable, as the consecutive optimization of the
log-likelihood avoids the singular optimization of the
$d$-dimensional one with respect to several parameters. Since HAC
permit modeling large-dimensional random vectors, the package provides
a function for producing plots of the related `\code{hac}'
objects. According to the usual naming of distributions in
\proglang{R}, we provide \code{dHAC}, \code{pHAC} and \code{rHAC} to
compute the values of density- and distribution functions or to sample
from arbitrary HAC. Finally, the accuracy of the methods has been
shown in a small simulation study.
\section*{Acknowledgments}
The authors are grateful to the editors of the Journal of Statistical Software and two anonymous referees for several helpful comments and suggestions. The research was supported by the Deutsche Forschungsgemeinschaft through the CRC 649 ``Economic Risk'', Humboldt-Universit\"at zu Berlin and the International Research Training Group 1792.
\bibliography{ref}
\end{document}