UNIVERSITÉ DE NICE-SOPHIA ANTIPOLIS - UFR SCIENCES
École Doctorale de Sciences Fondamentales et Appliquées

THESIS
presented to obtain the title of
Doctor of SCIENCE of the Université de Nice - Sophia Antipolis
Specialty: Physics

by
Yannick MALEVERGNE

Extreme risks in finance: statistics, theory and portfolio management

Publicly defended on 20 December 2002 before a jury composed of:

Jean-Claude AUGROS    Professor, Université Lyon I       (President)
Jean-Paul LAURENT     Professor, Université Lyon I       (Thesis co-advisor)
Jean-François MUZY    CNRS research associate
Michael ROCKINGER     Professor, HEC Lausanne
Bertrand ROEHNER      Professor, Université Paris VII    (Referee)
Didier SORNETTE       CNRS research director             (Thesis advisor)

Prepared at the Institut de Science Financière et d'Assurances - Université Claude Bernard - Lyon I
Table of Contents

Foreword  11

Introduction  13

I  Study and modeling of the properties of financial asset returns  19

1  Stylized facts of stock-market returns  21
   1.1  Review of the stylized facts  22
        1.1.1  The distribution of returns  22
        1.1.2  Temporal dependence properties  23
        1.1.3  Other stylized facts  25
   1.2  On the difficulty of representing the distribution of returns  27
   1.3  Modeling the dependence properties of returns  28
   1.4  Conclusion  30

2  Phenomenological models of prices  31
   2.1  Multidimensional rational bubbles and fat tails  32
        2.1.1  Introduction  33
        2.1.2  Assumptions of the Blanchard and Watson model  34
        2.1.3  Generalization of rational bubbles to arbitrary dimensions  36
        2.1.4  Renewal theory for products of random matrices  37
        2.1.5  Consequences for rational bubbles  39
        Appendices  39
        References  41
   2.2  From rational bubbles to crashes  42
        2.2.1  Rational bubble models  43
        2.2.2  Rational bubbles for a single asset  46
        2.2.3  Generalization of rational bubbles to an arbitrary number of dimensions  49
        2.2.4  Crash model with a hazard rate  54
        2.2.5  Non-stationary growth model  56
        2.2.6  Conclusion  60
        References  62

3  Stretched-exponential versus regularly varying distributions  63
   3.1  Introduction  66
   3.2  Descriptive statistics  69
        3.2.1  The data  69
        3.2.2  Existence of temporal dependence  70
   3.3  Extreme properties of a long-memory process  70
        3.3.1  Theoretical results  71
        3.3.2  Examples of slow convergence to the extreme-value and generalized Pareto distributions  72
        3.3.3  Generation of a long-memory process with prescribed marginals  73
        3.3.4  Numerical simulations  75
        3.3.5  GEV and GPD parameter estimators for the real data  77
   3.4  Parametric estimation  78
        3.4.1  Definition of a three-parameter family of distributions  78
        3.4.2  Methodology  80
        3.4.3  Empirical results  81
        3.4.4  Summary  84
   3.5  Comparison of the descriptive power of the various families  84
        3.5.1  Comparison between the four parametric distributions and the overall distribution  85
        3.5.2  Direct comparison of the Pareto and stretched-exponential models  86
   3.6  Discussion and conclusions  87
   Appendices  90
   References  103

4  Volatility relaxation  135
   4.1  Introduction  137
   4.2  Long memory and the distinction between endogenous and exogenous shocks  139
   4.3  Discussion  142
   Appendices  143
   References  146

5  A behavioral approach to financial markets: the contribution of agent-based models  149
   5.1  Asset price and excess demand  150
   5.2  Opinion models versus market models  152
   5.3  Adaptive-agent models versus non-adaptive-agent models  154
   5.4  Consequences of imitation and antagonism phenomena...  155
   5.5  Conclusion  157

6  Imitative and antagonistic behaviors: hyperbolic bubbles, crashes and chaos  159
   6.1  Introduction  162
   6.2  The model  163
   6.3  Qualitative analysis of the dynamical properties  165
   6.4  Quantitative analysis of speculative bubbles in the chaotic regime in the symmetric case  167
   6.5  Statistical properties of returns in the symmetric case  169
   6.6  Asymmetric cases  170
   6.7  Finite-size effects  171
   6.8  Conclusion  174
   Appendix  175
   References  176

II  Study of the dependence properties between financial assets  179

7  Studying dependence with copulas  181
   7.1  Copulas  182
   7.2  Some families of copulas  183
        7.2.1  Gaussian and Student copulas  184
        7.2.2  Archimedean copulas  185
   7.3  Empirical tests  186
        7.3.1  Parametric tests  186
        7.3.2  Non-parametric tests  188
   7.4  Conclusion  189

8  Tests of the Gaussian copula  191
   8.1  Introduction  194
   8.2  Generalities on copulas  195
        8.2.1  Important definitions and results concerning copulas  195
        8.2.2  Dependence between two random variables  196
        8.2.3  The Gaussian copula  198
        8.2.4  The Student copula  199
   8.3  Testing the Gaussian copula hypothesis  200
        8.3.1  Test statistic  200
        8.3.2  Testing procedure  201
        8.3.3  Sensitivity of the method  203
   8.4  Empirical results  206
        8.4.1  Currencies  206
        8.4.2  Commodities: metals  209
        8.4.3  Stocks  209
   8.5  Conclusion  210
   References  212

9  Measuring the extreme dependence between two financial assets  237
   9.1  The various measures of extreme dependence  238
        9.1.1  Conditional correlation coefficients  243
        9.1.2  Conditional dependence measures  249
        9.1.3  Tail dependence coefficient  252
        9.1.4  Synthesis and discussion  257
        Appendices  263
        References  279
   9.2  Estimation of the tail dependence coefficient  299
        9.2.1  An intrinsic measure of extreme dependence  303
        9.2.2  Tail dependence coefficient for a factor model  305
        9.2.3  Empirical study  309
        9.2.4  Conclusion  315
        Appendices  319
        References  326
   9.3  Synthesis of the description of the dependence between financial assets  340

III  Measures of extreme risks and application to portfolio management  343

10  Measuring risk  345
    10.1  Utility theory  346
          10.1.1  Utility theory under certainty  346
          10.1.2  Decision theory under risk  347
          10.1.3  Decision theory under uncertainty  350
    10.2  Coherent risk measures  353
          10.2.1  Definition  353
          10.2.2  Some examples of coherent risk measures  354
          10.2.3  Representation of coherent risk measures  355
          10.2.4  Consequences for capital allocation  356
          10.2.5  Critique of coherent risk measures  357
    10.3  Fluctuation measures  359
    10.4  Conclusion  361

11  Optimal portfolios and market equilibrium  363
    11.1  The limits of the mean-variance approach  364
    11.2  Accounting for large risks  365
          11.2.1  Optimization under an economic-capital constraint  365
          11.2.2  Optimization under a constraint on fluctuations around the expected return  366
          11.2.3  Optimization under other constraints  366
    11.3  Market equilibrium  367
    11.4  Conclusion  367
    11.5  Appendix  368

12  Managing large and extreme risks  373
    12.1  Understanding and managing large and extreme risks  374
          12.1.1  Fat-tailed return distributions  376
          12.1.2  Intermittent temporal dependence as the origin of large losses  376
          12.1.3  Tail dependence and contagion  377
          12.1.4  The multidimensional nature of risks  378
          12.1.5  Small risks, large risks and return  378
          References  379
    12.2  Minimizing the impact of large co-movements  381
          12.2.1  Quantifying large co-movements  382
          12.2.2  Tail dependence generated by a factor model  383
          12.2.3  Practical implementation and consequences  384
          References  386

13  Portfolio management under economic-capital constraints  387
    Introduction  389
    13.1  Definitions and important concepts  392
          13.1.1  Modified Weibull distributions  392
          13.1.2  Tail equivalence for distribution functions  393
          13.1.3  The Gaussian copula  394
    13.2  Wealth distribution of a portfolio for different dependence structures  395
          13.2.1  Wealth distribution for independent assets  395
          13.2.2  Wealth distribution for comonotonic assets  396
          13.2.3  Wealth distribution under the Gaussian copula hypothesis  398
    13.3  Value-at-Risk  400
    13.4  Optimal portfolios  401
          13.4.1  Minimum-risk portfolios  401
          13.4.2  VaR-efficient portfolios  404
    13.5  Conclusion  405
    Appendices  406
    References  417

14  Multi-moment portfolio management and market equilibrium  423
    14.1  Introduction  425
    14.2  Measuring the large risks of a portfolio  428
          14.2.1  Why do higher-order moments quantify larger risks?  428
          14.2.2  Quantifying the fluctuations of an asset  429
          14.2.3  Examples  430
    14.3  The generalized efficient frontier  432
          14.3.1  The frontier in the absence of a risk-free asset  432
          14.3.2  The efficient frontier in the presence of a risk-free asset  433
          14.3.3  Two-fund separation theorem  433
          14.3.4  Influence of the risk-free interest rate  434
    14.4  Classification of assets and portfolios  434
          14.4.1  Risk-adjustment method  435
          14.4.2  Marginal risk of an asset within a portfolio  436
    14.5  A new model of market equilibrium  437
          14.5.1  Equilibrium in a homogeneous market  438
          14.5.2  Equilibrium in a heterogeneous market  438
    14.6  Estimating the joint probability distribution of the returns of several assets  439
          14.6.1  Brief exposition and justification of the method  440
          14.6.2  Transformation of a random variable into a Gaussian variable  441
          14.6.3  Determination of the joint distribution: maximum entropy and Gaussian copula  442
          14.6.4  Empirical test  443
    14.7  Choice of an exponential family to parameterize the marginal distributions  444
          14.7.1  Modified Weibull distributions  444
          14.7.2  Transformation of Weibull laws into Gaussians  445
          14.7.3  Empirical test and parameter estimation  445
    14.8  Cumulant expansion of the portfolio distribution  446
          14.8.1  Link between moments and cumulants  446
          14.8.2  Case of symmetric distributions  447
          14.8.3  Case of asymmetric distributions  449
          14.8.4  Empirical tests  449
    14.9  Can we have our cake and eat it too?  450
    14.10  Conclusion  451
    Appendices  453
    References  464

Conclusions and Perspectives  491

A  Assessment of the conduct of the thesis project  495
    A.1  Brief summary of the subject  495
    A.2  Elements of context  496
          A.2.1  Choice of subject  496
          A.2.2  Choice of supervision and host laboratory  496
          A.2.3  Project funding  497
    A.3  Evolution of the project  498
          A.3.1  Development of the project  498
          A.3.2  Conduct of the project  498
          A.3.3  Scientific outcomes  499
    A.4  Skills acquired and personal lessons  499
    A.5  Conclusion  500

Bibliography  503
Foreword

Writing a thesis is a long-haul, demanding, somewhat disconcerting and at times discouraging endeavor, especially when difficulties pile up and seem insurmountable. But in the end, how captivating, fascinating and exhilarating this experience is: the first real opportunity to put to use the sum of knowledge patiently accumulated over long years of study. It is when the time comes for the final write-up that one measures the full distance traveled. The hesitations and difficulties encountered along the road fade away, and only the successes and satisfactions come back to mind. The names of those who accompanied you along that road resurface as well, and it is to them that I wish, before anything else, to address these few words of thanks.

The work I carried out during these two years of doctoral study owes an enormous amount to Didier Sornette and Jean-Paul Laurent, who agreed to supervise it. I wish above all to express my gratitude for these years spent in their company, and to say what great pleasure and immense satisfaction I had working with them. They managed both to guide me judiciously and to leave me great latitude in the choice of the research topics I wished to pursue, for which I am infinitely grateful.

I also warmly thank Professors Gouriéroux and Roehner, who did me the honor of agreeing to serve as referees for this thesis. The questions and remarks they formulated, as well as the discussions we had, were of the greatest interest to me: they generated new ideas and thus allowed me to glimpse other research directions, complementary to those followed so far.

I also wish to express my gratitude to Professors Augros and Rockinger, as well as to Jean-François Muzy, with whom I moreover had the great pleasure of collaborating, for kindly devoting part of their time to judging my work and agreeing to sit on my thesis jury.

I would also like to thank the many people with whom I worked on one or another of the various research topics I tackled during my thesis. First of all, I wish to express all my esteem and gratitude to Vladilen Pisarenko for the invaluable help he gave me throughout these two years. I also thank Jorgen Andersen, with whom I spent the three months preceding the start of my thesis during my DEA internship, as well as Ali Chabaane and Françoise Turpin of the BNP-Paribas group, with whom I began a fruitful collaboration.

I would also like to thank Anne Sornette for her constant support and encouragement, and for prompting me to take part in the "Nouveau chapitre de la thèse" project, an experience that proved very interesting and enriching. Incidentally, I thank Nadjia Hohweiller, skills-assessment consultant at the Nice chamber of commerce and industry, who contributed to the writing of that chapter, and Emmanuel Tric of the Association Bernard Gregory, which supports this project.

I would also like to thank Bernard Gaypara and Yann Ageon for the invaluable help they gave me on the computing side: Bernard for resolving the (numerous) breakdowns of my computer and other printing problems, and Yann for his programming skills, which often came to my rescue.

Finally, I thank my family, my friends, my loved ones and all those who encouraged me throughout these two years. May they see in the completion of this work a token of my gratitude for the support and trust they have always shown me.

Nice, January 2003
Introduction

The proliferation of major risks is one of the hallmarks of modern societies, to the point that, borrowing the title of one of the most famous books of the German sociologist Ulrich Beck, we no longer hesitate to call them risk societies. Moreover, our very conception of risk has evolved: catastrophic events no longer seem to be perceived by our fellow citizens as imposed on our societies by an unjust and implacable fate, but as resulting essentially from our own technological development, whose complexity has reached such a degree that it "naturally" generates crisis situations, both in the present (Chernobyl, AZF...) and for the future (global warming and climate change, for example). This proliferation of sources of risk then imposes on public organizations, but also on private institutions such as insurance companies and banks, new obligations regarding the identification, control and management of these extreme risks, so as to make their consequences bearable for the community without endangering the organizations responsible for insuring them.

In the strictly financial sector, crashes are perhaps the most striking events in a whole category of extreme events, and financial activity must suffer their harmful effects ever more frequently. To fix ideas, consider that the crash of October 1987 wiped out more than a trillion dollars in a few days, or that the recent crash of the new economy led to a collapse of nearly one third of world stock-market capitalization relative to its 1999 level.
Now, if the role of money is to constitute a store of value, one must still be able to control its fluctuations, in particular so as not to see, in an instant, the savings of a lifetime swallowed up, the expansion plans of a firm ruined, or the economy of a nation destroyed. It is therefore absolutely necessary to develop tools and standards allowing a better grasp of extreme risks on financial markets. The world banking authorities, fully aware of this necessity, have issued instructions to this end through the recommendations of the Basel Committee (1996, 2001), proposing models for the internal management of risks and imposing minimum amounts of capital commensurate with risk exposures. However, some criticism has been raised against these recommendations (Szergö 1999; Danielsson, Embrechts, Goodhart, Keating, Muennich, Renault and Shin 2001), which were judged ill-suited and potentially even destabilizing for the markets. This controversy highlights the importance of a better understanding of extreme risks, of their consequences, and of the means of guarding against them.

This challenge has, in our view, two facets. On the one hand, it is indispensable to be able to quantify extreme risks better, which requires developing statistical tools that go beyond the Gaussian framework in which classical financial theory, inherited notably from Bachelier (1900), Markowitz (1959), and Black and Scholes (1973), is anchored. On the other hand, one must ask how the treatment of extreme risks can be integrated into portfolio management. Indeed, it is fundamental to know whether, as classical financial theory based on the mean-variance approach teaches us, extreme risks remain diversifiable. If they do not, one will have to consider means other than the construction of portfolios to hedge against this type of risk: resorting to derivatives, provided they are able to supply real insurance against large price moves (which was far from the case during the 1987 crash), or perhaps resorting to mutualization, as in insurance.

The problem of controlling extreme risks in finance, and more particularly its application to portfolio management, may seem wholly foreign to the domain of physics. This is only an appearance, however, since from its first steps finance has shared with physics, sometimes even preceding it, many methods and tools: the Brownian random walk (Bachelier 1900, Einstein 1905); the notion of subjective probability, which accounts for the behavior of economic agents under uncertainty (Savage 1954) but also allows the interpretation of quantum-mechanical experiments (Caves, Fuchs and Schack 2002); the notion of a mean-variance trade-off, in portfolio theory (Markowitz 1959) as in quantum algorithmics (Maurer 2001); the heat-diffusion equation, which also describes the evolution of an option price in the world of Black and Scholes (1973); and, more recently, models of interacting agents (variants of the Ising model, for instance), which make it possible to go beyond the standard framework of the representative economic agent and thus to perceive better the fundamental mechanisms at work in financial markets. It is thus in the methods, but also in the concepts, that a link must be sought between subjects as apparently different as finance and physics.

Indeed, the measurement of risk calls on notions from information theory, decision theory and dynamic control that have a long tradition in physics. As for portfolio management, somewhat schematically, one can assert that it is nothing but a particular instance of a constrained optimization problem, of the kind encountered in many fields. In this respect, from a purely mathematical standpoint, the search for optimal portfolios is hardly different from the search for the equilibrium states of a thermodynamic system. Indeed, in the first case, one seeks the minimum value of the risk associated with the fluctuations of the portfolio's wealth, for a fixed value of the mean return and a given amount of initial wealth. In the second case (for the study of the microcanonical ensemble), one seeks to maximize the entropy (or minimize the negentropy) of the system under the constraint of a fixed total energy and number of particles. One can therefore, formally, draw a parallel (admittedly somewhat hasty) between risk and negentropy, expected return and total energy, and initial wealth and number of particles. That said, the parallel unfortunately stops there, and one cannot hope that a simple and direct application of general results from statistical thermodynamics or quantum mechanics will yield interesting lessons for portfolio management.
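To make the constrained-optimization analogy concrete, the classical mean-variance instance can be written down in a few lines: minimize the portfolio variance w'Cw subject to a fixed expected return and full investment of the initial wealth, solved here in closed form via the Lagrangian stationarity condition. This is only an illustrative sketch of the textbook Markowitz problem, not a method developed in this thesis; the covariance matrix, expected returns and target return below are invented numbers.

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative values only).
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])   # covariance matrix C
mu = np.array([0.05, 0.08, 0.12])      # expected returns
target = 0.09                          # required mean return

# Stationarity of the Lagrangian gives  C w = a*mu + b*1 , hence
# w = C^{-1} (a*mu + b*1), with (a, b) fixed by the two constraints
# mu.w = target and 1.w = 1.
inv = np.linalg.inv(cov)
ones = np.ones_like(mu)
A = np.array([[mu @ inv @ mu,   mu @ inv @ ones],
              [ones @ inv @ mu, ones @ inv @ ones]])
a, b = np.linalg.solve(A, np.array([target, 1.0]))
w = inv @ (a * mu + b * ones)          # minimum-variance weights

print("weights:", w)
print("mean return:", w @ mu)
print("volatility:", np.sqrt(w @ cov @ w))
```

The same template (a risk functional minimized under wealth and return constraints) reappears later in the thesis with risk measures other than the variance, which is precisely where the analogy with entropy maximization under fixed energy becomes useful.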
The essential distinction between these two optimization problems — portfolio management and statistical physics — comes in fact from the difference in the orders of magnitude involved: the largest portfolios contain at most a few thousand assets, whereas the thermodynamic systems we usually deal with are made up of some hundred thousand billion billion (N_A ≈ 10^23) particles. For such a number of particles the central limit theorem applies fully, including for events of very low probability, and provides a most useful simplification. Moreover, given the gigantic number of entities involved, the typical relative fluctuations of the measured quantities around their mean (expected) values are of order 1/√N_A ~ 10^-11 – 10^-12, which makes them utterly imperceptible. Things are quite different in portfolio management, where the return distribution of the portfolio must be characterized rigorously. Indeed, because of the usually moderate number of assets making up a portfolio, and because of the underlying structure of the distributions of the individual assets, the return distribution of the portfolio turns out to be very far from Gaussian, contrary to what the central limit theorem might suggest and to what is most often observed in statistical physics. It is therefore necessary to estimate the portfolio's return distribution, which can be done directly for a given capital allocation, or more generally by tackling the problem of estimating the joint distribution of all the assets to be included in the portfolio. The first approach is certainly much simpler and faster, since it only involves estimating a univariate distribution, but it is clearly unsatisfactory, for it discards much of the information that can be observed and extracted from market data and that only the multivariate distribution of the assets can capture. One should keep in mind, however, that the two approaches meet, insofar as knowing the return distributions of all portfolios (for all possible capital allocations) is equivalent to knowing the multivariate distribution. In the end, the second method seems preferable, and appears to be the one that currently mobilizes the most effort, in both academic and private research. A frontal attack on the determination of the multivariate distribution of the assets is arduous and, in our opinion, much less instructive than studying separately the behavior of the marginal distribution of each asset on the one hand and the dependence structure between the assets on the other.
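The 1/√N scaling invoked above can be checked with a few lines of simulation. This is only a minimal illustration (the function and variable names are ours, not part of the thesis): summing n independent variables of mean 1 and variance 1, the spread of the sum relative to its mean shrinks like 1/√n.

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_fluctuation(n, draws=500):
    # Sum n i.i.d. exponential(1) variables (mean 1, variance 1) and
    # measure the typical spread of the sum relative to its mean.
    sums = rng.exponential(1.0, size=(draws, n)).sum(axis=1)
    return sums.std() / sums.mean()

for n in (100, 10_000):
    print(f"n={n:>6}  std/mean = {relative_fluctuation(n):.4f}"
          f"  vs 1/sqrt(n) = {n ** -0.5:.4f}")
```

Extrapolating the same law to n = N_A ≈ 10^23 gives the 10^-11 – 10^-12 relative fluctuations quoted in the text, while a portfolio of a few thousand assets sits many orders of magnitude away from that regime.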
That is why we favored this approach, with the main objective of accounting as precisely as possible for the various sources of risk: the risks associated individually with each asset on the one hand, and the risks collectively linked to the set of assets on the other. This first led us (Part I) to examine the marginal behavior of assets, in order to bring out the key points requiring deeper understanding so as to better identify the origin of the risks inherent in individual assets. From our perspective, focused on the impact of extreme risks, crashes — following speculative bubbles — are of particular interest, as are the few largest moves observed in the markets. Indeed, in accordance with V. Pareto's famous 80-20 law, a very small fraction of events (here, crashes and other large moves) accounts for most of the consequences (here, the long-term return of assets, for instance) resulting from the whole set of events. Moreover, given their seemingly ever more frequent recurrence, one can no longer afford to neglect these extreme events, leaving to chance or fate alone the task of deciding their effects, macroeconomic as well as on each of us. It is therefore desirable to understand better how and why these events occur. To that end, we conducted our study along three complementary directions. We first began with an empirical study leading to a statistical and static description of large risks. We then continued with a phenomenological study taking into account the dynamic character of the occurrence of extremes, whose purpose was to describe the observed phenomena in an ad hoc manner.
We finally concluded with a "microscopic" or micro-structural study, allowing us to address in a fundamental way the mechanisms at work in the markets and to relate the static and dynamic statistical properties of prices to individual behaviors and market organizations. From a statistical standpoint, it is necessary to estimate precisely the probability of occurrence of extreme events, reflected in the now well-known fact that asset return distributions have "fat tails". One must, however, be able to characterize these "fat tails" as accurately as possible, that is, without either neglecting or overestimating these extreme events. Moreover, one should bear in mind that any parametric description is subject to model error, so that several representations must be considered in order to properly assess the effects of such a source of error. This is why, throughout our study, we considered two classes of fat-tailed distributions: regularly varying distributions and stretched exponential distributions, whose relevance and consequences with respect to the under/overestimation of risks we discuss (Chapters 1, 2 and 3). We then turned to the way volatility returns to an "equilibrium" level after a long period of high price variability (Chapter 4). Volatility is indeed known to exhibit a very marked persistence phenomenon: after an abnormally high (or low) period, it relaxes toward a mean level, which can in a sense be regarded as an equilibrium level. The study of this relaxation is very important from a risk-management perspective: on the one hand it makes it possible to estimate the typical duration of an abnormally turbulent or calm period, and on the other hand it provides a relatively reliable means of predicting the future value of volatility, whose importance for the pricing of derivatives, in particular, is well known. Furthermore — and this is the key point of our development — it allowed us to propose a mechanism explaining the impact of the information flow on volatility dynamics. This opens the way to a systematic study of the effect of the arrival of a given piece of information on the markets, both ex post, for the analysis of the causes of large price moves, and ex ante, for scenario studies and the anticipation of the effects of large shocks. Finally, we wished to explore a little further the microscopic causes of the phenomena observed in the markets (Chapters 5 and 6).
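The relaxation of volatility toward a mean level can be caricatured with a textbook GARCH(1,1) recursion. This sketch is ours (arbitrary parameter values) and is not the relaxation mechanism studied in Sornette, Malevergne and Muzy (2002); it only illustrates the generic picture of geometric reversion to a long-run level.

```python
import numpy as np

# Textbook GARCH(1,1) conditional variance:
#   sigma2[t+1] = w + a * r[t]**2 + b * sigma2[t]
# Taking the shock term at its conditional mean (E[r^2] = sigma2),
# the expected variance relaxes geometrically, at rate a + b per step,
# toward the long-run level w / (1 - a - b).
w, a, b = 1e-6, 0.08, 0.90
long_run = w / (1 - a - b)

sigma2 = 25 * long_run          # start after an abnormally turbulent period
path = [sigma2]
for _ in range(200):
    sigma2 = w + (a + b) * sigma2
    path.append(sigma2)

print(path[0] / long_run, round(path[-1] / long_run, 3))
```

The half-life of the excess variance, ln(2)/|ln(a+b)|, gives the "typical duration of an abnormally turbulent period" in this toy setting.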
To that end, we built a relatively parsimonious model of interacting agents, aiming to account more realistically than current models do for the super-exponential growth of prices during speculative-bubble phases, which leads to substantial overvaluation of assets and eventually to strong corrections. The results obtained are interesting and have confirmed the importance of certain types of agent behavior that drive market run-ups and, in turn, periods of high volatility and extreme fluctuations. As they stand, however, these models are still too rudimentary to be integrated into the decision chain governing an institution's risk-management policy. Nevertheless, we believe that this kind of tool is particularly attractive and should eventually provide useful information allowing a better estimation/forecast of future risks. In particular, through scenario generation, these models should yield estimates of the probabilities associated with rare events in certain market phases — estimates far more reliable than the current subjective ones (Johnson, Lamper, Jefferies, Hart and Howison 2001). The second link in the chain of reconstruction of the multivariate distribution of asset returns is the study of their dependence structure, a problem we address in Part II. Indeed, risks stem not only from the marginal behavior of each asset but also from their collective behavior. The latter can be studied by means of mathematical objects called copulas, which fully capture the dependence between assets.
In fact, we will see in Chapters 7 and 8 that determining the copula is a delicate matter and that here again the risk of model error is large, especially where extreme risks are concerned. We will thus see that a specific study of the dependence between extremes proves necessary, which we carry out in Chapter 9 by developing a method for estimating the tail dependence coefficient, that is, the probability that an asset suffers a very large loss given that another asset, or the market as a whole, has also suffered a very large loss. This quantity is in our view absolutely crucial, for it quantifies in a very simple way whether or not extreme risks can be diversified away by aggregation within portfolios. Indeed, either the tail dependence coefficient is zero and extremes occur asymptotically independently, in which case one can contemplate diversifying them, or they remain asymptotically dependent and one can only hope to minimize the probability of occurrence of concomitant extreme moves. Synthesizing the results of these first two parts, we will then be able to address portfolio management proper, which is the subject of Part III. Before coming to the search for optimal portfolios, which is in fact only a mathematical problem, we will ask how one should choose an optimal portfolio, which will lead us to discuss, in Chapter 10, decision theory and various ways of quantifying the risks associated with the return distribution of a portfolio. The question is by no means trivial, for it amounts to asking how best to summarize in a single number the information contained in an entire (distribution) function, and hence in an infinity of numbers. Since this summary of the information can only be partial, in order to retain its most relevant part one must clearly identify one's objectives, and for that purpose understand the goals one seeks to achieve by defining what the notion of large or extreme risk means in the context of portfolio selection. In our view, portfolio management must meet two objectives. First, the allocation of assets within the portfolio must satisfy constraints on economic capital, that is, on the amount that must be invested in a risk-free asset to allow the manager to meet his obligations despite fluctuations in the market value of the portfolio — the paramount objective being, of course, to avoid ruin. Second, the portfolio must meet a set profitability target. One must therefore be able to quantify the portfolio's propensity to achieve the stated objective. More precisely, the profitability target is generally specified by the value of the mean (or expected) return to be reached, from which one wishes the realized returns not to stray too far.
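A naive empirical counterpart of the tail dependence coefficient introduced above simply counts joint exceedances of a high quantile. The following sketch is our own illustration — not the estimation method developed in Chapter 9 — and the data are a simulated one-factor model; the true coefficient is only the limit of this ratio as the quantile level tends to one.

```python
import numpy as np

rng = np.random.default_rng(1)

def empirical_tail_dependence(x, y, q=0.95):
    # P(Y > q-quantile of Y | X > q-quantile of X), estimated by counting.
    # The tail-dependence coefficient is the limit of this ratio as q -> 1.
    ux, uy = np.quantile(x, q), np.quantile(y, q)
    above_x = x > ux
    return (above_x & (y > uy)).sum() / above_x.sum()

n = 100_000
common = rng.standard_normal(n)          # shared "market" factor
x = common + rng.standard_normal(n)      # asset 1
y = common + rng.standard_normal(n)      # asset 2 (correlation 1/2 with x)
print(empirical_tail_dependence(x, y))   # well above the independence value 1 - q
```

For this Gaussian example the ratio decays toward zero as q → 1 (asymptotic independence), which is exactly why a finite-quantile count cannot substitute for the asymptotic analysis of Chapter 9.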
It is then necessary to be able to estimate the fluctuations the portfolio typically undergoes. The notion of risk therefore carries at least a twofold meaning, in that it concerns, on the one hand, economic capital and, on the other, the amplitude of the statistical fluctuations of the return around the target return. With this in hand, we will then see how to put these measures of large and extreme risks to use in building optimal portfolios that best withstand the large fluctuations of financial markets (Chapter 11); in particular, how to minimize the impact of the large co-movements between assets that cannot be fully diversified away (Chapter 12); then how to satisfy the economic-capital constraint through measures such as Value-at-Risk or Expected Shortfall (Chapter 13); and finally how to build portfolios whose realized returns deviate as little as possible, under the effect of large fluctuations, from the initially set profitability target (Chapter 14), which will also lead us to discuss the consequences, for market equilibrium, of choosing portfolios that minimize the impact of large and extreme risks. Finally, we will synthesize all the results obtained, discuss their consequences for the management of extreme risks, and conclude with some perspectives for future research.
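The generic program "minimize risk at fixed expected return and given initial wealth" reduces, in the classical mean-variance case, to a small linear-algebra computation. The sketch below is only that textbook baseline (with made-up numbers) — the thesis precisely argues that variance is an inadequate risk measure for extreme risks, so this is a point of departure, not the method developed in Part III.

```python
import numpy as np

def min_variance_weights(mu, cov, target):
    """Weights minimizing w' cov w subject to w' mu = target and sum(w) = 1
    (classical Markowitz solution via Lagrange multipliers; shorts allowed)."""
    ones = np.ones(len(mu))
    inv = np.linalg.inv(cov)
    A = ones @ inv @ ones
    B = ones @ inv @ mu
    C = mu @ inv @ mu
    D = A * C - B * B
    lam = (C - B * target) / D      # multiplier of the budget constraint
    gam = (A * target - B) / D      # multiplier of the return constraint
    return inv @ (lam * ones + gam * mu)

mu = np.array([0.05, 0.08, 0.12])            # expected returns (made up)
cov = np.array([[0.04, 0.01, 0.00],          # covariance matrix (made up)
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
w = min_variance_weights(mu, cov, target=0.09)
print(w, w @ mu, w.sum())
```

Replacing the quadratic form w' cov w by a tail-sensitive risk measure changes the objective but not the structure of the constrained-optimization problem, which is the thermodynamic analogy drawn earlier in this introduction.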
Part I

Study and modeling of the properties of financial asset returns
Chapter 1

Stylized facts of stock returns

The purpose of this chapter is to present the main stylized facts of stock returns, to discuss their relevance with respect to the major principles of finance and to certain phenomenological models, but also to question or generalize some results regarded as settled and well established, even as other, equally realistic characteristics seem to call for an update. By stylized fact we mean any characteristic common to the time series originating from financial markets. We are thus not concerned with an event study of the markets, in which one seeks to explain price moves by the arrival of a given piece of information, but rather with a statistical approach whose aim is to identify and isolate the common traits of the time series of financial asset prices. In writing this, we do not mean to oppose these two approaches, which are by no means antinomic and can even prove complementary, as we will see at the end of this chapter. The study of stylized facts rests on the data provided by the time series of financial asset prices and their returns. In what follows, we denote by P_t the price of an asset at time t and by P_{t+1} the price of this asset at time t + Δt, where Δt is the time unit considered, which may equally well be the minute, the day or the month. The return of the asset at time t is denoted r_t and defined by

    r_t = ln(P_t / P_{t-1}) ≈ (P_t − P_{t-1}) / P_{t-1},    (1.1)

which is, strictly speaking, the logarithmic return, but differs little from the ordinary return as long as the latter remains very small compared with one.
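Definition (1.1) can be checked numerically. A small sketch (the price path is arbitrary): for small moves the two definitions nearly coincide, and log returns have the convenient property of aggregating additively over time.

```python
import numpy as np

prices = np.array([100.0, 101.0, 99.5, 103.0, 102.0])   # arbitrary price path

simple = prices[1:] / prices[:-1] - 1       # ordinary return (P_t - P_{t-1}) / P_{t-1}
log_r = np.log(prices[1:] / prices[:-1])    # logarithmic return r_t of eq. (1.1)

# For small moves the two nearly coincide, and log returns aggregate
# additively: their sum is the log of the total price change.
print(np.abs(simple - log_r).max())
print(log_r.sum(), np.log(prices[-1] / prices[0]))
```

The ordinary return is bounded below by −1 while the log return lives on the whole real line (ln(1+x) ≤ x always), which is the distinction made in the next paragraph.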
The essential difference is that the ordinary return is bounded below by minus one, whereas the logarithmic return is defined on the whole real line — an important distinction when determining the marginal distribution of returns. Moreover, under these conditions the logarithm of the price, ln P_t, follows a process whose increments are the returns r_t. A crucial question then arises, namely the stationarity of the data on which the statistical studies are based. The hypothesis of stationarity of asset prices must obviously be rejected, since prices derive from an integrated process, as equation (1.1) shows. On the other hand, the stationarity hypothesis seems much more acceptable for return series and is very often made, being necessary for the statistical analysis of the data¹, even though we must acknowledge that it is sometimes questionable. Indeed, some of the studies we refer to below were carried out on century-long series. Over such durations, the evolution of the economic and regulatory context, as well as the introduction of new financial products, suggests that the way markets operate has changed considerably over time, which should show up in prices as non-stationarities. Moreover, even over short periods, seasonal effects appear: abnormal behavior is observed, for instance, at the beginning and end of the day, the week or the year. Once identified, however, these effects can be corrected for. In short, stationarity is potentially a serious problem, but one may hope to limit its effects by deseasonalizing the data and by considering samples of reasonable size (five to ten years at most) selecting relatively homogeneous market sequences. In what follows we essentially present the properties of daily or intraday returns; unless otherwise stated, the term return will implicitly refer to returns computed at a time scale of one day or less, and when we consider monthly returns, for example, this will be stated explicitly. Finally, we wish to stress that the review of stylized facts given in this chapter does not claim to be exhaustive but concentrates on those that will be most useful in the remainder of our exposition; we refer the reader to the many review articles on the subject, such as Pagan (1996), Cont (2001) or Engle and Patton (2001), as well as to the books by Campbell, Lo and MacKinlay (1997) or Gouriéroux and Jasiak (2001) for an econometric approach, and Mantegna and Stanley (1999), Bouchaud and Potters (2000) or Roehner (2001, 2002) for a physicist's view.

¹ In the case of non-stationary series, the study remains possible to some extent, but calls for more complex methods whose use is only valid within more or less specific models (Gouriéroux and Jasiak 2001).
1.1 Review of the stylized facts

We now present the main stylized facts, beginning with the distribution of returns, which today seems established beyond doubt to have "fat tails"; we then present the temporal dependence properties of financial series, essentially characterized by the absence of correlation between returns together with a strong persistence of volatility; and we end with a few complementary properties of these series.
1.1.1 The distribution of returns

The very first peculiarity of financial returns is that they follow so-called "fat-tailed" laws. By fat-tailed law we mean the set of laws regularly varying at infinity, that is, loosely speaking, laws equivalent at infinity to a power law (for an exact definition of regular variation, see Bingham, Goldie and Teugels (1987)). This behavior differs radically from the one generally accepted during the first half of the twentieth century. Indeed, following the pioneering work of Bachelier (1900), taken up and extended by Samuelson (1965, 1973), everyone agreed that return distributions were Gaussian. It took Mandelbrot (1963) and his study of prices on the cotton market, followed by Fama (1963, 1965a), to clearly reject this hypothesis and turn to power-law distributions, in particular the stable Lévy laws, one of whose characteristics is to admit no second moment and hence to have infinite variance. Yet direct tests on most financial series clearly support the existence of this second moment. Indeed, under temporal aggregation of returns — that is, when one moves from daily to monthly returns, or to returns computed at even larger scales — one observes a (slow) convergence of the distribution toward the Gaussian (Bouchaud and Potters 2000, Campbell et al. 1997, Mantegna and Stanley 1999). Hence the stable Lévy laws cannot be suitable either for describing the distributions of stock returns, at least not in their entirety. During the 1980s and 1990s, gigantic databases appeared, containing the prices of financial assets over long periods, recorded at very high frequencies. At the same time, econometrics — and financial econometrics in particular — developed new tools, so that a small revolution took place, both methodologically and in the quantity of accessible data. It then became possible to pin down much more precisely the asymptotic behavior of the return distributions of financial assets. Numerous studies show that return distributions do behave like regularly varying laws whose tail exponent lies between two and a half and four, so that the first two moments of these distributions exist, and perhaps the third and fourth as well. These studies were conducted on various kinds of assets but all lead to similar conclusions. See in particular Longin (1996), Lux (1996), Pagan (1996), Gopikrishnan, Meyer, Amaral and Stanley (1998) for studies of stock markets, and Dacorogna, Müller, Pictet and de Vries (1992), de Vries (1994) or Guillaume, Dacorogna, Davé, Müller, Olsen and Pictet (1997) for exchange rates. Recently, however, a few voices have been raised to suggest that it might be wise to temper the general enthusiasm for power laws.
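Tail exponents such as those quoted above are commonly measured with the Hill estimator. As an illustration — not the methodology of the studies cited — one may apply it to a Student-t sample whose true tail exponent is known:

```python
import numpy as np

rng = np.random.default_rng(2)

def hill_estimator(sample, k):
    # Hill estimator of the tail exponent from the k largest |observations|:
    # alpha_hat = k / sum_{i<=k} ln( X_(i) / X_(k+1) ), order statistics descending.
    x = np.sort(np.abs(sample))[::-1]
    return k / np.log(x[:k] / x[k]).sum()

# Student-t with 3 degrees of freedom: a regularly varying distribution
# whose true tail exponent (3) falls in the 2.5-4 range quoted above.
sample = rng.standard_t(df=3, size=200_000)
print(hill_estimator(sample, k=2000))
```

The choice of k (how far into the tail to look) is the delicate point: too small and the estimate is noisy, too large and observations from the bulk of the distribution bias it — which is one concrete face of the model-error problem discussed in this section.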
Indeed, according to Mantegna and Stanley (1995), the results of Mandelbrot (1963) correctly describe a large part of the return distribution, but the ultimate stable-Lévy behavior is wrong and must be truncated by an exponential decay. It is known, moreover, that this type of distribution converges — under convolution — extraordinarily slowly toward the Gaussian (Mantegna and Stanley 1994), in agreement with empirical observations. In the same spirit, Gouriéroux and Jasiak (1998) conclude that the probability density of the Alcatel stock decays faster than any power law. Finally, Laherrère and Sornette (1999) assert more precisely that the so-called stretched exponential distributions seem to provide a better description of return distributions than power laws, not only in the extreme tails but over a wide range of returns. For the sake of generality, it thus seems judicious to extend the notion of fat-tailed distributions to the subexponential distributions, that is, those that do not decay faster than an exponential at infinity. For a rigorous definition and a survey of the properties of these distributions, we refer the reader to Embrechts, Klüppelberg and Mikosch (1997) and Goldie and Klüppelberg (1998). Given the vagueness that remains as to the exact behavior of return distributions, a re-examination of their asymptotic behavior seems necessary. In particular, it appears indispensable to compare the descriptive power of the two families of distributions just mentioned, and we will therefore return to this problem in detail in a later section.
We temporarily set this point aside to turn now to the second characteristic trait of financial series, namely the existence of non-trivial temporal dependence structures. This is, moreover, a fundamental element in understanding why the marginal distribution of returns is so difficult to determine precisely.
1.1.2 Temporal dependence properties

The temporal dependence properties we summarize in this section all derive from one of the great fundamental principles of financial theory, namely the no-arbitrage principle. Without going into detail here, we will simply say, roughly, that this principle postulates the impossibility of obtaining a sure gain on financial markets. The logic of such a postulate is that if an agent detects such an opportunity, he will immediately exploit it, and the opportunity will therefore
[FIG. 1.1 — Realized volatility (squared returns) of the Standard & Poor's 500 index between January 1970 and December 1985.]
disappear. Thus, except fleetingly, such a situation cannot occur. This implies in particular that future returns cannot be predicted. An immediate and very commonly observed consequence is the absence of autocorrelation between returns² (Fama 1971, Pagan 1996, Gouriéroux and Jasiak 2001). This absence of temporal correlations is easily justified by the absence of (statistical) arbitrage opportunities: if significant correlations existed, future prices could be predicted, which would offer — at least statistically — a way to make money for sure. Thus, according to Mandelbrot (1971), the absence of arbitrage opportunities whitens the spectrum of price changes and therefore makes temporal correlations disappear. This was long one of the pillars of the theory of financial-market efficiency; indeed, the absence of correlation notably justifies the hypothesis, going back to Bachelier (1900), that prices follow a random walk (Brownian motion). That said, once one admits that the stochastic process describing the evolution of returns over time is not Gaussian, its autocorrelation function alone cannot suffice to define it. In particular, the absence of correlation does not allow one to conclude that the price (more precisely, the logarithm of the price) follows a process with independent increments, which would be false. This is easily verified by simply observing a series of returns: periods of high volatility visibly alternate with periods of lower volatility, showing that volatility³ exhibits a persistence phenomenon (see Figure 1.1). This phenomenon is very similar
² Strictly speaking, there are two limits to this assertion. First, below a time interval of the order of a few seconds to a few minutes, market microstructure effects become important and explain the existence of significant correlations (Campbell et al. 1997). Second, for returns computed at time scales of the order of a month or more, significant correlations also seem to appear. This may be explained by the fact that at such scales the return of an asset has a much stronger economic content than a daily return, which is essentially dominated by (noise-)trading activity that can be completely decoupled from any economic reality.
³ We use volatility here in a relatively loose sense, which may cover both the standard deviation of returns and simply their amplitude. Note, however, that a similar persistence is also observed for skewness and kurtosis (see, e.g., Jondeau and Rockinger 2000).
to the intermittency phenomenon observed in turbulence (Frisch 1995). It clearly reflects the existence of some dependence in financial series: the sequence of returns does not appear independent. In fact, the absolute values of returns, or their squares, are strongly correlated, and this correlation persists: the autocorrelation function of absolute returns typically decays as a power law with an exponent between 0.2 and 0.4 (Cont, Potters and Bouchaud 1997, Liu, Cizeau, Meyer, Peng and Stanley 1997). Once again, this is consistent with the no-arbitrage principle, for volatility is not a signed quantity and therefore reveals nothing about future returns. More generally, any function of volatility (or of the amplitude of returns) is allowed to exhibit significant temporal correlations. For examples, see in particular Cont (2001), and also Muzy, Delour and Bacry (2000), where it is shown that the correlation of the logarithm of absolute returns decays as the logarithm of the time lag between the returns considered, and thus has long memory. This persistence (or long-memory) phenomenon is characteristic of volatility, but establishing it is sometimes delicate and open to question, as shown by Granger and Teräsvirta (1999), who built a non-linear first-order autoregressive process that passes the standard long-memory tests, or by Andersson, Eklund and Lyhagen (1999), who proved the converse, namely that a long-memory process can pass linearity tests.
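The contrast between uncorrelated returns and persistently correlated amplitudes can be reproduced with any volatility-clustering process. A minimal sketch using a standard GARCH(1,1) recursion (the parameter values are arbitrary and the autocorrelation routine is ours):

```python
import numpy as np

rng = np.random.default_rng(3)

def autocorr(x, lag):
    # Sample autocorrelation of the series x at the given lag.
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).mean() / x.var()

# GARCH(1,1)-type returns: signs are unpredictable but amplitudes cluster.
n = 100_000
w, a, b = 1e-6, 0.08, 0.90
r = np.empty(n)
sigma2 = w / (1 - a - b)
for t in range(n):
    r[t] = np.sqrt(sigma2) * rng.standard_normal()
    sigma2 = w + a * r[t] ** 2 + b * sigma2

print("ACF of returns, lag 1    :", round(autocorr(r, 1), 3))           # ~ 0
print("ACF of |returns|, lag 1  :", round(autocorr(np.abs(r), 1), 3))   # clearly > 0
print("ACF of |returns|, lag 20 :", round(autocorr(np.abs(r), 20), 3))  # still > 0
```

Note that this toy process has geometrically decaying amplitude correlations, whereas the empirical decay quoted above is a power law with exponent 0.2-0.4 — one of the reasons simple ad hoc processes struggle to match all the stylized facts at once.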
To conclude these examples of temporal correlations, one should mention the leverage effect, brought to light by Black (1976) and then notably by Christie (1982) and Bouchaud, Matacz and Potters (2001), which is found for stock returns and interest rates but not for exchange rates. The leverage effect takes the form of a negative correlation between volatility at a given time and past returns, meaning that downward price moves tend to trigger an increase in volatility. It even seems that this leverage effect is at the origin of the causal cascade observed by Arnéodo, Muzy and Sornette (1998) on stock markets⁴. In short, because of the no-arbitrage constraint, it is impossible to observe significant correlations between functions of past variables and future returns that would allow the latter to be predicted. On the other hand, it is entirely possible to observe — and this is indeed the case — substantial correlations between functions of past variables and future volatility, since these do not allow returns to be predicted. Finally, beyond the strong dependence of future volatility on past volatility and past returns, it should be noted that, overall, volatility tends to revert to a mean level, which may be interpreted as its "normal" level. This mean-reversion phenomenon is clearly seen in the volatility predicted by most standard models: as the forecast horizon increases, the volatility forecast tends to a common mean value (see, e.g., Engle and Patton (2001)).
Il est alors tr`es int´eressant d’´etudier le mode de relaxation de la volatilit´e vers son niveau moyen, ce qui a e´ t´e r´ealis´e par Sornette, Malevergne et Muzy (2002) et dont nous parlerons un peu plus en d´etail dans un paragraphe ult´erieur.
1.1.3 Other stylized facts
The stylized facts just discussed are the main ones, universally accepted: they are those that any realistic price model must at least reproduce. This may seem modest, but it is in fact already quite difficult to construct, even in an ad hoc manner, processes satisfying these few constraints; we shall give several examples later in this chapter. Beside these major stylized facts, a number of other important characteristics exist, notably the gain/loss asymmetry for
4 This interpretation was suggested to us by J.F. Muzy in a private conversation.
26
1. Stylized facts of stock returns
Fig. 1.2 – Hang Seng index (Hong Kong stock exchange) between January 1970 and December 2000.
stocks, the super-exponential acceleration of prices during speculative-bubble phases, as well as multifractality and the correlation between volatility and trading volume, this last point being intuitively obvious (Gopikrishnan, Plerou, Gabaix and Stanley 2000). Concerning the gain/loss asymmetry, it has been clearly established by Johansen and Sornette (1998) and Johansen and Sornette (2002) that stock markets undergo cumulative losses of very large amplitude, whereas phenomena of comparable amplitude are not systematically observed for gains. Moreover, this asymmetry does not seem to affect foreign-exchange markets, where a much greater symmetry appears to hold. This is in fact natural, since the rise or fall of an exchange rate is only relative to the position one adopts: a rise of the euro/dollar rate obviously corresponds to a fall of the dollar/euro rate. The super-exponential growth of asset prices is illustrated in figure 1.2. One observes there that the price (plotted on a logarithmic scale) grows on average linearly in time, which characterizes exponential growth and hence a constant growth rate. A closer examination shows, however, that the trajectory is in fact a succession of phases of accelerating growth, faster than the average, interrupted by strong price corrections. This phenomenon was dubbed "sharp peak, flat trough" by Roehner and Sornette (1998) and seems to originate in the imitation and crowd effects observed in markets; we shall return to this specific point in much greater detail in chapters 5 and 6. The last point we shall address concerns the quantification of the regularity of the price trajectories of financial assets. Arnéodo et al. (1998) and Fisher, Calvet and Mandelbrot (1998) concluded that these trajectories are multifractal. Indeed, they showed that, whatever the asset considered, its multifractal spectrum lies strictly between zero and one, which ensures that the trajectory is almost everywhere continuous but non-differentiable, and has the shape of a parabola whose maximum is close to 0.6.5 By comparison, Brownian motion being self-similar, and therefore (mono)fractal, its spectrum is simply one half. This leads to very strong constraints on the processes capable of producing such trajectories, but a caveat is in order regarding the results of these studies, since many authors,
5 It is interesting to note that these observations lead one to reject a large number of continuous-time models such as diffusion processes, Lévy processes or jump processes.
including Veneziano, Moglen and Bras (1995), Avnir, Biham, Lidar and Malcai (1998), Bouchaud, Potters and Meyer (2000), LeBaron (2001) and Sornette and Andersen (2002), have shown that it is very easy to exhibit apparent multifractality in processes known to possess no fractal property at all, simply because of the finite size of the time series considered or through other mechanisms involving nonlinearities.
1.2 On the difficulty of representing the distribution of returns
We saw in the first part of the previous section that the distribution of returns seems describable by regularly varying distributions with a tail index of the order of three or four. Recently, however, another hypothesis has emerged, according to which the distributions of returns would follow stretched-exponential laws. While these two types of distributions have much in common, they nevertheless display one major difference whose theoretical impact is considerable. As for the common points, both families belong to the class of subexponential distributions and are therefore able to produce extreme events with high probability. In particular, for this class of random variables, the large deviations of the sum of a large number of such variables are dominated by the large deviation of a single one of them: the sum is large if one of the variables is large (Embrechts et al. 1997, Sornette 2000), in sharp contrast with so-called super-exponential distributions, for which every variable contributes significantly (and roughly equally) to the sum (Frisch and Sornette 1997). That said, there remains a major difference: stretched-exponential distributions admit moments of all orders, whereas regularly varying distributions only admit moments of order lower than their tail index. This remark shows how crucial it is to be able to discriminate between these two types of distributions, for we shall see in part III that these moments play an essential role in representing the attitude of economic agents toward risk and in modelling the decisions they take.
It is therefore absolutely essential to be certain of their existence. The uncertainty that seems to surround the exact nature of the distribution of returns in the extremes, and the difficulty of determining it precisely, is in our opinion largely due to the temporal dependence present in financial series. Indeed, when the observations are assumed independent and identically distributed, theory shows that the usual estimators, such as the Hill (1975) estimator, which determines the tail exponent of a regularly varying distribution, but also the Pickands (1975) estimator, are asymptotically consistent and normally distributed (Embrechts et al. 1997). This allows one, for instance, to estimate precisely the uncertainty on the measurement of a tail index, and possibly to test the equality of the tail indices of different assets, or of the positive and negative tails of a single asset, in order to reveal possible asymmetries (Jondeau and Rockinger 2001). But when the data exhibit temporal dependence, one can hardly hope for more than the asymptotic consistency of these estimators (Rootzén, Leadbetter and de Haan 1998), the uncertainty of the measurement being much larger than that obtained under the hypothesis of asymptotic normality. Indeed, Kearns and Pagan (1997) have shown that for (G)ARCH-type processes, which account reasonably well for the dependence of financial series (see the next section), the standard deviation of the Hill estimator can be seven to eight times larger than that estimated for series whose data are independently and identically distributed. These
results are even worse for the Pickands estimator, which is known to play an important role in the empirical determination of the extreme-value domain of attraction to which a given distribution belongs. This last remark is crucial and potentially calls into question a good number of studies of the asymptotic behaviour of return distributions. Indeed, neglecting temporal dependence amounts to understating the uncertainty of the estimators, as in the studies based on extreme value theory carried out by Longin (1996) or Lux (1997) in particular, where one may say that the hypothesis of return distributions that are regularly varying at infinity seems to have been accepted somewhat hastily, perhaps inconsiderately. Moreover, on the theoretical side, it has been shown that certain classes of models supporting the regularly varying hypothesis had to be abandoned. Indeed, Lux and Sornette (2002) and Malevergne and Sornette (2001a) proved that the phenomenological multiplicative-noise models of the type of Blanchard (1979) and Blanchard and Watson (1982), which underpin the notion of rational bubbles (Colletaz and Gourlaouen 1989, Broze, Gouriéroux and Szafarz 1990), lead, through the no-arbitrage condition, to regularly varying return distributions whose tail exponent is necessarily smaller than one (see chapter 2), which every empirical study conducted to date has refuted.
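Under the i.i.d. hypothesis discussed above, the Hill estimator is straightforward to implement and well behaved. A minimal sketch on synthetic Pareto data (the sample size, true tail index and number of order statistics are all illustrative):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimate of the tail index, using the k largest order statistics."""
    order = np.sort(np.asarray(x, dtype=float))[::-1]  # descending order
    return 1.0 / np.mean(np.log(order[:k]) - np.log(order[k]))

rng = np.random.default_rng(2)
# i.i.d. Pareto sample with true tail index 3, the order of magnitude
# usually reported for stock returns.
sample = rng.pareto(3.0, size=200_000) + 1.0
mu_hat = hill_estimator(sample, k=2000)
print(mu_hat)  # close to 3 for i.i.d. data
```

For i.i.d. data the standard error is roughly the index divided by the square root of k; the point made by Kearns and Pagan (1997) is that under (G)ARCH-type dependence the true dispersion can be several times larger than this nominal figure.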
Finally, among the various asset-price models reviewed in Sornette and Malevergne (2001), it emerges in particular that the models of Johansen, Sornette and Ledoit (1999) and Johansen, Ledoit and Sornette (2000), in which a crash hazard rate is introduced to ensure stationarity, admit price dynamics compatible with regularly varying as well as stretched-exponential return distributions. In view of these uncertainties, both empirical and theoretical, we judged it necessary to carry out a comparative study of the descriptive power of the two possible representations of return distributions (see Malevergne, Pisarenko and Sornette (2002), presented in chapter 3). This study shows that, from a statistical point of view, stretched exponentials account for the distributions of returns at least as well as power laws and other regularly varying laws, but that, because of temporal dependence, we cannot say that either one is genuinely better than the other. We must therefore confine ourselves to saying that stretched exponentials provide a perfectly credible alternative to regularly varying distributions, but we are unfortunately unable to settle the question of the existence of moments beyond a certain order.
1.3 Modelling the dependence properties of returns
The previous section has shown that, from a statistical point of view, favouring regularly varying distributions over stretched-exponential distributions, or vice versa, is not really justified. In fact, as we shall now see, the study of the dynamical evolution of prices provides a way to narrow this choice. Indeed, accounting for the stylized facts concerning the temporal dependence of volatility imposes constraints strong enough to define rather satisfactorily the stochastic process describing the evolution of prices, which in turn pins down the marginal distribution of returns compatible with the specified process. The interest of finding such a stochastic process is therefore twofold: describing the dynamics of prices while accounting for the dependence exhibited by volatility, and providing landmarks regarding the marginal distribution of returns. The first descriptions of price processes in terms of random walks with independent increments, and the modelling of return dynamics with ARMA processes, were quickly
abandoned because of their inability to account for the long memory of volatility, and in particular for volatility bursts. Indeed, by construction, random walks with independent increments display no volatility dependence at all, while the autocorrelation function of ARMA processes decays exponentially (see Gouriéroux and Jasiak (2001), for example) rather than algebraically. An alternative to ARMA processes is provided by fractionally integrated ARIMA processes (Granger and Joyeux 1980, Hosking 1981), which do exhibit long-range correlations but are not well suited to modelling volatility. The first real step toward a solution was proposed by Engle (1982) with the autoregressive conditionally heteroskedastic (ARCH) models, later generalized by Bollerslev (1986) into the GARCH models. In this family of models, the returns are decomposed as a product

rt = σt · εt ,        (1.2)

where the volatility σt follows an autoregressive process and εt is a white noise independent of σt, which ensures that returns at successive dates are uncorrelated. In an ARCH model, the current volatility σt depends only on the past realizations of the returns rt−1, rt−2, . . ., whereas in a GARCH model it is also a function of the past volatilities σt−1, σt−2, . . .. In practice, the GARCH model has an undeniable advantage over the ARCH model because it is far more parsimonious: taking into account the last volatility and the last two returns (hence three parameters) is generally sufficient, whereas for an ARCH model it is not uncommon to have to consider the last ten or fifteen realized returns.
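The decomposition (1.2) with a GARCH(1,1) volatility recursion can be sketched in a few lines; the parameter values below are illustrative, not estimated from data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
omega, alpha, beta = 0.05, 0.10, 0.85  # illustrative GARCH(1,1) parameters

r = np.empty(n)
v = omega / (1.0 - alpha - beta)  # start at the stationary variance
for t in range(n):
    r[t] = np.sqrt(v) * rng.standard_normal()  # r_t = sigma_t * eps_t
    v = omega + alpha * r[t] ** 2 + beta * v   # GARCH(1,1) variance recursion

# Fat-tailed marginal: kurtosis above the Gaussian value of 3, even
# though eps_t is Gaussian.
kurt = np.mean(r**4) / np.mean(r**2) ** 2
# Uncorrelated returns, but persistently correlated amplitudes.
ac_r = np.corrcoef(r[:-1], r[1:])[0, 1]
ac_abs = np.corrcoef(np.abs(r[:-1]), np.abs(r[1:]))[0, 1]
print(kurt, ac_r, ac_abs)
```

Even with Gaussian innovations the simulated returns are leptokurtic and their absolute values are autocorrelated, which is precisely the pair of stylized facts these models were designed to capture.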
As regards their adequacy with the stylized facts, these processes account for the slow decay of the autocorrelation of volatility, which was their primary objective, and for mean reversion. Moreover, one can easily show (see Embrechts et al. (1997), for example), using the renewal theory for stochastic recurrence equations (Kesten 1973, Goldie 1991), that the marginal distribution of the returns is regularly varying. One limitation of this modelling is that it does not properly account for the slow decay of the correlation of the logarithm of volatility, for multifractality,6 or for the leverage effect. This last point can however be corrected by considering the EGARCH model of Nelson (1991), which is nothing but a GARCH process on the logarithm of volatility rather than on the volatility itself, or the TARCH models of Glosten, Jagannathan and Runkle (1993) and Zakoian (1994), which are threshold GARCH models. We shall not go further into the description of the vast family of ARCH models and their generalizations, and we refer the reader to the many review articles and books on the subject, notably Bollerslev, Chou and Kroner (1992), Bollerslev, Engle and Nelson (1994) and Gouriéroux (1997). A second alternative was presented more recently by Bacry, Delour and Muzy (2001), who developed the multifractal random walk (MRW), an infinitely log-divisible continuous-time process (Bacry and Muzy 2002, Muzy and Bacry 2002). As in an EGARCH process, it is here the logarithm of the volatility that is modelled.
The logarithm of the volatility is in fact a stationary Gaussian process whose autocorrelation is specified so as to decay proportionally to the logarithm of the lag between the volatilities, which is one of the stylized facts presented above that the (G)ARCH models failed to capture. This is the major advance of this type of model over the pre-existing processes, all the more so as it accounts, by its very construction, for the long-range correlations of volatility and is one of the rare known processes to satisfy the constraints of multifractality genuinely rather than artificially.
6 On samples of finite size, Baviera, Biferale, Mantegna and Vulpiani (1998) nevertheless noted the presence, purely artificial, of an apparent multifractality.
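A crude, finite-grid caricature of this construction (not the exact MRW of Bacry, Delour and Muzy, whose continuous-time limit requires more care) draws a Gaussian log-volatility vector with a logarithmically decaying covariance up to a decorrelation scale T, then modulates a Gaussian white noise; the intermittency coefficient and scale are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n, lam2, T = 512, 0.04, 256  # grid length, intermittency coefficient, integral scale

# Target covariance of the log-volatility: lam2 * ln(T / tau) for lags tau < T.
tau = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
cov = lam2 * np.log(T / np.maximum(tau, 1.0))
cov[tau >= T] = 0.0
np.fill_diagonal(cov, lam2 * np.log(T))

# The truncated logarithmic covariance is not guaranteed positive definite
# on a finite grid, so clip negative eigenvalues before taking a square root.
w, v = np.linalg.eigh(cov)
root = v * np.sqrt(np.clip(w, 0.0, None))
omega = root @ rng.standard_normal(n)        # Gaussian log-volatility
r = np.exp(omega) * rng.standard_normal(n)   # r_t = e^{omega_t} * eps_t

print(np.isfinite(r).all())
```

The logarithmic covariance of omega is what produces both the long memory of volatility and, in the proper continuous-time limit, the multifractal scaling discussed in the text.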
As for the marginal distribution of the returns, it also has fat, regularly varying tails, but with tail exponents much higher than those previously encountered. Indeed, when the model is calibrated on real data, it clearly indicates that moments exist up to an order larger than twenty, which is far beyond what is usually accepted. This shows once more how difficult it is to decide on the asymptotic behaviour of the marginal distribution of returns. Finally, this model allowed Sornette et al. (2002) to predict theoretically the relaxation mode of volatility, that is, how volatility decays (or grows) as it returns to its mean value. Lillo and Mantegna (2002) recently observed that after a large shock, volatility relaxes as a power law whose exponent is strangely close to that of the autocorrelation function of volatility, which could suggest a fluctuation-dissipation-type relation between the correlations of the volatility fluctuations "at equilibrium" and the dissipation driving the return to equilibrium after a shock. In fact this is not so, and we have shown that the relaxation mode (more precisely, the exponent of the power law) depends on the nature of the shock. If a piece of major news is sufficient by itself to move the market (such as the coup against Gorbachev in 1991, for example), volatility relaxes with an exponent -1/2, independently of the amplitude of the shock. On the contrary, if the shock is due to an accumulation of "small" pieces of bad news, the relaxation occurs with a generally smaller exponent whose value depends on the amplitude of the shock.
This approach, verified empirically, has provided a new confirmation of the interest of the MRW model, since it seems to be the only model able to describe this change of relaxation mode according to the nature of the shock; in particular, the relaxation observed for a GARCH process does not depend on it. This approach also opens the way to a systematic classification of volatility shocks according to their "endogenous" or "exogenous" nature. A current limitation of the model, however, is that it treats gains and losses in a fully symmetric way, thereby neglecting the leverage effect. A possible solution would be to consider an asymmetric MRW process such as the one proposed by Pochart and Bouchaud (2002).
1.4 Conclusion
The purpose of this chapter was to present the main stylized facts of stock returns, which can be summarized as follows:
– fat-tailed distribution of returns,
– complex temporal dependence: absence of temporal correlation between the returns, but persistent correlation of the volatility,
– transient super-exponential growth revealing the existence of distinct market phases,
– and multifractality of the price trajectories of financial assets.
Some theoretical justification can be given for these stylized facts by appealing both to fundamental principles of financial theory and to simple phenomenological models. In return, this narrows the range of processes acceptable for modelling the prices of financial assets. Nevertheless, this approach remains superficial, in that it does not bring a better understanding of the mechanisms at work in financial markets or of the behaviour of the agents taking positions in these markets. This is why we shall need to turn to the "microscopic", or microstructural, aspect of financial markets, which will be the subject of chapters 5 and 6. Before that, in the chapters that follow, we present some of the results we have obtained, which justify the assertions made throughout this chapter.
Chapter 2
Phenomenological price models
We present in this chapter a few phenomenological models of asset prices, in order to illustrate our claim of chapter 1, section 1.2, that it is just as possible to construct models in which the stationary distribution of returns is regularly varying as models in which it is a stretched exponential. To this end, we first show that the simple models of rational bubbles1 à la Blanchard and Watson (1982) are in fact incompatible with the empirical data: while they do lead to regularly varying return distributions, the tail index to which they lead is much smaller than the one actually estimated under this hypothesis. We then review two models that meet the empirical constraints by reproducing most of the stylized facts presented in chapter 1, and that can justify the hypothesis of a hyperbolic distribution just as well as that of a stretched-exponential distribution.
1 The reader is referred to Blanchard and Watson (1982), Colletaz and Gourlaouen (1989), Broze et al. (1990) or Adam and Szafarz (1992), in particular, for a review of the subject.
2.1 Multi-dimensional rational bubbles and fat tails
Lux and Sornette (2002) have shown that the tails of the unconditional distributions of price increments and of returns associated with the rational-bubble model of Blanchard and Watson (1982) follow power laws (decay hyperbolically), with a tail exponent µ smaller than one over a wide range. Although power-law tails are a prominent feature of empirical data, the numerical value µ < 1 is in disagreement with the usual estimates, which, under this type of model, give µ ≈ 3. Among the four hypotheses supporting the rational-bubble model of Blanchard and Watson (rationality of the agents, no-arbitrage condition, multiplicative dynamics, independent bubbles for each asset), we show that the same result µ < 1 remains valid when the last of these hypotheses is relaxed, i.e. when coupling between the bubbles of the various assets is allowed. Consequently, nonlinear extensions of the bubble dynamics, or a partial relaxation of the rational pricing principle, are necessary if one wishes to account for the empirical observations.
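The mechanism behind µ < 1 can be illustrated numerically for the one-dimensional bubble Xt+1 = at Xt + bt, with at = ā with probability π and 0 otherwise. The renewal-theory (Kesten-type) condition π ā^µ = 1 gives the tail exponent of the stationary distribution, while no-arbitrage imposes π ā = 1/δ > 1, hence µ < 1. A sketch with illustrative values of π and δ:

```python
import numpy as np

rng = np.random.default_rng(5)
pi, delta = 0.9, 0.95        # illustrative survival probability and discount factor
abar = 1.0 / (delta * pi)    # no-arbitrage condition: pi * abar = 1/delta > 1

# Kesten-type condition pi * abar**mu = 1 gives the theoretical tail
# exponent, necessarily below one for any pi and any delta < 1.
mu_theory = np.log(1.0 / pi) / np.log(abar)

# Simulate the bubble X_{t+1} = a_t X_t + b_t and check the tail with a
# Hill estimator on the largest absolute values.
n = 500_000
a = np.where(rng.random(n) < pi, abar, 0.0)
b = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = a[t] * x[t - 1] + b[t]

order = np.sort(np.abs(x))[::-1]
k = 2000
mu_hill = 1.0 / np.mean(np.log(order[:k]) - np.log(order[k]))
print(mu_theory, mu_hill)  # both well below the empirical value of about 3
```

The simulated tail index sits near the theoretical one and far below the empirical µ ≈ 3, which is the incompatibility discussed in the reprinted paper below.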
Reprint from: Y. Malevergne and D. Sornette (2001), "Multi-dimensional rational bubbles and fat tails", Quantitative Finance 1, 533-541.
RESEARCH PAPER
Quantitative Finance Volume 1 (2001) 533–541, Institute of Physics Publishing
quant.iop.org
Multi-dimensional rational bubbles and fat tails

Y Malevergne1,2 and D Sornette1,3

1 Laboratoire de Physique de la Matière Condensée, CNRS UMR 6622, Université de Nice-Sophia Antipolis, 06108 Nice Cedex 2, France
2 Institut de Science Financière et d'Assurances, Université Lyon I, 43 Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
3 Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, CA 90095, USA

E-mail: [email protected] and [email protected]

Received 20 July 2001, in final form 20 August 2001
Published 14 September 2001
Online at stacks.iop.org/Quant/1/533
Abstract Lux and Sornette have demonstrated that the tails of the unconditional distributions of price differences and of returns associated with the model of rational bubbles of Blanchard and Watson follow power laws (i.e. exhibit hyperbolic decline), with an asymptotic tail exponent µ < 1 over an extended range. Although power-law tails are a pervasive feature of empirical data, the numerical value µ < 1 is in disagreement with the usual empirical estimates µ ≈ 3. Among the four hypotheses underlying the Blanchard and Watson rational bubbles model (rationality of the agents, no-arbitrage condition, multiplicative dynamics and bubble independence across assets), we prove that the same result µ < 1 holds when relaxing the last hypothesis, i.e. by allowing coupling between different bubbles on several assets. Therefore, nonlinear extensions of the bubble dynamics or partial relaxation of the rational pricing principle are necessary.
1. Introduction

Blanchard (1979) and Blanchard and Watson (1982) originally introduced the model of rational expectations (RE) bubbles to account for the possibility, often discussed in the empirical literature and by practitioners, that observed prices may, over extended time intervals, deviate significantly from fundamental prices. While allowing for deviations from fundamental prices, rational bubbles keep a fundamental anchor point of economic modelling, namely that the bubble Xt must obey the condition of rational expectations

Xt = δ · E[Xt+1 | Ft],        (1)

where δ < 1 is the discount factor and Ft the available information at time t. In order to avoid the unrealistic picture of ever-increasing deviations from fundamental values, Blanchard and Watson (1982) proposed a model with periodically collapsing bubbles in which the bubble component of the price follows an exponentially explosive path (the price being multiplied by at = ā > 1) with probability π and collapses to zero (the price being multiplied by at = 0) with probability 1 − π. It is clear that, in this model, a bubble has an exponential distribution of lifetimes with a finite average lifetime π/(1 − π). Bubbles are thus transient phenomena. The condition of rational expectations imposes that πā = 1/δ > 1. In order to allow for the start of new bubbles after a collapse, a stochastic zero-mean, normally distributed component bt is added to the systematic part of the bubble Xt. This leads to the following dynamical equation

Xt+1 = at Xt + bt,        (2)

where, as we said, at = ā with probability π and at = 0 with probability 1 − π. Both variables at and bt do not depend
on the process Xt. There is a huge literature on theoretical refinements of this model and on the empirical detectability of RE bubbles in financial data (see Camerer (1989) and Adam and Szafarz (1992) for surveys of this literature). Recently, Lux and Sornette (1999) studied the implications of the RE bubble models for the unconditional distribution of prices, price changes and returns resulting from a more general discrete-time formulation extending (2) by allowing the multiplicative factor at to take arbitrary values and be i.i.d. random variables drawn from some non-degenerate probability density function (pdf) Pa(a). The model can also be generalized by considering non-normal realizations of bt with distribution Pb(b) with E[bt] = 0, where E[·] is the expectation operator. Since in (2) the bubble Xt denotes the difference between the observed price and the fundamental price, the 'bubble' regimes refer to the cases when Xt explodes exponentially under the action of successive multiplications by factors at, at+1, . . ., with a majority of them larger than 1 in absolute value but different, thus adding a stochastic component to the standard model of Blanchard and Watson (1982). For this large class of stochastic processes, Lux and Sornette (1999) have shown that the distribution of returns is a power law whose exponent µ is enforced by the no-free-lunch condition to remain lower than one. Although power-law tails are a pervasive feature of empirical data, these characterizations are in strong disagreement with the usual empirical estimates which find µ ≈ 3 (de Vries 1994, Lux 1996, Pagan 1996, Guillaume et al 1997, Gopikrishnan et al 1998). Thus, Lux and Sornette (1999) concluded that exogenous rational bubbles are hardly compatible with the empirical distribution data.
At this stage, one could argue that there is a logical trap in the finding µ < 1: how could people be rational in the Blanchard–Watson sense, that is, using expected values, in an infinite-expectation framework? Actually, we stress that the condition µ < 1 is not incompatible with the rational expectation condition for the following reason. Under the rational expectation condition, the best estimation of the price Xt+1 of an asset at time t + 1 viewed from time t is given by the expectation of Xt+1 conditioned upon the knowledge of the filtration {Ft} (i.e. the sum of all available information accumulated) up to time t: E[Xt+1 | Ft]. Therefore, in the Blanchard–Watson model, expectations are taken conditioned upon the knowledge of the filtration {Ft}, which means that E[Xt+1 | Ft] is taken with Xt given. In the model (2), E[Xt+1 | Ft] thus involves only expectations of the coefficients at and bt, which are perfectly defined, and, given Xt, E[Xt+1 | Ft] is always finite whatever the value of the exponent µ < 1. In other words, the logical trap is avoided by realizing that the rational expectation condition of the Blanchard–Watson model involves conditional expectations, which are always defined and finite, and not unconditional expectations (which would lead to infinities). What is the origin of this difference with the empirical value of µ? What is wrong in the rational expectation bubble model? Should one reject it completely or is it possible to 'cure' it by a suitable and reasonable extension? To make progress and address these questions, we note that the rational bubbles model relies on four main assumptions:
1. the rationality of the agents,
2. the no-arbitrage condition,
3. the independence of the bubbles from assets to assets, and
4. the autoregressive dynamics assumed for the bubble.
In the next section (section 2), we examine the strengths and weaknesses of these different hypotheses. This leads us to propose conservatively that the first two are quite reasonable while the last two can be modified. Section 3 introduces a multivariate generalization of the standard bubble model to account for possible dependences between the bubbles of different assets. Section 4 proves that the same tail exponent µ characterizes the distribution of prices, price variations and returns for all assets. In addition, the rational expectation and the no-arbitrage conditions are found to impose again µ < 1, as for the one-dimensional case discussed by Lux and Sornette (1999). Section 5 concludes that retrieving a reasonable value µ ≈ 3 thus requires an extension of the bubble dynamics to nonlinear behaviour or a partial relaxation of the rational pricing principle.
2. Assumptions of the Blanchard and Watson model

2.1. Rationality of the agents and the no-free-lunch condition

The Blanchard and Watson model assumes that the economic agents act rationally. This means that their expectations are based on the best predictor, given all the available information. Thus, at time t, the best predictor P̂_{t+1} of the price P_{t+1} of an asset at time t+1 is

P̂_{t+1} = E[P_{t+1} | F_t],   (3)
where F_t denotes the available information at time t (the filtration up to time t). The concept of a rational agent is very useful to build mathematical models of behaviour. But, as stressed in recent studies, investors are not fully rational, or have only bounded rationality. Indeed, most agents have only limited abilities with which to analyse the limited information they are able to collect. Therefore, they cannot perform an optimal prediction, and consequently their expectations cannot be fully rational. Moreover, behavioural and psychological mechanisms, such as herding, may be important in the shaping of market prices (Thaler 1993, Shefrin 2000, Shleifer 2000). Notwithstanding these limitations, it is a well-known fact that, for liquid assets, dynamic investment strategies rarely perform better than buy-and-hold strategies (Malkiel 1999). In other words, the market is not far from being efficient, and few arbitrage opportunities exist as a result of the constant search for gains by sophisticated investors. Thus, notwithstanding the limited abilities of the investors and their not-fully-rational behaviour, the market behaves almost as if the collective effect of these agents were equivalent to the impact of fully rational investors. The emergence of such behaviours at the macroscopic scale from the collective effect of many agents is known to occur in simple agent-based models (see
2.1. Multi-dimensional rational bubbles and fat tails (Y Malevergne and D Sornette, Quantitative Finance)
for instance Hommes (2001) and Challet et al (2001) and references therein). Following many others before us, we conclude that the hypotheses of rational expectations and of the no-arbitrage condition are useful approximations, or starting points, for model construction. Within this framework, the price P_t of an asset at time t should obey the equation

P_t = δ · E_Q[P_{t+1} | F_t] + d_t,   ∀ {P_t}, t ≥ 0,   (4)
where d_t is an exogenous 'dividend', δ is the discount factor and E_Q[P_{t+1} | F_t] is the expectation of P_{t+1} conditioned upon the knowledge of the filtration up to time t under the risk-neutral probability Q. It is important to stress that the rationality of both expectations and behaviour does not necessarily imply that the price of an asset be equal to its fundamental value. In other words, there can be rational deviations of the price from this value, called rational bubbles. A rational bubble can arise when the actual market price depends positively on its own expected rate of change. This is thought to sometimes occur in asset markets and constitutes the very mechanism underlying the models of Blanchard (1979) and Blanchard and Watson (1982). Indeed, the 'forward' solution of (4) is well known to be

F_t = Σ_{i=0}^{+∞} δ^i · E_Q[d_{t+i} | F_t].   (5)
It is straightforward to check by replacement that the general solution of (4) is the sum of the forward solution (5) and of an arbitrary component X_t:

P_t = F_t + X_t,   (6)

where X_t has to obey the single condition

X_t = δ · E_Q[X_{t+1} | F_t].   (7)
With only two fundamental and quite reasonable assumptions, it is thus possible to derive an equation (6) (with (7)) justifying the existence of price fluctuations and deviations from the fundamental value Ft , for equilibrium markets with rational agents. However, the dynamics of the bubbles remains unknown and is a priori completely arbitrary apart from the no-arbitrage constraint (7).
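As a concrete check of the decomposition (6) with (7), the following sketch verifies numerically that P_t = F_t + X_t satisfies the pricing equation (4). The constant dividend, the one-step bubble dynamics X_{t+1} = a_t X_t with E[a_t] = 1/δ, and all parameter values are illustrative choices, not specified by the paper:

```python
import numpy as np

# Numerical sketch (illustrative choices, not from the paper): with a constant
# dividend d, the forward solution (5) is F = d/(1 - delta); adding a bubble
# X_{t+1} = a_t * X_t with E[a_t] = 1/delta (so that (7) holds), the price
# P_t = F + X_t still satisfies P_t = delta * E[P_{t+1}|F_t] + d, i.e. (4).
rng = np.random.default_rng(42)
delta, d = 0.95, 1.0
F = d / (1.0 - delta)                   # fundamental value, here 20.0

X_t = 3.0                               # current bubble level (arbitrary)
a = rng.lognormal(mean=np.log(1.0 / delta), sigma=0.2, size=10**6)
a *= (1.0 / delta) / a.mean()           # enforce E[a_t] = 1/delta exactly
E_next_price = F + a.mean() * X_t       # E[P_{t+1}|F_t] = F + X_t/delta

print(abs(delta * E_next_price + d - (F + X_t)) < 1e-9)  # True: (4) holds
```

The check is exact up to floating-point error because δ · (F + X_t/δ) + d = δF + d + X_t = F + X_t.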
2.2. Bubble dynamics

One of the main contributions of the model of Blanchard and Watson (1982), and of its possible generalizations, is to propose a bubble dynamics which is both compatible with most of the empirical stylized facts of price time series and sufficiently simple to allow for a tractable analytical treatment. Indeed, the stochastic autoregressive process

X_{t+1} = a_t X_t + b_t,   (8)

where {a_t} and {b_t} are i.i.d. random variables, is well known to lead to fat-tailed distributions and volatility clustering. Kesten (1973) (see Goldie (1991) for a modern extension) has shown that, if E[ln |a|] < 0, the stochastic process {X_t} admits a stationary solution (i.e. with a stationary distribution function), whose distribution density P(X) is a power law P(X) ∼ X^{-1-µ}, with exponent µ satisfying

E[|a|^µ] = 1,   (9)
provided that E[|b|^µ] < ∞. It is easy to show that, without further constraints, every exponent µ > 0 can be reached. Consider, for example, a sequence of variables {a_t} independently and identically distributed according to a lognormal law with localization parameter a_0 < 1 and scale parameter σ. Equation (9) then simply yields

µ = −2 ln a_0 / σ²,

which shows that, varying σ, µ can range over the entire positive real line. The second interesting point is that the process (8) allows for volatility clustering: a large value of X_t will be followed by a large value of X_{t+1} with large probability. The change of variables X_t = Y_t², with a_t = v Z_t² and b_t = u Z_t², maps the process (8) exactly onto an ARCH(1) process

Y_{t+1} = Z_t √(u + v Y_t²),   (10)

where Z_t is a Gaussian random variable. The ARCH processes are known to account for volatility clustering; therefore, the process (8) also exhibits volatility clustering. This class (8) of stochastic processes thus reproduces several interesting features of real price series (Roman et al 2001). There is, however, an objection related to the fact that, without additional assumptions, the bubble price can become arbitrarily negative and can then lead to a negative price P_t < 0. In fact, a negative price is not as meaningless as often taken for granted, as shown in Sornette (2000). But, even without allowing for negative prices, it is reasonable to argue that other mechanisms come into play near P_t = 0 and modify equation (8) in the neighbourhood of a vanishing price. For instance, when a market undergoes too strong and abrupt a loss, quotations are interrupted.
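Kesten's condition (9) and the lognormal formula for µ can be checked by a quick Monte Carlo sketch (the parameter values a_0 = 0.9 and σ = 0.3 are illustrative choices, not from the paper):

```python
import numpy as np

# Monte Carlo sketch of Kesten's condition (9) for lognormal multipliers
# a = a0 * exp(sigma * Z), Z standard normal (parameter values illustrative):
# the formula mu = -2*ln(a0)/sigma**2 should satisfy E[a^mu] = 1, while
# E[ln a] = ln(a0) < 0 guarantees a stationary solution of the process (8).
rng = np.random.default_rng(0)
a0, sigma = 0.9, 0.3
mu = -2.0 * np.log(a0) / sigma**2         # predicted tail exponent, ~ 2.34

a = a0 * np.exp(sigma * rng.standard_normal(10**6))
print(np.mean(np.log(a)) < 0)             # True: stationarity condition holds
print(round(np.mean(a**mu), 2))           # 1.0: E[|a|^mu] = 1 as required
```

The closed form follows from E[a^µ] = a_0^µ exp(µ²σ²/2), which equals 1 exactly when µ = −2 ln a_0 / σ².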
2.3. The independent bubbles assumption

The Blanchard and Watson model assumes that there is only one asset and one bubble; in other words, the evolution of each asset does not depend on the dynamics of the others. But, in reality, there is no such thing as an isolated asset. Stock markets exhibit a variety of inter-dependences, based in part on the mutual influences between the US, European and Japanese markets. In addition, individual stocks may be sensitive to the behaviour of the specific industry to which they belong as a whole, and to a few other indicators, such as the main indices, interest rates and so on. Mantegna (1999) and Bonanno et al (2001) have indeed shown the existence of a hierarchical organization of stock interdependences. Furthermore, bubbles often do not appear to be isolated features of a set of markets. For instance, Flood et al (1984) tested whether a bubble simultaneously existed across the nations, such as Germany, Poland and Hungary, that experienced hyperinflation in the early 1920s. Coordinated corrections to what may be considered to be correlated bubbles can sometimes be detected. One of the most prominent examples
is found in the market appreciations observed in many of the world markets prior to the world market crash in October 1987 (Barro et al 1989). The collective growth of most of the markets worldwide was interrupted by a worldwide market crash: from the opening on 14 October 1987 through the market close on 19 October, major indices of market valuation in the United States declined by 30% or more. Furthermore, all major markets in the world declined substantially in that month: out of 23 major industrial countries, 19 had a decline greater than 20%. This is one of the most striking pieces of evidence of the existence of correlations between corrections to bubbles across the world markets. Similar intermittent coordination of bubbles has been detected among the significant bubbles followed by large crashes or severe corrections in Latin-American and Asian stock markets (Johansen and Sornette 2000). These empirical facts suggest improving on the one-dimensional bubble model by introducing a multi-dimensional generalization.
2.4. Position of the problem

As shown in Lux and Sornette (1999), the one-asset model (8) suffers from a deficiency: the power-law tail exponents predicted by the Blanchard and Watson model are not compatible with the empirical facts. The proof relies on the following ingredients: the no-free-lunch condition and the one-dimensional dynamics of the bubble. The one-dimensional dynamics of the bubble implies that equation (9) holds, while the no-free-lunch condition imposes

E[a] > 1.   (11)

Putting these two equations together allowed Lux and Sornette (1999) to conclude that necessarily µ < 1, while extensive empirical studies have shown that µ ≈ 3 (de Vries 1994, Lux 1996, Pagan 1996, Guillaume et al 1997, Gopikrishnan et al 1998). When confronted with such a disagreement between the model's prediction and empirical data, it is natural to ask which of the hypotheses underlying Blanchard and Watson's model is wrong. As previously discussed, we keep the rationality and no-free-lunch hypotheses. It remains to decide whether the inconsistency of the model comes from its dynamics or from the assumed independence of the assets. Empirical evidence suggests that asset price series are not independent. It thus seems reasonable to investigate whether the discrepancies between theoretical and empirical facts come from this assumption of independence of the bubbles in different assets. It is this hypothesis we propose to test here. To this end, we will extend the Blanchard and Watson model to the multivariate case to account for possible coupling between bubbles. Then, using the renewal theory for products of random matrices, we will check whether the implications of this new model are in agreement with empirical facts.
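The scalar incompatibility can be made concrete with the lognormal family used in section 2.2. The sketch below (parameter values are illustrative choices, not from the paper) imposes the no-free-lunch value E[a] = 1/δ > 1 and computes the resulting tail exponent, which stays below 1 for every admissible σ:

```python
import numpy as np

# Scalar sketch of the Lux-Sornette incompatibility: for lognormal a with
# E[a] = a0*exp(sigma**2/2) = 1/delta > 1 (no-free-lunch), the exponent
# solving E[a^mu] = 1 is mu = 1 - 2*ln(1/delta)/sigma**2 < 1, whatever sigma.
# Stationarity (E[ln a] = ln(a0) < 0) requires sigma**2 > 2*ln(1/delta).
delta = 0.95
mus = {}
for sigma in (0.5, 1.0, 2.0):
    a0 = (1.0 / delta) * np.exp(-sigma**2 / 2)   # so that E[a] = 1/delta
    mus[sigma] = -2.0 * np.log(a0) / sigma**2    # solves E[a^mu] = 1
print({s: round(m, 3) for s, m in mus.items()})  # 0.590, 0.897, 0.974: all < 1
```

As σ grows, µ approaches 1 from below but never reaches the empirical value µ ≈ 3, which is the content of the one-dimensional result.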
3. Generalization of rational bubbles to arbitrary dimensions

3.1. Generalization to several coupled assets

In the case of several assets, rational pricing theory again dictates that the fundamental price of each individual asset is given by a formula like (4), where the specific dividend flow of each asset is used, with the same discount factor. The corresponding forward solution (5) is again valid for each asset. The general solution for each asset is (6), with a bubble component X_t different from one asset to the next. The different bubble components can be coupled, as we shall see, but they must each obey the condition (7), component by component. This imposes specific conditions on the coupling terms, as we will show. Following this reasoning, we can therefore propose the simplest generalization of a bubble into a 'two-dimensional' bubble for two assets X and Y, with bubble prices respectively equal to X_t and Y_t at time t. We propose the following generalization of the Blanchard–Watson model:

X_{t+1} = a_t X_t + b_t Y_t + η_t,   (12)
Y_{t+1} = c_t X_t + d_t Y_t + ε_t,   (13)

where a_t, b_t, c_t and d_t are drawn from some multivariate probability density function. The two additive noises η_t and ε_t are also drawn from some distribution function with zero mean. The diagonal case b_t = c_t = 0 for all t recovers the previous one-dimensional case with two uncoupled bubbles, provided a_t, d_t, η_t and ε_t are independent. Rational expectations require that X_t and Y_t both obey the 'no-free-lunch' condition (7), i.e. δ · E_Q[X_{t+1} | F_t] = X_t and δ · E_Q[Y_{t+1} | F_t] = Y_t. With (12) and (13), this gives

(E_Q[a_t] − δ^{-1}) X_t + E_Q[b_t] Y_t = 0,   (14)
E_Q[c_t] X_t + (E_Q[d_t] − δ^{-1}) Y_t = 0,   (15)

where we have used E_Q[η_t] = E_Q[ε_t] = 0. The two equations (14) and (15) must hold at all times, i.e. for all values of X_t and Y_t visited by the dynamics. This imposes E_Q[b_t] = E_Q[c_t] = 0 and E_Q[a_t] = E_Q[d_t] = δ^{-1}. We are going to retrieve this result more formally in the general case.
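These constraints can be checked by Monte Carlo on the two-asset model (12)-(13). The coefficient distributions below are illustrative choices, not the paper's; the point is only that zero-mean couplings and diagonal means 1/δ make the martingale condition hold:

```python
import numpy as np

# Monte Carlo sketch of the constraints derived from (14)-(15): with
# E[b_t] = E[c_t] = 0 and E[a_t] = E[d_t] = 1/delta, the coupled bubbles
# (12)-(13) satisfy the no-free-lunch condition delta*E[X_{t+1}|F_t] = X_t.
# All coefficient laws below are illustrative assumptions.
rng = np.random.default_rng(1)
delta, n = 0.95, 10**6
Xt, Yt = 2.0, -1.5                      # current bubble values (given at t)

a = rng.normal(1.0 / delta, 0.2, n)     # E[a_t] = 1/delta
d = rng.normal(1.0 / delta, 0.2, n)     # E[d_t] = 1/delta
b = rng.normal(0.0, 0.3, n)             # zero-mean off-diagonal couplings
c = rng.normal(0.0, 0.3, n)
eta = rng.normal(0.0, 0.1, n)           # zero-mean additive noises
eps = rng.normal(0.0, 0.1, n)

X_next = a * Xt + b * Yt + eta          # equation (12)
Y_next = c * Xt + d * Yt + eps          # equation (13)
print(round(delta * X_next.mean(), 2))  # 2.0  = X_t
print(round(delta * Y_next.mean(), 2))  # -1.5 = Y_t
```

Note that the couplings b_t, c_t still create dependence between the two bubbles at each step, even though their averages vanish.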
3.2. General formulation As discussed previously (see Flood et al (1984), Barro et al (1989), Johansen and Sornette (2000) and references therein), it is reasonable to introduce some dependence in the bubble dynamics of multiple assets. With respect to its calibration to empirical data, it is important to specify the scale at which this dependence is introduced. The dependence between assets can be introduced at the level of an industrial sector, at the level of a whole market, at the level of a country or even at the highest world-wide level. The economic analysis and implications are distinct in each case. Here, we shall avoid a specific discussion since our goal is to derive a general result which is fundamentally independent of these additional ingredients that can be introduced in the model.
A generalization of the two-dimensional case to arbitrary dimensions leads to the following stochastic random equation (SRE):

X_t = A_t X_{t-1} + B_t,   (16)

where (X_t, B_t) are d-dimensional vectors. Each component of X_t can be thought of as the difference between the price of an asset and its fundamental price. The matrices (A_t) are independent and identically distributed d × d stochastic matrices. We assume that the B_t are independent and identically distributed random vectors and that (X_t) is a causal stationary solution of (16). Generalizations introducing additional arbitrary linear terms at larger time lags, such as X_{t-2}, ..., can be treated with slight modifications of our approach and yield the same conclusions. We shall thus confine our demonstration to the SRE of order 1, keeping in mind that our results apply analogously to regressions of arbitrary order. To formalize the SRE in a rigorous manner, we introduce in a standard way the probability space (Ω, F, P) and a filtration (F_t). Here P represents the product measure P = P_X ⊗ P_A ⊗ P_B, where P_X, P_A and P_B are the probability measures associated with {X_t}, {A_t} and {B_t}. We further assume, as is customary, that the stochastic process (X_t) is adapted to the filtration (F_t). In the following, we denote by |·| the Euclidean norm and by ||·|| the corresponding operator norm of any d × d matrix A:

||A|| = sup_{|x|=1} |Ax|.   (17)
Now, we will formalize the 'no-free-lunch' condition for the SRE (16) and show that it entails in particular that the spectral radius (largest eigenvalue) of E[A_t] must be equal to the inverse of the discount factor, hence larger than 1.

3.3. The no-free-lunch condition

3.3.1. No-free-lunch condition under the risk-neutral probability measure. The 'no-free-lunch' condition is equivalent to the existence of a probability measure Q, equivalent to P, such that, for every self-financing portfolio Π_t, the discounted process S_{0,t}^{-1} Π_t is a Q-martingale, where S_{0,t} = ∏_{i=0}^{t-1} δ_i^{-1}, δ_i = (1 + r_i)^{-1} is the discount factor for period i and r_i is the corresponding risk-free interest rate. It is natural to assume that, for a given period i, the discount rates r_i are the same for all assets; in frictionless markets, a deviation from this hypothesis would lead to arbitrage opportunities. Furthermore, since the sequence of matrices {A_t} is i.i.d. and therefore stationary, δ_t and r_t must be constant and equal respectively to δ and r. Under those conditions, we have the following proposition.

Proposition 1. The stochastic process

X_t = A_t X_{t-1} + B_t   (18)

satisfies the no-arbitrage condition if and only if

E_Q[A] = (1/δ) I_d.   (19)

The proof is given in appendix A. The condition (19) imposes some stringent constraints on admissible matrices A_t. Indeed, while the A_t are not diagonal in general, their expectation must be diagonal. This implies that the off-diagonal terms of the matrices A_t must take negative values sufficiently often for their averages to vanish. The off-diagonal coefficients quantify the influence of other bubbles on a given one; the condition (19) thus means that the average effect of other bubbles on any given one must vanish. It is straightforward to check that, in this linear framework, this implies an absence of correlation (but not of dependence) between the different bubble components: E[X^{(k)} X^{(ℓ)}] = 0 for any k ≠ ℓ, where X^{(j)} denotes the jth component of the bubble X. In contrast, the diagonal elements of A_t must be mostly positive in order for E_P[A_ii] = δ^{-1}, for all i, to hold true. In fact, on economic grounds, we can exclude the cases where the diagonal elements take negative values: a negative value of A_ii at a given time t would imply that X_t^{(i)} might abruptly change sign between t−1 and t, which does not seem to be a reasonable financial process.

3.3.2. Consequence for the no-free-lunch condition under the historical probability measure. The historical P and risk-neutral Q probability measures are equivalent. This means that there exists a non-negative function h(θ) = (h_ij(θ_ij)) such that, for each element indexed by i, j, we have

E_P[A_ij] = E_Q[h_ij · A_ij]   (20)
          = h_ij(θ_ij⁰) · E_Q[A_ij] for some θ_ij⁰ ∈ R.   (21)

The second equation comes from the well-known result

∫ f(θ) g(θ) dµ(θ) = g(θ₀) ∫ f(θ) dµ(θ) for some θ₀ ∈ R.   (22)

We thus get

E_P[A_ij] = 0, if i ≠ j,   (23)
E_P[A_ii] = 1/δ^{(i)},   (24)

where the δ^{(i)} can be different. We can thus write

E_P[A] = δ̃^{-1},   (25)

where δ̃^{-1} = diag[δ^{(1)-1}, ..., δ^{(d)-1}]. Appendix B gives a proof showing that δ^{(i)} is indeed the genuine discount factor for the ith bubble component.
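The conclusion of this section can be stated in two lines of linear algebra. In the sketch below, the per-asset discount factors are illustrative values, not from the paper; the point is that the diagonal matrix (25) always has spectral radius max_i 1/δ^{(i)} > 1:

```python
import numpy as np

# Sketch of the conclusion of section 3.3 (discount factors are illustrative
# values): under P, equations (23)-(25) force E_P[A] to be the diagonal matrix
# diag(1/delta^(1), ..., 1/delta^(d)), whose spectral radius exceeds 1.
deltas = np.array([0.97, 0.95, 0.92])        # per-asset discount factors < 1
EA = np.diag(1.0 / deltas)                   # E_P[A], equation (25)
rho = np.max(np.abs(np.linalg.eigvals(EA)))  # spectral radius
print(rho > 1.0)                             # True: rho = 1/min(deltas) > 1
```

It is this spectral-radius bound, combined with proposition 2 below, that drives the main result of the paper.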
4. Renewal theory for products of random matrices In the following, we will consider that the random d × d matrices At are invertible matrices with real entries. We will denote by GLd (R) the group of these matrices.
4.1. Definitions

Definition 1 (Feasible matrix). A matrix M ∈ GL_d(R) is P-feasible if there exist an n ∈ N and M_1, ..., M_n ∈ supp(P) such that M = M_1 ⋯ M_n, and if M has a simple real eigenvalue q(M) which, in modulus, exceeds all other eigenvalues of M.

Definition 2. For any matrix M ∈ GL_d(R), with M' its transpose, M M' is a symmetric positive definite matrix. We define λ(M) as the square root of the smallest eigenvalue of M M'.

4.2. Theorem

We extend theorem 2.7 of Davis et al (1999), which synthesized Kesten's theorems 3 and 4 in Kesten (1973), to the case of real-valued matrices. The proof of this theorem is given in Le Page (1983). We stress that the conditions listed below do not require the matrices (A_n) to be non-negative. Actually, we have seen that, in order for the rational expectation condition not to lead to trivial results, the off-diagonal coefficients of (A_n) have to be negative with sufficiently large probability, such that their means vanish.

Theorem 1. Let (A_n) be an i.i.d. sequence of matrices in GL_d(R) satisfying the following set of conditions:

H1: for some ε > 0, E_{P_A}[||A||^ε] < 1;

H2: for every open U ⊂ S^{d-1} (the unit sphere in R^d) and for all x ∈ S^{d-1}, there exists an n such that

Pr{ x A_1 ⋯ A_n / ||x A_1 ⋯ A_n|| ∈ U } > 0;   (26)

H3: the group {ln |q(M)|, M is P_A-feasible} is dense in R;

H4: for all r ∈ R^d, Pr{A_1 r + B_1 = r} < 1;

H5: there exists κ_0 > 0 such that

E_{P_A}([λ(A_1)]^{κ_0}) ≥ 1;   (27)

H6: with the same κ_0 > 0 as in the previous condition, there exists a real number u > 0 such that E_{P_A}[||A_1||^{κ_0} ln⁺||A_1||] < ∞ and E[|B_1|^{κ_0+u}] < ∞.

Then there exists a unique κ_1 ∈ (0, κ_0] such that

lim_{n→∞} (1/n) ln E_{P_A}[||A_n ⋯ A_1||^{κ_1}] = 0,   (28)

and the stationary solution (X_t) of (16) has a distribution whose tail is asymptotically a power law with exponent κ_1:

Pr{⟨x, X⟩ > u} ∼ C(x) u^{-κ_1} as u → ∞, for all x ∈ S^{d-1}.   (29)

4.3. Comments on the theorem

4.3.1. Intuitive meaning of the hypotheses. A suitable property for an economic model is the existence of a stationary solution, i.e. the solution X_t of the SRE (16) should not blow up. This condition is ensured by hypothesis (H1). Indeed, E_{P_A}[ln ||A||] < 0 implies that the Lyapunov exponent of the sequence {A_n} of i.i.d. matrices is negative (Davis et al 1999), and it is well known that the negativity of the Lyapunov exponent is a sufficient condition for the existence of a stationary solution X_t, provided that E[ln⁺ ||B||] < ∞. However, this condition alone can lead to too fast a decay of the tail of the distribution of {X}. This phenomenon is prevented by (H5), which means intuitively that the multiplicative factors given by the elements of A_t sometimes produce an amplification of X_t. In the one-dimensional bubble case, this condition reduces to the simple rule that a_t must sometimes be larger than 1 so that a bubble can develop; otherwise, the power-law tail is replaced by an exponential tail. So, (H1) and (H5) keep the balance between two opposite objectives: to obtain a stationary solution and to observe a fat-tailed distribution for the process (X_t). Another desirable property of the model is ergodicity: we expect the price process (X_t) to explore the entire space R^d. This is ensured by hypotheses (H2) and (H4): hypothesis (H2) allows X_t to visit the neighbourhood of any point in R^d, and (H4) forbids the trajectory to be trapped at some points. Hypotheses (H3) and (H6) are more technical ones. Hypothesis (H6) simply ensures that the tails of the distributions of A_t and B_t are thinner than the tail created by the SRE (16), so that the observed tail index is really due to the dynamics of the system and not to the heavy-tail nature of the distributions of A_t or B_t. Hypothesis (H3) expresses some kind of aperiodicity condition.

Proposition 2. A necessary condition for κ_1 > 1 is that the spectral radius of E_{P_A}[A] be smaller than 1:

κ_1 > 1 ⇒ ρ(E_{P_A}[A]) < 1.   (30)

The proof is given in appendix C. This proposition, put together with proposition 1 above, will allow us to derive our main result.
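The logic of that proof is easiest to see in the scalar case, where the function f of appendix C is available in closed form. The sketch below (a scalar illustration with assumed lognormal parameters, not code from the paper) shows that f is convex with f(0) = 0 and f'(0) = E[ln a] < 0, and that f(1) = ln E[a], so κ_1 > 1 requires E[a] < 1, the scalar version of proposition 2:

```python
import numpy as np

# Scalar illustration (assumed parameters, not from the paper): for lognormal
# a = a0*exp(sigma*Z), f(kappa) = ln E[a^kappa] = kappa*ln(a0) + (kappa*sigma)**2/2
# in closed form. Its positive root kappa_1 = -2*ln(a0)/sigma**2 is the tail
# index, and f(1) = ln E[a], so kappa_1 > 1 holds exactly when E[a] < 1.
a0, sigma = 0.7, 0.6

def f(kappa):
    return kappa * np.log(a0) + 0.5 * (kappa * sigma) ** 2

kappa1 = -2.0 * np.log(a0) / sigma**2   # positive root of f, here ~ 1.98 > 1
Ea = a0 * np.exp(sigma**2 / 2)          # E[a], here < 1

print(abs(f(kappa1)) < 1e-12)           # True: kappa_1 is indeed a root of f
print((f(1.0) < 0) and (Ea < 1))        # True: f(1) < 0 because E[a] < 1
```

Conversely, forcing E[a] = 1/δ > 1 makes f(1) = ln E[a] > 0, which by convexity pins the root κ_1 below 1, exactly the mechanism exploited in section 5.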
5. Consequences for rational expectation bubbles

We have seen in section 3.3, from proposition 1, that, as a result of the no-arbitrage condition, the spectral radius of the matrix E_P[A] = (1/δ) I_d is greater than 1. As a consequence, by application of the converse of proposition 2, this proves that the tail index κ_1 of the distribution of (X) is smaller than 1. Using the same arguments as in Lux and Sornette (1999), which we do not recall here, it can be shown that the distribution of price differences and price returns follows, at least over an extended range of large returns, a power-law distribution whose exponent remains lower than 1. This result generalizes to arbitrary d-dimensional processes the result of Lux and Sornette (1999). As a consequence, d-dimensional rational expectation bubbles linking several assets suffer from the same discrepancy with empirical data as one-dimensional bubbles. It therefore appears that accounting for possible dependences between bubbles is not sufficient to cure the Blanchard and Watson model: a linear multi-dimensional bubble dynamics such as (16) is hardly reconcilable with some of the most fundamental stylized facts of financial data at a very elementary level. This result does not rely on the diagonal property of the matrices E[A_t] but only on the value of their spectral radius imposed by the no-arbitrage condition. In other words, the fact that the introduction of dependences between asset bubbles is not sufficient to cure the model can be traced to the constraint introduced by the no-arbitrage condition, which imposes that the averages of the off-diagonal terms of the matrices A_t vanish. As we indicated before, this implies zero correlation (but not the absence of dependence) between asset bubbles. It thus seems that the multi-dimensional generalization is constrained so much by the no-arbitrage condition that the multi-dimensional bubble model almost reduces to an average one-dimensional model.
With this insight, our present result generalizing that of Lux and Sornette (1999) is natural. To our knowledge, there are two possible remedies. The first one is based on the rational bubble and crash model of Johansen et al (1999, 2000), which abandons the linear stochastic dynamics (8) in favour of an essentially arbitrary nonlinear dynamics controlled by a crash hazard rate. A jump process for crashes is added to the process, with a crash hazard rate evolving with time such that the rational expectation condition is ensured. This model is squarely based on the rational expectation framework and shows that changing the dynamics of the Blanchard and Watson model allows satisfying results to be obtained, as the corresponding return distributions can be made to exhibit reasonable fat tails. The second solution (Sornette 2001) requires the existence of an average exponential growth of the fundamental price at some return rate r_f > 0 larger than the discount rate. With the condition that the price fluctuations associated with bubbles must on average grow with the mean market return r_f, it can be shown that the exponent of the power-law tail of the returns is no longer bounded by 1 as soon as r_f is larger than the discount rate r, and can take essentially arbitrary values. This second approach amounts to abandoning the rational pricing theory
(6) with (4), keeping only the no-arbitrage constraint (7) on the bubble component. This hypothesis may be true in the case of a firm, sometimes in the case of an industry (railways in the 19th century, oil and computers in the 20th, for instance), but is hard to defend in the case of an economy as a whole. The real long-run interest rate in the US is approximately 3.5%, and the real rate of growth of profits since World War II is about 2.1%. Thus, for the economy as a whole, the discounted sum is always finite. It would be interesting to investigate the interplay between the inter-dependence of several bubbles and this exponential growth model. As a final positive remark balancing our negative result, we stress that the stochastic multiplicative multi-asset rational bubble model presented here provides a natural mechanism for the existence of a 'universal' tail exponent µ across different markets.
Acknowledgments We acknowledge helpful discussions and exchanges with J P Laurent, T Lux, V Pisarenko and M Taqqu. We also thank T Mikosch for providing access to Le Page (1983) and V Pisarenko for a critical reading of the manuscript. Any remaining error is ours.
Appendix A. Proof of proposition 1 on the no-arbitrage condition

Let Π_t be the value at time t of any self-financing portfolio:

Π_t = W_t' X_t,   (A.1)

where W_t = (W_1, ..., W_d)' is the vector whose components are the weights of the different assets and the prime denotes the transpose. The no-free-lunch condition reads

Π_t = δ · E_Q[Π_{t+1} | F_t],   ∀ {Π_t}, t ≥ 0.   (A.2)

Therefore, for all self-financing strategies (W_t), we have

W_{t+1}' (E_Q[A] − (1/δ) I_d) X_t = 0,   ∀ X_t ∈ R^d,   (A.3)

where we have used the fact that (W_{t+1}) is (F_t)-measurable and that the sequence of matrices {A_t} is i.i.d. The strategy W_t = (0, ..., 0, 1, 0, ..., 0)' (1 in the ith position), for all t, is self-financing and implies

(a_{i1}, a_{i2}, ..., a_{ii} − 1/δ, ..., a_{id}) · (X_t^{(1)}, X_t^{(2)}, ..., X_t^{(i)}, ..., X_t^{(d)})' = 0,   (A.4)

for all X_t ∈ R^d, where we have called a_{ij} the (i, j)th coefficient of the matrix E_Q[A]. As a consequence,

(a_{i1}, a_{i2}, ..., a_{ii} − 1/δ, ..., a_{id}) = 0,   ∀ i,   (A.5)

and

E_Q[A] = (1/δ) I_d.   (A.6)
The no-arbitrage condition thus implies E_Q[A] = (1/δ) I_d. We now show that the converse is true, namely that if E_Q[A] = (1/δ) I_d holds, then the no-arbitrage condition is verified. Let us thus assume that E_Q[A] = (1/δ) I_d. Then

E_Q[Π_{t+1} | F_t] = E_Q[W_{t+1}' X_{t+1} | F_t]   (A.7)
 = W_{t+1}' · E_Q[X_{t+1} | F_t]   (A.8)
 = W_{t+1}' · E_Q[A_{t+1} X_t + B_{t+1} | F_t]   (A.9)
 = W_{t+1}' · E_Q[A_{t+1} | F_t] · X_t   (A.10)
 = W_{t+1}' · E_Q[A] · X_t   (A.11)
 = (1/δ) W_{t+1}' · X_t.   (A.12)

The condition that the portfolio is self-financing is W_{t+1}' X_t = W_t' X_t, which means that the weights can be rebalanced a priori arbitrarily between the assets with the constraint that the total wealth at the same time remains constant. We can thus write

E_Q[Π_{t+1} | F_t] = (1/δ) W_t' X_t   (A.13)
 = (1/δ) Π_t.   (A.14)

Therefore, the discounted process {Π_t} is a Q-martingale.

Appendix B. Proof that δ^{(i)} is the discount factor for the ith bubble component in the historical space

Here, we express the no-free-lunch condition in the historical space (or real space). The condition we will obtain is the so-called 'rational expectation condition', which is a little less general than the condition detailed in the previous appendix A. Given the prices {X_k^{(i)}}_{k ≤ t} of an asset, labelled by i, until the date t, the best estimation of its price at t+1 is E_P[X_{t+1}^{(i)} | F_t]. So, the RE condition leads to

E_P[X_{t+1}^{(i)} | F_t] / X_t^{(i)} − 1 = r_t^{(i)},   (A.15)

where r_t^{(i)} is the return of the asset i between t and t+1. As previously, we will assume in what follows that r_t^{(i)} = r^{(i)} is time independent. Thus, the rational expectation condition for asset i reads

X_t^{(i)} = δ^{(i)} · E_{P_X}[X_{t+1}^{(i)} | F_t]   (A.16)
 = δ^{(i)} · E_P[X_{t+1}^{(i)} | F_t],   (A.17)

where δ^{(i)} is the discount factor. A priori, each asset has a different return. Thus, introducing the vector X̃_t whose ith component is X_t^{(i)} / δ^{(i)}, we can summarize the rational expectation condition as

X̃_t = E_P[X_{t+1} | F_t].   (A.18)

Again, we evaluate the conditional expectation of (16), and using the fact that the {A_t} are i.i.d., we have

E_P[X_t | F_{t-1}] = E_P[A] X_{t-1}.   (A.19)

This equation, together with (A.18), leads to

X̃_{t-1} = E_P[A] X_{t-1},   (A.20)

which can be rewritten

(E_P[A] − δ̃^{-1}) X_{t-1} = 0,   (A.21)

where δ̃^{-1} = diag[δ^{(1)-1}, ..., δ^{(d)-1}] is the matrix whose ith diagonal component is δ^{(i)-1} and 0 elsewhere. The equation (A.21) must be true for every X_{t-1} ∈ R^d, thus

E_P[A] = δ̃^{-1},   (A.22)

which is the result announced in section 3.3.2.

Appendix C. Proof of proposition 2 on the condition κ_1 < 1

First step. Behaviour of the function

f(κ) = lim_{n→∞} (1/n) ln E_{P_A}[||A_n ⋯ A_1||^κ]   (A.23)

in the interval [0, κ_0]. In the fourth step of the proof of theorem 3, Kesten (1973) shows that the function f has the following properties:

• f is continuous on [0, κ_0],
• f(0) = 0 and f(κ_0) > 0,
• f'(0) < 0 (this results from the stationarity condition),
• f is convex on [0, κ_0].

Thus, there is a unique solution κ_1 in (0, κ_0) such that f(κ_1) = 0. In order to have κ_1 > 1, it is necessary that f(1) < 0, or, using the definition of f,

lim_{n→∞} (1/n) ln E_{P_A}[||A_n ⋯ A_1||] < 0.   (A.24)

The qualitative shape of the function f(κ) is shown in figure A1.

Second step. The operator norm ||·|| is convex:

∀ α ∈ [0, 1] and for all d × d matrices A and C, ||α A + (1 − α) C|| ≤ α ||A|| + (1 − α) ||C||.   (A.25)

Thanks to Jensen's inequality, we have

E_{P_A}[||A_n ⋯ A_1||] ≥ ||E_{P_A}[A_n ⋯ A_1]||.   (A.26)

The matrices (A_n) being i.i.d., we obtain

E_{P_A}[||A_n ⋯ A_1||] ≥ ||(E_{P_A}[A])^n||.   (A.27)

Now, let us consider the normalized eigenvector x_max associated with the largest eigenvalue

λ_max ≡ ρ(E_{P_A}[A]),   (A.28)
Figure A1. Schematic shape of the function f(κ) defined in (A.23): f vanishes at 0 and at κ_1, with f(1) < 0 when κ_1 > 1.
where ρ(E_{P_A}[A]) is the spectral radius of E_{P_A}[A]. By definition,

||(E_{P_A}[A])^n|| ≥ |(E_{P_A}[A])^n x_max| = λ_max^n.   (A.29)

Then

lim_{n→∞} (1/n) ln E_{P_A}[||A_n ⋯ A_1||] ≥ lim_{n→∞} (1/n) ln ρ(E_{P_A}[A])^n = ln ρ(E_{P_A}[A]).   (A.30)

Now, suppose that ρ(E_{P_A}[A]) ≥ 1. We obtain

f(1) = lim_{n→∞} (1/n) ln E_{P_A}[||A_n ⋯ A_1||] ≥ 0,   (A.31)

which is in contradiction with the necessary condition (A.24). Thus,

f(1) < 0 ⇒ ρ(E_{P_A}[A]) < 1.   (A.32)
References

Adam M C and Szafarz A 1992 Speculative bubbles and financial markets Oxford Econ. Papers 44 626–40
Barro R J, Fama E F, Fischel D R, Meltzer A H, Roll R and Telser L G 1989 Black Monday and the Future of Financial Markets ed R W Kamphuis Jr, R C Kormendi and J W H Watson (Mid American Institute for Public Policy Research Inc. and Dow Jones–Irwin Inc.)
Blanchard O J 1979 Speculative bubbles, crashes and rational expectations Economics Lett. 3 387–9
Blanchard O J and Watson M W 1982 Bubbles, rational expectations and speculative markets Crisis in Economic and Financial Structure: Bubbles, Bursts, and Shocks ed P Wachtel (Lexington: Lexington Books)
Bonanno G, Lillo F and Mantegna R N 2001 High-frequency cross-correlation in a set of stocks Quantitative Finance 1 96–104
Camerer C 1989 Bubbles and fads in asset prices J. Econ. Surveys 3 3–41
Challet D, Chessa A, Marsili M and Zhang Y-C 2001 From Minority Games to real markets Quantitative Finance 1 168–76
Davis R A, Mikosch T and Basrak B 1999 Sample ACF of multivariate stochastic recurrence equations with applications to GARCH Technical Report University of Groningen (available at www.math.rug.nl/~mikosch)
de Vries C G 1994 Stylized facts of nominal exchange rate returns The Handbook of International Macroeconomics ed F van der Ploeg (Oxford: Blackwell) pp 348–89
Flood R P, Garber P M and Scott L O 1984 Multi-country tests for price level bubbles J. Econ. Dynamics Control 8 329–40
Goldie C M 1991 Implicit renewal theory and tails of solutions of random equations Ann. Appl. Prob. 1 126–66
Gopikrishnan P, Meyer M, Amaral L A N and Stanley H E 1998 Inverse cubic law for the distribution of stock price variations Eur. Phys. J. B 3 139–40
Guillaume D M, Dacorogna M M, Davé R R, Müller J A, Olsen R B and Pictet O V 1997 From the bird's eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets Finance Stochastics 1 95–129
Hommes C H 2001 Financial markets as nonlinear adaptive evolutionary systems Quantitative Finance 1 149–67
Johansen A, Ledoit O and Sornette D 2000 Crashes as critical points Int. J. Theor. Appl. Finance 3 219–55
Johansen A, Sornette D and Ledoit O 1999 Predicting financial crashes using discrete scale invariance J. Risk 1 5–32
Johansen A and Sornette D 2001 Bubbles and anti-bubbles in Latin American, Asian and Western stock markets: an empirical study Int. J. Theor. Appl. Finance at press (Johansen A and Sornette D 1999 Preprint cond-mat/9907270)
Kesten H 1973 Random difference equations and renewal theory for products of random matrices Acta Mathematica 131 207–48
Le Page E 1983 Théorèmes de renouvellement pour les produits de matrices aléatoires. Équations aux différences aléatoires Séminaires de probabilités, Rennes 1983 Sém. Math. Univ. Rennes I p 116
Lux T 1996 The stable Paretian hypothesis and the frequency of large returns: an examination of major German stocks Appl. Financial Economics 6 463–75
Lux T and Sornette D 1999 On rational bubbles and fat tails J. Money Credit Banking at press (Lux T and Sornette D 1999 Preprint cond-mat/9910141)
Malkiel B G 1999 A Random Walk Down Wall Street (New York: W W Norton & Company)
Mantegna R N 1999 Hierarchical structure in financial markets Eur. Phys. J. B 11 193–7
Pagan A 1996 The econometrics of financial markets J. Empirical Finance 3 15–102
Roman H E, Porto M and Giovanardi N 2001 Anomalous scaling of stock price dynamics within ARCH-models Eur. Phys. J. B 21 155–8
Shefrin H 2000 Beyond Greed and Fear: Understanding Behavioral Finance and the Psychology of Investing (Boston, MA: Harvard Business School Press)
Shleifer A 2000 Inefficient Markets: An Introduction to Behavioral Finance (Oxford: Oxford University Press)
Sornette D 2000 Stock market speculation: spontaneous symmetry breaking of economic valuation Physica A 284 355–75
Sornette D 2001 'Slimming' of power law tails by increasing market returns Preprint cond-mat/001011
Thaler R H (ed) 1993 Advances in Behavioral Finance (New York: Russell Sage Foundation)
2. Modèles phénoménologiques de cours
2.2 Des bulles rationnelles aux krachs

We study and generalize in various ways the model of rational bubbles introduced in the economic literature by Blanchard and Watson (1982). Bubbles are presented as equivalent to Goldstone modes of the fundamental price in the rational pricing equation, associated with the symmetry breaking tied to the existence of a non-zero dividend. Generalizing bubbles in terms of multiplicative stochastic processes, we summarize the result of Lux and Sornette (2002), according to which the no-arbitrage condition imposes that the tails of the return distributions are regularly varying, with tail index µ < 1. We then recall the main result of Malevergne and Sornette (2001a), which extends the rational bubble model to an arbitrary number d of dimensions: d financial time series are made linearly interdependent via a random coupling matrix. We derive the no-arbitrage condition in this context and, using the renewal theory for products of random matrices, we extend the theorem of Lux and Sornette (2002) and prove that the tails of the unconditional distributions of the returns of each asset are regularly varying, with the same tail index smaller than one. Although the asymptotic power-law behavior of the distribution tails is a salient feature of empirical data, a tail exponent smaller than one flatly contradicts the generally accepted empirical estimate, which is of the order of three. We then discuss two extensions of the rational bubble model that are consistent with the stylized facts.
Reprint from: D. Sornette and Y. Malevergne (2001), "From rational bubbles to crashes", Physica A 299, 40-59.
Physica A 299 (2001) 40–59
www.elsevier.com/locate/physa
From rational bubbles to crashes

D. Sornette^{a,b,∗}, Y. Malevergne^{a}

^{a} Laboratoire de Physique de la Matière Condensée, CNRS UMR 6622 and Université de Nice-Sophia Antipolis, 06108 Nice Cedex 2, France
^{b} Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, CA 90095, USA
Abstract

We study and generalize in various ways the model of rational expectation (RE) bubbles introduced by Blanchard and Watson in the economic literature. Bubbles are argued to be the equivalent of Goldstone modes of the fundamental rational pricing equation, associated with the symmetry breaking introduced by non-vanishing dividends. Generalizing bubbles in terms of multiplicative stochastic maps, we summarize the result of Lux and Sornette that the no-arbitrage condition imposes that the tail of the return distribution is hyperbolic with an exponent µ < 1. We then outline the main results of Malevergne and Sornette, who extend the RE bubble model to arbitrary dimensions d: a number d of market time series are made linearly interdependent via d × d stochastic coupling coefficients. We derive the no-arbitrage condition in this context and, with the renewal theory for products of random matrices applied to stochastic recurrence equations, we extend the theorem of Lux and Sornette to demonstrate that the tails of the unconditional distributions associated with such d-dimensional bubble processes follow power laws, with the same asymptotic tail exponent µ < 1 for all assets. The distribution of price differences and of returns is dominated by the same power law over an extended range of large returns. Although power-law tails are a pervasive feature of empirical data, the numerical value µ < 1 is in disagreement with the usual empirical estimates µ ≈ 3. We then discuss two extensions (the crash hazard rate model and the non-stationary growth rate model) of the RE bubble model that provide two ways of reconciliation with the stylized facts of financial data. © 2001 Elsevier Science B.V. All rights reserved.
1. The model of rational bubbles

∗ Corresponding author. Laboratoire de Physique de la Matière Condensée, CNRS UMR 6622 and Université de Nice-Sophia Antipolis, 06108 Nice Cedex 2, France. Fax: +33-4-92-07-67-54. E-mail addresses: [email protected] (D. Sornette), [email protected] (Y. Malevergne). © 2001 Elsevier Science B.V. All rights reserved. 0378-4371/01/$ - see front matter. PII: S0378-4371(01)00281-3

Blanchard [1] and Blanchard and Watson [2] originally introduced the model of rational expectations (RE) bubbles to account for the possibility, often discussed in
the empirical literature and by practitioners, that observed prices may deviate significantly and over extended time intervals from fundamental prices. While allowing for deviations from fundamental prices, rational bubbles keep a fundamental anchor point of economic modelling, namely that bubbles must obey the condition of rational expectations. In contrast, recent works stress that investors are not fully rational, or have at most bounded rationality, and that behavioral and psychological mechanisms, such as herding, may be important in the shaping of market prices [3–5]. However, for fluid assets, dynamic investment strategies rarely outperform simple buy-and-hold strategies [6]; in other words, the market is not far from being efficient and few arbitrage opportunities exist as a result of the constant search for gains by sophisticated investors. Here, we shall work within the conditions of rational expectations and of no-arbitrage, taken as useful approximations. Indeed, the rationality of both expectations and behavior often does not imply that the price of an asset be equal to its fundamental value. In other words, there can be rational deviations of the price from this value, called rational bubbles. A rational bubble can arise when the actual market price depends positively on its own expected rate of change, as sometimes occurs in asset markets, which is the mechanism underlying the models of Refs. [1,2].

In order to avoid the unrealistic picture of ever-increasing deviations from fundamental values, Blanchard [2] proposed a model with periodically collapsing bubbles in which the bubble component of the price follows an exponential explosive path (the price being multiplied by a_t = ā > 1) with probability π and collapses to zero (the price being multiplied by a_t = 0) with probability 1 − π. It is clear that, in this model, a bubble has an exponential distribution of lifetimes with a finite average lifetime 1/(1 − π). Bubbles are thus transient phenomena. The condition of rational expectations imposes that ā = 1/(πδ), where δ is the discount factor. In order to allow for the start of new bubbles after the collapse, a stochastic zero-mean normally distributed component b_t is added to the systematic part of X_t. This leads to the following dynamical equation

X_{t+1} = a_t X_t + b_t ,   (1)

where, as we said, a_t = ā with probability π and a_t = 0 with probability 1 − π. Both variables a_t and b_t do not depend on the process X_t. There is a huge literature on theoretical refinements of this model and on the empirical detectability of RE bubbles in financial data (see Refs. [7,8] for surveys of this literature). Model (1) has also been explored in a large variety of contexts, for instance in ARCH processes in econometrics [9], 1D random-field Ising models [10] using Mellin transforms, and more recently using extremal properties of the G-harmonic functions on non-compact groups [11] and the Wiener–Hopf technique [12]. See also Ref. [13] for a short review of other domains of application, including population dynamics with external sources, epidemics, immigration and investment portfolios, the internet, directed polymers in random media, etc.

Large |X_k| are generated by intermittent amplifications resulting from the multiplication by several successive values of |a| larger than one. We now offer a simple "mean-field" type argument that clarifies the origin of the power-law fat tail. Let us
call p> the probability that the absolute value of the multiplicative factor a is found larger than 1. The probability to observe n successive multiplicative factors |a| larger than 1 is thus p>^n. Let us call |a>| the average of |a| conditioned on being larger than 1: |a>| is thus the typical absolute value of the amplification factor. When n successive multiplicative factors occur with absolute values larger than 1, they typically lead to an amplification of the amplitude of X by |a>|^n. Using the fact that the additive term b_k ensures that the amplitude of X_k remains of the order of the standard deviation or of other measures of the typical scale σ_b of the distribution P_b(b) when the multiplicative factors |a| are less than 1, this shows that a value of X_k of the order of |X| ≈ σ_b |a>|^n occurs with probability

p>^n = exp(n ln p>) ≈ exp[ (ln(|X|/σ_b)/ln |a>|) ln p> ] = 1/(|X|/σ_b)^µ ,   (2)

with µ = −ln p>/ln |a>|, which can be rewritten as p> |a>|^µ = 1. Note the similarity between this last "mean-field" equation and the exact solution (8) given below. The power-law distribution is thus the result of an exponentially small probability of creating an exponentially large value [14]. Expression (2) does not provide a precise determination of the exponent µ, only an approximate one, since we have used a kind of mean-field argument in the definition of |a>|.

In the next section, we recall how bubbles appear as possible solutions of the fundamental pricing equation and play the role of Goldstone modes of a price symmetry broken by the dividend flow. We then describe the Kesten generalization of rational bubbles in terms of random multiplicative maps and present the fundamental result [15] that the no-arbitrage condition leads to the constraint that the exponent of the power-law tail is less than 1.
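As a sanity check of the mean-field relation (2), one can simulate the Blanchard-Watson process (1) directly. The parameter values below (δ, π) are illustrative assumptions, and the Hill estimate is the crude textbook version, so agreement with µ is only approximate.

```python
import numpy as np

# Parameters of the Blanchard-Watson bubble; the numerical values are
# illustrative choices, not estimates from the thesis.
delta = 0.95                      # discount factor
pi = 0.9                          # bubble survival probability
abar = 1.0 / (pi * delta)         # rational-expectations condition abar = 1/(pi*delta)

# Mean-field prediction (2): mu = -ln p> / ln |a>|, i.e. p> * |a>|**mu = 1.
# For this binary a_t, p> = pi and |a>| = abar, so the relation is exact.
mu = -np.log(pi) / np.log(abar)
assert mu < 1.0
assert abs(pi * abar**mu - 1.0) < 1e-12

# Simulate X_{t+1} = a_t X_t + b_t and estimate the tail index of |X|.
rng = np.random.default_rng(42)
T = 300_000
a = np.where(rng.random(T) < pi, abar, 0.0)
b = rng.standard_normal(T)
x = np.empty(T)
xp = 0.0
for t in range(T):
    xp = a[t] * xp + b[t]
    x[t] = xp

tail = np.sort(np.abs(x))[-3000:]             # largest 1% of |X|
hill = 1.0 / np.mean(np.log(tail / tail[0]))  # crude Hill estimate
print(mu, hill)   # the Hill estimate is roughly comparable to mu < 1
```

With these choices µ ≈ 0.67, below one as the theory requires; the simulated tail estimate fluctuates around that value because of the strong dependence in the series.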
We then present an extension to arbitrary multidimensional random multiplicative maps: a number d of market time series are made linearly interdependent via d × d stochastic coupling coefficients. We show that the no-arbitrage condition imposes that the non-diagonal impacts of any asset i on any other asset j ≠ i have to vanish on average, i.e., they must exhibit random alternative regimes of reinforcement and contrarian feedback. In contrast, the diagonal terms must be positive and equal on average to the inverse of the discount factor. Applying the results of renewal theory for products of random matrices to stochastic recurrence equations (SRE), we extend the theorem of Ref. [15] and demonstrate that the tails of the unconditional distributions associated with such d-dimensional bubble processes follow power laws (i.e., exhibit hyperbolic decline), with the same asymptotic tail exponent µ < 1 for all assets. The distribution of price differences and of returns is dominated by the same power law over an extended range of large returns. In order to unlock the paradox, we briefly discuss the crash hazard rate model [16,17] and the non-stationary growth model [18]. We conclude by proposing a link with the theory of speculative pricing through a spontaneous symmetry breaking [19].

We should stress that, due to the no-arbitrage condition that forms the backbone of our theoretical approach, correlations of returns are vanishing. In addition, the multiplicative stochastic structure of the models ensures the phenomenon of volatility
clustering. These two stylized facts, taken for granted in our present approach, will not be discussed further.

2. Rational bubbles of an isolated asset [15]

2.1. Rational expectation bubble model and Goldstone modes

We first briefly recall that the pricing of an asset under rational expectations theory is based on the two following hypotheses: the rationality of the agents and the "no-free lunch" condition. Under the rational expectation condition, the best estimation of the price p_{t+1} of an asset at time t + 1 viewed from time t is given by the expectation of p_{t+1} conditioned upon the knowledge of the filtration {F_t} (i.e., the sum of all available information accumulated) up to time t: E[p_{t+1}|F_t]. The "no-free lunch" condition imposes that the expected returns of all assets are equal under a given probability measure Q equivalent to the historical probability measure P. In particular, the expected return of each asset is equal to the return r of the risk-free asset (which is assumed to exist), and thus the probability measure Q is named the risk-neutral probability measure. Putting together these two conditions, we are led to the following valuation formula for the price p_t:

p_t = δ E_Q[p_{t+1}|F_t] + d_t ,   ∀{p_t}_{t≥0} ,   (3)

where d_t is an exogenous "dividend", and δ = (1 + r)^{-1} is the discount factor. The first term in the r.h.s. quantifies the usual fact that something tomorrow is less valuable than today by a factor δ called the discount factor. Intuitively, the second term, the dividend, is added to express the fact that the expected price tomorrow has to be decreased by the dividend, since the value before giving the dividend incorporates it in the pricing. The "forward" solution of (3) is well known to be the fundamental price

p_t^f = Σ_{i=0}^{+∞} δ^i E_Q[d_{t+i}|F_t] .   (4)

It is straightforward to check by replacement that the sum of the forward solution (4) and of an arbitrary component X_t

p_t = p_t^f + X_t ,   (5)

where X_t has to obey the single condition of being an arbitrary martingale

X_t = δ E_Q[X_{t+1}|F_t] ,   (6)

is also a solution of (3). In fact, it can be shown [20] that (5) is the general solution of (3). Here, it is important to note that, in the framework of the Blanchard and Watson model, the speculative bubbles appear as a natural consequence of the valuation formula
(3), i.e., of the no free-lunch condition and the rationality of the agents. Thus, the concept of bubbles is not an addition to the theory, as sometimes believed, but is entirely embedded in it.

Notice also that the component X_t in (5) plays a role analogous to the Goldstone mode in nuclear, particle and condensed-matter physics [21,22]. Goldstone modes are the zero-wavenumber, zero-energy modal fluctuations that attempt to restore a broken symmetry. For instance, consider a Bloch wall between two semi-infinite magnetic domains of opposite spin directions selected by opposite magnetic fields at boundaries far away. At non-zero temperature, capillary waves are excited by thermal fluctuations. The limit of very long-wavelength capillary modes corresponds to Goldstone modes that tend to restore the translational symmetry broken by the presence of the Bloch wall [23]. In the present context, as shown in Ref. [19], the parity symmetry

p → −p   (7)

is broken by the "external" field embodied in the dividend flow d_t. Indeed, as can be seen from (3) and its forward solution (4), the fundamental price is identically zero in the absence of dividends. Ref. [19] has stressed the fact that it makes perfect sense to think of negative prices. For instance, we are ready to pay a (positive) price for a commodity that we need or like. However, we will not pay a positive price to get something we dislike or which disturbs us, such as garbage, waste, a broken and useless car, chemical and industrial hazards, etc. Consider a chunk of waste. We will be ready to buy it for a negative price; in other words, we are ready to take the unwanted commodity if it comes with cash. Positive dividends imply positive prices, negative dividends lead to negative prices. Negative dividends correspond, for instance, to the premium to pay to keep an asset. From an economic viewpoint, what makes a share of a company desirable is its earnings, which provide dividends, and its potential appreciation, which gives rise to capital gains. As a consequence, in the absence of dividends and of speculation, the price of a share must be nil and the symmetry (7) holds. The earnings leading to dividends d thus act as a symmetry-breaking "field", since a positive d makes the share desirable and thus develops a positive price.

It is now clear that the addition of the bubble X_t, which can be anything but for the martingale condition (6), is playing the role of the Goldstone modes restoring the broken symmetry: the bubble price can wander up or down and, in the limit where it becomes very large in absolute value, dominate over the fundamental price, restoring the independence with respect to dividends. Moreover, as in condensed-matter physics where the Goldstone mode appears spontaneously since it has no energy cost, the rational bubble itself can appear spontaneously with no dividend.
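The valuation formula (3) and the martingale condition (6) can be verified in a few lines for the simplest case of a constant dividend and the binary Blanchard-Watson bubble; the numerical values below are illustrative assumptions, not taken from the paper.

```python
# Deterministic check of the valuation formula (3) and the bubble
# condition (6), for a constant dividend d and the Blanchard-Watson
# bubble (illustrative numbers).
r = 0.05
delta = 1.0 / (1.0 + r)          # discount factor delta = (1 + r)^-1
d = 2.0
pf = d / (1.0 - delta)           # fundamental price (4) for constant dividends
assert abs(pf - (delta * pf + d)) < 1e-9   # pf solves p = delta*p + d, i.e. (3)

pi = 0.9
abar = 1.0 / (pi * delta)        # rational-expectations growth factor
X = 10.0
EX_next = pi * abar * X + (1 - pi) * 0.0   # E[X_{t+1}]: grow w.p. pi, collapse w.p. 1-pi
assert abs(delta * EX_next - X) < 1e-9     # martingale condition (6)
print(pf, EX_next)
```

With r = 5% and d = 2, the fundamental price is d(1 + r)/r = 42, and the expected bubble grows exactly at the discount rate, so that its discounted value is constant.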
The "bubble" Goldstone mode turns out to be intimately related to the "money" Goldstone mode introduced by Bak et al. [24]. Ref. [24] introduces a dynamical many-body theory of money, in which the value of money in equilibrium is not fixed by the equations, and thus obeys a continuous symmetry. The dynamics breaks this continuous symmetry by fixating the value of money at a level which depends on initial
conditions. The fluctuations around the equilibrium, for instance in the presence of noise, are governed by the Goldstone modes associated with the broken symmetry. In apparent contrast, a bubble represents the arbitrary deviation from fundamental valuation. Introducing money, a given valuation or price is equivalent to a certain amount of money. A growing bubble thus corresponds to the same asset becoming equivalent to more and more cash. Equivalently, from the point of view of the asset, this can be seen as cash devaluation, i.e., inflation. The "bubble" Goldstone mode and the "money" Goldstone mode are thus two facets of the same fundamental phenomenon: they are both left unconstrained by the valuation equations.

2.2. The no-arbitrage condition and fat tails

Following Ref. [15], we study the implications of the RE bubble models for the unconditional distribution of prices, price changes and returns resulting from a general discrete-time formulation extending (1) by allowing the multiplicative factor a_t to take arbitrary values and be i.i.d. random variables drawn from some non-degenerate probability density function (pdf) P_a(a). The model can also be generalized by considering non-normal realizations of b_t with distribution P_b(b) with E_P[b_t] = 0, where E_P[·] is the unconditional expectation with respect to the probability measure P. Provided E_P[ln |a|] < 0 (stationarity condition) and if there is a number µ such that 0 < E_P[|b|^µ] < +∞, such that

E_P[|a|^µ] = 1   (8)

and such that E_P[|a|^µ ln |a|] < +∞, then the tail of the distribution of X is asymptotically (for large X's) a power law [25,26],

P_X(X) dX ≈ C dX / |X|^{1+µ} ,   (9)

with an exponent µ given by the real positive solution of (8). Rational expectations require in addition that the bubble component of asset prices obeys the "no free-lunch" condition

δ E_Q[X_{t+1}|F_t] = X_t ,   (10)

where δ < 1 is the discount factor and the expectation is taken conditional on the knowledge of the filtration (information) until time t. Condition (10) with (1) imposes first

E_Q[a] = 1/δ > 1 ,   (11)

and then

E_P[a] > 1 ,   (12)

on the distribution of the multiplicative factors a_t. Consider the function

M(µ) = E_P[|a|^µ] .   (13)
Fig. 1. Convexity of M(µ). This enforces that the exponent µ solution of (8) is µ < 1.

It has the following properties: (1) M(0) = 1 by definition, (2) M′(0) = E_P[ln |a|] < 0 from the stationarity condition, (3) M″(µ) = E_P[(ln |a|)² |a|^µ] > 0, by the positivity of the square, (4) M(1) > 1 by the no-arbitrage result (12). M(µ) is thus convex, as shown in Fig. 1. This demonstrates that µ < 1 automatically (see Ref. [15] for a detailed demonstration). It is easy to show [15] that the distribution of price differences has the same power-law tail with the exponent µ < 1 and that the distribution of returns is dominated by the same power law over an extended range of large returns [15]. Although power-law tails are a pervasive feature of empirical data, these characterizations are in strong disagreement with the usual empirical estimates which find µ ≈ 3 [27–31]. Lux and Sornette [15] concluded that exogenous rational bubbles are thus hardly reconcilable with some of the stylized facts of financial data at a very elementary level.
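The convexity argument can be made concrete numerically. Assuming, for illustration, a lognormal multiplicative factor (a choice of convenience, not the paper's), one checks that whenever E[ln a] < 0 and E[a] > 1, the root of M(µ) = 1 falls strictly between 0 and 1.

```python
import numpy as np

# Illustrative multiplicative factor: a lognormal, a = exp(m + s*Z).
# Choose m < 0 (so E[ln a] < 0, stationarity) and s large enough that
# E[a] = exp(m + s^2/2) > 1 (the no-arbitrage constraint (12)).
m, s = -0.2, 0.8
assert m < 0 and np.exp(m + s**2 / 2) > 1

def M(mu):
    # M(mu) = E[a^mu] = exp(mu*m + (mu*s)^2 / 2) for the lognormal case
    return np.exp(mu * m + (mu * s)**2 / 2)

# Properties used in the text: M(0) = 1, M'(0) = E[ln a] = m < 0,
# M convex, M(1) = E[a] > 1; hence the nontrivial root of M(mu) = 1
# lies strictly between 0 and 1.
lo, hi = 1e-9, 1.0
for _ in range(200):             # bisection on M(mu) - 1
    mid = 0.5 * (lo + hi)
    if M(mid) < 1.0:
        lo = mid
    else:
        hi = mid
mu = 0.5 * (lo + hi)
print(mu)
assert 0 < mu < 1 and abs(M(mu) - 1) < 1e-6
```

For the lognormal case the root is available in closed form, µ = −2m/s² = 0.625, which the bisection recovers.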
3. Generalization of rational bubbles to arbitrary dimensions [32]

3.1. Generalization to several coupled assets

In reality, there is no such thing as an isolated asset. Stock markets exhibit a variety of interdependences, based in part on the mutual influences between the USA, European and Japanese markets. In addition, individual stocks may be sensitive to the behavior of the specific industry as a whole to which they belong and to a few other indicators, such as the main indices, interest rates and so on. Mantegna et al. [33,34] have indeed shown the existence of a hierarchical organization of stock interdependences. Furthermore, bubbles often appear not to be isolated features of a set of markets. For instance, Ref. [35] tested whether a bubble simultaneously existed across
the nations, such as Germany, Poland, and Hungary, that experienced hyperinflation in the early 1920s. Coordinated bubbles can sometimes be detected. One of the most prominent examples is found in the market appreciations observed in many of the world markets prior to the world market crash in October 1987 [36]. Similar intermittent coordination of bubbles has been detected among the significant bubbles followed by large crashes or severe corrections in Latin American and Asian stock markets [37].

It is therefore desirable to generalize the one-dimensional RE bubble model (1) to the multi-dimensional case. One could also hope a priori that this generalization would modify the result µ < 1 obtained in the one-dimensional case and allow for better agreement with empirical results. Indeed, 1d systems are well known to exhibit anomalous properties often not shared by higher-dimensional systems. Here, however, it turns out that the same result µ < 1 holds, as we shall see.

In the case of several assets, rational pricing theory again dictates that the fundamental price of each individual asset is given by a formula like (3), where the specific dividend flow of each asset is used, with the same discount factor. The corresponding forward solution (4) is again valid for each asset. The general solution for each asset is (5), with a bubble component X_t different from one asset to the next. The different bubble components can be coupled, as we shall see, but they must each obey the martingale condition (6), component by component. This imposes specific conditions on the coupling terms, as we shall see.

Following this reasoning, we can therefore propose the simplest generalization of a bubble into a "two-dimensional" bubble for two assets X and Y with bubble prices, respectively, equal to X_t and Y_t at time t. We express the generalization of the Blanchard–Watson model as follows:

X_{t+1} = a_t X_t + b_t Y_t + ε_t ,   (14)

Y_{t+1} = c_t X_t + d_t Y_t + η_t ,   (15)

where a_t, b_t, c_t and d_t are drawn from some multivariate probability density function. The two additive noises ε_t and η_t are also drawn from some distribution function with zero mean. The diagonal case b_t = c_t = 0 for all t recovers the previous one-dimensional case with two uncoupled bubbles, provided ε_t and η_t are independent. Rational expectations require that X_t and Y_t both obey the "no-free lunch" condition (10), i.e., δ · E_Q[X_{t+1}|F_t] = X_t and δ · E_Q[Y_{t+1}|F_t] = Y_t. With (14, 15), this gives

(E_Q[a_t] − δ^{-1}) X_t + E_Q[b_t] Y_t = 0 ,   (16)

E_Q[c_t] X_t + (E_Q[d_t] − δ^{-1}) Y_t = 0 ,   (17)

where we have used that ε_t and η_t are centered. The two equations (16, 17) must be true for all times, i.e., for all values of X_t and Y_t visited by the dynamics. This
imposes E_Q[b_t] = E_Q[c_t] = 0 and E_Q[a_t] = E_Q[d_t] = δ^{-1}. We are going to retrieve this result more formally in the general case.

3.2. General formulation

A generalization to arbitrary dimensions leads to the following stochastic recurrence equation (SRE):

X_t = A_t X_{t−1} + B_t ,   (18)

where (X_t, B_t) are d-dimensional vectors. Each component of X_t can be thought of as the price of an asset above its fundamental price. The matrices (A_t) are independent and identically distributed d × d stochastic matrices. We assume that the B_t are independent and identically distributed random vectors and that (X_t) is a causal stationary solution of (18). Generalizations introducing additional arbitrary linear terms at larger time lags, such as X_{t−2}, ..., can be treated with slight modifications of our approach and yield the same conclusions. We shall thus confine our demonstration to the SRE of order 1, keeping in mind that our results apply analogously to arbitrary orders of regression. In the following, we denote by | · | the Euclidean norm and by ‖ · ‖ the corresponding operator norm for any d × d-matrix A:

‖A‖ = sup_{|x|=1} |Ax| .   (19)
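A minimal simulation sketch of the SRE (18) for d = 2, assuming zero-mean off-diagonal couplings and illustrative parameter values (not the paper's calibration), shows both components developing heavy tails with comparable crude Hill estimates, in line with the common-tail-exponent result.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200_000

# Pre-draw the i.i.d. matrix noise and additive noise; the diagonal
# mean 0.7 is below one so that the simulated process is stationary.
An = 0.5 * rng.standard_normal((T, 2, 2))
Bn = rng.standard_normal((T, 2))
D = 0.7 * np.eye(2)               # mean matrix: diagonal, zero off-diagonal means

X = np.zeros(2)
hist = np.empty((T, 2))
for t in range(T):
    X = (D + An[t]) @ X + Bn[t]   # X_t = A_t X_{t-1} + B_t, eq. (18)
    hist[t] = X

# Crude Hill estimates on the largest 1% of |X| for each component;
# the theory predicts the same tail exponent for all assets.
hills = []
for i in range(2):
    tail = np.sort(np.abs(hist[:, i]))[-2000:]
    hills.append(1.0 / np.mean(np.log(tail / tail[0])))
print(hills)
```

The two estimates fluctuate but stay of the same order, as expected from the shared tail exponent of the d-dimensional bubble process.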
Technical details are given in [32].

3.3. The no-free lunch condition

The valuation formula (3) and the martingale condition (6) given for a single asset easily extend to a basket of assets. It is natural to assume that, for a given period t, the discount rates r_t(i) associated with asset i are all the same. In frictionless markets, a deviation from this hypothesis would lead to arbitrage opportunities. Furthermore, since the sequence of matrices {A_t} is assumed to be i.i.d. and therefore stationary, this implies that δ_t or r_t must be constant and equal, respectively, to δ and r. Under those conditions, we have the following proposition.

Proposition 1. The stochastic process X_t = A_t X_{t−1} + B_t satisfies

2), whereas the existence of the third (skewness) and the fourth (kurtosis) moments is questionable. These apparently contradictory results actually do not apply to the same quantiles of the distributions of returns. Indeed, Mantegna and Stanley (1995) have shown that the distribution of returns can be described accurately by a Lévy law only within a range of approximately nine standard deviations, while a faster decay of the distribution is observed beyond. This almost-but-not-quite Lévy stable description explains (in part) the slow convergence of the returns distribution to the Gaussian law under time aggregation (Sornette 2000). And it is precisely outside the range where the Lévy law applies that a tail index of about three has been estimated. This can be seen from the fact that most authors who have reported a tail index b ≈ 3 have used some optimality criteria for choosing the sample fractions (i.e., the largest values) for the estimation of the tail index. Thus, unlike the authors supporting stable laws, they have used only a fraction of the largest (positive tail) and smallest (negative tail) sample values. It would thus seem that all has been said on the distributions of returns. However, there are dissenting views in the literature.
Indeed, the class of regularly varying distributions is not the sole one able to account for the large kurtosis and fat-tailness of the distributions of returns. Some recent works suggest alternative descriptions for the distributions of returns. For instance, Gouriéroux and Jasiak (1998) claim that the distribution of returns on the French stock market decays faster than any power law. Cont et al. (1997) have proposed to use exponentially truncated stable distributions, while Laherrère and Sornette (1999) suggest fitting the distributions of stock returns with the stretched-exponential (SE) law. These results, challenging the traditional hypothesis of power-like tails, offer a new representation of the returns distributions and need to be tested rigorously on statistical grounds.

A priori, one could assert that Longin (1996)'s results should rule out the exponential and stretched-exponential hypotheses. Indeed, his results, based on extreme value theory, show that the distributions of log-returns belong to the maximum domain of attraction of the Fréchet distribution, so that they are necessarily regularly varying power-like laws. However, his study, like almost all others on this subject, has been performed under the assumptions that (1) financial time series are made of independent and identically distributed returns and (2) the corresponding distributions of returns belong to one of only three possible maximum domains of attraction. These assumptions are not fulfilled in general. While Smith (1985)'s results indicate that the dependence of the data does not constitute a major problem in the limit of large samples, we shall see that it can significantly bias standard statistical methods for sample sizes commonly used in extreme-tail studies.
Moreover, Longin's conclusions are essentially based on an aggregation procedure which stresses the central part of the distribution while smoothing out the characteristics of the tail, which are essential for characterizing the tail behavior. In addition, real financial time series exhibit GARCH effects (Bollerslev 1986, Bollerslev et al. 1994) leading to heteroskedasticity and to clustering of high-threshold exceedances due to the long memory of the volatility. These rather complex dependence structures make the blind application of standard statistical tools for data analysis difficult, if not questionable. In particular, the existence of significant dependence in the return volatility leads to dramatic underestimates of the true standard deviation of the statistical estimators of tail indices. Indeed, there are now many examples showing that dependence and long memory, as well as nonlinearities, mislead standard statistical tests (Andersson et al. 1999, Granger and Teräsvirta 1999, for instance). Consider the Hill and Pickands estimators, which play an important role in the study of the tails of distributions. It is often overlooked that, for dependent time series, Hill's estimator remains consistent but is no longer asymptotically efficient (Rootzén et al. 1998). Moreover, for financial time series with a dependence structure described by a GARCH process, Kearns and Pagan (1997) have shown that the standard deviation of Hill's estimator obtained by a bootstrap method can be seven to eight times larger than the standard deviation derived under the asymptotic normality assumption. These figures are even worse for
3. Distributions exponentielles e´ tir´ees contre distributions r´eguli`erement variables
Pickands' estimator. The question then arises whether the many results, and the seemingly almost reached consensus, obtained by ignoring the limitations of the usual statistical tools could have led to erroneous conclusions about the tail behavior of the distributions of returns. Here, we propose to investigate once more this delicate problem of the tail behavior of distributions of returns in order to shed new light on it. To this aim, we investigate two time series: the daily returns of the Dow Jones Industrial Average (DJ) Index over a century and the five-minute returns of the Nasdaq Composite index (ND) over one year, from April 1997 to May 1998. These two sets of data have been chosen since they are typical of the data sets used in most previous studies. Their size (about 20,000 data points), while significant compared with those used in investment and portfolio analysis, is however much smaller than in recent data-intensive studies using tens of millions of data points (Gopikrishnan et al. 1998, Matia et al. 2002, Mizuno et al. 2002). Our first conclusion is that none of the standard parametric families of distributions (Pareto, exponential, stretched-exponential and incomplete Gamma) fits the DJ and ND data satisfactorily over the whole range of either positive or negative returns. Although this is also true for the family of stretched-exponential distributions, this family appears to be the best among the four considered parametric families, in so far as it is able to fit the data over the largest interval. Our second and main conclusion comes from the discovery that the Pareto distribution is a limit case of the Stretched-Exponential distribution. This allows us to test the encompassing of these two models with respect to the true data, from which it appears that the Stretched-Exponential distribution is the most relevant model of the distribution of returns.
The regular decay of the fractional exponent of the SE model, together with the regular increase of the tail index of the Pareto model, leads us to think that the extreme tail of the true distribution of returns is fatter than any stretched-exponential, strictly speaking (i.e., with a strictly positive fractional exponent), but thinner than any power law. Notwithstanding our best efforts, we cannot conclude on the exact nature of the far tail of the distributions of returns. As already mentioned, other works have proposed the so-called inverse-cubic law (b = 3) based on the analysis of distributions of returns of high-frequency data aggregated over hundreds up to thousands of stocks. This aggregating procedure leads to novel problems of interpretation. We think that the relevant question for most practical applications is not to determine the true asymptotic tail but the best effective description of the tails in the domain of useful applications. As we shall show below, it may be that the extreme asymptotic tail is a regularly varying function with tail index b = 3 for daily returns, but this is not very useful if this tail describes events whose recurrence time is a century or more. Our present work must thus be gauged as an attempt to provide a simple, efficient, effective description of the tails of the distribution of returns covering most of the range of interest for practical applications. We feel that the efforts required to go beyond the tails analyzed here, while of great interest from a scientific point of view to potentially help unravel market mechanisms, may be too artificial and unreachable to have significant applications. The paper is organized as follows. The next section is devoted to the presentation of our two data sets and of some of their basic statistical properties, emphasizing their fat-tailed behavior. We discuss, in particular, the importance of the so-called "lunch effect" for the tail properties of intraday returns.
We then demonstrate the presence of a significant temporal dependence structure and study the possible non-stationary character of these time series. Section 3 attempts to account for the temporal dependence of our time series and investigates its effect on the determination of the extreme behavior of the tails of the distribution of returns. To this aim, we build two simple long memory stochastic volatility processes whose stationary distributions are by construction either asymptotically regularly varying or exponential. We show that, due to the long-range dependence of the volatility, the estimation with standard statistical estimators is severely biased, leading to very unreliable estimates of the parameters. These results justify our re-examination of previous claims of regularly varying tails.
To fit our two data sets, section 4 proposes a general parametric representation of the distribution of returns encompassing both a regularly varying distribution in one limit of the parameters and rapidly varying distributions of the class of stretched exponentials in another limit. The use of regularly varying distributions has been justified above. From a theoretical viewpoint, the class of stretched exponentials is motivated in part by the fact that the large deviations of multiplicative processes are generically distributed with stretched exponential distributions (Frisch and Sornette 1997). Stretched exponential distributions are also parsimonious examples of the important subset of sub-exponentials, that is, of the general class of distributions decaying more slowly than an exponential. This class of sub-exponentials shares several important properties of heavy-tailed distributions (Embrechts et al. 1997) not shared by exponentials or distributions decreasing faster than exponentials. The descriptive power of these different hypotheses is compared in section 5. We first consider nested hypotheses and use Wilks' test to this aim. It appears that the stretched-exponential and the Pareto distributions are the most parsimonious models compatible with the data, with a slight advantage in favor of the stretched-exponential model. Then, in order to compare the descriptive power of these two models directly, we perform encompassing tests, which prove the validity of the two representations, but for different quantile ranges. Finally, we show that these two distributions can be set within a single model. Section 7 summarizes our results and concludes.
2 Some basic statistical features
2.1 The data
We use two sets of data. The first sample consists of the daily returns2 of the Dow Jones Industrial Average Index (DJ) over the time interval from May 27, 1896 to May 31, 2000, which represents a sample of size n = 28415. The second data set contains the high-frequency (5 minutes) returns of the Nasdaq Composite (ND) index for the period from April 8, 1997 to May 29, 1998, which represents n = 22123 data points. The choice of these two data sets is justified by their similarity with (1) the data set of daily returns used by Longin (1996) in particular and (2) the high-frequency data used by Guillaume et al. (1997), Lux (2000) and Müller et al. (1998), among others. For the intra-day Nasdaq data, there are two caveats that must be addressed. First, in order to remove the effect of overnight price jumps, we have determined the returns separately for each of the 289 days contained in the Nasdaq data and have taken the union of all these 289 return data sets to obtain a global return data set. Second, the volatility of intra-day data is known to exhibit a U-shape, also called the "lunch effect", that is, an abnormally high volatility at the beginning and the end of the trading day compared with a low volatility around lunch time. Such an effect is present in our data, as depicted in figure 1, where the average absolute returns are shown as a function of the time within a trading day. It is desirable to correct the data for this systematic effect. This has been performed by renormalizing the 5-minute returns at a given moment of the trading day by the corresponding average absolute return at the same moment. We shall refer to this time series as the corrected Nasdaq returns, in contrast with the raw (uncorrected) Nasdaq returns, and we shall examine both data sets for comparison. Although the distributions of positive and negative returns are known to be very similar (Jondeau and Rockinger 2001, for instance), we have chosen to treat them separately.
For the Dow Jones, this gives us 14949 positive and 13464 negative data points while, for the Nasdaq, we have 11241 positive and 10751 negative data points.
2 Throughout the paper, we will use compound returns, i.e., log-returns.
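As an illustration, the lunch-effect correction described above amounts to dividing each five-minute return by the average absolute return observed at the same moment of the trading day. The following Python sketch applies it to synthetic data with a hypothetical U-shaped intraday volatility profile (all numbers are illustrative, not the actual Nasdaq data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the Nasdaq intraday data: 289 trading days of
# 5-minute returns, with an imposed U-shaped volatility profile.
n_days, n_slots = 289, 78
t = np.linspace(0.0, 1.0, n_slots)
u_shape = 1.0 + 8.0 * (t - 0.5) ** 2          # high at open/close, low at lunch
raw = rng.standard_normal((n_days, n_slots)) * u_shape

# Correction used in the text: renormalize the 5-minute return at a given
# moment of the trading day by the average absolute return at that moment.
avg_abs = np.abs(raw).mean(axis=0)            # one value per intraday slot
corrected = raw / avg_abs

# The corrected intraday profile of mean absolute returns is flat by construction.
profile = np.abs(corrected).mean(axis=0)
print(profile.min(), profile.max())           # both ~1: lunch effect removed
```

Because the normalization uses the sample averages themselves, the corrected profile is flat by construction; on real data the correction removes the systematic intraday pattern while leaving day-to-day volatility dynamics intact.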
Table 1 summarizes the main statistical properties of these two time series (both for the raw and for the corrected Nasdaq returns) in terms of the average returns, their standard deviations, the skewness and the excess kurtosis, for four time scales of five minutes, an hour, one day and one month. The Dow Jones exhibits a significantly negative skewness, which can be ascribed to the impact of the market crashes. The raw Nasdaq returns are significantly positively skewed while the returns corrected for the "lunch effect" are negatively skewed, showing that the lunch effect plays an important role in the shaping of the distribution of the intra-day returns. Note also the important decrease of the kurtosis after correction of the Nasdaq returns for the lunch effect, confirming its strong impact. In all cases, the excess kurtosis is high and remains significant even after a time aggregation of one month. The Jarque-Bera test (Cromwell et al. 1994), a joint statistic using the skewness and kurtosis coefficients, is used to reject the normality assumption for these time series.
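The Jarque-Bera statistic is readily computed. The sketch below uses scipy and a Student-t sample with three degrees of freedom as a hypothetical fat-tailed stand-in for returns (the sample size is illustrative), and shows how decisively normality is rejected for such data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Fat-tailed stand-in for returns: Student-t with 3 degrees of freedom
# (tail index ~3, the value discussed in the text).
returns = stats.t.rvs(df=3, size=20_000, random_state=rng)

# The Jarque-Bera statistic combines the sample skewness S and excess kurtosis K:
#   JB = n/6 * (S^2 + K^2 / 4),  asymptotically chi-squared(2) under normality.
res = stats.jarque_bera(returns)
print(res.statistic, res.pvalue)    # enormous statistic, p-value ~ 0: normality rejected

# For comparison, a Gaussian sample of the same size:
res_gauss = stats.jarque_bera(rng.standard_normal(20_000))
```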
2.2 Existence of time dependence
It is well known that financial time series exhibit complex dependence structures like heteroskedasticity or non-linearities. These properties are clearly observed in our two time series. For instance, we have estimated the statistical characteristic V (for positive random variables) called the coefficient of variation,

V = Std(X) / E(X) ,    (1)

which is often used as a test statistic of the randomness property of a time series. It can be applied to a sequence of points (or to the intervals generated by these points on the line). If these points are "absolutely random," that is, generated by a Poissonian flow, then the intervals between them are distributed according to an exponential distribution, for which V = 1. Values V > 1 are associated with a clustering phenomenon. We estimated V = V(u) for extrema X > u and X < −u as a function of the threshold u (both for positive and for negative extrema). The results are shown in figure 2 for the Dow Jones daily returns. As the results are essentially the same for the Nasdaq, we do not show them. Figure 2 shows that, in the main range |X| < 0.02, containing ∼ 95% of the sample, V increases with u, indicating that the "clustering" property becomes stronger as the threshold u increases. We have then applied several formal statistical tests of independence. We have first performed the Lagrange multiplier test proposed by Engle (1984), which leads to the T · R² test statistic, where T denotes the sample size and R² is the determination coefficient of the regression of the squared centered returns xt on a constant and on q of their lags xt−1, xt−2, · · · , xt−q. Under the null hypothesis of a homoskedastic time series, T · R² follows a χ²-statistic with q degrees of freedom. The test has been performed up to q = 10 and, in every case, the null hypothesis is strongly rejected, at any usual significance level. Thus, the time series are heteroskedastic and exhibit volatility clustering. We have also performed a BDS test (Brock et al. 1987), which allows us to detect not only volatility clustering, as in the previous test, but also departures from iid-ness due to non-linearities. Again, we strongly reject the null hypothesis of iid data, at any usual significance level, confirming the Lagrange multiplier test.
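Both diagnostics are easy to reproduce. The sketch below is a minimal Python illustration, with a toy GARCH(1,1)-like series standing in for real returns (all parameters are hypothetical); it computes the coefficient of variation of the waiting times between threshold exceedances, and Engle's T · R² Lagrange-multiplier statistic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy GARCH(1,1)-like series to mimic volatility clustering (hypothetical parameters).
n = 20_000
r = np.empty(n)
h = 1.0
for t in range(n):
    r[t] = np.sqrt(h) * rng.standard_normal()
    h = 0.05 + 0.10 * r[t] ** 2 + 0.85 * h

def coeff_variation_of_gaps(x, u):
    """V = Std/Mean of the waiting times between exceedances |x| > u.
    V = 1 for a Poissonian flow of exceedances; V > 1 indicates clustering."""
    gaps = np.diff(np.flatnonzero(np.abs(x) > u))
    return gaps.std() / gaps.mean()

def arch_lm_stat(x, q):
    """Engle's Lagrange-multiplier statistic: regress the squared centered series
    on a constant and q of its lags; T * R^2 is chi-squared(q) under homoskedasticity."""
    x2 = (x - x.mean()) ** 2
    y = x2[q:]
    X = np.column_stack([np.ones(len(y))] + [x2[q - k: len(x2) - k] for k in range(1, q + 1)])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r2 = 1.0 - (y - X @ beta).var() / y.var()
    return len(y) * r2

u = np.quantile(np.abs(r), 0.95)
print(coeff_variation_of_gaps(r, u))   # > 1: exceedances cluster
print(arch_lm_stat(r, q=10))           # far above the chi-squared(10) 1% critical value (~23.2)
```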
3 Can long memory processes lead to misleading measures of extreme properties?
Since the descriptive statistics given in the previous section have clearly shown the existence of a significant temporal dependence structure, it is important to consider the possibility that it leads to erroneous conclusions on estimated parameters. We first briefly recall the standard procedures used to investigate extremal properties, stressing the problems and drawbacks arising from the existence of temporal dependence. We then perform a numerical simulation to study the behavior of the estimators in the presence of dependence. We put particular emphasis on the possible appearance of significant biases due to dependence in the data set. Finally, we present the results on the extremal properties of our two DJ and ND data sets in the light of the bootstrap results.
3.1 Some theoretical results
Two limit theorems allow one to study the extremal properties and to determine the maximum domain of attraction (MDA) of a distribution function in two forms. First, consider a sample of N iid realizations X1, X2, · · · , XN. Let X∧ denote the maximum of this sample. Then, the Gnedenko theorem states that if, after an adequate centering and normalization, the distribution of X∧ converges to a non-degenerate distribution as N goes to infinity, this limit distribution is then necessarily the Generalized Extreme Value (GEV) distribution defined by

Hξ(x) = exp[ −(1 + ξ · x)^(−1/ξ) ] .    (2)

When ξ = 0, Hξ(x) should be understood as

Hξ=0(x) = exp[ −exp(−x) ] .    (3)

Thus, for N large enough,

Pr{X∧ < x} = Hξ( (x − µ)/ψ ) ,    (4)

for some value of the centering parameter µ, scale factor ψ and tail index ξ. It should be noted that the existence of a non-degenerate limit distribution of the properly centered and normalized X∧ is a rather strong limitation. There are many distribution functions that do not satisfy it, e.g., functions alternating indefinitely between a power-like and an exponential behavior. The second limit theorem is named after Gnedenko-Pickands-Balkema-de Haan (GPBH) and its formulation is as follows. In order to state the GPBH theorem, we define the right endpoint xF of a distribution function F(x) as xF = sup{x : F(x) < 1}. Let us call the function

F̄u(x) ≡ Pr{X − u ≥ x | X > u}    (5)

the excess distribution function (DF). Then, this DF F̄u(x) belongs to the Maximum Domain of Attraction of Hξ(x) defined by eq. (2) if and only if there exists a positive scale-function s(u), depending on the threshold u, such that

lim(u→xF) sup(0≤x≤xF−u) | F̄u(x) − G(x; ξ, s(u)) | = 0 ,    (6)

where

G(x; ξ, s) = 1 + ln(Hξ(x/s)) = 1 − (1 + ξ · x/s)^(−1/ξ) .    (7)

By taking the limit ξ → 0, expression (7) leads to the exponential distribution. The support of the distribution function (7) is defined as follows:

0 ≤ x < ∞ if ξ ≥ 0, and 0 ≤ x ≤ −s/ξ if ξ < 0.    (8)

Thus, the Generalized Pareto Distribution has a finite support for ξ < 0.
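To make the ξ → 0 limit concrete, here is a small Python sketch of the GPD tail function G of eq. (7): it checks numerically that a very small ξ reproduces the exponential limit, and cross-checks the formula against scipy's genpareto (whose shape parameter plays the role of ξ):

```python
import numpy as np
from scipy import stats

# GPD distribution function G of eq. (7); s plays the role of the scale s(u).
def gpd_cdf(x, xi, s):
    if xi == 0.0:
        return 1.0 - np.exp(-x / s)                  # the xi -> 0 exponential limit
    return 1.0 - (1.0 + xi * x / s) ** (-1.0 / xi)

x = np.linspace(0.0, 5.0, 11)

# A very small xi is numerically indistinguishable from the exponential limit:
gap_to_exponential = np.max(np.abs(gpd_cdf(x, 1e-8, 1.0) - gpd_cdf(x, 0.0, 1.0)))
print(gap_to_exponential)                            # ~ 0

# Cross-check against scipy's genpareto, whose shape parameter c is our xi:
gap_to_scipy = np.max(np.abs(gpd_cdf(x, 0.3, 1.0) - stats.genpareto.cdf(x, c=0.3)))
```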
The form parameter ξ is of paramount importance for the form of the limiting distribution. Its sign determines the three possible limiting forms of the distribution of maxima: if ξ > 0, the limit distribution is the Fréchet power-like distribution; if ξ = 0, the limit distribution is the Gumbel (double-exponential) distribution; if ξ < 0, the limit distribution has a support bounded from above. All three distributions are united in eq. (2) by this parameterization. The determination of the parameter ξ is the central problem of extreme value analysis. Indeed, it allows one to determine the maximum domain of attraction of the underlying distribution. When ξ > 0, the underlying distribution belongs to the Fréchet maximum domain of attraction and is regularly varying (power-like tail). When ξ = 0, it belongs to the Gumbel MDA and is rapidly varying (exponential tail), while if ξ < 0 it belongs to the Weibull MDA and has a finite right endpoint.
3.2 Examples of slow convergence to limit GEV and GPD distributions
There exist two ways of estimating ξ. First, if one has a sample of maxima (taken from sub-samples of sufficiently large size), then one can fit the GEV distribution to this sample, estimating its parameters by Maximum Likelihood. Alternatively, one can prefer the distribution of exceedances over a large threshold given by the GPD (7), whose tail index can be estimated with Pickands' estimator or by Maximum Likelihood, as previously. Hill's estimator cannot be used since it assumes ξ > 0, while the essence of extreme value analysis is, as we said, to test for the class of limit distributions without excluding any possibility, and not only to determine the quantitative value of an exponent. Each of these methods has its advantages and drawbacks, especially when one has to study dependent data, as we show below. Given a sample of size N, one considers the q maxima drawn from q sub-samples of size p (such that p · q = N) to estimate the parameters (µ, ψ, ξ) in (4) by Maximum Likelihood. This procedure yields consistent and asymptotically Gaussian estimators, provided that ξ > −1/2 (Smith 1985). The properties of the estimators still hold approximately for dependent data, provided that the interdependence of the data is weak. However, it is difficult to choose an optimal number q of sub-samples. It depends both on the size N of the entire sample and on the underlying distribution: the maxima drawn from an Exponential distribution are known to converge very quickly to Gumbel's distribution (Hall and Wellner 1979), while for the Gaussian law the convergence is particularly slow (Hall 1979). The second possibility is to estimate the parameter ξ from the distribution of exceedances (the GPD); for this, one can use either the Maximum Likelihood estimator or Pickands' estimator.
Maximum Likelihood estimators are well known to be the most efficient ones (at least for ξ > −1/2 and for independent data) but, in this particular case, Pickands' estimator works reasonably well. Given a sample x1 ≥ x2 ≥ · · · ≥ xN of size N ordered in descending order, Pickands' estimator is given by

ξ̂k,N = (1/ln 2) · ln[ (xk − x2k) / (x2k − x4k) ] .    (9)

For independent and identically distributed data, this estimator is consistent provided that k is chosen so that k → ∞ and k/N → 0 as N → ∞. Moreover, ξ̂k,N is asymptotically normal with variance

σ(ξ̂k,N)² · k = ξ² (2^(2ξ+1) + 1) / [ 2 (2^ξ − 1) ln 2 ]² .    (10)
In the presence of dependence between data, one can expect an increase of the standard deviation, as reported by Kearns and Pagan (1997). For time dependence of the GARCH class, Kearns and Pagan (1997) have indeed demonstrated a significant increase of the standard deviation of tail index estimators, such as Hill's estimator, by a factor of more than seven with respect to their asymptotic properties for iid samples. This leads to very inaccurate index estimates for time series with this kind of temporal dependence.
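For concreteness, here is a minimal Python implementation of Pickands' estimator (9) and of the asymptotic standard deviation implied by the variance formula (10), applied to a synthetic iid Pareto sample with tail index b = 3 (so that ξ = 1/3); the sample size and the choice of k are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def pickands(x, k):
    """Pickands' estimator (9), using the k-th, 2k-th and 4k-th largest values."""
    s = np.sort(x)[::-1]                 # descending order: s[0] is the sample maximum
    return np.log((s[k - 1] - s[2 * k - 1]) / (s[2 * k - 1] - s[4 * k - 1])) / np.log(2.0)

def pickands_std(xi, k):
    """Asymptotic standard deviation implied by the variance formula (10)."""
    var_k = xi ** 2 * (2.0 ** (2 * xi + 1) + 1) / (2 * (2.0 ** xi - 1) * np.log(2.0)) ** 2
    return np.sqrt(var_k / k)

# iid Pareto sample, Pr(X > x) = x^(-3) for x >= 1, i.e. b = 3 and xi = 1/3.
n, k = 100_000, 500
x = (1.0 - rng.random(n)) ** (-1.0 / 3.0)     # inverse-CDF sampling
print(pickands(x, k), pickands_std(1 / 3, k)) # estimate near 1/3, std ~ 0.08
```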
Another problem lies in the determination of the optimal threshold u of the GPD, which is in fact related to the optimal determination of the sub-sample number q in the case of the estimation of the parameters of the distribution of maxima. In sum, none of these methods seems really satisfying and each one presents severe drawbacks. The estimation of the parameters of the GEV distribution and of the GPD may be less sensitive to the dependence of the data, but this property is only asymptotic; a bootstrap investigation is thus required to compare the real power of each estimation method. As a first simple example illustrating the possibly very slow convergence to the limit distributions of extreme value theory mentioned above, let us consider a simulated sample of iid Weibull random variables (we thus fulfill the most basic assumption of extreme value theory, i.e., iid-ness). For such a distribution, one must obtain in the limit N → ∞ the exponential distribution (ξ = 0). We took two values for the exponent of the Weibull distribution, c = 0.7 and c = 0.3, with d = 1 (scale parameter). Thus, the estimation of ξ from the GPD of the exceedances should give estimated values of ξ close to zero in the limit of large N. In order to use the GPD, we have taken the conditional Weibull distribution under the condition X > Uk, k = 1 . . . 15, where the thresholds Uk are chosen as: U1 = 0.1; U2 = 0.3; U3 = 1; U4 = 3; U5 = 10; U6 = 30; U7 = 100; U8 = 300; U9 = 1000; U10 = 3000; U11 = 10^4; U12 = 3 · 10^4; U13 = 10^5; U14 = 3 · 10^5 and U15 = 10^6. For each simulation, the size of the sample above the considered threshold Uk is chosen equal to 50,000 in order to get small standard deviations. The Maximum Likelihood estimates of the GPD form parameter ξ are shown in figure 3.
For c = 0.7, the threshold U7 gives an estimate ξ = 0.0123 with a standard deviation equal to 0.0045, i.e., the estimate of ξ differs significantly from zero (which is the correct theoretical limit value). This occurs notwithstanding the huge size of the implied data set; indeed, the probability Pr{X > U7} for c = 0.7 is about 10^−9, so that in order to obtain a data set of conditional samples from an unconditional data set of the size studied here (50,000 realizations above U7), the size of such an unconditional sample should be approximately 10^9 times larger than the number of "peaks over threshold", i.e., it is practically impossible to have such a sample. For c = 0.3, the convergence to the theoretical value zero is even slower. Indeed, even the largest financial data sets for a single asset, drawn from high-frequency data, are no larger than or of the order of one million points3. The situation does not change even for data sets one or two orders of magnitude larger, as considered in (Gopikrishnan et al. 1998), obtained by aggregating thousands of stocks4. Thus, although the GPD form parameter should theoretically be zero in the limit of a large sample for the Weibull distribution, this limit cannot be reached for any available sample size. This is a clear illustration that a rapidly varying distribution, like the Weibull distribution with exponent smaller than one, i.e., a Stretched-Exponential distribution, can be mistaken for a Pareto or any other regularly varying distribution for any practical application.
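This experiment is easy to reproduce qualitatively. The Python sketch below samples 50,000 Weibull excesses above a threshold by inversion (with the illustrative choices c = 0.7, d = 1 and u = 10, the latter corresponding to U5 rather than U7) and fits a GPD by maximum likelihood with scipy; the fitted shape comes out small but positive even though the true asymptotic value is ξ = 0:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def weibull_excesses(c, d, u, size):
    """Excesses X - u of a Weibull variable conditioned on X > u, by inversion:
    S(x) = exp(-(x/d)^c)  =>  X = d * ((u/d)^c - ln V)^(1/c),  V ~ U(0, 1]."""
    v = 1.0 - rng.random(size)
    return d * ((u / d) ** c - np.log(v)) ** (1.0 / c) - u

# 50,000 peaks over threshold, in the spirit of the experiment in the text.
y = weibull_excesses(c=0.7, d=1.0, u=10.0, size=50_000)
xi_fit, _, scale_fit = stats.genpareto.fit(y, floc=0)
print(xi_fit)   # small but positive: the stretched exponential masquerades as a Pareto
```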
3.3 Generation of a long memory process with a well-defined stationary distribution
In order to study the performance of the various estimators of the tail index ξ and the influence of the interdependence of sample values, we have generated six samples with distinct properties. The first three samples are made of iid realizations drawn respectively from a Pareto distribution with tail index b = 3 and from Stretched-Exponential distributions with exponents c = 0.3 and c = 0.7. The three other samples contain realizations exhibiting long-range memory with the same three distributions as for the first three samples: a regularly varying distribution with tail index b = 3 and Stretched-Exponential distributions with exponents c = 0.3 and c = 0.7. Thus, the first three samples are the iid counterparts of the latter ones. The iid sample with regularly varying distribution converges to the Fréchet maximum domain of attraction with ξ = 1/3 ≈ 0.33, while the iid Stretched-Exponential distributions converge to the Gumbel maximum domain of attraction with ξ = 0. We now study how well one can distinguish between these two distributions, which belong to two different maximum domains of attraction.

3 One year of data sampled at the 1-minute time scale gives approximately 1.2 · 10^5 data points.
4 In this case, another issue arises concerning the fact that the aggregation of returns from different assets may distort the information and the very structure of the tails of the pdfs, if they exhibit some intrinsic variability (Matia et al. 2002).

For the stochastic processes with long memory, we use a simple stochastic volatility model. First, we construct a Gaussian process {Xt}t≥1 with correlation function

C(t) = (1 + |t|)^(−α) if |t| ≤ T, and C(t) = 0 if |t| > T.    (11)

It should be noted that, in order for the stochastic process to be well-defined, the correlation function must satisfy a positivity condition. More precisely, the spectral density (the Fourier transform of the correlation function) must remain positive. This condition imposes that the duration of the memory T be larger than a constant depending on α. The next step consists in building the process {Ut}t≥1, defined by

Ut = Φ(Xt) ,
(12)
where Φ(·) is the Gaussian distribution function. The process {Ut}t≥1 exhibits exactly the same long-range dependence as the process {Xt}t≥1. This is ensured by the invariance of the copula under strictly increasing changes of variables. Let us recall that a copula is the mathematical embodiment of the dependence structure between different random variables (Joe 1997, Nelsen 1998). The process {Ut}t≥1 thus possesses a Gaussian copula dependence structure with long memory and uniform marginals5. In the last step, we define the volatility process

σt = σ0 · Ut^(−1/b) ,    (13)

which ensures that the stationary distribution of the volatility is a Pareto distribution with tail index b. Such a distribution of the volatility is not realistic in the bulk, which is found to be approximately lognormal for not too large volatilities (Sornette et al. 2000), but is in agreement with the hypothesis of an asymptotic regularly varying distribution. A change of variable more complicated than (13) could provide a more realistic behavior of the volatility over the entire range of the distribution, but our main goal is not to provide a realistic stochastic volatility model; it is only to exhibit a long memory process with well-defined prescribed marginals in order to test the influence of a long-range dependence structure. The return process is then given by

rt = σt · εt ,    (14)

where the εt are Gaussian random variables independent of σt. The construction (14) ensures the decorrelation of the returns at every time lag. The stationary distribution of rt admits the density

p(r) = ( 2^((b−1)/2) · b / √(2π) ) · Γ( (b+1)/2 , r²/2 ) / |r|^(b+1) ,    (15)

which is regularly varying at infinity since the incomplete Gamma function Γ((b+1)/2, r²/2) goes to Γ((b+1)/2) as |r| → ∞. This completes the construction and characterization of our long memory process with regularly varying stationary distribution.

5 Of course, one can make the correlation as small as one wants under an adequate choice of a strictly increasing transformation, but this does not change the fact that the dependence remains unchanged. This is another illustration of the fact that the correlation is not always a good and adapted measure of dependence (Malevergne and Sornette 2002).
In order to obtain a process with a Stretched-Exponential distribution and long-range dependence, we apply to {rt}t≥1 the following increasing mapping G : r → y:

G(r) = (r0^c + ln(r/r0))^(1/c)    if r > r0 ,
G(r) = r    if |r| ≤ r0 ,    (16)
G(r) = −(r0^c + ln(|r|/r0))^(1/c)    if r < −r0 .

This transformation gives a stretched exponential of index c for all values of the return larger than the scale factor r0. This derives from the fact that the process {rt}t≥1 admits a regularly varying distribution function, characterized by F̄r(r) = 1 − Fr(r) = L(r)|r|^(−b), for some slowly varying function L. As a consequence, the stationary distribution of {Yt}t≥1 is given by

F̄Y(y) = L( r0 e^(−r0^c) exp(|y|^c) ) · ( e^(b r0^c) / r0^b ) · e^(−b|y|^c)    (17)
       = L0(y) · e^(−b|y|^c) ,  for all |y| > r0 ,  where L0 is slowly varying at infinity,    (18)

which is a Stretched-Exponential distribution. To summarize, starting with a long memory Gaussian process, we have defined a long memory process characterized by a stationary distribution function of our choice, thanks to the invariance of the temporal dependence structure (the copula) under strictly increasing changes of variable. In particular, this approach gives long memory processes with a regularly varying marginal distribution and with a stretched-exponential distribution. Notwithstanding the difference in their marginals, these two processes possess by construction exactly the same time dependence. This allows us to compare the impact of the same dependence on these two classes of marginals.
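The construction of eqs. (11) through (14) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the Gaussian process is simulated by one standard FFT-based method (circulant embedding, in the spirit of the FFT algorithm of Beran 1994 used in the text), with T = 250, α = 0.5 and b = 3 as in the text; the small-eigenvalue clamp is a numerical guard, not part of the model:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)

def gaussian_long_memory(n, alpha=0.5, T=250):
    """Stationary Gaussian series with the correlation (11), simulated by
    circulant embedding and the FFT."""
    m = 2 * n
    lags = np.minimum(np.arange(m), m - np.arange(m))      # circular lag distances
    row = np.where(lags <= T, (1.0 + lags) ** (-alpha), 0.0)
    lam = np.fft.fft(row).real                             # circulant eigenvalues
    lam = np.maximum(lam, 0.0)                             # guard against tiny negatives
    z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    return np.fft.fft(np.sqrt(lam / m) * z).real[:n]

b, sigma0, n = 3.0, 1.0, 10_000
X = gaussian_long_memory(n)
U = norm.cdf(X)                        # eq. (12): uniform marginals, same copula
sigma = sigma0 * U ** (-1.0 / b)       # eq. (13): Pareto volatility, tail index b
r = sigma * rng.standard_normal(n)     # eq. (14): returns, uncorrelated by construction

# The volatility marginal is Pareto(b): Pr(sigma > s) = (s / sigma0)^(-b),
# so about 2^(-3) = 12.5% of the sigma's should exceed 2 * sigma0.
print((sigma > 2.0).mean())
```

The Stretched-Exponential counterpart is then obtained by pushing r through the mapping G of eq. (16), which leaves the copula, and hence the long memory, untouched.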
3.4 Results of numerical simulations
We have generated 1000 samples of each kind (iid Stretched-Exponential, iid Pareto, long memory process with a Pareto distribution and with a Stretched-Exponential distribution). Each sample contains 10,000 realizations, which is approximately the number of points in each tail of our real samples. In order to generate the Gaussian process with correlation function (11), we have used the algorithm based on the Fast Fourier Transform described in Beran (1994). The parameter T has been set to 250 and α to 0.5 (it can be checked that for α = 0.5 the lower bound for T is equal to 23). Panel (a) of table 2 presents the mean values and standard deviations of the Maximum Likelihood estimates of ξ, using the Generalized Extreme Value distribution and the Generalized Pareto Distribution, for the three samples of iid data. To estimate the parameters of the GEV distribution and study the influence of the sub-sample size, we have grouped the data in clusters of size q = 10, 50, 100 and 200. For the analysis in terms of the GPD, we have considered four different large thresholds u, corresponding to the quantiles 90%, 95%, 99% and 99.5%. The estimates obtained from the distribution of maxima are significantly different from the theoretical ones: 0.2 on average over the four different sizes of sub-samples instead of 0.0 for the Stretched-Exponential distribution with c = 0.7, 1.0 instead of 0.0 for the Stretched-Exponential distribution with c = 0.3, and 0.40 instead of 0.33 for the Pareto distribution. At the same time, the standard deviation of these estimators remains very low. This significant bias of the estimator is a clear sign that the distribution of the maximum has not yet converged to the asymptotic GEV distribution, even for sub-samples of size 200. The results are better, with smaller biases, for the Maximum Likelihood estimates obtained from the GPD.
However, the standard deviations are significantly larger than in the previous case, which testifies to the high variability of this estimator. Thus, for such sample sizes, the GEV and GPD Maximum Likelihood estimates do not seem very reliable, due to an important bias for the former and large statistical fluctuations for the latter.
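The two Maximum Likelihood procedures compared above, block maxima fitted with a GEV and threshold exceedances fitted with a GPD, can be reproduced in a few lines with `scipy.stats`. This is a sketch on iid Pareto data with true tail index ξ = 1/3 (note that scipy parameterizes the GEV shape as c = −ξ):

```python
import numpy as np
from scipy.stats import genextreme, genpareto, pareto

rng = np.random.default_rng(1)
x = pareto.rvs(b=3.0, size=10_000, random_state=rng)   # iid Pareto, true xi = 1/3

# --- GEV fit on block maxima (clusters of size q, as in the text) ---
q = 100
maxima = x.reshape(-1, q).max(axis=1)
c_gev, loc_gev, scale_gev = genextreme.fit(maxima)
xi_gev = -c_gev                      # scipy's shape convention is c = -xi

# --- GPD fit on exceedances over a high threshold (95% quantile) ---
u = np.quantile(x, 0.95)
excess = x[x > u] - u
xi_gpd, _, sigma_gpd = genpareto.fit(excess, floc=0.0)
```

For a pure Pareto sample, the exceedances over u are exactly GPD distributed, which is why the GPD estimate is the better behaved of the two, as found in the text.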
3. Distributions exponentielles étirées contre distributions régulièrement variables
Nevertheless, an optimistic viewpoint is that, discarding the largest quantile of 0.995, the GPD estimator is compatible with a value ξ = 0.32 ± 0.1 for the iid data with Pareto distribution, while it is compatible with ξ = (−0.21)−(0.19) ± 0.6−0.8 for c = 0.3 and with ξ = (−0.13)−(0.11) ± 0.04−0.5 for c = 0.7. The consistency of the GPD estimator in the case of the Pareto distribution, in contrast with its wild variations for the Stretched-Exponential distributions, suggests that the first result (correctly) qualifies a regularly varying function, while the second one either (correctly) disqualifies a regularly varying function or, more conservatively, is unable to conclude. In other words, when the sample data is truly Pareto, the GPD estimator seems able to retrieve this information reliably, in contrast with the GEV estimator, which is quite unreliable in all cases. Panel (b) of table 2 presents the same results for dependent data. The GEV estimates exhibit in each case a significant bias, either positive or negative, and a huge increase of the standard deviation in the case of the Stretched-Exponential with exponent c = 0.7. Interestingly, the GEV estimator for the Pareto distribution is utterly wrong. The situation is different for the GPD estimates, which show a weak bias not really sensitive to the quantile. In contrast, the standard deviations of the GPD estimators strongly increase with the quantile, which is natural since the number of observations decreases accordingly. The GPD estimator behaves surprisingly well and seems to be the only one able to perform a reasonable estimation of the tail index. To summarize, the Maximum Likelihood estimators derived from the GEV or GPD distributions are not very efficient in the presence of dependence in the data and of non-asymptotic effects due to the slow convergence toward the asymptotic GEV or GPD distributions.
The only positive note is that the GPD estimator correctly recovers the range of the index ξ with an uncertainty smaller than 20% for data with a pure Pareto distribution, while it cannot reject the hypothesis that ξ = 0 when the data is generated with a Stretched-Exponential distribution, albeit with a very large uncertainty, in other words with very little power. Table 3 focuses on the results given by Pickands' estimator for the tail index of the GPD. For each threshold u, corresponding to the quantiles 90%, 95%, 99% and 99.5% respectively, the results of our simulations are given for two particular values of k (defined in (9)): N/k = 4, which is the largest admissible value, and N/k = 10, which corresponds to being sufficiently far in the tail of the GPD. Table 3 provides the mean value and the numerically estimated as well as the theoretical (given by (10)) standard deviation of ξ̂_{k,N}. Panel (a) gives the results for iid data. The mean values do not exhibit a significant bias for the Pareto distribution and the Stretched-Exponential with c = 0.7, but are utterly wrong in the case c = 0.3, since the estimates are comparable with those given for the Pareto distribution. In each case, we note a very good agreement between the empirical and theoretical standard deviations, even for the largest quantiles (and thus the smallest samples). Panel (b) presents the results for dependent data. The estimated standard deviations remain of the same order as the theoretical ones, contrary to the results reported by Kearns and Pagan (1997) for GARCH processes. However, like these authors, we find that the bias, either positive or negative, becomes very significant and leads one to misclassify a Stretched-Exponential distribution with c = 0.3 as a Pareto distribution with b = 3. Thus, in the presence of dependence, Pickands' estimator is unreliable. To summarize, the impact of the dependence is to add a severe scatter to the estimators, which increases their standard deviation.
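Pickands' estimator discussed above has a simple closed form on descending order statistics, ξ̂_{k,N} = ln[(X_(k) − X_(2k))/(X_(2k) − X_(4k))]/ln 2, which requires 4k ≤ N. A minimal sketch (the function name is ours, and we assume this standard definition for the k of equation (9)):

```python
import numpy as np

def pickands(sample, k):
    """Pickands' estimator of the tail index xi.

    Uses the descending order statistics X_(k), X_(2k), X_(4k);
    requires 4*k <= len(sample).
    """
    x = np.sort(sample)[::-1]           # descending order
    num = x[k - 1] - x[2 * k - 1]
    den = x[2 * k - 1] - x[4 * k - 1]
    return np.log(num / den) / np.log(2.0)

# Sanity check on iid Pareto data with b = 3 (so xi = 1/3).
rng = np.random.default_rng(2)
x = rng.pareto(3.0, size=100_000) + 1.0
xi_hat = pickands(x, k=2_000)
```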
The determination of the maximum domain of attraction with the usual estimators does not appear to be a very efficient way to study the extreme properties of dependent time series. Almost all previous studies which have investigated the tail behavior of asset returns distributions have focused on these methods (see the influential work of Longin (1996) for instance) and may thus have led to spurious results on the determination of the tail behavior. In particular, our simulations show that rapidly varying functions may be mistaken for regularly varying functions. This casts doubts on the strength of the conclusion of previous works that the distributions of returns are regularly varying, as seems to have been the consensus until now, and suggests re-examining the possibility that the distribution of returns may be rapidly varying, as suggested by Gouriéroux and Jasiak (1998) or Laherrère and Sornette (1999) for instance. We now turn to this question using the framework of GEV and GPD estimators just described.
3.5 GEV and GPD estimators of the Dow Jones and Nasdaq data sets
We have applied the same analysis as in the previous section to the real samples of the Dow Jones and Nasdaq (raw and corrected) returns. To this aim, we have randomly generated one thousand sub-samples, each constituted of ten thousand data points drawn (without replacement) from the positive or negative parts of the samples respectively. Obviously, many of the one thousand sub-samples are interdependent, as they contain parts of the same observed values. With this database, we have estimated the mean value and standard deviation of Pickands' estimator for the GPD derived from the upper quantiles of these distributions, and of the ML-estimators for the distribution of the maximum and for the GPD. The results are given in tables 4 and 5. These results confirm the confusion about the tail behavior of the returns distributions, and it seems impossible to exclude a rapidly varying behavior of their tails. Indeed, even the estimations performed by Maximum Likelihood with the GPD tail index, which appeared as the least unreliable estimator in our previous tests, do not allow us to reject the hypothesis that the tails of the empirical distributions of returns are rapidly varying. For the Nasdaq dataset, accounting for the lunch effect does not yield a significant change in the estimations, except for a very strong increase of the standard deviation of the GPD Maximum Likelihood estimator. This results from the fact that the extremes are no longer dominated by the few largest realizations of the returns at the beginning or the end of trading days. Indeed, panel (b) of table 4 shows that the sample variance of the GPD maximum likelihood estimate vanishes for quantiles 99% and 99.5%. This is due to the important overlap of the sub-samples, together with the impact of the extreme realizations of the returns at the open or the close of trading days.
In panel (c), which corresponds to the Nasdaq data corrected for the lunch effect, the sample variance vanishes in only one case (instead of four), clearly showing that the extremes are less dominated by the large returns at the beginning and the end of each day. As a last non-parametric attempt to distinguish between a regularly varying tail and a rapidly varying tail of the exponential or Stretched-Exponential families, we study the Mean Excess Function, one of the known methods that can often help in deciding which parametric family is appropriate for approximation (see Embrechts et al. (1997) for details). The Mean Excess Function MEF(u) of a random variable X (also called "shortfall" when applied to negative returns in the context of financial risk management) is defined as MEF(u) = E(X − u | X > u) .
(19)
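The sample analogue of (19) used below is simply the average excess over each threshold. A minimal sketch (helper name ours), illustrating the diagnostic behaviors invoked in the text, constant for exponential data and linearly increasing for Pareto data:

```python
import numpy as np

def mean_excess(sample, thresholds):
    """Empirical Mean Excess Function: average of (X - u) over the X > u."""
    x = np.asarray(sample)
    return np.array([(x[x > u] - u).mean() for u in thresholds])

rng = np.random.default_rng(3)
u_grid = np.linspace(1.0, 3.0, 5)

# Exponential data: MEF(u) is constant (= d); Pareto data: an increasing straight line.
mef_exp = mean_excess(rng.exponential(1.0, 200_000), u_grid)
mef_par = mean_excess(rng.pareto(3.0, 200_000) + 1.0, u_grid)
```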
The Mean Excess Function MEF(u) is obviously related to the GPD for sufficiently large thresholds u, and its behavior can be derived in this limit for the three maximum domains of attraction. In addition, more precise results can be given for particular random variables, even in a non-asymptotic regime. Indeed, for an exponential random variable X, MEF(u) is simply a constant. For a Pareto random variable, MEF(u) is an increasing straight line, whereas for the Stretched-Exponential and Gaussian distributions MEF(u) is a decreasing function. We evaluated the sample analogue of MEF(u) (Embrechts et al. 1997, p. 296), shown in figure 4. All attempts to find a constant or a linearly increasing behavior of MEF(u) on the main central part of the range of returns were ineffective. In the central part of the range of negative returns (|X| > 0.002, q ≅ 98% for ND data, and |X| > 0.025, q ≅ 96% for DJ data), MEF(u) behaves like a convex function, which excludes both the exponential and power (Pareto) distributions. Thus, the MEF(u) tool does not support using either of these two distributions. In view of the stalemate reached with the above non-parametric approaches, and in particular with the standard extreme value estimators, the sequel of this paper is devoted to the investigation of a parametric approach, in order to decide which class of extreme value distributions, rapidly versus regularly varying, accounts best for the empirical distributions of returns.
4 Fitting distributions of returns with parametric densities
Since our previous results lead us to doubt the validity of rejecting the hypothesis that the distribution of returns is rapidly varying, we now propose to pit a parametric champion for this class of functions against the Pareto champion of regularly varying functions. To represent the class of rapidly varying functions, we propose the family of Stretched-Exponentials. As discussed in the introduction, the class of stretched exponentials is motivated in part, from a theoretical viewpoint, by the fact that the large deviations of multiplicative processes are generically distributed with stretched exponential distributions (Frisch and Sornette 1997). Stretched exponential distributions are also parsimonious examples of sub-exponential distributions with fat tails, for instance in the sense of the asymptotic probability weight of the maximum compared with the sum of large samples (Feller 1971). Notwithstanding their fat-tailness, Stretched-Exponential distributions have all their moments finite6, in contrast with regularly varying distributions, for which moments of order equal to or larger than the index b are not defined. This property may provide a substantial advantage to exploit in generalizations of the mean-variance portfolio theory using higher-order moments (Rubinstein 1973, Fang and Lai 1997, Hwang and Satchell 1999, Sornette et al. 2000, Andersen and Sornette 2001, Jurczenko and Maillet 2002, Malevergne and Sornette 2002, for instance). Moreover, the existence of all moments is an important property allowing for an efficient estimation of any high-order moment, since it ensures that the estimators are asymptotically Gaussian. In particular, for Stretched-Exponentially distributed random variables, the variance, skewness and kurtosis can be well estimated, contrary to random variables with regularly varying distributions with tail index in the range 3−5.
4.1 Definition of a general three-parameter family of distributions
We thus consider a general three-parameter family of distributions and its particular restrictions corresponding to some fixed value(s) of one or two parameters. This family is defined by its density function:

f_u(x | b, c, d) = A(b, c, d, u) · x^{−(b+1)} · exp[−(x/d)^c]  if x ≥ u > 0, and f_u(x | b, c, d) = 0 if x < u.   (20)

Here, b, c, d are unknown parameters, u is a known lower threshold that will be varied for the purposes of our analysis, and A(b, c, d, u) is a normalizing constant given by the expression:

A(b, c, d, u) = c · d^b / Γ(−b/c, (u/d)^c) ,   (21)
where Γ(a, x) denotes the (non-normalized) incomplete Gamma function. The parameter b ranges from minus infinity to infinity, while c and d range from zero to infinity. In the particular case where c = 0, the parameter b also needs to be positive to ensure the normalization of the probability density function (pdf). The interval of definition of this family is the positive semi-axis. Negative log-returns will be studied by taking their absolute values. The family (20) includes several well-known pdf's often used in different applications. We enumerate them.

6 However, they do not admit an exponential moment, which leads to problems in the reconstruction of the distribution from the knowledge of their moments (Stuart and Ord 1994).
1. The Pareto distribution:

F_u(x) = 1 − (u/x)^b ,   (22)

which corresponds to the parameter set (b > 0, c = 0), with A(b, c, d, u) = b·u^b. Several works have attempted to derive or justify the existence of a power tail of the distribution of returns from agent-based models (Challet and Marsili 2002), from the optimal trading of large funds with sizes distributed according to Zipf's law (Gabaix et al. 2002) or from stochastic processes (Biham et al. 1998, 2002).

2. The Weibull distribution:
F_u(x) = 1 − exp[−(x/d)^c + (u/d)^c] ,   (23)

with parameter set (b = −c, c > 0, d > 0) and normalization constant A(b, c, d, u) = (c/d^c) exp[(u/d)^c]. This distribution is said to be a "Stretched-Exponential" distribution when the exponent c is smaller than 1, namely when the distribution decays more slowly than an exponential distribution.

3. The exponential distribution:
F_u(x) = 1 − exp(−x/d + u/d) ,   (24)

with parameter set (b = −1, c = 1, d > 0) and normalization constant A(b, c, d, u) = (1/d) exp(u/d). The exponential family can, for instance, be derived from a simple model where the stock price dynamics is governed by a geometrical (multiplicative) Brownian motion with stochastic variance. Dragulescu and Yakovenko (2002) have found an excellent fit of this model to the Dow Jones index for time lags from 1 to 250 trading days, with an asymptotic exponential tail of the distribution of log-returns and a time-dependent exponent.

4. The incomplete Gamma distribution:

F_u(x) = 1 − Γ(−b, x/d)/Γ(−b, u/d) ,   (25)

with parameter set (b, c = 1, d > 0) and normalization A(b, c, d, u) = d^b/Γ(−b, u/d).
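The normalization (21) and the closed forms quoted for the special cases above can be cross-checked by direct numerical integration of the density (20). A sketch (the helper name is ours):

```python
import numpy as np
from scipy.integrate import quad

def norm_constant(b, c, d, u):
    """A(b, c, d, u) of eqs. (20)-(21), by direct numerical integration."""
    val, _ = quad(lambda t: t ** (-(b + 1.0)) * np.exp(-(t / d) ** c), u, np.inf)
    return 1.0 / val

# Weibull/SE special case (b = -c): closed form A = (c/d**c) * exp((u/d)**c).
c, d, u = 0.7, 1.0, 0.01
A_num = norm_constant(-c, c, d, u)
A_closed = (c / d ** c) * np.exp((u / d) ** c)
```

The exponential case (b = −1, c = 1) can be checked the same way against A = (1/d) exp(u/d).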
Thus, the Pareto distribution (PD) and the exponential distribution (ED) are one-parameter families, whereas the Stretched-Exponential (SE) and the incomplete Gamma (IG) distributions are two-parameter families. The comprehensive distribution (CD) given by equation (20) contains three unknown parameters. Interesting links between these different models reveal themselves under specific asymptotic conditions. For instance, in the limit b → +∞, the Pareto model becomes the Exponential model (Bouchaud and Potters 2000). Indeed, provided that the scale parameter u of the power law is simultaneously scaled as u^b = (b/α)^b, we can write the tail of the cumulative distribution function of the PD as u^b/(u + x)^b, which is indeed of the form u^b/x^b for large x. Then, u^b/(u + x)^b = (1 + αx/b)^{−b} → exp(−αx) as b → +∞. This shows that the Exponential model can be approximated with any desired accuracy on an arbitrary interval (u > 0, U) by the PD model with parameters (b, u) satisfying u^b = (b/α)^b. Although a finite b does not give, strictly speaking, an Exponential distribution, the limit b → +∞ provides any desired approximation to the Exponential distribution, uniformly on any finite interval (u, U). More interesting for our present study is the behavior of the SE model when c → 0. In this limit, and provided that

c · (u/d)^c → β, as c → 0 ,   (26)
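The limit (1 + αx/b)^{−b} → exp(−αx) invoked above is easy to check numerically:

```python
import numpy as np

alpha = 1.5
x = np.linspace(0.0, 5.0, 101)
target = np.exp(-alpha * x)                      # exponential tail

errors = []
for b in (10.0, 100.0, 1000.0):
    tail = (1.0 + alpha * x / b) ** (-b)         # PD tail with u = b/alpha
    errors.append(np.max(np.abs(tail - target)))
```

The maximum deviation shrinks roughly in proportion to 1/b, consistent with uniform convergence on any finite interval.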
the SE model goes to the Pareto model. Indeed, we can write

(c x^{c−1}/d^c) · exp[−(x^c − u^c)/d^c]
= c (u/d)^c · (x^{c−1}/u^c) · exp[−(u/d)^c ((x/u)^c − 1)]
≃ β · x^{−1} exp[−c (u/d)^c ln(x/u)] , as c → 0
≃ β · x^{−1} exp[−β ln(x/u)]
= β u^β / x^{β+1} ,   (27)

which is the pdf of the PD model with tail index β. The condition (26) comes naturally from the properties of the maximum-likelihood estimator of the scale parameter d given by equation (47), and is underlined by equation (90) in the appendices at the end of the paper. It implies that, as c → 0, the characteristic scale d of the SE model must also go to zero with c to ensure the convergence of the SE model towards the PD model. This shows that the Pareto model can be approximated with any desired accuracy on an arbitrary interval (u > 0, U) by the SE model with parameters (c, d) satisfying equation (26), where the arrow is replaced by an equality. Although the value c = 0 does not give, strictly speaking, a Stretched-Exponential distribution, the limit c → 0 provides any desired approximation to the Pareto distribution, uniformly on any finite interval (u, U). This deep relationship between the SE and PD models allows us to understand why it can be very difficult to decide, on a statistical basis, which of them fits the data best.
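The convergence (27) of the SE density to the Pareto density under the constraint (26) can also be verified numerically. In the sketch below (helper name ours) we substitute (u/d)^c = β/c directly into the SE density, which avoids the numerical underflow of d = u (c/β)^{1/c} at very small c:

```python
import numpy as np

def se_pdf_constrained(x, c, beta, u):
    """SE density above u with (u/d)**c = beta/c substituted, as in eqs. (26)-(27)."""
    return beta * x ** (c - 1.0) / u ** c * np.exp(-(beta / c) * ((x / u) ** c - 1.0))

beta, u = 3.0, 1.0
x = np.linspace(1.0, 20.0, 200)
pareto_pdf = beta * u ** beta / x ** (beta + 1.0)

errors = []
for c in (0.1, 0.01, 0.001):
    errors.append(np.max(np.abs(se_pdf_constrained(x, c, beta, u) / pareto_pdf - 1.0)))
```

The relative discrepancy decreases roughly linearly in c, in line with the first-order expansion used in (27).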
4.2 Methodology
We start by fitting our two data sets (DJ and ND) with the five distributions enumerated above, (20) and (22-25). Our first goal is to show that no single parametric representation among the cited pdf's fits the whole range of the data sets. Recall that we analyze positive and negative returns separately (the latter being converted to the positive semi-axis). We shall use in our analysis a movable lower threshold u, restricting our sample to the observations satisfying x > u. In addition to estimating the parameters involved in each representation (20, 22-25) by maximum likelihood for each particular threshold u (the estimators and their asymptotic properties are derived in appendix A), we need a characterization of the goodness-of-fit. For this, we propose to use a distance measure between the estimated distribution and the sample distribution. Many distances can be used: mean-squared error, Kullback-Leibler distance (or divergence, strictly speaking; it is the natural distance associated with maximum-likelihood estimation, since it is for these values of the estimated parameters that the distance between the true model and the assumed model reaches its minimum), Kolmogorov distance, Spearman distance (as in Longin (1996)) or Anderson-Darling distance, to cite a few. We can also use one of these distances to determine the parameters of each pdf, according to the criterion of minimizing the distance between the estimated distribution and the sample distribution. The chosen distance is thus useful both for characterizing and for estimating the parametric pdf. In the latter case, once an estimation of the parameters of a particular distribution family has been obtained according to the selected distance, we need to quantify the statistical significance of the fit. This requires deriving the statistics associated with the chosen distance. These statistics are known for most of the distances cited above, in the limit of large samples. We have chosen the Anderson-Darling distance to derive our estimated parameters and to perform our tests of goodness of fit. The Anderson-Darling distance between a theoretical distribution function F(x) and its
empirical analog F_N(x), estimated from a sample of N realizations, is evaluated as follows:

ADS = N · ∫ [F_N(x) − F(x)]² / {F(x)(1 − F(x))} dF(x)   (28)

    = −N − 2 Σ_{k=1}^{N} {w_k log(F(y_k)) + (1 − w_k) log(1 − F(y_k))} ,   (29)
where w_k = 2k/(2N + 1), k = 1 . . . N, and y_1 ≤ . . . ≤ y_N is the ordered sample. If the sample is drawn from a population with distribution function F(x), the Anderson-Darling statistic (ADS) has a standard AD-distribution free of the theoretical df F(x) (Anderson and Darling 1952), similarly to the χ²-distribution for the χ²-statistic, or the Kolmogorov distribution for the Kolmogorov statistic. It should be noted that the ADS weights the squared difference in eq. (28) by 1/[F(x)(1 − F(x))], which is nothing but the inverse of the variance of the difference in square brackets. The AD distance thus emphasizes the tails of the distribution more than, say, the Kolmogorov distance, which is determined by the maximum absolute deviation of F_N(x) from F(x), or the mean-squared error, which is mostly controlled by the middle of the range of the distribution. Since we have to insert the estimated parameters into the ADS, this statistic no longer obeys the standard AD-distribution: the ADS decreases because the use of the fitting parameters ensures a better fit to the sample distribution. However, we can still use the standard quantiles of the AD-distribution as upper boundaries of the ADS. If the observed ADS is larger than the standard quantile at a high significance level (1 − ε), we can then conclude that the null hypothesis F(x) is rejected with a significance level larger than (1 − ε). If we wish to estimate the real significance level of the ADS in the case where it does not exceed the standard quantile at a high significance level, we are forced to use some other method of estimation, such as the bootstrap. In the following, the estimates minimizing the Anderson-Darling distance will be referred to as AD-estimates.
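The discrete form (29) is straightforward to implement. A sketch with the weights w_k = 2k/(2N + 1) as defined above (the comparison distributions below are illustrative, not those fitted in the paper):

```python
import numpy as np

def anderson_darling_stat(sample, cdf):
    """Anderson-Darling statistic, discrete form of eq. (29)."""
    y = np.sort(sample)
    n = len(y)
    k = np.arange(1, n + 1)
    w = 2.0 * k / (2.0 * n + 1.0)        # weights as defined in the text
    f = cdf(y)
    return -n - 2.0 * np.sum(w * np.log(f) + (1.0 - w) * np.log(1.0 - f))

rng = np.random.default_rng(4)
x = rng.exponential(1.0, 1000)
ads_good = anderson_darling_stat(x, lambda t: 1.0 - np.exp(-t))          # true model
ads_bad = anderson_darling_stat(x, lambda t: 1.0 - (1.0 + t) ** -3.0)    # wrong model
```

As expected, the statistic is small when the hypothesized F is the true data-generating distribution and large for a badly mis-specified one.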
The maximum likelihood estimates (ML-estimates) are asymptotically more efficient than AD-estimates for independent data, under the condition that the null hypothesis (given by one of the four distributions (22-25), for instance) corresponds to the true data generating model. When this is not the case, the AD-estimates provide a better practical tool for approximating sample distributions than the ML-estimates. We have determined the AD-estimates for 18 standard significance levels q_1 . . . q_18 given in table 6. The sample quantiles or thresholds u_1 . . . u_18 corresponding to these significance levels for our samples are also shown in table 6. Although the thresholds u_k vary from sample to sample, they always correspond to the same fixed set of significance levels q_k throughout the paper, which allows us to compare the goodness-of-fit for samples of different sizes.
4.3 Empirical results
The Anderson-Darling statistics (ADS) for five parametric distributions (Weibull or Stretched-Exponential, Generalized Pareto, Gamma, exponential and Pareto) are shown in table 7 for two quantile ranges: the top half of the table corresponds to the 90% lowest thresholds, while the bottom half corresponds to the 10% highest ones. For the lowest thresholds, the ADS rejects all distributions, except the Stretched-Exponential for the Nasdaq. Thus, none of the considered distributions is really adequate to model the data over such large ranges. For the 10% highest quantiles, only the exponential model is rejected at the 95% confidence level. The Stretched-Exponential distribution is the best, just before the Pareto distribution and the incomplete Gamma distribution, which cannot be rejected. We now present an analysis of each case in more detail.
4.3.1 Pareto distribution

Figure 5a shows the cumulative sample distribution function 1 − F(x) for the Dow Jones Industrial Average index, and figure 5b the cumulative sample distribution function for the Nasdaq Composite index. The mismatch between the Pareto distribution and the data can be seen with the naked eye: if the samples were drawn from a Pareto population, the graph in double log-scale should be a straight line. Even in the tails, this is doubtful. To formalize this impression, we calculate the Hill and AD estimators for each threshold u. Denoting y_1 ≥ . . . ≥ y_{N_u} the ordered sub-sample of values exceeding u, where N_u is the size of this sub-sample, the Hill maximum likelihood estimate of the parameter b is (Hill 1975)

b̂_u = [ (1/N_u) Σ_{k=1}^{N_u} log(y_k/u) ]^{−1} .   (30)

The standard deviation of b̂_u can be estimated as

Std(b̂_u) = b̂_u / √N_u ,   (31)
under the assumption of iid data, but it very severely underestimates the true standard deviation when the sample exhibits dependence, as reported by Kearns and Pagan (1997). Figures 6a and 6b show the Hill estimates b̂_u as a function of u for the Dow Jones and for the Nasdaq. Instead of an approximately constant exponent (as would be the case for true Pareto samples), the tail index estimator increases until u ≅ 0.04, beyond which it seems to slow its growth and oscillates around a value ≈ 3−4 up to the threshold u ≅ 0.08. It should be noted that the interval [0, 0.04] contains 99.12% of the sample, whereas the interval [0.04, 0.08] contains only 0.64% of the sample. The behavior of b̂_u for the ND shown in figure 6b is similar: Hill's estimate b̂_u seems to slow its growth already at u ≅ 0.0013, corresponding to the 95% quantile. Are these slowdowns of the growth of b̂_u genuine signatures of a possible constant, well-defined asymptotic value that would qualify a regularly varying function? As a first answer to this question, table 8 compares the AD-estimates of the tail exponent b with the corresponding maximum likelihood estimates for the 18 intervals u_1 . . . u_18. Both the maximum likelihood and Anderson-Darling estimates of b steadily increase with the threshold u (except for the highest quantiles of the positive tail of the Nasdaq). The corresponding figures for positive and negative returns are very close to each other and almost never significantly different at the usual 95% confidence level. Some slight non-monotonicity of the increase for the highest thresholds can be explained by small sample sizes. One can observe that both the MLE and ADS estimates continue increasing as the interval of estimation contracts to the extreme values. It seems that their growth potential has not been exhausted even for the largest quantile u_18, except for the positive tail of the Nasdaq sample.
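The Hill estimate (30) and its iid standard deviation (31) can be sketched as follows (helper name ours), recovering b = 3 on synthetic Pareto data:

```python
import numpy as np

def hill_estimate(sample, u):
    """Hill ML estimate (30) of the Pareto index b above threshold u,
    with its iid standard deviation (31)."""
    exceed = np.asarray(sample)
    exceed = exceed[exceed > u]
    n_u = len(exceed)
    b_hat = 1.0 / np.mean(np.log(exceed / u))
    return b_hat, b_hat / np.sqrt(n_u)

rng = np.random.default_rng(5)
x = rng.pareto(3.0, 100_000) + 1.0       # true Pareto sample, b = 3
b_hat, std = hill_estimate(x, u=np.quantile(x, 0.95))
```

On real returns data, as described above, b̂_u keeps drifting upward with u instead of stabilizing, which is the symptom the text dwells on.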
This statement might however not be very strong, as the standard deviations of the tail index estimator also grow when exploring the largest quantiles. However, the non-exhausted growth is observed for three samples out of four. Moreover, this effect is seen for several threshold values, and we can add that random fluctuations would distort the b-curve in a random manner, i.e., now up, now down, whereas we note in three cases an increasing curve. Assuming that the observation that the sample distribution can be approximated by a Pareto distribution with a growing index b is correct, an important question arises: how far beyond the sample will this growth continue? Judging from table 8, we can think that this growth is still not exhausted. Figure 7 suggests a specific form of this growth, by plotting the Hill estimator b̂_u for all four data sets (positive and negative branches of the distribution of returns for the DJ and for the ND) as a function of the index n = 1, ..., 18 of the 18 quantiles or standard significance levels q_1 . . . q_18 given in table 6. Similar results are obtained with the AD estimates. Apart from the positive branch of the ND data set, all three other branches suggest a continuous growth of the Hill estimator b̂_u as a function of n = 1, ..., 18. Since the quantiles q_1 . . . q_18 given in table 6
have been chosen to converge to 1 approximately exponentially as

1 − q_n = 3.08 e^{−0.342 n} ,   (32)

the linear fit of b̂_u as a function of n, shown as the dashed line in figure 7, corresponds to

b̂_u(q_n) = 0.08 + 0.626 ln[3.08/(1 − q_n)] .   (33)
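As a quick arithmetic check of expression (33) and of the per-decade increment discussed next:

```python
import numpy as np

def b_hat(one_minus_q):
    """Apparent tail exponent predicted by the logarithmic fit (33)."""
    return 0.08 + 0.626 * np.log(3.08 / one_minus_q)

vals = [b_hat(p) for p in (1e-3, 1e-4)]   # ~5.1 and ~6.5, as quoted in the text
step = 0.626 * np.log(10.0)               # increment per decade of 1-q, ~1.44
```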
Expression (33) suggests an unbounded logarithmic growth of b̂_u as the quantile approaches 1. For instance, for a quantile 1 − q = 0.1%, expression (33) predicts b̂_u(1 − q = 10^{−3}) = 5.1. For a quantile 1 − q = 0.01%, it predicts b̂_u(1 − q = 10^{−4}) = 6.5, and so on. Each time the quantile 1 − q is divided by a factor of 10, the apparent exponent b̂_u(q) is increased by the additive constant ≅ 1.45: b̂_u((1 − q)/10) = b̂_u(1 − q) + 1.45. This very slow growth uncovered here may be an explanation for the belief, and possibly mistaken conclusion, that the Hill and other estimators of the tail index tend to a constant for high quantiles. Indeed, it is now clear that the slowdown of the growth of b̂_u seen in figures 6, decorated by large fluctuations due to small-size effects, is mostly the result of a dilatation of the data expressed in terms of the threshold u. When recast in the more natural logarithmic scale of the quantiles q_1 . . . q_18, this slowdown disappears. Of course, it is impossible to know how long the growth given by (33) may go on as the quantile q tends to 1. In other words, how can we escape from the sample range when estimating quantiles? How can we estimate the so-called "high quantiles" at levels q > 1 − 1/T, where T is the total number of sampled points? Embrechts et al. (1997) have summarized the situation in this way: "there is no free lunch when it comes to high quantiles estimation!" It is possible that b̂_u(q) will grow without limit, as would be the case if the true underlying distribution were rapidly varying. Alternatively, b̂_u(q) may saturate to a large value, as predicted for instance by the traditional GARCH model, which yields tail indices that can reach 10−20 (Engle and Patton 2001, Starica and Pictet 1999), or by the recent multifractal random walk (MRW) model, which gives an asymptotic tail exponent in the range 20−50 (Muzy et al. 2000, Muzy et al. 2001).
According to (33), a value b̂_u ≈ 20 (respectively 50) would be attained for 1 − q ≈ 10^{−13} (respectively 1 − q ≈ 10^{−34})! If one believes in the prediction of the MRW model, the tail of the distribution of returns is regularly varying, but this insight is completely useless for all practical purposes, due to the astronomically large statistics that would be needed to sample this regime. In this context, we cannot hope to get access to the true nature of the pdf of returns, but can only strive to define the best effective or apparent, most parsimonious and robust, model. We do not discuss here the new class of estimation issues raised by the MRW model, which is interesting in itself but requires a specific analysis of its own, left for another work. The question of the exhaustion of the growth of the tail index is really crucial. Indeed, if the index is unboundedly increasing, it is the signature that the tails of the distributions of returns decay faster than any power law, and thus cannot be regularly varying. We revisit this question of the growth of the apparent exponent b, using the notion of local exponent, in Appendix C, as an attempt to better constrain this growth. The analysis developed in Appendix C basically confirms the first indication shown in figure 7.

4.3.2 Weibull distributions

Let us now fit our data with the Weibull (SE) distribution (23). The Anderson-Darling statistics (ADS) for this case are shown in table 7. The ML-estimates and AD-estimates of the form parameter c are presented in table 9. Table 7 shows that, for the highest quantiles, the ADS for the Stretched-Exponential is the smallest of all, suggesting that the SE is the best model of all. Moreover, for the lowest quantiles, it is the sole model not systematically rejected at the 95% level. The c-estimates are found to decrease when increasing the order q of the threshold u_q beyond which the estimations are performed. In addition, the c-estimate is identically zero for u_18.
However, this does not automatically imply that the SE model is not the correct model for the data, even for these highest quantiles. Indeed, numerical simulations show that, even for synthetic samples drawn from genuine Stretched-Exponential distributions with an exponent c smaller than 0.5 and a size comparable with that of our data, in about one case out of three (depending on the exact value of c) the estimated value of c is zero. This a priori surprising result comes from condition (51) in appendix A, which is not fulfilled with certainty even for samples drawn from SE distributions. Notwithstanding this cautionary remark, note that the c-estimates of the positive tail of the Nasdaq data equal zero for all quantiles higher than q_14 = 97%. In fact, in every case, the estimated c is not significantly different from zero - at the 95% significance level - for quantiles higher than q_12-q_14. In addition, table 10 gives the values of the estimated scale parameter d, which are found to be very small - particularly for the Nasdaq - beyond q_12 = 95%. In contrast, the Dow Jones keeps significant scale factors until q_16-q_17. Taken together, these pieces of evidence provide a clear indication of a change of behavior of the true pdf of these four distributions: while the bulks of the distributions seem rather well approximated by an SE model, a distribution with a fatter tail than the SE model is required for the highest quantiles. Actually, the fact that both c and d are extremely small may be interpreted, according to the asymptotic correspondence given by (26) and (27), as the signature of a possible power law tail.

4.3.3 Exponential and incomplete Gamma distributions

Let us now fit our data with the exponential distribution (24). The average ADS for this case are shown in table 7. The maximum likelihood and Anderson-Darling estimates of the scale parameter d are given in table 11. Note that they always decrease as the threshold u_q increases.
Comparing the mean ADS-values of table 7 with the standard AD quantiles, we can conclude that, on the whole, the exponential distribution (even with a moving scale parameter d) does not fit our data: this model is systematically rejected at the 95% confidence level for the lowest and highest quantiles - except for the negative tail of the Nasdaq. Finally, we fit our data with the IG-distribution (25). The mean ADS for this class of functions are shown in table 7. The maximum likelihood and Anderson-Darling estimates of the power index b are given in table 12. Comparing the mean ADS-values of table 7 with the standard AD quantiles, we can again conclude that, on the whole, the IG-distribution does not fit our data. The model is rejected at the 95% confidence level, except for the negative tail of the Nasdaq, for which it is marginally not rejected (significance level: 94.13%). However, for the largest quantiles, this model becomes relevant again since it cannot be rejected at the 95% level.
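The ADS values discussed throughout this section derive from the Anderson-Darling distance defined in equation (62) of appendix B. As an illustration of how such a statistic can be computed, here is a minimal Python sketch that fits the exponential model (24) by maximum likelihood (equation (55) of appendix A) on synthetic data above a threshold u and evaluates the AD distance; the sample size and parameter values are invented for the example.

```python
import math
import random

def anderson_darling(sample, cdf):
    """Anderson-Darling distance of eq. (62), with weights w_k = 2k/(2N+1)
    and the sample taken in increasing order."""
    xs = sorted(sample)
    n = len(xs)
    acc = 0.0
    for k, x in enumerate(xs, start=1):
        w = 2.0 * k / (2 * n + 1)
        F = cdf(x)
        acc += w * math.log(F) + (1.0 - w) * math.log(1.0 - F)
    return -n - 2.0 * acc

rng = random.Random(0)
u, d_true = 1.0, 0.5
sample = [u + rng.expovariate(1.0 / d_true) for _ in range(2000)]

d_hat = sum(sample) / len(sample) - u  # ML scale of the exponential model, eq. (55)
ads = anderson_darling(sample, lambda x: 1.0 - math.exp(-(x - u) / d_hat))
print(d_hat, ads)
```

Since the data are drawn from the fitted model itself, the resulting AD distance is small; on real returns the same computation, repeated threshold by threshold, produces the tabulated ADS values.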
4.4
Summary
At this stage, two conclusions can be drawn. First, it appears that none of the considered distributions fits the data over the entire range, which is not a surprise. Second, for the highest quantiles, three models seem able to represent the data: the incomplete Gamma model, the Pareto model and the Stretched-Exponential model. The latter has the lowest Anderson-Darling statistic and thus seems to be the most reasonable model among the three models compatible with the data.
5
Comparison of the descriptive power of the different families
As we have seen by comparing the Anderson-Darling statistics corresponding to the four parametric families (22-25), the best model in the sense of minimizing the Anderson-Darling distance is the Stretched-Exponential distribution. We now compare these four distributions with the comprehensive distribution (20) using Wilks' theorem (Wilks 1938) of nested hypotheses, to check whether or not some of the four distributions are sufficient, compared with the comprehensive distribution, to describe the data. We then turn to the Wald encompassing test for non-nested hypotheses, which provides a pairwise comparison of the different models.
5.1
Comparison between the four parametric families and the comprehensive distribution
According to Wilks' theorem, the doubled generalized log-likelihood ratio Λ,

Λ = 2 log [ max L(CD, X, Θ) / max L(z, X, θ) ],   (34)
has asymptotically (as the size N of the sample X tends to infinity) the χ²-distribution. Here L denotes the likelihood function, and θ and Θ are the parametric spaces corresponding to hypotheses z and CD respectively (hypothesis z is one of the four hypotheses (22-25), which are particular cases of the CD under some parameter relations). The statement of the theorem is valid under the condition that the sample X obeys hypothesis z for some particular value of its parameter belonging to the space θ. The number of degrees of freedom of the χ²-distribution equals the difference of the dimensions of the two spaces Θ and θ. Since dim(Θ) = 3, and dim(θ) = 2 for the Stretched-Exponential and Incomplete Gamma distributions while dim(θ) = 1 for the Pareto and the Exponential distributions, we have one degree of freedom for the former two and two degrees of freedom for the latter two. The maximum of the likelihood in the numerator of (34) is taken over the space Θ, whereas the maximum of the likelihood in the denominator of (34) is taken over the space θ. Since we always have θ ⊂ Θ, the likelihood ratio is always larger than 1, and the log-likelihood ratio is non-negative. If the observed value of Λ does not exceed some high-confidence level (say, the 99% confidence level) of the χ², we then reject the hypothesis CD in favour of the hypothesis z, considering the space Θ redundant. Otherwise, we accept the hypothesis CD, considering the space θ insufficient. The doubled log-likelihood ratios (34) are shown in figure 8 for the positive and negative branches of the distribution of returns of the Nasdaq, and in figure 9 for the Dow Jones. The 95% χ² confidence levels for 1 and 2 degrees of freedom are given by the horizontal lines. For the Nasdaq data, figure 8 clearly shows that the Exponential distribution is completely insufficient: for all lower thresholds, the Wilks log-likelihood ratio exceeds the 95% level of the χ² with one degree of freedom, namely 3.84.
The Pareto distribution is insufficient for thresholds u1 − u11 (92.5% of the ordered sample) and becomes comparable with the Comprehensive distribution in the tail u12 − u18 (7.5% of the tail probability). It is natural that the two-parameter Incomplete Gamma and Stretched-Exponential families have a higher goodness-of-fit than the one-parameter Exponential and Pareto distributions. The Incomplete Gamma distribution is comparable with the Comprehensive distribution starting with u10 (90%), whereas the Stretched-Exponential is somewhat better (u9 or u8, i.e., 70%). For the tails representing 7.5% of the data, all parametric families except the Exponential distribution fit the sample distribution with almost the same efficiency. The results obtained for the Dow Jones data, shown in figure 9, are similar. The Stretched-Exponential is comparable with the Comprehensive distribution starting with u8 (70%). On the whole, one can say that the Stretched-Exponential distribution performs better than the three other parametric families.
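To make the mechanics of the Wilks test concrete, the following Python sketch applies it to one nested pair drawn from our family: the Exponential model (24), which is the c = 1 member of the Stretched-Exponential model (23). The comprehensive distribution (20) itself is not reproduced here; the data are synthetic, the fitting equations are those of appendix A (the profile equation (46) solved by bisection on the decreasing function h of eq. (49)), and 3.84 is the 95% quantile of the χ² distribution with one degree of freedom. All numerical values are invented for the example.

```python
import math
import random

def fit_se(x, u):
    """ML fit of the Stretched-Exponential model (23): bisection on the
    decreasing function h(c) of eq. (49), then d from eq. (47), log-likelihood (48)."""
    y = [v / u for v in x]
    n = len(y)
    ly = [math.log(v) for v in y]
    mean_ln = sum(ly) / n
    def h(c):
        s = sum(v ** c for v in y) / n
        sl = sum(v ** c * lv for v, lv in zip(y, ly)) / n
        return 1.0 / c - sl / (s - 1.0) + mean_ln
    lo, hi = 1e-6, 20.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    d = ((u ** c / n) * sum(v ** c - 1.0 for v in y)) ** (1.0 / c)  # eq. (47)
    ll = n * (math.log(c / d ** c) + (c - 1.0) * sum(math.log(v) for v in x) / n - 1.0)
    return c, d, ll

def fit_ed(x, u):
    """ML fit of the Exponential model (24): d = mean(x) - u (eq. 55),
    log-likelihood from eq. (56)."""
    d = sum(x) / len(x) - u
    return d, -len(x) * (1.0 + math.log(d))

rng = random.Random(0)
u, d0 = 1.0, 0.5
x = [u + rng.expovariate(1.0 / d0) for _ in range(3000)]  # H0: exponential

c_hat, _, ll_se = fit_se(x, u)
_, ll_ed = fit_ed(x, u)
lam = 2.0 * (ll_se - ll_ed)  # doubled log-likelihood ratio (34), 1 degree of freedom
print(c_hat, lam)            # under H0, lam should rarely exceed 3.84
```

Because the exponential hypothesis is true here, the fitted shape ĉ is close to 1 and the ratio Λ typically stays below the 95% χ² threshold; on the empirical returns, by contrast, Λ for the Exponential model exceeds this level at all lower thresholds.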
We should stress that each log-likelihood ratio represented in figures 8 and 9, so to speak, “acts on its own ground”: the corresponding χ²-distribution is valid under the assumption of the validity of each particular hypothesis whose likelihood stands in the denominator of the doubled log-likelihood ratio (34). It would be desirable to compare all pairs of hypotheses directly, in addition to comparing each of them with the comprehensive distribution. Unfortunately, the Wilks theorem cannot be used for pairwise comparisons, because the problem is no longer that of comparing nested hypotheses (that is, one hypothesis being a particular case of the comprehensive model). As a consequence, our results on the comparison of the relative merits of each of the four distributions using the generalized log-likelihood ratio should be interpreted with care, in particular in case of contradictory conclusions. Fortunately, the main conclusion of the comparison (the advantage of the Stretched-Exponential distribution over the three other distributions) does not contradict our earlier results discussed above.
5.2
Pair-wise comparison of the Pareto model with the Stretched-Exponential model
In order to compare formally the descriptive power of the Stretched-Exponential distribution (the best two-parameter distribution) with that of the Pareto distribution (the best one-parameter distribution), we need to use methods for testing non-nested hypotheses. There are in fact many ways to perform such a test (see Gouriéroux and Monfort (1994) for a review). Concerning the log-likelihood ratio test used in the previous section for nested-hypotheses testing, a direct generalization to non-nested hypotheses has been provided by Cox's test (1961, 1962). However, such a test requires that the true distribution of the sample be nested in one of the two considered models. Our previous investigations have shown that this is not the case, so we need to use another testing procedure. Indeed, when comparing the Pareto model with the Stretched-Exponential model, we have found that using the methodology of non-nested hypotheses leads to inconsistencies, such as negative variances of estimators. This is another (indirect) confirmation that neither the Pareto nor the Stretched-Exponential distribution is the true distribution. In the case where none of the tested hypotheses contains the true distribution, it can be useful to consider the encompassing principle introduced by Mizon and Richard (1986). A model, say (SE), is said to encompass another model, for instance (PD), if the representative of (PD) which is the closest to the best representative of (SE) is also the best representative of (PD) per se. Here, the best representative of a model is the distribution which is the nearest to the true distribution within the considered model. The detailed testing procedure is based on the Wald and Score encompassing tests (Gouriéroux and Monfort 1994), which are presented in appendix D. Table 13 presents the results of the tests of the null hypothesis “(SE) encompasses (PD)”.
In every case, the null hypothesis cannot be rejected at the 95% significance level for quantiles higher than q6 = 50%, and at the 99% significance level for quantiles higher than q10 = 90%. The unfilled entries for the largest quantiles correspond to MLE giving c → 0. In this case, as shown in appendix D.1.2, b† has a non-trivial and well-defined limit which is nothing but the true value b. Thus, the Wald test is automatically satisfied at any confidence level. Thus, in the tail, the Stretched-Exponential model encompasses the Pareto model. We can then conclude that it provides a description of the data which is at least as good as that given by the latter. In order to test whether the (SE) model is superior to the Pareto model, we should perform the reverse test, namely the encompassing of the former model in the latter. This task is difficult, since the pseudo-true values of the parameters (c, d) are not always well-defined, as exposed in appendix D. Thus, the Wald encompassing test cannot be performed as in the previous case. As a remedy and alternative, we propose a test of the null hypothesis H0 that the Pareto distribution is the true underlying distribution. This test is
based on the fact that the quantity

η̂_T = ĉ [ (u/d̂)^ĉ + 1 ]   (35)

goes to b under the null hypothesis, whether c is positive or negative. This can be seen from the asymptotic correspondence given by (26) and (27). Moreover, it is proved in appendix D that the variable

ζ_T = T ( η̂_T / b̂ − 1 )²   (36)

asymptotically follows a χ²-distribution with one degree of freedom. The results of this test are given in table 14. They show that H0 is more often rejected for the Dow Jones than for the Nasdaq. Indeed, beyond quantile q12 = 95%, H0 cannot be rejected at the 95% confidence level for the Nasdaq data. For the Dow Jones, we must consider quantiles higher than q18 = 99% in order not to reject H0 at the 95% significance level. These results are in agreement with the central limit theorem: the power-law regime (if it really exists) is pushed back to higher quantiles due to time aggregation (recall that our Dow Jones data is at the daily scale, while our Nasdaq data is at the 5-minute time scale). In summary, the (SE) model encompasses the Pareto model as soon as one considers quantiles higher than q6 = 50%. On the other hand, the null hypothesis that the true distribution is the Pareto distribution is strongly rejected until quantiles 90% − 95% or so. Thus, within this range, the (SE) model seems the best. But for the very highest quantiles (above 95% − 98%), we can no longer reject the hypothesis that the Pareto model is the right one. Thus, for these extreme quantiles, this latter model seems to slightly outperform the (SE) model. However, recall that section 4.1 has shown that the Pareto model is retrieved as a limiting case of the (SE) model when the fractional exponent c and the scale factor d go to zero at an appropriate rate. Thus, all the results above are compatible with the (SE) model in a generalized version. Indeed, defining a generalized Stretched-Exponential model as

F̄_u(x) = exp[ −(x^c − u^c)/d^c ],   for c > 0,
F̄_u(x) = (u/x)^b,   for c = 0, with b = lim_{c→0} c (u/d)^c,   (37)
our tests show the relevance of this representation and its superiority over all the other models considered here. Indeed, we have shown that it is the best (i.e., the most parsimonious) representation of the data for all quantiles above q9 = 80%.
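The test of equations (35)-(36) can be illustrated end to end on synthetic data for which H0 holds. The sketch below draws a Pareto sample, fits the Pareto index via equation (41) of appendix A and the SE parameters via the profile equation (46), working with d^ĉ directly to avoid underflow when ĉ → 0 (the degenerate corner discussed in appendix A.2), and then forms η̂_T and ζ_T. All sample sizes and parameter values are invented for the example.

```python
import math
import random

def encompassing_stat(x, u):
    """eta_T (eq. 35) and zeta_T (eq. 36) from a Pareto fit (eq. 41) and a
    Stretched-Exponential fit (profile equation (46), bisection on h of eq. 49)."""
    y = [v / u for v in x]
    n = len(y)
    ly = [math.log(v) for v in y]
    mean_ln = sum(ly) / n
    b_hat = 1.0 / mean_ln  # Pareto ML index above threshold u, eq. (41)
    if 2.0 * mean_ln ** 2 - sum(v * v for v in ly) / n <= 0.0:
        # condition (51) fails: c_hat = 0 and eta_T -> b_hat exactly, so the
        # test is automatically passed (the unfilled entries of table 13)
        return b_hat, 0.0
    def h(c):
        s = sum(v ** c for v in y) / n
        sl = sum(v ** c * lv for v, lv in zip(y, ly)) / n
        return 1.0 / c - sl / (s - 1.0) + mean_ln
    lo, hi = 1e-3, 20.0
    if h(lo) < 0.0:
        c = lo  # root below the bracket: effectively degenerate
    else:
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if h(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        c = 0.5 * (lo + hi)
    dc = (u ** c / n) * sum(v ** c - 1.0 for v in y)  # d^c from eq. (47)
    eta = c * (u ** c / dc + 1.0)                     # eq. (35)
    zeta = n * (eta / b_hat - 1.0) ** 2               # eq. (36)
    return eta, zeta

rng = random.Random(42)
u, b0 = 1.0, 3.0
x = [u * rng.random() ** (-1.0 / b0) for _ in range(5000)]  # H0: pure Pareto tail
eta, zeta = encompassing_stat(x, u)
print(eta, zeta)  # under H0, eta is close to b0; compare zeta with 3.84 (chi2, 1 dof)
```

Note that for exact Pareto data the condition (51) sits on its boundary, so the degenerate branch and the bisection branch are both exercised with appreciable probability; in either case η̂_T tracks the true index.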
6
Discussion and Conclusions
We have presented a statistical analysis of the tail behavior of the distributions of the daily log-returns of the Dow Jones Industrial Average and of the 5-minute log-returns of the Nasdaq Composite index. We have emphasized practical aspects of the application of statistical methods to this problem. Although the application of statistical methods to the study of empirical distributions of returns seems an obvious approach, it is necessary to keep in mind the existence of necessary conditions that the empirical data must obey for the conclusions of the statistical study to be valid. Perhaps the most important condition for speaking meaningfully about distribution functions is the stationarity of the data, a difficult issue that we have not considered here. In particular, the importance of regime switching is now well established (Ramchand and Susmel 1998, Ang and Bekaert 2001) and should be accounted for.
Our purpose here has been to revisit the generally accepted fact that the tails of the distributions of returns exhibit a power-like behavior. Although there are some disagreements concerning the exact value of the power indices (the majority of previous workers accept index values between 3 and 3.5, depending on the particular asset and the investigated time interval), the power-like character of the tails of the distributions of returns is not subject to doubt. Often, the conviction of the existence of a power-like tail is based on the Gnedenko theorem, which states the existence of only three possible types of limit distributions of normalized maxima (a finite maximum value, an exponential tail, and a power-like tail), together with the exclusion of the first two types by experimental evidence. The power-like character of the log-return tail F̄(x) then follows simply from the power-like distribution of maxima. However, in this chain of arguments, the conditions needed for the fulfillment of the corresponding mathematical theorems are often omitted and not discussed properly. In addition, widely used arguments in favor of power law tails invoke the self-similarity of the data, but these are often assumptions rather than experimental evidence or consequences of economic and financial laws. Here, we have shown that standard statistical estimators of heavy tails are much less efficient than often assumed and cannot in general clearly distinguish between a power law tail and a Stretched-Exponential tail, even in the absence of long-range dependence in the volatility. In fact, this can be rationalized by our discovery that, in a certain limit where the exponent c of the stretched exponential pdf goes to zero (together with condition (26), as seen in the derivation (27)), the stretched exponential pdf tends to the Pareto distribution.
Thus, the Pareto (or power law) distribution can be approximated with any desired accuracy on an arbitrary interval by a suitable adjustment of the pair (c, d) of parameters of the stretched exponential pdf. We have then turned to parametric tests, which indicate that the class of Stretched-Exponential distributions provides a significantly better fit to empirical returns than the Pareto, the exponential or the incomplete Gamma distributions. All our tests are consistent with the conclusion that the Stretched-Exponential model provides the best effective and parsimonious apparent model to account for the empirical data. However, this does not mean that the stretched exponential (SE) model is the correct description of the tails of empirical distributions of returns. Again, as already mentioned, the strength of the SE model comes from the fact that it encompasses the Pareto model in the tail and offers a better description in the bulk of the distribution. To see where the problem arises, we report in table 6 our best ML-estimates for the SE parameters c (form parameter) and d (scale parameter) restricted to the quantile level q12 = 95%, which offers a good compromise between a sufficiently large sample size and a restricted tail range leading to an accurate approximation in this range.

Sample                 c              d
ND positive returns    0.039 (0.138)  4.54 · 10^-52 (2.17 · 10^-49)
ND negative returns    0.273 (0.155)  1.90 · 10^-7 (1.38 · 10^-6)
DJ positive returns    0.274 (0.111)  4.81 · 10^-6 (2.49 · 10^-5)
DJ negative returns    0.362 (0.119)  1.02 · 10^-4 (2.87 · 10^-4)
One can see that c is very small (and all the more so the scale parameter d) for the tail of positive returns of the Nasdaq data, suggesting a convergence to a power law tail. The exponents c for the three other tails are an order of magnitude larger, but our tests show that they are not incompatible with an asymptotic power tail either. Note also that the exponents c seem larger for the daily DJ data than for the 5-minute ND data, in agreement with an expected (slow) convergence to the Gaussian law according to the central limit theorem (see Sornette et al. (2000) and figures 3.6-3.8, p. 68, of Sornette (2000), where it is shown that SE distributions form an approximately stable family and that the effect of aggregation slowly increases the exponent c). However, a t-test does not allow us to reject the hypothesis that the exponent c remains the same for a given
tail (positive or negative) of the Dow Jones data. Thus, we confirm previous results (Lux 1996, Jondeau and Rockinger 2001, for instance) according to which the extreme tails can be considered as symmetric, at least for the Dow Jones data. This is the evidence in favor of the existence of an asymptotic power law tail. Balancing this, many of our tests have shown that the power law model does not perform as well as the SE model, even arbitrarily far in the tail (as far as the available data allow us to probe). In addition, our attempts at a direct estimation of the exponent b of a possible power law tail have failed to confirm the existence of a well-converged asymptotic value (except maybe for the positive tail of the Nasdaq). In contrast, we have found that the exponent b of the power law model systematically increases when going deeper and deeper into the tails, with no visible sign of this growth exhausting itself. We have proposed a tentative parameterization of this growth of the apparent power law exponent. We note again that this behavior is expected from models such as the GARCH or the Multifractal Random Walk models, which predict asymptotic power law tails but with exponents of the order of 20 or larger, that would be sampled at unattainable quantiles. Attempting to wrap up the different results obtained by the battery of tests presented here, we can offer the following conservative conclusion: it seems that the four tails examined here are decaying faster than any (reasonable) power law but slower than any stretched exponential. Maybe log-normal or log-Weibull distributions could offer a better effective description of the distribution of returns 9. Such a model has already been suggested by Serva et al. (2002). The correct description of the distribution of returns has important implications for the assessment of large risks not yet sampled by historical time series.
Indeed, the whole purpose of a characterization of the functional form of the distribution of returns is to extrapolate currently available historical time series beyond the range provided by the empirical reconstruction of the distributions. For risk management, the determination of the tail of the distribution is crucial. Indeed, many risk measures, such as the Value-at-Risk or the Expected-Shortfall, are based on the properties of the tail of the distributions of returns. In order to assess risk at probability levels of about 95%, non-parametric methods have merits. However, in order to estimate risks at high probability levels such as 99% or larger, non-parametric estimations fail for lack of data and parametric models become unavoidable. This shift in strategy has a cost: it replaces sampling errors by model errors. The considered distribution can be too thin-tailed, as when using normal laws, and risk will then be underestimated; or it can be too fat-tailed, and risk will be overestimated, as with Lévy laws and possibly with Pareto tails according to the present study. In each case, large amounts of money are at stake and can be lost due to a too conservative or too optimistic risk measurement. Our present study suggests that the Paretian paradigm leads to an overestimation of the probability of large events and therefore to the adoption of too conservative positions. Generalizing to larger time scales, the overly pessimistic view of large risks deriving from the Paretian paradigm should be revised all the more, due to the action of the central limit theorem. Finally, an additional note of caution is in order. This study has focused on the marginal distributions of returns calculated at fixed time scales and thus neglects the possible occurrence of runs of dependencies, such as in cumulative drawdowns.
In the presence of dependencies between returns, and especially if the dependence is non-stationary and increases in times of stress, the characterization of the marginal distributions of returns is not sufficient. As an example, Johansen and Sornette (2002) have recently shown that the recurrence time of very large drawdowns cannot be predicted from the sole knowledge of the distribution of returns, and that transient dependence effects occurring in times of stress make very large drawdowns more frequent, qualifying them as abnormal “outliers.”
9. Let us stress that we are speaking of a log-normal distribution of returns, not of prices! Indeed, the standard Black and Scholes model of a log-normal distribution of prices is equivalent to a Gaussian distribution of returns. Thus, a log-normal distribution of returns is much more fat-tailed, and in fact bracketed by power law tails and stretched exponential tails.
A Maximum likelihood estimators

In this appendix, we give the expressions of the maximum likelihood estimators derived from the four distributions (22-25).
A.1 The Pareto distribution

According to expression (22), the Pareto distribution is given by

F_u(x) = 1 − (u/x)^b,   x ≥ u,   (38)

and its density is

f_u(x|b) = b u^b / x^(b+1).   (39)

Let us denote by

L_T^PD(b̂) = max_b ∑_{i=1}^T ln f_u(x_i|b)   (40)

the maximum of the log-likelihood function derived under hypothesis (PD); b̂ is the maximum likelihood estimator of the tail index b under this hypothesis. The maximum of the likelihood function is the solution of

1/b + ln u − (1/T) ∑_{i=1}^T ln x_i = 0,

which yields

b̂ = [ (1/T) ∑_{i=1}^T ln x_i − ln u ]^(−1),   (41)

and

(1/T) L_T^PD(b̂) = ln(b̂/u) − 1 − 1/b̂.   (42)

Moreover, one easily shows that b̂ is asymptotically normally distributed:

√N (b̂ − b) ∼ N(0, b).   (43)
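A minimal numerical check of the estimator (41) on synthetic Pareto data; the sample size and index are invented for the example, and sampling uses the inverse of (38):

```python
import math
import random

def pareto_mle(x, u):
    """Tail-index estimator of eq. (41): b_hat = 1 / (mean ln x_i - ln u)."""
    return 1.0 / (sum(math.log(v) for v in x) / len(x) - math.log(u))

rng = random.Random(1)
u, b0 = 1.0, 1.5
# inverse-CDF sampling of (38): x = u * V**(-1/b) with V uniform on (0, 1)
x = [u * rng.random() ** (-1.0 / b0) for _ in range(10000)]
b_hat = pareto_mle(x, u)
print(b_hat)  # close to the true index b0 = 1.5
```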
A.2 The Weibull distribution

The Weibull distribution is given by equation (23) and its density is

f_u(x|c, d) = (c/d^c) e^((u/d)^c) x^(c−1) exp[ −(x/d)^c ],   x ≥ u.   (44)

The maximum of the log-likelihood function is

L_T^SE(ĉ, d̂) = max_{c,d} ∑_{i=1}^T ln f_u(x_i|c, d).   (45)

Thus, the maximum likelihood estimators (ĉ, d̂) are solution of

[ (1/T) ∑_{i=1}^T (x_i/u)^c ln(x_i/u) ] / [ (1/T) ∑_{i=1}^T (x_i/u)^c − 1 ] − 1/c = (1/T) ∑_{i=1}^T ln(x_i/u),   (46)

d^c = (u^c/T) ∑_{i=1}^T [ (x_i/u)^c − 1 ].   (47)

Equation (46) depends on c only and must be solved numerically. Then, the resulting value of c can be reinjected in (47) to get d. The maximum of the log-likelihood function is

(1/T) L_T^SE(ĉ, d̂) = ln( ĉ/d̂^ĉ ) + (ĉ − 1) (1/T) ∑_{i=1}^T ln x_i − 1.   (48)
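Equation (46) is conveniently solved by bisection, since its left-hand side minus its right-hand side (the function h of eq. (49) below) decreases in c. Here is a self-contained sketch on synthetic SE data; all numerical values are invented for the example, and sampling inverts the survival function of (23):

```python
import math
import random

def se_mle(x, u):
    """Solve the profile equation (46) for c by bisection on the decreasing
    function h(c) of eq. (49), then recover d from eq. (47)."""
    y = [v / u for v in x]
    n = len(y)
    ly = [math.log(v) for v in y]
    mean_ln = sum(ly) / n
    def h(c):
        s = sum(v ** c for v in y) / n
        sl = sum(v ** c * lv for v, lv in zip(y, ly)) / n
        return 1.0 / c - sl / (s - 1.0) + mean_ln
    lo, hi = 1e-3, 20.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    d = ((u ** c / n) * sum(v ** c - 1.0 for v in y)) ** (1.0 / c)  # eq. (47)
    return c, d

# synthetic SE data: inverting the survival function of (23) gives
# x = (u^c - d^c ln V)^(1/c), with V uniform on (0, 1)
rng = random.Random(7)
u, c0, d0 = 1.0, 0.7, 2.0
x = [(u ** c0 - d0 ** c0 * math.log(rng.random())) ** (1.0 / c0) for _ in range(20000)]
c_hat, d_hat = se_mle(x, u)
print(c_hat, d_hat)  # close to (c0, d0) = (0.7, 2.0)
```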
Since c > 0, the vector √N (ĉ − c, d̂ − d) is asymptotically normal, with a covariance matrix whose expression is given in appendix D by the inverse of K11. It should be noted that the maximum likelihood equations (46-47) do not admit a solution with positive c for all possible samples (x1, · · · , xN). Indeed, the function

h(c) = 1/c − [ (1/T) ∑_{i=1}^T (x_i/u)^c ln(x_i/u) ] / [ (1/T) ∑_{i=1}^T (x_i/u)^c − 1 ] + (1/T) ∑_{i=1}^T ln(x_i/u),   (49)

which is the total derivative of L_T^SE(c, d(c)), is a decreasing function of c. This means, as one can expect, that the likelihood function is concave. Thus, a necessary and sufficient condition for equation (46) to admit a solution is that h(0) be positive. After some calculations, we find

h(0) = [ 2 ( (1/T) ∑ ln(x_i/u) )² − (1/T) ∑ ln²(x_i/u) ] / [ (2/T) ∑ ln(x_i/u) ],   (50)

which is positive if and only if

2 ( (1/T) ∑ ln(x_i/u) )² − (1/T) ∑ ln²(x_i/u) > 0.   (51)

However, the probability of drawing a sample yielding a negative maximum-likelihood estimate of c tends to zero (under the hypothesis of a SE with positive c) as

Φ( −(c/σ) √N ) ≃ [ σ / (√(2πN) c) ] e^(−N c²/(2σ²)),   (52)

i.e., exponentially with respect to N. Here σ² is the variance of the limit Gaussian distribution of the maximum-likelihood c-estimator, which can be derived explicitly. If h(0) is negative, L_T^SE reaches its maximum at c = 0 and in such a case

(1/T) L_T^SE(c = 0) = − ln[ (1/T) ∑ ln(x_i/u) ] − (1/T) ∑ ln x_i − 1.   (53)

Now, if we apply maximum likelihood estimation based on the SE assumption to samples distributed differently from a SE, then we can get a negative c-estimate with some positive probability not tending to zero as N → ∞. If the sample is distributed according to the Pareto distribution, for instance, then the maximum-likelihood c-estimate converges in distribution to a Gaussian random variable with zero mean, and thus the probability of negative c-estimates converges to 0.5.
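Condition (51), and the non-negligible probability of the degenerate estimate ĉ = 0 in finite samples, can be explored numerically. The following sketch draws many small samples from a genuine SE law with a small exponent and counts how often (51) fails; the sample sizes and parameter values are invented for the example.

```python
import math
import random

def admits_positive_c(x, u):
    """Condition (51): equation (46) has a positive solution c iff
    2 * (mean ln(x/u))^2 - mean ln^2(x/u) > 0."""
    ly = [math.log(v / u) for v in x]
    n = len(ly)
    m1 = sum(ly) / n
    m2 = sum(v * v for v in ly) / n
    return 2.0 * m1 * m1 - m2 > 0.0

def se_sample(n, u, c, d, rng):
    """Inverse-CDF sampling of the SE model (23): x = (u^c - d^c ln V)^(1/c)."""
    return [(u ** c - d ** c * math.log(rng.random())) ** (1.0 / c) for _ in range(n)]

rng = random.Random(3)
n_trials, n_obs = 500, 50
frac_degenerate = sum(
    1 for _ in range(n_trials)
    if not admits_positive_c(se_sample(n_obs, 1.0, 0.3, 0.1, rng), 1.0)
) / n_trials
print(frac_degenerate)  # an appreciable fraction of genuine SE samples gives c_hat = 0
```

The fraction obtained depends on c, on d/u and on the sample size; the point of the exercise is simply that it is not zero even though every sample truly comes from a Stretched-Exponential law.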
A.3 The Exponential distribution

The Exponential distribution function is given by equation (24), and its density is

f_u(x|d) = (e^(u/d)/d) exp(−x/d),   x ≥ u.   (54)

The maximum of the log-likelihood function is reached at

d̂ = (1/T) ∑_{i=1}^T x_i − u,   (55)

and is given by

(1/T) L_T^ED(d̂) = −(1 + ln d̂).   (56)

The random variable √N (d̂ − d) is asymptotically normally distributed with zero mean and variance d².
A.4 The Incomplete Gamma distribution

The expression of the Incomplete Gamma distribution function is given by (25) and its density is

f_u(x|b, d) = [ d^b / Γ(−b, u/d) ] x^(−(b+1)) exp(−x/d),   x ≥ u.   (57)

Let us introduce the partial derivative of the logarithm of the incomplete Gamma function:

Ψ(a, x) = ∂/∂a ln Γ(a, x) = [ 1/Γ(a, x) ] ∫_x^∞ dt ln t · t^(a−1) e^(−t).   (58)

The maximum of the log-likelihood function is reached at the point (b̂, d̂) solution of

(1/T) ∑_{i=1}^T ln(x_i/d) = Ψ(−b, u/d),   (59)

(1/T) ∑_{i=1}^T x_i/d = (u/d)^(−b) e^(−u/d) / Γ(−b, u/d) − b,   (60)

and is equal to

(1/T) L_T^IG(b̂, d̂) = − ln d̂ − ln Γ(−b̂, u/d̂) − (b̂ + 1) Ψ(−b̂, u/d̂) + b̂ − (u/d̂)^(−b̂) e^(−u/d̂) / Γ(−b̂, u/d̂).   (61)
B Minimum Anderson-Darling Estimators

We derive in this appendix the expressions allowing the calculation of the parameters which minimize the Anderson-Darling distance between the assumed distribution and the true distribution. Given the ordered sample x1 ≤ x2 ≤ · · · ≤ xN, the AD-distance is given by

AD_N = −N − 2 ∑_{k=1}^N [ w_k log F(x_k|α) + (1 − w_k) log(1 − F(x_k|α)) ],   (62)

where α represents the vector of parameters and w_k = 2k/(2N + 1). It is easy to show that the minimum is reached at the point α̂ solution of

∑_{k=1}^N ( 1 − w_k/F(x_k|α) ) ∂/∂α log(1 − F(x_k|α)) = 0.   (63)

B.1 The Pareto distribution

Applying equation (63) to the Pareto distribution yields

∑_{k=1}^N [ w_k / (1 − (u/x_k)^b) ] ln(u/x_k) = ∑_{k=1}^N ln(u/x_k).   (64)

This equation always admits a unique solution, and can easily be solved numerically.
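Equation (64) can indeed be solved by bisection, since its left-hand side minus its right-hand side is monotonic in b. Here is a sketch on synthetic Pareto data; the sample size and index are invented for the example:

```python
import math
import random

def ad_pareto_b(x, u, b_lo=1e-2, b_hi=50.0):
    """Minimum Anderson-Darling index for the Pareto model, solving eq. (64):
    sum_k w_k ln(u/x_k) / (1 - (u/x_k)^b) = sum_k ln(u/x_k), w_k = 2k/(2N+1)."""
    xs = sorted(x)
    n = len(xs)
    lu = [math.log(u / v) for v in xs]  # ln(u/x_k) <= 0 since x_k >= u
    rhs = sum(lu)
    def g(b):
        return sum((2.0 * (k + 1) / (2 * n + 1)) * lu[k] / (1.0 - (u / xs[k]) ** b)
                   for k in range(n)) - rhs
    lo, hi = b_lo, b_hi
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:  # g increases in b: from -inf at b -> 0 to a positive limit
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(5)
u, b0 = 1.0, 2.0
x = [u * rng.random() ** (-1.0 / b0) for _ in range(5000)]
b_ad = ad_pareto_b(x, u)
print(b_ad)  # close to the true index b0 = 2.0
```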
B.2 Stretched-Exponential distribution

In the Stretched-Exponential case, we obtain the two following equations

∑_{k=1}^N ( 1 − w_k/F_k ) [ (u/d)^c ln(u/d) − (x_k/d)^c ln(x_k/d) ] = 0,   (65)

∑_{k=1}^N ( 1 − w_k/F_k ) ( u^c − x_k^c ) = 0,   (66)

with

F_k = 1 − exp[ −(x_k^c − u^c)/d^c ].   (67)

After some simple algebraic manipulations, the first equation can be slightly simplified, to finally yield

∑_{k=1}^N ( 1 − w_k/F_k ) (x_k/u)^c ln(x_k/u) = 0,   (68)

∑_{k=1}^N ( 1 − w_k/F_k ) [ (x_k/u)^c − 1 ] = 0.   (69)

However, these two equations remain coupled. Moreover, we have not yet been able to prove the uniqueness of the solution.
B.3 Exponential distribution

In the exponential case, equation (63) becomes

∑_{k=1}^N ( w_k/F_k − 1 ) ( u − x_k ) = 0,   (70)

with

F_k = 1 − exp[ −(x_k − u)/d ].   (71)

Here again, we can show that this equation admits a unique solution.
C Local exponent

In this appendix, we come back to the notion of a local Pareto exponent discussed above in section 4.3.1, with figure 7 and expression (33). We call the local exponent β. Generally speaking, any positive smooth function g(x) can be represented in the form

g(x) = 1/x^β(x),   x > 1,   (72)

by defining β(x) = − ln(g(x))/ln x. Thus, the local index β(x) of a distribution F(x) will be defined as

β(x) = − ln(1 − F(x)) / ln(x).   (73)

For example, the log-normal distribution can be mistaken for a power law over several decades, with very slowly (logarithmically) varying exponents, if its variance is large (see figures 4.2 and 4.3 in section 4.1.3 of (Sornette 2000)). When we approximate a sample distribution by some parametric family with moving index β(x), it is important that β(x) should have as small a variation as possible, i.e., that the representation (72) be parsimonious. Given a sample x1, x2, · · · , xN drawn from a distribution function F(x), x ≥ 1, the tail index β(x) is consistently estimated by

β̂(x) = [ ln N − ln( ∑_i 1_{x_i > x} ) ] / ln(x),   (74)

where 1_{·} is the indicator function, which equals one if its argument is true and zero otherwise. The asymptotic distribution of the estimator is easily derived and reads

N^(1/2) · ( e^(ln(x)·β̂(x)) − e^(ln(x)·β(x)) ) →d N(0, σ²),   (75)

with

σ² = 1 / [ F(x) · (1 − F(x)) ].   (76)
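The estimator (74) is straightforward to implement. The sketch below checks, on a simulated Pareto sample (index and sample sizes invented for the example), that the local exponent stays close to the true index, as in the upper panel of figure 10:

```python
import math
import random

def beta_hat(sample, x):
    """Local Pareto exponent of eq. (74): (ln N - ln #{x_i > x}) / ln x."""
    n_exceed = sum(1 for v in sample if v > x)
    return (math.log(len(sample)) - math.log(n_exceed)) / math.log(x)

rng = random.Random(11)
b0 = 1.2
sample = [rng.random() ** (-1.0 / b0) for _ in range(100000)]  # Pareto, x >= 1
betas = [beta_hat(sample, x) for x in (2.0, 5.0, 10.0)]
print(betas)  # each value oscillates closely around the true index b0 = 1.2
```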
As an example, let us illustrate the properties of the local index β(x) for regularly varying distributions. The general representation of any regularly varying distribution is given by F̄(x) = L(x) · x^(−α), where L(·) is a slowly varying function. In such a case, the local power index can be written as

β(x) = α − ln L(x) / ln(x),   (77)

which goes to α as x goes to infinity, as expected from the built-in slow variation of L(x). The upper panel of figure 10 shows the local index β(x) for a simulated Pareto distribution with power index b = 1.2 (X > 1). The estimate β(x) oscillates very closely to the true value b = 1.2. Let us now assume that we observe a regularly varying local exponent

β(x) = L(x) · x^c,   with c > 0;   (78)

it would clearly be the characterization of a Stretched-Exponential distribution

F̄(x) = exp[ −L'(x) · x^c ],   (79)
where L'(x) = ln(x) L(x). The lower panel of figure 10 shows the local index β(x) for a simulated Stretched-Exponential distribution with c = 0.3. In this case, the local index β(x) continuously increases, which corresponds to our intuition that a Stretched-Exponential behaves like a power law with an ever increasing exponent. Figure 11 shows the local index β(x) for a distribution constructed by joining two Pareto distributions with exponents b1 = 0.70 and b2 = 1.5 at the cross-over point u1 = 10. In this case, the local index β(x) again increases, but not as quickly as in the previous Stretched-Exponential case. Even for such a large sample size (n = 15000), the “final” β(x) is about 1.3, which is still less than the true b-value of the second Pareto part (b = 1.5), showing the existence of strong cross-over effects. Such very strong cross-over effects occurring in the presence of a transition from a power law to another power law have already been noticed in (Sornette et al. 1996). Similarly to the local index β(x) based on the Pareto distribution taken as a reference, we define the notion of a local exponent c(x) taking Stretched-Exponential distributions as a reference. Indeed, given any sufficiently smooth positive function g(x), one can always find a function c(x) such that

g(x) = exp[ 1 − x^c(x) ],   x ≥ 1,   (80)

with c(x) = ln(1 − ln(g(x)))/ln(x). Obviously, for any Stretched-Exponential distribution with exponent c, the local exponent c(x) converges to c as x goes to infinity. This property is the same as for the local index β(x), which goes to the true tail index β for regularly varying distributions. Figure 12 shows the sample tail (continuous line), the local index β(x) (dashed line) and the local exponent c(x) (dash-dotted line) for the negative tail of the Nasdaq five-minute returns. The local exponent c(x) clearly reaches an asymptotic value ≅ 0.32 for large enough values of the returns.
In contrast, the local index β(x) continues to increase. The lower panel shows the local index β(x) in double logarithmic scale. Over a large range, β(x) increases approximately as a power law of index 0.77, while beyond the 99% quantile (see the inset) it behaves as a power law with a smaller index equal to 0.54, implying a decelerating growth. The goodness of fit of the regression of ln β(x) on ln x has been qualified by a χ² test, which does not allow us to reject this model at any usual confidence level. Note that a power-law dependence of the local Pareto exponent β(x) as a function of x qualifies a Stretched-Exponential distribution, according to (78) and (79). The second regime, fitted with the exponent c = 0.54, still seems perturbed by a cross-over effect, since it does not retrieve the value c ≈ 0.32 which characterizes the tail of the distributions of returns according to the Stretched-Exponential model. These fits quantifying the growth of the local Pareto exponent are not in contradiction with figure 7 and expression (33); indeed, a decreasing positive exponent is an alternative description of a logarithmic growth and vice-versa. These fits refine the previously rough characterization of the growth of the exponent estimated per quantile, shown in figure 7 and with expression (33), by using a better characterization of the "local" exponent. The positive tail of the Nasdaq and both tails of the Dow Jones exhibit exactly the same continuously increasing behavior with the same characteristics, so we do not show them. Taken together, these observations suggest that the Stretched-Exponential representation provides a better model of the tail behavior of large returns than does a regularly varying distribution.
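To make the definition concrete, here is a minimal numerical sketch (Python with NumPy; the function name `local_se_exponent` is ours, not from the text) of the local exponent c(x) of equation (80), computed from the empirical survival function of a simulated Stretched-Exponential sample with c = 0.3:

```python
import numpy as np

def local_se_exponent(x, survival):
    # c(x) = ln(1 - ln g(x)) / ln(x), eq. (80), with g the survival
    # function normalised so that g(1) = 1
    return np.log(1.0 - np.log(survival)) / np.log(x)

rng = np.random.default_rng(0)
c_true = 0.3
# Stretched-Exponential sample with survival g(x) = exp(1 - x^c), x >= 1,
# simulated by inversion: x = (1 - ln U)^(1/c), with U uniform on (0, 1)
u = rng.uniform(size=200_000)
x = np.sort((1.0 - np.log(u)) ** (1.0 / c_true))

surv = 1.0 - np.arange(1, x.size + 1) / x.size   # empirical survival at each x
keep = (x > 1.5) & (surv > 1e-4)                 # avoid ln x ~ 0 and the noisy far tail
c_loc = local_se_exponent(x[keep], surv[keep])
print(c_loc.mean())   # fluctuates around c_true = 0.3
```

For an exact Stretched-Exponential survival function the estimator returns c identically; on the empirical tail it converges to c as x grows, which is the property exploited in figure 12.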
D Testing non-nested hypotheses with the encompassing principle

D.1 Testing the Pareto model against the (SE) model

D.1.1 Pseudo-true value

Let us consider the two models, stretched exponential (SE) and Pareto (PD), whose pdfs are f1(x|c, d) and f2(x|b) respectively. Under the true distribution f0(x), we will determine the pseudo-true values of the maximum likelihood estimators b̂, ĉ and d̂, namely the values of these parameters which minimize the Kullback-Leibler distance between the considered model and the true distribution (Gouriéroux and Monfort 1994). Thus, the pseudo-true values b*, c* and d* of b̂, ĉ and d̂ appear as the expected values of the estimators under f0. For instance,

    b* = arg inf_b E0[ ln( f0(x) / f2(x|b) ) ],                          (81)

where, in all that follows, E0[·] denotes the expectation under the probability measure associated with the true distribution f0. Thus, b* is simply the solution of

    (∂/∂b) E0[ ln( f0(x) / f2(x|b) ) ] = 0,                              (82)

which yields
    b* = ( E0[ln x] - ln u )^(-1),                                       (83)

and is consistently estimated by the maximum likelihood estimator

    b̂ = ( (1/T) Σ_{i=1}^T ln x_i - ln u )^(-1).                          (84)
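Equation (84) is nothing but the Hill estimator of the tail index above the threshold u. A minimal numerical check (a Python/NumPy sketch with our own naming, not code from the thesis) on simulated Pareto data:

```python
import numpy as np

rng = np.random.default_rng(1)
b_true, u = 1.5, 1.0
# Pareto sample above u: survival (u/x)^b, simulated by inversion
x = u * rng.uniform(size=100_000) ** (-1.0 / b_true)

# maximum likelihood estimator of eq. (84)
b_hat = 1.0 / (np.log(x).mean() - np.log(u))
print(b_hat)   # close to b_true = 1.5
```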
In fact, the maximum likelihood estimator b̂ converges to its pseudo-true value, with (b̂ - b*) asymptotically normally distributed with zero mean. Similarly, the maximum likelihood estimators ĉ and d̂ converge to their pseudo-true values c* and d*.

D.1.2 Binding functions and encompassing

Let us now ask what is the value b†(c, d) of the parameter b for which f2(x|b) is the nearest to f1(x|c, d), for a given (c, d). Such a value b† is the binding function and is the solution of

    b†(c, d) = arg inf_b E1[ ln( f1(x|c, d) / f2(x|b) ) ],               (85)

which involves only f1 and f2 but not the true distribution f0. The binding function is given by

    b†(c, d) = ( E1[ln(x/d)] - ln(u/d) )^(-1).                           (86)

After some calculations, we find

    E1[ln(x/d)] = ∫_u^∞ ln(x/d) (c/d) (x/d)^(c-1) e^{(u/d)^c - (x/d)^c} dx   (87)
                = ln(u/d) + (e^{(u/d)^c} / c) Γ(0, (u/d)^c),             (88)
so that the binding function can be expressed as

    b†(c, d) = c · e^{-(u/d)^c} / Γ(0, (u/d)^c),   when c > 0.           (89)
In the case where c goes to zero while c and d remain related by equation (47), we can still calculate b†. Indeed, we can first show that

    (u/d)^c ≃ [ (c/T) Σ_{i=1}^T ln(x_i/u) ]^(-1) = b̂ / c.               (90)

Now, using the asymptotic relation

    e^x Γ(0, x) ∼ x^(-1),   as x → ∞,                                    (91)

we conclude that

    b†(ĉ, d̂) = b̂,   as ĉ → 0.                                           (92)
This result is in fact natural, because the (PD) model can be seen formally as the limit of the (SE) model for c → 0 under the condition

    c · (u/d)^c → β,   as c → 0,                                         (93)

as previously shown in section 4.1.

Now, following Mizon and Richard (1986), the model (SE) with pdf f1 is said to encompass the model (PD) with pdf f2 if the best representative of (PD), with parameter b*, is also the distribution nearest to the best representative of (SE), with parameters (c*, d*). Thus, (SE) is said to encompass (PD) if and only if b* = b†(c*, d*). The reverse situation can be considered in order to study the encompassing of the model (SE) by the model (PD). Such a situation occurs if and only if

    (c*, d*) = ( c†(b*), d†(b*) ).                                       (94)

D.1.3 Wald encompassing test

We first test the encompassing of (PD) into (SE), namely the null hypothesis H0 = {b* = b†(c*, d*)}. Under this null hypothesis, it can be shown (Gouriéroux and Monfort 1994) that the random variable √T ( b̂ - b†(ĉ, d̂) ) is asymptotically normally distributed with zero mean and variance V given by

    V = K22^(-1) [ C22 - C21 C11^(-1) C12 ] K22^(-1)
      + K22^(-1) [ C21 C11^(-1) - C̃21 K11^(-1) ] C11 [ C11^(-1) C12 - K11^(-1) C̃12 ] K22^(-1),   (95)
where the expressions of the coefficients involved in V will be given below. Thus, under H0, the random variable ξT = T ( b̂ - b†(ĉ, d̂) ) V̂^(-1) ( b̂ - b†(ĉ, d̂) ), where V̂^(-1) is a consistent estimator of V^(-1), asymptotically follows a χ²-distribution with one degree of freedom.

The matrix K11 is given by

    K11(i, j) = -E0[ ∂² ln f1(x|c*, d*) / (∂αi ∂αj) ],   i, j = 1, 2,    (96)
where α = (c*, d*). It can be consistently estimated by

    K̂11(1, 1) = 1/ĉ² - ln²(u/d̂) (u/d̂)^ĉ + (1/T) Σ_{i=1}^T ln²(x_i/d̂) (x_i/d̂)^ĉ,   (97)

    K̂11(1, 2) = K̂11(2, 1) = (ĉ/d̂) [ ln(u/d̂) (u/d̂)^ĉ - (1/T) Σ_{i=1}^T ln(x_i/d̂) (x_i/d̂)^ĉ ],   (98)

    K̂11(2, 2) = (ĉ/d̂)².                                                 (99)

The coefficient K22 is given by

    K22 = -E0[ ∂² ln f2(x|b*) / ∂b*² ] = 1/b*²,                          (100)
which is consistently estimated by K̂22 = b̂^(-2).

Now, we have to calculate the two components of the vector C̃12 = C̃21^t. Its first component is

    C̃12(1) = E1[ (∂ ln f1(x|c*, d*)/∂c*) · (∂ ln f2(x|b*)/∂b*) ]        (101)
            = E1[ { 1/c* + ln(u/d*) (u/d*)^c* + ln(x/d*) - ln(x/d*) (x/d*)^c* } · { 1/b* + ln(u/d*) - ln(x/d*) } ]   (102)
            = ( 1/c* + ln(u/d*) (u/d*)^c* ) ( 1/b* + ln(u/d*) )
              - ( 1/c* + ln(u/d*) (u/d*)^c* ) E1[ln(x/d*)]
              + ( 1/b* + ln(u/d*) ) ( E1[ln(x/d*)] - E1[ln(x/d*) (x/d*)^c*] )
              - E1[ln²(x/d*)] + E1[ln²(x/d*) (x/d*)^c*].                 (103)
Some simple calculations show that

    E1[ ln(x/d) (x/d)^c ] = ln(u/d) (u/d)^c + 1/c + E1[ln(x/d)],         (104)

    E1[ ln²(x/d) (x/d)^c ] = ln²(u/d) (u/d)^c + (2/c) E1[ln(x/d)] + E1[ln²(x/d)],   (105)

which allows us to show that the first and third terms cancel out, and it remains

    C̃12(1) = (1/c*) E1[ln(x/d*)] - ln(u/d*) (u/d*)^c* ( E1[ln(x/d*)] - ln(u/d*) ).   (106)
The second component is

    C̃12(2) = E1[ (∂ ln f1(x|c*, d*)/∂d*) · (∂ ln f2(x|b*)/∂b*) ]        (107)
            = E1[ -(c*/d*) { 1 + (u/d*)^c* - (x/d*)^c* } · { 1/b* + ln(u/d*) - ln(x/d*) } ]   (108)
            = -(c*/d*) [ ( 1/b* + ln(u/d*) ) ( 1 + (u/d*)^c* )
              - ( 1 + (u/d*)^c* ) E1[ln(x/d*)]
              - ( 1/b* + ln(u/d*) ) E1[(x/d*)^c*] + E1[ (x/d*)^c* ln(x/d*) ] ].   (109)

Now, accounting for the relation E1[x^c] = u^c + d^c, the first and third terms within the brackets cancel out, and accounting for equation (104) yields

    C̃12(2) = -(c*/d*) [ 1/c* + (u/d*)^c* ( ln(u/d*) - E1[ln(x/d*)] ) ].   (110)

Finally, C̃12 can be consistently estimated by replacing (c*, d*) with (ĉ, d̂) and using the relation

    E1[ln(x/d)] = ln(u/d) + (1/c) e^{(u/d)^c} Γ(0, (u/d)^c).             (111)
Let us now define

    g1(x|c*, d*) = ( ∂ ln f1(x|c*, d*)/∂c* , ∂ ln f1(x|c*, d*)/∂d* )^t
                 = ( 1/c* + ln(u/d*) (u/d*)^c* + ln(x/d*) - ln(x/d*) (x/d*)^c* ,
                     -(c*/d*) [ 1 + (u/d*)^c* - (x/d*)^c* ] )^t,          (112)

    g2(x|b*) = ∂ ln f2(x|b*)/∂b* = 1/b* + ln u - ln x.                   (113)

The matrices Cij are defined as

    Cij = E0[ gi(x) · gj(x)^t ],   i, j = 1, 2,                          (114)

which can be consistently estimated by

    Ĉij = (1/T) Σ_{k=1}^T gi(x_k) · gj(x_k)^t,   i, j = 1, 2.            (115)
D.2 Testing the (SE) model against the Pareto model

Let us now assume that, beyond a given high threshold u, the true model is the Pareto model, that is, the true distribution of returns is a power law with pdf

    f0(x|b) = b u^b / x^{b+1},   x ≥ u.                                  (116)

This will be our null hypothesis H0.

Now consider the maximum likelihood estimators (ĉ, d̂) of the model (SE). They are solutions of equations (46-47):

    [ (1/T) Σ_{i=1}^T (x_i/u)^c ln(x_i/u) ] / [ (1/T) Σ_{i=1}^T (x_i/u)^c - 1 ] - 1/c = (1/T) Σ_{i=1}^T ln(x_i/u),   (117)

    (d/u)^c = (1/T) Σ_{i=1}^T (x_i/u)^c - 1.                             (118)

Under H0, (ĉ, d̂) converges to the pseudo-true values, solutions of

    E0[ (x/u)^c ln(x/u) ] / ( E0[(x/u)^c] - 1 ) - 1/c = E0[ln(x/u)],     (119)

    d^c = E0[x^c] - u^c,                                                 (120)
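Equations (117)-(118) are straightforward to solve numerically: (118) gives d as a function of c, and substituting it into (117) leaves a one-dimensional root search in c. A sketch (Python with NumPy/SciPy; function names are ours, not from the thesis), checked on data actually drawn from the (SE) model:

```python
import numpy as np
from scipy.optimize import brentq

def se_mle(x, u):
    """Maximum likelihood estimates (c, d) of the (SE) model, eqs. (117)-(118)."""
    s = x / u
    lg = np.log(s)

    def eq117(c):
        sc = s ** c
        return (sc * lg).mean() / (sc.mean() - 1.0) - 1.0 / c - lg.mean()

    c_hat = brentq(eq117, 1e-3, 5.0)                            # root of eq. (117)
    d_hat = u * (np.mean(s ** c_hat) - 1.0) ** (1.0 / c_hat)    # eq. (118)
    return c_hat, d_hat

rng = np.random.default_rng(4)
c_true, d_true, u = 0.7, 0.5, 1.0
# (SE) sample above u, simulated by inversion of the survival function
U = rng.uniform(size=200_000)
x = d_true * ((u / d_true) ** c_true - np.log(U)) ** (1.0 / c_true)

c_hat, d_hat = se_mle(x, u)
print(c_hat, d_hat)   # near (0.7, 0.5)
```

The bracket (1e-3, 5.0) assumes the data are genuinely (SE)-like; under the Pareto null discussed next, the root of (117) is pushed to c = 0 and this solver degenerates, which is exactly why the statistic of the following subsection is needed.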
where E0[·] denotes the expectation with respect to the power-law distribution f0. We have

    E0[(x/u)^c] = b/(b - c),   for c < b,                                (121)

    E0[ln(x/u)] = 1/b,                                                   (122)

    E0[(x/u)^c ln(x/u)] = b/(b - c)².                                    (123)
Thus, we easily obtain that the unique solution of (119) is c = 0, so that equation (120) no longer makes sense. Hence, under H0, ĉ goes to zero and d̂ is not well defined, and the Wald test cannot be performed under such a null hypothesis. We must find another way to test (SE) against H0. To this aim, we remark that the quantity

    η̂T = ĉ [ (u/d̂)^ĉ + 1 ]                                              (124)

is still well defined. Using (120), it is easy to show that, as T goes to infinity, this quantity goes to b, whether ĉ is positive or goes to zero. For positive c, this is obvious from (120) and (121), while for c = 0, expanding (118) around ĉ = 0 yields

    ĉ (u/d̂)^ĉ = [ (1/T) Σ_{i=1}^T ln(x_i/u) ]^(-1)                      (125)
               = b̂ → b.                                                 (126)
In order to test the descriptive power of the (SE) model against the null hypothesis that the true model is the Pareto model, we can consider the statistic

    ζT = T ( η̂T / b̂ - 1 ),                                              (127)

which asymptotically follows a χ²-distribution with one degree of freedom. Indeed, expanding the quantity (x_i/u)^ĉ in a power series around ĉ = 0 gives

    (x_i/u)^ĉ ≃ 1 + ĉ ln(x_i/u) + (ĉ²/2) ln²(x_i/u) + (ĉ³/6) ln³(x_i/u) + ···,   as ĉ → 0,   (128)

which allows us to get

    (1/T) Σ_i (x_i/u)^ĉ ≃ 1 + ĉ S1 + (ĉ²/2) S2 + (ĉ³/6) S3,             (129)

    (1/T) Σ_i (x_i/u)^ĉ ln(x_i/u) ≃ S1 + ĉ S2 + (ĉ²/2) S3,              (130)

where

    S1 = (1/T) Σ_{i=1}^T ln(x_i/u),                                      (131)

    S2 = (1/T) Σ_{i=1}^T ln²(x_i/u),                                     (132)

    S3 = (1/T) Σ_{i=1}^T ln³(x_i/u).                                     (133)
Thus, we get from equation (117):

    ĉ ≃ ( (1/2) S2 - S1² ) / ( (1/2) S1 S2 - (1/3) S3 ),                 (135)

and from equations (124) and (118):

    η̂T / b̂ - 1 = ĉ · [ S1² - (1/2) S2 + (ĉ/2) ( S1 S2 - (1/3) S3 ) ] / [ S1 + (ĉ/2) S2 + (ĉ²/6) S3 ].   (136)
Now, accounting for the fact that the variables ξ1 = S1 - b^(-1), ξ2 = S2 - 2 b^(-2) and ξ3 = S3 - 6 b^(-3) are asymptotically Gaussian random variables with zero mean and standard deviation of order T^(-1/2), we obtain, at the lowest order in T^(-1/2):

    ĉ = b² ( 2 ξ1 - (b/2) ξ2 ),                                          (137)

and

    [ S1² - (1/2) S2 + (ĉ/2) ( S1 S2 - (1/3) S3 ) ] / [ S1 + (ĉ/2) S2 + (ĉ²/6) S3 ] = 2 ξ1 - (b/2) ξ2,   (138)

which shows that

    η̂T / b̂ - 1 = b² ( 2 ξ1 - (b/2) ξ2 )².                               (139)

We now use the fact that ξ1 and ξ2 are asymptotically Gaussian random variables with zero mean. We find their variances and covariance:

    Var(ξ1) = 1/(T b²),   Var(ξ2) = 20/(T b⁴),   and   Cov(ξ1, ξ2) = 4/(T b³).   (140)

Using equation (139), the random variable b ( 2 ξ1 - (b/2) ξ2 ) is, in the limit of large T, a Gaussian random variable with zero mean and variance 1/T, i.e., ζT is asymptotically distributed according to a χ²-distribution with one degree of freedom. Thus, the test consists in accepting H0 if ζT ≤ χ²_{1-ε}(1) and rejecting it otherwise, where ε denotes the asymptotic level of the test.
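The whole construction is easy to exercise by Monte Carlo. The sketch below (Python/NumPy; the function name is ours) computes ζT through the series estimate (135) of ĉ together with relations (118), (124) and (127), and checks on simulated Pareto samples that H0 is accepted at the 95% level in roughly 95% of the replications:

```python
import numpy as np

def zeta_stat(x, u):
    """Test statistic zeta_T of eq. (127) for testing (SE) against the Pareto null."""
    lg = np.log(x / u)
    T = lg.size
    S1, S2, S3 = lg.mean(), (lg ** 2).mean(), (lg ** 3).mean()   # eqs. (131)-(133)
    c_hat = (0.5 * S2 - S1 ** 2) / (0.5 * S1 * S2 - S3 / 3.0)    # eq. (135)
    b_hat = 1.0 / S1                                             # eq. (84)
    u_over_d_c = 1.0 / (np.mean((x / u) ** c_hat) - 1.0)         # from eq. (118)
    eta = c_hat * (u_over_d_c + 1.0)                             # eq. (124)
    return T * (eta / b_hat - 1.0)                               # eq. (127)

rng = np.random.default_rng(5)
b, u, T = 3.0, 1.0, 5000
zetas = [zeta_stat(u * rng.uniform(size=T) ** (-1.0 / b), u) for _ in range(300)]
accept = np.mean([z <= 3.841 for z in zetas])   # chi2(1) quantile at the 95% level
print(accept)   # close to 0.95
```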
References

Andersen, J.V. and D. Sornette, 2001, Have your cake and eat it too: increasing returns while lowering large risks!, Journal of Risk Finance 2 (3), 70-82.
Anderson, T.W. and D.A. Darling, 1952, Asymptotic theory of certain "goodness of fit" criteria, Annals of Mathematical Statistics 23, 193-212.
Andersson, M., B. Eklund and J. Lyhagen, 1999, A simple linear time series model with misleading nonlinear properties, Economics Letters 65, 281-285.
Ang, A. and G. Bekaert, 2001, International asset allocation with regime shifts, Review of Financial Studies.
Bachelier, L., 1900, Théorie de la spéculation, Annales de l'Ecole Normale Supérieure 17, 21-86.
Barndorff-Nielsen, O.E., 1997, Normal inverse Gaussian distributions and the modelling of stock returns, Scandinavian J. Statistics 24, 1-13.
Beran, J., 1994, Statistics for Long-Memory Processes, Monographs on Statistics and Applied Probability 61, Chapman & Hall.
Biham, O., O. Malcai, M. Levy and S. Solomon, 1998, Generic emergence of power law distributions and Lévy-stable intermittent fluctuations in discrete logistic systems, Phys. Rev. E 58, 1352-1358.
Biham, O., Z.-F. Huang, O. Malcai and S. Solomon, 2002, Long-time fluctuations in a dynamical model of stock market indices, preprint at http://arXiv.org/abs/cond-mat/0208464
Bingham, N.H., C.M. Goldie and J.L. Teugels, 1987, Regular Variation, Cambridge University Press.
Black, F. and M. Scholes, 1973, The pricing of options and corporate liabilities, Journal of Political Economy 81, 637-653.
Blanchard, O.J. and M.W. Watson, 1982, Bubbles, rational expectations and speculative markets, in: Wachtel, P., ed., Crisis in Economic and Financial Structure: Bubbles, Bursts, and Shocks. Lexington Books: Lexington.
Blattberg, R. and N. Gonedes, 1974, A comparison of stable and Student distributions as statistical models for stock prices, J. Business 47, 244-280.
Bollerslev, T., 1986, Generalized autoregressive conditional heteroskedasticity, Journal of Econometrics 31, 307-327.
Bollerslev, T., R.F. Engle and D.B. Nelson, 1994, ARCH models, Handbook of Econometrics IV, 2959-3038.
Bouchaud, J.-P. and M. Potters, 2000, Theory of Financial Risks: From Statistical Physics to Risk Management (Cambridge University Press, Cambridge and New York).
Brock, W.A., W.D. Dechert and J.A. Scheinkman, 1987, A test for independence based on the correlation dimension, unpublished manuscript, Department of Economics, University of Wisconsin, Madison.
Campbell, J.Y., A.W. Lo and A.C. MacKinlay, 1997, The Econometrics of Financial Markets (Princeton University Press, Princeton, N.J.).
Challet, D. and M. Marsili, 2002, Criticality and finite size effects in a simple realistic model of stock market, preprint at http://arXiv.org/abs/cond-mat/0210549
Champernowne, D.G., 1953, A model of income distribution, Economic Journal 63, 318-351.
Cont, R., M. Potters and J.-P. Bouchaud, 1997, Scaling in stock market data: stable laws and beyond, in: Scale Invariance and Beyond (Proc. CNRS Workshop on Scale Invariance, Les Houches, 1997), eds. B. Dubrulle, F. Graner and D. Sornette (Berlin: Springer).
Cox, D.R., 1961, Tests of separate families of hypotheses, in: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability 1, 105-123, University of California Press: Berkeley.
Cox, D.R., 1962, Further results on tests of separate families of hypotheses, Journal of the Royal Statistical Society B 24, 406-424.
Cromwell, J.B., W.C. Labys and M. Terraza, 1994, Univariate Tests for Time Series Models (Sage, Thousand Oaks, CA, pp. 20-22).
Dacorogna, M.M., U.A. Müller, O.V. Pictet and C.G. de Vries, 1992, The distribution of extremal foreign exchange rate returns in large data sets, Working Paper, Olsen and Associates Internal Documents UAM, 19921022.
Dragulescu, A.A. and V.M. Yakovenko, 2002, Probability distribution of returns for a model with stochastic volatility, preprint at http://xxx.lanl.gov/abs/cond-mat/0203046
Eberlein, E., U. Keller and K. Prause, 1998, New insights into smile, mispricing and value at risk: the hyperbolic model, J. Business 71, 371-405.
Embrechts, P., C.P. Klüppelberg and T. Mikosch, 1997, Modelling Extremal Events (Springer-Verlag).
Engle, R.F., 1984, Wald, likelihood ratio, and Lagrange multiplier tests in econometrics, in: Z. Griliches and M.D. Intriligator, eds., Handbook of Econometrics, Vol. II. North-Holland: Amsterdam.
Engle, R.F. and A.J. Patton, 2001, What good is a volatility model?, Quantitative Finance 1, 237-245.
Fama, E.F., 1965, The behavior of stock market prices, J. Business 38, 34-105.
Fama, E.F. and K.R. French, 1996, Multifactor explanations of asset pricing anomalies, Journal of Finance 51.
Fang, H. and T. Lai, 1997, Co-kurtosis and capital asset pricing, Financial Review 32, 293-307.
Farmer, J.D., 1999, Physicists attempt to scale the ivory towers of finance, Computing in Science and Engineering, Nov/Dec 1999, 26-39.
Feller, W., 1971, An Introduction to Probability Theory and Its Applications, Vol. II (John Wiley and Sons, New York).
Frisch, U. and D. Sornette, 1997, Extreme deviations and applications, J. Phys. I France 7, 1155-1171.
Gabaix, X., 1999, Zipf's law for cities: an explanation, Quarterly J. Econ. 114, 739-767.
Gabaix, X., P. Gopikrishnan, V. Plerou and H.E. Stanley, 2002, Understanding large movements in stock market activity, in press in Nature.
Gopikrishnan, P., M. Meyer, L.A.N. Amaral and H.E. Stanley, 1998, Inverse cubic law for the distribution of stock price variations, European Physical Journal B 3, 139-140.
Gouriéroux, C. and J. Jasiak, 1998, Truncated maximum likelihood, goodness of fit tests and tail analysis, Working paper, CREST.
Gouriéroux, C. and A. Monfort, 1994, Testing non-nested hypotheses, Handbook of Econometrics 4, 2585-2637.
Granger, C.W.J. and T. Teräsvirta, 1999, A simple nonlinear model with misleading properties, Economics Letters 62, 741-782.
Guillaume, D.M., M.M. Dacorogna, R.R. Davé, U.A. Müller, R.B. Olsen and O.V. Pictet, 1997, From the bird's eye to the microscope: a survey of new stylized facts of the intra-day foreign exchange markets, Finance and Stochastics 1, 95-130.
Hall, P.G., 1979, On the rate of convergence of normal extremes, Journal of Applied Probability 16, 433-439.
Hall, W.J. and J.A. Wellner, 1979, The rate of convergence in law of the maximum of an exponential sample, Statistica Neerlandica 33, 151-154.
Hill, B.M., 1975, A simple general approach to inference about the tail of a distribution, Annals of Statistics 3, 1163-1174.
Hwang, S. and S. Satchell, 1999, Modelling emerging market risk premia using higher moments, International Journal of Finance and Economics 4, 271-296.
Joe, H., 1997, Multivariate Models and Dependence Concepts, Chapman & Hall, London.
Johansen, A. and D. Sornette, 2002, Large stock market price drawdowns are outliers, Journal of Risk 4 (2), 69-110.
Jondeau, E. and M. Rockinger, 2001, Testing for differences in the tails of stock-market returns, Working Paper available at http://papers.ssrn.com/paper.taf?abstract_id=291399
Jurczenko, E. and B. Maillet, 2002, The four-moment capital asset pricing model: some basic results, Working Paper.
Kearns, P. and A. Pagan, 1997, Estimating the density tail index for financial time series, Review of Economics and Statistics 79, 171-175.
Kon, S., 1984, Models of stock returns: a comparison, J. Finance 39, 147-165.
Laherrère, J. and D. Sornette, 1999, Stretched exponential distributions in nature and economy: fat tails with characteristic scales, European Physical Journal B 2, 525-539.
Levy, M., S. Solomon and G.
Ram, 1996, Dynamical explanation of the emergence of power law in a stock market model, Int. J. Mod. Phys. C 7, 65-72.
Longin, F.M., 1996, The asymptotic distribution of extreme stock market returns, Journal of Business 69, 383-408.
Lux, T., 1996, The stable Paretian hypothesis and the frequency of large returns: an examination of major German stocks, Applied Financial Economics 6, 463-475.
Lux, T., 2000, The limiting extreme behavior of speculative returns: an analysis of intra-daily data from the Frankfurt Stock Exchange, preprint, March 2000, 29 p.
Lux, T. and D. Sornette, 2002, On rational bubbles and fat tails, J. Money, Credit and Banking, Part 1, 34, 589-610.
Malevergne, Y. and D. Sornette, 2001, Multi-dimensional rational bubbles and fat tails, Quantitative Finance 1, 533-541.
Malevergne, Y. and D. Sornette, 2002, Multi-moments method for portfolio management: generalized capital asset pricing model in homogeneous and heterogeneous markets, working paper (http://papers.ssrn.com/paper.taf?abstract_id=319544)
Malevergne, Y. and D. Sornette, 2002, Investigating extreme dependences: concepts and tools, working paper (http://papers.ssrn.com/paper.taf?abstract_id=303465)
Mandelbrot, B., 1963, The variation of certain speculative prices, Journal of Business 36, 392-417.
Mantegna, R.N. and H.E. Stanley, 1995, Scaling behavior of an economic index, Nature 376, 46-55.
Mantegna, R.N. and H.E. Stanley, 2000, An Introduction to Econophysics: Correlations and Complexity in Finance, Cambridge Univ. Press, Cambridge UK.
Markowitz, H., 1959, Portfolio Selection: Efficient Diversification of Investments, John Wiley and Sons, New York.
Matia, K., L.A.N. Amaral, S.P. Goodwin and H.E. Stanley, 2002, Non-Lévy distribution of commodity price fluctuations, preprint cond-mat/0202028
Mittnik, S., S.T. Rachev and M.S. Paolella, 1998, Stable Paretian modeling in finance: some empirical and theoretical aspects, in: A Practical Guide to Heavy Tails, pp. 79-110, eds. R.J. Adler, R.E. Feldman and M.S. Taqqu, Birkhäuser, Boston.
Mizon, G.E. and J.F. Richard, 1986, The encompassing principle and its application to testing non-nested hypotheses, Econometrica 54, 657-678.
Müller, U.A., M.M. Dacorogna and O.V. Pictet, 1998, Heavy tails in high-frequency financial data, in: A Practical Guide to Heavy Tails, pp. 55-78, eds. R.J. Adler, R.E. Feldman and M.S. Taqqu, Birkhäuser, Boston.
Muzy, J.-F., J. Delour and E. Bacry, 2000, Modelling fluctuations of financial time series: from cascade process to stochastic volatility model, European Physical Journal B 17, 537-548.
Muzy, J.-F., D. Sornette, J. Delour and A.
Arneodo, 2001, Multifractal returns and hierarchical portfolio theory, Quantitative Finance 1, 131-148.
Nagahara, Y. and G. Kitagawa, 1999, A non-Gaussian stochastic volatility model, J. Computational Finance 2, 33-47.
Nelsen, R., 1998, An Introduction to Copulas, Lecture Notes in Statistics 139, Springer Verlag, New York.
Pagan, A., 1996, The econometrics of financial markets, Journal of Empirical Finance 3, 15-102.
Prause, K., 1998, The generalized hyperbolic model, PhD Dissertation, University of Freiburg.
Ramchand, L. and R. Susmel, 1998, Volatility and cross correlation across major stock markets, Journal of Empirical Finance 5, 397-416.
Rootzén, H., M.R. Leadbetter and L. de Haan, 1998, On the distribution of tail array sums for strongly mixing stationary sequences, Annals of Applied Probability 8, 868-885.
Rubinstein, M., 1973, The fundamental theorem of parameter-preference security valuation, Journal of Financial and Quantitative Analysis 8, 61-69.
Samuelson, P.A., 1965, Proof that properly anticipated prices fluctuate randomly, Industrial Management Review 6, 41-49.
Serva, M., U.L. Fulco, M.L. Lyra and G.M. Viswanathan, 2002, Kinematics of stock prices, preprint at http://arXiv.org/abs/cond-mat/0209103
Sharpe, W., 1964, Capital asset prices: a theory of market equilibrium under conditions of risk, Journal of Finance 19, 425-442.
Simon, H.A., 1957, Models of Man: Social and Rational; Mathematical Essays on Rational Human Behavior in a Social Setting (New York, Wiley).
Smith, R.L., 1985, Maximum likelihood estimation in a class of non-regular cases, Biometrika 72, 67-90.
Sornette, D., 1998, Linear stochastic dynamics with nonlinear fractal properties, Physica A 250, 295-314.
Sornette, D., 2000, Critical Phenomena in Natural Sciences: Chaos, Fractals, Self-organization and Disorder: Concepts and Tools (Springer Series in Synergetics, Heidelberg).
Sornette, D. and R. Cont, 1997, Convergent multiplicative processes repelled from zero: power laws and truncated power laws, J. Phys. I France 7, 431-444.
Sornette, D., L. Knopoff, Y.Y. Kagan and C. Vanneste, 1996, Rank-ordering statistics of extreme events: application to the distribution of large earthquakes, J. Geophys. Res. 101, 13883-13893.
Sornette, D., P. Simonetti and J.V. Andersen, 2000, φ^q-field theory for portfolio optimization: "fat tails" and non-linear correlations, Physics Reports 335 (2), 19-92.
Starica, C. and O. Pictet, 1999, The tales the tails of GARCH(1,1) processes tell, Working Paper, Univ. of Pennsylvania.
Stuart, A. and K. Ord, 1994, Kendall's Advanced Theory of Statistics (John Wiley and Sons).
Mizuno, T., S. Kurihara, M. Takayasu and H.
Takayasu, 2002, Analysis of high-resolution foreign exchange data of USD-JPY for 13 years, working paper http://xxx.lanl.gov/abs/cond-mat/0211162
de Vries, C.G., 1994, Stylized facts of nominal exchange rate returns, in: The Handbook of International Macroeconomics, F. van der Ploeg (ed.), 348-389 (Blackwell).
Wilks, S.S., 1938, The large sample distribution of the likelihood ratio for testing composite hypotheses, Annals of Mathematical Statistics 9, 60-62.
                         Mean         St. Dev.    Skewness   Ex. Kurtosis  Jarque-Bera
Nasdaq (5 minutes) †     1.80·10^-6   6.61·10^-4   0.0326    11.8535       1.30·10^5 (0.00)
Nasdaq (1 hour) †        2.40·10^-5   3.30·10^-3   1.3396    23.7946       4.40·10^4 (0.00)
Nasdaq (5 minutes) ‡    -6.33·10^-9   3.85·10^-4  -0.0562     6.9641       4.50·10^4 (0.00)
Nasdaq (1 hour) ‡        1.05·10^-6   1.90·10^-3  -0.0374     4.5250       1.58·10^3 (0.00)
Dow Jones (1 day)        8.96·10^-5   4.70·10^-3  -0.6101    22.5443       6.03·10^5 (0.00)
Dow Jones (1 month)      1.80·10^-3   2.54·10^-2  -0.6998     5.3619       1.28·10^3 (0.00)

Table 1: Descriptive statistics for the Dow Jones returns calculated over one day and one month and for the Nasdaq returns calculated over five minutes and one hour. The numbers within parentheses represent the p-value of the Jarque-Bera normality test. (†) raw data, (‡) data corrected for the U-shape of the intra-day volatility due to the lunch effect.
(a) Independent data

Stretched-Exponential c = 0.7
  Maximum   cluster    10        50        100       200
            mean       0.1870    0.1734    0.2032    0.2290
            Emp Std    0.0262    0.0340    0.0382    0.0415
  GPD       quantile   0.9       0.95      0.99      0.995
            mean      -0.1005   -0.0593   -0.0189   -0.0314
            Emp Std    0.0386    0.0503    0.1097    0.1700

Stretched-Exponential c = 0.3
  Maximum   cluster    10        50        100       200
            mean       0.9313    0.9910    1.0617    1.1011
            Emp Std    0.0380    0.0464    0.0492    0.0508
  GPD       quantile   0.9       0.95      0.99      0.995
            mean       0.0732    0.0740    0.0031   -0.0559
            Emp Std    0.0942    0.1804    0.3744    0.4822

Pareto Distribution b = 3
  Maximum   cluster    10        50        100       200
            mean       0.3669    0.4007    0.4455    0.4769
            Emp Std    0.0267    0.0353    0.0384    0.0407
  GPD       quantile   0.9       0.95      0.99      0.995
            mean       0.3336    0.3271    0.3193    0.2585
            Emp Std    0.0454    0.0604    0.1271    0.2991

(b) Dependent data

Stretched-Exponential c = 0.7 with long memory
  Maximum   cluster    10        50        100       200
            mean      -0.2169   -0.2226   -0.2027   -0.1918
            Emp Std    0.1553    0.1512    0.1553    0.1718
  GPD       quantile   0.9       0.95      0.99      0.995
            mean       0.1108    0.0758   -0.1374   -0.0509
            Emp Std    0.0346    0.1579    0.4648    0.4147

Stretched-Exponential c = 0.3 with long memory
  Maximum   cluster    10        50        100       200
            mean       0.1230    0.1143    0.1359    0.1569
            Emp Std    0.0328    0.0414    0.0450    0.0443
  GPD       quantile   0.9       0.95      0.99      0.995
            mean      -0.1234    0.1947   -0.2147   -0.2681
            Emp Std    0.8139    0.6197    0.6625    0.597

Pareto b = 3 with long memory
  Maximum   cluster    10        50        100       200
            mean       0.0566    0.0606    0.0794    0.0978
            Emp Std    0.0422    0.0495    0.0533    0.0560
  GPD       quantile   0.9       0.95      0.99      0.995
            mean       0.2952    0.3203    0.3153    0.2570
            Emp Std    0.0548    0.0694    0.1812    0.3382

Table 2: Mean values and standard deviations of the Maximum Likelihood estimates of the parameter ξ (inverse of the Pareto exponent) for the distribution of maxima (cf. equation 4) when the data are clustered in samples of size 10, 50, 100 and 200, and for the Generalized Pareto Distribution (7) for thresholds u corresponding to quantiles 90%, 95%, 99% and 99.5%. In panel (a), we have used iid samples of size 10000 drawn from Stretched-Exponential distributions with c = 0.7 and c = 0.3 and from a Pareto distribution with tail index b = 3, while in panel (b) the samples are drawn from a long-memory process with Stretched-Exponential marginals and a regularly varying marginal, as explained in the text.
[Table 3, panels (a) and (b): Pickands estimates for the independent data (Stretched-Exponential c = 0.7, Stretched-Exponential c = 0.3, Pareto b = 3) and for the dependent data (the same three distributions with long memory). For each distribution, the rows N/k = 4 and N/k = 10 give the mean, the empirical Std and the theoretical Std of the estimate at the upper quantiles 0.005, 0.01, 0.05 and 0.1. The numerical entries are not recoverable from the source.]

Table 3: Pickands estimates (9) of the parameter ξ for the Generalized Pareto Distribution (7) for thresholds u corresponding to quantiles 90%, 95%, 99% and 99.5% and two different values of the ratio N/k, respectively equal to 4 and 10. In panel (a), we have used iid samples of size 10000 drawn from Stretched-Exponential distributions with c = 0.7 and c = 0.3 and from a Pareto distribution with tail index b = 3, while in panel (b) the samples are drawn from a long-memory process with Stretched-Exponential marginals and a regularly varying marginal.
[Table 4 body: panels (a) Dow Jones, (b) raw Nasdaq data and (c) Nasdaq data corrected for the “lunch effect”; for each panel and each tail, the mean and empirical standard deviation of the ML estimates of ξ, from maxima clustered in samples of size 10, 50, 100 and 200 and from the GPD at the quantiles 90%, 95%, 99% and 99.5%; the numerical columns were scrambled during extraction and are not reliably recoverable.]
Table 4: Mean values and standard deviations of the Maximum Likelihood estimates of the parameter ξ for the distribution of the maximum (cf. equation 4), when the data are clustered in samples of size 10, 50, 100 and 200, and for the Generalized Pareto Distribution (7) for thresholds u corresponding to the quantiles 90%, 95%, 99% and 99.5%. Panel (a) presents the results for the Dow Jones, panel (b) for the raw Nasdaq data and panel (c) for the Nasdaq data corrected for the “lunch effect”.
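The GPD part of this estimation can be sketched with the peaks-over-threshold approach; the snippet below relies on SciPy's `genpareto` (whose shape parameter coincides with ξ), and the Student-t sample is an illustrative stand-in for return data:

```python
import numpy as np
from scipy.stats import genpareto

def gpd_shape_mle(returns, quantile):
    """ML estimate of the GPD shape xi from the exceedances over the
    empirical `quantile` threshold (peaks-over-threshold approach)."""
    x = np.asarray(returns)
    u = np.quantile(x, quantile)
    excess = x[x > u] - u
    # scipy's shape parameter c coincides with xi; location is fixed at 0
    xi, _, beta = genpareto.fit(excess, floc=0)
    return xi, u, excess.size

rng = np.random.default_rng(1)
sample = rng.standard_t(df=3, size=20_000)   # Student t(3): xi = 1/3 in the tail
xi, u, n_u = gpd_shape_mle(sample, 0.95)
print(xi, u, n_u)
```

Repeating this for several thresholds, as in the table, shows how sensitive the ξ estimate is to the choice of u.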
[Table 5 body: Pickands estimates of ξ (mean, empirical Std, theoretical Std) for the negative and positive tails of the Dow Jones (panel a), the raw Nasdaq (panel b) and the corrected Nasdaq (panel c), at the quantiles 90%, 95%, 99% and 99.5% and for N/k = 4 and N/k = 10; the numerical columns were scrambled during extraction and are not reliably recoverable.]
Table 5: Pickands estimates (9) of the parameter ξ for the Generalized Pareto Distribution (7), for thresholds u corresponding to the quantiles 90%, 95%, 99% and 99.5% and for two values of the ratio N/k, equal to 4 and 10 respectively. Panel (a) presents the results for the Dow Jones, panel (b) for the raw Nasdaq data and panel (c) for the Nasdaq data corrected for the “lunch effect”.
                  Nasdaq                                  Dow Jones
          Pos. Tail          Neg. Tail          Pos. Tail          Neg. Tail
q         u (x10^-3)  n_u    u (x10^-3)  n_u    u (x10^-2)  n_u    u (x10^-2)  n_u
0         0.0053    11241    0.0053    10751    0.0032    14949    0.0028    13464
0.1       0.0573    10117    0.0571     9676    0.0976    13454    0.0862    12118
0.2       0.1124     8993    0.1129     8601    0.1833    11959    0.1739    10771
0.3       0.1729     7869    0.1723     7526    0.2783    10464    0.2630     9425
0.4       0.2380     6745    0.2365     6451    0.3872     8969    0.3697     8078
0.5       0.3157     5620    0.3147     5376    0.5055     7475    0.4963     6732
0.6       0.4060     4496    0.4120     4300    0.6426     5980    0.6492     5386
0.7       0.5211     3372    0.5374     3225    0.8225     4485    0.8376     4039
0.8       0.6901     2248    0.7188     2150    1.0545     2990    1.1057     2693
0.9       0.9730     1124    1.0494     1075    1.4919     1495    1.6223     1346
0.925     1.1016      843    1.1833      806    1.6956     1121    1.8637     1010
0.95      1.2926      562    1.3888      538    1.9846      747    2.2285      673
0.96      1.3859      450    1.4955      430    2.1734      598    2.4197      539
0.97      1.5300      337    1.6390      323    2.4130      448    2.7218      404
0.98      1.7130      225    1.8557      215    2.7949      299    3.1647      269
0.99      2.1188      112    1.8855      108    3.5704      149    4.1025      135
0.9925    2.3176       84    2.4451       81    3.9701      112    4.3781      101
0.995     3.0508       56    2.7623       54    4.5746       75    5.0944       67
Table 6: Significance levels qk and their corresponding lower thresholds uk for the four different samples. The number nu provides the size of the sub-sample beyond the threshold uk .
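The construction of such a threshold table is straightforward; the sketch below applies it to the absolute value of a synthetic return series (an illustrative sample, not the thesis's data):

```python
import numpy as np

def threshold_table(returns, significance_levels):
    """For each significance level q, return the lower threshold u
    (empirical q-quantile of the absolute returns) and the sub-sample
    size n_u of observations beyond u."""
    x = np.abs(np.asarray(returns))
    table = []
    for q in significance_levels:
        u = np.quantile(x, q)
        table.append((q, u, int((x >= u).sum())))
    return table

rng = np.random.default_rng(2)
r = rng.normal(0.0, 0.01, size=10_000)
rows = threshold_table(r, [0.0, 0.9, 0.99])
for q, u, n_u in rows:
    print(f"q={q}  u={u:.4f}  n_u={n_u}")
```

As in the table, q = 0 returns the full sample size, and n_u shrinks roughly as (1 - q) times the sample size.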
Mean AD-statistic for u1–u9:
              N-pos            N-neg            DJ-pos            DJ-neg
Weibull       1.86  (89.34%)   1.34  (79.18%)   5.43  (99.82%)    3.85  (98.91%)
Gen. Pareto   4.45  (99.45%)   2.68  (96.01%)   12.47 (99.996%)   6.44  (99.99%)
Gamma         3.59  (98.59%)   2.82  (96.62%)   8.76  (99.996%)   7.23  (99.996%)
Exponential   3.64  (98.66%)   2.76  (96.36%)   13.96 (99.996%)   10.10 (99.996%)
Pareto        475.2 (99.996%)  441.7 (99.996%)  691.7 (99.996%)   607.9 (99.996%)

Mean AD-statistic for u10–u18:
Weibull       1.21  (74.29%)   0.988 (63.39%)   0.835 (53.52%)    0.849 (54.54%)
Gen. Pareto   2.29  (93.57%)   1.88  (89.52%)   1.95  (90.28%)    1.36  (79.67%)
Gamma         2.49  (95.00%)   1.90  (89.74%)   2.12  (92.01%)    1.63  (86.02%)
Exponential   3.26  (97.97%)   1.93  (90.02%)   4.52  (99.10%)    2.70  (96.11%)
Pareto        1.80  (88.60%)   1.77  (88.23%)   1.18  (73.01%)    1.65  (86.40%)
Table 7: Mean Anderson-Darling distances in the range of thresholds u1–u9 and in the range u10–u18. The figures within parentheses characterize the goodness of fit: they represent the significance levels with which the considered model can be rejected.
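The Anderson-Darling statistic underlying these distances can be sketched directly from its classical computational formula; the exponential example below is illustrative (fitting a model to its own data should give a statistic of order one):

```python
import numpy as np

def anderson_darling(sample, cdf):
    """Anderson-Darling statistic A^2 between a sample and a fitted model
    CDF (a callable); small values indicate a good fit."""
    x = np.sort(np.asarray(sample))
    n = x.size
    z = np.clip(cdf(x), 1e-12, 1.0 - 1e-12)
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(z) + np.log1p(-z[::-1])))

# Exponential data against its own ML fit: A^2 stays of order one
rng = np.random.default_rng(3)
s = rng.exponential(2.0, size=5_000)
d_hat = s.mean()                        # ML estimate of the scale d
a2 = anderson_darling(s, lambda x: 1.0 - np.exp(-x / d_hat))
print(a2)
```

The AD weighting emphasizes the tails, which is why the table uses it to discriminate between tail models.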
Nasdaq
Pos. Tail MLE ADE 0.256 (0.002) 0.192 0.555 (0.006) 0.443 0.765 (0.008) 0.630 0.970 (0.011) 0.819 1.169 (0.014) 1.004 1.400 (0.019) 1.227 1.639 (0.024) 1.460 1.916 (0.033) 1.733 2.308 (0.049) 2.145 2.759 (0.082) 2.613 2.955 (0.102) 2.839 3.232 (0.136) 3.210 3.231 (0.152) 3.193 3.358 (0.183) 3.390 3.281 (0.219) 3.306 3.327 (0.313) 3.472 3.372 (0.366) 3.636 3.136 (0.415) 3.326
Neg. Tail MLE ADE 0.254 (0.002) 0.191 0.548 (0.006) 0.439 0.755 (0.008) 0.625 0.945 (0.011) 0.800 1.122 (0.014) 0.965 1.325 (0.018) 1.157 1.562 (0.024) 1.386 1.838 (0.032) 1.655 2.195 (0.047) 1.999 2.824 (0.086) 2.651 3.008 (0.106) 2.836 3.352 (0.145) 3.259 3.441 (0.166) 3.352 3.551 (0.198) 3.479 3.728 (0.254) 3.730 3.990 (0.384) 3.983 3.917 (0.435) 3.860 4.251 (0.578) 4.302
Dow Jones
Pos. Tail MLE ADE 0.204 (0.002) 0.150 0.576 (0.005) 0.461 0.782 (0.007) 0.644 0.989 (0.010) 0.833 1.219 (0.013) 1.053 1.447 (0.017) 1.279 1.685 (0.022) 1.519 1.984 (0.030) 1.840 2.240 (0.041) 2.115 2.575 (0.067) 2.474 2.715 (0.081) 2.648 2.787 (0.102) 2.707 2.877 (0.118) 2.808 2.920 (0.138) 2.841 2.989 (0.173) 2.871 3.226 (0.263) 3.114 3.427 (0.322) 3.351 3.818 (0.441) 3.989
Neg. Tail MLE ADE 0.199 (0.002) 0.147 0.538 (0.005) 0.431 0.745 (0.007) 0.617 0.920 (0.009) 0.777 1.114 (0.012) 0.960 1.327 (0.016) 1.169 1.563 (0.021) 1.408 1.804 (0.028) 1.659 2.060 (0.040) 1.921 2.436 (0.066) 2.315 2.581 (0.081) 2.467 2.765 (0.107) 2.655 2.782 (0.120) 2.642 2.903 (0.144) 2.740 3.059 (0.186) 2.870 3.690 (0.318) 3.668 3.518 (0.350) 3.397 4.168 (0.506) 4.395
Table 8: Maximum Likelihood and Anderson-Darling estimates of the Pareto parameter b. Figures within parentheses give the standard deviation of the Maximum Likelihood estimator.
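For the Pareto model, the Maximum Likelihood estimator of b over a threshold u has the closed form of the Hill estimator, with asymptotic standard deviation b/sqrt(n_u), matching the parenthesized figures in the table; a minimal sketch on a synthetic Pareto sample:

```python
import numpy as np

def pareto_mle(sample, u):
    """ML (Hill) estimate of the Pareto tail index b beyond the threshold u,
    together with its asymptotic standard deviation b_hat / sqrt(n_u)."""
    tail = np.asarray(sample)
    tail = tail[tail >= u]
    b_hat = tail.size / np.log(tail / u).sum()
    return b_hat, b_hat / np.sqrt(tail.size)

rng = np.random.default_rng(4)
pareto = rng.pareto(3.0, size=50_000) + 1.0    # pure Pareto with b = 3
b_hat, b_std = pareto_mle(pareto, u=2.0)
print(b_hat, b_std)
```

Note how the standard deviation grows as the threshold rises and n_u shrinks, exactly the pattern visible down the columns of the table.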
Nasdaq
Pos. Tail MLE ADE 1.007 (0.008) 1.053 0.983 (0.011) 1.051 0.944 (0.014) 1.031 0.896 (0.018) 0.995 0.857 (0.021) 0.978 0.790 (0.026) 0.916 0.732 (0.033) 0.882 0.661 (0.042) 0.846 0.509 (0.058) 0.676 0.359 (0.092) 0.631 0.252 (0.110) 0.515 0.039 (0.138) 0.177 0.057 (0.155) 0.233 0 0 0 0 0 0 0 0 0 0
Neg. Tail MLE ADE 0.987 (0.008) 1.017 0.953 (0.011) 0.993 0.912 (0.014) 0.955 0.876 (0.018) 0.916 0.861 (0.021) 0.912 0.833 (0.026) 0.891 0.796 (0.033) 0.859 0.756 (0.042) 0.834 0.715 (0.059) 0.865 0.522 (0.099) 0.688 0.481 (0.120) 0.697 0.273 (0.155) 0.275 0.255 (0.177) 0.274 0.215 (0.209) 0.194 0.091 (0.260) 0 0.064 (0.390) 0 0.158 (0.452) 0.224 0 0
Dow Jones
Pos. Tail MLE ADE 1.040 (0.007) 1.104 0.973 (0.010) 1.075 0.931 (0.013) 1.064 0.878 (0.015) 1.038 0.792 (0.019) 0.955 0.708 (0.023) 0.873 0.622 (0.028) 0.788 0.480 (0.035) 0.586 0.394 (0.047) 0.461 0.304 (0.074) 0.346 0.231 (0.087) 0.158 0.269 (0.111) 0.207 0.247 (0.127) 0.147 0.283 (0.150) 0.174 0.374 (0.192) 0.407 0.372 (0.290) 0.382 0.281 (0.346) 0.255 0 0
Neg. Tail MLE ADE 0.975 (0.007) 1.026 0.910 (0.010) 0.989 0.856 (0.012) 0.948 0.821 (0.015) 0.933 0.767 (0.018) 0.889 0.698 (0.022) 0.819 0.612 (0.028) 0.713 0.531 (0.035) 0.597 0.478 (0.047) 0.527 0.403 (0.076) 0.387 0.379 (0.091) 0.337 0.357 (0.119) 0.288 0.428 (0.136) 0.465 0.448 (0.164) 0.641 0.451 (0.210) 0.863 0.022 (0.319) 0.110 0.178 (0.367) 0.703 0 0
Table 9: Maximum Likelihood and Anderson-Darling estimates of the form parameter c of the Weibull (Stretched-Exponential) distribution.
Nasdaq
Pos. Tail MLE ADE 0.443 (0.004) 0.441 0.429 (0.006) 0.440 0.406 (0.008) 0.432 0.372 (0.011) 0.414 0.341 (0.015) 0.404 0.283 (0.020) 0.364 0.231 (0.026) 0.339 0.166 (0.034) 0.311 0.053 (0.030) 0.164 0.005 (0.010) 0.128 0.000 (0.001) 0.049 0.000 (0.000) 0.000 0.000 (0.000) 0.000 -
Neg. Tail MLE ADE 0.455 (0.005) 0.452 0.436 (0.006) 0.443 0.410 (0.009) 0.424 0.383 (0.012) 0.402 0.369 (0.016) 0.399 0.345 (0.021) 0.383 0.309 (0.028) 0.358 0.269 (0.039) 0.336 0.225 (0.057) 0.365 0.058 (0.057) 0.184 0.036 (0.053) 0.194 0.000 (0.001) 0.000 0.000 (0.001) 0.000 0.000 (0.000) 0.000 0.000 (0.000) 0.000 (0.000) 0.000 (0.000) 0.000 -
Dow Jones
Pos. Tail MLE ADE 7.137 (0.060) 7.107 6.639 (0.082) 6.894 6.236 (0.113) 6.841 5.621 (0.155) 6.655 4.515 (0.215) 5.942 3.358 (0.277) 5.081 2.192 (0.326) 4.073 0.682 (0.256) 1.606 0.195 (0.163) 0.510 0.019 (0.048) 0.065 0.001 (0.003) 0.000 0.005 (0.025) 0.000 0.001 (0.010) 0.000 0.009 (0.055) 0.000 0.149 (0.629) 0.282 0.145 (0.960) 0.179 0.007 (0.109) 0.002 -
Neg. Tail MLE ADE 7.268 (0.068) 7.127 6.726 (0.094) 6.952 6.108 (0.131) 6.640 5.656 (0.175) 6.515 4.876 (0.235) 6.066 3.801 (0.305) 5.220 2.475 (0.366) 3.764 1.385 (0.389) 2.149 0.810 (0.417) 1.297 0.276 (0.361) 0.207 0.169 (0.316) 0.065 0.103 (0.291) 0.012 0.427 (0.912) 0.729 0.577 (1.357) 3.509 0.613 (1.855) 9.640 0.000 (0.000) 0.000 0.000 (0.000) 5.528 -
Table 10: Maximum Likelihood and Anderson-Darling estimates of the scale parameter d (×10^3) of the Weibull (Stretched-Exponential) distribution.
Nasdaq Pos. Tail Neg. Tail MLE ADE MLE ADE 0.441 (0.004) 0.441 0.458 (0.004) 0.451 0.435 (0.004) 0.431 0.454 (0.005) 0.444 0.431 (0.005) 0.424 0.452 (0.005) 0.438 0.428 (0.005) 0.416 0.453 (0.005) 0.437 0.429 (0.005) 0.415 0.458 (0.006) 0.443 0.429 (0.006) 0.411 0.464 (0.006) 0.447 0.436 (0.006) 0.413 0.472 (0.007) 0.453 0.447 (0.008) 0.421 0.483 (0.009) 0.463 0.462 (0.010) 0.425 0.503 (0.011) 0.482 0.517 (0.015) 0.468 0.529 (0.016) 0.496 0.540 (0.019) 0.479 0.551 (0.019) 0.514 0.574 (0.024) 0.489 0.570 (0.025) 0.516 0.615 (0.029) 0.526 0.594 (0.029) 0.537 0.653 (0.035) 0.543 0.627 (0.035) 0.564 0.750 (0.050) 0.625 0.671 (0.046) 0.594 0.917 (0.086) 0.741 0.760 (0.073) 0.674 0.991 (0.107) 0.783 0.827 (0.092) 0.744 1.178 (0.156) 0.978 0.857 (0.117) 0.742
Dow Jones Pos. Tail Neg. Tail MLE ADE MLE ADE 7.012 (0.057) 7.055 7.358 (0.063) 7.135 6.793 (0.059) 6.701 7.292 (0.066) 6.982 6.731 (0.062) 6.575 7.275 (0.070) 6.890 6.675 (0.065) 6.444 7.358 (0.076) 6.938 6.607 (0.070) 6.264 7.429 (0.083) 6.941 6.630 (0.077) 6.186 7.529 (0.092) 6.951 6.750 (0.087) 6.207 7.700 (0.105) 7.005 6.920 (0.103) 6.199 8.071 (0.127) 7.264 7.513 (0.137) 6.662 8.797 (0.170) 7.908 8.792 (0.227) 7.745 10.205 (0.278) 9.175 9.349 (0.279) 8.148 10.835 (0.341) 9.751 10.487 (0.383) 9.265 11.796 (0.454) 10.657 11.017 (0.451) 9.722 12.598 (0.543) 11.581 11.920 (0.563) 10.626 13.349 (0.664) 12.386 13.251 (0.766) 12.062 14.462 (0.880) 13.521 15.264 (1.246) 13.943 15.294 (1.316) 13.285 15.766 (1.483) 14.210 17.140 (1.705) 15.327 16.207 (1.871) 13.697 16.883 (2.047) 13.476
Table 11: Maximum Likelihood and Anderson-Darling estimates of the scale parameter d = 10^-3 d' of the Exponential distribution. Figures within parentheses give the standard deviation of the Maximum Likelihood estimator.
[Table 12 body: Maximum Likelihood and Anderson-Darling estimates of b for the four samples (Nasdaq and Dow Jones, positive and negative tails) at the thresholds u1–u18; the numerical columns were scrambled during extraction and are not reliably recoverable.]
Table 12: Maximum Likelihood- and Anderson-Darling estimates of the form parameter b of the Incomplete Gamma distribution.
Dow Jones, Positive Tail (entries run over the thresholds u1–u18; "-" = not available):
b̂:     0.204 0.576 0.782 0.989 1.219 1.447 1.685 1.984 2.240 2.575 2.715 2.787 2.877 2.920 2.989 3.226 3.427 3.818
b†:    0.205 0.580 0.787 0.995 1.224 1.452 1.689 1.986 2.241 2.575 2.715 2.787 2.877 2.920 2.989 3.226 3.427 -
Wald:  79.009% 52.580% 40.708% 29.416% 15.222% 7.202% 2.984% 0.449% 0.078% 0.005% 0.004% 0.004% 0.004% 0.005% 0.002% 0.001% 0.001% -
Score: 78.793% 52.300% 40.474% 29.251% 15.159% 7.181% 2.978% 0.448% 0.078% 0.005% 0.004% 0.004% 0.004% 0.005% 0.002% 0.001% 0.001% -

Nasdaq, Positive Tail:
b̂:     0.256 0.555 0.765 0.970 1.169 1.400 1.639 1.916 2.308 2.759 2.955 3.232 3.231 3.358 3.281 3.327 3.372 3.136
b†:    0.256 0.557 0.769 0.974 1.174 1.404 1.643 1.920 2.311 2.761 2.956 3.232 3.231 -
Wald:  38.328% 38.525% 27.629% 17.914% 13.144% 7.080% 4.221% 2.167% 0.420% 0.071% 0.013% 0.000% 0.000% -
Score: 38.238% 38.352% 27.505% 17.844% 13.094% 7.060% 4.211% 2.162% 0.419% 0.071% 0.013% 0.000% 0.000% -

Dow Jones, Negative Tail:
b̂:     0.199 0.538 0.745 0.920 1.114 1.327 1.563 1.804 2.060 2.436 2.581 2.765 2.782 2.903 3.059 3.690 3.518 4.168
b†:    0.200 0.541 0.749 0.924 1.118 1.330 1.565 1.805 2.060 2.437 2.582 2.765 2.783 2.905 3.063 3.690 3.519 -
Wald:  77.394% 45.609% 28.921% 21.447% 13.074% 6.456% 2.224% 0.559% 0.179% 0.020% 0.010% 0.009% 0.048% 0.082% 0.099% 0.000% 0.002% -
Score: 77.166% 45.385% 28.792% 21.354% 13.027% 6.439% 2.221% 0.558% 0.179% 0.020% 0.010% 0.009% 0.048% 0.082% 0.099% 0.000% 0.002% -

Nasdaq, Negative Tail:
b̂:     0.254 0.548 0.755 0.945 1.122 1.325 1.562 1.838 2.195 2.824 3.008 3.352 3.441 3.551 3.728 3.990 3.917 4.251
b†:    0.255 0.550 0.757 0.946 1.124 1.327 1.564 1.840 2.199 2.826 3.010 3.353 3.441 3.551 3.728 3.990 3.917 -
Wald:  39.906% 26.450% 14.636% 7.777% 6.269% 4.096% 2.468% 1.481% 1.061% 0.137% 0.089% 0.002% 0.001% 0.000% 0.000% 0.000% 0.000% -
Score: 39.812% 26.369% 14.601% 7.763% 6.258% 4.090% 2.464% 1.480% 1.059% 0.136% 0.089% 0.002% 0.001% 0.000% 0.000% 0.000% 0.000% -
Table 13: Wald and Score encompassing tests for non-nested hypotheses. The p-value gives the significance with which one can reject the null hypothesis: (SE) encompasses (PD). b† is the estimator of the Pareto model under the hypothesis of the SE distribution; b̂ is the estimator under the true distribution for the Pareto model.
Dow Jones positive tail ηˆ T bˆ p-value 1.044 0.204 100.000% 1.123 0.576 100.000% 1.229 0.782 100.000% 1.351 0.989 100.000% 1.493 1.219 100.000% 1.653 1.447 100.000% 1.835 1.685 100.000% 2.069 1.984 100.000% 2.294 2.240 100.000% 2.605 2.575 99.997% 2.732 2.715 99.165% 2.809 2.787 98.487% 2.895 2.877 94.723% 2.943 2.920 94.010% 3.027 2.989 95.032% 3.261 3.226 80.157% 3.447 3.427 58.002% 3.818 3.818 -
Dow Jones negative tail ηˆ T bˆ p-value 0.979 0.199 100.000% 1.050 0.538 100.000% 1.148 0.745 100.000% 1.259 0.920 100.000% 1.388 1.114 100.000% 1.538 1.327 100.000% 1.715 1.563 100.000% 1.912 1.804 100.000% 2.141 2.060 100.000% 2.489 2.436 100.000% 2.626 2.581 99.997% 2.803 2.765 99.766% 2.835 2.782 99.869% 2.960 2.903 99.514% 3.116 3.059 97.480% 3.690 3.690 5.634% 3.527 3.518 38.929% 4.168 4.168 -
Nasdaq positive tail ηˆ T bˆ p-value 1.019 0.256 100.000% 1.118 0.555 100.000% 1.225 0.765 100.000% 1.347 0.970 100.000% 1.486 1.169 100.000% 1.650 1.400 100.000% 1.840 1.639 100.000% 2.069 1.916 100.000% 2.393 2.308 100.000% 2.799 2.759 99.994% 2.974 2.955 98.132% 3.232 3.232 22.570% 3.232 3.231 28.945% 3.358 3.358 3.281 3.281 3.327 3.327 3.372 3.372 3.136 3.136 -
Nasdaq negative tail ηˆ T bˆ p-value 1.000 0.254 100.000% 1.090 0.548 100.000% 1.194 0.755 100.000% 1.312 0.945 100.000% 1.447 1.122 100.000% 1.605 1.325 100.000% 1.796 1.562 100.000% 2.032 1.838 100.000% 2.353 2.195 100.000% 2.900 2.824 100.000% 3.070 3.008 99.996% 3.372 3.352 92.329% 3.457 3.441 85.305% 3.563 3.551 69.909% 3.730 3.728 27.179% 3.991 3.990 13.117% 3.923 3.917 27.655% 4.251 4.251 -
Table 14: Test of the (SE) model against the null hypothesis that the true model is the Pareto model. The p-value gives the significance with which one can reject the null hypothesis.
[Figure 1: mean absolute return |r| (×10^-4, vertical axis) versus intraday time, in five-minute intervals (horizontal axis, 0–80).]
Figure 1: Average absolute return, as a function of time within a trading day. The U-shape characterizes the so-called lunch effect.
[Figure 2, two panels: variation coefficient V = std/mean of the time intervals between extremes of the DJ, versus the threshold u, for positive extremes X > u (upper panel) and negative extremes X < -u (lower panel).]
Figure 2: Coefficient of variation V for the Dow Jones daily returns. An increase of V characterizes the increase of “clustering”.
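The quantity plotted here is easy to reproduce; the sketch below computes V on a synthetic iid Gaussian series (an illustrative sample: for iid data the exceedance flow is approximately Poisson, so V is close to 1):

```python
import numpy as np

def variation_coefficient(returns, u):
    """Coefficient of variation V = std/mean of the time intervals between
    successive exceedances X > u. V = 1 for a Poisson flow of extremes,
    while V > 1 signals their clustering in time."""
    times = np.flatnonzero(np.asarray(returns) > u)
    intervals = np.diff(times)
    return intervals.std() / intervals.mean()

# iid returns: exceedance times form an approximate Poisson flow, V ~ 1
rng = np.random.default_rng(5)
v = variation_coefficient(rng.normal(0.0, 0.01, size=100_000), u=0.02)
print(v)
```

Values of V significantly above 1 at large u, as in the figure, indicate clustering of the extreme returns.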
[Figure 3: Maximum Likelihood estimate of the GPD form parameter ξ versus the number of the lower threshold, for Stretched-Exponential samples with c = 0.3 and c = 0.7, n = 50,000.]
Figure 3: Maximum Likelihood estimates of the GPD form parameter for Stretched-Exponential samples of size 50,000.
[Figure 4, two panels: mean excess function versus the lower threshold u, for the Dow Jones daily returns (upper panel) and for the Nasdaq five-minute returns (lower panel, scale ×10^-3).]
Figure 4: Mean excess functions for the Dow Jones daily returns (upper panel) and the Nasdaq five-minute returns (lower panel). The plain line represents the positive returns and the dotted line the negative ones.
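The empirical mean excess function is a one-liner; the sketch below evaluates it on a synthetic exponential sample (illustrative: the memoryless exponential has a flat mean excess function equal to its scale), whereas a GPD tail would give a straight line of slope ξ/(1 - ξ):

```python
import numpy as np

def mean_excess(sample, thresholds):
    """Empirical mean excess function e(u) = E[X - u | X > u]; for a GPD
    tail it is linear in u, with slope xi / (1 - xi)."""
    x = np.asarray(sample)
    return np.array([(x[x > u] - u).mean() for u in thresholds])

# the memoryless exponential has a flat mean excess function e(u) = d
rng = np.random.default_rng(6)
s = rng.exponential(0.01, size=50_000)
e = mean_excess(s, np.linspace(0.0, 0.03, 4))
print(e)
```

The curvature (or lack thereof) of e(u) is precisely what the figure uses to discriminate between exponential-like and power-law-like tails.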
[Figure 5, two panels in double-logarithmic scale: complementary sample distribution function versus the absolute log-return x, for the Dow Jones daily returns (positive tail n = 14949, negative tail n = 13464) and for the Nasdaq five-minute returns (positive tail n = 11241, negative tail n = 10751).]
Figure 5: Complementary cumulative sample distributions for the Dow Jones (a) and for the Nasdaq (b) data sets.
[Figure 6, two panels: Hill estimates b̂u versus the lower threshold u (logarithmic scale), for the Dow Jones daily returns (positive tail n = 14949, negative tail n = 13464) and for the Nasdaq five-minute returns (positive tail n = 11241, negative tail n = 10751).]
Figure 6: Hill estimates bˆ u as a function of the threshold u for the Dow Jones (a) and for the Nasdaq (b).
[Figure 7: Hill estimator b versus the index n of the quantile, for the four data sets ND<, ND>, DJ> and DJ<.]

Figure 7: Hill estimator b̂u for all four data sets (positive and negative branches of the distribution of returns for the DJ and for the ND) as a function of the index n = 1, ..., 18 of the 18 quantiles or standard significance levels q1 ... q18 given in table 6. The dashed line is expression (33) with 1 - qn = 3.08 e^(-0.342 n) given by (32).
[Figure 8, two panels: Wilks statistic (doubled log-likelihood ratio) versus the lower threshold u (×10^-3) for the ED, IG, PD and SE families, together with the 95% χ² levels at 1 and 2 degrees of freedom, for the Nasdaq positive returns (n = 11241, upper panel) and negative returns (n = 10751, lower panel).]
Figure 8: Wilks statistic for the comprehensive distribution versus the four parametric distributions: Pareto (PD), Weibull (SE), Exponential (ED) and Incomplete Gamma (IG), for the Nasdaq five-minute returns. The upper panel refers to the positive returns and the lower panel to the negative ones.
[Figure 9, two panels: Wilks statistic (doubled log-likelihood ratio) versus the lower threshold u for the ED, IG, PD and SE families, together with the 95% χ² levels at 1 and 2 degrees of freedom, for the Dow Jones positive returns (n = 14949, upper panel) and negative returns (n = 13464, lower panel).]
Figure 9: Wilks statistic for the comprehensive distribution versus the four parametric distributions: Pareto (PD), Weibull (SE), Exponential (ED) and Incomplete Gamma (IG), for the Dow Jones daily returns. The upper panel refers to the positive returns and the lower panel to the negative ones.
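The Wilks statistic is the doubled log-likelihood ratio between a general family and a restricted sub-family, compared with a χ² quantile. The sketch below illustrates the mechanics on the simplest nested pair appearing here, the Exponential as the c = 1 sub-model of the Weibull (SE) family, rather than against the full comprehensive distribution used in the figures; the SE sample with c = 0.7 is an illustrative choice:

```python
import numpy as np
from scipy.stats import chi2, expon, weibull_min

def wilks_statistic(excess):
    """Doubled log-likelihood ratio of the Weibull (SE) model against its
    Exponential sub-model (c = 1); distributed as chi2(1) under the null."""
    c, _, d = weibull_min.fit(excess, floc=0)
    ll_w = weibull_min.logpdf(excess, c, 0, d).sum()
    ll_e = expon.logpdf(excess, 0, excess.mean()).sum()   # ML scale = mean
    return 2.0 * (ll_w - ll_e), c

rng = np.random.default_rng(7)
excess = 0.01 * rng.weibull(0.7, size=5_000)   # SE data with c = 0.7
t, c_hat = wilks_statistic(excess)
print(t, c_hat, chi2.sf(t, df=1))
```

A large statistic relative to the χ²(0.95) level, as for the ED curves in the figures, rejects the restricted model.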
[Figure 10, two panels: sample tail 1 - F(x) (thick line) and "local" Pareto index b (thin line) versus x, for a simulated Pareto sample with tail index 1.2, n = 15000 (upper panel) and a simulated Stretched-Exponential sample with c = 0.3, n = 15000 (lower panel).]
Figure 10: Local index β(x) estimated for a Pareto distribution with tail index = 1.2 (upper panel) and a Stretched exponential with exponent c = 0.3 (lower panel).
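One hedged way to estimate such a local index from a sample is by finite differences of the empirical log-survival function on a logarithmic grid; the exact estimator used for the figure is not reproduced in this excerpt, so the sketch below is an illustrative variant:

```python
import numpy as np

def local_pareto_index(sample, n_points=30):
    """'Local' Pareto index beta(x) = -d ln(1 - F(x)) / d ln x, estimated
    by finite differences of the empirical log-survival function on a
    logarithmic grid. It is constant for a pure Pareto, while it grows
    with x as c (x/d)^c for a Stretched-Exponential. Assumes x > 0."""
    x = np.sort(np.asarray(sample))
    surv = 1.0 - np.arange(1, x.size + 1) / (x.size + 1.0)
    grid = np.logspace(np.log10(x[0]), np.log10(x[-1]), n_points)
    log_s = np.interp(np.log(grid), np.log(x), np.log(surv))
    return grid[1:], -np.diff(log_s) / np.diff(np.log(grid))

rng = np.random.default_rng(8)
pareto = rng.pareto(1.2, size=15_000) + 1.0
xs, beta = local_pareto_index(pareto)
print(beta[:5])     # hovers around the tail index 1.2
```

A flat beta(x) versus a steadily growing one is the diagnostic contrasted in the two panels of the figure.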
[Figure 11: sample tail 1 - F(x) (thick line) and Pareto index b (thin line) versus x, for a simulated sample made of two Pareto distributions with b1 = 0.70 below and b2 = 1.5 above the cross-over point u1 = 10, with lower threshold u0 = 1 and P(X > 10) ≅ 0.1.]
Figure 11: Local index β(x) for a distribution constructed by joining two Pareto distributions with exponents b1 = 0.70 and b2 = 1.5 at the cross-over point u1 = 10.
[Figure 12, two panels: upper panel, sample tail, local index β(x) and local exponent c(x) versus the log-return x (×10^-3); lower panel, double-logarithmic plot of β(x) with the power-law fits y ∼ x^0.77 and y ∼ x^0.54.]
Figure 12: Upper panel: sample tail (continuous line), local index β(x) (dashed line) and local exponent c(x) (dash-dotted line) for the negative tail of the Nasdaq five-minute returns. Lower panel: double-logarithmic plot of the local index β(x). Over most of the data set, β(x) increases with the log-return x as a power law of index 0.77, while beyond the 99% quantile (see the inset) it behaves like another power law with a smaller index equal to 0.54. The goodness of fit of these two power laws has been checked by a χ² test.
Chapitre 4

Relaxation de la volatilité

One of the challenges of financial theory is to understand how the incessant flow of information arriving on financial markets is incorporated into asset prices. Since not all news has the same impact, the question arises whether one can distinguish the effects of the attacks of September 11, 2001, or of the coup against Gorbachev on August 19, 1991, from the crash of 1987 or from other volatility shocks of smaller amplitude. Using a long-memory autoregressive process defined on the logarithm of the volatility, the Multifractal Random Walk (MRW) process, we predict response functions of the price volatility to large external shocks that differ completely from those we predict for endogenous shocks, i.e., shocks resulting from the constructive (or coherent) accumulation of a large number of small shocks. These predictions are remarkably well confirmed empirically on various volatility shocks of differing amplitudes. Our theory makes it possible to distinguish two types of events (endogenous and exogenous) with specific signatures and, for the class of endogenous shocks, characteristic precursors. It also explains the origin of this type of shock through the coherent accumulation of many small pieces of bad news, and thus unifies the previous explanations of large crashes, including that of October 1987.
Volatility Fingerprints of Large Shocks: Endogenous Versus Exogenous∗

D. Sornette 1,2 , Y. Malevergne 1,3 and J.-F. Muzy 4

1
Laboratoire de Physique de la Matière Condensée, CNRS UMR 6622, Université de Nice-Sophia Antipolis, 06108 Nice Cedex 2, France
2 Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, California 90095, USA
3 Institut de Science Financière et d'Assurances - Université Lyon I, 43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
4 Laboratoire Systèmes Physiques de l'Environnement, CNRS UMR 6134, Université de Corse, Quartier Grossetti, 20250 Corte, France
email:
[email protected],
[email protected] and
[email protected] fax: (310) 206 30 51 Forthcoming Risk
Abstract Finance is about how the continuous stream of news gets incorporated into prices. But not all news has the same impact. Can one distinguish the effects of the Sept. 11, 2001 attack or of the coup against Gorbachev on Aug. 19, 1991 from financial crashes such as Oct. 1987, as well as from smaller volatility bursts? Using a parsimonious autoregressive process with long-range memory defined on the logarithm of the volatility, we predict strikingly different response functions of the price volatility to great external shocks compared to what we term endogenous shocks, i.e., those which result from the cooperative accumulation of many small shocks. These predictions are remarkably well confirmed empirically on a hierarchy of volatility shocks. Our theory allows us to classify two classes of events (endogenous and exogenous) with specific signatures and characteristic precursors for the endogenous class. It also explains the origin of endogenous shocks as the coherent accumulation of tiny bad news, and thus unifies all previous explanations of large crashes, including Oct. 1987.
1 Introduction

A market crash occurring simultaneously on most of the stock markets of the world, as witnessed in Oct. 1987, would amount to the quasi-instantaneous evaporation of trillions of dollars. Market crashes are the extreme end-members of a hierarchy of market shocks, which shake stock markets repeatedly. Among

∗ We acknowledge helpful discussions and exchanges with E. Bacry and V. Pisarenko. This work was partially supported by the James S. McDonnell Foundation 21st century scientist award/studying complex system.
recent events still fresh in memories are the Hong Kong crash and the turmoil on US markets in Oct. 1997, the Russian default in Aug. 1998 and the ensuing market turbulence in western stock markets, and the collapse of the “new economy” bubble with the crash of the Nasdaq index in March 2000. In each case, a lot of work has been carried out to unravel the origin(s) of the crash, so as to understand its causes and develop possible remedies. However, no clear cause can usually be singled out. A case in point is the Oct. 1987 crash, for which many explanations have been proposed but none has been widely accepted unambiguously. These proposed causes include computer trading, derivative securities, illiquidity, trade and budget deficits, over-inflated prices generated by a speculative bubble during the earlier period, the auction system itself, the presence or absence of limits on price movements, regulated margin requirements, off-market and off-hours trading, the presence or absence of floor brokers, the extent of trading in the cash market versus the forward market, the identity of traders (i.e. institutions such as banks or specialized trading firms), the significance of transaction taxes, etc. More rigorous and systematic analyses of univariate associations and multiple regressions of these various factors conclude that it is not at all clear what caused the crash (Barro et al. 1989). The most precise statement, albeit somewhat self-referential, is that the most statistically significant explanatory variable in the October crash can be ascribed to the normal response of each country's stock market to a worldwide market motion (Barro et al. 1989).
In view of the stalemate reached by the approaches attempting to find a proximal cause of a market shock, several researchers have looked for more fundamental origins and have proposed that a crash may be the climax of an endogenous instability associated with the (rational or irrational) imitative behavior of agents (see for instance (Orléan 1989, Orléan 1995, Johansen and Sornette 1999, Shiller 2000)). Are there qualifying signatures of such a mechanism? According to (Johansen and Sornette 1999, Sornette and Johansen 2001), for which a crash is a stochastic event associated with the end of a bubble, the detection of such a bubble would provide a fingerprint. A large literature has emerged on the empirical detectability of bubbles in financial data and in particular on rational expectation bubbles (see (Camerer 1989, Adam and Szafarz 1992) for a survey). Unfortunately, the present evidence for speculative bubbles is fuzzy and unresolved at best, according to the standard economic and econometric literature. Other than the still controversial (Feigenbaum 2001) suggestion that super-exponential price acceleration (Sornette and Andersen 2002) and log-periodicity may qualify a speculative bubble (Johansen and Sornette 1999, Sornette and Johansen 2001), there are no unambiguous signatures that would allow one to qualify a market shock or a crash as specifically endogenous. On the other hand, standard economic theory holds that the complex trajectory of stock market prices is the faithful reflection of the continuous flow of news that is interpreted and digested by an army of analysts and traders (Cutler et al. 1989). Accordingly, large shocks should result from really bad surprises. It is a fact that exogenous shocks exist, as epitomized by the recent events of Sept. 11, 2001 and the coup against Gorbachev on Aug. 19, 1991, and there is no doubt about the existence of utterly exogenous bad news that moves stock market prices and creates strong bursts of volatility.
However, some could argue that precursory fingerprints of these events were known to some elites, suggesting the possibility that the actions of these informed agents may have been partly reflected in stock market prices. Even more difficult is the classification (endogenous versus exogenous) of the hierarchy of volatility bursts that continuously shake stock markets. While it is common practice to associate large market moves and strong bursts of volatility with external economic, political or natural events (White 1996), there is no convincing evidence supporting this. Here, we provide a clear and novel signature allowing us to distinguish between an endogenous and an exogenous origin of a volatility shock. Tests on the Oct. 1987 crash, on a hierarchy of volatility shocks and on a few of the obvious exogenous shocks validate the concept. Our theoretical framework combines a recently introduced, powerful and parsimonious model, the so-called multifractal random walk, with conditional probability calculations.
2 Long-range memory and distinction between endogenous and exogenous shocks
While returns do not exhibit discernible correlations beyond a time scale of a few minutes in liquid arbitraged markets, the historical volatility (measured as the standard deviation of price returns or, more generally, as a positive power of the absolute value of centered price returns) exhibits a long-range dependence characterized by a power-law decaying two-point correlation function (Ding et al. 1993, Ding and Granger 1996, Arneodo et al. 1998), approximately following a (t/T)^{−ν} decay with an exponent ν ≈ 0.2. A variety of models have been proposed to account for these long-range correlations (Granger and Ding 1996, Baillie 1996, Müller et al. 1997, Muzy et al. 2000, Muzy et al. 2001). In addition, not only are returns clustered in bursts of volatility exhibiting long-range dependence, but they also exhibit the property of multifractal scale invariance (or multifractality), according to which the moments m_q ≡ ⟨|r_τ|^q⟩ of the returns at time scale τ are found to scale as m_q ∝ τ^{ζ_q}, with the exponent ζ_q a non-linear function of the moment order q (Mandelbrot 1997, Muzy et al. 2000). To make quantitative predictions, we use a flexible and parsimonious model, the so-called multifractal random walk (MRW) (see Appendix A and (Muzy et al. 2000, Bacry et al. 2001)), which unifies these two empirical observations by deriving the multifractal scale invariance naturally from the long-range dependence of the volatility. The long-range nature of the volatility correlation function can be seen as the direct consequence of a slow power-law decay of the response function K_{∆t}(t) of the market volatility, measured at a time t after the occurrence of an external perturbation of the volatility at scale ∆t. We find that the distinctive difference between exogenous and endogenous shocks lies in the way the volatility relaxes to its unconditional average value.
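The scaling m_q ∝ τ^{ζ_q} can be checked by regressing ln m_q(τ) on ln τ. Below is a minimal sketch of such an estimation (our own illustrative code, not the thesis'); it is run on synthetic Gaussian returns, for which ζ_q = q/2 exactly, so that multifractality in real data would show up as a departure of ζ_q from this straight line in q:

```python
import numpy as np

def zeta(returns, qs, scales):
    """Estimate multifractal scaling exponents zeta_q from the scaling
    of the absolute moments m_q(tau) = <|r_tau|^q> across time scales tau."""
    logm = np.empty((len(qs), len(scales)))
    for j, tau in enumerate(scales):
        # aggregate elementary returns into returns at scale tau
        n = len(returns) // tau
        r_tau = returns[:n * tau].reshape(n, tau).sum(axis=1)
        for i, q in enumerate(qs):
            logm[i, j] = np.log(np.mean(np.abs(r_tau) ** q))
    # the slope of log m_q versus log tau gives zeta_q
    x = np.log(scales)
    return np.array([np.polyfit(x, row, 1)[0] for row in logm])

rng = np.random.default_rng(0)
r = rng.standard_normal(2 ** 18)          # Gaussian (monofractal) returns
qs = np.array([1.0, 2.0, 3.0])
z = zeta(r, qs, scales=[1, 2, 4, 8, 16, 32])
# for a Brownian walk, zeta_q = q / 2: a straight line in q
```

On empirical return series the same regression would be restricted to scales τ ≪ T, where the multifractal scaling holds.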
The prediction of the MRW model (see Appendix B for the technical derivation) is that the excess volatility E_exo[σ²(t) | ω₀] − \overline{σ²}, at scale ∆t, due to an external shock of amplitude ω₀ relaxes to zero according to the universal response

E_{\rm exo}[\sigma^2(t) \mid \omega_0] - \overline{\sigma^2} \;\propto\; e^{2 K_0 t^{-1/2}} - 1 \;\approx\; \frac{1}{\sqrt{t}},   (1)

for not too small times, where \overline{σ²} = σ²∆t is the unconditional average volatility. This prediction is nothing but the response function K_{∆t}(t) of the MRW model to a single piece of very bad news that is sufficient by itself to move the market significantly. This prediction is well verified by the empirical data shown in figure 1. On the other hand, an "endogenous" shock is the result of the cumulative effect of many small pieces of bad news, each looking relatively benign taken alone, but which, taken collectively along the full path of news, can add up coherently, due to the long-range memory of the volatility dynamics, to create a large "endogenous" shock. The term "endogenous" is thus not exactly adequate, since prices and volatilities are always moved by external news. The difference is that an endogenous shock in the present sense is the sum of the contributions of many "small" pieces of news adding up according to a specific most probable trajectory. It is this set of small bad news prior to the large shock that not only led to it but also continues to influence the dynamics of the volatility time series and creates an anomalously slow relaxation. Appendix C gives the derivation of the specific relaxation (21) associated with endogenous shocks. Figure 2 reports empirical estimates of the conditional volatility relaxation after local maxima of the S&P 100 intradaily series made of 5-minute close prices during the period from 04/08/1997 to 12/24/2001 (figure 2(a)). The original intraday squared returns have been preprocessed in order to remove the U-shaped volatility modulation associated with the intraday variations of market activity.
Figure 2(b) shows that the MRW
4. Relaxation de la volatilit´e
[Figure 1 here: double-logarithmic plot of the cumulative excess volatility ∫dt (E[σ(t) | ω₀] − E[σ(t)]) versus time in trading days. Legend: Nikkei 250, S&P 500 and FT-SE 100 for Aug. 19, 1991 (putsch against President Gorbachev); CAC 40 for Sep. 11, 2001 (attack against the WTC); S&P 500 for Oct. 19, 1987. The dashed line has slope α = 1/2.]
Figure 1: Cumulative excess volatility at scale ∆t, that is, the integral over time of E_exo[σ²(t) | ω₀] − \overline{σ²}, due to the volatility shock induced by the coup against President Gorbachev, observed in the British, Japanese and US indices, and the shock induced by the attack of September 11, 2001 against the World Trade Center. The dashed line is the theoretical prediction obtained by integrating (1), which gives a ∝ √t time dependence. The cumulative excess volatility following the crash of October 1987 is also shown with circles. Notice that the slope of the non-constant curve for the October 1987 crash is very different from the value 1/2 expected and observed for exogenous shocks. This crash and the resulting volatility relaxation can be interpreted as an endogenous event.
Figure 2: Measuring the conditional volatility response exponent α(s) for the S&P 100 intradaily time series as a function of the endogenous shock amplitude parameterized by s, defined by (19). (a) The original 5-minute intradaily time series from 04/08/1997 to 12/24/2001. The 5-minute de-seasonalized squared returns are aggregated in order to estimate the 40-minute and daily volatilities. (b) 40-minute log-volatility covariance C₄₀(τ) as a function of the logarithm of the lag τ. The MRW theoretical curve with λ² = 0.018 and T = 1 year (dashed line) provides an excellent fit of the data up to lags of one month. (c) Conditional volatility response ln(E_endo[σ²(t) | s]) as a function of ln(t) for three shocks with amplitudes given by s = −1, 0, 1. (d) Estimated exponent α(s) for ∆t = 40 minutes (•) as a function of s. The solid line is the prediction corresponding to Eq. (22). The dashed line corresponds to the empirical MRW estimate obtained by averaging over 500 Monte-Carlo trials. It fits more accurately for negative s (volatility lower than normal), because the estimates of the variance obtained by aggregation over smaller scales are very noisy for small variance values. The error bars give the 95% confidence intervals estimated by Monte-Carlo trials of the MRW process. In the inset, α(s) is compared for ∆t = 40 minutes (•) and ∆t = 1 day (×).
model provides a very good fit of the empirical volatility covariance over a range of time scales from 5 minutes to one month. Figure 2(c) plots in a double-logarithmic representation, for the time scale ∆t = 40 minutes, the estimated conditional volatility responses for s = 1, 0, −1, where the endogenous shocks are parameterized by e^{2s} \overline{σ²}. A value s > 0 (resp. s < 0) corresponds to a positive bump (resp. negative dip) of the volatility above (resp. below) the average level \overline{σ²}. The straight lines are the predictions (Eqs. (24) and (22)) of the MRW model and qualify power-law responses whose exponents α(s) are continuous functions of the shock amplitude s. Figure 2(d) plots the conditional response exponent α(s) as a function of s for the two time scales ∆t = 40 minutes and ∆t = 1 day (inset). For ∆t = 40 minutes, we observe that α varies from −0.2 for the largest positive shocks to +0.2 for the largest negative shocks, in excellent agreement with the MRW estimates (dashed line) and, for α ≥ 0, with Eq. (22), obtained without any adjustable parameter.¹ The error bars represent the 95% confidence intervals estimated using 500 trials of synthetic MRW with the same parameters as observed for the S&P 100 series. By comparing α(s) for different ∆t (inset), we see that the MRW model is able to recover not only the s-dependence of the exponent α(s) of the conditional response function to endogenous shocks, but also its variation with the time scale ∆t: this exponent increases as one goes from fine to coarse scales. Similar results are obtained for other intradaily time series (Nasdaq, FX rates, etc.). We also obtain the same results for 17 years of daily return time series of various indexes (French, German, Canadian, Japanese, etc.). In summary, the most remarkable result is the qualitatively different functional dependence of the response (1) to an exogenous shock compared with the response (24,22) to an endogenous one.
The former gives a decay of the burst of volatility ∝ 1/t^{1/2}, compared with 1/t^{α(s)} for endogenous shocks of amplitude e^{2s} \overline{σ²}, with an exponent α(s) that is a linear function of s.
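The exponents 1/2 and α(s) discussed here are measured as slopes in a log-log representation. The following sketch (our own, applied to synthetic data obeying the exogenous 1/√t law of (1), not the thesis' code) illustrates the estimation:

```python
import numpy as np

def relaxation_exponent(excess_vol, t):
    """Estimate alpha in  excess_vol(t) ~ t**(-alpha)  by least squares
    in log-log coordinates, as done for figures 1 and 2."""
    slope, _ = np.polyfit(np.log(t), np.log(excess_vol), 1)
    return -slope

rng = np.random.default_rng(1)
t = np.arange(1, 201, dtype=float)        # trading days after the shock
noise = np.exp(0.05 * rng.standard_normal(t.size))
excess = 3.0 * t ** -0.5 * noise          # exogenous-like 1/sqrt(t) decay
alpha = relaxation_exponent(excess, t)    # should be close to 1/2
```

Applied to the excess volatility following a shock, an estimate α close to 1/2 points to an exogenous origin, while a significantly different, s-dependent value points to an endogenous one.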
3 Discussion
What is the source of endogenous shocks characterized by the response function (21)? Appendix D and equation (29) predict that the expected path of the continuous information flow prior to an endogenous shock grows proportionally to the response function K(t_c − t), measured in backward time from the shock occurring at t_c. In other words, conditioned on the observation of a large endogenous shock, there is a specific set of trajectories of the news flow that led to it, whose expectation is given by (29). This result allows us to understand the distinctive features of an endogenous shock compared with an external shock. The latter is a single piece of very bad news that is sufficient by itself to move the market significantly according to (1). In contrast, an "endogenous" shock is the result of the cumulative effect of many small pieces of bad news, each looking relatively benign taken alone, but which, taken collectively along the full path of news, add up coherently, due to the long-range memory of the log-volatility dynamics, to create a large "endogenous" shock. The term "endogenous" is thus not exactly adequate, since prices and volatilities are always moved by external news. The difference is that an endogenous shock in the present sense is the sum of the contributions of many "small" pieces of news adding up along a specific most probable trajectory. It is this set of small bad news prior to the large shock that not only led to it but also continues to influence the dynamics of the volatility time series and creates the anomalously slow relaxation (21). In this respect, this result allows us to rationalize and unify the many explanations proposed to account for the Oct. 1987 crash: according to the present theory, each explanation taken alone is insufficient; rather, it is the cumulative effect of many such factors that led to the crash.
In a sense, the different commentators and analysts were all right in attributing the origin of the Oct. 1987 crash to many different factors, but they missed the main point that the crash was the extreme response of the system to the accumulation of many tiny contributions of bad news. To test this idea, we note that the decay of the volatility response after the Oct. 1987 crash has been described by a power law 1/t^{0.3} (Lillo and Mantegna 2001), which is in line with the prediction of our MRW theory, equation (22), for such a large shock (see also figure 2, panel d). This value of the exponent is still significantly smaller than 0.5. Figure 1 further demonstrates the difference between the relaxation of the volatility after this event, shown with circles, and those following the exogenous coup against Gorbachev and the September 11 attack. There is clearly a strong contrast, which qualifies the Oct. 1987 crash as endogenous in the sense of our theory of "conditional response." This provides an independent confirmation of the concept advanced before in (Johansen and Sornette 1999, Sornette and Johansen 2001). It is also interesting to compare the prediction (21) with those obtained with a linear autoregressive model of the type (5), in which ω(t) is replaced by σ(t). FIGARCH models fall in this general class. It is easy to show that such a linear (in volatility) model predicts the same exponent for the response of the volatility to endogenous shocks, independently of their magnitude. This prediction is in stark contrast with the prediction (21) of the log-volatility MRW model. The latter model is thus strongly validated by our empirical tests.

¹ The deviation of α(s) from expression (22) for negative s originates from the error in the volatility estimation using a sum of squared returns: the smaller the sum of squared returns, the larger the error. As ∆t increases, this error becomes negligible.
Appendix A: The Multifractal Random Walk (MRW) model

The multifractal random walk model is the continuous-time limit of a stochastic volatility model in which the log-volatility² correlations decay logarithmically. It possesses a nice "stability" property related to its scale invariance: for each time scale ∆t ≤ T, the returns at scale ∆t, r_{∆t}(t) ≡ ln[p(t)/p(t − ∆t)], can be described by a stochastic volatility model:

r_{\Delta t}(t) = \epsilon(t) \cdot \sigma_{\Delta t}(t) = \epsilon(t) \cdot e^{\omega_{\Delta t}(t)},   (2)

where ε(t) is a standardized Gaussian white noise independent of ω_{∆t}(t), and ω_{∆t}(t) is a nearly Gaussian process with mean and covariance

\mu_{\Delta t} = \frac{1}{2} \ln(\sigma^2 \Delta t) - C_{\Delta t}(0),   (3)

C_{\Delta t}(\tau) = \mathrm{Cov}[\omega_{\Delta t}(t), \omega_{\Delta t}(t+\tau)] = \lambda^2 \ln\!\left( \frac{T}{|\tau| + e^{-3/2}\,\Delta t} \right).   (4)
σ²∆t is the return variance at scale ∆t and T is an "integral" (correlation) time scale. Such a logarithmic decay of the log-volatility covariance at different time scales has been demonstrated empirically in (Arneodo et al. 1998, Muzy et al. 2000). Typical values for T and λ² are respectively one year and 0.02. According to the MRW model, the volatility correlation exponent ν is related to λ² by ν = 4λ². The MRW model can be expressed in a more familiar form, in which the log-volatility ω_{∆t}(t) obeys an auto-regressive equation whose solution reads

\omega_{\Delta t}(t) = \mu_{\Delta t} + \int_{-\infty}^{t} d\tau\, \eta(\tau)\, K_{\Delta t}(t - \tau),   (5)
where η(t) denotes a standardized Gaussian white noise and the memory kernel K_{∆t}(·) is a causal function, ensuring that the system is not anticipative. The process η(t) can be seen as the information flow. Thus ω(t) represents the response of the market to incoming information up to the date t. At time t, the distribution of ω_{∆t}(t) is Gaussian with mean μ_{∆t} and variance

V_{\Delta t} = \int_0^{\infty} d\tau\, K_{\Delta t}^2(\tau) = \lambda^2 \ln\!\left( \frac{T e^{3/2}}{\Delta t} \right).

Its covariance, which entirely specifies the random process, is given by

C_{\Delta t}(\tau) = \int_0^{\infty} dt\, K_{\Delta t}(t)\, K_{\Delta t}(t + |\tau|).   (6)

² The log-volatility is the natural quantity used in canonical stochastic volatility models (see (Kim et al. 1998) and references therein).
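A short MRW sample path can be simulated directly from (2)-(4) by drawing ω as a correlated Gaussian vector. The sketch below is our own illustration (series length and parameter values are illustrative choices close to those quoted in the text, not the thesis' calibration); it uses an eigen-decomposition of the covariance matrix, which is robust to tiny negative eigenvalues caused by rounding:

```python
import numpy as np

# Discrete MRW sample path from eqs. (2)-(4): omega is a Gaussian vector
# with logarithmic covariance, epsilon an independent Gaussian white noise.
rng = np.random.default_rng(2)
N, dt, T = 512, 1.0, 2048.0        # N points at scale dt, integral scale T
lam2, sigma2 = 0.02, 1.0

tau = np.abs(np.subtract.outer(np.arange(N), np.arange(N))) * dt
C = lam2 * np.log(T / (tau + np.exp(-1.5) * dt))       # eq. (4)
mu = 0.5 * np.log(sigma2 * dt) - C[0, 0]               # eq. (3)

# draw omega ~ N(mu, C) via eigen-decomposition of C
w, V = np.linalg.eigh(C)
omega = mu + V @ (np.sqrt(np.clip(w, 0.0, None)) * rng.standard_normal(N))
r = rng.standard_normal(N) * np.exp(omega)             # eq. (2)
# by construction of mu, E[r^2] = sigma2 * dt
```

For long series this O(N³) construction becomes impractical and one would instead use the kernel representation (5) or a circulant-embedding method.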
Performing a Fourier transform, we obtain \hat{K}_{\Delta t}(f)^2 = \hat{C}_{\Delta t}(f) = 2\lambda^2 f^{-1} \int_0^{Tf} [\ldots], which shows that, for τ small enough (∆t ≪ τ ≪ T),

K_{\Delta t}(\tau) \sim K_0 \sqrt{\frac{\lambda^2 T}{\tau}}.

[...]

\rho_{A,B} = \frac{\mathrm{Cov}(X, Y \mid X \in A,\; Y \in B)}{\sqrt{\mathrm{Var}(X \mid X \in A,\; Y \in B) \cdot \mathrm{Var}(Y \mid X \in A,\; Y \in B)}}.   (12)
In this case, it is much more difficult to obtain general results for any specified class of distributions than in the previous case of conditioning on a single variable. We have only been able to derive the asymptotic behavior for a Gaussian distribution in the situation detailed below, using the expressions in (Johnson and Kotz 1972, p. 113) or proposition A.1 of (Ang and Chen 2001). Let us assume that the pair of random variables (X, Y) has a Normal distribution with unit unconditional variance and unconditional correlation coefficient ρ. The subsets A and B are both chosen equal to [u, +∞), with u ∈ R₊, so that we focus on the correlation coefficient conditional on the returns of both X and Y being larger than the threshold u. Denoting by ρ_u the correlation coefficient conditional on this particular choice of the subsets A and B, we are able to show (see appendix A.2) that, for large u:

\rho_u \sim_{u \to \infty} \rho \cdot \frac{1+\rho}{1-\rho} \cdot \frac{1}{u^2},   (13)
which goes to zero. This decay is faster than in the case governed by (3), resulting from conditioning on a single variable, which leads to ρ_v^+ ∼_{v→+∞} 1/v; but, unfortunately, we do not observe a qualitative change. Thus, the correlation coefficient conditioned on both variables does not yield significant new information and does not provide any special improvement with respect to the correlation coefficient conditioned on a single variable.
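The behavior summarized by (12) and (13) is easy to check by simulation in the bivariate Gaussian case. In the following sketch (ρ, the thresholds and the sample size are arbitrary choices of ours), the conditional correlation decreases as the joint threshold u increases:

```python
import numpy as np

# Simulated conditional correlation rho_u of (12) for a bivariate
# Gaussian pair, conditioned on both variables exceeding u.
rng = np.random.default_rng(7)
rho = 0.5
z = rng.standard_normal((2, 1_000_000))
x = z[0]
y = rho * z[0] + np.sqrt(1 - rho ** 2) * z[1]

def rho_cond(u):
    m = (x > u) & (y > u)
    return np.corrcoef(x[m], y[m])[0, 1]

rho_u = [rho_cond(u) for u in (0.0, 0.5, 1.0)]
# rho_u decreases toward zero as u grows, consistent with eq. (13)
```

Note that the asymptotic form (13) only applies for large u; for moderate thresholds, as here, only the qualitative decrease is visible.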
1.5 Empirical evidence
We consider four national stock markets in Latin America, namely Argentina (MERVAL index), Brazil (IBOV index), Chile (IPSA index) and Mexico (MEXBOL index). We are particularly interested in the contagion effects which may have occurred across these markets. We will study this question for the market indexes expressed in US dollars, to emphasize the effect of the devaluations of the local currencies and so account for monetary crises. In doing so, we follow the same methodology as most contagion papers (see (Forbes and Rigobon 2002), for instance). Our sample contains the daily (log) returns of each index in local currency and US dollars during the time interval from January 15, 1992 to June 15, 2002, and thus encompasses both the Mexican crisis and the current Argentinian crisis. Before applying the theoretical results derived above, we first need to check whether we are allowed to do so. Namely, we have to test whether the index return distributions are not too fat-tailed. Indeed, it is well known that the correlation coefficient exists if and only if the tail of the distribution decays faster than a power law with tail index equal to 2, and that its estimator, given by Pearson's coefficient, is well behaved only if at least the fourth moment of the distribution is finite. Figure 1 represents the complementary distribution of the positive and negative tails of the index returns in US dollars. We observe that the positive tail clearly decays faster than a power law with tail index equal to 2. In fact, Hill's estimator provides a value ranging between 3 and 4 for the four indexes. The case of the negative tail is slightly different, particularly for the Brazilian index. Indeed, for the Argentinian, Chilean and Mexican indexes, the negative tail behaves almost like the positive one, but for the Brazilian index the negative tail exponent is hardly larger than two, as confirmed by Hill's estimator. This means that, in
9. Mesure de la d´ependance extrˆeme entre deux actifs financiers
the Brazilian case, the estimates of the correlation coefficient will be particularly noisy and thus of weak statistical value. We have checked that the fat-tailedness of the indexes expressed in US dollars comes from the impact of the exchange rates. Thus, an alternative would be to consider the indexes in local currency, following the methodology of (Longin and Solnik 1995, Longin and Solnik 2001), but this would lead us to focus on the linkages between markets only and to neglect the impact of the devaluations, which is precisely the main concern of many studies about contagion in Latin America. Figures 2, 4 and 6 give the conditional correlation coefficient ρ_v^{+,−} (plain thick line) for the pairs Argentina / Brazil, Brazil / Chile and Chile / Mexico, while figures 3, 5 and 7 show the conditional correlation coefficient ρ_v^s for the same pairs. In each figure, the dashed thick line gives the theoretical curve obtained under the bivariate Gaussian assumption, whose analytical expressions can be found in appendices A.1.1 and A.1.2. The unconditional correlation coefficient of the Gaussian model is set to the empirically estimated unconditional correlation coefficient. The two dashed thin lines delimit the interval within which we cannot reject, at the 95% confidence level, the hypothesis that the estimated conditional correlation coefficient is equal to the theoretical one. This confidence interval has been estimated using Fisher's statistic. Similarly, the thick dotted curve graphs the theoretical curve obtained under the bivariate Student assumption with ν = 3 degrees of freedom (whose expressions are given in appendices B.3 and B.4), and the two thin dotted lines delimit its 95% confidence interval. Here, Fisher's statistic cannot be applied, since it requires that at least the fourth moment of the distribution exist.
In fact, (Meerschaert and Scheffler 2001) have shown that, in such a case, the distribution of the sample correlation converges to a stable law with index 3/2, which explains why the confidence interval for the Student model with three degrees of freedom is much larger than the confidence interval for the Gaussian model. In the present study, we have used a bootstrap method to derive this confidence interval, since the scale factor entering the stable law is difficult to calculate. In figures 2, 4 and 6, we observe that the changes in the conditional correlation coefficients ρ_v^{+,−} are not significantly different, at the 95% confidence level, from those obtained with a bivariate Student model with three degrees of freedom. In contrast, the Gaussian model is often rejected. In fact, similar results hold (but are not depicted here) for the three other pairs Argentina / Chile, Argentina / Mexico and Brazil / Mexico. Thus, these observations should lead us to conclude that, in these cases, no change in the correlations, and therefore no contagion mechanism, needs to be invoked to explain the data, since they are compatible with a Student model with constant correlation. Let us now discuss the results obtained for the correlation coefficient conditioned on the volatility. Figures 3 and 7 show that the estimated correlation coefficients conditioned on volatility remain consistent with the Student model with three degrees of freedom, while they still reject the Gaussian model. In contrast, figure 5 shows that the increase of the correlation cannot be explained by either the Gaussian or the Student model when conditioning on the Mexican index volatility. Indeed, when the Mexican index volatility becomes larger than 2.5 times its standard deviation, none of our models can account for the increase of the correlation. The same discrepancy is observed for the pairs Argentina / Chile, Argentina / Mexico and Brazil / Mexico, which have not been represented here.
In each case, the Chilean or the Mexican market has an impact on the Argentinian or the Brazilian market which cannot be accounted for by either the Gaussian or the Student model with constant correlation. To conclude this empirical part, there is no significant increase in the real correlation between Argentina and Brazil on the one hand, and between Chile and Mexico on the other hand, when the volatility or the returns exhibit large moves. But, in periods of high volatility, the Chilean and Mexican markets may have a genuine impact on the Argentinian and Brazilian markets. Thus, a priori, this should lead us to conclude that contagion exists across these markets. However, this conclusion is based on the investigation of
9.1. Les diff´erentes mesures de d´ependances extrˆemes
only two theoretical models, and it would be a little hasty to conclude on the existence of contagion on the sole basis of these results. This is all the more true since our theoretical models are all symmetric in their positive and negative tails, a crucial property needed for the derivation of the expressions of ρ_v^s, while the empirical sample distributions are certainly not symmetric, as shown in figure 1.
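Hill's estimator, invoked above to gauge the tail indexes, is the reciprocal of the average log-excess of the k largest order statistics. A minimal sketch (our own code, run on synthetic Pareto data with known tail index 3; the sample size and the choice of k are illustrative):

```python
import numpy as np

def hill(x, k):
    """Hill estimator of the tail index from the k largest order statistics."""
    x = np.sort(np.asarray(x))[::-1]           # descending order
    logs = np.log(x[:k + 1])
    return 1.0 / np.mean(logs[:k] - logs[k])   # mean log-excess over X_(k+1)

rng = np.random.default_rng(3)
u = rng.uniform(size=100_000)
sample = u ** (-1.0 / 3.0)        # Pareto sample, true tail index alpha = 3
alpha_hat = hill(sample, k=2_000)  # should be close to 3
```

In practice the estimate depends on k, and one inspects its stability over a range of k before quoting a tail index, as done for the index returns above.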
1.6 Summary
The previous sections have shown that the conditional correlation coefficients can exhibit any behavior, depending on their conditioning set and on the underlying distribution of returns. More precisely, we have shown that the correlation coefficients, conditioned on large returns or on volatility above a threshold v, can be either increasing or decreasing functions of the threshold, can go to any value between zero and one when the threshold goes to infinity, and can produce contradictory results, in the sense that accounting for a trend or not can lead one to conclude either that linear correlation is absent or that it is perfect. Moreover, due to the large statistical fluctuations of the empirical estimates, one should be very careful before concluding on an increase or decrease of the genuine correlations. Thus, from the general standpoint of the study of extreme dependences, but more particularly for the specific problem of contagion across countries, the use of conditional correlations does not seem very informative and is sometimes misleading, since it leads to spurious changes in the observed correlations: even when the unconditional correlation remains constant, conditional correlations exhibit artificial changes, as we have forcefully shown. Since one of the most commonly accepted and used definitions of contagion is the detection of an increase of the conditional correlations during a period of turmoil, namely when the volatility increases, our results cast serious doubt on previous results. In this respect, the conclusions of (Calvo and Reinhart 1996) about the occurrence of contagion across Latin American markets during the 1994 Mexican crisis, but more generally also the results of (King and Wadhwani 1990) or (Lee and Kim 1993) on the effect of the October 1987 crash on the linkage of national markets, must be considered with some caution. It is quite desirable to find a more reliable tool for studying extreme dependences.
2 Conditional concordance measures
The (conditional) correlation coefficients which have just been investigated suffer from several theoretical as well as empirical deficiencies. From the theoretical point of view, the correlation coefficient is only a measure of linear dependence. Thus, as stressed by (Embrechts et al. 1999), it is fully satisfying only for the description of the dependence of variables with elliptical distributions. Moreover, the correlation coefficient aggregates the information contained both in the marginal and in the collective behavior. In particular, it is not invariant under an increasing change of variable, a transformation which is known to leave the dependence structure unchanged. From the empirical standpoint, we have seen that, for some of the data considered, the correlation coefficient may not always exist, and that, even when it exists, it cannot always be accurately estimated, due to sometimes "wild" statistical fluctuations. Thus, it is desirable to find another measure of the dependence between two assets, or more generally between two random variables, which, contrary to the linear correlation coefficient, is always well-defined and depends only on the properties of the copula. This ensures that such a measure is not affected by a change in the marginal distributions (provided that the mapping is increasing). It turns out that this desirable property is shared by all measures of concordance, among which are the well-known Kendall's tau, Spearman's rho and Gini's beta (see (Nelsen 1998) for details). However, these concordance measures are not well adapted, as such, to the study of extreme dependence, because they are functions of the whole distribution, including the moderate and small returns. A simple idea
to investigate the extreme concordance properties of two random variables is to calculate these quantities conditioned on values larger than a given threshold, and to let this threshold go to infinity. In the sequel, we will focus only on Spearman's rho, which can easily be estimated empirically. It offers a natural generalization of the (linear) correlation coefficient: the correlation coefficient quantifies the degree of linear dependence between two random variables, while Spearman's rho quantifies the degree of monotonic functional dependence, whatever the (increasing) functional relation between the two random variables may be. This represents a very interesting improvement. Perfect correlation (respectively anti-correlation) gives a value 1 (respectively −1) both for the standard correlation coefficient and for Spearman's rho. Otherwise, there is no general relation allowing one to deduce Spearman's rho from the correlation coefficient, and vice versa.
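The invariance of rank-based measures under increasing changes of variable, in contrast with the linear correlation coefficient, is easy to verify numerically. A sketch (toy Gaussian data and a rank-based estimator of Spearman's rho, both ours):

```python
import numpy as np

def spearman(x, y):
    """Spearman's rho as the linear correlation of the ranks."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(4)
x = rng.standard_normal(20_000)
y = x + 0.5 * rng.standard_normal(20_000)

# exp is strictly increasing, so the ranks (and hence Spearman's rho)
# are unchanged, while the linear (Pearson) correlation is not
rho_s, rho_s_t = spearman(x, y), spearman(np.exp(x), np.exp(y))
r, r_t = np.corrcoef(x, y)[0, 1], np.corrcoef(np.exp(x), np.exp(y))[0, 1]
```

Here rho_s and rho_s_t coincide exactly, while r_t differs noticeably from r: the linear coefficient mixes the marginal distributions into the dependence measurement.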
2.1 Definition
Spearman's rho, denoted ρ_s in the sequel, measures the difference between the probability of concordance and the probability of discordance for the two pairs of random variables (X₁, Y₁) and (X₂, Y₃), where the pairs (X₁, Y₁), (X₂, Y₂) and (X₃, Y₃) are three independent realizations drawn from the same distribution:

\rho_s = 3 \left( \Pr[(X_1 - X_2)(Y_1 - Y_3) > 0] - \Pr[(X_1 - X_2)(Y_1 - Y_3) < 0] \right).   (14)
Spearman's rho can also be expressed with the copula C of the two variables X and Y (see (Nelsen 1998), for instance):

\rho_s = 12 \int_0^1 \int_0^1 C(u, v)\, du\, dv - 3,   (15)
which allows us to easily calculate ρ_s when the copula C is known in closed form. Denoting U = F_X(X) and V = F_Y(Y), it is easy to show that ρ_s is nothing but the (linear) correlation coefficient of the uniform random variables U and V:

\rho_s = \frac{\mathrm{Cov}(U, V)}{\sqrt{\mathrm{Var}(U)\, \mathrm{Var}(V)}}.   (16)

This justifies its name as a rank correlation coefficient, and shows that it can easily be estimated. An attractive feature of Spearman's rho is that it is independent of the margins, as we can see in equation (15). Thus, contrary to the linear correlation coefficient, which aggregates the marginal properties of the variables with their collective behavior, the rank correlation coefficient takes into account only the dependence structure of the variables. Using expression (16), we propose a natural definition of the conditional rank correlation, conditioned on V being larger than a given threshold ṽ:

\rho_s(\tilde v) = \frac{\mathrm{Cov}(U, V \mid V \ge \tilde v)}{\sqrt{\mathrm{Var}(U \mid V \ge \tilde v)\, \mathrm{Var}(V \mid V \ge \tilde v)}},   (17)

whose expression in terms of the copula C(·,·) is given in appendix D.
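Definition (17) translates directly into an empirical estimator: map each series to its normalized ranks and correlate the pairs whose V-rank exceeds the threshold. A sketch (the estimator and the Gaussian toy data are ours, not the thesis' implementation):

```python
import numpy as np

def conditional_spearman(x, y, v_tilde):
    """Estimate rho_s(v) of eq. (17): rank correlation of (U, V)
    restricted to observations with V >= v_tilde (a quantile in [0, 1))."""
    n = len(x)
    u = (np.argsort(np.argsort(x)) + 1.0) / (n + 1.0)   # U = F_X(X)
    v = (np.argsort(np.argsort(y)) + 1.0) / (n + 1.0)   # V = F_Y(Y)
    m = v >= v_tilde
    return np.corrcoef(u[m], v[m])[0, 1]

rng = np.random.default_rng(5)
z = rng.standard_normal((2, 50_000))
x = z[0]
y = 0.8 * z[0] + 0.6 * z[1]                  # Gaussian pair, rho = 0.8
rho_s_0 = conditional_spearman(x, y, 0.0)    # unconditional rank correlation
rho_s_09 = conditional_spearman(x, y, 0.9)   # conditioned on the upper decile
```

For this Gaussian toy pair the conditional rank correlation in the upper decile is markedly lower than the unconditional one, illustrating the threshold dependence discussed below.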
2.2 Example
Contrary to the conditional correlation coefficient, we have not been able to obtain analytical expressions for the conditional Spearman's rho, at least for the distributions that we have considered up to now. Obviously, for many families of copulas known in closed form, equation (17) allows for an explicit calculation
of ρ_s(v). However, most copulas of interest in finance have no simple closed form, so that it is necessary to resort to numerical computations. As an example, let us consider the bivariate Gaussian distribution (or copula) with unconditional correlation coefficient ρ. It is well known that its unconditional Spearman's rho is given by

\rho_s = \frac{6}{\pi} \arcsin \frac{\rho}{2}.   (18)
The left panel of figure 8 shows the conditional Spearman's rho ρ_s(v) defined by (17), obtained from a numerical integration. We observe the same bias as for the conditional correlation coefficient, namely that the conditional rank correlation changes with v even though the unconditional correlation is fixed to a constant value. Nonetheless, the conditional Spearman's rho seems more sensitive than the conditional correlation coefficient, since we can observe in the left panel of figure 8 that, as v goes to one, ρ_s(v) does not go to zero for all values of ρ (at the precision of our bootstrap estimates), contrary to what was previously observed for the conditional correlation coefficient (see equation (3)). The right panel of figure 8 depicts the conditional Spearman's rho of the Student copula with three degrees of freedom. The results are the same concerning the bias, but this time ρ_s(v) goes to zero for all values of ρ when v goes to one. Thus, here again, many different behaviors can be observed depending on the underlying copula of the random variables. Moreover, these two examples show that the quantification of extreme dependence depends on the tool used to quantify it: here, the conditional Spearman's ρ goes to a non-vanishing constant for the Gaussian model while the conditional (linear) correlation coefficient goes to zero, and for the Student distribution the situation is exactly the opposite.
2.3
Empirical evidence
Figures 9, 10 and 11 give the conditional Spearman's rho respectively for the Argentinian / Brazilian stock markets, the Brazilian / Chilean stock markets and the Chilean / Mexican stock markets. As previously, the plain thick line refers to the estimated correlation, while the dashed lines refer to the Gaussian copula and its 95% confidence levels, and the dotted lines to the Student's copula with three degrees of freedom and its 95% confidence levels. We first observe that, contrary to the cases of the conditional (linear) correlation coefficient exhibited in figures 2, 4 and 6, the empirical conditional Spearman's ρ does not always comply with the Student's model (nor with the Gaussian one), which confirms the discrepancies observed in figures 3, 5 and 7. In all cases, for thresholds v larger than the quantile 0.5, corresponding to the positive returns, the Student's model with three degrees of freedom is always sufficient to explain the data. In contrast, for the negative returns, and thus thresholds v lower than the quantile 0.5, only the interaction between the Chilean and Mexican markets is well described by the Student's copula and does not need any additional ingredient such as a contagion mechanism. For all other pairs, none of our models explains the data satisfactorily. Therefore, for these cases and from the perspective of our models, the contagion hypothesis seems to be needed. There are however several caveats. First, even though we have considered the most natural financial models, there may be other models, that we have ignored, with a constant dependence structure which can account for the observed evolutions of the conditional Spearman's ρ. If this is true, then the contagion hypothesis would not be needed. Second, the discrepancy between the empirical conditional Spearman's ρ and the prediction of the Student's model does not occur in the tails of the distribution, i.e., for large and extreme movements, but in the bulk.
Thus, during periods of turmoil, the Student's model with three degrees of freedom seems to remain a good model of co-movements. Third, the contagion effect is never necessary for upward moves. Indeed, we observe the same asymmetry or trend dependence as found by (Longin and Solnik 2001) for five
9. Measuring extreme dependence between two financial assets
major equity markets. This was apparent in figures 2, 4 and 6 for ρ_v^{+,−}, and is strongly confirmed by the conditional Spearman's ρ. Interestingly, there is also an asymmetry or directivity in the mutual influence between markets. For instance, the Chilean and Mexican markets have an influence on the Argentinian and Brazilian markets, but the latter do not have any impact on the Chilean and Mexican markets. Chile and Mexico have no contagion effect on each other, while Argentina and Brazil have. These empirical results on the conditional Spearman's ρ are different from, and often opposite to, the conclusions derived from the conditional correlation coefficients ρ_v^{+,−}. This highlights the difficulty in obtaining reliable, unambiguous and sensitive estimations of conditional correlation measures. In particular, the Pearson's coefficient usually employed to estimate the correlation coefficient between two variables is known not to be very efficient when the variables are fat-tailed and when the estimation is performed on a small sample. Indeed, with small samples, the Pearson's coefficient is very sensitive to the largest value, which can lead to an important bias in the estimation. Moreover, even with large sample sizes, (Meerschaert and Scheffler 2001) have shown that the nature of the convergence of the Pearson's coefficient of two time series with tail index µ towards the theoretical correlation, as the sample size T tends to infinity, is sensitive to the existence and strength of the theoretical correlation. If there is no theoretical correlation between the two time series, the sample correlation tends to zero with Gaussian fluctuations. If the theoretical correlation is non-zero, the difference between the sample correlation and the theoretical correlation, times T^{1−2/µ}, converges in distribution to a stable law with index µ/2.
These large statistical fluctuations are responsible for the lack of accuracy of the estimated conditional correlation coefficients encountered in the previous section. Thus, we think that the conditional Spearman's ρ provides a good alternative, from both a theoretical and an empirical viewpoint.
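The sensitivity of the Pearson coefficient to the single largest value, alluded to above, is easy to demonstrate. The sketch below (our own illustration, with synthetic data) appends one joint extreme observation to two otherwise independent samples and compares the reaction of the Pearson and Spearman estimates.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
T = 100
x = rng.standard_normal(T)
y = rng.standard_normal(T)  # independent: both correlations should be ~ 0

# append a single joint extreme observation
x_out = np.append(x, 20.0)
y_out = np.append(y, 20.0)

r_before = stats.pearsonr(x, y)[0]
r_after = stats.pearsonr(x_out, y_out)[0]
s_before = stats.spearmanr(x, y).correlation
s_after = stats.spearmanr(x_out, y_out).correlation

# one point drags the Pearson estimate towards 1, while the
# rank-based Spearman estimate barely moves
print(r_before, r_after)
print(s_before, s_after)
```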
3
Tail dependence
For the sake of completeness, and since it is directly related to multivariate extreme value theory, we study the so-called coefficient of tail dependence λ. To our knowledge, its interest for financial applications was first underlined by (Embrechts et al. 2001). The coefficient of tail dependence characterizes an important property of the extreme dependence between X and Y, using the (original or unconditional) copula of X and Y. In contrast, the conditional Spearman's rho is defined in terms of a conditional copula, and can be seen as the "unconditional Spearman's rho" of the copula of X and Y conditioned on Y being larger than the threshold v. This conditional copula is not the true copula of X and Y, because it is modified by the conditioning. In this sense, the tail dependence parameter λ is a more natural property, directly related to the copula of X and Y. To begin with, we recall the definition of the coefficient λ, as well as that of λ̄ (see below), which allows one to quantify the amount of dependence in the tail. Then, we present several results concerning the coefficient λ of tail dependence for various distributions and models, and finally, we discuss the problems encountered in the estimation of these quantities.
3.1
Definition
The concept of tail dependence is appealing in its simplicity. By definition, the (upper) tail dependence coefficient is

λ = lim_{u→1} Pr[X > FX^{−1}(u) | Y > FY^{−1}(u)] ,   (19)

and quantifies the probability of observing a large X, assuming that Y is itself large. For a survey of the properties of the tail dependence coefficient, the reader is referred to (Coles et al. 1999, Embrechts et al. 2001, Lindskog 1999), for instance. In words, given that Y is very large (which occurs with probability 1 − u), the probability that X is very large at the same probability level u defines asymptotically the tail dependence coefficient λ. As an example, considering that X and Y represent the volatilities of two different national markets, the coefficient of tail dependence λ gives the probability that both markets exhibit together very high volatility. One appeal of this definition of tail dependence is that it is a pure copula property, i.e., it is independent of the margins of X and Y. Indeed, let C be the copula of the variables X and Y; if the bivariate copula C is such that the limit

lim_{u→1} [1 − 2u + C(u, u)] / (1 − u) = lim_{u→1} [2 − log C(u, u) / log u] = λ   (20)

exists, then C has an upper tail dependence coefficient λ (see (Coles et al. 1999, Embrechts et al. 2001, Lindskog 1999)). If λ > 0, the copula presents tail dependence and large events tend to occur simultaneously, with probability λ. On the contrary, when λ = 0, the copula has no tail dependence and the variables X and Y are said to be asymptotically independent. There is, however, a subtlety in this definition (19) of tail dependence. To make it clear, first consider the case where, for large X and Y, the bivariate survival function F̄(x, y) = Pr{X > x, Y > y} factorizes, such that

lim_{x,y→∞} F̄(x, y) / [ F̄X(x) · F̄Y(y) ] = 1 ,   (21)

where F̄X(x) and F̄Y(y) are the marginal survival functions of X and Y respectively. This means that, for X and Y sufficiently large, these two variables can be considered as independent. It is then easy to show that

lim_{u→1} Pr{X > FX^{−1}(u) | Y > FY^{−1}(u)} = lim_{u→1} [1 − FX(FX^{−1}(u))]   (22)
  = lim_{u→1} (1 − u) = 0 ,   (23)

so that independent variables really have no tail dependence, λ = 0, as one can expect. However, the result λ = 0 does not imply that the multivariate distribution can automatically be factorized asymptotically, as shown by the Gaussian example. Indeed, the Gaussian multivariate distribution does not have a factorizable multivariate distribution, even asymptotically for extreme values, since the non-diagonal term of the quadratic form in the exponential function does not become negligible in general as X and Y go to infinity. Therefore, in a weaker sense, there may still be a dependence in the tail even when λ = 0. To make this statement more precise, following (Coles et al. 1999), let us introduce the coefficient

λ̄ = lim_{u→1} [ 2 · log Pr{X > FX^{−1}(u)} / log Pr{X > FX^{−1}(u), Y > FY^{−1}(u)} ] − 1   (24)
  = lim_{u→1} [ 2 · log(1 − u) / log(1 − 2u + C(u, u)) ] − 1 .   (25)

It can be shown that the coefficient λ̄ = 1 if and only if the coefficient of tail dependence λ > 0, while λ̄ takes values in [−1, 1) when λ = 0, allowing us to quantify the strength of the dependence in the tail in such
a case. In fact, it has been established that, when λ̄ > 0, the variables X and Y are simultaneously large more frequently than independent variables would be, while simultaneous large deviations of X and Y occur less frequently than under independence when λ̄ < 0 (the interested reader is referred to (Ledford and Tawn 1996, Ledford and Tawn 1998)). To summarize, independence (factorization of the bivariate distribution) implies no tail dependence, λ = 0. But λ = 0 is not sufficient to imply factorization and thus true independence. It also requires, as a necessary condition, that λ̄ = 0. We will first recall the expression of the tail dependence coefficient for usual distributions, and then calculate it in the case of a one-factor model for different distributions of the factor.
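For a copula whose diagonal is known in closed form, the limit (20) can be evaluated directly. As an illustrative sketch (the Gumbel copula is our own choice here, not one discussed above), its diagonal C(u, u) = u^(2^(1/θ)) yields the known value λ = 2 − 2^(1/θ):

```python
import numpy as np

theta = 2.0  # Gumbel copula parameter (an arbitrary illustrative choice)

def gumbel_diagonal(u):
    """Diagonal C(u, u) of the Gumbel copula: C(u, u) = u ** (2 ** (1/theta))."""
    return u ** (2.0 ** (1.0 / theta))

u = 1.0 - 1e-8  # close to the upper corner
lam_from_limit = 2.0 - np.log(gumbel_diagonal(u)) / np.log(u)  # second form of eq. (20)
lam_closed_form = 2.0 - 2.0 ** (1.0 / theta)  # known Gumbel tail dependence

print(lam_from_limit, lam_closed_form)  # both close to 0.586
```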
3.2
Tail dependence for Gaussian distributions and Student’s distributions
Assuming that (X, Y) are normally distributed with correlation coefficient ρ, (Embrechts et al. 2001) show that for all ρ ∈ [−1, 1), λ = 0, while (Heffernan 2000) gives λ̄ = ρ, which expresses, as one can expect, that extremes appear more likely together for positively correlated variables. In contrast, if (X, Y) have a Student's distribution, (Embrechts et al. 2001) show that the tail dependence coefficient is

λ = 2 · T̄ν+1( √(ν+1) · √((1−ρ)/(1+ρ)) ) ,   (26)

where T̄ν+1 denotes the Student's survival distribution function with ν+1 degrees of freedom. This coefficient is greater than zero for all ρ > −1, so that λ̄ = 1. This last example proves that extremes appear more likely together whatever the correlation coefficient may be, showing that, in fact, there is no general relationship between asymptotic dependence and the linear correlation coefficient. The Gaussian and Student's distributions are elliptical, and for elliptical distributions the following general result is known: (Hult and Lindskog 2001) have shown that elliptically distributed random variables present tail dependence if and only if they are regularly varying, i.e., behave asymptotically like power laws with some exponent ν > 0. In such a case, for every regularly varying pair of elliptically distributed random variables, the coefficient of tail dependence λ is given by expression (26). This result is very natural, since the correlation coefficient is an invariant quantity within the class of elliptical distributions and since the coefficient of tail dependence is determined only by the asymptotic behavior of the distribution, so that it does not matter whether the distribution is a Student's distribution with ν degrees of freedom or any other elliptical distribution, as long as they have the same asymptotic behavior in the tail.
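Expression (26) is straightforward to evaluate numerically. A minimal sketch, in which `scipy.stats.t.sf` plays the role of the survival function T̄ν+1:

```python
import numpy as np
from scipy import stats

def student_tail_dependence(rho, nu):
    """Tail dependence of a bivariate Student's distribution, eq. (26):
    lambda = 2 * Tbar_{nu+1}( sqrt(nu+1) * sqrt((1-rho)/(1+rho)) )."""
    arg = np.sqrt((nu + 1.0) * (1.0 - rho) / (1.0 + rho))
    return 2.0 * stats.t.sf(arg, df=nu + 1)  # sf is the survival function

# non-zero even for uncorrelated Student's variables...
print(student_tail_dependence(0.0, 3))
# ...and it fades away as nu grows (approach to the Gaussian case)
print(student_tail_dependence(0.5, 3), student_tail_dependence(0.5, 50))
```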
3.3
Tail dependence generated by a factor model
Consider the one-factor model

X1 = α1 · Y + ε1 ,   (27)
X2 = α2 · Y + ε2 ,   (28)

where the εi's are random variables independent of Y and the αi's are non-random positive coefficients. (Malevergne and Sornette 2002) have shown that the tail dependence coefficient λ of X1 and X2 can simply be expressed as the minimum of the tail dependence coefficients λ1 and λ2 between the random variables X1 and Y and between X2 and Y respectively:

λ = min{λ1, λ2} .   (29)
To understand this result, note that the tail dependence between X1 and X2 is created only through the common factor Y. It is thus natural that the tail dependence between X1 and X2 is bounded from above by the weakest tail dependence between the Xi's and Y, while deriving the equality requires more work (Malevergne and Sornette 2002). Thus, it is only necessary to focus our study on the tail dependence between any Xi and Y. So, in order to simplify the notations, we drop the subscripts 1 and 2, since they are irrelevant for the dependence between X1 (or X2) and Y. A general result concerning the tail dependence generated by factor models for any kind of factor and noise distributions can be found in (Malevergne and Sornette 2002). It has been proved that the coefficient of (upper) tail dependence between X and Y is given by

λ = ∫_{max{1, l/α}}^{∞} f(x) dx ,   (30)

where, provided that they exist,

l = lim_{u→1} FX^{−1}(u) / FY^{−1}(u) ,   (31)
f(x) = lim_{t→∞} t · PY(t·x) / F̄Y(t) .   (32)
As a direct consequence, one can show that any rapidly varying factor, which encompasses for instance the Gaussian, exponential or gamma distributed factors, leads to a vanishing coefficient of tail dependence, whatever the distribution of the idiosyncratic noise may be. This result is obvious when both the factor and the idiosyncratic noise are Gaussian, since then X and Y follow a bivariate Gaussian distribution, whose tail dependence has been said to be zero. On the contrary, regularly varying factors, like Student's distributed factors, lead to tail dependence, provided that the distribution of the idiosyncratic noise does not become fatter-tailed than the factor distribution. One can thus conclude that, in order to generate tail dependence, the factor must have a sufficiently 'wild' distribution. To present an explicit example, let us now assume that the factor Y and the idiosyncratic noise ε have centered Student's distributions with the same number ν of degrees of freedom and scale factors respectively equal to 1 and σ. The choice of a scale factor equal to 1 for Y is not restrictive but only provides a convenient normalization for σ. Appendix E shows that the tail dependence coefficient is

λ = 1 / (1 + (σ/α)^ν) .   (33)

As is intuitively reasonable, the larger the typical scale σ of the fluctuations of ε and the weaker the coupling coefficient α, the smaller the tail dependence. Let us recall that the unconditional correlation coefficient ρ can be written as ρ = (1 + σ²/α²)^{−1/2}, which allows us to rewrite the coefficient of upper tail dependence as

λ = ρ^ν / [ ρ^ν + (1 − ρ²)^{ν/2} ] .   (34)
Surprisingly, λ does not go to zero for all ρ's as ν goes to infinity, as one would expect intuitively. Indeed, a natural reasoning would be that, as ν goes to infinity, the Student's distribution goes to the Gaussian distribution. Therefore, one could a priori expect to recover the result given in the previous section for the Gaussian factor model. We note that λ → 0 when ν → ∞ for all ρ's smaller than 1/√2. But, and here lies the surprise, λ → 1 for all ρ larger than 1/√2 when ν → ∞. This counter-intuitive result is due
to a non-uniform convergence which makes the order of the two limits non-commutative: taking first the limit u → 1 and then ν → ∞ is different from taking first the limit ν → ∞ and then u → 1. In a sense, by taking first the limit u → 1, we always ensure somehow the power law regime, even if ν is later taken to infinity. This is different from first "sitting" on the Gaussian limit ν → ∞. It is then a posteriori reasonable that the absence of uniform convergence becomes strongly apparent in its consequences when measuring a quantity probing the extreme tails of the distributions. As an illustration, figure 12 represents the coefficient of tail dependence for the Student's copula and the Student's factor model as a function of ρ for various values of ν. It is interesting to note that λ equals zero for all negative ρ in the case of the factor model, while λ remains non-zero for negative values of the correlation coefficient for bivariate Student's variables. If Y and ε have different numbers νY and νε of degrees of freedom, two cases occur. For νY < νε, ε is negligible asymptotically and λ = 1. For νY > νε, X becomes asymptotically identical to ε. Then, X and Y have the same tail dependence as ε and Y, which is 0 by construction.
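The limiting behaviour just described can be checked directly on expression (34). In the sketch below (our own illustration), ρ = 0.6 < 1/√2 drives λ to zero while ρ = 0.8 > 1/√2 drives it to one as ν grows; exactly at ρ = 1/√2, the two terms of the denominator are equal and λ = 1/2 for every ν.

```python
def factor_tail_dependence(rho, nu):
    """Tail dependence of the Student's factor model, eq. (34)."""
    return rho**nu / (rho**nu + (1.0 - rho**2) ** (nu / 2.0))

for nu in (3, 10, 100, 1000):
    print(nu, factor_tail_dependence(0.6, nu), factor_tail_dependence(0.8, nu))
# 0.6 < 1/sqrt(2): lambda -> 0;  0.8 > 1/sqrt(2): lambda -> 1

# exactly at rho = 1/sqrt(2), lambda = 1/2 whatever nu is
print(factor_tail_dependence(2 ** -0.5, 3), factor_tail_dependence(2 ** -0.5, 1000))
```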
3.4
Estimation of the coefficient of tail dependence
It would seem that the coefficient of tail dependence could provide a useful measure of the extreme dependence between two random variables, which could then be useful for the analysis of contagion between markets. Indeed, either the whole data set does not exhibit tail dependence, and a contagion mechanism seems necessary to explain the occurrence of concomitant large movements during turmoil periods; or it exhibits tail dependence, so that the usual dependence structure is by itself able to produce concomitant extremes. Unfortunately, the empirical estimation of the coefficient of tail dependence is a strenuous task. Indeed, a direct estimation of the conditional probability Pr{X > FX^{−1}(u) | Y > FY^{−1}(u)}, which should tend to λ when u → 1, is impossible to put into practice due to the combination of the curse of dimensionality and the drastic decrease of the number of realisations as u becomes close to one. A better approach consists in using kernel estimators, which generally provide smooth and accurate estimates (Kulpa 1999, Li et al. 1998, Scaillet 2000). However, these smooth estimators lead to differentiable estimated copulas, which automatically have vanishing tail dependence. Indeed, in order to obtain a non-vanishing coefficient of tail dependence, the corresponding copula must be non-differentiable at the point (1, 1) (or at (0, 0)). An alternative is then the fully parametric approach. One can choose to model the dependence via a specific copula, and thus to determine the associated tail dependence (Longin and Solnik 2001, Malevergne and Sornette 2001, Patton 2001). The problem with such a method is that the choice of the parameterization of the copula amounts to choosing a priori whether or not the data present tail dependence. In fact, there exist three ways to estimate the tail dependence coefficient.
The first two are specific to a class of copulas or of models, while the last one is very general, but obviously less accurate. The first method is only reliable when it is known that the underlying copula is Archimedean (see (Joe 1997) or (Nelsen 1998) for the definition). In such a case, a limit theorem established by (Juri and Wüthrich 2002) makes it possible to estimate the tail dependence. The problem is that it is not obvious that Archimedean copulas provide a good representation of the dependence structure of financial assets. For instance, Archimedean copulas are generally inconsistent with a representation of assets by factor models. In such a case, a second method, provided by (Malevergne and Sornette 2002), offers good results, allowing one to estimate the tail dependence in a semi-parametric way which relies solely on the estimation of marginal distributions, a significantly easier task. When none of these situations occurs, or when the factors are too difficult to extract, a third and fully non-parametric method exists, which is based upon the mathematical results of (Ledford and Tawn 1996, Ledford and Tawn 1998) and (Coles et al. 1999) and has recently been applied by (Poon et al. 2001). The method consists in transforming the original random variables X and Y into Fréchet random variables denoted by S and T respectively. Then, considering the variable Z = min{S, T}, its survival distribution is

Pr{Z > z} = d · z^{−1/η}   as z → ∞,   (35)

and λ̄ = 2η − 1, with λ = 0 if λ̄ < 1, while λ̄ = 1 implies λ = d. The parameters η and d can be estimated by maximum likelihood, and deriving their asymptotic statistics allows one to test whether the hypothesis λ̄ = 1 can be rejected or not, and consequently, whether the data present tail dependence or not. We have implemented this procedure, and the estimated values of the coefficient of tail dependence are given in table 1, both for the positive and the negative tails. Our tests show that we cannot reject the hypothesis of tail dependence between the four Latin American markets considered (Argentina, Brazil, Chile and Mexico). Notice that the positive tail dependence is almost always slightly smaller than the negative one, which could be linked with the trend asymmetry of (Longin and Solnik 2001), but it turns out that these differences are not statistically significant. These results indicate that, according to this analysis of the extreme dependence coefficient, the propensity for extreme co-movements is almost the same for each pair of stock markets: even if the transmission mechanisms of a crisis differ from one country to another, the propagation occurs with the same probability overall. Thus, the subsequent risks are the same. In table 2, we also give the coefficients of tail dependence estimated under the Student's copula (or in fact any copula derived from an elliptical distribution) with three degrees of freedom, given by expression (26). One can observe a remarkable agreement between these values and the non-parametric estimates given in table 1. This is consistent with the results given by the conditional Spearman's ρ, for which we have shown that the Student's copula seems to account reasonably for the extreme dependence.
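As an illustration of the non-parametric route, the following sketch is our own simplified version: a Hill estimator on the Fréchet-transformed minimum, instead of the full maximum-likelihood treatment of (35). It recovers η ≈ 1 (λ̄ ≈ 1) for a perfectly dependent pair and η ≈ 1/2 (λ̄ ≈ 0) for an independent pair.

```python
import numpy as np

def eta_hill(x, y, k=200):
    """Sketch of the Ledford-Tawn coefficient eta: map each series to unit
    Frechet margins through its empirical ranks, set Z = min(S, T), and
    apply a Hill estimator to the k largest values of Z.  Recall that
    lambda_bar = 2*eta - 1, so eta ~ 1 signals tail dependence and
    eta ~ 1/2 signals independence in the tail."""
    n = len(x)
    u = (np.argsort(np.argsort(x)) + 0.5) / n  # empirical ranks in (0, 1)
    v = (np.argsort(np.argsort(y)) + 0.5) / n
    s, t = -1.0 / np.log(u), -1.0 / np.log(v)  # unit Frechet margins
    z = np.sort(np.minimum(s, t))
    tail = z[-k:]  # the k largest values of Z
    return np.mean(np.log(tail / tail[0]))  # Hill estimate of eta

rng = np.random.default_rng(1)
n = 20_000
w = rng.standard_normal(n)
eta_dep = eta_hill(w, w)  # perfectly dependent pair: eta should be near 1
eta_ind = eta_hill(rng.standard_normal(n), rng.standard_normal(n))  # near 1/2
print(eta_dep, eta_ind)
```

The choice k = 200 of tail observations is an arbitrary bias/variance trade-off, as in any Hill-type estimation.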
4
Summary and Discussion
Table 3 summarizes the asymptotic dependences, for large v and u, of the signed conditional correlation coefficient ρ_v^+, the unsigned conditional correlation coefficient ρ_v^s and the correlation coefficient ρ_u conditioned on both variables, for the bivariate Gaussian, the Student's model, the Gaussian factor model and the Student's factor model. Our results provide a quantitative proof that conditioning on exceedance leads to conditional correlation coefficients that may be very different from the unconditional correlation. This provides a straightforward mechanism for fluctuations or changes of correlations, based on fluctuations of volatility or changes of trends. In other words, the many reported variations of correlation structure might be in large part attributed to changes in volatility (and statistical uncertainty). We also suggest that the distinct dependences of the conditional correlation coefficients, as functions of the exceedances v and u, may offer novel tools for characterizing the statistical multivariate distributions of extreme events. Since their direct characterization is in general restricted by the curse of dimensionality and the scarcity of data, the conditional correlation coefficients provide reduced robust statistics which can be estimated with reasonable accuracy and reliability. In this respect, our empirical results encourage us to assert that a Student's copula, or more generally an elliptical copula, with a tail index of about three seems able to account for the main extreme dependence properties investigated here. Table 4 gives the asymptotic values of ρ_v^+, ρ_v^s and ρ_u for v → +∞ and u → ∞, in order to compare them with the tail dependence λ.
These two tables only scratch the surface of the rich set of measures of tail and extreme dependences. We have shown that complete independence implies the absence of tail dependence: λ = 0. But λ = 0 does not
imply independence, at least in the intermediate range, since it is only an asymptotic property. Conversely, a non-zero tail dependence λ implies the absence of asymptotic independence. Nonetheless, it does not necessarily imply that the conditional correlation coefficients ρ_{v=∞}^+ and ρ_{v=∞}^s are non-zero, as one would have expected naively. Note that the examples of Table 4 are such that λ = 0 seems to go hand-in-hand with ρ_{v→∞}^+ = 0. However, the logical implication (λ = 0) ⇒ (ρ_{v→∞}^+ = 0) does not hold in general. A counter-example is offered by the Student's factor model in the case where νY > νε (the tail of the distribution of the idiosyncratic noise is fatter than that of the distribution of the factor). In this case, X and Y have the same tail dependence as ε and Y, which is 0 by construction. But ρ_{v=∞}^+ and ρ_{v=∞}^s are both 1, because a large Y almost always gives a large X, and the simultaneous occurrence of a large Y and a large ε can be neglected. The reason for this absence of tail dependence (in the sense of λ) coming together with asymptotically strong conditional correlation coefficients stems from two facts:

• First, the conditional correlation coefficients put much less weight on the extreme tails than the tail dependence parameter λ. In other words, ρ_{v=∞}^+ and ρ_{v=∞}^s are sensitive to the marginals, i.e., they are determined by the full bivariate distribution, while, as we said, λ is a pure copula property, independent of the marginals. Since ρ_{v=∞}^+ and ρ_{v=∞}^s are measures of tail dependence weighted by the specific shapes of the marginals, it is natural that they may behave differently.

• Second, the tail dependence λ probes the extreme dependence properties of the original copula of the random variables X and Y. On the contrary, when conditioning on Y, one changes the copula of X and Y, so that the extreme dependence properties investigated by the conditional correlations are not exactly those of the original copula.
This last remark clearly explains why we observe what (Boyer et al. 1997) call a "bias" in the conditional correlations. Indeed, changing the dependence between two random variables obviously leads to changing their correlations. The consequences are of importance. In such a situation, one measure (λ) would conclude on asymptotic tail independence, while the other measures, ρ_{v=∞}^+ and ρ_{v=∞}^s, would conclude the opposite. Thus, before concluding on a change in the dependence structure with respect to a given parameter - the volatility or the trend, for instance - one should check that this change does not result from the tool used to probe the dependence. In this respect, our study allows us to shed new light on recent controversial results about the occurrence or absence of contagion during the Latin American crises. As in all previous works, we find no evidence of contagion between Chile and Mexico, but contrary to (Forbes and Rigobon 2002), we think it is difficult to ignore the possibility of contagion towards Argentina and Brazil, and in this respect we agree with (Calvo and Reinhart 1996). In fact, we think that most of the discrepancies between these different studies stem from the fact that the conditional correlation coefficient does not provide an accurate tool for probing the potential changes of dependence. Indeed, even when the bias has been accounted for, the fat-tailedness of the distributions of returns is such that the Pearson's coefficient is subject to very strong statistical fluctuations, which forbid an accurate estimation of the correlation. Moreover, when studying the dependence properties, it is interesting to free oneself from the marginal behavior of each random variable. This is why the conditional Spearman's rho seems a good tool: it depends only on the copula and is statistically well-behaved.
The conditional Spearman's rho has allowed us to identify a change in the dependence structure during downward trends in Latin American markets, similar to that found by (Longin and Solnik 2001) in their study of the contagion across five major equity markets. It has also enabled us to highlight the asymmetry in the contagion effects: Mexico and Chile can be potential sources of contagion towards Argentina and Brazil, while the reverse does not seem to hold. This phenomenon was observed during the 1994 Mexican
crisis and appears to remain true in the recent Argentinian crisis, for which only Brazil seems to exhibit the signature of a possible contagion. We suggest that a possible origin for the discovered asymmetry may lie in the difference between more market-oriented countries and more state-interventionist economies, giving rise either to floating currency regimes adapted to an important manufacturing sector, which tend to deliver more competitive real exchange rates (Chile and Mexico), or to fixed-rate pegs (Argentina until the 2001 crisis and Brazil until the early 1999 crisis) (Frieden 1992, Frieden et al. 2000a, Frieden et al. 2000b). The asymmetry of the contagion is compatible with the view that fixed exchange rates tie an economy and its stock market more strictly to external shocks (the case of Argentina and Brazil), while a more flexible exchange rate seems to provide a cushion allowing a decoupling between the stock market and external influences. Finally, the absence of contagion does not necessarily imply the absence of "contamination." Indeed, the study of the coefficient of tail dependence has shown that, with or without a contagion mechanism - i.e., an increase in the linkage between markets during a crisis - the probability of extreme co-movements during the crisis - i.e., the contamination - is almost the same for all pairs of markets. Thus, whatever the propagation mechanism may be - historically close relationships or irrational fear and herd behavior - the observed effects are the same: the propagation of the crisis. From the practical perspective of risk management or regulatory policy, this last point may be more important than actual knowledge of whether or not contagion occurred.
Figure legends

Figure 1: The upper (respectively lower) panel graphs the complementary distribution of the positive (respectively minus the negative) returns in US dollars. The straight line represents the slope of a power law with tail exponent α = 2.

Figure 2: In the upper panel, the thick plain curve depicts the correlation coefficient between the Argentinian stock index daily returns and the Mexican stock index daily returns, conditional on the Mexican stock index daily returns being larger (smaller) than a given positive (negative) value v (after normalization by the standard deviation). The thick dashed curve represents the theoretical conditional correlation coefficient ρ_v^{+,−} calculated for a bivariate Gaussian model, while the two thin dashed curves delimit the area within which we cannot consider, at the 95% confidence level, that the estimated correlation coefficient is significantly different from its Gaussian theoretical value. The dotted curves provide the same information under the assumption of a bivariate Student's model with ν = 3 degrees of freedom. The lower panel gives the same kind of information for the correlation coefficient conditioned on the Argentinian stock index daily returns being larger (smaller) than a given positive (negative) value v.

Figure 3: In the upper panel, the thick plain curve gives the correlation coefficient between the Argentinian stock index daily returns and the Mexican stock index daily returns, conditioned on the Mexican stock index daily volatility being larger than a given value v (after normalization by the standard deviation). The thick dashed curve represents the theoretical conditional correlation coefficient ρ_v^{+,−} calculated for a bivariate Gaussian model, while the two thin dashed curves delimit the area within which we cannot consider, at the 95% confidence level, that the estimated correlation coefficient is significantly different from its Gaussian theoretical value.
The dotted curves provide the same information under the assumption of a bivariate Student's model with ν = 3 degrees of freedom. The lower panel gives the same kind of information for the correlation coefficient conditioned on the Argentina stock index daily volatility larger than a given value v.

Figure 4: In the upper panel, the thick plain curve gives the correlation coefficient between the Brazilian stock index daily returns and the Chilean stock index daily returns conditioned on the Chilean stock index daily returns larger than (smaller than) a given positive (negative) value v (after normalization by the standard deviation). The thick dashed curve represents the theoretical conditional correlation coefficient ρ_v^{+,-} calculated for a bivariate Gaussian model, while the two thin dashed curves represent the area within which we cannot consider, at the 95% confidence level, that the estimated correlation coefficient is significantly different from its Gaussian theoretical value. The dotted curves provide the same information under the assumption of a bivariate Student's model with ν = 3 degrees of freedom. The lower panel gives the same kind of information for the correlation coefficient conditioned on the Brazilian stock index daily returns larger than (smaller than) a given positive (negative) value v.

Figure 5: In the upper panel, the thick plain curve shows the correlation coefficient between the Brazilian stock index daily returns and the Chilean stock index daily returns conditional on the Chilean stock index daily volatility larger than a given value v (after normalization by the standard deviation). The thick dashed curve represents the theoretical conditional correlation coefficient ρ_v^{+,-} calculated for a bivariate Gaussian model, while the two thin dashed curves represent the area within which we cannot consider, at the 95% confidence level, that the estimated correlation coefficient is significantly different from its Gaussian theoretical value.
The dotted curves provide the same information under the assumption of a bivariate Student's model with ν = 3 degrees of freedom. The lower panel gives the same kind of information for the correlation coefficient conditioned on the Brazilian stock index daily volatility larger than a given value v.

Figure 6: In the upper panel, the thick plain curve shows the correlation coefficient between the Chilean stock index daily returns and the Mexican stock index daily returns conditioned on the Mexican stock index daily returns larger than (smaller than) a given positive (negative) value v (after normalization by the standard deviation). The thick dashed curve represents the theoretical conditional correlation coefficient ρ_v^{+,-} calculated for a bivariate Gaussian model, while the two thin dashed curves represent the area within which we cannot consider, at the 95% confidence level, that the estimated correlation coefficient is significantly different from its Gaussian theoretical value. The dotted curves provide the same information under the assumption of a bivariate Student's model with ν = 3 degrees of freedom. The lower panel gives the same kind of information for the correlation coefficient conditioned on the Chilean stock index daily returns larger than (smaller than) a given positive (negative) value v.

Figure 7: In the upper panel, the thick plain curve depicts the correlation coefficient between the Chilean stock index daily returns and the Mexican stock index daily returns conditioned on the Mexican stock index daily volatility larger than a given value v (after normalization by the standard deviation). The thick dashed curve represents the theoretical conditional correlation coefficient ρ_v^{+,-} calculated for a bivariate Gaussian model, while the two thin dashed curves represent the area within which we cannot consider, at the 95% confidence level, that the estimated correlation coefficient is significantly different from its Gaussian theoretical value. The dotted curves provide the same information under the assumption of a bivariate Student's model with ν = 3 degrees of freedom.
The lower panel gives the same kind of information for the correlation coefficient conditioned on the Chilean stock index daily volatility larger than a given value v.

Figure 8: Conditional Spearman's rho for a bivariate Gaussian copula (left panel) and a Student's copula with three degrees of freedom (right panel), with an unconditional linear correlation coefficient ρ = 0.1, 0.3, 0.5, 0.7, 0.9.

Figure 9: In the upper panel, the thick curve shows the Spearman's rho between the Argentina stock index daily returns and the Brazilian stock index daily returns. Above the quantile v = 0.5, the Spearman's rho is conditioned on the Brazilian index daily returns whose quantiles are larger than v, while below the quantile v = 0.5 it is conditioned on the Brazilian index daily returns whose quantiles are smaller than v. As in the above figures for the correlation coefficients, the dashed lines refer to the prediction of the Gaussian copula and its 95% confidence levels, and the dotted lines to the Student's copula with three degrees of freedom and its 95% confidence levels. The lower panel gives the same kind of information for the Spearman's rho conditioned on the realizations of the Argentina index daily returns.

Figure 10: In the upper panel, the thick curve shows the Spearman's rho between the Brazilian stock index daily returns and the Chilean stock index daily returns. Above the quantile v = 0.5, the Spearman's rho is conditioned on the Chilean index daily returns whose quantiles are larger than v, while below the quantile v = 0.5 it is conditioned on the Chilean index daily returns whose quantiles are smaller than v. The dashed lines refer to the prediction of the Gaussian copula and its 95% confidence levels, and the dotted lines to the Student's copula with three degrees of freedom and its 95% confidence levels. The lower panel gives the same kind of information for the Spearman's rho conditioned on the realizations of the Brazilian index daily returns.
Figure 11: In the upper panel, the thick curve depicts the Spearman's rho between the Chilean stock index daily returns and the Mexican stock index daily returns. Above the quantile v = 0.5, the Spearman's rho is conditioned on the Mexican index daily returns whose quantiles are larger than v, while below the quantile v = 0.5 it is conditioned on the Mexican index daily returns whose quantiles are smaller than v. The dashed lines refer to the prediction of the Gaussian copula and its 95% confidence levels, and the dotted lines to the Student's copula with three degrees of freedom and its 95% confidence levels. The lower panel gives the same kind of information for the Spearman's rho conditioned on the realizations of the Chilean index daily returns.

Figure 12: Coefficient of upper tail dependence as a function of the correlation coefficient ρ for various values of the number of degrees of freedom ν for the Student's copula (left panel) and the Student's factor model (right panel).

Figure 13: The graph of the function \left| \frac{1}{(1 + \epsilon u / x_0)^\nu} - 1 \right| (thick solid line), the chord which gives an upper bound of the function within \left[\frac{1-x_0}{\epsilon}, 0\right] (dashed line), and the tangent at 0^+ which gives an upper bound of the function within \left[0, \frac{x_0}{\epsilon}\right] (dash-dotted line).
A Conditional correlation coefficient for Gaussian variables

Let us consider a pair of Normal random variables (X, Y) ~ N(0, Σ), where Σ is their covariance matrix with unconditional correlation coefficient ρ. Without loss of generality, and for simplicity, we shall assume that Σ has unit unconditional variances.
A.1 Conditioning on one variable

A.1.1 Conditioning on Y larger than v

Given a conditioning set A = [v, +\infty), v \in \mathbb{R}_+, \rho_A = \rho_v^+ is the correlation coefficient conditioned on Y larger than v:

\rho_v^+ = \frac{\rho}{\sqrt{\rho^2 + \frac{1-\rho^2}{\mathrm{Var}(Y \mid Y>v)}}} .   (A.1)

We start with the calculation of the first and second moments of Y conditioned on Y larger than v:

E(Y \mid Y>v) = \frac{\sqrt{2}}{\sqrt{\pi}\, e^{v^2/2}\, \mathrm{erfc}\!\left(\frac{v}{\sqrt{2}}\right)} = v + \frac{1}{v} - \frac{2}{v^3} + O\!\left(\frac{1}{v^5}\right),   (A.2)

E(Y^2 \mid Y>v) = 1 + \frac{\sqrt{2}\, v}{\sqrt{\pi}\, e^{v^2/2}\, \mathrm{erfc}\!\left(\frac{v}{\sqrt{2}}\right)} = v^2 + 2 - \frac{2}{v^2} + O\!\left(\frac{1}{v^4}\right),   (A.3)

which allows us to obtain the variance of Y conditioned on Y larger than v:

\mathrm{Var}(Y \mid Y>v) = 1 + \frac{\sqrt{2}\, v}{\sqrt{\pi}\, e^{v^2/2}\, \mathrm{erfc}\!\left(\frac{v}{\sqrt{2}}\right)} - \left[\frac{\sqrt{2}}{\sqrt{\pi}\, e^{v^2/2}\, \mathrm{erfc}\!\left(\frac{v}{\sqrt{2}}\right)}\right]^2 = \frac{1}{v^2} + O\!\left(\frac{1}{v^4}\right),   (A.4)

which, for large v, yields

\rho_v^+ \sim_{v \to \infty} \frac{\rho}{\sqrt{1-\rho^2}} \cdot \frac{1}{v} .   (A.5)
A.1.2 Conditioning on |Y| larger than v

Given a conditioning set A = (-\infty, -v] \cup [v, +\infty), v \in \mathbb{R}_+, \rho_A = \rho_v^s is the correlation coefficient conditioned on |Y| larger than v:

\rho_v^s = \frac{\rho}{\sqrt{\rho^2 + \frac{1-\rho^2}{\mathrm{Var}(Y \mid |Y|>v)}}} .   (A.6)

The first and second moments of Y conditioned on |Y| larger than v can be easily calculated:

E(Y \mid |Y|>v) = 0,   (A.7)

E(Y^2 \mid |Y|>v) = 1 + \frac{\sqrt{2}\, v}{\sqrt{\pi}\, e^{v^2/2}\, \mathrm{erfc}\!\left(\frac{v}{\sqrt{2}}\right)} = v^2 + 2 - \frac{2}{v^2} + O\!\left(\frac{1}{v^4}\right).   (A.8)
Expression (A.8) is the same as (A.3), as it should be. This gives the following conditional variance:

\mathrm{Var}(Y \mid |Y|>v) = 1 + \frac{\sqrt{2}\, v}{\sqrt{\pi}\, e^{v^2/2}\, \mathrm{erfc}\!\left(\frac{v}{\sqrt{2}}\right)} = v^2 + 2 + O\!\left(\frac{1}{v^2}\right),   (A.9)

and finally yields, for large v,

\rho_v^s \sim_{v \to \infty} \frac{\rho}{\sqrt{\rho^2 + \frac{1-\rho^2}{2+v^2}}} \sim_{v \to \infty} 1 - \frac{1}{2}\, \frac{1-\rho^2}{\rho^2}\, \frac{1}{v^2} .   (A.10)
A.1.3 Intuitive meaning

Let us provide an intuitive explanation (see also (Longin and Solnik 2001)). As seen from (A.1), \rho_v^+ is controlled by the dependence \mathrm{Var}(Y \mid Y>v) \propto 1/v^2 derived in appendix A.1.1. In contrast, as seen from (A.6), \rho_v^s is controlled by \mathrm{Var}(Y \mid |Y|>v) \propto v^2 given in appendix A.1.2. The difference between \rho_v^+ and \rho_v^s can thus be traced back to that between \mathrm{Var}(Y \mid Y>v) \propto 1/v^2 and \mathrm{Var}(Y \mid |Y|>v) \propto v^2 for large v.

This results from the following effect. For Y > v, one can picture the possible realizations of Y as those of a random particle on the line, which is strongly attracted to the origin by a spring (the Gaussian distribution, which prevents Y from performing significant fluctuations beyond a few standard deviations) while being forced to remain to the right of a wall at Y = v. It is clear that the fluctuations of the position of this particle are very small, as it is strongly pressed against the impenetrable wall by the restoring spring, hence the result \mathrm{Var}(Y \mid Y>v) \propto 1/v^2. In contrast, for the condition |Y| > v, by the same argument, the fluctuations of the particle are confined very close to |Y| = v, i.e., very close to Y = +v or Y = -v. Thus, the fluctuations of Y typically flip from -v to +v and vice versa, and it is not surprising to find \mathrm{Var}(Y \mid |Y|>v) \propto v^2.

This argument makes intuitive the results \mathrm{Var}(Y \mid Y>v) \propto 1/v^2 and \mathrm{Var}(Y \mid |Y|>v) \propto v^2 for large v, and thus the results for \rho_v^+ and \rho_v^s if we use (A.1) and (A.6). We now attempt to justify \rho_v^+ \sim_{v\to\infty} 1/v and 1 - \rho_v^s \sim_{v\to\infty} 1/v^2 directly by the following intuitive argument. Using the picture of particles, X and Y can be visualized as the positions of two particles which fluctuate randomly. Their joint bivariate Gaussian distribution with non-zero unconditional correlation amounts to the existence of a spring that ties them together. Their Gaussian marginals also exert a spring-like force attaching them to the origin. When Y > v, the X-particle is torn between two extremes, 0 and v. When the unconditional correlation \rho is less than 1, the spring attracting to the origin is stronger than the spring attracting to the wall at v. The particle X thus undergoes tiny fluctuations around the origin that are relatively less and less attracted by the Y-particle, hence the result \rho_v^+ \sim_{v\to\infty} 1/v \to 0. In contrast, for |Y| > v, notwithstanding the still strong attraction of the X-particle to the origin, it can follow the sign of the Y-particle without paying too much cost in matching its amplitude |v|. Relatively tiny fluctuations of the X-particle, of the same sign as Y \approx \pm v, will result in a strong \rho_v^s, thus justifying that \rho_v^s \to 1 for v \to +\infty.
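The two opposite behaviours just described are easy to reproduce numerically. The following minimal Monte Carlo sketch (the values ρ = 0.8 and v = 1.5 are illustrative choices, not taken from the text) checks that conditioning on Y > v collapses the Gaussian correlation while conditioning on |Y| > v strengthens it:

```python
import numpy as np

# Monte Carlo sketch of the contrast between rho_v^+ (conditioning on Y > v,
# which decays like 1/v) and rho_v^s (conditioning on |Y| > v, which tends
# to 1) for a bivariate Gaussian. Parameters are illustrative.
rng = np.random.default_rng(0)
rho, n, v = 0.8, 2_000_000, 1.5
y = rng.standard_normal(n)
x = rho * y + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)

def cond_corr(mask):
    """Empirical correlation of (X, Y) restricted to the conditioning set."""
    return np.corrcoef(x[mask], y[mask])[0, 1]

rho_plus = cond_corr(y > v)            # signed conditioning: well below rho
rho_signed = cond_corr(np.abs(y) > v)  # unsigned conditioning: well above rho
```

With these parameters the theoretical values (A.1) and (A.6) give roughly 0.46 and 0.94 respectively, on either side of the unconditional ρ = 0.8.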
A.2 Conditioning on both X and Y larger than u

By definition, the conditional correlation coefficient \rho_u, conditioned on both X and Y larger than u, is

\rho_u = \frac{\mathrm{Cov}[X, Y \mid X>u, Y>u]}{\sqrt{\mathrm{Var}[X \mid X>u, Y>u]}\, \sqrt{\mathrm{Var}[Y \mid X>u, Y>u]}}   (A.11)

= \frac{m_{11} - m_{10}\, m_{01}}{\sqrt{m_{20} - m_{10}^2}\, \sqrt{m_{02} - m_{01}^2}} ,   (A.12)

where m_{ij} denotes E[X^i \cdot Y^j \mid X>u, Y>u].

Using proposition A.1 of (Ang and Chen 2001) or the expressions in (Johnson and Kotz 1972, p. 113), we can assert that

m_{10}\, L(u,u;\rho) = (1+\rho)\, \phi(u) \left[1 - \Phi\!\left(\sqrt{\tfrac{1-\rho}{1+\rho}}\, u\right)\right],   (A.13)

m_{20}\, L(u,u;\rho) = (1+\rho^2)\, u\, \phi(u) \left[1 - \Phi\!\left(\sqrt{\tfrac{1-\rho}{1+\rho}}\, u\right)\right] + \frac{\rho \sqrt{1-\rho^2}}{\sqrt{2\pi}}\, \phi\!\left(\sqrt{\tfrac{2}{1+\rho}}\, u\right) + L(u,u;\rho),   (A.14)

m_{11}\, L(u,u;\rho) = 2\rho\, u\, \phi(u) \left[1 - \Phi\!\left(\sqrt{\tfrac{1-\rho}{1+\rho}}\, u\right)\right] + \frac{\sqrt{1-\rho^2}}{\sqrt{2\pi}}\, \phi\!\left(\sqrt{\tfrac{2}{1+\rho}}\, u\right) + \rho\, L(u,u;\rho),   (A.15)

where L(\cdot,\cdot;\cdot) denotes the bivariate Gaussian survival (or complementary cumulative) distribution:

L(h,k;\rho) = \frac{1}{2\pi\sqrt{1-\rho^2}} \int_h^\infty dx \int_k^\infty dy\, \exp\!\left(-\frac{1}{2}\, \frac{x^2 - 2\rho x y + y^2}{1-\rho^2}\right),   (A.16)

\phi(\cdot) is the Gaussian density:

\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2},   (A.17)

and \Phi(\cdot) is the cumulative Gaussian distribution:

\Phi(x) = \int_{-\infty}^x du\, \phi(u).   (A.18)
A.2.1 Asymptotic behavior of L(u,u;\rho)

We focus on the asymptotic behavior of

L(u,u;\rho) = \frac{1}{2\pi\sqrt{1-\rho^2}} \int_u^\infty dx \int_u^\infty dy\, \exp\!\left(-\frac{1}{2}\, \frac{x^2 - 2\rho x y + y^2}{1-\rho^2}\right)   (A.19)

for large u. Performing the change of variables x' = x - u and y' = y - u, we can write

L(u,u;\rho) = \frac{e^{-\frac{u^2}{1+\rho}}}{2\pi\sqrt{1-\rho^2}} \int_0^\infty dx' \int_0^\infty dy'\, \exp\!\left(-u\, \frac{x'+y'}{1+\rho}\right) \exp\!\left(-\frac{1}{2}\, \frac{x'^2 - 2\rho x' y' + y'^2}{1-\rho^2}\right).   (A.20)

Using the fact that

\exp\!\left(-\frac{1}{2}\, \frac{x'^2 - 2\rho x' y' + y'^2}{1-\rho^2}\right) = 1 - \frac{x'^2 - 2\rho x' y' + y'^2}{2(1-\rho^2)} + \frac{(x'^2 - 2\rho x' y' + y'^2)^2}{8(1-\rho^2)^2} - \frac{(x'^2 - 2\rho x' y' + y'^2)^3}{48(1-\rho^2)^3} + \cdots   (A.21)

and applying theorem 3.1.1 in (Jensen 1995, p. 58) (Laplace's method), equations (A.20) and (A.21) yield

L(u,u;\rho) = \frac{(1+\rho)^2}{2\pi\sqrt{1-\rho^2}} \cdot \frac{e^{-\frac{u^2}{1+\rho}}}{u^2} \left[1 - \frac{(2-\rho)(1+\rho)}{1-\rho}\cdot\frac{1}{u^2} + \frac{(2\rho^2-6\rho+7)(1+\rho)^2}{(1-\rho)^2}\cdot\frac{1}{u^4} - 3\, \frac{(12-13\rho+8\rho^2-2\rho^3)(1+\rho)^3}{(1-\rho)^3}\cdot\frac{1}{u^6} + O\!\left(\frac{1}{u^8}\right)\right],   (A.22)

and

\frac{1}{L(u,u;\rho)} = \frac{2\pi\sqrt{1-\rho^2}}{(1+\rho)^2}\, u^2\, e^{\frac{u^2}{1+\rho}} \left[1 + \frac{(2-\rho)(1+\rho)}{1-\rho}\cdot\frac{1}{u^2} - \frac{(3-2\rho+\rho^2)(1+\rho)^2}{(1-\rho)^2}\cdot\frac{1}{u^4} + \frac{(16-13\rho+10\rho^2-3\rho^3)(1+\rho)^3}{(1-\rho)^3}\cdot\frac{1}{u^6} + O\!\left(\frac{1}{u^8}\right)\right].   (A.23)
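The expansion above can be checked against a brute-force evaluation of the double integral. The following sketch (the parameters ρ = 0.2 and u = 5 are illustrative choices) integrates the bivariate Gaussian density numerically and compares it with (A.22) truncated after the 1/u^4 term:

```python
import numpy as np

# Numerical sketch comparing a direct evaluation of the bivariate Gaussian
# survival function L(u,u;rho) with the asymptotic expansion (A.22).
rho, u = 0.2, 5.0

# Midpoint-rule integration over [u, u+8]^2; the neglected mass beyond u+8
# is exponentially small compared to the integral itself.
h = 0.005
t = u + (np.arange(1600) + 0.5) * h
X, Y = np.meshgrid(t, t)
f = np.exp(-(X**2 - 2.0*rho*X*Y + Y**2) / (2.0*(1.0 - rho**2)))
L_num = f.sum() * h * h / (2.0*np.pi*np.sqrt(1.0 - rho**2))

# Expansion (A.22), truncated after the 1/u^4 term.
a = (1.0 + rho) / (1.0 - rho)
L_asy = ((1.0 + rho)**2 / (2.0*np.pi*np.sqrt(1.0 - rho**2))
         * np.exp(-u**2/(1.0 + rho)) / u**2
         * (1.0 - (2.0 - rho)*a/u**2
            + (2.0*rho**2 - 6.0*rho + 7.0)*a**2/u**4))
```

For these values the relative difference between `L_num` and `L_asy` is below one percent, of the order of the first neglected 1/u^6 term.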
A.2.2 Asymptotic behavior of the first moment m_{10}

The first moment m_{10} = E[X \mid X>u, Y>u] is given by (A.13). For large u,

1 - \Phi\!\left(\sqrt{\tfrac{1-\rho}{1+\rho}}\, u\right) = \frac{1}{2}\, \mathrm{erfc}\!\left(\sqrt{\tfrac{1-\rho}{2(1+\rho)}}\, u\right)   (A.24)

= \frac{1}{\sqrt{2\pi}} \sqrt{\frac{1+\rho}{1-\rho}}\, \frac{e^{-\frac{1-\rho}{2(1+\rho)} u^2}}{u} \left[1 - \frac{1+\rho}{1-\rho}\cdot\frac{1}{u^2} + 3\left(\frac{1+\rho}{1-\rho}\right)^2 \frac{1}{u^4} - 15\left(\frac{1+\rho}{1-\rho}\right)^3 \frac{1}{u^6} + O\!\left(\frac{1}{u^8}\right)\right],   (A.25)

so that, multiplying by (1+\rho)\, \phi(u), we obtain

m_{10}\, L(u,u;\rho) = \frac{(1+\rho)^2}{2\pi\sqrt{1-\rho^2}}\, \frac{e^{-\frac{u^2}{1+\rho}}}{u} \left[1 - \frac{1+\rho}{1-\rho}\cdot\frac{1}{u^2} + 3\left(\frac{1+\rho}{1-\rho}\right)^2 \frac{1}{u^4} - 15\left(\frac{1+\rho}{1-\rho}\right)^3 \frac{1}{u^6} + O\!\left(\frac{1}{u^8}\right)\right].   (A.26)

Using the result given by equation (A.22), we can conclude that

m_{10} = u + (1+\rho)\cdot\frac{1}{u} - \frac{(1+\rho)^2 (2-\rho)}{1-\rho}\cdot\frac{1}{u^3} + \frac{(10-8\rho+3\rho^2)(1+\rho)^3}{(1-\rho)^2}\cdot\frac{1}{u^5} + O\!\left(\frac{1}{u^7}\right).   (A.27)

In the sequel, we will also need the behavior of m_{10}^2:

m_{10}^2 = u^2 + 2(1+\rho) - \frac{(1+\rho)^2 (3-\rho)}{1-\rho}\cdot\frac{1}{u^2} + 2\, \frac{(8-5\rho+2\rho^2)(1+\rho)^3}{(1-\rho)^2}\cdot\frac{1}{u^4} + O\!\left(\frac{1}{u^6}\right).   (A.28)
A.2.3 Asymptotic behavior of the second moment m_{20}

The second moment m_{20} = E[X^2 \mid X>u, Y>u] is given by expression (A.14). The first term in the right-hand side of (A.14) yields

(1+\rho^2)\, u\, \phi(u)\left[1 - \Phi\!\left(\sqrt{\tfrac{1-\rho}{1+\rho}}\, u\right)\right] = (1+\rho^2) \sqrt{\frac{1+\rho}{1-\rho}}\, \frac{e^{-\frac{u^2}{1+\rho}}}{2\pi} \left[1 - \frac{1+\rho}{1-\rho}\cdot\frac{1}{u^2} + 3\left(\frac{1+\rho}{1-\rho}\right)^2 \frac{1}{u^4} - 15\left(\frac{1+\rho}{1-\rho}\right)^3 \frac{1}{u^6} + O\!\left(\frac{1}{u^8}\right)\right],   (A.29)

while the second term gives

\frac{\rho\sqrt{1-\rho^2}}{\sqrt{2\pi}}\, \phi\!\left(\sqrt{\tfrac{2}{1+\rho}}\, u\right) = \rho\sqrt{1-\rho^2}\, \frac{e^{-\frac{u^2}{1+\rho}}}{2\pi}.   (A.30)

Putting these two expressions together and factorizing allows us to obtain

m_{20}\, L(u,u;\rho) = \frac{(1+\rho)^2}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{u^2}{1+\rho}} \left[1 - \frac{1+\rho^2}{1-\rho}\cdot\frac{1}{u^2} + 3\, \frac{(1+\rho^2)(1+\rho)}{(1-\rho)^2}\cdot\frac{1}{u^4} - 15\, \frac{(1+\rho^2)(1+\rho)^2}{(1-\rho)^3}\cdot\frac{1}{u^6} + O\!\left(\frac{1}{u^8}\right)\right] + L(u,u;\rho),   (A.31)

which finally yields

m_{20} = u^2 + 2(1+\rho) - 2\, \frac{(1+\rho)^2}{1-\rho}\cdot\frac{1}{u^2} + 2\, \frac{(5+4\rho+\rho^3)(1+\rho)^2}{(1-\rho)^2}\cdot\frac{1}{u^4} + O\!\left(\frac{1}{u^6}\right).   (A.32)
A.2.4 Asymptotic behavior of the cross moment m_{11}

The cross moment m_{11} = E[X \cdot Y \mid X>u, Y>u] is given by expression (A.15). The first and second terms in the right-hand side of (A.15) respectively give

2\rho\, u\, \phi(u)\left[1 - \Phi\!\left(\sqrt{\tfrac{1-\rho}{1+\rho}}\, u\right)\right] = 2\rho \sqrt{\frac{1+\rho}{1-\rho}}\, \frac{e^{-\frac{u^2}{1+\rho}}}{2\pi} \left[1 - \frac{1+\rho}{1-\rho}\cdot\frac{1}{u^2} + 3\left(\frac{1+\rho}{1-\rho}\right)^2 \frac{1}{u^4} - 15\left(\frac{1+\rho}{1-\rho}\right)^3 \frac{1}{u^6} + O\!\left(\frac{1}{u^8}\right)\right]   (A.33)

and

\frac{\sqrt{1-\rho^2}}{\sqrt{2\pi}}\, \phi\!\left(\sqrt{\tfrac{2}{1+\rho}}\, u\right) = \sqrt{1-\rho^2}\, \frac{e^{-\frac{u^2}{1+\rho}}}{2\pi},   (A.34)

which, after factorization, yields

m_{11}\, L(u,u;\rho) = \frac{(1+\rho)^2}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{u^2}{1+\rho}} \left[1 - 2\, \frac{\rho}{1-\rho}\cdot\frac{1}{u^2} + 6\, \frac{\rho(1+\rho)}{(1-\rho)^2}\cdot\frac{1}{u^4} - 30\, \frac{\rho(1+\rho)^2}{(1-\rho)^3}\cdot\frac{1}{u^6} + O\!\left(\frac{1}{u^8}\right)\right] + \rho\, L(u,u;\rho),   (A.35)

and finally

m_{11} = u^2 + 2(1+\rho) - \frac{(1+\rho)^2 (3-\rho)}{1-\rho}\cdot\frac{1}{u^2} + \frac{(16-9\rho+3\rho^2)(1+\rho)^3}{(1-\rho)^2}\cdot\frac{1}{u^4} + O\!\left(\frac{1}{u^6}\right).   (A.36)
A.2.5 Asymptotic behavior of the correlation coefficient

The conditional correlation coefficient conditioned on both X and Y larger than u is defined by (A.12). Using the symmetry between X and Y, we have m_{10} = m_{01} and m_{20} = m_{02}, which allows us to rewrite (A.12) as follows:

\rho_u = \frac{m_{11} - m_{10}^2}{m_{20} - m_{10}^2}.   (A.37)

Putting together the previous results, we have

m_{20} - m_{10}^2 = \frac{(1+\rho)^2}{u^2} - 2\, \frac{(3-\rho)(1+\rho)^3}{1-\rho}\cdot\frac{1}{u^4} + O\!\left(\frac{1}{u^6}\right),   (A.38)

m_{11} - m_{10}^2 = \rho\, \frac{(1+\rho)^3}{1-\rho}\cdot\frac{1}{u^4} + O\!\left(\frac{1}{u^6}\right),   (A.39)

which proves that

\rho_u = \rho\, \frac{1+\rho}{1-\rho}\cdot\frac{1}{u^2} + O\!\left(\frac{1}{u^4}\right), \qquad \rho \in [-1, 1).   (A.40)
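The much faster suppression of the correlation under the double conditioning can be seen even at moderate thresholds. The following Monte Carlo sketch (ρ = 0.5 and u = 1.5 are illustrative choices; u is far from asymptotic, so only the qualitative ordering is checked) compares the doubly conditioned coefficient with the singly conditioned one:

```python
import numpy as np

# Monte Carlo sketch related to (A.40): conditioning on both X > u and Y > u
# strongly suppresses the Gaussian correlation coefficient, well below its
# unconditional value rho. Parameters are illustrative.
rng = np.random.default_rng(1)
rho, n, u = 0.5, 4_000_000, 1.5
y = rng.standard_normal(n)
x = rho * y + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)

both = (x > u) & (y > u)
rho_u = np.corrcoef(x[both], y[both])[0, 1]    # double conditioning
one = y > u
rho_plus = np.corrcoef(x[one], y[one])[0, 1]   # single conditioning
# Both conditional coefficients are far below rho = 0.5 already at u = 1.5.
```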
B Conditional correlation coefficient for Student's variables

B.1 Proposition

Let us consider a pair of Student's random variables (X, Y) with \nu degrees of freedom and unconditional correlation coefficient \rho. Let A be a subset of \mathbb{R} such that \Pr\{Y \in A\} > 0. The correlation coefficient of (X, Y) conditioned on Y \in A, defined by

\rho_A = \frac{\mathrm{Cov}(X, Y \mid Y \in A)}{\sqrt{\mathrm{Var}(X \mid Y \in A)}\, \sqrt{\mathrm{Var}(Y \mid Y \in A)}},   (B.41)

can be expressed as

\rho_A = \frac{\rho}{\sqrt{\rho^2 + \frac{E[E(X^2 \mid Y) - \rho^2 Y^2 \mid Y \in A]}{\mathrm{Var}(Y \mid Y \in A)}}},   (B.42)

with

\mathrm{Var}(Y \mid Y \in A) = \nu \left[\frac{\nu-1}{\nu-2} \cdot \frac{\Pr\left\{\sqrt{\tfrac{\nu}{\nu-2}}\, Y \in A \mid \nu-2\right\}}{\Pr\{Y \in A \mid \nu\}} - 1\right] - \left[\frac{\int_{y \in A} dy\, y\, t_\nu(y)}{\Pr\{Y \in A \mid \nu\}}\right]^2   (B.43)

and

E[E(X^2 \mid Y) - \rho^2 Y^2 \mid Y \in A] = (1-\rho^2)\, \frac{\nu}{\nu-2} \cdot \frac{\Pr\left\{\sqrt{\tfrac{\nu}{\nu-2}}\, Y \in A \mid \nu-2\right\}}{\Pr\{Y \in A \mid \nu\}},   (B.44)

where \Pr\{\cdot \mid \nu\} denotes a probability evaluated under the Student's distribution with \nu degrees of freedom and t_\nu its density.

B.2 Proof of the proposition
Let the variables X and Y have a multivariate Student's distribution with \nu degrees of freedom and correlation coefficient \rho:

P_{XY}(x, y) = \frac{\Gamma\!\left(\frac{\nu+2}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)\, \nu\pi\, \sqrt{1-\rho^2}} \left(1 + \frac{x^2 - 2\rho x y + y^2}{\nu (1-\rho^2)}\right)^{-\frac{\nu+2}{2}}   (B.45)

= t_\nu(y) \cdot \left(\frac{\nu+1}{\nu+y^2}\right)^{1/2} \frac{1}{\sqrt{1-\rho^2}}\; t_{\nu+1}\!\left(\left(\frac{\nu+1}{\nu+y^2}\right)^{1/2} \frac{x - \rho y}{\sqrt{1-\rho^2}}\right),   (B.46)

where t_\nu(\cdot) denotes the univariate Student's density with \nu degrees of freedom:

t_\nu(x) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right) (\nu\pi)^{1/2}} \cdot \frac{1}{\left(1 + \frac{x^2}{\nu}\right)^{\frac{\nu+1}{2}}} = \frac{C_\nu}{\left(1 + \frac{x^2}{\nu}\right)^{\frac{\nu+1}{2}}}.   (B.47)

Let us evaluate \mathrm{Cov}(X, Y \mid Y \in A):

\mathrm{Cov}(X, Y \mid Y \in A) = E(X \cdot Y \mid Y \in A) - E(X \mid Y \in A) \cdot E(Y \mid Y \in A)   (B.48)

= E(E(X \mid Y) \cdot Y \mid Y \in A) - E(E(X \mid Y) \mid Y \in A) \cdot E(Y \mid Y \in A).   (B.49)

As can be seen from equation (B.46), E(X \mid Y) = \rho Y, which gives

\mathrm{Cov}(X, Y \mid Y \in A) = \rho \cdot E(Y^2 \mid Y \in A) - \rho \cdot E(Y \mid Y \in A)^2   (B.50)

= \rho \cdot \mathrm{Var}(Y \mid Y \in A).   (B.51)

Thus, we have

\rho_A = \rho\, \sqrt{\frac{\mathrm{Var}(Y \mid Y \in A)}{\mathrm{Var}(X \mid Y \in A)}}.   (B.52)

Using the same method as for the calculation of \mathrm{Cov}(X, Y \mid Y \in A), we find

\mathrm{Var}(X \mid Y \in A) = E[E(X^2 \mid Y) \mid Y \in A] - E[E(X \mid Y) \mid Y \in A]^2   (B.53)

= E[E(X^2 \mid Y) \mid Y \in A] - \rho^2 \cdot E[Y \mid Y \in A]^2   (B.54)

= E[E(X^2 \mid Y) - \rho^2 Y^2 \mid Y \in A] + \rho^2 \cdot \mathrm{Var}(Y \mid Y \in A),   (B.55)

which yields

\rho_A = \frac{\rho}{\sqrt{\rho^2 + \frac{E[E(X^2 \mid Y) - \rho^2 Y^2 \mid Y \in A]}{\mathrm{Var}(Y \mid Y \in A)}}},   (B.57)
as asserted in (B.42). To go one step further, we have to evaluate the three terms E(Y \mid Y \in A), E(Y^2 \mid Y \in A) and E[E(X^2 \mid Y) \mid Y \in A]. The first one is trivial to calculate:

E(Y \mid Y \in A) = \frac{\int_{y \in A} dy\, y\, t_\nu(y)}{\Pr\{Y \in A \mid \nu\}}.   (B.58)

The second one gives

E(Y^2 \mid Y \in A) = \frac{\int_{y \in A} dy\, y^2\, t_\nu(y)}{\Pr\{Y \in A \mid \nu\}}   (B.59)

= \nu \left[\frac{\nu-1}{\nu-2} \cdot \frac{\Pr\left\{\sqrt{\tfrac{\nu}{\nu-2}}\, Y \in A \mid \nu-2\right\}}{\Pr\{Y \in A \mid \nu\}} - 1\right],   (B.60)

so that

\mathrm{Var}(Y \mid Y \in A) = \nu \left[\frac{\nu-1}{\nu-2} \cdot \frac{\Pr\left\{\sqrt{\tfrac{\nu}{\nu-2}}\, Y \in A \mid \nu-2\right\}}{\Pr\{Y \in A \mid \nu\}} - 1\right] - \left[\frac{\int_{y \in A} dy\, y\, t_\nu(y)}{\Pr\{Y \in A \mid \nu\}}\right]^2.   (B.61)
To calculate the third term, we first need to evaluate E(X^2 \mid Y). Using equation (B.46) and the results given in (Abramovitz and Stegun 1972), we find

E(X^2 \mid Y = y) = \int dx\; x^2 \left(\frac{\nu+1}{\nu+y^2}\right)^{1/2} \frac{1}{\sqrt{1-\rho^2}}\; t_{\nu+1}\!\left(\left(\frac{\nu+1}{\nu+y^2}\right)^{1/2} \frac{x - \rho y}{\sqrt{1-\rho^2}}\right)   (B.62)

= \frac{\nu + y^2}{\nu - 1}\, (1-\rho^2) + \rho^2 y^2,   (B.63)

which yields

E[E(X^2 \mid Y) - \rho^2 Y^2 \mid Y \in A] = \frac{\nu}{\nu-1}\, (1-\rho^2) + \frac{1-\rho^2}{\nu-1}\, E[Y^2 \mid Y \in A].   (B.64)
Applying the result given in equation (B.60), we finally obtain

E[E(X^2 \mid Y) - \rho^2 Y^2 \mid Y \in A] = (1-\rho^2)\, \frac{\nu}{\nu-2} \cdot \frac{\Pr\left\{\sqrt{\tfrac{\nu}{\nu-2}}\, Y \in A \mid \nu-2\right\}}{\Pr\{Y \in A \mid \nu\}},   (B.65)

which concludes the proof.
B.3 Conditioning on Y larger than v

The conditioning set is A = [v, +\infty), thus

\Pr\{Y \in A \mid \nu\} = \bar{T}_\nu(v) = \nu^{\frac{\nu-1}{2}}\, \frac{C_\nu}{v^\nu} + O\!\left(v^{-(\nu+2)}\right),   (B.66)

\Pr\left\{\sqrt{\tfrac{\nu}{\nu-p}}\, Y \in A \;\Big|\; \nu-p\right\} = \bar{T}_{\nu-p}\!\left(\sqrt{\tfrac{\nu-p}{\nu}}\, v\right) = \frac{\nu^{\frac{\nu-p}{2}}}{(\nu-p)^{1/2}}\, \frac{C_{\nu-p}}{v^{\nu-p}} + O\!\left(v^{-(\nu-p+2)}\right),   (B.67)

\int_{y \in A} dy\, y\, t_\nu(y) = \sqrt{\frac{\nu}{\nu-2}}\; t_{\nu-2}\!\left(\sqrt{\tfrac{\nu-2}{\nu}}\, v\right) = \frac{\nu^{\frac{\nu}{2}}}{\sqrt{\nu-2}}\, \frac{C_{\nu-2}}{v^{\nu-1}} + O\!\left(v^{-(\nu+1)}\right),   (B.68)

where t_\nu(\cdot) and \bar{T}_\nu(\cdot) denote respectively the density and the survival distribution of the Student's distribution with \nu degrees of freedom, and C_\nu is defined in (B.47). Using equation (B.42), one can thus give the exact expression of \rho_v^+. Since it is very cumbersome, we will not write it explicitly; we will only give its asymptotic expression. In this respect, we can show that

\mathrm{Var}(Y \mid Y \in A) = \frac{\nu}{(\nu-2)(\nu-1)^2}\, v^2 + O(1),   (B.69)

E[E(X^2 \mid Y) - \rho^2 Y^2 \mid Y \in A] = \frac{\nu}{(\nu-2)(\nu-1)}\, (1-\rho^2)\, v^2 + O(1).   (B.70)

Thus, for large v,

\rho_v^+ \longrightarrow \frac{\rho}{\sqrt{\rho^2 + (\nu-1)(1-\rho^2)}}.   (B.71)

B.4 Conditioning on |Y| larger than v
The conditioning set is now A = (-\infty, -v] \cup [v, +\infty), with v \in \mathbb{R}_+. Thus, the right-hand sides of equations (B.66) and (B.67) have to be multiplied by two, while

\int_{y \in A} dy\, y\, t_\nu(y) = 0   (B.72)

for symmetry reasons. So equation (B.70) still holds, while

\mathrm{Var}(Y \mid Y \in A) = \frac{\nu}{\nu-2}\, v^2 + O(1).   (B.73)

Thus, for large v,

\rho_v^s \longrightarrow \frac{\rho}{\sqrt{\rho^2 + \frac{1-\rho^2}{\nu-1}}}.   (B.74)
B.5 Conditioning on Y > v versus on |Y| > v
The results (B.71) and (B.74) are valid for \nu > 2, as one can expect, since the second moment has to exist for the correlation coefficient to be defined. We remark that here, contrary to the Gaussian case, the choice of conditioning set is not really important: with both conditioning sets, \rho_v^+ and \rho_v^s go to a constant different from zero and one when v goes to infinity. This striking difference can be explained by the large fluctuations allowed by the Student's distribution, and can be related to the fact that the coefficient of tail dependence for this distribution does not vanish even when the variables are anti-correlated (see section 3.2 below). Contrary to the Gaussian distribution, which binds the fluctuations of the variables near the origin, the Student's distribution allows for 'wild' fluctuations. These properties are thus responsible for the result that, contrary to the Gaussian case, for which the conditional correlation coefficient goes to zero when conditioned on large signed values and goes to one when conditioned on large unsigned values, the conditional correlation coefficient for Student's variables has a similar behavior in both cases. Intuitively, the large fluctuations of X for large v dominate and control the asymptotic dependence.
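This contrast is easy to observe numerically. The following Monte Carlo sketch (ρ = 0.8, ν = 3 and v = 2.5 are illustrative choices) builds a bivariate Student's pair by dividing a correlated Gaussian pair by a common chi factor and compares the conditional correlation with its Gaussian counterpart:

```python
import numpy as np

# Monte Carlo sketch of the contrast discussed above: for Student's variables
# the conditional correlation rho_v^+ saturates at a non-zero constant,
# whereas for Gaussian variables it decays towards zero as v grows.
rng = np.random.default_rng(2)
rho, nu, n, v = 0.8, 3, 2_000_000, 2.5

# Bivariate Student: a correlated Gaussian pair divided by a common chi factor.
g1 = rng.standard_normal(n)
g2 = rho * g1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
s = np.sqrt(rng.chisquare(nu, n) / nu)
ys, xs = g1 / s, g2 / s

# Gaussian pair with the same unconditional correlation.
yg, xg = g1, g2

student_cond = np.corrcoef(xs[ys > v], ys[ys > v])[0, 1]
gauss_cond = np.corrcoef(xg[yg > v], yg[yg > v])[0, 1]
# student_cond stays of order one while gauss_cond is already much smaller.
```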
C Proof of equation (9)

We assume that X and Y are related by the equation

X = \alpha Y + \epsilon,   (C.75)

where \alpha is a non-random real coefficient and \epsilon an idiosyncratic noise independent of Y, whose distribution is assumed to admit a second moment \sigma_\epsilon^2. Let us also denote by \sigma_y^2 the second moment of the variable Y. We have

\mathrm{Cov}(X, Y \mid Y \in A) = \mathrm{Cov}(\alpha Y + \epsilon,\, Y \mid Y \in A)   (C.76)

= \alpha\, \mathrm{Var}(Y \mid Y \in A) + \mathrm{Cov}(\epsilon, Y \mid Y \in A)   (C.77)

= \alpha\, \mathrm{Var}(Y \mid Y \in A),   (C.78)

since Y and \epsilon are independent. We also have

\mathrm{Var}(X \mid Y \in A) = \alpha^2\, \mathrm{Var}(Y \mid Y \in A) + 2\alpha\, \mathrm{Cov}(\epsilon, Y \mid Y \in A) + \mathrm{Var}(\epsilon \mid Y \in A)   (C.79)

= \alpha^2\, \mathrm{Var}(Y \mid Y \in A) + \sigma_\epsilon^2,   (C.80)

where, again, we have used the independence of Y and \epsilon. This allows us to write

\rho_A = \frac{\alpha\, \mathrm{Var}(Y \mid Y \in A)}{\sqrt{\mathrm{Var}(Y \mid Y \in A) \left(\alpha^2\, \mathrm{Var}(Y \mid Y \in A) + \sigma_\epsilon^2\right)}}   (C.81)

= \frac{\mathrm{sgn}(\alpha)}{\sqrt{1 + \frac{\sigma_\epsilon^2}{\alpha^2} \cdot \frac{1}{\mathrm{Var}(Y \mid Y \in A)}}}.   (C.82)

Since

\rho = \frac{\mathrm{sgn}(\alpha)}{\sqrt{1 + \frac{\sigma_\epsilon^2}{\alpha^2} \cdot \frac{1}{\mathrm{Var}(Y)}}},   (C.83)

we finally obtain

\rho_A = \frac{\rho}{\sqrt{\rho^2 + (1-\rho^2)\, \frac{\mathrm{Var}(Y)}{\mathrm{Var}(Y \mid Y \in A)}}},   (C.84)

which concludes the proof.
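Since (C.84) holds for any distributions of Y and ε with finite second moments, it can be checked directly by simulation. In the sketch below all parameters (α = 1, σ = 0.5, Gaussian factor and noise, A = [1, +∞)) are illustrative choices:

```python
import numpy as np

# Monte Carlo sketch of equation (C.84): for the one-factor model
# X = alpha*Y + eps, the conditional correlation rho_A is fully determined
# by rho and the variance ratio Var(Y)/Var(Y | Y in A).
rng = np.random.default_rng(3)
n, alpha, sigma = 1_000_000, 1.0, 0.5
y = rng.standard_normal(n)
x = alpha * y + sigma * rng.standard_normal(n)

rho = np.corrcoef(x, y)[0, 1]
mask = y > 1.0                      # conditioning set A = [1, +inf)
rho_A_direct = np.corrcoef(x[mask], y[mask])[0, 1]

var_ratio = y.var() / y[mask].var()
rho_A_formula = rho / np.sqrt(rho**2 + (1.0 - rho**2) * var_ratio)
# The two estimates agree up to Monte Carlo error.
```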
D Conditional Spearman's rho

The conditional Spearman's rho has been defined by

\rho_s(\tilde{v}) = \frac{\mathrm{Cov}(U, V \mid V \ge \tilde{v})}{\sqrt{\mathrm{Var}(U \mid V \ge \tilde{v})\, \mathrm{Var}(V \mid V \ge \tilde{v})}}.   (D.85)

We have

E[\,\cdot \mid V \ge \tilde{v}] = \frac{\int_{\tilde{v}}^1 \int_0^1 \cdot\; dC(u, v)}{\int_{\tilde{v}}^1 \int_0^1 dC(u, v)} = \frac{1}{1 - \tilde{v}} \int_{\tilde{v}}^1 \int_0^1 \cdot\; dC(u, v),   (D.86)

thus, performing a simple integration by parts, we obtain

E[U \mid V \ge \tilde{v}] = 1 + \frac{1}{1 - \tilde{v}} \left[\int_0^1 du\, C(u, \tilde{v}) - \frac{1}{2}\right],   (D.87)

E[V \mid V \ge \tilde{v}] = \frac{1 + \tilde{v}}{2},   (D.88)

E[U^2 \mid V \ge \tilde{v}] = 1 + \frac{2}{1 - \tilde{v}} \left[\int_0^1 du\, u\, C(u, \tilde{v}) - \frac{1}{3}\right],   (D.89)

E[V^2 \mid V \ge \tilde{v}] = \frac{\tilde{v}^2 + \tilde{v} + 1}{3},   (D.90)

E[U \cdot V \mid V \ge \tilde{v}] = \frac{1 + \tilde{v}}{2} + \frac{1}{1 - \tilde{v}} \left[\int_{\tilde{v}}^1 dv \int_0^1 du\, C(u, v) + \tilde{v} \int_0^1 du\, C(u, \tilde{v}) - \frac{1}{2}\right],   (D.91)

which yields

\mathrm{Cov}(U, V \mid V \ge \tilde{v}) = \frac{1}{1 - \tilde{v}} \int_{\tilde{v}}^1 dv \int_0^1 du\, C(u, v) - \frac{1}{2} \int_0^1 du\, C(u, \tilde{v}) - \frac{1}{4},   (D.92)

\mathrm{Var}(U \mid V \ge \tilde{v}) = \frac{1 - 4\tilde{v}}{12 (1 - \tilde{v})^2} + \frac{2}{1 - \tilde{v}} \int_0^1 du\, u\, C(u, \tilde{v}) + \frac{2\tilde{v} - 1}{(1 - \tilde{v})^2} \int_0^1 du\, C(u, \tilde{v}) - \frac{1}{(1 - \tilde{v})^2} \left(\int_0^1 du\, C(u, \tilde{v})\right)^2,   (D.93)

\mathrm{Var}(V \mid V \ge \tilde{v}) = \frac{(1 - \tilde{v})^2}{12},   (D.94)

so that

\rho_s(\tilde{v}) = \frac{\frac{12}{1 - \tilde{v}} \int_{\tilde{v}}^1 dv \int_0^1 du\, C(u, v) - 6 \int_0^1 du\, C(u, \tilde{v}) - 3}{\sqrt{1 - 4\tilde{v} + 24 (1 - \tilde{v}) \int_0^1 du\, u\, C(u, \tilde{v}) + 12 (2\tilde{v} - 1) \int_0^1 du\, C(u, \tilde{v}) - 12 \left(\int_0^1 du\, C(u, \tilde{v})\right)^2}} .   (D.95)
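The conditional Spearman's rho (D.85) is also straightforward to estimate from data by mapping each marginal to its ranks. The sketch below (Gaussian copula with ρ = 0.5, conditioning quantile ṽ = 0.9; all values illustrative) reproduces the qualitative behaviour of figure 8, where the conditional rank correlation drops well below the unconditional one:

```python
import numpy as np

# Monte Carlo sketch of the conditional Spearman's rho (D.85) for a Gaussian
# copula: conditioning on V above a high quantile strongly reduces the rank
# correlation. Parameters are illustrative.
rng = np.random.default_rng(4)
rho, n, v_tilde = 0.5, 500_000, 0.9
y = rng.standard_normal(n)
x = rho * y + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)

# Map to the copula scale by ranking: U and V are uniform on [0, 1].
u = np.argsort(np.argsort(x)) / (n - 1.0)
v = np.argsort(np.argsort(y)) / (n - 1.0)

spearman_all = np.corrcoef(u, v)[0, 1]     # unconditional Spearman's rho
mask = v >= v_tilde
spearman_cond = np.corrcoef(u[mask], v[mask])[0, 1]
# spearman_cond is far below spearman_all for the Gaussian copula.
```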
E Tail dependence generated by the Student's factor model

We consider two random variables X and Y related by the relation

X = \alpha Y + \epsilon,   (E.96)

where \epsilon is a random variable independent of Y and \alpha a non-random positive coefficient. Assume that Y and \epsilon have Student's distributions with densities

P_Y(y) = \frac{C_\nu}{\left(1 + \frac{y^2}{\nu}\right)^{\frac{\nu+1}{2}}},   (E.97)

P_\epsilon(\epsilon) = \frac{C_\nu}{\sigma \left(1 + \frac{\epsilon^2}{\nu \sigma^2}\right)^{\frac{\nu+1}{2}}}.   (E.98)

We first give a general expression for the probability for X to be larger than F_X^{-1}(u) knowing that Y is larger than F_Y^{-1}(u):

LEMMA 1. The probability that X is larger than F_X^{-1}(u) knowing that Y is larger than F_Y^{-1}(u) is given by

\Pr[X > F_X^{-1}(u) \mid Y > F_Y^{-1}(u)] = \bar{F}_\epsilon(\eta) + \frac{\alpha}{1-u} \int_{F_Y^{-1}(u)}^\infty dy\, \bar{F}_Y(y) \cdot P_\epsilon[\alpha F_Y^{-1}(u) + \eta - \alpha y],   (E.99)

with

\eta = F_X^{-1}(u) - \alpha F_Y^{-1}(u).   (E.100)

The proof of this lemma relies on a simple integration by parts and a change of variable, which are detailed in appendix E.1. Introducing the notation

\tilde{Y}_u = F_Y^{-1}(u),   (E.101)

we can show that

\eta = \alpha \left[\left(1 + \left(\frac{\sigma}{\alpha}\right)^\nu\right)^{1/\nu} - 1\right] \tilde{Y}_u + O(\tilde{Y}_u^{-1}),   (E.102)

which allows us to conclude that \eta goes to infinity as u goes to 1 (see appendix E.2 for the derivation of this result). Thus, \bar{F}_\epsilon(\eta) goes to zero as u goes to 1 and

\lambda = \lim_{u \to 1} \frac{\alpha}{1-u} \int_{\tilde{Y}_u}^\infty dy\, \bar{F}_Y(y) \cdot P_\epsilon(\alpha \tilde{Y}_u + \eta - \alpha y).   (E.103)

Now, using the following result:

LEMMA 2. Assuming \nu > 0 and x_0 > 1,

\lim_{\epsilon \to 0} \frac{1}{\epsilon} \int_1^\infty dx\, \frac{1}{x^\nu}\, \frac{C_\nu}{\left[1 + \frac{1}{\nu}\left(\frac{x - x_0}{\epsilon}\right)^2\right]^{\frac{\nu+1}{2}}} = \frac{1}{x_0^\nu},   (E.104)

whose proof is given in appendix E.3, it is straightforward to show that

\lambda = \frac{1}{1 + \left(\frac{\sigma}{\alpha}\right)^\nu}.   (E.105)

The final steps of this calculation are given in appendix E.4.
E.1 Proof of Lemma 1

By definition,

\Pr[X > F_X^{-1}(u),\; Y > F_Y^{-1}(u)] = \int_{F_X^{-1}(u)}^\infty dx \int_{F_Y^{-1}(u)}^\infty dy\, P_Y(y) \cdot P_\epsilon(x - \alpha y)   (E.106)

= \int_{F_Y^{-1}(u)}^\infty dy\, P_Y(y) \cdot \bar{F}_\epsilon[F_X^{-1}(u) - \alpha y].   (E.107)

Let us perform an integration by parts:

\Pr[X > F_X^{-1}(u),\; Y > F_Y^{-1}(u)] = \left[-\bar{F}_Y(y) \cdot \bar{F}_\epsilon(F_X^{-1}(u) - \alpha y)\right]_{F_Y^{-1}(u)}^\infty + \alpha \int_{F_Y^{-1}(u)}^\infty dy\, \bar{F}_Y(y) \cdot P_\epsilon(F_X^{-1}(u) - \alpha y)   (E.108)

= (1-u)\, \bar{F}_\epsilon\!\left(F_X^{-1}(u) - \alpha F_Y^{-1}(u)\right) + \alpha \int_{F_Y^{-1}(u)}^\infty dy\, \bar{F}_Y(y) \cdot P_\epsilon(F_X^{-1}(u) - \alpha y).   (E.109)

Defining \eta = F_X^{-1}(u) - \alpha F_Y^{-1}(u) (see equation (E.100)) and dividing each term by

\Pr[Y > F_Y^{-1}(u)] = 1 - u,   (E.110)

we obtain the result given in (E.99).
E.2 Derivation of equation (E.102)

The factor Y and the idiosyncratic noise \epsilon have Student's distributions with \nu degrees of freedom given by (E.97) and (E.98) respectively. It follows that the survival distributions of Y and \epsilon are

\bar{F}_Y(y) = \nu^{\frac{\nu-1}{2}}\, \frac{C_\nu}{y^\nu} + O\!\left(y^{-(\nu+2)}\right),   (E.111)

\bar{F}_\epsilon(\epsilon) = \sigma^\nu\, \nu^{\frac{\nu-1}{2}}\, \frac{C_\nu}{\epsilon^\nu} + O\!\left(\epsilon^{-(\nu+2)}\right),   (E.112)

and

\bar{F}_X(x) = (\alpha^\nu + \sigma^\nu)\, \nu^{\frac{\nu-1}{2}}\, \frac{C_\nu}{x^\nu} + O\!\left(x^{-(\nu+2)}\right).   (E.114)

Using the notation (E.101), equation (E.100) can be rewritten as

\bar{F}_X(\eta + \alpha \tilde{Y}_u) = \bar{F}_Y(\tilde{Y}_u) = 1 - u,   (E.115)

whose solution for large \tilde{Y}_u (or equivalently as u goes to 1) is

\eta = \alpha \left[\left(1 + \left(\frac{\sigma}{\alpha}\right)^\nu\right)^{1/\nu} - 1\right] \tilde{Y}_u + O(\tilde{Y}_u^{-1}).   (E.116)

To obtain this equation, we have used the asymptotic expressions of \bar{F}_X and \bar{F}_Y given in (E.114) and (E.111).
E.3 Proof of Lemma 2
We want to prove that, assuming ν > 0 and x0 > 1, Z 1 1 ∞ lim dx ν h ²→0 ² 1 x 1+ The change of variable u=
Cν 1 = ν. ¡ x−x ¢2 i ν+1 x 2 0 0
1 ν
(E.117)
²
x − x0 , ²
(E.118)
gives 1 ²
Z 1
∞
1 dx ν h x 1+
Z
Cν
1 ν
¡ x−x0 ¢2 i
ν+1 2
=
∞
du
1−x0 ²
²
=
1 xν0
=
1 xν0
Z Z
1 Cν ν (²u + x0 ) (1 + u2 ) ν+1 2 ν
∞ 1−x0 ² x0 ²
du
Cν 1 ²u ν (1 + x0 ) (1 + u2 ) ν+1 2 ν
1 Cν + (1 + x²u0 )ν (1 + u2 ) ν+1 2 ν Z ∞ 1 1 Cν du . xν0 x0 (1 + x²u0 )ν (1 + u2 ) ν+1 2
+
1−x0 ²
u≥ 1 (1 + u2 )
ν+1 2
≤
ν
(E.121)
ν
x0 , ²
which allows us to write
(E.120)
du
²
Consider the second integral. We have
(E.119)
(E.122) ν+1 2
²ν+1
xν+1 0
,
(E.123)
so that ¯ ¯Z ¯ ¯ ∞ 1 Cν ¯ ¯ ¯ x du ¯ ≤ ν+1 ²u ν ¯ 0 2 (1 + x0 ) (1 + u ) 2 ¯ ² =
ν
ν+1 2
²ν+1
xν+1 0 ν
ν+1 2
xν0 ν
²ν
Z
Z
∞ x0 ²
du
∞
dv 1
Cν (1 + x²u0 )ν
Cν (1 + v)ν
= O(² ).
(E.124) (E.125) (E.126)
The next step of the proof is to show that Z
x0 ² 1−x0 ²
du
Cν 1 −→ 1 as ² −→ 0. (1 + x²u0 )ν (1 + u2 ) ν+1 2 ν
Let us calculate ¯Z x0 ¯Z x0 ¯ ¯ ² ¯ ² ¯ Cν Cν 1 1 ¯ ¯ ¯ = − 1 − ¯ 1−x du ¯ 1−x du ¯ ²u ²u 2 ν+1 ν ν u ¯ ¯ ¯ 0 0 (1 + x0 ) (1 + ) 2 (1 + x0 ) (1 + u2 ) ν+1 2 ² ² ν ν 38
(E.127)
277
9.1. Les diff´erentes mesures de d´ependances extrˆemes
¯ ¯ ¯ du (E.128) ¯ 2 ν+1 u −∞ (1 + ν ) 2 ¯ ¯Z x0 " # ¯ ² 1 Cν ¯ −1 − ¯ 1−x du 2 ν+1 ¯ 0 (1 + x²u0 )ν (1 + uν ) 2 ² ¯ Z 1−x0 Z ∞ ¯ ² Cν Cν ¯ du − du ¯ (E.129) 2 ν+1 2 ν+1 x0 u u −∞ (1 + ν ) 2 (1 + ν ) 2 ¯ ² ¯Z x0 ¯ " # ¯ ² ¯ 1 Cν ¯ ¯ du − 1 ¯ 1−x ¯+ u2 ν+1 ¯ 0 (1 + x²u0 )ν 2 ¯ (1 + ) ² ν ¯Z 1−x0 ¯ ¯Z ¯ ¯ ¯ ¯ ¯ ∞ ² Cν Cν ¯ ¯ ¯ ¯ du + (E.130) du ¯ ¯ ¯ ¯. ν+1 ν+1 2 x0 u2 u ¯ −∞ ¯ ¯ (1 + ν ) 2 (1 + ν ) 2 ¯ ² Z
− = − ≤ +
∞
Cν
The second and third integrals obviously behave like O(²ν ) when ² goes to zero since we have assumed 0 x0 > 1 what ensures that 1−x → −∞ and x²0 → ∞ when ² → 0. For the first integral, we have ² ¯Z x0 ¯ Z x0 ¯ ¯ " # ¯ ² ¯ ¯ ¯ ² 1 1 Cν Cν ¯ ¯ ¯ ¯ du . (E.131) − 1 ≤ − 1 ¯ 1−x du ¯ ¯ ¯ ν+1 ²u ²u 2 ν ν 1−x0 ¯ ¯ (1 + x0 ) ¯ (1 + u2 ) ν+1 0 (1 + x0 ) 2 (1 + uν ) 2 ¯ ² ² ν The function
¯ ¯ ¯ ¯ 1 ¯ ¯ − 1 ¯ ¯ ¯ (1 + x²u0 )ν ¯
(E.132)
x0 0 vanishes at u = 0, is convex for u ∈ [ 1−x ² , 0] and concave for u ∈ [0, ² ] (see also figure 13), so that there are two constants A, B > 0 such that ¯ ¯ · ¸ ¯ ¯ 1 − x0 1 xν0 − 1 ¯ ¯ ² · u = −A · ² · u, ∀u ∈ ,0 (E.133) − 1¯ ≤ − ¯ ¯ (1 + x²u0 )ν ¯ x0 − 1 ² ¯ ¯ ¯ ¯ h x i 1 ν² ¯ ¯ 0 u = B · ² · u, ∀u ∈ 0, . (E.134) − 1 ≤ ¯ ¯ ²u ¯ (1 + x0 )ν ¯ x0 ²
We can thus conclude that ¯Z x0 ¯ " # Z 0 ¯ ² ¯ 1 C u · Cν ¯ ¯ ν −1 ¯ 1−x du ¯ ≤ −A · ² 1−x du ²u 2 ν+1 2 ν+1 ν u ¯ 0 0 (1 + x0 ) (1 + ν ) 2 ¯ (1 + uν ) 2 ² ² Z x0 ² u · Cν + B·² du 2 ν+1 0 (1 + uν ) 2 = O(²α ),
(E.135) (E.136)
with α = min{ν, 1}. Indeed, the two integrals can be perfomed exactly, which shows that they behave as O(1) if ν > 1 and as O(²ν−1 ) otherwise. Thus, we finally obtain ¯ ¯Z x0 ¯ ¯ ² Cν 1 ¯ ¯ (E.137) − 1 du ¯ = O(²α ). ¯ 1−x ¯ ¯ 0 (1 + x²u0 )ν (1 + u2 ) ν+1 2 ² ν
39
278
9. Mesure de la d´ependance extrˆeme entre deux actifs financiers
Putting together equations (E.126) and (E.137), we obtain

$$\left|\,\frac{1}{\epsilon}\int_{1}^{\infty}\frac{dx}{x^{\nu}}\,\frac{C_\nu}{\left[1+\frac{1}{\nu}\left(\frac{x-x_0}{\epsilon}\right)^{2}\right]^{\frac{\nu+1}{2}}}\;-\;\frac{1}{x_0^{\,\nu}}\,\right| = O\!\left(\epsilon^{\min\{\nu,1\}}\right), \qquad\text{(E.138)}$$
which concludes the proof.
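A quick numerical sanity check of this lemma (an addition, not part of the original proof; the quadrature grid and the cutoff are implementation choices) confirms that the left-hand side of (E.138) is close to 1/x0^ν for small ε, since the Student kernel acts as a nascent delta function centered at x0:

```python
import math

def lemma_lhs(nu, x0, eps):
    """(1/eps) * integral from 1 to infinity of x^-nu * C_nu /
    [1 + ((x - x0)/eps)^2 / nu]^((nu+1)/2) dx, evaluated after the
    substitution u = (x - x0)/eps."""
    # C_nu is the normalization of the Student-t density with nu degrees of freedom
    c_nu = math.gamma((nu + 1) / 2.0) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2.0))
    lo, hi, n = (1.0 - x0) / eps, 200.0, 400000
    du = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        u = lo + i * du
        w = 1.0 if 0 < i < n else 0.5  # trapezoidal weights
        total += w * (x0 + eps * u) ** (-nu) * c_nu * (1.0 + u * u / nu) ** (-(nu + 1) / 2.0)
    return total * du

val = lemma_lhs(nu=3, x0=2.0, eps=0.01)   # should be close to x0**-3 = 0.125
```

For ν = 3, x0 = 2 and ε = 0.01, the result agrees with 1/x0^ν = 0.125 up to corrections of order ε, as the lemma predicts.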
E.4 Derivation of equation (E.105)

From equation (E.111), we can deduce

$$\bar F_Y(y) = \nu^{\frac{\nu-1}{2}}\,\frac{C_\nu}{y^{\nu}}\left(1+O(y^{-2})\right). \qquad\text{(E.139)}$$
Using equations (E.98) and (E.102), we obtain

$$P_\epsilon(\alpha \tilde Y_u + \eta - \alpha y) = P_\epsilon(\gamma \tilde Y_u - \alpha y)\cdot\left(1+O(\tilde Y_u^{-2})\right), \qquad\text{(E.140)}$$

where

$$\gamma = \alpha\left(1+\left(\frac{\sigma}{\alpha}\right)^{\nu}\right)^{1/\nu}. \qquad\text{(E.141)}$$

Putting together these results yields, to leading order,

$$\int_{\tilde Y_u}^{\infty} dy\,\bar F_Y(y)\cdot P_\epsilon(\alpha\tilde Y_u+\eta-\alpha y) = \int_{\tilde Y_u}^{\infty} dy\;\nu^{\frac{\nu-1}{2}}\,\frac{C_\nu}{y^{\nu}}\cdot\frac{C_\nu}{\sigma\left[1+\frac{(\gamma\tilde Y_u-\alpha y)^2}{\nu\,\sigma^2}\right]^{\frac{\nu+1}{2}}} \qquad\text{(E.142)}$$

$$=\;\nu^{\frac{\nu-1}{2}}\,\frac{C_\nu}{\alpha\,\tilde Y_u^{\,\nu}}\cdot\frac{1}{\frac{\sigma}{\alpha\tilde Y_u}}\int_{1}^{\infty}\frac{dx}{x^{\nu}}\,\frac{C_\nu}{\left[1+\frac{1}{\nu}\left(\frac{x-\frac{\gamma}{\alpha}}{\frac{\sigma}{\alpha\tilde Y_u}}\right)^{2}\right]^{\frac{\nu+1}{2}}}, \qquad\text{(E.143)}$$

where the change of variable x = y/Ỹ_u has been performed in the last equation.

We now apply lemma 2 with x0 = γ/α > 1 and ε = σ/(αỸ_u), which goes to zero as u goes to 1. This gives

$$\int_{\tilde Y_u}^{\infty} dy\,\bar F_Y(y)\cdot P_\epsilon(\alpha\tilde Y_u+\eta-\alpha y)\;\sim_{u\to1}\;\nu^{\frac{\nu-1}{2}}\,\frac{C_\nu}{\alpha\,\tilde Y_u^{\,\nu}}\left(\frac{\alpha}{\gamma}\right)^{\nu}, \qquad\text{(E.144)}$$

which shows that

$$\Pr\left[X>F_X^{-1}(u),\;Y>F_Y^{-1}(u)\right]\;\sim_{u\to1}\;\bar F_Y(\tilde Y_u)\left(\frac{\alpha}{\gamma}\right)^{\nu} = (1-u)\left(\frac{\alpha}{\gamma}\right)^{\nu}, \qquad\text{(E.145)}$$

thus

$$\Pr\left[X>F_X^{-1}(u)\;\middle|\;Y>F_Y^{-1}(u)\right]\;\sim_{u\to1}\;\left(\frac{\alpha}{\gamma}\right)^{\nu}, \qquad\text{(E.146)}$$

which finally yields

$$\lambda = \left(\frac{\alpha}{\gamma}\right)^{\nu} = \frac{1}{1+\left(\frac{\sigma}{\alpha}\right)^{\nu}}\,. \qquad\text{(E.147)}$$
References

Abramowitz, M. and I.A. Stegun, 1972, Handbook of Mathematical Functions (Dover Publications, New York).
Andersen, J.V. and D. Sornette, 2001, Have your cake and eat it too: increasing returns while lowering large risks!, Journal of Risk Finance 2, 70-82.
Ang, A. and G. Bekaert, 2001, International asset allocation with regime shifts, forthcoming, Review of Financial Studies.
Ang, A. and J. Chen, 2001, Asymmetric correlations of equity portfolios, forthcoming, Journal of Financial Economics.
Baig, T. and I. Goldfajn, 1998, Financial market contagion in the Asian crisis, IMF Working Paper.
Bhansali, V. and M.B. Wise, 2001, Forecasting portfolio risk in normal and stressed markets, working paper (preprint at http://xxx.lanl.gov/abs/nlin.AO/0108022).
Bookstaber, R., 1997, Global risk management: are we missing the point?, Journal of Portfolio Management 23, 102-107.
Boyer, B.H., M.S. Gibson and M. Loretan, 1997, Pitfalls in tests for changes in correlations, International Finance Discussion Paper 597, Board of Governors of the Federal Reserve System.
Davis, R.A., T. Mikosch and B. Basrak, 1999, Sample ACF of multivariate stochastic recurrence equations with application to GARCH, working paper.
Calvo, S. and C.M. Reinhart, 1995, Capital flows to Latin America: Is there evidence of contagion effects?, in G.A. Calvo, M. Goldstein and E. Hochreiter, eds.: Private Capital Flows to Emerging Markets After the Mexican Crisis (Institute for International Economics, Washington DC).
Cizeau, P., M. Potters and J.-P. Bouchaud, 2001, Correlation structure of extreme stock returns, Quantitative Finance 1, 217-222.
Claessens, S., R.W. Dornbusch and Y.C. Park, 2001, Contagion: Why crises spread and how this can be stopped, in S. Claessens and K.J. Forbes, eds.: International Financial Contagion (Kluwer Academic Press).
Coles, S., J. Heffernan and J. Tawn, 1999, Dependence measures for extreme value analyses, Extremes 2, 339-365.
Embrechts, P., A.J. McNeil and D. Straumann, 1999, Correlation: pitfalls and alternatives, Risk, 69-71.
Embrechts, P., A.J. McNeil and D. Straumann, 2001, Correlation and dependency in risk management: properties and pitfalls, in M. Dempster, ed.: Value at Risk and Beyond (Cambridge University Press).
Forbes, K.J. and R. Rigobon, 2002, No contagion, only interdependence: measuring stock market comovements, forthcoming, Journal of Finance.
Frees, E. and E. Valdez, 1998, Understanding relationships using copulas, North American Actuarial Journal 2, 1-25.
Frieden, J.A., 1992, Debt, Development, and Democracy: Modern Political Economy and Latin America, 1965-1985 (Princeton University Press).
Frieden, J.A., P. Ghezzi and E. Stein, 2000, Politics and exchange rates: a cross-country approach to Latin America, Harvard University working paper, October 2000.
Frieden, J.A. and E. Stein, 2000, The political economy of exchange rate policy in Latin America: an analytical overview, Harvard University working paper, October 2000.
Hartmann, P., S. Straetmans and C.G. de Vries, 2001, Asset market linkages in crisis periods, European Central Bank, Working Paper No. 71.
Hauksson, H.A., M.M. Dacorogna, T. Domenig, U.A. Müller and G. Samorodnitsky, 2001, Multivariate extremes, aggregation and risk estimation, Quantitative Finance 1, 79-95.
Heffernan, J.E., 2000, A directory of tail dependence, Extremes 3, 279-290.
Hult, H. and F. Lindskog, 2001, Multivariate extremes, aggregation and dependence in elliptical distributions, RiskLab working paper.
Jensen, J.L., 1995, Saddlepoint Approximations (Oxford University Press).
Joe, H., 1997, Multivariate Models and Dependence Concepts (Chapman & Hall, London).
Johnson, N.L. and S. Kotz, 1972, Distributions in Statistics: Continuous Multivariate Distributions (John Wiley and Sons).
Juri, A. and M.V. Wüthrich, 2002, Copula convergence theorem for tail events, RiskLab working paper.
King, M. and S. Wadhwani, 1990, Transmission of volatility between stock markets, Review of Financial Studies 3, 5-33.
Kulpa, T., 1999, On approximations of copulas, International Journal of Mathematics and Mathematical Sciences 22, 259-269.
Ledford, A.W. and J.A. Tawn, 1996, Statistics for near independence in multivariate extreme values, Biometrika 83, 169-187.
Ledford, A.W. and J.A. Tawn, 1998, Concomitant tail behavior for extremes, Advances in Applied Probability 30, 197-215.
Lee, S.B. and K.J. Kim, 1993, Does the October 1987 crash strengthen the co-movements among national stock markets?, Review of Financial Economics 3, 89-102.
Li, X., P. Mikusinski and M.D. Taylor, 1998, Strong approximation of copulas, Journal of Mathematical Analysis and Applications 225, 608-623.
Lindskog, F., 2000, Modelling dependence with copulas, RiskLab working paper.
Longin, F. and B. Solnik, 1995, Is the correlation in international equity returns constant: 1960-1990?, Journal of International Money and Finance 14, 3-26.
Longin, F. and B. Solnik, 2001, Extreme correlation of international equity markets, Journal of Finance 56, 649-676.
Loretan, M., 2000, Evaluating changes in correlations during periods of high market volatility, Global Investor 135, 65-68.
Loretan, M. and W.B. English, 2000, Working paper 000-658, Board of Governors of the Federal Reserve System.
Malevergne, Y. and D. Sornette, 2001, Testing the Gaussian copula hypothesis for financial assets dependence, working paper.
Malevergne, Y. and D. Sornette, 2002, Tail dependence for factor models, working paper.
Mansilla, R., 2001, Algorithmic complexity of real financial markets, Physica A 301, 483-492.
Meerschaert, M.M. and H.P. Scheffler, 2001, Sample cross-correlations for moving averages with regularly varying tails, Journal of Time Series Analysis 22, 481-492.
Nelsen, R.B., 1998, An Introduction to Copulas, Lecture Notes in Statistics 139 (Springer Verlag, New York).
Patton, A.J., 2001, Estimation of copula models for time series of possibly different lengths, University of California, San Diego, Economics Discussion Paper No. 2001-17.
Poon, S.-H., M. Rockinger and J. Tawn, 2001, New extreme-value dependence measures and finance applications, working paper.
Quintos, C.E., 2001, Estimating tail dependence and testing for contagion using tail indices, working paper.
Quintos, C.E., Z.H. Fan and P.C.B. Phillips, 2001, Structural change tests in tail behaviour and the Asian crisis, Review of Economic Studies 68, 633-663.
Ramchand, L. and R. Susmel, 1998, Volatility and cross correlation across major stock markets, Journal of Empirical Finance 5, 397-416.
Ross, S., 1976, The arbitrage theory of capital asset pricing, Journal of Economic Theory 13, 341-360.
Scaillet, O., 2000, Nonparametric estimation of copulas for time series, working paper.
Sharpe, W., 1964, Capital asset prices: a theory of market equilibrium under conditions of risk, Journal of Finance 19, 425-442.
Silvapulle, P. and C.W.J. Granger, 2001, Large returns, conditional correlation and portfolio diversification: a value-at-risk approach, Quantitative Finance 1, 542-551.
Sornette, D., P. Simonetti and J.V. Andersen, 2000, φ^q-field theory for portfolio optimization: "fat tails" and non-linear correlations, Physics Reports 335, 19-92.
Sornette, D., J.V. Andersen and P. Simonetti, 2000, Portfolio theory for "fat tails", International Journal of Theoretical and Applied Finance 3, 523-535.
Starica, C., 1999, Multivariate extremes for models with constant conditional correlations, Journal of Empirical Finance 6, 515-553.
Tsui, A.K. and Q. Yu, 1999, Constant conditional correlation in a bivariate GARCH model: evidence from the stock markets of China, Mathematics and Computers in Simulation 48, 503-509.
Negative tail:
              Brazil        Chile         Mexico
  Argentina   0.28 (0.04)   0.25 (0.04)   0.25 (0.05)
  Brazil      -             0.19 (0.03)   0.25 (0.05)
  Chile       -             -             0.24 (0.07)

Positive tail:
              Brazil        Chile         Mexico
  Argentina   0.21 (0.06)   0.20 (0.04)   0.22 (0.04)
  Brazil      -             0.28 (0.04)   0.19 (0.04)
  Chile       -             -             0.19 (0.03)

Table 1: Coefficients of tail dependence between the four Latin American markets. The figures within parentheses give the standard deviations of the estimated values, derived under the assumption of asymptotic normality of the estimators. Only the coefficients above the diagonal are shown, since the matrix is symmetric.
Student's copula hypothesis, ν = 3:
              Brazil   Chile   Mexico
  Argentina   0.24     0.25    0.27
  Brazil      -        0.24    0.27
  Chile       -        -       0.28

Table 2: Coefficients of tail dependence derived under the assumption of a Student's copula with three degrees of freedom.
                           ρ_v^+          ρ_v^s          ρ_u
  Bivariate Gaussian       (3)            (6)            (11)
  Bivariate Student's      (4)            (7)            -
  Gaussian factor model    same as (3)    same as (6)    same as (11)
  Student's factor model   same as (4)    (13)           same as (13)

Table 3: Large-v and large-u dependence of the conditional correlations ρ_v^+ (signed condition), ρ_v^s (unsigned condition) and ρ_u (condition on both variables) for the different models studied in the present paper, listed in the first column. The numbers in parentheses give the equation numbers from which the formulas are derived. The factor model is defined by (8), i.e., X = αY + ε. ρ is the unconditional correlation coefficient.
                           ρ_{v=∞}^+      ρ_{v=∞}^s      ρ_{u=∞}   λ                                       λ̄
  Bivariate Gaussian       0              1              0         0                                       ρ
  Bivariate Student's      see Table 3    see Table 3    -         2·T̄_{ν+1}(√(ν+1)·√((1−ρ)/(1+ρ)))       1
  Gaussian factor model    0              1              0         0                                       ρ
  Student's factor model   1              1              -         ρ^ν/(ρ^ν + (1−ρ²)^{ν/2})               1

Table 4: Asymptotic values of ρ_v^+, ρ_v^s and ρ_u for v → +∞ and u → ∞, and comparison with the tail dependence coefficients λ and λ̄, for the four models indicated in the first column. The factor model is defined by (8), i.e., X = αY + ε. ρ is the unconditional correlation coefficient. For the Student's factor model, Y and ε have centered Student's distributions with the same number ν of degrees of freedom and scale factors respectively equal to 1 and σ, so that ρ = (1 + σ²/α²)^{−1/2}. For the bivariate Student's distribution, we refer to Table 3 for the constant values of ρ_{v=∞}^+ and ρ_{v=∞}^s.
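The bivariate Student's entry λ = 2·T̄_{ν+1}(√(ν+1)·√((1−ρ)/(1+ρ))) can be evaluated with a short script (an addition to the text; the quadrature scheme and cutoff are implementation choices), T̄ being the Student survival function computed here by direct quadrature of the density using only the standard library. For ν = 3 and moderate correlations ρ, this gives values of the same order as those reported in Table 2:

```python
import math

def t_sf(x, df):
    """P(T > x) for a Student-t variable with df degrees of freedom,
    by trapezoidal quadrature of the density on [x, x + 400]."""
    c = math.gamma((df + 1) / 2.0) / (math.sqrt(df * math.pi) * math.gamma(df / 2.0))
    n = 200000
    h = 400.0 / n
    s = 0.0
    for i in range(n + 1):
        t = x + i * h
        w = 1.0 if 0 < i < n else 0.5  # trapezoidal weights
        s += w * c * (1.0 + t * t / df) ** (-(df + 1) / 2.0)
    return s * h

def student_copula_lambda(rho, nu):
    # Tail dependence of the bivariate Student's copula (Table 4)
    arg = math.sqrt(nu + 1.0) * math.sqrt((1.0 - rho) / (1.0 + rho))
    return 2.0 * t_sf(arg, nu + 1.0)

lam = student_copula_lambda(rho=0.3, nu=3)   # roughly 0.22
```

As expected, λ increases with ρ at fixed ν, and decreases rapidly as ν grows at fixed ρ.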
Figure 1: Positive tail (upper panel) and negative tail (lower panel) of the return distributions of the Argentina, Brazil, Chile and Mexico market indices, in double-logarithmic scale; the straight line indicates the slope μ = 2.
Figure 2: Argentina−Brazil: conditional correlation coefficients ρ_v^{+,−} as a function of the conditioning level v, with the conditioning variable y = Brazil (upper panel) and y = Argentina (lower panel).
Figure 3: Argentina−Brazil: conditional correlation coefficient ρ_v^s as a function of the conditioning level v, with y = Brazil (upper panel) and y = Argentina (lower panel).
Figure 4: Brazil−Chile: conditional correlation coefficients ρ_v^{+,−} as a function of the conditioning level v, with y = Chile (upper panel) and y = Brazil (lower panel).
Figure 5: Brazil−Chile: conditional correlation coefficient ρ_v^s as a function of the conditioning level v, with y = Chile (upper panel) and y = Brazil (lower panel).
Figure 6: Chile−Mexico: conditional correlation coefficients ρ_v^{+,−} as a function of the conditioning level v, with y = Mexico (upper panel) and y = Chile (lower panel).
Figure 7: Chile−Mexico: conditional correlation coefficient ρ_v^s as a function of the conditioning level v, with y = Mexico (upper panel) and y = Chile (lower panel).
Figure 8: ρ^s(v) as a function of the quantile v, for ρ = 0.1, 0.3, 0.5, 0.7 and 0.9 (two panels).
Figure 9: Argentina−Brazil: conditional correlation coefficients ρ_v^{+,−} as a function of the quantile v, with y = Brazil (upper panel) and y = Argentina (lower panel).
Figure 10: Brazil−Chile: conditional correlation coefficients ρ_v^{+,−} as a function of the quantile v, with y = Chile (upper panel) and y = Brazil (lower panel).
Figure 11: Chile−Mexico: conditional correlation coefficients ρ_v^{+,−} as a function of the quantile v, with y = Mexico (upper panel) and y = Chile (lower panel).
Figure 12: Tail dependence coefficient λ as a function of ρ for the Student's copula (left panel) and the Student's factor model (right panel), for ν = 3, 5, 10, 20, 50 and 100.
Figure 13: The function |1/(1 + εu/x0)^ν − 1| together with its linear bounds −((x0^ν − 1)/(x0 − 1))·ε·u on [(1 − x0)/ε, 0] and (νε/x0)·u on [0, x0/ε], used in equations (E.133) and (E.134).
9.2 Estimation du coefficient de dépendance de queue

Using the framework of factor models, we study the extreme co-movements between two financial assets, or between an asset and the market. To this aim, we establish the general expression of the coefficient of tail dependence between the market and an asset (that is, the probability that an asset incurs an extreme loss, given that the market has itself undergone an extreme loss) and between two assets, as a function of the parameters of the factor model and of the tail parameters of the distributions of the factor and of the idiosyncratic noise. Our formula holds for arbitrary marginal distributions and does not require any parameterization of the joint distribution of the market and the assets. The determination of this extreme parameter, which is not accessible by direct statistical inference, is made possible by the measurement of parameters whose estimation involves a significant amount of data. Our empirical tests show good agreement between the calibrated coefficient of tail dependence and the large losses realized between 1962 and 2000. Nevertheless, a systematic bias is detected, suggesting the existence of an "outlier" at the October 1987 crash, and indicating that the one-factor model (CAPM) we have considered does not fully account for the extreme properties in certain critical market phases.
How to account for extreme co-movements between individual stocks and the market∗

Y. Malevergne (1,2) and D. Sornette (1,3)

(1) Laboratoire de Physique de la Matière Condensée, CNRS UMR 6622, Université de Nice-Sophia Antipolis, 06108 Nice Cedex 2, France
(2) Institut de Science Financière et d'Assurances, Université Lyon I, 43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
(3) Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, California 90095, USA

email: [email protected] and [email protected]
fax: (33) 4 92 07 67 54

August 8, 2002
Abstract

Using the framework of factor models, we study the extreme co-movements between two stocks and between a stock and the market. To this aim, we establish the general expression of the coefficient of tail dependence between the market and a stock (that is, the probability that the stock incurs a large loss, assuming that the market has also undergone a large loss) and between two stocks, as a function of the parameters of the underlying factor model and of the tail parameters of the distributions of the factor and of the idiosyncratic noise of each stock. Our formula holds for arbitrary marginal distributions and, in addition, does not require any parameterization of the multivariate distribution of the market and stocks. The determination of the extreme parameter, which is not accessible by direct statistical inference, is made possible by the measurement of parameters whose estimation involves a significant part of the data, thus ensuring sufficient statistics. Our empirical tests find a good agreement between the calibrated tail dependence coefficients and the realized large losses over the period from 1962 to 2000. Nevertheless, a bias is detected, which suggests the presence of an outlier in the form of the crash of October 1987.
∗ We acknowledge helpful discussions and exchanges with C.W.J. Granger, J.P. Laurent, V. Pisarenko, R. Valkanov and D. Zajdenweber. This work was partially supported by the James S. McDonnell Foundation 21st Century Scientist Award/Studying Complex Systems.
Introduction

The concept of extreme or "tail dependence" probes the reaction of a variable to the realization of another variable when this realization is of extreme amplitude and very low probability. The dependence, and especially the extreme dependence, between two assets or between an asset and any other exogenous economic variable is an issue of major importance both for practitioners and for academics. The determination of extreme dependences is crucial for financial and insurance institutions involved in risk management. It is also fundamental for the establishment of a rational investment policy striving for the best diversification of the various sources of risk. In all these situations, the objective is to prevent, or at least minimize, the simultaneous occurrence of large losses across the different positions held in the portfolio. From an academic perspective, taking into account the extreme dependence properties provides useful yardsticks and important constraints for the construction of models, which should neither underestimate nor overestimate risks. From the point of view of univariate statistics, extreme value theory provides the mathematical framework for the classification and quantification of very large risks. This has been made possible by the existence of a "universal" behavior summarized by the Gnedenko-Pickands-Balkema-de Haan theorem, which gives a natural limit law for peak-over-threshold values in the form of the Generalized Pareto Distribution (see Embrechts, Klüppelberg, and Mikosch (1997, pp. 152-168)). Moreover, most of these univariate extreme value results are robust with respect to the time dependences observed in financial time series (see de Haan, Resnick, Rootzén and de Vries (1989) or Starica (1999) for instance). In contrast, no such result is yet available in the multivariate case. In the absence of theoretical guidelines, the alternative is therefore to impose some dependence structure in a rather ad hoc and arbitrary way.
This was the stance taken, for instance, by Longin and Solnik (2001) in their study of the phenomenon of contagion across international equity markets. This approach, where the dependence structure is not determined from empirical facts or from an economic model, is not fully satisfying. As a remedy, we propose a new approach which does not directly rely on multivariate extreme value theory, but rather derives the extreme dependence structure from the characteristics of a financial model of assets. Specifically, we use the general class of factor models, which is probably one of the most versatile and relevant ones, and whose introduction in finance can be traced back at least to Ross (1976). Factor models are now widely used in many branches of finance, including stock return models, interest rate models (Vasicek (1977), Brennan and Schwartz (1978), Cox, Ingersoll, and Ross (1985)), credit risk models (Carey (1998), Gordy (2000), Lucas, Klaassen, Spreij and Straetmans (2001)), and so on, and are found at the core of many theories and equilibrium models. Here, we shall first focus on the characterization of the extreme dependence between stock returns and the market return, and then on the extreme dependence between two stocks which share a common explanatory factor. The role of the market return as a factor explaining the evolution of individual stock returns is supported both by theoretical models such as the Capital Asset Pricing Model (CAPM) (Sharpe (1964), Lintner (1965), Mossin (1966)) and the Arbitrage Pricing Theory (APT) (Ross (1976)), and by empirical studies (Fama and MacBeth (1973), Kandel and Stambaugh (1987), among many others). It has even been shown by Roll (1988) that in certain dramatic circumstances, such as the October 1987 stock-market crash, the (global) market was the sole relevant factor needed to explain the stock market movements and the propagation of the crash across countries.
Thus, the choice of factor models is a very natural starting point for studying extreme dependences from a general point of view. The main gain is that, without imposing any a priori ad hoc dependences other than the definition of the factor model, we shall be able to
derive the general properties of the extreme dependence between an asset and one of its factors, and to empirically determine these properties by a simple estimation of the factor model parameters. Our results are directly relevant to a portfolio manager using any of the factor models, such as the CAPM or the APT, to estimate the impact on her extreme risks of the addition or removal of an asset in her portfolio. In this framework, our results stated for single assets can easily be extended to an entire portfolio, and some examples will be given. This problem is particularly acute for funds of funds. From a more global perspective, our analysis of the tail dependence of two assets is the correct setting for analyzing the strategic asset allocation facing a portfolio manager striving to diversify between a portfolio of stocks and a portfolio of bonds, or between portfolios constituted of domestic and of international assets. Our main addition to the literature is to provide a completely general analytical formula for the extreme dependence between any two assets, which holds for any distribution of returns of these two assets and of their common factor, and which thus embodies their intrinsic dependence. Our second innovation is to provide a novel and robust method for estimating the extreme dependence empirically, which we test on twenty major stocks of the NYSE. Comparing with historical co-movements over the last forty years, we check that our prediction is validated out-of-sample, and thus provide an ex-ante method to quantify future stressful periods, so that our results can be directly used to construct a portfolio aiming at minimizing the impact of extreme events. We are also able to detect an anomalous co-monotonicity associated with the October 1987 crash.

The plan of our presentation is as follows. The first section defines the concepts needed for the characterization and quantification of extreme dependences.
In particular, we recall the definition of the coefficient of tail dependence, which captures in a single number the properties of extreme dependence between two random variables: the tail dependence is defined as the probability for a given random variable to be large, assuming that another random variable is large, at the same probability level. We shall also need some basic notions on dependences between random variables using the mathematical concept of copulas. In order to provide some perspective on the following results, this section also contains the expressions of some classical examples of tail dependence coefficients for specific multivariate distributions. The second section states our main result in the form of a general theorem allowing the calculation of the coefficient of tail dependence for any factor model with arbitrary distribution functions of the factors and of the idiosyncratic noise. We find that the factor must have sufficiently "wild" fluctuations (to be made precise below) in order for the tail dependence not to vanish. For normal distributions of the factor, the tail dependence is identically zero, while for regularly varying distributions (power laws), the tail dependence is in general non-zero. We also show that the most interesting coefficients of tail dependence are those between each individual stock and their common factor, since the tail dependence between any pair of assets is shown to be nothing but the minimum of the tail dependence between each asset and their common factor. The third section is devoted to the empirical estimation of the coefficients of tail dependence between individual stock returns and the market return. The tests are performed for daily stock returns.
The estimated coefficients of tail dependence are found in good agreement with the fraction of historically realized extreme events that occur simultaneously with any of the ten largest losses of the market factor (these ten largest losses were not used to calibrate the tail dependence coefficient). We also find some evidence for comonotonicity in the crash of Oct. 1987, suggesting that this event is an “outlier,” providing additional support to a previous analysis of large and extreme drawdowns. We summarize our results and conclude in the fourth section.
1 Intrinsic measure of casual and of extreme dependences
This section provides a brief informal summary of the mathematical concepts used in this paper to characterize the normal and extreme dependences between asset returns.
1.1 How to characterize uniquely the full dependence between two random variables?
The answer to this question is provided by the mathematical notion of "copulas," initially introduced by Sklar (1959)¹, which allows one to study the dependence of random variables independently of the behavior of their marginal distributions. Our presentation focuses on two variables but is easily extended to the case of N random variables, whatever N may be. Sklar's theorem states that, given the joint distribution function F(·,·) of two random variables X and Y with marginal distributions FX(·) and FY(·) respectively, there exists a function C: [0,1] × [0,1] → [0,1] such that

F(x, y) = C(FX(x), FY(y)),    (1)

for all (x, y). This function C is the copula of the two random variables X and Y, and is unique if the random variables have continuous marginal distributions. Moreover, the following result shows that copulas are intrinsic measures of dependence: if g1, g2 are strictly increasing on the ranges of X and Y, the random variables X̃ = g1(X), Ỹ = g2(Y) have exactly the same copula C (see Lindskog (2000)). The copula is thus invariant under strictly increasing transformations of the variables. This provides a powerful way of studying scale-invariant measures of association. It is also a natural starting point for the construction of multivariate distributions.
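This invariance can be illustrated concretely (an addition, not part of the original article): Kendall's tau depends only on the ranks of the observations, hence only on the copula, so it is exactly unchanged under strictly increasing maps of the margins. The sample and the transforms x → e^x, y → y³ below are arbitrary choices:

```python
import math
import random

def kendall_tau(x, y):
    """Kendall's tau: average concordance of the signs of (x_i - x_j)(y_i - y_j)
    over all pairs; it depends only on the ranks, hence only on the copula."""
    n, s = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))

random.seed(0)
z = [random.gauss(0.0, 1.0) for _ in range(300)]      # common factor
x = [zi + random.gauss(0.0, 1.0) for zi in z]         # two dependent variables
y = [zi + random.gauss(0.0, 1.0) for zi in z]

tau = kendall_tau(x, y)
tau_transformed = kendall_tau([math.exp(v) for v in x], [v ** 3 for v in y])
# exp and the cube are strictly increasing, so the copula -- and tau -- are unchanged
assert tau_transformed == tau
```

The linear correlation coefficient, by contrast, is generally altered by such transformations, which is the point developed in the next subsection.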
1.2 Tail dependence between two random variables
A standard measure of dependence between two random variables is provided by the correlation coefficient. However, it suffers from at least three deficiencies. First, as stressed by Embrechts, McNeil, and Straumann (1999), the correlation coefficient is an adequate measure of dependence only for elliptical distributions and for events of moderate sizes. Second, the correlation coefficient measures only the degree of linear dependence and does not account for any other nonlinear functional dependence between the random variables. Third, it aggregates both the marginal behavior of each random variable and their dependence. For instance, a simple change in the marginals implies in general a change in the correlation coefficient, while the copula, and therefore the dependence, remains unchanged. Mathematically speaking, the correlation coefficient is said to lack the property of invariance under increasing changes of variables. Since the copula is the unique and intrinsic measure of dependence, it is desirable to define measures of dependence which depend only on the copula. Such measures have in fact been known for a long time. Examples are provided by the concordance measures, among which the most famous are Kendall's tau and Spearman's rho (see Nelsen (1998) for a detailed exposition). In particular, Spearman's rho quantifies the degree of functional dependence between two random variables: it equals one (minus one) when and only when the first variable is an increasing (decreasing) function of the second variable. However, as for the correlation coefficient, these concordance measures do
1 The reader is referred to Joe (1997), Frees and Valdez (1998) or Nelsen (1998) for a detailed survey of the notion of copulas and a mathematically rigorous description of their properties.
not provide a useful measure of the dependence for extreme events, since they are constructed over the whole distributions. Another natural idea, widely used in the contagion literature, is to work with the conditional correlation coefficient, conditioned on the largest events only. But, as stressed by Boyer, Gibson, and Loretan (1997), such a conditional correlation coefficient suffers from a bias: even for a constant unconditional correlation coefficient, the conditional correlation coefficient changes with the conditioning set. Therefore, changes in the conditional correlation do not provide a characteristic signature of a change in the true correlations. The conditional concordance measures suffer from the same problem.

In view of these deficiencies, it is natural to come back to a fundamental definition of dependence through the use of probabilities. We thus study the conditional probability that the first variable is large, conditioned on the second variable being large too: F̄(x|y) = Pr{X > x | Y > y}, when x and y go to infinity. Since the convergence of F̄(x|y) may depend on the manner in which x and y go to infinity (the convergence is not uniform), we need to specify the path taken by the variables to reach infinity. Recalling that it would be preferable to have a measure which is independent of the marginal distributions of X and Y, it is natural to reason in the quantile space. This leads us to choose x = F_X^{−1}(u) and y = F_Y^{−1}(u), and to replace the conditions x, y → ∞ by u → 1. Doing so, we define the so-called coefficient of upper tail dependence (see Coles, Heffernan, and Tawn (1999), Lindskog (2000), or Embrechts, McNeil, and Straumann (2001)):

$$\lambda^{+} = \lim_{u\to1^{-}}\Pr\{X>F_X^{-1}(u)\mid Y>F_Y^{-1}(u)\}\,. \qquad(2)$$
As required, this measure of dependence is independent of the marginals, since it can be expressed in terms of the copula of X and Y as

λ_+ = lim_{u→1^-} [1 − 2u + C(u, u)] / (1 − u) .    (3)
This representation shows that λ_+ is symmetric in X and Y, as it should be for a reasonable measure of dependence. In a similar way, we define the coefficient of lower tail dependence as the probability that X incurs a large loss, assuming that Y incurs a large loss at the same probability level:

λ_− = lim_{u→0^+} Pr{X < F_X^{-1}(u) | Y < F_Y^{-1}(u)} = lim_{u→0^+} C(u, u) / u .    (4)
This last expression has a simple interpretation in terms of Value-at-Risk. Indeed, the quantiles F_X^{-1}(u) and F_Y^{-1}(u) are nothing but the Values-at-Risk of assets (or portfolios) X and Y at the confidence level 1 − u. Thus, the coefficient λ_− simply provides the probability that X exceeds its VaR at confidence level 1 − u, assuming that Y has exceeded its VaR at the same confidence level, as this level goes to one. As a consequence, the probability that both X and Y exceed their VaR at the confidence level 1 − u is asymptotically given by λ_− · u as u → 0. As an example, consider a daily VaR calculated at the 99% confidence level. Then, the probability that both X and Y undergo a loss larger than their VaR at the 99% level is approximately given by λ_−/100. Thus, when λ_− is about 0.1, the typical recurrence time between such concomitant large losses is about four years, while for λ_− ≈ 0.5 it is less than ten months. The values of the coefficients of tail dependence are known explicitly for a large number of copulas. For instance, the Gaussian copula, which is the copula derived from the Gaussian multivariate distribution, has a zero coefficient of tail dependence. In contrast, the Gumbel copula used
9.2. Estimating the tail dependence coefficient
by Longin and Solnik (2001) in the study of the contagion between international equity markets, which is defined by

C_θ(u, v) = exp( −[ (−ln u)^{1/θ} + (−ln v)^{1/θ} ]^θ ) ,  θ ∈ [0, 1],    (5)

has an upper tail dependence coefficient λ_+ = 2 − 2^θ. For all θ smaller than one, λ_+ is positive and the Gumbel copula is said to present tail dependence, while for θ = 1 the Gumbel copula is said to be asymptotically independent. One should however use this terminology with a grain of salt, as "tail independence" (quantified by λ_+ = 0 or λ_− = 0) does not necessarily imply that large events occur independently (see Coles, Heffernan, and Tawn (1999) for a precise discussion of this point).
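As a quick numerical sanity check (not part of the original study), the finite-u ratio in (3) can be evaluated for the Gumbel copula (5) and compared with the closed form 2 − 2^θ; the cutoff u = 1 − 10⁻⁶ below is an arbitrary choice.

```python
import math

def gumbel_copula(u, v, theta):
    # Gumbel copula of eq. (5), with theta in (0, 1]
    return math.exp(-((-math.log(u)) ** (1.0 / theta)
                      + (-math.log(v)) ** (1.0 / theta)) ** theta)

def upper_tail_dependence(copula, u=1.0 - 1e-6):
    # finite-u approximation of the limit in eq. (3)
    return (1.0 - 2.0 * u + copula(u, u)) / (1.0 - u)

theta = 0.5
lam_numeric = upper_tail_dependence(lambda u, v: gumbel_copula(u, v, theta))
lam_exact = 2.0 - 2.0 ** theta                          # closed form for the Gumbel copula
lam_indep = upper_tail_dependence(lambda u, v: u * v)   # independence copula: no tail dependence
```

The same finite-u evaluation applied to the independence copula C(u, v) = uv returns a value close to zero, as expected.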
2 Tail dependence of factor models

2.1 Tail dependence between an asset and one of its explaining factors
Now we state the first part of our main theoretical result. Let us consider two random variables X and Y with cumulative distribution functions F_X and F_Y, where X represents the return of a single stock and Y is, for instance, the market return. Let us also introduce an idiosyncratic noise ε, which is assumed independent of the market return Y. The factor model is defined by the following relationship between the individual stock return X, the market return Y and the idiosyncratic noise ε:

X = β · Y + ε .    (6)

β is the usual coefficient introduced by the Capital Asset Pricing Model (Sharpe 1964). Let us stress that ε may embody other factors Y′, Y″, . . ., as long as they remain independent of Y. Under such conditions and a few other technical assumptions detailed in the theorem established in appendix A.1, the coefficient of (upper) tail dependence between X and Y defined in (2) is obtained as

λ_+ = ∫_{max{1, l/β}}^{∞} dx f(x) ,    (7)
where l denotes the limit, when u → 1, of the ratio of the quantiles of X and Y,

l = lim_{u→1} F_X^{-1}(u) / F_Y^{-1}(u) ,    (8)

and f(x) is the limit, when t → +∞, of t · P_Y(tx)/F̄_Y(t):

f(x) = lim_{t→+∞} t · P_Y(tx) / F̄_Y(t) .    (9)
P_Y is the probability density of Y and F̄_Y = 1 − F_Y is the complementary cumulative distribution function of Y. A similar expression obviously holds, mutatis mutandis, for the coefficient of lower tail dependence. The measure of tail dependence given by equation (7) depends on the two limits defined in (8) and (9) and thus seems difficult to estimate. As it turns out, we will show in the empirical section below that this is not the case. Indeed, the first limit (8) is nothing but a ratio of quantiles, while the second limit (9) can be easily calculated for almost all distributions of the factor. For
instance, let us consider the Pareto distribution F̄_Y(y) = (y/y_0)^{−µ}, defined for y ≥ y_0, whose density is P_Y(y) = (µ/y_0) · (y/y_0)^{−1−µ}; the limit (9) gives f(x) = µ/x^{1+µ}. In contrast, for the exponential law F̄_Y(y) = e^{−ry}, defined for y ≥ 0 with density P_Y(y) = r e^{−ry}, the limit (9) gives f(x) = lim_{t→∞} r t e^{−rt(x−1)} = 0 for x > 1. Thus an estimation of the tail of the factor distribution is sufficient to infer the limit function f(x). Moreover, equation (7) has a rather simple interpretation, since it shows that a non-vanishing coefficient of tail dependence results from the combination of two phenomena. First, the limit function f(x), which only depends on the behavior of the factor distribution, must be non-zero. Second, the constant l must remain finite to ensure that the integral in (7) does not vanish. Thus, the value of the coefficient of tail dependence is controlled by f(x), a function solely of the factor, and by a second variable l quantifying the competition between the tails of the distributions of the factor Y and of the idiosyncratic noise ε. The fundamental result (7) should be of direct interest to financial economists, because it provides a general, rigorous and simple method for estimating one of the key variables embodying the occurrence of, and the risks associated with, extremes in joint distributions. From a theoretical viewpoint, it also anchors the derivation and quantification of a key variable on extremes in the general class of financial factor models, thus extending their use and relevance to the rather novel domain of extreme dependence, extreme risks and extreme losses. Up to now, we have assumed that the factor Y and the idiosyncratic noise ε are independent. In fact, it is important to stress, for the sake of generality, that the result (7) holds even when they are dependent, provided that this dependence is not too strong, as explained and made specific at the end of appendix A.1.
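The two limits just computed can be verified numerically; the following sketch (illustrative only, with arbitrary parameter values and finite-t cutoffs) evaluates t · P_Y(tx)/F̄_Y(t) for the Pareto and exponential examples.

```python
import math

# Pareto factor: survival (y/y0)^(-mu) for y >= y0, density (mu/y0)(y/y0)^(-1-mu)
mu, y0 = 3.0, 1.0
pareto_pdf = lambda y: (mu / y0) * (y / y0) ** (-1.0 - mu)
pareto_sf  = lambda y: (y / y0) ** (-mu)

# Exponential factor: survival e^(-r y), density r e^(-r y)
r = 1.0
expo_pdf = lambda y: r * math.exp(-r * y)
expo_sf  = lambda y: math.exp(-r * y)

def f_limit(pdf, sf, x, t):
    # finite-t approximation of the limit (9): f(x) = lim t * P_Y(t x) / survival(t)
    return t * pdf(t * x) / sf(t)

f_pareto = f_limit(pareto_pdf, pareto_sf, 2.0, t=1e6)   # approaches mu / x^(1+mu) = 3/16
f_expo = f_limit(expo_pdf, expo_sf, 2.0, t=100.0)       # approaches 0 for any x > 1
```

The Pareto case reproduces f(x) = µ/x^{1+µ} at any t (the power law is scale-free), while the exponential ratio collapses to zero as t grows, as the rapidly varying case requires.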
We now derive two direct consequences of this result (7) (see corollaries 1 and 2 in appendix B), concerning rapidly varying and regularly varying factors², which clearly illustrate the role of the factor itself and the impact of the trade-off between the factor and the idiosyncratic noise.
2.2 Absence of tail dependence for rapidly varying factors
Let us assume that the factor Y and the idiosyncratic noise ε are normally distributed (the second assumption is made for simplicity and will be relaxed below). As a consequence, the joint distribution of (X, Y) is the bivariate Gaussian distribution. Referring to the results stated in section 1.1.2, we conclude that the copula of (X, Y) is the Gaussian copula, whose coefficient of tail dependence is zero. In fact, it is easy to show that λ = 0 for any non-degenerate distribution of ε. More generally, let us assume that the distribution of the factor Y is rapidly varying, which describes the Gaussian, the exponential and any distribution decaying faster than any power law. Then, the coefficient of tail dependence is identically zero. This result holds for any arbitrary distribution of the idiosyncratic noise (see corollary 1 in appendix B). It also holds for mixtures of normals or other distributions fatter than Gaussians, some of which are thought to be reasonable approximations to empirical stock return distributions. These statements are somewhat counter-intuitive, since one could expect a priori that the coefficient of tail dependence does not vanish as soon as the tail of the distribution of factor returns is fatter than the tail of the distribution of noise returns. Said differently, when the standard deviation of the idiosyncratic noise ε is small (but not zero), the idiosyncratic noise component is small and X and Y are practically identical, and it seems strange that their tail dependence can be equal
² See Bingham, Goldie, and Teugels (1987) or Embrechts, Klüppelberg, and Mikosch (1997) for a survey of the properties of rapidly and regularly varying functions.
to zero. This non-intuitive result stems from the fact that the tail dependence quantifies not just a dependence but a specific dependence for extreme co-movements. Thus, in order to get a non-vanishing tail dependence, the fluctuations of the factor must be 'wild' enough, which is not realized with rapidly varying distributions, irrespective of the relative values of the standard deviations of the factor and the idiosyncratic noise.
2.3 Coefficient of tail dependence for regularly varying factors

2.3.1 Example of the factor model with Student distribution
In order to account for the power-law tail behavior observed in the distributions of asset returns, it is logical to consider that the factor and the idiosyncratic noise also have power-law tailed distributions. As an illustration, we will assume that Y and ε are distributed according to a Student distribution with the same number of degrees of freedom ν (and thus the same tail exponent ν). Let us denote by σ the scale factor of the distribution of ε, while the scale factor of the distribution of Y is chosen equal to one³. Applying the theorem previously established, we find that f(x) = ν/x^{ν+1} and l = β · [1 + (σ/β)^ν]^{1/ν}, so that the coefficient of tail dependence is

λ_± = 1 / [ 1 + (σ/β)^ν ] ,  β > 0 .    (10)
As expected, the tail dependence increases as β increases and as σ decreases. Since the idiosyncratic volatility of the asset increases when the scale factor σ increases, this result simply means that the tail dependence decreases when the idiosyncratic volatility of a stock increases relative to the market volatility. The dependence with respect to ν is less intuitive. In particular, let ν go to infinity. Then λ → 0 if σ > β and λ → 1 if σ < β. This is surprising, as one could argue that, as ν → ∞, the Student distribution tends to the Gaussian law. As a consequence, one would expect the same coefficient of tail dependence λ_± = 0 as for rapidly varying functions. The reason why λ_± does not always converge to zero as ν → ∞ is rooted in a subtle non-commutativity (and non-uniform convergence) of the two limits ν → ∞ and u → 1. Indeed, when taking first the limit u → 1, the result λ → 1 for β > σ indicates that a sufficiently strong factor coefficient β always ensures the validity of the power-law regime, whatever the value of ν. Correlatively, in this regime β > σ, λ_± is an increasing function of ν. The result (10) is of interest for financial economics because it provides a simple parametric illustration and interpretation of how the risk of large co-movements is affected by the three key parameters entering the definition of the factor model. It allows one to weigh how the ingredients of the factor model impact the large risks captured by λ_±, and thus links the financial basis underlying the factor model to the extreme multivariate risks.

2.3.2 General result
We now provide the general result, valid for any regularly varying distribution. Let the factor Y follow a regularly varying distribution with tail index α: in other words, the complementary cumulative distribution function of Y is such that F̄_Y(y) = L(y) · y^{−α}, where L(y) is a slowly varying
³ Such a choice is always possible via a rescaling of the coefficient β.
function, i.e.

lim_{t→∞} L(ty)/L(t) = 1 ,  ∀y > 0 .    (11)

Corollary 2 in appendix B.2 shows that

λ = 1 / [ max{1, l/β} ]^α ,    (12)
where l denotes the limit, when u → 1, of the ratio F_X^{-1}(u)/F_Y^{-1}(u). In the case of particular interest where the distribution of ε is also regularly varying with tail index α and where, in addition, F̄_Y(y) ∼ C_Y · y^{−α} and F̄_ε(ε) ∼ C_ε · ε^{−α} for large y and ε, the coefficient of tail dependence is a simple function of the ratio C_ε/C_Y of the scale factors:

λ = 1 / ( 1 + β^{−α} · C_ε/C_Y ) .    (13)
When the tail indices α_Y and α_ε of the distributions of the factor and of the residue are different, then λ = 1 for α_Y < α_ε (the factor has the fatter tail) and λ = 0 for α_Y > α_ε (the idiosyncratic noise has the fatter tail), consistently with the result of section 2.2 for rapidly varying factors. The results (12) and (13) are very important, from both a financial and an economic perspective, because they express in the most general and straightforward way the risk of extreme co-movements, quantified by the tail dependence parameter λ, within the important class of factor models. That λ increases with the factor loading β is intuitively clear. Less obvious is the dependence of λ on the structure of the marginal distributions of the factor and of the idiosyncratic noise, which is found to be captured solely by the ratio of their scale factors C_ε and C_Y. The scale factors C_ε and C_Y, together with the coefficient β, thus replace the variance and covariance in their role as the sole quantifiers of the extreme risks occurring in co-movements.

Until now, we have only considered a single asset X. Let us now consider a portfolio of assets X_i, each of them following exactly the one-factor model (6):

X_i = β_i · Y + ε_i ,    (14)

with independent noises ε_i, whose scale factors are C_{ε_i}. The portfolio X = Σ_i w_i X_i, with weights w_i, also follows the factor model, with a parameter β = Σ_i w_i β_i and a noise ε whose scale factor is C_ε = Σ_i |w_i|^α · C_{ε_i}⁴. Thus, equation (13) shows that the tail dependence between the portfolio and the factor is

λ = [ 1 + (Σ_i |w_i|^α · C_{ε_i}) / ( (Σ_i w_i β_i)^α · C_Y ) ]^{−1} .    (15)

When unlimited short sales are allowed, one can follow a "market neutral" strategy yielding β = 0 and thus λ = 0. But in the more realistic case where only limited short sales are authorized, one cannot reach β = 0, and the best portfolio, the one least "correlated" with the large market moves, has to minimize the tail dependence (15). This simple example clearly shows that minimizing the extreme co-movements, according to (15), is very different from minimizing the (linear) correlation ρ between the portfolio and the market factor, given by

ρ = [ 1 + (Σ_i w_i² · Var(ε_i)) / ( (Σ_i w_i β_i)² · Var(Y) ) ]^{−1/2} .    (16)
⁴ In the more realistic case where the ε_i's are not independent but still embody one or several common factors Y′, Y″, . . ., the resulting scale factor C_ε can be calculated with the method described in Bouchaud, Sornette, Walter and Aguilar (1998).
Thus, since the minimum of ρ may be very different from the minimum of λ, minimizing ρ almost surely leads one to accept a level of extreme risk which is not optimal.
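This difference can be made concrete with a toy long-only two-asset example. All parameter values below are hypothetical and chosen so that the tail scale factors and the noise variances pull in opposite directions; a grid search over the weight of the first asset then yields different minimizers for λ of (15) and ρ of (16).

```python
# Toy two-asset, one-factor portfolio; all parameter values are illustrative only.
alpha = 3.0                 # common tail index
betas = [1.0, 1.0]
C_eps = [20.0, 1.0]         # tail scale factors of the idiosyncratic noises
C_Y = 1.0                   # tail scale factor of the factor
var_eps = [1.0, 20.0]       # noise variances (deliberately not aligned with C_eps)
var_Y = 1.0

def lam(w):
    # tail dependence between the portfolio (w, 1-w) and the factor, eq. (15)
    w2 = 1.0 - w
    num = abs(w) ** alpha * C_eps[0] + abs(w2) ** alpha * C_eps[1]
    den = (w * betas[0] + w2 * betas[1]) ** alpha * C_Y
    return 1.0 / (1.0 + num / den)

def rho(w):
    # linear correlation between the portfolio and the factor, eq. (16)
    w2 = 1.0 - w
    num = w ** 2 * var_eps[0] + w2 ** 2 * var_eps[1]
    den = (w * betas[0] + w2 * betas[1]) ** 2 * var_Y
    return (1.0 + num / den) ** -0.5

grid = [i / 1000.0 for i in range(1001)]    # long-only weights w in [0, 1]
w_min_lam = min(grid, key=lam)
w_min_rho = min(grid, key=rho)
```

With these made-up numbers, the λ-minimizing weight concentrates on the asset whose noise has the large tail scale factor (w = 1), while the ρ-minimizing weight concentrates on the asset whose noise has the large variance (w = 0): the two criteria select opposite corners of the feasible set.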
2.4 Tail dependence between two assets related by a factor model
We now present the second part of our theoretical result. Let X_1 and X_2 be two random variables (two assets) with cumulative distribution functions F_1, F_2 and a common factor Y. Let ε_1 and ε_2 be the idiosyncratic noises associated with these two assets X_1 and X_2. We allow the idiosyncratic noises to be dependent random variables, as occurs for instance if they embody the effect of other factors Y′, Y″, . . . which are independent of Y. Our essential assumption is that the distribution of the factor Y must have a tail not thinner than the tails of the distributions of the other factors Y′, Y″, . . . This hypothesis is crucial for detecting the existence of tail dependence. It means that, for the purpose of characterizing tail dependence in factor models, our model can always be restated as a single-factor model in which the single factor is the factor with the thickest tail. This makes our results quite general. The model can then be written as

X_1 = β_1 · Y + ε_1 ,    (17)
X_2 = β_2 · Y + ε_2 .    (18)
We prove in appendix A.2 that the coefficient of (upper) tail dependence λ_+ = lim_{u→1} Pr{X_1 > F_1^{-1}(u) | X_2 > F_2^{-1}(u)} between the assets X_1 and X_2 is given by

λ_+ = ∫_{max{l_1/β_1, l_2/β_2}}^{∞} dx f(x) ,    (19)

which is very similar to the expression found for the tail dependence between an asset and one of its explaining factors (see equation (7)). As previously, l_{1,2} denotes the limit, when u → 1, of the ratio F_{1,2}^{-1}(u)/F_Y^{-1}(u), and f(x) is the limit, when t → +∞, of t · P_Y(tx)/F̄_Y(t). The result (19) can be cast in a different, illuminating way. Let λ(X_1, Y) (resp. λ(X_2, Y)) denote the coefficient of tail dependence between the asset X_1 (resp. X_2) and the common factor Y, and let λ(X_1, X_2) denote the tail dependence between the two assets. Equation (19) allows us to assert that

λ(X_1, X_2) = min{λ(X_1, Y), λ(X_2, Y)} .    (20)

The tail dependence between the two assets X_1 and X_2 is nothing but the smallest of the tail dependences between each asset and the common factor. Therefore, a decrease of the tail dependence between the assets and the market automatically leads to a decrease of the tail dependence between the two assets. This result also shows that it is sufficient to study the tail dependence between the assets and their common factor in order to obtain the tail dependence between any pair of assets. The result (20) is also useful in the context of portfolio analysis. Not only does it provide a tool for assessing the probability of large losses of a portfolio composed of assets driven by a common factor, it also allows us to define novel strategies of portfolio optimization based on the selection and weighting of stocks chosen so as to balance the risks associated with extreme co-movements. Such an approach has been tested in (Malevergne and Sornette 2002) with encouraging results.
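Relation (20) can be illustrated numerically (an independent sketch, with made-up values of β_i and the quantile ratios l_i): evaluating the integrals (7) and (19) with the Student-type limit function f(x) = ν x^{−ν−1} by a crude midpoint rule reproduces the min rule.

```python
NU = 3.0

def f(x, nu=NU):
    # limit function (9) for a regularly varying (Student-like) factor with tail index nu
    return nu * x ** (-nu - 1.0)

def tail_integral(lower, nu=NU, upper=1000.0, steps=200_000):
    # midpoint-rule approximation of the integral of f from `lower` to infinity,
    # truncated at `upper` (the neglected mass is upper**-nu, here 1e-9)
    h = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * h, nu) for i in range(steps)) * h

# Hypothetical betas and quantile ratios l_i of eq. (8) for two assets
beta1, l1 = 1.0, 1.26
beta2, l2 = 0.8, 1.60

lam1 = tail_integral(max(1.0, l1 / beta1))             # eq. (7): lambda(X1, Y)
lam2 = tail_integral(max(1.0, l2 / beta2))             # eq. (7): lambda(X2, Y)
lam12 = tail_integral(max(l1 / beta1, l2 / beta2))     # eq. (19): lambda(X1, X2)
```

Since the lower bound of (19) is the larger of the two individual lower bounds, λ(X_1, X_2) coincides with the smaller of λ(X_1, Y) and λ(X_2, Y), which is exactly statement (20).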
3 Empirical study
We now apply our theoretical results to the daily returns of a set of stocks traded on the New York Stock Exchange. In order to estimate the parameters of the factor model (6), the Standard and
Poor's 500 index is chosen to represent the common "market factor." It has been preferred to, for instance, the Dow Jones Industrial Average because, although less diversified than the whole market, it represents about 80% of the total market capitalization. We describe the set of selected stocks in the next subsection. Next, we estimate the parameter β in (6) and check the independence of the market returns and the residues. Then, applying the commonly used hypothesis according to which the tails of the distributions of asset returns are power laws, or at least regularly varying (see Longin (1996), Lux (1996), Pagan (1996), or Gopikrishnan, Meyer, Amaral, and Stanley (1998)), we estimate the tail index and the scale factor of these distributions, which allows us to calculate the coefficient of tail dependence between each asset return and the market return. Finally, we perform an analysis of the historical data to check the compatibility of our prediction with the fraction of realized large losses of the assets that occur simultaneously with the large losses of the market. The results of our analysis are reported below in terms of the returns rather than in terms of the excess returns above the risk-free interest rate, in apparent contradiction with the prescription of the CAPM. However, for daily returns, the difference between returns and excess returns is negligible. Indeed, we checked that neglecting this difference does not affect our results by re-running the entire study described below in terms of excess returns, and found that the tail dependence did not change by more than 0.1%.
3.1 Description of the data
We study a set of twenty assets traded on the New York Stock Exchange. The criteria presiding over the selection of the assets (see column 1 of table 1) are that (1) they are among the stocks with the largest capitalizations, but (2) each of them should have a weight smaller than 1% in the Standard and Poor's 500 index, so that the dependence studied here does not stem trivially from their overlap with the market factor (taken as the Standard and Poor's 500 index). The time interval we consider ranges from July 03, 1962 to December 29, 2000, corresponding to 9694 data points, and represents the largest set of daily data available from the Center for Research in Security Prices (CRSP). This large time interval is important, as it lets us collect as many large fluctuations of the returns as possible in order to sample the extreme tail dependence. Moreover, in order to account for a possible non-stationarity over the four decades of the study, to check the stability of our results and to test the stationarity of the tail dependence over time, we split this set into two subsets. The first one ranges from July 1962 to December 1979, a period with few very large return amplitudes, while the second one ranges from January 1980 to December 2000, a period which witnessed several very large price changes (see table 1, which shows the good stability of the standard deviation between the two sub-periods, while the higher cumulants, such as the excess kurtosis, often increased dramatically in the second sub-period for most assets). Table 1 presents the main statistical properties of our set of stocks during the three time intervals. All assets exhibit an excess kurtosis significantly different from zero over the three time intervals, which is inconsistent with the assumption of Gaussianly distributed returns. While the standard deviations remain stable over time, the excess kurtosis increases significantly from the first to the second period.
This is in resonance with the financial community's belief that stock price volatility has increased over time, a still controversial result (Jones and Wilson (1989), Campbell, Lettau, Malkiel, and Xu (2001), or Xu and Malkiel (2002)).
3.2 Calibration of the factor model
The determination of the parameter β and of the residues ε entering the definition of the factor model (6) is performed, for each asset, by regressing the stock returns on the market returns. The coefficient β is thus given by the ordinary least squares estimator, which is consistent as long as the residues are weak white noise with zero mean and finite variance. The idiosyncratic noise ε is obtained by subtracting β times the market return from the stock return. Table 2 presents the results for the three periods we consider. For each period, we give the value of the estimated coefficient β (first column of table 2 for each time interval). We then calculate the correlation coefficient between the market returns and the estimated idiosyncratic noise. All of them are less than 10^{−8}, so that none is significantly different from zero, which allows us to conclude that there is no linear correlation between the factor and the residues. To push the test of the independence hypothesis one step further, we have estimated the correlation coefficient between the square of the factor and the square of the error terms. In table 2, their values are given in the second of the pair of columns presented for each period. A Fisher test shows that, at the 95% confidence level, all these correlation coefficients are significantly different from zero. This result is not surprising and shows the existence of small but significant correlations between the market volatility and the idiosyncratic volatility. However, this does not invalidate the empirical tests of our theoretical results, since these hold even in the presence of a weakly dependent factor and noise. The coefficients β obtained by regressing each asset's returns on the Standard & Poor's 500 returns are, within their uncertainties, very close to the β's given by the CRSP database, which are estimated by regressing the asset returns on the value-weighted market portfolio.
Thus, the choice of the Standard and Poor’s 500 index to represent the whole market portfolio is reasonable.
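The calibration step just described can be sketched as follows on synthetic data (a stand-in for the CRSP daily returns; the Gaussian noise and all parameter values are assumptions for illustration):

```python
import random

random.seed(1)

# Synthetic stand-in for the market factor and one stock (N daily returns)
N = 10_000
beta_true = 1.1
market = [random.gauss(0.0, 0.01) for _ in range(N)]
stock = [beta_true * m + random.gauss(0.0, 0.015) for m in market]

def ols_beta(x, y):
    # ordinary least squares slope of y on x, calibrating eq. (6)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def corr(x, y):
    # sample linear correlation coefficient
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    cxx = sum((a - mx) ** 2 for a in x)
    cyy = sum((b - my) ** 2 for b in y)
    return cxy / (cxx * cyy) ** 0.5

beta_hat = ols_beta(market, stock)
residues = [s - beta_hat * m for s, m in zip(stock, market)]

# By construction of OLS, the residues are linearly uncorrelated with the factor
resid_corr = corr(market, residues)
```

The residual-factor correlation comes out below 10⁻⁸ purely by the orthogonality property of least squares, which is why the paper treats the observed values of that order as compatible with zero.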
3.3 Estimation of the tail indexes
Assuming that the distributions of stock and market returns are asymptotically power laws (Longin (1996), Lux (1996), Pagan (1996) or Gopikrishnan, Meyer, Amaral, and Stanley (1998)), we now estimate the tail index of the distribution of each stock and of its corresponding residue in the factor model, both for the positive and negative tails. Each tail index α is given by Hill's estimator:

α̂ = [ (1/k) Σ_{j=1}^{k} ( log x_{j,N} − log x_{k,N} ) ]^{−1} ,    (21)
where x_{1,N} ≥ x_{2,N} ≥ · · · ≥ x_{N,N} denote the order statistics of the sample containing N independent and identically distributed realizations of the variable X. Hill's estimator is asymptotically normally distributed with mean α and variance α²/k. But, for finite k, the estimator is known to be biased. As the range k increases, the variance of the estimator decreases while its bias increases. The competition between these two effects implies that there is an optimal choice k = k* which minimizes the mean squared error of the estimator. To select this value k*, one can apply the algorithm of Danielsson and de Vries (1997), which improves on the subsample bootstrap procedure of Hall (1990). One can also prefer the more recent algorithm of Danielsson, de Haan, Peng, and de Vries (2001) for the sake of parsimony. We have tested all three algorithms to determine the optimal k*. It turns out that the algorithm of Danielsson, de Haan, Peng, and de Vries (2001), developed for high-frequency data, is not well adapted to samples containing fewer than 100,000 data points, as is the case here. Thus, we have focused on the two other algorithms. An accurate determination of k* is rather difficult with either of
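Hill's estimator (21) is straightforward to implement; the sketch below (an illustrative pure-Python version, not the paper's code) recovers the tail index of a synthetic Pareto sample, with k taken at the 1% quantile, in the range found relevant here.

```python
import math
import random

def hill_estimator(sample, k):
    # Hill's estimator (21), built on the k largest order statistics
    xs = sorted(sample, reverse=True)     # x_{1,N} >= x_{2,N} >= ... >= x_{N,N}
    x_k = xs[k - 1]                       # threshold order statistic x_{k,N}
    return k / sum(math.log(xs[j] / x_k) for j in range(k))

# Synthetic Pareto sample with tail index alpha = 3 (inverse-transform sampling)
random.seed(42)
alpha_true = 3.0
N = 100_000
sample = [random.random() ** (-1.0 / alpha_true) for _ in range(N)]

alpha_hat = hill_estimator(sample, k=N // 100)   # k at the 1% quantile
```

On an exact Pareto sample the estimator is unbiased, so α̂ falls within a few standard errors (α/√k ≈ 0.1 here) of the true value; on real returns the bias-variance trade-off over k discussed above becomes the dominant issue.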
them, but in every case we found that the relevant range for the tail index estimation lies between the 1% and 5% quantiles. Tables 3 and 4 give the estimated tail index for each asset and each residue at the 1%, 2.5% and 5% quantiles, for both the positive and the negative tails, for the two time sub-intervals. The second time interval, from January 1980 to December 2000, is characterized by values of the tail indexes that are homogeneous over the various quantiles and range between 3 and 4 for the negative tails and between 3 and 5 for the positive tails. There is slightly more dispersion in the first time interval, from July 1962 to December 1979. For each asset and its residue from the regression on the market factor, we tested whether the hypothesis according to which the tail index measured for each asset and each residue is the same as the tail index of the Standard & Poor's 500 index can be rejected at the 95% confidence level, for a given quantile. Before proceeding with the presentation of our tests, two caveats have to be accounted for. First, due to the phenomenon of volatility clustering in financial time series, extremes are more likely to occur together. In this situation, Hill's estimator is no longer normally distributed with variance α²/k. In fact, for weakly dependent time series, it can only be asserted that the estimator remains consistent (see Rootzén and de Haan (1998)). Moreover, as shown by Kearns and Pagan (1997) for heteroskedastic time series, the variance of the estimated tail index can be seven times larger than the variance given by the asymptotic normality assumption. Second, the idiosyncratic noise is estimated by subtracting β times the factor from the asset return. Thus, even when the factor and the error term are independent, the empirically estimated residues depend on the realizations of the factor. As a consequence, the tail index estimators for the factor and for the idiosyncratic noise are correlated.
This correlation obviously depends on the exact form of the distributions of the factor and of the idiosyncratic noise. Even without knowledge of the true test statistics, for both problematic points we can assert that the fluctuations of the estimators are larger than those given by the asymptotically normal statistics for i.i.d. realizations. Thus, performing the test under the asymptotic normality assumption is more constraining than under the true (but unknown) test statistics, so that the non-rejection of the equality hypothesis under the assumption of a normally distributed estimator ensures that we would not be able to reject this hypothesis under the true statistics of the estimator. The values which reject the equality hypothesis are indicated by a star in tables 3 and 4. During the second time interval, from January 1980 to December 2000, only four residues have a tail index significantly different from that of the Standard & Poor's 500, and only in the negative tail. The situation is not as good during the first time interval, especially for the negative tail, for which no fewer than 13 assets and 10 residues out of 20 have a tail index significantly different from that of the Standard & Poor's 500, for the 5% quantile. Recall that the equality tests have been performed under the assumption of a normally distributed estimator with variance α²/k which, as explained above, is too strong a hypothesis. As a consequence, a rejection under the normality hypothesis does not necessarily imply that the equality hypothesis would have been rejected under the true statistics. While providing a note of caution, this statement is nevertheless not very useful from a practical point of view. More importantly, we stress that the equality of the tail indices of the distributions of the factor and of the idiosyncratic noise is not crucial. Indeed, we shall propose below two different estimators for the coefficient of tail dependence.
One of them does not rely on the equality of these two tail indices and thus remains operational even when they are different, and in particular when the tail index of the idiosyncratic noise appears larger than that of the factor. To summarize, our tests confirm that the tail indexes of most stock return distributions range between three and four, even though no better precision can be given with good significance. Moreover, in most cases, we can assume that the asset, the factor and the residue all have the
same tail index. We can also add that, as asserted by Loretan and Phillips (1994) or Longin (1996), we cannot reject the hypothesis that the tail index remains the same over time. Nevertheless, it seems that during the first period, from July 1962 to December 1979, the tail indexes were slightly larger than during the second period, from January 1980 to December 2000.
3.4 Determination of the coefficient of tail dependence
Using the empirical fact just established, namely that we cannot reject the hypothesis that the assets, the market and the residues have the same tail index, we can use the theorem of Appendix A and its second corollary stated in section 2. This allows us to conclude that one cannot reject the hypothesis of a non-vanishing tail dependence between the assets and the market. In addition, the coefficient of tail dependence is given by equations (12) and (13). These equations provide two ways of estimating the coefficient of tail dependence: non-parametric with (12) and parametric with (13). The first one is more general, since it only requires the hypothesis of regular variation, while the second one explicitly assumes that the factor and the residues have distributions with power-law tails.

To estimate the tail dependence according to equation (12), we only need to determine the constant l defined in (8). Consider N sorted realizations of X and Y, denoted by x_{1,N} ≥ x_{2,N} ≥ ··· ≥ x_{N,N} and y_{1,N} ≥ y_{2,N} ≥ ··· ≥ y_{N,N}; the quantiles F_X^{-1}(u) and F_Y^{-1}(u) are estimated by

    F̂_X^{-1}(u) = x_{[(1−u)·N],N}   and   F̂_Y^{-1}(u) = y_{[(1−u)·N],N},    (22)

where [·] denotes the integer part. Thus, the constant l is non-parametrically estimated by

    l̂_k = x_{k,N} / y_{k,N},   as k → 0 or k → N.    (23)
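The stability of the quantile ratio (23) can be checked numerically. The following sketch uses simulated Pareto samples with tail index α = 3; all parameter values (sample size, scale factors) are illustrative assumptions, not the data studied in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two regularly varying samples with the same tail index alpha = 3 but
# different scale factors, standing in for the asset X and the factor Y.
alpha, N = 3.0, 20_000
x = 2.0 * rng.pareto(alpha, N)   # scale factor 2.0 is an arbitrary choice
y = rng.pareto(alpha, N)

# Rank-ordered samples x_{1,N} >= ... >= x_{N,N}, as in equation (22)
xs = np.sort(x)[::-1]
ys = np.sort(y)[::-1]

# Non-parametric estimate l_hat_k = x_{k,N} / y_{k,N} of equation (23);
# the ratio should form a plateau over a range of small ranks k.
for k in (50, 100, 200, 500):
    print(k, xs[k - 1] / ys[k - 1])
```

Here the ratio stays close to the imposed scale ratio of 2 over a wide range of ranks, mimicking the plateau used in the text to read off l.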
As u goes to zero or one (or k goes to zero or N), the number of observations decreases dramatically. However, we observe a large interval of small or large k's over which the ratio of the empirical quantiles remains remarkably stable, which allows for an accurate estimation of l. A more precise estimation could be performed with a kernel-based quantile estimator (see Sheather and Marron (1990) or Pagan and Ullah (1999), for instance). A non-parametric estimator for λ is then obtained by replacing l by its estimated value in equation (12):

    λ̂_NP = 1 / max{1, l̂/β̂}^α = 1 / max{1, x_{k,N} / (β̂ · y_{k,N})}^α.    (24)
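A minimal sketch of the non-parametric estimator (24), on data simulated from a one-factor model where β, α and the noise scale are known by construction (β = 0.8, α = 3 and the 0.5 noise scale are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, N = 3.0, 0.8, 50_000
y = rng.pareto(alpha, N)           # factor with a power-law tail
eps = 0.5 * rng.pareto(alpha, N)   # idiosyncratic noise, same tail index
x = beta * y + eps                 # one-factor model

xs = np.sort(x)[::-1]
ys = np.sort(y)[::-1]

k = 250                                         # rank in the far tail
l_hat = xs[k - 1] / ys[k - 1]                   # equation (23)
lam_np = 1.0 / max(1.0, l_hat / beta) ** alpha  # equation (24)
print(lam_np)
```

In practice β would itself be estimated from the regression of x on y; here it is known by construction, so the sketch isolates the quantile-ratio part of the estimator.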
It can also be advantageous to follow a parametric approach, which generally allows for a more accurate estimation of (the ratio of) the quantiles, provided that the assumed parametric form of the distributions is not too far from the true one. For this purpose, we will use formula (13), which requires the estimation of the scale factors of the different assets. To get the scale factors, we proceed as follows. Consider a variable X which asymptotically follows a power-law distribution, Pr{X > x} ∼ C · x^{−α}. Given a rank-ordered sample x_{1,N} ≥ x_{2,N} ≥ ··· ≥ x_{N,N}, the scale factor C can be consistently estimated from the k largest realizations by

    Ĉ = (k/N) · (x_{k,N})^α.    (25)
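The rank-independence check of equation (25) can be illustrated on an exact Pareto sample with a known scale factor (C = 2 and α = 3 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, C_true, N = 3.0, 2.0, 100_000

# Inverse-transform sampling from Pr{X > x} = C x**(-alpha), x >= C**(1/alpha)
u = rng.uniform(size=N)
x = (C_true / u) ** (1.0 / alpha)

xs = np.sort(x)[::-1]
for k in (100, 300, 1000, 3000):
    C_hat = (k / N) * xs[k - 1] ** alpha   # equation (25)
    print(k, C_hat)
```

The estimates stay close to C for every rank k, which is the signature of a genuine power-law tail; a systematic drift of Ĉ with k would signal a misspecified tail.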
The estimated value of the scale factor must not depend on the rank k, for k large enough, in order for the parameterization of the distribution in terms of a power law to hold true. Thus, denoting
9. Mesure de la dépendance extrême entre deux actifs financiers
by Ĉ_Y and Ĉ_ε the estimated scale factors of the factor Y and of the noise ε defined in equation (6), the estimator of the coefficient of tail dependence is

    λ̂ = 1 / (1 + β̂^{−α} · Ĉ_ε/Ĉ_Y) = 1 / (1 + (ε_{k,N} / (β̂ · y_{k,N}))^α),    (26)
where β̂ denotes the estimated coefficient β. Since the estimators Ĉ_Y, Ĉ_ε and β̂ are consistent, the continuous mapping theorem allows us to assert that the estimator λ̂ is also consistent.

Since the tail indices α are impossible to determine with sufficient accuracy, other than saying that they probably fall in the interval 3–4 as we have seen above, our strategy is to determine the coefficient of tail dependence using (24) and (26) for three different common values, α = 3, 3.5 and 4. This procedure allows us to test the sensitivity of the scale factor, and therefore of the tail coefficient, with respect to the uncertain value of the tail index.

Tables 5 and 6 give the values of the coefficients of lower tail dependence over the whole time interval from July 1962 to December 2000, under the assumption that the tail index α equals 3, for the non-parametric estimator (table 5) and the parametric one (table 6). For each table, the coefficient of tail dependence is estimated over the first centile, the first quintile and the first decile, to also test for any possible sensitivity to the tail asymptotics. For each of these quantiles, the mean values, their standard deviations and their minimum and maximum values are given.

We first remark that the standard deviation of the tail dependence coefficient remains small compared with its average value and that the minimum and maximum values cluster closely around the mean. This shows that the coefficient of tail dependence is well estimated by its mean over a given quantile. Secondly, we find that these estimated coefficients of tail dependence exhibit a good stability over the various quantiles. These two observations enable us to conclude that the average coefficient of tail dependence over the first centile is sufficient to provide a good estimate of the true coefficient of tail dependence.
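The parametric estimator (26) can be sketched end to end on simulated data; the model parameters (β = 0.8, α = 3, noise scale 0.5) are illustrative assumptions, and a simple least-squares slope stands in for the estimated factor loading:

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, beta, N = 3.0, 0.8, 100_000
y = rng.pareto(alpha, N)            # factor
eps = 0.5 * rng.pareto(alpha, N)    # residue with the same tail index
x = beta * y + eps

# Least-squares slope as a stand-in for the estimated factor loading
beta_hat = np.cov(x, y)[0, 1] / np.var(y, ddof=1)

# Scale factors by the rank-ordering estimator (25)
k = 1000
C_Y = (k / N) * np.sort(y)[::-1][k - 1] ** alpha
C_eps = (k / N) * np.sort(eps)[::-1][k - 1] ** alpha

lam = 1.0 / (1.0 + beta_hat ** (-alpha) * C_eps / C_Y)   # equation (26)
print(lam)
```

Because α enters both scale factors through (25), a small error in α shifts Ĉ_Y and Ĉ_ε in the same direction, which is the compensation mechanism invoked in the text to explain the robustness of this estimator.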
Note that the two estimators yield essentially equivalent results, even if the coefficients of tail dependence given by the non-parametric estimator exhibit a systematic tendency to be slightly smaller than the estimates provided by the parametric one. Since the results given by these two estimators are very close to each other, we choose to present below only those given by the parametric one. This choice has also been guided by the lower sensitivity of this estimator to small changes in the tail exponent α. Indeed, since the evaluation of the scale factors Ĉ_Y and Ĉ_ε by formula (25) involves the tail exponent α, small deviations from its true value are compensated by the estimated scale factors. This explains the observation that the parametric estimator appears more robust than the non-parametric one with respect to small changes in α.

Tables 7, 8 and 9 summarize the different values of the coefficient of tail dependence for both the positive and the negative tails, under the assumptions that the tail index α equals 3, 3.5 and 4 respectively, over the three considered time intervals. Overall, we find that the coefficients of tail dependence are almost equal in the negative and the positive tails and that they are not very sensitive to the value of the tail index in the interval considered. More precisely, during the first time interval, from July 1962 to December 1979 (table 7), the tail dependence is symmetric in the upper and the lower tails. During the second time interval, from January 1980 to December 2000, and over the whole time interval (tables 8 and 9), the coefficient of lower tail dependence is slightly but systematically larger than the upper one. Moreover, since these coefficients of tail dependence are all less than 1/2, they decrease when the tail index α increases, and the smaller the coefficient of tail dependence, the larger the decay.
During the first time interval, most of the coefficients of tail dependence range between 0.15 and 0.35 in both tails, while during the second time interval, almost all range between 0.10 and 0.25 in the lower tail and between 0.10 and 0.20 in the upper one. Thus, the tail dependence is smaller during the last period compared with the first one. This result is interesting because it agrees with and confirms the recent studies by Campbell, Lettau, Malkiel, and Xu (2001) and Xu and Malkiel (2002), showing that the idiosyncratic volatility of each stock has increased relative to the market volatility. And, as already discussed, the coefficient of tail dependence given by equation (10) must decrease when the idiosyncratic volatility of the stocks increases relative to the market volatility.

The strong similarity of the tail dependencies in the upper and lower tails is an interesting empirical finding which suggests that extreme co-movements reflect behaviors of agents that are more sensitive to large amplitudes than to a specific direction (loss or gain). Pictorially, the specific mechanism triggering co-movements of extreme amplitudes may well be different for losses compared with gains, such as fear for the former and greed for the latter, but the resulting large co-movements have similar frequencies of occurrence.

The observed lack of stationarity of the coefficient of tail dependence across the two time sub-intervals suggests that it could be necessary to have a model where the tail dependence index is not constant but varies as a function of past shocks (just as the volatility varies with time in a GARCH model), in order to investigate whether large recent common shocks lead to higher future tail dependence. This point is beyond the scope of the present study, but we will provide in our concluding remarks some ways of explicitly accounting for this lack of stationarity.
3.5  Comparison with the historical extremes
Our determination of the coefficients of tail dependence provides predictions on the probability that future large moves of stocks may be simultaneous with large moves of the market. This begs for a check over the available historical period, to determine whether our estimated coefficients of tail dependence are compatible with the realized historical extremes. For this, we consider the ten largest losses of the Standard & Poor's 500 index during the two time sub-intervals5. Since λ− is by definition equal to the probability that a given asset incurs a large loss (say, one of its ten largest losses) conditional on the occurrence of one of the ten largest losses of the Standard & Poor's 500 index, the probability, for this asset, to undergo n of its ten largest losses simultaneously with any of the ten largest losses of the Standard & Poor's 500 index is given by the binomial law with parameter λ−:

    P_{λ−}(n) = (10 choose n) · λ−^n · (1 − λ−)^{10−n}.    (27)

We stress that our consideration of only the ten largest drops ensures that the present test is not embodied in the determination of the tail dependence coefficient, which has been determined by a robust procedure over the 1%, 5% and 10% quantiles. We checked that removing these ten largest drops does not modify the determination of λ−. Our present test can thus be considered as "out-of-sample" in this sense.

Table 10 presents, for the two time sub-intervals, the number of extreme losses, among the ten largest losses incurred by a given asset, which occurred simultaneously with one of the ten largest losses of the Standard & Poor's 500 index. For each asset, we give the probability of occurrence of such a realisation, according to (27). We notice that during the first time interval, only two assets

5 We do not consider the whole time interval since the ten largest losses over the whole period coincide with the ten largest ones over the second time sub-interval, which would bias the statistics towards the second time interval.
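The binomial computation of equation (27) is elementary; the following sketch (with an illustrative value λ− = 0.2, of the order of those reported in the text) shows how a compatibility test at the 95% level can be carried out:

```python
from math import comb

def p_coincidences(n, lam, m=10):
    """Binomial probability (27) of n coincidences among the m largest losses."""
    return comb(m, n) * lam ** n * (1.0 - lam) ** (m - n)

lam_minus = 0.2   # illustrative value of the tail-dependence coefficient
probs = [p_coincidences(n, lam_minus) for n in range(11)]
print(sum(probs))   # the probabilities sum to 1

# A realized count n_obs is incompatible at the 95% confidence level
# when the tail probability Pr{n >= n_obs} falls below 5%.
n_obs = 6
p_tail = sum(p_coincidences(n, lam_minus) for n in range(n_obs, 11))
print(p_tail < 0.05)
```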
are incompatible, at the 95% confidence level, with the value of λ− previously determined: Du Pont (E.I.) de Nemours & Co. and Texas Instruments Inc. In contrast, during the second time interval, four assets reject the value of λ−: Coca Cola Corp., Pepsico Inc., Pharmacia Corp. and Texaco Inc. These results are very encouraging. However, there is a noticeable systematic bias. Indeed, during the first time interval, 17 out of the 20 assets have a realized number of large losses lower than their expected number (according to the estimated λ−), while during the second time interval, 19 out of the 20 assets have a realized number of large losses larger than their expected one. Thus, it seems that during the first time interval the number of large losses is overestimated by λ−, while it is underestimated during the second time interval.

We propose to explain the underestimation of the number of large losses between January 1980 and December 2000 by a possible comonotonicity that occurred during the October 1987 crash. Indeed, on October 19, 1987, 12 out of the 20 considered assets incurred their most severe loss, which strongly suggests a comonotonic effect. Table 11 shows the same results as table 10, but corrected by subtracting this comonotonic effect from the number of large losses. The compatibility between the number of large losses and the estimated λ− becomes significantly better, since only Pepsico Inc. and Pharmacia Corp. are still rejected, and only 16 assets out of 20 are underestimated, representing a slight decrease of the bias. Previous works have shown that, in periods of crashes, the market conditions change and herding effects may become more important, almost dominant, so that the market enters an unusual regime, which can be characterized by outliers in the distribution of drawdowns (Johansen and Sornette 2002).
Our detection of an anomalous comonotonicity can thus be considered as an independent confirmation of the existence of this abnormal regime. Another explanation for this slight discrepancy may be ascribed to a limitation of the CAPM. Indeed, the CAPM is known to explain the relation between the expected return on an asset and its amount of systematic risk. But it is questionable whether extreme systematic risks, such as those measured by the coefficient of tail dependence, are really accounted for by the economic agents and thus effectively priced.

Concerning the overestimation of the number of large losses during the first time interval, it can obviously not be ascribed to the comonotonicity of very large events, which in fact only occurred once, for the Coca-Cola Corp. This overestimation is probably linked with the low "volatility" of the market during this period, which can have two effects. The first one is to lead to a less accurate estimation of the scale factor of the power-law distribution of the assets. The second one is that a market with smaller volatility produces fewer large losses. As a consequence, the asymptotic regime for which the relation Pr{X < F_X^{-1}(u) | Y < F_Y^{-1}(u)} ≃ λ− holds may not be reached in the sample, and the number of recorded large losses remains lower than that asymptotically expected.
4  Concluding remarks
We have used the framework offered by factor models in order to derive a general theoretical expression for the coefficient of tail dependence between an asset and any of its explanatory factors, or between any two assets. The coefficient of tail dependence represents the probability that a given asset incurs a large loss (say), assuming that the market (or another asset) has also undergone a large loss. We find that factors characterized by rapidly varying distributions, such as Normal or exponential distributions, always lead to a vanishing coefficient of tail dependence with other stocks. In contrast, factors with regularly varying distributions, such as power-law distributions, can exhibit tail dependence with other stocks, provided that the idiosyncratic noise distributions of the corresponding stocks are not fatter-tailed than the factor.

Applying this general result to individual daily stock returns, we have been able to estimate the coefficient of tail dependence between the returns of each stock and those of the market. This determination of the tail dependence relies only on the simple estimation of the parameters of the underlying factor model and on the tail parameters of the distribution of the factor and of the idiosyncratic noise of each stock. As a consequence, the two strong advantages of our approach are the following.

- The coefficients of tail dependence are estimated non-parametrically. Indeed, we never specify any explicit expression of the dependence structure, contrary to most previous works (see Longin and Solnik (2001), Malevergne and Sornette (2001) or Patton (2001), for instance);
- Our theoretical result enables us to estimate an extreme parameter, not accessible by direct statistical inference. This is achieved by the measurement of parameters whose estimation involves a significant part of the data, ensuring sufficient statistics.

Having performed this estimation, we have checked the compatibility of these estimated coefficients of tail dependence with the historically realized extreme losses observed in the empirical time series. A good agreement is found, notwithstanding a slight bias which leads to an overestimate of the occurrence of large events during the period from July 1962 to December 1979 and to an underestimate during the time interval from January 1980 to December 2000. This bias can be explained by the low volatility of the market during the first period and by a comonotonicity effect, due to the October 1987 crash, during the second period.
Indeed, from July 1962 to December 1979, the volatility was so low that the distributions of returns have probably not sampled their tails sufficiently for the probability of large conditional losses to be represented by its asymptotic expression given by the coefficient of tail dependence. The situation is very different for the period from January 1980 to December 2000. On October 19, 1987, many assets incurred their largest loss ever. This is presumably the manifestation of an 'abnormal' regime, probably due to herding effects and irrational behaviors, which has been previously characterized as yielding signatures in the form of outliers in the distribution of drawdowns.

Finally, the observed lack of stationarity exhibited by the coefficient of tail dependence across the two time sub-intervals suggests the importance of going beyond a stationary view of tail dependence and of studying its dynamics. This question, which could be of great interest in the context of the contagion problem, could easily be treated with the new conditional quantile dynamics proposed by Engle and Manganelli (1999). Moreover, it would be interesting to account for the change of the β's with incoming bad or good news, as shown by Cho and Engle (2000), for instance. These points are left for a future work.

From a practical point of view, we stress that the coefficient λ studied here can be seen as a generalization of, or a tool complementary to, the CAPM's β. These two coefficients have in common that they probe the dependence between a given stock and the market. However, the coefficient β quantifies only the correlations between moderate movements of both an asset and the market. In contrast, the coefficient λ offers a measure of extreme co-movements, which is particularly useful in periods of high market volatility. In such periods, a prudent fund manager should overweight the portfolio with assets whose λ is very small, such as Texaco or Walgreen, for instance. Moreover, the observed decrease of the tail dependence during the last period, concomitant with the increase of the idiosyncratic volatility, suggests that the main source of risk in such a period does not consist in the dependence between assets but rather in their intrinsic fluctuations, measured by the idiosyncratic volatility.

Our study has focused on the dependence between different risks. In fact, our theorem can obviously be applied to extreme temporal dependences, when the variable follows an autoregressive process. This should provide an estimate of the probability that a large loss (respectively gain) is followed by another large loss (resp. gain) in the following period. Such information is very interesting for investment and hedging strategies.
A  Proof of the theorem

A.1  Tail dependence between an asset and the factor

A.1.1  Statement
We consider two random variables X and Y, related by the relation

    X = β · Y + ε,    (28)
where ε is a random variable independent of Y and β a non-random positive coefficient. Let P_Y and F_Y denote respectively the density with respect to the Lebesgue measure and the distribution function of the variable Y. Let F_X denote the distribution function of X and F_ε the distribution function of ε. We state the following theorem:

Theorem 1  Assuming that

H0: the variables Y and ε have distribution functions with infinite support,

H1: for all x ∈ [1, ∞),

    lim_{t→∞} t P_Y(t x) / F̄_Y(t) = f(x),    (29)

H2: there are real numbers t_0 > 0, δ > 0 and A > 0 such that, for all t ≥ t_0 and all x ≥ 1,

    F̄_Y(t x) / F̄_Y(t) ≤ A / x^δ,    (30)

H3: there is a constant l ∈ R_+ such that

    lim_{u→1} F_X^{-1}(u) / F_Y^{-1}(u) = l,    (31)

then the coefficient of (upper) tail dependence of (X, Y) is given by

    λ = ∫_{max{1, l/β}}^{∞} dx f(x).    (32)

A.1.2  Proof
We first give a general expression for the probability for X to be larger than F_X^{-1}(u) knowing that Y is larger than F_Y^{-1}(u):

Lemma 1  The probability that X is larger than F_X^{-1}(u) knowing that Y is larger than F_Y^{-1}(u) is given by:

    Pr[X > F_X^{-1}(u) | Y > F_Y^{-1}(u)] = (F_Y^{-1}(u) / (1 − u)) ∫_1^∞ dx P_Y(F_Y^{-1}(u) x) · F̄_ε[F_X^{-1}(u) − β F_Y^{-1}(u) x].    (33)
Proof:

    Pr{X > F_X^{-1}(u), Y > F_Y^{-1}(u)} = E[1_{X > F_X^{-1}(u)} · 1_{Y > F_Y^{-1}(u)}]    (34)
        = E[E[1_{X > F_X^{-1}(u)} · 1_{Y > F_Y^{-1}(u)} | Y]]    (35)
        = E[1_{Y > F_Y^{-1}(u)} · E[1_{X > F_X^{-1}(u)} | Y]]    (36)
        = E[1_{Y > F_Y^{-1}(u)} · E[1_{ε > F_X^{-1}(u) − βY}]]    (37)
        = E[1_{Y > F_Y^{-1}(u)} · F̄_ε(F_X^{-1}(u) − βY)].    (38)

Assuming that the variable Y admits a density P_Y with respect to the Lebesgue measure, this yields

    Pr{X > F_X^{-1}(u), Y > F_Y^{-1}(u)} = ∫_{F_Y^{-1}(u)}^{∞} dy P_Y(y) · F̄_ε[F_X^{-1}(u) − βy].    (39)

Performing the change of variable y = F_Y^{-1}(u) · x in the equation above, we obtain

    Pr{X > F_X^{-1}(u), Y > F_Y^{-1}(u)} = F_Y^{-1}(u) ∫_1^∞ dx P_Y(F_Y^{-1}(u) x) · F̄_ε[F_X^{-1}(u) − β F_Y^{-1}(u) x],    (40)

and, dividing by F̄_Y(F_Y^{-1}(u)) = 1 − u, this concludes the proof. □

Let us now define the function

    f_u(x) = (F_Y^{-1}(u) / (1 − u)) P_Y(F_Y^{-1}(u) x) · F̄_ε[F_X^{-1}(u) − β F_Y^{-1}(u) x].    (41)

We can state the following result:

Lemma 2  Under assumptions H1 and H3, for all x ∈ [1, ∞),

    f_u(x) → 1_{x > l/β} · f(x),    (42)

almost everywhere, as u goes to 1.

Proof: Let us apply assumption H1. We have

    lim_{u→1} (F_Y^{-1}(u) / (1 − u)) P_Y(F_Y^{-1}(u) x) = lim_{t→∞} t P_Y(t x) / F̄_Y(t)    (43)
        = f(x).    (44)

Applying now assumption H3, we have

    lim_{u→1} [F_X^{-1}(u) − β F_Y^{-1}(u) x] = lim_{u→1} β F_Y^{-1}(u) (F_X^{-1}(u) / (β F_Y^{-1}(u)) − x)    (45)
        = −∞ if x > l/β,    (46)
        = +∞ if x < l/β,    (47)
which gives

    lim_{u→1} F̄_ε[F_X^{-1}(u) − β F_Y^{-1}(u) x] = 1_{x > l/β},    (48)

and finally

    lim_{u→1} f_u(x) = lim_{u→1} (F_Y^{-1}(u) / (1 − u)) P_Y(F_Y^{-1}(u) x) · lim_{u→1} F̄_ε[F_X^{-1}(u) − β F_Y^{-1}(u) x]    (49)
        = 1_{x > l/β} · f(x),    (50)
which concludes the proof. □

Let us now prove that there exists an integrable function g(x) such that, for u close enough to 1 and for all x ≥ 1, we have f_u(x) ≤ g(x). Indeed, let us write

    t P_Y(t x) / F̄_Y(t) = (t P_Y(t x) / F̄_Y(t x)) · (F̄_Y(t x) / F̄_Y(t)).    (51)

For the leftmost factor in the right-hand side of equation (51), we easily obtain

    ∀t, ∀x ≥ 1,   t P_Y(t x) / F̄_Y(t x) ≤ (x* P_Y(x*) / F̄_Y(x*)) · (1/x),    (52)

where x* denotes the point where the function x P_Y(x)/F̄_Y(x) reaches its maximum. The rightmost factor in the right-hand side of (51) is smaller than A/x^δ by assumption H2, so that

    ∀t ≥ t_0, ∀x ≥ 1,   t P_Y(t x) / F̄_Y(t) ≤ (x* P_Y(x*) / F̄_Y(x*)) · A / x^{1+δ}.    (53)

Posing

    g(x) = (x* P_Y(x*) / F̄_Y(x*)) · A / x^{1+δ},    (54)

and recalling that, for all ε ∈ R, F̄_ε(ε) ≤ 1, we have found an integrable function such that, for some u_0 ≥ 0,

    ∀u ∈ [u_0, 1), ∀x ≥ 1,   f_u(x) ≤ g(x).    (55)

Thus, applying Lebesgue's theorem of dominated convergence, we can assert that

    lim_{u→1} ∫_1^∞ dx f_u(x) = ∫_1^∞ dx 1_{x > l/β} · f(x).    (56)

Since

    lim_{u→1} ∫_1^∞ dx f_u(x) = lim_{u→1} Pr[X > F_X^{-1}(u) | Y > F_Y^{-1}(u)]    (57)
        = λ,    (58)

the proof of theorem 1 is concluded. □

Remark: This result still holds in the presence of dependence between the factor and the idiosyncratic noise. Indeed, denoting by F̄_{ε|Y} the survival distribution of ε conditional on Y, lemma 1 can easily be generalized:

    Pr[X > F_X^{-1}(u) | Y > F_Y^{-1}(u)] = (F_Y^{-1}(u) / (1 − u)) ∫_1^∞ dx P_Y(F_Y^{-1}(u) x) · F̄_{ε|Y = F_Y^{-1}(u) x}[F_X^{-1}(u) − β F_Y^{-1}(u) x],    (59)
where the only change in (59) compared to (33) is to replace F̄_ε(·) by F̄_{ε|Y = F_Y^{-1}(u) x}(·). Let us now assume that the function F̄_{ε|Y=y}(x) admits a uniform limit when x and y tend to ±∞. Then, equation (48) still holds and lemma 2 remains true. As an example, let F denote any one-dimensional distribution function. Then, one can easily check that, for any conditional distribution of the form

    F̄_{ε|Y=y}(x) = F̄(x · y² / (y_0² + y²)),    (60)

the uniform limit condition is satisfied, and theorem 1 and lemma 2 still hold. In contrast, conditional distributions of the form

    F̄_{ε|Y=y}(x) = F̄(x − ρy)    (61)

do not fulfill the uniform limit condition, so that the result given by theorem 1 does not hold. The full understanding of the impact of more general dependences between the factor and the idiosyncratic noise on the coefficient of tail dependence requires a full-fledged investigation that we defer to a future work. Our goal here has been to show that one can reasonably expect our results to survive in the presence of weak dependence.
A.2  Tail dependence between two assets

A.2.1  Statement
We consider three random variables X_1, X_2 and Y, related by the relations

    X_1 = β_1 · Y + ε_1,    (62)
    X_2 = β_2 · Y + ε_2,    (63)

where ε_1 and ε_2 are two random variables independent of Y, and β_1, β_2 two non-random positive coefficients. Let P_Y and F_Y denote respectively the density with respect to the Lebesgue measure and the distribution function of the variable Y. Let F_1 (resp. F_2) denote the distribution function of X_1 (resp. X_2), and F_{ε_1} (resp. F_{ε_2}) the marginal distribution function of ε_1 (resp. ε_2). Let F_{ε_1,ε_2} denote the joint distribution of (ε_1, ε_2). We state the following theorem:

Theorem 2  Assuming that

H0: the variables Y, ε_1 and ε_2 have distribution functions with infinite support,

H1: for all x ∈ [1, ∞),

    lim_{t→∞} t P_Y(t x) / F̄_Y(t) = f(x),    (64)

H2: there are real numbers t_0 > 0, δ > 0 and A > 0 such that, for all t ≥ t_0 and all x ≥ 1,

    F̄_Y(t x) / F̄_Y(t) ≤ A / x^δ,    (65)

H3: there are two constants (l_1, l_2) ∈ R_+ × R_+ such that

    lim_{u→1} F_1^{-1}(u) / F_Y^{-1}(u) = l_1   and   lim_{u→1} F_2^{-1}(u) / F_Y^{-1}(u) = l_2,    (66)

then the coefficient of (upper) tail dependence of (X_1, X_2) is given by

    λ = ∫_{max{l_1/β_1, l_2/β_2}}^{∞} dx f(x).    (67)

A.2.2  Proof
We first give a general expression for the probability for X to be larger than F_X^{-1}(u) knowing that Y is larger than F_Y^{-1}(u):

Lemma 3  The probability that X is larger than F_X^{-1}(u) knowing that Y is larger than F_Y^{-1}(u) is given by:

    Pr[X > F_X^{-1}(u) | Y > F_Y^{-1}(u)] = (F_Y^{-1}(u) / (1 − u)) ∫_1^∞ dx P_Y(F_Y^{-1}(u) x) · F̄_{ε_1,ε_2}[F_1^{-1}(u) − β_1 F_Y^{-1}(u) x, F_2^{-1}(u) − β_2 F_Y^{-1}(u) x].    (68)

Proof: The proof is the same as for lemma 1. □

Let us now define the function

    f_u(x) = (F_Y^{-1}(u) / (1 − u)) P_Y(F_Y^{-1}(u) x) · F̄_{ε_1,ε_2}[F_1^{-1}(u) − β_1 F_Y^{-1}(u) x, F_2^{-1}(u) − β_2 F_Y^{-1}(u) x].    (69)

We can state the following result:

Lemma 4  Under assumptions H1 and H3, for all x ∈ [1, ∞),

    f_u(x) → 1_{x > max{l_1/β_1, l_2/β_2}} · f(x),    (70)

almost everywhere, as u goes to 1.

Proof: Applying assumption H3, we have

    lim_{u→1} [F_1^{-1}(u) − β_1 F_Y^{-1}(u) x] = lim_{u→1} β_1 F_Y^{-1}(u) (F_1^{-1}(u) / (β_1 F_Y^{-1}(u)) − x)    (71)
        = −∞ if x > l_1/β_1,    (72)
        = +∞ if x < l_1/β_1,    (73)

and

    lim_{u→1} [F_2^{-1}(u) − β_2 F_Y^{-1}(u) x] = lim_{u→1} β_2 F_Y^{-1}(u) (F_2^{-1}(u) / (β_2 F_Y^{-1}(u)) − x)    (74)
        = −∞ if x > l_2/β_2,    (75)
        = +∞ if x < l_2/β_2,    (76)
which gives

    lim_{u→1} F̄_{ε_1,ε_2}[F_1^{-1}(u) − β_1 F_Y^{-1}(u) x, F_2^{-1}(u) − β_2 F_Y^{-1}(u) x] = 1_{x > max{l_1/β_1, l_2/β_2}},    (77)

and, following the same calculations as in part A.1, this concludes the proof. □

We can now apply Lebesgue's theorem of dominated convergence (see part A.1 for the justification), which allows us to assert that

    lim_{u→1} ∫ dx f_u(x) = ∫ dx 1_{x > max{l_1/β_1, l_2/β_2}} · f(x).    (78)

Since

    lim_{u→1} ∫ dx f_u(x) = lim_{u→1} Pr[X_1 > F_1^{-1}(u) | X_2 > F_2^{-1}(u)]    (79)
        = λ,    (80)

the proof of theorem 2 is concluded. □
B  Proofs of the corollaries

B.1  First corollary
Corollary 1  If the random variable Y has a rapidly varying distribution function, then λ = 0.

Proof: Let us write

    t P_Y(t x) / F̄_Y(t) = (t P_Y(t x) / F̄_Y(t x)) · (F̄_Y(t x) / F̄_Y(t)).    (81)

For a rapidly varying function F̄_Y, we have

    lim_{t→∞} F̄_Y(t x) / F̄_Y(t) = 0,   ∀x > 1,    (82)

while the leftmost factor of the right-hand side of equation (81) remains bounded as t goes to infinity, so that

    lim_{t→∞} (t P_Y(t x) / F̄_Y(t x)) · (F̄_Y(t x) / F̄_Y(t)) = f(x) = 0.    (83)

Since f(x) = 0, we can apply lemma 2 without the hypothesis H3, which concludes the proof. □
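Corollary 1 can be illustrated numerically: with a Gaussian (hence rapidly varying) factor, the empirical conditional probability Pr{X > F_X^{-1}(u) | Y > F_Y^{-1}(u)} drifts towards zero as u → 1. The following sketch uses arbitrary parameters (β = 1, unit variances):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 1_000_000
y = rng.standard_normal(N)       # rapidly varying factor
x = y + rng.standard_normal(N)   # X = beta*Y + eps with beta = 1

ps = []
for u in (0.9, 0.99, 0.999):
    qx, qy = np.quantile(x, u), np.quantile(y, u)
    p = float(np.mean(x[y > qy] > qx))   # empirical Pr{X > qx | Y > qy}
    ps.append(p)
    print(u, p)
```

The conditional probability decreases with u, in line with λ = 0, whereas for a regularly varying factor it would settle at a positive plateau.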
B.2  Second corollary
Corollary 2  Let Y be regularly varying with index (−α), and assume that hypothesis H3 is satisfied. Then, the coefficient of (upper) tail dependence is

    λ = 1 / [max{1, l/β}]^α,    (84)

where l denotes the limit, when u → 1, of the ratio F_X^{-1}(u)/F_Y^{-1}(u).

Proof: Karamata's theorem (see Embrechts, Klüppelberg, and Mikosch (1997, p. 567)) ensures that H1 is satisfied with f(x) = α/x^{α+1}, which is sufficient to prove the corollary. To go one step further, let us define

    F̄_Y(y) = y^{−α} · L_1(y),    (85)
    F̄_ε(ε) = ε^{−α} · L_2(ε),    (86)

where L_1(·) and L_2(·) are slowly varying functions. Using the proposition stated in Feller (1971, p. 278), we obtain, for the distribution of the variable X,

    F̄_X(x) ∼ x^{−α} (β^α · L_1(x/β) + L_2(x)),    (87)

for large x. Assuming now, for simplicity, that L_1 (resp. L_2) goes to a constant C_1 (resp. C_2), this implies that H3 is satisfied, since

    l = lim_{u→1} F_X^{-1}(u) / F_Y^{-1}(u) = β [1 + C_2 / (β^α C_1)]^{1/α}.    (88)

This allows us to obtain the equations (10) and (13). □
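As a numerical sanity check of corollary 2 (a sketch with arbitrary parameters), one can compare the closed-form λ = 1/(1 + C_2/(β^α C_1)) obtained from (84) and (88) with the empirical conditional probability on simulated power-law data:

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, beta, N = 3.0, 1.0, 2_000_000
y = rng.pareto(alpha, N) + 1.0    # exact Pareto: Pr{Y > y} = y**-alpha, C1 = 1
eps = 0.5 * rng.pareto(alpha, N)  # regularly varying noise with C2 = 0.5**alpha
x = beta * y + eps

lam_theory = 1.0 / (1.0 + 0.5 ** alpha / beta ** alpha)

for u in (0.99, 0.999):
    qx, qy = np.quantile(x, u), np.quantile(y, u)
    lam_emp = float(np.mean(x[y > qy] > qx))
    print(u, lam_emp, lam_theory)
```

At accessible quantiles the empirical value need not have reached its asymptotic limit, which illustrates, on a controlled example, the pre-asymptotic effects invoked in section 3.5 for the first sub-period.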
References

Bingham, N.H., C.M. Goldie, and J.L. Teugels, 1987, Regular Variation. (Cambridge University Press, Cambridge).

Bouchaud, J.P., D. Sornette, C. Walter, and J.P. Aguilar, 1998, Taming large events: Optimal portfolio theory for strongly fluctuating assets, International Journal of Theoretical and Applied Finance 1, 25–41.

Boyer, B.H., M.S. Gibson, and M. Loretan, 1997, Pitfalls in tests for changes in correlations, Working paper, International Finance Discussion Paper 597, Board of the Governors of the Federal Reserve System.

Brennan, M.J., and E.S. Schwartz, 1978, A continuous time approach to the pricing of bonds, Journal of Banking and Finance 3, 133–155.

Campbell, J.Y., M. Lettau, B.G. Malkiel, and Y. Xu, 2001, Have individual stocks become more volatile? An empirical exploration of idiosyncratic risk, Journal of Finance 56, 1–43.

Carey, M., 1998, Credit risk in private debt portfolios, Journal of Finance 53, 56–61.

Cho, Y.-H., and R.F. Engle, 2000, Time-varying betas and asymmetric effects of news: empirical analysis of blue chip stocks, Working paper, University of California, San Diego.

Coles, S., J. Heffernan, and J. Tawn, 1999, Dependence measures for extreme value analyses, Extremes 2, 339–365.

Cox, J.C., J.E. Ingersoll, and S.A. Ross, 1985, A theory of the term structure of interest rates, Econometrica 53, 385–407.

Danielsson, J., L. de Haan, L. Peng, and C.G. de Vries, 2001, Using a bootstrap method to choose the optimal sample fraction in tail index estimation, Journal of Multivariate Analysis 76, 226–248.

Danielsson, J., and C.G. de Vries, 1997, Tail index and quantile estimation with very high frequency data, Journal of Empirical Finance 4, 241–257.

de Haan, L., S.I. Resnick, H. Rootzén, and C.G. de Vries, 1989, Extremal behaviour of solutions to a stochastic difference equation with application to ARCH processes, Stochastic Processes and their Applications 32, 213–224.

Embrechts, P., C. Klüppelberg, and T. Mikosch, 1997, Modelling Extremal Events. (Springer-Verlag, Berlin).

Embrechts, P., A.J. McNeil, and D. Straumann, 1999, Correlation: Pitfalls and alternatives, Risk, pp. 69–71.

Embrechts, P., A.J. McNeil, and D. Straumann, 2001, Correlation and dependency in risk management: Properties and pitfalls, in M. Dempster, ed.: Value at Risk and Beyond (Cambridge University Press, Cambridge).

Engle, R.F., and S. Manganelli, 1999, CAViaR: Conditional autoregressive Value-at-Risk by regression quantiles, Working paper, University of California, San Diego.

Fama, E., and J. MacBeth, 1973, Risk, return and equilibrium: empirical tests, Journal of Political Economy 81, 607–636.
Feller, W., 1971, An Introduction to Probability Theory and its Applications II. (Wiley, New York).

Frees, E., and E. Valdez, 1998, Understanding relationships using copulas, North American Actuarial Journal 2, 1–25.

Gopikrishnan, P., M. Meyer, L.A.N. Amaral, and H.E. Stanley, 1998, Inverse cubic law for the distribution of stock price variations, European Physical Journal B 3, 139–140.

Gordy, M.B., 2000, A comparative anatomy of credit risk models, Journal of Banking and Finance 24, 119–149.

Hall, P., 1990, Using the bootstrap to estimate mean squared error and select smoothing parameter in nonparametric problems, Journal of Multivariate Analysis 32, 177–203.

Joe, H., 1997, Multivariate Models and Dependence Concepts. (Chapman & Hall, London).

Johansen, A., and D. Sornette, 2002, Large stock market price drawdowns are outliers, Journal of Risk 4, 69–110.

Jones, C.P., and J.W. Wilson, 1989, Is stock price volatility increasing?, Financial Analysts Journal 45, 20–26.

Kandel, S., and R. Stambaugh, 1987, On correlations and the sensitivity of inference about mean-variance efficiency, Journal of Financial Economics 18, 61–90.

Kearns, P., and A.R. Pagan, 1997, Estimating the density tail index for financial time series, Review of Economics and Statistics 79, 171–175.

Lindskog, F., 2000, Modelling dependence with copulas, Working paper, RiskLab, http://www.risklab.ch/Papers.html#MTLindskog.

Lintner, J., 1965, The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets, Review of Economics and Statistics 47, 13–37.

Longin, F.M., 1996, The asymptotic distribution of extreme stock market returns, Journal of Business 69, 383–408.

Longin, F.M., and B. Solnik, 2001, Extreme correlation of international equity markets, Journal of Finance 56, 649–676.

Loretan, M., and P.C.B. Phillips, 1994, Testing the covariance stationarity of heavy-tailed time series, Journal of Empirical Finance 1, 211–248.

Lucas, A., P. Klaassen, P. Spreij, and S. Straetmans, 2001, An analytic approach to credit risk of large corporate bond and loan portfolios, Journal of Banking and Finance 25, 1635–1664.

Lux, T., 1996, The stable Paretian hypothesis and the frequency of large returns: an examination of major German stocks, Applied Financial Economics 6, 463–475.

Malevergne, Y., and D. Sornette, 2001, Testing the Gaussian copula hypothesis for financial assets dependences, Working paper (e-print available at http://papers.ssrn.com/abstract=291140).

Malevergne, Y., and D. Sornette, 2002, Hedging extreme co-movements, submitted to RISK, preprint at http://arXiv.org/abs/cond-mat/0205636.

Mossin, J., 1966, Equilibrium in a capital asset market, Econometrica 34, 768–783.
9. Measuring the extreme dependence between two financial assets
Nelsen, R.B., 1998, An Introduction to Copulas (Springer Verlag, New York).
Pagan, A., 1996, The econometrics of financial markets, Journal of Empirical Finance 3, 15-102.
Pagan, A., and A. Ullah, 1999, Nonparametric Econometrics (Cambridge University Press, Cambridge).
Patton, A.J., 2001, Estimation of copula models for time series of possibly different lengths, Working paper, University of California, Econ. Disc. Paper No. 2001-17.
Roll, R., 1988, The international crash of October 1987, Financial Analysts Journal, 19-35.
Rootzén, H., M.R. Leadbetter, and L. de Haan, 1998, On the distribution of tail array sums for strongly mixing stationary sequences, Annals of Applied Probability 8, 868-885.
Ross, S.A., 1976, The arbitrage theory of capital asset pricing, Journal of Economic Theory 13, 341-360.
Sharpe, W., 1964, Capital asset prices: a theory of market equilibrium under conditions of risk, Journal of Finance 19, 425-442.
Sheather, S.J., and J.S. Marron, 1990, Kernel quantile estimators, Journal of the American Statistical Association 85, 410-415.
Sklar, A., 1959, Fonctions de répartition à n dimensions et leurs marges, Publ. Inst. Statist. Univ. Paris 8, 229-231.
Starica, C., 1999, Multivariate extremes for models with constant conditional correlations, Journal of Empirical Finance 6, 515-553.
Vasicek, O., 1977, An equilibrium characterisation of the term structure of interest rates, Journal of Financial Economics 5, 177-188.
Xu, Y., and B.G. Malkiel, 2002, Investigating the behavior of idiosyncratic volatility, forthcoming, Journal of Business.
Table 1: This table gives the main statistical features of the three samples we have considered. The columns Mean, Std., Skew. and Kurt. respectively give the average return multiplied by one thousand, the standard deviation, the skewness and the excess kurtosis of each asset over the time intervals from July 1962 to December 1979, January 1980 to December 2000 and July 1962 to December 2000. The excess kurtosis is given as indicative of the relative weight of large return amplitudes, and can always be calculated over a finite time series even if it may not be asymptotically defined for power tails with exponents less than 4.
[Table 1 body: Mean, Std., Skew. and Kurt. of the twenty assets and of the Standard & Poor's 500 over the three periods; the column alignment could not be recovered from the extraction.]
                                     July 1962 - Dec. 1979   Jan. 1980 - Dec. 2000   July 1962 - Dec. 2000
Asset                                   β       ρY²,ε²          β       ρY²,ε²          β       ρY²,ε²
Abbott Labs                           0.8994    0.0879        0.9122    0.1879        0.9081    0.1597
American Home Products Corp.          0.9855    0.1253        0.8102    0.0587        0.8652    0.0736
Boeing Co.                            1.4416    0.1196        0.9036    0.0928        1.0715    0.1279
Bristol-Myers Squibb Co.              1.0832    0.1056        1.0435    0.0457        1.0559    0.0481
Chevron Corp.                         1.0062    0.1191        0.8333    0.0776        0.8873    0.0906
Du Pont (E.I.) de Nemours & Co.       1.0818    0.0960        0.9451    0.0433        0.9880    0.0595
Disney (Walt) Co.                     1.5530    0.0960        1.0016    0.1304        1.1736    0.0641
General Motors Corp.                  1.0945    0.1531        1.0112    0.0400        1.0371    0.0563
Hewlett-Packard Co.                   1.3910    0.1023        1.3074    0.0739        1.3332    0.0832
Coca-Cola Co.                         1.0347    0.2146        0.9833    0.1254        0.9995    0.1238
Minnesota Mining & MFG Co.            1.1339    0.1203        0.8756    0.2605        0.9564    0.1706
Philip Morris Cos Inc.                1.0894    0.0723        0.8598    0.0340        0.9314    0.0545
Pepsico Inc.                          0.9587    0.1233        0.9004    0.3294        0.9187    0.3169
Procter & Gamble Co.                  0.8293    0.1873        0.8938    0.1188        0.8738    0.1287
Pharmacia Corp.                       1.0750    0.0783        0.8824    0.0373        0.9429    0.0357
Schering-Plough Corp.                 1.1244    0.1284        1.0480    0.0494        1.0720    0.0540
Texaco Inc.                           1.4578    0.1410        1.3811    0.0674        1.4049    0.0766
Texas Instruments Inc.                0.9414    0.1354        0.6600    0.0823        0.7481    0.1053
United Technologies Corp              1.1336    0.1243        0.9049    0.1175        0.9763    0.1098
Walgreen Co.                          0.6354    0.1052        0.8554    0.1087        0.7869    0.0798

Table 2: This table presents the estimated coefficient β of the factor model (6) and the correlation coefficient ρY²,ε² between the square of the factor and the square of the estimated idiosyncratic noise, for the different time intervals we have considered. A Fisher test shows that these correlation coefficients are all significantly different from zero.
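The β and ρY²,ε² estimates reported in Table 2 follow from a linear regression of each asset's returns on the index returns. A minimal sketch of this estimation, assuming simple OLS (the function name and the synthetic data below are ours, not the author's):

```python
import numpy as np

def factor_stats(asset, market):
    """OLS regression of asset returns on market returns: returns the
    slope beta and the correlation between the squared factor and the
    squared residual, which probes the volatility dependence left in
    the idiosyncratic noise."""
    beta, intercept = np.polyfit(market, asset, 1)
    resid = asset - (beta * market + intercept)
    rho_sq = np.corrcoef(market**2, resid**2)[0, 1]
    return beta, rho_sq
```

On simulated data with a true slope of 0.9, `factor_stats` recovers a beta close to 0.9; on real returns, a significantly positive `rho_sq` signals the residual volatility dependence that the table's Fisher test detects.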
Table 3: This table gives the estimated value of the tail index for the twenty considered assets, the Standard & Poor's 500 index and the residuals ε obtained by regressing each asset on the Standard & Poor's 500 index, for both the negative and the positive tails, during the time interval from July 1962 to December 1979. The tail indexes are estimated with Hill's estimator at the 1%, 2.5% and 5% quantiles, which are the optimal quantiles given by the algorithms of Hall (1990) and Danielsson and de Vries (1997). The values decorated with stars are the tail indexes that cannot be considered equal to the tail index of the Standard & Poor's 500 index at the 95% confidence level.
[Table 3 body: tail indexes of the assets and of the residuals at the three quantiles, for the negative and positive tails; the column alignment could not be recovered from the extraction.]
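The tail indexes of Tables 3 and 4 are obtained with Hill's estimator. A minimal sketch applied to the lower tail; the direct quantile-to-k conversion below is a simplification of the optimal-quantile algorithms of Hall (1990) and Danielsson and de Vries (1997) cited in the captions:

```python
import numpy as np

def hill_tail_index(returns, q):
    """Hill estimator of the tail index of the negative tail, using the
    fraction q of the most extreme losses: the inverse of the mean
    log-excess of the k largest losses over the (k+1)-th largest."""
    losses = np.sort(-returns[returns < 0])[::-1]   # losses, descending
    k = max(int(q * len(losses)), 1)                # number of upper order statistics
    threshold = losses[k]                           # (k+1)-th largest loss
    return 1.0 / np.mean(np.log(losses[:k] / threshold))
```

On synthetic Pareto losses with tail index 3, the estimator returns a value close to 3, consistent with the orders of magnitude reported in the tables.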
Table 4: This table gives the estimated value of the tail index for the twenty considered assets, the Standard & Poor's 500 index and the residuals ε obtained by regressing each asset on the Standard & Poor's 500 index, for both the negative and the positive tails, during the time interval from January 1980 to December 2000. The tail indexes are estimated with Hill's estimator at the 1%, 2.5% and 5% quantiles, which are the optimal quantiles given by the algorithms of Hall (1990) and Danielsson and de Vries (1997). The values decorated with stars are the tail indexes whose value cannot be considered equal to the tail index of the Standard & Poor's 500 index at the 95% confidence level.
[Table 4 body: tail indexes of the assets and of the residuals at the three quantiles, for the negative and positive tails; the column alignment could not be recovered from the extraction.]
Table 5: This table gives the average (mean), the standard deviation (std.), the minimum (min.) and the maximum (max.) values of the coefficient of lower tail dependence estimated over the first centile, quintile and decile during the entire time interval from July 1962 to December 2000, under the assumption that the tails of the distributions of the assets and of the market are regularly varying with an index equal to three.
[Table 5 body: mean, std., min. and max. of the lower tail-dependence coefficient of the twenty assets over the first centile, quintile and decile; the column alignment could not be recovered from the extraction.]
Table 6: This table gives the average (mean), the standard deviation (std.), the minimum (min.) and the maximum (max.) values of the coefficient of lower tail dependence estimated over the first centile, quintile and decile during the entire time interval from July 1962 to December 2000, under the assumption that the tails of the distributions of the assets and of the market are power laws with an exponent equal to three.
[Table 6 body: mean, std., min. and max. of the lower tail-dependence coefficient of the twenty assets over the first centile, quintile and decile; the column alignment could not be recovered from the extraction.]
                                     Negative Tail              Positive Tail
Asset                              α=3    α=3.5   α=4        α=3    α=3.5   α=4
Abbott Labs                        0.12   0.09    0.06       0.11   0.08    0.06
American Home Products Corp.       0.22   0.18    0.15       0.25   0.22    0.19
Boeing Co.                         0.16   0.13    0.10       0.13   0.10    0.07
Bristol-Myers Squibb Co.           0.22   0.19    0.16       0.28   0.25    0.23
Chevron Corp.                      0.21   0.17    0.14       0.26   0.23    0.20
Du Pont (E.I.) de Nemours & Co.    0.38   0.37    0.35       0.37   0.35    0.33
Disney (Walt) Co.                  0.24   0.20    0.17       0.23   0.19    0.16
General Motors Corp.               0.39   0.37    0.35       0.48   0.47    0.47
Hewlett-Packard Co.                0.15   0.12    0.09       0.23   0.20    0.17
Coca-Cola Co.                      0.26   0.22    0.19       0.26   0.23    0.20
Minnesota Mining & MFG Co.         0.35   0.32    0.30       0.35   0.33    0.31
Philip Morris Cos Inc.             0.25   0.22    0.19       0.20   0.17    0.14
Pepsico Inc.                       0.15   0.12    0.09       0.17   0.14    0.11
Procter & Gamble Co.               0.23   0.19    0.16       0.24   0.21    0.18
Pharmacia Corp.                    0.23   0.19    0.16       0.26   0.23    0.20
Schering-Plough Corp.              0.21   0.18    0.15       0.20   0.17    0.14
Texaco Inc.                        0.06   0.04    0.03       0.07   0.05    0.03
Texas Instruments Inc.             0.47   0.46    0.46       0.49   0.49    0.49
United Technologies Corp           0.13   0.10    0.07       0.13   0.10    0.07
Walgreen Co.                       0.03   0.02    0.01       0.02   0.01    0.01

Table 7: This table summarizes the mean values over the first centile of the distribution of the coefficients of (upper or lower) tail dependence for the positive and negative tails during the time interval from July 1962 to December 1979, for three values of the tail index α = 3, 3.5, 4.
                                     Negative Tail              Positive Tail
Asset                              α=3    α=3.5   α=4        α=3    α=3.5   α=4
Abbott Labs                        0.20   0.17    0.14       0.16   0.13    0.10
American Home Products Corp.       0.12   0.09    0.06       0.10   0.08    0.05
Boeing Co.                         0.14   0.11    0.08       0.10   0.07    0.05
Bristol-Myers Squibb Co.           0.32   0.29    0.26       0.25   0.21    0.19
Chevron Corp.                      0.18   0.14    0.11       0.13   0.09    0.07
Du Pont (E.I.) de Nemours & Co.    0.23   0.20    0.17       0.16   0.13    0.10
Disney (Walt) Co.                  0.16   0.13    0.10       0.15   0.12    0.09
General Motors Corp.               0.26   0.22    0.19       0.20   0.16    0.13
Hewlett-Packard Co.                0.19   0.15    0.13       0.21   0.18    0.15
Coca-Cola Co.                      0.24   0.20    0.18       0.20   0.17    0.14
Minnesota Mining & MFG Co.         0.26   0.23    0.20       0.20   0.17    0.14
Philip Morris Cos Inc.             0.11   0.08    0.06       0.11   0.08    0.06
Pepsico Inc.                       0.17   0.14    0.11       0.14   0.11    0.09
Procter & Gamble Co.               0.24   0.21    0.18       0.20   0.16    0.13
Pharmacia Corp.                    0.10   0.08    0.05       0.10   0.07    0.05
Schering-Plough Corp.              0.23   0.20    0.17       0.16   0.13    0.10
Texaco Inc.                        0.02   0.01    0.01       0.02   0.01    0.01
Texas Instruments Inc.             0.43   0.42    0.41       0.31   0.28    0.26
United Technologies Corp           0.20   0.16    0.14       0.18   0.14    0.11
Walgreen Co.                       0.15   0.12    0.09       0.09   0.07    0.05

Table 8: This table summarizes the mean values over the first centile of the distribution of the coefficients of (upper or lower) tail dependence for the positive and negative tails during the time interval from January 1980 to December 2000, for three values of the tail index α = 3, 3.5, 4.
                                     Negative Tail              Positive Tail
Asset                              α=3    α=3.5   α=4        α=3    α=3.5   α=4
Abbott Labs                        0.17   0.13    0.11       0.15   0.12    0.09
American Home Products Corp.       0.14   0.11    0.08       0.15   0.11    0.09
Boeing Co.                         0.14   0.10    0.08       0.10   0.07    0.05
Bristol-Myers Squibb Co.           0.27   0.24    0.21       0.27   0.24    0.21
Chevron Corp.                      0.19   0.15    0.12       0.17   0.13    0.10
Du Pont (E.I.) de Nemours & Co.    0.25   0.22    0.19       0.23   0.19    0.16
Disney (Walt) Co.                  0.18   0.14    0.11       0.17   0.13    0.11
General Motors Corp.               0.26   0.23    0.20       0.24   0.21    0.18
Hewlett-Packard Co.                0.17   0.14    0.11       0.23   0.19    0.16
Coca-Cola Co.                      0.23   0.20    0.17       0.23   0.20    0.17
Minnesota Mining & MFG Co.         0.28   0.25    0.23       0.25   0.22    0.19
Philip Morris Cos Inc.             0.14   0.10    0.08       0.14   0.11    0.08
Pepsico Inc.                       0.16   0.13    0.10       0.16   0.12    0.10
Procter & Gamble Co.               0.23   0.20    0.17       0.22   0.18    0.15
Pharmacia Corp.                    0.13   0.10    0.07       0.14   0.10    0.08
Schering-Plough Corp.              0.22   0.19    0.16       0.19   0.15    0.12
Texaco Inc.                        0.03   0.02    0.01       0.03   0.02    0.01
Texas Instruments Inc.             0.44   0.42    0.41       0.37   0.35    0.33
United Technologies Corp           0.16   0.12    0.10       0.15   0.12    0.09
Walgreen Co.                       0.09   0.07    0.05       0.06   0.04    0.03

Table 9: This table summarizes the mean values over the first centile of the distribution of the coefficients of (upper or lower) tail dependence for the positive and negative tails during the time interval from July 1962 to December 2000, for three values of the tail index α = 3, 3.5, 4.
                                     July 1962 - Dec. 1979        Jan. 1980 - Dec. 2000
Asset                              Extremes   λ−    p-value     Extremes   λ−    p-value
Abbott Labs                            0     0.12   0.2937          4     0.20   0.0904
American Home Products Corp.           1     0.22   0.2432          2     0.12   0.2247
Boeing Co.                             0     0.16   0.1667          3     0.14   0.1176
Bristol-Myers Squibb Co.               2     0.22   0.2987          4     0.32   0.2144
Chevron Corp.                          3     0.21   0.2112          4     0.18   0.0644
Du Pont (E.I.) de Nemours & Co.        0     0.38   0.0078          4     0.23   0.1224
Disney (Walt) Co.                      2     0.24   0.2901          2     0.16   0.2873
General Motors Corp.                   2     0.39   0.1345          4     0.26   0.1522
Hewlett-Packard Co.                    0     0.15   0.1909          2     0.19   0.3007
Coca-Cola Co.                          2     0.26   0.2765          5     0.24   0.0494
Minnesota Mining & MFG Co.             2     0.35   0.1784          4     0.26   0.1571
Philip Morris Cos Inc.                 1     0.25   0.1841          2     0.11   0.2142
Pepsico Inc.                           2     0.15   0.2795          5     0.17   0.0141
Procter & Gamble Co.                   1     0.23   0.2245          3     0.24   0.2447
Pharmacia Corp.                        2     0.23   0.2956          4     0.10   0.0128
Schering-Plough Corp.                  0     0.21   0.0946          4     0.23   0.1224
Texaco Inc.                            0     0.06   0.5222          2     0.02   0.0212
Texas Instruments Inc.                 1     0.47   0.0161          3     0.43   0.1862
United Technologies Corp               1     0.13   0.3728          4     0.20   0.0870
Walgreen Co.                           1     0.03   0.2303          3     0.15   0.1373

Table 10: This table gives, for the time intervals from July 1962 to December 1979 and from January 1980 to December 2000, the number of losses among the ten largest losses incurred by an asset that occurred together with one of the ten largest losses of the Standard & Poor's 500 index over the same time interval. The probability of occurrence of such a realisation is given by the p-value derived from the binomial law (27) with parameter λ−.
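The p-values of Tables 10 and 11 derive from a binomial law whose exact form, equation (27), is given earlier in the chapter and is not reproduced here. As an illustration only, the sketch below computes the point probability of observing a given number of joint extremes among the ten largest losses, under a binomial model with success probability λ−; this is our reading of the construction, not necessarily the exact convention of equation (27):

```python
from math import comb

def joint_extreme_probability(n_joint, lam, n_extremes=10):
    """Binomial point probability of observing exactly n_joint of the
    asset's n_extremes largest losses together with an index extreme,
    when each loss coincides independently with probability lam."""
    return comb(n_extremes, n_joint) * lam**n_joint * (1 - lam)**(n_extremes - n_joint)
```

For instance, with λ− = 0.20 the probability of exactly four coincidences among ten extremes is about 0.088, the order of magnitude of the p-values reported for that parameter value.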
                                     July 1962 - Dec. 1979        Jan. 1980 - Dec. 2000
Asset                              Extremes   λ−    p-value     Extremes   λ−    p-value
Abbott Labs                            0     0.12   0.2937          4     0.20   0.0904
American Home Products Corp.           1     0.22   0.2432          1     0.12   0.3828
Boeing Co.                             0     0.16   0.1667          3     0.14   0.1176
Bristol-Myers Squibb Co.               2     0.22   0.2987          3     0.32   0.2653
Chevron Corp.                          3     0.21   0.2112          3     0.18   0.1708
Du Pont (E.I.) de Nemours & Co.        0     0.38   0.0078          3     0.23   0.2342
Disney (Walt) Co.                      2     0.24   0.2901          1     0.16   0.3300
General Motors Corp.                   2     0.39   0.1345          3     0.26   0.2536
Hewlett-Packard Co.                    0     0.15   0.1909          1     0.19   0.2880
Coca-Cola Co.                          1     0.26   0.1782          4     0.24   0.1318
Minnesota Mining & MFG Co.             2     0.35   0.1784          3     0.26   0.2561
Philip Morris Cos Inc.                 1     0.25   0.1841          2     0.11   0.2142
Pepsico Inc.                           2     0.15   0.2795          5     0.17   0.0141
Procter & Gamble Co.                   1     0.23   0.2245          3     0.24   0.2447
Pharmacia Corp.                        2     0.23   0.2956          4     0.10   0.0128
Schering-Plough Corp.                  0     0.21   0.0946          3     0.23   0.2342
Texaco Inc.                            0     0.06   0.5222          1     0.02   0.1922
Texas Instruments Inc.                 1     0.47   0.0161          3     0.43   0.1862
United Technologies Corp               1     0.13   0.3728          3     0.20   0.2001
Walgreen Co.                           1     0.03   0.2303          3     0.15   0.1373

Table 11: This table gives, for the time intervals from July 1962 to December 1979 and from January 1980 to December 2000, the number of losses among the ten largest losses incurred by an asset that occurred together with one of the ten largest losses of the Standard & Poor's 500 index over the same time interval, provided that the two losses are not both the largest of their respective series. The probability of occurrence of such a realisation is given by the p-value derived from the binomial law (27) with parameter λ−.
9.3 Summary of the description of the dependence between financial assets

From the tail-dependence values presented in the preceding section between an asset and the market factor (the index), the tail dependence between two assets follows immediately: it is the minimum of the tail dependence between each of the two assets and the index. We deduce that if the index has a regularly varying distribution, the assets exhibit non-zero tail dependence. As a consequence, the Gaussian-copula hypothesis made in Chapter 7 cannot be regarded as a valid approximation, since, as we recall, the Gaussian copula admits no tail dependence. However, as we showed in Chapter 3, the distribution of asset returns may not be regularly varying but rather rapidly varying, if one considers stretched-exponential distributions. In that case, and as long as the market index remains the dominant factor, the assets exhibit no tail dependence, and the description of the dependence in terms of a Gaussian copula becomes acceptable again.
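Under the one-factor model this deduction is immediate to implement: given the tail dependence λi of each asset with the index, the tail dependence of any pair is min(λi, λj). A minimal sketch (the function name is ours; the illustrative coefficients are the 1980-2000 negative-tail values for α = 3 from Table 8):

```python
def pairwise_tail_dependence(lambda_with_index):
    """Tail dependence between every pair of assets, taken as the
    minimum of the two assets' tail-dependence coefficients with the
    common market factor (the index)."""
    names = list(lambda_with_index)
    return {(a, b): min(lambda_with_index[a], lambda_with_index[b])
            for i, a in enumerate(names) for b in names[i + 1:]}
```

For example, with λ = 0.02 for Texaco and λ = 0.20 for United Technologies, the pair inherits the weaker coefficient, 0.02.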
That said, direct inspection of the scatter of asset returns against index returns over the period 1980-2000 clearly suggests the existence of tail dependence, as Figure 9.1 shows. Moreover, this representation confirms that the tail dependence for Texaco (left panel) is much weaker than for United Technologies (right panel), for example. Indeed, for negative returns the extremes cluster along a spindle for United Technologies, whereas they are far more dispersed for Texaco. This is fully consistent with the results stated in the previous section, where the measured tail-dependence values were 2% and 20% respectively.
Fig. 9.1 - Returns of Texaco (left panel) and of United Technologies (right panel) as a function of the returns of the Standard & Poor's 500 over the period 1980-2000. The marginal distributions have been mapped onto Gaussian distributions to allow a better comparison and to highlight the effect of the copula.

To complete this study, we used the non-parametric estimation method of Coles, Heffernan and Tawn (1999), as implemented by Poon, Rockinger and Tawn (2001), which we presented in Section 9.1. This method is very delicate and seems to us rather imprecise, since it relies on the estimation of a tail exponent and of a scale factor. We have already discussed these problems in Chapter 1, so we do not return to them here. Despite (or rather because of) these imprecisions, we were never able to reject, at the 95% confidence level, the hypothesis of the existence of tail dependence between the assets. Moreover, even if the numerical values obtained carry a large uncertainty, they
remain of the same order of magnitude as those found previously with the factor model. We also estimated the tail dependence under the hypothesis of a regularly varying elliptical copula, whose expression was derived by Hult and Lindskog (2001) and which depends only on the index of regular variation and on the correlation coefficient. The latter was estimated non-parametrically via Kendall's τ and then obtained through the relation

ρ = sin(π τ / 2),   (9.1)

well known for the Gaussian distribution but also valid for the whole class of elliptical distributions, as recently shown by Lindskog, McNeil and Schmock (2001). Here again, the numerical values are qualitatively in agreement with the previous results. More precisely, one can always find an index of regular variation ν of the elliptical copula such that the two models give exactly the same tail-dependence value, but this amounts to accepting that the elliptical copula differs from one pair of assets to the next, which seems to us hardly satisfactory. If, on the contrary, one considers elliptical copulas with the same tail index for all pairs, large differences are observed with respect to the estimates given by the factor model, as shown in Figure 9.2, where the tail-dependence coefficients estimated with the factor model are plotted against those estimated under the elliptical-copula hypothesis.

Fig. 9.2 - Tail-dependence coefficient estimated with the factor model as a function of the tail-dependence coefficient estimated under the elliptical-copula hypothesis. The index of regular variation of the factor is 3 and that of the elliptical copula is 4; this pair of values gives the best agreement between the results of the two models.
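Relation (9.1) gives a simple non-parametric route to the correlation coefficient of an elliptical copula. A minimal sketch, using a naive O(n²) Kendall's τ without tie handling (the function names are ours):

```python
import math

def kendall_tau(x, y):
    """Naive O(n^2) Kendall's tau: normalized excess of concordant
    over discordant pairs; ties are not handled."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))

def rho_from_tau(tau):
    """Correlation implied by relation (9.1): rho = sin(pi * tau / 2),
    valid for the whole class of elliptical distributions."""
    return math.sin(math.pi * tau / 2.0)
```

Because τ depends only on ranks, this estimate of ρ is robust to the heavy-tailed marginals discussed in Chapter 1, which is precisely why it is preferred here to the sample correlation.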
In conclusion, we believe we can assert that the assets we have studied exhibit tail dependence, of the order of 5% to 30% depending on the case. This has important practical consequences for risk management. Indeed, taking up the small calculation presented in point 1.2 of Section 9.2, consider the probability that two assets of a portfolio fall together beyond their daily 99% VaR, for example. Neglecting tail dependence, such an event has an extremely small probability of occurring (a typical recurrence time of 40 years), whereas with a tail dependence of 30% such an event typically occurs every 16 months (against eight years for λ = 5%).
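The recurrence times quoted above follow from a back-of-the-envelope calculation: on a day when the first asset exceeds its 99% VaR, the second one does too with probability λ (and with probability 1% under independence), so the joint daily probability is λ × 1%. A minimal sketch, assuming 250 trading days per year (our convention):

```python
def recurrence_time_years(lam, p=0.01, days_per_year=250):
    """Typical recurrence time, in years, of a joint exceedance of the
    daily VaR at level 1 - p, for a tail-dependence coefficient lam;
    taking lam = p recovers the independent case p * p."""
    joint_daily_probability = lam * p
    return 1.0 / (joint_daily_probability * days_per_year)
```

With λ = 1% (independence) one finds 40 years, with λ = 30% about 16 months, and with λ = 5% eight years, which are the figures quoted above.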
As for the theoretical implications, the results we have just presented make the Gaussian-copula approximation invalid, strictly speaking. However, for assets whose tail dependence is only of the order of 5%, this kind of approximation can be considered not unreasonable, as the small calculation above has just shown, and insofar as many other sources of error (notably in the description of the marginals) must be taken into account in the overall assessment of the uncertainty associated with the estimation of the joint distribution as a whole. Moreover, in the cases where the tail dependence is too large for the Gaussian copula to be retained, one should also question the relevance of the elliptical-copula modelling we had initially chosen. Indeed, we have just shown that the agreement between this kind of representation and the factor model, which is solidly anchored in financial theory, is not very satisfactory. In addition, the results of Section 9.2 show that over the period 1980-2000 the tail dependence is slightly stronger in the lower tail than in the upper tail (a point confirmed by the non-parametric approach), even though in most cases this difference is not statistically significant. That said, this difference, systematically in favour of the negative tail, cannot be neglected, and it contradicts the hypothesis of elliptical copulas, which are symmetric in their positive and negative parts and therefore exhibit identical upper and lower tail dependence.
Part Three

Measures of extreme risks and application to portfolio management
Chapter 10

The measurement of risk

In this chapter we present some of the theories that have made it possible to quantify the risk associated with a financial asset or a portfolio. The sources of risk are in fact as diverse as they are varied. One may cite, for instance, default risk, linked to the possibility that a counterparty cannot meet its obligations, or liquidity risk, linked to the limited capacity of markets to absorb a massive inflow of securities for sale or purchase; but it seems to us that the principal source of risk is market risk, that is, the risk associated with the fluctuations of financial assets. This is why we shall confine ourselves to describing how to quantify this type of risk. Since the middle of the twentieth century, several theories have emerged that attempt to capture the behaviour of individuals facing uncertainty, so as to derive decision rules and risk measures. One may cite first the work of von Neumann and Morgenstern (1947) and Savage (1954) on the formalisation of expected-utility theory, aimed at describing the preferences of economic agents, and the introduction of the notion of risk aversion. Then, in view of the incompatibilities of this theory with certain behaviours observed by Allais (1953) and Ellsberg (1961), new approaches appeared, such as the "prospect theory" of Kahneman and Tversky (1979), which accounts on the one hand for the fact that agents care more about the prospective changes of their wealth than about their wealth itself, and on the other hand for the fact that they treat gains and losses asymmetrically: they are risk-averse with respect to potential gains but risk-seeking with respect to future losses.
A more general alternative then appeared (Quiggin 1982, Gilboa and Schmeidler 1989) and developed around the so-called non-additive models, that is, models in which probabilities no longer enjoy the additivity property and are therefore replaced by capacities, which makes it possible in particular to account for the phenomenon of probability distortion often observed in most agents. More recently, in a setting narrower than that of decision theory, Artzner, Delbaen, Eber and Heath (1999) proposed a new approach to the notion of risk, based on the minimal properties one is entitled to expect of a risk measure. This gave rise to the notion of coherent risk measures, later extended by Heath (2000) and then Föllmer and Schied (2002a) to the notion of convex risk measures. However, as we shall see, this type of risk measure does not always seem best suited to quantifying the risks of a portfolio. Indeed, it seems to us that two types of risk must be carefully distinguished:
– first, what may be called the measurement of risk in terms of economic capital, that is, the amount of money a portfolio manager or an institution must hold in order to meet its obligations and thus avoid ruin,
– and second, the risk linked to the statistical fluctuations of the wealth or return of a portfolio around the profitability target fixed beforehand.
It is to account for this second category of risks that we propose another approach, in which the moments of the distribution of an asset's returns are used to quantify the risk associated with the fluctuations of that asset.
10.1
Utility theory
Our aim in this section is to review the recent advances of decision theory, which is known to play a very important role in economics and finance through the notion of utility. Our presentation only scratches the surface of this vast problem, and we refer the reader to the article by Cohen and Tallon (2000), in particular, for further details.
10.1.1
Utility theory under certainty
Utility theory has its roots in the current of social thought developed from the end of the 18th century to the middle of the 19th by J. Bentham and J.S. Mill, the founders of the philosophy of utilitarianism[1], according to which an action must be judged by its results and its consequences for the well-being of individuals. In this, utilitarianism fiercely opposes the moral rigorism advocated, at the same period, by Kant, for whom the value of an action can only be judged by the principles and intentions behind it. Far from these philosophical quarrels, economists quickly seized on the principle that individuals act so as to maximise their well-being, and made it the driving force of the individual behaviour of economic agents. The introduction of the notion of utility into economics can in fact be traced back to Adam Smith (1776), through the distinction he drew between the concepts of "value in exchange" (price) and "value in use" (utility) of a good, which underlie the law of supply and demand and the notion of market equilibrium. The mathematical formalisation of utility theory came only nearly two centuries later, and it rests (under certainty) on two simple axioms describing the agents' capacity to determine their preferences. Consider from now on the set B of financial assets (or, more generally, goods) accessible to economic agents, and postulate two simple axioms, one of comparability and one of continuity:

AXIOM 1 (COMPARABILITY) An economic agent is able to establish a preference between all the assets of B. This amounts to saying that there exists a complete preorder "asset X is preferred to asset Y", denoted X ≽ Y, between all assets X, Y ∈ B.

AXIOM 2 (CONTINUITY) The order relation ≽ is continuous.
This means that, given three assets X, Y, Z ∈ B, there always exist two reals α, β ∈ ]0, 1[ such that, on the one hand, the portfolio composed of a proportion α of asset X and (1 − α) of asset Y is strictly preferred to asset Z and, on the other hand, asset Z is strictly preferred to the portfolio composed of a proportion β of asset X and (1 − β) of asset Y:

α X + (1 − α) Y ≻ Z   and   Z ≻ β X + (1 − β) Y.    (10.1)

[1] Strictly speaking, the beginnings of the philosophy of utility go back rather to the end of the 17th century with Hobbes, then Helvétius at the beginning of the 18th (or even, if one wishes to go that far back in time, to Epicurus).
Given these two axioms, and a few purely technical assumptions that we omit, it can be shown that there exists a so-called utility function such that:

DEFINITION 5 (UTILITY FUNCTION) The preference relation ≽ on the set of assets B can be represented by a utility function U : B → R such that:

∀(X, Y) ∈ B², X ≽ Y ⇐⇒ U(X) ≥ U(Y).    (10.2)

This definition simply means that asset X is preferred to asset Y if and only if the utility (or value in use) of asset X is greater than the utility of asset Y. Obviously, the utility function is not unique, since for any strictly increasing function g : R → R, the function V = g ∘ U is also a utility function. The function U thus defined therefore has only ordinal value. If one wishes to regard an increase of the utility function as measuring an increase in the economic agent's satisfaction, one must grant the utility function cardinal value, and it is then defined only up to an increasing affine transformation. In all that follows we shall consider only cardinal utility functions. If, as is most frequent, one is interested in the utility of an individual's wealth W, one easily deduces from the behaviour of economic agents that the function U(W) is increasing, which reflects the lure of gain, or insatiability, and generally concave, which expresses the marginal decrease of the utility of wealth: one hundred euros do not represent the same utility for an agent whose entire wealth is one thousand euros as for one worth a million euros.
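The diminishing marginal utility of wealth can be illustrated numerically. This is a minimal sketch under the assumption of a logarithmic utility u(W) = ln W (one concave choice among many); the wealth levels are those of the example in the text.

```python
import math

def marginal_gain(wealth, bonus=100.0):
    """Utility gained from an extra `bonus` euros, for u(W) = ln(W)."""
    return math.log(wealth + bonus) - math.log(wealth)

# The same 100 euros matter far more to the poorer agent:
print(marginal_gain(1_000))      # ~0.0953
print(marginal_gain(1_000_000))  # ~0.0001
```

With any concave utility the first quantity exceeds the second, which is exactly the marginal decrease discussed above.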
10.1.2
Decision theory under risk
Consider now the behaviour of economic agents with respect to assets whose future value is not perfectly known and depends, at the future time T, on the state of nature in which the world will then be. We are then faced with a decision problem under risk. The first example of the resolution of such a problem goes back to Daniel Bernoulli (1738), who appears as the precursor of the introduction of expected utility, which he used to resolve the famous St. Petersburg paradox. In this paradox, an individual is offered the possibility of playing the following game: a perfectly balanced coin is tossed as many times as necessary for tails to appear. At that moment the game stops and the player receives 2^n euros, n being the number of times the coin was tossed. The question is then how much the individual is willing to pay in order to take part in this game. A simple computation of the mathematical expectation shows that on average the game offers an infinite gain[2]. So, to be fair, the player should agree to pay a stake which, if not infinite, is at least colossal. Yet in reality one observes that players hardly agree to pay more than a few euros to take part in the game, whence the paradox. The solution proposed by Bernoulli (1738) consists in supposing that players are not interested in the mean value of the expected gains but rather in the expectation of the logarithm of the gains, and one then obtains:

∑_{n=1}^{∞} 2^{−n} ln(2^n) = 2 ln 2.    (10.3)
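Bernoulli's series can be checked numerically. This is a minimal sketch: a partial sum with 60 terms (an arbitrary truncation, sufficient since the tail of n·2^(−n) is negligible) converges to 2 ln 2 ≈ 1.386, while the expected monetary gain ∑ 2^(−n)·2^n diverges.

```python
import math

def expected_log_gain(n_terms=60):
    """Partial sum of sum_{n>=1} 2**-n * ln(2**n), which tends to 2*ln(2)."""
    return sum(2.0**-n * math.log(2.0**n) for n in range(1, n_terms + 1))

print(expected_log_gain())  # ~1.3863
print(2 * math.log(2))      # 1.3863...
```

So the "value in use" of the game is finite and modest, which is why players only offer a few euros.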
Thus, to take up Adam Smith's terminology, players are not interested in the "value in exchange" of the game (here, the expected gains), but rather in its "value in use", and hence in the expectation of the logarithm

[2] The probability of winning 2^n euros equals the probability of obtaining heads n times in a row, namely (1/2)^n.
of the gains. This approach coincides exactly with the expected-utility theory whose foundations von Neumann and Morgenstern (1947) would lay more than two centuries later, and which we now present. To this end, consider the set Ω of the states of nature and a σ-algebra F on Ω, so that the space (Ω, F) is measurable. To each asset X ∈ B is associated a probability law P_X on (Ω, F) representing the distribution of the future value of the asset X. By abuse of language, and to lighten the notation, the random variable giving the future value of the asset X ∈ B will itself be denoted X (but this time X ∈ (Ω, F, P_X)). As previously under certainty, we assume that economic agents are able to establish a total preorder on the set of assets, or of the assets' future values, and that these preferences are continuous. We further assume:

AXIOM 3 (INDEPENDENCE) For all assets X, Y, Z ∈ B and every real α ∈ ]0, 1],

X ≽ Y ⇐⇒ α X + (1 − α) Z ≽ α Y + (1 − α) Z,    (10.4)
which assumes that adjoining the same asset does not alter the order of preferences. This axiom is central to the expected-utility theory of von Neumann and Morgenstern (1947). It is indeed thanks to it that one can show that the utility U of an asset X whose future value is risky can be expressed as

U(X) = E[u(X)],    (10.5)
where u(·) is a continuous, increasing function defined up to an increasing affine transformation. The function u(·) is therefore itself a utility function, so that the utility U of a risky good appears as the average of the utility u of that same good in each of the states of nature. The function u makes it possible to define the notion of risk aversion. Following Rothschild and Stiglitz (1970), an asset Y is riskier than an asset X if, for every increasing and concave function u(·), E[u(X)] ≥ E[u(Y)]. This is in fact equivalent (Levy 1998) to X dominating Y in the sense of second-order stochastic dominance:

∀t ∈ R,   ∫_{−∞}^{t} F_Y(y) dy ≥ ∫_{−∞}^{t} F_X(x) dx.    (10.6)
Thus an individual who prefers asset X to asset Y displays risk aversion, and a concave utility function u(·) characterises a risk-averse individual. The absolute risk-aversion coefficient of this individual is then defined by

a = − u″ / u′.    (10.7)
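The integral condition (10.6) can be verified numerically on a toy example. This is a minimal sketch under chosen assumptions: X is uniform on {−1, 1} and Y uniform on {−2, 2}, so that Y is a mean-preserving spread of X and hence riskier in the Rothschild-Stiglitz sense; the step CDFs are integrated on a fine grid.

```python
import numpy as np

def integrated_cdf(values, probs, grid):
    """Integral of the (step) CDF from -inf to each point of `grid`,
    approximated by a Riemann sum with the grid spacing."""
    cdf = np.array([(probs * (values <= t)).sum() for t in grid])
    dt = grid[1] - grid[0]
    return np.cumsum(cdf) * dt

grid = np.linspace(-3.0, 3.0, 6001)
x_int = integrated_cdf(np.array([-1.0, 1.0]), np.array([0.5, 0.5]), grid)
y_int = integrated_cdf(np.array([-2.0, 2.0]), np.array([0.5, 0.5]), grid)

# Condition (10.6): the integrated CDF of Y lies above that of X everywhere.
print(bool(np.all(y_int >= x_int - 1e-3)))  # True: X dominates Y at order 2
```

Since both payoffs have the same mean, the two integrals coincide at the right end of the grid, while Y's lies strictly above in between, as required for second-order dominance.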
The notion of second-order stochastic dominance can be extended to any order n, and an individual displaying risk aversion in the sense of nth-order stochastic dominance has a utility function u(·) such that, for all x and all k ≤ n,

(−1)^k u^(k)(x) ≤ 0,    (10.8)

(Levy 1998). Such utility functions characterise a so-called standard risk aversion, in Kimball's (1993) terminology, and for n = 4, for instance, an individual whose utility function satisfies (10.8) is said to be insatiable (u′ > 0), risk-averse (u″ < 0), prudent (u^(3) > 0) and temperate (u^(4) < 0).
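Condition (10.8) can be checked in closed form for one standard example. This is a minimal sketch, assuming the exponential (CARA) utility u(x) = −exp(−a x), whose k-th derivative is −(−a)^k exp(−a x), so that (−1)^k u^(k)(x) = −a^k exp(−a x) ≤ 0 for every k: such an agent is insatiable, risk-averse, prudent and temperate.

```python
import math

def kth_derivative_cara(x, k, a=2.0):
    """k-th derivative of u(x) = -exp(-a*x), i.e. -(-a)**k * exp(-a*x)."""
    return -((-a) ** k) * math.exp(-a * x)

for k in range(1, 5):
    # Condition (10.8): (-1)**k * u^(k)(x) <= 0 for k = 1, ..., 4.
    print(k, (-1) ** k * kth_derivative_cara(1.0, k) <= 0)  # all True
```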
This simple and parsimonious formulation of decision theory under risk has helped make the approach of von Neumann and Morgenstern (1947) very popular. However, this simplicity is not without raising some difficulties and inconsistencies, foremost among which is the violation of the independence postulate, the keystone of this theory. Indeed, Allais (1953) showed through simple tests, consisting in offering a series of alternatives to subjects, that the majority of them make choices that contradict this independence axiom. In particular, agents seem very sensitive to small variations of probability in the neighbourhood of certainty: they attach great importance to the passage of a probability from 0 to 0.01 or from 0.99 to 1, whereas a change from 5 to 10 percent in the neighbourhood of a probability level of 0.50 very often leaves them indifferent. Agents are thus subject to a distortion of their perception of probabilities. Besides this empirical contradiction, a theoretical limitation must be added: the function u(·) plays a double role. It simultaneously quantifies the decision-maker's risk aversion and the marginal decrease of the utility of wealth, so that it is impossible to model an agent who is at once risk-seeking and has decreasing marginal utility. In fact, these two contradictions can be lifted by weakening the independence postulate, which is replaced by:

AXIOM 4 (COMONOTONIC SURE THING UNDER RISK) Let X, Y ∈ B be two assets whose future values are given by the random variables (assumed discrete for simplicity) X = (x₁, p₁; ⋯; x_k, p_k; ⋯; x_n, p_n) and Y = (y₁, p₁; ⋯; y_k, p_k; ⋯; y_n, p_n), such that x₁ ≤ ⋯ ≤ x_k ≤ ⋯ ≤ x_n and y₁ ≤ ⋯ ≤ y_k ≤ ⋯ ≤ y_n with x_k = y_k.
Now let X′, Y′ ∈ B be the assets obtained by replacing x_k with x′_k in the assets X and Y, in such a way that x_{k−1} ≤ x′_k ≤ x_{k+1} and y_{k−1} ≤ x′_k = y′_k ≤ y_{k+1}. Then

X ≽ Y ⇐⇒ X′ ≽ Y′.    (10.9)

This simply means that the preference order of two assets is not modified when their common future value is modified without changing its rank, which makes all the difference from the independence axiom. Granted this, the paradoxes of Allais (1953) are lifted, and one is led to generalise expected-utility theory into rank-dependent utility theory, originally developed by Quiggin (1982). Indeed, it can then be shown that the behaviour of any economic agent can be characterised by two increasing functions. The first, u : B → R, defined up to an increasing affine transformation, plays the role of utility function under certainty. The second, ϕ : [0, 1] → [0, 1], is unique and represents the probability transformation (or distortion) function. The utility of the asset X is then:

U(X) = − ∫ u(x) dϕ(Pr{X > x}) = E_{ϕ∘P_X}[u(x)].    (10.10)

This integral is in fact a Choquet integral, that is, an integral with respect to a generally non-additive measure. In the particular case ϕ(x) = x, one obviously recovers the expression of the expected utility of von Neumann and Morgenstern (1947), which is therefore subsumed by rank-dependent expected-utility theory. The interest of this new formulation is to decouple completely the notions of marginal decrease of utility, measured by the concavity of the function u(·), and of risk aversion, entirely characterised by the probability transformation function ϕ(x): an agent whose probability transformation function satisfies ϕ(x) ≤ x will be said to be pessimistic under risk.
Indeed, taking up the discrete example of Axiom 4, equation (10.10) becomes:

U(X) = u(x₁) + ϕ(p₂ + ⋯ + p_n) · [u(x₂) − u(x₁)] + ⋯ + ϕ(p_n) · [u(x_n) − u(x_{n−1})],    (10.11)
which shows that such an agent begins by computing the minimal utility the asset X can bring him, namely u(x₁), and then adds the possible increments of utility u(x_k) − u(x_{k−1}) he may receive, weighting them not by their probabilities of occurrence but by his probability distortion function. Thus, when ϕ(x) ≤ x, he underestimates the probability of favourable events and underweights the increments of utility he may derive from them.
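The computation described by (10.11) can be sketched directly. This is a minimal illustration under chosen assumptions: a three-outcome lottery, logarithmic utility, and the pessimistic distortion ϕ(p) = p² (which satisfies ϕ(p) ≤ p); ϕ = identity recovers plain von Neumann-Morgenstern expected utility.

```python
import math

def rank_dependent_utility(outcomes, probs, u, phi):
    """Equation (10.11): outcomes must be sorted in increasing order."""
    U = u(outcomes[0])                       # minimal utility u(x_1)
    for k in range(1, len(outcomes)):
        tail = sum(probs[k:])                # Pr{X >= x_k}
        U += phi(tail) * (u(outcomes[k]) - u(outcomes[k - 1]))
    return U

outcomes, probs = [100.0, 200.0, 400.0], [0.2, 0.5, 0.3]
eu  = rank_dependent_utility(outcomes, probs, math.log, lambda p: p)     # EU
rdu = rank_dependent_utility(outcomes, probs, math.log, lambda p: p**2)  # pessimist
print(eu, rdu)   # the pessimist's utility is lower
```

With ϕ = identity the routine reproduces the ordinary expected utility, and with ϕ(p) = p² it yields a strictly smaller value, illustrating the underweighting of favourable increments.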
10.1.3
Decision theory under uncertainty
We have just presented the foundations of decision theory under risk, that is, when the decision-maker knows objectively the probabilities associated with the various states of nature Ω. However, in most economic and financial situations these probabilities are only partially revealed, or even totally unknown. It is therefore appropriate to examine this situation, termed decision theory under uncertainty, in contrast to decision theory under risk, where the probabilities on the states of nature are assumed given. The classical (or Bayesian) approach to this problem is that of Savage (1954), which consists in reducing the problem of decision under uncertainty to a problem of decision under risk, with the help of the notion of subjective probabilities. These so-called subjective probabilities differ from objective probabilities in the same way that horse races differ from the game of roulette at the casino: at roulette, the table being perfectly balanced, all players know the probability that the three, red, odd or passe comes up, whereas at the races nobody knows exactly the probability that this or that horse will win. Moreover, objective probabilities have a very simple interpretation insofar as they are related to the typical frequency of occurrence of an event. Indeed, the probability of obtaining heads when tossing a perfectly balanced coin is one half, quite simply because, repeating this coin toss a great number of times, one observes that the coin lands on heads half the time, whoever performs the tosses. The objective probability is thus an intrinsic property of the object (here, the coin) or of the event (here, landing on heads) considered.
Subjective probabilities, on the contrary, measure a degree of belief in the likelihood of an event. What is the probability that extraterrestrial life exists? We cannot run experiments on this subject. So the probability granted to this type of event can only be a function of each person's opinion on the question. A subjective probability therefore does not have a unique value and depends on each individual. Nevertheless, these subjective probabilities obey the same rules as objective probabilities, by virtue of the "Dutch book" theorem (de Finetti 1937). According to this theorem, any bet based on a set of subjective probabilities is fair (and hence cannot lead to a sure gain) if and only if the subjective probability attributed to a certain event equals one, as does the sum of the probabilities of two complementary events. Given this new interpretation of probabilities, problems of decision under uncertainty can be brought back to "simple" problems of decision under risk, which, besides the total-preorder axiom, rests, according to Savage (1954), on the following axiom:

AXIOM 5 (SURE-THING PRINCIPLE) Given a subset Ω̃ ⊂ Ω of the set of states of nature and assets X, X′, Y, Y′ ∈ B such that ∀ω ∈ Ω̃, X(ω) = X′(ω), Y(ω) = Y′(ω) and ∀ω ∉ Ω̃, X(ω) = Y(ω), X′(ω) = Y′(ω), then

X ≽ Y ⇐⇒ X′ ≽ Y′.    (10.12)

This means that a common modification of the common part of two assets does not change the order
of preferences. Added to the comparability and continuity axioms, this axiom allows one to assert that there exists a unique probability P on (Ω, F) and a continuous, increasing function u(·) (defined up to an increasing affine function) such that the utility U of the asset X is given by

U(X) = E_P[u(X)],    (10.13)
where u(·) plays, as usual, the role of utility function under certainty. This expression is analogous to the one obtained in the expected-utility theory of von Neumann and Morgenstern (1947), with the notable difference that here the probability measure P is subjective and not objective. The sure-thing axiom is extremely strong, for it makes it possible to treat any problem of decision under uncertainty as a problem of decision under risk. This is in fact not very realistic, so that this approach is rejected on the theoretical as well as on the practical level. Indeed, just as Allais (1953) had proved that the independence axiom was empirically contradicted by the majority of agents, Ellsberg (1961) was able to show that in very simple situations of choice under uncertainty the sure-thing axiom did not withstand experiment either. In fact, most agents display ambiguity aversion, in the sense that, for the same stake, they prefer to bet for or against an event of known probability P rather than for or against an event of which they only know that its probability lies between P − ε and P + ε. One can then show that agents satisfying the model of Savage (1954) are indifferent to ambiguity, since they cannot tell the difference between these two bets. To overcome this empirical contradiction the sure-thing axiom must be weakened. To this end, Schmeidler (1989) proposed the following alternative:

AXIOM 6 (COMONOTONIC SURE THING) Let {Ω_k}_{k=1}^{n} be a partition of the set of states of nature Ω and X, Y ∈ B two assets whose future values are given by the random variables X = (x₁, Ω₁; ⋯; x_k, Ω_k; ⋯; x_n, Ω_n) and Y = (y₁, Ω₁; ⋯; y_k, Ω_k; ⋯; y_n, Ω_n), such that x₁ ≤ ⋯ ≤ x_k ≤ ⋯ ≤ x_n and y₁ ≤ ⋯ ≤ y_k ≤ ⋯ ≤ y_n with x_k = y_k.
Now let X′, Y′ ∈ B be the assets obtained by replacing x_k with x′_k in the assets X and Y, in such a way that x_{k−1} ≤ x′_k ≤ x_{k+1} and y_{k−1} ≤ x′_k = y′_k ≤ y_{k+1}. Then

X ≽ Y ⇐⇒ X′ ≽ Y′.    (10.14)
This axiom is very close to the comonotonic sure-thing axiom under risk, except that this time the probabilities p_i of the states Ω_i are not known. With its help, one shows that there exists, no longer a unique probability P on {Ω, F}, but a unique capacity[3] v on {Ω, F} and an increasing, continuous function u(·) (defined up to an increasing affine

[3] A capacity v is a set function from {Ω, F} to [0, 1] such that:
– v(∅) = 0,
– v(Ω) = 1,
– ∀A, B ∈ F, A ⊂ B =⇒ v(A) ≤ v(B).
Recall that an (additive) probability measure P would in addition satisfy ∀A, B ∈ F, P(A ∪ B) = P(A) + P(B) − P(A ∩ B). A capacity is said to be convex if ∀A, B ∈ F, v(A) + v(B) ≤ v(A ∪ B) + v(A ∩ B). Every convex capacity has a non-empty core, where the core of v is core(v) = {P ∈ P | ∀A ∈ F, P(A) ≥ v(A)}, and P is the set of additive probability measures on {Ω, F}.
transformation) such that the utility of the asset X is

U(X) = ∫ u(X) dv,    (10.15)
which is a Choquet integral with respect to the non-additive measure (capacity) v. One may note the very strong resemblance of this expression to the one obtained for rank-dependent utility (cf. equation (10.10)). In fact, in (10.10) the expression ϕ ∘ P is a capacity. Moreover, if in the model of Schmeidler (1989) there exists an objective probability P on {Ω, F}, the capacity v can be expressed as v = ϕ ∘ P, where ϕ is unique, and v is convex if and only if ϕ is too. Within this model, Montessano and Giovannoni (1996) define the notion of uncertainty aversion: an agent displays uncertainty aversion if there exists a probability law P such that, for every X ∈ B, ∫ u(X) dv ≤ E_P[u(X)], which implies that the core of the capacity v contains P and is therefore non-empty. Conversely, one can thus assert that any agent characterised by a convex capacity (hence with a non-empty core) is uncertainty-averse. Intuitively, an uncertainty-averse agent will always assign to an event the least favourable "probability" among all the probabilities attributed to that event by the laws belonging to the core of the capacity. Schmeidler (1986) provides an interpretation of this model in terms of beliefs. Indeed, under the hypothesis that the capacity v is convex, its core is non-empty and
∀X ∈ B,   ∫ u(X) dv = min_{P ∈ core(v)} E_P[u(X)],    (10.16)
so the utility U(X) is given by the minimal expectation of u(X) computed over a set of scenarios. This led Gilboa and Schmeidler (1989), Nakamura (1990), Chateauneuf (1991), and also Casadesus-Masanell, Klibanoff and Ozdenoren (2000), to develop so-called multi-prior models, in which the agents give themselves, a priori, a set of probability distributions P (or scenarios) and define the utility as

∀X ∈ B,   U(X) = min_{P ∈ P} E_P[u(X)].    (10.17)
It must be noted that the two approaches are not equivalent, since not every (closed and convex) set P is necessarily the core of a capacity v. Moreover, this last approach may seem excessively pessimistic, since it retains only the smallest possible utility given the set of scenarios considered. In any case, it is much more pessimistic than the utility derived from the model of Schmeidler (1986), since Jaffray and Philippe (1997) showed that the Choquet integral can always be expressed as the weighted sum of two terms, the minimum and the maximum of the expected utility with respect to a set of probability distributions, the relative weight of these two terms making it possible to define an index of the agent's pessimism. However, despite the recent advances of decision theory that we have just presented, it must be acknowledged that one of its fundamental limitations remains, namely how to implement this theory concretely and practically. Indeed, insofar as every agent has a different utility function, it is very difficult to decide objectively which one to use. It will therefore prove useful to consider other decision tools and other means of measuring risks.
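The maxmin rule (10.17) is straightforward to sketch on a toy problem. This is a minimal illustration under chosen assumptions: a binary payoff, logarithmic utility, and an ambiguity band in which the probability of the low state ranges over {0.4, 0.5, 0.6}.

```python
import math

def maxmin_utility(payoffs, priors, u=math.log):
    """Multi-prior utility (10.17): worst-case expected utility over
    the a-priori family of probability laws (the 'scenarios')."""
    return min(sum(p * u(x) for p, x in zip(prior, payoffs))
               for prior in priors)

payoffs = [100.0, 200.0]
priors = [[0.4, 0.6], [0.5, 0.5], [0.6, 0.4]]   # ambiguity on the low state
print(maxmin_utility(payoffs, priors))
```

The minimum is attained by the scenario placing the most weight on the low payoff, which is the pessimism discussed above.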
10.2 Coherent risk measures

10.2.1
Definition
According to Artzner et al. (1999), the risk associated with the variations in the value of a position is measured by the amount of money that must be invested in a risk-free asset so that, in the future, this position remains acceptable, that is, so that the possible losses linked to the future value of the position do not jeopardise the plans of the fund manager, the firm or, more generally, the person or institution guaranteeing the position. In this sense, a risk measure constitutes, for Artzner et al. (1999), a measure of economic capital. The risk measure, denoted ρ in all that follows, may therefore be positive or negative according to whether, respectively, the sum invested in the risk-free asset must be increased to guarantee the risky position, or this sum can be reduced while maintaining the guarantee. A risk measure will be said to be coherent in the sense of Artzner et al. (1999) if it satisfies the four properties, or axioms, that we set out below. Let us first agree to call G the space of risks. If the space of the states of nature Ω is assumed finite (a hypothesis made by Artzner et al. (1999)), G is isomorphic to R^N, and a risky position X is then nothing other than a vector of R^N. A risk measure ρ is then a map from R^N into R. A generalisation to other risk spaces G was proposed by Delbaen (2000). Consider then a risky position X and a sum of money α invested in the risk-free asset at the beginning of the period, so that at the end of the period the amount invested in the risk-free asset is α · (1 + r), where r denotes the risk-free interest rate; then:

AXIOM 7 (TRANSLATION INVARIANCE) ∀X ∈ G and ∀α ∈ R,

ρ(X + α · (1 + r)) = ρ(X) − α.    (10.18)
This simply means that investing a sum α in the risk-free asset decreases the risk by the same quantity α. In particular, for any risky position X, ρ(X + ρ(X) · (1 + r)) = 0. Consider now two risky positions X₁ and X₂, representing for instance the positions of two traders on a trading floor. It is convenient for the supervisor of this trading floor that the aggregate risk of all the traders be less than or equal to the sum of the risks of each trader; in particular, it is desirable that the risk associated with the position (X₁ + X₂) be less than or equal to the sum of the risks associated with the positions X₁ and X₂ taken separately:

AXIOM 8 (SUB-ADDITIVITY) ∀(X₁, X₂) ∈ G × G,

ρ(X₁ + X₂) ≤ ρ(X₁) + ρ(X₂).    (10.19)
Moreover, sub-additivity guarantees that a portfolio manager has an interest in aggregating his various positions so as to reduce their risk through diversification. The third axiom is an axiom of homogeneity (or of extensivity, to borrow the physicists' language):

AXIOM 9 (POSITIVE HOMOGENEITY) ∀X ∈ G and ∀λ ≥ 0,

ρ(λ · X) = λ · ρ(X),    (10.20)
which simply means that the risk of a position grows with the size of that position; more precisely, the risk is here assumed proportional to the size of the risky position. We shall return a little further on to the underlying hypothesis that such an axiom suggests. Finally, knowing that in all the states of nature the risk X leads to a loss greater than the asset Y (that is, all the components of the vector X of R^N are always less than or equal to those of the vector Y), the risk measure ρ(X) must be greater than or equal to ρ(Y):

AXIOM 10 (MONOTONICITY) ∀X, Y ∈ G such that X ≤ Y,

ρ(X) ≥ ρ(Y).    (10.21)
Thus stated, these four axioms define what are called coherent risk measures.
10.2.2 Some examples of coherent risk measures
Many risk measures commonly used, both in academic research and by practitioners, turn out not to be coherent in the sense of Artzner et al. (1999). Indeed, it is obvious that the variance, whose use as a risk measure goes back to Markowitz (1959), does not satisfy the monotonicity axiom. Likewise, it is easy to show that the Value-at-Risk is generally not subadditive. Recall that the Value-at-Risk computed at confidence level α is defined by

DEFINITION 6 (VALUE-AT-RISK) Let X ∈ G have a continuous distribution. The Value-at-Risk computed at confidence level α ∈ [0, 1], denoted VaR_α, is given by
Pr[X + (1 + r) · VaR_α ≥ 0] = α. (10.22)

In this respect, the class of assets whose joint distribution is elliptical constitutes a notable exception, for which Embrechts et al. (2002) have shown that the VaR is subadditive and thus remains a coherent risk measure. Given the very widespread use of VaR among practitioners, it was desirable to try to construct a coherent risk measure as close as possible to the VaR, and which moreover completes the information the VaR provides, namely: conditional on suffering a loss exceeding the VaR, what is the average observed loss? This led to the definition of the Expected Shortfall:

DEFINITION 7 (EXPECTED SHORTFALL) Let X ∈ G have a continuous distribution. The Expected Shortfall computed at confidence level α is
ES_α = −E[ X/(1+r) | X/(1+r) ≤ −VaR_α ]. (10.23)

When the distribution of X is not assumed continuous, the expression of the Expected Shortfall is somewhat more involved (see Acerbi and Tasche (2002) or Tasche (2002), for instance). When one wishes to account for large risks, another approach, to which we shall return in detail in section 10.3, consists in taking into account the effect of moments of order higher than two.
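The failure of subadditivity for the VaR can be seen on a minimal sketch (hypothetical numbers, and taking r = 0 for simplicity): two independent loans that each default with probability 0.04.

```python
import numpy as np

# Two independent loans: each loses 100 with probability 0.04, else 0.
# With r = 0, VaR at confidence level 0.95 is minus the 5% quantile of the P&L.
rng = np.random.default_rng(1)
n = 1_000_000
X1 = np.where(rng.random(n) < 0.04, -100.0, 0.0)
X2 = np.where(rng.random(n) < 0.04, -100.0, 0.0)

def var(x, alpha=0.95):
    return -np.quantile(x, 1 - alpha)

# Each loan alone: P(loss) = 0.04 < 0.05, so the 5% quantile is 0 and VaR = 0.
# Together: P(at least one default) = 1 - 0.96**2 ≈ 0.078 > 0.05, so VaR = 100.
print(var(X1), var(X2), var(X1 + X2))
```

The aggregated position thus carries a strictly larger VaR than the sum of the stand-alone VaRs, violating Axiom 8.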
The question of the coherence of this type of moment-based risk measure is addressed by Delbaen (2000) and more specifically by Fischer (2001). The latter shows that any risk measure
ρ(X) = −E[X] + a · σ_p, (10.24)
where 0 ≤ a ≤ 1 and
σ_p = (E[ max{E[X] − X, 0}^p ])^{1/p} (10.25)
is the (lower) centered semi-moment of order p, is a coherent risk measure. More generally, since any convex combination of coherent risk measures is itself a coherent risk measure, the measure
ρ(X) = −E[X] + Σ_{p=1}^{∞} a_p · σ_p, (10.26)
with
Σ_{p=1}^{∞} a_p ≤ 1 and a_p ≥ 0, (10.27)
is coherent. This provides a simple way of obtaining the coherent counterparts of certain risk measures or utility functions. For example, the risk measure ρ(X) = −E[X] + a · σ_2 can be regarded as the coherent generalization of the mean-variance utility function.
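As a sketch (illustrative parameters, not from the thesis), the semi-moment measure (10.24) can be computed empirically, and its subadditivity checked on a common sample; pathwise, it follows from the Minkowski inequality.

```python
import numpy as np

def semi_moment(x, p):
    # Lower centered semi-moment of order p: sigma_p = E[max(E[X]-X, 0)^p]^(1/p)
    return np.mean(np.maximum(x.mean() - x, 0.0) ** p) ** (1.0 / p)

def rho(x, a=0.8, p=3):
    # Fischer-type coherent measure: rho(X) = -E[X] + a * sigma_p, with 0 <= a <= 1
    return -x.mean() + a * semi_moment(x, p)

rng = np.random.default_rng(2)
X = rng.standard_t(df=5, size=50_000)
Y = 0.5 * X + rng.normal(size=50_000)  # correlated second position

print(rho(X + Y), rho(X) + rho(Y))     # subadditivity: lhs <= rhs
```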
10.2.3 Representation of coherent risk measures
The axioms presented in paragraph 10.2.1, together with the few examples we have just given, show that there exists a great variety of coherent measures: the axioms are not constraining enough to completely specify a (unique) coherent risk measure. It is therefore interesting to look for a representation of this type of measure. Artzner et al. (1999) have shown that

THEOREM 2 (REPRESENTATION OF COHERENT RISK MEASURES) A risk measure ρ is coherent if and only if there exists a family P of probability measures on the space of states of nature such that:
ρ(X) = sup_{P∈P} E_P[ −X/(1+r) ]. (10.28)

Thus, a coherent risk measure appears as the maximal expected loss over a set of feasible scenarios. It is then obvious that the larger the set of scenarios considered, the larger ρ(X) will be, all else being equal: the larger the scenario set, the more conservative the risk measure. This mathematical expression is reminiscent of certain formulas we encountered in decision theory under uncertainty (see equations 10.16 and 10.17). This is in fact quite natural, since the axioms chosen by Artzner et al. (1999) to define coherent risk measures are entirely similar to those underlying the utility model of Schmeidler (1986). The slight difference between expressions (10.16) and (10.28), namely the change from min to sup and the passage from an increasing utility function u(·) to the decreasing function −(·)/(1+r), simply comes from the fact that the decision maker or risk manager tends to maximize his utility while seeking to minimize his risk-taking.
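A minimal finite-state sketch of Theorem 2 (the payoffs and scenario probabilities below are hypothetical) shows the conservativeness property: enlarging the scenario family can only increase the measure.

```python
import numpy as np

r = 0.0  # discounting ignored for clarity
X = np.array([10.0, 2.0, -5.0, -20.0])  # payoff in each of 4 states of nature

# A family of probability measures ("generalized scenarios") on the 4 states.
P_small = [np.array([0.25, 0.25, 0.25, 0.25]),
           np.array([0.1, 0.2, 0.3, 0.4])]
P_large = P_small + [np.array([0.0, 0.0, 0.2, 0.8])]  # add a stress scenario

def rho(x, scenarios):
    # Coherent measure: worst expected discounted loss over the scenario set.
    return max(p @ (-x / (1 + r)) for p in scenarios)

print(rho(X, P_small), rho(X, P_large))
```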
As a consequence, in the multi-prior model, utility is a superadditive quantity, and not subadditive like coherent risk measures. Moreover, the specification of the function −(·)/(1+r) appearing in (10.28) comes from the translation invariance axiom, which imposes that for an investment α in the risk-free asset, ρ(α (1 + r)) = −α. However general it may be, the representation of coherent risk measures provided by Theorem 2 is not very simple to use or to implement. Indeed, while it is easy, in practice,
to use the Expected Shortfall as a coherent risk measure, or any other coherent measure with a simple analytical expression, how should one choose the set of feasible scenarios if one wants to retain the degree of generality offered by the representation theorem? This question hardly admits a satisfactory answer, and with a view to practical implementation the key point is rather whether the risk measure one wishes to use can be estimated from empirical data. Such risk measures are said to be law-invariant, and we shall henceforth restrict ourselves to the study of this type of coherent measure. If, in addition, one considers only comonotonically additive measures, one obtains what Acerbi (2002) calls spectral measures, for which Kusuoka (2001) and then Tasche (2002) proved the following representation theorem:

THEOREM 3 (REPRESENTATION OF SPECTRAL MEASURES) Let F be a continuous and convex distribution function and p a real number in [0, 1]. The risk measure ρ is a spectral measure, that is, coherent, law-invariant and comonotonically additive, if and only if it admits the representation
ρ(X) = p ∫₀¹ VaR_u(X) F(du) + (1 − p) VaR₁(X). (10.29)
If, moreover, F admits a density φ with respect to the Lebesgue measure (and assuming p = 1 for simplicity), then
ρ(X) = ∫₀¹ VaR_u(X) φ(u) du, (10.30)
and ρ(X) appears as the φ(u)-weighted sum of the VaR_u, which justifies, following Acerbi (2002), calling φ a "risk-aversion function", since φ quantifies the importance attached to the various risk levels indexed by the confidence level u. In the case where φ(u) = α⁻¹ · 1(u ≤ α), the Expected Shortfall at confidence level α is recovered. [...] In order to account for the limited liquidity of real markets, the homogeneity axiom may be replaced by:

AXIOM 11 (HOMOGENEITY IN AN ILLIQUID MARKET) There exists a constant β > 1 such that ∀X ∈ G and ∀λ ≥ 0,
ρ(λ · X) = λ^β · ρ(X), (10.33)
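Representation (10.30) can be sketched numerically (r = 0, illustrative sample): integrating the empirical quantile function against φ(u) = α⁻¹ · 1(u ≤ α) reproduces the empirical Expected Shortfall exactly, since the empirical quantile function is piecewise constant.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.sort(rng.standard_t(df=4, size=10_000))
n, alpha = x.size, 0.05
k = int(alpha * n)  # alpha * n assumed to be an integer here

# Empirical quantile function: q(u) = x_(j) on ((j-1)/n, j/n].
# Spectral measure with phi(u) = (1/alpha) * 1(u <= alpha):
#   rho = (1/alpha) * integral_0^alpha (-q(u)) du = -(1/alpha) * (1/n) * sum of k lowest x
rho_spectral = -(1.0 / alpha) * x[:k].sum() / n

# Expected Shortfall estimator: average of the k worst outcomes, sign-flipped.
es = -x[:k].mean()
print(rho_spectral, es)
```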
and the larger the constant β, the more heavily large positions are penalized. This axiom in fact only assumes that the impact of the limited liquidity of the market is the same for all assets. This may not be rigorously true, but it remains a very good approximation for companies of comparable size (Lillo et al. 2002). It should however be noted that, as such, this axiom is not compatible with the translation invariance axiom. Indeed, consider the risk ρ(λ(X + α · (1 + r))), with X ∈ G and α and λ two real numbers. Applying first the translation invariance axiom and then the illiquid-market homogeneity axiom, one obtains:

ρ(λ(X + α · (1 + r))) = ρ(λX + λα · (1 + r)), (10.34)
= ρ(λX) − λα, (10.35)
= λ^β · ρ(X) − λα. (10.36)

If the two axioms are now used in the reverse order, one gets:

ρ(λ(X + α · (1 + r))) = λ^β · ρ(X + α · (1 + r)), (10.37)
= λ^β · ρ(X) − λ^β · α, (10.38)

which contradicts the previous result given by equation (10.36). Hence, if one wants to restore the compatibility between the translation invariance axiom and the illiquid-market homogeneity axiom, the latter must be restricted to purely risky positions, which amounts to granting perfect liquidity to the risk-free asset. Another alternative consists in modifying the translation invariance axiom so that, for an investment of size α in the risk-free asset, ρ(α · (1 + r)) = −α^β, in which case the illiquidity risk of the risk-free asset is also taken into account. This approach is therefore not very satisfactory. A better solution was first proposed by Heath (2000) and then by Föllmer and Schied (2002a). Their idea consists, to begin with, in remarking that the subadditivity axiom can be replaced by a convexity axiom:

AXIOM 12 (CONVEXITY) ∀(X1, X2) ∈ G × G and ∀λ ∈ [0, 1],
ρ(λ X1 + (1 − λ) X2) ≤ λ ρ(X1) + (1 − λ) ρ(X2). (10.39)

This substitution is perfectly legitimate since, for homogeneous functions, convexity and subadditivity are equivalent. Note that, like the subadditivity axiom, the convexity axiom guarantees that aggregating risky positions ensures their diversification. Thus, coherent risk measures can be defined by a set of axioms equivalent to those stated in section 10.2.1, namely the axioms of translation invariance, homogeneity, convexity and monotonicity.
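A small numerical sketch (illustrative positions, not from the thesis) of the convexity inequality (10.39), using the empirical Expected Shortfall, which is coherent and hence convex:

```python
import numpy as np

def es(x, alpha=0.05):
    # Empirical Expected Shortfall: minus the mean of the worst alpha-fraction.
    k = int(alpha * x.size)
    return -np.sort(x)[:k].mean()

rng = np.random.default_rng(4)
n = 20_000
X1 = rng.standard_t(df=4, size=n)           # two positions on a common sample space
X2 = rng.normal(loc=0.1, scale=2.0, size=n)

lam = 0.3
lhs = es(lam * X1 + (1 - lam) * X2)
rhs = lam * es(X1) + (1 - lam) * es(X2)
print(lhs, rhs)  # convexity: lhs <= rhs
```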
To take liquidity risk into account, Heath (2000) and Föllmer and Schied (2002a) propose to drop the homogeneity axiom. In exchange, they define a new set of so-called convex risk measures, which encompass the coherent risk measures, and for which they give a representation theorem:

THEOREM 4 (REPRESENTATION OF CONVEX RISK MEASURES) A risk measure ρ is convex if and only if there exist a family Q of probability measures on the space of states of nature and a functional α on Q such that:
ρ(X) = sup_{Q∈Q} ( E_Q[−X] − α(Q) ), (10.40)
where the functional α is given by
α(Q) = sup_{X∈A_ρ} E_Q[−X], (10.41)
and
A_ρ = {X ∈ G | ρ(X) ≤ 0}. (10.42)

In the statement of the theorem we have omitted the discount factor in order to lighten the notation, but it can be reintroduced in an obvious way. Here again, as emphasized by Föllmer and Schied (2002b), the link with decision theory under uncertainty is immediate; it anchors these convex risk measures in utility theory and thereby gives them a very clear economic meaning. However, we have seen that this theory leads to extremely pessimistic decisions, since it retains only the minimal utility an agent can derive given the set of situations he considers. This same excessive pessimism affects convex (and hence coherent) risk measures, since here too the manager is sensitive only to the largest loss he may suffer. Finally, it should be pointed out that measuring risk in terms of economic capital remains insufficient. It does guarantee, at some given confidence level, that the portfolio or the firm will avoid ruin, which is fundamental from the regulator's point of view; but from the point of view of the fund manager or of a potential investor, this is not enough. One must also be able to measure the fluctuations around the targeted return, that is, of the portfolio's wealth around its expected (or average) wealth. Indeed, the quality of a portfolio is also judged by the regularity of its performance.
10.3 Measures of fluctuations

As we have just explained, the measurement of risk in terms of economic capital, necessary though it is (it constitutes in fact the first requirement), is nevertheless not sufficient. It seems quite desirable to be able to measure the fluctuations of an asset or of a portfolio around its mean value, or more generally around a previously set return target. The qualities of the portfolio will then be all the better as the fluctuations are smaller. It is therefore appropriate to look for the minimal properties that a fluctuation measure ρ must satisfy. This is set out in Malevergne and Sornette (2002c), which we shall present in chapter 14, section 1.2, and of which we give here a summary. First of all, we require that a fluctuation measure be positive:
AXIOM 13 (POSITIVITY) Let X ∈ G be a risky quantity; then ρ(X) ≥ 0. Moreover, ρ(X) = 0 if and only if X is riskless (or certain).

In particular, every risk-free asset has a fluctuation measure equal to zero, which is quite natural. Moreover, since adding a certain quantity to a risky value in no way modifies the fluctuations of the latter, we must have:

AXIOM 14 (TRANSLATION INVARIANCE) ∀X ∈ G and ∀µ ∈ R,
ρ(X + µ) = ρ(X). (10.43)

Finally, we require the fluctuation measure to be an increasing, and more specifically homogeneous, function of the size of the position:

AXIOM 15 (POSITIVE HOMOGENEITY) There exists a constant β ≥ 1 such that ∀X ∈ G and ∀λ ≥ 0,
ρ(λ · X) = λ^β · ρ(X). (10.44)

When β equals one, the fluctuation measure is extensive with respect to the size of the position, but then does not account for liquidity risk. Fluctuation measures satisfying axioms 14 and 15 are known as semi-invariants. There exist very many of them, among which one may cite for example the centered moments
µ_n(X) = E[(X − E[X])^n], (10.45)
or the cumulants
C_n(X) = (1 / (iⁿ · n!)) · dⁿ/dkⁿ ln E[e^{ikX}] |_{k=0}. (10.46)
The use of centered moments as a measure of the risk associated with an asset's fluctuations is not new. It goes back at least to Markowitz (1959), who chose to use the variance (the centered moment of order two) as the risk measure for financial assets. Later, Rubinstein (1973) showed that the centered moments of order higher than two, provided they exist (cf. chapters 1 and 3), arise naturally to quantify risks larger than those accounted for by the variance, by relating them to the expected utility theory of von Neumann and Morgenstern (1947). In the general case, that is, for the cumulants or for any other fluctuation measure, it does not seem, however, that a link with utility theory can be found. The positivity axiom makes it possible to restrict the acceptable semi-invariants. Indeed, by definition, the centered moments of even order are positive, but this is not necessarily the case for those of odd order. The situation is much less clear-cut for the cumulants, since no general result can be given concerning their positivity; everything depends, in fact, on the distribution of the random variable X. Starting from the centered moments, it is actually easy to construct a fluctuation measure that satisfies the three axioms, whatever the value of β. It suffices to consider the centered absolute moments
µ̄_β(X) = E[ |X − E[X]|^β ], (10.47)
and, more generally, µ̄_p^{β/p}. This also provides a very simple way of constructing other fluctuation measures. Indeed, it is easy to show that any (positive, though not necessarily convex) sum of fluctuation measures with the same degree of homogeneity β is a fluctuation measure of degree of homogeneity β. Hence, in the spirit of Acerbi's (2002) spectral measures, we can define
ρ(X) = ∫ dα φ(α) E[ |X − E[X]|^α ]^{β/α}, (10.48)
provided the integral exists. Here again, the function φ quantifies the risk manager's aversion to large fluctuations.
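As a numerical sketch (illustrative values), the centered absolute moment µ̄_β of (10.47) can be checked against the three axioms: positivity, translation invariance, and homogeneity of degree β.

```python
import numpy as np

def mu_bar(x, beta):
    # Centered absolute moment of order beta: E[|X - E[X]|^beta]
    return np.mean(np.abs(x - x.mean()) ** beta)

rng = np.random.default_rng(5)
X = rng.standard_t(df=6, size=100_000)
beta, mu, lam = 3.0, 7.5, 2.0

print(mu_bar(X, beta))                                      # positivity: > 0 for risky X
print(mu_bar(X + mu, beta), mu_bar(X, beta))                # translation invariance
print(mu_bar(lam * X, beta), lam**beta * mu_bar(X, beta))   # homogeneity of degree beta
```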
10.4 Conclusion
The recent advances in decision theory and risk measurement that we have just presented will allow us, in the next chapter, to show how to implement a portfolio management that is efficient with respect to large risks. One should nevertheless keep in mind that one of the most important limitations of the risk measures just presented is that they are restricted to a single-period treatment of risks and neglect the intertemporal approach, which is nonetheless fundamental insofar as risk constraints are often meant to be embedded in a portfolio optimization problem that is generally dynamic. Unfortunately, the dynamic study of risks is far less advanced, being much more delicate to formalize than the static study. A few attempts to tackle this problem can however be mentioned. Let us first cite the approaches of Dacorogna, Gençay, Müller and Pictet (2001) and of Muzy, Sornette, Delour and Arnéodo (2001) in particular, who, starting from single-period risk measures, show that it is possible to construct "multi-scale" measures by averaging single-period risk measures computed at different time scales. This may seem a purely ad hoc method, but on the one hand it has revealed interesting results, and on the other hand it rests on the existence of a causal cascade across the different time scales (Arnéodo et al. 1998, Muzy et al. 2001). More generally, Wang (1999) proposed a set of axioms that dynamic risk measures must satisfy, thus complementing the earlier work of Shapiro and Basak (2000) on utility maximization in continuous time and of Ahn, Boudoukh, Richardson and Whitelaw (1999) on VaR optimization in continuous time.
Finally, let us mention certain risk measures such as drawdowns, which quantify cumulative losses irrespective of their duration, and which have recently attracted renewed interest (Grossman and Zhou 1993, Cvitanic and Karatzas 1995, Chekhlov, Uryasev and Zabarankin 2000, Johansen and Sornette 2002).
Chapter 11

Optimal portfolios and market equilibrium

The first mathematical formalization of portfolio management is due to Markowitz (1959). It rests on the need to accept a trade-off between obtaining the highest possible expected return and incurring the lowest possible amount of risk, which naturally leads to the study of so-called optimal portfolios, that is, portfolios such that, for a given risk level and a set of constraints to be satisfied (such as the absence of short selling, for example), there exists no portfolio with a return higher than that of the optimal portfolio. The curve representing the set of optimal portfolios in the risk/return plane defines the efficient frontier, above which no risk/return pair can be attained. In Markowitz's (1959) initial approach, risk is measured by the variance (or standard deviation) of the asset returns. This amounts to assuming either that their (multivariate) distribution is Gaussian, since this is the only case in which the variance completely characterizes the fluctuations of the assets, or that the agents' utility function is quadratic, in which case the agents' decisions are governed only by the first two moments of the asset distribution. Under these hypotheses, the composition of the efficient portfolios can be derived analytically as a function solely of the vector of expected asset returns and of their covariance matrix (Elton and Gruber 1995, for example). However, it is by now well accepted that the variance cannot adequately quantify risks, since it accounts only for the small fluctuations of asset returns around their mean value, thereby completely neglecting the large risks, whose impact is generally the most consequential.
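A minimal sketch of the analytic mean-variance machinery mentioned above (the covariance matrix is hypothetical): the global minimum-variance portfolio among fully invested portfolios is w = Σ⁻¹1 / (1ᵀΣ⁻¹1).

```python
import numpy as np

# Hypothetical annualized covariance matrix for three assets.
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
ones = np.ones(3)

# Global minimum-variance portfolio (fully invested, short sales allowed):
#   w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
w = np.linalg.solve(Sigma, ones)
w /= ones @ w
print(w, w @ Sigma @ w)  # weights and the minimal variance
```

Any other fully invested portfolio w + εd (with the perturbation d summing to zero) has a strictly larger variance, which is the optimality property exploited throughout this chapter.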
It is therefore important to turn to other risk measures or other optimization criteria. One of the first alternatives proposed to mean-variance analysis was to consider the portfolios whose geometric (rather than arithmetic) mean return is the largest, since these are the portfolios with the highest probability of exceeding a given return level, whatever the time horizon considered (Breiman 1960, Hakansson 1971, Roll 1973). This approach is, however, compatible with decision theory only for logarithmic utility functions. A second alternative, known as "Safety First", consists in putting the emphasis on the losses the portfolio incurs. Numerous criteria have emerged, such as the criterion of Roy (1952), which consists in minimizing the probability of suffering a loss greater than a predetermined value, or the criteria of Kataoka or of Telser (see Elton and Gruber (1995)), which are in fact very close to optimization criteria under VaR constraints. Generally speaking, any optimization under an economic-capital constraint, and thus using in particular the coherent risk measures, belongs to this approach. Finally, a third alternative
consists, in order to remedy directly the limitations of the mean-variance approach (Samuelson 1958), in taking into account the effect of higher-order moments, such as the skewness (Arditti 1967, Kraus and Litzenberger 1976), or more generally the fluctuation measures we presented in the previous chapter. Beyond their practical interest, portfolio optimization methods also have a theoretical interest, insofar as the composition of optimal portfolios makes it possible to deduce relations between asset prices and the price of the market portfolio at equilibrium. This then allows a generalization of the CAPM of Sharpe (1964), Lintner (1965) and Mossin (1966), which was derived in a Markowitz universe and is therefore subject to the same limits of validity.
11.1 The limits of the mean-variance approach

In Markowitz's (1959) approach, the vector of expected returns and the correlation matrix play a crucial role. Now, while the estimation of mean returns can be carried out with fairly good precision insofar as the asset distributions decay faster than a power law with exponent two (cf. chapter 1), the estimation of the covariance matrix raises far more problems, since its correct estimation requires the return distribution to decay faster than a power law with exponent four in the intermediate region, which is not the case. Indeed, for large portfolios, the empirical covariance matrix is so noisy that its properties are very close to the universal properties of certain random matrix ensembles. More precisely, Laloux, Cizeau, Bouchaud and Potters (1999), Laloux, Cizeau, Bouchaud and Potters (2000), as well as Plerou, Gopikrishnan, Rosenow, Amaral and Stanley (1999), have shown that the distributions of eigenvalues and eigenvectors of these matrices, as well as the distribution of eigenvalue spacings, are very close to those of matrices of the Wishart (1928) ensemble, with the exception of the few largest eigenvalues, which seem attributable to factors such as the market or certain sectors of activity. However, as suggested by the results of Meerschaert and Scheffler (2001), the Wishart ensemble is perhaps not the most appropriate. Indeed, the Wishart ensemble is the ensemble of empirical correlation matrices derived from Gaussian samples.
Now, if one accepts that the tails of the return distributions are power laws (or are regularly varying) with tail exponent less than four, the Lévy matrix ensembles (Burda, Janik, Jurkiewicz, Nowak, Papp and Zahed 2002) seem more appropriate. The results of Burda, Jurkiewicz, Nowak, Papp and Zahed (2001a) and Burda, Jurkiewicz, Nowak, Papp and Zahed (2001b) then show that the estimated covariance matrices are even noisier than predicted by comparison with the Wishart ensemble, since even the largest eigenvalues do not appear significant. Consequently, the content of large estimated covariance matrices seems to carry little information. In fact, we show in the appendix of this chapter that the largest eigenvalues can be estimated with good precision, while the bulk of the eigenvalue distribution departs appreciably from the Wishart distribution. In addition, we justify, with the help of random matrix theory, how there emerges from any large system whose average correlation between elements is nonzero a dominant factor (or eigenvalue) which by itself accounts for the greater part of the interactions (or correlations) between the elements of the system (Malevergne and Sornette 2002a). This has important consequences for the composition and the stability over time of the optimal portfolios obtained from these empirical correlation matrices (Rosenow, Plerou, Gopikrishnan and Stanley 2001). In fact, the importance of the noise depends on the context. On the one hand, it appears rather weak for portfolios optimized under linear rather than nonlinear constraints (Pafka
and Kondor 2001). On the other hand, this influence depends on the ratio r = N/T, where N is the number of assets in the portfolio and T the length of the time series used to estimate the covariance matrix. Pafka and Kondor (2002) have shown that for a ratio greater than or of the order of 0.6, the noise has a dominant influence, whereas for r below 0.2 its impact becomes negligible. This implies that one needs time series whose length is at least five times the size of the portfolio under consideration. Thus, for a portfolio of a hundred assets managed on the basis of daily data, this requires samples of five hundred points, that is, two years of quotes, which remains quite reasonable. By contrast, for a portfolio of a thousand assets, time series of five thousand points are needed, which represents some twenty years of quotes, and other problems then arise, such as the stationarity of these data. Even if these practical problems did not arise, one must keep in mind that the mean-variance approach of Markowitz (1959) proves suitable only under the hypothesis that the assets are jointly Gaussian, or insofar as the agents form preferences quadratic in their wealth, which in either case means that only small-amplitude risks are being considered. As soon as one wishes to incorporate other dimensions of risk, that is, risks associated with large losses or large fluctuations, one must turn to risk measures other than the variance.
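The role of the ratio r = N/T can be sketched on synthetic i.i.d. data (not market data): for uncorrelated Gaussian returns the true correlation matrix is the identity, yet the empirical eigenvalues spread over roughly the Marchenko-Pastur interval [(1 − √r)², (1 + √r)²], and the spread widens as r grows.

```python
import numpy as np

def eig_spread(N, T, rng):
    # Eigenvalues of the empirical correlation matrix of T observations
    # of N independent Gaussian returns (true correlation = identity).
    returns = rng.normal(size=(T, N))
    corr = np.corrcoef(returns, rowvar=False)
    lam = np.linalg.eigvalsh(corr)
    return lam.min(), lam.max()

rng = np.random.default_rng(6)
for N, T in [(100, 1000), (100, 167)]:  # r = 0.1 versus r ≈ 0.6
    lo, hi = eig_spread(N, T, rng)
    r = N / T
    print(f"r={r:.2f}  eigenvalues in [{lo:.2f}, {hi:.2f}]  "
          f"MP bounds [{(1 - np.sqrt(r))**2:.2f}, {(1 + np.sqrt(r))**2:.2f}]")
```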
11.2 Taking large risks into account

The use of new risk measures capable of accounting for large risks is necessary to obtain portfolios less sensitive than mean-variance portfolios to large price moves. To this end, we must consider risk measures that give more weight to rare, large-amplitude events. Such measures were presented in the previous chapter and, following Tasche (2000) and Malevergne and Sornette (2002c), can be divided into two classes:
– first, the risk measures associated with economic capital, on which coherence conditions (Artzner et al. 1999) or convexity conditions (Heath 2000, Föllmer and Schied 2002a) may be imposed;
– second, the measures of the fluctuations of the return around its expected value (Malevergne and Sornette 2002c), which is historically the first approach to have appeared, since the variance is a fluctuation measure and not a measure of economic capital.
In fact, these two classes are not sufficient to encompass all measures of large risks: even within a strictly single-period framework, one can at least cite the tail dependence coefficient, which quantifies extreme co-movements between assets (Malevergne and Sornette 2002b), or the "tail covariance" used by Bouchaud, Sornette, Walter and Aguilar (1998) to quantify the risk of a portfolio of assets whose returns follow power laws, thereby generalizing the approach of Fama (1965b), which is valid only for assets distributed according to stable laws. If one allows intertemporal measures, the drawdowns (Chekhlov et al. 2000), for example, can be taken into account.
11.2.1 Optimization under an economic-capital constraint

Optimizing a portfolio with respect to constraints bearing on economic capital naturally leads to the study of VaR-efficient portfolios (Consigli 2002, Huisman, Koedijk and Pownall 2001, Kaplanski and Kroll 2001a) or Expected Shortfall-efficient portfolios (Frey and McNeil 2002). Indeed,
these two risk measures are at present the most widely used. As one might expect, the optimal allocation under mean-VaR or mean-ES criteria is very different from that obtained under the mean-variance criterion (Alexander and Baptista 2002), the latter being unable to account for large risks. From a practical standpoint, optimization under VaR criteria is delicate for two reasons: on the one hand because of its non-convexity, which forces one to resort to non-standard minimization algorithms (genetic algorithms, for example); on the other hand because its estimation in a non-parametric setting is generally difficult, being sensitive to the method used and computationally very expensive. Hence approximate computation methods (Tasche and Tibiletti 2001), based notably on the application of extreme value theory (Longin 2000, Danielson and de Vries 2000, Consigli, Frascella and Sartorelli 2001), or parametric approaches (Malevergne and Sornette 2002d), come into their own. The Expected Shortfall, for its part, has the advantage of satisfying the coherence requirements of Artzner et al. (1999). It thus leads to well-conditioned optimization problems for which particularly simple and efficient minimization algorithms exist (Rockafellar and Uryasev 2002).
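The convexity exploited by Rockafellar and Uryasev rests on the identity ES_α(X) = min_ζ { ζ + E[max(−X − ζ, 0)]/α }, with the minimum attained at the VaR. The sketch below (r = 0, illustrative sample, not the authors' code) checks this identity on a grid of ζ for a fixed portfolio.

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.standard_t(df=4, size=10_000)  # sampled P&L of a fixed portfolio
alpha = 0.05                           # tail probability

def ru_objective(zeta):
    # Rockafellar-Uryasev auxiliary function:
    #   F(zeta) = zeta + E[max(-X - zeta, 0)] / alpha
    return zeta + np.mean(np.maximum(-x - zeta, 0.0)) / alpha

zetas = np.linspace(0.0, 10.0, 2001)
f = np.array([ru_objective(z) for z in zetas])

es_direct = -np.sort(x)[: int(alpha * x.size)].mean()  # average of worst 5%
print(f.min(), es_direct)  # the minimum of F recovers the Expected Shortfall
```

Because F is convex (and here piecewise linear) in ζ, minimizing it jointly in the portfolio weights and ζ turns ES minimization into a well-conditioned convex program, which is the key point of the Rockafellar-Uryasev algorithm.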
11.2.2 Optimization under constraints on fluctuations around the expected return
Economic capital is not the only quantity to minimize: the fluctuations of the portfolio around its mean return, or around any other return target, must also be taken into account. The variance does this, but it focuses only on small-amplitude deviations from the mean and therefore completely neglects large risks. It is thus appropriate to use other quantities that share some of the properties of the variance while emphasizing large fluctuations. Rubinstein (1973) was one of the first to explore this approach, suggesting that centered moments of order higher than two should not be neglected, since they appear naturally in the series expansion of the utility function. More recently, Sornette, Andersen and Simonetti (2000) and Sornette, Simonetti and Andersen (2000) put forward the idea that cumulants could also provide useful fluctuation measures, making it possible in particular to account for the behavior of agents who are globally risk-averse in the sense that they seek to avoid large risks but are willing to accept a certain level of small risks (Andersen and Sornette 2002a, Malevergne and Sornette 2002c). In the case where the marginal distributions of the assets follow stretched exponential laws (cf. chapter 3) and the copula describing their dependence is Gaussian (cf. chapters 7 and 8), analytical expressions have been derived for the moments and cumulants of portfolios made of such assets (see chapter 14). These examples are particular cases of the fluctuation measures defined in the previous chapter and allow one to derive simply most of the general properties of optimal portfolios. These properties are in fact immediate generalizations, to the case of large risks, of the properties enjoyed by mean-variance efficient portfolios.
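A small numerical sketch illustrates why higher cumulants matter (the two synthetic series below are arbitrary illustrative assumptions, and scipy's k-statistics are used as unbiased cumulant estimators): two return series with identical variance can have very different fourth cumulants, which is exactly what a variance-only criterion cannot distinguish.

```python
import numpy as np
from scipy.stats import kstat

rng = np.random.default_rng(1)
n = 200_000
# two synthetic return series with the same variance but different tails
gaussian = rng.normal(0.0, 0.01, n)
laplace = rng.laplace(0.0, 0.01 / np.sqrt(2), n)  # variance also 1e-4

# unbiased estimates of the second and fourth cumulants (k-statistics)
c2_g, c4_g = kstat(gaussian, 2), kstat(gaussian, 4)
c2_l, c4_l = kstat(laplace, 2), kstat(laplace, 4)
```

A mean-variance optimizer sees the two series as identical (c2 agrees), while a cumulant-based fluctuation measure flags the Laplace series through its strictly positive fourth cumulant (3·var² for a Laplace law), capturing its heavier tails.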
11.2.3 Optimization under other constraints
When one wishes to consider systematic extreme risks, i.e. the extreme moves that assets undergo jointly with the market, the tail-dependence coefficient λ can prove useful. Indeed, as we show in Malevergne and Sornette (2002b), portfolios made of assets with small λ exhibit, overall, much less extreme dependence than portfolios of assets that individually have large λ. When one leaves the strictly single-period framework, things become more complicated. The static approach of Markowitz (1959) can certainly be generalized to a dynamic setting (Merton 1992), but this brings us back to accounting only for small risks. A few attempts have been made to reconcile large risks with an inter-temporal approach, such as the minimization of "drawdowns" (Grossman and Zhou 1993, Cvitanic and Karatzas 1995, Chekhlov et al. 2000) or the use of cumulants at different time scales (Muzy et al. 2001).

[1] This point is discussed in detail in Chabaane, Duclos, Laurent, Malevergne and Turpin (2002).
[2] See Acerbi and Tasche (2002) for a discussion of the coherence properties of Expected Shortfall, depending on the definition adopted.
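The tail-dependence coefficient λ = lim_{u→1} P(Y > F_Y⁻¹(u) | X > F_X⁻¹(u)) discussed above can be estimated nonparametrically by counting joint exceedances of a high quantile. The sketch below is a toy illustration with simulated data (the one-factor model and the quantile level u = 0.99 are arbitrary assumptions, not the estimator studied in the thesis):

```python
import numpy as np

def upper_tail_dependence(x, y, u=0.95):
    """Empirical estimate of lambda(u) = P(Y > q_Y(u) | X > q_X(u)),
    whose u -> 1 limit is the upper tail-dependence coefficient."""
    qx, qy = np.quantile(x, u), np.quantile(y, u)
    joint = np.mean((x > qx) & (y > qy))
    return joint / (1.0 - u)

rng = np.random.default_rng(2)
n = 100_000
# common-factor model: both assets load on the same heavy-tailed factor,
# so their extremes tend to occur together (lambda > 0)
factor = rng.standard_t(df=3, size=n)
a = factor + 0.5 * rng.standard_normal(n)
b = factor + 0.5 * rng.standard_normal(n)
# independent Gaussian pair for comparison (lambda = 0)
c_, d_ = rng.standard_normal(n), rng.standard_normal(n)

lam_factor = upper_tail_dependence(a, b, u=0.99)
lam_indep = upper_tail_dependence(c_, d_, u=0.99)
```

For the independent pair the estimate shrinks towards 1 − u, whereas the common heavy-tailed factor keeps it away from zero: exactly the asymmetry that makes low-λ assets the preferred building blocks of extreme-risk-averse portfolios.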
11.3 Market equilibrium

The capital allocation and asset choices made by agents obviously influence the market prices of these assets. Under the assumptions that the market is perfectly liquid and free of taxes of any kind, it is possible to derive equations relating the expected return of each asset to the market return (at equilibrium). The first equilibrium model, rooted in the Markowitz framework, is the CAPM derived by Sharpe (1964), Lintner (1965) and Mossin (1966). Very early on, numerous generalizations were obtained, notably to account for the effect of moments of order higher than two (Jurczenko and Maillet 2002, and references therein), in an attempt to resolve the "equity premium puzzle". As the results were not very conclusive, other avenues were explored, in particular the economic implications of portfolio optimization under constraints other than the variance, such as VaR (Alexander and Baptista 2002, Kaplanski and Kroll 2001b), unfortunately without much more success. For our part, we have shown that the standard CAPM relation remains valid for the fluctuation measures we have considered, as well as in the case where agents do not all use the same risk measure, i.e. when the market is heterogeneous (Malevergne and Sornette 2002c), extending earlier work by Lintner (1969) and Gonedes (1976) in particular. All these approaches remain within the single-period study of market equilibrium. However, just as the portfolio-choice problem has received multi-period extensions, inter-temporal generalizations of the CAPM have appeared, including those proposed by Fama (1970) and Merton (1973) to cite only the most famous; we merely mention them, since we have not studied them at all. As for the portfolio selection problem, we have restricted ourselves to the single-period case.
11.4 Conclusion

In this chapter we have summarized earlier results obtained in quantitative portfolio management and their theoretical consequences for market equilibrium. This allowed us to situate our own results on this subject with respect to those already established. These results will be presented in detail in the following chapters, beginning with the consequences of taking large and extreme risks into account, notably through the tail-dependence coefficient (chapter 12), then the implications and difficulties of portfolio optimization under constraints given by (coherent or not) risk measures associated with economic capital (chapter 13), and finally the efficient portfolios and market equilibria that follow under constraints on the fluctuations around an expected-return target (chapter 14).
11.5 Appendix

Using simple calculations and numerical simulations, we demonstrate the generic existence of a self-organized macroscopic state in any large system exhibiting a non-vanishing average correlation between a finite fraction of all its pairs of elements. We show that the coexistence, in large empirical correlation matrices, of an eigenvalue spectrum predicted by random matrix theory and of a few very large eigenvalues results from a collective effect of the underlying time series rather than from the impact of factors. Our results, in excellent agreement with previous studies of financial correlation matrices, also show that the bulk of the eigenvalue spectrum contains a significant amount of information, and rationalize the presence of market factors hitherto introduced in an ad hoc manner.
Collective Origin of the Coexistence of Apparent RMT Noise and Factors in Large Sample Correlation Matrices

Y. Malevergne¹·² and D. Sornette¹·³

¹ Laboratoire de Physique de la Matière Condensée, CNRS UMR 6622, Université de Nice-Sophia Antipolis, 06108 Nice Cedex 2, France
² Institut de Science Financière et d'Assurances, Université Lyon I, 43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
³ Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, California 90095, USA

Through simple analytical calculations and numerical simulations, we demonstrate the generic existence of a self-organized macroscopic state in any large multivariate system possessing non-vanishing average correlations between a finite fraction of all pairs of elements. The coexistence of an eigenvalue spectrum predicted by random matrix theory (RMT) and a few very large eigenvalues in large empirical correlation matrices is shown to result from a bottom-up collective effect of the underlying time series rather than a top-down impact of factors. Our results, in excellent agreement with previous results obtained on large financial correlation matrices, show that there is relevant information also in the bulk of the eigenvalue spectrum and rationalize the presence of market factors previously introduced in an ad hoc manner.
Since Wigner’s seminal idea to apply random matrix theory (RMT) to interpret the complex spectrum of energy levels in nuclear physics [1], RMT has made enormous progress [2] with many applications in physical sciences and elsewhere such as in meteorology [3] and image processing [4]. A new application was proposed a few years ago to the problem of correlations between financial assets and to the portfolio optimization problem. It was shown that, among the eigenvalues and principal components of the empirical correlation matrix of the returns of hundreds of assets on the New York Stock Exchange (NYSE), apart from the few highest eigenvalues, the marginal distribution of the other eigenvalues and eigenvectors closely resembles the spectral distribution of a positive symmetric random matrix with maximum entropy, suggesting that the correlation matrix does not contain any specific information beyond these few largest eigenvalues and eigenvectors [5]. These results apparently invalidate the standard mean-variance portfolio optimization theory [6] consecrated by the financial industry [7] and seemingly support the rationale behind factor models such as the capital asset pricing model (CAPM) [8] and the arbitrage pricing theory (APT) [9], where the correlations between a large number of assets are represented through a small number of so-called market factors. Indeed, if the spectrum of eigenvalues of the empirical covariance or correlation matrices is predicted by RMT, it seems natural to conclude that there is no usable information in these matrices and that empirical covariance matrices should not be used for portfolio optimization. In contrast, if one detects deviations between the universal – and therefore non-informative – part of the spectral properties of empirically estimated covariance and correlation matrices and those of the relevant ensemble of random matrices [10], this may quantify the amount of real information that can be used in portfolio optimization from the “noise” that should be discarded. More generally, in many different scientific fields, one needs to determine the nature and amount of information contained in large covariance and correlation matrices. This occurs as soon as one attempts to estimate very large covariance and correlation matrices in multivariate dynamics of systems exhibiting non-Gaussian fluctuations with fat tails and/or long-range time correlations with intermittency. In such cases, the convergence of the estimators of the large covariance and correlation matrices is often too slow for all practical purposes. The problem becomes even more complex with time-varying variances and covariances as occurs in systems with heteroskedasticity [11] or with regime-switching [12]. A prominent example where such difficulties arise is the data-assimilation problem in engineering and in meteorology where forecasting is combined with observations iteratively through the Kalman filter, based on the estimation and forward prediction of large covariance matrices [13]. As we said in the context of financial time series, the rescuing strategy is to invoke the existence of a few dominant factors, such as an overall market factor and the factors related to firm size, firm industry and book-to-market equity, thought to embody most of the relevant dependence structure between the studied time series [14]. Indeed, there is no doubt that observed equity prices respond to a wide variety of unanticipated factors, but there is much weaker evidence that expected returns are higher for equities that are more sensitive to these factors, as required by Markowitz’s mean-variance theory, by the CAPM and the APT [15]. This severe failure of the most fundamental finance theories could conceivably be attributable to an inappropriate proxy for the market portfolio, but nobody has been able to show that this is really the correct explanation. This remark constitutes
the crux of the problem: the factors invoked to model the cross-sectional dependence between assets are not known in general and are either postulated based on economic intuition in financial studies or obtained as black-box results in the recent analyses using RMT [5]. Here, we show that the existence of factors results from a collective effect of the assets, similar to the emergence of a macroscopic self-organization of interacting microscopic constituents. For this, we unravel the general physical origin of the large eigenvalues of large covariance and correlation matrices and provide a complete understanding of the coexistence of features resembling properties of random matrices and of large “anomalous” eigenvalues. Through simple analytical calculations and numerical simulations, we demonstrate the generic existence of a self-organized macroscopic state in any large system possessing non-vanishing average correlations between a finite fraction of all pairs of elements.

Let us first consider a large system of size N with correlation matrix C in which every non-diagonal pair of elements exhibits the same correlation coefficient, C_ij = ρ for i ≠ j and C_ii = 1. Its eigenvalues are

λ₁ = 1 + (N − 1)ρ  and  λ_{i≥2} = 1 − ρ,   (1)

with multiplicity N − 1 and with ρ ∈ (0, 1) in order for the correlation matrix to remain positive definite. Thus, in the thermodynamic limit N → ∞, even for a weak positive correlation ρ → 0 (with ρN ≫ 1), a very large eigenvalue appears, associated with the delocalized eigenvector v₁ = (1/√N)(1, 1, ..., 1), which completely dominates the correlation structure of the system. This trivial example stresses that the key point for the emergence of a large eigenvalue is not the strength of the correlations, provided that they do not vanish, but the large size N of the system.

This result (1) still holds qualitatively when the correlation coefficients are all distinct. To see this, it is convenient to use a perturbation approach. We thus add a small random component to each correlation coefficient:

C_ij = ρ + ε·a_ij   for i ≠ j,   (2)

where the coefficients a_ij = a_ji have zero mean, variance σ² and are independently distributed (there are additional constraints on the support of the distribution of the a_ij's in order for the matrix C to remain positive definite with probability one). The determination of the eigenvalues and eigenvectors of C is performed using the perturbation theory developed in quantum mechanics [16], up to second order in ε. We find that the largest eigenvalue becomes

E[λ₁] = (N − 1)ρ + 1 + [(N − 1)(N − 2)/N²]·(ε²σ²/ρ) + O(ε³),   (3)

while, at the same order, the corresponding eigenvector v₁ remains unchanged. The degeneracy of the eigenvalue
FIG. 1: Spectrum of eigenvalues of a random correlation matrix with average correlation coefficient ρ = 0.14 and standard deviation of the correlation coefficients σ = 0.345/√N. The size N = 406 of the matrix is the same as in previous studies [5] for the sake of comparison. The continuous curve is the theoretical translated semi-circle distribution of eigenvalues describing the bulk of the distribution, which passes the Kolmogorov test. The center value λ = 1 − ρ ensures the conservation of the trace equal to N. There is no adjustable parameter. The inset represents the whole spectrum with the largest eigenvalue, whose size is in agreement with the prediction ρN = 56.8.
λ = 1 − ρ is broken and leads to a complex set of smaller eigenvalues described below. In fact, this result (3) can be generalized to the non-perturbative domain of any correlation matrix with independent random coefficients C_ij, provided that they have the same mean value ρ and variance σ². Indeed, it has been shown [17] that, in such a case, the expectations of the largest and second largest eigenvalues are

E[λ₁] = (N − 1)·ρ + 1 + σ²/ρ + o(1),   (4)
E[λ₂] ≤ 2σ√N + O(N^{1/3} log N).   (5)

Moreover, the statistical fluctuations of these two largest eigenvalues are asymptotically (for large fluctuations t > O(√N)) bounded by a Gaussian distribution according to the following large deviation theorem:

Pr{|λ_{1,2} − E[λ_{1,2}]| ≥ t} ≤ e^{−c_{1,2} t²},   (6)
for some positive constant c1,2 [18]. This result is very different from that obtained when the mean value ρ vanishes. In such a case, the distribution of eigenvalues of the random matrix C is given by the semi-circle law [2]. However, due to the presence of the ones on the main diagonal of the correlation matrix C, the center of the circle is not at the origin but at the
point λ = 1. Thus, the distribution of the eigenvalues of random correlation matrices with zero mean correlation coefficients is a semi-circle of radius 2σ√N centered at λ = 1. The result (4) is deeply related to the so-called “friendship theorem” in mathematical graph theory, which states that, in any finite graph such that any two vertices have exactly one common neighbor, there is one and only one vertex adjacent to all other vertices [19]. A more heuristic but equivalent statement is that, in a group of people such that any pair of persons have exactly one common friend, there is always one person (the “politician”) who is the friend of everybody. The connection is established by taking the non-diagonal entries C_ij (i ≠ j) equal to Bernoulli random variables with parameter ρ, that is, Pr[C_ij = 1] = ρ and Pr[C_ij = 0] = 1 − ρ. Then, the matrix C − I, where I is the unit matrix, becomes nothing but the adjacency matrix of the random graph G(N, ρ) [18]. The proof of the “friendship theorem” in [19] indeed relies on the N-dependence of the largest eigenvalue and on the √N-dependence of the second largest eigenvalue of C_ij as given by (4) and (5). Figure 1 shows the distribution of eigenvalues of a random correlation matrix. The inset shows the largest eigenvalue lying at the predicted value ρN = 56.8, while the bulk of the eigenvalues are much smaller and are described by a modified semi-circle law centered on λ = 1 − ρ, in the limit of large N.
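Both Eq. (1) and the non-perturbative result (4) are easy to verify numerically. The sketch below (an illustrative check, assuming numpy is available) diagonalizes the uniform-ρ matrix and a randomly perturbed one, with the same N = 406, ρ = 0.14 and σ = 0.345/√N as in figure 1:

```python
import numpy as np

# (i) uniform correlation matrix: eigenvalues 1+(N-1)*rho and 1-rho, Eq. (1)
N, rho = 406, 0.14
C = np.full((N, N), rho)
np.fill_diagonal(C, 1.0)
eig = np.linalg.eigvalsh(C)                # sorted in ascending order
lam_max, bulk = eig[-1], eig[:-1]

# (ii) random off-diagonal coefficients with mean rho and std sigma:
# the largest eigenvalue concentrates near (N-1)*rho + 1 + sigma**2/rho
rng = np.random.default_rng(0)
sigma = 0.345 / np.sqrt(N)
noise = np.triu(rng.normal(0.0, sigma, (N, N)), 1)
Cr = np.full((N, N), rho) + noise + noise.T  # symmetric perturbation
np.fill_diagonal(Cr, 1.0)
eig_r = np.linalg.eigvalsh(Cr)
lam1, lam2 = eig_r[-1], eig_r[-2]
pred = (N - 1) * rho + 1 + sigma**2 / rho    # Eq. (4)
```

The second largest eigenvalue stays within the translated semi-circle bulk, orders of magnitude below λ₁ ≈ ρN, illustrating the spectral gap discussed in the text.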
The result on the largest eigenvalue emerging from the collective effect of the crosscorrelation between all N (N −1)/2 pairs provides a novel perspective to the observation [20] that the only reasonable explanation for the simultaneous crash of 23 stock markets worldwide in October 1987 is the impact of a world market factor: according to our demonstration, the simultaneous occurrence of significant correlations between the markets worldwide is bound to lead to the existence of an extremely large eigenvalue, the world market factor constructed by ... a linear combination of the 23 stock markets! What our result shows is that invoking factors to explain the cross-sectional structure of stock returns is cursed by the chicken-and-egg problem: factors exist because stocks are correlated; stocks are correlated because of common factors impacting them [24]. Figure 2 shows the eigenvalues distribution of the sample correlation matrix reconstructed by sampling N = 406 time series of length T = 1309 generated with a given correlation matrix C with theoretical spectrum shown in figure 1. The largest eigenvalue is again very close to the prediction ρN = 56.8 while the bulk of the distribution departs very strongly from the semi-circle law and is not far from the Wishart prediction, as expected from the definition of the Wishart ensemble as the ensemble of sample covariance matrices of Gaussian distributed time series with unit variance and zero mean. A Kolmogorov test shows however that the bulk of the spectrum (renormalized so as to take into account the presence of the
FIG. 2: Spectrum estimated from the sample correlation matrix obtained from N = 406 time series of length T = 1309 (the same length as in [5]) with the same theoretical correlation matrix as that presented in figure 1.
outlier eigenvalue) is not in the Wishart class, in contradiction with previous claims lacking formal statistical tests [5]. This result holds for different simulations of the sample correlation matrix and different realizations of the theoretical correlation matrix with the same parameters (ρ, σ). The statistically significant departure from the Wishart prediction implies that there is actually some information in the bulk of the spectrum of eigenvalues, which can be retrieved using Marsili’s procedure [10]. We have also checked that these results remain robust for non-Gaussian distributions of returns as long as the second moments exist. Indeed, correlated time series with multivariate Gaussian or Student distributions with three degrees of freedom (which provide more acceptable proxies for financial time series [21]) give no discernible differences in the spectrum of eigenvalues. This is surprising, as the estimator of a correlation coefficient is asymptotically Gaussian for time series with finite fourth moment and Lévy stable otherwise [22]. Empirically [5], a few other eigenvalues below the largest one have an amplitude of the order of 5–10 that deviate significantly from the bulk of the distribution. Our analysis provides a very simple constructive mechanism for them, justifying the postulated model of Ref. [23]. The solution consists in considering, as a first approximation, the block diagonal matrix C′ with diagonal elements made of the matrices A₁, ..., A_p of sizes N₁, ..., N_p with Σᵢ Nᵢ = N, constructed according to (2) such that each matrix Aᵢ has the average correlation coefficient ρᵢ. When the coefficients of the matrix C′ outside the matrices Aᵢ are zero, the spectrum of C′ is given by the union of all the spectra of the Aᵢ's, which
are each dominated by a large eigenvalue λ_{1,i} ≈ ρᵢ·Nᵢ. The spectrum of C′ then exhibits p large eigenvalues. Each block Aᵢ can be interpreted as a sector of the economy, including all the companies belonging to the same industrial branch, and the eigenvector associated with each largest eigenvalue represents the main factor driving this sector of activity [25]. For similar sector sizes Nᵢ and average correlation coefficients ρᵢ, the largest eigenvalues are of the same order of magnitude. In order to recover a very large unique eigenvalue, we reintroduce some coupling constants outside the block diagonal matrices. A well-known result of perturbation theory in quantum mechanics states that such coupling leads to a repulsion between the eigenstates, which can be observed in figure 3, where C′ has been constructed with three block matrices A₁, A₂ and A₃ and non-zero off-diagonal coupling described in the figure caption. These values allow us to quantitatively replicate the empirical finding of Laloux et al. in [5], where the three first eigenvalues are approximately λ₁ ≈ 57, λ₂ ≈ 10 and λ₃ ≈ 8. The bulk of the spectrum (which excludes the three largest eigenvalues) is similar to the Wishart distribution but again statistically different from it as tested with a Kolmogorov test.

As a final remark, expressions (3), (4) and our numerical tests for a large variety of correlation matrices show that the delocalized eigenvector v₁ = (1/√N)(1, 1, ..., 1) associated with the largest eigenvalue is extremely robust and remains (on average) the same for any large system. Thus, even for time-varying correlation matrices (see Drozdz et al. in [5]) – as in finance with important heteroskedastic effects – the composition of the main factor remains almost the same. This can be seen as a generalized limit theorem reflecting the bottom-up organization of broadly correlated time series.

FIG. 3: Spectrum of eigenvalues estimated from the sample correlation matrix of N = 406 time series of length T = 1309. The time series have been constructed from a multivariate Gaussian distribution with a correlation matrix made of three block-diagonal matrices of sizes respectively equal to 130, 140 and 136 and mean correlation coefficients equal to 0.18 for all of them. The off-diagonal elements are all equal to 0.1. The same results hold if the off-diagonal elements are random.
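The block construction just described is straightforward to reproduce numerically. The following sketch (an illustrative check, not the authors' code) builds the three-sector correlation matrix with the sizes and coefficients quoted in the figure-3 caption, and exhibits the eigenvalue repulsion: one dominant eigenvalue and two smaller "sector" eigenvalues, all detached from a flat bulk at 1 − ρ_in.

```python
import numpy as np

# three "sectors": block-diagonal correlation rho_in inside each block,
# weaker uniform coupling rho_out between blocks (figure-3 parameters)
sizes, rho_in, rho_out = [130, 140, 136], 0.18, 0.1
N = sum(sizes)
C = np.full((N, N), rho_out)
start = 0
for n_i in sizes:
    C[start:start + n_i, start:start + n_i] = rho_in
    start += n_i
np.fill_diagonal(C, 1.0)

eig = np.linalg.eigvalsh(C)          # ascending order
top3, bulk_max = eig[-3:], eig[-4]   # three large eigenvalues vs bulk edge
```

Without the inter-block coupling, the three large eigenvalues would be of comparable size ≈ ρᵢ·Nᵢ; the coupling ρ_out repels them, pushing one eigenvalue far above the other two, which is the mechanism invoked in the text to recover the Laloux et al. hierarchy λ₁ ≫ λ₂, λ₃.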
[1] Wigner, E.P., Ann. Math. 53, 36 (1951).
[2] Mehta, M.L., Random Matrices, 2nd ed. (Boston: Academic Press, 1991).
[3] Santhanam, M.S. and P.K. Patra, Phys. Rev. E 64, 016102 (2001).
[4] Sengupta, A.M. and P.P. Mitra, Phys. Rev. E 60, 3389 (1999).
[5] Laloux, L. et al., Phys. Rev. Lett. 83, 1467 (1999); Plerou, V. et al., Phys. Rev. Lett. 83, 1471 (1999); Maslov, S., Physica A 301, 397 (2001); Drozdz, S. et al., Physica A 287, 440 (2000); Physica A 294, 226 (2001); Plerou, V. et al., Phys. Rev. E 65, 066126 (2002).
[6] Markowitz, H., Portfolio Selection (John Wiley and Sons, New York, 1959).
[7] RiskMetrics Group, RiskMetrics (Technical Document, New York: J.P. Morgan/Reuters, 1996).
[8] Sharpe, W.F., J. Finance (September), 425 (1964); Lintner, J., Rev. Econ. Stat. (February), 13 (1965); Mossin, J., Econometrica (October), 768 (1966); Black, F., J. Business (July), 444 (1972).
[9] Ross, S.A., J. Economic Theory (December), 341 (1976).
[10] Marsili, M., cond-mat/0003241; Giada, L. and M. Marsili, Phys. Rev. E 63, 061101 (2001); Guhr, T. and B. Kalber, cond-mat/0206577.
[11] Engle, R.F. and K. Sheppard, NBER Working Paper No. W8554 (2001).
[12] Schaller, H. and S. van Norden, Appl. Financial Econ. 7, 177 (1997).
[13] Brammer, K. and G. Siffling, Kalman-Bucy Filters (Norwood, MA: Artech House, 1989).
[14] Fama, E.F. and K.R. French, J. Finance 51, 55 (1996); J. Financial Econ. 33, 3 (1993); Fama, E.F. et al., Financial Analysts J. 49, 37 (1993).
[15] Roll, R., Financial Management 23, 69 (1994).
[16] Cohen-Tannoudji, C., B. Diu and F. Laloë, Quantum Mechanics (New York: Wiley, 1977).
[17] Füredi, Z. and J. Komlós, Combinatorica 1, 233-241 (1981).
[18] Krivelevich, M. and V.H. Vu, math-ph/0009032 (2000).
[19] Erdős, P. et al., Studia Sci. Math. 1, 215 (1966).
[20] Roll, R., Financial Analysts J. 44, 19 (1988).
[21] Gopikrishnan, P. et al., Eur. Phys. J. B 3, 139 (1998); Guillaume, D.M. et al., Finance and Stochastics 1, 95 (1997); Lux, T., Appl. Financial Economics 6, 463 (1996); Pagan, A., J. Emp. Fin. 3, 15 (1996).
[22] Davis, R.A. and J.E. Marengo, Commun. Statist. Stochastic Models 6, 483 (1990); Meerschaert, M.M. and H.P. Scheffler, J. Time Series Anal. 22, 481 (2001).
[23] Noh, J.D., Phys. Rev. E 61, 5981 (2000).
[24] See D. Sornette et al., in press in Risk, cond-mat/0204626, for a mechanism and empirical results contrasting the endogenous character of the October 1987 crash and other large endogenous market moves.
[25] Mantegna, R.N., Eur. Phys. J. B 11, 193 (1999); Marsili, M., Quant. Fin. 2, 297 (2002).
Chapter 12

Managing large and extreme risks

In the first part of this chapter, we present our ideas on the management of extreme risks, but also of "intermediate" risks, so to speak, which we call large risks in the sense that their impact is far greater than that of the risks quantified by the variance, while their consequences remain well below those of extreme risks. This allows us to discuss ways of apprehending the whole range of risks, from the smallest to the most extreme.

The second part of the chapter is devoted exclusively to extreme risks, and we ask how to protect oneself against them. In fact, because of the existence of tail dependence between assets, it is not possible to diversify extreme risks perfectly by aggregation. One can nevertheless construct portfolios whose assets each have very small tail-dependence coefficients, which guarantees the portfolio a rather low sensitivity to large moves of its constituents. Even so, eliminating extreme risks altogether seems to be out of reach.

We have focused on "traditional" portfolios, i.e. portfolios in which short selling is not allowed. It is nonetheless interesting to ask whether mixed strategies, holding long positions on some assets and short positions on others, might not protect against extreme risks, insofar as very large drops of the former would be compensated by the (concomitant) very large drops of the latter. This strategy works very well in that configuration, but it brings out another critical situation: the one in which the assets held long fall while those held short rise. Now, extreme losses concomitant with extreme gains are entirely conceivable: recall figure 7.1 on page 187 and figure 1 on page 214, concerning the Student copula and its tail-dependence coefficient. There, large gains concomitant with large losses have a non-zero probability of occurrence, even if it is systematically smaller than the probability of simultaneous occurrence of two extreme losses (or gains). Thus, unlike the mean-variance approach, in which mixed strategies allow the portfolio to be completely decorrelated from the market, for instance, they cannot provide a real solution to the problem of extreme risks, even though they seem to be an improvement over long-only strategies.
12.1 Understanding and managing large and extreme risks

The impact of large and extreme risks on financial activity and the insurance sector has become so important that it can no longer be ignored by portfolio managers. This is why we propose to summarize here the successive steps leading to a rigorous portfolio management that aims to take into account (1) the sub-exponential behavior of return distributions, (2) the non-Gaussian dependences between assets and (3) the intermittent temporal dependences that bring about large losses, while deploying (4) the concept of risk along its different dimensions, from "small" risks to "extreme" risks, in order to define coherent and practical decision functions for establishing optimal portfolios.
Reprint from: J.V. Andersen, Y. Malevergne and D. Sornette, 2002, Comprendre et gérer les risques grands et extrêmes, Risques 49, 105-110.
COMPRENDRE ET GÉRER LES RISQUES GRANDS ET EXTRÊMES
(Understanding and managing large and extreme risks)

Jorgen V. Andersen
Research fellow at Université Paris X-Nanterre (Thema) and at Université de Nice Sophia-Antipolis

Yannick Malevergne
Doctoral student at Université de Nice Sophia-Antipolis and at Université Lyon I (ISFA)

Didier Sornette
Research director at Université de Nice Sophia-Antipolis and professor at the University of California, Los Angeles (UCLA)
La capitalisation totale des marchés financiers à travers le monde a considérablement augmenté depuis le début des années 1980. En effet, alors qu'elle ne représentait que 3380 milliards de dollars en 1983, soit 4 fois le budget annuel des États-Unis d'Amérique, elle atteignait, en 1999, le chiffre de 38700 milliards de dollars, soit 22 fois le budget annuel des États-Unis pour cette année-là. Ainsi, en moins de vingt ans, la capitalisation boursière mondiale est passée de 4 fois à 22 fois le budget des États-Unis ! Rien qu'en ce qui concerne la dernière décennie, la capitalisation boursière et les volumes échangés ont triplé, alors que le volume d'actions émises a été multiplié par six. De plus, la volatilité a connu une croissance
significative depuis le début des années 1990, surtout pour les marchés intégrant des sociétés fortement centrées sur le secteur des technologies de l'information (indice américain Nasdaq ou finlandais Helsinki General (Hex), par exemple). La même tendance, certes moins prononcée, est également visible sur des marchés plus traditionnels comme le Cac40 (bourse de Paris), le Dow Jones (bourse de New York) ou le Ftse100 (bourse de Londres). Cette intense et lucrative activité financière est cependant tempérée par quelques rares mais très violentes secousses. En effet, l'éclatement des bulles spéculatives de la fin des années 1990,
ainsi que deux années de tourmentes sur les marchés financiers, ont fait fondre la capitalisation boursière mondiale de plus de 30 % par rapport à son niveau de 1999, pour la ramener à un montant de 25100 milliards de dollars. Un autre crash d'une telle ampleur, se déclenchant simultanément (comme en octobre 1987) dans la plupart des bourses mondiales, amènerait encore une perte quasi instantanée de près de 7500 milliards de dollars. Ainsi, de par les sommes astronomiques qu'ils engloutissent, les crashs financiers peuvent anéantir en quelques instants les plus gros fonds d'investissement, ruinant par là même des années d'épargne et de financement de retraite. Se pourrait-il même qu'ils soient, comme en 1929-1933 après le grand crash d'octobre 1929, les précurseurs ou les déclencheurs de récessions majeures ? Voire qu'ils puissent mener à un écroulement général des systèmes financiers et bancaires, qui semblent déjà y avoir échappé de justesse quelques fois dans le passé ? Les grandes crises et les crashs financiers sont également fascinants parce qu'ils personnifient une classe de phénomènes appelés “phénomènes extrêmes”. Des recherches récentes en physique, en psychologie, en théorie des jeux ou encore en sciences cognitives au sens large suggèrent qu'ils sont des caractéristiques incontournables des systèmes complexes auto-organisés. Marchés turbulents, crises et crashs exposent donc l'investisseur à de grands risques dont la compréhension précise devient essentielle. Compte tenu de l'évolution des marchés et de leurs caractéristiques citées plus haut, il est plus que jamais dans l'intérêt des gestionnaires de portefeuilles et des investisseurs en général de comprendre et de gérer les risques extrêmes.
Distributions des rendements à queues épaisses

Le premier pas vers une quantification des grands risques est d'admettre que les statistiques des risques – que ce soient les risques de marché associés aux fluctuations des actions ou la distribution des remboursements survenant à la suite de sinistres, ou des sinistres eux-mêmes – suivent des distributions à queues épaisses. Ainsi, le paradigme gaussien, en vogue en finance jusqu'à une période relativement récente, n'est plus de mise aujourd'hui. Sa disparition a laissé le champ libre à diverses modélisations possibles des risques, par exemple la modélisation parétienne, très appliquée en finance, et la modélisation à l'aide des distributions dites “exponentielles étirées” que nous avons développée ces dernières années. Ces deux classes de distributions sont qualifiées de sous-exponentielles, c'est-à-dire que les événements extrêmes y sont plus probables que sous une distribution exponentielle. Cela a pour conséquence immédiate que de telles distributions n'admettent pas de moment exponentiel ou, pour adopter le langage de la théorie de la ruine, que ces distributions ne satisfont pas à la condition de Cramér-Lundberg. Sous l'hypothèse que les rendements sont distribués de manière identique et indépendante, ou ne possèdent qu'une faible dépendance, ces distributions caractérisent complètement les risques. Cela a l'énorme avantage de permettre d'établir des lois de comportement universelles, liées à certains théorèmes de convergence mathématique tels que la loi des grands nombres, le théorème de la limite centrale, la théorie des valeurs extrêmes et celle des grandes déviations. L'immense majorité des théories sur la gestion des risques, établies aussi bien en finance qu'en assurance, est fondée sur cette hypothèse d'indépendance.
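Pour fixer les idées, le petit calcul suivant (échelles unitaires et exposants choisis arbitrairement pour l'illustration) compare la probabilité de dépasser un même seuil sous une loi exponentielle, une loi de Pareto d'exposant de queue 3 et une exponentielle étirée d'exposant c = 0,7 : les deux dernières, sous-exponentielles, rendent l'événement extrême bien plus probable :

```python
import numpy as np
from scipy import stats

seuil = 10.0
p_exponentielle = stats.expon.sf(seuil)        # P(X > 10) pour une exponentielle unité
p_pareto = stats.pareto.sf(seuil, b=3)         # loi de Pareto, exposant de queue 3
p_exp_etiree = np.exp(-seuil**0.7)             # exponentielle étirée, exposant c = 0,7

print(p_exponentielle, p_pareto, p_exp_etiree) # la queue exponentielle est la plus fine
```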
Dépendance temporelle intermittente à l'origine des grandes pertes

Ce premier pas s'avère en fait très insuffisant pour apprécier toute la dimension des risques réels encourus. En effet, des études récentes indiquent que l'hypothèse d'indépendance des rendements sur des périodes successives (par exemple journalières) tombe en défaut lors de grands mouvements qui s'avèrent persistants :
la distribution des drawdowns (ou sommes de pertes quotidiennes successives) d'un actif – qu'il s'agisse d'un indice financier, du taux de change entre deux monnaies ou de la cote d'une action – présente un comportement anormal pour les très grands drawdowns. Autrement dit, les très grands drawdowns n'appartiennent pas à la même population que le reste de la statistique observée et font apparaître d'importantes corrélations sérielles qui les rendent beaucoup plus probables. Les distributions parétiennes ou en exponentielles étirées sont insuffisantes pour quantifier ces grands risques intermittents, que nous appelons “outliers”, par référence au vocable statistique désignant des occurrences anormales, distinctes du reste de la population. La nature d'outlier des événements extrêmes semble ne pas se confiner aux systèmes financiers : elle a été proposée également pour la rupture catastrophique de matériaux et de structures industrielles, les tremblements de terre, les catastrophes météorologiques et divers phénomènes biologiques et sociaux. Les crises extrêmes semblent donc résulter de mécanismes amplificateurs spécifiques, signalant probablement des phénomènes coopératifs. Ainsi, des études comportementales, dans lesquelles l'économie dite “cognitive” tient un grand rôle, permettent d'associer ces corrélations sérielles intermittentes, concomitantes des grandes pertes, à certains comportements des acteurs économiques, tels que des effets de panique et/ou d'imitation. Dans ce contexte, les mesures de risques réalisées à partir d'outils standards comme la VaR (Value-at-Risk) peuvent se révéler totalement inadéquates. En effet, une perte journalière de 2 % ou 3 % sur les marchés financiers n'est pas rare et ne constitue pas un événement extrême.
Mais si une perte d'une telle ampleur vient à se reproduire plusieurs jours de suite, qui plus est en s'amplifiant, la perte cumulée peut alors atteindre 10 %, 20 % ou même beaucoup plus, ce qui entraîne des conséquences bien plus dramatiques que ne l'indique la VaR à l'échelle quotidienne. La prise en compte de ce type de dépendances sérielles intermittentes nécessite impérativement le calcul d'indicateurs de risques à plusieurs échelles temporelles permettant de couvrir la distribution des durées de drawdowns, comme le suggère le Comité de Bâle quand il recommande de calculer la VaR sur un intervalle de dix jours. La distribution des drawdowns fournit un tel indicateur parmi d'autres. Notons aussi que de nombreux investisseurs professionnels attachent une grande importance aux drawdowns pour caractériser leurs risques et la qualité d'une stratégie ou d'un portefeuille.
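La notion de drawdown utilisée ici (somme de pertes quotidiennes successives) se code en quelques lignes ; esquisse purement indicative, la fonction `drawdowns` et la série de rendements étant illustratives :

```python
import numpy as np

def drawdowns(rendements):
    """Sommes des pertes quotidiennes successives (runs de rendements négatifs)."""
    dd, courant = [], 0.0
    for r in rendements:
        if r < 0:
            courant += r              # la perte courante s'accumule
        elif courant < 0:
            dd.append(courant)        # fin du run : on enregistre le drawdown
            courant = 0.0
    if courant < 0:
        dd.append(courant)            # run encore ouvert en fin de série
    return np.array(dd)

r = np.array([0.01, -0.02, -0.03, 0.015, -0.01, 0.02, -0.04, -0.01, -0.02])
print(drawdowns(r))                   # ≈ [-0.05, -0.01, -0.07]
```

C'est la queue de la distribution ainsi obtenue (les très grands drawdowns) qui exhibe le comportement anormal discuté ci-dessus.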
Dépendance de queue et contagion

Cependant, une gestion des risques digne de ce nom ne peut se réduire à une étude individuelle de chaque actif. En effet, l'épine dorsale de la gestion des risques est la diversification par la constitution de portefeuilles de risques, aussi bien en assurance qu'en finance. De la même manière que le paradigme gaussien est inadéquat pour quantifier les grands risques des distributions marginales des rendements, la covariance intervenant dans la théorie standard du portefeuille ne donne qu'une idée très limitée des grands risques collectifs. Ces grands risques de portefeuille résultent en effet de la conjonction d'effets non gaussiens dans les distributions marginales à queues épaisses et dans les dépendances entre actifs. On peut donner une idée de l'importance des effets de dépendance non gaussienne en étudiant la “dépendance de queue” λ, c'est-à-dire la probabilité pour que l'actif X subisse une perte plus grande que Xq, associée au quantile q tendant vers zéro, conditionnée à la réalisation d'une perte de l'actif Y plus grande que Yq, associée au même quantile q. Il se trouve que λ est nul pour les actifs à dépendance gaussienne ! Par contre, dans le cadre des modèles à facteurs, nous avons montré que seules les distributions sous-exponentielles (et plus particulièrement parétiennes) présentent une dépendance de queue asymptotique (pour q tendant vers 0). Il est possible d'accéder au paramètre de dépendance de queue entre deux actifs par des méthodes qui ne font pas appel à une détermination statistique directe. Nos tests empiriques trouvent alors un bon accord entre le calibrage du coefficient de dépendance de queue et les grandes pertes réalisées entre les années 1962 et 2000 pour des actions principales et de grands indices de marché. Conditionnellement à un grand mouvement du marché, on peut ainsi déduire la probabilité que tel ou tel actif subisse une perte du même ordre.
Nature multidimensionnelle des risques

Le Graal est de conjuguer la description des distributions marginales sous-exponentielles avec les dépendances inter-actifs non gaussiennes et, idéalement, les dépendances temporelles intermittentes amenant les grandes pertes, pour établir un portefeuille optimal. Le problème est alors que la notion d'“optimalité” n'est pas évidente à définir en pratique : si la théorie économique nous dit de maximiser la fonction d'utilité de l'investisseur, en réalité nous ne la connaissons pas avec précision. Le problème se complique par les multiples dimensions du risque introduites par la nature non gaussienne des distributions de rendement et des dépendances. Dans une série d'articles, nous avons développé une théorie du portefeuille reposant sur la caractérisation des risques par les cumulants de la distribution des rendements du portefeuille. Les cumulants, notés cn, s'expriment comme des combinaisons de moments et quantifient notamment l'écart à la gaussienne. De même que les moments, les cumulants n'existent pas tous pour les distributions parétiennes, mais sont définis à tout ordre pour les distributions exponentielles étirées. En particulier, les cumulants d'ordres un et deux sont respectivement la moyenne et la variance des rendements, tandis que les cumulants d'ordres trois et quatre (après normalisation par l'écart type) permettent de définir la skewness et la kurtosis. De façon générale, les cumulants d'ordres pairs quantifient des risques d'autant plus grands que l'ordre du cumulant considéré est élevé, tandis que les cumulants d'ordres impairs caractérisent la dissymétrie entre les queues positives et négatives de la distribution des rendements. Plus l'ordre n du cumulant cn considéré est grand, plus celui-ci accorde d'importance aux événements extrêmes. L'ordre n des cumulants allant de 1 à l'infini, faire varier n revient à étaler ou développer toutes les dimensions du risque : les “petits” risques quantifiés par c2 et les “grands” risques quantifiés par c4 et les cumulants d'ordres plus élevés. Notre théorie du portefeuille utilise les distributions marginales de la famille des exponentielles étirées, et la dépendance entre actifs est décrite par la copule gaussienne. Si l'on souhaite créer un portefeuille qui évite les grands risques, on choisit le poids des actifs de telle manière que les cumulants c4, c6, c8, etc. soient tous proches de leur minimum, tout en laissant libre la variance c2 (petits risques). Cette approche est très différente de l'approche standard de Markowitz, qui se focalise sur c2 et construit de plus une frontière efficiente dans l'espace rendement-variance.
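Les cumulants évoqués ci-dessus s'estiment directement à partir des k-statistiques (estimateurs non biaisés des cumulants) ; esquisse illustrative sur des rendements simulés à queues épaisses (loi de Student à 5 degrés de liberté, échelle arbitraire) :

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
rendements = rng.standard_t(df=5, size=100_000) * 0.01   # rendements simulés à queues épaisses

# k-statistiques : estimateurs non biaisés des cumulants c1..c4
c1, c2, c3, c4 = (stats.kstat(rendements, n=k) for k in (1, 2, 3, 4))
skewness = c3 / c2**1.5
kurtosis = c4 / c2**2        # excès de kurtosis : nul pour une gaussienne

print(c2, kurtosis)          # c2 : "petits" risques ; kurtosis élevée : "grands" risques
```

Pour une gaussienne, c4 (et donc l'excès de kurtosis) serait proche de zéro ; la valeur nettement positive obtenue ici signale le poids des événements extrêmes.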
Petits risques, grands risques et rendement

Pour illustrer l'impact de la décomposition des risques en “petits” et “grands” risques, considérons le cas simple d'un portefeuille avec seulement deux actifs : l'action Chevron et la devise malaise, le ringgit. Ces deux actifs ont des caractéristiques très différentes et illustrent admirablement un effet a priori surprenant. La figure 1a montre le rendement quotidien d'un portefeuille dont la proportion w1, investie dans l'action Chevron, a été obtenue en minimisant la variance. La figure 1b donne la solution de la minimisation de cn vis-à-vis du poids w1. Les lignes de points horizontales sont les valeurs maximales du rendement quotidien dans le cas où l'on optimise c2n, pour n > 1. Les rendements quotidiens pour le portefeuille de la figure 1a surpassent ces limites, i.e. le portefeuille de la figure 1a subit plus de fluctuations de grande amplitude. Ces deux figures illustrent clairement le fait que minimiser les petits risques peut faire augmenter les grands risques ! De plus, le gain cumulé de la figure 1c montre que le portefeuille de la figure 1b voit son gain s'accroître considérablement par rapport au portefeuille standard à la Markowitz. Autrement dit, “on peut avoir le beurre et l'argent du beurre” : diminuer les grands risques et augmenter le profit !
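L'effet décrit ci-dessus peut se reproduire qualitativement sur données simulées ; esquisse hypothétique : les deux séries ci-dessous ne sont pas les données Chevron/ringgit originales, mais des substituts arbitraires, l'un gaussien à rendement plus élevé, l'autre à queues épaisses et faible variance apparente :

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 50_000
r_actif = rng.normal(0.0006, 0.015, n)            # rendement moyen plus élevé, queues modérées
r_devise = 0.0001 + 0.004 * rng.standard_t(3, n)  # faible variance, queues épaisses

poids = np.linspace(0.0, 1.0, 201)
c2 = [np.var(w * r_actif + (1 - w) * r_devise) for w in poids]
c4 = [stats.kstat(w * r_actif + (1 - w) * r_devise, n=4) for w in poids]

w_var = poids[np.argmin(c2)]   # portefeuille "à la Markowitz" (petits risques)
w_c4 = poids[np.argmin(c4)]    # portefeuille minimisant les grands risques
print(w_var, w_c4)
```

Comme dans l'article, la minimisation de la variance sur-pondère l'actif à queues épaisses (faible variance, grands risques cachés), tandis que la minimisation de c4 reporte le poids vers l'autre actif.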
[Figure 1 – Rendements quotidiens annualisés (en pourcentage) et richesse cumulée pour les deux portefeuilles correspondant au minimum de la variance (poids Chevron w1 = 0,095, panneau a) et au minimum des cumulants c2n d'ordre 2n > 2 (poids Chevron w1 = 0,38, panneau b) ; le panneau c compare la richesse cumulée des deux portefeuilles.]
Le mécanisme de cet effet remarquable est simple : le ringgit malais contribue le plus aux cumulants d'ordres élevés (grands risques) et a de plus un rendement très faible par rapport à celui de Chevron. Par contre, sa distribution étroite dans la zone des petits rendements lui donne une faible variance. L'optimisation à la Markowitz mettra donc plus de poids sur le ringgit, qui semble apporter une diversification intéressante du point de vue de la variance. Mais c'est une illusion dangereuse, car le risque réel du ringgit est beaucoup plus grand que ne le fait croire sa variance. Le cumulant c4, par exemple, le quantifie clairement, et sa minimisation conduit en conséquence à réduire le poids de la devise malaise dans le portefeuille. Du coup, les grands risques sont réduits. Comme le ringgit n'a que peu de rendement, on gagne alors sur les deux tableaux, car le rendement augmente nettement.
Bibliographie

Andersen, J. V. et D. Sornette (2001), “Have your cake and eat it too: increasing returns while lowering large risks!”, Journal of Risk Finance 2 (3), 70-82.

Embrechts, P., C. Klüppelberg et T. Mikosch (1997), Modelling Extremal Events for Insurance and Finance (Springer, New York).

Malevergne, Y. et D. Sornette (2001), “General framework for a portfolio theory with non-Gaussian risks and non-linear correlations”, communication à la 18th International Conference in Finance, juin 2001, Namur, Belgique (http://arXiv.org/abs/cond-mat/0103020).

Malevergne, Y. et D. Sornette (2002), “Tail dependence of factor models”, working paper (http://arXiv.org/abs/cond-mat/0202356).

Johansen, A. et D. Sornette (2002), “Large price drawdowns are outliers”, Journal of Risk 4 (2) (http://arXiv.org/abs/cond-mat/0010050).

Sornette, D. (1999), “Complexity, catastrophe and physics”, Physics World 12 (12), 57.

Sornette, D. (2002), “Predictability of catastrophic events: material rupture, earthquakes, turbulence, financial crashes and human birth”, Proceedings of the National Academy of Sciences USA (http://arXiv.org/abs/cond-mat/0107173).

Sornette, D., P. Simonetti et J. V. Andersen (2000), “φq-field theory for portfolio optimization: ‘fat tails’ and non-linear correlations”, Physics Reports 335 (2), 19-92.

Zajdenweber, D. (2000), Économie des extrêmes (Flammarion, Paris).
12.2 Minimiser l'impact des grands co-mouvements

À l'aide des résultats exposés au chapitre 9, section 9.2, nous montrons comment le coefficient de dépendance de queue entre un actif et un de ses facteurs explicatifs (le marché, par exemple), ou entre deux actifs, peut facilement être calibré. Nous construisons ensuite des portefeuilles composés d'actifs ayant de très faibles coefficients de dépendance de queue avec le marché et montrons qu'ils présentent une corrélation remarquablement plus faible que des portefeuilles composés d'actifs ayant une plus forte dépendance de queue avec le marché, et ce sans dégradation de la performance mesurée par le ratio de Sharpe.
Reprint from : Y. Malevergne et D. Sornette (2002), Minimizing extremes, RISK 15 (11), 129-134.
Cutting edge | Portfolio tail risk
Minimising extremes

Portfolio diversification often breaks down in stressed market environments, but the co-movement of asset prices in a tail risk regime may be modelled using a coefficient of tail dependence. Here, Yannick Malevergne and Didier Sornette show how such coefficients can be estimated analytically using the parameters of factor models, while avoiding the problem of under-sampling of extreme values
More than 100 years ago, Vilfredo Pareto discovered a statistical relationship, now known as the 80-20 rule, that manifests itself over and over in large systems: “In any series of elements to be controlled, a selected small fraction, in terms of numbers of elements, always accounts for a large fraction in terms of effect.” The stock market is no exception: events occurring over a very small fraction of the total invested time may account for most of the gains and/or losses. Diversifying away such large risks requires novel approaches to portfolio management, which must take into account the non-Gaussian fat-tailed structure of distributions of returns and their dependence. Recent economic shocks and crashes have shown that standard portfolio diversification works well in normal times but may break down in stressful times, precisely when diversification is most important. One could say that diversification works when one does not really need it and may fail severely when it is most needed. Technically, the question boils down to whether large price movements occur mainly in an isolated manner or in a co-ordinated way. This question is vital for fund managers who take advantage of diversification to minimise their risks. Here, we introduce a new technique to quantify and empirically estimate the propensity for assets to exhibit extreme co-movements, through the use of the so-called coefficient of tail dependence. Using a factor model framework and tools from extreme value theory, we provide novel analytical formulas for the coefficient of tail dependence between arbitrary assets, which yield an efficient non-parametric estimator. We then construct portfolios of stocks with minimal tail dependence with the market, represented by the S&P 500, and show that their superior behaviour in stressed times comes together with qualities in terms of Sharpe ratio and standard quality measures that are at least as good as those of standard portfolios.
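As a quick sanity check (a standard property of the Pareto law, not a result from this article): for a Pareto distribution with tail exponent alpha, the richest fraction p of the population accounts for a share p^((alpha − 1)/alpha) of the total, and alpha ≈ 1.16 reproduces the 80-20 rule:

```python
def top_share(p, alpha):
    """Share of the total accounted for by the top fraction p,
    for a Pareto distribution with tail exponent alpha > 1."""
    return p ** ((alpha - 1.0) / alpha)

print(top_share(0.20, 1.16))  # ≈ 0.80: the top 20% account for about 80% of the total
```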
Assessing large co-movements

Standard estimators of the dependence between assets include the correlation coefficient and Spearman's rank correlation. However, as stressed by Embrechts, McNeil & Straumann (1999), these kinds of dependence measures suffer from many deficiencies. Moreover, their values are mostly controlled by relatively small moves of the asset prices around their mean. To solve this problem, it has been proposed to use correlation coefficients conditioned on large movements of the assets. But Boyer, Gibson & Loretan (1997) have emphasised that this approach also suffers from a severe systematic bias leading to spurious strategies: the conditional correlation in general evolves with time even when the true non-conditional correlation remains constant. In fact, Malevergne & Sornette (2002a) have shown that any approach based on conditional dependence measures implies a spurious change of the intrinsic value of the dependence, measured for instance by copulas. Recall that the copula of several random variables is the (unique) function (for continuous marginals) that completely embodies the dependence between these variables, irrespective of their marginal behaviour (see Nelsen, 1998, for a mathematical description of the notion of copula). In view of these limitations of the standard statistical tools, it is natural to turn to extreme value theory. In the univariate case, extreme value theory is very useful and provides many tools for investigating the extreme
tails of distributions of assets' returns. These new developments rest on the existence of a few fundamental results on extremes, such as the Gnedenko-Pickands-Balkema-de Haan theorem, which gives a general expression for the conditional distribution of exceedances over a large threshold. In this framework, the study of large and extreme co-movements requires the multivariate extreme value theory, which, in contrast with the univariate case, cannot be used to constrain accurately the distribution of large co-movements, since the class of limiting extreme-value distributions is too broad. In the spirit of the mean-variance portfolio or of utility theory, which establish an investment decision on a unique risk measure, we use the coefficient of tail dependence, which, to our knowledge, was first introduced in a financial context by Embrechts, McNeil & Straumann (2002). The coefficient of tail dependence between assets Xi and Xj is a very natural and easily comprehensible measure of extreme co-movements. It is defined as the probability that the asset Xi incurs a large loss (or gain) assuming that the asset Xj also undergoes a large loss (or gain) at the same probability level, in the limit where this probability level explores the extreme tails of the distribution of returns of the two assets. Mathematically speaking, the coefficient of lower tail dependence between the two assets Xi and Xj, denoted by λ−ij, is defined by:

λ−ij = lim_{u→0} Pr{ Xi < Fi^{−1}(u) | Xj < Fj^{−1}(u) }   (1)
where Fi^{−1}(u) and Fj^{−1}(u) represent the quantiles of assets Xi and Xj at the level u. Similarly, the coefficient of upper tail dependence is:

λ+ij = lim_{u→1} Pr{ Xi > Fi^{−1}(u) | Xj > Fj^{−1}(u) }   (2)

λ−ij (respectively λ+ij) is of concern to investors with long (respectively short) positions. We refer to Coles, Heffernan & Tawn (1999) and references therein for a survey of the properties of the coefficient of tail dependence. Let us stress that the use of quantiles in the definition of λ−ij and λ+ij makes them independent of the marginal distributions of the asset returns. As a consequence, the tail dependence parameters are intrinsic dependence measures. The obvious gain is an ‘orthogonal’ decomposition of the risks into (1) individual risks carried by the marginal distribution of each asset and (2) their collective risk described by their dependence structure or copula. Being a probability, the coefficient of tail dependence varies between zero and one. A large value of λ−ij means that large losses are more likely to occur together. Then, large risks cannot be diversified away and the assets crash together. This investor and portfolio manager nightmare is further amplified in real-life situations by the limited liquidity of markets. When λ−ij vanishes, these assets are said to be asymptotically independent, but this term hides the subtlety that the assets can still present a non-zero dependence in their tails. For instance, two assets with a bivariate normal distribution with correlation coefficient less than one can be shown to have a vanishing coefficient of tail dependence. Nevertheless, unless their correlation coefficient is zero, these assets are never independent. Thus, asymptotic independence must be understood as the weakest dependence
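A direct, fully non-parametric reading of definition (1) is easy to code. The illustrative sketch below (function name and simulated bivariate Student data are mine) works well on large simulated samples, though on real data it suffers from under-sampling of extreme values as u → 0:

```python
import numpy as np

def lower_tail_dependence(x, y, u):
    """Empirical Pr[X < F_X^{-1}(u) | Y < F_Y^{-1}(u)] at quantile level u."""
    qx, qy = np.quantile(x, u), np.quantile(y, u)
    cond = y < qy
    return np.mean(x[cond] < qx)

rng = np.random.default_rng(3)
n, rho, nu = 500_000, 0.5, 3
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
x, y = z.T / np.sqrt(rng.chisquare(nu, size=n) / nu)   # bivariate Student draws

for u in (0.05, 0.01, 0.002):
    print(u, lower_tail_dependence(x, y, u))   # stabilises near a positive limit as u shrinks
```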
that can be quantified by the coefficient of tail dependence (for other details, the reader is referred to Ledford & Tawn, 1998). For practical implementations, a direct application of the definitions (1) and (2) fails to provide reasonable estimations due to the double curse of dimensionality and under-sampling of extreme values, so that a fully non-parametric approach is not reliable. It turns out to be possible to circumvent this fundamental difficulty by considering the general class of factor models, which are among the most widespread and versatile models in finance. They come in two classes: multiplicative and additive factor models respectively. The multiplicative factor models are generally used to model asset fluctuations due to an underlying stochastic volatility (see, for example, Hull & White, 1987, and Taylor, 1994, for a survey of the properties of these models). The additive factor models are made to relate asset fluctuations to market fluctuations, as in the capital asset pricing model and its generalisations (see, for example, Sharpe, 1964, and Rubinstein, 1973), or to any set of common factors as in Ross' (1976) arbitrage pricing theory. The coefficient of tail dependence is known in closed form for both classes of factor models, which allows, as we shall see, for an efficient empirical estimation.

Tail dependence generated by factor models

We first examine multiplicative factor models, which account for most of the stylised facts observed on financial time series. A multivariate stochastic volatility model with a common stochastic volatility factor can be written as:

X = σY   (3)

where σ is a positive random variable modelling the volatility, Y is a Gaussian random vector, independent of σ, and X is the vector of asset returns. In this framework, the multivariate distribution of asset returns X is an elliptical multivariate distribution. For instance, if the inverse of the square of the volatility 1/σ² is a constant times a χ²-distributed random variable with ν degrees of freedom, the distribution of asset returns will be the Student distribution with ν degrees of freedom. When the volatility follows Arch or Garch processes, the asset returns are also elliptically distributed with fat-tailed marginal distributions. Thus, any asset Xi is asymptotically distributed according to a regularly varying distribution¹: Pr{|Xi| > x} ~ L(x) × x^{−ν}, where L(⋅) denotes a slowly varying function, with the same exponent ν for all assets, due to the ellipticity of their multivariate distribution. Hult & Lindskog (2002) have shown that the necessary and sufficient condition for any two assets Xi and Xj with an elliptical multivariate distribution to have a non-vanishing coefficient of tail dependence is that their distribution be regularly varying. Denoting by ρij the correlation coefficient between the assets Xi and Xj and by ν the tail index of their distributions, they obtain:

λ±ij = ( ∫_{(π/2 − arcsin ρij)/2}^{π/2} cos^ν t dt ) / ( ∫_0^{π/2} cos^ν t dt ) = I_{(1+ρij)/2}( (ν+1)/2 , 1/2 )   (4)

where the function:

I_x(z, w) = (1/B(z, w)) ∫_0^x t^{z−1} (1 − t)^{w−1} dt   (5)

denotes the incomplete beta function. This expression holds for any regularly varying elliptical distribution, irrespective of the exact shape of the distribution. Only the tail index is important in the determination of the coefficient of tail dependence, because λ±ij probes the extreme end of the tail of the distributions, which all have, roughly speaking, the same behaviour for regularly varying distributions. In contrast, when the marginal distributions decay faster than any power law, such as the Gaussian distribution, the coefficient of tail dependence is zero.

[Figure 1 – Tail dependence versus correlation. Evolution as a function of the correlation coefficient ρ of the coefficient of tail dependence for an elliptical bivariate Student distribution (solid line) and for the additive factor model with Student factor and noise (dashed line); top panel: tail index ν = 3, bottom panel: tail index ν = 10.]

Let us now turn to the second class of additive factor models, whose introduction in finance goes back at least to the arbitrage pricing theory (Ross, 1976). They are now widely used in many branches of finance, including to model stock returns, interest rates and credit risks. Here, we shall only consider the effect of a single factor, which may represent the market, for example. This factor will be denoted by Y and its cumulative distribution by FY. As previously, the vector X is the vector of asset returns and ε will denote the vector of idiosyncratic noises, assumed independent² of Y. β is the vector whose components are the regression coefficients of the Xi on the factor Y. Thus, the factor model reads:

X = βY + ε   (6)
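Equation (4) can be evaluated directly with SciPy's regularised incomplete beta function; a minimal sketch (the function name is mine):

```python
from scipy.special import betainc

def tail_dependence_elliptical(rho, nu):
    """Equation (4): coefficient of tail dependence of a regularly varying
    elliptical pair with correlation rho and tail index nu."""
    # betainc(a, b, x) is the regularised incomplete beta function I_x(a, b)
    return betainc((nu + 1.0) / 2.0, 0.5, (1.0 + rho) / 2.0)

print(tail_dependence_elliptical(0.0, 3))   # ≈ 0.116 for the Student case, nu = 3
print(tail_dependence_elliptical(-1.0, 3))  # 0.0: perfectly anti-correlated assets
```

As the article notes, the result depends on the distribution only through ν and ρij: the whole solid line of figure 1 is this one-line computation.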
In contrast with multiplicative factor models, the multivariate distribution of X cannot be obtained in an analytical form, in the general case. In the particular situation when Y and ε are normally distributed, the multivariate distribution of X is also normal, but this case is not very interesting. In a sense, additive factor models are richer than the multiplicative ones, since they give birth to a larger set of distributions of asset returns. Notwithstanding these difficulties, it turns out to be possible to obtain the coefficient of tail dependence for any pair of assets Xi and Xj. In a first step, let us consider the coefficient of tail dependence λ±i between any asset Xi and the factor Y itself. Malevergne & Sornette (2002b) have shown that λ±i is identically zero for all rapidly varying factors, that is, for all factors whose distribution decays faster than any power law, such as the Gaussian, exponential or gamma laws. When the factor Y has a distribution that is regularly varying with tail index ν, we have:

¹ See Bingham, Goldie & Teugels (1987) for details on regular variations.
² In fact, ε and Y can be weakly dependent (see Malevergne & Sornette, 2002b, for details).
λ_i^+ = 1 / max(1, l/β_i)^ν   (7)

where

l = lim_{u→1} F_{X_i}^{-1}(u) / F_Y^{-1}(u).

A similar expression holds for λ_i^−, which is obtained by simply replacing the limit u → 1 by u → 0 in the definition of l. λ_i^± is non-zero as long as l remains finite, that is, when the tail of the distribution of the factor is not thinner than the tail of the idiosyncratic noise ε_i. Therefore, two conditions must hold for the coefficient of tail dependence to be non-zero: the factor must be intrinsically 'wild' (to use the terminology of Mandelbrot, 1997), so that its distribution is regularly varying; and the factor must be sufficiently 'wild' in its intrinsic variability, so that its influence is not dominated by the idiosyncratic component of the asset. The amplitude of λ_i^± is then determined by the trade-off between the relative tail behaviours of the factor and the idiosyncratic noise. As an example, let us consider that the factor and the idiosyncratic noise follow Student distributions with ν_Y and ν_{ε_i} degrees of freedom and scale factors σ_Y and σ_{ε_i} respectively. Expression (7) leads to:

λ_i = 0                                      if ν_Y > ν_{ε_i}
λ_i = 1 / [1 + (σ_{ε_i} / (β_i σ_Y))^ν]     if ν_Y = ν_{ε_i} = ν     (8)
λ_i = 1                                      if ν_Y < ν_{ε_i}

The tail dependence decreases when the idiosyncratic volatility increases relative to the factor volatility. Therefore, λ_i decreases in periods of high idiosyncratic volatility and increases in periods of high market volatility. From the viewpoint of the tail dependence, the volatility of an asset is not relevant per se. What governs extreme co-movement is the relative weight of the different components of the volatility of the asset. Figure 1 compares the coefficient of tail dependence as a function of the correlation coefficient for the bivariate Student distribution (expression (4)) and for the factor model with the factor and the idiosyncratic noise following Student distributions (equation (8)). Contrary to the coefficient of tail dependence of the Student factor model, the tail dependence of the (elliptical) Student distribution does not vanish for negative correlation coefficients. For large values of the correlation coefficient, the former is always larger than the latter. Once the coefficients of tail dependence between the assets and the common factor are known, the coefficient of tail dependence between any two assets X_i and X_j with a common factor Y is simply equal to the weakest tail dependence between the assets and their common factor:

λ_ij = min{λ_i, λ_j}   (9)

This result is very intuitive: since the dependence between the two assets is due to their common factor, this dependence cannot be stronger than the weakest dependence between each of the assets and the factor.

Practical implementation and consequences

The two mathematical results (4) and (7) have a very important practical effect for estimating the coefficient of tail dependence. As we have already pointed out, its direct estimation is essentially impossible since, by definition, the number of observations goes to zero as the probability level of the quantile goes to zero (or one). In contrast, the formulas (4) and (7)–(9) tell us that one has just to estimate a tail index and a correlation coefficient. These estimations can be reasonably accurate because they make use of a significant part of the data beyond the few extremes targeted by λ. Moreover, equation (7) does not explicitly assume a power law behaviour, but only a regularly varying behaviour, which is far more general. In such a case, the empirical quantile ratio l in (7) turns out to be stable enough for its accurate non-parametric estimation, as shown in figure 2.

[Figure 2. Quantile ratio: empirical estimate l̂ of the quantile ratio l in (7) versus the empirical quantile k/N. We observe a very good stability of l̂ for quantiles ranging between 0.005 and 0.05.]

As an example, table A presents the results obtained both for the upper and lower coefficients of tail dependence between several major stocks and the market factor represented here by the S&P 500 index, over the past decade. The estimation has been performed under the assumption that equation (6) holds, rather than under the ellipticality assumption yielding equation (4). In the present context of the dependence between stocks and an index (not between two stocks), favouring the factor model is very reasonable since, according to financial theory, the market's return is well known to be the most important explanatory factor for each individual asset return.³ The technical aspects of the method are given in the Appendix. The coefficient of tail dependence between any two assets is easily derived from (9).

A. Coefficients of tail dependence

Stock                     Lower tail dependence   Upper tail dependence
Bristol-Myers Squibb      0.16 (0.03)             0.14 (0.01)
Chevron                   0.05 (0.01)             0.03 (0.01)
Hewlett-Packard           0.13 (0.01)             0.12 (0.01)
Coca-Cola                 0.12 (0.01)             0.09 (0.01)
Minnesota Mining & MFG    0.07 (0.01)             0.06 (0.01)
Philip Morris             0.04 (0.01)             0.04 (0.01)
Procter & Gamble          0.12 (0.02)             0.09 (0.01)
Pharmacia                 0.06 (0.01)             0.04 (0.01)
Schering-Plough           0.12 (0.01)             0.11 (0.01)
Texaco                    0.04 (0.01)             0.03 (0.01)
Texas Instruments         0.17 (0.02)             0.12 (0.01)
Walgreen                  0.11 (0.01)             0.09 (0.01)

This table presents the coefficients of lower and upper tail dependence with the S&P 500 index for a set of 12 major stocks traded on the New York Stock Exchange from January 1991 to December 2000. The numbers in brackets give the estimated standard deviation of the empirical coefficients of tail dependence.

³ In a situation where the common factor cannot be easily identified or estimated, the ellipticality assumption may provide a useful alternative.

WWW.RISK.NET ● NOVEMBER 2002 RISK

[Figure 3. Portfolios versus market: daily returns of two equally weighted portfolios P1 (made of four stocks with small λ ≤ 0.06) and P2 (made of four stocks with large λ ≥ 0.12) as a function of the daily returns of the S&P 500 from Jan 1991–Dec 2000. The straight lines show the linear regressions of portfolio 1 and portfolio 2 on the S&P 500 index.]

It is interesting to observe that the coefficients of tail dependence seem almost identical in the lower and the upper tail. Nonetheless, the coefficient of lower tail dependence is always slightly larger than the upper one, showing that large losses are more likely to occur together than large gains. Two clusters of assets stand out: those with a tail dependence of about 10% (or more) and those with a tail dependence of about 5%. Since the estimation of the tail-dependence coefficients has been performed under the assumption that equation (6) holds, it is interesting to compare the values with the tail-dependence coefficient under the assumption of joint ellipticality leading to (4). To get a reliable correlation coefficient ρ, we calculate Kendall's tau coefficient τ, use the relation ρ = sin(πτ/2) and derive λ from (4), assuming ν = 3 or 4. For ν = 3 (respectively ν = 4), all λ's are in the range 0.25–0.30 (respectively 0.20–0.25). Thus, assuming joint ellipticality, the tail-dependence coefficients between stocks and the index are much more homogeneous than found with the factor model. As we show in figure 3, it is clear that the portfolio of stocks with small tail-dependence coefficients with the index (measured with the factor model) exhibits significantly less dependence than the portfolio constructed with stocks with large λ's (measured with the factor model). This
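For concreteness, the closed forms (7)–(9) for the Student factor model are easy to evaluate numerically. The sketch below is ours, not code from the article, and the function names are hypothetical:

```python
def lambda_factor_student(beta, nu_Y, nu_eps, sigma_Y, sigma_eps):
    """Tail dependence between an asset and the factor, equation (8),
    for a Student factor (tail index nu_Y, scale sigma_Y) and a Student
    idiosyncratic noise (tail index nu_eps, scale sigma_eps)."""
    if nu_Y > nu_eps:   # factor tail thinner than the noise tail
        return 0.0
    if nu_Y < nu_eps:   # factor tail dominates the noise tail
        return 1.0
    nu = nu_Y           # common tail index nu_Y = nu_eps = nu
    return 1.0 / (1.0 + (sigma_eps / (beta * sigma_Y)) ** nu)

def lambda_pair(lam_i, lam_j):
    """Tail dependence between two assets sharing the factor, equation (9)."""
    return min(lam_i, lam_j)
```

For instance, beta = 1, nu = 3 and equal scales give λ = 1/2, and the coefficient decreases as sigma_eps grows relative to beta * sigma_Y, as stated in the text.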
should not be observed if all λ's are the same, as predicted under the assumption of joint ellipticality. We now explore some consequences of the existence of stocks with drastically different tail-dependence coefficients with the index. These stocks offer the interesting possibility of devising a prudential portfolio that can be significantly less sensitive to large market moves.

Figure 3 compares the daily returns of the S&P 500 index with those of two portfolios P1 and P2. P1 comprises the four stocks (Chevron, Philip Morris, Pharmacia and Texaco) with the smallest λ's, while P2 comprises the four stocks (Bristol-Myers Squibb, Hewlett-Packard, Schering-Plough and Texas Instruments) with the largest λ's. For each set of stocks, we have constructed two portfolios, one in which each stock has the same weight 1/4 and the other with asset weights chosen to minimise the variance of the resulting portfolio. We find that the results are almost the same for the equally weighted and minimum-variance portfolios. This makes sense since the tail-dependence coefficient of a bivariate random vector does not depend on the variances of the components, which only account for price moves of moderate amplitude. Figure 3 shows the results for the equally weighted portfolios generated from the two groups of assets. Observe that only one large drop occurs simultaneously for P1 and for the S&P 500 index, in contrast with P2, for which several large drops are associated with the largest drops of the index and only a few occur desynchronised. The figure clearly shows an almost circular scatter plot for the large moves of P1 and the index, compared with a rather narrow ellipse, whose long axis lies approximately along the first diagonal, for the large returns of P2 and the index, illustrating that the small tail dependence between the index and the four stocks in P1 automatically implies that their mutual tail dependence is also very small, according to (9).
As a consequence, P1 offers a better diversification with respect to large drops than P2. This effect, already quite significant for such small portfolios, should be overwhelming for large ones. The most interesting result stressed in figure 3 is that optimising for minimum tail dependence automatically diversifies away the large risks. These advantages of portfolio P1 with small tail dependence compared with portfolio P2 with large tail dependence with the S&P 500 index come at almost no cost in terms of the daily Sharpe ratio, which is equal respectively to 0.058 and 0.061 for the equally weighted and minimum variance P1 and to 0.069 and 0.071 for the equally weighted and minimum variance P2. The straight lines represent the linear regression of the two portfolios’
returns on the index returns, which shows that there is significantly less linear correlation between P1 and the index (correlation coefficient of 0.52 for both the equally weighted and the minimum-variance P1) than between P2 and the index (correlation coefficient of 0.73 for the equally weighted P2 and of 0.70 for the minimum-variance P2).

Theoretically, it is possible to construct two random variables with a small correlation coefficient and a large λ, and vice versa. Recall that the correlation coefficient and the tail-dependence coefficient are two opposite end-members of dependence measures: the correlation coefficient quantifies the dependence between relatively small moves, while the tail-dependence coefficient measures the dependence during extreme events. The finding that P1 comes with both the smallest correlation and the smallest tail-dependence coefficients suggests that these are not independent properties of assets. This intuition is in fact explained and encompassed by the factor model, since the larger β is, the larger are both the correlation coefficient and the tail dependence. Diversifying away extreme shocks may thus also provide a useful diversification tool for less extreme dependences, improving the potential usefulness of the strategy of portfolio management based on tail dependence proposed here.

As a final remark, the almost identical values of the coefficients of tail dependence for the negative and positive tails show that the assets most likely to suffer from the large losses of the market factor are also those most likely to take advantage of its large gains. This has the following consequence: minimising the large concomitant losses between the stocks and the market means renouncing the potential concomitant large gains. This point is well exemplified by our two portfolios (see figure 3): P2 obviously underwent severe negative co-movements, but it also enjoyed large gains with the large positive movements of the index.
In contrast, P1 is almost completely decoupled from the large negative movements of the market but is also insensitive to its large positive movements. Thus, a good dynamic strategy seems to be: invest in P1 during bearish or trend-less market phases and prefer P2 in a bullish market. ■

Appendix: empirical estimation of the coefficient of tail dependence

We show how to estimate the coefficient of tail dependence between an asset X and the market factor Y related by the relation (6), where ε is an idiosyncratic noise uncorrelated with Y. Given a sample of N realisations {X1, X2, ..., XN} and {Y1, Y2, ..., YN} of X and Y, we first estimate the coefficient β using the ordinary least-squares estimator; let β̂ denote its estimate. Then, using Hill's estimator, we obtain the tail index ν̂ of the factor Y:

ν̂_k^{-1} = (1/k) Σ_{j=1}^{k} (log Y_{j,N} − log Y_{k,N}),

where Y_{1,N} ≥ Y_{2,N} ≥ ... ≥ Y_{N,N} are the order statistics of the N realisations of Y. The constant l is non-parametrically estimated with the formula:

l̂ = X_{k,N} / Y_{k,N}

for k = o(N), which means that k must remain very small with respect to N but large enough to ensure an accurate determination of l. Figure 2 presents l̂ as a function of k/N. Finally, using equation (7), the estimated coefficient of tail dependence is:

λ̂⁺ = 1 / max(1, l̂/β̂)^ν̂

Yannick Malevergne is a PhD student at the University of Nice-Sophia Antipolis and at the ISFA Actuarial School – University of Lyon. Didier Sornette is a CNRS research director at the University of Nice-Sophia Antipolis and professor of geophysics at the University of California at Los Angeles. They acknowledge helpful discussions with Jean-Paul Laurent. This work was partially supported by the James S McDonnell Foundation twenty-first century scientist award/studying complex system. e-mail: [email protected] and [email protected]
REFERENCES

Bingham N, C Goldie and J Teugels, 1987. Regular Variation. Cambridge University Press, Cambridge.
Boyer B, M Gibson and M Loretan, 1997. Pitfalls in tests for changes in correlations. International Finance Discussion Paper 597, Board of Governors of the Federal Reserve System.
Coles S, J Heffernan and J Tawn, 1999. Dependence measures for extreme value analysis. Extremes 2, pages 339–365.
Embrechts P, A McNeil and D Straumann, 1999. Correlation: pitfalls and alternatives. Risk May, pages 69–71.
Embrechts P, A McNeil and D Straumann, 2002. Correlation and dependence in risk management: properties and pitfalls. In Risk Management: Value at Risk and Beyond, edited by M Dempster, pages 176–223, Cambridge University Press, Cambridge.
Hull J and A White, 1987. The pricing of options on assets with stochastic volatilities. Journal of Finance 42, pages 281–300.
Hult H and F Lindskog, 2002. Multivariate extremes, aggregation and dependence in elliptical distributions. Forthcoming in Advances in Applied Probability 34(3).
Ledford A and J Tawn, 1998. Concomitant tail behaviour for extremes. Advances in Applied Probability 30, pages 197–215.
Malevergne Y and D Sornette, 2002a. Investigating extreme dependences: concepts and tools. Working paper, available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=303465.
Malevergne Y and D Sornette, 2002b. Tail dependence of factor models. Working paper, available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=301266.
Mandelbrot B, 1997. Fractals and scaling in finance: discontinuity, concentration. Springer-Verlag, New York.
Nelsen R, 1998. An Introduction to Copulas. Lecture Notes in Statistics 139, Springer-Verlag, New York.
Ross S, 1976. The arbitrage theory of capital asset pricing. Journal of Economic Theory 17, pages 254–286.
Rubinstein M, 1973. The fundamental theorem of parameter-preference security valuation. Journal of Financial and Quantitative Analysis 8, pages 61–69.
Sharpe W, 1964. Capital asset prices: a theory of market equilibrium under conditions of risk. Journal of Finance 19, pages 425–442.
Taylor S, 1994. Modeling stochastic volatility. Mathematical Finance 4, pages 183–204.
Chapitre 13

Portfolio management under economic capital constraints: the example of the Value-at-Risk and the Expected Shortfall

Using a family of modified Weibull distributions, encompassing both super- and sub-exponential distributions (whose interest was stressed in Chapter 3), we parameterize the distributions of financial asset returns. The Gaussian copula hypothesis is used to model the dependence between the assets. This allows us to obtain analytical expressions for the tails of the distribution P(S) of the return S of a portfolio composed of such assets. We show that the tails of P(S) asymptotically remain modified Weibull distributions, with a scale factor χ that is a function of the asset weights in the portfolio and whose expression differs according to the super- or sub-exponential behaviour of the assets. We then treat in detail the problem of risk minimization for such portfolios, the risk measures considered being the Value-at-Risk and the Expected Shortfall, which we show to be asymptotically equivalent within the adopted representation.
VaR-Efficient Portfolios for a Class of Super- and Sub-Exponentially Decaying Assets Return Distributions

Y. Malevergne¹,² and D. Sornette¹,³

¹ Laboratoire de Physique de la Matière Condensée, CNRS UMR 6622, Université de Nice-Sophia Antipolis, 06108 Nice Cedex 2, France
² Institut de Science Financière et d'Assurances, Université Lyon I, 43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex
³ Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, California 90095, USA
email:
[email protected] and
[email protected] fax: (33) 4 92 07 67 54
Abstract

Using a family of modified Weibull distributions, encompassing both sub-exponentials and super-exponentials, to parameterize the marginal distributions of asset returns, together with their multivariate generalizations with Gaussian copulas, we offer exact formulas for the tails of the distribution P(S) of the returns S of a portfolio of arbitrary composition of these assets. We find that the tail of P(S) is also asymptotically a modified Weibull distribution, with a characteristic scale χ that is a function of the asset weights, with different functional forms depending on the super- or sub-exponential behavior of the marginals and on the strength of the dependence between the assets. We then treat in detail the problem of risk minimization using the Value-at-Risk and the Expected-Shortfall, which are shown to be (asymptotically) equivalent in this framework.
Introduction

In recent years, the Value-at-Risk has become one of the most popular risk assessment tools (Duffie and Pan 1997, Jorion 1997). The infatuation with this particular risk measure probably comes from a variety of factors, the most prominent ones being its conceptual simplicity and relevance in addressing the ubiquitous large risks often inadequately accounted for by the standard volatility, and its prominent role in the recommendations of the international banking authorities (Basle Committee on Banking Supervision 1996, 2001). Moreover, down-side risk measures such as the Value-at-Risk seem more in accordance with the observed behavior of economic agents. For instance, according to prospect theory (Kahneman and Tversky 1979), the perception of downward market movements is not the same as that of upward movements. This may be reflected in the so-called leverage effect, first discussed by Black (1976), who observed that the volatility of a stock tends to increase when its price drops (see (Fouque et al. 2000, Campbell, Lo and McKinley 1997, Bekaert and Wu 2000, Bouchaud et al. 2001) for reviews and recent works). Thus, it should be more natural to consider down-side risk measures like the VaR than the variance traditionally used in portfolio management (Markowitz 1959), which does not differentiate between positive and negative changes in future wealth.
However, the choice of the Value-at-Risk has recently been criticized (Szergö 1999, Danielsson et al. 2001) due, among other reasons, to its lack of coherence in the sense of Artzner et al. (1999). This deficiency leads to several theoretical and practical problems. Indeed, outside the class of elliptical distributions, the VaR is not sub-additive (Embrechts et al. 2002a), and it may lead to inefficient risk diversification policies and to severe problems in the practical implementation of portfolio optimization algorithms (see (Chabaane et al. 2002) for a discussion). Alternatives have been proposed in terms of the Conditional-VaR or Expected-Shortfall (Artzner et al. 1999, Acerbi and Tasche 2002, for instance), which enjoy the property of sub-additivity. This ensures that they yield coherent portfolio allocations, which can be obtained by the simple linear optimization algorithm proposed by Rockafellar and Uryasev (2000). From a practical standpoint, the estimation of the VaR of a portfolio is a strenuous task, requiring large computational time and sometimes leading to disappointing results lacking accuracy and stability. As a consequence, many approximation methods have been proposed (Tasche and Tibiletti 2001, Embrechts et al. 2002b, for instance). Empirical models constitute another widely used approach, since they provide a good trade-off between speed and accuracy. From a general point of view, the parametric determination of the risks and returns associated with a given portfolio constituted of N assets is completely embedded in the knowledge of their multivariate distribution of returns. Indeed, the dependence between random variables is completely described by their joint distribution. This remark entails the two major problems of portfolio theory: 1) the determination of the multivariate distribution function of asset returns; 2) the derivation from it of a useful measure of portfolio risks, with the goal of analyzing and optimizing portfolios.
These objectives can be easily reached if one can derive an analytical expression of the portfolio returns distribution from the multivariate distribution of asset returns. In the standard Gaussian framework, the multivariate distribution takes the form of an exponential of minus a quadratic form X′Ω⁻¹X, where X is the column vector of asset returns and Ω is their covariance matrix. The beauty and simplicity of the Gaussian case is that the essentially impossible task of determining a large multidimensional function is collapsed onto the very much simpler one of calculating the N(N + 1)/2 elements of the symmetric covariance matrix. And, by the stability of the Gaussian distribution, the risk is then uniquely and completely embodied by the variance of the portfolio return, which is easily determined from the covariance matrix. This is the basis of Markowitz's (1959) portfolio theory and of the CAPM (Sharpe 1964, Lintner 1965, Mossin 1966). The same phenomenon occurs in the stable Paretian portfolio analysis derived by Fama (1965) and generalized to separate positive and negative power law tails (Bouchaud et al. 1998). The stability of the distribution of returns is essential to bypass the difficult problem of determining the decision rules (utility function) of the economic agents, since all the risk measures are equivalent to a single parameter (the variance in the case of a Gaussian universe). However, it is well known that the empirical distributions of returns are neither Gaussian nor Lévy stable (Lux 1996, Gopikrishnan et al. 1998, Gouriéroux and Jasiak 1998) and that the dependences between assets are only imperfectly accounted for by the covariance matrix (Litterman and Winkelmann 1998). It is thus desirable to find alternative parameterizations of multivariate distributions of returns which provide reasonably good approximations of the asset returns distribution and which enjoy asymptotic stability properties in the tails, so as to be relevant for the VaR.
To this aim, section 1 presents a specific parameterization of the marginal distributions in terms of so-called modified Weibull distributions introduced by Sornette et al. (2000b), which are essentially exponentials of minus a power law. This family of distributions contains both sub-exponentials and super-exponentials, including the Gaussian law as a special case. It is shown that this parameterization is relevant for modeling the distribution of asset returns in both an unconditional and a conditional framework. The dependence structure between the assets is described by a Gaussian copula, which allows us to describe several degrees of
dependence: from independence to comonotonicity. The relevance of the Gaussian copula has been brought to light by several recent studies (Sornette et al. 2000a, Sornette et al. 2000b, Malevergne and Sornette 2001, Malevergne and Sornette 2002c). In section 2, we use the multivariate construction based on (i) the modified Weibull marginal distributions and (ii) the Gaussian copula to derive the asymptotic analytical form of the tail of the distribution of returns of a portfolio composed of an arbitrary combination of these assets. In the case where individual asset returns have modified-Weibull distributions, we show that the tail of the distribution of portfolio returns S is asymptotically of the same form, but with a characteristic scale χ that is a function of the asset weights, taking different functional forms depending on the super- or sub-exponential behavior of the marginals and on the strength of the dependence between the assets. Thus, this particular class of modified-Weibull distributions enjoys (asymptotically) the same stability properties as the Gaussian or Lévy distributions. The dependence properties are shown to be embodied in the N(N + 1)/2 elements of a non-linear covariance matrix, and the individual risk of each asset is quantified by the sub- or super-exponential behavior of the marginals. Section 3 then uses this non-Gaussian nonlinear dependence framework to estimate the Value-at-Risk (VaR) and the Expected-Shortfall. As in the Gaussian framework, the VaR and the Expected-Shortfall are (asymptotically) controlled only by the non-linear covariance matrix, leading to their equivalence. More generally, any risk measure based on the (sufficiently far) tail of the distribution of the portfolio returns is equivalent, since it can be expressed as a function of the non-linear covariance matrix and the weights of the assets only.
Section 4 uses this set of results to offer an approach to portfolio optimization based on the asymptotic form of the tail of the distribution of portfolio returns. When possible, we give analytical formulas for the explicit composition of the optimal portfolio, or suggest the use of reliable algorithms when a numerical calculation is needed. Section 5 concludes.

Before proceeding with the presentation of our results, we set the notation for the basic problem addressed in this paper, namely the study of the distribution of the sum of weighted random variables with given marginal distributions and dependence. Consider a portfolio with n_i shares of asset i of price p_i(0) at time t = 0, whose initial wealth is

W(0) = Σ_{i=1}^{N} n_i p_i(0).   (1)

A time τ later, the wealth has become W(τ) = Σ_{i=1}^{N} n_i p_i(τ) and the wealth variation is

δ_τW ≡ W(τ) − W(0) = Σ_{i=1}^{N} n_i p_i(0) [p_i(τ) − p_i(0)] / p_i(0) = W(0) Σ_{i=1}^{N} w_i x_i(t, τ),   (2)

where

w_i = n_i p_i(0) / Σ_{j=1}^{N} n_j p_j(0)   (3)

is the fraction of capital invested in the i-th asset at time 0, and the return x_i(t, τ) between time t − τ and t of asset i is defined as:

x_i(t, τ) = [p_i(t) − p_i(t − τ)] / p_i(t − τ).   (4)

Using the definition (4), we can write the return S_τ of the portfolio over a time interval τ as the weighted sum of the returns x_i(τ) of the assets i = 1, ..., N over the same time interval:

S_τ = δ_τW / W(0) = Σ_{i=1}^{N} w_i x_i(τ).   (5)
In the sequel, we shall thus consider the asset returns X_i as the fundamental variables and study their aggregation properties, namely how the distribution of the portfolio return, equal to their weighted sum, derives from their multivariate distribution. We shall consider a single time scale τ, which can be chosen arbitrarily, say equal to one day. We shall thus drop the dependence on τ, understanding implicitly that all our results hold for returns estimated over the time step τ.
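As a sanity check of this bookkeeping, equations (1)–(5) translate directly into code. This is a minimal sketch of ours, with function names of our choosing:

```python
def portfolio_weights(shares, prices0):
    """Fractions of capital w_i = n_i p_i(0) / sum_j n_j p_j(0), equation (3)."""
    total = sum(n * p for n, p in zip(shares, prices0))  # W(0), equation (1)
    return [n * p / total for n, p in zip(shares, prices0)]

def portfolio_return(weights, asset_returns):
    """Portfolio return S = sum_i w_i x_i, equation (5)."""
    return sum(w * x for w, x in zip(weights, asset_returns))
```

For example, one share at 100 and three shares at 100 give weights (0.25, 0.75), and asset returns (4%, 0%) then yield a portfolio return of 1%.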
1 Definitions and important concepts

1.1 The modified Weibull distributions
We will consider a class of distributions with fat tails but decaying faster than any power law. Such behavior of asset return distributions has been suggested to be relevant by several empirical works (Mantegna and Stanley 1995, Gouriéroux and Jasiak 1998, Malevergne et al. 2002) and has also been asserted to provide a convenient and flexible parameterization of many phenomena found in nature and in the social sciences (Laherrère and Sornette 1998). In all the following, we will use the parameterization introduced by Sornette et al. (2000b) and define the modified-Weibull distributions:

Definition 1 (Modified Weibull distribution). A random variable X will be said to follow a modified Weibull distribution with exponent c and scale parameter χ, denoted in the sequel X ~ W(c, χ), if and only if the random variable

Y = sgn(X) √2 (|X|/χ)^{c/2}   (6)

follows a Normal distribution.

These so-called modified-Weibull distributions can be seen to be general forms of the extreme tails of products of random variables (Frisch and Sornette 1997), and using the theorem of change of variable, we can assert that the density of such distributions is

p(x) = (1/(2√π)) (c/χ^{c/2}) |x|^{c/2 − 1} e^{−(|x|/χ)^c},   (7)

where c and χ are the two key parameters. These expressions are close to the Weibull distribution, with the addition of a power-law prefactor to the exponential such that the Gaussian law is retrieved for c = 2. Following Sornette et al. (2000b), Sornette et al. (2000a) and Andersen and Sornette (2001), we call (7) the modified Weibull distribution. For c < 1, the pdf is a stretched exponential, which belongs to the class of sub-exponentials. The exponent c determines the shape of the distribution, fatter than an exponential if c < 1. The parameter χ controls the scale or characteristic width of the distribution; it plays a role analogous to the standard deviation of the Gaussian law. The interest of this family of distributions for financial purposes has also recently been underlined by Brummelhuis and Guégan (2000) and Brummelhuis et al. (2002). Indeed these authors have shown that
given a series of returns {r_t} following a GARCH(1,1) process, the large deviations of the returns r_{t+k} and of the aggregated returns r_t + · · · + r_{t+k}, conditional on the return at time t, are distributed according to a modified Weibull distribution, where the exponent c is related to the number of steps forward k by the formula c = 2/k. A more general parameterization, taking into account a possible asymmetry between negative and positive values (thus leading to a possible non-zero mean), is

$$ p(x) = \frac{1}{2\sqrt{\pi}}\, \frac{c_+}{\chi_+^{c_+/2}}\, |x|^{\frac{c_+}{2} - 1}\, e^{-\left( \frac{|x|}{\chi_+} \right)^{c_+}} \quad \text{if } x \ge 0, \qquad (8) $$

$$ p(x) = \frac{1}{2\sqrt{\pi}}\, \frac{c_-}{\chi_-^{c_-/2}}\, |x|^{\frac{c_-}{2} - 1}\, e^{-\left( \frac{|x|}{\chi_-} \right)^{c_-}} \quad \text{if } x < 0. \qquad (9) $$
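As a numerical aside (this sketch is ours, not part of the original text): definition 1 gives a direct way to simulate W(c, χ) variables by inverting the Gaussianizing map (6), and the implied CDF provides a quick consistency check of the parameterization. The function names below are our own.

```python
import numpy as np
from scipy.stats import norm

def sample_modified_weibull(c, chi, size, rng):
    """Draw X ~ W(c, chi) by inverting Y = sgn(X) sqrt(2) (|X|/chi)^(c/2), Y ~ N(0,1)."""
    y = rng.standard_normal(size)
    return np.sign(y) * chi * (np.abs(y) / np.sqrt(2)) ** (2.0 / c)

def cdf_modified_weibull(x, c, chi):
    """CDF implied by definition 1: P(X <= x) = Phi(sgn(x) sqrt(2) (|x|/chi)^(c/2))."""
    return norm.cdf(np.sign(x) * np.sqrt(2) * (np.abs(x) / chi) ** (c / 2.0))

rng = np.random.default_rng(0)
c, chi = 1.5, 0.02
x = sample_modified_weibull(c, chi, 200_000, rng)

# The empirical CDF must match the analytic one, and c = 2 with chi = sqrt(2)*sigma
# recovers the N(0, sigma^2) law exactly (the Gaussian limit mentioned in the text).
emp = np.mean(x <= -0.03)
ana = cdf_modified_weibull(-0.03, c, chi)
g = sample_modified_weibull(2.0, np.sqrt(2.0), 200_000, rng)
print(round(float(emp - ana), 3), round(float(np.std(g)), 2))
```

For c = 2 the transform reduces to the identity on the underlying Gaussian, which is why the sample standard deviation of `g` comes out near 1.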
In what follows, we will assume that the marginal probability distributions of returns follow modified Weibull distributions. Figure 1 shows the (negative) "Gaussianized" returns Y defined in (6) of the Standard and Poor's 500 index versus the raw returns X over the time interval from January 03, 1995 to December 29, 2000. In such a representation, the modified Weibull distributions appear, by definition 1, as a power law of exponent c/2. The double logarithmic scale of figure 1 clearly shows a straight line over an extended range of data, qualifying a power-law relationship. An accurate determination of the parameters (χ, c) can be performed by maximum likelihood estimation (Sornette 2000, pp 160-162). Note, however, that in the tail the six most extreme points deviate significantly from the modified Weibull description. Such an anomalous behavior of the most extreme returns can probably be associated with the notion of "outliers" introduced by Johansen and Sornette (1998, 2002) and related to behavioral and crowd phenomena during turbulent market phases. The modified Weibull distributions defined here are of interest for financial purposes, and specifically for portfolio and risk management, since they offer a flexible parametric representation of asset return distributions in either a conditional or an unconditional framework, depending on the standpoint preferred by the manager. The rest of the paper uses this family of distributions.
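The maximum likelihood estimation of (c, χ) mentioned above can be sketched as follows, here fitted on synthetic data; the helper names are ours, and a real application would replace the simulated sample with observed returns.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, x):
    """Negative log-likelihood of the symmetric modified Weibull density (7)."""
    c, chi = params
    if c <= 0 or chi <= 0:
        return np.inf
    ax = np.abs(x)
    return -np.sum(np.log(c) - np.log(2 * np.sqrt(np.pi))
                   - (c / 2) * np.log(chi)
                   + (c / 2 - 1) * np.log(ax)
                   - (ax / chi) ** c)

rng = np.random.default_rng(1)
c_true, chi_true = 0.7, 0.015   # sub-exponential regime, as often reported for returns
y = rng.standard_normal(5000)
x = np.sign(y) * chi_true * (np.abs(y) / np.sqrt(2)) ** (2 / c_true)

fit = minimize(neg_log_likelihood, x0=[1.0, 0.01], args=(x,), method="Nelder-Mead")
c_hat, chi_hat = fit.x
print(round(c_hat, 2), round(chi_hat, 3))
```

With a few thousand observations the estimates land close to the true (0.7, 0.015); the Nelder-Mead starting point is arbitrary.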
1.2
Tail equivalence for distribution functions
An interesting feature of the modified Weibull distributions, as we will see in the next section, is that they enjoy the property of asymptotic stability: in the regime of large deviations, a sum of independent and identically distributed modified Weibull variables follows the same modified Weibull distribution, up to a rescaling.

DEFINITION 2 (TAIL EQUIVALENCE)
Let X and Y be two random variables with distribution functions F and G respectively. X and Y are said to be equivalent in the upper tail if and only if there exists λ₊ ∈ (0, ∞) such that

$$ \lim_{x \to +\infty} \frac{1 - F(x)}{1 - G(x)} = \lambda_+ . \qquad (10) $$

Similarly, X and Y are said to be equivalent in the lower tail if and only if there exists λ₋ ∈ (0, ∞) such that

$$ \lim_{x \to -\infty} \frac{F(x)}{G(x)} = \lambda_- . \qquad (11) $$

□
13. Gestion de portefeuille sous contraintes de capital e´ conomique
Applying l'Hospital's rule, this immediately gives the following corollary:

COROLLARY 1
Let X and Y be two random variables with density functions f and g respectively. X and Y are equivalent in the upper (lower) tail if and only if

$$ \lim_{x \to \pm\infty} \frac{f(x)}{g(x)} = \lambda_\pm , \qquad \lambda_\pm \in (0, \infty). \qquad (12) $$

□
1.3
The Gaussian copula
We recall only the basic properties of copulas and refer the interested reader to Nelsen (1998), for instance, for more information. Let us first give the definition of a copula of n random variables.

DEFINITION 3 (COPULA)
A function C : [0, 1]^n → [0, 1] is an n-copula if it enjoys the following properties:
• ∀u ∈ [0, 1], C(1, · · · , 1, u, 1, · · · , 1) = u ,
• ∀u_i ∈ [0, 1], C(u_1, · · · , u_n) = 0 if at least one of the u_i equals zero ,
• C is grounded and n-increasing, i.e., the C-volume of every box whose vertices lie in [0, 1]^n is positive.
□

The usefulness of copulas for representing multivariate distributions with arbitrary marginals comes from the following result.

THEOREM 1 (SKLAR'S THEOREM)
Given an n-dimensional distribution function F with continuous marginal distributions F_1, · · · , F_n, there exists a unique n-copula C : [0, 1]^n → [0, 1] such that

$$ F(x_1, · · · , x_n) = C(F_1(x_1), · · · , F_n(x_n)) . \qquad (13) $$

□

This theorem provides both a parameterization of multivariate distributions and a construction scheme for copulas. Indeed, given a multivariate distribution F with margins F_1, · · · , F_n, the function

$$ C(u_1, · · · , u_n) = F\left( F_1^{-1}(u_1), · · · , F_n^{-1}(u_n) \right) \qquad (14) $$

is automatically an n-copula. Applying this theorem to the multivariate Gaussian distribution, we can derive the so-called Gaussian copula.

DEFINITION 4 (GAUSSIAN COPULA)
Let Φ denote the standard Normal distribution and Φ_{V,n} the n-dimensional Gaussian distribution with correlation matrix V. Then, the Gaussian n-copula with correlation matrix V is

$$ C_V(u_1, · · · , u_n) = \Phi_{V,n}\left( \Phi^{-1}(u_1), · · · , \Phi^{-1}(u_n) \right) , \qquad (15) $$

whose density

$$ c_V(u_1, · · · , u_n) = \frac{\partial^n C_V(u_1, · · · , u_n)}{\partial u_1 \cdots \partial u_n} \qquad (16) $$

reads

$$ c_V(u_1, · · · , u_n) = \frac{1}{\sqrt{\det V}} \exp\left( -\frac{1}{2}\, y(u)^t \left( V^{-1} - \mathrm{Id} \right) y(u) \right) \qquad (17) $$

with y_k(u) = Φ^{-1}(u_k). Note that theorem 1 and equation (14) ensure that C_V(u_1, · · · , u_n) in equation (15) is a copula. □
It can be shown that the Gaussian copula naturally arises when one tries to determine the dependence between random variables using the principle of entropy maximization (Rao 1973, Sornette et al. 2000b, for instance). Its pertinence and limitations for modeling the dependence between asset returns have been tested by Malevergne and Sornette (2001), who show that in most cases this description of the dependence can be considered satisfactory, especially for stocks, provided that one does not consider too extreme realizations (Malevergne and Sornette 2002a, Malevergne and Sornette 2002b, Mashal and Zeevi 2002).
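Combining definition 4 with the Gaussianizing map (6) yields a simple simulation recipe for assets with modified Weibull margins coupled by a Gaussian copula: draw a correlated Gaussian vector, then transform each coordinate back through the inverse of (6). A minimal sketch (function and variable names are ours):

```python
import numpy as np

def sample_gaussian_copula_weibull(corr, c, chi, size, rng):
    """Correlated Gaussian vector -> modified Weibull margins via the inverse of (6).

    corr : (n, n) correlation matrix V of the Gaussian copula
    c, chi : arrays of marginal exponents and scale parameters
    """
    L = np.linalg.cholesky(corr)
    y = rng.standard_normal((size, len(c))) @ L.T       # y ~ N(0, V)
    return np.sign(y) * chi * (np.abs(y) / np.sqrt(2)) ** (2.0 / c)

rng = np.random.default_rng(2)
V = np.array([[1.0, 0.6], [0.6, 1.0]])
c = np.array([1.5, 1.5])
chi = np.array([0.02, 0.03])
x = sample_gaussian_copula_weibull(V, c, chi, 100_000, rng)

# Gaussianizing the simulated returns must recover the copula correlation matrix V.
y = np.sign(x) * np.sqrt(2) * (np.abs(x) / chi) ** (c / 2)
print(np.round(np.corrcoef(y.T)[0, 1], 2))
```

Since the marginal transform is strictly increasing, the Gaussianized returns are exactly the underlying correlated Gaussian vector, so the empirical correlation comes back near the input 0.6.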
2 Portfolio wealth distribution for several dependence structures

2.1 Portfolio wealth distribution for independent assets

Let us first consider the case of a portfolio made of independent assets. This limiting (and unrealistic) case is a mathematical idealization which provides a first natural benchmark for the class of portfolio return distributions to be expected. Moreover, it is generally the only case for which the calculations are analytically tractable. For such independent assets distributed with modified Weibull distributions, the following results prove the asymptotic stability of this set of distributions:

THEOREM 2 (TAIL EQUIVALENCE FOR I.I.D. MODIFIED WEIBULL RANDOM VARIABLES)
Let X_1, X_2, · · · , X_N be N independent and identically W(c, χ)-distributed random variables. Then, the variable

$$ S_N = X_1 + X_2 + \cdots + X_N \qquad (18) $$

is equivalent in the lower and upper tail to Z ∼ W(c, χ̂), with

$$ \hat{\chi} = N^{\frac{c-1}{c}}\, \chi, \qquad c > 1, \qquad (19) $$
$$ \hat{\chi} = \chi, \qquad c \le 1. \qquad (20) $$
□

This theorem is a direct consequence of the theorem stated below and is based on the result given by Frisch and Sornette (1997) for c > 1 and on general properties of sub-exponential distributions when c ≤ 1.

THEOREM 3 (TAIL EQUIVALENCE FOR WEIGHTED SUMS OF INDEPENDENT VARIABLES)
Let X_1, X_2, · · · , X_N be N independent and identically W(c, χ)-distributed random variables. Let w_1, w_2, · · · , w_N be N non-random real coefficients. Then, the variable

$$ S_N = w_1 X_1 + w_2 X_2 + \cdots + w_N X_N \qquad (21) $$

is equivalent in the upper and the lower tail to Z ∼ W(c, χ̂) with

$$ \hat{\chi} = \left( \sum_{i=1}^{N} |w_i|^{\frac{c}{c-1}} \right)^{\frac{c-1}{c}} \cdot \chi, \qquad c > 1, \qquad (22) $$
$$ \hat{\chi} = \max_i \{ |w_1|, |w_2|, · · · , |w_N| \} \cdot \chi, \qquad c \le 1. \qquad (23) $$

□
The proof of this theorem is given in appendix A.

COROLLARY 2
Let X_1, X_2, · · · , X_N be N independent random variables such that X_i ∼ W(c, χ_i). Let w_1, w_2, · · · , w_N be N non-random real coefficients. Then, the variable

$$ S_N = w_1 X_1 + w_2 X_2 + \cdots + w_N X_N \qquad (24) $$

is equivalent in the upper and the lower tail to Z ∼ W(c, χ̂) with

$$ \hat{\chi} = \left( \sum_{i=1}^{N} |w_i \chi_i|^{\frac{c}{c-1}} \right)^{\frac{c-1}{c}}, \qquad c > 1, \qquad (25) $$
$$ \hat{\chi} = \max_i \{ |w_1 \chi_1|, |w_2 \chi_2|, · · · , |w_N \chi_N| \}, \qquad c \le 1. \qquad (26) $$

□

The proof of the corollary is a straightforward application of theorem 3. Indeed, let Y_1, Y_2, · · · , Y_N be N independent and identically W(c, 1)-distributed random variables. Then,

$$ (X_1, X_2, · · · , X_N) \stackrel{d}{=} (\chi_1 Y_1, \chi_2 Y_2, · · · , \chi_N Y_N), \qquad (27) $$

which yields

$$ S_N \stackrel{d}{=} w_1 \chi_1\, Y_1 + w_2 \chi_2\, Y_2 + \cdots + w_N \chi_N\, Y_N . \qquad (28) $$

Thus, applying theorem 3 to the i.i.d. variables Y_i's with weights w_i χ_i leads to corollary 2.
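Corollary 2 can be illustrated numerically (our own sketch, not part of the original derivation): compute the scale factor (25) for a weighted sum of independent modified Weibull variables, check the theorem 2 limiting case, and compare a far-tail quantile of a simulated sum with that of W(c, χ̂). The equivalence is only asymptotic, so the Monte Carlo agreement at a finite quantile is approximate.

```python
import numpy as np

def mw_sample(c, chi, size, rng):
    # X ~ W(c, chi): invert Y = sgn(X) sqrt(2) (|X|/chi)^(c/2), with Y ~ N(0, 1)
    y = rng.standard_normal(size)
    return np.sign(y) * chi * (np.abs(y) / np.sqrt(2)) ** (2.0 / c)

def chi_hat_independent(w, chi, c):
    # Equation (25): scale factor of a weighted sum of independent assets, c > 1
    return np.sum(np.abs(np.asarray(w) * np.asarray(chi)) ** (c / (c - 1))) ** ((c - 1) / c)

# For unit weights and a common scale, (25) reduces to theorem 2: N^((c-1)/c) * chi
uniform = chi_hat_independent(np.ones(5), np.full(5, 0.02), 1.5)

# Rough Monte Carlo check of the tail equivalence itself
rng = np.random.default_rng(3)
c, w, chi = 1.5, np.array([0.5, 0.3, 0.2]), np.array([0.02, 0.03, 0.025])
s = sum(wi * mw_sample(c, ci, 1_000_000, rng) for wi, ci in zip(w, chi))
z = mw_sample(c, chi_hat_independent(w, chi, c), 1_000_000, rng)
ratio = np.quantile(s, 0.9999) / np.quantile(z, 0.9999)
print(round(float(uniform), 4), round(float(ratio), 2))
```

The quantile ratio hovers near 1 but not exactly at it: the tail weight λ₋ of table 1 and finite-sample noise both contribute subleading corrections.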
2.2
Portfolio wealth distribution for comonotonic assets
The case of comonotonic assets is of interest as the limiting case of the strongest possible dependence between random variables.

DEFINITION 5 (COMONOTONICITY)
The variables X_1, X_2, · · · , X_N are comonotonic if and only if there exists a random variable U and non-decreasing functions f_1, f_2, · · · , f_N such that

$$ (X_1, X_2, · · · , X_N) \stackrel{d}{=} (f_1(U), f_2(U), · · · , f_N(U)). \qquad (29) $$

□

In terms of copulas, comonotonicity can be expressed by the following form of the copula:

$$ C(u_1, u_2, · · · , u_N) = \min(u_1, u_2, · · · , u_N) . \qquad (30) $$
This expression is known as the Fréchet-Hoeffding upper bound for copulas (Nelsen 1998, for instance). It would be appealing to think that estimating the Value-at-Risk under the comonotonicity assumption provides an upper bound for the Value-at-Risk. This turns out to be wrong in the general case, due, as we shall see in the sequel, to the lack of coherence of the Value-at-Risk, in the sense of Artzner et al. (1999). Notwithstanding, an upper and a lower bound can always be derived for the Value-at-Risk (Embrechts et al. 2002b). But in the present situation, where we are only interested in the class of modified Weibull distributions with a Gaussian copula, the VaR derived under the comonotonicity assumption will actually represent the upper bound (at least for the VaR calculated at sufficiently high confidence levels).
THEOREM 4 (TAIL EQUIVALENCE FOR A SUM OF COMONOTONIC RANDOM VARIABLES)
Let X_1, X_2, · · · , X_N be N comonotonic random variables such that X_i ∼ W(c, χ_i). Let w_1, w_2, · · · , w_N be N non-random real coefficients. Then, the variable

$$ S_N = w_1 X_1 + w_2 X_2 + \cdots + w_N X_N \qquad (31) $$

is equivalent in the upper and the lower tail to Z ∼ W(c, χ̂) with

$$ \hat{\chi} = \sum_i w_i \chi_i . \qquad (32) $$

□

The proof is obvious since, under the assumption of comonotonicity, the portfolio wealth S is given by

$$ S = \sum_i w_i X_i \stackrel{d}{=} \sum_{i=1}^{N} w_i\, f_i(U), \qquad (33) $$

and for modified Weibull distributions, in the symmetric case, with U a Gaussian random variable, we have

$$ f_i(\cdot) = \mathrm{sgn}(\cdot)\, \chi_i \left( \frac{|\cdot|}{\sqrt{2}} \right)^{2/c_i} . \qquad (34) $$

If, in addition, we assume that all assets have the same exponent c_i = c, it is clear that S ∼ W(c, χ̂) with

$$ \hat{\chi} = \sum_i w_i \chi_i . \qquad (35) $$

It is important to note that this relation is exact, and not asymptotic as in the case of independent variables. When the exponents c_i differ from one asset to another, a similar result holds, since we can still write the inverse cumulative function of S as

$$ F_S^{-1}(p) = \sum_{i=1}^{N} w_i\, F_{X_i}^{-1}(p), \qquad p \in (0, 1), \qquad (36) $$

which is the property of additive comonotonicity of the Value-at-Risk¹. Let us then sort the X_i's such that c_1 = c_2 = · · · = c_p < c_{p+1} ≤ · · · ≤ c_N. We immediately obtain that S is equivalent in the tail to Z ∼ W(c_1, χ̂), where

$$ \hat{\chi} = \sum_{i=1}^{p} w_i \chi_i . \qquad (37) $$

In such a case, only the assets with the fattest tails contribute to the behavior of the sum in the large deviation regime.

¹ This relation shows that, in general, the VaR calculated for comonotonic assets does not provide an upper bound of the VaR, whatever the dependence structure of the portfolio may be. Indeed, in such a case, we have VaR(X_1 + X_2) = VaR(X_1) + VaR(X_2) while, by lack of coherence, we may have VaR(X_1 + X_2) ≥ VaR(X_1) + VaR(X_2) for some dependence structures between X_1 and X_2.
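The additive comonotonicity of quantiles, equation (36), is easy to verify by simulation (our own sketch): build comonotonic assets from a single common Gaussian factor through the non-decreasing maps (34) and compare the quantile of the portfolio with the weighted sum of marginal quantiles.

```python
import numpy as np

def f_i(u, c, chi):
    # Non-decreasing map (34) turning a standard Gaussian U into X ~ W(c, chi)
    return np.sign(u) * chi * (np.abs(u) / np.sqrt(2)) ** (2.0 / c)

rng = np.random.default_rng(4)
u = rng.standard_normal(500_000)          # single common factor: comonotonic assets
w = np.array([0.5, 0.3, 0.2])
c = np.array([1.5, 1.5, 1.5])
chi = np.array([0.02, 0.03, 0.025])

s = sum(wi * f_i(u, ci, xi) for wi, ci, xi in zip(w, c, chi))

# Additive comonotonicity of quantiles, equation (36):
p = 0.01
lhs = np.quantile(s, p)
rhs = sum(wi * np.quantile(f_i(u, ci, xi), p) for wi, ci, xi in zip(w, c, chi))
print(np.isclose(lhs, rhs, rtol=1e-6))  # → True
```

The match is exact (up to floating point), not merely asymptotic, because the portfolio is a monotonic function of the common factor, so all empirical quantiles align on the same order statistics.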
2.3 Portfolio wealth under the Gaussian copula hypothesis
2.3.1 Derivation of the multivariate distribution with a Gaussian copula and modified Weibull margins

An advantage of the class of modified Weibull distributions (7) is that the transformation into a Gaussian, and thus the calculation of the vector y introduced in definition 1, is particularly simple. It takes the form

$$ y_k = \mathrm{sgn}(x_k)\, \sqrt{2}\, \left( \frac{|x_k|}{\chi_k} \right)^{c_k/2} , \qquad (38) $$

where y_k is normally distributed. These variables Y_i then allow us to obtain the covariance matrix V of the Gaussian copula:

$$ V_{ij} = 2 \cdot \mathrm{E}\left[ \mathrm{sgn}(x_i x_j) \left( \frac{|x_i|}{\chi_i} \right)^{c_i/2} \left( \frac{|x_j|}{\chi_j} \right)^{c_j/2} \right] , \qquad (39) $$

which always exists and can be efficiently estimated. The multivariate density P(x) is thus given by

$$ P(x_1, · · · , x_N) = c_V(F_1(x_1), · · · , F_N(x_N)) \prod_{i=1}^{N} p_i(x_i) \qquad (40) $$

$$ = \frac{1}{2^N \pi^{N/2} \sqrt{\det V}} \prod_{i=1}^{N} \frac{c_i\, |x_i|^{c_i/2 - 1}}{\chi_i^{c_i/2}}\, \exp\left( -\sum_{i,j} V_{ij}^{-1}\, \mathrm{sgn}(x_i x_j) \left( \frac{|x_i|}{\chi_i} \right)^{c_i/2} \left( \frac{|x_j|}{\chi_j} \right)^{c_j/2} \right) . \qquad (41) $$

Obviously, similar transforms hold, mutatis mutandis, for the asymmetric case (8, 9).

2.3.2 Asymptotic distribution of a sum of modified Weibull variables with the same exponent c > 1

We now consider a portfolio made of dependent assets with pdf given by equation (41) or its asymmetric generalization. For such distributions of asset returns, we obtain the following result.

THEOREM 5 (TAIL EQUIVALENCE FOR A SUM OF DEPENDENT RANDOM VARIABLES)
Let X_1, X_2, · · · , X_N be N random variables with a dependence structure described by the Gaussian copula with correlation matrix V and such that each X_i ∼ W(c, χ_i). Let w_1, w_2, · · · , w_N be N (positive) non-random real coefficients. Then, the variable

$$ S_N = w_1 X_1 + w_2 X_2 + \cdots + w_N X_N \qquad (42) $$

is equivalent in the upper and the lower tail to Z ∼ W(c, χ̂) with

$$ \hat{\chi} = \left( \sum_i w_i \chi_i \sigma_i \right)^{\frac{c-1}{c}} , \qquad (43) $$

where the σ_i's are the unique (positive) solution of

$$ \sum_i V_{ik}^{-1}\, \sigma_i^{c/2} = w_k \chi_k\, \sigma_k^{1 - c/2} , \qquad \forall k . \qquad (44) $$

□
                        scale factor χ̂                                            tail weight λ₋
Independent assets      ( Σ_{i=1}^{N} |w_i χ_i|^{c/(c−1)} )^{(c−1)/c},  c > 1      [ c / (2(c−1)) ]^{(N−1)/2},  c > 1
                        max_i { |w_i χ_i| },  c ≤ 1                                Card { i : |w_i χ_i| = max_j |w_j χ_j| },  c ≤ 1
Comonotonic assets      Σ_{i=1}^{N} w_i χ_i                                        1
Gaussian copula         ( Σ_i w_i χ_i σ_i )^{(c−1)/c},  c > 1                      see appendix B

Table 1: Summary of the various scale factors obtained for the different distributions of asset returns.

The proof of this theorem follows the same lines as the proof of theorem 3. We thus only provide a heuristic derivation of this result in appendix B. Equation (44) is equivalent to

$$ \sum_k V_{ik}\, w_k \chi_k\, \sigma_k^{1 - c/2} = \sigma_i^{c/2} , \qquad \forall i , \qquad (45) $$
which seems more attractive since it does not require the inversion of the correlation matrix. In the special case where V is the identity matrix, the variables X_i's are independent, so that equation (43) must yield the same result as equation (22). This results from the expression σ_k = (w_k χ_k)^{1/(c−1)}, valid in the independent case. Moreover, in the limit where all entries of V equal one, we retrieve the case of comonotonic assets. Obviously, V⁻¹ does not exist for comonotonic assets and the derivation given in appendix B does not hold, but equation (45) remains well-defined and still has the unique solution σ_k = (Σ_j w_j χ_j)^{1/(c−1)}, which yields the scale factor given in theorem 4.
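Equation (45) invites a simple damped fixed-point iteration for the σ_i's, from which χ̂ follows by (43). A sketch under our own conventions (the paper itself suggests Newton's method; convergence of the plain iteration is only checked empirically here on the two limiting cases above):

```python
import numpy as np

def solve_sigma(V, w, chi, c, n_iter=2000):
    """Iterate sigma_i^(c/2) = [V (w*chi*sigma^(1-c/2))]_i, equation (45)."""
    a = np.asarray(w) * np.asarray(chi)
    sigma = a ** (1.0 / (c - 1.0))          # exact solution when V = Id
    for _ in range(n_iter):
        new = (V @ (a * sigma ** (1.0 - c / 2.0))) ** (2.0 / c)
        sigma = 0.5 * sigma + 0.5 * new     # damped update for stability
    return sigma

def chi_hat(V, w, chi, c):
    """Scale factor (43) of the portfolio under the Gaussian copula."""
    sigma = solve_sigma(V, w, chi, c)
    return float(np.sum(np.asarray(w) * np.asarray(chi) * sigma) ** ((c - 1.0) / c))

w = np.array([0.5, 0.5])
chi = np.array([0.02, 0.03])
c = 1.5

ind = chi_hat(np.eye(2), w, chi, c)                               # independent limit
com = chi_hat(np.array([[1.0, 0.999], [0.999, 1.0]]), w, chi, c)  # near-comonotonic
print(round(ind, 4), round(com, 4))
```

The independent limit reproduces (25) exactly, and the near-comonotonic matrix brings χ̂ close to Σ w_i χ_i = 0.025 from theorem 4, as the text asserts.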
2.4
Summary
In the previous sections, we have shown that the wealth distribution F_S(x) of a portfolio made of assets with modified Weibull distributions with the same exponent c remains equivalent in the tail to a modified Weibull distribution W(c, χ̂). Specifically,

$$ F_S(x) \sim \lambda_-\, F_Z(x) , \qquad (46) $$

when x → −∞, and where Z ∼ W(c, χ̂). Expression (46) defines the proportionality factor, or weight λ₋, of the negative tail of the portfolio wealth distribution F_S(x). Table 1 summarizes the value of the scale parameter χ̂ for the different types of dependence we have studied. In addition, we give the value of the coefficient λ₋, which may also depend on the weights of the assets in the portfolio in the case of dependent assets.
3 Value-at-Risk

3.1 Calculation of the VaR
We consider a portfolio made of N assets, all with the same exponent c and with scale parameters χ_i, i ∈ {1, 2, · · · , N}. The weight of the i-th asset in the portfolio is denoted by w_i. By definition, the Value-at-Risk at the loss probability α, denoted by VaR_α, is given, for a continuous distribution of profits and losses, by

$$ \Pr\{ W(\tau) - W(0) < -\mathrm{VaR}_\alpha \} = \alpha, \qquad (47) $$

which can be rewritten as

$$ \Pr\left\{ S < -\frac{\mathrm{VaR}_\alpha}{W(0)} \right\} = \alpha. \qquad (48) $$

In this expression, we have assumed that all the wealth is invested in risky assets and that the risk-free interest rate equals zero; it is easy to reintroduce it if necessary. It just leads to discounting VaR_α by the discount factor 1/(1 + µ_0), where µ_0 denotes the risk-free interest rate.
Now, using the fact that F_S(x) ∼ λ₋ F_Z(x) when x → −∞, where Z ∼ W(c, χ̂), we have

$$ \frac{1}{\lambda_-} \Pr\left\{ S < -\frac{\mathrm{VaR}_\alpha}{W(0)} \right\} \simeq 1 - \Phi\left( \sqrt{2} \left( \frac{\mathrm{VaR}_\alpha}{W(0)\, \hat{\chi}} \right)^{c/2} \right) \qquad (49) $$

as VaR_α goes to infinity, which allows us to obtain a closed expression for the asymptotic Value-at-Risk with a loss probability α:

$$ \mathrm{VaR}_\alpha \simeq W(0)\, \frac{\hat{\chi}}{2^{1/c}} \left[ \Phi^{-1}\left( 1 - \frac{\alpha}{\lambda_-} \right) \right]^{2/c} \qquad (50) $$
$$ \simeq \xi(\alpha)^{2/c}\, W(0) \cdot \hat{\chi} , \qquad (51) $$

where the function Φ(·) denotes the cumulative Normal distribution function and

$$ \xi(\alpha) \equiv \frac{1}{\sqrt{2}}\, \Phi^{-1}\left( 1 - \frac{\alpha}{\lambda_-} \right) . \qquad (52) $$

In the case where a fraction w_0 of the total wealth is invested in the risk-free asset with interest rate µ_0, the previous equation simply becomes

$$ \mathrm{VaR}_\alpha \simeq \xi(\alpha)^{2/c} (1 - w_0) \cdot W(0) \cdot \hat{\chi} - w_0\, W(0)\, \mu_0 . \qquad (53) $$

Due to the convexity of the scale parameter χ̂, the VaR is itself convex and therefore sub-additive. Thus, for this set of distributions, the VaR becomes coherent when the considered quantiles are sufficiently small. The Expected-Shortfall ES_α, which gives the average loss beyond the VaR at probability level α, is also very easily computable:

$$ \mathrm{ES}_\alpha = \frac{1}{\alpha} \int_0^\alpha \mathrm{VaR}_u\, \mathrm{d}u \qquad (54) $$
$$ = \zeta(\alpha)\, (1 - w_0) \cdot W(0) \cdot \hat{\chi} - w_0\, W(0)\, \mu_0 , \qquad (55) $$

where ζ(α) = (1/α) ∫₀^α ξ(u)^{2/c} du. Thus, the Value-at-Risk, the Expected-Shortfall, and in fact any downside risk measure involving only the far tail of the distribution of returns, are entirely controlled by the scale parameter χ̂. We see that our set of multivariate modified Weibull distributions enjoys, in the tail, exactly the same properties as the Gaussian distributions, for which all the risk measures are controlled by the standard deviation.
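For a single W(c, χ) asset, formulas (51)-(52) with λ₋ = 1 and χ̂ = χ are exact rather than asymptotic, which makes them easy to test against the quantile obtained by inverting the Gaussianizing map (6) directly. A short sketch (our own helper names):

```python
import numpy as np
from scipy.stats import norm

def var_alpha(alpha, c, chi_hat, W0=1.0, lam=1.0):
    """Asymptotic VaR, equations (51)-(52): VaR = xi(alpha)^(2/c) * W0 * chi_hat."""
    xi = norm.ppf(1.0 - alpha / lam) / np.sqrt(2.0)
    return xi ** (2.0 / c) * W0 * chi_hat

# Single-asset check: the alpha-quantile of W(c, chi) follows from inverting (6).
alpha, c, chi = 0.01, 1.2, 0.02
v = var_alpha(alpha, c, chi)
exact = chi * (np.abs(norm.ppf(alpha)) / np.sqrt(2.0)) ** (2.0 / c)
print(np.isclose(v, exact))  # → True
```

For a portfolio one would feed in the χ̂ and λ₋ of table 1 instead, and the formula then holds only in the far tail.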
3.2
Typical recurrence time of large losses
Let us translate these formulas into intuitive form. For this, we define a Value-at-Risk VaR* which is such that its typical frequency is 1/T_0. T_0 is by definition the typical recurrence time of a loss larger than VaR*. In our present example, we take T_0 equal to one year, for instance, i.e., VaR* is the typical annual shock or crash. Expression (49) then allows us to predict the recurrence time T of a loss of amplitude VaR equal to β times this reference value VaR*:

$$ \ln \frac{T}{T_0} \simeq \left( \beta^c - 1 \right) \left( \frac{\mathrm{VaR}^*}{W(0)\, \hat{\chi}} \right)^{c} + O(\ln \beta) . \qquad (56) $$

Figure 2 shows ln(T/T_0) versus β. Observe that T increases all the more slowly with β, the smaller the exponent c is. This quantifies our expectation that large losses occur more frequently for the "wilder" sub-exponential distributions than for super-exponential ones.
4 Optimal portfolios

In this section, we present our results on the problem of efficient portfolio allocation for assets distributed according to modified Weibull distributions with the different dependence structures studied in the previous sections. We focus on the case where all asset modified Weibull distributions have the same exponent c, as it provides the richest and most varied situation. When this is not the case and the assets have different exponents c_i, i = 1, ..., N, the asymptotic tail of the portfolio return distribution is dominated by the asset with the heaviest tail. The largest risks of the portfolio are thus controlled by the single most risky asset, characterized by the smallest exponent c. Such extreme risk cannot be diversified away. In such a case, for a risk-averse investor, the best strategy focused on minimizing the extreme risks consists in holding only the asset with the thinnest tail, i.e., with the largest exponent c.
4.1
Portfolios with minimum risk
Let us first consider the problem of finding the composition of the portfolio with minimum risk, where the risk is measured by the Value-at-Risk. We consider that short sales are not allowed, that the risk-free interest rate equals zero and that all the wealth is invested in stocks. This last condition is indeed the only interesting one, since allowing investment in a risk-free asset would automatically give the trivial solution in which the minimum risk portfolio is completely invested in the risk-free asset. The problem to solve reads:

$$ \mathrm{VaR}^*_\alpha = \min \mathrm{VaR}_\alpha = \xi(\alpha)^{2/c}\, W(0) \cdot \min \hat{\chi} \qquad (57) $$
$$ \sum_{i=1}^{N} w_i = 1 \qquad (58) $$
$$ w_i \ge 0 \quad \forall i. \qquad (59) $$
In some cases (see table 1), the prefactor ξ(α) defined in (52) also depends on the weights w_i through λ₋ defined in (46). But its contribution remains subdominant for the large losses. This allows us to restrict the minimization to χ̂ instead of ξ(α)^{2/c} · χ̂.
4.1.1 Case of independent assets

"Super-exponential" portfolios (c > 1). Consider assets distributed according to modified Weibull distributions with the same exponent c > 1. The Value-at-Risk is given by

$$ \mathrm{VaR}_\alpha = \xi(\alpha)^{2/c}\, W(0) \cdot \left( \sum_{i=1}^{N} |w_i \chi_i|^{\frac{c}{c-1}} \right)^{\frac{c-1}{c}} . \qquad (60) $$
Introducing the Lagrange multiplier λ, the first-order condition yields

$$ \frac{\partial \hat{\chi}}{\partial w_i} = \frac{\lambda}{\xi(\alpha)^{2/c}\, W(0)} \qquad \forall i, \qquad (61) $$

and the composition of the minimal risk portfolio is

$$ w_i^* = \frac{\chi_i^{-c}}{\sum_j \chi_j^{-c}} , \qquad (62) $$

which satisfies the second-order condition, namely the positivity of the Hessian matrix H_{jk} = ∂²χ̂ / ∂w_j ∂w_k evaluated at {w_i*}. The minimal risk portfolio is such that

$$ \mathrm{VaR}^*_\alpha = \frac{\xi(\alpha)^{2/c}\, W(0)}{\left( \sum_i \chi_i^{-c} \right)^{1/c}} , \qquad \mu^* = \frac{\sum_i \chi_i^{-c}\, \mu_i}{\sum_j \chi_j^{-c}} , \qquad (63) $$
where µ_i is the return of asset i and µ* is the return of the minimum risk portfolio.

"Sub-exponential" portfolios (c ≤ 1). Consider assets distributed according to modified Weibull distributions with the same exponent c ≤ 1. The Value-at-Risk is now given by

$$ \mathrm{VaR}_\alpha = \xi(\alpha)^{2/c}\, W(0) \cdot \max\{ |w_1 \chi_1|, · · · , |w_N \chi_N| \} . \qquad (64) $$

Since the weights w_i are positive, the modulus appearing in the argument of the max() function can be removed. It is easy to see that the minimum of VaR_α is obtained when all the w_i χ_i's are equal, provided that the constraint Σ w_i = 1 can be satisfied. Indeed, let us start with the situation where

$$ w_1 \chi_1 = w_2 \chi_2 = \cdots = w_N \chi_N . \qquad (65) $$

Let us decrease the weight w_1. Then w_1 χ_1 decreases with respect to the initial maximum situation (65) but, in order to satisfy the constraint Σ_i w_i = 1, at least one of the other weights w_j, j ≥ 2, has to increase, so that w_j χ_j increases, leading to a maximum of the set of the w_i χ_i's greater than in the initial situation where (65) holds. Therefore,

$$ w_i^* = \frac{A}{\chi_i} , \qquad \forall i, \qquad (66) $$

and the constraint Σ_i w_i = 1 yields

$$ A = \frac{1}{\sum_i \chi_i^{-1}} , \qquad (67) $$

and finally

$$ w_i^* = \frac{\chi_i^{-1}}{\sum_j \chi_j^{-1}} , \qquad \mathrm{VaR}^*_\alpha = \frac{\xi(\alpha)^{2/c}\, W(0)}{\sum_i \chi_i^{-1}} , \qquad \mu^* = \frac{\sum_i \chi_i^{-1}\, \mu_i}{\sum_j \chi_j^{-1}} . \qquad (68) $$

The composition of the optimal portfolio is continuous in c at the value c = 1. This is the consequence of the continuity, as a function of c at c = 1, of the scale factor χ̂ for a sum of independent variables. In this regime c ≤ 1, the Value-at-Risk increases as c decreases only through its dependence on the prefactor ξ(α)^{2/c}, since the scale factor χ̂ remains constant.

4.1.2 Case of comonotonic assets

For comonotonic assets, the Value-at-Risk is

$$ \mathrm{VaR}_\alpha = \xi(\alpha)^{2/c}\, W(0) \cdot \sum_i w_i \chi_i , \qquad (69) $$

which leads to a very simple linear optimization problem. Indeed, denoting χ_1 = min{χ_1, χ_2, · · · , χ_N}, we have

$$ \sum_i w_i \chi_i \ge \chi_1 \sum_i w_i = \chi_1 , \qquad (70) $$

which proves that the composition of the optimal portfolio is w_1* = 1, w_i* = 0 for i ≥ 2, leading to

$$ \mathrm{VaR}^*_\alpha = \xi(\alpha)^{2/c}\, W(0)\, \chi_1 , \qquad \mu^* = \mu_1 . \qquad (71) $$

This result is not surprising since all assets move together. Thus, the portfolio with minimum Value-at-Risk is obtained by holding only the least risky asset, i.e., the one with the smallest scale factor χ_i. In the case where there is a degeneracy of order p in the smallest χ (χ_1 = χ_2 = ... = χ_p = min{χ_1, χ_2, · · · , χ_N}), the optimal choice leads to investing all the wealth in the asset with the largest expected return µ_j, j ∈ {1, · · · , p}. However, in an efficient market with rational agents, such an opportunity should not exist, since the same risk embodied by χ_1 = χ_2 = ... = χ_p should be remunerated by the same return µ_1 = µ_2 = ... = µ_p.

4.1.3 Case of assets with a Gaussian copula

In this situation, we cannot solve the problem analytically. We can only assert that the minimization problem has a unique solution, since the function VaR_α({w_i}) is convex. In order to obtain the composition of the optimal portfolio, we need to perform the following numerical analysis. One first needs to solve the set of equations Σ_i V⁻¹_{ik} σ_i^{c/2} = w_k χ_k σ_k^{1−c/2}, or the equivalent set of equations given by (45), which can be done by Newton's algorithm. Then one has to minimize the quantity Σ w_i χ_i σ_i({w_i}). To this aim, one can use the gradient algorithm, which requires the calculation of the derivatives of the σ_i's with respect to the w_k's. These quantities are easily obtained by solving the linear set of equations

$$ \frac{c}{2} \sum_i V_{ij}^{-1}\, \sigma_i^{\frac{c}{2}-1}\, \frac{\partial \sigma_i}{\partial w_k} + \left( \frac{c}{2} - 1 \right) w_j \chi_j\, \sigma_j^{-\frac{c}{2}}\, \frac{\partial \sigma_j}{\partial w_k} = \chi_j\, \sigma_j^{1-\frac{c}{2}}\, \delta_{jk} . \qquad (72) $$
Then, the analytical solution for independent assets or comonotonic assets can be used to initialize the minimization algorithm with respect to the weights of the assets in the portfolio.
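The initialization strategy just described can be sketched end-to-end (our own code, a rough stand-in for the Newton/gradient machinery of the text): start from the closed-form independent-asset weights (62), then minimize Σ w_i χ_i σ_i numerically under the budget and no-short-sale constraints, with the σ_i's obtained from (45) by a damped fixed-point iteration instead of Newton's algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def solve_sigma(V, w, chi, c, n_iter=1000):
    # Fixed-point iteration on (45): sigma_i^(c/2) = [V (w*chi*sigma^(1-c/2))]_i
    a = w * chi
    sigma = np.maximum(a, 1e-12) ** (1.0 / (c - 1.0))   # exact when V = Id
    for _ in range(n_iter):
        sigma = 0.5 * sigma + 0.5 * (V @ (a * sigma ** (1 - c / 2))) ** (2 / c)
    return sigma

def objective(w, V, chi, c):
    # Proportional to chi_hat^(c/(c-1)); minimizing it minimizes the VaR,
    # since xi(alpha)^(2/c) is a positive prefactor
    return float(np.sum(w * chi * solve_sigma(V, w, chi, c)))

def min_var_weights_independent(chi, c):
    # Closed form (62): w_i* proportional to chi_i^(-c); used as starting point
    raw = chi ** (-c)
    return raw / raw.sum()

chi = np.array([0.02, 0.03, 0.025])
c = 1.5
V = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.0, 0.3],
              [0.2, 0.3, 1.0]])

w0 = min_var_weights_independent(chi, c)
res = minimize(objective, x0=w0, args=(V, chi, c), bounds=[(0, 1)] * 3,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}])
w_opt = res.x
print(np.round(w0, 3), np.round(w_opt, 3))
```

The optimizer can only improve on the analytic starting point, which is what makes (62) a sensible initialization.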
4.2 VaR-efficient portfolios
We are now interested in portfolios with minimum Value-at-Risk but with a given expected return µ = Σ_i w_i µ_i. We will first consider the case where, as previously, all the wealth is invested in risky assets, and we will then discuss the consequences of the introduction of a risk-free asset in the portfolio.

4.2.1 Portfolios without a risk-free asset

When the investors have to select risky assets only, they have to solve the following minimization problem:

$$ \mathrm{VaR}^*_\alpha = \min \mathrm{VaR}_\alpha = \xi(\alpha)^{2/c}\, W(0) \cdot \min \hat{\chi} \qquad (73) $$
$$ \sum_{i=1}^{N} w_i \mu_i = \mu \qquad (74) $$
$$ \sum_{i=1}^{N} w_i = 1 \qquad (75) $$
$$ w_i \ge 0 \quad \forall i. \qquad (76) $$
In contrast with the search for minimum risk portfolios, where analytical results could be derived, here we need to use numerical methods in every situation. In the case of super-exponential portfolios, with or without dependence between assets, the gradient method provides a fast and easy-to-implement algorithm, while for sub-exponential portfolios or portfolios made of comonotonic assets, one has to use the simplex method, since the minimization problem is then linear. Thus, although not as convenient as analytical results, these optimization problems remain easy to manage and fast to compute even for large portfolios.

4.2.2 Portfolios with a risk-free asset

When a risk-free asset is introduced in the portfolio, the expression of the Value-at-Risk is given by equation (53), and the minimization problem becomes

$$ \mathrm{VaR}^*_\alpha = \min\, \xi(\alpha)^{2/c} (1 - w_0) \cdot W(0) \cdot \hat{\chi} - w_0\, W(0)\, \mu_0 \qquad (77) $$
$$ \sum_{i=1}^{N} w_i \mu_i = \mu \qquad (78) $$
$$ \sum_{i=0}^{N} w_i = 1 \qquad (79) $$
$$ w_i \ge 0 \quad \forall i. \qquad (80) $$
When the risk-free interest rate µ_0 is non-zero, we have to use the same numerical methods as above to solve the problem. However, if we assume that µ_0 = 0, the problem becomes analytically tractable. Its Lagrangian reads

$$ \mathcal{L} = \xi(\alpha)^{2/c} (1 - w_0) \cdot W(0) \cdot \hat{\chi} - \lambda_1 \left( \sum_{i \neq 0} w_i \mu_i - \mu \right) - \lambda_2 \left( \sum_{i=0}^{N} w_i - 1 \right) \qquad (81) $$
$$ = \xi(\alpha)^{2/c} \left( \sum_{j \neq 0} w_j \right) \cdot W(0) \cdot \hat{\chi} - \lambda_1 \left( \sum_{i \neq 0} w_i \mu_i - \mu \right) , \qquad (82) $$

which allows us to show that the weights of the optimal portfolio are

$$ w_i^* = (1 - w_0) \cdot \frac{\hat{w}_i}{\sum_{j=1}^{N} \hat{w}_j} \qquad \text{and} \qquad \mathrm{VaR}^*_\alpha = \frac{\xi(\alpha)^{2/c}\, (1 - w_0) \cdot W(0)}{2}\, \mu , \qquad (83) $$

where the ŵ_i's are solution of the set of equations

$$ \hat{\chi} + \left( \sum_{i=1}^{N} \hat{w}_i \right) \frac{\partial \hat{\chi}}{\partial \hat{w}_i} = \mu_i . \qquad (84) $$
Expression (83) shows that the efficient frontier is simply a straight line and that any efficient portfolio is the sum of two portfolios: a "riskless portfolio" in which a fraction w_0 of the initial wealth is invested, and a portfolio with the remaining fraction (1 − w_0) of the initial wealth invested in risky assets. This provides another example of the two-fund separation theorem. A CAPM then holds, since equation (84), together with the market equilibrium assumption, yields the proportionality between any stock return and the market return. However, these three properties are rigorously established only for a zero risk-free interest rate and do not necessarily remain true as soon as the risk-free interest rate becomes non-zero. Finally, for practical purposes, the set of weights w_i* obtained under the assumption of a zero risk-free interest rate µ_0 can be used to initialize the optimization algorithms when µ_0 does not vanish.
5
Conclusion
The aim of this work has been to show that the key properties of Gaussian asset distributions (stability under convolution, equivalence between all down-side risk measures, coherence and simplicity of use) also hold for a general family of distributions embodying both sub-exponential and super-exponential behaviors, when restricted to their tails. We then used these results to compute the Value-at-Risk (VaR) and to obtain efficient portfolios in the risk-return sense, where the risk is characterized by the Value-at-Risk. Specifically, we have studied a family of modified Weibull distributions to parameterize the marginal distributions of asset returns, extended to their multivariate distribution with Gaussian copulas. The relevance to finance of the family of modified Weibull distributions has been demonstrated in both a conditional and an unconditional portfolio management context. We have derived exact formulas for the tails of the distribution P(S) of returns S of a portfolio of arbitrary composition of these assets. We find that the tail of P(S) is also asymptotically a modified Weibull distribution, with a characteristic scale χ̂ function of the asset weights, with different functional forms depending on the super- or sub-exponential behavior of the marginals and on the strength of the dependence between the assets. The derivation of the portfolio distribution has shown the asymptotic stability of this family of distributions, with the important economic consequence that any down-side risk measure based upon the tail of the asset return distribution is equivalent, insofar as they all depend on the scale factor χ̂ and keep the same functional form whatever the number of assets in the portfolio. Our analytical study of the properties of the VaR has shown the VaR to be coherent.
This justifies the use of the VaR as a coherent risk measure for the class of modified Weibull distributions and ensures that portfolio optimization problems are always well-conditioned, even when not fully analytically solvable. The Value-at-Risk and the Expected-Shortfall have also been shown to be (asymptotically) equivalent in this framework. Finally, using the large class of modified Weibull distributions, we have provided a simple and fast method for calculating large down-side risks, exemplified by the Value-at-Risk, for assets with distributions of returns which fit the empirical distributions quite reasonably.
A Proof of theorem 3: Tail equivalence for weighted sums of modified Weibull variables

A.1 Super-exponential case: c > 1

Let X_1, X_2, · · · , X_N be N i.i.d. random variables with density p(·). Let us denote by f(·) and g(·) two positive functions such that p(·) = g(·) · e^{−f(·)}. Let w_1, w_2, · · · , w_N be N real non-random coefficients, and S = Σ_{i=1}^N w_i x_i. Let X = {x ∈ R^N, Σ_{i=1}^N w_i x_i = S}. The density of the variable S is given by

$$ P_S(S) = \int_{\mathcal{X}} dx\; e^{-\sum_{i=1}^{N} \left[ f(x_i) - \ln g(x_i) \right]} . \qquad (85) $$
We will assume the following conditions on the function f:

1. f(·) is three times continuously differentiable and four times differentiable,
2. f^{(2)}(x) > 0 for |x| large enough,
3. lim_{x→±∞} f^{(3)}(x) / (f^{(2)}(x))² = 0,
4. f^{(3)} is asymptotically monotonous,
5. there is a constant β > 1 such that f^{(3)}(β·x) / f^{(3)}(x) remains bounded as x goes to infinity,
6. g(·) is ultimately a monotonous function, regularly varying at infinity with index ν.

Let us start with the demonstration of several propositions.

PROPOSITION 1
Under hypothesis 3, we have
$$ \lim_{x \to \pm\infty} |x| \cdot f''(x) = +\infty . \qquad (86) $$

□

Proof
Hypothesis 3 can be rewritten as lim_{x→±∞} (d/dx) [1/f^{(2)}(x)] = 0, so that

$$ \forall \varepsilon > 0,\ \exists A_\varepsilon\ /\ x > A_\varepsilon \implies \left| \frac{d}{dx} \frac{1}{f^{(2)}(x)} \right| \le \varepsilon . \qquad (87) $$

Now, since f″ is differentiable, 1/f″ is also differentiable, and by the mean value theorem we have

$$ \left| \frac{1}{f''(x)} - \frac{1}{f''(y)} \right| = |x - y| \cdot \left| \frac{d}{d\xi} \frac{1}{f''(\xi)} \right| \quad \text{for some } \xi \in (x, y). \qquad (88) $$
Choosing x > y > A² , and applying equation (87) together with (88) yields ¯ ¯ ¯ 1 ¯ 1 ¯ ¯ ¯ f 00 (x) − f 00 (y) ¯ ≤ ² · |x − y|. Now, dividing by x and letting x go to infinity gives ¯ ¯ ¯ ¯ 1 ¯ ¯ ≤ ², lim ¯ x→∞ x · f 00 (x) ¯
(89)
(90)
which concludes the proof. □

PROPOSITION 2  Under assumption 3, we have
$$\lim_{x\to\pm\infty} f'(x) = +\infty. \qquad (91)$$

Proof  According to assumption 3 and proposition 1, $\lim_{x\to\pm\infty} x\, f''(x) = \infty$, which means
$$\forall \alpha > 0,\ \exists A_\alpha:\ x > A_\alpha \implies x\, f''(x) \ge \alpha. \qquad (92)$$
This thus gives, for all $x \ge A_\alpha$,
$$x\, f''(x) \ge \alpha \iff f''(x) \ge \frac{\alpha}{x} \qquad (93)$$
$$\implies \int_{A_\alpha}^{x} f''(t)\, dt \ge \alpha \int_{A_\alpha}^{x} \frac{dt}{t} \qquad (94)$$
$$\implies f'(x) \ge \alpha \ln x - \alpha \ln A_\alpha + f'(A_\alpha). \qquad (95)$$
The right-hand side of this last equation goes to infinity as $x$ goes to infinity, which concludes the proof. □

PROPOSITION 3  Under assumptions 3 and 6, the function $g(\cdot)$ satisfies, for any positive constant $C$,
$$\forall |h| \le \frac{C}{f''(x)}, \qquad \lim_{x\to\pm\infty} \frac{g(x+h)}{g(x)} = 1, \qquad (96)$$
uniformly in $h$.

Proof  For $g$ non-decreasing, we have, for all $|h| \le C/f''(x)$,
$$\frac{g\!\left(x\left(1 - \frac{C}{x f''(x)}\right)\right)}{g(x)} \;\le\; \frac{g(x+h)}{g(x)} \;\le\; \frac{g\!\left(x\left(1 + \frac{C}{x f''(x)}\right)\right)}{g(x)}. \qquad (97)$$
If $g$ is non-increasing, the same inequalities hold with the left and right terms exchanged; the final conclusion is therefore easily shown to be independent of the monotonicity properties of $g$. From assumption 3 and proposition 1, we have
$$\forall \alpha > 0,\ \exists A_\alpha:\ x > A_\alpha \implies x\, f''(x) \ge \alpha. \qquad (98)$$
Thus, for all $x$ larger than $A_\alpha$ and all $|h| \le C/f''(x)$,
$$\frac{g\!\left(x\left(1 - \frac{C}{\alpha}\right)\right)}{g(x)} \;\le\; \frac{g(x+h)}{g(x)} \;\le\; \frac{g\!\left(x\left(1 + \frac{C}{\alpha}\right)\right)}{g(x)}. \qquad (99)$$
Now, letting $x$ go to infinity, the regular variation of $g$ yields
$$\left(1 - \frac{C}{\alpha}\right)^{\nu} \;\le\; \lim_{x\to\infty} \frac{g(x+h)}{g(x)} \;\le\; \left(1 + \frac{C}{\alpha}\right)^{\nu}, \qquad (100)$$
for all $\alpha$ as large as we want, which concludes the proof. □

PROPOSITION 4  Under assumptions 1, 3 and 4, we have, for any positive constant $C$,
$$\forall |h| \le \frac{C}{f''(x)}, \qquad \lim_{x\to\pm\infty} \frac{\sup_{\xi\in[x,x+h]} |f^{(3)}(\xi)|}{f''(x)^2} = 0. \qquad (101)$$

Proof  Let us first remark that
$$\frac{\sup_{\xi\in[x,x+h]} |f^{(3)}(\xi)|}{f''(x)^2} = \frac{\sup_{\xi\in[x,x+h]} |f^{(3)}(\xi)|}{|f^{(3)}(x)|} \cdot \frac{|f^{(3)}(x)|}{f''(x)^2}. \qquad (102)$$
The rightmost factor in the right-hand side of this equation goes to zero as $x$ goes to infinity by assumption 3. Therefore, we just have to show that the leftmost factor remains bounded as $x$ goes to infinity. Applying assumption 4, according to which $f^{(3)}$ is asymptotically monotonous, we have
$$\frac{\sup_{\xi\in[x,x+h]} |f^{(3)}(\xi)|}{|f^{(3)}(x)|} \le \frac{\left|f^{(3)}\!\left(x + \frac{C}{f''(x)}\right)\right|}{|f^{(3)}(x)|} \qquad (103)$$
$$= \frac{\left|f^{(3)}\!\left(x\left(1 + \frac{C}{x f''(x)}\right)\right)\right|}{|f^{(3)}(x)|} \qquad (104)$$
$$\le \frac{\left|f^{(3)}\!\left(x\left(1 + \frac{C}{\alpha}\right)\right)\right|}{|f^{(3)}(x)|}, \qquad (105)$$
for every $x$ larger than some positive constant $A_\alpha$, by assumption 3 and proposition 1. Now, for $\alpha$ large enough, $1 + C/\alpha$ is less than $\beta$ (assumption 5), which shows that $\sup_{\xi\in[x,x+h]}|f^{(3)}(\xi)|/|f^{(3)}(x)|$ remains bounded for large $x$, which concludes the proof. □

We can now show that, under the assumptions stated above, the leading-order expansion of $P_S(S)$ for large $S$ and finite $N > 1$ is obtained by a generalization of Laplace's method, which here amounts to remarking that the set of $x_i^*$'s that maximize the integrand in (85) are solution of
$$f_i'(x_i^*) = \sigma(S)\, w_i, \qquad (106)$$
where $\sigma(S)$ is nothing but a Lagrange multiplier introduced to minimize the expression $\sum_{i=1}^N f_i(x_i)$ under the constraint $\sum_{i=1}^N w_i x_i = S$. This constraint shows that at least one $x_i$, for instance $x_1$, goes to infinity as $S \to \infty$. Since $f'(x_1)$ is an increasing function by assumption 2, which goes to infinity as $x_1 \to +\infty$ (proposition 2), expression (106) shows that $\sigma(S)$ goes to infinity with $S$, as long as the weight of asset 1 is not zero. Putting the divergence of $\sigma(S)$ with $S$ in expression (106) for $i = 2, \dots, N$ ensures that each $x_i^*$ increases when $S$ increases and goes to infinity when $S$ goes to infinity.
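The saddle-point condition (106) can be sanity-checked numerically in the independent modified Weibull case, where it admits the closed-form solution given later in equation (144). The sketch below (toy values for c, chi, S and the weights, chosen only for illustration) minimizes $f(x_1)+f(x_2)$ on the constraint line by a crude, repeatedly refined grid search:

```python
# Sketch: for f(x) = (x/chi)**c with c > 1, minimize f(x1) + f(x2) under
# w1*x1 + w2*x2 = S and compare with the closed-form saddle point
# x_i* = w_i**(1/(c-1)) * S / sum_j w_j**(c/(c-1))  (equation (144)).
c, chi, S = 2.5, 1.0, 10.0   # illustrative values
w1, w2 = 0.7, 0.3

def objective(x1):
    x2 = (S - w1 * x1) / w2
    return abs(x1 / chi)**c + abs(x2 / chi)**c

# coarse grid search, refined around the running minimum
lo, hi = 0.0, S / w1
for _ in range(6):
    xs = [lo + (hi - lo) * k / 200 for k in range(201)]
    x_best = min(xs, key=objective)
    step = (hi - lo) / 200
    lo, hi = x_best - step, x_best + step

x1_star = w1**(1 / (c - 1)) * S / (w1**(c / (c - 1)) + w2**(c / (c - 1)))
assert abs(x_best - x1_star) < 1e-6
```

The agreement confirms that the constrained minimizer indeed satisfies $f'(x_i^*) \propto w_i$, as (106) states.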
Expanding $f_i(x_i)$ around $x_i^*$ yields
$$f(x_i) = f(x_i^*) + f'(x_i^*)\, h_i + \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; f''(u), \qquad (107)$$
where the set of $h_i = x_i - x_i^*$ obey the condition
$$\sum_{i=1}^N w_i h_i = 0. \qquad (108)$$
Summing (107) over $i$ and using (106) together with relation (108) (so that $\sum_i f'(x_i^*) h_i = \sigma(S) \sum_i w_i h_i = 0$), we obtain
$$\sum_{i=1}^N f(x_i) = \sum_{i=1}^N f(x_i^*) + \sum_{i=1}^N \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; f''(u). \qquad (109)$$
Thus $\exp\left(-\sum f(x_i)\right)$ can be rewritten as follows:
$$\exp\left[-\sum_{i=1}^N f(x_i)\right] = \exp\left[-\sum_{i=1}^N f(x_i^*) - \sum_{i=1}^N \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; f''(u)\right]. \qquad (110)$$
Let us now define the compact set $A_C = \{h \in \mathbb{R}^N : \sum_{i=1}^N f''(x_i^*)\, h_i^2 \le C^2\}$, for any given positive constant $C$, and the set $H = \{h \in \mathbb{R}^N : \sum_{i=1}^N w_i h_i = 0\}$. We can thus write
$$P_S(S) = \int_H dh\; e^{-\sum_{i=1}^N [f(x_i) - \ln g(x_i)]} \qquad (111)$$
$$= \int_{A_C\cap H} dh\; e^{-\sum_{i=1}^N [f(x_i) - \ln g(x_i)]} + \int_{\bar A_C\cap H} dh\; e^{-\sum_{i=1}^N [f(x_i) - \ln g(x_i)]}, \qquad (112)$$
where $\bar A_C$ denotes the complement of $A_C$. We are now going to analyze in turn these two terms of the right-hand side of (112).

First term of the right-hand side of (112). We are going to show that, for some positive $C$,
$$\lim_{S\to\infty} \frac{\displaystyle\int_{A_C\cap H} dh\; e^{-\sum_{i=1}^N \left[\int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\, f''(u) \,-\, \ln g(x_i)\right]}}{\displaystyle \prod_i g(x_i^*)\; \frac{(2\pi)^{\frac{N-1}{2}}}{\sqrt{\sum_{i=1}^N w_i^2\, \frac{\prod_{j=1}^N f_j''(x_j^*)}{f_i''(x_i^*)}}}} = 1. \qquad (113)$$
In order to prove this assertion, let us first consider the integrand of the first term in the right-hand side of (112). Using (110),
$$\prod_i g(x_i)\, e^{-\sum_i f(x_i)} = e^{-\sum_i f(x_i^*)}\, \prod_i g(x_i^* + h_i)\; e^{-\sum_i \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; f''(u)} \qquad (114)$$
$$= e^{-\sum_i f(x_i^*)}\, \prod_i g(x_i^* + h_i)\; e^{-\frac{1}{2}\sum_i f''(x_i^*)\, h_i^2}\; e^{-\sum_i \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; [f''(u) - f''(x_i^*)]}. \qquad (115)$$
Since for all $\xi \in \mathbb{R}$, $e^{-|\xi|} \le e^{-\xi} \le e^{|\xi|}$, we have
$$e^{-\sum_i \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; |f''(u)-f''(x_i^*)|} \;\le\; e^{-\sum_i \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; [f''(u)-f''(x_i^*)]} \;\le\; e^{+\sum_i \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; |f''(u)-f''(x_i^*)|}, \qquad (116\text{--}117)$$
since, whatever the sign of $h_i$, the quantity $\int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\, |f''(u) - f''(x_i^*)|$ remains always positive. But $|u - x_i^*| \le |h_i| \le C/f''(x_i^*)$, which leads, by the mean value theorem and assumption 1, to
$$|f''(u) - f''(x_i^*)| \le \sup_{\xi\in(x_i^*,\, x_i^*+h_i)} |f^{(3)}(\xi)| \cdot |u - x_i^*| \qquad (118)$$
$$\le \sup_{\xi\in(x_i^*,\, x_i^*+h_i)} |f^{(3)}(\xi)|\; \frac{C}{f''(x_i^*)} \qquad (119)$$
$$\le \sup_{\xi\in G_i} |f^{(3)}(\xi)|\; \frac{C}{f''(x_i^*)}, \quad\text{where } G_i = \left[x_i^* - \frac{C}{f''(x_i^*)},\; x_i^* + \frac{C}{f''(x_i^*)}\right], \qquad (120)$$
which yields
$$0 \le \sum_i \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; |f''(u)-f''(x_i^*)| \le \frac12 \sum_i \sup_{\xi\in G_i}|f^{(3)}(\xi)|\; \frac{C}{f''(x_i^*)}\; h_i^2. \qquad (121)$$
Thus
$$e^{-\frac12\sum_i \sup_{\xi\in G_i}|f^{(3)}(\xi)|\, \frac{C}{f''(x_i^*)}\, h_i^2} \;\le\; e^{-\sum_i \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; [f''(u)-f''(x_i^*)]} \;\le\; e^{+\frac12\sum_i \sup_{\xi\in G_i}|f^{(3)}(\xi)|\, \frac{C}{f''(x_i^*)}\, h_i^2}, \qquad (122)$$
where $\sup_{\xi\in G_i}|f^{(3)}(\xi)|$ has been abbreviated as $\sup|f^{(3)}(\xi)|$ in the previous expression, in order not to encumber the notation. By proposition 4, we know that, for all $h \in A_C$ and all $\epsilon_i > 0$,
$$\frac{\sup_{\xi\in G_i} |f^{(3)}(\xi)|}{f''(x_i^*)} \le \epsilon_i, \quad\text{for } x_i^* \text{ large enough}, \qquad (123)$$
so that, for all $\epsilon' > 0$ and all $h \in A_C$,
$$e^{-\frac{C\epsilon'}{2}\sum_i h_i^2} \;\le\; e^{-\sum_i \int_{x_i^*}^{x_i^*+h_i} dt \int_{x_i^*}^{t} du\; [f''(u)-f''(x_i^*)]} \;\le\; e^{+\frac{C\epsilon'}{2}\sum_i h_i^2}, \qquad (124)$$
for $|x|$ large enough. Moreover, from proposition 3, we have, for all $\epsilon_i > 0$ and $x_i^*$ large enough,
$$\forall h\in A_C, \qquad (1-\epsilon_i)^{\nu} \le \frac{g(x_i^*+h_i)}{g(x_i^*)} \le (1+\epsilon_i)^{\nu}, \qquad (125)$$
so, for all $\epsilon'' > 0$,
$$\forall h\in A_C, \qquad (1-\epsilon'')^{N\nu} \le \prod_i \frac{g(x_i^*+h_i)}{g(x_i^*)} \le (1+\epsilon'')^{N\nu}. \qquad (126)$$
Then, for all $\epsilon > 0$ and $|x|$ large enough, this yields, for all $h \in A_C$,
$$(1-\epsilon)^{N\nu}\; e^{-\frac12\sum_i (f''(x_i^*)+C\epsilon)\, h_i^2} \;\le\; \prod_i \frac{g(x_i)}{g(x_i^*)}\; e^{-\sum_i [f(x_i)-f(x_i^*)]} \;\le\; (1+\epsilon)^{N\nu}\; e^{-\frac12\sum_i (f''(x_i^*)-C\epsilon)\, h_i^2}. \qquad (127)$$
Thus, integrating over all $h \in A_C \cap H$ and using the continuity of the mapping
$$G(\mathbf{Y}) = \int_{A_C\cap H} dh\; g(h, \mathbf{Y}), \qquad (128)$$
where $g(h,\mathbf{Y}) = e^{-\frac12\sum_i Y_i h_i^2}$, we can conclude that
$$\frac{\displaystyle\int_{A_C\cap H} dh\; \prod_i g(x_i)\, e^{-\sum_i f(x_i)}}{\displaystyle\prod_i g(x_i^*)\, e^{-\sum_i f(x_i^*)} \int_{A_C\cap H} dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2}} \xrightarrow[S\to\infty]{} 1. \qquad (129)$$
Now, we remark that
$$\int_H dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2} = \int_{A_C\cap H} dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2} + \int_{\bar A_C\cap H} dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2}, \qquad (130)$$
with
$$\int_H dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2} = \frac{(2\pi)^{\frac{N-1}{2}}}{\sqrt{\sum_{i=1}^N w_i^2\, \frac{\prod_{j=1}^N f_j''(x_j^*)}{f_i''(x_i^*)}}}, \qquad (131)$$
and
$$\int_{\bar A_C\cap H} dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2} \sim O\!\left(e^{-\frac{\alpha}{f''(x^*)}}\right), \qquad \alpha > 0, \qquad (132)$$
where $x^* = \max_i\{x_i^*\}$ (note that $1/f''(x) \to \infty$ with $x$ by proposition 1). Indeed, we clearly have
$$\int_{\bar A_C\cap H} dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2} \le \int_{\bar A_C} dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2} \qquad (133)$$
$$= \frac{(2\pi)^{N/2}}{\sqrt{\prod_i f''(x_i^*)}} \int_{\bar B_C} \prod_i \frac{e^{-\frac12\, \frac{u_i^2}{f''(x_i^*)}}}{\sqrt{2\pi f''(x_i^*)}}\; du, \qquad (134)$$
where we have performed the change of variable $u_i = f''(x_i^*)\, h_i$ and denoted by $B_C$ the set $\{u \in \mathbb{R}^N : \sum u_i^2 \le C^2\}$. Now, let $x^*_{\max} = \max_i\{x_i^*\}$ and $x^*_{\min} = \min_i\{x_i^*\}$. Expression (134) then gives
$$\int_{\bar A_C\cap H} dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2} \le \frac{(2\pi)^{N/2}}{f''(x^*_{\min})^{N/2}} \int_{\bar B_C} \frac{e^{-\frac12\sum_i \frac{u_i^2}{f''(x^*_{\max})}}}{\left(2\pi f''(x^*_{\min})\right)^{N/2}}\; du \qquad (135)$$
$$= S_{N-1}\; \frac{f''(x^*_{\max})^{N/2}}{f''(x^*_{\min})^{N}}\; \Gamma\!\left(\frac N2,\, \frac{C^2}{2 f''(x^*_{\max})}\right) \qquad (136)$$
$$\simeq S_{N-1}\; \frac{f''(x^*_{\max})^{N/2}}{f''(x^*_{\min})^{N}} \left(\frac{C^2}{2 f''(x^*_{\max})}\right)^{\frac N2 - 1} e^{-\frac{C^2}{2 f''(x^*_{\max})}}, \qquad (137)$$
where $S_{N-1}$ denotes the surface of the unit sphere in $\mathbb{R}^N$. This bound decays exponentially fast for large $S$ (or large $x^*_{\max}$) as long as $f''$ goes to zero at infinity, i.e., for any function $f$ which goes to infinity not faster than $x^2$. So, finally,
$$\int_{A_C\cap H} dh\; e^{-\frac12\sum_i f''(x_i^*)h_i^2} = \frac{(2\pi)^{\frac{N-1}{2}}}{\sqrt{\sum_{i=1}^N w_i^2\, \frac{\prod_{j=1}^N f_j''(x_j^*)}{f_i''(x_i^*)}}} + O\!\left(e^{-\frac{\alpha}{f''(x^*_{\max})}}\right), \qquad (138)$$
which concludes the proof of equation (113).
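The constrained Gaussian integral (131) can be verified numerically for $N = 2$, writing the integral over $H$ with a delta function $\delta(w_1 h_1 + w_2 h_2)$ and integrating out $h_2$ (which contributes a factor $1/|w_2|$); the values of $f''(x_i^*)$ and the weights below are illustrative toy numbers, not from the text:

```python
import math

# Numeric check of (131) for N = 2:
# int dh delta(w1*h1 + w2*h2) exp(-(f1*h1^2 + f2*h2^2)/2)
#   = sqrt(2*pi) / sqrt(w1^2 * f2 + w2^2 * f1).
f1, f2 = 2.0, 5.0     # toy values of f''(x1*), f''(x2*)
w1, w2 = 0.6, 0.4     # toy weights

def integrand(t):
    h2 = -w1 * t / w2            # enforced by the delta function
    return math.exp(-0.5 * (f1 * t * t + f2 * h2 * h2)) / abs(w2)

# trapezoidal rule on a wide, fine grid (the Gaussian decays fast)
n, L = 40000, 20.0
h = 2 * L / n
num = sum(integrand(-L + k * h) for k in range(n + 1)) * h

formula = math.sqrt(2 * math.pi) / math.sqrt(w1**2 * f2 + w2**2 * f1)
assert abs(num - formula) < 1e-6
```

The trapezoidal rule is essentially exact here because the integrand is a rapidly decaying Gaussian, so the quadrature error is negligible compared with the tolerance.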
Second term of the right-hand side of (112). We now have to show that
$$\int_{\bar A_C\cap H} dh\; e^{-\sum_i [f(x_i^*+h_i) - \ln g(x_i^*+h_i)]} \qquad (139)$$
can be neglected. This is obvious since, by assumptions 2 and 6, the function $f(x) - \ln g(x)$ remains convex for $x$ large enough, which ensures that $f(x) - \ln g(x) \ge C_1 |x|$ for some positive constant $C_1$ and $x$ large enough. Thus, choosing the constant $C$ in $A_C$ large enough, we have
$$\int_{\bar A_C\cap H} dh\; e^{-\sum_{i=1}^N [f(x_i)-\ln g(x_i)]} \le \int_{\bar A_C\cap H} dh\; e^{-C_1 \sum_{i=1}^N |x_i^*+h_i|} \sim O\!\left(e^{-\frac{\alpha'}{f''(x^*)}}\right), \qquad \alpha' > 0. \qquad (140)$$
Thus, for $S$ large enough, the density $P_S(S)$ is asymptotically equal to
$$P_S(S) = e^{-\sum_i f(x_i^*)}\, \prod_i g(x_i^*)\; \frac{(2\pi)^{\frac{N-1}{2}}}{\sqrt{\sum_{i=1}^N w_i^2\, \frac{\prod_{j=1}^N f_j''(x_j^*)}{f_i''(x_i^*)}}}. \qquad (141)$$
In the case of the modified Weibull variables, we have
$$f(x) = \left(\frac{|x|}{\chi}\right)^c \qquad (142)$$
and
$$g(x) = \frac{c}{2\sqrt{\pi}\,\chi^{c/2}}\; |x|^{\frac c2 - 1}, \qquad (143)$$
which satisfy our assumptions if and only if $c > 1$. In such a case, we obtain
$$x_i^* = \frac{w_i^{\frac{1}{c-1}}}{\sum_j w_j^{\frac{c}{c-1}}}\cdot S, \qquad (144)$$
which, after some simple algebraic manipulations, yields
$$P(S) \sim \left[\frac{c}{2(c-1)}\right]^{\frac{N-1}{2}} \frac{c}{2\sqrt{\pi}\,\hat\chi^{c/2}}\; |S|^{\frac c2 - 1}\; e^{-\left(\frac{|S|}{\hat\chi}\right)^c}, \qquad (145)$$
with
$$\hat\chi = \left(\sum_i w_i^{\frac{c}{c-1}}\right)^{\frac{c-1}{c}} \cdot \chi, \qquad (146)$$
as announced in theorem 3.
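The effective scale parameter (146) is straightforward to evaluate. The sketch below (illustrative weights) also checks two limiting cases: for c = 2 the formula reduces to the Gaussian rule $\hat\chi = \chi\sqrt{\sum w_i^2}$, and a single unit weight gives back $\chi$:

```python
# Effective scale chi_hat of the weighted sum of independent modified
# Weibull W(c, chi) variables in the super-exponential case c > 1
# (equations (144) and (146)).
def chi_hat(weights, c, chi):
    s = sum(w**(c / (c - 1)) for w in weights)
    return s**((c - 1) / c) * chi

c, chi = 2.0, 1.0
w = [0.5, 0.5]  # illustrative equal weights
# for c = 2 the formula reduces to chi * sqrt(w1**2 + w2**2)
assert abs(chi_hat(w, c, chi) - (0.5**2 + 0.5**2)**0.5) < 1e-12
# a single asset with unit weight must give back chi
assert abs(chi_hat([1.0], 1.7, chi) - chi) < 1e-12
```

Since $c/(c-1) < 2$ for $c > 2$, the effective scale of the sum grows faster than the Gaussian prescription in that regime, reflecting the heavier-than-Gaussian tails.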
A.2 Sub-exponential case: c ≤ 1

Let $X_1, X_2, \cdots, X_N$ be $N$ i.i.d. sub-exponential modified Weibull random variables $\mathcal{W}(c, \chi)$, with distribution function $F$. Let us denote by $G_S$ the distribution function of the variable
$$S_N = w_1 X_1 + w_2 X_2 + \cdots + w_N X_N, \qquad (147)$$
where $w_1, w_2, \cdots, w_N$ are real non-random coefficients. Let $w^* = \max\{|w_1|, |w_2|, \cdots, |w_N|\}$. Then, theorem 5.5 (b) of Goldie and Klüppelberg (1998) states that
$$\lim_{x\to\infty} \frac{\overline{G}_S(x/w^*)}{\overline{F}(x)} = \mathrm{Card}\,\{i \in \{1, 2, \dots, N\} : |w_i| = w^*\}. \qquad (148)$$
By definition 2, this allows us to conclude that $S_N$ is equivalent in the upper tail to $Z \sim \mathcal{W}(c, w^*\chi)$. A similar calculation yields an analogous result for the lower tail.
B Asymptotic distribution of the sum of Weibull variables with a Gaussian copula

We assume that the marginal distributions are given by the modified Weibull distributions
$$P_i(x_i) = \frac{c}{2\sqrt{\pi}\,\chi_i^{c/2}}\; |x_i|^{\frac c2 - 1}\; e^{-\left(\frac{|x_i|}{\chi_i}\right)^c}, \qquad (149)$$
and that the $\chi_i$'s are all equal to one, in order not to encumber the notation. As in the proof of corollary 2, it will be sufficient to replace $w_i$ by $w_i\chi_i$ to reintroduce the scale factors. Under the Gaussian copula assumption, we obtain the following form for the multivariate distribution:
$$P(x_1, \cdots, x_N) = \frac{c^N}{2^N \pi^{N/2} \sqrt{\det V}}\; \prod_{i=1}^N x_i^{c/2-1}\; \exp\left(-\sum_{i,j} V^{-1}_{ij}\, x_i^{c/2} x_j^{c/2}\right). \qquad (150)$$
Let
$$f(x_1, \cdots, x_N) = \sum_{i,j} V^{-1}_{ij}\, x_i^{c/2} x_j^{c/2}. \qquad (151)$$
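A sampling sketch for this model (all numerical values below are illustrative assumptions): if $Y$ has density $e^{-y^2}/\sqrt{\pi}$, then $X = \mathrm{sign}(Y)\,|Y|^{2/c}$ follows the modified Weibull density (149) with $\chi = 1$, so correlated Gaussians pushed through this map give modified Weibull marginals coupled by a Gaussian copula:

```python
import math, random

# Map between the Gaussian variable Y (density exp(-y^2)/sqrt(pi)) and the
# modified Weibull variable X of density (149) with chi = 1.
def to_weibull(y, c):
    return math.copysign(abs(y)**(2.0 / c), y)

def from_weibull(x, c):
    return math.copysign(abs(x)**(c / 2.0), x)

# deterministic sanity check: the two maps are inverse of each other
for v in (-1.3, -0.2, 0.4, 2.0):
    assert abs(from_weibull(to_weibull(v, 1.5), 1.5) - v) < 1e-12

def sample_pair(rho, c, rng):
    # correlated standard normals via a 2x2 Cholesky step,
    # rescaled to variance 1/2 so the density is exp(-y^2)/sqrt(pi)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return to_weibull(z1 / math.sqrt(2), c), to_weibull(z2 / math.sqrt(2), c)

rng = random.Random(42)
xs = [sample_pair(0.5, 1.5, rng) for _ in range(1000)]
assert len(xs) == 1000 and all(math.isfinite(a) and math.isfinite(b) for a, b in xs)
```

This is the same "Gaussianization" change of variable that underlies Figure 1 of the chapter, applied here in the sampling direction.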
We have to minimize $f$ under the constraint $\sum w_i x_i = S$. As for the independent case, we introduce a Lagrange multiplier $\lambda$, which leads to
$$c \sum_j V^{-1}_{jk}\, {x_j^*}^{c/2}\, {x_k^*}^{c/2-1} = \lambda\, w_k. \qquad (152)$$
The left-hand side of this equation is a homogeneous function of degree $c-1$ in the $x_i^*$'s, thus necessarily
$$x_i^* = \left(\frac{\lambda}{c}\right)^{\frac{1}{c-1}} \cdot \sigma_i, \qquad (153)$$
where the $\sigma_i$'s are solution of
$$\sum_j V^{-1}_{jk}\, \sigma_j^{c/2}\, \sigma_k^{c/2-1} = w_k. \qquad (154)$$
The set of equations (154) has a unique solution due to the convexity of the minimization problem. This set of equations can easily be solved by a numerical method such as Newton's algorithm. It is convenient to simplify the problem and avoid the inversion of the matrix $V$ by rewriting (154) as
$$\sum_k V_{jk}\, w_k\, \sigma_k^{1-c/2} = \sigma_j^{c/2}. \qquad (155)$$
Using the constraint $\sum w_i x_i^* = S$, we obtain
$$\left(\frac{\lambda}{c}\right)^{\frac{1}{c-1}} = \frac{S}{\sum_i w_i \sigma_i}, \qquad (156)$$
so that
$$x_i^* = \frac{\sigma_i}{\sum_j w_j \sigma_j} \cdot S. \qquad (157)$$
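Equation (155) can also be solved by plain fixed-point iteration, a simpler (if less robust) alternative to the Newton scheme mentioned above; the matrix V, the weights and the exponents below are illustrative, and for c = 2 the equation becomes linear, providing an exact check:

```python
# Solve sigma_j**(c/2) = sum_k V[j][k] * w[k] * sigma_k**(1 - c/2)
# (equation (155)) by fixed-point iteration; V, w, c are toy values.
def solve_sigma(V, w, c, n_iter=200):
    N = len(w)
    sigma = [1.0] * N
    for _ in range(n_iter):
        sigma = [sum(V[j][k] * w[k] * sigma[k]**(1 - c / 2)
                     for k in range(N))**(2 / c)
                 for j in range(N)]
    return sigma

V = [[1.0, 0.3], [0.3, 1.0]]
w = [0.5, 0.5]

# for c = 2 equation (155) is linear and gives sigma = V w exactly
sig2 = solve_sigma(V, w, 2.0)
assert all(abs(sig2[j] - sum(V[j][k] * w[k] for k in range(2))) < 1e-12
           for j in range(2))

# for c = 1.5 check the residual of (155) directly
c = 1.5
sig = solve_sigma(V, w, c)
res = max(abs(sum(V[j][k] * w[k] * sig[k]**(1 - c / 2) for k in range(2))
              - sig[j]**(c / 2))
          for j in range(2))
assert res < 1e-10
```

For the mild values used here the iteration contracts geometrically; for less benign V, w or c, a damped or Newton update would be the safer choice.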
Thus
$$f(x_1, \cdots, x_N) = f(x_1^*, \cdots, x_N^*) + \sum_i \frac{\partial f}{\partial x_i}\, h_i + \frac12 \sum_{ij} \frac{\partial^2 f}{\partial x_i \partial x_j}\, h_i h_j + \cdots \qquad (158)$$
$$= \frac{S^c}{\left(\sum_i w_i\sigma_i\right)^{c-1}} + \frac12 \sum_{ij} \frac{\partial^2 f}{\partial x_i \partial x_j}\, h_i h_j + \cdots, \qquad (159)$$
where, as in the previous section, $h_i = x_i - x_i^*$ and the derivatives of $f$ are evaluated at $x_1^*, \dots, x_N^*$. It is easy to check that the $n$-th order derivative of $f$ with respect to the $x_i$'s, evaluated at $\{x_i^*\}$, is proportional to $S^{c-n}$. In the sequel, we will use the following notation:
$$\left.\frac{\partial^n f}{\partial x_{i_1} \cdots \partial x_{i_n}}\right|_{\{x_i^*\}} = M^{(n)}_{i_1\cdots i_n}\, S^{c-n}. \qquad (160)$$
We can write
$$f(x_1, \cdots, x_N) = \frac{S^c}{\left(\sum_i w_i\sigma_i\right)^{c-1}} + \frac{S^{c-2}}{2}\sum_{ij} M^{(2)}_{ij}\, h_i h_j + \frac{S^{c-3}}{6}\sum_{ijk} M^{(3)}_{ijk}\, h_i h_j h_k + \cdots \qquad (161)$$
up to the fourth order. This leads to
$$P(S) \propto e^{-\frac{S^c}{(\sum w_i\sigma_i)^{c-1}}} \int dh_1 \cdots dh_N\; e^{-\frac{S^{c-2}}{2}\sum_{ij} M^{(2)}_{ij}\, h_i h_j}\; \delta\!\left(\sum_i w_i h_i\right) \times \left[1 + \frac{S^{c-3}}{6}\sum_{ijk} M^{(3)}_{ijk}\, h_i h_j h_k + \cdots\right]. \qquad (162)$$
Using the relation $\delta\!\left(\sum w_i h_i\right) = \int \frac{dk}{2\pi}\, e^{-ik\sum_j w_j h_j}$, we obtain
$$P(S) \propto e^{-\frac{S^c}{(\sum w_i\sigma_i)^{c-1}}} \int \frac{dk}{2\pi} \int dh_1 \cdots dh_N\; e^{-\frac{S^{c-2}}{2}\sum_{ij} M^{(2)}_{ij}\, h_i h_j \,-\, ik\sum_j w_j h_j} \times \left[1 + \frac{S^{c-3}}{6}\sum_{ijk} M^{(3)}_{ijk}\, h_i h_j h_k + \cdots\right], \qquad (163)$$
or, in vectorial notation,
$$P(S) \propto e^{-\frac{S^c}{(\sum w_i\sigma_i)^{c-1}}} \int \frac{dk}{2\pi} \int dh\; e^{-\frac{S^{c-2}}{2}\, h^t M^{(2)} h \,-\, ik\, w^t h} \times \left[1 + \frac{S^{c-3}}{6}\sum_{ijk} M^{(3)}_{ijk}\, h_i h_j h_k + \cdots\right]. \qquad (164)$$
Let us perform the following standard change of variables:
$$h = h' - \frac{ik}{S^{c-2}}\, {M^{(2)}}^{-1} w \qquad (165)$$
(${M^{(2)}}^{-1}$ exists since $f$ is assumed convex and thus $M^{(2)}$ positive), which gives
$$\frac{S^{c-2}}{2}\, h^t M^{(2)} h + ik\, w^t h = \frac{S^{c-2}}{2}\, h'^t M^{(2)} h' + \frac{k^2}{2 S^{c-2}}\, w^t {M^{(2)}}^{-1} w. \qquad (166)$$
This yields
$$P(S) \propto e^{-\frac{S^c}{(\sum w_i\sigma_i)^{c-1}}} \int \frac{dk}{2\pi}\; e^{-\frac{k^2}{2S^{c-2}}\, w^t {M^{(2)}}^{-1} w} \int dh\; e^{-\frac{S^{c-2}}{2}\left(h + \frac{ik}{S^{c-2}}{M^{(2)}}^{-1}w\right)^t M^{(2)} \left(h + \frac{ik}{S^{c-2}}{M^{(2)}}^{-1}w\right)} \times \left[1 + \frac{S^{c-3}}{6}\sum_{ijk} M^{(3)}_{ijk}\, h_i h_j h_k + \cdots\right]. \qquad (167)$$
Denoting by $\langle\cdot\rangle_h$ the average with respect to the Gaussian distribution of $h$ and by $\langle\cdot\rangle_k$ the average with respect to the Gaussian distribution of $k$, we have
$$P(S) \propto (2\pi S^{2-c})^{\frac{N-1}{2}} \sqrt{\frac{\det {M^{(2)}}^{-1}}{w^t {M^{(2)}}^{-1} w}}\; e^{-\frac{S^c}{(\sum w_i\sigma_i)^{c-1}}} \times \left[1 + \frac{S^{c-3}}{6}\sum_{ijk} M^{(3)}_{ijk}\, \langle\langle h_i h_j h_k\rangle_h\rangle_k + \frac{S^{c-4}}{24}\sum_{ijkl} M^{(4)}_{ijkl}\, \langle\langle h_i h_j h_k h_l\rangle_h\rangle_k + \cdots\right]. \qquad (168)$$
We now invoke Wick's theorem², which states that each term $\langle\langle h_i \cdots h_p\rangle_h\rangle_k$ can be expressed as a product of pairwise correlation coefficients. Evaluating the average with respect to the symmetric distribution of $k$, it is obvious that the odd-order terms vanish, and counting the powers of $S$ involved in each even-order term shows that all of them are sub-dominant. So, up to the leading order,
$$P(S) \propto (2\pi S^{2-c})^{\frac{N-1}{2}} \sqrt{\frac{\det {M^{(2)}}^{-1}}{w^t {M^{(2)}}^{-1} w}}\; e^{-\frac{S^c}{(\sum w_i\sigma_i)^{c-1}}}. \qquad (169)$$
The matrix $M^{(2)}$ can be calculated, which yields
$$M^{(2)}_{kl} = \frac{1}{\left(\sum w_i\sigma_i\right)^{c-2}} \left[ c\left(\frac c2 - 1\right)\frac{w_k}{\sigma_k}\, \delta_{kl} + \frac{c^2}{2}\, V^{-1}_{kl}\, \sigma_k^{\frac c2 - 1}\sigma_l^{\frac c2 - 1} \right] \qquad (170)$$
$$= \frac{1}{\left(\sum w_i\sigma_i\right)^{c-2}}\, \tilde M_{kl}, \qquad (171)$$
and shows that
$$\sqrt{\frac{\det {M^{(2)}}^{-1}}{w^t {M^{(2)}}^{-1} w}} = \left(\sum_i w_i\sigma_i\right)^{(N-1)\left(\frac c2 - 1\right)} \sqrt{\frac{\det \tilde M^{-1}}{w^t \tilde M^{-1} w}}. \qquad (172)$$
The inverse matrix $\tilde M^{-1}$ satisfies $\sum_l \tilde M_{kl}\, (\tilde M^{-1})_{lj} = \delta_{kj}$, which can be rewritten
$$c\left(\frac c2 - 1\right)\frac{w_k}{\sigma_k}\, (\tilde M^{-1})_{kj} + \frac{c^2}{2}\sum_l V^{-1}_{kl}\, (\tilde M^{-1})_{lj}\, \sigma_k^{\frac c2 - 1}\sigma_l^{\frac c2 - 1} = \delta_{kj}, \qquad (173)$$
or equivalently
$$c\left(\frac c2 - 1\right) w_k\, (\tilde M^{-1})_{kj} + \frac{c^2}{2}\sum_l V^{-1}_{kl}\, (\tilde M^{-1})_{lj}\, \sigma_k^{\frac c2}\sigma_l^{\frac c2 - 1} = \delta_{kj}\, \sigma_k, \qquad (174)$$
²See for instance (Brézin et al. 1976) for a general introduction, (Sornette 1998) for an early application to the portfolio problem, and (Sornette et al. 2000b) for a systematic utilization with the help of diagrams.
which gives
$$c\left(\frac c2 - 1\right)\sum_{j,k} w_k\, (\tilde M^{-1})_{kj}\, w_j + \frac{c^2}{2}\sum_{j,k,l} V^{-1}_{kl}\, (\tilde M^{-1})_{lj}\, \sigma_k^{\frac c2}\sigma_l^{\frac c2 - 1}\, w_j = \sum_{j,k} \delta_{kj}\, \sigma_k\, w_j. \qquad (175)$$
Summing the rightmost factor of the left-hand side over $k$, and accounting for equation (154), leads to
$$c\left(\frac c2 - 1\right)\sum_{j,k} w_k\, (\tilde M^{-1})_{kj}\, w_j + \frac{c^2}{2}\sum_{j,l} w_l\, (\tilde M^{-1})_{lj}\, w_j = \sum_j \sigma_j w_j, \qquad (176)$$
so that
$$w^t \tilde M^{-1} w = \frac{1}{c(c-1)} \sum_j w_j\sigma_j. \qquad (177)$$
Moreover,
$$\frac{c^N}{2^N \pi^{N/2}\sqrt{\det V}}\; \prod_{i=1}^N {x_i^*}^{\frac c2 - 1} = \frac{c^N \prod_i \sigma_i^{\frac c2 - 1}}{2^N \pi^{N/2}\sqrt{\det V}\, \left(\sum w_i\sigma_i\right)^{N\left(\frac c2 - 1\right)}}\; S^{N\left(\frac c2 - 1\right)}. \qquad (178)$$
Thus, putting together equations (169), (172), (177) and (178) yields
$$P(S) \simeq \sqrt{c(c-1)}\; \frac{c^{N-1}\prod_i \sigma_i^{\frac c2 - 1}}{2^{\frac{N-1}{2}}}\, \sqrt{\frac{\det \tilde M^{-1}}{\det V}}\; \cdot\; \frac{c}{2\sqrt{\pi}\,\hat\chi^{c/2}}\; |S|^{\frac c2 - 1}\; e^{-\left(\frac{|S|}{\hat\chi}\right)^c}, \qquad (179)$$
with
$$\hat\chi = \left(\sum_i w_i \chi_i \sigma_i\right)^{\frac{c-1}{c}}. \qquad (180)$$
References

Acerbi, C. and D. Tasche, 2002, On the coherence of expected shortfall, Journal of Banking and Finance 26, 1487-1503.
Andersen, J.V. and D. Sornette, 2001, Have your cake and eat it too: increasing returns while lowering large risks!, Journal of Risk Finance 2, 70-82.
Artzner, P., F. Delbaen, J.M. Eber and D. Heath, 1999, Coherent measures of risk, Mathematical Finance 9, 203-228.
Basle Committee on Banking Supervision, 1996, Amendment to the capital accord to incorporate market risks.
Basle Committee on Banking Supervision, 2001, The new Basel capital accord.
Bekaert, G. and G.J. Wu, 2000, Asymmetric volatility and risk in equity markets, Review of Financial Studies 13, 1-42.
Black, F., 1976, in Proceedings of the 1976 American Statistical Association, Business and Economical Statistics Section (American Statistical Association, Alexandria, VA), p. 177.
Bouchaud, J.-P., A. Matacz and M. Potters, 2001, Leverage effect in financial markets: The retarded volatility model, Physical Review Letters 87, 228701.
Bouchaud, J.-P., D. Sornette, C. Walter and J.-P. Aguilar, 1998, Taming large events: Optimal portfolio theory for strongly fluctuating assets, International Journal of Theoretical and Applied Finance 1, 25-41.
Brézin, E., J.C. Le Guillou and J. Zinn-Justin, 1976, Field theoretical approach to critical phenomena, in C. Domb and M. Green, eds., The Renormalization Group and its Applications, vol. 6 (Academic Press, London), pp. 125-247.
Brummelhuis, R.G.M. and D. Guégan, 2000, Extreme values of conditional distributions of GARCH(1,1) processes, Working Paper, University of Reims.
Brummelhuis, R.G.M., D. Guégan and S. LaDoucette, in preparation.
Campbell, J.Y., A.W. Lo and A.C. MacKinlay, 1997, The Econometrics of Financial Markets (Princeton University Press, Princeton, NJ).
Chabaane, A., E. Duclos, J.P. Laurent, Y. Malevergne and F. Turpin, 2002, Looking for efficient portfolios: An empirical investigation, Working Paper.
Danielsson, J., P. Embrechts, C. Goodhart, C. Keating, F. Muennich, O. Renault and H.-S. Shin, 2001, An academic response to Basel II, Working Paper, FMG and ESRC, London.
Duffie, D. and J. Pan, 1997, An overview of Value at Risk, Journal of Derivatives 4, 7-49.
Embrechts, P., C. Klüppelberg and T. Mikosch, 1997, Modelling Extremal Events (Springer-Verlag, Applications of Mathematics 33).
Embrechts, P., A. McNeil and D. Straumann, 2002a, Correlation and dependence in risk management: properties and pitfalls, in M.A.H. Dempster, ed., Risk Management: Value at Risk and Beyond (Cambridge University Press, Cambridge), pp. 176-223.
Embrechts, P., A. Höing and A. Juri, 2002b, Using copulae to bound the Value-at-Risk for functions of dependent risks, Working Paper, RiskLab.
Fama, E.F., 1965, Portfolio analysis in a stable Paretian market, Management Science 11, 404-419.
Fouque, J.-P., G. Papanicolaou and R. Sircar, 2000, Derivatives in Financial Markets with Stochastic Volatility (Cambridge University Press, Cambridge, UK).
Frisch, U. and D. Sornette, 1997, Extreme deviations and applications, Journal de Physique I France 7, 1155-1171.
Goldie, C.M. and C. Klüppelberg, 1998, Subexponential distributions, in R. Adler, R. Feldman and M.S. Taqqu, eds., A Practical Guide to Heavy Tails: Statistical Techniques and Applications (Birkhäuser, Boston), pp. 435-459.
Gopikrishnan, P., M. Meyer, L.A. Nunes Amaral and H.E. Stanley, 1998, Inverse cubic law for the distribution of stock price variations, European Physical Journal B 3, 139-140.
Gouriéroux, C. and J. Jasiak, 1998, Truncated maximum likelihood, goodness of fit tests and tail analysis, Working Paper.
Johansen, A. and D. Sornette, 1998, Stock market crashes are outliers, European Physical Journal B 1, 141-144.
Johansen, A. and D. Sornette, 1998, Large market price drawdowns are outliers, Journal of Risk 4, 69-110.
Jorion, P., 1997, Value at Risk: The New Benchmark for Controlling Derivatives Risk (Irwin Publishing, Chicago, IL).
Kahneman, D. and A. Tversky, 1979, Prospect theory: An analysis of decision under risk, Econometrica 47, 263-291.
Laherrère, J. and D. Sornette, 1998, Stretched exponential distributions in nature and economy: "fat tails" with characteristic scales, European Physical Journal B 2, 525-539.
Lintner, J., 1965, The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets, Review of Economics and Statistics 41, 13-37.
Litterman, R. and K. Winkelmann, 1998, Estimating covariance matrices (Risk Management Series, Goldman Sachs).
Lux, T., 1996, The stable Paretian hypothesis and the frequency of large returns: an examination of major German stocks, Applied Financial Economics 6, 463-475.
Malevergne, Y., V. Pisarenko and D. Sornette, 2002, Empirical distributions of log-returns: Exponential or power-like?, Working Paper.
Malevergne, Y. and D. Sornette, 2001, Testing the Gaussian copula hypothesis for modeling financial assets dependence, Working Paper, cond-mat/0111310.
Malevergne, Y. and D. Sornette, 2002a, Minimising extremes, Risk 15(11), 129-133.
Malevergne, Y. and D. Sornette, 2002b, How to account for extreme co-movements between individual stocks and the market, Working Paper.
Malevergne, Y. and D. Sornette, 2002c, Multi-moments method for portfolio management: Generalized capital asset pricing model in homogeneous and heterogeneous markets, Working Paper, http://papers.ssrn.com/paper.taf?abstract_id=319544.
Mantegna, R.N. and H.E. Stanley, 1995, Scaling behaviour of an economic index, Nature 376, 46-55.
Markovitz, H., 1959, Portfolio Selection: Efficient Diversification of Investments (John Wiley and Sons, New York).
Mashal, R. and A. Zeevi, 2002, Beyond correlation: Extreme co-movements between financial assets, Working Paper, Columbia University, preprint at www.columbia.edu/~rm586.
Mossin, J., 1966, Equilibrium in a capital asset market, Econometrica 35, 768-783.
Nelsen, R.B., 1998, An Introduction to Copulas, Lecture Notes in Statistics 139 (Springer-Verlag, New York).
Rao, C.R., 1973, Linear Statistical Inference and its Applications, 2nd ed. (Wiley, New York).
Rockafellar, R.T. and S. Uryasev, 2000, Optimization of the conditional value-at-risk, Journal of Risk 2, 21-41.
Sharpe, W., 1964, Capital asset prices: a theory of market equilibrium under conditions of risk, Journal of Finance 19, 425-442.
Sornette, D., 1998, Large deviations and portfolio optimization, Physica A 256, 251-283.
Sornette, D., 2000, Critical Phenomena in Natural Sciences: Chaos, Fractals, Self-organization and Disorder: Concepts and Tools (Springer Series in Synergetics).
Sornette, D., J.V. Andersen and P. Simonetti, 2000a, Portfolio theory for "fat tails", International Journal of Theoretical and Applied Finance 3, 523-535.
Sornette, D., P. Simonetti and J.V. Andersen, 2000b, φ^q-field theory for portfolio optimization: "fat-tails" and non-linear correlations, Physics Reports 335, 19-92.
Szergö, G., 1999, A critique to Basel regulation, or how to enhance (im)moral hazards, in Proceedings of the International Conference on Risk Management and Regulation in Banking, Bank of Israel (Kluwer).
Tasche, D. and L. Tibiletti, 2001, Approximations for the Value-at-Risk approach to risk-return, Working Paper.
[Figure: double-logarithmic plot, Gaussian Returns (ordinate) versus Raw Returns (abscissa) for the Standard & Poor's 500; the fitted straight line has slope c/2 = 0.73.]

Figure 1: Graph of the Gaussianized Standard & Poor's 500 index returns versus its raw returns, during the time interval from January 03, 1995 to December 29, 2000, for the negative tail of the distribution.
[…] the largest fluctuations do not contribute significantly to this expectation. To increase their contributions, and in this way to account for the largest fluctuations, it is natural to invoke higher-order moments of order n > 2. The larger n is, the larger is the contribution of the rare and large returns in the tail of the pdf. This phenomenon is demonstrated in figure 1, where we can observe the evolution of the quantity x^n · P(x) for n = 1, 2 and 4, where P(x), in this example, is the standard exponential distribution e^{−x}. The expectation E[X^n] is then simply represented geometrically as the area below the curve x^n · P(x). These curves provide an intuitive illustration of the fact that the main contributions to the moment E[X^n] of order n come from values of X in the vicinity of the maximum of x^n · P(x), which increases fast with the order n of the moment we consider, all the more so the fatter the tail of the pdf of the returns X. For the exponential distribution chosen to construct figure 1, the value of x corresponding to the maximum of x^n · P(x) is exactly equal to n. Thus, increasing the order of the moment allows one to sample larger fluctuations of the asset prices.
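A quick numerical sketch of the last statement (the maximum of $x^n P(x)$ for the standard exponential density sits exactly at $x = n$; the grid step below is an arbitrary choice):

```python
import math

# For the standard exponential density P(x) = exp(-x), the integrand of
# the n-th moment, x**n * P(x), is maximized at exactly x = n
# (set the derivative n*x**(n-1)*e^{-x} - x**n*e^{-x} to zero).
def moment_integrand(x, n):
    return x**n * math.exp(-x)

for n in (1, 2, 4):
    xs = [k * 0.001 for k in range(1, 20000)]
    x_max = max(xs, key=lambda x: moment_integrand(x, n))
    assert abs(x_max - n) < 0.001
```

The argmax moving to the right with n is the geometric picture behind sampling ever larger fluctuations with higher-order moments.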
2.2 Quantifying the fluctuations of an asset
Let us now examine the properties that coherent measures of risk adapted to the portfolio problem must satisfy in order to best quantify asset price fluctuations. Let us consider an asset denoted X, and let G be the set of all the risky assets available on the market. Its profit-and-loss distribution is the distribution of δX = X(τ) − X(0), while the return distribution is given by the distribution of (X(τ) − X(0))/X(0). The risk measures will be defined for the profit-and-loss distributions and then shown to be equivalent to another definition applied to the return distribution. Our first requirement is that the risk measure ρ(·), which is a functional on G, should always remain positive:

AXIOM 1  ∀X ∈ G, ρ(δX) ≥ 0, where the equality holds if and only if X is certain.

Let us now add to this asset a given amount a invested in the risk-free asset, whose return is µ0 (with therefore no randomness in its price trajectory), and define the new asset Y = X + a. Since a is non-random, the fluctuations of X and Y are the same. Thus, it is desirable that ρ enjoys the property of translational invariance, whatever the asset X and the non-random coefficient a may be:

AXIOM 2  ∀X ∈ G, ∀a ∈ R, ρ(δX + µ · a) = ρ(δX).
We also require that our risk measure increases with the quantity of assets held in the portfolio. A priori, one could expect the risk of a position to be proportional to its size: the fluctuations associated with the variable 2·X are naturally twice as large as the fluctuations of X. This is true as long as a large position can be liquidated as easily as a smaller one, which does not hold in general, due to the limited liquidity of real markets. A large position in a given asset is thus more risky than the sum of the risks associated with the many smaller positions which add up to it. To account for this point, we assume that ρ depends on the size of the position in the same manner for all assets. This assumption is slightly restrictive but not unrealistic for companies with comparable properties in terms of market capitalization or sector of activity. This requirement reads:

AXIOM 3  ∀X ∈ G, ∀λ ∈ R+, ρ(λ · δX) = f(λ) · ρ(δX),

14. Gestion de Portefeuilles multimoments et équilibre de marché

where the function f : R+ → R+ is increasing and convex to account for liquidity risk. In fact, it is straightforward to show¹ that the only functions satisfying this axiom are the functions fα(λ) = λ^α with α ≥ 1, so that axiom 3 can be reformulated in terms of positive homogeneity of degree α:

AXIOM 4  ∀X ∈ G, ∀λ ∈ R+, ρ(λ · δX) = λ^α · ρ(δX).  (6)
Note that the case of liquid markets is recovered by α = 1, for which the risk is directly proportional to the size of the position. These axioms, which define our risk measures for profit and loss, can easily be extended to the returns of the assets. Indeed, the return is nothing but the profit and loss divided by the initial value X(0) of the asset. One can thus easily check that the risk defined on the profit-and-loss distribution is X(0)^α times the risk defined on the return distribution. In the sequel, we will only consider this latter definition and, to simplify the notation, since we will only consider the returns and not the profit and loss, the notation X will be used to denote the asset and its return as well. We can remark that the risk measures ρ enjoying the two properties defined by axioms 2 and 4 are known as the semi-invariants of the distribution of the profit and loss / returns of X (see (Stuart and Ord 1994, pp. 86-87)). Among the large family of semi-invariants, we can cite the well-known centered moments and cumulants of X.
2.3 Examples

The set of risk measures obeying axioms 1-4 is huge, since it includes all the homogeneous functionals of (X − E[X]), for instance. The centered moments (or moments about the mean) and the cumulants are two well-known classes of semi-invariants. A given value of α can then be seen as nothing but a specific choice of the order n of the centered moments or of the cumulants. In this case, our risk measure defined via these semi-invariants fulfills the two following conditions:

ρ(X + µ) = ρ(X),  (7)
ρ(λ · X) = λ^n · ρ(X).  (8)
In order to satisfy the positivity condition (axiom 1), we need to restrict the set of values taken by n. By construction, the centered moments of even order are always positive, while the odd-order centered moments can be negative. Thus, only the even-order centered moments are acceptable risk measures. The situation is not so clear for the cumulants, since the even-order cumulants, as well as the odd-order ones, can be negative. In full generality, only the centered moments provide reasonable risk measures satisfying our axioms. However, for a large class of distributions, even-order cumulants remain positive, especially for fat-tailed distributions (even though there are simple but somewhat artificial counter-examples). Therefore, cumulants of even order can be useful risk measures when restricted to these distributions. Indeed, the cumulants enjoy a property which can be considered a natural requirement for a risk measure: it can be desirable that the risk associated with a portfolio made of independent assets be exactly the sum of the risks associated with each individual asset. Thus, given N independent assets {X₁, ···, X_N} and the portfolio S_N = X₁ + ··· + X_N, we wish to have

ρ_n(S_N) = ρ_n(X₁) + ··· + ρ_n(X_N).  (9)

¹Using the trick ρ(λ₁λ₂ · δX) = f(λ₁) · ρ(λ₂ · δX) = f(λ₁) · f(λ₂) · ρ(δX) = f(λ₁ · λ₂) · ρ(δX), leading to f(λ₁ · λ₂) = f(λ₁) · f(λ₂). The unique increasing convex solution of this functional equation is fα(λ) = λ^α with α ≥ 1.
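The additivity property (9) can be checked exactly on a toy example (the discrete distributions below are illustrative, not from the text): for independent variables, fourth-order cumulants add across the convolution, while fourth centered moments do not:

```python
# Toy check of property (9): distributions are dicts {outcome: probability};
# the sum of independent variables is their exact convolution.
def convolve(p, q):
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, 0.0) + px * qy
    return out

def central_moment(p, n):
    m = sum(x * px for x, px in p.items())
    return sum((x - m)**n * px for x, px in p.items())

def cumulant4(p):
    # fourth cumulant: k4 = m4 - 3 * m2**2 (centered moments)
    return central_moment(p, 4) - 3 * central_moment(p, 2)**2

p = {-1.0: 0.5, 1.0: 0.5}
q = {-2.0: 0.2, 0.0: 0.6, 2.0: 0.2}
s = convolve(p, q)

# cumulants add exactly; fourth centered moments do not
assert abs(cumulant4(s) - (cumulant4(p) + cumulant4(q))) < 1e-12
assert abs(central_moment(s, 4) - (central_moment(p, 4) + central_moment(q, 4))) > 1e-3
```

On this example $k_4(p) = -2$ and $k_4(q) = -1.28$, negative values that also illustrate why even-order cumulants can fail the positivity axiom for some (thin-tailed) distributions.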
This property is verified for all cumulants but is not true for centered moments. In addition, as seen from their definition in terms of the characteristic function (63), cumulants of order larger than 2 quantify deviations from the Gaussian law, and thus large risks beyond the variance (equal to the second-order cumulant). Thus, centered moments of even order possess all the minimal properties required for a suitable portfolio risk measure. Cumulants fulfill these requirements only for well-behaved distributions, but have an additional advantage compared to the centered moments, namely that they fulfill condition (9). For these reasons, we shall consider below both the centered moments and the cumulants.

In fact, we can be more general. Indeed, as we have written, the centered moments or the cumulants of order n are homogeneous functions of order n, and due to the positivity requirement, we have to restrict ourselves to even-order centered moments and cumulants. Thus, only homogeneous functions of order 2n can be considered. Actually, this restrictive constraint can be relaxed by recalling that, given any homogeneous function f(·) of order p, the function f(·)^q is also homogeneous of order p · q. This allows us to decouple the order of the moments to consider, which quantifies the impact of the large fluctuations, from the influence of the size of the positions held, measured by the degree of homogeneity of ρ. Thus, considering any even-order centered moment, we can build a risk measure ρ(X) = E[(X − E[X])^{2n}]^{α/2n}, which accounts for the fluctuations measured by the centered moment of order 2n but with a degree of homogeneity equal to α.

A further generalization is possible to odd-order moments. Indeed, the absolute centered moments satisfy our three axioms for any odd or even order. We can go one step further and use non-integer order absolute centered moments, and define the more general risk measure
ρ(X) = E[|X − E[X]|^γ]^{α/γ},   (10)
where γ denotes any positive real number. This set of risk measures is very interesting since, due to the Minkowski inequality, they are convex for any α and γ larger than 1:
ρ(u · X + (1 − u) · Y) ≤ u · ρ(X) + (1 − u) · ρ(Y),   (11)
which ensures that aggregating two risky assets leads to diversifying their risk. In fact, in the special case γ = 1, these measures enjoy the stronger sub-additivity property.

Finally, we should stress that any discrete or continuous (positive) sum of these risk measures, with the same degree of homogeneity, is again a risk measure. This allows us to define “spectral measures of fluctuations” in the same spirit as in (Acerbi 2002):
ρ(X) = ∫ dγ φ(γ) E[|X − E[X]|^γ]^{α/γ},   (12)
where φ is a positive real-valued function defined on any subinterval of [1, ∞) such that the integral in (12) remains finite. It is interesting to restrict oneself to the functions φ whose integral sums up to one: ∫ dγ φ(γ) = 1, which is always possible, up to a renormalization. Indeed, in such a case, φ(γ) represents the relative weight attributed to the fluctuations measured by a given moment order. Thus, the function φ can be considered as a measure of the risk aversion of the risk manager with respect to the large fluctuations.

Let us stress that the variance, originally used in (Markovitz 1959)’s portfolio theory, is nothing but the second centered moment, also equal to the second-order cumulant (the first three cumulants and centered moments are equal). Therefore, a portfolio theory based on the centered moments or on the cumulants automatically contains (Markovitz 1959)’s theory as a special case, and thus offers a natural generalization, encompassing large risks, of this masterpiece of financial science. It also embodies several other generalizations where homogeneous measures of risk are considered, as for instance in (Hwang and Satchell 1999).
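As a concrete illustration of the risk measures (10)-(11), the following sketch (hypothetical heavy-tailed data; NumPy) estimates ρ(X) = E[|X − E[X]|^γ]^{α/γ} on a sample and verifies both the degree-α homogeneity and the convexity property (11):

```python
import numpy as np

def rho(x, gamma=4.0, alpha=2.0):
    # Risk measure of eq. (10): E[|X - E[X]|^gamma]^(alpha/gamma)
    return np.mean(np.abs(x - x.mean()) ** gamma) ** (alpha / gamma)

rng = np.random.default_rng(0)
x = rng.standard_t(df=5, size=100_000)  # heavy-tailed "returns" (illustrative)
y = rng.standard_t(df=5, size=100_000)

# Positive homogeneity of degree alpha: rho(lambda*X) = lambda^alpha * rho(X)
lam = 3.0
assert np.isclose(rho(lam * x), lam ** 2.0 * rho(x))

# Convexity (11), a consequence of the Minkowski inequality for gamma >= 1
u = 0.4
assert rho(u * x + (1 - u) * y) <= u * rho(x) + (1 - u) * rho(y) + 1e-12
```

Both checks hold exactly for the empirical measure, not just asymptotically, since Minkowski's inequality applies sample by sample.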
14. Gestion de Portefeuilles multimoments et équilibre de marché

3 The generalized efficient frontier and some of its properties
We now address the problem of portfolio selection and optimization, based on the risk measures introduced in the previous section. As we have already seen, there is a large choice of relevant risk measures from which the portfolio manager is free to choose as a function of his own aversion to small versus large risks. A strong aversion to large risks will lead him to choose a risk measure which puts the emphasis on the large fluctuations. The simplest examples of such risk measures are provided by the high-order centered moments or cumulants. Obviously, the utility function of the fund manager plays a central role in his choice of the risk measure. The relation between the central moments and the utility function has already been underlined by several authors such as (Rubinstein 1973) or (Jurczenko and Maillet 2002), who have shown that an economic agent with a quartic utility function is naturally sensitive to the first four moments of his expected wealth distribution. But, as stressed before, we do not wish to consider the expected utility formalism, since our goal in this paper is not to study the underlying behavior leading to the choice of any risk measure.

The choice of the risk measure also depends upon the time horizon of investment. Indeed, as the time scale increases, the distribution of asset returns progressively converges to the Gaussian pdf, so that only the variance remains relevant for very long-term investment horizons. However, for shorter time horizons, say for portfolios rebalanced at weekly, daily or intra-day time scales, choosing a risk measure putting the emphasis on the large fluctuations, such as the centered moments µ6 or µ8 or the cumulants C6 or C8 (or of larger orders), may be necessary to account for the “wild” price fluctuations usually observed for such short time scales.
Our present approach uses a single time scale over which the returns are estimated, and is thus restricted to portfolio selection with a fixed investment horizon. Extensions to portfolio analysis and optimization in terms of high-order moments and cumulants performed simultaneously over different time scales can be found in (Muzy et al. 2001).
3.1 Efficient frontier without risk-free asset
Let us consider N risky assets, denoted by X1, · · · , XN. Our goal is to find the best possible allocation, given a set of constraints. The portfolio optimization generalizing the approach of (Sornette et al. 2000a, Andersen and Sornette 2001) corresponds to accounting for large fluctuations of the assets through the risk measures introduced above, in the presence of a constraint on the return as well as the “no short sales” constraint:

inf_{wi ∈ [0,1]} ρα({wi})
Σ_{i≥1} wi = 1
Σ_{i≥1} wi µ(i) = µ,   (13)
wi ≥ 0, ∀i > 0,

where wi is the weight of Xi and µ(i) its expected return. In all the sequel, the subscript α in ρα will refer to the degree of homogeneity of the risk measure. This problem cannot be solved analytically (except in Markovitz’s case where the risk measure is given by the variance). We need to perform numerical calculations to obtain the shape of the efficient frontier. Nonetheless, when ρα denotes the centered moments or any convex risk measure, we can assert that this optimization problem is a convex optimization problem and that it admits one and only one solution, which can easily be determined by standard numerical relaxation or gradient methods. As an example, we have represented in figure 2 the mean-ρα efficient frontier for a portfolio made of seventeen assets (see appendix A for details) in the plane (µ, ρα^{1/α}), where ρα represents the centered moments
µ_{n=α} of order n = 2, 4, 6 and 8. The efficient frontier is concave, as expected from the nature of the optimization problem (13). For a given value of the expected return µ, we observe that the amount of risk measured by µn^{1/n} increases with n, so that there is an additional price to pay for earning more: not only does the µ2-risk increase, as usual according to Markowitz’s theory, but the large risks increase faster, the more so, the larger n is. This means that, in this example, the large risks increase more rapidly than the small risks as the required return increases. This is an important empirical result that has obvious implications for portfolio selection and risk assessment. For instance, let us consider an efficient portfolio whose expected (daily) return equals 0.12%, which gives an annualized return equal to 30%. We can see in table 1 that the typical fluctuations around the expected return are about twice as large when measured by µ6 compared with µ2, and that they are 1.5 times larger when measured with µ8 compared with µ4.
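As a concrete sketch of how problem (13) can be attacked numerically (this is not the thesis's own code; solver choice, data and names below are illustrative), one can feed an even centered moment of the portfolio return to a standard constrained optimizer such as SciPy's SLSQP:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
N = 5
# Hypothetical heavy-tailed returns with different expected returns mu(i)
returns = rng.standard_t(df=4, size=(2000, N)) + rng.uniform(0.0, 0.5, size=N)
mu = returns.mean(axis=0)
mu_target = mu.mean()  # imposed expected portfolio return (feasible by construction)

def rho_alpha(w, n=2):
    # rho_alpha = centered moment of order 2n of the portfolio return
    p = returns @ w
    return np.mean((p - p.mean()) ** (2 * n))

res = minimize(
    rho_alpha,
    x0=np.full(N, 1.0 / N),                 # feasible starting point
    method="SLSQP",
    bounds=[(0.0, 1.0)] * N,                # no short sales
    constraints=[
        {"type": "eq", "fun": lambda w: w.sum() - 1.0},       # budget
        {"type": "eq", "fun": lambda w: w @ mu - mu_target},  # return target
    ],
)
w_opt = res.x
```

Because the even centered moment of a linear combination is convex in the weights, the minimum found this way is the unique global one; sweeping `mu_target` traces out the efficient frontier.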
3.2 Efficient frontier with a risk-free asset
Let us now assume the existence of a risk-free asset X0. The optimization problem with the same set of constraints as previously can be written as:

inf_{wi ∈ [0,1]} ρα({wi})
Σ_{i≥0} wi = 1
Σ_{i≥0} wi µ(i) = µ,   (14)
wi ≥ 0, ∀i > 0.

This optimization problem can be solved exactly. Indeed, due to the existence of a risk-free asset, the normalization condition Σ wi = 1 is not constraining, since one can always adjust, by lending or borrowing money, the fraction w0 to a value satisfying the normalization condition. Thus, as shown in appendix B, the efficient frontier is a straight line in the plane (µ, ρα^{1/α}), with positive slope and whose intercept is given by the value of the risk-free interest rate:
µ = µ0 + ξ · ρα^{1/α},   (15)
where ξ is a coefficient given explicitly below. This result is very natural when ρα denotes the variance, since it is then nothing but (Markovitz 1959)’s result. But in addition, it shows that the mean-variance result can be generalized to every mean-ρα optimal portfolio.

We present in figure 3 the results given by numerical simulations. The set of assets is the same as before and the risk-free interest rate has been set to 5% a year. The optimization procedure has been performed using a genetic algorithm on the risk measures given by the centered moments µ2, µ4, µ6 and µ8. As expected, we observe three increasing straight lines, whose slopes monotonically decay with the order of the centered moment under consideration. Below, we will discuss this property in greater detail.
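The linearity (15) follows directly from the homogeneity of ρα: mixing a fraction w0 of the risk-free asset with a fixed risky fund Π scales ρα^{1/α} and the excess return by the same factor (1 − w0). A quick numerical check with hypothetical numbers:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical daily returns of the optimal risky fund Pi, and a risk-free rate mu0
pi = rng.standard_t(df=4, size=50_000) * 0.01 + 6e-4
mu0 = 2e-4
alpha = 4  # degree of homogeneity of rho_alpha = mu_4 (fourth centered moment)

slopes = []
for w0 in (0.0, 0.25, 0.5, 0.75):            # fraction held in the risk-free asset
    port = w0 * mu0 + (1 - w0) * pi          # mixed portfolio return
    rho = np.mean((port - port.mean()) ** alpha)
    slopes.append((port.mean() - mu0) / rho ** (1 / alpha))

# Every mix has the same slope xi: mu = mu0 + xi * rho^(1/alpha) is a straight line
assert np.allclose(slopes, slopes[0])
```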
3.3 Two-fund separation theorem
The two-fund separation theorem is a well-known result associated with the mean-variance efficient portfolios. It results from the concavity of Markovitz’s efficient frontier for portfolios made of risky assets only. It states that, if the investors can choose between a set of risky assets and a risk-free asset, they invest a fraction w0 of their wealth in the risk-free asset and the fraction 1 − w0 in a portfolio composed only of risky assets. This risky portfolio is the same for all the investors, and the fraction w0 of wealth invested in the risk-free asset depends on the risk aversion of the investor or on the amount of economic capital an institution must keep aside due to the legal requirements insuring its solvency at a given confidence level. We shall see that this result can be generalized to any mean-ρα efficient portfolio.
Indeed, it can be shown (see appendix B) that the weights of the optimal portfolios that are solutions of (14) are given by:
w0* = w0,   (16)
wi* = (1 − w0) · w̃i,  i ≥ 1,   (17)
where the w̃i’s are constants such that Σ w̃i = 1 and whose expressions are given in appendix B. Thus, denoting by Π the portfolio only made of risky assets whose weights are the w̃i’s, the optimal portfolios are the linear combinations of the risk-free asset, with weight w0, and of the portfolio Π, with weight 1 − w0. This result generalizes the mean-variance two-fund theorem to any mean-ρα efficient portfolio.

To check this prediction numerically, figure 4 represents the five largest weights of assets in the portfolios previously investigated as a function of the weight of the risk-free asset, for the four risk measures given by the centered moments µ2, µ4, µ6 and µ8. One can observe decaying straight lines that intercept the horizontal axis at w0 = 1, as predicted by equations (16-17). In figure 2, the straight lines representing the efficient portfolios with a risk-free asset are also represented. They are tangent to the efficient frontiers without risk-free asset. This is natural since the efficient portfolios with the risk-free asset are the weighted sum of the risk-free asset and the optimal portfolio Π only made of risky assets. Since Π also belongs to the efficient frontier without risk-free asset, the optimum is reached when the straight line describing the efficient frontier with a risk-free asset and the (concave) curve of the efficient frontier without risk-free asset are tangent.
3.4 Influence of the risk-free interest rate
Figure 3 has shown that the slope of the efficient frontier (with a risk-free asset) decreases when the order n of the centered moment used to measure risk increases. This is an important qualitative property of the risk measures offered by the centered moments, as it means that higher and higher large risks are sampled under increasing imposed return.

Is it possible that the largest risks captured by the high-order centered moments could increase at a slower rate than the small risks embodied in the low-order centered cumulants? For instance, is it possible for the slope of the mean-µ6 efficient frontier to be larger than the slope of the mean-µ4 frontier? This is an important question as it conditions the relative costs in terms of the panel of risks under increasing specified returns. To address this question, consider figure 2. Changing the value of the risk-free interest rate amounts to moving the intercept of the straight lines along the ordinate axis so as to keep them tangent to the efficient frontiers without risk-free asset. Therefore, it is easy to see that, in the situation depicted in figure 2, the slopes of the four straight lines will always decay with the order of the centered moment. In order to observe an inversion in the order of the slopes, it is necessary and sufficient that the efficient frontiers without risk-free asset cross each other. This assertion can be checked by visual inspection of figure 5. Can we observe such crossing of efficient frontiers? In the most general case of risk measure, nothing forbids this occurrence. Nonetheless, we think that this kind of behavior is not realistic in a financial context since, as said above, it would mean that the large risks could increase at a slower rate than the small risks, implying an irrational behavior of the economic agents.
4 Classification of the assets and of portfolios
Let us consider two assets or portfolios X1 and X2 with different expected returns µ(1), µ(2) and different levels of risk measured by ρα(X1) and ρα(X2). An important question is then to be able to compare
these two assets or portfolios. The most general way to perform such a comparison is to refer to decision theory and to calculate the utility of each of them. But, as already said, the utility function of an agent is generally not known, so that other approaches have to be developed. The simplest solution is to consider that the couple (expected return, risk measure) fully characterizes the behavior of the economic agent and thus provides a sufficiently good approximation of her utility function.

In (Markovitz 1959)’s world for instance, the preferences of the agents are summarized by the first two moments of the distribution of asset returns. Thus, as shown by (Sharpe 1966, Sharpe 1994), a simple way to synthesize these two parameters, in order to get a measure of the performance of the assets or portfolios, is to build the ratio of the expected return µ (minus the risk-free interest rate) over the standard deviation σ:
S = (µ − µ0) / σ,   (18)
which is the so-called Sharpe ratio and simply represents the amount of expected return per unit of risk, measured by the standard deviation. It is an increasing function of the expected return and a decreasing function of the level of risk, which is natural for risk-averse or prudent agents.
4.1 The risk-adjustment approach
This approach can be generalized to any type of risk measure (see (Dowd 2000), for instance) and thus allows for the comparison of assets whose risks are not well accounted for by the variance (or the standard deviation). Indeed, instead of considering the variance, which only accounts for the small risks, one can build the ratio of the expected return over any risk measure. In fact, looking at equation (113) in appendix B, the expression
(µ − µ0) / ρα(X)^{1/α}   (19)
naturally arises and is constant for every efficient portfolio. In this expression, α denotes the coefficient of homogeneity of the risk measure. It is nothing but a simple generalization of the usual Sharpe ratio. Indeed, when ρα is given by the variance σ², the expression above recovers the Sharpe ratio. Thus, once the portfolio manager has chosen his measure of fluctuations ρα, he can build a consistent risk-adjusted performance measure, as shown by (19).

As just said, these generalized Sharpe ratios are constant for every efficient portfolio. In fact, they are not only constant but also maximal for every efficient portfolio, so that looking for the portfolio with maximum generalized Sharpe ratio yields the same optimal portfolios as those found with the whole optimization program solved in the previous section.

As an illustration, table 2 gives the risk-adjusted performance of the set of seventeen assets already studied, for several risk measures. We have considered the first three even-order centered moments (columns 2 to 4) and the first three even-order cumulants (columns 2, 5 and 6) as fluctuation measures. Obviously the second-order centered moment and the second-order cumulant are the same, and give again the usual Sharpe ratio (18). The assets have been sorted with respect to their Sharpe ratio. The first point to note is that the rank of an asset in terms of risk-adjusted performance strongly depends on the risk measure under consideration.
The case of MCI Worldcom is very striking in this respect. Indeed, according to the usual Sharpe ratio, it appears in the 12th position with a value larger than 0.04, while according to the other measures it is the last asset of our selection with a value lower than 0.02. The second interesting point is that, for a given asset, the generalized Sharpe ratio is always a decreasing function of the order of the considered centered moment. This is not particular to our set of assets, since we
can prove that, ∀p > q,
(E[|X|^p])^{1/p} ≥ (E[|X|^q])^{1/q},   (20)
so that
(µ − µ0) / (E[|X|^p])^{1/p} ≤ (µ − µ0) / (E[|X|^q])^{1/q}.   (21)
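Both inequalities are straightforward to verify on any sample, since (20) is Jensen's (Lyapunov's) inequality and holds for the empirical measure as well. A sketch with hypothetical numbers:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_t(df=5, size=100_000)  # hypothetical return fluctuations
mu_excess = 0.05                        # assumed mu - mu0, purely illustrative

def norm_p(x, p):
    # (E[|X|^p])^(1/p): non-decreasing in p by Jensen's inequality, eq. (20)
    return np.mean(np.abs(x) ** p) ** (1.0 / p)

norms = [norm_p(x, p) for p in (2, 4, 6, 8)]
assert all(a <= b for a, b in zip(norms, norms[1:]))      # eq. (20)

ratios = [mu_excess / n for n in norms]
assert all(a >= b for a, b in zip(ratios, ratios[1:]))    # eq. (21): ratio decreases
```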
On the contrary, when the cumulants are used as risk measures, the generalized Sharpe ratios are not monotonically decreasing, as exhibited by Procter & Gamble for instance. This can be surprising in view of our previous remark that the larger the order of the moments involved in a risk measure, the larger the fluctuations it accounts for. Extrapolating this property to cumulants, it would mean that Procter & Gamble presents fewer large risks according to C6 than according to C4, while according to the centered moments, the reverse evolution is observed.

Thus, the question of the coherence of the cumulants as measures of fluctuations may arise. And if we accept that such measures are coherent, what are the implications for the preferences of the agents employing such measures? To answer this question, it is informative to express the cumulants as a function of the moments. For instance, let us consider the fourth-order cumulant
C4 = µ4 − 3 · µ2²,   (22)
   = µ4 − 3 · C2².   (23)
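Relation (22) is easy to check on data. The sketch below (simulated Student-t returns, purely illustrative) computes C4 from the centered moments and cross-checks it against SciPy's excess-kurtosis estimator, since C4/µ2² is precisely the excess kurtosis:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(4)
x = rng.standard_t(df=8, size=200_000)  # fat-tailed sample (illustrative)

xc = x - x.mean()
mu2 = np.mean(xc ** 2)
mu4 = np.mean(xc ** 4)
c4 = mu4 - 3.0 * mu2 ** 2  # eq. (22): C4 grows with mu4 but decreases with mu2

assert c4 > 0  # positive excess kurtosis: fatter tails than the Gaussian
# Cross-check: c4 / mu2^2 equals the (biased) excess kurtosis computed by SciPy
assert np.isclose(c4 / mu2 ** 2, kurtosis(x, fisher=True), rtol=1e-8)
```

The sign structure of (22) is exactly the trade-off discussed below: C4 rewards a small variance even at fixed µ4.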
An agent assessing the fluctuations of an asset with respect to C4 presents aversion for the fluctuations quantified by the fourth central moment µ4 – since C4 increases with µ4 – but is attracted by the fluctuations measured by the variance – since C4 decreases with µ2. This behavior is not irrational since it remains globally risk-averse. Indeed, it depicts an agent who tries to avoid the larger risks but is ready to accept the smaller ones. This kind of behavior is characteristic of any agent using the cumulants as risk measures. It thus allows us to understand why Procter & Gamble is more attractive for an agent sensitive to C6 than for an agent sensitive to C4. From the expression of C6, we remark that the agent sensitive to this cumulant is risk-averse with respect to the fluctuations measured by µ6 and µ2 but is a risk-seeker with respect to the fluctuations measured by µ4 and µ3. Then, in this particular case, the latter compensate the former. It also allows us to understand from a behavioral standpoint why it is possible to “have your cake and eat it too” in the sense of (Andersen and Sornette 2001), that is, why, when the cumulants are chosen as risk measures, it may be possible to increase the expected return of a portfolio while lowering its large risks, or in other words, why its generalized Sharpe ratio may increase when one considers larger cumulants to measure its risks. We will discuss this point again in section 9.
4.2 Marginal risk of an asset within a portfolio
Another important question that arises is the contribution of a given asset to the risk of the whole portfolio. Indeed, it is crucial to know whether the risk is homogeneously shared by all the assets of the portfolio or if it is only held by a few of them. The quality of the diversification is then at stake. Moreover, this also allows for the sensitivity analysis of the risk of the portfolio with respect to small changes in its composition², which is of practical interest since it can prevent us from recalculating the whole risk of the portfolio after a small re-adjustment of its composition.

² See (Gouriéroux et al. 2000, Scaillet 2000) for a sensitivity analysis of the Value-at-Risk and the expected shortfall.
Due to the homogeneity property of the fluctuation measures and to Euler’s theorem for homogeneous functions, we can write that
ρ({w1, · · · , wN}) = (1/α) Σ_{i=1}^{N} wi · ∂ρ/∂wi,   (24)
provided the risk measure ρ is differentiable, which will be assumed in all the sequel. In this expression, the coefficient α again denotes the degree of homogeneity of the risk measure ρ.

This relation simply shows that the amount of risk brought by one unit of the asset i in the portfolio is given by the first derivative of the risk of the portfolio with respect to the weight wi of this asset. Thus, α⁻¹ · ∂ρ/∂wi represents the marginal amount of risk of asset i in the portfolio. It is then easy to check that, in a portfolio with minimum risk, irrespective of the expected return, the weight of each asset is such that the marginal risks of the assets in the portfolio are equal.
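Euler's relation (24) is easy to verify numerically. In the sketch below (hypothetical data), ρ is the fourth centered moment of the portfolio return, homogeneous of degree α = 4 in the weights, and the marginal risks ∂ρ/∂wi are obtained by central finite differences:

```python
import numpy as np

rng = np.random.default_rng(5)
R = rng.standard_t(df=6, size=(5000, 4))     # hypothetical returns of 4 assets
w = np.array([0.4, 0.3, 0.2, 0.1])
alpha = 4                                    # rho = mu_4, homogeneous of degree 4

def rho(w):
    p = R @ w
    return np.mean((p - p.mean()) ** alpha)

# Marginal risks d(rho)/dw_i by central finite differences
eps = 1e-6
grad = np.array([(rho(w + eps * e) - rho(w - eps * e)) / (2 * eps)
                 for e in np.eye(4)])

# Euler's theorem, eq. (24): sum_i w_i * d(rho)/dw_i = alpha * rho(w)
assert np.isclose(w @ grad, alpha * rho(w), rtol=1e-4)
```

Dividing `grad` by α gives the marginal risk of each asset; at the minimum-risk portfolio these entries would all coincide.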
5 A new equilibrium model for asset prices
Using the portfolio selection method explained in the two previous sections, we now present an equilibrium model generalizing the original Capital Asset Pricing Model developed by (Sharpe 1964, Lintner 1965, Mossin 1966). Many generalizations have already been proposed to account for the fat-tailness of the asset return distributions, which led to the multi-moments CAPM. For instance, (Rubinstein 1973) and (Kraus and Litzenberger 1976) or (Lim 1989) and (Harvey and Siddique 2000) have underlined and tested the role of the asymmetry in the risk premium by accounting for the skewness of the distribution of returns. More recently, (Fang and Lai 1997) and (Hwang and Satchell 1999) have introduced a four-moments CAPM to take into account the leptokurtic behavior of the asset return distributions. Many other extensions have been presented, such as the VaR-CAPM (see (Alexander and Baptista 2002)) or the Distributional-CAPM by (Polimenis 2002). All these generalizations become more and more complicated and do not necessarily provide more accurate predictions of the expected returns.

Here, we will assume that the relevant risk measure is given by any measure of fluctuations previously presented that obeys the axioms I-IV of section 2. We will also relax the usual assumption of a homogeneous market to give the economic agents the choice of their own risk measure: some of them may choose a risk measure which puts the emphasis on the small fluctuations, while others may prefer those which account for the large ones. We will show that, in such a heterogeneous market, an equilibrium can still be reached and that the excess returns of individual stocks remain proportional to the market excess return.

For this, we need the following assumptions about the market:

• H1: We consider a one-period market, such that all the positions held at the beginning of a period are cleared at the end of the same period.
• H2: The market is perfect, i.e., there are no transaction costs or taxes, the market is efficient and the investors can lend and borrow at the same risk-free rate µ0.

We will now add another assumption that specifies the behavior of the agents acting on the market, which will lead us to make the distinction between homogeneous and heterogeneous markets.
5.1 Equilibrium in a homogeneous market
The market is said to be homogeneous if all the agents acting on this market aim at fulfilling the same objective. This means that:

• H3-1: all the agents want to maximize the expected return of their portfolio at the end of the period under a given constraint of measured risk, using the same measure of risk ρα for all of them.

In the special case where ρα denotes the variance, all the agents follow a Markovitz optimization procedure, which leads to the CAPM equilibrium, as proved by (Sharpe 1964). When ρα represents the centered moments, we will be led to the market equilibrium described by (Rubinstein 1973). Thus, this approach allows for a generalization of the most popular equilibrium-market asset pricing models.

When all the agents have the same risk function ρα, whatever α may be, we can assert that they all have a fraction of their capital invested in the same portfolio Π, whose composition is given in appendix B, and the remainder in the risk-free asset. The amount of capital invested in the risky fund only depends on their risk aversion or on the legal margin requirement they have to fulfil.

Let us now assume that the market is at equilibrium, i.e., supply equals demand. In such a case, since the optimal portfolios can be any linear combinations of the risk-free asset and of the risky portfolio Π, it is straightforward to show (see appendix C) that the market portfolio, made of all traded assets in proportion to their market capitalization, is nothing but the risky portfolio Π. Thus, as shown in appendix D, we can state that, whatever the risk measure ρα chosen by the agents to perform their optimization, the excess return of any asset over the risk-free interest rate is proportional to the excess return of the market portfolio Π over the risk-free interest rate:
µ(i) − µ0 = βα^i · (µΠ − µ0),   (25)
where
βα^i = (1/α) · ∂ ln ρα / ∂wi |_{w1*, · · · , wN*},   (26)

where w1*, · · · , wN* are defined in appendix D. When ρα denotes the variance, we recover the usual β^i given by the mean-variance approach:
β^i = Cov(Xi, Π) / Var(Π).   (27)
Thus, relations (25) and (26) generalize the usual CAPM formula, showing that the specific choice of the risk measure is not very important, as long as it follows the axioms I-IV characterizing the fluctuations of the distribution of asset returns.
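A minimal numerical sketch of the variance case (27), with simulated (hypothetical) returns: each asset's beta is its covariance with the market portfolio divided by the market variance, and the capitalization-weighted average of the betas must equal one.

```python
import numpy as np

rng = np.random.default_rng(6)
T, n_assets = 10_000, 6
R = rng.normal(5e-4, 0.01, size=(T, n_assets))   # hypothetical asset returns
w_mkt = np.full(n_assets, 1.0 / n_assets)        # toy market-cap weights
mkt = R @ w_mkt                                  # market portfolio Pi

# eq. (27): beta_i = Cov(X_i, Pi) / Var(Pi)   (matching ddof in both estimators)
betas = np.array([np.cov(R[:, i], mkt)[0, 1] / np.var(mkt, ddof=1)
                  for i in range(n_assets)])

# Consistency check: the market portfolio has beta one against itself
assert np.isclose(w_mkt @ betas, 1.0)
```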
5.2 Equilibrium in a heterogeneous market
Does this result hold in the more realistic situation of a heterogeneous market? A market will be said to be heterogeneous if the agents seek to fulfill different objectives. We thus consider the following assumption:

• H3-2: There exist N agents. Each agent n is characterized by her choice of a risk measure ρα(n), so that she invests only in the mean-ρα(n) efficient portfolios.

According to this hypothesis, an agent n invests a fraction of her wealth in the risk-free asset and the remainder in Πn, the mean-ρα(n) efficient portfolio only made of risky assets. The fraction of wealth
invested in the risky fund depends on the risk aversion of each agent, which may vary from one agent to another. The composition of the market portfolio for such a heterogeneous market is derived in appendix C. We find that the market portfolio Π is nothing but the weighted sum of the mean-ρα(n) optimal portfolios Πn:
Π = Σ_{n=1}^{N} γn Πn,   (28)
where γn is the fraction of the total wealth invested in the fund Πn by the nth agent.

Appendix D demonstrates that, for every asset i and for any mean-ρα(n) efficient portfolio Πn, for all n, the following equation holds:
µ(i) − µ0 = βn^i · (µΠn − µ0).   (29)
Multiplying these equations by γn/βn^i, we get
(γn/βn^i) · (µ(i) − µ0) = γn · (µΠn − µ0),   (30)
for all n, and summing over the different agents, we obtain
(Σn γn/βn^i) · (µ(i) − µ0) = (Σn γn · µΠn) − µ0,   (31)
so that
µ(i) − µ0 = β^i · (µΠ − µ0),   (32)
with
β^i = (Σn γn/βn^i)^{−1}.   (33)
This allows us to conclude that, even in a heterogeneous market, the expected excess return of each individual stock is directly proportional to the expected excess return of the market portfolio, showing that the homogeneity of the market is not a key property necessary for observing a linear relationship between individual excess asset returns and the market excess return.
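To see the aggregation at work, here is a toy numerical check of eqs. (29)-(33), with made-up wealth fractions γn and fund betas βn^i (none of these numbers come from the thesis):

```python
import numpy as np

# Hypothetical market: three agent classes with wealth fractions gamma_n
gamma = np.array([0.5, 0.3, 0.2])       # sums to one
beta_n = np.array([1.4, 1.1, 0.8])      # beta of asset i w.r.t. each fund Pi_n

beta_i = 1.0 / np.sum(gamma / beta_n)   # eq. (33): harmonic-type aggregation

excess_i = 0.06                          # assumed excess return mu(i) - mu0
excess_funds = excess_i / beta_n         # eq. (29) solved for mu_Pi_n - mu0
excess_mkt = gamma @ excess_funds        # eq. (28): market excess return
assert np.isclose(beta_i * excess_mkt, excess_i)   # recovers eq. (32)
```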
6 Estimation of the joint probability distribution of returns of several assets
A priori, one of the main practical advantages of (Markovitz 1959)’s method and its generalization presented above is that one does not need the multivariate probability distribution function of the asset returns, as the analysis solely relies on the coherent measures ρ(X) defined in section 2, such as the centered moments or the cumulants of all orders, which can in principle be estimated empirically. Unfortunately, this apparent advantage may be an illusion. Indeed, as underlined by (Stuart and Ord 1994) for instance, the error of the empirically estimated moment of order n is proportional to the moment of order 2n, so that the error quickly becomes of the same order as the estimated moment itself. Thus, above n = 6 (or maybe n = 8), it is not reasonable to estimate the moments and/or cumulants directly. The knowledge of the multivariate distribution of asset returns therefore remains necessary. In addition, there is a current of thought providing evidence that the marginal distributions of returns may be regularly varying with index µ in the range 3-4 (Lux 1996, Pagan 1996, Gopikrishnan et al. 1998), suggesting the non-existence of asymptotically defined moments and cumulants of order equal to or larger than µ.
In the standard Gaussian framework, the multivariate distribution takes the form of an exponential of minus a quadratic form X′Ω⁻¹X, where X is the column vector of asset returns and Ω is their covariance matrix. The beauty and simplicity of the Gaussian case is that the essentially impossible task of determining a large multidimensional function is reduced to the much simpler one of calculating the N(N + 1)/2 elements of the symmetric covariance matrix. Risk is then uniquely and completely embodied by the variance of the portfolio return, which is easily determined from the covariance matrix. This is the basis of Markovitz’s portfolio theory (Markovitz 1959) and of the CAPM (see for instance (Merton 1990)). However, as is well known, the variance (volatility) of portfolio returns provides at best a limited quantification of incurred risks, as the empirical distributions of returns have “fat tails” (Lux 1996, Gopikrishnan et al. 1998) and the dependences between assets are only imperfectly accounted for by the covariance matrix (Litterman and Winkelmann 1998).

In this section, we present a novel approach based on (Sornette et al. 2000b) to attack this problem in terms of a parameterization of the multivariate distribution of returns involving two steps: (i) the projection of the empirical marginal distributions onto Gaussian laws via nonlinear mappings; (ii) the use of entropy maximization to construct the corresponding most parsimonious representation of the multivariate distribution.
6.1 A brief exposition and justification of the method
We will use the method of determination of multivariate distributions introduced by (Karlen 1998) and (Sornette et al. 2000b). This method consists of two steps: (i) transform each return x into a Gaussian variable y by a nonlinear, monotonically increasing mapping; (ii) use the principle of entropy maximization to construct the corresponding multivariate distribution of the transformed variables y.

The first concern to address before going any further is whether the nonlinear transformation, which is in principle different for each asset return, conserves the structure of the dependence. In what sense is the dependence between the transformed variables y the same as the dependence between the asset returns x? It turns out that the notion of “copulas” provides a general and rigorous answer which justifies the procedure of (Sornette et al. 2000b).

For completeness and later use, we briefly recall the definition of a copula (for further details about the concept of copula, see (Nelsen 1998)). A function C : [0, 1]^n −→ [0, 1] is an n-copula if it enjoys the following properties:
• ∀u ∈ [0, 1], C(1, · · · , 1, u, 1, · · · , 1) = u,
• ∀ui ∈ [0, 1], C(u1, · · · , un) = 0 if at least one of the ui equals zero,
• C is grounded and n-increasing, i.e., the C-volume of every box whose vertices lie in [0, 1]^n is positive.

Sklar’s theorem then states that, given an n-dimensional distribution function F with continuous marginal distributions F1, · · · , Fn, there exists a unique n-copula C : [0, 1]^n −→ [0, 1] such that:
F(x1, · · · , xn) = C(F1(x1), · · · , Fn(xn)).   (34)
This elegant result shows that the study of the dependence of random variables can be performed independently of the behavior of the marginal distributions. Moreover, the following result shows that copulas are intrinsic measures of dependence. Consider $n$ continuous random variables $X_1, \cdots, X_n$ with copula $C$. Then, if $g_1(X_1), \cdots, g_n(X_n)$ are strictly increasing on the ranges of $X_1, \cdots, X_n$, the random variables $Y_1 = g_1(X_1), \cdots, Y_n = g_n(X_n)$ have exactly the same copula $C$ (Lindskog 2000). The copula is thus invariant under strictly increasing transformations of the variables. This provides a powerful way of studying scale-invariant measures of association. It is also a natural starting point for the construction of multivariate distributions, and provides the theoretical justification of the method of determination of multivariate distributions that we will use in the sequel.
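The invariance of the copula under strictly increasing transformations can be checked directly on a rank statistic such as Kendall's tau, which depends only on the copula. A minimal illustrative sketch (not from the thesis); the transforms $g_1 = \exp$ and $g_2 = x^3$ are arbitrary increasing choices:

```python
import math
import random

def sign(a):
    return (a > 0) - (a < 0)

def kendall_tau(xs, ys):
    """Kendall's tau: average concordance of pairs; a pure rank
    statistic, hence a functional of the copula alone."""
    n = len(xs)
    s = sum(sign(xs[i] - xs[j]) * sign(ys[i] - ys[j])
            for i in range(n) for j in range(i + 1, n))
    return 2.0 * s / (n * (n - 1))

random.seed(0)
# a dependent pair: X2 = X1 + independent noise
x1 = [random.gauss(0, 1) for _ in range(300)]
x2 = [a + random.gauss(0, 1) for a in x1]

tau_before = kendall_tau(x1, x2)
# apply strictly increasing transforms to each coordinate
tau_after = kendall_tau([math.exp(a) for a in x1], [b ** 3 for b in x2])
assert tau_after == tau_before  # ranks, hence the copula, are unchanged
```

The equality is exact (not approximate), because increasing maps preserve the ordering of every pair of observations.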
6.2 Transformation of an arbitrary random variable into a Gaussian variable
Let us consider the return $X$, taken as a random variable characterized by the probability density $p(x)$. The transformation $y(x)$ which obtains a standard normal variable $y$ from $x$ is determined by the conservation of probability:

p(x)\, dx = \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}\, dy .   (35)

Integrating this equation from $-\infty$ to $x$, we obtain

F(x) = \frac{1}{2}\left[1 + \mathrm{erf}\!\left(\frac{y}{\sqrt{2}}\right)\right] ,   (36)

where $F(x)$ is the cumulative distribution of $X$:

F(x) = \int_{-\infty}^{x} dx'\, p(x') .   (37)
This leads to the following transformation y(x): √ y = 2 erf−1 (2F (x) − 1) ,
(38)
which is obvously an increasing function of X as required for the application of the invariance property of the copula stated in the previous section. An illustration of the nonlinear transformation (38) is shown in figure 6. Note that it does not require any special hypothesis on the probability density X, apart from being non-degenerate. In the case where the pdf of X has only one maximum, we may use a simpler expression equivalent to (38). Such a pdf can be written under the so-called Von Mises parametrization (Embrechts et al. 1997) : f 0 (x) − 1 f (x) e 2 , p(x) = C p |f (x)|
(39)
where C is a constant of normalization. For f (x)/x2 → 0 when |x| → +∞, the pdf has a “fat tail,” i.e., it decays slower than a Gaussian at large |x|. Let us now define the change of variable p y = sgn(x) |f (x)| .
(40)
Using the relationship p(y) = p(x) dx dy , we get: y2 1 p(y) = √ e− 2 . 2π
(41)
It is important to stress the presence of the sign function sgn(x) in equation (40), which is essential in order to correctly quantify dependences between random variables. This transformation (40) is equivalent to (38) but of a simpler implementation and will be used in the sequel. 17
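In practice, the mapping (38) is applied with the empirical distribution function in place of $F$. A minimal sketch (the midpoint-rank convention $(\mathrm{rank}-\tfrac{1}{2})/n$ is an assumption of this illustration, used to keep $F$ strictly inside $(0,1)$); note that `NormalDist().inv_cdf(F)` computes exactly $\sqrt{2}\,\mathrm{erf}^{-1}(2F-1)$:

```python
import random
from statistics import NormalDist, fmean, stdev

def gaussianize(sample):
    """Map each observation x to y = sqrt(2)*erfinv(2F(x)-1), i.e.
    y = Phi^{-1}(F(x)), with F the empirical CDF of the sample."""
    n = len(sample)
    order = sorted(range(n), key=lambda i: sample[i])
    nd = NormalDist()  # standard normal; inv_cdf is the probit function
    y = [0.0] * n
    for rank, i in enumerate(order, start=1):
        f = (rank - 0.5) / n       # midpoint empirical CDF in (0, 1)
        y[i] = nd.inv_cdf(f)
    return y

random.seed(1)
# a fat-tailed, non-Gaussian sample (difference of two exponentials)
x = [random.expovariate(1.0) - random.expovariate(1.0) for _ in range(2000)]
y = gaussianize(x)
# the transformed sample is close to a standard normal
assert abs(fmean(y)) < 0.05 and abs(stdev(y) - 1.0) < 0.05
```

Whatever the input distribution, the transformed values are Gaussian quantiles by construction; only their pairing with the original observations carries the dependence information.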
14. Gestion de Portefeuilles multimoments et équilibre de marché

6.3 Determination of the joint distribution: maximum entropy and Gaussian copula
Let us now consider $N$ random variables $X_i$ with marginal distributions $p_i(x_i)$. Using the transformation (38), we define $N$ standard normal variables $Y_i$. If these variables were independent, their joint distribution would simply be the product of the marginal distributions. In many situations, the variables are not independent and it is necessary to study their dependence. The simplest approach is to construct their covariance matrix. Applied to the variables $Y_i$, we are certain that the covariance matrix exists and is well-defined, since their marginal distributions are Gaussian. In contrast, this is not ensured for the variables $X_i$. Indeed, in many situations in nature, economy, finance and the social sciences, pdf's are found to have power law tails $\sim \frac{A}{x^{1+\mu}}$ for large $|x|$. If $\mu \le 2$, the variance and the covariances cannot be defined. If $2 < \mu \le 4$, the variance and the covariances exist in principle, but their sample estimators converge poorly. We thus define the covariance matrix:

V = E[\mathbf{y}\mathbf{y}^t] ,   (42)

where $\mathbf{y}$ is the vector of variables $Y_i$ and the operator $E[\cdot]$ represents the mathematical expectation. A classical result of information theory (Rao 1973) tells us that, given the covariance matrix $V$, the best joint distribution (in the sense of entropy maximization) of the $N$ variables $Y_i$ is the multivariate Gaussian:

P(\mathbf{y}) = \frac{1}{(2\pi)^{N/2}\sqrt{\det(V)}} \exp\!\left(-\frac{1}{2}\,\mathbf{y}^t V^{-1} \mathbf{y}\right) .   (43)

Indeed, this distribution implies the minimum additional information or assumption, given the covariance matrix. Using the joint distribution of the variables $Y_i$, we obtain the joint distribution of the variables $X_i$:

P(\mathbf{x}) = P(\mathbf{y}) \left| \frac{\partial y_i}{\partial x_j} \right| ,   (44)

where $\left|\frac{\partial y_i}{\partial x_j}\right|$ is the Jacobian of the transformation. Since

\frac{\partial y_i}{\partial x_j} = \sqrt{2\pi}\, p_j(x_j)\, e^{\frac{1}{2} y_i^2}\, \delta_{ij} ,   (45)

we get

\left| \frac{\partial y_i}{\partial x_j} \right| = (2\pi)^{N/2} \prod_{i=1}^{N} p_i(x_i)\, e^{\frac{1}{2} y_i^2} .   (46)

This finally yields

P(\mathbf{x}) = \frac{1}{\sqrt{\det(V)}} \exp\!\left(-\frac{1}{2}\,\mathbf{y}(\mathbf{x})^t \left(V^{-1} - I\right) \mathbf{y}(\mathbf{x})\right) \prod_{i=1}^{N} p_i(x_i) .   (47)
As expected, if the variables are independent, $V = I$, and $P(\mathbf{x})$ becomes the product of the marginal distributions of the variables $X_i$. Let $F(\mathbf{x})$ denote the cumulative distribution function of the vector $\mathbf{x}$ and $F_i(x_i)$, $i = 1, \ldots, N$, the $N$ corresponding marginal distributions. The copula $C$ is then such that

F(x_1, \cdots, x_N) = C(F_1(x_1), \cdots, F_N(x_N)) .   (48)

Differentiating with respect to $x_1, \cdots, x_N$ leads to

P(x_1, \cdots, x_N) = \frac{\partial^N F(x_1, \cdots, x_N)}{\partial x_1 \cdots \partial x_N} = c(F_1(x_1), \cdots, F_N(x_N)) \prod_{i=1}^{N} p_i(x_i) ,   (49)

where

c(u_1, \cdots, u_N) = \frac{\partial^N C(u_1, \cdots, u_N)}{\partial u_1 \cdots \partial u_N}   (50)

is the density of the copula $C$. Comparing (49) with (47), the density of the copula is given in the present case by

c(u_1, \cdots, u_N) = \frac{1}{\sqrt{\det(V)}} \exp\!\left(-\frac{1}{2}\,\mathbf{y}(\mathbf{u})^t \left(V^{-1} - I\right) \mathbf{y}(\mathbf{u})\right) ,   (51)

which is the "Gaussian copula" with covariance matrix $V$. This result clarifies and justifies the method of (Sornette et al. 2000b) by showing that it essentially amounts to assuming arbitrary marginal distributions with a Gaussian copula. Note that the Gaussian copula results directly from the transformation to Gaussian marginals, together with the choice of maximizing the Shannon entropy under the constraint of a fixed covariance matrix. Under different constraints, we would have found another maximum entropy copula. This is not unexpected, in analogy with the standard result that the Gaussian law maximizes the Shannon entropy at fixed variance. If we were to extend this formulation by considering more general expressions of the entropy, such as the Tsallis entropy (Tsallis 1998), we would have found other copulas.
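For two assets, formula (47) can be evaluated directly once the marginals are specified. A minimal sketch (the Laplace marginal is an illustrative choice, not one used in the thesis), checking that the joint density factorizes when $\rho = 0$:

```python
import math
from statistics import NormalDist

nd = NormalDist()

# Laplace marginal, chosen here only as a convenient fat-tailed example
def p_marg(x):   # pdf: 0.5 * exp(-|x|)
    return 0.5 * math.exp(-abs(x))

def F_marg(x):   # cdf of the Laplace law
    return 0.5 * math.exp(x) if x < 0 else 1.0 - 0.5 * math.exp(-x)

def joint_pdf(x1, x2, rho):
    """Bivariate density (47): a Gaussian copula with correlation rho
    glued onto two Laplace marginals."""
    y1, y2 = nd.inv_cdf(F_marg(x1)), nd.inv_cdf(F_marg(x2))
    det = 1.0 - rho * rho   # det(V) for the 2x2 case with unit diagonal
    # quadratic form y^t (V^{-1} - I) y for V = [[1, rho], [rho, 1]]
    q = (rho * rho * (y1 * y1 + y2 * y2) - 2.0 * rho * y1 * y2) / det
    return math.exp(-0.5 * q) / math.sqrt(det) * p_marg(x1) * p_marg(x2)

# sanity check: with rho = 0 the joint density is the product of marginals
for (a, b) in [(0.3, -1.2), (2.0, 0.5)]:
    assert abs(joint_pdf(a, b, 0.0) - p_marg(a) * p_marg(b)) < 1e-12
```

The same skeleton accommodates any continuous marginal, which is precisely the flexibility that the copula decomposition (48) guarantees.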
6.4 Empirical test of the Gaussian copula assumption
We now present some tests of the hypothesis of Gaussian copulas between returns of financial assets. This presentation is only for illustration purposes, since testing the Gaussian copula hypothesis is a delicate task which has been addressed elsewhere (see (Malevergne and Sornette 2001)). Here, as an example, we propose two simple standard methods. The first one uses the property that Gaussian variables are stable in distribution under addition. Thus, a (quantile-quantile, or Q-Q) plot of the cumulative distribution of the sum $y_1 + \cdots + y_p$ versus the cumulative Normal distribution with the same estimated variance should give a straight line in order to qualify a multivariate Gaussian distribution (for the transformed $y$ variables). Such tests on empirical data are presented in figures 7-9. The second test amounts to estimating the covariance matrix $V$ of the sample we consider. This step is simple since, for fast-decaying pdf's, robust estimators of the covariance matrix are available. We can then estimate the distribution of the variable $z^2 = \mathbf{y}^t V^{-1} \mathbf{y}$. It is well known that $z^2$ follows a $\chi^2$ distribution if $\mathbf{y}$ is a Gaussian random vector. Again, the empirical cumulative distribution of $z^2$ versus the $\chi^2$ cumulative distribution should give a straight line in order to qualify a multivariate Gaussian distribution (for the transformed $y$ variables). Such tests on empirical data are presented in figures 10-12. First, one can observe that the Gaussian copula hypothesis appears better for stocks than for currencies. As discussed in (Malevergne and Sornette 2001), this result is quite general. A plausible explanation lies in the stronger dependence between currencies compared with that between stocks, which is due to the monetary policies limiting the fluctuations between the currencies of a group of countries, as was the case in the European Monetary System before the single Euro currency. Note also that the aggregation test seems systematically more favorable to the Gaussian copula hypothesis than the $\chi^2$ test, maybe due to its smaller sensitivity. Nonetheless, the very good performance of the Gaussian hypothesis under the aggregation test is good news for a portfolio theory based on it, since by definition a portfolio corresponds to asset aggregation. Even if sums of the transformed returns are not equivalent to sums of returns (as we shall see in the sequel), such sums qualify the collective behavior whose properties are controlled by the copula. Notwithstanding some deviations from linearity in figures 7-12, it appears that, for our purpose of developing a generalized portfolio theory, the Gaussian copula hypothesis is a good approximation. A more systematic test of this goodness of fit requires the quantification of a confidence level, for instance using the Kolmogorov test, that would allow us to accept or reject the Gaussian copula hypothesis. Such a test has been performed in (Malevergne and Sornette 2001), where it is shown that this test is sensitive enough only in the bulk of the distribution, and that an Anderson-Darling test is preferable for the tails of the distributions. Nonetheless, the quantitative conclusions of these tests are identical to the qualitative results presented here. Other tests would also be useful, such as the multivariate Gaussianity test presented by (Richardson and Smith 1993).
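The second ($\chi^2$) test can be sketched numerically under the null hypothesis: draw correlated Gaussian vectors, form $z^2 = \mathbf{y}^t V^{-1} \mathbf{y}$, and compare with the $\chi^2_N$ law, here simply through its mean $E[z^2] = N$. A minimal sketch with assumed correlation $\rho = 0.6$:

```python
import math
import random

random.seed(2)
rho = 0.6          # assumed correlation of the 2-d Gaussian vector
n, dim = 5000, 2

z2 = []
for _ in range(n):
    g1, g2 = random.gauss(0, 1), random.gauss(0, 1)
    # Cholesky construction of (y1, y2) with correlation rho
    y1 = g1
    y2 = rho * g1 + math.sqrt(1 - rho * rho) * g2
    # quadratic form y^t V^{-1} y for V = [[1, rho], [rho, 1]]
    z2.append((y1 * y1 - 2 * rho * y1 * y2 + y2 * y2) / (1 - rho * rho))

mean_z2 = sum(z2) / n
assert abs(mean_z2 - dim) < 0.15   # chi-square with `dim` d.o.f. has mean `dim`
```

A full test would compare the whole empirical distribution of $z^2$ to the $\chi^2_N$ cdf (the straight-line diagnostic described above), not just its mean.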
7 Choice of an exponential family to parameterize the marginal distributions

7.1 The modified Weibull distributions
We now apply these constructions to a class of distributions with fat tails that has been found to provide a convenient and flexible parameterization of many phenomena found in nature and in the social sciences (Laherrère and Sornette 1998). These so-called stretched exponential distributions can be seen as general forms of the extreme tails of products of random variables (Frisch and Sornette 1997). Following (Sornette et al. 2000b), we postulate the following marginal probability distributions of returns:

p(x) = \frac{1}{2\sqrt{\pi}}\, \frac{c}{\chi^{c/2}}\, |x|^{\frac{c}{2}-1}\, e^{-\left(\frac{|x|}{\chi}\right)^{c}} ,   (52)

where $c$ and $\chi$ are the two key parameters. A more general parameterization, taking into account a possible asymmetry between negative and positive returns (thus leading to a possibly non-zero average return), is

p(x) = \frac{Q}{\sqrt{\pi}}\, \frac{c_+}{\chi_+^{c_+/2}}\, |x|^{\frac{c_+}{2}-1}\, e^{-\left(\frac{|x|}{\chi_+}\right)^{c_+}}   \quad \text{if } x \ge 0 ,   (53)

p(x) = \frac{1-Q}{\sqrt{\pi}}\, \frac{c_-}{\chi_-^{c_-/2}}\, |x|^{\frac{c_-}{2}-1}\, e^{-\left(\frac{|x|}{\chi_-}\right)^{c_-}}   \quad \text{if } x < 0 ,   (54)

where $Q$ (respectively $1-Q$) is the fraction of positive (respectively negative) returns. In the sequel, we will only consider the case $Q = \frac{1}{2}$, which is the only analytically tractable one. Thus the pdf's asymmetry will be accounted for only by the exponents $c_+, c_-$ and the scale factors $\chi_+, \chi_-$. Note that these expressions are close to the Weibull distribution, with the addition of a power law prefactor to the exponential, such that the Gaussian law is retrieved for $c = 2$. Following (Sornette et al. 2000b, Sornette et al. 2000a, Andersen and Sornette 2001), we call (52) the modified Weibull distribution. For $c < 1$, the pdf is a stretched exponential, also called sub-exponential. The exponent $c$ determines the shape of the distribution, which is fatter than an exponential if $c < 1$. The parameter $\chi$ controls the scale or characteristic width of the distribution. It plays a role analogous to the standard deviation of the Gaussian law. See chapter 6 of (?) for a recent review on maximum likelihood and other estimators of such generalized Weibull distributions.
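A consequence of the von Mises form (39)-(40) with $f(x) = 2(|x|/\chi)^c$ is that a modified Weibull variable is the image of a standard normal $y$ under $x = \mathrm{sgn}(y)\, \chi\, (y^2/2)^{1/c}$, which gives an exact sampler and the closed-form tail $P(|X| > t) = \mathrm{erfc}\big((t/\chi)^{c/2}\big)$. A minimal sketch checking the tail formula by simulation (the parameter values are illustrative):

```python
import math
import random

def sample_modified_weibull(c, chi, rng):
    """Exact draw from the symmetric modified Weibull pdf (52):
    push a standard normal through the inverse Gaussianization."""
    y = rng.gauss(0.0, 1.0)
    return math.copysign(chi * (y * y / 2.0) ** (1.0 / c), y)

rng = random.Random(3)
c, chi = 0.7, 1.5            # fat-tailed case: c < 1 (sub-exponential)
xs = [sample_modified_weibull(c, chi, rng) for _ in range(20000)]

t = 2.0
empirical = sum(abs(x) > t for x in xs) / len(xs)
theoretical = math.erfc((t / chi) ** (c / 2.0))   # P(|X| > t)
assert abs(empirical - theoretical) < 0.01
```

The tail formula follows from $|X| > t \iff |y| > \sqrt{2}\,(t/\chi)^{c/2}$ and the Gaussian survival function.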
7.2 Transformation of the modified Weibull pdf into a Gaussian law
One advantage of the class of distributions (52) is that the transformation into a Gaussian is particularly simple. Indeed, the expression (52) is of the form (39) with

f(x) = 2 \left(\frac{|x|}{\chi}\right)^{c} .   (55)

Applying the change of variable (40), which reads

y_i = \mathrm{sgn}(x_i)\, \sqrt{2} \left(\frac{|x_i|}{\chi_i}\right)^{c_i/2} ,   (56)

leads automatically to a Gaussian distribution. These variables $Y_i$ then allow us to obtain the covariance matrix $V$ from a sample of $T$ observations:

V_{ij} = \frac{2}{T} \sum_{n=1}^{T} \mathrm{sgn}\!\left(x_i(n)\, x_j(n)\right) \left(\frac{|x_i(n)|}{\chi_i}\right)^{c_i/2} \left(\frac{|x_j(n)|}{\chi_j}\right)^{c_j/2} ,   (57)

and thus the multivariate distributions $P(\mathbf{y})$ and $P(\mathbf{x})$:

P(x_1, \cdots, x_N) = \frac{1}{2^N \pi^{N/2} \sqrt{\det V}} \left(\prod_{i=1}^{N} \frac{c_i}{\chi_i^{c_i/2}}\, |x_i|^{\frac{c_i}{2}-1}\right) \exp\!\left(-\sum_{i,j} V_{ij}^{-1}\, \mathrm{sgn}(x_i x_j) \left(\frac{|x_i|}{\chi_i}\right)^{c_i/2} \left(\frac{|x_j|}{\chi_j}\right)^{c_j/2}\right) .   (58)

Similar transforms hold, mutatis mutandis, in the asymmetric case. Indeed, for the asymmetric assets of interest to financial risk managers, equations (53) and (54) yield the following change of variable:

y_i = \sqrt{2} \left(\frac{x_i}{\chi_i^+}\right)^{c_i^+/2}   \quad \text{for } x_i \ge 0 ,   (59)

y_i = -\sqrt{2} \left(\frac{|x_i|}{\chi_i^-}\right)^{c_i^-/2}   \quad \text{for } x_i < 0 .   (60)

This allows us to define the correlation matrix $V$ and to obtain the multivariate distribution $P(\mathbf{x})$, generalizing equation (58) to asymmetric assets. Since this expression is rather cumbersome and nothing but a straightforward generalization of (58), we do not write it here.
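The estimator (57) is simply the sample covariance of the transformed variables (56). A minimal sketch on synthetic data with known parameters (the values of $c$, $\chi$, $\rho$ are illustrative):

```python
import math
import random

def to_gaussian(x, c, chi):
    """Change of variable (56): maps a modified Weibull return to N(0,1)."""
    return math.copysign(math.sqrt(2.0) * (abs(x) / chi) ** (c / 2.0), x)

rng = random.Random(4)
c1, chi1, c2, chi2, rho = 1.2, 1.0, 0.8, 2.0, 0.5

# build two dependent modified Weibull series from correlated Gaussians
xs1, xs2 = [], []
for _ in range(20000):
    g1 = rng.gauss(0, 1)
    g2 = rho * g1 + math.sqrt(1 - rho * rho) * rng.gauss(0, 1)
    xs1.append(math.copysign(chi1 * (g1 * g1 / 2) ** (1 / c1), g1))
    xs2.append(math.copysign(chi2 * (g2 * g2 / 2) ** (1 / c2), g2))

# estimator (57); since (|x_i|/chi_i)^{c_i/2} = |y_i|/sqrt(2), the factor
# 2/T makes it identical to the sample covariance (1/T) sum y_i y_j
T = len(xs1)
y1 = [to_gaussian(x, c1, chi1) for x in xs1]
y2 = [to_gaussian(x, c2, chi2) for x in xs2]
V11 = sum(a * a for a in y1) / T
V12 = sum(a * b for a, b in zip(y1, y2)) / T
assert abs(V11 - 1.0) < 0.05 and abs(V12 - rho) < 0.05
```

The diagonal elements are close to 1 because each $Y_i$ is standard normal by construction, which is a useful sanity check on the calibrated $c_i$, $\chi_i$.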
7.3 Empirical tests and estimated parameters
In order to test the validity of our assumption, we have studied a large basket of financial assets including currencies and stocks. As an example, we present in figures 13 to 17 typical log-log plots of the transformed return variable $Y$ versus the return variable $X$ for a number of assets. If our assumption were right, we should observe a single straight line whose slope is given by $c/2$. In contrast, we observe in general two approximately linear regimes separated by a cross-over. This means that the marginal distribution of returns can be approximated by two modified Weibull distributions: one for small returns, which is close to a Gaussian law, and one for large returns, with a fat tail. Each regime is depicted by its corresponding straight line in the graphs. The exponents $c$ and the scale factors $\chi$ for the different assets we have studied are given in table 3 for currencies and table 4 for stocks. The coefficients within brackets are the coefficients estimated for small returns, while the non-bracketed coefficients correspond to the second, fat tail regime. The first point to note is the difference between currencies and stocks. For small as well as for large returns, the exponents $c_-$ and $c_+$ for currencies (except Poland and Thailand) are all close to each other. Additional tests are required to establish whether their relatively small differences are statistically significant. Similarly, the scale factors are also comparable. In contrast, many stocks exhibit a large asymmetric behavior for large returns, with $c_+ - c_- \gtrsim 0.5$ in about one-half of the investigated stocks. This means that the tails of the large negative returns ("crashes") are often much fatter than those of the large positive returns ("rallies"). The second important point is that, for small returns, many stocks have exponents $\langle c_+ \rangle \approx \langle c_- \rangle \simeq 2$ and thus have a behavior not far from a pure Gaussian in the bulk of the distribution, while the average exponent for currencies is about 1.5 in the same "small return" regime. Therefore, even for small returns, currencies exhibit a strong departure from Gaussian behavior. In conclusion, this empirical study shows that the modified Weibull parameterization, although not exact over the entire range of variation of the returns $X$, remains consistent within each of the two regimes of small versus large returns, with a sharp transition between them. It seems especially relevant in the tails of the return distributions, on which we shall focus our attention next.
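The slope $c/2$ of the log-log plot of $|Y|$ versus $|X|$ can be extracted by ordinary least squares on the Gaussianized sample. A minimal sketch on synthetic data (the rank-based Gaussianization and the fitting window $1 < |x| < 5$ are assumptions of this illustration, not prescriptions of the thesis):

```python
import math
import random
from statistics import NormalDist

rng = random.Random(5)
c_true, chi = 1.0, 1.0
# synthetic modified Weibull sample with known exponent
xs = [math.copysign(chi * (g * g / 2) ** (1 / c_true), g)
      for g in (rng.gauss(0, 1) for _ in range(4000))]

# rank-based Gaussianization: y = Phi^{-1}((rank - 1/2)/n)
nd = NormalDist()
n = len(xs)
order = sorted(range(n), key=lambda i: xs[i])
ys = [0.0] * n
for rank, i in enumerate(order, start=1):
    ys[i] = nd.inv_cdf((rank - 0.5) / n)

# least-squares slope of log|y| versus log|x| over moderately large |x|
pts = [(math.log(abs(x)), math.log(abs(y)))
       for x, y in zip(xs, ys) if 1.0 < abs(x) < 5.0]
mx = sum(u for u, _ in pts) / len(pts)
my = sum(v for _, v in pts) / len(pts)
slope = (sum((u - mx) * (v - my) for u, v in pts)
         / sum((u - mx) ** 2 for u, _ in pts))
assert abs(slope - c_true / 2) < 0.1   # recovered exponent: c = 2 * slope
```

On real data, fitting two windows (small and large $|x|$) separately reproduces the two-regime picture described above.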
8 Cumulant expansion of the portfolio return distribution

8.1 Link between moments and cumulants
Before deriving the main result of this section, we recall a standard relation between moments and cumulants that we need below. The moments $M_n$ of the distribution $P$ are defined by

\hat{P}(k) = \sum_{n=0}^{+\infty} \frac{(ik)^n}{n!}\, M_n ,   (61)

where $\hat{P}$ is the characteristic function, i.e., the Fourier transform of $P$:

\hat{P}(k) = \int_{-\infty}^{+\infty} dS\, P(S)\, e^{ikS} .   (62)

Similarly, the cumulants $C_n$ are given by

\hat{P}(k) = \exp\!\left( \sum_{n=1}^{+\infty} \frac{(ik)^n}{n!}\, C_n \right) .   (63)

Differentiating $n$ times the equation

\ln\!\left( \sum_{n=0}^{+\infty} \frac{(ik)^n}{n!}\, M_n \right) = \sum_{n=1}^{+\infty} \frac{(ik)^n}{n!}\, C_n ,   (64)

we obtain the following recurrence relations between the moments and the cumulants:

M_n = \sum_{p=0}^{n-1} \binom{n-1}{p} M_p\, C_{n-p} ,   (65)

C_n = M_n - \sum_{p=1}^{n-1} \binom{n-1}{n-p} C_p\, M_{n-p} .   (66)
In the sequel, we will first evaluate the moments, which turns out to be easier, and then, using eq. (66), we will be able to calculate the cumulants.
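The recursion (66) is straightforward to implement. As a check, the moments of the standard normal, $1, 0, 1, 0, 3, 0, 15$, must yield $C_2 = 1$ and all other cumulants zero. A minimal sketch:

```python
from math import comb

def cumulants_from_moments(M):
    """Recursion (66): C_n = M_n - sum_{p=1}^{n-1} C(n-1, n-p) C_p M_{n-p}.
    M[0..N] are the moments (with M[0] = 1); returns C with C[0] unused."""
    N = len(M) - 1
    C = [0.0] * (N + 1)
    for n in range(1, N + 1):
        C[n] = M[n] - sum(comb(n - 1, n - p) * C[p] * M[n - p]
                          for p in range(1, n))
    return C

# moments of the standard normal: E[S^n] = 0 (n odd), (n-1)!! (n even)
M = [1, 0, 1, 0, 3, 0, 15]
C = cumulants_from_moments(M)
assert C[1] == 0 and C[2] == 1
assert all(abs(c) < 1e-12 for c in C[3:])   # Gaussian: higher cumulants vanish
```

The same routine, fed with the portfolio moments computed below, produces the portfolio cumulants used as risk measures.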
8.2 Symmetric assets
We start with the expression of the distribution of the weighted sum of $N$ assets:

P_S(s) = \int_{\mathbb{R}^N} d\mathbf{x}\, P(\mathbf{x})\, \delta\!\left( \sum_{i=1}^{N} w_i x_i - s \right) ,   (67)

where $\delta(\cdot)$ is the Dirac distribution. Using the change of variable (40), which allows us to go from the asset returns $X_i$ to the transformed returns $Y_i$, we get

P_S(s) = \frac{1}{(2\pi)^{N/2} \sqrt{\det(V)}} \int_{\mathbb{R}^N} d\mathbf{y}\, e^{-\frac{1}{2}\mathbf{y}^t V^{-1} \mathbf{y}}\, \delta\!\left( \sum_{i=1}^{N} w_i\, \mathrm{sgn}(y_i)\, f^{-1}(y_i^2) - s \right) .   (68)

Taking its Fourier transform $\hat{P}_S(k) = \int ds\, P_S(s)\, e^{iks}$, we obtain

\hat{P}_S(k) = \frac{1}{(2\pi)^{N/2} \sqrt{\det(V)}} \int_{\mathbb{R}^N} d\mathbf{y}\, e^{-\frac{1}{2}\mathbf{y}^t V^{-1} \mathbf{y} + ik \sum_{i=1}^{N} w_i\, \mathrm{sgn}(y_i)\, f^{-1}(y_i^2)} ,   (69)

where $\hat{P}_S$ is the characteristic function of $P_S$. In the particular case of interest here, where the marginal distributions of the variables $X_i$ are the modified Weibull pdf,

f^{-1}(y_i^2) = \chi_i \left| \frac{y_i}{\sqrt{2}} \right|^{q_i}   (70)

with

q_i = \frac{2}{c_i} ,   (71)

equation (69) becomes

\hat{P}_S(k) = \frac{1}{(2\pi)^{N/2} \sqrt{\det(V)}} \int_{\mathbb{R}^N} d\mathbf{y}\, e^{-\frac{1}{2}\mathbf{y}^t V^{-1} \mathbf{y} + ik \sum_{i=1}^{N} w_i\, \mathrm{sgn}(y_i)\, \chi_i \left| \frac{y_i}{\sqrt{2}} \right|^{q_i}} .   (72)
The task in front of us is to evaluate this expression through the determination of the moments and/or cumulants.

8.2.1 Case of independent assets

In this case, the cumulants can be obtained explicitly (Sornette et al. 2000b). Indeed, the expression (72) can be written as a product of integrals of the form

\int_{0}^{+\infty} du\, e^{-\frac{u^2}{2} + ik w_i \chi_i \left( \frac{u}{\sqrt{2}} \right)^{q_i}} .   (73)

We obtain

C_{2n} = \sum_{i=1}^{N} c(n, q_i)\, (\chi_i w_i)^{2n} ,   (74)

and

c(n, q_i) = (2n)! \left[ \sum_{p=0}^{n-2} \left( \frac{(-1)\, \Gamma\!\left(q_i + \frac{1}{2}\right)}{2\, \pi^{1/2}} \right)^{p} \frac{\Gamma\!\left(q_i(n-p) + \frac{1}{2}\right)}{(2n-2p)!\, \pi^{1/2}} - \frac{(-1)^n}{n} \left( \frac{\Gamma\!\left(q_i + \frac{1}{2}\right)}{2\, \pi^{1/2}} \right)^{n} \right] .   (75)
Note that the coefficient $c(n, q_i)$ is the cumulant of order $2n$ of the marginal distribution (52) with $c = 2/q_i$ and $\chi = 1$. Equation (74) simply expresses the fact that the cumulants of a sum of independent variables are the sums of the cumulants of each variable. The odd-order cumulants are zero due to the symmetry of the distributions.

8.2.2 Case of dependent assets

Here, we restrict our exposition to the case of two random variables. The case of arbitrary $N$ can be treated in a similar way but involves rather complex formulas. Equation (72) reads

\hat{P}_S(k) = \frac{1}{2\pi \sqrt{1-\rho^2}} \int dy_1\, dy_2\, \exp\!\left[ -\frac{1}{2}\mathbf{y}^t V^{-1} \mathbf{y} + ik \left( \chi_1 w_1\, \mathrm{sgn}(y_1) \left| \frac{y_1}{\sqrt{2}} \right|^{q_1} + \chi_2 w_2\, \mathrm{sgn}(y_2) \left| \frac{y_2}{\sqrt{2}} \right|^{q_2} \right) \right] ,   (76)

and we can show (see appendix E) that the moments read

M_n = \sum_{p=0}^{n} \binom{n}{p} w_1^p\, w_2^{n-p}\, \gamma_{q_1 q_2}(n, p) ,   (77)

with

\gamma_{q_1 q_2}(2n, 2p) = \frac{\chi_1^{2p} \chi_2^{2(n-p)}}{\pi}\, \Gamma\!\left(q_1 p + \frac{1}{2}\right) \Gamma\!\left(q_2(n-p) + \frac{1}{2}\right)\, {}_2F_1\!\left(-q_1 p,\, -q_2(n-p);\, \frac{1}{2};\, \rho^2\right) ,   (78)

\gamma_{q_1 q_2}(2n, 2p+1) = \frac{2\, \chi_1^{2p+1} \chi_2^{2(n-p)-1}}{\pi}\, \Gamma\!\left(q_1 p + 1 + \frac{q_1}{2}\right) \Gamma\!\left(q_2(n-p) + 1 - \frac{q_2}{2}\right) \rho\; {}_2F_1\!\left(-q_1 p - \frac{q_1 - 1}{2},\, -q_2(n-p) + \frac{q_2 + 1}{2};\, \frac{3}{2};\, \rho^2\right) ,   (79)

where ${}_2F_1$ is a hypergeometric function. These two relations allow us to calculate the moments and cumulants for any possible values of $q_1 = 2/c_1$ and $q_2 = 2/c_2$. If one of the $q_i$'s is an integer, a simplification occurs and the coefficients $\gamma(n, p)$ reduce to polynomials. In the simpler case where all the $q_i$'s are odd integers, the expression of the moments becomes:

M_n = \sum_{p=0}^{n} \binom{n}{p} (w_1 \chi_1)^p (w_2 \chi_2)^{n-p} \sum_{s=0}^{\min\{q_1 p,\, q_2(n-p)\}} \rho^s\, s!\; a_s^{(q_1 p)}\, a_s^{(q_2(n-p))} ,   (80)
with

a_{2p}^{(2n)} = \binom{2n}{2p} \left(2(n-p)-1\right)!! = \frac{(2n)!}{2^{n-p}\, (n-p)!\, (2p)!} ,   (81)

a_{2p+1}^{(2n)} = 0 ,   (82)

a_{2p}^{(2n+1)} = 0 ,   (83)

a_{2p+1}^{(2n+1)} = \binom{2n+1}{2p+1} \left(2(n-p)-1\right)!! = \frac{(2n+1)!}{2^{n-p}\, (n-p)!\, (2p+1)!} .   (84)
8.3 Non-symmetric assets
In the case of asymmetric assets, we have to consider the formulas (53)-(54). Using the same notation as in the previous section, the moments are again given by (77), with the coefficient $\gamma(n, p)$ now equal to:

\gamma(n, p) = \frac{(-1)^n (\chi_1^-)^p (\chi_2^-)^{n-p}}{4\pi} \left[ \Gamma\!\left(\frac{q_1^- p + 1}{2}\right) \Gamma\!\left(\frac{q_2^-(n-p) + 1}{2}\right) {}_2F_1\!\left(-\frac{q_1^- p}{2}, -\frac{q_2^-(n-p)}{2}; \frac{1}{2}; \rho^2\right) + 2\, \Gamma\!\left(\frac{q_1^- p}{2} + 1\right) \Gamma\!\left(\frac{q_2^-(n-p)}{2} + 1\right) \rho\; {}_2F_1\!\left(-\frac{q_1^- p - 1}{2}, -\frac{q_2^-(n-p) - 1}{2}; \frac{3}{2}; \rho^2\right) \right]
+ \frac{(-1)^p (\chi_1^-)^p (\chi_2^+)^{n-p}}{4\pi} \left[ \Gamma\!\left(\frac{q_1^- p + 1}{2}\right) \Gamma\!\left(\frac{q_2^+(n-p) + 1}{2}\right) {}_2F_1\!\left(-\frac{q_1^- p}{2}, -\frac{q_2^+(n-p)}{2}; \frac{1}{2}; \rho^2\right) - 2\, \Gamma\!\left(\frac{q_1^- p}{2} + 1\right) \Gamma\!\left(\frac{q_2^+(n-p)}{2} + 1\right) \rho\; {}_2F_1\!\left(-\frac{q_1^- p - 1}{2}, -\frac{q_2^+(n-p) - 1}{2}; \frac{3}{2}; \rho^2\right) \right]
+ \frac{(-1)^{n-p} (\chi_1^+)^p (\chi_2^-)^{n-p}}{4\pi} \left[ \Gamma\!\left(\frac{q_1^+ p + 1}{2}\right) \Gamma\!\left(\frac{q_2^-(n-p) + 1}{2}\right) {}_2F_1\!\left(-\frac{q_1^+ p}{2}, -\frac{q_2^-(n-p)}{2}; \frac{1}{2}; \rho^2\right) - 2\, \Gamma\!\left(\frac{q_1^+ p}{2} + 1\right) \Gamma\!\left(\frac{q_2^-(n-p)}{2} + 1\right) \rho\; {}_2F_1\!\left(-\frac{q_1^+ p - 1}{2}, -\frac{q_2^-(n-p) - 1}{2}; \frac{3}{2}; \rho^2\right) \right]
+ \frac{(\chi_1^+)^p (\chi_2^+)^{n-p}}{4\pi} \left[ \Gamma\!\left(\frac{q_1^+ p + 1}{2}\right) \Gamma\!\left(\frac{q_2^+(n-p) + 1}{2}\right) {}_2F_1\!\left(-\frac{q_1^+ p}{2}, -\frac{q_2^+(n-p)}{2}; \frac{1}{2}; \rho^2\right) + 2\, \Gamma\!\left(\frac{q_1^+ p}{2} + 1\right) \Gamma\!\left(\frac{q_2^+(n-p)}{2} + 1\right) \rho\; {}_2F_1\!\left(-\frac{q_1^+ p - 1}{2}, -\frac{q_2^+(n-p) - 1}{2}; \frac{3}{2}; \rho^2\right) \right] .   (85)

This formula is obtained in the same way as the formulas given in the symmetric case; the four terms correspond to the four sign quadrants of $(x_1, x_2)$. As it should, we retrieve the formulas (78) and (79) when the coefficients with index '+' are equal to the coefficients with index '-'.
8.4 Empirical tests
Extensive tests have been performed for currencies under the assumption that the distributions of asset returns are symmetric (Sornette et al. 2000b). As an example, let us consider the Swiss franc and the Japanese yen against the US dollar. The calibration of the modified Weibull distribution to the tail of the empirical histogram of daily returns gives $(q_{CHF} = 1.75,\, c_{CHF} = 1.14,\, \chi_{CHF} = 2.13)$ and $(q_{JPY} = 2.50,\, c_{JPY} = 0.8,\, \chi_{JPY} = 1.25)$, and their correlation coefficient is $\rho = 0.43$. Figure 18 plots the excess kurtosis of the sum $w_{CHF}\, x_{CHF} + w_{JPY}\, x_{JPY}$ as a function of $w_{CHF}$, with the constraint $w_{CHF} + w_{JPY} = 1$. The thick solid line is determined empirically, by direct calculation of the kurtosis from the data. The thin solid line is the theoretical prediction using our formulas with the empirically determined exponents $c$ and characteristic scales $\chi$ given above. While there is a non-negligible difference, the empirical and theoretical excess kurtosis have essentially the same behavior, with their minimum reached at almost the same value of $w_{CHF}$. Three origins of the discrepancy between theory and empirical data can be invoked. First, as already pointed out in the preceding section, the modified Weibull distribution with constant exponent and scale parameters describes accurately only the tail of the empirical distributions, while, for small returns, the empirical distributions are close to a Gaussian law. While putting a strong emphasis on large fluctuations, cumulants of order 4 are still significantly sensitive to the bulk of the distributions. Moreover, the excess kurtosis is normalized by the squared second-order cumulant, which is almost exclusively sensitive to the bulk of the distribution. Cumulants of higher order should thus be better described by the modified Weibull distribution. However, a careful comparison between theory and data would then be hindered by the difficulty in estimating reliable empirical cumulants of high order. This estimation problem is often invoked as a criticism against using high-order moments or cumulants. Our approach suggests that this problem can be in large part circumvented by focusing on the estimation of a reasonable parametric expression for the probability density or distribution function of the asset returns. The second possible origin of the discrepancy between theory and data is the existence of a weak asymmetry of the empirical distributions, particularly of the Swiss franc, which has not been taken into account. The figure also suggests that an error in the determination of the exponents $c$ can contribute to the discrepancy. In order to investigate the sensitivity with respect to the choice of the parameters $q$ and $\rho$, we have also constructed the dashed line corresponding to the theoretical curve with $\rho = 0$ (instead of $\rho = 0.43$) and the dotted line corresponding to the theoretical curve with $q_{CHF} = 2$ rather than 1.75. Finally, the dash-dotted line corresponds to the theoretical curve with $q_{CHF} = 1.5$. We observe that the dashed line remains rather close to the thin solid line, while the dotted line departs significantly from it as $w_{CHF}$ increases. Therefore, the most sensitive parameter is $q$, which is natural because it directly controls the extent of the fat tail of the distributions. In order to account for the effect of asymmetry, we have plotted in figure 19 the fourth cumulant of a portfolio composed of Swiss francs and British pounds. The solid line represents the empirical cumulant, while the dashed line shows the theoretical cumulant. The agreement between the two curves is better than under the symmetric assumption. Note once again that an accurate determination of the parameters is the key point in obtaining a good agreement between empirical data and theoretical prediction. As we can see in figure 19, the parameters of the Swiss franc seem well adjusted, since the theoretical and empirical cumulants are very close when $w_{CHF} \simeq 1$, i.e., when the Swiss franc is almost the sole asset in the portfolio, while when $w_{CHF} \simeq 0$ the theoretical cumulant is far from the empirical one, i.e., the parameters of the British pound are not sufficiently well adjusted.
9 Can you have your cake and eat it too?

Now that we have shown how to accurately estimate the multivariate distribution function of the asset returns, let us come back to the portfolio selection problem. In figure 2, we can see that the expected return of the portfolios with minimum risk according to $C_n$ decreases when $n$ increases. But this is not the general situation. Figures 20 and 21 show the generalized efficient frontiers using $C_2$ (Markowitz case), $C_4$ or $C_6$ as the relevant measures of risk, for two portfolios composed of two stocks: IBM and Hewlett-Packard in the first case, and IBM and Coca-Cola in the second case. Obviously, for a given amount of risk, the mean return of the portfolio changes when the cumulant considered changes. It is interesting to note that, in figure 20, the minimization of large risks, i.e., with respect to $C_6$, increases the average return while, in figure 21, the minimization of large risks leads to a decrease of the average return. This allows us to make precise and quantitative the previously reported empirical observation that it is possible to "have your cake and eat it too" (Andersen and Sornette 2001). We can indeed give a general criterion to determine for which values of the parameters (exponents $c$ and characteristic scales $\chi$ of the distributions of the asset returns) the average return of the portfolio may increase while the large risks decrease at the same time, thus allowing one to gain on both accounts (of course, the small risks quantified
by the variance will then increase). For two independent assets, assuming that the cumulants of order $n$ and $n+k$ of the portfolio admit a minimum in the interval $]0, 1[$, we can show that

\mu^*_n < \mu^*_{n+k}   (86)

if and only if

\left(\mu(1) - \mu(2)\right) \cdot \left[ \left( \frac{C_n(1)}{C_n(2)} \right)^{\frac{1}{n-1}} - \left( \frac{C_{n+k}(1)}{C_{n+k}(2)} \right)^{\frac{1}{n+k-1}} \right] > 0 ,   (87)
where $\mu^*_n$ denotes the return of the portfolio evaluated at the minimum of the cumulant of order $n$, and $C_n(i)$ is the cumulant of order $n$ for asset $i$. The proof of this result and its generalization to $N > 2$ are given in appendix F. In fact, we have observed that, when the exponents $c$ of the assets remain sufficiently different, this result still holds in the presence of dependence between assets. This last empirical observation has not been proved mathematically. It seems reasonable for assets with moderate dependence, while it may fail when the dependence becomes too strong, as occurs for comonotonic assets. For the assets considered above, we have found $\mu_{IBM} = 0.13$, $\mu_{HWP} = 0.07$, $\mu_{KO} = 0.05$ and

\frac{C_2(IBM)}{C_2(HWP)} = 1.76 > \left( \frac{C_4(IBM)}{C_4(HWP)} \right)^{\frac{1}{3}} = 1.03 > \left( \frac{C_6(IBM)}{C_6(HWP)} \right)^{\frac{1}{5}} = 0.89 ,   (88)

\frac{C_2(IBM)}{C_2(KO)} = 0.96 < \left( \frac{C_4(IBM)}{C_4(KO)} \right)^{\frac{1}{3}} = 1.01 < \left( \frac{C_6(IBM)}{C_6(KO)} \right)^{\frac{1}{5}} = 1.06 ,   (89)
which shows that, for the portfolio IBM / Hewlett-Packard, the efficient return is an increasing function of the order of the cumulants while, for the portfolio IBM / Coca-Cola, the opposite phenomenon occurs. This is exactly what is shown in figures 20 and 21. The underlying intuitive mechanism is the following: if a portfolio contains an asset with a rather fat tail (many "large" risks) but a narrow waist (few "small" risks), and with very little return to gain from it, minimizing the variance $C_2$ of the portfolio return will overweight this asset, which is wrongly perceived as having little risk due to its small variance (narrow waist). In contrast, controlling for the larger risks quantified by $C_4$ or $C_6$ leads to a decrease of the weight of this asset in the portfolio, and correspondingly to an increase of the weight of the more profitable assets. We thus see that the effect of "both decreasing large risks and increasing profit" appears when the asset(s) with the fatter tails, and therefore the narrower central part, has (have) the smaller overall return(s). A mean-variance approach will weight them more than deemed appropriate from a prudential consideration of large risks and a consideration of profits. From a behavioral point of view, this phenomenon is very interesting and can probably be linked to the fact that the main risk measure considered by agents is the volatility (or the variance), so that the other dimensions of risk, measured by higher moments, are often neglected. This may sometimes offer the opportunity of increasing the expected return while lowering large risks.
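The criterion (87) is easy to evaluate once the cumulant ratios are known. A minimal sketch using the values quoted in (88)-(89); the ratios are taken as already raised to their powers $1/(n-1)$:

```python
def larger_risk_pays(mu1, mu2, ratio_n, ratio_nk):
    """Criterion (87): True when mu*_n < mu*_{n+k}, i.e. when minimizing
    the higher-order cumulant yields a larger efficient return.
    ratio_n, ratio_nk are (C_m(1)/C_m(2))^(1/(m-1)) for m = n and n + k."""
    return (mu1 - mu2) * (ratio_n - ratio_nk) > 0

# IBM vs Hewlett-Packard: mu = 0.13 vs 0.07; ratios 1.76 > 1.03 > 0.89
assert larger_risk_pays(0.13, 0.07, 1.76, 1.03)       # C2 -> C4
assert larger_risk_pays(0.13, 0.07, 1.03, 0.89)       # C4 -> C6
# IBM vs Coca-Cola: mu = 0.13 vs 0.05; ratios 0.96 < 1.01 < 1.06
assert not larger_risk_pays(0.13, 0.05, 0.96, 1.01)   # C2 -> C4
assert not larger_risk_pays(0.13, 0.05, 1.01, 1.06)   # C4 -> C6
```

The monotonically decreasing ratio chain (88) thus signals the "cake" regime for IBM / Hewlett-Packard, while the increasing chain (89) signals the opposite regime for IBM / Coca-Cola.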
10 Conclusion

We have introduced three axioms that define a consistent set of risk measures, in the spirit of (Artzner et al. 1997, Artzner et al. 1999). Contrary to the risk measures of (Artzner et al. 1997, Artzner et al. 1999), our consistent risk measures may account for two-sided risks and not only for down-side risks. Thus, they supplement the notion of coherent measures of risk and are well adapted to the problem of portfolio risk assessment and optimization. We have shown that these risk measures, which contain centered moments (and cumulants, with some restrictions) as particular examples, generalize them significantly. We have presented a generalization of previous extensions of the efficient frontiers and of the CAPM based on these risk measures, in the cases of homogeneous and heterogeneous agents. We have then proposed a simple but powerful specific von Mises representation of the multivariate distribution of returns, which allowed us to obtain new analytical results on, and empirical tests of, a general framework for a portfolio theory of non-Gaussian risks with non-linear correlations. Quantitative tests have been presented on a basket of seventeen stocks among the largest capitalizations on the NYSE. This work opens several interesting avenues for research. One consists in extending the Gaussian copula assumption, for instance by using the maximum-entropy principle with non-extensive Tsallis entropies, known to be the correct mathematical information-theoretical representation of power laws. A second line of research would be to extend the present framework to encompass simultaneously different time scales $\tau$, in the spirit of (Muzy et al. 2001), in the case of a cascade model of volatilities.
A Description of the data set

We have considered a set of seventeen assets traded on the New York Stock Exchange: Applied Material, Coca-Cola, EMC, Exxon-Mobil, General Electric, General Motors, Hewlett Packard, IBM, Intel, MCI WorldCom, Medtronic, Merck, Pfizer, Procter & Gamble, SBC Communication, Texas Instrument, Wal-Mart. These assets have been chosen since they are among the largest capitalizations of the NYSE at the time of writing. The dataset comes from the Center for Research in Security Prices (CRSP) database and covers the time interval from the end of January 1995 to the end of December 2000, which represents exactly 1500 trading days. The main statistical features of the companies composing the dataset are presented in table 5. Note the high kurtosis of each distribution of returns, as well as the large values of the observed minimum and maximum returns compared with the standard deviations, which clearly underlines the non-Gaussian behavior of these assets.
B Generalized efficient frontier and two-fund separation theorem

Let us consider a set of N risky assets $X_1, \cdots, X_N$ and a risk-free asset $X_0$. The problem is to find the optimal allocation of these assets in the following sense:

$$\inf_{w_i \in [0,1]} \rho_\alpha(\{w_i\}) \quad \text{subject to} \quad \sum_{i \ge 0} w_i = 1, \qquad \sum_{i \ge 0} w_i\,\mu(i) = \mu. \qquad (90)$$

In other words, we search for the portfolio P with minimum risk, as measured by any risk measure $\rho_\alpha$ obeying axioms I-IV of section 2, for a given amount of expected return µ and normalized weights $w_i$. Short sales are forbidden except for the risk-free asset, which can be lent and borrowed at the same interest rate $\mu_0$. Thus, the weights $w_i$ are assumed positive for all i ≥ 1.
B.1 Case of independent assets when the risk is measured by the cumulants
To start with a simple example, let us assume that the risky assets are independent and that we choose to measure the risk with the cumulants of their distributions of returns. The case when the assets are dependent and/or when the risk is measured by any $\rho_\alpha$ will be considered later. Since the assets are assumed independent, the cumulant of order n of the pdf of returns of the portfolio is simply given by

$$C_n = \sum_{i=1}^{N} w_i^n\,C_n(i), \qquad (91)$$
where $C_n(i)$ denotes the marginal n-th order cumulant of the pdf of returns of asset i. In order to solve this problem, let us introduce the Lagrangian

$$\mathcal{L} = C_n - \lambda_1 \left( \sum_{i=0}^{N} w_i\,\mu(i) - \mu \right) - \lambda_2 \left( \sum_{i=0}^{N} w_i - 1 \right), \qquad (92)$$
where $\lambda_1$ and $\lambda_2$ are two Lagrange multipliers. Differentiating with respect to $w_0$ yields

$$\lambda_2 = -\mu_0\,\lambda_1, \qquad (93)$$

which by substitution in equation (92) gives

$$\mathcal{L} = C_n - \lambda_1 \left( \sum_{i=1}^{N} w_i\,(\mu(i) - \mu_0) - (\mu - \mu_0) \right). \qquad (94)$$
Let us now differentiate $\mathcal{L}$ with respect to $w_i$, i ≥ 1; we obtain

$$n\,{w_i^*}^{n-1}\,C_n(i) - \lambda_1\,(\mu(i) - \mu_0) = 0, \qquad (95)$$

so that

$$w_i^* = \lambda_1^{\frac{1}{n-1}} \left( \frac{\mu(i) - \mu_0}{n\,C_n(i)} \right)^{\frac{1}{n-1}}. \qquad (96)$$

Applying the normalization constraint yields

$$w_0 + \lambda_1^{\frac{1}{n-1}} \sum_{i=1}^{N} \left( \frac{\mu(i) - \mu_0}{n\,C_n(i)} \right)^{\frac{1}{n-1}} = 1, \qquad (97)$$
thus

$$\lambda_1^{\frac{1}{n-1}} = \frac{1 - w_0}{\sum_{i=1}^{N} \left( \frac{\mu(i) - \mu_0}{n\,C_n(i)} \right)^{\frac{1}{n-1}}}, \qquad (98)$$

and finally

$$w_i^* = (1 - w_0)\; \frac{\left( \frac{\mu(i) - \mu_0}{C_n(i)} \right)^{\frac{1}{n-1}}}{\sum_{j=1}^{N} \left( \frac{\mu(j) - \mu_0}{C_n(j)} \right)^{\frac{1}{n-1}}}. \qquad (99)$$

Let us now define the portfolio Π, exclusively made of risky assets, with weights

$$\tilde{w}_i = \frac{\left( \frac{\mu(i) - \mu_0}{C_n(i)} \right)^{\frac{1}{n-1}}}{\sum_{j=1}^{N} \left( \frac{\mu(j) - \mu_0}{C_n(j)} \right)^{\frac{1}{n-1}}}, \qquad i \ge 1. \qquad (100)$$
The optimal portfolio P can be split in two funds: the risk-free asset, whose weight is $w_0$, and the risky fund Π, with weight $1 - w_0$. The expected return of the portfolio P is thus

$$\mu = w_0\,\mu_0 + (1 - w_0)\,\mu_\Pi, \qquad (101)$$

where $\mu_\Pi$ denotes the expected return of portfolio Π:

$$\mu_\Pi = \frac{\sum_{i=1}^{N} \mu(i)\left( \frac{\mu(i) - \mu_0}{C_n(i)} \right)^{\frac{1}{n-1}}}{\sum_{i=1}^{N} \left( \frac{\mu(i) - \mu_0}{C_n(i)} \right)^{\frac{1}{n-1}}}. \qquad (102)$$
The risk associated with P, measured by the cumulant $C_n$ of order n, is

$$C_n = (1 - w_0)^n\; \frac{\sum_{i=1}^{N} C_n(i)\left( \frac{\mu(i) - \mu_0}{C_n(i)} \right)^{\frac{n}{n-1}}}{\left[ \sum_{i=1}^{N} \left( \frac{\mu(i) - \mu_0}{C_n(i)} \right)^{\frac{1}{n-1}} \right]^n}. \qquad (103)$$
Putting together the three last equations allows us to obtain the equation of the efficient frontier:

$$\mu = \mu_0 + \left[ \sum_{i=1}^{N} \frac{(\mu(i) - \mu_0)^{\frac{n}{n-1}}}{C_n(i)^{\frac{1}{n-1}}} \right]^{\frac{n-1}{n}} \cdot C_n^{\frac{1}{n}}, \qquad (104)$$

which is a straight line in the plane $(C_n^{1/n}, \mu)$.
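As a sanity check, the chain of relations (100)-(104) can be verified numerically. The rates and cumulants below are hypothetical, chosen only to exercise the formulas:

```python
# Minimal numerical check (illustrative values) of equations (99)-(104):
# for independent assets and risk measured by the cumulant C_n, efficient
# portfolios lie on a straight line in the (C_n^{1/n}, mu) plane.

n = 4                        # order of the cumulant used as risk measure
mu0 = 0.0002                 # risk-free rate (hypothetical)
mu = [0.001, 0.0015, 0.002]  # expected returns of the risky assets
Cn = [2e-9, 5e-9, 1e-8]      # cumulants of order n (hypothetical)

# Weights of the purely risky fund Pi, equation (100)
raw = [((mu[i] - mu0) / Cn[i]) ** (1.0 / (n - 1)) for i in range(3)]
wt = [r / sum(raw) for r in raw]

mu_pi = sum(wt[i] * mu[i] for i in range(3))        # equation (102)
cn_pi = sum(Cn[i] * wt[i] ** n for i in range(3))   # risk of the fund Pi

# Slope of the efficient frontier, written two equivalent ways
slope = (mu_pi - mu0) / cn_pi ** (1.0 / n)
A = sum((mu[i] - mu0) ** (n / (n - 1)) / Cn[i] ** (1 / (n - 1)) for i in range(3))
assert abs(slope - A ** ((n - 1) / n)) < 1e-9 * slope   # bracket of (104)

for w0 in (0.0, 0.3, 0.7):
    mu_p = w0 * mu0 + (1 - w0) * mu_pi               # equation (101)
    cn_p = (1 - w0) ** n * cn_pi                     # equation (103)
    # The point (cn_p^{1/n}, mu_p) falls on the straight line (104)
    assert abs(mu_p - (mu0 + slope * cn_p ** (1.0 / n))) < 1e-12
```
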
B.2 General case

Let us now consider the more realistic case when the risky assets are dependent and/or when the risk is measured by any risk measure $\rho_\alpha$ obeying the axioms I-IV presented in section 2, where α denotes the degree of homogeneity of $\rho_\alpha$. Equation (94) still holds (with $C_n$ replaced by $\rho_\alpha$), and the differentiation with respect to $w_i$, i ≥ 1, yields the set of equations

$$\frac{\partial \rho_\alpha}{\partial w_i}(w_1^*, \cdots, w_N^*) = \lambda_1\,(\mu(i) - \mu_0), \qquad i \in \{1, \cdots, N\}. \qquad (105)$$
Since $\rho_\alpha(w_1, \cdots, w_N)$ is a homogeneous function of order α, its first-order derivative with respect to $w_i$ is also a homogeneous function, of order α − 1. Using this homogeneity property allows us to write

$$\lambda_1^{-1}\,\frac{\partial \rho_\alpha}{\partial w_i}(w_1^*, \cdots, w_N^*) = \mu(i) - \mu_0, \qquad i \in \{1, \cdots, N\}, \qquad (106)$$

$$\frac{\partial \rho_\alpha}{\partial w_i}\!\left( \lambda_1^{-\frac{1}{\alpha-1}} w_1^*, \cdots, \lambda_1^{-\frac{1}{\alpha-1}} w_N^* \right) = \mu(i) - \mu_0, \qquad i \in \{1, \cdots, N\}. \qquad (107)$$

Denoting by $\{\hat{w}_1, \cdots, \hat{w}_N\}$ the solution of

$$\frac{\partial \rho_\alpha}{\partial w_i}(\hat{w}_1, \cdots, \hat{w}_N) = \mu(i) - \mu_0, \qquad i \in \{1, \cdots, N\}, \qquad (108)$$

this shows that the optimal weights are

$$w_i^* = \lambda_1^{\frac{1}{\alpha-1}}\,\hat{w}_i. \qquad (109)$$
Now, performing the same calculation as in the case of independent risky assets, the efficient portfolio P can be realized by investing a weight $w_0$ of the initial wealth in the risk-free asset and the weight $1 - w_0$ in the risky fund Π, whose weights are given by

$$\tilde{w}_i = \frac{\hat{w}_i}{\sum_{j=1}^{N} \hat{w}_j}. \qquad (110)$$

Therefore, the expected return of every efficient portfolio is

$$\mu = w_0\,\mu_0 + (1 - w_0)\,\mu_\Pi, \qquad (111)$$

where $\mu_\Pi$ denotes the expected return of the market portfolio Π, while the risk, measured by $\rho_\alpha$, is

$$\rho_\alpha = (1 - w_0)^\alpha\,\rho_\alpha(\Pi), \qquad (112)$$

so that

$$\mu = \mu_0 + \frac{\mu_\Pi - \mu_0}{\rho_\alpha(\Pi)^{1/\alpha}}\;\rho_\alpha^{1/\alpha}. \qquad (113)$$

This expression is the natural generalization of the relation obtained by (Markowitz 1959) for mean-variance efficient portfolios.
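For the special case $\rho_\alpha$ = variance (α = 2), equations (108)-(113) can be checked with a small sketch; the covariance matrix and expected returns below are hypothetical:

```python
# Sketch (hypothetical numbers) of the two-fund theorem for rho_alpha = variance
# (alpha = 2). The fixed-point portfolio w_hat solves
#   d rho / d w_i (w_hat) = mu(i) - mu0      (equation (108)),
# which for the variance C2 = w' V w is the linear system 2 V w_hat = mu - mu0.

mu0 = 0.01
mu = (0.05, 0.08)
V = ((0.04, 0.01),
     (0.01, 0.09))          # covariance matrix (hypothetical)

# Solve 2 V w_hat = mu - mu0 by hand for the 2x2 case
det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
b = (mu[0] - mu0, mu[1] - mu0)
w_hat = ((V[1][1] * b[0] - V[0][1] * b[1]) / (2 * det),
         (-V[1][0] * b[0] + V[0][0] * b[1]) / (2 * det))

# Risky fund Pi, equation (110)
s = w_hat[0] + w_hat[1]
wt = (w_hat[0] / s, w_hat[1] / s)

def variance(w):
    return sum(w[i] * V[i][j] * w[j] for i in range(2) for j in range(2))

mu_pi = wt[0] * mu[0] + wt[1] * mu[1]
sd_pi = variance(wt) ** 0.5

# Any mix of the risk-free asset and Pi is efficient; check equation (113):
# mu = mu0 + (mu_pi - mu0)/sd_pi * sd along that line.
for w0 in (0.0, 0.5, -0.2):            # -0.2: borrowing at the risk-free rate
    w = ((1 - w0) * wt[0], (1 - w0) * wt[1])
    mu_p = w0 * mu0 + w[0] * mu[0] + w[1] * mu[1]
    sd_p = variance(w) ** 0.5
    assert abs(mu_p - (mu0 + (mu_pi - mu0) / sd_pi * sd_p)) < 1e-12
```
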
C Composition of the market portfolio

In this appendix, we derive the relationship between the composition of the market portfolio and the composition of the optimal portfolio Π obtained by the minimization of the risks measured by $\rho_\alpha(n)$.
C.1 Homogeneous case

We first consider a homogeneous market, peopled with agents choosing their optimal portfolio with respect to the same risk measure $\rho_\alpha$. A given agent p invests a fraction $w_0(p)$ of his wealth $W(p)$ in the risk-free asset and a fraction $1 - w_0(p)$ in the optimal portfolio Π. Therefore, the total demand $D_i$ of asset i is the sum of the demands $D_i(p)$ of all agents p for asset i:

$$D_i = \sum_p D_i(p) \qquad (114)$$
$$= \sum_p W(p)\,(1 - w_0(p))\,\tilde{w}_i \qquad (115)$$
$$= \tilde{w}_i \sum_p W(p)\,(1 - w_0(p)), \qquad (116)$$

where the $\tilde{w}_i$'s are given by (110). The aggregate demand D over all assets is

$$D = \sum_i D_i \qquad (117)$$
$$= \sum_i \tilde{w}_i \sum_p W(p)\,(1 - w_0(p)) \qquad (118)$$
$$= \sum_p W(p)\,(1 - w_0(p)). \qquad (119)$$

By definition, the weight of asset i in the market portfolio, denoted by $w_i^m$, equals the ratio of its capitalization (the supply $S_i$ of asset i) over the total capitalization of the market $S = \sum_i S_i$. At equilibrium, demand equals supply, so that

$$w_i^m = \frac{S_i}{S} = \frac{D_i}{D} = \tilde{w}_i. \qquad (121)$$

Thus, at equilibrium, the optimal portfolio Π is the market portfolio.
C.2 Heterogeneous case

We now consider a heterogeneous market, defined such that the agents choose their optimal portfolios with respect to different risk measures: some of them choose the usual mean-variance optimal portfolios, others prefer any mean-$\rho_\alpha$ efficient portfolio, and so on. Let us denote by $\Pi_n$ the mean-$\rho_\alpha(n)$ optimal portfolio made only of risky assets, and by $\phi_n$ the fraction of agents who choose the mean-$\rho_\alpha(n)$ efficient portfolios. By normalization, $\sum_n \phi_n = 1$. The demand $D_i(n)$ of asset i from the agents optimizing with respect to $\rho_\alpha(n)$ is

$$D_i(n) = \sum_{p \in S_n} W(p)\,(1 - w_0(p))\,\tilde{w}_i(n) \qquad (122)$$
$$= \tilde{w}_i(n) \sum_{p \in S_n} W(p)\,(1 - w_0(p)), \qquad (123)$$
where $S_n$ denotes the set of agents, among all the agents, who follow the optimization strategy with respect to $\rho_\alpha(n)$. Thus, the total demand for asset i is

$$D_i = N \sum_n \phi_n\,D_i(n) \qquad (124)$$
$$= N \sum_n \phi_n\,\tilde{w}_i(n) \sum_{p \in S_n} W(p)\,(1 - w_0(p)), \qquad (125)$$

where N is the total number of agents. This finally yields the total demand D for all assets and all agents:

$$D = \sum_i D_i \qquad (126)$$
$$= N \sum_n \sum_i \phi_n\,\tilde{w}_i(n) \sum_{p \in S_n} W(p)\,(1 - w_0(p)) \qquad (127)$$
$$= N \sum_n \phi_n \sum_{p \in S_n} W(p)\,(1 - w_0(p)), \qquad (128)$$

since $\sum_i \tilde{w}_i(n) = 1$ for every n. Thus, setting

$$\gamma_n = \frac{\phi_n \sum_{p \in S_n} W(p)\,(1 - w_0(p))}{\sum_{n'} \phi_{n'} \sum_{p \in S_{n'}} W(p)\,(1 - w_0(p))}, \qquad (129)$$
the market portfolio is the weighted sum of the mean-$\rho_\alpha(n)$ optimal portfolios $\Pi_n$:

$$w_i^m = \frac{S_i}{S} = \frac{D_i}{D} = \sum_n \gamma_n\,\tilde{w}_i(n). \qquad (130)$$
D Generalized capital asset pricing model

Our proof of the generalized capital asset pricing model is similar to the usual demonstration of the CAPM. Let us consider an efficient portfolio P. It necessarily satisfies equation (105) of appendix B:

$$\frac{\partial \rho_\alpha}{\partial w_i}(w_1^*, \cdots, w_N^*) = \lambda_1\,(\mu(i) - \mu_0), \qquad i \in \{1, \cdots, N\}. \qquad (131)$$

Let us now choose any portfolio R made only of risky assets and denote by $w_i(R)$ its weights. We can thus write

$$\sum_{i=1}^{N} w_i(R)\,\frac{\partial \rho_\alpha}{\partial w_i}(w_1^*, \cdots, w_N^*) = \lambda_1 \sum_{i=1}^{N} w_i(R)\,(\mu(i) - \mu_0) \qquad (132)$$
$$= \lambda_1\,(\mu_R - \mu_0). \qquad (133)$$
We can apply this last relation to the market portfolio Π, since it is composed only of risky assets (as proved in appendix B). Setting $w_i(R) = w_i^*$ and $\mu_R = \mu_\Pi$, we obtain

$$\sum_{i=1}^{N} w_i^*\,\frac{\partial \rho_\alpha}{\partial w_i}(w_1^*, \cdots, w_N^*) = \lambda_1\,(\mu_\Pi - \mu_0), \qquad (134)$$

which, by the homogeneity of the risk measure $\rho_\alpha$, yields

$$\alpha \cdot \rho_\alpha(w_1^*, \cdots, w_N^*) = \lambda_1\,(\mu_\Pi - \mu_0). \qquad (135)$$
Substituting equation (131) into (135) allows us to obtain

$$\mu_j - \mu_0 = \beta_\alpha^j \cdot (\mu_\Pi - \mu_0), \qquad (136)$$

where

$$\beta_\alpha^j = \frac{1}{\alpha}\,\frac{\partial \ln \rho_\alpha}{\partial w_j}, \qquad (137)$$

calculated at the point $\{w_1^*, \cdots, w_N^*\}$. Expression (136) together with (137) provides our CAPM, generalized with respect to the risk measures $\rho_\alpha$.
In the case where $\rho_\alpha$ denotes the variance, the second-order centered moment is equal to the second-order cumulant and reads, for two assets,

$$C_2 = {w_1^*}^2\,\mathrm{Var}[X_1] + 2\,w_1^* w_2^*\,\mathrm{Cov}(X_1, X_2) + {w_2^*}^2\,\mathrm{Var}[X_2] \qquad (138)$$
$$= \mathrm{Var}[\Pi]. \qquad (139)$$

Since

$$\frac{1}{2}\,\frac{\partial C_2}{\partial w_1} = w_1^*\,\mathrm{Var}[X_1] + w_2^*\,\mathrm{Cov}(X_1, X_2) \qquad (140)$$
$$= \mathrm{Cov}(X_1, \Pi), \qquad (141)$$

we find

$$\beta = \frac{\mathrm{Cov}(X_1, X_\Pi)}{\mathrm{Var}[X_\Pi]}, \qquad (142)$$

which is the standard result of the CAPM derived from mean-variance theory.
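The reduction of the generalized beta (137) to the classical ratio (142) when $\rho_\alpha$ is the variance can be checked numerically; the 2-asset data below are hypothetical, and the derivative is approximated by a central finite difference:

```python
import math

# Numerical check (hypothetical 2-asset portfolio) that for rho_alpha = variance
# the generalized beta of equation (137), (1/alpha) d ln(rho)/d w_j, reduces to
# the classical Cov(X_j, Pi)/Var(Pi) of equation (142). Here alpha = 2.

V = ((0.04, 0.01),
     (0.01, 0.09))          # covariance matrix (hypothetical)
w = (0.7, 0.3)              # portfolio weights (hypothetical)

def var(w):
    return sum(w[i] * V[i][j] * w[j] for i in range(2) for j in range(2))

# Classical beta of asset 1: Cov(X1, Pi) / Var(Pi)
cov1 = w[0] * V[0][0] + w[1] * V[0][1]
beta_classic = cov1 / var(w)

# Generalized beta via a central finite difference of (1/2) d ln(C2)/d w1
h = 1e-6
dlnC2 = (math.log(var((w[0] + h, w[1]))) -
         math.log(var((w[0] - h, w[1])))) / (2 * h)
beta_general = 0.5 * dlnC2

assert abs(beta_classic - beta_general) < 1e-6
```
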
E Calculation of the moments of the distribution of portfolio returns

Let us start with equation (72) in the 2-asset case:

$$\hat{P}_S(k) = \frac{1}{2\pi\sqrt{1-\rho^2}} \int dy_1\,dy_2\; \exp\left[ -\frac{1}{2}\,y^t V^{-1} y + ik\left( \chi_1 w_1\,\mathrm{sgn}(y_1)\left|\frac{y_1}{\sqrt{2}}\right|^{q_1} + \chi_2 w_2\,\mathrm{sgn}(y_2)\left|\frac{y_2}{\sqrt{2}}\right|^{q_2} \right) \right]. \qquad (143)$$
Expanding the exponential and using the definition (67) of the moments, we get

$$M_n = \frac{1}{2\pi\sqrt{1-\rho^2}} \int dy_1 \int dy_2 \sum_{p=0}^{n} \binom{n}{p}\,\chi_1^p \chi_2^{n-p}\,w_1^p w_2^{n-p}\, \mathrm{sgn}(y_1)^p \left|\frac{y_1}{\sqrt{2}}\right|^{q_1 p} \mathrm{sgn}(y_2)^{n-p} \left|\frac{y_2}{\sqrt{2}}\right|^{q_2(n-p)} e^{-\frac{1}{2} y^t V^{-1} y}. \qquad (144)$$

Posing

$$\gamma_{q_1 q_2}(n,p) = \frac{\chi_1^p\,\chi_2^{n-p}}{2\pi\sqrt{1-\rho^2}} \int dy_1\,dy_2\; \mathrm{sgn}(y_1)^p \left|\frac{y_1}{\sqrt{2}}\right|^{q_1 p} \mathrm{sgn}(y_2)^{n-p} \left|\frac{y_2}{\sqrt{2}}\right|^{q_2(n-p)} e^{-\frac{1}{2} y^t V^{-1} y}, \qquad (145)$$

this leads to

$$M_n = \sum_{p=0}^{n} \binom{n}{p}\,w_1^p w_2^{n-p}\,\gamma_{q_1 q_2}(n,p). \qquad (146)$$
Let us define the auxiliary variables α and β such that

$$\alpha = (V^{-1})_{11} = (V^{-1})_{22} = \frac{1}{1-\rho^2}, \qquad \beta = -(V^{-1})_{12} = -(V^{-1})_{21} = \frac{\rho}{1-\rho^2}. \qquad (147)$$

Performing a simple change of variables in (145), we can transform the integral so that it is defined solely within the first quadrant $(y_1 \ge 0, y_2 \ge 0)$, namely

$$\gamma_{q_1 q_2}(n,p) = \chi_1^p \chi_2^{n-p}\,\frac{1+(-1)^n}{2\pi\sqrt{1-\rho^2}} \int_0^{+\infty}\! dy_1 \int_0^{+\infty}\! dy_2 \left(\frac{y_1}{\sqrt{2}}\right)^{q_1 p} \left(\frac{y_2}{\sqrt{2}}\right)^{q_2(n-p)} e^{-\frac{\alpha}{2}(y_1^2+y_2^2)} \left( e^{\beta y_1 y_2} + (-1)^p\,e^{-\beta y_1 y_2} \right). \qquad (148)$$
This equation imposes that the coefficients γ vanish if n is odd, which leads to the vanishing of the moments of odd order, as expected for a symmetric distribution. Then, expanding $e^{\beta y_1 y_2} + (-1)^p e^{-\beta y_1 y_2}$ in series and permuting the sum and the integral allows us to decouple the integrations over the two variables $y_1$ and $y_2$:

$$\gamma_{q_1 q_2}(n,p) = \chi_1^p \chi_2^{n-p}\,\frac{1+(-1)^n}{2\pi\sqrt{1-\rho^2}} \sum_{s=0}^{+\infty} \left[1+(-1)^{p+s}\right] \frac{\beta^s}{s!} \left( \int_0^{+\infty} dy_1\,\frac{y_1^{q_1 p + s}}{2^{q_1 p/2}}\,e^{-\frac{\alpha}{2} y_1^2} \right) \left( \int_0^{+\infty} dy_2\,\frac{y_2^{q_2(n-p) + s}}{2^{q_2(n-p)/2}}\,e^{-\frac{\alpha}{2} y_2^2} \right). \qquad (149)$$
This brings us back to the problem of calculating the same type of integrals as in the uncorrelated case. Using the expressions of α and β, and taking into account the parity of n and p, we obtain

$$\gamma_{q_1 q_2}(2n, 2p) = \chi_1^{2p}\chi_2^{2n-2p}\,\frac{(1-\rho^2)^{q_1 p + q_2(n-p) + \frac{1}{2}}}{\pi} \sum_{s=0}^{+\infty} \frac{(2\rho)^{2s}}{(2s)!}\,\Gamma\!\left(q_1 p + s + \frac{1}{2}\right)\Gamma\!\left(q_2(n-p) + s + \frac{1}{2}\right), \qquad (150)$$

$$\gamma_{q_1 q_2}(2n, 2p+1) = \chi_1^{2p+1}\chi_2^{2n-2p-1}\,\frac{(1-\rho^2)^{q_1 p + q_2(n-p) + \frac{q_1-q_2+1}{2}}}{\pi} \sum_{s=0}^{+\infty} \frac{(2\rho)^{2s+1}}{(2s+1)!}\,\Gamma\!\left(q_1 p + s + 1 + \frac{q_1}{2}\right)\Gamma\!\left(q_2(n-p) + s + 1 - \frac{q_2}{2}\right). \qquad (151)$$

Using the definition of the hypergeometric function ${}_2F_1$ (Abramowitz and Stegun 1972) and relation (9.131) of (Gradshteyn and Ryzhik 1965), we finally obtain

$$\gamma_{q_1 q_2}(2n, 2p) = \chi_1^{2p}\chi_2^{2n-2p}\,\frac{\Gamma\!\left(q_1 p + \frac{1}{2}\right)\Gamma\!\left(q_2(n-p) + \frac{1}{2}\right)}{\pi}\; {}_2F_1\!\left(-q_1 p,\,-q_2(n-p);\,\frac{1}{2};\,\rho^2\right), \qquad (152)$$

$$\gamma_{q_1 q_2}(2n, 2p+1) = \chi_1^{2p+1}\chi_2^{2n-2p-1}\,\frac{2\,\Gamma\!\left(q_1 p + 1 + \frac{q_1}{2}\right)\Gamma\!\left(q_2(n-p) + 1 - \frac{q_2}{2}\right)\rho}{\pi}\; {}_2F_1\!\left(-q_1 p - \frac{q_1-1}{2},\,-q_2(n-p) + \frac{q_2+1}{2};\,\frac{3}{2};\,\rho^2\right). \qquad (153)$$

In the asymmetric case, a similar calculation follows, with the sole difference that the result involves four terms in the integral (148) instead of two.
F Conditions under which it is possible to increase the return and decrease large risks simultaneously

We consider N independent assets $\{1, \cdots, N\}$, whose expected returns are denoted by $\mu(1), \cdots, \mu(N)$. We aggregate these assets in a portfolio with weights $w_1, \cdots, w_N$. We consider that short positions are forbidden and that $\sum_i w_i = 1$. The return µ of the portfolio is

$$\mu = \sum_{i=1}^{N} w_i\,\mu(i). \qquad (154)$$

The risk of the portfolio is quantified by the cumulants of the distribution of µ. Let us denote by $\mu_n^*$ the return of the portfolio evaluated at the asset weights which minimize the cumulant of order n.
F.1 Case of two assets

Let $C_n$ be the cumulant of order n of the portfolio. The assets being independent, we have

$$C_n = C_n(1)\,w_1^n + C_n(2)\,w_2^n \qquad (155)$$
$$= C_n(1)\,w^n + C_n(2)\,(1-w)^n, \qquad (156)$$

where, in the following, we drop the subscript 1 in $w_1$ and simply write w. Let us evaluate the value $w = w^*$ at the minimum of $C_n$, n > 2:

$$\frac{dC_n}{dw} = 0 \iff C_n(1)\,w^{n-1} - C_n(2)\,(1-w)^{n-1} = 0 \qquad (157)$$
$$\iff \frac{C_n(1)}{C_n(2)} = \left(\frac{1-w^*}{w^*}\right)^{n-1}, \qquad (158)$$

and, assuming that $C_n(1)/C_n(2) > 0$, which is satisfied according to our positivity axiom 1, we obtain

$$w^* = \frac{C_n(2)^{\frac{1}{n-1}}}{C_n(1)^{\frac{1}{n-1}} + C_n(2)^{\frac{1}{n-1}}}. \qquad (159)$$

This leads to the following expression for $\mu_n^*$:

$$\mu_n^* = \frac{\mu(1)\,C_n(2)^{\frac{1}{n-1}} + \mu(2)\,C_n(1)^{\frac{1}{n-1}}}{C_n(1)^{\frac{1}{n-1}} + C_n(2)^{\frac{1}{n-1}}}. \qquad (160)$$

Thus, after simple algebraic manipulations, we find

$$\mu_n^* < \mu_{n+k}^* \iff (\mu(1) - \mu(2))\left( C_n(1)^{\frac{1}{n-1}}\,C_{n+k}(2)^{\frac{1}{n+k-1}} - C_n(2)^{\frac{1}{n-1}}\,C_{n+k}(1)^{\frac{1}{n+k-1}} \right) > 0, \qquad (161)$$

which concludes the proof of the result announced in the main body of the text.
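Equation (159) is easy to check numerically; the cumulant values below are hypothetical:

```python
# Numerical check of equation (159) with hypothetical cumulants: the weight
# w* minimizes C_n(w) = C_n(1) w^n + C_n(2) (1-w)^n on [0, 1].

n = 4
c1, c2 = 3.0, 1.0          # cumulants of order n of the two assets

w_star = c2 ** (1 / (n - 1)) / (c1 ** (1 / (n - 1)) + c2 ** (1 / (n - 1)))

def Cn(w):
    return c1 * w ** n + c2 * (1 - w) ** n

# Brute-force scan: no sampled point does better than w*
assert all(Cn(w_star) <= Cn(k / 1000) + 1e-12 for k in range(1001))
# First-order condition (157) holds at w*
assert abs(c1 * w_star ** (n - 1) - c2 * (1 - w_star) ** (n - 1)) < 1e-12
```
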
F.2 General case

We now consider a portfolio with N independent assets. Assuming that the cumulants $C_n(i)$ have the same sign for all i (according to axiom 1), we are going to show that the minimum of $C_n$ is obtained for a portfolio whose weights are given by

$$w_i = \frac{\prod_{j \neq i} C_n(j)^{\frac{1}{n-1}}}{\sum_{j=1}^{N} \prod_{k \neq j} C_n(k)^{\frac{1}{n-1}}}, \qquad (162)$$

and we then have

$$\mu_n^* = \frac{\sum_{i=1}^{N} \mu(i) \prod_{j \neq i} C_n(j)^{\frac{1}{n-1}}}{\sum_{j=1}^{N} \prod_{k \neq j} C_n(k)^{\frac{1}{n-1}}}. \qquad (163)$$

Indeed, the cumulant of the portfolio is given by

$$C_n = \sum_{i=1}^{N} C_n(i)\,w_i^n, \qquad (164)$$

subject to the constraint

$$\sum_{i=1}^{N} w_i = 1. \qquad (165)$$

Introducing a Lagrange multiplier λ, the first-order conditions yield

$$C_n(i)\,w_i^{n-1} - \lambda = 0, \qquad \forall i \in \{1, \cdots, N\}, \qquad (166)$$

so that

$$w_i^{n-1} = \frac{\lambda}{C_n(i)}. \qquad (167)$$

Since all the $C_n(i)$ are positive, we can find a λ such that all the $w_i$ are real and positive, which yields the announced result (162). From here, there is no simple condition ensuring $\mu_n^* < \mu_{n+k}^*$; the simplest way to compare $\mu_n^*$ and $\mu_{n+k}^*$ is to calculate these quantities directly using formula (163).
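A numerical check of (162) and of the first-order conditions (166), using the equivalent normalized form $w_i \propto C_n(i)^{-1/(n-1)}$ (the cumulants below are hypothetical):

```python
# Numerical check of equation (162) with hypothetical cumulants: at the
# minimum, the first-order conditions (166) make C_n(i) w_i^{n-1} equal
# across assets (a common Lagrange multiplier), and the weights sum to one.

n = 4
C = [1.0, 2.0, 5.0, 0.5]   # cumulants of order n (hypothetical, all > 0)

# Equivalent form of (162): w_i proportional to C_n(i)^{-1/(n-1)}
inv = [c ** (-1 / (n - 1)) for c in C]
w = [x / sum(inv) for x in inv]

lam = [C[i] * w[i] ** (n - 1) for i in range(len(C))]
assert abs(sum(w) - 1) < 1e-12
assert max(lam) - min(lam) < 1e-12          # common Lagrange multiplier
```
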
References

Acerbi, C., 2002, Spectral measures of risk: a coherent representation of subjective risk aversion, Journal of Banking and Finance.

Alexander, G.J. and A.M. Baptista, 2002, Economic implications of using a mean-VaR model for portfolio selection: A comparison with mean-variance analysis, Journal of Economic Dynamics and Control 26, 1159-1193.

Abramowitz, M. and I.A. Stegun, 1972, Handbook of Mathematical Functions (Dover Publications, New York).

Allais, M., 1953, Le comportement de l'homme rationnel devant le risque : critique des postulats et axiomes de l'école américaine, Econometrica 21, 503-546.

Allais, M., 1990, Allais Paradox, in The New Palgrave: Utility and Probability (Macmillan), 3-9.

Andersen, J.V. and D. Sornette, 2001, Have your cake and eat it too: increasing returns while lowering large risks! Journal of Risk Finance 2, 70-82.

Artzner, P., F. Delbaen, J.M. Eber and D. Heath, 1997, Thinking coherently, Risk 10, 68-71.

Artzner, P., F. Delbaen, J.M. Eber and D. Heath, 1999, Coherent measures of risk, Math. Finance 9, 203-228.

Bouchaud, J.-P., D. Sornette, C. Walter and J.-P. Aguilar, 1998, Taming large events: Optimal portfolio theory for strongly fluctuating assets, International Journal of Theoretical and Applied Finance 1, 25-41.

Dowd, K., 2000, Adjusting for risk: an improved Sharpe ratio, International Review of Economics and Finance 9, 209-222.

Embrechts, P., C. Kluppelberg and T. Mikosch, 1997, Modelling Extremal Events (Springer-Verlag, Applications of Mathematics 33).

Embrechts, P., A. McNeil and D. Straumann, 1998, Correlation and dependence in risk management: properties and pitfalls, Proceedings of the Risk Management Workshop at the Newton Institute Cambridge, Cambridge University Press.

Fang, H. and T. Lai, 1997, Co-kurtosis and capital asset pricing, Financial Review 32, 293-307.

Föllmer, H. and A. Schied, 2002, Convex measures of risk and trading constraints, Finance and Stochastics 6, forthcoming.

Föllmer, H. and A. Schied, 2002, Robust preferences and convex measures of risk, Working paper.

Frisch, U. and D. Sornette, 1997, Extreme deviations and applications, J. Phys. I France 7, 1155-1171.

Gopikrishnan, P., M. Meyer, L.A. Nunes Amaral and H.E. Stanley, 1998, Inverse cubic law for the distribution of stock price variations, European Physical Journal B 3, 139-140.

Gouriéroux, C., J.P. Laurent and O. Scaillet, 2000, Sensitivity analysis of Values at Risk, Journal of Empirical Finance 7, 225-245.

Gradshteyn, I.S. and I.M. Ryzhik, 1965, Table of Integrals, Series, and Products (Academic Press).

Harvey, C.R. and A. Siddique, 2000, Conditional skewness in asset pricing tests, Journal of Finance 55, 1263-1295.

Hill, B.M., 1975, A simple general approach to inference about the tail of a distribution, Annals of Statistics 3(5), 1163-1174.

Hwang, S. and S. Satchell, 1999, Modelling emerging market risk premia using higher moments, International Journal of Finance and Economics 4, 271-296.

Jorion, P., 1997, Value-at-Risk: The New Benchmark for Controlling Derivatives Risk (Irwin Publishing, Chicago, IL).

Jurczenko, E. and B. Maillet, 2002, The four-moment capital asset pricing model: some basic results, Working paper.

Karlen, D., 1998, Using projection and correlation to approximate probability distributions, Computers in Physics 12, 380-384.

Kraus, A. and R. Litzenberger, 1976, Skewness preference and the valuation of risk assets, Journal of Finance 31, 1085-1099.

Laherrère, J. and D. Sornette, 1998, Stretched exponential distributions in nature and economy: "fat tails" with characteristic scales, European Physical Journal B 2, 525-539.

Lim, K.G., 1989, A new test for the three-moment capital asset pricing model, Journal of Financial and Quantitative Analysis 24, 205-216.

Lindskog, F., 2000, Modelling dependence with copulas, http://www.risklab.ch/Papers.html#MTLindskog

Lintner, J., 1965, The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets, Review of Economics and Statistics 47, 13-37.

Litterman, R. and K. Winkelmann, 1998, Estimating Covariance Matrices (Risk Management Series, Goldman Sachs).

Lux, T., 1996, The stable Paretian hypothesis and the frequency of large returns: an examination of major German stocks, Applied Financial Economics 6, 463-475.

Malevergne, Y. and D. Sornette, 2001, Testing the Gaussian copula hypothesis for financial assets dependences, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=291140

Markowitz, H., 1959, Portfolio Selection: Efficient Diversification of Investments (John Wiley and Sons, New York).

Merton, R.C., 1990, Continuous-Time Finance (Blackwell, Cambridge).

Mossin, J., 1966, Equilibrium in a capital asset market, Econometrica 34, 768-783.

Muzy, J.-F., D. Sornette, J. Delour and A. Arneodo, 2001, Multifractal returns and hierarchical portfolio theory, Quantitative Finance 1(1), 131-148.

Nelsen, R.B., 1998, An Introduction to Copulas, Lecture Notes in Statistics 139 (Springer Verlag, New York).

Pagan, A., 1996, The econometrics of financial markets, Journal of Empirical Finance 3, 15-102.

Pickands, J., 1975, Statistical inference using extreme order statistics, Annals of Statistics 3(1), 119-131.

Polimenis, V., 2002, The distributional CAPM: Connecting risk premia to return distributions, Working paper.

Rao, C.R., 1973, Linear Statistical Inference and Its Applications, 2nd ed. (Wiley, New York).

Richardson, M. and T. Smith, 1993, A test for multivariate normality in stocks, Journal of Business 66, 295-321.

Rothschild, M. and J.E. Stiglitz, 1970, Increasing risk I: A definition, Journal of Economic Theory 2, 225-243.

Rothschild, M. and J.E. Stiglitz, 1971, Increasing risk II: Its economic consequences, Journal of Economic Theory 3, 66-84.

Rubinstein, M., 1973, The fundamental theorem of parameter-preference security valuation, Journal of Financial and Quantitative Analysis 8, 61-69.

Scaillet, O., 2000, Nonparametric estimation and sensitivity analysis of expected shortfall, Working paper.

Sharpe, W.F., 1964, Capital asset prices: A theory of market equilibrium under conditions of risk, Journal of Finance 19, 425-442.

Sharpe, W.F., 1966, Mutual fund performance, Journal of Business 39, 119-138.

Sharpe, W.F., 1994, The Sharpe ratio, Journal of Portfolio Management, 49-58.

Sornette, D., 1998, Large deviations and portfolio optimization, Physica A 256, 251-283.

Sornette, D., 2000, Critical Phenomena in Natural Sciences: Chaos, Fractals, Self-organization and Disorder: Concepts and Tools (Springer Series in Synergetics).

Sornette, D., J.V. Andersen and P. Simonetti, 2000a, Portfolio theory for "fat tails", International Journal of Theoretical and Applied Finance 3(3), 523-535.

Sornette, D., P. Simonetti and J.V. Andersen, 2000b, φq-field theory for portfolio optimization: "fat tails" and non-linear correlations, Physics Reports 335(2), 19-92.

Stuart, A. and J.K. Ord, 1994, Kendall's Advanced Theory of Statistics, 6th edition (Edward Arnold, London; Halsted Press, New York).

Tsallis, C., 1988, Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys. 52, 479-487; for an updated bibliography on this subject, see http://tsallis.cat.cbpf.br/biblio.htm

Von Neumann, J. and O. Morgenstern, 1944, Theory of Games and Economic Behavior (Princeton University Press).
µ        µ2^{1/2}   µ4^{1/4}   µ6^{1/6}   µ8^{1/8}
0.10%    0.92%      1.36%      1.79%      2.15%
0.12%    0.96%      1.43%      1.89%      2.28%
0.14%    1.05%      1.56%      2.06%      2.47%
0.16%    1.22%      1.83%      2.42%      2.91%
0.18%    1.47%      2.21%      2.92%      3.55%
0.20%    1.77%      2.65%      3.51%      4.22%

Table 1: This table presents the risk measured by µn^{1/n}, for n = 2, 4, 6, 8, for a given value of the expected (daily) return µ.

Asset                µ/µ2^{1/2}   µ/µ4^{1/4}   µ/µ6^{1/6}   µ/C4^{1/4}   µ/C6^{1/6}
Wal-Mart             0.0821       0.0555       0.0424       0.0710       0.0557
EMC                  0.0801       0.0552       0.0430       0.0730       0.0612
Intel                0.0737       0.0512       0.0397       0.0694       0.0532
Hewlett Packard      0.0724       0.0472       0.0354       0.0575       0.0439
IBM                  0.0705       0.0465       0.0346       0.0574       0.0421
Merck                0.0628       0.0415       0.0292       0.0513       0.0331
Procter & Gamble     0.0590       0.0399       0.0314       0.0510       0.0521
General Motors       0.0586       0.0362       0.0247       0.0418       0.0269
SBC Communication    0.0584       0.0386       0.0270       0.0477       0.0302
General Electric     0.0569       0.0334       0.0233       0.0373       0.0258
Applied Material     0.0525       0.0357       0.0269       0.0462       0.0338
MCI WorldCom         0.0441       0.0173       0.0096       0.0176       0.0098
Medtronic            0.0432       0.0278       0.0202       0.0333       0.0237
Coca-Cola            0.0430       0.0278       0.0207       0.0335       0.0252
Exxon-Mobil          0.0410       0.0256       0.0178       0.0299       0.0197
Texas Instrument     0.0324       0.0224       0.0171       0.0301       0.0218
Pfizer               0.0298       0.0184       0.0131       0.0213       0.0148

Table 2: This table presents the values of the generalized Sharpe ratios for the set of seventeen assets listed in the first column. The assets are ranked with respect to their Sharpe ratio, given in the second column. The third and fourth columns give the generalized Sharpe ratio calculated with respect to the fourth and sixth centered moments µ4 and µ6, while the fifth and sixth columns give the generalized Sharpe ratio calculated with respect to the fourth and sixth cumulants C4 and C6.
         Positive Tail                          Negative Tail
         <χ+>    <c+>    χ+      c+             <χ−>    <c−>    χ−      c−
CHF      2.45    1.61    2.33    1.26           2.34    1.53    1.72    0.93
DEM      2.09    1.65    1.74    1.03           2.01    1.58    1.45    0.91
JPY      2.10    1.28    1.30    0.76           1.89    1.47    0.99    0.76
MAL      1.00    1.22    1.25    0.41           1.01    1.25    0.44    0.48
POL      1.55    1.02    1.30    0.73           1.60    2.13    1.25    0.62
THA      0.78    0.75    0.75    0.54           0.82    0.73    0.30    0.38
UKP      1.89    1.52    1.38    0.92           2.00    1.41    1.82    1.09

Table 3: Table of the exponents c and the scale parameters χ for different currencies. The subscript "+" or "−" denotes the positive or negative part of the distribution of returns, and the terms between brackets refer to parameters estimated in the bulk of the distribution, while naked parameters refer to the tails of the distribution.
                     Positive Tail                          Negative Tail
                     <χ+>     <c+>    χ+      c+            <χ−>     <c−>    χ−      c−
Applied Material     12.47    1.82    8.75    0.99          11.94    1.66    8.11    0.98
Coca-Cola            5.38     1.88    4.46    1.04          5.06     1.74    2.98    0.78
EMC                  13.53    1.63    13.18   1.55          11.44    1.61    3.05    0.57
General Electric     5.21     1.89    1.81    1.28          4.80     1.81    4.31    1.16
General Motors       5.78     1.71    0.63    0.48          5.32     1.89    2.80    0.79
Hewlett Packard      7.51     1.93    4.20    0.84          7.26     1.76    1.66    0.52
IBM                  5.46     1.71    3.85    0.87          5.07     1.90    0.18    0.33
Intel                8.93     2.31    2.79    0.64          9.14     1.60    3.56    0.62
MCI WorldCom         9.80     1.74    11.01   1.56          9.09     1.56    2.86    0.58
Medtronic            6.82     1.95    6.09    1.11          6.49     1.54    2.55    0.67
Merck                5.36     1.91    4.56    1.16          5.00     1.73    1.32    0.59
Pfizer               6.41     2.01    5.84    1.27          6.04     1.70    0.26    0.35
Procter & Gamble     4.86     1.83    3.53    0.96          4.55     1.74    2.96    0.82
SBC Communication    5.21     1.97    1.26    0.59          4.89     1.59    1.56    0.60
Texas Instrument     9.06     1.78    4.07    0.72          8.24     1.84    2.18    0.54
Wal-Mart             7.41     1.83    5.81    1.01          6.80     1.64    3.75    0.78

Table 4: Table of the exponents c and the scale parameters χ for different stocks. The subscript "+" or "−" denotes the positive or negative part of the distribution, and the terms between brackets refer to parameters estimated in the bulk of the distribution, while naked parameters refer to the tails of the distribution.
                     Mean (10^−3)   Variance (10^−3)   Skewness   Kurtosis   min     max
Applied Material     2.11           1.62               0.41       4.68       −14%    21%
Coca-Cola            0.81           0.36               0.13       5.71       −11%    10%
EMC                  2.76           1.13               0.23       4.79       −18%    15%
Exxon-Mobil          0.92           0.25               0.30       5.26       −7%     11%
General Electric     1.38           0.30               0.08       4.46       −7%     8%
General Motors       0.64           0.39               0.12       4.35       −11%    8%
Hewlett Packard      1.17           0.81               0.16       6.58       −14%    21%
IBM                  1.32           0.54               0.08       8.43       −16%    13%
Intel                1.71           0.85               −0.31      6.88       −22%    14%
MCI WorldCom         0.87           0.85               −0.18      6.88       −20%    13%
Medtronic            1.70           0.55               0.23       5.52       −12%    12%
Merck                1.32           0.35               0.18       5.29       −9%     10%
Pfizer               1.57           0.46               0.01       4.28       −10%    10%
Procter & Gamble     0.90           0.41               −2.57      42.75      −31%    10%
SBC Communication    0.86           0.39               0.06       5.86       −13%    9%
Texas Instrument     2.20           1.23               0.50       5.26       −12%    24%
Wal-Mart             1.35           0.52               0.16       4.79       −10%    9%

Table 5: This table presents the main statistical features of the daily returns of the set of seventeen assets studied here over the time interval from the end of January 1995 to the end of December 2000.
[Figure: x·P(x), x²·P(x) and x⁴·P(x) versus x]

Figure 1: This figure represents the function x^n · e^{−x} for n = 1, 2 and 4. It shows the typical size of the fluctuations involved in the moment of order n.
[Figure: µ (daily return) versus µn^{1/n}]

Figure 2: This figure represents the generalized efficient frontier for a portfolio made of seventeen risky assets. The optimization problem is solved numerically, using a genetic algorithm, with risk measures given respectively by the centered moments µ2, µ4, µ6 and µ8. The straight lines are the efficient frontiers when we add to these assets a risk-free asset whose interest rate is set to 5% a year.
[Figure: µ (daily return) versus µn^{1/n}]

Figure 3: This figure represents the generalized efficient frontier for a portfolio made of seventeen risky assets and a risk-free asset whose interest rate is set to 5% a year. The optimization problem is solved numerically, using a genetic algorithm, with risk measures given by the centered moments µ2, µ4, µ6 and µ8.
[Figure: four panels (Mean-µ2, Mean-µ4, Mean-µ6, Mean-µ8) showing the weights wi versus w0]

Figure 4: Dependence of the five largest weights of risky assets in the efficient portfolios found in figure 3, as a function of the weight w0 invested in the risk-free asset, for the four risk measures given by the centered moments µ2, µ4, µ6 and µ8. The same symbols always represent the same asset.
[Figure: µ versus Cn^{1/n}]

Figure 5: The dark and grey thick curves represent two efficient frontiers for a portfolio without risk-free interest rate obtained with two measures of risks. The dark and grey thin straight lines represent the efficient frontiers in the presence of a risk-free asset, whose value is given by the intercept of the straight lines with the ordinate axis. This illustrates the existence of an inversion of the dependence of the slope of the efficient frontier with risk-free asset as a function of the order n of the measures of risks, which can occur only when the efficient frontiers without risk-free asset cross each other.
Figure 6: Schematic representation of the nonlinear mapping Y = u(X) that allows one to transform a variable X with an arbitrary distribution into a variable Y with a Gaussian distribution. The probability densities for X and Y are plotted outside their respective axes. Consistent with the conservation of probability, the shaded regions have equal area. This conservation of probability determines the nonlinear mapping.
[Figure CHF-UKP: empirical cumulative distribution versus cumulative Normal distribution]

Figure 7: Quantile of the normalized sum of the Gaussianized returns of the Swiss Franc and the British Pound versus the quantile of the Normal distribution, for the time interval from Jan. 1971 to Oct. 1998. Different weights in the sum give similar results.
[Figure KO-PG: empirical cumulative distribution versus cumulative Normal distribution]

Figure 8: Quantile of the normalized sum of the Gaussianized returns of Coca-Cola and Procter & Gamble versus the quantile of the Normal distribution, for the time interval from Jan. 1970 to Dec. 2000. Different weights in the sum give similar results.
53
478
14. Gestion de Portefeuilles multimoments et e´ quilibre de march´e
[Figure 9: plot residue removed; panel MRK-GE; ordinate: Empirical Cumulative Distribution, abscissa: Cumulative Normal Distribution]
Figure 9: Quantile of the normalized sum of the Gaussianized returns of Merck and General Electric versus the quantile of the Normal distribution, for the time interval from Jan. 1970 to Dec. 2000. Different weights in the sum give similar results.
[Figure 10: plot residue removed; panel CHF-UKP; ordinate: cumulative distribution of z², abscissa: cumulative χ² distribution]
Figure 10: Cumulative distribution of z² = yᵗ V⁻¹ y versus the cumulative distribution of the chi-square (denoted χ²) with two degrees of freedom, for the pair Swiss Franc / British Pound, over the time interval from Jan. 1971 to Oct. 1998. This χ² should not be confused with the characteristic scale χ used in the definition of the modified Weibull distributions.
[Figure 11: plot residue removed; panel KO-PG; ordinate: cumulative distribution of z², abscissa: cumulative χ² distribution]
Figure 11: Cumulative distribution of z² = yᵗ V⁻¹ y versus the cumulative distribution of the chi-square χ² with two degrees of freedom, for the pair Coca-Cola / Procter&Gamble, over the time interval from Jan. 1970 to Dec. 2000. This χ² should not be confused with the characteristic scale χ used in the definition of the modified Weibull distributions.
[Figure 12: plot residue removed; panel MRK-GE; ordinate: cumulative distribution of z², abscissa: cumulative χ² distribution]
Figure 12: Cumulative distribution of z² = yᵗ V⁻¹ y versus the cumulative distribution of the chi-square χ² with two degrees of freedom, for the pair Merck / General Electric, over the time interval from Jan. 1970 to Dec. 2000. This χ² should not be confused with the characteristic scale χ used in the definition of the modified Weibull distributions.
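The test underlying Figures 10 to 12 can be sketched as follows: if the Gaussianized pair y had a Gaussian copula, z² = yᵗ V⁻¹ y would follow a χ² distribution with two degrees of freedom. Below is a minimal synthetic check (simulated data standing in for the Gaussianized returns; names and parameter values are ours):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
rho = 0.4
V = np.array([[1.0, rho], [rho, 1.0]])          # covariance of the Gaussianized pair
y = rng.multivariate_normal([0.0, 0.0], V, size=20_000)

# z^2 = y^T V^{-1} y for every observation
z2 = np.einsum('ij,jk,ik->i', y, np.linalg.inv(V), y)

# Compare the empirical CDF of z^2 with the chi-square CDF (2 degrees of freedom)
z2_sorted = np.sort(z2)
ecdf = np.arange(1, len(z2) + 1) / len(z2)
max_gap = np.max(np.abs(ecdf - chi2.cdf(z2_sorted, df=2)))
print(f"max CDF gap: {max_gap:.3f}")            # small when the Gaussian copula holds
```

On real returns, a systematic deviation between the two cumulative distributions signals a departure from the Gaussian copula, which is what the figures probe.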
[Figure 13: log-log plot residue removed; panels MAL+ (positive tail) and MAL- (negative tail)]
Figure 13: Gaussianized Malaysian Ringgit returns versus raw Malaysian Ringgit returns, for the time interval from Jan. 1971 to Oct. 1998. The upper graph gives the positive tail and the lower one the negative tail. The two straight lines represent the curves y = √2 (x±/χ±)^(c±/2) drawn with the two sets of estimated modified-Weibull parameters (χ±, c±) and (⟨χ±⟩, ⟨c±⟩).
[Figure 14: log-log plot residue removed; panels UKP+ (positive tail) and UKP- (negative tail)]
Figure 14: Gaussianized British Pound returns versus raw British Pound returns, for the time interval from Jan. 1971 to Oct. 1998. The upper graph gives the positive tail and the lower one the negative tail. The two straight lines represent the curves y = √2 (x±/χ±)^(c±/2) drawn with the two sets of estimated modified-Weibull parameters (χ±, c±) and (⟨χ±⟩, ⟨c±⟩).
[Figure 15: log-log plot residue removed; panels GE+ (positive tail) and GE- (negative tail)]
Figure 15: Gaussianized General Electric returns versus raw General Electric returns, for the time interval from Jan. 1970 to Dec. 2000. The upper graph gives the positive tail and the lower one the negative tail. The two straight lines represent the curves y = √2 (x±/χ±)^(c±/2) drawn with the two sets of estimated modified-Weibull parameters (χ±, c±) and (⟨χ±⟩, ⟨c±⟩).
[Figure 16: log-log plot residue removed; panels IBM+ (positive tail) and IBM- (negative tail)]
Figure 16: Gaussianized IBM returns versus raw IBM returns, for the time interval from Jan. 1970 to Dec. 2000. The upper graph gives the positive tail and the lower one the negative tail. The two straight lines represent the curves y = √2 (x±/χ±)^(c±/2) drawn with the two sets of estimated modified-Weibull parameters (χ±, c±) and (⟨χ±⟩, ⟨c±⟩).
[Figure 17: log-log plot residue removed; panels WMT+ (positive tail) and WMT- (negative tail)]
Figure 17: Gaussianized Wal-Mart returns versus raw Wal-Mart returns, for the time interval from Sep. 1972 to Dec. 2000. The upper graph gives the positive tail and the lower one the negative tail. The two straight lines represent the curves y = √2 (x±/χ±)^(c±/2) drawn with the two sets of estimated modified-Weibull parameters (χ±, c±) and (⟨χ±⟩, ⟨c±⟩).
[Figure 18: plot residue removed; title: Excess Kurtosis for CHF / JPY; ordinate: excess kurtosis κ, abscissa: w_CHF]
Figure 18: Excess kurtosis of the distribution of the price variation w_CHF x_CHF + w_JPY x_JPY of the portfolio made of a fraction w_CHF of Swiss Franc and a fraction w_JPY = 1 - w_CHF of Japanese Yen against the US dollar, as a function of w_CHF. Thick solid line: empirical curve; thin solid line: theoretical curve; dashed line: theoretical curve with ρ = 0 (instead of ρ = 0.43); dotted line: theoretical curve with q_CHF = 2 rather than 1.75; dash-dotted line: theoretical curve with q_CHF = 1.5. The excess kurtosis has been evaluated over the time interval from Jan. 1971 to Oct. 1998.
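The empirical curve of Figure 18 can be reproduced schematically: form the portfolio return w·x1 + (1-w)·x2 over a grid of weights and measure its excess kurtosis. Below is a sketch on synthetic heavy-tailed series (correlated Student-t stand-ins, not the CHF/JPY data; all names and parameters are ours):

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(2)
n = 50_000
# Two correlated heavy-tailed series standing in for the two exchange rates
x1 = rng.standard_t(df=10, size=n)
x2 = 0.4 * x1 + np.sqrt(1 - 0.4**2) * rng.standard_t(df=10, size=n)

weights = np.linspace(0.0, 1.0, 21)
excess_kurt = [kurtosis(w * x1 + (1 - w) * x2, fisher=True) for w in weights]

for w, k in zip(weights[::10], excess_kurt[::10]):
    print(f"w = {w:.2f}  excess kurtosis = {k:.2f}")
```

Mixing typically lowers the excess kurtosis below that of the individual assets, which is the diversification effect visible in the figure.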
[Figure 19: plot residue removed; title: Fourth Cumulant for CHF / UKP; ordinate: fourth cumulant C4, abscissa: w_CHF]
Figure 19: Fourth cumulant for a portfolio made of a fraction w_CHF of Swiss Franc and 1 - w_CHF of British Pound. The thick solid line represents the empirical cumulant, while the dotted line represents the theoretical cumulant under the symmetric assumption. The dashed line shows the theoretical cumulant when the slight asymmetry of the assets is taken into account. This cumulant has been evaluated over the time interval from Jan. 1971 to Oct. 1998.
[Figure 20: plot residue removed; title: Efficient Frontier for IBM-HWP; ordinate: return µ, abscissa: C_{2n}/C_{2n}^{min}, n = {1, 2, 3}]
Figure 20: Efficient frontier for a portfolio composed of two stocks, IBM and Hewlett-Packard. The dashed line represents the efficient frontier with respect to the second cumulant, i.e., the standard Markowitz efficient frontier; the dash-dotted line represents the efficient frontier with respect to the fourth cumulant; and the solid line is the efficient frontier with respect to the sixth cumulant. The data set covers the time interval from Jan. 1977 to Dec. 2000.
[Figure 21: plot residue removed; title: Efficient Frontier for IBM-KO; ordinate: return µ, abscissa: C_{2n}/C_{2n}^{min}, n = {1, 2, 3}]
Figure 21: Efficient frontier for a portfolio composed of two stocks, IBM and Coca-Cola. The dashed line represents the efficient frontier with respect to the second cumulant, i.e., the standard Markowitz efficient frontier; the dash-dotted line represents the efficient frontier with respect to the fourth cumulant; and the solid line is the efficient frontier with respect to the sixth cumulant. The data set covers the time interval from Jan. 1970 to Dec. 2000.
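The frontiers of Figures 20 and 21 rank portfolios by expected return for a given even cumulant C_{2n}. The basic ingredient is the estimation of C2, C4 and C6 from the portfolio return series; the following is a minimal sketch on synthetic data (the series, function name and parameter values are ours, not the thesis's):

```python
import numpy as np

def even_cumulants(p):
    """Cumulants C2, C4, C6 of a return series, from central moments
    (kappa4 = m4 - 3 m2^2, kappa6 = m6 - 15 m4 m2 - 10 m3^2 + 30 m2^3)."""
    d = p - p.mean()
    m2, m3, m4, m6 = (float(np.mean(d**k)) for k in (2, 3, 4, 6))
    return m2, m4 - 3 * m2**2, m6 - 15 * m4 * m2 - 10 * m3**2 + 30 * m2**3

rng = np.random.default_rng(3)
# Two synthetic return series with different means and fat tails
r = np.column_stack([0.010 + 0.05 * rng.standard_t(df=15, size=5_000),
                     0.006 + 0.03 * rng.standard_t(df=15, size=5_000)])

for w1 in (0.0, 0.5, 1.0):
    w = np.array([w1, 1.0 - w1])
    p = r @ w
    c2, c4, c6 = even_cumulants(p)
    print(f"w1 = {w1:.1f}  mean = {p.mean():.4f}  C2 = {c2:.2e}  C4 = {c4:.2e}  C6 = {c6:.2e}")
```

The efficient frontier for C_{2n} is then obtained by minimizing C_{2n} over the weights at fixed expected return; using C4 or C6 in place of the variance penalizes large fluctuations more heavily.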
Conclusions and Perspectives
The aim of our study was to contribute to a better understanding of extreme risks on financial markets, in order to sketch the outlines of new methods for controlling this type of risk and to present some of their consequences for portfolio management. To this end, we followed an approach that sought to isolate the different sources of risk according to whether they originate in the individual fluctuations of the assets or in their collective behavior. This distinction seems to us particularly important for risk management, insofar as carrying out this task requires (i) being able to estimate the large risks associated with each asset, which rests only on the study of its intrinsic variability, and (ii) knowing whether one can hope to diversify these large risks, and hence reduce their impact by forming portfolios, which calls on the assets' collective characteristics. This is why we first carried out a study, both empirical and theoretical, of the price variations of financial assets, in order to determine their statistical properties and to try to uncover certain micro-structural and behavioral mechanisms that can account for the empirical observations. The study focused on three points:
– the static (univariate) aspects, through the study of the marginal distributions of returns,
– the dynamical aspects, through the modeling of temporal dependence with multifractal random walk processes,
– and the interaction between assets, through the representation of the dependence structure in terms of copulas and, above all, the estimation of extreme dependence with the tail dependence coefficient.
Concerning the first point, this led us to examine the hypothesis that returns are distributed according to power laws (or, more generally, regularly varying distributions). In line with some recent work, we confirmed that it cannot be ruled out that stretched exponential distributions represent the distributions of returns as well as (if not better than) the regularly varying distributions generally used. Taking this new parametric representation of return distributions into account is important, and it seems to us that it should be used jointly with the description in terms of regularly varying distributions (as we did in the remainder of our study), so as not to neglect the model risk inherent in any parametric description of data. Concerning the dynamical study of price evolution, we preferred to turn to multifractal random walk processes rather than to the traditional processes of the ARCH family, because they appear to be the only ones able to account for the way volatility reverts to its mean (relaxes) after a large shock. Indeed, we showed that the multifractal random walk process predicts a hyperbolic relaxation whose exponent differs according to whether the shock is endogenous or exogenous, which could indeed be observed in the data. Moreover, such behavior cannot be explained by ARCH-type processes. The multifractal random walk process therefore appears well suited to describing the dynamics of financial asset prices, and it allows a better understanding of how the flow of information is incorporated into asset prices. This is of great interest for volatility prediction, for example, but also for scenario management, making it possible to better apprehend the impact of a given piece of news on the future evolution of an asset's price, and in particular the duration of periods of strong turbulence or of abnormal calm in the markets. Note, however, that this description should be improved by taking into account the asymmetry between upward and downward fluctuations (the leverage effect), which has been completely neglected so far. To synthesize these first two points, and to try to uncover some of the microscopic mechanisms underlying them, we turned to models of interacting agents. The goal was to incorporate the not fully rational behavior of economic actors, in order to understand how, despite this bounded rationality, a market structure that is efficient most of the time can emerge, but also why and how markets sometimes depart violently from this efficient structure. In this study, we highlighted the importance of imitative and contrarian behaviors among agents, in order to clarify their role in the emergence and bursting of speculative bubbles. Here again, this approach offers the possibility of simulating certain market phases and provides a means of better anticipating future risks. One current limitation of interacting-agent models is that they focus exclusively on markets where a single asset (plus, possibly, a risk-free asset) can be traded. We too chose to follow this path, for the sake of simplicity.
Indeed, as soon as one considers a market with several assets, agents must be given the means to choose between them, and hence a utility function, a priori different for each agent, must be introduced. This raises several difficulties, first among which, as emphasized in chapter 10, is the choice of this function. It seems clear to us that future developments of interacting-agent models will have to address this problem, which in our view constitutes one of the very next steps for this type of modeling. The third point we addressed concerned the description of the dependence between assets. We began by trying to determine the dependence structure globally, by attempting to estimate the copula of financial asset returns. We concluded that, for moderate returns, a Gaussian copula seemed perfectly able to account for their dependence, but was likely to underestimate extreme dependences. To confirm this hypothesis, we sought to estimate the tail dependence coefficient, which measures the propensity of two financial assets to undergo large movements together. The difficulty of estimating this quantity directly first led us to compute it theoretically, relying on a description of asset price fluctuations by a factor model. Then, from the calibration of the parameters of this model, we were able to deduce the value of the tail dependence coefficient for each pair of assets. This confirmed the inadequacy of the Gaussian copula for describing the dependence between large price variations. This study will have to be pursued, since we were not able to find a copula that is truly satisfactory for a complete description of the dependence between assets.
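The difficulty of direct estimation mentioned here can be illustrated with a naive empirical estimator of the upper tail dependence coefficient at a finite quantile level u. The sketch below is our own illustration (the Gaussian sample merely stands in for a pair of asset returns):

```python
import numpy as np
from scipy.stats import rankdata

def upper_tail_dependence(x1, x2, u=0.99):
    """Naive estimator lambda(u) = P(U1 > u and U2 > u) / (1 - u),
    with U1, U2 the rank-based (copula) coordinates of the sample."""
    n = len(x1)
    u1 = rankdata(x1) / (n + 1)
    u2 = rankdata(x2) / (n + 1)
    return float(np.mean((u1 > u) & (u2 > u)) / (1.0 - u))

rng = np.random.default_rng(4)
# Bivariate Gaussian (correlation 0.6): the true asymptotic lambda is zero,
# yet the finite-level estimate stays well away from zero
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=100_000)
lam = upper_tail_dependence(z[:, 0], z[:, 1], u=0.99)
print(f"lambda_hat(0.99) = {lam:.3f}")
```

The slow convergence of this estimator as u tends to 1 is precisely what motivates the detour through a calibrated factor model rather than direct estimation.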
While we are now convinced that tail dependence exists between assets, it remains to determine which copula is best suited to account for the extreme dependence structure between assets. This is important because, as we have already pointed out, the parameters entering the parametric representation of the copula summarize the whole dependence, in the same way that the correlation coefficient completely captures the dependence of a multivariate Gaussian distribution. These parameters constitute, in our view, the "good variables" that must be sought in order to model the behavior of assets at the macroscopic scale, and in particular the relation between risk and return. Consider, for instance, the link between the correlation coefficient and the variance in a Gaussian world on the one hand, and the beta coefficient measuring the systematic risk associated with each asset according to the CAPM on the other hand. Moreover, our study of the dependence structure between assets was only descriptive and did not probe its microscopic causes. This in fact echoes what we wrote above concerning the need to develop agent models for markets where several assets can be traded. Such models will provide a means of better understanding the reasons for the existence of various factors and their influence on the dependence between assets: are economic factors, such as belonging to the same sector of activity, the primary elements of this dependence, or do the agents play the dominant role? Are there phases where the dependence structure is dominated by economic fundamentals and others where it is dominated by speculative activity, as observed for the development of fluctuations in the marginal case? All these questions remain open and await answers. Having considered the two facets of large risks, namely the individual or marginal aspect on the one hand and the collective aspect on the other, we gathered our results in order to deduce their practical consequences for portfolio management aimed at preventing these large risks. This first led us to discuss how such risks should be measured. We used two approaches: one based on the measurement of economic capital, and the other taking into account the propensity of the portfolio not to stray too far from the return objectives expected of it. Both routes were explored, and we discussed their practical implementation as well as their consequences for the equilibrium prices of assets.
We confined ourselves to risk measures and to the portfolio optimization problem in a single-period (static) framework, whereas it is quite clear that both problems should be posed in dynamical terms. Much research is currently being carried out to define the notion of risk axiomatically in a dynamical framework and to study its consequences for portfolio management. Our future research on this subject should fit into this stream. Finally, focusing on the impact of extreme risks, it appeared in particular that the existence of tail dependence between assets limits the possibilities of diversifying them. We then showed how their effects can be minimized by selecting only the assets with the weakest tail dependences. However, if extreme risks are not fully diversifiable, we should be entitled to expect the market to remunerate them. We have not studied this question, which seems crucial to us, but we intend to address it in future research. The work we have just presented, far from being complete and from providing definitive answers, thus raises many questions and opens up multiple perspectives for future research.
Appendix A
Assessment of the conduct of the thesis project
This chapter is part of the continuation of a pilot project developed by the French ministry of national education, research and technology (MENRT) to help young doctors enter the workforce more easily, whether in academic research or in private companies. Its purpose is to reflect on the conduct of the thesis, not only in terms of scientific outcomes but also from an accounting point of view, with an evaluation of the cost of the project, and to take stock of the experience and skills acquired over these years.
A.1 Brief summary of the subject
The subject of this thesis was the study of extreme risks on financial markets. To attack this problem, we first sought to model the mechanisms that can explain the appearance of extreme movements on such markets, in order to understand them better and in particular to isolate some of their causes. From there, we were able, first, to develop new risk measures that better account for the large fluctuations observed in financial asset prices. This then allowed us to elaborate new portfolio theories aiming at the best possible protection against extreme risks. Finally, we derived interesting results concerning the prices of assets on financial markets at equilibrium. Our study was based on the hypothesis that financial markets can be considered as self-organized complex systems: the macroscopic properties of markets result from the interaction between economic agents with diverse and often conflicting objectives. This notably led us to abandon the traditional hypotheses founded on the existence of representative, rational economic agents, from which the concept of the efficient market derives and which leads to describing price dynamics by a Brownian motion, now well known to be unsuited to such modeling.
A.2 Background
A.2.1 Choice of subject
After preparing the diplôme d'études approfondies (DEA) in theoretical physics at the Ecole Normale Supérieure de Lyon during the 1999-2000 academic year, I decided to enroll for a doctorate. After such a DEA, it is natural to choose a thesis subject in statistical physics or particle physics. However, it was quite clear to me that such subjects only rarely lead to a position in public research, their number being extremely small, and almost never attract the interest of private actors, who judge these research themes too fundamental and without concrete applications. This was likely to pose problems for my future professional integration and suggested from the outset that a reorientation would be necessary at the end of such a thesis. I therefore preferred to turn away from the usual path from the start. In fact, the field of finance had attracted me for a long time, to the point that I had at one moment considered following a DEA in finance rather than in theoretical physics. But, knowing that the study of financial markets with the tools of statistical physics was developing actively within a branch of physics called "econophysics", I had resolved to learn the foundations of theoretical physics with the aim of reinvesting this knowledge in the study and modeling of the mechanisms at work on financial markets. It was thus quite natural that the idea came to me of preparing a thesis on the problems of financial markets. During the months preceding the start of my thesis, I met and talked with many people, teachers and researchers in physics and finance as well as practitioners, to obtain their opinions and advice on the thesis choice I wished to make and on the best way to achieve it.
I admit that, at that time, the opinions I received were divided, and that I was given many warnings about the difficulties I was likely to face. However, this in no way dampened my enthusiasm; I would even say that, on the contrary, it pushed me to take up what then appeared to me as a genuine challenge.
A.2.2 Choice of supervision and host laboratory
With the central theme of my thesis chosen, I still had to find a thesis advisor able to supervise me on such a subject, as well as a host laboratory. The choice of advisor was in fact rather quick because, while more and more researchers are taking an interest in the still emerging field of econophysics, very few are in a position to direct a thesis on this subject. Quite logically, I therefore contacted Didier Sornette, research director at the CNRS, based at the Laboratoire de Physique de la Matière Condensée (Université de Nice - Sophia Antipolis), where he heads the "pluridisciplinary physics" team, whose aim is the study of self-organized critical phenomena in fields as diverse as material rupture, earthquake prediction and the study of financial markets. We quickly decided to meet, which allowed me to discover in more detail the research themes addressed within this team and to plan the framework of my thesis. It was also decided that I would do my DEA internship within the team, which would allow me, on the one hand, to familiarize myself with the themes and methods I would have to employ during my thesis and, on the other hand, to verify that the work awaiting me indeed matched my aspirations. It also appeared fairly quickly that it would be very interesting to enlist the help of a professor of finance, in order to have real exchanges with that community, to better understand its problems, and thus not to stray into sterile research cut off from the realities of the financial world. Moreover, this collaboration guaranteed the pluridisciplinary character of the thesis I was embarking on. This is why I went to meet Jean-Paul Laurent, professor at the Institut de Science Financière et d'Assurances (Université Lyon I), to present my project to him and ask him to co-direct my thesis, which he accepted at once.
A.2.3 Project funding
As this project was essentially theoretical, with practical spin-offs a priori difficult to evaluate, it seemed unlikely that funding other than public funding could be obtained. In fact, as a student of the Ecole Normale Supérieure de Lyon, I could apply for a specific grant, the so-called "allocation couplée", consisting of an MENRT research allowance and a teaching assistantship (at the University of Nice). The rather unorthodox thesis subject I was presenting could have worked against me in obtaining this grant but, in the end, it was judged innovative and fully in line with the ministerial policy of promoting pluridisciplinarity in research themes, which enabled me to obtain the grant. It represents an annual amount of about 24,000 euros, employer contributions included. To this amount must be added a percentage of the salary cost of my thesis director and co-director. This cost is difficult to evaluate, but one can estimate that each of my supervisors devoted between ten and twenty percent of his time to supervising my thesis, which overall must represent nearly 15,000 euros per year. The total cost in human resources can thus be estimated at approximately 39,000 euros per year. On the equipment side, the cost turns out to be much more modest. My thesis work being essentially theoretical, it required only the purchase of a computer, a few pieces of software and a few books. This expense can therefore be put at 3,000 euros. I also had to make numerous trips, either to attend conferences or to meet my co-director in Lyon or Paris. Over the whole duration of my thesis, the sums involved amount to 4,000 euros. I also took part in two schools organized by the continuing-education department of the CNRS and paid for by it.
The cost incurred by the CNRS on these two occasions is close to 2,500 euros. Finally, one must add the fraction attributable to me of the operating costs of the laboratory that hosted me for these two and a half years. These costs are difficult to evaluate precisely, but it seems reasonable to estimate them at 5,000 euros. In summary, the expenses incurred are the following:

Human resources (30 months)    97,500 euros
Equipment                       3,000 euros
Travel                          4,000 euros
Continuing education            2,500 euros
Operating costs                 5,000 euros
Total                         112,000 euros
The total expenses incurred thus reach 112,000 euros over thirty months, i.e., an annual cost of 44,800 euros. Human-resource costs are of course preponderant, since they represent more than 85% of the total cost of the project, in line with what is generally observed for any theoretical research subject.
A.3 Evolution of the project
A.3.1 Design of the project
The initial project aimed, first, at developing a method for measuring large financial risks in a non-Gaussian world and at modeling them with the tools employed in statistical physics and field theory, which seemed particularly well suited to this problem. We then wished to apply these methods to the problem of portfolio management. From the outset, therefore, the project was conceived in a pluridisciplinary way, with a theoretical-physics component, or at least one calling on the tools of theoretical physics, and a financial-management component. The framework of the project was deliberately defined in fairly general terms, so as to leave it as much flexibility as possible to evolve as the research progressed. Indeed, while the overall framework of the research to be conducted appeared fairly clearly from the start, it was difficult to evaluate very precisely the results we should expect. Moreover, because of the co-direction of this thesis and its pluridisciplinary character, the terms of the project had to suit each of the parties. This point could have been a source of some difficulty, but everything in fact went very well and there was no problem. I must say that this is largely due to the character of my supervisors, who always showed great open-mindedness and left me a great deal of freedom in choosing the research directions to follow.
A.3.2 Management of the project
Throughout my thesis, I benefited from supervision that was always very present and, at the same time, from great freedom. This state of affairs was linked to my choice of supervisors, and I knew from the start what to expect. Indeed, because of his activities, my thesis director was in France only six months a year. This very quickly pushed me to develop autonomy. In addition, I had chosen a co-director holding a position at the Université Lyon I, which imposed a certain mobility on me and led me to make numerous round trips to Lyon or Paris. In return, this allowed me to meet people outside my home laboratory, which led to the development of very useful and enriching collaborations. That said, despite the distance that sometimes separated us, I always had someone "at the other end of the line", if not in front of me, to answer my questions and guide me through my hesitations. This was extremely important throughout my thesis, and particularly during the first months, when I did no more than follow my supervisors' directives. Then, after the first six months, I began to choose myself the research directions I wished to follow, to evaluate their feasibility and to carry them out. Of course, all this was done in full agreement with my supervisors, after discussion and consultation with them. Occasionally, I took part in projects a priori somewhat removed from the central theme of my thesis. But in the end this was always stimulating and generated new ideas, which fitted perfectly into the already established body of my research, complementing it or shedding a different light on it.
In summary, while it always remained within the originally established framework, the course of my thesis was far from being planned from the start, and rather unfolded under the effect of successive discoveries and advances. Moreover, the thesis progressed quickly, since it was not necessary to go to the end of the three years a thesis normally lasts. In fact, a year and a half after the start of the project, I put forward the idea of defending early, which appeared reasonable to my supervisors, who accepted the principle. I estimated at the time that I still needed six months to finish the projects under way and a few more months to write the thesis manuscript. On this point, writing articles regularly as the work progressed made the task much easier for me. In the end, the deadlines I had set myself were indeed met, and I defended after a little more than two years of thesis work.
A.3.3 Scientific outcomes
The scientific outcomes were numerous and significant. The results obtained allowed us, in particular, to isolate and better understand some of the origins of the extreme phenomena observed in financial markets. We were then able to establish, on solid theoretical foundations, new measures of extreme risks, which in turn enabled the concrete development of portfolio theories taking this type of risk into account, which was the ultimate goal of our project. Overall, all the initial objectives were achieved, and in some cases greatly exceeded. In addition, the scientific outcomes of the project came quickly and regularly. The first results appeared very early: within the first four months, an article had been submitted. From then on, roughly one paper was written every two or three months, so that after a little more than two years of thesis work, six articles had appeared in both physics journals and finance reviews, and about as many were submitted or in press. This regularity of publication is due above all to the diversification of my research topics; that was the main advantage of having defined a relatively broad framework of study. It is interesting to note that, of the more than ten papers produced to date, only a quarter were expected, in the sense that they bear very directly on research themes perfectly identified at the start of the thesis. The others are the fruit of the progress made and of the new ideas that appeared as the work advanced. Obviously, such a quantity of work could not have been carried out alone, and the number of publications also owes much to the collaborations, with public as well as private partners, woven throughout my thesis, all of which proved very fruitful.
A.4 Skills acquired and personal lessons
The thesis was for me an experience rich in lessons. On the scientific level, it first of all allowed me to apply the knowledge I had acquired in theoretical physics during my initial training (statistical physics, complex systems and critical phenomena in particular). This was my way of making that knowledge definitively my own. I also had, on occasion, to update and supplement it when my research led me to tackle topics I had not had to consider in my earlier studies. As a result, I believe I can now say that I have a high level of knowledge of the methods employed in physics for the study of critical phenomena. Furthermore, and this is where I had to make the greatest effort, I had to acquire the body of knowledge in financial theory that I lacked (decision theory and risk analysis, portfolio theories and market equilibrium models, to cite only a few examples), as well as in statistics and probability theory. This investment took a long time (and I do not consider it fully complete), but I believe I have acquired a good level of competence in these fields. To this end, I took part, among other things, in schools organized by the CNRS. These proved highly formative and allowed me first to fill my gaps and then to acquire high-level knowledge in contact with the leading experts in these fields. In addition, the knowledge I acquired in mathematics allowed me to synthesize and deepen what I had accumulated in physics during my earlier studies.

On the purely technical level, I developed good computing skills in particular. I learned programming in the C language, in order to carry out advanced numerical simulations. I also learned to use software such as MatLab for data processing and Maple for computer algebra. In addition, I had to learn to manage my working time as well as group work. This seems to me to be the key to the success of any project (such as a thesis, for example, but it goes well beyond that). Indeed, I continually worked on several subjects at once, generally with the same person, but several times within collaborations bringing together people geographically dispersed across the globe. This forces one to impose a certain rigor on one's organization, so as not to be overwhelmed and to maintain steady progress on the various projects. In fact, since collaborative work advances only at the pace of its slowest member, delays can sometimes accumulate significantly. That is why I also had to learn to limit my ambitions, and thus to decline to take part in certain projects so as not to spread myself too thin. Moreover, since my work lies at the interface of two worlds, that of physics and that of finance, I constantly had to interact with interlocutors speaking different languages and having quite distinct sensibilities.
I therefore had to adapt to this, and in particular to learn to communicate with people whose training and culture were very different from my own, which happened little by little and without great difficulty. Finally, holding a teaching assistantship (monitorat) at the Université de Nice, I carried a teaching load of 64 hours per year. This load was spread across different levels, from the undergraduate cycle to preparation for the agrégation, and took place before diverse audiences: students in physics, geology, and occasionally economics. These few hours of teaching were genuinely profitable. They were of course an opportunity to put into practice the advice I had received during my teacher training in my year of preparation for the agrégation, but above all they were the means of learning to transmit knowledge. Indeed, the communication effort required with a student audience is far greater than, and entirely different from, the one a researcher must grow used to when presenting work at a conference before an audience of specialists. In addition, the students' questions, sometimes naive, force a permanent re-examination, or at least reorganization, of one's own knowledge. Finally, this again led me to work in coordination, if not in collaboration, with my teaching colleagues within teaching teams, further forging my experience of group work.
A.5 Conclusion
I consider that my thesis went extremely well, and I will remember it as a very positive experience. I attribute this essentially to the excellent understanding that always prevailed between my supervisors and me. I cannot imagine having achieved the same objectives had such pleasant working conditions not been in place. This underlines the importance of human relations in the smooth running of any project, and it seems to me all the more true when working groups are small and their members are consequently more dependent on one another. This very motivating experience confirmed my intention to pursue my career in research, whether academic or private. Furthermore, teaching, or training in the broad sense, now seems to me inseparable from that activity. I believe my abilities as a teacher were nourished by my experience as a researcher and that, conversely, exchanges at conferences or seminars generally provided material for new research. I owe to these thesis years the discovery of the remarkable complementarity between, on the one hand, the somewhat solitary activity that is research and, on the other, the outward-facing activity that is teaching or communication. The picture I have painted of my thesis years may seem idyllic, but it is nonetheless sincere. I surely benefited from a share of luck, inherent in any undertaking, but that cannot explain everything. I believe that the methodical organization of my project, which begins well before the official start of the thesis and involves the informed choice of subject and supervisors, is the key to the success of this undertaking. For my part, I had committed myself to this approach more than six months before the start of my thesis, which, it seems to me, bore fruit well beyond my initial expectations.
Bibliographie ACERBI , C. (2002) : “Spectral measures of risk : A coherent representation of subjective risk aversion”, Journal of Banking & Finance 26, 1505–1518. ACERBI , C. ET D. TASCHE (2002) : “On the coherence of expected shorfall”, Journal of Banking & Finance 26, 1487–1503. ACERBI , C. ET P. S IMONETTI (2002) : “Portfolio optimization with spectral measures of risk”, Document de Travail. A DAMS , M. C. ET A. S ZARFARZ (1992) : “Speculative bubbles and financial markets”, Oxford Economic Papers 44, 626–640. A HN , D., J. B OUDOUKH , M. R ICHARDSON ET R. W HITELAW (1999) : “Optimal risk management using options”, Journal of Finance 54, 359–376. A LEXANDER , G. J. ET A. M. BAPTISTA (2002) : “Economic Implications of Using a Mean-VaR Model for Portfolio Selection : A Comparison with Mean-Variance Analysis”, Journal of Economic Dynamics & Control 26, 1159–1193. A LLAIS , M. (1953) : “Le comportement de l’homme rationnel devant le risque : critique des postulats de l’´ecole amricaine”, Econometrica 21, 503–546. A NDERSEN , J. V. ET D. S ORNETTE (2002a) : “Have your cake and eat it too : Increasing returns while lowering large risks !”, Journal of Risk Finance 2, 70–82. A NDERSEN , J. V. ET D. S ORNETTE (2002b) : “the $-game”, Document de Travail 0205423, Cond-mat. A NDERSSON , M., B. E KLUND ET J. LYHAGEN (1999) : “A simple linear time series model with misleading nonlinear properties”, Economics Letters 65, 281–285. A RDITTI , F. (1967) : “Risk and the required return on equity”, Journal of Finance 22, 19–36. A RN E´ ODO , A., J. F. M UZY ET D. S ORNETTE (1998) : “Causale cascade in the stock market from the infrared to the ultraviolet”, European Physical journal B 2, 277–282. A RTHUR , W. B. (1987) : “Inductive Reasoning and Bounded Rationality : Self-reinforcing mechanisms in economics”, Center for Economic Policy Research 111, 1–20. A RTHUR , W. B. (1994) : “Inductive Reasoning and Bounded Rationality”, American Economic Review 84, 406–411. A RTHUR , W. 
B., J. H. H OLLAND , B. L E BARON , R. PALMER ET P. TAYLOR (1997) : “Asset pricing under endogeneous expectations in an artificial stock market”, in Arthur, W. B., D. Lane et S. Durlauf (eds.), The Economy as an Evolving Complex System II, Addison-Wesley, Redwood City. A RTZNER , P., F. D ELBAEN , J.-M. E BER matical Finance 9, 203–288.
ET
D. H EATH (1999) : “Coherent measures of risk”, Mathe-
AVNIR , D., O. B IHAM , D. L IDAR ET O. M ALCAI (1998) : “Is the geometry of nature fractal ?”, Science 279, 39–40. 503
504
Bibliographie
BACHELIER , L. (1900) : “Th´eorie de la sp´eculation”, Annales Scientifiques de l’Ecole Normale Sup´erieure 17, 21–86. BACRY, E., J. D ELOUR 64(26103).
ET
J. F. M UZY (2001) : “Multifractal random walk”, Physical Review E
BACRY, E. ET J. F. M UZY (2002) : “Log-infinitely divisible multifractal processes”, Document de Travail 0207094, Cond-mat. BASLE C OMMITTEE ON BANKING S UPERVISION (1996) : “Amendement to the Capital Accord to Incorporate Market Risks”. BASLE C OMMITTEE ON BANKING S UPERVISION (2001) : “The New Basel Capital Accord”. BAVIERA , R., L. B IFERALE , R. N. M ANTEGNA ET A. V ULPIANI (1998) : “Transient multiaffine behaviors in ARCH and GARCH processes”, International Workshop on Econophysics and Statistical Finance, Italy. B ECK , U. (2001) : La soci´et´e du risque, Aubier. B ERNOULLI , D. (1738) : “Specimen theoriae novae de mensura sortis”, Commentarii Academiae Scientiarum Imperialis Petropolitanae . B INGHAM , N. H., C. M. G OLDIE Press.
ET
J. L. T EUGEL (1987) : Regular Variation, Cambridge University
B LACK , F. (1976) : “Studies of stock price volatility changes”, Proceeding of the Business and Economic Statistics Section pp. 177–181. B LACK , F. ET M. S CHOLES (1973) : “The pricing of options and corporate liabilities”, Journal of Political Economy 81, 637–653. B LANCHARD , O. J. (1979) : “Speculative bubble, crashes and rational expectations”, Economics Letters 3, 387–396. B LANCHARD , O. J. ET M. W. WATSON (1982) : “Bubbles, rational expectations and speculative markets”, in Watchel, P. (ed.), Crisis in Economic and Financial Structure : Bubles, Bursts and Shocks, Lexington Books, Lexington. B LUM , P., A. D IAS ET P. E MBRECHTS (2002) : “The ART of dependence modelling : the latest advances in correlation analysis”, in Lane, M. (ed.), Alternative Risk Strategies, Risk Books, London, pp. 339–356. B OLLERSLEV, T. (1986) : “Generalized autoregressive conditional heteroskdasticity”, Journal of Econometrics 31, 307–327. B OLLERSLEV, T., R. F. E NGLE ET D. B. N ELSON (1994) : “ARCH models”, in Engle, R. F. et D. McFadden (eds.), Handbook of Econometrics, Vol. IV, Elsevier, pp. 2959–3038. B OLLERSLEV, T., R. Y. C HOU ET K. F. K RONER (1992) : “ARCH modeling in finance : A review of the theory and empirical evidence”, Journal of Econometrics 52, 5–59. B OUCHAUD , J. P., A. M ATACZ ET M. P OTTERS (2001) : “The leverage effect in financial markets : retarded volatility and market panic”, Physical Review Letters 87(228701). B OUCHAUD , J. P., D. S ORNETTE , C. WALTER ET J. P. AGUILAR (1998) : “Taming large events : Optimal portfolio theory for strongly fluctuating assets”, International Journal of Theoretical and Applied Finance 1, 25–41. B OUCHAUD , J. P., M. P OTTERS ET M. M EYER (2000) : “Apparent multifractality in financial time series”, European Physical Journal B 13, 595–599. B OUCHAUD , J. P. ET M. P OTTERS (2000) : Theory of Financial Risks : From Statistical Physics to Risk Management, Cambridge University Press.
505
Bibliographie
B OUCHAUD , J. P. ET R. C ONT (1998) : “A Langevin approach to stock market fluctuations and crashes”, European Physical Journal B 6, 543–550. B OUY E´ , E., V. D URRLEMAN , A. N IKEGHBALI , G. R IBOULET ET T. RONCALLI (2000) : “Copulas for finance : A reading guide and some applications”, Document de Travail, Groupe de Recherche Op´erationelle, Cr´edit Lyonnais. B RIEMAN , L. (1960) : “Investment policies for expanding businesses optimal in a long run sense”, Naval Research Logistics Quaterly 7, 647–651. B ROCK , W. A. (1993) : “Pathways to randomness in the economy : Emergent nonlinearity and chaos in economics and finance”, Estudios Economicos 8. B ROCK , W. A. ET B. L E BARON (1996) : “A dynamic structural model for stock return volatility and trading volume”, Review of Economics and Statistics 78, 94–110. B ROCK , W. A. 1095.
ET
C. H. H OMMES (1997) : “Rational route to randomness”, Econometrica 65, 1059–
B ROWN , P., D. WALSH ET A. Y UEN (1997) : “The interaction between order imbalance and stock price”, Pacific-Basin Finance Journal 5, 539–557. B ROZE , L., C. G OURI E´ ROUX ET A. S ZAFARZ (1990) : Reduced Forms of Rational Expectations Models, Harwood Academic Press. B URDA , Z., J. J URKIEWICZ , M. N OWAK , G. PAPP ET I. Z AHED (2001a) : “Free L´evy matrices and financial correlations”, Document de Travail 0103109, Cond-mat. B URDA , Z., J. J URKIEWICZ , M. N OWAK , G. PAPP ET I. Z AHED (2001b) : “L´evy matrices and financial covariances”, Document de Travail 0103108, Cond-mat. B URDA , Z., R. JANIK , J. J URKIEWICZ , M. N OWAK , G. PAPP L´evy matrices”, Physical Review. E 65(011106).
ET
I. Z AHED (2002) : “Free random
C AMPBELL , J., A. W. L O ET C. M AC K INLAY (1997) : The Econometrics of Financial Markets, Princetown University press. C ASADESUS -M ASANELL , R., P. K LIBANOFF ET E. O ZDENOREN (2000) : “Maxmin expected utility through statewise combinations”, Economics Letters 66, 49–54. C AVES , C. M., C. A. F LUCHS ET R. S CHACK (2002) : “Quantum probabilities as Bayesian probabilities”, Physical Review A 65(022305). C HABAANE , A., E. D UCLOS , J. P. L AURENT, Y. M ALEVERGNE ET F. T URPIN (2002) : “Looking for efficient portofolios : An empirical investigation”, Document de Travail. C HALLET, D., A. C HESSA , M. M ARSILI ET Y. C. Z HANG (2001) : “From minority games to real financial makets”, Quantitative Finance 1, 168–176. C HALLET, D. ET Y. C. Z HANG (1997) : “Emergence of Cooperation and Organization in an Evolutionary Game”, Physica A 246, 407–418. C HAN , K. ET W.-M. F ONG (2000) : “Trade size, order imbalance, and the volatility-volume relation”, Journal of Financial Economics 57, 247–273. C HATEAUNEUF, A. (1991) : “On the use of capacities in modeling uncertainty aversion and risk aversion”, Journal of Mathematical Economics 20, 343–369. C HEKHLOV, A., S. U RYASEV ET M. Z ABARANKIN (2000) : “Portfolio Optimization with Drawdown Constraints”, Document de Travail, ISE Dept., Univ. of Florida. C HERUBINI , U. ET E. L UCIANO (2000) : “Multivariate option pricing with copula”, Document de Travail, SSRN.
506
Bibliographie
C HORDIA , T., R. ROLL ET A. S UBRAHMANYAM (2002) : “Order imbalance, liquidity, and market returns”, Journal of Financial Economics 65, 111–130. C HRISTIE , A. A. (1982) : “The stochastic behavior of common stock variances : Value, leverage and interest rate effects”, Journal of Financial Economics 10, 407–432. C OHEN , M. ET J. M. TALLON (2000) : “D´ecision dans le risque et l’incertain : l’apport des mod`eles non additifs”, Revue d’Economie Politique 110, 631–681. C OLES , S., J. H EFFERNAN Extremes 2, 339–365.
ET
J. TAWN (1999) : “Dependence measures for extreme value analysis”,
C OLLETAZ , G. ET J. P. G OURLAOUEN (1989) : “Les bulles rationnelles : une synth`ese de la litt´erature”, in Bourguinat, H. et P. Artus (eds.), Th´eorie Economique et Crises des March´es Financiers, Economica, pp. 67–101. C ONSIGLI , G. (2002) : “Tail estimation and mean-VaR portfolio selection in markets subject to financial instability”, Journal of Banking & Finance 26, 1355–1382. C ONSIGLI , G., G. F RASCELLA ET G. S ARTORELLI (2001) : “Understanding financial market with extreme value theory from Value-at-Risk to crises correlation analysis”, Document de Travail, UniCredit Banca Mobiliare. C ONT, R. (2001) : “Empirical properties of asset returns : stylized facts and statistical issues”, Quantitative Finance 1, 223–236. C ONT, R., M. P OTTERS ET J. P. B OUCHAUD (1997) : “Scaling in stock market data : Stable laws and beyond”, in Dubrulle, Graner et Sornette (eds.), Scale Invariance and Beyond, Springer, Berlin, pp. 75–85. C ONT, R. ET J. P. B OUCHAUD (2000) : “Herd behavior and aggregate fluctuations in financial markets”, Macroeconomic Dynamics 4, 170–196. C ORCOS , A., J. P. E CKMANN , A. M ALASPINAS , Y. M ALEVERGNE ET D. S ORNETTE (2002) : “Imitation and contrarian behavior : Hyperbolic bubbles, crashes and chaos”, Quantitative Finance 2, 264–281. C OSSETTE , H., P. G AILLARDETZ , E. M ARCEAU ET J. R IOUX (2002) : “On two dependent individual risk models”, Insurance : Mathematics and Economics 30, 153–166. C OUTANT, S., V. D URRLEMAN , G. R APUCH ET T. RONCALLI (2001) : “Copulas, multivariate riskneutral distributions and implied dependence functions”, Document de Travail, Groupe de Recherche Op´erationelle, Cr´edit Lyonnais. C VITANIC , J. ET I. K ARATZAS (1995) : “On portfolio optimization under ‘drawdown’ constraints”, IMA Lecture Notes in Mathematics & Applications 65, 77–88. ¨ DACOROGNA , M. M., R. G ENC¸ AY, U. A. M ULLER ET O. V. P ICTET (2001) : “Effective return, risk aversion and drawdowns”, Physica A 289, 229–248. ¨ DACOROGNA , M. M., U. A. 
M ULLER , O. V. P ICTET ET C. G. DE V RIES (1992) : “The distribution of extremal foreign exchange rate returns in large data sets”, Document de Travail 19921022, Olsen and Associates Internal Documents UAM. DANIELSON , J. ET C. G. DE V RIES (2000) : “Value-at-Risk and extreme returns”, in Embrechts, P. (ed.), Extremes and Integrated Risk Management, RISK Books in association with USB Warburg, pp. 85–106. DANIELSSON , J., P. E MBRECHTS , C. G OODHART, C. K EATING , F. M UENNICH , O. R ENAULT ET H.S. S HIN (2001) : “An academic response to Basel II”, Document de Travail 130, FMG and ESRC, London.
507
Bibliographie
DE
F INETTI , B. (1937) : “La pr´evision : ses lois logiques, ses sources subjectives”, Annales de l’Institut Henri Poincar´e 7, 1–68.
DE
V RIES , C. G. (1994) : “Stylized facts of nominal exchange rate returns”, in van der Ploeg, F. (ed.), The Handbook of International Macroeconomics, Blackwell, Oxford, pp. 348–389.
D ELBAEN , F. (2000) : “Coeherent risk measures on general probability spaces”, Document de Travail, Risklab. D ENAULT, M. (2001) : “Coherent allocation of risk capital”, Journal of Risk 3, 1–34. D URRLEMAN , V., A. N IKEGHBALI ET T. RONCALLI (2000) : “Copulas approximation and new families”, Document de Travail, Groupe de Recherche Op´erationelle, Cr´edit Lyonnais. ¨ E INSTEIN , A. (1905) : “Uber die von der molekularkinetishen Theorie der W¨arme geforderte Bewegung von in ruhenden Fl¨ussigkeiten suspendierten Teilchen”, Annalen de Physik 17, 549–560. E LLSBERG , D. (1961) : “Risk, ambiguity, and the Savage axioms”, Quarterly Journal of Economics 75, 643–669. E LTON , E. Sons.
ET
M. G RUBER (1995) : Modern Portfolio Theory and Investment Analisys, John Willey &
E MBRECHTS , P., A. H OEING ET A. J URI (2001) : “Using Copulae to bound the Value-at-Risk for functions of dependent risk”, Document de Travail, Risklab. E MBRECHTS , P., A. M C N EIL ET D. S TRAUMANN (2002) : “Correlation and dependence in risk management : properties and pitfalls”, in Dempster, M. A. H. (ed.), Risk Management : Value at Risk and Beyond, Cambridge University Press, Cambridge, pp. 176–223. ¨ E MBRECHTS , P., C. P. K L UPPELBERG Verlag, Berlin.
ET
T. M IKOSH (1997) : Modelling Extremal Events, Springer-
E NGLE , R. F. (1982) : “Autoregressive conditional heteroskedasticity with estimate of the variance of UK inflation”, Econometrica 50, 987–1008. E NGLE , R. F. ET A. J. PATTON (2001) : “What good is a volatility model ?”, Quantitative Finance 1, 237–245. ¨ , P. ET A. R ENYI (1960) : “On the evolution of random graphs”, Publications of the Mathematical E RD OS Institute of the Hungarian Academy of Sciences 5, 17–61. FAMA , E. F. (1963) : “Mandelbrot and the stable paretian hypothesis”, Journal of Business 36, 420–449. FAMA , E. F. (1965a) : “The behavior of stock market prices”, Journal of Business 38, 29–105. FAMA , E. F. (1965b) : “Portofolio analysis in a stable paretian market”, Management Science 11, 404– 419. FAMA , E. F. (1970) : “Multi-period consumption-investment decision”, American Economic Review 60, 163–174. FAMA , E. F. (1971) : “Efficient capital markets : A review of theory and empirical work”, Journal of Finance 25, 383–417. FANG , H. B., K. T. FANG ET S. KOTZ (2002) : “The meta-elliptical distributions with given marginals”, Journal of Multivariate Analysis 82, 1–16. FARMER , J. D. (1998) : “Market force, ecology and evolution”, Document de Travail, Santa Fe Institute. F ISHER , A., L. C ALVET ET B. M ANDELBROT (1998) : “Multifractal analysis of USD/DM exchange rates”, Document de Travail, Yale University. F ISHER , T. (2001) : “Coherent risk mesures depending on higher moments”, Document de Travail, University of Heidelberg.
508
Bibliographie
¨ F OLLMER , H. ET A. S CHIED (2002a) : “Convex measures of risk and trading constraints”, Finance & Stochastics 6, 429–447. ¨ F OLLMER , H. ET A. S CHIED (2002b) : “Robust preference and convex measures of risk”, Advances in Finance & Stochastics . F REES , W. E. ET E. A. VALDEZ (1998) : “Understanding relationship using copulas”, North American Actuarial Journal 2, 1–25. F REY, R., A. M C N EIL Risklab.
ET
M. N YFELER (2001) : “Credit Risk and Copulas”, Document de Travail,
F REY, R. ET A. J. M C N EIL (2002) : “VaR and expected shortfall in portfolios of dependent credit risks : Conceptual and practical insights”, Journal of Banking & Finance 26, 1317–1334. F REY, R. ET A. M C N EIL (2001) : “Modelling dependent defaults”, Document de Travail, Risklab. F RISCH , U. (1995) : Turbulence : The legacy of A.N. Kolmogorov, Cambridge university Press. F RISCH , U. ET D. S ORNETTE (1997) : “Extreme deviations and applications”, Journal de Physique I France 7, 1155–1171. G IARDINA , I. ET J. P. B OUCHAUD (2002) : “Bubbles, crashes and intermittency in agent based market models”, Document de Travail 0206222, Cond-mat. G ILBOA , I. ET D. S CHMEIDLER (1989) : “Maxmin expected utility with non-unique prior”, Journal of Mathematical Economics 18, 141–153. G LOSTEN , L. R., R. JAGANNANTHAN ET D. E. RUNKLE (1993) : “On the relation between the expected value and the volatilty of the nominal excess returns on stocks”, Journal of Finance 48, 779–801. G OLDBERG , J. ET R.
VON
N ITZSCH (2001) : Behavioral Finance, John Wiley.
G OLDIE , C. M. (1991) : “Implicit renewal theory and tails of solution of random equations”, Annals of Applied Probability 1, 126–172. ¨ G OLDIE , C. M. ET C. P. K L UPPELBERG (1998) : “Subexponential distributions”, in Adler, R., R. Feldman et M. Taqqu (eds.), A Practical Guide to Heavy Tails : Statistical Techniques for Analysing Heavy Tailed Distributions, Birkh¨auser, Boston, pp. 435–459. G ONEDES , N. (1976) : “Capital market equilibrium for a class of heterogeneous expectations in a twoparameter world”, Journal of Finance 31, 1–15. G OPIKRISHNAN , P., M. M EYER , L. A. N. A MARAL ET H. E. S TANLEY (1998) : “Inverse Cubic Law for the Distribution of Stock Price Variations”, European Physical Journal B 3, 139 –140. G OPIKRISHNAN , P., V. P LEROU , X. G ABIAX ET H. E. S TANLEY (2000) : “Statistical properties of share volume traded in financial markets”, physical Review E 62, 3023–3026. G OURI E´ ROUX , C. (1997) : ARCH Models and Financial Applications, Springer, Berlin. G OURI E´ ROUX , C., J. L AURENT ET O. S CAILLET (2000) : “Sensitivity analysis of values at risk”, Journal of Empirical Finance 7, 225–245. G OURI E´ ROUX , C. ET J. JASIAK (1998) : “Truncated maximum likelihood, goodness of fit tests and tail analysis”, Document de Travail, CREST. G OURI E´ ROUX , C. ET J. JASIAK (2001) : Financial Econometrics, Princetown university Press. G RANDMONT, J. M. (1998) : “Expectation formation and stability of large socioeconomic systems”, Econometrica 66, 741–782. G RANGER , C. W. J. ET R. J OYEUX (1980) : “An introduction to long-range time series models and fractional differencing”, Journal of Time Series Analysis 1, 15–30.
Bibliographie
509
¨ G RANGER , C. W. ET T. T ER ASVIRTA (1999) : “A simple nonlinear model with misleading linear properties”, Economics Letters 62, 161–165. G ROSSMAN , S. J. ET Z. Z HOU (1993) : “Optimal investment strategies for controlling drawdowns”, Mathematical Finance 3, 241–276. ¨ G UILLAUME , D. M., M. M. DACOROGNA , R. R. DAV E´ , U. A. M ULLER , R. B. O LSEN ET O. V. P ICTET (1997) : “From the bird eye to the microscope : a survey of new stylized facts of the intraday foreign exchange markets”, Finance & Stochastics 1, 95–130. H AKANSSON , N. (1971) : “Capital groth and the mean-variance approach to portfolio selection”, Journal of Financial and Quantitative Analysis 6, 517–557. H AUSMAN , J., A. W. L O ET C. M AC K INLAY (1992) : “An ordered probit analysis of transaction stock prices”, Journal of Financial Economics 31, 319–379. H EATH , D. (2000) : “Back to the future”, Plenary lecture at the First World Congress of the Bachelier Society, Paris. H ILL , B. M. (1975) : “A simple general approach to inference about the tail of a distribution”, Annals of Statistics 3, 1163–1174. H OSKING , J. R. M. (1981) : “Fractional differencing”, Biometrika 65, 165–176. H UISMAN , R., K. G. KOEDIJK ET R. A. J. P OWNALL (2001) : “Asset allocation in a Value-at-Risk framework”, Document de Travail, Erasmus University. H ULT, H. ET F. L INDSKOG (2001) : “Multivariate extremes, aggregation and dependence in elliptical distributions”, Document de Travail, Risklab. I DE , K. ET D. S ORNETTE (2002) : “Oscillatory finite-time singularities in finance, population and rupture”, Physica A 307, 63–106. JAFFRAY, J. ET F. P HILIPPE (1997) : “On the exitence of the subjective upper lower probabilities”, Mathematics of Operations Research 22, 165–185. J OE , H. (1997) : Multivariate models and dependence concepts, Chapman & Hall, London. J OHANSEN , A., D. S ORNETTE ET O. L EDOIT (1999) : “Predidicting financial crashes using discrete scale invariance”, Journal of Risk 1, 5–32. J OHANSEN , A., O. 
L EDOIT ET D. S ORNETTE (2000) : “Crashes as critical points”, International Journal of theoretical and Applied Finance 3, 219–255. J OHANSEN , A. ET D. S ORNETTE (1998) : “Stock market crashes are outliers”, European Physical journal B 1, 141–144. J OHANSEN , A. ET D. S ORNETTE (2002) : “Large market price drawdowns are outliers”, Journal of Risk 4, 69–110. J OHNSON , N. F., D. L AMPER , P. J EFFERIES , M. L. H ART ET S. H OWISON (2001) : “Application of multi-agent games to the prediction of financial time series”, Document de Travail 0105303, Cond-mat. J ONDEAU , E. ET M. ROCKINGER (2000) : “Conditional volatility, skewness and kurtosis : exitence and persistence”, Document de Travail, HEC. J ONDEAU , E. ET M. ROCKINGER (2001) : “Testing for differences in the tails of stock-market returns”, Document de Travail, HEC. J URCZENKO , E. ET B. M AILLET (2002) : “Multi-moment kernel asset pricing model (KAPM) : Some basic results”, in Jurczenko, E. et B. Maillet (eds.), Multi-Moment Capital Pricing Models, Springer. ¨ J URI , A. ET M. V. W UTHRICH (2002) : “Copula convergence theorem for tail events”, Insurance : Mathematics and Economics 30, 405–420.
510
Bibliographie
KAHNEMAN, D. ET A. TVERSKY (1979) : “Prospect theory : An analysis of decision under risk”, Econometrica 47, 263–291.
KAPLANSKI, G. ET Y. KROLL (2001a) : “Efficient VaR portfolios”, Document de Travail, Hebrew University of Jerusalem.
KAPLANSKI, G. ET Y. KROLL (2001b) : “Value-at-Risk equilibrium pricing model”, Document de Travail, Hebrew University of Jerusalem.
KARLEN, D. (1998) : “Using projection and correlation to approximate probability distributions”, Computers in Physics 12, 380–384.
KEARNS, P. ET A. R. PAGAN (1997) : “Estimating the density tail index for financial time series”, Review of Economics and Statistics 79, 171–175.
KEMPF, A. ET O. KORN (1999) : “Market depth and order size”, Journal of Financial Markets 2, 29–48.
KESTEN, H. (1973) : “Random difference equations and renewal theory for products of random matrices”, Acta Mathematica 131, 207–248.
KIMBALL, M. (1993) : “Standard risk aversion”, Econometrica 61, 573–589.
KIRMAN, A. (1983) : “Communication in markets : A suggested approach”, Economics Letters 12, 101–108.
KIRMAN, A. (1991) : “Epidemics of opinion and speculative bubbles in financial markets”, in Taylor, M. P. (ed.), Money and Financial Markets, Blackwell, Cambridge, chapter 17.
KLUGMAN, S. ET R. PARSA (1999) : “Fitting bivariate loss distributions with copulas”, Insurance : Mathematics and Economics 24, 139–148.
KRAUS, A. ET R. LITZENBERGER (1976) : “Skewness preference and the valuation of risk assets”, Journal of Finance 31, 1085–1100.
KRZYSZTOFOWICZ, R. ET K. S. KELLY (1996) : “A meta-Gaussian distribution with specified marginals”, Document de Travail, University of Virginia.
KUSUOKA, S. (2001) : “On law invariant coherent risk measures”, Advances in Mathematical Economics, Springer, Tokyo, pp. 83–95.
LAHERRÈRE, J. ET D. SORNETTE (1999) : “Stretched exponential distributions in nature and economy : Fat tails with characteristic scales”, European Physical Journal B 2, 525–539.
LALOUX, L., P. CIZEAU, J. BOUCHAUD ET M. POTTERS (1999) : “Noise dressing of financial correlation matrices”, Physical Review Letters 83, 1467–1470.
LALOUX, L., P. CIZEAU, J. BOUCHAUD ET M. POTTERS (2000) : “Random matrix theory and financial correlations”, International Journal of Theoretical and Applied Finance 3, 391–397.
LEBARON, B. (2001) : “Stochastic volatility as a simple generator of apparent financial power laws and long memory”, Quantitative Finance 1, 621–631.
LEBARON, B., W. ARTHUR ET R. PALMER (1999) : “Time series properties of an artificial stock market”, Journal of Economic Dynamics and Control 23, 1487–1516.
LEVY, H. (1998) : Stochastic Dominance : Investment Decision Making under Uncertainty, Kluwer Academic Press.
LÉVY, M., H. LÉVY ET S. SOLOMON (1995) : “Microscopic simulation of the stock market : The effect of microscopic diversity”, Journal de Physique I France 5, 1087–1107.
LI, D., P. MIKUSINSKI ET M. TAYLOR (1998) : “Strong approximation of copulas”, Journal of Mathematical Analysis and Applications 225, 608–623.
LILLO, F., J. D. FARMER ET R. N. MANTEGNA (2002) : “Single curve collapse of the price impact for the New York Stock Exchange”, Document de Travail 0207428, Cond-mat.
Bibliographie
LILLO, F. ET R. N. MANTEGNA (2002) : “Omori law after a financial market crash”, Physical Review Letters.
LINDSKOG, F. (2000) : “Modelling dependence with copulas”, Document de Travail, RiskLab.
LINDSKOG, F., A. MCNEIL ET U. SCHMOCK (2001) : “Kendall’s tau for elliptical distributions”, Document de Travail, RiskLab.
LINTNER, J. (1965) : “The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets”, Review of Economics and Statistics 47, 13–37.
LINTNER, J. (1969) : “The aggregation of investors’ diverse judgments and preferences in purely competitive security markets”, Journal of Financial and Quantitative Analysis 4, 347–400.
LIU, Y., P. CIZEAU, M. MEYER, C. K. PENG ET H. E. STANLEY (1997) : “Correlations in economic time series”, Physica A 245, 437–440.
LONGIN, F. M. (1996) : “The asymptotic distribution of extreme stock market returns”, Journal of Business 69, 383–408.
LONGIN, F. M. (2000) : “From VaR to stress testing : The extreme value approach”, Journal of Banking & Finance 24, 1097–1130.
LUX, T. (1995) : “Herd behaviour, bubbles and crashes”, Economic Journal 105, 881–896.
LUX, T. (1996) : “The stable Paretian hypothesis and the frequency of large returns : An examination of major German stocks”, Applied Financial Economics 6, 463–475.
LUX, T. (1997) : “The limiting extremal behavior of speculative returns : An analysis of intradaily data from the Frankfurt Stock Exchange”, Document de Travail, University of Bonn.
LUX, T. (1998) : “The socio-economic dynamics of speculative markets : Interacting agents, chaos, and the fat tails of return distributions”, Journal of Economic Behavior and Organization 33, 143–165.
LUX, T. ET D. SORNETTE (2002) : “On rational bubbles and fat tails”, Journal of Money, Credit and Banking 34, 589–610.
LUX, T. ET M. MARCHESI (1999) : “Scaling and criticality in a stochastic multi-agent model of a financial market”, Nature 397, 498–500.
MALEVERGNE, Y. ET D. SORNETTE (2001a) : “Multi-dimensional rational bubbles and fat tails”, Quantitative Finance 1, 533–541.
MALEVERGNE, Y. ET D. SORNETTE (2001b) : “Testing the Gaussian copula hypothesis for modelling financial assets dependence”, Submitted to Quantitative Finance.
MALEVERGNE, Y. ET D. SORNETTE (2002a) : “Collective origin of the coexistence of apparent RMT noise and factors in large sample correlation matrices”, Submitted to Physical Review Letters.
MALEVERGNE, Y. ET D. SORNETTE (2002b) : “Minimizing extremes”, Risk 15, 129–132.
MALEVERGNE, Y. ET D. SORNETTE (2002c) : “Multi-moment methods for portfolio management : Generalized capital asset pricing model in homogeneous and heterogeneous markets”, in Maillet, B. et Jurzenko (eds.), Proceedings of the Workshop on Multi-Moment Capital Pricing Models and Related Topics, Springer.
MALEVERGNE, Y. ET D. SORNETTE (2002d) : “VaR-efficient portfolios for super- and sub-exponentially decaying assets’ returns distributions”, Document de Travail, University of Nice-Sophia Antipolis.
MALEVERGNE, Y., V. PISARENKO ET D. SORNETTE (2002) : “Distribution of returns : Exponential versus power-like ?”, Document de Travail, University of Nice-Sophia Antipolis.
MANDELBROT, B. (1963) : “The variation of certain speculative prices”, Journal of Business 36, 392–417.
MANDELBROT, B. (1971) : “When can prices be arbitraged efficiently ? A limit to the validity of random walk and martingale models”, Review of Economics and Statistics 53, 225–261.
MANTEGNA, R. N. ET H. E. STANLEY (1994) : “Stochastic process with ultraslow convergence to a Gaussian : The truncated Lévy flight”, Physical Review Letters 73, 2946–2949.
MANTEGNA, R. N. ET H. E. STANLEY (1995) : “Scaling behaviour of an economic index”, Nature 376, 46–49.
MANTEGNA, R. N. ET H. E. STANLEY (1999) : An Introduction to Econophysics : Correlations and Complexity in Finance, Cambridge University Press.
MARKOWITZ, H. (1959) : Portfolio Selection : Efficient Diversification of Investments, John Wiley and Sons, New York.
MAURER, S. M. (2001) : “Portfolios of quantum algorithms”, Physical Review Letters 87, 257901.
MEERSCHAERT, M. ET H. SCHEFFLER (2001) : “Sample cross-correlations for moving averages with regularly varying tails”, Journal of Time Series Analysis 22, 481–492.
MERTON, R. C. (1973) : “An intertemporal capital asset pricing model”, Econometrica 41, 867–888.
MERTON, R. C. (1992) : Continuous-Time Finance, Blackwell, Cambridge, Massachusetts.
MONTESANO, A. ET F. GIOVANNONI (1996) : “Uncertainty aversion and aversion to increasing uncertainty”, Theory and Decision 41, 133–148.
MOSSIN, J. (1966) : “Equilibrium in a capital asset market”, Econometrica 34, 768–783.
MUZY, J. F., D. SORNETTE, J. DELOUR ET A. ARNÉODO (2001) : “Multifractal returns and hierarchical portfolio theory”, Quantitative Finance 1, 131–148.
MUZY, J. F., J. DELOUR ET E. BACRY (2000) : “Modelling fluctuations of financial time series : From cascade process to stochastic volatility model”, European Physical Journal B 17, 537–548.
MUZY, J. F. ET E. BACRY (2002) : “Multifractal stationary random measures and multifractal random walks with log-infinitely divisible scaling laws”, Document de Travail 0206202, Cond-mat.
NAKAMURA, Y. (1990) : “Subjective expected utility with non-additive probabilities on finite state spaces”, Journal of Economic Theory 51, 346–366.
NELSEN, R. (1998) : An Introduction to Copulas, Lecture Notes in Statistics 139, Springer-Verlag, New York.
NELSON, D. B. (1991) : “Conditional heteroskedasticity in asset returns : A new approach”, Econometrica 59, 347–370.
ORLÉAN, A. (1989) : “Comportements mimétiques et diversité des opinions sur les marchés financiers”, in Bourguinat, H. et P. Artus (eds.), Théorie Économique et Crises des Marchés Financiers, Economica, pp. 45–65.
ORLÉAN, A. (1992) : “Contagion des opinions et fonctionnements des marchés financiers”, Revue Économique 43, 685–697.
PAFKA, S. ET I. KONDOR (2001) : “Noisy covariance matrices and portfolio optimization”, Document de Travail 0111503, Cond-mat.
PAFKA, S. ET I. KONDOR (2002) : “Noisy covariance matrices and portfolio optimization II”, Document de Travail 0205119, Cond-mat.
PAGAN, A. R. (1996) : “The econometrics of financial markets”, Journal of Empirical Finance 3, 15–102.
PALMER, R., W. B. ARTHUR, J. H. HOLLAND, B. LEBARON ET P. TAYLOR (1994) : “Artificial economic life : A simple model of a stock market”, Physica D 75, 264–274.
PATTON, A. (2001) : “Estimation of copula models for time series of possibly different lengths”, Document de Travail 01-17, University of California, San Diego.
PFLUG, G. (2000) : “Some remarks on the value-at-risk and the conditional value-at-risk”, in Uryasev, S. (ed.), Probabilistic Constrained Optimization : Methodology and Applications, Kluwer Academic Publishers.
PICKANDS, J. (1975) : “Statistical inference using extreme order statistics”, Annals of Statistics 3, 119–131.
PLEROU, V., P. GOPIKRISHNAN, B. ROSENOW, L. N. AMARAL ET H. STANLEY (1999) : “Universal and nonuniversal properties of cross correlations in financial time series”, Physical Review Letters 83, 1471–1474.
PLEROU, V., P. GOPIKRISHNAN, X. GABAIX ET H. E. STANLEY (2001) : “Quantifying stock price response to demand fluctuations”, Document de Travail 0106657, Cond-mat.
POCHART, B. ET J. P. BOUCHAUD (2002) : “The skewed multifractal random walk with applications to option smiles”, Quantitative Finance 2, 303–314.
POON, S., M. ROCKINGER ET J. TAWN (2001) : “New extreme-value dependence measures and finance applications”, Document de Travail, HEC.
QUIGGIN, J. (1982) : “A theory of anticipated utility”, Journal of Economic Behavior and Organization 3, 323–343.
RICHARDSON, M. ET T. SMITH (1993) : “A test for multivariate normality in stock returns”, Journal of Business 66, 295–321.
ROCKAFELLAR, R. T. ET S. URYASEV (2000) : “Optimization of conditional value-at-risk”, Journal of Risk 2, 21–41.
ROCKAFELLAR, R. T. ET S. URYASEV (2002) : “Conditional value-at-risk for general loss distributions”, Journal of Banking & Finance 26, 1443–1471.
ROCKINGER, M. ET E. JONDEAU (2001) : “Conditional dependency of financial series : An application of copulas”, Document de Travail, HEC.
ROCKINGER, M. ET E. JONDEAU (2002) : “Entropy densities with an application to autoregressive conditional skewness and kurtosis”, Journal of Econometrics 106, 119–142.
ROEHNER, B. M. (2001) : Hidden Collective Factors in Speculative Trading, Springer-Verlag, New York.
ROEHNER, B. M. (2002) : Patterns of Speculation : A Study in Observational Econophysics, Cambridge University Press.
ROEHNER, B. M. ET D. SORNETTE (1998) : “The sharp peak-flat trough pattern and critical speculation”, European Physical Journal B 4, 387–399.
ROLL, R. (1973) : “Evidence on the ‘Growth-Optimum’ model”, Journal of Finance 28, 551–556.
ROOTZÉN, H., M. R. LEADBETTER ET L. DE HAAN (1998) : “On the distribution of tail array sums for strongly mixing stationary sequences”, Annals of Applied Probability 8, 868–885.
ROSENOW, B., V. PLEROU, P. GOPIKRISHNAN ET H. STANLEY (2001) : “Portfolio optimization and the random magnet problem”, Document de Travail 0111537, Cond-mat.
ROTHSCHILD, M. ET J. E. STIGLITZ (1970) : “Increasing risk I : A definition”, Journal of Economic Theory 2, 225–243.
ROY, A. D. (1952) : “Safety first and the holding of assets”, Econometrica 20, 431–449.
RUBINSTEIN, M. (1973) : “The fundamental theorem of parameter-preference security valuation”, Journal of Financial and Quantitative Analysis 8, 61–69.
SAMUELSON, P. A. (1941) : “The stability of equilibrium”, Econometrica 9, 97–120.
SAMUELSON, P. A. (1958) : “The fundamental approximation theorem of portfolio analysis in terms of means, variances and higher moments”, Review of Economic Studies 25, 65–86.
SAMUELSON, P. A. (1965) : “Proof that properly anticipated prices fluctuate randomly”, Industrial Management Review 6, 41–50.
SAMUELSON, P. A. (1973) : “Mathematics of speculative price”, SIAM Review 15, 1–42.
SAVAGE, L. (1954) : The Foundations of Statistics, John Wiley, New York.
SCAILLET, O. (2000a) : “Nonparametric estimation and sensitivity analysis of expected shortfall”, Document de Travail, HEC Genève.
SCAILLET, O. (2000b) : “Nonparametric estimation of copulas for time series”, Document de Travail, Université Catholique de Louvain.
SCHMEIDLER, D. (1986) : “Integral representation without additivity”, Proceedings of the American Mathematical Society 97, 255–261.
SCHMEIDLER, D. (1989) : “Subjective probability and expected utility without additivity”, Econometrica 57, 571–587.
SHAPIRO, A. ET S. BASAK (2000) : “Value-at-risk-based risk management : Optimal policies and asset prices”, Review of Financial Studies 14, 371–405.
SHARPE, W. (1964) : “Capital asset prices : A theory of market equilibrium under conditions of risk”, Journal of Finance 19, 425–442.
SHEFRIN, H. (2000) : Beyond Greed and Fear : Understanding Behavioral Finance and the Psychology of Investing, Harvard Business School Press.
SHILLER, R. J. (2000) : Irrational Exuberance, Princeton University Press.
SHLEIFER, A. (2000) : Inefficient Markets : An Introduction to Behavioral Finance, Oxford University Press.
SKLAR, A. (1959) : “Fonctions de répartition à n dimensions et leurs marges”, Publ. Inst. Statist. Univ. Paris 8, 229–231.
SMITH, A. (1776) : An Inquiry into the Nature and Causes of the Wealth of Nations, R. H. Campbell et A. S. Skinner (eds.), Oxford, UK : Clarendon Press, 1976.
SMITH, V. L. (1994) : “Economics in the laboratory”, Journal of Economic Perspectives 8, 113–131.
SMITH, V. L. (1998) : “The two faces of Adam Smith”, Southern Economic Journal 65, 1–19.
SORNETTE, D. (2000) : Critical Phenomena in Natural Sciences, Springer.
SORNETTE, D., J. V. ANDERSEN ET P. SIMONETTI (2000) : “Portfolio theory for ‘fat tails’”, International Journal of Theoretical and Applied Finance 3, 523–535.
SORNETTE, D., P. SIMONETTI ET J. ANDERSEN (2000) : “φq-field theory for portfolio optimization : ‘Fat tails’ and non-linear correlations”, Physics Reports 335, 19–92.
SORNETTE, D. ET J. V. ANDERSEN (2002) : “A nonlinear super-exponential rational model of speculative financial bubbles”, International Journal of Modern Physics C 13, 171–188.
SORNETTE, D. ET Y. MALEVERGNE (2001) : “From rational bubbles to crashes”, Physica A 299, 40–59.
SORNETTE, D., Y. MALEVERGNE ET J. F. MUZY (2002) : “Volatility fingerprints of large shocks : Endogenous versus exogenous”, forthcoming in Risk.
STAUFFER, D. ET D. SORNETTE (1999) : “Self-organized percolation model for stock market fluctuations”, Physica A 271, 496–506.
SUSMEL, R. (1996) : “Switching volatility in international equity markets”, Document de Travail, Dept. of Finance, University of Houston.
SZEGÖ, G. (1999) : “A critique to Basel regulation, or how to enhance (im)moral hazards”, Proceedings of the International Conference on Risk Management and Regulation in Banking, Bank of Israel, Kluwer.
TASCHE, D. (2000) : “Risk contributions and performance measurement”, Document de Travail, TU München.
TASCHE, D. (2002) : “Expected shortfall and beyond”, Journal of Banking & Finance 26, 1519–1533.
TASCHE, D. ET L. TIBILETTI (2001) : “Approximations for the value-at-risk approach to risk-return analysis”, Document de Travail 269733, SSRN.
THALER, R. H. (ed.) (1993) : Advances in Behavioral Finance, Russell Sage Foundation, New York.
VALDEZ, E. A. (2001) : “Bivariate analysis of survivorship and persistency”, Insurance : Mathematics and Economics 29, 357–373.
VENEZIANO, D., G. E. MOGLEN ET R. L. BRAS (1995) : “Multifractal analysis : Pitfalls of standard procedures and alternatives”, Physical Review E 52, 1387–1398.
VON NEUMANN, J. ET O. MORGENSTERN (1947) : Theory of Games and Economic Behavior, Princeton University Press.
WANG, T. (1999) : “A class of dynamic risk measures”, Document de Travail, University of British Columbia.
WESTERHOFF, F. H. (2001) : “Speculative markets and the effectiveness of price limits”, Document de Travail, Universität Osnabrück.
WESTERHOFF, F. H. (2002) : “Heterogeneous traders and the Tobin tax”, Document de Travail, Universität Osnabrück.
WISHART, J. (1928) : “The generalized product moment distribution in samples from a normal multivariate population”, Biometrika 20A, 32–52.
ZAKOIAN, J. M. (1994) : “Threshold heteroskedastic models”, Journal of Economic Dynamics and Control 18, 931–955.
ZHANG, Y. C. (1999) : “Toward a theory of marginally efficient markets”, Physica A 269, 30–44.
Risques extrêmes en finance : Statistique, théorie et gestion de portefeuille
Résumé : Cette thèse propose une étude des risques extrêmes sur les marchés financiers, considérés comme un exemple typique de système complexe auto-organisé. Nous commençons par décrire et modéliser les propriétés statistiques individuelles des actifs financiers afin d’en déduire une estimation précise des grands risques et d’en comprendre les mécanismes sous-jacents en relation avec la micro-structure des marchés et le comportement des agents économiques. À l’aide des copules et des modèles à facteurs, nous analysons ensuite les propriétés de dépendance extrêmes des actifs financiers afin de mieux cerner les possibilités et les limites de la diversification des grands risques. Enfin, nous étudions les mesures de risques les plus à même de quantifier des risques extrêmes et appliquons l’ensemble de ces résultats à l’obtention de portefeuilles les moins sensibles à ce type de risques. Parallèlement, nous exposons certaines conséquences de ce type d’allocation sur les équilibres de marché.
Mots-clés : système complexe auto-organisé, risques extrêmes, marchés financiers, gestion de portefeuille, éconophysique, bulles spéculatives, distributions à queues épaisses, copules.
Extreme risks in finance : Statistics, theory and portfolio management
Summary : This thesis studies the extreme risks observed on financial markets, considered as a typical example of self-organized complex systems. We start by describing and modeling the individual statistical properties of financial assets in order to obtain an accurate estimation of large risks and a better understanding of the underlying mechanisms, in relation with market microstructure and the behavior of economic agents. Using copulas as well as factor models, we then analyze the dependence properties between assets, and more specifically their extreme counterparts, in order to understand both the opportunities for diversification of large risks and its limits. Finally, we study the risk measures that are most appropriate for the assessment of extreme risks, and apply all these results to construct the portfolios that are least sensitive to such risks. In parallel, we develop some consequences of such asset allocations for market equilibria.
Keywords : self-organized complex systems, extreme risks, financial markets, portfolio management, econophysics, speculative bubbles, heavy-tailed distributions, copulas.