This conference is co-sponsored by the American Statistical Association and the Institute of Mathematical Statistics. We thank the National Science Foundation, the Army Research Office, the Office of Naval Research and the Mellon Fund of the American University College of Arts and Sciences for their financial support, which made this meeting possible.
All talks are in the Ward Circle Building at the intersection of Massachusetts Avenue and Nebraska Avenue. Opening remarks and invited talks are in Ward 1; contributed-talk room numbers are given in the schedules below.
Thursday, June 3 | |||
8:30 | Opening remarks - Provost Kerwin | ||
8:45 | Benoit Mandelbrot | ||
9:30 | Break | ||
9:45 | Paul Embrechts | ||
10:30 | Dilip Madan | ||
11:15 | G. Tong Zhou | ||
12:00 | Lunch | ||
13:00 | Michel M. Dacorogna | ||
13:45 | Robert Adler | ||
14:30 | Sid Resnick | ||
15:15 | Break | ||
Contributed session 1 | Contributed session 2 | Contributed session 3 | |
Ward 4 | Ward 6 | Ward 1 | |
15:30 | Mark M. Meerschaert | Douglas A. Abraham | Boris Choy |
15:45 | Paul L. Anderson | Jacek Ilow | David Khabie-Zeitoune |
16:00 | David Benson | Maymon, Friedmann, et al. | Jon. R. M. Hosking |
16:15 | Schertzer, Lovejoy | David Levey | Philippe Lambert |
16:30 | Schmitt, Schertzer, Lovejoy | Roughan, Yates and Veitch | Hippolyte Fofack |
16:45 | Tchiguirinskaya, Molz, Lu | Ananthram Swami | Zbigniew W. Kominek |
17:00 | Panel discussion - directions for future research | ||
18:00 | Adjourn |
Friday, June 4 | |||
8:30 | Gonzalo R. Arce | ||
9:15 | Stephen McLaughlin | ||
10:00 | Yang, Petropulu, Adams | ||
10:45 | Break | ||
11:00 | Gennady Samorodnitsky | ||
11:45 | Raya Feldman | ||
12:30 | Lunch | ||
13:30 | Rolf-Dieter Reiss | ||
14:15 | Alexander Levin | ||
15:00 | Murad S. Taqqu | ||
15:45 | Break | ||
Contributed session 4 | Contributed session 5 | Contributed session 6 | |
Ward 4 | Ward 6 | Ward 1 | |
16:00 | Zhang, Blum, Sadler, Kozick | | Dance and Kuruoglu |
16:15 | Bob Pierce | Leong Lan | John Nolan |
16:30 | Silong Lu | Sapir Luba | Mor Harchol-Balter |
16:45 | Tchiguirinskaya, Schertzer, Lovejoy | Donald P. Cram | Gomes and Selman |
17:00 | Adjourn | ||
18:00 | Banquet |
Saturday, June 5 | |||
8:30 | Nalini Ravishanker | ||
9:15 | Richard A. Davis | ||
10:00 | Break | ||
10:15 | J. Huston McCulloch | ||
11:00 | Til Schuermann | ||
Contributed session 7 | Contributed session 8 | Contributed session 9 | |
Ward 4 | Ward 6 | Ward 1 | |
11:45 | Colin Gallagher | Kozubowski and Podgórski | Hernandez-Molinar, Lefante |
12:00 | Garel and Hallin | Steven J. Sepanski | Hasan Hamdan |
12:15 | Godsill and Kuruoglu | Inmaculada B. Aban | Zhaohui Qin |
12:30 | Keith Knight | Haug, Frigessi, Gjerde, Rue | Rolf Reiss |
8:30 AM Opening Remarks - Provost Cornelius M. Kerwin
8:45 AM Benoit Mandelbrot (Abstract 1),
IBM/Yale,
The multifractality of financial prices
9:30 AM Break
9:45 AM Paul Embrechts (Abstract 2),
ETH Zurich,
Extremes and integrated risk management
10:30 AM Dilip Madan (Abstract 3),
University of Maryland,
Purely discontinuous asset price processes
11:15 AM G. Tong Zhou (Abstract 4),
Georgia Tech,
Modeling the Short and Long Memories of VBR Video Streams
12:00 PM Lunch
1:00 PM Michel M. Dacorogna (Abstract 5),
Olsen & Associates, Zurich, Switzerland,
Does Diversification Work in Case of Market Shocks?
A High Frequency Study of the Correlation of Extremes
1:45 PM Robert Adler (Abstract 6),
Technion and UNC, Chapel Hill,
Nonlinear Time Series via Mixtures of Autoregressions
2:30 PM Sid Resnick (Abstract 7),
Cornell University,
Infinite source Poisson models with heavy tailed transmission
times
3:15 PM Break
3:30 - 5:00 PM Contributed Session 1 - Ward Circle Building, Room 4
3:30 PM Mark M. Meerschaert (Abstract 8),
University of Nevada,
Fractional Diffusion
3:45 PM Paul L. Anderson (Abstract 9),
Albion College,
Modeling River Flows with Heavy Tails
4:00 PM David Benson (Abstract 10),
Desert Research Institute,
Simple and Accurate Solutions of Heavy-Tailed Contaminant
Transport in Aquifers
4:15 PM D. Schertzer, S. Lovejoy (Abstract 11),
CNRS-Universite P.M. Curie, McGill U.,
Statistics and Heavy Tails in Environment and Geophysics
4:30 PM F. Schmitt, D. Schertzer, S. Lovejoy (Abstract 12),
CNRS-Universite P.M. Curie, McGill U.,
Multifractal Stochastic Dynamics and Heavy Tails in Finance
4:45 PM I. Tchiguirinskaya, F.J. Molz and S. Lu (Abstract 13),
Clemson University- Clemson Research Park,
The Unified Multifractal Model of Hydraulic Conductivity
3:30 - 5:00 PM Contributed Session 2 - Ward Circle Building, Room 6
3:30 PM Douglas A. Abraham (Abstract 14),
University of Connecticut,
A Unifying Model for Non-Rayleigh Active Sonar Reverberation
3:45 PM Jacek Ilow (Abstract 15),
Dalhousie University,
Blind Channel Identification with α-Stable Inputs Based on the
Multivariate Empirical Characteristic Function
4:00 PM Shay Maymon, Jonathan Friedmann, Eran Fishler and Hagit Messer-Yaron (Abstract 16),
Tel Aviv University, ISRAEL,
Estimation of the Parameters of a Stable Distribution
Based on Order Statistics
4:15 PM David Levey (Abstract 17),
University of Edinburgh,
Heavy-Tailed Distributions of Impulse Noise
4:30 PM Matthew Roughan, Jennifer Yates and Darryl Veitch (Abstract 18),
Software Engineering Research Centre, RMIT, Melbourne, Australia,
On Some Difficulties in the Use of Fractal Renewal Processes to
Simulate Long-Range Dependent Processes
4:45 PM Ananthram Swami (Abstract 19),
Army Research Laboratory,
On some detection and estimation problems in heavy-tailed noise.
3:30 - 5:00 PM Contributed Session 3 - Ward Circle Building, Room 1
3:30 PM Boris Choy (Abstract 20),
The University of Hong Kong,
Bayesian Value at Risk with Spherically Symmetric Distributions
3:45 PM David Khabie-Zeitoune (Abstract 21),
Imperial College,
Factor GARCH, regime-switching and the term structure of
interest rates
4:00 PM Jon. R. M. Hosking (Abstract 22),
IBM Research Division,
L-moments and their applications in the analysis of financial data
4:15 PM Philippe Lambert (Abstract 23),
Universite de Liege,
Modelling skewness and the occurrence of consecutive extremes
in financial series using the stable and the skewed-stable
distributions
4:30 PM Hippolyte Fofack (Abstract 24),
World Bank,
Distribution of parallel exchange rates in African countries
4:45 PM Wojciech W. Charemza and Zbigniew W. Kominek (Abstract 25),
University of Leicester, UK,
Efficient returns, thick tails and speculative processes
5:00 - 6:00 PM Panel discussion: directions for future research
8:30 AM Gonzalo R. Arce, Sudhakar Kalluri, and A. Brinton Cooper III (Abstract 26),
University of Delaware,
Maximum Likelihood Decoding of Convolutional Codes for Non-Gaussian
Channels
9:15 AM Stephen McLaughlin (Abstract 27),
University of Edinburgh,
Stable Distributions in Teletraffic Analysis and Modelling
10:00 AM Xueshi Yang, Athina P. Petropulu and V. Adams (Abstract 28),
Drexel University,
The Extended On/Off model for High-Speed Data Network
10:45 AM Break
11:00 AM Gennady Samorodnitsky and Thomas Mikosch (Abstract 29),
Cornell University,
Ruin probability with claims modeled by a stationary ergodic stable process
11:45 AM Raya Feldman (Abstract 30),
University of California, Santa Barbara,
Filtering from observations with Levy noise.
12:30 PM Lunch
1:30 PM Rolf-Dieter Reiss (Abstract 31), University of Siegen, Germany,
An analysis of exchange rates using XGPL/Xtremes
2:15 PM Alexander Levin (Abstract 32),
Bank of Montreal,
Multifactor Gamma Stochastic Variance Value-at-Risk Model
3:00 PM Murad S. Taqqu (Abstract 33),
Boston University,
The asymptotic behavior of the Mandelbrot-Weierstrass process
3:45 PM Break
4:00 - 5:00 PM Contributed Session 4 - Ward Circle Building, Room 4
4:00 PM Yumin Zhang, Rick Blum, Brian Sadler and Rick Kozick (Abstract 34),
Lehigh University, Army Research Lab, Bucknell University,
On the Approximation of Correlated Non-Gaussian Pdfs
Using Gaussian Mixture Models
4:15 PM Bob Pierce (Abstract 35),
Naval Surface Warfare Center,
Codifference and Dependent, Complex, Isotropic SαS Random Variables
4:30 PM Silong Lu (Abstract 36),
Clemson Univ.,
Is Hydraulic Conductivity at the MADE Site Governed by
Stable Processes?
4:45 PM I. Tchiguirinskaya, D. Schertzer and S. Lovejoy (Abstract 37),
Clemson University- Clemson Research Park,
The Physics of Statistical Heavy Tails of Turbulent Intermittency
4:00 - 5:00 PM Contributed Session 5 - Ward Circle Building, Room 6
4:15 PM Leong Lan (Abstract 38),
Monash University, Malaysia,
What Does Quantum Mechanics Have in Common with Stock Markets?
4:30 PM Sapir Luba (Abstract 39),
Ben-Gurion University,
Expert rule versus majority rule under partial information,
III
4:45 PM Donald P. Cram (Abstract 40),
MIT Sloan School of Management,
Symmetric Stable Binary Regression with Application to Environmental
Management Decision-making
4:00 - 5:00 PM Contributed Session 6 - Ward Circle Building, Room 1
4:00 PM Christopher R. Dance and Ercan E. Kuruoglu (Abstract 41),
Xerox Research Centre Europe, Cambridge, UK,
Estimation of the Parameters of Skewed Stable Distributions
4:15 PM John Nolan (Abstract 42),
American University,
Data analysis for stable distributions
4:30 PM Mor Harchol-Balter (Abstract 43),
M.I.T. Laboratory for Computer Science,
The Effect of Heavy-Tailed Job Size Distributions on Computer System Design
4:45 PM Carla Gomes and Bart Selman (Abstract 44),
Cornell,
Heavy-Tailed Distributions in Computational Methods
5:00 PM Break
6:00 PM Banquet, Hogates Restaurant, 800 Water Street
(directions in registration packet)
8:30 AM Nalini Ravishanker (Abstract 45),
University of Connecticut,
Monte Carlo EM Estimation for Stable Distributions
9:15 AM Richard A. Davis (Abstract 46),
Colorado State University,
Linear Processes With Nonlinear Behavior
10:00 AM Break
10:15 AM J. Huston McCulloch (Abstract 47),
Ohio State University,
Implications of unknown skewness for mean stock returns.
11:00 AM Til Schuermann (Abstract 48),
Oliver Wyman,
Pitfalls and Opportunities of Extreme Value Theory in Finance
11:45 AM - 12:45 PM Contributed Session 7 - Ward Circle Building, Room 4
11:45 AM Colin Gallagher (Abstract 49),
Clemson University,
The Autocovariation Function with Applications to Time
Series Modeling
12:00 PM M. Bernard Garel and Marc Hallin (Abstract 50),
ENSEEIHT, Toulouse, France and Univ Libre de Bruxelles, Brussels, Belgium,
Rank-based statistics and stable AR processes
12:15 PM Simon Godsill and Ercan Kuruoglu (Abstract 51),
University of Cambridge,
Bayesian inference for time series with heavy-tailed noise sources
12:30 PM Keith Knight (Abstract 52),
University of Toronto,
Asymptotic behaviour of extreme regression quantiles
11:45 AM - 12:45 PM Contributed Session 8 - Ward Circle Building, Room 6
11:45 AM Tomasz J. Kozubowski and Krzysztof Podgórski (Abstract 53),
University of Tennessee at Chattanooga and Indiana
University Purdue University Indianapolis,
Asymmetric Laplace Laws and Modeling Financial Data.
12:00 PM Steven J. Sepanski (Abstract 54),
Saginaw Valley State University,
Some laws of the iterated logarithm for generalized
domain of attraction
12:15 PM Inmaculada B. Aban (Abstract 55),
University of Nevada, Reno,
Shifted Hill's estimator for heavy tails
12:30 PM Ola Haug, Arnoldo Frigessi, Jon Gjerde and Havard Rue (Abstract 56),
Norwegian Computing Center,
Tail estimation with the Generalised Pareto Distribution without
threshold selection
11:45 AM - 12:45 PM Contributed Session 9 - Ward Circle Building, Room 1
11:45 AM Raul Hernandez-Molinar, John Lefante (Abstract 57),
Tulane University,
Heavy Tail Estimation Using Upper Order Statistics For
Truncated Weibull,
Generalized Pareto and Lognormal Distributions.
12:00 PM Hasan Hamdan (Abstract 58),
American University,
Approximating Variance Mixtures of Normals
12:15 PM Zhaohui Qin (Abstract 59),
University of Michigan,
Some Extensions of the Scale Mixture of Uniforms Method
12:30 PM Rolf-Dieter Reiss (Abstract 60),
University of Siegen, Germany,
Exact Credibility Estimation in Certain Pareto Models
Risk management within the banking industry has evolved from a set of quantitative tools for handling market risk into a complete theory combining various types of risk (market, credit, operational, etc.). Recently, the combination of insurance risk and investment risk within an all-finance environment has added a further layer of integration (an example of this is DFA: Dynamic Financial Analysis). In this talk I will review the key mathematical tools used within IRM. Extreme value techniques play an important role, for instance in the construction of dynamic value-at-risk measures. Another important issue concerns the modelling of dependency in high-dimensional non-Gaussian (read: heavy-tailed) data. Various examples will be presented.
This paper presents the case for modeling asset price processes as purely discontinuous processes of finite variation with an infinite arrival rate of jumps, whose arrival rates are completely monotone in the jump size. The arguments address both the empirical realities of asset returns and the implications of the economic principle of no arbitrage. Two classes of economic models meeting these conditions are presented and linked. An important example, given by the variance gamma process, is studied in detail and used to design optimal derivative investment portfolios; these are calibrated to actual portfolios to reverse engineer trader preferences and beliefs and to infer personalized risk neutral measures, termed position measures. Illustrative comparisons of statistical, risk neutral and position measures are also provided.
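The variance gamma process mentioned above has a well-known simulation recipe: Brownian motion with drift evaluated at a gamma time change. A minimal sketch, with parameter values of our own choosing (nothing here is taken from the talk):

```python
# Variance gamma increments: Brownian motion with drift theta and volatility
# sigma, time-changed by gamma increments with mean dt and variance nu*dt.
# Parameter values are illustrative only.
import numpy as np

rng = np.random.default_rng(9)
theta, sigma, nu, dt, n = 0.0, 0.2, 0.5, 1.0, 100_000
g = rng.gamma(dt / nu, nu, n)                        # gamma time increments, E[g] = dt
incr = theta * g + sigma * np.sqrt(g) * rng.standard_normal(n)
sample_var = incr.var()                              # for theta = 0: near sigma**2 * dt = 0.04
```

Summing `incr` gives a purely discontinuous finite-variation price path of the kind the abstract argues for.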
Variable Bit Rate (VBR) video is expected to be a major source of traffic in high speed communication networks. In order to design networks that employ statistical multiplexing to improve bandwidth efficiency, statistical source models are necessary to characterize the traffic data. Several studies have shown that VBR video frame sizes exhibit both short and long memories and their distribution is approximately log-normal. In this paper, we represent the properly transformed VBR data as a Gaussian fractional ARIMA process and employ Whittle's method to estimate its parameters. The short and long memory properties of the original VBR data are then indirectly but parsimoniously characterized.
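A stripped-down illustration of the Whittle step for the memory parameter d of a FARIMA(0, d, 0) process. This is a sketch of the general idea only; the function name and parameter choices are ours, not the authors', and the full fractional ARIMA fit would also estimate short-memory terms.

```python
# Whittle estimation of d for FARIMA(0, d, 0): minimize the (profiled)
# Whittle likelihood of the periodogram against the model spectral shape.
import numpy as np
from scipy.optimize import minimize_scalar

def whittle_d(x):
    """Estimate the memory parameter d from the periodogram of x."""
    x = np.asarray(x, float)
    n = len(x)
    freqs = 2 * np.pi * np.arange(1, n // 2 + 1) / n
    I = np.abs(np.fft.fft(x - x.mean())[1:n // 2 + 1]) ** 2 / (2 * np.pi * n)
    def obj(d):
        f = np.abs(2 * np.sin(freqs / 2)) ** (-2 * d)   # FARIMA(0,d,0) spectral shape
        return np.log(np.mean(I / f)) + np.mean(np.log(f))
    return minimize_scalar(obj, bounds=(-0.49, 0.49), method="bounded").x

rng = np.random.default_rng(0)
d_hat = whittle_d(rng.standard_normal(4096))   # white noise: d should be near 0
```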
There is a common wisdom among investors that diversification effects become weaker when financial markets are subjected to strong shocks. To study this problem, we analyze how the dependency structure between financial assets depends on shocks in the market. In particular, we investigate whether there is a mob effect, meaning that when extreme events happen the market behaves differently than it does normally.
Part of the problem can be tackled by investigating the tails of multivariate processes. The standard assumptions of multivariate extreme value theory imply that the processes have the same tail exponent when conditioned on a ray from the origin. When a bivariate process is written in polar coordinates (r, θ) and conditioned on θ, the tail exponent of the resulting process is independent of θ. This has led to the study of the spectral measure; see de Haan and de Ronde (1998) and Starica (1998).
We propose to explore the tails by a new method, which can be seen as an extension of the spectral measure. We explore the distribution along the ray conditioned on θ and compute the tail indices of these distributions using the Hill estimator. Contrary to the results of multivariate extreme value theory, we find that for financial assets the tail exponent does indeed depend on the angle. For two correlated assets the tail seems fatter when the assets move in the same direction than when they move in opposite directions. Furthermore, comparing the two distributions resulting from conditioning on the diagonals, we show that one cannot be obtained from the other by simply changing scale and location. To confirm these findings we also consider the linear correlation of the process conditioned on the micro-activity of the market (computed from very short-term volatility). By comparing this conditional correlation to that of known processes, one can obtain information about how the market behavior during shocks deviates from its normal behavior.
We analyze high frequency data for four major foreign exchange rates, USD/DEM, USD/JPY, USD/CHF and GBP/USD.
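The Hill estimator at the heart of this method can be sketched as follows. Synthetic bivariate Student-t returns (tail index 3) stand in for the FX data, and the conditioning on the angle θ is reduced to a crude same-sign/opposite-sign split; everything here is an illustrative assumption of ours, not the authors' code.

```python
# Hill tail-index estimates for the radial part of bivariate returns,
# conditioned on direction (same-sign vs opposite-sign moves).
import numpy as np

def hill(x, k):
    """Hill estimator of the tail index from the k largest values of x."""
    x = np.sort(np.asarray(x, float))[::-1]
    return 1.0 / np.mean(np.log(x[:k] / x[k]))

rng = np.random.default_rng(1)
ret = rng.standard_t(df=3.0, size=(20_000, 2))        # true tail index ~ 3
same = ret[ret[:, 0] * ret[:, 1] > 0]                 # moves in the same direction
opp = ret[ret[:, 0] * ret[:, 1] < 0]                  # moves in opposite directions
alpha_same = hill(np.hypot(same[:, 0], same[:, 1]), k=200)
alpha_opp = hill(np.hypot(opp[:, 0], opp[:, 1]), k=200)
```

For independent components the two estimates agree; the abstract's finding is that real FX data do not.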
I will describe a novel class of non-linear models for time series analysis based on mixtures of local autoregressive models called MixAR models. These are constructed so that locally (in the state space), the processes follow a linear autoregressive structure. In addition, there is a state dependent probability distribution defined over the different AR models.
This is joint work with Ronny Meir of the Technion and Assaf Zeevi of the Technion and Stanford.
By allowing some of the local autoregressions to have heavy tailed noise, it is also possible to generate a wide class of non-linear models in which the tail and central behaviour can both be modelled and estimated in a precise and meaningful fashion.
Fluid queues are frequently used models for telecommunication networks. Consider a fluid queue fed by an infinite number of sources, where a server works off the load at constant rate r and where sources initiate transmissions or connections at Poisson times, with work flowing into the system at unit rate during each transmission. Transmissions last for iid periods governed by a heavy-tailed distribution of session lengths. The heavy tails induce long range dependence in the system and result in performance deterioration. The expected time it takes such a fluid queue with finite but large holding capacity L to reach buffer overflow grows only polynomially fast in L, so overflows happen much more often than in the ``classical'' light-tailed case, where the expected overflow time grows exponentially in L.
We consider Gaussian approximations to the fluid content when tails are so heavy that not even the mean exists. Such a situation is encountered with data of the sizes of files downloaded during WWW sessions and is of more than academic interest. When a finite transmission mean exists, we discuss when cumulative input can be approximated by fractional Brownian motion and when an approximation by Levy stable motion is appropriate. We also include some remarks about heavy traffic approximations when service requirements are heavy tailed and the load increases towards instability.
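A toy simulation of the infinite-source Poisson input described above, with Pareto session lengths. Arrival rate, tail index and horizon are illustrative values of ours; a real study would push the tail index below 2 and look at the long-range dependence of the count process.

```python
# Infinite-source Poisson input: sessions start at Poisson times and last
# for heavy-tailed (Pareto) iid durations; we count active sessions on a grid.
import numpy as np

rng = np.random.default_rng(2)
lam, alpha, T = 5.0, 1.5, 1000.0           # arrival rate, tail index, horizon
n = rng.poisson(lam * T)
starts = rng.uniform(0.0, T, n)
lengths = rng.pareto(alpha, n) + 1.0        # Pareto(alpha) on [1, inf); E = alpha/(alpha-1) = 3
grid = np.arange(0.0, T, 1.0)
active = ((starts[None, :] <= grid[:, None])
          & (grid[:, None] < (starts + lengths)[None, :])).sum(axis=1)
mean_active = active[200:].mean()           # after warm-up, near lam * E[length] = 15
```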
Just as Brownian motion solves the classical diffusion equation, Levy motion solves a diffusion equation with fractional derivatives. Recent applications in Hydrology model the diffusion of ground water as a Levy motion. The resulting stable concentration profiles provide an exceptionally good fit to contaminant plume data. In this talk we develop a multivariable fractional diffusion equation which is solved by a vector Levy motion. We use an asymmetric fractional derivative operator which can be defined in terms of convolution with a Levy measure. The resulting Levy motion is also the scaling limit of a random walk, representing particle jumps whose magnitude has a heavy probability tail, and whose random direction is governed by the spectral measure of the Levy motion.
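The claim that Lévy motion arises as the scaling limit of a heavy-tailed random walk can be checked numerically. The sketch below uses the Chambers-Mallows-Stuck sampler for symmetric α-stable steps (valid for α ≠ 1); by stability, the walk after n steps, rescaled by n^(1/α), has the same law as a single step.

```python
# Symmetric alpha-stable random walk and its self-similar rescaling.
import numpy as np

def sas_rvs(alpha, size, rng):
    """Chambers-Mallows-Stuck sampler for standard symmetric alpha-stable variates."""
    U = rng.uniform(-np.pi / 2, np.pi / 2, size)
    W = rng.exponential(1.0, size)
    return (np.sin(alpha * U) / np.cos(U) ** (1 / alpha)
            * (np.cos((1 - alpha) * U) / W) ** ((1 - alpha) / alpha))

rng = np.random.default_rng(3)
alpha, n, paths = 1.1, 500, 2000
steps = sas_rvs(alpha, (paths, n), rng)
walk_end = steps.sum(axis=1)
scaled = walk_end / n ** (1 / alpha)        # same law as one step, by stability
med_ratio = np.median(np.abs(scaled)) / np.median(np.abs(steps))  # should be near 1
```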
Recent advances in time series analysis provide alternative models for river flows in which the innovations have heavy tails, so that some of the moments do not exist. The probability of large fluctuations is much larger than for standard models. We survey some recent theoretical developments for heavy tail time series models, and illustrate their practical application to river flow data from the Salt river near Roosevelt, Arizona, USA. We also include some simple diagnostics that the practitioner can use to identify when the methods of this paper may be useful.
A carefully controlled tracer test was conducted at the Columbus Air Force Base (MADE site) in Mississippi. The aquifer was notably different from other well-studied tracer experiments. The degree of hydraulic conductivity (K) variability was much higher, although most practitioners would consider the high variability run-of-the-mill. The high variance of the sample data violates the basic assumptions of traditional 2^{nd}-order transport theories. A fractional-order dispersion equation of order α that describes particles that undergo Lévy, rather than Brownian, motion, readily describes the highly skewed and heavy-tailed plume development at the MADE site. Based on plume measurements and K increments, the order of fractional differentiation (corresponding to the Lévy index) is shown to be α = 1.1. Simple arguments lead to accurate estimates of the mean velocity and dispersion constants based only on the K statistics and the hydraulic gradient. While the traditional (2^{nd}-order) transport equation in various forms (stochastic, numerical) fails to model a conservative tracer in the MADE aquifer, the fractional equation predicts tritium concentration profiles with remarkable accuracy over all spatial and temporal scales. The implication of heavy-tailed concentrations is sobering when considering the eventual cleanup of aquifers or long-term containment of radioactive tracers.
Heavy tails seem rather ubiquitous in geophysics and the environment. We first review some of the recent claims of empirical evidence for them, ranging from underground phenomena to atmospheric dynamics, and including earthquakes, hydraulic conductivity, precipitation, river floods, pollution, clouds, extreme temperatures, cyclones, etc. However, there is a question of particular importance for statistical analysis: how robust and accurate can the empirical estimate of the power law exponent q_{d} of the tail of the probability distribution be, given a limited sample?
We therefore discuss the relevant stochastic framework and show that it is multiplicative rather than additive, for empirical reasons (e.g. the exponents of the distribution tails are often greater than 2), as well as for phenomenological reasons (e.g. the phenomenology of cascades) and theoretical reasons (e.g. the constraint of conservation of the flux of energy). We show that the corresponding multifractal framework yields rather general criteria for the estimate of q_{d}, in particular with the help of a sampling dimension, which corresponds to an effective dimension of the sample. Furthermore, in the framework of multifractal universality, these criteria can be analytically derived from the 4 universal exponents that fully characterize the process. This helps us assess the validity of the reviewed claims.
In this paper we show that the heavy tails in finance are generated by a multiplicative process rather than by an additive process, and we consider some of the consequences. The analysis of several foreign exchange financial datasets yields a nonlinear scaling exponent for the structure functions of price fluctuations (i.e. the moments of these fluctuations versus the time lag), whereas it should be linear (or bilinear) for additive scaling models. On the other hand, the tails of the probability distribution have a power law exponent q_{d} ≈ 3 for financial time series. Since q_{d} = α < 2 for any additive Lévy model, whereas it is unbounded for a multiplicative process, this confirms the multifractal nature of financial fluctuations. Further to these empirical findings, we consider a multifractal integrated flux model of prices, i.e. prices correspond to a fractional integration of a flux of finance flowing through the different time scales of the process, which is (statistically) strictly scale invariant. The main application of this multifractal model is predictability: past and present values of the time series can be exploited to provide an optimal forecast. This contradicts the frequently assumed efficient market hypothesis.
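The structure-function diagnostic is easy to state in code: compute S_q(τ) = ⟨|x(t+τ) − x(t)|^q⟩ and regress log S_q on log τ. The sketch below applies it to a Brownian-like additive process, for which the scaling exponent ζ(q) = q/2 is exactly linear in q; a multifractal price series would bend away from such a line. The data and function name are our own illustration.

```python
# Scaling exponents of structure functions S_q(tau) via log-log regression.
import numpy as np

def zeta(x, q, lags):
    """Slope of log S_q(lag) versus log lag."""
    S = [np.mean(np.abs(x[l:] - x[:-l]) ** q) for l in lags]
    return np.polyfit(np.log(lags), np.log(S), 1)[0]

rng = np.random.default_rng(4)
x = np.cumsum(rng.standard_normal(2 ** 16))    # additive, self-similar: zeta(q) = q/2
lags = np.array([1, 2, 4, 8, 16, 32, 64])
z1, z2 = zeta(x, 1.0, lags), zeta(x, 2.0, lags)   # expect ~0.5 and ~1.0
```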
A fairly large body of observational evidence shows that hydraulic conductivity (K) is extremely variable over a large range of scales. Its correct mathematical representation has been the object of intensive research due to its applications in hydrology and in chemical and petroleum engineering. We choose a stochastic multifractal framework, which not only generalizes previous scaling analyses of lnK as Levy-stable distributions, but also reproduces the scaling properties of both K and lnK, while resulting in finite values for some statistical moments of the K distributions. Using the borehole flowmeter measurements from the Macro-Dispersion Experiment (MADE) test site at the Columbus Air Force Base, MS, we find the horizontal and vertical scaling regimes of horizontal hydraulic conductivity to be different and well represented. This and the results of the multifractal analysis allow us to develop the Unified Multifractal Model of hydraulic conductivity, which incorporates heavy-tailed K distributions, captures the observed anisotropic scaling, and reproduces it in generated hydraulic conductivity distributions (realizations). Finally, we discuss how multifractal representations of hydraulic conductivity could lead to non-classical Self-Organized Criticality. The latter could represent the somewhat ill-defined concept of preferential flow.
In active sonar systems, submarines are detected by transmitting a waveform and looking for an echo in the subsequently recorded time series. Hindering detection is reverberation, which is a result of reflections from inhomogeneities in the water and irregularities in the ocean bottom and on the surface. It has been traditionally assumed that the received reverberation is composed of a multitude of point scatterers, resulting in Gaussian distributed data owing to the central limit theorem (CLT).
However, modern sonar systems are able to isolate the reflections so that only those coming from a small region contribute to the observed data at any given time. This can result in not enough point scatterers contributing to the output for the CLT to hold, resulting in non-Gaussian reverberation. The signal processing performed prior to detection involves forming the envelope of the bandpass time series, traditionally resulting in Rayleigh distributed data when the reverberation is zero-mean Gaussian distributed. This paper presents a unified model for non-Rayleigh reverberation consisting of the product between a square root Gamma random variable and a modulating random variable. This model has as sub-members the Rayleigh, K, Weibull, log-normal, and Rayleigh mixture distributions, an Edgeworth series expansion, a spherically invariant random vector (SIRV) based model, and the models developed by McDaniel and Crowther which are physics based models. The flexibility of the Rayleigh mixture will be demonstrated by showing how it can approximate physics based models such as the K-distribution and Crowther's model as well as several other common phenomenological models.
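The product model at the heart of this abstract, envelope = square root of a Gamma variate times a modulating variate, is simple to simulate. With a Rayleigh modulator the product is the classical K-distributed envelope; the shape parameter below is an illustrative choice of ours.

```python
# K-distributed envelope as sqrt(Gamma texture) times Rayleigh speckle.
import numpy as np

rng = np.random.default_rng(8)
n, shape = 100_000, 1.5
g = rng.gamma(shape, 1.0, n)                     # Gamma texture (modulating power)
speckle = np.hypot(rng.standard_normal(n),
                   rng.standard_normal(n))       # Rayleigh envelope of Gaussian reverberation
env = np.sqrt(g) * speckle                       # K-distributed: heavier tail than Rayleigh
m2 = np.mean(env ** 2)                           # E[env^2] = shape * 2 = 3 here
```

Swapping the modulating variate (constant, lognormal, discrete mixture) recovers the other sub-members listed in the abstract.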
In this talk, we consider the problem of blind channel identification with α-stable input. Accurate estimation of moving average (MA) α-stable noise parameters is important in characterizing many communication channels and in the design of optimal detectors (Nikias). We present a new technique for blind identification of channels with finite impulse response (FIR). The proposed method exploits the properties of the multivariate empirical characteristic function (MECF). Statistical properties of the MECF are investigated and computational issues are discussed first. Then, we examine the performance of the proposed scheme through Monte-Carlo simulations and compare it to that of other available methods.
In our method, we obtain the channel impulse response of length q by first estimating the MECF at appropriate q-tuples and then solving a set of relatively simple nonlinear equations. The proposed method shows improved performance compared to the α-spectrum method and bears some similarities to the normalized cumulant matching method of Swami. Our identification method is general enough to be used for innovations with distributions other than α-stable, provided they are conveniently described in terms of the characteristic function.
The main difficulty in maximum likelihood (ML) estimation of the parameters of an alpha-stable distribution is the lack of a closed-form expression for the probability density function (pdf). Based on the fact that the central order statistics of any random variable are asymptotically normal, with mean and variance that are known functions of the order and of the original pdf, we suggest parameter estimation procedures based on order statistics. Simple asymptotic estimators are constructed for the parameters of a stable distribution. Estimators for the location parameter, the scale parameter and the characteristic exponent are considered, based on selected order statistics. The appropriate Cramer-Rao bounds (CRB) and maximum likelihood estimators are derived and analyzed.
Let x_{1}, …, x_{L}, x_{L+1}, …, x_{2L}, …, x_{(K-1)L+1}, …, x_{N} be samples of an i.i.d. sequence of SαS distributed random variables, x_{i} ~ SαS(μ, σ). Instead of the original N samples, we use K = N/L samples, each of which is the qL-th sample in the ordered k-th non-overlapping subsequence of the original N-dimensional sequence. The resulting sequence, whose elements are denoted by {z_{i}}_{i=1}^{K}, is i.i.d. Under mild regularity conditions, for any 0 < q < 1 the sequence has an asymptotic Gaussian distribution with mean η_{z} = F^{-1}(q) and variance σ_{z}^{2} = q(1-q)/(L·f^{2}(F^{-1}(q))), where f(·) is the original pdf and F(·) is the corresponding cumulative distribution function.
We suggest applying ML techniques for estimating the parameters of the SαS distribution to the sequence z_{1}, …, z_{K}, instead of to the original sequence x_{1}, …, x_{KL}. We develop the estimation procedure and analyze the asymptotic performance of the resulting estimates as a function of q via the corresponding CRB, and compare it to the CRB of the original problem. In particular, we show that while q = 0.5 is a very good choice for almost optimal estimation of the location parameter, for efficient estimation of the other two parameters q = 0.5 is the worst choice.
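The blocking scheme can be sketched in a few lines. With α = 1 (Cauchy) the location parameter equals the median, so the q = 0.5 block order statistics, which are asymptotically Gaussian, average to a location estimate; the sample sizes are illustrative choices of ours, and the full procedure would run ML on the z-sequence rather than a plain mean.

```python
# Block order statistics: split N samples into K blocks of length L and
# take the qL-th order statistic of each block; average for location.
import numpy as np

rng = np.random.default_rng(5)
mu, N, L = 2.0, 100_000, 100
x = mu + rng.standard_cauchy(N)          # SaS with alpha = 1, location mu
K, q = N // L, 0.5
blocks = np.sort(x.reshape(K, L), axis=1)
z = blocks[:, int(q * L)]                # approximately Gaussian for large L
mu_hat = z.mean()                        # location estimate, near mu = 2
```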
The increasing demand for reliable, high-speed data transmission over the local loop has instigated fresh studies into the nature and statistics of impulse noise (IN). Impulse noise is known to be the most significant factor limiting successful transmission. We examine the interarrival statistics of IN events and demonstrate that, up to a threshold u, events follow a Pareto distribution. This is consistent with Mandelbrot's observations in the 1960s of self-similar error clusters in communication systems, for which the Pareto distribution was originally proposed. For events in excess of u a heavy-tailed distribution is observed. Such excesses fit a Generalised Pareto Distribution. The threshold u is determined from mean and median excess plots. The overall approach to the heavy tail is that of exceedance rather than extreme value analysis.
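The peaks-over-threshold step described above can be sketched with scipy. In this toy version the threshold is taken as a fixed high quantile rather than read off mean and median excess plots, and the synthetic data are Pareto, for which the excess distribution is exactly GPD with shape ξ = 0.5; all choices are illustrative.

```python
# Fit a Generalised Pareto Distribution to exceedances over a threshold u.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(6)
data = rng.pareto(2.0, 50_000)           # Pareto tail with index 2 -> GPD shape xi = 0.5
u = np.quantile(data, 0.95)              # threshold (a mean excess plot would refine this)
exc = data[data > u] - u                 # excesses over the threshold
xi, loc, scale = genpareto.fit(exc, floc=0.0)   # MLE with location pinned at 0
```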
It has now been demonstrated in many studies that network traffic exhibits properties consistent with Long Range Dependence (LRD) and self-similarity. While theoretical frameworks are currently being developed to estimate the performance of such systems, simulation will remain a valuable tool for validating these theoretical models, and providing insight into systems which are too complicated to effectively model. Furthermore, when testing real systems, it is desirable to have traffic sources which are realistic, and hence display self-similarity.
The Fractal Renewal Process (FRP) and its variants (including On/Off processes and superpositions thereof) have been proposed as models for LRD processes, in particular for network traffic. The FRP is a simple renewal point process with heavy-tailed inter-renewal times. The long-range correlations in the process are directly introduced by the heavy tail of the renewal times.
The FRP has the great advantage that the number of computations required to generate a time series is linear and the time series can be generated on-line, facilitating generation of real traffic.
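The linear-time, on-line generation can be sketched with inverse-transform sampling of Pareto inter-renewal times; all parameter values here are illustrative, not those of any particular traffic study.

```python
import random

def frp_renewal_times(n, alpha, seed=1):
    """Event times of a fractal renewal process: i.i.d. Pareto(alpha)
    inter-renewal times (minimum value 1) accumulated into epochs.
    Cost is linear in n, and points can be emitted on-line."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n):
        u = 1.0 - rng.random()            # uniform on (0, 1]
        t += u ** (-1.0 / alpha)          # Pareto; heavy-tailed for alpha < 2
        times.append(t)
    return times

events = frp_renewal_times(1000, alpha=1.2)
```

Because each new event requires only one uniform draw and one power, the stream can be produced in real time, which is the advantage noted above.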
However, some problems arise when using such processes to generate LRD traffic. Most notably, undersampling of the heavy-tailed random variables used to generate FRPs can lead to a truncation of the sampled autocorrelation that is not consistent with LRD.
This problem becomes clear when the processes are investigated using the wavelet based methods of Abry and Veitch which segregate behaviour at different scales. This paper will describe the problem of undersampling, and its effects, and methods for avoiding the problem.
The optimal detector for a known signal with unknown amplitude, observed in non-Gaussian noise, is non-linear. When the detector is constrained to be linear, the form of the optimal linear filter (OLF) is known. We study the problem of optimal signal set design if the OLF is to be used. The performance of the OLF is compared with that of detectors based on typical ZMNL pre-processing. We apply ZMNL pre-processing ideas to typical parameter estimation problems, such as time-delay estimation (wideband/narrowband), direction of arrival estimation, and signal copy, when the noise is impulsive (iid or linear). We summarize some of our earlier work on estimating the parameters of linear (perhaps mixed-phase, or even non-causal) stable processes by using self-normalized fourth-order moments. Finally, we propose some suboptimal techniques for estimating the parameters of a general stable process.
In this paper we consider the spherically symmetric distributions (SSDs), which include the normal, Student-t and stable distributions, for Value at Risk (VaR) models. We propose a multivariate model in order to capture the correlations among the returns of different financial assets. Different marginal distributions for the returns of individual financial assets can be obtained by choosing different prior densities for the corresponding mixing parameters arising from the scale mixture representation of the SSD density functions. This model also permits the information from financial data to control the tail-fatness of the predictive marginal distributions for the returns of the individual financial assets. Using a Bayesian approach, we can capture both the risk trader's subjective view of the financial markets and the objective market data in our VaR model. The calculation of VaR relies on sampling-based Markov chain Monte Carlo (MCMC) methodologies. Numerical results will be given for illustration.
The presence of time-changing variance (heteroskedasticity) in financial time-series is often cited as the cause of fat-tailedness in the unconditional distribution of the series. However, many researchers have found that, after allowing for heteroskedastic behaviour, the conditional distributions remain fat-tailed. Consequently, one approach adopted by applied econometricians has been to postulate a fat-tailed conditional distribution. In the multivariate context, very few such distributions offer tractable solutions which accurately capture multivariate deviations from normality. The approach taken in this paper is to model the multivariate dynamics of the conditional covariance matrix with a parsimonious regime-switching factor GARCH model. The factor loading matrix switches within a finite state-space according to the value of an unobserved Markov state variable. The conditional distribution of the process is then a mixture of multivariate normals. Fat tails are explicitly generated by the presence of structural breaks or changes of regime. We develop some theoretical properties of such models, and filters for the unobserved factor process and Markov chain, as well as efficient maximum likelihood estimation via the EM algorithm. Finally, we apply the techniques to daily changes in the term structure of interest rates (and possibly to the term structure of implied volatilities?).
L-moments (Hosking, J.R.Statist.Soc.B, 52 (1990), 105-124) are summary statistics of probability distributions and data samples, computed from linear combinations of the ordered data values. Like conventional moments, the first few sample L-moments of a data set give an indication of the shape of the distribution from which the sample was drawn, and an indication of possible families of distributions that might fit the data. However, L-moments have several advantages: in particular, population L-moments exist even when the variance or higher-order conventional moments are infinite, and sample L-moments are less affected than their conventional counterparts by the presence of outliers in the data sample.
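The first few sample L-moments can be computed from probability-weighted moments of the ordered sample, following Hosking's construction; the data in the example are illustrative.

```python
from math import comb

def sample_l_moments(data):
    """First four sample L-moments (l1..l4) via the probability-weighted
    moments b_r of the ordered sample: l1 = b0, l2 = 2*b1 - b0,
    l3 = 6*b2 - 6*b1 + b0, l4 = 20*b3 - 30*b2 + 12*b1 - b0."""
    x = sorted(data)
    n = len(x)
    b = [sum(comb(i, r) * xi for i, xi in enumerate(x)) / (n * comb(n - 1, r))
         for r in range(4)]
    return (b[0],
            2 * b[1] - b[0],
            6 * b[2] - 6 * b[1] + b[0],
            20 * b[3] - 30 * b[2] + 12 * b[1] - b[0])

# Illustrative data; l2 is a dispersion measure, l3 a skewness measure.
l1, l2, l3, l4 = sample_l_moments([1.2, 0.7, 3.5, 2.1, 9.9])
```

Note that only first moments of order statistics are involved, which is why the sample L-moments remain well defined and comparatively outlier-resistant when conventional higher moments do not exist.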
Many financial computations, such as option pricing and calculation of Value At Risk, require knowledge of the distribution of returns on financial instruments. It is generally acknowledged that the naive assumption that returns are Normally distributed is inadequate, but there is little agreement about what other distributions are appropriate. As an example of the use of L-moments with financial data, we analyse the distribution of daily returns on IBM stock and demonstrate the ability of L-moments to identify which heavy-tailed distributions are consistent with the data.
Because the family of generalized linear models has come to be fairly widely used, statisticians have become accustomed to regression models where the variance is not constant. However, these models do impose a fixed relationship between the mean and the variance. In contrast, the stable family is particularly interesting because the four different parameters modelling the shape of the distribution (location, scale, skewness, thickness of tails) can be specified independently of each other (Lambert and Lindsey, 1999). However, as the tail parameter a approaches 2, stable distributions tend to become symmetric whatever the value specified for the skewness parameter b. This is particularly disappointing when modelling skewed series of exchange rates, where the tail parameter often takes values around 1.8.
In this work, we show how it is possible to derive, from the stable distribution, a new family of distributions (that we propose to name the skewed stable distribution) that allows the introduction of skewness without any limitation while keeping the desired Paretian behaviour of the tails unchanged. The fit of these models will be compared to the adjustments provided by the stable and generalized versions of the Student distributions using the likelihood and diagnostic tools.
Finally, extensions of the ideas underlying GARCH models will be proposed to model sequences of extremes.
Stable laws are fitted to the distributions of parallel exchange rate fluctuations in African currency markets, with parameters estimated by maximum likelihood methods. Empirical evidence shows that stable models approximate the distributions of parallel exchange rates much better than their Gaussian counterparts: these distributions have heavy tails and infinite variance. The stable fits suggest long-run depreciation of these currencies against the US dollar.
The paper considers the distributions of returns on markets which are subject to speculative processes of the Diba-Grossman type, these being bilinear stochastic root processes describing the dynamics of prices. In particular, a Diba-Grossman process might degenerate to a random walk, resulting in a normal distribution of filtered returns; this is consistent with the standard efficient markets hypothesis. However, if a Diba-Grossman process is non-degenerate, then the distribution of returns is non-normal and the market may not be regarded as efficient. It has been asserted that the distribution of returns can be described by a symmetric stable distribution and that there is a relation between the characteristic exponent of the stable distribution ('alpha') and the degree of inefficiency of a Diba-Grossman process, measured by the variance of its stochastic root. In a number of Monte Carlo experiments in which Diba-Grossman processes are simulated, a relation between the 'alphas' (estimated by the McCulloch quantile method) and the degree of inefficiency has been found. This gives rise to an indirect method of evaluating market inefficiency: estimate 'alpha' for market returns and map it into the corresponding degree of inefficiency. Estimated response surfaces and exemplary empirical analyses for some emerging stock markets are given.
The Viterbi algorithm plays a fundamental role in the design of receivers for digital communication systems corrupted by Gaussian noise. This algorithm arises as the maximum likelihood sequence detector of the transmitted data symbols in several applications, including equalization for channels subject to intersymbol interference, multiuser communications, and the detection of convolutionally encoded data. Although the Viterbi algorithm has been extensively studied and applied to several problems in communications involving Gaussian noise, little work has been done on these same problems for the case when the channel noise is impulsive and, therefore, non-Gaussian in nature. In this paper, we derive a general algorithm for maximum likelihood sequence detection of convolutionally encoded data, when the channel is corrupted by additive i.i.d. non-Gaussian noise following an arbitrary (but known) distribution. We then focus on the special case of Laplacian noise, for which our algorithm is particularly elegant and simple to implement.
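The flavour of the Laplacian case can be conveyed in a toy sketch (not the paper's algorithm): under i.i.d. Laplacian noise the maximum likelihood branch metric is the absolute error rather than the squared error of the Gaussian case, so a Viterbi detector needs only its metric changed. The two-tap ISI channel and symbol alphabet below are hypothetical.

```python
def viterbi_laplacian(y, h=(1.0, 0.5), symbols=(-1.0, 1.0)):
    """ML sequence detection over a hypothetical two-tap ISI channel under
    i.i.d. Laplacian noise: branch metric |y_k - h0*x_k - h1*x_{k-1}|
    (absolute error), versus squared error for Gaussian noise."""
    h0, h1 = h
    # Trellis state = previous symbol; paths[state] = (metric, sequence).
    paths = {s: (0.0, []) for s in symbols}
    for yk in y:
        new = {}
        for xk in symbols:                      # candidate current symbol
            best = None
            for prev, (m, seq) in paths.items():
                metric = m + abs(yk - h0 * xk - h1 * prev)
                if best is None or metric < best[0]:
                    best = (metric, seq + [xk])
            new[xk] = best
        paths = new
    return min(paths.values())[1]               # path with smallest metric
```

The only Laplacian-specific line is the `abs(...)` metric; the trellis search itself is the standard Viterbi recursion.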
The boom in the Internet and the development of Broadband ISDN services and networks have led to a growing interest in the development of suitable network modelling and resource allocation schemes. An important topic in this area of research is teletraffic data analysis and modelling. This paper will present self-similar teletraffic models based on stable distributions. In addition, it will consider the impact the infinite-variance nature of such distributions has on resource allocation schemes using effective bandwidths. Finally, the queueing behaviour of a buffer, and how it is affected by the stable parameter a, will be presented.
Understanding and modeling the nature of network traffic is critical in designing buffer control to guarantee the required quality of service. It has been known that network traffic exhibits self-similar characteristics and contains bursts over a wide range of time scales. The Alternating Fractal Renewal Process (AFRP), or On/Off model, has been proposed to model high-speed data networks. According to this model, each user transmits (state 1) or stays idle (state 0), and the duration of each state follows a heavy-tailed distribution. Although the AFRP model provides theoretical justification for the self-similar nature of network traffic, its aggregated results are grounded on the fractional Brownian motion model, whose marginal distribution is normal. Thus, it cannot account for the impulsive nature of real network traffic. In this paper we propose an extended AFRP model for the bandwidth requirement of each user versus time. Each user transmits or stays idle, with durations that are heavy-tail distributed, but, unlike in the AFRP model, the bandwidth requirement during the transmission state follows a heavy-tail law. We provide proofs of the long-range dependence and heavy-tail properties of the proposed model, and present comparisons between real and synthesized traffic data.
For a random walk with negative drift we study the exceedance probability (ruin probability) of a high threshold. The steps of this walk (claim sizes) constitute a stationary ergodic stable process. We study how ruin occurs in this situation and evaluate the asymptotic behavior of the ruin probability for a large variety of stationary ergodic stable processes. Our findings show that the order of magnitude of the ruin probability varies significantly from one model to another. In particular, ruin becomes much more likely when the claim sizes exhibit long-range dependence. The proofs exploit large deviation techniques for sums of dependent stable random variables and the series representation of a stable process as a functional of a Poisson process.
Many engineering applications require extracting a signal from observations corrupted by noise that may be heavy-tailed. We assume that the observation noise is a Levy process while the signal is Gaussian, and derive a non-linear recursive filter that minimizes the mean-square error. A sub-optimal filter is proposed for numerical purposes, and simulations show that it outperforms the existing linear filter.
An analysis of exchange rates using XGPL/Xtremes.
A demo of a statistical computing environment (joint work with Michael Thomas) is given in the form of a case study. The data to be analyzed are the Yen/US dollar exchange rates from Dec 78 to Jan 91. We demonstrate and apply the POT (peaks-over-threshold) method to the log-returns of the exchange rates. This parametric approach enables an extrapolation of the empirical insight beyond the range of the data. An important application is the estimation of very low quantiles, which entails an estimation of the VaR (value at risk). The data analysis is done with the interactive, statistical software system Xtremes, which is included on CD-ROM in the book Statistical Analysis of Extreme Values published by Birkhauser. In addition, the distributional performance of both the Dekkers et al. moment estimator and the Hill estimator is compared using XGPL. The conclusion is that the Hill estimator should not be used in applications such as the preceding one. XGPL is conceived as a general graphical programming language in statistics, in which "flow charts" are executable programs. Extreme value procedures are provided by Xtremes, which acts as a server.
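For reference, the Hill estimator under comparison can be sketched in a few lines; it estimates the tail index gamma (tail exponent alpha = 1/gamma) from the k largest order statistics of a positive sample.

```python
from math import log

def hill_estimator(data, k):
    """Hill estimate of the tail index gamma = (1/k) * sum of
    log(X_(n-i+1) / X_(n-k)), i = 1..k, where X_(n-k) is the
    (k+1)-th largest observation; requires k < len(data)."""
    x = sorted(data)
    threshold = x[-k - 1]            # (k+1)-th largest value
    return sum(log(v / threshold) for v in x[-k:]) / k
```

The estimator's strong sensitivity to the choice of k is one reason a moment-type competitor is examined in the comparison above.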
A standard Value-at-Risk (VaR) model corresponds to stable market conditions and assumes a multivariate normal distribution for risk factors with known constant volatilities and correlations. However, actual risk factor distributions exhibit significant deviations from normality. Excess kurtosis, skewness, and volatility fluctuations are typical for many market variables. Fat-tailed and skewed distributions result in the underestimation of actual VaR by the standard model. The Stochastic Variance VaR (SV-VaR) model accounts for uncertainty and instability of the risk factor volatilities. The one-period exponential distribution for the stochastic variance is derived from the Maximum Entropy Principle. The model is extended to the multi-period Gamma SV model based on the gamma process for the stochastic variance, resulting in a Levy (K-Bessel) process for the risk factor. The derived volatility term structure better describes the empirical term structure of the risk factor kurtosis for short holding periods. The multifactor SV-VaR model incorporates correlations between risk factors, as well as correlations between risk factors and their volatilities. The developed calibration procedure provides an exact fit to the correlation structure of the risk factors and an accurate approximation of the fourth moments. A two-step Monte Carlo simulation procedure for the VaR calculation is proposed. Numerical results for equity, commodity, interest rate, and foreign exchange rate risk are presented.
Mandelbrot suggested that Weierstrass' nowhere differentiable function can be modified and randomized so as to approximate fractional Brownian motion. Our approach covers the convergence of processes of a more general type and allows us to consider different dependence structures in the above randomization.
We show that Weierstrass' function can also be modified and randomized in such a way as to provide a series approximation to the Harmonizable Fractional Stable Motion. The Harmonizable Fractional Stable Motion, which is a complex-valued, stable, self-similar process with stationary increments, is one of the many different extensions of fractional Brownian motion to the stable case.
Gaussian mixture densities have been popular for modeling non-Gaussian noise. The majority of non-Gaussian noise research has been restricted to iid observation sequences due to the difficulty of characterizing multidimensional pdfs. There have been very few studies of the ability of Gaussian mixture densities to model correlated non-Gaussian noise processes. In this paper, we initiate such a study and demonstrate that in many practical cases, Gaussian mixture densities with a small number of mixing terms can give good approximations to some non-Gaussian noise pdfs. A review of some general models for correlated non-Gaussian interference and noise is given. The focus is on three approaches. The first is the Gaussian mixture model approach. The second is an approach based on spherically invariant random vectors. The final approach involves the combination of linear filters and nonlinearities, generally in an ad hoc manner. The three approaches are compared, and the Gaussian mixture model is shown to be able to approximate models generated from the other approaches.
For symmetric alpha-stable (SaS) random variables with any characteristic exponent, alpha, from zero to two, the codifference is a measure of bivariate dependence. In the proposed paper, the properties of the codifference are exploited to examine bivariate dependence between complex, isotropic SaS random variables. The resulting method functions in a manner comparable to the covariation (alpha from one to two) and covariance (alpha of two). Besides including alpha from zero to two, the major advantage of the codifference approach is robustness to uncorrelated SaS noise added to both random variables. A disadvantage is a nonlinear relation with respect to the "amount of dependence", especially for alpha less than one. Potential application to radar sea clutter is examined.
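For real-valued variables, a sample codifference can be sketched from empirical characteristic functions; the complex isotropic case treated in the paper is analogous, and this sketch is illustrative only.

```python
import cmath

def ecf(z, theta):
    """Empirical characteristic function: average of exp(i*theta*z_k)."""
    return sum(cmath.exp(1j * theta * v) for v in z) / len(z)

def codifference(x, y):
    """Sample codifference
        tau(X, Y) = ln E e^{i(X-Y)} - ln E e^{iX} - ln E e^{-iY},
    which is zero when X and Y are independent and remains defined for
    every alpha in (0, 2], unlike the covariance (alpha = 2 only)."""
    d = [a - b for a, b in zip(x, y)]
    return (cmath.log(ecf(d, 1.0))
            - cmath.log(ecf(x, 1.0))
            - cmath.log(ecf(y, -1.0)))
```

Because only characteristic functions appear, no moment of the data is required, which is what makes the measure usable down to small alpha.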
Since the increments of hydraulic conductivity K or ln(K) often exhibit non-Gaussian distributions with fat tails [Painter and Paterson, 1994; Liu and Molz, 1997], Levy-stable distributions have been used to represent the increments of K or ln(K). Fractional Levy motion (fLm) models with the stable index a, scale parameter C and Hurst coefficient H extracted from sample data fail to reproduce the nature of aquifer heterogeneity, because the Levy-stable distribution tails are so fat that fLm models generate unrealistic distributions of K or ln(K). When fLm models are used to reproduce the ln(K) spatial distributions, all moments of the K spatial distributions are infinite, which may be physically unsound. To overcome this shortcoming, Painter [1996] introduced truncated Levy-stable distributions. In this short communication, we use the high-moments-based technique [Lau et al., 1990] to further examine whether the increments of K or ln(K) at the MADE site have stable distributions, and we also discuss effects of volume-averaged measurements [Lu et al., 1998]. Results show that the various theoretical moments of the Levy-stable distribution are significantly greater (several times to several orders of magnitude) than those calculated from measurements. Also, the empirical tail index a is greater than 2, which implies that the second moment is finite. The effort to simulate the high degree of variability and intermittency of hydraulic conductivities naturally leads one to consider multifractal models, such as the universal multifractal model [Schertzer and Lovejoy, 1987].
The intermittency of turbulence is one of the most challenging problems in many natural and industrial processes. Significant progress in its understanding has been achieved in recent years by relating it to the heavy-tailed pdf of the energy flux or wind shears; e.g., the exponent of the power law of such pdf tails for wind shears, q_{D} = 7 ± 1, has been estimated in various empirical atmospheric conditions. These heavy tails are a rather general outcome of stochastic multiplicative cascades. However, there is a gap between these phenomenological models of turbulence and the deterministic-like Navier-Stokes equations for the velocity field, which have symmetries distinct from scale invariance. To bridge this gap, we consider the Scaling Gyroscope Cascade (SGC) as an alternative to stochastic cascades. Indeed, this parameter-free model is defined by an infinite hierarchy of (deterministic) gyroscopes of smaller and smaller size and is obtained by partial truncation of the direct interactions of the Navier-Stokes equations. On the one hand, the multifractal exponents of SGC, estimated on very high Reynolds number simulations, are extremely close to those obtained on experimental atmospheric turbulence data. On the other hand, the relative simplicity of SGC opens the possibility of determining these exponents analytically.
Numerous empirical data from very diverse systems: financial (e.g., stock market, currency market), communication, biological (human heart) and fluid (leaky faucet) are well described by univariate stable non-normal distributions. Here we show that, remarkably, stable distributions also appear in quantum mechanics, our fundamental theory of nature. In particular, we show that, for the periodically delta-kicked plane-pendulum, Bohm's quantum force is stable. Based on our findings, we conjecture that this stable behaviour is generic, or universal, for Hamiltonian dynamical systems.
In this paper we deal with certain aspects of the dichotomous choice model. Our main purpose is to clarify the connections between some characteristics of the decision-making body and the probability of its making correct decisions. A group of experts is required to select one of two alternatives, of which exactly one is regarded as correct. The alternatives may be related to a wide variety of areas. A decision rule translates the individual opinions of the members into a group decision. A decision rule is optimal if it maximizes the probability that the group makes a correct choice for all possible combinations of opinions.
We study the situation where only partial information is available on the probability of each expert in the group choosing the right decision. Specifically, we assume the expertise levels to be independent generalized Pareto distributed random variables. Moreover, the ranking of the members of the team is (at least partly) known. Thus, one can follow rules based on this ranking. The extremes are the expert rule and the simple majority rule. We show that, similarly to the other previously studied cases, the expert rule is more likely to be optimal than the majority rule. The results are partly obtained theoretically and partly by simulation.
Numerical methods to estimate the CDF of the symmetric stable distribution now permit its use as the link function in binary regression analysis. In accounting, finance, economics, and other areas of research, probit and logit regressions explain binary outcomes as a function of a linear combination of explanatory variables plus an unobserved error term, distributed normally or by the extreme value distribution of type I. The uncontrolled, observational nature of the data in these studies, however, suggests that these models always omit relevant (correlated) variables; one cannot accept any of these models literally. A robust alternative, with coefficients less sensitive to outliers driven by omitted variables, is arctanit regression, which assumes Cauchy errors. This paper generalizes such heavy-tailed binary regression to employ the family of symmetric stable distributions parameterized by alpha, spanning from the light-tailed normal (alpha equal to two) through and beyond the heavy-tailed Cauchy (alpha equal to one). Logit regression is approximated by alpha of 1.8. Allowing the alpha parameter to be estimated, we may let the data suggest the best-fitting model and test the fit of alternatives along the spectrum.
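The endpoints of this family are easy to write down (general alpha requires a numerically computed stable CDF). A sketch of the Cauchy (arctanit) and normal (probit) links and the resulting likelihood; all names and data here are illustrative, not the paper's implementation.

```python
from math import atan, erf, log, pi, sqrt

def cauchit(eta):
    """Cauchy link (stable alpha = 1): P(y = 1 | eta)."""
    return 0.5 + atan(eta) / pi

def probit(eta):
    """Normal link (stable alpha = 2)."""
    return 0.5 * (1.0 + erf(eta / sqrt(2.0)))

def neg_log_lik(link, beta, xs, ys):
    """Negative log-likelihood of binary outcomes under a given link;
    a full stable fit would treat alpha as a free parameter and replace
    `link` with a numerically computed stable CDF."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = link(beta * x)
        total -= log(p) if y else log(1.0 - p)
    return total
```

Minimizing `neg_log_lik` over beta (and, in the general case, alpha) gives the fitted model; the heavy Cauchy tail is what damps the influence of outlying observations on the coefficients.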
This symmetric stable binary regression method is applied to explain 539 capital budget go-ahead/no-go decisions involving proposed environmental improvements that were considered by a large U.S. firm, such as reductions of toxic releases beyond regulatory requirements. The data are described best by a heavy-tailed error distribution with alpha approximately 0.71. By computing the decision weight of environmental impact factors, e.g. the number of pounds of air toxic releases that will be reduced by a given investment, relative to the weight put on investment cost in dollars, I infer implicit prices on environmental impacts.
Stable distributions have received great interest over the last few years in the signal processing community and have proved to be strong alternatives to the Gaussian distribution. Several works in the literature address the problem of estimating the parameters of stable distributions. However, most of these works consider only the special case of symmetric stable random variables with beta = 0. This is an important restriction, since most, rather than few, real-life signals are skewed; examples include financial time series, data traffic on computer networks, service times in a queue, hydrology data, meteorology data, geophysical signals, urban vehicle noise and relay switching noise on telephone lines. The amount of work on estimating the parameters of general (possibly skewed) stable distributions has been very limited, and the existing techniques are either computationally too expensive or their estimates have high variances. In this paper, we solve the general stable parameter estimation problem analytically, introducing three novel classes of estimators for the parameters of general stable distributions. These new classes of estimators are based on formulas we have developed for the fractional and negative order moments of skewed stable random variables, and generalise methods previously suggested for parameter estimation with symmetric stable distributions.
Of all known techniques for the general problem, only the characteristic function technique and the methods we have suggested yield closed form estimates for the parameters which may be efficiently computed. Simulation results show that at least one of our new estimators has better performance than the characteristic function technique over most of the parameter space. Furthermore our techniques require substantially less computation.
Stable distributions have been proposed to model various phenomena in physics, astronomy, engineering and finance. While such models have appealing theoretical properties, it is not clear how useful they are in practice. A major obstacle to testing such models has been the lack of closed formulas or accurate approximations for general stable densities and distribution functions.
We have developed software that makes it possible to use stable models in practice. The programs calculate densities, distribution functions and quantiles of stable distributions with a > 0.25 and any skewness. Maximum likelihood estimation of all stable parameters is feasible; several examples will be described. Various EDA techniques are used to assess goodness-of-fit.
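The programs rely on sophisticated numerics; as a much cruder illustration of the underlying task (not the software's actual algorithm), a standard symmetric stable density can be obtained by direct numerical inversion of the characteristic function.

```python
from math import cos, exp, pi

def sas_pdf(x, alpha, T=60.0, n=60000):
    """Density of a standard symmetric alpha-stable law by trapezoidal
    inversion of its characteristic function exp(-|t|^alpha):
        f(x) = (1/pi) * integral_0^inf exp(-t**alpha) * cos(x*t) dt.
    T and n are crude accuracy knobs, adequate for alpha not too small;
    production code needs far more care near small alpha and large |x|."""
    h = T / n
    s = 0.5 * (1.0 + exp(-(T ** alpha)) * cos(x * T))   # endpoint terms
    for i in range(1, n):
        t = i * h
        s += exp(-(t ** alpha)) * cos(x * t)
    return s * h / pi

# Sanity checks against closed forms: alpha = 1 is Cauchy,
# alpha = 2 is normal with variance 2.
```

The slow, oscillatory decay of the integrand for small alpha and large |x| is exactly why purpose-built routines, rather than naive quadrature like this, are needed in practice.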
Multivariate stable distributions are also of interest in applications. Techniques for approximating multivariate stable distributions, computing stable densities, simulation of stable random vectors, and estimating stable spectral measures from samples are given. Univariate EDA techniques are adapted to assess goodness-of-fit in the multivariate case.
We consider several common questions in the design of computer systems.
1. What is a good policy for migrating processes in a Network of Workstations environment?
2. In a distributed supercomputing server, what is an effective rule for assigning tasks to hosts?
3. What is a good scheduling policy for HTTP requests at a Web server?
For each problem, we show that the answer depends on the job size distribution. We show that the impact of the job size distribution is very great, affecting answers sometimes by orders of magnitude.
We present our own measurements showing that job size distributions are commonly heavy-tailed. In particular they have a Pareto distribution with alpha parameter around 1. We show how to incorporate heavy-tailed job size distributions into the design of computer systems. We find that the answers we obtain to the above questions when the job size distribution is heavy-tailed are different from common wisdom. Our analysis leads us to discover solutions to the above three questions which are novel and highly effective.
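A quick illustration, with hypothetical parameters, of why such tails change design answers: placing job sizes on a deterministic Pareto quantile grid shows that for alpha near 1 the largest 1% of jobs carries a large share of the total load.

```python
def pareto_quantile(q, alpha, xmin=1.0):
    """Inverse CDF of a Pareto(alpha) job-size distribution."""
    return xmin * (1.0 - q) ** (-1.0 / alpha)

def top_load_fraction(alpha, n=100_000, top=0.01):
    """Share of total work in the largest `top` fraction of jobs,
    with sizes placed on a deterministic Pareto quantile grid."""
    sizes = [pareto_quantile((i + 0.5) / n, alpha) for i in range(n)]
    k = int(n * top)
    return sum(sizes[-k:]) / sum(sizes)

heavy = top_load_fraction(alpha=1.1)   # tail like measured job sizes
light = top_load_fraction(alpha=3.0)   # much lighter tail, for contrast
```

With alpha near 1 most of the work sits in a handful of huge jobs, which is why policies for migration, task assignment and scheduling that are tuned for exponential-like sizes can be off by orders of magnitude.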
Algorithmic methods for solving hard computational problems, as they occur in, e.g., scheduling, VLSI design, software verification, and computational biology, often exhibit a large variability in performance. We study the probability distributions of the run time of such computational processes, and show that these distributions often exhibit heavy-tailed behavior. We will introduce a general strategy based on random restarts to improve the performance of such algorithmic methods. Using this strategy, the run time distributions are no longer heavy-tailed, and we obtain speedups of several orders of magnitude.
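The restart strategy can be sketched under an assumed Pareto run-time law with alpha < 1 (so the mean run time is infinite without restarts); all parameter values below are illustrative.

```python
import random

def run_time(rng, alpha=0.8):
    """Hypothetical heavy-tailed solver run time: Pareto(alpha) with
    alpha < 1, hence infinite mean if every run is allowed to finish."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def solve_with_restarts(rng, cutoff, alpha=0.8):
    """Abort any run exceeding `cutoff` and retry with fresh randomness;
    returns the total time spent until some run completes."""
    total = 0.0
    while True:
        t = run_time(rng, alpha)
        if t <= cutoff:
            return total + t
        total += cutoff                # time wasted on the aborted run

rng = random.Random(42)
times = [solve_with_restarts(rng, cutoff=10.0) for _ in range(1000)]
mean_time = sum(times) / len(times)    # finite and modest, unlike no-restart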
This talk describes parameter estimation for the univariate stable distribution and the multivariate sub-Gaussian symmetric stable distribution using a Monte Carlo EM algorithm. Unknown augmented vectors are employed in the construction of the joint posterior density of the parameters. Gibbs sampling enables the generation of these vectors from their respective conditional posterior distributions and thus facilitates the expectation step of the algorithm. The methodology is illustrated using simulated and real data. This approach is extended to carry out inference on autoregressive moving average (ARMA) and vector ARMA models with stable innovations.
Linear processes, and in particular autoregressive-moving average (ARMA) processes, play a central role in the analysis of time series in fields as diverse as finance and communications. We will consider estimation for linear processes under two different scenarios. In the first case, the process is AR (possibly non-causal) and the driving noise is stable. We will show that maximum likelihood estimation procedures can be implemented and are effective even in the non-causal case. In the second case, we will consider all-pass models. These models generate uncorrelated but non-independent processes in the non-Gaussian case. Model estimation and selection procedures for these processes will be described. The model fitting procedures under both scenarios will be applied to several data sets.
The attractiveness of stock market investment relative to safe investment in price-level indexed bonds depends critically on the magnitude of the mean return on stocks.
Symmetric stable maximum likelihood (SSML) generally provides estimates of mean real returns that are substantially higher than OLS estimates. If the symmetric stable assumption is justified, SSML estimates are unbiased, are robust to outliers, and have finite standard errors estimable from the information matrix. The OLS estimates, on the other hand, have an infinite variance stable distribution, and are overly sensitive to outliers.
However if, as proves to be the case, market returns are in fact skewed to the left, SSML is biased upward, since it is basically estimating the mode rather than the true mean. Skew stable ML then provides much lower estimates of the mean return. Because the distance from the mode to the mean depends critically on the estimated skewness parameter, the standard error of the mean is much larger than in the symmetric case.
Illustrative numerical estimates are provided, using the author's SSML algorithm and Nolan's STABLE program, on real CRSP value-weighted U.S. stock market returns.
Recent literature has trumpeted the claim that extreme value theory (EVT) holds promise for accurate estimation of extreme quantiles and tail probabilities of financial asset returns, and hence holds promise for advances in the management of extreme financial risks. Our view, based on a disinterested assessment of EVT from the vantage point of financial risk management, is that the recent optimism is partly appropriate but also partly exaggerated, and that at any rate much of the potential of EVT remains latent. We substantiate this claim by sketching a number of pitfalls associated with use of EVT techniques. More constructively, we show how certain of the pitfalls can be avoided, and we sketch a number of explicit research directions that will help the potential of EVT to be realized.
For stationary ergodic stochastic processes with finite second moment, we can define the autocorrelation function. Its sample counterpart provides a consistent estimate of the corresponding value at any lag. Given any empirical autocorrelation function ρ̂(k), we can fit a Gaussian autoregressive process of order p with autocorrelation ρ̂(k) for any |k| ≤ p. The autoregressive parameters for this process are given by the Yule-Walker equations.
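As a concrete illustration of this classical construction, here is a minimal sketch (function names are illustrative) that computes the empirical autocorrelation function and solves the Yule-Walker equations via the Levinson-Durbin recursion:

```python
def sample_acf(x, max_lag):
    """Empirical autocorrelation function at lags 0..max_lag (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    acf = [1.0]
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf

def yule_walker(acf, p):
    """AR(p) coefficients solving the Yule-Walker equations (Levinson-Durbin)."""
    phi = [acf[1]]
    for k in range(2, p + 1):
        num = acf[k] - sum(phi[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi[j] * acf[j + 1] for j in range(k - 1))
        kappa = num / den  # partial autocorrelation at lag k
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
    return phi
```

For the theoretical AR(1) autocorrelations 0.5^k, the fitted AR(2) reduces to the AR(1) with coefficient 0.5, as it should.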
It is not entirely clear that the sample acf is the appropriate measure of linear dependence in the infinite variance case. The theoretical autocorrelation no longer exists. In some cases the sample acf may converge to a random limit (even when the process is ergodic) which casts serious doubt on the reliability of statistical methods which use the sample acf.
For any stationary ergodic process with finite absolute mean we define (through a ratio of expectations) the autocovariation (a generalization of the covariation). We discuss the limiting behavior of the sample autocovariation function and give applications to time series modeling. In particular, the empirical autocovariation function has a deterministic limit whenever the process is stationary and ergodic, and we can use it to fit a stable autoregressive process via a generalization of the Yule-Walker equations.
Due to their distribution-freeness property, rank-based techniques provide competitive methods for testing for independence, especially in the case of infinite variance, such as Cauchy or α-stable distributions.
In short, a rank-based technique consists first in ranking the n observations X_{1},...,X_{n}, then in replacing the observations by a suitable function of the ranks R_{k}, k = 1,...,n. For instance, in the van der Waerden statistic, the observations X_{k} are replaced by the corresponding R_{k}/(n+1)-quantiles of the standard normal distribution.
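The van der Waerden transformation just described can be sketched in a few lines (a minimal illustration; ties are broken by position):

```python
from statistics import NormalDist

def van_der_waerden_scores(x):
    """Replace each observation by the R_k/(n+1) quantile of N(0,1)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):  # 1-based ranks
        ranks[i] = r
    nd = NormalDist()
    return [nd.inv_cdf(r / (n + 1)) for r in ranks]
```

The middle rank maps to the 0.5 quantile, i.e. a score of zero, and scores of symmetric ranks sum to zero.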
We concentrate our study on the specific problem of identifying the order of AR processes. Though order identification is not, in the strict sense, a testing problem, testing ideas are present at each step of all identification procedures. As we shall see, rank methods perform remarkably well in this context. Under non-Gaussian innovation densities, particularly heavy-tailed or asymmetric ones, or when outliers are present, the percentage of correct order identifications based on our aligned-rank methods (i.e., using ranks of the estimated residuals) is substantially higher than that resulting from traditional methods such as the partial correlogram or Lagrange multiplier tests.
In this paper we describe Markov chain Monte Carlo (MCMC) methods for inference in models where heavy-tailed noise sources are present, concentrating on the case where the noise sources may be expressed using a scale mixture of normals (SMiN) representation. In particular, we show how to perform exact inference in the presence of symmetric stable noise sources using a simple product decomposition of the symmetric stable law, although similar methods are routinely applicable also to Student-t noise and exponential power law distributions, for example. Our method for the symmetric stable distribution uses a novel combination of rejection sampling and asymptotic tail expansions to achieve very fast sampling from the mixing parameters of the SMiN representation. Interest in the use of MCMC for dealing with otherwise intractable problems of inference for stable law distributions has rapidly grown over recent years. We compare and contrast our work with the earlier work of Buckle (1995), who shows how to perform inference for the parameters of the stable law distribution, and methods recently developed by Tsionas (1999) and Ravishankar (1999) for parameter estimation in the presence of stable disturbances. Simulation examples are presented for parameter estimation in non-Gaussian audio signals, which have both innovations and observation components that may be modelled using stable laws.
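The scale-mixture-of-normals (SMiN) idea can be illustrated with the Student-t case mentioned above, where the mixing distribution is known in closed form; the symmetric stable case requires positive-stable mixing, as described in the talk. A minimal sketch (not the authors' sampler):

```python
import math
import random

def student_t_via_smin(df, rng):
    """Student-t draw via its scale-mixture-of-normals representation:
    with mixing variable V ~ chi^2_df, the conditional law is X | V ~ N(0, df / V)."""
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))  # chi^2_df from df squared normals
    return rng.gauss(0.0, 1.0) * math.sqrt(df / v)
```

Conditionally on the mixing scale the draw is Gaussian, which is exactly what makes Gibbs-type schemes tractable for such noise models.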
Consider a regression model Y_{i} = α + ∑_{j} β_{j} x_{ij} + u_{i} (i = 1, …, n) where the u_{i}'s are independent and have regularly varying right tail probabilities. In some situations (for example, environmental monitoring), we are interested in the extreme quantiles of Y_{i} given the covariates (x_{i1}, …, x_{ip}); these so-called regression quantiles can be estimated by minimizing an L_{1}-like objective function where greater weight is placed on positive residuals than on negative residuals. We will consider the asymptotic properties (including limiting distributions) of these estimators for the regression quantiles of order (1 − δ/n) where δ > 0 is fixed as n → ∞. In particular, it is possible to obtain a consistent estimator of a given β_{j} if the corresponding x_{ij}'s have unbounded support as n → ∞; this result is potentially useful in the detection of time trends. We will also consider the case where the u_{i}'s have exponential tails.
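The asymmetric-L1 objective described above can be sketched as follows (the function names and the simple one-covariate setup are illustrative):

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss: weight tau on positive residuals, 1 - tau on negative ones."""
    return tau * residual if residual >= 0.0 else (tau - 1.0) * residual

def objective(a, b, x, y, tau):
    """Asymmetric-L1 objective for a simple regression y_i = a + b * x_i + u_i;
    its minimizer estimates the tau-th regression quantile."""
    return sum(pinball_loss(yi - a - b * xi, tau) for xi, yi in zip(x, y))
```

For tau = 0.5 the objective reduces to (half) the L1 criterion, whose intercept-only minimizer is the sample median; tau near 1 corresponds to the extreme quantiles of interest here.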
Asymmetric Laplace laws form a subclass of geometric stable distributions, the limiting laws in the random summation scheme with a geometric number of terms. Among geometric stable laws they play a role analogous to that of the normal distribution among stable Paretian laws. However, with steeper peaks and heavier tails than the normal distribution, asymmetric Laplace laws reflect properties of empirical financial data sets much better than the normal model. Despite heavier than normal tails, they have finite moments of any order. In addition, explicit analytical forms of their one-dimensional densities and convenient computational forms of their multivariate densities make estimation procedures practical and relatively easy to implement. Thus, asymmetric Laplace laws provide an interesting, efficient and user-friendly alternative to normal and stable Paretian distributions for modeling financial data. We present an overview of the theory of asymmetric Laplace laws and their applications in modeling currency exchange rates.
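The explicit one-dimensional density mentioned above can be sketched as follows; the (location, skewness, scale) parametrization used here is one common choice, assumed for illustration and not necessarily the speakers':

```python
import math

def asym_laplace_pdf(x, theta=0.0, kappa=1.0, sigma=1.0):
    """Asymmetric Laplace density with location theta, skewness kappa > 0 and
    scale sigma > 0 (one common parametrization; kappa = 1 gives the symmetric
    Laplace law). Exponential decay rates differ on the two sides of theta."""
    c = math.sqrt(2.0) / sigma * kappa / (1.0 + kappa ** 2)
    if x >= theta:
        return c * math.exp(-math.sqrt(2.0) * kappa / sigma * (x - theta))
    return c * math.exp(-math.sqrt(2.0) / (sigma * kappa) * (theta - x))
```

The density has a sharp peak at theta and integrates to one for any kappa, with the lighter/heavier exponential tails controlling the skewness.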
Given a sequence of iid random vectors in the Generalized Domain of Attraction of a Gaussian Law, we will give necessary and sufficient conditions for the sequence of partial sums to satisfy a law of the iterated logarithm. We show that under appropriate operator normalization, the cluster set is almost surely the closed unit ball.
Hill's estimator is a popular method for estimating the thickness of heavy tails. In this study we derive an extension of Hill's estimator which accounts for a possible shift. Because the shifted Hill's estimator is shift-invariant as well as scale-invariant, it provides a better estimate of the tail parameter for heavy-tailed distributions including stable laws. We include the results of a modest simulation study which demonstrates the effectiveness of this estimator on a variety of heavy tail distributions.
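A minimal sketch of the classical Hill estimator, together with the shift idea (in the talk's estimator the shift is presumably estimated from the data; here it is supplied, which is enough to illustrate the invariance):

```python
import math

def hill(data, k):
    """Classical Hill estimate of the tail index alpha from the k largest order statistics."""
    x = sorted(data, reverse=True)
    logs = [math.log(x[i]) for i in range(k + 1)]
    return k / sum(logs[i] - logs[k] for i in range(k))

def shifted_hill(data, k, shift):
    """Hill estimator applied after removing a known shift, illustrating shift-invariance."""
    return hill([v - shift for v in data], k)
```

On exact Pareto(alpha = 2) quantiles the estimate is close to 2, and subtracting a known shift from shifted data recovers the same value, whereas the plain Hill estimator applied to shifted data would not.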
Exceedances over high thresholds are often modelled by fitting a generalised Pareto distribution (GPD) to this part of the data. This is particularly useful in the presence of heavy tails. One difficult aspect is the selection of the threshold above which the GPD assumption is sufficiently sound; this is often done by repeatedly choosing the threshold below which data are discarded. We suggest a new mixture model, where one term of the mixture is the GPD and the other is a light-tailed density. The two components are combined by means of a continuous weight function that, in some way, takes over the role of automatic threshold selection. The full data set is used for inference. Maximum likelihood provides estimates, with approximate standard errors, for all parameters of the model, including those present in the weight function. Our approach has been successfully applied to simulated data and to the (already studied) Danish fire loss data set.
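For reference, the GPD component can be sketched as below (a minimal implementation of the standard GPD density and distribution function for exceedances x ≥ 0; the mixture's weight function is specific to the talk and not reproduced here):

```python
import math

def gpd_pdf(x, xi, sigma):
    """Generalised Pareto density on x >= 0; xi > 0 gives a heavy (Pareto-type) tail."""
    if x < 0:
        return 0.0
    if abs(xi) < 1e-12:  # xi = 0: exponential limit
        return math.exp(-x / sigma) / sigma
    z = 1.0 + xi * x / sigma
    if z <= 0:
        return 0.0
    return z ** (-1.0 / xi - 1.0) / sigma

def gpd_cdf(x, xi, sigma):
    """Generalised Pareto distribution function on x >= 0."""
    if x <= 0:
        return 0.0
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-x / sigma)
    z = 1.0 + xi * x / sigma
    if z <= 0:
        return 1.0
    return 1.0 - z ** (-1.0 / xi)
```

The xi = 0 branch is the exponential limit; for xi > 0 the survival function decays like a power law, which is what makes the GPD suitable for heavy-tailed exceedances.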
In some applications, the population characteristics of main interest are found in the tails of the distribution function. The study of the risk of extreme events leads to the use of probability distributions, and of scenarios that correspond to the tails of these distributions.
The research considers two approaches, parametric and nonparametric, and emphasizes the assessment of distribution tails under the assumption that the underlying distributions are heavy-tailed.
Three heavy-tailed distributions are considered: truncated Weibull, generalized Pareto and lognormal. Maximum likelihood estimation, applied both to the complete sample and to only the upper order statistics, provides estimators of the parameters. Measures of bias and mean squared error of these estimators, and the conditional mean exceedance functions of the distributions, will be produced.
This methodology has potential applications in quality control, monitoring of residual discharges, medical applications, design of environmental policies, and calibration and adjustment of processes and equipment.
Conditions and classes of examples of variance mixtures of normals are given, along with a constructive proof showing how to guarantee that a finite variance mixture of normals is uniformly close (up to a desired tolerance level) to a given infinite variance mixture distribution.
We wish to minimize the finite number of terms needed subject to the desired tolerance level. The number of terms needed for this approximation depends on the desired tolerance level and the mixing measure p. The mixing measure may be continuous; however, a discrete version p^{*} of p is used in the approximation process as a means of simplifying the infinite mixture. The method, which is based on discretizing the mixing measure, is presented and illustrated through an example, and the infinite and finite mixtures are displayed on the same graph. The results are extended to multidimensional variance mixtures of normals.
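A minimal numerical sketch of the discretization step (the exponential mixing density and the grid are illustrative choices, not the authors'):

```python
import math

def normal_pdf(x, var):
    """N(0, var) density."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def discretize_mixing(pi_density, grid):
    """Discrete approximation p* of a continuous mixing density on a grid
    of variances, with weights renormalized to sum to one."""
    w = [pi_density(v) for v in grid]
    s = sum(w)
    return [wi / s for wi in w]

def finite_mixture_pdf(x, variances, weights):
    """Density of the finite variance mixture of normals built from p*."""
    return sum(w * normal_pdf(x, v) for v, w in zip(variances, weights))
```

With Exp(1) mixing on the variance, the mixture density at zero equals E[1/sqrt(2*pi*V)] = 1/sqrt(2), and a fine grid reproduces that value closely, illustrating the uniform-closeness idea.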
Scale mixtures of uniform distributions are used to model non-normal, heavy-tailed data in both univariate and multivariate settings. In addition to providing greater flexibility in modelling, the use of scale mixtures also results in straightforward computational strategies, particularly in Bayesian analysis where Monte Carlo methods are used. We exemplify the models via several illustrations.
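One concrete instance of the scale-mixture-of-uniforms idea (a known representation, used here purely for illustration): the standard normal arises from uniform conditionals whose squared half-width is gamma distributed.

```python
import math
import random

def normal_via_uniform_mixture(rng):
    """Draw N(0,1) as a scale mixture of uniforms:
    V ~ Gamma(shape 3/2, scale 2), then X | V ~ Uniform(-sqrt(V), sqrt(V))."""
    v = rng.gammavariate(1.5, 2.0)
    return rng.uniform(-math.sqrt(v), math.sqrt(v))
```

Conditionally on the scale the draw is uniform, which is the computational convenience the abstract refers to; heavier-tailed mixing measures give heavier-tailed marginals.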
Bayesian (exact credibility) estimation of the net premium plays a central role in the actuarial business. The linear credibility approach is also applied in the restrictive Pareto model for which the Hill estimator is the MLE (e.g., Hesselager (1993) and Schnieper (1993)). One is confronted with the usual deficiencies of this model (Reiss and Thomas, Statistical Analysis of Extreme Values, Birkhauser, 1997). Bayesian parameter estimation within a suitably parametrized full Pareto model was explored in a recent paper by Reiss and Thomas (1998). The present paper deals with the exact credibility estimation of the net premium within models of Poisson-Pareto processes.
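As background, the generic linear credibility premium has the Bühlmann form sketched below; this is a textbook illustration of linear credibility, not the talk's exact Poisson-Pareto construction.

```python
def credibility_premium(claims, prior_mean, ev_process_var, var_hyp_means):
    """Buehlmann linear credibility premium: Z * (sample mean) + (1 - Z) * (prior mean),
    with credibility factor Z = n / (n + k) and k = (expected process variance)
    / (variance of the hypothetical means)."""
    n = len(claims)
    k = ev_process_var / var_hyp_means
    z = n / (n + k)
    xbar = sum(claims) / n
    return z * xbar + (1.0 - z) * prior_mean
```

When the hypothetical means vary little (k large) the premium shrinks toward the prior mean; when the individual experience is informative (k small) it approaches the sample mean.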