Economics 765

This page is for the course entitled Models for Financial Economics. The course is offered in the spring, in the months of May and June.

Method of Evaluation:

There will be 4 assignments, plus or minus one, which will count for one-third of the mark for the course. The remaining two-thirds will be based on the final exam, which will be a take-home exam.

Class Notices:

The class meets on Tuesdays and Thursdays, from 10.00 to 13.00, in Leacock 541. The first class takes place on Thursday May 1st; the last on June 17th.

The class outline is here as a PDF file, and here as HTML. The information it contains is also given below.

The course is directed to students who wish to learn the mathematical techniques used in modern finance theory. The course will also include the basic theory of asset pricing, in particular, the pricing of derivative assets, such as options. If time permits, more elaborate models will also be discussed. The introductory material starts with measure theory, a topic not always treated in courses of mathematics for economists. Measure theory is however a necessary prerequisite for the sort of probability theory needed for financial applications. In particular, we will treat stochastic processes in continuous time, of which the simplest example is Brownian motion.

A brief list of the foundational topics we will treat is as follows.

Measure theory and the Lebesgue integral;
Probability based on Sigma-algebras and filtrations;
Conditional expectations;
Theory of martingales and arbitrage-free pricing;
Markov processes and stopping times;
Generalised probability density and the Radon-Nikodym theorem;
Brownian motion and Ito's stochastic calculus;
Stochastic differential equations;
Kolmogorov's backward and forward equations;
Girsanov's theorem.

On the more applied side, we will consider

Hedging a portfolio;
European and American options;
Arbitrage-free pricing;
Specific models, such as Black-Scholes, Cox-Ingersoll-Ross.

We will follow the two-volume set entitled Stochastic Calculus for Finance, by Steven Shreve, in the Springer Finance series. The first volume contains no sophisticated mathematics, but allows readers to develop valuable intuition by a detailed treatment of the so-called binomial model, the simplest of all models of derivative pricing. We will make use of many of the examples in that volume. The second volume is where most of the material for the course is to be found. It combines mathematical developments with some quite sophisticated financial models.

I have from time to time drawn attention to misprints and errors in Volume 2 of Shreve's book. I have located Shreve's own list of errata, which is in fact a lot more comprehensive than my own observations would have led me to think. Here is the PDF file containing these errata.

Log of material covered:

Our first class was on the 1st of May. We began with Volume 1 of Shreve's textbook, looking at the One-Period Binomial Asset Pricing Model. This model is simple, but it allows us to understand the essentials of more complicated asset-pricing models.

A fundamental principle is that of no arbitrage. If one wants to price a derivative security, that is, an asset the payoff of which is a deterministic function of the price of an underlying asset, it may be possible to imagine a hedging portfolio, which gives exactly the same payoff as the derivative security in every possible state of the world. If so, then the value of the hedging portfolio must be equal to that of the derivative security in order to avoid arbitrage opportunities.

Options can be written on the underlying asset, either call options or put options. The former are options to buy a unit of the underlying at the strike price, the latter to sell one unit. The idea of no-arbitrage pricing is that the return on the option can be replicated by a hedging portfolio, which contains only the two assets, the risky underlying asset and the risk-free asset.

In each discrete period in the binomial model, the market price of the underlying asset, which we may call the stock, can move to only two possible new values, one greater and one less than the price at the beginning of the period. If there are several periods, the value of the option is found by backward recurrence from the possible returns at maturity. In this case, we speak of the finite coin-toss model, where what happens in each period can be modelled as resulting from a coin toss giving heads (H) or tails (T).
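
The backward recurrence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the parameter values (initial price 4, up factor 2, down factor 1/2, interest rate 1/4, strike 5) are chosen for the example, and the recursion as written handles only payoffs that depend on the stock price at maturity.

```python
def binomial_price(payoff, s0, u, d, r, n_periods):
    """Price a European derivative in the n-period binomial model by backward recurrence."""
    if not d < 1.0 + r < u:
        raise ValueError("no arbitrage requires d < 1 + r < u")
    p_up = (1.0 + r - d) / (u - d)   # risk-neutral probability of heads (up move)
    p_down = 1.0 - p_up
    # Payoffs at maturity, indexed by the number k of heads among n_periods tosses.
    values = [payoff(s0 * u**k * d**(n_periods - k)) for k in range(n_periods + 1)]
    # Step back one period at a time, discounting the risk-neutral average.
    for n in range(n_periods, 0, -1):
        values = [(p_up * values[k + 1] + p_down * values[k]) / (1.0 + r)
                  for k in range(n)]
    return values[0]


def one_period_hedge(payoff, s0, u, d):
    """Units of stock held in the hedging portfolio of the one-period model."""
    return (payoff(s0 * u) - payoff(s0 * d)) / (s0 * u - s0 * d)


def call(s):
    """Payoff of a call option with strike 5 (illustrative value)."""
    return max(s - 5.0, 0.0)


print(binomial_price(call, 4.0, 2.0, 0.5, 0.25, 1))  # one period
print(binomial_price(call, 4.0, 2.0, 0.5, 0.25, 2))  # two periods
print(one_period_hedge(call, 4.0, 2.0, 0.5))
```

With these values the one-period price is 1.20, the two-period price is 1.76, and the one-period hedge holds half a unit of the stock.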

In Chapter 2 of Volume 1, finite probability spaces are considered. Of these, the one of most interest is the finite coin-toss model, which is just the multi-period binomial model. In this simple context, we can define a real-valued random variable as a mapping from the outcome space, denoted by Ω, to the real line or a subset of the real line. Subsequently, if a probability structure is defined on the outcome space, this induces a probability distribution for the random variable. Then, still with a finite probability space, we may define the expectation of a random variable. As an aside, we looked at Jensen's inequality.

On May 6, we started on Volume 2 of Shreve's textbook, from the beginning, and covered the first five sections of Chapter 1. An abstract probability space is a triple (Ω,𝓕,P), where Ω is the outcome space, 𝓕 is a σ-algebra the elements of which are subsets of Ω, and P is a probability measure. Random variables are mappings from Ω to somewhere else, in the simplest case the real line. Ω comes equipped with 𝓕, and the real line with its σ-algebra, the Borel σ-algebra. An essential property of a well-defined random variable is that inverse images of Borel sets must belong to 𝓕.

The probability measure on (Ω, 𝓕) can induce another measure on (R,𝓑), for a random variable X. This is the distribution measure μ(X). It characterises all the probabilistic properties of X. Another equivalent characterisation is given by the cumulative distribution function (CDF). Sometimes a density exists as well.

Combined with the distribution measure, a random variable may have an expectation. It can be defined in a quite abstract setting by a Lebesgue integral. In some cases the expectation may be infinite.

The relation between the Riemann and Lebesgue integrals is captured by a theorem which says that, for a bounded function on a closed bounded interval, the Riemann integral is defined if and only if the set of points at which the integrand fails to be continuous has Lebesgue measure zero. In that case it is equal to the Lebesgue integral.

Section 1.4 deals with convergence of a sequence of integrals and a sequence of expectations of random variables. For the latter, the kind of convergence that suits us best is almost-sure convergence. An example was given of a sequence of densities of normal random variables, with expectation zero and variance 1/n, and n → ∞. The integrals of these densities are all equal to 1, but the densities themselves converge to zero almost everywhere, and so the limit of the integrals differs from the integral of the limiting function.
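
The example of the normal densities can be checked numerically. The following is a rough sketch, with function names, the integration range, and the evaluation point x = 0.5 chosen only for illustration: each density integrates to 1, while its value at any fixed nonzero point eventually collapses to zero.

```python
import math


def normal_density(x, n):
    """Density at x of a normal variable with expectation 0 and variance 1/n."""
    return math.sqrt(n / (2.0 * math.pi)) * math.exp(-0.5 * n * x * x)


def integral_of_density(n, half_width=10.0, steps=100_000):
    """Midpoint-rule approximation of the integral over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    return h * sum(normal_density(-half_width + (i + 0.5) * h, n)
                   for i in range(steps))


for n in (1, 10, 100, 1000):
    # The integral stays at 1; the density at x = 0.5 tends to 0 as n grows.
    print(n, round(integral_of_density(n), 6), normal_density(0.5, n))
```

The limit of the integrals is therefore 1, whereas the integral of the almost-everywhere limit, the zero function, is 0.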

Two theorems allow us to arrive at the opposite conclusion, namely that the limit of the integrals of a sequence of Borel-measurable functions is equal to the integral of the limiting function. These theorems are the monotone convergence theorem and the dominated convergence theorem.

Section 1.5 gives various results that are well known, but need proof in the abstract context. The very valuable technique of proof laid out here is called the standard machine. It starts by trying to prove the desired result for an indicator function, and then, in three further steps, extends the result to non-negative simple functions, then to general non-negative Borel-measurable functions, and finally to Borel-measurable functions that can have positive and negative values.

We tackled the last substantive section of Chapter 1, section 1.6, on May 8. This section introduces some valuable concepts. Starting from some probability measure on a probability space, the measure can be changed by use of a random variable that is non-negative almost surely, and has expectation 1 under the first measure. This random variable is called the Radon-Nikodým derivative. If it is almost surely strictly positive, it defines a measure that is equivalent to the original one, by which is meant that the two measures agree on which sets have measure zero.

An important example is given whereby a normal variable Y that is equal to a standard normal variable X plus a constant θ ends up with the standard normal distribution, with expectation 0, under the new measure. The Radon-Nikodým derivative random variable is a deterministic function of the variable X. Section 1.6 concludes with the statement of the Radon-Nikodým theorem, which says that any two equivalent probability measures have a Radon-Nikodým derivative that is almost surely positive and has unit expectation.
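
The change of measure in this example can be illustrated by simulation. The sketch below assumes the Radon-Nikodým derivative Z = exp(-θX - θ²/2), which is the standard choice for shifting a normal mean; the function name, the value θ = 1, the sample size, and the seed are illustrative. Expectations under the new measure are computed as Z-weighted averages under the original one.

```python
import math
import random


def change_of_measure_check(theta=1.0, n_draws=200_000, seed=765):
    """Monte Carlo estimates of E[Z], E[Z Y] and E[Z Y^2] under the original measure."""
    rng = random.Random(seed)
    sum_z = sum_zy = sum_zy2 = 0.0
    for _ in range(n_draws):
        x = rng.gauss(0.0, 1.0)          # X is standard normal under P
        y = x + theta                    # Y has expectation theta under P
        z = math.exp(-theta * x - 0.5 * theta * theta)  # assumed derivative Z
        sum_z += z
        sum_zy += z * y
        sum_zy2 += z * y * y
    return sum_z / n_draws, sum_zy / n_draws, sum_zy2 / n_draws


mean_z, new_mean_y, new_second_moment_y = change_of_measure_check()
print(mean_z, new_mean_y, new_second_moment_y)
```

Up to simulation error, the three numbers should be close to 1, 0 and 1: Z has unit expectation, and under the new measure Y has expectation 0 and variance 1.
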

On to Chapter 2. The first section introduces the idea of a filtration, an increasing sequence, or collection in the continuous case, of σ-algebras. A σ-algebra is thought of as a repository of information, and the filtration models how information accumulates over time. A stochastic process is a collection of random variables indexed by a continuous variable, usually thought of as time. It is said to be adapted to a filtration if, for each time t, X(t) is measurable with respect to the σ-algebra 𝓕(t) of the filtration.

If X is a random variable, it generates a σ-algebra σ(X) as the set of inverse images under X of the Borel sets in the real line. X is measurable with respect to another σ-algebra 𝓖 if σ(X) is a sub-σ-algebra of 𝓖.

In section 2.2 the concept of independence is introduced. First we define independence of a pair of events (elements of the underlying σ-algebra). Next we can define independence of two σ-algebras, and can then say that two random variables are independent if the σ-algebras they generate are independent. These ideas can be readily extended to finite sequences of events, of σ-algebras, or of random variables.

Shreve compiles a list of necessary and sufficient conditions for independence, all of which involve factorising characterisations of the joint distribution of a set of random variables. We saw some examples of this involving multivariate normal vectors of random variables. Without independence, things rapidly get complicated. Even the joint density of a pair of jointly normal variables is rather a mess to write down. A sort of counter-example to the usual behaviour of multivariate normal variables is given by two standard normal variables with zero covariance that are nonetheless not independent.
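
One standard way to build such a pair, assumed here for illustration, is to take X standard normal and set Y = ZX, where Z is an independent fair coin taking the values +1 and -1. Both variables are standard normal and their covariance is zero, yet |Y| = |X| always, so they cannot be independent. The function name, sample size, and seed below are illustrative.

```python
import random


def uncorrelated_but_dependent(n_draws=100_000, seed=765):
    """Sample covariance of (X, Y) and a check that |X| = |Y| on every draw."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_draws):
        x = rng.gauss(0.0, 1.0)
        sign = 1.0 if rng.random() < 0.5 else -1.0   # fair coin, independent of X
        xs.append(x)
        ys.append(sign * x)
    mean_x = sum(xs) / n_draws
    mean_y = sum(ys) / n_draws
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n_draws
    same_abs = all(abs(x) == abs(y) for x, y in zip(xs, ys))
    return cov, same_abs


print(uncorrelated_but_dependent())
```

The sample covariance is close to zero, while the second component of the result confirms that knowing X pins down Y up to its sign.
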

We began with conditional expectations on May 13. If X is a random variable, we may define its expectation conditional on a σ-algebra 𝓖 ⊆ 𝓕, written E(X | 𝓖). This conditional expectation must satisfy two conditions: (i) it is a random variable that is 𝓖-measurable, and (ii) it satisfies the condition of partial averaging. With just these requirements, many things can be proved about conditional expectations. First, existence. Leaving aside questions of infinite expectations, existence can be proved using the Radon-Nikodým theorem. (Shreve gives the proof in an Appendix.) Next, uniqueness. A very simple argument shows that, if we have two different conditional expectations, they must be equal almost surely.

The main properties of conditional expectations are linearity, what Shreve calls taking out what is known, iterated conditioning, and the fact that the conditional expectation of a variable that is independent of the σ-algebra is just the unconditional expectation. Jensen's inequality can also be seen to hold with conditional expectations. Another property is that the expectation of a 𝓖-measurable random variable X conditional on 𝓖 is just X itself.

A more subtle result is what Shreve calls the Independence Lemma. Its statement and proof are found in R.L. Schilling and L. Partzsch (2012), An introduction to stochastic processes, De Gruyter, Berlin/Boston, Lemma A.3.

By use of conditional expectations, we define two sorts of stochastic process: martingales and Markov processes.

In Chapter 3, the symmetric random walk is introduced, as a discrete-time stochastic process. It has independent increments, and its variance accumulates at the rate of one unit per unit time. The property of quadratic variation is introduced, and although it is a path-dependent random variable, it is seen to be equal to the variance in this specific case.

The scaled symmetric random walk is the next step on the road to Brownian motion, the topic of this chapter. This is introduced as a continuous-time process, by filling in the times between the discrete times of the random walk by linear interpolation. The scaling is determined by an integer parameter n, which is allowed to tend to infinity. The properties of the limit are established: independent increments, variance accumulating at the rate of one unit per unit time, normality of the value of the process at each time t, and quadratic variation equal to variance.

Any process that has all the properties mentioned just above is called Brownian motion, or alternatively, a Wiener process. The properties are all properties of the distribution of the process, and so we can conclude only that the sequence of processes W⁽ⁿ⁾(t) converges in distribution.

The properties of Brownian motion can be established from its defining characteristics regardless of the construction used to obtain it. The finite-dimensional distributions are defined to be the joint distributions of a set of values of the stochastic process, for instance W(t₁), ..., W(t_m). The distributions are multivariate normal with covariance structure cov(W(t), W(s)) = min(t,s).

A Brownian motion generates a filtration 𝓕(t), such that W(t) is 𝓕(t)-measurable, and that, for s > t, the increment W(s)-W(t) is independent of 𝓕(t). This implies directly that Brownian motion is a martingale.

The approach that uses the scaled symmetric random walk to arrive at Brownian motion is not constructive: there is no limiting process, only a limiting distribution. This follows from the fact that the central-limit theorem provides only a limiting distribution, not a limiting variable. However, a construction that gives Brownian motion as the almost-sure limit of a sequence of stochastic processes can be found at this link.

Another limiting procedure, starting from the multi-period binomial model, leads to a variable that describes the distribution of the terminal stock price. It has a log-normal distribution, and it depends functionally on Brownian motion.

On May 15, we began with the first-order variation of a continuously differentiable function, and then the quadratic variation, which is zero for such functions. The fact that Brownian motion has non-zero quadratic variation distinguishes it from a continuously differentiable function. In fact, although Brownian motion is continuous, it is nowhere differentiable.

Regarding quadratic variation, we proved that [W,W](T) = T. This proof made use of a new sort of stochastic convergence, namely L² convergence. This shows that Brownian motion accumulates quadratic variation at a rate of one per unit time. The proof suggests the notation dW(t) dW(t) = dt, and some other calculations lead to the other two mnemonics of stochastic calculus, which are dW(t) dt = 0, and dt dt = 0.
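
The three mnemonics can be illustrated on a simulated path. This is only a sketch: the path is approximated by independent normal increments on a fine partition of [0, T], and the function name, the number of steps, and the seed are illustrative choices.

```python
import math
import random


def brownian_variations(T=1.0, n_steps=100_000, seed=765):
    """Sampled first variation, quadratic variation, and the two cross sums on [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    sd = math.sqrt(dt)
    first = quad = cross = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)   # Brownian increment over one subinterval
        first += abs(dw)          # first-order variation: grows without bound as n increases
        quad += dw * dw           # sum of dW dW, close to T
        cross += dw * dt          # sum of dW dt, close to 0
    dt_dt = n_steps * dt * dt     # sum of dt dt, equal to T^2 / n_steps
    return first, quad, cross, dt_dt


print(brownian_variations())
```

With T = 1 the second entry is close to 1, the last two entries are negligible, and the first entry is of the order of the square root of the number of steps, in line with the unbounded first-order variation of Brownian motion.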

A brief digression introduced the idea of realised volatility, making use of the stochastic process called geometric Brownian motion, which is how the movement of stock prices in time is modelled in the Black-Scholes-Merton (BSM) model. Note that it is σ that is the volatility, and so what Shreve derives as the realised volatility is in fact the square of the realised volatility.

Next was the demonstration that, relative to a filtration to which a Brownian motion is adapted, Brownian motion is a Markov process, which can be characterised by its transition density.

The first-passage time for Brownian motion was the next concept to be introduced. If the process starts at zero, the first-passage time to m > 0 is defined to be τ_m = min{t ≥ 0; W(t) = m}. Two different approaches were taken to finding the distribution of the random variable τ_m. The first makes use of the exponential martingale, and considers this martingale stopped at the stopping time τ_m. We find that the probability that τ_m is infinite is zero, but that its expectation is infinite. An expression is found for the MGF of τ_m.

The second approach relies on the reflection principle. This principle arises from the symmetry and the self-similarity of Brownian motion. It lets us compute the CDF of τ_m, and hence also its density and MGF.
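
For reference, the results of the two approaches can be written as follows for m > 0, in notation that is assumed here rather than quoted from the text (α > 0 is the argument of the Laplace transform, and f is the density of τ_m).

```latex
\begin{align*}
\mathbb{P}\{\tau_m \le t\} &= 2\,\mathbb{P}\{W(t) \ge m\}
  = \frac{2}{\sqrt{2\pi}} \int_{m/\sqrt{t}}^{\infty} e^{-y^2/2}\,dy,
  \qquad t > 0, \\
f_{\tau_m}(t) &= \frac{d}{dt}\,\mathbb{P}\{\tau_m \le t\}
  = \frac{m}{t\sqrt{2\pi t}}\; e^{-m^2/(2t)}, \\
\mathbb{E}\, e^{-\alpha \tau_m} &= e^{-m\sqrt{2\alpha}}, \qquad \alpha > 0 .
\end{align*}
```

Letting α tend to zero in the last line gives probability one that τ_m is finite, and differentiating with respect to α before taking that limit shows that the expectation of τ_m is infinite.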

In addition to the CDF of τ_m, the reflection principle lets us derive the joint distribution of W(t) and the maximum-to-date stochastic process M(t).

In Chapter 4, we are introduced to the stochastic calculus. We define the Itô integral, with Brownian motion as the integrator. The integrand Δ(t) must be adapted to a filtration 𝓕(t) to which the Brownian motion is also adapted. To begin with, Δ is a simple process, a cadlag piecewise constant process. Properties of the integral were shown: it is continuous as a stochastic process indexed by t, the upper limit of integration; it is linear in the integrand; it is 𝓕(t)-measurable; it is a martingale relative to the filtration.

On May 20, we resumed the study of Chapter 4, with Itô integrals of simple integrands, that is, cadlag piecewise constant processes adapted to the filtration 𝓕(t). Such integrals are martingales; they satisfy the Itô isometry and their quadratic variation can be computed as a Lebesgue integral of the square of the integrand.

The Itô-Doeblin formula is at the heart of stochastic calculus. It lets us compute the differential of a twice continuously differentiable function f of Brownian motion. The formula involves not only the first but also the second derivative of f. An Itô process is the sum of an Itô integral and a Lebesgue integral, where the integrands of both processes are adapted to a filtration for the Brownian motion that serves as the integrator of the Itô integral.
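
A classic illustration of the second-derivative term takes f(x) = x²/2, for which the Itô-Doeblin formula gives the integral of W dW over [0, T] as W(T)²/2 - T/2, rather than the W(T)²/2 of ordinary calculus. The sketch below approximates the Itô integral by a sum with the integrand evaluated at the left endpoint of each subinterval; the function name, step count, and seed are illustrative.

```python
import math
import random


def ito_integral_of_w(T=1.0, n_steps=50_000, seed=765):
    """Left-endpoint sum approximating the Ito integral of W dW over [0, T]."""
    rng = random.Random(seed)
    sd = math.sqrt(T / n_steps)
    w = 0.0
    integral = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        integral += w * dw        # integrand taken at the left endpoint (adapted)
        w += dw
    ito_value = 0.5 * w * w - 0.5 * T   # value given by the Ito-Doeblin formula
    ordinary_value = 0.5 * w * w        # value ordinary calculus would suggest
    return integral, ito_value, ordinary_value


print(ito_integral_of_w())
```

The first two numbers agree closely, and both differ from the third by about T/2, which is exactly the contribution of the quadratic variation.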

The definition of the integral with respect to an Itô process is a natural one, and is linear in the two integrals that constitute the Itô process. A little calculation demonstrates what can be seen immediately using the rules of the stochastic calculus, namely that the quadratic variation of the process defined by such an integral comes only from the Itô integral. This lets us extend the Itô-Doeblin formula formally to encompass these integrals.

There followed a set of examples, which involve stochastic differential equations (SDEs). The first was generalised geometric Brownian motion. For it, the solution of the SDE is given by Shreve, and it is then proved, using the Itô-Doeblin formula, that it is indeed the solution. The following important result is stated: Any model of an asset price process that is positive, has no jumps, and is driven by a single Brownian motion must be a generalised geometric Brownian motion.

Next came the special case of an Itô integral of a deterministic integrand. At each time t, the integral turns out to be normally distributed. This could be expected by analogy with multivariate scalar random variables, but can be proved formally by invoking the exponential martingale.

The next example is a genuine financial model: the Vasicek interest rate model. This model, too, is defined by an SDE. Again, Shreve provides the solution, and then shows that it is indeed the solution. We mentioned the fact that it is possible to solve the SDE directly, with no prior knowledge of the solution. The technique is to set the coefficient of Brownian motion in the SDE to zero, so as to get an ordinary differential equation (ODE). This can then (sometimes) be solved by conventional methods. The next step is to define the constant of integration of the ODE as a new variable which is not constant once the Brownian motion is reintroduced. It is sometimes possible, as in this case, to solve the SDE thus defined in a straightforward manner. Once that is done, we have an analytic solution for the interest rate model, and we see that it is normally distributed, mean-reverting, and has a long-term stationary solution with well-defined expectation and variance.
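
The properties of the solution can be checked by simulation. The sketch below assumes the SDE in the form dR = (α - βR) dt + σ dW, for which the solution has expectation e^{-βt} R(0) + (α/β)(1 - e^{-βt}) and variance σ²(1 - e^{-2βt})/(2β). It compares these with a simple Euler discretisation; the function names, parameter values, and seed are illustrative, and the Euler scheme is a stand-in for the exact solution, not Shreve's method.

```python
import math
import random


def vasicek_moments(r0, alpha, beta, sigma, t):
    """Expectation and variance of R(t) from the closed-form solution."""
    decay = math.exp(-beta * t)
    mean = decay * r0 + (alpha / beta) * (1.0 - decay)
    var = sigma ** 2 / (2.0 * beta) * (1.0 - decay ** 2)
    return mean, var


def vasicek_euler(r0, alpha, beta, sigma, t, n_steps=200, n_paths=4000, seed=765):
    """Sample mean and variance of R(t) from Euler-discretised paths."""
    rng = random.Random(seed)
    dt = t / n_steps
    sd = math.sqrt(dt)
    finals = []
    for _ in range(n_paths):
        r = r0
        for _ in range(n_steps):
            r += (alpha - beta * r) * dt + sigma * rng.gauss(0.0, sd)
        finals.append(r)
    mean = sum(finals) / n_paths
    var = sum((x - mean) ** 2 for x in finals) / (n_paths - 1)
    return mean, var


params = dict(r0=0.03, alpha=0.05, beta=0.5, sigma=0.02, t=2.0)
print(vasicek_moments(**params))
print(vasicek_euler(**params))
```

As t grows, the expectation tends to α/β and the variance to σ²/(2β), which are the moments of the long-term stationary solution; the mean reversion is visible in the way the effect of R(0) decays at rate β.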

The Cox-Ingersoll-Ross (CIR) interest rate model came next. It is very similar to the Vasicek model, but, by introducing the square root of the dependent variable in the coefficient of the differential of Brownian motion, it gives a model for which the interest rate cannot be negative. There is unfortunately no analytic solution, but the expectation and variance can be computed using the Itô-Doeblin formula.

The next section, 4.5, is devoted to the Black-Scholes-Merton (BSM) model. A stock price is modelled by geometric Brownian motion with constant drift and volatility. If the risk-free interest rate is also constant, one can construct a hedging portfolio for various options. We started with a European call option. In principle, this is just like what happens with the binomial asset-pricing model, but in continuous time.

It is convenient to discount all values, that of the stock price, of a portfolio with the stock and value in the money market, and of the call option, by a factor e^{-rt}. Then, if the hedging portfolio contains Δ(t) units of the stock at time t, the differential of the discounted value of the portfolio is just Δ(t) times that of the discounted stock price.

The BSM approach posits the existence of a deterministic function c(t,x) of the time t and a variable x, such that c(t,S(t)) is a stochastic process that gives the value of the option at time t when the stock price is S(t). The differential of this process is readily computed by Itô-Doeblin. A hedging portfolio is constructed using a process Δ that is adapted to the filtration for the Brownian motion. Then the no-arbitrage argument requires the differential of the discounted value of the hedging portfolio to be equal to the differential of the discounted option value process.

This leads to the continuous-time version of the delta-hedging rule, and to the BSM partial differential equation (PDE). Boundary conditions are needed for this PDE to have a unique solution, and these are provided by the properties of a European call.
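
As a check on the PDE and its solution, the sketch below codes the usual closed-form price of a European call and verifies by finite differences that it satisfies c_t + r x c_x + ½σ²x² c_xx = r c. The function names, the finite-difference step sizes, and the parameter values are illustrative assumptions, not taken from the text.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bsm_call(t, x, K, r, sigma, T):
    """Value c(t, x) of a European call with strike K and expiry T, for t < T."""
    tau = T - t
    d_plus = (math.log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    return x * norm_cdf(d_plus) - K * math.exp(-r * tau) * norm_cdf(d_minus)


def bsm_delta(t, x, K, r, sigma, T):
    """Delta-hedging rule: units of stock held, equal to the derivative c_x(t, x)."""
    tau = T - t
    d_plus = (math.log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d_plus)


def pde_residual(t, x, K, r, sigma, T, dt=1e-4, dx=1e-2):
    """c_t + r x c_x + 0.5 sigma^2 x^2 c_xx - r c, by central finite differences."""
    c = lambda tt, xx: bsm_call(tt, xx, K, r, sigma, T)
    c_t = (c(t + dt, x) - c(t - dt, x)) / (2.0 * dt)
    c_x = (c(t, x + dx) - c(t, x - dx)) / (2.0 * dx)
    c_xx = (c(t, x + dx) - 2.0 * c(t, x) + c(t, x - dx)) / dx ** 2
    return c_t + r * x * c_x + 0.5 * sigma ** 2 * x ** 2 * c_xx - r * c(t, x)


args = (0.0, 100.0, 100.0, 0.05, 0.2, 1.0)   # t, x, K, r, sigma, T
print(bsm_call(*args), bsm_delta(*args), pde_residual(*args))
```

For these values the call is worth about 10.45, the hedge holds about 0.64 units of the stock, and the PDE residual is zero up to finite-difference error.
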

We resumed where we left off at the end of the previous class on May 22. There is a somewhat silly section called The Greeks, whereby each of the partial derivatives in the BSM PDE is named after a letter of the Greek alphabet. The concept of Put-Call Parity (PCP) is introduced by considering a forward contract. It is seen that such a contract is equivalent to a long call option and a short put option with the same strike price. Since the (discounted) value of the forward contract is easy to compute, this gives a relation between the function that gives the value of a call and the function that gives the value of a put with the same strike price.
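
In symbols, and with notation assumed here for illustration (f for the value of the forward contract, c and p for the call and put values, K for the common strike price, T for the expiry date), the parity relation reads as follows.

```latex
\begin{align*}
x - K &= (x - K)^{+} - (K - x)^{+}, \\
f(t,x) &= x - e^{-r(T-t)} K, \\
f(t,x) &= c(t,x) - p(t,x)
\quad\Longrightarrow\quad
p(t,x) = c(t,x) - x + e^{-r(T-t)} K .
\end{align*}
```

The first line is the identity between the payoffs at expiry; the no-arbitrage argument then carries it back to every earlier time t.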

Section 4.6 deals with multivariable stochastic calculus. A new rule is demonstrated for two independent Brownian motions W_i and W_j. It is very simple: dW_i dW_j = 0. The two-dimensional Itô-Doeblin formula is straightforward in differential form, and only more notationally clumsy in integrated form. An important use of the differential form is the Itô product rule. Next came Lévy's theorem, which gives a set of sufficient conditions for a martingale to be Brownian motion. The theorem was presented in one dimension and then in two dimensions, in which form it can readily be extended to higher dimensions.

Section 4.7 deals with the Brownian Bridge. We skipped this section, as the material it contains is not used anywhere else in Shreve's book.

Chapter 5 is on Risk-Neutral Pricing, which is based on the ideas of No-Arbitrage. The first main result is Girsanov's Theorem. This derives a change of measure analogous to what we saw when first encountering Radon-Nikodým derivatives. Now we need a Radon-Nikodým process, which allows for straightforward evaluation of conditional expectations under the changed measure.

The first application of Girsanov's Theorem was to derive the risk-neutral measure for one stock that follows a geometric Brownian motion. The so-called market price of risk is defined in this context. By combining this with a portfolio that also includes a money-market account, one can construct a hedging portfolio for options based on the stock. In the case of a single stock, a portfolio can be defined by specifying Δ(t) for each time t, the number of units of the stock held at that time. Then the differential of the discounted value of such a portfolio is just Δ(t) times the differential of the discounted stock price.

Class started late on May 27, on account of absences and late arrivals.

We began with section 5.2.5, in which the solution to the Black-Scholes-Merton model is derived by use of the risk-neutral measure. It relies on the Markov property of geometric Brownian motion, but is just a rather complicated calculation of the risk-neutral expectation of the payoff of a call option.

The next mathematical result is the Martingale Representation Theorem, once more given first in one dimension. In one dimension, it says that a continuous martingale that is adapted to the filtration generated by a Brownian motion can be expressed as an Itô integral with some adapted integrand. The defect of the theorem is that it is not constructive, in the sense that all it claims is that the integrand exists, without saying how to construct it. In the problem of hedging with one stock, it shows the existence of a hedging strategy, but leaves us in the dark about how to compute it.

The topic of section 5.4 is the Fundamental Theorems of Asset Pricing. To begin with, both Girsanov's Theorem and the Martingale Representation Theorem are extended to multiple dimensions. This lets us construct a market model with many stocks, not just one. Here, we observe how correlated Brownian motions can be constructed.
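
One common way to construct two correlated Brownian motions, assumed here for illustration, starts from independent Brownian motions W₁ and W₂ and sets B₁ = W₁ and dB₂ = ρ dW₁ + √(1-ρ²) dW₂. Each of B₁ and B₂ then accumulates quadratic variation at rate one, and their cross variation accumulates at rate ρ. The function name, the value of ρ, the step count, and the seed are illustrative.

```python
import math
import random


def correlated_variations(rho=0.6, T=1.0, n_steps=100_000, seed=765):
    """Sampled [B1,B1](T), [B2,B2](T) and [B1,B2](T) for the construction above."""
    rng = random.Random(seed)
    sd = math.sqrt(T / n_steps)
    weight = math.sqrt(1.0 - rho * rho)
    qv1 = qv2 = cross = 0.0
    for _ in range(n_steps):
        dw1 = rng.gauss(0.0, sd)         # increments of independent
        dw2 = rng.gauss(0.0, sd)         # Brownian motions W1 and W2
        db1 = dw1
        db2 = rho * dw1 + weight * dw2   # correlated increment
        qv1 += db1 * db1
        qv2 += db2 * db2
        cross += db1 * db2
    return qv1, qv2, cross


print(correlated_variations())
```

With T = 1 the first two numbers are close to 1 and the third is close to ρ, matching the rules dB₁ dB₁ = dt, dB₂ dB₂ = dt, and dB₁ dB₂ = ρ dt.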

This leads on to a multidimensional market model, in which there are m stocks, driven by d Brownian motions, each stock price following a geometric Brownian motion. Problems arise if m ≠ d. Next, a formal definition was given of a risk-neutral measure. The market price of risk equations must be solved if a risk-neutral measure is to be found. If no solution exists, the model contains an arbitrage.

A formal definition of an arbitrage was given before stating and proving the first of the asset-pricing theorems. It deals with the existence of a risk-neutral measure, and states that if such a measure exists, arbitrage is impossible. For the second theorem, we need to define a complete model, which means a model in which all derivative securities, like options, can be hedged. The theorem concerns the uniqueness of the risk-neutral measure, and states that a model is complete if and only if the risk-neutral measure is unique.

The second theorem is trickier to prove than the first theorem. We had to prove that the ability to hedge an arbitrary derivative security implies the uniqueness of the risk-neutral measure, and also, conversely, that uniqueness implies that there is a unique solution to the market price of risk equations for any given derivative security, and that this solution provides a solution to the hedging equations. Shreve complicates matters, but the essence of the theorem is that there should be exactly as many independent sources of randomness as different stocks in the model. It boils down to having a full-rank coefficient matrix in the market price of risk equations.

Section 5.5 deals with dividend-paying stocks. Continuous payment is treated first, although it is less realistic than discrete payments at a finite set of times. In both cases, it is easy to modify the stock-price process to take account of dividends, and to make small adjustments to the BSM model for options on dividend-paying stocks. We spent only a little time on this, not going into algebraic details. The main point is that, if a share of stock pays a certain amount in dividends, the value of the share falls by exactly that amount -- we have a no-arbitrage argument for this.

Class on May 29 was again characterised by more absences than presences.

The last section of Chapter 5 before the Summary, Notes, and Exercises is entitled Forwards and Futures. The short section on forward contracts and the forward price is not too bad, if not very intuitive. The longer section on futures contracts and the futures price is, to my mind, incomprehensible and full of errors. For that reason, I have prepared a note with better explanations, at least I hope they are better!

We next embarked on Chapter 6, on Connections with Partial Differential Equations. Definitions were given of stochastic differential equations (SDE), in general, and for the special case of a one-dimensional linear SDE.

We completed our study of Chapter 6 in Shreve on May 29. The examples given are either things we saw before or closely related to those things. Among them is a new interest rate model, the Hull-White model. It generalises the Vasicek model by letting the constants of that model be time-varying, although still non-random. The Hull-White model also leads to a closed-form solution, and also has the property that it allows the possibility of negative interest rates.

In Section 6.3, it is shown that solutions to SDEs are Markov processes. This is the property that gives various option-pricing formulas, like the BSM formula.

Then, in the next section, the Feynman-Kac theorem is proved. This theorem is what establishes the connection between SDEs and PDEs. It uses the Markov property of solutions to an SDE. The principle behind the proof of the theorem is threefold:
- Find the martingale;
- Compute the differential of the martingale; and
- Set the coefficient of the dt term in the differential to zero.
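
As a worked sketch of these three steps, suppose, in notation assumed here, that X solves dX(u) = β(u,X(u)) du + γ(u,X(u)) dW(u), that h is a Borel-measurable function, and that g(t,x) is the expectation of h(X(T)) given X(t) = x.

```latex
\begin{align*}
&\text{(1) Martingale:} && g(t, X(t)) = \mathbb{E}\bigl[\, h(X(T)) \,\big|\, \mathcal{F}(t) \bigr], \\
&\text{(2) Differential:} && dg(t, X(t)) = \Bigl[ g_t + \beta\, g_x + \tfrac{1}{2}\gamma^2 g_{xx} \Bigr] dt
   + \gamma\, g_x \, dW(t), \\
&\text{(3) Zero } dt \text{ term:} && g_t(t,x) + \beta(t,x)\, g_x(t,x) + \tfrac{1}{2}\gamma^2(t,x)\, g_{xx}(t,x) = 0,
   \qquad g(T,x) = h(x).
\end{align*}
```

Step (1) relies on the Markov property, step (2) on the Itô-Doeblin formula, and step (3) on the fact that a martingale can have no dt term in its differential.
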
The theorem has a discounted version, more useful to us for option pricing. Shreve applies it to rederive the BSM PDE. The next section of the chapter returns to interest rate models, with particular attention to (zero-coupon) bonds. With a time-varying interest rate, it is useful to define the yield on a bond over a fixed period of time. It is the constant interest rate that would generate the same return as the bond in the presence of time-varying interest rates.

Since the interest rate models we have discussed all depend on SDEs, they correspond to PDEs, of which the solutions give the value, or price, of a bond as a function of time and the interest rate at that time. For the Hull-White model, and for the Cox-Ingersoll-Ross model, the PDEs once derived can be solved, using a solution technique particular to affine-yield models, that is, those for which the yield is an affine function of the instantaneous interest rate.

The multidimensional Feynman-Kac theorem is the subject of the next section. The extension to this from the one-dimensional theorem is very straightforward. The multidimensional theorem is used to price an Asian option, the payoff of which depends on the whole trajectory of the price of the underlying asset rather than on just the price at the expiry of the option. In the context of BSM, the PDE used to get option prices is very similar to the BSM equation, but with an extra term.

We skipped Chapter 7, on exotic options, for now. If time permits, we will return to it.

On, then, to Chapter 8, on American derivative securities. Unlike European or Asian options, these can be exercised at any time up to the expiry of the option. In order to handle this additional feature, it is necessary to formalise the idea of a stopping time. The paramount example of this is a first-passage time, as discussed earlier, in Chapter 3. Although the definition of a stopping time is simple, the derivation of some of its properties is tricky. But the main use of stopping times is given by the Optional-sampling theorem, which says that a stopped martingale is still a martingale, and the analogous result is true for sub- and super-martingales.

The first example of an American option is the perpetual American put, where no finite expiry time is given. This assumption, although unrealistic, leads to an analytically tractable calculation of the optimal exercise time. It has to be a stopping time if it is to be implemented, and it turns out that it is a first-passage time: exercise when the price of the underlying asset falls below a given level. The optimal choice of this level can be determined quite simply.

The first technical result we need to determine the optimal exercise time, conditional on the assumption that it is a first-passage time, is to find the distribution of the first-passage time to some target of a Brownian motion combined with a drift. The most useful way to characterise this distribution is by the moment-generating function, which can also be interpreted as the Laplace transform of the density. The answer is found in a way very similar to the way in which the first-passage time was studied in the absence of drift, with a few necessary differences. One quite different result is that, if the drift goes in the direction away from the target, the first-passage time can be infinite with positive probability.
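
The result can be sketched as follows, in notation assumed here for illustration: W is a Brownian motion, X(t) = mu t + W(t) is the Brownian motion with drift mu, m > 0 is the target, and tau_m is the first time that X reaches m. The exponential is interpreted as zero when tau_m is infinite. Letting lambda tend to zero gives the probability that the target is ever reached.

```latex
\mathbb{E}\, e^{-\lambda \tau_m}
  = \exp\!\Bigl\{-m\bigl(-\mu + \sqrt{\mu^2 + 2\lambda}\,\bigr)\Bigr\},
  \qquad \lambda > 0,
\\[1ex]
\mathbb{P}\{\tau_m < \infty\} = e^{m\mu - m|\mu|}
  = \begin{cases} 1, & \mu \ge 0, \\ e^{-2m|\mu|}, & \mu < 0. \end{cases}
```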

The calculation of the value of the perpetual American put is done in the same setup as the BSM model, where the stock price follows a geometric Brownian motion with constant parameters. If the policy is to exercise the first time the stock price falls to a level L, one can work out the expected discounted payoff, under the risk-neutral measure, of this policy. The answer is of course a function of L.

The next step is thus to maximise this expected discounted payoff with respect to L. Shreve does this one way, and mentions that it can also be done by seeking a point of tangency of the graph of the value function, as a function of the initial stock price, and the payoff from immediate exercise. Both methods give the same result. This means that the problem of pricing the perpetual American put is solved, provided that the optimal exercise time is constrained to be a first-passage time for the stock price.
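
The maximisation can be checked numerically. The Python sketch below assumes the standard closed form for the expected discounted payoff of the policy "exercise when the stock price first falls to L", namely (K - L)(x/L)^(-2r/sigma^2) for an initial price x above L, together with the maximiser L* = 2rK/(2r + sigma^2). The function names and the grid search are illustrative choices made here, not Shreve's method.

```python
def put_value(x, L, K, r, sigma):
    """Expected discounted payoff, under the risk-neutral measure, of
    exercising the first time the stock price falls to L, starting at x."""
    if x <= L:
        return K - x  # already at or below the level: exercise at once
    return (K - L) * (x / L) ** (-2.0 * r / sigma**2)

def optimal_level(K, r, sigma):
    """Closed-form maximiser of put_value with respect to L."""
    return 2.0 * r * K / (2.0 * r + sigma**2)

def grid_argmax(x, K, r, sigma, n=10_000):
    """Brute-force maximiser over the grid K/n, 2K/n, ..., (n-1)K/n."""
    levels = [K * i / n for i in range(1, n)]
    return max(levels, key=lambda L: put_value(x, L, K, r, sigma))
```

With K = 100, r = 0.05 and sigma = 0.3, the grid search from x = 100 should land within one grid step of optimal_level(100, 0.05, 0.3). The tangency condition can be checked in the same way: the slope of the value function just above L* is close to -1, the slope of the payoff from immediate exercise.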

Using this explicit solution, it can be shown by direct calculation that the ordinary differential equation (ODE) for the value function, which is the analogue of the BSM partial differential equation for European options, is satisfied before the stopping time is reached, but not afterwards. Since the holder of the perpetual put may fail to use an optimal exercise policy, it is desirable to find out what becomes of the differential equation once the optimal stopping time has passed. The behaviour of the ODE both before and after the first-passage stopping time is described by the linear complementarity conditions.
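
The conditions just mentioned can be sketched as follows, in notation assumed here for illustration: v(x) is the value of the perpetual put when the stock price is x, K is the strike price, and L* is the optimal exercise level, at which v'' is not defined. For each x, at least one of the two inequalities holds with equality. Below L*, where v(x) = K - x, the left-hand side of the second inequality is equal to rK.

```latex
v(x) \ge (K - x)^+ \quad \text{for all } x \ge 0,
\\[1ex]
r\,v(x) - r\,x\,v'(x) - \tfrac12 \sigma^2 x^2\, v''(x) \ge 0
  \quad \text{for all } x \ge 0,\ x \ne L^* .
```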

Just as the BSM solution can be found either by solving a PDE or by using the fact that the discounted option price is a martingale under the risk-neutral measure, so here we can see that, for an American option, the discounted value is still a martingale until the optimal stopping time, but becomes a supermartingale thereafter. The process obtained by stopping it at the optimal stopping time is of course still a martingale. But, by the optional sampling theorem, the stopped supermartingale remains a supermartingale whatever stopping time is used. This fact lets us show that no other stopping time can give a higher value than the optimal first-passage time.

Can the perpetual American put be hedged? The hedging portfolio is defined by the delta-hedging rule before optimal exercise, and afterwards the owner of the hedging portfolio can maintain equality with the value of the option by consuming cash out of the portfolio at a rate rK.

Most of the ideas worked out for the perpetual put have analogues for the finite-expiration American put. The value function is now explicitly dependent on the time to expiry as well as the stock price. In the two-dimensional space of these two arguments, there is a curve that separates the continuation set, where the option should not be exercised, from the stopping set, where it should be exercised immediately. Although no closed-form expression is known for the separating curve, the rest of the analysis is as for the perpetual put. The optimal solution is determined by linear complementarity conditions, combined with what is now a partial differential equation, which involves the derivative of the value function with respect to the time.
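
Since there is no closed form here, a numerical approximation is needed in practice. The Python sketch below does not solve the PDE. It swaps in a different technique, a binomial tree of the kind treated in Shreve's first volume, with the Cox-Ross-Rubinstein choice of up and down factors, which is an assumption made here. At each node the value is the larger of the discounted continuation value and the payoff from immediate exercise, a discrete counterpart of the split into continuation and stopping sets. The function name is hypothetical.

```python
import math

def binomial_put(S0, K, r, sigma, T, steps, american=True):
    """Put price on a recombining binomial tree (CRR factors assumed).
    With american=False, early exercise is ignored (European put)."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))   # up factor
    d = 1.0 / u                           # down factor
    disc = math.exp(-r * dt)              # one-period discount factor
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    # Payoffs at expiry; node j has had j up-moves and steps - j down-moves.
    values = [max(K - S0 * u**j * d**(steps - j), 0.0)
              for j in range(steps + 1)]
    # Backward induction through the tree.
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            cont = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                exercise = max(K - S0 * u**j * d**(n - j), 0.0)
                values[j] = max(cont, exercise)  # stop or continue
            else:
                values[j] = cont
    return values[0]
```

For example, binomial_put(100.0, 100.0, 0.05, 0.3, 1.0, 200) can be compared with the same call with american=False. With a positive interest rate, the American value should be the larger of the two.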

The analysis of an American call option is much simpler, because, for it and for any option the payoff of which is a convex function of the stock price, it is readily shown that the payoff from immediate exercise is dominated by the payoff at maturity. Thus the American call is equivalent to the European call. That is, unless the underlying stock pays dividends. The case in which there is a fixed finite number of dividend payment dates is analysed, and it is seen that, if the dividend payment (which does not accrue to the option holder) is big enough, the subsequent loss in value of the stock may make it preferable to exercise just before the payment of the dividend. The computation of the value function of the American call based on a dividend-paying asset is a little complicated, but can be done by means of a backward recurrence, starting with the value at times between the last dividend payment date and maturity.

We now skip Chapters 9 and 10, and go directly to the final chapter, on jump processes. We began with the Poisson process, which can be analysed in terms of its construction from a set of independent exponentially distributed interarrival times.

The first task on June 5 was to derive the density of the gamma distribution, which is the distribution of a sum of IID exponential variables. This leads directly to the probability mass function of the Poisson process. It is seen that the Poisson process shares with Brownian motion the property of having stationary independent increments. The first two moments of Poisson increments were calculated, and it was seen that the expectation and variance are the same.

A Poisson process is evidently not a martingale, but a very easy construction yields the compensated Poisson process, which is.
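
The construction and the moment results can be illustrated by simulation. The Python sketch below builds N(t) from IID exponential interarrival times with mean 1/lam, where lam denotes the intensity, and estimates the mean and variance of N(t), both of which should be close to lam * t. It follows that the sample mean of the compensated process N(t) - lam * t should be close to zero. The function names, the seed and the sample size are choices made here for illustration.

```python
import random

def poisson_count(lam, t, rng):
    """N(t): the number of arrivals in [0, t], built from IID exponential
    interarrival times with mean 1/lam."""
    count = 0
    clock = rng.expovariate(lam)      # time of the first arrival
    while clock <= t:
        count += 1
        clock += rng.expovariate(lam) # add the next interarrival time
    return count

def sample_moments(lam, t, n_paths, seed=0):
    """Sample mean and (unbiased) sample variance of N(t)."""
    rng = random.Random(seed)
    draws = [poisson_count(lam, t, rng) for _ in range(n_paths)]
    mean = sum(draws) / n_paths
    var = sum((x - mean) ** 2 for x in draws) / (n_paths - 1)
    return mean, var
```

For instance, sample_moments(3.0, 2.0, 20000) should return two numbers near 6.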

The next topic was the compound Poisson process, which can be constructed on the basis of a single Poisson process and a distribution for the jumps, the magnitudes of which are IID variables with this distribution. At this stage of the argument, it became necessary to base all arguments about distributions on moment-generating functions, rather than CDFs or densities. It turned out to be possible to decompose a compound Poisson process if the distribution of the jumps is a finite discrete distribution. This lets a compound Poisson process be expressed as a linear combination of independent (single) Poisson processes with different intensities.
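
The decomposition can be read off from the moment-generating function. As a sketch, in notation assumed here for illustration: N is a Poisson process with intensity lambda, the jumps Y_i are IID with moment-generating function phi_Y, and Q is the compound Poisson process. If the jumps take the values y_1, ..., y_M with probabilities p(y_1), ..., p(y_M), the moment-generating function of Q(t) factorises into that of a linear combination of independent Poisson processes N_m with intensities lambda p(y_m).

```latex
Q(t) = \sum_{i=1}^{N(t)} Y_i, \qquad
\varphi_{Q(t)}(u) = \mathbb{E}\, e^{uQ(t)}
  = \exp\bigl\{\lambda t\,(\varphi_Y(u) - 1)\bigr\},
\\[1ex]
\varphi_{Q(t)}(u)
  = \prod_{m=1}^{M} \exp\bigl\{\lambda\, p(y_m)\, t\,(e^{u y_m} - 1)\bigr\}
\quad\Longrightarrow\quad
Q(t) \overset{d}{=} \sum_{m=1}^{M} y_m\, N_m(t).
```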

Jump processes can appear as both integrands and integrators. A jump process is the sum of a continuous part and a pure jump part. The continuous part consists of an Itô integral, which contributes quadratic variation, and a Riemann integral, which does not. The pure jump part is piecewise constant between jump times, and it does contribute to the overall quadratic variation of the jump process. If the integrator in a stochastic integral is a martingale, for instance a compensated Poisson process, a particular example shows that, even if the integrand is adapted, the integral need not be a martingale, contrary to what is the case with Itô integrals. In order to obtain the desired result that the integral is a martingale, the regularity condition needed is that the integrand be left-continuous, while the integrator is right-continuous.

The quadratic variation of a jump process, and also the cross variation of two jump processes, can be calculated. A crucial result is that the cross variation of the continuous part of a jump process and the pure jump part is zero. This result leads to a theorem according to which, if a Brownian motion and a Poisson process are both defined relative to the same filtration, they are independent.
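
These results can be sketched as follows, in notation assumed here for illustration: the jump process X_i has continuous part X_i^c, with Itô integrand Gamma_i, and pure jump part J_i, and Delta J_i(s) denotes the size of the jump of J_i at time s. The cross variation has no term that mixes a continuous part with a jump part.

```latex
[X_1, X_1](T) = \int_0^T \Gamma_1^2(s)\,ds
  + \sum_{0 < s \le T} \bigl(\Delta J_1(s)\bigr)^2,
\\[1ex]
[X_1, X_2](T) = \int_0^T \Gamma_1(s)\,\Gamma_2(s)\,ds
  + \sum_{0 < s \le T} \Delta J_1(s)\,\Delta J_2(s).
```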

The next step is to expand the rules of the stochastic calculus, in particular the Itô-Doeblin formula, to apply to jump processes. It is not always possible to express the formula in differential form, and so it is usually presented in integrated form. The extra term in this integrated form contributed by the jump part is just the sum of the jumps of the function of the jump process under study.
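
In integrated form, the formula can be sketched as follows, in notation assumed here for illustration: X is a jump process with continuous part X^c, f is a twice continuously differentiable function, and X(s-) denotes the value of X just before time s. The final sum is the extra term contributed by the jumps.

```latex
f\bigl(X(t)\bigr) = f\bigl(X(0)\bigr)
  + \int_0^t f'\bigl(X(s)\bigr)\,dX^c(s)
  + \frac12 \int_0^t f''\bigl(X(s)\bigr)\,dX^c(s)\,dX^c(s)
  + \sum_{0 < s \le t} \Bigl[ f\bigl(X(s)\bigr) - f\bigl(X(s-)\bigr) \Bigr].
```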

The geometric Poisson process is one case in which the formula can be expressed in differential form. This process turns out to be an exponential martingale driven by a jump process.
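
As a check on the martingale property, the Python sketch below assumes the usual form of the geometric Poisson process, S(t) = S(0) exp(N(t) log(1 + sigma) - lam * sigma * t), with sigma > -1 and N a Poisson process of intensity lam. It computes E[S(t)] by summing over the probability mass function of N(t), truncated at n_max terms, and the result should equal S(0). The function names and the truncation point are choices made here for illustration.

```python
import math

def poisson_pmf(n, mean):
    """P(N = n) for a Poisson variable with the given mean > 0,
    computed in logs to avoid overflow in the factorial."""
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))

def expected_geometric_poisson(S0, sigma, lam, t, n_max=200):
    """E[S(t)] for S(t) = S0 * exp(N(t) log(1 + sigma) - lam sigma t),
    summed over N(t) = 0, 1, ..., n_max (a truncated series)."""
    total = 0.0
    for n in range(n_max + 1):
        s_n = S0 * math.exp(n * math.log1p(sigma) - lam * sigma * t)
        total += poisson_pmf(n, lam * t) * s_n
    return total
```

The same value S(0) should be obtained for every t, whether sigma is positive (upward jumps) or negative (downward jumps).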

Assignments:

The first assignment, dated May 8, is available at this link. It is due on Thursday May 15. The best way to submit your assignment is to send it to me by email.
The second assignment, dated May 20, is available at this link. It is due on Tuesday May 27.
The third assignment, dated May 27, is available at this link. It is due on Tuesday June 3.
The fourth assignment, dated June 3, is available at this link. It is due on Tuesday June 10.

Ancillary readings:

Here is the note I prepared as an alternative to Shreve's discussion of forward and future contracts, in the hope that it would be more understandable.

I mentioned in class that there is a construction of Brownian motion by a sequence of stochastic processes that converge almost surely to the Brownian motion. The gory details of this, and a few other things, can be found at this link.

I found this review of Shreve's texts, written by Darrell Duffie of Stanford. In it, you will read how good a set of two texts these books are!

By chance I came across an article (in French) written by a Parisian probabilist on the "History of Martingales". It gives a fairly complete account of the numerous senses of the word "martingale", and explains the best modern theories as to why the word means what it does in Probability theory. The article is well written and amusing, as well as being scholarly. It can be found here as a PDF file.

The article found by following this link, by Jarrow and Protter, gives a history of the development of stochastic calculus and its application to mathematical finance. It includes the sad tale of Doeblin, and explains why a Frenchman had a German name.

In order to encourage the use of the Linux operating system, here is a link to an article by James MacKinnon, in which he gives valuable information about what software is appropriate for the various tasks econometricians wish to undertake.

To send me email, click here or write directly to Russell.Davidson@mcgill.ca.

URL: https://russell-davidson.research.mcgill.ca/e765/