Economics 765

This page is for the course entitled Models for Financial Economics. The course is offered in the spring, in the months of May and June.

Class Notices:

The final exam for the course is now available at this link. It is due before midnight on Saturday June 22nd, 2024.
My office is Leacock 321C.
The class meets on Tuesdays and Thursdays, from 10.00 to 13.00, in room 441 at 680 Sherbrooke Street West. We will not meet on Thursday May 30, because I have to leave for Toronto to attend the CEA conference.

The class outline is here as a PDF file, and here as HTML. The information it contains is also given below.

The course is directed to students who wish to learn the mathematical techniques used in modern finance theory. The course will also include the basic theory of asset pricing, in particular, the pricing of derivative assets, such as options. If time permits, more elaborate models will also be discussed. The introductory material starts with measure theory, a topic not always treated in courses of mathematics for economists. Measure theory is however a necessary prerequisite for the sort of probability theory needed for financial applications. In particular, we will treat stochastic processes in continuous time, of which the simplest example is Brownian motion.

A brief list of the foundational topics we will treat is as follows.

Measure theory and the Lebesgue integral;
Probability based on Sigma-algebras and filtrations;
Conditional expectations;
Theory of martingales and arbitrage-free pricing;
Markov processes and stopping times;
Generalised probability density and the Radon-Nikodym theorem;
Brownian motion and Ito's stochastic calculus;
Stochastic differential equations;
Kolmogorov's backward and forward equations;
Girsanov's theorem.

On the more applied side, we will consider

Hedging a portfolio;
European and American options;
Arbitrage-free pricing;
Specific models, such as Black-Scholes, Cox-Ingersoll-Ross.

We will follow the two-volume set entitled Stochastic Calculus for Finance, by Steven Shreve, in the Springer Finance series. The first volume contains no sophisticated mathematics, but allows readers to develop valuable intuition by a detailed treatment of the so-called binomial model, the simplest of all models of derivative pricing. We will make use of many of the examples in that volume. The second volume is where most of the material for the course is to be found. It combines mathematical developments with some quite sophisticated financial models.

I have from time to time drawn attention to misprints and errors in Volume 2 of Shreve's book. I have located Shreve's own list of errata, which is in fact a lot more comprehensive than my own observations would have led me to think. Here is the PDF file containing these errata.

Log of material covered:

Well, we covered nothing on May 2nd, because no one showed up for class. Let's hope we can do better next week.
On May 7th, we had the first actual class. We began with Volume 1 of Shreve's textbook, looking at the One-Period Binomial Asset Pricing Model. This model is simple, but it allows us to understand the essentials of more complicated asset-pricing models.

A fundamental principle is that of no arbitrage. If one wants to price a derivative security, that is, an asset the payoff of which is a deterministic function of the price of an underlying asset, it may be possible to imagine a hedging portfolio, which gives exactly the same payoff as the derivative security in every possible state of the world. If so, then the value of the hedging portfolio must be equal to that of the derivative security in order to avoid arbitrage opportunities.

Options can be written on the underlying asset, either call options or put options. The former are options to buy a unit of the underlying at the strike price, the latter to sell one unit. The idea of no-arbitrage pricing is that the return on the option can be replicated by a hedging portfolio, which contains only the two assets, the risky underlying asset and the risk-free asset.

In each discrete period in the binomial model, the market price of the underlying asset, which we may call the stock, can move to only two possible new values, one greater and one less than the price at the beginning of the period. If there are several periods, the value of the option is found by backward recursion from the possible payoffs at maturity. In this case, we speak of the finite coin-toss model, where what happens in each period can be modelled as resulting from a coin toss giving heads (H) or tails (T).
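
To make the backward recursion concrete, here is a minimal sketch in Python; it is not taken from Shreve's text. It assumes a path-independent payoff, so that the tree recombines and a node can be identified by its number of heads. The function names and the numerical values used afterwards are illustrative assumptions.

```python
from typing import Callable, List


def binomial_price(s0: float, u: float, d: float, r: float,
                   n_periods: int, payoff: Callable[[float], float]) -> float:
    """Time-zero price of a European derivative in the n-period model."""
    if not d < 1.0 + r < u:
        raise ValueError("no arbitrage requires d < 1 + r < u")
    p_up = (1.0 + r - d) / (u - d)      # risk-neutral probability of heads
    p_down = 1.0 - p_up
    # Payoffs at maturity, indexed by the number of heads k.
    values: List[float] = [
        payoff(s0 * u**k * d**(n_periods - k)) for k in range(n_periods + 1)
    ]
    # Step backwards one period at a time, discounting at the rate r.
    for step in range(n_periods, 0, -1):
        values = [
            (p_up * values[k + 1] + p_down * values[k]) / (1.0 + r)
            for k in range(step)
        ]
    return values[0]


def one_period_delta(s0: float, u: float, d: float,
                     payoff: Callable[[float], float]) -> float:
    """Units of stock held in the one-period replicating portfolio."""
    return (payoff(u * s0) - payoff(d * s0)) / (u * s0 - d * s0)


def call(strike: float) -> Callable[[float], float]:
    """Payoff function of a call option with the given strike."""
    return lambda s: max(s - strike, 0.0)
```

With the illustrative values S0 = 4, u = 2, d = 1/2, r = 1/4 and a strike of 5, `binomial_price(4.0, 2.0, 0.5, 0.25, 1, call(5.0))` gives 1.20, and `one_period_delta(4.0, 2.0, 0.5, call(5.0))` gives 0.5, so the hedge holds half a unit of the stock.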

In Chapter 2 of Volume 1, finite probability spaces are considered. Of these, the one of most interest is the finite coin-toss model, which is just the multi-period binomial model. In this simple context, we can define a real-valued random variable as a mapping from the outcome space, denoted by Ω, to the real line or a subset of the real line. Subsequently, if a probability structure is defined on the outcome space, this induces a probability distribution for the random variable.

Then, still with a finite probability space, we may define the expectation of a random variable, and, with a random sequence, we may define conditional expectations. As an aside, we looked at Jensen's inequality. This led on to the concept of a martingale, a formalisation of a fair game. A result that will have a parallel in continuous time is that the sequence of discounted values of an asset is a martingale under a risk-neutral measure.
We finished our study of Volume 1 on May 9th. The only topic from there that we considered was the definition of a Markov process. Prediction of the future of a Markov process depends only on the current state of the process - the history of the process contributes no further information.

In Volume 2, we began at the beginning, and covered most of the first three sections of the chapter, which deals with probability in the abstract. A probability space is a triple (Ω,𝓕,P), where Ω is the outcome space, 𝓕 is a σ-algebra the elements of which are subsets of Ω, and P is a probability measure. Random variables are mappings from Ω to somewhere else, in the simplest case the real line. Ω comes equipped with 𝓕, and the real line with its σ-algebra, the Borel σ-algebra. An essential property of a well-defined random variable is that inverse images of Borel sets must belong to 𝓕.

The probability measure on (Ω, 𝓕) can induce another measure on (R,𝓑), for a random variable X. This is the distribution measure μ(X). It characterises all the probabilistic properties of X. Another equivalent characterisation is given by the cumulative distribution function (CDF). Sometimes a density exists as well.

Combined with the distribution measure, a random variable may have an expectation. It can be defined in a quite abstract setting by a Lebesgue integral. In some cases the expectation may be infinite.

Section 1.4 deals with convergence of a sequence of integrals and a sequence of expectations of random variables. For the latter, the kind of convergence that suits us best is almost-sure convergence. An example was given of a sequence of densities of normal random variables, with expectation zero and variance 1/n, and n → ∞. The integrals of these densities are all equal to 1, but the densities themselves converge to zero almost everywhere, and so the limit of the integrals differs from the integral of the limiting function.
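
The example of the normal densities can be checked numerically. The following is only a sketch, assuming a simple midpoint rule on the interval [-10, 10] (an arbitrary truncation); the function names are hypothetical.

```python
import math
from typing import Callable


def normal_density(x: float, n: int) -> float:
    """Density at x of a normal variable with expectation 0 and variance 1/n."""
    return math.sqrt(n / (2.0 * math.pi)) * math.exp(-0.5 * n * x * x)


def integrate(f: Callable[[float], float], lo: float, hi: float,
              steps: int = 200_000) -> float:
    """Midpoint rule, adequate for these smooth integrands."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    for n in (1, 100, 10_000):
        total = integrate(lambda x: normal_density(x, n), -10.0, 10.0)
        # The integral stays at 1 while the value at x = 0.5 collapses to 0.
        print(n, total, normal_density(0.5, n))
```

The integral stays equal to 1 for every n, while the density at any fixed x other than 0 tends to zero, and the density at 0 grows without bound.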
On May 14, we began with something we had skipped over, namely the relation between the Riemann and Lebesgue integrals. The condition needed for them to be equal is that the set of points at which the integrand is discontinuous has Lebesgue measure zero.

We spent the rest of our time that day finishing Chapter 1 of Shreve. We had seen that the limit of integrals of a sequence of functions is not necessarily equal to the integral of the limiting function. Two theorems give conditions under which the two are equal: monotone convergence and dominated convergence.

Section 1.5 gives various results that are well known, but need proof in the abstract context. The very valuable technique of proof laid out here is called the standard machine. It starts by trying to prove the desired result for an indicator function, and then, in three further steps, extends the result to non-negative simple functions, then to general non-negative Borel-measurable functions, and finally to Borel-measurable functions that can have positive and negative values.

The last section, 1.6, of the chapter introduces some valuable concepts. Starting from some probability measure on a probability space, the measure can be changed by use of a random variable that is non-negative almost surely, and has expectation 1 under the first measure. This random variable is called the Radon-Nikodým derivative. If it is almost surely strictly positive, it defines a measure that is equivalent to the original one, by which is meant that the two measures agree on which sets have measure zero.

An important example is given whereby a normal variable Y that is equal to a standard normal variable X plus a constant θ ends up with the standard normal distribution, with expectation 0, under the new measure. The Radon-Nikodým derivative random variable is a deterministic function of the variable X. Section 1.6 concludes with the statement of the Radon-Nikodým theorem, which says that any two equivalent probability measures have a Radon-Nikodým derivative that is almost surely positive and has unit expectation.
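
For this example, the Radon-Nikodým derivative can be taken to be Z = exp(-θX - θ²/2). The sketch below checks numerically, by quadrature under the original measure, that Z has unit expectation and that Y = X + θ has expectation 0 and second moment 1 under the new measure. The value θ = 1.5, the truncation to [-12, 12], and the function names are assumptions made for the illustration.

```python
import math
from typing import Callable


def phi(x: float) -> float:
    """Standard normal density, the density of X under the original measure."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def radon_nikodym(x: float, theta: float) -> float:
    """Z as a deterministic function of the value x taken by X."""
    return math.exp(-theta * x - 0.5 * theta * theta)


def expect_under_p(g: Callable[[float], float], lo: float = -12.0,
                   hi: float = 12.0, steps: int = 100_000) -> float:
    """E[g(X)] under the original measure, by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += g(x) * phi(x)
    return h * total


if __name__ == "__main__":
    theta = 1.5
    # E[Z]: should be 1.
    print(expect_under_p(lambda x: radon_nikodym(x, theta)))
    # E[Z Y], the expectation of Y under the new measure: should be 0.
    print(expect_under_p(lambda x: radon_nikodym(x, theta) * (x + theta)))
```

Multiplying by Z inside an expectation under the original measure is exactly how an expectation under the new measure is computed.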
We were all together on May 16, and started on Chapter 2. The first section introduces the idea of a filtration, an increasing sequence, or collection in the continuous case, of σ-algebras. A σ-algebra is thought of as a repository of information, and the filtration models how information accumulates over time. A stochastic process is a collection of random variables indexed by a continuous variable, usually thought of as time. It is said to be adapted to a filtration if, for each time t, X(t) is measurable by the σ-algebra 𝓕(t) of the filtration.

If X is a random variable, it generates a σ-algebra σ(X) as the set of inverse images under X of the Borel sets in the real line. X is measurable by another σ-algebra G if σ(X) is a sub-algebra of G.

In section 2.2 the concept of independence is introduced. First we define independence of a pair of events (elements of the underlying σ-algebra). Next we can define independence of two σ-algebras, and can then say that two random variables are independent if the σ-algebras they generate are independent. These ideas can be readily extended to finite sequences of events, of σ-algebras, or of random variables.

Shreve compiles a list of necessary and sufficient conditions for independence, all of which involve factorising characterisations of the joint distribution of a set of random variables. A more subtle result is what Shreve calls the Independence Lemma. Its statement and proof are found in R.L. Schilling and L. Partzsch (2012), An introduction to stochastic processes, De Gruyter, Berlin/Boston, Lemma A.3. By use of conditional expectations, we define two sorts of stochastic process: martingales and Markov processes.

Then we discussed conditional expectations. If X is a random variable, we may define its expectation conditional on a σ-algebra 𝓖 ⊆ 𝓕, written E(X | 𝓖). This conditional expectation must satisfy two conditions: (i) it is a random variable that is measurable-𝓖, and (ii) it satisfies the condition of partial averaging. With just these requirements, many things can be proved about conditional expectations. First, existence. Leaving aside questions of infinite expectations, existence can be proved using the Radon-Nikodým theorem. (Shreve gives the proof in an Appendix.) Next, uniqueness. A very simple argument shows that, if we have two different conditional expectations, they must be equal almost surely.

The main properties of conditional expectations are linearity, what Shreve calls taking out what is known, iterated conditioning, and the fact that the conditional expectation of a variable that is independent of the σ-algebra is just the unconditional expectation. Jensen's inequality can also be seen to hold with conditional expectations.

In Chapter 3, the symmetric random walk is introduced, as a discrete-time stochastic process. It has independent increments, and its variance accumulates at the rate of one unit per unit time. The property of quadratic variation is introduced, and although it is a path-dependent random variable, it is seen to be equal to the variance in this specific case.

The scaled symmetric random walk is the next step on the road to Brownian motion, the topic of this chapter. This is introduced as a continuous-time process, by filling in the times between the discrete times of the random walk by linear interpolation. The scaling is determined by an integer parameter n, which is allowed to tend to infinity. The properties of the limit are established: independent increments, variance accumulating at the rate of one unit per unit time, normality of the value of the process at each time t, and quadratic variation equal to variance.
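
The scaled walk is easy to simulate at the grid times k/n. This sketch, with hypothetical function names and an arbitrary seed, also shows that the quadratic variation up to time t is equal to t on every path, since each of the nt steps contributes exactly 1/n.

```python
import math
import random


def scaled_walk(n: int, horizon: int, rng: random.Random) -> list:
    """Values of the scaled symmetric random walk at times k/n, k = 0..n*horizon."""
    step = 1.0 / math.sqrt(n)   # each coin toss moves the walk by 1/sqrt(n)
    path = [0.0]
    for _ in range(n * horizon):
        path.append(path[-1] + (step if rng.random() < 0.5 else -step))
    return path


def quadratic_variation(path: list) -> float:
    """Sum of squared increments along the sampled path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


if __name__ == "__main__":
    walk = scaled_walk(n=100, horizon=4, rng=random.Random(0))
    # The quadratic variation equals the horizon, whatever the coin tosses.
    print(quadratic_variation(walk))
```

Between grid times the process is defined by linear interpolation, which this sketch does not reproduce.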
The study of Brownian motion was continued on May 21. Any process that has all the properties mentioned just above is called Brownian motion, or alternatively, a Wiener process. The properties are all properties of the distribution of the process, and so we can conclude only that the sequence of processes W⁽ⁿ⁾(t) converges in distribution. The construction that uses the scaled symmetric random walk is not constructive: there is no limiting process, only a limiting distribution.

A construction that gives Brownian motion as the almost-sure limit of a sequence of stochastic processes can be found at this link.

The properties of Brownian motion can be established from its defining characteristics regardless of the construction used to obtain it. The finite-dimensional distributions are defined to be the joint distributions of a set of values of the stochastic process, for instance W(t₁), ..., W(t_m). The distributions are multivariate normal with covariance structure cov(W(t), W(s)) = min(t,s).

A Brownian motion generates a filtration 𝓕(t), such that W(t) is measurable-𝓕(t), and that, for s > t, the increment W(s)-W(t) is independent of 𝓕(t). This implies directly that Brownian motion is a martingale. We note that the fact that Brownian motion has non-zero quadratic variation distinguishes it from a continuously differentiable function, whose quadratic variation is zero. In fact, although Brownian motion is continuous, it is nowhere differentiable.

Regarding quadratic variation, we proved that [W,W](T) = T. This proof made use of a new sort of stochastic convergence, namely L² convergence. This shows that Brownian motion accumulates quadratic variation at a rate of one per unit time. The proof suggests the notation dW(t) dW(t) = dt, and some other calculations lead to the other two mnemonics of stochastic calculus, which are dW(t) dt = 0, and dt dt=0.
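
The L² argument can be summarised as follows. This is a sketch, in which the partition Π = {0 = t_0 < t_1 < ... < t_n = T}, its mesh ||Π||, and the sampled quadratic variation Q_Π are notation assumed here.

```latex
\begin{align*}
Q_\Pi &= \sum_{j=0}^{n-1} \bigl(W(t_{j+1}) - W(t_j)\bigr)^2, \\
\mathrm{E}\,Q_\Pi &= \sum_{j=0}^{n-1} (t_{j+1} - t_j) = T, \\
\mathrm{Var}\,Q_\Pi &= \sum_{j=0}^{n-1} 2\,(t_{j+1} - t_j)^2
  \le 2\,\|\Pi\|\,T \longrightarrow 0
  \quad\text{as } \|\Pi\| \to 0 .
\end{align*}
```

The variance formula uses the independence of the increments and the fact that the square of a normal variable with expectation zero and variance Δ has variance 2Δ². Since E(Q_Π - T)² = Var Q_Π, this is convergence of Q_Π to T in L².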

A brief digression introduced the idea of realised volatility, making use of the stochastic process called geometric Brownian motion, which is how the movement of stock prices in time is modelled in the Black-Scholes-Merton (BSM) model. Note that it is σ that is the volatility, and so what Shreve derives as the realised volatility is in fact the square of the realised volatility.

Next was the demonstration that, relative to a filtration to which a Brownian motion is adapted, Brownian motion is a Markov process, which can be characterised by its transition density.

The first-passage time for Brownian motion was the next concept to be introduced. If the process starts at zero, the first-passage time to m > 0 is defined to be τ_m = min{t ≥ 0; W(t) = m}. Two different approaches were taken to finding the distribution of the random variable τ_m. The first makes use of the exponential martingale, and considers this martingale stopped at the stopping time τ_m. We find that the probability that τ_m is infinite is zero, but that its expectation is infinite. An expression is found for the m.g.f. of τ_m.
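
In outline, the first approach runs as follows. The sketch uses σ > 0 for the parameter of the exponential martingale (not a volatility here) and α > 0 for the argument of the transform; both are notation assumed for the illustration.

```latex
\begin{align*}
Z(t) &= \exp\bigl\{\sigma W(t) - \tfrac12 \sigma^2 t\bigr\},
  \qquad \mathrm{E}\,Z(t \wedge \tau_m) = 1, \\
\mathrm{E}\bigl[e^{-\alpha \tau_m}\bigr] &= e^{-m\sqrt{2\alpha}},
  \qquad \alpha > 0,\ m > 0
  \quad (\text{take } \sigma = \sqrt{2\alpha} \text{ and let } t \to \infty), \\
\mathrm{P}(\tau_m < \infty) &= \lim_{\alpha \downarrow 0} e^{-m\sqrt{2\alpha}} = 1, \\
\mathrm{E}\bigl[\tau_m e^{-\alpha \tau_m}\bigr]
  &= \frac{m}{\sqrt{2\alpha}}\, e^{-m\sqrt{2\alpha}}
  \longrightarrow \infty \quad\text{as } \alpha \downarrow 0 .
\end{align*}
```

The last line, obtained by differentiating with respect to α, is what shows that the expectation of τ_m is infinite.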

The second approach relies on the reflection principle. This principle arises from the symmetry and the self-similarity of Brownian motion. It lets us compute the CDF of τ_m, and hence also its density and m.g.f.
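
The reflection principle gives P(τ_m ≤ t) = 2 P(W(t) ≥ m), which can be written with the complementary error function. Below is a minimal sketch using only the Python standard library; the function names are hypothetical.

```python
import math


def first_passage_cdf(m: float, t: float) -> float:
    """P(tau_m <= t) = 2 P(W(t) >= |m|) for a Brownian motion started at 0."""
    if t <= 0.0:
        return 0.0
    # 2 * (1 - N(|m| / sqrt(t))) is erfc(|m| / sqrt(2 t)).
    return math.erfc(abs(m) / math.sqrt(2.0 * t))


def first_passage_density(m: float, t: float) -> float:
    """Derivative of the CDF with respect to t, for t > 0."""
    return (abs(m) / (t * math.sqrt(2.0 * math.pi * t))
            * math.exp(-m * m / (2.0 * t)))


if __name__ == "__main__":
    print(first_passage_cdf(1.0, 1.0))       # about 0.3173
    print(first_passage_density(1.0, 1.0))   # about 0.2420
```

The CDF tends to 1 as t grows, in agreement with the finding that τ_m is finite almost surely, but it does so slowly enough that the expectation is infinite.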
We began class on May 23 by looking again at the reflection principle. In addition to the CDF of τ_m, it lets us derive the joint distribution of W(t) and the maximum-to-date stochastic process M(t).

In Chapter 4, we are introduced to the stochastic calculus. We define the Itô integral, with Brownian motion as the integrator. The integrand Δ(t) must be adapted to a filtration 𝓕(t) to which the Brownian motion is also adapted. To begin with, Δ is a simple process, a cadlag piecewise constant process. Properties of the integral were shown: it is continuous as a stochastic process indexed by t, the upper limit of integration; it is linear in the integrand; it is 𝓕(t) measurable; it is a martingale relative to the filtration. It satisfies the Itô isometry and its quadratic variation can be computed as a Lebesgue integral of the square of the integrand.

The Itô-Doeblin formula is at the heart of stochastic calculus. It lets us compute the differential of a sufficiently smooth function f of Brownian motion. The formula involves not only the first but also the second derivative of f, so that f must be twice differentiable. An Itô process is the sum of an Itô integral and a Lebesgue integral, where the integrands of both processes are adapted to a filtration for the Brownian motion that serves as the integrator of the Itô integral.

The definition of the integral with respect to an Itô process is a natural one, and is linear in the two integrals that constitute the Itô process. A little calculation demonstrates what can be seen immediately using the rules of the stochastic calculus, namely that the quadratic variation of the process defined by such an integral comes only from the Itô integral. This lets us extend the Itô-Doeblin formula formally to encompass these integrals.
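
For reference, the formulas in question can be written as follows. The symbols Δ(t) and Θ(t) for the two adapted integrands of the Itô process X are notation assumed here.

```latex
\begin{align*}
df\bigl(W(t)\bigr) &= f'\bigl(W(t)\bigr)\,dW(t)
  + \tfrac12 f''\bigl(W(t)\bigr)\,dt, \\
X(t) &= X(0) + \int_0^t \Delta(u)\,dW(u) + \int_0^t \Theta(u)\,du,
  \qquad dX(t)\,dX(t) = \Delta^2(t)\,dt, \\
df\bigl(t, X(t)\bigr) &= f_t\,dt + f_x\,dX(t) + \tfrac12 f_{xx}\,\Delta^2(t)\,dt .
\end{align*}
```

In the last line the partial derivatives of f are evaluated at (t, X(t)). The Lebesgue integral contributes to dX(t) but not to dX(t) dX(t), because dW(t) dt = 0 and dt dt = 0.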

There followed a set of examples, which involve stochastic differential equations (SDEs). The first was generalised geometric Brownian motion. For it, the solution of the SDE is given by Shreve, and it is then proved, using the Itô-Doeblin formula, that it is indeed the solution. The following important result is stated: Any model of an asset price process that is positive, has no jumps, and is driven by a single Brownian motion must be a generalised geometric Brownian motion.

Next came the special case of an Itô integral of a deterministic integrand. At each time t, the integral turns out to be normally distributed. This could be expected by analogy with multivariate scalar random variables, but can be proved formally by invoking the exponential martingale.

The next example is a genuine financial model: the Vasicek interest rate model. This model, too, is defined by an SDE. Again, Shreve provides the solution, and then shows that it is indeed the solution. We mentioned the fact that it is possible to solve the SDE directly, with no prior knowledge of the solution. The technique is to set the coefficient of Brownian motion in the SDE to zero, so as to get an ordinary differential equation (ODE). This can then (sometimes) be solved by conventional methods. The next step is to define the constant of integration of the ODE as a new variable which is not constant once the Brownian motion is reintroduced. It is sometimes possible, as in this case, to solve the SDE thus defined in a straightforward manner. Once that is done, we have an analytic solution for the interest rate model, and we see that it is mean-reverting, and has a long-term stationary solution with well-defined expectation and variance.
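
The direct solution can be sketched as follows, assuming the model is written as dR = (α - βR) dt + σ dW with positive constants α, β and σ. This parametrisation, and the name C for the constant of integration, are assumptions of the sketch.

```latex
\begin{align*}
\text{ODE } (\sigma = 0):\quad
  & \frac{dR}{dt} = \alpha - \beta R
  \;\Longrightarrow\; R(t) = \frac{\alpha}{\beta} + C e^{-\beta t}, \\
\text{let } C = C(t):\quad
  & dR(t) = e^{-\beta t}\,dC(t) - \beta C(t) e^{-\beta t}\,dt
          = e^{-\beta t}\,dC(t) + \bigl(\alpha - \beta R(t)\bigr)\,dt, \\
\text{match with the SDE}:\quad
  & e^{-\beta t}\,dC(t) = \sigma\,dW(t)
  \;\Longrightarrow\; C(t) = C(0) + \sigma \int_0^t e^{\beta s}\,dW(s), \\
\text{solution}:\quad
  & R(t) = e^{-\beta t} R(0) + \frac{\alpha}{\beta}\bigl(1 - e^{-\beta t}\bigr)
     + \sigma e^{-\beta t} \int_0^t e^{\beta s}\,dW(s), \\
\text{moments}:\quad
  & \mathrm{E}\,R(t) = e^{-\beta t} R(0) + \frac{\alpha}{\beta}\bigl(1 - e^{-\beta t}\bigr),
  \qquad \mathrm{Var}\,R(t) = \frac{\sigma^2}{2\beta}\bigl(1 - e^{-2\beta t}\bigr).
\end{align*}
```

The variance comes from the Itô isometry applied to the deterministic integrand. As t tends to infinity, the expectation tends to α/β and the variance to σ²/(2β), which characterise the stationary solution.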
The Cox-Ingersoll-Ross (CIR) interest rate model had been briefly alluded to in the previous class, and was taken up again on May 28. It is very similar to the Vasicek model, but, by introducing the square root of the dependent variable in the coefficient of the differential of Brownian motion, it gives a model for which the interest rate cannot be negative. There is unfortunately no analytic solution, but the expectation and variance can be computed using the Itô-Doeblin formula.
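
For comparison with the Vasicek model, the CIR moments are given below, assuming the parametrisation dR = (α - βR) dt + σ√R dW with positive constants; the parametrisation is an assumption of this sketch.

```latex
\begin{align*}
dR(t) &= \bigl(\alpha - \beta R(t)\bigr)\,dt + \sigma\sqrt{R(t)}\,dW(t), \\
\mathrm{E}\,R(t) &= e^{-\beta t} R(0) + \frac{\alpha}{\beta}\bigl(1 - e^{-\beta t}\bigr), \\
\mathrm{Var}\,R(t) &= \frac{\sigma^2}{\beta} R(0)\bigl(e^{-\beta t} - e^{-2\beta t}\bigr)
  + \frac{\alpha\sigma^2}{2\beta^2}\bigl(1 - e^{-\beta t}\bigr)^2 .
\end{align*}
```

Both moments follow from applying the Itô-Doeblin formula to e^{βt}R(t) and to its square, and then taking expectations. The expectation is the same as in the Vasicek model, and the long-run variance is ασ²/(2β²).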

The next section, 4.5, is devoted to the Black-Scholes-Merton (BSM) model. A stock price is modelled by geometric Brownian motion with constant drift and volatility. If the risk-free interest rate is also constant, one can construct a hedging portfolio for various options. We started with a European call option. In principle, this is just like what happens with the binomial asset-pricing model, but in continuous time.

It is convenient to discount all values, that of the stock price, of a portfolio with the stock and value in the money market, and of the call option, by a factor e^(-rt). Then, if the hedging portfolio contains Δ(t) units of the stock at time t, the differential of the discounted value of the portfolio is just Δ(t) times that of the discounted stock price.

The BSM approach posits the existence of a deterministic function c(t,x) of the time t and a variable x, such that c(t,S(t)) is a stochastic process that gives the value of the option at time t when the stock price is S(t). The differential of this process is readily computed by Itô-Doeblin. A hedging portfolio is constructed using a process Δ that is adapted to the filtration for the Brownian motion. Then the no-arbitrage argument requires the differential of the discounted value of the hedging portfolio to be equal to the differential of the discounted option value process.

This leads to the continuous-time version of the delta-hedging rule, and to the BSM partial differential equation (PDE). Boundary conditions are needed for this PDE to have a unique solution, and these are provided by the properties of a European call. Shreve gives the solution, but does not (yet) tell us where it comes from. A somewhat silly section called The Greeks follows, whereby each of the partial derivatives in the BSM PDE is named after a letter of the Greek alphabet.
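
The solution of the BSM PDE for a European call, and the corresponding delta, can be coded in a few lines. This is a sketch with hypothetical function names, in which `tau` denotes the time to expiry T - t and the numerical inputs are illustrative.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d_plus(x: float, strike: float, r: float, sigma: float, tau: float) -> float:
    """The argument usually called d+ (or d1) in the BSM formula."""
    return ((math.log(x / strike) + (r + 0.5 * sigma**2) * tau)
            / (sigma * math.sqrt(tau)))


def bsm_call(x: float, strike: float, r: float, sigma: float, tau: float) -> float:
    """Value c(t, x) of a European call with time tau > 0 to expiry."""
    dp = d_plus(x, strike, r, sigma, tau)
    dm = dp - sigma * math.sqrt(tau)
    return x * norm_cdf(dp) - strike * math.exp(-r * tau) * norm_cdf(dm)


def bsm_delta(x: float, strike: float, r: float, sigma: float, tau: float) -> float:
    """Partial derivative of c with respect to x: the delta-hedging rule."""
    return norm_cdf(d_plus(x, strike, r, sigma, tau))


if __name__ == "__main__":
    print(bsm_call(100.0, 100.0, 0.05, 0.2, 1.0))    # about 10.45
    print(bsm_delta(100.0, 100.0, 0.05, 0.2, 1.0))   # about 0.637
```

The delta is the number of units of the stock that the hedging portfolio holds when the stock price is x.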

The concept of Put-Call Parity (PCP) is introduced by considering a forward contract. It is seen that such a contract is equivalent to a long call option and a short put option with the same strike price. Since the (discounted) value of the forward contract is easy to compute, this gives a relation between the function that gives the value of a call and the function that gives the value of a put with the same strike price.
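
In symbols, with K the common strike price, T the expiry, and f, c and p the value functions of the forward contract, the call and the put (notation assumed here):

```latex
\begin{align*}
S(T) - K &= \bigl(S(T) - K\bigr)^+ - \bigl(K - S(T)\bigr)^+, \\
f(t, x) &= x - e^{-r(T - t)} K, \\
f(t, x) &= c(t, x) - p(t, x)
  \;\Longrightarrow\;
  p(t, x) = c(t, x) - x + e^{-r(T - t)} K .
\end{align*}
```

The first line says that the payoffs agree at expiry, and no arbitrage then requires the values to agree at every earlier time.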

Section 4.6 deals with multivariable stochastic calculus. A new rule is demonstrated for two independent Brownian motions W_i and W_j. It is very simple: dW_i dW_j = 0. The two-dimensional Itô-Doeblin formula is straightforward in differential form, and only more notationally clumsy in integrated form. An important use of the differential form is the Itô product rule. Next came Lévy's theorem, which gives a set of sufficient conditions for a martingale to be Brownian motion. The theorem was presented in one dimension and then in two dimensions, in which form it can readily be extended to higher dimensions.

Section 4.7 deals with the Brownian Bridge. We skipped this section, as the material it contains is not used anywhere else in Shreve's book.

Chapter 5 is on Risk-Neutral Pricing, which is based on the ideas of No-Arbitrage. The first main result is Girsanov's Theorem. This derives a change of measure analogous to what we saw when first encountering Radon-Nikodým derivatives. Now we need a Radon-Nikodým process, which allows for straightforward evaluation of conditional expectations under the changed measure.
We resumed the study of Chapter 5 on June 4th. The first application of Girsanov's Theorem was to derive the risk-neutral measure for one stock that follows a geometric Brownian motion. The so-called market price of risk is defined in this context. By combining this with a portfolio that also includes a money-market account, one can construct a hedging portfolio for options based on the stock. In the case of a single stock, a portfolio can be defined by specifying Δ(t) for each time t, the number of units of the stock held at that time. Then the differential of the discounted value of such a portfolio is just Δ(t) times the differential of the discounted stock price. We were able to derive the BSM pricing formula for a European call option by using the risk-neutral measure, with no need for a partial differential equation.
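
For a single stock with constant mean rate of return α, volatility σ and interest rate r, the change of measure can be sketched as follows. The constant-coefficient setting and the symbols θ, Z and the tilde for the new measure are assumptions of this illustration.

```latex
\begin{align*}
\theta &= \frac{\alpha - r}{\sigma}
  \qquad\text{(market price of risk)}, \\
Z(t) &= \exp\bigl\{-\theta W(t) - \tfrac12 \theta^2 t\bigr\},
  \qquad \widetilde{\mathrm{P}}(A) = \int_A Z(T)\,d\mathrm{P}, \\
\widetilde{W}(t) &= W(t) + \theta t
  \qquad\text{(a Brownian motion under } \widetilde{\mathrm{P}}\text{)}, \\
d\bigl(e^{-rt} S(t)\bigr) &= \sigma e^{-rt} S(t)\,d\widetilde{W}(t), \\
d\bigl(e^{-rt} X(t)\bigr) &= \Delta(t)\,d\bigl(e^{-rt} S(t)\bigr)
  = \Delta(t)\,\sigma e^{-rt} S(t)\,d\widetilde{W}(t).
\end{align*}
```

Both the discounted stock price and the discounted value X of any portfolio are thus martingales under the risk-neutral measure, which is what allows the option price to be written as a discounted conditional expectation of the payoff.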

The next mathematical result is the Martingale Representation Theorem, once more given first in one dimension. In one dimension, it says that a continuous martingale that is adapted to the filtration generated by a Brownian motion can be expressed as an Itô integral with some adapted integrand. The defect of the theorem is that it is not constructive, in the sense that all it claims is that the integrand exists, without saying how to construct it. In the problem of hedging with one stock, it shows the existence of a hedging strategy, but leaves us in the dark about how to compute it.

The topic of section 5.4 is the Fundamental Theorems of Asset Pricing. To begin with, both Girsanov's Theorem and the Martingale Representation Theorem are extended to multiple dimensions. This lets us construct a market model with many stocks, not just one. Here, we observe how correlated Brownian motions can be constructed.
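
One standard way to build two correlated Brownian motions from two independent ones is to set W2 = ρ W1 + √(1 - ρ²) W3. The following sketch simulates the increments on a grid; the function names, the seed, and the step size are illustrative assumptions, and this is not necessarily the construction used in class.

```python
import math
import random


def correlated_increments(rho: float, dt: float, n_steps: int,
                          rng: random.Random) -> list:
    """Pairs (dW1, dW2) of Brownian increments with dW1 dW2 = rho dt."""
    scale = math.sqrt(dt)
    mix = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)    # drives W1
        z3 = rng.gauss(0.0, 1.0)    # independent of z1; drives W3
        pairs.append((scale * z1, scale * (rho * z1 + mix * z3)))
    return pairs


def cross_variation(pairs: list) -> float:
    """Sum of the products of the paired increments."""
    return sum(a * b for a, b in pairs)


if __name__ == "__main__":
    incs = correlated_increments(0.5, 1.0 / 20_000, 20_000, random.Random(1))
    # Over the interval [0, 1] the cross variation should be close to rho.
    print(cross_variation(incs))
```

Each of the two processes has quadratic variation close to 1 over the unit interval, as a Brownian motion should, while their cross variation is close to ρ.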

This leads on to a multidimensional market model, in which there are m stocks, driven by d Brownian motions, each stock price following a geometric Brownian motion. Problems arise if m≠d. Next, a formal definition was given of a risk-neutral measure. The market price of risk equations must be solved if a risk-neutral measure is to be found. If no solution exists, the model contains an arbitrage.
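
Written out, the market price of risk equations form a linear system of m equations in d unknowns. In this sketch, α_i is the mean rate of return of stock i, σ_ij its volatility with respect to the j-th Brownian motion, R the interest rate and Θ_j the unknown market prices of risk; all of these symbols are notation assumed here.

```latex
\begin{align*}
dS_i(t) &= \alpha_i(t) S_i(t)\,dt
  + S_i(t) \sum_{j=1}^{d} \sigma_{ij}(t)\,dW_j(t),
  \qquad i = 1, \dots, m, \\
\alpha_i(t) - R(t) &= \sum_{j=1}^{d} \sigma_{ij}(t)\,\Theta_j(t),
  \qquad i = 1, \dots, m .
\end{align*}
```

When m = d and the matrix with elements σ_ij is nonsingular, the system has exactly one solution.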

A formal definition of an arbitrage was given before stating and proving the first of the asset-pricing theorems. It deals with the existence of a risk-neutral measure, and states that if such a measure exists, arbitrage is impossible. For the second theorem, we need to define a complete model, which means a model in which all derivative securities, like options, can be hedged. The theorem concerns the uniqueness of the risk-neutral measure, and states that a model is complete if and only if the risk-neutral measure is unique.
On June 6th we started by reminding ourselves of what the two theorems of asset pricing say. The second theorem, about the uniqueness of the risk-neutral measure, is trickier to prove than the first theorem, which concerns existence. We had to prove that the ability to hedge an arbitrary derivative security implies the uniqueness of the risk-neutral measure, and also, conversely, that uniqueness implies that there is a unique solution to the market price of risk equations for any given derivative security, and that this solution provides a solution to the hedging equations. Shreve complicates matters, but the essence of the theorem is that there should be exactly as many independent sources of randomness as different stocks in the model. It boils down to having a full-rank coefficient matrix in the market price of risk equations.

Section 5.5 deals with dividend-paying stocks. Continuous payment is treated first, although it is less realistic than discrete payments at a finite set of times. In both cases, it is easy to modify the stock-price process to take account of dividends, and to make small adjustments to the BSM model for options on dividend-paying stocks. We spent only a little time on this, not going into algebraic details. The main point is that, if a share of stock pays a certain amount in dividends, the value of the share falls by exactly that amount -- we have a no-arbitrage argument for this.

The last section of Chapter 5 before the Summary, Notes, and Exercises is entitled Forwards and Futures. The short section on forward contracts and the forward price is not too bad, if not very intuitive. The longer section on futures contracts and the futures price is, to my mind, incomprehensible and full of errors. For that reason, I have prepared a note with better explanations, at least I hope they are better!

We just had time to embark on Chapter 6, on Connections with Partial Differential Equations. Definitions were given of stochastic differential equations (SDE), in general, and for the special case of a one-dimensional linear SDE.
We completed our study of Chapter 6 in Shreve on June 11, starting more or less from the beginning, with the definition of a stochastic differential equation. The examples given are either things we saw before or closely related to those things. Among them is a new interest rate model, the Hull-White model. It generalises the Vasicek model by letting the constants of that model be time-varying, although still non-random. The Hull-White model also leads to a closed-form solution, and also has the property that it allows the possibility of negative interest rates.

In Section 6.3, it is shown that solutions to SDEs are Markov processes. This is the property that gives various option-pricing formulas, like the BSM formula.

Then, in the next section, the Feynman-Kac theorem is proved. This theorem is what establishes the connection between SDEs and PDEs. It uses the Markov property of solutions to an SDE. The principle behind the proof of the theorem is threefold:
- Find the martingale;
- Compute the differential of the martingale; and
- Set the coefficient of the dt term in the differential to zero.
The theorem has a discounted version, more useful to us for option pricing. Shreve applies it to rederive the BSM PDE. The next section of the chapter returns to interest rate models, with particular attention to (zero-coupon) bonds. With a time-varying interest rate, it is useful to define the yield on a bond over a fixed period of time. It is the constant interest rate that would generate the same return as the bond in the presence of time-varying interest rates.
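
The three steps of the proof can be made explicit. In this sketch, β(u,x) and γ(u,x) are the drift and diffusion coefficients of the SDE, h is the payoff function, and E^{t,x} denotes an expectation for the solution started at x at time t; this notation is assumed here.

```latex
\begin{align*}
dX(u) &= \beta\bigl(u, X(u)\bigr)\,du + \gamma\bigl(u, X(u)\bigr)\,dW(u),
  \qquad g(t, x) = \mathrm{E}^{t,x}\, h\bigl(X(T)\bigr), \\
\text{(1)}\quad & g\bigl(t, X(t)\bigr) = \mathrm{E}\bigl[h(X(T)) \mid \mathcal{F}(t)\bigr]
  \text{ is a martingale;} \\
\text{(2)}\quad & dg\bigl(t, X(t)\bigr)
  = \bigl(g_t + \beta g_x + \tfrac12 \gamma^2 g_{xx}\bigr)\,dt + \gamma g_x\,dW(t); \\
\text{(3)}\quad & g_t + \beta g_x + \tfrac12 \gamma^2 g_{xx} = 0,
  \qquad g(T, x) = h(x). \\
\text{Discounted version:}\quad
  & f(t, x) = \mathrm{E}^{t,x}\bigl[e^{-r(T - t)} h(X(T))\bigr],
  \qquad f_t + \beta f_x + \tfrac12 \gamma^2 f_{xx} = r f .
\end{align*}
```

In the discounted case the martingale is e^{-rt} f(t, X(t)), which accounts for the extra term rf. With β(t,x) = rx and γ(t,x) = σx, the discounted equation is the BSM PDE.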

Since the interest rate models we have discussed all depend on SDEs, they correspond to PDEs, of which the solutions give the value, or price, of a bond as a function of time and the interest rate at that time. For the Hull-White model, and for the Cox-Ingersoll-Ross model, the PDEs once derived can be solved, using a solution technique particular to affine-yield models, that is, those for which the yield is an affine function of the instantaneous interest rate.

The multidimensional Feynman-Kac theorem is the subject of the next section. The extension to this from the one-dimensional theorem is very straightforward. The multidimensional theorem is used to price an Asian option, the payoff of which depends on the whole trajectory of the price of the underlying asset rather than on just the price at the expiry of the option. In the context of BSM, the PDE used to get option prices is very similar to the BSM equation, but with an extra term.
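
As an illustration, take an Asian call whose payoff is based on the continuous average of the stock price. The form of the payoff, the auxiliary process Y, and the name v for the value function are assumptions of this sketch.

```latex
\begin{align*}
Y(t) &= \int_0^t S(u)\,du, \qquad dY(t) = S(t)\,dt,
  \qquad \text{payoff } \Bigl(\frac{Y(T)}{T} - K\Bigr)^+, \\
v_t + r x\, v_x + x\, v_y + \tfrac12 \sigma^2 x^2\, v_{xx} &= r v,
  \qquad v(T, x, y) = \Bigl(\frac{y}{T} - K\Bigr)^+ .
\end{align*}
```

The pair (S(t), Y(t)) is a two-dimensional Markov process, and the extra term, relative to the BSM equation, is x v_y. There is no second derivative with respect to y because Y has zero quadratic variation.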

We skipped Chapter 7, and went on to Chapter 8, on American derivative securities. Unlike European or Asian options, these can be exercised at any time up to the expiry of the option. In order to handle this additional feature, it is necessary to formalise the idea of a stopping time. The paramount example of this is a first-passage time, as discussed earlier, in Chapter 3. Although the definition of a stopping time is simple, the derivation of some of its properties is tricky. But the main use of stopping times is given by the Optional-sampling theorem, which says that a stopped martingale is still a martingale, and the analogous result is true for sub- and super-martingales.
June 13 was our last class. We completed the study of Chapter 8 on American options. The first example of an American option is the perpetual American put, where no finite expiry time is given. This assumption, although unrealistic, leads to an analytically tractable calculation of the optimal exercise time. It has to be a stopping time if it is to be implemented, and it turns out that it is a first-passage time: exercise when the price of the underlying asset falls below a given level. The optimal choice of this level is then derived.

The first technical result we need to determine the optimal exercise time, conditional on the assumption that it is a first-passage time, is to find the distribution of the first-passage time to some target of a Brownian motion combined with a drift. The most useful way to characterise this distribution is by the moment-generating function, which can also be interpreted as the Laplace transform of the density. The answer is found in a way very similar to the way in which the first-passage time was studied in the absence of drift, with a few necessary differences. One quite different result is that, if the drift goes in the direction away from the target, the first-passage time can be infinite with positive probability.
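
As a minimal sketch of the two facts in the paragraph above, the following Python evaluates the Laplace transform of the first-passage time and the probability that the target is ever reached. It assumes the setup of Shreve's Chapter 8, in which X(t) = mu*t + W(t) and tau_m is the first time X reaches a level m > 0. The function names are hypothetical, chosen for illustration only.

```python
import math


def laplace_first_passage(lam: float, m: float, mu: float) -> float:
    """E[exp(-lam * tau_m)] for X(t) = mu*t + W(t), with target m > 0 and lam > 0."""
    if m <= 0 or lam <= 0:
        raise ValueError("this sketch assumes m > 0 and lam > 0")
    return math.exp(-m * (-mu + math.sqrt(mu * mu + 2.0 * lam)))


def prob_finite(m: float, mu: float) -> float:
    """P(tau_m < infinity): the limit of the transform as lam decreases to 0."""
    if m <= 0:
        raise ValueError("this sketch assumes m > 0")
    # Equals 1 when the drift is towards the target (mu >= 0),
    # and exp(2*m*mu) < 1 when the drift is away from it (mu < 0).
    return math.exp(m * mu - m * abs(mu))
```

With the drift away from the target, for instance m = 1 and mu = -0.5, prob_finite returns exp(-1), so that the first-passage time is infinite with probability 1 - exp(-1).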

The calculation of the value of the perpetual American put is done in the same setup as the BSM model, where the stock price follows a geometric Brownian motion with constant parameters. If the policy is to exercise the first time the stock price falls to a level L, one can work out the expected discounted payoff, under the risk-neutral measure, of this policy. The answer is of course a function of L.
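
The function of L referred to above has a simple closed form, sketched here in Python. The formula is the one given by Shreve for the perpetual put, as far as it can be reconstructed here. The function name and the parameter values in the usage note are hypothetical.

```python
def threshold_policy_value(x: float, L: float, K: float, r: float, sigma: float) -> float:
    """Risk-neutral expected discounted payoff of the policy 'exercise the
    perpetual put with strike K the first time the stock price falls to L',
    when the current stock price is x."""
    if not (0.0 < L <= K) or x < 0.0 or r <= 0.0 or sigma <= 0.0:
        raise ValueError("this sketch assumes 0 < L <= K, x >= 0, r > 0, sigma > 0")
    if x <= L:
        # The price is already at or below the level: exercise at once.
        return K - x
    # Otherwise the payoff K - L is received at the first-passage time,
    # whose discount factor has expectation (x / L) ** (-2r / sigma^2).
    return (K - L) * (x / L) ** (-2.0 * r / sigma ** 2)
```

For example, threshold_policy_value(100.0, 80.0, 100.0, 0.05, 0.2) gives the value of the policy with L = 80 for a stock currently priced at 100. Evaluating it over a range of L shows how strongly the value depends on the choice of level.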

The next step is thus to maximise this expected discounted payoff with respect to L. Shreve does this one way, and mentions that it can also be done by seeking a point of tangency of the graph of the value function, as a function of the initial stock price, and the payoff from immediate exercise. Both methods give the same result. This means that the problem of pricing the perpetual American put is solved, provided that the optimal exercise time is constrained to be a first-passage time for the stock price.
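
Both routes to the optimal level can be written out in a few lines. The sketch below assumes the closed form of the value of the threshold policy as given by Shreve, with K the strike, r the interest rate and \sigma the volatility. The symbols v_L, g and L_* are notation introduced for this illustration.

```latex
% Value of the policy "exercise at level L" for x >= L
\[ v_L(x) = (K - L)\Bigl(\frac{x}{L}\Bigr)^{-2r/\sigma^2}
          = g(L)\,x^{-2r/\sigma^2}, \qquad g(L) = (K - L)\,L^{2r/\sigma^2} \]
% First route: maximise g over 0 <= L <= K
\[ g'(L) = -L^{2r/\sigma^2} + \frac{2r}{\sigma^2}(K - L)\,L^{2r/\sigma^2 - 1} = 0
   \quad\Longrightarrow\quad L_* = \frac{2r}{2r + \sigma^2}\,K \]
% Second route: tangency with the payoff K - x at x = L
\[ v_L'(L+) = -\frac{2r}{\sigma^2}\,\frac{K - L}{L} = -1
   \quad\Longrightarrow\quad L = \frac{2r}{2r + \sigma^2}\,K = L_* \]
```

Since the maximiser of g does not depend on x, the same level is optimal whatever the initial stock price.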

Using this explicit solution, it can be shown by direct calculation that the ordinary differential equation (ODE) for the value function, the analogue of the BSM partial differential equation for European options, is satisfied before the stopping time is reached, but not afterwards. Since the holder of the perpetual put may fail to use an optimal exercise policy, it is desirable to find out what becomes of the differential equation once the optimal stopping time has passed. The behaviour of the ODE both before and after the first-passage stopping time is described by the linear complementarity conditions.
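
Written out, the conditions take the following form. This is a sketch in notation assumed from Shreve, where v is the value function of the perpetual put with strike K, and L_* denotes the optimal exercise level.

```latex
% (i) the option is worth at least its intrinsic value
\[ v(x) \ge (K - x)^{+} \quad \text{for all } x \ge 0 \]
% (ii) the ODE holds as an inequality
\[ r\,v(x) - r x\,v'(x) - \tfrac12 \sigma^2 x^2\,v''(x) \ge 0
   \quad \text{for all } x \ge 0,\ x \ne L_* \]
% (iii) complementarity: for each x, equality holds in (i) or in (ii)
% In the continuation region x > L_*, (ii) holds with equality.
% In the stopping region 0 <= x < L_*, v(x) = K - x, so that
\[ r\,v(x) - r x\,v'(x) - \tfrac12 \sigma^2 x^2\,v''(x) = r(K - x) + r x = rK > 0 \]
```

The second derivative of v is not defined at x = L_*, which is why that point is excluded in (ii).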

Just as the BSM solution can be found either by solving a PDE or by using the fact that the discounted option value is a martingale under the risk-neutral measure, so here we can see that, for an American option, the discounted value is still a martingale until the optimal stopping time, but becomes a supermartingale thereafter. The process obtained by stopping it at the optimal stopping time is of course still a martingale. But the supermartingale property is maintained whatever stopping time is used, by the optional sampling theorem. This fact lets us show that no other stopping time can give a higher value than the optimal first-passage time.

Can the perpetual American put be hedged? The hedging portfolio is defined by the delta-hedging rule before optimal exercise, and afterwards the owner of the hedging portfolio can maintain equality with the value of the option by consuming cash out of the portfolio at a rate rK.

Most of the ideas worked out for the perpetual put have analogues for the finite-expiration American put. The value function is now explicitly dependent on the time to expiry as well as the stock price. In the two-dimensional space of these two arguments, there is a curve that separates the continuation set, where the option should not be exercised, from the stopping set, where it should be exercised immediately. Although no closed-form expression is known for the separating curve, the rest of the analysis is as for the perpetual put. The optimal solution is determined by linear complementarity conditions, combined with what is now a partial differential equation, which involves the derivative of the value function with respect to the time.
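
Because there is no closed form, values of the finite-expiration put are computed numerically in practice. The sketch below uses a Cox-Ross-Rubinstein binomial tree. This is a technique swapped in for illustration, a discrete-time approximation in the spirit of Shreve's first volume, and not the continuous-time analysis described above. The function name and all parameter values are hypothetical.

```python
import math


def binomial_put(S0: float, K: float, r: float, sigma: float, T: float,
                 steps: int, american: bool = True) -> float:
    """Approximate value of a put by backward induction on a binomial tree."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))   # up factor
    d = 1.0 / u                           # down factor
    disc = math.exp(-r * dt)              # one-period discount factor
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    if not 0.0 < p < 1.0:
        raise ValueError("time step too coarse for these parameters")

    # Payoffs at expiry; j counts the number of up moves.
    values = [max(K - S0 * u ** j * d ** (steps - j), 0.0)
              for j in range(steps + 1)]

    # Step back through the tree, comparing continuation with exercise.
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            cont = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                exercise = max(K - S0 * u ** j * d ** (n - j), 0.0)
                values[j] = max(cont, exercise)
            else:
                values[j] = cont
    return values[0]
```

Calling the function with american=False gives the corresponding European value, so that the difference between the two calls is an estimate of the early-exercise premium. The nodes at which exercise beats continuation trace out a discrete approximation to the separating curve.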

The analysis of an American call option is much simpler, because, for it and for any option the payoff of which is a convex function of the stock price that vanishes when the stock price is zero, it is readily shown that the payoff from immediate exercise is dominated by the value of waiting until maturity. Thus the American call is equivalent to the European call, unless the underlying stock pays dividends. The case in which there is a fixed finite number of dividend payment dates is analysed, and it is seen that, if the dividend payment (which does not accrue to the option holder) is big enough, the subsequent loss in value of the stock may make it preferable to exercise just before the payment of the dividend. The computation of the value function of the American call based on a dividend-paying asset is a little complicated, but can be done by means of a backward recurrence, starting with the value at times between the last dividend payment date and maturity.
On June 18, some of us met, but not really as part of the course. It was a session in which we discussed Asian options, with the much more detailed exposition given in Chapter 7.
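
The argument for the case without dividends can be sketched in two inequalities. The hypotheses are those recalled from Shreve's treatment and are assumptions here: the payoff function h is nonnegative and convex with h(0) = 0, the stock pays no dividends, and r \ge 0.

```latex
% Convexity with h(0) = 0 gives, for 0 <= \lambda <= 1 and x >= 0,
\[ h(\lambda x) = h\bigl(\lambda x + (1 - \lambda)\cdot 0\bigr) \le \lambda\,h(x) \]
% For u <= t, use this with \lambda = e^{-r(t-u)}, then conditional Jensen,
% then the fact that the discounted stock price is a risk-neutral martingale
\[ \tilde E\bigl[e^{-r(t-u)}\,h(S(t)) \,\big|\, \mathcal F(u)\bigr]
   \ge \tilde E\bigl[h\bigl(e^{-r(t-u)}S(t)\bigr) \,\big|\, \mathcal F(u)\bigr]
   \ge h\Bigl(\tilde E\bigl[e^{-r(t-u)}S(t) \,\big|\, \mathcal F(u)\bigr]\Bigr)
   = h(S(u)) \]
```

Multiplying through by e^{-ru} shows that the discounted intrinsic value is a submartingale, so that waiting until maturity is never worse than exercising early. The call payoff (x - K)^+ satisfies the hypotheses. The put payoff (K - x)^+ does not, since it equals K at x = 0.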

Assignments:

Here is the link to the first assignment, dated May 9. The assignment is due on Thursday May 16. The easiest way to submit your assignment is by sending it to me by email.
Here is the link to the second assignment, dated May 18. The assignment is due on Tuesday May 28.
Here is the link to the third assignment, dated May 28. The assignment is due on Thursday June 6.
Here is the link to the fourth assignment, dated June 6. The assignment is due on Thursday June 13.

Ancillary readings:

Here is the note I prepared as an alternative to Shreve's discussion of forward and futures contracts, in the hope that it would be more understandable.

I mentioned in class that there is a construction of Brownian motion by a sequence of stochastic processes that converge almost surely to the Brownian motion. The gory details of this, and a few other things, can be found at this link.

I found this review of Shreve's texts, written by Darrell Duffie of Stanford. In it, you will read how good a set of two texts these books are!

By chance I came across an article (in French) written by a Parisian probabilist on the "History of Martingales". It gives a fairly complete account of the numerous senses of the word "martingale", and explains the best modern theories as to why the word means what it does in Probability theory. The article is well written and amusing, as well as being scholarly. It can be found here as a PDF file.

The article found by following this link, by Jarrow and Protter, gives a history of the development of stochastic calculus and its application to mathematical finance. It includes the sad tale of Doeblin, and explains why a Frenchman had a German name.

In order to encourage the use of the Linux operating system, here is a link to an article by James MacKinnon, in which he gives valuable information about what software is appropriate for the various tasks econometricians wish to undertake.

To send me email, click here or write directly to Russell.Davidson@mcgill.ca.

URL: https://russell-davidson.research.mcgill.ca/e765/