Advanced Macroeconomics – 3 Questions (with reference book and lecture slides)

Advanced Macroeconomics
Problem Set 1 – Consumption
Due 11:59 PM Sunday 24th February 2013
Problem 1
Solve the following two-period model for ct, ct+1 and st:


max U(ct, ct+1) = −(1/α) e^(−α ct) − (β/α) e^(−α ct+1)

subject to:


ct + [1/(1 + r)] ct+1 = Yt + [1/(1 + r)] Yt+1

Problem 2
Solve the following problem:

max{Ct, Ct+1} U = (Ct − a Ct^2) + β (Ct+1 − a Ct+1^2)
subject to:

i) Ct + bt+1 + Kt+1 = Yt
ii) Ct+1 = Yt+1 + (1 + r) bt+1 + (1 − δ) Kt+1
iii) Yt+1 = A ln (1 + Kt+1)

where A > r + δ.

a) Find the optimal values of Kt+1, Yt+1, Ct, Ct+1, and bt+1;

b) Given the optimal value of Kt+1, find ∂Kt+1/∂r. Does this make sense?
Why?

c) For Yt = 1, β (1 + r) = 1, and A = 2 (r + δ), find ∂bt+1/∂r. Does the
income or substitution effect dominate?

Problem 3
Solve the following optimization problem for a three-period lived individual:

max{Ct, Ct+1, Ct+2} U = ln Ct + β ln Ct+1 + β^2 ln Ct+2

subject to:

i) Ct + bt+1 = Yt
ii) Ct+1 + bt+2 = Yt+1 + (1 + r) bt+1
iii) Ct+2 = Yt+2 + (1 + r) bt+2
a) Find the optimal values of C∗t , C∗t+1 and C∗t+2;

b) Find the value function V = U(C∗t ,C∗t+1,C∗t+2), i.e. evaluate the total utility
function at the optimal values. What form does V have compared to U?

Advanced Macroeconomics

Lecture 1. Quick Review of Undergraduate Macroeconomics:

Simple two-period models of consumption

Andrzej Cieślik

Spring 2013

Main assumptions:

– 2 periods of time: t, t+1
– Utility function additively separable: U(Ct, Ct+1) = U(Ct) + β U(Ct+1)
– Constant discount factor: β = 1/(1 + θ), where 0 ≤ β ≤ 1
– Constant discount rate: θ, where θ ≥ 0

Consumer utility maximization problem:

Max{Ct, Ct+1} U = U(Ct) + β U(Ct+1)

subject to one of the following cases:
o) no storage (benchmark)
i) physical storage, no financial market, no production
ii) financial market
iii) production
iv) financial market and production

CASE 0 (Benchmark): No Physical Storage

Problem: Max{Ct, Ct+1} U = U(Ct) + β U(Ct+1)
s.t. (1) Ct ≤ Yt
     (2) Ct+1 ≤ Yt+1

Budget constraint = endowment point.
[Figure: (Ct, Ct+1) space with the endowment point E = (Yt, Yt+1).]

Equilibrium with binding constraints:
C*t = Yt, C*t+1 = Yt+1
The chosen consumption point coincides with the endowment point: C = E.
[Figure: indifference curve U passing through the endowment point, C = E.]

CASE 1. Physical Storage

Problem: Max{Ct, Ct+1} U = U(Ct) + β U(Ct+1)
s.t. (1) Ct + St = Yt
     (2) Ct+1 = Yt+1 + St
     (3) St ≥ 0

Combine (1) & (2) into the intertemporal budget constraint:
Ct + Ct+1 = Yt + Yt+1
dCt + dCt+1 = dYt + dYt+1   (with dYt = dYt+1 = 0)
⇒ dCt+1/dCt = −1   (slope)

Kinked budget constraint
[Figure: budget line with slope −1 above the endowment point E = (Yt, Yt+1); the storage constraint St ≥ 0 rules out points with Ct > Yt.]

Equilibrium with non-binding saving constraint
[Figure: tangency of an indifference curve with the budget line at C = (C*t, C*t+1), to the left of the endowment point E.]

Equate the slope of the indifference curve to the slope of the budget constraint. From the total differential of utility,
dU = U'(Ct) dCt + β U'(Ct+1) dCt+1 = 0,
the slope of the indifference curve is
dCt+1/dCt = −U'(Ct) / [β U'(Ct+1)] = −1
⇒ U'(Ct) = β U'(Ct+1)

With logarithmic utility, U(C) = ln C:
1/Ct = β/Ct+1, which gives the consumption policy function C*t+1 = f(C*t) = β C*t

Substitute into the intertemporal budget constraint Ct + Ct+1 = Yt + Yt+1:
C*t + β C*t = Yt + Yt+1
⇒ C*t = [1/(1 + β)] (Yt + Yt+1)
⇒ C*t+1 = [β/(1 + β)] (Yt + Yt+1)

Optimal savings:
S*t = Yt − C*t = [β/(1 + β)] Yt − [1/(1 + β)] Yt+1, which is positive in the non-binding case.

Equilibrium with binding saving constraint
If the expression above is negative, the constraint St ≥ 0 binds and the household simply consumes its endowment:
C*t = Yt, C*t+1 = Yt+1, i.e. C = E.
[Figure: indifference curve through the kink of the budget constraint at the endowment point E.]
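A quick numerical sketch of Case 1 in Python (the endowment values Yt = 2, Yt+1 = 1 and the discount factor β = 0.9 below are illustrative assumptions, not values taken from the slides). It computes the log-utility solution and enforces the storage constraint St ≥ 0:

    # Case 1: physical storage, log utility, goods stored one-for-one (no interest)
    def case1_storage(Y_t, Y_t1, beta):
        # Interior solution: C*t = (Yt + Yt+1) / (1 + beta)
        C_t = (Y_t + Y_t1) / (1.0 + beta)
        S_t = Y_t - C_t
        if S_t < 0:                      # storage cannot be negative: constraint binds
            C_t, S_t = Y_t, 0.0          # consume the endowment, C = E
        C_t1 = Y_t1 + S_t
        return C_t, C_t1, S_t

    print(case1_storage(2.0, 1.0, 0.9))  # non-binding case: S*t > 0
    print(case1_storage(1.0, 2.0, 0.9))  # binding case: S*t = 0, C = E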

CASE 2. Financial Market

Problem: Max{Ct, Ct+1} U = U(Ct) + β U(Ct+1)
s.t. (1) Ct + St = Yt
     (2) Ct+1 = Yt+1 + (1 + r) St

Combine (1) & (2) into the intertemporal budget constraint:
St = (Ct+1 − Yt+1)/(1 + r)
⇒ Ct + Ct+1/(1 + r) = Yt + Yt+1/(1 + r)

Slope of the budget constraint (equated to the slope of the indifference curve at the optimum):
dCt+1/dCt = −(1 + r) = −U'(Ct) / [β U'(Ct+1)]

PDV of consumption = PDV of income

CASE 2. Financial Market
Numerical Example

Two-period, additively separable, logarithmic utility function:
U = ln Ct + β ln Ct+1

Intertemporal budget constraint:
Ct + Ct+1/(1 + r) = Yt + Yt+1/(1 + r)

Slope of the indifference curve = slope of the budget constraint:
U'(Ct) / [β U'(Ct+1)] = Ct+1 / (β Ct) = 1 + r

Consumption policy function:
Ct+1 = f(Ct) = β (1 + r) Ct

Substitute the policy function into the budget constraint; we only have to solve a simple two-period "consumption = income" equality:
Ct + β(1 + r) Ct / (1 + r) = Yt + Yt+1/(1 + r)
(1 + β) Ct = Yt + Yt+1/(1 + r)

Optimal amount of consumption in period t:
C*t = [1/(1 + β)] [Yt + Yt+1/(1 + r)]

and, from the policy function C*t+1 = β(1 + r) C*t,
C*t+1 = [β(1 + r)/(1 + β)] [Yt + Yt+1/(1 + r)]

Optimal Savings:
S*t = Yt − C*t = Yt − [1/(1 + β)] Yt − [1/((1 + β)(1 + r))] Yt+1
⇒ S*t = [β/(1 + β)] Yt − [1/((1 + β)(1 + r))] Yt+1
CASE 2. Financial Market
Equilibrium with positive savings (lending)
[Figure: optimum C = (C*t, C*t+1) on the budget line of slope −(1 + r), above and to the left of the endowment point E = (Yt, Yt+1).]

CASE 2. Financial Market
Equilibrium with negative savings (borrowing)
[Figure: optimum C below and to the right of the endowment point E on the same budget line.]

CASE 3. Production

Problem: Max{Ct, Ct+1} U = U(Ct) + β U(Ct+1)
s.t. (1) Ct + Kt+1 = Yt + (1 − δ) Kt
     (2) Ct+1 + Kt+2 = Yt+1 + (1 − δ) Kt+1
     (3) Yt+1 = F(Kt+1)

Solution (assume Kt+2 = 0):
Substitute (3) into (2): Ct+1 = F(Kt+1) + (1 − δ) Kt+1
From (1): Kt+1 = Yt − Ct + (1 − δ) Kt

Intertemporal budget constraint:
Ct+1 = F[Yt − Ct + (1 − δ) Kt] + (1 − δ)[Yt − Ct + (1 − δ) Kt]

Slope of the budget constraint:
dCt+1/dCt = ∂{F[Yt − Ct + (1 − δ) Kt] + (1 − δ)[Yt − Ct + (1 − δ) Kt]}/∂Ct = −[F'(Kt+1) + (1 − δ)]

Equate the slopes of the indifference curve and the intertemporal budget constraint:
−U'(Ct) / [β U'(Ct+1)] = −[F'(Kt+1) + (1 − δ)]

Equilibrium with positive investment
[Figure: concave intertemporal production possibility frontier through the endowment point E, tangent to an indifference curve at the optimum C = (C*t, C*t+1).]

Numerical Example:
Yt+1 = F(Kt+1) = Kt+1^(1/2)
F'(Kt+1) = (1/2) Kt+1^(−1/2)
δ = 1

Then Ct+1 = Yt+1 = Kt+1^(1/2), and from the budget constraint we know that
Ct+1 = [Yt + (1 − δ) Kt − Ct]^(1/2), which with δ = 1 reduces to Ct+1 = (Yt − Ct)^(1/2).

From utility maximization (log utility) we know that
C*t+1 / C*t = β / [2 (Yt − C*t)^(1/2)]
⇒ C*t = 2 Yt / (2 + β)

Equating, we have our solutions:
K*t+1 = Yt − 2 Yt/(2 + β) = β Yt / (2 + β)
C*t+1 = [β Yt / (2 + β)]^(1/2)
    25

    Problem: UUU tt
    tt
    CC
    CC
    Max
    )()(
    ,
    1
    1
    +
    +
    += β

    s.t : (1) tttttt BrKYKBC )1()1(11 ++−+=++ ++ δ , [ 0,0 == tt BK ]
    (2) 1111 )1()1( ++++ ++−+= tttt BrKYC δ
    (3) )(1 1+=+ tKt FY , 01≥+tK

    CASE 4. Financial Market and Production

    Note that S splits into B & K

    26

    CASE 4. Financial Market and Production

    Solution:
    (1)

    11 ++ −−=⇒ tttt KCYB

    [ ]11)(1 )1()1(1
    1

    1
    1

    1 +++
    +−−+

    +
    +=
    +
    +

    + ttKttt
    KrKF

    r
    YC

    r
    C

    t
    δ

    0)1()1()(
    1

    1
    =+−−+′=

    +
    +

    rF
    dK
    dPDV

    tK
    t

    δ

    )1()1()( 1 rF tK +=−+′ + δ

CASE 4. Financial Market and Production
Numerical Example

Max U = ln Ct + β ln Ct+1
s.t. Ct + Bt+1 + Kt+1 = Yt
     Ct+1 = Yt+1 + (1 − δ) Kt+1 + (1 + r) Bt+1
     Yt+1 = F(Kt+1) = A Kt+1^α

Intertemporal budget constraint:
Ct + Ct+1/(1 + r) = Yt + [1/(1 + r)] [A Kt+1^α + (1 − δ) Kt+1 − (1 + r) Kt+1]

Maximize the value of resources A Kt+1^α + (1 − δ) Kt+1 − (1 + r) Kt+1:
dPDV/dKt+1 = F'(Kt+1) + (1 − δ) − (1 + r) = α A Kt+1^(α−1) + (1 − δ) − (1 + r) = 0
⇒ gross rate of return on capital = gross rate of return in the financial market:
α A Kt+1^(α−1) + (1 − δ) = 1 + r
Equivalently, net rate of return on capital = net rate of return in the financial market:
α A Kt+1^(α−1) − δ = r

Optimal capital stock:
K*t+1 = [α A / (r + δ)]^(1/(1−α))
with dKt+1/dr < 0 (a higher opportunity cost of holding capital decreases the capital stock) and dKt+1/dA > 0 (improvements in technology increase the capital stock).

Substituting K*t+1 back in, the maximum resources available in period t+1 are
Xt+1 = A [αA/(r + δ)]^(α/(1−α)) + (1 − δ) [αA/(r + δ)]^(1/(1−α)) − (1 + r) [αA/(r + δ)]^(1/(1−α))
and the intertemporal budget constraint becomes
Ct + Ct+1/(1 + r) = Yt + Xt+1/(1 + r).

Optimality condition (utility maximization): slope of the indifference curve = slope of the budget constraint:
Ct+1 / (β Ct) = 1 + r
Consumption policy function: Ct+1 = β (1 + r) Ct

Hence
C*t = [1/(1 + β)] [Yt + Xt+1/(1 + r)]
B*t+1 = Yt − C*t − K*t+1

For the specific parameter values chosen on the slide, optimal bond holdings reduce to
B*t+1 = Yt − C*t − K*t+1 = 0.5 (Yt − 5),
so the household lends (B*t+1 > 0) if Yt > 5 and borrows otherwise.
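A Python sketch of the Case 4 solution (the parameter values A = 1, α = 0.5, r = 0.25, δ = 0, β = 1 and Yt = 10 below are illustrative assumptions, not necessarily the ones used on the slide):

    def case4_bonds_and_capital(Y_t, A, alpha, r, delta, beta):
        # Optimal capital equates net rates of return: alpha*A*K**(alpha-1) - delta = r
        K_t1 = (alpha * A / (r + delta)) ** (1.0 / (1.0 - alpha))
        # Maximum extra resources available in t+1 from operating the technology
        X_t1 = A * K_t1 ** alpha + (1.0 - delta) * K_t1 - (1.0 + r) * K_t1
        # Log utility: consume a 1/(1+beta) share of lifetime wealth
        C_t  = (Y_t + X_t1 / (1.0 + r)) / (1.0 + beta)
        C_t1 = beta * (1.0 + r) * C_t
        B_t1 = Y_t - C_t - K_t1              # bonds: lending if > 0, borrowing if < 0
        return K_t1, C_t, C_t1, B_t1

    K_t1, C_t, C_t1, B_t1 = case4_bonds_and_capital(10.0, 1.0, 0.5, 0.25, 0.0, 1.0)
    print(K_t1, C_t, C_t1, B_t1)
    # Feasibility check: C*t+1 = A*K**alpha + (1-delta)*K + (1+r)*B
    print(abs(C_t1 - (1.0 * K_t1 ** 0.5 + K_t1 + 1.25 * B_t1)) < 1e-12)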


    Dynamic Economics:
    Quantitative Methods and Applications
    Jérôme Adda and Russell Cooper
    October 25, 2002

To Lance Armstrong, a master to us all

Contents

1 OVERVIEW

Part I: Theory

2 Theory of Dynamic Programming
2.1 Overview
2.2 Indirect Utility
2.2.1 Consumers
2.2.2 Firms
2.3 Dynamic Optimization: A Cake Eating Example
2.3.1 Direct Attack
2.3.2 Dynamic Programming Approach
2.4 Some Extensions of the Cake Eating Problem
2.4.1 Infinite Horizon
2.4.2 Taste Shocks
2.4.3 Discrete Choice
2.5 General Formulation
2.5.1 Non-Stochastic Case
2.5.2 Stochastic Dynamic Programming
2.6 Conclusion

3 Numerical Analysis
3.1 Overview
3.2 Stochastic Cake Eating Problem
3.2.1 Value Function Iterations
3.2.2 Policy Function Iterations
3.2.3 Projection Methods
3.3 Stochastic Discrete Cake Eating Problem
3.3.1 Value Function Iterations
3.4 Extensions and Conclusion
3.4.1 Larger State Spaces
3.A Additional Numerical Tools
3.A.1 Interpolation Methods
3.A.2 Numerical Integration
3.A.3 How to Simulate the Model

4 Econometrics
4.1 Overview
4.2 Some Illustrative Examples
4.2.1 Coin Flipping
4.2.2 Supply and Demand Revisited
4.3 Estimation Methods and Asymptotic Properties
4.3.1 Generalized Method of Moments
4.3.2 Maximum Likelihood
4.3.3 Simulation Based Methods
4.4 Conclusion

Part II: Applications

5 Stochastic Growth
5.1 Overview
5.2 Non-Stochastic Growth Model
5.2.1 An Example
5.2.2 Numerical Analysis
5.3 Stochastic Growth Model
5.3.1 Environment
5.3.2 Bellman's Equation
5.3.3 Solution Methods
5.3.4 Decentralization
5.4 A Stochastic Growth Model with Endogenous Labor Supply
5.4.1 Planner's Dynamic Programming Problem
5.4.2 Numerical Analysis
5.5 Confronting the Data
5.5.1 Moments
5.5.2 GMM
5.5.3 Indirect Inference
5.5.4 Maximum Likelihood Estimation
5.6 Some Extensions
5.6.1 Technological Complementarities
5.6.2 Multiple Sectors
5.6.3 Taste Shocks
5.6.4 Taxes
5.7 Conclusions

6 Consumption
6.1 Overview and Motivation
6.2 Two-Period Problem
6.2.1 Basic Problem
6.2.2 Stochastic Income
6.2.3 Portfolio Choice
6.2.4 Borrowing Restrictions
6.3 Infinite Horizon Formulation: Theory and Empirical Evidence
6.3.1 Bellman's Equation for the Infinite Horizon Problem
6.3.2 Stochastic Income
6.3.3 Stochastic Returns: Portfolio Choice
6.3.4 Endogenous Labor Supply
6.3.5 Borrowing Constraints
6.3.6 Consumption Over the Life Cycle
6.4 Conclusion

7 Durable Consumption
7.1 Motivation
7.2 Permanent Income Hypothesis Model of Durable Expenditures
7.2.1 Theory
7.2.2 Estimation of a Quadratic Utility Specification
7.2.3 Quadratic Adjustment Costs
7.3 Non-Convex Adjustment Costs
7.3.1 General Setting
7.3.2 Irreversibility and Durable Purchases
7.3.3 A Dynamic Discrete Choice Model

8 Investment
8.1 Overview/Motivation
8.2 General Problem
8.3 No Adjustment Costs
8.4 Convex Adjustment Costs
8.4.1 Q Theory: Models
8.4.2 Q Theory: Evidence
8.4.3 Euler Equation Estimation
8.4.4 Borrowing Restrictions
8.5 Non-Convex Adjustment: Theory
8.5.1 Non-Convex Adjustment Costs
8.5.2 Irreversibility
8.6 Estimation of a Rich Model of Adjustment Costs
8.6.1 General Model
8.6.2 Maximum Likelihood Estimation
8.7 Conclusions

9 Dynamics of Employment Adjustment
9.1 Motivation
9.2 General Model of Dynamic Labor Demand
9.3 Quadratic Adjustment Costs
9.4 Richer Models of Adjustment
9.4.1 Piecewise Linear Adjustment Costs
9.4.2 Non-Convex Adjustment Costs
9.4.3 Asymmetries
9.5 The Gap Approach
9.5.1 Partial Adjustment Model
9.5.2 Measuring the Target and the Gap
9.6 Estimation of a Rich Model of Adjustment Costs
9.7 Conclusions

10 Future Developments
10.1 Overview/Motivation
10.2 Price Setting
10.2.1 Optimization Problem
10.2.2 Evidence on Magazine Prices
10.2.3 Aggregate Implications
10.3 Optimal Inventory Policy
10.3.1 Inventories and the Production Smoothing Model
10.3.2 Prices and Inventory Adjustment
10.4 Capital and Labor
10.5 Technological Complementarities: Equilibrium Analysis
10.6 Search Models
10.6.1 A Simple Labor Search Model
10.6.2 Estimation of the Labor Search Model
10.6.3 Extensions
10.7 Conclusions

    Chapter 1
    OVERVIEW
    This book studies a rich set of applied problems in economics, emphasizing the
    dynamic aspects of economic decisions. While we are ultimately interested in appli-
    cations, it is necessary to acquire some basic techniques before tackling the details
    of specific dynamic optimization problems. Thus the book presents and integrates
    tools, such as dynamic programming, numerical techniques and simulation based
econometric methods. We then use these tools to study a variety of applications
in both macroeconomics and microeconomics.
    The approach we pursue to studying economic dynamics is structural. As re-
    searchers, we are frequently interested in inferring underlying parameters that rep-
    resent tastes, technology and other primitives from observations of individual house-
    holds and firms as well as from economic aggregates. If this inference is successful,
    then we can test competing hypotheses about economic behavior and evaluate the
    effects of policy experiments. In the end, our approach allows us to characterize the
    mapping from primitives to observed behavior.
    To appreciate what is at stake, consider the following policy experiment. In re-
    cent years, a number of European governments have instituted policies of subsidizing
    the scrapping of old cars and the subsequent purchase of a new car. What are the
    expected effects of these policies on the car industry and on government revenues?
At some level this question seems easy if a researcher "knows" the demand function
    for cars. But of course that demand function is, at best, elusive. Further, the de-
    mand function estimated in one policy regime is unlikely to be very informative for
    a novel policy experiment, such as the car scrapping subsidies.
    An alternative approach is to build and estimate a dynamic model of household
    choice over car ownership. Once the parameters of this model are estimated, then
    various policy experiments can be evaluated.1 This seems considerably more difficult
    than just estimating a demand function and indeed that is the case. The approach
    requires the specification and solution of a dynamic optimization problem and then
    the estimation of the parameters. But, as we argue here, this methodology is both
    feasible and exciting.
    It is the integration of the solution of dynamic optimization problems with the
    estimation of parameters that is at the heart of the approach to the study of dynamic
    economies. There are three key steps in our development of this topic. These are
    reflected in the organization of the chapters.
    The first step is to review the formal theory of dynamic optimization. This tool is
    used in many areas of economics including macroeconomics, industrial organization,
    labor economics, international economics and so forth. As in previous contributions
    to the study of dynamic optimization, such as Sargent (1987) and Stokey and Lucas
    (1989), our presentation starts with the formal theory of dynamic programming.
    Given the large number of other contributions in this area, our presentation will rely
    on existing theorems concerning the existence of solutions to a variety of dynamic
    programming problems.
    A second step is to present the numerical tools and the econometric techniques
    necessary to conduct a structural estimation of the theoretical dynamic models.
These numerical tools serve two purposes: (i) to complement the theory in learning
about dynamic programming and (ii) to enable a researcher to evaluate the
    quantitative implications of the theory. From our experience, the process of writing
    computer code to solve dynamic programming problems is an excellent device for
    teaching basic concepts of this approach.2
    The econometric techniques provide the final link between the dynamic program-
    ming problem and data. Our emphasis will be on the mapping from parameters of
    the dynamic programming problem to observations. For example, a vector of pa-
    rameters is used to numerically solve a dynamic programming problem which is
    then simulated to create moments. An optimization routine then selects a vector of
    parameters to bring these simulated moments close to the actual moments observed
    in the data.
    The complete presentation of these two steps will comprise the first three chap-
    ters. To distinguish this material, which is more theoretical, we call this Part I of
    the book.
    The final step of the presentation comprises Part II of the book which is devoted
    to the application of dynamic programming to specific areas of applied economics
    such as the study of business cycles, consumption, investment behavior, etc. Each
    of the applied sections of the text will contain four elements: presentation of the
    specific optimization problem as a dynamic programming problem, characterization
    of the optimal policy functions, estimation of the parameters and using models for
    policy evaluation.
While the specific applications might be labelled "macroeconomics", the material
    is of value in other areas of economics for a couple of reasons. First, the presentation
    of these applications utilizes material from all parts of economics. So, for example,
    the discussion of the stochastic growth model includes material on taxation and the
    work on factor adjustment at the plant-level is of interest to economists in labor
and industrial organization. Second, these techniques are useful in any application
    where the researcher is interested in taking a dynamic optimization problem to data.
    The presentation contains references to various applications of these techniques.
    The novel element of this book is our presentation of an integrated approach to
the empirical implementation of dynamic optimization models. Previous texts have
    provided the mathematical basis for dynamic programming but those presentations
    generally do not contain any quantitative applications. Other texts present the un-
    derlying econometric theory but generally without specific economic applications.
    This approach does both and thus provides a useable link between theory and ap-
    plication as illustrated in the chapters of Part II.
    Our motivation for writing this book is thus clear. From the perspective of un-
    derstanding dynamic programming, explicit empirical applications complement the
    underlying theory of optimization. From the perspective of applied macroeconomics,
    explicit dynamic optimization problems, posed as dynamic programming problems,
provide needed structure for estimation and policy evaluation.
    Since the book is intended to teach empirical applications of dynamic program-
    ming problems, we plan to create a web-site for the presentation of code (MATLAB
    and GAUSS) as well as data sets that will be useful for applications. The site will be
    vital to readers wishing to supplement the presentation in Part II and also provide
    a forum for further development of code.
    The development of the material in this book has certainly benefited from the
    joint work with Joao Ejarque, John Haltiwanger, Alok Johri and Jonathan Willis
    that underlies some of the material. We thank these co-authors for their generous
    sharing of ideas and computer code as well as their comments on the draft. Thanks
    also to Victor Aguirregabiria, Yan Bai, Dean Corbae, Simon Gilchrist, Hang Kang,
    Valérie Lechene, Nicola Pavoni, Marcos Vera for comments on various parts of the
    book. Finally, we are grateful to numerous MA and PhD students at Tel Aviv
University, University of Texas at Austin, the IDEI at the Université de Toulouse,
    the NAKE PhD program in Holland, the University of Haifa, University College
    London for their numerous comments and suggestions during the preparation of
    this material.

    Part I
    Theory

    Chapter 2
    Theory of Dynamic Programming
    2.1 Overview
    The mathematical theory of dynamic programming as a means of solving dynamic
    optimization problems dates to the early contributions of Bellman (1957) and Bert-
    sekas (1976). For economists, the contributions of Sargent (1987) and Stokey and
    Lucas (1989) provide a valuable bridge to this literature.
    2.2 Indirect Utility
    Intuitively, the approach of dynamic programming can be understood by recalling
    the theme of indirect utility from basic static consumer theory or a reduced form
    profit function generated by the optimization of a firm. These reduced form repre-
sentations of payoffs summarize information about the optimized value of the choice
    problems faced by households and firms. As we shall see, the theory of dynamic
    programming uses this insight in a dynamic context.
    2.2.1 Consumers
    Consumer choice theory focuses on households who solve:
V(I, p) = max_c u(c) subject to: pc = I
    where c is a vector of consumption goods, p is a vector of prices and I is income.3
    The first order condition is given by
    uj(c)/pj = λ for j = 1, 2…J.
    where λ is the multiplier on the budget constraint and uj(c) is the marginal utility
    from good j.
    Here V (I, p) is an indirect utility function. It is the maximized level of utility
    from the current state (I, p). So if someone is in this state, you can predict that they
    will attain this level of utility. You do not need to know what they will do with their
    income; it is enough to know that they will act optimally. This is very powerful logic
    and underlies the idea behind the dynamic programming models studied below.
    To illustrate, what happens if we give the consumer a bit more income? Welfare
    goes up by VI (I, p) > 0. Can the researcher predict what will happen with a little
    more income? Not really since the optimizing consumer is indifferent with respect
    to how this is spent:
    uj(c)/pj = VI (I, p) for all j.
    It is in this sense that the indirect utility function summarizes the value of the
household's optimization problem and allows us to determine the marginal value of
    income without knowing further details about consumption functions.
    Is this all we need to know about household behavior? No, this theory is static
    and thus ignores savings, spending on durable goods as well as uncertainty over the
    future. These are all important elements in the household optimization problem.
    We will return to these in later chapters on the dynamic behavior of households.

    The point here was simply to recall a key object from optimization theory: the
    indirect utility function.
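A tiny numerical illustration of this point (a sketch only; the log utility u(c) = Σj ln(cj), the two goods, and the numbers I = 10, p = (1, 2) are assumptions made for the example, not taken from the text). With this utility, optimal spending is split equally across goods, and the numerical derivative of V with respect to income matches uj(c)/pj for every good j:

    import math

    def indirect_utility(I, p):
        # u(c) = sum_j ln(c_j): spending p_j*c_j is equalized, so c_j = I/(J*p_j)
        J = len(p)
        c = [I / (J * pj) for pj in p]
        return sum(math.log(cj) for cj in c), c

    I, p = 10.0, [1.0, 2.0]
    V, c = indirect_utility(I, p)
    dV_dI = (indirect_utility(I + 1e-6, p)[0] - V) / 1e-6    # marginal value of income
    print(dV_dI, [1.0 / (cj * pj) for cj, pj in zip(c, p)])  # all roughly equal to 0.2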
    2.2.2 Firms
    Suppose that a firm chooses how many workers to hire at a wage of w given its stock
    of capital, k, and product price, p. Thus the firm solves:
Π(w, p, k) = max_l  p f(l, k) − wl.
    This will yield a labor demand function which depends on (w, p, k). As with V (I, p),
    Π(w, p, k) summarizes the value of the firm given factor prices, the product price,
    p, and the stock of capital, k. Both the flexible and fixed factors could be vectors.
    Think of Π(w, p, k) as an indirect profit function. It completely summarizes the
    value of the optimization problem of the firm given (w, p, k).
As with the household's problem, given Π(w, p, k), we can directly compute the
marginal value of giving the firm some additional capital as Πk(w, p, k) = p fk(l, k)
    without knowing how the firm will adjust its labor input in response to the additional
    capital.
    But, is this all there is to know about the firm’s behavior? Surely not as we
    have not specified where k comes from. So the firm’s problem is essentially dynamic
    though the demand for some of its inputs can be taken as a static optimization
    problem. These are important themes in the theory of factor demand and we will
    return to them in our firm applications.

2.3 Dynamic Optimization: A Cake Eating Example
    Here we will look at a very simple dynamic optimization problem. We begin with a
    finite horizon and then discuss extensions to the infinite horizon.4
Suppose that you have a cake of size W1. At each point of time, t = 1, 2, 3, …, T,
    you can consume some of the cake and thus save the remainder. Let ct be your
    consumption in period t and let u(ct) represent the flow of utility from this con-
    sumption. The utility function is not indexed by time: preferences are stationary.
    Assume u(·) is real-valued, differentiable, strictly increasing and strictly concave.
    Assume that limc→0 u′(c) → ∞. Represent lifetime utility by
Σ_{t=1}^{T} β^(t−1) u(ct)
where 0 ≤ β ≤ 1 and β is called the discount factor.
    For now, assume that the cake does not depreciate (melt) or grow. Hence, the
    evolution of the cake over time is governed by:
    Wt+1 = Wt − ct (2.1)
for t = 1, 2, …, T. How would you find the optimal path of consumption, {ct}_1^T?
    2.3.1 Direct Attack
    One approach is to solve the constrained optimization problem directly. This is
    called the sequence problem by Stokey and Lucas (1989). Consider the problem
    of:
max_{{ct}_1^T, {Wt}_2^(T+1)}  Σ_{t=1}^{T} β^(t−1) u(ct)    (2.2)

subject to the transition equation (2.1), which holds for t = 1, 2, 3, …, T. Also, there
    are non-negativity constraints on consumption and the cake given by: ct ≥ 0 and
    Wt ≥ 0. For this problem, W1 is given.
    Alternatively, the flow constraints imposed by (2.1) for each t could be combined
    yielding:
Σ_{t=1}^{T} ct + W_{T+1} = W1.    (2.3)
    The non-negativity constraints are simpler: ct ≥ 0 for t = 1, 2, ..T and WT +1 ≥ 0.
    For now, we will work with the single resource constraint. This is a well-behaved
    problem as the objective is concave and continuous and the constraint set is compact.
    So there is a solution to this problem.6
    Letting λ be the multiplier on (2.3), the first order conditions are given by:
β^(t−1) u′(ct) = λ
    for t = 1, 2, …, T and
    λ = φ
    where φ is the multiplier on the non-negativity constraint on WT +1. The non-
    negativity constraints on ct ≥ 0 are ignored as we assumed that the marginal utility
    of consumption becomes infinite as consumption approaches zero within any period.
    Combining equations, we obtain an expression that links consumption across
    any two periods:
u′(ct) = β u′(ct+1).    (2.4)
    This is a necessary condition for optimality for any t: if it was violated, the agent
could do better by adjusting ct and ct+1. Frequently, (2.4) is referred to as an Euler
equation.
    To understand this condition, suppose that you have a proposed (candidate)
solution for this problem given by {c*t}_1^T, {W*t}_2^(T+1). Essentially, the Euler equation
    says that the marginal utility cost of reducing consumption by ε in period t equals
    the marginal utility gain from consuming the extra ε of cake in the next period,
    which is discounted by β. If the Euler equation holds, then it is impossible to
    increase utility by moving consumption across adjacent periods given a candidate
    solution.
    It should be clear though that this condition may not be sufficient: it does
    not cover deviations that last more than one period. For example, could utility be
increased by reducing consumption by ε in period t, saving the "cake" for two periods
    and then increasing consumption in period t+2? Clearly this is not covered by a
single Euler equation. However, by combining the Euler equation that holds across
periods t and t + 1 with the one that holds for periods t + 1 and t + 2, we can see that
    such a deviation will not increase utility. This is simply because the combination of
    Euler equations implies:
u′(ct) = β^2 u′(ct+2)
    so that the two-period deviation from the candidate solution will not increase utility.
    As long as the problem is finite, the fact that the Euler equation holds across all
    adjacent periods implies that any finite deviations from a candidate solution that
    satisfies the Euler equations will not increase utility.
    Is this enough? Not quite. Imagine a candidate solution that satisfies all of the
    Euler equations but has the property that WT > cT so that there is cake left over.
    This is clearly an inefficient plan: having the Euler equations holding is necessary
but not sufficient. Hence the optimal solution will satisfy the Euler equation for
each period and the agent will consume the entire cake!
Formally, this involves showing that the non-negativity constraint on WT+1 must
    bind. In fact, this constraint is binding in the above solution: λ = φ > 0. This
    non-negativity constraint serves two important purposes. First, in the absence of
    a constraint that WT +1 ≥ 0, the agent would clearly want to set WT +1 = −∞ and
    thus die with outstanding obligations. This is clearly not feasible. Second, the fact
    that the constraint is binding in the optimal solution guarantees that cake is not
    being thrown away after period T .
    So, in effect, the problem is pinned down by an initial condition (W1 is given)
    and by a terminal condition (WT +1 = 0). The set of (T − 1) Euler equations and
    (2.3) then determine the time path of consumption.
    Let the solution to this problem be denoted by VT (W1) where T is the horizon of
    the problem and W1 is the initial size of the cake. VT (W1) represents the maximal
    utility flow from a T period problem given a size W1 cake. From now on, we call this
    a value function. This is completely analogous to the indirect utility functions
    expressed for the household and the firm.
    As in those problems, a slight increase in the size of the cake leads to an increase
    in lifetime utility equal to the marginal utility in any period. That is,
V′_T(W1) = λ = β^(t−1) u′(ct),   t = 1, 2, …, T.
    It doesn’t matter when the extra cake is eaten given that the consumer is acting
    optimally. This is analogous to the point raised above about the effect on utility of
    an increase in income in the consumer choice problem with multiple goods.
    2.3.2 Dynamic Programming Approach
    Suppose that we change the above problem slightly: we add a period 0 and give an
initial cake of size W0. One approach to determining the optimal solution of this
augmented problem is to go back to the sequence problem and resolve it using this
    longer horizon and new constraint. But, having done all of the hard work with the
    T period problem, it would be nice not to have to do it again!
    Finite Horizon Problem
    The dynamic programming approach provides a means of doing so. It essentially
    converts a (arbitrary) T period problem into a 2 period problem with the appropriate
    rewriting of the objective function. In doing so, it uses the value function obtained
    from solving a shorter horizon problem.
    So, when we consider adding a period 0 to our original problem, we can take
    advantage of the information provided in VT (W1), the solution of the T period
    problem given W1 from (2.2). Given W0, consider the problem of
max_{c0}  u(c0) + β VT(W1)    (2.5)
    where
    W1 = W0 − c0; W0 given.
    In this formulation, the choice of consumption in period 0 determines the size of
    the cake that will be available starting in period 1, W1. So instead of choosing
    a sequence of consumption levels, we are just choosing c0. Once c0 and thus W1
    are determined, the value of the problem from then on is given by VT (W1). This
    function completely summarizes optimal behavior from period 1 onwards. For the
    purposes of the dynamic programming problem, it doesn’t matter how the cake will
    be consumed after the initial period. All that is important is that the agent will be
    acting optimally and thus generating utility given by VT (W1). This is the principle
of optimality, due to Richard Bellman, at work. With this knowledge, an optimal
decision can be made regarding consumption in period 0.
    Note that the first order condition (assuming that VT (W1) is differentiable) is
    given by:
u′(c0) = β V′_T(W1)
    so that the marginal gain from reducing consumption a little in period 0 is summa-
    rized by the derivative of the value function. As noted in the earlier discussion of
    the T period sequence problem,
V′_T(W1) = u′(c1) = β^t u′(ct+1)
    for t = 1, 2, …T − 1. Using these two conditions together yields
u′(ct) = β u′(ct+1),
    for t = 0, 1, 2, …T − 1, a familiar necessary condition for an optimal solution.
    Since the Euler conditions for the other periods underlie the creation of the
    value function, one might suspect that the solution to the T + 1 problem using
    this dynamic programming approach is identical to that from using the sequence
    approach.7 This is clearly true for this problem: the set of first order conditions
    for the two problems are identical and thus, given the strict concavity of the u(c)
    functions, the solutions will be identical as well.
    The apparent ease of this approach though is a bit misleading. We were able
    to make the problem look simple by pretending that we actually knew VT (W1). Of
    course, we had to solve for this either by tackling a sequence problem directly or by
    building it recursively starting from an initial single period problem.
    On this latter approach, we could start with the single period problem implying
V1(W1). We could then solve (2.5) to build V2(W1). Given this function, we could
move to a solution of the T = 3 problem and proceed iteratively, using (2.5) to build
    VT (W1) for any T .
    Example
    We illustrate the construction of the value function in a specific example. Assume
    u(c) = ln(c). Suppose that T = 1. Then V1(W1) = ln(W1).
    For T = 2, the first order condition from (2.2) is
    1/c1 = β/c2
    and the resource constraint is
    W1 = c1 + c2.
    Working with these two conditions:
    c1 = W1/(1 + β) and c2 = βW1/(1 + β).
From this, we can solve for the value of the 2-period problem:
    V2(W1) = ln(c1) + β ln(c2) = A2 + B2 ln(W1) (2.6)
    where A2 and B2 are constants associated with the two period problem. These
    constants are given by:
A2 = ln(1/(1 + β)) + β ln(β/(1 + β)),   B2 = (1 + β)
    Importantly, (2.6) does not include the max operator as we are substituting the
    optimal decisions in the construction of the value function, V2(W1).
    Using this function, the T = 3 problem can then be written as:
V3(W1) = max_{W2}  ln(W1 − W2) + β V2(W2)
    where the choice variable is the state in the subsequent period. The first order
    condition is:
1/c1 = β V′_2(W2).
Using (2.6) evaluated at a cake of size W2, we can solve for V′_2(W2), implying:
1/c1 = β B2/W2 = β/c2.
Here c2 is the consumption level in the second period of the three-period problem and
    thus is the same as the level of consumption in the first period of the two-period
    problem. Further, we know from the 2-period problem that
    1/c2 = β/c3.
    This plus the resource constraint allows us to construct the solution of the 3-period
    problem:
c1 = W1/(1 + β + β^2),  c2 = β W1/(1 + β + β^2),  c3 = β^2 W1/(1 + β + β^2).
    Substituting into V3(W1) yields
    V3(W1) = A3 + B3 ln(W1)
    where
A3 = ln(1/(1 + β + β^2)) + β ln(β/(1 + β + β^2)) + β^2 ln(β^2/(1 + β + β^2)),   B3 = (1 + β + β^2)
    This solution can be verified from a direct attack on the 3 period problem using
    (2.2) and (2.3).
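The recursive construction used in this example is easy to mimic numerically. A minimal Python sketch (β = 0.95 and the grid of cake sizes are illustrative choices) builds V2 and V3 by backward induction on a grid and compares them with the closed forms AT + BT ln(W1) at W1 = 1:

    import math

    beta = 0.95
    grid = [k / 500 for k in range(1, 1001)]     # cake sizes 0.002 .. 2.0

    def bellman_step(V_next):
        # One backward-induction step: V_T(W) = max_{W' < W} ln(W - W') + beta*V_{T-1}(W')
        V = {}
        for W in grid:
            options = [math.log(W - Wp) + beta * V_next[Wp] for Wp in grid if Wp < W]
            V[W] = max(options) if options else math.log(W)
        return V

    V1 = {W: math.log(W) for W in grid}          # one period left: eat the whole cake
    V2 = bellman_step(V1)
    V3 = bellman_step(V2)

    A2 = math.log(1 / (1 + beta)) + beta * math.log(beta / (1 + beta))
    A3 = (math.log(1 / (1 + beta + beta**2))
          + beta * math.log(beta / (1 + beta + beta**2))
          + beta**2 * math.log(beta**2 / (1 + beta + beta**2)))
    print(V2[1.0], A2 + (1 + beta) * math.log(1.0))
    print(V3[1.0], A3 + (1 + beta + beta**2) * math.log(1.0))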

    2.4 Some Extensions of the Cake Eating Problem
    Here we go beyond the T period problem to illustrate some ways to use the dynamic
    programming framework. This is intended as an overview and the details of the
    assertions and so forth will be provided below.
    2.4.1 Infinite Horizon
    Basic Structure
    Suppose that we consider the above problem and allow the horizon to go to infinity.
    As before, one can consider solving the infinite horizon sequence problem given by:
max_{{ct}_1^∞, {Wt}_2^∞}  Σ_{t=1}^{∞} β^t u(ct)
    along with the transition equation of
    Wt+1 = Wt − ct
for t = 1, 2, ….
    Specifying this as a dynamic programming problem,
V(W) = max_{c∈[0,W]}  u(c) + β V(W − c)
    for all W . Here u(c) is again the utility from consuming c units in the current
    period. V (W ) is the value of the infinite horizon problem starting with a cake of
    size W . So in the given period, the agent chooses current consumption and thus
    reduces the size of the cake to W ′ = W − c, as in the transition equation. We use
    variables with primes to denote future values. The value of starting the next period
with a cake of that size is then given by V(W − c), which is discounted at rate β < 1.

For this problem, the state variable is the size of the cake (W) that is given at the start of any period. The state completely summarizes all information from the past that is needed for the forward looking optimization problem. The control variable is the variable that is being chosen. In this case, it is the level of consumption in the current period, c. Note that c lies in a compact set. The dependence of the state tomorrow on the state today and the control today, given by W′ = W − c, is called the transition equation.

Alternatively, we can specify the problem so that instead of choosing today's consumption we choose tomorrow's state:

V(W) = max_{W′∈[0,W]}  u(W − W′) + β V(W′)    (2.7)

for all W. Either specification yields the same result. But choosing tomorrow's state often makes the algebra a bit easier so we will work with (2.7).

This expression is known as a functional equation and is often called a Bellman equation after Richard Bellman, one of the originators of dynamic programming. Note that the unknown in the Bellman equation is the value function itself: the idea is to find a function V(W) that satisfies this condition for all W. Unlike the finite horizon problem, there is no terminal period to use to derive the value function. In effect, the fixed point restriction of having V(W) on both sides of (2.7) will provide us with a means of solving the functional equation.

Note too that time itself does not enter into Bellman's equation: we can express all relations without an indication of time. This is the essence of stationarity.8 In fact, we will ultimately use the stationarity of the problem to make arguments about the existence of a value function satisfying the functional equation.

A final very important property of this problem is that all information about the past that bears on current and future decisions is summarized by W, the size of the cake at the start of the period. Whether the cake is of this size because we initially had a large cake and ate a lot or a small cake and were frugal is not relevant. All that matters is that we have a cake of a given size. This property partly reflects the fact that the preferences of the agent do not depend on past consumption. But, in fact, if this was the case, we could amend the problem to allow this possibility.

The next part of this chapter addresses the question of whether there exists a value function that satisfies (2.7). For now, we assume that a solution exists and explore its properties.

The first order condition for the optimization problem in (2.7) can be written as

u′(c) = β V′(W′).

This looks simple but what is the derivative of the value function? This seems particularly hard to answer since we do not know V(W). However, we use the fact that V(W) satisfies (2.7) for all W to calculate V′. Assuming that this value function is differentiable,

V′(W) = u′(c),

a result we have seen before. Since this holds for all W, it will hold in the following period, yielding:

V′(W′) = u′(c′).

Substitution leads to the familiar Euler equation:

u′(c) = β u′(c′).

The solution to the cake eating problem will satisfy this necessary condition for all W. The link from the level of consumption and next period's cake (the controls from the different formulations) to the size of the cake (the state) is given by the policy function:

c = φ(W),   W′ = ϕ(W) ≡ W − φ(W).
Using these in the Euler equation reduces the problem to these policy functions alone:

u′(φ(W)) = β u′(φ(W − φ(W)))

for all W. These policy functions are very important for applied research since they provide the mapping from the state to actions. When elements of the state as well as the action are observable, then these policy functions will provide the foundation for estimation of the underlying parameters.

An Example

In general, actually finding closed form solutions for the value function and the resulting policy functions is not possible. In those cases, we try to characterize certain properties of the solution and, for some exercises, we solve these problems numerically. However, as suggested by the analysis of the finite horizon examples, there are some versions of the problem we can solve completely.

Suppose then, as above, that u(c) = ln(c). Given the results for the T-period problem, we might conjecture that the solution to the functional equation takes the form of:

V(W) = A + B ln(W)

for all W. With this guess we have reduced the dimensionality of the unknown function V(W) to two parameters, A and B. But can we find values for A and B such that V(W) will satisfy the functional equation?

Taking this guess as given and using the special preferences, the functional equation becomes:

A + B ln(W) = max_{W′}  ln(W − W′) + β(A + B ln(W′))    (2.8)

for all W. After some algebra, the first-order condition implies:

W′ = ϕ(W) = [βB/(1 + βB)] W.

Using this in (2.8) implies:

A + B ln(W) = ln[W/(1 + βB)] + β(A + B ln[βBW/(1 + βB)])

for all W. Collecting terms into a constant and terms that multiply ln(W) and then imposing the requirement that the functional equation must hold for all W, we find that B = 1/(1 − β) is required for a solution. Given this, there is a complicated expression that can be used to find A. To be clear then, we have indeed guessed a solution to the functional equation. We know that because we can solve for (A, B) such that the functional equation holds for all W using the optimal consumption and savings decision rules. With this solution, we know that

c = W(1 − β),   W′ = βW.

Evidently, the optimal policy is to save a constant fraction of the cake and eat the remaining fraction.

Interestingly, the solution to B could be guessed from the solution to the T-horizon problems where

BT = Σ_{t=1}^{T} β^(t−1).

Evidently, B = lim_{T→∞} BT. In fact, we will be exploiting the theme that the value function which solves the infinite horizon problem is related to the limit of the finite solutions in much of our numerical analysis.

Here are some exercises that add some interesting elements to this basic structure. Both begin with finite horizon formulations and then progress to the infinite horizon problem.

Exercise 2.1 Suppose that utility in period t was given by u(ct, ct−1). How would you solve the T period problem with these preferences? Interpret the first order conditions. How would you formulate the Bellman equation for the infinite horizon version of this problem?

Exercise 2.2 Suppose that the transition equation was modified so that

Wt+1 = ρWt − ct

where ρ > 0 represents a return from the holding of cake inventories. How would
you solve the T period problem with this storage technology? Interpret the first order
conditions. How would you formulate the Bellman equation for the infinite horizon
    version of this problem? Does the size of ρ matter in this discussion? Explain.
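Before moving on, the guess-and-verify solution for the log-utility case above (V(W) = A + ln(W)/(1 − β), with policy W′ = βW) can be confirmed by value function iteration. A rough Python sketch (β = 0.9 and the grid are arbitrary illustrative choices):

    import math

    beta = 0.9
    grid = [k / 100 for k in range(1, 201)]      # cake sizes 0.01 .. 2.00
    V = {W: 0.0 for W in grid}                   # initial guess V_0 = 0

    for _ in range(200):                         # iterate V <- T(V)
        V_new = {}
        for W in grid:
            options = [math.log(W - Wp) + beta * V[Wp] for Wp in grid if Wp < W]
            V_new[W] = max(options) if options else math.log(W)
        V = V_new

    # Theory: V(W) = A + ln(W)/(1 - beta), i.e. slope 1/(1 - beta) in ln(W)
    slope = (V[1.6] - V[1.4]) / (math.log(1.6) - math.log(1.4))
    print(round(slope, 2), 1.0 / (1.0 - beta))   # both close to 10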
    2.4.2 Taste Shocks
    One of the convenient features of the dynamic programming problem is the simplicity
    with which one can introduce uncertainty.9 For the cake eating problem, the natural
    source of uncertainty has to do with the agent’s tastes. In other settings we will
    focus on other sources of uncertainty having to do with the productivity of labor or
    the endowment of households.
    To allow for variations in tastes, suppose that utility over consumption is given
    by:
    εu(c)
    where ε is a random variable whose properties we will describe below. The function
    u(c) is again assumed to be strictly increasing and strictly concave. Otherwise, the
    problem is the original cake eating problem with an initial cake of size W .
    In problems with stochastic elements, it is critical to be precise about the timing
    of events. Does the optimizing agent know the current shocks when making a
    decision? For this analysis, assume that the agent knows the value of the taste
    shock when making current decisions but does not know future values. Thus the
    agent must use expectations of future values of ε when deciding how much cake to
    eat today: it may be optimal to consume less today (save more) in anticipation of
    a high realization of ε in the future.
    For simplicity, assume that the taste shock takes on only two values: ε ∈ {εh, εl}
with εh > εl > 0. Further, we assume that the taste shock follows a first-order
Markov process,10 which means that the probability that a particular realization of ε

occurs in the current period depends only on the value of ε attained in the previous
period.11 For notation, let πij denote the probability that the value of ε goes from
state i in the current period to state j in the next period. For example, πlh is defined
by:
    πlh ≡ Prob(ε′ = εh|ε = εl)
    where ε′ refers to the future value of ε. Clearly πih + πil = 1 for i = h, l. Let Π be a
    2×2 matrix with a typical element πij which summarizes the information about the
    probability of moving across states. This matrix is naturally called a transition
    matrix.
    Given this notation and structure, we can turn to the cake eating problem. It
    is critical to carefully define the state of the system for the optimizing agent. In
    the nonstochastic problem, the state was simply the size of the cake. This provided
    all the information the agent needed to make a choice. When taste shocks are
    introduced, the agent needs to take this into account as well. In fact, the taste
    shocks provide information about current payoffs and, through the Π matrix, are
    informative about the future value of the taste shock as well.12
    Formally, the Bellman equation is:
V (W, ε) = max_{W ′} [ εu(W − W ′) + βEε′|εV (W ′, ε′) ]
    for all (W, ε) where W ′ = W − c as usual. Note that the conditional expectation is
    denoted here by Eε′|εV (W ′, ε′) which, given Π, is something we can compute.13
    The first order condition for this problem is given by:
    εu′(W − W ′) = βEε′|εV1(W ′, ε′)
    for all (W, ε). Using the functional equation to solve for the marginal value of cake,
    we find:

    εu′(W − W ′) = βEε′|ε[ε′u′(W ′ − W ′′)] (2.9)
    which, of course, is the stochastic Euler equation for this problem.
    The optimal policy function is given by
    W ′ = ϕ(W, ε)
    The Euler equation can be rewritten in these terms as:
εu′(W − ϕ(W, ε)) = βEε′|ε[ε′u′(ϕ(W, ε) − ϕ(ϕ(W, ε), ε′))]
    The properties of this policy function can then be deduced from this condition.
    Clearly both ε′ and c′ depend on the realized value of ε′ so that the expectation on
    the right side of (2.9) cannot be split into two separate pieces.
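Because ε takes only two values, the conditional expectation Eε′|ε in the Euler equation is simply a matrix–vector product with the transition matrix Π. A minimal sketch (Python; all numbers are illustrative placeholders, not values from the text):

```python
import numpy as np

# Transition matrix Pi: rows index today's state (l, h), columns tomorrow's state (l, h).
Pi = np.array([[0.7, 0.3],     # pi_ll, pi_lh  (illustrative values)
               [0.4, 0.6]])    # pi_hl, pi_hh
eps = np.array([0.8, 1.2])     # (eps_l, eps_h), also illustrative

# Suppose f_next collects eps' * u'(c') evaluated in each future state.
f_next = np.array([1.5, 2.4])  # placeholder numbers standing in for eps' * u'(c')

# E_{eps'|eps}[f(eps')] for each current state is just Pi @ f_next.
expected = Pi @ f_next
print("E[f | eps = eps_l] =", expected[0])
print("E[f | eps = eps_h] =", expected[1])
```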
    2.4.3 Discrete Choice
    To illustrate some of the flexibility of the dynamic programming approach, we build
    on this stochastic problem. Suppose the cake must be eaten in one period. Perhaps
    we should think of this as the wine drinking problem recognizing that once a good
    bottle of wine is opened, it should be consumed! Further, we modify the transition
    equation to allow the cake to grow (depreciate) at rate ρ.
    The problem is then an example of a dynamic, stochastic discrete choice problem.
This is an example of a family of problems called optimal stopping problems.14
    The common element in all of these problems is the emphasis on timing of a single
    event: when to eat the cake; when to take a job; when to stop school, when to stop
    revising a chapter, etc. In fact, for many of these problems, these choices are not
    once in a lifetime events and so we will be looking at problems even richer than the
    optimal stopping variety.

    Let V E(W, ε) and V N (W, ε) be the value of eating the size W cake now (E) and
    waiting (N ) respectively given the current taste shock, ε ∈ {εh, εl}. Then,
V E(W, ε) = εu(W )
and
V N (W, ε) = βEε′|εV (ρW, ε′),
where
V (W, ε) = max(V E(W, ε), V N (W, ε))
    for all (W, ε). To understand these terms, εu(W ) is the direct utility flow from
    eating the cake. Once the cake is eaten the problem has ended. So V E(W, ε) is just
    a one-period return. If the agent waits, then there is no cake consumption in the
    current period and next period the cake is of size (ρW ). As tastes are stochastic,
    the agent choosing to wait must take expectations of the future taste shock, ε′. The
    agent has an option next period of eating the cake or waiting further. Hence the
    value of having the cake in any state is given by V (W, ε), which is the value attained
    by maximizing over the two options of eating or waiting. The cost of delaying the
    choice is determined by the discount factor β while the gains to delay are associated
    with the growth of the cake, parameterized by ρ. Further, the realized value of ε
    will surely influence the relative value of consuming the cake immediately.
If ρ ≤ 1, then the cake doesn't grow. In this case, there is no gain from delay
when ε = εh. If the agent delays, then utility in the next period will be lower due to
discounting and, with probability πhl, the taste shock will switch from high to low.
So, waiting to eat the cake in the future will not be desirable. Hence,
V (W, εh) = V E(W, εh) = εhu(W )

    for all W .
    In the low ε state, matters are more complex. If β and ρ are sufficiently close
    to 1 then there is not a large cost to delay. Further, if πlh is sufficiently close to 1,
    then it is likely that tastes will switch from low to high. Thus it will be optimal not
to eat the cake in state (W, εl).15
    Here are some additional exercises.
    Exercise 2.3
    Suppose that ρ = 1. For a given β, show that there exists a critical level of
πlh, denoted by π̄lh, such that if πlh > π̄lh, then the optimal solution is for the agent
    to wait when ε = εl and to eat the cake when εh is realized.
    Exercise 2.4
    When ρ > 1, the problem is more difficult. Suppose that there are no variations
    in tastes: εh = εl = 1. In this case, there is a trade-off between the value of waiting
    (as the cake grows) and the cost of delay from discounting.
Suppose that ρ > 1 and u(c) = c^(1−γ)/(1 − γ). What is the solution to the optimal
stopping problem when βρ^(1−γ) < 1? What happens if βρ^(1−γ) > 1? What happens
when uncertainty is added?
    2.5 General Formulation
    Building on the intuition gained from this discussion of the cake eating problem,
    we now consider a more formal abstract treatment of the dynamic programming
    approach.16 We begin with a presentation of the non-stochastic problem and then
    add uncertainty to the formulation.

    2.5.1 Non-Stochastic Case
    Consider the infinite horizon optimization problem of an agent with a payoff function
    for period t given by σ̃(st, ct). The first argument of the payoff function is termed the
    state vector, (st). As noted above, this represents a set of variables that influences
    the agent’s return within the period but, by assumption, these variables are outside
    of the agent’s control within period t. The state variables evolve over time in a
    manner that may be influenced by the control vector (ct), the second argument of
    the payoff function. The connection between the state variables over time is given
    by the transition equation:
    st+1 = τ (st, ct).
    So, given the current state and the current control, the state vector for the subse-
    quent period is determined.
    Note that the state vector has a very important property: it completely summa-
    rizes all of the information from the past that is needed to make a forward-looking
    decision. While preferences and the transition equation are certainly dependent on
    the past, this dependence is represented by st: other variables from the past do
    not affect current payoffs or constraints and thus cannot influence current decisions.
    This may seem restrictive but it is not: the vector st may include many variables
    so that the dependence of current choices on the past can be quite rich.
    While the state vector is effectively determined by preferences and the transition
    equation, the researcher has some latitude in choosing the control vector. That
    is, there may be multiple ways of representing the same problem with alternative
    specifications of the control variables.
We assume that c ∈ C and s ∈ S. In some cases, the control is restricted to be
in a subset of C which depends on the state vector: c ∈ C(s). Finally assume that

    σ̃(s, c) is bounded for (s, c) ∈ S × C. 17
    For the cake eating problem described above, the state of the system was the
    size of the current cake (Wt) and the control variable was the level of consumption
    in period t, (ct). The transition equation describing the evolution of the cake was
    given by
    Wt+1 = Wt − ct.
Clearly the evolution of the cake is governed by the amount of current consumption.
An equivalent representation, as expressed in (2.7), is to consider the future size of
the cake as the control variable and then to simply write current consumption as
Wt − Wt+1.
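As a small illustration of these two equivalent representations, the sketch below (Python, purely illustrative) writes the cake problem's payoff and transition both ways and checks that the one-period payoffs coincide:

```python
import math

# Representation 1: control = consumption c, payoff sigma(s, c), transition tau(s, c).
def sigma(s, c):
    return math.log(c)           # u(c) = ln(c), as in the example above

def tau(s, c):
    return s - c                 # W' = W - c

# Representation 2 (compact form): control = future state s', payoff sigma2(s, s').
def sigma2(s, s_next):
    return math.log(s - s_next)  # consumption written as W - W'

s, c = 2.0, 0.5
print(sigma(s, c), sigma2(s, tau(s, c)))   # identical one-period payoffs
```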
    There are two final properties of the agent’s dynamic optimization problem worth
    specifying: stationarity and discounting. Note that neither the payoff nor the
transition equations depend explicitly on time. True, the problem is dynamic, but
    time per se is not of the essence. In a given state, the optimal choice of the agent
    will be the same regardless of “when” he optimizes. Stationarity is important both
    for the analysis of the optimization problem and for empirical implementation of
    infinite horizon problems. In fact, because of stationarity we can dispense with time
    subscripts as the problem is completely summarized by the current values of the
    state variables.
    The agent’s preferences are also dependent on the rate at which the future is
    discounted. Let β denote the discount factor and assume that 0 < β < 1. Then we can represent the agent’s payoffs over the infinite horizon as ∞∑ t=0 βtσ̃(st, ct) (2.10) One approach to optimization is then to maximize (2.10) through the choice of {ct} for t = 0, 1, 2, ... given s0 and subject to the transition equation. Let V (s0) be 31 the optimized value of this problem given the initial state. Alternatively, one can adopt the dynamic program approach and consider the following equation, called Bellman’s equation: V (s) = max c∈C(s) σ̃(s, c) + βV (s′) (2.11) for all s ∈ S, where s′ = τ (s, c). Here time subscripts are eliminated, reflecting the stationarity of the problem. Instead, current variables are unprimed while future ones are denoted by a prime (′). As in Stokey and Lucas (1989), the problem can be formulated as V (s) = max s′∈Γ(s) σ(s, s′) + βV (s′) (2.12) for all s ∈ S. This is a more compact formulation and we will use it for our presentation.18 Nonetheless, the presentations in Bertsekas (1976) and Sargent (1987) follow (2.11). Assume that S is a convex subset of �k. Let the policy function that determines the optimal value of the control (the future state) given the state be given by s′ = φ(s). Our interest is ultimately in the policy function since we generally observe the actions of agents rather than their levels of utility. Still, to determine φ(s) we need to ”solve” (2.12). That is, we need to find the value function that satisfies (2.12). It is important to realize that while the payoff and transition equations are primitive objects that models specify a priori, the value function is derived as the solution of the functional equation, (2.12). There are many results in the lengthy literature on dynamic programming prob- lems on the existence of a solution to the functional equation. Here, we present one set of sufficient conditions. The reader is referred to Bertsekas (1976), Sar- gent (1987) and Stokey and Lucas (1989) for additional theorems under alternative 32 assumptions about the payoff and transition functions.19 Theorem 1 Assume σ(s, c) is real-valued, continuous and bounded, 0 < β < 1 and the constraint set, Γ(s), is non-empty, compact-valued and continuous, then there exists a unique value function V (s) that solves (2.12) Proof: See Stokey and Lucas (1989),[Theorem 4.6]. Instead of a formal proof, we give an intuitive sketch. The key component in the analysis is the definition of an operator, commonly denoted as T, defined by: T (W )(s) = max s′∈Γ(s) σ(s, s′) + βW (s′) for all s ∈ S.20 So, this mapping takes a guess on the value function and, working through the maximization for all s, produces another value function, T (W )(s). Clear, any V (s) such that V (s) = T (V )(s) for all s ∈ S is a solution to (2.12). So, we can reduce the analysis to determining the fixed points of T (W ). The fixed point argument proceeds by showing the T (W ) is a contraction using a pair of sufficient conditions from Blackwell (1965). These conditions are: (i) monotonicity and (ii) discounting of the mapping T (V ). Monotonicity means that if W (s) ≥ Q(s) for all s ∈ S, then T (W )(s) ≥ T (Q)(s) for all s ∈ S. This property can be directly verified from the fact that T (V ) is generated by a maximization problem. So that if one adopts the choice of φQ(s) obtained from max s′∈Γ(s) σ(s, s′) + βQ(s′) for all s ∈ S. 
When the proposed value function is W (s) then: T (W )(s) = max s′∈Γ(s) σ(s, s′) + βW (s′) ≥ σ(s, φQ(s)) + βW (φQ(s)) ≥ σ(s, φQ(s)) + βQ(φQ(s)) ≡ T (Q)(s) 33 for all s ∈ S. Discounting means that adding a constant to W leads T (W ) to increase by less than this constant. That is, for any constant k, T (W + k)(s) ≤ T (W )(s) + βk for all s ∈ S where β ∈ [0, 1). The term discounting reflects the fact that β must be less than 1. This property is easy to verify in the dynamic programming problem: T (W + k) = max s′∈Γ(s) σ(s, s′) + β[W (s′) + k] = T (W ) + βk, for all s ∈ S since we assume that the discount factor is less than 1. The fact that T (W ) is a contraction allows us to take advantage of the contrac- tion mapping theorem.21 This theorem implies that: (i) there is a unique fixed point and (ii) this fixed point can be reached by an iteration process using an arbitrary initial condition. The first property is reflected in the theorem given above. The second property is used extensively as a means of finding the solution to (2.12). To better understand this, let V0(s) for all s ∈ S be an initial guess of the solution to (2.12). Consider V1 = T (V0). If V1 = V0 for all s ∈ S, then we have the solution. Else, consider V2 = T (V1) and continue iterating until T (V ) = V so that the functional equation is satisfied. Of course, in general, there is no reason to think that this iterative process will converge. However, if T (V ) is a contraction, as it is for our dynamic programming framework, then the V (s) that satisfies (2.12) can be found from the iteration of T (V0(s)) for any initial guess, V0(s). This procedure is called value function iteration and will be a valuable tool for applied analysis of dynamic programming problems. The value function which satisfies (2.12) may inherit some properties from the more primitive functions that are the inputs into the dynamic programming problem: the payoff and transition equations. As we shall see, the property of strict concavity is useful for various applications.22 The result is given formally by: Theorem 2 Assume σ(s, s′) is real-valued, continuous, concave and bounded, 0 < 34 β < 1, S is a convex subset of �kand the constraint set is non-empty, compact- valued, convex and continuous, then the unique solution to (2.12) is strictly concave. Further, φ(s) is a continuous, single-valued function. Proof: See Theorem 4.8 in Stokey and Lucas (1989). The proof of the theorem relies on showing that strict concavity is preserved by T (V ): i.e. if V (s) is strictly concave, then so is T (V (s)). Given that σ(s, c) is concave, let our initial guess of the value function be the solution to the one-period problem V0(s) ≡ max s′∈Γ(s) σ(s, s′). V0(s) will be strictly concave. Since T (V ) preserves this property, the solution to (2.12) will be strictly concave. As noted earlier, our interest is in the policy function. Note that from this theorem, there is a stationary policy function which depends only on the state vector. This result is important for econometric application since stationarity is often assumed in characterizing the properties of various estimators. The cake eating example relied on the Euler equation to determine some proper- ties of the optimal solution. However, the first-order condition from (2.12) combined with the strict concavity of the value function is useful in determining properties of the policy function. Beneveniste and Scheinkman (1979) provide conditions such that V (s) is differentiable (Stokey and Lucas (1989), Theorem 4.11). 
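As a numerical illustration of the iterative scheme just described, the sketch below (Python; log-utility cake eating on a discretized grid, with illustrative values) applies the Bellman operator T repeatedly to the arbitrary initial guess V0 = 0 and stops when successive iterates are close; the sup-norm gap shrinks at roughly the rate β, as the contraction property implies.

```python
import numpy as np

beta = 0.9
grid = np.linspace(1e-3, 1.0, 200)           # discretized state space for W

def T(V):
    """One application of the Bellman operator on the grid."""
    TV = np.empty_like(V)
    for i, W in enumerate(grid):
        feasible = grid[grid < W]            # candidate W' (positive consumption required)
        if feasible.size == 0:
            TV[i] = np.log(W)                # eat everything if no interior choice on the grid
            continue
        values = np.log(W - feasible) + beta * V[: feasible.size]
        TV[i] = values.max()
    return TV

V = np.zeros_like(grid)                      # arbitrary initial guess V0 = 0
for it in range(1, 201):
    V_new = T(V)
    gap = np.max(np.abs(V_new - V))
    V = V_new
    if gap < 1e-6:
        print(f"converged after {it} iterations, sup-norm gap {gap:.1e}")
        break
```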
In our dis- cussion of applications, we will see arguments that use the concavity of the value function to characterize the policy function. 35 2.5.2 Stochastic Dynamic Programming While the non-stochastic problem is perhaps a natural starting point, in terms of applications it is necessary to consider stochastic elements. Clearly the stochas- tic growth model, consumption/savings decisions by households, factor demand by firms, pricing decisions by sellers, search decisions all involve the specification of dynamic stochastic environments. Further, empirical applications rest upon shocks that are not observed by the econometrician. In many applications, the researcher appends a shock to an equation prior to estimation without being explicit about the source of the error term. This is not consistent with the approach of stochastic dynamic programming: shocks are part of the state vector of the agent. Of course, the researcher may not observe all of the variables that influence the agent and/or there may be measurement error. Nonetheless, being explicit about the source of error in empirical applications is part of the strength of this approach. While stochastic elements can be added in many ways to dynamic programming problems, we consider the following formulation which is used in our applications. Letting ε represent the current value of a vector of ”shocks”; i.e. random variables that are partially determined by nature. Let ε ∈ Ψ which is assumed to be a finite set.23 Then using the notation developed above, the functional equation becomes: V (s, ε) = max s′∈Γ(s,ε) σ(s, s′, ε) + βEε′|εV (s ′, ε′) (2.13) for all (s, ε). Further, we have assumed that the stochastic process itself is purely exogenous as the distribution of ε′ depends on ε but is independent of the current state and control. Note too that the distribution of ε′ depends on only the realized value of ε : i.e. ε follows a first-order Markov process. This is not restrictive in the sense 36 that if values of shocks from previous periods were relevant for the distribution of ε′, then they could simply be added to the state vector. Finally, note that the distribution of ε′ conditional on ε, written as ε′|ε, is time invariant. This is analogous to the stationarity properties of the payoff and transition equations.. In this case, the conditional probability of ε′|ε are characterized by a transition matrix, Π. The element πij of this matrix is defined as: πij ≡ Prob(ε′ = εj|ε = εi) which is just the likelihood that εj occurs in the next period, given that εi occurs today. Thus this transition matrix is used to compute the transition probabilities in (2.13). Throughout we assume that πij ∈ (0, 1) and ∑ j πij = 1 for each i. With this structure: Theorem 3 If σ(s, s′, ε) is real-valued, continuous, concave and bounded, 0 < β < 1 and the constraint set is compact and convex, then: 1. there exists a unique value function V (s, ε) that solves (2.13) 2. there exists a stationary policy function, φ(s, ε). Proof: As in the proof of Theorem 2, this is a direct application of Blackwell’s Theorem. That is, with β < 1, discounting holds. Likewise, monotonicity is imme- diate as in the discussion above. See also the proof of Proposition 2 in Bertsekas (1976), Chp. 6. The first-order condition for (2.13) is given by: σs′(s, s ′, ε) + βEε′|εVs′(s ′, ε′) = 0. (2.14) Using (2.13) to determine Vs′(s ′, ε′) yields an Euler equation: σs′(s, s ′, ε) + βEε′|εσs′(s ′, s′′, ε′) = 0. (2.15) 37 This Euler equation has the usual interpretation. 
The expected sum of the effects of a marginal variation in the control in the current period (s) must be zero. So, if there is a marginal gain in the current period, this, in expectation, is offset by a marginal loss in the next period. Put differently, if a policy is optimal, there should be no variation in the value of the current control that will, in expectation, make the agent better off. Of course, ex post (after the realization of ε′), there may have been better decisions for the agent and, from the vantage point of hindsight, mistakes were made. That is σs′(s, s ′, ε) + βσs′(s ′, s′′, ε′) = 0. (2.16) will surely not hold for all realizations of ε′. Yet, from the ex ante optimization we know that these ex post errors were not predicable given the information available to the agent. As we shall see, this is a powerful insight that underlies the estimation of models based upon a stochastic Euler equation such as 2.15. Yet, as illustrated in many applications, the researcher may be unable to summarize conditions for optimality through an Euler equation. In these cases, characterizing the policy function directly is required. 2.6 Conclusion The theory of dynamic programming is a cornerstone of this book. The point of this chapter is to introduce researchers to some of the insights of this vast literature and some of the results we will find useful in our applications. As mentioned earlier, this chapter has been specifically directed to provide theoretical structure for the dynamic optimization problems we will confront in this book. Of course, versions of these results hold in much more general circumstances. Again the reader is urged 38 to study Bertsekas (1976), Sargent (1987) and Stokey and Lucas (1989) for a more complete treatment of this topic. Chapter 3 Numerical Analysis 3.1 Overview This chapter reviews numerical methods used to solve dynamic programming prob- lems. This discussion provides a key link between the basic theory of dynamic programming and the empirical analysis of dynamic optimization problems. The need for numerical tools arises from the fact that generally dynamic programming problems do not possess tractable closed form solutions. Hence, techniques must be used to approximate the solutions of these problems. We present a variety of techniques in this chapter which are subsequently used in the macroeconomic ap- plications studied in Part II of this book. The presentation starts by solving a stochastic cake eating problem using a procedure called value function iteration. This same example is then used to illustrate alternative methods that operate on the policy function rather than the value function. Finally, a version of this problem is studied to illustrate the solution to dynamic, discrete choice problems. The appendix and the web page for this book contain the programs used in this chapter. The applied researcher may find these useful templates for solving other 39 40 problems. In section 3.A in the appendix, we present several numerical tools such as numerical integration or interpolation techniques, which are useful when using numerical methods. A number of articles and books have been devoted to numerical programming. For a more complete description, we refer the reader to Judd (1998), Amman et al. (1996), Press et al. (1986) or Taylor and Uhlig (1990). 
3.2 Stochastic Cake Eating Problem We start with the stochastic cake eating problem defined by: V (W, y) = max 0≤c≤W +y u(c) + βEy′|yV (W ′, y′) for all (W, y) with W ′ = R(W − c + y) (3.1) Here there are two state variables: W , the size of the cake brought into the current period, and y, the stochastic endowment of additional cake. This is an example of a stochastic dynamic programming problem from the framework in (2.5.2). We begin by analyzing the simple case where the endowment is iid: the shock today does not give any information on the shock tomorrow. In this case, the consumer only cares about the total amount which can be potentially eaten, X = W + y, and not the particular origin of any piece of cake. In this problem, there is only one state variable X. We can rewrite the problem as: V (X) = max 0≤c≤X u(c) + βEy′V (X ′) for all X with X′ = R(X − c) + y′ (3.2) If the endowment is serially correlated, then the agent has to keep track of any variables which allow him to forecast future endowment. The state space, will include X but also current and maybe past realizations of endowments. We present such a case in section 3.3 where we study a discrete cake eating problem. Chapter 6.1 also presents the continuous cake eating problem with serially correlated shocks. 41 The control variable is c, the level of current consumption. The size of the cake evolves from one period to the next according to the transition equation. The goal is to evaluate the value V (X) as well as the policy function for consumption, c(X). 3.2.1 Value Function Iterations This method works from the Bellman equation to compute the value function by backward iterations on an initial guess. While sometimes slower than competing methods, it is trustworthy in that it reflects the result, stated in Chapter 2, that (under certain conditions)the solution of the Bellman equation can be reached by iterating the value function starting from an arbitrary initial value. We illustrate this approach here in solving (3.2).24 In order to program value function iteration, there are several important steps: 1. choosing a functional form for the utility function. 2. discretizing the state and control variable. 3. building a computer code to perform value function iteration 4. evaluating the value and the policy function. We discuss each steps in turn. These steps are indicated in the code for the stochastic cake eating problem. Functional Form and Parameterization We need to specify the utility function. This is the only known primitive function in (3.2): recall that the value function is what we are solving for! The choice of this function depends on the problem and the data. The consumption literature has often worked with a constant relative risk aversion (CRRA) function: u(c) = c(1−γ) 1 − γ 42 The vector θ will represent the parameters. For the cake eating problem (γ, β) are both included in θ. To solve for the value function, we need to assign particular values to these parameters as well as the exogenous return R. For now, we assume that βR = 1 so that the growth in the cake is exactly offset by the consumers discounting of the future. The specification of the functional form and its parame- terization are given in Part I of the accompanying Matlab code for the cake eating problem. State and Control Space We have to define the space spanned by the state and the control variables as well as the space for the endowment shocks. For each problem, specification of the state space is important. 
The computer cannot literally handle a continuous state space, so we have to approximate this continuous space by a discrete one. While the approximation is clearly better if the state space is very fine (i.e. has many points), this can be costly in terms of computation time. Thus there is a trade-off involved. For the cake eating problem, suppose that the cake endowment can take two values, low (yL) and high (yH ). As the endowment is assumed to follow an iid process, denote the probability a shock yi by πi, for i = L, H. The probability of transitions can be stacked in a transition matrix: π = [ πL πH πL πH ] with πL + πH = 1 In this discrete setting, the expectation in (3.2) is just a weighted sum, so that the Bellman equation can be simply rewritten: V (X) = max 0≤c≤X u(c) + β ∑ i=L,H πiV (R(X − c) + yi) for all X For this problem, it turns out that the natural state space is given by:[X̄L, X̄H ]. This choice of the state space is based upon the economics of the problem, which 43 will be understood more completely after studying household consumption choices. Imagine though that endowment was constant at a level yi for i = L, H. Then, given the assumption βR = 1, the cake level of the household will (trust us) eventually settle down to X̄i, for i = L, H. Since the endowment is stochastic and not constant, consumption and the size of the future cake will vary with realizations of the state variable, X, but it turns out that X will never leave this interval. The fineness of the grid is simply a matter of choice too. In the program, let ns be the number of elements in the state space. The program simply partitions the interval [X̄L, X̄H ] into ns elements. In practice, the grid is usually uniform, with the distance between two consecutive elements being constant. 25 Call the state space ΨS and let is be an index: ΨS = {Xis}nsis=1 with X1 = X̄L, Xns = X̄H The control variable, c, takes values in [X̄L, X̄H ]. These are the extreme levels of consumption given the state space for X. We discretize this space into a nc size grid, and call the control space ΨC = {cic}ncic=1. Value Function Iteration and Policy Function Here we must have a loop for the mapping T (v(X)) defined as T (v(X)) = max c u(c) + β ∑ i=L,H πivj (R(X − c) + yi) . (3.3) Here v(X) represents a candidate value function, that is a proposed solution to (3.2). If T (v(X)) = v(X), then indeed v(X) is the unique solution to (3.2). Thus the solution to the dynamic programming problem is reduced to finding a fixed point of the mapping T (v(X)). Starting with an initial guess v0(X), we compute a sequence of value functions 44 vj(X): vj+1(X) = T (vj(X)) = max c u(c) + β ∑ i=L,H πivj (R(X − c) + yi) . The iterations are stopped when |vj+1(X) − vj(X)| < �, ∀is, where � is a small number. As T (.) is a contraction mapping (see chapter 2), the initial guess v0(X) does not have any influence on the convergence to the fixed point, so that one can choose for instance v0(X) = 0. However, finding a good guess for v0(X) helps to decrease the computing time. Using the contraction mapping property, it can be shown that the convergence rate is geometric, parameterized by the discount rate β. We now review in more detail how the iteration is done in practice. At each iteration, the values vj(X) are stored in a nsx1 matrix: V =   vj(X 1) ... vj(X is ) ... vj(X ns )   To compute vj+1, we start by choosing a particular size for the total amount of cake at the start of the period, Xis . 
We then search among all the points in the control space ΨC for the one that maximizes u(c) + βEvj(X ′). Let’s denote it ci ∗ c . This involves finding next period’s value, vj(R(X is − ci∗c ) + yi), i = L, H. With the assumption of a finite state space, we look for the value vj(.) at the point nearest to R(Xis − ci∗c ) + yi. Once we have calculated the new value for vj+1(Xis ), we can proceed to compute similarly the value vj+1(.) for other sizes of the cake and other endowment at the start of the period. These new values are then stacked in V. Figure 3.1 gives a detailed example of how this can be programmed on a computer. (Note that the code is not written in a particular computer language, so one has to adapt the code to the appropriate syntax. The code for the value function iteration piece is Part III of the Matlab code. ) 45 [Figure 3.1 approximately here] Once the value function iteration piece of the program is completed, the value function can be used to find the policy function, c = c(X). This is done by collecting all the optimal consumption value, cic∗ for each value of Xis . Here again, we only know the function c(X) at the points of the grid. We can use interpolating methods to evaluate the policy function at other points. The value function and the policy function are displayed in Figures 3.2 and 3.3 for particular values of the parameters. [Figure 3.2 approximately here] [Figure 3.3 approximately here] As discussed above, approximating the value function and the policy rules by a finite state space requires a large number of points on this space (ns has to be big). This is often very time consuming in terms of numerical calculations. One can reduce the number of points on the grid, while keeping a satisfactory accuracy by using interpolations on this grid. When we evaluated the function vj(R(X is − ci∗c ) + yi), i = L, H, we used the nearest value on the grid to approximate R(X is −ci∗c )+yi. With a small number of points on the grid, this can be a very crude approximation. The accuracy of the computation can be increased by interpolating the function vj(.) (see section 3.A.1 for more details). The interpolation is based on the values in V. 3.2.2 Policy Function Iterations The value function iteration method can be rather slow, as it converges at a rate β. Researchers have devised other methods which can be faster to compute the solution to the Bellman equation in an infinite horizon. The policy function iteration, also 46 known as Howard’s improvement algorithm, is one of these. We refer the reader to Judd (1998) or Ljungqvist and Sargent (2000) for further details. This method starts with a guess for the policy function, in our case c0(X). This policy function is then used to evaluate the value of using this rule forever: V0(X) = u(c0(X)) + β ∑ i=L,H πiV0 (R(X − c0(X)) + yi) for all X. This ”policy evaluation step” requires solving a system of linear equations, given that we have approximated R(X − c0(X)) + yi by an X on our grid. Next, we do a ”policy improvement step” to compute c1(X) as: c1(X) = argmax c [ u(c) + β ∑ i=L,H πiV0 (R(X − c) + yi) ] for all X. Given this new rule, the iterations are continued to find V1(), c2(), . . ., cj+1() until |cj+1(X) − cj(X)| is small enough. The convergence rate is much faster than the value function iteration method. However, solving the ”policy evaluation step” can be in some cases very time consuming, especially when the state space is large. 
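To make the loop just described concrete, here is a minimal sketch (Python; illustrative parameter values and grid bounds, not the book's Matlab code) of value function iteration for the iid version of the stochastic cake eating problem (3.2), using CRRA utility, a two-point endowment, and the nearest-grid-point approximation for next period's value:

```python
import numpy as np

gamma, beta = 2.0, 0.95
R = 1.0 / beta                                   # beta * R = 1, as assumed in the text
y_vals = np.array([0.8, 1.2])                    # iid endowment values (illustrative)
probs = np.array([0.5, 0.5])

def u(c):
    return c ** (1 - gamma) / (1 - gamma)        # CRRA utility

X_grid = np.linspace(0.4, 8.0, 120)              # grid for cash on hand X (illustrative bounds)

def nearest(values):
    """Index of the grid point closest to each entry of `values`."""
    return np.abs(X_grid[None, :] - values[:, None]).argmin(axis=1)

V = np.zeros_like(X_grid)
for _ in range(1500):
    V_new = np.empty_like(V)
    policy = np.empty_like(V)
    for i, X in enumerate(X_grid):
        c = np.linspace(1e-6, X, 100)                          # candidate consumption levels
        X_next = R * (X - c)[:, None] + y_vals[None, :]        # next period cash on hand
        EV = (V[nearest(X_next.ravel())].reshape(X_next.shape) * probs).sum(axis=1)
        values = u(c) + beta * EV
        j = values.argmax()
        V_new[i], policy[i] = values[j], c[j]
    gap = np.max(np.abs(V_new - V))
    V = V_new
    if gap < 1e-5:
        break
# `policy` now approximates c(X) on the grid; `V` approximates the value function.
```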
Once again, the computation time is much reduced if the initial guess c0(X) is close to the true policy rule c(X). 3.2.3 Projection Methods These methods compute directly the policy function without calculating the value functions. They use the first order conditions (Euler equation) to back out the policy rules. The continuous cake problem satisfies the first order Euler equation: u′(ct) = Etu ′(ct+1) if the desired consumption level is less than the total resources X = W + y. If there is a corner solution, then the optimal consumption level is c(X) = X. Taking into account the corner solution, we can rewrite the Euler equation as: u′(ct) = max[u ′(Xt), Etu ′(ct+1)] 47 We know that, under the iid assumption, the problem has only one state variable, X, so that the consumption function can be written c = c(X). As we consider the stationary solution, we drop the subscript t in the next . The Euler equation can be reformulated as: u′ ( c(X) ) − max [ u′(X), Ey′u ′ ( c ( R(X − c(X)) + y′ ))] = 0 (3.4) or F (c(X)) = 0 (3.5) The goal is to find an approximation ĉ(X) of c(X), for which (3.5) is approximately satisfied. The problem is thus reduced to find the zero of F , where F is an operator over function spaces. This can be done with a minimizing algorithm. There are two issues to resolve. First, we need to find a good approximation of c(X). Second, we have to define a metric to evaluate the fit of the approximation. Solving for the Policy Rule [Figure 3.4 approximately here] Let {pi(X)} be a base of the space of continuous functions and let Ψ = {ψi} be a set of parameters. We can approximate c(X) by ĉ(X, Ψ) = n∑ i=1 ψipi(X) There is an infinite number of bases to chose from. A simple one is to consider polynomials in X, so that ĉ(X, Ψ) = ψ0 + ψ1X + ψ2X 2 + .... Although this is an intuitive choice, this is not usually the best one. In the function space, this base is not an orthogonal base, which means that some elements tend to be collinear. Orthogonal bases will yield more efficient and precise results. 26 The chosen base should be computationally simple. Its elements should ”look like” the function to approximate, so that the function c(X) can be approximated with a small number 48 of base functions. Any knowledge of the shape of the policy function will be to a great help. If, for instance this policy function has a kink, a method based only on a series of polynomials will have a hard time fitting it. It would require a large number of powers of the state variable to come somewhere close to the solution. Having chosen a method to approximate the policy rule, we now have to be more precise about what ”bringing F (ĉ(X, Ψ)) close to zero” means. To be more specific, we need to define some operators on the space of continuous functions. For any weighting function g(x), the inner product of two integrable functions f1 and f2 on a space A is defined as: 〈f1, f2〉 = ∫ A f1(x)f2(x)g(x)dx (3.6) Two functions f1 and f2 are said to be orthogonal, conditional on a weighting function g(x), if 〈f1, f2〉 = 0. The weighting function indicates where the researcher wants the approximation to be good. We are using the operator 〈., .〉 and the weighting function to construct a metric to evaluate how close F (ĉ(X, Ψ)) is to zero. This will be done by solving for Ψ such that 〈F (ĉ(X, Ψ)), f (X)〉 = 0 where f (X) is some known function. We next review three methods which differs in their choice for this function f (X). First, a simple choice for f (X) is simply F (ĉ(X, Ψ)) itself. 
This defines the least square metric as: min Ψ 〈F (ĉ(X, Ψ)), F (ĉ(X, Ψ))〉 The collocation method detailed in section 3.2.3 chose to find Ψ as min Ψ 〈F (ĉ(X, Ψ)), δ(X − Xi)〉 i = 1, . . . , n where δ(X − Xi) is the mass point function at point Xi, i.e. δ(X) = 1 if X = Xi 49 and δ(X) = 0 elsewhere. Another possibility is to define min Ψ 〈F (ĉ(X, Ψ)), pi(X)〉 i = 1, . . . , n where pi(X) is a base of the function space. This is called the Galerkin method. An application of this method can be seen in section 3.2.3, where the base is taken to be ”tent” functions. Figure 3.4 displays some element of a computer code which calculates the residual function F (ĉ(X, Ψ)) when the consumption rule is approximated by a second order polynomial. This can then be used in one of the proposed methods. Collocation Methods Judd (1992) presents in more details this method applied to the growth model. The function c(X) is approximated using Chebyshev polynomials. These polynomials are defined on the interval [0, 1] and take the form: pi(X) = cos(i arccos(X)) X ∈ [0, 1], i = 0, 1, 2, . . . For i = 0, this polynomial is a constant. For i = 1, the polynomial is equal to X. As these polynomials are only defined on the [0, 1] interval, one can usually scale the state variables appropriately. 27 The policy function can then be expressed as: ĉ(X, Ψ) = n∑ i=1 ψipi(X) Next, the method find Ψ which minimizes 〈F (ĉ(X, Ψ)), δ(X − Xi)〉 i = 1, . . . n where δ() is the mass point function. Hence, the method requires that F (ĉ(X, Ψ)) is zero at some particular points Xi and not over the whole range [X̄L, X̄H ]. The method is more efficient if these points are chosen to be the zeros of the basis 50 elements pi(X), here Xi = cos(π/2i). In this case the method is referred to as an orthogonal collocation method. Ψ is the solution to a system of nonlinear equations: F (ĉ(Xi, Ψ)) = 0 i = 1, . . . n This method is good at approximating policy functions which are relatively smooth. A draw back with this method is that the Chebyshev polynomials tends to display oscillations at higher orders. The resulting policy function c(X) will also tend to display wriggles. There is no particular rule for choosing n, the highest order of the Chebyshev polynomial. Obviously, the higher n is the better the approximation, but this comes at an increased cost of computation. Finite Element Methods McGrattan (1996) illustrates the finite element method with the stochastic growth model (see also Reddy (1993) for a more in-depth discussion on finite elements). To start, the state variable X is discretized over a grid {Xis}nsis=1. The finite element method is based on the following functions: pis (X) =   X − Xis−1 Xis − Xis−1 if X ∈ [X is−1, Xis ] Xis+1 − X Xis+1 − Xis if X ∈ [X is , Xis+1] 0 elsewhere The function pis (X) is a very simple function which is in [0,1], as illustrated in Figure 3.5. This is in fact a simple linear interpolation (and an order two spline, see section 3.A.1 for more details on these techniques). On the interval [Xis , Xis+1], the function ĉ(X) is equal to the weighted sum of pis (X) and pis+1(X). Here the residual function satisfies 〈F (ĉ(X, Ψ)), pi(X)〉 = 0 i = 1, . . . n 51 or equivalently, choosing a constant weighting function:∫ X̄ 0 pis (X)F (ĉ(X))dX = 0 is = 1, . . . , ns This gives a system with ns equations and ns unknowns, {ψis}nsis=1. This non-linear system can be solved to find the weights {ψis}. 
To solve the system, the integral can be computed numerically using numerical techniques, see Appendix 3.A.2 for more details. As in the collocation method, the choice of ns is the result of a trade-off between increased precision and higher computational burden. [Figure 3.5 approximately here] 3.3 Stochastic Discrete Cake Eating Problem We present here another example of a dynamic programming model. It differs from the one presented in section 3.2 in two ways. First, the decision of the agent is not continuous (how much to eat) but discrete (eat or wait). Second, the problem has two state variables as the exogenous shock is serially correlated. The agent is endowed with a cake of size W . At each period, the agent has to decide whether to eat the cake entirely or not. If not eaten, the cake shrinks by a factor ρ each period. The agent also experiences taste shocks, possibly serially correlated and which follows an autoregressive process of order one. The agent observes the current taste shock at the beginning of the period, before the decision to eat the cake is taken. However, the future shocks are unobserved by the agent, introducing a stochastic element into the problem. Although the cake is shrinking, the agent might decide to postpone the consumption decision until a period with a better realization of the taste shock. The program of the agent can be written in the form: V (W, �) = max[�u(W ), βE�′|�V (ρW, � ′)] (3.7) 52 where V (W, ε) is the intertemporal value of a cake of size W conditional of the realization ε of the taste shock. Here E�′ denotes the expectation with respect to the future shock �, conditional on the value of �. The policy function is a function d(W, ε) which takes a value of zero if the agent decides to wait or one if the cake is eaten. We can also define a threshold ε∗(W ) such that:  d(W, ε) = 1 if ε > ε∗(W )
    d(W, ε) = 0 otherwise
As in section 3.2, the problem can be solved by value function iterations. How-
ever, as the problem is discrete, we cannot use the projection technique, since the decision
rule is not a smooth function but a step function.
    3.3.1 Value Function Iterations
As before, we first have to define the functional form for the utility function, and we
    need to discretize the state space. If we consider ρ < 1, the cake shrinks with time and W is naturally bounded between W̄ , the initial size and 0. In this case, the size of the cake takes only values equal to ρtW̄ , t ≥ 0. Hence, ΨS = {ρiW̄ } is a judicious choice for the state space. Contrary to an equally spaced grid, this choice ensures that we do not need to interpolate the value function outside of the grid points. Next, we need to discretize the second state variable, ε. The shock is supposed to come from a continuous distribution and follows an autoregressive process of order one. We discretize ε in I points {εi}Ii=1 following a technique presented by Tauchen (1986) and summarized in appendix 3.A.2. In fact, we approximate an autoregressive process by a markov chain. The method determines the optimal discrete points {εi} and the transition matrix πij = Prob(εt = εi|εt−1 = εj) such that the markov chain mimics the AR(1) process. Of course, the approximation is only good if I is big enough. 53 In the case where I = 2, we have to determine two grid points �L and �H . The probability that a shock �L is followed by a shock �H is denoted by πLH . The probability of transitions can be stacked in a transition matrix: π = [ πLL πLH πHL πHH ] with the constraints that the probability of reaching either a low or a high state next period is equal to one: πLL + πLH = 1 and πHL + πHH = 1. For a given size of the cake W is = ρis W̄ and a given shock �j, j = L or H, it is easy to compute the first term �ju(ρ is W̄ ). To compute the second term we need to calculate the expected value of tomorrow’s cake. Given a guess for the value function of next period, v(., .) the expected value is: E�′|�j v(ρ is+1W̄ ) = πjLv(ρ is+1W̄ , �L) + πjH v(ρ is+1W̄ , �H ) The recursion is started backward with an initial guess for V (., .). For a given state of the cake Wis and a given shock εj, the new value function is calculated from equation (3.7). The iterations are stopped when two successive value functions are close enough. In terms of numerical computing, the value function is stored as a matrix V of size nW xnε where nW and nε are the number of points on the grid for W and ε. At each iteration, the matrix is updated with the new guess for the value function. Figure 3.6 displays an example of a computer code which computes the value function vj+1(W, ε) given the value vj(W, ε). [Figure 3.6 approximately here] Given the way we have computed the grid, the next period value is simple to compute as it is given by V[is − 1, .]. This rule is valid if is > 1. Computing V[1, .]
    will be more of a problem. One can use an extrapolation method to approximate
    the values, given the knowledge of V[is, .], is > 1.

    Figure 3.7 displays the value function for particular parameters. The utility
function was taken to be u(c, ε) = ln(εc), and ln(ε) is assumed to follow an AR(1)
process with mean zero, autocorrelation ρε = 0.5, and an unconditional variance
of 0.2. We have discretized ε into 4 grid points.
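That discretization follows Tauchen (1986), whose exact integral formulas are given in the appendix. As a rough, simulation-based stand-in (Python, illustrative only), the grid points and transition matrix can be recovered from a long simulated AR(1) path and equal-probability bins:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, N, T = 0.5, 4, 200_000
sigma_u = np.sqrt(0.2 * (1 - rho**2))     # innovation s.d. implied by unconditional variance 0.2

# Simulate a long path eps_t = rho * eps_{t-1} + u_t (mean zero), standing in for ln(eps).
u = sigma_u * rng.standard_normal(T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]

# Equal-probability bins: cut-off points at the 1/N, ..., (N-1)/N quantiles.
cuts = np.quantile(eps, np.arange(1, N) / N)
bins = np.digitize(eps, cuts)             # bin index 0..N-1 for each observation

z = np.array([eps[bins == i].mean() for i in range(N)])    # conditional means z_i

# Empirical transition matrix Pi[i, j] = Prob(bin j tomorrow | bin i today).
Pi = np.zeros((N, N))
for i in range(N):
    nxt = bins[1:][bins[:-1] == i]
    Pi[i] = np.bincount(nxt, minlength=N) / nxt.size

print("grid points:", np.round(z, 3))
print("transition matrix:\n", np.round(Pi, 3))
```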
    [Figure 3.7 approximately here]
    Figure 3.8 displays the decision rule, and the function ε∗(W ). This threshold
    was computed as the solution of:
u(W, ε∗(W )) = βEε′|εV (ρW, ε′)
which is the value of the taste shock that makes the agent indifferent between
waiting and eating, given the size of the cake W .
    [Figure 3.8 approximately here]
    We return later in this book to examples of discrete choice models. In particular,
we refer the reader to the models presented in sections 8.5 and 7.3.3.
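A minimal numerical sketch of the procedure just described (Python; a two-value iid taste shock and illustrative parameters rather than the four-point AR(1) used for the figures), which also backs out the threshold ε∗(W):

```python
import numpy as np

beta, rho, Wbar = 0.95, 0.95, 2.0
eps_vals = np.array([0.8, 1.2])           # two taste shocks, iid with equal probability (illustrative)
probs = np.array([0.5, 0.5])
K = 120                                   # cake sizes on the grid rho^k * Wbar, k = 0, ..., K-1
W_grid = Wbar * rho ** np.arange(K)

V = np.zeros((K, 2))                      # V[k, j]: value with cake W_grid[k] and shock eps_vals[j]
for _ in range(2000):
    V_new = np.empty_like(V)
    for k in range(K):
        eat = np.log(eps_vals * W_grid[k])                   # u(c, eps) = ln(eps * c), eat everything
        wait = beta * (probs * V[min(k + 1, K - 1)]).sum()   # cake shrinks to index k+1 (last point reused)
        V_new[k] = np.maximum(eat, wait)
    gap = np.max(np.abs(V_new - V))
    V = V_new
    if gap < 1e-8:
        break

# Threshold eps*(W): ln(eps* W) equals the value of waiting, so eps* = exp(wait) / W.
cont = np.array([beta * (probs * V[min(k + 1, K - 1)]).sum() for k in range(K)])
eps_star = np.exp(cont) / W_grid
print(np.round(eps_star[:5], 3))
```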
    3.4 Extensions and Conclusion
    This chapter has reviewed common techniques to solve dynamic programming prob-
    lems as seen in chapter 2. We have applied these techniques to both deterministic
    and stochastic problems, to continuous and discrete choice models. In principle,
    these methods can be applied to solve more complicated problems.
    3.4.1 Larger State Spaces
    Both examples we have studied in sections 3.2 and 3.3 have small state spaces. In
empirical applications, the state space often needs to be much larger if the model

    has to be confronted with real data. For instance, the endowment shocks might be
    serially correlated or the interest rate, R, might also be a stochastic and persistent
    process.
    For the value function iteration method, this means that the successive value
    functions have to be stacked in a multidimensional matrix. Also, the value function
    has to be interpolated in several dimensions. The techniques in section 3.A.1 can be
    extended to deal with this problem. However, the value function iteration method
runs quickly into the ”curse of dimensionality”. If each state variable is discretized
into ns grid points, the value function has to be evaluated at ns^N points, where
N is the number of state variables (with ns = 100 and N = 3, this is already one
million points). This demands increasing computer memory
    and slows down the computation. A solution to this problem is to evaluate the
    value function for a subset of the points in the state space and then interpolate the
    value function elsewhere. This solution has been implemented by Keane and Wolpin
    (1994).
    Projection methods are better at handling larger state spaces. Suppose the
    problem is characterized by N state variables {X1, . . . , XN }. The approximated
    policy function can be written as:
ĉ(X1, . . . , XN ) = Σ_{j=1}^{N} Σ_{i_j=1}^{n_j} ψ^j_{i_j} p_{i_j}(Xj)
The problem is then characterized by the auxiliary parameters {ψ^j_{i_j}}.
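As a toy illustration of this additive structure (Python; the target function and basis are hypothetical), the coefficients ψ can be fitted by least squares on a grid of two state variables:

```python
import numpy as np

X1, X2 = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
x1, x2 = X1.ravel(), X2.ravel()

target = np.sqrt(x1 + 0.5 * x2 + 0.1)        # hypothetical policy to approximate

# Additive basis: a constant plus first- and second-order terms in each state variable.
basis = np.column_stack([np.ones_like(x1), x1, x1**2, x2, x2**2])

psi, *_ = np.linalg.lstsq(basis, target, rcond=None)
fit = basis @ psi
print("max abs error:", np.abs(fit - target).max())
```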
    Exercise 3.1
Suppose u(c) = c^(1−γ)/(1 − γ). Construct the code to solve for the stochastic cake
    eating problem, using the value function iteration method. Plot the policy function as
    a function of the size of the cake and the stochastic endowment, for γ = {0.5, 1, 2}.
    Compare the level and slope of the policy functions for different values of γ. How
    do you interpret the results?

    Exercise 3.2
    Consider the example of the discrete cake eating problem in section 3.3. Con-
    struct the code to solve for this problem, with i.i.d. taste shocks, using u(c) = ln(c),
    εL = 0.8, εH = 1.2, πL = 0.3 and πH = 0.7. Map the decision rule as a function of
    the size of the cake.
    Exercise 3.3
    Consider an extension of the discrete cake eating problem seen in section 3.3.
The agent now has the choice between three actions: eat the cake, store it in fridge
1, or store it in fridge 2. In fridge 1, the cake shrinks by a factor ρ: W ′ = ρW . In fridge
2, the cake diminishes by a fixed amount: W ′ = W − κ. The program of the agent is
    characterized as:
    V (W, ε) = max[V Eat(W, ε), V Fridge 1(W, ε), V Fridge 2(W, ε)]
with
V Eat(W, ε) = εu(W )
V Fridge 1(W, ε) = βEε′V (ρW, ε′)
V Fridge 2(W, ε) = βEε′V (W − κ, ε′)
    Construct the code to solve for this problem, using u(c) = ln(c), εL = 0.8, εH = 1.2,
    πL = 0.5 and πH = 0.5. When will the agent switch from one fridge to the other?
    Exercise 3.4
    Consider the stochastic cake eating problem. Suppose that the discount rate β
    is a function of the amount of cake consumed: β = Φ(β1 + β2c), where β1 and
    β2 are known parameters and Φ() is the normal cumulative distribution function.
    Construct the code to solve for this new problem using value function iterations.
    Suppose γ = 2, β1 = 1.65, πL = πH = 0.5, yL = 0.8, yH = 1.2 and β2 = −1.

    Plot the policy rule c = c(X). Compare with the case where the discount rate is
    independent of the quantity consumed. How would you interpret the fact that the
    discount rate depends on the amount of cake consumed?

    3.A Additional Numerical Tools
    This appendix provides some useful numerical tools which are often used when
    solving dynamic problems. We present interpolation methods, numerical integration
    methods as well as a method to approximate serially correlated processes by a
    markov process. The last section is devoted to simulations.
    3.A.1 Interpolation Methods
We briefly review three simple interpolation methods. For further reading, see, for
instance, Press et al. (1986) or Judd (1996).
    When solving the value function or the policy function, we often have to calculate
the value of these functions outside of the points of the grid. This requires being
able to interpolate the function. Using a good interpolation method is also
    helpful as one can save computer time and space by using fewer grid points to
    approximate the functions. Denote f (x) the function to approximate. We assume
    that we know this function at a number of grid points xi, i = 1, . . . , I. Denote by
    fi = f (xi) the values of the function at these grid points. We are interested in finding
an approximate function f̂ (x) such that f̂ (x) ≈ f (x), based on the observations
    {xi, fi}. We present three different methods and use as an example the function
f (x) = x sin(x). Figure 3.9 displays the results for all the methods.
    [Figure 3.9 approximately here]
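As a rough stand-in for Figure 3.9, the sketch below (Python, assuming numpy and scipy are available) builds the three approximations of f (x) = x sin(x) described next — a global least-squares polynomial, piecewise-linear interpolation, and a cubic spline — from a handful of grid points:

```python
import numpy as np
from scipy.interpolate import CubicSpline

f = lambda x: x * np.sin(x)
x_grid = np.linspace(0, 2 * np.pi, 8)        # a small number of grid points
f_grid = f(x_grid)
x_fine = np.linspace(0, 2 * np.pi, 400)

# 1. Least squares: fit a low-order polynomial globally.
coeffs = np.polyfit(x_grid, f_grid, deg=3)
ls_approx = np.polyval(coeffs, x_fine)

# 2. Piecewise-linear interpolation between grid points.
lin_approx = np.interp(x_fine, x_grid, f_grid)

# 3. Cubic spline: piecewise cubics, continuous up to the second derivative.
spline_approx = CubicSpline(x_grid, f_grid)(x_fine)

for name, approx in [("least squares", ls_approx),
                     ("linear", lin_approx),
                     ("cubic spline", spline_approx)]:
    print(f"{name:14s} max abs error: {np.abs(approx - f(x_fine)).max():.3f}")
```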
    Least Squares Interpolation
    A natural way to approximate f () is to use an econometric technique, such as OLS,
    to ”estimate” the function f̂ (.). The first step is to assume a functional form for f̂ .
    For instance, we can approximate f with a polynomial in x such as:
    f̂ (x) = α0 + α1x + . . . + αN x
    N N < I 59 By regressing fi on xi we can easily recover the parameters αn. In practice, this method is often not very good, unless the function f is well behaved. Higher order polynomials tend to fluctuate and can occasionally give an extremely poor fit. This is particularly true when the function is extrapolated outside of the grid points, i.e when x > xI or x < x1. The least square method is a global approximation method. As such, the fit can be on average satisfactory but mediocre almost everywhere. This can be seen in the example in Figure 3.9. Linear Interpolation This method fits the function f with piecewise linear functions on the intervals [xi−1, xi]. For any value of x in [xi−1, xi], an approximation f̂ (x) of f (x) can be found as: f̂ (x) = fi−1 + fi − fi−1 xi − xi−1 (x − xi−1) A finer grid will give a better approximation of f (x). When x is greater than xI , using this rule can lead to numerical problems as the above expression may not be accurate. Note that the approximation function f̂ is continuous, but not differentiable at the grid points. This can be an undesirable feature as this non differentiability can be translated to the value function or the policy function. This method can be extended for multivariate functions. For instance, we can approximate the function f (x, y) given data on {xi, yj, fij}. Denote dx = (x − xi)/(xi−1 − xi) and dy = (y − yi)/(yi−1 − yi). The approximation can be written as: f̂ (x, y) = dxdyfi−1,j−1 + (1 − dx)dyfi,j−1 + dx(1 − dy)fi−1,j + (1 − dx)(1 − dy)fi,j The formula can be extended to higher dimension as well. 60 Spline Methods This method extends the linear interpolation by fitting piecewise polynomials while ensuring that the resulting approximate function f̂ is both continuous and differ- entiable at the grid points xi. We restrict ourself to cubic splines for simplicity, but the literature on splines is very large (see for instance De Boor (1978)). The approximate function is expressed as: f̂i(x) = fi + ai(x − xi−1) + bi(x − xi−1)2 + ci(x − xi−1)3 x ∈ [xi−1, xi] Here for each point on the grid, we have to determine three parameters {ai, bi, ci}, so in total there is 3I parameters to compute. However, imposing the continuity of the function and of its derivative up to the second order reduces the number of coefficients: f̂i(x) = f̂i+1(x) f̂ ′i (x) = f̂ ′ i+1(x) f̂ ′′i (x) = f̂ ′′ i+1(x) It is also common practice to apply f̂ ′′1 (x1) = f̂ ′′ I (xI ) = 0. With these constraints, the number of coefficients to compute is down to I. Some algebra gives:  ai = fi − fi−1 xi − xi−1 − bi(xi − xi−1) − ci(xi − xi−1) 2 i = 1, . . . , I ci = bi+1 − bi 3(xi − xi−1) i = 1, . . . , I − 1 cI = − bI3(xI − xI−1) ai + 2bi(xi − xi−1) + 3ci(xi − xi−1)2 = ai+1 Solving this system of equation leads to expressions for the coefficients {ai, bi, ci}. Figure 3.9 shows that the cubic spline is a very good approximation to the function f . 3.A.2 Numerical Integration Numerical integration is often required in dynamic programming problems to solve for the expected value function or to ”integrate out” an unobserved state variable. For instance, solving the Bellman equation (3.3) requires to calculate Ev(X′) = 61 ∫ v(X′)dF (X′), where F (.) is the cumulative density of the next period cash-on- hand X. In econometric applications, some important state variables might not be observed. If this is the case, then one need to compute the decision rule, uncondi- tional of this state variable. 
In the case of the stochastic cake eating problem seen in section 3.2, if X is not observed, one could compute c̄ = ∫ c(X)dF (X) which is the unconditional mean of consumption, and match it with observed consumption. We present three methods which can be useful when numerical integration is needed. Quadrature Methods There is a number of quadrature method. We briefly detail the Gauss-Legendre method (much more detailed information can be found in Press et al. (1986)). The integral of a function f is approximated as: ∫ 1 −1 f (x)dx � w1f (x1) + . . . + wnf (xn) (3.8) where wi and xi are n weights and nodes to be determined. Integration over a different domain can be easily handled by operating a change of the integration variable. The weights and the nodes are computed such that (3.8) is exactly satisfied for polynomials of degree 2n − 1 or less. For instance, if n = 2, denote fi(x) = xi. The weights and nodes satisfy: w1f1(x1) + w2f1(x2) = ∫ 1 −1 f1(x)dx w1f2(x1) + w2f2(x2) = ∫ 1 −1 f2(x)dx w1f3(x1) + w2f3(x2) = ∫ 1 −1 f3(x)dx w1f4(x1) + w2f4(x2) = ∫ 1 −1 f4(x)dx This is a system of four equation with four unknowns. The solution is w1 = w2 = 1 and x2 = −x1 = 0.578. For larger values of n, the computation is similar. By increasing the number of nodes n, the precision increases. Note that the nodes are not necessarily equally spaced. The weights and the value of the nodes are published in the literature for commonly used values of n. 62 Approximating an Autoregressive Process with a Markov Chain In this section we follow Tauchen (1986) and Tauchen and Hussey (1991) and show how to approximate an autoregressive process of order one by a first order markov process. This is useful to simplify the computation of expected values in the value function iteration framework. For instance, to solve the value function in the cake eating problem, we need to calculate the expected value given ε: V (W, ε) = max[εu(W ), Eε′|εV (ρW, ε ′)] This involves the calculation of an integral at each iteration, which is cumbersome. If we discretize the process εt, into N points ε i, i = 1, . . . , N , we can replace the expected value by: V (W, εi) = max [ εu(W ), N∑ j=1 πi,jV (ρW, ε j) ] i = 1, . . . , N As in the quadrature method, the methods involves finding nodes εj and weights πi,j. As we shall see below, the εi and the πi,j can be computed prior to the iterations. Suppose that εt follows an AR(1) process, with an unconditional mean µ and an autocorrelation ρ: εt = µ(1 − ρ) + ρεt−1 + ut (3.9) where ut is a normally distributed shock with variance σ 2. To discretize this process, we need to determine three different objects. First, we need to discretize the process εt into N intervals. Second, we need to compute the conditional mean of εt within each intervals, which we denote by zi, i, . . . , N . Third, we need to compute the probability of transition between any of these intervals, πi,j. Figure 3.10 graphs the distribution of ε and shows the cut-off points εi as well as the conditional means zi. [Figure 3.10 approximately here] 63 The first step is to discretize the real line into N intervals, defined by the limits ε1, . . . , εN +1. As the process εt is unbounded, ε 1 = −∞ and εN +1 = +∞. The intervals are constructed such that εt has an equal probability of 1/N of falling into them. Given the normality assumption, the cut-off points {εi}N +1i=1 are defined as Φ( εi+1 − µ σε ) − Φ( ε i − µ σε ) = 1 N , i = 1, . . . 
, N (3.10) where Φ() is the cumulative of the normal density and σε is the standard deviation of ε and is equal to σ/ √ (1 − ρ). Working recursively we get: εi = σεΦ −1( i − 1 N ) + µ Now that we have defined the intervals, what is the average value of ε within a given interval? We denote this value by zi, which is computed as the mean of εt conditional on εt ∈ [εi, εi+1]. zi = E(εt / εt ∈ [εi, εi+1]) = σε φ( εi − µ σε ) − φ( ε i+1 − µ σε ) Φ( εi+1 − µ σε ) − Φ( ε i − µ σε ) + µ Using (3.10), the expression simplifies to: zi = N σε ( φ( εi − µ σε ) − φ( ε i+1 − µ σε ) ) + µ Next, we define the transition probability as πi,j = P (εt ∈ [εj, εj+1]|εt−1 ∈ [εi, εi+1]) πi,j = 1√ 2πσε ∫ εj+1 εj e −(u − µ) 2 2σ2ε [ Φ( εi+1 − µ(1 − ρ) − ρu σ ) − Φ( ε i − µ(1 − ρ) − ρu σ ) ] du The computation of πi,j requires the computation of a non trivial integral. This can be done numerically. Note that if ρ = 0, i.e. ε is an i.i.d. process, the above expression is simply: πi,j = 1/N 64 We can now define a Markov process zt which will mimic an autoregressive process of order one, as defined in (3.9). zt takes its values in {zi}Ni=1 and the transition between period t and t + 1 is defined as: P (zt = z j/ zt−1 = z i) = πi,j By increasing N , the discretization becomes finer and the markov process gets closer to the real autoregressive process. Example: For N=3, ρ = 0.5, µ = 0 and σ = 1, we have: z1 = −1.26 z2 = 0 z3 = 1.26 and π =   0.55 0.31 0.140.31 0.38 0.31 0.14 0.31 0.55   3.A.3 How to Simulate the Model Once the value function is computed, the estimation or the evaluation of the model often requires the simulation of the behavior of the agent through time. If the model is stochastic, the first step is to generate a series for the shocks, for t = 1, . . . , T . Then, we go from period to period and use the policy function to find out the optimal choice for this period. We also update the state variable and proceed to next period. How to Program a Markov Process The markov process is characterized by grid points, {zi} and by a transition matrix π, with elements πij = P rob(yt = z j/yt−1 = zi). We start in period 1. The process zt is initialized at , say z i. Next, we have to assign a value for z2. To this end, using the random generator of the computer, we 65 draw a uniform variable, u, in [0, 1]. The state in period 2, j, is defined as: j∑ l=1 πi,l < u ≤ j+1∑ l=1 πi,l or j = 1 if u < πi,1. The values for the periods ahead are constructed in a similar way. Figure 3.11 presents a computer code which will construct iteratively the values for T periods. [Figure 3.11 approximately here] How to Simulate the Model For this, we need to initialize all stochastic processes, which are the exogenous shock and the state variables. The state variables can be initialized to their long run values or to some other value. Often, the model is simulated over a long number of periods and the first periods are discarded to get rid of initial condition problems. The value of the state variables and the shock in period 1 are used to determine the choice variable in period 1. In the case of the continuous stochastic cake eating problem in section 3.2, we would construct c1 = c(X1). Next, we can generate the values of the state variable in period 2, X2 = R(X1 − c1) + y2 where y2 is calculated using the method described in section 3.A.3 above. This procedure would be repeated over T periods to successively construct all the values for the choice variables and the state variables. 
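As an illustration of the last two subsections, the sketch below discretizes the AR(1) process (3.9) with the equiprobable-interval method described above and then simulates the resulting Markov chain with the cumulative-probability rule. It is a minimal Python sketch, assuming numpy and scipy are available; the function names, the truncation of the unbounded end intervals at µ ± 8σε, and the simple quadrature grid used to evaluate the transition integral are illustrative choices, not the text's own code. The unconditional standard deviation is taken as σε = σ/√(1 − ρ²), which reproduces the N = 3 example above.

import numpy as np
from scipy.stats import norm

def tauchen_equiprob(N, mu, rho, sigma):
    # Equiprobable-interval discretization of eps_t = mu*(1-rho) + rho*eps_{t-1} + u_t,
    # following the steps in the text: cut-offs, conditional means, transition probabilities.
    sigma_eps = sigma / np.sqrt(1.0 - rho ** 2)                       # unconditional std of the AR(1)
    cuts = norm.ppf(np.arange(N + 1) / N, loc=mu, scale=sigma_eps)    # eps^1 = -inf, ..., eps^{N+1} = +inf
    z = N * sigma_eps * (norm.pdf((cuts[:-1] - mu) / sigma_eps)
                         - norm.pdf((cuts[1:] - mu) / sigma_eps)) + mu  # conditional means z_i
    pi = np.empty((N, N))
    for i in range(N):
        # truncate the unbounded end intervals for the numerical integration (illustrative choice)
        lo = cuts[i] if np.isfinite(cuts[i]) else mu - 8 * sigma_eps
        hi = cuts[i + 1] if np.isfinite(cuts[i + 1]) else mu + 8 * sigma_eps
        u = np.linspace(lo, hi, 2001)
        w = norm.pdf(u, loc=mu, scale=sigma_eps)
        w = w / w.sum()                                               # weights of eps_{t-1} within interval i
        for j in range(N):
            cond = (norm.cdf((cuts[j + 1] - mu * (1 - rho) - rho * u) / sigma)
                    - norm.cdf((cuts[j] - mu * (1 - rho) - rho * u) / sigma))
            pi[i, j] = np.sum(w * cond)                               # P(eps_t in interval j | eps_{t-1} in i)
    return z, pi

def simulate_chain(z, pi, T, i0=0, seed=0):
    # Simulate T periods of the Markov chain with the cumulative-probability rule of section 3.A.3.
    rng = np.random.default_rng(seed)
    path = np.empty(T)
    i = i0
    for t in range(T):
        path[t] = z[i]
        i = min(int(np.searchsorted(np.cumsum(pi[i]), rng.uniform())), len(z) - 1)
    return path

z, pi = tauchen_equiprob(N=3, mu=0.0, rho=0.5, sigma=1.0)   # z is close to (-1.26, 0, 1.26)
eps_path = simulate_chain(z, pi, T=200)

For the example above (N = 3, ρ = 0.5, µ = 0, σ = 1) this reproduces the grid points z and a transition matrix close to the one displayed.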
Chapter 4 Econometrics 4.1 Overview This chapter reviews techniques to estimate parameters of models based on dynamic programming. This chapters is organized in two parts. In section 4.2, we present two simple examples to illustrate the different estimation methodologies. We analyze a simple coin flipping experiment and the classic problem of supply and demand. We review standard techniques such as maximum likelihood and the method of moments as well as simulated estimation techniques. The reader who is already familiar with econometric techniques could go to section 4.3 which gives more details on these techniques and studies the asymptotic properties of the estimators. A more elaborate dynamic programming model of cake eating is used to illustrate these different techniques. 66 67 4.2 Some Illustrative Examples 4.2.1 Coin Flipping We consider here a simple coin flipping example. The coin is not necessarily fair and the outcome of the draw is either heads with a probability P1 or tails with a probability P2 = 1−P1, with {P1, P2} ∈ [0, 1]x[0, 1]. We are interested in estimating the probability of each outcome. We observe a series of T draws from the coin. Denote the realization of the tth draw by xt, which is equal either to 1 (if heads) or 2 (if tails). The data set at hand is thus a series of observations {x1, x2, . . . , xT }. This section will describe a number of methods to uncover the probabilities {P1, P2} from observed data. This simple example can be extended in two directions. First, we can try to imagine a coin with more than two sides (a dice). We are then able to consider more than two outcomes per draw. In this case, we denote P = {P1, . . . , PI} a vector with I elements where Pi = P (xt = i) is the probability of outcome i. We are interested in estimating the probabilities {Pi}i=1,...,I . For simplicity, we sometimes state results for the case where I = 2, but the generalization to a larger number of outcomes is straightforward. Second, it may be possible that the draws are serially correlated. The probability of obtaining a head might depend on the outcome of the previous draw. In this case we want to estimate P (xt = j|xt−1 = i). We also consider this generalized example below. Of course, the researcher may not be interested in these probabilities alone but rather, as in many economic examples, the parameters that underlie P . To be more specific, suppose that one had a model parameterized by θ ∈ Θ ⊂ Rκ that determines P . That is, associated with each θ is a vector of probabilities P . Denote by M (θ) the mapping from parameters to probabilities: M : Θ −→ [0, 1]I . 68 In the case where I = 2, we could consider a fair coin, in which case θ = (1/2, 1/2) and P = (P1, P2) = (1/2, 1/2). Alternatively we could consider a coin which is biased towards heads, with θ = (2/3, 1/3) and P = (P1, P2) = (2/3, 1/3). In these examples, the model M is the identity, M (θ) = θ. In practice, we would have to impose that θ ∈ [0, 1] in the estimation algorithm. Another way of specifying the model is to chose a function M (.) which is naturally bounded between 0 and 1. In this case, we can let θ to belong to R. For instance, the cumulative distribution of the normal density, noted Φ(.) satisfies this condition. In the fair coin example, we could have θ = (0, 0) and P = (Φ(0), Φ(0)) = (1/2, 1/2). With the biased coin, we would have θ = (0.43, −0.43), as Φ(0.43) = 2/3 and Φ(−0.43) = 1/3. 
Maximum Likelihood IID case: We start with the case where the draws from the coin are identically and independently distributed. The likelihood of observing the sample {x1, x2, . . . , xT } is given by: £(x, P ) = ΠIi=1P #i i where #i is the number of observations for which event i occurs. Thus £ represents the probability of observing {xt}Tt=1 given P . The maximum likelihood estimator of P is given by: P = arg max £. (4.1) By deriving the first order condition for a maximum of £(x, P ), the maximum likelihood estimate of Pi, i = 1, 2, ...I is given by: P ∗i = #i∑ i #i . (4.2) In words, the maximum likelihood estimator of Pi is the fraction of occurrences of event i. 69 Suppose that one had a model M (.) for the probabilities, parameterized by θ. So, indirectly, the likelihood of the sample depends on this vector of parameters, denote it £̃(x, θ) = £(x, M (θ)). In that case, the maximum likelihood estimator of the parameter vector (θ∗) is given by: θ∗ = arg max θ £̃(x, θ). In effect, by a judicious choice of θ, we choose the elements of P to maximize the likelihood of observing the sample. In fact, by maximizing this function we would end up at the same set of first-order conditions, (4.2), that we obtained from solving (4.1). Example 4.1 Suppose I=2 and that M (θ) = Φ(θ), where Φ(.) is the cumulative distribution func- tion of the standardized normal density. 28 In this case, p1 = P (xt = 1) = Φ(θ) and p2 = 1 − Φ(θ). The parameter is estimated by maximizing the likelihood of observing the data: θ∗ = arg max θ Φ(θ)#1(1 − Φ(θ))#2 where #1 and #2 are the number of observations that fall into category 1 and 2. Straightforward derivation gives: θ∗ = Φ−1( #1 #1 + #2 ) Markov Structure: The same issues arise in a model which exhibits more dy- namics, as is the case when the outcomes are serially correlated. Let Pij denote the probability of observing event j in period t + 1 conditional on observing event i in period t: Pij = Prob (xt+1 = j|xt = i). 70 These conditional probabilities satisfy: Pij ∈ (0, 1) and ∑ j Pij = 1 for i = 1, 2, .., I. Intuitively, the former condition says that given the current state is i, in period t + 1 all j ∈ I will occur with positive probability and the latter condition requires that these probabilities sum to one. The probability of observing the sample of data is: £(x, P ) = P (x1, . . . , xT ) = T∏ l=2 P (xl|xl−1) P (x1) Let #ij denote the number of observations in which state j occurred in the period following state i. Then the likelihood function in this case is: £(x, P ) = (ΠiP #ij ij ) ∗ P (x1) We can express the probability of the first observation as a function of the Pij probabilities. P (x1) = I∑ j=1 P (x1|x0 = j) = I∑ j=1 Pj1 As before, the conditional probabilities and this initial probability can, in principle, depend on θ. Thus the maximum likelihood estimator of θ would be the one that maximizes £(x, P ). Note that there are now a large number of probabilities that are estimated through maximum likelihood: I(I − 1). Thus a richer set of parameters can be estimated with this structure. Method of Moments Continuing with our examples, we consider an alternative way to estimate the pa- rameters. Consider again the iid case and suppose there are only two possible outcomes, I = 2, so that we have a repeated Bernoulli trial. Given a sample of ob- servations, let µ denote a moment computed from the data. For example, µ might simply be the fraction of times event i = 1 occurred in the sample. In this case, µ = P1. 
Let µ(θ) denote the same moment calculated from the model when the data generating process (the model M ) is parameterized by θ. For now, assume that the 71 number of parameters, κ, is equal to one so that the number of parameters is equal to the number of moments (the problem is then said to be just identified). Consider the following optimization problem: min θ (µ(θ) − µ)2. Here we are choosing the parameters to bring the moment from the model as close as possible to that from the actual data. The θ that emerges from this optimization is a method of moments estimator, denote this estimate by θ̂. Example 4.2 Suppose we chose as a moment the fraction of times event i = 1 occurs in the sample. From our model of coin flipping, this fraction is equal to Φ(θ). The parameter is estimated by minimizing the distance between the fraction predicted by the model and the observed one: θ∗ = arg min θ ( Φ(θ) − #1 #1 + #2 )2 Solving the minimization problem gives: θ∗ = Φ−1 ( #1 #1 + #2 ) Hence, with this choice of moment, the method of moment estimator is the same as the maximum likelihood one, seen in example 4.1. In example 4.2 we chose a particular moment which was the fraction of heads in the sample. Often, in a data set, there is a large set of moments to chose from. The method of moment does not guide us in the choice of a particular moment. So which moment should we consider? The econometric theory has not come out with a clear indication of ”optimal” moments. However, the moments should be informative of the parameters to estimate. This means that the moments under 72 consideration should depend on the parameters in such a way that slight variations in their values results in different values for the moments. With a choice of moment different from the one in example 4.2, the method of moment estimator would have been different from the maximum likelihood estima- tor. However, asymptotically, when the size of the data set increases both estimator converge to the true value. More generally, let µ be a mx1 column vector of moments from the data. If κ < m the model is said to be over identified, as there are more moments than parameters to estimate. If κ = m, the model is said to be just identified and if κ > m, the model is under identified. In the latter case, estimation cannot be
    achieved as there are too many unknown parameters.
So if κ ≤ m, the estimator of θ comes from:

min_θ ( µ(θ) − µ )′ W^{-1} ( µ(θ) − µ )

In this quadratic form, W is a weighting matrix. As explained below, the choice of W is important for obtaining an efficient estimator of θ when the model is overidentified.
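As an illustration, here is a minimal Python sketch of this quadratic criterion (numpy and scipy are assumed to be available), applied to the coin example with M(θ) = Φ(θ) and a single moment, the fraction of heads. The toy sample, the function names and the identity weighting matrix are placeholders, not the text's own code.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def mm_objective(theta, mu_hat, moment_fn, W):
    # quadratic form (mu(theta) - mu_hat)' W^{-1} (mu(theta) - mu_hat)
    g = moment_fn(theta) - mu_hat
    return g @ np.linalg.solve(W, g)

# Coin example with M(theta) = Phi(theta): one parameter, one moment (the fraction of heads).
x = np.array([1, 1, 2, 1, 2, 1, 1, 2, 1, 1])              # toy sample: 1 = heads, 2 = tails
mu_hat = np.array([np.mean(x == 1)])                       # empirical moment
moment_fn = lambda theta: np.array([norm.cdf(theta[0])])   # theoretical moment mu(theta)
W = np.eye(1)                                              # identity weight (just-identified case)

theta_star = minimize(mm_objective, x0=np.zeros(1),
                      args=(mu_hat, moment_fn, W), method="Nelder-Mead").x
# theta_star is close to Phi^{-1}(#1/(#1+#2)), the closed form derived in example 4.2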
    Using Simulations
    In many applications, the procedures outlined above are difficult to implement,
either because the likelihood of observing the data or the moments are difficult
to compute analytically, or because they involve solving too many integrals. Put
    differently, the researcher does not have an analytic representation of M (θ). If this
    is the case, then estimation can still be carried out numerically using simulations.
    Consider again the iid case, where I = 2. The simulation approach proceeds
    in the following way. First, we fix θ, the parameter of M (θ). Second, using the
    random number generator of a computer, we generate S draws {us} from a uniform

    distribution over [0, 1]. We classify each draw as heads (denoted i = 1) if us < M (θ) or tails (denoted i = 2) otherwise. The fractions of the two events in the simulated data are used to approximate P Si (θ) by counting the number of simulated observations that take value i, denoted by �i. So, P S i (θ) = �i/S. The simulated maximum likelihood estimator is defined as: θ∗S = arg max θ ∏ i P Si (θ) #i where, as before, #i refers to the fraction of observations in which i occurs. The estimator is indexed by S, the number of simulations. Obviously, a larger number of simulation draws will yield more precise estimates. Figure 4.1 displays the log- likelihood for the coin flipping example, based on two series of simulation with respectively 50 and 5000 draws. The observed data set was a series of 100 draws. The log-likelihood has a maximum at the true value of the parameter, although the likelihood is very flat around the true value when the number of simulations is small. Exercise 4.1 Build a computer program which computes the likelihood function, using simula- tions, of a sample of T draws for the case where I = 3. [Figure 4.1 approximately here] For the method of moment estimator, the procedure is the same. Once an arti- ficial data set has been generated, we can compute moments both on the artificial data and on the observed data. Denote by µS(θ) a moment derived from the simu- lated data. For instance, µ and µS(θ) could be the fraction of heads in the observed sample and in the simulated one. The simulated method of moment estimator is defined as: θ∗S = arg min θ (µS(θ) − µ)′W −1(µS(θ) − µ) 74 Figure 4.2 displays the objective function for the simulated method of moments. The function has a minimum at the true value of the parameter. Once again, using more simulation draws gives a smoother function, which will be easier to minimize. Exercise 4.2 Build a computer program which computes the objective function, using simula- tions, of a sample of T draws for the case where I = 3. [Figure 4.2 approximately here] In both methods, the estimation requires two steps. First, given a value of θ, one needs to simulate artificial data and compute either a likelihood or a moment from this data set. Second, using these objects, the likelihood or the objective function has to be evaluated and a new value for the parameters, closer to the true one, found. These two steps are repeated until convergence to the true value. To compute the simulated data, we need to draw random shocks using the ran- dom number generator of a computer. It is to be noted that the random draws have to be computed once and for all at the start of the estimation process. If the draws change between iterations, it would be unclear whether the change in the criterion function comes from a change in the parameter or from a change in the random draws. The ability to simulate data opens the way to yet another estimation method: indirect inference. This method uses an auxiliary model chosen by the researcher. This model should be easy to estimate by standard techniques and should capture enough of the interesting variation in the data. We denote it by M̃ (ψ), where ψ is a vector of auxiliary parameters describing this new model. Given a guess for the vector of structural parameters θ, the true model can be simulated to create a new data set. The auxiliary model is estimated both on the real data and on the 75 simulated one, providing two sets of auxiliary parameters. 
The vector θ is chosen such that the two sets of auxiliary parameters are close to each other. Note that the vector of auxiliary parameters ψ is of no particular interest per se, as it describes a misspecified model (M̃ ). Within the context of the original model, it has no clear interpretation. However, it serves as a mean to identify and estimate the structural parameters θ. Example 4.3 For instance, if M (θ) = Φ(θ), the model has no closed-form solution as the cu- mulative of the normal density has no analytical form. Instead of approximating it numerically, we can use the indirect inference method to estimate parameters of in- terest without computing this function. We might turn to an auxiliary model which is easier. For instance, the logit model has closed forms for the probabilities. Denote by ψ the auxiliary parameter parameterizing the logit model. With such a model, the probability of observing xt = 1 is equal to: P (xt = 1) = exp(ψ) 1 + exp(ψ) Denote by #1 and #2 the number of cases that fall into category 1 and 2. The log-likelihood of observing some data is: £ = #1 ln exp(ψ) 1 + exp(ψ) + #2 ln 1 1 + exp(ψ) = #1ψ − (#1 + #2) ln(1 + exp(ψ)) Maximization of this log likelihood and some rearranging gives a simple formula for the ML estimator of the auxiliary parameter: ψ = ln #1 #2 . We can compute this estimator of the auxiliary parameter both for our observed data and for the simulated data by observing in each case the empirical frequencies. Denote the former by ψ̂ and the latter by ψ̂ S (θ). The indirect inference estimator is then: θ∗S = arg min θ (ψ̂ S (θ) − ψ̂)2 = argmin θ (ln �1(θ) �2(θ) − ln #1 #2 )2 76 In this example, as the probit model is difficult to estimate by maximum likelihood directly, we have instead replaced it with a logit model which is easier to estimate. Although we are not interested in ψ per se, this parameter is a means to estimate the parameter of importance, θ. So far, we have not discussed the size of the simulated data set. Obviously, one expects that the estimation will be more efficient if S is large, as either the moments, the likelihood or the auxiliary model will be pinned down with greater accuracy. Using simulations instead of analytical forms introduce randomness into the estimation method. For short samples, this randomness can lead to biased esti- mates. For instance, with the simulated maximum likelihood, we need the number of simulation draws to go to infinity to get rid of the bias. This is not the case for the simulated method of moment or the indirect inference, although the results are more precise for a large S. We discuss this issue later on in this chapter. Identification Issues We conclude this section on coin flipping with an informal discussion of identification issues. Up to here, we implicitly assumed that the problem was identified, i.e. the estimation method and the data set allowed us to get a unique estimate of the true vector of parameters θ. A key issue is the dimensionality of the parameter space, κ, relative to I, the dimensionality of P . First, suppose that κ = I − 1, so that the dimensionality of θ is the same as the number of free elements of P .29 Second, assume that M (θ) is one to one. This means that M is a function and for every P there exists only one value of θ such that P = M (θ). In this case, we effectively estimate θ from P ∗ by using the inverse of the model: θ∗ = M −1(P ∗). 77 This is the most favorable case of identification and we would say the parameters of the model are just identified. 
It is illustrated in Figure 4.3 for the case of I = 2 and κ = 1. There is a unique value of the parameter, θ∗, for which the probability predicted by the model, M (θ∗), is equal to the true probability. [Figure 4.3 approximately here] [Figure 4.4 approximately here] [Figure 4.5 approximately here] A number of problems can arise, even for the special case of κ = I − 1. First, it might be that the model, M (θ), is not invertible. Thus, for a given maximum likelihood estimate of P ∗ , there could be multiple values of θ that generate this vector of probabilities. In this case, the model is not identified. This is shown in Figure 4.4. Example 4.4 shows an example based on the method of moment estimation where a particular choice of moment leads to non identification. Example 4.4 Suppose we label heads as 1 and tails as 2. Suppose that instead of focusing on the mean of the sample (i.e. the fraction of heads) we chose the variance of the sample. The variance can be expressed as: V (x) = Ex2 − (Ex)2 = #1 #1 + #2 + 4 #2 #1 + #2 − ( #1 #1 + #2 + 2 #2 #1 + #2 )2 = #1 #1 + #2 (1 − #1 #1 + #2 ) So the theoretical and the empirical moments are: µ(θ) = Φ(θ)(1 − Φ(θ)) µ = #1 #1 + #2 (1 − #1 #1 + #2 ) 78 This might appear as a perfectly valid choice of moment, but in fact it is not. The reason is that the function Φ(θ)(1 − Φ(θ)) is not a monotone function but a hump- shaped one and thus not invertible. For both low and high values of θ, the function is close to 0. The variance is maximal when the probability of obtaining a head is equal to that of obtaining a tail. If either tails or heads are very likely, the variance is going to be low. So a low variance indicates that either heads or tails are more frequent, but does not tell us which occurrence is more likely. Hence, in this case, the variance is not a valid moment to consider, for identification reasons. Second, it might be that for a given value of P ∗, there does not exist a value of θ such that M (θ) = P ∗. In this case, the model is simply not rich enough to fit the data. This is a situation of misspecification. Put differently, there is a zero-likelihood problem here as the model, however parameterized, is unable to match the observations. This is illustrated in Figure 4.5. So, returning to the simple coin flipping example, if there is a single parameter characterizing the probability of a head occurring and the mapping from this pa- rameter to the likelihood of heads is one-to-one, then this parameter can be directly estimated from the fraction of heads. But, it might be that there are multiple values of this parameter which would generate the same fraction of heads in a sample. In this case, the researcher needs to bring additional information to the problem. Or, there may be no value of this parameter that can generate the observed frequency of heads. In this case, the model needs to be re-specified. If, instead of κ = I − 1, we may have more dimensions to θ than informa- tion in P : κ > I − 1. In this case, we have a situation where the model is again
    underidentified. Given the maximum likelihood estimate of P ∗, there are multi-
    ple combinations of the parameters that, through the model, can generate P ∗. In

    this case, the researcher needs to bring additional information to the problem to
    overcome the indeterminacy of the parameters. So in the coin-flipping example, a
    physical theory that involved more than a single parameter would be impossible to
    estimate from data that yields a single probability of heads.
    Alternatively, if κ < I −1, then the parameters are overidentified. In this case, there may not be any θ that is consistent with all the components of P. In many applications, such as those studied in this book, this situation allows the researcher a more powerful test of a model. If a model is just identified, then essentially there exists a θ such that P ∗ can be generated by the model. But when a model is overidentified, then matching the model to the data is a much more demanding task. Thus a model that succeeds in matching the data, characterized by P ∗, when the parameters are overidentified is viewed as more compelling. 4.2.2 Supply and Demand Revisited Let us consider the classic problem of supply and demand. This model will serve as an illustration for the previous estimation methods and to discuss the problem of identification. The supply depends on prices, p and the weather, z. The demand side depends on prices and income, y: qS = αpp + αzz + εS (Supply) qD = βpp + βyy + εD (Demand) (4.3) Both the demand and supply shocks are iid, normally distributed, with mean zero and variance σ2S and σ 2 D and covariance ρSD. In total, this model has seven parameters. We solve for the reduced form by expressing the equilibrium variables as function of the exogenous variables y and z: p∗ = βy αp − βp y − αz αp − βp z + εD − εS αP − βP = A1y + A2z + U1 q∗ = αpβy αp − βp y − αzβp αp − βp z + αpεD − βpεS αP − βP = B1y + B2z + U2 (4.4) 80 where A1, A2, B1 and B2 are the reduced form parameters. These parameters can be consistently estimated from regressions using the reduced form. If the system is identified, we are able to recover all the structural parameters from the reduced form coefficients using: αp = B1/A1 βp = B2/A2 βy = A1(B1/A1 − B2/A2) αz = −A2(B1/A1 − B2/A2) (4.5) From these four parameters, it is straightforward to back out the variance of the demand and supply shocks. We can compute εS = q − αpp + αzz and calculate the empirical variance. The same procedure can be applied to recover εD. The estimation in two steps is essentially an instrumental variable estimation where y and z are used as instrument for the endogenous variables p and q. Instead of using a two step OLS method, we can use a number of alternative methods including method of moments, maximum likelihood and indirect inference. We review these methods in turn. Method of Moments Denote by θ the vector of parameters describing the model: θ = (αp, αz, βp, βy) For simplicity, we assume that σD, σS and ρSD are known to the researcher. From the data, we are able to compute a list of empirical moments which consists, for example, of the variance of prices and quantities and the covariance between prices, quantities, income and the weather. Denote µ = {µ1, µ2, µ3, µ4}′ a 4x1 vector of empirical moments with 30 µ1 = cov(p, y)/V (y) µ3 = cov(p, z)/V (z) µ2 = cov(q, y)/V (y) µ4 = cov(q, z)/V (z) (4.6) These moments can be computed directly from the data. For instance, µ1 can 81 be expressed as: µ1 = ∑T t=1(pt − p̄)(yt − ȳ)∑T t=1(yt − ȳ)2 From the model, we can derive the theoretical counterpart of these moments, ex- pressed as functions of the structural parameters. We denote these theoretical mo- ments µ(θ) = {µ1(θ), µ2(θ), µ3(θ), µ4(θ)}. 
Starting with the expressions in (4.4), some straightforward algebra gives: µ1(θ) = βy αp − βp µ3(θ) = − αz αp − βp µ2(θ) = αpβy αp − βp µ4(θ) = − αzβp αp − βp (4.7) The basis of the method of moment estimation is that at the true value of the vector of parameters, E(µi(θ) − µi) = 0 , i = {1, . . . , 4} This is called an orthogonality condition. In practical terms, we can bring the moments from the model as close as possible to the empirical ones by solving: θ∗ = Argmin θ L(θ) = Argmin θ (µ − µ(θ))′Ω(µ − µ(θ)) (4.8) The ergodicity condition on the sample is the assumption used to make the empirical and the theoretical moments the same as the sample size goes to infinity. Note that this assumption is easily violated in many macro economic samples, as the data is non stationary. In practice, most of the macro data is first made stationary by removing trends. How do the results of (4.8) compare to the results in (4.5)? Note that with our choice of moments, µ1(θ) = A1, µ2(θ) = B1, µ3(θ) = A2 and µ4(θ) = B2. At the optimal value of the parameters, we are left with solving the same problem as in (4.4). This would lead to exactly the same values for the parameters as in (4.5). The method of moment approach collapses the two steps of the previous section into 82 a single one. The estimation of the reduced form and solving the non linear system of equations is done within a single procedure. Could we chose other moments to estimate the structural parameters? As in example 4.4, the answer is both yes and no. The moments must be informative of the parameters of the model. For instance, if we chose µ1 = E(z), the average value of weather, this moment is independent of the parameterization of the model, as z is an exogenous variable. Hence, we are in fact left to estimate four parameters with only three identifying equations. Any moment involving an endogenous variable (p or q in our example) can be used in the estimation and would asymptotically produce the same results. With a finite number of observations, higher order moments are not very precisely computed, so an estimation based on cov(p4, y), say, would not be very efficient. Finally, note that when computing the moments in (4.7), we have not used the assumption that the error terms εD and εS are normally distributed. Whatever their joint distribution, (4.8) would give a consistent estimate of the four parameters of interest. The next section presents the maximum likelihood estimation which assumes the normality of the residuals. Maximum Likelihood The likelihood of observing jointly a given price p and a quantity q, conditional on income and weather can be derived from the reduced form (4.4) as f (p − A1y − A2z, q − B1y − B2z) where f (., .) is the joint density of the disturbances U1 and U2 and where A1, A2, B1, B2 are defined as in (4.4). The likelihood of the entire sample is thus: £(θ) = T∏ t=1 f (pt − A1yt − A2zt, qt − B1yt − B2zt) (4.9) We assume here that εD and εS are normally distributed, so U1 and U2 are also 83 normally distributed with zero mean. 31 The maximization of the likelihood function with respect to the reduced form coefficients is a straightforward exercise. It will give asymptotically consistent estimates of A1, A2, B1 and B2. Given that there is a one to one mapping between the reduced form and the structural parameters, the estimation will also provide consistent estimates of the parameters αp, βp, αz and βy as in the method of moment case. 
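Before turning to indirect inference, the two-step logic above can be made concrete with a small Python sketch (numpy is assumed): it computes the four empirical moments in (4.6) from series on p, q, y and z and backs out the structural parameters through (4.5). The parameter values in the quick check at the end are arbitrary and only meant to illustrate the mapping.

import numpy as np

def structural_from_moments(p, q, y, z):
    # Empirical moments (4.6), then the structural parameters via (4.5).
    cov = lambda a, b: np.cov(a, b)[0, 1]
    A1 = cov(p, y) / np.var(y, ddof=1)      # mu_1 = cov(p,y)/V(y)
    B1 = cov(q, y) / np.var(y, ddof=1)      # mu_2 = cov(q,y)/V(y)
    A2 = cov(p, z) / np.var(z, ddof=1)      # mu_3 = cov(p,z)/V(z)
    B2 = cov(q, z) / np.var(z, ddof=1)      # mu_4 = cov(q,z)/V(z)
    alpha_p = B1 / A1
    beta_p = B2 / A2
    beta_y = A1 * (B1 / A1 - B2 / A2)
    alpha_z = -A2 * (B1 / A1 - B2 / A2)
    return alpha_p, beta_p, beta_y, alpha_z

# Quick check on data simulated from the reduced form (4.4) with arbitrary parameter values:
rng = np.random.default_rng(0)
T = 5000
alpha_p, alpha_z, beta_p, beta_y = 1.0, 0.5, -1.5, 0.8
y, z = rng.normal(size=T), rng.normal(size=T)
eS, eD = 0.1 * rng.normal(size=T), 0.1 * rng.normal(size=T)
p = (beta_y * y - alpha_z * z + eD - eS) / (alpha_p - beta_p)
q = alpha_p * p + alpha_z * z + eS
print(structural_from_moments(p, q, y, z))   # approximately (1.0, -1.5, 0.8, 0.5)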
Indirect Inference For a given value of the parameters, we are able to draw supply and demand shocks from their distribution and to simulate artificial data for prices and de- mand, conditional on observed weather and income. This is done using expres- sion (4.4). Denote the observed data as {qt, pt, yt, zt}Tt=1. Denote the simulated data as {qst , pst }t=1...,T,s=1,...,S. Denote the set of parameters of the structural system (4.3) as θ = {αp, αz, βp, βz}. For simplicity, we assume that the parameters σD, σS and ρDS are known. Next, we need an auxiliary model which is simple to estimate. We could use the system (4.3) as this auxiliary model. For both the observed and the simulated data, we can regress the quantities on the prices and the income or the weather. Denote the first set of auxiliary estimate ψ̂T and the second one ψ̃ s T , s = 1, . . . , S. These vectors contains an estimate for the effect of prices on quantities and the effect of weather and income on quantity from both the supply and the demand equations. These estimates will undoubtedly be biased given the simultaneous nature of the system. However, we are interested in these auxiliary parameters only as a mean to get to the structural ones (θ). The next step is to find θ which brings the vector ψ̃ S T = 1/S ∑S s=1 ψ̃(θ) s T as close as possible to ψ̂T . Econometric theory tells us that this will produce a consistent estimate of the parameters of interest, αp, αz, βq, βy. Again, we rely here on the assumption of ergodicity. As will become apparent in 84 section 4.3.3, the estimator will be less efficient than maximum likelihood or the method of moments, unless one relies on a very large number of simulations. Non Identification If the weather has no influence on supply, i.e. αz = 0, then the reduced form equations only expresses p∗ and q∗ as a function of income and shocks only. In this case, the system is under-identified. We can only recover part of the original parameters: αp = B1/A1 σ 2 p = V (q − B1/A1p) Further manipulations give: βy = B1 − A1βp (4.10) There is an infinity of pairs {βy, βp} that satisfy the above equality. Hence, we cannot recover the true values for these two parameters. From (4.10), it is easy to visualize that there is an identification problem. When the estimation involves moment matching or minimization of a likelihood function, non identification is not always straightforward to spot. Some estimation routines will provide an estimate for the parameters whether the system is identified or not. There is no reason that these estimates coincide with the true values, as many sets of parameter values will satisfy the first order conditions of (4.8). If the estimation routine is based on a gradient calculation, finding the minimum of a function requires to calculate and to inverse the hessian of the criterion function L(θ). If αz = 0 the hessian will not be of full rank, as the cross derivatives of L with respect to αz and the other parameters will be zero. Hence one should be suspicious about the results when numerical problems occur such as invertibility problems. As the hessian matrix enters the calculation of the standard errors, a common sign is also abnormally imprecise coefficients. If the estimation routine is not based on 85 gradients (the simplex algorithm for instance), the problem will be more difficult to spot, as the estimation routine will come up with an estimate. However, these results will usually look strange with some coefficients taking absurd large values. 
Moreover, the estimation results will be sensible to the choice of initial values. Exercise 4.3 Build a computer program which creates a data set of prices and quantities us- ing (4.4), given values for z and y. Use this program to create a data set of size T , the ”true data set” and then to construct a simulated data set of size S. Next, construct the objective function for the indirect inference case as suggested in sec- tion 4.3.3 What happens when you set αz to zero? 4.3 Estimation Methods and Asymptotic Proper- ties This section presents in detail the methods discussed in the previous section. The asymptotic properties of each estimator are presented. We review the generalized method of moments, which encompasses most of the classic estimation methods such as maximum likelihood or non linear least squares. We then present methods using simulations. All the methods are illustrated using simple Dynamic Programming models, such as the cake eating problem which has been seen in chapters 2 and 3. In the following subsections, we assume that there is a ”true” model, x(ut, θ), parameterized by a vector θ of dimension κ. ut is a shock which makes the model probabilistic. For instance, the shock ut can be a taste shock, a productivity shock or a measurement error. We observe a sequence of data generated by this model at the ”true” value of the parameters, which we denote by θ0 and at the ”true” value 86 of the shocks u0t . Let {x(u0t , θ0)}Tt=1 be the observed data, which we also denote as {xt}Tt=1 for simplicity. 32 We are interested in recovering an estimate of θ0 from the observed data and making statistical inferences. 4.3.1 Generalized Method of Moments The method of moment presented briefly in Section 4.2 minimized the distance between an empirical moment and the predicted one. This exploits the fact that on average, the difference between the predicted and the observed series (or a function of these series) should be close to zero at the true value of the parameter θ0. Denote this difference as h(θ, xt), so: E(h(θ0, xt)) = 0 (4.11) This identifying equality is called an orthogonality restriction. Denote the sample average of h(θ, xt): g(θ) = 1 T T∑ t=1 h(θ, xt) An estimate of θ can be found as: θ̂ = arg min θ Q(θ) = arg min θ g(θ)′W −1T g(θ) W −1T is a weighting matrix, which might depend on the data, hence the T subscript. If g(θ) is of size qx1, then W −1T is of size qxq. For instance, if we want to match the first two moments of the process {xt}, the function h() can be written: h(θ, xt) = ( xt(θ) − xt xt(θ) 2 − x2t ) Averaging this vector over the sample will yield g(θ) = (x̄(θ) − x̄, x̄(θ)2 − x̄2). Economic theory often provides more restrictions which can be used in the es- timation method. They often take the form of first order conditions, such as Euler equations, which can be used as an orthogonality restriction as in (4.11). This is 87 the intuition that guided the Hansen and Singleton (1982) study of consumption that we discuss in detail in Chapter 6, section 6.3.3. Here we summarize that with an example. Example 4.5 In a standard intertemporal model of consumption with stochastic income and no borrowing constraints, the first order condition gives: u′(ct) = βREtu ′(ct+1) One can use this restriction to form h(θ, ct, ct+1) = [u ′(ct) − βRu′(ct+1)], where θ is parameterizing the utility function. On average, h(θ, ct, ct+1) should be close to zero at the true value of the parameter. The Euler equation above brings actually more information than we have used so far. 
Not only should the differences between the marginal utility in period t and t + 1 be close to zero, but it should also be orthogonal to information dated t. Suppose zt is a variable which belongs to the information set at date t. Then the first order condition also implies that, on average, h(θ, ct, ) = zt.[u ′(ct) − βRu′(ct+1)] should be close to zero at the true value of the parameter. If we have more than one zt variable, then we can exploit as many orthogonality restrictions. For further examples, we refer the reader to section 8.4.3. Asymptotic Distribution: Let θ̂T be the GMM estimate, i.e. the solution to (4.3.1). Under regularity conditions (see Hansen (1982)): • θ̂T is a consistent estimator of the true value θ0. 88 • The GMM estimator is asymptotically normal: √ T (θ̂T − θ0) d−→ N (0, Σ) where Σ = (DW −1∞ D ′)−1 and where D′ = plim T {∂g(θ, YT ) ∂θ′ θ=θ0 } The empirical counterpart of D is: D̂′T = ∂g(θ, YT ) ∂θ′ θ=θ̂T This means that asymptotically, one can treat the GMM estimate θ̂T as a normal variable with mean θ0 and variance Σ̂/T : θ̂T ∼ N (θ0, Σ̂/T ) Note that the asymptotic properties of the GMM estimator are independent of the distribution of the error term in the model. In particular, one does not have to assume normality. Optimal Weighting Matrix We have not discussed the choice of the weighting matrix W −1T , so far. The choice of the weighting matrix does not have any bearing on the convergence of the GMM estimator to the true value. However, a judiciously chosen weighting matrix can minimize the asymptotic variance of the estimator. It can be shown that the optimal weighting matrix W ∗T produces the estimator with the smallest variance. It is defined as: W ∗∞ = lim T →∞ 1 T T∑ t=1 ∞∑ l=−∞ h(θ0, yt)h(θ0, yt−l) ′ Empirically, one can replace W ∗∞ by a consistent estimator of this matrix Ŵ ∗ T : Ŵ ∗T = Γ0,T + q∑ ν=1 (1 − [ν/(q + 1)])(Γν,T + Γ′ν,T ) 89 with Γν,T = 1 T T∑ t=ν+1 h(θ̂, yt)h(θ̂, yt−ν ) ′ which is the Newey-West estimator (see Newey and West (1987) for a more detailed exposition). Overidentifying Restrictions If the number of moments q is larger than the number of parameters to estimate κ, then the system is overidentified. One would only need κ restrictions to estimate θ. The remaining restrictions can be used to evaluate the model. Under the null that the model is the true one, these additional moments should be empirically close to zero at the true value of the parameters. This forms the basis of a specification test: T g(θ̂T ) ′Ŵ −1T g(θ̂T ) L−→ χ2(q − κ) In practice, this test is easy to compute, as one has to compare T times the criterion function evaluated at the estimated parameter vector to a chi-square critical value. Link with Other Estimation Methods The generalized method of moment is quite a general estimation method. It actually encompasses most estimation method as OLS, non linear least squares, instrumental variables or maximum likelihood by choosing an adequate moment restriction. For instance, the OLS estimator is defined such that the right hand side variables are not correlated with the error term, which provides a set of orthogonal restrictions that can be used in a GMM framework. In a linear model, the GMM estimator defined this way is also the OLS estimator. The instrumental variable method exploits the fact that an instrument is orthogonal to the residual. 90 4.3.2 Maximum Likelihood In contrast to the GMM approach, the maximum likelihood strategy requires an assumption on the distribution of the random variables. 
Denote by f (xt, θ) the probability of observing xt given a parameter θ. The estimation method tries to maximize the likelihood of observing a sequence of data X = {x1, . . . , xT }. Assum- ing iid shocks, the likelihood for the entire sample is: L(X, θ) = T∏ t=1 f (xt, θ) It is easier to maximize the log of the likelihood l(X, θ) = T∑ t=1 log f (xt, θ) Example 4.6 Consider the cake eating problem, defined by the Bellman equation below, where W is the size of the cake, ρ is a shrink factor and ε is an iid shock to preferences: V (W, ε) = max [εu(W ), EV (ρW, ε′)] V (.) represents the value of having a cake of size W , given the realization of the taste shock ε. The equation above states that the individual is indifferent between consuming the cake and waiting if the shock is ε∗(W, θ) = EV (ρW, ε′)/u(W ), where θ is a vector of parameters describing preferences, the distribution of ε and the shrink factor ρ. If ε > ε∗(W, θ), then the individual will consume the cake. ε∗(W, θ) has
    no analytical expression, but can be solved numerically with the tools developed in
    Chapter 3. The probability of not consuming a cake of size W in a given period is
    then:
    P (ε < ε∗(W, θ)) = F (ε∗(W, θ)) where F is the cumulative density of the shock ε. The likelihood of observing an 91 individual i consuming a cake after t periods is then: li(θ) = (1 − F (ε∗(ρtW1, θ))) t−1∏ l=1 F (ε∗(ρlW1, θ)) Suppose we observe the stopping time for N individuals. Then the likelihood of the sample is: L(θ) = N∏ i=1 li(θ) The maximization of the likelihood with respect to θ gives the estimate, θ̂. For additional examples, we refer the reader to the second part of the book, and in particular, section 5.5.4. Exercise 4.4 Use the stochastic cake eating problem to simulate some data. Construct the likelihood of the sample and plot it against different possible values for ρ. Asymptotic Properties To derive the asymptotic properties of the maximum likelihood estimator, it is convenient to notice that the maximum likelihood can be seen as a GMM procedure. The first order condition for the maximum of the log likelihood function is: T∑ t=1 ∂logf (xt, θ) ∂θ = 0 This orthogonality condition can be used as a basis for a GMM estimation, where h(θ, xt) = ∂logf (xt, θ)/∂θ. The first derivative of the log likelihood function is also called the score function. Using the GMM formula, the covariance matrix is D̂T Ŝ −1 T D̂ ′ T , with D̂′T = ∂g(θ) ∂θ′ θ=θ̂T = 1 T T∑ t=1 ∂2 log f (xt, θ) ∂θ∂θ′ = −I 92 where I is also known as the information matrix, i.e. minus the second derivative of the log likelihood function. ŜT = 1 T T∑ t=1 h(xt, θ̂T )h(xt, θ̂T ) ′ = I So, we get: √ T (θ̂T − θ0) L−→ N (0, I−1) The maximum likelihood estimator is asymptotically normal, with mean zero and a variance equal to I−1/T . 4.3.3 Simulation Based Methods We review here estimation methods based on simulation. This field is a growing one and we will concentrate on only a few methods. For a more in depth presentation of these methods, we refer the reader to Gourieroux and Monfort (1996) and Pakes and Pollard (1989), McFadden (1989), Laroque and Salanié (1989) or McFadden and Ruud (1994) (see also Lerman and Manski (1981) for an early reference). These methods are often used because the calculation of the moments are too difficult to construct (e.g. multiple integrals in multinomial probits as in McFadden (1989) or Hajivassiliou and Ruud (1994), or because the model includes a latent (unobserved) variable as in Laroque and Salanié (1993)). Or, it might be that the model M (θ) has no simple analytic representation so that the mapping from the parameters to moments must be simulated. Example 4.7 Consider the cake eating problem studied in section 4.3.2, but where the taste shocks ε are serially correlated. The Bellman equation is expressed as: V (W, ε) = max [ εu(W ), Eε′|εV (ρW, ε ′) ] 93 Here the expectations operator indicates that the expectation of next period’s shock depends on the realization of the current shock. We can still define the threshold shock ε∗(W ) = Eε′|ε∗V (ρW, ε′)/u(W ), for which the individual is indifferent between eating and waiting. The probability of waiting t periods to consume the cake can be written as: Pt = P (ε1 < ε ∗(W1), ε2 < ε ∗(ρW1), . . . , εt > ε
    ∗(ρtW1))
    In section 4.3.2, the shocks were iid, and this probability could easily be decomposed
    into a product of t terms. If ε is serially correlated, then this probability is extremely
    difficult to write as εt is correlated with all the previous shocks.
For t periods, we
    have to solve a multiple integral of order t, which conventional numerical methods
    of integration cannot handle. In this section, we will show how simulated methods
    can overcome this problem to provide an estimate of θ.
    The different methods can be classified into two groups. The first group of
    methods compares a function of the observed data to a function of the simulated
data. Here the average is taken both on the simulated draws and on all observations
    in the original data set at once. This approach is called moment calibration. It
    includes the simulated method of moments and indirect inference.
The second set of methods compares the observed data, observation by observa-
    tion, to an average of the simulated predicted data, where the average is taken over
    the simulated shocks. This is called path calibration. Simulated non linear least
    squares or maximum likelihood fall into this category.
    The general result is that path calibration methods require the number of sim-
    ulations to go to infinity to achieve consistency. In contrast, moment calibration
    methods are consistent for a fixed number of simulations.

    Simulated Method of Moments
    Definition: This method was first developed by McFadden (1989), Lee and In-
    gram (1991) and Duffie and Singleton (1993). Let {x(ut, θ0)}Tt=1 be a sequence of
    observed data. Let {x(ust , θ)}, t = 1, . . . , T, s = 1, . . . , S or xst (θ) for short, be a set
    of S series of simulated data, each of length T , conditional on a vector of parameters
    θ. The simulations are done by fixing θ and by using the T S draws of the shocks
    ust (drawn once and for all). Denote by µ(xt) a vector of functions of the observed
data. The estimator for the SMM is defined as:

θ̂_{S,T}(W) = arg min_θ [ Σ_{t=1}^{T} ( µ(x_t) − (1/S) Σ_{s=1}^{S} µ(x(u_t^s, θ)) ) ]′ W_T^{-1} [ Σ_{t=1}^{T} ( µ(x_t) − (1/S) Σ_{s=1}^{S} µ(x(u_t^s, θ)) ) ]
    This criterion is similar to the one presented for the method of moments in
    section 4.2.1. The difference is that we can avoid the calculation of the theoretical
    moments µ(xt(θ)) directly. Instead, we are approximating them numerically with
    simulations.
    Example 4.8
    We use here the cake example with serially correlated shocks. Suppose we have
    a data set of T cake eaters for which we observe the duration of their cake Dt,
    t = 1, . . . , T .
Given a vector of parameters θ which describes preferences and the process of ε, we can solve the model numerically and compute the thresholds ε∗(W). Next, we can simulate a series of shocks and determine the duration implied by these particular draws of the shock. We can repeat this step in order to construct S data sets, each containing T simulated durations.
To identify the parameters of the model, we can for instance use the mean and the variance of the duration. Both of these moments would be calculated from the observed data set and the artificial ones. If we want to identify more than two parameters, we can try to characterize the distribution of the duration better and include, for instance, the fraction of cakes eaten at the end of the first, second and third period.
    For further examples, we refer the reader to the second part of the book, and in
    particular to section 6.3.6 and section 7.3.3.
    Exercise 4.5
    Construct a computer program to implement the approach outlined in Exam-
ple 4.8. First, use as moments the mean and the variance of the duration. Then increase the number of moments by also using the fraction of cakes eaten after the first and second periods. As the model is overidentified, test the overidentifying restrictions.
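A minimal Python sketch of the estimator used in this exercise is given below, assuming numpy and scipy are available. The routine simulate_durations(theta, s), which should solve the cake eating model at θ and return one simulated duration per shock path, is a placeholder for the solution methods of chapter 3; the function names are illustrative.

import numpy as np
from scipy.optimize import minimize

def smm_objective(theta, D_obs, shocks, simulate_durations, W):
    # SMM criterion matching the mean and the variance of the duration.
    # simulate_durations(theta, s) is a placeholder: it should solve the cake eating model
    # at theta and return one simulated duration per shock path in s.
    D_sim = np.concatenate([simulate_durations(theta, s) for s in shocks])
    g = np.array([D_obs.mean() - D_sim.mean(),
                  D_obs.var(ddof=1) - D_sim.var(ddof=1)])
    return g @ np.linalg.solve(W, g)

# shocks: a list of S arrays of taste shock paths, drawn once and for all
# theta_hat = minimize(smm_objective, x0=theta0,
#                      args=(D_obs, shocks, simulate_durations, W), method="Nelder-Mead").x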
Properties: When the number of simulations S is fixed and T −→ ∞,
• θ̂_{S,T}(W) is consistent.
• √T (θ̂_{S,T} − θ_0) d−→ N(0, Q_S(W))
where

Q_S(W) = (1 + 1/S) [ E_0 (∂µ′/∂θ) W^{-1} (∂µ/∂θ′) ]^{-1} E_0 (∂µ′/∂θ) W^{-1} Σ(θ_0) W^{-1} (∂µ/∂θ′) [ E_0 (∂µ′/∂θ) W^{-1} (∂µ/∂θ′) ]^{-1}

where Σ(θ_0) is the covariance matrix of √T (1/T) Σ_{t=1}^{T} ( µ(x_t) − E_0 µ(x_t^s(θ)) ).
The optimal SMM is obtained when Ŵ_T = Σ̂_T . In this case,

Q_S(W^*) = (1 + 1/S) [ E_0 (∂µ′/∂θ) W^{*-1} (∂µ/∂θ′) ]^{-1}
When S increases to infinity, the covariance matrix of the SMM estimator converges to the covariance matrix of the standard GMM estimator.
In practice, the optimal weighting matrix can be estimated by:

Ŵ^*_T = (1/T) Σ_{t=1}^{T} [ µ(x_t) − (1/S) Σ_{s=1}^{S} µ(x_t^s(θ̂_{S,T})) ] [ µ(x_t) − (1/S) Σ_{s=1}^{S} µ(x_t^s(θ̂_{S,T})) ]′
        + (1/S)(1/T) Σ_{t=1}^{T} [ µ(x_t^s(θ̂_{S,T})) − (1/L) Σ_{l=1}^{L} µ(x_t^l(θ̂_{S,T})) ] [ µ(x_t^s(θ̂_{S,T})) − (1/L) Σ_{l=1}^{L} µ(x_t^l(θ̂_{S,T})) ]′

where x_t^s(θ) and x_t^l(θ) are simulations generated by independent draws from the density of the underlying shock. Ŵ^*_T is a consistent estimate of W^*_∞ for T −→ ∞ and L −→ ∞. Note that the SMM requires a large number of simulations to compute the standard errors of the estimator, even if the estimator is consistent for a fixed number of simulations.
    Simulated Non Linear Least Squares
Definition: We could estimate the parameters θ by matching, at each period, the observation x_t with the prediction of the model x(u_t^s, θ), where u_t^s is a particular draw for the shock. There are two reasons why the predicted data would not match the observed one. First, we might evaluate the model at an incorrect parameter point (i.e. θ ≠ θ_0). Second, the "true" shock u_t^0 is unobserved, so replacing it with a random draw u_t^s would lead to a discrepancy. In trying to minimize the distance between these two objects, we would not know whether to change θ or u_t^s. To alleviate the problem, we could use S simulated shocks and compare x_t with x̄_t^S(θ) = (1/S) Σ_{s=1}^{S} x(u_t^s, θ). A natural method of estimation would be to minimize the distance between the observed data and the average predicted variable:

min_θ (1/T) Σ_{t=1}^{T} ( x_t − x̄_t^S(θ) )²

Unfortunately, this criterion does not provide a consistent estimator of θ, for a fixed number of simulations S, as the sample size T increases to infinity.
Laffont et al. (1995) propose to correct the non linear least squares objective function by minimizing the following criterion:

min_θ (1/T) Σ_{t=1}^{T} [ ( x_t − x̄_t^S(θ) )² − (1/(S(S−1))) Σ_{s=1}^{S} ( x(u_t^s, θ) − x̄_t^S(θ) )² ]     (4.12)
    The first term is the same as the one discussed above, the distance between the
    observed variable and the average predicted one. The second term is a second order
    correction term which takes into account the bias introduced by the simulation for
    a fixed S.
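In code, criterion (4.12) takes only a few lines. Below is a minimal Python sketch assuming numpy, where x_model(u, theta) is a placeholder for the routine that returns the model's prediction x(u_t^s, θ) for a whole path of shocks; the function name is illustrative.

import numpy as np

def snlls_objective(theta, x_obs, u_sim, x_model):
    # Bias-corrected simulated non linear least squares criterion (4.12).
    # x_obs: length T; u_sim: array of shape (S, T) of simulated shocks, drawn once and for all;
    # x_model(u, theta): placeholder returning the model prediction for a whole path of shocks.
    S = u_sim.shape[0]
    x_sim = np.array([x_model(u_sim[s], theta) for s in range(S)])   # shape (S, T)
    x_bar = x_sim.mean(axis=0)                                       # average prediction per period
    correction = ((x_sim - x_bar) ** 2).sum(axis=0) / (S * (S - 1))
    return np.mean((x_obs - x_bar) ** 2 - correction)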
    Example 4.9
    Consider the continuous cake eating problem defined as:
V(W, ε) = max_c [ ε u(c) + β E_{ε′|ε} V(W − c, ε′) ]
    where W is the size of the cake, c is the amount consumed and ε is a taste shock.
    The optimal policy rule for this program is of the form c = c(W, ε). Suppose we
    observe an individual through time and we observe both the consumption level and
    the size of the cake, {ĉt, Ŵt}t=1,…T . The taste shock is unobserved to the researcher.
    To estimate the vector of parameter θ which describes preferences, we can use the
    simulated non linear least square method. We simulate S paths for the taste shock,
    {εst }t=1,…T, s=1,…S which are used to construct simulated predictions for the model
    {x(Wt, εst )}t=1,…T, s=1,…S. At each period, we construct the average consumption con-
    ditional on the observed size of the cake, c̄(Ŵt), by averaging out over the S simulated
    taste shocks. This average is then used to compare with the observed consumption
    level ĉt, using formula (4.12).

    For further examples on the simulated non linear least square method, we refer the
    reader to section 7.3.3.
Asymptotic Properties: For any fixed number of simulations S,
• θ̂_{S,T} is consistent.
• √T (θ̂_{S,T} − θ_0) d−→ N(0, Σ_{S,T})
A consistent estimate of the covariance matrix Σ_{S,T} can be obtained by computing:

Σ̂_{S,T} = Â_{S,T}^{-1} B̂_{S,T} Â_{S,T}^{-1}

where Â_{S,T} and B̂_{S,T} are defined below. To this end, denote ∇x_t^s = ∂x(u_t^s, θ)/∂θ, the gradient of the predicted variable with respect to the vector of parameters, and ∇x̄_t = (1/S) Σ_{s=1}^{S} ∇x_t^s, its average across all simulations.

Â_{S,T} = (1/T) Σ_{t=1}^{T} [ ∇x̄_t ∇x̄_t′ − (1/(S(S−1))) Σ_{s=1}^{S} ( ∇x_t^s − ∇x̄_t )( ∇x_t^s − ∇x̄_t )′ ]

B̂_{S,T} = (1/T) Σ_{t=1}^{T} d_{S,t}(θ) d_{S,t}(θ)′

with d_{S,t} a κ-dimensional vector:

d_{S,t}(θ) = ( x_t − x̄_t^S(θ) ) ∇x̄_t(θ) + (1/(S(S−1))) Σ_{s=1}^{S} [ x(u_t^s, θ) − x̄_t^S(θ) ] ∇x_t^s(θ)
    Simulated Maximum Likelihood
    Definition: The model provides us with a prediction x(ut, θ), where θ is a vector
    of parameters and ut is an unobserved error. The distribution of ut implies a dis-
    tribution for x(ut, θ), call it φ(xt, θ). This can be used to evaluate the likelihood
    of observing a particular realization xt. In many cases, the exact distribution of
    x(θ, ut) is not easily determined, as the model can be non linear or might not even

    have an explicit analytical form. In this case, we can evaluate the likelihood using
    simulations.
    The Simulated Maximum Likelihood (SML) method approximates this likelihood
    by using simulations. Let φ̃(xt, u, θ) be an unbiased simulator of φ(xt, θ):
E_u φ̃(x_t, u, θ) = lim_{S→∞} (1/S) Σ_{s=1}^{S} φ̃(x_t, u^s, θ) = φ(x_t, θ)

The SML estimator is defined as:

θ̂_{S,T} = arg max_θ Σ_{t=1}^{T} log [ (1/S) Σ_{s=1}^{S} φ̃(x_t, u_t^s; θ) ]
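The SML criterion is straightforward to code once a simulator is available. Here is a minimal Python sketch (numpy and scipy are assumed), where phi_sim(x, u, theta) is a placeholder for an unbiased simulator of φ returning one likelihood contribution per observation; the function names are illustrative.

import numpy as np
from scipy.optimize import minimize

def sml_negative_loglik(theta, x_obs, u_sim, phi_sim):
    # Minus the simulated log likelihood: -sum_t log( (1/S) sum_s phi_sim(x_t, u_t^s, theta) ).
    # phi_sim(x, u, theta) is a placeholder for an unbiased simulator of the density phi(x, theta).
    S = u_sim.shape[0]
    sim_lik = np.zeros(len(x_obs))
    for s in range(S):
        sim_lik += phi_sim(x_obs, u_sim[s], theta) / S
    return -np.sum(np.log(sim_lik))

# theta_hat = minimize(sml_negative_loglik, x0=theta0,
#                      args=(x_obs, u_sim, phi_sim), method="Nelder-Mead").x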
    Asymptotic Properties:
    • The SML estimator is consistent, if T and S tend to infinity. When both T
    and S goes to infinity and when

    T
    S
    −→ 0, then

    T (θ̂ST − θ0) d−→ N (0, I−1(θ0))
    The matrix I(θ0) can be approximated by:
−(1/T) Σ_{t=1}^{T} ∂² log( (1/S) Σ_{s=1}^{S} φ̃(xt, ust, θ) ) / ∂θ∂θ′
    • It is inconsistent if S is fixed.
The bias is then:
E θ̂ST − θ0 ∼ (1/S) I⁻¹(θ0) E a(xt, θ)
where
a(xt, θ) = Eu(∂φ̃/∂θ) Vu(φ̃) / (Eu φ̃)³ − covu(∂φ̃/∂θ, φ̃) / (Eu φ̃)²
    The bias decreases in the number of simulations and with the precision of the esti-
    mated parameters, as captured by the information matrix. The bias also depends on
    the choice of the simulator, through the function a. Gourieroux and Monfort (1996)

propose a first-order correction for the bias. Fermanian and Salanié (2001) extend
    these results and propose a non parametric estimator of the unknown likelihood
    function, based on simulations.
    Indirect Inference
    When the model is complex, the likelihood is sometimes intractable. The indirect
    inference method works around it by using a simpler auxiliary model, which is esti-
mated instead. This auxiliary model is estimated both on the observed data and on the simulated data. The indirect inference method then finds the vector of structural parameters that brings the auxiliary parameters obtained from the simulated data as close as possible to those obtained from the observed data. A complete description can be found in Gourieroux et al. (1993) (see also Smith (1993)).
    Consider the likelihood of the auxiliary model φ̃(xt, β), where β is a vector of
    auxiliary parameters. The estimator β̂T , computed from the observed data is defined
    by:
β̂T = arg max_β Π_{t=1}^{T} φ̃(xt, β)
    Under the null, the observed data are generated by the model at the true value
    of the parameter θ0. There is thus a link between the auxiliary parameter β0 (the
    true value of the auxiliary parameter) and the structural parameters θ. Follow-
    ing Gourieroux et al. (1993) we denote this relationship by the binding function
    b(θ). Were this function known, we could invert it to directly compute θ from the
    value of the auxiliary parameter. Unfortunately, this function usually has no known
    analytical form, so the method relies on simulations to characterize it.
    The model is then simulated, by taking independent draws for the shock ust ,
    which gives S artificial data sets of length T : {xs1(θ), . . . , xsT (θ)}, s = 1, . . . , S. The

    auxiliary model is then estimated out of the simulated data, to get β̂sT :
β̂sT(θ) = arg max_β Π_{t=1}^{T} φ̃(xst(θ), β)
    Define β̂ST the average value of the auxiliary parameters, over all simulations:
β̂ST(θ) = (1/S) Σ_{s=1}^{S} β̂sT(θ)
    The indirect inference estimator θ̂ST is the solution to:
θ̂ST = arg min_θ [β̂T − β̂ST(θ)]′ ΩT [β̂T − β̂ST(θ)]
    where ΩT is a positive definite weight matrix which converges to a deterministic
    positive definite matrix Ω.
    Example 4.10
    Consider the cake problem with serially correlated shocks. The likelihood of the
    structural model is intractable, but we can find an auxiliary model which is easier
    to estimate. As the data set consists of durations, a natural auxiliary model is
    a standard duration model. Suppose we chose an exponential model, which is a
    simple and standard model of duration characterized by a constant hazard equal to
    β. The probability of observing a particular duration is βe−βDt . The log likelihood
    of observing a set of durations Dt, t = 1, . . . , T is :
ln L = Σ_{t=1}^{T} ln(β e^{−βDt}) = T ln β − β Σ_{t=1}^{T} Dt
This likelihood can be maximized with respect to β. Straightforward maximization gives β̂T = T / Σ_{t=1}^{T} Dt, the inverse of the average duration in the data set. Given a value for the structural parameters of
    average duration in the data set. Given a value for the structural parameters of
    our model of interest θ, we can construct by simulation S data sets containing T
    observations. For each artificial data set s, we can estimate the auxiliary duration
    model to obtain β̂sT . Using the procedure above, we are then able to obtain an

estimate of θ, such that the auxiliary parameters on the observed and the simulated
data are as close as possible. Note that with the simple auxiliary model we use, it
    turns out that the indirect inference procedure is the same as a simulated method of
    moments, as we are matching the average duration.
    We have used the exponential duration model for the simplicity of the exposition.
    This model is only parameterized by one parameter, so we can identify at best only
    one structural parameter. To identify more parameters, we could estimate a duration
    model with a more flexible hazard.
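The indirect inference loop for a duration example of this kind can be sketched in a few lines of Matlab. The "structural" model below is only a stand-in (a Weibull duration model with a known shape parameter and an unknown scale θ), not the serially correlated cake eating problem itself; the auxiliary model is the exponential model above, whose estimate is the inverse of the average duration. Since the problem is just identified, the weighting matrix is irrelevant, as noted in the text.

rng(3);
T = 500; S = 10; shape = 1.5; theta0 = 2;       % placeholder parameter values

% stand-in structural model: Weibull durations drawn by inverse-transform sampling
simdur = @(U, theta) theta*(-log(U)).^(1/shape);

Dobs    = simdur(rand(T,1), theta0);            % artificial 'observed' durations
betahat = T/sum(Dobs);                          % auxiliary (exponential) estimate on the data

U = rand(T, S);                                 % uniform draws, held fixed across theta
betasim = @(theta) mean(T./sum(simdur(U, theta), 1));   % average auxiliary estimate on simulated data

iicrit   = @(theta) (betahat - betasim(theta))^2;       % indirect inference criterion
thetahat = fminsearch(iicrit, 1);               % structural estimate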
    For more examples on the indirect inference method, we refer the reader to the
    second part of the book, in particular section 5.5.3 and 8.6.1.
    Gallant and Tauchen (1996) develop an Efficient Method of Moments based on
the use of an auxiliary model. Instead of matching a set of auxiliary parameters,
    they propose to minimize the score of the auxiliary model, i.e. the first derivative
    of the likelihood of the auxiliary model:
m(θ, β̂T) = (1/S) Σ_{s=1}^{S} (1/T) Σ_{t=1}^{T} ∂/∂β ln φ̃(xst(θ), β̂T)
The structural parameters are obtained from:
θ∗ = arg min_θ m(θ, β̂T)′ Ω m(θ, β̂T)
    where Ω is a weighting matrix. Gourieroux et al. (1993) show that the EMM and
    the indirect inference estimators are asymptotically equivalent.
    Properties: For a fixed number of simulations S, when T goes to infinity the
    indirect inference estimator is consistent and normally distributed.

√T (θ̂ST − θ0) →d N(0, QS(Ω))

    where
QS(Ω) = (1 + 1/S) [∂b′(θ0)/∂θ Ω ∂b(θ0)/∂θ′]⁻¹ ∂b′(θ0)/∂θ Ω J0⁻¹(I0 − K0)J0⁻¹ Ω ∂b(θ0)/∂θ′ [∂b′(θ0)/∂θ Ω ∂b(θ0)/∂θ′]⁻¹
Denote ψT(θ, β) = Σ_{t=1}^{T} log φ̃(xst(θ), β). The matrices I0, J0 and K0 are defined as:
J0 = plim_T −∂²ψT(θ, β)/∂β∂β′
I0 = lim_T V[√T ∂ψT(θ, β)/∂β]
K0 = lim_T V[E(√T ∂/∂β′ Σ_{t=1}^{T} φ̃(xt, β))]
∂b′(θ0)/∂θ = J0⁻¹ lim_T ∂²ψT(θ0, b(θ0))/∂β∂θ′
    The latter formula is useful to compute the asymptotic covariance matrix without
    calculating directly the binding function. As in the GMM case, there exists an
    optimal weighting matrix such that the variance of the estimator is minimized. The
    optimal choice denoted Ω∗ is:
    Ω∗ = J0(I0 − K0)−1J0
In this case, the variance of the estimator simplifies to:
QS(Ω∗) = (1 + 1/S) [ ∂b′(θ0)/∂θ J0(I0 − K0)⁻¹J0 ∂b(θ0)/∂θ′ ]⁻¹
    or equivalently
QS(Ω∗) = (1 + 1/S) [ ∂²ψ∞(θ0, b(θ0))/∂θ∂β′ (I0 − K0)⁻¹ ∂²ψ∞(θ0, b(θ0))/∂β∂θ′ ]⁻¹
The latter formula does not require computing the binding function explicitly. Note that the choice of the auxiliary model matters for the efficiency of the estimator. Clearly, one would want an auxiliary model such that ∂b′(θ)/∂θ is large in absolute value; otherwise, the auxiliary model would poorly identify the structural parameters.

    In practice, b(θ0) can be approximated by β̂ST (θ̂ST ). A consistent estimator of
    I0 − K0 can be obtained by computing:
(Î0 − K0) = (T/S) Σ_{s=1}^{S} (Ws − W̄)(Ws − W̄)′
with
Ws = ∂ψT(θ̂, β̂)/∂β,   W̄ = (1/S) Σ_{s=1}^{S} Ws
    see Gourieroux et al. (1993), appendix 2.
Note that if the number of parameters to estimate in the structural model is equal to the number of auxiliary parameters, the weighting matrix Ω plays no role, and the variance QS(Ω) simplifies to:
QS(Ω) = (1 + 1/S) [ ∂b′(θ0)/∂θ Ω∗ ∂b(θ0)/∂θ′ ]⁻¹
Specification Tests: A global specification test can be carried out using the minimized criterion:
ζT = (T S/(1 + S)) min_θ [β̂T − β̂ST(θ)]′ ΩT [β̂T − β̂ST(θ)]
which follows asymptotically a chi-square distribution with q − p degrees of freedom, where q is the number of auxiliary parameters and p the number of structural parameters.
    4.4 Conclusion
    This chapter presents methods to estimate the parameters of a model. We have re-
    viewed both classic methods such as maximum likelihood or the generalized method
    of moments and simulation based methods. In general, when dealing with dynamic
    programming models, the likelihood function or the analytical form of the moments
    are difficult to write out. If this is the case, simulated methods are of great use.
    However, they come at a cost, as simulated methods are very time consuming. The

    computation of the value function and the optimal policy rules often requires the
    use of numerical techniques. If on top of that simulation estimation methods are
    used, the estimation of a full fledged structural model can take hours (or even days).
    The choice of a particular method depends on the problem and the data set.
    Path calibration methods such as non linear least squares or maximum likelihood
    use all the information available in the data, as each particular observation is used
in the estimation procedure. The drawback is that one has to specify the entire model, up to the distribution of the unobserved shock. To obtain tractable likelihood functions, one often imposes a normal distribution for the shocks, and this might impose too much structure on the problem. On the other hand, moment calibration methods such as the method of moments use only part of the information provided by the data. These methods concentrate on particular functions of the data, such as the mean or the variance. In contrast to maximum likelihood, the method does not necessarily require the specification of the whole model.
Both approaches can be justified. The researcher might be interested in only a subset of the parameters, such as the intertemporal elasticity of consumption. As in example 4.5, the GMM method allows one to estimate this parameter without specifying the distribution of the income shock. However, calibration methods require the choice of moments that identify the parameters of the model. When the model is simple, this is not very difficult. When the model is more complex, for instance when unobserved heterogeneity is present, it is not straightforward to find informative moments. In such cases, maximum likelihood can be more desirable. Finally, if the data are subject to measurement error, taking moments of the data can reduce the problem. When simulation methods are used, moment calibration methods also present the advantage of requiring only a fixed number of simulations to get consistent estimates, so the computation time is lower.

    Overview of Methodology
    The first three chapters have presented theoretical tools to model, solve and estimate
    economic models. Ideally, to investigate a particular economic topic, a research
    agenda would include all three parts, building on economic theory and confronting
    it with the data to assess its validity.
    Figure 4.6 summarizes this approach and points to the relevant chapters. The
    figure starts with an economic model, described by a set of parameters and some
    choice structure. It is important at this stage to characterize the properties of that
    model and to characterize the first order conditions or to write it as a recursive
    problem. The model under consideration might be difficult to solve analytically.
    In this case, it is sometime necessary to use numerical methods as developed in
    Chapter 3. One can then derive the optimal policy rules, i.e. the optimal behavior
    given a number of predetermined variables.
Given the policy rules,36 the parameters can be estimated. This is usually done
    by comparing some statistics built both from the observed data and from the model.
    The estimated parameters are produced by minimizing the distance between the
    observed and the predicted outcome of the model. Once the optimal parameters are
    found, the econometric task is not over. One has to evaluate the fit of the model.
    There are various ways of doing this. First, even though the models are often non
    linear, one can construct a measure such as the R2, to evaluate the percentage of the
    variance explained by the model. A higher value is seen as a better fit. However,
    the model can be very good at reproducing some aspects of the data but can fail
    miserably in other important dimensions. For instance, in the discrete cake eating
    problem, the fit of the model could be considerably increased in the first T periods
    if one were to construct time dependent utility functions, with T dummy variables
for each time period. Such a model would generate a perfect fit when it comes to predicting the fraction of cakes eaten in the first periods. However, the model could
    do very poorly for the remaining periods. A second way to evaluate the estimated
    model is to use over identification restrictions if the model is overidentified. Finally,
    one can also perform out of sample forecasts.
    Once one is confident that the estimated model is a convincing representation of
    reality, the model can be used to evaluate different scenarios.
    The next chapters present examples of this strategy using a number of relevant
    topics.
    [Figure 4.6 approximately here]

    Part II
    Applications

    Chapter 5
    Stochastic Growth
    5.1 Overview
    To begin our exploration of applications of dynamic programming problems in
    macroeconomics, a natural starting point is the stochastic growth model. Starting
    with Kydland and Prescott (1982), this framework has been used for understanding
    fluctuations in the aggregate economy. To do so, the researcher must understand the
    mapping from the parameters of preferences and technology to observations, per-
    haps summarized by pertinent moments of the data. Further, the model provides
    an analytic structure for policy evaluation.37
    The stochastic growth model provides our first opportunity to review the tech-
    niques of dynamic programming, numerical methods and estimation methodology.
    We begin with the non-stochastic model to get some basic concepts straight and
    then enrich the model to include shocks and other relevant features.
    5.2 Non-Stochastic Growth Model
    Consider the dynamic optimization problem of a very special household. This house-
    hold is endowed with one unit of leisure each period and supplies this inelastically
    to a production process. The household consumes an amount ct each period which
    it evaluates using a utility function, u(ct). Assume that u(·) is strictly increasing
    and strictly concave. The household’s lifetime utility is given by
Σ_{t=1}^{∞} β^{t−1} u(ct) (5.1)
    The household has access to a technology that produces output (y) from capital
    (k), given its inelastically supplied labor services. Let y = f (k) be the production
    function. Assume that f (k) is strictly increasing and strictly concave.
    The capital input into the production process is accumulated from forgone con-
    sumption. That is, the household faces a resource constraint that decomposes output
    into consumption and investment (it):
    yt = ct + it.
    The capital stock accumulates according to:
    kt+1 = kt(1 − δ) + it
    where δ ∈ (0, 1) is the rate of physical depreciation.
    Essentially the household’s problem is to determine an optimal savings plan by
    splitting output between these two competing uses. Note that we have assumed
    the household produces using a concave production function rather than simply
    renting labor and capital in a market for factors of production. In this way, the
    model of the household is very special and often this is referred to as a Robinson
    Crusoe economy as the household is entirely self-sufficient. Nonetheless the model is
    informative about market economies as one can argue (see below) that the resulting
    allocation can be decentralized as a competitive equilibrium. For now, our focus is
    on solving for this allocation as the solution of a dynamic optimization problem.
    To do so, we use the dynamic programming approach and consider the following
    functional equation:

V (k) = max_{k′} u(f (k) + (1 − δ)k − k′) + βV (k′) (5.2)
    for all k. Here the state variable is the stock of capital at the start of the period
    and the control variable is the capital stock for the next period.38
    With f (k) strictly concave, there will exist a maximal level of capital achievable
    by this economy given by k̄ where
    k̄ = (1 − δ)k̄ + f (k̄).
    This provides a bound on the capital stock for this economy and thus guarantees
    that our objective function, u(c), is bounded on the set of feasible consumption
    levels, [0, f (k̄) + (1 − δ)k̄]. We assume that both u(c) and f (k) are continuous and
    real-valued so there exists a V (k) that solves (5.2).39
    The first-order condition is given by:
    u′(c) = βV ′(k′). (5.3)
    Of course, we don’t know V (k) directly so that we need to use (5.2) to determine
    V ′(k). As (5.2) holds for all k ∈ [0, k̄], we can take a derivative and obtain:
    V ′(k) = u′(c)(f ′(k) + (1 − δ)).
    Updating this one period and inserting this into the first-order condition implies:
    u′(c) = βu′(c′)(f ′(k′) + (1 − δ)).
    This is an Euler condition that is not unlike the one we encountered in the cake
    eating problem. Here the left side is the cost of reducing consumption by ε today.
    The right side is then the increase in utility in the next period from the extra capital
    created by investment of the ε. As in the cake eating structure, if the Euler equation
    holds then no single period deviations will increase utility of the household. As with
    that problem, this is a necessary but not a sufficient condition for optimality.40

    From the discussion in Chapter 2, V (k) is strictly concave. Consequently, from
    (5.3), k′ must be increasing in k. To see why, suppose that current capital increases
    but future capital falls. Then current consumption will certainly increase so that
    the left side of (5.3) decreases. Yet with k′ falling and V (k) strictly concave, the
    right side of (5.3) increases. This is a contradiction.
    5.2.1 An Example
    Suppose that u(c) = ln(c), f (k) = kα and δ = 1. With this special structure, we can
    actually solve this model. As in Sargent (1987), we guess that the value function is
    given by:
    V (k) = A + B ln k
    for all k. If this guess is correct, then we must be able to show that it satisfies (5.2).
    If it does, then the first-order condition, (5.3), can be written:
1/c = βB/k′.
    Using the resource constraint (kα = c + k′),
    βB(kα − k′) = k′
    or
k′ = (βB/(1 + βB)) kα. (5.4)
    So, if our guess on V (k) is correct, this is the policy function.
    Given this policy function, we can now verify whether or not our guess on V (k)
    satisfies the functional equation, (5.2). Substitution of (5.4) into (5.2) yields

A + B ln k = ln[(1/(1 + βB))kα] + β[A + B ln((βB/(1 + βB))kα)] (5.5)
for all k. Here we use c = y − k′ so that
c = (1/(1 + βB))kα.
    Grouping constant terms implies:
A = ln(1/(1 + βB)) + β[A + B ln(βB/(1 + βB))]
    and grouping terms that multiply ln k,
    B = α + βBα.
Hence B = α/(1 − βα). Using this, A can be determined. Thus, we have found the solution
    to the functional equation.
    As for the policy functions, using B, we find
    k′ = βαkα
    and
    c = (1 − βα)kα.
    It is important to understand how this type of argument works. We started
    with a guess of the value function. Using this guess, we derived a policy function.
    Substituting this policy function into the functional equation gave us an expression,
    (5.5), that depends only on the current state, k. As this expression must hold for all
    k, we grouped terms and solved for the unknown coefficients of the proposed value
    function.
    Exercise 5.1

    To see how this approach to finding a solution to the nonstochastic growth model
    could ”fail”, argue that the following cannot be solutions to the functional equation:
    1. V (k) = A
    2. V (k) = B ln k
    3. V (k) = A + Bkα
    5.2.2 Numerical Analysis
    Though the non-stochastic growth model is too simple to seriously take to the
    data, it provides an opportunity to again exploit the contraction mapping property
    to obtain a numerical solution to the functional equation given in (5.2). This is
valuable as the set of economies for which one can obtain an analytic solution to (5.2)
    is very small. Thus techniques must be developed to obtain policy functions in more
    general environments.
    The Matlab code grow.m solves (5.2), for the functional forms given below,
    using a value function iteration routine.41 The code has four main sections that we
    discuss in turn.
    Functional Forms
    There are two primitive functions that must be specified for the nonstochastic growth
    model. The first is the production function and the second is the utility function of
    the household. The grow.m code assumes that the production function is given by:
    f (k) = kα.
    Here α is restricted to lie in the interval (0, 1) so that f (k) is strictly increasing and
    strictly concave.

    The household’s utility function is given by:
u(c) = c^{1−σ}/(1 − σ).
    With this utility function, the curvature of the utility function,
    −u′′(c)c/u′(c)
    is equal to σ.42 We assume that σ is positive so that u(c) is strictly increasing and
    strictly concave. When σ = 1, u(c) is given by ln(c).
    Parameter Values
    The second component of the program specifies parameter values. The code is
    written so that the user can either accept some baseline parameters (which you can
    edit) or input values in the execution of the program. Let
    Θ = (α, β, δ, σ)
    denote the vector of parameters that are inputs to the program. In an estimation
    exercise, Θ would be chosen so that the model’s quantitative implications match
    data counterparts. Here we are simply interested in the anatomy of the program
    and thus Θ is set at somewhat arbitrary values.
    Spaces
    As noted earlier, the value function iteration approach does require an approxima-
    tion to the state space of the problem. That is, we need to make the capital state
    space discrete. Let κ represent the capital state space. We solve the functional
    equation for all k ∈ κ with the requirement that k′ lie in κ as well. So the code
    for the non-stochastic growth model does not interpolate between the points in this
    grid but rather solves the problem on the grid.

    The choice of κ is important. For the nonstochastic growth model we might be
    interested in transition dynamics: if the economy is not at the steady state, how
    does it return to the steady state? Let k∗ be the steady state value of the capital
    stock which, from (5.2), solves
1 = β[αk∗^{α−1} + (1 − δ)].
    This value of the steady state is computed in grow.m. Then the state space is
    built in the neighborhood of the steady state through the definitions of the highest
    and lowest values of the capital stock, khi and klow in the code.43 Finally, a grid is
    set-up between these two extreme values. The researcher specifies the fineness of the
    grid with two considerations in mind. A finer grid provides a better approximation
    but is ”expensive” in terms of computer time.44
    Value function Iteration
    The fourth section of the program solves (5.2) using a value function iteration rou-
    tine. To do so, we need an initial guess on the value function. For this guess, the
    program uses the one-period problem in which the household optimally consumes
    all output as well as the undepreciated capital stock (termed ytot in grow.m). 45
    Given this initial guess, a loop is set-up to perform value function iteration, as
    described in some detail in Chapter 3. Note that the program requires two inputs.
    The first is the total number of iterations that is allowed, termed T . The second
    is the tolerance which is used to determine whether the value function iteration
    routine has ”converged”. This tolerance is called toler and this scalar is compared
    against the largest percent difference between the last two calculations of the value
    function V and v in the grow.m program.
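For concreteness, a stripped-down routine of this kind might look as follows. This is only a sketch in the spirit of the description above, not the grow.m program itself; the parameter values, grid bounds, grid size and convergence criterion are arbitrary choices.

% parameters Theta = (alpha, beta, delta, sigma), arbitrary baseline values
alpha = 0.33; beta = 0.96; delta = 0.1; sigma = 2;
u = @(c) c.^(1-sigma)./(1-sigma);                    % CRRA utility (use log(c) if sigma = 1)

% steady state capital from 1 = beta*(alpha*k^(alpha-1) + 1 - delta)
kstar = ((1/beta - (1-delta))/alpha)^(1/(alpha-1));

% capital grid around the steady state (klow and khi are choices)
nk = 200;  kgrid = linspace(0.5*kstar, 1.5*kstar, nk)';

% one-period returns: row i = current capital, column j = choice of k'
ytot = kgrid.^alpha + (1-delta)*kgrid;               % output plus undepreciated capital
C = repmat(ytot, 1, nk) - repmat(kgrid', nk, 1);     % implied consumption
U = -Inf(nk, nk);  U(C > 0) = u(C(C > 0));           % rule out non-positive consumption

% value function iteration, starting from the one-period problem
V = u(ytot);  toler = 1e-6;  maxit = 1000;
for it = 1:maxit
    [TV, pol] = max(U + beta*repmat(V', nk, 1), [], 2);   % Bellman operator on the grid
    if max(abs(TV - V)) < toler, V = TV; break, end
    V = TV;
end
kprime = kgrid(pol);                                 % approximate policy function k' = g(k)

Plotting kprime against kgrid, and kprime − kgrid against kgrid, gives the analogues of the two figures discussed below.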

    Evaluating the Results
    Once the program has converged, aspects of the policy function can be explored.
    The program produces two plots. The first, (Figure 5.1 below), plots the policy
    function: k′ as a function of k. The policy function is upward sloping as argued
    earlier. The second, (Figure 5.2), plots the level of net investment (k′ − k) for
    each level of k in the state space. This line crosses zero at the steady state and
is downward sloping. So, for values of k below the steady state the capital stock
    is increasing (net investment is positive) while for k above k∗, net investment is
    negative.
    [Figure 5.1 approximately here]
    [Figure 5.2 approximately here]
    The program also allows you to calculate transition dynamics starting from an
    (arbitrary) initial capital stock. There are at least two interesting exercises one can
    perform from this piece of the code.
    Exercise 5.2
    1. Study how other variables (output, consumption, the real interest rate) behave
    along the transition path. Explain the patterns of these variables.
    2. Study how variations in the parameters in Θ influence the speed and other
    properties of the transitional dynamics.
    5.3 Stochastic Growth Model
    We build upon the discussion of the nonstochastic growth model to introduce ran-
    domness into the environment. We start from a specification of the basic economic

    environment. The point is to make clear the nature of the intertemporal choice
    problem and the assumptions underlying the specification of preferences and tech-
    nology.
We then turn to the planner’s optimization problem. We take the approach
    of a planner with an objective of maximizing the expected lifetime utility of a
    representative agent.46 In this way, we can characterize allocations as the results
    of a single optimization problem rather than through the solution of a competitive
    equilibrium. Given that there are no distortions in the economy, it is straightforward
    to determine the prices that support the allocation as a competitive equilibrium.
    We do this later in a discussion of the recursive equilibrium concept.
    5.3.1 Environment
    The stochastic growth model we study here is based upon an economy with infinitely
    lived households. Each household consumes some of the single good (ct) and invests
    the remainder (it). Investment augments the capital stock (kt) with a one period lag:
    i.e. investment today creates more capital in the next period. There is an exogenous
    rate of capital depreciation denoted by δ ∈ (0, 1). For now, we assume there is a
    single good which is produced each period from capital and labor inputs.47 The
    capital input is predetermined from past investment decisions and the labor input
    is determined by the household.
    Fluctuations in the economy are created by shocks to the process of producing
    goods. Thus, “good times” represent higher productivity of both labor and capital
    inputs. The planner will optimally respond to these variations in productivity by
    adjusting household labor supply and savings (capital accumulation) decisions. Of
    course, investment is a forward looking decision since the new capital is durable
    and is not productive until the next period. Further, the extent to which the labor

    decision responds to the productivity variation depends, in part, on whether capital
    and labor are likely to be more productive in the future. Consequently, the serial
    correlation properties of the shocks are critical for understanding the responses of
    employment and investment.
More formally, the household’s preferences over consumption (ct) and leisure (lt)
    are given by:
Σ_{t=0}^{∞} βt u(ct, lt)
    where the discount factor β ∈ (0, 1). We will assume that the function u(c, l) is
    continuously differentiable and strictly concave. The households face a constraint
    on their time allocation:
    1 = lt + nt
    where the unit time endowment must be allocated between leisure and work (nt).
    The production side of the economy is represented by a constant returns to scale
    production function over the two inputs. Since scale is not determined, we model
    the economy as if there was a single competitive firm that hires the labor services
    of the households (Nt) and uses the households’ capital in the production process.
    The production function is expressed as:
    Yt = AtF (Kt, Nt)
    where F (K, N ) is increasing in both inputs, exhibits constant returns to scale and
    is strictly concave. Variations in total factor productivity, At will be the source
    of fluctuations in this economy. Here upper case variables refer to economywide
    aggregates and lower case variables are household (per capita) variables.

    Finally, there is a resource constraint: the sum of consumption and investment
    cannot exceed output in each period. That is:
    Yt = Ct + It.
    For characterizing the solution to the planner’s problem, this is all the informa-
    tion that is necessary. That is, given the statement of preferences, the production
    function and the time and resource constraints, the planner’s problem can be speci-
    fied. In fact, the natural approach might be to allow the planner to choose a sequence
    of history dependent functions that describe the choices of consumption, investment
    and employment for all time periods conditional on the state of the economy at that
    point in time. In this most general formulation the description of the state would
    include all productivity shocks and the value of the capital stock.
    Instead of solving a planner’s problem in which the choice is a sequence of state
    contingent functions, the tools of dynamic programming can be used. We turn to
    that approach now.
    5.3.2 Bellman’s Equation
    To begin the analysis, we assume that labor is inelastically supplied at one unit per
    household. Thus we consider preferences represented by u(c). This allows us to
    focus on the dynamics of the problem. Of course, we will want to include a labor
    supply decision before confronting the data, else we would be unable to match any
    moments with labor variations. Hence we turn to the more general endogenous
    labor supply formulation later.
    In this case, we use the constant returns to scale assumption on F (K, N ) to
    write per capita output (yt) as a strictly concave function of the per capita capital
    stock (kt):

    yt ≡ AtF (Kt/N, 1) ≡ Atf (kt).
    As F (K, N ) exhibits constant returns to scale, f (k) will be strictly concave. Bell-
    man’s equation for the infinite horizon stochastic growth model is specified as
    V (A, k) = maxk′ u(Af (k) + (1 − δ)k − k′) + βEA′|AV (A′, k′) (5.6)
    for all (A, k). Here the transition equation used to construct (5.6) is k′ = Af (k) +
    (1 − δ)k − c.
    An important element of this model is the multiplicative productivity shock.
    Through the introduction of this shock, the model is constructed to capture pro-
    cyclical fluctuations in productivity. An important question is whether the fluc-
    tuations in output, employment, consumption, investment, etc. induced by these
    shocks match relevant features of the data.
    For the quantitative analysis, we assume that A is a bounded, discrete random
    variable that follows a first-order Markov process. The transition matrix is given by
    Π and this is, implicitly, used in the conditional expectation in (5.6).48
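As a small illustration, a two-state chain and a simulated draw from it can be set up as follows; the states and transition probabilities below are arbitrary numbers used only for the example.

Avals = [0.95; 1.05];               % low and high productivity states (placeholders)
Pi    = [0.9 0.1; 0.1 0.9];         % Pi(i,j) = Prob(A' = Avals(j) | A = Avals(i))

rng(4);  T = 100;  s = ones(T,1);   % start in the low state
for t = 1:T-1
    s(t+1) = 1 + (rand > Pi(s(t), 1));   % with two states, compare a uniform draw to row s(t)
end
Apath = Avals(s);                   % simulated productivity path

In the conditional expectation of (5.6), the row Pi(i, :) would multiply the value function evaluated on the grid of A′ values.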
    As in the general discussion of Chapter 2, one important question is whether
there exists a solution to the functional equation. A second is characterizing the
    optimal policy function.
    For the growth model, it is important to be sure that the problem is bounded.
    For this, let k̄ solve:
k̄ = A⁺f (k̄) + (1 − δ)k̄ (5.7)
    where A+ is the largest productivity shock. Since consumption must be non-
    negative, then, from the transition equation, the k that solves this expression is
    the largest amount of capital that this economy could accumulate. Since f (k) is
    strictly concave, there will exist a unique finite value of k̄ that satisfies (5.7). This

    then implies that the largest level of consumption is also k̄: the largest feasible
    consumption occurs when the largest capital stock is consumed in a single period.
    Thus we can bound utility by u(k̄).
    Given that we have bounded the problem, assumed that the discount factor is
less than one and assumed the shocks follow a bounded, first-order Markov process,
    the results from Chapter 2 will apply. Thus we know that there exists a unique
    value function V (A, k) that solves (5.6). Further, we know that there is a policy
    function given by: k′ = φ(A, k).
    Our goal is to learn more about the properties of this solution. To stress an
    important point, the policy function represents the bridge from the optimization
    problem to the data. The policy function itself depends on the underlying struc-
    tural parameters and delivers a relationship between variables, some of which are
    observable. So, the inference problem is clean: what can we determine about the
    structural parameters from observations on output, capital, consumption, produc-
    tivity, etc.?
    5.3.3 Solution Methods
    Linearization
    One approach to characterizing a solution to the stochastic growth model written
    above is through analysis of the resource constraints and the intertemporal Euler
    equation. The latter is a necessary condition for optimality and can be obtained
directly from the sequence problem representation of the planner’s problem. Alternatively, using Bellman’s equation, the first-order condition for the planner is
    u′(Af (k) + (1 − δ)k − k′) = βEA′|AVk′(A′, k′) (5.8)

    for all (A, k). Though we do not know V (A, k), we can solve for its derivative. From
    (5.6),
Vk(A, k) = u′(c)[Af ′(k) + (1 − δ)].
Substituting this into (5.8) and evaluating it at (A′, k′) implies:
u′(c) = βEA′|A u′(c′)[A′f ′(k′) + (1 − δ)] (5.9)
    where
    c = Af (k) + (1 − δ)k − k′ (5.10)
    and c′ is defined accordingly. These two expressions, along with the evolution of A
(specified below), define a system of equations. So, one can represent the optimal
    growth model as a system of first order stochastic difference equations in (c, k, A).
    In order to approximately characterize this solution, it is common to linearize
    this condition and the resource constraints around the steady state, (c∗, k∗).49 To
    do so, we fix A at its mean value, Ā. The steady state value of the capital stock
    will then satisfy:
    1 = β[Āf ′(k∗) + (1 − δ)]. (5.11)
    Further, in steady state k′ = k = k∗ so the steady state level of consumption satisfies
    c∗=Āf (k∗) − δk∗.
    Following King et al. (1988), let ĉt, k̂t and Ât denote percent deviations from
their steady state values respectively. So, for example, x̂t ≡ (xt − x∗)/x∗. Assume that in
    terms of deviations from mean, the shocks follow a first-order autoregressive process,
    Ât+1 = ρÂt + εt+1 with ρ ∈ (0, 1).
    Then we can rewrite the Euler condition, (5.9), as:
    ξĉt = ξĉt+1 + νρÂt + νχk̂t+1 (5.12)

where ξ is the elasticity of the marginal utility of consumption, ξ ≡ u′′(c∗)c∗/u′(c∗). The parameter ν ≡ βĀf ′(k∗), which equals 1 − β(1 − δ) in the steady state. The parameter ρ is the serial correlation of the deviation of the shock from steady state and χ ≡ f ′′(k∗)k∗/f ′(k∗) is the elasticity of the marginal product of capital with respect to capital.
    The resource condition, (5.10), can be approximated by:
k̂t+1 = (1/β) k̂t + (δ/(1 − sc)) Ât − (sc/(1 − sc)) δ ĉt. (5.13)
    Here sc is consumption’s steady state share of output.
    If the researcher specifies a problem such that preferences and the production
    function exhibit constant elasticities then, ξ and χ are fixed parameters and one does
    not have to ever solve explicitly for a steady state. For example, if the production
    function is Cobb-Douglas where α is capital’s share, then χ is simply (α − 1).
    Likewise, ν just depends on the discount factor and the rate of physical capital
    depreciation. Finally, the consumption share sc is just a function of the parameters
    of the economy as well.
    For example, in the Cobb-Douglas case, (5.11) can be written as:
    1 = β[α(y∗/k∗) + (1 − δ)]
    where y∗ is the steady state level of output. Since the steady state level of investment
    i∗ = δk∗, then this can be rewritten as:
    1 = β[αδ/(1 − sc) + (1 − δ)].
    Solving this,
(1 − sc) = βαδ/(1 − β(1 − δ))
    so that sc can be calculated directly from the underlying parameters.
This approach thus delivers a log-linearized system whose parameters are deter-
    mined by the underlying specification of preferences, technology and the driving

    processes of the economy. This system can be simplified by solving out for ĉt yield-
    ing a stochastic system characterizing the evolution of the state variables, i.e. the
    system can be written solely in terms of (Ât, k̂t). At this point, the response of
    the system to productivity innovations can be evaluated and, as discussed further
    below, taken to the data.50
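Since ξ, ν, χ and sc are simple functions of the underlying parameters in the constant elasticity case, the coefficients of (5.12) and (5.13) can be computed directly, as the short Matlab sketch below illustrates for CRRA preferences and a Cobb-Douglas technology; the numerical values are arbitrary.

alpha = 0.33; beta = 0.96; delta = 0.1; sigmac = 2; rho = 0.9;   % arbitrary parameter values

xi  = -sigmac;                                   % u''(c*)c*/u'(c*) for u(c) = c^(1-sigma)/(1-sigma)
nu  = 1 - beta*(1-delta);                        % beta*Abar*f'(k*) in steady state
chi = alpha - 1;                                 % f''(k*)k*/f'(k*) for f(k) = k^alpha
sc  = 1 - beta*alpha*delta/(1 - beta*(1-delta)); % consumption share of output

% (5.12): xi*chat_t = xi*chat_{t+1} + nu*rho*Ahat_t + nu*chi*khat_{t+1}
% (5.13): khat_{t+1} = (1/beta)*khat_t + (delta/(1-sc))*Ahat_t - (sc/(1-sc))*delta*chat_t
disp([xi nu chi sc])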
    Value Function Iteration
    Instead of obtaining an approximate solution by log-linearization, one can attack
    the dynamic programming problem directly. To more fully characterize a solution,
    we often resort to specific examples or numerical analysis.
    As a leading example, assume that u(c) = ln(c) and that the rate of depreciation
    of capital is 100%. Further, suppose that the process for the shocks is given by
    lnA′ = ρ lnA + ε
    where ρ ∈ (−1, 1), so that the process is stationary. Finally, suppose that the
    production function has the form f (k) = Akα.
    With these restrictions, the Euler equation (5.9) reduces to:
1/c = βEA′|A[A′αk′^{α−1}/c′]. (5.14)
    Note that here we take the expectation, over A′ given A, of the ratio since future
    consumption, c′, will presumably depend on the realized value of the productivity
    shock next period.
    To solve for the policy function, we make a guess and verify it.51 We assert that
    the policy function k′ = φ(A, k) is given by
    φ(A, k) = λAkα
    where λ is an unknown constant. That is, we will try a guess that the future capital
    is proportional to output which is quite similar to the policy function we deduced

    for the example of the nonstochastic growth model. Given the resource constraint,
    this implies
    c = (1 − λ)Akα.
    To verify this guess and determine λ, we use this proposed policy function in (5.14).
    This yields:
1/((1 − λ)Ak^α) = βEA′|A[A′αk′^{α−1}/((1 − λ)A′k′^{α})].
    Solving for the policy function yields:
    k′ = βαAkα. (5.15)
    Hence our guess is verified and λ = αβ. This implies that consumption is propor-
    tional to income:
    c = (1 − βα)Akα. (5.16)
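Because (5.15) and (5.16) are in closed form, the stochastic behavior of this special case can be simulated in a few lines of Matlab; the parameter values below are arbitrary and the productivity process is the AR(1) in logs given above.

alpha = 0.33; beta = 0.96; rho = 0.8; sige = 0.05;   % arbitrary parameter values
rng(5);  T = 200;
lnA = zeros(T,1);  k = zeros(T,1);  c = zeros(T,1);
k(1) = (beta*alpha)^(1/(1-alpha));                   % nonstochastic steady state of (5.15)

for t = 1:T
    A    = exp(lnA(t));
    c(t) = (1 - beta*alpha)*A*k(t)^alpha;            % consumption rule (5.16)
    if t < T
        k(t+1)   = beta*alpha*A*k(t)^alpha;          % capital accumulation rule (5.15)
        lnA(t+1) = rho*lnA(t) + sige*randn;          % ln A' = rho*ln A + eps
    end
end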
    In this case, one can show that the value function that solves (5.6) is given by:
    V (A, k) = G + B ln(k) + D ln(A)
    for all (A, k), where G, B and D are unknown constants which we can solve for.
    If so, then using (5.15) and (5.16), the functional equation is given by:
G + B ln(k) + D ln(A) = ln((1 − βα)Akα) + β[G + B ln(βαAkα) + D EA′|A ln(A′)]. (5.17)
    for all (A, k). Importantly, there is no maximization here as we have substituted the
policy function into the functional equation.52 Since EA′|A ln(A′) = ρ ln A, we make
    use of the fact that this relationship holds for all (A, k) and group terms together
    as we did in the analysis of the nonstochastic growth model. So the constants must

    be the same on both sides of (5.17):
    G = ln(1 − βα) + βG + βB ln(βα).
    Similarly, for the coefficients multiplying ln(k), we must have:
    B = α + βBα.
    Finally, with respect to ln(A),
    D = 1 + βB + βDρ.
So if (G, B, D) solve this system of equations, then they solve the functional
    equation. As this solution is unique, we verify our guess. While tedious, one can
    show that the solution is:
G = [ln(1 − βα) + β(α/(1 − βα)) ln(βα)]/(1 − β),   B = α/(1 − βα),   D = 1/((1 − βρ)(1 − βα)).
    Note here the role of discounting: if β = 1, then G is infinity.
    Unfortunately, this is a very special case. We will use it again when we discuss
    empirical implications of the stochastic growth model.
    Exercise 5.3
    Verify that if there is less than 100% depreciation, the solution given by φ(A, k) =
    λAkα fails.
    Outside of the special examples, one is left with a direct analysis of (5.6). It is
    straightforward to apply the analysis of Chapter 2 to this problem so that a solution
    to the functional equation will exist.53 Further, one can show that the value function
    is a strictly concave function of k. Consequently, the policy function is increasing

    in k. To see this, consider (5.8). An increase in k will increase the left side of this
    expression. If k′ doesn’t rise, then (5.8) will not hold since the right side, from the
    concavity of V (A, k) is a decreasing function of k′.
    Further details about the policy function require numerical analysis. One can
    build a stochastic version of the program termed grow.m that was discussed above.
    Doing so is a good exercise to be sure that you understand how to write a value
    function iteration program.54 We take this up again in the next section once we
    introduce a labor supply decision to the model.
    Exercise 5.4
    Drawing on grow.m, write a value function iteration program to find the solution
    to (5.6).
    5.3.4 Decentralization
    To study the decentralized economy, the household’s problem must be supplemented
    by a budget constraint and the sources of income (labor income, capital income,
    profits) would have to be specified along with the uses of these funds (consumption,
    savings). Likewise, the firm’s demands for labor and capital inputs will have to be
    specified as well. We discuss these in turn using the recursive equilibrium concept.55
    The firm’s problem is static as we assume the households hold the capital. Thus
    the firm rents capital from the household at a price of r per unit and hires labor at
    a wage of ω per hour. The wage and rental rates are all in terms of current period
    output. Taking these prices as given, the representative firm will maximize profits
    by choosing inputs (K, N ) such that:
AFN (K, N) = ω and AFK (K, N) + (1 − δ) = r.

    Here we stipulate that the capital rental agreement allows the firm to use the capital
    and to retain the undepreciated capital which it then sells for the same price as
    output in the one-sector model. Due to the constant returns to scale assumption,
    the number and size of the firms is not determined. We assume for simplicity that
    there is a single firm (though it acts competitively) which employs all the capital
    and labor in the economy, denoted by upper case letters.
    For the households, their problem is:
    V (A, k, K) = maxk′u(r(K)k + ω(K) + Π − k′) + βEA′|AV (A′, k′, K′) (5.18)
    where Π is the flow of profits from the firms to the households. This is a different
    expression than (5.6) as there is an additional state variable, K. Here k is the
    household’s own stock of capital while K is the per capita capital stock economy
    wide. The household needs to know the current value of K since factor prices
    depend on this aggregate state variable through the factor demand equations. This
    is indicated in (5.18) by the dependence of r(K) and ω(K) on K.
    Let K′ = H(A, K) represent the evolution of the aggregate capital stock. As
    the household is competitive, it takes the evolution of the aggregate state variable
    as given. Thus the household takes current and future factor prices as given.
    The first-order condition for the household’s capital decision is:
u′(c) = βE Vk(A′, k′, K′). (5.19)
Here the household uses the law of motion for K. Using (5.18), we know that Vk = r(K)u′(c) so that the first-order condition can be written as:
    u′(c) = βEr′u′(c′). (5.20)
    A recursive equilibrium is comprised of:

    • factor price functions: r(K) and ω(K)
    • individual policy functions: h(A, k, K) from (5.18)
• a law of motion for K: H(A, K)
    such that:
    • households and firms optimize
    • markets clear
• H(A, K) = h(A, K, K)
    By using the first-order conditions from the factor demand of the operating firm,
it is easy to see that the solution to the planner’s problem is a recursive equilibrium.
    5.4 A Stochastic Growth Model with Endogenous
    Labor Supply
    We now supplement the version of the stochastic growth model given above with
    an endogenous labor supply decision. For now, we retain the perspective of the
    planner’s problem and discuss decentralization later in this section.
    5.4.1 Planner’s Dynamic Programming Problem
    Supplementing preferences and the technology with a labor input, the modified
    planner’s problem is given by:
V (A, k) = max_{k′,n} u(Af (k, n) + (1 − δ)k − k′, 1 − n) + βEA′|AV (A′, k′). (5.21)
    for all (A, k). Here the variables are measured in per capita terms: k and n are the
    capital and labor inputs per capita.

    The optimization problem entails the dynamic choice between consumption and
    investment that was key to the stochastic growth model with fixed labor input. In
    addition, given k′, (5.21) has a “static” choice of n.56 This distinction is impor-
    tant when we turn to a discussion of programming the solution to this functional
    equation.
    For given (A, k, k′), define σ(A, k, k′) from:
σ(A, k, k′) = max_n u(Af (k, n) + (1 − δ)k − k′, 1 − n) (5.22)
    and let n = φ̂(A, k, k′) denote the solution to the optimization problem. The first-
    order condition for this problem is given by:
    uc(c, 1 − n)Afn(k, n) = ul(c, 1 − n). (5.23)
    This condition equates the marginal gain from increasing employment and consum-
    ing the extra output with the marginal cost in terms of the reduction in leisure time.
    This is clearly a necessary condition for optimality: in an optimal solution, this type
    of static variation should not increase welfare.
    Thus given the current productivity shock and the current capital stock and
    given a level of capital for the future, φ̂(A, k, k′) characterizes the employment
    decision. We can think of σ(A, k, k′) as a return function given the current state
    (A, k) and the control (k′).
    Using the return function from this choice of the labor input, rewrite the func-
    tional equation as:
V (A, k) = max_{k′} σ(A, k, k′) + βEA′|AV (A′, k′). (5.24)
    for all (A, k). This has the same structure as the stochastic growth model with a
    fixed labor supply though the return function, σ(A, k, k′), is not a primitive object.
    Instead, it is derived from a maximization problem and thus inherits its properties
    from the more primitive u(c, 1 − n) and f (k, n) functions. Using the results in

    Chapter 2, there will be a solution to this problem and a stationary policy function
    will exist. Denote the policy function by k′ = h(A, k).
    The first-order condition for the choice of the future capital stock is given by:
σk′(A, k, k′) + βEA′|AVk′(A′, k′) = 0
    where the subscripts denote partial derivatives. Using (5.24), we can solve for
    EA′|AVk(A′, k′) yielding an Euler equation:
    −σk′(A, k, k′) = βEA′|Aσk′(A′, k′, k′′).
    Using (5.22), this can be rewritten in more familiar terms as:
uc(c, 1 − n) = βEA′|A[uc(c′, 1 − n′)[A′fk(k′, n′) + (1 − δ)]] (5.25)
    where c = Af (k, n) + (1 − δ)k − k′ and c′ is defined similarly. This Euler equation is
    another necessary condition for an optimum: else a variation in the level of savings
    could increase lifetime expected utility.
    The policy functions will exhibit a couple of key properties revolving around
    the themes of intertemporal substitution and consumption smoothing. The issue is
    essentially understanding the response of consumption and employment to a pro-
    ductivity shock. By intertemporal substitution, the household will be induced to
    work more when productivity is high. But, due to potentially offsetting income and
    substitution effects, the response to a productivity shocks will be lower the more
    permanent are these shocks.57 By consumption smoothing, a household will opti-
    mally adjust consumption in all periods to an increase in productivity. The more
    persistent is the shock to productivity, the more responsive will consumption be to
    it.58

    5.4.2 Numerical Analysis
    A discussion along the same lines as that for the stochastic growth model with
    fixed labor input applies here as well. As in King et al. (1988), one can attack the
    set of necessary conditions ((5.23), (5.25) and the resource constraint) through a
    log-linearization procedure. The reader is urged to study that approach from their
    paper.
    Alternatively, one can again simply solve the functional equation directly. This
    is just an extension of the programming exercise given at the end of the previous
    section on the stochastic growth model with fixed labor supply. The outline of the
    program will be discussed here leaving the details as an additional exercise.
    The program should be structured to focus on solving (5.24) through value func-
    tion iteration. The problem is that the return function is derived and thus must be
    solved for inside of the program. The researcher can obtain an approximate solution
    to the employment policy function, given above as φ̂(A, k, k′). This is achieved by
    specifying grids for the shocks, the capital state space and the employment space.59
    As noted earlier, this is the point of approximation in the value function iteration
    routine: finer grids yield better approximations but are costly in terms of computer
    time. Once φ̂(A, k, k′) is obtained, then
    σ(A, k, k′) = u(Af (k, φ̂(A, k, k′)) + (1 − δ)k − k′, 1 − φ̂(A, k, k′))
    can be calculated and stored. This should all be done prior to starting the value
    function iteration phase of the program. So, given σ(A, k, k′), the program would
    then proceed to solve (5.24) through the usual value function iteration routine.
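A sketch of this pre-computation step is given below for the separable utility u(c, 1 − n) = ln(c) + ξ(1 − n) and a Cobb-Douglas technology. The grids, parameter values and two-state productivity process are placeholders; an actual program would use much finer grids.

alpha = 0.33; delta = 0.1; xi = 2;                          % placeholder parameters
Avals = [0.95 1.05];            nA = numel(Avals);
kgrid = linspace(0.5, 5, 50);   nk = numel(kgrid);
ngrid = linspace(0.05, 0.95, 50);                           % employment grid

sigma_ret = -Inf(nA, nk, nk);                               % return function sigma(A, k, k')
npol      = zeros(nA, nk, nk);                              % employment rule phi_hat(A, k, k')
for ia = 1:nA
    for ik = 1:nk
        for jk = 1:nk
            c = Avals(ia)*kgrid(ik)^alpha*ngrid.^(1-alpha) ...
                + (1-delta)*kgrid(ik) - kgrid(jk);          % consumption for each n on the grid
            val = log(max(c, realmin)) + xi*(1 - ngrid);    % static objective in (5.22)
            val(c <= 0) = -Inf;                             % infeasible choices
            [sigma_ret(ia,ik,jk), in] = max(val);           % best n on the employment grid
            npol(ia,ik,jk) = ngrid(in);
        end
    end
end
% sigma_ret is then stored and used as the return function in the value
% function iteration on (5.24)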
    The output of the program is then the policy function for capital accumulation,
    k′ = h(A, k), and a policy function for employment, n = φ(A, k) where
    φ(A, k) = φ̂(A, k, h(A, k)).

    Hence both of these policy functions ultimately depend only on the state variables,
    (A, k). These policy functions provide a link between the primitive functions (and
    their parameters) and observables. We turn now to a discussion of exploiting that
    link as the stochastic growth model confronts the data.
    5.5 Confronting the Data
    Since Kydland and Prescott (1982), macroeconomists have debated the empirical
    success of the stochastic growth model. This debate is of interest both because of
    its importance for the study of business cycles and for its influence on empirical
    methodology. Our focus here is on the latter point as we use the stochastic growth
    model as a vehicle for exploring alternative approaches to the quantitative analysis
    of dynamic equilibrium models.
    Regardless of the methodology, the link between theory and data is provided by
    the policy functions. To set notation, let Θ denote a vector of unknown parameters.
    We will assume that the production function is Cobb-Douglas and is constant returns
    to scale. Let α denote capital’s share. Further, we will assume that
    u(c, 1 − n) = ln(c) + ξ(1 − n)
    as our specification of the utility function.60 Thus the parameter vector is:
    Θ = (α, δ, β, ξ, ρ, σ)
    where α characterizes the technology, δ determines the rate of depreciation of the
    capital stock, β is the discount factor, and ξ parameterizes preferences. The tech-
    nology shock process is parameterized by a serial correlation (ρ) and a variance (σ).
    To make clear that the properties of this model economy depend on these parame-
    ters, we index the policy functions by Θ: k′ = hΘ(A, k) and n = φΘ(A, k). At this

    point, we assume that for a given Θ these policy functions have been obtained from
    a value function iteration program. The question is then how to estimate Θ.
    5.5.1 Moments
    One common approach to estimation of Θ is based upon matching moments. The
    researcher specifies a set of moments from the data and then finds the value of
    Θ to match (as closely as possible) these moments. A key element, of course, is
    determining the set of moments to match.
The presentation in Kydland and Prescott (1982) is a leading example of one ver-
    sion of this approach termed calibration. Kydland and Prescott consider a much
richer model than that presented in the previous section as they include a sophisticated time-to-build model of capital accumulation, non-separable preferences, and a signal extraction problem associated with the technology shock. They pick the pa-
    rameters for their economy using moments obtained from applied studies and from
    low frequency observations of the U.S. economy. In their words,
    ”Our approach is to focus on certain statistics for which the noise intro-
    duced by approximations and measurement errors is likely to be small
    relative to the statistic.”
    Since the model we have studied thus far is much closer to that analyzed by
    King, Plosser and Rebelo, we return to a discussion of that paper for an illustration
    of this calibration approach.61 King, Plosser and Rebelo calibrate their parameters
    from a variety of sources. As do Kydland and Prescott, the technology parameter is
    chosen to match factor shares. The Cobb-Douglas specification implies that labor’s
    share in the National Income and Product Accounts should equal (1−α). The rate of
    physical depreciation is set at 10% annually and the discount rate is chosen to match
    a 6.5% average annual return on capital. The value of ξ is set so that on average

    hours worked are 20% of total hours corresponding to the average hours worked
    between 1948 and 1986. King, Plosser and Rebelo use variations in the parameters
    of the stochastic process (principally ρ) as a tool for understanding the response of
    economic behavior as the permanence of shocks is varied. In other studies, such as
    Kydland and Prescott, the parameters of the technology shock process is inferred
    from the residual of the production function.
    Note that for these calibration exercises, the model does not have to be solved
    in order to pick the parameters. That is, the policy functions are not actually used
    in the calibration of the parameters. Instead, the parameters are chosen by looking
    at evidence that is outside of business cycle properties, such as time series averages.
    Comparing the model’s predictions against actual business cycle moments is thus
    an informal overidentification exercise.
    The table below shows a set of moments from U.S. data as well as the predic-
    tions of these moments from the King, Plosser and Rebelo model parameterized as
    described above.62 The first set of moments is the standard deviation of key macroe-
    conomic variables relative to output. The second set of moments is the correlation
    of these variables with respect to output.
    [Table 5.1 approximately here]
    In this literature, this is a common set of moments to study. Note that the
    stochastic growth model, as parameterized by King, Plosser and Rebelo exhibits
    many important features of the data. In particular, the model produces consumption
    smoothing as the standard deviation of consumption is less than that of output.
    Further, as in U.S. data, the variability of investment exceeds that of output. The
    cross correlations are all positive in the model as they are in the data. One apparent
    puzzle is the low correlation of hours and output in the data relative to the model.63
    Still, based on casual observation, the model “does well”. However, these papers do
    not provide “tests” of how close the moments produced by the model actually are
    to the data.
    Of course, one can go a lot further with this moment matching approach. Letting
    ΨD be the list of 8 moments from U.S. data shown in Table 5.1, one could solve the
    problem of:
minΘ (ΨD − ΨS(Θ)) W (ΨD − ΨS(Θ))′. (5.26)
    where ΨS(Θ) is a vector of simulated moments that depend on the vector of param-
    eters (Θ) that are inputs into the stochastic growth model. As discussed in Chapter
    4, W is a weighting matrix. So, for their parameterization, the ΨS(Θ) produced
    by the KPR model is simply the column of moments reported in Table 5.1. But,
    as noted earlier, the parameter vector was chosen based on other moments and
    evidence from other studies.
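As a minimal sketch of how (5.26) could be set up numerically (assuming a hypothetical routine simulate_moments(theta) that solves the model at a candidate Θ, simulates it, and returns the corresponding eight moments; the data moments and weighting matrix below are placeholders):

    import numpy as np
    from scipy.optimize import minimize

    psi_data = np.zeros(8)        # placeholder: fill with the eight moments from Table 5.1
    W = np.eye(8)                 # identity weighting matrix, for simplicity

    def simulate_moments(theta):
        # Hypothetical: solve the model at theta (e.g. by value function iteration),
        # simulate it, and return the same eight moments computed on simulated data.
        raise NotImplementedError

    def smm_objective(theta):
        gap = psi_data - simulate_moments(theta)
        return gap @ W @ gap      # (Psi_D - Psi_S(theta)) W (Psi_D - Psi_S(theta))'

    # theta0 = np.array([alpha, delta, beta, xi, rho, sigma])   # starting values
    # result = minimize(smm_objective, theta0, method="Nelder-Mead")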
    Exercise 5.5
Using a version of the stochastic growth model to create the mapping ΨS(Θ), solve (5.26).
    5.5.2 GMM
    Another approach, closer to the use of orthogonality conditions in the GMM ap-
    proach, is used by Christiano and Eichenbaum (1992). Their intent is to enrich
the RBC model to encompass observations on the correlation between the labor input (hours worked) and the return to working (the wage and/or the average product of labor). To do so, they add shocks to government purchases, financed by
    lump-sum taxes. Thus government shocks influence the labor choice of households
    through income effects. For their exercise, this is important as this shift in labor
    supply interacts with variations in labor demand thereby reducing the excessively
    high correlation between hours and the return to work induced by technology shocks
    alone.
    While the economics here is of course of interest, we explore the estimation
    methodology employed by Christiano and Eichenbaum. They estimate eight pa-
rameters: the rate of physical depreciation (δ), the labor share of the Cobb-Douglas technology (α), a preference parameter for the household's marginal rate of substitution
    between consumption and leisure (γ), as well as the parameters characterizing the
    distributions of the shocks to technology and government spending.
    Their estimation routine has two phases. In the first, they estimate the param-
    eters and in the second they look at additional implications of the model.
    For the first phase, they use unconditional moments to estimate these parame-
    ters. For example, using the capital accumulation equation, the rate of depreciation
    can be solved for as:
δ = 1 − (kt+1 − it)/kt.
    Given data on the capital stock and on investment, an estimate of δ can be ob-
    tained as the time series average of this expression. 64 Note that there is just
    a single parameter in this condition so that δ is estimated independently of the
    other parameters of the model. Building on this estimate, Christiano and Eichen-
    baum then use the intertemporal optimality condition (under the assumption that
u(c) = ln(c)) to determine capital's share in the production function. They proceed in this fashion, using unconditional moments to identify each of the structural parameters.
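A minimal sketch of the first of these steps, assuming hypothetical series for the capital stock (of length T + 1) and investment (of length T):

    import numpy as np

    def estimate_delta(k, inv):
        # Time series average of 1 - (k_{t+1} - i_t)/k_t, the depreciation rate
        # implied by the capital accumulation identity k_{t+1} = (1 - delta) k_t + i_t.
        k = np.asarray(k, dtype=float)
        inv = np.asarray(inv, dtype=float)
        return np.mean(1.0 - (k[1:] - inv) / k[:-1])

    # delta_hat = estimate_delta(capital_series, investment_series)   # hypothetical data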
    Christiano and Eichenbaum then construct a larger parameter vector, termed Φ,
    which consists of the parameters described above from their version of the stochastic
    growth model and a vector of second moments from the data. They place these
    moments within the GMM framework. Given this structure, they can use GMM to
    estimate the parameters and to obtain an estimate of the variance covariance matrix
    which is then used to produce standard errors for their parameter estimates. 65
The point of the paper is to confront observations on the correlation of hours and the average product of labor, corr(y/n, n), and the relative standard deviations of the labor input and the average productivity of labor, σn/σy/n. They test whether
    their model, at the estimated parameters, is able to match the values of these
    moments in the data. Note that this is in the spirit of an overidentification test
    though the model they estimate is just identified. They find that the stochastic
    growth model with the addition of government spending shocks is unable (with one
    exception) to match the observations for these two labor market statistics. The
    most successful version of the model is estimated with establishment data, assumes
    that the labor input is indivisible and government spending is not valued at all by
    the households.66
    5.5.3 Indirect Inference
    Smith (1993) illustrates the indirect inference methodology using a version of the
    simple stochastic growth model with fixed employment, as in (5.6). There is one
    important modification: Smith considers an accumulation equation of the form:
    k′ = k(1 − δ) + Ztit
    where Zt is a second shock in the model. Greenwood et al. (1988) interpret this as
a shock to new investment goods and Cooper and Ejarque (2000) view this as an
    “intermediation shock”.
    With this additional shock, the dynamic programming problem for the represen-
    tative household becomes:
V (A, Z, k) = maxk′,n u(Af (k, n) + ((1 − δ)k − k′)/Z, 1 − n) + βEA′,Z′|A,Z V (A′, Z′, k′). (5.27)
    Note the timing here: the realized value of Z is known prior to the accumulation
    decision. As with the stochastic growth model, this dynamic programming problem
    can be solved using value function iteration or by linearization around the steady
    state.
    From the perspective of the econometrics, by introducing this second source of
    uncertainty, the model has enough randomness to avoid zero likelihood observations.67
    As with the technology shock, there is a variance and a serial correlation parame-
    ter used to characterize this normally distributed shock. Smith assumes that the
    innovations to these shocks are uncorrelated.
    To take the model to the data, Smith estimates a VAR(2) on log detrended
    quarterly U.S. time series for the period 1947:1-1988:4. The vector used for the
    analysis is:
xt = [yt it]′
    where yt is the detrended log of output and it is the detrended log of investment
    expenditures. With two lags of each variable, two constants and three elements of
    the variance-covariance matrix, Smith generates 13 coefficients.
    He estimates 9 parameters using the SQML procedure. As outlined in his paper
    and Chapter 3, this procedure finds the structural parameters which maximize the
    likelihood of observing the data when the likelihood function is evaluated at the
    coefficients produced by running the VARs on simulated data created from the
model at the candidate structural parameters. Alternatively, one could directly
    choose the structural parameters to minimize the difference between the VAR(2)
    coefficients on the actual and simulated data.
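A minimal sketch of that second route, assuming a hypothetical simulate_model(theta, T) that returns a (T x 2) array of simulated detrended log output and investment; the weighting is the identity for simplicity:

    import numpy as np

    def var2_coeffs(x):
        # OLS estimates of a VAR(2) for a (T x 2) array x: 8 slope coefficients,
        # 2 intercepts and 3 distinct covariance elements, i.e. 13 numbers in all.
        T = x.shape[0]
        Y = x[2:]
        X = np.hstack([np.ones((T - 2, 1)), x[1:-1], x[:-2]])   # constant, lag 1, lag 2
        B, *_ = np.linalg.lstsq(X, Y, rcond=None)
        resid = Y - X @ B
        S = resid.T @ resid / (T - 2)
        return np.concatenate([B.ravel(), S[np.triu_indices(2)]])

    def simulate_model(theta, T):
        # Hypothetical: solve the model at theta and return a (T x 2) array of
        # simulated detrended log output and investment.
        raise NotImplementedError

    def ii_objective(theta, x_data):
        gap = var2_coeffs(x_data) - var2_coeffs(simulate_model(theta, 10 * x_data.shape[0]))
        return gap @ gap                                         # identity weighting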
    5.5.4 Maximum Likelihood Estimation
Last but certainly not least, versions of the stochastic growth model have been estimated using the maximum likelihood approach. As in the indirect inference ap-
    proach, it is necessary to supplement the basic model with additional sources of
    randomness to avoid the zero likelihood problem. This point is developed in the
    discussion of maximum likelihood estimation in Kocherlakota et al. (1994). Their
    goal is to evaluate the contribution of technology shocks to aggregate fluctuations.
    Kocherlakota et al. (1994) construct a model economy which includes shocks to
    the production function and stochastic depreciation. In particular, the production
function is given by Yt = AtKt^α(NtXt)^(1−α) + Qt. Here Xt is exogenous technological
    progress, Yt is the output of the single good, Kt is the capital stock and Nt is the labor
    input. The transition equation for capital accumulation is: Kt+1 = (1 − δt)Kt + It
    where δt is the rate of depreciation and It is the level of investment.
    The authors first consider a version of the stochastic growth model without a
    labor input. They show that the linearized decision rules imply that consumption
    and the future capital stock are proportional to the current stock of capital.68
    They then proceed to the estimation of their model economy with these three
    sources of uncertainty. They assume that the shocks follow an AR(1) process.
    Kocherlakota et al. (1994) construct a representation of the equilibrium process
    for consumption, employment and output as a function of current and lagged values
    of the shocks. This relationship can then be used to construct a likelihood function,
    conditional on initial values of the shocks.
    Kocherlakota et al. (1994) fix a number of the parameters that one might ulti-
    mately be interested in estimating and focus attention on Σ, the variance-covariance
    matrix of the shocks. This is particularly relevant to their exercise of determining
    the contribution of technology shocks to fluctuations in aggregate output. In this
    regard, they argue that without additional assumptions about the stochastic process
    of the shocks, they are unable to identify the relative variances of the shocks.
    There are a number of other papers that have taken the maximum likelihood
    approach.69 Altug (1989) estimates a version of the Kydland and Prescott (1982)
    model with a single fundamental shock to technology and measurement error else-
    where. Altug (1989) finds some difficulty matching the joint behavior of labor and
    other series.
    Hall (1996) studies a version of a labor hoarding model which is then compared
    to the overtime labor model of Hansen and Sargent (1988). While the Hall (1996)
    paper is too complex to present here, the paper is particularly noteworthy for its
    comparison of results from estimating parameters using GMM and maximum like-
    lihood.
    5.6 Some Extensions
    The final section of this chapter considers extensions of the basic models. These are
    provided here partly as exercises for readers interested in going beyond the models
    presented here.70 One of the compelling aspects of the stochastic growth model is
    its flexibility in terms of admitting a multitude of extensions.
    5.6.1 Technological Complementarities
    As initially formulated in a team production context by Bryant (1983) and explored
    subsequently in the stochastic growth model by Baxter and King (1991), supple-
    menting the individual agent’s production function with a measure of the level of
    activity by other agents is a convenient way to introduce interactions across agents.71
    The idea is to introduce a complementarity into the production process so that high
levels of activity in other firms imply that a single firm is more productive as well.
    Let y represent the output at a given firm, Y be aggregate output, k and n the
    firm’s input of capital and labor respectively. Consider a production function of:
y = Ak^α n^φ Y^γ Y_{−1}^ε (5.28)
    where A is a productivity shock that is common across producers. Here γ param-
    eterizes the contemporaneous interaction between producers. If γ is positive, then
    there is a complementarity at work: as other agents produce more, the productiv-
    ity of the individual agent increases as well. In addition, this specification allows
    for a dynamic interaction as well parameterized by ε. As discussed in Cooper and
    Johri (1997), this may be interpreted as a dynamic technological complementarity
or even a learning-by-doing effect. This production function can be embedded into
    a stochastic growth model.
    Consider the problem of a representative household with access to a production
    technology given by (5.28). This is essentially a version of (5.21) with a different
    technology.
    There are two ways to solve this problem. The first is to write the dynamic
    programming problem, carefully distinguishing between individual and aggregate
    variables. As in our discussion of the recursive equilibrium concept, a law of motion
    must be specified for the evolution of the aggregate variables. Given this law of
    motion, the individual household’s problem is solved and the resulting policy func-
    tion compared to the one that governs the economy-wide variables. If these policy
    functions match, then there is an equilibrium. Else, another law of motion for the
    aggregate variables is specified and the search continues.72
Alternatively, one can use the first-order conditions for the individual's optimiza-
    tion problem. As all agents are identical and all shocks are common, the represen-
    tative household will accumulate its own capital, supply its own labor and interact
    with other agents only due to the technological complementarity. In a symmetric
    equilibrium, yt = Yt. As in Baxter and King (1991), this equilibrium condition is
    neatly imposed through the first-order conditions when the marginal products of la-
    bor and capital are calculated. From the set of first-order conditions, the symmetric
equilibrium can be analyzed through approximation around a steady state.
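As a check on how the complementarity enters, imposing the symmetric-equilibrium condition y = Y (and y_{−1} = Y_{−1}) in (5.28) gives, provided γ < 1,

y^{1−γ} = Ak^α n^φ y_{−1}^ε, or y = (Ak^α n^φ y_{−1}^ε)^{1/(1−γ)},

so in equilibrium the exponents on k and n are scaled up by the factor 1/(1 − γ).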
    The distinguishing feature of this economy from the traditional RBC model is
the presence of the technological complementarity parameters, γ and ε. It is possible
    to estimate these parameters directly from the production function or to infer them
    from the equilibrium relationships. 73
    5.6.2 Multiple Sectors
    The stochastic growth model explored so far has a single sector of production. Of
course this is just an abstraction which allows the researcher to focus on intertempo-
    ral allocations without being very precise about the multitude of activities arising
    contemporaneously.
    As an example, suppose there are two sectors in the economy. Sector one pro-
duces consumption goods and sector two produces investment goods.74 Let the pro-
    duction function for sector j = 1, 2 be given by:
yj = A^j f (kj, nj)
    Here there are sector specific total factor productivity shocks. An important issue
    for this model is the degree of correlation across the sectors of activity.
    Assuming that both capital and labor can be costlessly shifted across sectors of
    production, the state vector contains the aggregate stock of capital rather than its
    use in the previous period. Further, there is only a single accumulation equation for
    capital. The dynamic programming problem for the planner becomes:
    V (A1, A2, k) = max{kj ,nj}u(c, 1 − n) + βEA1′,A2′|A1,A2 V (A1′, A2′, k′). (5.29)
    subject to:
    c = A1f (k1, n1) (5.30)
    k′ = k(1 − δ) + A2f (k2, n2) (5.31)
    n = n1 + n2 (5.32)
    k = k1 + k2 (5.33)
    This optimization problem can be solved using value function iteration and the
    properties of the simulated economy can, in principle, be compared to data. For this
    economy, the policy functions will specify the state contingent allocation of capital
    and labor across sectors.
    Economies generally exhibit positive comovement of employment and output
    across sectors. This type of correlation may be difficult for a multi-sector economy
    to match unless there is sufficient correlation in the shocks across sectors.75
    This problem can be enriched by introducing costs of reallocating capital and/or
    labor across the sectors. At the extreme, capital may be entirely sector specific. In
    that case, the state space for the dynamic programming problem must include the
    allocation of capital across sectors inherited from the past. By adding this friction
    to the model, the flow of factors across the sectors may be reduced.
    Exercise 5.6
    Extend the code for the one sector stochastic growth model to solve (5.29). Use
    the resulting policy functions to simulate the model and compute moments as a
    function of key parameters, such as the correlation of the shocks across the sectors.
    Relate these to observed correlations across sectors.
    5.6.3 Taste Shocks
    Another source of uncertainty that is considered within the stochastic growth model
    allows for randomness in tastes. This may be a proxy for variations in the value
    of leisure brought about by technology changes in a home production function.
    Here we specify a model with shocks to the marginal rate of substitution between
    consumption and work. Formally, consider:
    V (A, S, k) = max{k′,n}u(c, 1 − n, S) + βEA′,S′|A,SV (A′, S′, k′) (5.34)
    subject to the usual production function and capital accumulation equations. Here
    S represents the shocks to tastes. This problem may be interpreted as a two sector
    model where the second sector produces leisure from time and a shock (S). Empir-
    ically this type of specification is useful as there is a shock, internal to the model,
    that allows the intratemporal first order condition to be violated, assuming that S
    is not observable to the econometrician.
    As usual, the policy functions will specify state contingent employment and
    capital accumulation. Again, the model can be solved, say through value function
    iteration, and then parameters selected to match moments of the data.
    Exercise 5.7
    Extend the code for the one sector stochastic growth model to solve (5.34). Use
    the resulting policy functions to simulate the model and compute moments as a
    function of key parameters, including the variance/covariance matrix for the shocks.
    Relate these to observed correlations from US data. Does the existence of taste shocks
    “help” the model fit the data better?
    5.6.4 Taxes
    One important extension of the stochastic growth model introduces taxes and gov-
    ernment spending. These exercises are partly motivated as attempts to determine
    the sources of fluctuations. Further, from a policy perspective, the models are used
    to evaluate the impacts of taxes and spending on economic variables and, given
    that the models are based on optimizing households, one can evaluate the welfare
    implications of various policies.
    McGrattan (1994) and Braun (1994) study these issues. We summarize the
    results and approach of McGrattan (1994) to elaborate on maximum likelihood
    estimation of these models.
    McGrattan (1994) specifies a version of the stochastic growth model with four
    sources of fluctuations: productivity shocks, government spending shocks, capital
    taxes and labor taxes. The government’s budget is balanced each period by the use
    of lump-sum taxes/transfers to the households. So, household preferences are given
    by U (c, g, n) where c is private consumption, g is public consumption and n is the
labor input.76 The budget constraint for the household in any period t is given by:
ct + it = (1 − τ^k_t)rtkt + (1 − τ^n_t)wtnt + δτ^k_t kt + Tt (5.35)
where it is investment by the household and the right side represents income from capital rentals, labor supply, depreciation allowances and a lump-sum transfer. Here τ^k_t and τ^n_t are the period t tax rates on capital and labor respectively. Given the
    presence of these distortionary taxes,
    McGrattan (1994) cannot appeal to a planner’s optimization problem to charac-
    terize optimal decision rules and thus works directly with a decentralized allocation.
    As in the above discussion of recursive equilibrium, the idea is to specify state
    contingent transitions for the aggregate variables and thus, in equilibrium, for rela-
    tive prices. These prices are of course relevant to the individual through the sequence
of budget constraints, (5.35). Individual households take these rules for the aggregate variables as given and optimize. In equilibrium, the representative household's choices
    and the evolution of the aggregate variables coincide.77
    McGrattan (1994) estimates the model using maximum likelihood techniques.
    To do so, the fundamental shocks are supplemented by measurement errors through
    the specification of a measurement equation. Assuming innovations are normally
    distributed, McGrattan (1994) can write down a likelihood function for the model
    economy. Given quarterly observations on output, investment, government pur-
    chases, hours, capital and the tax rates on capital and labor, the parameters of the
    model are estimated. Included in the list of parameters are those that characterize
    the utility function, production function as well as the stochastic process for the
    shocks in the system. McGrattan (1994) finds a capital share of 0.397, a discount
factor of 0.9927, and a capital depreciation rate of about 0.02. Interestingly, government purchases do not appear to enter directly into the household's utility function. Further, the log utility specification cannot be rejected.
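A minimal sketch of the likelihood evaluation for a generic linear state-space system (the McGrattan (1994) model is much richer; the matrices A, C, G and the measurement error covariance R below are placeholders for whatever the linearized model implies at a candidate parameter vector):

    import numpy as np

    def kalman_loglik(x, A, C, G, R):
        # Gaussian log likelihood of data x (T x n) for the linear state space
        #   s_{t+1} = A s_t + C w_{t+1},   x_t = G s_t + v_t,
        # with w ~ N(0, I) and measurement error v ~ N(0, R).
        T, n = x.shape
        m = A.shape[0]
        Q = C @ C.T
        s = np.zeros(m)                    # predicted state mean
        P = np.eye(m)                      # predicted state covariance (rough initialization)
        ll = 0.0
        for t in range(T):
            e = x[t] - G @ s               # prediction error
            F = G @ P @ G.T + R            # its variance
            ll += -0.5 * (n * np.log(2 * np.pi) + np.log(np.linalg.det(F))
                          + e @ np.linalg.solve(F, e))
            K = P @ G.T @ np.linalg.inv(F) # Kalman gain
            s = A @ (s + K @ e)            # update, then predict forward
            P = A @ (P - K @ G @ P) @ A.T + Q
        return ll

    # Maximizing kalman_loglik over the structural parameters that generate
    # (A, C, G, R) delivers the maximum likelihood estimates.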
    5.7 Conclusions
    The models presented in this chapter represent some simple versions of the stochastic
    growth model. This is one of the workhorse models of macroeconomics. There is an
    enormous literature about this model and solution techniques. The intention was
    more to provide insights into the solution and estimation of these models using the
    dynamic programming approach than to provide a case for or against the usefulness
    of these models in the evaluation of aggregate fluctuations.
    There is an almost endless list of extensions of the basic framework. Using the
    approach in this chapter, the researcher can solve these problems numerically and
    begin the task of confronting the models with data.

    Chapter 6
    Consumption
    6.1 Overview and Motivation
    The next two chapters study consumption. We devote multiple chapters to this
    topic due to its importance in macroeconomics and also due to the common (though
    unfortunate) separation of consumption into a study of (i) nondurables and services
    and (ii) durables.
    From the perspective of business cycle theory, consumption is the largest com-
    ponent of total expenditures. One of the main aspects of consumption theory is the
    theme of consumption smoothing (defined below). This is evident in the data as
the consumption of nondurables/services is not as volatile as income. Relatedly, durable expenditures are one of the more volatile elements in the GDP accounts.
    These are important facts that our theories and estimated models must confront.
    This chapter focuses on the consumption of nondurables and services. We start
    with a simple two-period model to build intuition. We then progress to more com-
    plex models of consumption behavior by going to the infinite horizon, adding various
    forms of uncertainty and also considering borrowing restrictions. In keeping with the
    theme of this book, we pay particular attention to empirical studies that naturally
    grow out of consideration of these dynamic optimization problems.
    6.2 Two-Period Problem
    The two-period problem is, as always, a good starting point to build intuition about
    the consumption and savings decisions. We start with a statement of this problem
    and its solution and then discuss some extensions.
    6.2.1 Basic Problem
The consumer maximizes the discounted present value of utility from consumption over the two-
    period horizon. Assuming that preferences are separable across periods, we represent
    lifetime utility as:
Σ^1_{t=0} β^t u(ct) = u(c0) + βu(c1) (6.1)
where β ∈ [0, 1] is called the discount factor. As you may know from the
    optimal growth model, this parameter of tastes is tied to the marginal product of
    capital as part of an equilibrium allocation; here it is treated as a fixed parameter.
Period 0 is the initial period, making use of β^0 = 1.
    The consumer is endowed with some initial wealth at the start of period 0 and
    earns income yt in period t=0,1. For now, these income flows are exogenous; we
    later discuss adding a labor supply decision to the choice problem. We assume that
    the agent can freely borrow and lend at a fixed interest rate between each of the two
    periods of life. Thus the consumer faces a pair of constraints, one for each period
    of life, given by:
    a1 = r0(a0 + y0 − c0)
    and
    a2 = r1(a1 + y1 − c1).
    Here yt is period t income and at is the agent’s wealth at the start of period t.
    It is important to appreciate the timing and notational assumptions made in these
    budget constraints. First, rt represents the gross return on wealth between period
    t and period t+1. Second, the consumer earns this interest on wealth plus income
    less consumption over the period. It is as if the income and consumption decisions
    were made at the start of the period and then interest was earned over the period.
    Nothing critical hinges on these timing decisions but it is necessary to be consistent
    about them.
    There are some additional constraints to note. First, we restrict consumption to
    be non-negative. Second, the stock of assets remaining at the end of the consumer’s
    life (a2) must be non-negative. Else, the consumer would set a2 = −∞ and die (rel-
    atively happily) with an enormous outstanding debt. We leave open the possibility
    of a2 > 0.
This formulation of the consumer's constraints is similar to the ones used
    throughout this book in our statement of dynamic programming problems. These
    constraints are often termed flow constraints since they emphasize the intertemporal
    evolution of the stock of assets being influenced by consumption. As we shall see,
it is natural to think of the stock of assets as a state variable and consumption as
    a control variable.
    There is an alternative way to express the consumer’s constraints that combines
    these two flow conditions by substituting the first into the second. After some
    rearranging, this yields:
    a2/(r1r0) + c1/r0 + c0 = (a0 + y0) + y1/r0 (6.2)
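To see the substitution explicitly, write the period 1 constraint as a1 = a2/r1 + c1 − y1, set this equal to r0(a0 + y0 − c0) from the period 0 constraint, and divide by r0:

a2/(r1r0) + c1/r0 − y1/r0 = a0 + y0,

which rearranges to (6.2).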
    The left side of this expression represents the expenditures of the consumer on goods
    in both periods of life and on the stock of assets held at the start of period 2. The
    right side measures the total amount of resources available to the household for
spending over its lifetime. This is a type of “sources” vs. “uses” formulation of
    the lifetime budget constraint. The numeraire for this expression of the budget
    constraint is period 0 consumption goods.
    Maximization of (6.1) with respect to (c0, c1) subject to (6.2) yields:
u′(c0) = λ = βr0u′(c1) (6.3)
    as a necessary condition for optimality where λ is the multiplier on (6.2). This is
    an intertemporal first order condition (often termed the consumer’s Euler equation)
    that relates the marginal utility of consumption across two periods.
    It is best to think about this condition from the perspective of a deviation from
a proposed solution to the consumer's optimization problem. So, given a candidate
    solution, suppose that the consumer reduces consumption by a small amount in
    period 0 and increases savings by this same amount. The cost of this deviation
    is given by u′(c0) from (6.3). The household will earn r0 between the two periods
    and will consume those extra units of consumption in period 1. This leads to a
    discounted gain in utility given by the right side of (6.3). When this condition
    holds, lifetime utility cannot be increased through such a perturbation from the
    optimal path.
    As in our discussion of the cake eating problem in chapter 2, this is just a
    necessary condition since (6.3) captures a very special type of deviation from a
    proposed path: reduce consumption today and increase it tomorrow. For more
    general problems (more than 2 periods) there will be other deviations to consider.
    But, even in the two-period problem, the consumer could have taken the reduced
    consumption in period 0 and used it to increase a2.
    Of course, there is another first-order condition associated with (6.1): the choice
    of a2. The derivative with respect to a2 is given by:
    λ = φ
    where φ is the multiplier on the non-negativity constraint for a2. So, clearly the non-
    negativity constraint binds (φ > 0) if and only if the marginal utility of consumption
    is positive (λ > 0). That is, it is sub-optimal to leave money in the bank when more
    consumption is desirable.
    This (somewhat obvious but very important) point has two implications to keep
    in mind. First, in thinking about perturbations from a candidate solution, we were
    right to ignore the possibility of using the reduction in c0 to increase a2 as this is
    clearly not desirable. Second, and perhaps more importantly, knowing that a2 = 0
    is a critical part of solving this problem. Looking at the Euler equation (6.3) alone
    guarantees that consumption is optimally allocated across periods but this condition
    can hold for any value of a2. So it is valuable to realize that (6.3) is only a necessary
    condition for optimality; a2 = 0 is necessary as well.
    With a2 = 0, the consumer’s constraint simplifies to:
    c1/r0 + c0 = a0 + y0 + y1/r0 ≡ w0 (6.4)
    where w0 is lifetime wealth for the agent in terms of period 0 goods. Clearly,
    the optimal consumption choices depend on the measure of lifetime wealth (w0)
    and the intertemporal terms of trade (r0). In the absence of any capital market
restrictions, the timing of income across the household's lifetime is irrelevant for its consumption decisions. Instead, variations in the timing of income, given w0,
    are simply reflected in the level of savings between the two periods.78
    As an example, suppose utility is quadratic in consumption:
u(c) = a + bc − (d/2)c^2
    where we require that u′(c) = b − dc > 0. In this case, the Euler condition simplifies
    to:
    b − dc0 = βr0(b − dc1).
    With the further simplification that βr0 = 1, we have constant consumption: c0 =
    c1. Note that this prediction is independent of the timing of income over the periods
0 and 1. This is an example of a much more general phenomenon, termed consumption smoothing, that will guide our discussion of consumption policy functions.
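For completeness, combining c0 = c1 with the budget constraint (6.4) pins down the common level of consumption:

c0 = c1 = w0/(1 + 1/r0) = r0w0/(1 + r0),

so only lifetime wealth w0, and not the split of income between the two periods, matters for consumption.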
    6.2.2 Stochastic Income
    We now add some uncertainty to the problem by supposing that income in period 1
(y1) is not known to the consumer in period 0. Further, we use the result that A2 = 0
    and rewrite the optimization problem more compactly as:
max_{c0} Ey1|y0 [u(c0) + βu(R0(A0 + y0 − c0) + y1)]
    where we have substituted for c1 using the budget constraint. Note that the expec-
    tation is taken here with respect to the only unknown variable (y1) conditional on
    knowing y0, period 0 income. In fact, we assume that
    y1 = ρy0 + ε1
    where |ρ| ∈ [0, 1]. Here ε1 is a shock to income that is not forecastable using period
    0 information. In solving the optimization problem, the consumer is assumed to
    take the information about future income conveyed by observed current income into
    account.
    The Euler equation for this problem is given by:
u′(c0) = Ey1|y0 βR0u′(R0(A0 + y0 − c0) + y1).
    Note here that the marginal utility of future consumption is stochastic. Thus the
    tradeoff given by the Euler equation reflects the loss of utility today from reduc-
    ing consumption relative to the expected gain which depends on the realization of
    income in period 1.
    The special case of quadratic utility and βR0 = 1 highlights the dependence of
    the consumption decision on the persistence of income fluctuations. For this case,
    the Euler equation simplifies to:
    c0 = Ey1|y0 c1 = R0(A0 + y0 − c0) + Ey1|y0 y1.
    Solving for c0 and calculating Ey1|y0 y1 yields:
c0 = [R0(A0 + y0) + ρy0]/(1 + R0) = R0A0/(1 + R0) + y0(R0 + ρ)/(1 + R0). (6.5)
    This expression relates period 0 consumption to period 0 income through two
    separate channels. First, variations in y0 directly affect the resources currently
    available to the household. Second, variations in y0 provide information about
    future income (unless ρ = 0).
From (6.5), ∂c0/∂y0 = (R0 + ρ)/(1 + R0).
    In the extreme case of iid income shocks (ρ = 0), consumers will save a fraction of an
    income increase and consume the remainder. In the opposite extreme of permanent
    shocks (ρ = 1), current consumption moves one-for-one with current income. For
    this case, savings does not respond to income at all. Clearly the sensitivity of
    consumption to income variations depends on the permanence of those shocks.79
    Both of these extreme results reflect a fundamental property of the optimal
    consumption problem: consumption smoothing. This property means that vari-
    ations in current income are spread over time periods in order to satisfy the Euler
    equation condition that marginal utility today is equal to the discounted marginal
    utility of consumption tomorrow, given the return R0. In fact, consumption smooth-
    ing is the intertemporal expression of the normality of goods property found in static
    demand theory.
    But, there is an interesting aspect of consumption smoothing highlighted by
    our example: as the persistence of shocks increases, so does the responsiveness
    of consumption to income variations. In fact, this makes good sense: if income
    increases today are likely to persist, there is no need to save any of the current
    income gain since it will reappear in the next period. These themes of consumption
    smoothing and the importance of the persistence of shocks will reappear throughout
    our discussion of the infinite horizon consumer optimization problem.
    6.2.3 Portfolio Choice
    A second extension of the two-period problem is of interest: the addition of multiple
    assets. Historically, there has been a close link between the optimization problem
    of a consumer and asset pricing models. We will make these links clearer as we
    proceed and begin here with a savings problem in which there are two assets.
    Assume that the household has no initial wealth and can save current income
    through two assets. One is nonstochastic and has a one period gross return of Rs.
The second asset is risky with a return denoted by R̃r and a mean return of R̄r. Let
ar and as denote the consumer's holdings of asset type j = r, s. Asset prices are
    normalized at 1 in period 0.
    The consumer’s choice problem can then be written as:
max_{ar,as} u(y0 − ar − as) + ER̃r βu(R̃rar + Rsas + y1).
    Here we make the simplifying assumption that y1 is known with certainty. The first
    order conditions are:
    u′(y0 − ar − as) = βRsER̃r u′(R̃rar + Rsas + y1)
    and
    u′(y0 − ar − as) = βER̃r R̃ru′(R̃rar + Rsas + y1).
    Note we have not imposed any conditions regarding the holding of these assets. In
    particular, we have allowed the agent to buy or sell the two assets.
    Suppose that u(c) is strictly concave, so that the agent is risk averse. Further,
    suppose we search for conditions such that the household is willing to hold positive
    amounts of both assets. In this case, we would expect that the agent would have
    to be compensated for the risk associated with holding the risky asset. This can
    be seen by equating these two first order conditions (which hold with equality) and
    then using the fact that the expectation of the product of two random variables is
    the product of the expectations plus the covariance. This manipulation yields:
Rs = R̄r + cov[R̃r, u′(R̃rar + Rsas + y1)] / ER̃r u′(R̃rar + Rsas + y1). (6.6)
    The sign of the numerator of the ratio on the right depends on the sign of ar.
    If the agent holds both the riskless and the risky asset (ar > 0 and as > 0 ),
    then the strict concavity of u(c) implies that the covariance must be negative. In
    this case, R̄r must exceed Rs : the agent must be compensated for holding the risky
    asset.
    If the average returns are equal then the agent will not hold the risky asset
(ar = 0) and (6.6) will hold. Finally, if R̄r is less than Rs, the agent will sell the risky asset and buy additional units of the riskless asset.
    6.2.4 Borrowing Restrictions
    A final extension of the two-period model is to impose a restriction on the borrowing
    of agents. To illustrate, consider a very extreme constraint where the consumer is
    able to save but not to borrow: c0 ≤ y0. Thus the optimization problem of the
    agent is:
max_{c0≤y0} [u(c0) + βu(R0(A0 + y0 − c0) + y1)].
Denoting the multiplier on the borrowing constraint by µ, the first-order condition is
    given by:
u′(c0) = βR0u′(R0(A0 + y0 − c0) + y1) + µ.
    If the constraint does not bind, then the consumer has non-negative savings and the
    familiar Euler equation for the two-period problem holds. However, if µ > 0, then
    c0 = y0 and
u′(y0) > βR0u′(y1).
    The borrowing constraint is less likely to bind if βR0 is not very large and if y0 is
    large relative to y1.
    An important implication of the model with borrowing constraints is that con-
sumption will depend on the timing of income receipts and not just w0. That is, imagine a restructuring of income that increased y0 and decreased y1, leaving w0
    unchanged. In the absence of a borrowing restriction, consumption patterns would
    not change. But, if the borrowing constraint binds, then this restructuring of income
    will lead to an increase in c0 and a reduction in c1 as consumption “follows” income.
    To the extent that this change in the timing of income flows could reflect govern-
    ment tax policy (yt is then viewed as after tax income), the presence of borrowing
    restrictions implies that the timing of taxes can matter for consumption flows and
    thus for welfare.
    The weakness of this and more general models is that the basis for the borrowing
    restrictions is not provided. Given this, it is not surprising that researchers have
    been interested in understanding the source of borrowing restrictions. We return to
    this point below.
6.3 Infinite Horizon Formulation: Theory and Empirical Evidence
    We now consider the infinite horizon version of the optimal consumption problem.
    In doing so, we see how the basic intuition of consumption smoothing and other
    aspects of optimal consumption allocations carry over to the infinite horizon setting.
    In addition, we introduce empirical evidence into our presentation.
6.3.1 Bellman's equation for the Infinite Horizon Problem
    Consider a household with a stock of wealth denoted by A, a current flow of income
    y and a given return on its investments over the past period given by R−1. Then the
    state vector of the consumer’s problem is (A, y, R−1) and the associated Bellman
    equation is:
v(A, y, R−1) = max_c u(c) + βEy′,R|R−1,y v(A′, y′, R)
    for all (A, y, R−1) where the transition equation for wealth is given by:
    A′ = R(A + y − c).
    We assume that the problem is stationary so that no time subscripts are necessary.80
This requires, among other things, that income and returns are stationary random
    variables and that the joint distribution of (y′, R) depends only on (y, R−1).
    The transition equation has the same timing as we assumed in the two period
    problem: interest is earned on wealth plus income less consumption over the period.
    Further, the interest rate that applies is not necessarily known at the time of the
    consumption decision. Thus the expectation in Bellman’s equation is over the two
unknowns (y′, R) where the given state variables provide information on forecasting
    these variables.81
    6.3.2 Stochastic Income
    To analyze this problem, we first consider the special case where the return on
    savings is known and the individual faces uncertainty only with respect to income.
    We then build on this model by adding in a portfolio choice, endogenous labor
    supply and borrowing restrictions.
    Theory
    In this case, we study:
v(A, y) = max_c u(c) + βEy′|y v(A′, y′) (6.7)
    where A′ = R(A + y − c) for all (A, y). The solution to this problem is a policy
    function that relates consumption to the state vector: c = φ(A, y). The first order
    condition is:
u′(c) = βREy′|y vA(A′, y′) (6.8)
which holds for all (A, y), where vA(A′, y′) denotes ∂v(A′, y′)/∂A′.
    Using (6.7) to solve for Ey′|yvA(A′, y′) yields the Euler equation:
u′(c) = βREy′|y u′(c′). (6.9)
    The interpretation of this equation is that the marginal loss of reducing consumption
    is balanced by the discounted expected marginal utility from consuming the proceeds
    in the following period. As usual, this Euler equation implies that a one-period
    deviation from a proposed solution that satisfies this relationship will not increase
    utility. The Euler equation, (6.9), holds when consumption today and tomorrow
    is evaluated using this policy function. In the special case of βR = 1, the theory
predicts that the marginal utility of consumption follows a random walk.
    In general, one cannot generate a closed-form solution of the policy function
    from these conditions for optimality. Still, some properties of the policy functions
    can be deduced. Given that u(c) is strictly concave, one can show that v(A, y) is
    strictly concave in A. As argued in Chapter 2, the value function will inherit some
    of the curvature properties of the return function. Using this and (6.8), the policy
    function, φ(A, y), must be increasing in A. Else, an increase in A would reduce
    consumption and thus increase A′. This would contradict (6.8).
    As a leading example, consider the specification of utility where
u(c) = (c^{1−γ} − 1)/(1 − γ)
    where γ = 1 is the special case of u(c) = ln(c). This is called the constant relative
    risk aversion case (CRRA) since −cu′′(c)/u′(c) = γ.
    Using this utility function, (6.9) becomes:
1 = βRE(c′/c)^{−γ}
    where the expectation is taken with respect to future consumption which, through
the policy function, depends on (A′, y′). As discussed in some detail below, this
    equation is then used to estimate the parameters of the utility function, (β, γ).
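Before turning to the evidence, here is a minimal value function iteration sketch for (6.7) with CRRA utility, a fixed gross return R and a two-state Markov chain for income; the grids and parameter values are illustrative only:

    import numpy as np

    beta, R, gamma = 0.95, 1.03, 2.0               # illustrative values
    y_grid = np.array([0.8, 1.2])                  # two income states
    Pi = np.array([[0.9, 0.1], [0.1, 0.9]])        # transition matrix for y
    A_grid = np.linspace(0.0, 20.0, 200)           # wealth grid

    def u(c):
        return (c ** (1 - gamma) - 1) / (1 - gamma)

    V = np.zeros((A_grid.size, y_grid.size))       # V[i, j] = v(A_grid[i], y_grid[j])
    for _ in range(2000):
        EV = V @ Pi.T                              # EV[a', j] = E[V(A_grid[a'], y') | y_j]
        V_new = np.empty_like(V)
        for j, y in enumerate(y_grid):
            for i, A in enumerate(A_grid):
                c = A + y - A_grid / R             # consumption implied by each choice of A'
                vals = np.where(c > 0, u(np.maximum(c, 1e-12)) + beta * EV[:, j], -np.inf)
                V_new[i, j] = vals.max()
        if np.max(np.abs(V_new - V)) < 1e-6:
            V = V_new
            break
        V = V_new
    # The maximizing A' at each (A, y) gives the policy; c = phi(A, y) = A + y - A'/R.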
    Evidence
    Hall (1978) studies the case in which u(c) is quadratic so that the marginal utility
    of consumption is linear. In this case, consumption itself is predicted to follow a
    random walk. Hall uses this restriction to test the predictions of this model of
    consumption. In particular, if consumption follows a random walk then:
    ct+1 = ct + εt+1.
    The theory predicts that the growth in consumption (εt+1) should be orthogonal to
    any variables known in period t: Etεt+1 = 0. Hall uses aggregate quarterly data for
nondurable consumption. He shows that lagged stock market prices significantly
    predict consumption growth, which violates the permanent income hypothesis. 82
Flavin (1981) extends Hall's analysis, allowing for a general ARMA process for income. Income is commonly found to be a predictor of consumption growth. Flavin points out that this finding is not necessarily at odds with the prediction of the
    model. Current income might be correlated with consumption growth not because
    of a failure of the permanent income hypothesis, but because current income signals
    changes in the permanent income. However, she also rejects the model.
    The importance of current income to explain consumption growth has been seen
    as evidence of liquidity constraints (see section 6.3.5). A number of authors have
    investigated this issue. 83 However, most of the papers used aggregate data to test
    the model. Blundell et al. (1994) test the model on micro data and find that when
    one controls for demographics and household characteristics, current income does
    not appear to predict consumption growth. Meghir and Weber (1996) explicitly test
for the presence of liquidity constraints using US panel data and do not find any
    evidence.
    6.3.3 Stochastic Returns: Portfolio choice
    We already considered a simple portfolio choice problem for the two-period problem
    so this discussion will be intentionally brief. We then turn to empirical evidence
    based upon this model.
    Theory
    Assume that there are N assets available. Let R−1 denote the N -vector of gross
    returns between the current and previous period and let A be the current stock of
    wealth. Let si denote the share of asset i = 1, 2, …N held by the agent. Normalizing
    the price of each asset to be unity, the current consumption of the agent is then:
c = A − Σi si.
    With this in mind, the Bellman equation is given by:
v(A, y, R−1) = max_{si} u(A − Σi si) + βER,y′|R−1,y v(Σi Risi, y′, R) (6.10)
    where Ri is the stochastic return on asset i. Note that R−1 is in the state vector only
    because of the informational value it provides on the return over the next period,
    R.
    The first order condition for the optimization problem holds for i = 1, 2, …, N
    and is:
u′(c) = βER,y′|R−1,y Ri vA(Σi Risi, y′, R).
    where again vA() is defined as ∂v()/∂A. Using (6.10) to solve for the derivative of
    the value function, we obtain:
u′(c) = βER,y′|R−1,y Ri u′(c′) for i = 1, 2, ..., N
    where, of course, the level of future consumption will depend on the vector of returns,
    R, and the realization of future income, y′.
    This system of Euler equations forms the basis for financial models that link
    asset prices to consumption flows. This system is also the basis for the argument
    that conventional models are unable to explain the observed differential between the
    return on equity and relatively safe bonds. Finally, these conditions are also used
    to estimate the parameters of the utility function, such as the curvature parameter
    in the traditional CRRA specification.
    This approach is best seen through a review of Hansen and Singleton (1982). To
    understand this approach recall that Hall uses the orthogonality conditions to test
    a model of optimal consumption. Note that Hall’s exercise does not estimate any
    parameters as the utility function is assumed to be quadratic and the real interest
    rate is fixed. Instead, Hall essentially tests a restriction imposed by his model at
    the assumed parameter values.
The logic pursued by Hansen and Singleton goes a step further. Instead of using the
    orthogonality constraints to evaluate the predictions of a parameterized model they
    use these conditions to estimate a model. In fact, if one imposes more conditions
than there are parameters (i.e. if the exercise is overidentified), then the researcher
    can both estimate the parameters and test the validity of the model.
    Empirical implementation
    The starting point for the analysis is the Euler equation for the household’s problem
    with N assets. We rewrite that first order condition here using time subscripts to
    make clear the timing of decisions and realizations of random variables:
u′(ct) = βEt R^i_{t+1} u′(ct+1) for i = 1, 2, ..., N (6.11)
where R^i_{t+1} is defined as the real return on asset i between period t and t + 1. The
    expectation here is conditional on all variables observed in period t. Unknown t + 1
    variables include the return on the assets as well as period t + 1 income.
    The power of the GMM approach derives from this first-order condition. Es-
    sentially, the theory tells us that while ex post this first-order condition need not
    hold, any deviations from it must be unpredictable given period t information. That
is, the period t + 1 realization, say of income, may lead the consumer to increase consumption in period t + 1, thus implying that ex post (6.11) does not hold. This
    deviation is not inconsistent with the theory as long as it was not predictable given
    period t information.
Formally, define ε^i_{t+1}(θ) as
ε^i_{t+1}(θ) ≡ βR^i_{t+1}u′(ct+1)/u′(ct) − 1, for i = 1, 2, ..., N (6.12)
Thus ε^i_{t+1}(θ) is a measure of the deviation for asset i. We have added θ as an
    argument in this error to highlight its dependence on the parameters describing the
    household’s preferences. Household optimization implies that
Et(ε^i_{t+1}(θ)) = 0 for i = 1, 2, ..., N.
Let zt be a q-vector of variables that are in the period t information set.84 This restriction on conditional expectations implies:
E(ε^i_{t+1}(θ) ⊗ zt) = 0 for i = 1, 2, ..., N. (6.13)
    where ⊗ is the Kronecker product. So the theory implies the Euler equation errors
    from any of the N first-order conditions ought to be orthogonal to any of the zt
variables in the information set. There are N · q restrictions created.
    The idea of GMM estimation is then to find the vector of structural parameters
    (θ) such that (6.13) holds. Of course, applied economists only have access to a
sample, say of length T. Let mT(θ) be an N · q-vector where the component relating asset i to one of the variables in zt, z^j_t, is defined by:
(1/T) Σ^T_{t=1} ε^i_{t+1}(θ) z^j_t.
    The GMM estimator is defined as the value of θ that minimizes
JT(θ) = mT(θ)′ WT mT(θ).
Here WT is an Nq × Nq matrix that is used to weight the various moment restrictions.
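A minimal sketch of this estimator for the CRRA case with a single asset, assuming placeholder arrays c (consumption), Rgross (the gross return) and an instrument matrix Z whose row t is zt:

    import numpy as np
    from scipy.optimize import minimize

    def euler_errors(theta, c, Rgross):
        # eps_{t+1}(theta) = beta * R_{t+1} * (c_{t+1}/c_t)^(-gamma) - 1 under CRRA utility
        beta, gamma = theta
        return beta * Rgross[1:] * (c[1:] / c[:-1]) ** (-gamma) - 1.0

    def gmm_objective(theta, c, Rgross, Z, W):
        eps = euler_errors(theta, c, Rgross)           # eps[t] corresponds to eps_{t+1}
        m = (eps[:, None] * Z[:-1]).mean(axis=0)       # sample analogue of E[eps_{t+1} z_t]
        return m @ W @ m

    # Placeholders: c and Rgross are length-T arrays, Z is a (T x q) instrument matrix.
    # W = np.eye(q); theta0 = np.array([0.95, 1.0])
    # est = minimize(gmm_objective, theta0, args=(c, Rgross, Z, W), method="Nelder-Mead")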
    Hansen and Singleton (1982) use monthly seasonally adjusted aggregate data on
US nondurable consumption or nondurables and services between 1959 and 1978.
    They use as a measure of stock returns, the equally weighted average return on all
    stocks listed on the New York Stock Exchange. They choose a constant relative
risk aversion utility function u(c) = c^{1−γ}/(1 − γ). With this specification, there are two parameters to estimate, the curvature of the utility function γ and the discount factor β. Thus, θ = (β, γ). The authors use as instruments z^j_t lagged values of ε^i_{t+1}
    and estimate the model with 1, 2, 4 or 6 lags. Depending on the number of lags and
    the series used, they find values for γ which vary between 0.67 and 0.97 and values
    for the discount factor between 0.942 and 0.998. As the model is overidentified, there
    is scope for an overidentification test. Depending on the number of lags and the
    series used, the test gives mixed results as the restrictions are sometimes satisfied
    and sometimes rejected.
    Note that the authors do not adjust for possible trends in the estimation. Sup-
    pose that log consumption is characterized by a linear trend:
    ct = exp(αt)c̃t
    where c̃t is the detrended consumption. In that case, equation (6.12) is rewritten
    as:
ε^i_{t+1}(θ) ≡ βe^{−αγ}R^i_{t+1} c̃_{t+1}^{−γ}/c̃_t^{−γ} − 1, for i = 1, 2, ..., N
    Hence the estimated discount factor is a product between the true discount factor
and a trend effect. Ignoring the trend would therefore bias the estimate of the discount factor.
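To see where the e^{−αγ} factor comes from, substitute ct = exp(αt)c̃t into the consumption growth term:

(ct+1/ct)^{−γ} = (e^{α(t+1)}c̃t+1 / (e^{αt}c̃t))^{−γ} = e^{−αγ}(c̃t+1/c̃t)^{−γ},

so only the product βe^{−αγ} is identified from detrended data.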
    6.3.4 Endogenous Labor Supply
    Of course, it is natural to add a labor supply decision to this model. In that case,
    we can think that the stochastic income, taken as given above, actually comes from
    a stochastic wage (w) and a labor supply decision (n). In this case, consider the
    following functional equation:
v(A, w) = max_{A′,n} U(A + wn − (A′/R), n) + βEw′|w v(A′, w′)
    for all (A, w). Here we have substituted in for current consumption so that the agent
    is choosing labor supply and future wealth.
    Note that the labor supply choice, given (A, A′), is purely static. That is, the
    level of employment and thus labor earnings has no dynamic aspect other than sup-
    plementing the resources available to finance current consumption and future wealth.
    Correspondingly, the first order condition with respect to the level of employment
    does not directly involve the value function and is given by:
    wUc(c, n) = −Un(c, n). (6.14)
Using c = A + wn − (A′/R), this first order condition relates n to (A, w, A′). Denote
    this relationship as n = ϕ(A, w, A′). This can then be substituted back into the
    dynamic programming problem yielding a simpler functional equation:
v(A, w) = max_{A′} Z(A, A′, w) + βEw′|w v(A′, w′)
    where
    Z(A, A′, w) ≡ U (A + wϕ(A, w, A′) − (A′/R), ϕ(A, w, A′))
    This simplified Bellman equation can be analyzed using standard methods, thus
    ignoring the static labor supply decision. 85 Once a solution is found, the level of
    employment can then be determined from the condition n = ϕ(A, w, A′).
    Using a similar model, MaCurdy (1981) studies the labor supply of young men
using the Panel Study of Income Dynamics (PSID). The estimation of the model
    is done in several steps. First, the intra period allocation (6.14) is estimated. The
    coefficients are then used to get at the intertemporal part of the model.
    To estimate the parameters of the utility function, one has to observe hours
    of work and consumption, but in the PSID, total consumption is not reported.
    To identify the model, the author uses a utility function which is separable between
consumption and labor supply. The utility function is specified as u(ct, nt) = γ1t c_t^{ω1} − γ2t n_t^{ω2}, where γ1t and γ2t are two deterministic functions of observed characteristics
    which might affect preferences such as age, education or the number of children.
    With this specification, the marginal utility of leisure, Un(c, n) is independent of
    the consumption decision. Using (6.14), hours of work can be expressed as:
ln(nt) = ln wt/(ω2 − 1) + (1/(ω2 − 1))(ln Uc(ct, nt) − ln γ2t − ln ω2)
While the first term on the right-hand side is observed, the second term contains
    the unobserved marginal utility of consumption. Uc(ct, nt) can be expressed as a
    function of the Lagrange multiplier associated with the wealth constraint in period
    0:
Uc(ct, nt) = λ0/[β^t(1 + r1) · · · (1 + rt)]
The author treats the unobserved multiplier λ0 as a fixed effect and uses panel data to estimate a subset of the parameters of the utility function using first differences. In a second step, the fixed effect is backed out. At this point, some additional identification assumptions are needed. A specific functional form is assumed for the Lagrange multiplier, written as a function of wages over the life cycle and initial wealth, all of them unobserved in the data set. The author then uses fixed characteristics such as education or age to proxy for the Lagrange multiplier. The
    author finds that a 10% increase in the real wage induces a one to five percent
    increase in hours worked.
    Eichenbaum et al. (1988) analyze the time series properties of a household model
    with both a savings and a labor supply decision. They pay particular attention to
    specifications in which preferences are non-separable, both across time and between
    consumption and leisure contemporaneously. They estimate their model using GMM
    on time series evidence on real consumption (excluding durables) and hours worked.
They find support for non-time separability in preferences, though in some cases they find little evidence against the hypothesis that preferences were separable within
    a period.
    6.3.5 Borrowing Constraints
    The Model and Policy Function
    The extension of the two period model with borrowing constraints to the infinite
    horizon case is discussed by Deaton (1991). 86 One of the key additional insights
    from extending the horizon is to note that even if the borrowing constraint does
    not bind in a period, this does not imply that consumption and savings take the
    same values as they would in the problem without borrowing constraints. Simply
    put, consumers anticipate that borrowing restrictions may bind in the future (i.e.
    in other states) and this influences their choices in the current state.
    Following Deaton (1991), let x = A + y represent cash on hand. Then the
    transition equation for wealth implies:
    A′ = R(x − c)
    where c is consumption. In the event that income variations are iid, we can write
    the Bellman equation for the household as:
v(x) = max_{0≤c≤x} [u(c) + βEy′v(R(x − c) + y′)] (6.15)
    so that the return R is earned on the available resources less consumption, x − c.
    Note that income is not a state variable here as it is assumed to be iid. Hence cash
    on hand completely summarizes the resources available to the consumer.
    The borrowing restriction takes the simple form of c ≤ x so that the consumer
    is unable to borrow. Of course this is extreme and entirely ad hoc but it does allow
    us to explore the consequences of this restriction. As argued by Deaton, the Euler
    equation for this problem must satisfy:
    u′(c) = max{u′(x), βREu′(c′)}. (6.16)
    So, either the borrowing restriction binds so that c = x or it doesn’t so that the
    more familiar Euler equation holds. Only for low values of x will u′(x) > βREu′(c′)
    and only in these states, as argued for the two-period problem, will the constraint
bind. To emphasize an important point: even if u′(x) < βREu′(c′), so that the standard condition u′(c) = βREu′(c′) holds, the actual state dependent levels of consumption may differ from those that are optimal for the problem in which c is not bounded above by x.

Alternatively, one might consider a restriction on wealth of the form A ≥ Amin(s), where s is the state vector describing the household. In this case the household may borrow but its assets are bounded below. In principle, the limit on wealth may depend on the state variables of the household: all else the same, a household with a high level of income may be able to borrow more. One can look at the implications of this type of constraint and, through estimation, uncover Amin(s) (see Adda and Eaton (1997)).

To solve the optimal problem, one can use the value function iteration approach, described in chapters 2 and 3, based on the Bellman equation (6.15). Deaton (1991) uses another approach, working from the Euler equation (6.16). The method is similar to the projection methods presented in chapter 3, but the optimal consumption function is obtained by successive iterations instead of solving a system of non-linear equations. Although there is no formal proof that iterations on the Euler equation actually converge to the optimal solution, the author notes that empirically convergence always occurs.

Figure 6.1 displays the optimal consumption rule in the case of a serially correlated income. In this case, the problem has two state variables, the cash-on-hand and the current realization of income, which provides information on future income. The policy rule has been computed using a (coarse) grid with three points for the current income and with 60 equally spaced points for the cash-on-hand. When cash-on-hand is low, the consumer is constrained and is forced to consume all his cash-on-hand. The policy rule is then the 45 degree line. For higher values of the cash-on-hand, the consumer saves part of the cash-on-hand for future consumption.

[Figure 6.1 approximately here]

[Figure 6.2 approximately here]

Figure 6.2 displays a simulation of consumption and assets over 200 periods. The income follows an AR(1) process with an unconditional mean of 100, a persistence of 0.5, and innovations drawn from N(0, 10). The path of income is asymmetric, as good income shocks are smoothed by savings whereas the liquidity constraints prevent the smoothing of low income realizations. Consumption is smoother than income, with a standard deviation of 8.9 instead of 11.5.

An Estimation Exercise

In section 6.3.3, we presented a GMM estimation by Hansen and Singleton (1982) based on the Euler equation. Hansen and Singleton (1982) find a value for γ of about 0.8. This is under the null that the model is correctly specified and, in particular, that the Euler equation holds in each period. When liquidity constraints are binding, the standard Euler equation does not hold. An estimation procedure which does not take this fact into account would produce biased estimates.

Suppose that the real world is characterized by potentially binding liquidity constraints. If one ignores them and considers a simpler model without any constraints, how would it affect the estimation of the parameter γ? To answer this question, we chose different values for γ, solved the model with liquidity constraints and simulated it.
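A minimal sketch of this experiment is given below (in Python). It solves the liquidity-constrained problem by iterating on the Euler equation (6.16) with iid income and CRRA utility and then simulates a consumption path. The grid, the income support and the parameter values are illustrative assumptions rather than the calibration behind Figures 6.1-6.2 and Table 6.1; the exercise is repeated for each value of γ considered.

```python
# Hedged sketch: Euler-equation iteration for the liquidity-constrained model
# (c <= x) with CRRA utility and iid income, followed by a short simulation.
# All parameter values and grids are illustrative assumptions.
import numpy as np
from scipy.optimize import brentq

gamma, beta, R = 2.0, 0.95, 1.02
y_nodes = np.array([80.0, 100.0, 120.0])     # assumed iid income support
y_prob  = np.array([0.25, 0.5, 0.25])
x_grid  = np.linspace(1.0, 400.0, 200)       # cash-on-hand grid

def uprime(c):
    return c ** (-gamma)

def solve_policy(tol=1e-6, max_iter=500):
    c_pol = x_grid.copy()                    # initial guess: consume everything
    for _ in range(max_iter):
        c_new = np.empty_like(c_pol)
        for i, x in enumerate(x_grid):
            def rhs(c):                      # beta*R*E[u'(c'(R(x - c) + y'))]
                xp = R * (x - c) + y_nodes
                return beta * R * np.dot(y_prob, uprime(np.interp(xp, x_grid, c_pol)))
            if uprime(x) >= rhs(x):          # (6.16): constraint binds, c = x
                c_new[i] = x
            else:                            # interior: u'(c) = beta*R*E u'(c')
                c_new[i] = brentq(lambda c: uprime(c) - rhs(c), 1e-8, x)
        if np.max(np.abs(c_new - c_pol)) < tol:
            break
        c_pol = c_new
    return c_pol

c_pol = solve_policy()

# simulate consumption for 200 periods from the converged rule
rng = np.random.default_rng(0)
x, cons = x_grid[50], []
for t in range(200):
    c = np.interp(x, x_grid, c_pol)
    cons.append(c)
    x = R * (x - c) + rng.choice(y_nodes, p=y_prob)
print(np.std(cons))
```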
The simulated consumption series are used to get an estimate γ̂GMM such that:

γ̂GMM = Argmin_γ (1/T) Σ_{t=1}^{T} εt(γ)   with   εt(γ) = β(1 + r) c_{t+1}^{−γ}/c_t^{−γ} − 1

[Table 6.1 approximately here]

The results are displayed in Table 6.1. When γ is low, the consumer is less risk averse, consumes more out of the available cash-on-hand and saves less. The result is that the liquidity constraints are binding more often. In this case, the bias in the GMM estimate is the biggest. The bias decreases as the proportion of liquidity constrained periods falls: when liquidity constraints are almost absent, the standard Euler equation holds. From Table 6.1, there is no value of γ which would generate a GMM estimate of 0.8 as found by Hansen and Singleton.

6.3.6 Consumption Over the Life Cycle

Gourinchas and Parker (2001) investigate the ability of a model of intertemporal choice with realistic income uncertainty to match observed life cycle profiles of consumption. (For a related study see also Attanasio et al. (1999).) They parameterize a model of consumption over the life cycle, which is solved numerically. The parameters of the model are estimated using a simulated method of moments, using data on household consumption over the life cycle. We first present a simplified version of their model. We then discuss the numerical computation and the estimation methods.

Following Zeldes (1989a),87 the log income process is modelled as a random walk with a moving average error. This specification is similar to the one used in empirical work (see Abowd and Card (1989)) and seems to fit the data well. Denote by Yt the income of the individual:

Yt = PtUt
Pt = GtPt−1Nt

Income is the product of two components. Ut is a transitory shock which is independently and identically distributed and takes a value of 0 with a probability p and a positive value with a probability (1 − p). Pt is a permanent component which grows at a rate Gt which depends on age. Nt is the innovation to the permanent component. ln Nt and ln Ut, conditionally on Ut > 0, are normally distributed with mean
0 and variances σ²n and σ²u, respectively. The consumer faces a budget constraint:
    Wt+1 = (1 + r)(Wt + Yt − Ct)
    The consumer can borrow and save freely. However, under the assumption that there
    is a probability that income will be zero and that the marginal utility of consumption
    is infinite at zero, the consumer will choose never to borrow against future income.
    Hence, the outcome of the model is close to the one proposed by Deaton (1991) and
    presented in section 6.3.5. Note that in the model, the agent can only consume non-
    durables. The authors ignore the durable decision, or equivalently assume that this
    decision is exogenous. This might be a strong assumption. Fernández-Villaverde
    and Krueger (2001) argue that the joint dynamics of durables and non durables are
    important to understand the savings and consumption decisions over the life cycle.
    Define the cash-on-hand as the total of assets and income:
Xt = Wt + Yt
Xt+1 = R(Xt − Ct) + Yt+1
    Define Vt(Xt, Pt) as the value function at age T −t. The value function is indexed by
    age as it is assumed that the consumer has a finite life horizon. The value function
    depends on two state variables, the cash-on-hand which indicates the maximal limit
    that can be consumed, and the realization of the permanent component which pro-
    vides information on future values of income. The program of the agent is defined
    as:
Vt(Xt, Pt) = max_{Ct} [u(Ct) + βEtVt+1(Xt+1, Pt+1)]
    The optimal behavior is given by the Euler equation:
u′(Ct) = βREtu′(Ct+1)
    As income is assumed to be growing over time, cash-on-hand and consumption are
    also non-stationary. This problem can be solved by normalizing the variables by
    the permanent component. Denote xt = Xt/Pt and ct = Ct/Pt. The normalized
    cash-on-hand evolves as:
xt+1 = (xt − ct) R/(Gt+1Nt+1) + Ut+1
Under the assumption that the utility function is u(c) = c^(1−γ)/(1 − γ), the Euler
    equation can be rewritten with only stationary variables:
u′(ct) = βREtu′(ct+1Gt+1Nt+1)
    As the horizon of the agent is finite, one has to postulate some terminal condition
    for the consumption rule. It is taken to be linear in the normalized cash-on-hand:
cT+1 = γ0 + γ1xT+1.
    Gourinchas and Parker (2001) use this Euler equation to compute numerically
    the optimal consumption rule. Normalized consumption is only a function of the
    normalized cash-on-hand. By discretizing the cash-on-hand over a grid, the problem
    is solved recursively by evaluating ct(x) at each point of the grid using:
u′(ct(x)) = βR(1 − p) ∫∫ u′( ct+1( (x − ct)R/(Gt+1N) + U ) Gt+1N ) dF(N)dF(U)
            + βRp ∫ u′( ct+1( (x − ct)R/(Gt+1N) ) Gt+1N ) dF(N)
The first term on the right-hand side is the expected future marginal utility conditional on a strictly positive transitory income draw, while the second term is the expectation conditional on zero income. The integrals are solved by a
    quadrature method (see Chapter 3). The optimal consumption rules are obtained
    by minimizing the distance between the left hand side and the right hand side.
    Figure 6.3 displays the consumption rule at different ages. 88
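A minimal sketch of this backward recursion is given below. The preference parameters, grids, quadrature order and terminal-rule coefficients are illustrative assumptions, and the Euler equation is solved at each grid point by root-finding rather than by an explicit distance minimization.

```python
# Hedged sketch of the backward recursion on the normalized Euler equation.
# Parameter values, grids and the terminal rule are illustrative assumptions,
# not the authors' estimates.
import numpy as np
from scipy.optimize import brentq

beta, R, crra = 0.96, 1.03, 2.0
p = 0.005                                   # probability of a zero-income draw
sig_n, sig_u = 0.1, 0.1                     # std of ln N and of ln U given U > 0
G = 1.02                                    # permanent income growth (age-invariant here)
gamma0, gamma1 = 0.0, 0.9                   # assumed terminal rule c_{T+1} = g0 + g1*x
T = 10
x_grid = np.linspace(0.1, 10.0, 60)         # normalized cash-on-hand grid

nodes, weights = np.polynomial.hermite_e.hermegauss(5)   # probabilists' Hermite
weights = weights / weights.sum()           # quadrature weights for a N(0,1) draw
N_nodes = np.exp(sig_n * nodes)             # lognormal permanent shocks
U_nodes = np.exp(sig_u * nodes)             # lognormal transitory shocks (U > 0)

def uprime(c):
    return c ** (-crra)

def euler_rhs(x, c, c_next):
    """Discretized right-hand side: positive-income and zero-income branches."""
    rhs = 0.0
    for wn, N in zip(weights, N_nodes):
        xp0 = (x - c) * R / (G * N)
        rhs += p * wn * uprime(np.interp(xp0, x_grid, c_next) * G * N)
        for wu, U in zip(weights, U_nodes):
            xp = xp0 + U
            rhs += (1 - p) * wn * wu * uprime(np.interp(xp, x_grid, c_next) * G * N)
    return beta * R * rhs

c_rules = [None] * (T + 2)
c_rules[T + 1] = gamma0 + gamma1 * x_grid   # terminal consumption rule
for t in range(T, 0, -1):                   # backward over ages
    c_t = np.empty_like(x_grid)
    for i, x in enumerate(x_grid):
        f = lambda c: uprime(c) - euler_rhs(x, c, c_rules[t + 1])
        c_t[i] = x if f(x) >= 0 else brentq(f, 1e-8, x)   # impose c <= x numerically
    c_rules[t] = c_t
print(c_rules[1][:5])
```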
    Once the consumption rules are determined, the model can be simulated to gen-
    erate average life cycle profiles of consumption. This is done using the approximated
    consumption rules and by averaging the simulated behavior of a large number of
    households. The simulated profiles are then compared to actual profiles from US
    data. Figure 6.4 displays the predicted consumption profile for two values of the in-
tertemporal elasticity of substitution, as well as the observed consumption profiles
    constructed from the US Consumer Expenditure Survey. 89
    More formally, the estimation method is the simulated method of moments (see
    Chapter 4). The authors minimize the distance between observed consumption and
predicted consumption at different ages. As neither the cash-on-hand nor the permanent
    component of income are directly observed, the authors integrate out the state
    variables to calculate the unconditional mean of (log) consumption at a given age:
ln Ct(θ) = ∫ ln Ct(x, P, θ) dFt(x, P, θ)
    where θ is the vector of parameters characterizing the model and where Ft() is the
    density of the state variables for individuals of age t. Characterizing this density
    is difficult as it has no closed form solution. Hence, the authors use simulations to
    approximate ln Ct(θ). Denote
g(θ) = (1/It) Σ_{i=1}^{It} ln Cit − (1/S) Σ_{s=1}^{S} ln Ct(X^s_t, P^s_t, θ)
    The first part is the average log consumption for households of age t and It is the
number of observed households in the data set. The second part is the average
    predicted consumption over S simulated paths. θ is estimated by minimizing
    g(θ)′W g(θ)
    where W is a weighting matrix.
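In compact form, the estimator can be written as below. The data moment vector and the model simulator are placeholders standing in for the objects described above; they are not part of the text.

```python
# Hedged sketch of the SMM criterion g(theta)' W g(theta).  data_mean_logC and
# simulate_mean_logC are placeholders for the observed age profile of average
# log consumption and for the solve-and-simulate step of the model.
import numpy as np
from scipy.optimize import minimize

def smm_objective(theta, data_mean_logC, simulate_mean_logC, W):
    """g(theta): data average of ln C at each age minus the average over the
    S simulated paths; the criterion is the quadratic form g' W g."""
    g = data_mean_logC - simulate_mean_logC(theta)
    return g @ W @ g

# usage with stand-in inputs (W could be the identity or an optimal weighting matrix):
# W = np.eye(len(data_mean_logC))
# result = minimize(smm_objective, theta0,
#                   args=(data_mean_logC, simulate_mean_logC, W),
#                   method='Nelder-Mead')
```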
The estimated model is then used to analyze the determinants of savings. There are two reasons to accumulate savings in this model. First, savings cushion the agent against uninsurable income shocks, which would otherwise force periods of low consumption and high marginal utility. Second,
    savings are used to finance retirement consumption. Gourinchas and Parker (2001)
    show that the precautionary motive dominates at least until age 40 whereas older
    agents save mostly for retirement.
    [Figure 6.3 approximately here]
    [Figure 6.4 approximately here]
    6.4 Conclusion
    This chapter demonstrates how to use the approach of dynamic programming to
characterize the solution of the household's optimal consumption problem and to
    link it with observations. In fact, the chapter goes beyond the savings decision to
    integrate it with the labor supply and portfolio decisions.
    As in other chapters, there are numerous extensions that are open for the re-
    searcher to consider. The next chapter is devoted to one of these, the introduction
of durable goods. Further, there are many policy-related exercises that can be evaluated using one of these estimated models, including a variety of policies intended
    to influence savings decisions.90

    Chapter 7
    Durable Consumption
    7.1 Motivation
    Up to now, the consumption goods we have looked at are all classified as either
nondurables or services. This should be clear since consumption expenditures affect utility directly in the period of the purchase and then disappear.91 However,
    durable goods play a prominent role in business cycles as durable expenditures are
    quite volatile.92
    This chapter studies two approaches to understanding durable consumption.
    The first is an extension of the models studied in the previous chapter in which a
    representative agent accumulates durables to provide a flow of services. Here we
    present the results of Mankiw (1982) which effectively rejects the representative
    agent model. 93
    The second model introduces a non-convexity into the household’s optimization
    problem. The motivation for doing so is evidence that households do not continu-
    ously adjust their stock of durables. This section of the chapter explores this through
    the specification and estimation of a dynamic discrete choice model.
    7.2 Permanent Income Hypothesis Model of Durable
    Expenditures
    We begin with a model that builds upon the permanent income hypothesis structure
    that we used in the previous chapter to study nondurable expenditures. We first
    exhibit theoretical properties of the model and then discuss its empirical implemen-
    tation.
    7.2.1 Theory
    To model expenditures on both durable and non-durable goods, we consider a model
    of household behavior in which the consumer has a stock of wealth (A), a stock
    of durable goods (D) and current income (y). The consumer uses wealth plus
    current income to finance expenditures on current nondurable consumption (c) and
    to finance the purchase of durable goods (e) at a relative price of p.
    There are two transition equations for this problem. One is the accumulation
    equation for wealth given by:
    A′ = R(A + y − c − pe).
    The accumulation equation for durables is similar to that used for capital held by
    the business sector:
    D′ = D(1 − δ) + e (7.1)
    where δ ∈ (0, 1) is the depreciation rate for the stock of durables.
    Utility depends on the flow of services from the stock of durables and the pur-
    chases of nondurables. In terms of timing, assume that durables bought in the
    current period yield services starting in the next period. So, as with capital there
    is a time lag between the order and the use of the durable good.94
    With these details in mind, the Bellman equation for the household is given by:
V(A, D, y, p) = max_{D′,A′} [u(c, D) + βEy′,p′|y,pV(A′, D′, y′, p′)] (7.2)
    for all (A, D, y, p) with
    c = A + y − (A′/R) − p(D′ − (1 − δ)D) (7.3)
    and the transition for the stock of durables given by (7.1). The maximization gives
    rise to two first-order conditions:
uc(c, D) = βREy′,p′|y,pVA(A′, D′, y′, p′) (7.4)

and

uc(c, D)p = βEy′,p′|y,pVD(A′, D′, y′, p′).
    In both cases, these conditions can be interpreted as equating the marginal costs of
    reducing either nondurable or durable consumption in the current period with the
    marginal benefits of increasing the (respective) state variables in the next period.
    Using the functional equation (7.2), we can solve for the derivatives of the value
    function and then update these two first order conditions. This implies:
uc(c, D) = βREy′|yuc(c′, D′) (7.5)

and

puc(c, D) = βEy′,p′|y,p[uD(c′, D′) + p′(1 − δ)uc(c′, D′)] (7.6)
    The first condition should be familiar from the optimal consumption problem
    without durables. The marginal gain of increasing consumption is offset by the
    reduction in wealth and thus consumption in the following period. In this specifi-
    cation, the marginal utility of non-durable consumption may depend on the level
    of durables. So, to the extent there is an interaction within the utility function be-
    tween nondurable and durable goods, empirical work that looks solely at nondurable
    consumption may be inappropriate.95
    The second first order condition compares the benefits of buying durables with
    the marginal costs. The benefits of a durable expenditure comes from two sources.
    First, increasing the stock of durables has direct utility benefits in the subsequent
    period. Second, as the Euler equation characterizes a one-period deviation from
    a proposed solution, the undepreciated part of the additional stock is sold and
    consumed. This is reflected by the second term on the right side. The marginal
    cost of the durable purchase is the reduction in expenditures on nondurables that
    the agent must incur.
    A slight variation in the problem assumes that durables purchased in the current
    period provide services starting that period. Since this formulation is also found
    in the literature, we present it here as well. In this case, the dynamic programming
    problem is:
V(A, D, y, p) = max_{D′,A′} [u(c, D′) + βEy′|yV(A′, D′, y′, p′)] (7.7)
    for all (A, D, y, p) with c defined in (7.3).
    Manipulation of the conditions for optimality implies (7.5) and
puc(c, D′) = [uD(c, D′) + βEy′,p′|y,p p′(1 − δ)uc(c′, D′′)] (7.8)
    If prices are constant (p = p′), then this becomes
uD(c, D′) = βREy′|yuD(c′, D′′).
    This condition corresponds to a variation in which the stock of durables is reduced by
    ε in the current period, the resources are saved and then used to purchase durables
    in the subsequent period.96 As in the case of nondurable consumption, in the special
    case of βR = 1, the marginal utility from durables follows a random walk.
    Note too that regardless of the timing assumption, there are interactions between
    the two Euler equations. One source of interrelationship arises if utility is not
separable between durables and nondurables (ucD ≠ 0). Further, shocks to income
    will influence both durable and nondurable expenditures.
    7.2.2 Estimation of a Quadratic Utility Specification
    Mankiw (1982) studied the pattern of durable expenditures when u(c, D′) is sepa-
    rable and quadratic. In this case, Mankiw finds that durable expenditures follows
    an ARMA(1,1) process given by:
    et+1 = a0 + a1et + εt+1 − (1 − δ)εt
    where a1 = βR. Here the MA piece is parameterized by the rate of depreciation.
Empirically, estimating the model on U.S. data, Mankiw finds that δ is quite close to 1. So, durables appear not to be so durable after all!
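A quick way to see what is involved in this test is to fit an ARMA(1,1) to an expenditure series and read the implied depreciation rate off the moving average coefficient, which in the representation above equals −(1 − δ). The sketch below (assuming the statsmodels package is available) does this on simulated data; with actual durable expenditure series the same regression delivers the estimate of δ discussed in the text.

```python
# Illustrative check of the ARMA(1,1) implication: the MA coefficient on the
# lagged innovation estimates -(1 - delta).  The series `e` is simulated here;
# with real data it would be replaced by observed durable expenditures.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def implied_depreciation(e):
    res = ARIMA(e, order=(1, 0, 1), trend='c').fit()
    return 1.0 + res.maparams[0]            # delta = 1 + estimated MA coefficient

# simulate a series with delta = 0.1, i.e. an MA coefficient of -0.9
rng = np.random.default_rng(0)
eps = rng.normal(size=300)
e = np.zeros(300)
for t in range(1, 300):
    e[t] = 1.0 + 0.9 * e[t - 1] + eps[t] - 0.9 * eps[t - 1]
print(implied_depreciation(e[100:]))        # discard the burn-in
```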
    Adda and Cooper (2000b) study the robustness of Mankiw’s results across differ-
    ent time periods, different frequencies and across countries (US and France). Their
    results are summarized in the following table of estimates.
    [Table 7.1 approximately here]
    These are annual series for France and the US. The rows pertain to both aggre-
    gated durable expenditures and estimates based on cars (both total expenditures
    on cars (for France) and new car registrations). The model is estimated with and
    without a linear trend.
    For both countries, the hypothesis that the rate of depreciation is close to 100%
per year would not be rejected for most of the specifications. Mankiw's "puzzle"
    seems to be robust across categories of durables, countries, time periods and the
    method of detrending.
    Over the past few years, there has been considerable effort to understand Mankiw’s
result. One approach, described below, is to embellish the basic representative agent
    model through the addition of adjustment costs and the introduction of shocks other
    than variations in income. A second approach, coming from Bar-Ilan and Blinder
    (1992) and Bertola and Caballero (1990), is to recognize that at the household level
    durable expenditures are often discrete. We turn to these lines of research in turn.
    7.2.3 Quadratic Adjustment Costs
    Bernanke (1985) goes beyond this formulation by adding in price variations and
    costs of adjustment. As he notes, it is worthwhile to look jointly at the behavior of
    durable and nondurable expenditures as well.97 Consider the dynamic optimization
    problem of:
V(A, D, y, p) = max_{D′,A′} [u(c, D, D′) + βEy′|yV(A′, D′, y′, p′)] (7.9)
    for all (A, D, y, p) where the functional equation holds for all values of the state
    vector. Bernanke assumes a quadratic utility function with quadratic adjustment
    costs of the form:
u(c, D, D′) = −(1/2)(c̄ − c)² − (a/2)(D̄ − D)² − (d/2)(D′ − D)²
where c is non-durable consumption and D is the stock of durables. The adjustment
    cost is part of the utility function rather than the budget constraints for tractability
    reasons. Given the quadratic structure, the model (7.9) can be solved explicitly
    as a (non-linear) function of the parameters. Current non-durable consumption
    is a function of lagged non-durable consumption, the current and lagged stock of
    durables and of the innovation to the income process. Durables can be expressed
    as a function of the past stock of durables and of the innovation to income. The
    two equations with an equation describing the evolution of income are estimated
    jointly by non-linear three stage least squares where current income, non-durable
    consumption and the stock of durables were instrumented to control for simultaneity
    and for measurement error bias. Instruments are lagged measures of prices, non-
    durable consumption, durable stocks and disposable income.
    Overall, the model is rejected by the data when testing the over identifying
    restrictions. The estimation of the cost of adjustment gives conflicting results as
described in more detail in Bernanke (1985). The non-linear function of this
    parameter implies an important cost of adjustment whereas the parameter itself is
    not statistically different from zero.
Bernanke (1984) tests the permanent income hypothesis model at the micro level by looking at car expenditures for a panel of households. While Bernanke does not reject the model on this type of data, it is at odds with observations (described
    below) as it predicts continuous adjustment of the stock whereas car expenditures
    are typically lumpy at the individual level.
    Exercise 7.1
    Write a program to solve (7.9). Obtain the decision rules by the household. Use
    these decision rules to create a panel data set, allowing households to have different
    realizations of income. Consider estimating the Euler equations from the house-
    hold’s optimization problem. If there were non-separabilities present in u(c, D, D′),
particularly ucD ≠ 0, which were ignored by the researcher, what types of “incorrect
    inferences” would be reached?
    7.3 Non Convex Adjustment Costs
    The model explored in the previous section is intended to capture the behavior of
    a representative agent. Despite its theoretical elegance, the model has difficulty
    matching two aspects of the data. First, as noted above, Mankiw’s estimate of close
    to 100% depreciation should be viewed as a rejection of the model. Second, there
    is evidence at the household level that adjustment of the stock of durables is not
continuous. Instead, household purchases of some durables, such as cars as studied by Lam (1991), are relatively infrequent. This may reflect irreversibility due to imperfect information about the quality of used durable goods, the discrete nature
    of some durable goods or the nature of adjustment costs.
    Bar-Ilan and Blinder (1992) and Bar-Ilan and Blinder (1988) present a simple
    setting in which a fixed cost of adjustment implies inaction from the agent when the
    stock of durable is not too far from the optimal one. They argue that the optimal
    consumption of durables should follow an (S,s) policy. When the durable stock
    depreciates to a lower value s, the agent increases the stock to a target value S as
    depicted in Figure 7.1.
    [Figure 7.1 approximately here]
    7.3.1 General Setting
    To gain some insight into the importance of irreversibility, consider the following
    formalization of a model in which irreversibility is important. By this we mean that
    due to some friction in the market for durables, households receive only a fraction
    of the true value of a product they wish to sell. This can be thought of as a version
    of Akerlof’s famous lemons problem.98
    In particular, suppose that the price of durables is normalized to 1 when they
    are purchases (e) but that the price of durables when they are sold (s) is given by
    ps < 1. The Bellman equation for the household’s optimization problem is given by: V (A, D, y) = max(V b(A, D, y), V s(A, D, y), V i(A, D, y)) (7.10) where 186 V b(A, D, y) = max e,A′ u(A + y − (A′/R) − e, D) + βEy′|yV (A′, D(1 − δ) + e, y′) (7.11) V s(A, D, y) = max s,A′ u(A + y −(A′/R) + pss, D) + βEy′|yV (A′, D(1−δ)−s, y′) (7.12) V i(A, D, y) = max A′ u(A + y − (A′/R), D) + βEy′|yV (A′, D(1 − δ), y′) (7.13) for all (A, D, y). This is admittedly a complex problem as it includes elements of a discrete choice (to adjust or not) and also an intensive margin (given adjustment, the level of durable purchases (sales) must be determined). The presence of a gap between the buying and selling price of durables will create inaction. Imagine a household with a substantial stock of durables that experiences an income loss say due to a layoff. In the absence of irreversibility (ps = 1), the household may optimally sell off some durables. If a job is found and the income flow returns, then the stock of durables will be rebuilt. However, in the presence of irreversibility, the sale and subsequent purchase of durables is costly due to the wedge between the buying and selling price of durables. Thus, in response to an income shock, the household may be inactive and thus not adjust its stock. The functional equation in (7.10) cannot be solved using linearization techniques as there is no simple Euler equation given the discrete choice nature of the problem. Instead, value function iteration techniques are needed. As in the dynamic discrete choice problem specified in Chapter 3, one starts with initial guesses of the values of the three options and then induces V (A, D, y) through the max operator. Given these initial solutions, the iteration procedure begins. As there is also an intensive margin in this problem (given adjustment, the stock of durables one can choose is a continuous variable), a state space for durables as well as assets must be specified. This is a complex setting but one that the value function iteration approach can handle. 187 So, given a vector of parameters describing preferences and the stochastic pro- cesses, policy functions can be created. In principle, these can be used to generate moments that can be matched with observations in an estimation exercise. This is described in some detail, for a different model, in the subsequent subsections. 7.3.2 Irreversibility and Durable Purchases Grossman and Laroque (1990) develop a model of durable consumption and also consider an optimal portfolio choice. They assume that the durable good is illiquid as the agent incurs a proportional transaction cost when selling the good. The authors show that under the assumption of a constant relative risk aversion utility function, the state variable is the ratio of wealth A over the stock of durables D. The optimal behavior of the agent is to follow an [s, S] rule, with a target s∗ ∈ [s, S]. The agent does not change the stock of durable if the ratio A/D is within the two bands s and S. If the ratio drifts out of this interval, the agent adjusts it by buying or selling the good such that A/D = s∗. Eberly (1994) empirically investigates the relevance of some aspects of the Grossman- Laroque model. She uses data from the Survey of Consumer Finances which reports information on assets, income and major purchases. She estimates the bands s and S. These bands can be computed by observing the ratio A/D for individuals just before an adjustment is made. 
The target s∗ can be computed as the average ratio just after adjustment. Eberly (1994) estimates the band width and investigates its determinants. She finds that the year-to-year income variance and the income growth rate are strong predictors of the width of the band.

Attanasio (2000) develops a more elaborate estimation strategy for these bands, allowing for unobserved heterogeneity at the individual level. This heterogeneity is needed as, conditional on household characteristics and the value of the ratio of wealth to consumption, some are adjusting their stock and some are not. The estimation is done by maximum likelihood on data drawn from the Consumer Expenditure Survey. The widths of the bands are functions of household characteristics such as age and race. The estimated model is then aggregated to study the aggregate demand for durables.

Caballero (1993) uses the Grossman and Laroque (1990) approach to investigate the aggregate behavior of durable goods. The individual agent is assumed to follow an [s,S] consumption rule because of transaction costs. In the absence of transaction costs, the agent would follow a PIH type behavior as described in section 7.2. Caballero postulates that the optimal behavior of the agent can be described by the distance between the stock of durables held by the agent and the "target" defined as the optimal stock in the PIH model. The agent adjusts the stock when the gap between the realized and the desired stock is big enough. In this setting, the state variables are the stock of durables and the target. The target stock is assumed to follow a known stochastic process. Hence in this model, it is assumed that the evolution of the target is a sufficient statistic for all the relevant economic variables such as prices or income. The aggregate demand for durables is the sum of all agents who decide to adjust their stock in a given period. Hence, Caballero stresses the importance of the cross sectional distribution of the gap between the target and the realized stock. When there is an aggregate shock on the target, the aggregate response depends not only on the size of the shock but also on the number of individuals close to the adjustment line. The aggregate demand for durables can therefore display complicated dynamic patterns. The model is estimated on aggregate US data.

7.3.3 A Dynamic Discrete Choice Model

Suppose that instead of irreversibility, there is a restriction that households can have either no car or one car.99 Thus, by assumption, the household solves a dynamic discrete choice problem. We discuss solutions of that problem, estimation of parameters and aggregate implications in this section.100

Optimal Behavior

We start with the dynamic programming problem as specified in Adda and Cooper (2000b). At the start of a period, the household has a car of a particular age, a level of income and a realization of a taste shock. Formally, the household's state is described by the age of its car, i, a vector Z = (p, Y, ε) of aggregate variables and a vector z = (y) of idiosyncratic variables. Here, p is the relative price of the (new) durable good. Current income is given by the sum Y + y where Y represents aggregate income and y represents idiosyncratic shocks to nondurable consumption that could reflect variations in household income or required expenditures on car maintenance and other necessities.101 The final element in the state vector is a taste shock, ε.
At every point in time, the household decides whether to retain a car of age i, trade it or scrap it. If the household decides to scrap the car, then it receives the scrap value of π and has the option to purchase a new car. If the household retains the car, then it receives the flow of services from that car and cannot, by assumption, purchase another car. Thus the household is constrained to own at most a single car.

Formally, let Vi(z, Z) represent the value of having a car of age i to a household in state (z, Z). Further, let V^k_i(z, Z) and V^r_i(z, Z) represent the values from keeping and replacing an age i car in state (z, Z). Then,

Vi(z, Z) = max[V^k_i(z, Z), V^r_i(z, Z)]

where

V^k_i(z, Z) = u(si, y + Y, ε) + β(1 − δ)EVi+1(z′, Z′) + βδ{EV1(z′, Z′) − u(s1, y′ + Y′, ε′) + u(s1, y′ + Y′ − p′ + π, ε′)} (7.14)

and

V^r_i(z, Z) = u(s1, y + Y − p + π, ε) + β(1 − δ)EV2(z′, Z′) + βδ{EV1(z′, Z′) − u(s1, y′ + Y′, ε′) + u(s1, y′ + Y′ − p′ + π, ε′)}.

In the definition of V^k_i(z, Z), the car is assumed to be destroyed (from accidents and breakdowns) with probability δ, leading the agent to purchase a new car in the next period. The cost of a new car in numeraire terms is p′ − π, which is stochastic since the price of a new car in the next period is random. Further, since it is assumed that there is no borrowing and lending, the utility cost of the new car is given by u(s1, y′ + Y′, ε′) − u(s1, y′ + Y′ − p′ + π, ε′), which exceeds p′ − π as long as u(·) is strictly concave in nondurable consumption. It is precisely at this point that the borrowing restriction appears as an additional transactions cost. Adding in either borrowing and lending or the purchase and sale of used cars presents no modelling difficulties. But adding in wealth as well as resale prices as state variables certainly increases the dimensionality of the problem. This remains as work in progress.

Exercise 7.2
Reformulate (7.14) to allow the household to borrow/lend and also to resell cars in a used car market. What additional state variables would you have to add when these choices are included? What are the new necessary conditions for optimal behavior of the household?

Further Specification

For the application, the utility function is defined to be additively separable between durables and nondurables:

u(si, c) = [ i^(−γ) + ε(c/λ)^(1−ξ)/(1 − ξ) ]

where c is the consumption of non-durable goods, γ is the curvature for the service flow of car ownership, ξ the curvature for consumption and λ is a scale factor. In this specification, the taste shock (ε) influences the contemporaneous marginal rate of substitution between car services and non-durables.

In order for the agent's optimization problem to be solved, a stochastic process for income, prices and the aggregate taste shocks must be specified. Aggregate income, prices and the unobserved preference shock are assumed to follow a VAR(1) process given by:102

Yt = µY + ρYY Yt−1 + ρYp pt−1 + uYt
pt = µp + ρpY Yt−1 + ρpp pt−1 + upt
εt = µε + ρεY Yt−1 + ρεp pt−1 + uεt

The covariance matrix of the innovations u = {uYt, upt, uεt} is

Ω = [ ωY    ωYp   0
      ωpY   ωp    0
      0     0     ωε ]

As the aggregate taste shock is unobserved, we impose a block diagonal structure on the VAR, which enables us to identify all the parameters involving prices and aggregate income in a simple first step regression. This considerably reduces the number of parameters to be estimated in the structural model. We allow prices and income to depend on lagged income and lagged prices.103
The aggregate taste shock potentially depends on lagged prices and income. The coefficients of this process, along with ωε, are estimated within the structural model. By allowing a positive correlation between the aggregate taste shock and lagged prices, given that prices are serially correlated, we can reconcile the model with the fact that sales and prices are positively correlated in the data. This allows us to better capture some additional dynamics of sales and prices in the structural estimation. An alternative way would be to model jointly the producer and consumer side of the economy, to get an upward sloping supply curve. However, solving for the equilibrium is computationally very demanding.

Solving the Model

The model is solved by the value function iteration method. Starting with an initial guess for Vi(z, Z), the value function is updated by backward iterations until convergence. The policy functions that are generated from this optimization problem are of an optimal stopping variety. That is, given the state of the household, the car is scrapped and replaced if and only if the car is older than a critical age. Letting hk(zt, Zt; θ) represent the probability that a car of age k is scrapped, the policy functions imply that hk(zt, Zt; θ) = δ if k < J(zt, Zt; θ) and hk(zt, Zt; θ) = 1 otherwise. Here J(zt, Zt; θ) is the optimal scrapping age in state (zt, Zt) when θ is the vector of parameters describing the economic environment.

In particular, for each value of the idiosyncratic shock z, there is an optimal scrapping age. Aggregating over all possible values of this idiosyncratic shock produces an aggregate policy function which indicates the fraction of cars of a given vintage which are scrapped when the aggregate state of the world is Zt:

Hk(Zt, θ) = ∫ hk(zt, Zt, θ)φ(zt) dzt

where φ(·) is the density function of zt, taken to be the normal distribution. Hk(·) is an increasing function of the vintage and bounded between δ and 1. The aggregated hazard can be used to predict aggregate sales and the evolution of the cross section distribution of car vintages over time. Letting ft(k) denote the period t cross sectional distribution of k, aggregate sales are given by

St(Zt, θ) = Σ_k Hk(Zt, θ) ft(k) (7.15)

From an initial condition on the cross sectional distribution, it is possible to generate a time series for the cross sectional distribution given a particular parameterization of the hazard function. The evolution of ft(k) is given by:

ft+1(k, Zt, θ) = [1 − Hk(Zt; θ)]ft(k − 1) for k > 1 (7.16)
    and
    ft+1(1, Zt, θ) = St(Zt, θ)
    Thus for a given θ and a given draw of T aggregate shocks one can simulate both
    sales and the cross sectional distribution. This can be repeated N times to produce
    N simulated data sets of length T , which can be used in the estimation. Define
    Stn(Zt, θ) = St(pt, Yt, εnt, θ) as the predicted aggregate sales given prices, aggregate
income and unobserved taste shock εnt. Define S̄t(Zt, θ) = (1/N) Σ_{n=1}^{N} Stn(Zt, θ) as
    the average aggregate sales conditional on prices, aggregate income and period t − 1
    cross sectional distribution.
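The aggregation step in (7.15) and (7.16) amounts to simple bookkeeping over the vintage distribution. The sketch below illustrates it with an arbitrary increasing hazard function standing in for Hk(Zt, θ); nothing in it is taken from the estimated model.

```python
# Hedged sketch of (7.15)-(7.16): update the cross-sectional distribution of
# car vintages and compute aggregate sales.  The hazard function below is an
# arbitrary placeholder, only meant to illustrate the bookkeeping.
import numpy as np

K = 20                                  # oldest vintage tracked
delta = 0.08                            # exogenous destruction probability

def hazard(k, Z):
    """Placeholder aggregate hazard, increasing in vintage k, bounded in [delta, 1]."""
    return np.clip(delta + 0.05 * k + 0.1 * Z, delta, 1.0)

def step(f, Z):
    """One period of (7.15)-(7.16): returns (aggregate sales, next distribution)."""
    k = np.arange(1, K + 1)
    H = hazard(k, Z)
    sales = np.sum(H * f)                   # (7.15): S_t = sum_k H_k f_t(k)
    f_next = np.zeros_like(f)
    f_next[1:] = (1.0 - H[:-1]) * f[:-1]    # (7.16): surviving cars age one year
    f_next[0] = sales                       # new cars enter as the youngest vintage
    return sales, f_next

# simulate 50 periods from a uniform initial vintage distribution
rng = np.random.default_rng(0)
f = np.ones(K) / K
for t in range(50):
    sales, f = step(f, rng.normal(scale=0.1))
print(sales, f.sum())
```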
    Estimation Method and Results
In total there are eight parameters to estimate: θ = {γ, δ, λ, ξ, σy, ρεY, ρεp, ωε}. The
    estimation method follows Adda and Cooper (2000b) and is a mix between simulated
    non-linear least squares and simulated method of moments. The first part of the
    criterion matches predicted sales of new cars with the observed ones, conditional
    on prices and aggregate income. The second part of the criterion matches the
    predicted shape of the cross section distribution of car vintages to the observed one.
    The objective function to minimize is written as the sum of the two criteria:
    LN (θ) = αL1N (θ) + L2N (θ)
    where N is the number of simulated draws for the unobserved aggregate taste shock
    εnt. The two criteria are defined by:
L1N(θ) = (1/T) Σ_{t=1}^{T} [ (St − S̄t(θ))² − (1/(N(N − 1))) Σ_{n=1}^{N} (Stn(θ) − S̄t(θ))² ]

L2N(θ) = Σ_{i∈{5,10,15,AR,MA}} αi (F̄^i − F̄^i(θ))²
where S̄t(θ) is the average defined above, F̄^i, i = 5, 10, 15, is the average fraction of cars of age i across all periods, and F̄^i, i = AR, MA, are the autoregressive and moving average
    coefficients from an ARMA(1,1) estimated on aggregate sales.
    The estimation uses two criteria for identification reasons. Matching aggregate
    sales at each period extracts information on the effect of prices and income on
behavior and helps to identify the parameters of the utility function as well as the
    parameters describing the distribution of the aggregate taste shock. However, the
    model is able to match aggregate sales under different values for the agent’s optimal
    stopping time. In other words, there can be different cross section distributions
    that produce aggregated sales which are close to the observed ones. In particular,
    the parameter δ is poorly identified by using only the first criterion. The second
    criterion pins down the shape of the cross section distribution of car vintages.
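The following sketch shows how the two criteria combine. All inputs are placeholders for the observed series and the simulated objects described above; it mirrors the formulas rather than the authors' code.

```python
# Hedged sketch of the criterion L_N(theta) = alpha*L1_N(theta) + L2_N(theta).
# S_obs, S_sim and the Fbar dictionaries are placeholders for objects built
# from the data and the model simulation.
import numpy as np

def criterion(S_obs, S_sim, Fbar_obs, Fbar_sim, alpha, alpha_i):
    """S_obs: (T,) observed sales; S_sim: (T, N) simulated sales over N draws;
    Fbar_obs, Fbar_sim: dicts keyed by 5, 10, 15, 'AR', 'MA';
    alpha_i: dict of weights on the second criterion."""
    T, N = S_sim.shape
    S_bar = S_sim.mean(axis=1)
    # simulated non-linear least squares part, with the simulation-noise correction
    L1 = np.mean((S_obs - S_bar) ** 2
                 - np.sum((S_sim - S_bar[:, None]) ** 2, axis=1) / (N * (N - 1)))
    # moment-matching part: vintage shares and ARMA(1,1) coefficients of sales
    L2 = sum(alpha_i[k] * (Fbar_obs[k] - Fbar_sim[k]) ** 2 for k in alpha_i)
    return alpha * L1 + L2
```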
    [Figure 7.2 approximately here]
    [Figure 7.3 approximately here]
The data come from France and the US and consist of the cross sectional dis-
    tribution of car vintages over time, as well as the aggregate sales of new cars, prices
    and aggregate income. The estimated aggregate hazard functions Ht(Z) over the
    period 1972-1995 for France and 1981-1995 for the US are displayed in Figures 7.2
and 7.3. Note that the probability of replacement for young cars, which is equal to δ, is estimated at a low value, between 5 and 10%. Hence, in contrast with the esti-
    mated PIH models described in section 7.2, the model is able to produce a sensible
    estimate of the rate of depreciation. Moreover, when estimating an ARMA(1,1), as
    in section 7.2.2, on the predicted aggregate sales, the MA coefficient is estimated
    close to zero as in the observed data. Hence, viewed from a PIH perspective, the
    model appears to support a 100% depreciation rate at the aggregate level, whereas
    at the micro level, the depreciation rate is low.
    Once the model is estimated, Adda and Cooper (2000b) investigate the ability
    of the model to reproduce a number of other features such as the impulse response
    of sales to an increase in prices. They also use the estimated model to decompose
    the source of variation in aggregate sales. Within the model, there are two main
    sources, the endogenous evolution of the cross section distribution and the effect of
    aggregate variables such as prices or income. Caballero (1993) seems to imply that
    the evolution of the cross section distribution is an important determinant. However,
    the empirical decomposition shows that its role is relatively minor, compared with
    the effect of income and prices.
    The Impact of Scrapping Subsidies
Adda and Cooper (2000a) use the same framework to analyze the impact of scrap-
    ping subsidies introduced first in France and later in a number of European countries
    such as Spain or Italy.
    From February 1994 to June 1995 the French government offered individuals
    5000 francs (approximately 5 to 10% of the value of a new car) for the scrapping
    of an old car (ten years or older) and the purchase of a new car. Sales of new cars
    which had been low in the preceding period (see Figure 7.4) increased markedly
during the period the policy was in place. From September 1995 to September 1996,
    the government re-introduced the policy, with an age limit of eight years. After
September 1996, the demand for new cars collapsed to a record low level.
    As evident from Figure 7.4, the demand for cars is very cyclical and follows the
    business cycle. The increased demand for new cars during the period 1994-1996
    could be due either to the policy or to the cyclical nature of demand. If the latter
    is true, the French government has been wasting money on car owners who would
    have replaced their cars during that period anyway. Even if the increased demand
    was entirely fueled by the scrapping subsidies, the government has been giving out
    money to car owners who would have replaced their car in the periods ahead. The
effect of the policy is then to bring new sales forward, creating future and potentially
    bigger cycles in car demand. As a large number of new cars were sold in this period,
    demand for new cars was low when the policy stopped, but a peak in demand is
    likely to appear about 10 years after the policy as the cars bought in 1995-1996 are
    scrapped.
    [Figure 7.4 approximately here]
    Adda and Cooper (2000a) estimate the model in section 7.3.3 on the pre-policy
    period. The policy works through the scrapping price π, which is constant and at a
    low value (around 500 French francs) before 1993. When the policy is in place, this
    scrapping price increases and is age specific:
π(i) = 500 if i < 10
π(i) = 5000 if i ≥ 10

Given the estimated model, the effect of the policy can be simulated, as well as the counterfactual without the policy in place. This is done conditional on the cross section distribution of cars at the beginning of the period and conditional on the realized income and prices (prices of new cars are assumed to be independent of the policy; while this is debatable, empirical evidence suggests that prices remained stable throughout the period, mainly because the government negotiated a stable price with car producers).

While the first scrapping subsidy was largely unexpected by the consumers, the second one was partly anticipated. Just after the first subsidy, there were discussions on whether to implement a new one. This is taken into account in the model by adding the scrapping price π(i) as a stochastic state variable. More precisely, π is assumed to follow a first order Markov process, with four states. These four states are described in Table 7.2. The first state models the 1994 reform and the second one the 1995 reform. State 3 is a state with heightened uncertainty, in which there are no subsidies. State 4 is the baseline state. In state 1, the scrap value is set at 5500 F for cars older than 10 years. This state is not assumed to be very permanent: there is only a one percent chance that the subsidy will be in effect in the next period, conditional on being in force in the current period. In state 2, the scrap value is also 5500 F but applies to cars eight years or older.

[Table 7.2 approximately here]

Figures 7.5 and 7.6 display the predicted sales and government revenue relative to baseline. The model captures the peak in sales during the two policies, as well as the decline in between due to the uncertainty. The sales are lower for about 10 years, with little evidence of a subsequent peak. This result is in line with the one discussed in section 7.3.3, where it was found that the evolution of the cross section distribution has little effect on aggregate sales.

[Figure 7.5 approximately here]

[Figure 7.6 approximately here]

Government revenues are lower over the whole period. The government revenue is formed by the value added taxes collected on the purchases of new cars, minus the scrapping subsidies given out for eligible cars. From the perspective of government revenues, the policy is clearly undesirable. In terms of sales, the subsidies accounted for about 8 to 10% of the increased demand.

Chapter 8
Investment

8.1 Overview/Motivation

This chapter studies capital accumulation. Investment expenditures are one of the most volatile elements of the aggregate economy. From the perspective of policy interventions, investment is also key. The dependence of investment on real interest rates is critical to many discussions of the impact of monetary policy. Further, many fiscal policy instruments, such as investment tax credits and accelerated depreciation allowances, act directly through their influence on capital accumulation.

It should seem then that macroeconomics would have developed and evaluated numerous models to meet this challenge. Yet, relative to the enormous work done on consumption, research on investment lags behind. As noted in Caballero (1999), this has changed dramatically in the last 10 or so years.104 Partly, we now have the ability to characterize investment behavior in fairly rich settings. Combined with plant-level data sets, researchers are able to confront a rich set of observations with these sophisticated models.
Investment, with its emphasis on uncertainty and nonconvexities, is a ripe area for applications of dynamic programming techniques. In this chapter, we first analyze a general dynamic optimization problem and then focus on special cases of convex and non-convex adjustment costs. This then sets the stage for the empirical analyses that follow. We also discuss the use of these estimates for the analysis of policy interventions.

8.2 General Problem

The unit of analysis will be the plant, though for some applications (such as consideration of borrowing constraints) focusing on the firm may be more appropriate. The "manager" is assumed to maximize the value of the plant: there are no incentive problems between the manager and the owners. The problem involves the choice of factors of production that are rented for the production period, the hiring of labor and the accumulation of capital. To focus on the investment decision, we assume that demand for the variable inputs (denoted by x) is optimally determined given factor prices (represented by the vector w) and the state variables of the plant's optimization problem, represented by (A, K). Here the vector of flexible factors of production might include labor, materials and energy inputs into the production process. The result of this optimization leaves a profit function, denoted by Π(A, K), which depends solely on the state of the plant, where

Π(A, K) = max_x R(Â, K, x) − wx.

Here R(Â, K, x) denotes revenues given the inputs of capital (K), the variable factors (x) and a shock to revenues and/or productivity, denoted by Â. The reduced form profit function thus depends on the stochastic variable A, which encompasses both  and w, and the stock of physical capital (K). Thus we often refer to A as a profitability shock since it reflects variations in technology, demand and factor prices.

Taking this profit function as given, we consider variations of the following stationary dynamic programming problem:

V(A, K, p) = max_{K′} Π(A, K) − C(K′, A, K) − p(K′ − (1 − δ)K) + βEA′|AV(A′, K′, p′) (8.1)

for all (A, K, p) where K′ = K(1 − δ) + I is the capital accumulation equation and I is investment. Here unprimed variables are current values and primed variables refer to future values. In this problem, the manager chooses the level of the future capital stock, denoted K′. The timing assumption is that new investment becomes productive with a one-period lag. The rate of depreciation of the capital stock is denoted by δ ∈ [0, 1]. The manager discounts the future at a fixed rate of β.105

Exercise 8.1
Suppose that, in contrast to (8.1), investment in period t is productive in that period. Compare these two formulations of the investment problem. Assuming that all functions are differentiable, create Euler equations for each specification. Explain any differences.

Exercise 8.2
How would you modify (8.1) to allow the manager's discount factor to be influenced by variations in the real interest rate?

There are no borrowing restrictions in this framework. So, the choice of investment and thus future capital is not constrained by current profits or retained earnings. We return to this issue later in the chapter when we discuss the implications of capital market imperfections.

There are two costs of obtaining new capital. The first is the direct purchase price, denoted by p. Notice that this price is part of the state vector as it is a source of variation in this economy.106 Second, there are costs of adjustment given by the function C(K′, A, K).
These costs are assumed to be internal to the plant and might include: installation costs, disruption of productive activities in the plant, the need to retrain workers, the need to reconfigure other aspects of the production process, etc. This function is general enough to have components of both convex and non-convex costs of adjustment as well as a variety of transactions costs.

8.3 No Adjustment Costs

To make clear the contribution of adjustment costs, it is useful to start with a benchmark case in which these costs are absent: C(K′, A, K) ≡ 0 for all (K′, A, K). Note though that there is still a time to build aspect of investment so that capital accumulation remains forward looking. The first-order condition for the optimal investment policy is given by:

βEA′,p′|A,pVK(A′, K′, p′) = p (8.2)

where subscripts on the functions denote partial derivatives. This condition implies that the optimal capital stock depends on the realized value of profitability, A, only through an expectations mechanism: given the time to build, current profitability is not relevant for investment except as a signal of future profitability. Further, the optimal capital stock does not depend on the current stock of capital. Using (8.1) to solve for EA′,p′|A,pVK(A′, K′, p′) yields:

βEA′,p′|A,p[ΠK(A′, K′) + (1 − δ)p′] = p. (8.3)

This condition has a natural interpretation. The cost of an additional unit of capital today (p) is equated to the marginal return on capital. This marginal return has two pieces: the marginal profits from the capital (ΠK(A′, K′)) and the resale value of undepreciated capital at the future price ((1 − δ)p′). Substituting for the future price of capital and iterating forward, we find:

pt = β Σ_{τ=0}^{∞} [β(1 − δ)]^τ EAt+τ|At ΠK(Kt+τ+1, At+τ+1)

where pt is the price of capital in period t. So the firm's investment policy equates the purchase price of capital today with the discounted present value of marginal profits in the future. Note that in stating this condition, we are assuming that the firm will be optimally resetting its capital stock in the future so that (8.3) holds in all subsequent periods.

While simple, the model without adjustment costs does not fit the data well. Cooper and Haltiwanger (2000) argue that relative to observations, this model without adjustment costs implies excessive sensitivity of investment to variations in profitability. So, one of the empirical motivations for the introduction of adjustment costs is to temper the otherwise excessively volatile movements in investment. Further, this model is unable to match the observation of inaction in capital adjustment seen (and discussed below) in plant-level data. For these reasons, various models of adjustment costs are considered.107

8.4 Convex Adjustment Costs

In this section, we assume that C(K′, A, K) is a strictly increasing, strictly convex function of future capital, K′.108 The firm chooses tomorrow's capital (K′) using its conditional expectations of future profitability, A′. Of course, to the extent that A′ is correlated with A, current profits will be correlated with future profits.

Assuming that V(K, A, p) exists, an optimal policy, obtained by solving the maximization problem in (8.1), must satisfy:

CK′(K′, A, K) + p = βEA′,p′|A,pVK′(A′, K′, p′). (8.4)

The left side of this condition is a measure of the marginal cost of capital accumulation and includes the direct cost of new capital as well as the marginal adjustment cost.
The right side of this expression measures the expected marginal gains of more capital through the derivative of the value function. This is conventionally termed ”marginal Q” and denoted by q. Note the timing: the appropriate measure of marginal Q is the expected discounted value for the following period due to the one-period investment delay. Using (8.1) to solve for E(A′,p′|A,p)VK′(A′, K′, p′), (8.4) can be simplified to an Euler equation: CK′(K ′, A, K) + p = βE(A′,p′|A,p){ΠK (K′, A′) + p′(1 − δ) − CK′(K′′, A′, K′)}. (8.5) To interpret this necessary condition for an optimal solution, consider increasing current investment by a small amount. The cost of this investment is measured on the left side of this expression: there is the direct cost of the capital (p) as well as the marginal adjustment cost. The gain comes in the following period. The additional capital increases profits. Further, as the manager ”returns” to the optimal path following this deviation, the undepreciated capital is valued at the future market price p′ and adjustment costs are reduced. Exercise 8.3 Suppose that the problem had been written, perhaps more traditionally, with the choice of investment rather than the future capital stock. Derive and analyze the resulting Euler equation. 205 8.4.1 Q Theory: Models One of the difficult aspects of investment theory with adjustment costs is empirical implementation. As the value function and hence its derivative is not observable, (8.4) cannot be directly estimated. Thus the theory is tested either by finding a suitable proxy for the derivative of V (A, K, p) or by estimating the Euler equation, (8.5). We focus here on the development of a theory which facilitates estimation based upon using the average value of the firm as a substitute for the marginal value of an additional unit of capital. This approach, called Q theory, places additional structure on (8.1). In particu- lar, following Hayashi (1982), assume that: Π(K, A) is proportional to K, and that the cost of adjustment function is quadratic.109 Further, we assume that the price of capital is constant. So consider: V (A, K) = max K′ AK − γ 2 ( K′ − (1 − δ)K K )2 K −p(K′−(1−δ)K)+βEA′|AV (A′, K′) (8.6) As always, Bellman’s equation must be true for all (A, K). Suppose that the shock to profitability, A, follows an autoregressive process given by: A′ = ρA + �′ where |ρ| < 1 and �′ is white noise. The first order condition for the choice of the investment level implies that the investment rate in (i ≡ I/K) is given by: i = 1 γ (βEA′|AVK (A ′, K′) − p). (8.7) 206 Here EA′|AVK (A′, K′) is again the expected value of the derivative of the value func- tion, a term we called ”marginal Q”. To solve this dynamic programming problem, we can guess at a solution and verify that it works. Given the linear-quadratic structure of the problem, it is natural to guess that: V (A, K) = φ(A)K where φ(A) is some unknown function. Using this guess, expected marginal Q is a function of A given by: EA′|AVK (A ′, K′) = EA′|Aφ(A ′) ≡ φ̃(A). Note that in this case the expected value of marginal and average Q (defined as V (A, K)/K = φ(A)) are the same.110 Using this in the Euler equation implies that i = 1 γ (βφ̃(A) − p) ≡ z(A). This expression implies that the investment rate is actually independent of the current level of the capital stock. To verify our guess, substitute this investment policy function into the original functional equation implying: φ(A)K = AK − γ 2 (z(A))2K − pz(A)K + βφ̃(A)K[(1 − δ) + z(A)] must hold for all (A, K). 
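The fixed point for φ(A) can also be computed numerically. The sketch below (an illustration added here, not part of the original argument) iterates on the mapping above for a two-state Markov process for A; all parameter values are assumptions, chosen so that the value per unit of capital is finite:

import numpy as np

# Illustrative parameters (assumptions, chosen so the value per unit of capital is finite)
beta, delta, gamma, p = 0.95, 0.10, 10.0, 1.0

A_grid = np.array([0.15, 0.25])          # two-state profitability process
P = np.array([[0.8, 0.2],
              [0.2, 0.8]])               # Markov transition matrix

phi = np.zeros(2)                        # guess for phi(A) in V(A, K) = phi(A) * K
for _ in range(5000):
    phi_tilde = P @ phi                                  # E[phi(A') | A]
    z = (beta * phi_tilde - p) / gamma                   # investment rate i = z(A)
    phi_new = A_grid - 0.5 * gamma * z**2 - p * z + beta * phi_tilde * (1 - delta + z)
    if np.max(np.abs(phi_new - phi)) < 1e-12:
        phi = phi_new
        break
    phi = phi_new

print("phi(A):", phi)
print("investment rates z(A):", (beta * (P @ phi) - p) / gamma)

For these parameter values the iteration converges, delivering φ(A) and the state-contingent investment rates z(A).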
Clearly, the guess that the value function is proportional to K is indeed correct: the value of K cancels from the above expression. So, given the conjecture that V (A, K) is proportional to K, we find an optimal investment policy which confirm the asserted proportionality. The remaining part of the unknown value function φ(A) is given implicity by the expression above.111 The result the value function is proportional to the stock of capital is, at this point, a nice property of the linear-quadratic formulation of the capital accumulation 207 problem. In the discussion of empirical evidence, it forms the basis for a wide range of empirical exercises since it allows the researcher to substitute the average value of Q (observable from the stock market) for marginal Q (unobservable). 8.4.2 Q Theory: Evidence Due to its relatively simple structure, the convex adjustment cost model is one of the leading models of investment. In fact, as discussed above, the convex model is often simplified further so that adjustment costs are quadratic, as in (8.6). Necessary conditions for optimality for this model are expressed in two ways. First, from the first-order conditions, the investment rate is linearly related to the difference between the future marginal value of new capital and the current price of capital, as in (8.7). Using the arguments from above, this marginal value of capital can under some conditions be replaced by the average value of capital. This sets the basis for the Q-theory empirical approach discussed below. Second, one can base an empirical analysis on the Euler equation that emerges from (8.6). This naturally leads to estimation using GMM and is discussed below as well. The discussion of estimation based upon Q-theory draws heavily upon two pa- pers. The first by Gilchrist and Himmelberg (1995) provides a clean and clear presentation of the basic approach and evidence on Q-theory based estimation of capital adjustment models. A theme in this and related papers is that empirically investment depends on variables other than average Q, particularly measures of cash flow. The second by Cooper and Ejarque (2001) works from Gilchrist and Himmel- berg (1995) to explore the significance of imperfect competition and credit market frictions.112 This paper illustrates the use of indirect inference. 208 Tests of Q theory on panel data are frequently conducted using an empirical specification of: (I/K)it = ai0 + a1βEq̄it+1 + a2(Xit/Kit) + υit (8.8) Here the i subscript refers to firm or plant i and the t subscript represents time. From (8.7), a1 should equal 1/γ. This is an interesting aspect of this specification: under the null hypothesis, one can infer the adjustment cost parameter from this regression. There is a constant term in the regression which is plant specific. This comes from a modification of the quadratic cost of adjustment to: C(K′, K) = γ 2 ( K′ − (1 − δ)K K − ai)2K. as in Gilchrist and Himmelberg (1995).113 Finally, this regression includes a third term, (Xit/Kit). In fact, Q theory does not suggest the inclusion of other variables in (8.8) since all relevant information is incorporated in average Q. Rather, these variables are included as a means of testing the theory, where the theory predicts that these variables from the information set should be insignificant. Hence researchers focus on the statistical and economic significance of a2. 
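To fix ideas, the following sketch runs the within (fixed-effects) version of regression (8.8) on a synthetic panel; the numbers used to generate the data are placeholders rather than estimates from the literature, and the cash-flow coefficient is set to zero so that the Q theory null holds in the simulated data by construction:

import numpy as np

# Synthetic panel: placeholder data-generating numbers, with a2 = 0 under the Q theory null
rng = np.random.default_rng(0)
N, T = 200, 20                                   # N plants, T periods

q = rng.normal(3.0, 1.0, size=(N, T))            # proxy for beta * E q_{t+1}
x_over_k = rng.normal(0.2, 0.1, size=(N, T))     # cash flow / K
alpha_i = rng.normal(0.0, 0.02, size=(N, 1))     # plant-specific intercepts
ik = alpha_i + 0.03 * q + 0.0 * x_over_k + rng.normal(0.0, 0.01, size=(N, T))

def within(z):
    """Remove plant-specific means (the fixed-effects transformation)."""
    return z - z.mean(axis=1, keepdims=True)

y = within(ik).ravel()
X = np.column_stack([within(q).ravel(), within(x_over_k).ravel()])
a1, a2 = np.linalg.lstsq(X, y, rcond=None)[0]
print("a1 (1/gamma under the null):", a1, "   a2 on cash flow:", a2)

In an application the synthetic arrays would be replaced by measured investment rates, average Q and cash flow, and the estimate of a1 would be used to infer γ.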
In particular, Xit often includes financial variables as a way of evaluating an alternative hypothesis in which the effects of financial constraints are not included in average Q. The results obtained using this approach have been mixed. Estimates of large adjustment costs are not uncommon. Hayashi (1982) estimates a1 = 0.0423 and thus γ of about 25. Gilchrist and Himmelberg (1995) estimate a1 at 0.033. Further, many studies, estimate a positive value for a2 when Xit is a measure of profits and/or cash flow.114 This is taken as a rejection of the Q theory, which of course implies that the inference drawn about γ from the estimate of a1 may not be valid. Moreover, the significance of the financial variables has lead researchers 209 to conclude that capital market imperfections must be present. Cooper and Ejarque (2001) argue that the apparent failure of Q theory stems from misspecification of the firm’s optimization problem: market power is ignored. As shown by Hayashi (1982), if firms have market power, then average and marginal Q diverge. Consequently, the substitution of marginal for average Q in the standard investment regression induces measurement error that may be positively correlated with profits.115 Cooper and Ejarque (2001) ask whether one might find positive and significant a2 in (8.8) in a model without any capital market imperfections. Their methodology follows the indirect inference procedures described in Gourier- oux and Monfort (1996) and Gourieroux et al. (1993). This approach to estimation was discussed in Chapter 4. This is a minimum distance estimation routine in which the structural parameters of the optimization problem are chosen to bring the re- duced form coefficients from the regression on the simulated data close to those from the actual data. The key is that the same reduced form regression is run on both the actual and simulated data. Cooper and Ejarque (2001) use the parameter estimates of Gilchrist and Himmel- berg (1995) for (8.8) as representative of the Q theory based investment literature. Denote these estimates from their pooled panel sample using the average (Tobin’s) Q measure by (a∗1, a ∗ 2)= (.03, .24). 116 Cooper and Ejarque (2001) add three other moments reported by Gilchrist and Himmelberg (1995): the serial correlation of investment rates (.4), the standard deviation of profit rates (.3) and the average value of average Q (3). Let Ψd denote the vector moments from the data. In the Cooper and Ejarque (2001) study, Ψd = [.03 .24 .4 .3 3]. The estimation focuses on two key parameters: the curvature of the profit func- tion (α) and the level of the adjustment costs (γ). So, they set other parameters 210 at levels found in previous studies: δ = .15 and β = .95. This leaves (α,γ) and the stochastic process for the firm-specific shocks to profitability as the param- eters remaining to be estimated. Cooper and Ejarque (2001) estimate the serial correlation (ρ) and the standard deviation (σ) of the profitability shocks while the aggregate shock process is represented process as a two-state Markov process with a symmetric transition matrix in which the probability of remaining in either of the two aggregate states is .8.117 As described in Chapter 4, the indirect inference procedure proceeds, in this application, by: • given a vector of parameters, Θ ≡ (α,γ, ρ, σ), solve the firm’s dynamic pro- gramming problem of V (A, K) = max K′ AKα−γ 2 ( K′ − (1 − δ)K K )2 K−p(K′−(1−δ)K)+βEA′|AV (A′, K′) (8.9) for all (A, K) using value function iteration. 
The method outlined in Tauchen (1986) is used to create a discrete state space representation of the shock process given (ρ, σ). Use this in the conditional expectation of the optimization problem. • given the policy functions obtained by solving the dynamic programming prob- lem, create a panel data set by simulation • estimate the Q theory model, as in (8.8), on the simulated model and calculate relevant moments. Let Ψs(Θ) denote the corresponding moments from the simulated data • Compute J(Θ) defined as: 211 J(Θ) = (Ψd − Ψs(Θ))′W (Ψd − Ψs(Θ)) (8.10) where W is an estimate of the inverse of the variance-covariance matrix of Ψd. • The estimator of Θ, Θ̂, solves: min Θ J(Θ). The second row of Table 8.1 presents the estimates of structural parameters and standard errors reported in Cooper and Ejarque (2001).118 Table 8.2 reports the resulting regression results and moments. Here the row labelled GH95 represents the regression results and moments reported by Gilchrist and Himmelberg (1995). [Table 8.1 approximately here] [Table 8.2 approximately here] The model, with its four parameters, does a good job of matching four of the five estimates/moments but is unable to reproduce the high level of serial correlation in plant-level investment rates. This appears to be a consequence of the fairly low level of γ which implies that adjustment costs are not very large. Raising the adjustment costs will increase the serial correlation of investment. The estimated curvature of the profit function of .689 implies a markup of about 15%.119 This estimate of α and hence the markup is not at variance with results reported in the literature. The other interesting parameter is the estimate of the level associated with the quadratic cost of adjustment, γ. Relative to other studies, this appears quite low. However, an interesting point from these results is that the estimate of γ is not identified from the regression coefficient on average Q. From this table, the 212 estimated value of γ = .149 is far from the inverse of the coefficient on average Q (about 4). So clearly the identification of the quadratic cost of adjustment parameter from a2 is misleading in the presence of market power. Exercise 8.4 Write a program to solve V (A, K) = max K′ AKα− γ 2 ( K′ − (1 − δ)K K )2 K−p(K′−(1−δ)K)+βEA′|AV (A′, K′) (8.11) using a value function iteration routine given a parameterization of the problem. Use the results to explore the relationship of investment to average Q. Is there a nonlinearity in this relationship? How is investment related to profitability in your simulated data set? 8.4.3 Euler Equation Estimation This approach to estimation shares with the consumption applications presented in Chapter 6 a simple but powerful logic. The Euler equation given in (8.5) is a necessary condition for optimality. In the quadratic cost of adjustment model case this simplifies to: it = 1 γ [ β[Et(πK (At+1, Kt+1) + pt+1(1 − δ) + γ 2 i2t+1 + γ(1 − δ)it+1] − pt ] . Let εt+1 be defined from realized values of these variables: εt+1 = it − 1 γ [ β[(πK (At+1, Kt+1) + pt+1(1 − δ) + γ 2 i2t+1 + γ(1 − δ)it+1)] − pt ] . (8.12) Then the restriction imposed by the theory is that Etεt+1 = 0. It is precisely this orthogonality condition that the GMM procedure exploits in the estimation of 213 underlying structural parameters, θ = (β, γ, δ, α). To illustrate, we have solved and simulated a model with quadratic adjustment costs (γ = 2) with constant investment good prices. 
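A sketch of how such a model might be solved and simulated appears below; apart from γ = 2, the parameter values, the capital grid and the two-state profitability process are illustrative assumptions rather than the exact specification used in the text:

import numpy as np

# Illustrative parameterization (assumptions): profit A*K^alpha, quadratic adjustment costs,
# constant price of capital, gamma = 2 as in the text
alpha, beta, delta, gamma, p = 0.5, 0.95, 0.15, 2.0, 1.0

A_grid = np.array([0.9, 1.1])                     # two-state profitability shock
P = np.array([[0.8, 0.2],
              [0.2, 0.8]])                        # Markov transition matrix

K_grid = np.linspace(1.0, 15.0, 300)              # grid for current and future capital
nA, nK = len(A_grid), len(K_grid)

K = K_grid[None, :, None]                         # current capital
Kp = K_grid[None, None, :]                        # next-period capital
I = Kp - (1 - delta) * K
ret = A_grid[:, None, None] * K**alpha - 0.5 * gamma * (I / K)**2 * K - p * I

V = np.zeros((nA, nK))
for _ in range(2000):                             # value function iteration
    EV = P @ V                                    # E[V(A', K') | A]
    V_new = np.max(ret + beta * EV[:, None, :], axis=2)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

policy = np.argmax(ret + beta * (P @ V)[:, None, :], axis=2)    # index of optimal K'

# Simulate a single plant to build a data set of investment rates and shocks
rng = np.random.default_rng(0)
T, iK, iA = 200, nK // 2, 0
inv_rates, shocks = [], []
for t in range(T):
    iK_next = policy[iA, iK]
    inv_rates.append((K_grid[iK_next] - (1 - delta) * K_grid[iK]) / K_grid[iK])
    shocks.append(A_grid[iA])
    iK = iK_next
    iA = rng.choice(nA, p=P[iA])

print("mean investment rate:", np.mean(inv_rates))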
Using that data set, we can estimate the parameters of the firm’s problem using GMM. To make this as transparent as possible, assume that the researcher knows the values of all parameters except for γ. Thus, we can rely on a single orthogonality condition to determine γ. Suppose that we use the lagged profitability shock as the instrument. Define Ω(γ) = 1 T ∑ t εt+1(γ)At (8.13) The GMM estimate of γ is obtained from the minimization of Ω(γ). This function is shown in Figure 8.1. Clearly, this function is minimized near γ = 2.120 [Figure 8.1 approximately here] Whited (1998) contains a thorough review and analysis of existing evidence on Euler equation estimation of investment models. As Whited notes, the Euler equation approach certainly has a virtue over the Q-theory based model: there is no need to try to measure marginal Q. Thus some of the restrictions imposed on the estimation, such as the conditions specified by Hayashi, do not have to be imposed. Estimation based upon an investment Euler equation generally leads to rejection of the overidentifying restrictions and, as in the Q-theory based empirical work, the inclusion of financial constraints improves the performance of the model. The point of Whited (1998) is to dig further into these results. Importantly, her analysis brings the importance of fixed adjustment costs into the evaluation of the Euler equation estimation. As noted earlier and discussed at some length below, investment studies have been broadened to go beyond convex adjustment costs to match the observations of non-adjustment in the capital stock. Whited (1998) takes 214 this into account by dividing her sample into the set of firms which undertakes positive investment. Estimation of the Euler equation for this subset is much more successful. Further Whited (1998) finds that while financial variables are important overall, they are also weakly relevant for the firms with ongoing investment. These results are provocative. They force us to think jointly about the pres- ence of non-convex adjustment costs and financial variables. We now turn to these important topics. 8.4.4 Borrowing Restrictions Thus far, we have ignored the potential presence of borrowing restrictions. These have a long history in empirical investment analysis. As in our discussion of the empirical Q-theory literature, financial frictions are often viewed as the source of the significance of profit rates and/or cash flow in investment regressions. There is nothing particularly difficult about introducing borrowing restrictions into the capital accumulation problem. Consider: V (A, K) = max K′∈Γ(A,K) AKα − γ 2 ( K′ − (1 − δ)K K )2 K (8.14) − p(K′ − (1 − δ)K) + βEA′|AV (A′, K′) (8.15) for all (A, K) where Γ(A, K) constrains the choice set for the future capital stock. So, for example, if capital purchases had to be financed out of current profits, then the financial restriction is K′ − (1 − δ)K ≤ AKα (8.16) so that Γ(A, K) = [0, AKα + (1 − δ)K] (8.17) The dynamic optimization problem with a restriction of this form can certainly be evaluated using value function iteration techniques. The problem of the firm can 215 be broadened to include retained earnings as a state variable and to include other financial variables in the state vector. There are a number of unresolved issues though that have limited research in this area: • What are the Γ(A, K) functions suggested by theory? • For what Γ(A, K) functions is there a wedge between average and marginal Q? 
The first point is worthy of note: while we have many models of capital accu- mulation without borrowing restrictions, the alternative model of investment with borrowing restrictions is not on the table. Thus, the rejection of the model without constraints in favor of one with constraints is not as convincing as it could be. The second point, related to work by Chirinko (1993) and Gomes (2001), returns to the evidence discussed earlier on Q theory based empirical models of investment. The value function, V (A, K) that solves (8.15) contains all the information about the constrained optimization problem. As long as this function is differentiable (which restricts the Γ(A, K) function), marginal Q will still measure the return to an extra unit of capital. The issue is whether the borrowing friction introduces a wedge between marginal and average Q.121 Empirically, the issue is whether this wedge between marginal and average Q can create the regression results such as those reported in Gilchrist and Himmelberg (1995). 8.5 Non-Convex Adjustment: Theory Empirically, one finds that at the plant level there are frequent periods of invest- ment inactivity and also bursts of investment activity. Table 8.3 below, taken from Cooper and Haltiwanger (2000), documents the nature of capital adjustment in the Longitudinal Research Database (LRD), a plant level U.S. manufacturing data 216 set.122 [Table 8.3 approximately here] Here inaction is defined as a plant level investment rate less than .01 and a spike is an investment rate in excess of 20%. Clearly the data exhibit both inaction as well as large bursts of investment. As argued by Caballero et al. (1995), Cooper et al. (1999) and Cooper and Haltiwanger (2000) it is difficult to match this type of evidence with a quadratic cost of adjustment model. Thus we turn to alternative models which can produce inaction. In the first type of model, we relax the convex adjustment cost structure and assume that the costs of adjustment depend only on whether investment has been undertaken and not its magnitude. We then consider a second type of model in which there is some type of irreversibility. The next section reports on estimation of these models. 8.5.1 Non-convex Adjustment Costs For this formulation of adjustment costs, we follow Cooper and Haltiwanger (1993) and Cooper et al. (1999) and consider a dynamic programming problem specified at the plant level as: V (A, K, p) = max{V i(A, K, p), V a(A, K, p)} for all (A, K, p) (8.18) where the superscripts refer to active investment ”a” and inactivity ”i”. These options, in turn, are defined by: V i(A, K, p) = Π(A, K) + βEA′,p′|A,pV (A ′, K(1 − δ), p′) and 217 V a(A, K, p) = max K′ Π(A, K)λ − F K − p(K′ − (1 − δ)K) + βEA′,p′|A,pV (A′, K′, p′). Here there are two costs of adjustment that are independent of the level of investment activity. The first is a loss of profit flow equal to (1− λ). This is intended to capture an opportunity cost of investment in which the plant must be shut down during a period of investment activity. The second non-convex cost is simply subtracted from the flow of profits as F K. The inclusion of K here is intended to capture the idea that these fixed costs, while independent of the current level of investment activity, may have some scale aspects to them.123 In this formulation, the relative price of capital (p) is allowed to vary as well. 
Before proceeding to a discussion of results, it might be useful to recall from Chapter 3 how one might obtain a solution to a problem such as (8.18).124 The first step is to specify a profit function, say Π(A, K) = AKα and to set the parameters, (F, β, λ, α, δ) as well as the stochastic processes for the random variables (A, p). Denote this parameter vector by Θ. The second step is to specify a space for the state variables, (A, K, p) and thus for control variable K′. Once these steps are complete, the value function iteration logic (subscripts denote iterations of the mapping) takes over: • provide an initial guess for V1(A, K, p), such as the one period solution • using this initial guess, compute the values for the two options, V a1 (A, K, p) and V i1 (A, K, p) • using these values, solve for the next guess of the value function: V2(A, K, p) = max {V a1 (A, K, p) , V i1 (A, K, p)} • continue this process until convergence 218 • once the value function is known, it is straightforward to compute the set of state variables such that action (inaction) are optimal as well as the investment level in the event adjustment is optimal. • given these policy functions, the model can be simulated to create either a panel or a time series data set. The policy function for this problem will have two important dimensions. First, there is the determination of whether the plant will adjust its capital stock or not. Second, conditional on adjustment, the plant must determine its level of investment. As usual, the optimal choice of investment depends on the marginal value of capital in the next period. However, in contrast to say the quadratic cost of adjustment model, the future value of additional capital depends on future choice with respect to adjustment. Thus there is no simple Euler equation linking the marginal cost of additional capital today with future marginal benefit, as in (8.5), since there is no guarantee that this plant will be adjusting its capital stock in the future period. Note that the two types of costs have very different implications for the cyclical properties of investment. In particular, when adjustment costs interfere with the flow of profits (λ < 1) then it is more expensive to investment in periods of high profitability. Yet, if the shocks are sufficiently correlated, there is a gain to investing in good times. In contrast, if costs are largely lump sum, then given the time to build aspect of the accumulation decision, the best time to invest is when it is prof- itable to do so (A is high) assuming that these shocks are serially correlated. Thus whether investment is procyclical or countercyclical depends on both the nature of the adjustment costs and the persistence of shocks. We shall discuss the policy functions for an estimated version of this model below. For now, we look at a simple example to build intuition. 219 Machine Replacement Example As an example, we turn to a modified version of the simple model of machine replacement studied by Cooper and Haltiwanger (1993). Here there is no choice of the size of the investment expenditure. Investment means the purchase of a new machine at a net price of p. By assumption the old machine is scrapped. The size of the new machine is normalized to 1.125 Further, to simplify the argument, we assume that new capital becomes pro- ductive immediately. In addition, the price of new capital good is assumed to be constant and can be interpreted as including the fixed cost of adjusting the capital stock. 
In this case, we can write the Bellman equation as: V (A, K) = max{V i(A, K), V a(A, K)} for all (A, K) where the superscripts refer to active investment “a” and inactivity “i”. These options, in turn, are defined by: V i(A, K) = Π(A, K) + βEA′|AV (A ′, K(1 − δ)) and V a(A, K) = Π(A, 1)λ − p + βEA′|AV (A′, (1 − δ)). So here ”action” means that a new machine is bought and is immediately productive. The cost of this is the net price of the new capital and the disruption caused by the adjustment process. Let ∆(A, K) be the relative gains to action so: ∆(A, K) ≡ V a(A, K) − V i(A, K) = Π(A, 1)λ − Π(A, K) − p + β ( EA′|AV (A ′, (1 − δ)) − EA′|AV (A′, K(1 − δ)) ) 220 The problem posed in this fashion is clearly one of the optimal stopping vari- ety. Given the state of profitability (A), there is a critical size of the capital stock (K∗(A)) such that machine replacement occurs if and only if K < K∗(A). To see why this policy is optimal, note that by our timing assumption, V a(A, K) is in fact independent of K. Clearly V i(A, K) is increasing in K. Thus there is a unique cross- ing of these two functions at K∗(A). In other words, ∆(A, K) is decreasing in K, given A with ∆(A, K∗(A)) = 0. Is K∗ between 0 and 1? With Π(A, 0) sufficiently small, V i(A, K) < V a(A, K) for K near 0. Hence, K∗ > 0. Further, with the costs of acquiring new capital
(p > 0, λ < 1) large enough and the rate of depreciation low enough, capital will not be replaced each period: K∗ < 1. Thus there will be a ”replacement cycle” in which there is a burst of investment activity followed by inactivity until the capital ages enough to warrant replacement.126 The policy function is then given by z(A, K) ∈ {0, 1}, where z(A, K) = 0 means inaction and z(A, K) = 1 means replacement. From the argument above, for each A there exists K∗(A) such that z(A, K) = 1 if and only if K ≤ K∗(A). With the assumption that capital becomes productive immediately, the response of K∗(A) to variations in A can be analyzed.127 Suppose, for example, that λ = 1 and A is iid. In this case, the dependence of ∆(A, K) on A is solely through current profits. Thus ∆(A, K) is increasing in A as long as the marginal productivity of capital is increasing in A, ΠAK (A, K) > 0. So K∗(A) will be increasing in A and replacement will be more likely in good times.
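Before turning to the case λ < 1, a minimal numerical sketch of the replacement problem may be useful; the parameter values below are assumptions chosen only to generate an interior threshold K∗(A), and the state space exploits the fact that, absent replacement, capital simply depreciates:

import numpy as np

# Illustrative parameters (assumptions): profit A*K^theta, opportunity cost lambda, price p
beta, delta, lam, p, theta = 0.95, 0.10, 0.9, 0.3, 0.5

A_grid = np.array([0.9, 1.1])
P = np.array([[0.8, 0.2],
              [0.2, 0.8]])

# Absent replacement, capital only depreciates, so K lives on the grid {(1-delta)^j}
n_age = 40
K_grid = (1 - delta) ** np.arange(n_age)
age_next = np.minimum(np.arange(n_age) + 1, n_age - 1)   # ageing, capped at the oldest grid point

profit = A_grid[:, None] * K_grid[None, :] ** theta      # Pi(A, K) on the grid

V = np.zeros((2, n_age))
for _ in range(2000):
    EV = P @ V                                           # E[V(A', .) | A]
    V_i = profit + beta * EV[:, age_next]                # inaction: capital ages one step
    V_a = (lam * A_grid - p + beta * EV[:, 1])[:, None]  # replace: Pi(A,1) = A, new machine ages to index 1
    V_new = np.maximum(V_i, V_a)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

EV = P @ V
replace = (lam * A_grid - p + beta * EV[:, 1])[:, None] > (profit + beta * EV[:, age_next])
for a, A in enumerate(A_grid):
    cutoff = K_grid[replace[a]]
    if cutoff.size:
        print(f"A = {A}: replace whenever K <= {cutoff.max():.3f}")
    else:
        print(f"A = {A}: no replacement on this grid")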
    Alternatively, suppose that λ < 1. In this case, during periods of high produc- tivity it is desirable to have new capital but it is also costly to install it. If A is positively serially correlated, then the effect of A on ∆(A, K) will reflect both the direct effect on current profits and the effects on the future values. If the opportu- nity cost is large (a small λ) and shocks are not persistent enough, then machine 221 replacement will be delayed until capital is less productive. Aggregate Implications of Machine Replacement This model of capital adjustment at the plant level can be used to generate aggre- gate implications. Let ft(K) be the current distribution of capital across a fixed population of plants. Suppose that the shock in period t, At, has two components, At = atεt. The first is aggregate and the second is plant specific. Following Cooper et al. (1999), assume that the aggregate shock takes on two values and the plant specific shock takes on 20 values. Further, assume that the idiosyncratic shocks are iid. With this decomposition, write the policy function as z(at, εt, Kt) where z(at, εt, Kt) = 1 signifies actions and z(at, εt, Kt) = 0 indicates inaction. Clearly the decision on replacement will generally depend differentially on the two types of shocks since they may be drawn from different stochastic properties. For example, if the aggregate shock is more persistent than the plant specific one, the response to a variation in at will be larger than the response to an innovation in εt. Define H(at, K) = ∫ ε z(at, εt, K)dGt(ε) where Gt(ε) is the period t cumulative distribution function of the plant specific shocks. Here H(at, K) is a hazard function representing the probability of adjust- ment for all plants with capital K in aggregate state at. To the extent that the researcher may be able to observe aggregate but not plant specific shocks, H(at, K) represents a hazard that averages over the {0, 1} choices of the individual plants so that H(at, K) ∈ [0, 1]. Using this formulation, let I(at; ft(K)) be the rate of investment in state at given the distribution of capital holdings ft(K) across plants. Aggregate investment is defined as: 222 I(at; ft(K)) = ∑ K H(at, K)ft(K). (8.19) Thus total investment reflects the interaction between the average adjustment haz- ard and the cross sectional distribution of capital holdings. The evolution of the cross sectional distribution of capital is given by: gt+1((1 − δ)K) = (1 − H(at, K))gt(K) (8.20) Expressions such as these are common in aggregate models of discrete adjust- ment, see for example, Rust (1985) and Caballero et al. (1995). Given an initial cross sectional distribution and a hazard function, a sequence of shocks will thus generate a sequence of aggregate investment levels from (8.19) and a sequence of cross sectional distributions from (8.20). Thus the machine replacement problem can generate both a panel data set and, through aggregation, time series as well. In principle, estimation from aggregate data supplements the perhaps more direct route of estimating a model such as this from a panel. Exercise 8.5 Use a value function iteration routine to solve the dynamic optimization problem with a firm when there are non-convex adjustment costs. Suppose there is a panel of such firms. Use the resulting policy functions to simulate the time series of aggre- gate investment. Now, use a value function iteration routine to solve the dynamic optimization problem with a firm when there are quadratic adjustment costs. 
Create a time series from the simulated panel. How well can a quadratic adjustment cost model approximate the aggregate investment time series created by the model with non-convex adjustment costs? 223 8.5.2 Irreversibility The specifications considered thus far do not distinguish between the buying and selling prices of capital. However, there are good reasons to think that investment is at least partially irreversible so that the selling price of a unit of used capital is less than the cost of a unit of new capital. This reflects frictions in the market for used capital as well as specific aspects of capital equipment that may make them imperfectly suitable for uses at other production sites. To allow for this, we alter our optimization problem to distinguish the buying and selling prices of capital: The value function for this specification is given by: V (A, K) = max{V b(A, K), V s(A, K), V i(A, K)} for all (A, K) where the superscripts refer to the act of buying capital ”b”, selling capital ”s” and inaction ”i”. These options, in turn, are defined by: V b(A, K) = max I Π(A, K) − I + βEA′|AV (A′, K(1 − δ) + I), V s(A, K) = max R Π(A, K) + psR + βEA′|AV (A ′, K(1 − δ) − R) and V i(A, K) = Π(A, K) + βEA′|AV (A ′, K(1 − δ)). Under the buy option, the plant obtains capital at a cost normalized to one. Under the sell option, the plant retires R units of capital at a price ps. The third option is inaction so that the capital stock depreciates at a rate of δ. Intuitively, the gap between the buying and selling price of capital will produce inaction. Suppose that there is an adverse shock to the profitability of the plant. If this shock was known to be temporary, then selling capital and repurchasing it in the near future 224 would not be profitable for the plant as long as ps < 1. Thus inaction may be optimal. Clearly though, the amount of inaction that this model can produce will depend on both the size of ps relative to 1 and the serial correlation of the shocks. 128 8.6 Estimation of a Rich Model of Adjustment Costs Using this dynamic programming structure to understand the optimal capital deci- sion at the plant (firm) level, we confront the data on investment decisions allowing for a rich structure of adjustment costs.129 To do so, we follow Cooper and Halti- wanger (2000) and consider a model with quadratic adjustment costs, non-convex adjustment costs and irreversibility. We describe the optimization problem and then the estimation results obtained by Cooper and Haltiwanger. 8.6.1 General Model The dynamic programming problem for a plant is given by: V (A, K) = max{V b(A, K), V s(A, K), V i(A, K)} (8.21) for all (A, K) where, as above, the superscripts refer to the act of buying capital ”b”, selling capital ”s” and inaction ”i”. These options, in turn, are defined by: V b(A, K) = max I Π(A, K) − F K − I − γ 2 [I/K]2K + βEA′|AV (A ′, K(1 − δ) + I), V s(A, K) = max R Π(A, K) + psR − F K − γ 2 [R/K]2K + βEA′|AV (A ′, K(1 − δ) − R) and 225 V i(A, K) = Π(A, K) + βEA′|AV (A ′, K(1 − δ)). Cooper and Haltiwanger (2000) estimate three parameters, Θ ≡ (F, γ, ps) and assume that β = .95, δ = .069. Further, they specify a profit function of Π(A, K) = AKθ with θ=.50 estimated from a panel data set of manufacturing plants.130 Note that the adjustment costs in (8.21) exclude any disruptions to the production process so that the Π(A, K) can be estimated and the shock process inferred independently of the estimation of adjustment costs. 
If these additional adjustment costs were added, then the profit function and the shocks would have to be estimated along with the parameters of the adjustment cost function. These parameters are estimated using an indirect inference routine. The reduced form regression used in the analysis is: iit = αi + ψ0 + ψ1ait + ψ2(ait) 2 + uit (8.22) where iit is the investment rate at plant i in period t, ait is the log of a profitability shock at plant i in period t and αi is a fixed effect. 131 This specification was chosen as it captures in a parsimonious way the nonlinear relationship between investment rates and fundamentals. The profitability shocks are inferred from the plant level data using the estimated profit function.132 Cooper and Haltiwanger document the extent of the nonlinear response of investment to shocks. Table 8.4 reports Cooper and Haltiwanger’s results for four different models along with standard errors. The first row shows the estimated parameters for the most general model. The parameter vector Θ = [0.043, 0.00039, 0.967] implies the presence of statistically significant convex and non-convex adjustment costs (but non-zero) and a relatively substantial transaction cost. Restricted versions of the model are also reported for purposes of comparison. Clearly the mixed model does 226 better than any of the restricted models. [Table 8.4 approximately here] Cooper and Haltiwanger argue that these results are reasonable.133 First, as noted above a low level for the convex cost of adjustment parameter is consistent with the estimates obtained from the Q-theory based models due to the presence of imperfect competition. Further, the estimation implies that the fixed cost of adjustment is about 0.04% of average plant level profits. Cooper and Haltiwanger find that this cost is significant relative to the difference between adjusting and not adjusting the capital stock. So, in fact, the estimated fixed cost of adjustment, along with the irreversibility, produces a large amount of inaction. Finally, the estimated selling price of capital is much higher than the estimate report in Ramey and Shapiro (2001) for some plants in the aerospace industry. Cooper and Haltiwanger (2000) also explore the aggregate implications of their model. They contrast the time series behavior of the estimated model with both convex and non-convex adjustment costs against one in which there are only convex adjustment costs. Even though the model with only convex adjustment costs does relatively poorly on the plant-level data, it does reasonably well in terms of matching time series. In particular, Cooper and Haltiwanger (2000) find that over 90% of the time series variation in investment created by a simulation of the estimated model can be accounted for by a quadratic adjustment model. Of course, this also implies that the quadratic model misses 10% of the variation. Note too that this framework for aggregation captures the smoothing by ag- gregating over heterogeneous plants but misses smoothing created by variations in relative prices. From Thomas (2000) and Kahn and Thomas (2001) we know that this additional source of smoothing can be quite powerful as well. 227 8.6.2 Maximum Likelihood Estimation A final approach to estimation follows the approach in Rust (1987). 
Consider again, for example, the stochastic machine replacement problem given by: V (A, K, F ) = max{V i(A, K, F ), V a(A, K, F )} for all (A, K, F ) (8.23) where: V i(A, K, F ) = Π(A, K) + βEA′|AV (A ′, K(1 − δ), F ′) and V a(A, K, F ) = max K′ Π(A, K)λ − F K − p(K′ − (1 − δ)K) + βEA′|AV (A′, K′, F ′). Here we have added the fixed cost of adjustment into the state vector as we assume that the adjustment costs are random at the plant level. Let G(F ) represent the cu- mulative distribution function for these adjustment costs. Assume that these are iid shocks. Then, given a guess for the functions {V (A, K, F ), V i(A, K, F ), V a(A, K, F )}, the likelihood of inaction can be computed directly from the cumulative distribu- tion function G(·). Thus a likelihood function can be constructed which depends on the parameters of the distribution of adjustment costs and those underlying the dynamic optimization problem. From there, a maximum likelihood estimate can be obtained.134 8.7 Conclusions The theme of this chapter has been the dynamics of capital accumulation. From the plant-level perspective, the investment process is quite rich and entails periods 228 of intense activity followed by times of inaction. This has been documented at the plant-level. Using the techniques of the estimation of dynamic programming models, this chapter has presented evidence on the nature of adjustment costs. Many open issues remain. First, the time series implications of non-convexities is still not clear. How much does the lumpiness at the plant-level matter for aggregate behavior? Put differently, how much smoothing obtains from the aggregate across heterogeneous plants as well as through variations in relative prices? Second, there are a host of policy experiments to be considered. What, for exam- ple, are the implications of investment tax credits given the estimates of adjustment cost parameters? Exercise 8.6 Add in variations in the price of new capital into the optimization problem given in (8.21). How would you use this to study the impact of, say, an investment tax credit? Chapter 9 Dynamics of Employment Adjustment 9.1 Motivation This chapter studies labor demand. The usual textbook model of labor demand depicts a firm as choosing the number of workers and their hours given a wage rate. But, the determination of wages, employment and hours is much more complex than this. The key is to recognize that the adjustment of many factors of production, including labor, is not costless. We study the dynamics of capital accumulation elsewhere in this book and in this chapter focus attention on labor demand. Understanding the nature of adjustment costs and thus the factors determined labor demand is important for a number of reasons. First, many competing models of the business cycle depend crucially on the operation of labor markets. As empha- sized in Sargent (1978), a critical point in distinguishing competing theories of the business cycle is whether labor market observations could plausibly be the outcome of a dynamic market clearing model. Second, attempts to forecast macroeconomic conditions often resort to consideration of observed movements in hours and em- 229 230 ployment to infer the state of economic activity. Finally, policy interventions in the labor market are numerous and widespread. These include: restrictions on wages, restrictions on hours, costs of firing workers and so forth. Policy evaluate requires a model of labor demand. 
We begin the chapter with the simplest models of dynamic labor demand where adjustment costs are assumed to be convex and continuously differentiable. These models are analytically tractable as we can often estimate their parameters directly from first-order conditions. However, they have implications of constant adjustment that are not consistent with microeconomic observations. Nickell (1978) argues: “One point worth noting is that there seems little reason to suppose costs per worker associated with either hiring or firing increase with the rate at which employees flow in or out. Indeed, given the large fixed costs associated with personnel and legal departments, it may even be more reasonable to suppose that the average cost of adjusting the workforce diminishes rather than increases with the speed of adjustment.” This quote is supported by recent evidence in Hamermesh (1989) and Caballero et al. (1997) that labor adjustment is rather erratic at the plant level with periods of inactivity punctuated by large adjustments. Thus this chapter goes beyond the con- vex case and considers models of adjustment which can mimic these microeconomic facts. 9.2 General Model of Dynamic Labor Demand In this chapter, we consider variants of the following dynamic programming problem: V (A, e−1) = max h,e R(A, e, h) − ω(e, h, A) − C(e, e−1) + βEA′|AV (A′, e). (9.1) 231 for all (A, e−1). Here A represents a shock to the profitability of the plant and/or firm. As in our discussion of the investment problem, this shock could reflect vari- ations in product demands or variations in the productivity of inputs. Generally A will have a component that is common across plants, denoted a, and one that is plant specific, denoted ε.135 The function R(A, e, h) represents the revenues which depend on the hours worked (h) and the number of workers (e) as well as the prof- itability shock. Other factors of production, such as capital, are assumed to be rented and optimization over these inputs are incorporated into R(A, e, h).136 The function ω(e, h, A) is the total cost of hiring e workers when each supplies h units of labor time. This general specification allows for overtime pay and other provisions. Assume that this compensation function is increasing in both of its arguments and is convex with respect to hours. Further, we allow this compensation function to be state dependent. This may reflect a covariance with the idiosyncratic profitability shocks (due, perhaps, to profit sharing arrangements) or an exogenous stochastic component in aggregate wages. The function C(e, e−1) is the cost of adjusting the number of workers. Hamer- mesh (1993) and Hamermesh and Pfann (1996) provide a lengthy discussion of var- ious interpretations and motivations for adjustment costs. This function is meant to cover costs associated with: • search and recruiting • training • explicit firing costs • variations in complementary activities (capital accumulation, reorganization of production activities, etc.) It is important to note the timing implicit in the statement of the optimization 232 problem. The state vector includes the stock of workers in the previous period, e−1. In contrast to the capital accumulation problem, the number of workers in the current period is not predetermined. Instead, workers hired in the current period are immediately utilized in the production process: there is no ”time to build”. 
The next section of the chapter is devoted to the study of adjustment cost functions such that the marginal cost of adjustment is positive and increasing in e given e−1. We then turn to more general adjustment cost functions which allow for more nonlinear and discontinuous behavior. 9.3 Quadratic Adjustment Costs Without putting additional structure on the problem, particularly the nature of adjustment costs, it is difficult to say much about dynamic labor demand. As a starting point, suppose that the cost of adjustment is given by C(e, e−1) = η 2 (e − (1 − q)e−1)2. (9.2) so C(e, e−1) is convex in e and continuously differentiable. Here, q is an exogenous quit rate. In this specification of adjustment costs, the plant/firm incurs a cost of changing the level of employment relative to the stock of workers ((1 − q)e−1) that remain on the job from the previous period. Of course, this is a modelling choice: one can also consider the case where the adjustment cost is based on net rather than gross hires.137 The first-order conditions for (9.1) using (9.2) are: Rh(A, e, h) = ωh(e, h, A) and (9.3) Re(A, e, h) − ωe(e, h, A) − η(e − (1 − q)e−1) + βEVe(A′, e) = 0. (9.4) Here the choice of hours, given in (9.3) is static: the firm weighs the gains to the 233 increasing labor input against the marginal cost (assumed to be increasing in hours) of increasing hours. In contrast, (9.4) is a dynamic relationship since the number of employees is a state variable. Assuming that the value function is differentiable, EVe(A ′, e′) can be evaluated using (9.1) leading to: Re(A, e, h) − ωe(e, h, A) − η(e − (1 − q)e−1) + βE[η(e′ − (1 − q)e)(1 − q)] = 0 (9.5) The solution to this problem will yield policy functions for hours and employment given the state vector. Let e = φ(A, e−1) denote the employment policy function and h = H(A, e−1) denote the hours policy function. These functions jointly satisfy (9.3) and (9.5). As a benchmark, suppose there were no adjustment costs, η ≡ 0, and the com- pensation function is given by: ω(e, h, A) = eω̃(h). Here compensation per worker depends only on hours worked. Further, suppose that revenues depend on the product eh so that only total hours matters for the production process. Specially, R(A, e, h) = AR̃(eh) (9.6) with R̃(eh) strictly increasing and strictly concave. In this special case, the two first-order conditions can be manipulated to imply 1 = h ω̃′(h) ω̃(h) . So, in the absence of adjustment costs and with the functional forms given above, hours are independent of either e or A. Consequently, all variations in the labor input arise from variations in the number of workers rather than hours. This is efficient given that the marginal cost of hours is increasing in the number of hours 234 worked while there are no adjustment costs associated with varying the number of workers. At another extreme, suppose there are adjustment costs (η �= 0). Further, sup- pose that compensation is simply ω(e, h, A) = eh so there are no costs to hours variation. In this case, (9.3) implies AR̃′(eh) = 1. Using this, (9.5) is clearly satisfied at a constant level of e. Hence, the variation in the labor input would be only in terms of hours and we would never observe employment variations. Of course, in the presence of adjustment costs and a strictly convex (in h) com- pensation function, the plant/firm will optimally balance the costs of adjustment hours against those of adjusting the labor force. 
This is empirically relevant since in the data both employment and hours variation are observed. Note though that it is only adjustment in the number of workers that contains a dynamic element. The dynamic in hours is derived from the dynamic adjustment of employees.138 It is this tradeoff between hours and worker adjustment that lies at the heart of the optimization problem. Given functional forms, these first-order conditions can be used in an estimation routine which exploits the implied orthogonality conditions. Alternatively, a value function iteration routine can be used to approximate the solution to (9.1) using (9.2). We consider below some specifications. A Simulated Example Here we follow Cooper and Willis (2001) and study the policy functions gener- ated by a quadratic adjustment cost model with some particular functional form assumptions.139 Suppose output is a Cobb-Douglas function of total labor input 235 (eh) and capital and assume the firm has market power as a seller. In this case, consider: R(A, e, h) = A(eh)α (9.7) where α reflects labor’s share in the production function as well as the elasticity of the demand curve faced by the firm. Further, impose a compensation schedule that follows Bils (1987): ω(e, h) = w ∗ e ∗ [ w0 + h + w1 (h − 40) + w2 (h − 40)2 ] (9.8) where w is the straight-time wage. Instead of working with (9.5), Cooper and Willis (2001) solve the dynamic pro- gramming problem, (9.1), with the above functional forms, using value function iteration. The functional equation for the problem is: V (A, e−1) = max h,e A(eh)α − ω(e, h) − η 2 (e − e−1)2 e−1 + βEA′|AV (A ′, e) (9.9) for all (A, e−1). In this analysis, decisions are assumed to be made at the plant level. Accordingly, the profitability shocks are assumed to have two components: a piece that is common across plants (an aggregate shock) and a piece that is plant specific. Both types of shocks are assumed to follow first-order Markov processes. These are embedded in the conditional expectation in (9.9). In this formulation, the adjustment costs are paid on net changes in employment. Further, the adjustment costs depend on the rate of adjustment rather than the absolute change alone.140 The policy function that solves (9.9) is given by e = φ(A, e−1). This policy function can be characterized given a parameterization of (9.9). Cooper and Willis (2001) assume: • Labor’s share is 0.65 and the markup is 25% so that α in (9.7) is .72 . 236 • the compensation function uses the estimates of Bils (1987) and Shapiro (1986): {w0, w1, w2} = {1.5, 0.19, 0.03} and the straight time wage, w, is normalized to 0.05 for convenience. The elasticity of the wage with respect to hours is close to 1 on average • the profitability shocks are represented by a first-order Markov process and are decomposed into aggregate (A) and idiosyncratic components (ε). A ∈ {0.9, 1.1} and ε takes on 15 possible values. The serial correlation for the plant-level shocks is 0.83 and is 0.8 for the aggregate shocks.141 This specification leaves open the parameterization of η in the cost of adjustment function. In the literature, this is a key parameter to estimate. The policy functions computed for two values of A at these parameter choices are depicted in Figure 9.1. Here we have set η = 1 which is at the low end of estimates in the literature. These policy functions have two important characteristics: • φ(A, e−1) is increasing in (e−1). 
• φ(A, e−1) is increasing in A: as profitability increases, so does the marginal gain to adjustment and thus e is higher. [Figure 9.1 approximately here] The quadratic adjustment cost model can be estimated either from plant (firm) data or aggregate data. To illustrate this, we next discuss the approach of Sargent (1978). We then discuss a more general approach to estimation in a model with a richer specification of adjustment costs. Exercise 9.1 Write down the necessary conditions for the optimal choices of hours and em- ployment in (9.9). Provide an interpretation of these conditions. 237 Sargent: Linear Quadratic Specification A leading example of bringing the quadratic adjustment cost model directly to the data is Sargent (1978). In that application, Sargent assumes there are two types of labor input: straight-time and overtime workers. The production function is linear- quadratic in each of the two inputs and the costs of adjustment are quadratic and separable across the types of labor. As the two types of labor inputs do not interact in either the production function or the adjustment cost function, we will focus on the model of straight-time employment in isolation. Following, Sargent assume that revenue from straight-time employment is given by: R(A, e) = (R0 + A)e − (R1/2)e2 (9.10) Here A is a productivity shock and follows an AR(1) process. Sargent does not include hours variation in his model except through the use of overtime labor. Ac- cordingly, there is no direct dependence of the wage bill on hours. Instead he assumes that the wage rate follows an exogenous (with respect to employment) given by: wt = ν0 + i=n∑ i=1 νiwt−i + ζt. (9.11) In principle, the innovation to wages can be correlated with the shocks to revenues.142 With this structure, the firm’s first-order condition with respect to employment is given by: βEtet+1 − et ( R1 η + (1 + β) ) + et−1 = 1 η (wt − R0 − At) (9.12) From this Euler equation, current employment will depend on the lagged level of employment (through the cost of adjustment) and on (expected) future values of the stochastic variables, productivity and wages, as these variables influence the future level of employment. As described by Sargent, the solution to this Euler equation can be obtained so that employment in a given period depends on lagged 238 employment, current and (conditional expectations of) future wages and current and (conditional expectations of) future productivity shocks. Given the driving process for wages and productivity shocks, this conditional expectations can be evaluated so that employment in period t is solely a function of lagged employment, current and past wages. The past wages are relevant for predicting future wages. Sargent estimates the resulting VAR model of wages employment using max- imum likelihood techniques.143 The parameters he estimated included (R1, η, ρ) where ρ is the serial correlation of the productivity shocks. In addition, Sargent estimated the parameters of the wage process. The model is estimated using quarterly data on total US civilian employment. Interestingly, he also decides to use seasonally unadjusted data for some of the estimation, arguing that, in effect, there is no reason to separate the responses to seasonal and nonseasonal variations. The data are detrended to correspond to the stationarity of the model. He finds evidence of adjustment costs insofar as η is significantly different from zero.144 Sargent [pg. 
1041] argues that these results ”..are moderately comforting to the view that the employment-real-wage observations lie along a demand schedule for employment”. Exercise 9.2 There are a number of exercises to consider working from this simple model. 1. Write a program to solve (9.9) for the employment and hours policy functions using value function iteration. What are the properties of these policy functions? How do these functions change as you vary the elasticity of the compensation func- tion and the cost of adjustment parameter? 2. Solve (9.9) using a log-linearization technique. Compare your results with those obtained by the value function iteration approach. 239 3. Consider some moments such as the relative variability of hours and employ- ment and the serial correlations of these two variables. Calculate these moments from a simulated panel and also from a time series constructed from the panel. Look for studies that characterize these moments at the micro and/or aggregate lev- els. Or, better yet, calculate them yourself. Construct an estimation exercise using these moments. 4. Suppose that you wanted to estimate the parameters of (9.9) using GMM. How would you proceed? 9.4 Richer Models of Adjustment In part, the popularity of the quadratic adjustment cost structure reflects it tractabil- ity. But, the implications of these models conflict with evidence of inactivity and bursts at the plant level. Thus researchers have been motivated to consider a richer set of models. Those are studied here and then are used for estimation purposes be- low. For these models of adjustment, we discuss the dynamic optimization problem and present policy functions. 9.4.1 Piecewise Linear Adjustment Costs One of the criticisms of the quadratic adjustment cost specification is the implication of continuous adjustment. At the plant-level, as mentioned earlier, there is evidence that adjustment is much more erratic than the pattern implied by the quadratic model. Piecewise linear adjustment costs can produce inaction. For this case, the cost of adjustment function is: C(e, e−1) = { γ+∆e if ∆e > 0
γ−∆e if ∆e < 0.    (9.13)

The optimal policy rules are then determined by solving (9.1) using this specification of the adjustment cost function. The optimal policy rule will look quite different from the one produced with quadratic adjustment costs. This difference is a consequence of the lack of differentiability in the neighborhood of zero adjustment. Consequently, small adjustments will not occur since the marginal cost of adjustment does not go to zero as the size of the adjustment goes to zero. Further, this specification of adjustment costs implies there is no partial adjustment: since the marginal cost of adjustment is constant, there is no basis for smoothing adjustment. The optimal policy is characterized by two boundaries, e−(A) and e+(A). If e−1 ∈ [e−(A), e+(A)], then there is no adjustment. In the event of adjustment, the optimal adjustment is to e−(A) if e−1 < e−(A) and is to e+(A) if e−1 > e+(A).
    Following Cooper and Willis (2001) and using the same basic parameters as
    described above, we can study the optimal policy function for this type of adjustment
    cost. Assume that γ+ = γ− = .05 which produces inaction at the plant level in 23%
    of the observations. 145 Then (9.1) along with (9.13) can be solved using value
    function iteration and the resulting policy functions evaluated.
    These are shown in Figure 9.2. Note that there is no adjustment for values of
    e−1 in an interval: the employment policy function coincides with the 45 degree
line. Outside of that interval there are two targets: e−(A) and e+(A). Again, this
policy function is indexed by the values of γ+ and γ−, so these parameters
can be estimated by matching the implications of the model against observations
    of employment adjustment at the plant and/or aggregate levels. We will return to
    this point below.
    [Figure 9.2 approximately here]
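For readers who want to reproduce this kind of experiment, the sketch below solves a stripped-down version of (9.1) with the piecewise linear cost (9.13) by value function iteration. The revenue function, wage, grids and two-state Markov chain for A are illustrative assumptions rather than the calibration behind Figure 9.2, hours are suppressed, and the cost of shedding workers is treated as γ−|∆e| so that adjustment in both directions is costly.

```python
import numpy as np

beta, alpha, w = 0.95, 0.6, 0.7            # discount factor, curvature, wage (assumed)
gamma_plus, gamma_minus = 0.05, 0.05       # per-worker hiring / firing costs
A_grid = np.array([0.9, 1.1])              # profitability states (assumed)
P = np.array([[0.8, 0.2], [0.2, 0.8]])     # Markov transition matrix for A (assumed)
e_grid = np.linspace(0.1, 2.0, 200)        # employment grid

def profit(A, e):
    return A * e**alpha - w * e            # flow profit, gross of adjustment costs

def adj_cost(e_new, e_old):
    de = e_new - e_old                     # piecewise linear cost, positive in both directions
    return np.where(de > 0, gamma_plus * de, -gamma_minus * de)

V = np.zeros((len(A_grid), len(e_grid)))
policy = np.zeros_like(V, dtype=int)
for _ in range(2000):
    EV = P @ V                             # E[V(A', e')] for each current A, as a function of e'
    V_new = np.empty_like(V)
    for ia, A in enumerate(A_grid):
        # rows index e_{-1}, columns index the choice e'
        returns = (profit(A, e_grid)[None, :]
                   - adj_cost(e_grid[None, :], e_grid[:, None])
                   + beta * EV[ia][None, :])
        policy[ia] = returns.argmax(axis=1)
        V_new[ia] = returns.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

# The inaction region is where the policy maps e_{-1} into itself
for ia, A in enumerate(A_grid):
    stay = e_grid[policy[ia] == np.arange(len(e_grid))]
    if stay.size:
        print(f"A = {A}: no adjustment for e_-1 in [{stay.min():.2f}, {stay.max():.2f}]")
```

The printed interval for each state is the model's inaction region; widening γ+ or γ− widens it, which is what makes these parameters recoverable from observed adjustment behavior.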
    Exercise 9.3

    Specify the dynamic programming problem for labor adjustment using a piece-wise
    linear adjustment cost structure. What determines the region of inaction? Study this
    model numerically by solving the dynamic programming problem and obtaining policy
    functions.
    9.4.2 Non-Convex Adjustment Costs
    The observations of inactivity at the plant level that motivate the piecewise linear
    specification are also used to motivate consideration of fixed costs in the adjustment
    process. As noted by Hamermesh and Pfann (1996) the annual recruiting activities
    of economics departments provide a familiar example of the role of fixed costs. In
    the US, hiring requires the posting of an advertisement of vacancies, the extensive
    review of material provided by candidates, the travel of a recruiting team to a
    convention site, interviews of leading candidates, university visits and finally a vote
    to select among the candidates. Clearly there are fixed cost components to many of
    these activities that comprise the hiring of new employees. 146
    As a formal model of this, consider:
V(A, e_{-1}) = \max\left[ V^a(A, e_{-1}),\; V^n(A, e_{-1}) \right]    (9.14)

for all (A, e_{-1}), where V^a(A, e_{-1}) represents the value of adjusting employment and V^n(A, e_{-1}) represents the value of not adjusting employment. These are given by

V^a(A, e_{-1}) = \max_{h,e} \; R(A, e, h) - \omega(e, h) - F + \beta E_{A'|A} V(A', e)    (9.15)

V^n(A, e_{-1}) = \max_{h} \; R(A, e_{-1}, h) - \omega(e_{-1}, h) + \beta E_{A'|A} V(A', e_{-1}).    (9.16)
    So, in this specification, the firm can either adjust the number of employees or
    not. These two options are labelled action (V a (A, e−1)) and inaction (V n (A, e−1)).
    In either case, hours are assumed to be freely adjusted and thus will respond to

    variations in profitability even if there is no adjustment in the number of workers.
    Note too that this specification assumes adjustment costs depend on gross changes
    in the number of workers. In this way the model can potentially match the inaction
    in employment adjustment at the plant level defined by zero changes in the number
    of workers.
    The optimal policy has three dimensions. First, there is the choice of whether
    to adjust or not. Let z(A, e−1) ∈ {0, 1} indicate this choice where z(A, e−1) = 1
    if and only if there is adjustment. Second, there is the choice of employment in
    the event of adjustment. Let φ(A, e−1) denote that choice where φ(A, e−1) = e−1 if
    z(A, e−1) = 0. Finally, there is the choice of hours, h(A, e−1), which will reflect the
    decision of the firm whether or not to adjust employment. As these employment
    adjustments depend on (A, e−1) through e = φ(A, e−1), one can always consider
    hours to be a function of the state vector alone.
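As a concrete illustration of these objects, the following sketch computes the adjustment indicator z(A, e−1) and the target φ(A, e−1) for the fixed cost problem (9.14)-(9.16) by value function iteration. The flow profit below stands in for the hours-maximized term R(A, e, h) − ω(e, h), and the functional form, fixed cost F and driving process are assumptions chosen only to keep the example short.

```python
import numpy as np

beta, alpha, w, F = 0.95, 0.6, 0.7, 0.05   # all values assumed for illustration
A_grid = np.array([0.9, 1.1])
P = np.array([[0.8, 0.2], [0.2, 0.8]])
e_grid = np.linspace(0.1, 2.0, 200)

def flow(A, e):
    # reduced-form flow profit; stands in for the max over hours of R(A,e,h) - w(e,h)
    return A * e**alpha - w * e

V = np.zeros((len(A_grid), len(e_grid)))
for _ in range(2000):
    EV = P @ V                                               # E[V(A', e)] given current A
    # V^a: pay F and choose e freely; the optimal e does not depend on e_{-1}
    Va = np.array([(flow(A, e_grid) + beta * EV[ia]).max() - F
                   for ia, A in enumerate(A_grid)])
    # V^n: keep e_{-1}; hours (suppressed here) would still respond to A
    Vn = np.array([flow(A, e_grid) + beta * EV[ia]
                   for ia, A in enumerate(A_grid)])
    V_new = np.maximum(Va[:, None], Vn)                      # equation (9.14)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

z = (Va[:, None] > Vn).astype(int)                           # z(A, e_{-1}): adjust or not
phi = e_grid[[(flow(A, e_grid) + beta * EV[ia]).argmax() for ia, A in enumerate(A_grid)]]
print("target employment phi(A) conditional on adjusting:", phi.round(2))
print("fraction of (A, e_{-1}) pairs with inaction:", 1 - z.mean())
```

With a larger F the inaction set widens, which is the sense in which the fixed cost governs the frequency of adjustment.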
    There are some rich trade-offs between hours and employment variations imbed-
    ded in this model. Suppose that there is a positive shock to profitability: A rises. If
    this variation is large and permanent, then the optimal response of the firm will be
    to adjust employment. Hours will vary only slightly. If the shock to profitability is
    not large or permanent enough to trigger adjustment, then by definition employment
    will remain fixed. In that case, the main variation will be in worker hours.
    These variations in hours and employment are shown in Figure 9.3. The policy
functions underlying this figure were created using baseline parameters with fixed
    costs at .1 of the steady state profits.147
    [Figure 9.3 approximately here]
    Exercise 9.4
    Specify the dynamic programming problem for labor adjustment using a non-
    convex adjustment cost structure. What determines the frequency of inaction? What

    comovement of hours and employment is predicted by the model? What features
    of the policy functions distinguish this model from the one with piece-wise linear
    adjustment costs? Study this model numerically by solving the dynamic programming
    problem and obtaining policy functions.
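A natural companion to this exercise is the simulation step: once policy functions of the trigger-target form have been computed, a panel can be simulated and the moments stressed in the exercise calculated. The sketch below uses a hand-coded policy whose numbers are hypothetical placeholders, not output of the model, purely to show the bookkeeping involved.

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.8, 0.2], [0.2, 0.8]])      # Markov chain for A (assumed)
target = {0: 0.8, 1: 1.3}                    # e chosen upon adjustment, by A state (hypothetical)
band = 0.25                                  # no adjustment if |e_{-1} - target| <= band (hypothetical)

def simulate(T=200, N=500):
    e = np.full(N, 1.0)
    a = rng.integers(0, 2, size=N)
    de_path = []
    for _ in range(T):
        a = np.array([rng.choice(2, p=P[s]) for s in a])     # draw next A state for each plant
        tgt = np.where(a == 1, target[1], target[0])
        adjust = np.abs(e - tgt) > band                       # extensive margin
        de = np.where(adjust, tgt - e, 0.0)                   # intensive margin: jump to target
        e = e + de
        de_path.append(de)
    return np.array(de_path)                                  # (T, N) employment changes

de = simulate()
print("inaction frequency:", (de == 0).mean())
print("std of plant-level changes:", de.std())
print("serial corr. of aggregate change:",
      np.corrcoef(de.mean(axis=1)[1:], de.mean(axis=1)[:-1])[0, 1])
```

The same panel can be aggregated into a time series, so both the micro and aggregate moments mentioned in the exercise come from one simulation.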
    9.4.3 Asymmetries
    As discussed in Hamermesh and Pfann (1996), there is certainly evidence in favor of
    asymmetries in the adjustment costs. For example, there may be a cost of advertising
    and evaluation that is proportional to the number of workers hired but no costs of
    firing workers. Alternatively, it may be of interest to evaluate the effects of firing
    costs on hiring policies as discussed in the context of some European economies.
    It is relatively straightforward to introduce asymmetries into the model. Given
    the approach to obtaining policy functions by solving (9.1) through a value function
    iteration routine, asymmetries do not present any additional difficulties. As with
the other parameterizations of adjustment costs, these models can be estimated using
    a variety of techniques. Pfann and Palm (1993) provide a nice example of this
    approach. They specify an adjustment cost function of:
C(e, e_{-1}) = -1 + e^{\gamma \Delta e} - \gamma \Delta e + \frac{1}{2}\eta(\Delta e)^2.    (9.17)
    where ∆e ≡ (e − e−1). If γ ≡ 0, then this reduces to (9.2) with q = 0.
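To see the asymmetry at work, a few lines of code suffice; the parameter values below are arbitrary and only illustrate the sign convention discussed next.

```python
import numpy as np

def C(de, gamma, eta=1.0):
    # asymmetric adjustment cost, equation (9.17)
    return -1.0 + np.exp(gamma * de) - gamma * de + 0.5 * eta * de**2

de = 0.5
for gamma in (-2.0, 0.0, 2.0):
    print(f"gamma={gamma:+.0f}:  C(+{de})={C(de, gamma):.3f}   C(-{de})={C(-de, gamma):.3f}")
# With gamma < 0, reductions (firing) are more costly than equal-sized increases;
# with gamma = 0 the function collapses to the symmetric quadratic case.
```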
    As Pfann and Palm (1993) illustrate, the asymmetry in adjustment costs is
    controlled by γ. For example, if γ < 0, then firing costs exceed hiring costs. Using this model of adjustment costs, Pfann and Palm (1993) estimate parame- ters using a GMM approach on data for manufacturing in the Netherlands (quarterly, seasonally unadjusted data, 1971(I)-1984(IV)) and annual data for U.K. manufac- turing. They have data on both production and nonproduction workers and the 244 employment choices are interdependent from the production function. For both countries they find evidence of the standard quadratic adjustment cost model: η is positive and significantly different from zero for both types of workers. Moreover, there is evidence of asymmetry. They report that the costs of firing production workers are lower than the hiring costs. But, the opposite is true for the non-production workers. 9.5 The Gap Approach The work in Caballero and Engel (1993b) and Caballero et al. (1997) pursues an alternative approach to studying dynamic labor adjustment. Instead of solving an explicit dynamic optimization problem, they postulate that labor adjustment will respond to a gap between the actual and desired employment level at a plant. They then test for nonlinearities in this relationship. The theme of creating an employment target to define an employment gap as a proxy for the current state is quite intuitive and powerful. As noted in our dis- cussion of non-convex adjustment costs, when a firm is hit by a profitability shock, a gap naturally emerges between the current level of employment and the level of employment the firm would choose if there were no costs of adjustment. This gap should then be a good proxy for the gains to adjustment. These gains, of course, are then compared to the costs of adjustment which depend on the specification of the adjustment cost function. This section studies some attempts to study the nature of adjustment costs using this approach.148 The power of this approach is the simplification of the dynamic optimization problem as the target level of employment summarizes the current state. However, as we shall see, these gains may be difficult to realize. The problem arises from the fact that the target level of employment and thus the gap is not observable. 245 To understand this approach, it is useful to begin with a discussion of the par- tial adjustment model. We then return to evidence on adjustment costs from this approach. 9.5.1 Partial Adjustment Model Researchers often specify a partial adjustment model in which the firm is assumed to adjust the level of employment to a target.149 The assumed model of labor adjustment would be: et − et−1 = λ(e∗ − et−1). (9.18) So here the change in employment et −et−1 is proportional to the difference between the previous level of employment and a target, e∗, where λ parameterizes how quickly the gap is closed. Where does this partial adjustment structure come from? What does the target represent? Cooper and Willis (2001) consider a dynamic programming problem given by: £(e∗, e−1) = min e (e − e∗)2 2 + κ 2 (e − e−1)2 + βEe∗′|e∗£(e∗′, e). (9.19) where the loss depends on the gap between the current stock of workers (e) and the target (e∗). The target is taken as an exogenous process though in general it reflects the underlying shocks to profitability that are explicit in the optimizing model. In particular, suppose that e∗ follows an AR(1) process with serial correlation of ρ. Further, assume that there are quadratic adjustment costs, parameterized by κ. 
The first-order condition to the optimization problem is: (e − e∗) + κ(e − e−1) − βκE(e′ − e) = 0 (9.20) where the last term was obtained from using (9.19) to solve for ∂£/∂e. Given that the problem is quadratic, it is natural to conjecture a policy function in which 246 the control variable (e) is linearly related to the two elements of the state vector (e∗, e−1). e = λ1e ∗ + λ2e−1. (9.21) Using this conjecture in (9.20) and taking expectations of the future value of e∗ yields: (e − e∗) + κ(e − e−1) − βκ(λ1ρe∗ + (λ2 − 1)e) = 0. (9.22) This can be used to solve for e as a linear function of (e∗, e−1) with coefficients given by: λ1 = 1 + βκλ1ρ 1 + κ − βκ(λ2 − 1) (9.23) and λ2 = κ (1 + κ − βκ (λ2 − 1)) . (9.24) Clearly, if the shocks follow a random walk (ρ = 1), then partial adjustment is optimal (λ1 + λ2 = 1). Otherwise, the optimal policy created by minimization of the quadratic loss is linear but does not dictate partial adjustment. 9.5.2 Measuring the Target and the Gap Taking this type of model directly to the data is problematic as the target e∗ is not observable. In the literature (see, for example, the discussion in Caballero and Engel (1993b)) the target is meant to representation the destination of the adjustment process. There are two representations of the target. One, termed a static target, treats e∗ as the solution of a static optimization problem, as if adjustment costs did not exist. Thus, e∗ solves (9.5) with η ≡ 0 and hours set optimally. A second approach is treats e∗ as the level of employment the firm would choose if there were no adjustment costs for a single period. This is termed the frictionless target. This level of employment solves e = φ(A, e) where φ(A, e−1) is the policy 247 function for employment for the quadratic adjustment cost model. Thus the target is the level of employment where the policy function, contingent on the profitability shock, crosses the 45 degree line, as in Figure 9.1. Following Caballero et al. (1997) define the gap as the difference between desired (e∗i,t) and actual employment levels (in logs): z̃i,t ≡ e∗i,t − ei,t−1. (9.25) Here ei,t−1 is number of workers inherited from the previous period. So z̃i,t measures the gap between the desired and actual levels of employment in period t prior to any adjustments but after any relevant period t random variables are realized as these shocks are embedded in the target and thus the gap. The policy function for the firm is assumed to be:150 ∆ei,t = φ(z̃i,t). (9.26) The key of the empirical work is to estimate the function φ(·). Unfortunately, estimation of (9.26) is not feasible as the target and thus the gap are not observable. So, the basic theory must be augmented with a technique to measure the gap. There are two approaches in the literature corresponding to the two notions of a target level of employment, described earlier. Caballero et al. (1997) pursue the theme of a frictionless target. To implement this, they postulate a second relationship between another (closely related) measure of the gap, (z̃1i,t), and plant specific deviations in hours: z̃1i,t = θ(hi,t − h̄). (9.27) Here z̃1i,t is the gap in period t after adjustments in the level of e have been made: z̃1i,t = z̃i,t − ∆ei,t. The argument in favor of this approach again returns to our discussion of the choice between employment and hours variation in the presence of adjustment costs. 248 In that case we saw that the firm chose between these two forms of increasing output when profitability rose. 
Thus, if hours are measured to be above average, this will reflect a gap between actual and desired workers. If there was no cost of adjustment, the firm would choose to hire more workers. But, in the presence of these costs the firm maintains a positive gap and hours worked are above average. The key to (9.27) is θ. Since the left side of (9.27) is also not observable, the analysis is further amended to generate an estimate of θ. Caballero et al. (1997) estimate θ from: ∆ei,t = α − θ∆hi,t + εi,t. (9.28) where the error term includes unobserved changes in the target level of employment, ∆e∗i,t) as well as measurement error. Caballero et al. (1997) note that the equation may have omitted variable bias as the change in the target may be correlated with changes in hours. From the discussion in Cooper and Willis (2001), this omitted variable bias can be quite important. Once θ is estimated, Caballero et al. (1997) can construct plant specific gap measures using observed hours variations. In principle, the model of employment adjustment using these gap measures can be estimated from plant level data. In- stead, Caballero et al. (1997) focus on the aggregate time series implications of their model. In particular, the growth rate of aggregate employment is given by: ∆Et = ∫ z zΦ(z)ft(z) (9.29) where Φ(z) is the adjustment rate or hazard function characterizing the fraction of the gap that is closed by employment adjustment. From aggregate data, this expression can be used to estimate Φ(z). As discussed in Caballero et al. (1997), if Φ(z) is say a quadratic, then (9.29) can be expanded implying that employment growth will depend on the first and third moments of the cross sectional distribution 249 of gaps. The findings of Caballero et al. (1997) can be summarized as: • using (9.28), θ is estimated at 1.26. • the relationship between the average adjustment rate and the gap is nonlinear. • they find some evidence of inaction in employment adjustment. • aggregate employment growth depends on the second moment of the distribu- tion of employment gaps. In contrast, Caballero and Engel (1993b) do not estimate θ. Instead they cali- brate it from a structural model of static optimization by a firm with market power. In doing so, they are adopting a target that ignores the dynamics of adjustment. From their perspective, the gap is defined using (9.25) where e∗i,t corresponds to the solution of a static optimization problem over both hours and employment with- out any adjustment costs. They argue that this static target will approximate the frictionless target quite well if shocks are random walks. As with Caballero et al. (1997), once the target is determined a measure of the gap can be created. This approach to approximating the dynamic optimization problem is applied extensively because it is so easy to characterize. Further, it is a natural extension of the partial adjustment model. But as argued in Cooper and Willis (2001) the approach may place excessive emphasis on static optimization.151 Caballero and Engel (1993b) estimate their model using aggregate observations on net and gross flows for US manufacturing employment. They find that a quadratic hazard specification fits the aggregate data better than the flat hazard specification. The key point in both of these papers is the rejection of the flat hazard model. Both Caballero et al. 
(1997) and Caballero and Engel (1993b) argue that the estimates of the hazard function from aggregate data imply that the cross sectional distribution "matters" for aggregate dynamics. Put differently, both studies reject a flat hazard specification in which a constant fraction of the gap is closed each period.

Given that this evidence is obtained from time series, this implies that the non-convexities at the plant level have aggregate implications. This is an important finding in terms of the way macroeconomists build models of labor adjustment. To the extent that the flat hazard model is the outcome of a quadratic adjustment cost model, both papers reject that specification in favor of a model that generates some nonlinearities in the adjustment process. But, as these papers do not consider explicit models of adjustment, one cannot infer from these results anything about the underlying adjustment cost structure.

Further, as argued by Cooper and Willis (2001), the methodology of these studies may itself induce the nonlinear relationship between employment adjustment and the gap. Cooper and Willis (2001) construct a model economy with quadratic adjustment costs. They assume that shocks follow a first-order Markov process, with serial correlation less than unity.152 They find that, using either the Caballero et al. (1997) or Caballero and Engel (1993b) measurements of the gap, the cross sectional distribution of employment gaps may be significant in a time series regression of employment growth.

9.6 Estimation of a Rich Model of Adjustment Costs

Thus far we have discussed some evidence associated with the quadratic adjustment cost models and provided some insights into the optimal policy functions from more complex adjustment cost models. In this section we go a step further and discuss attempts to evaluate models that may have both convex and non-convex adjustment costs.

As with other dynamic optimization problems studied in this book, there is, of course, a direct way to estimate the parameters of labor adjustment costs. This requires the specification of a model of adjustment that nests the variety of special cases described above along with a technique to estimate the parameters. In this subsection, we outline this approach.153

Letting A represent the profitability of a production unit (e.g. a plant), we consider the following dynamic programming problem:

V(A, e_{-1}) = \max_{h,e} \; R(A, e, h) - \omega(e, h, A) - C(A, e_{-1}, e) + \beta E_{A'|A} V(A', e).    (9.30)

As above, let

R(A, e, h) = A(eh)^{\alpha}    (9.31)

where the parameter α is again determined by the shares of capital and labor in the production function as well as the elasticity of demand.

The function ω(e, h, A) represents total compensation to workers as a function of the number of workers and their average hours. As before, this compensation function could be taken from other studies, or perhaps a constant elasticity formulation might be adequate: w = w_0 + w_1 h^{\zeta}.

The cost of adjustment function nests quadratic and non-convex costs of changing employment:

C(A, e_{-1}, e) = \begin{cases} F^H + \frac{\nu}{2}\left(\frac{e - e_{-1}}{e_{-1}}\right)^2 e_{-1}, & \text{if } e > e_{-1} \\ F^F + \frac{\nu}{2}\left(\frac{e - e_{-1}}{e_{-1}}\right)^2 e_{-1}, & \text{if } e < e_{-1} \end{cases}    (9.32)

where F^H and F^F represent the respective fixed costs of hiring and firing workers. Note that the quadratic adjustment costs are based upon net, not gross, hires. In (9.32), ν parameterizes the level of the adjustment cost function.

This adjustment cost function yields the following dynamic optimization problem:

V(A, e_{-1}) = \max\left\{ V^H(A, e_{-1}),\; V^F(A, e_{-1}),\; V^N(A, e_{-1}) \right\}    (9.33)

for all (A, e_{-1}), where N refers to the choice of no adjustment of employment. These options are defined by:

V^H(A, e_{-1}) = \max_{h,e} \; R(A, e, h) - \omega(e, h, A) - F^H - \frac{\nu}{2}\left(\frac{e - e_{-1}}{e_{-1}}\right)^2 e_{-1} + \beta E_{A'|A} V(A', e) \quad \text{if } e > e_{-1}
V^F(A, e_{-1}) = \max_{h,e} \; R(A, e, h) - \omega(e, h, A) - F^F - \frac{\nu}{2}\left(\frac{e - e_{-1}}{e_{-1}}\right)^2 e_{-1} + \beta E_{A'|A} V(A', e) \quad \text{if } e < e_{-1}

V^N(A, e_{-1}) = \max_{h} \; R(A, e_{-1}, h) - \omega(e_{-1}, h, A) + \beta E_{A'|A} V(A', e_{-1})

This problem looks formidable. It contains both an extensive (adjustment or no adjustment) as well as an intensive (the choice of e, given adjustment) margin. Further, there is no simple Euler equation to study given the non-convex adjustment costs.154 But, given the methodology of this book, attacking a problem like this is feasible. In fact, one could build additional features into this model, such as allowing for a piecewise linear adjustment cost structure.155

From our previous discussion, we know that "solving" a model with this complexity is relatively straightforward. Let Θ represent the vector of parameters necessary to solve the model.156 Then, for a given value of this vector, a value function iteration procedure will generate a solution to (9.30).

Once a solution to the functional equation is obtained, policy functions can be easily created. Figure 9.4 presents a policy function for the case of η = 1 and F^F = F^H = .01.

[Figure 9.4 approximately here]

One can obtain correlations from a simulated panel. For this parameterization, some moments of interest are: corr(e, A) = .856, corr(h, A) = .839 and corr(e, h) = .461. Clearly, employment and hours adjustment are both positively related to the shock. Further, we find that the correlation of hours and employment is positive, indicating that the adjustment towards a target, in which the correlation is negative, is offset by the joint response of these variables to a shock.

Computation of these moments for a given Θ opens the door to estimation. If these moments can be computed for a given Θ, then:

• it is easy to compute other moments (including regression coefficients)

• it is easy to find a value of Θ to bring the actual and simulated moments close together

The techniques of this book are then easily applied to a study of labor market dynamics using either panel data or time series.157 Of course, this exercise may be even more interesting using data from countries other than the US which, through institutional constraints, have richer adjustment costs.

9.7 Conclusions

The point of this chapter has been to explore the dynamics of labor adjustment. In the presence of adjustment costs, the conventional model of static labor demand is replaced by a possibly complex dynamic optimization problem. Solving these problems and estimating parameters using either plant-level or aggregate observations is certainly feasible using the techniques developed in this book.

In terms of policy implications, governments often impose restrictions on employment and hours. The dynamic optimization framework facilitates the analysis of those interventions.158 Further, these policies (such as restrictions on hours and/or the introduction of firing costs) may provide an opportunity to infer key structural parameters.159

Chapter 10
Future Developments

10.1 Overview/Motivation

This final section of this book covers an assortment of additional topics. These represent active areas of research which utilize the approach of this book. In some cases, the research is not yet that far along. Examples of this would include ongoing research on the integration of pricing and inventory problems or the joint evolution of capital and labor. In a second category are search models of the labor market, which illustrate the usefulness of empirical work on dynamic programming though generally are not part of a standard course in applied macroeconomics.
Consequently, the presentation is different than other chapters. Here we focus mainly on the statement of coherent dynamic optimization problems and properties of policy functions. To the extent that there are empirical studies, we summarize them. 10.2 Price Setting We begin with a very important problem in macroeconomics, the determination of prices. For this discussion, we do not rely on the Walrasian auctioneer to miracu- 255 256 lously set prices. Instead, we allow firms to set prices and study this interaction in a monopolistic competition setting.160 The natural specification includes a fixed cost of adjusting prices so that the firm optimally chooses between adjusting or not. Hence we term this the state de- pendent pricing model. These have been most recently termed “menu cost” models to highlight the fact that a leading parable of the model is one where a seller finds it costly to literally change the posted price. In fact, this terminology is somewhat unfortunate as it tends to trivialize the problem. Instead, it is best to view these costs as representing a wide range of sources of frictions in the pricing of goods. Besides presenting a basic optimization problem, this section summarizes two empirical exercises. The first reports on an attempt to use indirect inference to estimate the cost of price adjustment for magazine prices. The second is a study of the aggregate implications of state dependent pricing. 10.2.1 Optimization Problem Consider a dynamic optimization problem at the firm level where, by assumption, prices are costly to adjust. The firm has some market power, represented by a downward sloping demand curve. This demand curve may shift around so that the price the firm would set in the absence of adjustment costs is stochastic. The question is: how, in the presence of adjustment costs, do firms behave? Suppose, to be concrete, that product demand comes from the CES specification of utility so that the demand for product i is given by: qdi (p, D, P ) = ( p P )−γ D P (10.1) Here all variables are nominal. The price of product i is p while the general price level is P . Finally, nominal spending, taken to be exogenous and stochastic is denoted 257 D. Given this specification of demand and the realized state, (p, D, P ), the firm’s real profits are: π(p, D, P ) = qdi (p, D, P ) p P − c(qdi (p, D, P )) (10.2) where c(·) is assumed to be a strictly increasing and strictly convex function of output. The dynamic optimization problem of the firm, taking the current values and evolution of (D, P ) as given, is: V (p, D, P, F ) = max{V a(p, D, P, F ), V n(p, D, P, F )} (10.3) for all (p, D, P, F ) where V a(p, D, P, F ) = maxp̃ π(p̃, D, P ) − F + βE(D′,P ′,F ′|D,P,F )V (p̃, D′, P ′, F ′) (10.4) V n(p, D, P, F ) = π(p, D, P ) + βE(D′,P ′,F ′|D,P,F )V (p, D ′, P ′, F ′) (10.5) Here the state vector is (p, D, P, F ). The cost of changing a price is F . It enters the state vector since, in this specification, we allow this adjustment cost to be stochastic.161 The firm has two options. If the firm does not change its price, it enjoys a profit flow, avoids adjustment costs and then, in the next period, has the same nominal price. Of course, if the aggregate price level changes (P �= P ′), then the firm’s relative price will change over time. Note that the cost here is associated with adjustment of the nominal price. Alternatively, the firm can pay the “menu cost” F and adjust its price to p̃. 
This price change becomes effective immediately so that the profit flow given adjustment is π(p̃, D, P ). This price then becomes part of the state vector for the next period. The policy function for this problem will have two components. First, there is a discrete component indicating whether or not price adjustment will take place. 258 Second, conditional on adjustment, there is the policy function characterizing the dependence of p̃ on the state vector (D, P, F ). Interestingly, the choice of p̃ is independent of p once the decision to adjust has been made. There is a very important difference between this optimization problem and most of the others studied in this book. From (10.3), the choice at the individual firm level depends on the choices of other firms, summarized by P . Thus, given the specification of demand, the behavior of a single firm depends on the behavior of other firms.162 This feature opens up a number of alternative ways of solving the model. As a starting point, one might characterize the exogenous evolution of P , per- haps through a regression model, and impose this in the optimization problem of the firm.163 In this case, the individual optimizer is simply using an empirical model of the evolution of P . Using this approach, there is no guarantee that the aggregate evolution of P assumed by the individual agent actually accords with the aggregated behavior of these agents. This suggests a second approach in which this consistency between the beliefs of agents and their aggregate actions is imposed on the model. Essentially this amounts to: • solving (10.3) given a transition equation for P • using the resulting policy functions to solve for the predicted evolution of P • stopping if these functions are essentially the same • iterating if they are not. There is a point of caution here though. For the dynamic programming problem, we can rely on the contraction mapping property to guarantee that the value function iteration process will find the unique solution to the functional equation. We have no 259 such theorem to guide us in the iterative procedure described above. Consequently, finding an equilibrium may be difficult and, further, there is no reason to suspect that the equilibrium is unique.164 10.2.2 Evidence on Magazine Prices Willis (2000a) studies the determination of magazine price adjustment using a data set initially used by Cecchetti (1986). The idea is to use data on the frequency and magnitude of magazine price adjustment to estimate a dynamic menu cost model.165 Willis postulates a theory model similar to that given above. For the empir- ical analysis, he specifies an auxiliary equation in which the probability of price adjustment is assumed to depend on: • the number of years since the last price adjustment • cumulative inflation since the last price adjustment • cumulative growth in industry demand since the last price adjustment • current inflation • current industry demand. This specification is partly chosen as it mimics some of the key elements of the specification in Cecchetti (1986). Further, the cumulative inflation and demand since the last price change are, from the dynamic programming problem, key elements in the incentive to adjust prices. Interestingly, there seems to be little support for any time dependence, given the presence of the proxies for the state variables. Willis estimates this auxiliary model and then uses it, through an indirect infer- ence procedure, to estimate the structural parameters of his model. 
These include: • the curvature of the profit function 260 • the curvature of the cost function • the distribution of menu costs. Willis (2000a) finds that magazine sellers have a significant amount of market power but that production is essentially constant returns to scale. Finally, Willis is able the distinguish the average adjustment cost in the distribution from the average that is actually paid. He finds that the former is about 35% of revenues while the latter is only about 4% of revenues.166 10.2.3 Aggregate Implications A large part of the motivation for studying models with some form of price rigidity reflected the arguments, advanced by macroeconomists, that inflexible prices were a source of aggregate inefficiency. Further, rigidity of prices and/or wages provides a basis for the non-neutrality of money, thus generating a link between the stock of nominal money and real economic activity. But, these arguments rest on the presence of quantitatively relevant rigidities at the level of individual sellers. Can these costs of adjusting prices “explain” observations at both the microeconomic and aggregate levels? One approach to studying these issues is to model the pricing behavior of sellers in a particular industry. This estimated model can then be aggregated to study the effects of, say, money on output. An alternative, more aggregate approach, is to specify and estimate a macroeconomic model with price rigidities. At this point, while the estimation of such a model is not complete, there is some progress. A recent paper by Dotsey et al. (1999) studies the quantitative implications of state dependent pricing for aggregate variables. We summarize those results here. The economy studied by Dotsey et al. (1999) has a number of key elements: 261 • as in Blanchard and Kiyotaki (1987) the model is based upon monopolistic competition between producers of final goods • sellers face a (stochastic) iid fixed cost of adjusting their price (expressed in terms of labor time) • sellers meet all demand forthcoming at their current price • there is an exogenously specified demand for money At the individual level, firms solve a version of (10.3) where the cost of adjustment F is assumed to be iid. Further, heterogeneity across firms is restricted to two dimensions, (F, p). That is, firms may be in different states because they began the period with a different price or because their price adjustment cost for that period is different from that of other firms. There is a very important consequence of this restricted form of heterogeneity: if two firms choose to adjust, they select the same price. Interestingly, Dotsey et al. solve the dynamic optimization problem of a firm by using a first-order condition. This is somewhat surprising as we have not used first- order conditions to characterize the solutions to dynamic discrete choice problems. Consider the choice of a price by a firm conditional on adjustment, as in (10.4). The firm optimally sets the price taking into account the effects on current profits and on the future value. In the price setting model, the price only effects the future value if the firm elects not to adjust in the next period. If the firm adjusts its price in the next period, as in (10.4), then the value of the price at the start of the period is irrelevant. So, there is a first-order condition which weighs the effects of the price on current profits and on future values along the no-adjustment branch of the value function. 
As long as the value function of the firm along this branch is differentiable in p̃, 262 there will be a first-order condition characterizing this optimal choice given by: ∂π(p̃, D, P )/∂p + βG(F ∗)E(D′,P ′,F ′|D,P,F )∂V n(p̃, D′, P ′, F ′)/∂p = 0. (10.6) where G(F ∗) is the state contingent probability of not adjusting in the next period. This is not quite an Euler equation as the derivative of the future value remains in this expression. Dotsey et al. iterate this condition forward and, using a restriction that the firm eventually adjusts, derivatives of the primitive profit function can substitute for ∂V n(p̃, D′, P ′, F ′)/∂p.167 The solution of the optimization problem and the equilibrium analysis relies on a discrete representation of the possible states of the firms. Given a value of p, there will exist a critical adjustment cost such that sellers adjust if and only if the realized value of F is less than this critical level. So, given the state of the system, there is an endogenously determined probability of adjustment for each seller. Dotsey et al. (1999) use this discrete representation, these endogenous probabilities of adjustment and the (common) price charged by sellers who adjust to characterize the equilibrium evolution of their model economy. Details on computing an equilibrium are provided in Dotsey et al. (1999). In terms of the effects of money on output they find: • if the inflation rate is constant at 10% then prices are adjusted at least once every 5 quarter. • comparing different constant inflation rate regimes, the higher the inflation rate, the shorter is the average time to adjustment and the mark-up only increases slightly • an unanticipated, permanent monetary expansion leads to higher prices and higher output at impact and there is some persistence in the output effects. 263 • as the money shocks become less persistent, the price response dampens and consequently the output effect is larger. This discussion of the aggregate implications of monetary shocks in an environ- ment with state dependent prices nicely complements our earlier discussion of the estimation of a state dependent pricing model using micro-data. Clearly, there is an open issue here concerning the estimation of a state dependent pricing model using aggregate data.168 10.3 Optimal Inventory Policy The models we have studied thus far miss an important element of firm behavior, the holding of inventories. This is somewhat ironic as the optimal inventory problem was one of the earlier dynamic optimization problems studied in economics.169 We begin with a traditional model of inventories in which a seller with a convex cost function uses inventories to smooth production when demand is stochastic. We then turn to models which include non-convexities. The section ends with a brief discussion of a model with dynamic choices over prices and inventories. 10.3.1 Inventories and the Production Smoothing Model The basic production smoothing argument for the holding of inventories rests upon the assumption that the marginal cost of production is increasing. In the face of fluctuating demand, the firm would then profit by smoothing production relative to sales. This requires the firm to build inventories in periods of low demand and to liquidate them in periods of high demand. Formally, consider: V (s, I) = maxyr(s) − c(y) + βEs′|sV (s′, I′) (10.7) 264 for all (s, I). Here the state vector is the level of sales s and the stock of inventories at the start of the period, I. 
The level of sales is assumed to be random and outside of the firm’s control. From sales, the firm earns revenues of r(s). The firm chooses its level of production (y) where c(y) is a strictly increasing, strictly convex cost function. Inventories at the start of the next period are given by a transition equation: I′ = R(I + y − s). (10.8) where R is the return on a marginal unit of inventory (which may be less than unity).170 From this problem, a necessary condition for optimality is: c′(y) = βREs′|sc ′(y′) (10.9) where future output is stochastic and will generally depend on the sales realization in the next period. To make clear the idea of production smoothing, suppose that sales follow an iid process: Es′|ss is independent of s. In that case, the right side of (10.9) is independent of the current realization of sales. Hence, since (10.9) must hold for all s, the left side must be constant too. Since production costs are assumed to be strictly convex, this implies that y must be independent of s. Exercise 10.1 Solve (10.7) using a value function iteration routine (or another for comparison purposes). Under what conditions will the variance of production be less than the variance of sales? Despite its appeal, the implications of the production smoothing model contrast sharply with observation. In particular, the model’s prediction that production will be smoother than sales but the data do not exhibit such production smoothing.171 265 One response to this difference between the model’s predictions and observation is to introduce other shocks into the problem to increase the variability of pro- duction. A natural candidate would be variations in productivity or the costs of production. Letting A denote a productivity shock, consider: V (s, I, A) = maxyr(s) − c(y, A) + βEA′,s′|A,sV (s′, I′, A′) (10.10) so that the cost of producing y units is stochastic. In this case, (10.9) becomes: cy(y, A) = βREA′,s′|A,scy(y ′, A′). (10.11) In this case, inventories are used so that goods can be produced during periods of relatively low cost and, in the absence of demand variations, sold smoothly over time.172 Kahn (1987) studies a model with an explicit model of stock-out avoidance. Note that in (10.7), the seller was allowed to hold negative inventories. As discussed in Kahn (1987), some researchers add a nonnegativity constraint to the inventory problem while others are more explicit about a cost of being away from a target level of inventories (such as a fraction of sales). Kahn (1987) finds that even without a strictly convex cost function, the nonnegativity constraint alone can increase the volatility of output relative to sales. Exercise 10.2 Solve (10.10) using a value function iteration routine (or another for comparison purposes). Under what conditions on the variance of the two types of shocks and on the cost function will the variance of production be less than the variance of sales? Supplement the model with a nonnegativity constraint on inventories and/or an explicit target level of investment. Explore the relationship between the variance of sales and the variance of output. 266 Alternatively, researchers have introduced non-convexities into this problem. One approach, as in Cooper and Haltiwanger (1992), is to introduce production bunching due to the fixed costs of a production run. 
For that model, consider a version of (10.7) where the cost of production is given by: c(y) =   0 for y = 0 K + ay for y ∈ (0, Y ] ∞ otherwise (10.12) Here Y represents the total output produced if there is a production run. It repre- sents a capacity constraint on the existing capital. In this case, production is naturally more volatile than sales as the firm has an incentive to have a large production run and then to sell from inventories until the next burst of production.173 Further, the original inventory models that gave rise to the development of the (S,s) literature were based upon a fixed cost of ordering.174 One dynamic stochastic formalization of the models discussed in Arrow et al. (1951) might be: v(x, y) = max{vo(x, y), vn(x, y)} (10.13) where x measures the state of demand and y the inventories on hand at the sales site. The optimizer has two options: to order new goods for inventory (vo) or not (vn). These options are defined as: vo(x, y) = maxqr(s) − c(q) − K + βEx′|xv(x′, (y − s + q)(1 − δ)) (10.14) and vn(x, y) = r(s) + βEx′|xv(x ′, (y − s)(1 − δ). (10.15) Here s is a measure of sales and is given as the maximum of (x, y): demand can only be met from inventories on hand. The function r(s) is simply the revenues earned from selling s units. 267 If the firm orders new inventories, it incurs a fixed cost of K and pays c(q), an increasing and convex function, to obtain q units. In the case of ordering new goods, the inventories next period reflect the sales and the new orders. The rate of inventory depreciation is given by δ. If the firm does not order inventories, then its inventories in the following period are the depreciated level of initial inventories less sales. This is zero is the firm stocks out. This problem, which is similar to the stochastic investment problem with non- convex adjustment costs, can be easily solved numerically. It combines a discrete choice along with a continuous decision given that the firm decides to order new goods. 175 10.3.2 Prices and Inventory Adjustment Thus far we have treated pricing problems and inventory problems separately. So, in the model of costly price adjustment, sellers had no inventories. And, in the inventory models, sales are usually taken as given. Yet, there is good reason to think jointly about pricing decisions and inventories.176 First, one of the motivations for the holding of inventories is to smooth produc- tion relative to sales. But, there is another mechanism for smoothing sales: as its demand fluctuates, the firm (assuming it has some market power) could adjust its price. Yet, if prices are costly to adjust, this may be an expensive mechanism. So, the choices of pricing and inventory policies reflect the efficient response of a profit maximizing firm to variations in demand and/or technology. At one extreme, suppose that the firm can hold inventories and faces a cost of changing its price. In this case, the functional equation for the firm is given by: V (p, I; S, P ) = max{V a(p, I; S, P ), V n(p, I; S, P )} (10.16) 268 where V a(p, I; S, P ) = maxp̃ π(p̃, I; S, P ) − F + βE(S′,P ′|S,P )V (p̃, I′; S′, P ′) (10.17) V n(p, I; S, P ) = π(p, I; S, P ) + βE(S′,P ′|S,P )V (p, I ′; S′, P ′) (10.18) where the transition equation for inventories is again I′ = R(I + y − s). In this optimization problem, p is again the price of the seller and I is the stock of invento- ries. These are both controlled by the firm. 
The other elements in the state vector, S and P , represent a shock to profits and the general price level respectively. The function π(p, I; S, P ) represent the flow of profit when the firm charges a price p, holding inventories I when the demand shock is S and the general price level is P . Here, in contrast to the inventory problems described above, sales are not ex- ogenous. Instead, sales come a stochastic demand function that depends on the firm’s price (p) and the price index (P ). From this, we see that the firm can in- fluence sales by its price adjustment. But, of course, this adjustment is costly so that the firm must balance meeting fluctuating demand through variations in in- ventories, variations in production or through price changes. The optimal pattern of adjustment will presumably depend on the driving process of the shocks, the cost of price adjustment and the curvature of the production cost function (underlying π(p, I; S, P )). Exercise 10.3 A recent literature asserts that technology shocks are negatively correlated with employment in the presence of sticky prices. Use (10.19) to study this issue by interpreting S as a technology shock. At the other extreme, suppose that new goods are delivered infrequently due to the presence of a fixed ordering cost. In that case, the firm will seek other ways 269 to meet fluctuations in demand, such as changing its price. Formally, consider the optimization problem of the seller if there is a fixed cost to ordering and, in contrast to (10.13), prices are endogenous: V (p, I; S, P ) = max{V o(p, I; S, P ), V n(p, I; S, P )} (10.19) where V o(p, I; S, P ) = maxp̃,q π(p̃, I; S, P )−K −c(q)+βE(S′,P ′|S,P )V (p̃, I′; S′, P ′) (10.20) V n(p, I; S, P ) = maxp̃,π(p̃, I; S, P ) + βE(S′,P ′|S,P )V (p̃, I ′; S′, P ′). (10.21) The transition equation for inventories is again I′ = R(I + q − s). Aguirregabiria (1999) studies a model with menu costs and lump-sum costs of ad- justing inventories. This research is partly motivated by the presence of long periods of time in which prices are not adjusted and by observations of sales promotions. Interestingly, the model has predictions for the joint behavior of markups and inventories even if the costs of adjustment are independent. Aguirregabiria (1999) argues that markups will be high when inventories are low. This reflects the effects of stock-outs on the elasticity of sales. Specifically, Aguirregabiria assumes that: s = min(D(p), q + I) (10.22) where as above, s is sales, q is orders of new goods for inventory and I is the stock of inventories. Here D(p) represents demand that depends, among other things, on the current price set by the seller. So, when demand is less than output and the stock of inventories, then sales equal demand and the price elasticity of sales is equal to that of demand. But, when demand exceeds q + I, then the elasticity of sales with respect to price is zero: when the stock-out constraint binds, realized ”demand” is very inelastic. In the model of Aguirregabiria (1999) the firm chooses its price and the level of inventories prior to the realizations of a demand shock so that stock-outs may occur. 270 Aguirregabiria (1999) estimates the model using monthly data on a supermar- ket chain. His initial estimation is of a reduced form model for the choice to adjust prices and/or inventories. In this discrete choice framework he finds an interesting interaction between the adjustments of inventories and prices. 
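The role of the stock-out constraint in (10.22) is easy to visualize numerically. The demand function and numbers below are arbitrary assumptions, not Aguirregabiria's specification; the sketch only shows how the price elasticity of sales collapses to zero once demand exceeds q + I.

```python
import numpy as np

def demand(p, scale=10.0, eps=2.0):        # iso-elastic demand (assumed form)
    return scale * p**(-eps)

def sales(p, q, I):                        # s = min(D(p), q + I), equation (10.22)
    return min(demand(p), q + I)

q, I = 3.0, 1.0
for p in (1.0, 1.3, 1.6, 2.0):
    s = sales(p, q, I)
    binding = demand(p) > q + I
    print(f"p={p:.1f}  D(p)={demand(p):5.2f}  sales={s:4.2f}  stock-out binds: {binding}")
# At low prices demand exceeds q + I, sales are capped at 4.0, and marginal price
# changes leave sales unchanged; once p is high enough, sales track demand again.
```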
The level of invento- ries are significant in the likelihood of price adjustment: large inventories increases the probability of price adjustment. Aguirregabiria (1999) estimates a structural model based upon a dynamic pro- gramming model.177 He finds support for the presence of both types of lump-sum adjustment costs. Moreover, he argues that the costs of increasing a price appear to exceed the cost of price reductions. 10.4 Capital and Labor The grand problem we consider here allows for adjustment costs for both labor and capital.178 Intuitively, many of the stories of adjustment costs for one factor have implications for the adjustment of the other. For example, if part of the adjustment cost for capital requires the shutting down of a plant to install new equipment, then this may also be a good time to train new workers. Moreover, we observe inaction in the adjustment of both labor and capital and bursts as well. So, it seems reasonable to entertain the possibility that both factors are costly to adjust and that the adjustment processes are interdependent. For this more general dynamic factor demand problem, we assume that the dynamic programming problem for a plant is given by: V (A, K, L) = max K′,L′,h Π(A, K, L′, h) − ω(L′, h, K, A) − (10.23) C(A, K, L, K′, L′) + βEA′|AV (A ′, K′, L′). for all (A, K, L). Here the flow of profits, Π(A, K, L′, h), depends on the profitability 271 shock, A, the predetermined capital stock, K,the number of workers, L′, and the hours workers, h. The function ω(L′, h, K, A) represents the total state dependent compensation paid to workers. Finally, the general adjustment cost function is given by C(A, K, L, K′, L′). To allow the model to capture inaction, the adjustment cost function in (10.23) contains convex and non-convex adjustment costs for both labor and capital. Fur- ther, one or both of these components might be interactive. So, for example, there may be a fixed cost of adjusting capital that may ”cover” any adjustments in labor as well. Or, within the convex piece of the adjustment cost function, there may be some interaction between the factors. Writing down and analyzing this dynamic optimization problem is by itself not difficult. There are some computational challenges posed by the larger state space. The key is the estimation of the richer set of parameters. One approach would be to continue in the indirect inference spirit and consider a VAR estimated from plant-level data in, say, hours, employment and capital. As with the single factor models, we might also include some nonlinearities in the specification. We could use the reduced form parameters as the basis for indirect inference of the structural parameters. One of the interesting applications of the estimated model will be policy exper- iments. In particular, the model with both factors will be useful in evaluating the implications of policy which directly influences one factor on the other. So, for ex- ample, we can study how restrictions on worker hours might influence the demand for equipment. Or, how do investment tax credits impact on labor demand? 272 10.5 Technological Complementarities: Equilib- rium Analysis Here we continue discussion of a topic broached in Chapter 5 where we studied the stochastic growth model. There we noted that researchers, starting with Bryant (1983) and Baxter and King (1991), introduced interactions across agents through the production function. 
The model captures, in a tractable way, the theme that high levels of activity by other agents increases the productivity of each agent.179 Let y represent the output at a given firm, Y be aggregate output, k and n the firm’s input of capital and labor respectively. Consider a production function of: y = AkαnφY γY ε−1 (10.24) where A is a productivity shock that is common across producers. Here γ param- eterizes the contemporaneous interaction between producers. If γ is positive, then there is a complementarity at work: as other agents produce more, the productivity of the individual agent increases as well. In addition, this specification allows for a dynamic interaction as well parameterized by ε, where Y−1 is the lagged level of aggregate output. As discussed in Cooper and Johri (1997), this may be interpreted as a dynamic technological complementarity or even a learning by doing effect. This production function can be imbedded into a stochastic growth model. Consider the problem of a representative household with access to a production technology given by (10.24). This is essentially a version of the stochastic growth model with labor but with a different technology. There are two ways to solve this problem. The first is to write the dynamic programming problem, carefully distinguishing between individual and aggregate variables. As in our discussion of the recursive equilibrium concept, a law of motion must be specified for the evolution of the aggregate variables. Given this law of 273 motion, the individual household’s problem is solved and the resulting policy func- tion compared to the one that governs the economy-wide variables. If these policy functions match, then there is an equilibrium. Else, another law of motion for the aggregate variables is specified and the search continues. This is similar to the ap- proach described above for finding the equilibrium in the state dependent pricing model. 180 Alternatively, one can use the first-order conditions for the individual’s optimiza- tion problem. As all agents are identical and all shocks are common, the represen- tative household will accumulate its own capital, supply its own labor and interact with other agents only due to the technological complementarity. In a symmetric equilibrium, yt = Yt. As in Baxter and King (1991), this equilibrium condition is neatly imposed through the first-order conditions when the marginal products of la- bor and capital are calculated. From the set of first-order conditions, the symmetric equilibrium can be analyzed through by approximation around a steady state. The distinguishing feature of this economy from the traditional Real Business Cycle model is the presence of the technological complementarity parameters, γ and �. It is possible to estimate these parameters directly from the production function or to infer them from the equilibrium relationships. 181 10.6 Search Models This is a very large and active area of research in which the structural approach to individual decision making has found fertile ground. This partly reflects the elegance of the search problem at the individual level, the important policy question surrounding the provision of unemployment insurance and the existence of rich data sets on firms and workers. This subsection will only introduce the problem and briefly touch on empirical methodology and results. 274 10.6.1 A Simple Labor Search Model The starting point is a model in the spirit of McCall (1970).182 A prospective worker has a job offer, denoted by ω. 
If this job is accepted, then the worker stays in this job for life and receives a return of u(ω) 1−β . Alternatively, the offer can be rejected. In this case, the worker can receive unemployment benefits of b for a period and then may draw again from the distribution. Assume that the draws from the wage distribution are iid. 183 The Bellman equation for a worker with a wage offer of ω in hand is: v(ω) = max { u(ω) 1 − β , u(b) + βEv(ω ′) } . (10.25) for all ω. The worker either accepts the job, the first option, or rejects it in favor of taking a draw in the next period. Given the assumption of iid draws, the return to another draw, Ev(ω′) is just a constant, denoted κ. It is intuitive to think of this functional equation from the perspective of value function iteration. For a given value of κ, (10.25) implies a function v(ω). Use this to create a new expected value of search and thus a new value for κ. Continue to iterate in this fashion until the process converges.184 Clearly, the gain to accepting the job is increasing in ω while the return associated with rejecting the job and drawing again is independent of ω. Assuming that the lower (upper) support of the wage offer distribution is sufficiently low (high) relative to b, there will exist a critical wage, termed the reservation wage, such that the worker is indifferent between accepting and rejecting the job. The reservation wage, w∗ is determined from: u(w∗) 1 − β = u(b) + βκ (10.26) 275 where κ = Ev(w) = ∫ +∞ −∞ v(w)dF (w) (10.27) = F (w∗) (u(b) + βκ) + ∫ ∞ w∗ u(w) 1 − β dF (w) For wages below the reservation wage, the value v(·) is constant and independent of w as the individual chooses to stay in unemployment. For wages above w∗, the individual accept the offer and gets the utility of the wage for ever. Exercise 10.4 Write a program to solve (10.25) using the approach suggested above. 10.6.2 Estimation of the Labor Search Model There is now a large literature on the estimation of these models. Here we focus on estimating the simple model given above and then discuss other parts of the literature. The theory implies that there exists a reservation wage that depends on the underlying parameters of the search problem: w∗(Θ).185 Suppose that the researcher has data on a set of I individuals over T periods. In particular, suppose that an observation for agent i in period t is zit ∈ {0, 1} where zit = 0 means that the agent is searching and zit = 1 means that the agent has a job. For purposes of discussion, we assume that the model is correct: once an agent has a job, he keeps it forever. Consider then the record for agent i who, say, accepted a job in period k + 1. According to the model, the likelihood of this is F (w∗)k(1 − F (w∗)). (10.28) The likelihood function for this problem is equivalent to the coin flipping exam- ple that we introduced in Chapter 4. There we saw that the likelihood function 276 would provide a way to estimate the probability of ”heads” but would not allow the researcher to identify the parameters that jointly determine this probability. The same point is true for the search problem. Using (10.28) for all agents in the sample, we can represent the likelihood of observing the various durations of search. But, in the end, the likelihood will only depend on the vector Θ through w∗. Wolpin (1987) estimates a version of this search model with a finite horizon and costly search. This implies, among other things, that the reservation wage is not constant as the problem is no longer stationary. 
10.6.2 Estimation of the Labor Search Model

There is now a large literature on the estimation of these models. Here we focus on estimating the simple model given above and then discuss other parts of the literature.

The theory implies that there exists a reservation wage that depends on the underlying parameters of the search problem: w∗(Θ).185 Suppose that the researcher has data on a set of I individuals over T periods. In particular, suppose that an observation for agent i in period t is z_it ∈ {0, 1}, where z_it = 0 means that the agent is searching and z_it = 1 means that the agent has a job. For purposes of discussion, we assume that the model is correct: once an agent has a job, he keeps it forever. Consider then the record for agent i who, say, accepted a job in period k + 1. According to the model, the likelihood of this is

F(w∗)^k (1 − F(w∗)).    (10.28)

The likelihood function for this problem is equivalent to the coin flipping example that we introduced in Chapter 4. There we saw that the likelihood function would provide a way to estimate the probability of “heads” but would not allow the researcher to identify the parameters that jointly determine this probability. The same point is true for the search problem. Using (10.28) for all agents in the sample, we can represent the likelihood of observing the various durations of search. But, in the end, the likelihood will only depend on the vector Θ through w∗.

Wolpin (1987) estimates a version of this search model with a finite horizon and costly search. This implies, among other things, that the reservation wage is not constant, as the problem is no longer stationary. Instead, he argues that the reservation wage falls over time.186 This time variation in the reservation wage is useful for identification since it creates time variation in the acceptance probability for given Θ.

Wolpin (1987) also assumes that agents receive an offer each period with a probability less than one. In order to estimate the model, he specifies a function for the likelihood that an agent receives an offer in a given period. This probability is allowed to depend on the duration of unemployment.

Wolpin uses data on both duration to employment and accepted wages. The addition of wage data is interesting for a couple of reasons. First, the lowest accepted wage yields an upper bound on the reservation wage. Second, the researcher generally observes the accepted wage but not the offered wage. Thus there is an interesting problem of deducing the wage distribution from data on accepted wages.

Wolpin (1987) estimates the model using a panel from the 1979 NLS youth cohort. In doing so, he allows for measurement error in the wage and also specifies a distribution for wage offers. Among other things, he finds that a log-normal distribution of wages fits better than a normal distribution. Further, the estimated hazard function (giving the likelihood of accepting a job after j periods of search) mimics the negative slope of that found in the data.

10.6.3 Extensions

Of course, much has been accomplished in the search literature in recent years. This includes introducing equilibrium aspects to the problem so that the wage distribution is not completely exogenous. Other contributions introduce bargaining and search intensity, such as Eckstein and Wolpin (1995). Postel-Vinay and Robin (2002) develop an equilibrium model where the distribution of wage offers is endogenous to the model and results from heterogeneous workers and firms and from frictions in the matching process. The model is then estimated on French data by maximum likelihood techniques.

The simple model of labor search (10.25) can be extended to include transitions into unemployment, learning-by-doing and experience effects, as well as the effect of unobserved heterogeneity. The model of labor search can also be extended to education choices. The education choices can be a function of an immediate cost of education and the future rewards in terms of increased wages. Eckstein and Wolpin (1999) develop such a model.

Wages and Experience

The model in (10.25) can also be extended to understand why wages are increasing in age. An important part of the labor literature has tried to understand this phenomenon. This increase can come from two sources: either through an increase in productivity, via general experience or possibly seniority within the firm, or through labor market mobility and on-the-job search. Altonji and Shakotko (1987), Topel (1991), Altonji and Williams (1997) and Dustmann and Meghir (2001) explore these issues, although in a non-structural framework.

Distinguishing the effect of experience from seniority is mainly done by comparing individuals with similar experience but with different seniority. However, seniority depends on job-to-job mobility, which is a choice for the agent, possibly influenced by heterogeneity in the return to experience. Hence seniority (and experience) has to be considered as an endogenous variable. It is difficult to find good instruments which can deal with the endogeneity.
Altonji and Shakotko (1987) instrument the seniority variable with deviations from job means, while Dustmann and Meghir (2001) use workers who are fired when the whole plant closes down as an exogenous event. We present a structural model below which can potentially be used to distinguish between the two sources of wage determinants.

The wage is a function of labor market experience X, of seniority in the firm S, of an unobserved fixed component ε, which is possibly individual specific, and of a stochastic individual component η, which is specific to the match between the agent and the firm and is potentially serially correlated. An employed individual earns a wage w(X, S, ε, η). At the end of the period, the agent has a probability δ of being fired. If not, next period the individual receives a job offer represented by a wage w(X′, 0, ε, η̃′). This is compared to a wage within the firm of w(X′, S′, ε, η′). The values of work and of unemployment are defined as:187

V^W(X, S, ε, η) = w(X, S, ε, η) + βδ V^U(X′, ε)
                + β(1 − δ) E_{η′|η, η̃′} max[ V^W(X′, S′, ε, η′), V^W(X′, 0, ε, η̃′) ]    (10.29)

V^U(X, ε) = b(X) + β E_{η′} max[ V^U(X, ε), V^W(X, 0, ε, η′) ]

When employed, labor market experience evolves as X′ = X + 1 and seniority, S, evolves in a similar way. When unemployed, the individual earns an unemployment benefit b(X) and receives at the end of the period a job offer characterized by a wage w(X, 0, ε, η′). The individual then chooses whether to accept the job or to remain for at least an additional period in unemployment.

An important issue is the unobserved heterogeneity in the return to experience. The model captures this with the term ε. Here, the identification of the different sources of wage growth comes from the structural framework and no instruments are needed. This model could be solved numerically using a value function iteration approach and then estimated by maximum likelihood, integrating out the unobserved heterogeneity. This can be done as in Heckman and Singer (1984), allowing for mass point heterogeneity (see for example Eckstein and Wolpin (1999) for an implementation in the context of a structural dynamic programming problem).

Equilibrium Search

Yashiv (2000) specifies and estimates a model of search and matching. The important feature of this exercise is that it accounts for the behavior of both firms and workers. In this model, unemployed workers search for jobs and firms with vacancies search for workers.

Firms have stochastic profit functions and face costs of attracting workers through the posting of vacancies. Workers have the objective of maximizing discounted expected earnings. Workers too face a cost of search and choose their search intensity. These choices yield Euler equations which are used in the GMM estimation.

The key piece of the model is a matching function that brings the search of the workers and the vacancies of the firms together. The matching function has as inputs the vacancies opened by firms and the search intensity of the unemployed workers. Though all agents (firms and workers) take the matching probability as given, this probability is determined by their joint efforts in equilibrium. Empirically, an important component of the analysis is the estimation of the matching function. Yashiv (2000) finds that the matching function exhibits increasing returns, contrary to the assumption made in much of the empirical literature on matching.
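To fix ideas about the matching function, a common parameterization in this literature (not necessarily the specification estimated by Yashiv (2000)) is Cobb-Douglas in aggregate search effort and vacancies,

M = μ (sU)^η V^ν,

where U is the number of unemployed workers, s their search intensity and V the stock of posted vacancies. An unemployed worker then meets a firm with probability M/U and an open vacancy is filled with probability M/V; each agent takes these probabilities as given even though, in equilibrium, they are determined jointly by the search and vacancy-posting decisions. In this parameterization, constant returns corresponds to η + ν = 1, and the increasing returns reported by Yashiv (2000) correspond to η + ν > 1.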
There is a very interesting link between this research and the discussion of dynamic labor demand. While researchers have specified labor adjustment costs, the exact source of these costs is less clear. The analysis in Yashiv (2000) is a step towards bridging this gap: he provides an interpretation of labor adjustment costs in the estimated search model.

10.7 Conclusions

The intention of this book was to describe a research methodology for bringing dynamic optimization problems to the data. In this chapter, we have described some ongoing research programs that utilize this methodology.

Still, there are many avenues for further contributions. In particular, the applications described here have generally been associated with the dynamic optimization problem of a single agent. Of course, this agent may be influenced by relative prices, but these prices have been exogenous to the agent. This does not present a problem as long as we are content to study individual optimization. But, as noted in the motivation of the book, one of the potential gains associated with the estimation of structural parameters is the confidence gained in the examination of alternative policies. In that case, we need to include policy-induced variations in equilibrium variables. That is, we need to go beyond the single-agent problem to study equilibrium behavior. While some progress has been made on these issues, estimation of a dynamic equilibrium model with heterogeneous agents that allows for non-convex adjustment of factors of production and/or prices still lies ahead.188

Related to this point, the models we have studied do not allow any strategic interaction between agents. One might consider the estimation of a structure in which a small set of agents interact in a dynamic game. The natural approach is to compute a Markov-perfect equilibrium and use it as a basis for estimating observed behavior by the agents. Pakes (2000) provides a thorough review of these issues in the context of applications in industrial organization. Again, extensions to macroeconomics lie ahead.

Bibliography

Abel, A. and J. Eberly (1994). “A Unified Model of Investment Under Uncertainty.” American Economic Review, 94, 1369–84.
Abowd, J. and D. Card (1989). “On the Covariance Structure of Earnings and Hours Changes.” Econometrica, 57, 411–445.
Adda, J. and R. Cooper (2000a). “Balladurette and Juppette: A Discrete Analysis of Scrapping Subsidies.” Journal of Political Economy, 108(4), 778–806.
Adda, J. and R. Cooper (2000b). “The Dynamics of Car Sales: A Discrete Choice Approach.” NBER WP No. 7785.
Adda, J., C. Dustmann, C. Meghir, and J.-M. Robin (2002). “Human capital investment and job transitions.” mimeo, University College London.
Adda, J. and J. Eaton (1997). “Borrowing with Unobserved Liquidity Constraints: Structural Estimation with an Application to Sovereign Debt.” mimeo, Boston University.
Aguirregabiria, V. (1997). “Estimation of Dynamic Programming Models with Censored Dependent Variables.” Investigaciones Economicas, 21, 167–208.
Aguirregabiria, V. (1999). “The Dynamics of Markups and Inventories in Retailing Firms.” Review of Economic Studies, 66, 275–308.
Altonji, J. and R. Shakotko (1987). “Do Wages Rise with Job Seniority?” Review of Economic Studies, 54(3), 437–459.
Altonji, J. and Williams (1997). “Do Wages Rise with Job Security?” Review of Economic Studies, 54(179), 437–460.
Altug, S. (1989). “Time to Build and Aggregate Fluctuations: Some New Evidence.” International Economic Review, 30, 889–920.
Amman, H. M., D. A. Kendrick, and J. Rust (1996).
Handbook of Computa- tional Economics, volume 1. Elsevier Science, North-Holland, Amsterdam, New York and Oxford. Arrow, K. J., T. Harris, and J. Marschak (1951). “Optimal Inventory Policy.” Econometrica, 19(3), 250–72. Attanasio, O. (2000). “Consumer Durables and Inertial Behaviour: Estimation and Aggregation of (S, s) Rules for Automobile Purchases.” Review of Economic Studies, 67(4), 667–696. Attanasio, O., J. Banks, C. Meghir, and G. Weber (1999). “Humps and Bumps in Lifetime Consumption.” Journal of Business and Economic Statistics, 17(1), 22–35. Ball, L. and D. Romer (1990). “Real Rigidities and the Non-neutrality of Money.” Review of Economic Studies, 57(2), 183–204. Bar-Ilan, A. and A. Blinder (1988). “The Life-Cycle Permanent-Income Model and Consumer Durables.” Annales d’Economie et de Statistique, (9). Bar-Ilan, A. and A. S. Blinder (1992). “Consumer Durables : Evidence on the Optimality of usually doing Nothing.” Journal of Money, Credit and Banking, 24, 258–272. 284 Baxter, M. (1996). “Are Consumer Durables Important for Business Cycles?” The Review of Economics and Statistics, 77, 147–55. Baxter, M. and R. King (1991). “Production Externalities and Business Cy- cles.” Federal Reserve Bank of Minneapolis, Discussion Paper 53 . Bellman, R. (1957). Dynamic Programming. Princeton University Press. Benassy, J. (1982). The Economics of Market Disequilibrium. NY: Academic Press. Beneveniste, L. and J. Scheinkman (1979). “On the differentiability of the value function in dynamic models of economics.” Econometrica, 47(3), 727–732. Benhabib, J. and R. Farmer (1994). “Indeterminacy and Increasing Returns.” Journal of Economic Theory, 63, 19–41. Bernanke, B. (1984). “Permanent Income, Liquidity and Expenditures on Au- tomobiles: Evidence from Panel Data.” Quarterly Journal of Economics, 99, 587–614. Bernanke, B. (1985). “Adjustment Costs, Durables and Aggregate Consump- tion.” Journal of Monetary Economics, 15, 41–68. Bertola, G. and R. J. Caballero (1990). “Kinked Adjustment Cost and Ag- gregate Dynamics.” In NBER Macroeconomics Annual , edited by O. J. Blanchard and S. Fischer. MIT Press, Cambridge, Mass. Bertsekas, D. (1976). Dynamic Programming and Stochastic Control . Academic Press. Bils, M. (1987). “The Cyclical Behavior of Marginal Cost and Price.” American Economic Review , 77, 838–55. 285 Bils, M. and P. Klenow (2002). “Some Evidence on the Importance of Sticky Prices.” NBER Working Paper No. 9069. Blackwell, D. (1965). “Discounted Dynamic Programming.” Annals of Mathe- matical Statistics, 36, 226–35. Blanchard, O. and N. Kiyotaki (1987). “Monopolistic Competition and the Effects of Aggregate Demand.” American Economic Review , 77, 647–66. Blinder, A. (1986). “Can the Production Smoothing Model of Inventory Behavior be Saved?” Quarterly Journal of Economics, 101, 431–53. Blinder, A. and L. Maccini (1991). “Taking Stock: A Critical Assessment of Recent Research on Inventories.” Journal of Economic Perspectives, 5(1), 73–96. Blundell, R., M. Browning, and C. Meghir (1994). “Consumer Demand and the Life-Cycle Allocation of Household Expenditures.” Review of Economic Studies, 61, 57–80. Braun, R. (1994). “Tax Disturbances and Real Economic Activity in Post-War United States.” Journal of Monetary Economics, 33. Bryant, J. (1983). “A Simple Rational Expectations Keynes-Type Model.” Quar- terly Journal of Economics, 97, 525–29. Caballero, R. (1999). “Aggregate Investment.” In Handbook of Macroeconomics, edited by J. Taylor and M. Woodford. North HOlland. Caballero, R. and E. 
Engel (1993a). “Heterogeneity and Output Fluctuation in a Dynamic Menu-Cost Economy.” Review of Economic Studies, 60, 95–119. Caballero, R. and E. Engel (1993b). “Heterogeneity and Output Fluctuations in a Dynamic Menu-Cost Economy.” Review of Economic Studies, 60, 95–119. 286 Caballero, R., E. Engel, and J. Haltiwanger (1995). “Plant Level Ad- justment and Aggregate Investment Dynamics.” Brookings Paper on Economic Activity, 0(2), 1–39. Caballero, R., E. Engel, and J. Haltiwanger (1997). “Aggregate Employ- ment Dynamics: Building From Microeconomic Evidence.” American Economic Review , 87, 115–37. Caballero, R. J. (1993). “Durable Goods: An Explanation for their Slow Ad- justment.” Journal of Political Economy, 101, 351–384. Campbell, J. and G. Mankiw (1989). “Consumption, Income and Interest Rates : Reinterpreting the Time Series Evidence.” In NBER Macroeconomic An- nual 1989 , edited by Olivier Blanchard and Stanley Fischer, pages 1–50. Chicago University Press. Caplin, A. and J. Leahy (1991). “State Dependent Pricing and the Dynamics of Money and Output.” Quarterly Journal of Economics, 106, 683–708. Caplin, A. and J. Leahy (1997). “Durable Goods Cycles.” mimeo, Boston University. Carroll, C. D. (1992). “The Buffer-Stock Theory of Saving : Some Macroeco- nomic Evidence.” Brookings Papers on Economic Activity, 2, 61–156. Cecchetti, S. (1986). “The Frequency of Price Adjustment: A Study of Newstand Prices of Magazines.” Journal of Econometrics, 31, 255–74. Chirinko, R. (1993). “Business Fixed Investment Spending.” Journal of Economic Literature, 31, 1875–1911. Christiano, L. (1988). “Why Does Inventory Investment Fluctuate So Much?” Journal-of-Monetary-Economics, 21(2), 247–80. 287 Christiano, L. and M. Eichenbaum (1992). “Current Real-Business-Cycle The- ories and Aggregate Labor Market Fluctuations.” American Economic Review , 82, 430–50. Cooper, R. (1999). Coordination Games: Complementarities and Macroeco- nomics. Cambridge University Press. Cooper, R. (2002). “Estimation and Identification of Structural Parameters in the Presence of Multiple Equilibria.” NBER Working Paper No. 8941. Cooper, R. and J. Ejarque (2000). “Financial Intermediation and Aggregate Fluctuations: A Quantitative Analysis.” Macroeconomic Dynamics, 4, 423–447. Cooper, R. and J. Ejarque (2001). “Exhuming Q: Market Power vs. Capital Market Imperfections.” NBER Working Paper . Cooper, R. and J. Haltiwanger (1993). “On the Aggregate Implications of Machine Replacement: Theory and Evidence.” American Economic Review , 83, 360–82. Cooper, R. and J. Haltiwanger. “On the Nature of the Capital Adjustment Process.” NBER Working Paper #7925 (2000). Cooper, R., J. Haltiwanger, and L. Power (1999). “Machine Replacement and the Business Cycle: Lumps and Bumps.” American Economic Review , 89, 921–946. Cooper, R. and A. Johri (1997). “Dynamic Complementarities: A Quantitative Analysis.” Journal of Monetary Economics, 40, 97–119. Cooper, R. and J. Willis. “The Economics of Labor Adjustment: Mind the Gap.” NBER Working Paper # 8527 (2001). 288 Cooper, R. W. and J. C. Haltiwanger (1992). “Macroeconomic implications of production bunching.” Journal Of Monetary Economics, 30(1), 107–27. De Boor, C. (1978). A Practical Guide to Splines. Springer-Verlag, New York. Deaton, A. (1991). “Savings and Liquidity Constraints.” Econometrica, 59, 1221– 1248. Dotsey, M., R. King, and A. Wolman (1999). “State-Dependent Pricing and the General Equilibrium Dynamics of Prices and Output.” Quarterly Journal of Economics, 114, 655–90. Duffie, D. and K. 
Singleton (1993). “Simulated Moment Estimation of Markov Models of Asset Prices.” Econometrica, 61(4), 929–952. Dustmann, C. and C. Meghir (2001). “Wages, experience and seniority.” IFS working paper W01/01. Eberly, J. C. (1994). “Adjustment of Consumers’ Durables Stocks : Evidence from Automobile Purchases.” Journal of Political Economy, 102, 403–436. Eckstein, Z. and K. Wolpin (1989). “The Specification and Estimation of Dynamic Stochastic Discrete Choice Models.” Journal of Human Resources, 24, 562–98. Eckstein, Z. and K. Wolpin (1995). “Duration to First Job and the Return to Schooling: Estimates from a Search-Matching Model.” Review of Economic Studies, 62, 263–286. Eckstein, Z. and K. I. Wolpin (1999). “Why Youths Drop Out of High School: The Impact of Preferences, Opportunities and Abilities.” Econometrica, 67(6), 1295–1339. 289 Eichenbaum, M. (1989). “Some Empirical Evidence on the Production Level and Production Cost Smoothing Models of Inventory Investment.” American Economic Review , 79(4), 853–64. Eichenbaum, M., L. Hansen, and K. Singleton (1988). “A Time Series Anlaysis of Representative Agent Models of Consumption and Leisure Choice Under Uncertainty.” Quarterly Journal of Economics, 103, 51–78. Eichenbaum, M. and L. P. Hansen (1990). “Estimating Models with Intertem- poral Substitution Using Aggregate Time Series Data.” Journal of Business and Economic Statistics, 8, 53–69. Erickson, T. and T. Whited (2000). “Measurement Error and the Relationship Between Investment and Q.” Journal of Policy Economy, 108, 1027–57. Farmer, R. and J. T. Guo (1994). “Real Business Cycles and the Animal Spirits Hypothesis.” Journal of Economic Theory, 63, 42–72. Fermanian, J.-D. and B. Salanié (2001). “A Nonparametric Simulated Maxi- mum Likelihood Estimation Method.” mimeo CREST-INSEE. Fernández-Villaverde, J. and D. Krueger (2001). “Consumption and Sav- ing over the Life Cycle: How Important are Consumer Durables?” mimeo, Stan- ford University. Flavin, M. (1981). “The Adjustment of Consumption to Changing Expectations about future Income.” Journal of Political Economy, 89, 974–1009. Gallant, R. A. and G. Tauchen (1996). “Which Moments to Match?” Econo- metric Theory, 12(4), 657–681. Gilchrist, S. and C. Himmelberg (1995). “Evidence on the role of cash flow for Investment.” Journal of Monetary Economics, 36, 541–72. 290 Gomes, J. (2001). “Financing Investment.” American Economic Review , 91(5), 1263–1285. Gourieroux, C. and A. Monfort (1996). Simulation-Based Econometric Meth- ods. Oxford University Press. Gourieroux, C., A. Monfort, and E. Renault (1993). “Indirect Inference.” Journal of Applied Econometrics, 8, S85–S118. Gourinchas, P.-O. and J. Parker (2001). “Consumption Over the Life Cycle.” Forthcoming, Econometrica. Greenwood, J., Z. Hercowitz, and G. Huffman (1988). “Investment, Ca- pacity Utilization and the Real Business Cycle.” American Economic Review , 78, 402–17. Grossman, S. J. and G. Laroque (1990). “Asset Pricing and Optimal Portfolio Choice in the Presence of Illiquid Durable Consumption Goods.” Econometrica, 58, 25–51. Hajivassiliou, V. A. and P. A. Ruud (1994). “Classical Estimation Methods for LDV Models Using Simulation.” In Handbook of Econometrics, edited by D. McFadden and R. Engle, volume 4, pages 2383–2441. North-Holland, Amsterdam. Hall, G. (1996). “Overtime, Effort and the propagation of business cycle shocks.” Journal of Monetary Economics, 38, 139–60. Hall, G. (2000). 
“Non-convex costs and capital utilization: A study of production scheduling at automobile assembly plants.” Journal of Monetary Economics, 45, 681–716. 291 Hall, G. and J. Rust (2000). “An empirical model of inventory investment by durable commodity intermediaries.” Carnegie-Rochester Conference Series on Public Policy, 52, 171–214. Hall, R. E. (1978). “Stochastic Implications of the Life Cycle- Permanent Income Hypothesis : Theory and Evidence.” Journal of Political Economy, 86, 971–987. Hamermesh, D. (1989). “Labor Demand and the Structure of Adjustment Costs.” American Economic Review , 79, 674–89. Hamermesh, D. (1993). Labor Demand . Princeton University Press. Hamermesh, D. and G. Pfann (1996). “Adjustment Costs in Factor Demand.” Journal of Economic Literature, 34, 1264–92. Hansen, G. (1985). “Indivisible Labor and the Business Cycle.” Journal of Mon- etary Economics, 16, 309–27. Hansen, G. and T. Sargent (1988). “Straight time and overtime in Equilib- rium.” Journal of Monetary Economics, 21, 281–308. Hansen, L. P., E. McGrattan, and T. Sargent (1994). “Mechanics of Form- ing and Estimating Dynamic Linear Economies.” Federal Reserve Bank of Min- neapolis, Staff Report 182 . Hansen, L. P. and K. J. Singleton (1982). “Generalized Instrumental Variables Estimation of Nonlinear Rational Expectations Models.” Econometrica, 50, 1269– 1286. Hayashi, F. (1982). “Tobin’s marginal Q and average Q: A neoclassical interpre- tation.” Econometrica, 50, 215–24. 292 Heckman, J. and B. Singer (1984). “A Method for Minimizing the Impact of Distributional Assumptions in Econometric Models for Duration Data.” Econo- metrica, 52(2), 271–320. House, C. and J. Leahy (2000). “An sS model with Adverse Selection.” NBER WP No. 8030. Hubbard, G. (1994). “Investment under Uncertainty: Keeping One’s Options Open.” Journal of Economic Literature, 32(4), 1816–1831. John, A. and A. Wolman (1999). “Does State-Dependent Pricing Imply Coor- dination Failure?” Federal Reserve Bank of Richmond . Judd, K. (1992). “Projection Methods for Solving Aggregate Growth Models.” Journal of Economic Theory, 58, 410–452. Judd, K. (1996). “Approximation, Perturbation and Projection Methods in Eco- nomic Analysis.” In Handbook of Computational Economics, edited by H. M. Amman, D. A. Kendrick, and J. Rust. Elsevier Science, North-Holland. Judd, K. (1998). Numerical methods in economics. MIT Press, Cambridge and London. Kahn, A. and J. Thomas (2001). “Nonconvex Factor Adjustments in Equilib- rium Business Cycle Models: Do Nonlinearities Matter?” mimeo, University of Minnesota. Kahn, J. (1987). “Inventories and the Volatility of Production.” American Eco- nomic Review , 77(4), 667–79. Keane, M. P. and K. I. Wolpin (1994). “The Solution and Estimation of Discrete Choice Dynamic Programming Models by Simulation and Interpolation: Monte Carlo Evidence.” The Review of Economics and Statistics, pages 648–672. 293 King, R., C. Plosser, and S. Rebelo (1988). “Production, Growth and Busi- ness Cycles I. The Basic Neoclassical Model.” Journal of Monetary Economics, 21, 195–232. Kocherlakota, N., B. F. Ingram, and N. E. Savin (1994). “Explaining Business Cycles: A Multiple Shock Approach.” Journal of Monetary Economics, 34, 415–28. Kydland, F. and E. Prescott (1982). “Time To Build and Aggregate Fluctu- ations.” Econometrica, 50, 1345–70. Laffont, J.-J., H. Ossard, and Q. Vuong (1995). “Econometrics of First- Price Auctions.” Econometrica, 63, 953–980. Lam, P. (1991). 
“Permanent Income, Liquidity and Adjustments of Automobile Stocks: Evidence form Panel Data.” Quarterly Journal of Economics, 106, 203– 230. Laroque, G. and B. Salanié (1989). “Estimation of Multi-Market Fix-Price Models: An Application of Pseudo Maximum Likelihood Methods.” Eca, 57(4), 831–860. Laroque, G. and B. Salanié (1993). “Simulation Based Estimation Models with Lagged Latent Variables.” Journal of Applied Econometrics, 8, S119–S133. Lee, B.-S. and B. F. Ingram (1991). “Simulation Estimation of Time-Series Models.” Journal of Econometrics, 47, 197–205. Lerman, S. and C. Manski (1981). “On the Use of Simulated Frequencies to Approximate Choice Probabilities.” In Structural Analysis of Discrete Data with Econometric Applications, edited by C. Manski and D. McFadden, pages 305–319. MIT Press, Cambridge. 294 Ljungqvist, L. and T. J. Sargent (2000). Recursive Macroeconomic Theory. MIT. MaCurdy, T. E. (1981). “An Empirical Model of Labor Supply in a Life-Cycle Setting.” Journal of Political Economy, 89(6), 1059–1085. Mankiw, G. N. (1982). “Hall’s Consumption Hypothesis and Durable Goods.” Journal of Monetary Economics, 10, 417–425. Manski, C. (1993). “Identification of Endogenous Social Effects: The Reflection Problem,.” Review of Economic Studies, 60(3), 531–42. McCall, J. (1970). “Economics of Information and Job Search.” Quarterly Journal of Economics, 84(1), 113–26. McFadden, D. (1989). “A Method of Simulated Moments for Estimation of Dis- crete Response Models Without Numerical Integration.” Econometrica, 57, 995– 1026. McFadden, D. and P. A. Ruud (1994). “Estimation by Simulation.” The Review of Economics and Statistics, 76(4), 591–608. McGrattan, E. (1994). “The Macroeconomic Effects of Distortionary Taxes.” Journal of Monetary Economics, 33, 573–601. McGrattan, E. R. (1996). “Solving the Stochastic Growth Model with a Finite Element Method.” Journal of Economic Dynamics and Control , 20, 19–42. Meghir, C. and G. Weber (1996). “Intertemporal Non-Separability or Borrow- ing Restrictions ? A disaggregate Analysis Using US CEX Panel.” Econometrica, 64(5), 1151–1181. 295 Miranda, M. J. and P. G. Helmberger (1988). “The Effects of Commodity Price Stabilization Programs.” American Economic Review , 78(1), 46–58. Newey, W. K. and K. D. West (1987). “A Simple, Positive, Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econo- metrica, 55, 703–708. Nickell, S. (1978). “Fixed Costs, Employment and Labour Demand over the Cycle.” Econometrica, 45. Pakes, A. (1994). “Dynamic Structural Models, Problems and Prospects: Mixed Continuous discrete controls and market interactions.” In Advances in Econo- metrics, Sixth World Congress, edited by C. Sims, pages 171–259. Cambridge University Press. Pakes, A. (2000). “A Framework for Applied Dynamic Analysis in I.O.” NBER Paper # 8024. Pakes, A. and D. Pollard (1989). “Simulation and the Asymptotics of Opti- mization Estimators.” Econometrica, 57, 1027–1057. Pfann, G. and F. Palm (1993). “Asymmetric Adjustment Costs in Non-linear Labour Models for the Netherlands and U.K. Manufacturing Sectors.” Review of Economic Studies, 60, 397–412. Postel-Vinay, F. and J.-M. Robin (2002). “Equilibrium Wage Dispersion with Worker and Employer Heterogeneity.” Econometrica. Press, W., B. Flannery, S. Teukolsky, and W. Vetterling (1986). Nu- merical Recipes: The Art of Scientific Computing. Ramey, V. and M. Shapiro (2001). “Displaced Capital.” Journal of Political Economy, 109, 958–92. 296 Reddy, J. (1993). 
An Introduction to the Finite Element Method . McGraw-Hill, New York. Rogerson, R. (1988). “Indivisible Labor, Lotteries and Equilibrium.” Journal of Monetary Economics, 21, 3–16. Rust, J. (1985). “Stationary Equilibrium in a Market for Durable Assets.” Econo- metrica, 53(4), 783–805. Rust, J. (1987). “Optimal Replacement of GMC Bus Engines: an Empirical Model of Harold Zurcher.” Econometrica, 55(5), 999–1033. Rust, J. and C. Phelan (1997). “How Social Security and Medicare Affect Retirement Behavior in a World of Incomplete Markets.” Econometrica, 65(4), 781–832. Sakellaris, P. (2001). “Patterns of Plant Adjustment.” Working Paper #2001- 05, Finance and Economics Discussion Series, Division of Research and Statistics and Monetary Affairs, Federal Reserve Board, Washington D.C. Sargent, T. (1978). “Estimation of Dynamic Labor Demand Schedules under Rational Expectations.” Journal of Political Economy, 86(6), 1009–1044. Sargent, T. (1987). Dynamic Macroeconomic Theory. Harvard University Press. Scarf, H. (1959). “The Optimality of (S,s) Policies in the Dynamic Inventory Problem.” In Mathmematical Methods in Social Sciences, edited by S. K. K. Ar- row and P. Suppes, pages 196–202. Stanford University Press. Shapiro, M. (1986). “The Dynamic Demand for Labor and Capital.” Quarterly Journal of Economics, 101, 513–42. 297 Smith, A. (1993). “Estimating Nonlinear Time-Series Models using Simulated Vector Autoregressions.” Journal of Applied Econometrics, 8, S63–84. Stokey, N. and R. Lucas (1989). Recursive Methods in Economic Dynamics. Harvard University Press. Tauchen, G. (1986). “Finite State Markov-Chain Approximation to Univariate and Vector Autoregressions.” Economics Letters, 20, 177–81. Tauchen, G. (1990). “Solving the Stochastic Growth Model by Using Quadra- ture Methods and Value-Function Iterations.” Journal of Business and Economic Statistics, 8(1), 49–51. Tauchen, G. and R. Hussey (1991). “Quadrature-Based Methods for Obtaining Approximate Solutions to Nonlinear Asset Pricing Models.” Econometrica, 59, 371–396. Taylor, J. B. and H. Uhlig (1990). “Solving Nonlinear Stochastic Growth Models : A Comparison of Alternative Solution Methods.” Journal of Business and Economic Statistics, 8, 1–17. Thomas, J. (2000). “Is Lumpy Investment Relevant for the Business Cycle?” manuscript, Carnegie-Mellon University, forthcoming, JPE. Topel, R. (1991). “Specific Capital, Mobility, and Wages: Wages Rise with Job Seniority.” Journal of Political Economy, 99(1), 145–176. Whited, T. (1998). “Why Do Investment Euler Equations Fail?” Journal of Business and Economic Statistics, 16(4), 479–488. Willis, J. (2000a). “Estimation of Adjustment Costs in a Model of State- Dependent Pricing.” Working Paper RWP 00-07, Federal Reserve Bank of Kansas City. 298 Willis, J. (2000b). “General Equilibrium of Monetary Model with State Dependent Pricing.” mimeo, Boston University. Wolpin, K. (1987). “Estimating a Structural Search Model: The Transition from School to Work.” Econometrica, 55(4), 801–18. Wright, B. D. and J. C. Williams (1984). “The Welfare Effects of the Intro- duction of Storage.” Quarterly Journal of Economics, 99(1), 169–192. Yashiv, E. (2000). “The Determinants of Equilibrium Unemployment.” American Economic Review , 90(5), 1297–1322. Zeldes, S. (1989a). “Optimal Consumption with Stochastic Income : Deviations from Certainty Equivalence.” Quarterly Journal of Economics, 104, 275–298. Zeldes, S. P. (1989b). 
“Consumption and Liquidity Constraints : An Empirical Investigation.” Journal of Political Economy, 97, 305–346. Index adjustment costs, see costs of adjust- ment aggregate implications durable purchases, 192 machine replacement, 221 menu costs, 260 aggregation, 192, 221 asymptotic properties, 85–104 GMM, 87 indirect inference, 102 maximum likelihood, 91 simulated maximum likelihood, 99 simulated method of moments, 95 autocorrelation, 92 autoregressive process, 62 average Q, 206 Balladurette, 195 Bellman equation definition, 19, 31 example, 90, 92, 120, 159, 170, 180, 205, 274 numerical solution, 41, 60 Blackwell’s conditions, 32 borrowing restrictions consumption, 158, 169 investment, 214 cake eating problem dynamic discrete choice, 26 example, 16 finite horizon, 14 infinite horizon, 18 example, 21 infinite horizon with taste shocks, 24 overview, 10–17 calibration stochastic growth model, 135 capital capital accumulation, 200 costs of adjustment, 202 convex, 203 non-convex, 215 quadratic, 205 imperfection in capital markets, 209 299 300 labor adjustment, 270–271 car sales, 196 certainty equivalence, 162 CES (constant elasticity of substitu- tion), 256 Chebyshev polynomial, 49 coin flipping, 67 collocation method, 49 complementarities technology, 272–273 technology and stochastic growth, 142 consumption borrowing restrictions, 158 durables, see durable consumption endogenous labor supply, 167 evidence, 162, 165, 173 GMM estimation, 165 infinite horizon, 159–163 life cycle, 173–177 portfolio choice, 156, 163 random walk, 162 smoothing, 149–154 stochastic income, 160 two period model, 150–159 contraction mapping, 32 control space, 42 control variable, 19 control vector, 29 convergence rate, 44 convex adjustment costs durables, 183 investment, 203 costs of adjustment capital, 202 convex, 203 non-convex, 215 quadratic, 205 capital and labor, 270–271 durables non convex, 184 quadratic, 183 employment asymmetric adjustment costs, 243 convex and non-convex adjust- ment costs , 250 non-convex, 241 piece-wise linear, 239 quadratic, 232 CRRA (constant relative risk aversion), 41, 161 decentralisation, 128 demand (and supply), 79 301 depreciation rate, 138, 179, 182 discount factor, 10, 150 discounting, 30, 33 discrete cake eating problem estimation, 90, 92, 94, 97, 101 numerical implementation, 51 overview, 26–28 distortionary taxes, 147 durable consumption, 178–198 dynamic discrete choice, 189 estimation dynamic discrete choice, 193 with quadratic utility, 182 irreversibility, 187 non convex costs, 184–198 PIH, 179–184 scrapping subsidies, 195–198 duration search, 276 duration model, 101 dynamic discrete choice durables, 189 dynamic labor demand, 229–254 estimation, 250 linear quadratic specification, 237 non-convex adjustment costs, 241 partial adjustment, 245 piecewise linear adjustment costs, 239 quadratic adjustment costs, 232 dynamic programming theory Blackwell’s sufficient conditions, 32 cake eating example, 10, 16 control vector, 29 discounting, 32 finite horizon cake eating problem, 14 general formulation, 28 infinite horizon cake eating prob- lem, 18 monotonicity, 32 optimal stopping problem, 26 overview, 7 state vector, 29 stochastic models, 35 transition equation, 29 value function, 13 value function iteration, 33 education choice, 277 efficient method of moments, 102 elasticity demand curve, 235 intertemporal, 166, 176 302 labor supply, 169 employment adjustment, 229–253 asymmetric adjustment costs, 243 convex and non-convex adjustment costs , 250 gap approach , 244 general functional 
equation, 230 non-convex adjustment costs, 241 partial adjustment model, 245 piece-wise linear adjustment costs, 239 quadratic adjustment costs, 232 quadratic costs of adjustment simulated example, 234 Sargent’s linear quadratic model, 237 endogenous labor supply consumption, 167 growth, 130 equilibrium analysis, 272 equilibrium search, 279 ergodicity, 81, 83 Euler equation consumption, 152, 174 consumption and borrowing con- straints, 170 consumption and portfolio choice, 164 durables, 180 employment adjustment, 237 estimation, 87, 212 finite horizon cake eating problem, 11 investment, 203, 204 non-stochastic growth model, 111 projection methods, 46–51 stochastic growth model, 125 experience, return to, 277 exponential model, 101 finite element method, 50 functional equation, 19 Galerkin method, 49 gap approach employment adjustment, 244 Gauss-Legendre quadrature, 61 generalized method of moments example capital and quadratic adjustment costs, 212 consumption, 87, 165, 172 employment adjustment, 243 stochastic growth model, 137 orthogonality restriction, 86 303 theory, 70–72, 86–89 government spending, 147 heterogeneity, 261 Howard’s improvement algorithm, 45 identification, 76, 84–85 imperfection, in capital market, 158, 169, 209, 214 inaction, 215, 239 income specification, 173 indirect inference example cake eating problem, 101 dynamic capital demand, 224 Q theory of investment, 210 stochastic growth model, 139 supply and demand, 83 specification test, 104 theory, 74–76, 100–104 indirect utility, 7 infinite horizon consumption model, 159–163 information matrix, 92 instrumental variable, 80, 89 integration methods, 60–64 quadrature methods, 61 interpolation methods, 58–60 least squares, 58 linear, 59 splines, 60 intertemporal elasticity of substitution, 166, 176 inventory policy, 263–270 prices, 267 production smoothing, 263 investment borrowing restrictions, 214 convex adjustment costs, 203 convex and non-convex adjustment costs, 224 costs of adjustment, 202 Euler equation with no adjustment costs, 203 functional equation, 201 general formulation, 200 GMM estimation of quadratic ad- justment cost model, 212 irreversibility, 223 machine replacement problem, 219 aggregate implications, 221 maximum likelihood estimation, 227 no adjustment costs, 202 non-convex adjustment costs, 215 Q theory, 205 304 evidence, 207 irreversibility durables, 185, 187–188 investment, 223 IV, see instrumental variable job mobility, 278 job offer, 276 Juppette, see Balladurette labor market experience, 277–280 mobility, 277–280 search, 273–280 transitions, 277–280 wage, 277–280 labor supply endogenous, 130, 167–169 least squares interpolation, 58 life cycle consumption, 173–177 likelihood, 68, 82, 90 linear interpolation, 59 linear quadratic model of labor demand, 237 linearization, 122 logit model, 75 machine replacement aggregate implications, 221 model, 219 magazine prices, 259 mapping, 32 marginal q, 204, 206 market power, 209, 235, 249, 256, 260 markov chain, 69 as approximation, 62 example, 197, 210, 235 simulation, 64 maximum likelihood, 68–70, 82–83, 90– 92 asymptotic properties, 91 example coin flipping, 69 discrete cake eating problem, 90 employment adjustment, 238 investment, 227 stochastic growth model, 141 supply and demand, 82 simulated, see simulated maximum likelihood menu costs, 255–263 aggregate implications, 260 evidence, 259 model, 256 method of moments, 70, 80, 86 orthogonality condition, 81 305 misspecification, 172, 209 mobility, 277 moment calibration, 93 moments stochastic growth model, 
135 monotonicity, 32 multiple sector model, 144 Newey-West estimator, 89 non-convex adjustment costs durables, 184 employment, 241 investment, 215 non-stochastic growth model, 109 Euler equation, 111 example, 112 matlab code, 114 preferences, 110 technology, 110 value function, 110 numerical integration, 60–64 quadrature methods, 61 optimal stopping problem, 26, 192, 220 optimal weighting matrix, 88, 95 orthogonality restriction, 81, 86 overidentification test, 89 partial adjustment model employment , 245 permanent income hypothesis durables, 179–184 permanent vs. transitory shocks, 132, 156, 162, 173, 242 PIH, see permanent income hypothesis planner’s problem, 118 policy evaluation, 195–198 policy function consumption, 171 definition, 21 policy function iterations, 45 policy rule, 47 portfolio choice, 156, 163 and durables, 187 price setting, 255–263 principle of optimality, 14, 15 production smoothing, 263 projection methods, 46 Q theory evidence, 207 model, 205 quadratic adjustment costs durables, 183 employment, 232 quadrature methods, 61 306 random walk in consumption durables, 181 non durables, 161, 162 rate of convergence, 44 recursive equilibrium, 129 reduced form, 80 reservation wage, 274 return to experience, 277 tenure, 277 sales of new cars, 196 score function, 91, 102 scrapping subsidies, 195 search model, 273–280 duration, 276 seniority, 277 sequence problem, 10 finite horizon, 10 serial correlation, 92 simulated maximum likelihood, 73 theory, 72, 98–100 simulated method of moments asymptotic properties, 95 efficient method of moments, 102 example cake eating problem, 94 consumption, 176 durables, 193–194 theory, 73, 94–96 simulated non linear least squares example cake eating problem, 97 durables, 193–194 theory, 96–98 simulation methods, 65 solution methods linearization, 122 projection methods, 46–51 value function iteration, 41–45, 52– 54, 125 specification test GMM, 89 indirect inference, 104 spline interpolation, 60 [s,S] models, 185, 187 state space, 42, 52, 115 large, 54 state variable, 19 state vector, 29 stationarity, 19, 30 stochastic cake eating problem projection methods approach, 46 value function approach, 40 307 stochastic growth model, 109 calibration, 135 confronting the data, 134–142 calibration, 135 GMM, 137 indirect inference, 139 maximum likelihood, 141 decentralization, 128 endogenous labor supply, 130 example, 125 functional equation, 120 GMM, 137 indirect inference, 139 intermediation shocks, 139 investment shocks, 139 linearization, 122 multiple sectors, 144 overview, 117 taste shocks, 146 technological complementarities, 142 technology, 119 value function iteration, 125, 133 stochastic income, 154, 160 stochastic returns, 163 supply and demand, 79 taste shock aggregate, 189 cake eating problem, 24, 51, 52 durables, 189 in estimation, 90 stochastic growth model, 141, 146 tax credits, 199 taxes, 147, 153, 198 technological complementarities, 272, see complementarities tenure, return to, 277 transition equation, 19, 29 transition matrix, 25 transversality condition, 111 uncertainty consumption/saving choice, 154–156, 160–163 unobserved heterogeneity, 187, 261, 278, 279 utility quadratic, 182 utility function adjustment costs, 183 CRRA, 41, 161 quadratic, 183 value function implementation, 41, 52 308 value function iteration, 33 example, 192–193 non-stochastic growth model, 114 stochastic growth model, 125 VAR, 191 wage offer, 277 weighting matrix, 86, 88 309 Notes 1This exercise is described in some detail in the chapter on consumer durables in this 
book. 2Some of the tools for numerical analysis are also covered in Ljungqvist and Sargent (2000) and Judd (1996). 3Assume that there are J commodities in this economy. This presentation as- sumes that you understand the conditions under which this optimization problem has a solution and when that solution can be characterized by first-order conditions. 4For a very complete treatment of the finite horizon problem with uncertainty, see Bertsekas (1976). 5Throughout, the notation {xt}T1 is used to define the sequence (x1, x2, ....xT ) for some variable x. 6This comes from the Weierstrass theorem. See Bertsekas (1976), Appendix B, or Stokey and Lucas (1989), Chpt. 3, for a discussion. 7By the sequence approach, we mean solving the problem using the direct ap- proach outlined in the previous section. 8As you may already know, stationarity is vital in econometrics as well. Thus making assumptions of stationarity in economic theory have a natural counterpart in empirical studies. In some cases, we will have to modify optimization problems to ensure stationarity. 9To be careful, here we are adding shocks that take values in a finite and thus countable set. See the discussion in Bertsekas (1976), Section 2.1, for an introduction 310 to the complexities of the problem with more general statements of uncertainty. 10For more details on markov chains we refer the reader to Ljungqvist and Sargent (2000). 11The evolution can also depend on the control of the previous period. Note too that by appropriate rewriting of the state space, richer specifications of uncertainty can be encompassed. 12This is a point that we return to below in our discussion of the capital accumu- lation problem. 13Throughout we denote the conditional expectation of ε′ given ε as Eε′|ε. 14Eckstein and Wolpin (1989) provide an extensive discussions of the formulation and estimation of these problems in the context of labor applications. 15In the following chapter on the numerical approach to dynamic programming, we study this case in considerable detail. 16This section is intended to be self-contained and thus repeats some of the ma- terial from the earlier examples. Our presentation is by design not as formal as say that provided in Bertsekas (1976) or Stokey and Lucas (1989). The reader interested in more mathematical rigor is urged to review those texts and their many references. 17Ensuring that the problem is bounded is an issue in some economic applications, such as the growth model. Often these problems are dealt with by bounding the sets C and S. 18Essentially, this formulation inverts the transition equation and substitutes for c in the objective function. This substitution is reflected in the alternative notation for the return function. 311 19Some of the applications explored in this book will not exactly fit these con- ditions either. In those cases, we will alert the reader and discuss the conditions under which there exists a solution to the functional equation. 20The notation dates back at least to Bertsekas (1976). 21See Stokey and Lucas (1989) for a statement and proof of this theorem. 22Define σ(s, s′) as concave if σ(λ(s1, s′1) + (1 − λ)(s2, s′2)) ≥ λσ(s1, s′1) + (1 − λ)σ(s2, s ′ 2) for all 0 < λ < 1 where the inequality is strict if s1 �= s2. 23As noted earlier, this structure is stronger than necessary but accords with the approach we will take in our empirical implementation. The results reported in Bertsekas (1976) require that Ψ is countable. 
24We present additional code for this approach in the context of the nonstochastic growth model presented in Chapter 5. 25In some application, it can be useful to define a grid which is not uniformally spaced, see the discrete cake eating problem in section 3.3. 26Popular orthogonal bases are Chebyshev, Legendre or Hermite polynomials. 27The polynomials are also defined recursively by pi(X) = 2Xpi−1(X) − pi−2(X), i ≥ 2, with p0(0) = 1 and p(X, 1) = X. 28This is in fact the structure of a probit model. 29 This is not I since we have the restriction ∑ i Pi = 1. 30If we also want to estimate σD, σS and ρSD, we can include additional moments such as E(p), E(q), V (p), V (q) or cov(p, q). 312 31The variance of U1 and U2 are defined as: σ21 = σ2D + σ 2 S − 2ρDS (αp − βp)2 σ22 = α2pσ 2 D + β 2 pσ 2 S − 2αpβpρDS (αp − βp)2 and the covariance between U1 and U2 is: ρ12 = αpσ 2 D + βpσ 2 S − ρDS(αp + βp) (αp − βp)2 The joint density of U1 and U2 can be expressed as: f (u1, u2) = 1 2πσ1σ2 √ 1 − ρ2 exp − 1 2(1 − ρ2) ( u21 σ21 + u22 σ22 + 2ρu1u2 ) with ρ = ρ12/(σ1σ2). 32Here we view T as the length of the data for time series applications and as the number of observations in a cross section. 33 For instance, if εt = ρεt−1 + ut with ut ∼N(0,σ2), the probability that the cake is eaten in period 2 is: p2 = P (ε1 < ε ∗(W1), ε2 > ε
∗(W2))
= P(ε1 < ε∗(W1)) P(ε2 > ε∗(W2) | ε1 < ε∗(W1))
= Φ( ε∗1(W1) / (σ/√(1 − ρ^2)) ) · (1/(√(2π)σ)) ∫_{ε∗2}^{+∞} ∫_{−∞}^{ε∗1} exp( −(u − ρv)^2/(2σ^2) ) du dv.

If ρ = 0 then the double integral reduces to a simple integral of the normal distribution.

34For instance, µ(x) = [x, x^2] if one wants to focus on matching the mean and the variance of the process.

35To see this, define θ∞, the solution to the minimization of the above criterion, when the sample size T goes to infinity:

θ∞ = arg min_θ lim_T (1/T) Σ_{t=1}^{T} (x(u_t, θ0) − x̄(θ))^2
   = arg min_θ E(x(u, θ0) − x̄(θ))^2
   = arg min_θ E( x(u, θ0)^2 + x̄(θ)^2 − 2 x(u, θ0) x̄(θ) )
   = arg min_θ V(x(u, θ0)) + V(x̄(θ)) + (E x(u, θ0) − E x̄(θ))^2.

This result holds as E x x̄ = E x E x̄, i.e. the covariance between u_t and u_t^s is zero. Differentiating the last line with respect to θ, we obtain the first order condition satisfied by θ∞:

(∂/∂θ) V(x̄(θ∞)) + 2 (∂/∂θ) E x̄(θ∞) [E x̄(θ∞) − E x(u, θ0)] = 0.

If θ∞ = θ0, this first order condition is only satisfied if (∂/∂θ) V(x̄(θ0)) = 0, which is not guaranteed. Hence, θ∞ is not necessarily a consistent estimator. This term depends on the (gradient of the) variance of the variable, where the stochastic element is the simulated shocks. Using simulated paths instead of the true realization of the shock leads to this inconsistency.

36The specification of the model should also be rich enough so that the estimation makes sense. In particular, the model must contain a stochastic element which explains why the model is not fitting the data exactly. This can be the case if some characteristics, such as taste shocks, are unobserved.

37Though in the standard real business cycle model there is no rationale for such intervention.

38Equivalently, we could have specified the problem with k as the state, c as the control and then used a transition equation of: k′ = f(k) + (1 − δ)k − c.

39This follows from the arguments in Chapter 2.

40As noted in the discussion of the cake eating problem, this is but one form of a deviation from a proposed optimal path. Deviations for a finite number of periods also do not increase utility if (5.2) holds. In addition, a transversality condition must be imposed to rule out deviations over an infinite number of periods.

41That code and explanations for its use are available on the web page for this book.

42In the discussion of King et al. (1988), this term is often called the elasticity of the marginal utility of consumption with respect to consumption.

43One must take care that the state space is not binding. For the growth model, we know that k′ is increasing in k and that k′ exceeds (is less than) k when k is less than (exceeds) k∗. Thus the state space is not binding.

44This tradeoff can be seen by varying the size of the state space in grow.m. In many empirical applications, there is a limit to the size of the state space in that a finer grid doesn't influence the moments obtained from a given parameter vector.

45A useful exercise is to alter this initial guess and determine whether the solution of the problem is independent of it. Making good initial guesses is often quite valuable for estimation routines in which there are many loops over parameters so that solving the functional equation quickly is quite important.

46Later in this chapter we move away from this framework to discuss economies with distortions and heterogeneity.

47Later in this chapter, we discuss extensions that would include multiple sectors.

48Some of these restrictions are stronger than necessary to obtain a solution.
As we are going to literally compute the solution to (5.6), we will eventually have to create 315 a discrete representation anyways. So we have imposed some of these features at the start of the formulation of the problem. The assumptions on the shocks parallel those made in the presentation of the stochastic dynamic programming problem in Chapter 2. 49Thus the problem is quite similar to that described by King et al. (1988) though here we have not yet introduced employment. 50The discussion in the appendix of King et al. (1988) is recommended for those who want to study this linearization approach in detail. 51Here we formulate the guess of the policy function rather than the value function. In either case, the key is to check that the functional equation is satisfied. 52Alternatively, one could start from this guess of the value function and then use it to deduce the policy function. 53Given that u(c) and f (k) are both strictly concave, it is straightforward to see that the value function for the one period problem is strictly concave in k. As argued in Chapter 2, this property is preserved by the T (V ) mapping used to construct a solution to the functional equation. 54See Tauchen (1990) for a discussion of this economy and a comparison of the value function iteration solution relative to other solution methods. 55See also the presentation of various decentralizations in Stokey and Lucas (1989). 56Of course, this is static for a given k′. The point is that the choice of n does not influence the evolution of the state variable. 57In fact, preferences are often specified so that there is no response in hours worked to permanent shocks. Another specification of preferences, pursued in 316 Hansen (1985), arises from the assumption that employment is a discrete variable at the individual level. Rogerson (1988) provides the basic framework for the ”in- divisible labor model”. 58We will see this in more detail in the following chapter on household savings and consumption when there is stochastic income. 59For some specifications of the utility function, φ̂(A, k, k′) can be solved for an- alytically and inserted into the program. For example, suppose u(c, 1 − n) = U (c + ξ(1 − n)), where ξ is a parameter. Then the first order condition is Afn(k, n) = ξ which can be solved to obtain φ̂(A, k, k′) given the production function. To verify this, assume that Af (k, n) is a Cobb-Douglas function. 60The interested reader can clearly go beyond this structure though the arguments put forth by King et al. (1988) on restrictions necessary for balanced growth should be kept in mind. Here the function ξ(1 − n) is left unspecified for the moment though we assume it has a constant elasticity given by η. 61Note though that King, Plosser and Rebelo build a deterministic trend into their analysis which they remove to render the model stationary. As noted in Section 3.2.1 of their paper, this has implications for selecting a discount factor. 62Specifically, the moments from the KPR model are taken from their Table 4, using the panel data labor supply elasticity and ρ = .9. and the standard deviation of the technology shock (deviation from steady state) is set at 2.29. 63See King et al. (1988) for a discussion of this. 64As the authors appear to note, this procedure may actually just uncover the de- preciation rate used to construct the capital series from observations on investment. 
317 65Thus in contrast to many studies in the calibration tradition, this is truly an estimation exercise, complete with standard errors. 66In this case, the model cannot be rejected at a 15 % level using the J-statistic computed from the match of these two moments. 67This is the case since the empirical analysis focuses on output and investment fluctuations. 68When employment is variable and wages are observed, then (5.23) has no error term either. In this case, researchers include taste shocks. Using this, they find that current consumption can be written as a function of current output and lagged consumption without any error term. This prediction is surely inconsistent with the data. 69See Hansen et al. (1994) for a general formulation of this approach. 70Each of these extensions creates an environment which the interested reader can use as a basis for specifying and solving a dynamic programming and confronting it with data. 71Cooper (1999) explores a wide variety of ways to model complementarities. Enriching the neoclassical production function is the one closest to existing models. See the discussion in Benhabib and Farmer (1994) and Farmer and Guo (1994) about the use of these models to study indeterminacy. Manski (1993) and Cooper (2002) discuss issues associated with the estimation of models with complementarities and multiple equilibria. 72In contrast to the contraction mapping theorem, there is no guarantee that this process will converge. In some cases, the household’s response to an aggregate law of motion can be used as the next guess on the aggregate law of motion. Iteration 318 of this may lead to a recursive equilibrium. 73See Cooper (1999) and the references therein. 74For now think of these are producer durables though one could also add con- sumer durables to this sector or create another sector. 75Similar problems of matching positive comovements arise in multiple-country real business cycle models. 76McGrattan (1994) allows for past labor to enter current utility as well. 77See McGrattan (1994) and the references therein for a discussion of computing such equilibria. 78This has a well understood implication for the timing of taxes. Essentially, a government with a fixed level of spending must decide on the timing of its taxes. If we interpret the income flows in our example as net of taxes, then intertemporal variation in taxes (holding fixed their present value) will only change the timing of household income and not its present value. Thus, tax policy will influence savings but not consumption decisions. 79If ρ > 1, then ∂c0
/∂y0 will exceed 1.
80We assume that there exists a solution to this functional equation. This requires,
    as always, that the choice be bounded, perhaps by a constraint on the total debt
    that a household can accumulate.
    81In fact, if there are other variables known to the decision maker that provide
    information on (y′, R) then these variables would be included in the state vector as
    well.
    82Sargent (1978) also provides a test for the permanent income hypothesis and
    rejects the model.
    83See for instance Zeldes (1989b) or Campbell and Mankiw (1989).
    84In fact, the theory does not imply which of the many possible variables should
    be used when employing these restrictions in an estimation exercise. That is, the
    question of “which moments to match?” is not answered by the theory.
    85This is similar to the trick we used in the stochastic growth model with endoge-
    nous employment.
86See also Wright and Williams (1984) and Miranda and Helmberger (1988) for early contributions on this subject, including numerical solutions and simulations of these models.
87See also Carroll (1992).
88The figure was computed using the following parameterization: β = 0.96, γ = 0.5, σ²_u = 0.0212, σ²_n = 0.044, p = 0.03, γ0 = 0.0196, γ1 = 0.0533. We are grateful to Gourinchas and Parker for providing us with their codes and data.
    89See footnote 88 for the parameterization.
    90As an outstanding example, Rust and Phelan (1997) explore the effects of social
    security policies on labor supply and retirement decisions in a dynamic programming
    framework.
91In a model of habit formation, past consumption can influence current utility even if the consumption good is a nondurable or a service. In that case, the state vector is supplemented to keep track of that experience. For the case of durable goods, we will supplement the state vector to take the stock of durables into account.
    92From Baxter (1996), the volatility of durable consumption is about five times
    that of nondurable consumption.
93To be complete, as we explain there are also maintained assumptions about preferences, shocks, and the lack of adjustment costs.
    94Of course, other possible assumptions on timing are implementable in this frame-
    work. We discuss this below.
    95That is, movement in the marginal utility of consumption of nondurables may
    be the consequence of variations in the stock of durables. We return to this point
    in the discussion of empirical evidence.
    96This condition doesn’t obtain under the previous timing due to the time to build
    aspect of durables assumed there.
    97See also Eichenbaum and Hansen (1990).
    98See House and Leahy (2000) for a model of durables with an endogenous lemons
    premium.
    99The assumption that one car is the max is just for convenience. What is im-
    portant is that the car choice set is not continuous.
    100This presentation relies heavily on Adda and Cooper (2000b).
    101 Adda and Cooper (2000b) explicitly views this as a household specific income
    shock but a broader interpretation is acceptable, particularly in light of their iid
    assumption associated with this source of variation.
    102Here only a single lag is assumed to economize on the state space of the agents’
    problem.
    103 As in Adda and Cooper (2000b), we assume that the costs of production are
    independent of the level of production. Combined with an assumption of constant
    mark-ups, this implies that the product price is independent of the cross sectional
    distribution of car vintages.
    This assumption of an exogenous price process greatly simplifies the empirical
    implementation of the model since we do not have to solve an equilibrium problem.
    In fact, we have found that adding information on the moments of the cross sectional
    distribution of car vintages has no explanatory power in forecasting car prices in
    the French case. Results are mixed for the US case, as the average age of cars
    significantly predicts future prices.
    104There are numerous surveys of investment. See Caballero (1999) and Chirinko
    (1993) and the references therein for further summaries of existing research.
105This corresponds to the outcome of a stochastic growth model if there are
    risk neutral consumers. Otherwise, a formulation with variable real interest rates
    may be warranted.
    106In many economies, it is also influenced by policy variations in the form of
    investment tax credits.
    107Moreover, the special case of no adjustment costs is generally nested in these
    other models.
    108In some applications, the cost of adjustment function depends on investment
    and is written C(I, K) where I = K′ − (1 − δ)K .
    109Abel and Eberly (1994) contain further discussion of the applicability of Q
    theory for more general adjustment cost and profit functions.
    110Hayashi (1982) was the first to point out that in this case average and marginal
    q coincide though his formulation was nonstochastic.
    111 Interestingly, the natural conjecture that φ(A) = A does not satisfy the func-
    tional equation.
    112We are grateful to Joao Ejarque for allowing us to use this material.
    113The error term in (8.8) is often ascribed to stochastic elements in the cost of
    adjustment function so that ai is modified to become ait = ai + εit.
    114 Hubbard (1994) reviews these findings.
    115Cooper and Ejarque (2001) do not attempt to characterize this measurement
    error analytically but use their simulated environment to understand its implica-
    tions. See Erickson and Whited (2000) for a detailed and precise discussion of the
    significance of measurement error in the Q regressions.
    116 Cooper and Ejarque (2001) have no unobserved heterogeneity in the model so
    that the constant from the regression as well as the fixed effects are ignored. The
    remaining coefficients are taken to be common across all firms.
    117 In fact, the estimates are not very sensitive to the aggregate shocks. The model
    is essentially estimated from the rich cross sectional variation, as in the panel study
    of Gilchrist and Himmelberg (1995).
    118The computation of standard errors follows the description in Chapter 4 of
    Gourieroux and Monfort (1996).
119Cooper and Ejarque (2001) show that if p = y^(−η) is the demand curve and y = Ak^φ l^(1−φ) the production function, maximization of profit over the flexible factor, l, leads to a reduced form profit function where the exponent on capital is φ(η−1)/((1−φ)(1−η)−1). With φ = .33 and η = .1315, this implies a markup of about 15%.
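The algebra behind note 119, sketched in our notation with w denoting the price of the flexible factor:

\[
\pi(k) = \max_{l}\;\Big\{\big(Ak^{\phi} l^{1-\phi}\big)^{1-\eta} - w\,l\Big\},
\qquad a \equiv (1-\phi)(1-\eta).
\]

The first order condition gives l proportional to k^{φ(1−η)/(1−a)}, and substituting back,

\[
\pi(k) \propto k^{\frac{\phi(1-\eta)}{1-(1-\phi)(1-\eta)}}
       = k^{\frac{\phi(\eta-1)}{(1-\phi)(1-\eta)-1}},
\]

which is the exponent reported in the note.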
    120The program to estimate this model is very simple. Once Ω(γ) is programmed,
    it is simply a basic routine to minimize this function. Obtaining Ω(γ) is easy too
    using the information on parameters plus observations in the data set on investment
    rates and the ratio of output to capital (which is used to determine marginal profit
    rates). The minimization may not occur exactly at γ = 2 due to sampling error.
    The interested reader can extend this analysis to create a distribution of estimates
    by redrawing shocks, simulating and then re-estimating γ from the GMM procedure.
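To illustrate the kind of routine note 120 has in mind, here is a minimal Python sketch; the data moments and the simulator below are placeholders we introduce for exposition and do not reproduce the authors' Ω(γ):

import numpy as np
from scipy.optimize import minimize_scalar

data_moments = np.array([0.25, 0.04])          # placeholder data moments

def simulated_moments(gamma, n_sim=3000, seed=0):
    # Placeholder simulator: stands in for solving the model at gamma
    # and computing the corresponding moments from simulated data.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_sim)
    return np.array([0.25 + 0.05 * (gamma - 2.0), 0.04 + 0.01 * x.mean()])

def omega(gamma, weight=np.eye(2)):
    # GMM criterion: weighted distance between data and simulated moments
    g = data_moments - simulated_moments(gamma)
    return float(g @ weight @ g)

result = minimize_scalar(omega, bounds=(0.5, 5.0), method="bounded")
print("estimated gamma:", result.x)

Because the shocks are simulated, the minimum need not fall exactly at the true parameter; re-drawing shocks and re-estimating, as the note suggests, traces out the sampling distribution of the estimate.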
    121If, in the example above, α = 1, then the constraint is proportional to K. In
    this case, it appears that average and marginal Q are equal.
    122Cooper and Haltiwanger provide a detailed description of the data.
    123See Abel and Eberly (1994) for a model in which fixed costs are proportional to
    K. If these costs were independent of size, then large plants would face lower adjust-
    ment costs (relative to their capital stock) and thus might adjust more frequently.
So, as in the quadratic specification, the costs are scaled by size. This is, though, an
    assumption and the relationship between plant size and investment activity is still
    an open issue.
    124 Recall the outline of the basic value function iteration program for the non-
    stochastic growth model and the modification of that for non-convex adjustment
    costs in Chapter 3.
    125As discussed in Cooper and Haltiwanger (1993) and Cooper et al. (1999), this
    assumption that a new machine has fixed size can be derived from a model with
    embodied technological progress which is rendered stationary by dividing through by
    the productivity of the new machine. In this case, the rate of depreciation measures
    both physical deterioration and obsolescence.
    126 Cooper and Haltiwanger (2000) and Cooper et al. (1999) argue that these
    features also hold when there is a one period lag in the installation process.
    127Cooper et al. (1999) analyze the more complicated case of a one-period lag in
    the installation of new capital.
    128An interesting extension of the model would make this gap endogenous.
    129The data set is described in Cooper and Haltiwanger (2000) and is a balanced
    panel of US manufacturing plants. Comparable data sets are available in other
    countries. Similar estimation exercises using these data sets would be of considerable
    interest.
    130See the discussion in Cooper and Haltiwanger (2000) of the estimation of this
    profit function.
    131More recent versions of the Cooper-Haltiwanger paper explore adding lagged
    investment rates to this reduced form to pick up some of the dynamics of the ad-
    justment process.
    132This is an important step in the analysis. Determining the nature of adjustment
    costs will depend on the characterization of the underlying profitability shocks. For
    example, if a researcher is trying to identify non-convex adjustment costs from bursts
    of investment, then getting the distribution of shocks right is critical.
    133The results are robust to allowing the discount factor to vary with the aggregate
    shock in order to mimic the relationship between real interest rates and consumption
    growth from a household’s Euler equation.
    134The interested reader should read closely the discussion in Rust (1987) and the
    papers that followed this line of work. Note that often assumptions are made on
    G(·) to ease the computation of the likelihood function.
    135Here we are also assuming that the discount factor is fixed. More generally it
    might depend on a and a′.
    136So, in contrast to the chapter on capital adjustment, here we assume that there
    are no costs to adjusting the stock of capital. This is, of course, for convenience
    only and a complete model would incorporate both forms of adjustment costs.
    137We can study the implications of that specification by setting q = 0 in (9.2) to
    study the alternative.
    138As well as from the dynamic adjustment of other factors, such as capital.
    139As discussed later in this chapter, this model is used in Cooper and Willis (2001)
    as a basis for a quantitative analysis of the gap approach.
    140The literature on labor adjustment costs contains both specifications. Cooper
    and Willis (2001) find that their results are not sensitive to this part of the specifi-
    cation.
    141Alternatively, the parameters of these processes could be part of an estimation
    exercise.
    142The factors that help the firm forecast future wages are then included in the
    state space of the problem; i.e. they are in the aggregate component of A.
    143Sargent (1978) estimates a model with both regular and overtime employment.
    For simplicity, we have presented the model of regular employment alone.
    144He also discusses in detail the issue of identification and in fact finds multiple
    peaks in the likelihood function. Informally, the issue is distinguishing between the
    serial correlation in employment induced by lagged employment from that induced
    by the serial correlation of the productivity shocks.
    145This inaction rate is too high relative to observation: the parameterization is
    for illustration only.
    146In fact, this depiction also motivates consideration of a search model as the
primitive that underlies a model of adjustment costs. See the discussion of Yashiv (2000) in Chapter 10.
    147At this level of fixed costs, there is about 50% employment inaction. Again the
    parameterization is just for illustration.
    148This presentation draws heavily upon Cooper and Willis (2001). We are grateful
    to John Haltiwanger and Jon Willis for helpful discussions on this topic.
    149In fact the structure is used to study adjustment of capital as well.
    150Based on discussions above, the policy function of the firm should depend jointly
    on (A, e−1) and not the gap alone.
    151 This point was made some years ago. Nickell (1978) says,
    “… the majority of existing models of factor demand simply analyze
    the optimal adjustment of the firm towards a static equilibrium and it
    is very difficult to deduce from this anything whatever about optimal
    behavior when there is no ‘equilibrium’ to aim at.”
    152The process is taken from the Cooper and Haltiwanger (2000) study of capital
    adjustment. As these shocks were measured using static labor first order condition,
    Cooper and Willis (2001) study the robustness of their results to variations in these
    Markov processes.
    153This discussion parallels the approach in Cooper and Haltiwanger (2000).
154Though see the discussion in Aguirregabiria (1997) for progress in this direction.
155Of course, it then becomes a question of identification: can one distinguish the non-convex and piecewise linear models?
    156Note that Θ would include the parameters of the stochastic processes as well.
    157This is the goal of an ongoing project.
    158Though in some cases a more general equilibrium approach is needed to assess
    the complete implications of the policy.
    159This suggestion is along the lines of the so-called “natural experiments” ap-
    proach to estimation where the researcher searches for “exogenous” events that may
    allow for the identification of key parameters. Evaluating this approach in the con-
    text of structural model is of interest.
160Early formulations of the framework we discuss include Benassy (1982), Blanchard and Kiyotaki (1987), Caballero and Engel (1993a), Caplin and Leahy (1991), and Caplin and Leahy (1997).
    161This is similar to the stochastic adjustment cost structure used in Rust (1987).
    162As discussed, for example, in Blanchard and Kiyotaki (1987), there is a com-
    plementarity that naturally arises in the pricing decisions in this environment.
163Of course, this may entail adding additional elements to the state space. See
    Adda and Cooper (2000a) and Willis (2000a) for discussions of this point.
    164Ball and Romer (1990) provide an example of this. John and Wolman (1999)
    study these issues in a dynamic setting of price adjustment.
    165The contribution here is bringing the dynamic menu cost model to the data.
    Bils and Klenow (2002) provide further evidence on price setting behavior based
    upon BLS price data.
    166For this specification, there is assumed to be no serial correlation in the adjust-
    ment costs. See Willis (2000a) for further discussion of this point and estimates
    which relax this restriction.
    167Thus in principle one can use this condition for estimation of some parameters
    of the model using orthogonality conditions as moments. See the discussion of this
    point in Pakes (1994) and Aguirregabiria (1997), where the latter paper includes a
    labor example.
    168The findings of Dotsey et al. (1999) are based on a parameterization of the
    adjustment cost distribution and the other assumptions noted above. Whether
    these properties obtain in an estimated model is an open issue. See Willis (2000b)
    for progress on this issue.
    169See the discussion in Arrow et al. (1951) and the references therein.
    170Taken literally R in excess of unity means that inventories accumulate on their
    own which may seem odd. The literature is much more explicit about various
marginal gains to holding inventories. If R is less than unity, then output will
    be independent of the state but will be rising over time. This policy may require
    negative inventories, an issue we address below.
    171See Blinder (1986), Blinder and Maccini (1991) and the references therein for
    the extensive literature on these points.
    172See, for example, the discussion in Blinder (1986), Eichenbaum (1989) and
    Christiano (1988).
    173Hall (2000) studies a model of production scheduling using data on automobile
assembly plants and finds some support for the hypothesis that nonconvexities in the
    production process lie behind the observations on the relative volatility of production
    and sales.
    174See Scarf (1959) for developments of this argument.
    175Hall and Rust (2000) examines a model of optimal inventory behavior in an
    environment where there is a fixed ordering cost with a stochastic product price.
    They argue that a calibrated version of their model fits important aspects of their
    data from a US steel wholesaler.
    176Kahn (1987) includes a period of price predetermination.
    177The estimation methodology is complex and the reader is urged to study Aguir-
    regabiria (1999).
    178Estimation of this more general structure using plant level data is part of ongoing
    research of R. Cooper and J. Haltiwanger. See Sakellaris (2001) for some interesting
    facts concerning the interaction of capital and labor adjustment.
    179This is the underlying theme of the macroeconomic complementarities literature,
    as in Cooper (1999).
    180In contrast to the contraction mapping theorem, there is no guarantee that this
    process will converge. In some cases, the household’s response to an aggregate law
    of motion can be used as the next guess on the aggregate law of motion. Iteration
    of this may lead to a recursive equilibrium.
    181See Cooper (1999) and the references therein.
    182Interestingly, McCall mentions that his paper draws on Stanford class notes
    from K. Arrow on the reservation wage property.
    183This model is frequently used for expositional purposes in other presentations of
    the search process. It can be enriched in many ways, including adding: fires, quits,
    costly search, etc.
    184Writing a small program to do this would be a useful exercise. Note that this
    dynamic programming model is close to the discrete cake eating problem presented
    in Chapters 2 to 4.
185Here Θ would include the parameters for the individual agent (e.g., those characterizing u(w) as well as β) and the parameters of the wage distribution.
    186Sometimes unobserved heterogeneity is added to create the same effect.
187Adda et al. (2002) estimate a related model using panel data on German workers.
    188As noted earlier, Willis (2000b) makes some progress on this in a pricing problem
    and Thomas (2000) studies some of these issues in the context of an investment
    problem.

    Table 5.1: Observed and Predicted Moments
    Moments US data KPR calibrated model
    Std relative to output
    consumption .69 .64
    investment 1.35 2.31
    hours .52 .48
    wages 1.14 .69
    Cross correlation with output
    consumption .85 .82
    investment .60 .92
    hours .07 .79
    wages .76 .90

    Table 6.1: GMM Estimation Based on the Euler Equation
γ      Prop. of liquidity constrained periods      γ̂_GMM
    0.5 80% 2.54
    1 50% 3.05
    2 27% 3.92
    3 23% 4.61
    4 11% 5.23
    5 9% 5.78
    6 8% 6.25
    Note: ρ = 0, σ = 10, µ = 100, β =
    0.9, r = 0.05. Estimation done on
    3000 simulated observations.

    Table 7.1: ARMA(1,1) Estimates on US and French Data
    Specification No trend Linear trend
    α1 δ α1 δ
    US durable expenditures 1.00(.03) 1.5 (.15) 0.76 (0.12) 1.42 (0.17)
    US car registration 0.36(.29) 1.34 (.30) 0.33 (0.30) 1.35(0.31)
    France durable expenditures 0.98 (0.04) 1.20 (0.2) 0.56 (0.24) 1.2 (0.36)
    France car expenditures 0.97(0.06) 1.3 (0.2) 0.49 (0.28) 1.20 (0.32)
    France car registrations 0.85 (0.13) 1.00 (0.26) 0.41 (0.4) 1.20 (0.41)
    Notes: Annual data. For the US, source FRED database, 1959:1-1997:3. French
    data: source INSEE, 1970:1-1997:2. US registration: 1968-1995.

Table 7.2: Transition Matrix for π
                      state tomorrow
                  1       2       3       4
state today   1   0.01    0.01    0.01    0.97
              2   0.01    0.01    0.01    0.97
              3   0.225   0.225   0.1     0.45
              4   0.01    0.01    0.01    0.97

    Table 8.1: Estimated Structural Parameters
    Structural Parameters
    α γ ρ σ θ
    GH95
    CE .689(.011) .149(.016) .106(.008) .855 (.04) 2

    Table 8.2: Regression Results and Moments
         Reduced Form Coef.      Estimates/Moments
         a1      a2       sc      I/K     std π/K
GH95     .03     .24      .4      .25     3
CE       .041    .237     .027    .251    2.95

    Table 8.3: Descriptive Statistics, LRD
    Variable LRD
    Average Investment Rate 12.2%
    Inaction Rate: Investment 8.1%
    Fraction of Observations with Negative Investment 10.4%
    Spike Rate: Positive Investment 18%
    Spike Rate: Negative Investment 1.4%

    Table 8.4: Parameter Estimates
    Spec. Structural Parm. Estimates (s.e.) parm. est. for (8.22)
    γ F ps ψ0 ψ1 ψ2
    LRD -.013 .265 .20
    all .043 (0.00224) .00039(.0000549) .967(.00112) -.013 .255 .171
    F only 0 .0333(.0000155) 1 -.02 .317 .268
    γ only .125(.000105) 0 1 -.007 .241 .103
    ps only 0 0 .93(.000312) -.016 .266 .223

    Figure 3.1: Stochastic Cake Eating Problem,
    i_s=1
    do until i_s>n_s * Loop over all sizes of the total
    amount of cake X *
    c_L=X_L * Min value for consumption *
    c_H=X[i_s] * Max value for consumption *
    i_c=1
    do until i_c>n_c * Loop over all consumption levels *
    c=c_L+(c_H-c_L)/n_c*(i_c-1)
    i_y=1
EnextV=0 * initialize the next value to zero *
    do until i_y>n_y * Loop over all possible realizations
    of the future endowment *
    nextX=R*(X[i_s]-c)+Y[i_y] * Next period amount of cake *
    nextV=V(nextX) * Here we use interpolation to find
    the next value function *
    EnextV=EnextV+nextV*Pi[i_y] * Store the expected future value
    using the transition matrix *
    i_y=i_y+1
    endo * end of loop over endowment *
    aux[i_c]=u(c)+beta*EnextV * stores the value of a given
    consumption level *
    i_c=i_c+1
    endo * end of loop over consumption *
newV[i_s]=max(aux) * Take the max over all consumption
levels (the new value is indexed by cake size only) *
    i_s=i_s+1
    endo * end of loop over size of cake *
    V=newV * update the new value function *
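For readers who want to run the algorithm, a minimal Python/NumPy sketch of the same value function iteration follows; the parameter values, grids, and log utility are our illustrative assumptions, not taken from the text:

import numpy as np

beta, R = 0.95, 1.02                         # assumed discount factor and gross return
Y = np.array([0.8, 1.2])                     # assumed iid endowment values
Pi = np.array([0.5, 0.5])                    # probabilities of each endowment
X = np.linspace(0.1, 10.0, 200)              # grid for the total amount of cake
u = np.log                                   # assumed utility function

V = np.zeros_like(X)
for _ in range(500):                         # value function iteration
    newV = np.empty_like(X)
    for i, x in enumerate(X):
        c = np.linspace(1e-6, x, 100)        # feasible consumption levels
        nextX = R * (x - c)[:, None] + Y[None, :]                 # next period cake, one column per shock
        EnextV = np.interp(nextX.ravel(), X, V).reshape(nextX.shape) @ Pi   # expected continuation value
        newV[i] = np.max(u(c) + beta * EnextV)
    if np.max(np.abs(newV - V)) < 1e-6:      # stop when the value function has converged
        break
    V = newV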

    Figure 3.2: Value Function, Stochastic Cake Problem

    Figure 3.3: Policy Function, Stochastic Cake Eating Problem

    Figure 3.4: Stochastic Cake Eating Problem, Projection Method
    procedure c(x) * Here we define an approximation for
    cc=psi_0+psi_1*x+psi_2*x*x the consumption function based on
    return(cc) a second order polynomial *
    endprocedure
    i_s=1
    do until i_s>n_s * Loop over all sizes of the total
    amount of cake *
    utoday=U’(c(X[i_s])) * marginal utility of consuming *
    ucorner=U’(X[i_s]) * marginal utility if corner solution *
EnextU=0 * initialize the expected future marginal utility *
i_y=1
    do until i_y>n_y * Loop over all possible realizations
    of the future endowment *
nextX=R*(X[i_s]-c(X[i_s]))+Y[i_y] * next amount of cake *
    nextU=U’(nextX) * next marginal utility of consumption *
    EnextU=EnextU+nextU*Pi[i_y] * here we compute the expected future
    marginal utility of consumption using
    the transition matrix Pi *
    i_y=i_y+1
    endo * end of loop over endowment *
    F[i_s]=utoday-max(ucorner,beta*EnextU)
    i_s=i_s+1
    endo * end of loop over size of cake *
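A minimal Python sketch of the corresponding projection step follows; the quadratic form of c(x) mirrors the listing, while the parameter values, the grid, and the use of least squares on the residuals F are our illustrative choices:

import numpy as np
from scipy.optimize import minimize

beta, R = 0.95, 1.02                          # assumed discount factor and gross return
Y = np.array([0.8, 1.2])                      # assumed iid endowment values
Pi = np.array([0.5, 0.5])                     # probabilities of each endowment
X = np.linspace(0.5, 10.0, 30)                # grid of cake sizes
uprime = lambda c: 1.0 / np.maximum(c, 1e-8)  # assumed marginal utility (log case)

def residuals(psi):
    c = lambda x: psi[0] + psi[1] * x + psi[2] * x ** 2     # second order polynomial rule
    F = np.empty_like(X)
    for i, x in enumerate(X):
        utoday = uprime(c(x))                 # marginal utility of consuming today
        ucorner = uprime(x)                   # marginal utility at the corner solution
        nextX = R * (x - c(x)) + Y            # next period amount of cake, one per shock
        EnextU = (uprime(c(nextX)) * Pi).sum()                  # expected future marginal utility
        F[i] = utoday - max(ucorner, beta * R * EnextU)         # Euler residual (we write the factor R explicitly)
    return F

obj = lambda psi: float((residuals(psi) ** 2).sum())            # sum of squared residuals
res = minimize(obj, x0=np.array([0.1, 0.5, 0.0]), method="Nelder-Mead")
print("psi estimates:", res.x)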

    Figure 3.5: Basis Functions, Finite Element Method


    X1 X2 X3
    0
    1
    p1(X) p2(X) p3(X) p4(X)

    Figure 3.6: Discrete Cake Eating Problem,
    i_s=2
    do until i_s>n_s * Loop over all sizes of the cake *
    i_e=1
    do until i_e>2 * Loop over all possible realizations
    of the taste shock *
ueat=u(W[i_s],e[i_e]) * utility of eating the cake now *
    nextV1=V[i_s-1, 1] * next period value if low taste shock *
    nextV2=V[i_s-1, 2] * next period value if high taste shock *
    EnextV=nextV1*p[i_e,1]+nextV2*p[i_e,2]
    newV[i_s,i_e]=max(ueat,beta*EnextV)
    * Take the max between eating now
    or waiting *
    i_e=i_e+1
    endo * end of loop over taste shock *
    i_s=i_s+1
    endo * end of loop over size of cake *
    V=newV * update the new value function *
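The same recursion can be written as a short Python/NumPy sketch; the utility function, the rate at which the cake shrinks, and the taste-shock process below are illustrative assumptions on our part:

import numpy as np

beta = 0.95
W = 0.8 ** np.arange(20)[::-1]              # assumed grid of cake sizes, W[i-1] = 0.8*W[i]
e = np.array([0.9, 1.1])                    # low and high taste shocks (assumed values)
p = np.array([[0.7, 0.3], [0.3, 0.7]])      # assumed transition matrix for the taste shock
u = lambda w, eps: eps * np.sqrt(w)         # assumed utility from eating a cake of size w

V = np.zeros((len(W), 2))
V[0] = u(W[0], e)                           # smallest cake: eating now is the only option
for _ in range(1000):                       # iterate on the Bellman equation
    newV = V.copy()
    for i_s in range(1, len(W)):
        for i_e in range(2):
            ueat = u(W[i_s], e[i_e])                  # utility of eating the cake now
            EnextV = V[i_s - 1] @ p[i_e]              # expected value of waiting one period
            newV[i_s, i_e] = max(ueat, beta * EnextV)
    if np.max(np.abs(newV - V)) < 1e-8:
        break
    V = newV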

    Figure 3.7: Value Function, Discrete Cake Eating Problem

    Figure 3.8: Decision Rule, Discrete Cake Eating Problem

    Figure 3.9: Approximation Methods

    Figure 3.10: Example of Discretization, N=3

[Figure: density φ(ε) split into three equal-probability (1/3) intervals at ε2 and ε3, with representative discrete points z1, z2, z3]

    Figure 3.11: Simulation of a Markov Process
    t=1
    oldind=1 * variable to keep track of state in period t-1 *
    y[t]=z[oldind] * initialize first period *
    do until t>T * Loop over all time periods *
    u=uniform(0,1) * Generate a uniform random variable *
    sum=0 * will contain the cumulative sum of pi *
    ind=1 * index over all possible values for process *
do until u<=sum * loop to find out the state in period t *
sum=sum+pi[oldind,ind] * cumulative sum of pi *
ind=ind+1
endo
y[t]=z[ind-1] * state in period t (ind has moved one past the selected state) *
oldind=ind-1 * keep track of lagged state *
t=t+1
endo

Figure 4.1: Log Likelihood, True θ0 = 0

Figure 4.2: Objective Function, Simulated Method of Moments, true θ0 = 0

Figure 4.3: Just Identification [diagram: M(θ), θ0, θ∗, P, P∗]

Figure 4.4: Non Identification [diagram: M(θ), θ∗1, θ∗2, P, P∗]

Figure 4.5: Zero Likelihood [diagram: M(θ), θ, P, P∗]

Figure 4.6: Overview of Methodology [flowchart nodes: Economic Model (Chapter 2, Economic Properties); Policy Rules (Chapter 3, Numerical solution); Predicted Outcome; Observed Outcome; Match?; Vector of parameters; Optimal Parameters; Goodness of fit; Overidentification tests; Policy analysis (Chapter 4, Estimation method)]

Figure 5.1: Policy Function [axes: current capital vs. future capital]

Figure 5.2: Net Investment [axes: current capital vs. net investment]

Figure 6.1: Consumption and Liquidity Constraints: Optimal Consumption Rule

Figure 6.2: Simulations of Consumption and Assets with Serially Correlated Income

Figure 6.3: Optimal Consumption Rule

Figure 6.4: Observed and Predicted Consumption Profiles

Figure 7.1: [s,S] rule

Figure 7.2: Estimated Hazard Function, France

Figure 7.3: Estimated Hazard Function, US

Figure 7.4: Sales of New Cars, in thousands, monthly

Figure 7.5: Expected Aggregate Sales, Relative to Baseline

Figure 7.6: Expected Government Revenue, Relative to Baseline

Figure 8.1: The function Ω(γ) [axis: Gamma; curve labeled "fit"]

Figure 9.1: Employment Policy Functions: Quadratic Costs [axes: current E vs. future E; high and low state]

Figure 9.2: Employment Policy Functions: Piece-wise Linear Adjustment Costs [axes: current E vs. future E; high and low state]

Figure 9.3: Employment Policy Functions: Non-convex Adjustment Costs [axes: current E vs. future E; high and low state]

Figure 9.4: Employment Policy Functions: Mixed Adjustment Costs [axes: current E vs. future E; high and low state]
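For completeness, the simulation loop of Figure 3.11 above has a compact Python counterpart; the state values and transition matrix below are illustrative assumptions:

import numpy as np

z = np.array([0.8, 1.2])                    # assumed state values
pi = np.array([[0.9, 0.1], [0.2, 0.8]])     # assumed transition matrix (rows sum to one)
T, rng = 200, np.random.default_rng(0)

y = np.empty(T)
state = 0                                   # initial state
y[0] = z[state]
for t in range(1, T):
    # draw the next state from the row of pi for the current state
    state = rng.choice(len(z), p=pi[state])
    y[t] = z[state]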

Prof. Andrzej Cieślik
Department of Macroeconomics and International Trade Theory, Faculty of Economic Sciences, University of Warsaw, 44/50 Długa St., 00-241 Warsaw, email: cieslik@wne.uw.edu.pl
Office hours: Thursday 15.00-16.30, room 409/410.

Advanced Macroeconomics Course Syllabus – Spring 2013: Microfoundations, Economic Growth, Business Cycles and Labor Market

    1. Description:
This is a 60-hour graduate course in advanced macroeconomics that focuses on dynamic real
    macroeconomics. The topics will cover microeconomic foundations of macroeconomics, growth theories,
    business cycles and selected labor market issues. This is an obligatory course for MA Programs in
    International Economics and Quantitative Finance. Foreign students visiting the Faculty of Economic
    Sciences at the University of Warsaw are also welcome to participate. Polish students with good knowledge
    of English from other specialization fields can enroll subject to instructor’s approval. The course is offered
    only in the spring semester. The class meets twice a week on Tuesdays and Thursdays for two hours (9.45-
11.20) in room A. The class is accompanied by non-obligatory tutorial classes that meet on Mondays (9.45-
    11.20) in room 203 every fortnight starting February 25, 2013.

    2. Objectives:
    The main objective of this course is to familiarize students with key analytical models in real
    macroeconomics. The course consists of four parts. The first part is devoted to microfoundations of
    macroeconomic models such as consumption, investment and the government sector. The second part
    focuses on exogenous and endogenous growth theories and covers neoclassical models such as Solow-Swan,
    Ramsey and OLG models as well as newer models such as AK, Lucas-Uzawa, Romer and Grossman-
    Helpman models. The third part concentrates on business cycles and covers real business cycle and new
    Keynesian theories. The fourth part is devoted to various labor market issues.

    3. Required reading:
    There is no single textbook for this course. Materials for this course come from various textbooks and
    articles. All assigned readings are required readings. Most often reference will be made to the selected
    chapters from the following six books:
    [1] Acemoglu D., 2009, Introduction to Modern Economic Growth, Princeton University Press, Princeton,
    [2] Adda J., Cooper R., 2003, Dynamic Economics, The MIT Press, Cambridge, M.A.,
    [3] Bagliano F.C., Bertola G., 2004, Models for Dynamic Macroeconomics, Oxford University Press,
    Oxford,
    [4] Barro R.J., Sala-i-Martin X., 2004, Economic Growth, Second Edition, The MIT Press, Cambridge,
    M.A.,
    [5] Blanchard O.J., Fischer S., 1989, Lectures on Macroeconomics, The MIT Press, Cambridge, M.A.,
    [6] Romer D., 2001, Advanced Macroeconomics, Second Edition, McGraw-Hill, New York.

    4. Prerequisites:
    The main prerequisite for this course is knowledge of both macro and microeconomics at the undergraduate
level, as well as microeconomics and mathematical methods in economics at the graduate level.

    5. Exam:
    The grading will be based on the final written exam offered on June 13 (starting 9.00 and ending 12.00) in
    room A.


    Detailed course program description:

    Part I. Microeconomic Foundations

    Topic 1. Consumption
    Adda J., Cooper R., 2003, ch. 6, Consumption, 139-164.
    Bagliano F.C., Bertola G., 2004, ch. 1, Dynamic consumption theory, 1-46.
    Romer D., 2001, ch. 7, Consumption, 331-362.

    Topic 2. Government sector
    Romer D., 2001, ch. 11, Budget deficits and fiscal policy, 531-582.

    Topic 3. Investment theory
    Adda J., Cooper R., 2003, ch. 8, Investment, 187-214.
    Bagliano F.C., Bertola G., 2004, ch. 2, Dynamic models of investment, 47-101.
    Romer D., 2001, ch. 8., Investment, 367-409.
    Sala-i-Martin X., 2000, Internal and external adjustment costs in the theory of fixed investment,
    lecture notes.
    Hall R., Jorgenson D., 1967, Tax policy and investment behavior, American Economic Review 57,
    391-414.
    Hayashi F., 1982, Tobin’s marginal q and average q: A neoclassical interpretation, Econometrica 50,
    213-224.

    Part II. Growth Theory

    A. Neoclassical growth theory

    Topic 4. Solow-Swan model
    Acemoglu D., 2009, ch. 2., The Solow growth model, 26-76.
    Barro R.J., Sala-i-Martin X., 2004, ch. 1, Growth models with exogenous saving rates, The
    neoclassical model of Solow and Swan, 23-59.
    Romer D., 2001, ch. 1, The Solow growth model, 5-43.

    Topic 5. Ramsey-Cass-Koopmans (RCK) model
    Acemoglu D., 2009, ch. 8., The neoclassical growth model, 287-326.
    Barro R.J., Sala-i-Martin X., 2004, ch. 2, Growth models with consumer optimization, 85-133.
    Romer D., 2001, ch. 2, Infinite horizon and overlapping generation models, Part A: The Ramsey-
    Cass-Koopmans model, 47-74.
    Blanchard O.J., Fischer S., 1989, Lectures on Macroeconomics, ch. 2, Consumption and
    investment: Basic infinite horizon models, section 2.3, Government in the decentralized economy,
    52-58.

    Topic 6. Overlapping generations (OLG) model
    Acemoglu D., 2009, ch. 9., Growth with overlapping generations, 327-358.
    Romer D., 2001, ch. 2, Infinite horizon and overlapping generation models, Part B: The Diamond
    model, 75-90.
    Blanchard O.J., Fischer S., 1989, Lectures on Macroeconomics, ch. 3, The overlapping generations
    model, section 3.2, Social security and capital accumulation, 110-114.


    Barro R., 1974, Are government bonds net wealth?, Journal of Political Economy 82, 1095-1117.
    Diamond P., 1965, National debt in a neoclassical growth model, American Economic Review 55,
    1126-1150.
    Samuelson P.A., 1958, An exact consumption-loan model of interest with or without the social
    contrivance of money, Journal of Political Economy 66, 467-482.
    Abel A., Mankiw N.G., Summers L., Zeckhauser R., 1989, Assessing dynamic inefficiency: Theory
    and evidence, Review of Economic Studies 56, 1-20.

    Topic 7. Convergence debate
    Acemoglu D., 2009, ch. 3., The Solow model and the data, 77-108.
    Barro R.J., Sala-i-Martin X., 2004, ch. 11, Empirical analysis of regional datasets, 461-496.
    Barro R.J., Mankiw N.G., Sala-i-Martin X., 1995, Capital mobility in neoclassical models of
    growth, American Economic Review 85, 103-115.
    Islam N., 1995, Growth empirics: A panel data approach, Quarterly Journal of Economics 110,
    1127-1170.
    Mankiw N.G., Romer D., Weil D.N., 1992, A contribution to the empirics of economic growth,
    Quarterly Journal of Economics 107, 407-437.

    B. New growth theory

    Topic 8. AK models and externalities
    Acemoglu D., 2009, ch. 11., First-generation models of endogenous growth, 387-410.
    Barro R.J., Sala-i-Martin X., 2004, ch. 1, Growth models with exogenous saving rates, Models of
    endogenous growth, 61-71.
    Barro R.J., Sala-i-Martin X., 2004, ch. 4, One sector model of endogenous growth, 205-232.
    Rebelo S., 1991, Long-run policy analysis and long-run growth, Journal of Political Economy 99,
    500-521.
    Romer P., 1986, Increasing returns and long run growth, Journal of Political Economy 94, 1002-
    1037.
    Romer D., 2001, ch. 3, New growth theory, Part B, Cross-country income differences, 120-122.

    Topic 9. Lucas-Uzawa model
    Barro R.J., Sala-i-Martin X., 2004, ch. 5., Two-sector models of endogenous growth, 239-271.
    Romer D., 2001, ch. 3, New growth theory, Part A, Research and development models, 98-160.
    Lucas R.E., 1988, On the mechanics of economic development, Journal of Monetary Economics 22,
    3-42.

    Topic 10. Expanding product variety models
    Acemoglu D., 2009, ch. 13., Expanding variety models, 433-457.
    Barro R.J., Sala-i-Martin X., 2004, ch. 6, Technological change: Models with an expanding product
    variety, 285-313.
    Grossman G., Helpman E., 1993, Innovation and growth in the global economy, MIT Press,
    Cambridge MA, ch. 3, Expanding product variety, 45-76.

    Topic 11. Quality ladder models
    Barro R.J., Sala-i-Martin X., 2004, ch. 7, Technological change: Models with improvements in the
    quality of products, 317-343.
    Grossman G., Helpman E., 1993, Innovation and growth in the global economy, MIT Press,
    Cambridge MA, ch. 4, Rising product quality, 86-109.


    Topic 12. Growth empirics
    Barro R.J., Sala-i-Martin X., 2004, ch. 10, Growth accounting, 433-460.
    Barro R.J., Sala-i-Martin X., 2004, ch. 11, Empirical analysis of regional datasets, 461-496.

    Part III. Business Cycle Theory

    Topic 13. Real business cycles
    Acemoglu D., 2009, ch. 16., Stochastic growth models, 566-610.
    Barro R.J., Sala-i-Martin X., 2004, ch. 9, Labor supply and population, 9.3, Labor/Leisure choice,
    422-428.
    Romer D., 2001, ch. 4, Real business cycle theory, 168-212.
    Campbell J.M., 1994, Inspecting the mechanism: An analytical approach to the stochastic growth
    model, Journal of Monetary Economics 33, 463-506.
    Ritter J.A., 1995, An outsider’s guide to real cycle modeling, Federal Reserve Bank of St. Louis
    Review, 49-60.
    Christiano L., Eichenbaum M., 1992, Current real business cycle theories and aggregate labor
    market fluctuations, American Economic Review 82, 430-450.

    Topic 14. Coordination failures and macroeconomic policy
    Bagliano F.C., Bertola G., 2004, ch. 5, Coordination and externalities in macroeconomics, 170-187.
    Cooper R., 1999, Coordination games: Complementarities and Macroeconomics, Cambridge
    University press, Cambridge.
    Cooper R., John A., 1988, Coordinating coordination failures in Keynesian models, Quarterly
    Journal of Economics 103, 441-463.
    Diamond P., 1982, Aggregate demand management in search equilibrium, Journal of Political
    Economy 90, 881-894.

    Topic 15. Imperfect competition and real rigidities
    Romer D., 2001, ch. 6, Microeconomic foundations of incomplete nominal adjustment, Part B,
    Staggered price adjustment, 279-324.
    Blanchard O., Kiyotaki N., 1987, Monopolistic competition and the effects of aggregate demand,
    American Economic Review 77, 647-666.
    Mankiw N.G., 1988, Imperfect competition and the Keynesian cross, Economics Letters 26, 7-13.
    Rotemberg J.J., Saloner G., 1986, A supergame-theoretic model of price wars during booms,
    American Economic Review 76, 390-407.
    Weitzman M., 1982, Increasing returns and the foundations of unemployment theory, Economic
    Journal 92, 787-804.

    Part IV. Labor Market

    Topic 16. Efficiency wage models of unemployment
    Romer D., 2001, ch. 9, Unemployment, 410-432.
    Yellen J.L., 1984, Efficiency-wage models of unemployment, American Economic Review 74, 200-
    205.
    Shapiro C., Stiglitz J.E., 1984, Equilibrium unemployment as a worker-discipline device, American
    Economic Review 74, 433-444.

    Topic 17. Search models of unemployment
    Bagliano F.C., Bertola G., 2004, ch. 5, Coordination and externalities in macroeconomics, 188-206.
    Romer D., 2001, ch. 9, Unemployment, 444-461.



    Models for Dynamic Macroeconomics
    Fabio-Cesare Bagliano
    Giuseppe Bertola

    Great Clarendon Street, Oxford ox2 6dp
    Oxford University Press is a department of the University of Oxford.
    It furthers the University’s objective of excellence in research, scholarship,
    and education by publishing worldwide in
    Oxford New York
    Auckland Cape Town Dar es Salaam Hong Kong Karachi
    Kuala Lumpur Madrid Melbourne Mexico City Nairobi
    New Delhi Shanghai Taipei Toronto
    With offices in
    Argentina Austria Brazil Chile Czech Republic France Greece
    Guatemala Hungary Italy Japan Poland Portugal Singapore
    South Korea Switzerland Thailand Turkey Ukraine Vietnam
    Oxford is a registered trade mark of Oxford University Press
    in the UK and in certain other countries
    Published in the United States
    by Oxford University Press Inc., New York
    © Fabio-Cesare Bagliano and Giuseppe Bertola 2004
    The moral rights of the authors have been asserted
    Database right Oxford University Press (maker)
    First published 2004
    First published in paperback 2007
    All rights reserved. No part of this publication may be reproduced,
    stored in a retrieval system, or transmitted, in any form or by any means,
    without the prior permission in writing of Oxford University Press,
    or as expressly permitted by law, or under terms agreed with the appropriate
    reprographics rights organization. Enquiries concerning reproduction
    outside the scope of the above should be sent to the Rights Department,
    Oxford University Press, at the address above
    You must not circulate this book in any other binding or cover
    and you must impose the same condition on any acquirer
    British Library Cataloguing in Publication Data
    Data available
    Library of Congress Cataloging in Publication Data
    Data available
    Typeset by SPI Publisher Services, Pondicherry, India
    Printed in Great Britain
    on acid-free paper by
    Ashford Colour Press Ltd, Gosport, Hampshire
    ISBN 978–0–19–926682–1 (hbk.)
    ISBN 978–0–19–922832–4 (pbk.)
    10 9 8 7 6 5 4 3 2 1

PREFACE TO PAPERBACK EDITION
    The impact of macroeconomics on daily life is less tangible than that of micro-
    economics. Everyone has to deal with rising supermarket prices, fluctuations
    in the labor market, and other microeconomic problems. Only a handful of
    policymakers and government officials really need to worry about fiscal and
    monetary policy, or about a country’s overall competitiveness. The highly sim-
    plified, and unavoidably controversial nature of theories used to represent the
    complex phenomena resulting from the interaction of millions of individuals,
    tends to make macroeconomics appear to be a relatively arcane and technical
    branch of the social sciences. Its focus is on issues more likely to be of interest
    to specialists than the general public.
    Yet, macroeconomics and the problems it attempts to deal with are
    extremely important, even if they are sometimes difficult to grasp. It cannot
    be denied that macroeconomic analysis has become more technical over the
    last few decades. The formal treatment of expectations and of inter-temporal
    interactions is nowadays an essential ingredient of any model meant to address
    practical and policy problems. But, at the same time, it has also become more
    pragmatic because modern macroeconomics is firmly rooted in individual
    agents’ day-to-day decisions. To understand and appreciate scientific research
    papers, the modern macroeconomist has to master the dynamic optimization
    tools needed to represent the solution of real, live individuals’ problems in
    terms of optimization, equilibrium and dynamic accumulation relationships,
    expectations and uncertainty. The macroeconomist, unlike most microecono-
    mists, also needs to know how to model and interpret the interactions of
    individual decisions that, in different ways and at different levels, make an
    economy’s dynamic behavior very different from the simple juxtaposition of
    its inhabitant’s actions and objectives.
    This book offers its readers a step-by-step introduction to aspects of
    macroeconomic engineering, individual optimization techniques and modern
    approaches to macroeconomic equilibrium modeling. It applies the relevant
    formal analysis to some of the standard topics covered less formally by all
    intermediate macroeconomics course: consumption and investment, employ-
    ment and unemployment, and economic growth. Aspects of each topic are
    treated in more detail by making use of advanced mathematics and setting
    them in a broader context than is the case in standard undergraduate text-
    books. The book is not, however, as technically demanding as some other
    graduate textbooks. Readers require no more mathematical expertise than is
    provided by the majority of undergraduate courses. The exposition seeks to
    develop economic intuition as well as technical know-how, and to prepare
    students for hands-on solutions to practical problems rather than providing
    fully rigorous theoretical analysis. Hence, relatively advanced concepts (such
    as integrals and random variables) are introduced in the context of economic
    arguments and immediately applied to the solution of economic problems,
    which are accurately characterized without an in-depth discussion of the
    theoretical aspects of the mathematics involved. The style and coverage of
    the material bridges the gap between basic textbooks and modern applied
    macroeconomic research, allowing readers to approach research in leading
    journals and understand research practiced in central banks and international
    research institutions as well as in academic departments.
    How to Use This Book
    Models for Dynamic Macroeconomics is suitable for advanced undergraduate
    and first-year graduate courses and can be taught in about 60 lecture hours.
    When complemented by recent journal articles, the individual chapters—
    which differ slightly in the relative emphasis given to analytical techniques
    and empirical perspective—can also be used in specialized topics courses. The
    last section of each chapter often sketches more advanced material and may
    be omitted without breaking the book’s train of thought, while the chapters’
    appendices introduce technical tools and are essential reading. Some exercises
    are found within the chapters and propose extensions of the model discussed
    in the text. Other exercises are found at the end of chapters and should be used
    to review the material. Many technical terms are contained in the index, which
    can be used to track down definitions and sample applications of possibly
    unfamiliar concepts.
    The book’s five chapters can to some extent be read independently, but
    are also linked by various formal and substantive threads to each other and
    to the macroeconomic literature they are meant to introduce. Discrete-time
    optimization under uncertainty, introduced in Chapter 1, is motivated and
    discussed by applications to consumption theory, with particular attention to
    empirical implementation. Chapter 2 focuses on continuous-time optimiza-
    tion techniques, and discusses the relevant insights in the context of partial-
    equilibrium investment models. Chapter 3 revisits many of the previous
    chapters’ formal derivations with applications to dynamic labor demand, in
    analogy to optimal investment models, and characterizes labor market equi-
    librium when not only individual firms’ labor demand is subject to adjustment
    costs, but also individual labor supply by workers faces dynamic adjustment
    problems. Chapter 4 proposes broader applications of methods introduced by
    the previous chapters, and studies continuous-time equilibrium dynamics of
    representative-agent economies featuring both consumption and investment
    choices, with applications to long-run growth frameworks of analysis. Chapter
    5 illustrates the role of decentralized trading in determining aggregate equilib-
    ria, and characterizes aggregate labor market dynamics in the presence of fric-
    tional unemployment. Chapters 4 and 5 pay particular attention to strategic
    interactions and externalities: even when each agent correctly solves his or her
    individual dynamic problem, modern micro-founded macroeconomic mod-
    els recognize that macroeconomic equilibrium need not have unambiguously
    desirable properties.
    Brief literature reviews at the end of each chapter outline some recent
    directions of progress, but no book can effectively survey a literature as wide-
    ranging, complex, and evolving as the macroeconomic one. In the interests
    of time and space this book does not cover all of the important analytical and
    empirical issues within the topics it discusses. Overlapping generation dynam-
    ics and real and monetary business cycle fluctuations, as well as more technical
    aspects, such as those relevant to the treatment of asymmetric information
    and to more sophisticated game-theoretic and decision-theoretic approaches
    are not covered. It would be impossible to cover all aspects of all relevant topics
    in one compact and accessible volume and the intention is to complement
rather than compete with some of the other texts currently available.∗

    The
    positive reception of the hardback edition, however, would seem to confirm
    that the book does succeed in its intended purpose of covering the essential
    elements of a modern macroeconomist’s toolkit. It also enables readers to
    knowledgeably approach further relevant research. It is hoped that this paper-
    back edition will continue to fulfil that purpose even more efficiently for a
    number of years to come.
    The first hardback edition was largely based on Metodi Dinamici e
    Fenomeni Macroeconomici (il Mulino, Bologna, 1999), translated by Fabio
    Bagliano (ch.1), Giuseppe Bertola (ch. 2), Marcel Jansen (chs. 3, 4, 5, edited
    by Jessica Moss Spataro and Giuseppe Bertola). For helpful comments the
authors are indebted to many colleagues (especially Guido Ascari, Onorato Castellino, Elsa Fornero, Pietro Garibaldi, Giulio Fella, Vinicio Guidi, Claudio Morana) and to the anonymous reviewers. The various editions of the book have also benefited enormously from the input of the students and teaching assistants (especially Alberto Bucci, Winfried Koeniger, Juana Santamaria, Mirko Wiederholt) over many years at the CORIPE Master program in Turin, at the European University Institute, and elsewhere. Any remaining errors and all shortcomings are of course the authors' own.

∗ Foundations of Modern Macroeconomics, by Ben J. Heijdra and Frederick van der Ploeg (Oxford University Press, 2002) is more comprehensive and less technical; the two books can to some extent complement each other on specific topics. This book offers more technical detail and requires less mathematical knowledge than Lectures on Macroeconomics, by Olivier J. Blanchard and Stanley Fischer (MIT Press, 1989), and offers a more up to date treatment of a more limited range of topics. It is less wide ranging than Advanced Macroeconomics, by David Romer (McGraw-Hill 3rd rev. edn. 2005) but provides more technical and rigorous hands-on treatment of more advanced techniques. By contrast, Recursive Macroeconomic Theory, by Lars Ljungqvist and Thomas J. Sargent (MIT Press, 2nd edn. 2004) offers a more rigorous but not as accessible formal treatment of a broad range of topics, and a narrower range of technical and economic insights.

CONTENTS
    DETAILED CONTENTS x
    LIST OF FIGURES xiii
    1 Dynamic Consumption Theory 1
    2 Dynamic Models of Investment 48
    3 Adjustment Costs in the Labor Market 102
    4 Growth in Dynamic General Equilibrium 130
    5 Coordination and Externalities in Macroeconomics 170
    ANSWERS TO EXERCISES 221
    INDEX 274

DETAILED CONTENTS
    LIST OF FIGURES xiii
    1 Dynamic Consumption Theory 1
    1.1 Permanent Income and Optimal Consumption 1
    1.1.1 Optimal consumption dynamics 5
    1.1.2 Consumption level and dynamics 7
    1.1.3 Dynamics of income, consumption, and saving 9
    1.1.4 Consumption, saving, and current income 11
    1.2 Empirical Issues 13
    1.2.1 Excess sensitivity of consumption to current income 13
    1.2.2 Relative variability of income and consumption 15
    1.2.3 Joint dynamics of income and saving 19
    1.3 The Role of Precautionary Saving 22
    1.3.1 Microeconomic foundations 22
    1.3.2 Implications for the consumption function 25
    1.4 Consumption and Financial Returns 29
    1.4.1 Empirical implications of the CCAPM 31
    1.4.2 Extension: the habit formation hypothesis 35
    Appendix A1: Dynamic Programming 36
    Review Exercises 41
    Further Reading 43
    References 45
    2 Dynamic Models of Investment 48
    2.1 Convex Adjustment Costs 49
    2.2 Continuous-Time Optimization 52
    2.2.1 Characterizing optimal investment 55
    2.3 Steady-State and Adjustment Paths 60
    2.4 The Value of Capital and Future Cash Flows 65
    2.5 Average Value of Capital 69
    2.6 A Dynamic IS–LM Model 71
    2.7 Linear Adjustment Costs 76
    2.8 Irreversible Investment Under Uncertainty 81
    2.8.1 Stochastic calculus 82
    2.8.2 Optimization under uncertainty and irreversibility 85
    Appendix A2: Hamiltonian Optimization Methods 91
    Review Exercises 97
    Further Reading 99
    References 100
    3 Adjustment Costs in the Labor Market 102
    3.1 Hiring and Firing Costs 104
    3.1.1 Optimal hiring and firing 107

    3.2 The Dynamics of Employment 110
    3.3 Average Long-Run Effects 114
    3.3.1 Average employment 115
    3.3.2 Average profits 117
    3.4 Adjustment Costs and Labor Allocation 119
    3.4.1 Dynamic wage differentials 122
    Appendix A3: (Two-State) Markov Processes 125
    Exercises 127
    Further Reading 128
    References 129
    4 Growth in Dynamic General Equilibrium 130
    4.1 Production, Savings, and Growth 132
    4.1.1 Balanced growth 134
    4.1.2 Unlimited accumulation 136
    4.2 Dynamic Optimization 138
    4.2.1 Economic interpretation and optimal growth 139
    4.2.2 Steady state and convergence 140
    4.2.3 Unlimited optimal accumulation 141
    4.3 Decentralized Production and Investment Decisions 144
    4.3.1 Optimal growth 147
    4.4 Measurement of “Progress”: The Solow Residual 148
    4.5 Endogenous Growth and Market Imperfections 151
    4.5.1 Production and non-rival factors 152
    4.5.2 Involuntary technological progress 153
    4.5.3 Scientific research 156
    4.5.4 Human capital 157
    4.5.5 Government expenditure and growth 158
    4.5.6 Monopoly power and private innovations 160
    Review Exercises 163
    Further Reading 167
    References 168
    5 Coordination and Externalities in Macroeconomics 170
    5.1 Trading Externalities and Multiple Equilibria 171
    5.1.1 Structure of the model 171
    5.1.2 Solution and characterization 172
    5.2 A Search Model of Money 180
    5.2.1 The structure of the economy 180
    5.2.2 Optimal strategies and equilibria 182
    5.2.3 Implications 185
    5.3 Search Externalities in the Labor Market 188
    5.3.1 Frictional unemployment 189
    5.3.2 The dynamics of unemployment 191
    5.3.3 Job availability 192
    5.3.4 Wage determination and the steady state 195
    5.4 Dynamics 199
    5.4.1 Market tightness 199
    5.4.2 The steady state and dynamics 203

    5.5 Externalities and efficiency 206
    Appendix A5: Strategic Interactions and Multipliers 211
    Review Exercises 216
    Further Reading 217
    References 219
    ANSWERS TO EXERCISES 221
    INDEX 274

LIST OF FIGURES
    1.1 Precautionary savings 24
    2.1 Unit investment costs 50
    2.2 Dynamics of q (supposing that ∂ F (·)/∂ K is decreasing in K ) 57
2.3 Dynamics of K (supposing that ∂Π(·)/∂K − δ < 0) 58
2.4 Phase diagram for the q and K system 59
2.5 Saddlepath dynamics 60
2.6 A hypothetical jump along the dynamic path, and the resulting time path of λ(t) and investment 63
2.7 Dynamic effects of an announced future change of w 64
2.8 Unit profits as a function of the real wage 68
2.9 A dynamic IS–LM model 73
2.10 Dynamic effects of an anticipated fiscal restriction 75
2.11 Piecewise linear unit investment costs 77
2.12 Installed capital and optimal irreversible investment 79
3.1 Static labor demand 103
3.2 Adjustment costs and dynamic labor demand 111
3.3 Nonlinearity of labor demand and the effect of turnover costs on average employment, with r = 0 117
3.4 The employer's surplus when marginal productivity is equal to the wage 118
3.5 Dynamic supply of labor from downsizing firms to expanding firms, without adjustment costs 121
3.6 Dynamic supply of labor from downsizing firms to expanding firms, without employers' adjustment costs, if mobility costs κ per unit of labor 124
4.1 Decreasing marginal returns to capital 134
4.2 Steady state of the Solow model 134
4.3 Effects of an increase in the savings rate 136
4.4 Convergence and steady state with optimal savings 141
5.1 Stationarity loci for e and c* 174
5.2 Equilibria of the economy 178
5.3 Optimal response function 184
5.4 Optimal quantity of money M* and ex ante probability of consumption P 187
5.5 Dynamics of the unemployment rate 192
5.6 Equilibrium of the labor market with frictional unemployment 198
5.7 Dynamics of the supply of jobs 201
5.8 Dynamics of unemployment and vacancies 203
5.9 Permanent reduction in productivity 204
5.10 Increase in the separation rate 205
5.11 A temporary reduction in productivity 206
5.12 Strategic interactions 212
5.13 Multiplicity of equilibria 214

1 Dynamic Consumption Theory

Optimizing models of intertemporal choices are widely used by theoretical and empirical studies of consumption. This chapter outlines their basic analytical structure, along with some extensions. The technical tools introduced here aim at familiarizing the reader with recent applied work on consumption and saving, but they will also prove useful in the rest of the book, when we shall study investment and other topics in economic dynamics.

The chapter is organized as follows. Section 1.1 illustrates and solves the basic version of the intertemporal consumption choice model, deriving theoretical relationships between the dynamics of permanent income, current income, consumption, and saving. Section 1.2 discusses problems raised by empirical tests of the theory, focusing on the excess sensitivity of consumption to expected income changes and on the excess smoothness of consumption following unexpected income variations. Explanations of the empirical evidence are offered by Section 1.3, which extends the basic model by introducing a precautionary saving motive. Section 1.4 derives the implications of optimal portfolio allocation for joint determination of optimal consumption when risky financial assets are available. The Appendix briefly introduces dynamic programming techniques applied to the optimal consumption choice. Bibliographic references and suggestions for further reading bring the chapter to a close.

1.1. Permanent Income and Optimal Consumption

The basic model used in the modern literature on consumption and saving choices is based on two main assumptions:
1. Identical economic agents maximize an intertemporal utility function, defined on the consumption levels in each period of the optimization horizon, subject to the constraint given by overall available resources.

2. Under uncertainty, the maximization is based on expectations of future relevant variables (for example, income and the rate of interest) formed rationally by agents, who use optimally all the information at their disposal.

We will therefore study the optimal behavior of a representative agent who lives in an uncertain environment and has rational expectations. Implications of the theoretical model will then be used to interpret aggregate data. The representative consumer faces an infinite horizon (like any aggregate economy), and solves at time t an intertemporal choice problem of the following general form:

max_{c_{t+i}; i=0,1,...} U(c_t, c_{t+1}, ...) ≡ U_t,

subject to the constraint (for i = 0, ..., ∞)

A_{t+i+1} = (1 + r_{t+i}) A_{t+i} + y_{t+i} − c_{t+i},

where A_{t+i} is the stock of financial wealth at the beginning of period t+i; r_{t+i} is the real rate of return on financial assets in period t+i; y_{t+i} is labor income earned at the end of period t+i; and c_{t+i} is consumption, also assumed to take place at the end of the period. The constraint therefore accounts for the evolution of the consumer's financial wealth from one period to the next.

Several assumptions are often made in order to derive empirically testable implications from the basic model easily. The main assumptions (some of which will be relaxed later) are as follows.

– Intertemporal separability (or additivity over time). The generic utility function U_t(·) is specified as U_t(c_t, c_{t+1}, ...) = v_t(c_t) + v_{t+1}(c_{t+1}) + ... (with v'_{t+i} > 0 and v''_{t+i} < 0 for any i ≥ 0), where v_{t+i}(c_{t+i}) is the valuation at t of the utility accruing to the agent from consumption c_{t+i} at t+i. Since v_{t+i} depends only on consumption at t+i, the ratio of marginal utilities of consumption in any two periods is independent of consumption in any other period. This rules out goods whose effects on utility last for more than one period, either because the goods themselves are durable, or because their consumption creates long-lasting habits. (Habit formation phenomena will be discussed at the end of this chapter.)

– A way of discounting utility in future periods that guarantees intertemporally consistent choices. Dynamic inconsistencies arise when the valuation at time t of the relative utility of consumption in any two future periods, t+k1 and t+k2 (with t < t+k1 < t+k2), differs from the valuation of the same relative utility at a different time t+i. In this case the optimal levels of consumption for t+k1 and t+k2 originally chosen at t may not be considered optimal at some later date: the consumer would then wish to reconsider his original choices simply because time has passed, even if no new information has become available. To rule out this phenomenon, it is necessary that the ratios of discounted marginal utilities of consumption in t+k1 and t+k2 depend, in addition to c_{t+k1} and c_{t+k2}, only on the distance k2 − k1, and not also on the moment in time when the optimization problem is solved. With a discount factor for the utility of consumption in t+k of the form (1+ρ)^{−k} (called "exponential discounting"), we can write

v_{t+k}(c_{t+k}) = (1/(1+ρ))^k u(c_{t+k}),

and dynamic consistency of preferences is ensured: under certainty, the agent may choose the optimal consumption plan once and for all at the beginning of his planning horizon.¹

Footnote 1: A strand of the recent literature (see the last section of this chapter for references) has explored the implications of a different discount function: a "hyperbolic" discount factor declines at a relatively higher rate in the short run (consumers are relatively "impatient" at short horizons) than in the long run (consumers are "patient" at long horizons), implying dynamically inconsistent preferences.

– The adoption of expected utility as the objective function under uncertainty (additivity over states of nature). In discrete time, a stochastic process specifies a random variable for each date t, that is, a real number associated with the realization of a state of nature. If it is possible to give a probability to different states of nature, it is also possible to construct an expectation of future income, weighting each possible level of income with the probability of the associated state of nature. In general, the probabilities used depend on available information, and therefore change over time when new information is made available. Given her information set at t, I_t, the consumer maximizes expected utility conditional on I_t:

U_t = E( Σ_{i=0}^∞ v_{t+i}(c_{t+i}) | I_t ).

Together with the assumption of intertemporal separability (additivity over periods of time), the adoption of expected utility entails an inverse relationship between the degree of intertemporal substitutability, measuring the agent's propensity to substitute current consumption with future consumption under certainty, and risk aversion, determining the agent's choices among different consumption levels under uncertainty over the state of nature: the latter, and the inverse of the former, are both measured in absolute terms by −v''_t(c)/v'_t(c) at time t and for consumption level c. (We will expand on this point below.)

– Finally, we make the simplifying assumption that there exists only one financial asset with certain and constant rate of return r.
Financial wealth A is the stock of the safe asset allowing the agent to transfer resources through time in a perfectly forecastable way; the only uncertainty is about the (exogenously given) future labor incomes y. Stochastic rates of return on n financial assets are introduced in Section 1.4 below.

Under the set of hypotheses above, the consumer's problem may be specified as follows:

max_{c_{t+i}, i=0,1,...} U_t = E_t [ Σ_{i=0}^∞ (1/(1+ρ))^i u(c_{t+i}) ]   (1.1)

subject to the constraint (for i = 0, ..., ∞):²

A_{t+i+1} = (1 + r) A_{t+i} + y_{t+i} − c_{t+i},   A_t given.   (1.2)

In (1.1) ρ is the consumer's intertemporal rate of time preference and E_t[·] is the (rational) expectation formed using information available at t: for a generic variable x_{t+i} we have E_t x_{t+i} = E(x_{t+i} | I_t). The hypothesis of rational expectations implies that the forecast error x_{t+i} − E(x_{t+i} | I_t) is uncorrelated with the variables in the information set I_t: E_t(x_{t+i} − E(x_{t+i} | I_t)) = 0 (we will often use this property below). The value of current income y_t is included in I_t.

Footnote 2: In addition, a non-negativity constraint on consumption must be imposed: c_{t+i} ≥ 0. We assume that this constraint is always fulfilled.

In the constraint (1.2) financial wealth A may be negative (the agent is not liquidity-constrained); however, we impose the restriction that the consumer's debt cannot grow at a rate greater than the financial return r by means of the following condition (known as the no-Ponzi-game condition):

lim_{j→∞} (1/(1+r))^j A_{t+j} ≥ 0.   (1.3)

The condition in (1.3) is equivalent, in the infinite-horizon case, to the non-negativity constraint A_{T+1} ≥ 0 for an agent with a life lasting until period T: in the absence of such a constraint, the consumer would borrow to finance infinitely large consumption levels. Although in its general formulation (1.3) is an inequality, if the marginal utility of consumption is always positive this condition will be satisfied as an equality. Equation (1.3) with strict equality is called the transversality condition and can be used directly in the problem's solution. Similarly, without imposing (1.3), interest on debt could be paid for by further borrowing over an infinite horizon. Formally, from the budget constraint (1.2) at time t, repeatedly substituting A_{t+i} up to period t+j, we get the following equation:

(1/(1+r)) Σ_{i=0}^{j−1} (1/(1+r))^i c_{t+i} + (1/(1+r))^j A_{t+j} = (1/(1+r)) Σ_{i=0}^{j−1} (1/(1+r))^i y_{t+i} + A_t.

The present value of consumption flows from t up to t+j−1 can exceed the consumer's total available resources, given by the sum of the initial financial wealth A_t and the present value of future labor incomes from t up to t+j−1. In this case A_{t+j} < 0 and the consumer will have a stock of debt at the beginning of period t+j. When the horizon is extended to infinity, the constraint (1.3) stops the agent from consuming more than his lifetime resources, using further borrowing to pay the interest on the existing debt in any period up to infinity. Assuming an infinite horizon and using (1.3) with equality, we get
the consumer's intertemporal budget constraint at the beginning of period t (in the absence of liquidity constraints that would rule out, or limit, borrowing):

(1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i c_{t+i} = (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i y_{t+i} + A_t.   (1.4)

1.1.1. OPTIMAL CONSUMPTION DYNAMICS

Substituting the consumption level derived from the budget constraint (1.2) into the utility function, we can write the consumer's problem as

max U_t = E_t Σ_{i=0}^∞ (1/(1+ρ))^i u((1+r) A_{t+i} − A_{t+i+1} + y_{t+i})

with respect to wealth A_{t+i} for i = 1, 2, ..., given initial wealth A_t and subject to the transversality condition derived from (1.3). The first-order conditions

E_t u'(c_{t+i}) = ((1+r)/(1+ρ)) E_t u'(c_{t+i+1})

are necessary and sufficient if utility u(c) is an increasing and concave function of consumption (i.e. if u'(c) > 0 and u''(c) < 0). For the consumer's choice in the first period (when i = 0), noting that u'(c_t) is known at time t, we get the so-called Euler equation:

u'(c_t) = ((1+r)/(1+ρ)) E_t u'(c_{t+1}).   (1.5)

At the optimum the agent is indifferent between consuming immediately one unit of the good, with marginal utility u'(c_t), and saving in order to consume 1+r units in the next period, t+1. The same reasoning applies to any period t in which the optimization problem is solved: the Euler equation gives the dynamics of marginal utility in any two successive periods.³

Footnote 3: An equivalent solution of the problem is found by maximizing the Lagrangian function

L_t = E_t Σ_{i=0}^∞ (1/(1+ρ))^i u(c_{t+i}) − λ [ Σ_{i=0}^∞ (1/(1+r))^i E_t c_{t+i} − (1+r) A_t − Σ_{i=0}^∞ (1/(1+r))^i E_t y_{t+i} ],

where λ is the Lagrange multiplier associated with the intertemporal budget constraint (here evaluated at the end of period t). From the first-order conditions for c_t and c_{t+1}, we derive the Euler equation (1.5). In addition, we get u'(c_t) = λ. The shadow value of the budget constraint, measuring the increase of maximized utility due to an infinitesimal increase of the resources available at the end of period t, is equal to the marginal utility of consumption at t. At the optimum, the Euler equation holds: the agent is indifferent between consumption in the current period and consumption in any future period, since both alternatives provide additional utility given by u'(c_t). In the Appendix to this chapter, the problem's solution is derived by means of dynamic programming techniques.

The evolution over time of marginal utility and consumption is governed by the difference between the rate of return r and the intertemporal rate of time preference ρ. Since u''(c_t) < 0, lower consumption yields higher marginal utility: if r > ρ, the consumer will find it optimal to increase consumption over time, exploiting a return on saving higher than the utility discount rate; when r = ρ, optimal consumption is constant, and when r < ρ it is decreasing. The shape of marginal utility as a function of c (i.e. the concavity of the utility function) determines the magnitude of the effect of r − ρ on the time path of consumption: if |u''(c)| is large relative to u'(c), large variations of marginal utility are associated with relatively small fluctuations in consumption, and then optimal consumption shows little change over time even when the rate of return differs substantially from the utility discount rate.

Also, the agent's degree of risk aversion is determined by the concavity of the utility function. It has already been mentioned that our assumptions on preferences imply a negative relationship between risk aversion and intertemporal substitutability (where the latter measures the change in consumption between two successive periods owing to the difference between r and ρ or, if r is not constant, to changes in the rate of return). It is easy to find such a relationship for the case of a CRRA (constant relative risk aversion) utility function, namely

u(c_t) = (c_t^{1−γ} − 1)/(1 − γ),   γ > 0,

with u'(c) = c^{−γ}. The degree of relative risk aversion—whose general measure is the absolute value of the elasticity of marginal utility, −u''(c) c / u'(c)—is in this case independent of the consumption level, and is equal to the parameter γ.⁴ The measure of intertemporal substitutability is obtained by solving the consumer's optimization problem under certainty. The Euler equation corresponding to (1.5) is

c_t^{−γ} = ((1+r)/(1+ρ)) c_{t+1}^{−γ}   ⇒   (c_{t+1}/c_t)^γ = (1+r)/(1+ρ).

Taking logarithms, and using the approximations log(1+r) ≈ r and log(1+ρ) ≈ ρ, we get

Δ log c_{t+1} = (1/γ)(r − ρ).

Footnote 4: The denominator of the CRRA utility function is zero if γ = 1, but marginal utility can nevertheless have unitary elasticity: in fact, u'(c) = c^{−γ} = 1/c if u(c) = log(c). The presence of the constant term "−1" in the numerator makes utility well defined also when γ → 1. This limit can be computed, by l'Hôpital's rule, as the ratio of the limits of the numerator's derivative, d c^{1−γ}/dγ = −log(c) c^{1−γ}, and the denominator's derivative, which is −1.

The elasticity of intertemporal substitution, which is the effect of changes in the interest rate on the growth rate of consumption Δ log c, is constant and is measured by the reciprocal of the coefficient of relative risk aversion γ.
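As a quick numerical check of this result (an illustrative sketch, not part of the original text; the interest rate, discount rate, and risk-aversion values below are assumptions), the exact consumption growth implied by the CRRA Euler equation under certainty can be compared with the approximation Δ log c_{t+1} = (r − ρ)/γ:

    # Consumption growth implied by the CRRA Euler equation under certainty:
    # (c_{t+1}/c_t)^gamma = (1+r)/(1+rho), so log growth = (1/gamma)*log((1+r)/(1+rho)).
    # Parameter values are purely illustrative.
    import math

    r, rho = 0.04, 0.02                      # interest rate and rate of time preference
    for gamma in (0.5, 1.0, 2.0, 5.0):       # coefficient of relative risk aversion
        exact = math.log((1 + r) / (1 + rho)) / gamma
        approx = (r - rho) / gamma           # uses the log(1+x) ~ x approximation of the text
        print(f"gamma={gamma:3.1f}  exact={exact:.4%}  approx={approx:.4%}")

The smaller is γ (i.e. the larger the elasticity of intertemporal substitution 1/γ), the more consumption growth responds to the gap between r and ρ.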
    1.1.2. CONSUMPTION LEVEL AND DYNAMICS
Under uncertainty, the expected value of utility may well differ from its realization. Letting

u'(c_{t+1}) − E_t u'(c_{t+1}) ≡ η_{t+1},

we have by definition that E_t η_{t+1} = 0 under the hypothesis of rational expectations. Then, from (1.5), we get

u'(c_{t+1}) = ((1+ρ)/(1+r)) u'(c_t) + η_{t+1}.   (1.6)

If we assume also that r = ρ, the stochastic process describing the evolution over time of marginal utility is

u'(c_{t+1}) = u'(c_t) + η_{t+1},   (1.7)

and the change of marginal utility from t to t+1 is given by a stochastic term that is unforecastable at time t (E_t η_{t+1} = 0).

In order to derive the implications of the above result for the dynamics of consumption, it is necessary to specify a functional form for u(c). To obtain a linear relation like (1.7), directly involving the level of consumption, we can assume a quadratic utility function u(c) = c − (b/2)c², with linear marginal utility u'(c) = 1 − bc (positive only for c < 1/b). This simple and somewhat restrictive assumption lets us rewrite equation (1.7) as

c_{t+1} = c_t + u_{t+1},   (1.8)

where u_{t+1} ≡ −(1/b) η_{t+1} is such that E_t u_{t+1} = 0. If marginal utility is linear in consumption, as is the case when the utility function is quadratic, the process (1.8) followed by the level of consumption is a martingale, or a random walk, with the property⁵

E_t c_{t+1} = c_t.   (1.9)

This is the main implication of the intertemporal choice model with rational expectations and quadratic utility: the best forecast of next period's consumption is current consumption. The consumption change from t to t+1 cannot be forecast on the basis of information available at t: formally, u_{t+1} is orthogonal to the information set used to form the expectation E_t, including all variables known to the consumer and dated t, t−1, ... This implication has been widely tested empirically. Such orthogonality tests will be discussed below.

Footnote 5: A martingale is a stochastic process x_t with the property E_t x_{t+1} = x_t. With r = ρ, marginal utility and, under the additional hypothesis of quadratic utility, the level of consumption have this property. No assumptions have been made about the distribution of the process x_{t+1} − x_t, for example concerning time-invariance, which is a feature of a random walk process.

The solution of the consumer's intertemporal choice problem given by (1.8) cannot be interpreted as a consumption function. Indeed, that equation does not link consumption in each period to its determinants (income, wealth, rate of interest), but only describes the dynamics of consumption from one period to the next. The assumptions listed above, however, make it possible to derive the consumption function, combining what we know about the dynamics of optimal consumption and the intertemporal budget constraint (1.4). Since the realizations of income and consumption must fulfill the constraint, (1.4) holds also with expected values:

(1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i E_t c_{t+i} = (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i E_t y_{t+i} + A_t.   (1.10)

Linearity of the marginal utility function, and a discount rate equal to the interest rate, imply that the level of consumption expected for any future period is equal to current consumption. Substituting E_t c_{t+i} with c_t on the left-hand side of (1.10), we get

(1/r) c_t = A_t + (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i E_t y_{t+i} ≡ A_t + H_t.   (1.11)

The last term in (1.11), the present value at t of future expected labor incomes, is the consumer's "human wealth" H_t. The consumption function can then be written as

c_t = r (A_t + H_t) ≡ y^P_t.   (1.12)

Consumption in t is now related to its determinants, the levels of financial wealth A_t and human wealth H_t. The consumer's overall wealth at the beginning of period t is given by A_t + H_t. Consumption in t is then the annuity value of total wealth, that is, the return on wealth in each period: r(A_t + H_t). That return, which we define as permanent income (y^P_t), is the flow that could be earned for ever on the stock of total wealth. The conclusion is that the agent chooses to consume in each period exactly his permanent income, computed on the basis of expectations of future labor incomes.

1.1.3. DYNAMICS OF INCOME, CONSUMPTION, AND SAVING

Given the consumption function (1.12), we note that the evolution through time of consumption and permanent income coincide. Leading (1.12) one period, we have

y^P_{t+1} = r (A_{t+1} + H_{t+1}).   (1.13)

Taking the expectation at time t of y^P_{t+1}, subtracting the resulting expression from (1.13), and noting that E_t A_{t+1} = A_{t+1} from (1.2), since realized income y_t is included in the consumer's information set at t, we get

y^P_{t+1} − E_t y^P_{t+1} = r (H_{t+1} − E_t H_{t+1}).   (1.14)

Permanent income calculated at time t+1, conditional on information available at that time, differs from the expectation formed one period earlier, conditional on information at t, only if there is a "surprise" in the agent's human wealth at time t+1. In other words, the "surprise" in permanent income at t+1 is equal to the annuity value of the "surprise" in human wealth arising from new information on future labor incomes, available only at t+1.

Since c_t = y^P_t, from (1.9) we have E_t y^P_{t+1} = y^P_t. All information available at t is used to calculate permanent income y^P_t, which is also the best forecast of the next period's permanent income. Using this result, the evolution over time of permanent income can be written as

y^P_{t+1} = y^P_t + r [ (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i (E_{t+1} − E_t) y_{t+1+i} ],

where the "surprise" in human wealth in t+1 is expressed as the revision in expectations of future incomes: y^P can change over time only if those expectations change, that is if, when additional information accrues to the agent in t+1, (E_{t+1} − E_t) y_{t+1+i} ≡ E_{t+1} y_{t+1+i} − E_t y_{t+1+i} is not zero for all i. The evolution over time of consumption follows that of permanent income, so that we can write

c_{t+1} = c_t + r [ (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i (E_{t+1} − E_t) y_{t+1+i} ] = c_t + u_{t+1}.   (1.15)

It can be easily verified that the change of consumption between t and t+1 cannot be foreseen as of time t (since it depends only on information available in t+1): E_t u_{t+1} = 0. Thus, equation (1.15) enables us to attach a well defined economic meaning and a precise measure to the error term u_{t+1} in the Euler equation (1.8).

Intuitively, permanent income theory has important implications not only for the optimal consumption path, but also for the behavior of the agent's saving, which governs the accumulation of her financial wealth. To discover these implications, we start from the definition of disposable income y^D, the sum of labor income and the return on financial wealth: y^D_t = r A_t + y_t. Saving s_t (the difference between disposable income and consumption) is easily derived by means of the main implication of permanent income theory (c_t = y^P_t):

s_t ≡ y^D_t − c_t = y^D_t − y^P_t = y_t − r H_t.   (1.16)

The level of saving in period t is then equal to the difference between current (labor) income y_t and the annuity value of human wealth r H_t. Such a difference, being transitory income, does not affect consumption: if it is positive it is entirely saved, whereas if it is negative it determines a decumulation of financial assets of an equal amount. Thus, the consumer, faced with a variable labor income, changes the stock of financial assets so that the return earned on it (rA) allows her to keep consumption equal to permanent income. Unfolding the definition of human wealth H_t in (1.16), we can write saving at t as

s_t = y_t − (r/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i E_t y_{t+i}
    = (1/(1+r)) y_t − [1/(1+r) − (1/(1+r))²] E_t y_{t+1} − [(1/(1+r))² − (1/(1+r))³] E_t y_{t+2} − ...
    = − Σ_{i=1}^∞ (1/(1+r))^i E_t Δy_{t+i},   (1.17)

where Δy_{t+i} = y_{t+i} − y_{t+i−1}.
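The mechanics of equations (1.12), (1.16), and (1.17) can be illustrated with a small simulation (a sketch only: the AR(1) income process anticipates equation (1.18) below, and all parameter values are assumptions introduced for the example):

    # Permanent-income consumption and "saving for a rainy day" under certainty equivalence.
    # Income follows an AR(1) around its mean; human wealth is computed from the implied
    # expectations, consumption is the annuity value of total wealth (eqs. 1.11-1.12), and
    # saving is disposable income minus consumption (eq. 1.16). Illustrative parameters only.
    import numpy as np

    rng = np.random.default_rng(0)
    r, lam, ybar, T = 0.02, 0.6, 1.0, 500    # interest rate, persistence, mean income, sample size
    y, A = ybar, 0.0
    consumption = []
    for t in range(T):
        H = ybar / r + (y - ybar) / (1 + r - lam)   # human wealth under AR(1) expectations
        c = r * (A + H)                             # consumption = permanent income
        s = r * A + y - c                           # saving (eq. 1.16)
        consumption.append(c)
        A += s                                      # accumulation: s_t = A_{t+1} - A_t
        y = lam * y + (1 - lam) * ybar + rng.normal(0.0, 0.1)

    dc = np.diff(consumption)
    # Consumption changes should be (nearly) serially uncorrelated, as in (1.8)-(1.9).
    print("autocorrelation of consumption changes:", round(np.corrcoef(dc[1:], dc[:-1])[0, 1], 3))

In the simulation, saving is positive exactly when current income exceeds the annuity value of human wealth, in line with (1.16)–(1.17).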
Equation (1.17) sheds further light on the motivation for saving in this model: the consumer saves, accumulating financial assets, to face expected future declines of labor income (a "saving for a rainy day" behavior). Equation (1.17) has been extensively used in the empirical literature, and its role will be discussed in depth in Section 1.2.

1.1.4. CONSUMPTION, SAVING, AND CURRENT INCOME

Under certainty about future labor incomes, permanent income does not change over time. As a consequence, with r = ρ, consumption is constant and unrelated to current income y_t. On the contrary, when future incomes are uncertain, permanent income changes when new information causes a revision in expectations. Moreover, there is a link between current income and consumption if changes in income cause revisions in the consumer's expected permanent income. To explore the relation between current and permanent income, we assume a simple first-order autoregressive process generating income y:

y_{t+1} = λ y_t + (1 − λ) ȳ + ε_{t+1},   E_t ε_{t+1} = 0,   (1.18)

where 0 ≤ λ ≤ 1 is a parameter and ȳ denotes the unconditional mean of income. The stochastic term ε_{t+1} is the component of income at t+1 that cannot be forecast on the basis of information available at t (i.e. the income innovation). Suppose that the stochastic process for income is in the consumer's information set. From (1.18) we can compute the revision, between t and t+1, of expectations of future incomes caused by a given realization of the stochastic term ε_{t+1}. The result of this calculation will then be substituted into (1.15) to obtain the effect on consumption c_{t+1}.

The revision in expectations of future incomes is given by

E_{t+1} y_{t+1+i} − E_t y_{t+1+i} = λ^i ε_{t+1},   ∀ i ≥ 0.

Substituting this expression into (1.15) for each period t+1+i, we have

r [ (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i λ^i ε_{t+1} ] = ε_{t+1} (r/(1+r)) Σ_{i=0}^∞ (λ/(1+r))^i,   (1.19)

and solving the summation, we get⁶

c_{t+1} = c_t + (r/(1+r−λ)) ε_{t+1},   (1.20)

which directly links the current income innovation ε_{t+1} to current consumption c_{t+1}. Like equation (1.8), (1.20) is a Euler equation; the error term is the innovation in permanent income, here expressed in terms of the current income innovation. Given an unexpected increase of income in period t+1 equal to ε_{t+1}, the consumer increases consumption in t+1 and expected consumption in all future periods by the annuity value of the increase in human wealth, r ε_{t+1}/(1+r−λ). The portion of additional income that is not consumed, i.e.

ε_{t+1} − (r/(1+r−λ)) ε_{t+1} = ((1−λ)/(1+r−λ)) ε_{t+1},

is saved and added to the outstanding stock of financial assets. Starting from the next period, the return on this saving will add to disposable income, enabling the consumer to keep the higher level of consumption over the whole infinite future horizon.

Footnote 6: The right-hand side expression in (1.19) can be written ε_{t+1} (r/(1+r)) S_∞(λ/(1+r)) if we denote by S_N(α) a geometric series with parameter α, of order N. Since S_N(α) − α S_N(α) = (1 + α + α² + ... + α^N) − (α + α² + α³ + ... + α^{N+1}) = 1 − α^{N+1}, such a series takes values S_N(α) = (1 − α^{N+1})/(1 − α) and, as long as α < 1, converges to S_∞(α) = (1 − α)^{−1} as N tends to infinity. Using this formula in (1.19) yields the result.

It is important to notice that the magnitude of the consumption change between t and t+1 resulting from an innovation in current income ε_{t+1} depends, for a given interest rate r, on the parameter λ, which captures the degree of persistence of an innovation in t+1 on future incomes. To see the role of this parameter, it is useful to consider two polar cases.

1. λ = 0. In this case y_{t+1} = ȳ + ε_{t+1}. The innovation in current income is purely transitory and does not affect the level of income in future periods. Given an innovation ε_{t+1}, the consumer's human wealth, calculated at the beginning of period t+1, changes by ε_{t+1}/(1+r). This change in H_{t+1} determines a variation of permanent income—and consumption—equal to r ε_{t+1}/(1+r). In fact, from (1.20) with λ = 0, we have

c_{t+1} = c_t + (r/(1+r)) ε_{t+1}.   (1.21)

2. λ = 1. In this case y_{t+1} = y_t + ε_{t+1}. The innovation in current income is permanent, causing an equal change of all future incomes. The change in human wealth is then ε_{t+1}/r and the variation in permanent income and consumption is simply ε_{t+1}. From (1.20), with λ = 1, we get

c_{t+1} = c_t + ε_{t+1}.

Exercise 1 In the two polar cases λ = 0 and λ = 1, find the effect of ε_{t+1} on saving in t+1 and on saving and disposable income in the following periods.

Exercise 2 Using the stochastic process for labor income in (1.18), prove that the consumption function that holds in this case (linking c_t to its determinants A_t, y_t, and ȳ) has the following form:

c_t = r A_t + (r/(1+r−λ)) y_t + ((1−λ)/(1+r−λ)) ȳ.

What happens if λ = 1 and if λ = 0?

1.2. Empirical Issues

The dynamic implications of the permanent income model of consumption illustrated above have motivated many recent empirical studies on consumption. Similarly, the life-cycle theory of consumption developed mainly by F. Modigliani has been subjected to empirical scrutiny. The partial-equilibrium perspective of this chapter makes it difficult to discuss the relationship between long-run saving and growth rates at the aggregate level: as we shall see in Chapter 4, the link between income growth and saving depends also on the interest rate, and becomes more complicated when the assumption of an exogenously given income process is abandoned. But even empirical studies based on cross-sectional individual data show that saving, if any, occurs only in the middle and old stages of the agent's life: consumption tracks income too closely to explain wealth accumulation only on the basis of a life-cycle motive.

As regards aggregate short-run dynamics, the first empirical test of the fundamental implication of the permanent income/rational expectations model of consumption is due to R. E. Hall (1978), who tests the orthogonality of the error term in the Euler equation with respect to past information. If the theory is correct, no variable known at time t−1 can explain changes in consumption between t−1 and t. Formally, the test is carried out by evaluating the statistical significance of variables dated t−1 in the Euler equation for time t. For example, augmenting the Euler equation with the income change that occurred between t−2 and t−1, we get

Δc_t = α Δy_{t−1} + e_t,   (1.22)

where α = 0 if the permanent income theory holds. Hall's results for the USA show that the null hypothesis cannot be rejected for several past aggregate variables, including income.
However, some lagged variables (such as a stock index) are significant when added to the Euler equation, casting some doubt on the validity of the model's basic version. Since Hall's contribution, the empirical literature has further investigated the dynamic implications of the theory, focusing mainly on two empirical regularities apparently at variance with the model: consumption's excess sensitivity to current income changes, and its excess smoothness in response to income innovations. The remainder of this section illustrates these problems and shows how they are related.

1.2.1. EXCESS SENSITIVITY OF CONSUMPTION TO CURRENT INCOME

A different test of the permanent income model was originally proposed by M. Flavin (1981). Flavin's test is based on (1.15) and an additional equation for the stochastic process for income y_t. Consider the following stochastic process for income (AR(1) in first differences):

Δy_t = μ + λ Δy_{t−1} + ε_t,   (1.23)

where ε_t is the change in current income, Δy_t, that is unforecastable using past income realizations. According to the model, the change in consumption between t−1 and t is due to the revision of expectations of future incomes caused by ε_t. Letting θ denote the intensity of this effect, the behavior of consumption is then

Δc_t = θ ε_t.   (1.24)

Consumption is excessively sensitive to current income if c_t reacts to changes of y_t by more than is justified by the change in permanent income, measured by θ ε_t. Empirically, the excess sensitivity hypothesis is formalized by augmenting (1.24) with the change in current income:

Δc_t = β Δy_t + θ ε_t + v_t,   (1.25)

where β (if positive) measures the overreaction of consumption to a change in current income, and v_t captures the effect on consumption of information about permanent income, available to agents at t but unrelated to current income changes.

According to the permanent income model, an increase in current income causes a change in consumption only by the amount warranted by the revision of permanent income. Only innovations (that is, unpredictable changes) in income cause consumption changes: the term θ ε_t in (1.25) captures precisely this effect. An estimated value of β greater than zero is then interpreted as signaling an overreaction of consumption to anticipated changes in income.

The test on β in (1.25) is equivalent to Hall's orthogonality test in (1.22). In fact, substituting the stochastic process for income (1.23) into (1.25), we get

Δc_t = β μ + β λ Δy_{t−1} + (θ + β) ε_t + v_t.   (1.26)

From this expression for the consumption change, we note that the hypothesis β = 0 in (1.25) implies that α = 0 in (1.22): if consumption is excessively sensitive to income, then the orthogonality property of the error term in the equation for Δc_t does not hold. Equation (1.26) highlights a potential difficulty with the orthogonality test. Indeed, Δc_t may be found to be uncorrelated with Δy_{t−1} if the latter does not forecast future income changes. In this case λ = 0 and the orthogonality test fails to reject the theory, even though consumption is excessively sensitive to predictable changes in income. Thus, differently from Hall's test, Flavin's approach provides an estimate of the excess sensitivity of consumption, measured by β, which is around 0.36 on US quarterly data over the 1949–79 period.⁷

Among the potential explanations for the excess sensitivity of consumption, a strand of the empirical literature has focused on the existence of liquidity constraints, which limit the consumer's borrowing capability, thus preventing the realization of the optimal consumption plan. With binding liquidity constraints, an increase in income, though perfectly anticipated, affects consumption only when it actually occurs.⁸ A different rationale for excess sensitivity, based on the precautionary saving motive, will be analyzed in Section 1.3.⁹

Footnote 7: However, Flavin's test cannot provide an estimate of the change in permanent income resulting from a current income innovation, θ, if ε and v in (1.26) have a non-zero covariance. Using aggregate data, any change in consumption due to v_t is also reflected in innovations in current income ε_t, since consumption is a component of aggregate income. Thus, the covariance between ε and v tends to be positive.

Footnote 8: Applying instrumental variables techniques to (1.25), Campbell and Mankiw (1989, 1991) directly interpret the estimated β as the fraction of liquidity-constrained consumers, who simply spend their current income.

Footnote 9: While we do not focus in this chapter on aggregate equilibrium considerations, it is worth mentioning that binding liquidity constraints and precautionary savings both tend to increase the aggregate saving rate: see Aiyagari (1994), Jappelli and Pagano (1994).

1.2.2. RELATIVE VARIABILITY OF INCOME AND CONSUMPTION

One of the most appealing features of the permanent income theory, since the original formulation due to M. Friedman, is a potential explanation of why consumption is typically less volatile than current income: even in simple textbook Keynesian models, a marginal propensity to consume c < 1 in aggregate consumption functions of the form C = c̄ + cY is crucial in obtaining the basic concept of the multiplier of autonomous expenditure. By relating consumption not to current but to permanent, presumably less volatile, income, the limited reaction of consumption to changes in current income is theoretically motivated. The model developed thus far, adopting the framework of intertemporal optimization under rational expectations, derives the implications of this original intuition, and formalizes the relationship between current income, consumption, and saving. (We shall discuss in the next chapter formalizations of simple textbook insights regarding investment dynamics: investment, like changes in consumption, is largely driven by revisions of expectations regarding future variables.) In particular, according to the theory, the agent chooses current consumption on the basis of all available information on future incomes and changes optimal consumption over time only in response to unanticipated changes (innovations) in current income, causing revisions in permanent income.

Therefore, on the empirical level, it is important to analyze the relationship between current income innovations and changes in permanent income, taking into account the degree of persistence over time of such innovations. The empirical research on the properties of the stochastic process generating income has shown that income y is non-stationary: an innovation at time t does not cause a temporary deviation of income from trend, but has permanent effects on the level of y, which does not display any tendency to revert to a deterministic trend. (For example, in the USA the estimated long-run change in income is around 1.6 times the original income innovation.¹⁰) The implication of this result is that consumption, being determined by permanent income, should be more volatile than current income.

Footnote 10: The non-stationarity of income (in the USA and in other countries as well) is still an open issue. Indeed, some authors argue that, given the low power of the statistical tests used to assess the non-stationarity of macroeconomic time series, it is impossible to distinguish between non-stationarity and the existence of a deterministic time trend on the basis of available data.

To clarify this point, consider again the following process for income:

Δy_{t+1} = μ + λ Δy_t + ε_{t+1},   (1.27)

where μ is a constant, 0 < λ < 1, and E_t ε_{t+1} = 0. The income change between t and t+1 follows a stationary autoregressive process; the income level is permanently affected by innovations ε.¹¹

Footnote 11: A stochastic process of this form, with λ = 0.44, is a fairly good statistical description of the (aggregate) income dynamics for the USA, as shown by Campbell and Deaton (1989) using quarterly data for the period 1953–84.

To obtain the effect on permanent income and consumption of an innovation ε_{t+1} when income is governed by (1.27), we can apply the following property of ARMA stochastic processes, which holds whether or not income is stationary (Deaton, 1992). For a given stochastic process for y of the form

a(L) y_t = μ + b(L) ε_t,

where a(L) = a₀ + a₁L + a₂L² + ... and b(L) = b₀ + b₁L + b₂L² + ... are two polynomials in the lag operator L (such that, for a generic variable x, we have L^i x_t = x_{t−i}), we derive the following expression for the change in permanent income (and consequently in consumption):¹²

(r/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i (E_{t+1} − E_t) y_{t+1+i} = (r/(1+r)) [ Σ_{i=0}^∞ (1/(1+r))^i b_i / Σ_{i=0}^∞ (1/(1+r))^i a_i ] ε_{t+1}.   (1.28)

Footnote 12: The following formula can also be obtained by computing the revisions in expectations of future incomes, as has already been done in Section 1.1.

In the case of (1.27), we can write y_t = μ + (1+λ) y_{t−1} − λ y_{t−2} + ε_t; hence we have a(L) = 1 − (1+λ)L + λL² and b(L) = 1. Applying the general formula (1.28) to this process, we get

Δc_{t+1} = (r/(1+r)) ( r(1+r−λ)/(1+r)² )^{−1} ε_{t+1} = ((1+r)/(1+r−λ)) ε_{t+1}.

This is formally quite similar to (1.20) but, because the income process is stationary only in first differences, it features a different numerator on the right-hand side: the relation between the innovation ε_{t+1} and the change in consumption Δc_{t+1} is linear, but the slope is greater than 1 if λ > 0 (that is if, as is
    realistic in business-cycle fluctuations, above-average growth tends to be fol-
    lowed by still fast—if mean-reverting—growth in the following period). The
    same coefficient measures the ratio of the variability of consumption (given
    by the standard deviation of the consumption change) and the variability
    of income (given by the standard deviation of the innovation in the income
    process):
σ_Δc / σ_ε = (1+r)/(1+r−λ).

For example, λ = 0.44 and a (quarterly) interest rate of 1% yield a coefficient
    of 1.77. The implied variability of the (quarterly) change of consumption
    would be 1.77 times that of the income innovation. For non-durable goods
    and services, Campbell and Deaton (1989) estimate a coefficient of only
    0.64. Then, the response of consumption to income innovations seems to
    be at variance with the implications of the permanent income theory: the
    reaction of consumption to unanticipated changes in income is too smooth
    (this phenomenon is called excess smoothness). This conclusion could be
    questioned by considering that the estimate of the income innovation, ε,
    depends on the variables included in the econometric specification of the
    income process. In particular, if a univariate process like (1.27) is specified,
    the information set used to form expectations of future incomes and to derive
    innovations is limited to past income values only. If agents form their expecta-
    tions using additional information, not available to the econometrician, then
    the “true” income innovation, which is perceived by agents and determines
    changes in consumption, will display a smaller variance than the innovation
    estimated by the econometrician on the basis of a limited information set.
    Thus, the observed smoothness of consumption could be made consistent
    with theory if it were possible to measure the income innovations perceived
    by agents.13
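Using the figures just cited, the theoretical volatility ratio can be reproduced directly (a one-line check rather than new material; the numbers come from the text, not from any new estimation):

    # Ratio of consumption-change volatility to income-innovation volatility implied by (1.27):
    # sigma_dc / sigma_eps = (1 + r) / (1 + r - lambda).
    lam, r = 0.44, 0.01                        # Campbell-Deaton persistence estimate, quarterly interest rate
    print(round((1 + r) / (1 + r - lam), 2))   # about 1.77, against an estimated 0.64 in the data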
    A possible solution to this problem exploits the essential feature of the
    permanent income theory under rational expectations: agents choose optimal
    consumption (and saving) using all available information on future incomes.
    ¹³ Relevant research includes Pischke (1995) and Jappelli and Pistaferri (2000).

    It is the very behavior of consumers that reveals their available information.
    If such behavior is observed by the econometrician, it is possible to use it
    to construct expected future incomes and the associated innovations. This
    approach has been applied to saving, which, as shown by ( 1.17), depends
    on expected future changes in income.
    To formalize this point, we start from the definition of saving and make
    explicit the information set used by agents at time t to forecast future
    incomes, It :
s_t = − Σ_{i=1}^∞ (1/(1+r))^i E(Δy_{t+i} | I_t).   (1.29)
The information set available to the econometrician is Ω_t, with Ω_t ⊆ I_t (agents know everything the econometrician knows, but the reverse is not necessarily true). Moreover, we assume that saving is observed by the econometrician: s_t ∈ Ω_t. Then, taking the expected value of both sides of (1.29) with respect to the information set Ω_t and applying the "law of iterated expectations," we get

E(s_t | Ω_t) = − Σ_{i=1}^∞ (1/(1+r))^i E[ E(Δy_{t+i} | I_t) | Ω_t ]
  ⇒  s_t = − Σ_{i=1}^∞ (1/(1+r))^i E(Δy_{t+i} | Ω_t),   (1.30)

where we use the assumption that saving is included in Ω_t. According to the theory, then, saving is determined by the discounted future changes in labor incomes, even if they are forecast on the basis of the smaller information set Ω_t.
    Since saving choices, according to (1.29), are made on the basis of all
    information available to agents, it is possible to obtain predictions on future
    incomes that do not suffer from the limited information problem typical of
    the univariate models widely used in the empirical literature. Indeed, pre-
    dictions can be conditioned on past saving behavior, thus using the larger
    information set available to agents. This is equivalent to forming predictions
of income changes Δy_t by using not only past changes, Δy_{t−1}, but also past saving, s_{t−1}.
    In principle, this extension of the forecasting model for income could
reduce the magnitude of the estimated innovation variance σ_ε. In practice,
    as is shown in some detail below, the evidence of excess smoothness of con-
    sumption remains unchanged after this extension.
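The idea that past saving carries the agents' extra information about future income changes can be sketched as follows (simulated data; the "news" variable, coefficients, and sample size are assumptions introduced only for this illustration):

    # Forecasting income changes with and without lagged saving. Saving responds to news about
    # future income changes (eq. 1.30), so adding it to the forecasting equation raises fit.
    import numpy as np

    rng = np.random.default_rng(2)
    T = 500
    news = rng.normal(size=T)        # information observed by agents, not by the econometrician
    eps = rng.normal(size=T)
    dy = np.zeros(T)
    for t in range(1, T):
        dy[t] = 0.4 * dy[t - 1] - 0.8 * news[t - 1] + eps[t]   # income changes
    s = 0.8 * news                   # saving rises when income is expected to fall ("rainy day")

    def r_squared(y, X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return 1 - ((y - X @ beta).var() / y.var())

    X_uni = np.column_stack([np.ones(T - 1), dy[:-1]])           # lagged income changes only
    X_aug = np.column_stack([np.ones(T - 1), dy[:-1], s[:-1]])   # augmented with lagged saving
    print("R^2, lagged dy only      :", round(r_squared(dy[1:], X_uni), 3))
    print("R^2, adding lagged saving:", round(r_squared(dy[1:], X_aug), 3))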

    1.2.3. JOINT DYNAMICS OF INCOME AND SAVING
    Studying the implications derived from theory on the joint behavior of income
    and saving usefully highlights the connection between the two empirical puz-
    zles mentioned above (excess sensitivity and excess smoothness). Even though
    the two phenomena focus on the response of consumption to income changes
    of a different nature (consumption is excessively sensitive to anticipated
    income changes, and excessively smooth in response to unanticipated income
    variations), it is possible to show that the excess smoothness and excess sensi-
    tivity phenomena are different manifestations of the same empirical anomaly.
    To outline the connection between the two, we proceed in three successive
    steps.
    1. First, we assume a stochastic process jointly governing the evolution of
    income and saving over time and derive its implications for equations
    like (1.22), used to test the orthogonality property of the consumption
    change with respect to lagged variables. (Recall that the violation of
    the orthogonality condition entails excess sensitivity of consumption to
    predicted income changes.)
    2. Then, given the expectations of future incomes based on the assumed
    stochastic process, we derive the behavior of saving implied by theory
    according to (1.17), and obtain the restrictions that must be imposed on
    the estimated parameters of the process for income and saving to test the
    validity of the theory.
    3. Finally, we compare such restrictions with those required for the orthog-
    onality property of the consumption change to hold.
    We start with a simplified representation of the bivariate stochastic process
    governing income—expressed in first differences as in (1.27) to allow for
non-stationarity, and imposing μ = 0 for simplicity—and saving:

Δy_t = a11 Δy_{t−1} + a12 s_{t−1} + u1t,   (1.31)
s_t  = a21 Δy_{t−1} + a22 s_{t−1} + u2t.   (1.32)
    With s t−1 in the model, it is now possible to generate forecasts on future
    income changes by exploiting the additional informational value of past sav-
    ing. Inserting the definition of saving (s t = r At + yt − c t ) into the accumula-
    tion constraint (1.2), we get
    At +1 = At + (r At + yt − c t ) ⇒ s t = At +1 − At . (1.33)
    Obviously, the flow of saving is the change of the stock of financial assets
    from one period to the next, and this makes it possible to write the change in
    consumption by taking the first difference of the definition of saving

used above:

Δc_t = Δy_t + r ΔA_t − Δs_t
     = Δy_t + r s_{t−1} − s_t + s_{t−1}
     = Δy_t + (1+r) s_{t−1} − s_t.   (1.34)
Finally, substituting for Δy_t and s_t from (1.31) and (1.32), we obtain the following expression for the consumption change Δc_t:

Δc_t = γ1 Δy_{t−1} + γ2 s_{t−1} + v_t,   (1.35)

where

γ1 = a11 − a21,   γ2 = a12 − a22 + (1+r),   v_t = u1t − u2t.

The implication of the permanent income theory is that the consumption change between t−1 and t cannot be predicted on the basis of information available at time t−1. This entails the orthogonality restriction γ1 = γ2 = 0, which in turn imposes the following restrictions on the coefficients of the joint process generating income and saving:

a11 = a21,   a22 = a12 + (1+r).   (1.36)
If these restrictions are fulfilled, the consumption change Δc_t = u1t − u2t
    is unpredictable using lagged variables: the change in consumption (and in
    permanent income) is equal to the current income innovation (u1t ) less the
    innovation in saving (u2t ), which reflects the revision in expectations of future
    incomes calculated by the agent on the basis of all available information. Now,
    from the definition of savings (1.17), using the expectations of future income
    changes derived from the model in (1.31) and (1.32), it is possible to obtain
    the restrictions imposed by the theory on the stochastic process governing
    income and savings. Letting
x_t ≡ (Δy_t, s_t)′,   A ≡ [ a11  a12 ; a21  a22 ],   u_t = (u1t, u2t)′,

we can rewrite the process in (1.31)–(1.32) as

x_t = A x_{t−1} + u_t.   (1.37)

From (1.37), the expected values of Δy_{t+i} can be easily derived:

E_t x_{t+i} = A^i x_t,   i ≥ 0;

    hence (using a matrix algebra version of the geometric series formula)

− Σ_{i=1}^∞ (1/(1+r))^i E_t x_{t+i} = − Σ_{i=1}^∞ (1/(1+r))^i A^i x_t = − [ (I − (1/(1+r)) A)^{−1} − I ] x_t.   (1.38)
    The element of vector x we are interested in (saving s ) can be “extracted” by
    applying to x a vector e2 ≡ (0 1)′, which simply selects the second element of
    x. Similarly, to apply the definition in (1.17), we have to select the first element
    of the vector in (1.38) using e1 ≡ (1 0)′. Then we get
e2′ x_t = −e1′ [ (I − (1/(1+r)) A)^{−1} − I ] x_t   ⇒   e2′ = −e1′ [ (I − (1/(1+r)) A)^{−1} − I ],

yielding the relation

e2′ = (e2′ − e1′) (1/(1+r)) A.   (1.39)
    Therefore, the restrictions imposed by theory on the coefficients of matrix
    A are
    a11 = a21, a22 = a12 + (1 + r ). (1.40)
    These restrictions on the joint process for income and saving, which rule
    out the excess smoothness phenomenon, are exactly the same as those—in
    equation (1.35)—that must be fulfilled for the orthogonality property to hold,
    and therefore also ensure elimination of excess sensitivity.14 Summarizing,
    the phenomena of excess sensitivity and excess smoothness, though related to
    income changes of a different nature (anticipated and unanticipated, respec-
    tively), signal the same deviation from the implications of the permanent
    income theory. If agents excessively react to expected income changes, they
    must necessarily display a lack of reaction to unanticipated income changes.
    In fact, any variation in income is made up of a predicted component and a
    (unpredictable) innovation: if the consumer has an “excessive” reaction to the
    former component, the intertemporal budget constraint forces him to react in
    an “excessively smooth” way to the latter component of the change in current
    income.
    ¹⁴ The coincidence of the restrictions necessary for orthogonality and for ruling out excess smooth-
    ness is obtained only in the special case of a first-order stochastic process for income and saving. In
    the more general case analyzed by Flavin (1993), the orthogonality restrictions are nested in those
    necessary to rule out excess smoothness. Then, in general, orthogonality conditions analogous to
    (1.36) imply—but are not implied by—those analogous to (1.40).
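The equivalence between the restrictions (1.36) and the saving relation (1.38)–(1.40) can be checked numerically (a sketch with arbitrary illustrative coefficients; the only structure imposed is a21 = a11 and a22 = a12 + (1+r)):

    # Check that a matrix A satisfying a11 = a21 and a22 = a12 + (1+r) also satisfies
    # e2' = -e1' [ (I - A/(1+r))^{-1} - I ], i.e. saving equals minus the discounted
    # expected future income changes (eqs. 1.38-1.40).
    import numpy as np

    r, a11, a12 = 0.02, 0.3, -0.2
    A = np.array([[a11, a12],
                  [a11, a12 + (1 + r)]])          # impose the restrictions (1.36)/(1.40)
    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    rhs = -e1 @ (np.linalg.inv(np.eye(2) - A / (1 + r)) - np.eye(2))
    print("restriction satisfied:", np.allclose(e2, rhs))   # True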

    1.3. The Role of Precautionary Saving
    Recent developments in consumption theory have been aimed mainly at
    solving the empirical problems illustrated above. The basic model has been
    extended in various directions, by relaxing some of its most restrictive
    assumptions. On the one hand, as already mentioned, liquidity constraints
    can prevent the consumer from borrowing as much as required by the optimal
    consumption plan. On the other hand, it has been recognized that in the basic
model saving is motivated only by a rate of interest higher than the rate of
    time preference and/or by the need for redistributing income over time, when
    current incomes are unbalanced between periods. Additional motivations for
    saving may be relevant in practice, and may contribute to the explanation of,
    for example, the apparently insufficient decumulation of wealth by older gen-
    erations, the high correlation between income and consumption of younger
    agents, and the excess smoothness of consumption in reaction to income
    innovations. This section deals with the latter strand of literature, studying
    the role of a precautionary saving motive in shaping consumers’ behavior.
    First, we will spell out the microeconomic foundations of precautionary
    saving, pointing out which assumption of the basic model must be relaxed to
    allow for a precautionary saving motive. Then, under the new assumptions,
    we shall derive the dynamics of consumption and the consumption function,
    and compare them with the implications of the basic version of the permanent
    income model previously illustrated.
    1.3.1. MICROECONOMIC FOUNDATIONS
    Thus far, with a quadratic utility function, uncertainty has played only a
    limited role. Indeed, only the expected value of income y affects consumption
    choices—other characteristics of the income distribution (e.g. the variance)
    do not play any role.
    With quadratic utility, marginal utility is linear and the expected value of
    the marginal utility of consumption coincides with the marginal utility of
    expected consumption. An increase in uncertainty on future consumption,
    with an unchanged expected value, does not cause any reaction by the con-
    sumer.15 As we shall see, if marginal utility is a convex function of consump-
    tion, then the consumer displays a prudent behavior, and reacts to an increase
    in uncertainty by saving more: such saving is called precautionary, since it
    depends on the uncertainty about future consumption.
    ¹⁵ In the basic version of the model, the consumer is interested only in the certainty equivalent value
    of future consumption.
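The certainty-equivalence property of quadratic utility can be seen in a simple two-point example (illustrative numbers only; the CRRA comparison anticipates the role of convex marginal utility discussed next):

    # With quadratic utility, marginal utility u'(c) = 1 - b*c is linear, so E[u'(c)] = u'(E[c])
    # and the spread of the consumption distribution is irrelevant. With CRRA utility, marginal
    # utility is convex and Jensen's inequality gives E[u'(c)] > u'(E[c]).
    import numpy as np

    c = np.array([0.5, 1.5])            # two equally likely consumption levels, mean 1.0
    p = np.array([0.5, 0.5])
    b, gamma = 0.4, 2.0                 # illustrative preference parameters

    for name, mu in [("quadratic", lambda x: 1 - b * x), ("CRRA", lambda x: x ** -gamma)]:
        print(f"{name:9s}  E[u'(c)] = {np.sum(p * mu(c)):.3f}   u'(E[c]) = {mu(np.sum(p * c)):.3f}")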

    Convexity of the marginal utility function u′(c ) implies a positive sign
    of its second derivative, corresponding to the third derivative of the utility
    function: u′′′(c ) > 0. A precautionary saving motive, which does not arise
    with quadratic utility (u′′′(c ) = 0), requires the use of different functional
    forms, such as exponential utility.16 With risk aversion (u′′(c ) < 0) and convex marginal utility (u′′′(c ) > 0), under uncertainty about future incomes (and
    consumption), unfavorable events determine a loss of utility greater than the
    gain in utility obtained from favorable events of the same magnitude. The
    consumer fears low-income states and adopts a prudent behavior, saving in
    the current period in order to increase expected future consumption.
    An example can make this point clearer. Consider a consumer living for two
    periods, t and t + 1, with no financial wealth at the beginning of period t. In
    the first period labor income is ȳ with certainty, whereas in the second period
    it can take one of two values—y^A_{t+1} or y^B_{t+1} < y^A_{t+1}—with equal
    probability. To focus on the precautionary motive, we rule out any other
    motivation for saving by assuming that E_t(y_{t+1}) = ȳ and r = ρ = 0. In
    equilibrium the following relation holds: E_t u′(c_{t+1}) = u′(c_t). At time t the
    consumer chooses saving s_t (equal to ȳ − c_t), and his consumption at time
    t + 1 will be equal to saving s_t plus realized income. Considering actual
    realizations of income, we can write the budget constraint as

        c^i_{t+1} = ȳ − c_t + y^i_{t+1} = s_t + y^i_{t+1},    i = A, B.

    Using the definition of saving, s_t ≡ ȳ − c_t, the Euler equation becomes

        E_t(u′(y_{t+1} + s_t)) = u′(ȳ − s_t).    (1.41)

    Now let us see how the consumer chooses saving in two different cases,
    beginning with that of linear marginal utility (u′′′(c) = 0). In this case we have
    E_t u′(·) = u′(E_t(·)). Recalling that E_t(y_{t+1}) = ȳ, condition (1.41) becomes

        u′(ȳ + s_t) = u′(ȳ − s_t),    (1.42)

    and is fulfilled by s_t = 0. The consumer does not save in the first period, and
    his second-period consumption will coincide with current income. The
    uncertainty on income in t + 1 reduces overall utility but does not induce the
    consumer to modify his choice: there is no precautionary saving.

    ¹⁶ A quadratic utility function has another undesirable property: it displays increasing absolute risk
    aversion. Formally, −u′′(c)/u′(c) is an increasing function of c. This implies that, to avoid uncertainty,
    the agent is willing to pay more the higher is his wealth, which is not plausible.

    Figure 1.1. Precautionary savings

    On the contrary, if, as in Figure 1.1, marginal utility is convex (u′′′(c) > 0),
    then, from "Jensen's inequality," E_t u′(c_{t+1}) > u′(E_t(c_{t+1})).¹⁷ If the consumer
    were to choose zero saving, as was optimal under a linear marginal utility, we
    would have (for s_t = 0, and using Jensen's inequality)

        E_t(u′(c_{t+1})) > u′(c_t).    (1.43)
    The optimality condition would be violated, and expected utility would not
    be maximized. To re-establish equality in the problem’s first-order condition,
    marginal utility must decrease in t + 1 and increase in t : as shown in the figure,
    this may be achieved by shifting an amount of resources s t from the first to the
    second period. As the consumer saves more, decreasing current consumption
    c t and increasing c t +1 in both states (good and bad), marginal utility in t
    increases and expected marginal utility in t + 1 decreases, until the optimal-
    ity condition is satisfied. Thus, with convex marginal utility, uncertainty on
    future incomes (and consumption levels) entails a positive amount of saving
    in the first period and determines a consumption path trending upwards over
    time (E_t c_{t+1} > c_t), even though the interest rate is equal to the utility discount
    rate. Formally, the relation between uncertainty and the upward consumption
    path depends on the degree of the consumer's prudence, which we now define
    rigorously. Approximating (by means of a second-order Taylor expansion)
    around c_t the left-hand side of the Euler equation E_t u′(c_{t+1}) = u′(c_t), we get

        E_t(c_{t+1} − c_t) = −(1/2) [u′′′(c_t)/u′′(c_t)] E_t[(c_{t+1} − c_t)²] ≡ (1/2) a E_t[(c_{t+1} − c_t)²],    (1.44)
    ¹⁷ Jensen’s inequality states that, given a strictly convex function f (x ) of a random variable x , then
    E ( f (x )) > f ( E x ).

    where a ≡ −u′′′(c_t)/u′′(c_t) is the coefficient of absolute prudence. Greater
    uncertainty, increasing E_t[(c_{t+1} − c_t)²], induces a larger increase in
    consumption between t and t + 1. The definition of the coefficient measuring
    prudence is formally similar to that of risk-aversion coefficients: however, the
    latter is related to the curvature of the utility function, whereas prudence is
    determined by the curvature of marginal utility. It is also possible to define
    the coefficient of relative prudence, −u′′′(c)c/u′′(c). Dividing both sides of
    (1.44) by c_t, we get

        E_t[(c_{t+1} − c_t)/c_t] = −(1/2) [u′′′(c_t)·c_t/u′′(c_t)] E_t[((c_{t+1} − c_t)/c_t)²] = (1/2) p E_t[((c_{t+1} − c_t)/c_t)²],

    where p ≡ −u′′′(c)·c/u′′(c) is the coefficient of relative prudence. Readers
    can check that this is constant for a CRRA function, and determine its
    relationship to the coefficient of relative risk aversion.
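    The role of convex marginal utility can also be checked numerically. The sketch below is our own illustration, not part of the text: it assumes CRRA preferences and arbitrary parameter values, solves the two-period Euler equation (1.41) with SciPy's root finder, and shows that optimal saving is zero without uncertainty and positive (and increasing) as the spread of second-period income grows. For CRRA utility the coefficient of relative prudence is the constant γ + 1.

    ```python
    # Minimal numerical sketch (illustrative assumptions): with CRRA utility,
    # whose marginal utility is convex, the two-period Euler equation (1.41)
    # requires positive precautionary saving.
    from scipy.optimize import brentq

    gamma, ybar = 3.0, 1.0          # relative risk aversion, certain first-period income

    def marg_u(c):                  # u'(c) = c^(-gamma): convex, so u'''(c) > 0
        return c ** (-gamma)

    def euler_gap(s, sigma):
        # E_t u'(c_{t+1}) - u'(c_t), with r = rho = 0 and income ybar +/- sigma
        lhs = 0.5 * (marg_u(ybar + sigma + s) + marg_u(ybar - sigma + s))
        return lhs - marg_u(ybar - s)

    for sigma in (0.0, 0.1, 0.3):
        s_star = brentq(euler_gap, -0.49, 0.49, args=(sigma,))
        print(f"sigma = {sigma:.1f}  ->  optimal saving s* = {s_star:.4f}")

    # Relative prudence for CRRA: -u'''(c) c / u''(c) = gamma + 1 (here 4), constant in c.
    ```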
    Exercise 3 Suppose that a consumer maximizes
    log (c 1) + E [log (c 2)]
    under the constraint c 1 + c 2 = w1 + w2 (i.e., the discount rate of period 2 utility
    and the rate of return on saving w1 − c 1 are both zero). When c 1 is chosen, there
    is uncertainty about w2: the consumer will earn w2 = x or w2 = y with equal
    probability. What is the optimal level of c 1?
    1.3.2. IMPLICATIONS FOR THE CONSUMPTION FUNCTION
    We now solve the consumer’s optimization problem in the case of a non-
    quadratic utility function, which motivates precautionary saving. The setup
    of the problem is still given by (1.1) and (1.2), but the utility function in each
    period is now of the exponential form:
        u(c_{t+i}) = −(1/γ) e^{−γ c_{t+i}},    (1.45)

    where γ > 0 is the coefficient of absolute prudence (and also, for such a
    constant absolute risk aversion—CARA—utility function, the coefficient of
    absolute risk aversion).¹⁸ Assume that labor income follows the AR(1)
    stochastic process:

        y_{t+i} = λ y_{t+i−1} + (1 − λ)ȳ + ε_{t+i},    (1.46)
    ¹⁸ Since for the exponential utility function u′(0) = 1 < ∞, in order to rule out negative values for
    consumption it would be necessary to explicitly impose a non-negativity constraint; however, a
    closed-form solution to the problem would not be available if that constraint were binding.

    where ε_{t+i} are independent and identically distributed (i.i.d.) random
    variables, with zero mean and variance σ²_ε. We keep the simplifying
    hypothesis that r = ρ. The problem's first-order condition, for i = 0, is given by

        e^{−γ c_t} = E_t(e^{−γ c_{t+1}}).    (1.47)

    To proceed, we guess that the stochastic process followed by consumption over
    time has the form

        c_{t+i} = c_{t+i−1} + K_{t+i−1} + v_{t+i},    (1.48)

    where K_{t+i−1} is a deterministic term (which may however depend on the
    period's timing within the individual's life cycle) and v_{t+i} is the innovation in
    consumption (E_{t+i−1} v_{t+i} = 0). Both the sequence of K_t terms and the
    features of the distribution of v must be determined so as to satisfy the Euler
    equation (1.47) and the intertemporal budget constraint (1.4). Using (1.48),
    from the Euler equation, after eliminating the terms in c_t, we get

        e^{γ K_t} = E_t(e^{−γ v_{t+1}})  ⇒  K_t = (1/γ) log E_t(e^{−γ v_{t+1}}).    (1.49)

    The value of K depends on the characteristics of the distribution of v, yet to
    be determined. Using the fact that log E(·) > E(log(·)) by Jensen's
    inequality and the property of consumption innovations E_t v_{t+1} = 0, we can
    however already write

        K_t = (1/γ) log E_t(e^{−γ v_{t+1}}) > (1/γ) E_t(log(e^{−γ v_{t+1}})) = (1/γ) E_t(−γ v_{t+1}) = 0  ⇒  K_t > 0.    (1.50)
    The first result is that the consumption path is increasing over time: the
    consumption change between t and t + 1 is expected to equal K t > 0, whereas
    with quadratic utility (maintaining the assumption ρ = r) consumption
    changes would have zero mean. Moreover, from (1.49) we interpret −K t as
    the “certainty equivalent” of the consumption innovation vt +1, defined as the
    (negative) certain change of consumption from t to t + 1 that the consumer
    would accept to avoid the uncertainty on the marginal utility of consumption
    in t + 1.
    To obtain the consumption function (and then to determine the effect of
    the precautionary saving motive on the level of consumption) we use the
    intertemporal budget constraint (1.10), computing the expected values E_t c_{t+i}
    from (1.48). Knowing that E_t v_{t+i} = 0, we have

        (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i c_t + (1/(1+r)) Σ_{i=1}^∞ (1/(1+r))^i Σ_{j=1}^i K_{t+j−1} = A_t + H_t.    (1.51)

    Solving for c_t, we finally get

        c_t = r(A_t + H_t) − [r/(1+r)] Σ_{i=1}^∞ (1/(1+r))^i Σ_{j=1}^i K_{t+j−1}.    (1.52)
    The level of consumption is made up of a component analogous to the def-
    inition of permanent income, r ( At + Ht ), less a term that depends on the
    constants K and captures the effect of the precautionary saving motive: since
    the individual behaves prudently, her consumption increases over time, but
    (consistently with the intertemporal budget constraint) the level of consump-
    tion in t is lower than in the case of quadratic utility.
    As the final step of the solution, we derive the form of the stochastic term
    v_{t+i}, and its relationship to the income innovation ε_{t+i}. To this end we use the
    budget constraint (1.4), where c_{t+i} and y_{t+i} are realizations and not expected
    values, and write future realized incomes as the sum of the expected value
    at time t and the associated "surprise": y_{t+i} = E_t y_{t+i} + (y_{t+i} − E_t y_{t+i}). The
    budget constraint becomes

        (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i c_{t+i} = A_t + H_t + (1/(1+r)) Σ_{i=1}^∞ (1/(1+r))^i (y_{t+i} − E_t y_{t+i}).
    Substituting for c_{t+i} (with i > 0) from (1.48) and for c_t from the consumption
    function (1.52), we get

        Σ_{i=1}^∞ (1/(1+r))^i Σ_{j=1}^i v_{t+j} = Σ_{i=1}^∞ (1/(1+r))^i (y_{t+i} − E_t y_{t+i}).

    Given the stochastic process for income (1.46) we can compute the income
    "surprises,"

        y_{t+i} − E_t y_{t+i} = Σ_{k=0}^{i−1} λ^k ε_{t+i−k},

    and insert them into the previous equation, to obtain

        Σ_{i=1}^∞ (1/(1+r))^i Σ_{j=1}^i v_{t+j} = Σ_{i=1}^∞ (1/(1+r))^i Σ_{k=0}^{i−1} λ^k ε_{t+i−k}.    (1.53)
    Developing the summations, collecting terms containing v and ε with the
    same time subscript, and using the fact that v and ε are serially uncorrelated
    processes, we find the following condition that allows us to determine the form
    of v_{t+i}:

        Σ_{i=1}^∞ (1/(1+r))^i (v_{t+h} − λ^{i−1} ε_{t+h}) = 0,  ∀h ≥ 1.    (1.54)

    Solving the summation in (1.54), we arrive at the final form of the stochastic
    terms of the Euler equation guessed in (1.48): at all times t + h,

        v_{t+h} = [r/(1+r−λ)] ε_{t+h}.    (1.55)
    As in the quadratic utility case (1.20), the innovation in the Euler equation can
    be interpreted as the annuity value of the revision of the consumer’s human
    wealth arising from an innovation in income for the assumed stochastic
    process.
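    The step from (1.54) to (1.55) amounts to summing two geometric series; a brief verification (our own, using only the text's definitions):

    $$
    \sum_{i=1}^{\infty}\Big(\frac{1}{1+r}\Big)^{i} v_{t+h} = \frac{v_{t+h}}{r},
    \qquad
    \sum_{i=1}^{\infty}\Big(\frac{1}{1+r}\Big)^{i}\lambda^{i-1}\varepsilon_{t+h}
    = \frac{\varepsilon_{t+h}}{1+r}\cdot\frac{1}{1-\lambda/(1+r)}
    = \frac{\varepsilon_{t+h}}{1+r-\lambda},
    $$

    so equating the two terms gives v_{t+h} = [r/(1+r−λ)] ε_{t+h}, which is (1.55).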
    Expression (1.55) for v_{t+1} can be substituted in the equation for K_t (1.49).
    The fact that the innovations ε are i.i.d. random variables implies that K_t does
    not change over time: K_{t+i−1} = K in (1.48). The evolution of consumption
    over time is then given by

        c_{t+1} = c_t + K + [r/(1+r−λ)] ε_{t+1}.    (1.56)

    Substituting the constant value for K into (1.52), we get a closed-form
    consumption function:¹⁹

        c_t = r(A_t + H_t) − [r/(1+r)] Σ_{i=1}^∞ (1/(1+r))^i i·K
            = r(A_t + H_t) − [r/(1+r)] K (1+r)/r²
            = r(A_t + H_t) − K/r.
    Finally, to determine the constant K and its relationship with the uncertainty
    about future labor incomes, some assumptions on the distribution of ε have to
    be made. If ε is normally distributed, ε ∼ N(0, σ²_ε), then, letting

    ¹⁹ To verify this result, note that

        Σ_{i=1}^∞ α^i i = Σ_{i=1}^∞ α^i + Σ_{i=2}^∞ α^i + Σ_{i=3}^∞ α^i + …
                        = Σ_{i=1}^∞ α^i + α Σ_{i=1}^∞ α^i + α² Σ_{i=1}^∞ α^i + …
                        = (1 + α + α² + …) Σ_{i=1}^∞ α^i
                        = Σ_{i=0}^∞ α^i (Σ_{i=0}^∞ α^i − 1),

    which equals [1/(1−α)]·[α/(1−α)] = α/(1−α)² as long as α < 1, which holds true in the relevant
    α = 1/(1+r) case with r > 0.

    θ ≡ r/(1+r−λ), we have²⁰

        K_t = (1/γ) log E_t(e^{−γθε_{t+1}}) = (1/γ) log e^{γ²θ²σ²_ε/2} = γθ²σ²_ε/2.    (1.57)

    The dynamics of consumption over time and its level in each period are then
    given by

        c_{t+1} = c_t + γθ²σ²_ε/2 + θε_{t+1},
        c_t = r(A_t + H_t) − (1/r)(γθ²σ²_ε/2).
    The innovation variance σ²_ε has a positive effect on the change in consumption
    between t and t + 1, and a negative effect on the level of consumption in t .
    Increases in the uncertainty about future incomes (captured by the variance of
    the innovations in the process for y) generate larger changes of consumption
    from one period to the next and drops in the level of current consumption.
    Thus, allowing for a precautionary saving motive can rationalize the slow
    decumulation of wealth by old individuals, and can explain why (increasing)
    income and consumption paths are closer to each other than would be implied
    by the basic permanent income model. Moreover, if positive innovations in
    current income are associated with higher uncertainty about future income,
    the excess smoothness phenomenon may be explained, since greater uncer-
    tainty induces consumers to save more and may then reduce the response of
    consumption to income innovations.
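    To see the orders of magnitude involved, the following sketch (our own illustration, with assumed parameter values) simulates the consumption dynamics (1.56)–(1.57): the drift of consumption equals K = γθ²σ²_ε/2, so average consumption growth rises with income uncertainty and with the persistence λ of the income process.

    ```python
    # Illustrative simulation of the CARA/AR(1) case, eqs. (1.56)-(1.57).
    # Parameter values are assumptions chosen only for the example.
    import numpy as np

    rng = np.random.default_rng(0)
    r, gamma, lam = 0.03, 2.0, 0.7          # interest rate, absolute prudence, income persistence
    theta = r / (1 + r - lam)               # annuitization factor from eq. (1.55)

    for sigma_eps in (0.05, 0.10):
        K = gamma * theta**2 * sigma_eps**2 / 2          # deterministic drift, eq. (1.57)
        eps = rng.normal(0.0, sigma_eps, size=100_000)
        dc = K + theta * eps                             # consumption changes, eq. (1.56)
        print(f"sigma_eps={sigma_eps:.2f}: K={K:.6f}, mean consumption growth={dc.mean():.6f}")
    ```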
    Exercise 4 Assuming u(c) = c^{1−γ}/(1 − γ) and r ≠ ρ, derive the first-order
    condition of the consumer's problem under uncertainty. If c_{t+1}/c_t has a lognormal
    distribution (i.e. if the rate of change of consumption Δ log c_{t+1} is normally
    distributed with constant variance σ²), write the Euler equation in terms of the
    expected rate of change of consumption E_t(Δ log c_{t+1}). How does the variance σ²
    affect the behavior of the rate of change of c over time? (Hint: make use of the
    fact mentioned in note 20, recall that c_{t+1}/c_t = e^{Δ log c_{t+1}}, and express the Euler
    equations in logarithmic terms.)
    1.4. Consumption and Financial Returns
    In the model studied so far, the consumer uses a single financial asset with a
    certain return to implement the optimal consumption path. A precautionary
    saving motive has been introduced by abandoning the hypothesis of quadratic
    ²⁰ To derive (1.57) we used the following statistical fact: if x ∼ N(E(x), σ²), then e^x is a lognormal
    random variable with mean E(e^x) = e^{E(x)+σ²/2}.

    utility. However, there is still no choice on the allocation of saving. If we
    assume that the consumer can invest his savings in n financial assets with
    uncertain returns, we generate a more complicated choice of the composition
    of financial wealth, which interacts with the determination of the optimal
    consumption path. The chosen portfolio allocation will depend on the char-
    acteristics of the consumer’s utility function (in particular the degree of risk
    aversion) and of the distribution of asset returns. Thereby extended, the model
    yields testable implications on the joint dynamics of consumption and asset
    returns, and becomes the basic version of the consumption-based capital asset
    pricing model (CCAPM).
    With the new hypothesis of n financial assets with uncertain returns,
    the consumer's budget constraint must be reformulated accordingly. The
    beginning-of-period stock of the jth asset, measured in units of consumption,
    is given by A^j_{t+i}. Therefore, total financial wealth is A_{t+i} = Σ_{j=1}^n A^j_{t+i}.
    r^j_{t+i+1} denotes the real rate of return of asset j in period t + i, so that
    A^j_{t+i+1} = (1 + r^j_{t+i+1}) A^j_{t+i}. This return is not known by the agent at the
    beginning of period t + i. (This explains the time subscript t + i + 1, whereas
    labor income—observed by the agent at the beginning of the period—has
    subscript t + i.) The accumulation constraint from one period to the next takes
    the form

        Σ_{j=1}^n A^j_{t+i+1} = Σ_{j=1}^n (1 + r^j_{t+i+1}) A^j_{t+i} + y_{t+i} − c_{t+i},    i = 0, . . . , ∞.    (1.58)
    The solution at t of the maximization problem yields the levels of consumption
    and of the stocks of the n assets from t to infinity. As in the solution of
    the consumer's problem analyzed in Section 1.1 (but now with uncertain asset
    returns), we have a set of n Euler equations,

        u′(c_t) = [1/(1+ρ)] E_t[(1 + r^j_{t+1}) u′(c_{t+1})]    for j = 1, . . . , n.    (1.59)

    Since u′(c_t) is not stochastic at time t, we can write the first-order conditions as

        1 = E_t[(1 + r^j_{t+1}) (1/(1+ρ)) u′(c_{t+1})/u′(c_t)] ≡ E_t[(1 + r^j_{t+1}) M_{t+1}],    (1.60)
    where Mt +1 is the “stochastic discount factor” applied at t to consumption
    in the following period. Such a factor is the intertemporal marginal rate of
    substitution, i.e. the discounted ratio of marginal utilities of consumption in
    any two subsequent periods. From equation (1.60) we derive the fundamental

    result of the CCAPM, using the following property:

        E_t[(1 + r^j_{t+1}) M_{t+1}] = E_t(1 + r^j_{t+1}) E_t(M_{t+1}) + cov_t(r^j_{t+1}, M_{t+1}).    (1.61)

    Inserting (1.61) into (1.60) and rearranging terms, we get

        E_t(1 + r^j_{t+1}) = [1/E_t(M_{t+1})] [1 − cov_t(r^j_{t+1}, M_{t+1})].    (1.62)

    In the case of the safe asset (with certain return r⁰) considered in the previous
    sections,²¹ (1.62) reduces to

        1 + r⁰_{t+1} = 1/E_t(M_{t+1}).    (1.63)

    Substituting (1.63) into (1.62), we can write the expected return of each asset
    j in excess of the safe asset as

        E_t(r^j_{t+1}) − r⁰_{t+1} = −(1 + r⁰_{t+1}) cov_t(r^j_{t+1}, M_{t+1}).    (1.64)
    Equation (1.64) is the main result from the model with risky financial
    assets: in equilibrium, an asset j whose return has a negative covariance with
    the stochastic discount factor yields an expected return higher than r⁰. In fact,
    such an asset is "risky" for the consumer, since it yields lower returns when
    the marginal utility of consumption is relatively high (owing to a relatively
    low level of consumption). The agent willingly holds the stock of this asset in
    equilibrium only if such risk is appropriately compensated by a "premium,"
    given by an expected return higher than the risk-free rate r⁰.
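    A small Monte Carlo sketch (our own illustration, with assumed preference and distribution parameters) makes the mechanics of (1.60)–(1.64) concrete: pricing a payoff with the stochastic discount factor M and then measuring its expected excess return reproduces the covariance formula.

    ```python
    # Illustrative check of (1.60)-(1.64): price an asset with the stochastic
    # discount factor M, then verify that its expected excess return equals
    # -(1+r0) cov(r, M). All numbers are assumptions for the example.
    import numpy as np

    rng = np.random.default_rng(1)
    rho, gamma = 0.02, 4.0                       # time preference, relative risk aversion
    n = 1_000_000

    # joint lognormal consumption growth and asset payoff (positively correlated)
    mean = [0.02, 0.05]
    cov = [[0.02**2, 0.5 * 0.02 * 0.10],
           [0.5 * 0.02 * 0.10, 0.10**2]]
    g, log_payoff = rng.multivariate_normal(mean, cov, size=n).T

    M = (1.0 / (1.0 + rho)) * np.exp(-gamma * g)     # SDF = discounted MU ratio, CRRA case
    payoff = np.exp(log_payoff)

    price = np.mean(M * payoff)                      # makes 1 = E[(1+r)M] hold by construction
    r_j = payoff / price - 1.0                       # returns of the risky asset
    r_0 = 1.0 / np.mean(M) - 1.0                     # risk-free rate, eq. (1.63)

    premium_direct = np.mean(r_j) - r_0
    premium_formula = -(1.0 + r_0) * np.cov(r_j, M)[0, 1]   # eq. (1.64)
    print(f"risk-free rate {r_0:.4f}, premium {premium_direct:.4f} vs formula {premium_formula:.4f}")
    ```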
    1.4.1. EMPIRICAL IMPLICATIONS OF THE CCAPM
    In order to derive testable implications from the model, we consider a CRRA
    utility function,

        u(c) = (c^{1−γ} − 1)/(1 − γ),

    where γ > 0 is the coefficient of relative risk aversion. In this case, (1.60)
    becomes

        1 = E_t[(1 + r^j_{t+1}) (1/(1+ρ)) (c_{t+1}/c_t)^{−γ}]    for j = 1, . . . , n.    (1.65)
    ²¹ The following results hold also if the safe return rate r⁰ is random, as long as it has zero
    covariance with the stochastic discount factor M.

    Moreover, let us assume that the rate of growth of consumption and the rates
    of return of the n assets have a lognormal joint conditional distribution.²²
    Taking logs of (1.65) (with the usual approximation log(1 + ρ) ≈ ρ), we get

        0 = −ρ + log E_t[(1 + r^j_{t+1}) (c_{t+1}/c_t)^{−γ}],

    and by the property mentioned in the preceding footnote we obtain

        log E_t[(1 + r^j_{t+1}) (c_{t+1}/c_t)^{−γ}] = E_t(r^j_{t+1} − γ Δ log c_{t+1}) + (1/2) Ω_j,    (1.66)

    where

        Ω_j = E{[(r^j_{t+1} − γ Δ log c_{t+1}) − E_t(r^j_{t+1} − γ Δ log c_{t+1})]²}.

    Note that the unconditional expectation E[·] in the definition of Ω_j may be
    used under the hypothesis that the innovations in the joint process for returns
    and the consumption growth rate have constant variance (homoskedasticity).
    Finally, from (1.66) we can derive the expected return on the jth asset:

        E_t r^j_{t+1} = γ E_t(Δ log c_{t+1}) + ρ − (1/2) Ω_j.    (1.67)
    Several features of equation (1.67) can be noticed. In the first place, (1.67)
    can be immediately interpreted as the Euler equation that holds for each
    asset j . This interpretation can be seen more clearly if (1.67) is rewritten with
    the expected rate of change of consumption on the left-hand side. (See the
    solution to exercise 4 for the simpler case of only one safe asset.)
    Second, the most important implication of (1.67) is the existence of a
    precise relationship between the forecastable component of (the growth rate
    of) consumption and asset returns. A high growth rate of consumption is
    associated with a high rate of return, so as to enhance saving, for a given
    intertemporal discount rate ρ. The degree of risk aversion γ is a measure of this
    effect, which is the same for all assets. At the empirical level, (1.67) suggests
    the following methodology to test the validity of the model.
    1. A forecasting model for Δ log c_{t+1} is specified; vector x_t contains only
    those variables, from the wider information set available to agents at
    time t , which are relevant for forecasting consumption growth.
    ²² In general, when two random variables x and y have a lognormal joint conditional probability
    distribution, then log E_t(x_{t+1} y_{t+1}) = E_t(log(x_{t+1} y_{t+1})) + (1/2) var_t(log(x_{t+1} y_{t+1})), where
    var_t(log(x_{t+1} y_{t+1})) = E_t{[log(x_{t+1} y_{t+1}) − E_t(log(x_{t+1} y_{t+1}))]²}.

    2. The following system for Δ log c_{t+1} and r^j_{t+1} is estimated:

        Δ log c_{t+1} = δ′ x_t + u_{t+1},
        r^j_{t+1} = π_j′ x_t + k_j + v^j_{t+1},    j = 1, . . . , n,

    where k_j is a constant and u and v are random errors uncorrelated with
    the elements of x.
    3. The following restrictions on the estimated parameters are tested:

        π_j = γ δ,    j = 1, . . . , n.
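    The following schematic sketch (our own, with an assumed data-generating process) shows what the test amounts to in practice: if the restriction π_j = γδ holds in the data, the return-forecasting coefficients are proportional to the consumption-growth-forecasting coefficients, with γ as the factor of proportionality.

    ```python
    # Schematic illustration (not from the text): simulate data in which returns and
    # consumption growth share the same forecastable component, estimate the two
    # equations by OLS, and check the restriction pi_j = gamma * delta.
    # Parameter values and the data-generating process are assumptions.
    import numpy as np

    rng = np.random.default_rng(2)
    T, gamma_true = 5_000, 3.0
    delta_true = np.array([0.4, -0.2])               # forecasting coefficients for consumption growth

    x = rng.normal(size=(T, 2))                      # variables in the information set x_t
    dlogc = x @ delta_true + 0.01 * rng.normal(size=T)
    r_j = gamma_true * (x @ delta_true) + 0.02 + 0.05 * rng.normal(size=T)   # pi_j = gamma * delta

    X = np.column_stack([np.ones(T), x])             # add constants to both regressions
    delta_hat = np.linalg.lstsq(X, dlogc, rcond=None)[0][1:]
    pi_hat = np.linalg.lstsq(X, r_j, rcond=None)[0][1:]

    gamma_hat = (pi_hat @ delta_hat) / (delta_hat @ delta_hat)   # least-squares fit of pi = gamma*delta
    print("delta_hat:", delta_hat.round(3), "pi_hat:", pi_hat.round(3), "implied gamma:", round(gamma_hat, 2))
    ```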
    Finally, the value of Ω_j differs from one asset return to another, because
    of differences in the variability of return innovations and differences in the
    covariances between such innovations and the innovation of the consumption
    change. In fact, by the definition of Ω_j and the lognormality assumption, we
    have

        Ω_j = E[(r^j_{t+1} − E_t(r^j_{t+1}))²] + γ² E[(Δ log c_{t+1} − E_t(Δ log c_{t+1}))²]
              − 2γ E[(r^j_{t+1} − E_t(r^j_{t+1}))(Δ log c_{t+1} − E_t(Δ log c_{t+1}))]
            ≡ σ²_j + γ²σ²_c − 2γσ_jc.    (1.68)

    The expected return of an asset is negatively affected by the variance of the
    return itself and is positively affected by its covariance with the rate of change
    in consumption. Thus, using (1.67) and (1.68), we obtain, for any asset j,

        E_t r^j_{t+1} = γ E_t(Δ log c_{t+1}) + ρ − γ²σ²_c/2 − σ²_j/2 + γσ_jc.    (1.69)
    This equation specializes the general result given in (1.62), and it is interesting
    to interpret each of the terms on its right-hand side. Faster expected consumption
    growth implies that the rate of return should be higher than the rate of
    time preference ρ, to an extent that depends on intertemporal substitutability
    as indexed by γ. "Precaution," also indexed by γ, implies that the rate of
    return consistent with optimal consumption choices is lower when consumption
    is more volatile (a higher σ²_c). The variance of returns has a somewhat
    counterintuitive negative effect on the required rate of return: however, this
    term appears only because of Jensen's inequality, owing to the approximation
    that replaced log E_t(1 + r^j_{t+1}) with E_t r^j_{t+1} in equation (1.69). But it is again
    interesting and intuitive to see that the return's covariance with consumption
    growth implies a higher required rate of return. In fact, the consumer will be
    satisfied by a lower expected return if an asset yields more when consumption
    is decreasing and marginal utility is increasing; this asset provides a valuable
    hedge against declines in consumption to risk-averse consumers. Hence an
    asset with positive covariance between its own return innovations and the
    innovations in the rate of change of consumption is not attractive, unless (as
    must be the case in equilibrium) it offers a high expected return.
    When there is also an asset with a safe return r⁰, the model yields the
    following relationship between r⁰ and the stochastic properties of Δ log c_{t+1}
    (see again the solution of exercise 4):

        r⁰_{t+1} = γ E_t(Δ log c_{t+1}) + ρ − γ²σ²_c/2.    (1.70)
    (The return variance and covariance with consumption are both zero in
    this case.) Equations (1.69) and (1.70) show the determinants of the returns
    on different assets in equilibrium. All returns depend positively on the
    intertemporal rate of time preference ρ, since, for a given growth rate of
    consumption, a higher discount rate of future utility induces agents to borrow in
    order to finance current consumption: higher interest rates are then required
    to offset this incentive and leave the growth rate of consumption unchanged.
    Similarly, given ρ, a higher growth rate of consumption requires higher rates
    of return to offset the incentive to shift resources to the present, reducing
    the difference between the current and the future consumption levels. (The
    strength of this effect is inversely related to the intertemporal elasticity of
    substitution, given by 1/γ in the case of a CRRA utility function.) Finally, the
    uncertainty about the rate of change of consumption captured by σ²_c generates
    a precautionary saving motive, inducing the consumer to accumulate financial
    assets with a depressing effect on their rates of return. According to (1.69), the
    expected rate of return on the jth risky asset is also determined by σ²_j (as a
    result of the approximation) and by the covariance between rates of return
    and consumption changes. The strength of the latter effect is directly related
    to the degree of the consumer's risk aversion.
    For any asset j, the "risk premium," i.e. the difference between the expected
    return E_t r^j_{t+1} and the safe return r⁰_{t+1}, is

        E_t r^j_{t+1} − r⁰_{t+1} = −σ²_j/2 + γσ_jc.    (1.71)
    An important strand of literature, originated by Mehra and Prescott (1985),
    has tested this implication of the model. Many studies have shown that the
    observed premium on stocks (amounting to around 6% per year in the USA),
    given the observed covariance σ_jc, can be explained by (1.71) only by values
    of γ too large to yield a plausible description of consumers' attitudes towards
    risk. Moreover, when the observed values of Δ log c and σ²_c are plugged into
    (1.70), with plausible values for ρ and γ, the resulting safe rate of return is
    much higher than the observed rate. Only the (implausible) assumption of a
    negative ρ could make equation (1.70) consistent with the data.
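    A back-of-the-envelope calculation (our own, with rough and assumed values for the US moments) conveys the size of the problem: inverting (1.71) with an observed premium of about 6% calls for a very large γ, while (1.70) with a plausible γ delivers a safe rate well above the observed one.

    ```python
    # Back-of-the-envelope check of the two puzzles using (1.70)-(1.71).
    # The moments below are rough, commonly cited US annual figures and are
    # assumptions for illustration, not estimates from the text.
    g_mean, sigma_c = 0.018, 0.036        # mean and std of consumption growth
    sigma_j = 0.167                       # std of stock returns
    corr_jc = 0.40                        # correlation of returns with consumption growth
    premium_obs, rho = 0.06, 0.02         # observed equity premium, time preference

    sigma_jc = corr_jc * sigma_j * sigma_c
    gamma_implied = (premium_obs + sigma_j**2 / 2) / sigma_jc      # invert eq. (1.71)
    print(f"gamma needed for a 6% premium: {gamma_implied:.1f}")

    gamma_plausible = 2.0
    r0_implied = gamma_plausible * g_mean + rho - gamma_plausible**2 * sigma_c**2 / 2   # eq. (1.70)
    print(f"risk-free rate implied by gamma = {gamma_plausible}: {r0_implied:.3f}")
    ```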
    These difficulties in the model’s empirical implementation are known as
    the equity premium puzzle and the risk-free rate puzzle, respectively, and have

    motivated various extensions of the basic model. For example, a more gen-
    eral specification of the consumer’s preferences may yield a measure of risk
    aversion that is independent of the intertemporal elasticity of substitution.
    It is therefore possible that consumers at the same time display a strong
    aversion toward risk, which is consistent with (1.71), and a high propensity to
    intertemporally substitute consumption, which solves the risk-free rate puzzle.
    A different way of making the above model more flexible, recently
    put forward by Campbell and Cochrane (1999), relaxes the hypothesis of
    intertemporal separability of utility. The next section develops a simple ver-
    sion of their model.
    1.4.2. EXTENSION: THE HABIT FORMATION HYPOTHESIS
    As a general hypothesis on preferences, we now assume that what provides
    utility to the consumer in each period is not the whole level of consumption
    by itself, but only the amount of consumption in excess of a “habit” level.
    An individual’s habit level changes over time, depending on the individual’s
    own past consumption, or on the history of aggregate consumption.
    In each period t, the consumer's utility function is now

        u(c_t, x_t) = (c_t − x_t)^{1−γ}/(1 − γ) ≡ ((z_t c_t)^{1−γ} − 1)/(1 − γ),

    where z_t ≡ (c_t − x_t)/c_t is the surplus consumption ratio, and x_t (with c_t > x_t)
    is the level of habit. The evolution of x over time is here determined by
    aggregate (per capita) consumption and is not affected by the consumption
    choices of the individual consumer. Then, marginal utility is simply

        u_c(c_t, x_t) = (c_t − x_t)^{−γ} ≡ (z_t c_t)^{−γ}.
    The first-order conditions of the problem—see equation (1.65)—now have
    the following form:

        1 = E_t[(1 + r^j_{t+1}) (1/(1+ρ)) (z_{t+1}/z_t)^{−γ} (c_{t+1}/c_t)^{−γ}],    for j = 1, . . . , n.    (1.72)

    The evolution over time of habit and aggregate consumption, denoted by c̄,
    are modeled as

        Δ log z_{t+1} = φ ε_{t+1},    (1.73)
        Δ log c̄_{t+1} = g + ε_{t+1}.    (1.74)

    Aggregate consumption grows at the constant average rate g, with innovations
    ε ∼ N(0, σ²_c). Such innovations affect the consumption habit,²³ with the
    parameter φ capturing the sensitivity of z to ε. Under the maintained hypothesis
    of lognormal joint distribution of asset returns and the consumption
    growth rate (and using the fact that, with identical individuals, in equilibrium
    c = c̄), taking logarithms of (1.72), we get

        0 = −ρ + E_t r^j_{t+1} − γ E_t(Δ log z_{t+1}) − γ E_t(Δ log c_{t+1})
            + (1/2) var_t(r^j_{t+1} − γ Δ log z_{t+1} − γ Δ log c_{t+1}).

    Using the stochastic processes specified in (1.73) and (1.74), we finally obtain
    the risk premium on asset j and the risk-free rate of return:

        E_t r^j_{t+1} − r⁰_{t+1} = −σ²_j/2 + γ(1 + φ)σ_jc,    (1.75)
        r⁰_{t+1} = γ g + ρ − γ²(1 + φ)²σ²_c/2.    (1.76)
    Comparing (1.75) and (1.76) with the analogous equations (1.71) and (1.70),
    we note that the magnitude of φ has a twofold effect on returns. On the one
    hand, a high sensitivity of habit to innovations in c enhances the precautionary
    motive for saving, determining a stronger incentive to asset accumulation
    and consequently a decrease in returns, as already shown by the last term in
    (1.70).²⁴ On the other hand, a high φ magnifies the effect of the covariance
    between risky returns and consumption (σ_jc) on the premium required to
    hold risky assets in equilibrium.
    Therefore, the introduction of habit formation can (at least partly) solve the
    two problems raised by empirical tests of the basic version of the CCAPM: for
    given values of other parameters, a sufficiently large value of φ can bring the
    risk-free rate implied by the model closer to the lower level observed on the
    markets, at the same time yielding a relatively high risk premium.
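    Continuing the rough numerical illustration above (same assumed moments; the values of φ are arbitrary and the constant-φ specification is only a sketch of the mechanism), equations (1.75)–(1.76) show the twofold effect directly: as φ rises the model's risk premium increases while the implied risk-free rate falls.

    ```python
    # Illustrative recalculation of (1.75)-(1.76) under habit formation, using the
    # same rough, assumed moments as above; phi is the habit-sensitivity parameter.
    g_mean, sigma_c, sigma_j, corr_jc = 0.018, 0.036, 0.167, 0.40
    rho, gamma = 0.02, 2.0
    sigma_jc = corr_jc * sigma_j * sigma_c

    for phi in (0.0, 2.0, 3.0):
        premium = -sigma_j**2 / 2 + gamma * (1 + phi) * sigma_jc               # eq. (1.75)
        r0 = gamma * g_mean + rho - gamma**2 * (1 + phi)**2 * sigma_c**2 / 2   # eq. (1.76)
        print(f"phi={phi:>3.1f}: premium={premium:.3f}, risk-free rate={r0:.3f}")
    ```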
    APPENDIX A1: DYNAMIC PROGRAMMING
    This appendix outlines the dynamic programming methods widely used in the macro-
    economic literature and in particular in consumption theory. We deal first with the
    representative agent’s intertemporal choice under certainty on future income flows;
    the extension to the case of uncertainty follows.
    ²³ The assumed stochastic process for the logarithm of z satisfies the condition c > x (z > 0):
    consumption is never below habit.
    ²⁴ A constant φ is assumed here for simplicity. Campbell and Cochrane (1999) assume that φ
    decreases with z: the variability of consumption has a stronger effect on returns when the level of
    consumption is closer to habit.

    A1.1. Certainty
    Let’s go back to the basic model of Section 1.1, assuming that future labor incomes are
    known to the consumer and that the safe asset has a constant return. The maximiza-
    tion problem then becomes
        max_{c_{t+i}}  U_t = Σ_{i=0}^∞ (1/(1+ρ))^i u(c_{t+i}),

    subject to the accumulation constraint (for all i ≥ 0),

        A_{t+i+1} = (1 + r) A_{t+i} + y_{t+i} − c_{t+i}.
    Under certainty, we can write the constraint using the following definition of
    total wealth, including the stock of financial assets A and human capital H : Wt =
    (1 + r )( At + Ht ). Wt measures the stock of total wealth at the end of period t but
    before consumption c t occurs, whereas At and Ht measure financial and human
    wealth at the beginning of the period. In terms of total wealth W, the accumulation
    constraint for period t becomes
        W_{t+1} = (1 + r)[A_{t+1} + (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i y_{t+1+i}]
                = (1 + r)[(1 + r)A_t + y_t − c_t + (1/(1+r)) Σ_{i=0}^∞ (1/(1+r))^i y_{t+1+i}]
                = (1 + r)[(1 + r)(A_t + H_t) − c_t]
                = (1 + r)(W_t − c_t).

    The evolution over time of total wealth is then (for all i ≥ 0)

        W_{t+i+1} = (1 + r)(W_{t+i} − c_{t+i}).
    Formally, Wt +i is the state variable, giving, in each period t + i , the total amount
    of resources available to the consumer; and c t +i is the control variable, whose level,
    optimally chosen by the utility-maximizing consumer, affects the amount of resources
    available in the next period, t + i + 1. The intertemporal separability of the objective
    function and the accumulation constraints allow us to use dynamic programming
    methods to solve the above problem, which can be decomposed into a sequence of two-
    period optimization problems. To clarify matters, suppose that the consumer’s horizon
    ends in period T , and impose a non-negativity constraint on final wealth: WT +1 ≥ 0.
    Now consider the optimization problem at the beginning of the final period T , given
    the stock of total wealth WT . We maximize u(c T ) with respect to c T , subject to the
    constraints WT +1 = (1 + r )(WT − c T ) and WT +1 ≥ 0. The solution yields the optimal
    level of consumption in period T as a function of wealth: c T = c T (WT ). Also, the
    maximum value of utility in period T (V ) depends, through the optimal consumption

    choice, on wealth. The resulting value function VT (WT ) summarizes the solution of
    the problem for the final period T .
    Now consider the consumer’s problem in the previous period, T − 1, for a given
    value of WT −1. Formally, the problem is
        max_{c_{T−1}} [ u(c_{T−1}) + (1/(1+ρ)) V_T(W_T) ],

    subject to the constraint W_T = (1 + r)(W_{T−1} − c_{T−1}). As in the case above, the
    problem's solution has the following form: c_{T−1} = c_{T−1}(W_{T−1}), with an associated
    maximized value of utility (now over periods T − 1 and T) given by V_{T−1}(W_{T−1}). The
    same procedure can be applied to earlier periods recursively (backward recursion). In
    general, the problem can be written in terms of the Bellman equation:

        V_t(W_t) = max_{c_t} [ u(c_t) + (1/(1+ρ)) V_{t+1}(W_{t+1}) ],    (1.A1)

    subject to W_{t+1} = (1 + r)(W_t − c_t). Substituting for W_{t+1} into the objective function
    and differentiating with respect to c_t, we get the following first-order condition:

        u′(c_t) = [(1+r)/(1+ρ)] V′_{t+1}(W_{t+1}).    (1.A2)
    Using the Bellman equation at time t and differentiating with respect to W_t, we obtain
    V′_{t+1}(W_{t+1}):

        V′_t(W_t) = u′(c_t) ∂c_t/∂W_t + [(1+r)/(1+ρ)] V′_{t+1}(W_{t+1}) − [(1+r)/(1+ρ)] V′_{t+1}(W_{t+1}) ∂c_t/∂W_t
                  = [ u′(c_t) − ((1+r)/(1+ρ)) V′_{t+1}(W_{t+1}) ] ∂c_t/∂W_t + [(1+r)/(1+ρ)] V′_{t+1}(W_{t+1})
                  = [(1+r)/(1+ρ)] V′_{t+1}(W_{t+1}),

    where we use the fact that the term in square brackets in the second line equals zero by
    (1.A2). Finally, using again the first-order condition, we find

        V′_t(W_t) = u′(c_t).    (1.A3)
    The effect on utility Vt of an increase in wealth Wt is equal to the marginal utility
    from immediately consuming the additional wealth. Along the optimal consumption
    path, the agent is indifferent between immediate consumption and saving. (The term
    in square brackets is zero.) The additional wealth can then be consumed in any period
    with the same effect on utility, measured by u′(c t ) in (1.A2): this is an application of
    the envelope theorem.

    Inserting condition (1.A3) in period t + 1 into (1.A2), we get the Euler equation,

        u′(c_t) = [(1+r)/(1+ρ)] u′(c_{t+1}),
    which is the solution of the problem (here under certainty) already discussed in
    Section 1.1.
    The recursive structure of the problem and the backward solution procedure pro-
    vide the optimal consumption path with the property of time consistency. Maximiza-
    tion of (1.A1) at time t takes into account Vt +1(Wt +1), which is the optimal solution of
    the same problem at time t + 1, obtained considering also Vt +2(Wt +2), and so forth.
    As time goes on, then, consumption proceeds optimally along the path originally
    chosen at time t . (This time consistency property of the solution is known as Bellman’s
    optimality principle.)
    Under regularity conditions, the iteration of the Bellman equation starting from a
    (bounded and continuous) value function V_T(·) leads to a limit function V(·), which
    is unique and invariant over time. Such a function V = lim_{j→∞} V_{T−j} solves the
    consumer's problem over an infinite horizon. In this case also, the function that gives
    the agent's consumption c(W) is invariant over time. Operationally, if the problem
    involves (1) a quadratic utility function, or (2) a logarithmic utility function and
    Cobb–Douglas constraints, it can be solved by first guessing a functional form for
    V(·) and then checking that such a function satisfies the Bellman equation (1.A1).
    As an example, consider the case of the CRRA utility function²⁵

        u(c) = c^{1−γ}/(1 − γ).

    The Bellman equation is

        V(W_t) = max_{c_t} [ c_t^{1−γ}/(1 − γ) + (1/(1+ρ)) V(W_{t+1}) ],
    subject to the constraint W_{t+1} = (1 + r)(W_t − c_t). Let us assume (to be proved later
    on) that the value function has the same functional form as utility:

        V(W_t) = K W_t^{1−γ}/(1 − γ),    (1.A4)

    with K being a positive constant to be determined. Using (1.A4), we can write the
    Bellman equation as

        K W_t^{1−γ}/(1 − γ) = max_{c_t} [ c_t^{1−γ}/(1 − γ) + (1/(1+ρ)) K W_{t+1}^{1−γ}/(1 − γ) ].    (1.A5)
    ²⁵ The following solution procedure can be applied also when γ > 1 and the utility function is
    unbounded. To guarantee this result an additional condition will be imposed below; see Stokey, Lucas,
    and Prescott (1989) for further details.

    From this equation, using the constraint and differentiating with respect to c_t, we get
    the first-order condition

        c_t^{−γ} = [(1+r)/(1+ρ)] K [(1 + r)(W_t − c_t)]^{−γ},

    and solving for c_t we obtain the consumption function c_t(W_t):

        c_t = [1/(1 + (1 + r)^{(1−γ)/γ} (1 + ρ)^{−1/γ} K^{1/γ})] W_t,    (1.A6)

    where K is still to be determined.
    To complete the solution, we combine the Bellman equation (1.A5) with the
    consumption function (1.A6) and define

        B ≡ (1 + r)^{(1−γ)/γ} (1 + ρ)^{−1/γ}

    to simplify notation. We can then write

        K W_t^{1−γ}/(1 − γ) = [1/(1 − γ)] [W_t/(1 + B K^{1/γ})]^{1−γ}
            + [1/(1+ρ)] [K/(1 − γ)] [(1 + r) (B K^{1/γ}/(1 + B K^{1/γ})) W_t]^{1−γ},    (1.A7)

    where the terms in square brackets are, respectively, c_t and W_{t+1}. The value of K that
    satisfies (1.A7) is found by equating the coefficient of W_t^{1−γ} on the two sides of the
    equation, noting that (1 + r)^{1−γ}(1 + ρ)^{−1} ≡ B^γ, and solving for K:

        K = [1/(1 − B)]^γ.    (1.A8)
    Under the condition that B < 1, the complete solution of the problem is

        V(W_t) = [1/(1 − (1 + r)^{(1−γ)/γ} (1 + ρ)^{−1/γ})]^γ · W_t^{1−γ}/(1 − γ),
        c(W_t) = [1 − (1 + r)^{(1−γ)/γ} (1 + ρ)^{−1/γ}] W_t.

    A1.2. Uncertainty

    The recursive structure of the problem ensures that, even under uncertainty, the
    solution procedure illustrated above is still appropriate. The consumer's objective
    function to be maximized now becomes

        U_t = E_t Σ_{i=0}^∞ (1/(1+ρ))^i u(c_{t+i}),

    subject to the usual budget constraint (1.2). Now we assume that future labor incomes
    y_{t+i} (i > 0) are uncertain at time t, whereas the interest rate r is known and constant.
    The state variable at time t is the consumer’s certain amount of resources at the end of
    period t : (1 + r ) At + yt . The value function is then Vt ((1 + r ) At + yt ), where subscript
    t means that the value of available resources depends on the information set at time t .
    Under uncertainty, the Bellman equation becomes

        V_t[(1 + r)A_t + y_t] = max_{c_t} { u(c_t) + (1/(1+ρ)) E_t V_{t+1}[(1 + r)A_{t+1} + y_{t+1}] }.    (1.A9)

    The value of V_{t+1}(·) is stochastic, since future incomes are uncertain, and enters (1.A9)
    as an expected value.
    Differentiating with respect to c_t and using the budget constraint, we get the following
    first-order condition:

        u′(c_t) = [(1+r)/(1+ρ)] E_t V′_{t+1}[(1 + r)A_{t+1} + y_{t+1}].

    As in the certainty case, by applying the envelope theorem and using the condition
    obtained above, we have

        V′_t(·) = [(1+r)/(1+ρ)] E_t V′_{t+1}(·) = u′(c_t).

    Combining the last two equations, we finally get the stochastic Euler equation

        u′(c_t) = [(1+r)/(1+ρ)] E_t u′(c_{t+1}),

    already derived in Section 1.1 as the first-order condition of the problem.
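    As a check on the guess-and-verify solution of Appendix A1.1, the sketch below (our own; parameter values are assumptions) performs the backward recursion of the finite-horizon problem. Substituting the conjecture V_t(W) = K_t W^{1−γ}/(1−γ) into (1.A1) reduces the recursion to k_t = 1 + B k_{t+1} with k_t ≡ K_t^{1/γ}, so the marginal propensity to consume 1/k_t converges to the infinite-horizon value 1 − B given by (1.A6) with (1.A8).

    ```python
    # Minimal backward-recursion sketch (illustrative assumptions) for the CRRA
    # problem of Appendix A1.1: iterate k_t = 1 + B*k_{t+1} from the terminal
    # condition K_T = 1 (consume all remaining wealth) and watch the marginal
    # propensity to consume 1/k_t converge to the closed-form value 1 - B.
    gamma, r, rho = 2.0, 0.04, 0.03
    B = (1 + r) ** ((1 - gamma) / gamma) * (1 + rho) ** (-1 / gamma)

    k = 1.0                                     # final period: consume everything, K_T = 1
    for n in range(1, 201):
        k = 1.0 + B * k                         # backward step implied by the Bellman equation
        if n in (1, 5, 50, 200):
            print(f"after {n:>3} backward steps: MPC = {1/k:.4f}")

    print(f"infinite-horizon closed form: MPC = 1 - B = {1 - B:.4f}")
    ```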
    REVIEW EXERCISES
    Exercise 5 Using the basic version of the rational expectations/permanent income model
    (with quadratic utility and r = ρ), assume that labor income is generated by the following
    stochastic process:

        y_{t+1} = ȳ + ε_{t+1} − δε_t,    δ > 0,

    where ȳ is the mean value of income and ε is an innovation with E_t ε_{t+1} = 0.
    (a) Discuss the impact of an increase of ȳ (Δȳ > 0) on the agent's permanent income,
    consumption and saving.
    (b) Now suppose that, in period t + 1 only, a positive innovation in income occurs:
    ε_{t+1} > 0. In all past periods income has been equal to its mean level: y_{t−i} = ȳ for
    i = 0, . . . , ∞. Find the change in consumption between t and t + 1 (Δc_{t+1}) as a
    function of ε_{t+1}, providing the economic intuition for your result.
    (c) With reference to question (b), discuss what happens to saving in periods t + 1 and
    t + 2.
    Exercise 6 Suppose the consumer has the following utility function:

        U_t = Σ_{i=0}^∞ (1/(1+ρ))^i u(c_{t+i}, S_{t+i}),

    where S_{t+i} is the stock of durable goods at the beginning of period t + i. There is no
    uncertainty. The constraints on the optimal consumption choice are:

        S_{t+i+1} = (1 − δ)S_{t+i} + d_{t+i},
        A_{t+i+1} = (1 + r)A_{t+i} + y_{t+i} − c_{t+i} − p_{t+i} d_{t+i},

    where δ is the physical depreciation rate of durable goods, d is the expenditure on durable
    goods, p is the price of durable goods relative to non-durables, and S_t and A_t are given.
    Note that the durable goods purchased at time t + i start to provide utility to the consumer
    only from the following period, as part of the stock at the beginning of period t + i + 1
    (S_{t+i+1}). Set up the consumer's utility maximization problem and obtain the first-order
    conditions, providing the economic intuition for your result.
    Exercise 7 The representative consumer maximizes the following intertemporal utility
    function:

        U_t = E_t Σ_{i=0}^∞ (1/(1+ρ))^i u(c_{t+i}, c_{t+i−1}),

    where

        u(c_{t+i}, c_{t+i−1}) = (c_{t+i} − γc_{t+i−1}) − (b/2)(c_{t+i} − γc_{t+i−1})²,    γ > 0.

    In each period t + i, utility depends not only on current consumption, but also on
    consumption in the preceding period, t + i − 1. All other assumptions made in the chapter
    are maintained (in particular ρ = r).
    (a) Give an interpretation of the above utility function in terms of habit formation.
    (b) From the first-order condition of the maximization problem, derive the dynamic
    equation for c_{t+1}, and check that this formulation of utility violates the property of
    orthogonality of Δc_{t+1} with respect to variables dated t.

    Exercise 8 Suppose that labor income y is generated by the following stochastic process:

        y_t = λy_{t−1} + x_{t−1} + ε_{1t},
        x_t = ε_{2t},

    where x_t (= ε_{2t}) does not depend on its own past values (x_{t−1}, x_{t−2}, . . .) and
    E(ε_{1t} · ε_{2t}) = 0. x_{t−1} is the only additional variable (realized at time t − 1) which affects
    income in period t besides past income y_{t−1}. Moreover, suppose that the information set
    used by agents to calculate their permanent income y^P_t is I_{t−1} = {y_{t−1}, x_{t−1}}, whereas
    the information set used by the econometrician to estimate the agents' permanent income
    is Ω_{t−1} = {y_{t−1}}. Therefore, the additional information in x_{t−1} is used by agents in
    forecasting income but is ignored by the econometrician.
    (a) Using equation (1.7) in the text (lagged one period), find the changes in permanent
    income computed by the agents (Δy^P_t) and by the econometrician (Δỹ^P_t),
    considering the different information set used (I_{t−1} or Ω_{t−1}).
    (b) Compare the variance of Δy^P_t and Δỹ^P_t, and show that the variability of permanent
    income according to agents' forecast is lower than the variability obtained by the
    econometrician with limited information. What does this imply for the interpretation
    of the excess smoothness phenomenon?
    Exercise 9 Consider the consumption choice of an individual who lives for two periods
    only, with consumption c_1 and c_2 and incomes y_1 and y_2. Suppose that the utility function
    in each period is

        u(c) = a c − (b/2)c²   for c < a/b;
        u(c) = a²/(2b)         for c ≥ a/b.

    (Even though the above utility function is quadratic, we rule out the possibility that a
    higher consumption level reduces utility.)
    (a) Plot marginal utility as a function of consumption.
    (b) Suppose that r = ρ = 0, y_1 = a/b, and y_2 is uncertain:

        y_2 = a/b + σ,  with probability 0.5;
        y_2 = a/b − σ,  with probability 0.5.

    Write the first-order condition relating c_1 to c_2 (random variable) if the consumer
    maximizes expected utility. Find the optimal consumption when σ = 0, and discuss the
    effect of a higher σ on c_1.

    FURTHER READING

    The consumption theory based on the intertemporal smoothing of optimal consumption
    paths builds on the work of Friedman (1957) and Modigliani and Brumberg (1954).
    A critical assessment of the life-cycle theory of consumption (not explicitly mentioned
    in this chapter) is provided by Modigliani (1986). Abel (1990, part 1), Blanchard and
    Fischer (1989, para. 6.2), Hall (1989), and Romer (2001, ch. 7) present consumption
    theory at a technical level similar to ours. Thorough overviews of the theoretical and
    empirical literature on consumption can be found in Deaton (1992) and, more recently,
    in Browning and Lusardi (1997) and Attanasio (1999), with a particular focus on the
    evidence from microeconometric studies. When confronting theory and microeconomic
    data, it is of course very important (and far from straightforward) to account for
    heterogeneous objective functions across individuals or households. In particular,
    empirical work has found that theoretical implications are typically not rejected when
    the marginal utility function is allowed to depend flexibly on the number of children
    in the household, on the household head's age, and on other observable characteristics.
    Information may also be heterogeneous: the information set of individual agents need
    not be more refined than the econometrician's (Pischke, 1995), and survey measures of
    expectations formed on its basis can be used to test theoretical implications (Jappelli
    and Pistaferri, 2000).
    The seminal paper by Hall (1978) provides the formal framework for much later
    work on consumption, including the present chapter. Flavin (1981) tests the empirical
    implications of Hall's model, and finds evidence of excess sensitivity of consumption
    to expected income. Campbell (1987) and Campbell and Deaton (1989) derive
    theoretical implications for saving behavior and address the problem of excess smoothness
    of consumption to income innovations. Campbell and Deaton (1989) and Flavin (1993)
    also provide the joint interpretation of "excess sensitivity" and "excess smoothness"
    outlined in Section 1.2.
    Empirical tests of the role of liquidity constraints, also with a cross-country perspective,
    are provided by Jappelli and Pagano (1989, 1994), Campbell and Mankiw (1989, 1991)
    and Attanasio (1995, 1999). Blanchard and Mankiw (1988) stress the importance of
    the precautionary saving motive, and Caballero (1990) solves analytically the
    optimization problem with precautionary saving assuming an exponential utility
    function, as in Section 1.3. Weil (1993) solves the same problem in the case of constant
    but unrelated intertemporal elasticity of substitution and relative risk aversion
    parameters.
    A precautionary saving motive arises also in the models of Deaton (1991) and Carroll
    (1992), where liquidity constraints force consumption to closely track current income
    and induce agents to accumulate a limited stock of financial assets to support
    consumption in the event of sharp reductions in income (buffer-stock saving). Carroll
    (1997, 2001) argues that the empirical evidence on consumers' behavior can be well
    explained by incorporating in the life-cycle model both a precautionary saving motive
    and a moderate degree of impatience. Sizeable responses of consumption to predictable
    income changes are also generated by models of dynamically inconsistent preferences
    arising from hyperbolic discounting of future utility; Angeletos et al. (2001) and
    Frederick, Loewenstein, and O'Donoghue (2002) provide surveys of this strand of
    literature.
    The general setup of the CCAPM used in Section 1.4 is analyzed in detail by Campbell,
    Lo, and MacKinley (1997, ch. 8) and Cochrane (2001). The model's empirical
    implications with a CRRA utility function and a lognormal distribution of returns and
    consumption are derived by Hansen and Singleton (1983) and extended by, among
    others, Campbell (1996). Campbell, Lo, and MacKinley (1997) also provide a complete
    survey of the empirical literature. Campbell (1999) has documented the international
    relevance of the equity premium and the risk-free rate puzzles, originally formulated by
    Mehra and Prescott (1985) and Weil (1989). Aiyagari (1993), Kocherlakota (1996), and
    Cochrane (2001, ch. 21) survey the theoretical and empirical literature on this topic.
    Costantinides, Donaldson, and Mehra (2002) provide an explanation of those puzzles
    by combining a life-cycle perspective and borrowing constraints. Campbell and
    Cochrane (1999) develop the CCAPM with habit formation behavior outlined in
    Section 1.4 and test it on US data. An exhaustive survey of the theory and the empirical
    evidence on consumption, asset returns, and macroeconomic fluctuations is found in
    Campbell (1999).
    Dynamic programming methods with applications to economics can be found in Dixit
    (1990), Sargent (1987, ch. 1) and Stokey, Lucas, and Prescott (1989), at an increasing
    level of difficulty and analytical rigor.

    REFERENCES

    Abel, A. (1990) "Consumption and Investment," in B. Friedman and F. Hahn (ed.), Handbook of Monetary Economics, Amsterdam: North-Holland.
    Aiyagari, S. R. (1993) "Explaining Financial Market Facts: the Importance of Incomplete Markets and Transaction Costs," Federal Reserve Bank of Minneapolis Quarterly Review, 17, 17–31.
    Aiyagari, S. R. (1994) "Uninsured Idiosyncratic Risk and Aggregate Saving," Quarterly Journal of Economics, 109, 659–684.
    Angeletos, G.-M., D. Laibson, A. Repetto, J. Tobacman, and S. Winberg (2001) "The Hyperbolic Consumption Model: Calibration, Simulation and Empirical Evaluation," Journal of Economic Perspectives, 15(3), 47–68.
    Attanasio, O. P. (1995) "The Intertemporal Allocation of Consumption: Theory and Evidence," Carnegie–Rochester Conference Series on Public Policy, 42, 39–89.
    Attanasio, O. P. (1999) "Consumption," in J. B. Taylor and M. Woodford (ed.), Handbook of Macroeconomics, vol. 1B, Amsterdam: North-Holland, 741–812.
    Blanchard, O. J. and S. Fischer (1989) Lectures on Macroeconomics, Cambridge, Mass.: MIT Press.
    Blanchard, O. J. and N. G. Mankiw (1988) "Consumption: Beyond Certainty Equivalence," American Economic Review (Papers and Proceedings), 78, 173–177.
    Browning, M. and A. Lusardi (1997) "Household Saving: Micro Theories and Micro Facts," Journal of Economic Literature, 34, 1797–1855.
    Caballero, R. J. (1990) "Consumption Puzzles and Precautionary Savings," Journal of Monetary Economics, 25, 113–136.
    Campbell, J. Y. (1987) "Does Saving Anticipate Labour Income? An Alternative Test of the Permanent Income Hypothesis," Econometrica, 55, 1249–1273.
    Campbell, J. Y. (1996) "Understanding Risk and Return," Journal of Political Economy, 104, 298–345.
    Campbell, J. Y. (1999) "Asset Prices, Consumption and the Business Cycle," in J. B. Taylor and M. Woodford (ed.), Handbook of Macroeconomics, vol. 1C, Amsterdam: North-Holland.
    Campbell, J. Y. and J. H. Cochrane (1999) "By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior," Journal of Political Economy, 2, 205–251.
    Campbell, J. Y. and A. Deaton (1989) "Why is Consumption So Smooth?" Review of Economic Studies, 56, 357–374.
    Campbell, J. Y. and N. G. Mankiw (1989) "Consumption, Income and Interest Rates: Reinterpreting the Time-Series Evidence," NBER Macroeconomics Annual, 4, 185–216.
    Campbell, J. Y. and N. G. Mankiw (1991) "The Response of Consumption to Income: a Cross-Country Investigation," European Economic Review, 35, 715–721.
    Campbell, J. Y., A. W. Lo, and A. C. MacKinley (1997) The Econometrics of Financial Markets, Princeton: Princeton University Press.
    Carroll, C. D. (1992) "The Buffer-Stock Theory of Saving: Some Macroeconomic Evidence," Brookings Papers on Economic Activity, 2, 61–156.
    Carroll, C. D. (1997) "Buffer-Stock Saving and the Life Cycle/Permanent Income Hypothesis," Quarterly Journal of Economics, 102, 1–55.
    Carroll, C. D. (2001) "A Theory of the Consumption Function, With and Without Liquidity Constraints," Journal of Economic Perspectives, 15(3), 23–45.
    Cochrane, J. H. (2001) Asset Pricing, Princeton: Princeton University Press.
    Costantinides, G. M., J. B. Donaldson, and R. Mehra (2002) "Junior Can't Borrow: A New Perspective on the Equity Premium Puzzle," Quarterly Journal of Economics, 117, 269–298.
    Deaton, A. (1991) "Saving and Liquidity Constraints," Econometrica, 59, 1221–1248.
    Deaton, A. (1992) Understanding Consumption, Oxford: Oxford University Press.
    Dixit, A. K. (1990) Optimization in Economic Theory, 2nd edn, Oxford: Oxford University Press.
    Flavin, M. (1981) "The Adjustment of Consumption to Changing Expectations about Future Income," Journal of Political Economy, 89, 974–1009.
    Flavin, M. (1993) "The Excess Smoothness of Consumption: Identification and Interpretation," Review of Economic Studies, 60, 651–666.
    Frederick, S., G. Loewenstein, and T. O'Donoghue (2002) "Time Discounting and Time Preference: A Critical Review," Journal of Economic Literature, 40, 351–401.
    Friedman, M. (1957) A Theory of the Consumption Function, Princeton: Princeton University Press.
    Hall, R. E. (1978) "Stochastic Implications of the Permanent Income Hypothesis: Theory and Evidence," Journal of Political Economy, 96, 971–987.
    Hall, R. E. (1989) "Consumption," in R. Barro (ed.), Handbook of Modern Business Cycle Theory, Oxford: Basil Blackwell.
    Hansen, L. P. and K. J. Singleton (1983) "Stochastic Consumption, Risk Aversion, and the Temporal Behavior of Asset Returns," Journal of Political Economy, 91, 249–265.
    Jappelli, T. and M. Pagano (1989) "Consumption and Capital Market Imperfections: An International Comparison," American Economic Review, 79, 1099–1105.
    Jappelli, T. and M. Pagano (1994) "Saving, Growth and Liquidity Constraints," Quarterly Journal of Economics, 108, 83–109.
    Jappelli, T. and L. Pistaferri (2000) "Using Subjective Income Expectations to Test for Excess Sensitivity of Consumption to Predicted Income Growth," European Economic Review, 44, 337–358.
    Kocherlakota, N. R. (1996) "The Equity Premium: It's Still a Puzzle," Journal of Economic Literature, 34(1), 42–71.
2 Dynamic Models of Investment

Macroeconomic IS–LM models assign a crucial role to business investment flows in linking the goods market and the money market. As in the case of consumption, however, elementary textbooks do not explicitly study investment behavior in terms of a formal dynamic optimization problem. Rather, they offer qualitatively sensible interpretations of investment behavior at a point in time. In this chapter we analyze investment decisions from an explicitly dynamic perspective. We simply aim at introducing dynamic continuous-time optimization techniques, which will also be used in the following chapters, and at offering a formal, hence more precise, interpretation of qualitative approaches to the behavior of private investment in macroeconomic models encountered in introductory textbooks. Other aspects of the subject matter are too broad and complex for exhaustive treatment here: empirical applications of the theories we analyze and the role of financial imperfections are mentioned briefly at the end of the chapter, referring readers to existing surveys of the subject.

As in Chapter 1's study of consumption, in applying dynamic optimization methods to macroeconomic investment phenomena, one can view the dynamics of aggregate variables as the solution of a "representative agent" problem. In this chapter we study the dynamic optimization problem of a firm that aims at maximizing present discounted cash flows. We focus on technical insights rather than on empirical implications, and the problem's setup may at first appear quite abstract. When characterizing its solution, however, we will emphasize analogies between the optimality conditions of the formal problem and simple qualitative approaches familiar from undergraduate textbooks. This will make it possible to apply economic intuition to mathematical formulas that would otherwise appear abstruse, and to verify the robustness of qualitative insights by deriving them from precise formal assumptions.

Section 2.1 introduces the notion of "convex" adjustment costs, i.e. technological features that penalize fast investment.
The next few sections illustrate the character of investment decisions from a partial equilibrium perspective: we take as given the firm's demand and production functions, the dynamics of the price of capital and of other factors, and the discount rate applied to future cash flows. Optimal investment decisions by firms are forward looking, and should be based on expectations of future events. Relevant techniques and mathematical results introduced in this context are explained in detail in the Appendix to this chapter. The technical treatment of firm-level investment decisions sets the stage for a discussion of an explicitly dynamic version of the familiar IS–LM model. The final portion of the chapter returns to the firm-level perspective and studies specifications where adjustment costs do not discourage fast investment, but do impose irreversibility constraints, and Section 2.8 briefly introduces technical tools for the analysis of this type of problem in the presence of uncertainty.

2.1. Convex Adjustment Costs

In what follows, F(t) denotes the difference between a firm's cash receipts and outlays during period t. We suppose that such cash flows depend on the capital stock K(t) available at the beginning of the period, on the flow I(t) of investment during the period, and on the amount N(t) employed during the period of another factor of production, dubbed "labor":

F(t) = R(t, K(t), N(t)) − Pk(t) G(I(t), K(t)) − w(t) N(t).   (2.1)

The R(·) function represents the flow of revenues obtained from sales of the firm's production flow. This depends on the amounts employed of the two factors of production, K and N, and also on the technological efficiency of the production function and/or the strength of demand for the firm's product. In (2.1), possible variations over time of such exogenous features of the firm's technological and market environment are taken into account by including the time index t alongside K and N as arguments of the revenue function. We assume that revenue flows are increasing in both factors, i.e.

∂R(·)/∂K > 0,   ∂R(·)/∂N > 0,   (2.2)
    as is natural if the marginal productivity of all factors and the market price of
    the product are positive. To prevent the optimal size of the firm from diverging
    to infinity, it is necessary to assume that the revenue function R(·) is concave
    in K and N. If the price of its production is taken as given by the firm, this is
    ensured by non-increasing returns to scale in production. If instead physical
    returns to scale are increasing, the revenue function R(·) can still be concave
    if the firm has market power and its demand function’s slope is sufficiently
    negative.
    The two negative terms in the cash-flow expression (2.1) represent costs
    pertaining to investment, I , and employment of N. As to the latter, in this
    chapter we suppose that its level is directly controlled by the firm at each point
    in time and that utilization of a stock of labor N entails a flow cost w per
    unit time, just as in the static models studied in introductory microeconomic
    courses. As to investment costs, a formal treatment of the problem needs to

    be precise as to the moment when the capital stock used in production during
    each period is measured. If we adopt the convention that the relevant stock
    is measured at the beginning of the period, it is simply impossible for the
    firm to vary K (t ) at time t . When the production flow is realized, the firm
    cannot control the capital stock, but can only control the amount of positive
    or negative investment: any resulting increase or decrease of installed capital
    begins to affect production and revenues only in the following period. On this
    basis, the dynamic accumulation constraint reads
K(t + Δt) = K(t) + I(t)Δt − δK(t)Δt,   (2.3)

where δ denotes the depreciation rate of capital, and Δt is the length of the time period over which we measure cash flows and the investment rate per unit time I(t).
By assumption, the firm cannot affect current cash flows by varying the available capital stock. The amount of gross investment I(t) during period Δt does, however, affect the cash flow: in (2.1) investment costs are represented by a price Pk(t) times a function G(·) which, as in Figure 2.1, we shall assume increasing and convex in I(t):

∂G(·)/∂I > 0,   ∂²G(·)/∂I² > 0.   (2.4)
    The function G (·) is multiplied by a price in the definition (2.1) of cash
    flows. Hence it is defined in physical units, just like its arguments I and
    K . For example, it might measure the physical length of a production line,
    or the number of personal computers available in an office. The investment
    Figure 2.1. Unit investment costs

    rate I (t ) is linearly related to the change in capital stock in equation (2.3)
    but, since G (·) is not linear, the cost of each unit of capital installed is not
    constant. For instance, we might imagine that a greenhouse needs to purchase
    G ( I, K ) flower pots in order to increase the available stock by I units, and that
    the quantities purchased and effectively available for future production are
    different because a certain fraction (variable as a function of I and K ) of pots
    purchased break and become useless. In the context of this example it is also
    easy to imagine that a fraction of pots in use also break during each period,
and that the parameter δ represents this phenomenon formally in (2.3).
    While such examples can help reduce the rather abstract character of the
    formal model we are considering, its assumptions may be more easily justified
    in terms of their implications than in those of their literal realism. For pur-
    poses of modeling investment dynamics, the crucial feature of the G ( I, K )
    function is the strict convexity assumed in (2.4). This implies that the average
    unit cost (measured, after normalization by Pk , by the slope of lines such as
    OA and OB in Figure 2.1) of investment flows is increasing in the total flow
    invested during a period. Thus, a given total amount of investment is less
    costly when spread out over multiple periods than when it is concentrated
    in a single period. For this reason, the optimal investment policy implied by
    convex adjustment costs is to some extent gradual.
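The following minimal Python sketch illustrates the point numerically. The particular cost function used here, G(I, K) = I + (b/2)I²/K, is an assumption chosen only for illustration (it is increasing and strictly convex in I, with G(0, K) = 0 and unit slope at I = 0, as described below); it is not the text's general specification.

```python
# Illustrative convex adjustment cost function (an assumed form, not the text's):
#   G(I, K) = I + (b/2) * I**2 / K
def G(I, K, b=2.0):
    return I + 0.5 * b * I ** 2 / K

K = 10.0
for I in (1.0, 2.0, 4.0):
    # average cost per unit installed rises with the size of the investment flow
    print(f"I = {I}: total cost {G(I, K):.2f}, average cost per unit {G(I, K) / I:.2f}")

# Installing 4 units at once versus 2 units in each of two periods (discounting ignored):
# the split programme is cheaper, which is the sense in which convexity favours gradualism.
print("all at once:", G(4.0, K), " split over two periods:", 2 * G(2.0, K))
```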
    The functional form of investment costs plays an important role not only
    when the firm intends to increase its capital stock, but also when it wishes
    to keep it constant, or decrease it. It is quite natural to assume that the firm
    should not bear costs when gross investment is zero (and capital may evolve
over time only as a consequence of exogenous depreciation at rate δ). Hence,
    as in Figure 2.1,
    G (0, ·) = 0,
and the positive first derivative assumed in (2.4) implies that G(I, ·) < 0 for I < 0: the cost function is negative (and makes positive contributions to the firm's cash flow) when gross investment is negative, and the firm is selling used equipment or structures. In the figure, the G(·) function lies above a 45° line through the origin, and it is tangent to it at zero, where its slope is unitary: ∂G(0, ·)/∂I = 1. This property makes it possible to interpret Pk as "the" unit price of capital goods, a price that would apply to all units installed if the convexity of G(I, ·) did not deter larger than infinitesimal investments of either sign. When negative investment rates are considered, convexity of adjustment costs similarly implies that the unit amount recouped from each unit scrapped (as measured by the slope of lines such as OB) is smaller when I is more negative, and this makes speedy reduction of the capital stock unattractive.

Comparing the slope of lines such as OA and OB, it is immediately apparent that alternating positive and negative investments is costly: even though there are no net effects on the final capital stock, the firm cannot fully recoup the original cost of positive investment from subsequent negative investment. First increasing, then decreasing the capital stock (or vice versa) entails adjustment costs.

In summary, the form of the function displayed in Figure 2.1 implies that investment decisions should be based not only on the contribution of capital to profits at a given moment in time, but also on their future outlook. If the relevant exogenous conditions indexed by t in R(·) and the dynamics of the other, equally exogenous, variables Pk(t), w(t), r(t) suggest that the firm should vary its capital stock, the adjustment should be gradual, as will be set out below. Moreover, if large positive and negative fluctuations of exogenous variables are expected, the firm should not vary its investment rate sharply, because the cost and revenues generated by upward and downward capital stock fluctuations do not offset each other exactly. Convexity of the adjustment cost function implies that the total cost of any given capital stock variation is smaller when that variation is diluted through time, hence the firm should behave in a forward looking fashion when choosing the dynamics of its investment rate and should try to keep the latter stable by anticipating the dynamics of exogenous variables.

2.2. Continuous-Time Optimization

Neither the realism nor the implications of convex adjustment costs depend on the length Δt of the period over which revenue, cost, and investment flows are measured. The discussion above, however, was based on the idea that current investment cannot increase the capital stock available for use within each such period, implying that K(t) could be taken as given when evaluating opportunities for further investment. This accounting convention, of course, is more accurate when the length of the period is shorter.

Accordingly, we consider the limit case where Δt → 0, and suppose that the firm makes optimizing choices at every instant in continuous time. Optimization in continuous time yields analytically cleaner and often more intuitive results than qualitatively similar discrete-time specifications, such as those encountered in this book when discussing consumption (in Chapter 1) and labor demand under costly adjustment (in Chapter 3).
We also assume, for now, that the dynamics of exogenous variables is deterministic. (Only at the end of the chapter do we introduce uncertainty in a continuous-time investment problem.) This also makes the problem different from that discussed in Chapter 1: the characterization offered by continuous-time models without uncertainty is less easily applicable to empirical discrete-time observations, but is also quite insightful, and each of the modeling approaches we outline could fruitfully be applied to the various substantive problems considered. The economic intuition afforded by the next chapter's models of labor demand under uncertainty would be equally valid if applied to investment in plant and equipment rather than in workers, and we shall encounter consumption and investment problems in continuous time (and in the absence of uncertainty) when discussing growth models in Chapter 4.

In continuous time, the maximum present value (discounted at rate r) of cash flows generated by a production and investment program can be written as an integral:

V(0) ≡ max ∫₀^∞ F(t) e^{−∫₀^t r(s)ds} dt,   subject to K̇(t) = I(t) − δK(t), for all t.   (2.5)

The Appendix to this chapter defines the integral and offers an introduction to Hamiltonian dynamic optimization. This method suggests a simple recipe for solution of this type of problem (which will also be encountered in Chapter 4). The Hamiltonian of optimization problem (2.5) is

H(t) = e^{−∫₀^t r(s)ds} [ F(t) + λ(t)( I(t) − δK(t) ) ],

where λ(t) denotes the shadow price of capital at time t in current value terms (that is, in terms of resources payable at the same time t). The first-order conditions of the dynamic optimization problem we are studying are

∂H/∂I = 0 ⇒ ∂F(·)/∂I = −λ(t) ⇒ Pk ∂G(·)/∂I = λ(t),

∂H/∂N = 0 ⇒ ∂F(·)/∂N = 0 ⇒ ∂R(·)/∂N = w(t),   (2.6)

−∂H/∂K = d/dt [ λ(t) e^{−∫₀^t r(s)ds} ] ⇒ λ̇ − rλ = −( ∂F(·)/∂K − δλ ).

The limit "transversality" condition must also be satisfied, in the form

lim_{t→∞} e^{−∫₀^t r(s)ds} λ(t) K(t) = 0.   (2.7)

The Appendix shows that these optimality conditions are formally analogous to those of more familiar static constrained optimization problems. Here, we discuss their economic interpretation. The condition

∂R(·)/∂N = w(t)   (2.8)

simply requires that, in flow terms, the marginal revenue yielded by employment of the flexible factor N be equal to its cost w, at every instant t. This is quite intuitive, since the level of N may be freely determined by the firm. The condition

Pk ∂G(·)/∂I = λ(t)   (2.9)

calls for equality, along an optimal investment path, of the marginal value of capital λ(t) and the marginal cost of the investment flows that determine an increase (or decrease) of the capital stock at every instant. That marginal cost is, in the problem we are considering, Pk ∂G(·)/∂I = −∂F(·)/∂I. Such considerations, holding at every given time t, do not suffice to represent the dynamic aspects of the firm's problem. These aspects are in fact crucial in the third condition listed in (2.6), which may be rewritten in the form

rλ = ∂F(·)/∂K − δλ + λ̇

and interpreted in terms of financial asset valuation. For simplicity, let δ = 0. From the viewpoint of time t, the marginal unit of capital adds ∂F/∂K to current cash flows, and this is a "dividend" paid by that unit to its owner at that time (the firm). The marginal unit of capital, however, also offers capital gains, in the amount λ̇.
If the firm attaches a (shadow) value λ to the unit of capital, then it must be the case that its total return in terms of both dividends and capital gains is financially fair. Hence it should coincide with the return rλ that the firm could obtain from λ units of purchasing power in a financial market where, as in (2.5), cash flows are discounted at rate r. If δ > 0, similar considerations hold true but should take into account that a fraction of the marginal unit of capital is lost during every instant of time. Hence its value, amounting to δλ per unit time, needs to be subtracted from current "dividends."
    Such considerations also offer an intuitive economic interpretation of the
    transversality condition (2.7), which would be violated if the “financial” value
λ(t) grew at a rate greater than or equal to the equilibrium rate of return r(s) while the capital stock, and the marginal dividend afforded by the investment policy, tend to a finite limit. In such a case, λ(t) would be influenced by a
    speculative “bubble”: the only reason to hold the asset corresponding to the
    marginal value of capital is the expectation of everlasting further capital gains,
    not linked to profits actually earned from its use in production. Imposing
    condition (2.7), we acknowledge that such expectations have no economic
    basis, and we deny that purely speculative behavior may be optimal for the
    firm.

    2.2.1. CHARACTERIZING OPTIMAL INVESTMENT
Consider the variable

q(t) ≡ λ(t)/Pk(t),

the ratio of the marginal capital unit's shadow value to the parameter Pk, which represents the market price of capital (that is, the unit cost of investment in the neighborhood of the zero gross investment point, where adjustment costs are negligible).
    This variable, known as marginal q , has a crucial role in the determination
    of optimal investment flows. In fact, the first condition in (2.6) implies that
∂G(I(t), K(t))/∂I(t) = q(t),   (2.10)

and if (2.4) holds then ∂G(·)/∂I is a strictly increasing function of I. Such a function has an inverse: let ι(·) denote the inverse of ∂G(·)/∂I as a function of I. Both ∂G(·)/∂I and its inverse may depend on the capital stock K. The ι(q, K) function implicitly defined by

∂G(ι(q, K), K)/∂I ≡ q

returns investment flows in such a way as to equate the marginal investment cost ∂G(·)/∂I to a given q, for a given K. Condition (2.10) may then be equivalently written

I(t) = ι(q(t), K(t)).   (2.11)
    Since K (t ) is given at time t , (2.11) determines the investment rate as a
    function of q (t ).
    Since, by assumption, the investment cost function G ( I, ·) has unitary slope
    at I = 0, zero gross investment is optimal when q = 1; positive investment
is optimal when q > 1; and negative investment is optimal when q < 1. Intuitively, when q > 1 (hence λ > Pk) capital is worth more inside the firm than in the economy at large; hence it is a good idea to increase the capital stock installed in the firm. Symmetrically, q < 1 suggests that the capital stock should be reduced. In both cases, the speed at which capital is transferred towards the firm or away from it depends not only on the difference between q and unity, but also on the degree of convexity of the G(·) function, that is, on the relevance of capital adjustment costs. If the slope of the function in Figure 2.1 increases quickly with I, even q values very different from unity are associated with modest investment flows.

Exercise 10 Show that, if capital has positive value, then investment would always be positive if the total investment cost were quadratic, for example if G(K, I) = x · I² where Pk = 1 and x ≥ 0 may depend on K. Discuss the realism of more general specifications where G(K, I) = x · I^β for β > 0.
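As a concrete illustration of the inverse function ι(q, K) in (2.11), the short sketch below uses the assumed cost function G(I, K) = I + (b/2)I²/K (unit slope at I = 0), and contrasts it with the quadratic special case mentioned in Exercise 10; both functional forms and all parameter values are assumptions for illustration only.

```python
# With G(I, K) = I + (b/2)*I**2/K we have G_I = 1 + b*I/K, so the inverse of the
# marginal cost as a function of I is iota(q, K) = (q - 1)*K/b:
# zero investment at q = 1, positive for q > 1, negative for q < 1.
def iota(q, K, b=2.0):
    return (q - 1.0) * K / b

for q in (0.8, 1.0, 1.3):
    print(f"q = {q}: optimal gross investment I = {iota(q, K=10.0):.2f}")

# The quadratic case of Exercise 10, G(K, I) = x*I**2 with Pk = 1: G_I = 2*x*I,
# so I = q/(2*x) is positive whenever capital has positive value (q > 0),
# because the unit-slope-at-zero property is lost.
def iota_quadratic(q, x=1.0):
    return q / (2.0 * x)

print("quadratic-cost case, q = 0.5:", iota_quadratic(0.5))
```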
    Determining the optimal investment rate as a function of q does not yield a
    complete solution to the dynamic optimization problem. In fact, in order to
compute q one needs to know the shadow value λ(t) of capital, which—unlike
    the market price of capital, Pk (t )—is part of the problem’s solution, rather
    than part of its exogenous parameterization. However, it is possible to char-
    acterize graphically and qualitatively the complete solution of the problem on
    the basis of the Hamiltonian conditions.
    Since we expressed the shadow value of capital in current terms, calendar
    time t appears in the optimality conditions only as an argument of the func-
tions, such as λ(·) and K(·), which determine optimal choices of I and N.
Noting that

q̇(t) = d/dt [ λ(t)/Pk(t) ] = λ̇(t)/Pk(t) − [ λ(t)/Pk(t) ] [ Ṗk(t)/Pk(t) ],

let us define Ṗk(t)/Pk(t) ≡ π_k (the rate of inflation in terms of capital), and recall that λ̇ = (r + δ)λ − ∂F(·)/∂K by the last optimality condition in (2.6). Thus, we may write the rate of change of q as a function of q itself, of K, and of parameters:

q̇ = (r + δ − π_k) q − (1/Pk) ∂F(·)/∂K.   (2.12)
    In this expression the calendar time t is omitted for simplicity, but all
    variables—particularly those, not explicitly listed, that determine the size of
    cash flows F (·) and their derivative with respect to K —are measured at a
    given moment in time.
Combining the constraint K̇(t) = I(t) − δK(t) with condition (2.11), we obtain a relationship between the rate of change of K, K itself, and the level of q:

K̇ = ι(q, K) − δK.   (2.13)
    Now, if we suppose that all exogenous variables are constant (including the
price of capital Pk, to imply that π_k = 0), and recall that the investment rate
    and N depend on q and K through the optimality conditions in (2.6), the
    time-varying elements of the system formed by (2.12) and (2.13) are just q (t )
    and K (t )—that is, precisely those for whose dynamics we have derived explicit
    expressions.
    Thus, the dynamics of the two variables may be studied in the phase diagram
    of Figure 2.2. On the axes of the diagram we measure the dynamic variables
    of interest. On the horizontal axis of this and subsequent diagrams, one reads
    the level of K ; on the vertical axis, a level of q . If only K and q —and variables

    Figure 2.2. Dynamics of q (supposing that ∂ F (·)/∂ K is decreasing in K )
uniquely determined by them, such as the investment rate I = ι(q, K)—are
    time-varying, then each point in (K , q )-space is uniquely associated with
    their dynamic changes. Picking any point in the diagram, and knowing the
    functional form of the expressions in (2.12) and (2.13), one could in prin-
    ciple compute both q̇ and K̇ . Graphically, the movement in time of the two
    variables may be represented by placing in the diagram appropriately oriented
    arrows.
    In practice, the characterization exercise needs first to identify points where
    one of the variables remains constant in time. In Figure 2.2, the downward-
    sloping line represents combinations of K and q such that the expression on
    the right-hand side of (2.12) is zero. This is the case when
q = (r + δ)⁻¹ (1/Pk) ∂F(·)/∂K.
Given that (r + δ)Pk > 0, the locus of points along which q̇ = 0 has a negative
    slope if a higher capital stock is associated with a smaller “dividend” ∂ F (·)/∂ K
    from the marginal capital unit in (2.12).
This is not, in general, guaranteed by the condition ∂²F(·)/∂K² < 0. When drawing the phase diagram, in fact, the firm's cash flow,

F(·) = R(t, K, N) − Pk(t)G(I, K) − w(t)N,

should be evaluated under the assumptions that the flexible factor N is always adjusted so as to satisfy the condition ∂R(K, N)/∂N = w, and that investment satisfies the condition ∂G(I, K)/∂I = λ. Thus, as K varies, both the optimal employment of N, which we may write as N* = n(K, w), and the optimal investment flow ι(K, λ) vary as well. Exercise 12 highlights certain implications of this fact for a properly drawn phase diagram. It will be convenient for now to suppose that the q̇ = 0 locus slopes downwards, as is the case (for example) if the adjustment cost function G(·) does not depend on K and revenues R(·) are an increasing and strictly concave function of K only.

Once we have identified the locus of points where q̇ = 0, we need to determine the sign of q̇ for points in the diagram that are not on that locus. For each level of K, one and only one level of q implies that q̇ equals zero. If for example we consider point A along the horizontal axis of the figure, q is steady only if its level is at the height of point B. If we move up to a higher value of q for the same level of K, such as that corresponding to point C in the figure, equation (2.12)—where q is multiplied by r + δ > 0—implies that q̇ is not equal to
    zero, as in point B, but is larger than zero. In the figure this is represented
    by an upward-pointing arrow: if one imagines placing a pen on the diagram
    at point C, and following the dynamic instructions given by (2.12), the pen
    should slide towards even higher values of q . The same reasoning holds for
    all points above the q̇ = 0 locus, for example point D, whence an upward-
    sloping arrow also starts. The speed of the dynamic movement represented is
larger for larger values of r + δ, and for greater distances from the stationary
    locus: the latter fact could be represented by drawing larger arrows for points
    farther from the q̇ = 0 locus. To convince oneself that q̇ > 0 in D, one may
    also consider point E on q̇ = 0 and, holding q constant, note that, if (2.12)
    identifies a downward-sloping locus, then a higher level of K must result in q̇
    larger than zero. Symmetrically, we have q̇ < 0 at every point below and to the left of the q̇ = 0 locus, such as those marked with downward-sloping arrows in the figure. Applying the same reasoning to equation (2.13) enables us to draw Figure 2.3. To determine the slope of the locus along which K̇ = 0, note that the right-hand side of (2.13) is certainly increasing in q since a higher q is associated with a larger investment flow. The effect on K̇ of a higher K is ambiguous: as long as ‰ > 0 it is certainly negative through the second term,
    Figure 2.3. Dynamics of K (supposing that ∂È(·)/ ∂ K − ‰ < 0) INVESTMENT 59 Figure 2.4. Phase diagram for the q and K system but it may be positive through the first term. If a firm with a larger installed capital stock bears smaller unit costs for installation of a given additional investment flow I , a larger optimal investment flow is associated with a given q , and a larger K has a negative effect on G ( · ) and a positive effect on È( · ). The relevance of this channel is studied in exercise 12, but the figure is drawn supposing that the negative effect dominates the positive one—for example, because the adjustment cost function G ( · ) does not depend on K , and ‰ > 0
    suffices to imply a positive slope for the K̇ = 0 locus. It is then easy to show
    that K̇ > 0, as indicated by arrows pointing to the right, at all points above
    that locus; a value of q higher than that which would maintain a steady capital
    stock, in fact, can only be associated with a larger investment flow and an
    increasing K . Symmetrically, arrows point to the left at all points below the
    K̇ = 0 locus.
    Figure 2.4, which simply superimposes the two preceding figures, considers
    the joint dynamic behavior of q and K . Since arrows point up and to the
    right in the region above both stationary loci, from that region the system can
    only diverge (at the increasing speed implied by values of q and K that are
    increasingly far from those consistent with their stability) towards infinitely
    large values of q and/or K . Such dynamic behavior is quite peculiar from the
    economic point of view, and in fact it can be shown to violate the transversality
    condition (2.7) for plausible forms of the F (·) function. Also, starting from
    points in the lower quadrant of the diagram, the dynamics of the system,
    driven by arrows pointing left and downwards, can only lead to economically
    nonsensical values of q and/or K .
    The system’s configuration is much more sensible at the point where the
    K̇ = 0 and q̇ = 0 loci cross, the unique steady state of the dynamic system
    we are considering. Thus, we can focus attention on dynamic paths starting
    from the left and right regions of Figure 2.4, where arrows pointing towards
    the steady state allow the dynamic system to evolve in its general direction.

    Figure 2.5. Saddlepath dynamics
    As shown in Figure 2.5, however, it is quite possible for trajectories start-
    ing in those regions to cross the K̇ = 0 locus (vertically) or the q̇ = 0 locus
    (horizontally) and then, instead of reaching the steady state, proceed in the
    regions where arrows point away from it—implying that (2.7) is violated, or
    that capital eventually becomes negative.
    In the figure, however, a pair of dynamic paths is drawn that start from
    points to the left and right of the steady state and continue towards it (at
    decreasing speed) without ever meeting the system’s stationarity loci. All
    points along such paths are compatible with convergence towards the steady
    state, and together form the saddlepath of the dynamic system. For any given
    K , such as that labeled K (0) in the figure, only one level of q (or, equiva-
    lently, only one rate of investment) puts the system on a trajectory converging
    towards the steady state. If q were higher, and the I (0) investment rate larger,
    the firm should continue to invest at a rate faster than that leading to the
    steady state in order to keep on satisfying the last optimality condition in
(2.6), and the dynamic equation (2.12) deriving from it. Sooner or later, this
    would lead the firm to cross the q̇ = 0 line and, along a path of ever increasing
    investment, to violate the transversality condition. Symmetrically, if the firm
    invested less than what is implied by the saddlepath value of q , it would find
    itself investing less and less over time, and would diverge towards excessively
    small capital stocks rather than converge to the steady state.
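A rough numerical sketch of this saddlepath logic follows. The functional forms used here are assumptions chosen only for illustration, not the text's general specification: marginal revenue of capital F_K(K) = aAK^(a−1) (with the flexible factor already optimized out), adjustment costs G(I, K) = I + (b/2)I²/K, and hence ι(q, K) = (q − 1)K/b. Integrating the system (2.12)–(2.13) backwards in time from a point close to the steady state ("reverse shooting") traces an approximation of the converging path, because the direction that is stable forward in time is the one that expands backward in time.

```python
# Assumed illustrative forms: F_K(K) = a*A*K**(a-1), iota(q, K) = (q - 1)*K/b.
A, a, b, r, delta, Pk = 1.0, 0.5, 2.0, 0.05, 0.10, 1.0

def q_dot(q, K):                      # equation (2.12), with pi_k = 0
    return (r + delta) * q - a * A * K ** (a - 1) / Pk

def K_dot(q, K):                      # equation (2.13)
    return (q - 1.0) * K / b - delta * K

# Steady state: K_dot = 0 gives q_ss = 1 + b*delta; q_dot = 0 then pins down K_ss.
q_ss = 1.0 + b * delta
K_ss = (a * A / ((r + delta) * Pk * q_ss)) ** (1.0 / (1.0 - a))

# Reverse shooting: step backwards in time from a point close to the steady state.
dt = 0.01
K, q = 0.999 * K_ss, q_ss
path = []
for _ in range(40_000):
    path.append((K, q))
    K, q = K - dt * K_dot(q, K), q - dt * q_dot(q, K)   # minus sign: backward in time
    if K < 0.2 * K_ss:
        break
path.reverse()                        # read forward in time: K rises towards K_ss, q falls

print(f"steady state: K_ss = {K_ss:.2f}, q_ss = {q_ss:.2f}")
print("saddle path samples (K, q):",
      [(round(Kv, 2), round(qv, 2)) for Kv, qv in path[::1500][:5]])
```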
    2.3. Steady-State and Adjustment Paths
    For a given (and supposed constant) value of exogenous variables, the firm’s
    investment rate should be that implied by the q level corresponding on the
    saddlepath to the capital stock, which, at any point in time, is determined by
    past investment decisions. The capital stock and its shadow value then move

towards their steady state (if they are not there yet). Setting q̇ = K̇ = 0 and π_k = 0 in (2.12) and (2.13), we can study the steady-state levels q_ss and K_ss:

(r + δ) q_ss = (1/Pk) ∂F(·)/∂K |_{K = K_ss},   (2.14)

ι(q_ss, K_ss) = δ K_ss.   (2.15)

The second equation simply indicates that the gross investment rate I_ss = ι(q_ss, K_ss) must be such as to compensate depreciation in the stock of capital (stock that is constant, by definition, in steady state). The first equation is less obvious. Recalling that q_ss = λ_ss/Pk, however, we may rewrite it as

λ_ss = (r + δ)⁻¹ ∂F(·)/∂K |_{K = K_ss} = ∫_t^∞ e^{−(r+δ)(τ−t)} ∂F(·)/∂K |_{K = K_ss} dτ.

Thus, in steady state the shadow value of capital is equal to the stream of future marginal contributions by capital to the firm's cash flows, discounted at rate r + δ > 0 over the infinite planning horizon. If it were the case that r + δ = 0, the relevant present value would be ill-defined: hence, as mentioned above and discussed in more detail below, it must be the case that r + δ > 0 in a well-defined investment problem.
    The steady state is readily interpreted along the lines of a simple approach
    to investment which should be familiar from undergraduate textbooks (see
    Jorgenson 1963, 1971). One may treat the capital stock as a factor of produc-
tion whose user cost is (r_k + δ)Pk when Pk is the price of each stock unit, r_k = i − ΔPk/Pk is the real rate of interest in terms of capital, and δ is the
    physical depreciation rate of capital. If the profit flow is an increasing concave
    function F (K , . . . ) of capital K , the first-order condition
∂F(K*(. . .), . . .)/∂K = (r_k + δ) Pk   (2.16)
    identifies the K ∗ stock that maximizes F (K , . . . ) in each period, neglecting
adjustment costs. If capital does not depreciate and δ = 0, however, condition (2.15) implies that q_ss = 1, since ∂G(·)/∂I = 1 when I = 0, and condition (2.14) simply calls for capital's marginal productivity to coincide with its financial cost, just as in static approaches to optimal use of capital:

∂F(·)/∂K = r Pk.

If instead δ > 0, then steady-state investment is given by I_ss = δK_ss > 0, and therefore q_ss > 1. The unit cost of capital being installed to offset ongoing depreciation is higher than Pk, because of adjustment costs.
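A small numerical comparison makes the contrast with the frictionless user-cost condition (2.16) concrete. The functional forms (F_K = aAK^(a−1) and ι(q, K) = (q − 1)K/b) and all parameter values are illustrative assumptions, with Pk held constant so that r_k = r.

```python
# Assumed illustrative forms: F_K(K) = a*A*K**(a-1), iota(q, K) = (q - 1)*K/b.
A, a, b, r, delta, Pk = 1.0, 0.5, 2.0, 0.05, 0.10, 1.0

q_ss = 1.0 + b * delta                                              # from (2.15)
K_ss = (a * A / ((r + delta) * Pk * q_ss)) ** (1.0 / (1.0 - a))     # from (2.14)
K_star = (a * A / ((r + delta) * Pk)) ** (1.0 / (1.0 - a))          # from (2.16), no adjustment costs

print(f"q_ss = {q_ss:.2f}  (> 1: replacement investment bears adjustment costs)")
print(f"K_ss = {K_ss:.2f} < K* = {K_star:.2f}: with delta > 0, adjustment costs on "
      f"replacement investment act like a higher user cost of capital")
```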
    Phase diagrams are useful not only for characterizing adjustment paths
    starting from a given initial situation, but also for studying the investment

    effects of permanent changes in parameters. To this end, one may specify a
    functional form for cash flows F (·) in (2.1), as is done in the exercises at the
    end of the chapter, and study the effects of a change in its parameters on the
    q̇ = 0 locus, on the steady-state capital stock, and on the system’s adjustment
    path.
    Consider, for example, the effect of a smaller wage w. This event, as the
    following exercise verifies in a special case, may (or may not) imply an increase
    in the optimal capital stock in the static context of introductory economics
    textbooks—and, equivalently, a higher stock K s s in the steady state of the
    dynamic problem we are studying.
Exercise 11 Suppose that the adjustment cost function G(·) does not depend on the capital stock, and let δ = 0. If the firm's revenue function has the Cobb–Douglas form R(K, N, t) = K^α N^β, does a lower w increase or decrease the steady-state capital stock K_ss?
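The following hedged numerical sketch evaluates the setting of Exercise 11 for one set of illustrative parameter values (it is not a general answer). With δ = 0 and G independent of K, q_ss = 1 and the steady state requires R_K = rPk once N is chosen optimally; α + β < 1 is assumed so that the steady state is well defined.

```python
# Illustrative parameters for Exercise 11's setting (assumed, not from the text).
alpha, beta, r, Pk = 0.3, 0.5, 0.05, 1.0

def K_ss(w):
    # N*(K, w) solves R_N = beta*K**alpha*N**(beta-1) = w, so
    # R_K = alpha*K**(alpha-1)*N*(K, w)**beta = const(w) * K**expo with
    #   expo = (alpha - 1) + alpha*beta/(1 - beta)  (< 0 when alpha + beta < 1),
    #   const(w) = alpha*(beta/w)**(beta/(1 - beta)).
    # The steady-state condition R_K = r*Pk then gives K in closed form.
    expo = (alpha - 1.0) + alpha * beta / (1.0 - beta)
    const = alpha * (beta / w) ** (beta / (1.0 - beta))
    return (r * Pk / const) ** (1.0 / expo)

for w in (1.0, 0.8):
    print(f"w = {w}: K_ss = {K_ss(w):.2f}")
```

For these particular parameter values the lower wage is associated with a larger steady-state capital stock; the exercise asks whether and when this holds in general.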
    At the time when parameters change, however, the capital stock is given.
    The new configuration of the system can affect only q and the investment
    rate, and the resulting dynamics gradually increase (or decrease) the capital
    stock. The gradual character of the optimal adjustment path derives from
    strictly convex adjustment costs, which, as we know, make fast investment
    unattractive. At any time, the speed of adjustment depends on the difference
    between the current and steady-state levels of q . Hence the speed of movement
    along the saddlepath is decreasing, and the growth rate of capital becomes
    infinitesimally small as the steady state is approached. In fact, it is by avoiding
    perpetually accelerating capital and investment trajectories that the “saddle”
    adjustment paths can satisfy the transversality condition.
    It is also interesting to study the effects on investment of future expected
    events. Suppose that at time t = 0 it becomes known that the wage will remain
constant at w(0) until t = T, will then fall to w(T) < w(0), and will remain constant at that new level. The optimal investment flow anticipates such a future exogenous event: if a lower wage and the resulting higher employment of N implies a larger marginal contribution of capital to cash flows, then the firm begins at time zero to invest more than what would be optimal if it were known that w(t) = w(0) for ever. However, since between t = 0 and t = T the wage is still w(0) and there is no reason to increase N for given K, it cannot be optimal for the firm to behave as in the solution of the exercise above, where the wage decreased permanently to w(t) = w(T) for all t.

In order to characterize the optimal investment policy, recall that to avoid divergent dynamics the firm should select a dynamic path that leads towards the steady state while satisfying the optimality conditions. From time T onwards, all parameters are constant and we know that the firm should be on the saddlepath leading to the new steady state. To figure out the dynamics of q and K during the period when the system's dynamics are still those implied by w(0), note that the system should evolve so as to find itself on the new saddlepath at time T, without experiencing discontinuous jumps.

Figure 2.6. A hypothetical jump along the dynamic path, and the resulting time path of λ(t) and investment: (↑ + ↓) ⇒ smaller investment costs

To see why, consider the implications of a dynamic path such that a discontinuous jump of q is needed to bring the system on the saddlepath, as in Figure 2.6. Formally, it would be impossible to define q̇(T), hence λ̇(T), and neither equation (2.12) nor the optimality condition (2.6) could be satisfied. From the economic point of view, recall that a sudden change of q would necessarily entail a similarly abrupt variation of the investment flow, as in the figure. As we know, however, strictly convex adjustment costs imply that such an investment policy is more costly than a smoother version, such as that represented by dots in the figure. Whenever a path with foreseeable discontinuities is considered as an optimal-policy candidate, it can be ruled out by the fact that a more gradual investment policy would reduce overall investment costs. (A more gradual investment policy also affects the capital path, of course, but investment can be redistributed over time so as to make this effect relatively small on a present discounted basis.) Since such reasoning can be applied at every instant, the optimal path is necessarily free of discontinuities—other than the unavoidable one associated with the initial re-optimization in light of new, unforeseen information arriving at time zero.

Figure 2.7. Dynamic effects of an announced future change of w

It is now easy to display graphically, as in Figure 2.7, the dynamic response of the system. Starting from the steady state, the height of q's jump at time zero (when the parameter change to be realized at time T is announced) depends on how far in the future is the expected event. In the limit case where T = 0 (that is, where the parameter change occurs immediately) q would jump directly on the new saddlepath. If, as in Figure 2.7, T is rather far in the future, q jumps to a point intermediate between the initial one (the old steady state, in the figure) and the saddlepath: the firm then follows the dynamics implied by the initial parameters until time T, when the dynamic path meets the new saddlepath.
Intuitively, the firm finds it convenient to dilute over time the adjustment it foresees. For larger values of T the height of the initial jump would be smaller, and the apparently divergent dynamics induced by the expectation of future events would be slower and more prolonged.

These results offer a more precise interpretation of the investment determination assumptions made in the IS–LM model familiar from introductory macroeconomics courses, where business investment I depends on exogenous variables (say, Ī) and negatively on the interest rate. This relationship can be rationalized qualitatively considering that the propensity to invest should depend on (exogenous) expectations of future (hence, discounted) profits to be obtained from capital installed through current investment. From this point of view, any variable relevant to expectations of future profits influences the exogenous component Ī of investment flows. Since the present discounted value of profits is lower for large discount rates, for any given Ī the investment flow I is a decreasing function of the current interest rate i. In the context of the dynamic model we are considering, the firm's investment tends to a steady state, which, inasmuch as it depends on future events, depends in obvious and important ways on expectations.²⁶

2.4. The Value of Capital and Future Cash Flows

As we have seen, in steady state it is possible to express q(t) in terms of the present value of future marginal effects of K on the firm's cash flows. In fact, a similar expression is always valid along an optimal investment path. If we set Pk(t) = 1 for all t (and therefore π_k ≡ 0) for simplicity, then q and λ are equal. The last condition in (2.6) may be written

dλ(τ)/dτ − (r + δ)λ(τ) = −F_K(τ),   (2.17)

where

F_K(τ) = ∂F(τ, K(τ), N(τ))/∂K   (2.18)

denotes the marginal cash-flow effect of capital at every time τ along the firm's optimal trajectory. Multiplying by e^{−(r+δ)τ}, we can rewrite (2.17) in the form

d/dτ [ λ(τ) e^{−(r+δ)τ} ] = −F_K(τ) e^{−(r+δ)τ},

which may be integrated between τ = 0 and τ = T to obtain

e^{−(r+δ)T} λ(T) − λ(0) = − ∫₀^T F_K(τ) e^{−(r+δ)τ} dτ.

In the limit for T → ∞, as long as K(∞) > 0 condition (2.7) implies that the
first term vanishes and

λ(0) = ∫₀^∞ F_K(τ) e^{−(r+δ)τ} dτ.   (2.19)
    Along an optimal investment trajectory, the marginal value of capital at time
    zero is the present value of cash flows generated by an additional unit of
capital at time zero which, depreciating steadily over time at rate δ, adds e^{−δt} units of capital at each time t > 0. Taking as given the capital stock installed at time t, each additional unit of capital increases cash flows according to F_K(·).
    The firm could indeed install such an additional unit and then, keeping its
    investment policy unchanged, increase discounted cash flows by the amount in
    (2.19).
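A quick quadrature check of (2.19) can be useful. In the special case where the marginal cash flow is constant over time (the linear case that reappears below), the discounted integral has the closed form F_K/(r + δ); the numbers used here are illustrative assumptions.

```python
# Approximate the discounted integral in (2.19) with a Riemann sum and compare it
# with the closed form F_K/(r + delta), valid when F_K is constant over time.
import math

r, delta, F_K = 0.05, 0.10, 0.30
dt, T = 0.01, 400.0          # a long but finite horizon approximates infinity

lam0 = sum(F_K * math.exp(-(r + delta) * i * dt) * dt for i in range(int(T / dt)))
print(f"numerical integral ≈ {lam0:.4f}, closed form F_K/(r+delta) = {F_K/(r+delta):.4f}")
```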
²⁶ Keynes (1936, ch. 12) emphasizes the relevance of expectations, later adopted as a key feature of
    Keynesian IS–LM models. Of course, his framework of analysis is quite different from that adopted
    here, and does not quite agree with the notion that investment should always tend to some long-run
    equilibrium configuration.

    This reasoning does not take into account the fact that a hypothetical
    variation of investment (hence of capital in use in subsequent periods) should
    lead the firm to vary its choices of further investment. Any such variation,
    however, has no effect on capital’s marginal value as long as its size is infinites-
    imally small. If at time zero a small additional amount of capital were in fact
    installed, the firm would indeed vary its future investment policy, but only
    by similarly small amounts. This would have no effect on discounted cash
    flows around an optimal trajectory, where first-order conditions are satisfied
    and small perturbations of endogenous variables have no first-order effect on
    the firm’s value.
    This fact, an application of the envelope theorem, makes it possible to com-
    pute capital’s marginal value taking as given the optimal dynamic path of
    capital—or, equivalently, to gauge the optimality of each investment decision
    taking all other such decisions as given. In general, equation (2.19) does not
    offer an explicit solution for Î(0), because its right-hand side depends on
future levels of K whenever ∂F_K(·)/∂K ≠ 0, that is, whenever the function
    linking cash flows to capital is strictly concave. Inasmuch as the marginal
    contribution of capital to cash flows depends on the stock of capital, one
would need to know the level of K(τ) for τ > 0 in order to compute the right-
    hand side of (2.19). But future capital stocks depend on current investment
flows, which in turn depend on the very λ that one is attempting to evaluate.
    The obvious circularity of this reasoning generally makes it impossible to
    compute the optimal policy through this route. For a finite planning horizon
    T , one could obtain a solution starting from the given (possibly zero) value of
    capital at the time when the firm ceases to exist. But if T → ∞ one needs to
    compute the optimal policy as a whole, or at least to characterize it graphically
    as we did above. In fact, it is easy to interpret the dynamics of q in Figure 2.7
    in terms of expected cash flows: favorable exogenous events become nearer in
    time (and are more weakly discounted) along the first portion of the dynamic
    path illustrated in the figure.
It can be the case, however, that F(·) is only weakly concave (hence linear) in K; then F_K(·) ≡ ∂F(·)/∂K does not depend on endogenous variables, equation (2.19) yields an explicit value for λ, and the firm's investment policy follows immediately. For example, if

∂G(·)/∂K = 0,   R(t, K(t), N(t)) = R̃(t) K(t),   (2.20)

then (2.19) reads

λ(0) = ∫₀^∞ R̃(τ) e^{−(r+δ)τ} dτ.   (2.21)
    The first equation in (2.20) states that capital’s installation costs depend only
    on I , not on K . Hence, unit investment costs do depend on the size of

    investment flows per unit time, but the cost of a given capital stock increase is
    independent of the firm’s initial size. The second equation in (2.20) states that
each unit of installed capital makes the same contribution to the firm's revenue flow, again denying that the firm's size is relevant at the margin.
    A relationship in the form (2.21) holds true, more generally, whenever the
    scale of the firm’s operations is irrelevant at the margin. Consider the case of a
    firm using a production function f (K , N) with constant returns to scale, and
    operating in a competitive environment (taking as given prices and wages).
By the constant-returns assumption, f(K, N) = f(K/x, N/x)·x, and, setting x = K, total revenues may be written

R(t, K, N) = P(t) f(K, N) = P(t) f(1, N/K) K.

The first-order condition ∂R(·)/∂N = w, which takes the form

P(t) f_N(1, N/K) = w(t),

determines the optimal N/K ratio as a function ν(·) of the w(t)/P(t) ratio. In the absence of adjustment costs for factor N, this condition holds at all times, and N(t)/K(t) = ν(w(t)/P(t)) for all t. Hence,

F(t) = P(t) f(1, ν(w(t)/P(t))) K − w(t) ν(w(t)/P(t)) K − Pk(t) G(I(t), K(t)),

and, using the first equation in (2.20), we arrive at

∂F(·)/∂K = P(t) f(1, ν(w(t)/P(t))) − w(t) ν(w(t)/P(t)).   (2.22)
    This expression is independent of K , like R̃(·) in (2.21), and allows us to
    conclude that the constant-returns function F (·) is simply proportional to K.
    This algebraic derivation introduces simple mathematical results that will
    be useful when characterizing the average value of capital in the next section. It
    also has interesting implications, however, when one allows for the possibility
    that future realizations of exogenous variables such as w(t ) and P (t ) may
    be random. A formal redefinition of the problem to allow for uncertainty in
    continuous time requires more advanced technical tools, introduced briefly
    in the last section of this chapter. Intuitively, however, if the firm’s objective
    function is defined as the expected value of the integral in (2.5), an expression
    similar to (2.19) should also hold in expectation:
λ(0) = ∫₀^∞ E₀[F_K(τ)] e^{−(r+δ)τ} dτ.   (2.23)
    In discrete time, one would replace the integral with a summation and the
    exponential function with compound discount factors. It would still be true,
    of course, that along an optimal investment trajectory the marginal value of

capital is equal to the present expected value of its contributions to future cash flows.

Figure 2.8. Unit profits as a function of the real wage
    Now, if the firm operates in perfectly competitive markets, produces under
    constant returns, and chooses the flexible factor optimally at all times, so that
    (2.22) holds, then optimal cash flows are a convex function of the real product
    wage, w(t )/ P (t ). It is easy to see why when we consider Figure 2.8, which
    displays the profit accruing to the firm from each unit of capital. (A study of
    unit profits is equivalent to that of total profits if, as in the present case, the
    latter are proportional to the former.) If the firm did not vary its employment
    of N in response to a change in the real wage, then, for given K , the difference
    between revenues and variable-factor costs would be a linear function of the
    real wage. By definition, the profits afforded by optimal adjustment of N
    must be larger for every possible real wage, and will be equal only where the
    supposedly constant employment level is optimal. Thus, profits are a convex
    function of the real wage. Flexibility in employment of N allows the firm to
    use each unit of capital so as to exploit favorable conditions and to limit losses
    in unfavorable ones.
By Jensen's inequality (already encountered when introducing precautionary savings in Chapter 1), the conditions listed above imply that

Var( w(t)/P(t) ) > 0 ⇒ E_t[ F_K(w(t)/P(t)) ] > F_K( E_t[w(t)/P(t)] ).
    Thus, uncertainty increases expected profits earned by each unit of capital, and
    induces more intense investment by a firm that, like the one we are studying, is
    risk-neutral (that is, is concerned only with expectations of future cash flows).
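The convexity argument can be checked with a very small sketch. The technology assumed below, f(1, n) = n^0.5, is an illustration only; with it, unit profits are π(ω) = 1/(4ω), a convex function of the real wage ω, so a mean-preserving spread in ω raises expected unit profits.

```python
# Per unit of installed capital, pi(omega) = max_n { f(1, n) - omega*n }.
# With the assumed f(1, n) = n**0.5: n* = 1/(4*omega**2) and pi(omega) = 1/(4*omega).
def unit_profit(omega):
    return 1.0 / (4.0 * omega)

omega_mean, spread = 1.0, 0.3
expected_profit = 0.5 * (unit_profit(omega_mean - spread) + unit_profit(omega_mean + spread))

print("pi(E[omega]) =", unit_profit(omega_mean))
print("E[pi(omega)] =", round(expected_profit, 4), "(larger, by convexity)")
```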

    2.5. Average Value of Capital
    We now recall the expression for F ( · ) in (2.1), and consider the case where
    R( · ) and G ( · ) are linearly homogeneous as functions of K , N, and I . A
function f(·) is linearly homogeneous if

f(λx, λy) = λ f(x, y),

as in the case of constant-returns production functions. Then, Euler's theorem states that²⁷

f(x, y) = [∂f(x, y)/∂x] x + [∂f(x, y)/∂y] y.
    If G ( I, K ) did not depend on K , as in the case considered above, then it
    could be linearly homogeneous only if adjustment costs were linear (hence
    not strictly convex) in the investment flow I . But in the more general case,
    omitting t and denoting partial derivatives by subscripts as in (2.18), we obtain
F(t) = R(t, K(t), N(t)) − Pk(t) G(I(t), K(t)) − w(t) N(t)
     = K R_K + N R_N − (I G_I + K G_K) Pk − w N
     = (R_K − G_K Pk) K − Pk G_I I,   (2.24)

where the first step applies Euler's theorem to R(·) and G(·), and the second recognizes that R_N = w by the second condition in (2.6).

Noting that R_K − G_K Pk ≡ F_K, and that the other conditions in (2.6) and the accumulation constraint imply

Pk G_I = λ,   F_K = (r + δ)λ − λ̇,   I = K̇ + δK,

equation (2.24) simplifies to

F(t) = r λ(t) K(t) − λ̇(t) K(t) − λ(t) K̇(t)   (2.25)

along an optimal trajectory. It is immediately verified that this is equivalent to

e^{−rt} F(t) = d/dt [ −e^{−rt} λ(t) K(t) ],   (2.26)
    ²⁷ This implies that, if x and y are factors of production whose units are compensated according
    to marginal productivity, then the total compensation of the two factors exhausts production. (There
    are no pure profits.) This will be relevant when, in Ch. 4, we discuss income distribution in a dynamic
    general equilibrium.

and it is easy to evaluate the integral in the definition (2.5) of the firm's value:

V(0) = ∫₀^∞ F(t) e^{−rt} dt = [ −e^{−rt} λ(t) K(t) ]₀^∞ = λ(0) K(0),   (2.27)

where the last step recognizes that lim_{t→∞} e^{−rt} λ(t) K(t) = 0 if the limit condition (2.7) holds.
Thus, λ(0) = V(0)/K(0), and since this holds true for any time zero and all steps are valid for any Pk(t) (constant or variable), we have in general

q(t) ≡ λ(t)/Pk(t) = V(t)/(Pk(t) K(t)).   (2.28)
    Hence marginal q , which in the models considered above determines optimal
    investment, is the same as the ratio of the firm’s market value to the replace-
    ment cost of its capital stock.
    This result offers a precise interpretation for another intuitive idea familiar
    from introductory textbooks, namely the Tobin (1969) notion that investment
    flows may be interpreted on the basis of financial considerations. In other
    words, it is profitable to install capital and increase the production possibilities
    of each firm and of the whole economy only if the cost of investment compares
    favorably to the value of installed capital, as measured by the value of firms in
    the financial market. As we have seen, the average q measure identified by
    the Tobin approach is indeed the determinant of investment decisions when
    firms face convex, linearly homogeneous adjustment costs, and produce under
    constant returns.
    Exercise 12 If the F (·) and G (·) functions are linearly homogeneous in K , I,
    and N (so that average and marginal q coincide), what is the shape of the K̇ = 0
    and q̇ = 0 loci in the phase diagram discussed in Section 2.2.1?
    Such reasoning and results suggest an empirical approach to the study of
    investment. On the basis of equation (2.11), investment should be completely
    explained by q , which in turn is directly measurable from stock market
    and balance-sheet data under the hypotheses listed above. Investment does
    depend on (unobservable) expectations of future events. But, since the same
    expectations also affect the value of the firm in a rational financial market,
    one may test the proposed theoretical framework by considering empirical
    relationships between investment flows and measured q . Of course, both
    the value of the firm (and the average q it implies) and its investment are
    endogenous variables. Hence the empirical strategy is akin to that, based
    on Euler equations for aggregate consumption, encountered in Chapter 1.
    One does not estimate a function relating investment (or consumption) to

    exogenous variables, but rather verifies a property that endogenous variables
    should display under certain theoretical assumptions.
    As regards revenues, the assumption leading to the conclusion that invest-
    ment and average q should be strictly related may be interpreted supposing
    that the firm produces under constant returns to scale and behaves in perfectly
    competitive fashion. As regards adjustment costs, the assumption is that they
    pertain to proportional increases of the firm’s size, rather than to absolute
    investment flows. A larger firm bears smaller costs to undertake a given
    amount of investment, and the whole optimal investment program may be
scaled upwards or downwards if doubling the size of the firm yields the same
    unit investment costs for twice-as-large investment flows, that is if the adjust-
    ment cost function has constant returns to scale and G ( I, K ) = g ( I /K )K .
    The realism of these (like any other) assumptions is debatable, of course. They
    do imply that different initial sizes of the firm simply yield a proportionally
    rescaled optimal investment program. As always under constant returns to
    scale and perfectly competitive conditions, the firm does not have an optimal
    size and, in fact, does not quite have a well-defined identity. In more general
    models, the value of the firm is less intimately linked to its capital stock and
    therefore may vary independently of optimal investment flows.
    2.6. A Dynamic IS–LM Model
    We are now ready to apply the economic insights and technical tools intro-
    duced in the previous sections to study an explicitly macroeconomic, and
    explicitly dynamic, modeling framework. Specifically, we discuss a simplified
    version of the dynamic IS–LM model of Blanchard (1981), capturing the
    interactions between forward-looking prices of financial assets and output
    and highlighting the role of expectations in determining (through investment)
    macroeconomic outcomes and the effects of monetary and fiscal policies. As in
    the static version of the IS–LM model, the level of goods prices is exogenously
    fixed and constant over time. However, the previous sections’ positive rela-
    tionship between the forward-looking q variable and investment is explicitly
    accounted for by the aggregate demand side of the model.
    A linear equation describes the determinants of aggregate goods spending
y^D(t):

y^D(t) = α q(t) + c y(t) + g(t),   α > 0, 0 < c < 1.   (2.29)

Spending is determined by aggregate income y (through consumption), by the flow g of public spending (net of taxes) set exogenously by the fiscal authorities, and by q as the main determinant of private investment spending. We shall view q as the market valuation of the capital stock of the economy incorporated in the level of stock prices: for simplicity, we disregard the distinction between average and marginal q, as well as any role of stock prices in determining aggregate consumption.

Output y evolves over time according to the following dynamic equation:

ẏ(t) = β (y^D(t) − y(t)),   β > 0.   (2.30)
    Output responds to the excess demand for goods: when spending is larger
    than current output, firms meet demand by running down inventories and
    by increasing production gradually over time. In our setting, output is a
    “predetermined” variable (like the capital stock in the investment model of
    the preceding sections) and cannot be instantly adjusted to fill the gap between
    spending and current production.
    A conventional linear LM curve describes the equilibrium on the money
    market:
m(t)/p = h0 + h1 y(t) − h2 r(t),   (2.31)
    where the left-hand side is the real money supply (the ratio of nominal money
    supply m to the constant price level p), and the right-hand side is money
    demand. The latter depends positively on the level of output and negatively on
    the interest rate r on short-term bonds.28 Conveniently, we assume that such
    bonds have an infinitesimal duration; then, the instantaneous rate of return
    from holding them coincides with the interest rate r with no possibility of
    capital gains or losses.
    Shares and short-term bonds are assumed to be perfect substitutes in
    investors’ portfolios (a reasonable assumption in a context of certainty); con-
    sequently, the rates of return on shares and bonds must be equal for any
    arbitrage possibility to be ruled out. The following equation must then hold
    in equilibrium:
π(t)/q(t) + q̇(t)/q(t) = r(t),   (2.32)
where the left-hand side is the (instantaneous) rate of return on shares, made up of the firms' profits π (entirely paid out as dividends to shareholders) and the capital gain (or loss) q̇. At any time this composite rate of return on shares must equal the interest rate on bonds r.²⁹ Finally, profits are positively related to the level of output:

π(t) = a0 + a1 y(t).   (2.33)
    ²⁸ The assumption of a constant price level over time implies a zero expected inflation rate; there
    is then no need to make explicit the difference between the nominal and real rates of return.
    ²⁹ If long-term bonds were introduced as an additional financial asset, a further “no arbitrage”
    equation similar to (2.32) should hold between long and short-term bonds.
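For readers who want to see the system (2.29)–(2.33) in numbers, the sketch below computes the steady state implied by the two stationary loci derived in the following paragraphs. All parameter values are invented for the example and are not calibrated to any economy.

    # Rough numerical sketch of the model's steady state (illustrative parameters).
    from scipy.optimize import brentq

    alpha, c, g = 0.5, 0.6, 1.0                  # aggregate spending, eq. (2.29)
    h0, h1, h2, m_over_p = 0.1, 0.5, 2.0, 1.0    # money market, eq. (2.31)
    a0, a1 = 0.1, 0.2                            # profits, eq. (2.33)

    def q_on_is(y):                 # q such that ydot = 0, from (2.34)
        return ((1 - c) * y - g) / alpha

    def q_on_qdot(y):               # q such that qdot = 0, from (2.35)
        r = (h0 + h1 * y - m_over_p) / h2        # interest rate from the LM curve
        return (a0 + a1 * y) / r

    # the steady state is where the two loci cross (searching where r > 0)
    y_ss = brentq(lambda y: q_on_is(y) - q_on_qdot(y), 2.0, 10.0)
    q_ss = q_on_is(y_ss)
    r_ss = (h0 + h1 * y_ss - m_over_p) / h2
    print(f"y_ss = {y_ss:.3f}, q_ss = {q_ss:.3f}, r_ss = {r_ss:.3f}")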

    Figure 2.9. A dynamic IS–LM model
    The two dynamic variables of interest are output y and the stock market
    valuation q . In order to study the steady-state and the dynamics of the system
    outside the steady-state, following the procedure adopted in the preceding
    sections, we first derive the two stationary loci for y and q and plot them in
    a (q , y)-phase diagram. Setting ẏ = 0 in (2.30) and using the specification
    of aggregate spending in (2.29), we get the following relationship between
    y and q :
y = (α/(1 − c)) q + (1/(1 − c)) g,   (2.34)
    represented as an upward-sloping line in Figure 2.9. A higher value of q stim-
    ulates aggregate spending through private investment and increases output
    in the steady state. This line is the equivalent of the IS schedule in a more
    traditional IS–LM model linking the interest rate to output. For each level of
    output, there exists a unique value of q for which output equals spending:
    higher values of q determine larger investment flows and a corresponding
    excess demand for goods, and, according to the dynamic equation for y,
    output gradually increases. As shown in the diagram by the arrows pointing
to the right, ẏ > 0 at all points above the ẏ = 0 locus. Symmetrically, ẏ < 0 at all points below the stationary locus for output.

The stationary locus for q is derived by setting q̇ = 0 in (2.32), which yields

q = π/r = (a0 + a1 y) / (h0/h2 + (h1/h2) y − (1/h2)(m/p)),   (2.35)

where the last equality is obtained using (2.33) and (2.31). The steady-state value of q is given by the ratio of dividends to the interest rate, and both are affected by output. As y increases, profits and dividends increase, raising q; also, the interest rate (at which profits are discounted) increases, with a depressing effect on stock prices. The slope of the q̇ = 0 locus then depends on the relative strength of those two effects; in what follows we assume that the "interest rate effect" dominates, and consequently draw a downward-sloping stationary locus for q.³⁰

³⁰ Formally, dq/dy|_{q̇=0} < 0 ⇔ a1 < q(h1/h2). Moreover, as indicated in Fig. 2.9, the q̇ = 0 line has the following asymptote: lim_{y→∞} q|_{q̇=0} = a1 h2/h1.

The dynamics of q out of its stationary locus are governed by the dynamic equation (2.32). For each level of output (that uniquely determines dividends and the interest rate), only the value of q on the stationary locus is such that q̇ = 0. Higher values of q reduce the dividend component of the rate of return on shares, and a capital gain, implying q̇ > 0,
    is needed to fulfill the “no arbitrage” condition between shares and bonds: q
    will then move upwards starting from all points above the q̇ = 0 line, as shown
    in Figure 2.9. Symmetrically, at all points below the q̇ = 0 locus, capital losses
are needed to equate returns and, therefore, q̇ < 0.

The unique steady state of the system is found at the point where the two stationary loci cross, and output and stock prices are at y_ss and q_ss respectively. As in the dynamic model analyzed in previous sections, in the present framework too there is a unique trajectory converging to the steady state, the saddlepath of the dynamic system. To rationalize its negative slope in (q, y) space, let us consider at time t0 a level of output y(t0) < y_ss. The associated level q(t0) on the saddlepath is higher than the value of q on the stationary locus ẏ = 0. Therefore, there is excess demand for goods owing to a high level of investment, and output gradually increases towards its steady-state value. As y increases, the demand for money increases also and, with a given money supply m, the interest rate rises. The behavior of q is best understood if the dynamic equation (2.32) is solved forward, yielding the value of q(t0) as the present discounted value of future dividends:³¹

q(t0) = ∫_{t0}^∞ π(t) e^{−∫_{t0}^t r(s) ds} dt.   (2.36)

³¹ In solving the equation, the terminal condition lim_{t→∞} π(t) e^{−∫_{t0}^t r(s) ds} = 0 is imposed.

Over time q changes, for two reasons: on the one hand, q is positively affected by the increase in dividends (resulting from higher output); on the other, future dividends are discounted at higher interest rates, with a negative effect on q. Under our maintained assumption that the "interest rate effect" dominates, q declines over time towards its steady-state value q_ss.

Let us now use our dynamic IS–LM model to study the effects of a change in macroeconomic policy. Suppose that at time t = 0 a future fiscal restriction is announced, to be implemented at time t = T: public spending, which is initially constant at g(0), will be decreased to g(T) < g(0) at t = T and will then remain permanently at this lower level. The effects of this anticipated fiscal restriction on the steady-state levels of output and the interest rate are immediately clear from a conventional IS–LM (static) model: in the new steady state both y and r will be lower. Both changes affect the new steady-state level of q: lower output and dividends depress stock prices, whereas a lower interest rate raises q. Again, the latter effect is assumed to dominate, leading to an increase in the steady-state value of q. This is shown in Figure 2.10 by an upward shift of the stationary locus ẏ = 0, which occurs at t = T along an unchanged q̇ = 0 schedule, leading to a higher q and a lower y in steady state.

Figure 2.10. Dynamic effects of an anticipated fiscal restriction

In order to characterize the dynamics of the system, we note that, from time T onwards, no further change in the exogenous variables occurs: to converge to the steady state, the economy must then be on the saddlepath portrayed in the diagram. Accordingly, from T onwards, output decreases (since the lower public spending causes aggregate demand to fall below current production) and q increases (owing to the decreasing interest rate).

What happens between the time of the fiscal policy announcement and that of its delayed implementation? At t = 0, when the future policy becomes known, agents in the stock market anticipate lower future interest rates.
(They also foresee lower dividends, but this effect is relatively weak.) Consequently, they immediately shift their portfolios towards shares, bidding up share prices. Thus, at the announcement date, with output and the interest rate still at their initial steady-state levels, q increases. The ensuing dynamics from t = 0 up to the date T of implementation follow the equations of motion in (2.30) and (2.32) on the basis of the parameters valid in the initial steady state. A higher value of q stimulates investment, causing an excess demand for goods; starting from t = 0, then, output gradually increases, and so does the interest rate. The dynamic adjustment of output and q is such that, when the fiscal policy is implemented at T (and the stationary locus ẏ = 0 shifts upwards), the economy is exactly on the saddlepath leading to the new steady state: aggregate demand falls and output starts decreasing along with the interest rate, whereas q and investment continue to rise. Therefore, an apparently "perverse" effect of fiscal policy (an expansion of investment and output following the announcement of a future fiscal restriction) can be explained by the forward-looking nature of stock prices, anticipating future lower interest rates.

Exercise 13 Consider the dynamic IS–LM model proposed in this section, but suppose that (contrary to what we assumed in the text) the "interest rate effect" is dominated by the "dividend effect" in determining the slope of the stationary locus for q.

(a) Give a precise characterization of the q̇ = 0 schedule and of the dynamic properties of the system under the new assumption.

(b) Analyze the effects of an anticipated permanent fiscal restriction (announced at t = 0 and implemented at t = T), and contrast the results with those reported in the text.

2.7. Linear Adjustment Costs

We now return to a typical firm's partial equilibrium optimal investment problem, questioning the realism of some of the assumptions made above and assessing the robustness of the qualitative results obtained from the simple model introduced in Section 2.1. There, we assumed that a given increase of the capital stock would be more costly when enacted over a shorter time period, but this is not necessarily realistic. It is therefore interesting to study the implications of relaxing one of the conditions in (2.4) to

∂²G(·)/∂I² = 0,   (2.37)

so that in Figure 2.1 the G(I, ·) function would coincide with the 45° line. Its slope, ∂G(·)/∂I, is constant at unity, independently of the capital stock.

Since the cost of investment does not depend on its intensity or the speed of capital accumulation, the firm may choose to invest "infinitely quickly," and the capital stock is not given (predetermined) at each point in time. This appears to call into question all the formal apparatus discussed above. However, if we suppose that all paths of exogenous variables are continuous in time and simply proceed to insert ∂G/∂I = 1 (hence λ = Pk, λ̇ = Ṗk = π_k Pk) in conditions (2.6), we can obtain a simple characterization of the firm's optimal policy. As in the essentially static cost-of-capital approach outlined above, condition (2.12) is replaced by

∂F(·)/∂K = (r + δ − π_k) Pk(t).   (2.38)

Hence the firm does not need to look forward when choosing investment. Rather, it should simply invest at such a (finite, or infinite) rate as needed to equate the current marginal revenues of capital to its user cost.
The latter concept is readily understood noting that, in order to use an additional unit of capital temporarily, one may borrow its purchase cost, Pk, at rate r and re-sell the undepreciated (at rate δ) portion at the new price implied by π_k. If F_K(·) is a decreasing function of installed capital (because the firm produces under decreasing returns and/or faces a downward-sloping demand function), then equation (2.38) identifies the desired stock of capital as a function of exogenous variables. Investment flows can then be explained in terms of the dynamics of such exogenous variables between the beginning and the end of each period. In continuous time, the investment rate per unit time is well defined if exogenous variables do not change discontinuously.

Recall that we had to rule out all changes of exogenous variables (other than completely unexpected or perfectly foreseen one-time changes) when drawing phase diagrams. In the present setting, conversely, it is easy to study the implications of ongoing exogenous dynamics. This enhances the realism and applicability of the model, but the essentially static character of the perspective encounters its limits when applied to real-life data. In reality, not only the growth rates of the exogenous variables in (2.38), but also their past and future dynamics appear relevant to current investment flows.

An interesting compromise between strict convexity and linearity is offered by piecewise linear adjustment costs. In Figure 2.11, the G(I, ·) function has unit slope when gross investment is positive, implying that Pk is the cost of each unit of capital purchased and installed by the firm, regardless of how many units are purchased together.

Figure 2.11. Piecewise linear unit investment costs

The adjustment cost function remains linear for I < 0, but its slope is smaller. This implies that when selling previously installed units of capital the firm receives a price that is independent of I(t), but lower than the purchase price. This adjustment cost structure is realistic if investment represents purchases of equipment with a given off-the-shelf price, such as personal computers, and constant unit installation cost, such as the cost of software installation. If installation costs cannot be recovered when the firm sells its equipment, each firm's capital stock has a degree of specificity, while capital would need to be perfectly transferable into and out of each firm for (2.16) to apply at all times.

Linear adjustment costs do not make speedy investment or scrapping unattractive, as strictly convex adjustment costs would. The kink at the origin, however, still makes it unattractive to mix periods of positive and negative gross investment. If a positive investment were immediately followed by a negative one, the firm would pay installation costs without using the marginal units of capital for any length of time. In general, a firm whose adjustment costs have the form illustrated in Figure 2.11 should avoid investment when very temporary events call for capital stock adjustment. Installation costs put a premium on inaction: the firm should cease to invest, even as current conditions improve, if it expects (or, in the absence of uncertainty, knows) that bad news will arrive soon.

To study the problem formally in the simplest possible setting, it is convenient to suppose that the price commanded by scrapped units of capital is so low as to imply that investment decisions are effectively irreversible.
This is the case when the slope of G(I, ·) for I < 0 is so small as to fall short of what can be earned, on a present discounted basis, from the use of capital in production. Since adjustment costs do not induce the firm to invest slowly, the investment rate may optimally jump between positive and negative values. In fact, nothing prevents optimal investment from becoming infinitely positive or negative, or the optimal capital stock path from jumping. If exogenous variables follow continuous paths, however, there is no reason for any such jump to occur along an optimal path. Hence the Hamiltonian solution method remains applicable. Among the conditions in (2.6), only the first needs to be modified: if capital has price Pk when purchased and is never sold, the first-order condition for investment reads

Pk = λ(t),  if I > 0,
Pk ≥ λ(t),  if I = 0.   (2.39)
The optimality condition in (2.39) requires λ(t), the marginal value of capital at time t, to be equal to the unit cost of investment only if the firm is indeed investing. Hence in periods when I(t) > 0 we have λ(t) = Pk, λ̇(t) = π_k Pk(t),

    and the third condition in (2.6) implies that (2.38) is valid at all t such that
    I (t ) > 0. If the firm is investing, capital installed must line up with ∂ F (·)/∂ K
    and with the user cost of capital at each instant.
    It is not necessarily optimal, however, always to perform positive invest-
    ment. It is optimal for the firm not to invest whenever the marginal value of
    capital is (weakly) lower than what it would cost to increase its stock by a unit.
In fact, when the firm expects unfavorable developments in the near future of the variables determining the "desired" capital stock that satisfies condition (2.38), then, if it continued to invest, it would find itself with an excessive capital stock.
    To characterize periods when the firm optimally chooses zero investment,
    recall that the third condition in (2.6) and the limit condition (2.7) imply, as
    in (2.19), that
q(t) ≡ λ(t)/Pk(t) = (1/Pk(t)) ∫_t^∞ F_K(τ) e^{−(r+δ)(τ−t)} dτ.   (2.40)
    In the upper panel of Figure 2.12, the curve represents a possible dynamic
    path of desired capital, determined by cyclical fluctuations of F (·) for given
    K . Since that curve falls faster than capital depreciation for a period, the
    firm ceases to invest at time t0 and starts again at time t1. We know from the
    Figure 2.12. Installed capital and optimal irreversible investment

    optimality condition (2.39) that the present value (2.40) of marginal revenue
    products of capital must be equal to the purchase price Pk (t ) at all t when
    gross investment is positive, such as t0 and t1. Thus, if we write
Pk(t0) = ∫_{t0}^∞ F_K(τ) e^{−(r+δ)(τ−t0)} dτ
       = ∫_{t0}^{t1} F_K(τ) e^{−(r+δ)(τ−t0)} dτ + ∫_{t1}^∞ F_K(τ) e^{−(r+δ)(τ−t0)} dτ,   (2.41)

noting that

∫_{t1}^∞ F_K(τ) e^{−(r+δ)(τ−t0)} dτ = e^{−(r+δ)(t1−t0)} ∫_{t1}^∞ F_K(τ) e^{−(r+δ)(τ−t1)} dτ,

and recognizing λ(t1) = Pk(t1) in the last integral, we obtain

Pk(t0) = ∫_{t0}^{t1} F_K(τ) e^{−(r+δ)(τ−t0)} dτ + e^{−(r+δ)(t1−t0)} Pk(t1)

from (2.41). If the inflation rate in terms of capital is constant at π_k, then Pk(t1) = Pk(t0) e^{π_k(t1−t0)} and

Pk(t0) = ∫_{t0}^{t1} F_K(τ) e^{−(r+δ)(τ−t0)} dτ + Pk(t0) e^{−(r+δ−π_k)(t1−t0)}
⇒ Pk(t0) (1 − e^{−(r+δ−π_k)(t1−t0)}) = ∫_{t0}^{t1} F_K(τ) e^{−(r+δ)(τ−t0)} dτ.

Noting that

∫_{t0}^{t1} (r + δ − π_k) e^{−(r+δ−π_k)(τ−t0)} dτ = 1 − e^{−(r+δ−π_k)(t1−t0)},

we obtain

∫_{t0}^{t1} F_K(τ) e^{−(r+δ)(τ−t0)} dτ − Pk(t0) ∫_{t0}^{t1} (r + δ − π_k) e^{−(r+δ−π_k)(τ−t0)} dτ = 0.

Again, using Pk(t0) e^{π_k(τ−t0)} = Pk(τ) yields

∫_{t0}^{t1} F_K(τ) e^{−(r+δ)(τ−t0)} dτ − ∫_{t0}^{t1} (r + δ − π_k) Pk(τ) e^{−(r+δ)(τ−t0)} dτ = 0,

and (2.41) may be rewritten as

∫_{t0}^{t1} [F_K(τ) − (r + δ − π_k) Pk(τ)] e^{−(r+δ)(τ−t0)} dτ = 0.   (2.42)
    Thus, the marginal revenue product of capital should be equal to its user
    cost in present discounted terms (at rate r + ‰) not only when the firm invests
    continuously, but also over periods throughout which it is optimal not to

    invest. In Figure 2.12, area A should have the same size as the discounted
    value of B. Adjustment costs, as usual, affect the dynamic aspects of the firm’s
    behavior. As the cyclical peak nears, the firm stops investing because it knows
    that in the near future it would otherwise be impossible to preserve equality
    between marginal revenues and costs of capital.
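A rough numerical illustration of condition (2.42): assuming a made-up cyclical path for the marginal revenue product of capital and a constant capital-goods price (π_k set to zero), the snippet below searches for the restart date t1 at which the discounted gap between F_K and the user cost, accumulated since t0 = 0, returns to zero. The functional forms and parameter values are illustrative assumptions, not part of the model above.

    # Heuristic check of condition (2.42) under assumed functional forms.
    # The inaction spell starts at t0 = 0, where F_K equals the user cost;
    # F_K then dips below it and later recovers.
    import numpy as np
    from scipy.integrate import quad
    from scipy.optimize import brentq

    r, delta, pi_k, Pk = 0.05, 0.10, 0.0, 1.0     # illustrative parameters
    user_cost = lambda t: (r + delta - pi_k) * Pk * np.exp(pi_k * t)
    F_K = lambda t: user_cost(0.0) + 0.02 * t - 0.06 * np.sin(np.pi * t / 2)

    def discounted_gap(t1):
        # left-hand side of (2.42), integrated numerically over [0, t1]
        integrand = lambda t: (F_K(t) - user_cost(t)) * np.exp(-(r + delta) * t)
        return quad(integrand, 0.0, t1)[0]

    # with these parameters the gap is negative early on and positive by t = 4
    t1 = brentq(discounted_gap, 0.5, 4.0)
    print(f"investment resumes at t1 = {t1:.2f}")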
    Similar reasoning is applicable, with some slightly more complicated nota-
    tion, to the case where the firm may sell installed capital at a positive price
pk(t) < Pk(t) and find it optimal to do so at times. In this case, we should draw in Figure 2.12 another dynamic path, below that representing the desired capital stock when investment is positive, to represent the capital stock that satisfies condition (2.38) when the user cost of capital is computed on the basis of its resale price. The firm should follow this path whenever its desired investment is negative and optimal inaction would lead it from the former to the latter line.

Even though the speed of investment is not constrained, the existence of transaction costs implies that the firm's behavior should be forward-looking. Investment should cease before a slump reveals that it would be desirable to reduce the capital stock. This is yet another instance of the general importance of expectations in dynamic optimization problems. Symmetrically, the capital stock at any given time is not independent of past events. In the latter portion of the inaction period illustrated in the figure, the capital stock is larger than what would be optimal if it could be chosen in light of current conditions. This illustrates another general feature of dynamic optimization problems, namely the character of interaction between endogenous capital and exogenous forcing variables: the former depends on the whole dynamic path of the latter, rather than on their level at any given point in time.

2.8. Irreversible Investment Under Uncertainty

Throughout the previous sections, the firm was supposed to know with certainty the future dynamics of exogenous variables relevant to its optimization problem. (And, in order to make use of phase diagrams, we assumed that those variables were constant through time, or only changed discretely in perfectly foreseeable fashion.) This section briefly outlines formal modeling techniques allowing uncertainty to be introduced in explicit, if stylized, ways into the investment problem of a firm facing linear adjustment costs.

We try, as far as possible, to follow the same logical thread as in the derivations encountered above. We continue to suppose that the firm operates in continuous time. The assumption that time is indefinitely divisible is of course far from completely realistic; also less than fully realistic are the assumptions that the capital stock is made up of infinitesimally small particles, and that it may be an argument of a differentiable production function. As was the case under certainty, however, such assumptions make it possible to obtain precise and elegant quantitative results by means of analytical calculus techniques.

2.8.1. STOCHASTIC CALCULUS

First of all, we need to introduce uncertainty into the formal continuous-time optimization framework introduced above. So far, all exogenous features of the firm's problem were determined by the time index, t: knowing the position in time of the dynamic system was enough to know the product price, the cost of factors, and any other variable whose dynamics are taken as given by the firm. To prevent such dynamics from being perfectly foreseeable, one must let them depend not only on time, but also on something else: an index, denoted ω, of the unknown state of nature. A function {z(t; ω)} of a time index t and of the state of nature ω is a stochastic process, that is, a collection of random variables. The state of nature, by definition, is not observable. If the true ω were known, in fact, the path of the process would again depend on t only, and there would be no uncertainty.
But if ω belongs to a set on which a probability distribution is defined, one may formally assign likelihood levels to different possible ω and different possible time paths of the process. This makes it possible to formulate precise answers to questions, clearly of interest to the firm, concerning the probability that processes such as revenues or costs reach a given level within a given time interval.

In order to illustrate practical uses of such concepts, it will not be necessary to deal further with the theory of stochastic processes. We shall instead introduce a type of stochastic process of special relevance in applications: Brownian motion. A standard Brownian motion, or Wiener process, is a basic building block for a class of stochastic processes that admits a stochastic counterpart to the functional relationships studied above, such as integrals and differentials. This process, denoted {W(t)} in what follows, can be defined by its probabilistic properties. {W(t)} is a Wiener process if

1. W(0; ω) = 0 for "almost all" ω, in the sense that the probability is one that the process takes value zero at t = 0;

2. fixing ω, {W(t; ω)} is continuous in t with probability one;

3. fixing t ≥ 0, probability statements about W(t; ω) can be made viewing W(t) as a normally distributed random variable, with mean zero and variance t as of time zero: realizations of W(t) are quite concentrated for small values of t, while more and more probability is attached to values far from zero for larger and larger values of t;

4. W(t′) − W(t), for every t′ > t, is also a normally distributed random
    variable with mean zero and variance (t ′ − t ); and W(T ′) − W(T ) is
    uncorrelated with—and independent of—W(t ′) − W(t ) for all T ′ >
    T > t ′ > t .
    Assumption 1 is a simple normalization, and assumption 2 rules out jumps
    of W(t ) to imply that large changes of W(t ) become impossible as smaller
    and smaller time intervals are considered. Indeed, property 3 states that the
    variance of changes is proportional to time lapsed, hence very small over
    short periods of time. The process, however, has normally distributed incre-
    ments over any finite interval of time. Since the normal distribution assigns
    positive probability to any finite interval of the real line, arbitrarily large
    variations have positive probability on arbitrarily short (but finite) intervals
    of time.
    Normality of the process’s increments is useful in applications, because
    linear transformations of W(t ) can also be normal random variables with
    arbitrary mean and variance. And the independence over time of such incre-
    ments stated as property 4 (which implies their normality, by an application of
    the Central Limit Theorem) makes it possible to make probabilistic statements
    on all future values of W(t ) on the basis of its current level only. It is particu-
    larly important to note that, if {W(t ); 0 ≤ t ≤ t1} is known with certainty, or
    equivalently if observation of the process’s trajectory has made it possible to
    rule out all states of the world ˘ that would not be consistent with the observed
    realization of the process up to time t1, then the probability distribution of the
    process’s behavior in subsequent periods is completely characterized. Since
    increments are independent over non-overlapping periods, W(t ) − W(t1) is
    a normal random variable with mean zero and variance t − t1. Hence the
    process enjoys the Markov property in levels, in that its realization at any time
    Ù contains all information relevant to formulating probabilistic statements as
    to its realizations at all t > Ù.
    Independence of the process’s increments has an important and somewhat
    awkward implication: for a fixed ˘, the path {W(t )} is continuous but (with
    probability one) not differentiable at any point t . Intuitively, a process with
    differentiable sample paths would have locally predictable increments, because
    extrapolation of its behavior over the last d t would eliminate all uncertainty
    about the behavior of the process in the immediate future. This, of course,
    would deny independence of the process’s increments (property 4 above). For
    increments to be independent over any t interval, including arbitrarily short
    ones, the direction of movement must be random at arbitrarily close t points.
    A typical sample path then turns so frequently that it fails to be differentiable
    at any t point, and has infinite variation: the absolute value of its increments
    over infinitesimally small subdivisions of an arbitrarily short time interval is
    infinite.
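The defining properties listed above are easy to check by simulation. The short sketch below builds Wiener paths from independent normal increments and verifies that the cross-path variance of W(t) is approximately t; the step size and number of paths are arbitrary choices made for the example.

    # Simulation check of Wiener process properties: E[W(t)] = 0, Var[W(t)] = t.
    import numpy as np

    rng = np.random.default_rng(42)
    n_paths, n_steps, T = 10_000, 500, 2.0
    dt = T / n_steps

    increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    W = np.cumsum(increments, axis=1)          # W(t) at t = dt, 2*dt, ..., T

    for k in (n_steps // 4, n_steps // 2, n_steps):
        t = k * dt
        print(f"t = {t:.2f}: sample var of W(t) = {W[:, k - 1].var():.3f} (theory: {t:.2f})")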

    Non-existence of the derivative makes it impossible to apply familiar cal-
    culus tools to functions when one of their arguments is a Brownian process
{W}. Such functions, which, like their argument, depend on t and ω and are themselves stochastic processes, may however be manipulated by stochastic calculus tools, developed half a century ago by the Japanese mathematician K. Itô along the lines of classical calculus. Given a process {A(t)} with finite variation, a process {y(t)} which satisfies certain regularity conditions, and a Wiener process {W(t)}, the integral

z(T; ω) = z(t; ω) + ∫_t^T y(τ; ω) dW(τ; ω) + ∫_t^T dA(τ; ω)   (2.43)

defines an Itô process {z(t)}. The expression ∫ y dW denotes a stochastic or Itô integral. Its exact definition need not concern us here: we may simply note that it is akin to a weighted sum of the Wiener process's increments dW(t), where the weight function {y(t)} is itself a stochastic process in general.
    The properties of Itô integrals are similar to those of more familiar integrals
    (or summations). Stochastic integrals of linear combinations can be written
    as linear combinations of stochastic integrals, and the integration by parts
    formula
z(t)x(t) = z(0)x(0) + ∫_0^t z(τ) dx(τ) + ∫_0^t x(τ) dz(τ)   (2.44)
    holds when z and x are processes in the class defined by (2.43) and one of
    them has finite variation. The stochastic integral has one additional important
    property. By the unpredictable character of the Wiener process’s increments,
E_t(∫_t^T y(τ) dW(τ)) = 0,
    for any {y(t )} such that the expression is well defined, where E t [·] denotes the
    conditional expectation at time t (that is, an integral weighting possible realiza-
    tions with the probability distribution reflecting all available information on
    the state of nature as of that time).
    Recall that, if function x (t ) has first derivative x ′(t ) = d x (t )/d t = ẋ , and
    function f ( · ) has first derivative f ′(x ) = d f (x )/d x, then the following rela-
    tionships are true:
    d x = ẋ d t, d f (x ) = f ′(x ) d x, d f (x ) = f ′(x ) ẋ d t. (2.45)
    The integral (2.43) has differential form
    d z(t ) = y(t ) d W(t ) + d A(t ), (2.46)
    and it is natural to formulate a stochastic version of the “chain rule”
    relationships in (2.45), used in integration “by substitution.” The rule is as fol-
    lows: if a function f ( · ) is endowed with first and second derivatives, and {z(t )}

is an Itô process with differential as in (2.46), then

df(z(t)) = f′(z(t)) y(t) dW(t) + f′(z(t)) dA(t) + (1/2) f″(z(t)) (y(t))² dt.   (2.47)
    Comparing (2.46–2.47) with (2.45), note that, when applied to an Itô
    process, variable substitution must take into account not only the first, but also
    the second, derivative of the transformation. Heuristically, the order of mag-
    nitude of d W(t ) increments is higher than that of d t if uncertainty is present
    in every d t interval, no matter how small. Independent increments also imply
    that the sign of d W(t ) is just as likely to be positive as to be negative, and
    by Jensen’s inequality the curvature of f (z) influences locally non-random
    behavior even in the infinitesimal limit. Taking conditional expectations in
    (2.47), where E t [d W(t )] = 0 by unpredictability of the Wiener process, we
have

E_t[df(z(t))] = f′(z(t)) dA(t) + (1/2) f″(z(t)) (y(t))² dt.

Hence E_t[df(z(t))] exceeds, equals, or falls short of f′(z(t)) E_t[dz(t)] according to whether f″(z(t)) is positive, zero, or negative.
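A quick Monte Carlo sanity check of the second-derivative term in (2.47), using the convex function f(z) = z² and an Itô process with constant drift and diffusion coefficients (all values are made up): the simulated mean of df should line up with the Itô drift rather than with the naive chain-rule prediction.

    # Monte Carlo check of the Ito drift for f(z) = z^2, dz = mu*dt + sigma*dW.
    import numpy as np

    rng = np.random.default_rng(1)
    z0, mu, sigma, dt, n = 1.0, 0.3, 0.8, 1e-3, 1_000_000

    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    dz = mu * dt + sigma * dW
    df = (z0 + dz) ** 2 - z0 ** 2

    print(f"simulated   E[df]                 = {df.mean():.6f}")
    print(f"Ito drift   (2*z*mu + sigma^2)*dt = {(2 * z0 * mu + sigma ** 2) * dt:.6f}")
    print(f"naive drift (2*z*mu)*dt           = {2 * z0 * mu * dt:.6f}")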
    2.8.2. OPTIMIZATION UNDER UNCERTAINTY AND IRREVERSIBILITY
    We are now ready to employ these formal tools in the study of a firm that,
    in partial equilibrium, maximizes the present discounted value at rate r of
    its cash flows. In the presence of uncertainty, exogenous variables relevant to
    profits are represented by the realization of a stochastic process, Z (t ), rather
    than by the time index t . As seen above, the optimal profit flow may be
    a convex function of exogenous variables (but it may also, under different
    assumptions, be concave). In such cases Jensen’s inequality introduces a link
    between the expected value and variability of capital’s marginal revenue prod-
    uct. For simplicity, we will disregard such effects, supposing that the profit
flow is linear in Z. As in the previous section, let K(t) be the capital stock
    installed at time t . For simplicity, let this be the only factor of production, so
    that the firm’s cash flow gross of investment-related expenditures is F (K ) Z .
    We suppose further that units of capital may be purchased at a constant price
    Pk and have no scrap value. As long as capital is useful—that is, as long as
    F ′(K ) > 0—this implies that all investment is irreversible.
    The exogenous variable Z , which multiplies a function of installed capital,
    could be interpreted as the product’s price. Let its dynamics be described by a
    stochastic process with differential
dZ(t) = θZ(t) dt + σZ(t) dW(t).

This is a simple special case of the general expression in (2.46), with dA(t) = θZ(t) dt and y(t) = σZ(t) for θ and σ constant parameters. This process is a geometric Brownian motion, and it is well suited to economic applications because Z(t) is positive (as a price should be) for all t > 0 if Z(0) > 0; as it gets closer to zero, in fact, the process's increments become smaller and smaller in absolute value, and it can never reach zero. If σ is equal to zero, the proportional growth rate of Z, dZ/Z = θ, is constant and known with certainty, implying that Z(T) is known for all T > 0 if Z(0) is. But if σ is larger than zero, that deterministic proportional growth rate is added, during each time interval (t′ − t), to the realization of a normally distributed random variable with mean zero and variance (t′ − t)σ². This implies that the
    logarithm of Z is normally distributed (that is, Z (t ) is a lognormal random
    variable), and that the dispersion of future possible levels of Z is increasingly
    wide over longer forecasting horizons.
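The properties just described are easy to verify numerically. The following sketch simulates the geometric Brownian motion using its exact lognormal discretization; the parameter values (θ, σ, the horizon, and so on) are arbitrary.

    # Simulation sketch of dZ = theta*Z dt + sigma*Z dW (illustrative parameters).
    import numpy as np

    rng = np.random.default_rng(7)
    theta, sigma, Z0, T, n_steps, n_paths = 0.02, 0.2, 1.0, 5.0, 250, 20_000
    dt = T / n_steps

    # exact discretization keeps Z strictly positive and log Z normal
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    log_Z = np.log(Z0) + np.cumsum((theta - 0.5 * sigma**2) * dt + sigma * dW, axis=1)
    Z_T = np.exp(log_Z[:, -1])

    print(f"min Z(T) over paths = {Z_T.min():.4f}  (always > 0)")
    print(f"mean of log Z(T)    = {np.log(Z_T).mean():+.4f} (theory: {(theta - 0.5*sigma**2)*T:+.4f})")
    print(f"var  of log Z(T)    = {np.log(Z_T).var():.4f}  (theory: {sigma**2 * T:.4f})")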
    As we shall see, the firm’s optimal investment policy implies that one may
    not generally write an expression for K̇ = d K (t )/d t . If capital depreciates at
rate δ, the accumulation constraint is better written in differential form,

dK(t) = dX(t) − δK(t) dt,

for a process X(t) that would correspond to the integral ∫_0^t I(τ) dτ of gross investment I(t) per unit time if such concepts were well defined.
    Apart from such formal peculiarities, the firm’s problem is substantially
    similar to those studied above. We can define, also in the presence of uncer-
    tainty, the shadow value of capital at time t , which still satisfies the relationship
λ(t) = ∫_t^∞ E_t[F′(K(τ)) Z(τ)] e^{−(r+δ)(τ−t)} dτ.   (2.48)
    As in the previous sections, and quite intuitively, the optimal investment
    policy must be such as to equate Î(t )—the marginal contribution of capital to
    the firm’s value—to the marginal cost of investment. If the second derivative
    of F (·) is not zero, however, the marginal revenue products on the right-hand
    side of (2.48) depend on the (optimal) investment policy, which therefore
    must be determined simultaneously with the shadow value of capital.
    If investment is irreversible and has constant unit cost Pk , then the firm, as
we saw in (2.39), must behave so as to obtain λ(t) = Pk when gross investment is positive, that is, when dX(t) > 0 in the notation introduced here; and to ensure that λ(t) ≤ Pk at all times. The shadow value of capital is smaller than its cost when a binding irreversibility constraint prevents the firm from keeping them equal to one another, as would be possible (and optimal) if, as in the first few sections of this chapter, investment costs were uniformly convex and smoothly differentiable at the origin.
Now, if the firm only acts when λ(t)/Pk ≡ q equals unity, and since the future path of the {Z(τ)} process depends only on its current level Z(t), the

    expected value in (2.48) and the level of q must be functions of K (t ) and Z (t ).
Thus, we may write λ(t)/Pk ≡ q(K(t), Z(t)), noting that (2.48) implies

(r + δ) (λ(t)/Pk) dt = (F′(K(t)) Z(t)/Pk) dt + E_t[dλ(t)]/Pk,   (2.49)
    and we use a multivariate version of the differentiation rule (2.47) to expand
    the expectation in (2.49) to
(r + δ) q(K, Z) = F′(K)Z/Pk + [∂q(K, Z)/∂K](−δK) + [∂q(K, Z)/∂Z] θZ + [∂²q(K, Z)/∂Z²] (σ²/2) Z²,   (2.50)

an equation satisfied by q at all times when the firm is not investing (and therefore when capital is depreciating at rate δ).
    This is a relatively simple differential equation, which may be further sim-
plified supposing that δ = 0. A particular solution of

r q(K, Z) = F′(K)Z/Pk + [∂q(K, Z)/∂Z] θZ + [∂²q(K, Z)/∂Z²] (σ²/2) Z²   (2.51)

is linear in Z and reads

q0(K, Z) = F′(K)Z / ((r − θ) Pk).
The "homogeneous" part of the equation,

r qi(K, Z) = [∂qi(K, Z)/∂Z] θZ + [∂²qi(K, Z)/∂Z²] (σ²/2) Z²,

is solved by functions in the form

qi(K, Z) = Ai Z^β   (2.52)

if β is a solution of the quadratic equation

r = βθ + (β − 1)β (σ²/2),   (2.53)

for any constant Ai, as is easily checked inserting its derivatives

∂qi(K, Z)/∂Z = Ai β Z^{β−1},   ∂²qi(K, Z)/∂Z² = (β − 1)β Ai Z^{β−2}

in the differential equation and simplifying the resulting expression.
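The same verification can be delegated to a computer algebra system. The sketch below, written with the sympy library, checks symbolically that the linear particular solution satisfies (2.51) and that A·Z^β solves the homogeneous part whenever β satisfies (2.53); the symbol names are choices made for the example.

    # Symbolic check of the particular and homogeneous solutions of (2.51).
    import sympy as sp

    Z, r, theta, sigma, Pk, FK, A, beta = sp.symbols(
        "Z r theta sigma Pk FK A beta", positive=True)

    def residual(q, include_FK_term=True):
        # r*q minus the right-hand side of (2.51) (optionally without F'(K)Z/Pk)
        rhs = sp.diff(q, Z) * theta * Z + sp.diff(q, Z, 2) * sigma**2 / 2 * Z**2
        if include_FK_term:
            rhs += FK * Z / Pk
        return sp.simplify(r * q - rhs)

    q0 = FK * Z / ((r - theta) * Pk)
    print(residual(q0))                        # -> 0: particular solution works

    qi = A * Z**beta
    res = residual(qi, include_FK_term=False)
    # res vanishes whenever r = beta*theta + beta*(beta-1)*sigma^2/2, i.e. (2.53)
    print(sp.simplify(res.subs(r, beta * theta + beta * (beta - 1) * sigma**2 / 2)))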

The quadratic equation has two distinct roots if σ² > 0:

β1 = (1/σ²) [−(θ − σ²/2) + √((θ − σ²/2)² + 2σ²r)] > 0,

β2 = (1/σ²) [−(θ − σ²/2) − √((θ − σ²/2)² + 2σ²r)] < 0.
Thus, there exist two groups of solutions in the form (2.52), q1(K, Z) = A1 Z^{β1} and q2(K, Z) = A2 Z^{β2}. Hence, all solutions to (2.51) may be written

q(K, Z) = F′(K)Z / ((r − θ) Pk) + A1 Z^{β1} + A2 Z^{β2}.

(Recall that we have set δ = 0.) To determine the constants A1 and A2, we recall that this expression represents the ratio of capital's marginal value to its purchase price. From this economic point of view, it is easy to argue that A2, the constant associated with the negative root of (2.53), must be zero. Otherwise, as Z tends to zero the shadow value of capital would diverge towards infinity (or negative infinity), which would be quite difficult to interpret since capital's contribution to profits tends to vanish in that situation.

We also know that the firm's investment policy prevents q(K, Z) from exceeding unity. The other constant, A1, and the firm's investment policy should therefore satisfy the equation

F′(K*(Z))Z / ((r − θ) Pk) + A1 Z^{β1} = 1,   (2.54)

where K*(Z) denotes the capital stock chosen by the firm when exogenous conditions are indexed by Z and the irreversibility constraint is not binding, so that it is possible to equate capital's shadow value and cost (λ = Pk, and q = 1).

The single equation (2.54) does not suffice to determine both K*(Z) and A1. But the structure of the problem implies that another condition should also be satisfied by these two variables: this is the smooth pasting or high-contact condition that

∂q(·)/∂Z = 0

whenever q = 1, K = K*(Z) and, therefore, gross investment dX may be positive.

To see why, consider the character of the firm's optimal investment policy. When following the proposed optimal policy, the firm invests if and only if an infinitesimal stochastic increment of the Z process would otherwise lead q to exceed unity. Since the stochastic process that describes Z's dynamics has "infinite variation," each instant when investment is positive is followed immediately by an instant (at least) when Z declines and there is no investment. (It is for this reason that the time path of the capital stock, while of finite variation, is not differentiable and the notation could not feature the usual rate of investment per unit time, I(t) = dK(t)/dt.) When K = K*(Z), a relationship in the form (2.50) should be satisfied:

(r + δ) q(K*(Z), Z) dt = (F′(K*(Z))Z/Pk) dt + [∂q(K*(Z), Z)/∂K] dK + [∂q(K*(Z), Z)/∂Z] θZ dt + [∂²q(K*(Z), Z)/∂Z²] (σ²/2) Z² dt,   (2.55)

where the variation of both arguments of the q(·) function is taken into account, and dK(t) may be positive when dX > 0.
    Along the K = K ∗( Z ) locus the relationship q (K ∗( Z ), Z ) = 1 is also sat-
    isfied. As long as the function is differentiable (as it is in this model), total
    differentiation yields
[∂q(K*(Z), Z)/∂K] dK = −[∂q(K*(Z), Z)/∂Z] dZ.

Inserting this in (2.55) yields

(r + δ) q(K*(Z), Z) dt = (F′(K*(Z))Z/Pk) dt + [∂q(K*(Z), Z)/∂Z] (θZ dt − dZ) + [∂²q(K*(Z), Z)/∂Z²] (σ²/2) Z² dt.   (2.56)
    Since the path of all variables is continuous, the q (·) function must also satisfy
    the differential equation (2.51) that holds during zero-investment periods.
    Thus, it must be the case that
r q(K, Z) = F′(K)Z/Pk + [∂q(K, Z)/∂Z] θZ + [∂²q(K, Z)/∂Z²] (σ²/2) Z²   (2.57)

for K and Z values arbitrarily close to those that induce the firm to invest. By continuity, in the limit where investment becomes positive (for an instant) and (θZ dt − dZ) ≠ θZ dt, equations (2.56) and (2.57) can hold simultaneously only if

∂q(K*(Z), Z)/∂Z = 0,
    the smooth-pasting condition.
In the case we are studying,

∂q(K, Z)/∂Z |_{K = K*(Z)} = F′(K*(Z)) / ((r − θ) Pk) + A1 β1 Z^{β1−1}.

Setting this expression to zero we have

A1 = −F′(K*(Z)) Z^{1−β1} / (β1 (r − θ) Pk),

and inserting this in (2.54) we obtain a characterization of the firm's optimal investment policy:

F′(K*(Z)) Z = (β1 / (β1 − 1)) (r − θ) Pk,

or, recalling that β1 is the positive solution of equation (2.53) and rearranging it to read β/(β − 1) = (r + (1/2)βσ²)/(r − θ),

F′(K*(Z)) Z = (r + (1/2) β1 σ²) Pk.
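A small numerical sketch of this trigger condition, with illustrative parameter values only: compute β1 from (2.53) and compare the required marginal revenue product (r + ½β1σ²)Pk with the frictionless user cost rPk discussed next. The wedge widens as σ increases, in line with the discussion below.

    # Irreversibility premium: beta1 from (2.53) and the investment trigger.
    import numpy as np

    r, theta, Pk = 0.05, 0.01, 1.0               # made-up parameters (r > theta)

    def beta1(sigma):
        a = theta - 0.5 * sigma**2
        return (-a + np.sqrt(a**2 + 2 * sigma**2 * r)) / sigma**2

    for sigma in (0.05, 0.10, 0.20, 0.40):
        b1 = beta1(sigma)
        trigger = (r + 0.5 * b1 * sigma**2) * Pk
        print(f"sigma = {sigma:.2f}: beta1 = {b1:6.2f}, "
              f"required F'(K*)Z = {trigger:.4f}  vs  r*Pk = {r * Pk:.4f}")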
    It is instructive to compare this equation with that which would hold
    if the firm could sell as well as purchase capital at the constant price Pk .
In that case, since δ = 0, it should be true that F′(K)Z = rPk at all times. Since β1 > 0, at times of positive investment the marginal revenue product of
capital is higher (and, with F″(·) < 0, the capital stock is lower) than in the case of reversible investment. Intuitively, the irreversibility constraint makes it suboptimal to invest so as to equate the current marginal revenue product of capital to its user cost rPk: the firm knows that it will be impossible to reduce the capital stock in response to future negative developments, and aims at avoiding large excess capacity in such instances by restraining investment in good times. The β1 root is a function of σ, and it is possible to show that β1σ² is increasing in σ; that is, quite intuitively, a larger wedge between the current marginal profitability of capital and its user cost is needed to trigger investment when the wedge may very quickly be erased by highly volatile fluctuations.

Substantially similar, but more complex, derivations can be performed for cases where capital depreciates and/or the firm employs perfectly flexible factors (such as N in the previous sections' models). To obtain closed-form solutions in such cases, it is necessary to assume that the firm's demand and production functions have constant-elasticity forms. (Further details are in the references at the end of the chapter.)

Irreversible investment models are more complex and realistic than the models introduced above. They do not, however, deny their fundamental assumption that optimal investment policies rule out arbitrage opportunities, and similarly support simple present-value financial considerations. Intuitively, if investment is irreversible, future decisions to install capital may only increase its stock and, under decreasing returns, reduce its marginal revenue product. Just as in the certainty model of Section 2.7, it is precisely the expectation of future excess capacity (and low marginal revenue products) that makes the firm reluctant to invest. In present expected value terms, in fact, capital's marginal revenue product fluctuates around the same user-cost level that would determine it in the absence of adjustment costs.

A model with long periods of inaction, of course, cannot represent well the dynamics of aggregate investment, which is empirically much smoother than would be implied by the dynamics illustrated in Figure 2.12, or by similar pictures one might draw tracing the dynamics of a stochastic desired capital stock and of the irreversibly installed stock associated with it. This suggests that aggregate dynamics should not be interpreted as the optimal choices of a single "representative" firm, as is assumed by most micro-founded macroeconomic models. If one allows part of uncertainty to be "idiosyncratic," that is, relevant only to individual firms but completely offset in the aggregate, then aggregation of intermittent and heterogeneous firm-level investment policies yields smoother macroeconomic dynamics. Inaction by individual firms implies some degree of inertia in the aggregate series' response to aggregate shocks. Such inertia could be interpreted in terms of convex adjustment costs for a hypothetical representative firm, but reflects heterogeneity of microeconomic dynamics if one maintains that adjustment costs do not necessarily imply higher unit costs for faster investment. This interpretation allows aggregate variables to react quickly to unusually large events and, in particular, to drastic changes of future expectations.
Co-existence of firms with very different dynamic experiences is perhaps most obvious from a labor-market perspective, since employment typically increases in some sectors and firms at the same time as it declines in others. Employment changes can in fact be interpreted as "investment" if, as is often the case, hiring and firing workers entails costs for employers. The next chapter discusses dynamic labor demand issues from this perspective.

APPENDIX A2: HAMILTONIAN OPTIMIZATION METHODS

The chapter's main text makes use of Hamiltonian methods for the solution of dynamic optimization problems in continuous time, emphasizing economic interpretations of optimality conditions which, as usual, impose equality at the margin between (actual or opportunity) costs and revenues. The more technical treatment in this appendix illustrates the formal meaning of the same conditions. More detailed expositions of the relevant continuous-time optimization techniques may be found in Dixit (1990) or Barro and Sala-i-Martin (1995).

In continuous time, any optimization problem must be posed in terms of relationships among functions, more complex mathematical objects than ordinary real numbers. As we shall see, however, the appropriate methods may still be interpreted in light of the simple notions that are familiar from the solution of static constrained optimization problems. Dynamic optimization problems under uncertainty may be formulated in substantially similar ways, taking into account that optimality conditions introduce links among yet more complex mathematical objects: stochastic processes which, like those introduced in the final section of the chapter, are functions not only of time, but also of the state of nature.

It is important, first of all, to recall what precisely is meant by posing a problem in continuous time. Investment is a flow variable, measured during a time period; capital is a stock variable, measured at a point in time, for example at the beginning of each period. In continuous time, stock and flow variables are measured at extremely small time intervals. If in discrete time the obvious accounting relationship (2.3) holds, that is, if

K(t + Δt) = K(t) + I(t)Δt − δK(t)Δt,

where I(t) denotes the average rate of investment between t and t + Δt, then when going to continuous time we need to consider the limit for Δt → 0 in that relationship. Recalling the definition of a derivative, we obtain

lim_{Δt→0} [K(t + Δt) − K(t)]/Δt ≡ dK(t)/dt ≡ K̇(t) = I(t) − δK(t).

This expression is a particular case of more general dynamic constraints encountered in economic applications. The level and rate of change of y(t), a stock variable, are linked to one or more flow variables z and to exogenous variables by an accumulation constraint in the form

ẏ(t) = g(t, z(t), y(t)).   (2.A1)

The presence of t as an argument of this function represents exogenous variables which, in the absence of uncertainty, may all be simply indexed by calendar time. The flow variable, z(t), is directly controlled by an economic agent, hence it is endogenous to the problem under study. It is important to realize, however, that the stock variable y(t) is also under dynamic control by the agent. In (2.A1), the rate of change of the stock ẏ depends, at every point in time, on the levels of y(t) and z(t). Since (2.A1) states that the stock y(t) has a first derivative with respect to time, its level is given by past history at every t.
Hence, y(t) cannot be an instrument of optimization at time t; however, z(t) may be chosen, and since in turn this affects ẏ(t), it is possible to change future levels of the y stock variable.

Working in continuous time, it will be necessary to make use of simple integrals. Recall that the accumulation relationship

K(t + nΔt) = K(t) + Σ_{j=1}^{n} [I(t + jΔt) − δK(t + jΔt)] Δt

has a continuous-time counterpart: fixing t + nΔt = T and letting n → ∞ as Δt → 0,

K(T) = K(t) + ∫_t^T [I(τ) − δK(τ)] dτ.

Given a function of time I(τ) and a starting point K(t) = κ̄, the integral determining K(T) is solved by a function K(·) such that

(d/dτ) K(τ) = I(τ) − δK(τ),   K(t) = κ̄.

For example, let I(τ) = νK(τ). A function in the form K(τ) = C e^{(ν−δ)τ} with C = e^{−(ν−δ)t} κ̄ satisfies the conditions, and capital grows exponentially at rate ν − δ:

K(T) = K(t) e^{(ν−δ)(T−t)}.

A2.1. Objective function

In order to characterize the economic agent's optimal choices, we need to know not only the form of the accumulation constraint,

K̇(t) = I(t) − δK(t),   (2.A2)

but also that of an objective function which also explicitly recognizes the time dimension. If flows of benefits (utility, profits, . . .) are given by some function f(t, I(t), K(t)), the total value as of time zero of a dynamic optimization program may be measured by the integral

∫_0^∞ f(t, I(t), K(t)) e^{−ρt} dt,   (2.A3)

where ρ ≥ 0 (supposed constant for simplicity) is the intertemporal rate of discount of the relevant benefits. One could of course express in integral (rather than summation) form the objective function of consumption problems, such as those encountered in Chapter 1: we shall deal with such expressions in Chapter 4. Here, we will interpret the optimization problem in terms of capital and investment. It is sensible to suppose that

∂f(t, I, K)/∂K > 0,
    i.e., a larger stock of capital must increase the cash flow;
∂f(t, I, K)/∂I < 0,

i.e., investment expenditures reduce current cash flows; and

∂²f(t, I, K)/∂K² ≤ 0,   ∂²f(t, I, K)/∂I² ≤ 0,

with at least one strict inequality; i.e., returns to capital must be decreasing, and/or marginal investment costs are increasing. As usual, such concavity of the objective function ensures that, subject to the linear constraint (2.A2), the optimization problem has a unique internal solution, identified by first-order conditions, where the second-order condition is surely satisfied.

A2.2. Constrained optimization

The problem is that of maximizing the objective function in (2.A3), while satisfying the constraint (2.A2). The instruments of optimization are two functions, I(t) and K(t). Hence an infinitely large set of choices is available: one needs to choose the flow variable I(t) for each of uncountably many time intervals [t, t + dt), taking into account its direct (negative) effects on f(·) and thus on the integral in (2.A3); its (positive) effects on the K(·) accumulation path; and the (positive) effects of K(·) on f(·) and on the integral. "The" constraint (2.A2) is a functional constraint, that is, a set of infinitely many constraints in the form

I(t) − δK(t) − K̇(t) = 0,

each valid at an instant t. From the economic point of view, the agent faces a clear trade-off: investment is costly, but it makes it possible to increase the capital stock and enjoy additional benefits in the future.

We recall at this point that, in order to maximize a function subject to one or more constraints, one forms a Lagrangian as a linear combination of the objective function and the constraints. To each constraint, the Lagrangian assigns a coefficient (a Lagrange multiplier, or shadow price) measuring the variation of the optimized objective function in response to a marginal loosening of the constraint. In the case we are considering, loosening the accumulation constraint (2.A2) at time t means granting additional capital at the margin without requiring the costs entailed by additional investment. Thus, the shadow price is the marginal value of capital at time t evaluated, like all of (2.A3), at time zero.

In the case we are considering, a continuum of constraints is indexed by t, and the Lagrange multipliers define a function of t, denoted μ(t) in what follows. In practice, the Lagrangian linear combination has uncountably many terms and adds them up giving infinitesimal weight "dt" to each; we may write it in integral form:

L = ∫_0^∞ f(t, I(t), K(t)) e^{−ρt} dt + ∫_0^∞ μ(t)(I(t) − δK(t) − K̇(t)) dt.

Using the integration by parts rule,

∫_a^b f′(x) g(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f(x) g′(x) dx,

we obtain

−∫_0^∞ μ(t) K̇(t) dt = −(lim_{t→∞} μ(t)K(t) − μ(0)K(0)) + ∫_0^∞ μ̇(t) K(t) dt.

The optimization problem is ill defined if the limit does not exist. If it exists, it must be zero (as we shall see). Setting

lim_{t→∞} μ(t) K(t) = 0,   (2.A4)

we can rewrite the "Lagrangian" as

L̃ = ∫_0^∞ [f(t, I(t), K(t)) e^{−ρt} + μ(t)(I(t) − δK(t)) + μ̇(t)K(t)] dt + μ(0)K(0).

Given (2.A4), this form is completely equivalent to the previous one, and, conveniently, it does not feature K̇. The necessary conditions for a constrained maximization problem are that all derivatives of the Lagrangian with respect to the instruments (here, I(t) and K(t)) and the shadow prices (here, μ(t)) be zero.
If we were dealing with summations rather than integrals of expressions, which depend on the various instruments and shadow prices, we would equate to zero the derivative of each expression. By analogy, we can differen- tiate the function being integrated with respect to Ï(t ) for each t in L : comfortingly, this procedure retrieves the functional accumulation constraint ( I (t ) − ‰K (t ) − K̇ (t )) = 0; and we can differentiate the function being integrated in the equivalent expression L̃ with respect to I (t ) and K (t ) for each t , obtaining ∂ f (t, I (t ), K (t )) ∂ I (t ) e −Òt + Ï(t ) = 0, (2.A5) ∂ f (t, I (t ), K (t )) ∂ K (t ) e −Òt − Ï(t )‰ + Ï̇(t ) = 0. (2.A6) Note that we have disregarded the term Ï(0)K (0) in L̃ when differentiating (2.A6) at t = 0. In fact, the initial stock of capital is a parameter rather than an endogenous variable in the optimization problem: it would be nonsensical to impose a first-order condition in the form Ï(0) = 0. Similar considerations also rationalize the assumption made in (2.A4). Intuitively, if the limit of K (t )Ï(t ) were finite but different from zero, then Ï(∞)K (∞) should satisfy first-order conditions: differentiating with respect to Ï(∞), we would need K (∞) = 0, and differentiating with respect to K (∞), we would need Ï(∞) = 0. Either one of these conditions implies (2.A4). (Of course, this is a very heuristic argument: it is not really rigorous to take such derivatives directly at the limit. A more rigorous approach would consider a similar problem with a finite planning horizon T , and take the limit for T → ∞ of first-order conditions at T .) 96 INVESTMENT A2.3. The Hamiltonian recipe Conditions (2.A5) and (2.A6) can be derived directly from the Hamiltonian (rather than Lagrangian) of the problem, defined as follows: H (t ) = [ f (t, I (t ), K (t )) + Î(t )( I (t ) − ‰K (t ))]e −Òt . In this definition the shadow price Ï(t ) (which measures values “at time zero”) is replaced by Î(t ) ≡ Ï(t )e Òt (which measures values “at time t ,” without discounting them back to zero). In all other respects, the Hamiltonian expression is similar to the function integrated in the Lagrangians introduced above, and therefore measures the flow of benefits offered by a dynamic policy.32 The expression proposed, in fact, multiplies the accumulation constraint’s shadow price by I (t ) − ‰K (t ), which is the same as K̇ (t ), along any possible dynamic path. Thus, the benefit flow includes the value (in terms of the objective function) of the increase in the stock variable K (t ). This term makes it possible to take into account properly the problem’s intertemporal linkages when maximizing the Hamiltonian expression. In current-value terms, the optimality conditions encountered above read ∂ H ∂ I = 0, (2.A7) − ∂ H ∂ K = d (Î(t )e −Òt ) d t , (2.A8) lim t→∞ Î(t )e −Òt K (t ) = 0. (2.A9) The constraint, which in a static problem would be equivalent to the first-order condition with respect to the shadow price, is imposed by the condition ∂ H (·) ∂[Îe −Òt ] = K̇ . A2.4. The general case All the above derivations took the accumulation constraint to be linear, as in (2.A2), and the rate of discount per unit time to be constant at Ò. More general specifications can be treated similarly. If the problem given is max [∫ ∞ 0 f (t, z(t ), y(t ))e − ∫ t 0 r (s )d s d t ] (2.A10) ³² Note that the benefit flow is discounted back to zero in the expression proposed. 
Since the discount factor is always strictly positive, one may equivalently define a current value Hamiltonian, a similar expression without the discount factor. The interpretation of the shadow value and the relevant dynamic optimality conditions will be different, but jointly equivalent to those outlined here. INVESTMENT 97 subject to ẏ(t ) = g (t, z(t ), y(t )), for all t ≥ 0, (2.A11) then one forms the Hamiltonian H (t ) = [ f (t, z(t ), y(t )) + Î(t )( g (t, z(t ), y(t )))]e − ∫ t 0 r (s )d s , and imposes first-order conditions in the form ∂ H ∂ z = 0, (2.A12) − ∂ H ∂ y = d (Î(t )e − ∫ t 0 r (s )d s ) d t , (2.A13) lim t→∞ Î(t )e − ∫ t 0 r (s )d s y(t ) = 0. (2.A14) If the constraint is not linear, then these conditions can identify an optimum even when f (·) is not strictly concave. For example, one may have ∂ f /∂ z < 0 constant (a constant unit investment cost) if g (·) is increasing and convex in z. REVIEW EXERCISES Exercise 14 Consider a firm with capital as the only factor of production. Its revenues at time t are R(K (t )) if installed capital is K (t ). The accumulation constraint has the usual form, K̇ (t ) = I (t ) − ‰K (t ), and the cost of investing I (t ) is a function G ( I (t )) that does not depend on installed capital (for simplicity, Pk ≡ 1). (a) Suppose the firm aims at maximizing the present discounted value at rate r of its cash flows, F (t ). Express cash flows, in terms of the functions R( · ) and G ( · ), derive the relevant first-order conditions, and characterize the solution graphically making specific assumptions as to the derivatives of R( · ) and G ( · ). (b) Characterize the solution under more specific assumptions: suppose revenues are a linear function of installed capital, R(K ) = ·K , and let the investment cost function be quadratic, G ( I ) = I + b I 2. Derive and interpret an expression for the steady-state capital stock. What happens if ‰ = 0? Exercise 15 A firm’s cash flows are K · N‚ − Pk G (K̇ , K ) − w N, where K is the capital stock, K̇ its rate of change, and N is a perfectly flexible factor. Let r be the rate of discount applied to future cash flows, over an infinite planning horizon. (a) What needs to be assumed about ·, ‚, G (·, ·) to ensure that the Hamiltonian first- order conditions identify the optimal solution? 98 INVESTMENT (b) Let · + ‚ < 1. Draw a saddlepath diagram for given Pk , w, and r ; be specific as to what you assume about the form of G (·, ·). Show the effects of an unexpected, permanent change of Pk , starting from the steady state. (c) What does Pk represent in the problem? Would it be a good idea to let G (K̇ t , K t ) = ( K̇ t K t )2 ? Or would it be preferable to let G (x, y) = (x/y)3? What about G (x, y) = (x )3? (d) Suppose that · + ‚ = 1, and let Pk G (K̇ , K ) = g (K̇ ). (Adjustment costs do not depend on K .) The wage is constant at w(t ) = 1 only for 0 ≤ t < T : thereafter, its level is random, and for t ≥ T w(t ) = { 1 + Ó, with prob. 1/2 1 − Ó, with prob. 1/2. Write the first-order condition for investment at t = 0. How does the investment flow depend on Ó for t < 1? Exercise 16 A firm’s production function is Y (t ) = · √ K (t ) + ‚ √ L (t ), and its product is sold at a given price, normalized to unity. Factor L is not subject to adjustment costs, and is paid w per unit time. Factor K obeys the accumulation constraint K̇ (t ) = I (t ) − ‰K (t ), and the cost of investing I is G ( I ) = I + „ 2 I 2 per unit time (letting Pk = 1). The firm maximizes the present discounted value at rate r of its cash flows. 
(a) Write the Hamiltonian for this problem, derive and discuss briefly the first-order conditions, and draw a diagram to illustrate the solution. (b) Analyze graphically the effects of an increase in ‰ (faster depreciation of installed capital) and give an economic interpretation of the adjustment trajectory. (c) If, instead of being constant, the cost of factor L were a random variable, would this matter for the firm’s investment policy? Explain. Exercise 17 As a function of installed capital K , a firm’s revenues are given by R(K ) = K − 1 2 K 2. INVESTMENT 99 The usual accumulation constraint has ‰ = 0.25, so K̇ = I − 0.25K . Investing I costs Pk G ( I ) = Pk ( I + 1 2 I 2 ) . The firm maximizes the present discounted value at rate r = 0.25 of its cash flows. (a) Write the first-order conditions of the dynamic optimization problem, and charac- terize the solution graphically supposing that Pk = 1 (constant). (b) Starting from the steady state of the Pk = 1 case, show the effects of a 50% subsidy of investment (so that Pk is halved). (c) Discuss the dynamics of optimal investment if at time t = 0, when Pk is halved, it is also announced that at some future time T > 0 the interest rate will be tripled,
    so that r (t ) = 0.75 for t ≥ T .
    Exercise 18 The revenue flow of a firm is given by
R(K , N) = 2 K^{1/2} N^{1/2},
    where N is a freely adjustable factor, paid a wage w(t ) at time t ; K is accumulated
    according to
    K̇ = I − ‰K ,
    and an investment flow I costs
G ( I ) = I + (1/2) I².
    (Note that Pk = 1, hence q = Î.)
    (a) Write the first-order conditions for maximization of present discounted (at rate r )
    value of cash flows over an infinite planning horizon.
    (b) Taking r and ‰ to be constant, write an expression for Î(0) in terms of w(t ), the
    function describing the time path of wages.
    (c) Evaluate that expression in the case where w(t ) = w̄ is constant, and characterize
    the solution graphically.
    (d) How could the problem be modified so that investment is a function of the average
    value of capital (that is, of Tobin’s average q)?
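As a bridge between the appendix and these exercises, the following minimal Python sketch (not a solution to any particular exercise) computes the steady state implied by first-order conditions of the form (2.A7)–(2.A8) when revenues are linear in capital, R(K) = αK, and investment costs are quadratic, G(I) = I + bI² with Pk = 1. Every functional form and parameter value below is an illustrative assumption.

    # Minimal sketch: steady state implied by dH/dI = 0 and the co-state equation,
    # assuming R(K) = alpha*K, G(I) = I + b*I**2, and Pk = 1.
    # In steady state: lambda = 1 + 2*b*I,  (r + delta)*lambda = alpha,  I = delta*K.
    alpha, b, r, delta = 0.30, 2.0, 0.05, 0.10   # illustrative values

    lam_ss = alpha / (r + delta)        # steady-state shadow value of installed capital
    I_ss = (lam_ss - 1.0) / (2.0 * b)   # investment equating marginal cost to lambda
    K_ss = I_ss / delta                 # capital stock consistent with I = delta*K

    print(f"lambda* = {lam_ss:.3f}, I* = {I_ss:.3f}, K* = {K_ss:.3f}")

With these numbers the steady-state shadow value exceeds one, so positive gross investment is needed to sustain the stock; a lower α or a higher r + δ would push λ* toward one and K* toward zero.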
FURTHER READING
    Nickell (1978) offers an early, very clear treatment of many issues dealt with in
    this chapter. Section 2.5 follows Hayashi (1982). For a detailed and clear treatment
    of saddlepath dynamics generated by anticipated and non-anticipated parameter
    changes, see Abel (1982). The effects of uncertainty on optimal investment flows under
    convex adjustment costs, sketched in Section 2.4, were originally studied by Hartman
    (1972). A more detailed treatment of optimal inaction in a certainty setting may be
    found in Bertola (1992).

    Dixit (1993) offers a very clear treatment of optimization problems under
    uncertainty in continuous time, introduced briefly in the last section of the chapter.
    Dixit and Pindyck (1994) propose a more detailed and very accessible discussion of
    the relevant issues. Bertola (1998) contains a more complete version of the irreversible
    investment problem solved here. For a very complex model of irreversible investment
    and dynamic aggregation, and for further references, see Bertola and Caballero (1994).
    When discussing consumption in Chapter 1, we emphasized the empirical
    implications of optimization-based theory, and outlined how theoretical refinements
    were driven by the imperfect fit of optimality conditions and data. Of course, the-
    oretical relationships have also been tested and estimated on macroeconomic and
    microeconomic investment data. These attempts have met with considerably less
    success than in the case of consumption. While aggregate consumption changes are
    remarkably close to the theory’s unpredictability implication, aggregate investment’s
    relationship to empirical measures of q is weak and leaves much to be explained by
    output and by distributed lags of investment, and its relationship to empirical meas-
    ures of Jorgenson’s user cost are also empirically elusive. (For surveys, see Chirinko,
    1993, and Hubbard, 1998.) The evidence does not necessarily deny the validity of
    theoretical insights, but it certainly calls for more complex modeling efforts. Even
    more than in the case of consumption, financial constraints and expectation formation
    mechanisms play a crucial role in determining investment in an imperfect world.
    Together with monetary and fiscal policy reactions, financial and expectational
    mechanisms are relevant to more realistic models of macroeconomic dynamics of
    the type studied in Section 2.5. As in the case of consumption, however, attention
    to microeconomic detail (as regards heterogeneity of individual agents’ dynamic
    environment, and adjustment-cost specifications leading to infrequent bursts of
    investment) has proven empirically useful: aggregate cost-of-capital measures are
    statistically significant in the long run, and short-run dynamics can be explained by
    fluctuations of the distribution of individual firms within their inaction range (Bertola
    and Caballero, 1994).
REFERENCES
    Abel, A. B. (1982) “Dynamic Effects of Temporary and Permanent Tax Policies in a q model of
    Investment,” Journal of Monetary Economics, 9, 353–373.
    Barro, R. J., and X. Sala-i-Martin (1995) “Appendix on Mathematical Methods,” in Economic
    Growth, New York: McGraw-Hill.
    Bertola, G. (1992) “Labor Turnover Costs and Average Labor Demand,” Journal of Labor Eco-
    nomics, 10, 389–411.
Bertola, G. (1998) “Irreversible Investment (1989),” Ricerche Economiche/Research in Economics, 52,
    3–37.
Bertola, G., and R. J. Caballero (1994) “Irreversibility and Aggregate Investment,” Review of Economic
    Studies, 61, 223–246.
    Blanchard, O. J. (1981) “Output, the Stock Market and the Interest Rate,” American Economic
Review, 71, 132–143.

    Chirinko, R. S. (1993) “Business Fixed Investment Spending: A Critical Survey of Modelling
    Strategies, Empirical Results, and Policy Implications,” Journal of Economic Literature, 31,
    1875–1911.
    Dixit, A. K. (1990) Optimization in Economic Theory, Oxford: Oxford University Press.
Dixit, A. K. (1993) The Art of Smooth Pasting, London: Harcourt.
Dixit, A. K., and R. S. Pindyck (1994) Investment under Uncertainty, Princeton: Princeton University
    Press.
    Hartman, R. (1972) “The Effect of Price and Cost Uncertainty on Investment,” Journal of
    Economic Theory, 5, 258–266.
    Hayashi, F. (1982) “Tobin’s Marginal q and Average q : A Neoclassical Interpretation,” Economet-
    rica, 50, 213–224.
    Hubbard, R. G. (1998) “Capital-Market Imperfections and Investment,” Journal of Economic
    Literature 36, 193–225.
    Jorgenson, D. W. (1963) “Capital Theory and Investment Behavior,” American Economic Review
    (Papers and Proceedings), 53, 247–259.
Jorgenson, D. W. (1971) “Econometric Studies of Investment Behavior,” Journal of Economic Literature, 9,
    1111–1147.
    Keynes, J. M. (1936) General Theory of Employment, Interest, and Money, London: Macmillan.
    Nickell, S. J. (1978) The Investment Decisions of Firms, Cambridge: Cambridge University Press.
    Tobin, J. (1969) “A General Equilibrium Approach to Monetary Theory,” Journal of Money,
    Credit, and Banking, 1, 15–29.

    3 Adjustment Costs in
    the Labor Market
    In this chapter we use dynamic methods to study labor demand by a single
    firm and the equilibrium dynamics of wages and employment. As in previous
    chapters, we aim at familiarizing readers with methodological insights. Here
    we focus on how uncertainty may be treated simply in an environment that
    allows economic circumstances to change, with given probabilities, across a
    well-defined and stable set of possible states (a Markov chain). We derive
    some generally useful technical results from first principles and, again as in
    previous chapters, we discuss their economic significance intuitively, with
    reference to their empirical and practical relevance in a labor market context.
    In reality, adjustment costs imposed on firms by job security legislation are
    widely different across countries, sectors, and occupations, and the literature
    has given them a prominent role when comparing European and American
    labor market dynamics. (See Bertola, 1999, for a survey of theory and evi-
    dence.) In most European countries, legislation imposes administrative and
    legal costs on employers wishing to shed redundant workers. Together with
    other institutional differences (reviewed briefly in the suggestions for further
    reading at the end of the chapter), this has been found to be an important
    factor in shaping the European experience of high unemployment in the last
    three decades of the twentieth century.
    Section 3.1 derives the optimal hiring and firing decisions of a firm that is
    subject to adjustment costs of labor. The next two sections characterize the
    implications of these optimal policies for the dynamics and the average level
    of employment. Finally, in Section 3.4 we study the interactions between the
    decisions of firms and workers when workers are subject to mobility costs,
    focusing in particular on equilibrium wage differentials. The entire analysis of
    this chapter is based on a simple model of uncertainty, characterized formally
    in the appendix to the chapter.
    Remember that in Chapter 2 we viewed the factor N, which was not subject
    to adjustment costs, as labor. Hence we called its remuneration per unit of
    time, w, the “wage rate.” In the absence of adjustment costs, the optimal
    labor input had a simple and essentially static solution: that is, the optimal
    employment level needed to satisfy the condition
∂ R(t, K (t ), N)/∂ N = w(t ). (3.1)

    Figure 3.1. Static labor demand
    This first-order condition is necessary and sufficient if the total revenues
    R(·) are an increasing and concave function of N. Under this condition,
    ∂ R(·)/∂ N is a decreasing function of N and (3.1) implicitly defines the
    demand function for labor N∗(t, K (t ), w(t )).
    If the above condition holds, the employment level depends only on the
    levels of K , of wages, and of the exogenous variables that, in the absence of
    uncertainty, are denoted by t . This relationship between employment, wages,
    and the value of the marginal product of labor is illustrated in Figure 3.1,
    which is familiar from any elementary textbook. In fact, the same relation
    can be obtained assuming that firms simply maximize the flow of profits in
    a given period, rather than the discounted flow of profits over the entire time
    horizon.
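To fix ideas, a minimal Python sketch (with an assumed revenue function and illustrative parameters, not taken from the text) inverts the condition ∂R/∂N = w and confirms that the implied demand schedule slopes downward in the wage:

    # Minimal sketch: static labor demand from dR/dN = w, assuming, purely for
    # illustration, R(N) = Z * N**alpha with 0 < alpha < 1.
    Z, alpha = 2.0, 0.5                      # illustrative level and curvature

    def marginal_product(N):
        return Z * alpha * N ** (alpha - 1.0)

    def static_labor_demand(w):
        # Invert Z * alpha * N**(alpha - 1) = w analytically.
        return (Z * alpha / w) ** (1.0 / (1.0 - alpha))

    for w in (0.5, 1.0, 1.5):
        N_star = static_labor_demand(w)
        print(f"w = {w:.2f} -> N* = {N_star:.3f} (check dR/dN = {marginal_product(N_star):.3f})")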
    The fact that the static optimality condition remains valid in the potentially
    more complex dynamic environment illustrates a general principle. In order
    for the dynamic aspects of an economic problem to be relevant, the effects
    of decisions taken today need to extend into the future; likewise, decisions
    taken in the past must condition current decisions. Adjustments costs (linear
    or strictly convex) introduced for investment in Chapter 2 make it costly for
    firms to undo previous choices. As a result, when firms decide how much
    to invest, they need to anticipate their future input of capital. But if labor is
    simply compensated on the basis of its effective use, and if variations in N(t )
    do not entail any cost, then forward-looking considerations are irrelevant.
    Firms do not need to form expectations about the future because they know

    that it will always be possible for them to react immediately, and without any
    cost, to future events.33
    3.1. Hiring and Firing Costs
    In Chapter 2, on investment, the presence of more than one state variable
    would have complicated the analysis of the dynamic aspects of optimal invest-
    ment behavior. In particular, we would not have been able to use the sim-
    ple two-dimensional phase diagram. It was therefore helpful to assume that
    no factors other than capital were subject to adjustment costs. Since in this
    chapter we aim to analyze the dynamic behavior of employment, it would
    not be very useful or realistic to retain the assumption that variations in
    employment do not entail any costs for the firm. For example, as a result of the
    technological and organizational specificity of labor, firms incur hiring costs
    because they need to inform and instruct newly hired workers before they are
    as productive as the incumbent workers. The creation and destruction of jobs
    (turnover) often entails costs for the workers too, not only because they may
    need to learn to perform new tasks, but also in terms of the opportunity cost
    of unemployment and the costs of moving. The fact that mobility is costly
    for workers affects the equilibrium dynamics of wages and employment, as
    we will see below. In fact, it is in order to protect workers against these costs
    of mobility that labor contracts and laws often impose firing costs, so that
    firms incur costs both when they expand and when they reduce their labor
    demand.
    We start this chapter by considering the optimal hiring and firing policies
    of a single firm that is subject to hiring and firing costs. As in the case of
    investment, the solution described by (3.1), in which the marginal produc-
    tivity and the marginal costs of labor are equated in every period, is no longer
    efficient with adjustment costs. Like the installation costs for machinery and
    equipment, the costs of hiring and firing require a firm to adopt a forward-
    looking employment policy.
    The economic implications of such behavior could well be studied using the
    continuous time optimization methods introduced in the previous chapter,
    and some of the exercises below explore analogies with the methods used
    in the study of investment there. We adopt a different approach, however, in
    order to explore new aspects of the dynamic problems that we are dealing with
    and to learn new techniques. As in Chapter 1, we assume that the decisions
    ³³ Even in the absence of adjustment costs, the consumption and savings decisions studied in
    Chapter 1 have dynamic implications via the budget constraint of agents, since current consumption
    affects the resources available for future consumption. Adjustment costs may also be relevant for the
    consumption of non-durable goods if the utility of agents depends directly on variations (and not just
    levels) of consumption. This could occur for instance as a result of habits or addiction.

    are taken in discrete time and under uncertainty about the future. Since we
    also want to take adjustment costs and equilibrium features into account, it is
    useful to simplify the model.
    In what follows, we assume that firms operate in an environment in which
    one or more exogenous variables (like the retail price of the output, the
    productive efficiency, or the costs of inputs other than labor) fluctuate so
    that a firm is sometimes more and sometimes less inclined to hire workers.
    In (3.1), the capital stock of a firm K (t ) (which we do not analyze explicitly
    in this chapter) and the time index t could represent these exogenous factors.
    To simplify the analysis as much as possible, we assume that the complex of
    factors that are relevant for the intensity of labor demand has only two states:
    a strong state indexed by g , and a weak state indexed by b. If the alternation
    between these two states were unambiguously determined by t , the firm would
    be able to determine the evolution of the exogenous variables. Here we shall
    assume that the evolution of demand is uncertain. In each period the demand
    conditions change with probability p from weak to strong or vice versa. Hence,
    in each period the firm takes its decisions on how many workers to hire or
    fire knowing that the prevailing demand conditions remain unchanged with
    probability (1 − p).
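As a concrete illustration of this two-state environment, the short Python simulation below (all numbers are assumptions made only for the example) switches the demand state with probability p in each period and checks that, in the long run, roughly half of the periods feature strong demand:

    import random

    # Minimal sketch of the two-state demand process: the state switches with
    # probability p each period and is unchanged with probability 1 - p.
    p = 0.3                  # illustrative switching probability
    T = 100_000              # number of simulated periods
    state = "g"              # start in the strong-demand state
    count_g = 0

    random.seed(0)
    for _ in range(T):
        if random.random() < p:                  # switch with probability p
            state = "b" if state == "g" else "g"
        count_g += (state == "g")

    print(f"fraction of periods in the strong state: {count_g / T:.3f} (ergodic value 0.5)")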
    As in the analysis of investment, we assume that the firm maximizes the
    current discounted value of future cash flows. Given that the variations of Z
    are stochastic, the objective of the firm needs to be expressed in terms of the
    expected value of future cash flows. To simplify the interpretation of the trans-
    ition probability p, it is convenient to adopt a discrete-time setup. Assuming
    that firms are risk-neutral, we can then write
Vt = Et [ Σ_{i=0}^{∞} (1/(1 + r ))^i ( R( Zt+i , Nt+i ) − w Nt+i − G (ΔNt+i ) ) ], (3.2)
    where:
• E t [·] denotes the expected value conditional on the information avail-
    able at date t (this concept is defined formally in the chapter’s appendix
    within the context of the simple model studied here);
• r is the discount rate of future cash flow, which we assume constant for
    simplicity; likewise, w denotes the constant wage that a worker receives
    in any given period;
• the total revenues R(·) depend on employment N and a variety of
    exogenous factors indexed by Zt +i : if the demand for labor is strong
    in period t + i , then Zt +i = Zg , while if labor demand is weak, then
    Zt +i = Zb ;
• the function G (·) represents the costs of hiring and firing, or turnover,
which in any given period t + i depends on the net variation ΔNt +i ≡

    Nt +i − Nt +i −1 of the employment level with respect to the preceding
    period; this net variation of employment plays the same role as the
    investment level I (t ) in the analysis of capital in the preceding chapter.
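The expectation in (3.2) is easy to approximate numerically once functional forms and an employment rule are specified. The Python sketch below is only an illustration: the revenue function, the two-level policy N(Z), and every parameter value are assumptions, and the policy is not claimed to be optimal.

    import random

    # Minimal sketch: Monte Carlo approximation of the objective (3.2) for an
    # arbitrary two-level employment rule N(Z). Functional forms and numbers are
    # illustrative assumptions.
    r, w, p, H, F = 0.05, 1.0, 0.3, 0.5, 0.5
    Z = {"g": 2.0, "b": 1.2}
    N_rule = {"g": 1.0, "b": 0.6}                  # candidate (not optimized) policy

    def revenue(z, n):                             # an illustrative concave revenue function
        return z * n - 0.5 * n ** 2

    def turnover_cost(dn):                         # linear hiring/firing costs as in (3.3)
        return H * dn if dn >= 0 else -F * dn

    def simulated_value(T=400, paths=2000, seed=1):
        random.seed(seed)
        total = 0.0
        for _ in range(paths):
            state, n_prev, value = "g", N_rule["g"], 0.0
            for t in range(T):
                if t > 0 and random.random() < p:  # state switches with probability p
                    state = "b" if state == "g" else "g"
                n = N_rule[state]
                cash = revenue(Z[state], n) - w * n - turnover_cost(n - n_prev)
                value += cash / (1.0 + r) ** t
                n_prev = n
            total += value
        return total / paths

    print(f"estimated value of the candidate policy: {simulated_value():.2f}")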
    Exercise 19 To explore the analogy with the investment problem of the previous
    chapter, rewrite the objective function of the firm assuming that the turnover costs
    depend on the gross variations of employment, and that this does not coincide
    with �N because a fraction ‰ of the workers employed in each period resign, for
    personal reasons or because they reach retirement age, without costs for the firm.
    Note also that (3.2) does not feature the price of capital, Pk : what could such a
    parameter mean in the context of the problems we study in this chapter?
    In order to solve the model, we need to specify the functional form of G (·).
    As in the case of investment, the adjustment costs may be strictly convex. In
    that case, the unit costs of turnover would be an increasing function of the
    actual variation in the employment level. This would slow down the optimal
    response to changes in the exogenous variables. However, there are also good
    reasons to suppose that adjustment costs are concave. For instance, a single
    instructor can train more than one recruit, and the administrative costs of a
    firing procedure may well be at least partially independent of the number of
    workers involved.
    The case of linear adjustment costs that we consider here lies in between
    these extremes. The simple proportionality between the cost and the amount
    of turnover simplifies the characterization of the optimal labor demand poli-
    cies. We therefore assume that
G (ΔN) = (ΔN) H if ΔN ≥ 0,
    −(�N) F if �N < 0, (3.3) where the minus sign that appears in the �N < 0 case ensures that any variation in employment is costly for positive values of parameters H and F . By (3.3), the firm incurs a cost H for each unit of labor hired, while any unit of labor that is laid off entails a cost F . Both unit costs are independent of the size of �N, and, since H is not necessarily equal to F , the model allows for a separate analysis of hiring and firing costs. As in the analysis of investment, firms’ optimal actions are based on the shadow value of labor, defined as the marginal increase in the discounted cash flow of the firm if it hires one additional unit of labor. When a firm increases the employment level by hiring an infinitesimally small unit of labor while keeping the hiring and firing decisions unchanged, the objective function defined in (3.2) varies by an amount of Ît = E t [ ∞∑ i =0 ( 1 1 + r )i ( ∂ R( Zt +i , Nt +i ) ∂ Nt +i − w )] (3.4) LABOR MARKET 107 per unit of additional employment. If the employment levels Nt +i on the right- hand side of this equation are the optimal ones, (3.4) measures the marginal contribution of an infinitesimally small labor input variation around the opti- mally chosen one. This follows from the envelope theorem, which implies that infinitesimally small variations in the employment level do not have first-order effects on the value of the firm. 3.1.1. OPTIMAL HIRING AND FIRING To characterize the optimal policies of the firm, we assume that the realiza- tion of Zt is revealed at the beginning of period t , before a firm chooses the employment level Nt that remains valid for the entire time period. 34 Hence, E t [ ∂ R( Zt , Nt ) ∂ Nt − w ] = ∂ R( Zt , Nt ) ∂ Nt − w. We can separate the first term of the summation in (3.4), whose discount factor is equal to one, from the remaining terms. To simplify notation, we define Ï( Zt +i , Nt +i ) ≡ ∂ R( Zt +i , Nt +i ) ∂ Nt +i , and write Ît = Ï( Zt , Nt ) − w + E t [ ∞∑ i =1 ( 1 1 + r )i (Ï( Zt +i , Nt +i ) − w) ] = Ï( Zt , Nt ) − w + ( 1 1 + r ) E t [ ∞∑ i =0 ( 1 1 + r )i (Ï( Zt +1+i , Nt +1+i ) − w) ] . At date t + 1 agents know the realization of Zt +1, while at t they know only the probability distribution of Zt +1. The conditional expectation at date E t +1[·] is therefore based on a broader information set than that at E t [·]. ³⁴ We could have adopted other conventions for the timing of the exogenous and endogenous stock variables. For example, retaining the assumption that Nt is determined at the start of period t , we could assume that the value of Zt is not yet observed when firms take their hiring and firing decisions; it would be a useful exercise to repeat the preceding analysis under this alternative hypothesis. Such timing conditions would be redundant in a continuous-time setting, but the elegance of a reformula- tion in continuous time would come at the cost of additional analytical complexity in the presence of uncertainty. 108 LABOR MARKET Applying the law of iterative expectations, which is discussed in detail in the Appendix, we can then write E t [ ∞∑ i =0 ( 1 1 + r )i (Ï( Zt +1+i , Nt +1+i ) − w) ] = E t [ E t +1 [ ∞∑ i =0 ( 1 1 + r )i (Ï( Zt +1+i , Nt +1+i ) − w) ]] . Recognizing the definition of Ît +1 in the above expression, we obtain a recurs- ive relation between the shadow value of labor in successive periods: Ît = Ï( Zt , Nt ) − w + 1 1 + r E t [Ît +1]. 
(3.5) This relationship is similar to the expression that was obtained by differentiat- ing the Bellman equation in the appendix to Chapter 1, and is thus equivalent to the Euler equation that we have already encountered on various occasions in the preceding chapters. Exercise 20 Rewrite this equation in a way that highlights the analogy between this expression and the condition r Î = ∂ R(·)/∂ K + Î̇, which was derived when we solved the investment problem using the Hamiltonian method. The optimal choices of the firm are obvious if we express them in terms of the shadow value of labor. First of all, the marginal value of labor cannot exceed the costs of hiring an additional unit of labor. Otherwise the firm could increase profits by choosing a higher employment level, contradicting the hypothesis that employment maximizes profits. Hence, given that the costs of a unit increase in employment are equal to H , while the marginal value of this additional unit is Ît , we must have Ît ≤ H . Similarly, if Ît < −F , the firm could increase profits immediately by fir- ing workers at the margin: the immediate cost of firing one unit of labor, −F , would be more than compensated by an increase in the cash flow of the firm. Again, this contradicts the assumption that firms maximize profits. Hence, if the dynamic labor demand of a firm is such that it maximizes (3.2), we must have −F ≤ Ît ≤ H (3.6) for each t . Moreover, either the first or the second inequality turns into an equality sign if �Nt �= 0: formally, at an interior optimum for the hiring and firing policies of a firm, we have d G (�Nt )/d (�Nt ) = Ît . Whenever the firm prefers to adjust the employment level rather than wait for better or worse circumstances, the marginal cost and benefit of that action need to equal each other. If the firm hires a worker we have Ît = H , which LABOR MARKET 109 implies that the marginal benefit of an additional worker is equal to the hiring costs. Similarly, if a firm fires workers, it must be true that Î = −g ; that is, the negative marginal value of a redundant worker needs to be compensated exactly by the cost of firing this worker g . Notice also that the shadow value of the marginal worker can be negative only if the wage exceeds the value of marginal productivity. As in the case of investment, the conditions based on the shadow value defined in (3.4) are not in themselves sufficient to formulate a solution for the dynamic optimization problem. In particular, if ∂ R(·)/∂ N depends on N, then in order to calculate Ît as in (3.4) we need to know the distribution of {Nt +i , i = 0, 1, 2, . . . }, and thus we need to have already solved the optimal demand for labor. It would be useful if we could study the case in which the revenues of the firm are linear in N. This would be analogous to the model we used to show that optimal investment (with convex adjustment costs) could be based on the average q . However, in this case static labor demand is not well defined. In Figure 3.1 the value of marginal productivity would give rise to a horizontal line at the height of w and the optimality conditions would be satisfied for a continuum of employment levels. In fact, in the case of investment we saw that the value of capital stock and the size of firms were ill defined when the average value of q was the only determinant of investment; to characterize optimal investment decisions, we needed convex adjustment costs. 
These difficulties are familiar from the study of dynamic investment prob- lems in an environment without uncertainty. In the presence of uncertainty, even after solving the dynamic optimization problem, we could not assume that firms know their future employment levels: the evolution of employment {Nt +i , for i = 0, 1, 2, . . . } depends not only on the passing of time i , but also on the stochastic realizations of {Zt +i }. To tackle this difficulty, we can use the fact that a profit-maximizing firm will react optimally to each realization of this random variable. Hence we can deduce the probability distribution of the endogenous variable Nt +i from the probability distribution of {Zt +i }. At this point, the advantage of restricting the state space to two realizations becomes clear. In what follows we guess that the endogenous variables take on only two different values depending on the realization of Zt . If Zt = Zg , then Nt = Ng and Ît = Îg ; on the contrary, if Zt = Zb , then the employment level is given by Nt = Nb , and its shadow value is equal to Ît = Îb . When labor demand is strong, equation (3.5) can therefore be written in the form Îg = Ï( Zg , Ng ) − w + 1 1 + r [(1 − p)Îg + pÎb ]. (3.7) The shadow value Îg is given by the expected discounted shadow value in the next period plus the “dividend” in the current period, which is equal to the difference between the value of marginal productivity Ï(·) and the wage w. 110 LABOR MARKET Given that Ît +1 has only two possible values, the expected value in (3.7) is simply the product of Îg and the probability (1 − p) that the state remains unchanged, plus Îb times the probability p that the state changes from good to bad. Similarly, when labor demand is weak, we can write Îb = Ï(Nb , Zb ) − w + 1 1 + r [ pÎg + (1 − p)Îb ]. (3.8) If each transition from the “strong” to the “weak” state induces a firm to fire workers, then in order to satisfy (3.6) we need to have Îb = −F in bad states, and Îg = H in good states. Given that H and F are constants, Ît indeed takes only two values, as was guessed in order to derive (3.7) and (3.8). Substituting Îb = −F and Îg = H in these expressions, we can solve the resulting system of linear equations to obtain Ï(Ng , Zg ) = w + p F 1 + r + (r + p) H 1 + r , Ï(Nb , Zb ) = w − (r + p) F 1 + r − p H 1 + r . (3.9) 3.2. The Dynamics of Employment The character of the optimal labor demand policy is illustrated in Figure 3.2. The weak case is associated with a demand curve that lies below the demand curve in the strong case. Without hiring and firing costs, firms would equalize the value of marginal productivity to the wage rate w in each of the two states. Hence, with H = F = 0, the costs of labor are simply equal to w and employment oscillates between the levels identified by vertical dashed lines in the figure. If the cost of hiring H and/or the cost of firing F are posi- tive, this equality no longer holds. If labor demand (Z = Zg ) is strong, the marginal productivity of labor exceeds the wage rate. Symmetrically, when labor demand is weak ( Z = Zb ), the value of marginal productivity is less than the wage. Hence, it looks as if the optimal hiring decisions are based on a wage that is higher than w, while the firing decisions seem to be based on one that is lower. The dashed lines in Figure 3.2 illustrate a pair of “shadow wages” and employment levels that may be compatible with this. 
The vertical arrows indicate how these “shadow wages” differ from the actual wage, while the horizontal arrows indicate the differences between the static and dynamic employment levels in both states. Exercise 21 In Figure 3.2 both demand curves for labor are decreasing functions of employment. That is, we have assumed that ∂ 2 R(·)/∂ N2 < 0. How would the problem of optimal labor demand change if ∂ 2 R( Z i , N)/∂ N2 = 0 for i = b, g ? And if this were true only for i = b? LABOR MARKET 111 Figure 3.2. Adjustment costs and dynamic labor demand Hiring and firing costs reduce the size of fluctuations in the employment level between good and bad states. As mentioned in the introduction to this chapter, this very intuitive insight can be brought to bear on empirical evi- dence from markets characterized by differently stringent employment pro- tection legislation. In fact, the evidence unsurprisingly indicates that countries with more stringent labor market regulations feature less pronounced cyclical variations in employment. This is consistent with the simple model considered here (which takes wages to be exogenously given and constant) if the “firm” represents all employers in the economy, since wages in all countries are quite insensitive to cyclical fluctuations at the aggregate level (see Bertola, 1990, 1999, and references therein). It is certainly not surprising to find that turnover costs imply employment stability. If a negative cash flow is associated with each variation in the employ- ment level, firms optimally prefer to respond less than fully to fluctuations in labor demand. As indicated by the term labor hoarding, the firm values its labor force when considering the future as well as the current marginal revenue product of labor. Exercise 22 Show that it is optimal for the firm not to hire or fire any worker if both H and F are large relatively to the fluctuations in Z . To illustrate the role of the various parameters and of the functional form of R(·), it is useful to examine some limit cases. First of all, we consider the case in which F = 0 and H > 0: firms can fire workers at no cost, but hiring workers
    entails a cost over and above the wage. In order to evaluate how these costs
    affect firms’ propensity to create jobs, we rewrite the first-order condition for

    the strong labor demand case as
Ï(Ng , Zg ) − w = r H/(1 + r ) + p H/(1 + r ). (3.10)
    The first term on the right-hand side of this expression can be interpreted
    as a pure financial opportunity cost. If invested in an alternative asset with
    interest rate r , the hiring cost would yield a perpetual flow of dividends equal
    to r H from next period onwards, or, equivalently, a flow return of r H/(1 + r )
    starting this period. Hence, if the good state lasts for ever and p = 0, the
    presence of hiring costs simply corresponds to a higher wage rate. If, on
    the contrary, the future evolution of labor demand is uncertain and p > 0,
    the hiring costs also influence the employment level via the second term on the
    right of (3.10). The higher is p, the less inclined are firms to hire workers. The
    explanation is that firms might lose the resources invested in hiring a worker
    if this worker is laid off when labor demand switches from the good to the
    bad state. In the limit case with p = 1, labor demand oscillates permanently
    between the two states and (3.10) simplifies to Ï(Ng , Zg ) = w + H : since the
    marginal unit of labor that is hired in a good state is fired with probability one
    in the next period, we need to add the entire hiring cost to the salary.
    In periods with weak labor demand, the firm does not hire and hence does
    not incur any hiring cost. Nonetheless, the firm’s choices are still influenced
    by H : the employment level in the bad state needs to satisfy the following
    condition:
Ï(Nb , Zb ) = w − p H/(1 + r ). (3.11)
    In this equation a higher value of H is equivalent to a lower wage flow. This
    may seem surprising, but is easily explained. Retaining one additional unit of
    labor in the bad state costs the firm w, but the firm saves the cost of hiring an
    additional unit of labor in the next period if the demand conditions improve,
    which occurs with probability p.
    The reasoning for the case H = 0 and F > 0 is similar. In periods with weak
    labor demand,
Ï(Nb , Zb ) = w − (r + p) F /(1 + r ). (3.12)
    The firing cost F —which is saved if the firm decides not to fire a marginal
    worker—is equivalent to a lower wage in periods with weak labor demand.
    Conversely, in periods of strong labor demand we have
Ï(Ng , Zg ) = w + p F /(1 + r ), (3.13)
    and in this case the firing costs have the same effect as a wage increase: the
    fear that the firm may have to pay the firing cost if (with probability p) labor

    demand weakens in the next period deters the firm from hiring. Like hiring
    costs, firing costs therefore induce labor hoarding on the part of firms. In the
    case of firing costs, the firm values the units of labor it decides not to fire:
    moreover, the fear that the firm may not be able to reduce employment levels
    enough in periods with weak labor demand deters firms from hiring workers
    in good states.
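To see how H and F enter the optimality conditions quantitatively, the Python sketch below assumes a linear marginal revenue product of the form Z − βN (the specification also used later in the chapter) together with purely illustrative parameter values, and inverts the two conditions in (3.9) directly:

    # Minimal sketch: employment levels implied by (3.9), assuming a linear
    # marginal revenue product  Z - beta*N  and illustrative parameters.
    beta, w, r, p = 1.0, 1.0, 0.05, 0.3
    Z_g, Z_b = 2.0, 1.4

    def employment(H, F):
        # Invert  Z - beta*N = shadow wage  in each state, as in (3.9).
        shadow_g = w + (p * F + (r + p) * H) / (1 + r)
        shadow_b = w - ((r + p) * F + p * H) / (1 + r)
        return (Z_g - shadow_g) / beta, (Z_b - shadow_b) / beta

    for H, F in [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2)]:
        Ng, Nb = employment(H, F)
        print(f"H={H:.1f} F={F:.1f}: Ng={Ng:.3f} Nb={Nb:.3f} gap={Ng - Nb:.3f}")

Raising either H or F narrows the gap between Ng and Nb, which is the stabilizing effect discussed in this section.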
    Before turning to further implications and applications of these simple
    results, it is worth mentioning that qualitatively similar insights would of
    course be valid in more formally sophisticated continuous-time models, such
    as those introduced in Chapter 2’s treatment of investment. Convex adjust-
    ment costs are not a particularly realistic representation of real-life employ-
    ment protection legislation, but it is conceptually simple to let downward
    adjustment be costly (rather than impossible, or never profitable) in the irre-
    versible investment models introduced in Sections 2.6 and 2.7.
    Readers familiar with that material may wish to try the following exercises,
    which propose relatively simple versions of the models solved in the references
    given. Such readers, however, should be warned that both settings only yield
    a set of equations whose solutions have to be sought numerically, thus illus-
    trating the advantages in terms of tractability of the Markov chain methods
    discussed in this chapter.
    Exercise 23 (Bertola, 1992) Let time be continuous. Suppose labor’s revenue is
    given by
R(L , Z ) = Z L^{1−‚}/(1 − ‚), 0 < ‚ < 1, and let the cyclical index Z be the following trigonometric function of time: Z (Ù) = K1 + K2 sin(2 p Ù), K1 > K2 > 0.
    Discuss the possible realism of such perfectly predictable cycles, and outline the
    optimality conditions that must be obeyed over each cycle by the optimal employ-
    ment path if the wage is given at w and the employer faces adjustment costs
    C (L̇ (Ù))L̇ (Ù) for C (x ) = h if Ẋ > 0, C (x ) = − f if Ẋ < 0. Exercise 24 (Bentolila and Bertola, 1990). Let the dynamics of the exogenous variables relevant for labor demand be given by d Z (t ) = ËZ (t ) d t + ÛZ (t ) d W(t ), and let the marginal revenue product of labor be written in the form Z L −‚. The wage is given at w, hiring is costless, firing costs f per unit of labor, and workers quit costlessly at rate ‰ so that d L (t ) = −‰L (t ) if the firm neither hires nor fires at time t . Write the optimality conditions for the firm’s employment policy and discuss how a solution may be found. 114 LABOR MARKET 3.3. Average Long-Run Effects We have seen that positive values of H and F reduce a firm’s propensity to hire and fire workers. Adjustment costs therefore reduce fluctuations in the employment level. Their effect on the average employment level is less clear- cut. This depends essentially on the magnitude of the increase in employment in periods with strong labor demand, relative to the decrease in employment in periods with a weak labor demand. In general, either of the two effects may dominate. The net effect on average employment is therefore a priori ambiguous and depends, as we will see, on two specific elements of the model: on the one hand, that firms discount future cash flows at a positive rate, and on the other hand, that optimal static labor demand is often a non-linear function of the wage and of aggregate labor market conditions denoted by Z . Since transitions between strong and weak states are symmetric, the ergodic distribution is very simple: as shown in the appendix to this chapter, the probability that we observe weak labor demand in a period indefinitely far away in the future is independent of the current state. Hence, in the long run, both states have equal probability. Assigning a probability of one-half to each of the two first-order conditions in (3.9), we can calculate the average value of the marginal productivity of labor: Ï(Ng , Zg ) + Ï(Nb , Zb ) 2 = w + r 2 H − F 1 + r . (3.14) If r > 0, then the costs of hiring tend to increase the value of marginal pro-
ductivity in the long run: intuitively, the quantity (1/2) r H/(1 + r ) is added to
    the wage w, because in half of the periods the firm pays a cost H to hire the
marginal unit of labor. In doing so, the firm forgoes the flow return r H that
    would accrue from next period onwards if it had invested H in a financial
    asset. The effects of firing costs F are similar, but perhaps less intuitive.
    If F > 0 and discount r is positive, then average marginal productivity is
reduced by an amount equal to (1/2) r F /(1 + r ). To understand how a higher cost F may reduce marginal productivity despite the increase in labor costs,
    it is useful to note that this effect is absent if r = 0. Hence, the reduction
    in marginal productivity is a dynamic feature. Because the firm discounts
future revenues, cash flows in different periods are not equivalent: firing costs increase the firm’s willingness to pay any given wage in periods of weak labor demand by more than they reduce that willingness in periods of strong labor demand, when only the smaller, discounted value of the firing cost is taken into consideration.
    Graphically, with a positive value of r , firing costs are more important
    than hiring costs in the determination of the length of the arrow that points
    downwards in Figure 3.2. Conversely, hiring costs are more important in the
    determination of the length of the arrows that point upwards. Considering the
    employment levels associated with each level of the (shadow) wage, we can

    conclude that the positive impact of firing costs on low levels of employment
    are more pronounced than their negative impact on the employment level in
    the good state.
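The average condition (3.14) can also be verified numerically: averaging the two shadow wages in (3.9) gives w + (r/2)(H − F)/(1 + r), an effect that vanishes as r → 0. A minimal Python check with illustrative numbers:

    # Minimal sketch: average of the two shadow wages in (3.9) versus the closed
    # form in (3.14), for illustrative parameter values.
    w, p = 1.0, 0.3

    def average_shadow_wage(r, H, F):
        shadow_g = w + (p * F + (r + p) * H) / (1 + r)
        shadow_b = w - ((r + p) * F + p * H) / (1 + r)
        return 0.5 * (shadow_g + shadow_b)

    for r in (0.10, 0.05, 0.0):
        avg = average_shadow_wage(r, H=0.3, F=0.2)
        closed_form = w + 0.5 * r * (0.3 - 0.2) / (1 + r)
        print(f"r={r:.2f}: average = {avg:.4f}, formula (3.14) = {closed_form:.4f}")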
    3.3.1. AVERAGE EMPLOYMENT
    Figure 3.2 shows that variations in employment levels depend not only on
    differences between marginal products in the two cases and the wage, but also
    on the slope of the demand curve. If, as is the case in the figure, the slope
    of the demand curve is much steeper in the good state than in the bad state,
    the relative length of the two horizontal arrows can be such as to imply net
    employment effects that differ from what is suggested by the shadow wages in
    the two states. To isolate this effect, it is useful to set r = 0. In that case optimal
demand maximizes the average, rather than the discounted, value of the cash flow,
    and (3.14) then simplifies to
[Ï(Ng , Zg ) + Ï(Nb , Zb )]/2 = w. (3.15)
    The turnover costs no longer appear in this expression. This indicates that a
    firm can maximize average profits by setting the average value of marginal
productivity equal to the wage. Equality on average, however, does not imply that marginal productivity and the wage coincide in each state. In fact, rewriting the conditions in (3.9) for
    the case in which r = 0 gives
    Ï(Ng , Zg ) = w + p( F + H ),
    Ï(Nb , Zb ) = w − p( F + H ).
    (3.16)
    Hence the firm imputes a share p of the total turnover costs that it incurs along
    a completed cycle to the marginal unit on the hiring and the firing margin.
    Exercise 25 Discuss the case in which the firm receives a payment each time it
    hires a worker, for example because the state subsidizes employment creation, and
    H = −F . What would happen if the cost of hiring were so strongly negative that
    H + F < 0 even in the case of firing costs F ≥ 0? Even in the case when r = 0 and hiring and firing costs do not affect the expected marginal productivity of labor, the effect of adjustment costs on average employment is zero only when the slope of the labor demand curve is constant. In fact, if Ï(N, Zg ) = f ( Zg ) − ‚N, Ï(N, Zb ) = g ( Zb ) − ‚N, 116 LABOR MARKET then, for any pair of functions f (·) and g (·), the relationships in (3.16) imply Ng = f ( Zg ) − w − p( F + H ) ‚ , Nb = g ( Zb ) − w + p( F + H ) ‚ . Hence, in this case average employment, Ng + Nb 2 = 1 2 f ( Zg ) + g ( Zb ) − w ‚ , coincides with the employment level that would be generated by the (wider) fluctuations that would keep the marginal productivity of labor always equal to the wage rate. Conversely, if the slope of the labor demand curve depends on the employ- ment level and/or on Z , then the average of Ng and Nb that satisfies (3.16) for H + F > 0, and thus (3.15), is not equal to the average of the employment
    levels that satisfy the same relationships for H = F = 0. The mechanism by
    which nonlinearities with respect to N generate mean effects, even in the
    case where r = 0, is similar to the one encountered in the discussion of the
    effects of uncertainty on investment in Chapter 2. If y = Ï(N; Z ) is a convex
    function in its first argument, then the inverse Ï−1( y; Z ) is also convex, so that
N = Ï−1( y; Z ). For each given value of Z , therefore, Jensen’s inequality implies
    that
[Ï(x ; Z ) + Ï( y; Z )]/2 > Ï( (x + y)/2 ; Z ).
    As illustrated in Figure 3.3, this means that, if deviations from the wage in
    (3.16) occurred around a stable marginal revenue product of labor function,
that function’s convexity would imply that employment fluctuations average to a higher level, because the fall in employment associated with a given increase in the required marginal product is smaller in absolute value than the rise in employment associated with a symmetric decrease.
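A quick numerical check of this Jensen-type effect is given below; the isoelastic schedule and all parameter values are illustrative assumptions, and the r = 0 conditions (3.16) are applied around a fixed marginal revenue product curve, so that only its curvature matters.

    # Minimal sketch: with r = 0, conditions (3.16) set the required marginal
    # revenue product at w + p(F+H) and w - p(F+H). Assuming an illustrative
    # convex schedule mu(N) = A * N**(-b), compare average employment with and
    # without turnover costs.
    A, b, w, p = 1.5, 0.5, 1.0, 0.3

    def employment_at(required_mu):
        # Invert A * N**(-b) = required_mu.
        return (A / required_mu) ** (1.0 / b)

    for total_cost in (0.0, 0.2):                    # total turnover cost F + H
        N_low = employment_at(w + p * total_cost)    # higher required product, lower N
        N_high = employment_at(w - p * total_cost)   # lower required product, higher N
        print(f"F+H={total_cost:.1f}: N={N_low:.3f} and {N_high:.3f}, "
              f"average={(N_low + N_high) / 2:.3f}")

With this convex schedule the average rises slightly once F + H > 0, in line with the Jensen argument; a concave schedule would push the average in the opposite direction.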
    Exercise 26 Suppose that r = 0, so that (3.15) holds, and that Ï(N, Zg ) =
    f ( Zg ) + ‚(N) and Ï(N, Zb ) = g ( Zb ) + ‚(N) for a decreasing function ‚(·)
    which does not depend on Z . Discuss the relationship between variations of
    employment and its average level.
    In general, the functional form of the labor demand function need not be
    constant and may depend on the average conditions of the labor market. The
    shape of labor demand may depend not only on N, but also on Z . Hence,
    Jensen’s inequality does not suffice to pin down an unambiguous relationship
    between the convexity of the demand function in each of the states and the
    average level of employment. State dependency of the functional form of
    labor demand is therefore an additional (and ambiguous) element in the
    determination of average employment.

    Figure 3.3. Nonlinearity of labor demand and the effect of turnover costs on average
    employment, with r = 0
    Exercise 27 Consider the case where Ï(N, Zg ) = Zg − ‚N and Ï(N, Zb ) =
    Zb − „N, and where ‚ and „ satisfy ‚ > „ > 0. What is the general effect of
    firing costs on the average employment level? And what is its effect in the limit
    case with r = 0? Why can’t we analyze this effect in the limit case with „ = 0 as
    in exercise 21?
    3.3.2. AVERAGE PROFITS
    In summary, average employment is very mildly and ambiguously related to
    turnover costs and, in particular, to firing costs. This is consistent with empir-
    ical evidence across countries characterized by differently stringent employ-
    ment protection legislation, in that it is hard to find convincing effects of such
    legislation on average long-run unemployment when other relevant factors
    (such as the upward pressure on wages exercised by unions) are appropriately
    taken into account (see Bertola 1990, 1999, and references therein).
    If not for employment levels, one can obtain unambiguous results for the
    average profits of the firm, or, more precisely, the average of the objective
    function in (3.2). Defined in this chapter as the surplus of the revenues of
    the firm over the total cost of labor, that function could obviously also include
    costs that are not related to labor, like the compensation of other factors of
    production. The negative slope of the demand curve for labor implies that
    a firm’s revenues would exceed the costs of labor in a static environment if
    all units of labor were paid according to marginal productivity (the striped
    area in Figure 3.4). Since total revenues correspond to the area below the
    marginal revenue curve, this surplus is given by the dotted area in Figure 3.4.

    Figure 3.4. The employer’s surplus when marginal productivity is equal to the wage
    The same negative slope guarantees that the dynamic optimization problem
    studied above has a well defined solution, and that the firm’s surplus is smaller
    when turnover costs are larger—not only when these costs are associated with
    a lower average employment level, but also when the adjustment costs induce
    an increase in the average employment level of the firm.
    To illustrate these (general) results, we shall consider the simple case of a
linear demand curve for labor: with Ï(N, Z ) = Z − ‚N, the total revenues associated with given values of Z and N are simply given by R(N, Z ) = Z N − (1/2)‚N². Since the surplus R(N, Z ) − w N is maximized when N = N∗ = ( Z −
    w)/‚ and the marginal return from labor coincides with the wage, the first-
    order term is zero in a Taylor expansion of the surplus around the optimum.
    In the case considered here, all terms of order three and above are also zero,
    and from
R(N, Z ) − w N = R(N∗, Z ) − w N∗ + (1/2) ∂²[R(N, Z ) − w N]/∂ N² |N=N∗ (N − N∗)²,
we can conclude that the choice of an employment level N ≠ N∗ implies a loss of surplus equal to (1/2)‚(N − N∗)².
    As a result of hiring and firing costs, firms choose employment levels that
    differ from those that maximize the static optimality conditions and thus
    accept lower flow returns. In the case examined here, the marginal produc-
    tivity of labor is a linear function and optimal employment levels can easily be

    derived from (3.9):
Ng = ( Zg − w − [ p F + (r + p) H ]/(1 + r ) ) / ‚,
Nb = ( Zb − w + [ (r + p) F + p H ]/(1 + r ) ) / ‚.
Hence, the surplus falls short of the static optimum by a quantity equal to ( [ p F + (r + p) H ]/(1 + r ) )² · 1/(2‚) in the strong case, and by ( [ (r + p) F + p H ]/(1 + r ) )² · 1/(2‚) in the weak case.
    to save expenses on hiring and firing costs. But even though firms correctly
    weigh the marginal loss of revenues and the costs of turnover, the firm does
    experience the lower revenues and adjustment costs. Hence, both average
    profits and the optimized value of the firm are necessarily lower in the presence
    of turnover costs, and this can have adverse implications for the employers’
    investment decisions.
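A small numerical check of these surplus losses is sketched below for the linear specification of this subsection; parameter values are illustrative, and the hiring and firing payments themselves are left out, so the figures understate the total reduction in profits.

    # Minimal sketch: per-period surplus losses 0.5*beta*(N - N*)**2 implied by
    # the expressions above, for the linear case  mu(N, Z) = Z - beta*N.
    beta, r, p = 1.0, 0.05, 0.3

    def surplus_losses(H, F):
        wedge_g = (p * F + (r + p) * H) / (1 + r)   # shadow-wage wedge, strong state
        wedge_b = ((r + p) * F + p * H) / (1 + r)   # shadow-wage wedge, weak state
        return wedge_g ** 2 / (2 * beta), wedge_b ** 2 / (2 * beta)

    for H, F in [(0.0, 0.0), (0.2, 0.0), (0.2, 0.2)]:
        loss_g, loss_b = surplus_losses(H, F)
        print(f"H={H:.1f} F={F:.1f}: strong-state loss={loss_g:.4f}, "
              f"weak-state loss={loss_b:.4f}, average={(loss_g + loss_b) / 2:.4f}")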
    3.4. Adjustment Costs and Labor Allocation
    In this section we shift attention from the firms to workers, and we analyze
    the factors that determine the equilibrium value of wages in this dynamic
    environment. If the entire aggregate demand for labor came from a single firm,
    then wages and aggregate employment should fluctuate along a curve that is
    equally “representative” of the supply side of the labor market. Looking at the
    implications of hiring and firing restrictions from this aggregate perspective
    suggests that the increased stability of wages and employment around a more
    or less stable average may or may not be desirable for workers. Moreover,
    these costs reduce the surplus of firms, which in turn may have a negative
    impact on investment and growth. Here readers should remember the results
    of Chapter 2, which showed that a higher degree of uncertainty increased
    firms’ willingness to invest as long as labor was flexible. Conversely, the rigidity
    of employment due to turnover costs can therefore be expected to reduce
    investment.

    Obviously, however, it is not very realistic to interpret variations in aggre-
    gate employment in terms of a more or less intense use of labor by a represen-
    tative agent. In fact, real wages are more or less constant along the business
    cycle, making it very difficult to interpret the dynamics of employment in
    terms of the aggregate supply of labor. Moreover, unemployment is typically
    concentrated within some subgroups of the population. Higher firing costs are
    associated with a smaller risk of employment loss and therefore have impor-
    tant implications when, as is realistic, losing one’s job is painful (because real
    wages do not make agents indifferent to employment). In order to concentrate
    on these disaggregate aspects, it is instructive to consider the implications of
    adjustment costs for the flow of employment between firms subject to the type
    of demand shocks analyzed above. To abstract from purely macroeconomic
    phenomena, it is useful to assume that there is such a large number of firms
    that the law of large numbers holds, so that exactly half of the firms are
    in the good state in any period. The same arguments used to compute the
    ergodic distribution of a single firm imply that, if the transition probability
    is the same for all firms, and if transitions are independent events, then the
    aggregate distribution of firms is stable over time. In fact, if we denote the
    share of firms with a strong demand by Pt , then a fraction p Pt of these firms
    will move to the state with a low demand. Hence, if the transitions of firms
are independent events, the effective share of firms that is hit by a decline
in demand approaches its expected value as the number of firms grows large.35
    Symmetrically, we can expect that a share p of the 1 − Pt firms in the bad state
receive a positive shock. The inflow of firms into the ranks of firms with strong
product demand is thus equal to a share p(1 − Pt) = p − pPt of the total number of firms
    if the latter is infinitely large. Since Pt diminishes in proportion to p Pt and
    increases in proportion to p(1 − Pt ), the variation in the fraction Pt of firms
    with strong product demand is given by
    Pt +1 − Pt = p − p Pt − p Pt = p(1 − 2 Pt ). (3.17)
    This expression is positive if Pt < 0.5, negative if Pt > 0.5, and equal to zero if
    Pt = P∞ = 0.5. Hence, the frequency distribution of a large number of firms
    tends to stabilize at P = 0.5, as does the probability distribution of a single
    firm (discussed in the chapter’s appendix).
    Exercise 28 What is the role of p in (3.17)? Discuss the case p = 0.5.
³⁵ Imagine that the "relevant states of nature" are represented by the outcome of a series of coin
tosses. Associate the value one with the outcome "heads" and zero with "tails," so that the resulting
random variable X has expected value 1/2 and variance 1/4. The fraction of X_i = 1 in n tosses,
P_n = Σ^n_{i=1} X_i / n, has expected value 1/2 and, if the realizations are independent, its variance
(1/n²)(n/4) = 1/(4n) decreases with n. Hence, in the limit as n → ∞, the variation is zero and
P_∞ = 0.5 with certainty.

    Figure 3.5. Dynamic supply of labor from downsizing firms to expanding firms, without
    adjustment costs
    This analogy between the probability and frequency distributions is valid
    whenever a large population of agents faces “idiosyncratic uncertainty,” and
    not just in the simple case described above. The “idiosyncratic” character of
    uncertainty means that individual agents are hit by independent events. With
    a large enough number of agents, the flows into and out of a certain state will
    then cancel each other out and the frequency distribution of these states will
    tend to converge to a stable distribution.
    Exercise 29 Assume that the probability of a transition from b to g is still given
    by p, while the probability of a transition in the opposite direction is now allowed
to be q ≠ p. What is the steady-state proportion of firms in state g?
    In the steady state with idiosyncratic uncertainty, in which Pt +1 = Pt = 0.5,
    each time a firm incurs a negative shock, another firm will incur a positive
    shock to labor productivity. Notice that this does not rely on a causal relation-
    ship between these events. That is, given that the demand shocks are assumed
    to be idiosyncratic, the above simultaneity does not refer to a particular other
    firm. We do not know which particular firm is hit by a symmetric shock,
but we do know that there are as many firms with strong as with weak product
demand. It is therefore the relative size of these two groups that is constant
    over time, while the identity of individual firms belonging to each group
    changes over time.
    As before, the downward-sloping curves in Figure 3.5 correspond to the
    two possible positions of the demand curve for labor. Owing to the linearity
    of these curves, we can directly translate predictions in terms of wages into

    predictions about employment, abstracting from relatively unimportant
    effects deriving from Jensen’s inequality. The length of the horizontal axis
    represents the total labor force that is available to firms. The workers who
    are available for employment within a hiring firm are those who cannot find
    employment elsewhere—and, in particular, those who decided to leave their
    jobs in firms that are hit by a negative shock and are firing workers. The
    dotted line in the figure represents the labor demand by one such firm which
    is measured from right to left, that is in terms of residual employment after
    accounting for employment generated by firms with a strong demand.
    The workers who move from a shrinking firm to an expanding firm lose
    their employment in the first firm. The alternative wage of workers who are
    hired by expanding firms is therefore given by the demand curve for labor of
    downsizing firms, which essentially plays the role of an aggregate supply curve
    of labor. Hence, in the absence of firing costs, the equilibrium will be located
    at point E ∗ in Figure 3.5, at which the marginal productivity is the same in all
    firms and is equal to the common wage rate w.
    3.4.1. DYNAMIC WAGE DIFFERENTIALS
    As noted in the introduction to this chapter, it is certainly not very realistic to
    assume that labor mobility is costless for workers. Therefore we shall assume
here that workers need to pay a cost κ each time they move to a new job.
In reality, these costs could correspond at least partly to a loss of income
(unemployment); however, for simplicity we shall assume that labor mobility
is instantaneous. The objective in the dynamic optimization program of workers
is to maximize the net expected income from work, given by the wage w_t
in periods in which the worker remains with her current employer, and by
w_t − κ in the other periods. Denoting the net expected value of labor income
(or "human capital") of individual j by W^j_t implies the following relation:

\[
W^j_t =
\begin{cases}
w^j_t + \dfrac{1}{1+r}\,E_t\big(W^j_{t+1}\big) & \text{if she does not move,}\\[2mm]
w^j_t - \kappa + \dfrac{1}{1+r}\,E_t\big(W^j_{t+1}\big) & \text{if she moves to a new job.}
\end{cases}
\tag{3.18}
\]
    Notice that each individual worker can be in two states only. At the begin-
    ning of a period a worker may be employed by a firm with a strong demand for
    labor, in which case the worker can earn wg without having to incur mobility
    costs. Since a firm in state g may receive a negative shock with probability p,
    the human capital Wg of each of its workers satisfies the following recursive
    relationship:
\[
W_g = w_g + \frac{1}{1+r}\big[\,pW_b + (1-p)W_g\,\big], \tag{3.19}
\]

    where Wb denotes the human capital of a worker employed by a firm with
    weak demand. The human capital of these workers satisfies the relationship
\[
W_b = w_b + \frac{1}{1+r}\big[\,pW_g + (1-p)W_b\,\big] \tag{3.20}
\]
    if the worker chooses to remain with the same firm. In this case the worker
    earns a wage wb , which, as we will see, is generally lower than wg . Because
    a transition to the bad state is accompanied by a wage reduction, it pays
    the worker to consider a move to a firm in the good state. In the long-run
    equilibrium there is a constant fraction of these firms in the economy. Hence,
    each time a firm incurs a negative shock, there is another firm that incurs
    a positive shock and will be willing to hire the workers who choose to leave
    their old firm. For these workers, (3.18) implies that
\[
W_b = w_g - \kappa + \frac{1}{1+r}\big[(1-p)W_g + pW_b\big]. \tag{3.21}
\]
Moving to a firm in the good state g entails a cost κ but, since the move is instant-
    aneous, it immediately entitles the worker to a wage wg and to consider the
    future from the perspective of a firm with strong demand—which is different
    from the firms considered in (3.20), since the probability is 1 − p rather
    than p that state g will be realized next period. Since the option to move is
    available to all workers, the two alternatives considered in (3.20) and (3.21)
    need to be equivalent; otherwise there would be an arbitrage opportunity
    inducing all or none of the workers to move. Both of these outcomes would be
    inconsistent with equilibrium. From the equality between (3.20) and (3.21),
    we can immediately obtain
\[
w_g - w_b = \kappa - \frac{1-2p}{1+r}\,(W_g - W_b). \tag{3.22}
\]

If p = 0.5, the wage differential between expanding and shrinking firms is
exactly equal to κ, the cost for a worker of moving between any two firms
in a period. But if p < 0.5, that is, if shocks to demand are persistent, then (3.22) takes into account the capital gains Wg − Wb from mobility. Subtracting (3.20) from (3.19) and using (3.22), we obtain

\[
W_g - W_b = \kappa. \tag{3.23}
\]

In equilibrium, the cost of mobility needs to be equal to the gain in terms of higher future income. Substituting (3.23) into (3.22), we obtain an explicit expression for the difference between the flow wages in the two states:

\[
w_g - w_b = \frac{2p + r}{1 + r}\,\kappa. \tag{3.24}
\]

As mentioned above, firms in the good state pay a higher wage if mobility is voluntary and costly for workers. Equilibrium is illustrated in Figure 3.6: in order to offer a higher salary, firms in state g employ fewer workers than in Figure 3.5, where we assumed that labor mobility was costless. Intuitively, workers are willing to bear the cost κ only if there are advantages associated with mobility, and the market can offer this advantage in the form of a higher wage.

Figure 3.6. Dynamic supply of labor from downsizing firms to expanding firms, without employers' adjustment costs, if mobility costs κ per unit of labor

As in the case of the hiring and firing costs for firms, workers face a trade-off between maximizing static flow income (which would be obtained at point E* in Figure 3.5) and minimizing the costs of mobility (which would obviously be zero if employment at each firm were completely stable, and if we considered a uniform allocation of labor across firms without taking into account the differences in idiosyncratic productivity). The equilibrium allocation illustrated in Figure 3.6, in which (3.24) is satisfied, balances these two requirements: the shaded area represents the loss of flow output in each period, and it exactly offsets the mobility costs that would have to be incurred to move the economy closer to E*.

This modeling perspective has interesting empirical implications. Wage dispersion should be more pronounced in situations of higher uncertainty, for given workers' mobility costs, and when workers bear larger mobility costs. Bertola and Ichino (1995) and Bertola and Rogerson (1997) find that these implications offer useful insights when comparing disaggregate wage and employment dispersion statistics from different countries and different periods.

From a methodological point of view, it is important to note that the relationships between the various endogenous variables implied by the optimal dynamic mobility choices of workers satisfy no-arbitrage conditions of a financial nature. If a single worker intends to maximize the net expected value of her future income, then the labor market needs to offer workers who decide to move the appropriate increase in wages to make this "investment" profitable.

APPENDIX A3: (TWO-STATE) MARKOV PROCESSES

We now illustrate some of the techniques that are applicable to stochastic processes of the form

\[
x_{t+1} =
\begin{cases}
x_b & \text{with prob. } p \text{ if } x_t = x_g,\ \text{and with prob. } 1-p \text{ if } x_t = x_b,\\
x_g & \text{with prob. } p \text{ if } x_t = x_b,\ \text{and with prob. } 1-p \text{ if } x_t = x_g,
\end{cases}
\tag{3.A1}
\]

the simple Markov chain that describes the evolution of all of the endogenous and exogenous variables in this chapter.

A3.1. Conditional probabilities

Let P_{t,t+i} = Prob_t(x_{t+i} = x_g | I_t) denote the probability, based on all the information available at time t, that the realization (or "the actual level") of the process at t + i equals x_g.
From (3.A1) it is clear that

\[
P_{t,t+1} =
\begin{cases}
p & \text{if } x_t = x_b,\\
1-p & \text{if } x_t = x_g.
\end{cases}
\tag{3.A2}
\]

Figure A3.1 illustrates how we can compute this probability for i > 1. If the process
starts from x_g at t = 1, probability P_{1,3} is given by the sum of the probabilities of the two
paths that are consistent with x_3 = x_g: the first one, which is constant, has probability (1 − p)²;
the second one, in which we observe two consecutive variations of opposite sign, has probability p².
Hence, P_{1,3} = (1 − p)² + p² = 1 − 2p(1 − p) if x_1 = x_g. Similar reasoning implies
that P_{1,3} = 2p(1 − p) if x_1 = x_b.
    Figure A3.1. Possible time paths of a two-state Markov chain
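The two-step probability just computed can also be verified by brute force, summing over all
paths of the chain; a minimal sketch (with an arbitrary value of p):

from itertools import product

p = 0.3   # illustrative transition probability

def step_prob(x_now, x_next):
    # one-step transition probabilities of the chain (3.A1)
    return p if x_next != x_now else 1 - p

# P_{1,3} starting from x_1 = 'g': sum the probabilities of paths (x_2, x_3) ending in 'g'
P_13 = sum(step_prob('g', x2) * step_prob(x2, x3)
           for x2, x3 in product('gb', repeat=2) if x3 == 'g')
print(P_13, (1 - p) ** 2 + p ** 2)   # both equal 1 - 2p(1-p)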

    Using similar techniques, we could calculate the probability of observing xg at each
    date i > 2, starting from xb or xg . However, it is not necessary to do so in order to
    understand that all conditional probabilities from the point of view of period t are
    functions of xt . In fact,
\[
P_{t,t+1} \equiv \mathrm{Prob}_t(x_{t+1} = x_g \mid I_t) =
\begin{cases}
p & \text{if } x_t = x_b,\\
1-p & \text{if } x_t = x_g,
\end{cases}
\]

and the two possible values of P_{t,t+1} are different if p ≠ 1 − p, that is, if p ≠ 0.5.
    Conversely, any other information available at t is irrelevant for the evaluation of both
    Probt (xt +1 = xg ) and Pt,t +i for i > 1. Since the transition probabilities in (3.A1) are
    valid between t + 1 and t + 2,
    Pt,t +2 = (1 − p) Pt,t +1 + p(1 − Pt,t +1) = (1 − 2 p) Pt,t +1 + p (3.A3)
    depends only on Pt,t +1, which in turn depends only on xt (or is constant, and equal to
    0.5, if p = 0.5). Equation (3.A3) can be generalized to any pair of dates, in the form
    Pt,t +i +1 = (1 − p) Pt,t +i + p(1 − Pt,t +i ) = (1 − 2 p) Pt,t +i + p. (3.A4)
Even if we extend the length of the chain beyond time t + 2, all probabilities P_{t,t+i} are
still functions of x_t only. (Thus, the process is Markovian in levels.)
    A3.2. The ergodic distribution
    Using equation (3.A4), we can characterize the dynamics of the conditional probabil-
    ities for any future period. We write
\[
P_{t,t+i+1} - P_{t,t+i} = (1 - 2P_{t,t+i})\,p
\begin{cases}
> 0 & \text{if } P_{t,t+i} < 0.5,\\
< 0 & \text{if } P_{t,t+i} > 0.5.
\end{cases}
\tag{3.A5}
\]
    Evaluating the probability that the process is in state g for ever increasing values of i ,
    that is for periods increasingly further away in the future, we find that this probability
    decreases if it is above 0.5, and increases if it is below 0.5. Hence, with time the
    probability Pt,t +i converges monotonically to its “ergodic” value Pt,∞ = 0.5.
    A3.3. Iterated expectations
    The conditional expectation of xt +i at date t for each i ≥ 0 is given by
    Et [xt +i ] = xg Pt,t +i + xb (1 − Pt,t +i ) = xb + (xg − xb ) Pt,t +i , (3.A6)
    which depends only on the current value of the process if (3.A1) is satisfied.
    We can use (3.A1) again to obtain the relationship between Pt,t +i and Pt +1,t +i , that is
    between the probabilities that are assigned at different moments in time to realizations
    of xt +i within the same future period. As we saw above, the realization of xt +1 is in

    general not relevant for the probability of xt +i = xg from the viewpoint of t + 1. From
    the viewpoint of period t , the probabilities of the same event can be written as
\[
P_{t,t+i} = \big(P_{t+1,t+i} \mid x_{t+1} = x_b\big)\,P(x_{t+1} = x_b \mid I_t)
+ \big(P_{t+1,t+i} \mid x_{t+1} = x_g\big)\,P(x_{t+1} = x_g \mid I_t) \tag{3.A7}
\]

(where P(x_{t+1} = x_g | I_t) = 1 − p if x_t = x_g, and so forth). This allows us to verify the
validity of the law of iterated expectations in this context. For i ≥ 2, we write
    Et +1[xt +i ] = xb + (xg − xb ) Pt +1,t +i . (3.A8)
    At date t + 1, the probability on the right-hand side of (3.A8) is given, while at time t
    it is not possible to evaluate this probability with certainty: it could be ( Pt +1,t +i |xt +1 =
    xb ), or ( Pt +1,t +i |xt +1 = xg ), depending on the realization of xt +1. Given the uncertainty
    associated with this realization, from the point of view of time t the conditional
expectation E_{t+1}[x_{t+i}] is itself a random variable, and we can therefore calculate its
    expected value:
\[
E_t\big[E_{t+1}[x_{t+i}]\big] = P(x_{t+1} = x_b \mid I_t)\,E_{t+1}[x_{t+i} \mid x_{t+1} = x_b]
+ P(x_{t+1} = x_g \mid I_t)\,E_{t+1}[x_{t+i} \mid x_{t+1} = x_g].
\]

Inserting (3.A8), using (3.A7), and recalling (3.A6), it follows that

\[
E_t[x_{t+i}] = x_b + (x_g - x_b)\,P_{t,t+i} = E_t\big[E_{t+1}[x_{t+i}]\big].
\]
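The law of iterated expectations can also be verified numerically for this chain. The sketch below
compares E_t[x_{t+2}] computed directly from (3.A4) and (3.A6) with the average of E_{t+1}[x_{t+2}]
over the possible realizations of x_{t+1}; the parameter values are arbitrary.

p = 0.3                  # transition probability (illustrative)
x_g, x_b = 1.0, 0.0      # values of the process in the two states

def next_P(P):
    # one-step update of Prob(x = x_g), equation (3.A4)
    return (1 - 2 * p) * P + p

# Start from x_t = x_g, so Prob_t(x_{t+1} = x_g) = 1 - p
P_t1 = 1 - p
P_t2 = next_P(P_t1)
E_direct = x_b + (x_g - x_b) * P_t2                 # E_t[x_{t+2}] via (3.A6)

# Average E_{t+1}[x_{t+2}] over the realization of x_{t+1}
E_if_g = x_b + (x_g - x_b) * (1 - p)
E_if_b = x_b + (x_g - x_b) * p
E_iterated = P_t1 * E_if_g + (1 - P_t1) * E_if_b
print(E_direct, E_iterated)                          # identical values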
    EXERCISES
    Exercise 30 Consider the production function
\[
F(k, l; \alpha) = (k + l)\,\alpha - \frac{\beta}{2}\,l^2 - \frac{\gamma}{2}\,k^2.
\]
    (a) Suppose a firm with that production function has given capital k = 1, can hire l
costlessly, pays given wage w = 1, and must pay F = 1 for each unit of l fired. If αt
    takes the values 4 or 2 with equal probability p = 0.5, and future cash flows are
    discounted at rate r = 1, what is the optimal dynamic employment policy?
(b) Suppose capital depreciates at rate δ = 1 and can be costlessly adjusted to ensure
that its marginal product is equal to the cost of funds r + δ. Does capital adjust-
    ment change the optimal employment pattern? What are the optimal levels of
capital when αt = 4 and when αt = 2?
    Exercise 31 Consider a labor market in which firms have a linear demand curve for labor
subject to parallel oscillations, with marginal revenue of labor Z − βN. As in the main text, Z can take two
    values, Zb and Zg > Zb , and oscillates between these values with transition probability
    p. Also, the wage oscillates between two values, wb and wg > wb , and the oscillations of
    the wage are synchronized with those of Z .

    (a) Calculate the levels of employment Nb and Ng that maximize the expected dis-
    counted value of the revenues of the firm if the discount rate is equal to r and if the
    unit hiring and firing costs are given by H and F respectively.
(b) Compute the mobility cost κ at which the optimal mobility decisions are consistent
with a wage differential Δw = wg − wb when workers discount their future
    expected income at rate r .
    (c) Assume that the labor market is populated by 1,000 workers and 100 firms of
    which exactly half are in a good state in each period. What levels of the wage wb
are compatible with full employment (with wg = wb + Δw as above), under the
    hypothesis that labor mobility is instantaneous?
Exercise 32 Suppose that the marginal productivity of labor is given by Z − βN, and
that the indicator Zt can assume three rather than two values {Zb, ZM, Zg},
with Zb < ZM < Zg, where the realizations of Zt are independent, while the wage rate is constant and equal to w̄ in each period. Finally, hiring and firing costs are given by H and F respectively. What form does the recursive relationship

\[
\lambda(Z_t, N_t) = (Z_t - \beta N_t) - \bar w + E_t\big[\lambda(Z_{t+1}, N_{t+1})\big]
\]

take if the parameters are such that only fluctuations from Zb to Zg or vice versa induce the firm to adjust its labor force, while the employment level is unaffected by fluctuations from and to the average level of labor demand (from Zb to ZM or vice versa, or from ZM to Zg or vice versa)? Which are the two employment levels chosen by the firm?

FURTHER READING

Theoretical implications of employment protection legislation and firing costs are potentially much wider than those illustrated in this chapter. For example, Bertola (1994) discusses the implications of increased rigidity (and less efficiency) in models of growth like the ones that will be discussed in the next chapter, using a two-state Markov process similar to the one introduced in this chapter but specified in a continuous-time setting where state transitions are described as Poisson events of the type to be introduced in Chapter 5.

Economic theory can also explain why employment protection legislation is imposed despite its apparently detrimental effects. Using models similar to those discussed here, Saint-Paul (2000) considers how politico-economic interactions can rationalize labor market regulation and resistance to reforms, and Bertola (2004) shows that, if workers are risk-averse, then firing costs may have beneficial effects: redundancy payments not only can remedy a lack of insurance but also can foster efficiency if they allow forward-looking mobility decisions to be taken on a more appropriate basis.

Of course, job security provisions are only one of the many institutional features that help explain why European labor markets generate lower employment than American ones. Union behavior and taxation play important roles in determining high-wage, low-employment outcomes. And macroeconomic shocks interact in interesting ways with wage and employment rigidities in determining the dynamics of employment and unemployment across the Atlantic and within Europe. For economic and empirical analyses of the European unemployment problem from an international comparative perspective, see Bean (1994), Alogoskoufis et al. (1995), Nickell (1997), Nickell and Layard (1999), Blanchard and Wolfers (2000), and Bertola, Blau, and Kahn (2002), which all include extensive references.

REFERENCES

Alogoskoufis, G., C. Bean, G. Bertola, D. Cohen, J. Dolado, and G. Saint-Paul (1995) Unemployment: Choices for Europe, London: CEPR.
Bean, C. (1994) "European Unemployment: A Survey," Journal of Economic Literature, 32, 573–619.
Bentolila, S., and G. Bertola (1990) "Firing Costs and Labor Demand: How Bad is Eurosclerosis?" Review of Economic Studies, 57, 381–402.
Bertola, G. (1990) "Job Security, Employment and Wages," European Economic Review, 34, 851–886.
Bertola, G. (1992) "Labor Turnover Costs and Average Labor Demand," Journal of Labor Economics, 10, 389–411.
Bertola, G. (1994) "Flexibility, Investment, and Growth," Journal of Monetary Economics, 34, 215–238.
Bertola, G. (1999) "Microeconomic Perspectives on Aggregate Labor Markets," in O. Ashenfelter and D. Card (eds.), Handbook of Labor Economics, vol. 3B, 2985–3028, Amsterdam: North-Holland.
Bertola, G. (2004) "A Pure Theory of Job Security and Labor Income Risk," Review of Economic Studies, 71(1), 43–61.
Bertola, G., F. D. Blau, and L. M. Kahn (2002) "Comparative Analysis of Labor Market Outcomes: Lessons for the US from International Long-Run Evidence," in A. Krueger and R. Solow (eds.), The Roaring Nineties: Can Full Employment Be Sustained?, New York: Russell Sage, 159–218.
Bertola, G., and A. Ichino (1995) "Wage Inequality and Unemployment: US vs Europe," in B. Bernanke and J. Rotemberg (eds.), NBER Macroeconomics Annual 1995, 13–54, Cambridge, Mass.: MIT Press.
Bertola, G., and R. Rogerson (1997) "Institutions and Labor Reallocation," European Economic Review, 41, 1147–1171.
Blanchard, O. J., and J. Wolfers (2000) "The Role of Shocks and Institutions in the Rise of European Unemployment: The Aggregate Evidence," Economic Journal, 110, C1–C33.
Nickell, S. (1997) "Unemployment and Labor Market Rigidities: Europe versus North America," Journal of Economic Perspectives, 11(3), 55–74.
Nickell, S., and R. Layard (1999) "Labor Market Institutions and Economic Performance," in O. Ashenfelter and D. Card (eds.), Handbook of Labor Economics, vol. 3C, 3029–3084, Amsterdam: North-Holland.
Saint-Paul, G. (2000) The Political Economy of Labour Market Institutions, Oxford: Oxford University Press.

4 Growth in Dynamic General Equilibrium

The previous chapters analyzed the optimal dynamic behavior of single consumers, firms, and workers. The interactions between the decisions of these agents were studied using a simple partial equilibrium model (for the labor market). In this chapter, we consider general equilibrium in a dynamic environment. Specifically, we discuss how savings and investment decisions by individual agents, mediated by more or less perfect markets as well as by institutions and collective policies, determine the aggregate growth rate of an economy from a long-run perspective.

As in the previous chapters, we cannot review all aspects of a very extensive theoretical and empirical literature. Rather, we aim at familiarizing readers with technical approaches and economic insights about the interplay of technology, preferences, market structure, and institutional features in determining dynamic equilibrium outcomes. We review the relevant aspects in the context of long-run growth models, and a brief concluding section discusses how the mechanisms we focus on are relevant in the context of recent theoretical and empirical contributions in the field of economic growth.

Section 4.1 introduces the basic structure of the model, and Section 4.2 applies the techniques of dynamic optimization to this base model. The next two sections discuss how decentralized decisions may result in an optimal growth path, and how one may assess the relevance of exogenous technological progress in this case. Finally, in Section 4.5 we consider recent models of endogenous growth. In these models the growth rate is determined endogenously and need not coincide with the optimal growth rate.

The problem at hand is more interesting, but also more complex, than those we have considered so far. To facilitate analysis we will therefore emphasize the economic intuition that underlies the formal mathematical expressions, and aim to keep the structure of the model as simple as possible. In what follows we consider a closed economy. The national accounting relationship

\[
Y(t) = C(t) + I(t) \tag{4.1}
\]

between the flows of production (Y), consumption (C), and investment (I) therefore holds at the aggregate level.
Furthermore, for simplicity, we do not distinguish between flows that originate in the private and the public sectors.

The distinction between consumption and investment is based on the concept of capital. Broadly speaking, this concept encompasses all durable factors of production that can be reproduced. The supply of capital grows in proportion with investments. At the same time, however, existing capital stock is subject to depreciation, which tends to lower the supply of capital. As in Chapter 2, we formalize the problem in continuous time. We can therefore define the stock of capital, K(t) at time t, without having to specify whether it is measured at the beginning or the end of a period. In addition, we assume that capital depreciates at a constant rate δ. The evolution of the supply of capital is therefore given by

\[
\lim_{\Delta t \to 0} \frac{K(t+\Delta t) - K(t)}{\Delta t} \equiv \frac{dK(t)}{dt} \equiv \dot K(t) = I(t) - \delta K(t).
\]

The demand for capital stems from its role as an input in the productive process, which we represent by an aggregate production function, Y(t) = F(K(t), ...). This expression relates the flow of aggregate output between t and t + Δt to the stocks of production factors that are available during this period. In principle, these stocks can be measured for any infinitesimally small time period Δt. However, a formal representation of the aggregate production process in a single equation is normally not feasible. In reality, the capital stock consists of many different durable goods, both public and private. At the end of this chapter we will briefly discuss some simple models that make this disaggregate structure explicit, but for the moment we shall assume that investment and consumption can be expressed in terms of a single good as in (4.1). Furthermore, for simplicity we assume that "capital" is combined with only one non-accumulated factor of production, denoted L(t).

In what follows, we will characterize the long-run behavior of the economy. More precisely, we will consider the time period in which per capita income grows at a non-decreasing rate and in which the ratio between aggregate capital K and the flow of output Y tends to stabilize. The amount of capital per worker therefore tends to increase steadily. The case in which the growth rate of output and capital exceeds the growth rate of the population represents an extremely important phenomenon: the steady increase in living standards. But in this chapter our interest in this type of growth pattern stems more from its simplicity than from reality. Even though simple models cannot capture all features of world history, analyzing the economic mechanisms of a growing economy may help us understand the role of capital accumulation in the real world and, more generally, characterize the economic structure of growth processes.

4.1. Production, Savings, and Growth

The dynamic models that we consider here aim to explain, in the simplest possible way, on the one hand the relationship between investments and growth, and on the other hand the determinants of investments. The production process is defined by

\[
Y(t) = F(K(t), L(t)) = F(K(t), A(t)N(t)), \tag{4.2}
\]

where N(t) is the number of workers that participate in production in period t and A(t) denotes labor productivity; at time t each of the N(t) workers supplies A(t) units of labor. Clearly, there are various ways to specify the concept of productive efficiency in more detail.
The amount of work of an individual may depend on her physical strength, on the time and energy invested in production, on the climate, and on a range of other factors. However, modeling these aspects not only complicates the analysis, but also forces us to consider economic phenomena other than the ones that most interest us. To distinguish the role of capital accumulation (which by definition depends endogenously on savings and investment decisions) from these other factors, it is useful to assume that the latter are exogenous.

The starting point of our analysis is the Solow (1956) growth model. This model is familiar from basic macroeconomics textbooks, but the analysis of this section is relatively formal. We assume that L(t) grows at a constant rate g,

\[
\dot L(t) = gL(t), \qquad L(t) = L(0)e^{gt},
\]

and for the moment we abstract from any economic determinant of the level or the growth rate of this factor of production. Furthermore, we assume that the production function exhibits constant returns to scale, so that F(λK, λL) = λF(K, L) for any λ. The validity of this assumption will be discussed below in the light of its economic implications.

Formally, the assumption of constant returns to scale implies a direct relationship between the level of output and capital per unit of the non-accumulated factor, y(t) ≡ Y(t)/L(t) and k(t) ≡ K(t)/L(t). Omitting the time index t, we can write

\[
y = \frac{F(K, L)}{L} = \frac{L\,F(K/L, 1)}{L} = f(k),
\]

which shows that per capita production depends only on the capital/labor ratio. The accumulation of the stock of capital per worker is given by

\[
\dot k(t) = \frac{d}{dt}\left(\frac{K(t)}{L(t)}\right) = \frac{\dot K(t)L(t) - \dot L(t)K(t)}{L(t)^2} = \frac{\dot K(t)}{L(t)} - \frac{\dot L(t)}{L(t)}\,\frac{K(t)}{L(t)}.
\]

Since K̇(t) = I(t) − δK(t) and L̇(t) = gL(t), we thus get

\[
\dot k(t) = \frac{I(t)}{L(t)} - (g + \delta)k(t).
\]

Assuming that the economy as a whole devotes a constant proportion s of output to the accumulation of capital,

\[
C(t) = (1-s)Y(t), \qquad I(t) = sY(t),
\]

then I(t)/L(t) = sY(t)/L(t) = s y(t) = s f(k(t)), and thus

\[
\dot k(t) = s f(k(t)) - (g + \delta)k(t).
\]

The main advantage of this expression, which is valid only under the simplifying assumptions above, is that it refers to a single variable. For any value of k(t), the model predicts whether the capital stock per worker tends to increase or decrease, and using the intermediate steps described above one can fully characterize the ensuing dynamics of aggregate and per capita income. The amount of capital per worker tends to increase when

\[
s f(k(t)) > (g + \delta)k(t), \tag{4.3}
\]
    and to decrease when
\[
s f(k(t)) < (g + \delta)k(t). \tag{4.4}
\]

Having reduced the dynamics of the entire economy to the dynamics of a single variable, we can illustrate the evolution of the economy in a simple graph as shown in Figure 4.1. Clearly, the function s f(k) plays a crucial role in these relationships. Since f(k) = F(k, 1) and F(·) has constant returns to scale, we have

\[
f(\lambda k) = F(\lambda k, 1) \le F(\lambda k, \lambda) = \lambda F(k, 1) = \lambda f(k) \quad \text{for } \lambda > 1, \tag{4.5}
\]
    where the inequality is valid under the hypothesis that increasing L , the
    second argument of F (·, ·), cannot decrease production. Note, however, that
    the inequality is weak, allowing for the possibility that using more L may leave
production unchanged for some values of λ and k.
    If the inequality in (4.5) is strict, then income per capita tends to increase
    with k, but at a decreasing rate, and f (k) takes the form illustrated in the
figure. If a steady state k_ss exists, it must satisfy

\[
s f(k_{ss}) = (g + \delta)k_{ss}. \tag{4.6}
\]
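For a concrete (and entirely illustrative) example, assume a Cobb–Douglas technology
f(k) = k^α; the steady state of (4.6) can then be solved in closed form and compared with the
simulated path of k̇(t) = s f(k(t)) − (g + δ)k(t). All parameter values below are arbitrary.

alpha = 0.3                       # capital share (illustrative)
s, g, delta = 0.2, 0.02, 0.05     # saving rate, growth of L, depreciation

f = lambda k: k ** alpha

# Closed-form steady state from s*k^alpha = (g + delta)*k
k_ss = (s / (g + delta)) ** (1 / (1 - alpha))

# Euler-method simulation of k_dot = s*f(k) - (g + delta)*k from an arbitrary k(0)
k, dt = 0.5, 0.1
for _ in range(20000):
    k += dt * (s * f(k) - (g + delta) * k)
print(k_ss, k)    # the simulated path approaches the analytical steady state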

    Figure 4.1. Decreasing marginal returns to capital
    4.1.1. BALANCED GROWTH
The expression on the right in (4.3) defines a straight line with slope (g + δ).
In Figure 4.2, this straight line meets the function s f(k) at k_ss: for k < k_ss, k̇ = s f(k) − (g + δ)k > 0, and the stock of capital tends to increase towards
k_ss; for k > k_ss, on the contrary, k̇ < 0, and in this case k tends to decrease towards its steady-state value k_ss.

Figure 4.2. Steady state of the Solow model

The speed of convergence is proportional to the vertical distance between the two functions, and thus decreases in absolute value as k approaches its steady-state value. In the long run the economy will be very close to the steady state. If k ≈ k_ss ≠ 0, then k = K/L is approximately constant; given that

\[
\frac{d}{dt}\frac{K(t)}{L(t)} = \left(\frac{\dot K(t)}{K(t)} - \frac{\dot L(t)}{L(t)}\right)\frac{K(t)}{L(t)} \approx 0
\;\Rightarrow\;
\frac{\dot K(t)}{K(t)} \approx \frac{\dot L(t)}{L(t)},
\]

the long-run growth rate of K is close to the growth rate of L. Moreover, since F(K, L) has constant returns to scale, Y(t) will grow in the same proportion. Hence, in steady state the model follows a "balanced growth" path, in which the ratio between production and capital is constant. For the per capita capital stock and output, we can use the definition L(t) = A(t)N(t). This yields

\[
\frac{Y(t)}{N(t)} = \frac{Y(t)}{L(t)}\,\frac{L(t)}{N(t)} = f(k_t)A(t),
\qquad
\frac{K(t)}{N(t)} = k_t A(t).
\]

In terms of growth rates, therefore, we get the expression

\[
\frac{(d/dt)\big[Y(t)/N(t)\big]}{Y(t)/N(t)} = \frac{(d/dt)f(k_t)}{f(k_t)} + \frac{\dot A(t)}{A(t)}.
\]

When k_t tends to a constant k_ss, as in the figure above, then df(k_t)/dt = f'(k_t)k̇ tends to zero; only a positive growth rate Ȧ(t)/A(t) can allow long-run growth in the levels of per capita income and capital. In other words, the model predicts long-run growth of per capita income only when L grows over time and when this growth is at least partly due to an increase in A rather than an increase in the number of workers N. If we assume that the effective productivity of labor A(t) grows at a positive rate g_A, and that

\[
g \equiv \frac{\dot L}{L} = \frac{\dot A}{A} + \frac{\dot N}{N} = g_A + g_N,
\]

then the economy tends to settle on a balanced growth path with exogenous growth rate g_A: the only endogenous mechanism of the model, the accumulation of capital, tends to accompany rather than determine the growth rate of the economy.

A once-and-for-all increase in the savings ratio shifts the curve s f(k) upwards, as in Figure 4.3. As a result, the economy will converge to a steady state with a higher capital intensity, but the higher saving rate will have no effect on the long-run growth rate.

Figure 4.3. Effects of an increase in the savings rate

In particular, the accumulation of capital cannot sustain constant growth of income (whatever the value of s) if g = 0 and f''(k) < 0. For simplicity, consider the case in which L is constant and δ = 0. In that case,

\[
\frac{\dot Y}{Y} = \frac{f'(k)\dot k}{f(k)} = s f'(k), \tag{4.7}
\]

and an increase in k clearly reduces the growth rate of per capita income. Asymptotically, the growth rate of the economy is zero if lim_{k→∞} f'(k) = 0, or it reaches a positive limit if, for k → ∞, the limit of f'(k) = ∂F(·)/∂K is strictly positive.

Exercise 33 Retaining the assumption that s is constant, let δ > 0. How does the
    asymptotic behavior of Ẏ /Y depend on the value of limk→∞ f ′(k)?
    4.1.2. UNLIMITED ACCUMULATION
    Even if f ′(k) is decreasing in k, nothing prevents the expression on the left
of (4.3) from remaining above the line (g + δ)k for all values of k, implying
that no finite steady state exists (k_ss → ∞). For this to occur the following
condition needs to be satisfied:

\[
\lim_{k\to\infty} f'(k) \equiv f'(\infty) \ge \frac{g + \delta}{s}, \tag{4.8}
\]
    so that the distance between the functions does not diminish any further when
    k increases from a value that is already close to infinity.
Consider, for example, the case in which g = δ = 0: in this case the steady-
    state capital stock k is infinite even if limk→∞ f ′(k) = 0. This does not imply
    that the growth rate remains high, but only that the growth rate slows down
    so much that it takes an infinite time period before the economy approaches
    something like a steady state in which the ratio between capital and output
    remains constant. In fact, given that the speed of convergence is determined

by the distance between the two curves in Figure 4.2, which tends to zero in the
    neighborhood of a steady state, the economy always takes an infinite time
    period to attain the steady state. The steady state is therefore more like a
    theoretical reference point than an exact description of the final configuration
    for an economy that departs from a different starting position.
    Nevertheless, in the long-run a positive growth rate is sustainable if the
    inequality in (4.8) holds strictly:
\[
\lim_{k\to\infty} f'(k) \equiv f'(\infty) > \frac{g + \delta}{s}.
\]
If L is constant, and if there is no depreciation (δ = 0), the long-run growth
rate is

\[
\frac{\dot Y}{Y} = s f'(\infty) > 0,
\]
    and it is dependent on the savings ratio s and the form of the production
    function.
    Consider, for example, the case of a constant elasticity of substitution (CES)
    production function:
\[
F(K, L) = \big[\alpha K^{\lambda} + (1-\alpha)L^{\lambda}\big]^{1/\lambda}, \qquad \lambda \le 1. \tag{4.9}
\]

In this case we have

\[
f(k) = \big[\alpha k^{\lambda} + (1-\alpha)\big]^{1/\lambda}
\]

and thus

\[
f'(k) = \big[\alpha k^{\lambda} + (1-\alpha)\big]^{(1/\lambda)-1}\alpha k^{\lambda-1} = \alpha\big[\alpha + (1-\alpha)k^{-\lambda}\big]^{(1-\lambda)/\lambda}.
\]

If λ is positive, the term k^{−λ} tends to zero as k approaches infinity, and
lim_{k→∞} f'(k) = α·α^{(1/λ)−1} = α^{1/λ} > 0: hence, this production function
satisfies f'(∞) > 0 for 0 < λ ≤ 1. The production function (4.9) is also well defined for λ < 0. In this case, the term in parentheses tends to infinity and, since its exponent (1 − λ)/λ is negative, lim_{k→∞} f'(k) = 0.

For λ = 0 the functional form (4.9) raises unity to an infinitely large exponent, but it is still well defined. Taking logarithms, we get

\[
\ln(f(k)) = \frac{1}{\lambda}\ln\!\big(\alpha k^{\lambda} + (1-\alpha)\big).
\]

The limit of this expression as λ → 0 can be evaluated using l'Hôpital's rule, and is equal to the ratio of the limits of the derivatives of the numerator and the denominator with respect to λ. Using the differentiation rules d ln(x)/dx = 1/x and d y^x/dx = y^x ln y, the derivative of the numerator can be written as

\[
\big(\alpha k^{\lambda} + (1-\alpha)\big)^{-1}\big(\alpha k^{\lambda}\ln k\big),
\]

while the derivative of the denominator is equal to one. Since lim_{λ→0} k^{λ} = 1, the limit of the logarithm of f(k) is thus equal to α ln k, which corresponds to the logarithm of the Cobb–Douglas function k^α.

Exercise 34 Interpret the limit condition in terms of the substitutability between K and L. Assuming δ = g = 0, analyze the growth rate of capital and production in the case where λ = 1, and in the case where α = 1.

4.2. Dynamic Optimization

The model that we discussed in the previous section treated the savings ratio s as an exogenous variable. We therefore could not discuss the economic motivation of agents to save (and invest) rather than to consume, nor could we determine the optimality of the growth path of the economy. To introduce these aspects into the analysis, we will now consider the welfare of a representative agent who consumes an amount C(t)/N(t) ≡ c(t) in each period t. Suppose that the welfare of this agent at date zero can be measured by the following integral:

\[
U = \int_0^{\infty} u(c(t))\,e^{-\rho t}\,dt. \tag{4.10}
\]

The parameter ρ is the discount rate of future consumption; given ρ > 0, the
    agent prefers immediate consumption over future consumption. The function
    u(·) is identical to the one introduced in Chapter 1: the positive first derivative
    u′(·) > 0 implies that consumption is desirable in each period; however, the
marginal utility of consumption is decreasing in consumption, u''(·) < 0, which gives agents an incentive to smooth consumption over time.

The decision to invest rather than to consume now has a precise economic interpretation. For simplicity, we assume that g = 0, so that normalizing by population as in (4.10) is equivalent to normalizing by the labor force. Assuming that δ = 0 too, the accumulation constraint

\[
f(k(t)) - c(t) - \dot k(t) = 0 \tag{4.11}
\]

implies that higher consumption (for a given k(t)) slows down the accumulation of capital and reduces future consumption opportunities. At each date t, agents thus have to decide whether to consume immediately, obtaining utility u(c(t)), or to save, obtaining higher (discounted) utility in the future. This problem is equivalent to the maximization of the objective function (4.10) subject to the feasibility constraint (4.11). Consider the associated Hamiltonian,

\[
H(t) = \big[u(c(t)) + \lambda(t)\big(f(k(t)) - c(t)\big)\big]e^{-\rho t},
\]

where the shadow price is defined in current values. This shadow price measures the value of capital at date t and satisfies λ(t) = μ(t)e^{ρt}, where μ(t) measures the value at date zero. The optimality conditions are given by

\[
\frac{\partial H}{\partial c} = 0, \tag{4.12}
\]
\[
-\frac{\partial H}{\partial k} = \frac{d\big(\lambda(t)e^{-\rho t}\big)}{dt}, \tag{4.13}
\]
\[
\lim_{t\to\infty} \lambda(t)e^{-\rho t}k(t) = 0. \tag{4.14}
\]

4.2.1. ECONOMIC INTERPRETATION AND OPTIMAL GROWTH

Equations (4.12) and (4.13) are the first-order conditions for the optimal path of growth and accumulation. In this section we provide the economic intuition for these conditions, which we shall use to characterize the dynamics of the economy. The advantage of using the current-value shadow price λ(t) is that we can draw a phase diagram in terms of λ (or c) and k, leaving the time dependence of these variables implicit. From (4.12), we have

\[
u'(c) = \lambda. \tag{4.15}
\]

λ(t) measures the value in terms of utility (valued at time t) of an infinitesimal increase in k(t). Such an increase in capital can be obtained only by a reduction of current consumption. The loss of utility resulting from lower current consumption is measured by u'(c). For optimality, the two must be the same. In addition, we also have the condition

\[
\dot\lambda = \big(\rho - f'(k)\big)\lambda, \tag{4.16}
\]

which has an interpretation in terms of the valuation of a financial asset: the marginal unit of capital provides a "dividend" f'(k)λ, in terms of utility, and a capital gain λ̇. Expression (4.16) implies that the sum of the "dividend" and the capital gain is equal to the rate of return ρ multiplied by λ. This relationship guarantees the equivalence of the flow utilities at different dates, and we can interpret λ as the value of a financial asset (the marginal unit of capital).

An economic interpretation is also available for the "transversality" condition in (4.14): it imposes that either the stock of capital or its present value λ(t)e^{-ρt} (or both) must be equal to zero in the limit as the time horizon extends to infinity.

Combining the relationships in (4.15) and (4.16), we derive the following condition:

\[
\frac{d}{dt}u'(c) = \big(\rho - f'(k)\big)u'(c).
\]

Along the optimal path of growth and accumulation, the proportional growth rate of marginal utility is equal to ρ − f'(k), the difference between the exponential discount rate of utility and the marginal return on the resources accumulated in the form of capital. This condition is a Euler equation, like that encountered in Chapter 1.
(Exercise 36 asks you to show that it is indeed the same condition, expressed in continuous rather than discrete time.) Making the time dependence explicit and differentiating the left-hand side of this equation with respect to t yields

\[
\frac{d\,u'(c(t))}{dt} = u''(c(t))\,\frac{dc(t)}{dt}.
\]

Thus, we can write (omitting the time argument)

\[
\dot c = \left(\frac{u'(c)}{-u''(c)}\right)\big(f'(k) - \rho\big). \tag{4.17}
\]

Since the law of motion for capital is given by

\[
\dot k = f(k) - c, \tag{4.18}
\]

we can study the dynamics of the system in (c, k)-space.

4.2.2. STEADY STATE AND CONVERGENCE

The steady state of the system of equations (4.17) and (4.18), if it exists, satisfies

\[
f'(k_{ss}) = \rho, \qquad c_{ss} = f(k_{ss}).
\]

For the dynamics we make use of a phase diagram as in Chapter 2. On the horizontal axis we measure the stock of capital k (which now refers to the economy-wide capital stock rather than the capital stock of a single firm). On the vertical axis we measure consumption, c, rather than the shadow price of capital. (The two quantities are related one-to-one, as was the case for q and investment in Chapter 2.) If f(·) has decreasing marginal returns and in addition there exists a k_ss < ∞ such that f'(k_ss) = ρ, then we have the situation illustrated in Figure 4.4.

Figure 4.4. Convergence and steady state with optimal savings

Clearly, more than one initial consumption level c(0) can be associated with a given initial capital stock k(0). However, only one of these consumption levels leads the economy to the steady state: the dynamics are therefore of the saddlepath type which we already encountered in Chapter 2. Any other path leads the economy towards points where c = 0, or where k = 0 (which in turn implies that c = 0 if f(0) = 0 and if capital cannot become negative). Under reasonable functional-form restrictions the solution is unique, and one can show that only the saddlepath satisfies (4.14).

Exercise 35 Repeat the derivation, supposing that g_A = 0 but δ > 0, g_N > 0.
    Show that the system does not converge to the capital stock associated with
    maximum per capita consumption in steady state.
    4.2.3. UNLIMITED OPTIMAL ACCUMULATION
    In the above diagram the accumulation of capital cannot sustain an indefinite
    increase of labor productivity and of per capita consumption. However, as in
    the Solow model, the hypothesis that F (·) has constant returns to scale in
capital and labor does not necessarily imply that k_ss < ∞. In these cases one cannot speak of a steady state in terms of the levels of capital, consumption, and production. However, it is still possible that there exists a steady state in terms of the growth rates of these variables: that is, a situation in which the economy has a positive and non-decreasing long-run growth rate even in the absence of exogenous technological change.

Suppose for instance that f'(k) = b, a constant independent of k for all the relevant values of the capital stock. If the elasticity of marginal utility is constant, so that

\[
\frac{u''(c)\,c}{u'(c)} = -\sigma
\]

for all values of c, then we can rewrite (4.17) as

\[
\frac{\dot c(t)}{c(t)} = \frac{b - \rho}{\sigma}, \tag{4.19}
\]

and consumption increases (or decreases, if b < ρ and agents can disinvest) at a constant exponential rate. The utility function considered here is of the constant relative risk aversion (CRRA) type, given by

\[
u(c) = \frac{c^{1-\sigma}}{1-\sigma}, \qquad u'(c) = c^{-\sigma}, \qquad u''(c) = -\sigma c^{-\sigma-1}. \tag{4.20}
\]

The conditions u'(·) > 0, u''(·) < 0 are satisfied if σ > 0. If σ = 1, the func-
    tional form (4.20) is not well defined, but the marginal utility function u′(x ) =
    x −1 (which completely characterizes preferences) coincides with the derivative
of log(x): hence, for σ = 1 we can write u(c) = log(c). Given f'(k) = b, we can
write f(k) = bk + ζ, with ζ a constant of integration. From the law of motion
for capital,

\[
\dot k(t) = f(k(t)) - c(t) = \zeta + bk(t) - c(t),
\]

we can derive

\[
\frac{\dot k(t)}{k(t)} = \frac{\zeta}{k(t)} + b - \frac{c(t)}{k(t)}.
\]
If we focus on the case in which k(t) tends to infinity and ζ/k(t) to zero, we have

\[
\lim_{t\to\infty}\frac{\dot k(t)}{k(t)} = b - \lim_{t\to\infty}\frac{c(t)}{k(t)}. \tag{4.21}
\]
    The proportional growth rate of k then tends to a constant if k(t ) tends to
    grow at the same (exponential) rate as c (t ).
    One can show that this condition is necessarily true if the economy satisfies
    the transversality condition (4.14). With equation (4.20) for u(·), we get
\[
\lambda(t) = u'(c(t)) = (c(t))^{-\sigma}.
\]

Given that c(t) grows at a constant exponential rate, μ(t) = λ(t)e^{-ρt} has exponential dynamics.
    Now, consider (4.21). If c (t )/ k(t ) diminishes over time, then k(t ) grows
    at a more than exponential rate and the limit in (4.14) does not exist. If, on
    the contrary, c (t )/ k(t ) is growing, k̇(t )/ k(t ) becomes increasingly negative.
    As a result, k(t ) will eventually equal zero, and production, consumption, and
    accumulation will come to a halt—which is certainly not optimal, since for
    the case of (4.20) we have u′(0) = ∞.
    The first case corresponds to paths that hit or approach the vertical axis in
    Figure 4.4; the second corresponds to paths that hit or approach the horizontal
    axis. Hence, as in the case of the phase diagram, there is only one initial level

    of consumption that satisfies the transversality condition. (In fact, the phase
    diagram remains valid in a certain sense; however, the economy is always arbit-
    rarily far from the steady state.) The consumption/capital ratio is therefore
constant over time under our assumptions. Imposing

\[
\frac{\dot k}{k} = \frac{\dot c}{c}
\]

in (4.19) and in (4.21), we get

\[
c(t) = \frac{(\sigma - 1)b + \rho}{\sigma}\,k(t). \tag{4.22}
\]
Equation (4.22) implies that initial consumption is an increasing function
of b, the intertemporal rate of transformation, if σ > 1. In this case the income
effect of a higher b dominates the substitution effect, which induces capital
accumulation and hence tends to reduce the level of consumption. For σ = 1,
equation (4.20) is replaced by u(c) = ln(c), and the ratio c/k is equal to ρ and
does not depend on b.
Since y(t) = bk(t), savings are a constant fraction of income, as in the Solow model:

\[
s = 1 - \frac{(\sigma - 1)b + \rho}{\sigma b}.
\]

Nonetheless, in the model with optimization the savings ratio s is constant only if u(·) is given by (4.20) and if f(k) = bk, and not in more general cases. Moreover, s is not a given constant as in the Solow model: the savings ratio depends on the parameters that characterize utility (σ and ρ) and technology (b).
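A quick numerical reading of these expressions, with arbitrary parameter values (σ > 1 and
b > ρ, so that consumption grows and the saving ratio is positive):

b, rho, sigma = 0.06, 0.02, 2.0   # technology, discount rate, utility curvature (illustrative)

growth = (b - rho) / sigma                      # consumption growth rate, equation (4.19)
c_over_k = ((sigma - 1) * b + rho) / sigma      # consumption/capital ratio, equation (4.22)
s = 1 - c_over_k / b                            # saving ratio, since y = b*k
print(growth, s)   # here 2% growth and s = 1/3, constant but dependent on sigma, rho, and b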
    Having shown that capital grows at an exponential rate, we now return
    to (4.14). In order to satisfy this transversality condition, the growth rate of
    capital needs to be smaller than the rate at which the discounted marginal
    utility diminishes along the growth path. We thus have
\[
\frac{d}{dt}\left(\ln\!\big(c(t)^{-\sigma}e^{-\rho t}\big)\right) = -\sigma\,\frac{b-\rho}{\sigma} - \rho = -b.
\]

Since in the case considered here μ(t) = e^{-ρt}c(t)^{-σ} and f'(k) = b, this is a reformulation of condition (4.13).
In addition, we have k̇ = sy = sbk, where s is the savings ratio. The transversality condition is therefore satisfied if

\[
\frac{\dot k}{k} = sb < \left|\frac{d}{dt}\Big(\ln\big(c(t)^{-\sigma}e^{-\rho t}\big)\Big)\right| = b,
\]

or equivalently if s < 1. Hence, the propensity to save s = 1 − C/Y, which is implied by (4.22), must be smaller than one: this leads to the condition that

\[
0 < 1 - s = \frac{c(t)}{y(t)} = \frac{c(t)}{bk(t)} = \frac{(\sigma - 1)b + \rho}{b\sigma},
\]

which is equivalent to

\[
(1 - \sigma)b < \rho. \tag{4.23}
\]

If the parameters of the model violated (4.23), the steady-state growth path that we identified would not satisfy (4.14). But in that case the optimal solution would not be well defined, since the objective function (4.10) could take an infinite value: although technically speaking consumption could grow at rate b, the integral in (4.10) does not converge when (1 − σ)b − ρ > 0.
The steady-state growth path describes the optimal dynamics of the economy, without any transitional dynamics, if f(k) = bk for every 0 ≤ k ≤ ∞. We should note, however, that the constant b cannot be a function of L if F(λK, λL) = λF(K, L). Hence F(K, L) = bK = F̃(K), and the non-accumulated factor L cannot be productive in the economy considered here, which grows at a constant rate in the absence of any (exogenous) growth in L, if the production function has constant returns to scale in K and L together. Alternatively, the economy may converge asymptotically to the steady-state growth path if lim_{k→∞} f'(k) = b > 0 even though f''(k) < 0 for every 0 ≤ k < ∞. In this case the marginal productivity of L can be positive for each value of K and L, but the productive role of the non-accumulated factor becomes asymptotically negligible (in a sense that we will make more precise in Section 4.4). In both cases we have, or are approaching, a steady-state growth path: the economy grows at a positive rate if b > ρ, and (less realistically) at a negative rate if b < ρ. With δ > 0, it is not difficult to prove that the economy can grow indefinitely if lim_{k→∞} f'(k) > δ.
4.3. Decentralized Production and Investment Decisions
    The analysis of the preceding section proceeded directly from the maximiza-
tion of the objective function of a representative agent, (4.10), subject to the
technological constraint represented by the production function. Under certain
    conditions, the optimal solution coincides with the growth of an economy
    in which the decisions to save and invest are decentralized to households
    and firms. In order to study this decentralized economy, we need to define
    the economic nature and the productive role of capital in greater detail. Let us
    assume for now that K is a private factor of production. The property rights of

    this factor are owned by individual agents who in the past saved part of their
    disposable income.
    The economy is populated by infinitely lived agents, or “households,” which
    for the moment we assume to be identical. The typical household, indexed
    by i , owns one unit of labor. For simplicity, we assume that the growth rate
    of the population is zero. In addition, each household owns ai (t ) units of
    financial wealth (measured in terms of output, consumption, or capital) at
    date t . Moreover, individual agents or households take the wage rate w(t ) and
    the interest rate r (t ) at which labor and capital are compensated as given. (In
    other words, agents behave competitively on all markets.)
Family i maximizes

U = ∫_0^∞ u(c_i(t)) e^{−ρt} dt,   (4.24)

subject to the budget constraint

w(t) + r(t) a_i(t) = c_i(t) + ȧ_i(t).
    The flow income earned by capital and labor is either consumed, or added to
    (subtracted from, when negative) the family’s financial wealth.
    Production is organized in firms. Firms hire the production factors from
    households and offer their goods on a competitive market. At each date t , the
    firm indexed by j produces F (K j (t ), L j (t )) using quantities K j (t ) and L j (t )
    of the two factors, in order to maximize the difference between its revenues
    and costs. Since all prices are expressed in terms of the final good, firms solve
    the following static problem:
max_{K_j, L_j}  [ F(K_j, L_j) − r K_j − w L_j ].

Given that F(·, ·) has constant returns to scale, we can write

max_{K_j, L_j}  [ L_j f(K_j / L_j) − r K_j − w L_j ],
    where f (·) corresponds to the output per worker defined in the previous
    section. The first-order conditions of the firm are therefore given by
f′( K_j(t)/L_j(t) ) = r(t),

f( K_j(t)/L_j(t) ) − ( K_j(t)/L_j(t) ) f′( K_j(t)/L_j(t) ) = w(t),
    which are valid for each t and each j .
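As a small numerical illustration (the Cobb–Douglas form and the parameter values are assumptions made only for this sketch), one can check that when factors are paid according to these first-order conditions, factor payments exactly exhaust output under constant returns to scale.

```python
alpha = 0.3                                  # assumed capital share in a Cobb-Douglas technology
f = lambda k: k ** alpha                     # output per worker, f(k) = k^alpha
f_prime = lambda k: alpha * k ** (alpha - 1)

K, L = 5.0, 2.0
k = K / L
r = f_prime(k)                # firms' first-order condition for capital
w = f(k) - k * f_prime(k)     # firms' first-order condition for labor

Y = L * f(k)                  # aggregate output under constant returns to scale
print(Y, r * K + w * L)       # the two numbers coincide: factor payments exhaust output
```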
    Since all firms face the same unit costs of capital and labor, every firm
    will choose the same capital/labor ratio, K j /L j ≡ k. In equilibrium firms
    therefore can differ only as regards the scale of their operation: if L is the

    aggregate stock of labor (or the number of households), we can index the scale
of individual firms by θ_j so that ∑_j θ_j = 1, and denote L_j = θ_j L. Thanks to
the assumption of constant returns to scale, we can assume that F(·, ·) has
the same functional form as at the aggregate level. We can then immediately
derive a simple expression for the aggregate output of the economy:

Y ≡ ∑_j F(K_j, L_j) = ∑_j L_j f(K_j/L_j) = ( ∑_j θ_j ) L f(k) = F(K, L).
    Hence if the production function has constant returns to scale and if all
    markets are competitive, the number of active firms and the scale of their
    operation is irrelevant.36
At this point we note that ∑_j L_j = L = A N = A ( ∑_{j=1}^{N} 1 ). Hence, the
    same factor of labor efficiency A is applied to each individual unit of labor
    that is offered on the labor market. Moreover, we notice that in equilibrium
    the profits of each firm are equal to zero. It is therefore irrelevant to know
    which family owns a particular firm and at which scale this firm operates.
    Let us now return to the household. The dynamic optimization problem of
    the household is expressed by the following Hamiltonian:
H(t) = e^{−ρt} [ u(c_i(t)) + λ_i(t) ( w(t) + r(t) a_i(t) − c_i(t) ) ].

The first-order conditions are analogous to (4.12)–(4.14), and can be rewritten as

d c_i(t)/dt = [ −u′(c_i(t)) / u″(c_i(t)) ] ( r(t) − ρ ),

lim_{t→∞} e^{−ρt} u′(c_i(t)) a_i(t) = 0.
    Exercise 36 Compare this optimality condition with
u′(c_t) = [ (1 + r)/(1 + ρ) ] u′(c_{t+1}),

also known as an Euler equation, which holds in a deterministic environment with
    discrete time. Complete the parallel between the consumption problems studied
    here and in Chapter 1 by deriving a version of the cumulated budget restriction
    in continuous time.
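As a numerical complement to the parallel drawn above (CRRA utility and all parameter values below are assumptions for illustration), the following sketch iterates the discrete-time Euler equation for shorter and shorter period lengths and shows that the implied consumption growth per unit of time approaches the continuous-time expression (r − ρ)/σ.

```python
import numpy as np

sigma, r, rho = 2.0, 0.05, 0.03   # assumed CRRA curvature, interest rate, and discount rate

def growth_per_unit_time(delta):
    """Consumption growth implied by the discrete Euler equation
    u'(c_t) = (1 + r*delta)/(1 + rho*delta) * u'(c_{t+1}), with u'(c) = c**(-sigma),
    expressed per unit of time for a period of length delta."""
    gross = ((1 + r * delta) / (1 + rho * delta)) ** (1 / sigma)
    return np.log(gross) / delta

for delta in (1.0, 0.1, 0.01):
    print(f"period length {delta:>5}: growth rate {growth_per_unit_time(delta):.5f}")
print(f"continuous-time limit (r - rho)/sigma = {(r - rho) / sigma:.5f}")
```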
    ³⁶ For simplicity, we suppose that the stock of capital may vary without adjustment costs. The
    following derivations would remain valid if, as in some of the models studied in Chapter 2, returns
    to scale were constant in adjustment as well as in production, implying that—at least in the long
    run—the size of firms is irrelevant.

    4.3.1. OPTIMAL GROWTH
    We close the model by imposing the restriction that the total wealth of house-
    holds must equal the aggregate stock of capital. Inter-family loans and debts
    cancel out on aggregate, and in any case there is no reason why such loans and
    debts should exist if households are identical and start with the same initial
wealth: a_i(t) = a(t). From

∑_{i=1}^{L} a_i(t) = L a(t) = K(t),

we get

a_i(t) = a(t) = k(t).

Furthermore, given that

r = f′(k),

it is easy to verify that optimality conditions for the accumulation of financial
wealth coincide with those for the accumulation of capital along the path of
aggregate growth that maximizes (4.10). (This also remains true if g > 0, if
δ > 0, and even if n > 0, where we should note that, in the presence of
population growth, the per capita rate of return on capital a is given by r − n,
and that if capital depreciates we have r = f′(k) − δ.)
    Hence, the growth path of a market economy will coincide with the optimal
    growth path if the following conditions are satisfied.
    (A) Production has constant returns to scale.
    (B) Markets are competitive.
    (C) Savings and consumption decisions are taken by agents who independ-
    ently solve identical problems.
    Conditions (A) and (B) guarantee that r (t ) = f ′(k). The savings of an
    individual household are compensated according to the aggregate marginal
    productivity of capital. Moreover, given conditions (A) and (B), the market
    structure is very simple and the entire economy behaves as a “representa-
    tive” firm.
    Hypothesis (C) allows us to represent the savings decisions in terms of the
    optimization of a single “representative agent.” Most differences between indi-
    vidual agents on the market are made irrelevant by the presence of a perfectly
    competitive capital market (as implicitly assumed above). For example, the
    supply of labor may follow different dynamics across households, but access
    to a perfectly competitive capital market may prevent this from having any
    effect at the aggregate level: individuals or households whose labor income is
    temporarily low can borrow from households that are in the opposite position,
    with no aggregate effects as long as total labor supply in the economy is

    fixed. This is an application of the permanent income hypothesis discussed
    in Chapter 1.
    It is also useful to note that differences in individual consumption have no
    impact at the aggregate level if agents have a common utility function with a
    constant elasticity of substitution as in (4.20). In this case the growth rate of
    consumption is the same for all households, so that

Ċ/C = ( ∑_i ċ_i ) / ( ∑_i c_i ) = ( ∑_i [ (r(t) − ρ)/σ ] c_i ) / ( ∑_i c_i ) = ( r(t) − ρ ) / σ.
    Functional form (4.20) thus has two advantages. On the one hand, this func-
    tional form is compatible with a steady-state growth path (as we saw above).
    On the other hand, it allows us to aggregate the individual investment deci-
    sions, even in the case in which agents consume different amounts, because
    the interest rate r (t ) is the same for all agents.
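A minimal numerical sketch of this aggregation property (the initial consumption levels and parameter values are arbitrary assumptions): with a common constant-elasticity utility every household's consumption grows at the same rate (r − ρ)/σ, so the growth rate of aggregate consumption does not depend on how consumption is distributed across households.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, r, rho, dt = 2.0, 0.05, 0.03, 0.01
g = (r - rho) / sigma                      # common individual growth rate of consumption

c = rng.uniform(0.5, 5.0, size=1000)       # heterogeneous initial consumption levels
c_next = c * np.exp(g * dt)                # every household's consumption grows at rate g

aggregate_growth = (np.log(c_next.sum()) - np.log(c.sum())) / dt
print(aggregate_growth, g)                 # identical: the distribution of c_i is irrelevant
```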
    4.4. Measurement of “Progress”: The Solow Residual
    The hypotheses of constant returns to scale and perfectly competitive markets
    (realistic or not) not only are crucial for the equivalence between the opti-
    mization at the aggregate and decentralized levels, but also make it possible to
    measure the technological progress that may allow unlimited growth of labor
productivity when k_ss < ∞. Differentiating the production function Y(t) = F(K(t), L(t)), we get

Ẏ(t) = F_K(·) K̇(t) + F_L(·) L̇(t) = F_K(·) K̇(t) + F_L(·) [ Ṅ(t) A(t) + N(t) Ȧ(t) ],

where F_L(·) and F_K(·) denote the partial derivatives with respect to the production
factors, which are measured in current values. The second equality exploits our
definition of labor supply L(t) ≡ N(t) A(t). Rewriting the above expression in terms of
proportional growth yields

Ẏ/Y = [ F_K(·) K / Y ] K̇/K + [ F_L(·) A N / Y ] Ṅ/N + [ F_L(·) N / Y ] Ȧ,   (4.25)

where we have omitted the time dependence. Now, if labor markets are perfectly
competitive, we have w = ∂F(·)/∂N = A F_L(·). We can thus write

F_L(·) A N / Y = w N / Y ≡ γ,

which expresses labor's share of national income, which is in general observable, in
terms of a derivative of the aggregate production function. Moreover, given that the
production technology has constant returns to scale in K and L, the entire value of
output will be paid to the production factors if these are paid according to their
marginal productivity. In fact, for each F(·, ·) with constant returns to scale,

F( K/Y, L/Y ) = 1   with Y = F(K, L).

Using Euler's Theorem, we therefore have

1 = F( K/Y, L/Y ) = [ ∂F(K, L)/∂K ] K/Y + [ ∂F(K, L)/∂L ] L/Y.   (4.26)

Hence,

F_K(·) K / Y = 1 − A F_L(·) N / Y = 1 − γ,

and (4.25) implies

γ Ȧ/A = Ẏ/Y − (1 − γ) K̇/K − γ Ṅ/N.   (4.27)

If accurate measures of γ (the income share of the non-accumulated factor N) and of
the proportional growth rates of Y, K, and N are available, then (4.27) provides a
measure known as "Solow's residual," which indicates how much of the growth in income
is accounted for by an increase in the measure of efficiency A(t) (which as such is not
measurable) rather than by an increase in the supply of productive inputs.

If the production function has the Cobb–Douglas form,

Y = F(K, L) = F(K, AN) = K^α (AN)^{1−α},   (4.28)

or, equivalently, if

Y = Ã F̃(K, N) = Ã K^α N^{1−α},   where Ã = A^{1−α},   (4.29)

then γ is constant and equal to 1 − α. The Cobb–Douglas function is therefore
convenient from an analytic point of view, and also because it does not attach any
practical relevance to the difference between a labor-augmenting technical change as in
(4.28) and a neutral technological change as in (4.29). In fact, the Solow residual
defined in (4.27) corresponds to the rate of growth of Ã.

Exercise 37 Verify that, if K̇/K = Ȧ/A + Ṅ/N, the income shares of capital and labor
are constant as long as the production function has constant returns to scale, even if
it does not have the Cobb–Douglas form.

Unfortunately, the functional form (4.28) implies that

lim_{k→∞} f′(k) = lim_{k→∞} α k^{α−1} = 0

if α < 1, that is, if γ > 0 and labor realistically receives a positive share of
national income. Given that the labor share is approximately constant (around 60% in
the long run), the empirical evidence does not seem supportive of unlimited growth with
constant returns to scale.
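As a numerical illustration of (4.27) (the growth rates and the labor share below are made-up values, not data from the text), the Solow residual can be computed directly from observable growth rates.

```python
# Hypothetical annual growth rates and labor share (illustrative values, not data from the text).
gY, gK, gN = 0.035, 0.040, 0.010   # growth rates of output, capital, and employment
gamma = 0.60                       # labor's share of national income

# Equation (4.27): gamma * (Adot/A) = gY - (1 - gamma) * gK - gamma * gN
solow_residual = (gY - (1 - gamma) * gK - gamma * gN) / gamma
print(f"implied growth rate of A: {solow_residual:.4f}")   # about 0.022 per year with these numbers
```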
    More generally, for each case in which the aggregate production function
    F (·, ·) has constant returns to scale and
lim_{k→∞} f′(k) = lim_{k→∞} ∂F(K, L)/∂K = b > 0,

then F_L(·) L / F(·) tends to zero when K and k approach infinity for a constant L. It
suffices to take the limit of expression (4.26) with K → ∞ (and thus L/Y = L/F(K, L) → 0),
which yields

1 = F( lim_{K→∞} K/Y, 0 ) = b lim_{K→∞} K/F(K, L) + lim_{K→∞} [ ( ∂F(K, L)/∂L ) L/F(K, L) ],   (4.30)

l'Hôpital's rule then implies (as in exercise 33 above) that

lim_{K→∞} K/F(K, L) = 1 / ( lim_{K→∞} ∂F(K, L)/∂K ) = 1/b.
    Hence, the first term on the right-hand side of (4.30) tends to one, and the
    second term (the income share of the non-accumulated factor) therefore has
    to tend to zero.
    In sum, the income share of the non-accumulated factor „ needs to decline
    to zero with the accumulation of an infinite amount of capital if
    (i) the accumulation of capital allows the economy to grow indefinitely,
    and
    (ii) the production function has constant returns to scale.
    This conclusion is intuitive in light of the reasoning that led us to draw a
    convex production function in Figure 4.1, and to identify a steady state in
    Figure 4.2; if we have equality rather than a strict inequality in (4.5), that is if
f(λk) = F(λk, 1) = F(λk, λ) = λ F(k, 1) = λ f(k)

for λ ≠ 1, then output is proportional to K and increasing L will not have
    any effect on output. If the increase in the stock of capital tends to have pro-
    portional effects on output, then both marginal productivity and the income
    share of the non-accumulated factor must steadily decrease.
    Exercise 38 Verify this result for the case of a function in the form (4.9).
    Naturally, equation (4.27) and its implications are valid only under the twin
    assumptions that the production technology exhibits constant returns to scale
    and that production factors are paid according to their marginal productivity.

    From a formal point of view, nothing would prevent us from considering
    models in which either assumption is violated. As illustrated in the exercise
    below, in that case it does not make much sense to measure Ȧ/ A by inserting
    labor’s income share „ in (4.27).
Exercise 39 Consider a Cobb–Douglas production function with increasing returns to scale,

Y = A N^α K^β,   α + β > 1.

Suppose, in addition, that wages are below the marginal productivity of labor,

A F_N(·) = w / (1 − μ),

where μ > 0 can be interpreted as a monopolistic mark-up. What does the Solow
residual measured by (4.27) correspond to in this case?
    The above hypotheses correspond to conditions (A) and (B) in the previous
    section, which allowed us to connect the macroeconomic dynamics to the
    savings and consumption decisions of individual agents. Constant returns to
    scale allowed us simply to aggregate the production functions of the individual
    firms. And the remuneration of production factors equal to their marginal
    product (which in turn followed from the assumption that all markets are
    characterized by free entry and perfect competition) ensured that the dynamic
    path of the economy maximized the welfare of a hypothetical representative
    agent. In the rest of this chapter we consider models for which the macro-
    economic dynamics are well-defined (but not necessarily optimal from the
    aggregate point of view) in the absence of perfectly competitive markets and
    in the presence of increasing returns to scale.
    4.5. Endogenous Growth and Market Imperfections
    To obtain an income share for the non-accumulated factor that is not reduced
    to zero in the long-run and at the same time allow for an endogenous growth
    rate that is determined by the investment decisions of individual agents, we
    need to reconsider the assumption of constant returns to scale. Henceforth
    we will consider steady-state growth paths only in the absence of exogenous
    technological change. We know that, in order to sustain long-run (propor-
    tional) growth, the economy needs to exhibit constant returns to capital: from
    now on we therefore assume that f ′(k) = b, with b independent of k. If that
    condition is satisfied, and if the productivity of the non-accumulated factor L
    is positive, aggregate production is characterized by increasing returns to scale.

    Multiplying K and L by the same constant increases aggregate production
    more than proportionally.
    As shown above, constant returns to scale are a crucial condition for the
    decentralization of the socially optimal savings and investment decisions.
    Allowing for increasing returns to scale means (in general) that we lose this
    result. It becomes important therefore to confront the optimal growth path of
    the economy with the growth path that results from decentralized investment
    decisions. In addition, we need to pay attention to the criteria for the distri-
    bution of income: with increasing returns to scale, it is no longer possible to
    remunerate all factors of production on the basis of their marginal product
    because the sum of these payments would exceed the value of production.
    Some factor of production needs to receive less than the value of its marginal
    product, and it is obviously of interest to know how that may result.
    4.5.1. PRODUCTION AND NON-RIVAL FACTORS
    To understand the economic mechanisms behind the division of the value of
    output within each productive unit and in the economy as a whole, it is useful
    briefly to reconsider the hypothesis of constant returns to scale.
    One possible microeconomic foundation for this assumption is based on
    the idea that production processes can be replicated. If a firm or productive
    unit j produces Y j using quantities K j and L j (and these are the only neces-
    sary factors of production), then one can obviously obtain double the amount
    of production by doubling the input of both factors, simply by organizing
    these additional factors in an identical production unit. The same reason-
    ing applies to different factors of proportionality as long as the factors of
    production are perfectly divisible, as is implicit in the concept of marginal
    productivity.
    A model with constant returns to scale in production implies not only that
    a doubling of inputs may lead to a doubling of output, but also that such a
    doubling of inputs is necessary to obtain twice the amount of output. In reality,
    however, there are factors of production whose input need not be doubled in
    order to double output; for instance, to build a house one needs a blueprint,
    a piece of land, and a certain quantity of materials, manual labor, and energy
    (all inputs that can be expressed in units of labor and other primary inputs).
    To build a second house one probably needs the same amounts of materials,
    labor, and energy and an identical piece of land. However, nothing prevents
    the use of the same plan. The same input can therefore be used to build several
    houses. This is an example of a more general phenomenon: certain factors
    of production (like the architectural plan) may be used contemporaneously
    by one or more production processes, and their use in a production process
    need not reduce its productivity in other processes. These factors are normally

    referred to as non-rival inputs. It is not difficult to find other examples: every
    factor that provides intangible (but necessary) input of know-how (or soft-
    ware) is non-rival.
    The presence of non-rival factors makes the assumption of increasing
    returns plausible. It is still possible to build a second house using double the
    amount of all inputs including a completely new plan. But this is no longer
    necessary: since the product can be doubled without any work on the plan,
    doubling all inputs makes it possible to improve its quality, or perhaps to build
    a larger house.
    As we know, the assumption of constant returns was useful to decentralize
    production and to distribute its revenues. On the contrary, with non-rival
    factors and increasing returns to scale, it is no longer possible to pay all factors
    according to their marginal productivity. The total productivity of the factors
    used in design depends on the number of houses that are built with one and
    the same design. This number can in principle be very high. Moreover, if each
    additional house requires a constant amount of labor and material, then the
    production technology has constant returns to these variable inputs, and if
    these factors were paid according to their true marginal productivity, there
    would be nothing left to pay the architect.
    How can one decentralize production decisions under these circumstances?
Non-rival factors are mostly identified with intangible resources (know-how,
software) which, by their nature, are often also non-excludable. When a
    productive input is non-rival it is often difficult to prevent other agents from
    making an economic use of this factor. The regulation of property rights and
    licenses is meant to resolve this type of problem. Nonetheless, the theft of intel-
    lectual property remains difficult to prove and is also hard to punish, because
    the knowledge (the stolen “object”) remains in the hands of the thief. In the
    example of the house, the private property in the physical sense (calculations
    and designs) can be guaranteed, and unauthorized duplication of the plan can
    be punished legally. However, certain innovative aspects of the project may be
    evident by simply observing the final product, and it is not easy to prevent or
    punish reproduction of these aspects by third parties.
    Many recent growth models allow for increasing rather than constant
    returns to scale, and are therefore naturally forced to study markets and
    productive structures characterized by non-rivalry and non-excludability of
    certain factors.
    4.5.2. INVOLUNTARY TECHNOLOGICAL PROGRESS
In the model outlined below, the level of technology A (exogenous in Solow's growth
model) is treated as an entirely non-rival and non-excludable productive input.
The production function therefore has three arguments, K, N, and A:
    Y = F (K , L ) = F (K , AN) ≡ F̃ (K , N, A).
    If F (·, ·) has constant returns to scale in K and L (both of which have strictly
    positive marginal productivity), then F̃ (·, ·, ·) has increasing returns to scale
    in K , N, and A. In fact, doubling K , N, and A doubles K but quadruples
    L , so aggregate production more than doubles if F L (·) > 0. Since firms hire
    capital and labor from households, we can interpret the situation in terms of
    the non-rivalry and non-excludability of A: each unit of labor has free access
    to the current level of A, which is the same for all.
    As we know, growth in the Solow model is exogenous. More precisely,
    the dynamics of the level of technological change or efficiency A(t ) is not
    influenced by economic decisions: if one interprets A as a production factor
    in the decentralized model, then this factor
    (A1) is completely non-excludable from the viewpoint of production and
    receives no remuneration;
    (A2) is reproduced over time without any interaction with the production
    system; in fact, if we have exponential technological change at a con-
stant rate g_A, the expression Ȧ(t) = g_A A(t) can be interpreted as an
    expression of accumulation in which A(t ) is used in the production
    of further technological progress (besides its use in the production of
    final goods).
    To integrate technological change in the economic structure of the model,
    we can preserve aspect (A1) (no remuneration for the “factor” technology)
    and relax aspect (A2), assuming that the growth in efficiency is linked to
    economic activity (and remunerated).
    For example, one can specify a model in which technological change is a
    by-product of production (learning by doing). One can for instance assume
    that
A(t) = A( K(t)/N ),   A′(·) > 0,
    so that the effective productivity of labor is a function of the amount of capital
    per worker. To interpret this assumption, one could assume that experience
    makes workers more efficient. That is, while doing, workers learn from their
    mistakes, and their additional experience thereby increases the productive
    efficiency of the non-accumulated factor N.
    The proposed functional form assumes that labor efficiency is a function
    of the capital stock and thus of the total amount of past investments. It
    may be more realistic to assume that total accumulated production, rather
    than investments, determines the efficiency of N. However, such an exten-

    sion would complicate the analysis without providing substantially different
    results.
    Much more important is the implicit assumption that the efficiency of each
    unit of labor does not depend on its own productive activity, but rather on
    aggregate economic activity. Agents in this economy learn not only from their
    own mistakes, so to speak, but also from the mistakes of others. When deciding
    how much to invest, agents do not consider the fact that their actions affect the
    productivity of the other agents in the economy; the economic interactions
    are thus affected by externalities. These externalities are similar (albeit with an
    opposite sign) to the externalities that one encounters in any basic textbook
    treatment of pollution, or to those that we will discuss in Chapter 5 when we
    consider coordination problems.
    If we retain the assumptions that firms produce homogeneous goods with
    the constant-returns-to-scale production technology F (K j , AN j ), that A is
    non-rival and non-excludable, and that all markets are perfectly competitive,
    then output decisions can be decentralized as in Section 4.3. In particular, the
    marginal productivity of capital needs to coincide with r (t ), the rate at which
    it is remunerated in the market,
r(t) = ∂F(·)/∂K ≡ F_1(·) = f′(K/L),
    and the dynamic optimization problem of households implies a proportional
    growth rate of consumption equal to (r (t ) − Ò)/Û if the function of marginal
    utility has constant elasticity. Hence, recalling that L = AN, it follows that
    both individual and aggregate consumption grow at a rate
Ċ(t)/C(t) = [ f′( K(t)/(N A(t)) ) − ρ ] / σ.
    If, as in the case of a Cobb–Douglas function, the economy distributes a
    constant (or non-vanishing) share of national income to the non-accumulated
factor, then lim_{k→∞} f′(k) = 0 < ρ and consumption growth can remain positive only if
A and L grow together with K, which would prevent the marginal productivity of capital
from approaching zero. However, since A is a function of k in the model of this section,
the growth of A itself depends on the accumulation of capital. If

lim_{k→∞} A(k)/k = 1/a > 0,
    we have
lim_{K/N→∞} f′( K / (A(K/N) N) ) = lim_{K/N→∞} F_1( K / (A(K/N) N), 1 ) = F_1(a, 1),

which may well be above ρ.

Exercise 40 Let F(K, L) = K^α L^{1−α} and A(·) = a K/N: what is the growth
rate of the economy?
    Hence, in the presence of learning by doing, the economy can con-
    tinue to grow endogenously even if the non-accumulated factor receives a
    non-vanishing share of national income. There is however an obvious prob-
    lem. From the aggregate viewpoint, true marginal productivity is given by
d F(K, A(K/N) N)/dK = F_1(·) + F_2(·) A′(k) > F_1(·),   for F_2(·) ≡ ∂F(·)/∂L.
    Hence, growth that is induced by the optimal savings decisions of individuals
    does not correspond to the growth rate that results if one optimizes (4.10)
    directly. In fact, the decentralized growth rate is below the efficient growth
    rate because individuals do not take the external effects of their actions into
    account, and they disregard the share of investment benefits that accrues to
    the economy as a whole rather than to their own private resources.
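A short numerical sketch of this wedge (the Cobb–Douglas technology, the linear learning function, and all parameter values are assumptions made only for illustration) compares the private return, obtained by differentiating output while holding the efficiency level fixed, with the social return, obtained by letting A adjust with K, and the consumption growth rates they imply.

```python
alpha, a, N = 0.3, 0.5, 1.0       # assumed technology and learning-by-doing parameters
rho, sigma = 0.03, 2.0            # assumed preference parameters

F = lambda K, L: K ** alpha * L ** (1 - alpha)   # Cobb-Douglas in K and L = A*N
A = lambda K: a * K / N                          # efficiency rises with capital per worker

K, h = 10.0, 1e-5
# Private return: marginal product of capital holding the efficiency level fixed,
# as perceived by an individual investor.
L_fixed = A(K) * N
r_private = (F(K + h, L_fixed) - F(K - h, L_fixed)) / (2 * h)
# Social return: the aggregate derivative, letting A adjust together with K.
r_social = (F(K + h, A(K + h) * N) - F(K - h, A(K - h) * N)) / (2 * h)

print(f"private return {r_private:.4f} -> consumption growth {(r_private - rho) / sigma:.4f}")
print(f"social return  {r_social:.4f} -> consumption growth {(r_social - rho) / sigma:.4f}")
```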
    4.5.3. SCIENTIFIC RESEARCH
    It may well be the case that innovative activity has an economic character and
    that it requires specific productive efforts rather than being an unintentional
    by-product. For example, we may have
Y(t) = C(t) + K̇(t) = F( K_y(t), L_y(t) ),   (4.31)
Ȧ(t) = F( K_A(t), L_A(t) ),   (4.32)

with K_y(t) + K_A(t) = K(t) and L_y(t) + L_A(t) = L(t) = A(t) N(t). In other
words, new and more efficient modes of production may be "produced" by
dedicating factors of production to research and development rather than to
the production of final goods.

If, as suggested by the notation, the production function is the same in both
sectors and has constant returns to scale, then we can write

Ȧ = F(K_A, L_A) = [ ∂F(K_A, L_A)/∂K ] K_A + [ ∂F(K_A, L_A)/∂L ] L_A.
    Assuming that the rewards r and w of the factors employed in research are
    the same as the earnings in the production sector, then
Ȧ = r K_A + w L_A   (4.33)
    is a measure of research output in terms of goods. If A is (non-rival and)
    non-excludable, then this output has no market value. Since it is impossible to
    prevent others from using knowledge, private firms operating in the research

    sector would not be able to pay any salary to the factors of production that
    they employ.
    Nonetheless, the increase in productive efficiency has value for society as a
    whole, if not for single individuals. Like other non-rival and non-excludable
    goods, such as national defense or justice, research may therefore be financed
    by the government or other public bodies if the latter have the authority to
    impose taxes on final output that has a market value. One could for example
tax the income of all private factors at rate τ, and use the revenue to finance
    “firms” which (like universities or national research institutes, or like monas-
    teries in the Middle Ages) produce only research which is of no market value.
    Thanks to constant returns to scale, one can calculate national income in both
    sectors by evaluating the output of the research sector at the cost of production
    factors, as in (4.33). Moreover, the accumulation of tangible and intangible
    assets obeys the following laws of motion:
K̇ = (1 − τ) F(K, AN) − C,
Ȧ = τ F(K, AN).

The return on private investments is given by

r(t) = (1 − τ) f′(k),
    and if f (·) has decreasing returns the economy possesses a steady-state growth
    path in which A, K , Y , and C all grow at the same rate. It is not difficult
    to see that there is no unambiguous relation between this growth rate and
the tax τ (or the size of the public research sector). In fact, in the long run
there is no growth if τ = 0, since in that case Ȧ(t) = 0; but neither is there
growth if τ is so high that r(t) = (1 − τ) f′(k) tends toward values below
    the discount rate of utility, and prevents growth of private consumption and
    capital. For intermediate values, however, growth can certainly be positive.
    (We shall return to this issue in Section 4.5.5.)
    4.5.4. HUMAN CAPITAL
    Retaining assumptions (4.32) and (4.31), one can reconsider property (A1),
    and allow A to be a private and excludable factor of production. In this case,
    the problem of how to distribute income to the three factors A, K , and L
    if there are increasing returns to scale can be resolved if one assumes that a
    person (a unit of N) does not have productive value unless she owns a certain
amount of the measure of efficiency A. Reversing the hypothesis implicit in
the Solow model (in which N is remunerated but not A), the presence of N is
thus completely irrelevant from a productive point of view.

    The factor A, if remunerated, is not very different from K , and may be
    dubbed human capital. In fact, for A to be excludable it should be embodied
    in individuals, who have to be employed and paid in order to make productive
    use of knowledge. One example of this is the case of privately funded profes-
    sional education.
    In the situation that we consider here, all the factors are accumulated. Given
    constant returns to scale, we can therefore easily decentralize the decisions to
    devote resources to any of these uses. If as in (4.31) and (4.32) the two factors
    of production are produced with the same technology, and if one assumes that
all markets are competitive so that A and K are compensated at rates F_A(·)
and F_K(·) respectively, then the following laws of motion hold:

K̇ = F( (1 − τ) K, (1 − τ) A ) − C = (1 − τ) F(K, A) − C,
Ȧ = τ F(K, A).

In these equations τ no longer denotes the tax on private income, but rather
    more generally the overall share of income that is devoted to the accumulation
    of human capital instead of physical capital (or consumption).
    If technological change does indeed take the form suggested here, then
    we need to reinterpret the empirical evidence that was advanced when we
    discussed the Solow residual. Given that the worker’s income includes the
    return on human capital, we need to refine the definition of labor stock, which
    is no longer identical to the number of workers in any given period. The
    accumulation of this factor may for example depend on the enrolment rates of
    the youngest age cohorts in education more than on demographic changes as
    such. However, the fact that agents have a finite life, and that they dedicate only
    the first part of their life to education, implies that it is difficult to claim that
    education is the only exclusive source of technological progress. Each process
    of learning and transmission of knowledge uses knowledge that is generated
    in the past and is not necessarily compensated. Hence also the accumulation
    of human capital is subject to the type of externalities that we encountered in
    the discussion of learning by doing.37
    4.5.5. GOVERNMENT EXPENDITURE AND GROWTH
    Besides the capacity to finance the accumulation of non-excludable technolog-
    ical change, government spending may provide the economy with those (non-
    rival and non-excludable) factors that make the assumption of increasing
    returns plausible. Non-rivalry and non-excludability are in fact main features
    ³⁷ Drafting and studying the present chapter, for example, would have been much more difficult if
    Robert Solow, Paul Romer, and many others had not worked on growth issues. Yet, no royalty is paid
    to them by the authors and readers of this book.

    of pure public goods like defense or police, and of quasi-public goods like
    roads, telecommunications, etc. To analyze these aspects, we assume that
    Y (t ) = F̃ (K (t ), L (t ), G (t )),
    where, besides the standard factors K and L (the latter constant in the absence
    of exogenous technological change), the amount of public goods G appears
    as a separate input. Since L and K are private factors of production, the
    competitive equilibrium of the private sector requires that the production
    function F̃ (·, ·, ·) has constant returns to its first two arguments:
F̃( λK, λL, G ) = λ F̃( K, L, G ).
    Hence, given ∂ F̃ (·)/∂ G > 0, a proportional change of G and of the private
    factors L and K results in a more than proportional increase in production.
    The function F̃ (·, ·, ·) therefore has increasing returns to scale, but this does
    not prevent the existence of a competitive equilibrium as long as G is a non-
    rival and non-excludable factor which is made available to all productive units
    without any cost. If the provision of public goods is constant over time (G (t ) =
    Ḡ for each t ) then, as in the preceding section, constant returns to K and L
    would imply decreasing returns to K . With an increase in the stock of capital,
    the growth rate that is implied by the optimization of (4.10) and (4.20), i.e.
Ċ(t)/C(t) = [ ∂F̃( K(t), L(t), Ḡ )/∂K − ρ ] / σ,
    can only decrease, and will fall to zero in the limit if L continues to receive a
    positive share of aggregate income.
    To allow indefinite growth, the provision of public goods needs to increase
    exponentially. If, as seems realistic, a higher G (t ) has a positive effect on the
    marginal productivity of capital, then Ġ (t ) > 0 has a similar effect to the (ex-
    ogenous) growth of A(t ) in the preceding sections. Hence, an ever increasing
    supply of public goods may allow the return on savings to remain above the
discount rate ρ, so that the economy as a whole can grow indefinitely.
    As we saw in Section 4.5.2, the development of A(t ) could be made
    endogenous by assuming that the accumulation of this index of efficiency
    depended on the capital stock. Similarly, and even more obviously, the provi-
    sion of public goods is a function of private economic activity if one assumes
    that their provision is financed by the taxation of private income. If
G(t) = τ F̃( K(t), L(t), G(t) ),   (4.34)

then each increase in production will be shared in proportion between consumption,
investment, and the increase of G(t), which can offset the secular decrease in the
marginal productivity of capital.

To obtain a balanced growth path, the production function needs to have
constant returns to K and G for any constant L. In fact, if

F̃( λK, L, λG ) = λ F̃( K, L, G ),

a constant increase of capital will imply proportional growth of income if G
grows at the same rate as K; this is in turn implied by the proportionality
of income, tax revenues, and the provision of public goods in (4.34). To calculate
the growth rate that is compatible with a balanced government budget
and with the resulting savings and investment decisions, we must take into
account the fact that the tax rate τ has to be subtracted from the private return
on savings; hence, consumption grows at the rate

Ċ(t)/C(t) = [ (1 − τ) ∂F̃( K(t), L(t), G(t) )/∂K − ρ ] / σ,   (4.35)
    and the growth path of the economy will satisfy the above equation and (4.34).
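The following sketch illustrates (4.34) and (4.35) numerically under an assumed Cobb–Douglas specification in which the exponent on G is set to 1 − α, so that returns to K and G together are constant as required above; all parameter values are illustrative. For each tax rate τ it solves the government budget (4.34) for G, computes the after-tax marginal product of capital, and reports the implied consumption growth rate, which first rises and then falls with τ.

```python
from scipy.optimize import brentq

alpha, beta = 0.6, 0.4      # assumed exponents on K and L; the exponent on G is 1 - alpha,
                            # so that returns to K and G together are constant
K, L = 1.0, 1.0             # illustrative levels (the growth rate does not depend on K here)
rho, sigma = 0.03, 2.0      # assumed preference parameters

F = lambda K, L, G: K ** alpha * L ** beta * G ** (1 - alpha)

def growth_rate(tau):
    # Solve the government budget (4.34), G = tau * F(K, L, G), for G.
    G = brentq(lambda G: G - tau * F(K, L, G), 1e-12, 1e6)
    dF_dK = alpha * F(K, L, G) / K             # marginal product of capital
    return ((1 - tau) * dF_dK - rho) / sigma   # consumption growth, equation (4.35)

for tau in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"tau = {tau:.1f}: growth = {growth_rate(tau):+.4f}")
```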
    Exercise 41 Consider the production function
F̃(K, L, G) = K^α L^β G^γ.

Determine what relation α, β, and γ need to satisfy so that the economy has
    a balanced growth path. What is the growth rate along this balanced growth
    path?
    4.5.6. MONOPOLY POWER AND PRIVATE INNOVATIONS
    An important aspect of the models described above is the fact that the decen-
    tralized growth path need not be optimal in the absence of a complete set of
    competitive markets. The formal analysis of economic interactions that are
    less than fully efficient plays an important role in modern macroeconomics,
    and in this concluding section we briefly discuss how imperfectly competitive
    markets may imply inefficient outcomes.
    In order to decentralize production decisions, we have so far assumed that
    markets are perfectly competitive (allowing only for the possibility of missing
    markets in the case of non-excludable factors). However, it is realistic to
    assume that there are firms that have monopoly power and that do not take
    prices as given. From the viewpoint of the preceding sections, it is interesting
    to note the relationship between monopoly power and increasing returns to
    scale within firms. Returning to the example of a house, we assume that the
    project is in fact excludable. That is, a given productive entity (a firm) can
    legally prevent unauthorized use of the project by third parties. However,
    within the firm the project is still non-rival, and the firm can use the same
    blueprint to build any arbitrary number of houses. If we assume that the firm

    is competitive, it will be willing to supply houses as long as the price of each
    is above marginal cost. Hence for a price above marginal cost supply tends to
    infinity, while for any price below marginal cost supply is zero. But if the price
    is exactly equal to marginal cost, then revenues are just enough to recover the
    variable cost (materials, labor, land)—and the fixed cost (the project) would
    need to be paid by the firm, which should rationally refuse to enter the market.
    A firm that bears a fixed cost but does not have increasing marginal costs
    (or more generally has increasing returns) has to be able to charge a price
    above marginal cost in order to exist. Formally, we assume that firm j needs
to pay a fixed cost κ_0 to be able to produce, and a variable cost (per unit of
output) equal to κ_1. In addition, we assume that the demand function has
constant elasticity, with p_j = x_j^{α−1}, where x_j is the number of units produced
and offered on the market. The total revenues are thus p_j x_j = x_j^α, and to
maximize profits,

max_{x_j}  x_j^α − κ_0 − κ_1 x_j,

the firm chooses output level

x_j = ( κ_1 / α )^{1/(α−1)}

and charges price

p_j = κ_1 / α.

With free entry of firms (that is, any firm that pays κ_0 can start production of
this item), profits will be zero in equilibrium:

( p_j − κ_1 ) x_j = κ_0   ⇒   x_j = ( κ_0 / κ_1 ) ( α / (1 − α) ),   (4.36)

and the resulting price is equal to the average cost of production, rather than
the marginal cost, as in the case of perfect competition. The costs of each firm
are thus given by

κ_0 + κ_1 x_j = κ_0 + κ_1 ( κ_0 / κ_1 ) ( α / (1 − α) ) = κ_0 / (1 − α).   (4.37)
    This condition determines the scale of production, or in our example the
    number of houses that are produced with each project.
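A two-line numerical check of (4.36) and (4.37) (κ_0, κ_1, and α below are arbitrary illustrative values): at the free-entry scale, the markup price κ_1/α exactly equals average cost, so profits are indeed zero.

```python
kappa0, kappa1, alpha = 2.0, 1.5, 0.6   # illustrative fixed cost, variable cost, and demand parameter

x = (kappa0 / kappa1) * alpha / (1 - alpha)   # free-entry scale, equation (4.36)
price = kappa1 / alpha                        # price charged by the monopolistic firm
average_cost = (kappa0 + kappa1 * x) / x      # total cost per unit at that scale

print(price, average_cost)                    # equal: price covers average, not marginal, cost
print((price - kappa1) * x - kappa0)          # profit is exactly zero
```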
    To incorporate this monopolistic behavior in a dynamic general equilib-
    rium model, we consider the aggregate production (valued at market prices)
    of N identical firms:
X = ∑_{j=1}^{N} p_j x_j = ∑_{j=1}^{N} x^α = N x^α.

If κ_0 and κ_1 are given and if N is an integer, then this measure of output can
    only be a multiple of the scale of production calculated in (4.36). However,
    nothing constrains us from indexing firms with a continuous variable and
    replacing the summation sign by an integral.38 Writing
X = ∫_0^N x_j^α dj = x^α ∫_0^N dj = N x^α,
    and treating N as a continuous variable, the zero profit condition can be
    exactly satisfied for any value of aggregate production. Given that profits are
    zero, the value of production equals the cost of production, which in turn is
    given by N times the quantity in (4.37). Assume for a moment that the costs
    of a firm (both fixed and variable) are given by the quantity of K multiplied by
    r (t ). For a given supply of productive factors, we can then determine the num-
    ber of production processes that can be activated as well as the remuneration
    of the production factors. The scale of production of each of the N identical
    firms is proportional to K /N, and the constant of proportionality is given by
κ_0/(1 − α).
    We thus have
X = ∫_0^N [ ( κ_0/(1 − α) ) ( K/N ) ]^α dj = ( κ_0/(1 − α) )^α N^{1−α} K^α.   (4.38)
    Because the goods are imperfect substitutes, the value of output increases with
    the number of varieties N for any given value of K . In other words, for a given
    value of income it is more satisfying to consume a wider variety of goods.
    Suppose that the value of aggregate output is defined by
Y = L^{1−α} ( ∫_0^N x_j^α dj ) = L^{1−α} X.
    That is, output (which can be consumed or invested in the form of capital) is
    obtained by combining the market value X of the intermediate goods x j with
    factor L which, as usual, is assumed to be exogenous and fixed.
    Let us assume in addition that utility has the constant-elasticity form (4.20),
    so that the optimal rate of growth of consumption is constant if the rate of
    return on savings is constant. Given that, in equilibrium,
Y = L^{1−α} X = L^{1−α} θ^{1−α} K,
    ³⁸ Approximating N by a continuous variable is substantially appropriate if the number of firms is
    large. Formally, one would let the economic size of each firm go to zero as their number increases, and
    keep the product of the number of firms by the distance between their indexes constant at N.

so that ∂Y/∂K is constant (non-decreasing), we find that equilibrium has a
growth path with a constant growth rate if

∂Y/∂K = L^{1−α} θ^{1−α} > ρ.

In the decentralized equilibrium, the rate of growth is (r − ρ)/σ, where r
denotes the remuneration of capital in terms of the final good. To determine
r, we notice that each factor is paid according to its marginal productivity in
the final goods sector, provided that this sector is competitive. Hence, the total
value of income that accrues to capital is equal to

r K = α (Y/K) K = α L^{1−α} θ^{1−α} K,

and

r = α L^{1−α} θ^{1−α} < L^{1−α} θ^{1−α} = ∂Y/∂K.

The private accumulation of capital is rewarded at a rate that is below its productivity
at the aggregate level. As before, the economy therefore grows below the optimum growth
rate. Intuitively, given that the production technology is characterized by increasing
returns at the level of an individual firm, firms can make positive profits only if
prices exceed marginal costs. The rate r which determines marginal costs is therefore
below the true aggregate return on capital. The difference between private and social
returns on capital is given by the mark-up, which distorts savings decisions and implies
that growth is slower than optimal.

Admitting that prices may be above marginal cost, one can add further realism to the
model by assuming that monopolistic market power is of a long-run nature. This requires
that fixed flow costs be incurred once the firm is created. Over time, firms can
therefore gradually recover fixed costs, thanks to monopolistic rents. Obviously, this
is the right way to formalize the above house example: the fixed cost of designing the
house is paid once, but the resulting project can be used many times. We refer readers
to the bibliographical references at the end of this chapter for a complete treatment of
the resulting dynamic optimization problem and its implications for the aggregate growth
rate.

REVIEW EXERCISES

Exercise 42 Consider the production function

Y = F(K) = α K − (1/2) K²   if K < α,   and   F(K) = (1/2) α²   otherwise.

(a) Determine the optimality conditions for the problem

max ∫_0^∞ u(C(t)) e^{−ρt} dt   s.t.   C(t) = F(K(t)) − K̇(t),   K(0) < α given,

with utility function

u(x) = ι + β x − (1/2) x²   if x < β,   and   u(x) = ι + (1/2) β²   otherwise.

(b) Calculate the steady-state values of capital, production, and consumption. Draw the
phase diagram in the capital–consumption space. (The formal derivations can be limited
to the region K < α, C < β, assuming that the parameters α, β, ρ satisfy appropriate
conditions. You may also provide an (informal) discussion of the optimal choices outside
this region, in which the usual assumptions of convexity are not satisfied.)

(c) To draw the phase diagram, one needs to keep in mind the role of parameters α and ρ.
But what is the role of β?

(d) The production function does not have constant returns to scale. This is a problem
(why?) if one wants to interpret the solution as a dynamic equilibrium of a market
economy. Show that for a certain g(L) the production function Y = F(K, L) = α K − g(L) K²
has constant returns to K and L in the relevant region. Also show that the solution
characterized above corresponds to the dynamic equilibrium of an economy endowed with an
amount L = 2 of a non-accumulated factor.

Exercise 43 Consider an economy in which output and accumulation satisfy

Y(t) = ln( L + K(t) ),   K̇(t) = s Y(t),

with L and s constant.

(a) Can this economy experience unlimited growth of consumption C(t) = (1 − s) Y(t)?
Explain why this may or may not be the case.

(b) Can the productive structure of this economy be decentralized to competitive firms?

Exercise 44 Consider an economy with a production function and a law of motion for
capital given by

Y(t) = L + L^{1−α} K(t)^α,   K̇(t) = Y(t) − C(t).

(a) Let 0 ≤ α ≤ 1. How are L and K(t) compensated if markets are competitive?
(b) Determine the growth rate of aggregate consumption C(t) if there is a fixed number
of identical consumers who maximize the same objective function,

U = ∫_0^∞ [ ( c(t)^{1−σ} − 1 ) / (1 − σ) ] e^{−ρt} dt,

where r(t) denotes the real interest rate on savings. Provide a brief discussion.

(c) Given the above assumptions, characterize graphically the dynamics of the economy in
the space (C, K) if α < 1, and calculate the steady state.

(d) What are the dynamics if α = 1? How do the income shares of the two factors evolve?
Discuss the realism of this model with reference to the empirical plausibility of the
balanced growth path.

Exercise 45 An economic system is endowed with a fixed amount of a production factor L.
Of this, L_Y units are employed in the production of final goods destined for
consumption and accumulation,

Y(t) = A(t) K^α L_Y^{1−α},   K̇(t) = Y(t) − C(t).

The remaining units of L are used to increase A(t) according to the following technology:

Ȧ(t) = ( L − L_Y ) A(t).

(a) Consider the case in which the propensity to save is equal to s. Characterize the
balanced growth path of this economy.

(b) What feature allows this economy to grow endogenously? What economic interpretation
can we give for the difference between K and A?

(c) Discuss the possibility of decentralizing production with the above technology if A,
K, and L are "rival" and "excludable" factors.

Exercise 46 Consider an economy in which output Y, capital K, and consumption C are
related as follows:

Y(t) = F( K(t), L ) = ( K(t)^γ + L^γ )^{1/γ},   K̇(t) = Y(t) − C(t) − δ K(t),

where L > 0, δ > 0, and γ ≤ 1 are fixed parameters.
    (a) Show that the production function has constant returns to scale.

    (b) Write the production function in the form y = f (k) for y ≡ Y /L and
    k ≡ K /L .
(c) Calculate the net rate of return on capital, r = f′(k) − δ, and show that in the
limit with k approaching infinity this rate tends to −δ if γ ≤ 0, and to 1 − δ if γ > 0.

(d) Denote net production by Ỹ ≡ Y − δK = F(K, L) − δK, and assume that C(t) = 0.5 Ỹ(t)
(aggregate consumption is equal to half of net income). What happens to consumption if
the economy approaches a steady state?

(e) If on the contrary consumption is chosen to maximize

U = ∫_0^∞ log( c(t) ) e^{−ρt} dt,

for which values of γ and ρ will there be endogenous growth?

Exercise 47 Consider an economy in which

Y(t) = K(t)^α L̄^β,   K̇(t) = P(t) s Y(t),

and in which the labor force is constant, and a fraction s of P(t) Y(t) is dedicated to
the accumulation of capital.

(a) Consider P(t) = P̄ (constant). For which values of α and β does there exist a steady
state in levels or in growth rates? For which values can we decentralize the production
decisions to competitive firms?
(b) Let P(t) = e^{ht}, where h > 0 is a constant. With α < 1, at which rate can Y(t) grow?

(c) How does the economy grow if on the contrary P(t) = K(t)^{1−α}?

(d) What does P(t) represent in this economy? How can we interpret the assumptions made
in (b) and (c)?

Exercise 48 Consider an economy in which all individuals maximize

U = ∫_0^∞ U(c(t)) e^{−ρt} dt,   with U(c) = 1 − 1/c and ρ = 1.

(a) Let r denote the return on private savings and determine the rate of growth of
consumption.

(b) Suppose that production utilizes private capital and labor according to
Y(t) = F(K, L, t) = B(t) L + 3K. Determine the per-unit incomes of L and K, denoted by
w(t) and r(t) respectively, if capital and labor are paid their marginal productivity.

(c) Suppose that L is constant, that K̇(t) = Y(t) − C(t), and that Ḃ(t) = B(t). Can
capital and production grow forever at the same rate as the optimal consumption?
Determine the relation between C(t), K(t), and B(t) along the balanced growth path.

(d) Suppose that at the aggregate level B(t) = K(t), but that factors are compensated on
the basis of their marginal productivity taking B(t) as given. Show that the resulting
decentralized growth rate is below the socially efficient growth rate.

FURTHER READING

This chapter offers a concise introduction to key notions within a subject treated much
more exhaustively by Grossman and Helpman (1991), Barro and Sala-i-Martin (1995), and
Aghion and Howitt (1998). Models of endogenous growth were originally formulated in
Romer (1986, 1990), Rebelo (1991), and other contributions that may be fruitfully read
once one is familiar with the technical aspects discussed here.

Blanchard and Fischer (1989, section 2.2) offer a concise discussion of how optimal
growth paths may be decentralized in competitive markets. For a discussion of general
equilibrium in more complex growth environments, readers are referred to Jones and
Manuelli (1990) and Rebelo (1991). These papers consider production technologies that
enable endogenous growth, and the optimal growth paths of these economies can be
decentralized as in the models of Sections 4.2.3 and 4.5.4. The model of Rebelo allows
for a distinction between investment goods and consumption goods. As a result, the
optimal production decisions may be decentralized even in the presence of
non-accumulated factors like L in this chapter. However, this requires that
non-accumulated factors be employed in the production of consumption goods only, and not
in the production of investment goods.

An extensive recent literature lets non-accumulated factors be employed in a
(labor-intensive) research and development sector, where endogenous growth is sustained
by learning by doing or informational spillover mechanisms of the type discussed in
Sections 4.2 and 4.3 above. McGrattan and Schmidtz (1999) offer a nice macro-oriented
introduction to the relevant insights. Romer (1990) and Grossman and Helpman (1991) are
key references in this literature. Grossman and Helpman (1991) offer fully dynamic
versions of the model with monopolistic competition introduced in the last section of
this chapter. The role of research and development is also treated in Barro and
Sala-i-Martin (1995), who discuss the role of government spending in the growth process,
an issue that was originally dealt with in Barro (1990).
As to empirical aspects, there is an extensive literature on the measurement of the
growth rate of the Solow residual; for a discussion of this issue see e.g. Maddison
(1987) or Barro and Sala-i-Martin (1995), chapter 10. Barro and Sala-i-Martin (1995) and
McGrattan and Schmidtz (1999) offer extensive reviews of recent empirical findings
regarding long-run economic growth phenomena. Briefly, the treatment of human capital as
an accumulated factor (as in Section 4.5.4 above) and careful measurement of government
interference with market interactions (as in Section 4.5.5 above) have both proven
crucial in interpreting cross-country income dynamics. More detailed and realistic
theoretical models than those offered by this chapter's stylized treatment have of
course proved empirically useful, especially as regards the government's role in
protecting investors' legal rights to the fruits of their efforts, and open-economy
aspects. Theoretical and empirical contributions have also paid well-deserved attention
to politico-economic tensions regarding all relevant policies' implications for growth
and distribution (see Bertola, 2000, and references therein), as well as to the role of
finite lifetimes in determining aggregate saving rates (see Blanchard and Fischer, 1989,
and Heijdra and van der Ploeg, 2002).

More generally, treatment of policy influences and market imperfections along the lines
of this chapter's argument is becoming more prominent in macroeconomic equilibrium
models. As noted by Solow (1999), much of the recent methodological progress on such
aspects was prompted by the need to allow for increasing returns to scale in endogenous
growth models, but the relevant insights have much wider applicability, and need not
play a particularly crucial role in explaining long-run growth phenomena.

REFERENCES

Aghion, P., and P. Howitt (1998) Macroeconomic Growth Theory, Cambridge, Mass.: MIT Press.
Barro, R. J. (1990) "Government Spending in a Simple Model of Endogenous Growth," Journal of Political Economy, 98, S103–S125.
Barro, R. J., and X. Sala-i-Martin (1995) Economic Growth, New York: McGraw-Hill.
Bertola, G. (2000) "Macroeconomics of Income Distribution and Growth," in A. B. Atkinson and F. Bourguignon (eds.), Handbook of Income Distribution, vol. 1, 477–540, Amsterdam: North-Holland.
Blanchard, O. J., and S. Fischer (1989) Lectures on Macroeconomics, Cambridge, Mass.: MIT Press.
Grossman, G. M., and E. Helpman (1991) Innovation and Growth in the Global Economy, Cambridge, Mass.: MIT Press.
Heijdra, B. J., and F. van der Ploeg (2002) Foundations of Modern Macroeconomics, Oxford: Oxford University Press.
Jones, L. E., and R. Manuelli (1990) "A Model of Optimal Equilibrium Growth," Journal of Political Economy, 98, 1008–1038.
Maddison, A. (1987) "Growth and Slowdown in Advanced Capitalist Economies," Journal of Economic Literature, 25, 649–698.
McGrattan, E. R., and J. A. Schmidtz, Jr (1999) "Explaining Cross-Country Income Differences," in J. B. Taylor and M. Woodford (eds.), Handbook of Macroeconomics, vol. 1A, 669–736, Amsterdam: North-Holland.
Rebelo, S. (1991) "Long-Run Policy Analysis and Long-Run Growth," Journal of Political Economy, 99, 500–521.
Romer, P. M. (1986) "Increasing Returns and Long-Run Growth," Journal of Political Economy, 94, 1002–1037.
Romer, P. M. (1990) "Endogenous Technological Change," Journal of Political Economy, 98, S71–S102.
Solow, R. M. (1956) "A Contribution to the Theory of Economic Growth," Quarterly Journal of Economics, 70, 65–94.
Solow, R. M. (1999) "Neoclassical Growth Theory," in J. B. Taylor and M. Woodford (eds.), Handbook of Macroeconomics, vol. 1A, 637–667, Amsterdam: North-Holland.

5 Coordination and Externalities in Macroeconomics

As we saw in Chapter 4, externalities play an important role in endogenous growth theory. Many recent contributions have explored the relevance of similar phenomena in other macroeconomic contexts. In general, aggregate equilibria based on microeconomic interactions may differ from those mediated by the equilibrium of a perfectly competitive market in which agents take prices as given. If every agent correctly solves her own individual problem, taking into consideration the actions of all other agents rather than the equilibrium price, then nothing guarantees that the resulting equilibrium is efficient at the aggregate level. Uncoordinated "strategic" interactions may thus play a crucial role in many modern macroeconomic models with micro foundations.

In this chapter we begin by considering the relationship between the externalities that each agent imposes on other individuals in the same market and the potential multiplicity of equilibria, first in an abstract trade setting (Section 5.1) and then in a simple monetary economy (Section 5.2). (The appendix to this chapter describes a general framework for the analysis of the relationship between externalities, strategic interactions, and the properties of multiplicity and efficiency of the aggregate equilibria.) Then we study a labor market characterized by a (costly) process of search on the part of firms and workers. This setting extends the analysis of the dynamic aspects of labor markets of Chapter 3, focusing on the flows into and out of unemployment. Attention to labor market flows is motivated by their empirical relevance: even in the absence of changes in the unemployment rate, job creation and job destruction occur continuously, and the reallocation of workers often involves periods of frictional unemployment. The stylized "search and matching" modeling framework introduced below is realistic enough to offer empirically sensible insights, reviewed briefly in the "Further Readings" section at the end of the chapter. We formally analyze determination of the steady state equilibrium in Section 5.3 and the dynamic adjustment process in Section 5.4. Finally, Section 5.5 characterizes the efficiency implications of externalities in labor market search activity.

5.1. Trading Externalities and Multiple Equilibria

This section analyzes a basic model where the nature of interactions among individuals creates a potential for multiple equilibria. These equilibria are characterized by different levels of "activity" (employment, production) in the economy. The model presented here is based on Diamond (1982a) and features a particular type of externality among agents operating in a given market: the larger the number of potential trading partners, the higher the probability that an agent will make a profitable trade (trading externality). Markets with a high number of participants thus attract even more agents, which reinforces their characteristic as a "thick" market, while "thin" markets with a low number of participants remain locked in an inferior equilibrium.
5.1.1. STRUCTURE OF THE MODEL

The economy is populated by a high number of identical and infinitely lived individuals, who engage in production, trade, and consumption activities. Production opportunities are created stochastically according to a Poisson distribution, whose parameter a defines the instantaneous probability of the creation of a production opportunity. At each date t₀, the probability that no production opportunity is created before date t is given by e^{−a(t−t₀)} (and the probability that at least one production opportunity is created within this time interval is thus given by 1 − e^{−a(t−t₀)}). This probability depends only on the length of the time interval t − t₀ and not on the specific date t₀ chosen. The probability that a given agent receives a production opportunity between t₀ and t is therefore independent of the distribution of production prior to t₀.³⁹

All production opportunities yield the same quantity of output y, but they differ according to the associated cost of production. This cost is defined by a random variable c, with distribution function G(c) defined on c ≥ c̲ > 0, where c̲ represents the minimum cost of production. Trade is essential in the model, because goods obtained from exploiting a production opportunity cannot be consumed directly by the producer. This assumption captures in a stylized way the high degree of specialization of actual production processes, and it implies that agents need to engage in trade before they can consume. At each moment in time, there are thus two types of agent in the market:

1. There are agents who have exploited a production opportunity and wish to exchange its output for a consumption good: the fraction of agents in this state is denoted by e, which can be interpreted as a "rate of employment," or equivalently as an index of the intensity of production effort.

2. There are agents who are still searching for a production opportunity: the corresponding fraction 1 − e can be interpreted as the "unemployment rate."

³⁹ The stochastic process therefore has the Markov property and is completely memoryless. In a more general model, a may be assumed to be variable. The function a(t) is known as the hazard function.
    Like production opportunities, trade opportunities also occur stochasti-
    cally, but their frequency depends on the share of “employed” agents: the
    probability intensity of arrivals per unit of time is not a constant, like the a
    parameter introduced above, but a function b(e ), with b(0) = 0 and b′(e ) > 0.
    The presence of a larger number of employed agents in the market increases
    the probability that each individual agent will find a trading opportunity. This
    property of the trading technology is crucial for the results of the model and its
    role will be highlighted below.
Consumption takes place immediately after agents exchange their goods. The instantaneous utility of an agent is linear in consumption (y) and in the cost of production (−c), and the objective of maximizing behavior is

$$V = E\left[\sum_{i=1}^{\infty}\left(-e^{-r t_i}\, c + e^{-r (t_i + \tau_i)}\, y\right)\right],$$

where r is the subjective discount rate of future consumption, the sequence of times {tᵢ} denotes dates when production takes place, and {τᵢ} denotes the interval between such dates and those when consumption and trade take place. Since production and trade opportunities are random, both {tᵢ} and {τᵢ} are uncertain, and the agent maximizes the expected value of discounted utility flows.
    To maximize V , the agent needs to adopt an optimal rule to decide whether
    or not to exploit a production opportunity. This decision is based on the cost
    that is associated with each production opportunity or, equivalently, on the
    effort that a producer needs to exert to exploit the production opportunity.
    The agent chooses a critical level for the cost c ∗, such that all opportunities
    with a cost level equal to c ≤ c ∗ are exploited, while those with a cost level
    c > c ∗ are refused.
    To solve the model, we need to determine this critical value c ∗ and the
    dynamic path of the level of activity or “employment” e .
    5.1.2. SOLUTION AND CHARACTERIZATION
    To study the behavior of the economy outlined above, we first derive the
    equations that describe the dynamics of the level of activity (employment) e
    and the critical value of the costs c ∗ (the only choice variable of the model).

    The evolution of employment is determined by the difference between the
    flow into and out of employment. The first is equal to the fraction of the
    unemployed agents that receive and exploit a production opportunity: this
    fraction is equal to (1 − e )a G (c ∗). The flow out of employment is equal to
    the fraction of employed agents who find a trading opportunity and who thus,
    after consumption, return to the pool of unemployed. This fraction is equal to
    e b(e ). The assumption b′(e ) > 0 that was introduced above now has a clear
    interpretation in terms of the increasing returns to scale in the process of trade.
    Calculating the elasticity of the flow out of employment e b(e ) with respect to
    the rate of employment e , we get
$$\varepsilon = 1 + \frac{e\, b'(e)}{b(e)},$$
    which is larger than one if b′(e ) > 0 (implying increasing returns in the
    trading technology). In other words, a higher rate of activity increases the
    probability that an employed agent will meet a potential trading partner.
    Given the expressions for the flows into and out of employment, we can
    write the following law of motion for the employment rate:
$$\dot e = (1 - e)\, a\, G(c^*) - e\, b(e). \tag{5.1}$$
    In a steady state of the system the two flows exactly compensate each other,
    leaving e constant. The following relation between the steady-state value of
    employment and the critical cost level c ∗ therefore holds:
$$(1 - e)\, a\, G(c^*) = e\, b(e) \;\Rightarrow\; \left.\frac{de}{dc^*}\right|_{\dot e = 0} = \frac{(1 - e)\, a\, G'(c^*)}{b(e) + e\, b'(e) + a\, G(c^*)} > 0. \tag{5.2}$$

Figure 5.1. Stationarity loci for e and c*

    A rise in c ∗ increases the flow into employment, since it raises the share of
    production opportunities that agents find attractive, and thus determines a
    higher steady-state value for e , as depicted in the left-hand panel of Figure 5.1.
    For points that are not located on the locus of stationarity, the dynamics of
    employment are determined by the effect of e on ė : according to (5.1), a higher
    value for e reduces ė , as is also indicated by the direction of the arrows in the
    figure.
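As a purely illustrative check of (5.2), the ė = 0 locus can be traced numerically once functional forms are chosen. The short sketch below assumes a uniform cost distribution on [c̲, c_high] and b(e) = b₁√e (so that b(0) = 0, b′(e) > 0, b″(e) < 0); these functional forms and all parameter values are arbitrary choices for the example, not part of the model as presented in the text.

```python
import numpy as np
from scipy.optimize import brentq

# Illustrative parameters and functional forms (assumptions, not taken from the text)
a, b1 = 0.8, 0.6            # arrival rate of production opportunities, scale of b(e)
c_low, c_high = 0.3, 1.0    # support of the uniform cost distribution G

def G(c):
    """Uniform c.d.f. of production costs on [c_low, c_high]."""
    return np.clip((c - c_low) / (c_high - c_low), 0.0, 1.0)

def b(e):
    """Probability of finding a trading partner, increasing and concave in employment e."""
    return b1 * np.sqrt(e)

def e_steady(c_star):
    """Solve (1 - e) a G(c*) = e b(e) for the steady-state employment rate, as in (5.2)."""
    f = lambda e: (1 - e) * a * G(c_star) - e * b(e)
    return brentq(f, 1e-12, 1 - 1e-12)

for c_star in (0.4, 0.6, 0.9):
    print(f"c* = {c_star:.1f}  ->  steady-state e = {e_steady(c_star):.3f}")
```

For each higher threshold c*, the implied steady-state employment rate is larger, in line with the positive slope in (5.2).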
    In order to determine the production cost below which it is optimal to
    exploit the production opportunity, agents compare the expected discounted
    value of utility in the two states: employment (the agent has produced the
    good and is searching for a trading partner) and unemployment (the agent is
    looking for a production opportunity with sufficiently low cost). The value of
    the objective function in the two states is denoted by E and U , respectively.
These values depend on the path of employment e and thus vary over time;
    however, if we limit attention to steady states for a moment, then E and
    U are constant over time ( Ė = 0 and U̇ = 0). The relationships that tie the
    values of E and U can be derived by observing that the flow utility from
    employment (r E ) needs to be equal to utility of consumption y, which occurs
    with probability b(e ), plus the expected value of the ensuing change from
    employment to unemployment:
$$rE = b(e)\, y + b(e)\, (U - E). \tag{5.3}$$
    There is a clear analogy with the pricing of financial assets (which yield
    periodic dividends and whose value may change over time), if we interpret
    the left-hand side of (5.3) as the flow return (opportunity cost) that a risk-
    neutral investor demands if she invests an amount E in a risk-free asset with
    return r . The right-hand side of the equation contains the two components
    of the flow return on the alternative activity “employment”: the expected
    dividend derived from consumption, and the expected change in the asset
    value resulting from the change from employment to unemployment. This
    interpretation justifies the term “asset equations” for expressions like (5.3)
    and (5.4).
    Similarly, the flow utility from unemployment comprises the expected value
    from a change in the state (from unemployment to employment) which
    occurs with probability a G (c ∗) whenever the agent decides to produce; and
    the expected cost of production, equal to the rate of occurrence of a pro-
    duction opportunity a times the average cost (with a negative sign) of the
    production opportunities that have a cost below c ∗ and are thus realized.

The corresponding asset equation is therefore given by

$$rU = a\, G(c^*)(E - U) - a \int_{\underline{c}}^{c^*} c \, dG(c) = a \int_{\underline{c}}^{c^*} (E - U - c)\, dG(c), \tag{5.4}$$

where $G(c^*) \equiv \int_{\underline{c}}^{c^*} dG(c)$.
Equations (5.3) and (5.4) can be derived more rigorously using the principle of dynamic programming which was introduced in Chapter 1. In the following we consider a discrete time interval Δt, from t = 0 to t = t₁, and we keep e constant. Moreover, we assume that an agent who finds a trading opportunity and returns to the pool of unemployed does not find a new production opportunity in the remaining part of the interval Δt. Given these assumptions, we can express the value of employment at the start of the interval as follows:
$$E = \int_0^{t_1} b\, e^{-bt}\, e^{-rt}\, y \, dt + e^{-r\Delta t}\left[e^{-b\Delta t} E + (1 - e^{-b\Delta t})\, U\right], \tag{5.5}$$

where the dependence of b on e is suppressed to simplify notation. The first term on the right-hand side of (5.5) is the expected utility from consumption during the interval, which is discounted to t = 0. (Remember that e^{−bt} defines the probability that no trading opportunity arrives before date t.) The second term defines the expected (discounted) utility that is obtained at the end of the interval at t = t₁. At this date, the agent may be either still "employed," having not had a chance to exchange the produced good (which occurs with probability e^{−bΔt}), or "unemployed," after having traded the good (which occurs with complementary probability 1 − e^{−bΔt}).⁴⁰ Solving the integral in (5.5) yields

$$E = \frac{b}{b + r}\left(1 - e^{-(r+b)\Delta t}\right) y + e^{-r\Delta t}\left[e^{-b\Delta t} E + (1 - e^{-b\Delta t})\, U\right] = \frac{b}{r + b}\, y + \frac{e^{-r\Delta t}\left(1 - e^{-b\Delta t}\right)}{1 - e^{-(r+b)\Delta t}}\, U. \tag{5.6}$$
Taking the limit of (5.6) for Δt → 0 and applying l'Hôpital's rule to the second term, so that

$$\lim_{\Delta t \to 0} \frac{-r\, e^{-r\Delta t}(1 - e^{-b\Delta t}) + b\, e^{-r\Delta t}\, e^{-b\Delta t}}{(r + b)\, e^{-(r+b)\Delta t}} = \frac{b}{r + b},$$
    ⁴⁰ Since we limit attention to steady-state outcomes in which e is constant, E and U are also constant
    over time. As a result, there is no difference between the values at the beginning and at the end of the
    time interval.

we get the asset equation for E which was already formulated in (5.3):

$$E = \frac{b}{r + b}\, y + \frac{b}{r + b}\, U \;\Rightarrow\; rE = b\, y + b\,(U - E).$$
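The convergence behind this limit can also be verified numerically: holding U fixed and evaluating (5.6) for progressively shorter intervals, the discrete-time value approaches the continuous-time asset equation. The parameter values below are arbitrary illustrations, not taken from the text.

```python
import numpy as np

# Illustrative parameters (assumptions, not values from the text)
r, b, y, U = 0.05, 0.4, 1.0, 2.0

def E_discrete(dt):
    """Value of employment from (5.6) over an interval of length dt, holding U fixed."""
    return (b / (r + b)) * y + np.exp(-r * dt) * (1 - np.exp(-b * dt)) / (1 - np.exp(-(r + b) * dt)) * U

for dt in (1.0, 0.1, 0.01, 0.001):
    E = E_discrete(dt)
    print(f"dt = {dt:7.3f}   rE = {r * E:.6f}   b*y + b*(U - E) = {b * y + b * (U - E):.6f}")
# As dt -> 0 the two columns converge, reproducing the asset equation rE = b y + b(U - E).
```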
    Similar arguments can be used to derive the second asset equation in (5.4).
    The critical value c ∗ is set in order to maximize E and U .
In the optimum, therefore, the following first-order conditions hold:

$$\frac{\partial E}{\partial c^*} = \frac{\partial U}{\partial c^*} = 0.$$
The derivative of the value of "unemployment" with respect to the threshold cost level c* can be obtained from (5.4) using Leibnitz's rule,⁴¹

$$\frac{d}{db}\int_a^b f(z)\,dz = f(b).$$
In our case, f(z) = (E − U − z)(dG/dz). Differentiating (5.4) with respect to c* and equating the resulting expression to zero yields

$$r\,\frac{\partial U}{\partial c^*} = a\,(E - U - c^*)\, G'(c^*) = 0 \;\Rightarrow\; c^* = E - U. \tag{5.7}$$
    In words, whoever is unemployed (searching for a production opportunity) is
    willing to bear a cost of production that is at most equal to the gain, in terms
    of expected utility, from exploiting a production opportunity to move from
    unemployment to “employment.” Now, subtracting (5.4) from (5.3), we get
$$r(E - U) = b(e)\, y - b(e)(E - U) - a\, G(c^*)(E - U) + a \int_{\underline{c}}^{c^*} c \, dG(c). \tag{5.8}$$
    Using (5.8) we can now derive the equation for the stationary value of c ∗,
    which expresses c ∗ as a function of e . Writing
$$E - U = c^* = \frac{b(e)\, y + a \int_{\underline{c}}^{c^*} c\, dG(c)}{r + b(e) + a\, G(c^*)}, \tag{5.9}$$
⁴¹ In general, the definition of an integral implies

$$\frac{d}{dx}\int_{a(x)}^{b(x)} f(z; x)\,dz = \int_{a(x)}^{b(x)} \frac{\partial f(z; x)}{\partial x}\,dz + b'(x)\, f(b(x)) - a'(x)\, f(a(x))$$

(Leibnitz's rule). Intuitively, the area below the curve of f(·) and between the points a(·) and b(·) is equal to the integral of the derivative of f(·) over the interval. Moreover, an increase in the upper limit increases this area in proportion to f(b(x)), while an increase in the lower limit decreases it in proportion to f(a(x)).

rearranging to

$$b(e)\, y + a \int_{\underline{c}}^{c^*} c\, dG(c) = \left(r + b(e) + a\, G(c^*)\right) c^*,$$

and differentiating, we find that the slope of the locus of stationarity (5.9) is

$$\left.\frac{dc^*}{de}\right|_{\dot c^* = 0} = \frac{b'(e)\,(y - c^*)}{r + b(e) + a\, G(c^*)}. \tag{5.10}$$
    The sign of this derivative is positive since y > c ∗ (agents accept only those
    production possibilities with a cost below the value of output) and b′(e ) > 0.
    Notice also that if e = 0 no trade ever takes place. (There are no agents with
    goods to offer.) In this case, agents are indifferent between employment and
    unemployment and there is no incentive to produce: c ∗ = E − U = 0. Finally,
if we assume that b″(e) < 0, one can show that d²c*/de² < 0. Hence, the function that represents the locus of stationarity is strictly concave, and the locus of stationarity, which is drawn in the right-hand panel of Figure 5.1, starts in the origin and increases at a decreasing rate.

The positive sign of dc*/de|_{ċ*=0} implies that there exists a strategic complementarity between the actions of individual agents. The concept of strategic complementarity is formally introduced in the appendix to this chapter. Intuitively, it implies that the actions of one agent increase the payoffs from action for all other agents; expressed in terms of the model studied here, the higher the fraction of employed agents, the more likely each individual agent will find a trading partner. This induces agents to increase the threshold for acceptance of production opportunities. At the aggregate level, therefore, the optimal individual response implies a more than proportional increase in the level of activity.

To determine the dynamics of c*, we need to remember that the equilibrium relations (5.3) and (5.4) are obtained on the basis of the assumption that E and U are constant over time. In general, however, these values will depend on the path of employment e. In that case, we need to add the terms Ė = ė ∂E(·)/∂e and U̇ = ė ∂U(·)/∂e to the right-hand sides of (5.3) and (5.4), respectively, yielding:

$$r E(\cdot) = \frac{\partial E(\cdot)}{\partial e}\,\dot e + b(e)\left(y - E(\cdot) + U(\cdot)\right) \tag{5.11}$$

$$r U(\cdot) = \frac{\partial U(\cdot)}{\partial e}\,\dot e + a \int_{\underline{c}}^{c^*} \left(E(\cdot) - U(\cdot) - c\right) dG(c). \tag{5.12}$$

In terms of asset equations, Ė and U̇ represent the "capital gains" that, together with the flow utility, give the "total returns" rE and rU.

Now, subtracting (5.12) from (5.11), and noting from (5.7) that

$$\dot c^* = \dot E - \dot U = \left(\frac{\partial E(\cdot)}{\partial e} - \frac{\partial U(\cdot)}{\partial e}\right)\dot e,$$

we can derive the expression for the dynamics of c*:

$$\dot c^* = r\, c^* - b(e)\,(y - c^*) + a \int_{\underline{c}}^{c^*} (c^* - c)\, dG(c). \tag{5.13}$$

Moreover, if we assume that ċ* = 0, we obtain exactly (5.9). Since

$$\frac{\partial \dot c^*}{\partial c^*} = r + b(e) + a\, G(c^*) > 0,$$
    the response of ċ ∗ to c ∗ is positive, as shown by the direction of the arrows in
    Figure 5.1. We are now in a position to analyze the possible equilibria of the
    economy, and we can make the interpretation of individual behavior in terms
    of the strategic complementarity more explicit.
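Under the same illustrative functional forms used in the earlier sketch (uniform costs on [c̲, c_high], b(e) = b₁√e), the two stationarity loci (5.2) and (5.9) can be traced and their intersections located numerically. With the particular numbers assumed here the search returns a low-activity and a high-activity crossing in addition to the shut-down equilibrium at the origin; other parameter choices may deliver fewer interior equilibria.

```python
import numpy as np
from scipy.optimize import brentq

# Illustrative functional forms and parameters (assumptions, not from the text):
# uniform cost distribution on [c_low, c_high], b(e) = b1*sqrt(e), output per opportunity y.
a, b1, r, y = 0.8, 0.6, 0.2, 1.0
c_low, c_high = 0.3, 1.0
width = c_high - c_low

b = lambda e: b1 * np.sqrt(e)
G = lambda c: np.clip((c - c_low) / width, 0.0, 1.0)
intc = lambda c: (np.clip(c, c_low, c_high) ** 2 - c_low ** 2) / (2 * width)  # integral of c dG(c)

def cstar_edot0(e):
    """c* on the e-dot = 0 locus, from (1 - e) a G(c*) = e b(e)."""
    g = e * b(e) / ((1 - e) * a)
    return c_low + width * g if g <= 1 else np.nan

def cstar_cdot0(e):
    """c* on the c*-dot = 0 locus, eq. (5.9)."""
    f = lambda c: c - (b(e) * y + a * intc(c)) / (r + b(e) + a * G(c))
    return brentq(f, 1e-9, y)

grid = np.linspace(0.005, 0.95, 2000)
F = np.array([cstar_cdot0(e) - cstar_edot0(e) for e in grid])
for i in range(len(grid) - 1):
    if np.isfinite(F[i]) and np.isfinite(F[i + 1]) and F[i] * F[i + 1] < 0:
        e_eq = brentq(lambda e: cstar_cdot0(e) - cstar_edot0(e), grid[i], grid[i + 1])
        print(f"interior equilibrium: e = {e_eq:.3f}, c* = {cstar_cdot0(e_eq):.3f}")
```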
    First of all, given the shape of the two loci of stationarity, there may be mul-
    tiple equilibria. The origin (c ∗ = e = 0) is always an equilibrium of the system.
    In this case the economy has zero activity (shut-down equilibrium). If there are
    more equilibria, then we may have the situation depicted in Figure 5.2. In this
    case there are two additional equilibria: E 1, in which the economy has a low
    level of activity, and E 2, with a high level of activity.
    Figure 5.2. Equilibria of the economy

    Graphically, the direction of the arrows in Figure 5.2 implies that the system
    can settle in the equilibrium with a high level of activity only if it starts from
    the regions to the north-east or the south-west of E 2. As in the continuous-
    time models analyzed in Chapters 2 and 4, the dynamics are therefore charac-
    terized by a saddlepath. Also drawn in the figure is a saddlepath that leads to
    equilibrium in the origin; finally, there is an equilibrium with low (but non-
    zero) activity. For a formal analysis of the dynamics we linearize the system of
    dynamic equations (5.1) and (5.13) around a generic equilibrium (ē, c̄ ∗).
In matrix notation, this linearized system can be expressed as follows:

$$\begin{pmatrix} \dot e \\ \dot c^* \end{pmatrix} = \begin{pmatrix} -\left(a G(\bar c^*) + b(\bar e) + \bar e\, b'(\bar e)\right) & (1 - \bar e)\, a\, G'(\bar c^*) \\ -b'(\bar e)\,(y - \bar c^*) & r + b(\bar e) + a\, G(\bar c^*) \end{pmatrix} \begin{pmatrix} e - \bar e \\ c^* - \bar c^* \end{pmatrix} \equiv \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} e - \bar e \\ c^* - \bar c^* \end{pmatrix}, \quad \text{where } \alpha, \gamma < 0;\; \beta, \delta > 0. \tag{5.14}$$
If in a given equilibrium the curve ė = 0 is steeper than ċ* = 0, then this equilibrium is a saddlepoint, as in the case of E₂. Formally, we need to verify the following condition:

$$\det\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \alpha\delta - \beta\gamma < 0.$$

This can be rewritten as

$$-\frac{\alpha}{\beta} > -\frac{\gamma}{\delta},$$

where −α/β is the slope of the curve ė = 0 and −γ/δ is the slope of the curve ċ* = 0. In contrast, at E₁ the relationship between the steepness of the two curves is reversed and the determinant of the matrix is positive. Such an equilibrium is called a node. The trace of the matrix is α + δ = r − ē b′(ē): whether the node is stable or unstable depends on its sign. This in turn depends on the specific values of r and ē and on the properties of the function
    b(·). The existence of a strategic complementarity, arising from the trading
    externality implied by the assumption that b′(e ) > 0, has thus resulted in
    multiple equilibria.
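Continuing the illustrative example, the Jacobian matrix in (5.14) can be evaluated at the two interior stationary points found by the previous sketch (their coordinates below are approximate and, like all parameter values, assumptions for the example): the determinant and trace then classify each equilibrium along the lines just described.

```python
import numpy as np

# Same illustrative forms as in the previous sketches: uniform G on [c_low, c_high], b(e) = b1*sqrt(e).
# The two (e, c*) pairs below are the approximate interior equilibria found there (assumed values).
a, b1, r, y = 0.8, 0.6, 0.2, 1.0
c_low, c_high = 0.3, 1.0
width = c_high - c_low

def jacobian(e, c):
    """Matrix of the linearized system (5.14), evaluated at a stationary point (e, c*)."""
    G, Gp = (c - c_low) / width, 1.0 / width
    b, bp = b1 * np.sqrt(e), b1 / (2 * np.sqrt(e))
    alpha = -(a * G + b + e * bp)
    beta = (1 - e) * a * Gp
    gamma = -bp * (y - c)
    delta = r + b + a * G
    return np.array([[alpha, beta], [gamma, delta]])

for e_bar, c_bar in [(0.021, 0.302), (0.452, 0.590)]:
    J = jacobian(e_bar, c_bar)
    det, tr = np.linalg.det(J), np.trace(J)
    label = "saddlepoint (det < 0)" if det < 0 else "node or focus (det > 0), " + ("unstable" if tr > 0 else "stable")
    print(f"(e, c*) = ({e_bar}, {c_bar}):  det = {det:.2f}, trace = {tr:.2f}  ->  {label}")
```

With these numbers the low-activity point comes out as an unstable non-saddle equilibrium and the high-activity point as a saddlepoint, consistent with the discussion of E₁ and E₂ above.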
    A low level of employment induces agents to accept only few production
    opportunities (c ∗ is low) and in equilibrium the economy is characterized by
    a low level of activity. If, on the contrary, employment is high, each agent
    will accept many production opportunities and this allows the economy to
    maintain an equilibrium with a high level of activity. Finally, it is important
    to note that agents’ expectations play a crucial role in the selection of the
    equilibrium. Looking at point e 0 in Figure 5.2, it is clear that there exist values
    of e for which the economy can either jump to the saddlepath that leads to the
    “inferior” equilibrium (the origin), or to the one that leads to the equilibrium
with a high level of activity. Which of these two possibilities is actually realized
    depends on the beliefs of agents. If agents are “optimistic” (i.e. if they expect
    a high level of activity and thus a convergence to the equilibrium at E 2),
    then they choose a value of c ∗ on the higher saddlepath, while if they are
    “pessimistic” (and anticipate convergence to the origin), they choose a point
    on the lower saddlepath.
    5.2. A Search Model of Money
    The stylized Diamond model of the previous section represents a situation
    where heterogeneous tastes and specialization in production force agents to
    trade in order to consume. Unlike Robinson Crusoe, the economic agents
    of the model cannot consume their own production: in the original article,
    Diamond (1982a ) outlines how the economic decisions and interactions of his
    model could be applicable to a tropical island where a religious taboo prevents
    each of the natives from eating fruit he has picked. And, since trade occurs on
    a bilateral basis, rather than in a competitive auctioneered market, the econ-
    omy’s general equilibrium cannot be viewed as a representative-agent welfare
    maximization problem of the type that is sometimes discussed in terms of
    Robinson Crusoe’s activities in undergraduate microeconomics textbooks.
    The insights are qualitatively relevant in many realistic settings. In particu-
    lar, whenever trade does not occur simultaneously in a frictionless centralized
    market, a potential role arises for a “medium of exchange”—an object that is
    accepted in a trade not to be directly consumed or used in production, but
    only to be exchanged in future trades. It would certainly be inconvenient
    for the authors of this book to carry copies of it into stores selling groceries
    they wish to consume, hoping that the owner might be interested in learn-
    ing advanced macroeconomic techniques. In reality, of course, authors and
    publishers exchange books for money, and money for groceries. So, money’s
    medium-of-exchange role facilitates exchanges of goods and, ultimately, con-
    sumption. The model presented in this section, a simplified version of that
    in Kiyotaki and Wright (1993), formalizes the use of money as a medium
    of exchange. As in the Diamond model of the previous section, strategic
    interaction among individuals is crucial in determining the equilibrium out-
    come. Moreover, different equilibria (characterized by different degrees of
    acceptability of money in the exchange process) may arise, depending on the
    particular traders’ beliefs: again, agents’ expectations are self-fulfilling.
    5.2.1. THE STRUCTURE OF THE ECONOMY
    Consider an economy populated by a large number of infinitely lived agents.
    There is also a large number of differentiated and costlessly storable consump-
tion goods, called commodities, coming in indivisible units. Agents differ as to
    their preferences for commodities: each individual “likes” (and can consume)
    only a fraction 0 < x < 1 of the available commodities. The same exogenous parameter x denotes the fraction of agents that like any given commodity. Production occurs only jointly with consumption: when an agent consumes one unit of a commodity in period t , he immediately produces one unit of a different good, which becomes his endowment for the next period t + 1. The utility obtained from consumption, net of any production cost, is U > 0.
    As in Diamond’s model of Section 5.1, we assume that commodities cannot
    be consumed directly by the producer: this motivates the need for agents to
    engage in a trading activity before being able to consume.
    In the economy, besides commodities, there is also a certain amount of
    costlessly storable fiat money, coming in indivisible units as well as the com-
    modities. Fiat money has two distinguishing features: it has no intrinsic value
    (it does not yield any utility in consumption and cannot be used as a produc-
    tion input), and it is inconvertible into commodities having intrinsic worth.
Initially, an exogenously given fraction 0 < M < 1 of the agents are each endowed with one unit of money, whereas 1 − M are each endowed with one unit of a commodity. We can now describe how agents in the economy behave during any given period t, in which a fraction M of them are money holders and a fraction 1 − M are commodity holders.

• A money holder will try to exchange money for a consumable commodity. For this to happen, two conditions must jointly be fulfilled: (i) she must meet an agent holding a commodity she "likes" (since only a fraction x of all commodities can be consumed by each agent), and (ii) the commodity holder must be willing to accept money in exchange for the consumption good. Only when these two conditions are met does trade take place: the money holder exchanges her unit of money for a commodity that she consumes enjoying utility U; she then immediately produces one unit of a different commodity (that she "dislikes"), and will start the next period as a commodity holder. If, on the contrary, trade does not occur, she will carry money over to the next period.

• A commodity holder will also try to exchange his endowment for a commodity he "likes." For this to happen, he must meet another commodity holder and both must be willing to trade (i.e. each agent must "like" the commodity he would receive in the exchange). Exchanges of commodities for commodities occur only if they are mutually agreeable, and therefore both goods are consumed after trade.⁴² It is also possible that a commodity holder meets a money holder who "likes" his particular commodity; if trade takes place, then the agent starts the next period as a money holder.

⁴² The introduction of an arbitrarily small transaction cost paid by the receiver can rule out the possibility that an agent agrees to receive in a trade a commodity he cannot consume.

The artificial economy here described highlights the different degree of acceptability of commodities and fiat money. Each consumption good will always be accepted in exchange by some agents, whereas money will be accepted only if agents expect to trade it in the future in exchange for consumable goods.

A final assumption concerns the meeting technology generating the agents' trading opportunities. Agents meet pairwise and at random; in each period an agent meets another with a constant probability 0 < β ≤ 1.

5.2.2. OPTIMAL STRATEGIES AND EQUILIBRIA

Each agent chooses a trading strategy in order to maximize the expected discounted utility from consumption. A trading strategy is a rule allowing the agent to decide whether to accept a commodity or money in exchange for what he is offering (either a commodity or money). The optimal trading strategy is obtained by solving the utility maximization problem, taking as given the strategies of other traders: this is the agent's optimal response to other traders' strategies. When all optimal strategies are mutually consistent, a Nash equilibrium configuration arises. We focus attention on symmetric and stationary equilibria, that is, on situations where all agents follow the same time-invariant strategies. In equilibrium, agents exchange commodities for other commodities only when both traders can consume the good they receive, whereas fiat money is used only if it has a "value." Such a value depends on its acceptability, which is not an intrinsic property of money but is determined endogenously in equilibrium.
The agent's strategy is defined by the following rule of behavior: when a meeting occurs, the agent accepts a commodity only if he or she "likes" it (then with probability x), and he or she accepts money in exchange with probability π when other agents accept money with probability Π. The agent must choose π as the best response to the common strategy of other agents, Π. To this end, at the beginning of period t he or she compares the payoffs (in terms of expected utility) from holding money and from holding a commodity, which we call VM(t) and VC(t) respectively. For a money holder, the payoff is equal to

$$V_M(t) = \frac{1}{1+r}\Big\{(1-\beta)\,V_M(t+1) + \beta\big[(1-M)\,x\,\Pi\,\big(U + V_C(t+1)\big) + \big(1-(1-M)\,x\,\Pi\big)\,V_M(t+1)\big]\Big\}, \tag{5.15}$$

where r is the rate of time preference. If a meeting does not occur (with probability 1 − β) the agent will end period t holding money with a value VM(t + 1), whereas if a meeting does occur (with probability β) she will end the period with an expected payoff given by the term in square brackets on the right-hand side of (5.15). If the agent meets a commodity holder who is offering a good that she "likes" and is willing to accept money, the exchange can take place and the payoff is the sum of the utility from consumption U and the value of the newly produced commodity VC(t + 1). This event occurs with probability (1 − M)xΠ. With the remaining probability, 1 − (1 − M)xΠ, trade does not take place and the agent's payoff is simply VM(t + 1). For a commodity holder, the payoff is

$$V_C(t) = \frac{1}{1+r}\Big\{(1-\beta)\,V_C(t+1) + \beta\big[(1-M)\,x^2\,U + \pi\,M\,x\,V_M(t+1) + \big(1-\pi\,M\,x\big)\,V_C(t+1)\big]\Big\}. \tag{5.16}$$

Again, the term in square brackets gives the expected payoff if a meeting occurs and is the sum of three terms. The first is utility from consumption U, which is enjoyed only if the agent meets a commodity holder and both like each other's commodity (a "double coincidence of wants" situation), so that a barter can take place; the probability of this event is (1 − M)x². The second term is the payoff from accepting money in exchange for the commodity, yielding a value VM(t + 1): this trade occurs only if the agent is willing to accept money (with probability π) and meets a money holder who is willing to receive the commodity he offers (with probability Mx). The third term is the payoff from ending the period with a commodity, which happens in all cases except for trade with a money holder, so occurs with probability 1 − πMx.

To derive the agent's best response, we focus on equilibria in which all agents choose the same strategy, whereby π = Π, and payoffs are stationary, so that VM(t) = VM(t + 1) ≡ VM and VC(t) = VC(t + 1) ≡ VC. Using these properties in (5.15) and (5.16), multiplying by (1 + r), and rearranging terms we get

$$r V_M = \beta\,\big\{(1-M)\,x\,\Pi\,U + (1-M)\,x\,\Pi\,(V_C - V_M)\big\}, \tag{5.17}$$

$$r V_C = \beta\,\big\{(1-M)\,x^2\,U + M\,x\,\Pi\,(V_M - V_C)\big\}. \tag{5.18}$$

Expressed in this form, (5.17) and (5.18) are readily interpreted as asset valuation equations. The left-hand side represents the flow return from investing in a risk-free asset. The right-hand side is the flow return from holding either money or a commodity and includes the expected utility from consumption (the "dividend" component) as well as the expected change in the value of the asset held (the "capital gains" component). Finally, subtracting (5.17) from (5.18), we obtain

$$V_C - V_M = \frac{\beta\,(1-M)\,x\,U}{r + \beta\,x\,\Pi}\,(x - \Pi). \tag{5.19}$$

The sign of VC − VM depends on the sign of the difference between the degree of acceptability of commodities (parameterized by the fraction of agents that "like" any given commodity, x) and that of money (Π). Consequently, the agents' optimal strategy in accepting money in a trade depends solely on Π.

• If Π < x, money is being accepted with lower probability than commodities. Then VC > VM, and the best response is never to accept money in exchange for a commodity: π = 0.
• If Π > x, money is being accepted with higher probability than commodities. In this case VC < VM, and the best response is to accept money whenever possible: π = 1.

• Finally, if Π = x, money and commodities have the same degree of acceptability. With VC = VM, agents are indifferent between holding money and commodities: the best response then is any value of π between 0 and 1.

The optimal strategy π = π(Π) is shown in Figure 5.3. Three (stationary and symmetric) Nash equilibria, represented in the figure along the 45° line where π = Π, are associated with the three best responses illustrated above:

(i) A non-monetary equilibrium (Π = 0): agents expect that money will never be accepted in trade, so they never accept it. Money is valueless (VM = 0) and barter is the only form of exchange (point A).

(ii) A pure monetary equilibrium (Π = 1): agents expect that money will be universally acceptable, so they always accept it in exchange for goods (point C).

(iii) A mixed monetary equilibrium (Π = x): agents are indifferent between accepting and rejecting money, as long as other agents are expected to accept it with probability x. In this equilibrium money is only partially acceptable in exchanges (point B).

Figure 5.3. Optimal π(Π) response function

The main insight of the Kiyotaki–Wright search model of money is that acceptability is not an intrinsic property of money, which is indeed worthless. Rather, it can emerge endogenously as a property of the equilibrium. Moreover, as in Diamond's model, multiple equilibria can arise. Which of the possible equilibria is actually realized depends on the agents' beliefs: if they expect a certain degree of acceptability of money (zero, partial or universal) and choose their optimal trading strategy accordingly, money will display the expected acceptability in equilibrium. Again, as in Diamond's model, expectations are self-fulfilling.

5.2.3. IMPLICATIONS

The above search model can be used to derive some implications concerning the agents' welfare and the optimal quantity of money.

Welfare

We can now compare the values of expected utility for a commodity holder and a money holder in the three possible equilibria. Solving (5.17) and (5.18) with Π = 0, x, and 1 in turn, we find the values of V_C^i and V_M^i, where the superscript i = n, m, p denotes the non-monetary, the mixed monetary, and the pure monetary equilibria associated with Π = 0, x, 1 respectively. The resulting expected utilities are reported in Table 5.1, where K ≡ β(1 − M)xU/r > 0.
    Some welfare implications can be easily drawn from the table. First of
    all, the welfare of a money holder intuitively increases with the degree of
    acceptability of money. In fact, comparing the expected utilities in column
(3), we find that V_M^n < V_M^m < V_M^p.

Table 5.1.

  Π      V_C^i                                            V_M^i
  (1)    (2)                                              (3)
  0      Kx                                               0
  x      Kx                                               Kx
  1      Kx [r + β((1−M)x + M)] / (r + βx)  >  Kx         K [r + βx((1−M)x + M)] / (r + βx)  >  Kx

Further, in the pure monetary equilibrium (third row of the table) money holders are better off than commodity holders: V_C^p < V_M^p. Holding universally acceptable money guarantees consumption when the money holder meets a commodity holder with a good that she "likes": trade increases the welfare of
    both agents and occurs with certainty. On the contrary, a commodity holder
    can consume only if another commodity holder is met and both like each
    other’s commodity: a “double coincidence of wants” is necessary, and this
    reduces the probability of consumption with respect to a money holder.
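The entries of Table 5.1 can be verified by solving the linear system (5.17)–(5.18) directly for each value of Π. The parameter values in the sketch below are arbitrary illustrations; only the qualitative ordering of the resulting values matters.

```python
import numpy as np

# Illustrative parameter values (assumptions, not taken from the text)
beta, r, x, M, U = 0.9, 0.05, 0.3, 0.2, 1.0
K = beta * (1 - M) * x * U / r

def values(Pi):
    """Solve the stationary asset equations (5.17)-(5.18) for (V_M, V_C), given acceptability Pi."""
    A = np.array([[r + beta * (1 - M) * x * Pi, -beta * (1 - M) * x * Pi],
                  [-beta * M * x * Pi,           r + beta * M * x * Pi]])
    b = np.array([beta * (1 - M) * x * Pi * U,
                  beta * (1 - M) * x**2 * U])
    return np.linalg.solve(A, b)          # returns (V_M, V_C)

for Pi, name in [(0.0, "non-monetary"), (x, "mixed"), (1.0, "pure monetary")]:
    VM, VC = values(Pi)
    print(f"{name:13s}: V_M = {VM:.3f}, V_C = {VC:.3f}   (Kx = {K * x:.3f})")
# The printout reproduces Table 5.1: V_C = Kx in the first two rows, V_M rises with Pi,
# and in the pure monetary equilibrium V_C < V_M with both exceeding Kx.
```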
    Exercise 49 Check that, in a pure monetary equilibrium, when a money holder
    meets a commodity holder with a good she “likes” both agents are willing to trade.
    Finally, looking at column (2) of the table, we note that a commodity holder
    is indifferent between a non-monetary and a mixed monetary equilibrium,
    but is better off if money is universally acceptable, as in the pure monetary
    equilibrium:
V_C^n = V_C^m < V_C^p.

Summarizing, the existence of universally accepted fiat money makes all agents better off. Moreover, moving from a non-monetary to a mixed monetary equilibrium increases the welfare of money holders without harming commodity holders. Thus, in general, an increase in the acceptability of money (Π) makes at least some agents better off and none worse off (a Pareto improvement).

Optimal quantity of money

We now address the issue of the optimal quantity of money from the social welfare perspective. The amount of money in circulation is directly related to the fraction of agents endowed with money, M; we therefore consider the possibility of choosing M so as to maximize some measure of social welfare. A reasonable such measure is an agent's ex ante expected utility, that is the expected utility of each agent before the initial endowment of money and commodities is randomly distributed among them. The social welfare criterion is then

$$W = (1 - M)\,V_C + M\,V_M. \tag{5.20}$$

The fraction of agents endowed with money can be optimally chosen in the three possible equilibria of the economy. First, we note that, in both the non-monetary and the mixed monetary equilibria, money does not facilitate the exchange process (it does not make consumption more likely); it is then optimal to endow all agents with commodities, thereby setting M = 0. In the pure monetary equilibrium, social welfare W^p can be expressed as

$$W^p = (1 - M)\,V_C^p + M\,V_M^p = K\,\big[M + x(1 - M)\big] = \frac{\beta U}{r}\,(1 - M)\big[Mx + (1 - M)x^2\big], \tag{5.21}$$

where we used the definition of K given above. Maximization of W^p with respect to M yields the optimal quantity of money M*:

$$\frac{\partial W^p}{\partial M} = \frac{\beta U}{r}\,x\,\big[(1 - 2x) - 2M^*(1 - x)\big] = 0 \;\Rightarrow\; 1 - 2x = 2M^*(1 - x) \;\Rightarrow\; M^* = \frac{1 - 2x}{2 - 2x}. \tag{5.22}$$

Since 0 ≤ M* ≤ 1, for x ≥ 1/2 we get M* = 0. When each agent is willing to consume at least half of the commodities, exchanges are not very difficult and money does not play a crucial role in facilitating trade: in this case it is optimal to endow all agents with consumable commodities. Instead, if x < 1/2, fiat money plays a useful role in facilitating trade and consumption, and the introduction of some amount of money improves social welfare (even though fewer consumable commodities will be circulating in the economy). From (5.22) we see that, as x → 0, M* → 1/2, as shown in the left-hand panel of Figure 5.4.

To further develop the intuition for this result, we can rewrite the last expression in (5.21) as follows:

$$r\,W^p = U \cdot \beta\,(1 - M)\big[Mx + (1 - M)x^2\big], \tag{5.23}$$

where the left-hand side is the "flow" of social welfare per period and the right-hand side is the utility from consumption U multiplied by the agent's ex ante consumption probability. The latter is given by the probability of meeting an agent endowed with a commodity, β(1 − M), times the probability that a trade will occur, given by the term in square brackets. Trade occurs in two cases: either the agent is a money holder and the potential counterpart in the trade offers a desirable commodity (which happens with probability Mx), or the agent is endowed with a commodity and a "double coincidence of wants" occurs (which happens with probability (1 − M)x²). The sum of these two probabilities yields the probability that, after a meeting with a commodity holder, trade will take place.

Figure 5.4. Optimal quantity of money M* and ex ante probability of consumption P
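A quick numerical check of (5.21)–(5.22): maximizing W^p over a grid of values of M reproduces the closed-form M*, which depends only on x, including the values M* = 0 and M* ≈ 0.33 mentioned in the next paragraph for x = 0.5 and x = 0.25. The β, U, and r values below are arbitrary scale factors.

```python
import numpy as np

# Illustrative values; M* in (5.22) depends only on x, so beta, U, r merely scale welfare.
beta, U, r = 1.0, 1.0, 0.05

def W_p(M, x):
    """Social welfare in the pure monetary equilibrium, eq. (5.21)."""
    return (beta * U / r) * (1 - M) * (M * x + (1 - M) * x**2)

M_grid = np.linspace(0.0, 1.0, 100001)
for x in (0.5, 0.25):
    M_num = M_grid[np.argmax(W_p(M_grid, x))]
    M_closed = max(0.0, (1 - 2 * x) / (2 - 2 * x))
    print(f"x = {x:.2f}: grid argmax M* = {M_num:.3f}, closed form (5.22) M* = {M_closed:.3f}")
# For x = 0.5 the optimum is at M* = 0; for x = 0.25 it is M* = 1/3, i.e. about 0.33.
```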
The optimal quantity of money is the value of M that maximizes the agent's ex ante consumption probability in (5.23). As M increases, there is a trade-off between a lower probability of encountering a commodity holder and a higher probability that, should a meeting occur, trade takes place. The amount of money M* optimally weights these two opposite effects. The behavior of the consumption probability (P) as a function of M is shown in the right-hand panel of Figure 5.4 for two values of x (0.5 and 0.25) in the case where β = 1. The corresponding optimal quantities of money M* are 0 and 0.33 respectively.

5.3. Search Externalities in the Labor Market

We now proceed to apply some of the insights discussed in this chapter to labor market phenomena. While introducing the models of Chapter 3, we already noted that the simultaneous processes of job creation and job destruction are typically very intense, even in the absence of marked changes in overall employment. In that chapter we assumed that workers' relocation was costly, but we did not analyze the level or the dynamics of the unemployment rate. Here, we review the modeling approach of an important strand of labor economics focused exactly on the determinants of the flows into and out of (frictional) unemployment. The agents of these models, unlike those of the models discussed in the previous sections, are not ex ante symmetric: workers do not trade with each other, but need to be employed by firms. Unemployed workers and firms willing to employ them are inputs in a "productive" process that generates employment, a process that is given a stylized and very tractable representation by the model we study below. Unlike the abstract trade and monetary exchange frameworks of the previous sections, the "search and matching" framework below is qualitatively realistic enough to offer practical implications for the dynamics of labor market flows, for the steady state of the economy, and for the dynamic adjustment process towards the steady state.

5.3.1. FRICTIONAL UNEMPLOYMENT

The importance of gross flows justifies the fundamental economic mechanism on which the model is based: the matching process between firms and workers. Firms create job openings (vacancies) and unemployed workers search for jobs, and the outcome of a match between a vacant job and an unemployed worker is a productive job. Moreover, the matching process does not take place in a coordinated manner, as in the traditional neoclassical model. In the neoclassical model the labor market is perfectly competitive and supply and demand of labor are balanced instantaneously through an adjustment of the wage. On the contrary, in the model considered here firms and workers operate in a decentralized and uncoordinated manner, dedicating time and resources to the search for a partner. The probability that a firm or a worker will meet a partner depends on the relative number of vacant jobs and unemployed workers: for example, a scarcity of unemployed workers relative to vacancies will make it difficult for a firm to fill its vacancy, while workers will find jobs easily. Hence there exists an externality between agents in the same market which is of the same "trading" type as the one encountered in the previous section. Since this externality is generated by the search activity of the agents on the market, it is normally referred to as a search externality.
Formally, we define the labor force as the sum of the "employed" workers plus the "unemployed" workers, which we assume to be constant and equal to L units. Similarly, the total demand for labor is equal to the number of filled jobs plus the number of vacancies. The total numbers of unemployed workers and vacancies can therefore be expressed as uL and vL, respectively, where u denotes the unemployment rate and v denotes the ratio between the number of vacancies and the total labor force. In each unit of time, the total number of matches between an unemployed worker and a vacant firm is equal to mL (where m denotes the ratio between the newly filled jobs and the total labor force). The process of matching is summarized by a matching function, which expresses the number of newly created jobs (mL) as a function of the number of unemployed workers (uL) and vacancies (vL):

$$mL = m(uL, vL). \tag{5.24}$$

The function m(·), supposed increasing in both arguments, is conceptually similar to the aggregate production function that we encountered, for example, in Chapter 4. The creation of employment is seen as the outcome of a "productive process" and the unemployed workers and vacant jobs are the "productive inputs." Obviously, both the number of unemployed workers and the number of vacancies have a positive effect on the number of matches within each time period (m_u > 0, m_v > 0). Moreover, the creation of
employment requires the presence of agents on both sides of the labor market (m(0, 0) = m(0, vL) = m(uL, 0) = 0). Additional properties of the function m(·) are needed to determine the character of the unemployment rate in a steady-state equilibrium. In particular, for the unemployment rate to be constant in a growing economy, m(·) needs to have constant returns to scale.⁴³ In that case, we can write

$$m = \frac{m(uL, vL)}{L} = m(u, v). \tag{5.25}$$

The function m(·) determines the flow of workers who find a job and who exit the unemployment pool within each time interval. Consider the case of an unemployed worker: at each moment in time, the worker will find a job with probability p = m(·)/u. With constant returns to scale for m(·), we may thus write

$$\frac{m(u, v)}{u} = m\!\left(1, \frac{v}{u}\right) \equiv p(\theta), \quad \text{an increasing function of } \theta \equiv \frac{v}{u}. \tag{5.26}$$
The instantaneous probability p that a worker finds a job is thus positively related to the tightness of the labor market, which is measured by θ, the ratio of the number of vacancies to unemployed workers.⁴⁴ An increase in θ, reflecting a relative abundance of vacant jobs relative to unemployed workers, leads to an increase in p. (Moreover, given the properties of m, p″(θ) < 0.) Finally, the average length of an unemployment spell is given by 1/p(θ), and thus is inversely related to θ. Similarly, the rate at which a vacant job is matched to a worker may be expressed as

$$\frac{m(u, v)}{v} = m\!\left(1, \frac{v}{u}\right)\frac{u}{v} = \frac{p(\theta)}{\theta} \equiv q(\theta), \tag{5.27}$$

a decreasing function of the vacancy/unemployment ratio. An increase in θ reduces the probability that a vacancy is filled, and 1/q(θ) measures the average time that elapses before a vacancy is filled.⁴⁵ The dependence of p and q on θ captures the dual externality between agents in the labor market: an increase in the number of vacancies relative to unemployed workers increases the probability that a worker finds a job (∂p(·)/∂v > 0), but at the same time
it reduces the probability that a vacancy is filled (∂q(·)/∂v < 0).

⁴³ Empirical studies of the matching technology confirm that the assumption of constant returns to scale is realistic (see Blanchard and Diamond, 1989, 1990, for estimates for the USA).

⁴⁴ As in the previous section, the matching process is modeled as a Poisson process. The probability that an unemployed worker does not find employment within a time interval dt is thus given by e^{−p(θ)dt}. For a small time interval, this probability can be approximated by 1 − p(θ)dt. Similarly, the probability that the worker does find employment is 1 − e^{−p(θ)dt}, which can be approximated by p(θ)dt.

⁴⁵ To complete the description of the functions p and q, we define the elasticity of p with respect to θ as η(θ). We thus have: η(θ) = p′(θ)θ/p(θ). From the assumption of constant returns to scale, we know that 0 ≤ η(θ) ≤ 1. Moreover, the elasticity of q with respect to θ is equal to η(θ) − 1.

5.3.2. THE DYNAMICS OF UNEMPLOYMENT

Changes in unemployment result from a difference between the flow of workers who lose their job and become unemployed, and the flow of workers who find a job. The inflow into unemployment is determined by the "separation rate," which we take as given for simplicity: at each moment in time a fraction s of jobs (corresponding to a fraction 1 − u of the labor force) is hit by a shock that reduces the productivity of the match to zero: in this case the worker loses her job and returns to the pool of unemployed, while the firm is free to open up a vacancy in order to bring employment back to its original level. Given the match destruction rate s, jobs therefore remain productive for an average period 1/s.

Given these assumptions, we can now describe the dynamics of the number of unemployed workers. Since L is constant, d(uL)/dt = u̇L and hence

$$\dot u\, L = s\,(1 - u)\,L - p(\theta)\, u\, L \;\Rightarrow\; \dot u = s\,(1 - u) - p(\theta)\, u, \tag{5.28}$$

which is similar to the difference equation for employment (5.1) derived in the previous section. The dynamics of the unemployment rate depend on the tightness of the labor market θ: at a high ratio of vacancies to unemployed workers, workers easily find jobs, leading to a large flow out of unemployment.⁴⁶ From equation (5.28) we can immediately derive the steady-state relationship between the unemployment rate and θ:

$$u = \frac{s}{s + p(\theta)}. \tag{5.29}$$

Since p′(·) > 0, the properties of the matching function determine a negative
relation between θ and u: a higher value of θ corresponds to a larger flow of newly created jobs. In order to keep unemployment constant, the unemployment rate must therefore decrease to generate an offsetting increase in the flow of destroyed jobs. The steady-state relationship (5.29) is illustrated graphically in the left-hand panel of Figure 5.5: to each value of θ corresponds a unique value for the unemployment rate. Moreover, the same properties of m(·) ensure that this curve is convex. For points above or below u̇ = 0, the unemployment rate tends to move towards the stationary relationship: keeping θ constant at θ₀, a value u > u₀ causes an increase in the flow out of unemployment and a decrease in the flow into unemployment, bringing u back to u₀. Moreover, given u and θ, the number of vacancies is uniquely determined by v = θu, where v denotes the number of vacancies as a proportion of the
labor force. The picture on the right-hand side of the figure shows the curve u̇ = 0 in (v, u)-space. This locus is known as the Beveridge curve, and identifies the level of vacancies v₀ that corresponds to the pair (θ₀, u₀) in the left-hand panel. In the sequel we will use both graphs to illustrate the dynamics and the comparative statics of the model. At this stage it is important to note that variations in the labor market tightness are associated with a movement along the curve u̇ = 0, while changes in the separation rate s or the efficiency of the matching process (captured by the properties of the matching function) correspond to movements of the curve u̇ = 0. For example, an increase in s or a decrease in the matching efficiency causes an upward shift of u̇ = 0. Equation (5.29) describes a first steady-state relationship between u and θ. To find the actual equilibrium values, we need to specify a second relationship between these variables. This second relationship can be derived from the behavior of firms and workers on the labor market.

Figure 5.5. Dynamics of the unemployment rate

⁴⁶ To obtain job creation and destruction "rates," we may divide the flows into and out of employment by the total number of employed workers, (1 − u)L. The rate of destruction is simply equal to s, while the rate of job creation is given by p(θ)[u/(1 − u)].
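Before turning to the behavior of firms, the steady-state relation (5.29) and the adjustment implied by (5.28) can be illustrated with a constant-returns Cobb–Douglas matching function, a standard textbook specification; the functional form and all parameter values below are assumptions made only for this sketch.

```python
import numpy as np

# Assumed Cobb-Douglas matching function m(u, v) = A * u**eta * v**(1 - eta),
# so that p(theta) = A * theta**(1 - eta) and q(theta) = A * theta**(-eta).
A, eta, s = 0.5, 0.6, 0.02          # matching efficiency, elasticity, separation rate

p = lambda theta: A * theta**(1 - eta)
u_ss = lambda theta: s / (s + p(theta))          # steady-state unemployment, eq. (5.29)

for theta in (0.2, 0.5, 1.0, 2.0):
    u = u_ss(theta)
    print(f"theta = {theta:4.1f}:  u* = {u:.3f},  v* = theta*u* = {theta * u:.3f}")

# Adjustment of u towards the steady state from eq. (5.28), holding theta fixed at 1.0:
theta, dt, u = 1.0, 0.1, 0.20
for _ in range(300):
    u += (s * (1 - u) - p(theta) * u) * dt       # Euler step on u-dot = s(1-u) - p(theta) u
print(f"after adjustment at theta = 1.0: u = {u:.3f} (steady state {u_ss(1.0):.3f})")
```

The first loop traces the negative relation between θ and u along the u̇ = 0 locus; the second shows u converging to its stationary value for a given tightness.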
    5.3.3. JOB AVAILABILITY
    The crucial decision of firms concerns the supply of jobs on the labor market.
    The decision of a firm about whether to create a vacancy depends on the
    expected future profits over the entire time horizon of the firm, which we
    assume is infinite. Formally, each individual firm solves an intertemporal
    optimization problem taking as given the aggregate labor market conditions
which are summarized by θ, the labor market tightness. Individual firms therefore disregard the effect of their decisions on θ, and consequently on the matching rates p(θ) and q(θ) (the external effects referred to above). To
    simplify the analysis, we assume that each firm can offer at most one job. If the
    job is filled, the firm receives a constant flow of output equal to y. Moreover, it
    pays a wage w to the worker and it takes this wage as given. The determination
of this wage is described below. On the other hand, if the job is not filled the
    firm incurs a flow cost c , which reflects the time and resources invested in
    the search for suitable workers. Firms therefore find it attractive to create a
    vacancy as long as its value, measured in terms of expected profits, is positive;
    if it is not, the firm will not find it attractive to offer a vacancy and will exit
    the labor market. The value that a firm attributes to a vacancy (denoted by V )
    and to a filled job ( J ) can be expressed using the asset equations encountered
    above. Given a constant real interest rate r , we can express these values as
$$r V(t) = -c + q(\theta(t))\,\big(J(t) - V(t)\big) + \dot V(t), \tag{5.30}$$

$$r J(t) = \big(y - w(t)\big) + s\,\big(V(t) - J(t)\big) + \dot J(t), \tag{5.31}$$
    which are explicit functions of time. The flow return of a vacancy is equal
    to a negative cost component (−c ), plus the capital gain in case the job is
filled with a worker (J − V), which occurs with probability q(θ), plus the
    change in the value of the vacancy itself (V̇ ). Similarly, (5.31) defines the flow
    return of a filled job as the value of the flow output minus the wage ( y − w),
    plus the capital loss (V − J ) in case the job is destroyed, which occurs with
    probability s , plus the change in the value of the job ( J̇ ).
    Exercise 50 Derive equation (5.31) with dynamic programming arguments,
    supposing that J̇ = 0 and following the argument outlined in Section 5.1 to
    obtain equations (5.3) and (5.4).
Subtracting (5.30) from (5.31) yields the following expression for the difference in value between a filled job and a vacancy:

r (J(t) − V(t)) = (y − w(t) + c) − [s + q(θ(t))] (J(t) − V(t)) + (J̇(t) − V̇(t)).   (5.32)

Solving equation (5.32) at date t0 for the entire infinite planning horizon of the firm, we get

J(t0) − V(t0) = ∫_{t0}^{∞} (y − w(t) + c) e^{−∫_{t0}^{t} [r + s + q(θ(τ))] dτ} dt,   (5.33)

where we need to impose the following transversality condition:

lim_{T→∞} [J(T) − V(T)] e^{−∫_{t0}^{T} [r + s + q(θ(τ))] dτ} = 0.
Equation (5.33) expresses the difference between the value of a job and
the value of a vacancy as the value of the difference between the flow return
of a job (y − w) and that of a vacancy (−c) over the entire time horizon,
discounted to t0 using the appropriate "discount rate." Apart from
the real interest rate, this discount rate also depends on the separation rate s
and on the tightness of the labor market via q(θ). Intuitively, a higher number
of vacancies relative to unemployed workers decreases the probability that a
vacant firm will meet a worker. This reduces the effective discount rate and
leads to an increase in the difference between the value of a filled job and a
vacancy. Moreover, θ may also have an indirect effect on the flow return of a
filled job via its impact on the wage w, as we will see in the next section.
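For intuition, if w and θ were constant over time the integral in (5.33) could be evaluated explicitly; the display below is a direct specialization of (5.33) under that constancy assumption.

```latex
J - V \;=\; \int_{t_0}^{\infty} (y - w + c)\, e^{-(r+s+q(\theta))(t-t_0)}\, dt
      \;=\; \frac{y - w + c}{\,r + s + q(\theta)\,}.
```

Setting V = 0 and using c = q(θ)J, this expression reduces to J = (y − w)/(r + s), consistent with the steady-state conditions derived just below.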
    Now, if we focus on steady-state equilibria, we can impose V̇ = J̇ = 0 in
    equations (5.30) and (5.31). Moreover, we assume free entry of firms and as
    a result V = 0: new firms continue to offer vacant jobs until the value of the
    marginal vacancy is reduced to zero. Substituting V = 0 in (5.30) and (5.31)
    and combining the resulting expressions for J , we get
J = c / q(θ)
J = (y − w)/(r + s)
⇒  y − w = (r + s) c / q(θ).   (5.34)
Equation (5.30) gives us the first expression for J. According to this condition, the equilibrium value of a filled job is equal to the expected cost of a
vacancy, that is, the flow cost of a vacancy c times the average duration of a
vacancy, 1/q(θ). The second condition for J can be derived from (5.31): the
value of a filled job is equal to the value of the constant profit flow y − w.
These flow returns are discounted at rate r + s to account for both impatience
and the risk that the match breaks down. Equating the two expressions yields
the final solution (5.34), which gives the marginal condition for employment
in a steady-state equilibrium: the marginal productivity of the worker (y)
needs to compensate the firm for the wage w paid to the worker and for
the flow cost of opening a vacancy. The latter is equal to the product of the
discount rate r + s and the expected cost of a vacancy, c/q(θ).
    This last term is just like an adjustment cost for the firm’s employment
    level. It introduces a wedge between the marginal productivity of labor and
    the wage rate, which is similar to the effect of the hiring costs studied in
    Chapter 3. However, in the model of this section the size of the adjustment
    cost is endogenous and depends on the aggregate conditions on the labor
    market. In equilibrium, the size of the adjustment costs depends on the
    unemployment rate and on the number of vacancies, which are summarized
at the aggregate level by the value of θ. If, for example, the value of output
minus wages (y − w) increases, then vacancy creation becomes profitable
(V > 0) and more firms offer jobs. As a result, θ increases, leading to
a reduction in the matching rate for firms and an increase in the average cost
of a vacancy; both effects tend to bring the value of a vacancy back
to zero.
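A minimal numerical sketch of this free-entry logic, taking the wage as given for the moment (its determination is discussed in the next subsection) and assuming a Cobb–Douglas matching function with illustrative parameter values; all numbers below are assumptions made only for the example.

```python
from scipy.optimize import brentq

# Illustrative parameters (assumptions made only for this sketch)
r, s, c, m0, eta = 0.05, 0.02, 0.3, 0.5, 0.5
y, w = 1.0, 0.9                          # output flow and a given wage

q = lambda th: m0 * th ** (eta - 1.0)    # vacancy-filling rate q(theta)

# Free entry (eq. 5.34): y - w = (r + s) * c / q(theta)
theta = brentq(lambda th: (y - w) - (r + s) * c / q(th), 1e-6, 100.0)
print(f"theta = {theta:.3f}, expected vacancy duration 1/q = {1 / q(theta):.2f}")
# A larger surplus y - w raises theta until the longer vacancy duration restores V = 0.
```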
    Finally, notice that equation (5.34) still contains the wage rate w. This is an
    endogenous variable. Hence the “job creation condition” (5.34) is not yet the
steady-state condition which, together with (5.29), would allow us to solve for
the equilibrium values of u and θ. To complete the model, we need to analyze
    the process of wage determination.

    5.3.4. WAGE DETERMINATION AND THE STEADY STATE
    The process of wage determination that we adopt here is based on the fact
    that the successful creation of a match generates a surplus. That is, the value
    of a pair of agents that have agreed to match (the value of a filled job and an
    employed worker) is larger than the value of these agents before the match
    (the value of a vacancy and an unemployed worker). This surplus has the
    nature of a monopolistic rent and needs to be shared between the firm and
    the worker during the wage negotiations. Here we shall assume that wages are
    negotiated at a decentralized level between each individual worker and her
    employer. Since workers and firms are identical, all jobs will therefore pay the
    same wage.
    Let E and U denote the value that a worker attributes to employment and
    unemployment, respectively. The joint value of a match (given by the value of
    a filled job for the firm and the value of employment for the worker) can then
    be expressed as J + E , while the joint value in case the match opportunity
    is not exploited (given by the value of a vacancy for a firm and the value
    of unemployment for a worker) is equal to V + U . The total surplus of the
    match is thus equal to the sum of the firm’s surplus, J − V , and the worker’s
    surplus, E − U :
    ( J + E ) − (V + U ) ≡ ( J − V ) + (E − U ). (5.35)
The match surplus is divided between the firm and the worker through a
wage bargaining process. We take their relative bargaining strength to be
exogenously given. Formally, we adopt the assumption of Nash bargaining,
which is common in models of bilateral negotiations. It implies
that the bargained wage maximizes a geometric average of the surpluses of the
firm and the worker, each weighted by a measure of their relative bargaining
strength. In our case the assumption of Nash bargaining gives rise to the
following optimization problem:

max_w (J − V)^{1−β} (E − U)^{β},   (5.36)

where 0 ≤ β ≤ 1 denotes the relative bargaining strength of the worker. Given
that the objective function is of the Cobb–Douglas form, we can immediately
express the solution (the first-order condition) of the problem as

E − U = [β/(1 − β)] (J − V)   ⇒   E − U = β [(J − V) + (E − U)].   (5.37)

The surplus that the worker appropriates in the wage negotiations (E − U) is
thus equal to a fraction β of the total surplus of the job.
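The first-order condition behind (5.37) can be spelled out explicitly. The derivation below assumes, as is standard in this setting, that a marginal wage increase transfers value one-for-one from the firm to the worker, so that ∂(E − U)/∂w = −∂(J − V)/∂w.

```latex
\max_{w}\ (J-V)^{1-\beta}(E-U)^{\beta}
\quad\Longrightarrow\quad
(1-\beta)\,\frac{\partial (J-V)/\partial w}{J-V}
\;+\;
\beta\,\frac{\partial (E-U)/\partial w}{E-U} \;=\; 0 .
```

With the two derivatives equal and opposite, they cancel from the condition, leaving β(J − V) = (1 − β)(E − U), which is exactly (5.37).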
    Similar to what is done for V and J in (5.30) and (5.31), we can express
    the values E and U using the relevant asset equations (reintroducing the

    dependence on time t ):
r E(t) = w(t) + s (U(t) − E(t)) + Ė(t)   (5.38)
r U(t) = z + p(θ(t)) (E(t) − U(t)) + U̇(t).   (5.39)
    For the worker, the flow return on employment is equal to the wage plus
    the loss in value if the worker and the firm separate, which occurs with
    probability s , plus any change in the value of E itself; while the return on
    unemployment is given by the imputed value of the time that a worker does
    not spend working, denoted by z, plus the gain if she finds a job plus the
    change in the value of U . Parameter z includes the value of leisure and/or
    the value of alternative sources of income including possible unemployment
    benefits. This parameter is assumed to be exogenous and fixed. Subtracting
    (5.39) from (5.38), and solving the resulting expression for the entire future
    time horizon, we can express the difference between the value of employment
    and unemployment at date t0 as
E(t0) − U(t0) = ∫_{t0}^{∞} (w(t) − z) e^{−∫_{t0}^{t} [r + s + p(θ(τ))] dτ} dt.   (5.40)
As in the case of firms, apart from the real interest rate r and the rate of separation s, the discount rate for the flow return of workers depends on the degree
of labor market tightness via its effect on p(θ). A relative abundance of vacant
jobs implies a high matching rate for workers, and this tends to reduce the
difference between the value of employment and unemployment for a given
wage value.
There are two ways to obtain the effect of variations in θ on the wage. Restricting attention to steady-state equilibria, so that Ė = U̇ = 0, we can either derive
the surplus of the worker E − U directly from (5.38) and (5.39), or we can
solve equation (5.40) keeping w and θ constant over time:

E − U = (w − z) / (r + s + p(θ)).   (5.41)
According to (5.41), the surplus of a worker depends positively on the difference between the flow return during employment and unemployment (w − z)
and negatively on the separation rate s and on θ: an increase in the ratio of
vacancies to unemployed workers increases the exit rate out of unemployment
and reduces the average length of an unemployment spell. Using (5.41), and
noting that in steady-state equilibrium

J − V = J = (y − w)/(r + s),

we can solve the expression for the outcome of the wage negotiations given by (5.37) as

(w − z)/(r + s + p(θ)) = [β/(1 − β)] (y − w)/(r + s).

Rearranging terms, and using (5.34), we obtain the following equivalent expressions for the wage:

w − z = β [(y + cθ − w) + (w − z)]   (5.42)
⇒ w = z + β (y + cθ − z).   (5.43)

Equation (5.42) is the flow version of equation (5.37): the flow
value of the worker's surplus, i.e. the difference between the wage and the alternative income z, is a fraction β of the total flow surplus. The term y − w + cθ
represents the flow surplus of the firm, where cθ denotes the expected cost
savings if the firm fills a job. Moreover, the wage is a pure redistribution from
the firm to the worker. If we eliminate the wage payments in (5.42), we obtain
the flow value of the total surplus of a filled job, y + cθ − z, which is equal
to the sum of the value of output and the cost saving of the firm minus the
alternative cost of the worker. Finally, equation (5.43) expresses the wage as
the sum of the alternative income and the fraction of the surplus that accrues
to the worker.
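The intermediate algebra is worth making explicit; each step uses only (5.34) and the identity p(θ) = θ q(θ).

```latex
\frac{w-z}{r+s+p(\theta)} \;=\; \frac{\beta}{1-\beta}\,\frac{y-w}{r+s}
\quad\Longrightarrow\quad
(1-\beta)(w-z) \;=\; \beta\Big[(y-w) + \frac{p(\theta)}{r+s}\,(y-w)\Big]
\;=\; \beta\big[(y-w) + c\,\theta\big],
```

since (y − w)/(r + s) = c/q(θ) by (5.34) and θ q(θ) = p(θ). Adding β(w − z) to both sides gives (5.42), and collecting terms then gives (5.43).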
It can easily be verified that the only influence of aggregate labor market
conditions on the wage occurs via θ, the ratio of vacancies to unemployed
workers. The unemployment rate u does not have any independent effect on
wages. The explanation is that wages are negotiated after a firm and a worker
meet. In this situation the match surplus depends on θ, as we saw above: this
variable determines the average duration of a vacancy, and hence the expected
costs for the firm if it were to continue searching.
The determination of the equilibrium wage completes the description of
the steady-state equilibrium. The equilibrium can be summarized by equations (5.29), (5.34), and (5.43), which we shall refer to as BC (Beveridge
curve), JC (job creation condition), and W (wage equation):

u = s / (s + p(θ))   (BC)   (5.44)
y − w = (r + s) c / q(θ)   (JC)   (5.45)
w = (1 − β) z + β (y + cθ)   (W)   (5.46)
Figure 5.6. Equilibrium of the labor market with frictional unemployment

For a given value of θ, the wage is independent of the unemployment rate.
The system can therefore be solved recursively for the endogenous variables
u, θ, and w. Using the definition of θ, we can then solve for v. The last
two equations jointly determine the equilibrium wage w and the ratio of
vacancies to unemployed workers θ, as shown in the left-hand panel of Figure 5.6.
Given θ, we can then determine the unemployment rate u, and consequently
also v, which equates the flows into and out of unemployment (the right-hand
panel of the figure).
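A compact numerical sketch of this recursive solution, assuming a Cobb–Douglas matching function m = m0 v^η u^{1−η} and illustrative parameter values (all numbers are assumptions chosen only for the example):

```python
from scipy.optimize import brentq

# Illustrative parameters (assumptions, not estimates)
r, s, c, z, y, beta = 0.05, 0.02, 0.3, 0.4, 1.0, 0.5
m0, eta = 0.5, 0.5                        # matching function m = m0 * v**eta * u**(1-eta)

p = lambda th: m0 * th ** eta             # workers' matching rate
q = lambda th: m0 * th ** (eta - 1.0)     # firms' matching rate

def excess_wage(th):
    """Wage implied by the JC curve (5.45) minus the wage implied by W (5.46)."""
    return (y - (r + s) * c / q(th)) - ((1 - beta) * z + beta * (y + c * th))

theta = brentq(excess_wage, 1e-6, 50.0)   # JC and W jointly pin down theta
w = (1 - beta) * z + beta * (y + c * theta)
u = s / (s + p(theta))                    # Beveridge curve (5.44)
v = theta * u
print(f"theta = {theta:.3f}, w = {w:.3f}, u = {u:.3%}, v = {v:.3%}")
```

Re-running the sketch with a lower y or a higher s reproduces the comparative statics discussed next.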
    This dual representation facilitates the comparative static analysis, which
    is intended to analyze the effect of changes in the parameters on the steady-
    state equilibrium. (Analysis of transitional dynamics is the subject of the next
section.) In some cases, parameter changes have an unambiguous effect on
all of the endogenous variables. This is true, for instance, for an increase in the
relative bargaining strength of workers, β, or an increase in unemployment
benefits (a component of z): the only effect of these changes is an
upward shift of W, which causes an increase in the wage and a reduction in θ.
This reduction, along the curve BC, is accompanied by an increase in u and a
reduction in v.
    In other cases the effects are more complex and not always of unambiguous
    sign. Consider, for example, the effects of the following two types of shock
    which may be at the root of cyclical variations in overall unemployment. The
    first is an “aggregate” disturbance. This is represented by a variation in the pro-
    ductivity of labor y which affects all firms at the same time and with the same
    intensity. The second shock is a “reallocative” disturbance, represented by a
    change in the separation rate s . This shock hits individual firms independently
    of the aggregate state of the economy (captured by labor productivity y).
A reduction in y moves both JC and W downwards. This results in a reduction of the wage but has an ambiguous effect on θ. However, formal analysis
(required in the exercise below) shows that in a stationary equilibrium
θ also decreases; since the curve BC does not shift, the unemployment rate
must increase while the number of vacancies v is reduced. In the case of a
reallocative shock, we observe an inward shift of JC along W. This results in
a joint decrease of the wage and the labor market tightness θ, as in the case
of the aggregate shock. At the same time, however, the curve BC shifts to the
right. Hence, while the unemployment rate increases unambiguously, it is in
general not possible to determine the effect on the number of vacancies. In
reality, however, v appears to be procyclical, which suggests that aggregate
shocks are a more important source of cyclical movements in the labor market
than reallocative shocks.
Exercise 51 Derive formally, using the system of equations formed by (5.44),
(5.45), and (5.46), the effects on the steady-state levels of w, θ, u, and v
of a lower labor productivity (Δy < 0) and of a higher separation rate (Δs > 0).
    5.4. Dynamics
    Until now, all the relationships we derived referred to the steady-state equi-
    librium of the system. In this section we will analyze the evolution of unem-
    ployment, vacancies, and the wage rate along the adjustment path toward the
    steady-state equilibrium.
    The discussion of the flows into and out of unemployment in the previ-
    ous section has already delivered the law of motion for unemployment. This
    equation is repeated here (stressing the time dependence of the endogenous
    variables):
u̇(t) = s (1 − u(t)) − p(θ(t)) u(t).   (5.47)
    The dynamics of u are due to the flow of separations and the flow of newly
    created jobs resulting from the matches between firms and workers. The
    magnitude of the flow out of unemployment depends on aggregate labor
market conditions, captured by θ, via its effect on p(·). Outside a steady-state
equilibrium, the path of θ will influence unemployment dynamics in
the economy. Moreover, given the definition of θ as the ratio of vacancies
to unemployed workers, unemployment dynamics will in turn affect the value
of labor market tightness. In order to give a complete description of the adjustment process
toward a steady-state equilibrium, we therefore need to study the dynamics of
θ. This requires an analysis of the job creation decisions of firms.
    5.4.1. MARKET TIGHTNESS
    At each moment in time firms exploit all opportunities for the profitable
creation of jobs. Hence, in a steady-state equilibrium as well as along the
adjustment path, V(t) = 0 for all t, and therefore V̇(t) = 0 also outside a
steady-state equilibrium. The value of a filled job for the firm can be derived
from (5.30) and (5.31). From the first equation, setting V(t) = V̇(t) = 0, we get

J(t) = c / q(θ(t)).   (5.48)
Equation (5.48) is identical to the steady-state expression derived before.
Firms continue to create new vacancies, thereby influencing θ, until the value
of a filled job equals the expected cost of a vacancy. Since entry into the
labor market is costless for firms (the resources are used to maintain open
vacancies), equation (5.48) will hold at each instant during the adjustment
process. Outside the steady state, the dynamics of J need to satisfy the differential
equation (5.31) with V(t) = 0:

J̇(t) = (r + s) J(t) − (y − w(t)).   (5.49)

The solution of (5.49) shows that the value J(t) depends on the future path
of (expected) wages. Besides wages, J(t) also depends on labor productivity, the
real interest rate, and the separation rate, but all of these variables are assumed
to be constant:

J(t) = ∫_{t}^{∞} (y − w(τ)) e^{−(r+s)(τ−t)} dτ.   (5.50)
Wages are continuously renegotiated. Outside the steady-state equilibrium the
surplus sharing rule (5.37), with V(t) = 0, therefore remains valid:

E(t) − U(t) = [β/(1 − β)] J(t).   (5.51)

Outside the steady state E, U, and J may vary over time, but these variations
need to ensure that (5.51) is satisfied. Hence, we have

Ė(t) − U̇(t) = [β/(1 − β)] J̇(t).   (5.52)

The dynamics of J are given by (5.49), while the dynamics of E and U can be
derived by subtracting (5.39) from (5.38):

Ė(t) − U̇(t) = [r + s + p(θ(t))] (E(t) − U(t)) − (w(t) − z).   (5.53)

Equating (5.52) to (5.53), and using (5.49) and (5.48) to replace J̇ and J, we
can solve for the level of wages outside the steady-state equilibrium:

w(t) = z + β (y + cθ(t) − z).   (5.54)

The wage is thus determined in the same way both in a steady-state equilibrium and during the adjustment process. Moreover, given the values of the
exogenous variables, wage dynamics depend exclusively on changes in the
degree of labor market tightness, which affects the joint value of a productive
match.

We are now in possession of all the elements needed to determine
the dynamics of θ. Differentiating (5.48) with respect to time, where by definition q(θ) ≡ p(θ)/θ, we have

J̇(t) = {[c p(θ(t)) − c θ(t) p′(θ(t))] / p(θ(t))²} θ̇(t) = [c / p(θ(t))] [1 − η(θ(t))] θ̇(t),   (5.55)

where 0 < η(θ) < 1 (defined above) denotes the elasticity of p(θ) with respect to θ. To simplify the derivations, we henceforth assume that η(θ) = η is constant (which is true if the matching function m(·) is of the Cobb–Douglas type). Substituting (5.49) for J̇ and using the expression J = c θ/p(θ), we can rewrite equation (5.55) as

[c / p(θ(t))] (1 − η) θ̇(t) = (r + s) c θ(t)/p(θ(t)) − (y − w(t)).   (5.56)

Finally, substituting the expression for the wage as a function of θ from (5.54), the above law of motion for θ can be written as

θ̇(t) = [(r + s)/(1 − η)] θ(t) − [p(θ(t)) / (c (1 − η))] [(1 − β)(y − z) − β c θ(t)].   (5.57)

Changes in θ depend (in addition to all the parameters of the model) only on the value of θ itself. The labor market tightness does not depend in any independent way on the unemployment rate u. In (θ, u)-space the curve θ̇ = 0 can thus be represented by a horizontal line at θ̄, which defines the unique steady-state equilibrium value for the ratio between vacancies and unemployed workers. This is illustrated in the left-hand panel of Figure 5.7. Once we have determined θ̄, we can determine, for each value of the unemployment rate, the level of v that is compatible with a stationary equilibrium. For instance, in the case of u0 this is equal to v0.

Figure 5.7. Dynamics of the supply of jobs

Moreover, equation (5.57) also indicates that, for points above or below the curve θ̇ = 0, θ tends to move away from its equilibrium value. Formally, one can show this by calculating⁴⁷

∂θ̇/∂θ |_{θ̇=0} = (r + s) + [β p(θ)/(1 − η)] > 0
from (5.57). The apparently "unstable" behavior of θ is due to the nature of
the job creation decision of firms. Looking to the future, firms' decisions on
whether to open a vacancy today are based on expected future values of θ. For
example, if firms expect a future increase in θ resulting from an increase in the
number of vacant jobs, they will anticipate an increase in the future cost of filling
a vacancy. As a result, firms have an incentive to open vacancies immediately,
in anticipation of this increase in cost. At the aggregate level, this induces an
immediate increase in v (and in θ) in anticipation of further increases in the
future. Hence, there is an obvious analogy between the variations of v and the
movement of asset prices which we already alluded to when we interpreted
(5.30) and (5.31) as asset equations: expectations of a future increase in price
cause an increase in current prices.
As a result of the forward-looking behavior of firms, both v and
θ are "jump" variables. Their values are not predetermined: in response to
changes in the exogenous parameters (even if these changes are expected in the
future and have not yet materialized), v and θ may exhibit discrete changes.
The unemployment rate, on the other hand, is a "predetermined" or state
variable. The dynamics of the unemployment rate are governed by (5.47),
and u adjusts gradually to changes in θ, even in the case of a discrete change
in labor market tightness. An unanticipated increase in v and θ leads to
an increase in the flow out of unemployment, resulting in a reduction of u.
However, the positive effect of the number of vacancies on unemployment
is mediated by the stochastic matching process on the labor market. The
immediate effect of an increase in θ is an increase in the matching rate for
workers, p(θ), and this translates only gradually into an increase in the number
of filled jobs. The unemployment rate therefore starts to decrease only after
some time.
    The aggregate effect of the decentralized decisions of firms (each of which
    disregards the externalities of its own decision on aggregate variables) consists
of changes in the degree of labor market tightness θ and, as a result, in
    changes in the speed of adjustment of the unemployment rate. The dynamics
    of u are therefore intimately linked to the presence of the externalities that
    characterize the functioning of the labor market in the search and matching
    literature.
⁴⁷ Note that this derivative is computed at a steady-state equilibrium point (on the θ̇ = 0 locus).
Hence, we may use (5.34) and replace y − w with (r + s) c θ/p(θ) to obtain the expression in the text.
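The asymmetry between the jump variable θ and the predetermined variable u can be illustrated with a few lines of code: after θ has jumped to its new stationary value, u converges only gradually according to (5.47). The parameter values and the size of the jump are illustrative assumptions.

```python
# Illustrative parameters (assumptions)
s, m0, eta = 0.02, 0.5, 0.5
p = lambda th: m0 * th ** eta

theta_new = 1.0                      # tightness after the jump (e.g. following a fall in y)
u = s / (s + p(1.6))                 # initial condition: the old steady-state unemployment rate
dt = 0.1
for i in range(601):                 # simulate 60 time units
    if i % 100 == 0:                 # report every 10 time units
        print(f"t = {i * dt:5.1f}   u = {u:.4f}   v = {theta_new * u:.4f}")
    u += dt * (s * (1 - u) - p(theta_new) * u)   # law of motion (5.47) at the new theta
# u converges monotonically to s / (s + p(theta_new)); theta itself never moves after its jump.
```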

    Figure 5.8. Dynamics of unemployment and vacancies
    5.4.2. THE STEADY STATE AND DYNAMICS
We are now in a position to characterize the system graphically, using the
differential equations (5.47) and (5.57) for u and θ. In both panels of Figure 5.8
we have drawn the curves θ̇ = 0 and u̇ = 0. Moreover, for each point
outside the unique steady-state equilibrium we have indicated the direction of
movement of θ and u.
As we have seen in the analysis of dynamic models of investment and
growth theory, the combination of a single state variable (u) and a single
jump variable (θ) implies that there is only one saddlepath that converges
to the steady-state equilibrium (a saddlepoint).⁴⁸ Since the expression for θ̇ = 0
does not depend on u, the saddlepath coincides with the curve θ̇ = 0: all
other points are located on paths that diverge from the curve θ̇ = 0 and
never reach the steady state, violating the transversality conditions. Hence, as
a result of the forward-looking nature of the vacancy creation decisions of
firms, the labor market tightness θ will jump immediately to its long-run value
and remain there during the entire adjustment process.

⁴⁸ Formally, we can determine the saddlepoint nature of the equilibrium by evaluating the linearized system (5.47) and (5.57) around the steady-state equilibrium point (ū, θ̄), yielding

\begin{pmatrix} \dot u \\ \dot\theta \end{pmatrix} =
\begin{pmatrix} -(s + p(\bar\theta)) & -\bar u\, p'(\bar\theta) \\ 0 & (r+s) + \frac{p(\bar\theta)}{1-\eta}\,\beta \end{pmatrix}
\begin{pmatrix} u - \bar u \\ \theta - \bar\theta \end{pmatrix}.

The pattern of signs in the matrix is [− −; 0 +]. Thus, the determinant is negative, confirming that the equilibrium is a saddlepoint.

Let us now analyze the adjustment process in response first to a reduction
in labor productivity y (an aggregate shock) and then to an increase in the rate
of separation s (a reallocative shock). Figure 5.9 illustrates the dynamics following an unanticipated permanent reduction in productivity (Δy < 0) at date t0. In the left-hand graph, the curve θ̇ = 0 shifts downward while u̇ = 0 does not change. In the new steady-state equilibrium (point C) the unemployment rate is higher and labor market tightness is lower. Moreover, from the right-hand graph it follows that the number of vacancies has also decreased.

Figure 5.9. Permanent reduction in productivity

The figure also illustrates the dynamics of the variables: at date t0 the economy jumps to the new saddlepath, which coincides with the new curve θ̇ = 0. Given the predetermined nature of the unemployment rate, the whole adjustment is performed by v and θ, which make a discrete jump downwards, as shown by point B in the two graphs. From t0 onwards both unemployment and the number of vacancies increase gradually, keeping θ fixed until the new steady-state equilibrium is reached.
The permanent reduction in labor productivity reduces the expected profits of a filled job. Hence, from t0 onwards firms have an incentive to create fewer vacancies. Moreover, initially the number of vacancies v falls below its new equilibrium level because firms anticipate that the unemployment rate will rise; in the future it will therefore be easier to fill a vacancy. As a result, firms prefer to reduce the number of vacancies at the beginning of the adjustment process, increasing their number gradually as the unemployment rate starts to rise.
Finally, the reduction in labor productivity also reduces wages, but this reduction is smaller than the decrease in y. Since the labor market immediately jumps to a saddlepath along which θ(t) is constant, equation (5.54) implies that the wage w(t) is constant along the whole adjustment process. The short-run response of the wage is thus equal to the long-run response, which is governed by (5.34). According to this equation, the difference y − w is proportional to the expected cost of a vacancy for the firm. This cost depends on the average time needed to fill a vacancy, which diminishes when v and θ fall in response to the productivity shock. Hence, in this version of the model productivity changes do not imply proportional wage changes. (On this point see exercise 51 at the end of the chapter.)

Figure 5.10. Increase in the separation rate

A similar adjustment process takes place in the case of an (unanticipated and permanent) reallocative shock Δs > 0, as shown in Figure 5.10. However, in
    this case u̇ = 0 is also affected. This curve shifts to the right, which reinforces
    the increase in the unemployment rate, but has an ambiguous effect on the
    number of vacancies. (The figure illustrates the case of a reduction in v.)
Finally, let us consider the case of a temporary reduction in productivity:
agents now anticipate at t0 that productivity will return to its higher initial
value at some future date t1. Given the temporary nature of the shock, the
new steady-state equilibrium coincides with the initial equilibrium (point A
in the graphs of Figure 5.11). At the time of the change in productivity, t0,
the immediate effect is a reduction in the number of vacancies, which causes a
discrete fall in θ. However, this reduction is smaller than the one that results
from a permanent change, and it moves the economy from the previous
equilibrium A to a new point B′. From t0 onwards, the unemployment rate
and the number of vacancies increase gradually but not at the same rate: as
a result, their ratio θ increases, following the divergent dynamics around the
new, lower stationary curve θ̇ = 0. To obtain convergence to
the steady-state equilibrium at A, the adjustment needs to
bring θ to its equilibrium level at t1, when the shock ceases and productivity
returns to its previous level (point B′′). In fact, convergence to the final
equilibrium can occur only if the system is located on the saddlepath, which
coincides with the stationary curve for θ, at date t1. After t1 the dynamics
concern only the unemployment rate u and the number of vacancies v, which
decrease in the same proportion until the system reaches its initial starting
point A.
The graph on the right-hand side of Figure 5.11 also illustrates that cyclical variations in productivity give rise to a counter-clockwise movement of
unemployment and vacancies around the Beveridge curve. This is consistent with
empirical data on changes in unemployment and vacancies during recessions,
which are approximated here by a temporary reduction in productivity.

Figure 5.11. A temporary reduction in productivity
    5.5. Externalities and efficiency
    The presence of externalities immediately poses the question of whether the
    decentralized equilibrium allocation is efficient. In particular, in the previ-
    ous sections it was shown that firms disregard the effect of their private
    decisions on the aggregate labor market conditions when they are deciding
    whether or not to create a vacancy. In this section we analyze the implica-
    tions of these external effects for the efficiency of the market equilibrium and
    compare the decentralized equilibrium allocation with the socially efficient
    allocation.
    To simplify the comparison between individual and socially optimal
    choices, we reformulate the problem of the firm so far identified with a sin-
    gle job, allowing firms to open many vacancies and employ many workers.
    Moreover, we also modify the production technology and replace the previ-
    ous linear production technology with a standard production function with
    decreasing marginal returns of labor. Let Ni denote the number of workers
of firm i. The production function is then given by F(Ni), with F′(·) > 0 and F′′(·) < 0. The case F′′(·) = 0 corresponds to the analysis in the previous sections, while the case of decreasing marginal returns to labor corresponds to our analysis in Chapter 3.
The employment level of a firm varies over time as a result of vacancies that are filled and because of shocks that hit the firm and destroy jobs at rate s. The evolution of Ni is described by the following equation, where we have suppressed the time dependence of the variables to simplify the notation:

Ṅi = q(θ) Xi − s Ni,   (5.58)

where Xi represents the number of vacancies of the firm and is its control variable. Each vacancy is transformed into a filled job with instantaneous probability q(θ), which is a function of the aggregate tightness of the labor market. In deciding Xi, firms take θ as given, disregarding the effect of their decisions on the aggregate ratio of vacancies to unemployed workers. More specifically, we assume that the number of firms is sufficiently high to justify the assumption that a single firm takes the level of θ as an exogenous variable. The problem of the representative firm is therefore

max_{Xi} ∫_0^∞ [F(Ni) − w Ni − c Xi] e^{−r t} dt,   (5.59)

subject to the law of motion for employment given by (5.58). Moreover, we assume that the firm takes the wage w as given and independent of the number of workers that it employs. The solution can be found by writing the associated Hamiltonian,

H(t) = [F(Ni) − w Ni − c Xi + λ (q(θ) Xi − s Ni)] e^{−r t}   (5.60)

(where λ is the Lagrange multiplier associated with the law of motion for N), and by deriving the first-order conditions:

∂H/∂Xi = 0  ⇒  λ = c / q(θ),   (5.61)
−∂H/∂Ni = d(λ(t) e^{−r t})/dt  ⇒  F′(Ni) − w = (r + s) λ − λ̇,   (5.62)
lim_{t→∞} e^{−r t} λ Ni = 0.   (5.63)

Equation (5.61) implies that firms continue to create vacancies until the marginal profit of a job equals the marginal cost of a vacancy (c/q(θ)). This condition holds at any moment in time and is similar to the condition for job creation (5.48) derived in Section 5.4.1. The Lagrange multiplier λ can therefore be interpreted as the marginal value of a filled job for the firm, which we denoted by J in the previous sections. The dynamics of λ are given by (5.62), which in turn corresponds to equation (5.49) for J̇. Finally, equation (5.63) defines the appropriate transversality condition for the firm's problem.
In what follows we consider only steady-state equilibria. Combining (5.61) and (5.62) and imposing λ̇ = 0, we get

F′(N∗i) − w = (r + s) c / q(θ),   (5.64)

where N∗i denotes the steady-state equilibrium employment level of the firm. The optimal number of vacancies, X∗i, can be derived from constraint (5.58) with Ṅ = 0:

q(θ) X∗i = s N∗i  ⇒  X∗i = [s / q(θ)] N∗i.   (5.65)

Hence, if all firms have the same production function and start from the same initial conditions, then each firm will choose the same optimal solution, and the ratio of vacancies to filled jobs for each firm will be equal to the aggregate ratio:

X∗i / N∗i = v / (1 − u).   (5.66)

This completes the characterization of the decentralized equilibrium. We now proceed with a characterization of the socially efficient solution. For simplicity we normalize the mass of firms to one; X and N therefore denote the stock of vacancies and of filled jobs, at both the aggregate level and the level of an individual firm.
Since the relations

θ ≡ v/u = vL/(L − N) = X/(L − N)

hold, aggregate labor market conditions as captured by θ are endogenous in the determination of the socially efficient allocation. The efficient allocation can be found by solving the following maximization problem:

max_X ∫_0^∞ [F(N) − z N − c X] e^{−r t} dt,   (5.67)

subject to the condition

Ṅ = q(X/(L − N)) X − s N.   (5.68)

The bracketed expression in (5.67) denotes aggregate net output. The first term (F(N)) is equal to the output of employed workers. From this we have to subtract the flow value of the time of employed workers (z N) and the costs of maintaining the vacancies (c X). The wage rate does not appear in this expression because it is a pure redistribution from firms to workers: in the model considered here, distributional issues are irrelevant for social efficiency. The important point to note is that the effect of the choice of X on the aggregate conditions on the labor market is explicitly taken into account: the ratio θ is expressed as X/(L − N) and is not taken as given in the maximization of social welfare.
The problem is solved using methods similar to those used for the problem of individual firms. Constructing the associated Hamiltonian and deriving the first-order conditions for X and N (with μ as the Lagrange multiplier for the dynamic constraint) yields

∂H/∂X = 0  ⇒  μ = c / [q′(θ) θ + q(θ)],   (5.69)
−∂H/∂N = d(μ(t) e^{−r t})/dt  ⇒  F′(N) − z = (r + s − q′(θ) θ²) μ − μ̇.   (5.70)

Explicit consideration of the effects on θ introduces various differences between these optimality conditions and the first-order conditions of the individual firm, (5.61) and (5.62). First of all, comparing (5.69) with the corresponding condition (5.61) shows that individual firms tend to offer an excessive number of vacant jobs compared with what is socially efficient (recall that q′ < 0). The reason for this discrepancy is that firms disregard the effect of their decisions on the aggregate labor market conditions.
Moreover, from the marginal condition for N, (5.70), it follows that the "social" discount rate associated with the marginal value of a filled job μ contains an additional term, −q′(θ) θ² > 0, which does not appear in the
analogous condition for the individual firm (5.62). That is, an increase in
the number of employed workers diminishes the probability that the firm
will hire additional workers in the future. Equation (5.70) correctly reflects
this dynamic aspect of labor demand, which tends to reduce the marginal
value of a filled job in a steady-state equilibrium (in which μ̇ = 0). Hence,
also from this perspective, the decentralized decisions of firms result in an
excessive number of vacancies compared with the social optimum. Finally,
comparing the left-hand sides of equations (5.62) and (5.70) reveals that the
individual condition contains the value of productivity net of the wage w,
while in the condition for social efficiency the value of productivity is net of the
opportunity cost z. Hence, for the same value of θ, individual firms attribute a
    lower “dividend” to filled jobs since w ≥ z, and firms thus tend to generate an
    insufficient number of vacancies. This last effect runs in the opposite direction
    to the two effects discussed above, and this makes a comparison between
    the two solutions—the individual and the social—interesting. The socially
    optimal solution may coincide with the corresponding decentralized equi-
    librium if the wage determination mechanism “internalizes” the externalities
    that private agents ignore. However, in the model that we have constructed,
    wages are determined after a firm and a worker meet. Hence, although the
    wage is perfectly flexible, it cannot perform any allocative function.
    Nonetheless, we can determine the conditions that the wage determination
    mechanism needs to satisfy for the decentralized equilibrium to coincide with
    the efficient solution. For this to occur, the marginal value of a filled job in the
    social optimum, which is given by (5.69), needs to be equal to the marginal
    value that the firm and the worker attribute to this job in the decentralized
    equilibrium. The latter is equal to the value that a firm and a worker attribute

    to the joint surplus that is created by a match. Since firms continue to offer
    vacancies until their marginal value is reduced to zero (V = 0), the condition
    for efficiency of a decentralized equilibrium is
μ = J + E − U,   (5.71)
    where E and U , introduced in Section 5.3, denote the value that a worker
    attributes to the state of employment and unemployment, respectively.
Using (5.69), (5.48), and (5.51), we can rewrite (5.71) as

c / [q′(θ) θ + q(θ)] = [1/(1 − β)] c / q(θ),   (5.72)
where β denotes the relative bargaining strength of the worker. From (5.72)
we obtain

β = −q′(θ) θ / q(θ) ≡ −[η(θ) − 1]
⇒ β = 1 − η(θ),   (5.73)

where η(θ) and 1 − η(θ) denote the elasticities of the matching probability of
a worker, p(θ), and of the average duration of a vacancy, 1/q(θ), with respect to
θ. Since β is constant, condition (5.73) can be satisfied only if the matching
function has constant returns to scale with respect to its arguments v and u.
This condition is satisfied for a matching function of the Cobb–Douglas type:

m = m0 v^η u^{1−η},   0 < η < 1.   (5.74)

It is easy to verify that (5.74) has the following properties:

p(θ) = m0 θ^η,   q(θ) = m0 θ^{η−1},   1/q(θ) = (1/m0) θ^{1−η}.   (5.75)

The constant parameter η represents both the elasticity of the number of matches m with respect to the number of vacancies v and the elasticity of p(θ) with respect to θ, while 1 − η denotes the elasticity of m with respect to u and also the elasticity of the average duration of a vacancy, 1/q(θ), with respect to θ.
Returning to efficiency condition (5.73), we can thus deduce that, if the average duration of a vacancy increases strongly with an increase in the number of vacancies (i.e. if 1 − η is relatively high), there is a strong tendency for firms to exceed the efficient number of vacancies. Only a relatively high value of β, which implies high wage levels, can counterbalance this effect and induce firms to reduce the number of vacancies. When β = 1 − η, these two opposing tendencies exactly offset each other and the decentralized equilibrium allocation is efficient. For cases in which β ≠ 1 − η, there are two types of inefficiency:
1. if β < 1 − η, firms offer an excessive number of vacancies and the equilibrium unemployment rate is below the socially optimal level;
2. if β > 1 − η, wages are excessively high because of the strong bargaining
    power of workers and this results in an unemployment rate that is above
    the socially efficient level.
In sum, in the model of the labor market described here
we cannot draw a priori conclusions about the efficiency of the equilibrium
unemployment rate. Given the complex externalities between the actions of
firms and workers, the properties of the matching function and of the wage determination mechanism are crucial in determining whether the unemployment
rate will be above or below the socially efficient level.
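The role of condition (5.73) can be checked numerically. The sketch below uses the linear-production case F′(N) = y, eliminates w and μ by combining (5.34) with (5.43) for the decentralized steady state and (5.69)–(5.70) with μ̇ = 0 for the planner, and assumes a Cobb–Douglas matching function with illustrative parameter values; the two reduced steady-state conditions in the comments are my own algebra on those equations.

```python
from scipy.optimize import brentq

# Illustrative parameters (assumptions)
r, s, c, z, y = 0.05, 0.02, 0.3, 0.4, 1.0
m0, eta = 0.5, 0.5
p = lambda th: m0 * th ** eta
q = lambda th: m0 * th ** (eta - 1.0)

def theta_decentralized(beta):
    # JC (5.34) plus W (5.43) imply (1 - beta)(y - z) = c * (r + s + beta * p(theta)) / q(theta)
    return brentq(lambda th: (1 - beta) * (y - z) - c * (r + s + beta * p(th)) / q(th),
                  1e-8, 100.0)

def theta_planner():
    # (5.69)-(5.70) with mu-dot = 0 and F'(N) = y imply
    # eta * (y - z) = c * (r + s + (1 - eta) * p(theta)) / q(theta)
    return brentq(lambda th: eta * (y - z) - c * (r + s + (1 - eta) * p(th)) / q(th),
                  1e-8, 100.0)

for beta in (0.3, 1 - eta, 0.7):
    print(f"beta = {beta:.2f}:  theta_decentralized = {theta_decentralized(beta):.3f},"
          f"  theta_planner = {theta_planner():.3f}")
# With beta = 1 - eta the two coincide (condition 5.73); a lower (higher) beta makes the
# decentralized tightness too high (too low) relative to the efficient level.
```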
APPENDIX A5: STRATEGIC INTERACTIONS AND MULTIPLIERS
    This appendix presents a general theoretical structure, based on Cooper and John
    (1988), which captures the essential elements of the strategic interactions in the models
    discussed in this chapter. We will discuss the implications of strategic interactions
    in terms of the multiplicity of equilibria and analyze the welfare properties of these
    equilibria.
Consider a number I of economic agents (i = 1, …, I), each of whom chooses a
value for a variable ei ∈ [0, E] which represents the agent's "activity level," with the
objective of maximizing her own payoff σ(ei, e−i, λi), where e−i represents (the vector
of) activity levels of the other agents and λi is an exogenous parameter which influences
the payoff of agent i. The payoff function σ(·) satisfies the properties σii < 0 and σiλ > 0.
(This last assumption implies that an increase in λ raises the marginal return of activity
for the agent.)
If all other agents choose a level of activity ē, the payoff of agent i can be expressed
as σ(ei, ē, λi) ≡ V(ei, ē). In this case the optimization problem becomes

max_{e_i} V(e_i, ē),   (5.A1)

from which we derive

V_1(e_i^*, ē) = 0,   (5.A2)

where V_1 denotes the derivative of V with respect to its first argument, e_i. First-order
condition (5.A2) defines the optimal response of agent i to the activity level of all
other agents: e_i^* = e_i^*(ē). Moreover, using (5.A1), we can also calculate the slope of the
reaction curve of agent i:

de_i^*/dē = −V_{12}/V_{11} ≶ 0,   if V_{12} ≶ 0.   (5.A3)
    By the second-order condition for maximization, we know that V11 < 0; the sign of the slope is thus determined by the sign of V12(ei , ē ). In case V12 > 0, we can

    make a graphical representation of the marginal payoff function V1(ei , ē ) and of the
    resulting reaction function e ∗i (ē ). The left-hand graph in Figure 5.12 illustrates various
    functions V1, corresponding to three different activity levels for the other agents: ē = 0,
    ē = e , and ē = E .
Assuming V_1(0, 0) > 0 and V_1(E, E) < 0 (points A and B) guarantees the existence of at least one symmetric decentralized equilibrium in which e = e_i^*(e), and agent i chooses exactly the same level of activity as all other agents (in this case V_1(e, e) = 0 and V_{11}(e, e) < 0). In Figure 5.12 we illustrate the case in which the reaction function has a positive slope, and hence V_{12} > 0, and in which there is a unique symmetric equilib-
    rium.
    In general, if V12(ei , ē ) > 0 there exists a strategic complementarity between agents:
    an increase in the activity level of the others increases the marginal return of activity
    for agent i , who will respond to this by raising her activity level. If, on the other hand,
V_{12}(e_i, ē) < 0, then agents' actions are strategic substitutes. In this case agent i chooses
a lower activity level in response to an increase in the activity level of others (as in a
Cournot duopoly, in which producers choose output levels). In the latter case there exists a unique equilibrium, while in the case of strategic complementarity there may be multiple equilibria.

Figure 5.12. Strategic interactions

Before analyzing the conditions under which this may occur, and before discussing the role of strategic complementarity or substitutability in determining the characteristics of the equilibrium, we must evaluate the problem from the viewpoint of a social planner who implements a Pareto-efficient equilibrium. The planner's problem may be expressed as the maximization of a representative agent's welfare with respect to the common strategy (activity level) of all agents: the optimum that we are looking for is therefore the symmetric outcome corresponding to a hypothetical cooperative equilibrium. Formally,

max_e V(e, e),   (5.A4)

from which we obtain

V_1(e^*, e^*) + V_2(e^*, e^*) = 0.   (5.A5)

Comparing this first-order condition⁴⁹ with the condition that is valid in a symmetric decentralized equilibrium, (5.A2), we see that the solutions for e^* differ if V_2(e^*, e^*) ≠ 0. In general, if V_2(e_i, ē) > (<) 0, there are positive (negative) spillovers: the externalities are defined as the impact of the other agents' activity levels on the payoff of an individual agent. A number of important implications for different features of the possible equilibria follow from this general formulation.
1. Efficiency. Whenever there are externalities that affect the symmetric decentralized equilibrium, that is, when V_2(e, e) ≠ 0, the decentralized equilibrium is inefficient. In particular, with a positive externality (V_2(e, e) > 0), there exists
    a symmetric cooperative equilibrium characterized by a common activity level
    e ′ > e .
    2. Multiplicity of equilibria As already mentioned, in the case of strategic comple-
    mentarity (V12 > 0), an increase in the activity level of the other agents increases
    the marginal return of activity for agent i , which induces agent i to raise her
    own activity level. As a result, the reaction function of agents has a positive slope
    (as in Figure 5.12). Strategic complementarity is a necessary but not a sufficient
    condition for the existence of multiple (non-cooperative) equilibria. The suf-
    ficient condition is that d e ∗i /d ē > 1 in a symmetric decentralized equilibrium.
    If this condition is satisfied, we may have the situation depicted in Figure 5.13,
in which there exist three symmetric equilibria. Two of these equilibria (with
activity levels e_1 and e_3) are stable, since the slope of the reaction curves is less
than one at the equilibrium activity levels, while at e_2 the slope of the reaction
curve is greater than one. This equilibrium is therefore unstable.

Figure 5.13. Multiplicity of equilibria

3. Welfare. If there exist multiple equilibria, and if at each activity level there are
positive externalities (V_2(e_i, ē) > 0 for all ē), then the equilibria can be ranked:
those with a higher activity level are associated with a higher level of welfare. Hence,
agents may be in an equilibrium in which their welfare is below the level that
may be obtained in other equilibria. However, since agents choose the optimal
strategy in each of the equilibria, there is no incentive for agents to change
their level of activity. The absence of a mechanism to coordinate the actions of
individual agents may thus give rise to a "coordination failure," in which
potential welfare gains are not realized because of a lack of private incentives to
raise the activity levels.

Exercise 52 Show formally that equilibria with a higher ē are associated with a higher
level of welfare if V_2(e_i, ē) > 0. (Use the total derivative of the function V(·) to derive this
result.)

⁴⁹ The second-order condition that we assume to be satisfied is given by V_{11}(e^*, e^*) + 2V_{12}(e^*, e^*) + V_{22}(e^*, e^*) < 0. Furthermore, in order to ensure the existence of a cooperative equilibrium, we assume that V_1(0, 0) + V_2(0, 0) > 0 and V_1(E, E) + V_2(E, E) < 0, which is analogous to the restrictions imposed in the decentralized optimization above.
4. Multipliers. Strategic complementarity is necessary and sufficient to guarantee
that the aggregate response to an exogenous shock exceeds the response at
the individual level; in this case the economy exhibits "multiplier" effects. To
clarify this last point, which is of particular relevance for Keynesian models,
we will consider the simplified case of two agents with payoff functions defined
as V^1 ≡ σ^1(e_1, e_2, λ_1) and V^2 ≡ σ^2(e_1, e_2, λ_2), respectively. All the assumptions
about these payoff functions remain valid (in particular, V^1_{13} ≡ σ^1_{13} > 0). The
reaction curves of the two agents are derived from the following first-order
conditions:

V^1_1(e_1^*, e_2^*, λ_1) = 0,   (5.A6)
V^2_2(e_1^*, e_2^*, λ_2) = 0.   (5.A7)
We now consider a "shock" to the payoff function of agent 1, namely dλ_1 > 0, and
we derive the effect of this shock on the equilibrium activity levels of the two agents, e_1^*
and e_2^*, and on the aggregate level of activity, e_1^* + e_2^*. Taking the total derivative of the
above system of first-order conditions (5.A6) and (5.A7), with dλ_2 = 0, and dividing
the first equation by V^1_{11} and the second by V^2_{22}, we have

de_1^* + (V^1_{12}/V^1_{11}) de_2^* + (V^1_{13}/V^1_{11}) dλ_1 = 0,
(V^2_{21}/V^2_{22}) de_1^* + de_2^* = 0.
The terms V^1_{12}/V^1_{11} and V^2_{21}/V^2_{22} represent the slopes, with opposite signs, of the
reaction curves of the agents, which we denote by ρ (given that the payoff functions
are assumed to be identical, the slope of the reaction curves is also the same). The
term V^1_{13}/V^1_{11} represents the response (again with opposite sign) of the optimal
activity level of agent 1 to a shock λ_1. In particular, keeping e_2^* constant, we have

V^1_1(e_1^*, e_2^*, λ_1) = 0  ⇒  ∂e_1^*/∂λ_1 = −V^1_{13}/V^1_{11} > 0.
We can thus rewrite the system as follows:

\begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}
\begin{pmatrix} de_1^* \\ de_2^* \end{pmatrix} =
\begin{pmatrix} \partial e_1^*/\partial\lambda_1 \\ 0 \end{pmatrix} d\lambda_1,

which yields the following solution:

de_1^*/dλ_1 = [1/(1 − ρ²)] ∂e_1^*/∂λ_1,   (5.A8)
de_2^*/dλ_1 = [ρ/(1 − ρ²)] ∂e_1^*/∂λ_1 = ρ (de_1^*/dλ_1).   (5.A9)
Equation (5.A8) gives the total response of agent 1 to a shock λ_1. This response can
also be expressed as

de_1^*/dλ_1 = ∂e_1^*/∂λ_1 + ρ (de_2^*/dλ_1).   (5.A10)

The first term is the "impact" (and thus only partial) response of agent 1 to a shock
affecting her payoff function; the second term gives the response of agent 1 that is
"induced" by the reaction of the other agent. The condition for an additional induced
effect is simply ρ ≠ 0. Moreover, the size of the induced effect depends on ρ and on de_2^*/dλ_1,
as in (5.A9), where de_2^*/dλ_1 has the same sign as ρ: positive in the case of strategic complementarity and negative in the case of substitutability. The induced response of agent 1 is
therefore always positive.
This leads to a first important conclusion: the interactions between the agents always
induce a total (or equilibrium) response that is larger than the impact response. In
particular, for each ρ ≠ 0, we have

de_1^*/dλ_1 > ∂e_1^*/∂λ_1.
For the economy as a whole, the effect of the disturbance is given by

d(e_1^* + e_2^*)/dλ_1 = [1/(1 − ρ²) + ρ/(1 − ρ²)] ∂e_1^*/∂λ_1 = [1/(1 − ρ)] ∂e_1^*/∂λ_1 = (1 + ρ) de_1^*/dλ_1.   (5.A11)
The relative size of the aggregate response compared with that of the individual
response depends on the sign of ρ: if ρ > 0 (limiting attention to stable equilibria,
for which ρ < 1), the aggregate response is larger than the individual response. Strategic
complementarity is thus a necessary and sufficient condition for Keynesian multiplier
effects.
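A minimal numerical illustration of this multiplier logic. The quadratic payoff σ(e_i, e_j, λ_i) = −e_i²/2 + ρ e_i e_j + λ_i e_i is an assumption made only for this sketch; it yields linear reaction curves e_i = ρ e_j + λ_i and an impact response ∂e_1^*/∂λ_1 = 1, with interior solutions assumed.

```python
import numpy as np

def equilibrium(rho, lam1, lam2):
    """Two-agent game with linear best responses e_i = rho * e_j + lam_i (|rho| < 1)."""
    A = np.array([[1.0, -rho], [-rho, 1.0]])
    return np.linalg.solve(A, np.array([lam1, lam2]))

rho, dlam = 0.5, 0.1
e0 = equilibrium(rho, 1.0, 1.0)
e1 = equilibrium(rho, 1.0 + dlam, 1.0)           # shock to agent 1 only

de1 = (e1[0] - e0[0]) / dlam                     # total response of agent 1, eq. (5.A8)
de2 = (e1[1] - e0[1]) / dlam                     # induced response of agent 2, eq. (5.A9)
print(f"de1/dlam1 = {de1:.3f}   (impact response 1, amplified to 1/(1-rho^2) = {1/(1-rho**2):.3f})")
print(f"d(e1+e2)/dlam1 = {de1 + de2:.3f}   (aggregate multiplier 1/(1-rho) = {1/(1-rho):.3f})")
```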
Exercise 53 Determine the type of externality and the nature of the strategic interactions for the simplified case of two agents with payoff function (here expressed for agent 1) V^1(e_1, e_2) = e_1^α e_2^α − e_1, with 0 < 2α < 1. Furthermore, derive the (symmetric) decentralized equilibria and compare these with the cooperative (symmetric) equilibrium.

REVIEW EXERCISES

Exercise 54 Introduce the following assumptions into the model analyzed in Section 5.1:
(i) The (stochastic) cost of production c has a uniform distribution defined on [0, 1], so that G(c) = c for 0 ≤ c ≤ 1.
(ii) The matching probability is equal to b(e) = b · e, with parameter b > 0.
(a) Determine the dynamic expressions for e and c∗ (repeating the derivation in the main text) under the assumption that y < 1.
(b) Find the equilibria for this economy and derive the stability properties of all equilibria with a positive activity level.

Exercise 55 Starting from the search model of money analyzed in Section 5.2, suppose that carrying over money from one period to the next now entails a storage cost, c > 0.
    Under this new assumption,
    (a) Derive the expected utility for an agent holding a commodity (VC ) and for an
    agent holding money (VM ), and find the equilibria of the economy.
    (b) Which of the three equilibria described in the model of Section 5.2 (with c = 0)
    always exists even with c > 0? Under what condition does a pure monetary
    equilibrium exist?
    Exercise 56 Assume that the flow cost of a vacancy c and the imputed value of free time z
    in the model of Section 5.3 are now functions of the wage w (instead of being exogenous).

    In particular, assume that the following linear relations hold:
c = c0 w,   z = z0 w.
Determine the effect of an increase in productivity (Δy > 0) on the steady-state equilibrium.
Exercise 57 Consider a permanent negative productivity shock (Δy < 0) in the matching model of Sections 5.3 and 5.4. The shock is realized at date t1, but is anticipated by the agents from date t0 < t1 onwards. Derive the effect of this shock on the steady-state equilibrium and describe the transitional dynamics of u, v, and θ.

Exercise 58 Consider the effect of an aggregate shock in the model of strategic interactions for two agents introduced in Appendix A5. That is, consider a variation in the exogenous terms of the payoff functions, so that dλ_1 = dλ_2 = dλ > 0, and derive the effect of this
    shock on the individual and aggregate activity level.
FURTHER READING
    The role of externalities between agents that operate in the same market as a source
    of multiplicity of equilibria is the principal theme in Diamond (1982a ). This arti-
    cle develops the economic implications of the multiplicity of equilibria that have a
    Keynesian spirit. The monograph by Diamond (1984) analyzes this theme in greater
    depth, while Diamond and Fudenberg (1989) concentrate on the dynamic aspects
    of the model. Blanchard and Fischer (1989, chapter 9) offer a compact version
    of the model that we studied in the first section of this chapter. Moreover, after
    elaborating on the general theoretical structure to analyze the links between strate-
    gic interactions, externalities, and multiplicity of equilibria, which we discussed in
    Appendix A5, Cooper and John (1988) offer an application of Diamond’s model.
    Rupert et al. (2000) survey the literature on search models of money as a medium of
    exchange and present extensions of the basic Kiyotaki–Wright framework discussed in
    Section 5.2.
    The theory of the decentralized functioning of labor markets, which is based on
    search externalities and on the process of stochastic matching of workers and firms,
    reinvestigates a theme that was first developed in the contributions collected in Phelps
    (1970), namely the process of search and information gathering by workers and its
    effects on wages. Mortensen (1986) offers an exhaustive review of the contributions in
    this early strand of literature.
    Compared with these early contributions, the theory developed in Section 5.3 and
    onwards concentrates more on the frictions in the matching process. Pissarides (2000)
    offers a thorough analysis of this strand of the literature. In this literature the base
    model is extended to include a specification of aggregate demand, which makes the
    interest rate endogenous, and allows for growth of the labor force, two elements that
    are not considered in this chapter. Mortensen and Pissarides (1999a , 1999b) provide
    an up-to-date review of the theoretical contributions and of the relevant empirical
    evidence.

    In addition to the assumption of bilateral bargaining, which we adopted in Section
    5.3, Mortensen and Pissarides (1998a ) consider a number of alternative assumptions
    about wage determination. Moreover, Pissarides (1994) explicitly considers the case of
    on-the-job search which we excluded from our analysis. Pissarides (1987) develops the
    dynamics of the search model, studying the path of unemployment and vacancies in
    the different stages of the business cycle. The paper devotes particular attention to the
    cyclical variations of u and v around their long-run relationship, illustrated here by the
    dynamics displayed in Figure 5.11. Bertola and Caballero (1994) and Mortensen and
    Pissarides (1994) extend the structure of the base model to account for an endogenous
    job separation rate s . In these contributions job destruction is a conscious decision
    of employers, and it occurs only if a shock reduces the productivity of a match below
    some endogenously determined level. This induces an increase in the job destruction
rate in cyclical downturns, which is consistent with the empirical evidence.
    The simple Cobb–Douglas formulation for the aggregate matching function with
    constant returns to scale introduced in Section 5.3 has proved quite useful in interpret-
    ing the evidence on unemployment and vacancies. Careful empirical analyses of flows
    in the (American) labor market can be found in Blanchard and Diamond (1989, 1990),
    Davis and Haltiwanger (1991, 1992) and Davis, Haltiwanger, and Schuh (1996), while
    Contini et al. (1995) offer a comparative analysis for the European countries. Cross-
    country empirical estimates of the Beveridge curve have been used by Nickell et al.
    (2002) to provide a description of the developments of the matching process over the
    1960–99 period in the main OECD economies. They find that the Beveridge curve
    gradually drifted rightwards in all countries from the 1960s to the mid-1980s. In some
    countries, such as France and Germany, the shift continued in the same direction in
    the 1990s, whereas in the UK and the USA the curve shifted back towards its original
    position. Institutional factors affecting search and matching efficiency are responsible
    for a relevant part of the Beveridge curve shifts. The Beveridge curve for the Euro area
    in the 1980s and 1990s is analysed in European Central Bank (2002). Both counter-
    clockwise cyclical swings around the curve of the type discussed in Section 5.4 and
    shifts of the unemployment–vacancies relation occurred in this period. For example,
    over 1990–3 unemployment rose and the vacancy rate declined, reflecting the influ-
    ence of cyclical factors; from 1994 to 1997 the unemployment rate was quite stable
    in the face of a rising vacancy rate, a shift of the Euro area Beveridge curve that is
    attributable to structural factors.
    Not only empirically, but also theoretically, the structure of the labor force, the
    geographical dispersion of unemployed workers and vacant jobs, and the relevance
    of long-term unemployment determine the efficiency of a labor market’s matching
    process. Petrongolo and Pissarides (2001) discuss the theoretical foundations of the
    matching function and provide an up-to-date survey of the empirical estimates for
    several countries, and of recent contributions focused on various factors influencing
    the matching rate.
    The analysis of the efficiency of decentralized equilibrium in search models is first
    developed in Diamond (1982b) and Hosios (1990), who derive the efficiency condi-
    tions obtained in Section 5.5; it is also discussed in Pissarides (2000). In contrast, in a
    classic paper Lucas and Prescott (1974) develop a competitive search model where the
    decentralized equilibrium is efficient.

REFERENCES
    Bertola, G., and R. J. Caballero (1994) “Cross-Sectional Efficiency and Labour
    Hoarding in a Matching Model of Unemployment,” Review of Economic Studies, 61,
    435–456.
    Blanchard, O. J., and P. Diamond (1989) “The Beveridge Curve,” Brookings Papers on Economic
    Activity, no. 1, 1–60.
Blanchard, O. J., and P. Diamond (1990) “The Aggregate Matching Function,” in P. Diamond (ed.), Growth, Productivity, Unemployment, Cambridge, Mass.: MIT Press, 159–201.
Blanchard, O. J., and S. Fischer (1989) Lectures on Macroeconomics, Cambridge, Mass.: MIT Press.
    Contini, B., L. Pacelli, M. Filippi, G. Lioni, and R. Revelli (1995) A Study of Job Creation and Job
    Destruction in Europe, Brussels: Commission of the European Communities.
    Cooper, R., and A. John (1988) “Coordinating Coordination Failures in Keynesian Models,”
    Quarterly Journal of Economics, 103, 441–463.
    Davis, S., and J. Haltiwanger (1991) “Wage Dispersion between and within US Manufacturing
    Plants, 1963–86,” Brookings Papers on Economic Activity, no. 1, 115–200.
Davis, S., and J. Haltiwanger (1992) “Gross Job Creation, Gross Job Destruction and Employment Reallocation,” Quarterly Journal of Economics, 107, 819–864.
Davis, S., J. Haltiwanger, and S. Schuh (1996) Job Creation and Destruction, Cambridge, Mass.: MIT Press.
    Diamond, P. (1982a ) “Aggregate Demand Management in Search Equilibrium,” Journal of Polit-
    ical Economy, 90, 881–894.
Diamond, P. (1982b) “Wage Determination and Efficiency in Search Equilibrium,” Review of Economic Studies, 49, 227–247.
Diamond, P. (1984) A Search-Equilibrium Approach to the Micro Foundations of Macroeconomics, Cambridge, Mass.: MIT Press.
Diamond, P., and D. Fudenberg (1989) “Rational Expectations Business Cycles in Search Equilibrium,” Journal of Political Economy, 97, 606–619.
    European Central Bank (2002) “Labour Market Mismatches in Euro Area Countries,” Frankfurt:
    European Central Bank.
    Hosios, A. J. (1990) “On the Efficiency of Matching and Related Models of Search and Unem-
    ployment,” Review of Economic Studies, 57, 279–298.
    Kiyotaki, N., and R. Wright (1993) “A Search-Theoretic Approach to Monetary Economics,”
    American Economic Review, 83, 63–77.
    Lucas, R. E., and E. C. Prescott (1974) “Equilibrium Search and Unemployment,” Journal of
    Economic Theory, 7, 188–209.
    Mortensen, D. T. (1986) “Job Search and Labor Market Analysis,” in O. Ashenfelter and R. Layard
    (eds.), Handbook of Labor Economics, Amsterdam: North-Holland.
Mortensen, D. T., and C. A. Pissarides (1994) “Job Creation and Job Destruction in the Theory of Unemployment,” Review of Economic Studies, 61, 397–415.
Mortensen, D. T., and C. A. Pissarides (1999a) “New Developments in Models of Search in the Labor Market,” in O. Ashenfelter and D. Card (eds.), Handbook of Labor Economics, vol. 3, Amsterdam: North-Holland.
Mortensen, D. T., and C. A. Pissarides (1999b) “Job Reallocation, Employment Fluctuations and Unemployment,” in J. B. Taylor and M. Woodford (eds.), Handbook of Macroeconomics, Amsterdam: North-Holland.

    Nickell S., L. Nunziata, W. Ochel, and G. Quintini (2002) “The Beveridge Curve, Unemployment
    and Wages in the OECD from the 1960s to the 1990s,” Centre for Economic Performance Dis-
    cussion Paper 502; forthcoming in P. Aghion, R. Frydman, J. Stiglitz, and M. Woodford (eds.),
    Knowledge, Information and Expectations in Modern Macroeconomics: In Honor of Edmund S.
    Phelps, Princeton: Princeton University Press.
    Petrongolo B., and C. A. Pissarides (2001) “Looking into the Black Box: A Survey of the Matching
    Function,” Journal of Economic Literature, 39, 390–431.
    Phelps, E. S. (ed.) (1970) Macroeconomic Foundations of Employment and Inflation Theory, New
    York: W. W. Norton.
    Pissarides, C. A. (1987) “Search, Wage Bargains and Cycles,” Review of Economic Studies, 54,
    473–483.
    (1994) “Search Unemployment and On-the-Job Search,” Review of Economic Studies, 61,
    457–475.
    (2000) Equilibrium Unemployment Theory, 2nd edn. Cambridge, Mass.: MIT Press.
    Rupert P., M. Schindler, A. Shevchenko, and R. Wright (2000) “The Search-Theoretic Approach
    to Monetary Economics: A Primer,” Federal Reserve Bank of Cleveland Economic Review, 36(4),
    10–28.

ANSWERS TO EXERCISES
    Solution to exercise 1
When $\lambda = 0$ (assuming for simplicity that $y_{t-i} = \bar{y}$ for all $i \geq 0$) the agent has an initial consumption level $c_t = \bar{y}$ and a stock of financial assets at the beginning of period $t+1$ equal to zero: $A_{t+1} = 0$. In period $t+1$, we have
$$c_{t+1} = \bar{y} + \frac{r}{1+r}\,\varepsilon_{t+1}, \qquad s_{t+1} = y_{t+1} - c_{t+1} = \frac{1}{1+r}\,\varepsilon_{t+1} = A_{t+2}.$$
    In subsequent periods (with no further innovations) current income will go
    back to its mean value ȳ, and consumption will remain at the higher level
    computed for t + 1. The return on financial wealth accumulated in t + 1 allows
    the consumer to maintain such higher consumption level over the entire
    future horizon:
$$y^D_{t+2} = y_{t+2} + rA_{t+2} = \bar{y} + \frac{r}{1+r}\,\varepsilon_{t+1} = c_{t+2} \;\Rightarrow\; s_{t+2} = 0.$$
The same is true for all periods $t+i$ with $i > 2$. There is no saving, and the level of $A$ remains equal to $A_{t+2}$. When $\lambda = 1$, the whole increase in income is permanent and is entirely consumed. There is no need to save in order to keep the higher level of consumption in the future.
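To make the mechanics concrete, here is a minimal Python sketch (not part of the original solution; the interest rate, mean income, and innovation size are arbitrary assumptions) that traces consumption, saving, and assets after a purely transitory innovation, i.e. the $\lambda = 0$ case:

```python
# Consumption and assets after a one-off transitory income innovation (lambda = 0),
# using the consumption function c_t = r*A_t + ybar + (r/(1+r))*(y_t - ybar)
r, ybar, eps = 0.05, 1.0, 0.5            # assumed interest rate, mean income, innovation size
A = 0.0                                   # financial assets at the start of the first period
incomes = [ybar, ybar + eps, ybar, ybar, ybar]   # the innovation hits in the second period
for y in incomes:
    c = r * A + ybar + (r / (1 + r)) * (y - ybar)    # consumption
    s = y + r * A - c                                # saving out of disposable income
    A += s                                           # assets carried into the next period
    print(round(c, 4), round(s, 4), round(A, 4))
# Consumption jumps by (r/(1+r))*eps at the shock date and stays there;
# after that date saving is zero and A remains equal to eps/(1+r).
```

The printed path reproduces the closed-form result above: only the annuity value of the innovation is consumed, and the remainder is saved once and for all.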
    Solution to exercise 2
    We look for a consumption function of the general form
$$c_t = r(A_t + H_t) = rA_t + \frac{r}{1+r}\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i E_t\, y_{t+i},$$
as in (1.12) in the main text. Given the assumed stochastic process for income, we can compute expectations of future incomes and then the value of human wealth $H_t$. We have
$$E_t\, y_{t+1} = \lambda y_t + (1-\lambda)\bar{y}$$
$$E_t\, y_{t+2} = \lambda^2 y_t + (1+\lambda)(1-\lambda)\bar{y}$$
$$\ldots$$
$$E_t\, y_{t+i} = \lambda^i y_t + (1+\lambda+\ldots+\lambda^{i-1})(1-\lambda)\bar{y} = \lambda^i y_t + (1-\lambda^i)\bar{y}.$$

Plugging the last expression above into the definition of $H_t$, we get
$$H_t = \frac{1}{1+r}\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i\left(\lambda^i y_t + (1-\lambda^i)\bar{y}\right)$$
$$= \frac{1}{1+r}\left[y_t\sum_{i=0}^{\infty}\left(\frac{\lambda}{1+r}\right)^i + \bar{y}\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i - \bar{y}\sum_{i=0}^{\infty}\left(\frac{\lambda}{1+r}\right)^i\right]$$
$$= \frac{1}{1+r}\left[y_t\,\frac{1+r}{1+r-\lambda} + \bar{y}\left(\frac{1+r}{r} - \frac{1+r}{1+r-\lambda}\right)\right]$$
$$= \frac{1}{1+r-\lambda}\,y_t + \frac{1-\lambda}{r(1+r-\lambda)}\,\bar{y}.$$
The consumption function is then
$$c_t = r(A_t + H_t) = rA_t + \frac{r}{1+r-\lambda}\,y_t + \frac{1-\lambda}{1+r-\lambda}\,\bar{y}.$$
If $\lambda = 1$, income innovations are permanent and the best forecast of all future incomes is simply current income $y_t$. Thus, consumption will be equal to total income (interest income plus labor income):
$$c_t = rA_t + y_t.$$
If $\lambda = 0$, income innovations are purely temporary and the best forecast of future incomes is mean income $\bar{y}$. Consumption will then be
$$c_t = rA_t + \bar{y} + \frac{r}{1+r}(y_t - \bar{y}).$$
The last term measures the annuity value (at the beginning of period $t$) of the income innovation that occurred in period $t$ and is therefore known to the consumer (indeed, $y_t - \bar{y} = \varepsilon_t$).
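As a quick consistency check of the algebra above (a sketch, not from the text; the parameter values are arbitrary assumptions), the closed-form human wealth can be compared with a directly computed, truncated present-value sum of expected incomes:

```python
# Human wealth under AR(1) income: closed form vs. truncated present-value sum
r, lam, ybar, y_t = 0.04, 0.6, 1.0, 1.3   # assumed interest rate, persistence, mean and current income

# Closed form derived above
H_closed = y_t / (1 + r - lam) + (1 - lam) * ybar / (r * (1 + r - lam))

# Direct sum: H_t = (1/(1+r)) * sum_i (1/(1+r))^i * E_t y_{t+i}
H_sum = sum((1 / (1 + r)) ** (i + 1) * (lam ** i * y_t + (1 - lam ** i) * ybar)
            for i in range(5000))

print(round(H_closed, 6), round(H_sum, 6))   # the two numbers coincide
print(round(r * H_closed, 6))                # consumption when A_t = 0
```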
    Solution to exercise 3
Since $c_2 = w_1 - c_1 + w_2$, from the first-order condition
$$\frac{1}{c_1} = E\!\left(\frac{1}{c_2}\right)$$
we get
$$\frac{1}{c_1} = \frac{p}{w_1 - c_1 + x} + \frac{1-p}{w_1 - c_1 + y}.$$
Rearranging and writing $px + (1-p)y = z$, we get
$$(w_1 - c_1 - z + y + x)\,c_1 = (w_1 - c_1 + x)(w_1 - c_1 + y).$$

This is a quadratic equation for $c_1$, so a closed-form solution is available. Writing $x = z + \Delta$, $y = z - \Delta$, the first-order condition reads
$$(w_1 - c_1 + z)\,c_1 = (w_1 - c_1 + z + \Delta)(w_1 - c_1 + z - \Delta).$$
In the absence of uncertainty ($\Delta = 0$), the solution is $c_1 = (w_1 + z)/2$. (With discount and return rates both equal to zero, the agent consumes half of the available resources in each period.) For general $\Delta$ the optimality condition is solved by
$$c_1 = \frac{3}{4}(w_1 + z) \pm \frac{1}{4}\sqrt{(w_1 + z)^2 + 8\Delta^2}.$$
Selecting the negative square root ensures that the solution approaches the appropriate limit as $\Delta \to 0$, and implies that uncertainty reduces first-period consumption (for precautionary motives). An analytic solution would be impossible for even slightly more complicated maximization problems. This is why studies of precautionary saving prefer to specify the utility function in exponential form, rather than logarithmic or other CRRA forms.
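A small numerical illustration (with assumed parameter values, not part of the original solution) confirms that the negative root reproduces the certainty solution at $\Delta = 0$ and that larger income spreads lower first-period consumption:

```python
import math

# First-period consumption from the closed form c1 = 3(w1+z)/4 - sqrt((w1+z)^2 + 8*Delta^2)/4
def c1(w1, z, delta):
    a = w1 + z
    return 0.75 * a - 0.25 * math.sqrt(a * a + 8 * delta * delta)

w1, z = 2.0, 1.0                      # assumed first-period wealth and mean second-period income
for delta in (0.0, 0.25, 0.5, 0.75):  # increasing income risk
    print(delta, round(c1(w1, z, delta), 4))
# delta = 0 reproduces (w1 + z)/2 = 1.5; larger delta lowers c1 (precautionary saving)
```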
    Solution to exercise 4
    Solving the consumer’s problem, we get the following first-order condition
    (see the main text for the solution in the certainty case):
$$\frac{1+r}{1+\rho}\,E_t\!\left[\left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}\right] = 1.$$
The assumption $\Delta\log c_{t+1} \sim N\!\left(E_t(\Delta\log c_{t+1}),\,\sigma^2\right)$ yields
$$-\gamma\,\Delta\log c_{t+1} \sim N\!\left(-\gamma\,E_t(\Delta\log c_{t+1}),\,\gamma^2\sigma^2\right).$$
Using the properties of the lognormal distribution, we can write the Euler equation as
$$\frac{1+r}{1+\rho}\,e^{-\gamma E_t(\Delta\log c_{t+1}) + (\gamma^2/2)\sigma^2} = 1.$$
Taking logarithms (and using the approximations $\log(1+r) \simeq r$ and $\log(1+\rho) \simeq \rho$), the following expression for the expected rate of change of consumption is obtained:
$$E_t(\Delta\log c_{t+1}) = \frac{1}{\gamma}(r - \rho) + \frac{\gamma}{2}\,\sigma^2.$$
The uncertainty about future consumption levels, captured by the variance $\sigma^2$, induces the (prudent) consumer to transfer resources from the present to the future, generating an increasing path of consumption over time.
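The lognormal step is easy to verify by simulation. The following sketch (illustrative only; the parameter values are assumptions) draws log consumption growth from a normal distribution with mean given by the formula just derived and checks that the Euler equation holds on average:

```python
import numpy as np

r, rho, gamma, sigma = 0.03, 0.01, 2.0, 0.1      # assumed preference and return parameters
mu = (r - rho) / gamma + gamma * sigma**2 / 2    # expected consumption growth from the formula

rng = np.random.default_rng(0)
dlogc = rng.normal(mu, sigma, size=1_000_000)    # simulated log consumption growth
euler = (1 + r) / (1 + rho) * np.mean(np.exp(-gamma * dlogc))
print(round(euler, 3))   # approximately 1: the Euler equation holds in expectation
```

(The simulated value is close to, but not exactly, one because the derivation replaces $\log(1+r) - \log(1+\rho)$ with $r - \rho$.)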

    Solution to exercise 5
    (a) The increase of mean income changes the consumer’s permanent
income. Both permanent income and consumption increase by $\Delta\bar{y}$. Formally,
$$\Delta c_{t+1} = \Delta y^P_{t+1} = \frac{r}{1+r}\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i (E_{t+1} - E_t)\,y_{t+1+i} = \frac{r}{1+r}\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i \Delta\bar{y} = \Delta\bar{y}.$$
    Since the income change is entirely permanent, saving is not affected.
    (b) In order to find the change in consumption following an innovation in
    income, it is necessary to compute the revision in expectations of future
    incomes caused by εt +1. Given the stochastic process for labor income,
    we have
$$(E_{t+1} - E_t)\,y_{t+1} = \varepsilon_{t+1},$$
$$(E_{t+1} - E_t)\,y_{t+2} = -\delta\varepsilon_{t+1},$$
$$(E_{t+1} - E_t)\,y_{t+i} = 0 \quad \text{for } i > 2.$$
    Applying the general formula for the change in consumption, we get
$$\Delta c_{t+1} = r\,(H_{t+1} - E_t H_{t+1}) = \frac{r}{1+r}\left(\varepsilon_{t+1} - \frac{1}{1+r}\,\delta\varepsilon_{t+1}\right) = \frac{r(1+r-\delta)}{(1+r)^2}\,\varepsilon_{t+1}.$$
    The increase in consumption is lower than the increase in income since
the latter is only temporary. The higher is $\delta$, the lower is the change in consumption, because a positive income innovation in $t+1$ ($\varepsilon_{t+1}$) is offset by a negative income change ($-\delta\varepsilon_{t+1}$) in the following period.
    (c) The behavior of saving reflects the expectation of future income
    changes. Given εt +1 and using the stochastic process for income, we
    obtain
$$y_{t+1} = \bar{y} + \varepsilon_{t+1}$$
$$y_{t+2} = \bar{y} - \delta\varepsilon_{t+1} \;\Rightarrow\; \Delta y_{t+2} = -(1+\delta)\varepsilon_{t+1}$$
$$y_{t+3} = \bar{y} \;\Rightarrow\; \Delta y_{t+3} = \delta\varepsilon_{t+1}.$$

    (No income changes are foreseen for subsequent periods.) Saving in
    t + 1 and t + 2 is then
$$s_{t+1} = -\sum_{i=1}^{\infty}\left(\frac{1}{1+r}\right)^i E_{t+1}\Delta y_{t+1+i} = -\left[-\frac{1+\delta}{1+r} + \frac{\delta}{(1+r)^2}\right]\varepsilon_{t+1} = \frac{1 + r(1+\delta)}{(1+r)^2}\,\varepsilon_{t+1} > 0,$$
$$s_{t+2} = -\sum_{i=1}^{\infty}\left(\frac{1}{1+r}\right)^i E_{t+2}\Delta y_{t+2+i} = -\frac{\delta}{1+r}\,\varepsilon_{t+1} < 0.$$
In $t+1$ a portion of the higher income is saved, since the consumer knows its transitory nature and therefore anticipates further income changes in the two following periods. In $t+2$ income is temporarily lower than average and the agent finances consumption with the income saved in the previous period: in $t+2$, then, saving is negative.

Solution to exercise 6

For each period from $t$ onwards, the consumer must choose both the consumption of non-durable goods $c_{t+i}$ (which coincides with expenditure) and the expenditure on durable goods $d_{t+i}$ (which adds to the stock and starts to provide utility in the period after the purchase). The utility maximization problem is then solved for $c_{t+i}$ and $d_{t+i}$. Besides the constraints in the main text, we must consider the transversality condition on financial wealth $A$ and the non-negativity constraints on the stock of durable goods $S$ and on consumption $c$ (though we will not explicitly use these additional constraints in the solution below). Following the solution procedure already used in the main text, we substitute the two constraints into the utility function to be maximized. Combining the constraints, we can write consumption as
$$c_{t+i} = (1+r)A_{t+i} - A_{t+i+1} + y_{t+i} - p_{t+i}\,[S_{t+i+1} - (1-\delta)S_{t+i}].$$
Plugging the above expression into the objective function, we get the following optimization problem:
$$\max_{A_{t+i},\,S_{t+i}} U_t = \sum_{i=0}^{\infty}\left(\frac{1}{1+\rho}\right)^i u\big((1+r)A_{t+i} - A_{t+i+1} + y_{t+i} - p_{t+i}(S_{t+i+1} - (1-\delta)S_{t+i}),\; S_{t+i}\big).$$
Expanding the first two terms of the summation (for $i = 0, 1$) and differentiating with respect to $A_{t+1}$ and $S_{t+1}$, we obtain the following first-order conditions:
$$\frac{\partial U_t}{\partial A_{t+1}} = -u_c(c_t, S_t) + \frac{1+r}{1+\rho}\,u_c(c_{t+1}, S_{t+1}) = 0,$$
$$\frac{\partial U_t}{\partial S_{t+1}} = -p_t\,u_c(c_t, S_t) + \frac{1}{1+\rho}\,p_{t+1}(1-\delta)\,u_c(c_{t+1}, S_{t+1}) + \frac{1}{1+\rho}\,u_S(c_{t+1}, S_{t+1}) = 0.$$
The consumer makes two decisions. First, he chooses between consumption in the current period and in the next period (and thus between consumption and saving). Second, he chooses between spending on non-durable goods and spending on durable goods, which yield deferred utility. The first-order conditions above illustrate these two choices. The first condition captures the choice between consumption and saving, as in (1.5) in the main text,
$$u_c(c_t, S_t) = \frac{1+r}{1+\rho}\,u_c(c_{t+1}, S_{t+1}),$$
and bears the usual interpretation: the loss of marginal utility arising from the decrease in consumption at time $t$ must be offset by the marginal utility (discounted at rate $\rho$) obtained by accumulating financial assets with gross return $1+r$. The choice between spending on non-durable goods and purchasing durables is illustrated by the second condition, rewritten as
$$p_t\,u_c(c_t, S_t) = \frac{1}{1+\rho}\,\big[u_S(c_{t+1}, S_{t+1}) + (1-\delta)\,p_{t+1}\,u_c(c_{t+1}, S_{t+1})\big].$$
One unit of the durable good purchased at time $t$ entails a decrease of spending on (and consumption of) $p_t$ units of non-durable goods, with a utility loss measured by $p_t\,u_c(c_t, S_t)$ on the left-hand side of the above equation. In equilibrium, this loss must be offset, in the following period, by the higher utility stemming from the unit increase in the stock of durables. This increase in utility, measured on the right-hand side of the equation, has two components (both discounted at rate $\rho$). The first is the marginal utility of the stock of durables at the beginning of period $t+1$.
The second accounts for the additional resources that an increase in the stock of durables makes available for consumption in $t+1$ by reducing the need for further purchases, $d_{t+1}$. These additional resources are measured by $(1-\delta)p_{t+1}$, yielding utility $(1-\delta)p_{t+1}u_c(c_{t+1}, S_{t+1})$.

Solution to exercise 7

(a) In each period, utility is affected positively by consumption in the current period and negatively by consumption in the previous period. This formulation of utility may capture habit formation: a high level of consumption in period $t$ decreases utility in period $t+1$ (but increases period $t+1$ marginal utility). Therefore, the agent is induced to increase consumption in period $t+1$. This effect is due to a consumption "habit" (related to the previous period's level of $c$) that makes the agent increase consumption over time.

(b) Substituting $c_{t+i}$ and $c_{t+i-1}$ from the budget constraints of two subsequent periods into the objective function and differentiating with respect to financial wealth, we obtain the following first-order condition (Euler equation):
$$E_t\,u'(c_{t+i-1}, c_{t+i-2}) = \frac{1+r+\gamma}{1+\rho}\,E_t\,u'(c_{t+i}, c_{t+i-1}) - \frac{(1+r)\gamma}{(1+\rho)^2}\,E_t\,u'(c_{t+i+1}, c_{t+i}).$$
Setting $i = 0$ and $\rho = r$, and assuming quadratic utility so that $u'(c_{t+i}, c_{t+i-1}) = 1 - b(c_{t+i} - \gamma c_{t+i-1})$, we get
$$1 - b(c_{t-1} - \gamma c_{t-2}) = \frac{1+r+\gamma}{1+r}\,[1 - b(c_t - \gamma c_{t-1})] - \frac{\gamma}{1+r}\,[1 - b(E_t c_{t+1} - \gamma c_t)],$$
or
$$\gamma E_t c_{t+1} = (1+r+\gamma+\gamma^2)\,c_t - [(1+r+\gamma)\gamma + (1+r)]\,c_{t-1} + \gamma(1+r)\,c_{t-2}.$$
Using first differences of consumption,
$$\gamma E_t \Delta c_{t+1} = (1+r+\gamma^2)\,\Delta c_t - (1+r)\gamma\,\Delta c_{t-1}.$$
The change in consumption between $t$ and $t+1$ depends on past values of $\Delta c$ and therefore is not orthogonal to all variables dated $t$. If in each period utility depends on consumption in the current and the previous period, then in choosing between $c_t$ and $c_{t+1}$ the agent considers the effects on utility not only at $t$ and $t+1$ (as in the case of a time-separable utility function), but also at $t+2$. This creates an intertemporal link between the marginal utility in three subsequent periods and hence, with quadratic utility, between the consumption levels in subsequent periods. In this case there is a dynamic relation between $c_{t+1}$, $c_t$, $c_{t-1}$, and $c_{t-2}$, which makes the consumption change $\Delta c_{t+1}$ dependent on the lagged changes $\Delta c_t$ and $\Delta c_{t-1}$. Therefore, the orthogonality conditions that hold with separable utility are not valid here.

Solution to exercise 8

(a) The change in permanent income for agents, $\Delta y^P_t$, is found from the following version of equation (1.6):
$$\Delta y^P_t = \frac{r}{1+r}\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i\big[E(y_{t+i}\mid I_t) - E(y_{t+i}\mid I_{t-1})\big],$$
where the information set used by agents ($I$) has been made explicit. It is then necessary to compute the "surprises" $y_t - E(y_t\mid I_{t-1})$, $E(y_{t+1}\mid I_t) - E(y_{t+1}\mid I_{t-1})$, etc. Since agents observe the realization of $x$ in each period, using the stochastic process for income we have
$$E(y_t\mid I_{t-1}) = \lambda y_{t-1} + x_{t-1},$$
from which we obtain $y_t - E(y_t\mid I_{t-1}) = \varepsilon_{1t}$. Recalling that the properties of $x$ imply that $E(x_t\mid I_{t-1}) = 0$, to compute the second "surprise" we use the following expressions:
$$E(y_{t+1}\mid I_t) = \lambda y_t + x_t, \qquad E(y_{t+1}\mid I_{t-1}) = \lambda E(y_t\mid I_{t-1}),$$
from which
$$E(y_{t+1}\mid I_t) - E(y_{t+1}\mid I_{t-1}) = \lambda\big(y_t - E(y_t\mid I_{t-1})\big) + x_t = \lambda\varepsilon_{1t} + x_t.$$
Iterating the same procedure, we find, for $i \geq 1$,
$$E(y_{t+i}\mid I_t) - E(y_{t+i}\mid I_{t-1}) = \lambda^{i-1}(\lambda\varepsilon_{1t} + x_t).$$
The change in permanent income is then given by
$$\Delta y^P_t = \frac{r}{1+r}\left[\varepsilon_{1t} + \sum_{i=1}^{\infty}\left(\frac{1}{1+r}\right)^i \lambda^{i-1}(\lambda\varepsilon_{1t} + x_t)\right] = \frac{r}{1+r}\left[\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i \lambda^i \varepsilon_{1t} + \sum_{i=1}^{\infty}\left(\frac{1}{1+r}\right)^i \lambda^{i-1} x_t\right]$$
$$= \frac{r}{1+r}\left(\frac{1+r}{1+r-\lambda}\,\varepsilon_{1t} + \frac{1}{1+r-\lambda}\,x_t\right) = \frac{r}{1+r-\lambda}\left(\varepsilon_{1t} + \frac{1}{1+r}\,x_t\right).$$
Now consider the change in permanent income ($\Delta\tilde{y}^P_t$) computed by the econometrician, who does not observe the realization of $x$. The relevant "surprises" are then $y_t - E(y_t\mid\Omega_{t-1})$, $E(y_{t+1}\mid\Omega_t) - E(y_{t+1}\mid\Omega_{t-1})$, etc. As in the previous case, we get
$$E(y_t\mid\Omega_{t-1}) = \lambda y_{t-1}, \qquad E(y_{t+1}\mid\Omega_t) = \lambda y_t, \qquad E(y_{t+1}\mid\Omega_{t-1}) = \lambda E(y_t\mid\Omega_{t-1}),$$
from which we compute the "surprises":
$$y_t - E(y_t\mid\Omega_{t-1}) = \varepsilon_{1t} + x_{t-1}$$
$$E(y_{t+1}\mid\Omega_t) - E(y_{t+1}\mid\Omega_{t-1}) = \lambda(\varepsilon_{1t} + x_{t-1})$$
$$\ldots$$
$$E(y_{t+i}\mid\Omega_t) - E(y_{t+i}\mid\Omega_{t-1}) = \lambda^i(\varepsilon_{1t} + x_{t-1}).$$
Finally, using equation (1.7), we obtain
$$\Delta\tilde{y}^P_t = \frac{r}{1+r}\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i \lambda^i(\varepsilon_{1t} + x_{t-1}) = \frac{r}{1+r-\lambda}\,(\varepsilon_{1t} + x_{t-1}).$$
(b) The variability of permanent income, measured by the variances of $\Delta y^P_t$ and $\Delta\tilde{y}^P_t$, is
$$\mathrm{var}(\Delta y^P_t) = \psi^2\left(\sigma_1^2 + \left(\frac{1}{1+r}\right)^2\sigma_x^2\right), \qquad \mathrm{var}(\Delta\tilde{y}^P_t) = \psi^2\left(\sigma_1^2 + \sigma_x^2\right),$$
where $\psi \equiv r/(1+r-\lambda)$, $\sigma_1^2 \equiv \mathrm{var}(\varepsilon_1)$, and $\sigma_x^2 \equiv \mathrm{var}(x)$. We then find that $\mathrm{var}(\Delta y^P_t) < \mathrm{var}(\Delta\tilde{y}^P_t)$: the variability of permanent income estimated by the econometrician is higher than the variability perceived by agents. Overestimating the unforeseen changes in income may lead to the conclusion that consumption is excessively smooth, even though agents behave as predicted by the rational expectations–permanent income theory.

Solution to exercise 9

(a) For the assumed utility function, marginal utility is
$$u'(c) = \begin{cases} a - bc & \text{for } c < a/b, \\ 0 & \text{for } c \geq a/b. \end{cases}$$
As shown in the figure, marginal utility is convex in the neighborhood of $c = a/b$, where it becomes zero. Therefore, there exists a precautionary saving motive.

(b) The optimality condition for $c_1$ is $u'(c_1) = E_1[u'(c_2)]$. If $\sigma = 0$, we get $c_1 = c_2 = a/b$: in each period income is entirely consumed, there is no saving, and marginal utility is zero. If $\sigma > 0$, with
$c_1 = a/b$, in the second period the agent consumes either $a/b + \sigma$ (with zero marginal utility) or $a/b - \sigma$ (with positive marginal utility) with equal probability. The expected value of the second-period marginal utility will then be positive, violating the optimality condition. Therefore, when $\sigma > 0$ the agent is induced to consume less than $a/b$ in the first period. Writing the realizations of second-period income and consumption as
$$c_2 = y_2 + (y_1 - c_1) = \begin{cases} 2a/b - c_1 + \sigma \equiv c_2^H(c_1) & \text{with probability } 0.5, \\ 2a/b - c_1 - \sigma \equiv c_2^L(c_1) & \text{with probability } 0.5, \end{cases}$$
and noting that marginal utility is zero in the first case, the optimality condition becomes
$$a - bc_1 = \frac{1}{2}\big(a - b\,c_2^L(c_1)\big),$$
and the value of $c_1$ is computed as
$$c_1 = \frac{a}{b} - \frac{\sigma}{3}.$$
First-period consumption is decreasing in $\sigma$: income uncertainty gives rise to a precautionary saving motive.

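The closed-form result $c_1 = a/b - \sigma/3$ can be checked numerically (a sketch with assumed parameter values, not part of the original solution): a simple grid search for the $c_1$ that equates $u'(c_1)$ to $E_1[u'(c_2)]$ returns the same number.

```python
# Precautionary saving with kinked marginal utility u'(c) = max(a - b*c, 0)
a, b, sigma = 2.0, 1.0, 0.6          # assumed utility parameters and income spread
y1 = y2_mean = a / b                 # first-period income and mean second-period income

def mu(c):                           # marginal utility
    return max(a - b * c, 0.0)

def gap(c1):                         # u'(c1) - E[u'(c2)], with y2 = mean +/- sigma (equal probability)
    c2_hi = y2_mean + sigma + (y1 - c1)
    c2_lo = y2_mean - sigma + (y1 - c1)
    return mu(c1) - 0.5 * (mu(c2_hi) + mu(c2_lo))

# Grid search for the c1 that (approximately) sets the gap to zero
grid = [i / 10000 for i in range(1, 30000)]
c1_star = min(grid, key=lambda c: abs(gap(c)))
print(round(c1_star, 3), round(a / b - sigma / 3, 3))   # both print 1.8
```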
    Solution to exercise 10
If $G(\cdot)$ has the quadratic form proposed in the exercise, then the marginal investment cost $\partial G(K, I)/\partial I = 2xI$ has the same sign as the investment flow $I$. Since the optimal investment flow $I^*$ must satisfy the condition $2xI^* = \lambda$, where $\lambda$ is the marginal value of capital, $\lambda > 0$ implies $I^* > 0$. Intuitively, this functional form (whose slope at the origin is zero, rather than unity) implies costs for the firm not only when $I > 0$, but also when $I < 0$. As long as installed capital has a positive value, it cannot be optimal for the firm to pay costs in order to scrap it, and the optimal investment flow is never negative.

The slope at the origin of functions of the form $I^{\beta}$ is zero for all $\beta > 0$, and such functions are well defined for $I < 0$ only when $\beta$ is an integer. If $\beta$ is an even number, then the sign of $\partial G(K, I)/\partial I = x\beta I^{\beta-1}$ coincides with that of $I$ and, as in the case where $\beta = 2$, negative gross investment is never optimal. If $\beta$ is an odd integer then, as in the figure, the derivative of adjustment costs is always positive. Thus, negative investment yields positive cash flows, and may be optimal. The second derivative $\partial^2 G(K, I)/\partial I^2 = x\beta(\beta-1)I^{\beta-2}$, however, is not always positive as assumed in (2.4). Rather, it is negative for $I < 0$. This implies that the unit cash flow yielded by negative gross investment is increasingly large when increasingly negative values of $I$ are considered. Hence, the firm would profit from mixing periods of gradual positive investment (of arbitrarily small cost, since the function $G(\cdot)$ is flat for $I$ near zero) with sudden spurts of negative investment. Such functional forms make no economic sense, and also make it impossible to obtain a unique formal characterization of optimal investment. If the adjustment cost function had increasing returns to (negative) investment, the first-order conditions would not characterize optimal policies, and many different intermittent investment policies could yield an infinitely large firm value.

Solution to exercise 11

Employment of the flexible factor $N$ must satisfy in steady state, as always, the familiar first-order condition $\partial R(\cdot)/\partial N = w$. As mentioned in the text, if capital does not depreciate, its steady-state stock must satisfy the similarly familiar condition $\partial F(\cdot)/\partial K = rP_k$. Equivalently, since $\partial F(\cdot)/\partial K = \partial R(\cdot)/\partial K - P_k\,\partial G(\cdot)/\partial K$ and $\partial G(\cdot)/\partial K = 0$ in this exercise, $\partial R(\cdot)/\partial K = rP_k$. Thus, we need to characterize the effects of a smaller $w$ on the pair $(K_{ss}, N_{ss})$ that satisfies the two conditions. If revenues have the Cobb–Douglas form, the conditions
$$\frac{\alpha}{K_{ss}}\,K_{ss}^{\alpha}N_{ss}^{\beta} = rP_k, \qquad \frac{\beta}{N_{ss}}\,K_{ss}^{\alpha}N_{ss}^{\beta} = w,$$
can be solved if $\alpha + \beta < 1$, so that the firm has decreasing returns in production. Then we have
$$K_{ss} = w^{\beta/(\alpha+\beta-1)}\,(rP_k)^{(1-\beta)/(\alpha+\beta-1)}\,\alpha^{(\beta-1)/(\alpha+\beta-1)}\,\beta^{-\beta/(\alpha+\beta-1)},$$
and, since $\beta/(\alpha+\beta-1) < 0$, a smaller wage is associated with a higher steady-state capital stock.

Solution to exercise 12

If $G(\cdot)$ has constant returns to $K$ and $I$, we may write $G(I, K) = g\!\left(\frac{I}{K}\right)K$ and note that, by the investment first-order condition $g'\!\left(\frac{I}{K}\right) = q$, optimal investment is proportional to $K$ for given $q$: $I = \tilde{\iota}(q)K$. The portion of the firm's cash flows that pertains to investment costs, $P_k\,G(I, K) = g(\tilde{\iota}(q))K$, therefore has zero second derivative with respect to $K$. Since revenues (once optimized with respect to $N$) are also linear in $K$, $\partial F(\cdot)/\partial K$ does not depend on $K$, and the $\dot{q} = 0$ locus is horizontal. As for the $\dot{K} = 0$ locus, we noted when tracing phase diagrams that its slope tends to be positive when $\delta > 0$,
    since a higher q and more intense investment flows are needed to keep a larger
    capital stock constant. To determine the slope of the K̇ = 0 locus, however, the
derivative of G(·) with respect to K is also relevant when it is not zero (as was convenient to assume when drawing phase diagrams). In the case where G(·)
has constant returns, we can write
$$\dot{K} = \tilde{\iota}(q)K - \delta K = (\tilde{\iota}(q) - \delta)K$$
and find that, even when $\delta > 0$, the locus identified by setting this expression equal to zero is horizontal. As is the case in a static environment, the optimal size of a competitive firm with constant returns to scale is undetermined (if the two stationarity loci coincide), or tends to be infinitely large or small (if one locus lies above the other).
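To see in numbers why constant returns make the $\dot K = 0$ locus horizontal, the sketch below (illustrative only; the per-unit cost function $g(x) = x + \tfrac{b}{2}x^2$ and the parameter values are assumptions, not the book's specification) computes the implied investment rate $\tilde\iota(q)$ and shows that the growth rate $\dot K/K$ depends on $q$ but not on the level of $K$:

```python
# Constant-returns adjustment costs: G(I, K) = g(I/K) * K with g(x) = x + (b/2) x^2
b, delta = 2.0, 0.08                      # assumed adjustment-cost and depreciation parameters

def iota(q):                              # g'(x) = 1 + b*x = q  =>  x = (q - 1)/b
    return (q - 1.0) / b

for K in (1.0, 10.0, 100.0):
    for q in (1.0, 1.16, 1.3):
        growth = iota(q) - delta          # Kdot / K = iota(q) - delta, independent of K
        print(K, q, round(growth, 3))
# Kdot = 0 requires iota(q) = delta, i.e. q = 1 + b*delta = 1.16, whatever the level of K
```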
    Solution to exercise 13
    (a) As shown in the text, an increase in y has two effects on the steady-
    state value of q : a positive “dividend effect” and a negative “interest rate
    effect.” If the former dominates, the q̇ = 0 schedule slopes upwards in
    the (q , y) phase diagram, as in the figure.
Formally, from (2.35) we get
$$\left.\frac{dq}{dy}\right|_{\dot q = 0} = \frac{a_1 r - (a_0 + a_1 y)h_1/h_2}{r^2} > 0 \;\Longleftrightarrow\; a_1 > q\,\frac{h_1}{h_2},$$
where we used the expression $q = \pi/r$, which applies along the $\dot q = 0$ locus. This schedule crosses the stationary locus for $y$ from above, since
$$\lim_{y\to\infty}\left.\frac{dq}{dy}\right|_{\dot q = 0} = 0$$
and $q$ approaches the value $a_1 h_2/h_1$ asymptotically from below (for $y \to \infty$). Outside its stationary locus, $q$ retains the same dynamic properties illustrated in the main text: $\dot q > 0$ at all points above the
curve and $\dot q < 0$ below the curve. In this case the saddlepath slopes upwards, reflecting the fact that, when output increases towards the steady state of the system, the stronger influence on $q$ comes from dividends, which are also rising.

(b) Under the new assumption, the effects of the fiscal restriction on the steady-state values of output and the interest rate are similar to those reported in the text: both $y$ and $r$ decrease. However, the effect on the steady-state value of $q$ is different: here $q$ is affected mainly by lower dividends, and attains a lower level in the final steady state. The permanent reduction in output (and dividends) is foreseen by agents at $t = 0$, when the future fiscal restriction is announced. The ensuing portfolio reallocation away from shares and toward bonds determines an immediate decrease in stock market prices, with a depressing effect on private investment, aggregate demand, and (starting gradually from $t = 0$) output. At the implementation date $t = T$ the economy is on the saddlepath converging to the new steady-state position. In contrast with the case of a dominant "interest rate effect," here in the final steady state there is less public spending and less private investment; moreover, the (apparently) perverse temporary effect of fiscal policy on output does not occur.

Solution to exercise 14

(a) With $F(t) = R(K(t)) - G(I)$, the dynamic optimality conditions
$$G'(I) = \lambda, \qquad \dot\lambda - r\lambda = -F'(K) + \delta\lambda,$$
are necessary and sufficient if $G'(I) > 0$, $F'(K) > 0$, $F''(K) \leq 0$, and
$G''(I) > 0$. The optimal investment flow is a function $\iota(\cdot)$ of $q$ (or, since $P_k = 1$, of $\lambda$), where $\iota(\cdot)$ is the inverse of $G'(\cdot)$. Inserting $I = \iota(q)$ in the accumulation constraint, using the second optimality condition, and noting that $\dot q = \dot\lambda$, we obtain a system of two differential equations:
$$\dot K = \iota(q) - \delta K, \qquad \dot q = (r+\delta)q - F'(K).$$
The dynamics of $K$ and $q$ can be studied in a phase diagram, with $q$ on the vertical axis and $K$ on the horizontal axis. The locus where $\dot q = 0$ is negatively sloped if $F''(K) < 0$; the locus where $\dot K = 0$ is positively sloped if $\delta > 0$. The point where the two meet identifies the steady state, and the system converges toward it along a negatively sloped saddlepath.

(b) For these functional forms, $F'(K) = \alpha$, $F''(K) = 0$, and $G'(I) = 1 + 2bI$. Hence $\iota(q) = (q-1)/(2b)$, and the dynamic equations are
$$\dot K = \frac{q-1}{2b} - \delta K, \qquad \dot q = (r+\delta)q - \alpha.$$
The locus along which capital is constant,
$$(\dot K = 0) \;\Rightarrow\; q = 1 + 2b\delta K,$$
is positively sloped if $\delta > 0$, while
$$(\dot q = 0) \;\Rightarrow\; q = \frac{\alpha}{r+\delta}$$
identifies a horizontal line: the shadow price of capital, given by the marginal present discounted (at rate $r+\delta$) contribution of capital to the firm's cash flow, is constant if $\partial^2 F(\cdot)/\partial K^2 = 0$, as is the case here. The saddlepath coincides with the $\dot q = 0$ locus, on which the system must stay throughout its convergent trajectory. In steady state, imposing $\dot K = 0$, we have
$$K_{ss} = \frac{1}{\delta}\,\frac{q-1}{2b} = \frac{\alpha - (r+\delta)}{(r+\delta)\,2b\delta}.$$
The firm's capital stock is an increasing function of the difference between $\alpha$ (the marginal revenue product of capital) and $r+\delta$ (the financial and depreciation cost of each installed unit of capital). If $\alpha > r+\delta$, the steady-state capital stock is finite provided that $b\delta > 0$. As the capital stock increases, in fact, an increasingly large investment flow per unit of time is needed to offset depreciation. Since unit gross investment costs are increasing, in the long run the optimal capital stock is such that the benefit $\alpha - (r+\delta)$ of an additional unit is exactly offset by the higher marginal cost of the investment needed to keep it constant. If $\alpha < r+\delta$, the revenues afforded by capital are smaller than its opportunity cost, and it is never optimal to invest. If $\delta \to 0$ (and also if $b \to 0$) the $\dot K = 0$ locus is horizontal, like the one where $\dot q = 0$, and the steady state is ill-defined: the expression above implies that $K_{ss}$ tends to infinity if $\alpha > r$, tends to minus infinity (or zero, in light of
an obvious non-negativity constraint on capital) if $\alpha < r$, and is not determined if $\alpha = r$.

Solution to exercise 15

(a) It must be the case that cash flows are concave in the endogenous variables: $\alpha > 0$, $\beta > 0$, $\alpha + \beta \leq 1$, and $G(\cdot)$ convex.
(b) The diagram is similar to that of Figure 2.5. Since there is no depreciation, the slope of the $\dot K = 0$ locus depends on how the capital stock affects the marginal cost of investment: if a given investment flow is less expensive when more capital is already installed, that is, if
$$\frac{\partial^2 G(x, y)}{\partial x\,\partial y} < 0,$$
then the $\dot K = 0$ locus is negatively sloped. If it is steeper than the $\dot q = 0$ locus, then the system's dynamics will be globally unstable: when investing, the firm reduces the cost of further investment so strongly as to more than offset the decline in capital's marginal revenue product. Dynamics are well behaved if, instead, the $\dot K = 0$ schedule meets the $\dot q = 0$ schedule from above, in which case a change of $P_k$ relocates the $\dot q = 0$ schedule and $q$ jumps onto the new saddlepath. A higher $P_k$ decreases $K$ in the new steady state.

(c) As in exercise 10, a quadratic form for $G(\cdot)$ implies that investment is almost costless when it is very small. This is not realistic, and $P_k$ represents the market price of capital net of adjustment costs only if the derivative of adjustment costs is unity at $\dot K = 0$. Cubic functional forms are not convex for $\dot K < 0$, implying that first-order conditions do not identify an optimum.

(d) As usual, the first-order condition is $g'(\dot K(0)) = \lambda(0)$. Since $R(\cdot)$ is linearly homogeneous and $G(\cdot)$ is independent of $K$, the shadow price of capital does not depend on future capital stocks, and is a convex function of the exogenous wage $w$:
$$\lambda(0) = \text{constant}\cdot\int_0^{\infty} e^{-rt}\,E_0\!\left[\frac{(w(t))^{\beta}}{\beta - 1}\right]dt.$$
Larger values of the volatility parameter increase the variance of $w$ over the period (from $T$ to infinity) in which it is positive from the standpoint of time 0. Thus, Jensen's inequality (as in Figure 2.11) associates higher volatility with higher shadow values, and with larger investment flows between $t = 0$ and $t = T$.

Solution to exercise 16

(a) Cash flows are given by
$$F(t) = \alpha\sqrt{K(t)} + \beta\sqrt{L(t)} - wL(t) - I - \frac{\gamma}{2}I^2,$$
and the optimality conditions are
$$1 + \gamma I = \lambda \quad \text{(marginal investment cost = shadow price of capital)},$$
$$\frac{\beta}{2\sqrt{L}} = w \quad \text{(marginal revenue product of labor = wage)},$$
and
$$\dot\lambda - r\lambda = -\frac{\alpha}{2\sqrt{K}} + \delta\lambda$$
(capital gains minus the opportunity cost of funds = depreciation costs minus the marginal revenue product of capital). Hence, the dynamics are described by
$$\dot K = \frac{\lambda - 1}{\gamma} - \delta K, \qquad \dot\lambda = (r+\delta)\lambda - \frac{\alpha}{2\sqrt{K}}.$$
Graphically, this can be represented in a phase diagram for $(K, \lambda)$.

(b) Both the $\dot\lambda = 0$ and the $\dot K = 0$ loci move to the left, as shown in the figure. In the new steady state the capital stock is unambiguously smaller; intuitively, a higher marginal product is needed to offset the larger cost of a higher replacement investment flow. The effect on capital's shadow price and on the gross investment flow is ambiguous: in the graph, it depends on the slope of the two curves in the relevant region. Recall that $\lambda$ is the present discounted value of capital's contribution to the firm's revenues: in the new steady state, the latter is larger but it is more heavily discounted at rate $(r+\delta)$.

(c) For the functional form proposed, capital's marginal productivity is independent of $L$:
$$\frac{\partial^2 Y}{\partial K\,\partial L} = \frac{\partial}{\partial L}\left(\frac{\alpha}{2\sqrt{K}}\right) = 0,$$
and therefore the cost $w$ of factor $L$ has no implications for the firm's investment policy. If instead the mixed second derivative is not zero then, as in Figure 2.8, capital's marginal productivity evaluated at the optimal employment $L^*(w)$ of factor $L$ is a convex function of $w$, implying that variability of $w$ will lead the firm to invest more.

Solution to exercise 17

(a) The dynamic first-order condition is
$$(r+\delta)q = \frac{dR(\cdot)}{dK}\,\frac{1}{P_k} + \dot q,$$
or, with $(r+\delta) = 0.5$ and $dR(\cdot)/dK = 1 - K$,
$$0.5\,q = \frac{1-K}{P_k} + \dot q.$$
The optimality condition for investment flows is $G'(I) = q$. In this exercise, $G'(I) = 1 + I$.
Hence, $I = q - 1$, and optimal capital dynamics are described by
$$\dot K = q - 1 - 0.25K,$$
as illustrated in the phase diagram of the original figure.

(b) If the price of capital is halved, the $\dot q = 0$ schedule rotates clockwise around its intersection with the horizontal axis, and $q$ jumps onto the new saddlepath.

(c) From $T$ onwards, the $\dot q = 0$ locus returns to its original position. (The combination of the subsidy and the higher interest rate is exactly offset in the user cost of capital, and the marginal revenue product of capital is unaffected throughout.) Investment is initially lower than in the previous case: $q$ jumps, but does not reach the saddlepath; its trajectory reaches and crosses the $\dot K = 0$ locus, and would diverge if parameters did not change again at $T$. At time $T$ the original saddlepath is met, and the trajectory converges back to its starting point. The farther in the future is $T$, the longer-lasting is the investment increase; in the limit, as $T$ goes to infinity, the initial portion of the trajectory tends to coincide with the saddlepath.

Solution to exercise 18

(a) The conditions requested are
$$K^{1/2}N^{-1/2} = w, \qquad 1 + I = \lambda, \qquad -K^{-1/2}N^{1/2} + \delta\lambda = -r\lambda + \dot\lambda.$$

(b) From $K^{1/2}N^{-1/2} = w$ we have $N = K/w^2$, hence
$$F(t) = 2K^{1/2}N^{1/2} - G(I) - wN = \frac{2}{w}K - G(I) - \frac{1}{w}K,$$
$$\lambda(0) = \int_0^{\infty} e^{-(r+\delta)t}\,\frac{\partial F(\cdot)}{\partial K(t)}\,dt = \int_0^{\infty} e^{-(r+\delta)t}\,\frac{1}{w(t)}\,dt.$$

(c) $\lambda = 1/[(r+\delta)\bar w]$ is constant with respect to $K$. The form of adjustment costs and of the accumulation constraint imply that $I = \lambda - 1$ and that $\dot K = 0$ if $I = \delta K$, that is, if $\lambda = 1 + \delta K$, as shown in the figure.

(d) One would need to ensure that $G(\cdot)$ is linearly homogeneous in $I$ and $K$. For example, one could assume that
$$G(I, K) = I + \frac{1}{2K}I^2.$$

Solution to exercise 19

Denote gross employment variations in period $t$ by $\Delta\tilde N_t$: positive values of $\Delta\tilde N_t$ represent hiring at the beginning of period $t$, while negative values of $\Delta\tilde N_t$ represent firings at the end of period $t-1$. Noting that effective employment at date $t$ is given by $N_t = N_{t-1} + \Delta\tilde N_t - \delta N_{t-1}$, we have $\Delta\tilde N_t = \Delta N_t + \delta N_{t-1}$ for each $t$. If turnover costs depend on hiring and layoffs but not on voluntary quits, we can rewrite the firm's objective function as
$$V_t = E_t\left[\sum_{i=0}^{\infty}\left(\frac{1}{1+r}\right)^i\Big(R(Z_{t+i}, N_{t+i}) - wN_{t+i} - G(\Delta N_{t+i} + \delta N_{t+i-1})\Big)\right].$$
Introducing a parameter with the same role as $P_k$, that is, multiplying $G(\cdot)$ by a constant, influences the magnitude of the hiring and firing costs relative to the flow revenue $R(\cdot)$ and the wage bill $w_t N_t$. Such a constant of proportionality cannot be interpreted as the "price" of labor. Each unit of the factor $N$ is in fact paid a flow wage $w_t$, rather than a stock payment; for this reason, the slope of the original function $G(\cdot)$ is zero rather than one, as in the preceding chapter. In the problem we consider here, the wage plays a role similar to that of the user cost of capital in Chapter 2. To formulate the two problems in a similar fashion, we would need to assume that workers can be bought and sold at a unique price equal to the present discounted value of each worker's future earnings. One case in which it is easy to verify the equivalence between the flow and the stock payments is when the wage, the discount rate, and the layoff rate are constant: since only a fraction $e^{-(r+\delta)(\tau-t)}$ of the labor force employed at date $t$ has not yet been laid off at date $\tau$, the present value of the wage paid to each worker is given by
$$\int_t^{\infty} w\,e^{-(r+\delta)(\tau-t)}\,d\tau = \frac{w}{r+\delta}.$$
The role of this quantity is the same as that of the price of capital $P_k$ in the study of investment, and, as we mentioned, the wage $w$ coincides with the user cost of capital $(r+\delta)P_k$. The formal analogy between investment and the "purchase" and "sale" of workers (which remains valid if the wage and the other variables are time-varying) obviously has no practical relevance except in the case of slavery.

Solution to exercise 20

To compare these two expressions, remember that
$$\dot\lambda = [\lambda(t + dt) - \lambda(t)]/dt \approx [\lambda(t + \Delta t) - \lambda(t)]/\Delta t$$
for a finite $\Delta t$. Assuming $\Delta t = 1$, we get a discrete-time version of the optimality condition for the case of the Hamiltonian method,
$$r\lambda_t = \frac{\partial R(\cdot)}{\partial K} + \lambda_{t+1} - \lambda_t,$$
or alternatively
$$\lambda_t = \frac{1}{1+r}\,\frac{\partial R(\cdot)}{\partial K} + \frac{1}{1+r}\,\lambda_{t+1}.$$
This expression is very similar to (3.5). It differs in three respects that are easy to interpret. First of all, the operator $E_t[\cdot]$ is obviously redundant in (3.5), in which by assumption there is no uncertainty. Secondly, the discrete-time expression applies a discount factor to the marginal cash flow, but this factor is arbitrarily close to one in continuous time (where $dt = 0$ would replace $\Delta t = 1$). Finally, the two relationships differ also as regards the specification of the cash flow itself, in that only (3.5) deducts the wage $w$ from the marginal revenue. This difference occurs because labor is rewarded in flow terms. (The shadow value of labor therefore does not contain any resale value, as is the case with capital.)

Solution to exercise 21

If both functions are horizontal lines, the shadow value of labor will not depend on the employment level. Without loss of generality, we can then write $\mu(N, Z_g) = Z_g$, $\mu(N, Z_b) = Z_b$, and calculate the shadow values in the two possible situations. In the case considered here, (3.5) implies that
$$\lambda_g = \mu_g - w + \frac{1}{1+r}\big((1-p)\lambda_g + p\lambda_b\big), \qquad \lambda_b = \mu_b - w + \frac{1}{1+r}\big((1-p)\lambda_b + p\lambda_g\big),$$
a system of two linear equations in two unknowns whose solution is
$$\lambda_b = \frac{1+r}{r}\left(\frac{(r+p)\mu_b + p\mu_g}{r+2p} - w\right), \qquad \lambda_g = \frac{1+r}{r}\left(\frac{(r+p)\mu_g + p\mu_b}{r+2p} - w\right).$$
These two expressions are simply the expected discounted values of the excess of productivity (marginal and average) over the wage rate of each worker. In the absence of hiring and firing costs, the firm will choose either an infinitely large or a zero employment level, depending on which of the two shadow values is non-zero. On the contrary, if the costs of hiring and firing are positive, it is possible that
$$-F < \lambda_b < \lambda_g < H,$$
and thus that, as a result of (3.6), the firm will find it optimal not to vary the employment level. If only one marginal productivity is constant, then it may be optimal for the firm to hire and fire workers in such a way that the first-order conditions hold with equality:
$$\mu(N_g, Z_g) = w + \frac{pF}{1+r}$$
and
$$Z_b = w - \frac{(r+p)F}{1+r}$$
can be satisfied simultaneously only if the second condition (in which all variables are exogenous) holds by assumption. In this case, the first condition can be solved as
$$N_g = \frac{1}{\beta}\left(Z_g - w - \frac{pF}{1+r}\right).$$
As in many other economic applications, strict concavity of the objective function is essential to obtain an interior solution.

Solution to exercise 22

Subtracting the two equations in (3.9) term by term yields an expression for the difference between the two possible marginal productivities of labor:
$$\mu(N_g, Z_g) - \mu(N_b, Z_b) = (r+2p)\,\frac{H+F}{1+r}.$$
This expression is valid under the assumption that the firm hires and fires workers upon every change of the exogenous conditions represented by Zt . However, H and F can be so large, relative to variations in demand for labor, that the expression is satisfied only when Nb > Ng , as in the figure.
    Such an allocation is clearly not feasible: if Nb > Ng , the firm will need to fire
    workers whenever it faces an increase in demand, violating the assumptions
    under which we derived (3.9) and the equation above. (In fact, the formal
solution involves the paradoxical cases of "negative firing" and "negative hiring," with the receipt rather than the payment of turnover costs!) Hence, the firm is willing to remain completely inactive, with employment equal to any level within the inaction region in the figure. It is still true that employment takes only two values, but these values coincide and they are completely
    determined by the initial conditions.
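The linear system for the two shadow values in the solution to exercise 21 above is easy to verify numerically (a sketch with assumed parameter values, not part of the original text): fixed-point iteration on the two equations reproduces the closed-form expressions.

```python
# Two-state shadow value of labor: lambda_s = mu_s - w + (1/(1+r)) * E[lambda' | s]
r, p, w = 0.05, 0.3, 1.0                  # assumed discount rate, switching probability, wage
mu_g, mu_b = 1.4, 0.9                     # assumed (constant) marginal products in the two states

# Closed forms reported in the solution to exercise 21
lam_b = (1 + r) / r * (((r + p) * mu_b + p * mu_g) / (r + 2 * p) - w)
lam_g = (1 + r) / r * (((r + p) * mu_g + p * mu_b) / (r + 2 * p) - w)

# Fixed-point iteration on the two Bellman-type equations
lg = lb = 0.0
for _ in range(2000):
    lg, lb = (mu_g - w + ((1 - p) * lg + p * lb) / (1 + r),
              mu_b - w + ((1 - p) * lb + p * lg) / (1 + r))
print(round(lam_g, 4), round(lg, 4))      # identical values
print(round(lam_b, 4), round(lb, 4))
```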
    Solution to exercise 23
A trigonometric function such as $\sin(\cdot)$ is periodic with period $2\pi$; hence, the $Z(\tau)$ process has a cycle lasting $p$ periods. If $p$ = one year, the proposed perfectly cyclical behavior of revenues might be a stylized model of a firm in a seasonal industry, for example a ski resort. If the firm aims at maximizing its value, then
$$V_t = \int_t^{\infty}\big(R(L(\tau), Z(\tau)) - wL(\tau) - C(\dot X(\tau))\,\dot X(\tau)\big)\,e^{-r(\tau-t)}\,d\tau,$$
where $r > 0$ is the rate of discount and $R(\cdot)$ is the given revenue function. Then, with $\partial R(\cdot)/\partial L = M(\cdot)$ as given in the exercise, optimality requires that
$$-f \leq \int_t^{\infty}\big(M(L(\tau), Z(\tau)) - w\big)\,e^{-r(\tau-t)}\,d\tau \leq h$$
for all $t$: as in the model discussed in the chapter, the value of marginal changes in employment can never be larger than the cost of hiring, or more negative than the cost of firing. Further, and again in complete analogy with the discussion in the text, if the firm is hiring or firing, that relationship must hold with equality: if $\dot X_t < 0$,
$$-f = \int_t^{\infty}\big(M(L(\tau), Z(\tau)) - w\big)\,e^{-r(\tau-t)}\,d\tau, \qquad (*)$$
and if $\dot X_t > 0$,
$$\int_t^{\infty}\big(M(L(\tau), Z(\tau)) - w\big)\,e^{-r(\tau-t)}\,d\tau = h. \qquad (**)$$
Each complete cycle goes through a segment of time when the firm is hiring and a segment of time when the firm is firing (unless turnover costs are so large, relative to the amplitude of labor demand fluctuations, as to make inaction optimal at all times). Within each such interval the optimality conditions hold with equality, and using Leibnitz's rule to differentiate the relevant integral with respect to the lower limit of integration yields local Euler equations of the form
$$M(L(t), Z(t)) - w = rC(\dot L(t)).$$
Inverting the functional form given in the exercise, the level of employment is
$$\left(\frac{K_1 + K_2\sin\!\left(\frac{2\pi}{p}\tau\right)}{w - rf}\right)^{1/\beta}$$
whenever $\tau$ is such that the firm is firing, and
$$\left(\frac{K_1 + K_2\sin\!\left(\frac{2\pi}{p}\tau\right)}{w + rh}\right)^{1/\beta}$$
whenever $\tau$ is such that the firm is hiring. If $h + f > 0$, however, there must also be periods when the firm neither hires nor fires: specifically, inaction must be optimal around both the peaks and the troughs of the sine function. (Otherwise, some labor would be hired and immediately fired, or fired and immediately hired, and $h + f$ per unit would be paid with no counteracting benefits in continuous time.) To determine the optimal length of the inaction period following the hiring period, suppose time $t$ is the last instant in the hiring period, and denote by $T'$ the first time after $t$ at which firing is optimal at that same employment level: then it must be the case that
$$L(t) = \left(\frac{K_1 + K_2\sin\!\left(\frac{2\pi}{p}t\right)}{w + rh}\right)^{1/\beta} = \left(\frac{K_1 + K_2\sin\!\left(\frac{2\pi}{p}T'\right)}{w - rf}\right)^{1/\beta}.$$
This is one equation in $T'$ and $t$. Another can be obtained by inserting the given functional forms into equations (*) and (**), recognizing that the former applies at $T'$ and the latter at $t$, and rearranging:
$$\int_t^{T'} e^{-r(\tau-t)}\left[\left(K_1 + K_2\sin\!\left(\frac{2\pi}{p}\tau\right)\right)(L(t))^{-\beta} - w\right]d\tau = h + f\,e^{-r(T'-t)}.$$
The integral can be solved using the formula
$$\int e^{\lambda x}\sin(\gamma x)\,dx = \frac{\lambda e^{\lambda x}}{\gamma^2 + \lambda^2}\left(\sin(\gamma x) - \frac{\gamma}{\lambda}\cos(\gamma x)\right),$$
    but both the resulting expression and the other relevant equation are highly
    nonlinear in t and T ′, which therefore can be determined only numerically.
    See Bertola (1992) for a similar discussion of optimality around the cyclical
    trough, expressions allowing for labor “depreciation” (costless quits), sample
    numerical solutions, and analytical results and qualitative discussion for more
    general specifications.
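The two employment expressions can be tabulated over one cycle (a sketch under assumed parameter values; the functional form $M(L, Z) = (K_1 + K_2\sin(2\pi\tau/p))L^{-\beta}$ is the one implied by the inverted expressions above). The gap between the hiring-time and firing-time employment levels is where inaction can be optimal:

```python
import math

# Employment implied by the local Euler equation M(L, Z) - w = r*C(Ldot)
K1, K2, p, beta = 1.0, 0.3, 1.0, 0.5      # assumed revenue and cycle parameters
w, r, h, f = 0.8, 0.05, 0.5, 0.5          # assumed wage, discount rate, turnover costs

def L_hiring(tau):
    return ((K1 + K2 * math.sin(2 * math.pi * tau / p)) / (w + r * h)) ** (1 / beta)

def L_firing(tau):
    return ((K1 + K2 * math.sin(2 * math.pi * tau / p)) / (w - r * f)) ** (1 / beta)

for tau in [i / 8 for i in range(9)]:     # one full cycle
    print(round(tau, 3), round(L_hiring(tau), 3), round(L_firing(tau), 3))
# L_firing > L_firing's hiring counterpart at every instant:
# the firm stops hiring below the level at which it would start firing
```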
    Solution to exercise 24
Denoting by $\eta(t) \equiv Z(t)L(t)^{-\beta}$ labor's marginal revenue product, the shadow value of employment (the expected discounted cash-flow contribution of a marginal unit of labor) may be written as
$$\lambda(t) = \int_t^{\infty} E_t[\eta(\tau) - w]\,e^{-(r+\delta)(\tau-t)}\,d\tau,$$
and, by the usual argument, an optimal employment policy should never let it exceed zero (since hiring is costless) or fall short of $-F$ (the cost of firing a unit of labor). Hence, the optimality conditions have the form $-F \leq \lambda(t) \leq 0$ for all $t$, with $-F = \lambda(t)$ if the firm fires at $t$, and $\lambda(t) = 0$ if the firm hires at $t$. In order to make the solution explicit, it is useful to define a function returning the discounted expectation of future marginal revenue products along the optimal employment path,
$$v(\eta(t)) \equiv \int_t^{\infty} E_t[\eta(\tau)]\,e^{-(r+\delta)(\tau-t)}\,d\tau = \lambda(t) + \frac{w}{r+\delta}.$$
This function depends on $\eta(t)$, as written, only if the marginal revenue product process is Markov in levels. Here this is indeed the case, because in the absence of hiring or firing we can use the stochastic differentiation rule introduced in Section 2.7 to establish that, at all times when the firm is neither hiring nor firing,
$$d\eta(t) = d\big[Z(t)L(t)^{-\beta}\big] = L(t)^{-\beta}\,dZ(t) - \beta Z(t)L(t)^{-\beta-1}\,dL(t) = L(t)^{-\beta}\big[\theta Z(t)\,dt + \sigma Z(t)\,dW(t)\big] + \beta Z(t)L(t)^{-\beta-1}\,\delta L(t)\,dt = \eta(t)(\theta + \beta\delta)\,dt + \eta(t)\sigma\,dW(t)$$
is Markov in levels (a geometric Brownian motion), and we can proceed to show that optimal hiring and firing depend only on the current level of $\eta(t)$, hence preserving the Markov character of the process. In fact, we can use the stochastic differentiation rule again and apply it to the integral in the definition of $v(\cdot)$ to obtain a differential equation,
$$(r+\delta)\,v(\eta) = \eta + \frac{1}{dt}\left(\frac{\partial v(\cdot)}{\partial\eta}\,E(d\eta) + \frac{1}{2}\frac{\partial^2 v(\cdot)}{\partial\eta^2}\,(d\eta)^2\right) = \eta + \frac{\partial v(\cdot)}{\partial\eta}\,\eta(\theta+\beta\delta) + \frac{1}{2}\frac{\partial^2 v(\cdot)}{\partial\eta^2}\,\eta^2\sigma^2,$$
with solutions of the form
$$v(\eta) = \frac{\eta}{r + \delta - \theta - \beta\delta} + K_1\eta^{\alpha_1} + K_2\eta^{\alpha_2},$$
where $\alpha_1$ and $\alpha_2$ are the two solutions of the quadratic characteristic equation (see Section 2.7 for its derivation in a similar context) and $K_1$, $K_2$ are constants of integration. These two constants, and the critical levels of the $\eta(t)$
    process that trigger hiring and firing, can be determined by inserting the v(·)
    function in the two first-order and two smooth-pasting conditions that must
    be satisfied at all times when the firm is hiring or firing. (See Section 2.7 for a
definition and interpretation of the smooth-pasting conditions, and Bentolila and Bertola (1990) for further and more detailed derivations and numerical
    solutions.)
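For readers who want to see the two exponents $\alpha_1$ and $\alpha_2$ explicitly, substituting $v = \eta^{\alpha}$ into the homogeneous part of the differential equation above gives the quadratic $\tfrac{1}{2}\sigma^2\alpha(\alpha-1) + (\theta+\beta\delta)\alpha - (r+\delta) = 0$; the sketch below (with arbitrary assumed parameters) solves it numerically:

```python
import math

# Roots of the characteristic equation implied by the ODE for v(eta):
#   0.5*sigma^2*alpha*(alpha - 1) + (theta + beta*delta)*alpha - (r + delta) = 0
r, delta, theta, beta, sigma = 0.05, 0.1, 0.01, 0.5, 0.2    # assumed parameters
mu = theta + beta * delta                                   # drift of eta

a = 0.5 * sigma ** 2
b = mu - 0.5 * sigma ** 2
c = -(r + delta)
disc = math.sqrt(b * b - 4 * a * c)
alpha1, alpha2 = (-b + disc) / (2 * a), (-b - disc) / (2 * a)
print(round(alpha1, 3), round(alpha2, 3))   # one positive and one negative root, as usual
```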
    Solution to exercise 25
    It is again useful to consider the case where r = 0, so that (3.16) holds: if H =
    −F , and thus H + F = 0, then wages and marginal productivity are equal in
    every period, and the optimal hiring and firing policies of the firm coincide
    with those that are valid if there are no adjustment costs. The combination
    of firing costs and identical hiring subsidies does have an effect when r > 0.
    Using the condition H + F = 0 in (3.9), we find that the marginal productiv-
    ity of labor in each period is set equal to w + r H/(1 + r ) = w − r F /(1 + r ).
    Intuitively, the moment a firm hires a worker, it deducts r H/(1 + r ) from the
    flow wage, which is equivalent to the return if it invests the subsidy H in an
    alternative asset, and which the firm needs to pay if it decides to fire the worker
    at some future time.
If H + F < 0, then turnover generates income rather than costs, and the
optimal solution will degenerate: a firm can earn infinite profits by hiring and
firing infinite amounts of labor in each period.

Solution to exercise 26

Specializing equation (3.15) to the case proposed, we obtain
\[
\tfrac{1}{2}\left( f(Z_g) + \beta(N_g) + f(Z_b) + \beta(N_b) \right) = w,
\]
or, alternatively,
\[
\tfrac{1}{2}\left( \beta(N_g) + \beta(N_b) \right) = w - \tfrac{1}{2}\left( f(Z_g) + f(Z_b) \right).
\]
The term on the right does not depend on N_g and N_b, and hence is inde-
pendent of the magnitude of the employment fluctuations (which in turn are
determined by the optimal choices of the firm in the presence of hiring and
firing costs). We can therefore write
\[
E[\beta(N)] = \text{constant} = \beta(E[N]) + \xi,
\]
where, by Jensen's inequality, ξ is positive if β(·) is a convex function, and
negative if β(·) is a concave function. In both cases ξ is larger the more N
varies. Combining the last two equations to find the expected value of
employment, we have
\[
E[N] = \beta^{-1}\left( w - \tfrac{1}{2}\left( f(Z_g) + f(Z_b) + 2\xi \right) \right),
\]
where β^{-1}(·), the inverse of β(·), is decreasing. We can therefore conclude
that, if β(·) is a convex function, the less pronounced variation of employment
when hiring and firing costs are larger is associated with a lower average
employment level. The reverse is true if β(·) is concave.

Solution to exercise 27

Since we are not interested in the effects of H, we assume that H = 0. The
optimality conditions
\[
Z_g - \beta N_g = w + p\,\frac{F}{1+r}, \qquad
Z_b - \gamma N_b = w - (r+p)\,\frac{F}{1+r},
\]
imply
\[
N_g = \frac{1}{\beta}\left( Z_g - w - p\,\frac{F}{1+r} \right), \qquad
N_b = \frac{1}{\gamma}\left( Z_b - w + (r+p)\,\frac{F}{1+r} \right),
\]
and thus
\[
\begin{aligned}
\frac{N_g + N_b}{2}
&= \frac{1}{2\beta\gamma}\left[ \gamma\left( Z_g - w - p\,\frac{F}{1+r} \right)
   + \beta\left( Z_b - w + (r+p)\,\frac{F}{1+r} \right) \right] \\
&= \frac{\gamma Z_g + \beta Z_b - (\gamma+\beta)w}{2\beta\gamma}
   + \frac{\beta-\gamma}{2\beta\gamma}\,\frac{pF}{1+r}
   + \frac{\beta}{2\beta\gamma}\,\frac{rF}{1+r}.
\end{aligned}
\]
The first term on the right-hand side of the last expression denotes the average
employment level if F = 0; the effect of F > 0 is positive in the last term if
r > 0, but since β < γ the second term is negative. As we saw in exercise 21,
the limit case with γ = 0 is not well defined unless the exogenous variables
satisfy a certain condition. It is therefore not possible to analyze the effects of
a variation of γ that is not associated with variations in other parameters.

Solution to exercise 28

In (3.17), p determines the speed of convergence of the current value of P to
its long-run value. If p = 0, there is no convergence. (In fact, the initial
conditions remain valid indefinitely.) Writing
\[
P_{t+1} = p + (1 - 2p)P_t,
\]
we see that the initial distribution is completely irrelevant if p = 0.5; the
probability distribution of each firm is immediately equal to P_∞, and also the
frequency distribution of a large group of firms converges immediately to its
long-run stable equivalent.

Solution to exercise 29

As in the symmetric case, we consider the variation of the proportion P of
firms in state F:
\[
P_{t+1} - P_t = p(1 - P_t) - qP_t = p - (q+p)P_t
= p\left( 1 - \frac{q+p}{p}P_t \right).
\]
This expression is positive if P_t < p/(q+p), negative if P_t > p/(q+p), and
    zero if Pt corresponds to P∞ = p/(q + p), the stable proportion of firms in
    state F . Intuitively, if p > q (if the entry rate into the strong state is higher
    than the exit rate out of this state), then in the long run the strong state is
    more likely than the weak state.
    Solution to exercise 30
(a) Marginal productivity of labor is
\[
\frac{\partial F(k, l; \alpha)}{\partial l} = \alpha - \beta l.
\]
When α = 4 and the firm is hiring, employment is the solution x of
\[
4 - \beta x = 1 + \frac{pF}{1+r};
\]
therefore, with r = F = 1 and p = 0.5, the solution is 11/(4β) = 2.75/β.
When α = 2 and the firm fires, it employs x such that
\[
2 - \beta x = 1 - \frac{(p+r)F}{1+r},
\]
so employment is 7/(4β) = 1.75/β.
(b) Employment is not affected by capital adjustment for this production
function because it is separable; i.e., the marginal product of (and
demand for) one factor does not depend on the level of the other. The
marginal product of capital is α − γk, so setting it equal to r + δ = 2
yields
\[
4 - \gamma k = 2 \;\Rightarrow\; k = 2/\gamma
\]
when α_t = 4, and
\[
2 - \gamma k = 2 \;\Rightarrow\; k = 0
\]
when α_t = 2.
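A quick arithmetic check of these numbers (β = γ = 1 is an arbitrary normalization used only in this sketch):

    # Sketch: verify the employment and capital levels of exercise 30
    # under the arbitrary normalization beta = gamma = 1.
    r, F, p, delta = 1.0, 1.0, 0.5, 1.0
    beta = gamma = 1.0

    N_hire = (4 - (1 + p * F / (1 + r))) / beta        # alpha = 4, hiring margin
    N_fire = (2 - (1 - (p + r) * F / (1 + r))) / beta  # alpha = 2, firing margin
    k_high = (4 - (r + delta)) / gamma                 # marginal product of capital = 2
    k_low = max((2 - (r + delta)) / gamma, 0.0)

    print(N_hire, N_fire)  # 2.75 and 1.75, i.e. 11/(4*beta) and 7/(4*beta)
    print(k_high, k_low)   # 2.0 and 0.0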

    Solution to exercise 31
(a) The optimality conditions of the firm, analogous to (3.9), are
\[
Z_g - \beta N_g = w_g + p\,\frac{F}{1+r} + (r+p)\,\frac{H}{1+r},
\qquad
Z_b - \beta N_b = w_b - (r+p)\,\frac{F}{1+r} - p\,\frac{H}{1+r},
\]
from which we obtain
\[
N_g = \left( Z_g - w_g - \frac{pF + (r+p)H}{1+r} \right)\frac{1}{\beta},
\qquad
N_b = \left( Z_b - w_b + \frac{(r+p)F + pH}{1+r} \right)\frac{1}{\beta}.
\]
(b) We know that workers are indifferent between moving and staying if
(3.24) holds, that is if w_g − w_b = κ(2p + r)/(1 + r). Hence, the given
wage differential is an equilibrium phenomenon if the mobility costs
for workers are equal to
\[
\kappa = \frac{1+r}{2p+r}\,\Delta w.
\]
(c) Given that Δw = κ(2p + r)/(1 + r), and that w_g = w_b + Δw, we have
\[
N_g = \left( Z_g - w_b - \kappa\,\frac{2p+r}{1+r} - \frac{pF + (r+p)H}{1+r} \right)\frac{1}{\beta},
\]
and the full-employment condition 50 N_g + 50 N_b = 1000 can therefore
be written
\[
50\left( Z_g - w_b - \kappa\,\frac{2p+r}{1+r} - \frac{pF + (r+p)H}{1+r} \right)\frac{1}{\beta}
+ 50\left( Z_b - w_b + \frac{(r+p)F + pH}{1+r} \right)\frac{1}{\beta} = 1000.
\]
Hence the wage rate needs to be
\[
w_b = \frac{1}{2}(Z_g + Z_b) - 10\beta
- \frac{1}{2}\,\frac{\kappa(2p+r) + (H-F)r}{1+r}.
\]

    Solution to exercise 32
Denote the optimal employment levels by N_b and N_g. Noting that λ(Z_g, N_g) =
H and λ(Z_b, N_b) = −F, the dynamic optimality conditions are given by
\[
H = \mu(Z_g, N_g) - \bar{w} + \frac{1}{1+r}\,\frac{H - F + \lambda(M,G)}{3},
\qquad
-F = \mu(Z_b, N_b) - \bar{w} + \frac{1}{1+r}\,\frac{H - F + \lambda(M,B)}{3}.
\]
In both cases the shadow value of labor is equal to the current marginal cash
flow plus the expected discounted shadow value in the next period. The latter
is equal to H or to −F in the two cases in which the firm decides to hire or fire
workers; and it will be equal to λ(·) such that it is optimal not to react if labor
demand in the next period takes the mean value. To characterize this shadow
value, consider that if Z_{t+1} = Z_M—so that inactivity is effectively optimal—
then the shadow value λ(M,G) satisfies
\[
\lambda(M,G) = \mu(Z_M, N_g) - \bar{w} + \frac{1}{1+r}\,\frac{H - F + \lambda(M,G)}{3}
\]
if the last action of the firm was to hire a worker, while the shadow value λ(M,B)
satisfies
\[
\lambda(M,B) = \mu(Z_M, N_b) - \bar{w} + \frac{1}{1+r}\,\frac{H - F + \lambda(M,B)}{3}
\]
if the last action of the firm was to fire workers. The last four equations can
be solved for N_g, N_b, λ(M,G), and λ(M,B). Under the hypothesis that μ(Z, N)
is linear, we obtain
\[
N_b = \frac{1}{\beta}\left( Z_b - \bar{w} + F + \frac{1}{1+r}\,\frac{Z_M - Z_b + H - 2F}{3} \right),
\qquad
N_g = \frac{1}{\beta}\left( Z_g - \bar{w} - H + \frac{1}{1+r}\,\frac{Z_M - Z_g + 2H - F}{3} \right),
\]
and the solutions for the two shadow values, which need to satisfy
−F < λ(M,B) < H and −F < λ(M,G) < H if, as we assumed, the parameters are
such that it is optimal for the firm not to react if the realization of labor
demand is at the intermediate value.

Solution to exercise 33

Since k̇ = s f(k) − δk,
\[
\frac{\dot{Y}}{Y} = \frac{f'(k)\,\dot{k}}{f(k)} = f'(k)\left( s - \delta\,\frac{k}{f(k)} \right).
\]
The condition lim_{k→∞} f'(k) > 0 is no longer sufficient to allow a positive
growth rate: also, the limit of the second term, which defines the proportional
growth rate of output, needs to be strictly positive. This is the case if
δ lim_{k→∞}(k/f(k)) < s. If both capital and output grow indefinitely, the limit
required is a ratio between two infinitely large quantities. Provided that the
limit is well defined, it can be calculated, by l'Hôpital's rule, as the ratio of the
limits of the numerator's derivative—which is unity—and of the denominator's
derivative—which is f'(k), and tends to b. Hence, for positive growth in the
limit it is necessary that
\[
\lim_{k\to\infty} f'(k) = b > \frac{\delta}{s} > 0.
\]
When a fraction s of income is saved and capital depreciates at rate δ, we get
\[
\lim_{k\to\infty} \frac{\dot{Y}}{Y} = b\left( s - \frac{\delta}{b} \right) = bs - \delta.
\]
    Solution to exercise 34
If λ ≤ 0, capital and labor cannot be substituted easily: no output can be
produced without an input of L. In fact, the equation that defines factor
combinations yielding a given output level,
\[
\bar{Y} = \left( \alpha K^{\lambda} + (1-\alpha)L^{\lambda} \right)^{1/\lambda},
\]
allows Ȳ > 0 for L = 0 only if λ > 0. In that case, the accumulation of capital
can sustain indefinite growth of the economy: the non-accumulated factor L
may substitute capital, but output can continue to grow even if the ratio L/K
tends to zero.
These particular examples both assume that δ = g = 0, and we know
already that indefinite growth is feasible if the marginal product of capital has
a strictly positive limit. If λ = 1, the production function is linear, i.e.
\[
F(K, L) = \alpha K + (1-\alpha)L, \qquad f(k) = \alpha k + (1-\alpha),
\]
and the requested growth rates are
\[
\frac{\dot{y}}{y} = \frac{\alpha \dot{k}}{y} = \alpha s, \qquad
\frac{\dot{k}}{k} = s\,\frac{\alpha k + (1-\alpha)}{k} = s\alpha + \frac{s(1-\alpha)}{k}.
\]
The growth rate of output equals αs, which is constant if agents save a
constant fraction s of income. Capital, on the other hand, grows at a decreasing
rate which approaches the same value αs only asymptotically.
The case in which α = 1 is even simpler: since y = k, the growth rate of both
capital and output is always equal to s.
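The asymptotic behaviour described above is easy to visualize numerically; α and s below are arbitrary illustrative values.

    # Sketch for exercise 34: with f(k) = alpha*k + (1 - alpha) and saving rate s,
    # output grows at the constant rate alpha*s while capital's growth rate
    # declines towards alpha*s. Parameter values are arbitrary.
    alpha, s, dt = 0.5, 0.2, 0.01
    k = 1.0
    for step in range(1, 100_001):
        f = alpha * k + (1 - alpha)
        if step in (1, 1_000, 10_000, 100_000):
            print(round(step * dt, 2), s * f / k)  # k_dot/k, falling towards 0.1
        k += s * f * dt
    print("output growth alpha*s =", alpha * s)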

    Solution to exercise 35
    As in the main text, we continue to assume that the welfare of an individual
    depends on per capita consumption, c (t ) ≡ C (t )/N(t ). However, when the
    population grows at rate g N we need to consider the welfare of a representative
    household rather than that of a representative individual. If welfare is given by
    the sum of the utility function of the N(t ) = N(0)e g N t individuals alive at date
    t , objective function (4.10) becomes
\[
U' = \int_0^{\infty} u(c(t))\,N(t)\,e^{-\rho t}\, dt
   = \int_0^{\infty} u(c(t))\,N(0)\,e^{-\rho' t}\, dt,
\]
where ρ′ ≡ ρ − g_N: a higher growth rate of the population reduces the impatience
of the representative agent. With g_A = 0, and normalizing A(t) =
A(0) = 1, the law of motion for per capita capital k(t) is
\[
\frac{d}{dt}\,\frac{K(t)}{N(t)}
= \frac{Y(t) - C(t) - \delta K(t)}{N(t)} - \frac{K(t)\dot{N}(t)}{N(t)^2}
= f(k(t)) - c(t) - (\delta + g_N)k(t).
\]
The first-order conditions are derived from the Hamiltonian
\[
H(t) = \left[ u(c(t)) + \lambda(t)\left( f(k(t)) - c(t) - (\delta + g_N)k(t) \right) \right] e^{-\rho' t}.
\]
Using similar techniques as in the main text, we obtain
\[
\dot{c} = \left( \frac{u'(c)}{-u''(c)} \right)\left( f'(k) - (\delta + g_N) - \rho' \right)
        = \left( \frac{u'(c)}{-u''(c)} \right)\left( f'(k) - \delta - \rho \right).
\]
The dynamics of the system are similar to those studied in the main text, and
tend to a steady state where
\[
f'(k_{ss}) = \rho + \delta, \qquad 0 = f(k_{ss}) - c_{ss} - (\delta + g_N)k_{ss}.
\]
The capital stock does not maximize per capita consumption in the steady
state: in each possible steady state k̇ = 0 needs to be satisfied; that is,
\[
c_{ss} = f(k_{ss}) - (\delta + g_N)k_{ss}.
\]
The second derivative of the right-hand expression is f''(·) < 0. The maximum
of steady-state per capita consumption is therefore obtained at a value k∗ at
which the first derivative is equal to zero, so that f'(k∗) = δ + g_N. Hence
f'(k_{ss}) > f'(k∗) if g_N < ρ, which is a necessary condition to have ρ′ > 0 and
to have a well defined optimization problem. From this, and from
the fact that f''(·) < 0, we have k∗ > k_{ss}. The economy evolves not toward
the capital stock that maximizes per capita consumption (the so-called golden
rule), but to a steady state with a lower consumption level. In fact, given that
the economy needs an indefinite time period to reach the steady state, it would
make sense to maximize consumption only if ρ′ were equal to zero, that is, if
    a delay of consumption to the future were not costly in itself. On the other
    hand, when agents have a positive rate of time preference, which is needed
    for the problem to be meaningful, then the optimal path is characterized by a
    higher level of consumption in the immediate future and a convergence to a
steady state with k_{ss} < k∗.

Solution to exercise 36

Denote the length of a period by Δt (which was normalized to one in Chapter 1),
and refer to time via a subscript rather than an argument between
parentheses: let r_t denote the interest rate per time period (for instance on an
annual basis) valid in the period between t and t + Δt; moreover, let y_t and
c_t denote the flows of income and consumption in the same period, again
measured on an annual basis. Finally let A_t be the wealth at the beginning of
the period [t, t + Δt]. Hence, we have the discrete-time budget constraint
\[
A_{t+\Delta t} = \left( 1 + \frac{r_t \Delta t}{n} \right)^n A_t + (y_t - c_t)\Delta t.
\]
Interest payments are made in each of the n subperiods of Δt: in each
subperiod of length Δt/n, an amount r_t Δt/n of interest is received which
immediately starts to earn interest. If n tends to infinity,
\[
\lim_{n\to\infty} \left( 1 + \frac{r_t \Delta t}{n} \right)^n = e^{r_t \Delta t}.
\]
Therefore
\[
A_{t+\Delta t} = e^{r_t \Delta t} A_t + (y_t - c_t)\Delta t.
\]
Rewriting the first-order condition in discrete time, denoting the length of the
discrete period by Δt > 0, we have
\[
u'(c_t) = \left( \frac{1+r}{1+\rho} \right)^{\Delta t} u'(c_{t+\Delta t}).
\]
Recognizing that (1 + r)^{Δt} ≈ e^{rΔt} and (1 + ρ)^{Δt} ≈ e^{ρΔt}, and imposing
s = t + Δt, we get
\[
u'(c_t) = e^{r(s-t)}\, e^{-\rho(s-t)}\, u'(c_s).
\]
We can rewrite this expression as
\[
\frac{u'(c_t)}{e^{-\rho(s-t)}u'(c_s)} = e^{r(s-t)},
\]
which equates the marginal rate of substitution, the left-hand side of the
expression, to the marginal rate of transformation between the resources available
    at times t and s . Isolating any two periods, we obtain the familiar conditions
    for the optimality of consumption and savings, that is the equality between the
    slope of the indifference curve and of the budget restriction. In continuous
    time, this condition needs to be satisfied for any t and s : hence, along the
    optimal consumption path we have (differentiating with respect to s )
\[
-\frac{u'(c_t)}{\left( e^{-\rho(s-t)}u'(c_s) \right)^2}
\left( e^{-\rho(s-t)}\frac{du'(c_s)}{ds} - \rho e^{-\rho(s-t)}u'(c_s) \right)
= r\, e^{r(s-t)}.
\]
In the limit, with s → t, we get
\[
-\frac{1}{u'(c_t)}\left( \frac{du'(c_t)}{dt} - \rho u'(c_t) \right) = r,
\]
or, equivalently,
\[
-\frac{du'(c_t)}{dt} = (r - \rho)\,u'(c_t).
\]
Given that the marginal utility of consumption u'(c_t) equals the shadow value
of wealth λ_t, this relation corresponds to the Hamiltonian conditions for
dynamic optimality. Differentiating with respect to Δt and letting Δt tend
to 0, we get
\[
\frac{dc_t}{dt} = \left( -\frac{u'(c_t)}{u''(c_t)} \right)(r - \rho).
\]
    In the presence of a variation of the interest rate r (or, more precisely, in the
differential r − ρ), the consumer changes the intertemporal path of her con-
    sumption by an amount equal to the (positive) quantity in large parentheses:
    this is the reciprocal of the well-known Arrow–Pratt measure of absolute risk
    aversion. As we noted in Chapter 1, the more concave the utility function,
    the less willing the consumer will be to alter the intertemporal pattern of
    consumption. With regard to the cumulative budget constraint, we can write
\[
\frac{A_{t+\Delta t} - A_t}{\Delta t} = \frac{e^{r_t \Delta t} - 1}{\Delta t}\, A_t + (y_t - c_t)
\]
and evaluate the limit of this expression for Δt → 0:
\[
\lim_{\Delta t \to 0} \frac{A_{t+\Delta t} - A_t}{\Delta t}
= \lim_{\Delta t \to 0} \frac{e^{r_t \Delta t} - 1}{\Delta t}\, A_t + (y_t - c_t).
\]
    On the left we have the definition of the derivative of At with respect to time.
    Since both the denominator and the numerator in the first term on the right
are zero at Δt = 0, we need to apply l'Hôpital's rule to evaluate this limit. This
gives
\[
\frac{d}{dt}A_t = \lim_{\Delta t \to 0} \frac{r_t e^{r_t \Delta t}}{1}\, A_t + (y_t - c_t)
\]

    or, in the notation in continuous time adopted in this chapter,
    Ȧ(t ) = r (t ) A(t ) + y(t ) − c (t ),
    which is a constraint, in flow terms, that needs to be satisfied for each t . This
    law of motion for wealth relates A(t ), r (t ), c (t ), y(t ) which are all functions
    of the continuous variable t . The summation of (??) obviously corresponds
    to an integral in continuous time. Suppose for simplicity that the interest
    rate is constant, i.e. r (t ) = r for each t , and multiply both terms in the above
    expression by e −r t ; we then get
    e −r t Ȧ(t ) − r e −r t A(t ) = e −r t ( y(t ) − c (t )).
    Since the term on the left-hand side is the derivative of the product of e −r t and
    A(t ), we can write
\[
\frac{d}{dt}\left( e^{-rt}A(t) \right) = e^{-rt}\left( y(t) - c(t) \right).
\]
It is therefore easy to evaluate the integral of the term on the left:
\[
\int_0^T \frac{d}{dt}\left( e^{-rt}A(t) \right) dt
= \left[ e^{-rt}A(t) \right]_0^T = e^{-rT}A(T) - A(0).
\]
Equating this to the integral of the term on the right, we get
\[
e^{-rT}A(T) = A(0) + \int_0^T e^{-rt}\left( y(t) - c(t) \right) dt. \tag{5.A1}
\]
If we let T tend to infinity, and if we impose the continuous-time version of
the no-Ponzi-game condition (1.3), i.e.
\[
\lim_{T\to\infty} e^{-rT}A(T) = 0,
\]
we finally arrive at the budget condition for an infinitely lived consumer who
takes consumption and savings decisions in each infinitesimally small time
period:
\[
\int_0^{\infty} e^{-rt}c(t)\, dt = A(0) + \int_0^{\infty} e^{-rt}y(t)\, dt.
\]
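A trivial numerical check of the compounding limit used at the start of this solution (the values of r and Δt are arbitrary):

    # Sketch: (1 + r*dt/n)**n approaches e**(r*dt) as the number of subperiods grows.
    import math

    r, dt = 0.05, 1.0
    for n in (1, 4, 12, 365, 100_000):
        print(n, (1 + r * dt / n) ** n)
    print("limit:", math.exp(r * dt))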
    Solution to exercise 37
    If K̇ /K = Ȧ/ A + Ṅ/N = L̇ /L , then k ≡ K /L is constant. The rate r at which
    capital is remunerated is given by
\[
\frac{\partial F(K, L)}{\partial K} = \frac{\partial\left[ L\,F(K/L, 1) \right]}{\partial K} = f'(K/L),
\]
    and is constant if K and L grow at the same rate. Moreover, because of
    constant returns to scale, production grows at the same rate as K (and L ), and

    the income share of capital r K /Y is thus constant along a balanced growth
    path, even if the production function is not Cobb–Douglas.
    Solution to exercise 38
The production function
\[
F(K, L) = \left( \alpha K^{\lambda} + (1-\alpha)L^{\lambda} \right)^{1/\lambda}
\]
exhibits constant returns to scale and the marginal productivity of capital has
a strictly positive limit if λ > 0, as we saw on page 150. The income share of
labor L is given by
\[
\begin{aligned}
\frac{\partial F(K, L)}{\partial L}\,\frac{L}{F(K, L)}
&= \frac{\left[ \alpha K^{\lambda} + (1-\alpha)L^{\lambda} \right]^{(1-\lambda)/\lambda}(1-\alpha)L^{\lambda-1}\,L}
       {\left[ \alpha K^{\lambda} + (1-\alpha)L^{\lambda} \right]^{1/\lambda}} \\
&= \left[ \alpha K^{\lambda} + (1-\alpha)L^{\lambda} \right]^{-1}(1-\alpha)L^{\lambda} \\
&= \left[ \alpha\left( \frac{K}{L} \right)^{\lambda} + (1-\alpha) \right]^{-1}(1-\alpha), \tag{5.A2}
\end{aligned}
\]
which tends to zero with the growth of K/L if λ > 0.
    Solution to exercise 39
In terms of actual parameters, the Solow residual may be expressed as
\[
\frac{\dot{A}}{A} + \mu\alpha\left( \frac{\dot{N}}{N} - \frac{\dot{K}}{K} \right)
+ (\alpha + \beta - 1)\frac{\dot{K}}{K}.
\]
    This measure may therefore be an overestimate or an underestimate of “true”
    technological progress.
    Solution to exercise 40
The return on savings and investments is
\[
r = \frac{\partial F(K, L)}{\partial K} = \alpha K^{\alpha-1}L^{1-\alpha}.
\]
Hence, recognizing that A = aK/N, so that L = NA = aK,
\[
r = \alpha K^{\alpha-1}K^{1-\alpha}a^{1-\alpha} = \alpha a^{1-\alpha},
\]
which does not depend on K and thus remains constant during the process
of accumulation. If this r is above the discount rate of utility ρ, the rate of
aggregate consumption growth is
\[
\frac{\dot{C}}{C} = \frac{\alpha a^{1-\alpha} - \rho}{\sigma},
\]
where, as usual, σ denotes the elasticity of marginal utility. Since A(·)N/K is
constant and production,
\[
F(K, L) = K^{\alpha}N^{1-\alpha}A^{1-\alpha}
= K^{\alpha}N^{1-\alpha}(aK/N)^{1-\alpha} = K a^{1-\alpha},
\]
is proportional to K, the economy moves immediately (and not just in the
limit) to a balanced growth path.
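For a concrete sense of magnitudes, the constant return and the implied consumption growth rate can be computed directly; α, a, ρ, and σ below are arbitrary assumptions.

    # Sketch for exercise 40: with L = a*K, the return alpha*a**(1-alpha) is
    # independent of K and consumption grows at (r - rho)/sigma when r > rho.
    alpha, a, rho, sigma = 0.3, 0.2, 0.02, 2.0
    r = alpha * a ** (1 - alpha)
    print("return on capital:", r)
    print("consumption growth:", (r - rho) / sigma if r > rho else 0.0)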
    Solution to exercise 41
Since the production function needs to have constant returns to K and L,
it must be the case that β = 1 − α; moreover, since the returns need to be
constant with respect to K and G, we need to have γ = 1 − α. Hence, writing
\[
\tilde{F}(K, L, G) = K^{\alpha}L^{1-\alpha}G^{1-\alpha},
\]
and substituting fiscal policy parameters from (4.34) we get
\[
G = \tau K^{\alpha}L^{1-\alpha}G^{1-\alpha}
\;\Rightarrow\;
G = \left( \tau K^{\alpha}L^{1-\alpha} \right)^{1/\alpha} = \left( \tau L^{1-\alpha} \right)^{1/\alpha}K.
\]
Given that G and K are proportional, the net return on private savings is
constant:
\[
(1-\tau)\,\frac{\partial \tilde{F}(K, L, G)}{\partial K}
= (1-\tau)\,\alpha K^{\alpha-1}L^{1-\alpha}G^{1-\alpha}
= (1-\tau)\,\alpha\left( \frac{G}{K} \right)^{1-\alpha}L^{1-\alpha}
= (1-\tau)\,\alpha\left( \tau L^{1-\alpha} \right)^{(1-\alpha)/\alpha}L^{1-\alpha}.
\]
    The growth rate of consumption, which can be obtained by substituting the
    above expression into (4.35), and that of capital and aggregate production are
    also constant.
    Solution to exercise 42
(a) We know that along the optimum path of consumption the following
Euler condition holds:
\[
-u''(C)\dot{C} = \left( F'(K) - \rho \right)u'(C),
\]
which is necessary and sufficient if u''(C) < 0 and F''(K) ≤ 0. These
regularity conditions are satisfied if, respectively, C < β and K < α. In this
case the derivatives are given by
\[
u'(C) = \beta - C, \qquad u''(C) = -1, \qquad F'(K) = \alpha - K,
\]
and we can write
\[
\dot{C} = (\alpha - K - \rho)(\beta - C).
\]

(b) In the steady state, (Ċ = 0) ⟺ ((α − K_{ss} − ρ)(β − C_{ss}) = 0). If C_{ss} < β
and K_{ss} < α then necessarily K_{ss} = α − ρ (or, as is usual, F'(K_{ss}) = ρ).
Since (K̇ = 0) ⟺ (y_{ss} = F(K_{ss}) = C_{ss}), we have
\[
Y_{ss} = C_{ss} = F(K_{ss}) = \alpha(\alpha - \rho) - \tfrac{1}{2}(\alpha - \rho)^2
= \tfrac{1}{2}\left( \alpha^2 - \rho^2 \right).
\]
For all this to be valid, the parameters need to be such that K_{ss} < α, which
is true if ρ > 0, and C_{ss} < β, which in turn requires (α² − ρ²) < 2β.
In the diagram, optimal consumption can never be in the region where
C > β, since this would provide the same flow utility as C = β. If
K > α, it is optimal to consume the surplus as soon as possible, given
that production is independent of K in this region. Hence, flow
consumption needs to be set at C = β, the level that yields maximum
flow utility. If
that implies that K̇ < 0, then the system moves to the region studied
above. But if the parameters do not satisfy the above conditions, then
consumption may remain at β with capital above α forever. In this
case the maximization problem does not have economic significance.
(There is no scarcity.)

(c) Writing
\[
\dot{C} = (\alpha - K - \rho)\beta - (\alpha - K - \rho)C,
\]
we see that β determines the speed of convergence towards the steady
state for given C and K, that is (so to speak) the strength of the vertical
arrows drawn in the phase diagram, and the slope of the saddlepath.

(d) If returns to scale were decreasing in the only production factor, then,
setting F'(K) = r (as in a competitive economy), total income rK
would be less than production, F(K). Thus, an additional factor must
implicitly be present, and must earn income F(K) − F'(K)K. For the
functional form proposed in this exercise, we have
\[
F(\lambda K, \lambda L) = \alpha\lambda K - g(\lambda L)\lambda^2 K^2, \qquad
\lambda F(K, L) = \alpha\lambda K - g(L)\lambda K^2,
\]
hence returns to scale are constant if g(λL)λ = g(L), i.e. if g(x) = μ/x
for μ a constant (larger than zero, to ensure that L has positive productivity).
Setting μ = 1 and L = 2, production depends on capital according
to the functional form proposed in the exercise, and the solution
can be interpreted as the optimal path followed by a competitive market
economy.

Solution to exercise 43

(a) For the production function proposed,
\[
\dot{Y}(t) = \left( \frac{1}{L + K(t)} \right)\dot{K}(t)
           = \left( \frac{1}{L + K(t)} \right)s\,Y(t),
\]
and the proportional growth rate of income tends to s/L > 0. Since
    consumption is proportional to income, consumption can also grow
    without limit.
(b) The returns to scale of this production function are non-constant:
\[
\ln(\lambda L + \lambda K) = \ln\lambda + \ln(L + K) \neq \lambda\ln(L + K).
\]
If both factors were compensated according to their marginal productivity,
total costs would be equal to
\[
\left( \frac{1}{L+K} \right)L + \left( \frac{1}{L+K} \right)K = 1,
\]

while the value of output may be above one (in which case there will
be pure profits) or below one (in which case profits are negative if
L + K < 1). Hence, this function is inadequate to represent an economy
in which output decisions are decentralized to competitive firms.

Solution to exercise 44

(a) The returns to scale are constant. Each unit of L earns a flow income
\[
w(t) = \frac{\partial Y(t)}{\partial L} = 1 + (1-\alpha)\left( \frac{K(t)}{L} \right)^{\alpha},
\]
and each unit of K earns
\[
r(t) = \frac{\partial Y(t)}{\partial K(t)} = \alpha\left( \frac{K(t)}{L} \right)^{\alpha-1}.
\]

(b) From the optimality conditions associated with the Hamiltonian, we
obtain
\[
\frac{\dot{C}(t)}{C(t)} = \frac{r(t) - \rho}{\sigma}.
\]
Hence, if consumers have the same constant elasticity utility function,
the growth rate will not depend on the distribution of consumption
levels. Moreover, the growth rate increases with the difference between
the interest rate and the rate of time preference and is higher if agents
are more inclined to intertemporal substitution (a low σ).

(c) Production starts from L for K = 0, is an increasing and concave function
of K, and coincides with the locus along which K̇ = 0. The locus
where Ċ = 0 is vertical above K_{ss}, such that
\[
r = f'(K_{ss}) = \rho \;\Rightarrow\; K_{ss} = \left( \frac{\alpha}{\rho} \right)^{1/(1-\alpha)}L.
\]
The saddlepath converges in the usual way to the steady state, where
\[
C_{ss} = L + L^{1-\alpha}K_{ss}^{\alpha} = L + \left( \frac{\alpha}{\rho} \right)^{\alpha/(1-\alpha)}L.
\]

(d) The return on investments is constant and equal to one, and so aggregate
consumption grows at a constant rate. However, the income share
of capital is growing and approaches one asymptotically. Except in
the long run, when labor's income share is zero, the growth rate of
production is therefore not constant and we do not have a balanced
growth path.

Solution to exercise 45

(a) Calculating the total derivative, and using K̇ = sY and the equation
for Ȧ, we get
\[
\frac{\dot{Y}}{Y} = \frac{\dot{A}}{A} + \alpha\frac{\dot{K}}{K}
= L - L_Y + \alpha s\,\frac{Y}{K}.
\]
Hence, when the growth rates are constant, Y/K needs to be constant
and
\[
\frac{\dot{Y}}{Y} = \frac{\dot{K}}{K} = \frac{L - L_Y}{1 - \alpha}.
\]

(b) The growth rate of the economy does not depend on s (which determines
Y/K) but is instead endogenously determined by the allocation
of resources to the sector in which A can be reproduced with constant
returns to scale. A can be interpreted as a stock of knowledge (or
instructions), produced in a research and development sector.

(c) The sector that produces material goods has increasing returns in the
three factors; thus, no decentralized production structure could compensate
all three factors according to their marginal productivity.

Solution to exercise 46

(a)
\[
F(\lambda K, \lambda L) = \left[ (\lambda K)^{\gamma} + (\lambda L)^{\gamma} \right]^{1/\gamma}
= \left[ \lambda^{\gamma}(K^{\gamma} + L^{\gamma}) \right]^{1/\gamma}
= \lambda\left( K^{\gamma} + L^{\gamma} \right)^{1/\gamma} = \lambda F(K, L).
\]

(b)
\[
y = \frac{1}{L}F(K, L) = \left[ L^{-\gamma}(K^{\gamma} + L^{\gamma}) \right]^{1/\gamma}
= \left[ \left( \frac{K}{L} \right)^{\gamma} + 1 \right]^{1/\gamma}
= \left( k^{\gamma} + 1 \right)^{1/\gamma} \equiv f(k).
\]

(c)
\[
f'(k) = \left( k^{\gamma} + 1 \right)^{(1-\gamma)/\gamma}k^{\gamma-1}
= \left[ (k^{\gamma} + 1)k^{-\gamma} \right]^{(1-\gamma)/\gamma}
= \left( 1 + k^{-\gamma} \right)^{(1-\gamma)/\gamma}.
\]
Taking the required limit,
\[
\lim_{k\to\infty} \left( 1 + k^{-\gamma} \right)^{(1-\gamma)/\gamma}
= \left( 1 + \lim_{k\to\infty} k^{-\gamma} \right)^{(1-\gamma)/\gamma}.
\]
If γ < 0, then k^{−γ} tends to infinity and the exponent (1 − γ)/γ is
negative; thus, f'(k) tends to zero and r = f'(k) − δ tends to −δ. If
γ > 0 then k^{−γ} tends to zero, and in the limit unity is raised to the
power of (1 − γ)/γ > 0. Hence, f'(k) tends to unity, and r = f'(k) − δ
tends to 1 − δ.

    (d) The economy converges to a steady state if limk→∞ k̇(t ) = 0. That is
    (given that a constant fraction of income is dedicated to accumulation),
    the economy converges to a steady state if net output tends to zero.
    (e) For a logarithmic utility function the growth rate of consumption is
    given by the difference between the net return on savings and the
discount rate of future utility: Ċ(t)/C(t) = r(t) − ρ. In order to have
perpetual endogenous growth, this rate needs to have a positive limit
as k approaches infinity: lim_{k→∞} r(t) = 1 − δ if γ > 0; in addition,
1 − δ − ρ > 0, or equivalently ρ < 1 − δ, must hold. (Naturally, ρ needs
to be positive, otherwise the optimization problem does not have economic
significance.)

Solution to exercise 47

(a) Since capital has a constant price and does not depreciate, there does
not exist a steady state in levels: in fact, no positive value of K(t) makes
\[
\dot{K}(t) = \bar{P}\,s\,Y(t) = \bar{P}\,s\,K(t)^{\alpha}L(t)^{\beta}
\]
equal to zero. If α = 1 a balanced growth path exists, where
\[
\frac{\dot{K}(t)}{K(t)} = \frac{\dot{Y}(t)}{Y(t)} = \bar{P}\,s\,L(t)^{\beta}.
\]
The economy can be decentralized if the production function has constant
returns to scale, that is if α + β = 1.

(b) The proportional growth rate of capital is
\[
\frac{\dot{K}(t)}{K(t)} = s\,P(t)\,\frac{Y(t)}{K(t)} = s\,P(t)\,K(t)^{\alpha-1}\bar{L}^{\beta}.
\]
Hence K̇(t)/K(t) = g_k is constant if
\[
\frac{\dot{P}(t)}{P(t)} + (\alpha - 1)\frac{\dot{K}(t)}{K(t)} = 0.
\]
The balanced growth rate of the stock of capital is
\[
g_k = \frac{1}{1-\alpha}\,\frac{\dot{P}(t)}{P(t)} = \frac{h}{1-\alpha},
\]
and the constant growth rate of output is given by
\[
\frac{\dot{Y}(t)}{Y(t)} = \alpha\,\frac{\dot{K}(t)}{K(t)} = \frac{\alpha}{1-\alpha}\,h.
\]

(c) If P(t) = K(t)^{1−α}, the accumulation of capital is governed by
\[
\dot{K}(t) = K(t)^{1-\alpha}s\,Y(t) = K(t)^{1-\alpha}s\,K(t)^{\alpha}\bar{L}^{\beta}
= K(t)\,s\,\bar{L}^{\beta}.
\]
Hence K̇(t)/K(t) = s L̄^β is constant (and depends endogenously on
the savings rate, s).

(d) P(t) is the price of output (of savings) in terms of units of capital: if
P(t) increases, a given flow of savings can be used to buy more units
of capital. In part (b) this increase is exogenous and, like the dynamics
of A in the Solow model, allows for perpetual growth even in the case
of decreasing marginal returns to capital. In part (c) the price of investment
goods depends endogenously on the accumulation of capital: as
in models of learning by doing, this can be interpreted as assuming that
investment is more productive if the economy is endowed with a larger
capital stock.

Solution to exercise 48

(a) Inserting u'(c) = 1/c², u''(c) = −2/c³, and ρ = 1 in the Euler equation,
we obtain
\[
\dot{c}(t) = \frac{r - 1}{2}\,c(t).
\]
In words, the growth rate of consumption is independent of its level
(since the utility function has CRRA form).

(b) w(t) = B(t), r(t) = 3.

(c) The proportional rate of growth of capital is
\[
\frac{\dot{K}(t)}{K(t)} = \frac{B(t)}{K(t)}L + 3 - \frac{C(t)}{K(t)},
\]
and it is constant if C(t)/K(t) and B(t)/K(t) are constant and capital
grows at the same rate as consumption and B(t). In fact, if r = 3, consumption
does grow at the same rate as B(t): ċ(t)/c(t) = Ċ(t)/C(t) =
1 = Ḃ(t)/B(t). The level of consumption can be such as to ensure also
that K̇(t)/K(t) = 1:
\[
1 = \frac{B(t)}{K(t)}L + 3 - \frac{C(t)}{K(t)}
\;\Leftrightarrow\; C(t) = B(t)L + 2K(t).
\]

(d) The aggregate production function is Y(t) = (L + 3)K(t). Hence, the
return to capital is L + 3 > 3 = r for the aggregate economy, and it
would be optimal for growth to proceed at rate
\[
\frac{\dot{C}(t)}{C(t)} = \frac{L + 3 - 1}{2} = 1 + \frac{L}{2}.
\]

    Solution to exercise 49
From the third row of the table of expected utilities in the text, we can easily
see that V^p_M > V^p_C, since x < 1; then, the commodity holder is willing to
trade with a money holder. To check that the latter is also willing to trade, we
must show that U − (V^p_M − V^p_C) > 0, since after the exchange the agent
initially endowed with money enjoys utility from consumption but becomes a
commodity holder. Using the appropriate entries of the table and the definition
of K given in the text, we have
\[
U - (V^p_M - V^p_C)
= U - \frac{K}{r + \beta x}\,r(1 - x)
= U\left( 1 - \frac{\beta x(1 - M)(1 - x)}{r + \beta x} \right).
\]
    This expression is positive, because the fraction in the large parentheses is
    less than unity. Hence, a money holder is willing to exchange money for a
    commodity she can consume.
    Solution to exercise 50
Consider a discrete time interval Δt, from t = 0 to t = t₁, during which θ is
constant and therefore J̇ = V̇ = 0. Retracing the argument of Section 5.1,
suppose that, when a firm and a worker experience a separation event, the
resulting vacant job is not filled again during such an interval. (This assumption
is valid, of course, in the Δt → 0 limit.) The value of a filled job at the
beginning of the interval is thus given by
\[
J = \int_0^{t_1} e^{-st}e^{-rt}(y - w)\, dt
  + e^{-r\Delta t}\left[ e^{-s\Delta t}J + \left( 1 - e^{-s\Delta t} \right)V \right]. \tag{0.1}
\]
    The first term on the right-hand side denotes the expected production flow
    during the time interval, net of the wage paid to the worker, discounted back to
    t = 0. (e −s t represents the probability that the job is still filled and productive
    at time t .) The second term represents the (discounted) expected value of the
    job at t = t1, the end of the interval. (If a separation occurs, with probability
1 − e^{−sΔt}, the job becomes a vacancy, valued at V.) Solving the integral yields
\[
J = \frac{1}{r+s}(y - w)
  + \frac{e^{-r\Delta t}\left( 1 - e^{-s\Delta t} \right)}{1 - e^{-(r+s)\Delta t}}\,V.
\]
The limit as Δt → 0 of the second term is s/(r + s), by l'Hôpital's rule. Thus,
\[
J = \frac{1}{r+s}(y - w) + \frac{s}{r+s}\,V
\;\Rightarrow\;
rJ = (y - w) + s(V - J).
\]

    Solution to exercise 51
Totally differentiating (5.45) and (5.46) around a steady state equilibrium
point, we obtain
\[
\begin{pmatrix}
1 & -\dfrac{(r+s)c\,q'}{q^2} \\[1.5ex]
1 & -\beta c
\end{pmatrix}
\begin{pmatrix} dw \\ d\theta \end{pmatrix}
=
\begin{pmatrix}
-\dfrac{c}{q} & 1 \\[1.5ex]
0 & \beta
\end{pmatrix}
\begin{pmatrix} ds \\ dy \end{pmatrix},
\]
and we can compute the following effects of an aggregate productivity shock
(recall that q' < 0):
\[
\frac{dw}{dy} = \frac{-\beta c + \beta(r+s)c\,q'/q^2}{\Delta} > 0,
\qquad
\frac{d\theta}{dy} = \frac{\beta - 1}{\Delta} > 0,
\]
where both numerators, like the determinant
\[
\Delta = -\beta c + \frac{(r+s)c\,q'}{q^2},
\]
are negative. An aggregate productivity shock moves the equilibrium wage
in the same direction but, barring extreme cases, by a smaller amount
(0 ≤ dw/dy ≤ 1); it also induces a change in the same direction of
labor market tightness. (Lower productivity is associated with a lower
vacancy/unemployment ratio in the new steady state.) As β → 1 (and all the
matching surplus is captured by workers), different productivity levels affect
only the wage, with no effect on θ. For intermediate values of β, the effects of
    a negative productivity shock are shown in parts (a) and (b) of the figure. The
    effects on unemployment and vacancy rates are uniquely determined, since
    productivity variations do not affect the Beveridge curve and only cause the
    equilibrium point to move along it.
[Figure: panels (a) and (b) — effects of a negative aggregate productivity shock]

For the case of a reallocative shock, we have
\[
\frac{dw}{ds} = \frac{\beta c^2/q}{\Delta} < 0,
\qquad
\frac{d\theta}{ds} = \frac{c/q}{\Delta} < 0, \quad (*)
\]
since both numerators are positive. An increase in s leads to a reduction in
both the equilibrium wage and the labor market tightness through a leftward
shift of the curve JC (which is more pronounced for higher values of β).
However, at the same time the Beveridge curve shifts to the right, and the
effect of s on the number of vacancies v is therefore ambiguous, while the
unemployment rate increases with certainty—see parts (c) and (d) of the
figure. Totally differentiating the expression for the Beveridge curve and
using the result obtained above for θ, we obtain (with p' > 0 and
0 ≤ η ≡ p'θ/p ≤ 1)
\[
du = \frac{p}{(s+p)^2}\, ds - \frac{s\,p'}{(s+p)^2}\, d\theta
\;\Rightarrow\;
\frac{du}{ds} = \frac{p}{(s+p)^2}\left( 1 - \frac{\eta s}{(r+s)(\eta-1) - \beta p} \right) > 0. \quad (**)
\]
(The denominator of the last fraction is negative.) Moreover, using the
definition of θ, we can express the effect on the mass of vacancies as
\[
\frac{dv}{ds} = u\,\frac{d\theta}{ds} + \theta\,\frac{du}{ds},
\]
which, together with (*) and (**), yields
\[
\begin{aligned}
\frac{dv}{ds}
&= \frac{s}{s+p}\,\frac{\theta}{(r+s)(\eta-1) - \beta p}
 + \theta\,\frac{p}{(s+p)^2}\left( 1 - \frac{\eta s}{(r+s)(\eta-1) - \beta p} \right) \\
&= \frac{\theta}{(s+p)^2}\,\frac{s^2 + pr(\eta-1) - \beta p^2}{(r+s)(\eta-1) - \beta p}.
\end{aligned}
\]
    From these equations one can deduce that, starting from a relatively low level
    of s , a reallocative shock will lead to an increase in the mass of vacancies.
[Figure: panels (c) and (d) — effects of a reallocative shock]
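The signs derived above are easy to confirm numerically for an assumed matching technology; the Cobb–Douglas form q(θ) = θ^{−1/2} and all parameter values below are illustrative assumptions, not part of the original exercise.

    # Sketch for exercise 51: evaluate the comparative-statics expressions for an
    # assumed matching function q(theta) = theta**(-0.5). Values are illustrative.
    r, s, beta, c, theta = 0.05, 0.1, 0.5, 0.3, 1.2

    q = theta ** -0.5
    qp = -0.5 * theta ** -1.5                    # q'(theta) < 0

    Delta = -beta * c + (r + s) * c * qp / q**2  # determinant, negative
    dw_dy = (-beta * c + beta * (r + s) * c * qp / q**2) / Delta
    dtheta_dy = (beta - 1) / Delta
    dw_ds = (beta * c**2 / q) / Delta
    dtheta_ds = (c / q) / Delta

    print(Delta, dw_dy, dtheta_dy)  # Delta < 0, 0 < dw/dy < 1, dtheta/dy > 0
    print(dw_ds, dtheta_ds)         # both negative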

    Solution to exercise 52
Totally differentiating the payoff function with respect to ē yields
\[
\frac{dV(e_i(\bar{e}), \bar{e})}{d\bar{e}}
= V_1(e_i, \bar{e})\,\frac{de_i}{d\bar{e}} + V_2(e_i, \bar{e})
= V_2(e_i, \bar{e}) > 0,
\]
since V₁(·) = 0 at an optimum. (This is an application of the envelope
theorem.)
    Solution to exercise 53
There is a positive externality between the actions of agents, since
\[
\frac{\partial V^1}{\partial e_2} = \alpha\, e_1^{\alpha} e_2^{\alpha-1} > 0.
\]
The reaction function of agent 1 is obtained by maximizing this agent's
payoff:
\[
\max_{e_1} V^1(e_1, e_2)
\;\Rightarrow\;
\alpha\, e_1^{\alpha-1} e_2^{\alpha} - 1 = 0
\;\Rightarrow\;
e_1 = \alpha^{1/(1-\alpha)} e_2^{\alpha/(1-\alpha)},
\]
from which we obtain
\[
\frac{de_1}{de_2} = \frac{\alpha}{1-\alpha}\,\alpha^{1/(1-\alpha)} e_2^{(2\alpha-1)/(1-\alpha)} > 0.
\]
Hence, there is a strategic complementarity between the actions of agents.
The symmetric decentralized equilibria are obtained by combining the two
(identical) reaction functions with e₁ = e₂ = e. There are two equilibria: one
with zero activity (e = 0) and one with a positive activity level (ē = α^{1/(1−2α)}).
The cooperative equilibrium (e∗) is obtained by maximizing the payoff of the
representative agent with respect to the common activity level e:
\[
\max_e V(e, e) = e^{2\alpha} - e
\;\Rightarrow\;
2\alpha (e^*)^{2\alpha-1} - 1 = 0
\;\Rightarrow\;
e^* = (2\alpha)^{1/(1-2\alpha)}.
\]
The fact that e∗ > ē confirms that the decentralized equilibria are inefficient
in the presence of externalities.
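For a concrete illustration (α = 0.3 is an arbitrary value with α < 1/2), the two equilibria and the associated payoffs can be compared directly:

    # Sketch for exercise 53: decentralized vs. cooperative activity levels.
    alpha = 0.3                                        # arbitrary, alpha < 1/2
    e_bar = alpha ** (1 / (1 - 2 * alpha))             # decentralized equilibrium
    e_star = (2 * alpha) ** (1 / (1 - 2 * alpha))      # cooperative equilibrium
    payoff = lambda e: e ** (2 * alpha) - e            # V(e, e) = e**(2*alpha) - e
    print(e_bar, e_star, e_star > e_bar)
    print(payoff(e_bar), payoff(e_star))               # cooperative payoff is higher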
    Solution to exercise 54
(a) Given the assumptions, the expression for the dynamics of e is given by
\[
\dot{e} = (1 - e)\,a\,c^* - e^2 b,
\]
from which, setting ė = 0, an expression for the locus of stationary
points is obtained:
\[
\dot{e} = 0 \;\Rightarrow\; c^* = \frac{e^2 b}{(1-e)a}
\;\Rightarrow\;
\left.\frac{dc^*}{de}\right|_{\dot{e}=0} = \frac{2eb + ac^*}{(1-e)a} > 0,
\qquad
\left.\frac{d^2c^*}{de^2}\right|_{\dot{e}=0} > 0.
\]

    The production cost c has an upper limit equal to 1. Hence if c ∗ exceeds
    this upper limit there is no need for a further increase in e to maintain a
    constant level of employment, because for c ∗ ≥ 1 all production oppor-
    tunities are accepted. The locus of stationary points is thus vertical for
    c ∗ > 1. The dynamic expression for c ∗ is given by
\[
\dot{c}^* = r c^* - b e\,( y - c^*) + \frac{a\, c^{*2}}{2}.
\]
Assuming ċ* = 0, one obtains a (quadratic) expression in c*. This
expression is drawn in the figure, and along this curve
\[
\left.\frac{dc^*}{de}\right|_{\dot{c}^*=0} = \frac{b(y - c^*)}{a c^* + r + b e} > 0,
\qquad
\left.\frac{d^2c^*}{de^2}\right|_{\dot{c}^*=0} = -\frac{b^2(y - c^*)}{(a c^* + r + b e)^2} < 0.
\]

(b) There are two possible equilibria: E₀, in which c* = e = 0, and E₁,
which is the only equilibrium with a positive activity level. The stability
properties of E₁ can be studied by linearizing the two differential
equations around the equilibrium point (e₁, c*₁) and by determining
the sign of the determinant of the resulting matrix:
\[
\begin{pmatrix}
-(2e_1 b + a c_1^*) & (1 - e_1)a \\
-b(y - c_1^*) & a c_1^* + r + b e_1
\end{pmatrix}.
\]
If the determinant is negative, the equilibrium has the nature of a
saddlepoint. As in the general case discussed in the main text, this
occurs if in equilibrium the curve ė = 0 is steeper than the curve
ċ* = 0, as is the case in E₁.

Solution to exercise 55

(a) Restricting our attention to symmetric and stationary equilibria (in
which π = Π and V_C and V_M are constant over time), with c > 0,

    the values of expected utility from holding a commodity and holding
    money are:
\[
\begin{aligned}
V_C &= \frac{1}{1+r}\Big\{ (1-\beta)V_C
   + \beta\big[ (1-M)x^2 U + (1 - M\Pi x)V_C + M\Pi x (V_M - c) \big] \Big\} \\
V_M &= \frac{1}{1+r}\Big\{ (1-\beta)(V_M - c)
   + \beta\big[ (1-M)\Pi x (U + V_C) + \big( 1 - (1-M)\Pi x \big)(V_M - c) \big] \Big\},
\end{aligned}
\]
where c is subtracted from V_M whenever the agent ends the period with
money. Using the two equations above, we get
\[
V_C - V_M = \frac{\beta(1-M)xU(x - \Pi) + (1 - \beta\Pi x)\,c}{r + \beta\Pi x}.
\]
Setting V_C = V_M, we find the value of Π, dubbed Π^M, that makes
agents indifferent between holding commodities and money:
\[
\Pi^M = \frac{\beta(1-M)x^2 U + c}{\beta(1-M)xU + \beta x c}
      = x + \frac{(1 - \beta x^2)\,c}{\beta(1-M)xU + \beta x c} > x.
\]
To make agents indifferent, money must be more acceptable than commodities:
Π^M > x. The greater acceptability of money compensates
money holders for the storage cost they incur in the event of their
ending the period still holding money. The agents' optimal strategy and
the corresponding equilibria (shown in the figure) are then as follows:
– Π < Π^M: with V_C > V_M, agents never accept money and a non-monetary
  equilibrium arises (Π = 0).
– Π > Π^M: agents always accept money since V_C < V_M, and the resulting
  equilibrium is pure monetary (Π = 1).
– Π = Π^M: in this case V_C = V_M and agents are indifferent between
  holding commodities and money; the corresponding equilibrium is
  mixed monetary (Π = Π^M).

(b) With storage costs for money the non-monetary equilibrium always
exists, whereas the existence of the other two possible equilibria depends
on the magnitude of c. Even when money is accepted with certainty
(Π = 1), agents may not be fully compensated for the storage cost if c is
very large. To find the values of c for which a pure monetary equilibrium
exists, we consider V_C < V_M when Π = 1:
\[
V_C < V_M
\;\Rightarrow\;
\beta(1-M)xU(x - 1) + (1 - \beta x)\,c < 0
\;\Rightarrow\;
c < \frac{\beta(1-M)xU(1 - x)}{1 - \beta x}.
\]
To ease the interpretation of this condition, consider the case of β = 1.
(Agents meet pairwise with certainty each period.) The above condition
then simplifies to c < (1 − M)xU. The right-hand term is the
expected utility from consumption for a money holder (utility U times
the probability of meeting a trader offering an acceptable commodity,
(1 − M)x). Only if the storage cost of money c is lower than the expected
utility from consumption does a pure monetary (and a fortiori a mixed
monetary) equilibrium exist, as in the case portrayed in the figure.

Solution to exercise 56

The fact that c and z depend on the wage alters the shape of the JC and W
curves, as can be seen in the figure. In steady-state equilibrium, using (5.34),
we have
\[
y - w = (r+s)\,\frac{c_0 w}{q(\theta)}
\;\Rightarrow\;
w = \frac{1}{1 + (r+s)c_0/q(\theta)}\,y \quad (JC)
\]
and
\[
w = z_0 w + \beta\left( y + c_0 w\theta - z_0 w \right)
\;\Rightarrow\;
w = \frac{\beta}{1 - (1-\beta)z_0 - \beta c_0\theta}\,y. \quad (W)
\]
From (W), which holds in steady-state equilibrium and along the adjustment
path, it follows that the wage is proportional to productivity and the factor
of proportionality is positively correlated with the measure of labor market
tightness θ. Combining both equations, we obtain a result that is different
from the one in exercise 51: here θ is independent of the value of productivity
in the steady-state equilibrium. Variations in y lead to proportional adjustments
in the wage but have no effect on θ or on the unemployment rate u.
Hence, in this version of the model, a continuous increase in productivity
(technological progress) does not lead to a decrease in the long-run unemployment
rate. The unemployment rate is determined by the properties of the
matching technology (the efficiency of the "technology" that governs the
process of meetings between unemployed workers and vacancies) and the
exogenous separation rate s.

Solution to exercise 57

At t₀ firms anticipate the future reduction in productivity and immediately
reduce the number of vacancies: v and θ fall by a discrete amount (see figure).
Between t₀ and t₁, the dynamics are governed by the differential equations
associated with the initial steady state: v and θ continue to decrease (while
the unemployment rate increases) until they reach the new saddlepath at t₁.
From t₁ onwards u and v increase in the same proportion, leaving the labor
market tightness θ unchanged.

Solution to exercise 58

Following the procedure outlined in the main text, we calculate the total
differential of the two first-order conditions, which yields
\[
V^1_{11}\, de^*_1 + V^1_{12}\, de^*_2 + V^1_{13}\, d\lambda = 0,
\qquad
V^2_{21}\, de^*_1 + V^2_{22}\, de^*_2 + V^2_{23}\, d\lambda = 0.
\]
Using the same definitions as in the main text, we can rewrite this system of
equations as
\[
\begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}
\begin{pmatrix} de^*_1 \\ de^*_2 \end{pmatrix}
=
\begin{pmatrix} \partial e^*_1(\cdot)/\partial\lambda \\ \partial e^*_2(\cdot)/\partial\lambda \end{pmatrix}
d\lambda,
\]
from which we obtain the following results:
\[
\begin{aligned}
\frac{de^*_1}{d\lambda} &= \frac{\partial e^*_1/\partial\lambda + \rho\,(\partial e^*_2/\partial\lambda)}{1 - \rho^2}
  = \frac{1}{1-\rho^2}(1 + \rho)\frac{\partial e^*_1}{\partial\lambda}
  = \frac{1}{1-\rho}\,\frac{\partial e^*_1}{\partial\lambda}, \\
\frac{de^*_2}{d\lambda} &= \frac{\partial e^*_2/\partial\lambda + \rho\,(\partial e^*_1/\partial\lambda)}{1 - \rho^2}
  = \frac{1}{1-\rho}\,\frac{\partial e^*_2}{\partial\lambda}, \\
\frac{d(e^*_1 + e^*_2)}{d\lambda} &= \frac{2}{1-\rho}\,\frac{\partial e^*_1}{\partial\lambda}
  = 2\,\frac{de^*_1}{d\lambda}.
\end{aligned}
\]
