Introducing some technical terms that arise in default analytics.

This nomenclature comes from survival analysis, a well-established field of statistics that is readily found in textbooks and wiki entries. If you don’t mind reading maths, you will find better guidance there than in this post. The name ‘survival’ arose because the subject matter was, and still is, mostly mortality or the onset or recurrence of disease in populations or test cohorts. This is not too far from the study of the onset of default, so happily (if that is an appropriate word) a great deal of well-established statistical theory and practice is available for the study of default. This applies mainly to PD rather than LGD modelling.

Some of these terms and their equivalents in banking terminology are covered below.

Survival analysis is an essentially longitudinal activity, although the data it is based on will often be cross-sectional in structure.

The key variable x is the waiting time until default, that is, the MOB (months on book) of the first default. This variable x has a distribution with probability density function (pdf) f(x), from which the cumulative distribution function (cdf) F(x) can be derived by integration. The pdf is not intuitive for non-technical audiences, and I recommend showing only the cdf, which is a monotonically rising curve that is easy to interpret: F(24), for example, is the probability of going bad on or before MOB=24. This can also be read as “what proportion of this population will have gone bad within 2 years”.
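To make F(x) concrete, here is a minimal sketch of an empirical version of the curve, built by counting first defaults per MOB and accumulating them. The function name `emergence_curve`, the toy cohort, and the use of `None` for "no default observed" are all illustrative assumptions, not anything from a real portfolio. It also assumes every account has been observed for at least `max_mob` months; with shorter observation windows (censoring), a proper survival estimator would be needed instead.

```python
from typing import List, Optional, Sequence


def emergence_curve(first_default_mob: Sequence[Optional[int]],
                    max_mob: int) -> List[float]:
    """Return [F(0), F(1), ..., F(max_mob)].

    F(x) is the share of accounts whose first default occurred on or
    before MOB x. None means no default was observed up to max_mob.
    Assumption: every account was observed for at least max_mob months.
    """
    n = len(first_default_mob)
    if n == 0:
        raise ValueError("need at least one account")

    # counts[x] = number of first defaults at MOB x
    # (the discrete, empirical analogue of the pdf f(x))
    counts = [0] * (max_mob + 1)
    for mob in first_default_mob:
        if mob is not None and 0 <= mob <= max_mob:
            counts[mob] += 1

    # A running sum of the counts gives the empirical cdf F(x)
    curve = []
    running = 0
    for c in counts:
        running += c
        curve.append(running / n)
    return curve


# Hypothetical cohort of 10 accounts: MOB of first default, or None
cohort = [3, 7, 7, 15, 24, 30, None, None, None, None]
F = emergence_curve(cohort, max_mob=36)
print(F[12], F[24], F[36])
```

In this toy cohort, five of the ten accounts have gone bad on or before MOB 24, so `F[24]` is the "proportion gone bad within 2 years" described above.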

Note that PD always needs to be related to some time window, so “PD” alone is a vague concept and one needs to specify something like “probability of going bad within the first 24 MOB” (for a longitudinal PD) or “probability of going bad within the next 12 calendar months” (for a cross-sectional PD).
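As a rough illustration of how the time window changes the number, the sketch below reads both kinds of PD off the same curve. It rests on assumptions that are mine rather than the post's: the curve `F_toy` is made up (F(x) = x/200), the helper names are hypothetical, and the cross-sectional PD is taken for a single account that is still good at a known MOB, computed as the conditional probability (F(m+w) - F(m)) / (1 - F(m)). A portfolio-level cross-sectional PD would additionally have to average over the mix of account ages.

```python
from typing import Sequence


def pd_first_window(F: Sequence[float], window: int) -> float:
    """Longitudinal PD: probability of going bad within the first `window` MOB."""
    return F[window]


def pd_next_window(F: Sequence[float], current_mob: int, window: int) -> float:
    """PD over the next `window` months for an account still good at current_mob.

    Conditional probability: defaults emerging in (m, m + w], divided by
    the share of accounts that had not yet gone bad at m.
    """
    still_good = 1.0 - F[current_mob]
    if still_good <= 0.0:
        raise ValueError("no accounts are still good at current_mob")
    return (F[current_mob + window] - F[current_mob]) / still_good


# Made-up emergence curve for MOB 0..48: F(x) = x / 200
F_toy = [x / 200 for x in range(49)]

print(pd_first_window(F_toy, 24))      # first 24 MOB
print(pd_next_window(F_toy, 12, 12))   # next 12 months, account at MOB 12
```

The two figures differ even though they come from the same curve, which is why a bare “PD” without a stated window is ambiguous.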

I avoid using the stats terminology of pdf or cdf because the terms don’t sound intuitive, and the word “cumulative” in particular can mean so many different things in various contexts. A more business-intuitive term is preferable. Some colleagues have called the cdf the “emergence curve”, which is quite descriptive, as it makes one think of the bads “emerging” with the passing of time as the curve climbs. An emergence curve is visually comfortable to absorb (being an integral, it will be quite smooth) and shows at a glance the values of the 12-month PD, the 24-month PD, or any other x-month PD. Another business-friendly term is “default profile”, which sits comfortably with “churn profile” for the cdf of waiting time until closed-good.

But none of these is the hazard curve: continued next time…