The credit risk world likes to work with ‘odds’ and related quantities so these are covered today.
You could just do everything in terms of probability, i.e. PD, which is unambiguous. PD lies in [0,1] and a smaller number (like 0.002) indicates a better customer than a bigger number (like 0.013). In typical modelling situations (in Australia, in the good times..), a lot of PDs would have one or two or even three leading zeroes, and these numbers are not handy for transcription or for quickly conveying which zone they lie in.
It goes without saying that it is often more palatable to format a PD as a percentage, e.g. PD = 0.013 as PD = 1.3%.
‘Odds’ have a special status because they are intimately linked with logistic regression, the main PD-modelling statistical tool. Odds can be worked out from the PD, and vice versa, as follows:
- odds = 1/PD – 1
- PD = 1/(1 + odds)
For example, odds = 8 means exactly the same thing as PD = 1/9 = 0.1111..
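For anyone who prefers to see the arithmetic in code, here is a minimal Python sketch of the two conversions above. The function names pd_to_odds and odds_to_pd are illustrative assumptions, not anything standard, and the input checks are just one reasonable way to guard against dividing by zero.

```python
def pd_to_odds(pd: float) -> float:
    """Convert a PD to odds using odds = 1/PD - 1 (illustrative helper)."""
    if not 0.0 < pd <= 1.0:
        # PD = 0 would mean infinite odds, so it is rejected here.
        raise ValueError("pd must lie in (0, 1]")
    return 1.0 / pd - 1.0


def odds_to_pd(odds: float) -> float:
    """Convert odds back to a PD using PD = 1/(1 + odds) (illustrative helper)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return 1.0 / (1.0 + odds)


if __name__ == "__main__":
    print(odds_to_pd(8))       # about 0.1111, i.e. 1/9
    print(pd_to_odds(1 / 9))   # about 8
```

The two functions are inverses of each other, so converting a PD to odds and back should return the starting value up to floating-point rounding.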
Odds are generally taken to be the Good:Bad odds; thus a bigger number for odds is a better situation. I have seen analysts using Odds the other way up i.e. the Bad:Good odds. You can come out alive but it will confuse your colleagues; +/- changes of sign will cascade through and graphs will tilt the opposite way.
One step closer to the logistic zone is to transform to “log_odds”.
- log_odds = ln(odds)
- odds = exp(log_odds)
‘ln’ means natural logs, i.e. to the base ‘e’. Actually, mathematicians always mean natural logs when they say log and as a matter of pride would never mention the base, or contemplate a base other than ‘e’ unless it was a neat way to summarise a problem that had structure particular to integral bases. Ambiguity can arise: computer systems that are tech-oriented, like SAS or MATLAB, assume ‘log’ means ln, whereas those that are business-oriented, like MS/Excel, assume that ‘log’ means log_to_base_10. It also doesn’t help that ‘ln’ is not comfortable in speech.
By ‘log’ I always mean natural log, and I use log10 or log2 to mean logs to base 10 or 2. For the meantime, the terminology ‘log_odds’ will be used, which is easy in speech, but if anyone can suggest better nomenclature they are welcome to put it forward.
If we’ve made the right choices so far, a bigger number for log_odds is a better situation. Note that log_odds can be negative (when odds < 1, which is when PD > 0.5).
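The log_odds step can be sketched in Python as follows. This assumes the odds formulas given earlier in this post; the function names are hypothetical. Conveniently for the terminology discussion above, Python's math.log is the natural log, with math.log10 and math.log2 available for the other bases.

```python
import math


def pd_to_log_odds(pd: float) -> float:
    """log_odds = ln(odds), with odds = (1 - PD)/PD (illustrative helper)."""
    if not 0.0 < pd < 1.0:
        raise ValueError("pd must lie strictly between 0 and 1")
    return math.log((1.0 - pd) / pd)


def log_odds_to_pd(log_odds: float) -> float:
    """Invert the transform: odds = exp(log_odds), PD = 1/(1 + odds)."""
    return 1.0 / (1.0 + math.exp(log_odds))


if __name__ == "__main__":
    for pd in (0.002, 0.013, 0.5, 0.6):
        # log_odds is positive for PD < 0.5, zero at 0.5, negative above 0.5.
        print(pd, round(pd_to_log_odds(pd), 3))
```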
To make the numbers more convenient to handle, it is common practice to convert the log_odds to a ‘score’ on a user-friendly scale that wouldn’t involve negatives or decimal places. For the first time in this chain of transformation, arbitrary scaling constants are involved in this choice: one for location and one for scale (spread). A typical approach is illustrated below:
- for location: bang a stake in the ground at the point that will represent odds of 1 (== log_odds of zero == PD of 0.5): so, for example, choose a score of 500 to represent this point (which BTW would be a lousy customer)
- for scale: this is normally done by specifying how many points it takes to double the odds (PDO). A comfortable choice would be PDO=20, which says that a score of 520 <=> odds=2, 540 <=> odds=4, 560 <=> odds=8 etc.
Because log_odds is a logarithmic scale, the above choices work out and amount to a linear transformation of log_odds to score. The two scaling parameters, and hence the transformations from log_odds to score and back, will depend on these fairly arbitrary choices.
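To make the linear transformation concrete, here is a small Python sketch under the example choices above (score 500 at odds of 1, PDO = 20). The function and parameter names (anchor_score, anchor_odds, pdo) are assumptions for illustration only; the slope works out as PDO/ln(2) because each doubling of the odds adds ln(2) to log_odds.

```python
import math


def log_odds_to_score(log_odds: float,
                      anchor_score: float = 500.0,
                      anchor_odds: float = 1.0,
                      pdo: float = 20.0) -> float:
    """Linear map from log_odds to score (illustrative helper)."""
    factor = pdo / math.log(2.0)                  # points per unit of log_odds
    offset = anchor_score - factor * math.log(anchor_odds)
    return offset + factor * log_odds


def score_to_log_odds(score: float,
                      anchor_score: float = 500.0,
                      anchor_odds: float = 1.0,
                      pdo: float = 20.0) -> float:
    """Inverse of log_odds_to_score, using the same scaling choices."""
    factor = pdo / math.log(2.0)
    offset = anchor_score - factor * math.log(anchor_odds)
    return (score - offset) / factor


if __name__ == "__main__":
    for odds in (1, 2, 4, 8):
        # Rounded scores should step up by 20 for each doubling of the odds.
        print(odds, round(log_odds_to_score(math.log(odds))))
```

Both directions must use the same anchor and PDO, otherwise the round trip will not return the original log_odds.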
PDO=20 gives a nice granularity to the scores, which will mostly land in the 500-800 zone and you won’t feel the need to use decimal points i.e. whole-number scores suffice. As long as PDO is chosen to be positive, it will still be the case that a bigger score is a better situation.
All the above transformations are absolute arithmetic ones that always apply, irrespective of context such as outcome window, default definition, calibration, closed goods in/out, etc. If you find you disagree with someone via these calcs, it means you started from different contexts and therein lies the entire explanation for your disagreement.
5 comments
30 April, 2008 at 00:26
Clive
Some afterthoughts:
Any feedback on commonly adopted score scaling choices would be welcome – especially from the specialist scorecard building companies – tell us what the commonest choices are so that this blog can maximise familiarity.
Alternatives I have seen include banging in the location anchor at a point representing, say, 50:1 odds. Such a point is likely to be not too far from the middle point of your range since it is close to PD=2%. OTOH 1:1 odds has the minor convenience of log_odds=0 which makes the scaling formula slightly simpler. Also, this “location anchor” may be chosen at 200, 600, whatever; somewhere where scores are extremely unlikely to ever go below 0 (or below 100, which produces a 2-digit score) or above 999.
It’s all about convenience and amenability for non-technical users so as to minimise opportunities for transcription errors and misunderstandings so it may as well be done thoughtfully.
I’ve seen PDO choices of 20 and of 15. Even 10 would give a decent granularity for most purposes, but, why skimp, take 20.
PDO=12 would be a quirky choice because the scores would then be analogous to musical intervals on the Western system of 12 semitones per octave, so scores would map neatly to piano keys. Credit risk committees could then debate lifting the score cut-off from B-flat to F-sharp. The rest of this scenario is left to the reader’s imagination.
Note that the Basel floor of PD=0.03% equates to odds of 3332 and log_odds=8.11; this then represents a sensible limit on axes on graphs, for example. I typically choose log_odds=-2 (PD=88%) to represent minus infinity and log_odds=8 (PD=0.03%) to represent positive infinity, which gives a span of 10 log_odds units (about 14 doublings of the odds, or “octaves”) encompassing all but the most extreme values. Sticking to these limits on all log_odds graphs then gives a uniform “frame” which will enhance interpretation and comparison.
I have also sometimes used these infinity bounds to “Winsorise” or truncate calculated odds values that go crazy because of zero division or log of zero or small sample effects.
You may prefer your own estimates of infinity.
Whatever your choice of scoring scale, it is helpful to determine these two “infinities” as a guide: e.g. if you use 600@1:1 and PDO=20 you will get minus infinity at score=542 and plus infinity at score=831. Scores outside this range are possible but extreme, and in the case of >831 they exceed the Basel limit.
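A rough Python sketch of the two ideas above: truncating log_odds to the chosen "infinities", and finding where those bounds land on a score scale. It assumes the bounds of -2 and 8 and the 600@1:1, PDO=20 scale from the example; the constant and function names are hypothetical.

```python
import math

# Assumed "infinity" bounds on log_odds, taken from the comment above.
LOG_ODDS_FLOOR = -2.0
LOG_ODDS_CEILING = 8.0


def clip_log_odds(log_odds: float) -> float:
    """Truncate a log_odds value to the [floor, ceiling] range."""
    return max(LOG_ODDS_FLOOR, min(LOG_ODDS_CEILING, log_odds))


def score_from_log_odds(log_odds: float,
                        anchor_score: float = 600.0,
                        pdo: float = 20.0) -> float:
    """Score on a scale anchored at 1:1 odds (where log_odds = 0)."""
    return anchor_score + pdo * log_odds / math.log(2.0)


if __name__ == "__main__":
    print(round(score_from_log_odds(LOG_ODDS_FLOOR)))    # 542
    print(round(score_from_log_odds(LOG_ODDS_CEILING)))  # 831
    print(clip_log_odds(float("inf")))                   # 8.0
```

Clipping before scoring means a zero-division or log-of-zero result ends up at one end of the frame rather than off the chart.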
1 May, 2008 at 22:33
Odds and marginal profitability « ozrisk.net
[…] is 15, it means that your tipping point is at 15:1 odds, which can be converted to the score as per previous post. This would then be the cut-off. This post assumes a simple automatic accept/decline score, […]
3 July, 2008 at 04:05
Vikas
Hi.. Would request if someone could clarify:
I understand the odds definition, in the context of the event being defined as ‘Default’, to be PD/(1-PD), i.e. prob(event) / prob(non-event). If we consider Log(Odds) = Score, then the above relationship gives PD = 1/(1+exp(-score)). This relationship is explained in quite a few credit risk books. Why then is your odds definition given as (1-PD)/PD? Which is correct??
3 July, 2008 at 22:35
Clive
The good news is that either is correct – as covered in the paragraph that starts “Odds are generally taken to be the Good:Bad odds” in the original post above. It is a choice of convention.
In the odds and score definition you quote, a higher score means a worse situation.
There is also a symmetry around whether one considers “default” or “non-default” to be the target event.
I opined that “Odds are generally taken to be the Good:Bad odds” (rather than the Bad:Good odds) without knowing what most of the Credit Risk industry does, so it may be from your sample of quite a few credit risk books that my convention is actually in the minority.
A convenience factor attached to this convention is that PDs are usually small (.001 to .1), and this leads to a convenient range of numbers for odds (999 to 9). In the convention you mention, the range for those “odds” would be fractions (0.001 to 0.11).
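A quick Python check of the two conventions, as a sketch only (the function names are made up for illustration). It shows the ranges quoted above and that the two log_odds differ only in sign, which is why everything flips but nothing breaks.

```python
import math


def good_bad_odds(pd: float) -> float:
    """Good:Bad odds, (1 - PD)/PD: bigger is better."""
    return (1.0 - pd) / pd


def bad_good_odds(pd: float) -> float:
    """Bad:Good odds, PD/(1 - PD): bigger is worse."""
    return pd / (1.0 - pd)


if __name__ == "__main__":
    for pd in (0.001, 0.1):
        gb = good_bad_odds(pd)
        bg = bad_good_odds(pd)
        # The two log_odds values sum to zero (up to rounding error).
        print(pd, round(gb, 3), round(bg, 4), math.log(gb) + math.log(bg))
```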
Thank you for raising the point and it would be interesting to hear which convention enjoys the most support.