When to use Survival Analysis instead of Regression?

“And what is a survival function?”, I hear you ask

Tarek Amr
5 min read · Mar 14, 2024


Regression is an important item in every Data Scientist’s toolbox. You probably use it all the time, and you should. But sometimes it isn’t the right tool for the job.

Here is an example to show you why:

You want to predict the height of a golf ball once it has traveled 5 meters, given the player’s technique, the strength of their shot and, say, the wind speed. How would you estimate it?

Regression, you say?

Correct!
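If you want to see what that looks like in practice, here is a minimal sketch using ordinary least squares. Everything in it is an assumption made for illustration: the data is randomly generated, the feature names (`technique`, `strength`, `wind`) and their units are hypothetical, and the "true" relationship is invented so there is something to fit.

```python
import numpy as np

# Made-up illustrative data: none of these numbers come from a real experiment.
rng = np.random.default_rng(0)
n_shots = 50
technique = rng.uniform(0, 1, n_shots)   # hypothetical technique score
strength = rng.uniform(20, 60, n_shots)  # hypothetical shot strength
wind = rng.uniform(-5, 5, n_shots)       # hypothetical wind speed

# Invented relationship plus noise, only so the regression has a signal to find.
height = (0.5 * technique + 0.03 * strength - 0.02 * wind
          + rng.normal(0, 0.05, n_shots))

# Ordinary least squares: height is approximated by X @ coef.
# The column of ones gives the model an intercept.
X = np.column_stack([np.ones(n_shots), technique, strength, wind])
coef, *_ = np.linalg.lstsq(X, height, rcond=None)


def predict_height(technique, strength, wind):
    """Predict the ball's height for one new shot using the fitted coefficients."""
    return float(coef @ np.array([1.0, technique, strength, wind]))


print(predict_height(technique=0.7, strength=40.0, wind=1.0))
```

The point is only that every shot has a measured height, so every shot can go into the fit.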

Now, what if I told you that, for some reason, there was a fridge in the way of your golfers?

The fridge is blocking the golfers’ way — Image created by the author

You wish that nasty fridge wasn’t there; it has nothing to do with your experiment.

Too bad, you have to deal with it.

And sorry, you won’t get the chance to re-do this experiment. Those colorful golfers are too busy now to help you again.

Now you have to think outside your toolbox.

Option One: I am still using Regression anyway

Alright, you sure can use regression, but you have to exclude the purple and the pink golfers. The fridge was in their way. Only the green one is left.
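Here is a rough sketch of what that exclusion looks like in code. The shots, strengths and heights are made up for illustration, and marking a blocked shot with `NaN` is my own assumption about how such data might be recorded.

```python
import numpy as np

# Made-up shots, for illustration only. NaN marks a shot whose height
# could not be measured because the fridge was in the way.
golfer = np.array(["purple", "pink", "green", "green",
                   "green", "purple", "pink", "green"])
strength = np.array([30.0, 45.0, 25.0, 40.0, 55.0, 50.0, 35.0, 32.0])
height = np.array([np.nan, np.nan, 0.9, 1.4, 1.9, np.nan, np.nan, 1.1])

# Keep only the shots where a height was actually observed.
observed = ~np.isnan(height)

# Fit a straight line (height vs. strength) on the remaining shots.
# np.polyfit with deg=1 returns the slope first, then the intercept.
slope, intercept = np.polyfit(strength[observed], height[observed], deg=1)

golfers_left = sorted(set(golfer[observed].tolist()))
print(f"shots used: {int(observed.sum())} of {len(height)}")
print(f"golfers left: {golfers_left}")
```

The regression still runs, but only on the green golfer's shots; everything the purple and pink golfers did is thrown away.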
