Sören Asmussen, Aarhus University, Denmark

Limit theorems for failure recovery probabilities

A job such as the execution of a computer program or the transmission of a file on a communications link may fail. Some of the schemes for failure recovery that have been suggested go under the names of REPLACE, RESUME, RESTART and checkpointing. I will start with a survey of a recent analysis of RESTART (joint with Fiorini, Lipsky, Rolski & Sheahan), which shows that this scheme may lead to total completion times with a disastrously heavy tail. One way to mitigate this is checkpointing, where the job is split into subtasks so that it suffices to restart from the last checkpoint. We investigate the effect of this on the tail of the total time distribution for a variety of checkpointing models (joint with Lipsky).
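
To make the difference between RESTART and checkpointing concrete, the following is a minimal simulation sketch, not the model analysed in the talk. It assumes failures arrive as a Poisson process with rate failure_rate, a fixed job length, equally spaced checkpoints, and no cost for taking a checkpoint; all of these, and the function names, are illustrative assumptions that do not come from the abstract.

```python
import random


def restart_time(task_length, failure_rate, rng):
    """Total time to finish one task under RESTART.

    Assumption: times between failures are exponential with the given rate.
    Every failure discards all work done so far, and the task starts over.
    """
    total = 0.0
    while True:
        time_to_failure = rng.expovariate(failure_rate)
        if time_to_failure >= task_length:
            # The task ran to completion before the next failure.
            return total + task_length
        # Work up to the failure is lost; only the elapsed time remains.
        total += time_to_failure


def checkpointed_time(task_length, failure_rate, num_segments, rng):
    """Total time when the job is split into equal segments.

    Assumption: checkpoints are free, so each segment is simply a
    shorter RESTART task that begins at the previous checkpoint.
    """
    segment = task_length / num_segments
    return sum(restart_time(segment, failure_rate, rng)
               for _ in range(num_segments))


if __name__ == "__main__":
    rng = random.Random(2024)
    runs = 10_000
    plain = sorted(restart_time(5.0, 1.0, rng) for _ in range(runs))
    split = sorted(checkpointed_time(5.0, 1.0, 10, rng) for _ in range(runs))
    # Compare an upper quantile of the two empirical distributions.
    print("99th percentile, RESTART:      ", plain[int(0.99 * runs)])
    print("99th percentile, checkpointing:", split[int(0.99 * runs)])
```

Comparing upper quantiles rather than means is a deliberate choice here, since the abstract is concerned with the tail of the total time distribution. A fixed job length is used only to keep the sketch short.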