18 August 2013

Hadoop fsck -move does not only move the files

Let me rewind the story. Yesterday morning, two datanodes crashed. Badly. Fortunately, the default replication factor is 3, so the affected blocks started to re-replicate properly. Unfortunately, we had a few data sets stored with a replication factor of 2. And without much surprise, some blocks were stored only on the two datanodes that crashed. The hadoop fsck command reported corrupted files.
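In hindsight, raising the replication factor of those fragile data sets is a one-liner, best run well before any crash. A minimal sketch, where /data/low-rep-dataset is only a placeholder for a data set stored with a replication factor of 2:

# Overall health report; corrupted or missing blocks show up in the summary
$ hadoop fsck /
# Raise the replication factor back to 3; -w waits until re-replication completes
$ hadoop fs -setrep -w 3 /data/low-rep-dataset
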
Back to the crash: no big deal, the action plan simply looks like this (sketched in commands right after the list):
  1. Run fsck -move to move the corrupted files out of the processing paths, so that jobs do not fail on them
  2. Recover the blocks from the crashed datanodes, potentially manually if the datanodes do not come back to life
  3. Move the files from /lost+found back to their original locations.
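In commands, the plan boils down to something like this. The paths are illustrative, and as the rest of this post shows, step 3 is not as simple as it looks:

# Step 1: move the corrupted files out of the processing paths
$ hadoop fsck / -move
# Step 3: once the blocks are recovered, move the files back
$ hadoop fs -mv /lost+found/path/to/my/corrupted/file /path/to/my/corrupted/file
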
fsck -move ran fine. /path/to/my/corrupted/file was moved to /lost+found/path/to/my/corrupted/file as expected. But hey, what happened? Looking closer, the files had been turned into directories, and additional files had been created:
$ hadoop fs -ls -R /lost+found
drwx------ /lost+found/path/to/my/corrupted/file
-rw-r--r-- /lost+found/path/to/my/corrupted/file/0
-rw-r--r-- /lost+found/path/to/my/corrupted/file/1

Looking in the bible (Hadoop: The Definitive Guide by Tom White), it is briefly mentioned: "Files are broken into chains of contiguous blocks to aid any salvaging efforts you may attempt." So here is the point: fsck -move does not only move the files to /lost+found, it actually splits each file into smaller files, one per contiguous run of the remaining healthy blocks.

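Before running fsck -move, the per-file block report shows exactly where those chains will be cut. A minimal sketch on the file from the example above; the exact output format varies between Hadoop versions, but missing or corrupted blocks are clearly flagged:

# List every block of the file with its length and locations
$ hadoop fsck /path/to/my/corrupted/file -files -blocks -locations
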
As a side effect, the corrupted blocks are no longer referenced by the resulting files. Even if those blocks are recovered later, they will be deleted anyway, because they no longer belong to any file. fsck -move is a non-reversible process, or at least a painful one to recover from.

The recommendation is to use fsck -move only as a last resort, when there is no chance left to recover the blocks. This helps reduce the amount of lost data to a few portions of the file rather than its entire content.
I learnt fsck -move's hidden behaviour the hard way. If I had to rewrite the action plan for corrupted files, it would be (sketched in commands after the list):
  1. Get the list of corrupted files (hadoop fsck / | egrep -v '^\.+$')
  2. Manually move the files listed in step 1 out of the processing path (hadoop fs -mv /path/to/my/corrupted/file /quarantine/path/to/my/corrupted/file)
  3. Try to recover the missing blocks.
  4. If the missing blocks can be recovered, move the files back to their initial directory
  5. If not, run hadoop fsck -move to reduce the amount of lost data down to a few blocks, and do some additional post-processing to get the portions of the files into a usable format, including moving the file splits back to the initial directory.
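For the record, here is roughly what this revised plan looks like in commands. The /quarantine prefix is just a convention of mine, and the grep pattern may need adjusting to the fsck output of your Hadoop version:

# 1. List the corrupted files (healthy files only print dots)
$ hadoop fsck / | egrep -v '^\.+$' | grep -i corrupt
# 2. Quarantine them, keeping the directory layout (create the parent directories first)
$ hadoop fs -mv /path/to/my/corrupted/file /quarantine/path/to/my/corrupted/file
# 3./4. Try to recover the missing blocks; if it works, simply move the files back
$ hadoop fs -mv /quarantine/path/to/my/corrupted/file /path/to/my/corrupted/file
# 5. Last resort: let fsck split the files into chains of healthy blocks...
$ hadoop fsck /quarantine -move
# ...then stitch the salvaged chunks back together; the result is the original
# file minus the lost blocks, so downstream jobs have to cope with the holes
$ hadoop fs -cat /lost+found/quarantine/path/to/my/corrupted/file/0 \
                 /lost+found/quarantine/path/to/my/corrupted/file/1 > salvaged_file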

Remember: hadoop fsck -move does not only move the files to /lost+found!
