October 30, 2003

Undoing the damage

by fluffy at 9:52 PM
After last night's incident I got right to work on cleaning things up; by the time I went to bed at 3 AM, I had restored about 60% of the directory tree. This may serve as a guide for people who experience a similar catastrophe.

Then again, it may not.

All commands shown below are run from the original music collection's root directory unless otherwise indicated.

The very first thing I did was to look for a list of all of my mp3 files as they were before. This was pretty easy, since I automatically generate one on every main site update. (Obviously, the one at that URL is more recent than that, but I saved a copy of it for my own purposes.) This file has ended up being an extreme time-saver for this whole process.
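For anyone without such a list on hand, here is a minimal sketch of how one might be generated. The `make_playlist` name is hypothetical, and the "size path" line layout is only an assumption inferred from the `read size fname` loops used later in this post; the real playlist generator may well differ.

```shell
# Hypothetical helper: print "<size-in-bytes> <relative-path>" for every
# mp3 file under a root directory (defaults to the current directory).
# The line layout is an assumption, not the author's actual format.
make_playlist() {
  root="${1:-.}"
  (
    cd "$root" || exit 1
    find . -type f -name '*.mp3' | sort | while IFS= read -r f; do
      # wc -c pads its output on some systems, so strip the spaces
      size=$(wc -c < "$f" | tr -d ' ')
      printf '%s %s\n' "$size" "${f#./}"
    done
  )
}
```

Running something like `make_playlist . > playlist.txt` on a regular basis, before anything goes wrong, is what makes the rest of the recovery possible.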

Next, I moved the mangled directory tree into a subdirectory, _MANGLED_, just to keep things separate.

The next step was to restore the original directory structure, taking advantage of the playlist file yet again. I did something like this (in sh):

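cat playlist.txt | while read -r size fname ; do
  # -p creates parent directories too and doesn't complain about existing ones;
  # the quotes keep paths with spaces intact
  mkdir -p "$(dirname "$fname")"
done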
I could also take advantage of the fact that iTunes didn't bother with any files which weren't in MP3 format (this saved all of my OGGs, as well as all of the .link hidden files which I use to generate the links on the structured playlist):
cd _MANGLED_
find . -type f -not -name '*.mp3' -exec mv \{\} ../\{\} \;
So now I had a mostly-empty directory structure which I could work with as an organizational thing, at least.

The next bit was to start dealing with albums. I decided to do the file-recovery equivalent of triage:

  1. Anything which would be easy enough to move back into the directory structure, I'd just do
  2. Anything which could be easily downloaded or ripped again, I'd put into my "to redownload/rerip" pile
  3. Anything else I'd put off until I felt like handling it (e.g. as a break from type 1)
The first category covered a surprising amount of ground:
  • Single-artist albums (mv "_MANGLED_/Faye Wong/Love Only Strangers"/* Faye_Wong/Love_Only_Strangers/)
  • Multiple-artist albums (mv _MANGLED_/*/Tonnage/* various_artists/Tonnage/)
  • Singles (mv "_MANGLED_/Garbage/Unknown Album"/* Garbage/) and other odd collections (mv "_MANGLED_/dune/Unknown Album"/* Brothomstates/dune/)
The second category includes things like
  • Multiple-disc sets where the id3 tags mangled the filenames into oblivion (e.g. The Orb Remix Project)
  • Albums which I ripped before I even used id3 tagging (which I should just re-encode to get better quality anyway)
  • SongFight and Quiet American (any consistency within id3 tags is pure coincidence)
The third category is a bit more nebulous; it includes things like
  • Multiple-disc sets (especially ones which must go in order; I'm not going to worry about my Beatmania collections, for example, but Phish - A Live One needs to stay correct)
  • SongFight Live and CoverFight (Many of the files are no longer active, and have inconsistent tagging)
  • Non-SongFight stuff from SongFight artists (SongFight stuff tends to get put into weirdly titled "albums" and so on, as the participants usually put random weirdness in the id3 tags)
  • Multiple-artist albums distributed as an mp3cd (e.g. the CDBaby samplers)
  • Stuff which didn't have any id3 tagging at all
For that very last bit, I was able to take advantage of the fact that all of the tagless mp3 files happened to have unique filenames anyway:
cat playlist.txt | while read size fname ; do
  mangle="_MANGLED_/Unknown Artist/Unknown Album/`basename \"$fname\"`"
  if [ ! -f "$fname" ] && [ -f "$mangle" ] ; then
    mv -v "$mangle" "$fname"
  fi
done
This actually took care of a surprising amount of my collection.
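To put a number on how much has been recovered at any point, one option is to count how many playlist entries exist on disk again. This is only a sketch: `recovery_progress` is a hypothetical name, and it assumes the same "size path" playlist lines that the loops above read.

```shell
# Hypothetical helper: print "<restored>/<total>" for a playlist file.
# Assumes each line is "<size> <path>", with paths relative to the
# current directory (the collection root).
recovery_progress() {
  list="$1"
  total=0
  found=0
  # Redirect into the loop instead of piping so the counters survive it
  while read -r size fname; do
    [ -n "$fname" ] || continue
    total=$((total + 1))
    if [ -f "$fname" ]; then
      found=$((found + 1))
    fi
  done < "$list"
  echo "$found/$total"
}
```

Called as `recovery_progress playlist.txt` from the collection root, it prints something like `1/2` when one of two listed files is back in place.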

A lot of the stuff (such as the CDBaby samplers) I think I'll do something different with; rather than try to restore them to their original places, I'll actually just make separate directories for each artist and treat the songs like singles, since that's basically what they are. I've gotten tired of using "various artists" as a catch-all for "artists I only have one song of" anyway, and it was getting kind of hard to keep track of which albums I'd bought so I could remove the single and so on. (For example, I was surprised to find that I still had the CDBaby sampler copy of "Rhyme For Rhyme" by Mike Control lying around.)

Other things which have come in handy:

  • From the main tree, find . -type d -empty works really well to see which things are left to be recovered
  • From _MANGLED_, find . -type d -empty -print0 | xargs -0 rmdir helps the cleanup effort
  • From _MANGLED_, du -k | sort -n shows which things should be a priority
  • To find files whose names were probably truncated from id3v1 tag data: find . -name '[0-9][0-9] ????????????????????????.mp3'
  • Accepting that I'm not going to get the filenames exactly how they were before without a lot of work, and being content with many of my files now being named like "01_first_song.mp3" instead of "01.first_song.mp3" (after I run my little 'underscores' tool to convert all spaces to underscores)
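The 'underscores' tool itself isn't shown in this post. As a rough idea of what such a tool could look like, here is a minimal sketch; the function body is an assumption and not the actual script, and it skips any rename that would overwrite an existing file.

```shell
# Hypothetical stand-in for the 'underscores' tool: rename every file and
# directory below a root so that spaces in its name become underscores.
underscores() {
  root="${1:-.}"
  # -depth lists a directory's contents before the directory itself, so
  # children are renamed while their parent's old path is still valid
  find "$root" -depth -name '* *' | while IFS= read -r path; do
    [ "$path" = "$root" ] && continue   # leave the root itself alone
    dir=$(dirname "$path")
    base=$(basename "$path")
    new=$(printf '%s' "$base" | tr ' ' '_')
    if [ -e "$dir/$new" ]; then
      echo "skipping $path: $dir/$new already exists" >&2
    else
      mv "$path" "$dir/$new"
    fi
  done
}
```

For example, `underscores Faye_Wong` would turn `Faye_Wong/Love Only Strangers/01 first song.mp3` into `Faye_Wong/Love_Only_Strangers/01_first_song.mp3`.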
So anyway, feel free to watch my recovery process on my structured playlist.

I'll be happy when _MANGLED_ is gone. In the meantime, at least this gave me something to do.

Comments

#1148 10/30/2003 09:42 pm RoboDJ
Too bad you don't use RoboDJ...if you did, you'd have a textfile containing the paths to all of your files.
#1149 10/30/2003 09:44 pm
Doesn't matter. I have a textfile containing the paths to all of my files as they were before (the playlist-full.html), but there's no simple mapping between that and the new locations as mangled by iTunes. Unless RoboDJ also maintains all of the id3-tag-based data or something.

But to run RoboDJ I'd have to be listening to my music under Windows.
#1155 10/31/2003 09:33 am yes, it does
Sorry...I was just being snotty. But it does maintain a copy of the ID3 info in its database (which is in a text file)