Unofficial empeg BBS

#248069 - 30/01/2005 13:50 trouble with iTunes - copying data, desktop to laptop
DWallach
carpal tunnel

Registered: 30/04/2000
Posts: 3810
Maybe somebody here has some ideas. This wasn't supposed to be difficult. The setup: my Mac is the primary place I keep my MP3s stored (~60GB at last count). I want to copy a subset to my PC laptop for use when DJing. There's no easy way for me to write down a predicate that expresses that subset, so it really needs to be done by hand. Furthermore, I have a lot of metadata like BPMs, ratings, and whatnot that's stored in the iTunes database. I want all of that to make it over to the PC.

What almost worked but didn't:

- I mounted my Mac's filesystem on the PC.
- I made a copy of the iTunes XML database and did some regexp tweaking to make all of the file names refer to the remote Mac (i.e., they now said file://localhost/Z:/Music/iTunes Music/...).
- I set iTunes to not copy files when it imports them.
- I "imported" the hacked XML file.
- 30 minutes later, I had all the data and metadata loaded, but with the files remote instead of local.
- By hand, I deleted enough entries that the files should now all fit locally.
- I asked iTunes to "consolidate" my music library.
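The regexp tweak in the second step could be sketched roughly as follows. This is only an illustration, not the script actually used: the function name `retarget` is made up, and it assumes each `Location` key and its string sit on one line, as they do in the iTunes XML export.

```python
import re


def retarget(xml_text, old_prefix, new_prefix):
    """Rewrite the start of every Location URL from old_prefix to new_prefix."""
    pattern = re.compile(
        r"(<key>Location</key><string>)" + re.escape(old_prefix)
    )
    # A function replacement avoids backslash surprises in new_prefix.
    return pattern.sub(lambda m: m.group(1) + new_prefix, xml_text)


line = ("<key>Location</key><string>"
        "file://localhost/Volumes/BulkStuff/dwallach/Music/x.mp3</string>")
print(retarget(line,
               "file://localhost/Volumes/BulkStuff/dwallach/",
               "file://localhost/Z:/"))
```

Only the URL prefix changes; the rest of each entry, including ratings and BPMs, is left untouched.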

Sadly, the consolidation only copied about 9000 out of 12000 files, saying "Copying Music failed: The file name was invalid or too long." I just tried it again, and I briefly saw the file name that made it choke:

Mary Lou Williams/Live at the Keystone Korner/02 The History of Jazz According To Mary Lou_ Spiritual III_Fandangle_Old-Fashioned Slow Blues; K.C. 12th Street; Baby Bear Boogie; attacca Roll 'Em.mp3

Is that too many characters for NTFS to deal with in a filename?

I'm assuming I'm going to have to write a Perl script to do this "consolidation" properly, since iTunes can't handle it. One issue is long filenames, possibly with special characters. Another issue is that the XML file uses web-style %-escapes (e.g., %20 for spaces), so I need to mechanically turn those back into the normal characters that are part of the file names (which may or may not be legal characters for NTFS). The only other issue I can imagine haunting me is how to do the copying efficiently. The right answer seems to be gnutar's "--files-from" argument, although there is still the legal file name issue to deal with.
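As a rough sketch of the decoding step (in Python rather than Perl, purely for illustration): the iTunes XML export is a property list, so the standard `plistlib` module can read it, and `urllib.parse.unquote` undoes the %-escapes. The function name `track_paths` and the assumption that every location starts with `file://localhost` are mine, not something iTunes documents.

```python
import plistlib
from urllib.parse import unquote

PREFIX = "file://localhost"  # assumed common prefix of every Location URL


def track_paths(library_xml):
    """Return the decoded filesystem path of every track in the XML export."""
    library = plistlib.loads(library_xml)
    paths = []
    for track in library.get("Tracks", {}).values():
        location = track.get("Location")
        if location and location.startswith(PREFIX):
            # unquote() decodes %XX escapes as UTF-8 by default.
            paths.append(unquote(location[len(PREFIX):]))
    return paths
```

Writing the returned paths one per line into a text file gives something that could be handed to GNU tar's `--files-from` option.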

Thoughts?

#248070 - 30/01/2005 14:13 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
Quote:
Mary Lou Williams/Live at the Keystone Korner/02 The History of Jazz According To Mary Lou_ Spiritual III_Fandangle_Old-Fashioned Slow Blues; K.C. 12th Street; Baby Bear Boogie; attacca Roll 'Em.mp3

Is that too many characters for NTFS to deal with in a filename?

Dunno, but you can find out (and also find out whether it likes those semicolons and that apostrophe) by trying to copy the file manually in Explorer, or even by trying to play it in Windows straight from the exported volume. Samba goes to some lengths to dumb down its filenames for Windows clients; maybe the Macintosh SMB daemon doesn't quite go far enough?

Peter

#248071 - 30/01/2005 14:30 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5683
Loc: London, UK
Quote:
Is that too many characters for NTFS to deal with in a filename?


Yes and no. NTFS can deal with (IIRC) unlimited characters in a filename, or it might be 32768. However, without some effort, the file APIs on Windows can only deal with MAX_PATH characters, which is 260.

That filename you provided is only 199 characters long, but it might be mounted somewhere with a longer prefix. If not, then it could be a problem with a character in the name.
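One way to test the longer-prefix theory is simply to add up the lengths. The sketch below assumes MAX_PATH is 260, its value in the Windows headers, which includes the terminating NUL and so leaves 259 usable characters. The function name and the example prefix are made up for illustration.

```python
MAX_PATH = 260  # Windows constant; the count includes the terminating NUL


def fits_in_max_path(prefix, relative_name):
    """Return (fits, length) for prefix joined to relative_name."""
    # Join with a single backslash and use backslashes throughout.
    full = prefix.rstrip("\\/") + "\\" + relative_name.replace("/", "\\")
    return len(full) < MAX_PATH, len(full)


# Hypothetical library root; substitute the real one and the real file name.
print(fits_in_max_path(r"C:\Music\iTunes\iTunes Music", "Artist/Album/01 Song.mp3"))
```

If the second value comes out at 260 or more for the failing track, the path length rather than a particular character is the likely culprit.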

Also, how are you mounting the files? I think Samba has a 255-character limit as well...
_________________________
-- roger

#248072 - 30/01/2005 15:15 Re: trouble with iTunes - copying data, desktop to laptop [Re: Roger]
DWallach
carpal tunnel

Registered: 30/04/2000
Posts: 3810
Quote:
Also, how are you mounting the files? I think Samba has a 255-character limit as well...

I'm mounting them with Apple's version of Samba.

Meanwhile, I made a tar file on the Mac and it's currently unpacking on Windows. It will probably be another hour before I know what may or may not have failed. So far, tar has given me errors on files that have accented characters in their names, but is otherwise working. It looks like I need to debug my Perl script a little more.

#248073 - 30/01/2005 19:35 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
DWallach
carpal tunnel

Registered: 30/04/2000
Posts: 3810
Oooh, here's some more fun. On my Mac, here's how an e with an acute accent shows up in the XML for the file name:

Ste%CC%81phane%20Grappelli

Here's the same thing from the XML on a PC:

St%C3%A9phane%20Grappelli

What's going on here? Is this some bastard disagreement about Unicode character interpretation or what? iTunes on the PC is having no trouble playing files from my Mac over the network, so maybe Samba is smart enough to translate between these things. Weird.

#248074 - 30/01/2005 20:14 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
andy
carpal tunnel

Registered: 10/06/1999
Posts: 5916
Loc: Wivenhoe, Essex, UK
Can you show us the context of the XML around those strings?
_________________________
Remind me to change my signature to something more interesting someday

#248075 - 30/01/2005 20:45 Re: trouble with iTunes - copying data, desktop to laptop [Re: andy]
DWallach
carpal tunnel

Registered: 30/04/2000
Posts: 3810
Okay, you asked for it. Here's a song entry from iTunes on my Mac:

Quote:
<dict>
<key>Track ID</key><integer>3579</integer>
<key>Name</key><string>Pennies from Heaven</string>
<key>Artist</key><string>Stéphane Grappelli</string>
<key>Album</key><string>Verve Jazz Masters 11</string>
<key>Genre</key><string>Jazz</string>
<key>Kind</key><string>MPEG audio file</string>
<key>Size</key><integer>5172996</integer>
<key>Total Time</key><integer>229067</integer>
<key>Track Number</key><integer>1</integer>
<key>Year</key><integer>1966</integer>
<key>Date Modified</key><date>2003-09-17T14:52:25Z</date>
<key>Date Added</key><date>2002-07-14T20:19:06Z</date>
<key>Bit Rate</key><integer>180</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>Comments</key><string>Track 1</string>
<key>Track Type</key><string>File</string>
<key>Location</key><string>file://localhost/Volumes/BulkStuff/ dwallach/Music/iTunes/iTunes%20Music/Ste%CC%81phane%20Grappelli/ Verve%20Jazz%20Masters%2011/01%20Pennies%20from%20Heaven.mp3</string>
<key>File Folder Count</key><integer>4</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>


Now, here's the "same" entry from my PC:

Quote:
<dict>
<key>Track ID</key><integer>2489</integer>
<key>Name</key><string>Pennies from Heaven</string>
<key>Artist</key><string>Stéphane Grappelli</string>
<key>Album</key><string>Verve Jazz Masters 11</string>
<key>Genre</key><string>Jazz</string>
<key>Kind</key><string>MPEG audio file</string>
<key>Size</key><integer>5172996</integer>
<key>Total Time</key><integer>229067</integer>
<key>Track Number</key><integer>1</integer>
<key>Year</key><integer>1966</integer>
<key>Date Modified</key><date>2003-09-17T14:52:25Z</date>
<key>Date Added</key><date>2005-01-30T03:00:25Z</date>
<key>Bit Rate</key><integer>180</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>Comments</key><string>Track 1</string>
<key>Track Type</key><string>File</string>
<key>Location</key><string>file://localhost/Z:/ Music/iTunes/iTunes%20Music/St%C3%A9phane%20Grappelli/ Verve%20Jazz%20Masters%2011/01%20Pennies%20from%20Heaven.mp3/</string>
<key>File Folder Count</key><integer>-1</integer>
<key>Library Folder Count</key><integer>-1</integer>
</dict>



I added spaces so the location lines would wrap in nice places; otherwise, this is straight out of the XML. You can see that they've got the right thing in the artist tag, but they seem to be encoding it for the file name. That encoding seems to only be for the XML version of the file name. The actual filename, on the real disk, has the accent decoded ('ls' on the file will show the accent). Despite all this bizarreness, the file will play just fine on either machine. (Z: is mounted to my home directory on the Mac.) The only challenge is getting it stored locally instead of remotely.

I'm running the damn tar again. Last time, it freaked out when the file it had written got over 4GB in size. This time, writing to stdout, it seems to be working. That should capture everything except for about 30 files (like the above example) with messed-up accented characters. I'll deal with those by hand for now, but if anybody has any ideas how to automate this, I'd love to hear them. One possibly attractive idea would be to write a script on the Mac to simplify the filenames: just nuke all the accents and update the XML database. The accents will still be there in the XML database and the ID3 tags. If I also hacked down long file names, I'd probably be able to get the "consolidate" thing to actually work. *sigh*
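The "nuke all the accents" idea could be sketched along these lines. This is a guess at one way to do it, not a tested migration script: the function name `simplify` and the 100-character cap are arbitrary choices, and renaming the files and updating the XML to match would still be separate steps.

```python
import os
import unicodedata


def simplify(name, max_len=100):
    """Strip accents from one file name and cap its length, keeping the extension."""
    # Decompose so each accent becomes a separate combining character...
    decomposed = unicodedata.normalize("NFD", name)
    # ...then drop the combining characters, leaving the base letters.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    stem, ext = os.path.splitext(stripped)
    keep = max(1, max_len - len(ext))
    return stem[:keep] + ext


print(simplify("St\u00e9phane Grappelli"))
```

Because the name is decomposed first, it does not matter whether the input arrived in composed or decomposed form.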

#248076 - 30/01/2005 21:01 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
andy
carpal tunnel

Registered: 10/06/1999
Posts: 5916
Loc: Wivenhoe, Essex, UK
Ah, ok. That makes more sense. I was puzzled as to why the data was encoded, as it didn't need to be. The encoding has nothing to do with the data being in XML but is instead to do with it being used in a URL.

The different encoding must just be down to the way "file:" URLs are parsed on Windows and the Mac.

The Mac encoding looks very odd "Ste%CC%81phane". So the "e" is there and then some encoding presumably to indicate it has an accent.

Mind you, the Windows encoding looks odd as well. I would expect é (e-acute) to be encoded as "%e9".

I guess it must be some double byte nastiness. I guess we must be dealing with some Latin codepage rather than Unicode and that "%C3%A9" is the encoding for it. That would explain why the Mac version is different, due to different code pages. I still don't understand why the Mac version has "e" and the encoded value though.

P.S. did I mention that I hate dealing with double byte nonsense...


Edited by andy (30/01/2005 21:08)
_________________________
Remind me to change my signature to something more interesting someday

#248077 - 30/01/2005 21:10 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
Quote:
Oooh, here's some more fun. On my Mac, here's how an e with an acute accent shows up in the XML for the file name:

Ste%CC%81phane%20Grappelli

Here's the same thing from the XML on a PC:

St%C3%A9phane%20Grappelli

What's going on here? Is this some bastard disagreement about Unicode character interpretation or what?

Oh dear. You've copped for a normalisation problem. Basically, there are two ways in Unicode of representing e-acute: you can either use the e-acute character, which is what your PC has done (this is called NFC, short for Normalization Form C, the composed form), or you can use an e followed by a combining acute accent, which is what your Mac has done (this is called NFD, short for Normalization Form D, the decomposed form). Both are then represented in UTF-8, and then URL-escaped.

The Macintosh filesystem (HFS+) canonicalises filenames to a variant of NFD, whereas Windows software generally produces NFC. Both have their advantages: NFD can represent more characters (x-acute, say, which doesn't exist as a single character), but NFC makes UI code a lot simpler as it doesn't have to compose the characters at render time.
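The two spellings can be checked with nothing but the Python standard library. This is just a demonstration of the point above using the two strings from the thread; the helper name `equivalent` is made up.

```python
import unicodedata
from urllib.parse import unquote

# unquote() decodes the %XX escapes as UTF-8 by default.
mac = unquote("Ste%CC%81phane%20Grappelli")  # e followed by U+0301 (combining acute)
pc = unquote("St%C3%A9phane%20Grappelli")    # single character U+00E9


def equivalent(a, b):
    """True if two strings are the same text once both are normalised to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(mac == pc)            # compared character by character
print(equivalent(mac, pc))  # compared after normalisation
```

The raw comparison fails because the Mac string is one character longer, while the normalised comparison treats the two as the same name.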

Quote:
maybe Samba is smart enough to translate between these things.

Looks that way, yes.

Peter

#248078 - 30/01/2005 21:14 Re: trouble with iTunes - copying data, desktop to laptop [Re: peter]
andy
carpal tunnel

Registered: 10/06/1999
Posts: 5916
Loc: Wivenhoe, Essex, UK
Damn, now I know another fact about Unicode that I didn't want to know.
_________________________
Remind me to change my signature to something more interesting someday

#248079 - 30/01/2005 21:16 Re: trouble with iTunes - copying data, desktop to laptop [Re: andy]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
Yeah, and I just learned that I've been messing with this stuff for so long that I can read things in URL-encoded UTF-8. I'm not sure I wanted to know that either.

Peter

#248080 - 30/01/2005 21:28 Re: trouble with iTunes - copying data, desktop to laptop [Re: peter]
DWallach
carpal tunnel

Registered: 30/04/2000
Posts: 3810
Okay, now this is at least starting to make some sense. Is there a library (in Perl or otherwise) that lets me deal with this mess? Ideally, I'd like to take those URLs, decode them, normalize them, see if the actual file matches the name and if not, normalize the other way, and then reencode.
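The decode, normalize, check and re-encode steps described here might look something like the sketch below, using Python's standard `unicodedata` and `urllib.parse` modules. The function name `fix_location` is hypothetical, the `file://localhost` prefix is assumed, and on Windows the decoded path would probably need its leading slash removed before the existence check.

```python
import os
import unicodedata
from urllib.parse import quote, unquote

PREFIX = "file://localhost"  # assumed prefix of every Location URL


def fix_location(url, exists=os.path.exists):
    """Return url re-encoded in whichever normal form names a real file, else None."""
    if not url.startswith(PREFIX):
        raise ValueError("unexpected URL: " + url)
    path = unquote(url[len(PREFIX):])
    for form in ("NFC", "NFD"):
        candidate = unicodedata.normalize(form, path)
        if exists(candidate):
            # Re-escape as UTF-8, leaving slashes and colons readable.
            return PREFIX + quote(candidate, safe="/:")
    return None
```

The `exists` parameter is there so the lookup can be swapped out, for example to check against a listing captured from the other machine.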

Bonus question: if the file is "named" with one normal form and you ask for it with a different but equivalent normal form, should it open the file or not? Do different filesystems deal with this in different ways?

#248081 - 30/01/2005 21:46 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
Quote:
Okay, now this is at least starting to make some sense. Is there a library (in Perl or otherwise) that lets me deal with this mess? Ideally, I'd like to take those URLs, decode them, normalize them, see if the actual file matches the name and if not, normalize the other way, and then reencode.

It's not a problem I've ever had to solve, but I imagine ICU can help deal with it.

Quote:
Bonus question: if the file is "named" with one normal form and you ask for it with a different but equivalent normal form, should it open the file or not? Do different filesystems deal with this in different ways?

My guess is that Windows won't, but MacOS will. MacOS lets you mount FAT32 filesystems, so presumably has some code for getting their filenames right.

Peter

#248082 - 01/02/2005 00:38 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
tanstaafl.
carpal tunnel

Registered: 08/07/1999
Posts: 5549
Loc: Ajijic, Mexico
Quote:
Is that too many characters for NTFS to deal with in a filename?


No. NTFS will handle 255 characters (your filename is only 198), but included in that 255 character limit is the entire directory path to the file as well as the filename.

So if your directory path ("C:\My_Music\MP3_Files\Williams,Mary_Lou" or whatever it is) is more than 57 characters long, then NTFS will choke on it.

tanstaafl.
_________________________
"There Ain't No Such Thing As A Free Lunch"

#248083 - 01/02/2005 07:40 Re: trouble with iTunes - copying data, desktop to laptop [Re: tanstaafl.]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5683
Loc: London, UK
Quote:
...included in that 255 character limit is the entire directory path to the file as well as the filename.


Wrong.

NTFS can handle paths up to about 32,000 characters, but each component can be no longer than 255 characters.

The problem is in the applications themselves.

As I said above, if you're using the C runtime, or the standard Windows API functions, you're limited to MAX_PATH characters which is 260 characters. If you want to use filenames longer than this, you need to prepend (at the API level) your filename with "\\?\", but then Windows won't do path normalisation and stuff, and you'll have to do it yourself.
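A minimal sketch of the prefixing step, assuming the input is already an absolute path. The function name is made up, and `ntpath` (the Windows flavour of `os.path`, importable on any platform) stands in for the normalisation that Windows would otherwise have done.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"  # the literal characters \\?\


def to_extended_length(path):
    """Return an absolute Windows path in extended-length (\\\\?\\) form."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already in extended-length form
    # Windows skips normalisation for prefixed paths, so resolve "..",
    # "." and forward slashes here first.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # A UNC path \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_PREFIX + "UNC\\" + path[2:]
    return EXTENDED_PREFIX + path


print(to_extended_length("C:/Music/a/../b.mp3"))
```

The result is meant to be passed to the wide-character (Unicode) versions of the file APIs; relative paths are not handled here.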

I've just successfully created a path with 66 components of 240 characters apiece, with a total path length of around 31800 characters. I had to do it programmatically, and both Windows Explorer and cmd.exe are unable to remove it (or even change into it) but it's there.

Now to write some code to get rid of it...


Edited by Roger (01/02/2005 07:47)
_________________________
-- roger

#248084 - 01/02/2005 09:35 Re: trouble with iTunes - copying data, desktop to laptop [Re: Roger]
andy
carpal tunnel

Registered: 10/06/1999
Posts: 5916
Loc: Wivenhoe, Essex, UK
Quote:
I've just successfully created a path with 66 components of 240 characters apiece, with a total path length of around 31800 characters. I had to do it programmatically, and both Windows Explorer and cmd.exe are unable to remove it (or even change into it) but it's there.



Ah, just like the case of hard links. NTFS has hard links, just like all the various Unix file systems. But Windows doesn't use them and neither can users create them normally.

You can however use API calls to create them. Once you have created the hard link it just works as you would expect, every app on the machine just sees it as another copy of the linked file.

I used this to good effect at one point. There was a website where authorized users could download huge files when they had gained the appropriate rights. They had to be accessed via http, but I couldn't stream them out via BinaryWrite in ASP because it killed performance.

Using the hard link API I could create hard links to the huge files, give the links GUID names and then just direct the user's browser to the hard link. The hard link was just deleted after a specified period. I used the same thing later to make huge files appear in users' ftp areas when they carried out an action on the website.
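The same trick can be sketched in Python, whose `os.link` creates a hard link on both Unix filesystems and NTFS. This is an illustration of the idea rather than the ASP code described above; the function name and directory layout are hypothetical, and the link has to live on the same volume as the original file.

```python
import os
import uuid


def publish_via_hard_link(source, public_dir):
    """Create a hard link to source under a random GUID-style name; return its path."""
    link_path = os.path.join(public_dir, uuid.uuid4().hex)
    # No data is copied: both names now refer to the same file on disk.
    os.link(source, link_path)
    return link_path
```

Removing the link later with `os.remove` only drops that one name; the original file stays until its last remaining link is removed.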
_________________________
Remind me to change my signature to something more interesting someday

#248085 - 01/02/2005 12:42 Re: trouble with iTunes - copying data, desktop to laptop [Re: andy]
mlord
carpal tunnel

Registered: 29/08/2000
Posts: 14496
Loc: Canada
You really mean "more than one hard link to a given file/inode". A "hard link" is simply a directory entry for a file inode, and ALL permanent files on most modern filesystems have at least one hard link.

I know you knew that, but I'm just trying to help demystify "hard links". Most folks I introduce to them begin with very wrong ideas about the concept.

Cheers

#248086 - 01/02/2005 13:40 Re: trouble with iTunes - copying data, desktop to laptop [Re: DWallach]
DWallach
carpal tunnel

Registered: 30/04/2000
Posts: 3810
For what it's worth, I might as well post how I ended up solving the problem. Turns out, there were only two files whose names were, for whatever reason, too long for iTunes to deal with. I ended up discovering those files because tar choked on them as well. I "deleted" those two particular songs from my PC's iTunes and asked the PC to consolidate the tracks. That worked fine, saving me from the hassle of writing my own sync software. Then, I just moved those two long-named files over by hand, renaming them to something short, and having the PC iTunes add them into its database. So long as you don't have iTunes "managing" those wacky files, they can have whatever name you want.

Ultimately, if only iTunes were smart enough to recognize the "long filename" error and deal with it, then everything would have worked the way I originally intended, and I never would have had to learn how bizarre the world of Unicode can be.
