Unofficial empeg BBS

#215491 - 06/05/2004 08:50 FreeDB Vs CDDB
Phoenix42
veteran

Registered: 21/03/2002
Posts: 1424
Loc: MA but Irish born
Is there a difference?
Without getting into the open-source vs. big-business argument, what I'm interested in is which database returns the better, more accurate results.

#215492 - 06/05/2004 08:58 Re: FreeDB Vs CDDB [Re: Phoenix42]
Dignan
carpal tunnel

Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
All I can say is that there are some serious idiots submitting to FreeDB, but I use it because it's built into EAC (don't know if you can use CDDB with EAC, though), and it's easier to populate the files with incorrect info and then edit it than to write it all yourself from scratch.

Incorrect spelling, capitalization (why do some people submit a song title like it's a sentence and not an actual title?), and sometimes song titles that are simply wrong. That's what I find from about 25% of the stuff I get from FreeDB. Honestly, I don't remember that much bad information from CDDB when it was still free.
_________________________
Matt

#215493 - 06/05/2004 09:01 Re: FreeDB Vs CDDB [Re: Dignan]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
I've submitted suggestions to the FreeDB dudes on how to avoid these problems, but they seem uninterested.
_________________________
Bitt Faulk

#215494 - 06/05/2004 09:06 Re: FreeDB Vs CDDB [Re: Dignan]
siberia37
old hand

Registered: 09/01/2002
Posts: 702
Loc: Tacoma,WA
In reply to:

Incorrect spelling, capitalization (why do some people submit a song title like it's a sentence and not an actual title?), and sometimes song titles that are simply wrong. That's what I find from about 25% of the stuff I get from FreeDB. Honestly, I don't remember that much bad information from CDDB when it was still free.




Not to mention duplicate entries. Almost all of the CDs I've used FreeDB with have come up in something like three different genres, in three different entries. It's like people are so anxious to put their version of the CD in there that they don't care if it's already in there.

#215495 - 06/05/2004 09:12 Re: FreeDB Vs CDDB [Re: siberia37]
Dignan
carpal tunnel

Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
It's like people are so anxious to put their version of the CD in there, that they don't care if it's already in there.
You're absolutely correct. I hate when that happens. I only try to submit when I see it's not up there, but I don't think I've been able to submit anything, unfortunately. I think EAC wasn't letting me for some reason.

Bitt, what were your suggestions? I'm curious.
_________________________
Matt

#215496 - 06/05/2004 09:16 Re: FreeDB Vs CDDB [Re: Dignan]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Bitt, what were your suggestions? I'm curious.
Don't know what Bitt had in mind, but it's pretty easy to develop an algorithm to implement these capitalization rules that would work on about 99% of song titles. Checking some of the rules (like capitalizing "as" if it's followed by a verb) would require the script to cross-reference a dictionary, but most of the others are pretty easy to do without a dictionary. I wrote a Perl script that uses most of the rules and it works on all but a few of my songs.
_________________________
- Tony C
my empeg stuff

#215497 - 06/05/2004 09:25 Re: FreeDB Vs CDDB [Re: Dignan]
JeffS
carpal tunnel

Registered: 14/01/2002
Posts: 2858
Loc: Atlanta, GA
What's REALLY annoying is tagging audio books, because it seems a different person has submitted each disc in a different format. Talk about a pain! The last Harry Potter book was 23 CDs long and I had to tag each individually to get it to all make sense. And then there's LOTR, which is 46 CDs long . . .
_________________________
-Jeff
Rome did not create a great empire by having meetings; they did it by killing all those who opposed them.

#215498 - 06/05/2004 09:44 Re: FreeDB Vs CDDB [Re: tonyc]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
capitalization rules that would work on about 99% of song titles
Those rules leave a lot of words uncapitalised though -- I always capitalise everything except those abbreviations ("vs", "am") which only look right in lower-case. The tagging program I wrote flags any words without initial capital in red, and "vs" is about the only one I don't bother fixing (although sometimes I change "vs" to "&").
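To illustrate that kind of check: this is only a rough sketch, not the actual tagging program, and the helper name `uncapitalized_words` is made up for the example. It lists the words in a title that lack an initial capital, so that something like a red highlight could be applied to them:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the words in a title whose first letter is
# lower-case, so a tagger could highlight them for review.
sub uncapitalized_words
{
    my ($title) = @_;

    # Skip leading punctuation such as "(" or a quote before testing the letter.
    return grep { /^[^A-Za-z0-9]*[a-z]/ } split ' ', $title;
}

my @flagged = uncapitalized_words('Spy vs Spy (live at the BBC)');
print join( ', ', @flagged ), "\n";
```

Whether a flagged word such as "vs" then gets fixed or left alone is still a human decision; the sketch only finds the candidates.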

Peter

#215499 - 06/05/2004 09:52 Re: FreeDB Vs CDDB [Re: peter]
tms13
old hand

Registered: 30/07/2001
Posts: 1115
Loc: Lochcarron and Edinburgh
Ugh, you're not one of those people who capitalise Of, The, and And, are you? The bane of my tagging life! (well, apart from the current broken version of Grip which loses the last letter from each track)
_________________________
Toby Speight
030103016 (80GB Mk2a, blue)
030102806 (0GB Mk2a, blue)

#215500 - 06/05/2004 09:56 Re: FreeDB Vs CDDB [Re: peter]
Dignan
carpal tunnel

Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
So do you have any tracks from Pearl Jam's "vs" on your system?
_________________________
Matt

#215501 - 06/05/2004 10:29 Re: FreeDB Vs CDDB [Re: Dignan]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
So do you have any tracks from Pearl Jam's "vs" on your system?
Hmm, no. For that one I'd probably go for write-as-pronounced and tag it as "Versus". Lemon Jelly's lemonjelly.ky was a tough call too, but eventually I left it lower-case.

Peter

#215502 - 06/05/2004 10:51 Re: FreeDB Vs CDDB [Re: peter]
Dignan
carpal tunnel

Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
I often see people capitalize song titles the way they are on the album cover art and insert. I suppose I have less of a problem with this, because at least there's a reason, and I'll keep it if it doesn't bother me.

It's When they Capitalize With no apparent rules That i really get Pissed Off.
_________________________
Matt

#215503 - 06/05/2004 11:04 Re: FreeDB Vs CDDB [Re: tonyc]
mcomb
pooh-bah

Registered: 31/08/1999
Posts: 1649
Loc: San Carlos, CA
I wrote a Perl script that uses most of the rules

Would you mind sharing that part of the script? I use a perl script for tagging my mp3s as well and that would be a useful addition.

-Mike
_________________________
EmpMenuX - ext3 filesystem - Empeg iTunes integration

#215504 - 06/05/2004 11:10 Re: FreeDB Vs CDDB [Re: Dignan]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
I often see people capitalize song titles the way they are on the album cover art and insert. I suppose I have less of a problem with this, because at least there's a reason, and I'll keep it if it doesn't bother me.
This gets annoying if used with players (even Karma, until fairly recently) which compare names in a case-sensitive way. So if you've got a band that ponces about with their typography, it makes it unnecessarily difficult (and ugly) even to browse their entire oeuvre.
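For what it's worth, the case-insensitive side of this is simple to sketch. The following is an illustration only, not how any particular player does it, and `group_artists` is a made-up name. It groups names under a lower-cased key and keeps the first spelling seen for display:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group artist names that differ only in case.
# The key is the lower-cased name; the first spelling seen is kept for display.
sub group_artists
{
    my @names = @_;
    my ( %display, %count );
    for my $name (@names) {
        my $key = lc $name;
        $display{$key} //= $name;
        $count{$key}++;
    }
    # One [ display name, number of entries ] pair per distinct artist.
    return map { [ $display{$_}, $count{$_} ] } sort keys %count;
}

for my $entry ( group_artists( 'Lemon Jelly', 'lemon jelly', 'LEMON JELLY', 'Tool' ) ) {
    printf "%s: %d\n", @$entry;
}
```

A plain `lc` is enough for ASCII names; accented or non-Latin names would need proper Unicode case folding, which this sketch does not attempt.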

Peter

#215505 - 06/05/2004 11:46 Re: FreeDB Vs CDDB [Re: tonyc]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Actually, mine was more human-oriented. There are many ways to misspell, miscapitalize, or otherwise get it totally wrong. There are only a few correct ways, and most of these are at least passable to most of us. Assuming that's the case, if you wait and don't publish submissions immediately, many people will submit. Then compare them all. Once you have an exact match, publish that one. It's unlikely that multiple people will screw up in exactly the same way. You could then also keep a list of who often submitted correct results and favor their submissions. (An email address was required back with free CDDB; I don't know that that's the case with FreeDB, but I think it is. Obviously, you'd have to throw away generic addresses.) You'd also have to have a timeout in case, for some reason, you never got matching submissions, so that an entry doesn't languish too long.

You could then conceivably run the titles through a filter to make the capitalization consistent if you wanted, but that may be an unnecessary step. It would fix Peters who capitalize every damn word, but even if that wasn't the case, poor capitalization is better than flat-out wrong.
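A minimal sketch of the hold-and-compare idea, under stated assumptions: each held submission is a hash with a `text` field (the whole entry flattened to one string) and a `time` field (epoch seconds), and the 30-day timeout is an arbitrary figure. The name `pick_submission` is invented for the example, and nothing here reflects how FreeDB actually works. The submitter-reputation part is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $TIMEOUT = 30 * 24 * 60 * 60;    # assumed 30-day limit, in seconds

# Hypothetical helper: given the current time and the held submissions for
# one disc, return the text to publish, or nothing to keep waiting.
sub pick_submission
{
    my ( $now, @subs ) = @_;
    return unless @subs;

    my @by_age = sort { $a->{time} <=> $b->{time} } @subs;

    my %seen;
    for my $entry (@by_age) {
        # Two character-for-character identical submissions count as agreement.
        return $entry->{text} if ++$seen{ $entry->{text} } >= 2;
    }

    # No agreement yet: after the timeout, fall back to the oldest submission.
    return $by_age[0]{text} if $now - $by_age[0]{time} > $TIMEOUT;
    return;
}

my @held = (
    { text => 'Pearl Jam / Vs.', time => 100 },
    { text => 'pearl jam / vs',  time => 200 },
    { text => 'Pearl Jam / Vs.', time => 300 },
);
my $winner = pick_submission( 400, @held );
print defined $winner ? "publish: $winner\n" : "keep waiting\n";
```

Requiring an exact string match is deliberately strict: two submissions that differ only in capitalization do not count as agreeing, which is the point of the scheme.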
_________________________
Bitt Faulk

#215506 - 06/05/2004 12:51 Re: FreeDB Vs CDDB [Re: Dignan]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
So do you have any tracks from Pearl Jam's "vs" on your system?
My perl script actually makes an exception for that one, and a few other non-standard albums/artists. It's not perfect, but it works.
_________________________
- Tony C
my empeg stuff

#215507 - 06/05/2004 13:21 Re: FreeDB Vs CDDB [Re: mcomb]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Okay, I cut this out of a much longer script. It's a little hackish, but it pretty much follows the right algorithm (or if it doesn't, I haven't found any errors). The @exceptions array is for phrases you don't want changed for whatever reason. Also, earlier in my script I do things like this:
next if ( $artist =~ /^Tool$/ && $album =~ /^undertow$/ );
next if ( $artist =~ /^Tori Amos$/ && $album =~ /^Scarlet's Walk$/ );
to skip over problem albums that are capitalized funny.

The rest should be pretty self explanatory. It probably isn't the most optimized code in the world, but it does the job. It also assumes that the titles start out "pretty close" to correct, so starting out with ALL CAPS isn't a good idea, as it will just try to capitalize the first letters (which are already capitalized.)


#!/usr/bin/perl
use strict;
use warnings;

# Small words that stay lower-case unless they are first or last in the title.
my @articles     = ( 'a', 'an', 'the' );
my @conjunctions = ( 'and', 'but', 'or', 'nor' );
my @prepositions = (
    'at', 'by', 'for', 'from', 'in', 'into', 'of', 'off',
    'on', 'onto', 'out', 'over', 'to', 'with'
);

# Regex fragments for words that are always capitalized or always lower-cased.
my @uppers = ( 'also', 'be', 'if', 'that', 'thus', 'when', 'as' );
my @lowers = ( 'f\.', 'vs\.', "'n'" );

# Not referenced by the code in this excerpt.
my @phrasalverbs = (
    'Beat Up', 'Blow Out', 'Break Down', 'Break Into', 'Break Up', 'Bring Up',
    'Call Off', 'Call On', 'Call Up', 'Carry On',
    'Come Back', 'Come Down', 'Come On', 'Come Out', 'Come Over', 'Do Over',
    'Fill In', 'Fill Out', 'Find Out', 'Get Along', 'Get Around',
    'Get By', 'Get Over', 'Get Through', 'Get Up', 'Give Back', 'Give Up',
    'Go Along', 'Go Away', 'Go On', 'Go Over', 'Hand In', 'Hang Up',
    'Hold On', 'Keep On', 'Keep Up', 'Leave Out', 'Let Down', 'Look For',
    'Look Into', 'Look Like', 'Look Out', 'Look Over', 'Look Up', 'Make Out',
    'Make Up', 'Pack Up', 'Pass Out', 'Pick Out', 'Pick Up', 'Put Away',
    'Put Off', 'Put On', 'Put Out', 'Put Up', 'Roll Over', 'Run Into', 'Run Out',
    'Run Over', 'Show Up', 'Take After', 'Take Back', 'Take Off', 'Take On',
    'Take Up', 'Talk Back', 'Talk Over', 'Throw Away', 'Try On', 'Turn Down',
    'Turn In', 'Turn Off', 'Turn On', 'Use Up', 'Wait On'
);

# Titles to leave exactly as they are.
my @exceptions = ( );


while ( my $title = <> )
{
    chomp($title);    # strip the trailing newline only
    my $newtitle = capitalizeTitle($title);
    print "before: $title\nafter: $newtitle\n";
}

sub substitutions
{
    my $word = $_[0];

    # "vs" gains a period; the "featuring" variants become "f." when they
    # open a parenthetical, e.g. "(feat."
    $word =~ s!^(\(?)vs$!${1}vs.!i;
    $word =~ s!^(\()ft\.$!${1}f.!i;
    $word =~ s!^(\()feat\.$!${1}f.!i;
    $word =~ s!^(\()featuring$!${1}f.!i;
    return $word;
} # end sub substitutions


sub capitalizeTitle
{
    my $string = $_[0];

    foreach my $exception (@exceptions)
    {
        return $string if ( uc($string) eq uc($exception) );
    }

    my @words = split ( / /, $string );

    for ( my $i = 0 ; $i < @words ; $i++ )
    {
        my $word = \$words[$i];
        if (   $i == 0
            || $i == @words - 1
            || $words[ $i + 1 ] =~ /^\(/
            || $words[ $i - 1 ] eq "-" )
        {
            # First word, last word, word before "(" or after "-": capitalize.
            $$word =~ s/^(.)/\U$1/;
        }
        else
        {
            $$word =~ s/^([^A-Za-z0-9]*)(.)/\U$1$2/;
            $$word = substitutions($$word);

            # Articles, conjunctions and prepositions go back to lower-case.
            foreach my $small ( @articles, @conjunctions, @prepositions )
            {
                $$word =~ s/^(.)/\L$1/ if ( $$word =~ /^$small$/i );
            }

            foreach my $upper (@uppers)
            {
                if ( $$word =~ /^[^A-Za-z0-9]*$upper$/i )
                {
                    $$word =~ s/^([^A-Za-z0-9]*)(.)/\U$1$2/;
                }
            }

            foreach my $lower (@lowers)
            {
                if ( $$word =~ /^[^A-Za-z0-9]*$lower$/i )
                {
                    $$word =~ s/^([^A-Za-z0-9]*)(.)/\L$1$2/;
                }
            }
        } # end else
    } # end for
    return join ( ' ', @words );
} # end sub capitalizeTitle
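On the ALL CAPS caveat mentioned above: one possible workaround, which is not part of the script and is offered only as a sketch, is a pre-pass that rewrites titles containing no lower-case letters before they go through the title-casing step. The name `normalize_all_caps` is made up here, and acronyms that should stay upper-case will be flattened too.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical pre-pass: if a title has letters but no lower-case ones,
# lower-case it and capitalize the first letter of each word. Titles that
# already contain lower-case letters pass through untouched.
sub normalize_all_caps
{
    my ($title) = @_;
    return $title if $title =~ /[a-z]/ || $title !~ /[A-Z]/;

    $title = lc $title;

    # Capitalize the first letter of each word, skipping leading punctuation.
    $title =~ s/(^|\s)([^a-z0-9]*)([a-z])/$1$2\u$3/g;
    return $title;
}

print normalize_all_caps('THE END OF THE WORLD (AS WE KNOW IT)'), "\n";
print normalize_all_caps('lemonjelly.ky'), "\n";
```

The result is capitalized on every word, so it would still need the article and preposition rules applied afterwards.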



Attachments
214451-titlecap.pl (154 downloads)

_________________________
- Tony C
my empeg stuff

#215508 - 07/05/2004 07:16 Re: FreeDB Vs CDDB [Re: Phoenix42]
Phoenix42
veteran

Registered: 21/03/2002
Posts: 1424
Loc: MA but Irish born
Wow, this turned down a path I hadn't put much thought into, though that's a good reflection on the community that it did.

The reason I was asking is the issues with FreeDB that Matt & Eric listed. So I've been looking for a way of having my cake and eating it too: getting the secure ripping of EAC and the accuracy of CDDB. The only things I've found are the many front ends to Cdparanoia.

This of course opens up a whole other can of worms, me and Linux, but I'll go play with some stuff first.

#215509 - 07/05/2004 07:26 Re: FreeDB Vs CDDB [Re: Phoenix42]
Dignan
carpal tunnel

Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
Yeah, all I can say is that when I'm ripping a new CD, I grab the info from FreeDB, then I either rename within EAC or rip and then rename with MP3 Tag Studio, always with a handy AMG album listing available.
_________________________
Matt

#215510 - 07/05/2004 07:42 Re: FreeDB Vs CDDB [Re: Dignan]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5683
Loc: London, UK
always with a handy AMG album listing available

MP3 Tag & Rename will parse album info from AMG. It doesn't work too well with compilations, but it's OK.
_________________________
-- roger

#215511 - 07/05/2004 07:47 Re: FreeDB Vs CDDB [Re: Roger]
Dignan
carpal tunnel

Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
Really? Whoa, I'll have to check that out! How good is it usually?
_________________________
Matt

#215512 - 07/05/2004 07:53 Re: FreeDB Vs CDDB [Re: Dignan]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5683
Loc: London, UK
It's usually OK. Basically, it brings up a browser window. You then find the relevant album, using AMG's normal tools. Then there's a button on the browser that grabs the info (by scraping the HTML, AFAICT) and puts it in the tags.

It doesn't work with compilations, because AMG lists them as tracks 1-40 (e.g.), rather than Discs 1 and 2, each with 20 tracks.

It works OK, but I'm not a big fan of MP3 Tag & Rename anyway.
_________________________
-- roger

#215513 - 07/05/2004 08:06 Re: FreeDB Vs CDDB [Re: Roger]
Dignan
carpal tunnel

Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
Ah, didn't see that it was Tag & Rename. I'm not a fan either. I wonder if Tag Studio will ever have that added. I don't have many compilations, so it wouldn't bother me.
_________________________
Matt

#215514 - 07/05/2004 11:28 Re: FreeDB Vs CDDB [Re: Phoenix42]
tfabris
carpal tunnel

Registered: 20/12/1999
Posts: 31596
Loc: Seattle, WA
and the accuracy of CDDB
Oooohoohahahaha! ahahaha! Good one! Ahahahah! Oh, you just slay me!

_________________________
Tony Fabris

#215515 - 07/05/2004 11:31 Re: FreeDB Vs CDDB [Re: tfabris]
Phoenix42
veteran

Registered: 21/03/2002
Posts: 1424
Loc: MA but Irish born
Maybe I should have said relative accuracy of CDDB when compared with FreeDB.

Tough crowd tonight :-)
