Issues that cause problems - known bad naming

Post  anomander on Fri Apr 12, 2013 2:12 am

I thought it might be worth capturing some of the file naming I have seen that causes CT problems, and, where I can guess, why.

2000AD 1234.cbz

The series is actually "2000 AD" but is often/mostly/always seen as "2000AD". Complete fail unless you rename first.


GIJoe - anything.cbz

The removal of dots from the series name combined with the lack of space between G.I. and Joe causes this one to fail.


Archie xxxx.cbz

This one might just be me with my very, very slow internet connection, but I can never get any matches. CT just hangs, and I assume it is because there are so many issues in the series.


Action Comics xxxx.cbz or Action xxxx.cbz

I assume this is the same problem as Archie above.

Post  ctjimmy on Fri Apr 12, 2013 4:35 am

I actually hit this exact same problem yesterday when I started to index my 2000 AD collection. It also doesn't help that Comic Vine labels the specials as '2000AD' but the issues as '2000 AD'. I got round it by using the command line and forcing the series name via --metadata=series="2000 AD"

Jimmy

Post  ComicTagger on Fri Apr 12, 2013 12:37 pm

@anomander, yes, if you want to match series names on Comic Vine, each word in the title is a search term. Periods don't matter, so "GI Joe" works as well as "G.I. Joe". As I understand it (though I'm not sure), the CV editors try to stay consistent with the indicia. It is a wiki, so if you find things that are incorrect, you can go and change them. Initially, your edits will be moderated.

(CT is just passing your strings to the CV search API. You can experiment in the GUI just by entering some text into the "Series" field, and clicking "search". This will give you a good idea of what matches what.)
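To see why "GIJoe" and "2000AD" miss while "GI Joe" hits, here is a toy model of the word-per-term rule described above. It is only an illustration: the real matching is done by the Comic Vine search API, and the function names here (search_terms, all_terms_present) are made up for this sketch.

```python
import re

def search_terms(series_name):
    """Split a series name into lowercase search terms, ignoring periods."""
    # Drop periods first, so "G.I. Joe" becomes "GI Joe".
    without_periods = series_name.replace(".", "")
    # Then split on anything that is not a letter or a digit.
    parts = re.split(r"[^A-Za-z0-9]+", without_periods)
    return [part.lower() for part in parts if part]

def all_terms_present(query, title):
    """True if every term of the query is a whole word of the title."""
    title_terms = set(search_terms(title))
    return all(term in title_terms for term in search_terms(query))

for name in ["G.I. Joe", "GI Joe", "GIJoe", "2000 AD", "2000AD"]:
    print(name, "->", search_terms(name))

print(all_terms_present("GI Joe", "G.I. Joe"))  # both split into gi + joe
print(all_terms_present("GIJoe", "G.I. Joe"))   # "gijoe" is a single term
```

In this model "GIJoe" collapses into the single term "gijoe", which is not a word of "G.I. Joe", so nothing lines up until the space is restored.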

As Jimmy pointed out, you don't have to rename: you can indicate which title to search for, in various ways. On the command-line:

Code:
ComicTagger -s -t cr -m "series=GI Joe"  -o -v "GI Joe *.cbz"

This is also possible in the GUI. For Auto-Identify, it works one file at a time, just by changing the Series name field. For Auto-Tag, there is an entry in the dialog that comes up.

Also when matching, you can change the setting for "Name Length Match Tolerance":

"Default Name Length Match Tolerance is for eliminating automatic search matches that are too long compared to your series name search. The higher it is, the more likely to have a good match, but each search will take longer and use more bandwidth. Too low, and only the very closest lexical matches will be explored."

If you make this a higher number, say 20, and search for "2000", every title that has "2000" in it and that is up to 24 characters (20+4) will be considered in the cover matching process.
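As a rough sketch of that arithmetic (not CT's actual code, which may normalise names before comparing them), the filter amounts to keeping titles that contain the search text and are at most "tolerance" characters longer than it. The function names and the sample titles below are made up for illustration.

```python
def within_length_tolerance(search_name, candidate_title, tolerance):
    """True if the candidate is at most `tolerance` characters longer than the search."""
    return len(candidate_title) <= len(search_name) + tolerance

def filter_candidates(search_name, titles, tolerance):
    """Keep titles that contain the search text and pass the length check."""
    return [
        title
        for title in titles
        if search_name.lower() in title.lower()
        and within_length_tolerance(search_name, title, tolerance)
    ]

# Made-up candidate titles, only to exercise the filter.
TITLES = [
    "2000 AD",
    "2000AD",
    "2000 AD Presents",
    "The Year 2000 Spectacular Holiday Special",
]

# Searching for "2000" (4 characters) with a tolerance of 20 allows up to 24.
print(filter_candidates("2000", TITLES, 20))
# A tolerance of 2 allows only up to 6 characters.
print(filter_candidates("2000", TITLES, 2))
```

The trade-off is the one in the quoted help text: a larger tolerance lets more candidates through to cover matching, which costs time and bandwidth.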

This is in the settings in the GUI, or you can manually edit the "id_length_delta_thresh" entry in ~/.ComicTagger/settings.

I don't know why you're having a problem with "Action Comics". That works just fine for me. Maybe you should clear your cache, especially if you've been running SVN checkouts.

There is a bug though, that is exposed by listing the "Archie" series from 1960. I'll look into it.

Post  anomander on Sun Apr 14, 2013 7:29 am

Excellent tips.

I only have one process left that requires correctly named issues (a Perl script that moves issues into series folders). Once I have deprecated this and moved to using solely resolved metadata, I will hardly care about file names any more.

On the topic of these specific issues, though, would you consider maintaining a central corrections list, much like Sickbeard etc. do?

The list would not have to be extensive, but if it were an ini-type file, the user base could help maintain it.

Post  ComicTagger on Sun Apr 14, 2013 4:35 pm

anomander wrote:
"On the topic of these specific issues, though, would you consider maintaining a central corrections list, much like Sickbeard etc. do?

The list would not have to be extensive, but if it were an ini-type file, the user base could help maintain it."

How would it work? Can you give an example of what would be in the list? I'm not keen on implementing content-specific stuff directly into the app, but if we could figure out a way to centrally manage this file, I'm not against the idea.

Post  anomander on Mon Apr 15, 2013 7:05 am

Essentially, I imagine a file of matches and replacements. The matches could be very simple case-sensitive strings or more advanced regex patterns. I would suggest regex is the better choice: even though it is much harder for users, it makes the feature complete immediately.

So, using a known example, the simplest config file could contain:

Code:

"2000AD"|"2000 AD"

Using regex is far more powerful, though:

Code:

"(2000)([aA][dD])"|"$1 AD"

This example is made overcomplicated just to highlight the captured matches.

The question of when to apply these matches is up for debate. It could be immediately, before CT does anything else, or only when a match fails.
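A minimal sketch of how such a file could be read and applied, covering both timing options. Everything here is hypothetical: the function names, the demo_search stand-in for a real Comic Vine lookup, and the translation of the $1 back-reference style into Python's \g<1> form. Nothing like this exists in CT as far as this thread shows.

```python
import re

# The two example rules from this post, one per line: "match"|"replacement".
SAMPLE_RULES = '''
"2000AD"|"2000 AD"
"(2000)([aA][dD])"|"$1 AD"
'''

RULE_LINE = re.compile(r'^"(?P<match>.*)"\|"(?P<replace>.*)"$')

def parse_rules(text):
    """Turn rule lines into (compiled pattern, replacement) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parsed = RULE_LINE.match(line)
        if parsed is None:
            raise ValueError(f"unparseable rule: {line!r}")
        # The rules write back-references as $1; Python's re expects \g<1>.
        replacement = re.sub(r"\$(\d+)", r"\\g<\1>", parsed.group("replace"))
        rules.append((re.compile(parsed.group("match")), replacement))
    return rules

def apply_rules(series_name, rules):
    """Apply every rule in order and return the rewritten series name."""
    for pattern, replacement in rules:
        series_name = pattern.sub(replacement, series_name)
    return series_name

def find_series(series_name, rules, search, only_on_failure=True):
    """Search with the transforms applied up front, or only after a failed search."""
    if not only_on_failure:
        return search(apply_rules(series_name, rules))
    results = search(series_name)
    if results:
        return results
    fixed = apply_rules(series_name, rules)
    return search(fixed) if fixed != series_name else results

def demo_search(name):
    """Stand-in for a real lookup: it only knows one series."""
    return ["2000 AD"] if name == "2000 AD" else []

rules = parse_rules(SAMPLE_RULES)
print(apply_rules("2000ad", rules))
print(find_series("2000AD", rules, demo_search))
```

Applying the rules only on failure costs an extra search for badly named files but never alters a name that already matches; applying them up front is simpler but means a bad rule can break a name that would have worked.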

It would be a relatively simple matter for the community to participate in maintaining a list of replacements. I would not expect it to be very long.

The beauty though is that as soon as one person fixes a common problem it is fixed for everyone participating.

Post  ComicTagger on Mon Apr 15, 2013 2:29 pm

I think this idea has some merit, but I want to let it stew in my mind for a while. Depending on how it's implemented, it could add a bit of complexity: there is list management per install (which would be in the GUI and probably the CLI), decisions about when to apply the transforms, and hosting and management/ownership of the master list.

If you want, start a new thread, and start collecting a list of transforms (not necessarily regex, just human-readable) based on real-world examples. This would definitely add weight to this as a feature request.

Post  anomander on Tue Apr 16, 2013 11:18 am

Forked to this thread: http://comictagger.forumotion.com/t79-transforms-fixes-for-common-differences-between-local-filename-and-comic-vine

I will attempt to maintain a master list for ease in the interim. If this idea gets rejected, let me know so I don't end up doing this for no purpose.

Kudos

Post  ComicTagger on Wed Apr 17, 2013 12:15 am

OK, the other thread looks good. Let's wait and see what other people come up with.

I was thinking that it would be possible to write an add-on script, instead of bringing this into the app proper. The script would use the already existing filename parser and renaming functions, and would just apply the transforms. So if you had an automated script you wanted to run on a batch of files, you would run the transform script first. This might be a simpler way to test this feature out, especially if demand is low.
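A rough idea of what the renaming half of such a script might look like. This sketch does not use CT's own filename parser or renaming functions, since their interfaces are not shown in this thread; it stands in with a plain literal-prefix check. The TRANSFORMS table and both function names are hypothetical, and the script only plans renames, it does not touch any files.

```python
from pathlib import PurePath

# Hypothetical literal transforms: name as found in files -> name to search for.
TRANSFORMS = {
    "2000AD": "2000 AD",
    "GIJoe": "GI Joe",
}

def transformed_name(filename):
    """Return the filename with a known bad series prefix corrected."""
    path = PurePath(filename)
    stem = path.stem
    for bad, good in TRANSFORMS.items():
        # Only rewrite when the bad name is the whole stem or is followed
        # by a space, so that "2000ADX 1.cbz" is left alone.
        if stem == bad or stem.startswith(bad + " "):
            return str(path.with_name(good + stem[len(bad):] + path.suffix))
    return filename

def plan_renames(filenames):
    """List (old, new) pairs for the files a transform would change."""
    plan = []
    for old in filenames:
        new = transformed_name(old)
        if new != old:
            plan.append((old, new))
    return plan

batch = ["2000AD 1234.cbz", "GIJoe - anything.cbz", "Action Comics 001.cbz"]
for old, new in plan_renames(batch):
    print(old, "->", new)
```

Running the plan step first and reviewing its output before any actual rename keeps a bad transform from mangling a whole batch.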

Post  anomander on Wed Apr 17, 2013 3:13 am

An add-on script is definitely flexible. On the other hand, what you want is for CT to be more reliable for the first-time user.

It is the age-old scope creep vs. useful feature debate.