Lists: Sorting list by title but ignoring articles

  • 11
  • Idea
  • Updated 5 years ago
  • (Edited)
Is there any way to sort lists by title but have it ignore articles like "the", "a", etc.? A lot of movies begin with "the" and it's annoying having them all listed in the T's. I think MyMovies used to do this by default, but it does not seem to be implemented in the Lists. :-( If this is not presently possible, can it be added to the wishlist?

Jim Beam

  • 2 Posts
  • 0 Reply Likes
  • sad

Posted 8 years ago


Giancarlo Cairella, Official Rep

  • 1110 Posts
  • 1054 Reply Likes
Just to provide some historical perspective/context: until a few years ago, titles used to be stored internally with the article appended at the end, e.g.

Blues Brothers, The (1980)
Dolce vita, La (1960)
Boot, Das (1981)
Mariachi, El (1992)
Clockwork Orange, A (1971)
etc.

This was better when generating/displaying alphabetized lists (as it eliminated the inconvenience you are describing) but it was not devoid of complications.

For display purposes, the title needed to be converted to the more readable format (e.g. The Blues Brothers, Das Boot, La dolce vita). Storing titles this way made sorting better in many cases, but presented its own set of challenges. The article needed to be put back at the beginning when displaying the title, but different languages have different rules for that (e.g. French titles beginning with an indefinite article needed to be alphabetized under the letter U, but those beginning with the definite article needed to be alphabetized by the first letter of the following noun). Capitalization rules also vary, creating all sorts of complications: when converting an Italian or Spanish title you'd have to change the first letter of the title to lowercase when prefixing an article, but you'd need to keep it uppercase for English and German.

Also, detecting which titles had an article that needed to be moved to the front versus those which didn't was a challenge in itself, requiring built-in exceptions (otherwise a title like "Die, Mommie, Die" would be rendered as "Die Die, Mommie" unless the system was made aware that 'Die' here was not a German definite article).
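
To make the two problems above concrete, here is a minimal Python sketch of the stored-to-display conversion. It is not the code the site actually used: the article list, the lowercase rule for Italian and Spanish, the exception set and the function name `to_display` are all assumptions for illustration, and the year suffix is assumed to have been stripped already.

```python
# Hypothetical, deliberately short article list; a real one would be far longer.
KNOWN_ARTICLES = {"the", "a", "an", "das", "der", "die", "la", "el", "il", "lo"}

# Languages where the word after the restored article is lowercased
# (Italian and Spanish, per the description above).
LOWERCASE_LANGS = {"it", "es"}

# Stored titles whose trailing word only looks like an article.
EXCEPTIONS = {"Die, Mommie, Die"}


def to_display(stored: str, lang: str) -> str:
    """Turn a stored title such as 'Blues Brothers, The' into 'The Blues Brothers'."""
    if stored in EXCEPTIONS:
        return stored
    # Split on the LAST ", " so commas inside the title are left alone.
    head, sep, tail = stored.rpartition(", ")
    if not sep or not head or tail.lower() not in KNOWN_ARTICLES:
        return stored
    if lang in LOWERCASE_LANGS:
        head = head[0].lower() + head[1:]
    return f"{tail} {head}"


print(to_display("Blues Brothers, The", "en"))
print(to_display("Dolce vita, La", "it"))
print(to_display("Die, Mommie, Die", "en"))
```

Without the `EXCEPTIONS` entry, the last call would split off the trailing "Die" and produce "Die Die, Mommie", which is exactly the failure described above.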

Last but not least, a non-trivial number of users objected to this format and complained about the sort order being inaccurate because titles like 'La dolce vita' appeared under 'D' and not 'L', where they wanted/expected them to be.

To be honest, I don't remember exactly when we switched to the current format (possibly around 2005). But I do know that apart from the 'alphabetical list' inconvenience of having many titles grouped under the letter T for 'The' etc. (which is not necessarily a problem for some users, as explained above) it did not generate significant complaints or feedback.

I'm not saying that the current system is necessarily better, although it did simplify the way we approached certain technical choices; just pointing out there is no clear-cut solution and that each approach has its shortcomings. Ideally, of course, we would be able to store both versions of a title (one for sorting, one for display), but adding/backfilling all that information would be a non-trivial task.

Peter, Champion

  • 8207 Posts
  • 10719 Reply Likes
The switch happened in April 2009.

DavidAH_Ca, Champion

  • 3263 Posts
  • 2925 Reply Likes
I recall a number of complaints when the switch was made - almost certainly more than the 'non-trivial' number who complained about the correct sort.

I have mentioned several times that a better method than moving the leading article or storing two different versions of the title is storing a number that indicates how many characters should be skipped when creating the sort key. (This system is used by libraries in North America -- the specific version is in the MARC 21 system.)

Most titles would have a code of 0, and creating the corrections for the English, French, & German articles should be easy to do automatically (although with a manual check for 'Die' and 'Les').

This system would also allow titles like *batteries not included to be moved to the B's instead of being at the beginning of the list.
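
As a rough illustration of the skip-count idea, here is a short Python sketch. The `Title` class, its `skip` field and the `sort_key` function are made-up names for this example, not anything IMDb or MARC 21 defines; the skip values are entered by hand here.

```python
from dataclasses import dataclass


@dataclass
class Title:
    text: str      # the title exactly as displayed
    skip: int = 0  # how many leading characters to ignore when sorting


def sort_key(t: Title) -> str:
    # Drop the skipped prefix, then compare case-insensitively.
    return t.text[t.skip:].casefold()


titles = [
    Title("The Blues Brothers", 4),       # skip "The "
    Title("Das Boot", 4),                 # skip "Das "
    Title("*batteries not included", 1),  # skip the asterisk
    Title("Die Hard", 0),                 # "Die" is not an article here
    Title("A Clockwork Orange", 2),       # skip "A "
]

ordered = [t.text for t in sorted(titles, key=sort_key)]
print(ordered)
```

The display title is never altered, so none of the capitalization or article-restoring rules come into play; only one small integer per title has to be stored.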

I think IMDb should seriously consider creating a system like this.

Emperor, Champion

  • 6418 Posts
  • 3026 Reply Likes
Indeed. You could make the manual check less laborious by keeping it language-specific: "die" would only be flagged up for German-language titles, which would avoid all the Die Hard and Die X Die false positives. Oddities like Les Miserables could also be decided upon manually - would most English speakers expect it to be under L? Quite possibly. Whereas would most people expect "Les liaisons dangereuses" to be sorted under Li?
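
A language-specific check along those lines might look like the following Python sketch. The article lists, the `REVIEW` set and the function `suggest_skip` are assumptions for illustration only; French indefinite articles are left out on purpose (following the earlier remark that such titles were filed under U), and elided forms such as "L'" are not handled.

```python
from typing import Tuple

# Hypothetical per-language article lists; each title is assumed to carry a language code.
ARTICLES_BY_LANG = {
    "en": ("the", "a", "an"),
    "fr": ("le", "la", "les"),
    "de": ("der", "die", "das"),
}

# (language, article) pairs that a person should confirm before the skip is trusted.
REVIEW = {("de", "die"), ("fr", "les")}


def suggest_skip(title: str, lang: str) -> Tuple[int, bool]:
    """Return (characters to skip, needs manual review) for a title."""
    first, sep, rest = title.partition(" ")
    if not sep or not rest:
        return 0, False
    word = first.lower()
    if word not in ARTICLES_BY_LANG.get(lang, ()):
        return 0, False
    # Skip the article plus the space that follows it.
    return len(first) + 1, (lang, word) in REVIEW


print(suggest_skip("Die Hard", "en"))
print(suggest_skip("Les liaisons dangereuses", "fr"))
print(suggest_skip("Das Boot", "de"))
```

Because "die" is only looked up in the German list, an English-language title such as Die Hard is never flagged at all.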

I'd also add an option to allow people to opt out of the definite-article-ignoring sort. You can guarantee someone will moan about this, even though I'd imagine most people would appreciate it.

Lady Aleena

  • 22 Posts
  • 16 Reply Likes
DavidAH_Ca: I always include the * when sorting titles, which puts *Batteries Not Included at the top. A, An, and The, though, are always going to be the bane of sorting my lists because they are not ignored while sorting.

DavidAH_Ca, Champion

  • 3263 Posts
  • 2925 Reply Likes
Lady Aleena:

That is really a matter of personal preference: I prefer to see *batteries not included with the B's because when I say or think the title I think "batteries not included" not "asterisk batteries not included", so in my database I sort starting with the B. You might well prefer to sort on the asterisk.

Perhaps a better example is And God Created Woman. There are two versions of this with two different titles:
the 1956 version is: ...And God Created Woman but
the 1988 version is: And God Created Woman.

In a straight sort of the titles in my database, the first title is listed right at the top (9th) while the 1988 version is well down into the A's (766th). Personally I would prefer to have them show up together, as I see the ellipsis as no more important than the article, so I have set my system to ignore it. This places the titles together (575th & 576th) which makes sense to me.
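
For anyone wanting to try the same thing, here is a small Python sketch of a sort key that ignores leading punctuation. It is only a guess at the kind of rule involved, not a description of the database setup above; the function name `punctuation_free_key` and the sample list are invented for the example.

```python
def punctuation_free_key(title: str) -> str:
    """Sort key that skips any leading characters that are not letters or digits."""
    i = 0
    while i < len(title) and not title[i].isalnum():
        i += 1
    return title[i:].casefold()


titles = [
    "...And God Created Woman",
    "Zelig",
    "Alien",
    "And God Created Woman",
]

plain = sorted(titles)                               # the ellipsis sorts before any letter
ignoring = sorted(titles, key=punctuation_free_key)  # the ellipsis is skipped

print(plain)
print(ignoring)
```

In the plain sort "Alien" lands between the two versions; with the custom key they sit next to each other.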