Deprecation of encoding attributes for alternate titles

  • Announcement
  • Updated 2 months ago
We're pleased to announce a simplification to the contribution guidelines for alternate titles.

We recently upgraded our systems so that the following attributes are no longer needed for correct character encoding. We have therefore replaced all of these attributes with "imdb display title" (or "alternative title" if there was already an imdb display title), and removed the options from the contribution form:
  • ISO-LATIN-2 title
  • Cyrillic KOI8-R title
  • Greek ISO-8859-7 title
  • Turkish ISO-8859-9 title
  • original ISO-LATIN-2 title
  • original Cyrillic KOI8-R title
  • original Turkish ISO-8859-9 title
  • original Greek ISO-8859-7 title
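
For context, each retired attribute named a legacy single-byte encoding, in which one byte value can stand for different characters depending on the encoding. The Python sketch below is illustrative only and says nothing about how IMDb's systems are actually built. It uses Python's standard codec names; the mapping from attribute label to codec and the sample byte 0xFD are assumptions chosen for the example.

```python
# Assumed mapping from the retired attribute labels to Python's standard codecs.
LEGACY_CODECS = {
    "ISO-LATIN-2": "iso8859_2",
    "Cyrillic KOI8-R": "koi8_r",
    "Greek ISO-8859-7": "iso8859_7",
    "Turkish ISO-8859-9": "iso8859_9",
}

def decode_legacy(raw: bytes, label: str) -> str:
    """Decode legacy single-byte text into a Unicode string."""
    return raw.decode(LEGACY_CODECS[label])

# The same byte decodes differently depending on which encoding is assumed,
# which is why raw bytes needed an encoding label next to them.
same_byte = {label: decode_legacy(b"\xfd", label) for label in LEGACY_CODECS}

# A Unicode string needs no such label: it round-trips through UTF-8 unchanged.
title = "\u0414\u0438\u0432\u043d\u0456 \u0434\u0438\u0432\u0430"
round_tripped = title.encode("utf-8").decode("utf-8")
```

Once every title is held as Unicode, characters from all four alphabets can coexist in one field, so a per-title encoding attribute carries no extra information.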

We have also changed the transliterated ISO-LATIN-1 attribute to just "transliterated title" and created a guideline for this attribute here.
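
The replacement rule for the retired attributes can be sketched in Python as follows. This is a hedged illustration, not IMDb's actual migration code: the `AltTitle` record and the `migrate` function are hypothetical names, the "same country and language" condition comes from the employee reply later in this thread, and the tie-break (the first retired title for a free country/language pair becomes the display title) is an assumption.

```python
from dataclasses import dataclass, replace

# Attribute names copied from the list of retired attributes in the announcement.
RETIRED = {
    "ISO-LATIN-2 title",
    "Cyrillic KOI8-R title",
    "Greek ISO-8859-7 title",
    "Turkish ISO-8859-9 title",
    "original ISO-LATIN-2 title",
    "original Cyrillic KOI8-R title",
    "original Turkish ISO-8859-9 title",
    "original Greek ISO-8859-7 title",
}
DISPLAY = "imdb display title"
ALTERNATIVE = "alternative title"

@dataclass(frozen=True)
class AltTitle:
    """Hypothetical shape of one alternate-title record."""
    title: str
    country: str
    language: str
    attribute: str

def migrate(titles: list[AltTitle]) -> list[AltTitle]:
    """Replace retired attributes with a display or alternative attribute."""
    # Country/language pairs that already have an imdb display title.
    taken = {(t.country, t.language) for t in titles if t.attribute == DISPLAY}
    migrated = []
    for t in titles:
        if t.attribute in RETIRED:
            key = (t.country, t.language)
            # Assumption: the first retired title for a free pair wins the display slot.
            new_attribute = ALTERNATIVE if key in taken else DISPLAY
            taken.add(key)
            t = replace(t, attribute=new_attribute)
        migrated.append(t)
    return migrated
```

Titles whose attribute is not in the retired set pass through unchanged.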

As always, we would welcome details of any issues you find, or feedback you have via this thread.

Many thanks!
Mike

Mike, Employee

Posted 2 months ago


Owen Rees

Really good to see this. I have a good idea how much had to be done behind the scenes to make this possible.

Should this be an announcement rather than a question? It looks like it to me.

MAthePA

Hi, Mike.

Issues concerning Ukrainian titles:

1) A contributor to films (originally in the Ukrainian language) is constantly attributing as "alternative transliteration" those titles that are in fact "transliterated title". I'm getting tired of correcting these, there are so many... The most recent ones: Human with a Stool (I corrected this one) and Skvot32 (I left this one for editors to examine more deeply);

2) A data-editor manipulates the "imdb display" attribute, changing it for some of the alternate titles that were submitted only to change the encoding. Is this same editor going to revise all the Russian titles that used to be encoded as KOI8-R now that this attribute is dropped? I believe not, so why does he/she touch the attributes of Ukrainian titles this way when they are submitted for encoding correction only? Nonsense! Moreover, the attribute is often switched incorrectly to "alternative", and then no localized title appears for those movies that were screened in theaters under such titles. It's a kind of vandalism legalized by a data-editor;

3) The whole process is sabotaged by a data-editor who handles those submissions that automatically go for deeper review. I can say that 99% of those not approved within a short time are then declined for no reason. When I explain just "Correction of Latin letters to Ukrainian letters", they decline as "duplicate", but then there is no change to that title at all, so it was not in fact a "duplicate". When I explain "not a duplicate", they decline as "no reason" or "can not verify", and so on in a circle. Some of the titles (a third of them) are then approved after a second or third re-submission, but not all of them with a correct attribute.

PS: And I believe this same editor does not really care what he/she does, because I also have declines for release dates submitted during this process, and those dates, as recorded on the official BoxOffice, are DECLINED. Someone is just having fun on a paid basis.

Mike, Employee

Hi

Many thanks for raising these issues!

> "A contributor to films (originally in the Ukrainian language) is constantly attributing as "alternative transliteration" those titles that are in fact "transliterated title". I'm getting tired of correcting these, there are so many... The most recent ones: Human with a Stool (I corrected this one) and Skvot32 (I left this one for editors to examine more deeply);"

Agreed, it does seem the (alternative transliteration) attribute is unnecessary in these cases, and that (transliterated title) is more suitable. We will investigate how best to address this and respond in the new year.

> "A data-editor manipulates the "imdb display" attribute, changing it for some of the alternate titles that were submitted only to change the encoding. Is this same editor going to revise all the Russian titles that used to be encoded as KOI8-R now that this attribute is dropped? I believe not, so why does he/she touch the attributes of Ukrainian titles this way when they are submitted for encoding correction only? Nonsense!"
...and > "PS: those titles changed automatically to "imdb display title" still need the Latin symbols to be corrected. Such changes are also partially sabotaged by a data-editor."

Please could you provide one or two examples? We just want to make sure we fully understand the problem here. We agree the Ukrainian titles should be fixed to replace the Latin characters with Ukrainian equivalents. We could fix all of these in bulk if that seems like the right approach.

> "Moreover, the attribute is often switched incorrectly to "alternative", and then no localized title appears for those movies that were screened in theaters under such titles. It's a kind of vandalism legalized by a data-editor;"

For all the existing alternate titles with attribute (Cyrillic KOI8-R title) and (original Cyrillic KOI8-R title), we changed the attribute to “imdb display title”. If there was already an imdb display title for the same country and language then we used “alternative title” instead. Are you still seeing any issues with these? One or two examples would really help us investigate.

> "The whole process is sabotaged by a data-editor who handles those submissions that automatically go for deeper review. I can say that 99% of those not approved within a short time are then declined for no reason. When I explain just "Correction of Latin letters to Ukrainian letters", they decline as "duplicate", but then there is no change to that title at all, so it was not in fact a "duplicate". When I explain "not a duplicate", they decline as "no reason" or "can not verify", and so on in a circle. Some of the titles (a third of them) are then approved after a second or third re-submission, but not all of them with a correct attribute."

This seems strange. Sorry to ask, but again one or two examples (a submission reference) would really help us investigate.

Again, many thanks for raising these issues. We look forward to resolving them. We also appreciate your patience if responses are delayed until the new year.

Many thanks!

MAthePA

Hi

Referring to the above [ 2) A data-editor manipulates the "imdb display" attribute changing this for some of the alternate titles submitted initially to change only the encoding...... ]
I could gather more statistics after Christmas or in 2020, when I have more free time. For now, some examples I remember:

(I) https://contribute.imdb.com/updates?update=tt4574334:akas.correct
"Дивнi дива" (2016) (Ukraine) (alternative title)
as it is now: 

and how it was initially submitted (# 191207-233741-595000, 191208-125934-504000, 191209-115713-134000): 



(II) https://contribute.imdb.com/updates?update=tt1212450:akas.correct
Найп'янкiший округ у свiтi (2012) (Ukraine) (alternative title) 
as it is now: 

and how it was initially submitted (#191213-084016-745000): 


(III) https://contribute.imdb.com/updates?update=tt1216487:akas.correct
Дiвчина, яка грала з вогнем (2009) (Ukraine) (alternative title) 
as it is now: 

and how it was initially submitted (#191213-083648-857000): 


(IV) https://contribute.imdb.com/updates?update=tt1220888:akas.correct
Кримiнальна фiшка вiд Генрi (2011) (Ukraine) (alternative title) 
as it is now: 

and how it was initially submitted (#191213-082923-663000): 


None of the above were duplicates. None of the above were approved, so the titles still included the Latin symbols, as at the time of submission, until I resubmitted them much later (today and a couple of days before).

MAthePA

Referring to the above [ 3) The whole process is sabotaged by a data-editor who operates those submissions that automatically go for a deeper control. I can say, 99% of those not approved in short time then are declined for no reasons...... ]
The bold-italic phrase above says it all. I'm not sure exactly what examples are expected from me. Could you please check my contribution history for the period December 3-12? On those days I mostly contributed corrections to alternate titles. The majority of those declined were approved after the 2nd or 3rd resubmission a few days later.

=======================================

Mike, please initiate the bulk processing for the rest of the Ukrainian Cyrillic alternate titles to replace the Latin "i" letters with Ukrainian Cyrillic "і" letters (respecting the caps).

I have manually finished correcting all the complicated cases of substitutions that had been imitating the Ukrainian letters (і, ї, є), such as the double "ii", "i" followed by an apostrophe, Latin "e", and Russian "э". The final results were checked, submitted and approved.
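
The simple bulk replacement requested above (Latin "i"/"I" to Cyrillic "і"/"І", respecting case) could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the procedure IMDb used: `fix_word` and `fix_title` are hypothetical helper names, and the rule "only touch words that already contain a Cyrillic letter" is an assumed safeguard so that genuinely Latin words inside a title are left alone.

```python
import re

# Latin look-alikes mapped to the Ukrainian Cyrillic letters, one entry per case.
LATIN_TO_CYRILLIC = str.maketrans({"i": "\u0456", "I": "\u0406"})

# Any character from the basic Cyrillic block.
CYRILLIC = re.compile(r"[\u0400-\u04FF]")

def fix_word(word: str) -> str:
    """Replace Latin i/I only in words that also contain Cyrillic letters."""
    if CYRILLIC.search(word):
        return word.translate(LATIN_TO_CYRILLIC)
    return word

def fix_title(title: str) -> str:
    """Apply fix_word to every word, preserving the original spacing."""
    # The capturing group makes re.split keep the whitespace runs.
    return "".join(fix_word(part) for part in re.split(r"(\s+)", title))
```

A standalone Latin "i" used as the Ukrainian conjunction would not be caught by this word-level rule, and neither would the complicated substitutions listed above ("ii", "i" plus apostrophe, Latin "e", Russian "э"); those need manual review.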

At this moment, only three related submissions are pending: 191221-184711-477000, 191221-181136-809000, 191221-172444-563000. Once they are approved, the rest of the titles should cause no problems during bulk processing.

Thank you

PS: The last submissions mentioned above have been approved.

MAthePA

Mike, there is no need for bulk processing any more. The titles have been corrected manually.

I also corrected the "alternative" instead of "imdb display" cases, as well as the "alternative transliteration" cases mentioned above (1 and 2). It seems the "alternative transliteration" instances resulted from automatic corrections of titles that were initially contributed incorrectly as English originals, when they should have been contributed as transliterated Ukrainian titles.

MAthePA

Hi again

I was not going to go so deep into this complex issue here, but the quality of responses from "Contact us" has dropped drastically, and it now takes much more time and effort to solve obvious, objective problems.

Mike, referring back to the interest you expressed above:

> "This seems strange. Sorry to ask, but again one or two examples (a submission reference) would really help us investigate."

Please do something to make the data-editor(s) think a bit more, or become more knowledgeable in solving the related problems, if this is not a case of real sabotage on someone's end.

Some recent submissions on alternate titles are declined again:
200106-112132-051000  /  200106-180103-617000  /  200107-151503-641000  /  200107-211030-764000  /  200107-215412-969000  /  200107-225632-662000  /  200107-225738-328000  /  200107-225923-359000  /  200107-230128-022000  /  200107-230238-799000  /  200109-124306-001000  /  200109-132505-619000  /  200109-135444-065000  /  200109-160033-028000

Let's take the first one as an example. Suppose I'm a data editor who knows nothing about the Latin-Cyrillic issue, but there are lots of submissions from a contributor all giving one and the same reason. OK, even if I know nothing about the issue, I could be a thinking person and check whether the reason is valid. At least ONE check; it is not hard at all to just search for the Latin "i" in any modern browser or app:

Voila. As anyone can see, the submitted data differs from the existing data in IMDb. I can hardly believe that a person could be so unmotivated in a paid job as to be too lazy to perform such a simple action even once. It looks more like sabotage, and the specific working hours make me think that there are 1(-2) data editor(s) doing this: the same person(s) who declined all the submissions that were approved later, after the 2nd or 3rd re-submission:
191203-235046-206000  /  191204-130050-914000  /  191204-162032-703000  /  191204-162524-209000  /  191204-165530-650000  /  191204-170209-715000  /  191204-183208-119000  /  191204-184107-251000  /  191204-184240-950000  /  191204-184328-151000  /  191204-200952-647000  /  191204-201356-150000  /  191204-201758-116000  /  191204-203246-816000  /  191204-203338-203000  /  191204-204452-295000  /  191204-211323-311000  /  191204-212928-248000  /  191204-215212-555000  /  191204-221954-884000  /  191204-222207-526000  /  191204-222328-296000  /  191204-230524-827000  /  191204-230620-460000  /  191204-230911-695000  /  191204-232408-959000  /  191204-232731-842000  /  191205-005039-789000  /  191207-141747-572000  /  191207-170802-822000  /  191207-200903-537000  /  191207-203328-998000  /  191207-203613-860000  /  191207-204716-370000  /  191207-204821-088000  /  191207-223338-723000  /  191207-224502-020000  /  191207-231824-569000  /  191207-231927-048000  /  191207-232239-251000  /  191207-233304-111000  /  191207-233647-532000  /  191207-233741-595000  /  191208-005402-770000  /  191208-005959-942000  /  191208-010258-146000  /  191208-010340-345000  /  191208-010458-041000  /  191208-010548-619000  /  191208-010838-361000  /  191208-010912-263000  /  191208-010953-212000  /  191208-011345-608000  /  191208-011638-226000  /  191208-011721-448000  /  191208-012540-978000  /  191208-100421-671000  /  191208-101535-877000  /  191208-102233-550000  /  191208-103414-713000  /  191208-103555-194000  /  191208-105148-755000  /  191208-115311-676000  /  191208-115432-040000  /  191208-115714-905000  /  191208-124006-815000  /  191208-125934-504000  /  191208-131040-247000  /  191208-132835-889000  /  191208-133702-870000  /  191208-142336-969000  /  191208-142412-020000  /  191208-161607-100000  /  191208-161706-841000  /  191208-162251-134000  /  191208-162323-045000  /  191208-162928-955000  /  191208-194822-781000  /  191208-200736-697000  /  191208-214244-070000  /  
191208-214543-087000  /  191208-214720-035000  /  191208-215106-290000  /  191208-221458-321000  /  191208-221720-319000  /  191208-221756-072000  /  191208-221851-519000  /  191208-222434-992000  /  191208-222515-065000  /  191208-222649-604000  /  191208-231523-560000  /  191208-232115-689000  /  191208-232933-291000  /  191208-233018-926000  /  191208-233130-944000  /  191208-233219-690000  /  191208-233352-255000  /  191208-233424-939000  /  191208-233545-139000  /  191208-233827-062000  /  191208-233858-428000  /  191208-233927-262000  /  191208-234353-511000  /  191208-235153-989000  /  191208-235313-651000  /  191208-235530-555000  /  191208-235605-539000  /  191209-001049-858000  /  191209-002516-815000  /  191209-002556-803000  /  191209-093711-815000  /  191209-110154-240000  /  191209-111144-501000  /  191209-111959-961000  /  191209-115713-134000  /  191209-184701-382000  /  191210-113922-316000  /  191210-114402-133000  /  191210-114437-791000  /  191210-114808-258000  /  191210-163246-122000  /  191210-174959-813000  /  191210-180134-698000  /  191210-180831-645000  /  191210-180910-801000  /  191210-181105-314000  /  191210-181140-449000  /  191210-181558-297000  /  191210-181738-918000  /  191210-182649-992000  /  191210-182941-786000  /  191210-183017-912000  /  191210-183634-146000  /  191210-183734-831000  /  191210-183900-373000  /  191210-183949-408000  /  191210-184034-638000  /  191210-184122-406000  /  191210-184200-469000  /  191210-184340-048000  /  191210-184653-827000  /  191210-184729-983000  /  191210-184815-821000  /  191210-185340-218000  /  191210-185514-692000  /  191210-193432-109000  /  191210-193704-438000  /  191210-195050-456000  /  191210-195451-054000  /  191210-195716-883000  /  191210-195847-865000  /  191210-195924-445000  /  191210-200534-309000  /  191210-200649-262000  /  191210-201120-841000  /  191210-201201-930000  /  191210-201519-279000  /  191210-204029-826000  /  191210-223624-618000  /  191210-232810-643000  /  
191210-235253-840000  /  191210-235658-624000  /  191212-092253-530000  /  191212-095604-588000  /  191212-095720-494000  /  191212-102149-578000  /  191212-102629-882000  /  191212-102849-317000  /  191212-104402-543000  /  191212-105930-955000  /  191212-110204-214000  /  191212-110258-438000  /  191212-115924-165000  /  191212-121303-172000  /  191212-122936-310000  /  191212-123239-925000  /  191213-082923-663000  /  191213-083648-857000  /  191213-084016-745000  /  191213-085350-450000  /  191215-205345-409000  /  191216-130548-694000  /  191216-222545-896000  /  200101-184402-247000  /  200102-110429-242000  /  200102-221842-034000  /  200105-155108-478000  /  200105-201245-352000  /  200105-204443-266000  /  200105-210156-435000

If IMDb's owners and officials do not care about the quality of the work done and of the staff they hire and pay, then contributors are investing their time and effort for nothing.
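
The check described in this post (searching a Cyrillic title for a stray Latin letter) can also be scripted. The Python sketch below is a minimal illustration, not a tool used by IMDb: `latin_intruders` is a hypothetical helper, and it classifies characters by the first word of their Unicode character name, which is a simple heuristic rather than a full script lookup.

```python
import unicodedata

def latin_intruders(title: str) -> list[tuple[int, str]]:
    """Return (index, character) for Latin letters in a title that also has Cyrillic."""
    # The first word of a Unicode character name is e.g. LATIN, CYRILLIC, SPACE, DIGIT.
    scripts = [unicodedata.name(ch, "").split(" ")[0] for ch in title]
    if "CYRILLIC" not in scripts:
        return []  # A purely Latin title is not suspicious.
    return [
        (index, ch)
        for index, (ch, script) in enumerate(zip(title, scripts))
        if script == "LATIN"
    ]
```

Two titles that look identical on screen produce different results here when one of them contains the Latin "i" (U+0069) in place of the Cyrillic "і" (U+0456), which is the difference a reviewer would need to see before declining a submission as a duplicate.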

MAthePA

Issue concerning 2 Baltic letters (used to be ISO-LATIN-2): 
Šš Žž (codes U+0160, 0161, 017D, 017E)

They are OK to input, and you see them colored green after "Check this update", but in fact they are automatically turned into the non-accented Ss Zz. Everything else seems to work flawlessly for Baltic languages.
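
The symptom reported above is consistent with a lossy conversion step somewhere after the form check, though the thread does not say what the actual cause was. As a purely illustrative Python sketch (the helper names are hypothetical and this is not a description of IMDb's pipeline), the code below shows two relevant facts: these four letters exist in ISO-8859-2 but not in ISO-8859-1, and a common "strip accents" fallback reduces them to plain S, s, Z, z.

```python
import unicodedata

# The four letters from the report: U+0160, U+0161, U+017D, U+017E.
BALTIC = "\u0160\u0161\u017d\u017e"

def representable(text: str, codec: str) -> bool:
    """True if every character of text can be encoded with the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

def strip_accents(text: str) -> str:
    """Lossy fallback: decompose characters, then drop combining marks such as the caron."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Comparing a submitted string with its stored form code point by code point is a quick way to confirm whether such a conversion has taken place.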

Jonny, Employee

Hi,

Thank you for the report, this has now been fixed.

Thanks,
Jonny

MAthePA

I confirm the whole character set for Baltic languages works now.
Thanks