IMDb Data – Now easily available to contributors

  • Announcement
  • Updated 3 months ago
Today (20 Dec 2018) we are pleased to announce that the IMDb datasets are easier to access and are now available directly from imdb.com. Using the new interface, contributors can bulk-access subsets of IMDb title and name data for personal and non-commercial use. Each dataset file is in a gzipped, tab-separated-values (TSV) format.

To access the datasets and for more information you can go here: https://contribute.imdb.com/czone
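
For anyone who wants to read one of these gzipped TSV files programmatically, here is a minimal Python sketch using only the standard library. The function name `read_tsv_gz` is hypothetical. The sketch assumes the first line of the file is a header row and that missing values are written as the literal string `\N`; neither detail is stated in this thread, so check both against the files you actually download.

```python
import csv
import gzip
from typing import Dict, Iterator, Optional


def read_tsv_gz(path: str) -> Iterator[Dict[str, Optional[str]]]:
    """Yield one dict per data row of a gzipped, tab-separated file."""
    # Open in text mode so the csv module receives decoded lines.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quote characters inside fields as literal text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Assumption: the literal string \N marks a missing value.
            yield {
                key: (None if value == r"\N" else value)
                for key, value in row.items()
            }
```

Because the function is a generator, a large file can be processed row by row (for example `for row in read_tsv_gz(path): ...`) without loading it all into memory.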

Stewart

Stewart, Employee

  • 15 Posts
  • 16 Reply Likes

Posted 4 months ago

Jeorj Euler

  • 6445 Posts
  • 7897 Reply Likes
Thanks!

Jeorj Euler

However, I do hope that the extended datasets can be made available to a slightly broader span of frequent contributors.

Phil G

  • 26 Posts
  • 48 Reply Likes
Unless I'm missing something, there doesn't seem to be anything new here, just the same data that's been available for over 18 months from https://datasets.imdbws.com/ via https://www.imdb.com/interfaces/ (which is still far less data than used to be available via ftp).

Are there still plans to keep Col's promise to make more data available, preferably taking into account all the concerns raised in that thread?

Chris H., Employee

  • 75 Posts
  • 97 Reply Likes
Hello,

    The extended datasets are only visible to folks who have made a large number of contributions recently. Those below that threshold only see the basic datasets. Folks who have not made any contributions don't get to see either.

I hope this helps to clarify this,
Chris.

Jeorj Euler

O...

Phil G

Chris,
Please define 'large number' and 'recently'. Apparently making the top 250 list several times doesn't qualify me: is that because my number of contributions is still too low, or because I've taken a break for a while?

And why aren't the requirements mentioned either in the announcement or on the page itself (there's not even any hint that anything else might be available)?

ACT_1

  • 3059 Posts
  • 2777 Reply Likes

Hi!  Chris H., Official Rep
The extended datasets are only visible to folks
who have made a large number of contributions recently
- - -

? ?

Add to Titles and Name Pages: Created: ____ Updated: ____
for the contributors to see

- - -

Saturday, December 22, 2018

https://www.imdb.com/user/ur96700000/ 96,700,000 IMDb members

https://www.imdb.com/pressroom/stats/
IMDb Statistics
Titles: 5,626,984
Names: 9,572,880

https://www.imdb.com/search/title?release_date=1874-01-01,&sort=release_date,asc
Released at least 1874-01-01
5,095,020 titles

https://www.imdb.com/search/name?gender=male,female
Males/Females (are there Others ?)
5,119,378 names

https://www.imdb.com/title/tt9480000/ 9,480,000 titles
https://www.imdb.com/name/nm10327500/ 10,327,500 names

- - -

https://contribute.imdb.com/czone/hall_of_fame

Contributor Hall of Fame

Only lists
- Top Plot Summary Writers
- Top Mini-Bio Writers

Not Titles, Names, Reviews, and maybe Trivia, Goofs, etc...

- - -

bob the moo
IMDb member since Feb 2001
https://www.imdb.com/user/ur1002035/
https://www.imdb.com/user/ur1002035/reviews  - 8,875 Reviews

Change
https://www.imdb.com/search/name?bio_author=bob+the+moo
to
https://www.imdb.com/user/ur1002035/bios

Change
https://www.imdb.com/search/title?plot_author=bob+the+moo
to
https://www.imdb.com/user/ur1002035/plots

add to :

Quick Links
Ratings ... Lists
Watchlist . Checkins
Reviews ... Poll Responses (add 'Polls Started' and put polls on a separate page)
Bios ...... Plots

bob the moo
Joined on December 21, 2012
https://getsatisfaction.com/imdb/people/bobthemoo

- - -

planktonrules
MartinHafer
Sun Jun 8 2003
https://www.imdb.com/user/ur2467618/
https://www.imdb.com/user/ur2467618/ratings - 22,220 titles
https://www.imdb.com/user/ur2467618/reviews - 22,492 Reviews

.

Brian Risselada

  • 30 Posts
  • 34 Reply Likes
Chris, can you please answer the question of how many a "large number of contributions" is? And can you also please explain why this information would only be made available to them? For a long time it was available to everyone. What was the problem with that?

Phil G

Chris? Stewart? Anyone?

I'd like answers to the questions I asked over 2 weeks ago, please.

What are the requirements for accessing the mysterious 'extended datasets'? Exactly how many contributions are there in 'a large number'? Exactly how many days/months/years ago counts as 'recently'?

But I suspect that whatever your answer, I'll be asking you to reconsider your policy on this. For my own case, I have contributed over 45,000 items to IMDb over the last 5 or 6 years (according to Col's end-of-year reports). I realise that's nowhere near as much as some, but I still consider it to be a very large number, and it was enough to get me into the top 250 contributors list three times in recent years. So I'm disappointed that you seem to be demanding more, before I can access data that wouldn't even exist without contributors like me. I'm sure I'm not the only person in this situation. For all the flaws in the old ftp-based system, at least the data was there and freely available. Why are you now so reluctant to share?

Also, why are you being so secretive about this data? I still have no idea what might be included in these 'extended datasets' and in fact I wouldn't even know that such a thing existed at all if I hadn't asked here. Neither the announcement in this thread nor the datasets page itself makes any mention whatsoever of what might be available or what mystical incantation is required to unlock it. Why is that?


Phil G

If restrictions are necessary, then so be it, but the criteria for access need to be designed carefully so that significant contributors are not left out. But without knowing the current criteria or IMDb's reasons, we're left guessing...

Vincent Fournols

  • 2885 Posts
  • 4789 Reply Likes
To newcomers, I very often claim that IMDb is a collaborative site. But there is also the dark blind opaque side of IMDb: information that never makes it out, concealed roadmaps, unanswered questions, which turn out to be extremely irritating (I leave the mysterious rating computation aside: this I can understand as it could be manipulated).
So I am afraid that the answer will never come.
(And thus I hope to tease and lead the bear to come out of his den !! :D)

ACT_1

Jeorj Euler :
... There may have been some esoteric widespread incidents
of individuals or organizations who have not contributed much,
if anything at all, to IMDb yet built up their own websites
by siphoning off IMDb.
- - -


IMDb could add or already has added special names that are not real
and would not be found any other place but IMDb
IMDb could then do a Giggle Search for that name
and see who is publishing IMDb data as their own ? ?
.


Vincent Fournols

Funny, ACT: I was thinking exactly the same thing and was about to share it :)

ACT_1

Hey! Vincent Fournols

my reply to your reply to: "IMDb could add or already has added special names that are not real"

IMDb Test Titles:
https://www.imdb.com/title/tt4684222/ Probably the Best Test Title Around
https://www.imdb.com/title/tt2418644/ Testing Movie
https://www.imdb.com/title/tt3239160/ Test
https://www.imdb.com/title/tt4242990/ Worldwide Title - A Test Movie (original title)
https://www.imdb.com/title/tt4844946/ A Test Series (2016-2019)
This title does not exist. It has been created for IMDb testing purposes only
https://www.imdb.com/search/keyword?keywords=imdb-testing
- - -

https://getsatisfaction.com/imdb/topics/feature-films-in-the-news-genre?topic-reply-list[settings][f...
1 day ago
by J.
I have no idea what, if anything, should be done with this curious entry:
Probably the Best Test Title Around.
- - -

1 day ago
by Ed Jones (XLIX)
That curious title is an IMDb Test title I believe.
Think I remember Col explaining this once.
It was made up names (nm) sequence numbers that had bogus actors on it.
Running across one of these titles or names is rare.
.


Phil G

Another two weeks go by and still only silence from IMDb. Disappointing. I had seen some signs recently that IMDb was getting better at communication, but clearly there's still a long way to go. If you're actively refusing to answer my questions then at least have the decency to say that, instead of leaving me waiting and wondering.

If anyone's interested, in the meantime my recent contributions seem to have triggered the magic formula and unlocked the extended datasets for me.  Looks very promising at first glance, so I should probably be thanking you for that.  But, sadly, the discussion in this thread now leaves me wondering if I'll be able to find enough time to look at the data in detail before my recent contributions are no longer recent enough and I lose access again.

Jeorj Euler

I was not addressing you, Vincent Fournols.

Phil G

Jeorj, do you really think it's unreasonable for me to hope for a reply from IMDb Staff in less than the month I've been waiting so far? While I'm grateful for the input here from you and Vincent, I don't understand why these datasets seem to be surrounded in mystery and I'd like an official response. Even if that reply is "we can't/won't answer your questions", I want to know that IMDb aren't ignoring me. Are my expectations too high?

Jeorj Euler

I don't know. You're more than welcome to keep inquiring about it every month. If you're feeling disappointed about this or that, then it may be because your expectations are way too high. After several fiascos of the past few years, I've long since lowered mine.

Phil G

To be honest, it's not so much that I expect anything (I too have been around for far too long for that); it's just that I haven't yet given up hope that things might get better.

Jeorj Euler

There is hope that things will get better, but as we can see from the most recent dozen or so announcements, the company seems heavily focused on the portability and entertainment aspects of the site at this time.

Stewart, Employee

Hello,

The Extended Datasets are available to those who have 1000+ approved contributions in the last 360 days; otherwise, the Basic Datasets are available with just one approved contribution in the last 60 days.

I hope this helps.
Stewart
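
The thresholds Stewart describes can be written down as a small Python sketch. `dataset_tier` and its arguments are hypothetical names for illustration, not anything IMDb publishes. The thread does not say whether the 360-day and 60-day windows are inclusive or how approved contributions are actually counted, so the boundary handling here is an assumption. The "none" result follows Chris H.'s earlier comment that people with no contributions see neither dataset.

```python
from datetime import date, timedelta
from typing import Iterable


def dataset_tier(approved_dates: Iterable[date], today: date) -> str:
    """Return "extended", "basic", or "none" for a contributor.

    approved_dates holds one date per approved contribution.
    """
    dates = list(approved_dates)
    # Assumption: both windows include their first day and today.
    window_360_start = today - timedelta(days=360)
    window_60_start = today - timedelta(days=60)
    in_last_360 = sum(1 for d in dates if window_360_start <= d <= today)
    in_last_60 = sum(1 for d in dates if window_60_start <= d <= today)

    if in_last_360 >= 1000:
        return "extended"
    if in_last_60 >= 1:
        return "basic"
    return "none"
```

Under these assumptions, someone with a large lifetime total but a year-long break would fall back to the basic tier, or to none if they had no approved contribution in the last 60 days, which matches what Phil G reports in this thread.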

Vincent Fournols

Thanks a lot Stewart.

Phil G

Thanks very much Stewart, that's actually much better than I'd thought initially (I knew I'd taken a break for a while but didn't realise I'd been pretty much inactive for a whole year!)

Would it be possible to add a note saying this to the datasets page for people who don't yet have access, so that they understand that what they're seeing isn't everything available?

Brian Risselada

Thank you for the response!