**------------------------------------------------------------------------------------------------ * @header_start * WebGrab+Plus ini for grabbing IMDB data from TvGuide websites * @MinSWversion: V3.0.0 * @Site: imdb.com, primary search with ask.com * @Revision 13 - [05/04/2020] Jan van Straaten * - full rewrite * @Revision 12 - [08/10/2017] Jan van Straaten * - added actor role * @Revision 11 - [22/05/2016] Jan van Straaten * - added mdbinitype * - fixed star-rating and original title * @Revision 10 - [07/12/2015] Jan van Straaten * - change element names, mdb_show_id and mdb_episode_id * @Revision 9 - [25/09/2015] Jan van Straaten * - added mdb_category * @Revision 8 - [10/10/2014] Jan van Straaten * - improved showid scub, also numbers upto 5000000 (was 2500000) * @Revision 7 - [09/06/2014] Jan van Straaten * - added url header * @Revision 6 - [07/06/2014] Jan van Straaten/Jagad * - added mdb_showicon * @Revision 5 - [20/12/2013] Jan van Straaten * - changes in (aka)titles * @Revision 4 - [23/11/2013] Jan van Straaten * - changes in actor and director due to site changes * @Revision 3 - [11/08/2013] Jan van Straaten * - small changes in commentsummary due to imdb.com changes * @Revision 2 - [16/02/2013] Jan van Straaten * - small changes in actor due to imdb.com changes * @Revision 1 - [14/04/2012] Jan van Straaten * - correction in production date * @Remarks: none * @header_end **------------------------------------------------------------------------------------------------ site {url=imdb.com|mdbinitype=movie|cultureinfo=en-GB|charset=UTF-8|matchfactor=60|searchsite=ask} * scope.range {(primarysearch)|end} * primary search: url_primarysearch {url(urlencode=1,2,3,4)|https://www.ask.com/web?&q=|imdb+|'title'|+|'credit'|&/NCR} mdb_show_id.scrub {regex|primary||/tt(\d{7})/||} * imdb url's: url_mdb_p1 {url()|primary|https://www.imdb.com/title/tt|mdb_show_id|/} url_mdb_p2 {url|primary|https://www.imdb.com/title/tt|mdb_show_id|/plotsummary} url_mdb_p3 {url|primary|https://www.imdb.com/title/tt|mdb_show_id|/releaseinfo#akas} url_mdb_p4 {url|primary|https://www.imdb.com/title/tt|mdb_show_id|/reviews} url_mdb_p5 {url|primary|https://www.imdb.com/title/tt|mdb_show_id|/fullcredits#cast} *url_mdb_p6 {url|primary|https://www.imdb.com/title/tt|mdb_show_id|/criticreviews?ref_=tturv_sa_5 * url_mdb.headers {customheader=Accept-Encoding=gzip,deflate,br} end_scope *imdb elements scope.range {(match)|end} * musthaves mdb_title.scrub {single|p3| (original title)|||}* original title mdb_title.scrub {single(separator=" - " exclude="IMDb" include=first)|p1|||(|} * OK mdb_title.scrub {regex|p3||(.+?)||} *aka's *OK end_scope scope.range {(getelements)|end} mdb_actor.scrub {regex|p5||name/nm\d+?/\?ref_=ttfc_fc_cl_t\d+?\"\s>(.+?)\s+?.+?\s+?(.+?)\s+?||} mdb_actor.modify {substring(type=element)|0 24} * keep the first 12 actors (double because still name|role format mdb_actor.modify {remove(type=regex)|""} mdb_actor.modify {replace|\||###} mdb_actor.modify {replace(type=regex)|"(###\s{5,})\w+"| (role=} mdb_actor.modify {replace|###|\|} ** the next is only necessary if actor still contains 9 or more spaces (after the role name) ** some p5 pages have a very simple actor listing without these spaces, so we add them mdb_actor.modify {addend| } mdb_actor.modify {substring(type=regex)|"\A\s(.+?\s\(role=.+?)\s{9}"} mdb_actor.modify {remove|} mdb_actor.modify {addend(not "")|)} * mdb_director.scrub {multi|p1|"director":|"name": "|"|\},} mdb_director.scrub {multi|p5|/?ref_=ttfc_fc_dr|"\r> ||} * fulllist mdb_director.modify {substring(type=element)|0 5} * keep the first 6 * mdb_productiondate.scrub {single|p1||||} mdb_category.scrub {multi|p1|genres&ref_=tt_ov_inf|>||} mdb_category.modify {cleanup(removeduplicates)} * mdb_description.scrub {single|p1||} mdb_starrating.scrub {single()|p1|
|itemprop="ratingValue">||
} mdb_starratingvotes.scrub {single|p1|
|based on|user ratings|
} mdb_country.scrub {regex|p1||

Release Date:

.+?\((.+?)\)||} mdb_plot.scrub {single|p2|id="plot-summaries-content">|

|

|} mdb_commentsummary.scrub {multi(max=5 excludeblock="Warning: Spoilers")|p4|class="title" >|||
} mdb_review.scrub {multi(max=1 excludeblock="Warning: Spoilers")|p4|class="title" >|
|
|
} mdb_review.modify {cleanup} end_scope *