Files
epg/siteini.pack/MDB postprocessor/imdb.com.bing.ini
freearhey a597b11307 Init
2021-03-09 22:46:37 +03:00

91 lines
5.4 KiB
INI
Executable File

**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing IMDB data from TvGuide websites
* @MinSWversion: V1.1.1/56.25
* @Site: imdb.com, primary search with bing.com
* @Revision 10 - [22/05/2016] Jan van Straaten
* - added mdbinitype
* - fixed star-rating and original title
* @Revision 9 - [07/12/2015] Jan van Straaten
* - change element names, mdb_show_id and mdb_episode_id
* @Revision 8 - [25/09/2015] Jan van Straaten
* - added mdb_category
* @Revision 7 - [10/10/2014] Jan van Straaten
* - improved showid scub, also numbers upto 5000000 (was 2500000)
* @Revision 6 - [09/06/2014] Jan van Straaten
* - added url header
* @Revision 5 - [07/06/2014] Jan van Straaten/Jagad
* - added mdb_showicon
* @Revision 4 - [20/12/2013] Jan van Straaten
* - changes in (aka)titles
* @Revision 3 - [23/11/2013] Jan van Straaten
* - changes in actor and director due to site changes
* @Revision 2 - [11/08/2013] Jan van Straaten
* - small changes in title, actor and commentsummary due to imdb.com changes
* @Revision 1 - [14/04/2012] Jan van Straaten
* - correction in production date
* @Remarks: none
* @header_end
**------------------------------------------------------------------------------------------------
*
*
site {url=imdb.com|mdbinitype=movie|cultureinfo=en-GB|charset=UTF-8|matchfactor=60|searchsite=bing}
* primary search:
url_primarysearch {url|http://www.bing.com/search?q=|imdb+title/tt+|'title'|+|'productiondate'|+|'credit'|&scope=web&setmkt=en-US&qs=ns&form=QBRE&qb=2}
*scope=web&setmkt=es-ES&setlang=match
mdb_show_id.scrub {regex()|primary||title/tt(\d{7})/||}
*
* filter showid (7 char long):
mdb_show_id.modify {remove| } * remove spaces
mdb_temp_1.modify {calculate(type=element format=F0)|'mdb_show_id' #} * number of mdb_show_id's = loop index
loop {('mdb_temp_1' > "0" max=50)|4}
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
mdb_temp_2.modify {substring(type=element)|'mdb_show_id' 'mdb_temp_1' 1} * the showid to inspect
mdb_temp_3.modify {calculate(type=char format=F0)|'mdb_temp_2' #} * how many chars in this mdb_show_id?
mdb_show_id.modify {remove('mdb_temp_3' not "7" type=element)|'mdb_show_id' 'mdb_temp_1' 1} * remove this mdb_show_id if not 7 chars
* end loop
* filter showid (only numbers and < 5000000):
mdb_temp_1.modify {calculate(type=element format=F0)|'mdb_show_id' #} * number of mdb_show_id's = loop index
loop {('mdb_temp_1' > "0" max=50)|5}
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
mdb_temp_2.modify {substring(type=element)|'mdb_show_id' 'mdb_temp_1' 1} * the showid to inspect
mdb_temp_3.modify {calculate(format=F0)|'mdb_temp_2'} * convert to number
mdb_show_id.modify {remove('mdb_temp_3' "0" type=element)|'mdb_show_id' 'mdb_temp_1' 1} * remove this mdb_show_id if not only numbers
mdb_show_id.modify {remove('mdb_temp_3' > "5000000" type=element)|'mdb_show_id' 'mdb_temp_1' 1} * remove this mdb_show_id if > 5000000
* end loop
*
* imdb url's:
url_mdb_p1 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/}
*url_mdb_p2.modify {addstart|'url_mdb_p1'plotsummary}
*url_mdb_p3.modify {addstart|'url_mdb_p1'releaseinfo#akas}
*url_mdb_p4.modify {addstart|'url_mdb_p1'reviews}
*url_mdb_p5.modify {addstart|'url_mdb_p1'fullcredits#cast}
*
url_mdb_p2 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/plotsummary}
url_mdb_p3 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/releaseinfo#akas}
url_mdb_p4 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/reviews}
url_mdb_p5 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/fullcredits#cast}
*
url_mdb.headers {customheader=Accept-Encoding=gzip,deflate}
*
* imdb elements
mdb_title.scrub {single(separator=":")|p1|<div class="originalTitle|">|<span|</div>}* original title when redirected
mdb_title.modify {cleanup(tags="/=\""} * removes starting "
mdb_title.scrub {single(separator=" - " exclude="IMDb" include=first)|p1|<head>|<title>|(|</title>}
mdb_title.scrub {multi(exclude="(""title")|p3|<table id="akas"|</td>\n <td>|</td>|</table>} *aka's
*
mdb_productiondate.scrub {single|p1|<title>||</title>|</title>}
** new:
mdb_category.scrub {regex()|p1||<span class=\"itemprop\" itemprop=\"genre\">(.+?)</span></a>||}
**
mdb_actor.scrub {multi(exclude="onclick=")|p1|itemprop="actors"|<span class="itemprop" itemprop="name">|</span>|</div>}
mdb_actor.scrub {multi(exclude="<img src=")|p5|class="castlist|itemprop="name">|</span>|</table>} * full list
mdb_director.scrub {multi|p1|itemprop="director"|<span class="itemprop" itemprop="name">|</span>|</div>}
mdb_director.scrub {multi|p5|Directed by|" > |</a>|</table>} * fulllist
mdb_starrating.scrub {single()|p1|<div class="ratingValue">|itemprop="ratingValue">|</span>|</div>}
mdb_starratingvotes.scrub {single|p1|<div class="ratingValue">|based on|user ratings|</div>}
mdb_commentsummary.scrub {multi(max=5 exclude="This review may contain spoilers")|p4|<a href="reviews-index?">|<h2>|</h2>|Add another review}
mdb_review.scrub {multi(exclude="SPOILERS ARE INCLUDED" include=first)|p4|<a href="reviews-index?">|</p>\n<p>|</p>\n\n|<div class="yn"}
mdb_plot.scrub {single|p2|<p class="plotpar">||<i>|</i>}
mdb_description.scrub {single|p1|<meta name="description"|content="|" />|<meta}
mdb_showicon.scrub {single|p1|Poster"|src="|"|"image" />}