103 lines
6.2 KiB
INI
Executable File
103 lines
6.2 KiB
INI
Executable File
**------------------------------------------------------------------------------------------------
|
|
* @header_start
|
|
* WebGrab+Plus ini for grabbing IMDB data from TvGuide websites
|
|
* @MinSWversion: V1.1.1/56.25
|
|
* @Site: imdb.com, primary search with imdb.com
|
|
* @Revision 10 - [22/05/2016] Jan van Straaten
|
|
* - added mdbinitype
|
|
* - fixed star-rating and original title
|
|
* @Revision 9 - [07/12/2015] Jan van Straaten
|
|
* - change element names, mdb_show_id and mdb_episode_id
|
|
* @Revision 8 - [25/09/2015] Jan van Straaten
|
|
* - added mdb_category
|
|
* @Revision 7 - [10/10/2014] Jan van Straaten
|
|
* - improved showid scub, also numbers upto 5000000 (was 2500000)
|
|
* @Revision 6 - [09/06/2014] Jan van Straaten
|
|
* - added url header
|
|
* @Revision 5 - [07/06/2014] Jan van Straaten/Jagad
|
|
* - added mdb_showicon
|
|
* @Revision 4 - [20/12/2013] Jan van Straaten
|
|
* - changes in (aka)titles
|
|
* @Revision 3 - [23/11/2013] Jan van Straaten
|
|
* - changes in actor and director due to site changes
|
|
* @Revision 2 - [11/08/2013] Jan van Straaten
|
|
* - small changes in title and commentsummary due to site changes
|
|
* @Revision 1 - [16/02/2013] Jan van Straaten
|
|
* - url primary search connects to advanced search,
|
|
* this is less effective than the normal search from rev 0
|
|
* but that became useless because imdb.com changed the result
|
|
* of that (too many hits)
|
|
* - small changes in actor due to imdb.com changes
|
|
* @Revision 0 - [14/05/2012] Jan van Straaten
|
|
* - creation
|
|
* @Remarks: It is advised to use this ini as the 'second chance' ini,
|
|
* see mdb.config
|
|
* @header_end
|
|
**------------------------------------------------------------------------------------------------
|
|
*
|
|
*
|
|
site {url=imdb.com|mdbinitype=movie|cultureinfo=en-GB|charset=UTF-8|matchfactor=60|searchsite=imdb}
|
|
* primary search:
|
|
*url_primarysearch {url(urlencode=1,2)|http://www.imdb.com/find?q=|'title'| + |%28|'productiondate'|%29+&s=tt}
|
|
*http://www.imdb.com/find?q=You+Only+Live+Twice+%2B+%281967%29&s=tt
|
|
*mdb_show_id.scrub {multi|primary|<a href="/title/tt||/"|/"}
|
|
url_primarysearch {url|http://www.imdb.com/search/title?release_date=|'productiondate'|,&title=|'title'}
|
|
*http://www.imdb.com/search/title?release_date=2010,&title=Odjuret
|
|
url_primarysearch.headers {customheader=Accept-Encoding=gzip,deflate}
|
|
*mdb_show_id.scrub {multi|primary|<td class="result_text">|<a href="/title/tt|/?ref|</td>}
|
|
mdb_show_id.scrub {multi|primary|<span class="wlb_wrapper"|<a href="/title/tt|/">|</a>}
|
|
*
|
|
* filter showid (7 char long):
|
|
mdb_show_id.modify {remove| } * remove spaces
|
|
mdb_temp_1.modify {calculate(type=element format=F0)|'mdb_show_id' #} * number of mdb_show_id's = loop index
|
|
loop {('mdb_temp_1' > "0" max=50)|4}
|
|
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
|
|
mdb_temp_2.modify {substring(type=element)|'mdb_show_id' 'mdb_temp_1' 1} * the showid to inspect
|
|
mdb_temp_3.modify {calculate(type=char format=F0)|'mdb_temp_2' #} * how many chars in this mdb_show_id?
|
|
mdb_show_id.modify {remove('mdb_temp_3' not "7" type=element)|'mdb_show_id' 'mdb_temp_1' 1} * remove this mdb_show_id if not 7 chars
|
|
* end loop
|
|
* filter showid (only numbers and < 5000000):
|
|
mdb_temp_1.modify {calculate(type=element format=F0)|'mdb_show_id' #} * number of mdb_show_id's = loop index
|
|
loop {('mdb_temp_1' > "0" max=50)|5}
|
|
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
|
|
mdb_temp_2.modify {substring(type=element)|'mdb_show_id' 'mdb_temp_1' 1} * the showid to inspect
|
|
mdb_temp_3.modify {calculate(format=F0)|'mdb_temp_2'} * convert to number
|
|
mdb_show_id.modify {remove('mdb_temp_3' "0" type=element)|'mdb_show_id' 'mdb_temp_1' 1} * remove this mdb_show_id if not only numbers
|
|
mdb_show_id.modify {remove('mdb_temp_3' > "5000000" type=element)|'mdb_show_id' 'mdb_temp_1' 1} * remove this mdb_show_id if > 5000000
|
|
* end loop
|
|
*
|
|
* imdb url's:
|
|
url_mdb_p1 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/}
|
|
*url_mdb_p2.modify {addstart|'url_mdb_p1'plotsummary}
|
|
*url_mdb_p3.modify {addstart|'url_mdb_p1'releaseinfo#akas}
|
|
*url_mdb_p4.modify {addstart|'url_mdb_p1'reviews}
|
|
*url_mdb_p5.modify {addstart|'url_mdb_p1'fullcredits#cast}
|
|
*
|
|
url_mdb_p2 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/plotsummary}
|
|
url_mdb_p3 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/releaseinfo#akas}
|
|
url_mdb_p4 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/reviews}
|
|
url_mdb_p5 {url|primary|http://www.imdb.com/title/tt|mdb_show_id|/fullcredits#cast}
|
|
*
|
|
url_mdb.headers {customheader=Accept-Encoding=gzip,deflate}
|
|
*
|
|
* imdb elements
|
|
mdb_title.scrub {single(separator=":")|p1|<div class="originalTitle|">|<span|</div>}* original title when redirected
|
|
mdb_title.modify {cleanup(tags="/=\""} * removes starting "
|
|
mdb_title.scrub {single(separator=" - " exclude="IMDb" include=first)|p1|<head>|<title>|(|</title>}
|
|
mdb_title.scrub {multi(exclude="(""title")|p3|<table id="akas"|</td>\n <td>|</td>|</table>} *aka's
|
|
*
|
|
mdb_productiondate.scrub {single|p1|<title>||</title>|</title>}
|
|
** new:
|
|
mdb_category.scrub {regex()|p1||<span class=\"itemprop\" itemprop=\"genre\">(.+?)</span></a>||}
|
|
**
|
|
mdb_actor.scrub {multi(exclude="onclick=")|p1|itemprop="actors"|<span class="itemprop" itemprop="name">|</span>|</div>}
|
|
mdb_actor.scrub {multi(exclude="<img src=""onclick=")|p5|class="castlist|itemprop="name">|</span>|</table>} * full list
|
|
mdb_director.scrub {multi|p1|itemprop="director"|<span class="itemprop" itemprop="name">|</span>|</div>}
|
|
mdb_director.scrub {multi|p5|Directed by|" > |</a>|</table>} * fulllist
|
|
mdb_starrating.scrub {single()|p1|<div class="ratingValue">|itemprop="ratingValue">|</span>|</div>}
|
|
mdb_starratingvotes.scrub {single|p1|<div class="ratingValue">|based on|user ratings|</div>}
|
|
mdb_commentsummary.scrub {multi(max=5 exclude="This review may contain spoilers")|p4|<a href="reviews-index?">|<h2>|</h2>|Add another review}
|
|
mdb_review.scrub {multi(exclude="SPOILERS ARE INCLUDED" include=first)|p4|<a href="reviews-index?">|</p>\n<p>|</p>\n\n|<div class="yn"}
|
|
mdb_plot.scrub {single|p2|<p class="plotpar">||<i>|</i>}
|
|
mdb_description.scrub {single|p1|<meta name="description"|content="|" />|<meta}
|
|
mdb_showicon.scrub {single|p1|Poster"|src="|"|"image" />} |