95 lines
5.9 KiB
INI
Executable File
95 lines
5.9 KiB
INI
Executable File
**------------------------------------------------------------------------------------------------
|
|
* @header_start
|
|
* WebGrab+Plus ini for grabbing IMDB data from TvGuide websites
|
|
* @MinSWversion: V1.1.1/56.12
|
|
* - (postprocess V1.11)
|
|
* @Site: imdb.com
|
|
* @Revision 4 - [07/12/2015] Jan van Straaten
|
|
* - change element names, mdb_show_id and mdb_episode_id
|
|
* @Revision 3 - [30/09/2015] Jan van Straaten
|
|
* - added mdb-category
|
|
* @Revision 2 - [11/08/2014] Jan van Straaten
|
|
* - improved mdb_episode_id selection
|
|
* @Revision 1 - [09/06/2014] Jan van Straaten
|
|
* - added url header
|
|
* @Revision 0 - [07/04/2014] Jan van Straaten
|
|
* - creation
|
|
* @Remarks: Series data extraction. English version
|
|
* - variant of imdb.com.imdb_series.ini rev 1 , episode in onscreen s2e4 format
|
|
* @header_end
|
|
**------------------------------------------------------------------------------------------------
|
|
*
|
|
*
|
|
site {url=imdb.com|cultureinfo=en-GB|charset=UTF-8|matchfactor=60|searchsite=imdb|episodesystem=onscreen}
|
|
* primary search (using imdb's advanced search):
|
|
url_primarysearch {url()|http://www.imdb.com/search/title?&title=|'title'|&title_type=tv_series}
|
|
url_primarysearch.modify {replace| |%20}
|
|
*http://www.imdb.com/search/title?title=Touched%20by%20an%20Angel&title_type=tv_series
|
|
url_primarysearch.headers {customheader=Accept-Encoding=gzip,deflate}
|
|
mdb_show_id.scrub {multi|primary|<span class="wlb_wrapper"|<a href="/title/tt|/">|</a>}
|
|
*
|
|
* imdb url's:
|
|
url_mdb_p1.modify {addstart|http://www.imdb.com/title/tt'mdb_show_id'/epdate} * all the episodes date sorted with episode title and mdb_episode_id
|
|
url_mdb_p2.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'} * the episode detail page
|
|
* or http://www.imdb.com/title/tt0553267/?ref_=tt_ep_pr = same
|
|
url_mdb_p3.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'/synopsis?ref_=tt_stry_pl} * the full synopsis
|
|
* is same as * http://www.imdb.com/title/tt2288518/synopsis full synopsis
|
|
url_mdb_p4.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'/fullcredits?ref_=tt_ql_1} *full cast and crew (director, writer, actor)
|
|
* http://www.imdb.com/title/tt2288518/fullcredits?ref_=tt_ql_1
|
|
url_mdb_p5.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'/plotsummary?ref_=tt_ql_5} *plot summary (not used)
|
|
url_mdb_p6.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'/reviews?ref_=tt_ql_7} *user reviews
|
|
*
|
|
url_mdb.headers {customheader=Accept-Encoding=gzip,deflate}
|
|
*
|
|
* imdb elements
|
|
mdb_subtitle.scrub {multi|p1|<a href="/title/tt|/">|</a>|</td>}
|
|
mdb_title.scrub {single(separator="(" include=first exclude="</span>")|p1|<span class="title-extra">||<i>|</span>} * the original title
|
|
mdb_title.scrub {single(separator="(" include=first)|p1|<title>||</title>|</title>}
|
|
mdb_title.modify {cleanup(tags="/=\"")} * removes starting "
|
|
mdb_title.modify {cleanup(tags="\"=/")}
|
|
*
|
|
** aka's not yet implemented if at all possible
|
|
*mdb_title.scrub {multi(separator=" - ")|p3|<h5><a name="akas">Also Known As (AKA)</a></h5>|<tr>\n<td>|</td>|</table>} *aka's
|
|
*mdb_title.scrub {multi|p3|<h5><a name="akas">Also Known As (AKA)</a></h5>|<tr>\n<td>|</td>|</table>} *aka's
|
|
*
|
|
** get the mdb_episode_id
|
|
** this is the procedure to follow: from an index page with all episodes and episode titles on it, split it in individual episodes
|
|
mdb_temp_1.scrub {multi|p1|<h3>Episodes Rated by Date</h3>|<td><a href="/title/tt|</tr>|<br style="clear:both;" />} * all the episodes
|
|
mdb_temp_1.modify {select|">'mdb_subtitle'<" ~} * select the one and only with the episode title
|
|
mdb_episode_id.modify {substring(type=regex)|'mdb_temp_1' "(\d\{7\})/\">"} * get the tt nbr for the episode
|
|
*
|
|
* the following elements are taken from the episode detail page mdb-p2
|
|
* there is a story line, director and actors , starrating, episodenum
|
|
* also a 'full synopsys' on a separate page mdb-p3
|
|
** productiodate as year
|
|
*mdb_productiondate.scrub {single|p2|<title>||</title>|</title>}
|
|
*mdb_productiondate.modify {calculate(format=productiondate)}
|
|
** full productiondate
|
|
mdb_productiondate.scrub {single|p2|<h2 class="tv_header">|<span class="nobr">(|)</span>|</h1>}
|
|
mdb_productiondate.modify {calculate(format=productiondate)} * only year allowed!
|
|
** new:
|
|
mdb_category.scrub {regex()|p2||<span class=\"itemprop\" itemprop=\"genre\">(.+?)</span></a>||}
|
|
**
|
|
mdb_temp_2.scrub {single(include="Season""Episode")|p2|<h2 class="tv_header">|<span class="nobr">|</span>|</h2>} * episode
|
|
mdb_actor.scrub {multi|p4|?ref_=ttfc_fc_cl_t|itemprop="name">|</span>|</a>}
|
|
mdb_director.scrub {multi|p4|?ref_=ttfc_fc_dr|" >|</a>|</td>}
|
|
mdb_starrating.scrub {single|p2|Ratings:|itemprop="ratingValue">|</span>|</strong>}
|
|
mdb_starratingvotes.scrub {single|p2|Ratings:|itemprop="ratingCount">|</span>|users</a>}
|
|
mdb_showicon.scrub {single|p2|Poster"|src="|"|"image" />}
|
|
mdb_commentsummary.scrub {multi(exclude="SPOILERS ARE INCLUDED""This review may contain spoilers""Add another review" include=first)|p6|<a href="reviews-index?">|<h2>|</h2>|Add another review}
|
|
mdb_review.scrub {multi(exclude="SPOILERS ARE INCLUDED""This review may contain spoilers""Add another review" include=first)|p6|<a href="reviews-index?">|<p>|</p>\n\n|Add another review}
|
|
mdb_plot.scrub {single(separator="<em" include=first)|p2|<h2>Storyline</h2>|<p>|</p>|</div>}
|
|
mdb_description.scrub {single|p3|<div id="swiki.2.1">||</div>|</div>}
|
|
mdb_description.modify {replace|<br/><br/>| }
|
|
*
|
|
* standard 'onscreen' episode
|
|
mdb_episode.modify {addstart('mdb_temp_2' not "")|'mdb_temp_2'}
|
|
mdb_episode.modify {replace|Season |s}
|
|
mdb_episode.modify {replace|Episode |e}
|
|
mdb_episode.modify {remove|, }
|
|
* convert episode to xmltv_ns:
|
|
*mdb_episode.modify {substring(type=regex)|'mdb_temp_2' "Season.(\d+)"}
|
|
*mdb_episode.modify {calculate(> "0" format=F0)|1 -}
|
|
*mdb_temp_2.modify {substring(type=regex)|"Episode.(\d+)"}
|
|
*mdb_temp_2.modify {calculate(> "0" format=F0)|1 -}
|
|
*mdb_episode.modify {addend|.'mdb_temp_2'.} |