**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: sol.no
* @MinSWversion: V1.1.1/53.6
*   uses span in remove duplicates
* @Revision 2 - [29/12/2013] Jan van Straaten
*   remove of 'subshow' within a 'show' (with same start) in the index that causes dayjumps
* @Revision 1 - [27/02/2012] Willy de Wilde
*   Showsplit.scrub and stop.scrub
* @Revision 0 - [27/01/2011] Alberto Miguel
*   none
* @Remarks:
*   none
* @header_end
**------------------------------------------------------------------------------------------------
site {url=sol.no|timezone=UTC+01:00|maxdays=6|cultureinfo=NO|charset=utf-8|titlematchfactor=90|ratingsystem=NO}
url_index{url|http://www.sol.no/underholdning/tv/guiden/index.cgi?StartTime=0500&EndTime=2900&Date=|urldate|&Categories=*&Channels=|channel}
urldate.format {datestring|yyyy-MM-dd}
*
index_urlshow {url|http://www.sol.no/||} *Sometimes the // is problematic.
index_showsplit.scrub {multi()||}
*
* it happens that some shows are listed that occur within another show
* the following removes the extra show
scope.range {(splitindex)|end}
index_showsplit.modify {|} * dummy required to initiate the index_showsplit operations
index_temp_2.modify {substring(type=regex)|'index_showsplit' "(.+}
index_temp_1.scrub {single||
|} index_category.scrub {single|
|
|} index_description.scrub {multi(exclude="|
||
|}
description.scrub {single(separator="Medvirkende:" include=first)||}
titleoriginal.scrub {single(lang=xx)|}
productiondate.scrub {single|}
presenter.scrub {single(separator=", ")|}
actor.scrub {single(separator=", ")|}
director.scrub {single(separator=", ")|}
rating.scrub {single(exclude="")|}
*Series titles sometimes include something like (2:4)
*
index_title.modify {remove(null)|}
index_title.modify {remove(null)|'index_start'}
*
index_temp_1.modify {remove(null)|} *Needed because cleanup sometimes deletes the last ) on titles ending in (R)
index_title.modify {addstart(null)|'index_temp_1'}
*
title.modify {remove(null)|} *Needed because cleanup sometimes deletes the last ) on titles ending in (R)
*
description.modify {remove(null)|'index_description'} *The description is duplicated when index_description exists
description.modify {remove(null)|('titleoriginal')} *The description is duplicated when index_description exists
description.modify {remove(null)|Programleder'presenter'.}
*
titleoriginal.modify{replace()|``|\'}
titleoriginal.modify{addend(notnull)|.!-}
titleoriginal.modify{remove(null)|..!-} *Sometimes the original title ends with a final .
titleoriginal.modify{remove(null)|.!-}
*
presenter.modify {remove(null)|:}
presenter.modify {remove(null)|er }
presenter.modify {remove(null)|.}
*
description.modify{replace()|``|\'}
*
director.modify{replace()|``|\'}
*
actor.modify{replace()|``|\'}
*
rating.modify {addend(notnull)|. år}
).+"} * the part that is equal
index_temp_2.modify {remove(type=regex)|"(\d{2}:\d{2})"} * remove the stop time (not equal)
index_temp_2.modify {cleanup(removeduplicates link=index_showsplit)} * span=2 , removes only when next or next-next are equal
end_scope
*
index_start.scrub {single|||
|||
|||
|
|| |} title.scrub {multi(exclude="|||||
|(|)||fra|.||Programleder|.||Medvirkende:|.||Regi:|.||. (|. år)|