**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: npo.nl
* @MinSWversion: V1.1.1/53.5
* @Revision 0 - [15/12/2013] Jan van Straaten
*   - creation
* @Remarks: successor of gids.publiekeomroep.nl
* @header_end
**------------------------------------------------------------------------------------------------
* is this a monday to monday site?
*
site {url=npo.nl|timezone=UTC+01:00|maxdays=6|cultureinfo=nl-NL|charset=utf-8|titlematchfactor=50}
url_index{url()|http://www.npo.nl/gids/verticaal/|urldate|/content}
urldate.format {datestring|dd-MM-yyyy}
*
* The index page lists all channels (24) in hour blocks; within each hour block
* the channel data is separated by tags.
* There is no channel_id inside the channel blocks, instead the channels are listed
* in a fixed order: the first channel block is Nederland 1, the second Nederland 2 etc.
* This order is the site_id of the channellist file.
* the next showsplit splits the index in hour blocks, 24 elements in one hour, one for each channel
index_showsplit.scrub {multi(exclude="padder right")|td class='padder left'>||}
index_variable_element.modify {addstart(scope=splitindex)|'config_site_id'}
index_showsplit.modify {substring(type=element)|'index_showsplit' 'index_variable_element' 1/24} * this selects the proper channel
index_showsplit.modify {select()|"
" ~} * ignore empty shows
index_showsplit.modify {replace||\|} * split in individual shows
index_urlshow {url()|http://www.npo.nl|||
|
}
*index_stop.scrub {single|data-end-hour="||"|}
*index_temp_1.scrub {single|data-end-minutes="||"|}
*index_stop.modify {addend()|:'index_temp_1'}
index_category.scrub {single(separator=",")|data-genre="||"|"}
index_title.scrub {single(separator=":" include=first)|
||
|}
index_subtitle.scrub {single(separator=":" exclude=first)|
||
|}
index_category.modify {replace(not "19")|9|nieuws/actualiteit}
index_category.modify {replace|10|amusement}
index_category.modify {replace|11|informatief}
index_category.modify {replace|12|religieus}
index_category.modify {replace|13|jeugd}
index_category.modify {replace|14|serie/soap}
index_category.modify {replace|15|overige}
index_category.modify {replace|16|documentaire}
index_category.modify {replace|17|sport}
index_category.modify {replace|18|misdaad}
index_category.modify {replace|19|kunst/cultuur}
index_category.modify {replace|20|erotiek}
index_category.modify {replace|21|animatie}
index_category.modify {replace|22|natuur}
index_category.modify {replace|23|comedy}
index_category.modify {replace|24|muziek}
index_category.modify {replace|25|film}
index_category.modify {replace|26|educatief}
index_category.modify {replace|27|gezondheid}
index_category.modify {replace|28|wetenschap}
index_category.modify {replace|30|kinderen 6-12}
index_category.modify {replace|31|drama}
index_category.modify {replace|32|kinderen 2-5}
index_category.modify {replace|33|klassiek}
end_scope
*
title.scrub {single()|

|

|}
title.scrub {single()|

||

|} * alternative
title.modify {cleanup(removeduplicates)}
title.modify {addstart("")|'index_title'} * some detail pages have no title!
title.modify {cleanup}
subtitle.scrub {single|

|

|}
subtitle.scrub {single|

||

|} * alternative
subtitle.modify {cleanup(removeduplicates)}
temp_1.scrub {single()|

}
description.modify {addend('temp_1' not "")| ('temp_1')} * broadcaster (omroep)
description.scrub {multi(include="name='description")||'>}
description.modify {remove()|\' name=\'description}
rating.scrub {multi||}
rating.modify {remove(type=regex)|"(\A.+title=\")"}
rating.modify {remove|age-}
*
** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
**
** @auto_xml_channel_start
* site_id is set to the order nr in which the channels are listed on the index page
*scope.range {(channellist)|end}
*index_temp_2.scrub {multi()|
    |/>||
}
*index_temp_2.modify {cleanup}
*index_site_channel.modify {addstart|'index_temp_2'}
*index_site_id.scrub {multi||||} * dummy needed to activate the channellist creation
*index_temp_1.modify {calculate(type=element format=F0)|'index_site_channel' #}
*loop {('index_temp_1' > "0" max=50)|end}
*index_temp_1.modify {calculate(format=F0)|1 -}
*index_site_id.modify {addstart|'index_temp_1'####}
*end_loop
*index_site_id.modify {replace|####|\|}
*end_scope
** @auto_xml_channel_end