**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: npo.nl
* @MinSWversion: V1.1.1/53.5
* @Revision 0 - [15/12/2013] Jan van Straaten
* - creation
* @Remarks: successor of gids.publiekeomroep.nl
* @header_end
**------------------------------------------------------------------------------------------------
* is this a monday to monday site?
*
site {url=npo.nl|timezone=UTC+01:00|maxdays=6|cultureinfo=nl-NL|charset=utf-8|titlematchfactor=50}
url_index{url()|http://www.npo.nl/gids/verticaal/|urldate|/content}
urldate.format {datestring|dd-MM-yyyy}
* The index page lists all channels (24) in hour blocks; within each hour block
* the channel data is separated by tags
* There is no channel_id inside the channel blocks; instead the channels are listed in a fixed order:
* the first channel block is Nederland 1, the second Nederland 2, etc.
* This order is the site_id of the channellist file.
* The next showsplit splits the index into hour blocks, 24 elements in one hour, one for each channel.
index_showsplit.scrub {multi(exclude="padder right")|td class='padder left'> | ||}
index_variable_element.modify {addstart(scope=splitindex)|'config_site_id'}
index_showsplit.modify {substring(type=element)|'index_showsplit' 'index_variable_element' 1/24} * this selects the proper channel
index_showsplit.modify {select()|"" ~} *ignore empty shows
index_showsplit.modify {replace||\|} * split into individual shows
index_urlshow {url()|http://www.npo.nl| ||| }
*index_stop.scrub {single|data-end-hour="||"|}
*index_temp_1.scrub {single|data-end-minutes="||"|}
*index_stop.modify {addend()|:'index_temp_1'}
index_category.scrub {single(separator=",")|data-genre="||"|"}
index_title.scrub {single(separator=":" include=first)||| |}
index_subtitle.scrub {single(separator=":" exclude=first)||| |}
index_category.modify {replace(not "19")|9|nieuws/actualiteit}
index_category.modify {replace|10|amusement}
index_category.modify {replace|11|informatief}
index_category.modify {replace|12|religieus}
index_category.modify {replace|13|jeugd}
index_category.modify {replace|14|serie/soap}
index_category.modify {replace|15|overige}
index_category.modify {replace|16|documentaire}
index_category.modify {replace|17|sport}
index_category.modify {replace|18|misdaad}
index_category.modify {replace|19|kunst/cultuur}
index_category.modify {replace|20|erotiek}
index_category.modify {replace|21|animatie}
index_category.modify {replace|22|natuur}
index_category.modify {replace|23|comedy}
index_category.modify {replace|24|muziek}
index_category.modify {replace|25|film}
index_category.modify {replace|26|educatief}
index_category.modify {replace|27|gezondheid}
index_category.modify {replace|28|wetenschap}
index_category.modify {replace|30|kinderen 6-12}
index_category.modify {replace|31|drama}
index_category.modify {replace|32|kinderen 2-5}
index_category.modify {replace|33|klassiek}
end_scope
*
title.scrub {single()|||}
title.scrub {single()||||} * alternative
title.modify {cleanup(removeduplicates)}
title.modify {addstart("")|'index_title'} * some detail pages have no title!
title.modify {cleanup}
subtitle.scrub {single|||}
subtitle.scrub {single||||} *alternative
subtitle.modify {cleanup(removeduplicates)}
temp_1.scrub {single()|}
description.modify {addend('temp_1' not "")| ('temp_1')} * omroep
description.scrub {multi(include="name='description")||'>}
description.modify {remove()|\' name=\'description}
rating.scrub {multi||}
rating.modify {remove(type=regex)|"(\A.+title=\")"}
rating.modify {remove|age-}
*
** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
**
** @auto_xml_channel_start
* site_id is set to the order nr in which the channels are listed on the index page
*scope.range {(channellist)|end}
*index_temp_2.scrub {multi()|}
*index_temp_2.modify {cleanup}
*index_site_channel.modify {addstart|'index_temp_2'}
*index_site_id.scrub {multi||||} * dummy needed to activate the channellist creation
*index_temp_1.modify {calculate(type=element format=F0)|'index_site_channel' #}
*loop {('index_temp_1' > "0" max=50)|end}
*index_temp_1.modify {calculate(format=F0)|1 -}
*index_site_id.modify {addstart|'index_temp_1'####}
*end_loop
*index_site_id.modify {replace|####|\|}
*end_scope
** @auto_xml_channel_end
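*
* Worked example of the channel-list loop above (a sketch, assuming three channels
* were scrubbed into index_site_channel and that index_site_id starts out empty):
*   index_temp_1 starts at 3 (the element count of index_site_channel)
*   pass 1: index_temp_1 = 2, index_site_id = 2####
*   pass 2: index_temp_1 = 1, index_site_id = 1####2####
*   pass 3: index_temp_1 = 0, index_site_id = 0####1####2####
* The final replace turns each #### into an element separator, so under these
* assumptions the channels would receive the site_ids 0, 1 and 2 in page order.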