epg/siteini.pack/Netherlands/npo.nl.ini

**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: npo.nl
* @MinSWversion: V1.1.1/53.5
* @Revision 0 - [15/12/2013] Jan van Straaten
*     - creation
* @Remarks: successor of gids.publiekeomroep.nl
* @header_end
**------------------------------------------------------------------------------------------------
* is this a monday to monday site?
*
site {url=npo.nl|timezone=UTC+01:00|maxdays=6|cultureinfo=nl-NL|charset=utf-8|titlematchfactor=50}
url_index{url()|http://www.npo.nl/gids/verticaal/|urldate|/content}
urldate.format {datestring|dd-MM-yyyy}

* The index page lists all 24 channels in hour blocks. Within each hour block
* the channel data is separated by <td .. </td> tags.
* There is no channel_id inside the channel blocks; instead the channels are listed in a fixed order:
* the first <td .. </td> channel block is Nederland 1, the second Nederland 2, etc.
* This order number is the site_id in the channel list file.
* The next showsplit splits the index into hour blocks, 24 elements per hour block, one for each channel.
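*
* Illustration only, not scrubbed from the site: a sketch of the element layout the
* showsplit is expected to produce. It ASSUMES site_id is the 0-based position that the
* (commented-out) channel-creation loop at the end of this file generates.
*   elements  0 .. 23  -> first hour block, one element per channel
*   elements 24 .. 47  -> second hour block, one element per channel
*   so the channel with site_id N would sit at elements N, N+24, N+48, ...
* Under the same assumption the channel list entries would look like this
* (the xmltv_id values are placeholders):
*   <channel update="i" site="npo.nl" site_id="0" xmltv_id="Nederland 1">Nederland 1</channel>
*   <channel update="i" site="npo.nl" site_id="1" xmltv_id="Nederland 2">Nederland 2</channel>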
index_showsplit.scrub {multi(exclude="padder right")|td class='padder left'></td>|<td|</td>|</table>}
index_variable_element.modify {addstart(scope=splitindex)|'config_site_id'}
index_showsplit.modify {substring(type=element)|'index_showsplit' 'index_variable_element' 1/24} * this selects the proper channel
index_showsplit.modify {select()|"<div class=\'time\'>" ~} *ignore empty shows
index_showsplit.modify {replace|</a>|</a>\|} * split into individual shows

index_urlshow {url()|http://www.npo.nl|<a href="||"|class=}
*index_urlchannellogo {url| }
*
scope.range {(indexshowdetails)|end}
index_start.scrub {single|<div class='time'>||</div>|</div>}
*index_stop.scrub {single|data-end-hour="||"|</div>}
*index_temp_1.scrub {single|data-end-minutes="||"|</div>}
*index_stop.modify {addend()|:'index_temp_1'}
index_category.scrub {single(separator=",")|data-genre="||"|"}
index_title.scrub {single(separator=":" include=first)|<div class='program-title'>||</div>|</div>}
index_subtitle.scrub {single(separator=":" exclude=first)|<div class='program-title'>||</div>|</div>}
index_category.modify {replace(not "19")|9|nieuws/actualiteit}
index_category.modify {replace|10|amusement}
index_category.modify {replace|11|informatief}
index_category.modify {replace|12|religieus}
index_category.modify {replace|13|jeugd}
index_category.modify {replace|14|serie/soap}
index_category.modify {replace|15|overige}
index_category.modify {replace|16|documentaire}
index_category.modify {replace|17|sport}
index_category.modify {replace|18|misdaad}
index_category.modify {replace|19|kunst/cultuur}
index_category.modify {replace|20|erotiek}
index_category.modify {replace|21|animatie}
index_category.modify {replace|22|natuur}
index_category.modify {replace|23|comedy}
index_category.modify {replace|24|muziek}
index_category.modify {replace|25|film}
index_category.modify {replace|26|educatief}
index_category.modify {replace|27|gezondheid}
index_category.modify {replace|28|wetenschap}
index_category.modify {replace|30|kinderen 6-12}
index_category.modify {replace|31|drama}
index_category.modify {replace|32|kinderen 2-5}
index_category.modify {replace|33|klassiek}
end_scope
*
title.scrub {single()|<h1 title=|>|</h1>|</div>}
title.scrub {single()|<h1>||</h1><h2>|</div>} * alternative
title.modify {cleanup(removeduplicates)}
title.modify {addstart("")|'index_title'} * some detail pages have no title!
title.modify {cleanup}
subtitle.scrub {single|<h1 title=|<h2>|</h2>|</div>}
subtitle.scrub {single|</h1><h2>||</h2>|</div>} *alternative
subtitle.modify {cleanup(removeduplicates)}
temp_1.scrub {single()|<h1 title=|(|)|</div>}
description.modify {addend('temp_1' not "")| ('temp_1')} * broadcaster (omroep)
description.scrub {multi(include="name='description")|<meta content='||'>|'>}
description.modify {remove()|\' name=\'description}
rating.scrub {multi|<i class="npo-icon-kijkwijzer||">|</div>}
rating.modify {remove(type=regex)|"(\A.+title=\")"}
rating.modify {remove|age-}
*
**  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _
**      #####  CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
**
** @auto_xml_channel_start
* site_id is set to the order nr in which the channels are listed on the index page
*scope.range {(channellist)|end}
*index_temp_2.scrub {multi()|<ul class='scroll-header'>|/>|</li>|</ul>}
*index_temp_2.modify {cleanup}
*index_site_channel.modify {addstart|'index_temp_2'}
*index_site_id.scrub {multi||||} * dummy needed to activate the channellist creation
*index_temp_1.modify {calculate(type=element format=F0)|'index_site_channel' #}
*loop {('index_temp_1' > "0" max=50)|end}
*index_temp_1.modify {calculate(format=F0)|1 -}
*index_site_id.modify {addstart|'index_temp_1'####}
*end_loop
*index_site_id.modify {replace|####|\|}
*end_scope
** @auto_xml_channel_end