87 lines
4.4 KiB
INI
Executable File
87 lines
4.4 KiB
INI
Executable File
**------------------------------------------------------------------------------------------------
|
|
* @header_start
|
|
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
|
|
* @Site: bulsat.com
|
|
* @MinSWversion: V1.1.1/53
|
|
* none
|
|
* @Revision 0 - [18/06/2014] Jan van Straaten/Willy De Wilde
|
|
* creation
|
|
* @Remarks:
|
|
* none
|
|
* @header_end
|
|
**------------------------------------------------------------------------------------------------
|
|
*
|
|
* a difficult site, one week index page (mon - sun) sorted in hour sections.
|
|
* The details are also on the index page, by id reference from the index show
|
|
*
|
|
site {url=bulsat.com|timezone=UTC+02:00|maxdays=7.1|cultureinfo=bg|charset=UTF-8|titlematchfactor=90}
|
|
site {firstday=0123456}
|
|
url_index{url(debug)|http://www.bulsat.com/tv-programa.php?view=vertical&go=|channel|}
|
|
*
|
|
index_showsplit.scrub {single||||} * copy of the index_page
|
|
*
|
|
scope.range {(splitindex)|end}
|
|
index_variable_element.modify {substring(type=regex)|'index_showsplit' "<!--/main -->(.+)</div></div>"} * all the showdetails for use further down
|
|
index_showsplit.modify {substring(type=regex)|"<div id=\"tvprograma\">(?:.*?)(?:<li(.*?)/li(?:.*?))*</div></div>"}
|
|
index_showsplit.modify {remove(type=element)|0 7} * no need for the first 7 (are date references)
|
|
index_showsplit.modify {replace|\||\n\|}
|
|
index_temp_1.modify {substring(type=element)|'index_showsplit' 0 1/7} *monday
|
|
index_temp_1.modify {replace|\||\n####} * make single
|
|
index_temp_6.modify {addend|\n####'index_temp_1'}
|
|
index_temp_1.modify {substring(type=element)|'index_showsplit' 1 1/7} *tuesday
|
|
index_temp_1.modify {replace|\||\n####} * make single
|
|
index_temp_6.modify {addend|\n####'index_temp_1'}
|
|
index_temp_1.modify {substring(type=element)|'index_showsplit' 2 1/7} *wednesday
|
|
index_temp_1.modify {replace|\||\n####} * make single
|
|
index_temp_6.modify {addend|\n####'index_temp_1'}
|
|
index_temp_1.modify {substring(type=element)|'index_showsplit' 3 1/7} *thursday
|
|
index_temp_1.modify {replace|\||\n####} * make single
|
|
index_temp_6.modify {addend|\n####'index_temp_1'}
|
|
index_temp_1.modify {substring(type=element)|'index_showsplit' 4 1/7} *friday
|
|
index_temp_1.modify {replace|\||\n####} * make single
|
|
index_temp_6.modify {addend|\n####'index_temp_1'}
|
|
index_temp_1.modify {substring(type=element)|'index_showsplit' 5 1/7} *saturday
|
|
index_temp_1.modify {replace|\||\n####} * make single
|
|
index_temp_6.modify {addend|\n####'index_temp_1'}
|
|
index_temp_1.modify {substring(type=element)|'index_showsplit' 6 1/7} *sunday
|
|
index_temp_1.modify {replace|\||\n####} * make single
|
|
index_temp_6.modify {addend|\n####'index_temp_1'} * all the days, mon to sun, time sorted
|
|
index_temp_6.modify {replace|\n####|\n\|} *back to multi, all the 'hour' sections
|
|
* split the individual shows from the 'hour' sections
|
|
index_temp_6.modify {replace|</span><span class="hour"|</span>\n\|<span class="hour"}
|
|
index_temp_6.modify {replace|</a><span class="hour"|</a>\n\|<span class="hour"}
|
|
index_temp_6.modify {replace|><a class="descript"|>\n\|<a class="descript"}
|
|
* back into showsplit
|
|
index_showsplit.modify {clear}
|
|
index_showsplit.modify {addstart|'index_temp_6'}
|
|
* get rid of the void page remains
|
|
index_showsplit.modify {cleanup}
|
|
index_showsplit.modify {remove|><\|}
|
|
index_showsplit.modify {remove()|>\|}
|
|
end_scope
|
|
*
|
|
index_start.scrub {single|<span class="hour">||<|<}
|
|
index_title.scrub {single|<span class="info">|||}
|
|
scope.range {(indexshowdetails)|end}
|
|
index_title.modify {remove|<}
|
|
* get the details, they are on the index page at the bottom
|
|
index_temp_2.scrub {regex||href="#" id=(\d{7})||} * the id
|
|
* index_variable_element contains all the details, extracted from the index page (see above)
|
|
index_description.modify {substring(type=regex)|'index_variable_element' "<div id=\"d'index_temp_2'\".+?</p><p>(.+?)</p></div>"}
|
|
end_scope
|
|
*
|
|
* in princile it is possible to extract details like credits, productiondate and category
|
|
*
|
|
** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
|
|
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
|
|
**
|
|
** @auto_xml_channel_start
|
|
*url_index {url|http://www.bulsat.com/tv-programa.php}
|
|
*index_site_channel.scrub {multi|<a href="/tv-programa.php?go=|">|</a>|</li>}
|
|
*index_site_id.scrub {multi|<a href="/tv-programa.php?go=||">|</li>}
|
|
*scope.range {(channellist)|end}
|
|
*index_site_id.modify {cleanup(removeduplicates=equal,100 link="index_site_channel")}
|
|
*end_scope
|
|
** @auto_xml_channel_end
|
|
|