236 lines
14 KiB
INI
Executable File
236 lines
14 KiB
INI
Executable File
**------------------------------------------------------------------------------------------------
|
|
* @header_start
|
|
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
|
|
* @Site: www.mydigiguide.com
|
|
* @MinSWversion: V1.1.1/55.27
|
|
* none
|
|
* @Revision 11 - [11/11/2015] Jan van Straaten
|
|
* new channellist, timezone
|
|
* @Revision 10 - [11/12/2012] Jan van Straaten
|
|
* added scope to speed up the loop
|
|
* @Revision 9 - [18/06/2012] Jan van Straaten
|
|
* resolves a problem with the time sequence in channels with too long (> 20 hours) time gabs by adding a dummy show.
|
|
* @Revision 8 - [] Jan van Straaten
|
|
* added site retry
|
|
* @Revision 7 - [] Jan van Straaten
|
|
* small change in starrating scrub, added exclude="Starring"
|
|
* @Revision 6 - [] Quake505
|
|
* added elements previousshown and premiere, also updated rating scrub, some comments removed
|
|
* @Revision 5 - [] Jan van Straaten
|
|
* added element subtitles, needs V1.0.7Beta
|
|
* @Revision 4 - [] Quake505
|
|
* updated the Part Scrubbing
|
|
* @Revision 3 - [] Quake505
|
|
* made some changes to the Part scrubbing
|
|
* @Revision 2 - [] Jan van Straaten
|
|
* added some suggestions and alternatives
|
|
* @Revision 1 - []
|
|
* (First attempt at making a Grabbing file)
|
|
* @Remarks:
|
|
* It is recommended that the Retry setting is 15 or above.
|
|
* Robots exclusion
|
|
* @header_end
|
|
**------------------------------------------------------------------------------------------------
|
|
|
|
site {url=www.mydigiguide.com|timezone=Europe/London|maxdays=10|cultureinfo=en-GB|charset=ISO-8859-1|titlematchfactor=50}
|
|
site {episodesystem=onscreen}
|
|
site {retry=<retry>15</retry>}
|
|
*
|
|
*url_index{url|http://www.mydigiguide.com/dgx/wbl.dll?h=35&a=2&ch=|channel|&dt=|urldate|&el= POST:}
|
|
url_index{url|http://www.mydigiguide.com/dgx/wbl.dll?h=35&a=2&ch=|channel|&dt=|urldate}
|
|
urldate.format {daycounter|0}
|
|
*
|
|
*
|
|
index_showsplit.scrub {multi|"top"><p class||</p></td></tr></table>|</p></td></tr></table>}
|
|
*
|
|
scope.range {(splitindex)|end}
|
|
* this loop adds a dummy show if the timedifference is more that 20 hours and less that 23 hours between subsequent shows
|
|
index_temp_6.modify {addstart|'index_showsplit'}
|
|
index_showsplit.modify {remove|'index_showsplit} * clear and triggers modify
|
|
index_temp_1.modify {calculate(type=element format=F0)|'index_temp_6' #} * number of elements in index, looopcounter & element index
|
|
loop {('index_temp_1' > "0" max=200)|23} * loop the next 23 lines while index_temp_1 > 0 (through all the indexshown)
|
|
index_temp_1.modify {calculate(format=F0)|1 -} * decrease loop counter
|
|
index_temp_2.modify {substring(type=element)|'index_temp_6' 'index_temp_1' 1} * the index show
|
|
index_showsplit.modify {addstart|&&&&&'index_temp_2'}
|
|
index_temp_3.modify {calculate(type=char format=F0)|'index_temp_2' "<span class=\"bold\">" @} *
|
|
index_temp_3.modify {calculate(format=F0)|19 +} * index of start time
|
|
index_temp_4.modify {substring(type=char)|'index_temp_2' 'index_temp_3' 5} * start time
|
|
index_temp_1.modify {calculate(> "0" format=F0)|1 -}
|
|
index_temp_2.modify {substring(type=element)|'index_temp_6' 'index_temp_1' 1} * the 'next' index show
|
|
index_temp_3.modify {calculate(type=char format=F0)|'index_temp_2' "<span class=\"bold\">" @} *
|
|
index_temp_3.modify {calculate(format=F0)|19 +} * index of start time
|
|
index_temp_5.modify {substring(type=char)|'index_temp_2' 'index_temp_3' 5} * start time of 'next' show
|
|
index_temp_4.modify {calculate(format=F2)|'index_temp_5' 'index_temp_4' -} * the time difference, positive if dayjump
|
|
index_temp_3.modify {calculate(format=F0)|0 *} *clear, set to 0
|
|
index_temp_3.modify {calculate('index_temp_4' <= "4" format=F2)|1 +} *condition < 4
|
|
index_temp_3.modify {calculate('index_temp_4' > "1" format=F2)|1 +} *condition > 1, both conditions true = 2
|
|
index_temp_2.modify {replace('index_temp_3' "2.00")|'index_temp_5'|xx:xx} *temporary replace the startime with a dummy
|
|
index_temp_5.modify {calculate('index_temp_3' "2.00" format=time)|06:00 +} *time of show to add, 6 hours later
|
|
index_temp_2.modify {replace('index_temp_3' "2.00")|xx:xx|'index_temp_5'} * the new starttime
|
|
index_temp_4.modify {calculate(type=char format=F0)|'index_temp_2' "<p class="programmedetails" @} * index of programmedetails
|
|
index_temp_2.modify {remove('index_temp_3' "2.00" type=char)|'index_temp_2' 'index_temp_4'} * remove the old details
|
|
index_temp_2.modify {addend|<p class="programmedetails">DigiGuide Library">Ignore this!! Dummy Show added to resolve a time sequence problem</a></span>|</table>} * add new title
|
|
index_showsplit.modify {addstart('index_temp_3' "2.00")|&&&&&'index_temp_2'} * add extra show
|
|
index_temp_1.modify {calculate(> "0" format=F0)|1 +}
|
|
*end loop
|
|
index_showsplit.modify {replace|&&&&&|\|} * make multi
|
|
end_scope.range
|
|
*
|
|
index_start.scrub {single|start"|<span class="bold">|</span>|</p></td>}
|
|
index_title.scrub {single|<p class="programmedetails"|DigiGuide Library">|</a></span>|</table>}
|
|
*
|
|
index_category.scrub {multi(separator="/")|<p class="programmedetails">|<span class="catname">|</span>|<span class="rating"}
|
|
*
|
|
**** V7 added exclude="Starring" , else the include=" Star" will also pass results with " Starring"
|
|
index_starrating.scrub {single(separator=", "" ("")" exclude="Starring" include=" Star")|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
*
|
|
**** V5-The new subtitles, needs a mod to replace its value to "true" if the following scrub yields the right result
|
|
index_subtitles.scrub {single(separator=", "" ("")" include="Subtitles")|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
*
|
|
**** V6-The two lines below were added for the new previousshown and premiere elements.
|
|
index_previousshown.scrub {single(separator=", "" ("")" include="Repeat")|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
index_premiere.scrub {single(separator=", "" ("")" include="New Episode""New""Live")|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
*
|
|
index_episode.scrub {single|<p class="programmedetails">|Series |.|</table>}
|
|
**** The episode scrub above fails if "Series " is followed by a description rather than a number, we scrub a temp that either
|
|
**** returns the series number or the string following the word Series. Will be used later in mods to filter the real data.
|
|
index_temp_1.scrub {single(separator=", " exclude="Episode""episode")|<p class="programmedetails">|Series |.|</table>}
|
|
*
|
|
*index_description.scrub {single|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
**** alternative for description, separate Starring and Director and exclude on href. This also removes the part between () if
|
|
**** there are credits. This is handled later in mods, by either removing or reinserting.
|
|
index_description.scrub {single(separator="Starring""Director" exclude="href=")|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
**** the last part of the description , between (), used in mods for several purposes
|
|
**** the include=last2 and exclude=last ---> result forelast
|
|
index_temp_3.scrub {single(separator="("")" include=last2 exclude=last)|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
*
|
|
index_director.scrub {single|<br>Director:|target="_blank"> |</a>|</a><br>}
|
|
index_actor.scrub {multi(max=10)|<br>Starring:|target="_blank"> |</a>|</a><br>}
|
|
*
|
|
**** subtitle 'candidate', the first element in description before "." in a temp, see mods for the completion
|
|
index_temp_2.scrub {single(separator="." exclude=last include=first)|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
*
|
|
**** extract the productiondate from the description, see 4.5.2 of the doc why this works
|
|
index_productiondate.scrub {single(separator="("")" include=last2 exclude=last)|<p class="programmedetails">|<br>|</p>|</td></tr></table>}
|
|
*
|
|
**** V6-Added this rating scrub because the U rating was not working
|
|
index_rating.scrub {single|<span class="certificate"|alt="| certificate| certificate}
|
|
**** rating icon
|
|
index_ratingicon.scrub {single|<span class="certificate"|<img src="|" alt="|</span>}
|
|
*
|
|
* Mods
|
|
*
|
|
**** V2-mods for alternative starrating
|
|
index_starrating.modify {cleanup} * removes the 5 Star bold tags
|
|
index_starrating.modify {remove| Star}
|
|
index_starrating.modify {addend(notnull)|/5} * if you like it that way
|
|
*
|
|
**** V2-remove the description part between () at the end
|
|
index_description.modify {remove|('index_temp_3')}
|
|
**** V2-or -- if you prefer that -- reinsert it
|
|
*index_description.modify {addend(not ~ 'index_temp_3')|('index_temp_3')}
|
|
*
|
|
index_category.modify {remove|(}
|
|
index_category.modify {remove|)}
|
|
*
|
|
**** V2-clear episode if not just a number
|
|
index_temp_1.modify {calculate(format=F0)|" " #} * counts how many spaces, must be 0 if real series data
|
|
index_episode.modify {remove('index_temp_1' not "0")|'index_episode'}
|
|
index_episode.modify {addstart(~ ",")|Series } * restores the episode data as in description
|
|
index_description.modify {remove|'index_episode'. } * removes the data from the description
|
|
index_episode.modify {replace|Series |S}
|
|
index_episode.modify {replace|, episode |E}
|
|
*
|
|
**** V2-subtitle: two cases; 1. there is 'always a subtitle if there is episode data 2. Contains max two spaces (3 words)(risky??)
|
|
index_subtitle.modify {addstart('index_episode' not "")|'index_temp_2'} * case 1. Adds the subtitle 'candidate' index_temp_3 to
|
|
**** V2-the subtitle if there is episode data
|
|
index_temp_4.modify {calculate(format=F0)|'index_temp_2' " " #} * count number of spaces
|
|
index_temp_4.modify {calculate(format=F0)|5 /} * returns 0 if <3
|
|
index_subtitle.modify {addstart('index_temp_4' "0" null)|'index_temp_2'} * case 2. Adds the subtitle 'candidate' to subtitle if null and if three words or less
|
|
*
|
|
index_description.modify {remove(null)|'index_subtitle'.}
|
|
*
|
|
**** the case 2 subtitle construction above has the risk that it moves a single max three word description to the subtitle
|
|
**** just in case (untested because I didn't find this condition)
|
|
*index_description.modify {addstart("")|'index_subtitle'} * adds the subtitle back to description if "" (empty) from the previous mod
|
|
*index_subtitle.modify {remove('index_description' ~ 'index_subtitle')|'index_subtitle'} * cleares the subtitle if the description contains (~) the subtitle again (from previous mod)
|
|
*
|
|
**** V6-This was replaced.
|
|
**** added: extract rating from index_temp_3
|
|
*index_temp_3.modify {remove|'index_productiondate'} * to avoid mixeup of the number recognition in the next mods:
|
|
****
|
|
*
|
|
*
|
|
**** V6-Added this rating scrub because the U rating was not working, Below PG and U lines are not required, but added if you wanted to change the value
|
|
index_rating.modify {replace(== "18")|'index_rating'|18+}
|
|
index_rating.modify {replace(== "15")|'index_rating'|15+}
|
|
index_rating.modify {replace(== "12")|'index_rating'|12+}
|
|
index_rating.modify {replace(== "PG")|'index_rating'|PG}
|
|
index_rating.modify {replace(== "U")|'index_rating'|U}
|
|
index_rating.modify {replace(not == "U""PG""12+""15+""18+"|BAD}
|
|
index_rating.modify {remove(== "BAD")|'index_rating'}
|
|
*
|
|
*
|
|
**** V6-This was replaced.
|
|
**** use a lookup table of possible values (there may be more??)
|
|
*index_rating.modify {addstart('index_temp_3' ~ "18")|18+}
|
|
*index_rating.modify {addstart('index_temp_3' ~ "15")|15+}
|
|
*index_rating.modify {addstart('index_temp_3' ~ "12")|12+}
|
|
*index_rating.modify {addstart('index_temp_3' ~ "PG")|PG}
|
|
*
|
|
**** complete the ratingicon
|
|
index_ratingicon.modify {addstart(notnull)|http://www.mydigiguide.com/dgx/}
|
|
*
|
|
* The three lines below remove the word Repeat from the description, this can help a Scheduler from recording a program twice that has missing episode information.
|
|
*
|
|
**** V4-Updated
|
|
**** V3-Updated
|
|
**** Add the Part section of the episode
|
|
index_episode.modify {addend|P'{single(separator=",""." include=first)|<p class="programmedetails">|(Part |)|</table>}'}
|
|
index_presenter.scrub {single(separator=",""." include=first)|<p class="programmedetails">|Part |)|</td></tr></table>} * to scrub the Part Episode, using index_presenter as temp.
|
|
index_writer.modify {calculate(format=F0)|'index_presenter' " " #} * couning spaces in index_presenter, using index_writer as temp.
|
|
index_writer.modify {calculate(format=F0)|5 /} * returns 0 if spaces are <3
|
|
index_episode.modify {addstart('index_writer' "0" null)|P'index_presenter'}
|
|
*
|
|
index_episode.modify {replace| of |/}
|
|
**** the part section of the episode is in the subtitle, this removes it
|
|
index_subtitle.modify {remove|(Part '{single|<p class="programmedetails">|(Part |)|</table>}')} *remove from subtitle
|
|
*
|
|
**** * Extra Scrub if Episode is null, there can be 'lonely' Episode data
|
|
*
|
|
index_episode.modify {addstart(null)|E'{single|<p class="programmedetails">|Episode |.|</table>}'}
|
|
**** remove from description
|
|
index_description.modify {remove|Episode '{single|<p class="programmedetails">|Episode |.|</table>}'.}
|
|
*
|
|
**** V4-elements presenter and writer were used for the new Part mod as temp elements, below is removing any DATA left
|
|
index_presenter.modify {remove|'index_presenter'}
|
|
index_writer.modify {remove|'index_writer'}
|
|
*
|
|
index_description.modify {cleanup}
|
|
*
|
|
**** V5-the new 'boolean' element subtitles, replace its value in "true" if the scrubbed value is correct
|
|
index_subtitles.modify {replace(== "Subtitles")|'index_subtitles'|true}
|
|
*
|
|
**** V6-Added the lines below for the new elements previousshown and premiere
|
|
index_previousshown.modify {replace(== "Repeat")|'index_previousshown'|true}
|
|
index_premiere.modify {replace(== "New Episode")|'index_premiere'|true}
|
|
index_premiere.modify {replace(== "New" null)|'index_premiere'|true}
|
|
index_premiere.modify {replace(== "Live" null)|'index_premiere'|true}
|
|
** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
|
|
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
|
|
**
|
|
** @auto_xml_channel_start
|
|
** enable the following three lines to create a channel list file
|
|
** --Should only be run with one dummy channel in the config :
|
|
** <channel update="i" site="mydigiguide.com" site_id="" xmltv_id="dummy">dummy</channel>
|
|
** enable the following lines:
|
|
*index_site_id.scrub {regex||&ch=(\d+?)&cname=.+?&||}
|
|
*index_site_channel.scrub {regex||&ch=\d+?1&cname=(.+?)&||}
|
|
*index_site_channel.modify {replace|&|And}
|
|
*scope.range {(channellist)|end}
|
|
*index_site_id.modify {cleanup(removeduplicates=equal,100 link="index_site_channel")}
|
|
*end_scope
|
|
** @auto_xml_channel_end
|
|
|
|
|