150 lines
8.4 KiB
INI
Executable File
150 lines
8.4 KiB
INI
Executable File
**------------------------------------------------------------------------------------------------
|
|
* @header_start
|
|
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
|
|
* @Site: laguiatv.com
|
|
* @MinSWversion: V1.0.8
|
|
* none
|
|
* @Revision 5 - [21/10/2011] Jan van Straaten/MrSpock
|
|
* version for MediaPortal
|
|
* @Revision 4 - []
|
|
* several changes in title, subtitle, description, category, rating...
|
|
* @Revision 3 - []
|
|
* change in presenter separator
|
|
* @Revision 2 - []
|
|
* handles empty show detail pages , needs 1.0.8
|
|
* @Remarks:
|
|
* none
|
|
* @header_end
|
|
**------------------------------------------------------------------------------------------------
|
|
|
|
site {url=laguiatv.com|timezone=UTC+01:00|maxdays=6|cultureinfo=es-ES|charset=ISO-8859-1|titlematchfactor=90|ratingsystem=ES}
|
|
url_index{url|http://www.laguiatv.com/programacion.php?vertical=1&fecha=|urldate|&cadena=|channel}
|
|
urldate.format {datestring|dd/MM/yyyy}
|
|
index_urlshow {url|http://www.laguiatv.com|<a href="||">|</a></td>}
|
|
index_urlchannellogo {url|http://www.laguiatv.com/|<table class="grid cadena">|<img src="|" alt|</th>}
|
|
*
|
|
*
|
|
* index grabbing:
|
|
index_showsplit.scrub {multi()|<div class="programas">Programas</div>|<tr>|</tr>|</tbody>}
|
|
index_start.scrub {single|<th scope="row">||</th>|<td colspan=}
|
|
index_category.scrub {single(exclude="programa")|<th scope="row">|class="|">|</a></td>}
|
|
index_title.scrub {single|<a href="|">|</a>|</td>}
|
|
*
|
|
* enable the following lines to create a channel file
|
|
*index_site_channel.scrub {multi(exclude="option selected")|<div class="item equalize">|">|</option>|</select>}
|
|
*index_site_channel.scrub {multi(exclude="option selected")|<div class="item last equalize">|">|</option>|</select>}
|
|
*index_site_id.scrub {multi()|<div class="item equalize">|<option value="|">|</select>}
|
|
*index_site_id.scrub {multi()|<div class="item last equalize">|<option value="|">|</select>}
|
|
*
|
|
*
|
|
* programme grabbing:
|
|
title.scrub {single|<div class="intro-datasheet">|<p|</p>|</div>}
|
|
titleoriginal.scrub {single(lang=xx)|>Ficha técnica</h|<strong>Título original: </strong>|</li>|</ul>}
|
|
* subtitle, next grabs episode from description, exclude if html tags, messy structure
|
|
subtitle.scrub {single(exclude="</")|<div class="intro-datasheet">|<p class="desc"><b>|</b><br>|<div class="rater inline">}
|
|
director.scrub {single|>Ficha técnica</h|<li><strong>Director:</strong>|</li>|</ul>}
|
|
actor.scrub {single(separator=", ")|>Ficha técnica</h|<li><strong>Intérpretes</strong>:<br />|</li>|</ul>} * some shows this
|
|
actor.scrub {single(separator=", ")|<h3>Ficha técnica</h3>|<li><strong>Actores:</strong>|</li>|</ul>} * others this
|
|
presenter.scrub {single(separator=", " " y ")|>Ficha técnica</h|<li><strong>Presentador:</strong>|</li>|</ul>}
|
|
rating.scrub {single|>Ficha técnica</h|<li><strong>Rating:</strong>|</li>|</ul>}
|
|
*ratingicon.scrub {multi|} * no entry
|
|
category.scrub {single|>Ficha técnica</h|<li><strong>Género:</strong>|</li>|</ul>}
|
|
productiondate.scrub {single|>Ficha técnica</h|<li><strong>Año:</strong>|</li>|</ul>}
|
|
* descripción, argumento and comentario scrubs:
|
|
description.scrub {multi|<div class="intro-datasheet">|<p class="desc">|</p>|<div class="rater inline">} *not sure about be, occurs always?
|
|
* next adds 'Argumento' to description or if the one above is empty, only this (not sure :multi?)
|
|
description.scrub {multi(exclude="<object")|<h3>Argumento</h3>|<p>|</p>|<div class="rater inline">}
|
|
* adds 'comentario' to description:
|
|
description.scrub {single|>Ficha técnica</h|<li><strong>Comentario:</strong>|</li>|</ul>}
|
|
*
|
|
*
|
|
* operations:
|
|
* title: removes the html leftovers in titles, title has no unique es, <p> and <p class="desc">
|
|
title.modify {remove(notnull)|class="desc">}
|
|
title.modify {remove(notnull)|>}
|
|
title.modify {cleanup(null)}
|
|
index_title.modify {cleanup(null)}
|
|
titleoriginal.modify {cleanup(null)}
|
|
* subtitle from first part of index_title (eg. "Multicine: Superman")
|
|
temp_4.modify {remove|'temp_4'}
|
|
temp_4.modify {addend('title' not = 'index_title')|'index_title'} * only if different titles
|
|
temp_4.modify {remove|: 'title'} * deletes ": title" and first part remains (program name)
|
|
subtitle.modify {addend("")|'temp_4'} * copies into subtitle if empty
|
|
*
|
|
* description:
|
|
description.modify {remove|<li><strong>Comentario:</strong>} * </li> is missing sometimes and this is input at the end
|
|
description.modify {remove(null)|<strong>En este programa: </strong>}
|
|
* removing signs:
|
|
description.modify {remove(null)|[+]}
|
|
* removes subtitle (episode) from description:
|
|
description.modify {remove(null)|<b>'{single|<div class="intro-datasheet">|<p class="desc"><b>|</b><br>|<div class="rater inline">}'</b><br>}
|
|
* removes embedded objects & hyperlinks from description:
|
|
description.modify {remove(null)|</b>:<bR><bR><object '{single|<div class="intro-datasheet">|</b>:<bR><bR><object|</object>|<div class="rater inline">}'</object>}
|
|
description.modify {remove(null)|<object '{single|<div class="intro-datasheet">|<object|</object>|<div class="rater inline">}'</object>}
|
|
* remove hyperlinks:
|
|
description.modify {remove(null)|<a href='{single|<div class="intro-datasheet">|<a href=|>|<div class="rater inline">}'>}
|
|
description.modify {remove(null)|<A HREF='{single|<div class="intro-datasheet">|<A HREF=|>|<div class="rater inline">}'>} * as above but the hyperlink in captial letters! it happens!
|
|
* removes html leftovers:
|
|
description.modify {cleanup(null)}
|
|
description.modify {addstart(null)|Sin información} * fills an empty description
|
|
description.modify {replace|\||.\n} * convert description to a single value element
|
|
* next removes title from description start, that is usually followed by a dot
|
|
description.modify {remove(null)|'title'.\n} * deletes the title if followed by ".\n" (usually it's only at the very beginning)
|
|
description.modify {replace(null)|\'\'|\"} * sometimes uses '' instead of "
|
|
*
|
|
* handle the case that index_urlshow links to an empty show; it then ends with /0
|
|
index_temp_1.modify {substring(type=char)|'index_urlshow' -2 2}
|
|
index_urlshow.modify {remove('index_temp_1' "/0")|'index_urlshow'}
|
|
index_description.modify {addstart('index_temp_1' "/0")|Sin información}
|
|
*
|
|
* age rating:
|
|
rating.modify {remove(null)|Sin clasificar}
|
|
rating.modify {replace(null)|Todos los públicos|TP}
|
|
rating.modify {replace(null)|+ 7 años|7+}
|
|
rating.modify {replace(null)|+ 13 años|13+}
|
|
rating.modify {replace(null)|+ 18 años|18+}
|
|
*
|
|
* category changing:
|
|
index_category.modify {replace(null)|pelicula|Cine}
|
|
index_category.modify {replace(null)|película|Cine}
|
|
index_category.modify {replace(null)|serie|Serie}
|
|
*
|
|
* If there is an episode number, copy it to episode
|
|
episode.modify {calculate('title' ~ "episodio " format=F0)|'title' 1 *} * if word episodio is in title, try to extract the number
|
|
episode.modify {calculate('description' ~ "episodio " format=F0)|'description' 1 *} * if word episodio is in description, try to extract the number (it works if it is the first number)
|
|
episode.modify {calculate('subtitle' ~ "episodio" format=F0)|'subtitle' 1 *} * if word episodio is in subtitle, extract the number
|
|
episode.modify {remove("0")|0} * if the number is 0, remove it
|
|
*
|
|
*
|
|
* operations for MP (all the info into description):
|
|
temp_1.modify {remove|'temp_1'}
|
|
*Subtitle (can be movie program, episode number, episode title...)
|
|
temp_1.modify {addstart('subtitle' not "")|'subtitle'. }
|
|
*Country:
|
|
temp_2.modify {remove|'temp_2'}
|
|
temp_2.scrub {single|Origen:</strong|>|</li}
|
|
temp_1.modify {addend('temp_2' not "")|'temp_2'. } *Country
|
|
*More info
|
|
temp_1.modify {addend('productiondate' not "")|'productiondate'. } *Year
|
|
temp_1.modify {addend('titleoriginal' not "")|('titleoriginal'). } *Original title
|
|
*Categories
|
|
temp_2.modify {remove|'temp_2'}
|
|
temp_2.modify {addend(separator=", ")|'category'}
|
|
temp_1.modify {addend('temp_2' not "")|'temp_2'. } *Category
|
|
*Rating:
|
|
temp_1.modify {addend('rating' not "")|['rating']. } *Rating
|
|
*Actors:
|
|
temp_2.modify {remove|'temp_2'}
|
|
temp_2.modify {addend(separator=", ")|'actor'}
|
|
temp_1.modify {addend('temp_2' not "")|\nInt.: 'temp_2'. }
|
|
*Directors:
|
|
temp_2.modify {remove|'temp_2'}
|
|
temp_2.modify {addend(separator=", ")|'director'}
|
|
temp_1.modify {addend('temp_2' not "")|\nDirector: 'temp_2'. }
|
|
*Join all together, before the description
|
|
description.modify {addstart('temp_1' not "")|'temp_1'\n}
|
|
*Presenter:
|
|
temp_2.modify {remove|'temp_2'}
|
|
temp_2.modify {addend(separator=", ")|'presenter'}
|
|
description.modify {addend('temp_2' not "")|\nPresenta: 'temp_2'. }
|
|
* end |