**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: nasa.gov
* @MinSWversion: V1.1.1/55.17
* @Revision 2 - [01/04/2015] Francis De Paemeleere
*   - make it work on newer version of WG++ (for multiple days)
* @Revision 1 - [30/07/2014] Francis De Paemeleere
*   - fix automatic adjusting number of index data to be downloaded (only the dates you have configured)
*   - set correct timezone
*   - set correct cultureinfo
* @Revision 0 - [29/07/2014] Francis De Paemeleere
*   - creation
* @Remarks:
* @header_end
**------------------------------------------------------------------------------------------------

site {url=nasa.gov|timezone=US/Eastern|maxdays=32.1|cultureinfo=en-US|charset=UTF-8|titlematchfactor=90|firstshow=1}

url_index{url|http://www.nasa.gov/ws/tv_timeline.jsonp?format_output=1&display_id=page_1&Calendars=##CALENDAR##&andor=1&start=##START##%2012:00:00%20AM&end=##END##%2011:59:59%20PM}
scope.range {(urlindex)|end}
index_variable_element.modify {calculate(format=date,MM/dd/yyyy)|'now'} * today
url_index.modify {replace|##START##|'index_variable_element'}
index_temp_1.modify {calculate(format=F1)|'config_timespan_days' 1 +}
index_temp_1.modify {calculate(format=timespan,days)}
index_variable_element.modify {calculate(format=date,MM/dd/yyyy)|'now' 'index_temp_1' +} * config_timespan_days days after today
url_index.modify {replace|##END##|'index_variable_element'}
index_variable_element.modify {clear}
index_variable_element.modify {addstart|'config_site_id'}
url_index.modify {replace|##CALENDAR##|'index_variable_element'}
end_scope
url_index.headers {customheader=Accept-Encoding=gzip,deflate} * to speed up the downloading of the index pages

index_showsplit.scrub {multi|"nodes":|"node":|}}|}
scope.range {(splitindex)|end}
* filter out the shows that are on this channel (="calendar")
index_temp_1.modify {clear}
index_temp_5.modify {clear}
index_variable_element.modify {clear}
index_variable_element.modify {addstart|'config_site_id'}
index_temp_5.modify {addstart|"calendars"\s*:\s*\[[^]]*("'index_variable_element'")[^]]*\]}
loop {(each 'index_temp_6' in 'index_showsplit' max=3000)|end}
index_temp_2.modify {clear}
index_temp_3.modify {clear}
index_temp_3.modify {substring(type=regex)|'index_temp_6' "overlay"\s*:\s*\[}
index_temp_2.modify {substring(""=='index_temp_3' type=regex)|'index_temp_6' 'index_temp_5'}
index_temp_1.modify {addend(""not='index_temp_2')|'index_temp_6'###_###} * for now, only add the show if it is not an overlay show (= a show that is part of a bigger show)
end_loop
index_showsplit.modify {clear}
index_showsplit.modify {addstart|'index_temp_1'}
index_showsplit.modify {replace|###_###|\|}
** sort the shows
index_showsplit.modify {sort(ascending,string)}
sort_by.scrub {regex(target="index_showsplit")||"start"\s*:\s*"([^"\\]*(?:\\.[^"\\]*)*)"||}
end_scope

*index_date.scrub {single|}
index_start.scrub {regex||"start"\s*:\s*"([^"\\]*(?:\\.[^"\\]*)*)"||}
index_stop.scrub {regex||"end"\s*:\s*"([^"\\]*(?:\\.[^"\\]*)*)"||}
index_title.scrub {regex||"title"\s*:\s*"([^"\\]*(?:\\.[^"\\]*)*)"||}
index_description.scrub {regex||"description"\s*:\s*"([^"\\]*(?:\\.[^"\\]*)*)"||}
index_showicon.scrub {regex||"master_image"\s*:\s*"([^"\\]*(?:\\.[^"\\]*)*)"||}

index_start.modify {substring(type=regex)|\d{4}-\d{2}-\d{2} (\d{2}:\d{2}):\d{2}}
index_stop.modify {substring(type=regex)|\d{4}-\d{2}-\d{2} (\d{2}:\d{2}):\d{2}}
index_title.modify {cleanup(style=jsondecode)}
index_description.modify {cleanup(style=jsondecode)}
index_showicon.modify {cleanup(style=jsondecode)}
index_showicon.modify {addstart(not"")|http://www.nasa.gov}

** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
**
** @auto_xml_channel_start
*url_index{url|http://www.nasa.gov/multimedia/nasatv/schedule.html}
*index_site_id.scrub {regex ||||}
*index_site_channel.scrub {regex||||}
*scope.range {(channellist)|end}
*index_site_id.modify {cleanup(removeduplicates=equal,100 link="index_site_channel")}
** remove the first entry, because it is the "all" value
*index_site_id.modify {remove(type=element)|0 1}
*index_site_channel.modify {remove(type=element)|0 1}
*end_scope
** @auto_xml_channel_end
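**
** Worked example of the url_index placeholder expansion (an illustrative sketch, not taken from a real run).
** Assumptions: today is 04/01/2015 (MM/dd/yyyy), config_timespan_days evaluates to 0,
** and SITE_ID is a hypothetical stand-in for the channel's site_id from the config file.
**   ##START##    -> 04/01/2015   (today)
**   ##END##      -> 04/02/2015   (today + config_timespan_days + 1 day, as computed in the urlindex scope)
**   ##CALENDAR## -> SITE_ID
** Under those assumptions the index request should look like:
**   http://www.nasa.gov/ws/tv_timeline.jsonp?format_output=1&display_id=page_1&Calendars=SITE_ID&andor=1&start=04/01/2015%2012:00:00%20AM&end=04/02/2015%2011:59:59%20PM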