| Old | New | Rule |
|---|---|---|
| 3 | | `custom_feed_handle: http://www\.sankei\.co\.jp/news/` |
| 4 | | `custom_feed_follow_link: /\d+/\w+\.htm` |
| 5 | | `handle: http://www\.sankei\.co\.jp/news/\d+/\w+\.htm` |
| 6 | | `extract: <!--midashi-->(.*?)<!--midashiend-->.*?<!--photo.sta-->(.*?)<!--photo.end-->.*?<!--hombun-->(.*?(\(\d{2}/\d{2} \d{2}:\d{2}\)).*?)<!--hbnend-->` |
| 7 | | `extract_capture: title photo body date` |
| | 3 | `handle: http://sankei\.jp\.msn\.com/\w+/\w+/\d+/[\w\-]+\.htm` |
| | 4 | `extract_xpath:` |
| | 5 | `title: //span[@id="__r_article_title__"]` |
| | 6 | `date: //span[@id="__r_publish_date__"]/text()` |
| | 7 | `body: //div[@class="_LSUCS"]` |
| | 8 | `photo1: //div[@class="image"]` |
| | 9 | `photo2: //div[@class="relatedimg"]` |
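
To illustrate what the change swaps out, here is a minimal Python sketch of the two extraction styles in the diff: the old single regex whose capture groups are named positionally by `extract_capture`, and the new per-field XPath rules. It is not the code that consumes these rules, which the diff does not show. The function names, the sample pages, the use of `re.DOTALL`, and the use of `xml.etree.ElementTree` (which supports only a subset of XPath, so `//` becomes `.//` and `/text()` is emulated with `.text`) are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Old rule: one regex; group order must match the names in extract_capture.
# re.DOTALL is an assumption so that ".*?" can span line breaks.
EXTRACT = re.compile(
    r"<!--midashi-->(.*?)<!--midashiend-->.*?"
    r"<!--photo.sta-->(.*?)<!--photo.end-->.*?"
    r"<!--hombun-->(.*?(\(\d{2}/\d{2} \d{2}:\d{2}\)).*?)<!--hbnend-->",
    re.DOTALL,
)
EXTRACT_CAPTURE = "title photo body date".split()

# New rules: one path per field. The flag marks the rule that ends in
# /text() in the diff, which ElementTree cannot express directly.
XPATH_RULES = {
    "title": (".//span[@id='__r_article_title__']", False),
    "date": (".//span[@id='__r_publish_date__']", True),
    "body": (".//div[@class='_LSUCS']", False),
    "photo1": (".//div[@class='image']", False),
    "photo2": (".//div[@class='relatedimg']", False),
}


def extract_by_regex(html):
    """Hypothetical helper: pair capture groups with their field names."""
    match = EXTRACT.search(html)
    if match is None:
        return None
    return dict(zip(EXTRACT_CAPTURE, match.groups()))


def extract_by_xpath(xhtml):
    """Hypothetical helper: evaluate each rule, skipping fields not found."""
    root = ET.fromstring(xhtml)  # requires well-formed markup
    fields = {}
    for name, (path, text_only) in XPATH_RULES.items():
        node = root.find(path)
        if node is None:
            continue
        fields[name] = node.text if text_only else ET.tostring(node, encoding="unicode")
    return fields


# Made-up sample pages, shaped only to satisfy the rules above.
OLD_PAGE = (
    "<!--midashi-->Headline<!--midashiend--><p>x</p>"
    '<!--photo.sta--><img src="a.jpg"><!--photo.end-->'
    "<!--hombun-->Body text (01/02 03:04)<!--hbnend-->"
)
NEW_PAGE = (
    "<html><body>"
    '<span id="__r_article_title__">Headline</span>'
    '<span id="__r_publish_date__">2009.1.2 03:04</span>'
    '<div class="_LSUCS"><p>Body text</p></div>'
    "</body></html>"
)

old_fields = extract_by_regex(OLD_PAGE)
new_fields = extract_by_xpath(NEW_PAGE)
```

In the regex version the date group is nested inside the body group, so a page that lacks the timestamp fails the whole match and yields no fields at all. With one path per field, a missing photo block simply leaves that field out, which is one plausible reason to prefer the XPath form.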