root/trunk/plagger/assets/plugins/Filter-EntryFullText/chosunonline.yaml

Revision 1972 (checked in by otsune, 9 months ago)

assets/plugins/Filter-EntryFullText/chosunonline.yaml:
assets/plugins/Filter-EntryFullText/japanese_chosun_com.yaml:

rewrite EFT for http://www.chosunonline.com/ (from mhatta)

# http://www.chosunonline.com/
author:
  - mhatta
  - Masafumi Otsune
# URL patterns: the site itself, article links to follow, and article pages handled
custom_feed_handle: http://www\.chosunonline\.com/
custom_feed_follow_link: /article/\d{14}
handle: http://www\.chosunonline\.com/article/\d{14}
# XPath expressions used to pull each field out of the article page
extract_xpath:
  title: //h4/text()
  subtitle: //h5
  date: //div[@class="postdate"]/text()
#  date: substring-after(//div[@id="post"]/div[@class="postdate"]/text(),': ')
  body: //div[@id="news_content"]
  author: //div[@class="credit"]/text()
extract_after_hook: |
  # Prepend the subtitle to the body
  $data->{body} = "$data->{subtitle} $data->{body}";
  # Strip any text in front of the YYYY/MM/DD hh:mm:ss timestamp
  $data->{date} =~ s!.+?(\d{4}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2})!$1!;
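
The date cleanup in extract_after_hook can be tried on its own. The following is a minimal sketch, not part of the plugin: clean_date is a hypothetical helper that wraps the same substitution, and the sample strings are assumed inputs (the actual "postdate" text on the site may look different).

```perl
use strict;
use warnings;

# Hypothetical helper (not in Plagger): applies the same substitution
# that the hook applies to $data->{date}.
sub clean_date {
    my ($date) = @_;
    $date =~ s!.+?(\d{4}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2})!$1!;
    return $date;
}

# Assumed sample input with a label in front of the timestamp.
print clean_date('Posted: 2008/01/02 03:04:05'), "\n";

# A value that already starts with the timestamp is left unchanged,
# because .+? needs at least one character in front of the date.
print clean_date('2008/01/02 03:04:05'), "\n";
```

Only the text in front of the timestamp is removed; anything after the seconds field is kept, since the pattern is not anchored at the end.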