This repository has been archived by the owner on Jul 17, 2024. It is now read-only.

Commit

Merge pull request #276 from covid19-miyazaki/ticket-275-fix-scraiping-news

closed #275 fix scraiping news
korosuke613 authored Aug 2, 2020
2 parents 2cc4061 + 822a8e1 commit d903f49
Showing 7 changed files with 72 additions and 33 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/news_scraping.yml
@@ -21,10 +21,10 @@ jobs:
 - uses: actions/checkout@v2
 - uses: actions/setup-ruby@v1
 with:
-ruby-version: '2.5'
+ruby-version: '2.5.8'
 - name: Scraping News
 env:
-URL: "https://www.pref.miyazaki.lg.jp/kenko/hoken/kansensho/covid19/hassei.html"
+URL: "https://www.pref.miyazaki.lg.jp/covid-19/index.html"
 SELENIUM_HOST: localhost
 TZ: Asia/Tokyo
 run: |
1 change: 1 addition & 0 deletions .ruby-version
@@ -0,0 +1 @@
+2.5.8
12 changes: 5 additions & 7 deletions Gemfile.lock
@@ -1,13 +1,11 @@
 GEM
 remote: https://rubygems.org/
 specs:
-childprocess (2.0.0)
-rake (< 13.0)
-rake (12.3.3)
-rubyzip (1.3.0)
-selenium-webdriver (3.142.4)
-childprocess (>= 0.5, < 3.0)
-rubyzip (~> 1.2, >= 1.2.2)
+childprocess (3.0.0)
+rubyzip (2.3.0)
+selenium-webdriver (3.142.7)
+childprocess (>= 0.5, < 4.0)
+rubyzip (>= 1.2.2)

 PLATFORMS
 ruby
6 changes: 0 additions & 6 deletions components/WhatsNew.vue
@@ -16,12 +16,6 @@
 target="_blank"
 rel="noopener"
 >
-<time
-class="WhatsNew-list-item-anchor-time px-2"
-:datetime="formattedDate(item.date)"
->
-{{ item.date }}
-</time>
 <span class="WhatsNew-list-item-anchor-link">
 <!-- @todo needs conversion to t-i18n -->
 {{ item.text }}
7 changes: 4 additions & 3 deletions data/data.json
@@ -151,7 +151,7 @@
 },
 "main_summary": {
 "attr": "検査実施人数",
-"value": 1609,
+"value": 4431,
 "children": [
 {
 "attr": "陽性患者数",
@@ -269,7 +269,8 @@
 },
 "inspections_summary": {
 "date": "2020/03/27 00:00",
-"data": {},
+"data": {
+},
 "labels": []
 }
-}
+}
42 changes: 36 additions & 6 deletions data/news.json
@@ -1,14 +1,44 @@
 {
 "newsItems": [
 {
-"date": "2020年5月29日",
-"url": "https://www.pref.miyazaki.lg.jp/sogoseisaku/kenko/hoken/20200529_message.html",
-"text": "新型コロナウイルス感染症に関する知事メッセージ(令和2年5月29日)"
+"url": "https://www.pref.miyazaki.lg.jp/kansensho-taisaku/covid-19/hassei_list.html",
+"text": "新型コロナウイルス感染症患者(142~157例目)の発生について(令和2年7月31日)"
 },
 {
-"date": "2020年5月29日",
-"url": "https://www.pref.miyazaki.lg.jp/kansensho-taisaku/kenko/hoken/kinkyujitaisengen_covid19.html",
-"text": "「新しい生活様式」の実践へ"
+"url": "https://www.pref.miyazaki.lg.jp/fukushihoken/covid-19/jigyosha/20200730202201.html",
+"text": "「休業要請等」を県下全域に拡大します(8月1日~)"
+},
+{
+"url": "https://www.pref.miyazaki.lg.jp/jinji/covid-19/chiji/20200801_message.html",
+"text": "【知事メッセージ】知事部局職員の新型コロナウイルス感染について(令和2年8月1日)"
+},
+{
+"url": "https://www.pref.miyazaki.lg.jp/kohosenryaku/covid-19/chiji/20200730.html",
+"text": "【知事記者会見】「休業要請等」の県下全域への拡大について(令和2年7月30日)"
+},
+{
+"url": "https://www.pref.miyazaki.lg.jp/kansensho-taisaku/covid-19/yobo/hassei.html",
+"text": "県内の感染状況(警戒レベル)及び注意すべき県外の地域"
+},
+{
+"url": "https://www.pref.miyazaki.lg.jp/kansensho-taisaku/covid-19/yobo/20200726104446.html",
+"text": "高鍋町内の接待を伴う飲食店の従業員やご利用された方の相談先"
+},
+{
+"url": "https://www.pref.miyazaki.lg.jp/ky-somu/covid-19/torikumi/20200725163141.html",
+"text": "県立学校における新型コロナウイルス感染症対策の対応について(令和2年7月30日時点)"
+},
+{
+"url": "https://www.pref.miyazaki.lg.jp/choju/covid-19/jigyosha/20200713095950.html",
+"text": "新型コロナウイルス感染症対応従事者等慰労金(医療・介護・障害福祉分)について"
+},
+{
+"url": "https://www.pref.miyazaki.lg.jp/kansensho-taisaku/covid-19/yobo/20200623app.html",
+"text": "新型コロナウイルス接触確認アプリ(COCOA)を利用しましょう"
+},
+{
+"url": "https://www.pref.miyazaki.lg.jp/sogoseisaku/kense/sogoseisaku/20200511164006.html",
+"text": "新型コロナ宮崎復興応援寄附金"
 }
 ]
 }
33 changes: 24 additions & 9 deletions scrapingSource/scraping.rb
@@ -4,26 +4,41 @@

 # Scraping
 driver.navigate.to(ENV['URL'])
-if (driver.find_elements(:class => "list_table").size == 0)
+if (driver.find_elements(:class => "info_list").size == 0)
+puts "no info_list"
 exit
 end
-list_table = driver.find_element(:class => "list_table")
-dates = list_table.find_elements(:class => "date")
-urls = list_table.find_elements(:tag_name => "a")
-texts = list_table.find_elements(:tag_name => "a")
-count = dates.length - 1
+infoList = driver.find_element(:class => "info_list")
+scrapedNews = infoList.find_elements(:tag_name => "a")
+count = scrapedNews.length - 1
 newsItems = []
 for i in 0..count do
-newsItem = { "date" => dates[i].text, "url" => urls[i].attribute("href"), "text" => texts[i].text }
+newsItem = { "url" => scrapedNews[i].attribute("href"), "text" => scrapedNews[i].text }
 newsItems.push(newsItem)
 end
 news = { "newsItems" => newsItems }
 puts news

 # JSON output
-news_json = JSON.pretty_generate(news, {:indent => " "})
+newsJson = JSON.pretty_generate(news, {:indent => " "})
 File.open("data/news.json", mode = "w") { |f|
-f.write(news_json)
+f.write(newsJson)
 }

+pcrTable = driver.find_elements(:class => "datatable").last
+rows = pcrTable.find_elements(:tag_name => "tr").last
+total = rows.find_element(:tag_name => "td").find_element(:tag_name => "p")
+
+dataHash = {}
+File.open("data/data.json") do |file|
+dataHash = JSON.load(file)
+end
+
+dataHash["main_summary"]["value"] = total.text.delete(",").to_i
+
+dataJson = JSON.pretty_generate(dataHash, {:indent => " "})
+File.open("data/data.json", mode = "w") { |f|
+f.write(dataJson)
+}
+
 exit
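
The two transformations in the new scraping.rb (one hash per scraped anchor, and a comma-separated total converted to an integer) can be tried without a browser. The following is only a minimal sketch under stated assumptions, not the repository's code: `FakeAnchor`, `build_news_items`, and `parse_total` are hypothetical names introduced here, the example.com URLs are placeholders, and `FakeAnchor` merely mimics the `attribute("href")` and `text` calls that the diff makes on Selenium elements.

```ruby
require "json"

# Hypothetical stand-in for a Selenium anchor element. It only responds to
# `attribute(name)` and `text`, the two calls scraping.rb makes on each link.
FakeAnchor = Struct.new(:href, :text) do
  def attribute(name)
    name == "href" ? href : nil
  end
end

# Hypothetical helper: build the newsItems array with map instead of an
# index-based for loop. Each item carries only "url" and "text".
def build_news_items(anchors)
  anchors.map do |anchor|
    { "url" => anchor.attribute("href"), "text" => anchor.text }
  end
end

# Hypothetical helper: turn a scraped total such as "4,431" into an Integer
# by removing the thousands separators first.
def parse_total(text)
  text.delete(",").to_i
end

# Placeholder data, not taken from the scraped page.
anchors = [
  FakeAnchor.new("https://example.com/a", "First notice"),
  FakeAnchor.new("https://example.com/b", "Second notice")
]
news = { "newsItems" => build_news_items(anchors) }
puts JSON.pretty_generate(news)

# Same update shape as the diff applies to data/data.json, on an in-memory hash.
data = { "main_summary" => { "value" => 1609 } }
data["main_summary"]["value"] = parse_total("4,431")
puts data["main_summary"]["value"]
```

Using `map` avoids the `length - 1` bookkeeping of the index loop and yields an empty array when no anchors are found.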
