Python_Scraping


Created: 2022/06/16

■ Book
スクレイピング・ハッキング・ラボ　Pythonで自動化する未来型生活 (技術の泉シリーズ（NextPublishing）)

0616

Read up to 17%
Resume from: launching jupyter-notebook

0621

sudo -H pip3 install jupyter
Reference
https://degitalization.hatenablog.jp/entry/2020/12/07/154628

0622

Arrays
Read up to 31%

0627

Fetching entry titles from Hatena Bookmark

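import requests
from bs4 import BeautifulSoup

url = "https://b.hatena.ne.jp/"

# Fetch the Hatena Bookmark top page
response = requests.get(url)

# Parse the HTML and narrow the search to the first entry-list section
soup = BeautifulSoup(response.content, "html.parser")
top_entry = soup.find("section", attrs={"class": "entrylist-unit"})
# Each entry heading is an h3; its link carries the title in the "title" attribute
entries = top_entry.find_all("h3", attrs={"class": "entrylist-contents-title"})
for entry in entries:
    print(entry.find("a").get("title"))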
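
The snippet above depends on the live page, so it can be handy to check the extraction logic offline. The following is a minimal sketch, assuming markup that uses the same class names as above. `SAMPLE_HTML` and `extract_titles` are hypothetical names made up for illustration, and the real page structure may differ or change over time.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mimics the class names used in the notes;
# it is NOT a copy of the real page.
SAMPLE_HTML = """
<section class="entrylist-unit">
  <h3 class="entrylist-contents-title"><a href="/entry/1" title="First entry">First entry</a></h3>
  <h3 class="entrylist-contents-title"><a href="/entry/2" title="Second entry">Second entry</a></h3>
  <h3 class="entrylist-contents-title">No link here</h3>
</section>
"""


def extract_titles(html):
    """Return the title attribute of each entry link, skipping incomplete entries."""
    soup = BeautifulSoup(html, "html.parser")
    top_entry = soup.find("section", attrs={"class": "entrylist-unit"})
    if top_entry is None:  # section not found, e.g. the layout changed
        return []
    titles = []
    for heading in top_entry.find_all("h3", attrs={"class": "entrylist-contents-title"}):
        link = heading.find("a")
        # Skip headings without a link or without a title attribute
        if link is not None and link.get("title"):
            titles.append(link.get("title"))
    return titles


print(extract_titles(SAMPLE_HTML))  # ['First entry', 'Second entry']
```

Returning an empty list when the section is missing avoids the `AttributeError` that calling `find_all` on `None` would raise.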

Read up to 35%

0630

Fetching titles & user counts

import requests
from bs4 import BeautifulSoup

url = "https://b.hatena.ne.jp/"

# Fetch the Hatena Bookmark top page
response = requests.get(url)

# Parse the HTML and collect the container div of each entry
soup = BeautifulSoup(response.content, "html.parser")
top_entry = soup.find("section", attrs={"class": "entrylist-unit"})
entries = top_entry.find_all("div", attrs={"class": "entrylist-contents"})

for entry in entries:
    # The title is stored in the "title" attribute of the link inside the h3
    title_tag = entry.find("h3", attrs={"class": "entrylist-contents-title"})
    title = title_tag.find("a").get("title")
    # The user count is the text of the users span
    users_tag = entry.find("span", attrs={"class": "entrylist-contents-users"})
    users = users_tag.get_text().strip()
    # category = entry.find("a", attrs={"class": "entrylist-contents-category"})

    print(f"h3 title: {title}")
    print(users)
    # print(category)
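
The `users` value above is plain text, so it has to be converted to a number before entries can be sorted or compared. Here is a minimal sketch, assuming the text looks something like `35 users` (the exact format on the page is an assumption). `parse_user_count` and the `scraped` list are hypothetical, standing in for values obtained by the loop above.

```python
import re


def parse_user_count(text):
    """Convert text such as '1,234 users' to an int; return None if no number is found."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    # Drop thousands separators before converting
    return int(match.group().replace(",", ""))


# Hypothetical (title, users text) pairs standing in for scraped values.
scraped = [("Entry A", " 35 users "), ("Entry B", "1,204 users"), ("Entry C", "1 user")]

entries = [{"title": t, "users": parse_user_count(u)} for t, u in scraped]
# Sort by user count, most bookmarked first
entries.sort(key=lambda e: e["users"], reverse=True)
for e in entries:
    print(e["title"], e["users"])
```

Keeping the parsing in its own function makes it easy to adjust if the page formats the count differently.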

Read up to 36%

Kazuma