Use find_all to get the div elements whose class is content (filtering with attrs)
from bs4 import BeautifulSoup
html_content = '''
<div class="content">Test 01</div>
<p class="content">Test 02</p>
<div>Test 03</div>
'''
soup = BeautifulSoup(html_content, 'html.parser')
# name restricts the tag type; attrs restricts the attribute values
for element in soup.find_all(name='div', attrs={'class': 'content'}):
    print('Element:', element)
Output:
Element: <div class="content">Test 01</div>
Use find_all to get the div elements whose class is content (filtering with class_)
from bs4 import BeautifulSoup
html_content = '''
<div class="content">Test 01</div>
<p class="content">Test 02</p>
<div>Test 03</div>
'''
soup = BeautifulSoup(html_content, 'html.parser')
# class_ has a trailing underscore because class is a reserved word in Python
for element in soup.find_all(name='div', class_='content'):
    print('Element:', element)
Output:
Element: <div class="content">Test 01</div>
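Both filters above match on a single class name, so an element carrying several classes should still be found as long as one of them is content. The following is a minimal sketch of that behavior; the extra class name main and the variable names are made up for illustration and do not come from the original examples.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first div carries two classes.
html_content = '''
<div class="content main">Test 01</div>
<div class="content">Test 02</div>
<div class="main">Test 03</div>
'''

soup = BeautifulSoup(html_content, 'html.parser')

# Both calls match any div that has "content" among its classes.
by_attrs = soup.find_all(name='div', attrs={'class': 'content'})
by_class = soup.find_all(name='div', class_='content')

texts_by_attrs = [el.get_text() for el in by_attrs]
texts_by_class = [el.get_text() for el in by_class]

print(texts_by_attrs)
print(texts_by_class)
```

In this sketch the two lists should be identical, which suggests the choice between attrs and class_ is mostly a matter of readability.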
Use find_all to get every element whose class is content, not only div
from bs4 import BeautifulSoup
html_content = '''
<div class="content">Test 01</div>
<p class="content">Test 02</p>
<div>Test 03</div>
'''
soup = BeautifulSoup(html_content, 'html.parser')
# Omitting name means the tag type is not restricted
for element in soup.find_all(class_='content'):
    print('Element:', element)
Output:
Element: <div class="content">Test 01</div>
Element: <p class="content">Test 02</p>
Use select to get the div elements whose class is content
from bs4 import BeautifulSoup
html_content = '''
<div class="content">Test 01</div>
<p class="content">Test 02</p>
<div>Test 03</div>
'''
soup = BeautifulSoup(html_content, 'html.parser')
# [class='content'] compares the whole attribute value;
# the selector div.content would match any div that has the class
for element in soup.select("div[class='content']"):
    print('Element:', element)
Output:
Element: <div class="content">Test 01</div>
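The attribute selector div[class='content'] and the class selector div.content give the same result on the markup above, but they can differ once an element has more than one class. Here is a small sketch of that difference; the second class name main and the variable names are assumptions added for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second div carries two classes.
html_content = '''
<div class="content">Test 01</div>
<div class="content main">Test 02</div>
<p class="content">Test 03</p>
'''

soup = BeautifulSoup(html_content, 'html.parser')

# Attribute selector: the class attribute must equal "content" exactly.
exact_matches = [el.get_text() for el in soup.select("div[class='content']")]

# Class selector: "content" only needs to be one of the classes.
class_matches = [el.get_text() for el in soup.select('div.content')]

print(exact_matches)
print(class_matches)
```

If the goal is "any div with this class", div.content is usually the safer selector; the attribute form fits when the exact attribute value matters.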
Use select to get every element whose class is content, not only div
from bs4 import BeautifulSoup
html_content = '''
<div class="content">Test 01</div>
<p class="content">Test 02</p>
<div>Test 03</div>
'''
soup = BeautifulSoup(html_content, 'html.parser')
# .content is a CSS class selector with no tag restriction
for element in soup.select(".content"):
    print('Element:', element)
Output:
Element: <div class="content">Test 01</div>
Element: <p class="content">Test 02</p>
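When only the first match is needed, find and select_one can stand in for find_all and select. The sketch below reuses the same markup as the examples above; the selector .missing and the variable names are hypothetical and are only there to show what happens when nothing matches.

```python
from bs4 import BeautifulSoup

html_content = '''
<div class="content">Test 01</div>
<p class="content">Test 02</p>
<div>Test 03</div>
'''

soup = BeautifulSoup(html_content, 'html.parser')

# Each call returns the first matching element in document order.
first_by_find = soup.find(class_='content')
first_by_select = soup.select_one('.content')

# No element has the class "missing", so the result is None.
missing = soup.select_one('.missing')

print(first_by_find.name, first_by_find.get_text())
print(first_by_select.name, first_by_select.get_text())
print(missing)
```

Because a failed lookup returns None rather than an empty list, it is worth checking the result before accessing attributes on it.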