前言:
上一篇文章Python|爬虫和测试|selenium框架的安装和初步使用(一)_晚风_END的博客-CSDN博客 大概介绍了一下selenium的安装和初步使用,主要是打开某个网站的主页,基本是最基础的东西,那么,这篇文章里就写一点更加深入的东西吧。
主要是介绍比如,selenium网页刷新,模拟登录csdn,元素定位等等内容
一,
无头浏览器
什么是无头浏览器呢?其实就是selenium后台启动一个浏览器,该浏览器看不到,以节约测试用机的资源。
options.add_argument("headless")主要是这个,其次是截图,截图保存在了d盘,否则不知道是否确实运行了
#codding=utf-8
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
options = Options()
options.binary_location = "C:\\Users\\Administrator\\Desktop\\chrome\\Chrome-bin\\chrome.exe"
options.add_experimental_option("detach", True)
options.add_argument("headless")
path=Service('f:\\chromedriver.exe')
driver = webdriver.Chrome(options=options,service=path)
# 截图预览
driver.get("https://www.csdn.net")
driver.get_screenshot_as_file('d:\\截图.png')
二,
刷新页面
关闭无头,以在前台观察是否确实刷新,增加刷新代码,主要是driver.refresh()方法
#codding=utf-8
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
options = Options()
options.binary_location = "C:\\Users\\Administrator\\Desktop\\chrome\\Chrome-bin\\chrome.exe"
options.add_experimental_option("detach", True)
#options.add_argument("headless")
path=Service('f:\\chromedriver.exe')
driver = webdriver.Chrome(options=options,service=path)
# 截图预览
driver.get("https://www.csdn.net")
time.sleep(2)
driver.get_screenshot_as_file('d:\\截图.png')
try:
# 刷新页面
driver.refresh()
print('刷新页面')
except Exception as e:
print('刷新失败')
执行完毕后,cmd的截图:
表明确实刷新了页面
三,
csdn首页输入框输入指定字符
#codding=utf-8
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
options = Options()
options.binary_location = "C:\\Users\\Administrator\\Desktop\\chrome\\Chrome-bin\\chrome.exe"
options.add_experimental_option("detach", True)
#options.add_argument("headless")
path=Service('f:\\chromedriver.exe')
driver = webdriver.Chrome(options=options,service=path)
# 截图预览
driver.get("https://www.csdn.net")
time.sleep(2)
driver.get_screenshot_as_file('d:\\截图.png')
try:
# 刷新页面
driver.refresh()
print('刷新页面')
except Exception as e:
print('刷新失败')
print(driver.page_source)
driver.find_element(By.XPATH,'//*[@id="toolbar-search-input"]').send_keys('fuck')
运行结果如下:
四,
点击搜索
#codding=utf-8
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
options = Options()
options.binary_location = "C:\\Users\\Administrator\\Desktop\\chrome\\Chrome-bin\\chrome.exe"
options.add_experimental_option("detach", True)
#options.add_argument("headless")
path=Service('f:\\chromedriver.exe')
driver = webdriver.Chrome(options=options,service=path)
# 截图预览
driver.get("https://www.csdn.net")
time.sleep(2)
driver.get_screenshot_as_file('d:\\截图.png')
try:
# 刷新页面
driver.refresh()
print('刷新页面')
except Exception as e:
print('刷新失败')
print(driver.page_source)
driver.find_element(By.XPATH,'//*[@id="toolbar-search-input"]').send_keys('fuck')
driver.find_element(By.XPATH,'//*[@id="toolbar-search-button"]').click()
网页源代码内相关内容如下:
运行结果如下:
说我没有登录,OK,这就登录一下
五,
selenium登录csdn
#codding=utf-8
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
options = Options()
options.binary_location = "C:\\Users\\Administrator\\Desktop\\chrome\\Chrome-bin\\chrome.exe"
options.add_experimental_option("detach", True)
#options.add_argument("headless")
path=Service('f:\\chromedriver.exe')
driver = webdriver.Chrome(options=options,service=path)
# 截图预览
driver.get("https://www.csdn.net")
time.sleep(2)
driver.get_screenshot_as_file('d:\\截图.png')
try:
# 刷新页面
driver.refresh()
print('刷新页面')
except Exception as e:
print('刷新失败')
print(driver.page_source)
driver.find_element(By.XPATH,'//*[@id="toolbar-search-input"]').send_keys('fuck')
#driver.find_element(By.XPATH,'//*[@id="toolbar-search-button"]').click()
login=driver.find_element(By.XPATH, "//*[@class='toolbar-btn-loginfun']")
#login=driver.find_element_by_class_name('toolbar-btn-loginfun')
login.click()
网页源代码相关内容如下:
运行效果如下:
很显然,在用户中心登录不是一个好主意,因此,我们更换为使用用户登录中心,也就是更换网址为:https://passport.csdn.net/login?code=public%27
同时,我们需要抓取这个网页的前端源代码,登录相关的部分如下:
</span></div></div> <div class="passport-main"><div class="welcome_tips"><span>终于等到你~</span> <img src="https://csdnimg.cn/release/passport_fe/assets/images/wel_tips.5624828.png"></div> <div data-v-c8607eae="" class="login-box"><div data-v-c8607eae="" class="login-box-top"><div data-v-c8607eae="" class="login-box-tabs"><div data-v-c8607eae="" class="login-box-tabs-items"><span data-v-c8607eae="" id="last-login" class="last-login-way" style="display: none;">上次登录</span> <!----> <span data-v-c8607eae="" class="">微信登录</span> <!----> <span data-v-c8607eae="" class="">免密登录</span> <span data-v-c8607eae="" class="tabs-active">密码登录</span></div> <div data-v-c8607eae="" class="login-box-tabs-main"><!----> <div data-v-e5be92b8="" data-v-c8607eae="" class="login-form"><div data-v-e5be92b8="" class="login-form-item"><div data-v-4cb3a723="" data-v-e5be92b8="" class="base-input"><input data-v-4cb3a723="" autocomplete="username" placeholder="手机号/邮箱/用户名" type="text" class="base-input-text"> <span data-v-4cb3a723="" class="base-input-icon base-input-icon-clear" style="display: none;"></span> <!----> <!----></div></div> <div data-v-e5be92b8="" class="login-form-item"><div data-v-4cb3a723="" data-v-e5be92b8="" class="base-input"><!----> <input data-v-4cb3a723="" autocomplete="current-password" placeholder="密码" type="password" class="base-input-text" style="width: calc(100% - 16px);"> <!----> <span data-v-4cb3a723="" class="base-input-icon base-input-icon-password"></span> <!----></div></div> <div data-v-e5be92b8="" class="login-form-item-tips"><span data-v-e5be92b8="" class="login-form-error" style="display: none;"></span> <a data-v-e5be92b8="" target="_blank" data-report-click="{"spm": "3001.6552"}" href="https://passport.csdn.net/forget" class="login-form-link">忘记密码
</a></div> <div data-v-e5be92b8="" class="login-form-item"><button data-v-23f9b684="" data-v-e5be92b8="" disabled="disabled" class="base-button">登录</button></div></div></div></div>
根据以上内容,得出如下登录代码:
###注placeholder="手机号/邮箱/用户名 和 placeholder="密码" 以及class="base-button" 是关键的定位元素
#codding=utf-8
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
options = Options()
options.binary_location = "C:\\Users\\Administrator\\Desktop\\chrome\\Chrome-bin\\chrome.exe"
options.add_experimental_option("detach", True)
#options.add_argument("headless")
path=Service('f:\\chromedriver.exe')
driver = webdriver.Chrome(options=options,service=path)
# 截图预览
driver.get("https://passport.csdn.net/login?code=public%27")
time.sleep(2)
driver.get_screenshot_as_file('d:\\截图.png')
try:
# 刷新页面
driver.refresh()
print('刷新页面')
except Exception as e:
print('刷新失败')
#print(driver.page_source)
#选择密码登录方式
login = driver.find_element('xpath',"//span[contains(text(),'密码登录')]")
time.sleep(2)
login.click()
print(driver.page_source)
#输入用户名
driver.find_element(By.XPATH,'//*[@placeholder="手机号/邮箱/用户名"]').send_keys('自己的用户名')
#输入密码
driver.find_element(By.XPATH,'//*[@placeholder="密码"]').send_keys('自己的密码')
#点击登录
time.sleep(5)
driver.find_element(By.XPATH,'//*[@class="base-button"]').click()
OK,就这么简简单单的可以登录csdn了,不过需要注意,账号不能有异常,否则会出验证码,就登录不了了。