爬虫入门到精通_框架篇14(PySpider架构概述及用法详解)

官方文档
Sample Code:

from pyspider.libs.base_handler import *


class Handler(BaseHandler):
    crawl_config = {
    }

	# minutes=24 * 60:每隔一天重新爬取
    @every(minutes=24 * 60)
    def on_start(self):
        self.crawl('http://scrapy.org/', callback=self.index_page)

   # age:过期时间
    @config(age=10 * 24 * 60 * 60)
    def index_page(self, response):
        for each in response.doc('a[href^="http"]').items():
            self.crawl(each.attr.href, callback=self.detail_page)

    def detail_page(self, response):
        return {
            "url": response.url,
            "title": response.doc('title').text(),
        }

1 架构(Architecture)

在这里插入图片描述
scheduler:调度器
feter:请求器,请求网页
processor:用来处理数据,处理的url可以重新调用调度器
monitor&webui:监视器与webUI

scheduler

维护request队列,默认是存储在本地的数据库

feter

在这里插入图片描述
可以用两种模式:一种是直接请求,一种是js的请求。

processor

处理数据,将url重新封装起来,可以重新调用调度器。

2 Command Line

Global Config

global options work for all subcommands.

Usage: pyspider [OPTIONS] COMMAND [ARGS]...

  A powerful spider system in python.

Options:
  -c, --config FILENAME    a json file with default values for subcommands.
                           {“webui”: {“port”:5001}}
  --logging-config TEXT    logging config file for built-in python logging
                           module  [default: pyspider/pyspider/logging.conf]
  --debug                  debug mode
  --queue-maxsize INTEGER  maxsize of queue
  --taskdb TEXT            database url for taskdb, default: sqlite
  --projectdb TEXT         database url for projectdb, default: sqlite
  --resultdb TEXT          database url for resultdb, default: sqlite
  --message-queue TEXT     connection url to message queue, default: builtin
                           multiprocessing.Queue
  --amqp-url TEXT          [deprecated] amqp url for rabbitmq. please use
                           --message-queue instead.
  --beanstalk TEXT         [deprecated] beanstalk config for beanstalk queue.
                           please use --message-queue instead.
  --phantomjs-proxy TEXT   phantomjs proxy ip:port
  --data-path TEXT         data dir path
  --version                Show the version and exit.
  --help                   Show this message and exit.

config

pyspider启动的时候的一些配置选项

{
  "taskdb": "mysql+taskdb://username:password@host:port/taskdb",
  "projectdb": "mysql+projectdb://username:password@host:port/projectdb",
  "resultdb": "mysql+resultdb://username:password@host:port/resultdb",
  "message_queue": "amqp://username:password@host:port/%2F",
  "webui": {
    "username": "some_name",
    "password": "some_passwd",
    "need-auth": true
  }
}

项目启动时,会生成上述3个db,在当前项目的data目录下
在这里插入图片描述
webui:指定好界面访问的用户名密码
在这里插入图片描述
在这里插入图片描述
此时弹出:
在这里插入图片描述

all

运行所有的组件

Usage: pyspider all [OPTIONS]

  Run all the components in subprocess or thread

Options:
  --fetcher-num INTEGER         instance num of fetcher
  --processor-num INTEGER       instance num of processor
  --result-worker-num INTEGER   instance num of result worker
  --run-in [subprocess|thread]  run each components in thread or subprocess.
                                always using thread for windows.
  --help                        Show this message and exit.

在这里插入图片描述

one

在一个进程下运行所有的组件

Usage: pyspider one [OPTIONS] [SCRIPTS]...

  One mode not only means all-in-one, it runs every thing in one process
  over tornado.ioloop, for debug purpose

Options:
  -i, --interactive  enable interactive mode, you can choose crawl url.
  --phantomjs        enable phantomjs, will spawn a subprocess for phantomjs
  --help             Show this message and exit.

bench

请求测试

Usage: pyspider bench [OPTIONS]

  Run Benchmark test. In bench mode, in-memory sqlite database is used
  instead of on-disk sqlite database.

Options:
  --fetcher-num INTEGER         instance num of fetcher
  --processor-num INTEGER       instance num of processor
  --result-worker-num INTEGER   instance num of result worker
  --run-in [subprocess|thread]  run each components in thread or subprocess.
                                always using thread for windows.
  --total INTEGER               total url in test page
  --show INTEGER                show how many urls in a page
  --help                        Show this message and exit.

已下组件都是单独开启

scheduler

Usage: pyspider scheduler [OPTIONS]

  Run Scheduler, only one scheduler is allowed.

Options:
  --xmlrpc / --no-xmlrpc
  --xmlrpc-host TEXT
  --xmlrpc-port INTEGER
  --inqueue-limit INTEGER  size limit of task queue for each project, tasks
                           will been ignored when overflow
  --delete-time INTEGER    delete time before marked as delete
  --active-tasks INTEGER   active log size
  --loop-limit INTEGER     maximum number of tasks due with in a loop
  --scheduler-cls TEXT     scheduler class to be used.
  --help                   Show this message and exit.

phantomjs

Usage: run.py phantomjs [OPTIONS] [ARGS]...

  Run phantomjs fetcher if phantomjs is installed.

Options:
  --phantomjs-path TEXT  phantomjs path
  --port INTEGER         phantomjs port
  --auto-restart TEXT    auto restart phantomjs if crashed
  --help                 Show this message and exit.

fetcher

Usage: pyspider fetcher [OPTIONS]

  Run Fetcher.

Options:
  --xmlrpc / --no-xmlrpc
  --xmlrpc-host TEXT
  --xmlrpc-port INTEGER
  --poolsize INTEGER      max simultaneous fetches
  --proxy TEXT            proxy host:port
  --user-agent TEXT       user agent
  --timeout TEXT          default fetch timeout
  --fetcher-cls TEXT      Fetcher class to be used.
  --help                  Show this message and exit.

processor

Usage: pyspider processor [OPTIONS]

  Run Processor.

Options:
  --processor-cls TEXT  Processor class to be used.
  --help                Show this message and exit.

result_worker

Usage: pyspider result_worker [OPTIONS]

  Run result worker.

Options:
  --result-cls TEXT  ResultWorker class to be used.
  --help             Show this message and exit.

webui

Usage: pyspider webui [OPTIONS]

  Run WebUI

Options:
  --host TEXT            webui bind to host
  --port INTEGER         webui bind to host
  --cdn TEXT             js/css cdn server
  --scheduler-rpc TEXT   xmlrpc path of scheduler
  --fetcher-rpc TEXT     xmlrpc path of fetcher
  --max-rate FLOAT       max rate for each project
  --max-burst FLOAT      max burst for each project
  --username TEXT        username of lock -ed projects
  --password TEXT        password of lock -ed projects
  --need-auth            need username and password
  --webui-instance TEXT  webui Flask Application instance to be used.
  --help                 Show this message and exit.

在这里插入图片描述
在这里插入图片描述

3 API Reference

self.crawl(url, **kwargs)

self.crawl is the main interface to tell pyspider which url(s) should be crawled.
发起请求的函数

Parameters:

url

the url or url list to be crawled.
请求的网页

callback

the method to parse the response. _default: call _
回调函数

def on_start(self):
    self.crawl('http://scrapy.org/', callback=self.index_page)

the following parameters are optional

age

the period of validity of the task. The page would be regarded as not modified during the period. default: -1(never recrawl)
请求的过期时间,单位秒,过期了才重新请求

@config(age=10 * 24 * 60 * 60)
def index_page(self, response):
    ...

Every pages parsed by the callback index_page would be regarded not changed within 10 days. If you submit the task within 10 days since last crawled it would be discarded.

priority

the priority of task to be scheduled, higher the better. default: 0
指定爬取优先级.

def index_page(self):
    self.crawl('http://www.example.org/page2.html', callback=self.index_page)
    self.crawl('http://www.example.org/233.html', callback=self.detail_page,
               priority=1)

The page 233.html would be crawled before page2.html. Use this parameter can do a BFS and reduce the number of tasks in queue(which may cost more memory resources).

exetime

the executed time of task in unix timestamp. default: 0(immediately)
指定执行时间

import time
def on_start(self):
    self.crawl('http://www.example.org/', callback=self.callback,
               exetime=time.time()+30*60)

The page would be crawled 30 minutes later.

retries

retry times while failed. default: 3
失败之后的重试次数,默认时3,最大10.

itag

a marker from frontier page to reveal the potential modification of the task. It will be compared to its last value, recrawl when it’s changed. default: None
标识符,

def index_page(self, response):
    for item in response.doc('.item').items():
        self.crawl(item.find('a').attr.url, callback=self.detail_page,
                   itag=item.find('.update-time').text())

In the sample, .update-time is used as itag. If it’s not changed, the request would be discarded.
如果没有改变,不进行重新爬取

Or you can use itag with Handler.crawl_config to specify the script version if you want to restart all of the tasks.

class Handler(BaseHandler):
    crawl_config = {
        'itag': 'v223'
    }

Change the value of itag after you modified the script and click run button again. It doesn’t matter if not set before.

auto_recrawl

when enabled, task would be recrawled every age time. default: False
自动重爬,如果开启,age过期后自动重爬。

def on_start(self):
    self.crawl('http://www.example.org/', callback=self.callback,
               age=5*60*60, auto_recrawl=True)

The page would be restarted every age 5 hours.

method

HTTP method to use. default: GET
HTTP的请求方法。

params

dictionary of URL parameters to append to the URL.
get请求的一些参数.

def on_start(self):
    self.crawl('http://httpbin.org/get', callback=self.callback,
               params={'a': 123, 'b': 'c'})
    self.crawl('http://httpbin.org/get?a=123&b=c', callback=self.callback)

The two requests are the same.

data

the body to attach to the request. If a dictionary is provided, form-encoding will take place.
post请求的一些data.

def on_start(self):
    self.crawl('http://httpbin.org/post', callback=self.callback,
               method='POST', data={'a': 123, 'b': 'c'})

files

dictionary of {field: {filename: ‘content’}} files to multipart upload.
上传的文件.

user_agent

the User-Agent of the request

headers

dictionary of headers to send.

cookies

dictionary of cookies to attach to this request.

connect_timeout

timeout for initial connection in seconds. default: 20

timeout

maximum time in seconds to fetch the page. default: 120

allow_redirects

follow 30x redirect default: True
一些网页的302的跳转是否重定向设置.

validate_cert

For HTTPS requests, validate the server’s certificate? default: True
HTTPS的证书请求忽略.

proxy

proxy server of username:password@hostname:port to use, only http proxy is supported currently.

class Handler(BaseHandler):
    crawl_config = {
        'proxy': 'localhost:8080'
    }

Handler.crawl_config can be used with proxy to set a proxy for whole project.
设置代理.

etag

use HTTP Etag mechanism to pass the process if the content of the page is not changed. default: True
判断页面更新爬取情况.

last_modified

use HTTP Last-Modified header mechanism to pass the process if the content of the page is not changed. default: True

fetch_type

set to js to enable JavaScript fetcher. default: None
设置为js请求.渲染查找后的页面.

js_script

JavaScript run before or after page loaded, should been wrapped by a function like function() { document.write(“binux”); }.
网页爬取后,执行脚本.

def on_start(self):
    self.crawl('http://www.example.org/', callback=self.callback,
               fetch_type='js', js_script='''
               function() {
                   window.scrollTo(0,document.body.scrollHeight);
                   return 123;
               }
               ''')

The script would scroll the page to bottom. The value returned in function could be captured via Response.js_script_result.

js_run_at

run JavaScript specified via js_script at document-start or document-end. default: document-end
把脚本加到当我document前面还是后面.

js_viewport_width/js_viewport_height

set the size of the viewport for the JavaScript fetcher of the layout process.
JS视窗的大小.

load_images

load images when JavaScript fetcher enabled. default: False
加载时是否加载图片.

save

a object pass to the callback method, can be visit via response.save.
用来在多个函数直接传递变量的参数

def on_start(self):
    self.crawl('http://www.example.org/', callback=self.callback,
               save={'a': 123})

def callback(self, response):
    return response.save['a']

123 would be returned in callback

taskid

unique id to identify the task, default is the MD5 check code of the URL, can be overridden by method def get_taskid(self, task)
指定task的唯一标识码,ruquest队列的消息去重.

import json
from pyspider.libs.utils import md5string
def get_taskid(self, task):
    return md5string(task['url']+json.dumps(task['fetch'].get('data', '')))

Only url is md5 -ed as taskid by default, the code above add data of POST request as part of taskid.

force_update

force update task params even if the task is in ACTIVE status.
强制更新.

cancel

cancel a task, should be used with force_update to cancel a active task. To cancel an auto_recrawl task, you should set auto_recrawl=False as well.
取消任务.

@config(**kwargs)

default parameters of self.crawl when use the decorated method as callback. For example:
config里可以传递参数.

@config(age=15*60)
def index_page(self, response):
    self.crawl('http://www.example.org/list-1.html', callback=self.index_page)
    self.crawl('http://www.example.org/product-233', callback=self.detail_page)

@config(age=10*24*60*60)
def detail_page(self, response):
    return {...}

age of list-1.html is 15min while the age of product-233.html is 10days. Because the callback of product-233.html is detail_page, means it’s a detail_page so it shares the config of detail_page.

Handler.crawl_config = {}

default parameters of self.crawl for the whole project. The parameters in crawl_config for scheduler (priority, retries, exetime, age, itag, force_update, auto_recrawl, cancel) will be joined when the task created, the parameters for fetcher and processor will be joined when executed. You can use this mechanism to change the fetch config (e.g. cookies) afterwards.
将一些配置放入其中,使其全局生效.

class Handler(BaseHandler):
    crawl_config = {
        'headers': {
            'User-Agent': 'GoogleBot',
        }
    }

    ...

Response

Response.url

final URL.

Response.text

Content of response, in unicode.
返回网页源代码.
if Response.encoding is None and chardet module is available, encoding of content will be guessed.

Response.content

Content of response, in bytes.
返回网页源代码,二进制.

Response.doc

A PyQuery object of the response’s content. Links have made as absolute by default.
网页解析,调用 PyQuery的解析库.
Refer to the documentation of PyQuery: https://pythonhosted.org/pyquery/

It’s important that I will repeat, refer to the documentation of PyQuery: https://pythonhosted.org/pyquery/

Response.etree

A lxml object of the response’s content.
返回lxml对象

Response.json

The JSON-encoded content of the response, if any.
转换为Json格式.

Response.status_code

状态码:200,300等等.

Response.orig_url

If there is any redirection during the request, here is the url you just submit via self.crawl.
有重定向,就显示最原始的url.

Response.headers

A case insensitive dict holds the headers of response.

Response.cookies

Response.error

Messages when fetch error

Response.time

Time used during fetching.

Response.ok

True if status_code is 200 and no error.
判断请求是否OK.

Response.encoding

Encoding of Response.content.
用来指定编码
If Response.encoding is None, encoding will be guessed by header or content or chardet(if available).
Set encoding of content manually will overwrite the guessed encoding.

Response.save

The object saved by self.crawl API
指定不同函数直接传递参数.

Response.js_script_result

content returned by JS script
接受JS script的结果.

Response.raise_for_status()

Raise HTTPError if status code is not 200 or Response.error exists.
抛出一些错误.

self.send_message

部署过程中,消除队列的配置.

@every(minutes=0, seconds=0)

定时爬取中用的比较多.
method will been called every minutes or seconds

@every(minutes=24 * 60)
def on_start(self):
    for url in urllist:
        self.crawl(url, callback=self.index_page)

一天执行一次
The urls would be restarted every 24 hours. Note that, if age is also used and the period is longer then @every, the crawl request would be discarded as it’s regarded as not changed:

@every(minutes=24 * 60)
def on_start(self):
    self.crawl('http://www.example.org/', callback=self.index_page)

@config(age=10 * 24 * 60 * 60)
def index_page(self):
    ...

Even though the crawl request triggered every day, but it’s discard and only restarted every 10 days.

4 Frequently Asked Questions

Unreadable Code (乱码) Returned from Phantomjs

Phantomjs doesn’t support gzip, don’t set Accept-Encoding header with gzip.

How to Delete a Project?

set group to delete and status to STOP then wait 24 hours. You can change the time before a project deleted via
scheduler.DELETE_TIME.
status to STOP:
在这里插入图片描述
group to delete:
在这里插入图片描述

How to Restart a Project?

Why
It happens after you modified a script, and wants to crawl everything again with new strategy. But as the age of urls are not expired. Scheduler will discard all of the new requests.
Solution
1.Create a new project.
2.Using a itag within Handler.crawl_config to specify the version of your script.

What does the progress bar mean on the dashboard?

When mouse move onto the progress bar, you can see the explaintions.
For 5m, 1h, 1d the number are the events triggered in 5m, 1h, 1d. For all progress bar, they are the number of total tasks in correspond status.
Only the tasks in DEBUG/RUNNING status will show the progress.
在这里插入图片描述

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:/a/451665.html

如若内容造成侵权/违法违规/事实不符,请联系我们进行投诉反馈qq邮箱809451989@qq.com,一经查实,立即删除!

相关文章

week07day01(powerbi)

一. Power BI简介 1. 构成部分 power query: 进行简单的数据清洗power pivot : 进行指标计算power view : 进行报表视图 二. Power Query (进行数据清洗) 1. 如何获取数据: 点击获取数据 ——> 选择导入数据的类型——> 会出现 "加载&…

【刷题日志3.4--3.10】

绕过flag关键字od读取&#xff08;脚本&#xff09;空格过滤 [广东强网杯 2021 团队组]love_Pokemon <?php error_reporting(0); highlight_file(__FILE__); $dir sandbox/ . md5($_SERVER[REMOTE_ADDR]) . /;if(!file_exists($dir)){mkdir($dir); }function DefenderBon…

Pytorch搭建AlexNet 预测实现

1.导包 import torch import matplotlib.pyplot as plt import json from model import AlexNet from PIL import Image from torchvision import transforms 2.数据预处理 data_transform transforms.Compose([transforms.Resize((224, 224)), # 将图片重新裁剪transform…

【Android】 ClassLoader 知识点提炼

1.Java中的 ClassLoader 1.1 、ClassLoader的类型 Java 中的类加载器主要有两种类型&#xff0c;即系统类加载器和自定义类加载器。其中系统类 加载器包括3种&#xff0c;分别是 Bootstrap ClassLoader、Extensions ClassLoader 和 Application ClassLoader。 1.1.1.Bootstra…

uniapp开发DAPP钱包应用(二) Vue + Java

上一节我们讲了如何通过vue uniapp还有web3以及需要准备的相关组件&#xff0c;来搭建了DAPP开发的环境。 这一节&#xff0c;我们来说说如何用代码来实现DAPP相关接口。 1. ethers实现类 导入组件 import { ethers , providers , utils } from "ethers"; impor…

QML 控件添加键盘事件

在QML中&#xff0c;可以使用Keys类型来处理键盘事件。以下是一个简单的示例&#xff0c;演示如何在QML控件中添加键盘事件&#xff1a; import QtQuick 2.12 import QtQuick.Window 2.12Window {visible: truewidth: 640height: 480title: qsTr("Hello World")Recta…

DFS算法详解及例题

DFS:往深搜索&#xff0c;执着&#xff0c;确认从底下返回后的每个人的节点都已经用完&#xff0c;空间占用少&#xff0c;爆搜。 BFS:每一层搜索&#xff0c;稳重&#xff0c;(当一个图的权重都为1时搜到的一定是最短路) 下面我们以dfs的一道经典例题来讲解 代码: #include…

Ubuntu18.04 安装搜狗输入法

一. 概述 自己的Ubuntu 18.04系统配置中文搜狗输入法&#xff0c;安装步骤&#xff0c;亲测可用 二. 安装步骤 2.1 确认系统版本和CPU架构 查看Ubuntu系统版本号&#xff0c;通过命令 lsb_release -a wuubuntume:~$ lsb_release -a No LSB modules are available. Distr…

基于SpringBoot的“智慧食堂”系统(源码+数据库+文档+PPT)

基于SpringBootVue的智慧食堂系统的设计与实现&#xff08;源码数据库文档PPT) 开发语言&#xff1a;Java数据库&#xff1a;MySQL技术&#xff1a;SpringBootVUE工具&#xff1a;IDEA/Ecilpse、Navicat、Maven 系统展示 系统登录界面图 系统首页界面图 用户注册界面图 菜…

项目总结.

文章目录 1.DDD结构基础2. 抽奖领域3. 发奖领域4.活动领域5. 支撑领域应用层编排 1.DDD结构基础 包含&#xff1a; 接口层、应用层、领域层、基础层&#xff1b;通用包、接口层、接口定义。 接口层&#xff1a;实现RPC接口定义&#xff0c;引入应用层服务&#xff0c;封装具体的…

日期问题 刷题笔记

思路 枚举 19600101 到20591231这个区间的数 获得年月日 判断是否合法 如果合法 关于题目给出的日期 有三种可能 年/月/日 日/月/年 月/日/年 判断 是否和题目给出的日期符合 如果符合 输出 闰年{ 1.被4整除不被100整除 2.被400整除} 补位写法“%02d" 如果不…

【C语言步行梯】自定义函数、函数递归详谈

&#x1f3af;每日努力一点点&#xff0c;技术进步看得见 &#x1f3e0;专栏介绍&#xff1a;【C语言步行梯】专栏用于介绍C语言相关内容&#xff0c;每篇文章将通过图片代码片段网络相关题目的方式编写&#xff0c;欢迎订阅~~ 文章目录 什么是函数库函数自定义函数函数执行示例…

STM32 ADC数模转换器

单片机学习&#xff01; 目录 文章目录 前言 一、ADC简介 1.1 ADC名称 1.2 ADC功能 1.3 分辨率与转换时间 1.4 输入电压与转换范围 1.5 输入通道 1.6 增强功能 1.7 自动监测输入电压范围 1.8 STM32F103C8T6 ADC资源 二、逐次逼近型ADC 2.1 ADC内部结构原理图 2.2 输入通道选择 …

AI 驱动的医疗变革:迈向未来医疗新生态

直面呼啸而来的人工智能&#xff0c;医疗行业将首当其冲&#xff0c;发生翻天覆地的变化。美国心脏病学家兼基因学教授埃里克托普在《未来医疗》中预测&#xff0c;未来人类将拥有“健康小助手”——个人医疗数据和处理能力&#xff0c;还能轻松预防疾病。诸多评论家也持类似观…

Jade 处理XRD并计算半峰宽FWHM、峰面积、峰强度等数据

1.打开软件 2.导入测试的XRD数据 3.平滑数据 4.抠一下基底 5.分析具体数据 6.按住鼠标左键&#xff0c;在峰底部拉一条线&#xff0c;尽量和基底持平 7.结果就出来了&#xff0c;想要的都在里面&#xff0c;直接取值就行

数据分析-Pandas如何观测数据的中心趋势度

数据分析-Pandas如何观测数据的中心趋势度 数据分析和处理中&#xff0c;难免会遇到各种数据&#xff0c;那么数据呈现怎样的规律呢&#xff1f;不管金融数据&#xff0c;风控数据&#xff0c;营销数据等等&#xff0c;莫不如此。如何通过图示展示数据的规律&#xff1f; 数据…

Stable Diffusion 模型:从噪声中生成逼真图像

你好&#xff0c;我是郭震 简介 Stable Diffusion 模型是一种生成式模型&#xff0c;可以从噪声中生成逼真的图像。它由 Google AI 研究人员于 2022 年提出&#xff0c;并迅速成为图像生成领域的热门模型。 数学基础 Stable Diffusion模型基于一种称为扩散概率模型(Diffusion P…

16 OpenCV Laplance算子

文章目录 图像的二阶导数Laplance算子代码示例 图像的二阶导数 在二阶导数的时候&#xff0c;最大变化处的值为零即边缘是零值。通过二阶 导数计算&#xff0c;依据此理论我们可以计算图像二阶导数&#xff0c;提取边缘。 Laplance算子 void Laplacian( InputArray src, Output…

Java代码审计安全篇-SSRF(服务端请求伪造)漏洞

前言&#xff1a; 堕落了三个月&#xff0c;现在因为被找实习而困扰&#xff0c;着实自己能力不足&#xff0c;从今天开始 每天沉淀一点点 &#xff0c;准备秋招 加油 注意&#xff1a; 本文章参考qax的网络安全java代码审计&#xff0c;记录自己的学习过程&#xff0c;还希望各…

【C++进阶】C++多态概念详解

C多态概念详解 一&#xff0c;多态概念二&#xff0c;多态的定义2.1 多态构成的条件2.2 什么是虚函数2.3 虚函数的重写2.3.1 虚函数重写的特例2.3.2 override和final 2.4 重载和重写&#xff08;覆盖&#xff09;和重定义&#xff08;隐藏&#xff09;的区别 三&#xff0c;抽象…