爬虫(十七):Scrapy框架(四) 对接selenium爬取京东商品数据

时间:2022-08-27 18:20:09

1. Scrapy对接Selenium

Scrapy抓取页面的方式和requests库类似,都是直接模拟HTTP请求,而Scrapy也不能抓取JavaScript动态谊染的页面。在前面的博客中抓取JavaScript渲染的页面有两种方式。一种是分析Ajax请求,找到其对应的接口抓取,Scrapy同样可以用此种方式抓取。另一种是直接用 Selenium模拟浏览器进行抓取,我们不需要关心页面后台发生的请求,也不需要分析渲染过程,只需要关心页面最终结果即可,可见即可爬。那么,如果Scrapy可以对接Selenium,那 Scrapy就可以处理任何网站的抓取了。

1.1 新建项目

首先新建项目,名为scrapyseleniumtest。

scrapy startproject scrapyseleniumtest

新建一个Spider。

scrapy genspider jd www.jd.com

修改ROBOTSTXT_OBEY为False。

ROBOTSTXT_OBEY = False

爬虫(十七):Scrapy框架(四) 对接selenium爬取京东商品数据

1.2 定义Item

这里我们就不调用Item了。

初步实现Spider的start _requests()方法。

# -*- coding: utf-8 -*-
from scrapy import Request,Spider
from urllib.parse import quote
from bs4 import BeautifulSoup class JdSpider(Spider):
name = 'jd'
allowed_domains = ['www.jd.com']
base_url = 'https://search.jd.com/Search?keyword=' def start_requests(self):
for keyword in self.settings.get('KEYWORDS'):
for page in range(1, self.settings.get('MAX_PAGE') + 1):
url = self.base_url + quote(keyword)
# dont_filter = True 不去重
yield Request(url=url, callback=self.parse, meta={'page': page}, dont_filter=True)

首先定义了一个base_url,即商品列表的URL,其后拼接一个搜索关键字就是该关键字在京东搜索的结果商品列表页面。

关键字用KEYWORDS标识,定义为一个列表。最大翻页页码用MAX_PAGE表示。它们统一定义在settings.py里面。

KEYWORDS = ['iPad']
MAX_PAGE = 2

在start_requests()方法里,我们首先遍历了关键字,遍历了分页页码,构造并生成Request。由于每次搜索的URL是相同的,所以分页页码用meta参数来传递,同时设置dont_filter不去重。这样爬虫启动的时候,就会生成每个关键字对应的商品列表的每一页的请求了。

1.3 对接Selenium

接下来我们需要处理这些请求的抓取。这次我们对接Selenium进行抓取,采用Downloader Middleware来实现。在Middleware中对接selenium,输出源代码之后,构造htmlresponse对象,直接返回给spider解析页面,提取数据,并且也不在执行下载器下载页面动作。

class SeleniumMiddleware(object):
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the downloader middleware does not modify the
# passed objects. def __init__(self,timeout=None):
self.logger=getLogger(__name__)
self.timeout = timeout
self.browser = webdriver.Chrome()
self.browser.set_window_size(1400,700)
self.browser.set_page_load_timeout(self.timeout)
self.wait = WebDriverWait(self.browser,self.timeout) def __del__(self):
self.browser.close() @classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
return cls(timeout=crawler.settings.get('SELENIUM_TIMEOUT')) def process_request(self, request, spider):
'''
在下载器中间件中对接使用selenium,输出源代码之后,构造htmlresponse对象,直接返回
给spider解析页面,提取数据
并且也不在执行下载器下载页面动作
htmlresponse对象的文档:
:param request:
:param spider:
:return:
''' print('PhantomJS is Starting')
page = request.meta.get('page', 1)
self.wait = WebDriverWait(self.browser, self.timeout)
# self.browser.set_page_load_timeout(30)
# self.browser.set_script_timeout(30)
try:
self.browser.get(request.url)
if page > 1:
input = self.wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#J_bottomPage > span.p-skip > input')))
input.clear()
input.send_keys(page)
time.sleep(5) # 将网页中输入跳转页的输入框赋值给input变量 EC.presence_of_element_located,判断输入框已经被加载出来
input = self.wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#J_bottomPage > span.p-skip > input')))
# 将网页中调准页面的确定按钮赋值给submit变量,EC.element_to_be_clickable 判断此按钮是可点击的
submit = self.wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#J_bottomPage > span.p-skip > a')))
input.clear()
input.send_keys(page)
submit.click() # 点击按钮
time.sleep(5) # 判断当前页码出现在了输入的页面中,EC.text_to_be_present_in_element 判断元素在指定字符串中出现
self.wait.until(EC.text_to_be_present_in_element((By.CSS_SELECTOR, '#J_bottomPage > span.p-num > a.curr'),str(page)))
# 等待 #J_goodsList 加载出来,为页面数据,加载出来之后,在返回网页源代码
self.wait.until(EC.text_to_be_present_in_element((By.CSS_SELECTOR, '#J_bottomPage > span.p-num > a.curr'),str(page)))
return HtmlResponse(url=request.url, body=self.browser.page_source, request=request, encoding='utf-8',status=200)
except TimeoutException:
return HtmlResponse(url=request.url, status=500, request=request)

首先我在__init__()里对一些对象进行初始化,包括WebDriverWait等对象,同时设置页面大小和页面加载超时时间。在process_request()方法中,我们通过Request的meta属性获取当前需要爬取的页码,将页码赋值给input变量,再将翻页的点击按钮框赋值给submit变量,然后在数据框中输入页码,等待页面加载,直接返回htmlresponse给spider解析,这里我们没有经过下载器下载,直接构造response的子类htmlresponse返回。(当下载器中间件返回response对象时,更低优先级的process_request将不在执行,转而执行其他的process_response()方法,本例中没有其他的process_response(),所以直接将结果返回给spider解析。)

1.4 解析页面

Response对象就会回传给Spider内的回调函数进行解析。所以下一步我们就实现其回调函数,对网页来进行解析。

def parse(self, response):
soup = BeautifulSoup(response.text, 'lxml')
lis = soup.find_all(name='li', class_="gl-item")
for li in lis:
proc_dict = {}
dp = li.find(name='span', class_="J_im_icon")
if dp:
proc_dict['dp'] = dp.get_text().strip()
else:
continue
id = li.attrs['data-sku']
title = li.find(name='div', class_="p-name p-name-type-2")
proc_dict['title'] = title.get_text().strip()
price = li.find(name='strong', class_="J_" + id)
proc_dict['price'] = price.get_text()
comment = li.find(name='a', id="J_comment_" + id)
proc_dict['comment'] = comment.get_text() + '条评论'
url = 'https://item.jd.com/' + id + '.html'
proc_dict['url'] = url
proc_dict['type'] = 'JINGDONG'
yield proc_dict

这里我们采用BeautifulSoup进行解析,匹配所有商品,随后对结果进行遍历,依次选取商品的各种信息。

1.5 储存结果

提取完页面数据之后,数据会发送到item pipeline处进行数据处理,清洗,入库等操作,所以我们此时当然需要定义项目管道了,在此我们将数据存储在mongodb数据库中。

# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html
import pymongo class MongoPipeline(object): def __init__(self,mongo_url,mongo_db,collection):
self.mongo_url = mongo_url
self.mongo_db = mongo_db
self.collection = collection @classmethod
#from_crawler是一个类方法,由 @classmethod标识,是一种依赖注入的方式,它的参数就是crawler
#通过crawler我们可以拿到全局配置的每个配置信息,在全局配置settings.py中的配置项都可以取到。
#所以这个方法的定义主要是用来获取settings.py中的配置信息
def from_crawler(cls,crawler):
return cls(
mongo_url=crawler.settings.get('MONGO_URL'),
mongo_db = crawler.settings.get('MONGO_DB'),
collection = crawler.settings.get('COLLECTION')
) def open_spider(self,spider):
self.client = pymongo.MongoClient(self.mongo_url)
self.db = self.client[self.mongo_db] def process_item(self,item, spider):
# name = item.__class__.collection
name = self.collection
self.db[name].insert(dict(item))
return item def close_spider(self,spider):
self.client.close()

1.6 配置settings文件

配置settings文件,将项目中使用到的配置项在settings文件中配置,本项目中使用到了KEYWORDS,MAX_PAGE,SELENIUM_TIMEOUT(页面加载超时时间),MONGOURL,MONGODB,COLLECTION。

KEYWORDS=['iPad']
MAX_PAGE=2 MONGO_URL = 'localhost'
MONGO_DB = 'test'
COLLECTION = 'ProductItem' SELENIUM_TIMEOUT = 30

以及修改配置项,激活下载器中间件和item pipeline。

DOWNLOADER_MIDDLEWARES = {
'scrapyseleniumtest.middlewares.SeleniumMiddleware': 543,
} ITEM_PIPELINES = {
'scrapyseleniumtest.pipelines.MongoPipeline': 300,
}

1.7 执行结果

项目中所有需要开发的代码和配置项开发完成,运行项目。

scrapy crawl jd

爬虫(十七):Scrapy框架(四) 对接selenium爬取京东商品数据

运行项目之后,在mongodb中查看数据,已经执行成功。

爬虫(十七):Scrapy框架(四) 对接selenium爬取京东商品数据

1.8 完整代码

items.py:

# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/items.html
from scrapy import Item,Field class ProductItem(Item):
# define the fields for your item here like:
# name = scrapy.Field()
# dp = Field()
# title = Field()
# price = Field()
# comment = Field()
# url = Field()
# type = Field()
pass

jd.py:

# -*- coding: utf-8 -*-
from scrapy import Request,Spider
from urllib.parse import quote
from bs4 import BeautifulSoup class JdSpider(Spider):
name = 'jd'
allowed_domains = ['www.jd.com']
base_url = 'https://search.jd.com/Search?keyword=' def start_requests(self):
for keyword in self.settings.get('KEYWORDS'):
for page in range(1, self.settings.get('MAX_PAGE') + 1):
url = self.base_url + quote(keyword)
# dont_filter = True 不去重
yield Request(url=url, callback=self.parse, meta={'page': page}, dont_filter=True) def parse(self, response):
soup = BeautifulSoup(response.text, 'lxml')
lis = soup.find_all(name='li', class_="gl-item")
for li in lis:
proc_dict = {}
dp = li.find(name='span', class_="J_im_icon")
if dp:
proc_dict['dp'] = dp.get_text().strip()
else:
continue
id = li.attrs['data-sku']
title = li.find(name='div', class_="p-name p-name-type-2")
proc_dict['title'] = title.get_text().strip()
price = li.find(name='strong', class_="J_" + id)
proc_dict['price'] = price.get_text()
comment = li.find(name='a', id="J_comment_" + id)
proc_dict['comment'] = comment.get_text() + '条评论'
url = 'https://item.jd.com/' + id + '.html'
proc_dict['url'] = url
proc_dict['type'] = 'JINGDONG'
yield proc_dict

middlewares.py:

# -*- coding: utf-8 -*-

# Define here the models for your spider middleware
#
# See documentation in:
# https://doc.scrapy.org/en/latest/topics/spider-middleware.html from scrapy import signals
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
from urllib.parse import urlencode
from scrapy.http import HtmlResponse
from logging import getLogger
from selenium.common.exceptions import TimeoutException
import time class ScrapyseleniumtestSpiderMiddleware(object):
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the spider middleware does not modify the
# passed objects. @classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s def process_spider_input(self, response, spider):
# Called for each response that goes through the spider
# middleware and into the spider. # Should return None or raise an exception.
return None def process_spider_output(self, response, result, spider):
# Called with the results returned from the Spider, after
# it has processed the response. # Must return an iterable of Request, dict or Item objects.
for i in result:
yield i def process_spider_exception(self, response, exception, spider):
# Called when a spider or process_spider_input() method
# (from other spider middleware) raises an exception. # Should return either None or an iterable of Response, dict
# or Item objects.
pass def process_start_requests(self, start_requests, spider):
# Called with the start requests of the spider, and works
# similarly to the process_spider_output() method, except
# that it doesn’t have a response associated. # Must return only requests (not items).
for r in start_requests:
yield r def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name) class SeleniumMiddleware(object):
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the downloader middleware does not modify the
# passed objects. def __init__(self,timeout=None):
self.logger=getLogger(__name__)
self.timeout = timeout
self.browser = webdriver.Chrome()
self.browser.set_window_size(1400,700)
self.browser.set_page_load_timeout(self.timeout)
self.wait = WebDriverWait(self.browser,self.timeout) def __del__(self):
self.browser.close() @classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
return cls(timeout=crawler.settings.get('SELENIUM_TIMEOUT')) def process_request(self, request, spider):
'''
在下载器中间件中对接使用selenium,输出源代码之后,构造htmlresponse对象,直接返回
给spider解析页面,提取数据
并且也不在执行下载器下载页面动作
htmlresponse对象的文档:
:param request:
:param spider:
:return:
''' print('PhantomJS is Starting')
page = request.meta.get('page', 1)
self.wait = WebDriverWait(self.browser, self.timeout)
# self.browser.set_page_load_timeout(30)
# self.browser.set_script_timeout(30)
try:
self.browser.get(request.url)
if page > 1:
input = self.wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#J_bottomPage > span.p-skip > input')))
input.clear()
input.send_keys(page)
time.sleep(5) # 将网页中输入跳转页的输入框赋值给input变量 EC.presence_of_element_located,判断输入框已经被加载出来
input = self.wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#J_bottomPage > span.p-skip > input')))
# 将网页中调准页面的确定按钮赋值给submit变量,EC.element_to_be_clickable 判断此按钮是可点击的
submit = self.wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#J_bottomPage > span.p-skip > a')))
input.clear()
input.send_keys(page)
submit.click() # 点击按钮
time.sleep(5) # 判断当前页码出现在了输入的页面中,EC.text_to_be_present_in_element 判断元素在指定字符串中出现
self.wait.until(EC.text_to_be_present_in_element((By.CSS_SELECTOR, '#J_bottomPage > span.p-num > a.curr'),str(page)))
# 等待 #J_goodsList 加载出来,为页面数据,加载出来之后,在返回网页源代码
self.wait.until(EC.text_to_be_present_in_element((By.CSS_SELECTOR, '#J_bottomPage > span.p-num > a.curr'),str(page)))
return HtmlResponse(url=request.url, body=self.browser.page_source, request=request, encoding='utf-8',status=200)
except TimeoutException:
return HtmlResponse(url=request.url, status=500, request=request) def process_response(self, request, response, spider):
# Called with the response returned from the downloader. # Must either;
# - return a Response object
# - return a Request object
# - or raise IgnoreRequest
return response def process_exception(self, request, exception, spider):
# Called when a download handler or a process_request()
# (from other downloader middleware) raises an exception. # Must either:
# - return None: continue processing this exception
# - return a Response object: stops process_exception() chain
# - return a Request object: stops process_exception() chain
pass def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)

pipelines.py:

# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html
import pymongo class MongoPipeline(object): def __init__(self,mongo_url,mongo_db,collection):
self.mongo_url = mongo_url
self.mongo_db = mongo_db
self.collection = collection @classmethod
#from_crawler是一个类方法,由 @classmethod标识,是一种依赖注入的方式,它的参数就是crawler
#通过crawler我们可以拿到全局配置的每个配置信息,在全局配置settings.py中的配置项都可以取到。
#所以这个方法的定义主要是用来获取settings.py中的配置信息
def from_crawler(cls,crawler):
return cls(
mongo_url=crawler.settings.get('MONGO_URL'),
mongo_db = crawler.settings.get('MONGO_DB'),
collection = crawler.settings.get('COLLECTION')
) def open_spider(self,spider):
self.client = pymongo.MongoClient(self.mongo_url)
self.db = self.client[self.mongo_db] def process_item(self,item, spider):
# name = item.__class__.collection
name = self.collection
self.db[name].insert(dict(item))
return item def close_spider(self,spider):
self.client.close()

settings.py:

# -*- coding: utf-8 -*-

# Scrapy settings for scrapyseleniumtest project
#
# For simplicity, this file contains only settings considered important or
# commonly used. You can find more settings consulting the documentation:
#
# https://docs.scrapy.org/en/latest/topics/settings.html
# https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html BOT_NAME = 'scrapyseleniumtest' SPIDER_MODULES = ['scrapyseleniumtest.spiders']
NEWSPIDER_MODULE = 'scrapyseleniumtest.spiders' # Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'scrapyseleniumtest (+http://www.yourdomain.com)' # Obey robots.txt rules
ROBOTSTXT_OBEY = False # Configure maximum concurrent requests performed by Scrapy (default: 16)
#CONCURRENT_REQUESTS = 32 # Configure a delay for requests for the same website (default: 0)
# See https://docs.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
#DOWNLOAD_DELAY = 3
# The download delay setting will honor only one of:
#CONCURRENT_REQUESTS_PER_DOMAIN = 16
#CONCURRENT_REQUESTS_PER_IP = 16 # Disable cookies (enabled by default)
#COOKIES_ENABLED = False # Disable Telnet Console (enabled by default)
#TELNETCONSOLE_ENABLED = False # Override the default request headers:
#DEFAULT_REQUEST_HEADERS = {
# 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
# 'Accept-Language': 'en',
#} # Enable or disable spider middlewares
# See https://docs.scrapy.org/en/latest/topics/spider-middleware.html
#SPIDER_MIDDLEWARES = {
# 'scrapyseleniumtest.middlewares.ScrapyseleniumtestSpiderMiddleware': 543,
#} # Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
#DOWNLOADER_MIDDLEWARES = {
# 'scrapyseleniumtest.middlewares.ScrapyseleniumtestDownloaderMiddleware': 543,
#}
DOWNLOADER_MIDDLEWARES = {
'scrapyseleniumtest.middlewares.SeleniumMiddleware': 543,
}
# Enable or disable extensions
# See https://docs.scrapy.org/en/latest/topics/extensions.html
#EXTENSIONS = {
# 'scrapy.extensions.telnet.TelnetConsole': None,
#} # Configure item pipelines
# See https://docs.scrapy.org/en/latest/topics/item-pipeline.html
#ITEM_PIPELINES = {
# 'scrapyseleniumtest.pipelines.ScrapyseleniumtestPipeline': 300,
#}
ITEM_PIPELINES = {
'scrapyseleniumtest.pipelines.MongoPipeline': 300,
}
# Enable and configure the AutoThrottle extension (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/autothrottle.html
#AUTOTHROTTLE_ENABLED = True
# The initial download delay
#AUTOTHROTTLE_START_DELAY = 5
# The maximum download delay to be set in case of high latencies
#AUTOTHROTTLE_MAX_DELAY = 60
# The average number of requests Scrapy should be sending in parallel to
# each remote server
#AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
# Enable showing throttling stats for every response received:
#AUTOTHROTTLE_DEBUG = False # Enable and configure HTTP caching (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#httpcache-middleware-settings
#HTTPCACHE_ENABLED = True
#HTTPCACHE_EXPIRATION_SECS = 0
#HTTPCACHE_DIR = 'httpcache'
#HTTPCACHE_IGNORE_HTTP_CODES = []
#HTTPCACHE_STORAGE = 'scrapy.extensions.httpcache.FilesystemCacheStorage'
KEYWORDS=['iPad']
MAX_PAGE=2 MONGO_URL = 'localhost'
MONGO_DB = 'test'
COLLECTION = 'ProductItem' SELENIUM_TIMEOUT = 30

爬虫(十七):Scrapy框架(四) 对接selenium爬取京东商品数据的更多相关文章

  1. Scrapy实战篇(八)之Scrapy对接selenium爬取京东商城商品数据

    本篇目标:我们以爬取京东商城商品数据为例,展示Scrapy框架对接selenium爬取京东商城商品数据. 背景: 京东商城页面为js动态加载页面,直接使用request请求,无法得到我们想要的商品数据 ...

  2. 爬虫系列(十三) 用selenium爬取京东商品

    这篇文章,我们将通过 selenium 模拟用户使用浏览器的行为,爬取京东商品信息,还是先放上最终的效果图: 1.网页分析 (1)初步分析 原本博主打算写一个能够爬取所有商品信息的爬虫,可是在分析过程 ...

  3. 利用selenium爬取京东商品信息存放到mongodb

    利用selenium爬取京东商城的商品信息思路: 1.首先进入京东的搜索页面,分析搜索页面信息可以得到路由结构 2.根据页面信息可以看到京东在搜索页面使用了懒加载,所以为了解决这个问题,使用递归.等待 ...

  4. python爬虫——用selenium爬取京东商品信息

    1.先附上效果图(我偷懒只爬了4页)  2.京东的网址https://www.jd.com/ 3.我这里是不加载图片,加快爬取速度,也可以用Headless无弹窗模式 options = webdri ...

  5. 爬虫之selenium爬取京东商品信息

    import json import time from selenium import webdriver """ 发送请求 1.1生成driver对象 2.1窗口最大 ...

  6. 一起学爬虫——使用selenium和pyquery爬取京东商品列表

    layout: article title: 一起学爬虫--使用selenium和pyquery爬取京东商品列表 mathjax: true --- 今天一起学起使用selenium和pyquery爬 ...

  7. 爬虫—Selenium爬取JD商品信息

    一,抓取分析 本次目标是爬取京东商品信息,包括商品的图片,名称,价格,评价人数,店铺名称.抓取入口就是京东的搜索页面,这个链接可以通过直接构造参数访问https://search.jd.com/Sea ...

  8. Scrapy 通过登录的方式爬取豆瓣影评数据

    Scrapy 通过登录的方式爬取豆瓣影评数据 爬虫 Scrapy 豆瓣 Fly 由于需要爬取影评数据在来做分析,就选择了豆瓣影评来抓取数据,工具使用的是Scrapy工具来实现.scrapy工具使用起来 ...

  9. python制作爬虫爬取京东商品评论教程

    作者:蓝鲸 类型:转载 本文是继前2篇Python爬虫系列文章的后续篇,给大家介绍的是如何使用Python爬取京东商品评论信息的方法,并根据数据绘制成各种统计图表,非常的细致,有需要的小伙伴可以参考下 ...

随机推荐

  1. Windows Store App 用户库文件操作

    (1)获取用户库位置 如果想要通过应用程序在用户库中创建文件,首先需要获得用户库中指定的位置,例如图片库.文档库等.这里值得注意的是,在获取用户库的位置之前,必须在Windows应用商店项目的清单文件 ...

  2. oracle impdp的table_exists_action详解

    oracle impdp的table_exists_action详解 分类: [oracle]--[备份与恢复]2012-01-06 22:44 9105人阅读 评论(0) 收藏 举报 tableac ...

  3. angularjs directive (自定义标签解析)

    angularjs directive (自定义标签解析) 定义tpl <!-- 注意要有根标签 --> <div class="list list-inset" ...

  4. 从使用角度看 ReentrantLock 和 Condition

    java 语言中谈到锁,少不了比较一番 synchronized 和 ReentrantLock 的原理,本文不作分析,只是简单介绍一下 ReentrantLock 的用法,从使用中推测其内部的一些原 ...

  5. window系统更新导致很多服务出错

    window7,win10,window server各版本系统中,经常会出现下载完成更新补丁后要求重启更新,此时很可能会出现很多服务失效的莫名其妙的问题,比如数据库连不上,IIS某功能不好使等等问题 ...

  6. Spark Core(三)Executor上是如何launch task(转载)

    1. 启动任务 在前面一篇博客中(Driver 启动.分配.调度Task)介绍了Driver是如何调动.启动任务的,Driver向Executor发送了LaunchTask的消息,Executor接收 ...

  7. Java Applet基础

    applet是一种Java程序.它一般运行在支持Java的Web浏览器内.因为它有完整的Java API支持,所以applet是一个全功能的Java应用程序. 如下所示是独立的Java应用程序和app ...

  8. &lbrack;leetcode-775-Global and Local Inversions&rsqb;

    We have some permutation A of [0, 1, ..., N - 1], where N is the length of A. The number of (global) ...

  9. react 使用antd的TreeSelect树选择组件实现多个树选择循环

    需求说明,一个帐号角色可以设置管理多个项目的菜单权限 且菜单接口每次只能查询特定项目的菜单数据[无法查全部] 开发思路: 1,获取项目接口数组,得到项目数据 2,循环项目数据,以此为参数递归查询菜单数 ...

  10. 咏南下拉列表数据敏感控件--TYNDBSearch

    咏南下拉列表数据敏感控件--TYNDBSearch 拥有下拉列表控件可以大大地加速软件系统的开发. 控件适用于DELPHI5及以上版本安装并使用. 控件的用法: procedure Tfgoods.s ...