Superset: caching with Redis

Date: 2024-09-22 09:06:32

I. Overview

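Superset uses Flask-Cache for caching. Flask-Cache supports Redis, memcached, simplecache (in-memory), and the local filesystem as cache backends. If you plan to use memcached, you need a memcached server as the backend; if you plan to use Redis, you need to install the Python Redis client (python-redis). Redis is the recommended cache backend.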

II. Install Redis

1. Download the source from the official site: https://redis.io/download

After downloading, extract the archive to the /data directory:

#tar xf redis-4.0.2.tar.gz -C /data

#cd  /data/redis-4.0.2

#make install -j 4

#mkdir {bin,data}

#find src/ -type f -perm -111 | xargs -i cp {} bin/

#vim /etc/profile

export PATH=$PATH:/data/redis-4.0.2/bin

#source  /etc/profile

#cat redis.conf

bind 10.10.2.34
protected-mode no
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize yes
supervised no
pidfile /var/run/redis_6379.pid
loglevel notice
logfile "redis.log"
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /data/redis-4.0.2/data
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100

# LRU (Least Recently Used) is a commonly used cache eviction policy
maxmemory 3g
maxmemory-policy allkeys-lru
maxmemory-samples 5
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble no
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
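
To make the `maxmemory-policy allkeys-lru` setting in the configuration above more concrete, here is a minimal Python sketch of strict LRU eviction. It is only an illustration: the `LRUCache` class and its limit by entry count are assumptions made for this example. Redis limits by memory (`maxmemory 3g`) and approximates LRU by sampling a few keys (`maxmemory-samples 5`) instead of tracking an exact access order.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache (hypothetical helper, not part of Redis or Superset)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used key

    def keys(self):
        return list(self._items)


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # "a" is now the most recently used key
cache.set("c", 3)    # over capacity, so "b" is evicted
print(cache.keys())  # ['a', 'c']
```

With `allkeys-lru`, every key is a candidate for eviction whether or not it has an expiry set, which suits a Redis instance used purely as a cache.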

2. Start Redis

#redis-server  /data/redis-4.0.2/redis.conf

III. Configure the Redis cache for Superset

1. Superset

Add the following to superset_config.py:

CACHE_DEFAULT_TIMEOUT = 60*60*6
CACHE_CONFIG = {
'CACHE_TYPE': 'redis',
'CACHE_REDIS_HOST': 'spark-worker',
'CACHE_REDIS_PORT': '6379',
'CACHE_REDIS_URL': 'redis://spark-worker:6379'
}
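
Because the host and port appear both as separate keys and inside `CACHE_REDIS_URL`, it is easy for the two to drift apart. The following is a small sanity check using only the standard library; `check_cache_config` is a hypothetical helper written for this article, not part of Superset or Flask-Cache.

```python
from urllib.parse import urlparse

CACHE_DEFAULT_TIMEOUT = 60 * 60 * 6
CACHE_CONFIG = {
    'CACHE_TYPE': 'redis',
    'CACHE_REDIS_HOST': 'spark-worker',
    'CACHE_REDIS_PORT': '6379',
    'CACHE_REDIS_URL': 'redis://spark-worker:6379',
}


def check_cache_config(config):
    """Return a list of inconsistencies between the URL and host/port keys."""
    url = urlparse(config['CACHE_REDIS_URL'])
    problems = []
    if url.scheme != 'redis':
        problems.append('unexpected scheme: %s' % url.scheme)
    if url.hostname != config['CACHE_REDIS_HOST']:
        problems.append('host mismatch')
    if url.port != int(config['CACHE_REDIS_PORT']):
        problems.append('port mismatch')
    return problems


timeout_hours = CACHE_DEFAULT_TIMEOUT / 3600
print(timeout_hours)                     # 6.0
print(check_cache_config(CACHE_CONFIG))  # []
```

The timeout is given in seconds, so `60*60*6` keeps cached entries for six hours.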

Edit the hosts file and add the entry: 10.10.2.34  spark-worker

After making these changes, restart Superset for them to take effect. Open the web UI and click a dashboard, then check what is stored in Redis:

#redis-cli -h spark-worker -p 6379

>KEYS  *

1) "flask_cache_cd696b3707087317077fe46bd306804a"

2. SQL Lab Celery setup

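SQL Lab is a powerful database query tool integrated into Superset. It supports every SQLAlchemy-compatible database and, by default, runs queries within the web request. With large datasets, however, a long-running database operation can cause the web request to time out and the query to fail. It is therefore worth configuring an asynchronous backend for Superset.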

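Superset's asynchronous backend consists of:

  1. one or more Superset workers (Celery workers);
  2. a Celery broker (message queue), for which Redis or RabbitMQ is recommended;
  3. a results backend that stores the query results.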
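
The three roles listed above can be sketched in plain Python. This is not Celery: a `queue.Queue` stands in for the broker, a thread for the worker, and a dict for the results backend, and all names here (`submit`, `fake_query`, and so on) are hypothetical. The point is only to show why the web request no longer has to wait for the query.

```python
import queue
import threading
import uuid

broker = queue.Queue()  # stands in for the Celery broker (message queue)
results = {}            # stands in for the results backend


def worker():
    """Pull tasks off the queue until a None sentinel arrives."""
    while True:
        task = broker.get()
        if task is None:
            break
        task_id, func, args = task
        results[task_id] = func(*args)  # store the result for later pickup


def submit(func, *args):
    """Enqueue a task and return its id right away, without waiting."""
    task_id = str(uuid.uuid4())
    broker.put((task_id, func, args))
    return task_id


def fake_query(n):
    # Hypothetical stand-in for a long-running database query.
    return sum(range(n))


thread = threading.Thread(target=worker)
thread.start()
task_id = submit(fake_query, 1000)  # returns immediately with an id
broker.put(None)                    # tell the worker to stop
thread.join()                       # wait so the result is available
print(results[task_id])             # 499500
```

In the real setup the caller polls the results backend with the task id instead of joining a thread, which is what lets the web request return before the query finishes.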

Edit superset_config.py and add the following:

class CeleryConfig(object):
    BROKER_URL = 'redis://spark-worker:6379/0'
    CELERY_IMPORTS = ('superset.sql_lab',)
    CELERY_RESULT_BACKEND = 'redis://spark-worker:6379/0'
    # CELERY_ANNOTATIONS = {'tasks.add':{'rate_limit':'10/s'}}

CELERY_CONFIG = CeleryConfig

from werkzeug.contrib.cache import RedisCache
RESULTS_BACKEND = RedisCache(
    host='spark-worker', port=6379, key_prefix='superset_results')

Restart Superset (I installed Superset with Docker; see my other article on installing Superset):

#docker  restart  superset

#docker  exec -it  superset /bin/bash

Once inside the container, start the Superset worker:

#nohup superset worker >>/dev/null 2>&1 &

In the web UI settings, select Allow Run Async.


Then run a SQL query in SQL Lab and check Redis again for cached entries:

>KEYS  *

 1) "celery-task-meta-a006a611-4da6-41fe-a825-0da3e7b31060"
 2) "celery-task-meta-e9b90a39-15a5-4d51-bf2b-3c461eede835"
 3) "_kombu.binding.celery.pidbox"
 4) "celery-task-meta-d02fd720-340d-4863-ab7c-e518984a494a"
 5) "celery-task-meta-b65e3dee-96fe-43c7-b4c4-cd9035f14c2f"
 6) "celery-task-meta-ffe2e759-5bae-46b0-912d-9b1db0f36086"
 7) "_kombu.binding.celery"
 8) "celery-task-meta-5480d794-3b28-444f-8f32-54d6fbe92fc9"
 9) "flask_cache_cd696b3707087317077fe46bd306804a"
10) "unacked_mutex"
11) "celery-task-meta-140a3eaf-a121-4a39-8fc5-73d7a9e198c2"
12) "_kombu.binding.celeryev"
13) "celery-task-meta-e9dd5359-17c9-40a3-b3a2-f2dc977fd34c"
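
To read this listing at a glance, the keys can be grouped by prefix. The sketch below works on the key names copied from the output above; the `classify` helper and the labels in `PREFIXES` are my own naming for illustration.

```python
from collections import Counter

# Key names copied from the KEYS * output above.
keys = [
    "celery-task-meta-a006a611-4da6-41fe-a825-0da3e7b31060",
    "celery-task-meta-e9b90a39-15a5-4d51-bf2b-3c461eede835",
    "_kombu.binding.celery.pidbox",
    "celery-task-meta-d02fd720-340d-4863-ab7c-e518984a494a",
    "celery-task-meta-b65e3dee-96fe-43c7-b4c4-cd9035f14c2f",
    "celery-task-meta-ffe2e759-5bae-46b0-912d-9b1db0f36086",
    "_kombu.binding.celery",
    "celery-task-meta-5480d794-3b28-444f-8f32-54d6fbe92fc9",
    "flask_cache_cd696b3707087317077fe46bd306804a",
    "unacked_mutex",
    "celery-task-meta-140a3eaf-a121-4a39-8fc5-73d7a9e198c2",
    "_kombu.binding.celeryev",
    "celery-task-meta-e9dd5359-17c9-40a3-b3a2-f2dc977fd34c",
]

# Hypothetical labels for the prefixes seen in this listing.
PREFIXES = {
    "celery-task-meta-": "celery task result",
    "_kombu.": "kombu binding",
    "flask_cache_": "flask-cache entry",
}


def classify(key):
    """Return the label of the first matching prefix, or 'other'."""
    for prefix, label in PREFIXES.items():
        if key.startswith(prefix):
            return label
    return "other"


counts = Counter(classify(k) for k in keys)
print(counts["celery task result"])  # 8
print(counts["kombu binding"])       # 3
print(counts["flask-cache entry"])   # 1
```

The `celery-task-meta-*` keys are results written by the Celery result backend, the `_kombu.*` keys belong to the broker, and the `flask_cache_*` key is the dashboard cache entry from the earlier step. Note that `KEYS *` scans the whole keyspace and blocks the server while it runs, so on a busy instance `SCAN` is the safer way to inspect keys.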

The complete configuration file:

#cat  superset_config.py

#---------------------------------------------------------
# Superset specific config
#---------------------------------------------------------
ROW_LIMIT = 5000
SUPERSET_WORKERS = 4
SUPERSET_WEBSERVER_TIMEOUT = 3000
SUPERSET_WEBSERVER_PORT = 8088
#---------------------------------------------------------

#---------------------------------------------------------
# Flask App Builder configuration
#---------------------------------------------------------
# Your App secret key
SECRET_KEY = '\2\1thisismyscretkey\1\2\e\y\y\h'

# The SQLAlchemy connection string to your database backend
# This connection defines the path to the database that stores your
# superset metadata (slices, connections, tables, dashboards, ...).
# Note that the connection information to connect to the datasources
# you want to explore are managed directly in the web UI
#SQLALCHEMY_DATABASE_URI = 'sqlite:////data/superset.db'
SQLALCHEMY_DATABASE_URI = 'sqlite:////home/superset/superset.db'

# Flask-WTF flag for CSRF
WTF_CSRF_ENABLED = True
# Add endpoints that need to be exempt from CSRF protection
WTF_CSRF_EXEMPT_LIST = []

# Set this API key to enable Mapbox visualizations
MAPBOX_API_KEY = ''

CACHE_DEFAULT_TIMEOUT = 60*60*6
CACHE_CONFIG = {
'CACHE_TYPE': 'redis',
'CACHE_REDIS_HOST': 'spark-worker',
'CACHE_REDIS_PORT': '6379',
'CACHE_REDIS_URL': 'redis://spark-worker:6379'
}

class CeleryConfig(object):
    BROKER_URL = 'redis://spark-worker:6379/0'
    CELERY_IMPORTS = ('superset.sql_lab',)
    CELERY_RESULT_BACKEND = 'redis://spark-worker:6379/0'
    # CELERY_ANNOTATIONS = {'tasks.add':{'rate_limit':'10/s'}}

CELERY_CONFIG = CeleryConfig

from werkzeug.contrib.cache import RedisCache
RESULTS_BACKEND = RedisCache(
    host='spark-worker', port=6379, key_prefix='superset_results')