**Notes on Installing and Configuring DC/OS**
1.
DC/OS 1.8 requires Python 3. DC/OS ships with its own Python 3.
2.
To build a local universe yourself, run: make local-universe. If it fails with the following error:
File "/usr/python3.5/lib/python3.5/urllib/request.py", line 1324, in unknown_open
raise URLError('unknown url type: %s' % type)
urllib.error.URLError:
make: *** [local-universe] Error 1
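Fix: run yum install openssl-devel, then rebuild and reinstall Python 3. A Python built without the OpenSSL headers has no ssl module, so urllib cannot open https URLs.
Because of the firewall in China, it is best to build the local universe on a machine outside the country.
The installation steps are as follows:
2.1
First download the Python 3 source package from the official site:
https://www.python.org/downloads/source/ (I downloaded Python-3.5.2.tar.xz)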
wget https://www.python.org/ftp/python/3.5.2/Python-3.5.2.tar.xz
tar -xf Python-3.5.2.tar.xz
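Warning: install the required dependencies before compiling. Otherwise the build fails partway and has to be redone (I learned this the hard way).
Install the required dependencies: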
yum install openssl-devel -y
yum install zlib-devel -y
Now compile:
cd Python-3.5.2
./configure --prefix=/opt/Python    # the install directory can be any path you like
make
make install
When the build finishes, a Python directory appears under /opt/. That is the compiled Python. For convenience, you can create a symbolic link:
ln -s /opt/Python/bin/python3 /usr/bin/python3
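A quick way to confirm that the rebuilt interpreter picked up the OpenSSL and zlib headers is to try importing the matching modules. This is a minimal sketch, not part of DC/OS or the universe tooling. The helper name missing_modules is made up for illustration, and the check only means something when run with the newly built python3.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Raised when the extension module was never built,
            # e.g. ssl without openssl-devel, zlib without zlib-devel.
            missing.append(name)
    return missing


# An empty list means both optional modules were compiled in.
print(missing_modules(["ssl", "zlib"]))
```

If ssl or zlib shows up in the printed list, install the corresponding -devel package and rerun ./configure, make and make install.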
3.
If building the local universe fails with the following error:
File "/usr/python3.5/lib/python3.5/subprocess.py", line 581, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['bash', 'scripts/build.sh']' returned non-zero exit status 1
make: *** [local-universe] Error 1
Fix: pip3 install jsonschema. If pip3 is not available, install pip for your Python 3 first.
3.1
First install setuptools.
Download: https://pypi.python.org/pypi
wget --no-check-certificate https://pypi.python.org/packages/source/s/setuptools/setuptools-19.6.tar.gz#md5=c607dd118eae682c44ed146367a17e26
tar -zxvf setuptools-19.6.tar.gz
cd setuptools-19.6
python3 setup.py build
python3 setup.py install
3.2
Then install pip:
wget --no-check-certificate https://pypi.python.org/packages/source/p/pip/pip-8.0.2.tar.gz#md5=3a73c4188f8dbad6a1e6f6d44d117eeb
tar -zxvf pip-8.0.2.tar.gz
cd pip-8.0.2
python3 setup.py build
python3 setup.py install
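Before rerunning make local-universe, it can help to confirm that the interpreter on the PATH really has pip and jsonschema. The sketch below is one way to do that under stated assumptions: can_run is a hypothetical helper, and "python3" is assumed to be the interpreter the Makefile will invoke.

```python
import subprocess
import sys


def can_run(python, args):
    """Return True if `python args...` exits with status 0."""
    result = subprocess.run(
        [python] + list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Assumption: "python3" on PATH is the interpreter used for the build.
    python = "python3" if len(sys.argv) < 2 else sys.argv[1]
    print("pip available:", can_run(python, ["-m", "pip", "--version"]))
    print("jsonschema available:", can_run(python, ["-c", "import jsonschema"]))
```

Running the check in a subprocess tests the target interpreter rather than whichever Python happens to execute the script.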
4.
After installing DC/OS, yum may stop working and report the following error:
error: rpmdb: BDB0113 Thread/process 6589/140601939367744 failed: BDB1507 Thread died in Berkeley DB library
error: db5 error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db5 - (-30973)
error: cannot open Packages database in /var/lib/rpm
CRITICAL:yum.main:
Fix:
rpm --rebuilddb
5.
When configuring the DC/OS DNS resolvers, it is best to combine internal DNS servers with one public DNS server, for example:
198.51.100.1,198.51.100.2,198.51.100.3,8.8.8.8
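In the installer's config.yaml these addresses go under the resolvers key. As a small sketch (render_resolvers is a hypothetical helper, not something DC/OS provides), the list can be validated with Python's ipaddress module and rendered as a YAML fragment before it is pasted into the file:

```python
import ipaddress


def render_resolvers(addresses):
    """Validate each address and return a YAML fragment for config.yaml."""
    lines = ["resolvers:"]
    for addr in addresses:
        # Raises ValueError for anything that is not a valid IP address.
        ipaddress.ip_address(addr)
        lines.append("- " + addr)
    return "\n".join(lines)


print(render_resolvers(["198.51.100.1", "198.51.100.2", "198.51.100.3", "8.8.8.8"]))
```

A typo such as 198.51.100.256 fails here with a ValueError rather than later during the DC/OS preflight checks.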
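6.
For ip-detect, the following script is recommended (only the MASTER_IP address needs to be replaced):
#!/usr/bin/env bash
set -o nounset -o errexit -o pipefail
export PATH=/sbin:/usr/sbin:/bin:/usr/bin:$PATH
MASTER_IP=$(dig +short master.mesos || true)
export MASTER_IP=${MASTER_IP:-<master IP address>}
INTERFACE_IP=$(ip r g ${MASTER_IP} | \
awk -v master_ip=${MASTER_IP} '
BEGIN { ec = 1 }
{
  if($1 == master_ip) {
    print $7
    ec = 0
  } else if($1 == "local") {
    print $6
    ec = 0
  }
  if (ec == 0) exit;
}
END { exit ec }
')
echo $INTERFACE_IP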
Note: this script relies on dig, so run the following on the master nodes and on the agent nodes: yum -y install bind-utils
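The idea behind this kind of ip-detect script is to ask the kernel which local address it would use to reach the master. The Python sketch below illustrates the same idea with a different technique, a connected UDP socket, and is not the script DC/OS expects. ip_toward is a hypothetical name, and port 53 is an arbitrary choice because connecting a UDP socket sends no packets.

```python
import socket


def ip_toward(master_ip, port=53):
    """Return the local IPv4 address the kernel routes toward master_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # connect() on a UDP socket only selects a route and source address;
        # nothing is transmitted.
        sock.connect((master_ip, port))
        return sock.getsockname()[0]


# Loopback is used here so the example runs anywhere; pass the master IP instead.
print(ip_toward("127.0.0.1"))
```

On a cluster node, calling ip_toward with the master's address should print the same address that the ip route lookup in the shell script selects.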
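7.
Some servers are not fully provisioned and fail the initial DC/OS preflight checks. In that case run the following commands (on Alibaba Cloud, machines in the North China 1 region, zone A, need them):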
$ sudo yum install -y tar xz unzip curl ipset
sudo sed -i s/SELINUX=enforcing/SELINUX=permissive/g /etc/selinux/config &&
sudo groupadd nogroup &&
sudo reboot
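8.
To reach the DC/OS cluster from an external network, install the dcos command on the external machine and run the following commands to connect to the DC/OS master node: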
curl -O https://downloads.dcos.io/binaries/cli/linux/x86-64/dcos-1.8/dcos
chmod +x dcos
./dcos config set core.dcos_url <master node address>
./dcos auth login
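9.
If services such as Kafka or Cassandra were installed through DC/OS, their CLI subcommands must be installed before the dcos command can show information about them:
dcos --log-level=ERROR package install --cli kafka (note: the options start with a double hyphen)
If the installation fails, the --log-level=ERROR option makes the command print the error details.
Also note that master.mesos must be added to the hosts file on the machine that runs dcos.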
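To confirm that the hosts entry took effect, a name lookup from the same machine is enough. This is only a sketch. resolves is a hypothetical helper, and the example call uses an IP literal so it runs anywhere; on the dcos machine, pass "master.mesos" instead.

```python
import socket


def resolves(hostname):
    """Return the IPv4 address for hostname, or None if it cannot be resolved."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None


# Replace the argument with "master.mesos" on the machine that runs dcos.
print(resolves("127.0.0.1"))
```

A result of None for master.mesos means the hosts file entry is missing or misspelled.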
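10.
The dcos CLI provides commands for logging in to nodes of the DC/OS cluster. The default assumes CoreOS (user core), so on other systems append --user=root:
dcos node ssh --master-proxy --leader --user=root
If this reports an error, use ssh-add to add the private key to the ssh-agent cache:
eval $(ssh-agent)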
ssh-add ~/.ssh/id_dsa
11.
Fixing the Spark CLI installation in dcos
[root@dcos005 dcos-cli]# dcos --log-level=ERROR package install --cli spark
Installing CLI subcommand for package [spark] version [1.0.6-2.0.2]
Unable to install CLI subcommand. Missing required program 'virtualenv'.
Please see installation instructions: https://virtualenv.pypa.io/en/latest/installation.html
[root@dcos005 dcos-cli]# pip3
-bash: pip3: command not found
[root@dcos005 dcos-cli]# pip install virtualenv
Collecting virtualenv
Downloading http://mirrors.aliyun.com/pypi/packages/6f/86/3dc328ee7b1a6419ebfac7896d882fba83c48e3561d22ddddf38294d3e83/virtualenv-15.1.0-py2.py3-none-any.whl (1.8MB)
100% |████████████████████████████████| 1.8MB 402kB/s
Installing collected packages: virtualenv
Successfully installed virtualenv-15.1.0
You are using pip version 8.1.2, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[root@dcos005 dcos-cli]# dcos --log-level=ERROR package install --cli spark
Installing CLI subcommand for package [spark] version [1.0.6-2.0.2]
New command available: dcos spark
12.
A problem when submitting a job, and its fix
Problem: task_id { value: "driver-20170106165836-0006" } state: TASK_FAILED message: "Failed to launch container: Failed to run \'docker -H unix:///var/run/docker.sock pull master.mesos:5000/mesosphere/spark:1.0.6-2.0.2-hadoop-2.6\': exited with status 1; stderr=\'Error response from daemon: Get https://master.mesos:5000/v1/_ping: x509: certificate signed by unknown authority\n\'" slave_id { value: "b36d1755-893b-4cfe-a42b-06a4c0bd27ea-S5" } timestamp: 1.483721919881525E9 executor_id { value: "driver-20170106165836-0006" } source: SOURCE_SLAVE reason: REASON_CONTAINER_LAUNCH_FAILED uuid: ";\246\221\242\002gC\302\244\300\253y\037a\230\377" container_status { network_infos { ip_addresses { ip_address: "10.161.71.194" } } }
Cause: the cluster uses a local universe, and the registry certificate was not configured on the newly added node.
Fix: run the following commands on the new node:
$ mkdir -p /etc/docker/certs.d/master.mesos:5000
$ curl -o /etc/docker/certs.d/master.mesos:5000/ca.crt http://master.mesos:8082/certs/domain.crt
$ systemctl restart docker
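When nodes are added regularly, the first two commands can be scripted. The sketch below mirrors them in Python under stated assumptions: docker_cert_path and install_registry_cert are hypothetical helper names, the default directory is the /etc/docker/certs.d path used above, and Docker still has to be restarted afterwards.

```python
import os
import urllib.request


def docker_cert_path(registry, certs_root="/etc/docker/certs.d"):
    """Return the ca.crt path Docker reads for the given registry host:port."""
    return os.path.join(certs_root, registry, "ca.crt")


def install_registry_cert(registry, cert_url, certs_root="/etc/docker/certs.d"):
    """Create the registry's certificate directory and download the cert into it."""
    target = docker_cert_path(registry, certs_root)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    urllib.request.urlretrieve(cert_url, target)
    return target


print(docker_cert_path("master.mesos:5000"))
```

On a new node this would be called as install_registry_cert("master.mesos:5000", "http://master.mesos:8082/certs/domain.crt"), run as root, followed by systemctl restart docker.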
13.
Problem: [root@dcosboot Python-3.5.2]# ./configure --prefix=/usr/local/python3
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking for --enable-universalsdk... no
checking for --with-universal-archs... no
checking MACHDEP... linux
checking for --without-gcc... no
checking for --with-icc... no
checking for gcc... no
checking for cc... no
checking for cl.exe... no
configure: error: in `/usr/local/src/Python-3.5.2':
configure: error: no acceptable C compiler found in $PATH
See `config.log' for more details
Fix: [root@dcosboot Python-3.5.2]# yum install gcc
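The configure output shows it probing for gcc and then cc on the PATH. The same check can be done up front before starting a build. This is a minimal sketch, and find_c_compiler is a hypothetical helper name.

```python
import shutil


def find_c_compiler(candidates=("gcc", "cc")):
    """Return the full path of the first C compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


compiler = find_c_compiler()
print(compiler if compiler else "no C compiler found; run: yum install gcc")
```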
14.
A problem when building the local universe
[root@dcosboot local-universe]# make local-universe
rm -rf certs &&
rm -f local-universe.tar.gz || 0
python3 /usr/local/universe/universe/docker/local-universe/../../scripts/local-universe.py
--repository /usr/local/universe/universe/docker/local-universe/../../repo/packages/
--selected &&
docker save -o local-universe.tar mesosphere/universe:latest &&
gzip local-universe.tar
Start docker registry.
dc3d3ab59bf8125c3ee4e4538c255373562ff8e0b2a6a57bd7f6402031cffa1e
Stopping docker registry.
registry
Traceback (most recent call last):
File "/usr/local/universe/universe/docker/local-universe/../../scripts/local-universe.py", line 327, in <module>
sys.exit(main())
File "/usr/local/universe/universe/docker/local-universe/../../scripts/local-universe.py", line 78, in main
os.makedirs(str(docker_artifacts))
File "/usr/local/python3/lib/python3.5/os.py", line 241, in makedirs
mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/tmp/tmpekiwd5ju/registry'
make: *** [local-universe] Error 1
Fix: edit local-universe.py.
Change the os.makedirs call that raises this error (line 78 in the traceback) to os.makedirs(dir, exist_ok=True).
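The effect of exist_ok=True can be seen in isolation. The sketch below is not the universe script: it creates a throwaway directory named registry inside a temporary folder, and ensure_dir is a hypothetical wrapper used only for this demonstration.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    registry = os.path.join(tmp, "registry")
    ensure_dir(registry)
    # Without exist_ok=True this second call would raise FileExistsError.
    ensure_dir(registry)
    print(os.path.isdir(registry))
```

Note that exist_ok=True only tolerates an existing directory. If the path exists as a regular file, os.makedirs still raises FileExistsError.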