k8s operation notes

Version: 1.9.0

Limiting GPUs per namespace

[gpu-namespace]# cat compute-resources2.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    #pods: "4"
    #requests.cpu: "1"
    #requests.memory: 1Gi
    #limits.cpu: "2"
    #limits.memory: 2Gi

kubectl create -f compute-resources2.yaml

kubectl get quota
kubectl describe quota compute-resources
kubectl delete quota compute-resources

Create the namespace first, then add the quota to that namespace. Here the quota is added to the default namespace.
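
The quota file above leaves every hard entry commented out, so it does not yet restrict GPUs. A minimal sketch of a GPU quota follows, assuming a cluster where quota for extended resources is available (added in Kubernetes 1.10, so not the 1.9.0 these notes start from) and assuming the NVIDIA device plugin advertises nvidia.com/gpu. The quota name gpu-quota and the helper function name are hypothetical. For extended resources a quota accepts only the requests. prefix.

```shell
# Print a ResourceQuota manifest that caps GPU requests in one namespace.
# Assumption: the cluster supports quota for extended resources (1.10+).
gpu_quota_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota                  # hypothetical name
spec:
  hard:
    requests.nvidia.com/gpu: "1"   # total GPUs that pods in the namespace may request
EOF
}
```

Usage might look like: gpu_quota_manifest | kubectl create -n default -f -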


  • Basic commands such as vi are missing inside the Docker container
echo "nameserver 192.168.1.254" > /etc/resolv.conf
apt-get update
apt-get install -y net-tools       # ifconfig
apt-get install -y iputils-ping    # ping
apt-get install -y vim             # vi


Starting a GPU job

Warning  FailedScheduling  3s (x7 over 34s)  default-scheduler  0/3 nodes are available: 1 PodToleratesNodeTaints, 3 Insufficient nvidia.com/gpu.
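
The event above says that one node was rejected for a taint the pod does not tolerate and that all three nodes lack free nvidia.com/gpu. A rough sketch of two checks that can narrow this down, assuming kubectl is configured for the cluster; the function names are hypothetical helpers, not part of kubectl.

```shell
# List every node with its allocatable GPU count. The GPU column shows
# <none> when no nvidia.com/gpu resource is registered on that node.
list_gpu_allocatable() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Show the taints of one node. A taint the pod does not tolerate
# yields the PodToleratesNodeTaints reason in the scheduling event.
show_node_taints() {
  kubectl describe node "$1" | grep -i -A2 '^Taints:'
}
```

For example: list_gpu_allocatable, then show_node_taints tensorflow0 for a node of interest.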


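  • Adjust the number of replicas
kubectl scale deploy/httpd-app --replicas=1
Note: kubectl scale works on Deployments, ReplicaSets, ReplicationControllers and StatefulSets. A DaemonSet such as kube-flannel-ds has no replicas field (it runs one pod per matching node), so it cannot be scaled this way.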


  • Start a container on a specific node
Add the field  nodeName: xxxx  to the pod spec.

e.g.:
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      image: "nfs:5000/tensorflow/tensorflow:nightly"
      #resources:
        #limits:
          #nvidia.com/gpu: 1 # requesting 1 GPU
  nodeName: tensorflow1


  • Cordon and uncordon a node
kubectl cordon {hostname} # cordon: mark the node unschedulable
kubectl uncordon {hostname} # uncordon: make it schedulable again




  • Create and delete an application
kubectl run httpd-app --image=httpd --replicas=2

kubectl get all --all-namespaces
kubectl get deployments
Delete the deployment
kubectl delete deployment xxxxx
kubectl delete deploy/httpd-app

Verify
[k8s_images]# kubectl get pods  -o wide
NAME                         READY     STATUS    RESTARTS   AGE       IP           NODE
httpd-app-5fbccd7c6c-5sx5z   1/1       Running   0          25m       10.244.2.2   tensorflow0
httpd-app-5fbccd7c6c-87jvp   1/1       Running   0          17m       10.244.2.3   tensorflow0
[k8s_images]# curl 10.244.2.2
<html><body><h1>It works!</h1></body></html>
[k8s_images]# curl 10.244.2.3
<html><body><h1>It works!</h1></body></html>




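  • RESTful API is not accessible
Version 1.5 served the API on an unencrypted insecure port (8080 by default). Version 1.9 uses the encrypted port 6443, which needs extra setup before it can be accessed.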


On the master, run  curl "https://localhost:6443/healthz" -k

-k skips certificate verification.

kubectl get clusterrole/cluster-admin -o yaml



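Edit the basic auth file (one user per line, in the column order password,user,uid)
vi /etc/kubernetes/pki/basic_auth_file
admin,admin,2
vi /etc/kubernetes/manifests/kube-apiserver.yaml
Add the line     - --basic_auth_file=/etc/kubernetes/pki/basic_auth_file
Note: in this setup the flag only worked when written with underscores (basic_auth_file). The hyphenated spelling found online (the official documentation writes --basic-auth-file) did not work here.

The change takes effect automatically: kube-apiserver runs as a static pod, so the kubelet restarts it when the manifest changes.

The basic_auth_file probably has to live under /etc/kubernetes/pki/ because that path is mounted into the apiserver container. This is only a guess and has not been tested.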


Access
On the master, run  curl -u admin:admin "https://localhost:6443/api/v1" -k
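
The basic auth steps above can be sketched as a small helper. This is an illustration only: the function name is hypothetical, and it relies on the static password file being CSV with the columns password, user name, user id, which is why the admin,admin,2 line above means password admin, user admin, uid 2.

```shell
# Append one static-password entry in the column order the API server
# reads: password,user,uid. The file is then restricted to its owner.
write_basic_auth_entry() {
  local file="$1" password="$2" user="$3" uid="$4"
  printf '%s,%s,%s\n' "$password" "$user" "$uid" >> "$file"
  chmod 600 "$file"
}

# Example call, commented out because it needs root on the master:
# write_basic_auth_entry /etc/kubernetes/pki/basic_auth_file admin admin 2
```

After the apiserver has restarted, the entry can be checked with the curl -u admin:admin call shown above.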



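Permission issues:

API documentation


Fixing the inaccessible Dashboard in Kubernetes 1.6.4

Learning the new features in Kubernetes 1.6: RBAC authorization

Installing and configuring the Kubernetes dashboard 1.8.0 Web UI

User “system:anonymous” cannot get path “/”

Official RBAC documentation

Deploying heapster on Kubernetes 1.8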




  • Accessing the dashboard fails

The endpoints look healthy
[k8s_images]# kubectl get endpoints --all-namespaces
NAMESPACE     NAME                      ENDPOINTS                     AGE
default       kubernetes                192.168.1.138:6443            18h
kube-system   kube-controller-manager   <none>                        18h
kube-system   kube-dns                  172.17.0.2:53,172.17.0.2:53   18h
kube-system   kube-scheduler            <none>                        18h
kube-system   kubernetes-dashboard      172.17.0.3:8443               29m

[k8s_images]# curl "https://172.17.0.3:8443" -k
<!doctype html> <html ng-app="kubernetesDashboard"> <head> <meta charset="utf-8"> <title ng-controller="kdTitle as $ctrl" ng-bind="$ctrl.title()"></title> <link rel="icon" type="image/png" href="assets/images/kubernetes-logo.png"> <meta name="viewport" content="width=device-width"> <link rel="stylesheet" href="static/vendor.93db0a0d.css"> <link rel="stylesheet" href="static/app.ffb1366f.css"> </head> <body ng-controller="kdMain as $ctrl"> <!--[if lt IE 10]>
      <p class="browsehappy">You are using an <strong>outdated</strong> browser.
      Please <a href="http://browsehappy.com/">upgrade your browser</a> to improve your
      experience.</p>
    <![endif]--> <kd-login layout="column" layout-fill ng-if="$ctrl.isLoginState()"> </kd-login> <kd-chrome layout="column" layout-fill ng-if="!$ctrl.isLoginState()"> </kd-chrome> <script src="static/vendor.9a600e6f.js"></script> <script src="api/appConfig.json"></script> <script src="static/app.fe2776ce.js"></script> </body> </html> 

The service details are as follows
[k8s_images]# kubectl describe  svc/kubernetes-dashboard -n kube-system
Name:                     kubernetes-dashboard
Namespace:                kube-system
Labels:                   k8s-app=kubernetes-dashboard
Annotations:              <none>
Selector:                 k8s-app=kubernetes-dashboard
Type:                     NodePort
IP:                       10.100.2.162
Port:                     <unset>  443/TCP
TargetPort:               8443/TCP
NodePort:                 <unset>  32666/TCP
Endpoints:                172.17.0.3:8443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>


curl "https://10.100.2.162:443" -k  works.

On the master,  curl "https://localhost:32666" -k  does not work.


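Tag images with a custom tag and use that tag from then on. Otherwise (with :latest, where imagePullPolicy defaults to Always) every start checks the registry and pulls the image again.
docker tag httpd:latest httpd:20180322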

The relationship between the various ports
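
As a sketch of how the three ports relate, the helper below prints a Service manifest that mirrors the kubectl describe output shown earlier (port 443, targetPort 8443, nodePort 32666). The function name is hypothetical, and the manifest is reconstructed from that output rather than copied from the real dashboard deployment files.

```shell
# Print a NodePort Service manifest annotated with what each port means.
dashboard_service_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
spec:
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
  ports:
    - port: 443          # Service port on the cluster IP (10.100.2.162:443)
      targetPort: 8443   # port the dashboard container listens on (172.17.0.3:8443)
      nodePort: 32666    # port opened on each node's own IP
EOF
}
```

Traffic to a node on nodePort, or to the cluster IP on port, is forwarded to the pod on targetPort.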





  • Dashboard cannot be opened over HTTPS

ERR_SSL_SERVER_CERT_BAD_FORMAT

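Neither IE nor Google Chrome can open it.
Open it in Firefox instead: choose to allow it and add a security exception, and the page then loads.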


  • Dashboard can no longer be viewed
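This is a permission problem. For how to configure permissions, see the separate article "Implementing multi-tenant access control with RBAC".

Edit the object directly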
kubectl -n kube-system edit service kubernetes-dashboard


GPU resource control
There was a bug; it has been fixed.
Upgrade plan

A GPU quota has only a requests value; there is no limits value.

Installing k8s 1.10.0



Setting the GPU count to 0 has no effect; this is a bug.
The current workaround is to set the environment variable NVIDIA_VISIBLE_DEVICES to an empty value.
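
A minimal sketch of that workaround, reusing the image from the nodeName example earlier. The function name and pod name are hypothetical, and the sketch assumes the NVIDIA container runtime is what reads NVIDIA_VISIBLE_DEVICES on the node.

```shell
# Print a Pod manifest that sets NVIDIA_VISIBLE_DEVICES to an empty
# value so that no GPU is exposed to the container.
no_gpu_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: no-gpu-pod          # hypothetical name
spec:
  restartPolicy: OnFailure
  containers:
    - name: no-gpu-pod
      image: "nfs:5000/tensorflow/tensorflow:nightly"
      env:
        - name: NVIDIA_VISIBLE_DEVICES
          value: ""         # empty value: hide all GPUs from this container
EOF
}
```

Usage might look like: no_gpu_pod_manifest | kubectl create -f -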