Deploy-Laboratory/docs/03-01-k3s-traefik-dashboard.md
2026-03-27 16:58:41 +08:00

# 03-02-k3s Traefik Dashboard
> Enable and access the Traefik Dashboard to inspect routes and service status.
## TL;DR
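- **Automated acceptance check**: `./scripts/verify.sh run 03-01`
- **Key prerequisites**: prepare the environment variables, Secrets, and ingress IP as described under "Prerequisites" in this document
- **Success criteria**: the "Expected" results in this document are reached and the playbook assertions pass
- **Troubleshooting**: see "Troubleshooting" in this document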
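## Prerequisites
- Traefik is already running normally.
- You understand that the Dashboard is for operations use only and should not be exposed directly to the public internet.
## Deployment notes (how many Pods, where the configuration lives, how it is synchronized)
- **How many Pods**: the default K3s Traefik is a **Deployment with replicas=1**, so there is only **one Traefik Pod**. That Pod may run on the control node or on any node you have labeled as a Traefik entry node.
- **Why every node is reachable**: there is no "one Traefik per node". Port 80 on each node is served by the K3s **ServiceLB (svclb-traefik)**, which forwards requests to the **same** Traefik Service and from there to that single Traefik Pod. Multi-node access works because the LB forwards to one shared backend, not because each node runs its own Traefik.
- **Where the configuration is stored**: **HelmChartConfig** and **IngressRoute** are Kubernetes resources stored in **etcd** (on the control node). The Traefik process watches Ingress/IngressRoute and related resources through the **Kubernetes API** and builds its routes dynamically, so **no configuration needs to be synchronized between Pods**. If Traefik is later scaled to multiple replicas, every replica reads the same resources from the same API and behaves identically.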
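
The three points above can be checked against a live cluster. The following is a minimal sketch, assuming `kubectl` is configured for the cluster and Traefik uses the K3s default names (`traefik` Deployment in `kube-system`, ServiceLB pods whose names start with `svclb-traefik`). The function names are hypothetical helpers, not part of this repository.

```bash
# Hypothetical helpers for inspecting the default K3s Traefik deployment.
# Assumes kubectl points at the cluster and the K3s default names are in use.

# Print the desired replica count of the Traefik Deployment.
traefik_replicas() {
  kubectl -n kube-system get deploy traefik -o jsonpath='{.spec.replicas}'
}

# List the ServiceLB pods that forward node traffic to the Traefik Service.
svclb_pods() {
  kubectl -n kube-system get pods -o wide | grep 'svclb-traefik'
}

# List the HelmChartConfig objects stored in the cluster.
traefik_chart_config() {
  kubectl -n kube-system get helmchartconfigs.helm.cattle.io
}
```

On a default install, `traefik_replicas` should report a single replica, while `svclb_pods` shows which nodes carry a ServiceLB pod and therefore answer on port 80.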
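## Steps
1. On the control node, create `traefik-dashboard.yaml` and place it in the K3s manifests directory (K3s loads it automatically at startup, so no manual apply is needed after a restart):
- **Default path**: `/var/lib/rancher/k3s/server/manifests/traefik-dashboard.yaml`
- **Custom data-dir** (e.g. `--data-dir=/storage`): `<data-dir>/server/manifests/traefik-dashboard.yaml`
**Single source of truth (do not duplicate it inline in this document)**: [full HelmChartConfig + IngressRoute YAML](../../ansible/files/03-01/traefik-dashboard.yaml). Copy it to the manifests path above, or run the following from the repository root: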
```bash
kubectl apply -f ansible/files/03-01/traefik-dashboard.yaml
```
2. Apply the configuration and wait for Traefik to reload (copy and run whichever block matches your actual path):
```bash
# Default path
kubectl apply -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard.yaml
kubectl -n kube-system rollout status deploy/traefik
```
```bash
# Custom data-dir (e.g. /storage)
kubectl apply -f /storage/server/manifests/traefik-dashboard.yaml
kubectl -n kube-system rollout status deploy/traefik
```
3. Verify: curl every node IP in one pass (adjust the IP list to your environment):
```bash
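# With the K3s default LB configured per 01-02 / 01-06 (Traefik entry label + firewalld baseline),
# port 80 on any of hosts .61 to .64 should return 200/307.
for ip in 192.168.2.61 192.168.2.62 192.168.2.63 192.168.2.64; do
  # On failure curl still prints 000 via -w, so replace the value instead of appending to it.
  code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 3 "http://${ip}/dashboard/" 2>/dev/null) || code="---"
  echo "${ip}: ${code}"
done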
```
Check the Traefik logs (confirm there are no errors):
```bash
kubectl -n kube-system logs deploy/traefik --tail=50
```
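
Scanning the log tail by eye is easy to get wrong, so the check can be narrowed with a filter. This is a sketch only: it assumes error lines carry the tag `level=error` (the text format used by Traefik v2; other versions or JSON logging would need a different pattern), and `traefik_error_lines` is a hypothetical helper name.

```bash
# Hypothetical helper: print error-level lines from the Traefik log tail.
# $1 = number of lines to read (default 50)
# $2 = pattern to match (default "level=error", an assumption about the log format)
# Returns grep's status: 0 if matching lines were found, 1 if none.
traefik_error_lines() {
  tail_lines="${1:-50}"
  pattern="${2:-level=error}"
  kubectl -n kube-system logs deploy/traefik --tail="$tail_lines" | grep -i -- "$pattern"
}
```

For example, `traefik_error_lines 200 || echo "no error lines found"` prints any matches from the last 200 lines, or the fallback message when there are none.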
Optional: show only the response headers (single node):
`curl -I --max-time 5 http://192.168.2.61/dashboard/`
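
The per-node loop in step 3 prints status codes but leaves the pass/fail decision to the reader. Below is a minimal sketch that turns it into an exit status, assuming (as the comment in that loop states) that 200 and 307 are the acceptable codes. `dashboard_ok` and `check_nodes` are hypothetical names.

```bash
# Hypothetical helpers; assumes 200 and 307 are the only acceptable codes.

# Succeed when the given HTTP status code is acceptable for /dashboard/.
dashboard_ok() {
  case "$1" in
    200|307) return 0 ;;
    *) return 1 ;;
  esac
}

# Probe http://<ip>/dashboard/ on each IP given as an argument.
# Returns non-zero if any node fails the check.
check_nodes() {
  failed=0
  for ip in "$@"; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 "http://${ip}/dashboard/" 2>/dev/null) || code="---"
    if dashboard_ok "$code"; then
      echo "${ip}: ${code} ok"
    else
      echo "${ip}: ${code} FAIL"
      failed=1
    fi
  done
  return "$failed"
}
```

Called as `check_nodes 192.168.2.61 192.168.2.62 192.168.2.63 192.168.2.64`, it can gate a script with `&&` or `||` instead of relying on visual inspection.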
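## Removing the deployment and files
Because a chart can have only one HelmChartConfig, delete this deployment and remove the manifest file before running later tests such as 03-03 (ACME) and 03-04 (Dashboard + ACME combined). Otherwise the configuration may be overwritten or loaded twice.
1. **Delete the in-cluster resources** (HelmChartConfig + IngressRoute):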
```bash
# Default path
kubectl delete -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard.yaml
kubectl -n kube-system rollout status deploy/traefik
```
```bash
# Custom data-dir (e.g. /storage)
kubectl delete -f /storage/server/manifests/traefik-dashboard.yaml
kubectl -n kube-system rollout status deploy/traefik
```
2. **Delete the manifest file on the host** (otherwise K3s loads it again on restart):
```bash
# Default path
sudo rm -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard.yaml
```
```bash
# Custom data-dir (e.g. /storage)
sudo rm -f /storage/server/manifests/traefik-dashboard.yaml
```
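
The two removal steps can be combined into one helper that derives the manifest path from the data-dir. This is a sketch, assuming the file keeps the name `traefik-dashboard.yaml` and that the caller is allowed to delete it (run as root or add `sudo` as needed). `manifest_path` and `remove_dashboard` are hypothetical names.

```bash
# Hypothetical helpers; the data-dir defaults to the K3s default /var/lib/rancher/k3s.

# Print the manifest path for a given K3s data-dir (trailing slash tolerated).
manifest_path() {
  data_dir="${1:-/var/lib/rancher/k3s}"
  printf '%s/server/manifests/traefik-dashboard.yaml\n' "${data_dir%/}"
}

# Delete the in-cluster resources, wait for Traefik to settle, then remove the file.
# Stops at the first failing step so the file is kept if the cluster delete fails.
remove_dashboard() {
  manifest="$(manifest_path "$1")"
  if [ ! -f "$manifest" ]; then
    echo "manifest not found: $manifest" >&2
    return 1
  fi
  kubectl delete -f "$manifest" --ignore-not-found &&
    kubectl -n kube-system rollout status deploy/traefik &&
    rm -f "$manifest"
}
```

`remove_dashboard` covers the default path, and `remove_dashboard /storage` covers the custom data-dir example used in this document.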
## Next steps
- `02-05-nginx-验证矩阵-一键部署.md` (if you have not yet run the HTTP matrix verification; you may want to read `02-00-nginx-系列说明.md` first)
- `04-01-k3s-nodejs-高级部署.md`
## Troubleshooting
- **Check the playbook output first**: on failure, identify whether the deploy, wait, or http_check step failed.
- **Cluster overview**: `kubectl get nodes -o wide`, `kubectl -n kube-system get pods -o wide`.
- **Events and logs**: `kubectl -n <ns> describe ...`, `kubectl -n <ns> logs ... --tail=200`.
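
To capture the cluster-side view in one pass, the commands above can be bundled for the Traefik deployment. This is a sketch that assumes the default `kube-system` namespace and the `deploy/traefik` target; `collect_traefik_diag` is a hypothetical name.

```bash
# Hypothetical helper: print a cluster overview plus Traefik events and logs.
# $1 = namespace (defaults to kube-system, the K3s default for Traefik)
collect_traefik_diag() {
  ns="${1:-kube-system}"
  echo "== nodes =="
  kubectl get nodes -o wide
  echo "== pods in ${ns} =="
  kubectl -n "$ns" get pods -o wide
  echo "== describe deploy/traefik =="
  kubectl -n "$ns" describe deploy/traefik
  echo "== logs deploy/traefik (last 200 lines) =="
  kubectl -n "$ns" logs deploy/traefik --tail=200
}
```

Redirecting the output, for example `collect_traefik_diag > traefik-diag.txt 2>&1`, gives a single file to attach when reporting a failed run.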