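Grafana Loki Single-Node Deployment Guide: A Production-Ready Setup on Alibaba Cloud OSS
This article walks you through a quick deployment of a production-ready Loki + Grafana log aggregation stack that uses Alibaba Cloud OSS as its storage backend.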
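📊 What the Result Looks Like
Want to know what querying feels like in practice? See this hands-on article:
👉 Grafana Loki: Enough Architecture Talk, Let Me Show You Real Query Results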
🚀 10-Minute Quick Deployment (MVP)
Prerequisites
- An Ubuntu/Debian server
- An Alibaba Cloud OSS bucket (see the OSS configuration guide)
① Prerequisite Check
# 1. Update the system
&&
② Install the Binary Packages (APT Repository)
# Import the GPG key and add the repository
|
|
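The import and install commands for this step can be sketched as follows. This is a minimal sketch, not necessarily the exact commands used originally: it assumes the keyring path and repository line from Grafana's published APT instructions, the package names loki and grafana, and that wget and gpg are already installed. The function names grafana_apt_line and install_loki_grafana are hypothetical helpers made up for illustration.

```shell
# Assumption: keyring location taken from Grafana's published APT instructions.
KEYRING=/etc/apt/keyrings/grafana.gpg

# Print the APT source line for Grafana's "stable" repository.
grafana_apt_line() {
  echo "deb [signed-by=${KEYRING}] https://apt.grafana.com stable main"
}

# Import the key, register the repository, and install both packages.
# Needs root privileges and network access, so it is only defined here, not run.
install_loki_grafana() {
  sudo mkdir -p "$(dirname "$KEYRING")"
  wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee "$KEYRING" > /dev/null
  grafana_apt_line | sudo tee /etc/apt/sources.list.d/grafana.list > /dev/null
  sudo apt-get update
  sudo apt-get install -y loki grafana
}
```

On the server, running install_loki_grafana once performs the whole step. Keeping the repository line in its own function makes it easy to inspect before anything is written to /etc/apt.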
③ Directory Permissions
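This step has to leave the directories named in the Loki configuration (/var/lib/loki/index, /var/lib/loki/index_cache, and /var/lib/loki/compactor) writable by the Loki service. A minimal sketch, assuming the package created a loki user and group (verify with id loki); prepare_loki_dirs is a hypothetical helper name.

```shell
# Create the data directories Loki expects under a base path and,
# if an owner is given, hand the whole tree over to that owner.
prepare_loki_dirs() {
  base="$1"
  owner="${2:-}"   # optional, e.g. loki:loki (assumed account name)
  mkdir -p "$base/index" "$base/index_cache" "$base/compactor"
  if [ -n "$owner" ]; then
    chown -R "$owner" "$base"
  fi
}
```

As root on the server this would be called as prepare_loki_dirs /var/lib/loki loki:loki. Leaving the owner argument off only creates the directories, which is convenient for a dry run in a scratch location.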
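④ Configure Loki (/etc/loki/config.yml)
To make the settings easier to follow, I have kept the official documentation text as comments next to them, and the appendix at the end of the article reproduces several official examples.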
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
  log_level: info  # debug can be helpful while you are first setting things up
common:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
  # The number of ingesters to write to and read from.
  # CLI flag: -pattern-ingester.distributor.replication-factor
  # [replication_factor: <int> | default = 1]
  replication_factor: 1
  path_prefix: /var/lib/loki
schema_config:
  configs:
    - from:
      store: tsdb
      object_store: alibabacloud
      schema: v13
      index:
        prefix: index_
        period: 24h
storage_config:
  # Configures storing index in an Object Store
  # (GCS/S3/Azure/Swift/COS/Filesystem) in a prometheus TSDB-like format. Required
  # fields only required when TSDB is defined in config.
  tsdb_shipper:
    # Directory where ingesters would write index files which would then be
    # uploaded by shipper to configured storage
    # CLI flag: -tsdb.shipper.active-index-directory
    active_index_directory: /var/lib/loki/index
    # Cache location for restoring index files from storage for queries
    # CLI flag: -tsdb.shipper.cache-location
    cache_location: /var/lib/loki/index_cache
    cache_ttl: 24h  # Can be increased for faster performance over longer query periods, uses more disk space
  alibabacloud:
    bucket: xxxxxx
    endpoint: xxxxxxx.aliyuncs.com
    access_key_id: xxxxxxxxxx
    secret_access_key: xxxxxxxxxxx
frontend:
  # Defines the encoding for requests to and responses from the scheduler and
  # querier. Can be 'json' or 'protobuf' (defaults to 'json').
  # CLI flag: -frontend.encoding
  encoding: protobuf
analytics:
  # Enable anonymous usage reporting.
  # CLI flag: -reporting.enabled
  reporting_enabled: false
compactor:
  # Directory where files can be downloaded for compaction.
  # CLI flag: -compactor.working-directory
  working_directory: /var/lib/loki/compactor
  # Interval at which to re-run the compaction operation.
  # CLI flag: -compactor.compaction-interval
  compaction_interval: 5m
querier:
  # Maximum lookback beyond which queries are not sent to ingester. 0 means all
  # queries are sent to ingester.
  # CLI flag: -querier.query-ingesters-within
  query_ingesters_within: 0s
⚠️ Pitfalls
• Do not include the bucket name in the endpoint value.
• If you see AccessDenied, check whether the RAM policy grants sufficient oss:* permissions.
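The first pitfall is easy to catch before Loki is even started. The sketch below is one way to do that check; check_oss_endpoint is a hypothetical helper, and it only looks for the two most common mistakes (the bucket name as a host prefix or as a path segment).

```shell
# Return non-zero if the endpoint value contains the bucket name,
# either as a host prefix (bucket.oss-...) or as a path segment (.../bucket).
check_oss_endpoint() {
  bucket="$1"
  endpoint="$2"
  case "$endpoint" in
    "$bucket".*|*/"$bucket"|*/"$bucket"/*)
      echo "endpoint must not contain the bucket name: $endpoint" >&2
      return 1
      ;;
  esac
  return 0
}
```

For example, check_oss_endpoint loki-logs-prod oss-cn-hangzhou-internal.aliyuncs.com succeeds, while passing loki-logs-prod.oss-cn-hangzhou.aliyuncs.com as the endpoint fails with a message on stderr.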
⑤ Start the Services and Enable Them at Boot
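A minimal sketch of this step, assuming the systemd unit names loki and grafana-server that the Debian packages normally install, and Loki's /ready endpoint on port 3100 as configured above. The helper names start_services and wait_for_loki are hypothetical.

```shell
# Enable both units at boot and start them right away (needs root).
start_services() {
  sudo systemctl daemon-reload
  sudo systemctl enable --now loki grafana-server
}

# Poll Loki's readiness endpoint until it answers successfully
# or the given number of attempts is used up.
wait_for_loki() {
  url="${1:-http://localhost:3100/ready}"
  attempts="${2:-30}"
  interval="${3:-2}"   # seconds to wait between attempts
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fs -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}
```

Calling start_services && wait_for_loki on the server starts everything and blocks until Loki reports ready. A freshly started Loki can take a little while before /ready succeeds, which is why the loop retries instead of checking once.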
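⑥ Add the Data Source in Grafana
Open http://<ECS_IP>:3000 in a browser.
The default account is admin / admin
→ change the password immediately →
Configuration → Data Sources → Loki (Connections → Data sources in recent Grafana releases) → set the URL to http://localhost:3100
→ Save & Test ✔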
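As an optional alternative to clicking through the UI, the same data source can be declared with Grafana's file-based provisioning. The sketch below assumes the default provisioning directory /etc/grafana/provisioning/datasources used by the Debian package; write_loki_datasource and the file name loki.yaml are hypothetical choices.

```shell
# Write a Grafana provisioning file that declares the local Loki data source.
write_loki_datasource() {
  dir="${1:-/etc/grafana/provisioning/datasources}"
  mkdir -p "$dir"
  cat > "$dir/loki.yaml" <<'EOF'
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://localhost:3100
EOF
}
```

Run as root, write_loki_datasource followed by sudo systemctl restart grafana-server should make the Loki data source appear without any manual steps. Passing a different directory as the first argument lets you inspect the generated file first.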
✅ Congratulations! The base environment is ready.
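A quick way to confirm that the pipeline works end to end is to push one test line through Loki's HTTP push API (POST /loki/api/v1/push) and then look for it in Grafana. This is a minimal sketch: the job label smoke-test and the helper names push_payload and push_test_line are made up for illustration, and the payload builder does no JSON escaping, so it assumes the message contains no quotes or backslashes.

```shell
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# Build a push-API JSON body with one stream and one log line.
# ts_ns is the timestamp in nanoseconds since the Unix epoch.
push_payload() {
  job="$1"
  message="$2"
  ts_ns="$3"
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' \
    "$job" "$ts_ns" "$message"
}

# Send a single test line stamped with the current time.
push_test_line() {
  now_ns="$(date +%s)000000000"
  push_payload "smoke-test" "hello from curl" "$now_ns" |
    curl -fsS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "$LOKI_URL/loki/api/v1/push"
}
```

After running push_test_line on the server, the query {job="smoke-test"} in Grafana's Explore view should return the line. If it does, writes to the ingester and reads through the data source both work.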
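🔧 Configuration Deep Dive
Why This Architecture?
Loki's distinctive design principles:
- 📊 Label-only indexing: log content gets no full-text index, which sharply reduces storage cost
- 🗜️ Efficient compression: logs are stored as compressed chunks, with storage efficiency that can be 10× or more that of traditional full-text solutions
- 🔄 Cloud native: object storage is supported natively, so the setup is easy to scale
Why TSDB + Alibaba Cloud OSS:
- Since Loki 2.8, TSDB has been the officially recommended index store
- Access to OSS over the Alibaba Cloud internal network (same region) incurs no traffic charges
- A single-node setup is simple to operate and suits small and medium-sized applications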
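Storage Configuration Explained
TSDB Index Configuration
tsdb_shipper:
  active_index_directory: /var/lib/loki/index  # directory for the active index
  cache_location: /var/lib/loki/index_cache    # directory for the index cache
  cache_ttl: 24h                               # how long cached index files are kept
Why configure it this way:
- active_index_directory: holds the index files for the period currently being written
- cache_location: caches historical index files downloaded from OSS, which speeds up queries
- cache_ttl: trades disk usage against query performance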
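Alibaba Cloud OSS Configuration
Supplying the credentials through environment variables is preferable to hard-coding them.
alibabacloud:
  bucket: loki-logs-prod                           # bucket name
  endpoint: oss-cn-hangzhou-internal.aliyuncs.com  # internal endpoint
  access_key_id: LTA***                            # AccessKey ID
  secret_access_key: ***                           # AccessKey secret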
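One way to follow the environment-variable recommendation is Loki's -config.expand-env=true flag, which substitutes ${VAR} references in the config file at startup. The sketch below assumes the variable names OSS_ACCESS_KEY_ID and OSS_SECRET_ACCESS_KEY (my own choice, not names Loki requires); require_oss_env and run_loki are hypothetical helpers.

```shell
# In /etc/loki/config.yml the credentials would then read:
#   access_key_id: ${OSS_ACCESS_KEY_ID}
#   secret_access_key: ${OSS_SECRET_ACCESS_KEY}

# Fail early with a clear message if a credential variable is missing.
require_oss_env() {
  for name in OSS_ACCESS_KEY_ID OSS_SECRET_ACCESS_KEY; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
  return 0
}

# Start Loki in the foreground with environment expansion enabled.
run_loki() {
  require_oss_env || return 1
  loki -config.file=/etc/loki/config.yml -config.expand-env=true
}
```

This runs Loki in the foreground, which is handy for a first test. With the packaged systemd unit, the flag and the variables would have to be supplied through a unit override instead; that part is left out of the sketch.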
Choosing an endpoint:
- Internal endpoint: oss-cn-{region}-internal.aliyuncs.com (no traffic charges)
- Public endpoint: oss-cn-{region}.aliyuncs.com (traffic charges apply)
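The two endpoint patterns above differ only in the -internal suffix, so the value can be derived from the region and the network type. A small sketch, where oss_endpoint is a hypothetical helper and the region is given without the oss-cn- prefix:

```shell
# Print the OSS endpoint for a region, defaulting to the internal network.
oss_endpoint() {
  region="$1"                # e.g. hangzhou
  network="${2:-internal}"   # internal or public
  case "$network" in
    internal) echo "oss-cn-${region}-internal.aliyuncs.com" ;;
    public)   echo "oss-cn-${region}.aliyuncs.com" ;;
    *)
      echo "unknown network type: $network" >&2
      return 1
      ;;
  esac
}
```

For an ECS instance in the same region as the bucket, oss_endpoint hangzhou prints oss-cn-hangzhou-internal.aliyuncs.com, while oss_endpoint hangzhou public prints the public address for access from elsewhere.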
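⚠️ Troubleshooting Guide
Common Errors and Fixes
Symptom | Likely cause | Fix |
---|---|---|
permission denied | The loki user lacks permissions | sudo chown -R loki /var/lib/loki |
failed to create OSS client | OSS misconfiguration | Check the bucket, endpoint, and key settings; widen the RAM permission to * to rule out a policy problem |
context deadline exceeded | Network timeout | Confirm you are using the right endpoint (internal vs. public) |
schema config invalid | Syntax error in the config file | Remove all # comments and check the YAML indentation |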
Debugging Tips
Enable debug mode:
server:
  log_level: debug  # remember to switch back to info in production
Check the service status:
# View detailed logs
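The status and log checks can be sketched like this, assuming Loki runs as the systemd unit loki and writes logfmt lines containing level=error or level=warn to the journal. The helper names only_problems, show_loki_problems, and verify_loki_config are hypothetical; -verify-config is Loki's flag for checking a config file and exiting.

```shell
# Keep only warning and error lines from logfmt output read on stdin.
only_problems() {
  grep -E 'level=(error|warn)'
}

# Show recent warnings and errors from the loki unit's journal.
show_loki_problems() {
  journalctl -u loki --since "${1:-10 minutes ago}" --no-pager | only_problems
}

# Validate the configuration file without starting the server.
verify_loki_config() {
  loki -verify-config -config.file="${1:-/etc/loki/config.yml}"
}
```

On the server, systemctl status loki gives the overall state, show_loki_problems narrows the journal down to the lines that matter, and verify_loki_config is a quick first check when the service refuses to start after a config change.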
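Network and Performance Issues
Impact of bandwidth limits:
- The 99-yuan Alibaba Cloud ECS plan has little bandwidth (ECS-to-OSS traffic is not limited; the 3 Mbps public bandwidth is the bottleneck)
- Querying 24 hours of logs may require transferring several MB of data
- Recommendation: upgrade the bandwidth or switch to pay-as-you-go bandwidth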
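To get a feel for what the 3 Mbps limit means, the transfer time is simply the payload size in megabits divided by the link speed. The sketch below does that arithmetic; transfer_seconds is a hypothetical helper, and the 10 MB figure is an example value, not a measurement.

```shell
# Estimate how many seconds it takes to move size_mb megabytes
# over a link of link_mbps megabits per second (1 byte = 8 bits).
transfer_seconds() {
  size_mb="$1"
  link_mbps="$2"
  awk -v mb="$size_mb" -v mbps="$link_mbps" 'BEGIN { printf "%.1f\n", mb * 8 / mbps }'
}

transfer_seconds 10 3   # prints 26.7
```

So a query result of roughly 10 MB needs close to half a minute on a 3 Mbps public link, ignoring protocol overhead, which is why the public bandwidth rather than OSS is the bottleneck for browser access.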
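Storage cost optimization:
limits_config:
  retention_period: 30d  # keep logs for 30 days
compactor:
  retention_enabled: true
  retention_delete_delay: 2h
  # Note: Loki 3.x also expects delete_request_store to be set when retention is enabled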
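🔐 Production Security Hardening
⚠️ Important: this tutorial does not enable authentication by default. Public access has been blocked through the security group, so the setup is suitable for internal networks only. Never expose the ports to the public internet.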
Option 1: Nginx Reverse Proxy with Authentication
server {
    listen 80;
    server_name loki.yourdomain.com;
    auth_basic "Loki Access";
    auth_basic_user_file /etc/nginx/.htpasswd;
    location / {
        proxy_pass http://127.0.0.1:3100;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
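The Nginx configuration refers to /etc/nginx/.htpasswd, which has to exist before basic auth works. One way to create an entry without extra packages is openssl passwd with the apr1 scheme, which Nginx accepts. This is a sketch under that assumption; htpasswd_line is a hypothetical helper, and passing the password as an argument exposes it to the process list, so prefer an interactive prompt on shared machines.

```shell
# Print one htpasswd-style line (user:hash) using the apr1 MD5 scheme.
htpasswd_line() {
  user="$1"
  password="$2"
  printf '%s:%s\n' "$user" "$(openssl passwd -apr1 "$password")"
}
```

On the server, htpasswd_line loki 'choose-a-password' | sudo tee /etc/nginx/.htpasswd > /dev/null writes the file, and sudo nginx -t && sudo systemctl reload nginx applies the configuration. Because the server block listens on plain HTTP, the credentials travel unencrypted unless TLS is added in front.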
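Option 2: Network Isolation
- Use Alibaba Cloud security groups to restrict where access may come from
- Allow only internal IP ranges to reach port 3100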
🔜 Coming Up Next
With the Loki environment in place, the next step is to have applications send their logs to Loki.
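📋 Appendix: Official Configuration Examples
To make the configuration differences between scenarios easier to see, here are several official configuration examples for reference:
Local Filesystem Configuration Example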
# This is a complete configuration to deploy Loki backed by the filesystem.
# The index will be shipped to the storage via tsdb-shipper.
auth_enabled: false
server:
  http_listen_port: 3100
common:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
  replication_factor: 1
  path_prefix: /tmp/loki
schema_config:
  configs:
    - from:
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h
storage_config:
  filesystem:
    directory: /tmp/loki/chunks
Official S3-Compatible Storage Example
# This is a complete configuration to deploy Loki backed by a s3-compatible API
# like MinIO for storage.
# Index files will be written locally at /loki/index and, eventually, will be shipped to the storage via tsdb-shipper.
auth_enabled: false
server:
  http_listen_port: 3100
common:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
  replication_factor: 1
  path_prefix: /loki
schema_config:
  configs:
    - from:
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h
storage_config:
  tsdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
  aws:
    s3: s3://access_key:secret_access_key@custom_endpoint/bucket_name
    s3forcepathstyle: true
S3 Expanded Configuration
# S3 configuration supports an expanded configuration.
# Either an `s3` endpoint URL can be used, or an expanded configuration can be used.
storage_config:
  aws:
    bucketnames: bucket_name1, bucket_name2
    endpoint: s3.endpoint.com
    region: s3_region
    access_key_id: s3_access_key_id
    secret_access_key: s3_secret_access_key
    insecure: false
    http_config:
      idle_conn_timeout: 90s
      response_header_timeout: 0s
      insecure_skip_verify: false
    s3forcepathstyle: true
Alibaba Cloud OSS Storage Configuration
# This partial configuration uses Alibaba for chunk storage.
common:
  path_prefix: /tmp/loki
schema_config:
  configs:
    - from:
      store: tsdb
      object_store: alibabacloud
      schema: v13
      index:
        prefix: index_
        period: 24h
storage_config:
  tsdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
  alibabacloud:
    bucket: <bucket>
    endpoint: <endpoint>
    access_key_id: <access_key_id>
    secret_access_key: <secret_access_key>