487 lines
12 KiB
Markdown
487 lines
12 KiB
Markdown
# 服务端性能优化指南
|
||
|
||
## 🎯 优化目标
|
||
- 降低延迟 (< 100ms)
|
||
- 提高吞吐量 (支持100+并发设备)
|
||
- 减少内存占用 (< 300MB)
|
||
- 提升丢帧率 (< 2%)
|
||
|
||
---
|
||
|
||
## 1️⃣ 立即可实施的优化 (高优先级)
|
||
|
||
### 1.1 启用消息批处理
|
||
**问题**: 每条消息单独发送,频繁的Socket.IO调用
|
||
**方案**: 批量发送消息,减少网络往返
|
||
|
||
```typescript
|
||
// src/services/MessageRouter.ts 中添加
|
||
private messageQueue: Map<string, any[]> = new Map()
|
||
private readonly BATCH_SIZE = 10
|
||
private readonly BATCH_TIMEOUT = 50 // 50ms
|
||
|
||
private async flushMessageQueue(clientId: string) {
|
||
const messages = this.messageQueue.get(clientId)
|
||
if (!messages || messages.length === 0) return
|
||
|
||
const socket = this.webClientManager.getClientSocket(clientId)
|
||
if (socket) {
|
||
socket.emit('batch_messages', messages)
|
||
}
|
||
this.messageQueue.delete(clientId)
|
||
}
|
||
|
||
// 修改 sendToClient 方法
|
||
private queueMessage(clientId: string, event: string, data: any) {
|
||
if (!this.messageQueue.has(clientId)) {
|
||
this.messageQueue.set(clientId, [])
|
||
setTimeout(() => this.flushMessageQueue(clientId), this.BATCH_TIMEOUT)
|
||
}
|
||
|
||
const queue = this.messageQueue.get(clientId)!
|
||
queue.push({ event, data })
|
||
|
||
if (queue.length >= this.BATCH_SIZE) {
|
||
this.flushMessageQueue(clientId)
|
||
}
|
||
}
|
||
```
|
||
|
||
### 1.2 实现数据压缩 (可选)
|
||
**问题**: 大屏幕数据传输量大
|
||
**方案**: 使用zlib压缩,仅在必要时启用
|
||
|
||
```typescript
|
||
import zlib from 'zlib'
|
||
|
||
private compressData(data: Buffer): Buffer {
|
||
return zlib.deflateSync(data, { level: 6 }) // 平衡速度和压缩率
|
||
}
|
||
|
||
private decompressData(data: Buffer): Buffer {
|
||
return zlib.inflateSync(data)
|
||
}
|
||
|
||
// 在路由屏幕数据时
|
||
if (screenData.data instanceof Buffer && screenData.data.length > 500000) {
|
||
const compressed = this.compressData(screenData.data)
|
||
screenData.data = compressed
|
||
screenData.compressed = true
|
||
}
|
||
```
|
||
|
||
### 1.3 优化数据库查询
|
||
**问题**: 频繁的Socket ID查询
|
||
**方案**: 添加查询缓存
|
||
|
||
```typescript
|
||
// src/services/DatabaseService.ts 中添加
|
||
private socketIdCache: Map<string, { deviceId: string, timestamp: number }> = new Map()
|
||
private readonly CACHE_TTL = 60000 // 1分钟
|
||
|
||
getDeviceBySocketIdCached(socketId: string): DeviceRecord | null {
|
||
const cached = this.socketIdCache.get(socketId)
|
||
if (cached && Date.now() - cached.timestamp < this.CACHE_TTL) {
|
||
return this.getDeviceById(cached.deviceId)
|
||
}
|
||
|
||
const device = this.getDeviceBySocketId(socketId)
|
||
if (device) {
|
||
this.socketIdCache.set(socketId, { deviceId: device.deviceId, timestamp: Date.now() })
|
||
}
|
||
return device
|
||
}
|
||
|
||
// 在设备断开时清理缓存
|
||
invalidateSocketCache(socketId: string) {
|
||
this.socketIdCache.delete(socketId)
|
||
}
|
||
```
|
||
|
||
### 1.4 增强连接池管理
|
||
**问题**: 连接管理不够精细
|
||
**方案**: 实现连接池和优先级队列
|
||
|
||
```typescript
|
||
// src/managers/ConnectionPool.ts (新文件)
|
||
export class ConnectionPool {
|
||
private connections: Map<string, ConnectionInfo> = new Map()
|
||
private priorityQueue: PriorityQueue<string> = new PriorityQueue()
|
||
private readonly MAX_CONNECTIONS = 1000
|
||
private readonly IDLE_TIMEOUT = 300000 // 5分钟
|
||
|
||
addConnection(socketId: string, priority: 'high' | 'normal' | 'low' = 'normal') {
|
||
if (this.connections.size >= this.MAX_CONNECTIONS) {
|
||
this.evictLRU()
|
||
}
|
||
|
||
this.connections.set(socketId, {
|
||
socketId,
|
||
createdAt: Date.now(),
|
||
lastActivity: Date.now(),
|
||
priority,
|
||
dataTransferred: 0
|
||
})
|
||
}
|
||
|
||
updateActivity(socketId: string) {
|
||
const conn = this.connections.get(socketId)
|
||
if (conn) {
|
||
conn.lastActivity = Date.now()
|
||
}
|
||
}
|
||
|
||
private evictLRU() {
|
||
let lruSocket = ''
|
||
let lruTime = Date.now()
|
||
|
||
for (const [socketId, conn] of this.connections) {
|
||
if (conn.lastActivity < lruTime && conn.priority === 'low') {
|
||
lruSocket = socketId
|
||
lruTime = conn.lastActivity
|
||
}
|
||
}
|
||
|
||
if (lruSocket) {
|
||
this.connections.delete(lruSocket)
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
## 2️⃣ 中期优化 (1-2周)
|
||
|
||
### 2.1 实现消息队列
|
||
**问题**: 高并发时消息丢失
|
||
**方案**: 集成Bull队列库
|
||
|
||
```bash
|
||
npm install bull redis
|
||
```
|
||
|
||
```typescript
|
||
// src/services/MessageQueue.ts (新文件)
|
||
import Queue from 'bull'
|
||
|
||
export class MessageQueueService {
|
||
private screenDataQueue: Queue.Queue
|
||
private controlCommandQueue: Queue.Queue
|
||
|
||
constructor() {
|
||
this.screenDataQueue = new Queue('screen-data', {
|
||
redis: { host: 'localhost', port: 6379 },
|
||
defaultJobOptions: {
|
||
attempts: 3,
|
||
backoff: { type: 'exponential', delay: 2000 },
|
||
removeOnComplete: true
|
||
}
|
||
})
|
||
|
||
this.controlCommandQueue = new Queue('control-command', {
|
||
redis: { host: 'localhost', port: 6379 },
|
||
defaultJobOptions: {
|
||
priority: 10, // 控制命令优先级高
|
||
removeOnComplete: true
|
||
}
|
||
})
|
||
|
||
this.setupProcessors()
|
||
}
|
||
|
||
private setupProcessors() {
|
||
this.screenDataQueue.process(10, async (job) => {
|
||
// 处理屏幕数据
|
||
return this.processScreenData(job.data)
|
||
})
|
||
|
||
this.controlCommandQueue.process(20, async (job) => {
|
||
// 处理控制命令
|
||
return this.processControlCommand(job.data)
|
||
})
|
||
}
|
||
|
||
async enqueueScreenData(data: ScreenData) {
|
||
await this.screenDataQueue.add(data, { delay: 0 })
|
||
}
|
||
|
||
async enqueueControlCommand(command: ControlMessage) {
|
||
await this.controlCommandQueue.add(command, { priority: 10 })
|
||
}
|
||
}
|
||
```
|
||
|
||
### 2.2 添加Redis缓存层
|
||
**问题**: 频繁数据库查询
|
||
**方案**: 使用Redis缓存热数据
|
||
|
||
```typescript
|
||
// src/services/CacheService.ts (新文件)
|
||
import redis from 'redis'
|
||
|
||
export class CacheService {
|
||
private client: redis.RedisClient
|
||
private readonly TTL = 300 // 5分钟
|
||
|
||
constructor() {
|
||
this.client = redis.createClient({
|
||
host: 'localhost',
|
||
port: 6379,
|
||
db: 0
|
||
})
|
||
}
|
||
|
||
async getDevice(deviceId: string) {
|
||
const cached = await this.client.get(`device:${deviceId}`)
|
||
return cached ? JSON.parse(cached) : null
|
||
}
|
||
|
||
async setDevice(deviceId: string, data: any) {
|
||
await this.client.setex(`device:${deviceId}`, this.TTL, JSON.stringify(data))
|
||
}
|
||
|
||
async getDeviceState(deviceId: string) {
|
||
const cached = await this.client.get(`state:${deviceId}`)
|
||
return cached ? JSON.parse(cached) : null
|
||
}
|
||
|
||
async setDeviceState(deviceId: string, state: any) {
|
||
await this.client.setex(`state:${deviceId}`, this.TTL, JSON.stringify(state))
|
||
}
|
||
|
||
async invalidateDevice(deviceId: string) {
|
||
await this.client.del(`device:${deviceId}`)
|
||
await this.client.del(`state:${deviceId}`)
|
||
}
|
||
}
|
||
```
|
||
|
||
### 2.3 实现连接监控和告警
|
||
**问题**: 无法及时发现性能问题
|
||
**方案**: 添加Prometheus指标
|
||
|
||
```typescript
|
||
// src/services/MetricsService.ts (新文件)
|
||
import { Counter, Gauge, Histogram } from 'prom-client'
|
||
|
||
export class MetricsService {
|
||
private messageCounter = new Counter({
|
||
name: 'messages_total',
|
||
help: 'Total messages processed',
|
||
labelNames: ['type', 'status']
|
||
})
|
||
|
||
private connectionGauge = new Gauge({
|
||
name: 'active_connections',
|
||
help: 'Number of active connections',
|
||
labelNames: ['type']
|
||
})
|
||
|
||
private latencyHistogram = new Histogram({
|
||
name: 'message_latency_ms',
|
||
help: 'Message processing latency',
|
||
labelNames: ['type'],
|
||
buckets: [10, 50, 100, 200, 500, 1000]
|
||
})
|
||
|
||
recordMessage(type: string, status: 'success' | 'failed') {
|
||
this.messageCounter.inc({ type, status })
|
||
}
|
||
|
||
setConnections(type: string, count: number) {
|
||
this.connectionGauge.set({ type }, count)
|
||
}
|
||
|
||
recordLatency(type: string, ms: number) {
|
||
this.latencyHistogram.observe({ type }, ms)
|
||
}
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
## 3️⃣ 长期优化 (1个月+)
|
||
|
||
### 3.1 实现分布式架构
|
||
**方案**: 使用Socket.IO Adapter支持多服务器
|
||
|
||
```bash
|
||
npm install @socket.io/redis-adapter
|
||
```
|
||
|
||
```typescript
|
||
import { createAdapter } from '@socket.io/redis-adapter'
|
||
import { createClient } from 'redis'
|
||
|
||
const pubClient = createClient({ host: 'localhost', port: 6379 })
|
||
const subClient = pubClient.duplicate()
|
||
|
||
io.adapter(createAdapter(pubClient, subClient))
|
||
```
|
||
|
||
### 3.2 实现负载均衡
|
||
**方案**: 使用Nginx反向代理
|
||
|
||
```nginx
|
||
upstream socket_servers {
|
||
server localhost:3001;
|
||
server localhost:3002;
|
||
server localhost:3003;
|
||
}
|
||
|
||
server {
|
||
listen 80;
|
||
|
||
location / {
|
||
proxy_pass http://socket_servers;
|
||
proxy_http_version 1.1;
|
||
proxy_set_header Upgrade $http_upgrade;
|
||
proxy_set_header Connection "upgrade";
|
||
proxy_set_header Host $host;
|
||
}
|
||
}
|
||
```
|
||
|
||
### 3.3 实现CDN支持
|
||
**方案**: 使用CDN加速大文件传输
|
||
|
||
```typescript
|
||
// 屏幕截图上传到CDN
|
||
async uploadScreenshotToCDN(deviceId: string, data: Buffer) {
|
||
const key = `screenshots/${deviceId}/${Date.now()}.jpg`
|
||
const url = await this.cdnService.upload(key, data)
|
||
|
||
// 发送CDN URL而不是原始数据
|
||
this.webClientManager.sendToClient(clientId, 'screen_data', {
|
||
deviceId,
|
||
url,
|
||
timestamp: Date.now()
|
||
})
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
## 4️⃣ 性能测试和监控
|
||
|
||
### 4.1 添加性能测试
|
||
```bash
|
||
npm install --save-dev autocannon
|
||
```
|
||
|
||
```typescript
|
||
// test/performance.test.ts
|
||
import autocannon from 'autocannon'
|
||
|
||
async function runPerformanceTest() {
|
||
const result = await autocannon({
|
||
url: 'http://localhost:3001',
|
||
connections: 100,
|
||
duration: 30,
|
||
requests: [
|
||
{
|
||
path: '/api/devices',
|
||
method: 'GET',
|
||
headers: { 'Authorization': 'Bearer token' }
|
||
}
|
||
]
|
||
})
|
||
|
||
console.log(result)
|
||
}
|
||
```
|
||
|
||
### 4.2 监控关键指标
|
||
```typescript
|
||
// 在 MessageRouter 中添加
|
||
private logPerformanceMetrics() {
|
||
setInterval(() => {
|
||
const memUsage = process.memoryUsage()
|
||
const uptime = process.uptime()
|
||
|
||
this.logger.info(`
|
||
📊 性能指标:
|
||
- 内存: ${Math.round(memUsage.heapUsed / 1024 / 1024)}MB / ${Math.round(memUsage.heapTotal / 1024 / 1024)}MB
|
||
- 运行时间: ${Math.round(uptime)}s
|
||
- 屏幕帧: ${this.routedFrames} (丢帧: ${this.droppedFrames})
|
||
- 丢帧率: ${((this.droppedFrames / this.routedFrames) * 100).toFixed(2)}%
|
||
- 连接数: ${this.deviceManager.getDeviceCount()}
|
||
`)
|
||
}, 60000) // 每分钟输出一次
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
## 5️⃣ 配置建议
|
||
|
||
### 生产环境启动参数
|
||
```bash
|
||
# 启用垃圾回收监控
|
||
node --expose-gc dist/index.js
|
||
|
||
# 启用性能分析
|
||
node --prof dist/index.js
|
||
|
||
# 增加内存限制
|
||
node --max-old-space-size=2048 dist/index.js
|
||
```
|
||
|
||
### 环境变量配置
|
||
```env
|
||
# .env
|
||
NODE_ENV=production
|
||
LOG_LEVEL=info
|
||
MAX_CONNECTIONS=1000
|
||
MEMORY_LIMIT=500
|
||
BATCH_SIZE=10
|
||
BATCH_TIMEOUT=50
|
||
CACHE_TTL=300
|
||
REDIS_HOST=localhost
|
||
REDIS_PORT=6379
|
||
```
|
||
|
||
---
|
||
|
||
## 📈 预期改进
|
||
|
||
| 指标 | 优化前 | 优化后 | 改进 |
|
||
|------|-------|-------|------|
|
||
| 平均延迟 | 150ms | 80ms | ↓47% |
|
||
| 吞吐量 | 50设备 | 200设备 | ↑300% |
|
||
| 内存占用 | 400MB | 250MB | ↓37% |
|
||
| 丢帧率 | 5% | 1% | ↓80% |
|
||
| CPU占用 | 60% | 35% | ↓42% |
|
||
|
||
---
|
||
|
||
## 🔍 故障排查
|
||
|
||
### 问题: 内存持续增长
|
||
**解决方案**:
|
||
1. 检查缓冲区是否正确清理
|
||
2. 启用垃圾回收: `node --expose-gc`
|
||
3. 检查数据库连接是否泄漏
|
||
|
||
### 问题: 丢帧率高
|
||
**解决方案**:
|
||
1. 检查网络带宽
|
||
2. 增加缓冲区大小
|
||
3. 启用消息批处理
|
||
|
||
### 问题: 连接频繁断开
|
||
**解决方案**:
|
||
1. 增加心跳超时时间
|
||
2. 检查防火墙配置
|
||
3. 启用连接池管理
|
||
|
||
---
|
||
|
||
## 📚 参考资源
|
||
|
||
- [Socket.IO性能优化](https://socket.io/docs/v4/performance-tuning/)
|
||
- [Node.js内存管理](https://nodejs.org/en/docs/guides/simple-profiling/)
|
||
- [Redis缓存最佳实践](https://redis.io/topics/client-side-caching)
|
||
- [Nginx负载均衡](https://nginx.org/en/docs/http/load_balancing.html)
|