# VPN Exit Controller Documentation
> Generated: 2025-11-22T21:44:56.062858Z
> Source: https://vpn-docs.rbnk.uk
> Description: Multi-Node VPN Exit Controller with dual-mode access (Tailscale exit nodes + proxy services)
This documentation provides comprehensive information about the VPN Exit Controller system,
including architecture, API reference, deployment guides, and operational procedures.
## Api > Endpoints
### VPN Exit Controller API Documentation
## Overview
The VPN Exit Controller API provides comprehensive management of multi-node VPN exit points with intelligent load balancing, monitoring, and failover capabilities. The system uses FastAPI and runs on port 8080.
**Base URL:** `http://10.10.10.20:8080` (Container IP) or `http://100.73.33.11:8080` (Tailscale IP)
**API Version:** 2.0.0
**Interactive Documentation:** `/docs` (Swagger UI) or `/redoc` (ReDoc)
## Authentication
All API endpoints require HTTP Basic Authentication.
**Default Credentials:**
- Username: `admin`
- Password: `Bl4ckMagic!2345erver`
**Environment Variables:**
- `ADMIN_USER`: Override default username
- `ADMIN_PASS`: Override default password
### Authentication Examples
```
# Using curl with basic auth (quote the credentials so the shell does not expand the "!")
curl -u 'admin:Bl4ckMagic!2345erver' http://localhost:8080/api/nodes
# Using curl with explicit header
curl -H "Authorization: Basic YWRtaW46Qmw0Y2tNYWdpYyEyMzQ1ZXJ2ZXI=" http://localhost:8080/api/nodes
```
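The same credentials can be supplied from client code. A minimal Python sketch using `requests`, reading the `ADMIN_USER`/`ADMIN_PASS` overrides described above from the environment:
```
import os
import requests

# Credentials follow the ADMIN_USER / ADMIN_PASS overrides described above.
user = os.environ.get("ADMIN_USER", "admin")
password = os.environ.get("ADMIN_PASS", "")

resp = requests.get(
    "http://localhost:8080/api/nodes",
    auth=(user, password),
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```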
## Core API Endpoints
### 1. Node Management APIs
#### List All Nodes
**GET** `/api/nodes`
Lists all VPN nodes and their current status.
**Response:**
```
[
{
"id": "vpn-us-1234567890ab",
"country": "us",
"status": "running",
"vpn_server": "us5063.nordvpn.com",
"tailscale_hostname": "vpn-us-node-1",
"started_at": "2025-01-15T10:30:00Z",
"vpn_connected": true,
"tailscale_connected": true
}
]
```
#### Start Node
**POST** `/api/nodes/{country_code}/start`
Starts a new VPN node for the specified country.
**Parameters:**
- `country_code` (path): ISO 3166-1 alpha-2 country code (e.g., "us", "uk", "de")
- `server` (query, optional): Specific VPN server to use
**Example:**
```
curl -X POST -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/nodes/us/start?server=us5063.nordvpn.com"
# Start UK node
curl -X POST -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/nodes/uk/start"
```
**Response:**
```
{
"status": "starting",
"node_id": "vpn-us-1234567890ab",
"country": "us",
"server": "us5063.nordvpn.com"
}
```
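Because the node comes up asynchronously (the response reports `status: starting`), clients typically start a node and then poll its health endpoint until it is ready. A minimal sketch using the endpoints documented in this section (timeouts and poll intervals are illustrative):
```
import time
import requests

BASE = "http://localhost:8080"
AUTH = ("admin", "Bl4ckMagic!2345erver")

# Start a US node and remember the node ID from the response.
start = requests.post(f"{BASE}/api/nodes/us/start", auth=AUTH, timeout=30)
start.raise_for_status()
node_id = start.json()["node_id"]

# Poll /api/nodes/{node_id}/health until the node reports healthy.
for _ in range(30):
    health = requests.get(f"{BASE}/api/nodes/{node_id}/health", auth=AUTH, timeout=10)
    if health.ok and health.json().get("healthy"):
        print(f"{node_id} is healthy")
        break
    time.sleep(10)
else:
    raise RuntimeError(f"{node_id} did not become healthy in time")
```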
#### Stop Node
**DELETE** `/api/nodes/{node_id}/stop`
Stops and removes a specific VPN node.
**Example:**
```
curl -X DELETE -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/nodes/vpn-us-1234567890ab/stop
```
#### Restart Node
**POST** `/api/nodes/{node_id}/restart`
Restarts a specific VPN node.
**Example:**
```
curl -X POST -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/nodes/vpn-us-1234567890ab/restart
```
#### Get Node Details
**GET** `/api/nodes/{node_id}`
Gets detailed information about a specific node.
**Response:**
```
{
"id": "vpn-us-1234567890ab",
"country": "us",
"status": "running",
"container_info": {
"created": "2025-01-15T10:30:00Z",
"ports": {"1080/tcp": [{"HostPort": "31080"}]},
"environment": ["COUNTRY=us", "VPN_SERVER=us5063.nordvpn.com"]
},
"vpn_connected": true,
"tailscale_connected": true
}
```
#### Get Node Logs
**GET** `/api/nodes/{node_id}/logs?lines=100`
Retrieves logs from a specific node.
**Parameters:**
- `lines` (query): Number of log lines to retrieve (1-1000, default: 100)
#### Check Node Health
**GET** `/api/nodes/{node_id}/health`
Checks the health status of a specific node.
**Response:**
```
{
"node_id": "vpn-us-1234567890ab",
"healthy": true,
"message": "VPN and Tailscale connections active"
}
```
#### Cleanup Stopped Containers
**POST** `/api/nodes/cleanup`
Removes all stopped VPN containers.
**Response:**
```
{
"status": "completed",
"removed_containers": 3
}
```
### 2. Load Balancing APIs
#### Get Load Balancing Statistics
**GET** `/api/load-balancer/stats`
Returns current load balancing statistics and connection counts.
**Response:**
```
{
"total_nodes": 5,
"connections_per_node": {
"vpn-us-1234567890ab": 12,
"vpn-uk-2345678901bc": 8
},
"strategy": "health_score",
"last_updated": "2025-01-15T10:30:00Z"
}
```
#### Get Best Node for Country
**GET** `/api/load-balancer/best-node/{country}?strategy=health_score`
Selects the optimal node for a country using the specified load balancing strategy.
**Parameters:**
- `strategy` (query): Load balancing strategy
- `health_score` (default): Best overall health score
- `least_connections`: Fewest active connections
- `round_robin`: Even distribution
- `weighted_latency`: Latency-based with randomization
- `random`: Random selection
**Example:**
```
curl -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/load-balancer/best-node/us?strategy=least_connections"
# Get best UK node
curl -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/load-balancer/best-node/uk?strategy=health_score"
```
**Response:**
```
{
"selected_node": {
"id": "vpn-us-1234567890ab",
"country": "us",
"proxy_url": "socks5://10.10.10.20:31080",
"health_score": 95.2,
"connections": 5
},
"strategy": "least_connections",
"country": "us"
}
```
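A common pattern is to ask the load balancer for a node and then route traffic through the returned `proxy_url`. A minimal sketch with `requests` (SOCKS support requires the `requests[socks]` extra; the example target `ipinfo.io` mirrors the proxy examples later in this document):
```
import requests

BASE = "http://localhost:8080"
AUTH = ("admin", "Bl4ckMagic!2345erver")

# Ask the load balancer for the least-loaded US node.
best = requests.get(
    f"{BASE}/api/load-balancer/best-node/us",
    params={"strategy": "least_connections"},
    auth=AUTH,
    timeout=10,
).json()

proxy_url = best["selected_node"]["proxy_url"]  # e.g. socks5://10.10.10.20:31080

# Route an outbound request through the selected exit node.
check = requests.get(
    "http://ipinfo.io/ip",
    proxies={"http": proxy_url, "https": proxy_url},
    timeout=30,
)
print(check.text.strip())
```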
#### Scale Up Country
**POST** `/api/load-balancer/scale-up/{country}`
Starts additional nodes if needed based on current load.
#### Scale Down Country
**POST** `/api/load-balancer/scale-down/{country}`
Stops excess nodes if load is low.
#### Get Available Strategies
**GET** `/api/load-balancer/strategies`
Lists all available load balancing strategies with descriptions.
### 3. Speed Testing APIs
#### Test Node Speed
**POST** `/api/speed-test/{node_id}?test_size=1MB&run_in_background=false`
Runs a speed test on a specific node.
**Parameters:**
- `test_size` (query): Test size ("1MB" or "10MB")
- `run_in_background` (query): Run asynchronously (default: false)
**Example:**
```
curl -X POST -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/speed-test/vpn-us-1234567890ab?test_size=10MB"
```
**Response:**
```
{
"node_id": "vpn-us-1234567890ab",
"success": true,
"download_mbps": 95.2,
"upload_mbps": 45.6,
"latency_ms": 23.4,
"test_size": "10MB",
"duration_seconds": 12.3,
"tested_at": "2025-01-15T10:30:00Z"
}
```
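For larger test sizes it is usually better to run the test in the background and read the stored result from `/api/speed-test/{node_id}/latest` afterwards. A minimal polling sketch (the wait times are illustrative):
```
import time
import requests

BASE = "http://localhost:8080"
AUTH = ("admin", "Bl4ckMagic!2345erver")
node_id = "vpn-us-1234567890ab"

# Kick off a 10MB test asynchronously.
requests.post(
    f"{BASE}/api/speed-test/{node_id}",
    params={"test_size": "10MB", "run_in_background": "true"},
    auth=AUTH,
    timeout=10,
).raise_for_status()

# Poll the latest stored result until it appears.
for _ in range(12):
    time.sleep(10)
    latest = requests.get(f"{BASE}/api/speed-test/{node_id}/latest", auth=AUTH, timeout=10)
    if latest.ok and latest.json().get("success"):
        result = latest.json()
        print(f"{result['download_mbps']} Mbps down, {result['latency_ms']} ms latency")
        break
```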
#### Test All Nodes
**POST** `/api/speed-test/all?test_size=1MB&run_in_background=true`
Runs speed tests on all active nodes.
#### Get Latest Speed Test
**GET** `/api/speed-test/{node_id}/latest`
Gets the most recent speed test result for a node.
#### Get Speed Test History
**GET** `/api/speed-test/{node_id}/history?hours=24`
Gets speed test history for a node.
**Parameters:**
- `hours` (query): Hours of history to retrieve (1-168, default: 24)
#### Get Speed Test Summary
**GET** `/api/speed-test/summary`
Gets a summary of all speed test results across all nodes.
#### Get Country Speed Tests
**GET** `/api/speed-test/country/{country}`
Gets latest speed test results for all nodes in a specific country.
**Response:**
```
{
"country": "us",
"node_count": 3,
"tested_nodes": 3,
"successful_tests": 2,
"avg_download_mbps": 87.5,
"avg_latency_ms": 25.2,
"results": {
"vpn-us-1234567890ab": {
"success": true,
"download_mbps": 95.2,
"latency_ms": 23.4
}
}
}
```
#### Clear Speed Test Results
**DELETE** `/api/speed-test/{node_id}/results`
Clears stored speed test results for a node.
### 4. Metrics and Monitoring APIs
#### Get Node Metrics
**GET** `/api/metrics/{node_id}?period=1h`
Gets historical metrics for a specific node.
**Parameters:**
- `period` (query): Time period ("1h", "6h", "24h", "7d")
**Response:**
```
[
{
"timestamp": "2025-01-15T10:30:00Z",
"cpu_percent": 15.2,
"memory_mb": 256.8,
"network_rx_mb": 12.4,
"network_tx_mb": 8.9,
"active_connections": 5,
"vpn_connected": true
}
]
```
#### Get Current Node Metrics
**GET** `/api/metrics/{node_id}/current`
Gets current real-time metrics for a specific node.
#### Get All Metrics
**GET** `/api/metrics/`
Gets current metrics for all active nodes.
#### Trigger Metrics Collection
**POST** `/api/metrics/collect`
Manually triggers metrics collection across all nodes.
#### Get Metrics Summary
**GET** `/api/metrics/stats/summary`
Gets aggregated statistics across all nodes.
**Response:**
```
{
"total_nodes": 5,
"healthy_nodes": 4,
"unhealthy_nodes": 1,
"total_cpu_percent": 62.8,
"avg_cpu_percent": 12.6,
"total_memory_mb": 1280.4,
"avg_memory_mb": 256.1,
"total_network_rx_mb": 45.6,
"total_network_tx_mb": 32.1,
"timestamp": "2025-01-15T10:30:00Z"
}
```
### 5. Proxy Management APIs
#### Get All Proxy URLs
**GET** `/api/proxy/urls`
Gets all available proxy URLs organized by country.
**Response:**
```
{
"us": [
{
"node_id": "vpn-us-1234567890ab",
"tailscale_ip": "100.86.140.98",
"proxy_urls": {
"http": "http://100.86.140.98:3128",
"socks5": "socks5://100.86.140.98:1080",
"health": "http://100.86.140.98:8080/health"
},
"health_score": 95.2
}
],
"uk": [
{
"node_id": "vpn-uk-2345678901bc",
"tailscale_ip": "100.125.27.111",
"proxy_urls": {
"http": "http://100.125.27.111:3128",
"socks5": "socks5://100.125.27.111:1080",
"health": "http://100.125.27.111:8080/health"
},
"health_score": 88.1
}
]
}
```
#### Get Country Proxy URLs
**GET** `/api/proxy/urls/{country}`
Gets proxy URLs for a specific country.
#### Get Optimal Proxy
**GET** `/api/proxy/optimal/{country}?strategy=health_score`
Gets the optimal proxy endpoint for a country using load balancing.
**Response:**
```
{
"node_id": "vpn-us-1234567890ab",
"country": "us",
"tailscale_ip": "100.86.140.98",
"proxy_urls": {
"http": "http://100.86.140.98:3128",
"socks5": "socks5://100.86.140.98:1080",
"health": "http://100.86.140.98:8080/health"
},
"health_score": 95.2,
"connections": 5,
"strategy": "health_score"
}
```
#### Release Proxy Connection
**POST** `/api/proxy/release/{node_id}`
Decrements the connection count for a proxy (for connection tracking).
#### Get Proxy Statistics
**GET** `/api/proxy/stats`
Gets proxy system statistics and usage information.
#### Update HAProxy Configuration
**POST** `/api/proxy/config/update`
Updates and reloads HAProxy configuration based on current nodes.
#### Generate HAProxy Configuration
**GET** `/api/proxy/config/generate`
Generates HAProxy configuration preview without applying changes.
#### Check Proxy Health
**GET** `/api/proxy/health`
Comprehensive health check of the proxy system and all nodes.
### 6. Failover Management APIs
#### Get Failover Status
**GET** `/api/failover/status`
Gets current failover status and history.
**Response:**
```
{
"enabled": true,
"total_failovers": 12,
"last_failover": "2025-01-15T09:45:00Z",
"failover_history": {
"vpn-us-1234567890ab": [
{
"timestamp": "2025-01-15T09:45:00Z",
"reason": "VPN connection lost",
"action": "restart",
"success": true
}
]
}
}
```
#### Trigger Failover
**POST** `/api/failover/{node_id}/trigger?reason=Manual+trigger`
Manually triggers failover for a specific node.
#### Check All Nodes
**POST** `/api/failover/check-all`
Checks all nodes and triggers failover for unhealthy ones.
#### Get Node Failover History
**GET** `/api/failover/history/{node_id}`
Gets failover history for a specific node.
### 7. Configuration APIs
#### Get Available Countries
**GET** `/api/config/countries`
Lists all countries with available VPN configurations.
**Response:**
```
["us", "uk", "de", "jp", "ca", "au", "nl", "ch", "sg", "fr", "it", "es", "pl"]
```
#### Get Country Servers
**GET** `/api/config/servers/{country_code}`
Gets all available VPN servers for a specific country.
**Response:**
```
[
{
"hostname": "us5063.nordvpn.com",
"country": "us",
"health_score": 95.2,
"last_tested": "2025-01-15T10:00:00Z",
"avg_latency": 23.4,
"is_blacklisted": false
}
]
```
#### Get All Servers
**GET** `/api/config/servers`
Gets all available servers grouped by country.
#### Health Check Server
**POST** `/api/config/servers/{hostname}/health-check`
Runs a health check on a specific VPN server.
#### Health Check All Servers
**POST** `/api/config/servers/health-check-all?country_code=us`
Runs health checks on all servers (optionally filtered by country).
#### Blacklist Server
**POST** `/api/config/servers/{hostname}/blacklist?duration_hours=1`
Temporarily blacklists a server from being used.
#### Get Server Statistics
**GET** `/api/config/server-stats`
Gets statistics about available servers and their health.
### 8. Event System APIs
#### Get Recent Events
**GET** `/api/events?count=50`
Gets recent system events.
**Parameters:**
- `count` (query): Number of events to retrieve (1-500, default: 50)
**Response:**
```
[
{
"timestamp": "2025-01-15T10:30:00Z",
"type": "node_started",
"node_id": "vpn-us-1234567890ab",
"country": "us",
"message": "VPN node started successfully"
},
{
"timestamp": "2025-01-15T10:25:00Z",
"type": "speed_test_completed",
"node_id": "vpn-uk-2345678901bc",
"result": "95.2 Mbps download"
}
]
```
#### Get Events by Type
**GET** `/api/events/types/{event_type}?count=20`
Gets recent events of a specific type.
**Event Types:**
- `node_started`
- `node_stopped`
- `node_failed`
- `speed_test_completed`
- `failover_triggered`
- `health_check_failed`
#### Get Container Events
**GET** `/api/events/container/{container_id}?count=20`
Gets recent events for a specific container/node.
### 9. Authentication APIs
#### Login
**POST** `/api/auth/login`
Authenticates and returns a JWT token (uses HTTP Basic Auth).
**Response:**
```
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"expires_at": "2025-01-16T10:30:00Z",
"user": "admin"
}
```
### 10. System APIs
#### Root Dashboard
**GET** `/`
Returns an HTML dashboard for managing VPN nodes through a web interface.
#### Health Check
**GET** `/health`
Basic health check endpoint.
**Response:**
```
{
"status": "healthy",
"version": "2.0.0"
}
```
## Error Handling
The API uses standard HTTP status codes:
- **200**: Success
- **400**: Bad Request (invalid parameters)
- **401**: Unauthorized (authentication required)
- **404**: Not Found (resource doesn't exist)
- **500**: Internal Server Error
**Error Response Format:**
```
{
"detail": "Error message describing what went wrong"
}
```
## Rate Limiting
No explicit rate limiting is currently implemented, but it's recommended to:
- Limit speed tests to avoid system overload
- Space out health checks appropriately
- Use background execution for long-running operations
## WebSocket Endpoints
Currently, no WebSocket endpoints are implemented. All communication is via REST API with periodic polling recommended for real-time updates.
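A simple polling loop over `/api/events` is normally sufficient for near-real-time visibility. A minimal sketch that prints events as they appear (deduplicating on timestamp, type, and node ID is an assumption, since the API excerpt does not define an event ID):
```
import time
import requests

BASE = "http://localhost:8080"
AUTH = ("admin", "Bl4ckMagic!2345erver")

seen = set()
while True:
    events = requests.get(f"{BASE}/api/events", params={"count": 50}, auth=AUTH, timeout=10).json()
    for event in reversed(events):  # oldest first
        key = (event.get("timestamp"), event.get("type"), event.get("node_id"))
        if key not in seen:
            seen.add(key)
            print(event.get("timestamp"), event.get("type"), event.get("message", ""))
    time.sleep(15)
```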
## Practical Use Cases
### Starting a VPN Node
```
# Start a US node
curl -X POST -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/nodes/us/start
# Start a UK node
curl -X POST -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/nodes/uk/start
# Check if they're running
curl -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/nodes
```
### Using Modern Proxy Architecture
```
# Get best US proxy using health score
curl -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/proxy/optimal/us?strategy=health_score"
# Get best UK proxy
curl -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/proxy/optimal/uk?strategy=health_score"
# Example response with new proxy ports:
# {
# "node_id": "vpn-us-1234567890ab",
# "country": "us",
# "tailscale_ip": "100.86.140.98",
# "proxy_urls": {
# "http": "http://100.86.140.98:3128",
# "socks5": "socks5://100.86.140.98:1080",
# "health": "http://100.86.140.98:8080/health"
# },
# "strategy": "health_score"
# }
# Use the returned proxy URLs with your application:
curl -x http://100.86.140.98:3128 http://ipinfo.io/ip
curl --socks5 100.86.140.98:1080 http://ipinfo.io/ip
# UK proxy usage example:
curl -x http://100.125.27.111:3128 http://ipinfo.io/ip
curl --socks5 100.125.27.111:1080 http://ipinfo.io/ip
```
### Monitoring System Health
```
# Check overall system metrics
curl -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/metrics/stats/summary
# Get recent events
curl -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/events
# Check proxy system health
curl -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/proxy/health
```
### Running Performance Tests
```
# Test all nodes in background
curl -X POST -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/speed-test/all?run_in_background=true"
# Check results later
curl -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/speed-test/summary
```
## Development Notes
- The API is built with FastAPI and includes automatic OpenAPI documentation
- All endpoints require HTTP Basic Authentication
- Background tasks are used for long-running operations like speed tests
- Redis is used for caching metrics and events
- Docker SDK is used for container management
- HAProxy integration provides load balancing capabilities
For interactive API exploration, visit `/docs` or `/redoc` endpoints after authentication.
---
## Api
### API Reference
Comprehensive documentation for the VPN Exit Controller REST API.
## API Overview
The VPN Exit Controller API provides programmatic access to all system functions:
- **RESTful Design**: Clean, predictable URL structure
- **JSON Format**: All requests and responses use JSON
- **HTTP Basic Auth**: Simple, secure authentication
- **Comprehensive**: Full control over nodes, metrics, and configuration
- **Well-Documented**: OpenAPI/Swagger specification available
## Quick Start
### Base URL
```
https://api.vpn.yourdomain.com
```
### Authentication
```
curl -u admin:password https://api.vpn.yourdomain.com/api/nodes
```
### Example Request
```
curl -X POST \
-u admin:password \
-H "Content-Type: application/json" \
-d '{"country": "us"}' \
https://api.vpn.yourdomain.com/api/nodes/start
```
## API Documentation
- :material-shield-account:{ .lg .middle } __Authentication__
---
Learn about API authentication methods and security
:octicons-arrow-right-24: Authentication
- :material-api:{ .lg .middle } __Endpoints__
---
Complete reference for all API endpoints
:octicons-arrow-right-24: API Endpoints
- :material-code-json:{ .lg .middle } __Examples__
---
Code examples in multiple languages
:octicons-arrow-right-24: Examples
- :material-language-python:{ .lg .middle } __SDKs__
---
Official and community SDKs
:octicons-arrow-right-24: SDKs
## API Categories
### Node Management
Control VPN nodes - start, stop, monitor
- `GET /api/nodes` - List all nodes
- `POST /api/nodes/start` - Start a node
- `DELETE /api/nodes/{id}` - Stop a node
- `GET /api/nodes/{id}/health` - Node health
### Load Balancing
Configure and query load balancing
- `GET /api/load-balancer/best-node/{country}` - Get optimal node
- `POST /api/load-balancer/strategy` - Set strategy
- `GET /api/load-balancer/status` - Current status
### Metrics & Monitoring
Access performance and health data
- `GET /api/metrics` - System metrics
- `GET /api/health` - Health check
- `POST /api/speed-test/{id}` - Run speed test
### Configuration
Manage system configuration
- `GET /api/config` - Get configuration
- `PUT /api/config` - Update settings
- `POST /api/config/reload` - Reload config
## Response Format
### Success Response
```
{
"status": "success",
"data": {
"id": "vpn-us-1",
"country": "us",
"status": "running"
},
"timestamp": "2024-01-15T10:30:00Z"
}
```
### Error Response
```
{
"status": "error",
"error": {
"code": "NODE_NOT_FOUND",
"message": "No node with ID 'vpn-xyz' exists",
"details": {}
},
"timestamp": "2024-01-15T10:30:00Z"
}
```
## Status Codes
| Code | Description | Usage |
|------|-------------|-------|
| 200 | OK | Successful GET/PUT |
| 201 | Created | Successful POST |
| 204 | No Content | Successful DELETE |
| 400 | Bad Request | Invalid parameters |
| 401 | Unauthorized | Missing/invalid auth |
| 404 | Not Found | Resource not found |
| 409 | Conflict | Resource conflict |
| 429 | Too Many Requests | Rate limited |
| 500 | Internal Error | Server error |
## Rate Limiting
API requests are rate limited:
- **Default**: 100 requests/minute
- **Burst**: 20 requests/second
- **Headers**: Rate limit info included
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1642257600
```
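Clients should honour these headers instead of retrying blindly. A minimal sketch that waits out the window on a 429 response, assuming `X-RateLimit-Reset` is a Unix timestamp as in the example above:
```
import time
import requests

def get_with_rate_limit(url, auth):
    # Retry once the X-RateLimit-Reset window opens if the quota is exhausted.
    while True:
        resp = requests.get(url, auth=auth, timeout=10)
        if resp.status_code != 429:
            return resp
        reset = int(resp.headers.get("X-RateLimit-Reset", time.time() + 60))
        time.sleep(max(1, reset - int(time.time())))
```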
## API Versioning
The API uses URL versioning:
- Current: `/api/v1/` (or `/api/`)
- Legacy: Not applicable (v1 is first version)
## OpenAPI Specification
Interactive API documentation available at:
- **Swagger UI**: `https://api.vpn.yourdomain.com/api/docs`
- **ReDoc**: `https://api.vpn.yourdomain.com/api/redoc`
- **OpenAPI JSON**: `https://api.vpn.yourdomain.com/api/openapi.json`
## Quick Links
- Authentication Guide - Set up API access
- Complete Endpoint Reference - All endpoints documented
- Code Examples - Copy-paste examples
- SDK Documentation - Language-specific libraries
---
!!! tip "Interactive Documentation"
Visit `/api/docs` on your deployment for interactive API documentation with a built-in testing interface.
---
## Architecture
### Architecture Documentation
Welcome to the VPN Exit Controller architecture documentation. This section provides detailed technical information about the system design, components, and infrastructure.
## Architecture Overview
- :material-server-network:{ .lg .middle } __System Overview__
---
High-level architecture and design principles
:octicons-arrow-right-24: System Overview
- :material-lan:{ .lg .middle } __Network Design__
---
Network topology, routing, and security layers
:octicons-arrow-right-24: Network Design
- :material-puzzle:{ .lg .middle } __Components__
---
Detailed component architecture and interactions
:octicons-arrow-right-24: Components
- :material-shield-lock:{ .lg .middle } __Security Model__
---
Security architecture and threat modeling
:octicons-arrow-right-24: Security Model
## System Architecture Diagram
```
graph TB
subgraph "Internet"
Users[Users/Clients]
Internet[Public Internet]
end
subgraph "Edge Layer"
CF[Cloudflare DNS/CDN]
PublicIP[Public IP: 135.181.60.45]
end
subgraph "Proxy Layer"
Traefik[Traefik - SSL Termination]
HAProxy[HAProxy - L4/L7 Load Balancer]
end
subgraph "Application Layer"
API[FastAPI Application]
Redis[(Redis Cache)]
LB[Load Balancer Service]
end
subgraph "VPN Layer"
Docker[Docker Engine]
VPN1[VPN-US Container]
VPN2[VPN-UK Container]
VPN3[VPN-JP Container]
end
subgraph "Network Layer"
Tailscale[Tailscale Mesh Network]
NordVPN[NordVPN Servers]
end
Users --> Internet
Internet --> CF
CF --> PublicIP
PublicIP --> Traefik
Traefik --> HAProxy
HAProxy --> API
API --> Redis
API --> LB
LB --> Docker
Docker --> VPN1
Docker --> VPN2
Docker --> VPN3
VPN1 --> Tailscale
VPN2 --> Tailscale
VPN3 --> Tailscale
Tailscale --> NordVPN
style Users fill:#f9f,stroke:#333,stroke-width:2px
style CF fill:#ff9,stroke:#333,stroke-width:2px
style Traefik fill:#9ff,stroke:#333,stroke-width:2px
style HAProxy fill:#9f9,stroke:#333,stroke-width:2px
style API fill:#f99,stroke:#333,stroke-width:2px
```
## Key Design Principles
### 1. **Microservices Architecture**
- Loosely coupled services
- Independent scaling
- Technology agnostic
- API-first design
### 2. **Container-Based Infrastructure**
- Docker for service isolation
- Immutable infrastructure
- Easy deployment and rollback
- Resource efficiency
### 3. **High Availability**
- No single point of failure
- Automatic failover
- Health monitoring
- Self-healing capabilities
### 4. **Security by Design**
- Zero-trust networking
- End-to-end encryption
- Principle of least privilege
- Regular security audits
## Technology Stack
| Layer | Technology | Purpose |
|-------|------------|---------|
| **Frontend** | React/Vue.js | Web UI (optional) |
| **API** | FastAPI | REST API server |
| **Proxy** | HAProxy | Load balancing |
| **SSL** | Traefik | SSL termination |
| **Cache** | Redis | Metrics & sessions |
| **Container** | Docker | Service isolation |
| **VPN** | NordVPN | Exit nodes |
| **Mesh** | Tailscale | Secure networking |
| **DNS** | Cloudflare | DNS & CDN |
| **OS** | Ubuntu 22.04 | Host operating system |
## Performance Characteristics
### Latency Targets
- API Response: < 100ms (p95)
- Proxy Overhead: < 10ms
- Health Check: < 5s
- Failover Time: < 30s
### Throughput
- API Requests: 10,000 req/s
- Proxy Connections: 50,000 concurrent
- VPN Bandwidth: 1 Gbps per node
- Redis Operations: 100,000 ops/s
### Scalability
- Horizontal scaling for VPN nodes
- API instances: 1-100
- VPN nodes per country: 1-10
- Total supported countries: 25+
## Infrastructure Requirements
### Minimum Deployment
```
┌─────────────────────────┐
│ Single VM/Server        │
│ - 4 CPU cores           │
│ - 8GB RAM               │
│ - 50GB SSD              │
│ - 1 Gbps network        │
└─────────────────────────┘
```
### Production Deployment
```
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│ API Servers     │  │ VPN Node Pool   │  │ Monitoring      │
│ - 3x instances  │  │ - 5x servers    │  │ - Prometheus    │
│ - Load balanced │  │ - Geographic    │  │ - Grafana       │
│ - Auto-scaling  │  │ - Auto-scaling  │  │ - AlertManager  │
└─────────────────┘  └─────────────────┘  └─────────────────┘
```
## Component Communication
```
sequenceDiagram
participant User
participant Cloudflare
participant Traefik
participant HAProxy
participant API
participant Redis
participant Docker
participant VPN Node
participant Tailscale
participant NordVPN
User->>Cloudflare: HTTPS Request
Cloudflare->>Traefik: Forward Request
Traefik->>HAProxy: Proxy Request
HAProxy->>API: Load Balanced Request
API->>Redis: Check Metrics
Redis-->>API: Return Data
API->>Docker: Start VPN Container
Docker->>VPN Node: Create Container
VPN Node->>Tailscale: Register Node
VPN Node->>NordVPN: Connect VPN
NordVPN-->>User: Proxy Traffic
```
## Data Flow Architecture
### Request Flow
1. User connects to proxy URL (e.g., proxy-us.rbnk.uk)
2. Cloudflare resolves DNS and forwards to server
3. Traefik handles SSL termination
4. HAProxy routes to appropriate backend
5. Request proxied through VPN node
6. Response returned through same path
### Metrics Flow
1. VPN nodes report health metrics
2. API collects and stores in Redis
3. Load balancer uses metrics for decisions
4. Monitoring systems query metrics API
5. Alerts triggered on thresholds
## Next Steps
- :material-book-open-variant:{ .lg .middle } __Deep Dive__
---
Explore detailed component documentation
:octicons-arrow-right-24: Components
- :material-security:{ .lg .middle } __Security__
---
Understand the security architecture
:octicons-arrow-right-24: Security Model
- :material-server:{ .lg .middle } __Deployment__
---
Deploy the architecture
:octicons-arrow-right-24: Deployment Guide
---
!!! info "Architecture Decisions"
For detailed architecture decision records (ADRs) and design rationale, see our ADR documentation.
---
## Architecture > Overview
### VPN Exit Controller - Technical Architecture
## Overview
The VPN Exit Controller is a sophisticated system that manages dynamic country-based VPN exit nodes using Tailscale mesh networking, Docker containers, and intelligent load balancing. The system provides HTTP/HTTPS proxy services through country-specific subdomains, enabling users to route traffic through different geographical locations.
## System Architecture Diagram
```
Internet → Cloudflare → Proxmox LXC     → Traefik     → HAProxy     → VPN Exit Nodes
  Users     DNS/CDN      Host System       SSL Term      Routing       Tailscale+VPN
                         (10.10.10.20)     (Port 443)    (Port 8080)   (100.x.x.x)
```
### Network Flow Detail
```
1. User Request: https://proxy-us.rbnk.uk
│
2. Cloudflare DNS Resolution: 135.181.60.45
│
3. Proxmox Host: 135.181.60.45:443
│
4. Traefik (LXC 201): SSL termination + routing
│
5. HAProxy: Country-based backend selection
│
6. VPN Exit Node: Docker container with NordVPN + Tailscale
│
7. Final destination via NordVPN servers
```
## Core Components
### 1. FastAPI Application (`/opt/vpn-exit-controller/api/`)
The central orchestration service built with FastAPI that manages the entire VPN exit node ecosystem.
**Key Features:**
- RESTful API for node management
- Web-based dashboard with real-time status
- Authentication using HTTP Basic Auth
- Background services for monitoring and metrics
**Structure:**
```
api/
├── main.py # FastAPI application entry point
├── routes/ # API route handlers
│ ├── nodes.py # Node management endpoints
│ ├── proxy.py # Proxy configuration endpoints
│ ├── load_balancer.py # Load balancing control
│ ├── metrics.py # Metrics and monitoring
│ └── failover.py # Failover management
└── services/ # Business logic services
├── docker_manager.py # Docker container orchestration
├── proxy_manager.py # HAProxy configuration management
├── load_balancer.py # Intelligent node selection
├── redis_manager.py # State and metrics storage
└── metrics_collector.py # Real-time metrics collection
```
### 2. Docker-based VPN Exit Nodes
Each VPN node runs in a dedicated Docker container combining NordVPN and Tailscale.
**Container Architecture:**
```
FROM ubuntu:22.04
# Tailscale is installed from Tailscale's own apt repository (repo setup omitted for brevity)
RUN apt-get update && apt-get install -y openvpn tailscale iptables
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```
**Node Lifecycle:**
1. Container starts with country-specific environment variables
2. OpenVPN connects to optimal NordVPN server for the country
3. Tailscale connects to mesh network as exit node
4. IP forwarding rules enable traffic routing
5. Health monitoring ensures connectivity
**Resource Limits:**
- Memory: 512MB per container
- CPU: 50% of one core
- Swap: 1GB total (memory + swap)
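Since container management goes through the Docker SDK (see Development Notes), node creation can be pictured as a single `containers.run` call that wires in the country, server, resource limits, and the `vpn.exit-node=true` label used by the maintenance commands later in this document. A minimal sketch; the image name, capabilities, and device mapping are assumptions:
```
import os
import docker

client = docker.from_env()

def start_exit_node(country: str, server: str, node_id: str):
    # Resource limits mirror the values above: 512MB RAM, half a core, 1GB memory + swap.
    return client.containers.run(
        "vpn-exit-node:latest",          # assumed image name
        name=node_id,
        detach=True,
        environment={
            "COUNTRY": country,
            "VPN_SERVER": server,
            "TAILSCALE_AUTHKEY": os.environ["TAILSCALE_AUTHKEY"],
        },
        labels={"vpn.exit-node": "true"},
        mem_limit="512m",
        memswap_limit="1g",
        nano_cpus=500_000_000,           # 50% of one core
        cap_add=["NET_ADMIN"],           # assumed: needed for OpenVPN/Tailscale routing
        devices=["/dev/net/tun:/dev/net/tun"],
        restart_policy={"Name": "unless-stopped"},
    )
```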
### 3. Traefik SSL Termination and Reverse Proxy
Traefik handles SSL certificate management and initial request routing.
**Configuration:**
- SSL certificates via Let's Encrypt + Cloudflare DNS challenge
- Automatic certificate renewal
- Security headers middleware
- Docker provider for service discovery
**Key Features:**
- Wildcard SSL certificate for `*.rbnk.uk`
- Automatic service discovery through Docker labels
- Prometheus metrics export
- Dashboard at `traefik-vpn.rbnk.uk`
### 4. HAProxy Country-based Routing System
HAProxy provides intelligent country-based request routing and load balancing.
**Routing Logic:**
```
Request: https://proxy-us.rbnk.uk/path
↓
HAProxy ACL: hdr(host) -i proxy-us.rbnk.uk
↓
Backend Selection: proxy_us
↓
Server Selection: Load balancing among US nodes
```
**Backend Configuration:**
- Round-robin load balancing per country
- Health checks every 10 seconds
- Automatic failover to backup servers
- Dynamic configuration updates
**Health Monitoring:**
```
GET /health HTTP/1.1
Host: proxy-{country}.rbnk.uk
Expected: 200 OK
```
### 5. Redis Metrics and State Storage
Redis serves as the central data store for real-time metrics, connection tracking, and system state.
**Data Structure:**
```
node:{node_id} # Node metadata and configuration
metrics:{node_id}:current # Real-time node metrics
metrics:{node_id}:history # Historical metrics (1 hour window)
connections:{node_id} # Active connection counter
server_health:{server} # VPN server health and latency
```
**Metrics Tracked:**
- CPU usage percentage
- Memory usage in MB
- Network I/O statistics
- VPN connection status
- Tailscale connectivity
- Active proxy connections
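A minimal `redis-py` sketch of how a metrics snapshot could be written and read under the key scheme above; the hash encoding and the TTL are assumptions, since the document does not specify the serialization:
```
import redis

r = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)

def store_current_metrics(node_id: str, metrics: dict):
    key = f"metrics:{node_id}:current"
    r.hset(key, mapping={k: str(v) for k, v in metrics.items()})
    r.expire(key, 300)  # assumed TTL; keeps metrics for stale nodes from lingering

def read_current_metrics(node_id: str) -> dict:
    return r.hgetall(f"metrics:{node_id}:current")

store_current_metrics("vpn-us-1234567890ab", {
    "cpu_percent": 15.2,
    "memory_mb": 256.8,
    "active_connections": 5,
    "vpn_connected": 1,
})
print(read_current_metrics("vpn-us-1234567890ab"))
```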
## VPN Node Architecture
### Container Design
Each VPN exit node is a self-contained Docker container that provides secure routing through a specific country with integrated proxy services.
```
┌─────────────────────────────────────────────────────────────┐
│ VPN Exit Node Container │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌──────────────┐ ┌─────────────────────┐ │
│ │ OpenVPN │ │ Tailscale │ │ Proxy Services │ │
│ │ (NordVPN) │ │ (Exit Node) │ │ │ │
│ │ │ │ │ │ ┌─────────────────┐ │ │
│ │ Port: tun0 │ │ Port: ts0 │ │ │ Squid HTTP/S │ │ │
│ └─────────────┘ └──────────────┘ │ │ Port: 3128 │ │ │
│ │ │ │ └─────────────────┘ │ │
│ ┌─────────────────────────────────┐ │ ┌─────────────────┐ │ │
│ │ iptables Routing │ │ │ Dante SOCKS5 │ │ │
│ │ tun0 ←→ tailscale0 │ │ │ Port: 1080 │ │ │
│ └─────────────────────────────────┘ │ └─────────────────┘ │ │
│ │ ┌─────────────────┐ │ │
│ ┌─────────────────────────────────┐ │ │ Health Check │ │ │
│ │ DNS Configuration │ │ │ Port: 8080 │ │ │
│ │ NordVPN DNS: 103.86.96.100 │ │ └─────────────────┘ │ │
│ │ NordVPN DNS: 103.86.99.100 │ │ │ │
│ │ Fallback: 8.8.8.8, 1.1.1.1 │ └─────────────────────┘ │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
### NordVPN Integration
**Server Selection:**
- Country-specific server pools
- Automatic optimal server selection based on latency
- Support for both TCP and UDP configurations
- Service credentials authentication
**Configuration Management:**
```
configs/vpn/
├── us.ovpn # Default US configuration
├── us/ # Specific US servers
│ ├── us5063.nordvpn.com.tcp.ovpn
│ └── us5064.nordvpn.com.udp.ovpn
└── auth.txt # NordVPN service credentials
```
### Tailscale Mesh Networking
**Exit Node Configuration:**
- Advertises as exit node on Tailscale network with `--advertise-exit-node`
- Uses `--accept-dns=false` to prevent DNS conflicts (fixes HTTPS errors in incognito mode)
- Ephemeral auth key configuration for automatic device cleanup
- Unique hostname: `exit-{country}-{instance}`
- Userspace networking for container compatibility
- Automatic IP assignment from 100.x.x.x range
**Network Architecture:**
```
Internet ←→ Tailscale Client ←→ Tailscale Mesh ←→ Exit Node ←→ NordVPN      ←→ Destination
                                 (100.x.x.x)        (tun0)      (VPN Server)
```
**DNS Resolution Configuration:**
To resolve HTTPS errors in incognito mode and improve reliability:
1. **Tailscale DNS Disabled**: `--accept-dns=false` prevents Tailscale from overriding DNS
2. **NordVPN DNS Primary**: Uses NordVPN's DNS servers (103.86.96.100, 103.86.99.100)
3. **Google DNS Fallback**: Falls back to 8.8.8.8 and 1.1.1.1 if NordVPN DNS fails
4. **Container DNS Override**: Manual `/etc/resolv.conf` configuration in containers
This configuration eliminates the "doesn't support secure connection" errors that occurred when using Tailscale's DNS resolution through the VPN tunnel.
### Health Monitoring and Auto-Recovery
**Health Checks:**
1. Container status monitoring
2. VPN tunnel connectivity (`ip route | grep tun0`)
3. Tailscale connection status
4. Exit node advertisement verification
**Auto-Recovery Process:**
1. Health check failure detected
2. Container restart attempted (max 3 times)
3. If restart fails, node marked unhealthy
4. Load balancer redirects traffic to healthy nodes
5. Failed node removed after timeout
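Expressed against the REST API documented earlier, the recovery loop looks roughly like the following sketch (restart limits follow the three-attempt rule above; the sleep intervals are illustrative):
```
import time
import requests

BASE = "http://localhost:8080"
AUTH = ("admin", "Bl4ckMagic!2345erver")
MAX_RESTARTS = 3

def recover_node(node_id: str) -> bool:
    # Restart up to three times, re-checking health after each attempt.
    for attempt in range(1, MAX_RESTARTS + 1):
        requests.post(f"{BASE}/api/nodes/{node_id}/restart", auth=AUTH, timeout=30)
        time.sleep(30)
        health = requests.get(f"{BASE}/api/nodes/{node_id}/health", auth=AUTH, timeout=10)
        if health.ok and health.json().get("healthy"):
            return True
    # Give up and let the failover system redirect traffic to healthy nodes.
    requests.post(
        f"{BASE}/api/failover/{node_id}/trigger",
        params={"reason": "Health check failed after restarts"},
        auth=AUTH,
        timeout=30,
    )
    return False
```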
## Proxy Routing System
### Multi-Protocol Proxy Chain
The system provides a comprehensive proxy chain supporting HTTP/HTTPS and SOCKS5 protocols:
```
Client    → HAProxy         → Tailscale Mesh → VPN Container   → Internet
Request     Routing Layer     Mesh Network     Proxy Services    Destination
                              (100.x.x.x)      (Squid/Dante)     (via NordVPN)
```
**Proxy Chain Components:**
1. **HAProxy**: L7 load balancer with ACL-based country routing
2. **Tailscale Mesh**: Secure encrypted tunnel network (100.64.0.0/10)
3. **VPN Container**: Integrated Squid (HTTP/HTTPS) and Dante (SOCKS5) proxies
4. **NordVPN**: Exit point to internet with country-specific IP addresses
### Country-based Subdomain Routing
The system uses DNS subdomains to route traffic through specific countries with multiple proxy protocols:
```
proxy-us.rbnk.uk → United States exit nodes
proxy-uk.rbnk.uk → United Kingdom exit nodes
proxy-de.rbnk.uk → Germany exit nodes
proxy-jp.rbnk.uk → Japan exit nodes
```
**Available Proxy Protocols:**
- **HTTP Proxy**: `http://<node-tailscale-ip>:3128` (Squid)
- **SOCKS5 Proxy**: `socks5://<node-tailscale-ip>:1080` (Dante)
- **Health Check**: `http://<node-tailscale-ip>:8080/health`
### HAProxy ACL-Based Routing (Updated)
HAProxy now uses ACL-based routing instead of regex for better performance and reliability:
```
# Country-specific ACLs using hostname matching
acl is_us_proxy hdr(host) -i proxy-us.rbnk.uk
acl is_uk_proxy hdr(host) -i proxy-uk.rbnk.uk
acl is_de_proxy hdr(host) -i proxy-de.rbnk.uk
# Route to appropriate backend
use_backend proxy_us if is_us_proxy
use_backend proxy_uk if is_uk_proxy
use_backend proxy_de if is_de_proxy
```
### Backend Server Selection with Health Checks
For each country backend, HAProxy selects from available healthy nodes using HTTP health checks:
```
backend proxy_us
    mode http
    balance roundrobin
    option httpchk GET /health HTTP/1.1\r\nHost:\ localhost
    http-check expect status 200
    # VPN container nodes, health-checked on the /health endpoint (port 8080)
    server us-node-1 100.86.140.98:3128 check port 8080 inter 10s
    server us-node-2 100.86.140.99:3128 check port 8080 inter 10s
    server us-backup 127.0.0.1:3128 backup
```
**Health Check Updates for HAProxy 2.8:**
- Updated health check syntax for compatibility
- HTTP health checks on port 8080 (/health endpoint)
- 10-second check intervals with automatic failover
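The `/api/proxy/config/generate` endpoint produces this kind of configuration from the current node list. A minimal generator sketch following the ACL and backend conventions above; the input shape mirrors the `/api/proxy/urls` response, and the frontend bind address is a placeholder:
```
def render_haproxy_config(nodes_by_country: dict) -> str:
    # nodes_by_country: {"us": [{"node_id": ..., "tailscale_ip": ...}, ...], ...}
    acls, routes, backends = [], [], []
    for country, nodes in nodes_by_country.items():
        acls.append(f"    acl is_{country}_proxy hdr(host) -i proxy-{country}.rbnk.uk")
        routes.append(f"    use_backend proxy_{country} if is_{country}_proxy")
        servers = "\n".join(
            f"    server {country}-node-{i} {n['tailscale_ip']}:3128 check port 8080 inter 10s"
            for i, n in enumerate(nodes, start=1)
        )
        backends.append(
            f"backend proxy_{country}\n"
            "    mode http\n"
            "    balance roundrobin\n"
            "    option httpchk GET /health\n"
            "    http-check expect status 200\n"
            f"{servers}"
        )
    frontend = (
        "frontend proxy_in\n"
        "    bind *:8080  # placeholder bind address\n"
        + "\n".join(acls + routes)
    )
    return frontend + "\n\n" + "\n\n".join(backends)

print(render_haproxy_config({
    "us": [{"node_id": "vpn-us-1234567890ab", "tailscale_ip": "100.86.140.98"}],
}))
```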
## Load Balancing System
### 5 Load Balancing Strategies
1. **Round Robin**: Sequential distribution across nodes
2. **Least Connections**: Route to node with fewest active connections
3. **Weighted Latency**: Prefer nodes with lower VPN server latency
4. **Random**: Random node selection
5. **Health Score**: Comprehensive scoring based on multiple factors
### Health Score Calculation
The health score algorithm considers multiple factors:
```
def calculate_health_score(node):
    # Metric names follow the fields reported elsewhere in this document
    # (avg_latency, connections, cpu_percent, memory_mb).
    latency = node["avg_latency"]
    connection_count = node["connections"]
    cpu_percent = node["cpu_percent"]
    memory_mb = node["memory_mb"]

    score = 100.0  # Perfect score baseline

    # Server latency (40% weight)
    latency_score = max(50, 100 - (latency - 50) * 0.5)
    score = score * 0.6 + latency_score * 0.4

    # Connection count (30% weight)
    connection_penalty = min(20, connection_count * 2)
    connection_score = max(60, 100 - connection_penalty)
    score = score * 0.7 + connection_score * 0.3

    # CPU usage (20% weight)
    cpu_score = max(60, 100 - cpu_percent)
    score = score * 0.8 + cpu_score * 0.2

    # Memory usage (10% weight)
    memory_penalty = max(0, (memory_mb - 300) / 10)
    memory_score = max(70, 100 - memory_penalty)
    score = score * 0.9 + memory_score * 0.1

    return score
```
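A worked example with plausible metric values (the numbers are illustrative; the field names follow the metrics documented elsewhere in this guide):
```
# Example: 60 ms server latency, 5 active connections, 15% CPU, 256 MB memory.
sample_node = {
    "avg_latency": 60.0,
    "connections": 5,
    "cpu_percent": 15.0,
    "memory_mb": 256.0,
}
print(round(calculate_health_score(sample_node), 1))
```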
### Automatic Scaling Logic
**Scale Up Conditions:**
- Average connections per node > 50
- Current node count < 3 for the country
- At least one healthy server available
**Scale Down Conditions:**
- Average connections per node < 10
- Current node count > 1 for the country
- Target node has 0 active connections
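Taken together, these conditions reduce to a small decision rule. A minimal sketch using the thresholds as written, with per-node connection counts taken from the load-balancer statistics:
```
def should_scale_up(connections_per_node: list[int], healthy_servers: int) -> bool:
    # Scale up: >50 average connections, fewer than 3 nodes, and a healthy server to start on.
    if not connections_per_node:
        return False
    avg = sum(connections_per_node) / len(connections_per_node)
    return avg > 50 and len(connections_per_node) < 3 and healthy_servers > 0

def should_scale_down(connections_per_node: list[int]) -> bool:
    # Scale down: <10 average connections, more than one node, and an idle node to remove.
    if len(connections_per_node) <= 1:
        return False
    avg = sum(connections_per_node) / len(connections_per_node)
    return avg < 10 and 0 in connections_per_node
```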
## Infrastructure Details
### Proxmox LXC Container Setup
**Container Configuration:**
- Container ID: 201
- OS: Ubuntu 22.04
- Internal IP: 10.10.10.20
- Public IP: 135.181.60.45
- Memory: 8GB
- Storage: 100GB
**Special Permissions Required:**
```
pct set 201 -features nesting=1,keyctl=1
# The AppArmor override is a raw LXC option, set in the container config file
echo "lxc.apparmor.profile: unconfined" >> /etc/pve/lxc/201.conf
```
### Network Configuration
**Network Stack:**
```
┌─────────────────────────────────────┐
│ Internet (135.181.60.45) │
├─────────────────────────────────────┤
│ Proxmox Host │
│ ┌─────────────────────────────┐ │
│ │ LXC Container 201 │ │
│ │ IP: 10.10.10.20 │ │
│ │ ┌───────────────────────┐ │ │
│ │ │ Docker Network │ │ │
│ │ │ traefik_proxy │ │ │
│ │ │ vpn_network │ │ │
│ │ └───────────────────────┘ │ │
│ └─────────────────────────────┘ │
└─────────────────────────────────────┘
```
**Port Mapping:**
- 80 → Traefik HTTP
- 443 → Traefik HTTPS
- 8080 → FastAPI Application
- 8081 → Traefik Dashboard
- 8404 → HAProxy Stats
### DNS Configuration with Cloudflare
**DNS Records:**
```
A rbnk.uk 135.181.60.45
A *.rbnk.uk 135.181.60.45
CNAME proxy-us.rbnk.uk rbnk.uk
CNAME proxy-uk.rbnk.uk rbnk.uk
CNAME proxy-de.rbnk.uk rbnk.uk
```
**Cloudflare Settings:**
- Proxy enabled for DDoS protection
- SSL/TLS: Full (strict)
- Always Use HTTPS: On
- HSTS enabled
## Configuration Examples
### Docker Compose for API Services
```
version: '3.8'

services:
  api:
    build: ./api
    container_name: vpn-api
    network_mode: host
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./configs:/configs
    environment:
      - TAILSCALE_AUTHKEY=${TAILSCALE_AUTHKEY}
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    container_name: vpn-redis
    network_mode: host
    volumes:
      - redis-data:/data
    restart: unless-stopped

volumes:
  redis-data:
```
### Traefik Configuration
```
# traefik.yml
entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"

certificatesResolvers:
  cf:
    acme:
      email: "admin@richardbankole.com"
      storage: /letsencrypt/acme.json
      dnsChallenge:
        provider: cloudflare
```
### Environment Variables
```
# Required environment variables
TAILSCALE_AUTHKEY=tskey-auth-xxxxx # Tailscale auth key
ADMIN_USER=admin # API admin username
ADMIN_PASS=Bl4ckMagic!2345erver # API admin password
SECRET_KEY=your-secret-key # FastAPI secret key
REDIS_URL=redis://localhost:6379 # Redis connection string
CF_DNS_API_TOKEN=cloudflare-token # Cloudflare API token
```
## Monitoring and Observability
### Metrics Collection
**System Metrics:**
- Node count per country
- Connection distribution
- CPU and memory usage
- Network throughput
- VPN connection stability
**Business Metrics:**
- Request success rate
- Response time percentiles
- Geographic usage distribution
- Load balancing effectiveness
### Health Monitoring
**Health Check Endpoints:**
- `/health` - API service health
- `/api/nodes` - Node status overview
- `/api/metrics` - System metrics
- HAProxy stats at `:8404/stats`
- Traefik dashboard at `:8081`
### Alerting and Failover
**Automatic Failover Triggers:**
- Node health check failures
- High CPU/memory usage
- VPN connection loss
- Tailscale connectivity issues
**Recovery Actions:**
- Container restart (up to 3 attempts)
- Node replacement with fresh container
- Load balancer traffic redirection
- Administrative notifications
## Security Considerations
### Network Isolation
- Each VPN node runs in isolated Docker container
- Network policies restrict inter-container communication
- VPN credentials stored securely in mounted volumes
### Authentication and Authorization
- HTTP Basic Auth for API access
- Tailscale authentication for mesh network
- NordVPN service credentials for VPN access
### SSL/TLS Configuration
- End-to-end encryption via Traefik
- Let's Encrypt certificates with automatic renewal
- Secure headers middleware
- HSTS enforcement
## Deployment and Operations
### Initial Setup
1. **Proxmox LXC Creation:**
```
pct create 201 ubuntu-22.04-standard_22.04-1_amd64.tar.xz \
--hostname vpn-controller \
--memory 8192 \
--rootfs local-lvm:100
```
2. **Container Permissions:**
```
pct set 201 -features nesting=1,keyctl=1
# The AppArmor override is a raw LXC option, set in the container config file
echo "lxc.apparmor.profile: unconfined" >> /etc/pve/lxc/201.conf
```
3. **Service Installation:**
```
cd /opt/vpn-exit-controller
./setup-project.sh
systemctl enable vpn-controller
systemctl start vpn-controller
```
### Maintenance Operations
**Health Monitoring:**
```
# Check service status
systemctl status vpn-controller
# View real-time logs
journalctl -u vpn-controller -f
# Check Docker containers
docker ps -a --filter label=vpn.exit-node=true
```
**Configuration Updates:**
```
# Update HAProxy configuration
curl -X POST -u 'admin:Bl4ckMagic!2345erver' http://localhost:8080/api/proxy/config/update
# Restart all nodes for a country
curl -X POST -u 'admin:Bl4ckMagic!2345erver' http://localhost:8080/api/nodes/us/restart-all
```
**Backup and Recovery:**
```
# Backup Redis data
docker exec vpn-redis redis-cli BGSAVE
# Backup configuration
tar -czf backup.tar.gz /opt/vpn-exit-controller/configs
```
This architecture provides a robust, scalable, and intelligent VPN exit node system that automatically manages geographic traffic routing while maintaining high availability and performance.
---
## Getting Started
### Getting Started with VPN Exit Controller
Welcome to the VPN Exit Controller documentation! This guide will help you get up and running quickly.
## What is VPN Exit Controller?
VPN Exit Controller is a professional-grade system for managing VPN exit nodes with intelligent load balancing and country-specific proxy URLs. It provides:
- 🌍 **25+ Country Support**: Access VPN exit nodes in countries worldwide
- ⚡ **Intelligent Load Balancing**: 5 different strategies for optimal performance
- 🔄 **Automatic Failover**: Self-healing with health monitoring
- 🚀 **High Performance**: HAProxy-based routing with sub-second latency
- 🔒 **Enterprise Security**: SSL/TLS, authentication, and Tailscale mesh networking
- 📊 **Real-time Metrics**: Comprehensive monitoring and speed testing
- 🐳 **Container-based**: Docker architecture for easy scaling
## Quick Links
- :material-rocket-launch:{ .lg .middle } __Quick Start__
---
Get VPN Exit Controller running in minutes
:octicons-arrow-right-24: Quick Start Guide
- :material-book-open-variant:{ .lg .middle } __User Guide__
---
Learn how to use proxy URLs and configure clients
:octicons-arrow-right-24: User Guide
- :material-api:{ .lg .middle } __API Reference__
---
Integrate with the REST API
:octicons-arrow-right-24: API Documentation
- :material-server:{ .lg .middle } __Deployment__
---
Deploy to production infrastructure
:octicons-arrow-right-24: Deployment Guide
## Prerequisites
Before you begin, ensure you have:
- Ubuntu 22.04 or later (or compatible Linux distribution)
- Docker and Docker Compose installed
- Python 3.10+ with pip
- A domain with Cloudflare DNS (for proxy URLs)
- NordVPN service credentials
- Tailscale account for mesh networking
## Architecture Overview
```
graph LR
A[Internet] --> B[Cloudflare DNS]
B --> C[Traefik SSL]
C --> D[HAProxy]
D --> E[Load Balancer]
E --> F[VPN Nodes]
F --> G[NordVPN]
style A fill:#f9f,stroke:#333,stroke-width:2px
style B fill:#ff9,stroke:#333,stroke-width:2px
style C fill:#9ff,stroke:#333,stroke-width:2px
style D fill:#9f9,stroke:#333,stroke-width:2px
```
## Choose Your Path
### 🚀 **I want to use the proxy service**
Start with the Proxy Usage Guide to learn how to configure your browser or application.
### 🛠️ **I want to deploy my own instance**
Follow the Deployment Guide for step-by-step installation instructions.
### 💻 **I want to integrate via API**
Check out the API Reference for authentication and endpoint documentation.
### 🔧 **I want to contribute**
Read our Contributing Guide to get started with development.
## Key Features
### Country-Specific Proxy URLs
Access any supported country through intuitive URLs:
- `https://proxy-us.rbnk.uk` - United States
- `https://proxy-uk.rbnk.uk` - United Kingdom
- `https://proxy-jp.rbnk.uk` - Japan
- View all 25+ supported countries →
### Intelligent Load Balancing
Choose from 5 strategies:
- **Round Robin**: Equal distribution
- **Least Connections**: Route to least busy node
- **Weighted Latency**: Favor fastest nodes
- **Random**: Randomized selection
- **Health Score**: Composite scoring across latency, connections, CPU, and memory
### Enterprise-Ready
- SSL/TLS encryption with Let's Encrypt
- HTTP Basic and API key authentication
- Comprehensive logging and monitoring
- Redis-backed metrics and session persistence
- Automatic failover and recovery
## Community & Support
- 📖 Full Documentation
- 🐛 Report Issues
- 💬 Discussions
- 📧 Contact Support
---
!!! tip "Ready to get started?"
Head over to the Quick Start Guide to deploy your first VPN exit node in minutes!
---
## Guide > Api Usage
### API Usage Guide
Learn how to interact with the VPN Exit Controller API for programmatic control of VPN nodes and proxy services.
## API Overview
The VPN Exit Controller provides a comprehensive REST API for:
- **Node Management**: Start, stop, and monitor VPN nodes
- **Load Balancing**: Configure strategies and get optimal nodes
- **Metrics & Monitoring**: Access performance data and health status
- **Speed Testing**: Run bandwidth and latency tests
- **Proxy Management**: Configure proxy routing
## Authentication
All API endpoints require authentication using HTTP Basic Auth:
```
curl -u admin:your_password https://api.vpn.yourdomain.com/api/endpoint
```
### Authentication Methods
=== "HTTP Basic Auth"
```
# Using curl
curl -u admin:password https://api.vpn.yourdomain.com/api/nodes
# Using base64 encoding
curl -H "Authorization: Basic YWRtaW46cGFzc3dvcmQ=" \
https://api.vpn.yourdomain.com/api/nodes
```
=== "Environment Variables"
```
# Set credentials
export VPN_API_USER=admin
export VPN_API_PASS=your_password
# Use in scripts
curl -u $VPN_API_USER:$VPN_API_PASS \
https://api.vpn.yourdomain.com/api/nodes
```
=== "Programming Languages"
```
# Python
import requests
from requests.auth import HTTPBasicAuth
response = requests.get(
    'https://api.vpn.yourdomain.com/api/nodes',
    auth=HTTPBasicAuth('admin', 'password')
)
```
## Common API Operations
### 1. Node Management
#### List All Nodes
```
curl -u admin:password https://api.vpn.yourdomain.com/api/nodes
```
**Response:**
```
{
"nodes": [
{
"id": "vpn-us",
"country": "us",
"city": "New York",
"status": "running",
"health": "healthy",
"connections": 15,
"uptime": 86400
}
]
}
```
#### Start a New Node
```
curl -X POST -u admin:password \
-H "Content-Type: application/json" \
-d '{"country": "uk", "city": "London"}' \
https://api.vpn.yourdomain.com/api/nodes/start
```
#### Stop a Node
```
curl -X DELETE -u admin:password \
https://api.vpn.yourdomain.com/api/nodes/vpn-uk
```
### 2. Load Balancing
#### Get Best Node for Country
```
curl -u admin:password \
https://api.vpn.yourdomain.com/api/load-balancer/best-node/us
# Get best UK node
curl -u admin:password \
https://api.vpn.yourdomain.com/api/load-balancer/best-node/uk
```
**Response:**
```
{
"node": {
"id": "vpn-us-2",
"score": 95.5,
"latency": 12,
"connections": 5
},
"strategy": "health_score"
}
```
#### Change Load Balancing Strategy
```
curl -X POST -u admin:password \
-H "Content-Type: application/json" \
-d '{"strategy": "weighted_latency"}' \
https://api.vpn.yourdomain.com/api/load-balancer/strategy
```
### 3. Metrics and Monitoring
#### Get System Metrics
```
curl -u admin:password \
https://api.vpn.yourdomain.com/api/metrics
```
#### Health Check
```
curl -u admin:password \
https://api.vpn.yourdomain.com/api/health
```
### 4. Speed Testing
#### Run Speed Test
```
curl -X POST -u admin:password \
https://api.vpn.yourdomain.com/api/speed-test/vpn-us
# Run speed test for UK node
curl -X POST -u admin:password \
https://api.vpn.yourdomain.com/api/speed-test/vpn-uk
```
**Response:**
```
{
"node_id": "vpn-us",
"download_speed": 485.6,
"upload_speed": 234.8,
"latency": 15.2,
"timestamp": "2024-01-15T10:30:00Z"
}
```
## SDK Examples
### Python SDK
```
import requests
from typing import Dict, List, Optional
class VPNController:
    def __init__(self, base_url: str, username: str, password: str):
        self.base_url = base_url.rstrip('/')
        self.auth = (username, password)

    def list_nodes(self) -> List[Dict]:
        """List all VPN nodes"""
        response = requests.get(
            f"{self.base_url}/api/nodes",
            auth=self.auth
        )
        response.raise_for_status()
        return response.json()['nodes']

    def start_node(self, country: str, city: Optional[str] = None) -> Dict:
        """Start a new VPN node"""
        data = {"country": country}
        if city:
            data["city"] = city
        response = requests.post(
            f"{self.base_url}/api/nodes/start",
            json=data,
            auth=self.auth
        )
        response.raise_for_status()
        return response.json()

    def get_best_node(self, country: str) -> Dict:
        """Get the best node for a country"""
        response = requests.get(
            f"{self.base_url}/api/load-balancer/best-node/{country}",
            auth=self.auth
        )
        response.raise_for_status()
        return response.json()
# Usage
vpn = VPNController('https://api.vpn.yourdomain.com', 'admin', 'password')
nodes = vpn.list_nodes()
best_us = vpn.get_best_node('us')
best_uk = vpn.get_best_node('uk')
```
### JavaScript/Node.js SDK
```
const axios = require('axios');
class VPNController {
  constructor(baseUrl, username, password) {
    this.client = axios.create({
      baseURL: baseUrl,
      auth: {
        username: username,
        password: password
      }
    });
  }

  async listNodes() {
    const response = await this.client.get('/api/nodes');
    return response.data.nodes;
  }

  async startNode(country, city = null) {
    const data = { country };
    if (city) data.city = city;
    const response = await this.client.post('/api/nodes/start', data);
    return response.data;
  }

  async getBestNode(country) {
    const response = await this.client.get(`/api/load-balancer/best-node/${country}`);
    return response.data;
  }
}
// Usage
const vpn = new VPNController('https://api.vpn.yourdomain.com', 'admin', 'password');
const nodes = await vpn.listNodes();
const bestUS = await vpn.getBestNode('us');
const bestUK = await vpn.getBestNode('uk');
```
## Error Handling
The API returns standard HTTP status codes:
| Status Code | Description |
|-------------|-------------|
| 200 | Success |
| 201 | Created |
| 400 | Bad Request |
| 401 | Unauthorized |
| 404 | Not Found |
| 409 | Conflict (e.g., node already exists) |
| 500 | Internal Server Error |
### Error Response Format
```
{
"error": "Node not found",
"detail": "No node with ID 'vpn-xyz' exists",
"timestamp": "2024-01-15T10:30:00Z"
}
```
### Error Handling Examples
=== "Python"
```
try:
    response = vpn.start_node('us')
except requests.exceptions.HTTPError as e:
    if e.response.status_code == 409:
        print("Node already exists")
    elif e.response.status_code == 401:
        print("Invalid credentials")
    else:
        print(f"Error: {e.response.json()['error']}")
```
=== "JavaScript"
```
try {
  const response = await vpn.startNode('us');
} catch (error) {
  if (error.response) {
    if (error.response.status === 409) {
      console.log("Node already exists");
    } else if (error.response.status === 401) {
      console.log("Invalid credentials");
    } else {
      console.log(`Error: ${error.response.data.error}`);
    }
  }
}
```
## Rate Limiting
API requests are rate limited to prevent abuse:
- **Default Limit**: 100 requests per minute per IP
- **Burst Limit**: 20 requests per second
- **Headers**: Rate limit info in response headers
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1642257600
```
## Webhooks
Configure webhooks for real-time events:
```
curl -X POST -u admin:password \
-H "Content-Type: application/json" \
-d '{
"url": "https://your-webhook.com/vpn-events",
"events": ["node.started", "node.stopped", "node.unhealthy"]
}' \
https://api.vpn.yourdomain.com/api/webhooks
```
### Webhook Events
- `node.started` - VPN node successfully started
- `node.stopped` - VPN node stopped
- `node.unhealthy` - Node health check failed
- `failover.triggered` - Automatic failover occurred
- `speed.test.completed` - Speed test finished
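On the receiving side, a webhook consumer only has to accept the POSTed event. A minimal Flask sketch; the payload field names are assumptions, since the guide does not document the webhook body:
```
from flask import Flask, request

app = Flask(__name__)

@app.route("/vpn-events", methods=["POST"])
def vpn_events():
    event = request.get_json(force=True) or {}
    # Field names are assumptions; event names match the list above, e.g. "node.started".
    print(event.get("event"), event.get("node_id"), event.get("timestamp"))
    return "", 204

if __name__ == "__main__":
    app.run(port=9000)
```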
## Best Practices
!!! tip "API Usage Tips"
1. **Cache responses** when appropriate to reduce API calls
2. **Use bulk operations** when available
3. **Implement exponential backoff** for retries
4. **Monitor rate limits** to avoid throttling
5. **Use webhooks** for real-time updates instead of polling
!!! warning "Security Best Practices"
- Never hardcode credentials in your code
- Use environment variables or secure vaults
- Rotate API credentials regularly
- Implement request signing for sensitive operations
- Use HTTPS for all API communications
## API Playground
Try the API directly from your browser using the interactive Swagger UI at `/api/docs` or the ReDoc view at `/api/redoc`.
## Next Steps
- 📚 View complete API Reference
- 🔧 Learn about Configuration Options
- 📊 Explore Metrics and Monitoring
- 🚀 Check out SDK Examples
---
!!! question "Need Help?"
Check our API Reference for detailed endpoint documentation or contact support for assistance.
---
## Guide > Configuration
### Configuration Guide
This guide covers all configuration options for VPN Exit Controller, including environment variables, service settings, and advanced tuning parameters.
## Configuration Overview
VPN Exit Controller uses a hierarchical configuration system:
1. **Environment Variables** (`.env` file)
2. **Service Configuration** (systemd, Docker)
3. **Application Settings** (API, load balancer, etc.)
4. **Runtime Configuration** (via API)
## Environment Variables
### Essential Configuration
Create a `.env` file in the project root:
```
# Copy template
cp .env.example .env
# Edit configuration
nano .env
```
### Core Settings
#### NordVPN Configuration
```
# Service credentials from NordVPN dashboard
NORDVPN_USER=your_service_username
NORDVPN_PASS=your_service_password
# Optional: Preferred protocol
NORDVPN_PROTOCOL=udp # or tcp
NORDVPN_TECHNOLOGY=openvpn_udp # or nordlynx
```
!!! info "Getting NordVPN Credentials"
1. Log in to NordVPN Dashboard
2. Navigate to Manual Configuration
3. Generate service credentials
4. Use these credentials (not your account login)
#### Tailscale Configuration
```
# Auth key for automatic node registration (use ephemeral keys for auto-cleanup)
TAILSCALE_AUTH_KEY=tskey-auth-xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxx
# Optional: Custom hostname prefix
TAILSCALE_HOSTNAME_PREFIX=vpn-exit
# Optional: Exit node advertisement
TAILSCALE_ADVERTISE_EXIT_NODE=true
TAILSCALE_ADVERTISE_ROUTES=10.0.0.0/8,192.168.0.0/16
# DNS Configuration (Important: prevents HTTPS errors in incognito mode)
TAILSCALE_ACCEPT_DNS=false # Disables Tailscale DNS override
```
#### API Configuration
```
# API Authentication
API_USERNAME=admin
API_PASSWORD=strong_secure_password_here
# API Server Settings
API_HOST=0.0.0.0
API_PORT=8080
API_WORKERS=4
API_RELOAD=false # Set to true for development
# CORS Settings
API_CORS_ORIGINS=["https://vpn-docs.rbnk.uk", "https://admin.rbnk.uk"]
```
#### Redis Configuration
```
# Redis Connection
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
REDIS_PASSWORD=redis_password_if_set
# Redis Settings
REDIS_MAX_CONNECTIONS=50
REDIS_DECODE_RESPONSES=true
REDIS_SOCKET_TIMEOUT=5
REDIS_CONNECTION_TIMEOUT=10
```
#### Proxy Server Configuration
```
# Proxy service settings
PROXY_HTTP_PORT=3128 # Squid HTTP/HTTPS proxy port
PROXY_SOCKS_PORT=1080 # Dante SOCKS5 proxy port
PROXY_HEALTH_PORT=8080 # Health check endpoint port
# DNS Configuration for VPN containers
VPN_DNS_PRIMARY=103.86.96.100 # NordVPN DNS server 1
VPN_DNS_SECONDARY=103.86.99.100 # NordVPN DNS server 2
VPN_DNS_FALLBACK_1=8.8.8.8 # Google DNS fallback
VPN_DNS_FALLBACK_2=1.1.1.1 # Cloudflare DNS fallback
# Squid proxy settings
SQUID_ACCESS_LOG=none # Disable access logging for privacy
SQUID_CACHE_ENABLED=false # Disable caching for privacy
SQUID_MAX_CONNECTIONS=1000 # Maximum concurrent connections
# SOCKS5 proxy settings
DANTE_MAX_CONNECTIONS=1000 # Maximum concurrent connections
DANTE_LOG_LEVEL=error # Logging level (error, warning, info, debug)
```
!!! info "DNS Resolution Fix"
The VPN containers are configured with specific DNS servers to resolve the "doesn't support secure connection" errors that occurred in incognito mode:
1. **Primary**: NordVPN DNS servers (103.86.96.100, 103.86.99.100)
2. **Fallback**: Google (8.8.8.8) and Cloudflare (1.1.1.1) DNS if NordVPN DNS fails
3. **Tailscale DNS Disabled**: `--accept-dns=false` prevents conflicts
### Advanced Settings
#### Load Balancing
```
# Default strategy: round_robin, least_connections, weighted_latency, random, health_score
DEFAULT_LOAD_BALANCING_STRATEGY=health_score
# Auto-scaling
AUTO_SCALING_ENABLED=true
AUTO_SCALING_MIN_NODES=1
AUTO_SCALING_MAX_NODES=5
AUTO_SCALING_TARGET_CPU=70
AUTO_SCALING_TARGET_CONNECTIONS=100
# Connection limits
MAX_CONNECTIONS_PER_NODE=50
CONNECTION_DRAIN_TIMEOUT=30
```
#### Health Monitoring
```
# Health check intervals (seconds)
HEALTH_CHECK_INTERVAL=30
HEALTH_CHECK_TIMEOUT=10
HEALTH_CHECK_RETRIES=3
HEALTH_CHECK_BACKOFF_FACTOR=2
# Failover settings
FAILOVER_ENABLED=true
FAILOVER_THRESHOLD=3 # Failed health checks before failover
FAILOVER_COOLDOWN=300 # Seconds before retry
```
#### Speed Testing
```
# Speed test configuration
SPEED_TEST_ENABLED=true
SPEED_TEST_INTERVAL=3600 # Run every hour
SPEED_TEST_TIMEOUT=60
SPEED_TEST_SERVERS=["fast.com", "speedtest.net", "google.com"]
# Test file sizes
SPEED_TEST_DOWNLOAD_SIZE=10MB
SPEED_TEST_UPLOAD_SIZE=5MB
```
#### Metrics and Logging
```
# Logging
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_FORMAT=json # json or text
LOG_FILE=/var/log/vpn-controller/app.log
LOG_ROTATION=daily
LOG_RETENTION_DAYS=30
# Metrics
METRICS_ENABLED=true
METRICS_RETENTION_HOURS=168 # 7 days
METRICS_AGGREGATION_INTERVAL=60 # seconds
```
#### Security Settings
```
# API Security
API_RATE_LIMIT_ENABLED=true
API_RATE_LIMIT_PER_MINUTE=100
API_RATE_LIMIT_BURST=20
# IP Whitelisting (comma-separated)
API_WHITELIST_IPS=10.0.0.0/8,192.168.0.0/16
API_BLACKLIST_IPS=
# Session Management
SESSION_TIMEOUT=3600 # 1 hour
SESSION_SECURE_COOKIE=true
SESSION_SAME_SITE=strict
```
### Domain and SSL Configuration
```
# Domain settings
DOMAIN=rbnk.uk
API_SUBDOMAIN=vpn-api
DOCS_SUBDOMAIN=vpn-docs
# Cloudflare
CF_API_TOKEN=your_cloudflare_api_token
CF_ZONE_ID=your_zone_id
CF_PROXY_ENABLED=true
# SSL/TLS
SSL_EMAIL=admin@yourdomain.com
SSL_STAGING=false # Set to true for Let's Encrypt staging
```
## Service Configuration
### Systemd Service
Edit `/etc/systemd/system/vpn-controller.service`:
```
[Unit]
Description=VPN Exit Controller API
After=network.target redis.service docker.service
Wants=redis.service docker.service
[Service]
Type=exec
User=root
Group=docker
WorkingDirectory=/opt/vpn-exit-controller
# Environment
Environment="PATH=/opt/vpn-exit-controller/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
EnvironmentFile=/opt/vpn-exit-controller/.env
# Process management
ExecStart=/opt/vpn-exit-controller/venv/bin/python -m uvicorn api.main:app --host 0.0.0.0 --port 8080
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
# Restart policy
Restart=always
RestartSec=10
RestartPreventExitStatus=0
# Resource limits
LimitNOFILE=65536
LimitNPROC=4096
# Security
PrivateTmp=true
NoNewPrivileges=true
[Install]
WantedBy=multi-user.target
```
### Docker Configuration
#### Docker Compose Override
Create `docker-compose.override.yml` for local settings:
```
version: '3.8'

services:
  vpn-controller:
    environment:
      - LOG_LEVEL=DEBUG
      - API_RELOAD=true
    volumes:
      - ./custom-configs:/app/custom-configs
    ports:
      - "8081:8080"  # Different port for development
```
#### Docker Resource Limits
```
services:
  vpn-controller:
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 2G
        reservations:
          cpus: '0.5'
          memory: 512M
```
## HAProxy Configuration
### Load Balancer Tuning
Edit `/opt/vpn-exit-controller/proxy/haproxy.cfg`:
```
global
    # Performance tuning
    maxconn 10000
    nbproc 4
    nbthread 8
    cpu-map auto:1/1-8 0-7
    # SSL/TLS tuning
    tune.ssl.default-dh-param 2048
    ssl-default-bind-ciphers ECDHE+AESGCM:ECDHE+AES256:ECDHE+AES128
    ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11

defaults
    # Timeouts (timeout directives belong in a defaults or proxy section, not in global)
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    timeout tunnel 1h
```
### Backend Configuration
```
backend proxy_us
    # Load balancing algorithm
    balance leastconn  # or roundrobin, source, uri
    # Health checks
    option httpchk GET /health HTTP/1.1\r\nHost:\ localhost
    http-check expect status 200
    # Connection settings
    option http-server-close
    option forwardfor
    http-reuse safe
    # Servers with advanced options
    server vpn-us-1 10.0.0.11:8888 check inter 5s rise 2 fall 3 weight 100
    server vpn-us-2 10.0.0.12:8888 check inter 5s rise 2 fall 3 weight 100 backup
```
## Traefik Configuration
### Dynamic Configuration
Create `traefik/dynamic/vpn-controller.yml`:
```
http:
  routers:
    vpn-api:
      rule: "Host(`vpn-api.rbnk.uk`)"
      service: vpn-api
      entryPoints:
        - websecure
      tls:
        certResolver: cf
      middlewares:
        - rate-limit
        - security-headers

  services:
    vpn-api:
      loadBalancer:
        servers:
          - url: "http://localhost:8080"
        healthCheck:
          path: /api/health
          interval: 30s
          timeout: 10s

  middlewares:
    rate-limit:
      rateLimit:
        average: 100
        burst: 50
        period: 1m
    security-headers:
      headers:
        customFrameOptionsValue: SAMEORIGIN
        contentTypeNosniff: true
        browserXssFilter: true
        stsSeconds: 31536000
        stsIncludeSubdomains: true
        stsPreload: true
```
## Application Configuration
### API Settings
Create `api/config.py` for application-specific settings:
```
from pydantic_settings import BaseSettings
from typing import List, Optional

class Settings(BaseSettings):
    # API Settings
    title: str = "VPN Exit Controller API"
    version: str = "1.0.0"
    description: str = "Professional VPN node management system"
    docs_url: str = "/api/docs"
    redoc_url: str = "/api/redoc"

    # Feature flags
    enable_metrics: bool = True
    enable_webhooks: bool = True
    enable_speed_tests: bool = True
    enable_auto_scaling: bool = True

    # Performance tuning
    connection_pool_size: int = 100
    request_timeout: int = 30
    background_task_workers: int = 4

    class Config:
        env_file = ".env"
        case_sensitive = False

settings = Settings()
```
### Runtime Configuration API
Configure settings via API without restart:
```
# Update load balancing strategy
curl -X PUT -u admin:password \
-H "Content-Type: application/json" \
-d '{"key": "load_balancing.strategy", "value": "health_score"}' \
https://api.vpn.yourdomain.com/api/config
# Update health check interval
curl -X PUT -u admin:password \
-H "Content-Type: application/json" \
-d '{"key": "health_check.interval", "value": 60}' \
https://api.vpn.yourdomain.com/api/config
```
## Configuration Best Practices
### Environment Management
1. **Development Environment**
```
# .env.development
LOG_LEVEL=DEBUG
API_RELOAD=true
SSL_STAGING=true
```
2. **Production Environment**
```
# .env.production
LOG_LEVEL=INFO
API_RELOAD=false
SSL_STAGING=false
```
3. **Environment Loading**
```
# Load specific environment
export ENV=production
source .env.$ENV
```
### Secret Management
!!! warning "Security Best Practice"
Never commit secrets to version control. Use secure secret management solutions.
Options for secret management:
1. **HashiCorp Vault**
```
import hvac
client = hvac.Client(url='https://vault.company.com')
nordvpn_pass = client.read('secret/vpn/nordvpn')['data']['password']
```
2. **AWS Secrets Manager**
```
import boto3
client = boto3.client('secretsmanager')
secret = client.get_secret_value(SecretId='vpn-controller-secrets')
```
3. **Environment Variable Encryption**
```
# Encrypt secrets
echo "password" | openssl enc -aes-256-cbc -base64 -out secret.enc
# Decrypt at runtime
export API_PASSWORD=$(openssl enc -aes-256-cbc -d -base64 -in secret.enc)
```
### Configuration Validation
Validate configuration on startup:
```
import os

def validate_config():
    """Validate all configuration settings"""
    errors = []

    # Check required variables
    required = ['NORDVPN_USER', 'NORDVPN_PASS', 'TAILSCALE_AUTH_KEY']
    for var in required:
        if not os.getenv(var):
            errors.append(f"Missing required: {var}")

    # Validate formats
    if os.getenv('API_PORT'):
        try:
            port = int(os.getenv('API_PORT'))
            if not 1 <= port <= 65535:
                errors.append("Invalid port range")
        except ValueError:
            errors.append("API_PORT must be integer")

    if errors:
        # ConfigurationError is the project's own exception type
        raise ConfigurationError("\n".join(errors))
```
## Monitoring Configuration
Use these commands to verify configuration:
```
# Check loaded environment
./scripts/check-config.sh
# Validate configuration
python -m api.config validate
# Test configuration changes
curl -u admin:password https://api.vpn.yourdomain.com/api/config/test
```
## Next Steps
- 🚀 Deploy to Production
- 🔒 Security Hardening
- 📊 Monitoring Setup
- 🔧 Troubleshooting
---
!!! tip "Configuration Tips"
- Always use `.env.example` as a template
- Keep production secrets in a secure vault
- Monitor configuration changes with audit logs
- Test configuration changes in staging first
- Document all custom configuration options
---
## Guide
### User Guide
Welcome to the VPN Exit Controller User Guide! This section covers everything you need to know about using the system effectively.
## What You'll Learn
- :material-web:{ .lg .middle } __Using Proxy URLs__
---
Configure browsers and applications to use country-specific proxies
:octicons-arrow-right-24: Proxy Usage Guide
- :material-scale-balance:{ .lg .middle } __Load Balancing__
---
Understand and configure intelligent load balancing strategies
:octicons-arrow-right-24: Load Balancing Guide
- :material-api:{ .lg .middle } __API Usage__
---
Integrate with the REST API for programmatic control
:octicons-arrow-right-24: API Usage Guide
- :material-cog:{ .lg .middle } __Configuration__
---
Advanced configuration options and environment variables
:octicons-arrow-right-24: Configuration Guide
## Quick Overview
### Proxy URLs
Access VPN exit nodes through simple proxy URLs:
```
https://proxy-us.rbnk.uk # United States
https://proxy-uk.rbnk.uk # United Kingdom
https://proxy-jp.rbnk.uk # Japan
```
### Load Balancing Strategies
| Strategy | Description | Best For |
|----------|-------------|----------|
| **Round Robin** | Equal distribution | Balanced load |
| **Least Connections** | Route to least busy | High traffic |
| **Weighted Latency** | Favor fastest nodes | Performance |
| **Random** | Random selection | Testing |
| **Health Score** | AI-based routing | Optimal performance |
### API Authentication
All API requests require authentication:
```
curl -u admin:password https://api.vpn.yourdomain.com/api/nodes
```
## Common Use Cases
### 1. Browser Configuration
Configure your browser to use a specific country proxy:
=== "Chrome"
```
Settings → Advanced → System → Open proxy settings
HTTP Proxy: proxy-us.rbnk.uk
Port: 443
```
=== "Firefox"
```
Settings → Network Settings → Manual proxy
HTTP Proxy: proxy-uk.rbnk.uk
Port: 443
```
### 2. Application Integration
Integrate proxy URLs in your applications:
=== "Python"
```
import requests
proxies = {
'http': 'https://proxy-jp.rbnk.uk',
'https': 'https://proxy-jp.rbnk.uk'
}
response = requests.get('https://ipinfo.io', proxies=proxies)
```
=== "Node.js"
```
const axios = require('axios');
const response = await axios.get('https://ipinfo.io', {
proxy: {
protocol: 'https',
host: 'proxy-de.rbnk.uk',
port: 443
}
});
```
### 3. Load Balancing Control
Select the optimal node for your needs:
```
# Get best node for US
curl -u admin:password \
https://api.vpn.yourdomain.com/api/load-balancer/best-node/us
# Change strategy to health score
curl -X POST -u admin:password \
-H "Content-Type: application/json" \
-d '{"strategy": "health_score"}' \
https://api.vpn.yourdomain.com/api/load-balancer/strategy
```
## Best Practices
!!! tip "Performance Tips"
- Use the health score strategy for optimal performance
- Monitor node metrics to identify performance issues
- Rotate between nodes to distribute load
- Use geographic proximity for lowest latency
!!! warning "Security Considerations"
- Always use HTTPS proxy URLs
- Rotate API credentials regularly
- Monitor access logs for unusual activity
- Implement IP whitelisting for sensitive operations
## Need Help?
- 📖 Check the detailed guides in this section
- 🔧 Review Troubleshooting Guide
- 💬 Ask a Question
- 📧 Contact Support
---
!!! success "Ready to dive deeper?"
Start with the Proxy Usage Guide to learn how to configure your applications for VPN access.
---
## Guide > Load Balancing
### Load Balancing System Documentation
## Table of Contents
1. Load Balancing Overview
2. Load Balancing Strategies
3. Health Score Algorithm
4. Speed Testing Integration
5. Failover Logic
6. Configuration and Tuning
7. Monitoring and Metrics
8. Advanced Features
9. API Reference
10. Troubleshooting
## Load Balancing Overview
### Purpose and Benefits
The VPN Exit Controller implements intelligent load balancing to distribute traffic across multiple VPN exit nodes within each country. This provides several key benefits:
- **High Availability**: Automatic failover when nodes become unhealthy
- **Performance Optimization**: Route traffic to the fastest available nodes
- **Scalability**: Automatic scaling based on connection load
- **Resource Efficiency**: Optimal utilization of compute resources
- **Geographic Distribution**: Balanced load across different VPN servers
### Integration with Failover Systems
The load balancer works closely with the failover manager to ensure service continuity:
```
# Example: Load balancer + failover integration
if not healthy_nodes:
    # Trigger failover to different VPN server
    await failover_manager.handle_node_failure(node_id, "no_healthy_nodes")
    # Recheck for healthy nodes after failover
    healthy_nodes = self._get_healthy_nodes_for_country(country)
```
## Load Balancing Strategies
The system supports five distinct load balancing strategies, each optimized for different scenarios:
### 1. Round Robin Strategy
**Purpose**: Simple, fair distribution of connections across all healthy nodes.
**Algorithm**:
```
async def _round_robin_select(self, nodes: List[Dict], country: str) -> Dict:
    """Round-robin selection"""
    if country not in self.round_robin_counters:
        self.round_robin_counters[country] = 0

    selected_index = self.round_robin_counters[country] % len(nodes)
    self.round_robin_counters[country] += 1
    return nodes[selected_index]
```
**Best For**:
- Evenly distributed workloads
- Testing scenarios
- When all nodes have similar performance characteristics
**Characteristics**:
- Maintains per-country counters
- Guarantees fair distribution
- No performance consideration
### 2. Least Connections Strategy
**Purpose**: Route new connections to the node with the fewest active connections.
**Algorithm**:
```
async def _least_connections_select(self, nodes: List[Dict], country: str) -> Dict:
    """Select node with least connections"""
    node_connections = []
    for node in nodes:
        connection_count = redis_manager.get_connection_count(node['id'])
        node_connections.append((node, connection_count))

    # Sort by connection count (ascending)
    node_connections.sort(key=lambda x: x[1])
    return node_connections[0][0]
```
**Best For**:
- Long-lived connections
- Scenarios where connection duration varies significantly
- Optimizing connection distribution
**Monitoring**:
```
# Check connection counts via API
curl -u admin:password http://localhost:8080/api/load-balancer/stats
```
### 3. Weighted Latency Strategy
**Purpose**: Route traffic based on server latency with weighted randomization.
**Algorithm**:
```
async def _weighted_latency_select(self, nodes: List[Dict], country: str) -> Dict:
    """Select based on weighted latency scores"""
    node_scores = []
    for node in nodes:
        # Get server latency from Redis
        server_health = redis_manager.get_server_health(node.get('vpn_server', ''))
        latency = server_health.get('latency', 100) if server_health else 100

        # Lower latency = higher weight
        weight = max(1, 200 - latency)  # Weight between 1-199
        node_scores.append((node, weight))

    # Weighted random selection
    total_weight = sum(score[1] for score in node_scores)
    random_point = random.uniform(0, total_weight)

    current_weight = 0
    for node, weight in node_scores:
        current_weight += weight
        if current_weight >= random_point:
            return node
```
**Weight Calculation**:
- Latency 50ms → Weight 150
- Latency 100ms → Weight 100
- Latency 150ms → Weight 50
- Latency 200ms+ → Weight 1
**Best For**:
- Latency-sensitive applications
- Real-time communications
- Gaming or streaming workloads
### 4. Random Strategy
**Purpose**: Randomly distribute connections for simple load distribution.
**Algorithm**:
```
async def _random_select(self, nodes: List[Dict], country: str) -> Dict:
    """Random selection"""
    return random.choice(nodes)
```
**Best For**:
- Simple load distribution
- Development and testing
- When other strategies are not applicable
### 5. Health Score Strategy (Default)
**Purpose**: Select nodes based on comprehensive health scores considering multiple factors.
**Algorithm**:
```
async def _health_score_select(self, nodes: List[Dict], country: str) -> Dict:
    """Select based on comprehensive health score"""
    node_scores = []
    for node in nodes:
        score = await self._calculate_node_health_score(node)
        node_scores.append((node, score))

    # Sort by score (descending - higher is better)
    node_scores.sort(key=lambda x: x[1], reverse=True)
    return node_scores[0][0]
```
## Health Score Algorithm
The health score algorithm provides a comprehensive assessment of node performance by weighing multiple factors:
### Score Calculation
```
async def _calculate_node_health_score(self, node: Dict) -> float:
    """Calculate comprehensive health score for a node"""
    score = 100.0  # Start with perfect score

    # Factor 1: Server latency (40% weight)
    server_health = redis_manager.get_server_health(node.get('vpn_server', ''))
    if server_health:
        latency = server_health.get('latency', 100)
        # Latency score: 50ms -> 100, 100ms -> 75, 200ms -> 50 (floor)
        latency_score = max(50, 100 - (latency - 50) * 0.5)
        score = score * 0.6 + latency_score * 0.4

    # Factor 2: Connection count (30% weight)
    connection_count = redis_manager.get_connection_count(node['id'])
    # Penalize high connection counts
    connection_penalty = min(20, connection_count * 2)
    connection_score = max(60, 100 - connection_penalty)
    score = score * 0.7 + connection_score * 0.3

    # Factor 3: CPU usage (20% weight)
    stats = node.get('stats', {})
    cpu_percent = stats.get('cpu_percent', 0)
    cpu_score = max(60, 100 - cpu_percent)
    score = score * 0.8 + cpu_score * 0.2

    # Factor 4: Memory usage (10% weight)
    memory_mb = stats.get('memory_mb', 0)
    # Penalize if using > 300MB
    memory_penalty = max(0, (memory_mb - 300) / 10)
    memory_score = max(70, 100 - memory_penalty)
    score = score * 0.9 + memory_score * 0.1

    return score
```
### Scoring Factors
| Factor | Weight | Description | Range |
|--------|--------|-------------|-------|
| **Server Latency** | 40% | Network latency to VPN server | 50-100 |
| **Connection Count** | 30% | Number of active connections | 60-100 |
| **CPU Usage** | 20% | Container CPU utilization | 60-100 |
| **Memory Usage** | 10% | Container memory consumption | 70-100 |
### Score Interpretation
- **90-100**: Excellent performance, optimal for routing
- **80-89**: Good performance, suitable for most traffic
- **70-79**: Acceptable performance, may experience delays
- **60-69**: Poor performance, consider failover
- **<60**: Critical issues, automatic failover triggered
### Health Score Thresholds
```
# Configuration examples
EXCELLENT_THRESHOLD = 90.0
GOOD_THRESHOLD = 80.0
ACCEPTABLE_THRESHOLD = 70.0
POOR_THRESHOLD = 60.0
CRITICAL_THRESHOLD = 50.0
```
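As a small illustration (a sketch, not code from the controller itself), these thresholds map onto the interpretation bands above as follows:
```
# Sketch: map a health score to the interpretation bands described above.
def classify_health_score(score: float) -> str:
    if score >= 90.0:
        return "excellent"   # optimal for routing
    if score >= 80.0:
        return "good"        # suitable for most traffic
    if score >= 70.0:
        return "acceptable"  # may experience delays
    if score >= 60.0:
        return "poor"        # consider failover
    return "critical"        # automatic failover triggered

print(classify_health_score(87.3))  # -> "good"
```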
## Speed Testing Integration
Speed testing provides crucial data for load balancing decisions through comprehensive performance evaluation.
### Speed Test Components
#### Download Speed Testing
```
async def _test_download_speed(self, node_id: str, test_url: str) -> Dict:
    """Test download speed by downloading a file inside the container"""
    # Create curl command to test download speed
    curl_cmd = [
        "curl", "-s", "-w",
        "%{time_total},%{speed_download},%{size_download}",
        "-o", "/dev/null",
        "--max-time", "60",  # 60 second timeout
        test_url
    ]

    container = self.docker_manager.client.containers.get(node_id)
    result = container.exec_run(curl_cmd, demux=False)

    # Parse results and convert to Mbps
    output = result.output.decode().strip()
    time_total, speed_download, size_download = output.split(',')
    mbps = (float(speed_download) * 8) / (1024 * 1024)

    return {
        'mbps': mbps,
        'time_seconds': float(time_total),
        'size_bytes': float(size_download)
    }
```
#### Latency Testing
```
async def _test_latency(self, node_id: str) -> Dict:
    """Test latency to multiple endpoints"""
    ping_endpoints = [
        "https://www.google.com",
        "https://www.cloudflare.com",
        "https://www.github.com",
        "https://httpbin.org/ip"
    ]

    # Test each endpoint and calculate the average connection time
    container = self.docker_manager.client.containers.get(node_id)
    successful_tests = []
    for endpoint in ping_endpoints:
        # Use curl inside the container to measure connection time
        result = container.exec_run(
            ["curl", "-s", "-o", "/dev/null", "-w", "%{time_connect}",
             "--max-time", "10", endpoint],
            demux=False
        )
        connect_time = result.output.decode().strip()
        latency_ms = float(connect_time) * 1000
        successful_tests.append(latency_ms)

    avg_latency = sum(successful_tests) / len(successful_tests)
    return {'avg_latency': avg_latency, 'tests': successful_tests}
```
### Speed Test Scheduling
```
# Automatic speed testing
async def schedule_speed_tests():
    """Run speed tests on all nodes every hour"""
    while True:
        try:
            results = await speed_tester.test_all_nodes("1MB")
            logger.info(f"Speed tests completed: {len(results)} nodes tested")
        except Exception as e:
            logger.error(f"Speed test cycle failed: {e}")

        await asyncio.sleep(3600)  # 1 hour interval
```
### Historical Data Usage
Speed test results are stored in Redis with time-series data:
```
def _store_speed_test_result(self, node_id: str, result: Dict):
    """Store speed test result in Redis"""
    # Store latest result (1 hour TTL)
    key = f"speedtest:{node_id}:latest"
    redis_manager.client.setex(key, 3600, json.dumps(result))

    # Store in history (keep last 24 hours)
    history_key = f"speedtest:{node_id}:history"
    timestamp = datetime.utcnow().timestamp()
    redis_manager.client.zadd(history_key, {json.dumps(result): timestamp})

    # Remove old entries (older than 24 hours)
    cutoff = (datetime.utcnow() - timedelta(hours=24)).timestamp()
    redis_manager.client.zremrangebyscore(history_key, 0, cutoff)
```
### Performance Trend Analysis
```
def analyze_performance_trends(node_id: str) -> Dict:
    """Analyze performance trends over time"""
    history = speed_tester.get_speed_test_history(node_id, hours=24)

    if len(history) < 2:
        return {"trend": "insufficient_data"}

    speeds = [h['download_mbps'] for h in history if 'download_mbps' in h]
    latencies = [h['latency_ms'] for h in history if 'latency_ms' in h]

    # Calculate trends
    speed_trend = "improving" if speeds[-1] > speeds[0] else "degrading"
    latency_trend = "improving" if latencies[-1] < latencies[0] else "degrading"

    return {
        "speed_trend": speed_trend,
        "latency_trend": latency_trend,
        "avg_speed_24h": sum(speeds) / len(speeds),
        "avg_latency_24h": sum(latencies) / len(latencies)
    }
```
## Failover Logic
The failover system ensures service continuity when nodes become unhealthy or disconnected.
### Automatic Failover Triggers
1. **VPN Connection Failure**: Node loses connection to VPN server
2. **High Resource Usage**: CPU > 90% or Memory > 1GB for 5 minutes
3. **Network Connectivity Issues**: Cannot reach test endpoints
4. **Container Health Check Failure**: Docker health checks fail
### Failover Process
```
async def handle_node_failure(self, node_id: str, failure_reason: str) -> bool:
    """Handle a failed node by attempting failover to a different server"""
    # Check if failover already in progress
    if node_id in self.failover_in_progress:
        return False

    self.failover_in_progress.add(node_id)
    try:
        # Get node details and alternative server
        node = self.docker_manager.get_node_details(node_id)
        country = node['country']
        current_server = node['server']

        # Check failover limits
        if not self._can_failover(node_id):
            return False

        # Get alternative server
        new_server = await self._get_alternative_server(country, current_server)
        if not new_server:
            return False

        # Perform failover
        success = await self._perform_failover(node_id, country, new_server)

        # Record attempt
        self._record_failover_attempt(node_id, country, current_server, new_server, success)
        return success
    finally:
        self.failover_in_progress.discard(node_id)
```
### Failover Constraints
```
class FailoverManager:
    def __init__(self):
        self.max_failover_attempts = 3  # Max attempts per hour
        self.failover_cooldown = 300    # 5 minutes between attempts
        self.failover_history = {}      # Track attempts per node
```
### Server Selection for Failover
```
async def _get_alternative_server(self, country: str, exclude_server: str) -> Optional[str]:
    """Get an alternative server for failover"""
    # Get all servers for the country
    servers = vpn_server_manager.get_servers_for_country(country)

    # Filter out current and blacklisted servers
    available_servers = [
        s for s in servers
        if s['hostname'] != exclude_server
        and not redis_manager.is_server_blacklisted(s['hostname'])
    ]

    # Sort by health score
    available_servers.sort(key=lambda s: s.get('health_score', 50), reverse=True)

    # Test top 3 servers
    for server in available_servers[:3]:
        success, latency = await vpn_server_manager.health_check_server(server['hostname'])
        if success:
            return server['hostname']

    return available_servers[0]['hostname'] if available_servers else None
```
### Recovery Procedures
1. **Immediate Recovery**: Stop failed container, start new one with different server
2. **Graceful Recovery**: Wait for existing connections to drain before switching (a sketch follows this list)
3. **Rollback Recovery**: Return to previous working server if new server fails
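The graceful path can be sketched as a drain loop that waits up to the configured `CONNECTION_DRAIN_TIMEOUT` for the node's connection count to reach zero before handing off to the failover logic shown earlier. This is an illustrative outline, not the controller's actual implementation; `redis_manager`, `logger`, and `failover_manager` are the module-level helpers used in the snippets above.
```
# Sketch: graceful recovery by draining connections before replacing a node.
import asyncio

async def drain_connections(node_id: str, drain_timeout: int = 30, poll_interval: int = 5) -> bool:
    """Wait until the node has no active connections, or the drain timeout expires."""
    waited = 0
    while waited < drain_timeout:
        if redis_manager.get_connection_count(node_id) == 0:
            return True  # fully drained, safe to stop the container
        await asyncio.sleep(poll_interval)
        waited += poll_interval
    return False  # timeout reached; proceed with a forced replacement

async def graceful_recover(node_id: str):
    if not await drain_connections(node_id):
        logger.warning(f"Drain timeout for {node_id}; replacing with live connections")
    # Hand off to the failover path shown above, which picks an alternative server.
    await failover_manager.handle_node_failure(node_id, "graceful_recovery")
```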
## Configuration and Tuning
### Load Balancing Parameters
```
# /opt/vpn-exit-controller/.env
LOAD_BALANCER_STRATEGY=health_score
LOAD_BALANCER_ENABLED=true
MAX_NODES_PER_COUNTRY=3
AUTO_SCALE_ENABLED=true
SCALE_UP_THRESHOLD=50 # connections per node
SCALE_DOWN_THRESHOLD=10 # connections per node
```
### Health Check Intervals
```
# Configuration in services/metrics_collector.py
class MetricsCollector:
    def __init__(self, interval_seconds: int = 30):  # Collect every 30 seconds
        self.interval = interval_seconds
```
### Performance Thresholds
```
# CPU and memory thresholds for scaling decisions
CPU_THRESHOLD_HIGH = 80.0 # Scale up trigger
CPU_THRESHOLD_LOW = 20.0 # Scale down trigger
MEMORY_THRESHOLD_HIGH = 500 # MB, scale up trigger
MEMORY_THRESHOLD_LOW = 300 # MB, scale down trigger
```
### Auto-scaling Configuration
```
async def start_additional_node_if_needed(self, country: str) -> bool:
    """Start additional node if load is high"""
    nodes = self._get_healthy_nodes_for_country(country)
    if not nodes:
        return False

    # Check if we need more capacity
    total_connections = sum(redis_manager.get_connection_count(n['id']) for n in nodes)
    avg_connections_per_node = total_connections / len(nodes)

    # Start new node if average > 50 connections per node and < 3 nodes
    if avg_connections_per_node > 50 and len(nodes) < 3:
        logger.info(f"High load detected for {country}, starting additional node")
        # Start new node...
        return True

    return False
```
### Tuning Recommendations
| Scenario | Strategy | Max Nodes | Thresholds |
|----------|----------|-----------|------------|
| **High Traffic** | `health_score` | 5 | Scale up: 30 conn/node |
| **Low Latency** | `weighted_latency` | 3 | Scale up: 20 conn/node |
| **Cost Optimized** | `least_connections` | 2 | Scale up: 80 conn/node |
| **Testing** | `round_robin` | 3 | Scale up: 50 conn/node |
## Monitoring and Metrics
### Key Metrics Collection
The system continuously collects metrics for load balancing decisions:
```
class MetricsCollector:
    """Background service that continuously collects metrics from all nodes"""

    async def _collect_node_metrics(self, node_id: str):
        """Collect metrics for a single node"""
        # Get detailed node info (includes Docker stats)
        node_details = self.docker_manager.get_node_details(node_id)

        # Check for anomalies
        if node_details.get('stats'):
            stats = node_details['stats']

            # Alert on high resource usage
            if stats.get('cpu_percent', 0) > 80:
                logger.warning(f"High CPU usage on node {node_id}: {stats['cpu_percent']:.1f}%")
            if stats.get('memory_mb', 0) > 500:
                logger.warning(f"High memory usage on node {node_id}: {stats['memory_mb']:.1f}MB")
```
### Load Balancing Statistics
```
async def get_load_balancing_stats(self) -> Dict:
    """Get comprehensive load balancing statistics"""
    stats = {
        'strategies': [s.value for s in LoadBalancingStrategy],
        'round_robin_counters': self.round_robin_counters,
        'countries': {}
    }

    # Get stats per country
    all_nodes = self.docker_manager.list_nodes()
    countries = set(n['country'] for n in all_nodes)

    for country in countries:
        nodes = self._get_healthy_nodes_for_country(country)
        total_connections = sum(redis_manager.get_connection_count(n['id']) for n in nodes)

        stats['countries'][country] = {
            'node_count': len(nodes),
            'total_connections': total_connections,
            'avg_connections_per_node': total_connections / len(nodes) if nodes else 0,
            'nodes': [
                {
                    'id': n['id'],
                    'server': n.get('vpn_server', 'unknown'),
                    'connections': redis_manager.get_connection_count(n['id']),
                    'tailscale_ip': n.get('tailscale_ip'),
                    'cpu_percent': n.get('stats', {}).get('cpu_percent', 0),
                    'health_score': await self._calculate_node_health_score(n)
                }
                for n in nodes
            ]
        }

    return stats
```
### Performance Monitoring
```
# Monitor load balancing in real-time
curl -u admin:password http://localhost:8080/api/load-balancer/stats | jq
# Get speed test summary
curl -u admin:password http://localhost:8080/api/speed-test/summary | jq
# Monitor metrics
curl -u admin:password http://localhost:8080/api/metrics/current | jq
```
### Alert Conditions
| Condition | Threshold | Action |
|-----------|-----------|---------|
| High CPU Usage | >80% for 5 min | Scale up or failover |
| High Memory | >500MB | Scale up or failover |
| High Connection Count | >100 per node | Scale up |
| Low Speed | <10 Mbps | Investigate/failover |
| High Latency | >200ms | Switch strategy or failover |
| Node Down | Health check fails | Immediate failover |
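A hedged sketch of how the table above could be evaluated against a node's collected stats (the field names follow the metrics shown earlier; the duration qualifiers, e.g. "for 5 min", are omitted for brevity):
```
# Sketch: evaluate the alert conditions from the table above for one node's stats.
from typing import Dict, List

def evaluate_alerts(stats: Dict) -> List[str]:
    alerts = []
    if stats.get("cpu_percent", 0) > 80:
        alerts.append("high_cpu: scale up or failover")
    if stats.get("memory_mb", 0) > 500:
        alerts.append("high_memory: scale up or failover")
    if stats.get("connections", 0) > 100:
        alerts.append("high_connection_count: scale up")
    if stats.get("download_mbps", 100) < 10:
        alerts.append("low_speed: investigate/failover")
    if stats.get("latency_ms", 0) > 200:
        alerts.append("high_latency: switch strategy or failover")
    if not stats.get("healthy", True):
        alerts.append("node_down: immediate failover")
    return alerts

print(evaluate_alerts({"cpu_percent": 92.0, "memory_mb": 310, "connections": 40,
                       "download_mbps": 55.0, "latency_ms": 120, "healthy": True}))
# -> ['high_cpu: scale up or failover']
```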
### Reporting and Analysis
```
# Generate load balancing report
async def generate_load_balancing_report(hours: int = 24) -> Dict:
    """Generate comprehensive load balancing report"""
    report = {
        'period_hours': hours,
        'generated_at': datetime.utcnow().isoformat(),
        'summary': {},
        'by_country': {},
        'performance_trends': {},
        'recommendations': []
    }

    # Analyze each country
    countries = get_all_countries()
    for country in countries:
        nodes = get_nodes_for_country(country)

        # Calculate statistics
        total_connections = sum(get_connection_count(n['id']) for n in nodes)
        avg_speed = calculate_avg_speed(nodes, hours)
        avg_latency = calculate_avg_latency(nodes, hours)

        report['by_country'][country] = {
            'node_count': len(nodes),
            'total_connections': total_connections,
            'avg_speed_mbps': avg_speed,
            'avg_latency_ms': avg_latency,
            'failover_events': count_failover_events(country, hours)
        }

        # Generate recommendations
        if avg_speed < 20:
            report['recommendations'].append(f"Consider adding more nodes to {country} - low speed detected")
        if nodes and total_connections / len(nodes) > 50:
            report['recommendations'].append(f"Scale up {country} - high load detected")

    return report
```
## Advanced Features
### Connection Affinity/Sticky Sessions
```
class ConnectionAffinity:
    """Manage connection affinity for consistent routing"""

    def __init__(self):
        self.client_node_map = {}     # client_ip -> node_id
        self.affinity_timeout = 3600  # 1 hour

    async def get_affinity_node(self, client_ip: str, country: str) -> Optional[str]:
        """Get node with existing affinity for client"""
        affinity_key = f"affinity:{client_ip}:{country}"
        node_id = redis_manager.client.get(affinity_key)

        if node_id:
            # Check if node is still healthy
            healthy, _ = docker_manager.check_container_health(node_id)
            if healthy:
                # Refresh affinity timeout
                redis_manager.client.expire(affinity_key, self.affinity_timeout)
                return node_id
            else:
                # Remove stale affinity
                redis_manager.client.delete(affinity_key)

        return None

    async def set_affinity(self, client_ip: str, country: str, node_id: str):
        """Set client affinity to specific node"""
        affinity_key = f"affinity:{client_ip}:{country}"
        redis_manager.client.setex(affinity_key, self.affinity_timeout, node_id)
```
### Geographic Routing Preferences
```
class GeographicRouter:
    """Route based on geographic preferences"""

    REGION_PREFERENCES = {
        'americas': ['us', 'ca', 'br'],
        'europe': ['de', 'uk', 'fr', 'nl'],
        'asia': ['jp', 'sg', 'hk', 'au'],
        'africa': ['za'],
        'oceania': ['au', 'nz']
    }

    async def get_preferred_country(self, client_region: str, requested_country: str) -> str:
        """Get preferred country based on client region"""
        # Return requested country if available and healthy
        if self.is_country_healthy(requested_country):
            return requested_country

        # Find alternative in same region
        preferred_countries = self.REGION_PREFERENCES.get(client_region, [])
        for country in preferred_countries:
            if self.is_country_healthy(country):
                logger.info(f"Routing {client_region} client to {country} instead of {requested_country}")
                return country

        # Fallback to any healthy country
        return self.get_any_healthy_country()
```
### Custom Load Balancing Rules
```
class CustomLoadBalancingRules:
    """Implement custom load balancing rules"""

    def __init__(self):
        self.rules = []

    def add_rule(self, rule: Dict):
        """Add custom routing rule"""
        self.rules.append({
            'id': str(uuid4()),
            'name': rule['name'],
            'condition': rule['condition'],
            'action': rule['action'],
            'priority': rule.get('priority', 100),
            'enabled': rule.get('enabled', True)
        })

    async def evaluate_rules(self, context: Dict) -> Optional[str]:
        """Evaluate rules and return target node"""
        # Sort by priority
        active_rules = sorted(
            [r for r in self.rules if r['enabled']],
            key=lambda x: x['priority']
        )

        for rule in active_rules:
            if self._matches_condition(rule['condition'], context):
                return await self._execute_action(rule['action'], context)

        return None

    def _matches_condition(self, condition: Dict, context: Dict) -> bool:
        """Check if context matches rule condition"""
        # Example conditions:
        # {"source_device": "iPhone", "domain": "*.streaming.com"}
        # {"time_range": "09:00-17:00", "country": "us"}
        # {"client_ip_range": "192.168.1.0/24"}
        for key, value in condition.items():
            if key == 'source_device':
                if context.get('user_agent', '').find(value) == -1:
                    return False
            elif key == 'domain':
                if not fnmatch.fnmatch(context.get('domain', ''), value):
                    return False
            elif key == 'time_range':
                current_time = datetime.now().strftime('%H:%M')
                start, end = value.split('-')
                if not (start <= current_time <= end):
                    return False

        return True
```
### API-based Load Balancing Control
```
# Extended API endpoints for advanced control
@router.post("/rules")
async def create_load_balancing_rule(rule: CustomRule, user=Depends(verify_auth)):
"""Create custom load balancing rule"""
custom_rules.add_rule(rule.dict())
return {"status": "rule_created", "rule": rule}
@router.put("/strategy/{country}")
async def set_country_strategy(
country: str,
strategy: LoadBalancingStrategy,
user=Depends(verify_auth)
):
"""Set load balancing strategy for specific country"""
load_balancer.set_country_strategy(country, strategy)
return {"country": country, "strategy": strategy.value}
@router.post("/rebalance/{country}")
async def force_rebalance(country: str, user=Depends(verify_auth)):
"""Force rebalancing of connections in a country"""
result = await load_balancer.rebalance_country(country)
return {"country": country, "rebalanced_connections": result}
@router.get("/prediction/{country}")
async def get_load_prediction(country: str, hours: int = 1, user=Depends(verify_auth)):
"""Get load prediction for next N hours"""
prediction = await load_balancer.predict_load(country, hours)
return prediction
```
## API Reference
### Load Balancer Endpoints
#### Get Load Balancing Statistics
```
GET /api/load-balancer/stats
Authorization: Basic
```
**Response:**
```
{
"strategies": ["round_robin", "least_connections", "weighted_latency", "random", "health_score"],
"round_robin_counters": {"us": 5, "uk": 2},
"countries": {
"us": {
"node_count": 2,
"total_connections": 45,
"avg_connections_per_node": 22.5,
"nodes": [
{
"id": "container_123",
"server": "us5063.nordvpn.com",
"connections": 25,
"tailscale_ip": "100.73.33.15",
"cpu_percent": 45.2,
"health_score": 87.3
}
]
}
}
}
```
#### Get Best Node for Country
```
GET /api/load-balancer/best-node/{country}?strategy=health_score
Authorization: Basic
```
**Response:**
```
{
"selected_node": {
"id": "container_123",
"country": "us",
"server": "us5063.nordvpn.com",
"tailscale_ip": "100.73.33.15",
"health_score": 87.3
},
"strategy": "health_score",
"country": "us"
}
```
#### Scale Up Country
```
POST /api/load-balancer/scale-up/{country}
Authorization: Basic
```
#### Scale Down Country
```
POST /api/load-balancer/scale-down/{country}
Authorization: Basic
```
#### Get Available Strategies
```
GET /api/load-balancer/strategies
Authorization: Basic
```
**Response:**
```
{
"strategies": [
{
"name": "round_robin",
"description": "Distributes requests evenly across all healthy nodes"
},
{
"name": "least_connections",
"description": "Routes to the node with fewest active connections"
},
{
"name": "weighted_latency",
"description": "Routes based on server latency with weighted randomization"
},
{
"name": "random",
"description": "Randomly selects from available healthy nodes"
},
{
"name": "health_score",
"description": "Routes to node with best overall health score (CPU, memory, latency, connections)"
}
]
}
```
## Troubleshooting
### Common Issues
#### 1. No Healthy Nodes Available
**Symptoms:**
- API returns 404 "No healthy nodes available"
- Load balancer cannot route traffic
**Diagnosis:**
```
# Check node health
curl -u admin:password http://localhost:8080/api/nodes/list | jq '.[] | select(.status == "running")'
# Check container health
docker ps --filter "label=vpn-exit-node"
# Check VPN connections
docker exec <container_id> curl -s ipinfo.io
```
**Solutions:**
1. Restart unhealthy containers: `docker restart <container_id>`
2. Check VPN credentials in `/opt/vpn-exit-controller/configs/auth.txt`
3. Verify network connectivity: `docker exec <container_id> ping 8.8.8.8`
4. Force failover: `curl -X POST http://localhost:8080/api/failover/force/<node_id>`
#### 2. Load Imbalance
**Symptoms:**
- One node has significantly more connections than others
- Performance degradation on overloaded nodes
**Diagnosis:**
```
# Check connection distribution
curl -u admin:password http://localhost:8080/api/load-balancer/stats | jq '.countries'
# Check strategy
curl -u admin:password http://localhost:8080/api/config | jq '.load_balancer'
```
**Solutions:**
1. Switch to `least_connections` strategy
2. Force rebalancing: `curl -X POST http://localhost:8080/api/load-balancer/rebalance/<country>`
3. Increase connection drain timeout
4. Add more nodes: `curl -X POST http://localhost:8080/api/load-balancer/scale-up/<country>`
#### 3. Frequent Failovers
**Symptoms:**
- High number of failover events in logs
- Unstable node assignments
**Diagnosis:**
```
# Check failover history
curl -u admin:password http://localhost:8080/api/failover/status | jq
# Check server health
curl -u admin:password http://localhost:8080/api/speed-test/summary | jq
```
**Solutions:**
1. Increase failover cooldown period
2. Check VPN server stability
3. Review health check thresholds
4. Blacklist problematic servers
#### 4. Poor Performance
**Symptoms:**
- Slow connection speeds
- High latency
**Diagnosis:**
```
# Run speed tests
curl -X POST -u admin:password http://localhost:8080/api/speed-test/run-all
# Check health scores
curl -u admin:password http://localhost:8080/api/load-balancer/stats | jq '.countries[].nodes[].health_score'
```
**Solutions:**
1. Switch to `weighted_latency` strategy
2. Add more nodes in region
3. Use different VPN servers
4. Check network congestion
### Debug Commands
```
# Enable debug logging
export LOG_LEVEL=DEBUG
# Check Redis data
redis-cli
> KEYS speedtest:*
> KEYS affinity:*
> KEYS server_health:*
# Monitor load balancer decisions
journalctl -u vpn-controller -f | grep "load_balancer"
# Test specific node
curl -X POST -u admin:password http://localhost:8080/api/speed-test/node/<node_id>
# Force strategy change
curl -X PUT -u admin:password http://localhost:8080/api/load-balancer/strategy/<country> \
-H "Content-Type: application/json" \
-d '{"strategy": "health_score"}'
```
### Performance Optimization Tips
1. **Strategy Selection:**
- Use `health_score` for general purpose
- Use `weighted_latency` for latency-sensitive apps
- Use `least_connections` for long-lived connections
2. **Resource Tuning:**
- Monitor CPU/memory usage patterns
- Adjust scaling thresholds based on traffic
- Set appropriate connection limits
3. **Network Optimization:**
- Choose VPN servers close to users
- Monitor and blacklist slow servers
- Use multiple servers per country
4. **Monitoring:**
- Set up alerts for health score < 70
- Monitor failover frequency
- Track connection distribution
This comprehensive load balancing system ensures optimal performance, reliability, and scalability for the VPN Exit Controller infrastructure.
---
## Guide > Proxy Usage
### VPN Exit Controller - Usage Guide
This guide explains the **dual-mode access** provided by the VPN Exit Controller: **Tailscale Exit Nodes** for network-level routing and **Proxy Services** for application-level routing.
## Overview
The VPN Exit Controller provides two complementary approaches for routing traffic through VPN containers in different countries:
1. **🌐 Tailscale Exit Nodes**: Full network-level routing where entire devices/networks route through VPN containers
2. **🔗 Proxy Services**: Application-level routing where individual applications use HTTP/HTTPS/SOCKS5 proxies
Both approaches use the same underlying VPN containers but provide different levels of integration and control.
### Architecture Summary
```
┌─ Tailscale Exit Nodes ──────────────────────────┐ ┌─ Proxy Access ──────────────────────────────────┐
│ │ │ │
│ Device/Network → Tailscale → VPN Container │ │ Application → Tailscale → VPN Container │
│ (Exit Node) (NordVPN) │ │ (Proxy) (Squid/Dante) │
│ │ │ │
└──────────────────────────────────────────────────┘ └──────────────────────────────────────────────────┘
↓
Internet (Country IP)
```
**VPN Container Services:**
- **Tailscale Exit Node**: Full network routing via Tailscale mesh (`--advertise-exit-node`)
- **Squid HTTP/HTTPS Proxy**: Port 3128 for web traffic (accessible via Tailscale IP)
- **Dante SOCKS5 Proxy**: Port 1080 for application tunneling (accessible via Tailscale IP)
- **Health Check Endpoint**: Port 8080 for container monitoring
- **DNS Resolution**: Uses NordVPN DNS (103.86.96.100, 103.86.99.100) with fallback
## Choosing Your Approach
### 🌐 When to Use Tailscale Exit Nodes
**Best for:**
- Routing all traffic from a device through a specific country
- Mobile devices (iPhone, Android) using Tailscale app
- Docker containers or VMs that need VPN access
- Development environments requiring consistent geo-location
- Any scenario where you want "set it and forget it" VPN routing
**Example: Route your entire laptop through Germany**
```
# List available exit nodes
tailscale status --peers | grep exit-de
# Enable Germany exit node
tailscale up --exit-node=exit-de-server456
# All traffic now appears from Germany
curl https://ipinfo.io/ip # Returns German IP
```
### 🔗 When to Use Proxy Services
**Best for:**
- Specific applications that need different geo-locations
- Web scraping with rotating country IPs
- Testing geo-restricted content from multiple countries
- Development/testing without affecting system-wide traffic
- Applications that already support proxy configuration
**Example: Test from multiple countries simultaneously**
```
# Test US endpoint via direct Tailscale proxy
curl -x http://100.86.140.98:3128 https://api.example.com/us
# Test German endpoint via different container
curl -x http://100.72.45.23:3128 https://api.example.com/de
# Test UK endpoint
curl -x http://100.125.27.111:3128 https://api.example.com/uk
```
### Getting Current VPN Container Information
To discover available VPN containers and their Tailscale IPs:
```
# Get all active nodes with their Tailscale IPs
curl -u admin:Bl4ckMagic!2345erver http://100.73.33.11:8080/api/nodes
# Get optimal node for a specific country
curl -u admin:Bl4ckMagic!2345erver http://100.73.33.11:8080/api/load-balancer/best-node/us
# Get optimal UK node
curl -u admin:Bl4ckMagic!2345erver http://100.73.33.11:8080/api/load-balancer/best-node/uk
# List all available Tailscale exit nodes
tailscale status --peers | grep "exit-"
```
## 1. Proxy URL Format
### Base Domain Structure
All proxy endpoints use the following domain pattern:
```
proxy-{country}.rbnk.uk
```
### Available Countries and Codes
| Country | Code | Proxy URL | Description |
|---------|------|-----------|-------------|
| United States | `us` | `proxy-us.rbnk.uk` | US-based exit nodes |
| Germany | `de` | `proxy-de.rbnk.uk` | German exit nodes |
| Japan | `jp` | `proxy-jp.rbnk.uk` | Japanese exit nodes |
| United Kingdom | `uk` | `proxy-uk.rbnk.uk` | UK-based exit nodes |
| Canada | `ca` | `proxy-ca.rbnk.uk` | Canadian exit nodes |
| Australia | `au` | `proxy-au.rbnk.uk` | Australian exit nodes |
| Netherlands | `nl` | `proxy-nl.rbnk.uk` | Dutch exit nodes |
| France | `fr` | `proxy-fr.rbnk.uk` | French exit nodes |
| Italy | `it` | `proxy-it.rbnk.uk` | Italian exit nodes |
| Spain | `es` | `proxy-es.rbnk.uk` | Spanish exit nodes |
| Switzerland | `ch` | `proxy-ch.rbnk.uk` | Swiss exit nodes |
| Austria | `at` | `proxy-at.rbnk.uk` | Austrian exit nodes |
| Belgium | `be` | `proxy-be.rbnk.uk` | Belgian exit nodes |
| Czech Republic | `cz` | `proxy-cz.rbnk.uk` | Czech exit nodes |
| Denmark | `dk` | `proxy-dk.rbnk.uk` | Danish exit nodes |
| Hong Kong | `hk` | `proxy-hk.rbnk.uk` | Hong Kong exit nodes |
| Hungary | `hu` | `proxy-hu.rbnk.uk` | Hungarian exit nodes |
| Ireland | `ie` | `proxy-ie.rbnk.uk` | Irish exit nodes |
| Norway | `no` | `proxy-no.rbnk.uk` | Norwegian exit nodes |
| Poland | `pl` | `proxy-pl.rbnk.uk` | Polish exit nodes |
| Romania | `ro` | `proxy-ro.rbnk.uk` | Romanian exit nodes |
| Serbia | `rs` | `proxy-rs.rbnk.uk` | Serbian exit nodes |
| Singapore | `sg` | `proxy-sg.rbnk.uk` | Singapore exit nodes |
| Sweden | `se` | `proxy-se.rbnk.uk` | Swedish exit nodes |
| Bulgaria | `bg` | `proxy-bg.rbnk.uk` | Bulgarian exit nodes |
### SSL/HTTPS Support
All proxy endpoints support SSL/TLS encryption with valid certificates from Let's Encrypt via Cloudflare DNS challenge.
## 2. Proxy Protocols
The VPN Exit Controller now provides multiple proxy protocols running inside each VPN container:
### HTTP/HTTPS Proxy (Port 3128) - Squid
- **URL Format**: `http://<tailscale-ip>:3128`
- **Protocol**: HTTP/1.1 with HTTPS CONNECT support
- **Service**: Squid proxy server
- **Use Case**: Web browsing, API calls, general HTTP/HTTPS traffic
- **Features**:
- Header modification and anonymization
- Caching disabled for privacy
- Access control for Tailscale network (100.64.0.0/10)
- SSL port filtering and security checks
- **Example**: `curl -x http://100.86.140.98:3128 http://ipinfo.io/ip`
### SOCKS5 Proxy (Port 1080) - Dante
- **URL Format**: `socks5://<tailscale-ip>:1080`
- **Protocol**: SOCKS5
- **Service**: Dante SOCKS server
- **Use Case**: Application-level proxying, TCP traffic tunneling
- **Features**:
- Protocol-agnostic (works with any TCP application)
- No HTTP header inspection
- Full TCP tunnel support
- **Example**: `curl --socks5 100.86.140.98:1080 http://ipinfo.io/ip`
### Health Check Endpoint (Port 8080)
- **URL Format**: `http://<tailscale-ip>:8080/health`
- **Protocol**: HTTP/1.0
- **Use Case**: Container health monitoring, load balancing decisions
- **Response**: Simple "OK" response for health checks
- **Features**:
- Lightweight HTTP server
- Used by HAProxy for backend health checks
- Always returns 200 OK if container is running
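For example, a quick probe of a container's health endpoint (the Tailscale IP below is the example address used elsewhere in this guide; substitute one returned by `/api/nodes`):
```
# Sketch: probe a VPN container's health endpoint (port 8080) over its Tailscale IP.
import requests

TAILSCALE_IP = "100.86.140.98"  # example address; use an IP returned by /api/nodes

resp = requests.get(f"http://{TAILSCALE_IP}:8080/health", timeout=5)
print(resp.status_code, resp.text.strip())  # expected: 200 and a simple OK body
```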
### Legacy Country-Specific URLs (Deprecated)
The original country-specific proxy URLs (`proxy-{country}.rbnk.uk:8080`) are being phased out in favor of direct Tailscale IP access for better performance and reliability.
## 3. Client Configuration
### Browser Proxy Settings
#### Chrome/Chromium
```
# HTTP proxy through Tailscale IP
google-chrome --proxy-server="http://100.86.140.98:3128"
# SOCKS5 proxy through Tailscale IP
google-chrome --proxy-server="socks5://100.86.140.98:1080"
# UK proxy examples
google-chrome --proxy-server="http://100.125.27.111:3128"
google-chrome --proxy-server="http://proxy-uk.rbnk.uk:8132"
# Legacy country-specific (still supported)
google-chrome --proxy-server="http://proxy-us.rbnk.uk:8080"
```
#### Firefox
1. Go to Settings → Network Settings
2. Select "Manual proxy configuration"
3. **Modern Setup (Recommended):**
- HTTP Proxy: `100.86.140.98` Port: `3128`
- HTTPS Proxy: `100.86.140.98` Port: `3128`
- SOCKS5 Proxy: `100.86.140.98` Port: `1080`
- **UK**: HTTP/HTTPS: `100.125.27.111` Port: `3128`, SOCKS5: `100.125.27.111` Port: `1080`
4. **Legacy Setup:**
- HTTP Proxy: `proxy-us.rbnk.uk` Port: `8080`
- HTTPS Proxy: `proxy-us.rbnk.uk` Port: `8443`
- **UK**: HTTP: `proxy-uk.rbnk.uk` Port: `8132`, SOCKS5: `proxy-uk.rbnk.uk` Port: `1084`
### Command Line Examples
#### cURL
```
# HTTP proxy (modern - direct Tailscale IP)
curl -x http://100.86.140.98:3128 http://ipinfo.io/ip
# HTTPS proxy (modern - same port for HTTP proxy with CONNECT)
curl -x http://100.86.140.98:3128 https://ipinfo.io/ip
# SOCKS5 proxy (modern - direct Tailscale IP)
curl --socks5 100.86.140.98:1080 http://ipinfo.io/ip
# Legacy country-specific URLs (still supported)
curl -x http://proxy-us.rbnk.uk:8080 http://ipinfo.io/ip
curl --socks5 proxy-us.rbnk.uk:1080 http://ipinfo.io/ip
# Test with different countries
curl -x http://proxy-uk.rbnk.uk:8132 http://ipinfo.io/ip
curl -x http://proxy-de.rbnk.uk:8080 http://ipinfo.io/ip
curl --socks5 proxy-uk.rbnk.uk:1084 http://ipinfo.io/ip
```
#### wget
```
# HTTP proxy
wget -e use_proxy=yes -e http_proxy=proxy-us.rbnk.uk:8080 http://ifconfig.me
# HTTPS proxy
wget -e use_proxy=yes -e https_proxy=proxy-us.rbnk.uk:8443 https://ifconfig.me
```
### Programming Language Examples
#### Python (requests)
```
import requests
# HTTP proxy
proxies = {
'http': 'http://proxy-us.rbnk.uk:8080',
'https': 'https://proxy-us.rbnk.uk:8443'
}
response = requests.get('http://ifconfig.me', proxies=proxies)
print(f"Your IP: {response.text}")
# SOCKS5 proxy (requires PySocks)
proxies = {
'http': 'socks5://proxy-us.rbnk.uk:1080',
'https': 'socks5://proxy-us.rbnk.uk:1080'
}
response = requests.get('http://ifconfig.me', proxies=proxies)
print(f"Your IP: {response.text}")
# With authentication
proxies = {
'http': 'http://username:password@proxy-us.rbnk.uk:8080',
'https': 'https://username:password@proxy-us.rbnk.uk:8443'
}
```
#### Node.js
```
const axios = require('axios');
const HttpsProxyAgent = require('https-proxy-agent');
const SocksProxyAgent = require('socks-proxy-agent');
// HTTP proxy
const httpAgent = new HttpsProxyAgent('http://proxy-us.rbnk.uk:8080');
const response = await axios.get('http://ifconfig.me', { httpAgent });
console.log(`Your IP: ${response.data}`);
// SOCKS5 proxy
const socksAgent = new SocksProxyAgent('socks5://proxy-us.rbnk.uk:1080');
const response2 = await axios.get('http://ifconfig.me', { httpAgent: socksAgent });
console.log(`Your IP: ${response2.data}`);
```
#### Go
```
package main
import (
"fmt"
"io/ioutil"
"net/http"
"net/url"
)
func main() {
proxyURL, _ := url.Parse("http://proxy-us.rbnk.uk:8080")
client := &http.Client{
Transport: &http.Transport{
Proxy: http.ProxyURL(proxyURL),
},
}
resp, err := client.Get("http://ifconfig.me")
if err != nil {
panic(err)
}
body, _ := ioutil.ReadAll(resp.Body)
fmt.Printf("Your IP: %s\n", string(body))
}
```
### System-wide Proxy Configuration
#### Linux/macOS Environment Variables
```
export http_proxy=http://proxy-us.rbnk.uk:8080
export https_proxy=https://proxy-us.rbnk.uk:8443
export HTTP_PROXY=http://proxy-us.rbnk.uk:8080
export HTTPS_PROXY=https://proxy-us.rbnk.uk:8443
# SOCKS5
export all_proxy=socks5://proxy-us.rbnk.uk:1080
export ALL_PROXY=socks5://proxy-us.rbnk.uk:1080
```
#### Windows
```
set http_proxy=http://proxy-us.rbnk.uk:8080
set https_proxy=https://proxy-us.rbnk.uk:8443
```
## 4. Authentication
### HTTP Basic Authentication
The system supports HTTP Basic Authentication for API access. Credentials are managed through the VPN Exit Controller API.
#### API Authentication Format
```
curl -u username:password -H "Content-Type: application/json" \
http://10.10.10.20:8080/api/proxy/urls
```
### Credential Management
- Credentials are stored in `/opt/vpn-exit-controller/configs/auth.txt`
- API endpoints require authentication via the `verify_auth` dependency
- Web UI uses credentials: `admin:Bl4ckMagic!2345erver`
### Proxy Authentication (if implemented)
```
# Python example with proxy authentication
proxies = {
'http': 'http://username:password@proxy-us.rbnk.uk:8080',
'https': 'https://username:password@proxy-us.rbnk.uk:8443'
}
```
## 5. Load Balancing and Failover
### Automatic Load Balancing
The system implements intelligent load balancing with multiple strategies:
#### Available Strategies
- **Health Score** (default): Combines latency, connection count, and server health
- **Least Connections**: Routes to the server with fewest active connections
- **Round Robin**: Distributes requests evenly across servers
- **Weighted Latency**: Prioritizes servers with lower latency
- **Random**: Randomly selects from healthy servers
#### API Usage
```
# Get optimal proxy for a country
curl -u admin:Bl4ckMagic!2345erver \
"http://10.10.10.20:8080/api/proxy/optimal/us?strategy=health_score"
# Response example
{
"node_id": "vpn-us-node-1",
"country": "us",
"tailscale_ip": "100.73.33.15",
"server": "us5063.nordvpn.com",
"proxy_urls": {
"http": "http://proxy-us.rbnk.uk:8080",
"https": "https://proxy-us.rbnk.uk:8443",
"socks5": "socks5://proxy-us.rbnk.uk:1080"
},
"selected_strategy": "health_score"
}
```
### Failover Behavior
- **Health Monitoring**: Continuous health checks every 10 seconds
- **Automatic Failover**: Unhealthy nodes automatically removed from rotation
- **Backup Servers**: Default backup servers activated when all primary nodes fail
- **Connection Draining**: Graceful handling of existing connections during failover
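On the client side you can lean on this behaviour by re-resolving the optimal node whenever a request through the current proxy fails. Here is a sketch using the `/api/proxy/optimal/{country}` endpoint shown above (the retry policy itself is illustrative):
```
# Sketch: re-resolve the optimal proxy for a country when a request through it fails.
import requests

API = "http://10.10.10.20:8080"
AUTH = ("admin", "Bl4ckMagic!2345erver")

def get_via_country(url: str, country: str, attempts: int = 3) -> requests.Response:
    for _ in range(attempts):
        # Ask the controller which node is currently optimal for this country.
        node = requests.get(f"{API}/api/proxy/optimal/{country}", auth=AUTH, timeout=10).json()
        proxy = f"http://{node['tailscale_ip']}:3128"
        try:
            return requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=15)
        except requests.RequestException:
            continue  # the node may have just failed over; ask the controller again
    raise RuntimeError(f"All proxy attempts failed for {country}")

print(get_via_country("https://ipinfo.io/json", "us").text)
```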
### Performance Optimization Tips
1. **Connection Pooling**: Reuse connections when possible
2. **Country Selection**: Choose geographically closer countries for better latency
3. **Protocol Selection**: Use SOCKS5 for maximum compatibility, HTTP for web traffic
4. **Load Balancing**: Let the system handle load balancing rather than sticky sessions
## 6. Use Cases
### Geo-location Testing
```
# Test website from different countries
curl -x http://proxy-us.rbnk.uk:8080 "https://ipinfo.io/json"
curl -x http://proxy-uk.rbnk.uk:8080 "https://ipinfo.io/json"
curl -x http://proxy-de.rbnk.uk:8080 "https://ipinfo.io/json"
```
### Content Access by Region
```
import requests

countries = ['us', 'uk', 'de', 'jp']
ports = {'us': 8080, 'uk': 8132, 'de': 8080, 'jp': 8080}

for country in countries:
    port = ports[country]
    proxy = f'http://proxy-{country}.rbnk.uk:{port}'
    response = requests.get('https://example.com', proxies={'http': proxy, 'https': proxy})
    print(f"{country.upper()}: {response.status_code}")
```
### Web Scraping with Different IP Addresses
```
import requests
import random

countries = ['us', 'uk', 'de', 'ca', 'au']
ports = {'us': 8080, 'uk': 8132, 'de': 8080, 'ca': 8080, 'au': 8080}

def get_random_proxy():
    country = random.choice(countries)
    port = ports[country]
    return {
        'http': f'http://proxy-{country}.rbnk.uk:{port}',
        'https': f'http://proxy-{country}.rbnk.uk:{port}'
    }

# Rotate proxies for each request
for i in range(10):
    proxies = get_random_proxy()
    response = requests.get('https://httpbin.org/ip', proxies=proxies)
    print(f"Request {i+1}: {response.json()['origin']}")
```
### Privacy and Anonymity
```
# Check your real IP
curl http://ifconfig.me
# Check IP through US proxy
curl -x http://proxy-us.rbnk.uk:8080 http://ifconfig.me
# Check IP through different countries
for country in us uk de jp; do
echo -n "$country: "
case $country in
uk) port=8132 ;;
*) port=8080 ;;
esac
curl -s -x http://proxy-$country.rbnk.uk:$port http://ifconfig.me
done
```
## 7. Performance Considerations
### Speed Test Results Interpretation
The system includes built-in speed testing capabilities:
```
# Get speed test results
curl -u admin:Bl4ckMagic!2345erver \
http://10.10.10.20:8080/api/speed-test/results
# Run speed test for specific country
curl -u admin:Bl4ckMagic!2345erver -X POST \
http://10.10.10.20:8080/api/speed-test/run/us
```
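The same workflow can be scripted. The snippet below is a minimal sketch using the endpoints shown above: it triggers a test for one country and then fetches the stored results. The exact shape of the results payload depends on the controller version, so it is simply printed as-is.
```
import requests

API = "http://10.10.10.20:8080"
AUTH = ("admin", "Bl4ckMagic!2345erver")

# Kick off a speed test for the US nodes
requests.post(f"{API}/api/speed-test/run/us", auth=AUTH, timeout=10).raise_for_status()

# Fetch the latest stored results (payload schema depends on controller version)
results = requests.get(f"{API}/api/speed-test/results", auth=AUTH, timeout=10).json()
print(results)
```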
#### Performance Metrics
- **Latency**: Round-trip time to proxy server
- **Bandwidth**: Upload/download speeds through proxy
- **Connection Success Rate**: Percentage of successful connections
- **Health Score**: Combined metric for overall proxy performance
### Optimal Country Selection
```
import time
import requests

auth = ('admin', 'Bl4ckMagic!2345erver')

# Get proxy statistics from the controller
response = requests.get('http://10.10.10.20:8080/api/proxy/stats', auth=auth)
stats = response.json()

# Find the country with the best performance (here: lowest measured latency;
# substitute whatever criterion matches your requirements)
best_country = None
best_latency = float('inf')
for country, urls in stats['available_proxy_urls'].items():
    # 'urls' is assumed to expose an 'http' entry, as in /api/proxy/urls
    proxy = {'http': urls['http'], 'https': urls['http']}
    start = time.time()
    try:
        requests.get('https://httpbin.org/ip', proxies=proxy, timeout=10)
    except requests.RequestException:
        continue  # skip countries whose proxy is currently unreachable
    latency = time.time() - start
    if latency < best_latency:
        best_country, best_latency = country, latency

print(f"Best country: {best_country} ({best_latency:.2f}s)")
```
### Connection Pooling Recommendations
```
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
# Configure session with connection pooling
session = requests.Session()
# Retry strategy
retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[429, 500, 502, 503, 504],
)
adapter = HTTPAdapter(
pool_connections=10,
pool_maxsize=20,
max_retries=retry_strategy
)
session.mount("http://", adapter)
session.mount("https://", adapter)
# Use session with proxy
proxies = {'http': 'http://proxy-us.rbnk.uk:8080'}
response = session.get('https://example.com', proxies=proxies)
```
## 8. Troubleshooting
### Common Connection Issues
#### 1. Proxy Connection Refused
```
# Check if proxy is running
curl -I http://proxy-us.rbnk.uk:8080
# Check specific node health
curl -u admin:Bl4ckMagic!2345erver \
http://10.10.10.20:8080/api/proxy/health
```
#### 2. DNS Resolution Issues
```
# Test DNS resolution
nslookup proxy-us.rbnk.uk
dig proxy-us.rbnk.uk
# Use alternative DNS
curl --dns-servers 8.8.8.8 -x http://proxy-us.rbnk.uk:8080 http://ifconfig.me
```
#### 3. Authentication Errors
```
# Test API authentication
curl -u admin:Bl4ckMagic!2345erver \
http://10.10.10.20:8080/api/status
# Check authentication headers
curl -v -u admin:Bl4ckMagic!2345erver \
http://10.10.10.20:8080/api/proxy/urls
```
### Debugging Proxy Problems
#### Enable Verbose Logging
```
# Python requests debugging
import logging
import requests
logging.basicConfig(level=logging.DEBUG)
requests_log = logging.getLogger("requests.packages.urllib3")
requests_log.setLevel(logging.DEBUG)
requests_log.propagate = True
```
#### Test Proxy Connectivity
```
# Test basic connectivity
nc -zv proxy-us.rbnk.uk 8080
# Test SOCKS5 connectivity
nc -zv proxy-us.rbnk.uk 1080
# Test with timeout
timeout 10 curl -x http://proxy-us.rbnk.uk:8080 http://ifconfig.me
```
#### Check HAProxy Statistics
```
# Access HAProxy stats (if enabled)
curl http://10.10.10.20:8404/stats
# Get detailed proxy statistics
curl -u admin:Bl4ckMagic!2345erver \
http://10.10.10.20:8080/api/proxy/stats
```
### Performance Troubleshooting
#### 1. Slow Proxy Response
```
# Test latency to different countries
for country in us uk de jp; do
echo -n "$country: "
case $country in
uk) port=8132 ;;
*) port=8080 ;;
esac
time curl -s -x http://proxy-$country.rbnk.uk:$port http://ifconfig.me >/dev/null
done
```
#### 2. High Connection Failures
```
# Check node health across all countries
curl -u admin:Bl4ckMagic!2345erver \
http://10.10.10.20:8080/api/nodes/status
# Monitor connection metrics
curl -u admin:Bl4ckMagic!2345erver \
http://10.10.10.20:8080/api/metrics/connections
```
#### 3. Load Balancing Issues
```
# Force different load balancing strategies
curl -u admin:Bl4ckMagic!2345erver \
"http://10.10.10.20:8080/api/proxy/optimal/us?strategy=least_connections"
curl -u admin:Bl4ckMagic!2345erver \
"http://10.10.10.20:8080/api/proxy/optimal/us?strategy=round_robin"
```
## API Reference
### Get All Proxy URLs
```
GET /api/proxy/urls
Authorization: Basic YWRtaW46Qmw0Y2tNYWdpYyEyMzQ1ZXJ2ZXI=
Response:
{
"us": {
"http": "http://proxy-us.rbnk.uk:8080",
"https": "https://proxy-us.rbnk.uk:8443",
"socks5": "socks5://proxy-us.rbnk.uk:1080"
},
"uk": { ... }
}
```
### Get Country-Specific URLs
```
GET /api/proxy/urls/{country}
Authorization: Basic YWRtaW46Qmw0Y2tNYWdpYyEyMzQ1ZXJ2ZXI=
Response:
{
"country": "us",
"proxy_urls": {
"http": "http://proxy-us.rbnk.uk:8080",
"https": "https://proxy-us.rbnk.uk:8443",
"socks5": "socks5://proxy-us.rbnk.uk:1080"
}
}
```
### Get Optimal Proxy
```
GET /api/proxy/optimal/{country}?strategy=health_score
Authorization: Basic YWRtaW46Qmw0Y2tNYWdpYyEyMzQ1ZXJ2ZXI=
Response:
{
"node_id": "vpn-us-node-1",
"country": "us",
"tailscale_ip": "100.73.33.15",
"server": "us5063.nordvpn.com",
"proxy_urls": { ... },
"selected_strategy": "health_score"
}
```
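For repeated scripting against these endpoints, a thin client wrapper keeps the base URL and credentials in one place. The sketch below assumes only the paths and response fields documented above.
```
import requests

class ProxyAPI:
    """Minimal wrapper around the proxy URL endpoints documented above."""

    def __init__(self, base="http://10.10.10.20:8080",
                 auth=("admin", "Bl4ckMagic!2345erver")):
        self.base, self.auth = base, auth

    def _get(self, path, **params):
        r = requests.get(f"{self.base}{path}", params=params, auth=self.auth, timeout=10)
        r.raise_for_status()
        return r.json()

    def all_urls(self):
        return self._get("/api/proxy/urls")

    def country_urls(self, country):
        return self._get(f"/api/proxy/urls/{country}")

    def optimal(self, country, strategy="health_score"):
        return self._get(f"/api/proxy/optimal/{country}", strategy=strategy)

api = ProxyAPI()
print(api.optimal("us")["proxy_urls"]["socks5"])
```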
## Support and Monitoring
### Health Monitoring
- **Endpoint**: `http://10.10.10.20:8080/api/proxy/health`
- **HAProxy Stats**: `http://10.10.10.20:8404/stats`
- **System Status**: `http://10.10.10.20:8080/api/status`
### Logs and Diagnostics
- **Application Logs**: `journalctl -u vpn-controller -f`
- **HAProxy Logs**: `/opt/vpn-exit-controller/proxy/logs/`
- **Traefik Logs**: `/opt/vpn-exit-controller/traefik/logs/`
For additional support or advanced configuration, refer to the main system documentation or contact the system administrator.
---
## Includes > Abbreviations
*[API]: Application Programming Interface
*[CDN]: Content Delivery Network
*[CLI]: Command Line Interface
*[CPU]: Central Processing Unit
*[DNS]: Domain Name System
*[GB]: Gigabyte
*[HTTP]: Hypertext Transfer Protocol
*[HTTPS]: Hypertext Transfer Protocol Secure
*[IP]: Internet Protocol
*[JSON]: JavaScript Object Notation
*[JWT]: JSON Web Token
*[LB]: Load Balancer
*[LLM]: Large Language Model
*[LXC]: Linux Containers
*[MB]: Megabyte
*[RAM]: Random Access Memory
*[REST]: Representational State Transfer
*[SDK]: Software Development Kit
*[SSD]: Solid State Drive
*[SSL]: Secure Sockets Layer
*[TCP]: Transmission Control Protocol
*[TLS]: Transport Layer Security
*[UDP]: User Datagram Protocol
*[UI]: User Interface
*[URL]: Uniform Resource Locator
*[VM]: Virtual Machine
*[VPN]: Virtual Private Network
*[YAML]: YAML Ain't Markup Language
---
### VPN Exit Controller
A sophisticated VPN management system that provides **dual-mode access** to country-specific VPN routing: **Tailscale Exit Nodes** for network-level routing and **Proxy URLs** for application-level routing. Features intelligent load balancing, automatic failover, and performance monitoring with a modern Next.js dashboard.
## 🚀 Overview
The VPN Exit Controller manages Docker-based VPN containers that function as both **Tailscale exit nodes** and **proxy servers** across multiple countries. This dual approach provides maximum flexibility:
- **🌐 Tailscale Exit Nodes**: Route entire networks or devices through VPN containers via Tailscale's mesh network
- **🔗 Proxy Endpoints**: Route individual applications through HTTP/HTTPS/SOCKS5 proxies for specific use cases
- **🤝 Complementary Approaches**: Use both simultaneously for different needs - network routing for general use, proxies for development/testing
## ✨ Key Features
### 🌐 Dual-Mode VPN Access
- **Tailscale Exit Nodes**: Full network-level routing through VPN containers in the Tailscale mesh
- **HTTP/HTTPS/SOCKS5 Proxies**: Application-level routing with direct Tailscale IP access
- **Legacy Proxy URLs**: Country-specific endpoints like `proxy-us.rbnk.uk`, `proxy-de.rbnk.uk`
### 🎛️ Management & Monitoring
- **Modern Web Dashboard**: Professional Next.js interface at `https://vpn.rbnk.uk` with real-time monitoring
- **⚖️ Intelligent Load Balancing**: 5 strategies including health-score based routing
- **🔄 Automatic Failover**: Seamless switching when nodes become unavailable
- **📊 Performance Monitoring**: Real-time speed testing and latency monitoring
### 🔧 Infrastructure
- **🔒 SSL Security**: Automatic certificate management with Let's Encrypt
- **🐳 Container-Based**: Docker containers with NordVPN + Tailscale mesh networking
- **📈 Auto-Scaling**: Automatic node scaling based on connection load
- **🛡️ Health Monitoring**: Comprehensive health checks and recovery procedures
- **🎨 Responsive Design**: Dashboard works on desktop, tablet, and mobile devices
## 🏗️ Architecture Overview
### Dual-Mode Access Architecture
```
┌─ Tailscale Exit Nodes ─────────────────────────────────┐ ┌─ Proxy URLs ─────────────────────────────────────┐
│ │ │ │
│ Device/Network → Tailscale Mesh → VPN Container │ │ Application → Cloudflare → Traefik → HAProxy │
│ (100.x.x.x) (NordVPN Exit) │ │ (rbnk.uk) (SSL) (Routing) │
│ │ │ ↓ │
└─────────────────────────────────────────────────────────┘ │ VPN Container │
│ (Squid/Dante Proxies) │
└───────────────────────────────────────────────────┘
```
### Core Components
- **Next.js Dashboard**: Modern web interface for VPN node management and monitoring
- **FastAPI Application**: RESTful API for managing VPN nodes and load balancing
- **Docker VPN Containers**: Multi-service containers providing:
- **Tailscale Exit Node**: Full network routing via Tailscale mesh
- **Squid HTTP/HTTPS Proxy**: Web traffic routing on port 3128
- **Dante SOCKS5 Proxy**: Application-level tunneling on port 1080
- **NordVPN Connection**: Secure VPN tunnel to country-specific servers
- **HAProxy**: Country-based proxy routing for legacy proxy URLs
- **Traefik**: SSL termination and reverse proxy with automatic certificates
- **Tailscale Mesh**: Secure networking for both exit nodes and direct proxy access
- **Redis**: Metrics storage and session state management
## 🚀 Quick Start
### Prerequisites
- Proxmox VE with LXC container support
- Ubuntu 22.04 LTS
- Docker and Docker Compose
- Node.js 18+ and npm (for dashboard)
- NordVPN service credentials
- Cloudflare domain and API token
### Basic Setup
1. **Clone the repository**:
```
git clone https://your-repo/vpn-exit-controller.git
cd vpn-exit-controller
```
2. **Set up Python environment**:
```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
3. **Configure environment variables**:
```
cp .env.example .env
# Edit .env with your NordVPN credentials, Tailscale auth key, etc.
```
4. **Start the services**:
```
# Start infrastructure
cd traefik && docker-compose -f docker-compose.traefik.yml up -d
cd ../proxy && docker-compose up -d
# Start the API
systemctl start vpn-controller
# Start the dashboard
cd dashboard && docker-compose up -d
```
### Web Dashboard
Access the modern web dashboard at:
- **Production**: `https://vpn.rbnk.uk`
- **Local Development**: `http://localhost:3000`
The dashboard provides:
- **Real-time Monitoring**: Live updates every 3 seconds
- **Country Selection**: Visual grid with flags
- **One-click Controls**: Start, stop, restart nodes
- **Performance Metrics**: CPU, memory, network stats
- **Professional UI**: Dark mode with responsive design
### API Usage
```
# Dashboard endpoints (public, no auth required)
curl https://vpn.rbnk.uk/api/stats
curl https://vpn.rbnk.uk/api/countries
curl https://vpn.rbnk.uk/api/nodes
# Management endpoints (require authentication)
curl -u admin:Bl4ckMagic!2345erver https://vpn.rbnk.uk/api/status
# Start a VPN node
curl -X POST -u admin:Bl4ckMagic!2345erver \
https://vpn.rbnk.uk/api/nodes/us/start \
-H "Content-Type: application/json" \
-d '{"server": "us9999.nordvpn.com"}'
# Get best node for a country
curl -u admin:Bl4ckMagic!2345erver \
https://vpn.rbnk.uk/api/load-balancer/best-node/us
```
## 🌍 Available Countries
The system supports VPN containers in 25+ countries, accessible via **both Tailscale exit nodes and proxy endpoints**:
| Country | Code | Tailscale Exit Node | Direct Proxy Access | Legacy Proxy URLs | Flag |
|---------|------|---------------------|---------------------|-------------------|------|
| United States | `us` | Route via Tailscale | `100.x.x.x:3128/1080` | `proxy-us.rbnk.uk` | 🇺🇸 |
| Germany | `de` | Route via Tailscale | `100.x.x.x:3128/1080` | `proxy-de.rbnk.uk` | 🇩🇪 |
| Japan | `jp` | Route via Tailscale | `100.x.x.x:3128/1080` | `proxy-jp.rbnk.uk` | 🇯🇵 |
| United Kingdom | `uk` | Route via Tailscale | `100.125.27.111:3128/1080` | `proxy-uk.rbnk.uk` | 🇬🇧 |
| And 20+ more... | | | | | |
*Note: Tailscale IPs are dynamic and can be discovered via the API*
## 🔌 Usage Approaches
Choose the approach that best fits your use case:
### 🌐 Approach 1: Tailscale Exit Nodes (Recommended)
**Best for: Full network routing, device-level VPN, multiple applications**
```
# Enable Tailscale exit node routing (macOS/Linux)
tailscale up --exit-node=exit-us-server123
# All traffic from your device now routes through US VPN container
curl https://ipinfo.io/ip # Shows US IP
```
**Benefits:**
- Routes **all** network traffic through VPN
- Works with **any application** (no proxy configuration needed)
- Perfect for mobile devices, entire computers, or Docker containers
- Automatic DNS resolution through VPN
- Zero application configuration required
### 🔗 Approach 2: Direct Proxy Access via Tailscale
**Best for: Development, testing, specific applications**
```
# Get current Tailscale IPs for active nodes
curl -u admin:password https://vpn.rbnk.uk/api/nodes
# Use direct Tailscale IP for HTTP proxy (discovered from API)
curl -x http://100.86.140.98:3128 https://httpbin.org/ip
# Use SOCKS5 proxy
curl --socks5 100.86.140.98:1080 https://httpbin.org/ip
# UK example
curl -x http://100.125.27.111:3128 https://httpbin.org/ip
```
**Benefits:**
- Direct connection to VPN containers via Tailscale mesh
- No internet routing through proxy infrastructure
- Lower latency and better performance
- Ideal for development and scripting
### 🌍 Approach 3: Legacy Proxy URLs
**Best for: External access, non-Tailscale networks**
```
# Use legacy country-specific URLs
curl -x http://proxy-us.rbnk.uk:8080 https://httpbin.org/ip
curl --socks5 proxy-de.rbnk.uk:1080 https://httpbin.org/ip
curl -x http://proxy-uk.rbnk.uk:8132 https://httpbin.org/ip
```
**Benefits:**
- Accessible from any internet connection
- No Tailscale client required
- SSL/TLS termination via Traefik
### Browser Configuration
**For HTTP Proxy:**
1. Go to Browser Proxy Settings
2. Select "Manual proxy configuration"
3. HTTP Proxy: `proxy-de.rbnk.uk` (or desired country)
4. Port: `8129` (for Germany, `8132` for UK, adjust for other countries)
5. Check "Use this proxy server for all protocols"
6. **No username/password required**
### Programming Examples
**Python with HTTP Proxy (No Auth)**:
```
import requests
# HTTP proxy - no authentication required
proxies = {
'http': 'http://proxy-de.rbnk.uk:8129',
'https': 'http://proxy-de.rbnk.uk:8129'
}
# UK proxy example
uk_proxies = {
'http': 'http://proxy-uk.rbnk.uk:8132',
'https': 'http://proxy-uk.rbnk.uk:8132'
}
response = requests.get('https://httpbin.org/ip', proxies=proxies)
print(response.json())
```
**Python with SOCKS5 Proxy**:
```
import requests
# SOCKS5 proxy - requires requests[socks]
proxies = {
'http': 'socks5://proxy-jp.rbnk.uk:1082',
'https': 'socks5://proxy-jp.rbnk.uk:1082'
}
# UK SOCKS5 example
uk_socks_proxies = {
'http': 'socks5://proxy-uk.rbnk.uk:1084',
'https': 'socks5://proxy-uk.rbnk.uk:1084'
}
response = requests.get('https://httpbin.org/ip', proxies=proxies)
print(response.json())
```
**Node.js with HTTP Proxy**:
```
const axios = require('axios');
const proxy = {
host: 'proxy-de.rbnk.uk',
port: 8129
// No authentication required
};
// UK proxy example
const ukProxy = {
host: 'proxy-uk.rbnk.uk',
port: 8132
};
axios.get('https://httpbin.org/ip', { proxy })
.then(response => console.log(response.data));
```
## 📁 Directory Structure
```
/opt/vpn-exit-controller/
├── dashboard/ # Next.js web dashboard
│ ├── src/ # Dashboard source code
│ ├── public/ # Static assets
│ ├── Dockerfile # Dashboard container
│ └── docker-compose.yml # Dashboard deployment
├── api/ # FastAPI application
│ ├── main.py # Main application entry point
│ ├── models/ # Data models and schemas
│ ├── routes/ # API route handlers
│ └── services/ # Business logic services
├── configs/ # VPN configuration files
├── traefik/ # Traefik reverse proxy configuration
│ ├── docker-compose.traefik.yml
│ ├── traefik.yml
│ └── dynamic/ # Dynamic configuration
├── proxy/ # HAProxy configuration
│ ├── docker-compose.yml
│ └── haproxy.cfg
├── scripts/ # Utility scripts
├── venv/ # Python virtual environment
├── .env # Environment variables
└── requirements.txt # Python dependencies
```
## ⚙️ Configuration
### Environment Variables
Key configuration options in `.env`:
```
# NordVPN Credentials
NORDVPN_USER=your_service_username
NORDVPN_PASS=your_service_password
# Tailscale
TAILSCALE_AUTH_KEY=your_tailscale_auth_key
# Redis
REDIS_HOST=localhost
REDIS_PORT=6379
# API Authentication
API_USERNAME=admin
API_PASSWORD=Bl4ckMagic!2345erver
# Cloudflare
CF_API_TOKEN=your_cloudflare_api_token
```
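For reference, a script or service might read these values with `python-dotenv` (an illustrative choice, not a project requirement); the variable names match the example above.
```
import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

# Load /opt/vpn-exit-controller/.env into the process environment
load_dotenv("/opt/vpn-exit-controller/.env")

NORDVPN_USER = os.environ["NORDVPN_USER"]
TAILSCALE_AUTH_KEY = os.environ["TAILSCALE_AUTH_KEY"]
REDIS_HOST = os.getenv("REDIS_HOST", "localhost")
REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))
API_USERNAME = os.getenv("API_USERNAME", "admin")

print(f"Redis at {REDIS_HOST}:{REDIS_PORT}, NordVPN user {NORDVPN_USER!r}")
```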
### Advanced Configuration
- **Load Balancing Strategy**: Set via API or environment variables
- **Health Check Intervals**: Configurable per-node monitoring
- **Auto-scaling Thresholds**: Connection-based scaling triggers
- **Speed Test Frequency**: Configurable performance monitoring
## 📊 Monitoring & Health Checks
### System Status
```
# Check overall system health
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/health
# Get detailed metrics
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/metrics
# View active nodes
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/nodes
```
### Service Status
```
# Check systemd service
systemctl status vpn-controller
# View logs
journalctl -u vpn-controller -f
# Check Docker containers
docker ps --filter name=vpn-exit
```
## 🔧 Troubleshooting
### Common Issues
**VPN Node Won't Start**:
```
# Check NordVPN credentials
docker logs vpn-exit-us
# Verify Tailscale connectivity
tailscale status
```
**Proxy Connection Fails**:
```
# Test HAProxy configuration
docker exec vpn-proxy haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg
# Check Traefik routing
curl -H "Host: proxy-us.rbnk.uk" http://localhost
```
**Load Balancing Issues**:
```
# Check Redis connectivity
redis-cli ping
# View load balancing stats
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/load-balancer/stats
```
## 📚 Documentation
- 🎛️ **Dashboard Guide** - Complete dashboard documentation
- 📋 **API Documentation** - Complete API reference
- 🏗️ **Architecture Guide** - Technical architecture details
- 🚀 **Deployment Guide** - Setup and installation
- 🌐 **Proxy Usage** - How to use proxy URLs
- ⚖️ **Load Balancing** - Load balancing strategies
- 🔒 **Security Guide** - Security best practices
- 🔧 **Troubleshooting** - Common issues and solutions
- 🛠️ **Maintenance** - Operations and maintenance
## 👥 Development
### Local Development
**API Development:**
```
# Activate virtual environment
source venv/bin/activate
# Install development dependencies
pip install -r requirements-dev.txt
# Run in development mode
uvicorn api.main:app --reload --host 0.0.0.0 --port 8080
```
**Dashboard Development:**
```
# Navigate to dashboard directory
cd dashboard
# Install dependencies
npm install
# Start development server
npm run dev
# Access at http://localhost:3000
```
### Testing
```
# Run unit tests
pytest tests/
# Run integration tests
pytest tests/integration/
# Test specific functionality
pytest tests/test_load_balancer.py -v
```
### Contributing
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/new-feature`
3. Make changes and add tests
4. Commit changes: `git commit -am 'Add new feature'`
5. Push to branch: `git push origin feature/new-feature`
6. Submit a pull request
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
## 🆘 Support
- 📖 **Documentation**: Check the comprehensive guides in this repository
- 🐛 **Issues**: Report bugs via GitHub Issues
- 💬 **Discussions**: Join GitHub Discussions for questions and ideas
- 📧 **Contact**: For enterprise support and custom deployments
---
**Built with ❤️ for reliable, intelligent VPN infrastructure management**
---
## Installation
### Installation Guide
This guide covers detailed installation instructions for VPN Exit Controller on various platforms.
## System Requirements
### Minimum Requirements
- **CPU**: 2 cores
- **RAM**: 4GB
- **Storage**: 20GB SSD
- **Network**: 100 Mbps connection
- **OS**: Ubuntu 22.04 LTS or compatible
### Recommended Requirements
- **CPU**: 4+ cores
- **RAM**: 8GB+
- **Storage**: 50GB+ SSD
- **Network**: 1 Gbps connection
- **OS**: Ubuntu 22.04 LTS
### Supported Platforms
| Platform | Version | Support Level |
|----------|---------|---------------|
| Ubuntu | 22.04 LTS | ✅ Full Support |
| Ubuntu | 20.04 LTS | ✅ Full Support |
| Debian | 11/12 | ✅ Full Support |
| RHEL/CentOS | 8/9 | ⚠️ Community Support |
| Proxmox LXC | 7.x/8.x | ✅ Full Support |
| Docker | 20.10+ | ✅ Full Support |
## Installation Methods
### Method 1: Automated Installation (Recommended)
```
# Download and run installer
curl -sSL https://vpn-docs.rbnk.uk/install.sh | bash
```
The installer will:
- ✅ Check system requirements
- ✅ Install dependencies
- ✅ Configure services
- ✅ Set up systemd units
- ✅ Create necessary directories
### Method 2: Manual Installation
#### Step 1: Install System Dependencies
=== "Ubuntu/Debian"
```
# Update system
sudo apt update && sudo apt upgrade -y
# Install dependencies
sudo apt install -y \
curl \
git \
python3.10 \
python3.10-venv \
python3-pip \
docker.io \
docker-compose \
redis-server \
nginx \
certbot \
python3-certbot-nginx
# Start services
sudo systemctl enable --now docker redis
```
=== "RHEL/CentOS"
```
# Update system
sudo dnf update -y
# Install dependencies
sudo dnf install -y \
curl \
git \
python3.10 \
python3-pip \
docker \
docker-compose \
redis \
nginx \
certbot \
python3-certbot-nginx
# Start services
sudo systemctl enable --now docker redis
```
#### Step 2: Install Tailscale
```
# Add Tailscale repository
curl -fsSL https://tailscale.com/install.sh | sh
# Start Tailscale
sudo systemctl enable --now tailscaled
```
#### Step 3: Clone Repository
```
# Clone from Gitea
git clone https://gitea.rbnk.uk/admin/vpn-controller.git /opt/vpn-exit-controller
cd /opt/vpn-exit-controller
```
#### Step 4: Python Environment Setup
```
# Create virtual environment
python3 -m venv venv
# Activate environment
source venv/bin/activate
# Install Python packages
pip install --upgrade pip
pip install -r requirements.txt
```
#### Step 5: Configure Environment
```
# Copy example configuration
cp .env.example .env
# Edit configuration
nano .env
```
Required environment variables:
```
# NordVPN Service Credentials
NORDVPN_USER=your_service_username
NORDVPN_PASS=your_service_password
# Tailscale Configuration
TAILSCALE_AUTH_KEY=tskey-auth-xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxx
# API Authentication
API_USERNAME=admin
API_PASSWORD=strong_password_here
# Cloudflare (for DNS management)
CF_API_TOKEN=your_cloudflare_api_token
# Redis Configuration
REDIS_HOST=localhost
REDIS_PORT=6379
```
#### Step 6: Create Systemd Service
```
# Create service file
sudo tee /etc/systemd/system/vpn-controller.service << EOF
[Unit]
Description=VPN Exit Controller API
After=network.target redis.service docker.service
Wants=redis.service docker.service
[Service]
Type=exec
User=root
WorkingDirectory=/opt/vpn-exit-controller
Environment="PATH=/opt/vpn-exit-controller/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
ExecStart=/opt/vpn-exit-controller/venv/bin/python -m uvicorn api.main:app --host 0.0.0.0 --port 8080
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
# Enable and start service
sudo systemctl daemon-reload
sudo systemctl enable --now vpn-controller
```
### Method 3: Docker Installation
#### Using Docker Compose
```
# Create docker-compose.yml
cat > docker-compose.yml << EOF
version: '3.8'
services:
redis:
image: redis:alpine
restart: always
ports:
- "6379:6379"
volumes:
- redis-data:/data
vpn-controller:
build: .
restart: always
ports:
- "8080:8080"
environment:
- REDIS_HOST=redis
env_file:
- .env
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- ./configs:/app/configs
depends_on:
- redis
volumes:
redis-data:
EOF
# Start services
docker-compose up -d
```
## Post-Installation Setup
### 1. Verify Installation
```
# Check service status
sudo systemctl status vpn-controller
# Test API endpoint
curl http://localhost:8080/api/health -u admin:your_password
```
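The same verification can be scripted. The snippet below is a small sketch that polls the health endpoint until the API answers or a timeout expires; the credentials are whatever you configured in `.env`.
```
import time
import requests

AUTH = ("admin", "your_password")  # use the credentials from your .env
URL = "http://localhost:8080/api/health"

deadline = time.time() + 60  # give the service up to a minute to come up
while time.time() < deadline:
    try:
        r = requests.get(URL, auth=AUTH, timeout=5)
        print("API responded:", r.status_code)
        break
    except requests.ConnectionError:
        time.sleep(3)  # not listening yet; retry
else:
    raise SystemExit("vpn-controller API did not come up within 60 seconds")
```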
### 2. Configure Firewall
=== "UFW (Ubuntu)"
```
# Allow required ports
sudo ufw allow 8080/tcp # API
sudo ufw allow 80/tcp # HTTP
sudo ufw allow 443/tcp # HTTPS
sudo ufw allow 8888/tcp # Docs rebuild webhook
# Enable firewall
sudo ufw enable
```
=== "firewalld (RHEL)"
```
# Allow required ports
sudo firewall-cmd --permanent --add-port=8080/tcp
sudo firewall-cmd --permanent --add-port=80/tcp
sudo firewall-cmd --permanent --add-port=443/tcp
sudo firewall-cmd --permanent --add-port=8888/tcp
# Reload firewall
sudo firewall-cmd --reload
```
### 3. Set Up SSL Certificates
```
# Using Certbot
sudo certbot --nginx -d vpn-api.yourdomain.com
# Or using Traefik (automatic)
cd traefik && docker-compose up -d
```
### 4. Configure DNS Records
Add these records to your domain:
| Type | Name | Value | Proxy |
|------|------|-------|-------|
| A | vpn-api | YOUR_SERVER_IP | ❌ |
| A | proxy-us | YOUR_SERVER_IP | ✅ |
| A | proxy-uk | YOUR_SERVER_IP | ✅ |
| A | proxy-jp | YOUR_SERVER_IP | ✅ |
## Troubleshooting Installation
!!! warning "Common Issues"
**Docker Permission Denied**
```
# Add user to docker group
sudo usermod -aG docker $USER
# Log out and back in
```
**Port Already in Use**
```
# Find process using port
sudo lsof -i :8080
# Change port in configuration
```
**Python Version Issues**
```
# Install specific Python version
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10 python3.10-venv
```
## Uninstallation
To completely remove VPN Exit Controller:
```
# Stop services
sudo systemctl stop vpn-controller
sudo systemctl disable vpn-controller
# Remove files
sudo rm -rf /opt/vpn-exit-controller
sudo rm /etc/systemd/system/vpn-controller.service
# Remove Docker containers
docker stop $(docker ps -a -q --filter name=vpn-)
docker rm $(docker ps -a -q --filter name=vpn-)
# Clean up (optional)
sudo apt remove --purge docker.io docker-compose
```
## Next Steps
- :material-rocket-launch:{ .lg .middle } __Quick Start__
---
Start using VPN Exit Controller
:octicons-arrow-right-24: Quick Start
- :material-cog:{ .lg .middle } __Configuration__
---
Configure advanced settings
:octicons-arrow-right-24: Configuration Guide
- :material-shield-check:{ .lg .middle } __Security__
---
Harden your installation
:octicons-arrow-right-24: Security Guide
---
!!! info "Need Help?"
If you encounter issues during installation, check our Troubleshooting Guide or open an issue.
---
## Operations > Deployment
### VPN Exit Controller - Deployment Guide
This comprehensive guide covers the complete deployment of the VPN Exit Controller system from scratch, including infrastructure setup, dependencies, configuration, and testing procedures.
## Table of Contents
1. Infrastructure Prerequisites
2. System Dependencies
3. Application Setup
4. Service Configuration
5. Network and DNS Setup
6. Container Infrastructure
7. Testing and Verification
8. Troubleshooting
## 1. Infrastructure Prerequisites
### 1.1 Proxmox VE Setup Requirements
#### Hardware Specifications (Minimum)
- **CPU**: 4 cores (Intel/AMD with virtualization support)
- **RAM**: 8GB (16GB recommended for multiple VPN nodes)
- **Storage**: 100GB SSD (for container and Docker images)
- **Network**: 1Gbps NIC with stable internet connection
#### Proxmox VE Installation
1. Install Proxmox VE 8.0+ on the host system
2. Configure network bridges in Proxmox web interface
3. Set up storage pools for container data
### 1.2 LXC Container Configuration
#### Create LXC Container
```
# Create Ubuntu 22.04 LXC container
pct create 201 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
--hostname vpn-controller \
--memory 4096 \
--cores 2 \
--rootfs local-lvm:32 \
--net0 name=eth0,bridge=vmbr1,ip=10.10.10.20/24,gw=10.10.10.1 \
--nameserver 1.1.1.1 \
--onboot 1 \
--unprivileged 0 \
--features nesting=1,keyctl=1
```
#### Essential Container Features
- `nesting=1`: Enables Docker containers within LXC
- `keyctl=1`: Required for Docker operations
- `unprivileged=0`: Runs as privileged container for Docker access
#### Network Configuration
```
# Configure static network in container
cat > /etc/netplan/01-netcfg.yaml << 'EOF'
network:
version: 2
ethernets:
eth0:
addresses:
- 10.10.10.20/24
gateway4: 10.10.10.1
nameservers:
addresses: [1.1.1.1, 8.8.8.8]
EOF
netplan apply
```
#### AppArmor Configuration (if needed)
```
# On Proxmox host, disable AppArmor for container
echo "lxc.apparmor.profile: unconfined" >> /etc/pve/lxc/201.conf
pct reboot 201
```
## 2. System Dependencies
### 2.1 Ubuntu 22.04 LXC Base Setup
```
# Update system packages
apt update && apt upgrade -y
# Install essential system packages
apt install -y \
curl \
wget \
git \
nano \
htop \
net-tools \
iptables \
ca-certificates \
gnupg \
lsb-release \
software-properties-common \
apt-transport-https
```
### 2.2 Docker Installation and Configuration
```
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Add Docker repository
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker
apt update
apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
# Start and enable Docker
systemctl start docker
systemctl enable docker
# Add user to docker group (if not running as root)
usermod -aG docker $USER
```
#### Docker Configuration
```
# Configure Docker daemon
cat > /etc/docker/daemon.json << 'EOF'
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
},
"dns": ["1.1.1.1", "8.8.8.8"],
"storage-driver": "overlay2"
}
EOF
systemctl restart docker
```
### 2.3 Python 3.10 with Virtual Environment
```
# Install Python 3.10 and pip
apt install -y python3.10 python3.10-venv python3-pip
# Verify Python installation
python3 --version
```
### 2.4 Redis Server Installation
```
# Install Redis
apt install -y redis-server
# Configure Redis
sed -i 's/bind 127.0.0.1 ::1/bind 127.0.0.1/' /etc/redis/redis.conf
sed -i 's/# requirepass foobared/requirepass vpn-redis-2024/' /etc/redis/redis.conf
# Start and enable Redis
systemctl start redis-server
systemctl enable redis-server
# Test Redis (requirepass is set above, so authenticate)
redis-cli -a vpn-redis-2024 ping
```
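Because `requirepass` is now enabled, clients must authenticate. A quick programmatic check using the `redis` Python package (an optional extra, not installed by the steps above) with the password configured in this step:
```
import redis  # pip install redis

r = redis.Redis(host="127.0.0.1", port=6379, password="vpn-redis-2024")
print(r.ping())                          # True if reachable and password accepted
r.set("vpn:healthcheck", "ok", ex=60)    # short-lived key as a write smoke test
print(r.get("vpn:healthcheck"))
```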
### 2.5 Additional System Packages
```
# Install network utilities
apt install -y \
openvpn \
iptables-persistent \
netfilter-persistent \
bridge-utils \
iproute2 \
tcpdump \
nmap \
jq
```
## 3. Application Setup
### 3.1 Repository Cloning and Directory Setup
```
# Create application directory
mkdir -p /opt/vpn-exit-controller
cd /opt/vpn-exit-controller
# Clone repository (adjust URL as needed)
git clone https://github.com/your-repo/vpn-exit-controller.git .
# Set proper permissions
chown -R root:root /opt/vpn-exit-controller
chmod +x scripts/*.sh
chmod +x start.sh
```
### 3.2 Python Virtual Environment Setup
```
# Create virtual environment
cd /opt/vpn-exit-controller
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
# Install Python dependencies
pip install --upgrade pip
pip install -r api/requirements.txt
# Verify installations
pip list
```
### 3.3 Environment Variable Configuration
```
# Create .env file
cat > /opt/vpn-exit-controller/.env << 'EOF'
# Application Settings
SECRET_KEY=your-super-secret-key-change-this-in-production
ADMIN_USER=admin
ADMIN_PASS=Bl4ckMagic!2345erver
# Tailscale Configuration
TAILSCALE_AUTHKEY=tskey-auth-your-tailscale-key-here
# NordVPN Credentials
NORDVPN_USERNAME=your-nordvpn-username
NORDVPN_PASSWORD=your-nordvpn-password
# Redis Configuration
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
REDIS_PASSWORD=vpn-redis-2024
# Cloudflare DNS API (for SSL certificates)
CLOUDFLARE_EMAIL=admin@richardbankole.com
CLOUDFLARE_API_KEY=your-cloudflare-api-key
# Domain Configuration
DOMAIN=rbnk.uk
API_DOMAIN=vpn-api.rbnk.uk
EOF
# Secure the .env file
chmod 600 /opt/vpn-exit-controller/.env
```
### 3.4 NordVPN Configuration Setup
```
# Create NordVPN authentication file
mkdir -p /opt/vpn-exit-controller/configs
cat > /opt/vpn-exit-controller/configs/auth.txt << 'EOF'
your-nordvpn-username
your-nordvpn-password
EOF
chmod 600 /opt/vpn-exit-controller/configs/auth.txt
# Download NordVPN configuration files
cd /opt/vpn-exit-controller
bash scripts/download-nordvpn-configs.sh
```
## 4. Service Configuration
### 4.1 NordVPN Service Credentials Setup
The NordVPN configurations are already present in the `/opt/vpn-exit-controller/configs/vpn/` directory. Ensure your NordVPN credentials are properly configured:
```
# Verify NordVPN configs exist
ls -la /opt/vpn-exit-controller/configs/vpn/
# Test a configuration (optional)
openvpn --config /opt/vpn-exit-controller/configs/vpn/us.ovpn \
--auth-user-pass /opt/vpn-exit-controller/configs/auth.txt \
--daemon
```
### 4.2 Tailscale Installation and Configuration
```
# Install Tailscale
curl -fsSL https://tailscale.com/install.sh | sh
# Start Tailscale daemon
systemctl start tailscaled
systemctl enable tailscaled
# Authenticate with Tailscale (use your auth key from .env)
tailscale up --authkey=tskey-auth-your-key-here \
--advertise-exit-node \
--hostname=vpn-controller
# Verify Tailscale status
tailscale status
tailscale ip -4
```
### 4.3 Systemd Service Installation
```
# Create the systemd service file
cat > /etc/systemd/system/vpn-controller.service << 'EOF'
[Unit]
Description=VPN Exit Controller API
After=docker.service tailscaled.service redis-server.service
Requires=docker.service
Wants=tailscaled.service redis-server.service
[Service]
Type=simple
ExecStart=/opt/vpn-exit-controller/start.sh
Restart=on-failure
RestartSec=10
User=root
WorkingDirectory=/opt/vpn-exit-controller
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
[Install]
WantedBy=multi-user.target
EOF
# Reload systemd and enable service
systemctl daemon-reload
systemctl enable vpn-controller
```
### 4.4 Firewall and iptables Configuration
```
# Configure iptables for VPN traffic
cat > /etc/iptables/rules.v4 << 'EOF'
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
# NAT rules for VPN traffic
-A POSTROUTING -s 10.0.0.0/8 -o tun+ -j MASQUERADE
-A POSTROUTING -s 172.16.0.0/12 -o tun+ -j MASQUERADE
-A POSTROUTING -s 192.168.0.0/16 -o tun+ -j MASQUERADE
COMMIT
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
# Allow loopback
-A INPUT -i lo -j ACCEPT
# Allow established connections
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# Allow SSH
-A INPUT -p tcp --dport 22 -j ACCEPT
# Allow HTTP/HTTPS
-A INPUT -p tcp --dport 80 -j ACCEPT
-A INPUT -p tcp --dport 443 -j ACCEPT
# Allow API port
-A INPUT -p tcp --dport 8080 -j ACCEPT
# Allow Tailscale
-A INPUT -p udp --dport 41641 -j ACCEPT
# Forward VPN traffic
-A FORWARD -i tun+ -j ACCEPT
-A FORWARD -o tun+ -j ACCEPT
# Drop invalid packets
-A INPUT -m state --state INVALID -j DROP
COMMIT
EOF
# Apply iptables rules
iptables-restore < /etc/iptables/rules.v4
netfilter-persistent save
```
## 5. Network and DNS Setup
### 5.1 Cloudflare DNS Configuration
Configure the following DNS records in your Cloudflare dashboard for `rbnk.uk`:
```
# Main API endpoint
vpn-api.rbnk.uk A 10.10.10.20 (Proxied: Yes)
# Proxy endpoints for each country
proxy-us.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-uk.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-de.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-jp.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-ca.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-au.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-nl.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-fr.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-it.rbnk.uk A 10.10.10.20 (Proxied: Yes)
proxy-es.rbnk.uk A 10.10.10.20 (Proxied: Yes)
# Traefik dashboard (optional)
traefik.rbnk.uk A 10.10.10.20 (Proxied: Yes)
```
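These records can also be created through the Cloudflare v4 API instead of the dashboard. The sketch below is illustrative only: the email and API key correspond to `CLOUDFLARE_EMAIL` / `CLOUDFLARE_API_KEY` in `.env`, the zone ID is a placeholder you must look up, and it creates just the A records listed above.
```
import requests

CF_EMAIL = "admin@richardbankole.com"     # CLOUDFLARE_EMAIL from .env
CF_API_KEY = "your-cloudflare-api-key"    # CLOUDFLARE_API_KEY from .env
ZONE_ID = "your-zone-id"                  # placeholder: zone ID for rbnk.uk

HEADERS = {"X-Auth-Email": CF_EMAIL, "X-Auth-Key": CF_API_KEY,
           "Content-Type": "application/json"}

names = ["vpn-api"] + [f"proxy-{cc}" for cc in
                       ("us", "uk", "de", "jp", "ca", "au", "nl", "fr", "it", "es")]

for name in names:
    record = {"type": "A", "name": f"{name}.rbnk.uk",
              "content": "10.10.10.20", "proxied": True}
    resp = requests.post(
        f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/dns_records",
        headers=HEADERS, json=record, timeout=10)
    print(name, resp.status_code, resp.json().get("success"))
```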
### 5.2 SSL Certificate Configuration
The Traefik configuration handles SSL certificates automatically via Let's Encrypt and Cloudflare DNS challenge:
```
# Ensure acme.json has correct permissions
mkdir -p /opt/vpn-exit-controller/traefik/letsencrypt
touch /opt/vpn-exit-controller/traefik/letsencrypt/acme.json
chmod 600 /opt/vpn-exit-controller/traefik/letsencrypt/acme.json
```
## 6. Container Infrastructure
### 6.1 Docker Network Setup
```
# Create custom Docker networks
docker network create vpn-network --subnet=172.20.0.0/16
docker network create traefik-network --subnet=172.21.0.0/16
```
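If network setup is folded into a provisioning script, the same networks can be created with the Docker SDK for Python (`docker` package); the names and subnets mirror the commands above.
```
import docker

client = docker.from_env()

def ensure_network(name: str, subnet: str):
    """Create the bridge network only if it does not already exist."""
    existing = client.networks.list(names=[name])
    if existing:
        return existing[0]
    ipam = docker.types.IPAMConfig(pool_configs=[docker.types.IPAMPool(subnet=subnet)])
    return client.networks.create(name, driver="bridge", ipam=ipam)

ensure_network("vpn-network", "172.20.0.0/16")
ensure_network("traefik-network", "172.21.0.0/16")
print([n.name for n in client.networks.list()])
```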
### 6.2 Build VPN Node Container
```
# Build the VPN node Docker image
cd /opt/vpn-exit-controller/vpn-node
docker build -t vpn-exit-node:latest .
# Verify image was built
docker images | grep vpn-exit-node
```
### 6.3 Traefik Deployment
```
# Start Traefik container
cd /opt/vpn-exit-controller/traefik
docker compose -f docker-compose.traefik.yml up -d
# Check Traefik status
docker ps | grep traefik
docker logs traefik
```
### 6.4 HAProxy Deployment
```
# Start HAProxy and proxy infrastructure
cd /opt/vpn-exit-controller/proxy
docker compose up -d
# Verify HAProxy is running
docker ps | grep haproxy
curl -s http://localhost:8404 # HAProxy stats page
```
### 6.5 Main Application Deployment
```
# Start the main application stack
cd /opt/vpn-exit-controller
docker compose up -d
# Start the systemd service
systemctl start vpn-controller
systemctl status vpn-controller
```
## 7. Testing and Verification
### 7.1 Health Check Procedures
```
# Check all services are running
systemctl status vpn-controller
systemctl status docker
systemctl status tailscaled
systemctl status redis-server
# Check Docker containers
docker ps -a
# Check application logs
journalctl -u vpn-controller -f
docker logs vpn-api
docker logs vpn-redis
```
### 7.2 API Endpoint Testing
```
# Test API status endpoint
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status
# Test via domain (after DNS propagation)
curl -u admin:Bl4ckMagic!2345erver https://vpn-api.rbnk.uk/api/status
# Test node management endpoints
curl -u admin:Bl4ckMagic!2345erver https://vpn-api.rbnk.uk/api/nodes
# Test metrics endpoint
curl -u admin:Bl4ckMagic!2345erver https://vpn-api.rbnk.uk/api/metrics
```
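These checks can be combined into a small smoke test that reports the HTTP status of each endpoint listed above; adjust credentials and hostnames to your environment.
```
import requests

AUTH = ("admin", "Bl4ckMagic!2345erver")
ENDPOINTS = [
    "http://localhost:8080/api/status",
    "https://vpn-api.rbnk.uk/api/status",
    "https://vpn-api.rbnk.uk/api/nodes",
    "https://vpn-api.rbnk.uk/api/metrics",
]

for url in ENDPOINTS:
    try:
        code = requests.get(url, auth=AUTH, timeout=10).status_code
    except requests.RequestException as exc:
        code = f"error: {exc.__class__.__name__}"
    print(f"{url}: {code}")
```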
### 7.3 Proxy URL Verification
```
# Test HTTP proxy endpoints
curl -x proxy-us.rbnk.uk:80 http://ipinfo.io/country
curl -x proxy-uk.rbnk.uk:80 http://ipinfo.io/country
curl -x proxy-de.rbnk.uk:80 http://ipinfo.io/country
# Test SOCKS5 proxy (if configured)
curl --socks5 proxy-us.rbnk.uk:1080 http://ipinfo.io/country
```
### 7.4 Performance Testing
```
# Speed test through proxy
curl -x proxy-us.rbnk.uk:80 -w "@curl-format.txt" -o /dev/null -s http://speedtest.net/mini.php
# Create curl format file for detailed timing
cat > curl-format.txt << 'EOF'
time_namelookup: %{time_namelookup}\n
time_connect: %{time_connect}\n
time_appconnect: %{time_appconnect}\n
time_pretransfer: %{time_pretransfer}\n
time_redirect: %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
----------\n
time_total: %{time_total}\n
EOF
```
### 7.5 Tailscale Exit Node Verification
```
# Check Tailscale status
tailscale status
# Verify exit node advertisement
tailscale status | grep "exit node"
# Test from another Tailscale device
# Use this node as exit node and check external IP
```
## 8. Troubleshooting
### 8.1 Common Issues and Solutions
#### Docker Permission Issues
```
# Add user to docker group
usermod -aG docker $USER
newgrp docker
# Or run as root
sudo su -
```
#### Container Networking Issues
```
# Restart Docker daemon
systemctl restart docker
# Recreate networks
docker network rm vpn-network traefik-network
docker network create vpn-network --subnet=172.20.0.0/16
docker network create traefik-network --subnet=172.21.0.0/16
```
#### SSL Certificate Issues
```
# Check Traefik logs
docker logs traefik
# Verify Cloudflare API credentials
# Check acme.json permissions
ls -la /opt/vpn-exit-controller/traefik/letsencrypt/acme.json
```
#### VPN Connection Issues
```
# Check NordVPN credentials
cat /opt/vpn-exit-controller/configs/auth.txt
# Test manual OpenVPN connection
openvpn --config /opt/vpn-exit-controller/configs/vpn/us.ovpn \
--auth-user-pass /opt/vpn-exit-controller/configs/auth.txt
```
### 8.2 Log Locations
```
# Application logs
journalctl -u vpn-controller -f
# Docker container logs
docker logs vpn-api
docker logs vpn-redis
docker logs traefik
docker logs haproxy
# System logs
/var/log/syslog
/var/log/daemon.log
# Traefik logs
/opt/vpn-exit-controller/traefik/logs/
```
### 8.3 Recovery Procedures
#### Service Recovery
```
# Restart all services
systemctl restart vpn-controller
docker compose down && docker compose up -d
# Clean restart
docker system prune -f
docker compose down -v
docker compose up -d --build
```
#### Database Recovery
```
# Restart Redis
systemctl restart redis-server
# Clear Redis cache if needed
redis-cli FLUSHALL
```
## Post-Deployment Checklist
- [ ] All services running and enabled
- [ ] DNS records configured and propagated
- [ ] SSL certificates obtained and valid
- [ ] API endpoints responding correctly
- [ ] Proxy URLs functional for all countries
- [ ] Tailscale exit node operational
- [ ] Monitoring and logging configured
- [ ] Backup procedures established
- [ ] Security hardening completed
- [ ] Performance baselines established
## Security Considerations
1. **Change default passwords** in `.env` file
2. **Restrict API access** using proper authentication
3. **Configure firewall rules** to limit exposed ports
4. **Regular security updates** for all components
5. **Monitor access logs** for suspicious activity
6. **Secure NordVPN credentials** with proper file permissions
7. **Use strong Tailscale authentication** keys
8. **Regular backup** of configuration files
## Maintenance
### Regular Tasks
- Monitor disk space and logs
- Update Docker images monthly
- Rotate authentication keys quarterly
- Review access logs weekly
- Test backup/recovery procedures monthly
### Updates
- Always test updates in staging environment
- Backup configurations before updates
- Update dependencies in requirements.txt
- Monitor for security advisories
This deployment guide provides a complete foundation for setting up the VPN Exit Controller system. Adjust specific values like domain names, IP addresses, and credentials according to your environment.
---
## Operations > Docs Build System
### Documentation Build System
This page describes the automated documentation build system for the VPN Exit Controller project.
## Overview
The VPN Exit Controller documentation is built using MkDocs with the Material theme and is automatically rebuilt whenever changes are pushed to the repository. The documentation is hosted at https://vpn-docs.rbnk.uk.
## Architecture
```
graph LR
A[Git Push] --> B[Gitea Webhook]
B --> C["Webhook Server<br/>Port 8888"]
C --> D[Rebuild Script]
D --> E[Docker Build]
E --> F[New Container]
F --> G[Traefik]
G --> H[vpn-docs.rbnk.uk]
```
## Components
### 1. MkDocs Site
**Location**: `/opt/vpn-exit-controller/mkdocs-site/`
The documentation source includes:
- `mkdocs.yml` - MkDocs configuration
- `docs/` - Documentation source files
- `Dockerfile` - Multi-stage build for documentation
- `docker-compose.yml` - Container orchestration
- `nginx.conf` - Web server configuration
### 2. Webhook Server
**Script**: `/opt/vpn-exit-controller/scripts/webhook-docs-rebuild.py`
**Service**: `docs-webhook.service`
**Port**: 8888
The webhook server (a minimal sketch follows this list):
- Listens for POST requests to `/rebuild-docs`
- Validates webhook signatures (optional)
- Triggers the rebuild script
- Logs all activity
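The production server is the script referenced above; the following is only a minimal sketch of the same flow (accept a POST to `/rebuild-docs`, optionally verify an HMAC signature, launch the rebuild script) using the Python standard library.
```
# Illustrative sketch only; the real server is
# /opt/vpn-exit-controller/scripts/webhook-docs-rebuild.py
import hashlib
import hmac
import os
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

SECRET = os.environ.get("WEBHOOK_SECRET", "")
REBUILD = "/opt/vpn-exit-controller/scripts/rebuild-docs.sh"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/rebuild-docs":
            self.send_response(404); self.end_headers(); return
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        if SECRET:  # signature check is optional, matching the description above
            expected = "sha256=" + hmac.new(SECRET.encode(), body, hashlib.sha256).hexdigest()
            if not hmac.compare_digest(expected, self.headers.get("X-Hub-Signature-256", "")):
                self.send_response(403); self.end_headers(); return
        subprocess.Popen(["/bin/bash", REBUILD])  # rebuild in the background
        self.send_response(202); self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8888), Handler).serve_forever()
```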
### 3. Rebuild Script
**Location**: `/opt/vpn-exit-controller/scripts/rebuild-docs.sh`
The rebuild script performs:
1. Stops the existing documentation container
2. Builds a new container with latest documentation
3. Starts the new container
4. Verifies deployment success
5. Logs the rebuild event
### 4. Docker Container
**Container Name**: `vpn-docs`
**Internal Port**: 80
**External Port**: 8001
The container uses a multi-stage build:
1. **Builder stage**: Python environment with MkDocs
2. **Production stage**: Nginx serving static files
## Configuration
### Webhook Service Configuration
The webhook service is managed by systemd:
```
[Unit]
Description=Documentation Rebuild Webhook Server
After=network.target docker.service
Requires=docker.service
[Service]
Type=simple
User=root
WorkingDirectory=/opt/vpn-exit-controller
Environment="WEBHOOK_SECRET=change-me-to-secure-secret"
ExecStart=/usr/bin/python3 /opt/vpn-exit-controller/scripts/webhook-docs-rebuild.py
Restart=always
RestartSec=10
```
### Gitea Webhook Setup
To configure automatic rebuilds:
1. Navigate to your repository settings in Gitea
2. Go to Webhooks section
3. Add a new webhook with:
- **URL**: `http://10.10.10.20:8888/rebuild-docs`
- **Method**: POST
- **Events**: Push events
- **Secret**: (optional but recommended)
### Security Configuration
For production use, configure a webhook secret:
1. Generate a secure secret:
```
openssl rand -hex 32
```
2. Update the systemd service:
```
systemctl edit docs-webhook
```
3. Add the environment variable:
```
[Service]
Environment="WEBHOOK_SECRET=your-generated-secret"
```
4. Use the same secret in Gitea webhook configuration
## Usage
### Automatic Builds
Documentation is automatically rebuilt when:
- Code is pushed to the main branch
- A webhook request is received
- The rebuild is manually triggered
### Manual Rebuild
To manually rebuild documentation:
```
# Option 1: Direct script execution
/opt/vpn-exit-controller/scripts/rebuild-docs.sh
# Option 2: Trigger via webhook
curl -X POST http://localhost:8888/rebuild-docs
# Option 3: With webhook secret
curl -X POST http://localhost:8888/rebuild-docs \
-H "X-Hub-Signature-256: sha256=your-signature"
```
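When a webhook secret is configured, the signature in option 3 must be the SHA-256 HMAC of the request body, prefixed with `sha256=`. A small helper for sending a correctly signed trigger (the secret and body below are placeholders):
```
import hashlib
import hmac
import requests

secret = b"change-me-to-secure-secret"   # must match WEBHOOK_SECRET in the service unit
body = b"{}"                             # payload sent to the webhook

signature = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
resp = requests.post(
    "http://localhost:8888/rebuild-docs",
    data=body,
    headers={"X-Hub-Signature-256": signature},
    timeout=10,
)
print(resp.status_code)
```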
### Monitoring
Check the system status:
```
# Webhook service status
systemctl status docs-webhook
# Container status
docker ps | grep vpn-docs
# Recent rebuilds
tail -f /opt/vpn-exit-controller/logs/docs-rebuild.log
# Webhook activity
tail -f /opt/vpn-exit-controller/logs/webhook.log
```
## Troubleshooting
### Common Issues
#### Webhook Not Triggering
1. Check service status:
```
systemctl status docs-webhook
journalctl -u docs-webhook -f
```
2. Test webhook connectivity:
```
curl -X POST http://localhost:8888/rebuild-docs
```
3. Verify Gitea can reach the webhook URL
#### Build Failures
1. Check Docker logs:
```
docker logs vpn-docs
```
2. Manually test the build:
```
cd /opt/vpn-exit-controller/mkdocs-site
docker-compose build
```
3. Check disk space:
```
df -h
```
#### Site Not Accessible
1. Verify container health:
```
docker inspect vpn-docs --format='{{.State.Health.Status}}'
```
2. Check Traefik routing:
```
docker logs traefik | grep vpn-docs
```
3. Test SSL certificate:
```
curl -vI https://vpn-docs.rbnk.uk
```
### Log Locations
- **Webhook logs**: `/opt/vpn-exit-controller/logs/webhook.log`
- **Rebuild logs**: `/opt/vpn-exit-controller/logs/docs-rebuild.log`
- **Container logs**: `docker logs vpn-docs`
- **Service logs**: `journalctl -u docs-webhook`
## Maintenance
### Regular Tasks
- **Monitor disk usage** - Documentation builds can consume space
- **Review logs** - Check for failed builds or security issues
- **Update dependencies** - Keep MkDocs and plugins updated
- **Rotate logs** - Ensure log files don't grow too large
### Updates
To update the documentation system:
1. Update MkDocs dependencies:
```
cd /opt/vpn-exit-controller/mkdocs-site
# Update requirements.txt
docker-compose build --no-cache
```
2. Update webhook server:
```
# Modify the Python script
systemctl restart docs-webhook
```
## Performance
The documentation build process:
- Takes 1-2 minutes to complete
- Uses minimal CPU during normal operation
- Requires ~500MB disk space for build cache
- Serves static files efficiently via Nginx
## Security Considerations
1. **Webhook Authentication**: Always use a secret in production
2. **Network Access**: Limit webhook access to trusted sources
3. **Container Isolation**: Runs with minimal privileges
4. **SSL/TLS**: All public access uses HTTPS via Traefik
5. **Input Validation**: Webhook server validates all inputs
## Future Enhancements
Potential improvements to consider:
- [ ] Add build status badges
- [ ] Implement build notifications
- [ ] Add search analytics
- [ ] Enable documentation versioning
- [ ] Add automated link checking
- [ ] Implement A/B testing for docs
- [ ] Add user feedback collection
## Related Documentation
- Deployment Guide - Overall system deployment
- Maintenance Guide - System maintenance procedures
- Troubleshooting Guide - General troubleshooting
---
!!! tip "Quick Test"
To test if the documentation build system is working, make a small change to any `.md` file, commit, and push. You should see the documentation automatically rebuild within 2-3 minutes.
---
## Operations
### Operations Guide
This section covers the operational aspects of running and maintaining the VPN Exit Controller system.
## Quick Links
- **Deployment** - Initial system deployment and setup
- **Documentation Build System** - Automated documentation builds
- **Monitoring** - System monitoring and alerting
- **Maintenance** - Routine maintenance procedures
- **Troubleshooting** - Common issues and solutions
- **Scaling** - Scaling the system for growth
## Overview
Operating the VPN Exit Controller requires understanding several key areas:
### 🚀 Deployment
Learn how to deploy the system from scratch, including infrastructure setup, service configuration, and initial testing.
### 📚 Documentation Build System
Understand how documentation is automatically built and deployed when changes are pushed to the repository. This ensures documentation stays in sync with the codebase.
### 📊 Monitoring
Set up comprehensive monitoring to track system health, performance metrics, and potential issues before they impact users.
### 🔧 Maintenance
Follow routine maintenance procedures to keep the system running smoothly, including updates, backups, and security patches.
### 🔍 Troubleshooting
Quickly diagnose and resolve common issues using our troubleshooting guide and diagnostic tools.
### 📈 Scaling
Plan for growth with our scaling guide, covering both vertical and horizontal scaling strategies.
## Key Operational Tasks
### Daily Tasks
- Monitor system health via dashboard
- Check webhook and build logs
- Review error logs for issues
- Verify all VPN nodes are operational
### Weekly Tasks
- Review performance metrics
- Check disk usage and clean if needed
- Verify backup procedures
- Update documentation as needed
### Monthly Tasks
- Security updates and patches
- Certificate renewal verification
- Capacity planning review
- Documentation review and updates
## Important Locations
### Configuration Files
- Main config: `/opt/vpn-exit-controller/`
- Service files: `/etc/systemd/system/`
- Docker configs: Various `docker-compose.yml` files
### Log Files
- API logs: `journalctl -u vpn-controller`
- Webhook logs: `/opt/vpn-exit-controller/logs/webhook.log`
- Container logs: `docker logs [container-name]`
### Documentation
- Source: `/opt/vpn-exit-controller/mkdocs-site/docs/`
- Live site: https://vpn-docs.rbnk.uk
## Getting Help
If you encounter issues not covered in these guides:
1. Check the troubleshooting guide
2. Review system logs for error messages
3. Consult the API documentation
4. Check the main documentation index
---
!!! info "Continuous Improvement"
This operations documentation is continuously updated based on operational experience. If you discover new issues or better procedures, please document them!
---
## Operations > Maintenance
### VPN Exit Controller - Maintenance Guide
This document provides comprehensive maintenance procedures for the VPN Exit Controller system running on Proxmox LXC container (ID: 201).
## Table of Contents
1. Routine Maintenance Tasks
2. System Monitoring
3. Backup and Recovery
4. Updates and Upgrades
5. Certificate Management
6. VPN Service Management
7. Capacity Management
8. Security Maintenance
9. Documentation Updates
10. Emergency Procedures
---
## 1. Routine Maintenance Tasks
### Daily Health Checks (Automated)
Create a daily health check script at `/opt/vpn-exit-controller/scripts/daily-check.sh`:
```
#!/bin/bash
# Daily health check script
LOG_FILE="/var/log/vpn-controller-health.log"
DATE=$(date "+%Y-%m-%d %H:%M:%S")
echo "[$DATE] Starting daily health check" >> $LOG_FILE
# Check system resources
# Use integer percentages so the numeric comparisons below work
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print int($2)}')
MEM_USAGE=$(free | grep Mem | awk '{printf("%.0f", $3/$2 * 100.0)}')
DISK_USAGE=$(df -h /opt | awk 'NR==2 {print $5}' | sed 's/%//')
echo "[$DATE] CPU: ${CPU_USAGE}%, Memory: ${MEM_USAGE}%, Disk: ${DISK_USAGE}%" >> $LOG_FILE
# Check service status
if systemctl is-active --quiet vpn-controller; then
echo "[$DATE] VPN Controller service: RUNNING" >> $LOG_FILE
else
echo "[$DATE] VPN Controller service: FAILED" >> $LOG_FILE
systemctl restart vpn-controller
fi
# Check Docker containers
RUNNING_CONTAINERS=$(docker ps --format "table {{.Names}}\t{{.Status}}" | grep -c "Up")
echo "[$DATE] Running containers: $RUNNING_CONTAINERS" >> $LOG_FILE
# Check API health
API_RESPONSE=$(curl -s -u admin:Bl4ckMagic!2345erver -o /dev/null -w "%{http_code}" http://localhost:8080/api/status)
if [ "$API_RESPONSE" = "200" ]; then
echo "[$DATE] API health: OK" >> $LOG_FILE
else
echo "[$DATE] API health: FAILED (HTTP $API_RESPONSE)" >> $LOG_FILE
fi
# Check Redis
if docker exec vpn-redis redis-cli ping | grep -q PONG; then
echo "[$DATE] Redis: OK" >> $LOG_FILE
else
echo "[$DATE] Redis: FAILED" >> $LOG_FILE
fi
# Alert on high resource usage
if [ "$CPU_USAGE" -gt 80 ] || [ "$MEM_USAGE" -gt 85 ] || [ "$DISK_USAGE" -gt 90 ]; then
echo "[$DATE] WARNING: High resource usage detected" >> $LOG_FILE
fi
echo "[$DATE] Daily health check completed" >> $LOG_FILE
```
**Schedule**: Add to crontab:
```
0 6 * * * /opt/vpn-exit-controller/scripts/daily-check.sh
```
### Weekly Performance Review
Run every Sunday at 2 AM:
```
#!/bin/bash
# Weekly performance review script
REPORT_FILE="/var/log/vpn-controller-weekly-$(date +%Y%m%d).log"
echo "Weekly Performance Report - $(date)" > $REPORT_FILE
echo "========================================" >> $REPORT_FILE
# System uptime
uptime >> $REPORT_FILE
# Docker container statistics
echo -e "\nContainer Resource Usage:" >> $REPORT_FILE
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}" >> $REPORT_FILE
# VPN node performance
echo -e "\nVPN Node Statistics:" >> $REPORT_FILE
curl -s -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/metrics/summary >> $REPORT_FILE
# Log analysis
echo -e "\nError Summary (last 7 days):" >> $REPORT_FILE
journalctl -u vpn-controller --since "7 days ago" | grep -i error | wc -l >> $REPORT_FILE
# Redis memory usage
echo -e "\nRedis Memory Usage:" >> $REPORT_FILE
docker exec vpn-redis redis-cli info memory | grep used_memory_human >> $REPORT_FILE
```
### Monthly System Updates
**First Sunday of each month at 3 AM:**
```
#!/bin/bash
# Monthly system update script
UPDATE_LOG="/var/log/monthly-updates-$(date +%Y%m).log"
echo "Monthly System Update - $(date)" > $UPDATE_LOG
# Update package lists
apt update >> $UPDATE_LOG 2>&1
# List available updates
echo "Available updates:" >> $UPDATE_LOG
apt list --upgradable >> $UPDATE_LOG 2>&1
# Apply available updates non-interactively, keeping existing config files
DEBIAN_FRONTEND=noninteractive apt upgrade -y -o Dpkg::Options::="--force-confdef" >> $UPDATE_LOG 2>&1
# Clean up
apt autoremove -y >> $UPDATE_LOG 2>&1
apt autoclean >> $UPDATE_LOG 2>&1
# Check if reboot is required
if [ -f /var/run/reboot-required ]; then
echo "REBOOT REQUIRED" >> $UPDATE_LOG
# Schedule reboot for low-traffic time (4 AM)
shutdown -r 04:00 "Scheduled reboot for system updates"
fi
```
### Quarterly Capacity Planning
**First day of each quarter:**
```
#!/bin/bash
# Quarterly capacity planning report
# Determine the current quarter (10# avoids octal errors for months 08/09)
QUARTER="$(date +%Y)-Q$(( (10#$(date +%m) - 1) / 3 + 1 ))"
REPORT_FILE="/var/log/capacity-report-${QUARTER}.log"
echo "Quarterly Capacity Planning Report - $QUARTER" > $REPORT_FILE
echo "=============================================" >> $REPORT_FILE
# Historical resource usage trends
echo "Resource Usage Trends (last 90 days):" >> $REPORT_FILE
# Disk usage growth
df -h /opt >> $REPORT_FILE
# VPN node usage statistics
echo -e "\nVPN Node Utilization:" >> $REPORT_FILE
curl -s -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/metrics/nodes | jq '.[] | {country: .country, usage_percent: .usage_percent}' >> $REPORT_FILE
# Connection statistics
echo -e "\nConnection Statistics:" >> $REPORT_FILE
curl -s -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/metrics/connections >> $REPORT_FILE
# Recommendations section
echo -e "\nCapacity Recommendations:" >> $REPORT_FILE
echo "Review this report to determine if additional nodes or resources are needed." >> $REPORT_FILE
```
---
## 2. System Monitoring
### Key Metrics to Watch
#### System Level Metrics
- **CPU Usage**: Should stay below 80% average
- **Memory Usage**: Should stay below 85%
- **Disk Usage**: Should stay below 90%
- **Network I/O**: Monitor for unusual spikes
- **Load Average**: Should be below the number of CPU cores
#### Application Level Metrics
- **API Response Time**: < 500ms for health checks
- **Active VPN Connections**: Track connection counts
- **Container Health**: All containers should be "healthy"
- **Redis Memory Usage**: Monitor for memory leaks
- **Failed Connection Attempts**: Track error rates
### Alert Thresholds and Escalation
#### Critical Alerts (Immediate Response)
- API unavailable (HTTP 5xx errors)
- System CPU > 95% for 5+ minutes
- System memory > 95%
- Disk usage > 95%
- All VPN nodes offline
- Redis unavailable
#### Warning Alerts (Response within 4 hours)
- System CPU > 80% for 15+ minutes
- System memory > 85%
- Disk usage > 90%
- More than 50% of VPN nodes offline
- High error rate (> 5%)
#### Info Alerts (Response within 24 hours)
- Single VPN node offline
- Slow API response times (> 1s)
- Log rotation needed
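As a minimal sketch of turning these thresholds into automated checks, the helper below classifies a single integer metric against a warning and a critical limit and appends to the alert log used later in this guide; the function name is illustrative, not part of the shipped tooling:
```
#!/bin/bash
# Illustrative threshold classifier (not part of the shipped tooling)
ALERT_LOG="/var/log/vpn-controller-alerts.log"
classify_metric() {
# $1=metric name, $2=current value (integer %), $3=warning limit, $4=critical limit
local name="$1" value="$2" warn="$3" crit="$4"
if [ "$value" -ge "$crit" ]; then
echo "$(date): CRITICAL - $name at ${value}% (critical limit ${crit}%)" >> "$ALERT_LOG"
elif [ "$value" -ge "$warn" ]; then
echo "$(date): WARNING - $name at ${value}% (warning limit ${warn}%)" >> "$ALERT_LOG"
fi
}
# Example: disk usage of /opt against the 90% warning / 95% critical thresholds above
DISK_USAGE=$(df /opt | awk 'NR==2 {print $5}' | sed 's/%//')
classify_metric "Disk usage" "$DISK_USAGE" 90 95
```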
### Log Monitoring and Rotation
#### Configure log rotation for VPN Controller:
Create `/etc/logrotate.d/vpn-controller`:
```
/var/log/vpn-controller*.log {
daily
rotate 30
compress
delaycompress
missingok
notifempty
create 644 root root
postrotate
systemctl reload vpn-controller
endscript
}
```
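You can verify the configuration with a dry run before relying on the nightly logrotate job:
```
# Show what logrotate would do without actually rotating anything
logrotate -d /etc/logrotate.d/vpn-controller
# Force an immediate rotation to confirm the postrotate hook works
logrotate -f /etc/logrotate.d/vpn-controller
```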
#### Monitor key log patterns:
```
# Create log monitoring script
#!/bin/bash
# Monitor critical log patterns
LOG_FILE="/var/log/vpn-controller-alerts.log"
JOURNAL_LOG=$(journalctl -u vpn-controller --since "1 hour ago" --no-pager)
# Check for critical errors
CRITICAL_ERRORS=$(echo "$JOURNAL_LOG" | grep -i "critical\|fatal\|emergency" | wc -l)
if [ "$CRITICAL_ERRORS" -gt 0 ]; then
echo "$(date): CRITICAL - $CRITICAL_ERRORS critical errors found" >> $LOG_FILE
fi
# Check for failed VPN connections
FAILED_CONNECTIONS=$(echo "$JOURNAL_LOG" | grep -i "connection failed\|vpn failed" | wc -l)
if [ "$FAILED_CONNECTIONS" -gt 10 ]; then
echo "$(date): WARNING - $FAILED_CONNECTIONS failed VPN connections in last hour" >> $LOG_FILE
fi
# Check for Docker issues
DOCKER_ERRORS=$(echo "$JOURNAL_LOG" | grep -i "docker.*error" | wc -l)
if [ "$DOCKER_ERRORS" -gt 0 ]; then
echo "$(date): WARNING - $DOCKER_ERRORS Docker errors found" >> $LOG_FILE
fi
```
### Performance Baseline Tracking
Create baseline measurements script:
```
#!/bin/bash
# Performance baseline tracking
BASELINE_FILE="/var/log/performance-baseline.json"
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# Collect performance metrics
API_RESPONSE_TIME=$(curl -o /dev/null -s -w "%{time_total}" -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status)
MEMORY_USAGE=$(free | grep Mem | awk '{printf("%.1f"), $3/$2 * 100.0}')
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | awk -F'%' '{print $1}')
DISK_USAGE=$(df /opt | awk 'NR==2 {print $5}' | sed 's/%//')
# Create JSON entry
cat >> $BASELINE_FILE << EOF
{
"timestamp": "$TIMESTAMP",
"api_response_time": $API_RESPONSE_TIME,
"memory_usage_percent": $MEMORY_USAGE,
"cpu_usage_percent": $CPU_USAGE,
"disk_usage_percent": $DISK_USAGE
}
EOF
```
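Because the baseline file is a stream of appended JSON objects rather than a single array, analyses should slurp it; for example, assuming `jq` is installed:
```
# Summarise API response times across all recorded baseline entries
jq -s '{samples: length, avg_api_response: ([.[].api_response_time] | add / length), max_api_response: ([.[].api_response_time] | max)}' \
/var/log/performance-baseline.json
```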
---
## 3. Backup and Recovery
### Configuration Backup Procedures
#### Daily Configuration Backup:
```
#!/bin/bash
# Daily configuration backup script
BACKUP_DIR="/opt/backups/vpn-controller"
DATE=$(date +%Y%m%d)
BACKUP_FILE="vpn-controller-config-${DATE}.tar.gz"
mkdir -p $BACKUP_DIR
# Backup configurations
tar -czf "${BACKUP_DIR}/${BACKUP_FILE}" \
/opt/vpn-exit-controller/configs/ \
/opt/vpn-exit-controller/.env \
/opt/vpn-exit-controller/docker-compose.yml \
/opt/vpn-exit-controller/traefik/ \
/etc/systemd/system/vpn-controller.service
# Keep only last 30 days of backups
find $BACKUP_DIR -name "vpn-controller-config-*.tar.gz" -mtime +30 -delete
echo "$(date): Configuration backup completed: $BACKUP_FILE"
```
#### Weekly Full System Backup:
```
#!/bin/bash
# Weekly full system backup
BACKUP_DIR="/opt/backups/vpn-controller"
DATE=$(date +%Y%m%d)
FULL_BACKUP="vpn-controller-full-${DATE}.tar.gz"
# Stop services for consistent backup
systemctl stop vpn-controller
docker-compose -f /opt/vpn-exit-controller/docker-compose.yml down
# Create full backup
tar -czf "${BACKUP_DIR}/${FULL_BACKUP}" \
/opt/vpn-exit-controller/ \
/etc/systemd/system/vpn-controller.service \
--exclude=/opt/vpn-exit-controller/venv/ \
--exclude=/opt/vpn-exit-controller/logs/ \
--exclude=/opt/vpn-exit-controller/data/cache/
# Start services
systemctl start vpn-controller
# Keep only last 4 weekly backups
find $BACKUP_DIR -name "vpn-controller-full-*.tar.gz" -mtime +28 -delete
echo "$(date): Full system backup completed: $FULL_BACKUP"
```
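These backup scripts are not scheduled anywhere by default; a crontab sketch might look like the following (the script paths are illustrative, save the scripts wherever suits your layout):
```
# Daily configuration backup at 01:30
30 1 * * * /opt/vpn-exit-controller/scripts/backup-config.sh
# Weekly full backup on Sunday at 01:00 (briefly stops services)
0 1 * * 0 /opt/vpn-exit-controller/scripts/backup-full.sh
```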
### Redis Database Backup
```
#!/bin/bash
# Redis backup script
BACKUP_DIR="/opt/backups/redis"
DATE=$(date +%Y%m%d-%H%M)
REDIS_BACKUP="redis-backup-${DATE}.rdb"
mkdir -p $BACKUP_DIR
# Create Redis backup
docker exec vpn-redis redis-cli BGSAVE
sleep 5
# Copy the backup file
docker cp vpn-redis:/data/dump.rdb "${BACKUP_DIR}/${REDIS_BACKUP}"
# Keep only last 7 days of Redis backups
find $BACKUP_DIR -name "redis-backup-*.rdb" -mtime +7 -delete
echo "$(date): Redis backup completed: $REDIS_BACKUP"
```
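To restore from one of these snapshots, stop Redis, put the RDB file back in place, and restart; a minimal sketch, assuming AOF persistence is not enabled (the backup filename below is an example):
```
# Restore Redis from a saved RDB snapshot
docker stop vpn-redis
docker cp /opt/backups/redis/redis-backup-20250115-0300.rdb vpn-redis:/data/dump.rdb
docker start vpn-redis
# Verify the restore
docker exec vpn-redis redis-cli ping
docker exec vpn-redis redis-cli dbsize
```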
### Recovery Testing Procedures
#### Monthly Recovery Test:
```
#!/bin/bash
# Monthly recovery test procedure
TEST_DIR="/tmp/recovery-test-$(date +%Y%m%d)"
LATEST_BACKUP=$(ls -t /opt/backups/vpn-controller/vpn-controller-config-*.tar.gz | head -1)
echo "Testing recovery from backup: $LATEST_BACKUP"
# Create test directory
mkdir -p $TEST_DIR
# Extract backup
tar -xzf $LATEST_BACKUP -C $TEST_DIR
# Verify key files exist
REQUIRED_FILES=(
"opt/vpn-exit-controller/configs/auth.txt"
"opt/vpn-exit-controller/.env"
"opt/vpn-exit-controller/docker-compose.yml"
)
RECOVERY_SUCCESS=true
for file in "${REQUIRED_FILES[@]}"; do
if [ ! -f "$TEST_DIR/$file" ]; then
echo "ERROR: Missing file in backup: $file"
RECOVERY_SUCCESS=false
fi
done
# Test configuration validity
if [ "$RECOVERY_SUCCESS" = true ]; then
echo "Recovery test PASSED"
else
echo "Recovery test FAILED"
fi
# Cleanup
rm -rf $TEST_DIR
```
### Disaster Recovery Planning
#### Full System Recovery Procedure:
1. **Prepare New LXC Container:**
```
# On Proxmox host
pct create 201 /var/lib/vz/template/cache/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
--memory 4096 --cores 4 --storage local-lvm --size 50G \
--net0 name=eth0,bridge=vmbr1,ip=10.10.10.20/24,gw=10.10.10.1 \
--nameserver 8.8.8.8 --hostname vpn-exit-controller \
--features nesting=1,keyctl=1 \
--unprivileged 0
```
2. **Start Container and Install Dependencies:**
```
pct start 201
pct enter 201
apt update && apt upgrade -y
apt install -y docker.io docker-compose python3 python3-pip python3-venv curl
systemctl enable docker
systemctl start docker
```
3. **Restore from Backup:**
```
# Copy latest backup to container
LATEST_BACKUP=$(ls -t /opt/backups/vpn-controller/vpn-controller-full-*.tar.gz | head -1)
tar -xzf $LATEST_BACKUP -C /
# Restore permissions
chown -R root:root /opt/vpn-exit-controller/
chmod +x /opt/vpn-exit-controller/start.sh
# Restore systemd service
systemctl daemon-reload
systemctl enable vpn-controller
systemctl start vpn-controller
```
4. **Verify Recovery:**
```
# Check service status
systemctl status vpn-controller
# Test API
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status
# Check containers
docker ps
```
---
## 4. Updates and Upgrades
### System Package Updates
#### Security Updates (Weekly):
```
#!/bin/bash
# Weekly security updates
LOG_FILE="/var/log/security-updates.log"
echo "$(date): Starting security updates" >> $LOG_FILE
# Update package lists
apt update >> $LOG_FILE 2>&1
# Install security updates only
DEBIAN_FRONTEND=noninteractive apt-get -y install --only-upgrade \
-o Dpkg::Options::="--force-confdef" \
-o Dpkg::Options::="--force-confold" \
$(apt list --upgradable 2>/dev/null | grep -i security | cut -d/ -f1) >> $LOG_FILE 2>&1
# Check for reboot requirement
if [ -f /var/run/reboot-required ]; then
echo "$(date): Reboot required after security updates" >> $LOG_FILE
fi
echo "$(date): Security updates completed" >> $LOG_FILE
```
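The grep-based package selection above is fragile (output formatting and locale can break it). If you prefer a supported mechanism on Debian/Ubuntu, the `unattended-upgrades` package applies security updates on its own schedule:
```
# Install and enable automatic security updates
apt install -y unattended-upgrades
dpkg-reconfigure -plow unattended-upgrades
# Dry run to see what would be applied
unattended-upgrade --dry-run --debug
```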
#### Full System Updates (Monthly):
```
#!/bin/bash
# Monthly full system updates (scheduled maintenance window)
MAINTENANCE_LOG="/var/log/maintenance-updates.log"
echo "$(date): Starting maintenance window - full system update" >> $MAINTENANCE_LOG
# Stop VPN service
systemctl stop vpn-controller >> $MAINTENANCE_LOG 2>&1
# Update all packages
apt update && apt full-upgrade -y >> $MAINTENANCE_LOG 2>&1
# Clean up
apt autoremove -y >> $MAINTENANCE_LOG 2>&1
apt autoclean >> $MAINTENANCE_LOG 2>&1
# Update Docker images
docker-compose -f /opt/vpn-exit-controller/docker-compose.yml pull >> $MAINTENANCE_LOG 2>&1
# Restart service
systemctl start vpn-controller >> $MAINTENANCE_LOG 2>&1
# Verify service is running
sleep 30
if systemctl is-active --quiet vpn-controller; then
echo "$(date): Service restarted successfully" >> $MAINTENANCE_LOG
else
echo "$(date): ERROR: Service failed to restart" >> $MAINTENANCE_LOG
fi
echo "$(date): Maintenance window completed" >> $MAINTENANCE_LOG
```
### Docker Image Updates
#### Check for Updates:
```
#!/bin/bash
# Check for Docker image updates
IMAGES=(
"redis:7-alpine"
"traefik:v2.10"
)
for image in "${IMAGES[@]}"; do
echo "Checking updates for $image"
# Record the image ID currently present before pulling
CURRENT_ID=$(docker images --no-trunc --quiet "$image" | head -1)
# Pull latest
docker pull "$image"
NEW_ID=$(docker images --no-trunc --quiet "$image" | head -1)
if [ "$CURRENT_ID" != "$NEW_ID" ]; then
echo "Update pulled for $image"
else
echo "No update available for $image"
fi
done
```
#### Update Custom VPN Node Image:
```
#!/bin/bash
# Update VPN node image
cd /opt/vpn-exit-controller/vpn-node
# Build new image
docker build -t vpn-exit-node:latest .
# Test new image
docker run --rm vpn-exit-node:latest --version
# Restart containers with new image
docker-compose -f /opt/vpn-exit-controller/docker-compose.yml up -d --force-recreate
```
### Application Code Updates
#### Update from Git Repository:
```
#!/bin/bash
# Update application code from repository
cd /opt/vpn-exit-controller
# Backup current version
tar -czf "/opt/backups/pre-update-$(date +%Y%m%d).tar.gz" . --exclude=venv --exclude=data
# Stop service
systemctl stop vpn-controller
# Pull updates (if using git)
# git pull origin main
# Update Python dependencies
source venv/bin/activate
pip install -r api/requirements.txt
# Restart service
systemctl start vpn-controller
# Verify update
sleep 30
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status
```
### Dependency Management
#### Python Dependencies Audit:
```
#!/bin/bash
# Audit Python dependencies for security vulnerabilities
cd /opt/vpn-exit-controller
source venv/bin/activate
# Check for outdated packages
pip list --outdated
# Security audit (install pip-audit if not available)
pip install pip-audit
pip-audit
# Generate requirements with exact versions
pip freeze > requirements-frozen.txt
```
---
## 5. Certificate Management
### SSL Certificate Monitoring
#### Check Certificate Expiration:
```
#!/bin/bash
# Check SSL certificate expiration
CERT_FILE="/opt/vpn-exit-controller/traefik/letsencrypt/acme.json"
ALERT_DAYS=30
if [ -f "$CERT_FILE" ]; then
# Extract certificate expiration dates
python3 << EOF
import json
import base64
import datetime
from cryptography import x509
with open('$CERT_FILE', 'r') as f:
acme_data = json.load(f)
for resolver in acme_data.values():
for cert_data in resolver.get('Certificates', []):
cert_pem = base64.b64decode(cert_data['certificate']).decode()
cert = x509.load_pem_x509_certificate(cert_pem.encode())
domain = cert_data['domain']['main']
expiry = cert.not_valid_after
days_left = (expiry - datetime.datetime.now()).days
print(f"Domain: {domain}, Expires: {expiry}, Days left: {days_left}")
if days_left < $ALERT_DAYS:
print(f"WARNING: Certificate for {domain} expires in {days_left} days")
EOF
fi
```
### Let's Encrypt Certificate Renewal
#### Automatic Renewal Check:
```
#!/bin/bash
# Check Let's Encrypt certificate renewal
# Traefik handles automatic renewal, but verify it's working
TRAEFIK_LOG="/opt/vpn-exit-controller/traefik/logs/traefik.log"
# Check for recent renewal attempts
if [ -f "$TRAEFIK_LOG" ]; then
echo "Recent certificate renewal attempts:"
grep -i "certificate\|acme" "$TRAEFIK_LOG" | tail -10
fi
# Verify certificate is valid
DOMAIN="your-domain.com" # Replace with actual domain
echo | openssl s_client -servername $DOMAIN -connect $DOMAIN:443 2>/dev/null | openssl x509 -noout -dates
```
### Certificate Troubleshooting
#### Common Issues and Solutions:
```
#!/bin/bash
# Certificate troubleshooting script
echo "Certificate Troubleshooting Report"
echo "=================================="
# Check Traefik configuration
echo "1. Checking Traefik configuration..."
docker-compose -f /opt/vpn-exit-controller/traefik/docker-compose.traefik.yml config
# Check ACME challenge accessibility
echo "2. Checking ACME challenge accessibility..."
DOMAIN="your-domain.com" # Replace with actual domain
curl -I "http://$DOMAIN/.well-known/acme-challenge/test"
# Check DNS resolution
echo "3. Checking DNS resolution..."
nslookup $DOMAIN
# Check port accessibility
echo "4. Checking port 443 accessibility..."
nc -zv $DOMAIN 443
# Check Traefik dashboard
echo "5. Checking Traefik dashboard..."
curl -I http://localhost:8080/dashboard/
```
---
## 6. VPN Service Management
### NordVPN Credential Rotation
#### Monthly Credential Check:
```
#!/bin/bash
# Check and rotate NordVPN credentials if needed
AUTH_FILE="/opt/vpn-exit-controller/configs/auth.txt"
CURRENT_DATE=$(date +%s)
FILE_AGE=$(stat -c %Y "$AUTH_FILE")
AGE_DAYS=$(( (CURRENT_DATE - FILE_AGE) / 86400 ))
echo "NordVPN credentials are $AGE_DAYS days old"
if [ $AGE_DAYS -gt 60 ]; then
echo "WARNING: NordVPN credentials are older than 60 days"
echo "Consider updating credentials in $AUTH_FILE"
# Test current credentials
echo "Testing current credentials..."
# This would require implementing a credential test function
fi
```
#### Credential Update Procedure:
```
#!/bin/bash
# Update NordVPN credentials
AUTH_FILE="/opt/vpn-exit-controller/configs/auth.txt"
BACKUP_FILE="/opt/backups/auth-backup-$(date +%Y%m%d).txt"
echo "Updating NordVPN credentials..."
# Backup current credentials
cp "$AUTH_FILE" "$BACKUP_FILE"
# Prompt for new credentials (in production, use secure input method)
echo "Enter new NordVPN username:"
read -r NEW_USERNAME
echo "Enter new NordVPN password:"
read -rs NEW_PASSWORD
# Update auth file
echo "$NEW_USERNAME" > "$AUTH_FILE"
echo "$NEW_PASSWORD" >> "$AUTH_FILE"
# Restart VPN containers to use new credentials
docker-compose -f /opt/vpn-exit-controller/docker-compose.yml restart
echo "Credentials updated. Testing connections..."
# Wait for containers to start
sleep 30
# Test API to verify VPN connections are working
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status
```
### Server List Updates
#### Update NordVPN Server Configurations:
```
#!/bin/bash
# Update NordVPN server configurations
SCRIPT_PATH="/opt/vpn-exit-controller/scripts/download-nordvpn-configs.sh"
CONFIG_DIR="/opt/vpn-exit-controller/configs/vpn"
echo "Updating NordVPN server configurations..."
# Backup current configurations
tar -czf "/opt/backups/vpn-configs-backup-$(date +%Y%m%d).tar.gz" "$CONFIG_DIR"
# Run the download script
if [ -f "$SCRIPT_PATH" ]; then
bash "$SCRIPT_PATH"
echo "Server configurations updated"
# Restart service to load new configurations
systemctl restart vpn-controller
else
echo "ERROR: Download script not found at $SCRIPT_PATH"
fi
```
### Performance Optimization
#### VPN Node Performance Tuning:
```
#!/bin/bash
# VPN node performance optimization
echo "VPN Performance Optimization Report"
echo "==================================="
# Check connection speeds for each country
COUNTRIES=("us" "uk" "de" "jp" "au")
for country in "${COUNTRIES[@]}"; do
echo "Testing $country nodes..."
# Use speed test API endpoint
SPEED_RESULT=$(curl -s -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/speed-test/$country" | jq '.download_speed')
echo "$country: $SPEED_RESULT Mbps"
done
# Check for underperforming nodes
echo -e "\nChecking for underperforming nodes..."
curl -s -u admin:Bl4ckMagic!2345erver \
"http://localhost:8080/api/metrics/nodes" | \
jq '.[] | select(.performance_score < 0.7) | {country: .country, score: .performance_score}'
```
### Service Status Monitoring
#### Comprehensive VPN Service Check:
```
#!/bin/bash
# Comprehensive VPN service status check
STATUS_LOG="/var/log/vpn-service-status.log"
TIMESTAMP=$(date)
echo "[$TIMESTAMP] VPN Service Status Check" >> $STATUS_LOG
# Check each VPN container
VPN_CONTAINERS=$(docker ps --filter "name=vpn-" --format "{{.Names}}")
for container in $VPN_CONTAINERS; do
# Check container health
HEALTH=$(docker inspect --format='{{.State.Health.Status}}' $container 2>/dev/null || echo "no health check")
# Check VPN connection
IP_CHECK=$(docker exec $container curl -s --max-time 10 ifconfig.me 2>/dev/null || echo "failed")
echo "[$TIMESTAMP] $container: Health=$HEALTH, IP=$IP_CHECK" >> $STATUS_LOG
done
# Check overall API health
API_HEALTH=$(curl -s -u admin:Bl4ckMagic!2345erver -o /dev/null -w "%{http_code}" http://localhost:8080/api/status)
echo "[$TIMESTAMP] API Health: HTTP $API_HEALTH" >> $STATUS_LOG
# Check Redis connectivity
REDIS_HEALTH=$(docker exec vpn-redis redis-cli ping 2>/dev/null || echo "FAILED")
echo "[$TIMESTAMP] Redis Health: $REDIS_HEALTH" >> $STATUS_LOG
```
---
## 7. Capacity Management
### Resource Usage Monitoring
#### Real-time Resource Monitor:
```
#!/bin/bash
# Real-time resource monitoring script
MONITOR_LOG="/var/log/resource-monitor.log"
while true; do
TIMESTAMP=$(date "+%Y-%m-%d %H:%M:%S")
# System resources
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | awk -F'%' '{print $1}')
MEM_USAGE=$(free | grep Mem | awk '{printf("%.1f"), $3/$2 * 100.0}')
DISK_USAGE=$(df /opt | awk 'NR==2 {print $5}' | sed 's/%//')
# Docker resources
DOCKER_CONTAINERS=$(docker ps -q | wc -l)
# Network connections
CONNECTIONS=$(ss -tuln | wc -l)
echo "$TIMESTAMP,$CPU_USAGE,$MEM_USAGE,$DISK_USAGE,$DOCKER_CONTAINERS,$CONNECTIONS" >> $MONITOR_LOG
sleep 300 # 5-minute intervals
done
```
### Scaling Decisions
#### Auto-scaling Triggers:
```
#!/bin/bash
# Auto-scaling decision logic
SCALE_LOG="/var/log/scaling-decisions.log"
TIMESTAMP=$(date)
# Get current metrics
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | awk -F'%' '{print $1}')
ACTIVE_CONNECTIONS=$(curl -s -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/metrics/connections | jq '.active_connections')
RUNNING_NODES=$(docker ps --filter "name=vpn-" -q | wc -l)
echo "[$TIMESTAMP] Scaling Check: CPU=$CPU_USAGE%, Connections=$ACTIVE_CONNECTIONS, Nodes=$RUNNING_NODES" >> $SCALE_LOG
# Scaling up logic
if (( $(echo "$CPU_USAGE > 70" | bc -l) )) && [ "$ACTIVE_CONNECTIONS" -gt 100 ] && [ "$RUNNING_NODES" -lt 10 ]; then
echo "[$TIMESTAMP] SCALE UP: High load detected" >> $SCALE_LOG
# Trigger scale up via API
curl -X POST -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/nodes/scale-up
fi
# Scaling down logic
if (( $(echo "$CPU_USAGE < 30" | bc -l) )) && [ "$ACTIVE_CONNECTIONS" -lt 20 ] && [ "$RUNNING_NODES" -gt 3 ]; then
echo "[$TIMESTAMP] SCALE DOWN: Low load detected" >> $SCALE_LOG
# Trigger scale down via API
curl -X POST -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/nodes/scale-down
fi
```
### Node Capacity Planning
#### Capacity Planning Analysis:
```
#!/bin/bash
# Node capacity planning analysis
REPORT_FILE="/var/log/capacity-analysis-$(date +%Y%m%d).log"
echo "Node Capacity Planning Analysis - $(date)" > $REPORT_FILE
echo "=========================================" >> $REPORT_FILE
# Current node utilization
echo -e "\nCurrent Node Utilization:" >> $REPORT_FILE
curl -s -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/metrics/nodes | \
jq -r '.[] | "\(.country): \(.active_connections) connections, \(.cpu_usage)% CPU, \(.memory_usage)% Memory"' >> $REPORT_FILE
# Peak usage analysis (last 7 days)
echo -e "\nPeak Usage Analysis (Last 7 Days):" >> $REPORT_FILE
# Historical connection data
PEAK_CONNECTIONS=$(tail -2016 /var/log/resource-monitor.log | \
cut -d',' -f6 | sort -n | tail -1) # Last 7 days of 5-minute samples; connections are field 6
echo "Peak concurrent connections: $PEAK_CONNECTIONS" >> $REPORT_FILE
# Capacity recommendations
echo -e "\nCapacity Recommendations:" >> $REPORT_FILE
if [ "$PEAK_CONNECTIONS" -gt 500 ]; then
echo "- Consider adding more VPN nodes for high-demand countries" >> $REPORT_FILE
fi
if [ "$(docker ps -q | wc -l)" -gt 15 ]; then
echo "- Current container count is high, consider resource optimization" >> $REPORT_FILE
fi
echo "- Monitor load balancing effectiveness across regions" >> $REPORT_FILE
```
### Performance Optimization
#### System Performance Tuning:
```
#!/bin/bash
# System performance optimization
echo "Applying performance optimizations..."
# Network optimizations (use a drop-in file so repeated runs don't duplicate entries)
cat > /etc/sysctl.d/99-vpn-tuning.conf << 'EOF'
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 65536 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
EOF
# Apply changes
sysctl --system
# Docker performance optimizations
echo '{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
},
"default-ulimits": {
"nofile": {
"Name": "nofile",
"Hard": 64000,
"Soft": 64000
}
}
}' > /etc/docker/daemon.json
systemctl restart docker
echo "Performance optimizations applied"
```
---
## 8. Security Maintenance
### Security Patch Management
#### Critical Security Updates:
```
#!/bin/bash
# Critical security update management
SECURITY_LOG="/var/log/security-updates.log"
TIMESTAMP=$(date)
echo "[$TIMESTAMP] Starting critical security check" >> $SECURITY_LOG
# Check for available security updates
SECURITY_UPDATES=$(apt list --upgradable 2>/dev/null | grep -i security | wc -l)
if [ "$SECURITY_UPDATES" -gt 0 ]; then
echo "[$TIMESTAMP] $SECURITY_UPDATES security updates available" >> $SECURITY_LOG
# Apply critical security updates immediately
DEBIAN_FRONTEND=noninteractive apt-get -y install \
$(apt list --upgradable 2>/dev/null | grep -i security | cut -d/ -f1) \
>> $SECURITY_LOG 2>&1
# Check if reboot is required
if [ -f /var/run/reboot-required ]; then
echo "[$TIMESTAMP] CRITICAL: Reboot required for security updates" >> $SECURITY_LOG
# Schedule reboot during maintenance window
shutdown -r +60 "Security updates require reboot"
fi
else
echo "[$TIMESTAMP] No security updates available" >> $SECURITY_LOG
fi
# Update Docker base images for security
docker-compose -f /opt/vpn-exit-controller/docker-compose.yml pull >> $SECURITY_LOG 2>&1
```
### Credential Rotation Schedules
#### Quarterly Credential Rotation:
```
#!/bin/bash
# Quarterly credential rotation
ROTATION_LOG="/var/log/credential-rotation.log"
TIMESTAMP=$(date)
echo "[$TIMESTAMP] Starting quarterly credential rotation" >> $ROTATION_LOG
# 1. API Admin Password Rotation
echo "[$TIMESTAMP] Rotating API admin password" >> $ROTATION_LOG
NEW_API_PASSWORD=$(openssl rand -hex 24)  # hex output is single-line and contains no characters that break sed
# Update environment file
sed -i "s/ADMIN_PASS=.*/ADMIN_PASS=$NEW_API_PASSWORD/" /opt/vpn-exit-controller/.env
# 2. Redis Password (if used)
# NEW_REDIS_PASSWORD=$(openssl rand -hex 24)
# 3. Secret Key Rotation
NEW_SECRET_KEY=$(openssl rand -hex 32)
sed -i "s/SECRET_KEY=.*/SECRET_KEY=$NEW_SECRET_KEY/" /opt/vpn-exit-controller/.env
# 4. Restart services with new credentials
systemctl restart vpn-controller
echo "[$TIMESTAMP] Credential rotation completed" >> $ROTATION_LOG
echo "[$TIMESTAMP] New API password: $NEW_API_PASSWORD" >> $ROTATION_LOG
```
### Access Review Procedures
#### Monthly Access Review:
```
#!/bin/bash
# Monthly access review
REVIEW_LOG="/var/log/access-review-$(date +%Y%m).log"
echo "Monthly Access Review - $(date)" > $REVIEW_LOG
echo "=================================" >> $REVIEW_LOG
# Review systemd service permissions
echo -e "\nService File Permissions:" >> $REVIEW_LOG
ls -la /etc/systemd/system/vpn-controller.service >> $REVIEW_LOG
# Review application directory permissions
echo -e "\nApplication Directory Permissions:" >> $REVIEW_LOG
ls -la /opt/vpn-exit-controller/ >> $REVIEW_LOG
# Review Docker socket access
echo -e "\nDocker Socket Access:" >> $REVIEW_LOG
ls -la /var/run/docker.sock >> $REVIEW_LOG
# Review network exposure
echo -e "\nNetwork Exposure:" >> $REVIEW_LOG
ss -tuln >> $REVIEW_LOG
# Review recent authentication attempts
echo -e "\nRecent Authentication Attempts:" >> $REVIEW_LOG
journalctl -u vpn-controller --since "30 days ago" | grep -i "auth\|login" | tail -20 >> $REVIEW_LOG
```
### Security Audit Schedules
#### Comprehensive Security Audit:
```
#!/bin/bash
# Comprehensive security audit
AUDIT_LOG="/var/log/security-audit-$(date +%Y%m%d).log"
echo "Security Audit Report - $(date)" > $AUDIT_LOG
echo "===============================" >> $AUDIT_LOG
# 1. System vulnerabilities scan
echo -e "\n1. System Vulnerability Scan:" >> $AUDIT_LOG
if command -v lynis &> /dev/null; then
lynis audit system --quiet >> $AUDIT_LOG
else
echo "Lynis not installed. Install with: apt install lynis" >> $AUDIT_LOG
fi
# 2. Open ports audit
echo -e "\n2. Open Ports Audit:" >> $AUDIT_LOG
nmap -sS -p- localhost >> $AUDIT_LOG 2>&1
# 3. Docker security audit
echo -e "\n3. Docker Security Audit:" >> $AUDIT_LOG
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
-v /usr/lib/systemd:/usr/lib/systemd \
-v /etc:/etc --label docker_bench_security \
docker/docker-bench-security >> $AUDIT_LOG 2>/dev/null || echo "Docker Bench Security not available" >> $AUDIT_LOG
# 4. File permissions audit
echo -e "\n4. Critical File Permissions:" >> $AUDIT_LOG
CRITICAL_FILES=(
"/opt/vpn-exit-controller/.env"
"/opt/vpn-exit-controller/configs/auth.txt"
"/etc/systemd/system/vpn-controller.service"
)
for file in "${CRITICAL_FILES[@]}"; do
if [ -f "$file" ]; then
ls -la "$file" >> $AUDIT_LOG
fi
done
# 5. Process audit
echo -e "\n5. Running Processes:" >> $AUDIT_LOG
ps aux | grep -E "(vpn|docker|redis)" >> $AUDIT_LOG
echo -e "\nSecurity audit completed. Review $AUDIT_LOG for findings." >> $AUDIT_LOG
```
---
## 9. Documentation Updates
### Keeping Documentation Current
#### Documentation Update Checklist:
```
## Monthly Documentation Review Checklist
- [ ] Review API_DOCUMENTATION.md for new endpoints
- [ ] Update ARCHITECTURE.md for infrastructure changes
- [ ] Check DEPLOYMENT.md for new deployment procedures
- [ ] Verify SECURITY.md reflects current security measures
- [ ] Update this MAINTENANCE.md for new procedures
- [ ] Review configuration examples for accuracy
- [ ] Update version numbers and dependencies
- [ ] Check command examples are current
- [ ] Verify troubleshooting guides are accurate
- [ ] Update contact information and escalation procedures
```
#### Automated Documentation Validation:
```
#!/bin/bash
# Validate documentation accuracy
DOC_DIR="/opt/vpn-exit-controller"
VALIDATION_LOG="/var/log/doc-validation.log"
TIMESTAMP=$(date)
echo "[$TIMESTAMP] Starting documentation validation" > $VALIDATION_LOG
# Check if mentioned files exist
DOCS=("API_DOCUMENTATION.md" "ARCHITECTURE.md" "DEPLOYMENT.md" "SECURITY.md" "MAINTENANCE.md")
for doc in "${DOCS[@]}"; do
if [ -f "$DOC_DIR/$doc" ]; then
echo "[$TIMESTAMP] ✓ $doc exists" >> $VALIDATION_LOG
# Check for outdated information
LAST_MODIFIED=$(stat -c %Y "$DOC_DIR/$doc")
CURRENT_TIME=$(date +%s)
DAYS_OLD=$(( (CURRENT_TIME - LAST_MODIFIED) / 86400 ))
if [ $DAYS_OLD -gt 90 ]; then
echo "[$TIMESTAMP] ⚠ $doc is $DAYS_OLD days old (review needed)" >> $VALIDATION_LOG
fi
else
echo "[$TIMESTAMP] ✗ $doc missing" >> $VALIDATION_LOG
fi
done
# Validate command examples in documentation
echo "[$TIMESTAMP] Validating command examples..." >> $VALIDATION_LOG
# Test API endpoint mentioned in docs
if curl -s -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status > /dev/null; then
echo "[$TIMESTAMP] ✓ API endpoint accessible" >> $VALIDATION_LOG
else
echo "[$TIMESTAMP] ✗ API endpoint not accessible" >> $VALIDATION_LOG
fi
# Test service commands
if systemctl is-active --quiet vpn-controller; then
echo "[$TIMESTAMP] ✓ vpn-controller service running" >> $VALIDATION_LOG
else
echo "[$TIMESTAMP] ✗ vpn-controller service not running" >> $VALIDATION_LOG
fi
```
### Change Log Maintenance
#### Automated Change Log Updates:
```
#!/bin/bash
# Update CHANGELOG.md with recent changes
CHANGELOG="/opt/vpn-exit-controller/CHANGELOG.md"
TEMP_LOG="/tmp/changelog-temp.md"
# Get recent commits (if using git)
cd /opt/vpn-exit-controller || exit 1
if [ -d ".git" ]; then
echo "## $(date +%Y-%m-%d) - Maintenance Update" > $TEMP_LOG
echo "" >> $TEMP_LOG
# Add git log entries
git log --since="1 month ago" --pretty=format:"- %s (%an)" >> $TEMP_LOG
echo "" >> $TEMP_LOG
echo "" >> $TEMP_LOG
# Prepend to existing changelog
if [ -f "$CHANGELOG" ]; then
cat "$CHANGELOG" >> $TEMP_LOG
mv "$TEMP_LOG" "$CHANGELOG"
else
mv "$TEMP_LOG" "$CHANGELOG"
fi
fi
```
### Configuration Documentation
#### Auto-generate Configuration Documentation:
```
#!/bin/bash
# Generate current configuration documentation
CONFIG_DOC="/opt/vpn-exit-controller/CONFIG_CURRENT.md"
echo "# Current System Configuration" > $CONFIG_DOC
echo "Generated on: $(date)" >> $CONFIG_DOC
echo "" >> $CONFIG_DOC
# System information
echo "## System Information" >> $CONFIG_DOC
echo "- **OS**: $(lsb_release -d | cut -f2)" >> $CONFIG_DOC
echo "- **Kernel**: $(uname -r)" >> $CONFIG_DOC
echo "- **Docker Version**: $(docker --version)" >> $CONFIG_DOC
echo "- **Python Version**: $(python3 --version)" >> $CONFIG_DOC
echo "" >> $CONFIG_DOC
# Service status
echo "## Service Status" >> $CONFIG_DOC
echo "\`\`\`" >> $CONFIG_DOC
systemctl status vpn-controller --no-pager >> $CONFIG_DOC
echo "\`\`\`" >> $CONFIG_DOC
echo "" >> $CONFIG_DOC
# Docker containers
echo "## Running Containers" >> $CONFIG_DOC
echo "\`\`\`" >> $CONFIG_DOC
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" >> $CONFIG_DOC
echo "\`\`\`" >> $CONFIG_DOC
echo "" >> $CONFIG_DOC
# Network configuration
echo "## Network Configuration" >> $CONFIG_DOC
echo "\`\`\`" >> $CONFIG_DOC
ip addr show | grep -E "(inet|link)" >> $CONFIG_DOC
echo "\`\`\`" >> $CONFIG_DOC
```
---
## 10. Emergency Procedures
### System Outage Response
#### Emergency Response Playbook:
```
#!/bin/bash
# Emergency response script for system outages
EMERGENCY_LOG="/var/log/emergency-response.log"
TIMESTAMP=$(date)
echo "[$TIMESTAMP] EMERGENCY: System outage detected" >> $EMERGENCY_LOG
# 1. Immediate Assessment
echo "[$TIMESTAMP] Step 1: Immediate assessment" >> $EMERGENCY_LOG
# Check if container is running
if pct status 201 | grep -q "status: running"; then
echo "[$TIMESTAMP] ✓ LXC container is running" >> $EMERGENCY_LOG
else
echo "[$TIMESTAMP] ✗ LXC container is stopped - starting now" >> $EMERGENCY_LOG
pct start 201
sleep 30
fi
# Check core services
SERVICES=("docker" "vpn-controller")
for service in "${SERVICES[@]}"; do
if systemctl is-active --quiet $service; then
echo "[$TIMESTAMP] ✓ $service is running" >> $EMERGENCY_LOG
else
echo "[$TIMESTAMP] ✗ $service is stopped - restarting" >> $EMERGENCY_LOG
systemctl restart $service
fi
done
# 2. Quick Recovery Attempt
echo "[$TIMESTAMP] Step 2: Quick recovery attempt" >> $EMERGENCY_LOG
# Restart VPN controller
systemctl restart vpn-controller
sleep 60
# Test API
if curl -s -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status > /dev/null; then
echo "[$TIMESTAMP] ✓ API is responding - emergency recovery successful" >> $EMERGENCY_LOG
exit 0
fi
# 3. Docker Recovery
echo "[$TIMESTAMP] Step 3: Docker container recovery" >> $EMERGENCY_LOG
cd /opt/vpn-exit-controller
docker-compose down
docker-compose up -d
sleep 120
# Test again
if curl -s -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status > /dev/null; then
echo "[$TIMESTAMP] ✓ API responding after Docker restart" >> $EMERGENCY_LOG
exit 0
fi
# 4. Full System Recovery
echo "[$TIMESTAMP] Step 4: Full system recovery required" >> $EMERGENCY_LOG
echo "[$TIMESTAMP] Escalating to manual intervention" >> $EMERGENCY_LOG
# Generate diagnostic report
/opt/vpn-exit-controller/scripts/generate-diagnostic-report.sh >> $EMERGENCY_LOG
echo "[$TIMESTAMP] Emergency procedures completed - manual intervention required" >> $EMERGENCY_LOG
```
### Emergency Contact Procedures
#### Contact List and Escalation:
```
## Emergency Contact Procedures
### Severity Levels
#### P1 - Critical (Response Time: 15 minutes)
- Complete system outage
- Security breach
- Data loss
#### P2 - High (Response Time: 2 hours)
- Partial system outage
- Performance degradation > 50%
- Single node failures affecting service
#### P3 - Medium (Response Time: 24 hours)
- Minor performance issues
- Single container failures
- Non-critical errors
#### P4 - Low (Response Time: 72 hours)
- Cosmetic issues
- Documentation updates
- Feature requests
### Contact Information
**Primary On-Call Engineer**
- Name: [Your Name]
- Phone: [Phone Number]
- Email: [Email Address]
- Signal/WhatsApp: [Number]
**Secondary Engineer**
- Name: [Backup Name]
- Phone: [Phone Number]
- Email: [Email Address]
**Infrastructure Team**
- Email: infrastructure@yourcompany.com
- Slack: #infrastructure-alerts
### Escalation Matrix
1. **0-15 minutes**: Primary engineer response
2. **15-30 minutes**: Escalate to secondary engineer
3. **30-60 minutes**: Escalate to infrastructure team lead
4. **1+ hours**: Escalate to engineering manager
```
### Critical Issue Escalation
#### Automated Escalation Script:
```
#!/bin/bash
# Automated escalation for critical issues
ISSUE_TYPE="$1"
SEVERITY="$2"
DETAILS="$3"
ESCALATION_LOG="/var/log/escalation.log"
TIMESTAMP=$(date)
echo "[$TIMESTAMP] ESCALATION: $ISSUE_TYPE (Severity: $SEVERITY)" >> $ESCALATION_LOG
echo "[$TIMESTAMP] Details: $DETAILS" >> $ESCALATION_LOG
# Generate system snapshot
SNAPSHOT_FILE="/tmp/system-snapshot-$(date +%Y%m%d-%H%M%S).txt"
cat > $SNAPSHOT_FILE << EOF
EMERGENCY SYSTEM SNAPSHOT
========================
Time: $(date)
Issue: $ISSUE_TYPE
Severity: $SEVERITY
Details: $DETAILS
System Status:
$(systemctl status vpn-controller --no-pager)
Docker Status:
$(docker ps -a)
Resource Usage:
$(free -h)
$(df -h)
Recent Logs:
$(journalctl -u vpn-controller --since "1 hour ago" --no-pager | tail -50)
Network Status:
$(ss -tuln)
EOF
# Send alerts based on severity
case $SEVERITY in
"P1"|"CRITICAL")
# Immediate alerts
echo "[$TIMESTAMP] Sending P1 alerts" >> $ESCALATION_LOG
# curl -X POST webhook-url or send email
;;
"P2"|"HIGH")
echo "[$TIMESTAMP] Sending P2 alerts" >> $ESCALATION_LOG
# Send high priority alerts
;;
*)
echo "[$TIMESTAMP] Logging issue for review" >> $ESCALATION_LOG
;;
esac
echo "[$TIMESTAMP] Escalation process initiated" >> $ESCALATION_LOG
```
### Recovery Time Objectives
#### RTO/RPO Definitions:
```
## Recovery Time and Point Objectives
### Recovery Time Objectives (RTO)
| Component | Target RTO | Maximum Acceptable |
|-----------|------------|-------------------|
| VPN API Service | 5 minutes | 15 minutes |
| Individual VPN Nodes | 2 minutes | 5 minutes |
| Database (Redis) | 3 minutes | 10 minutes |
| Full System | 15 minutes | 30 minutes |
| Container Infrastructure | 10 minutes | 20 minutes |
### Recovery Point Objectives (RPO)
| Data Type | Target RPO | Backup Frequency |
|-----------|------------|------------------|
| Configuration Files | 0 minutes | Real-time sync |
| System Metrics | 5 minutes | Continuous |
| Connection Logs | 15 minutes | Every 15 minutes |
| Performance Data | 1 hour | Hourly snapshots |
### Service Level Objectives (SLO)
- **Availability**: 99.9% (8.76 hours downtime/year)
- **API Response Time**: < 500ms (95th percentile)
- **VPN Connection Success Rate**: > 99%
- **Data Recovery Success Rate**: 100%
### Disaster Recovery Scenarios
#### Scenario 1: Complete LXC Container Failure
- **RTO**: 30 minutes
- **Procedure**: Restore from latest backup to new container
- **Validation**: Full service test suite
#### Scenario 2: Docker Infrastructure Failure
- **RTO**: 15 minutes
- **Procedure**: Restart Docker daemon, rebuild containers
- **Validation**: Container health checks
#### Scenario 3: Database Corruption
- **RTO**: 10 minutes
- **Procedure**: Restore Redis from latest backup
- **Validation**: Data integrity verification
#### Scenario 4: Configuration Corruption
- **RTO**: 5 minutes
- **Procedure**: Restore from configuration backup
- **Validation**: Service functionality test
```
---
## Maintenance Schedule Summary
### Daily (Automated)
- ✅ Health checks (6:00 AM)
- ✅ Resource monitoring (Every 5 minutes)
- ✅ Log rotation checks
- ✅ Basic performance metrics
### Weekly
- ✅ Performance review (Sunday 2:00 AM)
- ✅ Security updates (Sunday 3:00 AM)
- ✅ Configuration backup verification
- ✅ VPN node performance analysis
### Monthly
- ✅ Full system updates (First Sunday 3:00 AM)
- ✅ Access review
- ✅ Documentation validation
- ✅ Capacity planning review
- ✅ Recovery testing
### Quarterly
- ✅ Comprehensive security audit
- ✅ Credential rotation
- ✅ Disaster recovery testing
- ✅ Performance optimization review
- ✅ Documentation major updates
---
## Quick Reference Commands
### Emergency Commands
```
# Emergency service restart
systemctl restart vpn-controller
# Check service status
systemctl status vpn-controller
# View real-time logs
journalctl -u vpn-controller -f
# Test API health
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status
# Docker container status
docker ps -a
# Resource usage
htop
df -h
free -h
```
### Maintenance Commands
```
# Activate Python environment
source /opt/vpn-exit-controller/venv/bin/activate
# Update system packages
apt update && apt upgrade -y
# Restart all containers
docker-compose -f /opt/vpn-exit-controller/docker-compose.yml restart
# Create configuration backup
tar -czf backup-$(date +%Y%m%d).tar.gz /opt/vpn-exit-controller/configs/
# Check certificate expiration
openssl x509 -in cert.pem -noout -dates
```
---
**Document Version**: 1.0
**Last Updated**: $(date)
**Next Review**: $(date -d "+1 month")
---
*This maintenance guide should be reviewed and updated monthly to ensure accuracy and completeness. All procedures should be tested in a development environment before applying to production.*
---
## Operations > Monitoring
This guide covers monitoring the VPN Exit Controller system for health, performance, and security.
## Overview
Effective monitoring is crucial for maintaining a reliable VPN exit node service. This guide covers the tools and procedures for monitoring all aspects of the system.
## System Monitoring
### Service Health
Monitor core services:
```
# Check VPN controller service
systemctl status vpn-controller
# Check documentation webhook
systemctl status docs-webhook
# Check Docker services
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.RunningFor}}"
```
### Container Health
Monitor VPN exit node containers:
```
# Check all VPN containers
docker ps | grep vpn-exit
# Check specific country node
docker inspect vpn-exit-us --format='{{.State.Health.Status}}'
# View container resource usage
docker stats --no-stream
```
## Performance Monitoring
### API Performance
```
# Check API response time
time curl -s https://exit.idlegaming.org/api/health
# Monitor API logs
journalctl -u vpn-controller -f | grep -E "(ERROR|WARNING|response_time)"
```
### Proxy Performance
```
# Test proxy response time
time curl -x http://admin:password@proxy-us.exit.idlegaming.org:3128 http://httpbin.org/ip
# Check HAProxy statistics
docker exec haproxy-default sh -c 'echo "show stat" | socat stdio /var/run/haproxy/admin.sock'
```
## Log Monitoring
### Key Log Locations
- **API Logs**: `journalctl -u vpn-controller -f`
- **Webhook Logs**: `/opt/vpn-exit-controller/logs/webhook.log`
- **Docker Logs**: `docker logs [container-name]`
- **Rebuild Logs**: `/opt/vpn-exit-controller/logs/docs-rebuild.log`
### Log Analysis
```
# Check for errors in last hour
journalctl -u vpn-controller --since "1 hour ago" | grep ERROR
# Monitor real-time errors
tail -f /var/log/syslog | grep -E "(error|fail|critical)"
# Check container restart frequency
docker ps -a --filter "name=vpn-" --format "table {{.Names}}\t{{.Status}}"
```
## Metrics Collection
### System Metrics
```
# CPU and Memory usage
top -b -n 1 | head -20
# Disk usage
df -h | grep -E "(/$|/opt|/var)"
# Network connections
ss -tunap | grep -E "(8080|3128|1080)"
```
### Container Metrics
```
# Container resource usage
docker stats --no-stream --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"
# Container network stats
docker exec vpn-exit-us cat /proc/net/dev
```
## Alerting
### Basic Health Checks
Create a simple monitoring script:
```
#!/bin/bash
# /opt/vpn-exit-controller/scripts/health-check.sh
# Check API
if ! curl -sf https://exit.idlegaming.org/api/health > /dev/null; then
echo "ALERT: API is down"
fi
# Check containers
if [ $(docker ps | grep -c vpn-exit) -lt 10 ]; then
echo "ALERT: Some VPN containers are down"
fi
# Check disk space
if [ $(df -h / | awk 'NR==2 {print $5}' | sed 's/%//') -gt 80 ]; then
echo "ALERT: Disk usage above 80%"
fi
```
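The script above only prints alerts; to actually deliver them you could, for example, mail any non-empty output. This assumes a working local MTA (such as the one pulled in by `mailutils`) and the recipient address is a placeholder:
```
#!/bin/bash
# Wrap health-check.sh and mail any alerts it produces (recipient is a placeholder)
ALERTS=$(/opt/vpn-exit-controller/scripts/health-check.sh)
if [ -n "$ALERTS" ]; then
echo "$ALERTS" | mail -s "VPN Exit Controller alert" ops@example.com
fi
```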
### Automated Monitoring
Set up a cron job for regular checks:
```
# Add to crontab
*/5 * * * * /opt/vpn-exit-controller/scripts/health-check.sh >> /opt/vpn-exit-controller/logs/health-check.log 2>&1
```
## Dashboard Monitoring
Access monitoring dashboards:
- **VPN Dashboard**: https://exit.idlegaming.org/
- **API Metrics**: https://exit.idlegaming.org/api/metrics/summary
- **Node Status**: https://exit.idlegaming.org/api/nodes
## Troubleshooting Common Issues
### High CPU Usage
1. Check container stats: `docker stats`
2. Identify problematic container
3. Restart if necessary: `docker restart [container]`
### Memory Leaks
1. Monitor memory over time: `free -m -s 5`
2. Check for growing processes: `ps aux | sort -nrk 4 | head`
3. Restart services if needed
### Network Issues
1. Check connectivity: `ping -c 4 8.8.8.8`
2. Verify DNS: `nslookup exit.idlegaming.org`
3. Check firewall: `ufw status`
## Best Practices
1. **Regular Reviews**: Check logs daily
2. **Automate Alerts**: Set up automated monitoring
3. **Document Issues**: Keep a log of problems and solutions
4. **Capacity Planning**: Monitor trends for growth
5. **Security Monitoring**: Watch for unusual activity
---
!!! tip "Monitoring Tools"
Consider implementing additional monitoring tools like Prometheus, Grafana, or Netdata for more comprehensive monitoring capabilities.
---
## Operations > Scaling
### Scaling Guide
This guide covers strategies for scaling the VPN Exit Controller system to handle increased load and geographic expansion.
## Overview
The VPN Exit Controller is designed to scale both vertically (more resources) and horizontally (more nodes). This guide covers both approaches and when to use each.
## Vertical Scaling
### Resource Upgrades
Increase resources for the existing LXC container:
```
# On Proxmox host
# Increase CPU cores
pct set 201 --cores 8
# Increase memory
pct set 201 --memory 16384
# Increase disk
pct resize 201 rootfs +20G
```
### Container Limits
Adjust Docker resource limits:
```
# In docker-compose.yml
services:
vpn-exit-us:
deploy:
resources:
limits:
cpus: '2.0'
memory: 2G
reservations:
cpus: '1.0'
memory: 1G
```
## Horizontal Scaling
### Adding More VPN Nodes
1. **Add new country nodes**:
```
# Use the API to start new nodes
curl -X POST -u admin:password \
https://exit.idlegaming.org/api/nodes/br/start
```
2. **Configure load balancing**:
```
# Add to HAProxy configuration
server vpn-br-1 100.73.33.20:3128 check
server vpn-br-2 100.73.33.21:3128 check
```
### Multi-Region Deployment
Deploy additional VPN controllers in different regions:
1. **Create new LXC container**
2. **Install VPN controller**
3. **Configure geo-routing**
4. **Update DNS for regional access**
## Auto-Scaling
### Connection-Based Scaling
Monitor and scale based on connections:
```
# Auto-scaling logic
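# Sketch only: get_connection_stats(), start_additional_node(), stop_extra_node()
# and the SCALE_UP/SCALE_DOWN thresholds are placeholders for your own implementation.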
async def check_scaling_needed():
stats = await get_connection_stats()
for country, connections in stats.items():
if connections > SCALE_UP_THRESHOLD:
await start_additional_node(country)
elif connections < SCALE_DOWN_THRESHOLD:
await stop_extra_node(country)
```
### Performance-Based Scaling
Scale based on response times:
```
# Monitor response times
for country in us uk de jp; do
response_time=$(curl -w "%{time_total}" -o /dev/null -s \
-x http://proxy-$country.exit.idlegaming.org:3128 \
http://httpbin.org/ip)
if (( $(echo "$response_time > 2.0" | bc -l) )); then
echo "Scaling up $country - slow response"
fi
done
```
## Load Distribution
### Geographic Distribution
Distribute load across regions:
```
# Nginx geo-based routing
geo $proxy_pool {
default us;
# Europe
2.0.0.0/8 eu;
5.0.0.0/8 eu;
# Asia
1.0.0.0/8 asia;
14.0.0.0/8 asia;
# Americas
8.0.0.0/8 us;
24.0.0.0/8 us;
}
```
### Load Balancing Strategies
Configure different strategies:
```
# Set load balancing strategy
curl -X POST -u admin:password \
https://exit.idlegaming.org/api/lb/strategy/health_score
```
Available strategies:
- `round_robin` - Equal distribution
- `least_connections` - Fewest active connections
- `health_score` - Best performing nodes
- `random` - Random selection
- `ip_hash` - Consistent hashing
## Capacity Planning
### Monitoring Metrics
Track key metrics for capacity planning:
```
# Connection count per node
docker exec haproxy-default sh -c 'echo "show stat" | socat stdio /var/run/haproxy/admin.sock' | \
awk -F',' '{print $2, $5}'
# Bandwidth usage
vnstat -i eth0 -h
# Resource utilization
docker stats --no-stream --format json | \
jq '.CPUPerc, .MemUsage'
```
### Growth Projections
Plan for growth:
1. **Track daily peaks**: Monitor busiest hours (see the sketch after this list)
2. **Calculate growth rate**: Month-over-month increase
3. **Plan ahead**: Scale before hitting limits
4. **Budget resources**: CPU, memory, bandwidth
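As a sketch of the first two points, the command below derives the daily peak connection count from the 5-minute samples written by the resource monitor earlier in this guide (connections are the sixth CSV field); comparing these peaks month over month gives a simple growth rate to plan against:
```
# Daily peak connections from the resource monitor log (field 1 = timestamp, field 6 = connections)
awk -F',' '{ split($1, d, " "); if ($6 > peak[d[1]]) peak[d[1]] = $6 } END { for (day in peak) print day, peak[day] }' \
/var/log/resource-monitor.log | sort
```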
## Optimization
### Container Optimization
Optimize container performance:
```
# Optimized Dockerfile
FROM alpine:latest
RUN apk add --no-cache \
openvpn \
squid \
dante-server
# Reduce layers: chain configuration, cleanup, and optimization into one RUN, e.g.:
# RUN ./configure.sh && rm -rf /tmp/* /var/cache/apk/*
```
### Network Optimization
Improve network performance:
```
# Increase network buffers
sysctl -w net.core.rmem_max=134217728
sysctl -w net.core.wmem_max=134217728
sysctl -w net.ipv4.tcp_rmem="4096 87380 134217728"
sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"
```
## Scaling Checklist
Before scaling:
- [ ] Monitor current utilization
- [ ] Identify bottlenecks
- [ ] Plan scaling approach
- [ ] Test in staging environment
- [ ] Prepare rollback plan
- [ ] Schedule during low traffic
- [ ] Monitor after scaling
- [ ] Document changes
## Best Practices
1. **Scale gradually**: Add resources incrementally
2. **Monitor impact**: Watch metrics after changes
3. **Automate where possible**: Use scripts for common tasks
4. **Plan for failures**: Have redundancy
5. **Document everything**: Keep scaling playbooks
## Cost Optimization
### Right-Sizing
Ensure efficient resource usage:
```
# Analyze container usage over time
for container in $(docker ps --format "{{.Names}}" | grep vpn-); do
echo "=== $container ==="
docker stats $container --no-stream
done
```
### Idle Resource Management
Stop unused resources:
```
# Stop idle nodes during off-peak
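# "low_traffic_countries" is a placeholder for your own selection logic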
for country in $(low_traffic_countries); do
docker stop vpn-exit-$country-backup
done
```
---
!!! warning "Scaling Considerations"
Always test scaling changes in a staging environment first. Monitor closely after any scaling operations to ensure system stability.
---
## Operations > Troubleshooting
### VPN Exit Controller Troubleshooting Guide
This guide provides step-by-step troubleshooting procedures for common issues with the VPN Exit Controller system. Use this as your primary reference when diagnosing problems.
## Quick Reference
### Essential Commands
```
# System status
systemctl status vpn-controller
systemctl status docker
docker ps -a
# View logs
journalctl -u vpn-controller -f
docker logs vpn-api
docker logs vpn-redis
# API health check
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status
# Container access
pct enter 201 # From Proxmox host
```
### Log Locations
- **System Service**: `journalctl -u vpn-controller`
- **API Logs**: `docker logs vpn-api`
- **Redis Logs**: `docker logs vpn-redis`
- **Traefik Logs**: `/opt/vpn-exit-controller/traefik/logs/traefik.log`
- **HAProxy Logs**: `/opt/vpn-exit-controller/proxy/logs/`
- **VPN Node Logs**: `docker logs <node-container-name>`
- **OpenVPN Logs**: Inside containers at `/var/log/openvpn.log`
---
## 1. Node Issues
### 1.1 VPN Nodes Failing to Start
**Symptoms:**
- Containers exit immediately after starting
- API shows nodes as "failed" or "stopped"
- Cannot connect to VPN endpoints
**Common Error Messages:**
```
ERROR: Auth file not found at /configs/auth.txt
VPN connection timeout after 30 seconds
Cannot allocate TUN/TAP dev dynamically
```
**Diagnostic Steps:**
1. **Check container status:**
```
docker ps -a | grep vpn-node
docker logs <container-name>
```
2. **Verify auth file exists:**
```
ls -la /opt/vpn-exit-controller/configs/auth.txt
cat /opt/vpn-exit-controller/configs/auth.txt
```
3. **Check VPN configuration files:**
```
ls -la /opt/vpn-exit-controller/configs/vpn/
# Verify country-specific configs exist
ls -la /opt/vpn-exit-controller/configs/vpn/us/
```
4. **Test OpenVPN configuration manually:**
```
# Inside a test container
docker run -it --rm --cap-add=NET_ADMIN --device=/dev/net/tun \
-v /opt/vpn-exit-controller/configs:/configs \
ubuntu:22.04 bash
# Then install and test OpenVPN
apt update && apt install -y openvpn
openvpn --config /configs/vpn/us.ovpn --auth-user-pass /configs/auth.txt --verb 3
```
**Resolution Steps:**
1. **Fix auth file issues:**
```
# Ensure auth file exists with correct format
echo "your_nordvpn_username" > /opt/vpn-exit-controller/configs/auth.txt
echo "your_nordvpn_password" >> /opt/vpn-exit-controller/configs/auth.txt
chmod 600 /opt/vpn-exit-controller/configs/auth.txt
```
2. **Fix container permissions:**
```
# Ensure Docker has proper capabilities
docker run --cap-add=NET_ADMIN --device=/dev/net/tun ...
```
3. **Update NordVPN configs if outdated:**
```
cd /opt/vpn-exit-controller
./scripts/download-nordvpn-configs.sh
```
### 1.2 NordVPN Authentication Problems
**Symptoms:**
- Authentication failures in OpenVPN logs
- "AUTH_FAILED" messages
- Containers restart in loops
**Diagnostic Steps:**
1. **Verify credentials:**
```
cat /opt/vpn-exit-controller/configs/auth.txt
# Should contain username on line 1, password on line 2
```
2. **Test credentials manually:**
```
# Try logging into NordVPN website with same credentials
```
3. **Check for service token vs regular credentials:**
```
# NordVPN may require service credentials for OpenVPN
# Check if using regular login vs service credentials
```
**Resolution Steps:**
1. **Use NordVPN service credentials:**
- Generate service credentials from NordVPN dashboard
- Update auth.txt with service credentials (not regular login)
2. **Reset authentication:**
```
# Clear any cached auth
rm -f /opt/vpn-exit-controller/configs/auth.txt
# Recreate with correct service credentials
echo "service_username" > /opt/vpn-exit-controller/configs/auth.txt
echo "service_password" >> /opt/vpn-exit-controller/configs/auth.txt
chmod 600 /opt/vpn-exit-controller/configs/auth.txt
```
### 1.3 Tailscale Connection Issues
**Symptoms:**
- Nodes start but don't appear in Tailscale admin
- "tailscale up" command fails
- Exit node not advertised properly
**Diagnostic Steps:**
1. **Check Tailscale auth key:**
```
# Verify auth key is set
docker exec <container-name> env | grep TAILSCALE_AUTHKEY
```
2. **Check Tailscale daemon:**
```
docker exec <container-name> tailscale status
docker exec <container-name> tailscale ping 100.73.33.11
```
3. **Verify container networking:**
```
docker exec <container-name> ip addr show
docker exec <container-name> ip route
```
**Resolution Steps:**
1. **Generate new auth key:**
- Go to Tailscale admin console
- Generate new auth key with exit node permissions
- Update .env file with new key
2. **Restart Tailscale in container:**
```
docker exec <container-name> tailscale down
docker exec <container-name> tailscale up --authkey=<auth-key> --advertise-exit-node
```
3. **Check firewall rules:**
```
# Ensure UDP port 41641 is accessible
iptables -L | grep 41641
```
### 1.4 Container Restart Loops
**Symptoms:**
- Containers continuously restart
- High CPU usage
- "Restarting" status in docker ps
**Diagnostic Steps:**
1. **Check restart policy:**
```
docker inspect <container-name> | grep -A 5 RestartPolicy
```
2. **Monitor restart events:**
```
docker events --filter container=<container-name>
```
3. **Check exit codes:**
```
docker logs <container-name> --tail 50
```
**Resolution Steps:**
1. **Identify root cause:**
- VPN connection failures
- Tailscale authentication issues
- Network connectivity problems
2. **Temporary fix - stop restart:**
```
docker update --restart=no <container-name>
docker stop <container-name>
```
3. **Fix underlying issue and restart:**
```
docker update --restart=unless-stopped <container-name>
docker start <container-name>
```
---
## 2. Network Issues
### 2.1 Proxy Connection Failures
**Symptoms:**
- HTTP 502/503 errors from proxy endpoints
- Timeouts when connecting through proxy
- "upstream connect error" messages
**Diagnostic Steps:**
1. **Check HAProxy status:**
```
# Access HAProxy stats
curl http://localhost:8404/
```
2. **Test direct container connectivity:**
```
# Find proxy container ports
docker ps | grep proxy
# Test direct connection
curl -x localhost:3128 http://httpbin.org/ip
```
3. **Check backend health:**
```
# In HAProxy stats, look for backend server status
# Red = down, Green = up
```
**Resolution Steps:**
1. **Restart proxy containers:**
```
cd /opt/vpn-exit-controller/proxy
docker-compose restart
```
2. **Check proxy configuration:**
```
# Verify HAProxy config syntax
docker exec haproxy-container haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg
```
3. **Update backend servers:**
```
# Use API to refresh backend configurations
curl -X POST -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/proxy/refresh
```
### 2.2 DNS Resolution Problems
**Symptoms:**
- Domain names not resolving in containers
- "Name resolution failed" errors
- Proxy working with IPs but not domains
- **"Doesn't support secure connection" errors in incognito mode**
- HTTPS websites failing through proxy
**Diagnostic Steps:**
1. **Test DNS in containers:**
```
docker exec <container-name> nslookup google.com
docker exec <container-name> dig @8.8.8.8 google.com
docker exec <container-name> dig @103.86.96.100 google.com # NordVPN DNS
```
2. **Check container DNS settings:**
```
docker exec <container-name> cat /etc/resolv.conf
# Should show NordVPN DNS servers first, then fallbacks
```
3. **Test Tailscale DNS configuration:**
```
docker exec <container-name> tailscale status --json | grep -i dns
# Should show accept-dns=false
```
4. **Test host DNS:**
```
nslookup google.com
dig google.com
```
**Resolution Steps:**
1. **Fix VPN container DNS (Primary Fix for HTTPS errors):**
```
# Ensure containers use NordVPN DNS with fallbacks
# This is handled automatically in entrypoint.sh:
echo "nameserver 103.86.96.100" > /etc/resolv.conf
echo "nameserver 103.86.99.100" >> /etc/resolv.conf
echo "nameserver 8.8.8.8" >> /etc/resolv.conf
echo "nameserver 1.1.1.1" >> /etc/resolv.conf
```
2. **Ensure Tailscale DNS is disabled:**
```
# In container, verify Tailscale is using --accept-dns=false
docker exec <container-name> ps aux | grep tailscale
# Should show --accept-dns=false in the command line
```
3. **Test DNS resolution through VPN:**
```
# Test that DNS queries go through VPN tunnel
docker exec <container-name> dig +trace google.com
# Should resolve through NordVPN DNS servers
```
4. **Legacy system DNS fix (if needed):**
```
# Edit /etc/systemd/resolved.conf and set:
#   [Resolve]
#   DNS=8.8.8.8 1.1.1.1
# Then restart the resolver:
systemctl restart systemd-resolved
```
5. **Restart networking if changes made:**
```
systemctl restart networking
systemctl restart docker
```
!!! success "DNS Resolution Fix Explained"
The recent update implements a comprehensive DNS fix that resolves HTTPS errors in incognito mode:
1. **Root Cause**: Tailscale's DNS resolution was interfering with HTTPS certificate validation
2. **Solution**: Disable Tailscale DNS (`--accept-dns=false`) and use NordVPN DNS servers
3. **Implementation**: Containers manually configure `/etc/resolv.conf` with NordVPN DNS first, Google DNS as fallback
4. **Result**: Eliminates "doesn't support secure connection" errors and improves proxy reliability
### 2.3 SSL Certificate Issues
**Symptoms:**
- HTTPS endpoints returning certificate errors
- "SSL handshake failed" messages
- Browser certificate warnings
**Diagnostic Steps:**
1. **Check Traefik certificate status:**
```
# View certificate file
ls -la /opt/vpn-exit-controller/traefik/letsencrypt/acme.json
# Check Traefik logs for ACME challenges
docker logs traefik | grep -i acme
```
2. **Test certificate validity:**
```
openssl s_client -connect proxy-us.rbnk.uk:443 -servername proxy-us.rbnk.uk
```
3. **Check DNS for ACME challenge:**
```
dig _acme-challenge.proxy-us.rbnk.uk TXT
```
**Resolution Steps:**
1. **Force certificate renewal:**
```
# Delete existing certificates
rm /opt/vpn-exit-controller/traefik/letsencrypt/acme.json
# Restart Traefik to trigger new certificate request
docker restart traefik
```
2. **Check Cloudflare API credentials:**
```
# Verify CF_API_EMAIL and CF_API_KEY are set correctly
docker exec traefik env | grep CF_
```
3. **Manual certificate debug:**
```
# Check ACME logs in Traefik
docker logs traefik | grep -i "certificate\|acme\|error"
```
### 2.4 Port Binding Conflicts
**Symptoms:**
- "Port already in use" errors
- Containers failing to start
- Services not accessible on expected ports
**Diagnostic Steps:**
1. **Check port usage:**
```
netstat -tulpn | grep :8080
lsof -i :8080
ss -tulpn | grep :8080
```
2. **Check Docker port mappings:**
```
docker ps --format "table {{.Names}}\t{{.Ports}}"
```
3. **Identify conflicting services:**
```
systemctl list-units --type=service --state=running | grep -E "(proxy|web|http)"
```
**Resolution Steps:**
1. **Kill conflicting processes:**
```
# Find and kill process using the port
sudo kill $(lsof -t -i:8080)
```
2. **Change service ports:**
```
# Update docker-compose.yml or service configuration
# Use different host-side port numbers (see the compose fragment after these steps)
```
3. **Restart in correct order:**
```
systemctl stop vpn-controller
docker-compose down
systemctl start vpn-controller
```
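If a conflicting service cannot simply be stopped, remapping the controller's host-side port is usually the safer option. The fragment below is a sketch; the service name and the alternate port `18080` are illustrative.
```
# Illustrative docker-compose.yml fragment: remap the host-side port if 8080 is taken
services:
  api:
    ports:
      - "18080:8080"   # host port 18080 -> container port 8080
```
Remember to update any firewall rules, Traefik routes, and health checks that still reference the old port.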
---
## 3. Service Issues
### 3.1 FastAPI Application Not Starting
**Symptoms:**
- systemctl shows failed status
- "Connection refused" on port 8080
- Python import errors in logs
**Diagnostic Steps:**
1. **Check systemd service status:**
```
systemctl status vpn-controller
journalctl -u vpn-controller -n 50
```
2. **Test manual startup:**
```
cd /opt/vpn-exit-controller
source venv/bin/activate
export $(grep -v '^#' .env | xargs)
cd api
python -m uvicorn main:app --host 0.0.0.0 --port 8080
```
3. **Check Python environment:**
```
/opt/vpn-exit-controller/venv/bin/python --version
/opt/vpn-exit-controller/venv/bin/pip list
```
**Resolution Steps:**
1. **Fix Python dependencies:**
```
cd /opt/vpn-exit-controller
source venv/bin/activate
pip install -r api/requirements.txt
```
2. **Check environment variables:**
```
# Verify .env file exists and has required variables
cat /opt/vpn-exit-controller/.env
# Required: SECRET_KEY, ADMIN_USER, ADMIN_PASS, TAILSCALE_AUTHKEY
```
3. **Fix import errors:**
```
# Ensure all Python modules are properly installed
cd /opt/vpn-exit-controller/api
python -c "import main"
```
### 3.2 Redis Connection Problems
**Symptoms:**
- "Redis connection failed" in API logs
- Cache-related operations failing
- High latency in API responses
**Diagnostic Steps:**
1. **Check Redis container:**
```
docker ps | grep redis
docker logs vpn-redis
```
2. **Test Redis connectivity:**
```
docker exec vpn-redis redis-cli ping
redis-cli -h localhost -p 6379 ping
```
3. **Check Redis configuration:**
```
docker exec vpn-redis redis-cli CONFIG GET "*"
```
**Resolution Steps:**
1. **Restart Redis:**
```
docker restart vpn-redis
```
2. **Clear Redis data if corrupted:**
```
docker exec vpn-redis redis-cli FLUSHALL
```
3. **Check disk space:**
```
df -h
# Redis needs disk space for persistence
```
### 3.3 Docker Daemon Issues
**Symptoms:**
- "Cannot connect to Docker daemon" errors
- Docker commands hanging
- Containers not starting
**Diagnostic Steps:**
1. **Check Docker service:**
```
systemctl status docker
journalctl -u docker -n 50
```
2. **Test Docker functionality:**
```
docker version
docker info
docker run hello-world
```
3. **Check Docker socket:**
```
ls -la /var/run/docker.sock
sudo chmod 666 /var/run/docker.sock # Temporary fix
```
**Resolution Steps:**
1. **Restart Docker:**
```
systemctl restart docker
```
2. **Clean up Docker resources:**
```
docker system prune -a
docker volume prune
```
3. **Check disk space:**
```
df -h /var/lib/docker
# Docker needs sufficient disk space
```
### 3.4 Systemd Service Failures
**Symptoms:**
- Service won't start at boot
- "Failed to start" messages
- Service stops unexpectedly
**Diagnostic Steps:**
1. **Check service definition:**
```
systemctl cat vpn-controller
systemctl status vpn-controller
```
2. **View detailed logs:**
```
journalctl -u vpn-controller -f
journalctl -u vpn-controller --since "1 hour ago"
```
3. **Test service script manually:**
```
/opt/vpn-exit-controller/start.sh
```
**Resolution Steps:**
1. **Fix service dependencies:**
```
# Ensure docker.service is running
systemctl start docker
systemctl enable docker
```
2. **Update service configuration:**
```
systemctl edit vpn-controller
# Add any necessary environment variables or dependencies
```
3. **Reload and restart:**
```
systemctl daemon-reload
systemctl restart vpn-controller
systemctl enable vpn-controller
```
---
## 4. Load Balancing Issues
### 4.1 Nodes Not Being Selected Properly
**Symptoms:**
- All traffic going to one node
- Load balancer ignoring some nodes
- Uneven traffic distribution
**Diagnostic Steps:**
1. **Check load balancer status:**
```
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/load-balancer/status
```
2. **View node health:**
```
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/nodes
```
3. **Check HAProxy backend status:**
```
curl http://localhost:8404/
# Look for backend server weights and status
```
**Resolution Steps:**
1. **Reset load balancer:**
```
curl -X POST -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/load-balancer/reset
```
2. **Update backend weights:**
```
# API call to rebalance
curl -X POST -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/load-balancer/rebalance
```
3. **Restart load balancing service:**
```
docker restart haproxy
```
### 4.2 Health Check Failures
**Symptoms:**
- Nodes marked as unhealthy
- False positive health failures
- Health checks timing out
**Diagnostic Steps:**
1. **Check health check configuration:**
```
# View HAProxy health check settings
cat /opt/vpn-exit-controller/proxy/haproxy.cfg | grep -A 5 "option httpchk"
```
2. **Test health endpoints manually:**
```
# Test individual node health
curl -H "Host: proxy-us.rbnk.uk" http://container-ip:3128/health
```
3. **Check health check logs:**
```
docker logs haproxy | grep -i health
```
**Resolution Steps:**
1. **Adjust health check timeouts:**
```
# In haproxy.cfg, increase timeout values
timeout check 5s
```
2. **Fix health endpoint:**
```
# Ensure containers respond properly to health checks
docker exec <container> curl localhost:3128/health
```
3. **Restart health monitoring:**
```
docker restart haproxy
```
### 4.3 Speed Test Failures
**Symptoms:**
- Speed tests returning errors
- Incorrect speed measurements
- Speed test endpoints unreachable
**Diagnostic Steps:**
1. **Test speed test API:**
```
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/speed-test/run/us
```
2. **Check network connectivity:**
```
# Test from container
docker exec <container> curl -I http://speedtest.net
```
3. **View speed test logs:**
```
docker logs vpn-api | grep -i "speed"
```
**Resolution Steps:**
1. **Update speed test endpoints:**
```
# Modify speed test configuration to use working endpoints
```
2. **Increase timeout values:**
```
# In speed test service, allow more time for tests
```
3. **Use alternative speed test method:**
```
# Switch to a different speed-testing service (a manual check is sketched after these steps)
```
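When the built-in speed test keeps failing, a rough manual measurement through a node's HTTP proxy helps separate a broken test harness from a genuinely slow path. The sketch below is illustrative: the proxy address matches the earlier examples, and the download URL is just a publicly reachable test file.
```
# Rough manual throughput check through a node's HTTP proxy (addresses are illustrative)
curl -x http://100.86.140.98:3128 -o /dev/null -s \
  -w "total=%{time_total}s bytes=%{size_download} speed=%{speed_download} B/s\n" \
  http://speedtest.tele2.net/10MB.zip
```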
### 4.4 Failover Not Working
**Symptoms:**
- Failed nodes still receiving traffic
- No automatic failover
- Manual failover not working
**Diagnostic Steps:**
1. **Check failover configuration:**
```
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/failover/status
```
2. **Test failover trigger:**
```
# Manually trigger failover
curl -X POST -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/failover/trigger/us
```
3. **Check monitoring service:**
```
docker logs vpn-api | grep -i "failover\|monitor"
```
**Resolution Steps:**
1. **Restart monitoring service:**
```
systemctl restart vpn-controller
```
2. **Update failover thresholds:**
```
# Adjust sensitivity in configuration
```
3. **Manual failover:**
```
# Remove failed nodes from rotation
curl -X DELETE -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/nodes/{node_id}/stop
```
---
## 5. Proxy Issues
### 5.1 HAProxy Configuration Errors
**Symptoms:**
- HAProxy fails to start
- Configuration validation errors
- Syntax errors in logs
**Diagnostic Steps:**
1. **Validate configuration:**
```
docker exec haproxy haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg
```
2. **Check configuration file:**
```
cat /opt/vpn-exit-controller/proxy/haproxy.cfg
```
3. **View HAProxy logs:**
```
docker logs haproxy
```
**Resolution Steps:**
1. **Fix syntax errors:**
```
# Check for missing commas, brackets, quotes in haproxy.cfg
# Validate with: haproxy -c -f haproxy.cfg
```
2. **Restore backup configuration:**
```
cp /opt/vpn-exit-controller/proxy/haproxy.cfg.backup /opt/vpn-exit-controller/proxy/haproxy.cfg
```
3. **Restart with corrected config:**
```
docker restart haproxy
```
### 5.2 Traefik Routing Problems
**Symptoms:**
- 404 errors on proxy domains
- Requests not reaching HAProxy
- Routing rules not working
**Diagnostic Steps:**
1. **Check Traefik dashboard:**
```
# Access dashboard (if enabled)
curl http://localhost:8080/dashboard/
```
2. **View Traefik configuration:**
```
cat /opt/vpn-exit-controller/traefik/traefik.yml
```
3. **Check routing rules:**
```
docker logs traefik | grep -i "router\|rule"
```
**Resolution Steps:**
1. **Update dynamic configuration:**
```
# Check and update files in /opt/vpn-exit-controller/traefik/dynamic/
```
2. **Restart Traefik:**
```
docker restart traefik
```
3. **Verify domain DNS:**
```
dig proxy-us.rbnk.uk
# Ensure domains point to correct IP
```
### 5.3 Country Routing Not Working
**Symptoms:**
- All traffic routed to same country
- Country-specific domains not working
- Incorrect geolocation
**Diagnostic Steps:**
1. **Test country-specific endpoints:**
```
curl -H "Host: proxy-us.rbnk.uk" http://localhost:8080/
curl -H "Host: proxy-uk.rbnk.uk" http://localhost:8080/
```
2. **Check HAProxy ACL rules:**
```
grep -A 10 "acl is_.*_proxy" /opt/vpn-exit-controller/proxy/haproxy.cfg
```
3. **Test IP geolocation:**
```
curl -x proxy-us.rbnk.uk:8080 http://ipinfo.io/json
curl -x proxy-uk.rbnk.uk:8080 http://ipinfo.io/json
```
**Resolution Steps:**
1. **Update HAProxy routing rules:**
```
# Verify ACL rules and backend assignments in haproxy.cfg
```
2. **Restart proxy services:**
```
cd /opt/vpn-exit-controller/proxy
docker-compose restart
```
3. **Check node assignments:**
```
# Ensure nodes are assigned to correct countries
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/nodes
```
### 5.4 Proxy Service Failures
**Symptoms:**
- Squid or Dante proxy services not responding
- Connection refused on ports 3128 or 1080
- Health check endpoint (port 8080) not responding
- Container restart loops involving proxy services
**Diagnostic Steps:**
1. **Check proxy service status in container:**
```
# Check if proxy processes are running
docker exec <container> ps aux | grep -E "(squid|danted)"
# Check if ports are listening
docker exec <container> netstat -tuln | grep -E "(3128|1080|8080)"
# Test proxy services directly
docker exec <container> curl -I http://localhost:3128
docker exec <container> curl -I http://localhost:8080/health
```
2. **Check proxy service logs:**
```
# Squid logs
docker exec <container> tail -f /var/log/squid/cache.log
# Check container logs for proxy startup
docker logs <container> | grep -E "(squid|dante|health)"
```
3. **Test proxy connectivity from host:**
```
# Test HTTP proxy (replace with actual Tailscale IP)
curl -x http://100.86.140.98:3128 http://httpbin.org/ip
# Test SOCKS5 proxy
curl --socks5 100.86.140.98:1080 http://httpbin.org/ip
# Test health endpoint
curl http://100.86.140.98:8080/health
```
**Resolution Steps:**
1. **Restart proxy services in container:**
```
# Kill and restart Squid
docker exec <container> pkill squid
docker exec -d <container> squid -N -d 1
# Kill and restart Dante
docker exec <container> pkill danted
docker exec <container> danted -D
# The entrypoint.sh monitoring loop will also restart them automatically
```
2. **Restart entire container:**
```
# Use API to restart node
curl -X POST -u admin:Bl4ckMagic!2345erver \
http://localhost:8080/api/nodes/{node_id}/restart
# Or restart via Docker
docker restart <container>
```
3. **Check Squid configuration:**
```
# Verify Squid config syntax
docker exec <container> squid -k parse
# Check if Squid cache directory is initialized
docker exec <container> ls -la /var/spool/squid/
```
4. **Check Dante configuration:**
```
# Verify Dante config file exists
docker exec <container> cat /etc/danted.conf
# Check Dante listening status
docker exec <container> ss -tuln | grep 1080
```
### 5.5 Authentication Failures
**Symptoms:**
- 401/403 errors on proxy requests
- Authentication not working
- Unauthorized access attempts
**Diagnostic Steps:**
1. **Test API authentication:**
```
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/status
```
2. **Check authentication configuration:**
```
grep -i auth /opt/vpn-exit-controller/proxy/haproxy.cfg
```
3. **View authentication logs:**
```
docker logs haproxy | grep -i "auth\|401\|403"
```
**Resolution Steps:**
1. **Update credentials:**
```
# Update .env file with correct credentials
vim /opt/vpn-exit-controller/.env
```
2. **Restart services:**
```
systemctl restart vpn-controller
```
3. **Clear authentication cache:**
```
docker exec vpn-redis redis-cli FLUSHDB
```
---
## 6. Performance Issues
### 6.1 Slow Proxy Speeds
**Symptoms:**
- High latency through proxy
- Slow download/upload speeds
- Timeouts on large requests
**Diagnostic Steps:**
1. **Test direct vs proxy speed:**
```
# Direct speed test
curl -o /dev/null -s -w "%{time_total}\n" http://speedtest.net/speedtest.jpg
# Proxy speed test
curl -x proxy-us.rbnk.uk:8080 -o /dev/null -s -w "%{time_total}\n" http://speedtest.net/speedtest.jpg
```
2. **Check network utilization:**
```
iftop -i eth0
nethogs
```
3. **Monitor container resources:**
```
docker stats
```
**Resolution Steps:**
1. **Optimize HAProxy configuration:**
```
# Increase connection limits and timeouts
maxconn 4096
timeout client 50000ms
timeout server 50000ms
```
2. **Scale up resources:**
```
# Add more CPU/memory to LXC container
# Add more VPN nodes for load distribution
```
3. **Optimize VPN connections:**
```
# Use UDP instead of TCP where possible
# Select closer VPN servers
```
### 6.2 High Latency
**Symptoms:**
- Slow response times
- High ping times
- Delayed connections
**Diagnostic Steps:**
1. **Measure latency at different points:**
```
# Host to VPN server
ping nordvpn-server.com
# Through proxy
curl -x proxy-us.rbnk.uk:8080 -w "%{time_connect}\n" http://httpbin.org/get
```
2. **Check routing:**
```
traceroute -T -p 80 google.com
```
3. **Monitor network queues:**
```
ss -i | grep -E "(rto|rtt)"
```
**Resolution Steps:**
1. **Select closer VPN servers:**
```
# Use geographically closer NordVPN servers
# Update configuration to use lower-latency servers (see the API example after these steps)
```
2. **Optimize TCP settings:**
```
# Tune TCP congestion control
echo 'net.core.default_qdisc = fq' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_congestion_control = bbr' >> /etc/sysctl.conf
sysctl -p
```
3. **Reduce proxy hops:**
```
# Minimize routing through multiple proxies
# Use direct connections where possible
```
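Server selection can also be driven through the controller itself via the optional `server` query parameter on the start endpoint. The hostname below is a placeholder; substitute a geographically close NordVPN server.
```
# Start a US node pinned to a specific, lower-latency server (hostname is a placeholder)
curl -X POST -u admin:Bl4ckMagic!2345erver \
  "http://localhost:8080/api/nodes/us/start?server=<nearby-server-hostname>"
```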
### 6.3 Memory or CPU Issues
**Symptoms:**
- High CPU usage
- Out of memory errors
- System slowdowns
**Diagnostic Steps:**
1. **Check system resources:**
```
top
htop
free -m
df -h
```
2. **Monitor container resources:**
```
docker stats --no-stream
```
3. **Check for memory leaks:**
```
# Monitor memory usage over time
while true; do docker stats --no-stream; sleep 30; done
```
**Resolution Steps:**
1. **Increase LXC container resources:**
```
# From Proxmox host
pct set 201 --memory 4096 --cores 4
```
2. **Optimize container limits:**
```
# Add resource limits to docker-compose.yml
deploy:
  resources:
    limits:
      memory: 512M
    reservations:
      memory: 256M
```
3. **Clean up resources:**
```
docker system prune -a
docker volume prune
```
### 6.4 Connection Timeouts
**Symptoms:**
- Connections dropping
- Timeout errors
- Incomplete transfers
**Diagnostic Steps:**
1. **Check timeout settings:**
```
grep -i timeout /opt/vpn-exit-controller/proxy/haproxy.cfg
```
2. **Monitor connection states:**
```
ss -s
netstat -an | grep -E "(ESTABLISHED|TIME_WAIT|CLOSE_WAIT)" | wc -l
```
3. **Test with different timeout values:**
```
curl --connect-timeout 30 --max-time 300 -x proxy-us.rbnk.uk:8080 http://httpbin.org/delay/10
```
**Resolution Steps:**
1. **Increase timeout values:**
```
# In haproxy.cfg
timeout connect 10s
timeout client 60s
timeout server 60s
```
2. **Optimize connection pooling:**
```
# Adjust keep-alive settings
option http-keep-alive
timeout http-keep-alive 10s
```
3. **Scale connection limits:**
```
# Increase system limits
ulimit -n 65536
echo '* soft nofile 65536' >> /etc/security/limits.conf
echo '* hard nofile 65536' >> /etc/security/limits.conf
```
---
## 7. Monitoring Issues
### 7.1 Metrics Not Being Collected
**Symptoms:**
- Empty metrics endpoints
- No data in monitoring dashboards
- Metrics API returning errors
**Diagnostic Steps:**
1. **Check metrics endpoint:**
```
curl -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/metrics
```
2. **Verify metrics collector service:**
```
docker logs vpn-api | grep -i "metrics"
```
3. **Check Redis for metrics data:**
```
docker exec vpn-redis redis-cli KEYS "*metrics*"
```
**Resolution Steps:**
1. **Restart metrics collection:**
```
curl -X POST -u admin:Bl4ckMagic!2345erver http://localhost:8080/api/metrics/restart
```
2. **Clear corrupted metrics:**
```
docker exec vpn-redis redis-cli DEL metrics:*
```
3. **Check metrics configuration:**
```
# Verify metrics collection intervals and settings (a Redis inspection sketch follows these steps)
```
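Because metrics are persisted in Redis, inspecting the keys directly is the quickest way to confirm whether the collector is writing anything at all. The specific key names below are illustrative; only the `metrics` prefix is assumed from the earlier `KEYS` query.
```
# List what the collector has written
docker exec vpn-redis redis-cli KEYS "metrics:*"
# Check how long a metric key lives before expiring (key name is illustrative)
docker exec vpn-redis redis-cli TTL "metrics:nodes:vpn-us"
# Watch writes arrive in real time to confirm the collector is active
docker exec vpn-redis redis-cli MONITOR | grep -i metrics
```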
### 7.2 Health Checks Failing
**Symptoms:**
- All nodes showing as unhealthy
- Health check endpoints not responding
- False positive failures
**Diagnostic Steps:**
1. **Test health endpoints manually:**
```
# Test individual container health
docker exec <container> curl -I localhost:3128/health
```
2. **Check health check configuration:**
```
grep -A 3 "option httpchk" /opt/vpn-exit-controller/proxy/haproxy.cfg
```
3. **Monitor health check frequency:**
```
docker logs haproxy | grep -i "health\|check" | tail -20
```
**Resolution Steps:**
1. **Adjust health check parameters:**
```
# Increase check intervals and timeouts
inter 10s
fastinter 2s
downinter 5s
```
2. **Fix health endpoint implementation:**
```
# Ensure containers actually serve the /health endpoint (a minimal sketch follows these steps)
```
3. **Reset health check state:**
```
docker restart haproxy
```
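If the node image never serves `/health`, every check above will fail regardless of HAProxy settings. The following is a minimal, illustrative health responder, assuming `python3` is available in the node image and using port 8080 as in the proxy service examples; the real image may implement this differently.
```
# Serve a static /health file on port 8080 (illustrative sketch)
mkdir -p /srv/healthz
echo "OK" > /srv/healthz/health
cd /srv/healthz && python3 -m http.server 8080 --bind 0.0.0.0 &
# Verify: curl -I http://localhost:8080/health
```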
### 7.3 Log Analysis Techniques
**Key Log Locations and Commands:**
1. **System-wide issues:**
```
# System logs
journalctl -u vpn-controller -f
journalctl --since "1 hour ago" | grep -i error
# Docker logs
docker logs --tail 100 -f vpn-api
```
2. **Network issues:**
```
# HAProxy logs
docker logs haproxy | grep -E "(5xx|error|timeout)"
# Traefik logs
tail -f /opt/vpn-exit-controller/traefik/logs/traefik.log | grep -i error
```
3. **VPN connection issues:**
```
# OpenVPN logs in containers
docker exec <container> tail -f /var/log/openvpn.log
# Connection monitoring
docker exec <container> ip route show | grep tun0
```
4. **Performance analysis:**
```
# Response time analysis
docker logs haproxy | grep -oE '"[0-9]+/[0-9]+/[0-9]+/[0-9]+/[0-9]+"' | tail -100
# Error rate analysis
docker logs haproxy | grep -oE '" [0-9]{3} ' | sort | uniq -c
```
---
## Emergency Recovery Procedures
### Complete System Reset
If the system is completely broken, follow these steps:
1. **Stop all services:**
```
systemctl stop vpn-controller
docker-compose -f /opt/vpn-exit-controller/docker-compose.yml down
docker-compose -f /opt/vpn-exit-controller/proxy/docker-compose.yml down
```
2. **Clean up Docker:**
```
docker system prune -a
docker volume prune
```
3. **Reset configuration:**
```
cd /opt/vpn-exit-controller
git stash # Save any local changes
git reset --hard HEAD # Reset to last known good state
```
4. **Restart services:**
```
systemctl start docker
systemctl start vpn-controller
```
### Data Recovery
If data is corrupted:
1. **Backup current state:**
```
tar -czf /tmp/vpn-backup-$(date +%Y%m%d).tar.gz /opt/vpn-exit-controller/data/
```
2. **Reset Redis data:**
```
docker exec vpn-redis redis-cli FLUSHALL
```
3. **Restart with clean state:**
```
systemctl restart vpn-controller
```
### Network Recovery
If networking is broken:
1. **Reset network interfaces:**
```
systemctl restart networking
systemctl restart docker
```
2. **Flush iptables:**
```
iptables -F
iptables -X
iptables -t nat -F
iptables -t nat -X
```
3. **Restart Tailscale:**
```
tailscale down
tailscale up --authkey=<tailscale-auth-key> --advertise-exit-node
```
---
## Prevention and Maintenance
### Regular Maintenance Tasks
1. **Weekly:**
- Check disk space: `df -h`
- Review error logs: `journalctl -u vpn-controller --since "1 week ago" | grep -i error`
- Test proxy endpoints: `curl -x proxy-us.rbnk.uk:8080 http://httpbin.org/ip`
2. **Monthly:**
- Update NordVPN configurations: `./scripts/download-nordvpn-configs.sh`
- Clean Docker resources: `docker system prune`
- Review and rotate Tailscale auth keys
3. **Quarterly:**
- Update system packages: `apt update && apt upgrade`
- Review and update SSL certificates
- Performance optimization review
### Monitoring Setup
Set up automated monitoring (a simple cron-based health check is sketched after this list):
1. **Log monitoring:**
```
# Set up logrotate for container logs
# Monitor for specific error patterns
```
2. **Resource monitoring:**
```
# Set up alerts for CPU/memory usage
# Monitor disk space
```
3. **Health monitoring:**
```
# Automated health checks
# Alert on service failures
```
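As a starting point, a cron-driven script that polls `/api/health` and logs failures covers the basics. This is a sketch: the script path, interval, and `logger` alert are illustrative, and credentials are read from the existing `.env` file.
```
#!/bin/bash
# /usr/local/bin/vpn-healthcheck.sh - illustrative automated health check
set -a; source /opt/vpn-exit-controller/.env; set +a
if ! curl -fsS -u "${ADMIN_USER}:${ADMIN_PASS}" http://localhost:8080/api/health > /dev/null; then
    logger -t vpn-health "VPN Exit Controller health check failed"
fi
# Install with:
#   echo '*/5 * * * * root /usr/local/bin/vpn-healthcheck.sh' > /etc/cron.d/vpn-health
```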
This troubleshooting guide should help you diagnose and resolve most issues with the VPN Exit Controller system. Keep it updated as you encounter new issues and solutions.
---
## Quick Start
### Quick Start Guide
Get VPN Exit Controller up and running in under 10 minutes!
## Prerequisites Checklist
Before starting, ensure you have:
- [x] Ubuntu 22.04+ server with root access
- [x] Docker and Docker Compose installed
- [x] Python 3.10+ with pip
- [x] Domain with Cloudflare DNS
- [x] NordVPN service credentials
- [x] Tailscale auth key
## 1. Clone the Repository
```
git clone https://gitea.rbnk.uk/admin/vpn-controller.git
cd vpn-controller
```
## 2. Configure Environment
Copy the example environment file:
```
cp .env.example .env
```
Edit `.env` with your credentials:
```
# Essential configuration
NORDVPN_USER=your_nordvpn_service_username
NORDVPN_PASS=your_nordvpn_service_password
TAILSCALE_AUTH_KEY=your_tailscale_auth_key
CF_API_TOKEN=your_cloudflare_api_token
API_PASSWORD=your_secure_password_here
```
!!! warning "Security Note"
Never commit `.env` to version control. The file is already in `.gitignore`.
## 3. Install Dependencies
Create Python virtual environment:
```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
## 4. Start Services
### Redis (Required for metrics)
```
docker run -d --name redis \
-p 6379:6379 \
--restart always \
redis:alpine
```
### VPN Controller API
```
# Development mode
uvicorn api.main:app --reload --host 0.0.0.0 --port 8080
# Or use the systemd service (production)
sudo systemctl enable vpn-controller
sudo systemctl start vpn-controller
```
## 5. Create Your First VPN Node
Start a US exit node:
```
curl -X POST http://localhost:8080/api/nodes/start \
-u admin:your_api_password \
-H "Content-Type: application/json" \
-d '{"country": "us", "city": "New York"}'
```
Check node status:
```
curl http://localhost:8080/api/nodes \
-u admin:your_api_password
```
## 6. Set Up Proxy Access (Optional)
### Deploy HAProxy
```
cd proxy
docker-compose up -d
```
### Deploy Traefik for SSL
```
cd ../traefik
docker-compose up -d
```
### Configure DNS
```
# Set your Cloudflare API token
export CF_API_TOKEN=your_cloudflare_api_token
# Run DNS setup
./scripts/setup-proxy-dns.sh
```
## 7. Test Your Setup
### Test VPN connectivity:
```
# Check if node is connected
curl http://localhost:8080/api/health -u admin:your_api_password
```
### Test proxy URL (if configured):
```
# Test US proxy
curl -x https://proxy-us.yourdomain.com https://ipinfo.io
```
## 🎉 Success!
You now have a working VPN Exit Controller! Here's what you can do next:
### Essential Commands
=== "Node Management"
```
# List all nodes
curl http://localhost:8080/api/nodes -u admin:$API_PASSWORD
# Start a node
curl -X POST http://localhost:8080/api/nodes/start \
-u admin:$API_PASSWORD \
-H "Content-Type: application/json" \
-d '{"country": "uk"}'
# Start a specific UK node
curl -X POST http://localhost:8080/api/nodes/uk/start \
-u admin:$API_PASSWORD
# Stop a node
curl -X DELETE http://localhost:8080/api/nodes/vpn-uk \
-u admin:$API_PASSWORD
```
=== "Health & Metrics"
```
# System health
curl http://localhost:8080/api/health -u admin:$API_PASSWORD
# Node metrics
curl http://localhost:8080/api/metrics -u admin:$API_PASSWORD
# Speed test
curl -X POST http://localhost:8080/api/speed-test/vpn-us \
-u admin:$API_PASSWORD
```
=== "Load Balancing"
```
# Get best node for country
curl http://localhost:8080/api/load-balancer/best-node/us \
-u admin:$API_PASSWORD
# Get best UK node
curl http://localhost:8080/api/load-balancer/best-node/uk \
-u admin:$API_PASSWORD
# Change strategy
curl -X POST http://localhost:8080/api/load-balancer/strategy \
-u admin:$API_PASSWORD \
-H "Content-Type: application/json" \
-d '{"strategy": "health_score"}'
```
## Common Issues & Solutions
!!! question "Node won't start?"
- Check Docker is running: `docker ps`
- Verify NordVPN credentials in `.env`
- Check logs: `docker logs vpn-us`
!!! question "Can't access proxy URLs?"
- Ensure DNS records are created
- Check Traefik is running: `docker ps | grep traefik`
- Verify SSL certificates: Check Traefik logs
!!! question "API returns 401 Unauthorized?"
- Check username is `admin`
- Verify password matches `.env` setting
- Use `-u admin:password` with curl
## Next Steps
- :material-book-open-variant:{ .lg .middle } __Read the User Guide__: learn about all features and configuration options (:octicons-arrow-right-24: User Guide)
- :material-api:{ .lg .middle } __Explore the API__: full API reference with examples (:octicons-arrow-right-24: API Reference)
- :material-server:{ .lg .middle } __Deploy to Production__: production deployment best practices (:octicons-arrow-right-24: Deployment Guide)
- :material-shield-check:{ .lg .middle } __Security Hardening__: secure your deployment (:octicons-arrow-right-24: Security Guide)
---
!!! success "Congratulations!"
You've successfully deployed VPN Exit Controller. Join our community to stay updated with new features and best practices.
---
## Security > Best Practices
### Security Guide for VPN Exit Controller
This document outlines security considerations, best practices, and hardening procedures for the VPN Exit Controller system.
## Table of Contents
1. Network Security
2. Authentication and Authorization
3. Container Security
4. SSL/TLS Security
5. Data Protection
6. Operational Security
7. Compliance Considerations
8. Security Hardening
## Network Security
### Firewall Configuration
The VPN Exit Controller requires specific firewall rules for secure operation:
```
# Allow SSH (change default port)
ufw allow 2222/tcp
# Allow web management interface (behind Traefik)
ufw allow 80/tcp
ufw allow 443/tcp
# Allow Tailscale mesh network
ufw allow 41641/udp
# Allow OpenVPN traffic for exit nodes
ufw allow 1194/udp
ufw allow 443/tcp
# Block all other incoming traffic by default
ufw default deny incoming
ufw default allow outgoing
# Enable firewall
ufw enable
```
### Network Isolation
#### Proxmox LXC Security
- **Privileged Containers**: The system requires privileged LXC containers for Docker operations
- **Container Isolation**: Use `lxc.apparmor.profile: unconfined` carefully and only when necessary
- **Network Segmentation**: Place the LXC container on an isolated VLAN (vmbr1)
```
# Example LXC configuration for security
cat >> /etc/pve/lxc/201.conf << EOF
# Security settings
lxc.apparmor.profile: generated
lxc.seccomp.profile: /usr/share/lxc/config/seccomp
# Only use unconfined when Docker requires it
EOF
```
#### Docker Network Security
- **Bridge Networks**: Use custom Docker networks instead of default bridge
- **Network Policies**: Implement container-to-container communication restrictions
- **Port Exposure**: Only expose necessary ports (8080 for API, Redis port internally only)
### VPN Tunnel Security
#### OpenVPN Configuration
- **Cipher Suite**: Use AES-256-GCM or ChaCha20-Poly1305
- **Authentication**: Use SHA-256 or stronger for HMAC
- **Perfect Forward Secrecy**: Enable TLS-auth with rotating keys
- **Certificate Validation**: Implement strict certificate checking (the sketch below shows how these recommendations map to OpenVPN directives)
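These recommendations translate roughly into the following OpenVPN client directives. This is a sketch of sensible values, not the provider's exact `.ovpn`; the TLS key path and server hostname are placeholders.
```
# Illustrative client-side hardening directives
cipher AES-256-GCM                        # or data-ciphers AES-256-GCM:CHACHA20-POLY1305 on OpenVPN 2.5+
auth SHA256                               # HMAC digest
tls-version-min 1.2                       # refuse legacy TLS
tls-auth ta.key 1                         # additional HMAC on the TLS handshake
remote-cert-tls server                    # require proper server certificate key usage
verify-x509-name <server-hostname> name   # pin the expected server certificate name
```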
#### NordVPN Integration Security
- **Credential Storage**: Store NordVPN credentials securely (see Data Protection section)
- **Server Verification**: Verify NordVPN server certificates
- **Protocol Selection**: Prefer OpenVPN over IKEv2 for better auditability
### Tailscale Mesh Network Security
#### Authentication
- **Auth Keys**: Use ephemeral auth keys when possible
- **ACL Policies**: Implement strict Access Control Lists
- **Node Authorization**: Require manual node approval
```
// Example Tailscale ACL
{
  "acls": [
    {
      "action": "accept",
      "src": ["group:admins"],
      "dst": ["vpn-exit-controller:8080"]
    }
  ],
  "groups": {
    "group:admins": ["user@example.com"]
  }
}
```
## Authentication and Authorization
### HTTP Basic Authentication
The system currently uses HTTP Basic Authentication with the following security considerations:
#### Current Implementation Issues
```
# SECURITY WARNING: Default credentials are exposed
ADMIN_USER = os.getenv("ADMIN_USER", "admin")
ADMIN_PASS = os.getenv("ADMIN_PASS", "changeme")
```
#### Security Recommendations
1. **Change Default Credentials Immediately**
```
# Set secure environment variables
export ADMIN_USER="secure_admin_user"
export ADMIN_PASS="$(openssl rand -base64 32)"
```
2. **Use Strong Password Policy**
- Minimum 16 characters
- Include uppercase, lowercase, numbers, and symbols
- Avoid dictionary words
- Rotate passwords regularly (90 days)
3. **Implement Rate Limiting**
```
# Add rate limiting to the FastAPI app (slowapi)
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter  # attach to the existing FastAPI app
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/api/auth/login")
@limiter.limit("5/minute")
async def login(request: Request, ...):
    # Login logic
```
### API Security
#### JWT Token Management
- **Secret Key**: Use cryptographically secure random keys
- **Token Expiration**: Set appropriate token lifetimes (24 hours max; see the token sketch below)
- **Refresh Tokens**: Implement token refresh mechanism
```
# Generate secure secret key
openssl rand -hex 32
```
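A hedged sketch of how short-lived tokens could be issued and verified with the generated secret, using PyJWT; this is not the project's actual auth module, and the function names are illustrative.
```
# Illustrative token issue/verify with PyJWT (not the project's auth module)
from datetime import datetime, timedelta, timezone
import jwt  # PyJWT

SECRET_KEY = "paste-the-openssl-generated-key-here"

def issue_token(username: str) -> str:
    payload = {"sub": username, "exp": datetime.now(timezone.utc) + timedelta(hours=24)}
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")

def verify_token(token: str) -> dict:
    # Raises jwt.ExpiredSignatureError / jwt.InvalidTokenError on bad or expired tokens
    return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
```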
#### API Key Security
- **Rotation**: Implement regular API key rotation
- **Scoping**: Use least-privilege principle for API access
- **Monitoring**: Log all API key usage
### Service Credential Storage
#### NordVPN Credentials
Current storage in `/opt/vpn-exit-controller/configs/auth.txt` is insecure.
**Secure Implementation:**
```
# Use HashiCorp Vault or similar
vault kv put secret/nordvpn username="your_username" password="your_password"
# Or encrypt with gpg
echo "username:password" | gpg --symmetric --armor > /opt/vpn-exit-controller/configs/auth.txt.gpg
```
#### Tailscale Auth Keys
- **Ephemeral Keys**: Use time-limited auth keys
- **Key Storage**: Store in secure key management system
- **Access Logging**: Monitor auth key usage
#### Cloudflare API Tokens
- **Scoped Tokens**: Use DNS-only tokens with domain restrictions
- **Token Rotation**: Regular rotation schedule
- **Environment Variables**: Never hardcode in configuration files
## Container Security
### Docker Security Best Practices
#### Image Security
```
# Use a specific, pinned base image
FROM ubuntu:22.04@sha256:specific-hash
# Install Tailscale while still root; verify the installer out of band rather than trusting it blindly
RUN curl -fsSL https://tailscale.com/install.sh | sh
# Create non-root user
RUN useradd -r -u 1001 -m -c "vpn user" -d /home/vpn -s /bin/bash vpn
# Drop privileges for runtime
USER vpn
```
#### Runtime Security
```
# docker-compose.yml security settings
services:
  api:
    security_opt:
      - no-new-privileges:true
    read_only: true
    tmpfs:
      - /tmp
      - /var/tmp
    cap_drop:
      - ALL
    cap_add:
      - NET_ADMIN  # Only if required for VPN
```
### Container Isolation
#### Network Isolation
```
# Create isolated networks
networks:
  vpn-internal:
    driver: bridge
    internal: true
  vpn-external:
    driver: bridge
```
#### Resource Limits
```
services:
  api:
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
```
### Privileged Container Considerations
The VPN nodes require privileged access for network operations:
#### Risk Mitigation
- **Minimal Privileges**: Only grant necessary capabilities
- **Network Namespaces**: Use separate network namespaces
- **Monitoring**: Enhanced monitoring for privileged containers
```
# Minimal privileged configuration
services:
  vpn-node:
    privileged: false
    cap_add:
      - NET_ADMIN
      - NET_RAW
    devices:
      - /dev/net/tun
```
## SSL/TLS Security
### Traefik SSL Configuration
#### Let's Encrypt Integration
```
# traefik.yml - Secure ACME configuration
certificatesResolvers:
  cf:
    acme:
      email: "security@yourdomain.com"
      storage: /letsencrypt/acme.json
      dnsChallenge:
        provider: cloudflare
        delayBeforeCheck: 60
```
#### SSL Security Headers
```
# security-headers.yml - Enhanced headers
http:
  middlewares:
    security-headers:
      headers:
        customResponseHeaders:
          X-Frame-Options: "DENY"
          X-Content-Type-Options: "nosniff"
          X-XSS-Protection: "1; mode=block"
          Strict-Transport-Security: "max-age=31536000; includeSubDomains; preload"
          Content-Security-Policy: "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'"
          Referrer-Policy: "strict-origin-when-cross-origin"
          Permissions-Policy: "geolocation=(), microphone=(), camera=()"
```
### Certificate Management
#### Automatic Renewal
```
# Verify certificate renewal
docker exec traefik cat /letsencrypt/acme.json | jq '.cf.Certificates[0].certificate' | base64 -d | openssl x509 -text -noout
```
#### Certificate Monitoring
```
#!/bin/bash
# Certificate expiry monitoring script
CERT_PATH="/opt/vpn-exit-controller/traefik/letsencrypt/acme.json"
EXPIRY_DAYS=30
# Extract the first certificate and check whether it expires within EXPIRY_DAYS
jq -r '.cf.Certificates[0].certificate' "$CERT_PATH" | base64 -d | \
  openssl x509 -noout -checkend $((EXPIRY_DAYS * 86400)) \
  || echo "WARNING: certificate expires within ${EXPIRY_DAYS} days"
# Add to cron for regular monitoring
```
## Data Protection
### Logging Security
#### Log Configuration
```
# Secure logging configuration
import logging
import re
from logging.handlers import RotatingFileHandler

# Avoid logging sensitive data
class SensitiveDataFilter(logging.Filter):
    def filter(self, record):
        # Redact passwords, tokens, etc. before the record is written
        record.msg = re.sub(r'password=[^&\s]+', 'password=***', str(record.msg))
        record.msg = re.sub(r'token=[^&\s]+', 'token=***', str(record.msg))
        return True

handler = RotatingFileHandler('/var/log/vpn-controller.log', maxBytes=10485760, backupCount=5)
handler.addFilter(SensitiveDataFilter())
```
#### Log Retention Policy
- **Retention Period**: 90 days for operational logs
- **Security Logs**: 1 year minimum
- **Compliance Logs**: As required by regulations
- **Log Rotation**: Daily rotation with compression (see the logrotate sketch below)
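A logrotate rule is enough to enforce the operational-log part of this policy. The fragment below is a sketch assuming the `/var/log/vpn-controller.log` path used elsewhere in this guide; security and compliance logs would need their own, longer-lived rules.
```
# /etc/logrotate.d/vpn-controller - illustrative retention for operational logs
/var/log/vpn-controller.log {
    daily
    rotate 90
    compress
    delaycompress
    missingok
    notifempty
    create 0640 root adm
}
```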
### Sensitive Data Handling
#### Environment Variables
```
# Secure environment variable handling
cat > /opt/vpn-exit-controller/.env << EOF
# Never commit this file to version control
SECRET_KEY=$(openssl rand -hex 32)
ADMIN_USER=secure_admin
ADMIN_PASS=$(openssl rand -base64 32)
TAILSCALE_AUTHKEY=tskey-auth-***
NORDVPN_USER=***
NORDVPN_PASS=***
CLOUDFLARE_API_TOKEN=***
EOF
chmod 600 /opt/vpn-exit-controller/.env
```
#### Redis Security
```
# redis.conf security settings
# Generate the password separately (e.g. openssl rand -base64 32) and paste it here
requirepass your-generated-password-here
bind 127.0.0.1
protected-mode yes
port 0 # Disable TCP port
unixsocket /var/run/redis/redis.sock
unixsocketperm 770
```
### Backup and Recovery Security
#### Encrypted Backups
```
#!/bin/bash
# Secure backup script
BACKUP_DIR="/secure/backups"
DATE=$(date +%Y%m%d_%H%M%S)
# Create encrypted backup
tar -czf - /opt/vpn-exit-controller | gpg --symmetric --cipher-algo AES256 > "$BACKUP_DIR/vpn-controller-$DATE.tar.gz.gpg"
# Set secure permissions
chmod 600 "$BACKUP_DIR/vpn-controller-$DATE.tar.gz.gpg"
```
## Operational Security
### System Monitoring
#### Security Monitoring
```
# Install fail2ban for brute force protection
apt install fail2ban
# Configure jail for API endpoints (requires a matching filter in /etc/fail2ban/filter.d/vpn-api.conf)
cat > /etc/fail2ban/jail.d/vpn-api.conf << EOF
[vpn-api]
enabled = true
port = 8080
filter = vpn-api
logpath = /var/log/vpn-controller.log
maxretry = 5
bantime = 3600
EOF
```
#### Performance Monitoring
- **Resource Usage**: Monitor CPU, memory, disk usage
- **Network Traffic**: Monitor unusual traffic patterns
- **Container Health**: Monitor container status and resource usage
### Security Updates
#### Update Schedule
- **Security Updates**: Weekly automated security updates
- **System Updates**: Monthly maintenance windows
- **Container Images**: Monthly base image updates
```
# Automated security updates
cat > /etc/apt/apt.conf.d/50unattended-upgrades << EOF
Unattended-Upgrade::Allowed-Origins {
"\${distro_id}:\${distro_codename}-security";
};
Unattended-Upgrade::AutoFixInterruptedDpkg "true";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
EOF
```
### Access Control
#### SSH Hardening
```
# /etc/ssh/sshd_config security settings
Port 2222
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
X11Forwarding no
MaxAuthTries 3
ClientAliveInterval 300
ClientAliveCountMax 2
```
#### User Management
- **Principle of Least Privilege**: Grant minimum required permissions
- **Regular Audits**: Monthly access reviews
- **Multi-Factor Authentication**: Implement for privileged accounts
### Incident Response
#### Response Plan
1. **Detection**: Automated alerting for security events
2. **Containment**: Immediate isolation procedures
3. **Investigation**: Forensic data collection
4. **Recovery**: Secure restoration procedures
5. **Lessons Learned**: Post-incident review
#### Emergency Procedures
```
#!/bin/bash
# Emergency shutdown script
# Stop all VPN nodes
docker stop $(docker ps -q --filter "ancestor=vpn-exit-node:latest")
# Stop main services
systemctl stop vpn-controller
systemctl stop docker
# Block all traffic
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT DROP
```
## Compliance Considerations
### Privacy Regulations
#### GDPR Compliance
- **Data Minimization**: Collect only necessary data
- **Retention Limits**: Implement data retention policies
- **User Rights**: Provide data access and deletion capabilities
- **Consent Management**: Document legal basis for processing
#### Data Processing Records
```
{
  "processing_activity": "VPN Traffic Routing",
  "data_categories": ["IP addresses", "Connection timestamps", "Traffic volumes"],
  "legal_basis": "Legitimate interest",
  "retention_period": "30 days",
  "security_measures": ["Encryption", "Access controls", "Audit logging"]
}
```
### VPN Service Terms
#### NordVPN Compliance
- **Terms of Service**: Regular review of provider terms
- **Usage Monitoring**: Ensure compliance with usage limits
- **Prohibited Activities**: Monitor and prevent policy violations
### Data Residency
#### Geographic Restrictions
- **Data Location**: Understand where data is processed and stored
- **Cross-Border Transfers**: Implement appropriate safeguards
- **Jurisdiction Requirements**: Comply with local data protection laws
### Audit Trail Maintenance
#### Comprehensive Logging
```
# Audit logging implementation
from datetime import datetime

import structlog

audit_logger = structlog.get_logger("audit")

def log_admin_action(user, action, resource, result):
    # Record every privileged action with a UTC timestamp
    audit_logger.info(
        "admin_action",
        user=user,
        action=action,
        resource=resource,
        result=result,
        timestamp=datetime.utcnow().isoformat()
    )
```
## Security Hardening
### System Hardening
#### Kernel Security
```
# /etc/sysctl.d/99-security.conf
# Network security
net.ipv4.ip_forward=1 # Required for VPN routing
net.ipv4.conf.all.send_redirects=0
net.ipv4.conf.default.send_redirects=0
net.ipv4.conf.all.accept_redirects=0
net.ipv4.conf.default.accept_redirects=0
# Memory protection
kernel.dmesg_restrict=1
kernel.kptr_restrict=1
kernel.yama.ptrace_scope=1
```
#### File System Security
```
# Secure mount options
# /etc/fstab
tmpfs /tmp tmpfs defaults,nodev,nosuid,noexec 0 0
tmpfs /var/tmp tmpfs defaults,nodev,nosuid,noexec 0 0
```
### Service Configuration Security
#### Systemd Service Hardening
```
# /etc/systemd/system/vpn-controller.service
[Unit]
Description=VPN Exit Controller
After=docker.service
[Service]
Type=exec
User=vpn-controller
Group=vpn-controller
ExecStart=/opt/vpn-exit-controller/start.sh
Restart=always
RestartSec=10
# Security settings
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
ReadWritePaths=/opt/vpn-exit-controller
```
### Regular Security Assessments
#### Security Checklist
**Monthly Checks:**
- [ ] Review access logs for anomalies
- [ ] Update all container images
- [ ] Check certificate expiry dates
- [ ] Review firewall rules
- [ ] Audit user access
**Quarterly Checks:**
- [ ] Penetration testing of API endpoints
- [ ] Review and update security policies
- [ ] Audit third-party dependencies
- [ ] Review backup and recovery procedures
**Annual Checks:**
- [ ] Comprehensive security audit
- [ ] Business continuity testing
- [ ] Compliance assessment
- [ ] Security training updates
#### Automated Security Scanning
```
# Container vulnerability scanning
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy image vpn-exit-node:latest
# System vulnerability scanning
lynis audit system
```
### Vulnerability Management
#### Vulnerability Scanning
- **Automated Scans**: Daily vulnerability scans
- **Patch Management**: Prioritized patching based on severity
- **Zero-Day Response**: Emergency response procedures
#### Security Tools
```
# Install security monitoring tools
apt install -y \
aide \
rkhunter \
chkrootkit \
lynis \
fail2ban
```
## Security Incident Contacts
- **Security Team**: security@yourdomain.com
- **Emergency Contact**: +1-555-0123
- **Incident Response**: Available 24/7
## Additional Resources
- OWASP Top 10
- NIST Cybersecurity Framework
- Docker Security Best Practices
- Tailscale Security Documentation
---
**Last Updated:** 2025-08-04
**Version:** 1.0
**Review Schedule:** Quarterly
> **Important:** This security guide should be reviewed and updated regularly to address new threats and vulnerabilities. All security measures should be tested in a non-production environment before implementation.
---
## Additional Resources
- API Endpoint: https://vpn.rbnk.uk:8080/api
- Dashboard: https://vpn.rbnk.uk
- Documentation: https://vpn-docs.rbnk.uk
- Repository: https://gitea.rbnk.uk/admin/vpn-controller