Python DeepSeek R1 API
A Python port of the DeepSeek R1 API wrapper (OpenAI-compatible).
Architecture
python/
├── api/ # DeepSeek API client
│ ├── client.py # Main HTTP client
│ ├── models.py # Data models and response types
│ └── utils.py # Utilities
├── dto/ # Data transfer objects
│ ├── models.py # Request/response models
│ └── utils.py # Utilities
├── kv/ # Cache/KV storage
│ └── cache.py # Redis cache implementation
├── solver/ # WASM proof-of-work solver
│ └── instance.py # Solver wrapper
├── application/ # Flask application
│ └── app.py # Main Flask app with routes
├── main.py # Entry point
└── requirements.txt # Python dependencies
Requirements
- Python 3.8+
- Redis server (for chat session caching)
- WASM binary from the Go implementation (sha3_wasm_bg.7b9ca65ddd.wasm)
Installation
Using pip
pip install -r requirements.txt
Using Docker
docker-compose up
Configuration
Create a .env file from the template:
cp .env.example .env
Edit .env with your settings:
REDIS_ADDR=localhost:6379
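At startup the address can be picked up from the environment. A minimal sketch, assuming python-dotenv is available (otherwise exporting REDIS_ADDR in the shell works the same way):

import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the environment

redis_addr = os.getenv("REDIS_ADDR", "localhost:6379")
host, port = redis_addr.split(":")
print(f"Using Redis at {host}:{port}")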
Running
Local Python
python main.py
Docker
docker-compose up
The server will start on http://localhost:8080
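A quick way to confirm it is up is to hit the health-check route (described under API Endpoints below), which returns the plain string "started":

import requests

print(requests.get("http://localhost:8080/").text)  # expected output: started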
API Endpoints
- GET / - Health check (returns "started")
- GET /models - List available models (returns the r1 model)
- POST /chat/completions - OpenAI-compatible chat completions
API Usage
Non-streaming Request
curl -X POST http://localhost:8080/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "r1",
"messages": [
{"role": "user", "content": "Hello!"}
],
"stream": false,
"thinking_enabled": false,
"search_enabled": false
}'
Streaming Request
curl -X POST http://localhost:8080/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "r1",
"messages": [
{"role": "user", "content": "Hello!"}
],
"stream": true,
"thinking_enabled": false,
"search_enabled": false
}'
Python Example
import requests
headers = {
"Authorization": "Bearer YOUR_API_KEY",
"Content-Type": "application/json"
}
data = {
"model": "r1",
"messages": [
{"role": "user", "content": "What is Python?"}
],
"stream": False
}
response = requests.post(
"http://localhost:8080/chat/completions",
json=data,
headers=headers
)
print(response.json())
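For streaming, the same request can be read incrementally. The sketch below assumes the endpoint emits OpenAI-style SSE lines ("data: {json}" chunks ending with "data: [DONE]"), since the API is OpenAI-compatible; the exact delta fields may differ:

import json
import requests

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}
data = {
    "model": "r1",
    "messages": [
        {"role": "user", "content": "What is Python?"}
    ],
    "stream": True
}
with requests.post(
    "http://localhost:8080/chat/completions",
    json=data,
    headers=headers,
    stream=True
) as response:
    for raw in response.iter_lines():
        if not raw:
            continue
        line = raw.decode("utf-8")
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # assumes OpenAI-style chunk layout: choices[0].delta.content
        print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)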
Key Components
API Client (api/client.py)
The main HTTP client that communicates with DeepSeek's API servers. Features:
- Chat creation and management
- Message completion with streaming
- Proof-of-Work (PoW) challenge handling
- Authentication and authorization
- Server-Sent Events (SSE) parsing for streaming responses
Cache (kv/cache.py)
Redis-based caching system for:
- Chat session persistence
- Message ID tracking
- FNV-1a hash-based key generation
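FNV-1a itself is only a few lines; the sketch below shows the standard 64-bit variant, while the actual key layout used by kv/cache.py is not reproduced here:

def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a hash (standard offset basis and prime)."""
    h = 0xcbf29ce484222325
    for byte in data:
        h ^= byte
        h = (h * 0x100000001b3) & 0xFFFFFFFFFFFFFFFF
    return h

# illustrative only: derive a cache key from a session identifier
print(f"chat:{fnv1a_64(b'session-identifier'):016x}")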
WASM Solver (solver/instance.py)
Wraps the WASM SHA3 proof-of-work solver:
- Memory management via Wasmtime
- Hash calculation for PoW challenges
- Compatibility with Go WASM binary
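Loading the binary with wasmtime-py looks roughly like the sketch below; the export names and import requirements of the module are assumptions (solver/instance.py handles the real wiring):

from wasmtime import Engine, Instance, Module, Store

engine = Engine()
store = Store(engine)
module = Module.from_file(engine, "sha3_wasm_bg.7b9ca65ddd.wasm")

# Instantiating with no imports assumes the module is self-contained;
# if it declares imports, a wasmtime Linker is needed instead.
instance = Instance(store, module, [])
exports = instance.exports(store)

memory = exports["memory"]  # linear memory used to pass challenge data in and results out
# The solver entry point is looked up by name from these exports; the actual
# export name used by solver/instance.py is not documented in this README.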
Flask Application (application/app.py)
Web framework providing:
- REST API endpoints
- Request/response handling
- Streaming support
- Error handling and logging
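The route shape mirrors the endpoint list above; a minimal sketch (handler bodies are placeholders, not the code in application/app.py):

from flask import Flask, Response, jsonify, request

app = Flask(__name__)

@app.get("/")
def health():
    return "started"

@app.get("/models")
def models():
    return jsonify({"object": "list", "data": [{"id": "r1", "object": "model"}]})

@app.post("/chat/completions")
def chat_completions():
    body = request.get_json(force=True)
    if body.get("stream"):
        def generate():
            # the real handler forwards chunks from the DeepSeek client as SSE
            yield "data: [DONE]\n\n"
        return Response(generate(), mimetype="text/event-stream")
    return jsonify({"choices": []})  # placeholder; the real handler returns a full completion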
Design Differences from Go
- HTTP Client: Uses the requests library instead of Go's net/http
- Concurrency: Python's threading vs Go's goroutines
- JSON Serialization: json module instead of sonic (Go's fast JSON library)
- Logging: Python's logging instead of zap
- Server: Flask instead of the Echo web framework
- WASM Runtime: wasmtime-py instead of wasmtime-go
Troubleshooting
WASM Binary Not Found
Make sure the WASM binary file exists at:
deepseek4free/pkg/solver/sha3_wasm_bg.7b9ca65ddd.wasm
Redis Connection Error
Ensure Redis is running:
redis-server
# or with Docker
docker run -d -p 6379:6379 redis:7-alpine
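Connectivity can also be checked from Python with redis-py (assuming it is installed as part of the requirements):

import redis

r = redis.Redis(host="localhost", port=6379)
print(r.ping())  # True when the server is reachable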
Import Errors
Make sure you're running from the python directory and have set PYTHONPATH:
export PYTHONPATH=/path/to/python:$PYTHONPATH
python main.py
Performance Notes
- Python is generally slower than Go for this workload
- For production use, consider using Gunicorn:
  pip install gunicorn
  gunicorn -w 4 -b 0.0.0.0:8080 "application.app:Application(solver, cache).app"
- WASM solver performance is comparable between Go and Python
- Network I/O is the primary bottleneck
License
Same as the original Go implementation