🚀 High-performance Git repository synchronization and distribution service with automatic Minio object storage sync support.
- ✨ Features
- 🚀 Quick Start
- 📝 Configuration
- 🛠️ API Endpoints
- 📈 Performance
- 🔧 Debugging
- 🤝 Contributing
- 📄 License
- 🔄 Multi Git repository auto-sync
- ⏱️ Configurable sync intervals per repository
- 🚀 Customizable concurrent upload threads
- 📝 Incremental updates
- 🔗 Unified file access API
- 🎯 Custom access path mapping
- 🔒 SHA1 checksum verification
- 💫 Async sync without blocking
- 📦 Local cache for faster access
- Go 1.16+
- Minio Server (or S3 compatible storage)
- Git
- Git Repository Sync
  - Multi-repo auto sync
  - Incremental updates
  - Custom sync intervals
  - SHA1 checksum verification
- Minio Object Storage
  - S3 compatible support
  - Presigned URL access
  - Parallel upload optimization
  - Auto bucket creation
- Caching System
  - Local file cache
  - CDN cache control
  - API response cache
  - Auto cleanup mechanism
- Log Management
  - Auto log rotation
  - Multi-level logging
  - Log compression
  - Space management
# Clone repository
git clone https://github.com/pysio2007/Files-API.git
cd Files-API
# Install dependencies
go mod download
# Run service
go run main.go
A default config.yaml will be generated on first run.
server:
  port: 8080 # Server port
  host: "0.0.0.0" # Listen address
  enableAPI: true # Enable API service
  apiOnly: false # API-only mode
  legacyAPI: true # Enable legacy API support
minio:
  endpoint: "play.min.io"
  accessKey: "your-access-key"
  secretKey: "your-secret-key"
  useSSL: true
  bucket: "documents"
  usePublicURL: true # Use presigned URLs
  maxWorkers: 16 # Concurrent upload threads
cache:
  enabled: true
  directory: ".cache/files"
  maxSize: 1000 # Cache size limit (MB)
  ttl: "7d" # Cache TTL
  cacheControl: "30d" # CDN cache time
The service supports configuring multiple storage buckets to meet different data storage requirements. Each bucket configuration includes:
- Name: Identifier used for routing.
- Endpoint: Storage server address.
- AccessKey & SecretKey: Credentials for authentication.
- UseSSL: Whether to use SSL.
- BucketName: Actual bucket name.
- BasePath: Base path for files in the bucket (can be empty).
- ReadOnly: Indicates if the bucket is read-only.
Example configuration:
buckets:
  - name: "Images" # Routing identifier; accessed via /Images/
    endpoint: "minioapi.example.com"
    accessKey: "your-access-key"
    secretKey: "your-secret-key"
    useSSL: true
    bucketName: "pysioimages" # Actual bucket name
    basePath: "" # Root directory
    readOnly: true
- Full Mode (Default)
  enableAPI: true
  apiOnly: false
  - API endpoints available
  - Direct file access enabled
  - Suitable for most scenarios
- API-Only Mode
  enableAPI: true
  apiOnly: true
  - API endpoints only
  - Direct file access disabled
  - For strict access control
The service supports automatic redirection from legacy API paths to the new format:
- Enable Support:
server:
  legacyAPI: true # Enable legacy API support
- Path Mapping Rules:
  - Legacy: /files/Pysio-Images/example.png
  - New: /Pysio-Images/example.png
- Redirection Details:
  - Uses 301 permanent redirect
  - Automatically removes the /files/ prefix
  - Preserves original query parameters
  - Logs redirections (if enabled)
- Logging:
logs:
  redirectLog: true # Enable redirection logging
git:
  cachePath: ".cache/repos" # Local cache directory
repositories:
  - url: "https://github.com/user/repo1" # Repository URL
    branch: "main" # Branch name
    localPath: "repos/repo1" # Local cache path
    minioPath: "static" # Storage path prefix
    checkInterval: "1h" # Sync check interval (m/h/d/y)
exposedPaths:
  - urlPath: "/assets" # Access URL path
    minioPath: "static" # Storage path prefix
logs:
  accessLog: true # Record all file requests
  processLog: false # Record processing details
  redirectLog: false # Record URL redirections
  presignLog: false # Record presigned URL generation
  saveToFile: true # Save logs to file
  maxSize: 100 # Max log directory size (MB)
  directory: "logs" # Log directory
cache:
  enabled: true # Enable file caching
  directory: ".cache/files" # Cache directory
  maxSize: 1000 # Cache size limit (MB)
  ttl: "7d" # Cache TTL
  cacheControl: "30d" # Static file CDN cache time
  enableAPICache: true # Enable API cache control
  apiCacheControl: "5m" # API response cache time
  cacheLog: true # Log cache operations
  hitLog: true # Log cache hits
- API Cache Control
cache:
  enableAPICache: true # Enable API caching
  apiCacheControl: "5m" # API cache duration
  apiExcludePaths: # Paths to exclude from caching
    - "/api/files/sync/status" # Sync status endpoint
- Cache Exclusion Rules
  - Supports exact path matching
  - Supports path prefix matching
  - Takes effect even when enableAPICache is true
  - Higher priority than global cache settings
- Performance Recommendations
  - Add dynamic content APIs to exclusion list
  - Use longer cache times for static content
  - Disable caching for monitoring endpoints
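The exclusion rules above (exact match plus prefix match) can be sketched in Go roughly as follows. This is an illustration only, not the service's actual code: the function name `isExcluded` is hypothetical, and matching prefixes on a path-segment boundary is an assumption made here to avoid accidental matches.

```go
package main

import (
	"fmt"
	"strings"
)

// isExcluded reports whether requestPath matches any entry in excludePaths,
// either exactly or as a path prefix. (Hypothetical helper for illustration.)
func isExcluded(requestPath string, excludePaths []string) bool {
	for _, p := range excludePaths {
		if p == "" {
			continue // ignore empty entries
		}
		if requestPath == p {
			return true // exact match
		}
		// Assumption: a prefix only matches on a segment boundary, so
		// "/api/files/sync" covers "/api/files/sync/status" but not
		// "/api/files/syncing".
		if strings.HasPrefix(requestPath, strings.TrimSuffix(p, "/")+"/") {
			return true
		}
	}
	return false
}

func main() {
	exclude := []string{"/api/files/sync/status"}
	for _, path := range []string{"/api/files/sync/status", "/api/files/static/"} {
		fmt.Printf("%s excluded=%v\n", path, isExcluded(path, exclude))
	}
}
```

A handler would consult such a check before attaching the API Cache-Control header, so excluded paths are never cached even when enableAPICache is true.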
- Local Cache
  - Cache file content and metadata on disk
  - Auto cleanup of expired cache files
  - Configurable cache directory size limit
  - Configurable cache TTL
- Separate Control
  - Different cache times for API and static files
  - Optional API response caching
  - Long cache time for static files (30 days)
  - Short cache time for API responses (5 minutes)
- CDN Support
  - Control CDN caching via Cache-Control headers
  - Configure cache times by resource type
  - Compatible with various CDN services
- Cache Monitoring
  - Optional cache operation logging
  - Optional cache hit logging
  - Record cleanup and expiration events
  - Monitor cache space usage
Supported time interval formats:
- s: seconds, e.g., "60s" for 60 seconds
- m: minutes, e.g., "10m" for 10 minutes
- h: hours, e.g., "1h" for 1 hour
- d: days, e.g., "1d" for 1 day
- y: years, e.g., "1y" for 1 year
External URLs support high-frequency checks.
If the interval is not configured or is invalid, it defaults to 10 minutes.
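To make the interval format concrete, here is a minimal Go sketch of a parser for these strings. It is not the project's implementation: the name `parseInterval` is hypothetical, and treating a year as 365 days is an assumption, since the README does not define it.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// defaultInterval is used when the value is missing or invalid,
// matching the 10-minute default described above.
const defaultInterval = 10 * time.Minute

// parseInterval converts strings such as "60s", "10m", "1h", "1d", or "1y"
// into a time.Duration. (Hypothetical helper for illustration.)
func parseInterval(s string) time.Duration {
	if len(s) < 2 {
		return defaultInterval
	}
	n, err := strconv.Atoi(s[:len(s)-1])
	if err != nil || n <= 0 {
		return defaultInterval
	}
	var unit time.Duration
	switch s[len(s)-1] {
	case 's':
		unit = time.Second
	case 'm':
		unit = time.Minute
	case 'h':
		unit = time.Hour
	case 'd':
		unit = 24 * time.Hour
	case 'y':
		unit = 365 * 24 * time.Hour // assumption: one year is 365 days
	default:
		return defaultInterval
	}
	return time.Duration(n) * unit
}

func main() {
	for _, s := range []string{"60s", "10m", "1h", "1d", "bogus"} {
		fmt.Printf("%-6s -> %v\n", s, parseInterval(s))
	}
}
```

Go's own time.ParseDuration does not accept "d" or "y" suffixes, which is why a small custom parser like this is needed for the format above.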
The service supports automatic synchronization and caching of external URL resources:
externalURLs:
  - path: "/external/banner.jpg" # Access path
    mainURL: "https://example.com/banner.jpg" # Primary download URL
    backupURLs: # List of backup URLs
      - "https://backup1.com/banner.jpg"
      - "https://backup2.com/banner.jpg"
    minioPath: "external/banner.jpg" # Minio storage path
    cacheControl: "max-age=3600" # Cache control header (e.g., "no-cache" or "max-age=3600")
    checkInterval: "1h" # Update check interval (e.g., "1h", "1d")
  - path: "/external/logo.png"
    mainURL: "https://example.com/logo.png"
    backupURLs:
      - "https://cdn.example.com/logo.png"
    minioPath: "external/logo.png"
    cacheControl: "no-cache" # Disable caching
    checkInterval: "1d" # Check once per day
Configuration Details:
- path: API access path for the external resource
- mainURL & backupURLs: Primary and backup download URLs; in case the main URL fails, backups are attempted in order.
- minioPath: The target storage path in Minio.
- cacheControl: Sets the HTTP cache header; enable detailed logging during troubleshooting.
- checkInterval: Frequency to check for file updates. Use an appropriate value based on the importance of the resource.
How it works:
- On the first access, the resource is downloaded and stored in Minio.
- Later, the system checks for updates based on the checkInterval and automatically retries with backup URLs on failure.
- Users can monitor logs and error messages to debug issues, ensuring proper Minio write permissions and network connectivity.
Troubleshooting Tips:
- If the resource does not update, verify that the external URLs are reachable and the network is reliable.
- To bypass browser caching during testing, use: curl -H "Cache-Control: no-cache" http://localhost:8080/external/banner.jpg
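The "main URL first, then backups in order" behaviour described above can be sketched like this. It is a sketch under assumptions, not the service's code: `fetchWithFallback`, `httpFetch`, `fetchFunc`, and `errUnavailable` are hypothetical names, and the 30-second timeout is an arbitrary choice. The demo in main uses a stub instead of the network so it runs offline.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"time"
)

// errUnavailable is a placeholder error used by the offline demo below.
var errUnavailable = errors.New("source unavailable")

// fetchFunc downloads a URL and returns its body.
type fetchFunc func(url string) ([]byte, error)

// httpFetch is a plain HTTP GET that treats non-200 responses as failures.
func httpFetch(url string) ([]byte, error) {
	client := &http.Client{Timeout: 30 * time.Second} // timeout is an assumption
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: unexpected status %s", url, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// fetchWithFallback tries mainURL first, then each backup URL in order, and
// returns the first successful body together with the URL that served it.
func fetchWithFallback(mainURL string, backupURLs []string, fetch fetchFunc) ([]byte, string, error) {
	var lastErr error
	for _, u := range append([]string{mainURL}, backupURLs...) {
		body, err := fetch(u)
		if err == nil {
			return body, u, nil
		}
		lastErr = err
	}
	return nil, "", fmt.Errorf("all sources failed, last error: %w", lastErr)
}

func main() {
	// Stub: pretend only the second backup is reachable.
	stub := func(u string) ([]byte, error) {
		if u == "https://backup2.com/banner.jpg" {
			return []byte("image-bytes"), nil
		}
		return nil, errUnavailable
	}
	body, src, err := fetchWithFallback(
		"https://example.com/banner.jpg",
		[]string{"https://backup1.com/banner.jpg", "https://backup2.com/banner.jpg"},
		stub,
	)
	fmt.Println(len(body), src, err)
}
```

Passing `httpFetch` instead of the stub performs real downloads; the result would then be written to the configured minioPath.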
When started with --skip, the program will:
- Skip initial sync at startup
- Wait for the configured check interval before first sync
- Useful for delayed sync scenarios
Example:
# Normal start (with initial sync)
./Files-API
# Skip initial sync
./Files-API --skip
Use cases:
- Avoid duplicate syncs in CI/CD
- When repository content is temporarily unavailable
- Waiting for external services
- Control sync timing
# Show help
./Files-API -h
./Files-API --help
# Start service (with initial sync)
./Files-API
# Skip initial sync
./Files-API --skip
# Single sync and exit
./Files-API --sync
# Sync specific repository
./Files-API --rsync=static
# Compress log files
./Files-API --zip-logs
# Decompress log files
./Files-API --unzip-logs
# Clear logs
./Files-API --clear-logs
./Files-API -cl
# Clear cache
./Files-API --clear-cache
./Files-API -cc
# Clear all logs and cache
./Files-API --clear-all
Command details:
- Sync Control
  - --skip: Skip initial sync
  - --sync: Single sync and exit
  - --rsync: Sync specific repository
- Log Management
  - --zip-logs: Compress logs to zip
  - --unzip-logs: Extract log archives
  - --clear-logs, -cl: Clear all logs
- Cache Management
  - --clear-cache, -cc: Clear cache directory
  - --clear-all: Clear all logs and cache
GET /{minioPath}/{filePath}
# Access static resources
GET /static/images/logo.png
GET /assets/css/main.css
# Access other files
GET /public/files/document.pdf
Get a list of files and subdirectories in the specified directory.
GET /api/files/{path}?page=1&pageSize=20
Parameters:
- path: Optional, directory path
- page: Optional, page number (default: 1)
- pageSize: Optional, items per page (default: 20, max: 100)
Response Format:
{
  "code": 200,
  "message": "success",
  "data": [
    {
      "name": "images",
      "path": "static/images/",
      "isDirectory": true
    },
    {
      "name": "logo.png",
      "path": "static/logo.png",
      "size": 12345,
      "lastModified": "2024-02-05T12:34:56Z",
      "isDirectory": false,
      "url": "https://..."
    }
  ],
  "pagination": {
    "current": 1,
    "pageSize": 20,
    "total": 42
  }
}
Response Fields:
- File Information (FileInfo)
  - name: File or directory name
  - path: Complete path
  - size: File size in bytes
  - lastModified: Last modification time
  - isDirectory: Whether it's a directory
  - url: File access URL (only when usePublicURL=true)
- Pagination Info
  - current: Current page number
  - pageSize: Items per page
  - total: Total number of items
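For client code, the response above maps naturally onto Go structs. The sketch below is one possible client-side model, not the server's own types: the Go type and function names (`ListResponse`, `Pagination`, `decodeList`, `totalPages`) are hypothetical, and only the JSON field names come from the documented format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// FileInfo mirrors one entry of the "data" array.
type FileInfo struct {
	Name         string     `json:"name"`
	Path         string     `json:"path"`
	Size         int64      `json:"size,omitempty"`
	LastModified *time.Time `json:"lastModified,omitempty"` // absent for directories
	IsDirectory  bool       `json:"isDirectory"`
	URL          string     `json:"url,omitempty"` // only when usePublicURL=true
}

// Pagination mirrors the "pagination" object.
type Pagination struct {
	Current  int `json:"current"`
	PageSize int `json:"pageSize"`
	Total    int `json:"total"`
}

// ListResponse is the full file-list envelope.
type ListResponse struct {
	Code       int        `json:"code"`
	Message    string     `json:"message"`
	Data       []FileInfo `json:"data"`
	Pagination Pagination `json:"pagination"`
}

// totalPages returns how many pages are needed to list all items.
func (p Pagination) totalPages() int {
	if p.PageSize <= 0 {
		return 0
	}
	return (p.Total + p.PageSize - 1) / p.PageSize
}

// decodeList parses a raw file-list response body.
func decodeList(raw []byte) (ListResponse, error) {
	var r ListResponse
	err := json.Unmarshal(raw, &r)
	return r, err
}

// sampleListJSON is a trimmed copy of the documented example response.
const sampleListJSON = `{"code":200,"message":"success","data":[
 {"name":"images","path":"static/images/","isDirectory":true},
 {"name":"logo.png","path":"static/logo.png","size":12345,
  "lastModified":"2024-02-05T12:34:56Z","isDirectory":false,"url":"https://..."}],
 "pagination":{"current":1,"pageSize":20,"total":42}}`

func main() {
	resp, err := decodeList([]byte(sampleListJSON))
	if err != nil {
		panic(err)
	}
	for _, f := range resp.Data {
		fmt.Printf("%-10s dir=%v size=%d\n", f.Name, f.IsDirectory, f.Size)
	}
	fmt.Println("pages:", resp.Pagination.totalPages())
}
```

A client can loop from page 1 to `totalPages()` to walk a large directory in pageSize-sized chunks.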
Direct file content access.
GET /{minioPath}/{filePath}
Access Modes:
- Redirect Mode (usePublicURL=true)
  - Returns 302 redirect to presigned URL
  - Presigned URL valid for 1 hour
  - Reduces server load
  - Recommended for public access
- Proxy Mode (usePublicURL=false)
  - Returns file content directly
  - Sets appropriate Content-Type
  - Supports large file transfers
  - Suitable for internal networks
Examples:
# Access file
GET /static/images/logo.png
# Get JSON file info with Accept header
curl -H "Accept: application/json" http://localhost:8080/api/files/static/images/
# Paginated query
curl "http://localhost:8080/api/files/static/?page=2&pageSize=50"
Get synchronization status for all repositories.
GET /api/files/sync/status
Response Format:
{
  "code": 200,
  "message": "success",
  "data": {
    "repo1": {
      "lastSync": "2024-02-05T12:34:56Z", // Last sync time
      "nextSync": "2024-02-05T13:34:56Z", // Next scheduled sync
      "progress": 100, // Sync progress (0-100)
      "totalFiles": 50, // Total files
      "currentFiles": 50, // Processed files
      "status": "idle" // Status (idle/syncing/error)
    }
  }
}
Status Descriptions:
- idle: Waiting for next sync
- syncing: Currently synchronizing
- error: Sync failed, check error field for details
- unknown: Initial state
Monitoring Metrics:
- lastSync: Last synchronization time
- nextSync: Next scheduled sync time
- progress: Current sync progress (0-100)
- totalFiles: Total number of files
- currentFiles: Processed files count
- error: Error message (if any)
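A monitoring script might decode this response and flag repositories in the error state. The following Go sketch assumes the field names listed above; the type and function names (`RepoStatus`, `StatusResponse`, `failedRepos`) are hypothetical, and `progress` is decoded as a float in case the service reports fractional values. Note that the `//` comments in the documented example are annotations and are not valid JSON, so the sample below omits them.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"time"
)

// RepoStatus mirrors one repository entry in the sync status response.
type RepoStatus struct {
	LastSync     time.Time `json:"lastSync"`
	NextSync     time.Time `json:"nextSync"`
	Progress     float64   `json:"progress"`
	TotalFiles   int       `json:"totalFiles"`
	CurrentFiles int       `json:"currentFiles"`
	Status       string    `json:"status"` // idle, syncing, error, or unknown
	Error        string    `json:"error,omitempty"`
}

// StatusResponse is the envelope returned by GET /api/files/sync/status.
type StatusResponse struct {
	Code    int                   `json:"code"`
	Message string                `json:"message"`
	Data    map[string]RepoStatus `json:"data"`
}

// failedRepos returns the sorted names of repositories whose status is "error".
func failedRepos(raw []byte) ([]string, error) {
	var resp StatusResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	var failed []string
	for name, st := range resp.Data {
		if st.Status == "error" {
			failed = append(failed, name)
		}
	}
	sort.Strings(failed)
	return failed, nil
}

// sampleStatusJSON is an illustrative payload; repo2 and its error are made up.
const sampleStatusJSON = `{"code":200,"message":"success","data":{
 "repo1":{"lastSync":"2024-02-05T12:34:56Z","nextSync":"2024-02-05T13:34:56Z",
          "progress":100,"totalFiles":50,"currentFiles":50,"status":"idle"},
 "repo2":{"progress":10,"totalFiles":50,"currentFiles":5,"status":"error",
          "error":"example failure"}}}`

func main() {
	failed, err := failedRepos([]byte(sampleStatusJSON))
	fmt.Println(failed, err)
}
```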
Monitoring Examples:
# Check sync status
curl http://localhost:8080/api/files/sync/status
# Monitor progress with watch
watch -n 1 'curl -s http://localhost:8080/api/files/sync/status | jq'
The service supports different caching strategies for API responses and static files:
cache:
  enableAPICache: true # Enable API response caching
  apiCacheControl: "5m" # API cache duration (5 minutes)
  cacheControl: "30d" # Static files cache duration (30 days)
Headers:
- API responses include Cache-Control: public, max-age=300 (5 minutes)
- Static files include Cache-Control: public, max-age=2592000 (30 days)
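The header values follow directly from the configured durations: 5 minutes is 5 × 60 = 300 seconds, and 30 days is 30 × 86400 = 2592000 seconds. A minimal Go sketch of that conversion is shown below; `cacheControlHeader` and `unitSeconds` are hypothetical names, and returning an error for invalid input is a choice made for this example rather than documented behaviour.

```go
package main

import (
	"fmt"
	"strconv"
)

// unitSeconds maps the duration suffixes used in config.yaml to seconds.
// The "y" entry assumes a 365-day year.
var unitSeconds = map[byte]int{'s': 1, 'm': 60, 'h': 3600, 'd': 86400, 'y': 365 * 86400}

// cacheControlHeader turns a duration such as "5m" or "30d" into a
// Cache-Control value of the form "public, max-age=<seconds>".
func cacheControlHeader(ttl string) (string, error) {
	if len(ttl) < 2 {
		return "", fmt.Errorf("invalid duration %q", ttl)
	}
	n, err := strconv.Atoi(ttl[:len(ttl)-1])
	mult, ok := unitSeconds[ttl[len(ttl)-1]]
	if err != nil || !ok || n < 0 {
		return "", fmt.Errorf("invalid duration %q", ttl)
	}
	return fmt.Sprintf("public, max-age=%d", n*mult), nil
}

func main() {
	for _, ttl := range []string{"5m", "30d"} {
		header, err := cacheControlHeader(ttl)
		fmt.Println(ttl, "->", header, err)
	}
}
```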
{
  "code": 404,
  "message": "File not found"
}
Common Status Codes:
- 200: Success
- 400: Invalid request
- 404: File not found
- 500: Server error
- Full Debug Configuration (Record Everything)
logs:
  accessLog: true # Record all requests
  processLog: true # Record file processing
  redirectLog: true # Record URL redirects
  presignLog: true # Record presigned URL generation
  saveToFile: true # Output to both file and console
  maxSize: 100 # Log directory limit (MB)
  directory: "logs" # Log directory
- Cache Debug Configuration
cache:
  cacheLog: true # Record cache operations
  hitLog: true # Record cache hits
- File Sync Issues
# Run a single sync and exit
./Files-API --sync
# Sync specific repository
./Files-API --rsync=static
# Monitor sync logs
tail -f logs/Files-API-$(date +%Y-%m-%d).log | grep "sync"
- Cache Issues
# Check cache status
ls -lh .cache/files/
# Clear cache and retry
./Files-API --clear-cache
# Monitor cache hits
tail -f logs/Files-API-$(date +%Y-%m-%d).log | grep "Cache hit"
- Minio Connection Issues
# Check Minio connection
curl -I http://{minio-endpoint}/
# Or use tools like s3cmd test
# Verify configuration
grep -A 8 minio config.yaml
- Using Go Profiling Tools
# Enable GC tracing
GODEBUG=gctrace=1 ./Files-API
# Use pprof (requires the service to expose /debug/pprof)
go tool pprof http://localhost:8080/debug/pprof/heap
- Monitor System Resources
# Check memory usage
ps -o pid,ppid,%mem,rss,cmd -p $(pgrep Files-API)
# Check file descriptors
lsof -p $(pgrep Files-API)
- Enable All Logs
# Edit config file
vim config.yaml
# Set all log options to true
# Test with shorter sync interval
checkInterval: "1m"
- API Testing
# Test file list API
curl "http://localhost:8080/api/files/static/?page=1&pageSize=10"
# Test file access
curl -I "http://localhost:8080/static/test.txt"
- Performance Testing
# Test concurrent requests
ab -n 1000 -c 10 http://localhost:8080/api/files/static/
# Test file uploads
for i in {1..10}; do
  ./Files-API --sync
done
| Status Code | Description | Resolution |
|---|---|---|
| 400 | Bad Request | Check API parameters |
| 404 | File Not Found | Check path and sync status |
| 500 | Internal Server Error | Check detailed logs |
| 503 | Minio Service Unavailable | Check Minio connection |
- Full Debug Configuration
logs:
  accessLog: true # Record all access
  processLog: true # Record processing
  redirectLog: true # Record redirects
  presignLog: true # Record presigned URL generation
  saveToFile: true # Save to file
  maxSize: 100 # Log directory size limit (MB)
  directory: "logs" # Log directory
- Minimal Log Configuration
logs:
  accessLog: true # Basic access only
  processLog: false
  redirectLog: false
  presignLog: false
  saveToFile: false # Console only
- View Logs
# View today's log
cat logs/Files-API-$(date +%Y-%m-%d).log
# Monitor real-time logs
tail -f logs/Files-API-$(date +%Y-%m-%d).log
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the AGPL-3.0 License.
- Added CORS configuration support. You can now customize allowed origins via the server.allowOrigins configuration.
- Updated the CORS middleware to automatically handle preflight requests and set proper headers.
- The default configuration file now includes an allowOrigins entry with a default value of ["http://localhost:8080"].
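As a rough illustration of what such a middleware does, here is a minimal net/http sketch. It is not the project's middleware: `corsMiddleware` and `originAllowed` are hypothetical names, and the allowed methods and headers listed here are assumptions chosen for a read-only file API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// originAllowed reports whether origin appears in the allow list.
func originAllowed(origin string, allowOrigins []string) bool {
	for _, o := range allowOrigins {
		if o == origin {
			return true
		}
	}
	return false
}

// corsMiddleware sets CORS headers for allowed origins and answers
// preflight (OPTIONS) requests without calling the wrapped handler.
func corsMiddleware(allowOrigins []string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		origin := r.Header.Get("Origin")
		if origin != "" && originAllowed(origin, allowOrigins) {
			w.Header().Set("Access-Control-Allow-Origin", origin)
			w.Header().Add("Vary", "Origin")
			// Assumed method and header lists; adjust to the real API surface.
			w.Header().Set("Access-Control-Allow-Methods", "GET, HEAD, OPTIONS")
			w.Header().Set("Access-Control-Allow-Headers", "Accept, Content-Type")
		}
		if r.Method == http.MethodOptions && r.Header.Get("Access-Control-Request-Method") != "" {
			w.WriteHeader(http.StatusNoContent) // preflight handled here
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	handler := corsMiddleware(
		[]string{"http://localhost:8080"}, // default from the generated config
		http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "ok")
		}),
	)
	// Simulate a browser preflight request without starting a server.
	req := httptest.NewRequest(http.MethodOptions, "/api/files/", nil)
	req.Header.Set("Origin", "http://localhost:8080")
	req.Header.Set("Access-Control-Request-Method", "GET")
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, req)
	fmt.Println(rec.Code, rec.Header().Get("Access-Control-Allow-Origin"))
}
```

Echoing the matched origin (rather than a wildcard) together with `Vary: Origin` keeps shared caches from serving one origin's response to another.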