🔒 Security Guide: Dataproc MCP Server
This guide covers security best practices, configuration, and hardening for the Dataproc MCP Server.
Overview
The Dataproc MCP Server implements comprehensive security measures including:
- Input validation and sanitization
- Rate limiting and abuse prevention
- Credential management and protection
- Audit logging and monitoring
- Secure defaults and configurations
Security Features
🛡️ Input Validation
All tool inputs are validated using comprehensive Zod schemas that enforce:
- GCP Resource Constraints: Project IDs, regions, zones, and cluster names must follow GCP naming conventions
- Data Type Validation: Ensures correct data types and formats
- Length Limits: Prevents oversized inputs that could cause issues
- Pattern Matching: Uses regex patterns to validate GCP-specific formats
- Injection Prevention: Detects and blocks common injection patterns
Example Validation Rules
// Project ID validation
const validProjectId = "my-project-123"; // ✅ Valid
const uppercaseProjectId = "My-Project"; // ❌ Invalid (uppercase)
const shortProjectId = "a"; // ❌ Invalid (too short)
// Cluster name validation
const validClusterName = "my-cluster"; // ✅ Valid
const underscoreClusterName = "My_Cluster"; // ❌ Invalid (uppercase, underscore)
const trailingHyphenClusterName = "cluster-"; // ❌ Invalid (ends with hyphen)
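The rules above can be sketched as plain regular-expression checks. This is a minimal, dependency-free illustration and not the server's actual Zod schemas: the function names are hypothetical, and the patterns are assumptions based on the published GCP naming rules (project IDs of 6 to 30 characters, cluster names of up to 52 characters).

```typescript
// Hypothetical validators. The patterns are assumptions drawn from GCP naming
// rules, not a copy of the server's Zod schemas.

// Project ID: 6-30 chars; lowercase letters, digits, hyphens;
// starts with a letter and does not end with a hyphen.
const PROJECT_ID_PATTERN = /^[a-z][a-z0-9-]{4,28}[a-z0-9]$/;

// Cluster name: starts with a lowercase letter, at most 52 chars in total;
// lowercase letters, digits, hyphens; does not end with a hyphen.
const CLUSTER_NAME_PATTERN = /^[a-z]([a-z0-9-]{0,50}[a-z0-9])?$/;

function isValidProjectId(value: string): boolean {
  return PROJECT_ID_PATTERN.test(value);
}

function isValidClusterName(value: string): boolean {
  return CLUSTER_NAME_PATTERN.test(value);
}
```

Anchoring both patterns with `^` and `$` matters here: an unanchored pattern would accept any input that merely contains a valid name somewhere inside it.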
🚦 Rate Limiting
Built-in rate limiting prevents abuse and ensures fair resource usage:
- Default Limits: 100 requests per minute per client
- Configurable Windows: Adjustable time windows and limits
- Per-Tool Limiting: Different limits can be set per tool
- Automatic Cleanup: Expired rate limit entries are automatically cleaned up
Configuration
{
  "rateLimiting": {
    "windowMs": 60000, // 1 minute window
    "maxRequests": 100, // Max requests per window
    "enabled": true
  }
}
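To make the configuration fields concrete, here is a minimal sketch of a fixed-window limiter driven by `windowMs`, `maxRequests`, and `enabled`. The class and method names are hypothetical, and the server's real limiter may use a different algorithm (for example a sliding window); only the three configuration keys come from this guide.

```typescript
interface RateLimitConfig {
  windowMs: number;
  maxRequests: number;
  enabled: boolean;
}

interface WindowEntry {
  windowStart: number; // timestamp (ms) when this client's window opened
  count: number;       // requests seen in the current window
}

// Hypothetical fixed-window limiter keyed by client ID.
class FixedWindowRateLimiter {
  private readonly entries = new Map<string, WindowEntry>();
  private readonly config: RateLimitConfig;
  private readonly now: () => number;

  // The clock is injectable so the limiter can be tested deterministically.
  constructor(config: RateLimitConfig, now: () => number = () => Date.now()) {
    this.config = config;
    this.now = now;
  }

  // Returns true if the request is allowed, false if the client is over its limit.
  allow(clientId: string): boolean {
    if (!this.config.enabled) return true;
    const t = this.now();
    const entry = this.entries.get(clientId);
    if (entry === undefined || t - entry.windowStart >= this.config.windowMs) {
      this.entries.set(clientId, { windowStart: t, count: 1 });
      return true;
    }
    if (entry.count >= this.config.maxRequests) return false;
    entry.count += 1;
    return true;
  }

  // Drops entries whose window has expired (the "automatic cleanup" step).
  cleanup(): void {
    const t = this.now();
    this.entries.forEach((entry, clientId) => {
      if (t - entry.windowStart >= this.config.windowMs) {
        this.entries.delete(clientId);
      }
    });
  }

  trackedClientCount(): number {
    return this.entries.size;
  }
}
```

A fixed window is the simplest scheme to reason about, but it allows short bursts of up to twice `maxRequests` across a window boundary; a sliding window or token bucket avoids that at the cost of more bookkeeping.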
🔐 Credential Management
Comprehensive credential validation and protection:
Sensitive File Protection
⚠️ CRITICAL: Configuration files containing sensitive information must never be committed to version control.
Protected Files:
- config/server.json: Contains authentication credentials, API keys, and project details
- Service account key files (.json files with private keys)
- Any files containing passwords, tokens, or API keys
Security Measures:
- Git Ignore Protection: Sensitive files are listed in .gitignore
- Template System: Use config/server.json.template as a reference
- History Cleanup: If accidentally committed, use BFG Repo-Cleaner to remove from history
Emergency: Removing Sensitive Files from Git History
If sensitive files were accidentally committed and pushed to a repository:
- Install BFG Repo-Cleaner:
# macOS
brew install bfg
# Or download from: https://rtyley.github.io/bfg-repo-cleaner/
- Remove file from current commit:
git rm -f config/server.json
git commit -m "Remove sensitive configuration file"
- Clean entire Git history:
# Remove all instances of the file from history
bfg --delete-files server.json
# Clean up the repository
git reflog expire --expire=now --all && git gc --prune=now --aggressive
- Force push to remote (⚠️ DESTRUCTIVE OPERATION):
# Push cleaned main branch
git push --force origin main
# Push all cleaned branches
git push --force origin --all
- Post-cleanup actions:
  - Rotate all compromised credentials immediately
  - Update API keys and service account keys
  - Notify team members to re-clone the repository
  - Monitor for any unauthorized access
⚠️ Important Notes:
- Force pushing rewrites Git history and affects all collaborators
- All team members must re-clone the repository after cleanup
- This operation cannot be undone - ensure you have backups
- Consider contacting GitHub support for additional cache clearing
Configuration File Setup
- Copy the template:
cp config/server.json.template config/server.json
- Edit with your credentials:
{
  "projectId": "your-actual-project-id",
  "region": "us-central1",
  "authentication": {
    "serviceAccountKeyPath": "/secure/path/to/your-key.json",
    "impersonateServiceAccount": "your-sa@project.iam.gserviceaccount.com"
  }
}
- Verify protection:
# Ensure file is ignored
git status
# Should not show config/server.json as modified
Service Account Key Validation
- Format Validation: Ensures proper JSON structure and required fields
- Permission Checks: Validates file permissions (warns if world-readable)
- Age Monitoring: Warns about keys older than 90 days
- Content Sanitization: Removes sensitive data from logs
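The format, permission, and age checks listed above could look roughly like the following. This is a sketch under stated assumptions: `checkServiceAccountKey`, `KeyFileInfo`, and the list of required fields are hypothetical names chosen for illustration, and the function takes file metadata as arguments rather than reading the disk so that it stays self-contained.

```typescript
// Hypothetical helper; names and the required-field list are illustrative.
const MAX_KEY_AGE_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;
// Fields assumed to be required in a service account key file.
const REQUIRED_FIELDS = ["type", "project_id", "private_key", "client_email"];

interface KeyFileInfo {
  contents: string; // raw JSON text of the key file
  mode: number;     // POSIX permission bits, e.g. 0o600
  createdAt: Date;  // when the key was issued
}

// Returns a list of human-readable warnings; an empty list means no findings.
function checkServiceAccountKey(info: KeyFileInfo, now: Date): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(info.contents);
  } catch {
    return ["Key file is not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["Key file must contain a JSON object"];
  }

  const record = parsed as Record<string, unknown>;
  const warnings: string[] = [];

  // Format validation: every required field must be present as a string.
  for (const field of REQUIRED_FIELDS) {
    if (typeof record[field] !== "string") {
      warnings.push(`Missing required field: ${field}`);
    }
  }

  // Permission check: the "other read" bit means the file is world-readable.
  if ((info.mode & 0o004) !== 0) {
    warnings.push("Key file is world-readable; use chmod 600");
  }

  // Age monitoring: warn once the key passes the rotation threshold.
  const ageDays = (now.getTime() - info.createdAt.getTime()) / MS_PER_DAY;
  if (ageDays > MAX_KEY_AGE_DAYS) {
    warnings.push(`Key is older than ${MAX_KEY_AGE_DAYS} days; rotate it`);
  }

  return warnings;
}
```

In a Node.js process, the `mode` value could be taken from `fs.statSync(path).mode`. Note that the warnings never include the key contents, in line with the content sanitization point above.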
Best Practices
- Use Service Account Impersonation
{
  "authentication": {
    "impersonateServiceAccount": "dataproc-sa@project.iam.gserviceaccount.com",
    "fallbackKeyPath": "/secure/path/to/source-key.json",
    "preferImpersonation": true
  }
}
- Secure Key Storage
# Set restrictive permissions
chmod 600 /path/to/service-account-key.json
chown dataproc-user:dataproc-group /path/to/service-account-key.json
- Regular Key Rotation
  - Rotate keys every 90 days
  - Monitor key age with built-in warnings
  - Use automated rotation where possible
📊 Audit Logging
All security-relevant events are logged for monitoring and compliance:
Logged Events
- Authentication Events: Login attempts, key validation, impersonation
- Input Validation Failures: Invalid inputs, injection attempts
- Rate Limit Violations: Exceeded request limits
- Tool Executions: All tool calls with sanitized parameters
- Error Conditions: Security-related errors and warnings
Log Format
{
  "timestamp": "2025-05-29T22:30:00.000Z",
  "event": "Input validation failed",
  "details": {
    "tool": "start_dataproc_cluster",
    "error": "Invalid project ID format",
    "clientId": "[REDACTED]"
  },
  "severity": "warn"
}
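An entry in this shape could be produced by a small helper that redacts sensitive fields before anything is written. The sketch below is illustrative only: `buildAuditEntry`, `redactDetails`, and the list of sensitive key names are assumptions, not the server's logging API.

```typescript
type Severity = "info" | "warn" | "error";

interface AuditEntry {
  timestamp: string;
  event: string;
  details: Record<string, unknown>;
  severity: Severity;
}

// Assumed list of key-name fragments that mark a field as sensitive.
const SENSITIVE_KEY_PATTERN = /(key|token|password|secret|credential|clientid)/i;

// Replaces the value of any top-level field whose name looks sensitive.
// Nested objects are not traversed in this sketch.
function redactDetails(details: Record<string, unknown>): Record<string, unknown> {
  const redacted: Record<string, unknown> = {};
  for (const name of Object.keys(details)) {
    redacted[name] = SENSITIVE_KEY_PATTERN.test(name) ? "[REDACTED]" : details[name];
  }
  return redacted;
}

// Builds a log entry in the format shown above; the clock is injectable for tests.
function buildAuditEntry(
  event: string,
  details: Record<string, unknown>,
  severity: Severity,
  now: Date = new Date(),
): AuditEntry {
  return {
    timestamp: now.toISOString(),
    event,
    details: redactDetails(details),
    severity,
  };
}
```

Redacting at the point where the entry is built, rather than at the log sink, means a raw credential never reaches a buffer, transport, or file in the first place.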
🔍 Threat Detection
Automatic detection of suspicious patterns:
- SQL Injection: Detects SQL keywords and patterns
- XSS Attempts: Identifies script injection attempts
- Path Traversal: Catches directory traversal attempts
- Template Injection: Detects template expression patterns
- Code Injection: Identifies code execution attempts
- System Commands: Flags dangerous system commands
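As a rough illustration of pattern-based detection, the sketch below checks an input string against a small table of regular expressions. The patterns and the `detectThreats` name are hypothetical and deliberately simplistic; they are not the server's actual rule set, and the code injection category is omitted for brevity.

```typescript
interface ThreatPattern {
  name: string;
  pattern: RegExp;
}

// Illustrative patterns only. Real rule sets are broader and tuned against
// false positives.
const THREAT_PATTERNS: ThreatPattern[] = [
  { name: "sql-injection", pattern: /\b(union\s+select|drop\s+table|insert\s+into|delete\s+from)\b/i },
  { name: "xss", pattern: /<script\b|javascript:|\bon\w+\s*=/i },
  { name: "path-traversal", pattern: /\.\.[\/\\]/ },
  { name: "template-injection", pattern: /\{\{.*\}\}|\$\{.*\}/ },
  { name: "system-command", pattern: /(;|\||&&)\s*(rm|curl|wget|bash|sh)\b/i },
];

// Returns the names of all patterns that match the input.
function detectThreats(input: string): string[] {
  return THREAT_PATTERNS
    .filter(({ pattern }) => pattern.test(input))
    .map(({ name }) => name);
}
```

Pattern matching of this kind is best treated as a signal for logging and alerting, not as the primary defense: strict allow-list validation of each field is what actually keeps malformed input out.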
Security Configuration
Environment Variables
# Security settings
SECURITY_RATE_LIMIT_ENABLED=true
SECURITY_RATE_LIMIT_WINDOW=60000
SECURITY_RATE_LIMIT_MAX=100
SECURITY_AUDIT_LOG_LEVEL=info
SECURITY_CREDENTIAL_VALIDATION=strict
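One way these variables might be read into typed settings is sketched below. The variable names come from the list above; the `SecuritySettings` shape, the function names, and the fallback behavior are assumptions made for illustration, with defaults mirroring the values shown in this guide.

```typescript
interface SecuritySettings {
  rateLimitEnabled: boolean;
  rateLimitWindowMs: number;
  rateLimitMax: number;
  auditLogLevel: string;
  credentialValidation: string;
}

type Env = Record<string, string | undefined>;

// Parses a positive integer, falling back when the value is missing or malformed.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || !/^\d+$/.test(raw)) return fallback;
  const value = parseInt(raw, 10);
  return value > 0 ? value : fallback;
}

// Hypothetical loader: reads the SECURITY_* variables from the given environment.
function loadSecuritySettings(env: Env): SecuritySettings {
  return {
    // Rate limiting stays on unless it is explicitly set to "false".
    rateLimitEnabled: (env.SECURITY_RATE_LIMIT_ENABLED ?? "true").toLowerCase() !== "false",
    rateLimitWindowMs: parsePositiveInt(env.SECURITY_RATE_LIMIT_WINDOW, 60000),
    rateLimitMax: parsePositiveInt(env.SECURITY_RATE_LIMIT_MAX, 100),
    auditLogLevel: env.SECURITY_AUDIT_LOG_LEVEL ?? "info",
    credentialValidation: env.SECURITY_CREDENTIAL_VALIDATION ?? "strict",
  };
}
```

In a Node.js process this would be called as `loadSecuritySettings(process.env)`. Falling back to the secure value on missing or malformed input means a typo in a variable cannot silently disable a protection.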
Configuration File
{
  "security": {
    "enableRateLimiting": true,
    "maxRequestsPerMinute": 100,
    "enableInputValidation": true,
    "sanitizeCredentials": true,
    "auditLogLevel": "info",
    "enableThreatDetection": true,
    "secureHeaders": {
      "enabled": true,
      "customHeaders": {}
    }
  }
}
Hardening Checklist
✅ Basic Security
- Service account keys have restrictive permissions (600)
- Using service account impersonation instead of direct keys
- Rate limiting is enabled and configured appropriately
- Input validation is enabled for all tools
- Audit logging is configured and monitored
✅ Advanced Security
- Service account keys are rotated regularly (≤90 days)
- Monitoring and alerting for security events
- Network access is restricted (firewall rules)
- TLS/SSL is used for all communications
- Regular security audits and penetration testing
✅ Production Security
- Dedicated service accounts per environment
- Centralized credential management (Secret Manager)
- Automated security scanning in CI/CD
- Incident response procedures documented
- Security training for operators
Monitoring and Alerting
Key Metrics to Monitor
- Authentication Failures
  - Failed service account validations
  - Invalid credential attempts
  - Permission denied errors
- Rate Limiting Events
  - Clients hitting rate limits
  - Unusual traffic patterns
  - Potential abuse attempts
- Input Validation Failures
  - Malformed requests
  - Injection attempt patterns
  - Suspicious input patterns
- System Health
  - Error rates by tool
  - Response times
  - Resource utilization
Sample Alerts
# Example Prometheus alerts
groups:
  - name: dataproc-mcp-security
    rules:
      - alert: HighAuthenticationFailures
        expr: rate(dataproc_auth_failures_total[5m]) > 0.1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High authentication failure rate"
      - alert: RateLimitViolations
        expr: rate(dataproc_rate_limit_violations_total[5m]) > 0.05
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Rate limit violations detected"
Incident Response
Security Incident Types
- Credential Compromise
  - Immediately rotate affected keys
  - Review audit logs for unauthorized access
  - Update access controls
- Injection Attacks
  - Block suspicious clients
  - Review and strengthen input validation
  - Analyze attack patterns
- Rate Limit Abuse
  - Identify and block abusive clients
  - Adjust rate limits if necessary
  - Investigate traffic patterns
Response Procedures
- Immediate Response
  - Isolate affected systems
  - Preserve evidence (logs, configurations)
  - Notify security team
- Investigation
  - Analyze audit logs
  - Identify attack vectors
  - Assess impact and scope
- Recovery
  - Apply security patches
  - Update configurations
  - Restore normal operations
- Post-Incident
  - Document lessons learned
  - Update security procedures
  - Implement additional controls
Compliance Considerations
Data Protection
- PII Handling: Ensure no personally identifiable information is logged
- Data Encryption: Use encryption for data at rest and in transit
- Access Controls: Implement least privilege access principles
Regulatory Requirements
- SOC 2: Implement appropriate security controls
- GDPR: Ensure data protection and privacy compliance
- HIPAA: Additional controls for healthcare data (if applicable)
Audit Requirements
- Log Retention: Maintain audit logs for required periods
- Access Reviews: Regular review of service account permissions
- Security Assessments: Periodic security evaluations
Security Updates
Keeping Secure
- Regular Updates
  - Update dependencies regularly
  - Apply security patches promptly
  - Monitor security advisories
- Vulnerability Scanning
  - Automated dependency scanning
  - Container image scanning
  - Infrastructure scanning
- Security Testing
  - Regular penetration testing
  - Code security reviews
  - Configuration audits
Support and Resources
Getting Help
- Security Issues: Report to security team immediately
- Configuration Questions: Consult this guide and documentation
- Best Practices: Follow industry security standards
Remember: Security is an ongoing process, not a one-time setup. Regularly review and update your security configurations as threats evolve.