🔧 Dataproc MCP Server Documentation

Production-ready Model Context Protocol server for Google Cloud Dataproc operations.

🔒 Security Guide: Dataproc MCP Server

This guide covers security best practices, configuration, and hardening for the Dataproc MCP Server.

Overview

The Dataproc MCP Server implements security measures in five areas: input validation, rate limiting, credential management, audit logging, and threat detection.

Security Features

🛡️ Input Validation

All tool inputs are validated against Zod schemas. The examples below show the kind of format rules these schemas enforce.

Example Validation Rules

// Project ID validation
const validProjectId = "my-project-123";  // ✅ Valid
const upperProjectId = "My-Project";      // ❌ Invalid (uppercase)
const shortProjectId = "a";               // ❌ Invalid (too short)

// Cluster name validation
const validClusterName = "my-cluster";    // ✅ Valid
const underscoreClusterName = "My_Cluster"; // ❌ Invalid (uppercase, underscore)
const hyphenClusterName = "cluster-";     // ❌ Invalid (ends with hyphen)
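
The rules implied by these examples can be sketched as plain regular-expression checks. The server itself uses Zod schemas, which are not reproduced here. The function names `isValidProjectId` and `isValidClusterName` and the exact length limits are assumptions based on Google Cloud's published naming rules, not taken from the server's source.

```typescript
// Hypothetical validators; the server's real Zod schemas may differ.

// Assumed project ID rule: 6-30 characters; lowercase letters, digits, and
// hyphens; starts with a letter; does not end with a hyphen.
const PROJECT_ID_PATTERN = /^[a-z][a-z0-9-]{4,28}[a-z0-9]$/;

// Assumed cluster name rule: 1-52 characters; lowercase letters, digits, and
// hyphens; starts with a letter; does not end with a hyphen.
const CLUSTER_NAME_PATTERN = /^[a-z]([a-z0-9-]{0,50}[a-z0-9])?$/;

function isValidProjectId(value: string): boolean {
  return PROJECT_ID_PATTERN.test(value);
}

function isValidClusterName(value: string): boolean {
  return CLUSTER_NAME_PATTERN.test(value);
}

console.log(isValidProjectId("my-project-123")); // true
console.log(isValidProjectId("My-Project"));     // false (uppercase)
console.log(isValidProjectId("a"));              // false (too short)
console.log(isValidClusterName("my-cluster"));   // true
console.log(isValidClusterName("My_Cluster"));   // false (uppercase, underscore)
console.log(isValidClusterName("cluster-"));     // false (ends with hyphen)
```

In a Zod-based codebase the same patterns would typically be attached to string schemas, so that a failed match produces a structured validation error rather than a bare boolean.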

🚦 Rate Limiting

Built-in rate limiting caps the number of requests accepted per time window, which prevents abuse and keeps resource usage fair.

Configuration

{
  "rateLimiting": {
    "windowMs": 60000,     // 1 minute window
    "maxRequests": 100,    // Max requests per window
    "enabled": true
  }
}
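
To make the two settings concrete, here is a minimal fixed-window limiter driven by `windowMs` and `maxRequests`. This guide does not say which algorithm the server actually uses, so the class name `FixedWindowRateLimiter`, the per-client keying, and the fixed-window strategy are all assumptions made for illustration.

```typescript
// Hypothetical fixed-window limiter; field names mirror the config keys above.
interface RateLimitConfig {
  windowMs: number;    // length of one window in milliseconds
  maxRequests: number; // requests allowed per client per window
  enabled: boolean;
}

interface WindowState {
  windowStart: number; // timestamp (ms) when the current window opened
  count: number;       // requests accepted in the current window
}

class FixedWindowRateLimiter {
  private readonly config: RateLimitConfig;
  private readonly windows = new Map<string, WindowState>();

  constructor(config: RateLimitConfig) {
    this.config = config;
  }

  // Returns true if the request is allowed, false if the client is over its limit.
  allow(clientId: string, now: number = Date.now()): boolean {
    if (!this.config.enabled) return true;

    let state = this.windows.get(clientId);
    if (state === undefined || now - state.windowStart >= this.config.windowMs) {
      // First request from this client, or the previous window has expired.
      state = { windowStart: now, count: 0 };
      this.windows.set(clientId, state);
    }
    if (state.count >= this.config.maxRequests) return false;
    state.count += 1;
    return true;
  }
}

const limiter = new FixedWindowRateLimiter({ windowMs: 60000, maxRequests: 100, enabled: true });
console.log(limiter.allow("client-a")); // true (first request in a fresh window)
```

A fixed window is the simplest option but lets a client send up to twice `maxRequests` in a short burst that straddles a window boundary. A sliding window or token bucket smooths that out at the cost of more bookkeeping.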

🔐 Credential Management

The server validates credentials and helps protect them from accidental exposure.

Sensitive File Protection

⚠️ CRITICAL: Configuration files containing sensitive information must never be committed to version control.

Protected Files:

Security Measures:

  1. Git Ignore Protection: Sensitive files are listed in .gitignore
  2. Template System: Use config/server.json.template as a reference
  3. History Cleanup: If a sensitive file is accidentally committed, use BFG Repo-Cleaner to remove it from history

Emergency: Removing Sensitive Files from Git History

If sensitive files were accidentally committed and pushed to a repository:

  1. Install BFG Repo-Cleaner:
    # macOS
    brew install bfg
       
    # Or download from: https://rtyley.github.io/bfg-repo-cleaner/
    
  2. Remove the file from the branch tip (BFG leaves the latest commit untouched by default):
    git rm -f config/server.json
    git commit -m "Remove sensitive configuration file"
    
  3. Clean entire Git history:
    # Remove all instances of the file from history
    bfg --delete-files server.json
       
    # Clean up the repository
    git reflog expire --expire=now --all && git gc --prune=now --aggressive
    
  4. Force push to remote (⚠️ DESTRUCTIVE OPERATION):
    # Push cleaned main branch
    git push --force origin main
       
    # Push all cleaned branches
    git push --force origin --all
    
  5. Post-cleanup actions:
    • Rotate all compromised credentials immediately
    • Update API keys and service account keys
    • Notify team members to re-clone the repository
    • Monitor for any unauthorized access

⚠️ Important Notes:

Configuration File Setup

  1. Copy the template:
    cp config/server.json.template config/server.json
    
  2. Edit with your credentials:
    {
      "projectId": "your-actual-project-id",
      "region": "us-central1",
      "authentication": {
        "serviceAccountKeyPath": "/secure/path/to/your-key.json",
        "impersonateServiceAccount": "your-sa@project.iam.gserviceaccount.com"
      }
    }
    
  3. Verify protection:
    # Ensure the file is ignored
    git status  # Should not list config/server.json at all
    git check-ignore -v config/server.json  # Prints the matching .gitignore rule
    

Service Account Key Validation

Best Practices

  1. Use Service Account Impersonation
    {
      "authentication": {
        "impersonateServiceAccount": "dataproc-sa@project.iam.gserviceaccount.com",
        "fallbackKeyPath": "/secure/path/to/source-key.json",
        "preferImpersonation": true
      }
    }
    
  2. Secure Key Storage
    # Set restrictive permissions
    chmod 600 /path/to/service-account-key.json
    chown dataproc-user:dataproc-group /path/to/service-account-key.json
    
  3. Regular Key Rotation
    • Rotate keys every 90 days
    • Monitor key age with built-in warnings
    • Use automated rotation where possible
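
The 90-day rotation guidance can be checked mechanically. The sketch below computes a key's age and classifies it. The function name `checkKeyAge`, the status labels, and the 14-day warning lead time are assumptions for illustration; the server's built-in warnings may use different thresholds.

```typescript
// Hypothetical helper. The 90-day limit comes from the guidance above;
// the 14-day warning lead time is an assumption.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type KeyAgeStatus = "ok" | "warn" | "expired";

function checkKeyAge(
  createdAt: Date,
  now: Date = new Date(),
  maxAgeDays: number = 90,
  warnBeforeDays: number = 14,
): { ageDays: number; status: KeyAgeStatus } {
  // Whole days elapsed since the key was created.
  const ageDays = Math.floor((now.getTime() - createdAt.getTime()) / MS_PER_DAY);

  let status: KeyAgeStatus = "ok";
  if (ageDays >= maxAgeDays) {
    status = "expired";
  } else if (ageDays >= maxAgeDays - warnBeforeDays) {
    status = "warn";
  }
  return { ageDays, status };
}

const keyStatus = checkKeyAge(
  new Date("2025-01-01T00:00:00Z"),
  new Date("2025-03-20T00:00:00Z"),
);
console.log(keyStatus.ageDays, keyStatus.status); // 78 warn
```

The creation timestamp would normally come from the key's metadata as reported by IAM rather than from the file's modification time, which changes whenever the file is copied.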

📊 Audit Logging

All security-relevant events are logged for monitoring and compliance.

Logged Events

Log Format

{
  "timestamp": "2025-05-29T22:30:00.000Z",
  "event": "Input validation failed",
  "details": {
    "tool": "start_dataproc_cluster",
    "error": "Invalid project ID format",
    "clientId": "[REDACTED]"
  },
  "severity": "warn"
}
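
An entry in this format could be assembled as follows. This is a sketch rather than the server's logging code: `buildAuditEntry` and the `SENSITIVE_KEYS` list are hypothetical names, and redacting by key name is only one possible approach.

```typescript
// Hypothetical sketch of building an entry in the format shown above.
type Severity = "info" | "warn" | "error";

interface AuditLogEntry {
  timestamp: string; // ISO 8601, UTC
  event: string;
  details: Record<string, unknown>;
  severity: Severity;
}

// Assumed list of detail keys whose values must never reach the log.
const SENSITIVE_KEYS = new Set(["clientId", "privateKey", "token"]);

function buildAuditEntry(
  event: string,
  details: Record<string, unknown>,
  severity: Severity,
  now: Date = new Date(),
): AuditLogEntry {
  // Copy the details, replacing sensitive values with a fixed marker.
  const redacted: Record<string, unknown> = {};
  for (const key of Object.keys(details)) {
    redacted[key] = SENSITIVE_KEYS.has(key) ? "[REDACTED]" : details[key];
  }
  return { timestamp: now.toISOString(), event, details: redacted, severity };
}

const auditEntry = buildAuditEntry(
  "Input validation failed",
  { tool: "start_dataproc_cluster", error: "Invalid project ID format", clientId: "client-123" },
  "warn",
  new Date("2025-05-29T22:30:00.000Z"),
);
console.log(JSON.stringify(auditEntry, null, 2));
```

Redacting at the point where the entry is built, rather than when it is written, means no downstream sink ever sees the raw value.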

🔍 Threat Detection

The server automatically detects suspicious request patterns.

Security Configuration

Environment Variables

# Security settings
SECURITY_RATE_LIMIT_ENABLED=true
SECURITY_RATE_LIMIT_WINDOW=60000
SECURITY_RATE_LIMIT_MAX=100
SECURITY_AUDIT_LOG_LEVEL=info
SECURITY_CREDENTIAL_VALIDATION=strict
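
As an illustration of how these variables might be read, the sketch below parses them into a typed object and rejects malformed numbers. The `loadSecurityConfig` function, the `SecurityEnvConfig` shape, and the defaults are assumptions that mirror the values shown above; they are not the server's actual loader.

```typescript
// Hypothetical loader for the variables listed above; defaults are assumptions.
interface SecurityEnvConfig {
  rateLimitEnabled: boolean;
  rateLimitWindowMs: number;
  rateLimitMax: number;
  auditLogLevel: string;
  credentialValidation: string;
}

type Env = Record<string, string | undefined>;

function parsePositiveInt(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadSecurityConfig(env: Env): SecurityEnvConfig {
  return {
    rateLimitEnabled: (env.SECURITY_RATE_LIMIT_ENABLED ?? "true").toLowerCase() === "true",
    rateLimitWindowMs: parsePositiveInt(env.SECURITY_RATE_LIMIT_WINDOW, 60000, "SECURITY_RATE_LIMIT_WINDOW"),
    rateLimitMax: parsePositiveInt(env.SECURITY_RATE_LIMIT_MAX, 100, "SECURITY_RATE_LIMIT_MAX"),
    auditLogLevel: env.SECURITY_AUDIT_LOG_LEVEL ?? "info",
    credentialValidation: env.SECURITY_CREDENTIAL_VALIDATION ?? "strict",
  };
}

const securityConfig = loadSecurityConfig({
  SECURITY_RATE_LIMIT_ENABLED: "true",
  SECURITY_RATE_LIMIT_WINDOW: "60000",
  SECURITY_RATE_LIMIT_MAX: "100",
});
console.log(securityConfig.rateLimitMax); // 100
```

In a Node.js process the function would be called as `loadSecurityConfig(process.env)`. Failing fast on a malformed value is a deliberate choice here: a typo should stop startup rather than silently fall back to a default.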

Configuration File

{
  "security": {
    "enableRateLimiting": true,
    "maxRequestsPerMinute": 100,
    "enableInputValidation": true,
    "sanitizeCredentials": true,
    "auditLogLevel": "info",
    "enableThreatDetection": true,
    "secureHeaders": {
      "enabled": true,
      "customHeaders": {}
    }
  }
}

Hardening Checklist

✅ Basic Security

✅ Advanced Security

✅ Production Security

Monitoring and Alerting

Key Metrics to Monitor

  1. Authentication Failures
    • Failed service account validations
    • Invalid credential attempts
    • Permission denied errors
  2. Rate Limiting Events
    • Clients hitting rate limits
    • Unusual traffic patterns
    • Potential abuse attempts
  3. Input Validation Failures
    • Malformed requests
    • Injection attempt patterns
    • Suspicious input patterns
  4. System Health
    • Error rates by tool
    • Response times
    • Resource utilization

Sample Alerts

# Example Prometheus alerts
groups:
  - name: dataproc-mcp-security
    rules:
      - alert: HighAuthenticationFailures
        # rate() is per second: > 0.1/s averages to more than 6 failures per minute
        expr: rate(dataproc_auth_failures_total[5m]) > 0.1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High authentication failure rate"
          
      - alert: RateLimitViolations
        # rate() is per second: > 0.05/s averages to more than 3 violations per minute
        expr: rate(dataproc_rate_limit_violations_total[5m]) > 0.05
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Rate limit violations detected"

Incident Response

Security Incident Types

  1. Credential Compromise
    • Immediately rotate affected keys
    • Review audit logs for unauthorized access
    • Update access controls
  2. Injection Attacks
    • Block suspicious clients
    • Review and strengthen input validation
    • Analyze attack patterns
  3. Rate Limit Abuse
    • Identify and block abusive clients
    • Adjust rate limits if necessary
    • Investigate traffic patterns

Response Procedures

  1. Immediate Response
    • Isolate affected systems
    • Preserve evidence (logs, configurations)
    • Notify security team
  2. Investigation
    • Analyze audit logs
    • Identify attack vectors
    • Assess impact and scope
  3. Recovery
    • Apply security patches
    • Update configurations
    • Restore normal operations
  4. Post-Incident
    • Document lessons learned
    • Update security procedures
    • Implement additional controls

Compliance Considerations

Data Protection

Regulatory Requirements

Audit Requirements

Security Updates

Keeping Secure

  1. Regular Updates
    • Update dependencies regularly
    • Apply security patches promptly
    • Monitor security advisories
  2. Vulnerability Scanning
    • Automated dependency scanning
    • Container image scanning
    • Infrastructure scanning
  3. Security Testing
    • Regular penetration testing
    • Code security reviews
    • Configuration audits

Support and Resources

Getting Help

Additional Resources


Remember: Security is an ongoing process, not a one-time setup. Regularly review and update your security configurations as threats evolve.