How to Monitor Proxy Quality & Performance

Complete guide to effectively monitoring proxy quality and performance. Learn measurement techniques for response time, success rates, and IP quality with automated tool implementation examples.

Proxy quality monitoring requires tracking three key metrics: success rate, response time, and IP quality. Use automated tools for continuous monitoring with automatic removal of underperforming proxies. Implement efficient quality management with Bright Data's monitoring API and Python scripts.

The Critical Importance of Proxy Quality Monitoring

Proxy service quality directly impacts web scraping project success. As explained in our Ultimate Guide to Proxy Services & Web Scraping, without proper monitoring systems, poor-quality proxies can significantly reduce data collection efficiency.

[Image: Proxy quality monitoring system overview]

Key Metrics to Monitor

1. Success Rate

Definition: The percentage of requests that complete successfully, out of all requests sent

Calculation Formula:

Success Rate = (Successful Requests / Total Requests) × 100

Recommended Standards:

  • Residential Proxies: 95%+
  • Datacenter Proxies: 98%+
  • Mobile Proxies: 90%+
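
As a minimal sketch, the formula and per-type standards above translate directly into code (the function name and threshold mapping are illustrative, not from a specific library):

SUCCESS_RATE_STANDARDS = {'residential': 95.0, 'datacenter': 98.0, 'mobile': 90.0}

def meets_success_standard(successful: int, total: int, proxy_type: str) -> bool:
    """Apply the success-rate formula and compare against the per-type standard"""
    success_rate = successful / total * 100 if total else 0.0
    return success_rate >= SUCCESS_RATE_STANDARDS[proxy_type]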

2. Response Time

Measurement Items:

  • Average response time
  • 95th percentile time
  • Maximum response time

Recommended Standards:

  • Residential Proxies: Under 3 seconds
  • Datacenter Proxies: Under 1 second
  • Mobile Proxies: Under 5 seconds
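
All three measurement items can be derived from a list of raw timings using only the standard library; a minimal sketch (the sample values are illustrative):

import statistics

response_times = [0.8, 1.2, 0.9, 3.4, 1.1]  # seconds; illustrative samples

avg_time = statistics.mean(response_times)
p95_time = statistics.quantiles(response_times, n=20)[18]  # 19 cut points; index 18 is the 95th percentile
max_time = max(response_times)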

3. IP Quality

Evaluation Elements:

  • IP reputation
  • Blacklist status
  • Geographic accuracy
  • Anonymity level
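
Of these, blacklist status is straightforward to check programmatically. A minimal sketch using a public DNSBL (zen.spamhaus.org is used here as an example; heavy or commercial querying is subject to the list operator's terms, and this handles IPv4 only):

import socket

def is_blacklisted(ip: str, dnsbl: str = 'zen.spamhaus.org') -> bool:
    """Query a DNS blacklist: listed IPs resolve to an answer, clean IPs return NXDOMAIN"""
    query = '.'.join(reversed(ip.split('.'))) + '.' + dnsbl
    try:
        socket.gethostbyname(query)
        return True   # any answer means the IP is listed
    except socket.gaierror:
        return False  # NXDOMAIN: not listed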

Building a Monitoring System

Basic Monitoring System

import time
import requests
import statistics
from datetime import datetime
from dataclasses import dataclass
from typing import List, Dict

@dataclass
class ProxyMetrics:
    proxy_id: str
    success_rate: float
    avg_response_time: float
    p95_response_time: float
    error_count: int
    last_check: datetime

class ProxyMonitor:
    def __init__(self, proxy_list: List[str]):
        self.proxy_list = proxy_list
        self.metrics: Dict[str, ProxyMetrics] = {}
        self.test_urls = [
            'https://httpbin.org/ip',
            'https://httpbin.org/headers',
            'https://httpbin.org/user-agent'
        ]
    
    def test_proxy(self, proxy: str, timeout: int = 10) -> Dict:
        """Test individual proxy"""
        results = []
        
        for url in self.test_urls:
            start_time = time.time()
            try:
                response = requests.get(
                    url, 
                    proxies={'http': proxy, 'https': proxy},
                    timeout=timeout
                )
                end_time = time.time()
                
                results.append({
                    'success': response.status_code == 200,
                    'response_time': end_time - start_time,
                    'status_code': response.status_code
                })
            except Exception as e:
                results.append({
                    'success': False,
                    'response_time': timeout,
                    'error': str(e)
                })
        
        return self.calculate_metrics(results)
    
    def calculate_metrics(self, results: List[Dict]) -> Dict:
        """Calculate metrics from test results"""
        successful = [r for r in results if r['success']]
        response_times = [r['response_time'] for r in results]
        
        return {
            'success_rate': len(successful) / len(results) * 100,
            'avg_response_time': statistics.mean(response_times),
            'p95_response_time': statistics.quantiles(response_times, n=20)[18] if len(response_times) > 1 else 0,  # n=20 gives 19 cut points; index 18 is the 95th percentile
            'error_count': len(results) - len(successful)
        }
    
    def monitor_all_proxies(self) -> Dict[str, ProxyMetrics]:
        """Execute monitoring for all proxies"""
        for proxy in self.proxy_list:
            metrics = self.test_proxy(proxy)
            
            self.metrics[proxy] = ProxyMetrics(
                proxy_id=proxy,
                success_rate=metrics['success_rate'],
                avg_response_time=metrics['avg_response_time'],
                p95_response_time=metrics['p95_response_time'],
                error_count=metrics['error_count'],
                last_check=datetime.now()
            )
        
        return self.metrics
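
A usage sketch (the proxy endpoint is a placeholder):

monitor = ProxyMonitor(['http://user:pass@proxy1.example.com:8000'])

for proxy, m in monitor.monitor_all_proxies().items():
    print(f"{proxy}: {m.success_rate:.1f}% success, {m.avg_response_time:.2f}s avg")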

Advanced Monitoring System

[Image: Advanced proxy monitoring dashboard]

import asyncio
import sqlite3
import aiohttp

class AdvancedProxyMonitor(ProxyMonitor):
    """Inherits test_urls and calculate_metrics from the basic ProxyMonitor above"""

    def __init__(self, proxy_list: List[str], db_path: str = 'proxy_metrics.db'):
        super().__init__(proxy_list)
        self.db_path = db_path
        self.setup_database()
    
    def setup_database(self):
        """Initialize SQLite database"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS proxy_metrics (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                proxy_id TEXT,
                success_rate REAL,
                avg_response_time REAL,
                p95_response_time REAL,
                error_count INTEGER,
                timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
            )
        ''')
        
        conn.commit()
        conn.close()
    
    async def test_proxy_async(self, session: aiohttp.ClientSession, proxy: str) -> Dict:
        """Asynchronous proxy testing"""
        results = []
        
        for url in self.test_urls:
            start_time = time.time()
            try:
                async with session.get(
                    url, 
                    proxy=proxy,
                    timeout=aiohttp.ClientTimeout(total=10)
                ) as response:
                    end_time = time.time()
                    
                    results.append({
                        'success': response.status == 200,
                        'response_time': end_time - start_time,
                        'status_code': response.status
                    })
            except Exception as e:
                results.append({
                    'success': False,
                    'response_time': 10,  # count errors at the full 10s timeout
                    'error': str(e)
                })
        
        return self.calculate_metrics(results)
    
    async def monitor_proxies_async(self) -> List[Dict]:
        """Asynchronous monitoring of all proxies"""
        async with aiohttp.ClientSession() as session:
            tasks = [
                self.test_proxy_async(session, proxy) 
                for proxy in self.proxy_list
            ]
            
            results = await asyncio.gather(*tasks)
            
            # Save to database
            self.save_metrics(results)
            
            return results
    
    def save_metrics(self, metrics: List[Dict]):
        """Save metrics to database"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        
        for i, metric in enumerate(metrics):  # gather() preserves order, so index i maps to proxy_list[i]
            cursor.execute('''
                INSERT INTO proxy_metrics (
                    proxy_id, success_rate, avg_response_time, 
                    p95_response_time, error_count
                ) VALUES (?, ?, ?, ?, ?)
            ''', (
                self.proxy_list[i],
                metric['success_rate'],
                metric['avg_response_time'],
                metric['p95_response_time'],
                metric['error_count']
            ))
        
        conn.commit()
        conn.close()
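
Running the async monitor might look like this (the endpoints are placeholders; note that aiohttp's support for fetching HTTPS URLs through an HTTP proxy has varied between versions, so verify against your aiohttp version or test with plain-HTTP endpoints):

monitor = AdvancedProxyMonitor([
    'http://proxy1.example.com:8000',
    'http://proxy2.example.com:8000'
])
results = asyncio.run(monitor.monitor_proxies_async())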

Automation and Alert Features

Threshold-Based Alerts

import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

class ProxyAlertSystem:
    def __init__(self, smtp_config: Dict):
        self.smtp_config = smtp_config
        self.thresholds = {
            'success_rate': 90.0,
            'avg_response_time': 5.0,
            'p95_response_time': 10.0
        }
    
    def check_thresholds(self, metrics: ProxyMetrics) -> List[str]:
        """Check thresholds"""
        alerts = []
        
        if metrics.success_rate < self.thresholds['success_rate']:
            alerts.append(f"Success rate declined: {metrics.success_rate:.2f}%")
        
        if metrics.avg_response_time > self.thresholds['avg_response_time']:
            alerts.append(f"Average response time increased: {metrics.avg_response_time:.2f}s")
        
        if metrics.p95_response_time > self.thresholds['p95_response_time']:
            alerts.append(f"95th percentile response time increased: {metrics.p95_response_time:.2f}s")
        
        return alerts
    
    def send_alert(self, proxy_id: str, alerts: List[str]):
        """Send alert email"""
        msg = MIMEMultipart()
        msg['From'] = self.smtp_config['from_email']
        msg['To'] = self.smtp_config['to_email']
        msg['Subject'] = f"Proxy Quality Alert: {proxy_id}"
        
        body = f"""
        The following issues were detected with proxy {proxy_id}:
        
        {chr(10).join(alerts)}
        
        Please investigate.
        """
        
        msg.attach(MIMEText(body, 'plain'))
        
        try:
            server = smtplib.SMTP(self.smtp_config['smtp_server'], self.smtp_config['port'])
            server.starttls()
            server.login(self.smtp_config['username'], self.smtp_config['password'])
            server.send_message(msg)
            server.quit()
        except Exception as e:
            print(f"Email sending error: {e}")

Automatic Proxy Removal System

class ProxyPoolManager:
    def __init__(self, initial_proxies: List[str]):
        self.active_proxies = set(initial_proxies)
        self.quarantine_proxies = set()
        self.monitor = AdvancedProxyMonitor(list(self.active_proxies))
    
    def evaluate_proxy_health(self, proxy: str, metrics: ProxyMetrics) -> bool:
        """Evaluate proxy health"""
        if metrics.success_rate < 80:
            return False
        if metrics.avg_response_time > 10:
            return False
        if metrics.error_count > 5:
            return False
        
        return True
    
    def update_proxy_pool(self):
        """Update proxy pool"""
        current_metrics = self.monitor.monitor_all_proxies()
        
        for proxy, metrics in current_metrics.items():
            if not self.evaluate_proxy_health(proxy, metrics):
                if proxy in self.active_proxies:
                    self.active_proxies.remove(proxy)
                    self.quarantine_proxies.add(proxy)
                    print(f"Quarantined proxy: {proxy}")
            else:
                if proxy in self.quarantine_proxies:
                    self.quarantine_proxies.remove(proxy)
                    self.active_proxies.add(proxy)
                    print(f"Restored proxy: {proxy}")
    
    def get_healthy_proxies(self) -> List[str]:
        """Get list of healthy proxies"""
        return list(self.active_proxies)
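
In operation, the pool manager is typically run on a fixed interval; a sketch (the interval and endpoints are illustrative):

manager = ProxyPoolManager([
    'http://proxy1.example.com:8000',
    'http://proxy2.example.com:8000'
])

while True:
    manager.update_proxy_pool()
    print(f"Healthy proxies: {len(manager.get_healthy_proxies())}")
    time.sleep(600)  # re-evaluate every 10 minutes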

Practical Monitoring Recipes

Regional Quality Monitoring

class RegionalProxyMonitor:
    def __init__(self, proxy_config: Dict[str, List[str]]):
        self.proxy_config = proxy_config  # {'US': [proxy1, proxy2], 'JP': [proxy3]}
        self.regional_metrics = {}
    
    def test_regional_access(self, region: str, proxies: List[str]) -> Dict:
        """Regional access testing"""
        test_sites = {
            'US': ['https://www.google.com', 'https://www.amazon.com'],
            'JP': ['https://www.google.co.jp', 'https://www.amazon.co.jp'],
            'EU': ['https://www.google.de', 'https://www.amazon.de']
        }
        
        results = []
        for proxy in proxies:
            for site in test_sites.get(region, []):
                try:
                    response = requests.get(
                        site, 
                        proxies={'http': proxy, 'https': proxy},
                        timeout=10
                    )
                    
                    # Regional restriction check
                    is_region_accessible = self.check_region_access(response, region)
                    
                    results.append({
                        'proxy': proxy,
                        'site': site,
                        'accessible': is_region_accessible,
                        'response_time': response.elapsed.total_seconds()
                    })
                except Exception as e:
                    results.append({
                        'proxy': proxy,
                        'site': site,
                        'accessible': False,
                        'error': str(e)
                    })
        
        return self.analyze_regional_results(results)
    
    def check_region_access(self, response: requests.Response, region: str) -> bool:
        """Check regional access availability"""
        # Check for common regional restriction error messages
        error_indicators = [
            'not available in your country',
            'access denied',
            'geographic restriction',
            'region not supported'
        ]
        
        content = response.text.lower()
        return not any(indicator in content for indicator in error_indicators)
    
    def analyze_regional_results(self, results: List[Dict]) -> Dict:
        """Summarize accessibility across all proxy/site checks"""
        accessible = [r for r in results if r.get('accessible')]
        return {
            'total_checks': len(results),
            'accessible_count': len(accessible),
            'accessibility_rate': len(accessible) / len(results) * 100 if results else 0.0
        }
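
A usage sketch (the endpoints are placeholders):

regional_monitor = RegionalProxyMonitor({
    'US': ['http://us-proxy.example.com:8000'],
    'JP': ['http://jp-proxy.example.com:8000']
})

us_summary = regional_monitor.test_regional_access('US', regional_monitor.proxy_config['US'])
print(us_summary)  # e.g. {'total_checks': 2, 'accessible_count': 2, 'accessibility_rate': 100.0}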

Performance Optimization

[Image: Proxy performance optimization graph]

class PerformanceOptimizer:
    def __init__(self, proxy_list: List[str]):
        self.proxy_list = proxy_list
        self.performance_history = {}
    
    def benchmark_proxies(self, test_duration: int = 300) -> Dict:
        """Benchmark proxies"""
        results = {}
        
        for proxy in self.proxy_list:
            start_time = time.time()
            request_count = 0
            successful_requests = 0
            total_response_time = 0
            
            while time.time() - start_time < test_duration:
                try:
                    response = requests.get(
                        'https://httpbin.org/ip',
                        proxies={'http': proxy, 'https': proxy},
                        timeout=10
                    )
                    
                    request_count += 1
                    if response.status_code == 200:
                        successful_requests += 1
                        total_response_time += response.elapsed.total_seconds()
                    
                    time.sleep(0.1)  # 100ms interval
                    
                except Exception:
                    request_count += 1
            
            results[proxy] = {
                'requests_per_second': request_count / test_duration,
                'success_rate': successful_requests / request_count * 100 if request_count else 0.0,
                'avg_response_time': total_response_time / successful_requests if successful_requests > 0 else 0
            }
        
        return results
    
    def rank_proxies(self, benchmark_results: Dict) -> List[str]:
        """Rank proxies"""
        def calculate_score(metrics: Dict) -> float:
            """Calculate composite score"""
            return (
                metrics['success_rate'] * 0.4 +
                (1 / metrics['avg_response_time'] if metrics['avg_response_time'] > 0 else 0) * 0.3 +
                metrics['requests_per_second'] * 0.3
            )
        
        ranked = sorted(
            benchmark_results.items(),
            key=lambda x: calculate_score(x[1]),
            reverse=True
        )
        
        return [proxy for proxy, _ in ranked]
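
Benchmarking and ranking combine naturally; a sketch using a short 60-second run to keep the example quick (the endpoints are placeholders):

optimizer = PerformanceOptimizer([
    'http://proxy1.example.com:8000',
    'http://proxy2.example.com:8000'
])

benchmark_results = optimizer.benchmark_proxies(test_duration=60)
ranked = optimizer.rank_proxies(benchmark_results)
print(f"Best performing proxy: {ranked[0]}")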

Leveraging Bright Data Monitoring API

class BrightDataMonitor:
    def __init__(self, api_token: str):
        self.api_token = api_token
        self.base_url = 'https://brightdata.com/api/v1'
    
    def get_zone_statistics(self, zone_id: str) -> Dict:
        """Get zone statistics"""
        headers = {
            'Authorization': f'Bearer {self.api_token}',
            'Content-Type': 'application/json'
        }
        
        response = requests.get(
            f'{self.base_url}/zones/{zone_id}/statistics',
            headers=headers
        )
        
        if response.status_code == 200:
            return response.json()
        else:
            raise Exception(f"API Error: {response.status_code}")
    
    def get_usage_metrics(self, zone_id: str, period: str = '24h') -> Dict:
        """Get usage metrics"""
        headers = {
            'Authorization': f'Bearer {self.api_token}',
            'Content-Type': 'application/json'
        }
        
        params = {'period': period}
        
        response = requests.get(
            f'{self.base_url}/zones/{zone_id}/usage',
            headers=headers,
            params=params
        )
        
        return response.json() if response.status_code == 200 else {}
    
    def analyze_bright_data_metrics(self, zone_id: str) -> Dict:
        """Analyze Bright Data metrics"""
        stats = self.get_zone_statistics(zone_id)
        usage = self.get_usage_metrics(zone_id)
        
        return {
            'success_rate': stats.get('success_rate', 0),
            'avg_response_time': stats.get('avg_response_time', 0),
            'bandwidth_usage': usage.get('bandwidth_mb', 0),
            'request_count': usage.get('requests', 0),
            'cost_per_gb': usage.get('cost_per_gb', 0)
        }
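
A usage sketch (the token and zone ID are placeholders; verify the endpoint paths against Bright Data's current API documentation before relying on them):

bd_monitor = BrightDataMonitor(api_token='YOUR_API_TOKEN')
summary = bd_monitor.analyze_bright_data_metrics(zone_id='your_zone_id')
print(summary)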

Frequently Asked Questions

Q: What's the appropriate monitoring frequency?
A: For critical projects, 5-10 minute intervals are recommended; for general use, 30-minute to 1-hour intervals are suitable.

Q: How should temporarily underperforming proxies be handled?
A: An effective approach is to quarantine a proxy after 3 consecutive failures, then retest it after 1 hour to decide whether to restore it; a minimal sketch follows.
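
A minimal sketch of that quarantine rule (the counter store and function name are illustrative):

failure_counts: Dict[str, int] = {}

def should_quarantine(proxy: str, success: bool) -> bool:
    """Track consecutive failures; flag a proxy after 3 in a row"""
    failure_counts[proxy] = 0 if success else failure_counts.get(proxy, 0) + 1
    return failure_counts[proxy] >= 3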

Q: How long should monitoring data be retained?
A: We recommend keeping detailed data for 1 month and summary data for 1 year.

Q: What are good baseline alert thresholds?
A: Use a 90% success rate, a 5-second average response time, and a 10-second 95th-percentile response time as baselines, adjusting for your specific use case.

Q: Can multiple proxy providers be monitored simultaneously?
A: Yes, the same monitoring system can manage multiple providers in an integrated manner.

Monitoring System Operation Tips

Continuous Improvement

  1. Regular threshold review: Set optimal thresholds based on historical data
  2. New feature testing: Expand monitoring scope and add new metrics
  3. Cost optimization: Apply techniques from Bright Data Cost Optimization

Troubleshooting

  • Sudden response time degradation: Check for proxy provider outages
  • Declining success rates: Investigate target site changes or access restrictions
  • Regional access failures: Verify changes in geographic restrictions

Conclusion

Effective proxy quality monitoring systems are essential for building stable web scraping environments. By implementing the techniques presented in this article and customizing them for your project requirements, you can achieve high-quality data collection systems.

Through continuous monitoring and incremental improvement, you can maintain proxy quality and keep data collection efficient.
