AI Categorization

The Email Assistant uses Google Gemini AI to intelligently categorize incoming emails, helping you focus on what matters most.

How It Works

Email Categories

| Category | Description | Examples |
|---|---|---|
| Need-Action | Requires your response | Meeting invites, direct questions |
| FYI | Informational only | Status updates, notifications |
| Newsletter | Subscriptions | Daily digests, weekly updates |
| Promotional | Marketing content | Sales, offers, advertisements |
| Social | Social networks | LinkedIn, Twitter notifications |

Categorization Logic

Input Processing

The system extracts key information from each email:

email_data = {
    'subject': email['subject'],
    'from': email['from'],
    'snippet': email['snippet'],  # First 100 chars
    'date': email['date'],
}
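
The extraction step can be sketched as a small helper. This is a minimal sketch, assuming each message arrives as a plain dict; the helper name `extract_email_data` and the `SNIPPET_LIMIT` constant are illustrative and not taken from the Email Assistant's code. The 100-character limit follows the "First 100 chars" comment above.

```python
SNIPPET_LIMIT = 100  # assumption: mirrors the "First 100 chars" comment


def extract_email_data(email: dict) -> dict:
    """Keep only the fields used for categorization (hypothetical helper)."""
    return {
        'subject': email.get('subject') or '',
        'from': email.get('from') or '',
        # Truncate the preview so the prompt stays short
        'snippet': (email.get('snippet') or '')[:SNIPPET_LIMIT],
        'date': email.get('date') or '',
    }
```

Using `.get()` with an empty-string fallback keeps the helper from raising on messages with a missing subject or snippet, which ties in with the "complete metadata" tip later on this page.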

AI Prompt

CATEGORIZATION_PROMPT = """
Analyze this email and categorize it:

From: {from_address}
Subject: {subject}
Preview: {snippet}

Categories:
1. NEED_ACTION - Requires response or action
2. FYI - Informational, no action needed
3. NEWSLETTER - Subscription content
4. PROMOTIONAL - Marketing/sales
5. SOCIAL - Social network notifications

Return ONLY the category name.
"""

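Note that the template uses the placeholder `{from_address}`, while the extracted data stores the sender under the key `'from'`. A minimal sketch of filling the template, assuming `str.format` is used; the function name `build_prompt` and the shortened `demo_template` are illustrative only:

```python
def build_prompt(template: str, email_data: dict) -> str:
    """Fill a categorization template from extracted email fields (hypothetical helper)."""
    return template.format(
        from_address=email_data['from'],  # key 'from' maps to {from_address}
        subject=email_data['subject'],
        snippet=email_data['snippet'],
    )


# Shortened stand-in for CATEGORIZATION_PROMPT, for illustration only
demo_template = "From: {from_address}\nSubject: {subject}\nPreview: {snippet}"

demo_prompt = build_prompt(
    demo_template,
    {'from': 'a@example.com', 'subject': 'Lunch?', 'snippet': 'Are you free today?'},
)
```

Because `str.format` only interprets braces in the template, curly braces inside an email's subject or snippet are passed through unchanged.
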
Response Parsing

def parse_category(response: str) -> str:
    """Parse Gemini response to category."""
    response = response.strip().upper()

    categories = {
        'NEED_ACTION': 'Need-Action',
        'FYI': 'FYI',
        'NEWSLETTER': 'Newsletter',
        'PROMOTIONAL': 'Promotional',
        'SOCIAL': 'Social',
    }

    # Substring match, so extra text around the category name is tolerated
    for key, value in categories.items():
        if key in response:
            return value

    return 'FYI'  # Default fallback

Gemini Configuration

Model Settings

{
  "api_settings": {
    "gemini_model": "gemini-2.5-flash-lite",
    "requests_per_minute": 30,
    "max_retries": 3,
    "timeout_seconds": 30
  }
}
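
One way to read these settings in Python is sketched below. It assumes the JSON above is available as a string; the `ApiSettings` dataclass and `load_api_settings` function are hypothetical names, and the defaults simply repeat the values shown above.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ApiSettings:
    """Hypothetical container; defaults copy the documented values."""
    gemini_model: str = "gemini-2.5-flash-lite"
    requests_per_minute: int = 30
    max_retries: int = 3
    timeout_seconds: int = 30


def load_api_settings(raw_json: str) -> ApiSettings:
    """Parse the "api_settings" object, using defaults for missing keys."""
    section = json.loads(raw_json).get("api_settings", {})
    allowed = {f.name for f in fields(ApiSettings)}
    # Ignore unknown keys rather than failing on them
    known = {k: v for k, v in section.items() if k in allowed}
    return ApiSettings(**known)
```

For example, `load_api_settings('{"api_settings": {"requests_per_minute": 10}}')` overrides only the request rate and leaves the other fields at their defaults.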

Rate Limiting

The system respects API rate limits:

import time


class RateLimiter:
    def __init__(self, requests_per_minute: int = 30):
        self.requests_per_minute = requests_per_minute
        self.request_times = []  # Timestamps of requests in the last minute

    def wait_if_needed(self):
        """Wait if rate limit would be exceeded."""
        now = time.time()
        minute_ago = now - 60

        # Remove requests older than 60 seconds
        self.request_times = [t for t in self.request_times if t > minute_ago]

        if len(self.request_times) >= self.requests_per_minute:
            # Sleep until the oldest request leaves the 60-second window
            sleep_time = self.request_times[0] - minute_ago
            time.sleep(sleep_time)

        # Record the actual send time, which is later than `now` if we slept
        self.request_times.append(time.time())

Caching

Categorization results are cached to minimize API calls:

Cache Configuration

{
  "cache_settings": {
    "enabled": true,
    "max_cached_emails": 30,
    "cache_expiry_hours": 24
  }
}
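
To illustrate how `max_cached_emails` and `cache_expiry_hours` could interact, here is a minimal in-memory sketch. The class name `CategoryCache`, keying by an email ID, and evicting the oldest-stored entry when the cache is full are all assumptions; the page does not describe how the real cache is keyed, stored, or evicted.

```python
import time
from collections import OrderedDict
from typing import Optional


class CategoryCache:
    """Hypothetical cache: email ID -> category, with size and age limits."""

    def __init__(self, max_cached_emails: int = 30, cache_expiry_hours: float = 24):
        self.max_entries = max_cached_emails
        self.ttl_seconds = cache_expiry_hours * 3600
        self._entries = OrderedDict()  # email_id -> (stored_at, category)

    def get(self, email_id: str, now: Optional[float] = None) -> Optional[str]:
        """Return the cached category, or None if missing or expired."""
        now = time.time() if now is None else now
        entry = self._entries.get(email_id)
        if entry is None:
            return None
        stored_at, category = entry
        if now - stored_at > self.ttl_seconds:
            del self._entries[email_id]  # Drop the expired entry
            return None
        return category

    def put(self, email_id: str, category: str, now: Optional[float] = None) -> None:
        """Store a category, evicting the oldest entries beyond the size limit."""
        now = time.time() if now is None else now
        self._entries[email_id] = (now, category)
        self._entries.move_to_end(email_id)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
```

The optional `now` argument exists only so the expiry logic can be exercised without waiting; callers would normally omit it.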

Cache Benefits

| Metric | First Run | Cached Run |
|---|---|---|
| API Calls | 10-15 | 0-3 |
| Time | 13-20 sec | 5-8 sec |
| Cost | ~$0.10 | ~$0.00 |

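Error Handling

Retry Logic

import logging
import time

logger = logging.getLogger(__name__)

# call_gemini_api, RateLimitError, and APIError are not defined in this
# snippet; they are assumed to come from the project's Gemini client code.


def categorize_with_retry(email: dict, max_retries: int = 3) -> str:
    """Categorize email with retry on failure."""
    for attempt in range(max_retries):
        try:
            return call_gemini_api(email)
        except RateLimitError:
            wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s
            time.sleep(wait_time)
        except APIError as e:
            logger.error(f"API error: {e}")
            if attempt == max_retries - 1:
                return 'FYI'  # Default on failure

    return 'FYI'  # All retries exhausted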

Fallback Behavior

If the Gemini API call fails:

  1. Log the error
  2. Return default category (FYI)
  3. Continue processing other emails
  4. Report error in metrics
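
The four steps above can be sketched as a batch loop. This is an illustration under assumptions, not the Email Assistant's actual loop: the function name `categorize_all`, the `'id'` key on each email, and returning the error count to the caller (rather than writing to a metrics object) are all hypothetical.

```python
import logging
from typing import Callable, Dict, Iterable, Tuple

logger = logging.getLogger(__name__)
DEFAULT_CATEGORY = 'FYI'


def categorize_all(
    emails: Iterable[dict],
    categorize: Callable[[dict], str],
) -> Tuple[Dict[str, str], int]:
    """Categorize every email; a failure on one never stops the batch."""
    results: Dict[str, str] = {}
    error_count = 0
    for email in emails:
        try:
            results[email['id']] = categorize(email)
        except Exception as exc:
            # 1. Log the error
            logger.error("Categorization failed for %s: %s", email['id'], exc)
            # 2. Return the default category
            results[email['id']] = DEFAULT_CATEGORY
            # 4. Count the error so it can be reported in metrics
            error_count += 1
        # 3. The loop continues with the next email either way
    return results, error_count
```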

Accuracy Improvements

Tips for Better Categorization

  1. Complete metadata: Ensure subject and snippet are available
  2. Consistent senders: Known senders improve accuracy
  3. Clean inbox: Reduce spam before processing
  4. Tune prompts: Adjust prompts for your use case

Common Misclassifications

| Situation | Expected | Common Mistake | Fix |
|---|---|---|---|
| Meeting invite | Need-Action | FYI | Check for "invite" keyword |
| Bill reminder | Need-Action | Newsletter | Check sender domain |
| Product update | Newsletter | Promotional | Check subject patterns |
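
The fixes in the table could be applied as rule-based checks that run before (or override) the Gemini result. The sketch below is one possible reading: the function name `rule_based_category`, the example billing domain, and the subject regex are all made up for illustration and would need tuning for a real inbox.

```python
import re
from email.utils import parseaddr
from typing import Optional

# Hypothetical values; replace with domains and patterns from your own mail
BILLING_DOMAINS = {'billing.example.com'}
PRODUCT_UPDATE_PATTERN = re.compile(
    r"\b(?:release notes|what's new|product update)\b", re.IGNORECASE
)


def rule_based_category(message: dict) -> Optional[str]:
    """Return a category if a simple rule matches, else None."""
    subject = message.get('subject') or ''
    # parseaddr handles both "a@b.com" and "Name <a@b.com>" forms
    _, address = parseaddr(message.get('from') or '')
    domain = address.rpartition('@')[2].lower()

    if 'invite' in subject.lower():        # Meeting invite -> Need-Action
        return 'Need-Action'
    if domain in BILLING_DOMAINS:          # Bill reminder -> Need-Action
        return 'Need-Action'
    if PRODUCT_UPDATE_PATTERN.search(subject):  # Product update -> Newsletter
        return 'Newsletter'
    return None  # No rule matched; defer to the AI categorization
```

Returning `None` when no rule matches keeps the rules optional: the caller can fall through to the Gemini result whenever nothing fires.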

Metrics

Track categorization performance:

# Tracked metrics
metrics.record_api_call(
    model='gemini-2.5-flash-lite',
    latency=elapsed_time,
    success=True,
    category=result,
)

View in dashboard:

  • API calls made
  • Response times
  • Category distribution
  • Error rates
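
For a sense of how the four dashboard figures could be derived from `record_api_call` data, here is a minimal in-memory sketch. The class name `CategorizationMetrics` and its `summary` method are hypothetical; only the `record_api_call` keyword arguments are taken from the call shown above, and the real metrics backend may work differently.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CategorizationMetrics:
    """Hypothetical aggregator for the dashboard figures."""
    latencies: List[float] = field(default_factory=list)
    categories: Counter = field(default_factory=Counter)
    calls_by_model: Counter = field(default_factory=Counter)
    errors: int = 0

    def record_api_call(self, model: str, latency: float, success: bool,
                        category: Optional[str] = None) -> None:
        self.calls_by_model[model] += 1
        self.latencies.append(latency)
        if success and category is not None:
            self.categories[category] += 1
        if not success:
            self.errors += 1

    def summary(self) -> dict:
        calls = len(self.latencies)
        return {
            'api_calls': calls,
            'avg_latency': sum(self.latencies) / calls if calls else 0.0,
            'category_distribution': dict(self.categories),
            'error_rate': self.errors / calls if calls else 0.0,
        }
```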