Privacy in the Age of AI Assistants
AI assistants are increasingly woven into our daily routines. We share work documents, personal plans, financial details, and sensitive business information with them. The convenience is real, but so are the privacy implications.
Understanding where your data goes, who can access it, and how long it persists is essential for making informed decisions about which AI tools to trust with your information. This guide covers the full privacy picture - from how conversations are stored, to what model providers do with your data, to how platforms like Finna protect your information at every layer.
Where Your Conversation Data Lives
The Three Storage Layers
When you interact with an AI assistant, your data typically touches three distinct systems:
- The client application - the app or interface you type into (a web browser, WhatsApp, Telegram, etc.)
- The assistant platform - the server that processes your message, manages context, and coordinates tool usage (this is where Moltbot runs)
- The model provider - the API service that performs the actual language model inference (Anthropic, OpenAI, etc.)
Each layer has its own data handling policies, and understanding all three is necessary for a complete privacy picture.
Client-Side Storage
Your messaging client stores conversation history locally on your device and potentially on its own cloud infrastructure. WhatsApp uses end-to-end encryption by default, but may back up messages to iCloud or Google Drive, and those backups are not end-to-end encrypted unless you enable encrypted backups explicitly. Telegram stores regular chats on its servers; only its opt-in secret chats are end-to-end encrypted. Discord retains message history indefinitely.
These client-side storage behaviors are controlled by the messaging platform, not by your AI assistant. Choose your channel based on its native privacy properties if this matters to you.
Platform-Side Storage
The assistant platform - where Moltbot runs - stores conversation context to maintain session continuity. Without this context, the assistant would forget everything between messages and lose the ability to reference earlier parts of a conversation.
On a self-hosted Moltbot instance, this data lives on your server's disk. On Finna, it lives on your gateway's dedicated encrypted volume inside your isolated microVM. In both cases, the data is under your control and can be cleared at any time by resetting sessions.
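On a self-hosted instance, a session reset can be as simple as deleting the stored session files. Here is a minimal TypeScript sketch of the idea; the directory path is an assumption for illustration, since the actual location and format depend on how your Moltbot instance is configured:

```typescript
// Sketch: reset a self-hosted instance by deleting stored session files.
// SESSIONS_DIR is an assumption for illustration - the real location
// and format depend on your installation.
import { promises as fs } from "node:fs";
import * as path from "node:path";
import * as os from "node:os";

const SESSIONS_DIR = path.join(os.homedir(), ".moltbot", "sessions"); // hypothetical path

async function clearAllSessions(): Promise<void> {
  let entries: string[];
  try {
    entries = await fs.readdir(SESSIONS_DIR);
  } catch {
    console.log("No session directory found - nothing to clear.");
    return;
  }
  await Promise.all(
    entries.map((name) =>
      fs.rm(path.join(SESSIONS_DIR, name), { recursive: true, force: true })
    )
  );
  console.log(`Cleared ${entries.length} session(s) from ${SESSIONS_DIR}`);
}

clearAllSessions().catch(console.error);
```

On Finna, the equivalent operation is a session reset from the dashboard, covered under data retention below.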
Model Provider Storage
When the assistant processes your message, it sends the conversation context to a language model provider's API. This is the layer that often generates the most privacy concern, and rightfully so - your data is leaving your infrastructure and entering someone else's system.
Model Provider Data Policies
Anthropic (Claude)
Anthropic's API terms state that they do not use API inputs or outputs to train models by default. API data may be retained for a limited period (typically 30 days) for safety monitoring and abuse prevention, after which it is deleted. This retention period is shorter than many competitors', and the no-training commitment is clear.
OpenAI (GPT)
OpenAI's API data usage policy has evolved over time. Current API terms state that data submitted through the API is not used for training by default. You can verify this in your organization's API settings. Consumer products (like ChatGPT) have different terms, so make sure you are looking at the API-specific policy if you are using GPT through Moltbot.
Choosing Based on Privacy
When selecting a model provider for your Moltbot instance, review their current data handling policies directly. Policies change, so check periodically. Consider:
- Do they train on API data? (Most major providers do not, by default)
- How long do they retain API data?
- Where is the data processed geographically?
- Do they offer zero-retention options for enterprise customers?
- Can you opt out of safety monitoring data retention?
Encryption - At Rest and In Transit
Encryption in Transit
All communication between components should use TLS encryption. This means:
- Your browser to the Finna dashboard: HTTPS (TLS 1.3)
- Dashboard to your gateway: Encrypted via Cloudflare Tunnel
- Gateway to model provider API: HTTPS
- Messaging platform to your gateway: Platform-specific encryption (WhatsApp uses Signal protocol, Telegram uses MTProto, etc.)
If any of these links are unencrypted, your conversation data could be intercepted by network-level attackers. On Finna, all links in the chain use encryption by default, with no option to downgrade.
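As an illustration of what "no option to downgrade" means in practice, here is a minimal TypeScript sketch of an outbound HTTPS call that refuses old TLS versions and invalid certificates. It is a generic example of the technique, not Finna's actual networking code:

```typescript
// Sketch: force TLS 1.2+ and strict certificate checking on an
// outbound HTTPS call, so the connection cannot silently downgrade.
import * as https from "node:https";

const strictAgent = new https.Agent({
  minVersion: "TLSv1.2",    // refuse SSLv3 and TLS 1.0/1.1 outright
  rejectUnauthorized: true, // never accept invalid or self-signed certificates
});

https
  .get("https://api.anthropic.com/", { agent: strictAgent }, (res) => {
    console.log(`TLS handshake ok, HTTP status ${res.statusCode}`);
  })
  .on("error", (err) => console.error("TLS or network failure:", err.message));
```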
Encryption at Rest
Encryption at rest protects data stored on disk. If a hard drive is stolen or a backup is compromised, encrypted data remains unreadable without the encryption key.
On Finna, encryption at rest operates at multiple levels:
- Volume encryption: Fly.io encrypts the underlying storage volumes
- Application-level encryption: Sensitive data like API keys and channel credentials are encrypted with AES-256-GCM before being written to the database
- Per-tenant key derivation: Each tenant's encryption key is derived from a master key using HKDF with tenant-specific context, ensuring that one tenant's data cannot be decrypted using another tenant's key
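To make the application-level layer concrete, here is a minimal TypeScript sketch of AES-256-GCM encryption and decryption as one might apply it before a database write. It illustrates the technique described above, not Finna's actual implementation:

```typescript
// Sketch of application-level encryption with AES-256-GCM.
// `key` must be exactly 32 bytes (256 bits) - see the key
// derivation sketch in the next section.
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Encrypt a secret (e.g. an API key) before writing it to the database.
function encrypt(plaintext: string, key: Buffer): Buffer {
  const iv = randomBytes(12); // fresh 96-bit nonce per encryption, standard for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Store nonce + auth tag + ciphertext together as one value.
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Decrypt a stored value; throws if the data was tampered with.
function decrypt(stored: Buffer, key: Buffer): string {
  const iv = stored.subarray(0, 12);
  const tag = stored.subarray(12, 28);
  const ciphertext = stored.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because GCM is authenticated encryption, any tampering with the stored bytes causes decryption to fail loudly rather than silently returning garbage.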
Key Management
The strength of any encryption system depends on how the keys are managed. A perfectly encrypted database is worthless if the encryption key is stored in a plaintext file on the same server.
Finna stores the master encryption key in Doppler, a secrets management platform with hardware security module backing. Tenant-specific keys are derived at runtime and never stored directly - they are computed from the master key and tenant context each time they are needed.
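The derive-at-runtime pattern fits in a few lines. This is an illustrative sketch of HKDF key derivation, with the tenant ID standing in for whatever tenant-specific context the real system uses:

```typescript
// Sketch of per-tenant key derivation with HKDF. The master key is
// fetched from the secrets manager at runtime; derived keys are
// recomputed on demand and never written to disk.
import { hkdfSync } from "node:crypto";

function deriveTenantKey(masterKey: Buffer, tenantId: string): Buffer {
  // The tenant ID serves as the HKDF "info" parameter (context),
  // binding the derived key to exactly one tenant. Different
  // contexts yield cryptographically unrelated keys.
  return Buffer.from(
    hkdfSync("sha256", masterKey, Buffer.alloc(0), `tenant:${tenantId}`, 32)
  );
}

// Usage (the environment variable name is illustrative):
// const masterKey = Buffer.from(process.env.MASTER_KEY!, "base64");
// const tenantKey = deriveTenantKey(masterKey, "tenant-123");
```

Since HKDF is one-way, compromising a single derived tenant key reveals nothing about the master key or any other tenant's key.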
Finna's Per-Tenant Isolation
Why Isolation Matters for Privacy
Multi-tenant architectures introduce privacy risks beyond encryption. If multiple tenants share the same server process, a vulnerability in the application code could expose one tenant's data to another. Memory inspection, timing attacks, and log pollution are all vectors that encryption alone does not address.
Firecracker MicroVM Isolation
Finna runs each tenant in a separate Firecracker microVM. This provides:
- Memory isolation: Each VM has its own memory space. One tenant's data cannot be read through memory inspection by another tenant's process.
- File system isolation: Each VM has its own root file system and data volume. No shared directories, no shared temp files, no shared logs.
- Process isolation: Processes in one VM cannot see or interact with processes in another VM.
- Network isolation: Each VM has its own network namespace. One tenant cannot sniff another tenant's network traffic.
This level of isolation is stronger than what containers provide and comparable to running on completely separate physical servers.
Data Lifecycle
Understanding when your data is created, how it is used, and when it is deleted is core to privacy:
- Creation: Data enters the system when you send a message or configure your gateway
- Processing: Your message is sent to the model provider for inference, then the response is returned to your channel
- Storage: Conversation context is stored in your gateway's session data on your encrypted volume
- Retention: Session data persists until you clear it or your gateway is deprovisioned
- Deletion: When you deprovision a gateway, the Fly.io app and its volume are deleted entirely. The database records are marked as deleted.
GDPR and Data Subject Rights
Applicability
If you or your users are in the European Economic Area, GDPR applies to how personal data is collected, processed, and stored. AI assistant conversations frequently contain personal data - names, email addresses, preferences, opinions, and more.
Key Rights
GDPR grants data subjects several rights that are relevant to AI assistant usage:
- Right of access: Users can request a copy of all personal data the system holds about them
- Right to erasure: Users can request that their personal data be deleted
- Right to data portability: Users can request their data in a machine-readable format
- Right to be informed: Users should know what data is collected and how it is used
How Finna Supports GDPR Compliance
Finna provides mechanisms to support these rights:
- Data export: API endpoints for exporting user data in structured formats (see the sketch after this list)
- Session management: Users can view and clear conversation sessions at any time
- Gateway deprovisioning: Complete data deletion when a user leaves the platform
- Audit logs: Records of data processing activities for accountability
- Data isolation: Per-tenant VMs make it straightforward to identify and manage one user's data without affecting others
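As a sketch of what exercising the right of access might look like, here is a hypothetical call to a data export endpoint. The base URL, path, and auth scheme are assumptions for illustration - consult the actual API documentation for the real interface (assumes a fetch-capable runtime such as Node 18+):

```typescript
// Hypothetical sketch of a data export request exercising the
// right of access. URL, path, and response shape are assumptions.
const FINNA_API = "https://api.finna.example/v1"; // hypothetical base URL

async function exportUserData(userId: string, token: string): Promise<void> {
  const res = await fetch(`${FINNA_API}/users/${userId}/export`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`Export failed with status ${res.status}`);
  const data = await res.json(); // machine-readable, per data portability
  console.log(JSON.stringify(data, null, 2));
}
```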
Note that GDPR compliance is a shared responsibility. Finna provides the technical mechanisms, but you are responsible for how you use the assistant and what data you process through it.
Data Retention Policies
Conversation Data
Moltbot retains conversation context in active sessions. Sessions remain active until explicitly cleared by the user or until a configured timeout expires. There is no indefinite retention of conversation content on the platform side.
On Finna, you can manage session retention through the dashboard. Clear individual sessions, reset all sessions, or configure automatic session expiration.
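Automatic expiration is conceptually simple: any session whose last activity is older than a configured TTL gets purged. A minimal TypeScript sketch of the idea (the data shapes are illustrative, not the platform's internals):

```typescript
// Sketch of timeout-based session expiration: any session whose last
// activity predates the configured TTL is purged.
interface Session {
  id: string;
  lastActivity: number; // Unix timestamp (ms) of the last message
}

const SESSION_TTL_MS = 24 * 60 * 60 * 1000; // e.g. expire after 24 hours

function purgeExpiredSessions(
  sessions: Map<string, Session>,
  now: number = Date.now()
): number {
  let purged = 0;
  for (const [id, session] of sessions) {
    if (now - session.lastActivity > SESSION_TTL_MS) {
      sessions.delete(id); // the conversation context is gone for good
      purged++;
    }
  }
  return purged;
}
```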
Audit Logs
Finna retains audit logs (who did what and when) for seven years to meet SOC 2 and ISO 27001 requirements. These logs contain metadata about actions taken (gateway created, configuration changed, user authenticated) but do not contain conversation content.
Backups
Understand your backup situation. Database backups may contain snapshots of data that has been "deleted" from the live system. Finna uses Neon PostgreSQL, which provides point-in-time recovery. Discuss backup retention periods and encryption with your provider if this is a concern.
Auditing Your AI Assistant's Data
What to Check
Periodically audit what data your AI assistant stores and processes:
- Active sessions: Review what conversation context is currently stored. Clear sessions you no longer need.
- Stored files: Check the workspace directory for files the assistant has created. Delete anything that contains sensitive information and is no longer needed.
- Channel connections: Review which messaging channels are connected. Disconnect channels you no longer use.
- API keys: Verify which API keys are configured. Remove keys for services you no longer use.
- Audit logs: Review recent activity for any unexpected access patterns.
Setting Up a Privacy Routine
Establish a regular privacy maintenance routine:
- Weekly: Clear old conversation sessions
- Monthly: Review connected channels and stored files
- Quarterly: Rotate API keys, review model provider data policies, check audit logs for anomalies
Practical Privacy Recommendations
For Personal Use
- Choose messaging channels with strong native privacy (Signal, WhatsApp with E2E encryption)
- Use a model provider with clear no-training policies on API data
- Clear sessions regularly
- Do not share information you would not want a third party to see, even briefly
For Business Use
- Deploy on Finna or self-host for tenant isolation
- Use per-tenant encryption for API keys and credentials
- Enable audit logging for compliance
- Establish data classification policies - define what types of information are appropriate for AI assistant processing
- Review model provider data processing agreements annually
- Train team members on responsible AI assistant usage
For Regulated Industries
- Verify that your model provider's data handling meets your regulatory requirements
- Use encryption at rest and in transit at every layer
- Implement data retention policies aligned with your compliance framework
- Maintain comprehensive audit logs
- Consider providers that offer zero-retention API access for sensitive workloads
The Privacy Spectrum
Perfect privacy with AI assistants is not possible if you want useful functionality. The assistant needs to see your messages to respond to them, and language model inference requires sending data to a model provider. The question is not whether data flows to third parties but how much control you have over that flow.
The spectrum runs from centralized services (least control) through managed platforms like Finna (moderate control with isolation) to fully self-hosted with local models (maximum control, maximum effort).
Choose the point on this spectrum that matches your actual needs. Most people and organizations find that the managed model - strong isolation, encrypted storage, clear data policies, and no operational burden - hits the right balance between privacy and practicality.
What matters most is making an informed choice rather than a default one. Now you have the information to do exactly that.