AI & Privacy Engineering

A Chatbot That Protects Customer Privacy

The Problem

Shopify stores need chatbots that can answer questions about specific orders and recommend products. Customers expect personal responses like "Your order #1234 is shipping to 123 Main St." Most chatbots solve this by sending everything to AI services like OpenAI: customer names, addresses, emails, order details. That creates a privacy problem. If the AI service gets hacked or misuses the data, customers are exposed.

We needed to build something different. The chatbot had to feel personal without ever sending private data to AI servers. That meant solving two problems.

First: make the AI write natural responses using only anonymous data like order IDs, status codes, and timestamps. No names or addresses on the server side.

Second: fill in the personal details in the customer's browser, where their data already lives. And do it securely, so no other scripts or websites could steal it.

How We Built It

We split the work between the server and the browser. The server handles AI and sends back messages with blanks. The browser fills in those blanks with customer data it already has. Simple concept, tricky to execute securely.

We call these blanks "tokens." They're placeholders that get swapped with real data in the browser.

Keeping Data Private

Here's how we kept customer data out of AI servers:

  • Server side: Only gets customer IDs and order numbers. Never sees names, addresses, or emails. The AI writes responses like "Your order is shipping to [ADDRESS]" with blanks where personal info goes. We use Shopify's security signatures to verify requests are legitimate.
  • Browser side: Gets those templated responses and fills in the blanks using customer data from the Shopify store (which the browser already has access to). The filled-in message never gets sent anywhere or saved. It just appears on screen and disappears when the page closes.

Conversation Memory

The chatbot remembers what you talked about:

  • Saves the last 20 messages from each conversation
  • Deletes everything after 30 days automatically
  • Lets you ask follow-up questions like "What about that order?" and the AI knows which order you mean

Product Search

Different types of questions need different search approaches:

  • Word matching: Finds products based on what customers actually type, not just exact matches
  • Popular items: Sorts by what sells best when customers ask for recommendations
  • New arrivals: Filters by date for "what's new" questions
  • Personal picks: Looks at what you bought before and suggests similar or matching products

How the AI Decides What to Do

The chatbot can do three things:

  • Look up orders (only if you're logged in)
  • Search products (anyone can do this)
  • Make recommendations based on purchase history

What We Achieved

Privacy Protection

  • No personal data reaches the AI: Customer names, addresses, and emails are filled in only in the customer's browser. The chatbot backend and the AI servers never see them.
  • Request verification: Every request gets checked with Shopify's security system to prevent fake or malicious requests
  • Auto-deletion: Conversation history gets wiped after 30 days

Speed

  • Filling in customer data: Under 50 milliseconds (happens in the browser)
  • Product search: 200-500 milliseconds to find and return results
  • Loading chat history: Under 100 milliseconds
  • Chat responses: 2-3 seconds on a cold start, under 500 milliseconds once warmed up

Customer Experience

  • Customers see fully personalized messages with their name, shipping address, and order details
  • Can ask follow-up questions and the chatbot remembers the conversation
  • Get product recommendations based on their purchase history (without that data going to AI servers)

The Big Picture

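This approach works for any online store that wants AI chat with far less privacy risk. Names, addresses, and emails never touch AI servers. That makes compliance with privacy laws like GDPR and CCPA much simpler, although customer IDs and stored conversation history still count as personal data under those laws.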

Technical Deep Dive

AWS Serverless Stack

Built entirely on AWS Lambda (Python 3.11 runtime) with API Gateway routing requests to /api/chat. Infrastructure is defined in AWS SAM templates for reproducible deployments. DynamoDB stores conversation history with a 30-day TTL. OpenSearch provides hybrid text + semantic search, and requests to it are signed with AWS Signature Version 4.
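As a rough illustration of the history storage, the sketch below builds a DynamoDB-style item that keeps only the latest 20 messages and carries an expiry timestamp 30 days out. DynamoDB's TTL feature expects a Number attribute holding Unix epoch seconds, and expired items are removed in the background rather than at the exact expiry moment. The attribute names (conversation_id, messages, expires_at) and the function name are assumptions for this example, not the project's actual schema.

```python
import time

MAX_MESSAGES = 20               # from the text: keep the last 20 messages
TTL_SECONDS = 30 * 24 * 3600    # from the text: 30-day retention


def build_conversation_item(conversation_id, messages, now=None):
    """Return a DynamoDB-style item holding the most recent messages.

    All attribute names here are hypothetical.
    """
    now = int(time.time()) if now is None else int(now)
    return {
        "conversation_id": conversation_id,     # hypothetical partition key
        "messages": messages[-MAX_MESSAGES:],   # keep only the newest 20
        "expires_at": now + TTL_SECONDS,        # hypothetical TTL attribute (epoch seconds)
    }
```

Passing the clock in as `now` keeps the function deterministic, which makes the retention rule easy to unit test.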

Shopify HMAC Authentication

All requests flow through Shopify's App Proxy, which adds the logged_in_customer_id parameter and signs each request with an HMAC-SHA256 signature. Lambda validates each request by recomputing the HMAC with the shared secret, comparing it with the received signature, and checking the timestamp (rejecting anything older than 5 minutes).
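The validation steps above can be sketched as follows. This assumes Shopify's documented App Proxy scheme, in which the signature query parameter is a hex HMAC-SHA256 over the remaining parameters, sorted and concatenated as key=value pairs with no separator. The function names and the 5-minute constant's name are made up for this example, and the real Lambda handler may be organized differently.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 5 * 60  # from the text: reject requests older than 5 minutes


def compute_signature(params, secret):
    """Hex HMAC-SHA256 over sorted key=value pairs joined without a separator."""
    pieces = []
    for key, value in sorted(params.items()):
        if isinstance(value, list):          # repeated query parameters
            value = ",".join(value)
        pieces.append(f"{key}={value}")
    message = "".join(pieces)
    return hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()


def verify_proxy_request(query, secret, now=None):
    """Return True only if the signature matches and the request is fresh."""
    params = dict(query)
    received = params.pop("signature", "")
    expected = compute_signature(params, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), str(received).encode()):
        return False
    try:
        timestamp = int(params.get("timestamp", ""))
    except (TypeError, ValueError):
        return False
    now = time.time() if now is None else now
    return now - timestamp <= MAX_AGE_SECONDS
```

Checking the signature before the timestamp means the timestamp is trusted only after it has been shown to be covered by the HMAC.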

The validated customer ID is then used to query Shopify's GraphQL API for order data. These GraphQL queries only request order status, tracking numbers, and fulfillment dates. No PII fields like names, emails, or addresses are included in the query or response.

Closure-Based Data Privacy

Customer data lives in a JavaScript closure, a private function scope that's inaccessible from the browser console or external scripts. Shopify Liquid injects customer data once during page load; the script captures it in closure scope and immediately removes it from the global window object.

The closure exposes only resolver methods (like resolveName() or resolveAddress()) that return specific values on demand. Data is purged automatically after 5 minutes of inactivity, on tab switch, or on page unload. localStorage only stores tokenized messages, never resolved PII.

Token Resolution Flow

Lambda returns responses with placeholder tokens like {CUSTOMER_NAME} or {SHIPPING_ADDRESS:1020}. The frontend loops through each token and calls the closure functions to swap placeholders with actual customer data. The final personalized message appears on screen but never gets sent anywhere or stored.
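The substitution logic can be sketched as below. The production resolver runs as JavaScript in the browser; Python is used here only to keep the examples in one language, so this shows the token-swapping logic and not the browser-side isolation. Reading the 1020 in {SHIPPING_ADDRESS:1020} as an order identifier is an assumption, as are the function names and the shape of the customer dictionary.

```python
import re

# Matches {NAME} or {NAME:argument}, e.g. {CUSTOMER_NAME} or {SHIPPING_ADDRESS:1020}.
TOKEN_PATTERN = re.compile(r"\{([A-Z_]+)(?::([A-Za-z0-9_-]+))?\}")


def make_resolvers(customer):
    """Capture customer data in a closure and expose only lookup functions."""

    def resolve_name(_argument):
        return customer["name"]

    def resolve_address(order_id):
        # Assumption: the token argument identifies the order being shipped.
        return customer["addresses"].get(order_id)

    return {"CUSTOMER_NAME": resolve_name, "SHIPPING_ADDRESS": resolve_address}


def resolve_tokens(template, resolvers):
    """Swap every known token for its value; leave unknown tokens untouched."""

    def substitute(match):
        name, argument = match.group(1), match.group(2)
        resolver = resolvers.get(name)
        value = resolver(argument) if resolver else None
        return match.group(0) if value is None else value

    return TOKEN_PATTERN.sub(substitute, template)
```

Leaving unresolved tokens in place, rather than replacing them with an empty string, makes a missing value visible instead of silently producing a broken sentence.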

Catalog Synchronization Pipeline

Product data flows from Shopify to OpenSearch through a CLI-based indexing pipeline. Products are fetched via Shopify GraphQL, then combined with sales analytics from the last 60 days of order data. Each product gets assigned popularity and trending scores based on sales velocity and recency.
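The text does not give the scoring formulas, so the following is only one plausible way to turn sales velocity and recency into two numbers. Popularity is taken as average units sold per day over the 60-day window, and trending as a sum of units weighted by an exponential decay. The 14-day half-life, the function name, and the input shape are all assumptions.

```python
from datetime import date

WINDOW_DAYS = 60       # from the text: last 60 days of order data
HALF_LIFE_DAYS = 14    # hypothetical: a sale's trending weight halves every 14 days


def score_product(sales, today):
    """Score one product from (sale_date, quantity) pairs.

    Returns (popularity, trending). Both formulas are illustrative assumptions.
    """
    popularity = 0.0
    trending = 0.0
    for sale_date, quantity in sales:
        age_days = (today - sale_date).days
        if not 0 <= age_days < WINDOW_DAYS:
            continue  # ignore sales outside the analytics window
        popularity += quantity / WINDOW_DAYS                    # velocity: units per day
        trending += quantity * 0.5 ** (age_days / HALF_LIFE_DAYS)  # recency-weighted units
    return popularity, trending
```

With these definitions a product that sold steadily all window long scores high on popularity, while one whose sales are concentrated in the last few days scores high on trending.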

Products are converted to embedding text (title, description, vendor, tags concatenated) and sent to OpenAI's text-embedding-ada-002. The returned vectors get indexed to OpenSearch for semantic similarity search. The CLI can be run anytime to refresh the catalog with updated product or sales data.
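A minimal sketch of the text-building step is shown below, before anything is sent to the embedding model. The field order follows the description above; the dictionary keys, the newline separator, and the function name are assumptions.

```python
def build_embedding_text(product):
    """Concatenate title, description, vendor, and tags into one string.

    Empty or missing fields are skipped so they add no blank lines.
    """
    parts = [
        product.get("title") or "",
        product.get("description") or "",
        product.get("vendor") or "",
        " ".join(product.get("tags") or []),
    ]
    return "\n".join(part.strip() for part in parts if part.strip())
```

The resulting string would then be passed to the embedding model, and the returned vector stored alongside the product document in the index.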

AI Agent with PydanticAI

OpenAI GPT-4o-mini is orchestrated through the PydanticAI framework with three tools:

  • orders_tool: Queries Shopify GraphQL for order status (authenticated users only)
  • products_tool: Hybrid search via OpenSearch (text matching + semantic embeddings)
  • recommendations_tool: OpenSearch "More Like This" queries analyze past purchases to find similar products using text-embedding-ada-002 vectors

The system prompt enforces mandatory token usage for all PII references. The agent automatically routes customer intent to the appropriate tool.
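The hybrid search behind products_tool could be expressed as an OpenSearch query body along the lines below: a fuzzy multi_match clause for text matching and a k-NN clause for semantic similarity, combined in a bool query. The field names (title, description, tags, embedding), the title boost, and the function name are assumptions, and the index is assumed to have a knn_vector field holding the product embeddings.

```python
def build_hybrid_query(text, query_vector, size=10):
    """Build an OpenSearch request body combining text and vector search.

    Field names and the boost value are hypothetical.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # A product can match on words, on meaning, or on both;
                # the scores of the matching clauses are added together.
                "should": [
                    {
                        "multi_match": {
                            "query": text,
                            "fields": ["title^2", "description", "tags"],
                            "fuzziness": "AUTO",   # tolerate typos, not just exact matches
                        }
                    },
                    {"knn": {"embedding": {"vector": query_vector, "k": size}}},
                ]
            }
        },
    }
```

Because the two clauses score on different scales, a setup like this usually needs some tuning of boosts before the combined ranking feels right.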

Tech Stack

Python 3.11
AWS Lambda (Serverless)
PydanticAI + OpenAI GPT-4o-mini
AWS DynamoDB
AWS OpenSearch Service
Shopify GraphQL API
HMAC SHA-256 authentication
Shopify Liquid templates
Vanilla JavaScript (ES6+)
AWS SAM (Serverless Application Model)
AWS API Gateway
CloudFormation templates