Shopify stores need chatbots that can answer questions about specific orders and recommend products. Customers expect personal responses like "Your order #1234 is shipping to 123 Main St." Most chatbots solve this by sending everything to AI services like OpenAI: customer names, addresses, emails, order details. That creates a privacy problem. If the AI service gets hacked or misuses the data, customers are exposed.

We needed to build something different. The chatbot had to feel personal without ever sending private data to AI servers. That meant solving two problems. First: make the AI write natural responses using only anonymous data like order IDs, status codes, and timestamps. No names or addresses on the server side. Second: fill in the personal details in the customer's browser, where their data already lives. And do it securely, so no other scripts or websites could steal it.
We split the work between the server and the browser. The server handles AI and sends back messages with blanks. The browser fills in those blanks with customer data it already has. Simple concept, tricky to execute securely.
We call these blanks "tokens." They're placeholders that get swapped with real data in the browser.
Here's how we kept customer data out of AI servers:
This approach works for any online store that wants AI chat without privacy risks. Customer data never touches AI servers, which substantially simplifies compliance with privacy laws like GDPR and CCPA.
Built entirely on AWS Lambda (Python 3.11 runtime) with API Gateway routing requests to /api/chat. Infrastructure defined in AWS SAM templates for reproducible deployments. DynamoDB stores conversation history with 30-day TTL. OpenSearch provides hybrid text + semantic search with AWS Signature v4 authentication.
All requests flow through Shopify's App Proxy, which includes the logged_in_customer_id and generates an HMAC-SHA256 signature. Lambda validates by recomputing the HMAC with a shared secret, comparing signatures, and checking timestamps (rejecting anything older than 5 minutes).
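That validation step can be sketched as follows. This is a minimal sketch, not the production handler: Shopify's App Proxy signs the sorted query parameters concatenated as `key=value` pairs with no delimiter, and multi-valued parameters are omitted here for brevity.

```python
import hashlib
import hmac
import time

def verify_app_proxy(params: dict, shared_secret: str, max_age_seconds: int = 300) -> bool:
    """Recompute the App Proxy HMAC and enforce the 5-minute timestamp window."""
    params = dict(params)  # don't mutate the caller's dict
    provided = params.pop("signature", "")
    # App Proxy signs the sorted "key=value" pairs joined with no delimiter.
    message = "".join(f"{key}={value}" for key, value in sorted(params.items()))
    expected = hmac.new(shared_secret.encode(), message.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, provided):
        return False
    # Reject replays: anything older than 5 minutes fails validation.
    return time.time() - int(params.get("timestamp", 0)) <= max_age_seconds
```

Using `hmac.compare_digest` rather than `==` avoids leaking signature bytes through timing differences.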
The validated customer ID is then used to query Shopify's GraphQL API for order data. These GraphQL queries only request order status, tracking numbers, and fulfillment dates. No PII fields like names, emails, or addresses are included in the query or response.
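The shape of such a query is sketched below. Field names follow Shopify's Admin GraphQL API, but the exact selection is illustrative, not the production query; the point is that no PII fields appear anywhere in it.

```python
# Illustrative selection: status, timestamps, and tracking only.
# Note the absence of email, phone, name, or address fields.
ORDER_STATUS_QUERY = """
query OrderStatus($customerId: ID!) {
  customer(id: $customerId) {
    orders(first: 10, reverse: true) {
      edges {
        node {
          id
          createdAt
          displayFulfillmentStatus
          fulfillments {
            trackingInfo { number url }
          }
        }
      }
    }
  }
}
"""
```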
Customer data lives in a JavaScript closure: a private function scope that's inaccessible from the browser console or external scripts. Shopify Liquid injects customer data once during page load; the script captures it in closure scope and immediately deletes it from the global window object.
The closure exposes only resolver methods (like resolveName() or resolveAddress()) that return specific values on demand. Data automatically purges after 5 minutes of inactivity, on tab switch, or page unload. LocalStorage only stores tokenized messages, never resolved PII.
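The production vault is browser JavaScript; the Python sketch below only illustrates the closure pattern it describes: data captured in a private scope, exposed solely through resolver functions, and purged on inactivity. Field names and the purge trigger are taken from the description above; the helper itself is hypothetical.

```python
import time

def create_pii_vault(customer: dict, ttl_seconds: int = 300):
    """Capture PII in a closure; expose only resolver callables."""
    state = {"data": dict(customer), "last_used": time.time()}

    def _get(field: str):
        # Auto-purge after the inactivity window expires.
        if state["data"] is None or time.time() - state["last_used"] > ttl_seconds:
            state["data"] = None
            return None
        state["last_used"] = time.time()
        return state["data"].get(field)

    def purge():
        # In the browser this runs on tab switch and page unload.
        state["data"] = None

    return {
        "resolveName": lambda: _get("name"),
        "resolveAddress": lambda: _get("address"),
        "purge": purge,
    }
```

Nothing outside the returned resolvers can reach `state`, which is the property the browser version relies on.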
Lambda returns responses with placeholder tokens like {CUSTOMER_NAME} or {SHIPPING_ADDRESS:1020}. The frontend loops through each token and calls the closure functions to swap placeholders with actual customer data. The final personalized message appears on screen but never gets sent anywhere or stored.
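The swap itself is a small substitution loop. The production version runs in the browser in JavaScript; this Python sketch shows the logic, with the token grammar ({NAME} or {NAME:arg}) inferred from the examples above and resolver wiring assumed.

```python
import re

# Matches {CUSTOMER_NAME} and parameterized tokens like {SHIPPING_ADDRESS:1020}.
TOKEN_RE = re.compile(r"\{([A-Z_]+)(?::(\w+))?\}")

def resolve_tokens(message: str, resolvers: dict) -> str:
    """Swap placeholder tokens using resolver callbacks; unknown tokens stay put."""
    def swap(match):
        name, arg = match.group(1), match.group(2)
        resolver = resolvers.get(name)
        return resolver(arg) if resolver else match.group(0)
    return TOKEN_RE.sub(swap, message)
```

The resolved string is only ever assigned to the chat DOM; it is never sent back to the server or written to storage.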
Product data flows from Shopify to OpenSearch through a CLI-based indexing pipeline. Products are fetched via Shopify GraphQL, then combined with sales analytics from the last 60 days of order data. Each product gets assigned popularity and trending scores based on sales velocity and recency.
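The exact weighting behind those scores isn't specified above; purely as an illustration, a recency-weighted sales velocity over the 60-day window might look like this (the half-life decay is an assumption, not the production formula):

```python
import math

def trending_score(daily_unit_sales: list[float], half_life_days: float = 14.0) -> float:
    """Sum daily sales with exponential decay so recent sales count more."""
    latest = len(daily_unit_sales) - 1
    return sum(
        qty * math.exp(-(latest - day) * math.log(2) / half_life_days)
        for day, qty in enumerate(daily_unit_sales)
    )
```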
Products are converted to embedding text (title, description, vendor, tags concatenated) and sent to OpenAI's text-embedding-ada-002. The returned vectors get indexed to OpenSearch for semantic similarity search. The CLI can be run anytime to refresh the catalog with updated product or sales data.
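The vector itself comes from text-embedding-ada-002; the concatenation step that feeds it can be sketched as follows (the payload keys are assumed, the field list is the one named above):

```python
def build_embedding_text(product: dict) -> str:
    """Concatenate title, description, vendor, and tags into one embedding input."""
    parts = [
        product.get("title", ""),
        product.get("description", ""),
        product.get("vendor", ""),
        " ".join(product.get("tags", [])),
    ]
    # Skip empty fields so missing descriptions don't add stray whitespace.
    return " ".join(part for part in parts if part)
```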
OpenAI GPT-4o-mini is orchestrated through the PydanticAI framework with three tools. The system prompt enforces mandatory token usage for all PII references, and the agent automatically routes customer intent to the appropriate tool.
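In production the LLM itself selects the tool via PydanticAI's tool-calling; purely as an illustration of the routing idea, a naive keyword dispatcher would look like this (tool names here are hypothetical, not the actual three tools):

```python
def route_intent(message: str) -> str:
    """Toy stand-in for LLM tool selection: map a message to a tool name."""
    text = message.lower()
    if any(word in text for word in ("order", "tracking", "shipped")):
        return "order_lookup"    # hypothetical tool name
    if any(word in text for word in ("recommend", "looking for", "similar")):
        return "product_search"  # hypothetical tool name
    return "general_chat"        # hypothetical fallback
```

The real agent does this probabilistically from the full conversation, which is why the system prompt, not the router, is what guarantees tokens are used for every PII reference.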