Your Browser Just Got an AI Copilot: Chat, Commands, and Agentic Browsing

When we launched the JieGou browser extension, it was a tool executor. Recipes and workflows could call 60+ browser automation tools — clicking, reading, filling forms — through the Model Context Protocol. Powerful, but passive. The extension only did things when a recipe told it to.

Today, the extension is an AI assistant in its own right. Open the side panel, ask a question about the page you’re looking at, and get an answer that understands what’s on your screen. Use the command palette for one-keystroke actions. Let the agent take the wheel and browse autonomously. Record your browser interactions and replay them later.

AI chat with page awareness

Click the extension icon or use the keyboard shortcut to open the side panel. Type a question, and the AI responds with full awareness of the page you’re on.

The extension doesn’t just see the URL. It extracts the page’s text content (up to 8,000 characters) and detects which platform you’re using — Gmail, Slack, Jira, Salesforce, Confluence, ServiceNow, LinkedIn, or HubSpot. Platform detection triggers specialized context extraction, so the AI understands you’re looking at a Jira ticket or a Gmail thread, not just a generic web page.
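As a rough illustration of that detection step, here is a minimal sketch. The hostname rules, the `Platform` type, and the function names are assumptions made for this example, not the extension's actual code; only the platform list and the 8,000-character limit come from the description above.

```typescript
// Hypothetical sketch: hostname rules and names are assumptions, not JieGou's real logic.
type Platform =
  | "gmail" | "slack" | "jira" | "salesforce" | "confluence"
  | "servicenow" | "linkedin" | "hubspot" | "generic";

const MAX_PAGE_TEXT = 8000; // character limit stated in the post

const HOST_RULES: Array<[RegExp, Platform]> = [
  [/^mail\.google\.com$/, "gmail"],
  [/(^|\.)slack\.com$/, "slack"],
  [/(^|\.)(salesforce|force)\.com$/, "salesforce"],
  [/(^|\.)service-now\.com$/, "servicenow"],
  [/(^|\.)linkedin\.com$/, "linkedin"],
  [/(^|\.)hubspot\.com$/, "hubspot"],
];

function detectPlatform(pageUrl: string): Platform {
  const { hostname, pathname } = new URL(pageUrl);
  // Jira and Confluence Cloud share *.atlassian.net; Confluence is served under /wiki.
  if (/(^|\.)atlassian\.net$/.test(hostname)) {
    return pathname.startsWith("/wiki") ? "confluence" : "jira";
  }
  for (const [pattern, platform] of HOST_RULES) {
    if (pattern.test(hostname)) return platform;
  }
  return "generic";
}

interface PageContext {
  url: string;
  platform: Platform;
  text: string;
}

// Bundle the URL, detected platform, and truncated page text for the chat request.
function buildPageContext(pageUrl: string, pageText: string): PageContext {
  return {
    url: pageUrl,
    platform: detectPlatform(pageUrl),
    text: pageText.slice(0, MAX_PAGE_TEXT),
  };
}
```

A real implementation would likely also inspect the DOM, since self-hosted Jira or Confluence instances do not live on a predictable hostname.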

Conversations persist across panel opens. Close the side panel, browse to another page, reopen it, and your chat history is still there. Up to 50 messages are stored locally in your browser. The stored history stays on your machine, and page content is sent out only when you ask a question or trigger an action.
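The 50-message cap can be pictured as a simple bounded list. The sketch below assumes the oldest messages are dropped first and uses a hypothetical `ChatMessage` shape; the post does not say how the extension actually evicts or stores messages.

```typescript
// Hypothetical message shape; only the limit of 50 comes from the post.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
  timestamp: number;
}

const MAX_STORED_MESSAGES = 50;

// Return a new history with the message appended, keeping only the newest 50.
// Assumption: the oldest messages are evicted first.
function appendMessage(
  history: readonly ChatMessage[],
  message: ChatMessage,
): ChatMessage[] {
  return [...history, message].slice(-MAX_STORED_MESSAGES);
}
```

Returning a new array rather than mutating the old one keeps the function easy to pair with whatever local persistence layer the extension uses.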

Two modes are available: standard chat for Q&A, and agent mode for multi-step actions (more on that below). Toggle between them with a single click.

Command palette: Cmd+Shift+K

Hit Cmd+Shift+K (Mac) or Ctrl+Shift+K (Windows/Linux) and a searchable palette appears with 10 built-in actions:

Page actions:

Screenshot Page — Captures the viewport as a PNG, copies to clipboard
Copy as Markdown — Extracts the page content as clean Markdown
Extract All Links — Pulls every link from the page
Extract Tables as CSV — Converts HTML tables to CSV format
Save to Notepad — Saves content to local storage for later

AI actions:

Summarize Page — AI-generated summary of the current page
Extract Structured Data — Pulls structured information from unstructured content
Draft Reply — Generates a contextual reply (useful for emails and threads)
Explain This — Plain-English explanation of technical content

Navigation:

Search Open Tabs — Fuzzy search across all your open browser tabs

Type to filter, arrow keys to navigate, Enter to execute. Results are copied to your clipboard automatically.
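To make "type to filter" concrete, here is one way such a palette could narrow its list. The subsequence-style matcher, the `PaletteAction` shape, and the action ids are assumptions for illustration; the post does not describe the real matching algorithm. The titles are the ten built-ins listed above.

```typescript
// Hypothetical action shape and ids; titles are taken from the list above.
interface PaletteAction {
  id: string;
  title: string;
  group: "page" | "ai" | "navigation";
}

const BUILT_INS: PaletteAction[] = [
  { id: "screenshot", title: "Screenshot Page", group: "page" },
  { id: "markdown", title: "Copy as Markdown", group: "page" },
  { id: "links", title: "Extract All Links", group: "page" },
  { id: "tables", title: "Extract Tables as CSV", group: "page" },
  { id: "notepad", title: "Save to Notepad", group: "page" },
  { id: "summarize", title: "Summarize Page", group: "ai" },
  { id: "structured", title: "Extract Structured Data", group: "ai" },
  { id: "reply", title: "Draft Reply", group: "ai" },
  { id: "explain", title: "Explain This", group: "ai" },
  { id: "tabs", title: "Search Open Tabs", group: "navigation" },
];

// True when every character of the query appears in the text, in order,
// ignoring case. This is an assumed matcher, not the extension's own.
function fuzzyMatch(query: string, text: string): boolean {
  const q = query.toLowerCase();
  const t = text.toLowerCase();
  let qi = 0;
  for (let ti = 0; ti < t.length && qi < q.length; ti++) {
    if (t[ti] === q[qi]) qi++;
  }
  return qi === q.length;
}

// An empty query matches everything, so the full palette shows on open.
function filterActions(query: string, actions: PaletteAction[]): PaletteAction[] {
  const trimmed = query.trim();
  return actions.filter((action) => fuzzyMatch(trimmed, action.title));
}
```

A production palette would probably also rank matches (for example, preferring prefix hits), which this sketch leaves out.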

Beyond the built-ins, you can create custom actions. Define a prompt template with variables like {selectedText}, {pageUrl}, and {pageTitle}, and your action appears in the palette alongside the defaults. If your team has a standard way of summarizing support tickets or extracting action items from meeting notes, save it as a custom action and it’s always one keystroke away.
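A custom action's prompt template could be expanded along these lines. This is a sketch under assumptions: the `TemplateContext` type and `renderPrompt` function are hypothetical names, and leaving unknown placeholders untouched is a design choice made here, not documented behavior. The three variable names come from the paragraph above.

```typescript
// The three variables named in the post; the type itself is hypothetical.
interface TemplateContext {
  selectedText: string;
  pageUrl: string;
  pageTitle: string;
}

// Replace {name} placeholders with values from the context.
// Assumption: unknown placeholders are left as-is rather than removed.
function renderPrompt(template: string, ctx: TemplateContext): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(ctx, name)
      ? ctx[name as keyof TemplateContext]
      : match,
  );
}

// Example: a team-specific action for support tickets.
const prompt = renderPrompt(
  "Summarize this ticket from {pageTitle} ({pageUrl}) in three bullets:\n{selectedText}",
  {
    selectedText: "Customer cannot log in after password reset.",
    pageUrl: "https://example.com/tickets/42",
    pageTitle: "Ticket 42",
  },
);
```

Using a replacer function rather than a replacement string means a `$` inside selected text is inserted literally instead of being treated as a special pattern.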

Agentic browsing

Standard chat answers questions. Agent mode takes action.

When you enable agent mode, the AI enters a multi-turn loop. It can propose browser tool calls — click this button, fill that form, navigate to another page — and the extension executes them after you approve.

Here’s how the approval flow works:

You type an instruction: “Find the latest invoice in my email and forward it to accounting@company.com”
The AI plans its approach and proposes tool calls: navigate to Gmail, search for “invoice”, open the latest result
Each tool call appears in a card with the action name and parameters. Read-only tools (reading page content, taking screenshots) execute automatically. Mutation tools (clicking, typing, navigating) wait for your approval.
Approve individually or hit “Approve All” to let the agent run through the remaining steps.

The agent runs for up to 10 turns before pausing, so it won’t loop indefinitely. You can stop it at any time.
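The approval rules and the 10-turn cap can be sketched as a small loop. Everything named here is an assumption for illustration: the read-only tool names are hypothetical, the callbacks stand in for the LLM and the approval card, and the loop is synchronous for brevity where the real one would be asynchronous. Skipping a rejected call and continuing is also an assumption.

```typescript
interface ToolCall {
  name: string;
  params: Record<string, unknown>;
}

type Answer = "approve" | "approve_all" | "reject";

interface AgentResult {
  turns: number;
  executed: string[];
  rejected: string[];
  stopped: "done" | "turn_limit";
}

const MAX_TURNS = 10; // turn cap stated in the post

// Hypothetical read-only tool names; anything else is treated as a mutation.
const READ_ONLY_TOOLS = new Set(["read_page_content", "take_screenshot"]);

function runAgent(
  propose: (turn: number) => ToolCall[], // stands in for the LLM; [] means finished
  ask: (call: ToolCall) => Answer,       // stands in for the approval card
  execute: (call: ToolCall) => void,     // stands in for browser tool execution
): AgentResult {
  const executed: string[] = [];
  const rejected: string[] = [];
  let approveAll = false;

  for (let turn = 1; turn <= MAX_TURNS; turn++) {
    const calls = propose(turn);
    if (calls.length === 0) {
      return { turns: turn - 1, executed, rejected, stopped: "done" };
    }
    for (const call of calls) {
      // Read-only tools run automatically; mutations need approval
      // unless "Approve All" was already chosen.
      let allowed = approveAll || READ_ONLY_TOOLS.has(call.name);
      if (!allowed) {
        const answer = ask(call);
        if (answer === "approve_all") approveAll = true;
        allowed = answer !== "reject";
      }
      if (allowed) {
        execute(call);
        executed.push(call.name);
      } else {
        rejected.push(call.name); // assumption: skip the call and keep going
      }
    }
  }
  return { turns: MAX_TURNS, executed, rejected, stopped: "turn_limit" };
}
```

Keeping the allow-list on the read-only side means an unrecognized tool defaults to requiring approval, which is the safer failure mode.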

Technically, the extension acts as a client-side orchestrator. It sends conversation context to the JieGou server, which proxies the request to the LLM. The LLM’s tool call proposals stream back via Server-Sent Events. The extension parses the stream, categorizes each tool call as read-only or mutation, and handles the approval flow locally. Tool execution happens entirely in your browser: the server relays model traffic but never operates on the page itself.
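For readers curious what parsing that stream involves, here is a minimal sketch of an incremental Server-Sent Events parser. The framing rules (events separated by a blank line, `data:` lines joined with newlines) follow the SSE format; the JSON payload shape with `type`, `name`, and `params` is purely an assumption, since the post does not document the wire format.

```typescript
// Assumed payload shape; the real event schema is not documented in the post.
interface ToolCallEvent {
  type: "tool_call";
  name: string;
  params: Record<string, unknown>;
}

class SseParser {
  private buffer = "";

  // Feed a raw text chunk; returns the data payload of every event that the
  // chunk completed. Handles \n and \r\n line endings only.
  push(chunk: string): string[] {
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const payloads: string[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const rawEvent = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      const dataLines = rawEvent
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")); // drop one optional space
      if (dataLines.length > 0) payloads.push(dataLines.join("\n"));
      boundary = this.buffer.indexOf("\n\n");
    }
    return payloads;
  }
}

// Interpret a payload as a tool call, or return null for anything else.
function parseToolCall(payload: string): ToolCallEvent | null {
  try {
    const parsed: unknown = JSON.parse(payload);
    if (typeof parsed !== "object" || parsed === null) return null;
    const record = parsed as Record<string, unknown>;
    if (record.type !== "tool_call" || typeof record.name !== "string") return null;
    const params =
      typeof record.params === "object" && record.params !== null
        ? (record.params as Record<string, unknown>)
        : {};
    return { type: "tool_call", name: record.name, params };
  } catch {
    return null; // malformed JSON is ignored in this sketch
  }
}
```

Buffering matters because a network chunk can end in the middle of an event; the parser only emits a payload once it has seen the blank line that terminates it.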

Flow recording and playback

Sometimes you don’t need AI to figure out what to do — you just need it to repeat what you already did.

Click “Record” in the side panel, then interact with your browser normally. Click buttons, fill forms, navigate between pages. The extension captures each interaction as a structured step: a click, a fill, a scroll, a keyboard input, or a tab switch.

When you stop recording, you have a replayable flow. Each step maps to a browser automation tool call (chrome_click_element, chrome_fill_or_select, etc.), so playback uses the same reliable automation infrastructure as recipes and workflows.
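The step-to-tool mapping might look something like the sketch below. The tool names `chrome_click_element` and `chrome_fill_or_select` come from the paragraph above; the `RecordedStep` shape and the `selector` and `value` parameter names are assumptions. Only click and fill are shown, and the other step kinds would follow the same pattern.

```typescript
// Hypothetical recorded-step shape; only two of the step kinds are modeled.
type RecordedStep =
  | { kind: "click"; selector: string }
  | { kind: "fill"; selector: string; value: string };

interface ToolCall {
  tool: string;
  params: Record<string, string>;
}

// Translate one recorded step into the tool call that replays it.
// Parameter names are assumptions, not the tools' documented schema.
function toToolCall(step: RecordedStep): ToolCall {
  switch (step.kind) {
    case "click":
      return { tool: "chrome_click_element", params: { selector: step.selector } };
    case "fill":
      return {
        tool: "chrome_fill_or_select",
        params: { selector: step.selector, value: step.value },
      };
  }
}

// Example: a two-step recording becomes two tool calls.
const recording: RecordedStep[] = [
  { kind: "fill", selector: "#email", value: "user@example.com" },
  { kind: "click", selector: "button[type=submit]" },
];
const calls = recording.map(toToolCall);
```

Modeling steps as a discriminated union lets the compiler flag any step kind that the mapping forgets to handle.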

Flows support:

Variables — Parameterize steps with {{variableName}} placeholders. A login flow becomes reusable across accounts by turning the username and password into variables.
Speed control — Play at 0.5x, 1x, or 2x speed
Step-by-step mode — Pause after each step for verification
Continue on error — Optionally skip failed steps instead of stopping
Execution history — Every playback is recorded with per-step success/failure status, timing, and error details
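Three of these features (variables, continue on error, and execution history) fit together naturally in a playback loop. The sketch below is illustrative only: the `FlowStep` and `StepResult` shapes, the function names, and the choice to treat a missing variable as a step failure are all assumptions. Speed control and step-by-step mode are left out, and the executor is synchronous for brevity.

```typescript
interface FlowStep {
  tool: string;
  params: Record<string, string>;
}

interface StepResult {
  index: number;
  tool: string;
  ok: boolean;
  durationMs: number;
  error?: string;
}

// Replace {{variableName}} placeholders; a missing variable is an error.
function substitute(value: string, vars: Record<string, string>): string {
  return value.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    const replacement = vars[name];
    if (replacement === undefined) throw new Error(`Missing variable: ${name}`);
    return replacement;
  });
}

// Run each step in order and record per-step status, timing, and errors.
function playFlow(
  steps: FlowStep[],
  vars: Record<string, string>,
  execute: (tool: string, params: Record<string, string>) => void,
  continueOnError = false,
): StepResult[] {
  const history: StepResult[] = [];
  let index = 0;
  for (const step of steps) {
    const startedAt = Date.now();
    try {
      const params: Record<string, string> = {};
      for (const key of Object.keys(step.params)) {
        params[key] = substitute(step.params[key], vars);
      }
      execute(step.tool, params);
      history.push({ index, tool: step.tool, ok: true, durationMs: Date.now() - startedAt });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      history.push({ index, tool: step.tool, ok: false, durationMs: Date.now() - startedAt, error });
      if (!continueOnError) break; // default: stop at the first failed step
    }
    index++;
  }
  return history;
}
```

With `continueOnError` left at its default, the returned history ends at the failing step, which makes it easy to see exactly where a replay broke.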

Flows are stored locally in IndexedDB — no cloud dependency for basic recording and playback. You can export flows as JSON files to share with teammates or import them on another machine.

Platform-specific intelligence

The AI assistant inherits all 60+ browser automation tools from the extension, plus platform-specific handlers for six enterprise applications:

Gmail — Read threads, compose emails, search inbox
Slack — Read messages, post to channels
Jira — Create issues, update tickets, read sprint data
Salesforce — Read and update records
ServiceNow — Manage incidents
HubSpot — Access contacts and campaigns

These handlers understand each platform’s DOM structure, so the AI operates at a semantic level — “read the latest email from Sarah” instead of “click the element at selector div.adn.ads > div:nth-child(3).”

Privacy and security

The AI assistant runs in your browser. Page content is extracted locally and only sent to the LLM when you ask a question or trigger an action. Bring-your-own-key (BYOK) applies: if you use your own API keys, the data flows directly between your browser and the LLM provider.

Chat history, recorded flows, custom actions, and settings are all stored locally in your browser (flows, as noted above, in IndexedDB). Nothing is synced to JieGou’s servers unless it’s part of a recipe or workflow execution.

Getting started

Update your JieGou browser extension to the latest version. The AI chat panel, command palette, and recording features are available immediately. Agent mode is available on Pro plans and above. Learn more about the browser extension or install it from the extension store.