Google Drive Integration
8 Google Drive agent tools for search, read, write, batch import — with MIME mapping and file type filters.
Overview
The Google Drive integration gives the agent eight tools for searching, reading, writing, and bulk-importing from Drive — plus OAuth with automatic token refresh and a REST surface for file browsing and single-file research import. Three write tools are @gated and route through pending actions.
Connecting
Initiate OAuth
POST /api/integrations/google-drive/connect
{ "workspace_id": "your-workspace-uuid" }
Returns { auth_url, state }. OAuth state rows expire after 10 minutes and are one-time-use (deleted on read).
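The expiry and one-time-use rules for OAuth state can be sketched as below. This is an in-memory illustration only: the `OAuthStateStore` class, its method names, and the use of `secrets.token_urlsafe` are assumptions, and the real backend presumably keeps state rows in its database.

```python
import secrets
import time
from typing import Dict, Optional, Tuple

STATE_TTL_SECONDS = 10 * 60  # state rows expire after 10 minutes


class OAuthStateStore:
    """Hypothetical in-memory stand-in for the OAuth state table."""

    def __init__(self) -> None:
        # state -> (workspace_id, expiry timestamp)
        self._rows: Dict[str, Tuple[str, float]] = {}

    def create(self, workspace_id: str, now: Optional[float] = None) -> str:
        """Issue a fresh random state value bound to a workspace."""
        now = time.time() if now is None else now
        state = secrets.token_urlsafe(32)
        self._rows[state] = (workspace_id, now + STATE_TTL_SECONDS)
        return state

    def consume(self, state: str, now: Optional[float] = None) -> Optional[str]:
        """Return the workspace_id at most once; the row is deleted on read."""
        now = time.time() if now is None else now
        row = self._rows.pop(state, None)
        if row is None:
            return None  # unknown or already used
        workspace_id, expires_at = row
        return workspace_id if now <= expires_at else None
```

Deleting the row on the first read, even when it has expired, means a replayed callback can never match a stored state.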
Handle callback
Google redirects to GET /api/integrations/google-drive/callback?code=...&state=.... The backend exchanges the code for access + refresh tokens, fetches user info (id, email, name, picture), encrypts both tokens, and upserts the Integration with:
- access_token_encrypted
- refresh_token_encrypted
- expires_at (now + expires_in seconds, default 3600)
- config: google_user_id, google_email, google_name, google_avatar_url
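A minimal sketch of how the upserted fields might be assembled from the token response and the user info. The function name `build_integration_fields` and the `encrypt` callable are hypothetical; only the field names and the 3600-second default come from the description above.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, Optional

DEFAULT_EXPIRES_IN = 3600  # seconds, used when the token response omits expires_in


def build_integration_fields(
    token_response: Dict[str, Any],
    user_info: Dict[str, Any],
    encrypt: Callable[[str], str],  # hypothetical encryption helper
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Map an OAuth token response plus user info onto Integration fields."""
    if now is None:
        now = datetime.now(timezone.utc)
    expires_in = int(token_response.get("expires_in", DEFAULT_EXPIRES_IN))
    return {
        "access_token_encrypted": encrypt(token_response["access_token"]),
        "refresh_token_encrypted": encrypt(token_response["refresh_token"]),
        "expires_at": now + timedelta(seconds=expires_in),
        "config": {
            "google_user_id": user_info.get("id"),
            "google_email": user_info.get("email"),
            "google_name": user_info.get("name"),
            "google_avatar_url": user_info.get("picture"),
        },
    }
```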
Token refresh
The service helper (_get_google_drive_service) refreshes the access token automatically if expires_at is within 60 seconds of now. On refresh, access_token_encrypted and expires_at are updated in place. If the refresh call itself fails, the helper falls through and tries the existing token once before surfacing token_expired.
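The refresh-then-fall-through behaviour can be sketched as follows. This is not the actual `_get_google_drive_service` code: the function names, the plain-dict integration, and the `refresh`/`call` callables are assumptions, and token encryption is left out for brevity.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, Optional, Tuple

REFRESH_BUFFER = timedelta(seconds=60)  # refresh when expiry is this close


class TokenExpired(Exception):
    """Stand-in for the error that surfaces as token_expired."""


def needs_refresh(expires_at: datetime, now: datetime) -> bool:
    return expires_at - now <= REFRESH_BUFFER


def call_with_refresh(
    integration: Dict[str, Any],
    refresh: Callable[[], Tuple[str, int]],  # returns (new_token, expires_in)
    call: Callable[[str], Any],              # performs the Drive request
    now: Optional[datetime] = None,
) -> Any:
    if now is None:
        now = datetime.now(timezone.utc)
    if needs_refresh(integration["expires_at"], now):
        try:
            token, expires_in = refresh()
        except Exception:
            # Refresh failed: fall through and try the existing token once.
            pass
        else:
            # Refresh succeeded: update the stored token in place.
            integration["access_token"] = token
            integration["expires_at"] = now + timedelta(seconds=expires_in)
    # If the token is genuinely invalid, call() raises TokenExpired.
    return call(integration["access_token"])
```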
Agent Tools
All eight live in app/core/sdk/tools/integration.py. Three are @gated.
| Tool | Args | Returns | HITL | Purpose |
|---|---|---|---|---|
| google_drive_search | workspace_id, query, file_type?, folder_id?, max_results? | files[], count, message | No | Full-text search. file_type enum: document, spreadsheet, presentation, pdf, folder. max_results default 20, max 100. |
| google_drive_read | workspace_id, file_id | content, file_name, file_type (MIME), web_link, modified_time, truncated | No | Read a file. Google Docs export as text; Sheets as CSV; PDFs, DOCX, and plain text are parsed. |
| google_drive_list_folder | workspace_id, folder_id?, max_results? | files[], count, folder_id | No | List folder contents. folder_id defaults to "root" (My Drive). max_results default 50, max 100. |
| google_drive_read_sheet | workspace_id, file_id, sheet_name?, cell_range? | sheet_name, headers[], rows[][], total_rows, available_sheets[], file_name, web_link | No | Read a Sheet as structured {headers, rows}; preferred over google_drive_read for spreadsheets. cell_range in A1 notation. |
| google_drive_get_comments | workspace_id, file_id | comments[], count | No | Fetch file comments with replies and resolution status. Use to read stakeholder review annotations. |
| google_drive_write | workspace_id, filename, content, folder_id?, file_type? | file_id, file_name, web_link | Yes | Create a new file. file_type maps to MIME via _DRIVE_MIME_MAP. |
| google_drive_update | workspace_id, file_id, content, append? | file_id, file_name, web_link | Yes | Update an existing file. append=true adds after existing content; false (default) overwrites. |
| google_drive_batch_import | workspace_id, folder_id, source_type? | imported_count, failed_count, sources[], failed[], message | Yes | Bulk-import text-extractable files from a folder into the workspace research library. Capped at 20 files per call. source_type enum: interview, support_ticket, survey, usage_data, feedback, other. |
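As an illustration of how the google_drive_search arguments could translate into a Drive v3 files.list request, here is a hedged sketch. The helper names are hypothetical, and the `trashed = false` clause is an assumption rather than documented behaviour; the MIME types and the `fullText contains`, `mimeType =`, and `in parents` query terms are standard Drive API syntax.

```python
from typing import Dict, Optional, Union

# Drive MIME types for the documented file_type enum.
FILE_TYPE_MIME = {
    "document": "application/vnd.google-apps.document",
    "spreadsheet": "application/vnd.google-apps.spreadsheet",
    "presentation": "application/vnd.google-apps.presentation",
    "pdf": "application/pdf",
    "folder": "application/vnd.google-apps.folder",
}

DEFAULT_MAX_RESULTS = 20
MAX_RESULTS_CAP = 100


def _quote(value: str) -> str:
    """Quote a string literal for a Drive query (escape backslash and quote)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_search_params(
    query: str,
    file_type: Optional[str] = None,
    folder_id: Optional[str] = None,
    max_results: Optional[int] = None,
) -> Dict[str, Union[str, int]]:
    """Build the q and pageSize parameters for files.list."""
    clauses = [f"fullText contains {_quote(query)}", "trashed = false"]
    if file_type is not None:
        if file_type not in FILE_TYPE_MIME:
            raise ValueError(f"unsupported file_type: {file_type!r}")
        clauses.append(f"mimeType = {_quote(FILE_TYPE_MIME[file_type])}")
    if folder_id:
        clauses.append(f"{_quote(folder_id)} in parents")
    if max_results is None:
        page_size = DEFAULT_MAX_RESULTS
    else:
        # Clamp to the documented range: at least 1, at most 100.
        page_size = max(1, min(int(max_results), MAX_RESULTS_CAP))
    return {"q": " and ".join(clauses), "pageSize": page_size}
```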
MIME mapping for google_drive_write
Single source of truth for file_type → MIME (shared between the preview and the handler):
| file_type | MIME type | Result |
|---|---|---|
| document (default) | application/vnd.google-apps.document | New Google Doc |
| text | text/plain | Plain .txt file |
| csv | text/csv | CSV file |
Unknown values fall back to document.
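The mapping and its fallback amount to a dictionary lookup with a default. The sketch below reuses the `_DRIVE_MIME_MAP` name from the tools table; the `resolve_drive_mime` helper is a hypothetical name for illustration.

```python
from typing import Optional

_DRIVE_MIME_MAP = {
    "document": "application/vnd.google-apps.document",
    "text": "text/plain",
    "csv": "text/csv",
}
_DEFAULT_FILE_TYPE = "document"


def resolve_drive_mime(file_type: Optional[str] = None) -> str:
    """Return the MIME type for file_type; unknown or missing values fall back to document."""
    return _DRIVE_MIME_MAP.get(
        file_type or _DEFAULT_FILE_TYPE,
        _DRIVE_MIME_MAP[_DEFAULT_FILE_TYPE],
    )
```

Sharing one helper between the preview and the handler keeps the MIME type shown for approval identical to the one used on write.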
Error shape
GoogleDriveServiceError subclasses map to typed payloads:
{
"success": false,
"error": "Google Drive token expired. Please reconnect in Settings.",
"error_code": "token_expired"
}
| Exception | error_code | Notes |
|---|---|---|
| GoogleDriveNotFound | not_found | - |
| GoogleDriveForbidden | forbidden | - |
| GoogleDriveRateLimited | rate_limited | Includes retry_after seconds |
| GoogleDriveTokenExpired | token_expired | Surfaces a reconnect prompt |
| other | unknown | Logged with stack trace server-side |
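One way to realise this mapping is a class attribute per exception. The class names match the table; the `error_code` attribute, the `retry_after` constructor argument, and the `error_payload` helper are assumptions made for this sketch.

```python
from typing import Any, Dict, Optional


class GoogleDriveServiceError(Exception):
    error_code = "unknown"


class GoogleDriveNotFound(GoogleDriveServiceError):
    error_code = "not_found"


class GoogleDriveForbidden(GoogleDriveServiceError):
    error_code = "forbidden"


class GoogleDriveTokenExpired(GoogleDriveServiceError):
    error_code = "token_expired"


class GoogleDriveRateLimited(GoogleDriveServiceError):
    error_code = "rate_limited"

    def __init__(self, message: str, retry_after: Optional[int] = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after  # seconds to wait before retrying


def error_payload(exc: Exception) -> Dict[str, Any]:
    """Convert any exception into the typed error payload."""
    payload: Dict[str, Any] = {
        "success": False,
        "error": str(exc),
        # Exceptions outside the hierarchy have no error_code: map to unknown.
        "error_code": getattr(exc, "error_code", "unknown"),
    }
    if isinstance(exc, GoogleDriveRateLimited) and exc.retry_after is not None:
        payload["retry_after"] = exc.retry_after
    return payload
```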
Gated Write Tools
Three tools route through the pending-actions queue. Their preview functions are pure over the tool input and truncate content for UI rendering:
- google_drive_write: preview shows filename, folder_id, file_type, resolved mime_type, a 4000-char content_preview, plus char_count and word_count.
- google_drive_update: preview shows file_id, append flag, char_count, and a 500-char content_preview.
- google_drive_batch_import: preview shows folder_id and source_type. The handler validates source_type at the tool boundary (JSON-schema enums are advisory, not enforced).
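A preview that is pure over the tool input could look like the sketch below for google_drive_write. The function name and the whitespace-based word count are assumptions; the field names and the 4000-character limit follow the description above.

```python
from typing import Any, Dict

_DRIVE_MIME_MAP = {
    "document": "application/vnd.google-apps.document",
    "text": "text/plain",
    "csv": "text/csv",
}
WRITE_PREVIEW_LIMIT = 4000  # characters of content shown in the approval UI


def preview_google_drive_write(tool_input: Dict[str, Any]) -> Dict[str, Any]:
    """Build the approval preview from the tool input alone (no I/O, no state)."""
    content = tool_input.get("content", "")
    file_type = tool_input.get("file_type") or "document"
    return {
        "filename": tool_input.get("filename"),
        "folder_id": tool_input.get("folder_id"),
        "file_type": file_type,
        "mime_type": _DRIVE_MIME_MAP.get(file_type, _DRIVE_MIME_MAP["document"]),
        "content_preview": content[:WRITE_PREVIEW_LIMIT],
        "char_count": len(content),
        "word_count": len(content.split()),  # assumption: whitespace-separated words
    }
```

Because the function touches nothing outside its argument, the same input always renders the same preview, which keeps the approval screen reproducible.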
REST Endpoints
File browse
GET /api/integrations/google-drive/files?workspace_id={id}&query=...&file_type=...
- With query: full-text search across Drive (max 30).
- Without query: lists recent files from root (max 40), folders filtered out.
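For client code, the two modes differ only in whether `query` is sent. A small sketch of building the request URL, assuming a hypothetical `base_url` and leaving out authentication (not described here):

```python
from typing import Optional
from urllib.parse import urlencode

FILES_PATH = "/api/integrations/google-drive/files"


def build_files_url(
    base_url: str,  # hypothetical deployment origin, e.g. https://example.test
    workspace_id: str,
    query: Optional[str] = None,
    file_type: Optional[str] = None,
) -> str:
    """Build the browse/search URL; omitting query selects the recent-files listing."""
    params = {"workspace_id": workspace_id}
    if query:
        params["query"] = query
    if file_type:
        params["file_type"] = file_type
    return f"{base_url.rstrip('/')}{FILES_PATH}?{urlencode(params)}"
```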
Single-file research import
POST /api/integrations/google-drive/import
{
"workspace_id": "your-workspace-uuid",
"file_id": "drive-file-id",
"source_type": "interview",
"title": "Optional — defaults to file_name",
"file_name": "Q1 Interview — Sam"
}
Reads the file content directly from Drive (no R2 upload), stores extracted_text on the ResearchSource, and marks processing_status="completed" immediately — no background processing needed. Returns { source_id, title, processing_status, web_view_link }.
source_type must be one of: interview, support_ticket, survey, usage_data, feedback, other.
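A client can validate source_type before sending the request. The sketch below assembles the request body; the `build_import_payload` helper is hypothetical, while `_VALID_SOURCE_TYPES` borrows the name the handler is said to use.

```python
from typing import Dict, Optional

_VALID_SOURCE_TYPES = frozenset(
    {"interview", "support_ticket", "survey", "usage_data", "feedback", "other"}
)


def build_import_payload(
    workspace_id: str,
    file_id: str,
    file_name: str,
    source_type: str,
    title: Optional[str] = None,
) -> Dict[str, str]:
    """Build the POST body for the single-file import endpoint."""
    if source_type not in _VALID_SOURCE_TYPES:
        allowed = ", ".join(sorted(_VALID_SOURCE_TYPES))
        raise ValueError(f"source_type must be one of: {allowed}")
    payload = {
        "workspace_id": workspace_id,
        "file_id": file_id,
        "source_type": source_type,
        "file_name": file_name,
    }
    if title:
        # title is optional; when omitted it defaults to file_name.
        payload["title"] = title
    return payload
```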
Management Endpoints
| Endpoint | Purpose |
|---|---|
| GET /api/integrations/google-drive/status?workspace_id={id} | connected, google_email, google_name, google_avatar_url, last_synced_at |
| GET /api/integrations/google-drive/files?... | Search or browse Drive files |
| POST /api/integrations/google-drive/import | Import one file as a research source |
| DELETE /api/integrations/google-drive/disconnect | Removes the Integration row entirely |
Key Concepts
- Automatic refresh, automatic fallback: the service helper refreshes tokens on a 60-second pre-expiry buffer. If refresh fails, it falls through to the old token once; genuine auth errors surface as token_expired.
- Structured sheet reads: google_drive_read_sheet returns {headers, rows} plus available_sheets so the agent can iterate across tabs without fetching twice.
- Batch import is capped: google_drive_batch_import stops at 20 files per call to keep write amplification bounded. Failures per file are reported in failed[] with reasons; partial success is the norm.
- Source-type enum is enforced at the tool boundary: the JSON schema enum is advisory to the model, so the handler re-validates against _VALID_SOURCE_TYPES.
- Three gated tools, five read tools: the read tools run freely; writes always queue for approval.
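The batch cap and per-file failure reporting can be sketched as a loop that never aborts on a single bad file. The `batch_import` function, the `import_one` callable, and the shape of each `failed[]` entry are assumptions; the result keys follow the tool's documented return fields.

```python
from typing import Any, Callable, Dict, List

BATCH_IMPORT_CAP = 20  # files processed per call


def batch_import(
    files: List[Dict[str, Any]],
    import_one: Callable[[Dict[str, Any], str], Dict[str, Any]],  # hypothetical per-file importer
    source_type: str,
) -> Dict[str, Any]:
    """Import up to BATCH_IMPORT_CAP files, collecting failures instead of raising."""
    sources: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for file in files[:BATCH_IMPORT_CAP]:
        try:
            sources.append(import_one(file, source_type))
        except Exception as exc:
            # Record the reason and keep going: partial success is expected.
            failed.append(
                {"file_id": file.get("id"), "file_name": file.get("name"), "reason": str(exc)}
            )
    attempted = len(sources) + len(failed)
    return {
        "imported_count": len(sources),
        "failed_count": len(failed),
        "sources": sources,
        "failed": failed,
        "message": f"Imported {len(sources)} of {attempted} files",
    }
```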
What's Next
- Pending Actions — how gated Drive writes route through approval.
- Knowledge Graph — imported Drive files become research sources that feed the graph.
- GitHub Integration / Linear Integration — the other two deep surfaces.