Google Drive Integration

Eight Google Drive agent tools for search, read, write, and batch import, with MIME mapping and file-type filters.

Overview

The Google Drive integration gives the agent eight tools for searching, reading, writing, and bulk-importing from Drive — plus OAuth with automatic token refresh and a REST surface for file browsing and single-file research import. Three write tools are @gated and route through pending actions.

Connecting

1. Initiate OAuth

POST /api/integrations/google-drive/connect
{ "workspace_id": "your-workspace-uuid" }

Returns { auth_url, state }. OAuth state rows expire after 10 minutes and are one-time-use (deleted on read).
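
The expiry and one-time-use rules for OAuth state can be sketched as below. This is a minimal in-memory illustration only: OAuthStateStore, issue, and consume are hypothetical names, and the source says only that state rows expire after 10 minutes and are deleted on read.

```python
import secrets
import time

STATE_TTL_SECONDS = 10 * 60  # state rows expire after 10 minutes


class OAuthStateStore:
    """Hypothetical in-memory stand-in for the OAuth state rows."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._rows = {}  # state -> (workspace_id, created_at)

    def issue(self, workspace_id):
        """Create a random state value bound to a workspace."""
        state = secrets.token_urlsafe(32)
        self._rows[state] = (workspace_id, self._clock())
        return state

    def consume(self, state):
        """Return the workspace_id once; None if unknown, already used, or expired."""
        row = self._rows.pop(state, None)  # deleted on read, so one-time-use
        if row is None:
            return None
        workspace_id, created_at = row
        if self._clock() - created_at > STATE_TTL_SECONDS:
            return None
        return workspace_id
```

Popping the row before checking its age means an expired state is also cleaned up on its first read.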

2. Handle callback

Google redirects to GET /api/integrations/google-drive/callback?code=...&state=.... The backend exchanges the code for access + refresh tokens, fetches user info (id, email, name, picture), encrypts both tokens, and upserts the Integration with:

  • access_token_encrypted
  • refresh_token_encrypted
  • expires_at (now + expires_in seconds, default 3600)
  • config: google_user_id, google_email, google_name, google_avatar_url
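
As a rough sketch of how those fields could be assembled from the token response and the user info: build_integration_fields and the encrypt callable are hypothetical, and the real upsert logic is not shown in this page. The only behavior taken from the text above is the field list and the 3600-second default for expires_in.

```python
from datetime import datetime, timedelta, timezone


def build_integration_fields(token_response, user_info, encrypt, now=None):
    """Assemble the Integration fields listed above (hypothetical helper).

    `encrypt` is an assumed callable that turns a token string into ciphertext.
    """
    now = now or datetime.now(timezone.utc)
    # Fall back to 3600 seconds when the token response omits expires_in.
    expires_in = int(token_response.get("expires_in", 3600))
    return {
        "access_token_encrypted": encrypt(token_response["access_token"]),
        "refresh_token_encrypted": encrypt(token_response["refresh_token"]),
        "expires_at": now + timedelta(seconds=expires_in),
        "config": {
            "google_user_id": user_info["id"],
            "google_email": user_info["email"],
            "google_name": user_info.get("name"),
            "google_avatar_url": user_info.get("picture"),
        },
    }
```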

3. Token refresh

The service helper (_get_google_drive_service) refreshes the access token automatically if expires_at is within 60 seconds of now. On refresh, access_token_encrypted and expires_at are updated in place. If the refresh call itself fails, the helper falls through and tries the existing token once before surfacing token_expired.
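
The refresh-then-fall-through behavior might look roughly like this. It is a simplified sketch, not the body of _get_google_drive_service: get_access_token, the dict-shaped integration, and the refresh callable are assumptions, and tokens are shown unencrypted for brevity. Surfacing token_expired happens later, when a Drive call with the old token fails, and is not shown.

```python
from datetime import datetime, timedelta, timezone

REFRESH_BUFFER = timedelta(seconds=60)  # refresh when within 60 s of expiry


def needs_refresh(expires_at, now):
    """True when the token expires within the buffer (or already has)."""
    return expires_at - now <= REFRESH_BUFFER


def get_access_token(integration, refresh, now=None):
    """Return a usable access token, refreshing it when close to expiry.

    `integration` is an assumed dict with `access_token` and `expires_at`;
    `refresh` is an assumed callable returning (new_token, expires_in_seconds).
    """
    now = now or datetime.now(timezone.utc)
    if needs_refresh(integration["expires_at"], now):
        try:
            token, expires_in = refresh()
        except Exception:
            # Refresh failed: fall through and let the caller try the old token once.
            return integration["access_token"]
        # Refresh succeeded: update the stored token and expiry in place.
        integration["access_token"] = token
        integration["expires_at"] = now + timedelta(seconds=expires_in)
    return integration["access_token"]
```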

Agent Tools

All eight live in app/core/sdk/tools/integration.py. Three are @gated.

| Tool | Args | Returns | HITL | Purpose |
| --- | --- | --- | --- | --- |
| google_drive_search | workspace_id, query, file_type?, folder_id?, max_results? | files[], count, message | No | Full-text search. file_type enum: document, spreadsheet, presentation, pdf, folder. max_results default 20, max 100. |
| google_drive_read | workspace_id, file_id | content, file_name, file_type (MIME), web_link, modified_time, truncated | No | Read a file. Google Docs export as text; Sheets as CSV; PDFs, DOCX, and plain text are parsed. |
| google_drive_list_folder | workspace_id, folder_id?, max_results? | files[], count, folder_id | No | List folder contents. folder_id defaults to "root" (My Drive). max_results default 50, max 100. |
| google_drive_read_sheet | workspace_id, file_id, sheet_name?, cell_range? | sheet_name, headers[], rows[][], total_rows, available_sheets[], file_name, web_link | No | Read a Sheet as structured {headers, rows}; preferred over google_drive_read for spreadsheets. cell_range in A1 notation. |
| google_drive_get_comments | workspace_id, file_id | comments[], count | No | Fetch file comments with replies and resolution status. Use to read stakeholder review annotations. |
| google_drive_write | workspace_id, filename, content, folder_id?, file_type? | file_id, file_name, web_link | Yes | Create a new file. file_type maps to MIME via _DRIVE_MIME_MAP. |
| google_drive_update | workspace_id, file_id, content, append? | file_id, file_name, web_link | Yes | Update an existing file. append=true adds after existing content; false (default) overwrites. |
| google_drive_batch_import | workspace_id, folder_id, source_type? | imported_count, failed_count, sources[], failed[], message | Yes | Bulk-import text-extractable files from a folder into the workspace research library. Capped at 20 files per call. source_type enum: interview, support_ticket, survey, usage_data, feedback, other. |
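
To illustrate the structured shape that google_drive_read_sheet returns, here is a minimal sketch that splits a raw 2-D grid of cell values into headers and rows. shape_sheet_values is a hypothetical helper; treating the first row as headers, padding short rows with empty strings, and counting only data rows in total_rows are all assumptions rather than documented behavior.

```python
def shape_sheet_values(values):
    """Split a raw grid of cell values into headers and data rows."""
    if not values:
        return {"headers": [], "rows": [], "total_rows": 0}
    # Assumption: the first row of the range holds the column headers.
    headers = [str(cell) for cell in values[0]]
    width = len(headers)
    # Pad rows that are shorter than the header row so columns stay aligned.
    rows = [list(row) + [""] * (width - len(row)) for row in values[1:]]
    return {"headers": headers, "rows": rows, "total_rows": len(rows)}
```

Padding keeps every row indexable by header position, which matters when a spreadsheet export omits trailing empty cells.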

MIME mapping for google_drive_write

Single source of truth for file_type → MIME (shared between the preview and the handler):

| file_type | MIME type | Result |
| --- | --- | --- |
| document (default) | application/vnd.google-apps.document | New Google Doc |
| text | text/plain | Plain .txt file |
| csv | text/csv | CSV file |

Unknown values fall back to document.
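
The mapping and its fallback can be sketched as a dictionary lookup. The contents of _DRIVE_MIME_MAP below are copied from the table above; resolve_drive_mime is a hypothetical helper name, and the actual lookup code may differ.

```python
_DRIVE_MIME_MAP = {
    "document": "application/vnd.google-apps.document",
    "text": "text/plain",
    "csv": "text/csv",
}


def resolve_drive_mime(file_type=None):
    """Map file_type to a MIME type; missing or unknown values fall back to document."""
    return _DRIVE_MIME_MAP.get(file_type or "document", _DRIVE_MIME_MAP["document"])
```

Because both the preview and the handler would call the same lookup, the MIME type shown for approval matches the one used on write.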

Error shape

GoogleDriveServiceError subclasses map to typed payloads:

{
  "success": false,
  "error": "Google Drive token expired. Please reconnect in Settings.",
  "error_code": "token_expired"
}

| Exception | error_code | Notes |
| --- | --- | --- |
| GoogleDriveNotFound | not_found | |
| GoogleDriveForbidden | forbidden | |
| GoogleDriveRateLimited | rate_limited | Includes retry_after seconds |
| GoogleDriveTokenExpired | token_expired | Surfaces a reconnect prompt |
| other | unknown | Logged with stack trace server-side |
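
One way to express this mapping is a class attribute per exception plus a single conversion function. The class names and error codes come from the table above; the error_code attribute, the constructor of GoogleDriveRateLimited, the placement of retry_after in the payload, the generic message for unknown errors, and the to_error_payload helper are all assumptions made for illustration.

```python
class GoogleDriveServiceError(Exception):
    error_code = "unknown"


class GoogleDriveNotFound(GoogleDriveServiceError):
    error_code = "not_found"


class GoogleDriveForbidden(GoogleDriveServiceError):
    error_code = "forbidden"


class GoogleDriveTokenExpired(GoogleDriveServiceError):
    error_code = "token_expired"


class GoogleDriveRateLimited(GoogleDriveServiceError):
    error_code = "rate_limited"

    def __init__(self, message, retry_after):
        super().__init__(message)
        self.retry_after = retry_after  # seconds to wait before retrying


def to_error_payload(exc):
    """Convert an exception into the typed error payload (hypothetical helper)."""
    if not isinstance(exc, GoogleDriveServiceError):
        # Anything else maps to "unknown"; the message here is a placeholder.
        return {"success": False, "error": "Unexpected Google Drive error.", "error_code": "unknown"}
    payload = {"success": False, "error": str(exc), "error_code": exc.error_code}
    retry_after = getattr(exc, "retry_after", None)
    if retry_after is not None:
        payload["retry_after"] = retry_after
    return payload
```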

Gated Write Tools

Three tools route through the pending-actions queue. Their preview functions are pure functions of the tool input and truncate content for UI rendering, to at most 4000 characters:

  • google_drive_write — preview shows filename, folder_id, file_type, resolved mime_type, a 4000-char content_preview, plus char_count and word_count.
  • google_drive_update — preview shows file_id, append flag, char_count, and a 500-char content_preview.
  • google_drive_batch_import — preview shows folder_id and source_type. The handler validates source_type at the tool boundary (JSON-schema enums are advisory, not enforced).
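
A preview for google_drive_write along these lines could be built as follows. This is a sketch under assumptions: preview_google_drive_write is a hypothetical name, word_count is computed by whitespace splitting, and the MIME map is repeated here only to keep the example self-contained.

```python
PREVIEW_LIMIT = 4000  # characters of content shown in the approval UI

_DRIVE_MIME_MAP = {
    "document": "application/vnd.google-apps.document",
    "text": "text/plain",
    "csv": "text/csv",
}


def preview_google_drive_write(tool_input):
    """Build the approval preview from the tool input alone (no I/O, no mutation)."""
    content = tool_input.get("content", "")
    file_type = tool_input.get("file_type") or "document"
    return {
        "filename": tool_input.get("filename"),
        "folder_id": tool_input.get("folder_id"),
        "file_type": file_type,
        "mime_type": _DRIVE_MIME_MAP.get(file_type, _DRIVE_MIME_MAP["document"]),
        "content_preview": content[:PREVIEW_LIMIT],
        "char_count": len(content),
        "word_count": len(content.split()),  # assumption: whitespace-delimited words
    }
```

Reporting char_count and word_count for the full content lets a reviewer see how much was cut from the preview.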

REST Endpoints

File browse

GET /api/integrations/google-drive/files?workspace_id={id}&query=...&file_type=...
  • With query: full-text search across Drive (max 30).
  • Without query: lists recent files from root (max 40), folders filtered out.
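
For clients calling this endpoint, the query string can be assembled as in the sketch below. build_files_url is a hypothetical helper, and omitting query and file_type when they are empty is an assumption about how a caller would select browse mode.

```python
from urllib.parse import urlencode

FILES_PATH = "/api/integrations/google-drive/files"


def build_files_url(workspace_id, query=None, file_type=None):
    """Build the file-browse path; leaving out `query` requests the recent-files listing."""
    params = {"workspace_id": workspace_id}
    if query:
        params["query"] = query
    if file_type:
        params["file_type"] = file_type
    return f"{FILES_PATH}?{urlencode(params)}"
```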

Single-file research import

POST /api/integrations/google-drive/import
{
  "workspace_id": "your-workspace-uuid",
  "file_id": "drive-file-id",
  "source_type": "interview",
  "title": "Optional — defaults to file_name",
  "file_name": "Q1 Interview — Sam"
}

Reads the file content directly from Drive (no R2 upload), stores extracted_text on the ResearchSource, and marks processing_status="completed" immediately — no background processing needed. Returns { source_id, title, processing_status, web_view_link }.

source_type must be one of: interview, support_ticket, survey, usage_data, feedback, other.
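
A client can check source_type before sending the request. The sketch below mirrors the allowed values listed above; build_import_request is a hypothetical helper, and omitting title so that the default applies is an assumption about the endpoint.

```python
_VALID_SOURCE_TYPES = frozenset(
    {"interview", "support_ticket", "survey", "usage_data", "feedback", "other"}
)


def build_import_request(workspace_id, file_id, source_type, file_name, title=None):
    """Validate source_type and build the JSON body for the import endpoint."""
    if source_type not in _VALID_SOURCE_TYPES:
        allowed = ", ".join(sorted(_VALID_SOURCE_TYPES))
        raise ValueError(f"Invalid source_type {source_type!r}; expected one of: {allowed}")
    body = {
        "workspace_id": workspace_id,
        "file_id": file_id,
        "source_type": source_type,
        "file_name": file_name,
    }
    if title:
        body["title"] = title  # left out otherwise, so the default (file_name) applies
    return body
```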

Management Endpoints

| Endpoint | Purpose |
| --- | --- |
| GET /api/integrations/google-drive/status?workspace_id={id} | Returns connected, google_email, google_name, google_avatar_url, last_synced_at |
| GET /api/integrations/google-drive/files?... | Search or browse Drive files |
| POST /api/integrations/google-drive/import | Import one file as a research source |
| DELETE /api/integrations/google-drive/disconnect | Removes the Integration row entirely |

Key Concepts

  • Automatic refresh, automatic fallback: the service helper refreshes tokens on a 60-second pre-expiry buffer. If refresh fails, it falls through to the old token once; genuine auth errors surface as token_expired.
  • Structured sheet reads: google_drive_read_sheet returns {headers, rows} plus available_sheets so the agent can iterate across tabs without fetching twice.
  • Batch import is capped: google_drive_batch_import stops at 20 files per call to keep write amplification bounded. Failures per file are reported in failed[] with reasons; partial success is the norm.
  • Source-type enum is enforced at the tool boundary: the JSON schema enum is advisory to the model, so the handler re-validates against _VALID_SOURCE_TYPES.
  • Three gated tools, five read tools: the read tools run freely; writes always queue for approval.
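
The cap and the partial-success reporting of the batch import can be sketched as a loop that records each failure and keeps going. batch_import, the import_one callable, and the file_id/reason keys in failed[] are hypothetical; only the 20-file cap and the result field names come from this page.

```python
BATCH_IMPORT_LIMIT = 20  # files processed per call


def batch_import(files, import_one):
    """Import up to 20 files, collecting per-file failures instead of aborting.

    `files` is an assumed list of dicts with an "id" key; `import_one` is an
    assumed callable that returns a source record or raises on failure.
    """
    sources, failed = [], []
    for file in files[:BATCH_IMPORT_LIMIT]:
        try:
            sources.append(import_one(file))
        except Exception as exc:
            # One bad file does not stop the batch; record why it failed.
            failed.append({"file_id": file["id"], "reason": str(exc)})
    return {
        "imported_count": len(sources),
        "failed_count": len(failed),
        "sources": sources,
        "failed": failed,
    }
```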
