Prompt Chain

Download OpenAI Compliance Logs

Bash and PowerShell scripts that download OpenAI Compliance API logs for ChatGPT Enterprise workspaces and API Platform organizations into SIEM or data lake

Works with openai

91
Spark score
out of 100
Updated 3 months ago
Version 1.0.0
Models

Add to Favorites

Why it matters

Automate the retrieval of compliance logs from OpenAI's platform. This asset enables you to download log files for security analysis and data lake ingestion.

Outcomes

What it gets done

01

Download compliance log files via the OpenAI API.

02

Filter logs by event type and time range.

03

Handle pagination for large log datasets.

04

Ingest logs into SIEM or data lake solutions.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/oai-logsplatform | bash

Steps

Steps in the chain

01
Set up prerequisites

Ensure you have an Enterprise Compliance API key exported as COMPLIANCE_API_KEY, the ChatGPT account ID or API Platform Org ID for the principal in question, and specific requirements for your environment.

02
Choose environment and save script

For Unix-based systems: Save the script locally as download_compliance_files.sh and mark it executable. Ensure you have up-to-date bash, curl, sed, and date installed. For Windows-based systems: Save the script locally as download_compliance_files.ps1 and open PowerShell (Version 5.1+).

03
Format date parameter

Format the date you want to get every log after as an ISO 8601 string including timezone (e.g., using date -u -v-1d +%Y-%m-%dT%H:%M:%SZ for Unix or Get-Date -AsUTC for Windows).

04
Execute script with parameters

Run the script with four parameters: workspace_or_org_id, event_type, limit, and after date. Example: ./download_compliance_files.sh <workspace_or_org_id> <event_type> <limit> <after> or .\download_compliance_files.ps1 <workspace_or_org_id> <event_type> <limit> <after>

05
List and paginate log files

The script handles listing and paging through available log files by making API requests to the Compliance API endpoint with the specified event type, limit, and after parameters. It checks for has_more flag to determine if additional pages exist.

06
Download log files

For each log file ID returned from the listing, download the log by making a request to the logs/{id} endpoint. The script writes the output to stdout for ingestion into your SIEM or data lake.

Overview

OpenAI Compliance Logs Platform quickstart

What it does

Two quickstart scripts demonstrating how to build an integration with the OpenAI Compliance API to retrieve and process log data for given event types and time ranges.

How it connects

Use when you need to download compliance logs from ChatGPT Enterprise, ChatGPT Edu, or API Platform organizations and ingest them into your SIEM or data lake for audit and compliance purposes.

Source README

OpenAI Compliance Logs Platform quickstart

Use this notebook to get started using the OpenAI Compliance Logs Platform. The examples focus on downloading log files so you can ingest them into your SIEM or data lake.

Prerequisites

  • An Enterprise Compliance API key exported as COMPLIANCE_API_KEY.
  • The ChatGPT account ID or the API Platform Org ID for the principal in question.
  • Specific requirements for your environment

Quickstart Scripts

Provided below are functionally identical scripts - one for Unix-based and one for Windows-based environments.
These scripts give an example of how one could build an integration with the Compliance API to retrieve and process
log data for given event types and time ranges.
These scripts handle listing and paging through the available log files and downloading them - writing the output to stdout.

Example invocations of these scripts are embedded in their help blocks - execute them with no arguments to see them.

Option 1: Unix-based

Prerequisites:

  • Save the script locally as download_compliance_files.sh and mark it executable
  • Make sure you have up-to-date bash, curl, sed, and date installed.
  • Format the date you want to get every log after as an ISO 8601 string including timezone.

Run the script akin to ./download_compliance_files.sh <workspace_or_org_id> <event_type> <limit> <after>

#!/usr/bin/env bash
set -euo pipefail

usage() {
  echo "Usage: $0 <workspace_or_org_id> <event_type> <limit> <after>" >&2
  echo >&2
  echo 'Examples: ' >&2
  echo 'COMPLIANCE_API_KEY=<KEY> ./download_compliance_files.sh f7f33107-5fb9-4ee1-8922-3eae76b5b5a0 AUTH_LOG 100 "$(date -u -v-1d +%Y-%m-%dT%H:%M:%SZ)" > output.jsonl' >&2
  echo 'COMPLIANCE_API_KEY=<KEY> ./download_compliance_files.sh org-p13k3klgno5cqxbf0q8hpgrk AUTH_LOG 100 "$(date -u -v-1d +%Y-%m-%dT%H:%M:%SZ)" > output.jsonl' >&2
}

if [[ $# -ne 4 ]]; then
  usage
  exit 2
fi

PRINCIPAL_ID="$1"
EVENT_TYPE="$2"
LIMIT="$3"
INITIAL_AFTER="$4"

### Require COMPLIANCE_API_KEY to be present and non-empty before using it
if [[ -z "${COMPLIANCE_API_KEY:-}" ]]; then
  echo "COMPLIANCE_API_KEY environment variable is required. e.g.:" >&2
  echo "COMPLIANCE_API_KEY=<KEY> $0 <workspace_or_org_id> <event_type> <limit> <after>" >&2
  exit 2
fi

API_BASE="https://api.chatgpt.com/v1/compliance"
AUTH_HEADER=("-H" "Authorization: Bearer ${COMPLIANCE_API_KEY}")

### Determine whether the first arg is a workspace ID or an org ID.
### If it starts with "org-" treat it as an organization ID and switch the path segment accordingly.
SCOPE_SEGMENT="workspaces"
if [[ "${PRINCIPAL_ID}" == org-* ]]; then
  SCOPE_SEGMENT="organizations"
fi

### Perform a curl request and fail fast on HTTP errors, logging context to stderr.
### Usage: perform_curl "description of action" <curl args...>
perform_curl() {
  local description="$1"
  shift
  # Capture body and HTTP status code, keeping body on stdout-like var
  # We append a newline before the status to reliably split even if body has no trailing newline.
  local combined
  if ! combined=$(curl -sS -w "\n%{http_code}" "$@"); then
    echo "Network/transport error while ${description}" >&2
    exit 1
  fi
  local http_code
  http_code="${combined##*$'\n'}"
  local body
  body="${combined%$'\n'*}"

  if [[ ! "${http_code}" =~ ^2[0-9][0-9]$ ]]; then
    echo "HTTP error ${http_code} while ${description}:" >&2
    if [[ -n "${body}" ]]; then
      # Print the body to stderr so it doesn't corrupt stdout stream
      echo "${body}" | jq . >&2
    fi
    exit 1
  fi

  # On success, emit body to stdout for callers to consume
  echo "${body}"
}

list_logs() {
  local after="$1"
  perform_curl "listing logs (after=${after}, event_type=${EVENT_TYPE}, limit=${LIMIT})" \
    -G \
    "${API_BASE}/${SCOPE_SEGMENT}/${PRINCIPAL_ID}/logs" \
    "${AUTH_HEADER[@]}" \
    --data-urlencode "limit=${LIMIT}" \
    --data-urlencode "event_type=${EVENT_TYPE}" \
    --data-urlencode "after=${after}"
}

download_log() {
  local id="$1"
  echo "Fetching logs for ID: ${id}" >&2
  perform_curl "downloading log id=${id}" \
    -G -L \
    "${API_BASE}/${SCOPE_SEGMENT}/${PRINCIPAL_ID}/logs/${id}" \
    "${AUTH_HEADER[@]}"
}

to_local_human() {
  local iso="$1"
  if [[ -z "${iso}" || "${iso}" == "null" ]]; then
    echo ""
    return 0
  fi

  local iso_norm
  iso_norm=$(echo -n "${iso}" \
    | sed -E 's/\.[0-9]+(Z|[+-][0-9:]+)$/\1/' \
    | sed -E 's/([+-]00:00)$/Z/')

  # macOS/BSD date: parse UTC to epoch then format in local timezone
  local epoch
  epoch=$(date -j -u -f "%Y-%m-%dT%H:%M:%SZ" "${iso_norm}" +%s 2>/dev/null) || true
  if [[ -n "${epoch}" ]]; then
    date -r "${epoch}" "+%Y-%m-%d %H:%M:%S %Z" 2>/dev/null && return 0
  fi

  # Fallback to original if parsing failed
  echo "${iso}"
}

current_after="${INITIAL_AFTER}"
page=1
total_downloaded=0
while true; do
  echo "Fetching page ${page} with after='${current_after}' (local: $(to_local_human "${current_after}"))" >&2
  response_json="$(list_logs "${current_after}")"

  # Count and download each ID from the current page (if any)
  page_count="$(echo "${response_json}" | jq '.data | length')"
  if [[ "${page_count}" -gt 0 ]]; then
    echo "${response_json}" | jq -r '.data[].id' | while read -r id; do
      download_log "${id}"
    done
    total_downloaded=$((total_downloaded + page_count))
  fi

  has_more="$(echo "${response_json}" | jq -r '.has_more')"
  current_after="$(echo "${response_json}" | jq -r '.last_end_time')"
  if [[ "${has_more}" == "true" ]]; then
    page=$((page + 1))
  else
    break
  fi
done

if [[ "${total_downloaded}" -eq 0 && ( -z "${current_after}" || "${current_after}" == "null" ) ]]; then
  echo "No results found for event_type ${EVENT_TYPE} after ${INITIAL_AFTER}" >&2
else
  echo "Completed downloading ${total_downloaded} log files up to ${current_after} (local: $(to_local_human "${current_after}"))" >&2
fi

Option 2: Windows-based

Prerequisites:

  • Save the script locally as download_compliance_files.ps1
  • Open PowerShell (Version 5.1+) and navigate to the directory where the script is saved.

Run the script akin to .\download_compliance_files.ps1 <workspace_or_org_id> <event_type> <limit> <after>

#!/usr/bin/env pwsh
#Requires -Version 5.1

Set-StrictMode -Version Latest
$ErrorActionPreference = 'Stop'

Add-Type -AssemblyName System.Web

function Show-Usage {
    [Console]::Error.WriteLine(@"
Usage: .\download_compliance_files.ps1 <workspace_or_org_id> <event_type> <limit> <after>

Example:
  `$env:COMPLIANCE_API_KEY = '<KEY>'
  .\download_compliance_files.ps1 f7f33107-5fb9-4ee1-8922-3eae76b5b5a0 AUTH_LOG 100 (Get-Date -AsUTC).AddDays(-1).ToString('yyyy-MM-ddTHH:mm:ssZ') |
    Out-File -Encoding utf8 output.jsonl

Example (org id):
  `$env:COMPLIANCE_API_KEY = '<KEY>'
  .\download_compliance_files.ps1 org-p13k3klgno5cqxbf0q8hpgrk AUTH_LOG 100 (Get-Date -AsUTC).AddDays(-1).ToString('yyyy-MM-ddTHH:mm:ssZ') |
    Out-File -Encoding utf8 output.jsonl
"@)
}

if ($args.Count -ne 4) {
    Show-Usage
    exit 2
}

if (-not $env:COMPLIANCE_API_KEY) {
    [Console]::Error.WriteLine('COMPLIANCE_API_KEY environment variable must be set.')
    exit 2
}

$PrincipalId = $args[0]
$EventType = $args[1]
$Limit = $args[2]
$InitialAfter = $args[3]

$ApiBase = 'https://api.chatgpt.com/v1/compliance'

if ($PrincipalId.StartsWith('org-')) {
    $ScopeSegment = 'organizations'
} else {
    $ScopeSegment = 'workspaces'
}

$handler = [System.Net.Http.HttpClientHandler]::new()
$client = [System.Net.Http.HttpClient]::new($handler)
$client.DefaultRequestHeaders.Authorization = New-Object System.Net.Http.Headers.AuthenticationHeaderValue('Bearer', $env:COMPLIANCE_API_KEY)

function Invoke-ComplianceRequest {
    param(
        [Parameter(Mandatory = $true)] [string] $Description,
        [Parameter(Mandatory = $true)] [string] $Path,
        [hashtable] $Query = @{}
    )

    $builder = [System.UriBuilder]::new("$ApiBase/$ScopeSegment/$PrincipalId/$Path")
    $queryString = [System.Web.HttpUtility]::ParseQueryString($builder.Query)
    foreach ($key in $Query.Keys) {
        $queryString[$key] = $Query[$key]
    }
    $builder.Query = $queryString.ToString()

    try {
        $response = $client.GetAsync($builder.Uri).GetAwaiter().GetResult()
    } catch {
        [Console]::Error.WriteLine("Network/transport error while $Description")
        exit 1
    }

    $body = $response.Content.ReadAsStringAsync().GetAwaiter().GetResult()
    if (-not $response.IsSuccessStatusCode) {
        [Console]::Error.WriteLine("HTTP error $($response.StatusCode.value__) while ${Description}:")
        if ($body) {
            try {
                $parsed = $body | ConvertFrom-Json
                $parsed | ConvertTo-Json -Depth 10 | Write-Error
            } catch {
                [Console]::Error.WriteLine($body)
            }
        }
        exit 1
    }

    Write-Output $body
}

function List-Logs {
    param(
        [Parameter(Mandatory = $true)] [string] $After
    )

    Invoke-ComplianceRequest -Description "listing logs (after=$After, event_type=$EventType, limit=$Limit)" -Path 'logs' -Query @{
        limit      = $Limit
        event_type = $EventType
        after      = $After
    }
}

function Download-Log {
    param(
        [Parameter(Mandatory = $true)] [string] $Id
    )

    [Console]::Error.WriteLine("Fetching logs for ID: $Id")
    Invoke-ComplianceRequest -Description "downloading log id=$Id" -Path "logs/$Id"
}

function ConvertTo-LocalHuman {
    param(
        [string] $Iso
    )

    if (-not $Iso -or $Iso -eq 'null') {
        return ''
    }

    try {
        $dt = [datetimeoffset]::Parse($Iso)
        return $dt.ToLocalTime().ToString('yyyy-MM-dd HH:mm:ss zzz')
    } catch {
        return $Iso
    }
}

$currentAfter = $InitialAfter
$page = 1
$totalDownloaded = 0
while ($true) {
    [Console]::Error.WriteLine("Fetching page $page with after='$currentAfter' (local: $(ConvertTo-LocalHuman -Iso $currentAfter))")
    $responseJson = List-Logs -After $currentAfter
    $responseObj = $responseJson | ConvertFrom-Json

    $pageCount = $responseObj.data.Count
    if ($pageCount -gt 0) {
        foreach ($entry in $responseObj.data) {
            Download-Log -Id $entry.id
        }
        $totalDownloaded += $pageCount
    }

    $hasMore = $false
    if ($null -ne $responseObj.has_more) {
        $hasMore = [System.Convert]::ToBoolean($responseObj.has_more)
    }

    $currentAfter = $responseObj.last_end_time
    if ($hasMore) {
        $page += 1
    } else {
        break
    }
}

if ($totalDownloaded -eq 0 -and ([string]::IsNullOrEmpty($currentAfter) -or $currentAfter -eq 'null')) {
    [Console]::Error.WriteLine("No results found for event_type $EventType after $InitialAfter")
} else {
    [Console]::Error.WriteLine("Completed downloading $totalDownloaded log files up to $currentAfter (local: $(ConvertTo-LocalHuman -Iso $currentAfter))")
}

$client.Dispose()
$handler.Dispose()

Step 1: Set up prerequisites

Ensure you have an Enterprise Compliance API key exported as COMPLIANCE_API_KEY, the ChatGPT account ID or API Platform Org ID for the principal in question, and specific requirements for your environment.

Step 2: Choose environment and save script

For Unix-based systems: Save the script locally as download_compliance_files.sh and mark it executable. Ensure you have up-to-date bash, curl, sed, and date installed. For Windows-based systems: Save the script locally as download_compliance_files.ps1 and open PowerShell (Version 5.1+).

Step 3: Format date parameter

Format the date you want to get every log after as an ISO 8601 string including timezone (e.g., using date -u -v-1d +%Y-%m-%dT%H:%M:%SZ for Unix or Get-Date -AsUTC for Windows).

Step 4: Execute script with parameters

Run the script with four parameters: workspace_or_org_id, event_type, limit, and after date. Example: ./download_compliance_files.sh <workspace_or_org_id> <event_type> <limit> <after> or .\download_compliance_files.ps1 <workspace_or_org_id> <event_type> <limit> <after>

Step 5: List and paginate log files

The script handles listing and paging through available log files by making API requests to the Compliance API endpoint with the specified event type, limit, and after parameters. It checks for has_more flag to determine if additional pages exist.

Step 6: Download log files

For each log file ID returned from the listing, download the log by making a request to the logs/{id} endpoint. The script writes the output to stdout for ingestion into your SIEM or data lake.

Discussion

Questions & comments · 0

Sign In Sign in to leave a comment.