cloudflare
references/r2-sql/api.md
.md 159 lines
Content
# R2 SQL API Reference
SQL syntax, functions, operators, and data types for R2 SQL queries.
## SQL Syntax
```sql
SELECT column_list | aggregation_function
FROM [namespace.]table_name
WHERE conditions
[GROUP BY column_list]
[HAVING conditions]
[ORDER BY column | aggregation_function [DESC | ASC]]
[LIMIT number]
```
## Schema Discovery
```sql
SHOW DATABASES; -- List namespaces
SHOW NAMESPACES; -- Alias for SHOW DATABASES
SHOW SCHEMAS; -- Alias for SHOW DATABASES
SHOW TABLES IN namespace; -- List tables in namespace
DESCRIBE namespace.table; -- Show table schema, partition keys
```
## SELECT Clause
```sql
-- All columns
SELECT * FROM logs.http_requests;
-- Specific columns
SELECT user_id, timestamp, status FROM logs.http_requests;
```
**Limitations:** No column aliases, expressions, or nested column access
## WHERE Clause
### Operators
| Operator | Example |
|----------|---------|
| `=`, `!=`, `<`, `<=`, `>`, `>=` | `status = 200` |
| `LIKE` | `user_agent LIKE '%Chrome%'` |
| `BETWEEN` | `timestamp BETWEEN '2025-01-01T00:00:00Z' AND '2025-01-31T23:59:59Z'` |
| `IS NULL`, `IS NOT NULL` | `email IS NOT NULL` |
| `AND`, `OR` | `status = 200 AND method = 'GET'` |
Use parentheses for precedence: `(status = 404 OR status = 500) AND method = 'POST'`
## Aggregation Functions
| Function | Description |
|----------|-------------|
| `COUNT(*)` | Count all rows |
| `COUNT(column)` | Count non-null values |
| `COUNT(DISTINCT column)` | Count unique values |
| `SUM(column)`, `AVG(column)` | Numeric aggregations |
| `MIN(column)`, `MAX(column)` | Min/max values |
```sql
-- Multiple aggregations with GROUP BY
SELECT region, COUNT(*), SUM(amount), AVG(amount)
FROM sales.transactions
WHERE sale_date >= '2024-01-01'
GROUP BY region;
```
## HAVING Clause
Filter aggregated results (after GROUP BY):
```sql
SELECT category, SUM(amount)
FROM sales.transactions
GROUP BY category
HAVING SUM(amount) > 10000;
```
## ORDER BY Clause
Sort results by:
- **Partition key columns** - Always supported
- **Aggregation functions** - Supported via shuffle strategy
```sql
-- Order by partition key
SELECT * FROM logs.requests ORDER BY timestamp DESC LIMIT 100;
-- Order by aggregation (repeat function, aliases not supported)
SELECT region, SUM(amount)
FROM sales.transactions
GROUP BY region
ORDER BY SUM(amount) DESC;
```
**Limitations:** Cannot order by non-partition columns. See [gotchas.md](gotchas.md#order-by-limitations)
## LIMIT Clause
```sql
SELECT * FROM logs.requests LIMIT 100;
```
| Setting | Value |
|---------|-------|
| Min | 1 |
| Max | 10,000 |
| Default | 500 |
**Always use LIMIT** to enable early termination optimization.
## Data Types
| Type | SQL Literal | Example |
|------|-------------|---------|
| `integer` | Unquoted number | `42`, `-10` |
| `float` | Decimal number | `3.14`, `-0.5` |
| `string` | Single quotes | `'hello'`, `'GET'` |
| `boolean` | Keyword | `true`, `false` |
| `timestamp` | RFC3339 string | `'2025-01-01T00:00:00Z'` |
| `date` | ISO 8601 date | `'2025-01-01'` |
### Type Safety
- Quote strings with single quotes: `'value'`
- Timestamps must be RFC3339: `'2025-01-01T00:00:00Z'` (include timezone)
- Dates must be ISO 8601: `'2025-01-01'` (YYYY-MM-DD)
- No implicit conversions
```sql
-- ✅ Correct
WHERE status = 200 AND method = 'GET' AND timestamp > '2025-01-01T00:00:00Z'
-- ❌ Wrong
WHERE status = '200' -- string instead of integer
WHERE timestamp > '2025-01-01' -- missing time/timezone
WHERE method = GET -- unquoted string
```
## Query Result Format
JSON array of objects:
```json
[
{"user_id": "user_123", "timestamp": "2025-01-15T10:30:00Z", "status": 200},
{"user_id": "user_456", "timestamp": "2025-01-15T10:31:00Z", "status": 404}
]
```
## See Also
- [patterns.md](patterns.md) - Query examples and use cases
- [gotchas.md](gotchas.md) - SQL limitations and error handling
- [configuration.md](configuration.md) - Setup and authentication