- Documents the critical 'jobs waiting forever' issue and solution - Root cause: Docker syntax in runs-on labels causes immediate job cancellation - Includes diagnosis steps, SQL queries, and test procedures - References multi-Pi runner infrastructure and lessons learned Signed-off-by: Cliff Hill <xlorep@darkhelm.org>
# Gitea Actions Troubleshooting Guide

This document contains solutions to common issues with the Gitea Actions CI/CD pipeline.

## Critical Issue: Jobs Stuck in "Waiting" State Forever

### Symptoms
- Workflows are created, but jobs show "Waiting" indefinitely
- Runners are online and healthy
- No tasks appear in the `action_task` database table
- Jobs show a 0-second duration (`updated` equals `created`)
- The database shows status 5 for the stuck jobs. In upstream Gitea's status enum (`models/actions/status.go`), 5 is `StatusWaiting` and 3 is `StatusCancelled`, so this agrees with the "Waiting" shown in the UI

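### Root Cause
**Docker syntax in `runs-on` labels** prevents Gitea Actions from dispatching jobs. A value such as `ubuntu-latest:docker://ubuntu:22.04` does not match the name of any runner label, so the job is never handed to a runner.
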
### Problem Syntax (BROKEN)
```yaml
jobs:
  setup:
    runs-on: ubuntu-latest:docker://ubuntu:22.04
  backend:
    runs-on: python-latest:docker://python:3.13-slim
  frontend:
    runs-on: node-latest:docker://node:20-bookworm-slim
```

### Solution Syntax (WORKING)
```yaml
jobs:
  setup:
    runs-on: ubuntu-latest
  backend:
    runs-on: python-latest
  frontend:
    runs-on: node-latest
```

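A quick way to catch the broken form before pushing is to scan workflow files for `runs-on` values that embed an image reference. The following is a minimal sketch, not part of Gitea or act_runner: the function name `find_docker_runs_on` is hypothetical, and it only handles the single-line `runs-on: value` form used in this guide (not YAML lists or expressions).

```python
import re

# Matches a single-line `runs-on:` entry and captures its value.
RUNS_ON = re.compile(r"^\s*runs-on:\s*(?P<value>\S.*?)\s*$")


def find_docker_runs_on(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line_number, value) for each runs-on value containing '://'."""
    hits = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = RUNS_ON.match(line)
        if match and "://" in match.group("value"):
            hits.append((number, match.group("value")))
    return hits


broken = """\
jobs:
  setup:
    runs-on: ubuntu-latest:docker://ubuntu:22.04
  backend:
    runs-on: python-latest
"""

# Only the setup job is reported; the plain label passes.
print(find_docker_runs_on(broken))
```

An empty result means no `runs-on` line in the file carries Docker syntax.
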
### Why This Works
The runners are configured with Docker images in their labels:
```bash
GITEA_RUNNER_LABELS=ubuntu-latest:docker://ubuntu:22.04,node-latest:docker://node:20-bookworm-slim,python-latest:docker://python:3.13-slim
```

The image is chosen by the runner, not by the workflow. Jobs therefore still run in the correct Docker containers, and Gitea can match the plain label name and dispatch them.

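To make the name-versus-image split concrete, the sketch below breaks the label string above into a mapping from label name to image. It is an illustration only and not act_runner's actual parser; `parse_runner_labels` is a hypothetical helper, and it assumes every entry has the form `name:schema://image`.

```python
def parse_runner_labels(raw: str) -> dict[str, str]:
    """Split 'name:docker://image,...' into {name: image}."""
    mapping = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # The label name is everything before the first colon.
        name, _, rest = entry.partition(":")
        # Drop the docker schema; any other schema is kept as-is.
        mapping[name] = rest.removeprefix("docker://")
    return mapping


LABELS = (
    "ubuntu-latest:docker://ubuntu:22.04,"
    "node-latest:docker://node:20-bookworm-slim,"
    "python-latest:docker://python:3.13-slim"
)

images = parse_runner_labels(LABELS)
print(images["python-latest"])
# The full label string is not a label name, so a lookup by it fails.
print("ubuntu-latest:docker://ubuntu:22.04" in images)
```

The workflow only needs the keys of this mapping; the values stay on the runner side.
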
### Diagnosis Steps

1. **Check if new runs are created:**
   ```sql
   SELECT id, status, title FROM action_run ORDER BY id DESC LIMIT 3;
   ```

2. **Check job status and duration:**
   ```sql
   SELECT arj.id, arj.job_id, arj.status, ar.created, ar.updated, (ar.updated - ar.created) as duration_seconds
   FROM action_run_job arj
   JOIN action_run ar ON arj.run_id = ar.id
   WHERE ar.id = (SELECT MAX(id) FROM action_run);
   ```

3. **Check if tasks are created:**
   ```sql
   SELECT * FROM action_task ORDER BY id DESC LIMIT 5;
   ```

4. **Verify runners are online:**
   ```sql
   SELECT id, name, last_online, agent_labels
   FROM action_runner
   WHERE last_online > (EXTRACT(epoch FROM NOW()) - 300)::bigint;
   ```

### Key Indicators
- **Duration = 0 seconds** → The job was never picked up (`updated` equals `created`)
- **Empty `action_task` table** → Jobs never converted to executable tasks
- **Status 5 jobs with status 7 dependents** → In upstream Gitea's status enum, the setup job is still waiting (5) and its dependents are blocked (7)

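The indicators above can be turned into a small check over the rows returned by the diagnosis queries. This is a sketch under stated assumptions: the status values are taken from upstream Gitea's `models/actions/status.go` and should be verified against the version you run, and `JobRow` and `diagnose` are hypothetical names, not part of any Gitea API.

```python
from dataclasses import dataclass

# Assumed status values (upstream Gitea, models/actions/status.go).
STATUS_NAMES = {
    0: "unknown", 1: "success", 2: "failure", 3: "cancelled",
    4: "skipped", 5: "waiting", 6: "running", 7: "blocked",
}


@dataclass
class JobRow:
    job_id: str
    status: int
    created: int  # Unix timestamp
    updated: int  # Unix timestamp


def diagnose(jobs: list[JobRow], task_count: int) -> list[str]:
    """Translate raw rows into the indicators described in this guide."""
    findings = []
    if task_count == 0:
        findings.append("no tasks created: jobs were never dispatched")
    for job in jobs:
        name = STATUS_NAMES.get(job.status, "unknown")
        if job.status == 5 and job.updated == job.created:
            findings.append(f"{job.job_id}: {name}, never picked up by a runner")
        elif job.status == 7:
            findings.append(f"{job.job_id}: {name}, held back by a dependency")
    return findings


rows = [JobRow("setup", 5, 100, 100), JobRow("backend", 7, 100, 100)]
for finding in diagnose(rows, task_count=0):
    print(finding)
```

Feeding it the output of steps 2 and 3 gives a one-line summary per stuck job.
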
### Test Procedure
Create a minimal test workflow to isolate issues:

```yaml
# .gitea/workflows/test-simple.yml
name: Simple Test
on: push
jobs:
  test:
    name: Simple Test
    runs-on: ubuntu-latest
    steps:
      - name: Echo
        run: echo "Hello World"
```

If this works but your main workflow doesn't, the issue is likely syntax-related.

## Other Common Issues

### Cache/UI Synchronization Problems
If the UI shows a different status than the database:
1. Restart Gitea: `docker compose restart server`
2. Clear the browser cache
3. Compare the database status with the UI again

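### Stuck Runs from Previous Sessions
Clean up stuck runs:
```sql
-- Mark stuck jobs and runs as cancelled.
-- Status values follow upstream Gitea's models/actions/status.go
-- (3 = cancelled, 5 = waiting, 6 = running, 7 = blocked); verify against your version first.
UPDATE action_run_job SET status = 3 WHERE status IN (5, 6, 7);
UPDATE action_run SET status = 3 WHERE status IN (5, 6, 7);
```
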
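Because a bulk status update touches every run, it can help to see how many rows it would change before committing. The sketch below demonstrates that pattern against an in-memory SQLite stand-in with a cut-down `action_run_job` table; the real database here is PostgreSQL, the status values are assumed from upstream Gitea's `models/actions/status.go`, and `cancel_stuck_jobs` is a hypothetical helper.

```python
import sqlite3

# Assumed status values from upstream Gitea's models/actions/status.go.
CANCELLED, WAITING, RUNNING, BLOCKED = 3, 5, 6, 7


def cancel_stuck_jobs(conn: sqlite3.Connection, commit: bool = False) -> int:
    """Mark unfinished jobs as cancelled; roll back unless commit=True."""
    cursor = conn.execute(
        "UPDATE action_run_job SET status = ? WHERE status IN (?, ?, ?)",
        (CANCELLED, WAITING, RUNNING, BLOCKED),
    )
    affected = cursor.rowcount
    if commit:
        conn.commit()
    else:
        conn.rollback()  # dry run: report the count, change nothing
    return affected


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE action_run_job (id INTEGER PRIMARY KEY, status INTEGER)")
conn.executemany(
    "INSERT INTO action_run_job (status) VALUES (?)",
    [(1,), (5,), (6,), (7,), (2,)],
)
conn.commit()

print(cancel_stuck_jobs(conn))               # dry run, rows left untouched
print(cancel_stuck_jobs(conn, commit=True))  # applies the change
```

Finished jobs (success, failure) are left alone in both calls; only waiting, running, and blocked rows are counted.
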
### Runner Registration Issues
If runners show "unregistered runner" errors:
1. Delete runner registrations: `DELETE FROM action_runner;`
2. Restart all runner containers
3. Let them auto-register with fresh state

## Infrastructure Overview

### Current Setup
- **Gitea Server**: Docker container with PostgreSQL backend
- **Runners**: 8 Raspberry Pi runners across 4 servers
  - pi-desktop: Pi 400 4GB (2 runners)
  - kankali: Pi with local Gitea (2 runners)
  - urtzul: Pi 4B 8GB (2 runners)
  - zhokq: Pi 4B 8GB (2 runners)

### Runner Configuration
Each runner supports multiple Docker environments:
- `ubuntu-latest` → `ubuntu:22.04`
- `python-latest` → `python:3.13-slim`
- `node-latest` → `node:20-bookworm-slim`
- `ubuntu-act` → `catthehacker/ubuntu:act-latest`

### Workflow Design
Multi-stage pipeline with artifact passing:
1. **Setup**: Checkout code, create artifacts
2. **Parallel Setup**: Backend (Python/uv) + Frontend (Node.js/Yarn)
3. **Parallel Tests**: Backend tests + Frontend tests

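The three stages follow from the dependency graph between jobs. As a sketch, the standard-library `graphlib` module can group jobs into waves that may run in parallel; the job ids and the `NEEDS` mapping below are hypothetical stand-ins for the `needs:` entries of the real workflow.

```python
from graphlib import TopologicalSorter

# Hypothetical job ids; each job maps to the set of jobs it needs.
NEEDS = {
    "setup": set(),
    "backend-setup": {"setup"},
    "frontend-setup": {"setup"},
    "backend-tests": {"backend-setup"},
    "frontend-tests": {"frontend-setup"},
}


def stages(needs: dict[str, set[str]]) -> list[list[str]]:
    """Group jobs into waves; every job in a wave has all its needs met."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(stages(NEEDS), start=1):
    print(number, wave)
```

It also shows why one stuck setup job holds up the whole pipeline: nothing in the later waves becomes ready until the first wave is done.
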
## Lessons Learned

1. **Gitea Actions syntax is stricter than GitHub Actions**
2. **Runner labels must match exactly** - no Docker syntax in workflow files
3. **Database debugging is essential** - UI can show cached/incorrect status
4. **Jobs with an unmatched `runs-on` label are never dispatched** - they stay at status 5 with a 0-second duration
5. **Empty `action_task` table** is the key indicator of dispatch failure

---

*Last updated: October 25, 2025*
*Issue resolved after extensive database-level debugging and syntax isolation*