How to populate EU projects (begin/end dates, acronyms, awarded PMs/EUR)

This guide describes how to load EU project data (from the EU Funding & Tenders org-details page) into the app database: project long name, acronym, begin/end dates, and awarded person-months. Awarded EUR is fetched as well but stays in the JSON (see Limitations).

Data flow

Fetch – eu_playwright org-details (with --fetch-details) writes org_details_<slug>_results.json, which contains per-project fields: title, acronym, url, project_id, start_date, end_date, awarded_eur, person_months, eu_contribution_eur, funding_programme, participants, etc.
Enrich dates (optional) – If some projects lack start_date/end_date in the JSON, you can save project HTML pages and run scripts/eu_portal/update_org_details_dates_from_saved_pages.py to parse dates from HTML and update the JSON.
Import into DB – Run scripts/eu_portal/seed_eu_projects_from_org_details.py to create/update Project rows from the JSON (name, planned begin/end, optional WP+Task with awarded PMs).

1. Fetch EU project data (eu_playwright)

From the repo root:

# Install Playwright and Chromium if not already installed
uv sync --group dev
uv run playwright install chromium

# Fetch org-details (e.g. by PIC) and open each project page for details
PYTHONPATH=. uv run python -m eu_playwright.cli org-details --fetch-details

Output: tmp/downloaded_data/eu_playwright/org_details_<slug>_results.json.

Options:

--url "https://ec.europa.eu/.../org-details/<PIC>" – fetch a different organisation.
--out-dir tmp/downloaded_data/eu_playwright – output directory.
--no-headless – show browser.
--connect http://localhost:9222 – attach to existing Chrome (see run-eura-hanketietopalvelu).

With --fetch-details, each project’s detail page is parsed for acronym, start_date, end_date, status, participants, eu_contribution_eur, funding_programme, etc. Dates in the JSON are in EU portal format (e.g. "01 November 2019", "31 October 2022").
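The portal date format can be turned into a Python date with a few lines. The following is a minimal sketch, not the parser used by the scripts; the function name parse_eu_date is hypothetical. It maps month names by hand so the result does not depend on the process locale.

```python
from datetime import date

# English month names as printed by the portal. Mapping them by hand avoids
# relying on the process locale (strptime's %B is locale-sensitive).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}


def parse_eu_date(text):
    """Return a date for 'DD Month YYYY', or None if empty or malformed."""
    parts = (text or "").split()
    if len(parts) != 3:
        return None
    day, month_name, year = parts
    month = MONTHS.get(month_name.capitalize())
    if month is None:
        return None
    try:
        # int() and date() both raise ValueError on bad input (e.g. 31 February).
        return date(int(year), month, int(day))
    except ValueError:
        return None


print(parse_eu_date("01 November 2019"))
print(parse_eu_date("31 October 2022"))
```

Returning None for empty or malformed values keeps missing dates distinguishable from parsed ones, which matters for the optional enrichment step.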

2. Enrich dates from saved HTML (optional)

If some projects in the JSON have empty start_date/end_date, you can save the project HTML pages (eu_playwright with --save-pages) and then run:

uv run python scripts/eu_portal/update_org_details_dates_from_saved_pages.py
uv run python scripts/eu_portal/update_org_details_dates_from_saved_pages.py --json tmp/downloaded_data/eu_playwright/org_details_<slug>_results.json --saved-pages-dir ./saved-pages-dir

This parses “Start date” / “End date” from the HTML and updates the JSON in place.
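The idea behind this step can be sketched as follows. This is an illustration only: the real portal markup and the script's actual parsing logic are not shown in this document, so the HTML snippet, the function names extract_dates and fill_missing_dates, and the choice to fill only empty fields are all assumptions.

```python
import html
import re


def extract_dates(page_html):
    """Find 'Start date' / 'End date' values in a saved page (assumed layout)."""
    # Drop tags and collapse whitespace so each label sits next to its value.
    text = re.sub(r"<[^>]+>", " ", page_html)
    text = re.sub(r"\s+", " ", html.unescape(text))
    found = {}
    for key, label in (("start_date", "Start date"), ("end_date", "End date")):
        match = re.search(label + r"\s*:?\s*(\d{1,2} [A-Za-z]+ \d{4})", text)
        if match:
            found[key] = match.group(1)
    return found


def fill_missing_dates(project, page_html):
    """Copy parsed dates into the project dict, but only where it is empty."""
    for key, value in extract_dates(page_html).items():
        if not project.get(key):
            project[key] = value
    return project


# Hypothetical saved page; the real markup may differ.
sample = (
    "<div><span>Start date</span><span>01 November 2019</span></div>"
    "<div>End date: <b>31 October 2022</b></div>"
)
print(fill_missing_dates({"title": "Example", "start_date": ""}, sample))
```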

3. Import EU projects into the database

Prerequisites:

Database running (PostgreSQL 18+ under Podman) and DATABASE_URL set (e.g. from .env.dev).
Top org present (e.g. from scripts/seed_example_projects.py or scripts/seed_from_ods.py).
Migrations applied.

Run:

uv run python scripts/eu_portal/seed_eu_projects_from_org_details.py

Default JSON path: tmp/downloaded_data/eu_playwright/org_details_<slug>_results.json.

Options:

--json / -j – path to org_details_<slug>_results.json.
--top-org – short name of the project owner org (default: your top org short name).
--name-style – how to set project name: title (long name only), acronym (acronym only), or acronym_title (e.g. "TEAMS – Teaching Entrepreneurship…") (default: title).
--dry-run – only print what would be created/updated; do not write to DB.
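The three --name-style values can be illustrated with a small function. This is a hedged sketch rather than the script's code: build_project_name is a hypothetical name, and falling back to the other field when one is empty is an assumption.

```python
def build_project_name(title, acronym, name_style="title"):
    """Build a project name for one of the documented --name-style values."""
    title = (title or "").strip()
    acronym = (acronym or "").strip()
    if name_style == "title":
        # Assumption: fall back to the acronym when the title is empty.
        return title or acronym
    if name_style == "acronym":
        # Assumption: fall back to the title when the acronym is empty.
        return acronym or title
    if name_style == "acronym_title":
        if acronym and title:
            return f"{acronym} – {title}"
        return acronym or title
    raise ValueError(f"unknown name style: {name_style!r}")


for style in ("title", "acronym", "acronym_title"):
    print(style, "->", build_project_name("Teaching Entrepreneurship", "TEAMS", style))
```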

What the script does:

Reads result.projects from the JSON.
For each project:
- Name: Set from title and/or acronym according to --name-style (must be unique; duplicates get a suffix).
- Planned begin/end: Parsed from start_date and end_date (EU format "DD Month YYYY").
- Duration: Set from begin/end if both present.
- Project owner org: Set to the given top org (e.g. EXAMPLE).
- Funding programme: If funding_programme is present in the JSON, the script tries to match an existing FundingProgramme (by label/code under an "EU" group) or skips linking if none found.
- Awarded PMs: If person_months is present and parseable, the script creates one Work Package (WP 1) and one Task (task_id 1) with awarded_pms set; otherwise it creates only the Project (no WPs/tasks).
Idempotent: projects are matched by name; existing projects are updated (dates, duration). Re-running with the same JSON updates existing rows.
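Two of the rules above, unique names and "parseable" person-months, can be sketched as plain functions. This is a rough illustration under stated assumptions: the suffix format " (2)", the accepted number formats, and the names unique_name, parse_person_months and plan_rows are hypothetical, and the tuples stand in for the real ORM rows.

```python
def unique_name(name, used):
    """Return name, or name with ' (2)', ' (3)', ... if already in `used`."""
    candidate, n = name, 1
    while candidate in used:
        n += 1
        candidate = f"{name} ({n})"
    used.add(candidate)
    return candidate


def parse_person_months(raw):
    """Return a float for values like 12, '12.5' or '12,5'; None otherwise."""
    if raw is None or isinstance(raw, bool):
        return None
    if isinstance(raw, (int, float)):
        return float(raw)
    try:
        return float(str(raw).strip().replace(",", "."))
    except ValueError:
        return None


def plan_rows(project, used_names):
    """List the rows that would be created for one entry of result.projects."""
    base = project.get("title") or project.get("acronym") or ""
    rows = [("Project", unique_name(base, used_names))]
    pms = parse_person_months(project.get("person_months"))
    if pms is not None:
        # One work package and one task carry the awarded person-months.
        rows += [("WorkPackage", "WP 1"), ("Task", 1, pms)]
    return rows


used = set()
print(plan_rows({"title": "Example project", "person_months": "12,5"}, used))
print(plan_rows({"title": "Example project"}, used))
```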

Limitations:

Awarded EUR from the JSON (awarded_eur, eu_contribution_eur) is not stored, because the app schema has no project-level EUR field. The values remain in the JSON and could be shown in a future “EU metadata” view or stored in a custom field if one is added later.
Acronym is only used in the project name (by --name-style); there is no separate acronym column on Project.
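EU project_id is not stored; matching on re-import is by project name only, so a re-import with a different --name-style (or a changed title) would not match the existing rows.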
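Until the schema has a place for them, the EUR values can be read straight from the JSON. The sketch below assumes the file has a top-level result object with a projects list (as the import script reads result.projects); the function name awarded_eur_by_project and the sample values are made up for illustration.

```python
import json


def awarded_eur_by_project(json_text):
    """Map project_id -> EUR fields, skipping entries without a project_id."""
    data = json.loads(json_text)
    projects = data.get("result", {}).get("projects", [])
    return {
        p["project_id"]: {
            "awarded_eur": p.get("awarded_eur"),
            "eu_contribution_eur": p.get("eu_contribution_eur"),
        }
        for p in projects
        if p.get("project_id")
    }


# Made-up sample data in the assumed shape; real files come from eu_playwright.
sample_json = json.dumps({
    "result": {
        "projects": [
            {"project_id": "000001", "title": "Example", "awarded_eur": "100000"},
            {"title": "No id, skipped"},
        ]
    }
})
print(awarded_eur_by_project(sample_json))
```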

Summary

| Step | Command / file |
| --- | --- |
| Fetch EU data | `PYTHONPATH=. uv run python -m eu_playwright.cli org-details --fetch-details` |
| JSON output | `tmp/downloaded_data/eu_playwright/org_details_<slug>_results.json` |
| Enrich dates (optional) | `uv run python scripts/eu_portal/update_org_details_dates_from_saved_pages.py` |
| Import into DB | `uv run python scripts/eu_portal/seed_eu_projects_from_org_details.py` |

This gives you EU projects in the app with long name and/or acronym, begin/end dates, and (when available) awarded person-months on a single task.