The shape of the job
You have a bucket or a directory with thousands of .fdx, .fadein,
.fountain, and legacy PDF files. You want, at the end, a parallel
directory of validated .json ScreenJSON documents with the originals
preserved.
Step 1 — inventory
find archive/ -type f \( -iname "*.fdx" -o -iname "*.fadein" \
-o -iname "*.fountain" -o -iname "*.spmd" -o -iname "*.pdf" \) \
> inventory.txt
wc -l inventory.txt
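# Count files per extension: take the text after the last dot, lower-cased,
# so dots in directory names and mixed-case extensions do not skew the tally
sed 's/.*\.//' inventory.txt | tr '[:upper:]' '[:lower:]' | sort | uniq -c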
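Step 2 writes every output into a flat out/ directory named after the input's basename, so two sources such as a/draft.fdx and b/draft.pdf would both map to out/draft.json. The following is a minimal sketch for spotting such clashes before converting. It assumes the newline-delimited inventory.txt produced above; the function name find_collisions is illustrative and not part of screenjson.

```shell
# find_collisions INVENTORY
# Prints each output name that more than one inventory path would map to,
# followed by the paths that share it. Hypothetical helper for illustration.
find_collisions() {
  awk '
    {
      stem = $0
      sub(/.*\//, "", stem)      # strip directories (basename)
      sub(/\.[^.]*$/, "", stem)  # strip the final extension
      count[stem]++
      paths[stem] = paths[stem] "  " $0 "\n"
    }
    END {
      for (s in count)
        if (count[s] > 1)
          printf "%s.json\n%s", s, paths[s]
    }
  ' "$1"
}
```

Run it as find_collisions inventory.txt. Empty output means no two sources share an output name; otherwise rename or convert the listed files separately.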
Step 2 — convert + validate
For up to a few thousand files, a parallel shell loop is enough:
mkdir -p out
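# NUL-delimit the inventory so paths with spaces or quotes survive xargs
tr '\n' '\0' < inventory.txt | xargs -0 -n 1 -P 16 sh -c '
in="$1"
name="$(basename "$in")"
out="out/${name%.*}.json"
# Conversion failed: details go to errors.log; skip validation for this file
screenjson convert -i "$in" -o "$out" 2>>errors.log || exit 0
# Validation failed: keep the output, renamed, for manual repair
screenjson validate -i "$out" --strict 2>>invalid.log || mv "$out" "${out}.invalid"
' _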
Beyond that, use Greenlight — the same work, but with retries, progress tracking, and an admin UI.
Step 3 — triage
errors.log: conversion failures. Usually corrupt originals or legacy format variants. Inspect a sample; the majority resolve with --format hints or a clean re-save in Final Draft.
invalid.log: validator rejections. Usually missing authors or malformed character cues. Fixable; keep the .json.invalid file for later manual repair.
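The logs say why a file failed, but not always which inventory entries ended up in which state. As a rough sketch, the outcome of each source can be read back from what Step 2 left in the output directory. This assumes a failed conversion leaves no output file, which is worth confirming on a sample; triage_report is a hypothetical helper name.

```shell
# triage_report INVENTORY OUT_DIR
# Classifies every inventory path by what Step 2 left behind in OUT_DIR:
#   ok       OUT_DIR/<stem>.json exists
#   invalid  OUT_DIR/<stem>.json.invalid exists
#   failed   neither exists (assumed: conversion produced no output)
triage_report() {
  inventory="$1"
  out_dir="$2"
  while IFS= read -r src; do
    name=$(basename "$src")
    stem="${name%.*}"
    if [ -f "$out_dir/$stem.json" ]; then
      status=ok
    elif [ -f "$out_dir/$stem.json.invalid" ]; then
      status=invalid
    else
      status=failed
    fi
    printf '%s\t%s\n' "$status" "$src"
  done < "$inventory"
}
```

For a quick tally, pipe the result through cut -f1 | sort | uniq -c. For a worklist of sources to re-save, filter the rows beginning with failed.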
Step 4 — preserve the originals
Archive the sources as-is. Do not discard them. Do not overwrite them.
tar -czf archive-sources-$(date +%F).tar.gz archive/
An archive keeps the evidence. The ScreenJSON is the accessible rendition, not a replacement for the source.
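To make "as-is" checkable later, it can help to record a checksum per source file alongside the tarball. A minimal sketch, assuming GNU coreutils sha256sum is available (on macOS, shasum -a 256 is the usual substitute); write_manifest is an illustrative name.

```shell
# write_manifest SRC_DIR MANIFEST
# Records a SHA-256 checksum for every file under SRC_DIR, sorted by path,
# so the sources can be verified later with: sha256sum -c MANIFEST
# Keep MANIFEST outside SRC_DIR so it does not checksum itself.
write_manifest() {
  src_dir="$1"
  manifest="$2"
  find "$src_dir" -type f -exec sha256sum {} + | sort -k 2 > "$manifest"
}
```

For example, write_manifest archive "archive-sources-$(date +%F).sha256" writes the manifest next to the tarball, and sha256sum -c on that file reports any source that has changed since.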
Step 5 — catalogue
Derive a simple catalogue row per successful conversion:
jq -r '[.id, .title.en, (.authors[0].family + ", " + .authors[0].given)] | @tsv' \
out/*.json > catalogue.tsv
From here, load into your cataloguing system, your search index, or your content management platform.
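If the target system expects a header row, or if some documents lack an author entry, a slightly more defensive variant may be useful. The sketch below assumes jq is installed and reuses the field paths from the command above (.id, .title.en, .authors[0].family, .authors[0].given); build_catalogue is a hypothetical helper name. It runs jq once per file, which is slower than a single call but avoids very long argument lists.

```shell
# build_catalogue OUT_DIR
# Prints a header row, then one TSV row per OUT_DIR/*.json file.
# Missing fields become empty cells instead of aborting the run.
build_catalogue() {
  out_dir="$1"
  printf 'id\ttitle\tauthor\n'
  for f in "$out_dir"/*.json; do
    [ -f "$f" ] || continue   # the glob matched nothing
    jq -r '[
      (.id // ""),
      (.title.en // ""),
      ((.authors[0] // {}) | [.family, .given] | map(select(. != null)) | join(", "))
    ] | @tsv' "$f"
  done
}
```

Usage: build_catalogue out > catalogue.tsv.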
Step 6 — schedule a re-validate
Pin a schema version for the project. Re-run screenjson validate
periodically to catch drift when the schema evolves.
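A periodic re-validation can be as small as a loop over the output directory. The sketch below uses only the screenjson validate -i ... --strict invocation shown in Step 2; how the schema version is pinned is left to your setup, and revalidate is an illustrative name.

```shell
# revalidate OUT_DIR
# Re-runs the strict validator over every OUT_DIR/*.json and prints the paths
# that no longer pass. Returns non-zero if any file fails, so it can gate a
# cron job or CI step.
revalidate() {
  out_dir="$1"
  failures=0
  for f in "$out_dir"/*.json; do
    [ -f "$f" ] || continue   # the glob matched nothing
    if ! screenjson validate -i "$f" --strict >/dev/null 2>&1; then
      printf '%s\n' "$f"
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

Calling revalidate out > drifted.txt from a scheduled job leaves a list of documents to revisit whenever the schema moves.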