The preservation problem, briefly
Any moving-image archive today has, somewhere in its holdings,
screenplays. Feature film scripts, TV bibles, episode outlines, shooting
drafts, continuity scripts. Some are original typescripts. Some are faxed
rewrites. Most of the ones produced after 1995 live in Final Draft
files: .fdx, or the older binary .fdr for drafts saved before Final
Draft 8.
Final Draft is a fine tool. It is also a proprietary format, maintained by a single vendor, whose long-term availability is not under the archive’s control. The same is true, to varying degrees, for FadeIn, Movie Magic Screenwriter, Celtx, and every web-based writing tool that has come and gone in the past decade.
An archival profession that spent thirty years migrating audio and video from proprietary containers into open, documented formats (WAV, FLAC, Matroska, FFV1) should be able to recognise the situation for what it is. The screenplay file is the next migration.
Why ScreenJSON, for an archive
Three properties matter more for preservation than they do for any other use case:
Openness. The schema is published, versioned, and available as a JSON Schema document at a stable URL. A conforming validator can be written from the specification alone. No vendor has unique insight into what a ScreenJSON file means; if everyone who works on the format disappeared tomorrow, the specification would still describe the files.
Text, not binary. A ScreenJSON file is plain UTF-8 JSON. It opens in any text editor. It survives every kind of archival operation — copy, checksum, diff, compression, bit-rot detection, format migration — that archives have spent decades getting right.
Structural, not presentational. The preserved artifact is “the screenplay”, not “a rendering of the screenplay”. Rendering can be reconstructed at any future date from the document plus a rendering convention. The opposite — reconstructing a structured document from a rendered PDF — is lossy, manual, and expensive.
A migration workflow
The pattern is familiar to any archive that has ingested legacy media:
- Characterise the source. Inventory what you have: how many .fdx, how many .fadein, how many .fountain, how many PDFs. Which are originals, which are intermediate, which are duplicates.
- Convert to the open format. Run each source through screenjson-cli (or screenjson-export for the free reference subset). Each yields a ScreenJSON document.
- Validate. Every output goes through screenjson validate --strict. Any failure is held back for manual review.
- Catalogue. Derive catalogue records from the document’s metadata: authors, title, logline, characters, registration. Promote any free-text headers in the source to meta entries; don’t discard them.
- Store both. Keep the original source file alongside the ScreenJSON output. An archive preserves evidence; the original is the evidence, the ScreenJSON is the accessible rendition.
- Checksum, fixity, bit-rot protection. Same as any other digital archival object.
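The first and last steps of this workflow (characterise, then fixity) can be sketched with nothing but the Python standard library. This is an illustration under assumptions, not part of any ScreenJSON tooling: the function names and the extension list are hypothetical, and "duplicate" here means byte-identical only.

```python
import hashlib
from collections import Counter
from pathlib import Path

# Extensions named in the workflow above; extend to match your holdings.
SOURCE_EXTENSIONS = {".fdx", ".fadein", ".fountain", ".pdf"}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def characterise(root: Path):
    """Count source files by extension and group their paths by checksum."""
    counts = Counter()
    by_digest = {}
    for path in sorted(root.rglob("*")):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in SOURCE_EXTENSIONS:
            counts[suffix] += 1
            by_digest.setdefault(sha256_of(path), []).append(path)
    return counts, by_digest


def duplicate_groups(by_digest):
    """Keep only the checksums shared by more than one file."""
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

The same digests can be stored as the fixity values for the originals, so the inventory pass and the first checksum pass become a single read of each file. Telling originals from intermediate drafts still needs a human; a checksum only finds exact copies.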
Metadata discipline
ScreenJSON is generous about metadata, but an archive wants some of it populated to a consistent standard. A few things we suggest treating as non-negotiable in any archival ingestion:
- id: a new UUID per ingested document, minted at ingest, even if the source contains one.
- title: populated in at least one language.
- authors: every credited author, with a stable UUID per person across the collection.
- generator: record the ingestion tool’s name and version. This is the closest ScreenJSON has to a PREMIS agent.
- registrations: record WGA / guild / national registry data if available.
- license: a named license descriptor for anything open-licensed; otherwise the rights statement your archive uses for unresolved rights.
- meta: anything else you’d put in a PREMIS intellectualEntity or a Dublin Core element, keyed consistently across your collection.
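A rule like this is easy to enforce at ingest. The sketch below assumes the document has already been loaded with json.load and that the fields listed above sit at the top level under exactly those names. The split into "required" and "if available" is one reading of the list, and the function names are hypothetical.

```python
import uuid

# Fields treated as mandatory at ingest, versus those recorded when the
# source supplies them. This split is an assumption, not schema policy.
REQUIRED = ("id", "title", "authors", "generator", "license")
IF_AVAILABLE = ("registrations", "meta")


def is_blank(value) -> bool:
    """True for a missing value or an empty string, list, or object."""
    return value is None or value == "" or value == [] or value == {}


def is_uuid(value) -> bool:
    """True if the value parses as a UUID."""
    try:
        uuid.UUID(str(value))
    except ValueError:
        return False
    return True


def metadata_gaps(document: dict) -> dict:
    """Report blocking problems and non-blocking notes for one document."""
    problems = [
        f"{key}: missing or empty"
        for key in REQUIRED
        if is_blank(document.get(key))
    ]
    if not is_blank(document.get("id")) and not is_uuid(document["id"]):
        problems.append("id: not a UUID")
    notes = [
        f"{key}: not recorded"
        for key in IF_AVAILABLE
        if is_blank(document.get(key))
    ]
    return {"problems": problems, "notes": notes}
```

A document with any entry under "problems" would be held back for review in the same way as a validation failure; "notes" are for the cataloguer.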
Authority control
Characters, authors, and contributors all carry UUIDs, which makes cross-collection authority control tractable. A single writer working across fifty screenplays in the archive is one UUID, not fifty hand-typed strings. Whether you reconcile against ORCID, VIAF, or an internal authority file is your call; ScreenJSON doesn’t mandate one.
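As a rough illustration of an internal authority file, the sketch below hands out one UUID per normalised name and reuses it on every later lookup. The class and function names are hypothetical, and matching on the name alone is deliberately naive: two different writers who share a name would be merged, so a real workflow would add a human reconciliation step or an external identifier.

```python
import unicodedata
import uuid


def normalise_name(name: str) -> str:
    """Fold case, Unicode form, and whitespace so trivial variants match."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return " ".join(folded.split())


class AuthorityFile:
    """Minimal in-memory authority file: one UUID per normalised name."""

    def __init__(self):
        self._ids = {}

    def resolve(self, name: str) -> str:
        """Return the existing UUID for this name, minting one if unseen."""
        key = normalise_name(name)
        if key not in self._ids:
            self._ids[key] = str(uuid.uuid4())
        return self._ids[key]

    def __len__(self) -> int:
        return len(self._ids)


authority = AuthorityFile()
first = authority.resolve("Jane  Doe")
second = authority.resolve("jane doe")  # same person, different typing
```

Here first and second are the same UUID, which is the property the authors field needs across the collection.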
Revisions and provenance
The revisions array at the document level and on individual elements
is the canonical place to record authorial revision history. For
archival purposes, treat it as part of the evidentiary record: never
squash revisions on ingestion, always preserve them as-is, and record
your own ingestion as a final, clearly-labelled revision if your archive
needs that discipline.
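One way to honour "never squash, append only" is to work on a copy and add the ingestion entry at the end. The sketch below handles the document-level array only. The keys inside the new entry (label, created, note) are hypothetical placeholders, not taken from the specification, so check the schema for the real shape of a revision before using anything like this.

```python
import copy
from datetime import datetime, timezone


def with_ingest_revision(document: dict, tool: str, version: str, when=None) -> dict:
    """Return a copy of the document with one ingestion revision appended.

    Existing revisions are deep-copied and left exactly as found; the
    input document is never modified.
    """
    stamped = copy.deepcopy(document)
    when = when or datetime.now(timezone.utc)
    entry = {
        # Hypothetical keys: replace with the fields the schema defines.
        "label": "archival-ingest",
        "created": when.isoformat(),
        "note": f"Ingested by {tool} {version}; earlier revisions preserved unchanged.",
    }
    stamped.setdefault("revisions", []).append(entry)
    return stamped
```

Returning a new object rather than editing in place means the pre-ingest document can still be compared against the stamped one, which is a cheap check that nothing else changed.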
Open questions the schema doesn’t fix
The schema doesn’t solve every archival problem, and we’re explicit about that. A few things remain your institution’s policy call:
- Rights metadata. The schema has a license descriptor but doesn’t mandate a rights vocabulary. Use RightsStatements.org, Europeana Rights, or your internal taxonomy, recorded in meta.
- Physical provenance. If the source was a paper typescript scanned and OCR’d into a PDF, that history belongs in your repository’s provenance metadata, not in the ScreenJSON file.
- Contextual access. Some material will be restricted. ScreenJSON’s content encryption gives you a technical layer; your access control is still your repository’s job.
On versioning the schema itself
ScreenJSON uses semantic versioning. The specification commits to:
- Major version bumps when a change is backwards-incompatible.
- Minor version bumps when fields are added.
- Patch version bumps for clarifications.
An archive should pin to a known schema version for a given ingestion project and migrate deliberately when upgrading, the same way you would pin any other ingestion contract.
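Under these three rules, a pin check reduces to comparing major and minor numbers. The sketch below takes both versions as plain "major.minor.patch" strings. Where a document records its schema version is not assumed here, pre-release tags are not handled, and the function names are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Split 'major.minor.patch' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def fits_pin(document_version: str, pinned_version: str) -> bool:
    """True if a document written against document_version is covered
    by a project pinned to pinned_version.

    Same major version: no backwards-incompatible change between them.
    Document minor <= pinned minor: every field the document may use is
    known to the pinned schema. Patch level is ignored, because patch
    bumps only clarify.
    """
    doc_major, doc_minor, _ = parse_version(document_version)
    pin_major, pin_minor, _ = parse_version(pinned_version)
    return doc_major == pin_major and doc_minor <= pin_minor
```

For example, with a project pinned to 1.3.1, a document at 1.2.0 fits, while documents at 1.4.0 or 2.0.0 do not and would wait for a deliberate migration.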
Next
- Tool: screenjson-cli
- Tool: Greenlight — for batch migration of large collections.
- How-to: Validate a ScreenJSON document
- How-to: Migrate from FDX archives to ScreenJSON
- Specification: versioning & conformance