Supported formats
Read:
- DFXP/TTML
- SAMI
- SCC
- SRT
- WebVTT
Write:
- DFXP/TTML
- SAMI
- SRT
- Transcript
- WebVTT
See the examples folder for sample captions that can currently be read correctly.
SAMI Reader / Writer :: spec
Microsoft Synchronized Accessible Media Interchange. Supports multiple languages.
Supported Styling:
- text-align
- italics
- font-size
- font-family
- color
If the SAMI file is not valid XML (e.g. it has unclosed tags), the reader will still attempt to read it.
DFXP/TTML Reader / Writer :: spec
The W3C standard. Supports multiple languages.
Supported Styling:
- text-align
- italics
- font-size
- font-family
- color
SRT Reader / Writer :: spec
SubRip captions. If given multiple languages to write, the writer outputs all of them, joined together by a ‘MULTI-LANGUAGE SRT’ line.
Supported Styling:
- None
The reader assumes the input language is English. To change it:
pycaps = SRTReader().read(srt_content, lang='fr')
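The multi-language output described above can be pictured with a small standard-library sketch. It does not call pycaption, and the exact placement of the marker (on its own line, between the per-language documents) is an assumption. The helper names `join_srt_documents` and `split_srt_documents` are hypothetical.

```python
# Assumption: the marker occupies its own line between per-language documents.
SEPARATOR = "MULTI-LANGUAGE SRT\n"


def join_srt_documents(documents):
    """Join one SRT document per language into a single string."""
    return SEPARATOR.join(documents)


def split_srt_documents(combined):
    """Recover the per-language SRT documents from a joined string."""
    return combined.split(SEPARATOR)


english = "1\n00:00:01,000 --> 00:00:02,000\nHello\n\n"
french = "1\n00:00:01,000 --> 00:00:02,000\nBonjour\n\n"
combined = join_srt_documents([english, french])
```

Splitting on the marker is only safe as long as no caption text contains the marker line itself.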
WebVTT Reader / Writer :: spec
WebVTT is a W3C standard for displaying timed text in HTML5. Its
specification is currently (as of February 2015) in draft stage, so
not all of its features are implemented by major players; the same is
true of pycaption.
By default, the reader assumes the language is English and the writer
returns the first language it finds in the caption set. You can specify
a language using the lang parameter:
pycaps = WebVTTReader().read(content, lang='fr')
If you need to adjust all timestamps in a WebVTT file, you can use the
time_shift_milliseconds parameter, which moves the timestamps forward
(positive integer) or backward (negative integer) by the specified
amount:
pycaps = WebVTTReader(time_shift_milliseconds=1154).read(content)
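To make the effect of a time shift concrete, here is a minimal standard-library sketch that shifts the timing lines of a WebVTT string directly. It is not how pycaption implements time_shift_milliseconds. The function names are hypothetical, and clamping negative results to zero is an assumption of this sketch.

```python
import re

# Matches WebVTT timestamps such as 00:01:02.345 or 01:02.345 (hours optional).
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")


def shift_timestamp(match, shift_ms):
    """Return the matched timestamp moved by shift_ms, as hh:mm:ss.mmm."""
    hours, minutes, seconds, millis = match.groups()
    total = ((int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)) * 1000
    total += int(millis)
    # Assumption of this sketch: never produce a negative timestamp.
    total = max(0, total + shift_ms)
    seconds_total, ms = divmod(total, 1000)
    minutes_total, s = divmod(seconds_total, 60)
    h, m = divmod(minutes_total, 60)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def shift_cue_timings(vtt_text, shift_ms):
    """Shift every timing line (the lines containing '-->') by shift_ms."""
    shifted = []
    for line in vtt_text.splitlines():
        if "-->" in line:
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, shift_ms), line)
        shifted.append(line)
    return "\n".join(shifted)
```

Only timing lines are rewritten, so timestamp-like text inside a cue payload is left alone.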
Styling
Styling in WebVTT can be done via inline tags (e.g. <b>, <i> etc.) or
external CSS rules applied to text wrapped in class (<c>) or voice (<v>)
tags. pycaption currently only keeps voice tags on conversion.
Example:
<v Fred>Hi, my name is Fred
is converted to
Fred: Hi, my name is Fred
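The conversion shown in the example above can be approximated with a short regular-expression sketch. It illustrates the transformation only and is not taken from pycaption's source; the name `voice_to_prefix` is hypothetical.

```python
import re

# Matches an opening voice tag, optionally with classes: <v Fred> or <v.loud Fred>.
VOICE_TAG = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>")


def voice_to_prefix(cue_text):
    """Replace each voice tag with a 'Speaker: ' prefix and drop closing tags."""
    text = VOICE_TAG.sub(lambda m: m.group(1).strip() + ": ", cue_text)
    return text.replace("</v>", "")
```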
The following WebVTT supported tags are stripped off the cue text:
<c>, <i>, <b>, <u>, <ruby>, <rt>, <lang> and timestamp tags (<h:mm:ss.sss>).
Non-supported tags are left unchanged as a natural part of the cue text with no special meaning.
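The stripping behaviour described above can be sketched with two regular expressions. This is a hedged illustration rather than pycaption's actual code: the patterns and the name `strip_supported_tags` are assumptions, and voice tags are deliberately left out because they are handled separately.

```python
import re

# Tag names taken from the list above; voice tags (<v>) are not included.
STRIPPED_TAGS = ("c", "i", "b", "u", "ruby", "rt", "lang")

# Opening or closing tag, with optional .classes and an optional annotation.
TAG_PATTERN = re.compile(
    r"</?(?:%s)(?:\.[^\s>]+)?(?:\s[^>]*)?>" % "|".join(STRIPPED_TAGS)
)

# Inline timestamp tags such as <00:00:02.000>.
TIMESTAMP_TAG = re.compile(r"<(?:\d+:)?\d{2}:\d{2}\.\d{3}>")


def strip_supported_tags(cue_text):
    """Remove the listed tags and timestamp tags, keeping everything else."""
    without_tags = TAG_PATTERN.sub("", cue_text)
    return TIMESTAMP_TAG.sub("", without_tags)
```

Tags outside the list, such as a made-up `<foo>`, pass through untouched, matching the rule that unsupported tags stay in the cue text.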
Positioning
The WebVTT specs allow customizing the position of cues by configuring a
number of cue settings. pycaption currently only maintains positioning
information on writing, in which case it supports the following settings:
- A WebVTT line position cue setting.
- A WebVTT text position cue setting.
- A WebVTT size cue setting.
- A WebVTT alignment cue setting.
pycaption does not support:
- A WebVTT vertical text cue setting.
- A WebVTT region cue setting.
Refer to the official WebVTT specification for details about the cue settings.
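For orientation, the four supported settings appear in a WebVTT file as `name:value` pairs appended to a cue's timing line. The sketch below builds such a line by hand. It is not pycaption's writer, the helper names are hypothetical, and the order of the settings is an arbitrary choice of this example.

```python
def format_cue_settings(line=None, position=None, size=None, align=None):
    """Build a WebVTT cue settings string from the settings that are given."""
    pairs = (
        ("line", line),
        ("position", position),
        ("size", size),
        ("align", align),
    )
    return " ".join(f"{name}:{value}" for name, value in pairs if value is not None)


def format_timing_line(start, end, **settings):
    """Append the cue settings, if any, to a 'start --> end' timing line."""
    rendered = format_cue_settings(**settings)
    timing = f"{start} --> {end}"
    return f"{timing} {rendered}" if rendered else timing


timing_line = format_timing_line(
    "00:00:01.000", "00:00:03.000",
    line="10%", position="50%", size="80%", align="start",
)
```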
SCC Reader :: spec
Scenarist Closed Caption format. Assumes Channel 1 input.
Supported Styling:
- italics
By default, the SCC Reader does not simulate roll-up captions. To enable roll-ups:
pycaps = SCCReader().read(scc_content, simulate_roll_up=True)
The reader also assumes the input language is English. To change it:
pycaps = SCCReader().read(scc_content, lang='fr')
The reader also accepts an offset (measured in seconds) for the timestamps. For example, if the SCC file is 45 seconds ahead of the video:
pycaps = SCCReader().read(scc_content, offset=45)
The SCC Reader handles both dropframe and non-dropframe captions, and will auto-detect which format the captions are in.
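The text above does not say how the reader tells the two formats apart. One common convention, assumed here, is the SMPTE timecode separator: a semicolon before the frame count marks drop-frame, a colon marks non-drop-frame. A minimal sketch with a hypothetical helper name:

```python
import re

# hh:mm:ss followed by ':' (non-drop-frame) or ';' (drop-frame) and the frame count.
TIMECODE = re.compile(r"^(\d{2}):(\d{2}):(\d{2})([:;])(\d{2})\b")


def is_drop_frame(scc_line):
    """Return True or False for a timecoded line, None if the line has no timecode."""
    match = TIMECODE.match(scc_line)
    if match is None:
        return None
    return match.group(4) == ";"
```

Returning None for lines without a timecode lets a caller skip the header line and blank lines while scanning for the first caption.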
For debugging purposes, the SCC captions can be translated into a human-readable form as follows:
translated_scc = translate_scc(scc_content, brackets="[]")
Square brackets are used by default, but they can be replaced with other brackets or None.
Transcript Writer
Text stripped of styling, arranged in sentences.
Supported Styling:
- None
The transcript writer uses natural sentence boundary detection algorithms to create the transcript.
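As a rough picture of what arranging caption text into sentences involves, the sketch below joins caption fragments and splits them after sentence-ending punctuation. This naive regular expression is only a stand-in for the boundary detection the writer actually uses, which is not specified here; it will mis-split abbreviations such as "e.g.". The function name is hypothetical.

```python
import re

# Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def captions_to_sentences(caption_texts):
    """Join caption fragments into one string and split it into sentences."""
    joined = " ".join(text.strip() for text in caption_texts if text.strip())
    return [sentence for sentence in SENTENCE_END.split(joined) if sentence]


sentences = captions_to_sentences(["Hi, my name", "is Fred. Nice to", "meet you!"])
```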