Convert subtitle files: SRT, VTT, SBV, and text

CasinoLove has created a free-to-use, open-source web application that lets you easily convert between common subtitle formats. The tool runs entirely in your browser, meaning your data is never uploaded to our servers. It requires no registration and is completely free of ads and tracking. We use this tool daily for our own video content creation, and we hope you find it just as useful. You can visit our GitHub page for the source code.

Overview of the common subtitle formats

This article analyzes three widely used plain text subtitle formats found in production, web delivery, archives, and conversion pipelines: SRT (SubRip), WebVTT (VTT), and SBV (SubViewer / YouTube SBV). It covers formal syntax, timing rules, parser behavior, de facto conventions, interoperability issues, and practical conversion decisions.

SRT vs WebVTT vs SBV: format overview and decision summary

SRT, WebVTT, and SBV are all text-based timed subtitle formats, but they differ in goals and parser expectations. SRT is the de facto universal exchange format for video players and editing tools. WebVTT is the standards-based web subtitle format designed for HTML media text tracks; it supports richer semantics and cue layout controls. SBV is a lightweight subtitle format strongly associated with YouTube workflows and simple caption exchange.

If your goal is maximum compatibility, SRT is usually the safest export target. If your goal is browser native playback with cue positioning and web track semantics, WebVTT is the correct format. If your source data comes from YouTube subtitle export or legacy simple caption workflows, you may encounter SBV and need conversion.

Quick selection guide for technical teams

Why subtitle format differences matter in real systems

In small projects, subtitle conversion can look like a simple timestamp replacement task. In production systems, the details matter: encoding mismatches break non-ASCII characters, parser leniency hides malformed files until a stricter platform rejects them, and timing rounding can introduce overlaps or zero-length cues that break importers.

The biggest implementation risks usually come from:

SRT format (SubRip) deep technical analysis

What SRT is and why it remains dominant

SRT (SubRip Subtitle format) is a plain text sidecar subtitle format that originated from the SubRip software ecosystem. It became widely adopted because the structure is simple, human editable, and supported by many players, editors, and platforms. In practice, SRT is often treated as the default subtitle exchange format even though it is not governed by a single modern standards body in the same way as WebVTT.

Canonical SRT cue structure

A typical SRT cue block contains four logical parts:

  1. Sequential cue number
  2. Timing line in HH:MM:SS,mmm --> HH:MM:SS,mmm format
  3. One or more text lines
  4. A blank line that terminates the block

1
00:00:01,500 --> 00:00:04,000
Hello, world!

2
00:00:05,000 --> 00:00:08,500
This is a subtitle.

SRT timestamp specifics

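The canonical SRT timestamp uses the form HH:MM:SS,mmm: two-digit hours, minutes, and seconds separated by colons, then a comma, then three-digit milliseconds.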

This comma separator is one of the most important practical differences between SRT and both WebVTT and SBV, which commonly use a dot for fractional seconds.
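
As a minimal sketch of the canonical form described above, the following Python helpers convert between SRT timestamps and integer milliseconds. The function names are illustrative and not taken from any particular tool, and the pattern deliberately accepts only the canonical comma form.

```python
import re

# Canonical SRT timestamp: HH:MM:SS,mmm (comma before the milliseconds).
_SRT_TS = re.compile(r"(\d{2,}):([0-5]\d):([0-5]\d),(\d{3})")


def srt_timestamp_to_ms(text: str) -> int:
    """Convert a canonical SRT timestamp to integer milliseconds."""
    match = _SRT_TS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a canonical SRT timestamp: {text!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def ms_to_srt_timestamp(ms: int) -> str:
    """Render integer milliseconds as a canonical SRT timestamp."""
    if ms < 0:
        raise ValueError("timestamps must be non-negative")
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"
```

Working in integer milliseconds avoids floating point rounding when timestamps are later re-rendered in another format.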

SRT encoding reality and interoperability risk

SRT is plain text, but the format itself historically does not enforce a single universal encoding. This is one of the most common causes of broken accented characters or mojibake when moving subtitle files between operating systems, editors, and players. Many modern pipelines normalize SRT to UTF-8 for reliability.

Platform rules can be stricter than generic SRT expectations. For example, some platforms explicitly require UTF-8 and ignore basic markup even when certain desktop players may render it.

SRT markup and formatting support in practice

SRT is often described as plain text, but in practice many tools tolerate or render a small subset of HTML-like tags such as <b>, <i>, <u>, and sometimes color via <font>. This behavior is not consistent across platforms: some players render these tags, some strip them, and some display them as literal text.
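
When a target platform is known to display tags as literal text, one option is to strip them before export. The sketch below removes only the four tags named above and leaves everything else untouched; the function name is hypothetical, and a real pipeline may need a wider or narrower tag list.

```python
import re

# Matches opening and closing <b>, <i>, <u>, and <font ...> tags only.
_SRT_TAG = re.compile(r"</?(?:b|i|u|font)(?:\s[^>]*)?>", re.IGNORECASE)


def strip_srt_markup(text: str) -> str:
    """Remove the common HTML-like SRT tags and keep the enclosed text."""
    return _SRT_TAG.sub("", text)
```

Because the pattern requires one of the listed tag names directly after the angle bracket, ordinary text such as "a < b" is left alone.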

SRT parser leniency and de facto behavior

Real world SRT files often deviate from canonical formatting and still work because importers are permissive. Common tolerated deviations include:

A technical pipeline should not assume that files labeled .srt are syntactically clean. Robust importers should parse loosely, then normalize.
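
A loose importer along these lines might look like the following sketch. The specific deviations it tolerates (a byte order mark, CRLF line endings, extra blank lines, missing cue numbers, a dot instead of a comma, short fractions, and no spaces around the arrow) are assumptions chosen for illustration, and the names are hypothetical.

```python
import re
from typing import List, Tuple

# Lenient timing line: comma or dot before a 1-3 digit fraction,
# optional whitespace around the arrow.
_TIMING = re.compile(
    r"(\d+):(\d{1,2}):(\d{1,2})[,.](\d{1,3})\s*-->\s*"
    r"(\d+):(\d{1,2}):(\d{1,2})[,.](\d{1,3})"
)


def _to_ms(hours: str, minutes: str, seconds: str, fraction: str) -> int:
    # Pad "5" to "500" so a short fraction is read as a decimal fraction.
    millis = int(fraction.ljust(3, "0"))
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + millis


def parse_srt_loosely(data: str) -> List[Tuple[int, int, str]]:
    """Return (start_ms, end_ms, text) tuples from a possibly messy SRT string."""
    text = data.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                groups = match.groups()
                # Anything before the timing line (the cue number) is ignored.
                payload = "\n".join(lines[index + 1:]).strip()
                cues.append((_to_ms(*groups[:4]), _to_ms(*groups[4:]), payload))
                break
    return cues
```

Blocks without a recognizable timing line are silently skipped here; a production importer would more likely record them as warnings.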

SRT strengths for engineering workflows

SRT limitations for advanced caption workflows

WebVTT (VTT) format deep technical analysis

What WebVTT is designed to solve

WebVTT (Web Video Text Tracks) is a timed text format specified by the W3C for web media text tracks. It is designed for use with HTML media and supports subtitles, captions, chapters, and metadata tracks. Compared to SRT, WebVTT keeps the plain text feel but adds a formal syntax, cue settings, and richer cue text semantics.

WebVTT file structure at a high level

A WebVTT file starts with a required header line, WEBVTT, and then contains a sequence of blocks such as cues, comments, styles, and region definitions.

WEBVTT

00:01.000 --> 00:04.000
Never drink liquid nitrogen.

WebVTT cue structure

A WebVTT cue block can contain:

  1. Optional cue identifier
  2. Timing line with start and end timestamps
  3. Optional cue settings on the timing line
  4. Cue payload text
  5. Blank line terminator

intro-1
00:00:22.230 --> 00:00:24.606 align:start line:90%
Hello from a WebVTT cue.

WebVTT timestamp syntax and strictness

WebVTT timestamps use a dot as the fractional separator and take the form [hh:]mm:ss.mmm. The hours field is optional and is commonly omitted when it is zero. This is a key difference from SRT, where the canonical form always includes hours and uses a comma before the milliseconds.

For conversion code, WebVTT timestamp parsing should be more strict than SRT if you want standards compliance, but many ingest systems still accept slightly malformed VTT in practice.
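
A strict parser for the [hh:]mm:ss.mmm form can be sketched as follows. It rejects comma fractions and short fields outright, which is the opposite of the lenient approach that suits SRT import. The function name is illustrative.

```python
import re

# Optional hours (two or more digits), then mm:ss.mmm with a dot.
_VTT_TS = re.compile(r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})")


def vtt_timestamp_to_ms(text: str) -> int:
    """Convert a WebVTT timestamp to integer milliseconds, strictly."""
    match = _VTT_TS.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid WebVTT timestamp: {text!r}")
    hours, minutes, seconds, millis = match.groups()
    total_seconds = (int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)
```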

Cue settings in WebVTT

One of the biggest practical advantages of WebVTT is cue settings, which let authors control cue placement and orientation. Common settings include vertical, line, position, size, align, and region.

These settings appear on the same line as the cue timing and are separated by spaces. The web platform and player implementation determine the actual rendering behavior.
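
Because the settings share a line with the timestamps, a converter has to separate them before it can read the timing. The sketch below splits a timing line into its start timestamp, end timestamp, and a dictionary of name:value settings. It does not validate setting names or values, and the function name is hypothetical.

```python
from typing import Dict, Tuple


def split_vtt_timing_line(line: str) -> Tuple[str, str, Dict[str, str]]:
    """Split a WebVTT timing line into (start, end, settings)."""
    left, arrow, right = line.partition("-->")
    if not arrow:
        raise ValueError("timing line has no --> separator")
    parts = right.split()
    if not left.strip() or not parts:
        raise ValueError("timing line is missing a timestamp")
    settings = {}
    for item in parts[1:]:
        # Each setting is written as name:value with no spaces inside.
        name, separator, value = item.partition(":")
        if separator and name and value:
            settings[name] = value
    return left.strip(), parts[0], settings
```

When converting to SRT or SBV, the settings dictionary is the part that cannot be carried over.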

Cue text semantics and markup in WebVTT

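WebVTT supports a richer cue text model than SRT. It allows a small subtitle-oriented tag vocabulary and semantic spans such as class spans (<c>), italics (<i>), bold (<b>), underline (<u>), ruby annotations (<ruby>, <rt>), voice spans (<v>), and language spans (<lang>).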

This makes WebVTT more suitable for web captioning, speaker labeling, and some advanced authoring needs, but many downstream tools only support a subset.

Comments, style blocks, and region definitions

WebVTT includes block types that SRT and SBV do not have: NOTE comment blocks, STYLE blocks, and REGION definition blocks.

This expands WebVTT from a simple subtitle file into a more general timed text container syntax. It also increases the chance of conversion loss when exporting to SRT or SBV.

WebVTT strengths for web video pipelines

WebVTT limitations in mixed software ecosystems

SBV (SubViewer / YouTube SBV) format deep technical analysis

What SBV is in practice

SBV is a lightweight plain text subtitle format commonly associated with YouTube subtitle workflows and SubViewer style timing. In modern production practice, SBV is most often encountered as a YouTube oriented caption file rather than as a general purpose delivery format.

Unlike WebVTT, SBV does not have a widely referenced modern standards document maintained by a standards body. Most technical teams treat SBV behavior as platform defined and tool defined, with YouTube examples and de facto converter behavior serving as reference.

Canonical SBV block structure used in YouTube examples

SBV uses a very simple cue block:

  1. One timing line containing start and end timestamps separated by a comma
  2. One or more subtitle text lines
  3. Blank line separator between cues

0:00:00.599,0:00:04.160
>> ALICE: Hi, my name is Alice Miller and this is John Brown

0:00:04.160,0:00:06.770
>> JOHN: and we're the owners of Miller Bakery.

SBV timing syntax and differences from SRT and VTT

The core SBV timing line differs from both SRT and WebVTT: the start and end timestamps are separated by a comma rather than the --> arrow.

SBV timestamps typically use a dot for fractional seconds, for example 0:00:01.000. The hour field is commonly not zero padded to two digits in examples, which is another source of format variation during conversion.
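
A minimal sketch of parsing an SBV timing line, assuming the H:MM:SS.mmm shape shown in the example above (hours with one or more digits, dot fractions, comma between start and end). The function name is illustrative.

```python
import re
from typing import Tuple

# start,end where each side is H:MM:SS.mmm and hours are not zero padded.
_SBV_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})"
)


def parse_sbv_timing_line(line: str) -> Tuple[int, int]:
    """Return (start_ms, end_ms) for an SBV timing line."""
    match = _SBV_TIMING.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not an SBV timing line: {line!r}")
    values = [int(group) for group in match.groups()]

    def to_ms(hours: int, minutes: int, seconds: int, millis: int) -> int:
        return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis

    return to_ms(*values[:4]), to_ms(*values[4:])
```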

SBV and styling support expectations

In practical YouTube usage, SBV is treated as a basic text format that carries timing and text; style markup is not its main use case. Teams should treat SBV as a plain transport format, not a styling format.

SBV parser behavior and de facto quirks

Because SBV is often handled by converters and platform uploaders rather than advanced subtitle authoring tools, the most common de facto behaviors are:

Why SBV still matters in technical workflows

Even if SBV is not the preferred archive or distribution format for many teams, it still matters because:

SBV limitations

De facto behavior in software and platforms (important for engineering)

Formal syntax vs parser tolerance

One of the most important engineering realities is that subtitle software often accepts files that are not fully standard. This means a file may appear valid in one editor, fail in another, and import with altered timing in a third. For robust tooling, treat parsing and rendering as separate steps: parse liberally, normalize internally, render conservatively.

Common de facto SRT behaviors

Common de facto WebVTT behaviors

Common de facto SBV behaviors

Platform specific constraints can override format capability

A key practical point for technical teams is that a platform may support a file extension but only a limited feature subset. For example, a platform can accept SRT or SBV as upload formats but ignore style markup. Similarly, a WebVTT consumer may accept the file but ignore advanced cues, regions, or styling. Always validate against the target platform, not only the format specification.

Conversion engineering notes and edge cases (SRT, VTT, SBV)

Lossless vs lossy conversion expectations

Not all subtitle conversions are fully lossless. Timing values can usually be preserved exactly, but structure and semantics often cannot.

Key syntax mappings

| Feature | SRT | WebVTT | SBV |
| --- | --- | --- | --- |
| Header | None | WEBVTT required | None |
| Cue numbering | Canonical yes | No (optional cue identifier) | No |
| Timing separator | --> | --> | comma between start and end |
| Millisecond separator | comma | dot | dot (typical) |
| Cue settings on time line | No standard support | Yes | No |
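
The timing-related rows of this mapping can be expressed as one rendering function. This is a sketch under the conventions listed above: the "srt", "vtt", and "sbv" keys and the function names are assumptions made for the example, and the WebVTT output always includes the hours field for simplicity.

```python
from typing import Tuple


def _parts(ms: int) -> Tuple[int, int, int, int]:
    if ms < 0:
        raise ValueError("timestamps must be non-negative")
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return hours, minutes, seconds, millis


def format_timing_line(start_ms: int, end_ms: int, fmt: str) -> str:
    """Render a timing line for 'srt', 'vtt', or 'sbv'."""

    def stamp(ms: int, hour_width: int, fraction_separator: str) -> str:
        h, m, s, f = _parts(ms)
        return f"{h:0{hour_width}d}:{m:02d}:{s:02d}{fraction_separator}{f:03d}"

    if fmt == "srt":  # comma fractions, arrow separator
        return f"{stamp(start_ms, 2, ',')} --> {stamp(end_ms, 2, ',')}"
    if fmt == "vtt":  # dot fractions, arrow separator
        return f"{stamp(start_ms, 2, '.')} --> {stamp(end_ms, 2, '.')}"
    if fmt == "sbv":  # dot fractions, comma between start and end
        return f"{stamp(start_ms, 1, '.')},{stamp(end_ms, 1, '.')}"
    raise ValueError(f"unknown format: {fmt!r}")
```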

Timestamp conversion pitfalls

Text and markup conversion pitfalls

Encoding normalization is not optional in serious pipelines

If your subtitle conversion tool is intended for multilingual production use, normalize text to UTF-8 on output. This is especially important when importing SRT from legacy sources and when targeting platforms that require plain UTF-8 uploads.
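
One possible normalization step is sketched below. It tries UTF-8 first (dropping a byte order mark if present) and only then falls back to a legacy code page. The cp1252 fallback is an assumption for Western European sources; the right fallback depends on where the files come from, and the function names are hypothetical.

```python
def decode_subtitle_bytes(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode subtitle bytes, preferring UTF-8 over a legacy fallback."""
    try:
        # "utf-8-sig" accepts plain UTF-8 and strips a leading BOM.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumed legacy encoding; undecodable bytes become U+FFFD.
        return raw.decode(fallback, errors="replace")


def normalize_to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return the same text re-encoded as UTF-8 without a BOM."""
    return decode_subtitle_bytes(raw, fallback).encode("utf-8")
```

Trying UTF-8 first is a reasonable default because text in a legacy single-byte encoding that contains accented characters rarely happens to be valid UTF-8.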

Recommended internal representation for converters

A robust converter should parse all subtitle formats into a single internal model before rendering. A practical internal cue model usually includes:

This approach makes it easier to handle malformed input and produce deterministic normalized output.
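
As an illustration of such a model, the dataclass below stores timing as integer milliseconds plus the text lines, with optional WebVTT-only fields. The exact field set is an assumption made for this sketch, not a prescribed schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cue:
    """Format-neutral cue; all field names here are illustrative."""

    start_ms: int
    end_ms: int
    lines: List[str]
    identifier: Optional[str] = None  # WebVTT cue identifier, if any
    settings: Dict[str, str] = field(default_factory=dict)  # WebVTT cue settings

    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


cue = Cue(start_ms=1500, end_ms=4000, lines=["Hello, world!"])
```

Keeping format-specific data in optional fields lets an SRT or SBV renderer ignore it explicitly instead of losing it by accident.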

Validation and normalization strategies for production subtitle pipelines

Parse loosely, render strictly

This is the most reliable design strategy for subtitle tooling. Accept common malformed inputs to reduce failure rates during import, but always emit a stricter normalized output format.
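
Part of rendering strictly is checking timing before output. The following sketch reports cues whose end is not after their start and cues that begin before an earlier cue has ended. It assumes cues are (start_ms, end_ms, text) tuples already sorted in display order; the function name and message wording are illustrative.

```python
from typing import List, Tuple


def find_timing_problems(cues: List[Tuple[int, int, str]]) -> List[str]:
    """Return human-readable timing problems for a list of cues."""
    problems = []
    latest_end = None
    for number, (start, end, _text) in enumerate(cues, start=1):
        if end <= start:
            problems.append(f"cue {number}: end is not after start")
        if latest_end is not None and start < latest_end:
            problems.append(f"cue {number}: overlaps previous cue")
        # Track the furthest end seen so far, not just the previous cue's end.
        latest_end = end if latest_end is None else max(latest_end, end)
    return problems
```

Whether an overlap is an error or only a warning depends on the target platform, so a pipeline may prefer to log these findings rather than reject the file.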

Suggested normalization rules for SRT output

Suggested normalization rules for WebVTT output

Suggested normalization rules for SBV output

Error classes worth reporting in logs

SRT vs WebVTT vs SBV comparison matrix for technical users

| Category | SRT | WebVTT (VTT) | SBV |
| --- | --- | --- | --- |
| Primary use | General subtitle interchange | Web media text tracks | YouTube style simple subtitles |
| Formal standard | No single modern canonical spec in common use | W3C standard | No widely used modern standards body spec |
| Header required | No | Yes | No |
| Cue numbers | Canonical yes | No | No |
| Milliseconds separator | Comma | Dot | Dot (typical) |
| Layout settings | Not standardized | Yes (cue settings) | No |
| Comments / metadata blocks | No standardized block types | Yes (NOTE, STYLE, REGION, more structured semantics) | No |
| Encoding certainty | Historically variable in practice | UTF-8 oriented standard usage | Platform expectations often plain UTF-8 in modern usage |
| Typical compatibility | Highest across players/editors | Best on web and modern track consumers | Narrower, often converted first |

Recommendations by workflow (engineering and content operations)

For website video playback with HTML track elements

Use WebVTT as the delivery format. Keep a normalized SRT export for editing and fallback. If your authoring source is SRT, convert to WebVTT late in the pipeline and validate in the target browsers.
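
For clean, canonical SRT input, the late conversion step can be quite small. The sketch below adds the WEBVTT header and rewrites comma fractions on timing lines only. It assumes the SRT has already been normalized, keeps the cue numbers (which WebVTT reads as optional cue identifiers), and does not touch markup. The function name is hypothetical.

```python
import re

# Canonical SRT timing line; only lines matching this are rewritten.
_SRT_TIMING = re.compile(
    r"(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})\s*"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert normalized SRT text to a basic WebVTT document."""
    text = srt_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    output = ["WEBVTT", ""]
    for line in text.strip().split("\n"):
        match = _SRT_TIMING.fullmatch(line)
        if match:
            start, start_ms, end, end_ms = match.groups()
            line = f"{start}.{start_ms} --> {end}.{end_ms}"
        output.append(line)
    return "\n".join(output) + "\n"
```

Restricting the rewrite to timing lines avoids changing commas that appear in the subtitle text itself.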

For video editing and cross platform subtitle exchange

Use SRT as the main exchange format unless you specifically need WebVTT semantics. Enforce UTF-8, canonical timestamps, and normalized line endings in your pipeline.

For YouTube subtitle import/export workflows

Be prepared to handle SBV and SRT. If subtitles need to move into NLEs, archives, or other platforms, convert SBV to normalized SRT first. Keep the original SBV as source evidence if timing provenance matters.
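
A minimal SBV to SRT conversion could look like the sketch below. It assumes each SBV block starts with a timing line in the H:MM:SS.mmm,H:MM:SS.mmm shape, numbers the cues sequentially, and zero pads the hour field. Blocks without a recognizable timing line are skipped. The function name is illustrative.

```python
import re

_SBV_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})"
)


def sbv_to_srt(sbv_text: str) -> str:
    """Convert SBV text to numbered SRT cues."""
    text = sbv_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.split("\n")
        match = _SBV_TIMING.fullmatch(lines[0].strip())
        if match is None:
            continue  # not a cue block; a real tool might log this
        h1, m1, s1, f1, h2, m2, s2, f2 = match.groups()
        timing = f"{int(h1):02d}:{m1}:{s1},{f1} --> {int(h2):02d}:{m2}:{s2},{f2}"
        number = len(cues) + 1
        cues.append("\n".join([str(number), timing] + lines[1:]))
    return "\n\n".join(cues) + "\n"
```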

For subtitle conversion software developers

Implement:

Technical FAQ: SRT, VTT, and SBV subtitle formats

Is SRT formally standardized like WebVTT?

Not in the same way. SRT is widely used and well understood, but practical interoperability relies heavily on de facto conventions and parser tolerance. WebVTT has a formal W3C specification.

Why does SRT use commas while VTT uses dots for milliseconds?

This is a format syntax difference with historical roots. It is one of the most common causes of failed imports during naive subtitle conversion. Converters should explicitly normalize fractional separators.

Can I safely convert WebVTT to SRT without data loss?

Only if the WebVTT file uses basic cues without advanced settings, regions, or semantic cue text features. Otherwise, timing and text may convert, but layout and semantic information may be lost.

Is SBV obsolete?

SBV is not the best universal format, but it is still relevant because YouTube related workflows and legacy subtitle exports can produce it. It remains important in conversion and ingestion pipelines.

What should a technical team archive?

Archive the original source subtitle file plus at least one normalized interchange copy, typically UTF-8 SRT. If web playback is a primary target, also archive a validated WebVTT derivative.

Conclusion

SRT, WebVTT, and SBV all solve the same basic problem of timed text, but they do so with different assumptions. SRT wins on universal compatibility, WebVTT wins on standards based web features, and SBV remains useful as a simple YouTube centered source format. For robust subtitle engineering, treat subtitle conversion as a parsing and normalization problem, not just a timestamp string replacement task.