Extending the Display Time of SubRip Subtitles (.srt)

If you are used to speaking a foreign language and understand it by ear, subtitles are only an aid, and their value depends on how intelligible, standard, complex and fast the speech on screen is. But if you can only read the language tolerably well, subtitles are your main source of information.

Unfortunately, in the second case the usual display time of a subtitle is too short. Many programs can extend it (SubtitleEdit, for example), but the extension is sometimes capped at reasonable limits, and launching a full-fledged editor for such a simple operation is not always convenient.

So I wrote a few simple scripts just for this purpose, based on two ways of lengthening the display time.

1. One-sided extension: the end of each subtitle is moved up to the start of the next one (minus one millisecond, so that the two do not overlap). This is the simpler method, but it is less effective because all the extra time is added on one side only. On the other hand, the result looks more familiar to the viewer.

2. Extension in both directions: first the pause between two subtitles is calculated, then the pause is split in half; one half is added to the end of the current subtitle and the other is subtracted from the start of the next one. As a result, each subtitle appears as early as possible before its line is spoken and stays on screen as long as possible after it. This takes some getting used to, but the time is distributed more evenly, and the viewer gets a chance to read and understand key phrases in advance.
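
The two methods can be sketched roughly as follows. This is a minimal illustration rather than the code of the actual scripts: the function names extendOneWay and extendBothWays are hypothetical, cues are assumed to be already parsed into [startMs, endMs] pairs, and the input is assumed to be sorted and free of overlaps.

```javascript
// Each cue is [startMs, endMs]; cues are assumed sorted and non-overlapping.

// Method 1: move each cue's end to 1 ms before the start of the next cue.
function extendOneWay(cues) {
  return cues.map(([start, end], i) => {
    const next = cues[i + 1];
    // The last cue has no successor, so it is left unchanged.
    if (!next) return [start, end];
    // Math.max is a guard (an assumption): never shorten a cue.
    return [start, Math.max(end, next[0] - 1)];
  });
}

// Method 2: split each pause in half between the two neighbouring cues.
function extendBothWays(cues) {
  const result = cues.map(([start, end]) => [start, end]);
  for (let i = 0; i + 1 < result.length; i++) {
    const pause = result[i + 1][0] - result[i][1];
    if (pause <= 0) continue; // no pause to share
    // Rounding down is an assumption; an odd pause leaves a 1 ms gap.
    const half = Math.floor(pause / 2);
    result[i][1] += half;     // current cue stays on screen longer
    result[i + 1][0] -= half; // next cue appears earlier
  }
  return result;
}

const sample = [[1000, 2000], [4000, 5000], [5500, 6000]];
console.log(extendOneWay(sample));   // [[1000, 3999], [4000, 5499], [5500, 6000]]
console.log(extendBothWays(sample)); // [[1000, 3000], [3000, 5250], [5250, 6000]]
```

Note that in this sketch the second method lets neighbouring cues touch (the end of one equals the start of the next) when the pause is an even number of milliseconds; subtract one more millisecond if a strict gap is wanted.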

The scripts come in two forms: JavaScript (a browser version) and Perl (a local console version).

The browser version is a page with an input area for the source subtitle text and an output area for the extended version. It is less convenient (you have to open the subtitles in a text editor, copy the text, run it through the script, paste the modified version back and save it again), but it is simple and available to everyone (I tested it in the latest versions of Chrome, Firefox, Opera and Safari). The page can be saved to disk, and it will work locally.

The console version is a Perl script that takes the source file as an argument (if no argument is given, the script asks for it interactively) and creates a new file with extended display times, adding a .long tail to the name. It expects UTF-8 on input and writes UTF-8 on output (given a different encoding, the interpreter itself should report an error; in that case, re-save the original subtitles as UTF-8).

There is hardly any need to explain the code, since it is all very simple: both scripts convert the source text into an array of arrays, analyze the pauses, lengthen the display times, and then join everything back into subtitle text. The only parts of any (doubtful) interest are two small functions that convert SubRip timecodes to milliseconds and back to simplify the calculations, but they are quite obvious too.
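
For readers who have not seen the format, here is one way such conversion functions and the "array of arrays" step might look. This is only a sketch: the names timecodeToMs, msToTimecode and parseSrt are hypothetical, and the layout of each inner array ([index, startMs, endMs, text]) is an assumption, not necessarily the one the scripts use.

```javascript
// "HH:MM:SS,mmm" -> milliseconds
function timecodeToMs(timecode) {
  const match = /^(\d+):(\d{2}):(\d{2}),(\d{3})$/.exec(timecode.trim());
  if (!match) throw new Error(`Not a SubRip timecode: ${timecode}`);
  const [h, m, s, ms] = match.slice(1).map(Number);
  return ((h * 60 + m) * 60 + s) * 1000 + ms;
}

// milliseconds -> "HH:MM:SS,mmm"
function msToTimecode(total) {
  const pad = (n, width) => String(n).padStart(width, '0');
  const ms = total % 1000;
  const s = Math.floor(total / 1000) % 60;
  const m = Math.floor(total / 60000) % 60;
  const h = Math.floor(total / 3600000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Subtitle text -> array of [index, startMs, endMs, text] arrays.
// Assumes well-formed input: blocks separated by blank lines,
// a counter line, a "start --> end" line, then the subtitle text.
function parseSrt(text) {
  return text.replace(/\r\n/g, '\n').trim().split(/\n{2,}/).map((block) => {
    const [index, timing, ...lines] = block.split('\n');
    const [start, end] = timing.split(' --> ').map((t) => timecodeToMs(t));
    return [index, start, end, lines.join('\n')];
  });
}

console.log(timecodeToMs('01:02:03,456')); // 3723456
console.log(msToTimecode(3723456));        // "01:02:03,456"
```

Working in plain milliseconds keeps the pause arithmetic down to integer addition and subtraction; the timecodes are rebuilt only when the output text is assembled.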

The scripts do not check that the text conforms to the format and do not fix timing errors already present in the source (overlapping timecodes, negative display times, etc.). If the subtitles look doubtful, run all the checks beforehand in a full editor (SubtitleEdit again, for example). The console version checks only the file extension (if it is not .srt, the script exits with a warning), so as not to start processing something big by mistake.
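
If you would rather catch such errors in code, a check along the following lines could be added before the extension step. It is not part of the published scripts; findTimingProblems is a hypothetical helper, and cues are again assumed to be [startMs, endMs] pairs in file order.

```javascript
// Report cues with a zero or negative display time and cues that
// overlap the following cue. Cue numbers in the report are 1-based.
function findTimingProblems(cues) {
  const problems = [];
  cues.forEach(([start, end], i) => {
    if (end <= start) {
      problems.push({ cue: i + 1, issue: 'zero or negative display time' });
    }
    const next = cues[i + 1];
    if (next && next[0] < end) {
      problems.push({ cue: i + 1, issue: 'overlaps the next cue' });
    }
  });
  return problems;
}

console.log(findTimingProblems([[1000, 2000], [1500, 3000], [4000, 3500]]));
// [ { cue: 1, issue: 'overlaps the next cue' },
//   { cue: 3, issue: 'zero or negative display time' } ]
```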

The code is thoroughly amateur, for which I apologize. Improve it to your taste as needed (for example, you could cap the extension time or devise a smarter algorithm that takes the length of each subtitle into account).
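
As an illustration of the first suggestion, a capped variant of the one-sided method might look like this. The function name extendOneWayCapped and the maxExtraMs parameter are hypothetical, and cues are assumed to be sorted [startMs, endMs] pairs.

```javascript
// One-sided extension, but never add more than maxExtraMs to a cue.
function extendOneWayCapped(cues, maxExtraMs) {
  return cues.map(([start, end], i) => {
    const next = cues[i + 1];
    if (!next) return [start, end]; // last cue: nothing to extend into
    // Free time before the next cue, keeping a 1 ms gap.
    const available = Math.max(0, next[0] - 1 - end);
    return [start, end + Math.min(available, maxExtraMs)];
  });
}

console.log(extendOneWayCapped([[1000, 2000], [10000, 11000], [11500, 12000]], 3000));
// [[1000, 5000], [10000, 11499], [11500, 12000]]
```

With a cap, a subtitle followed by a long silence no longer hangs on screen for the whole pause.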

JavaScript browser versions:

Browser subtitle extension (.srt): one way
Browser subtitle extension (.srt): both ways

Perl local versions:

Console subtitle extension (.srt): one way
Console subtitle extension (.srt): both ways

P.S. I have added a both-directions version for Node.js. It works the same way, with a few additions: it can read files in both UTF-8 and UTF-16 LE (pure ASCII files are read as UTF-8 to keep the code simple; there is no difference anyway), and it accepts several arguments, so multiple files can be processed in one batch.
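
How the Node.js script tells the encodings apart is not described here; one common approach, shown below purely as an assumption, is to look at the byte order mark (BOM) at the start of the file. The function name decodeSubtitleBuffer is hypothetical.

```javascript
// Decode a file's bytes as UTF-16 LE if it starts with the UTF-16 LE BOM
// (FF FE); otherwise treat it as UTF-8 (which also covers pure ASCII).
// Assumption: UTF-16 LE files carry a BOM; without one they would be misread.
function decodeSubtitleBuffer(buffer) {
  if (buffer.length >= 2 && buffer[0] === 0xff && buffer[1] === 0xfe) {
    return buffer.subarray(2).toString('utf16le');
  }
  // Strip the optional UTF-8 BOM (EF BB BF) so it does not end up in the text.
  if (buffer.length >= 3 && buffer[0] === 0xef && buffer[1] === 0xbb && buffer[2] === 0xbf) {
    return buffer.subarray(3).toString('utf8');
  }
  return buffer.toString('utf8');
}

// Usage sketch for batch processing (not executed here):
//   const fs = require('fs');
//   for (const file of process.argv.slice(2)) {
//     const text = decodeSubtitleBuffer(fs.readFileSync(file));
//     // ...extend the timings and write the result...
//   }
```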

Console subtitle extension (.srt): in both directions for Node.js

Source: https://habr.com/ru/post/178615/
