[ASCII art of the Hypnotoad, signed SSt, with the caption "All Glory to the HYPNO TOAD!"]
Found on the Internet.
Hello!
I want to share my little development: a typographer that can be used locally.
The project is under development and needs to be thoroughly tested.
Nested quotes: «„“» and “‘’” (in the English version). The number of levels is not limited; the typographer simply alternates even and odd levels, and the quote characters can be customized.
Primes after numbers: 4′, 20″.
(c) becomes ©, even if it is written in Cyrillic.
In a number of constructs (for example after 40) the ordinary space becomes non-breaking.
Ruble abbreviations (with a trailing dot and without) are replaced with the ruble symbol. I may cut this one out, since it removes the dot when the match is at the end of a sentence.
1/2, 1/3, etc. are replaced with the existing Unicode characters.
The contents of the tags (head|iframe|pre|code|script|style) are left untouched.
    from typus import ru_typus

    ru_typus('00" "11 \'22\' 11"? "11 \'22 "33 33?"\' 11" 00 "11 \'22\' 11" 0"')
    '00″ «11 „22“ 11»? «11 „22 «33 33?»“ 11» 00 «11 „22“ 11» 0″'
Each number shows the nesting level. If the first quote stood right before the zero rather than after it, it would open one more level; as written, it comes out as an inch mark.
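The alternation of odd and even levels can be sketched in a few lines. This is only an illustration, not the Typus implementation: the name `nest_quotes`, the rule for telling an opening quote from a closing one, and the handling of primes are all assumptions made for the example.

```python
# Hypothetical sketch, not the Typus code: alternate quote pairs by depth.
ODD = ('«', '»')    # levels 1, 3, 5, ...
EVEN = ('„', '“')   # levels 2, 4, 6, ...
PRIMES = {'"': '″', "'": '′'}


def nest_quotes(text, odd=ODD, even=EVEN):
    pairs = (odd, even)
    out = []
    depth = 0             # number of currently open quotes
    last_opening = False  # whether the previous quote was an opening one
    for i, ch in enumerate(text):
        if ch not in PRIMES:
            out.append(ch)
            continue
        prev = text[i - 1] if i else ' '
        if prev in PRIMES:
            # A quote glued to another quote keeps the same direction.
            opening = last_opening
        else:
            # Assumption: a quote opens after whitespace or a bracket.
            opening = prev.isspace() or prev in '([{'
        if opening:
            out.append(pairs[depth % 2][0])
            depth += 1
        elif depth:
            depth -= 1
            out.append(pairs[depth % 2][1])
        elif prev.isdigit():
            # Nothing is open and a digit precedes: treat it as a prime.
            out.append(PRIMES[ch])
        else:
            out.append(ch)
        last_opening = opening
    return ''.join(out)


print(nest_quotes('00" "11 \'22\' 11" 0"'))
# 00″ «11 „22“ 11» 0″
```

The sketch is deliberately naive: an apostrophe inside a word would be taken for a closing quote, which is exactly the kind of case a real typographer has to handle separately.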
    class BaseTypus(EnRuExpressions, TypusCore):
        processors = (EscapePhrases, EscapeHtml, TypoQuotes, Expressions)


    class RuTypus(RuQuotes, BaseTypus):
        pass


    ru_typus = RuTypus()
Typus consists of "processors" and "expressions."
Expressions are pairs (regex, replace) that are passed to re.sub(regex, replace, text) and executed sequentially (see just below). Almost everything the typographer does is done by "expressions". They are written as methods with the prefix expr_; the method should return a nested sequence, i.e. one "expression" can return a whole chain of pairs:
    class MyTypus(Typus):
        # Register the new expression by name (without the expr_ prefix)
        expressions = Typus.expressions + ('bar',)

        def expr_bar(self):
            expr = (
                (r'\d', '@'),  # replaces every digit with @
            )
            return expr
The third, optional, element of a pair is the flags passed to re.compile; by default, this is re.I | re.U | re.M | re.S. By the way, replace may be a function, see re.sub.
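To make the mechanics concrete, here is a minimal sketch of how such pairs could be collected, compiled and applied in order. It is not the Typus source: `MiniTypus`, `expr_digits` and `expr_cleanup` are hypothetical names, and only the default flags and the `expr_` prefix are taken from the text above.

```python
import re

# Default flags as stated in the article.
DEFAULT_FLAGS = re.I | re.U | re.M | re.S


class MiniTypus:
    # Names of the expr_* methods, in the order they are applied.
    expressions = ('digits', 'cleanup')

    def __init__(self):
        # Compile everything once, when the typographer is created.
        self.compiled = []
        for name in self.expressions:
            for expr in getattr(self, 'expr_' + name)():
                regex, replace = expr[0], expr[1]
                flags = expr[2] if len(expr) > 2 else DEFAULT_FLAGS
                self.compiled.append((re.compile(regex, flags), replace))

    def expr_digits(self):
        return (
            (r'\d', '@'),
        )

    def expr_cleanup(self):
        return (
            (r' {2,}', ' '),  # collapse runs of spaces
            # replace is a function here, and the flags drop re.I,
            # so only lowercase words are wrapped
            (r'[a-z]+', lambda m: '<' + m.group(0) + '>', re.U),
        )

    def __call__(self, text):
        for regex, replace in self.compiled:
            text = regex.sub(replace, text)
        return text


typus = MiniTypus()
print(typus('a1  B22 c'))  # <a>@ B@@ <c>
```

Because the pairs run strictly in sequence, an earlier expression can change what a later one sees, which is why the order of names in `expressions` matters.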
The order is determined by the typographer's expressions attribute, which stores a list of expression names. You can switch off the ones you do not need:
    from typus import RuTypus

    exclude_expressions = ('ruble', 'math')


    class MyTypus(RuTypus):
        expressions = (e for e in RuTypus.expressions
                       if e not in exclude_expressions)
expressions can be a generator, but if you make it a sequence, you can do this:

    def expr_bar(self):
        if 'some' in self.expressions:
            return baz
        return egg
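The reason a sequence is needed for such a check is plain Python behaviour: a membership test consumes a generator. A small sketch with made-up expression names (they are not the real Typus list):

```python
# Hypothetical expression names, for illustration only.
base = ('quotes', 'ruble', 'math', 'dashes')
exclude = ('ruble', 'math')

as_generator = (e for e in base if e not in exclude)
as_tuple = tuple(e for e in base if e not in exclude)

# A tuple can be tested for membership any number of times.
assert 'dashes' in as_tuple
assert 'dashes' in as_tuple

# A generator is consumed by the membership test itself.
assert 'dashes' in as_generator   # walks through 'quotes' and 'dashes'
assert list(as_generator) == []   # nothing is left to iterate over
```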
Out of the box there is only one expressions mixin, EnRuExpressions, but it does almost all the work.
For expressions to work, the Expressions processor is used.
Sometimes simple regexes are not enough and you have to build an uber-function. A processor is a class that acts as a function decorator: it is instantiated when the typographer is created and then called when text is processed. The typographer instance itself is passed to the processor, so that the processor can access its configuration.
When several processors are used, they decorate each other in order. For example, the HTML is hidden first, the inner processors do their work, and only then is the HTML put back.
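A minimal sketch of this decorating chain, assuming a processor is a class that receives the typographer in its constructor and wraps a function when called. The classes `Strip`, `Shout` and `MiniTypus` and the `suffix` setting are invented for the example and are not part of Typus.

```python
class BaseProcessor:
    def __init__(self, typus):
        # The typographer instance gives access to its configuration.
        self.typus = typus

    def __call__(self, func):
        raise NotImplementedError


class Strip(BaseProcessor):
    def __call__(self, func):
        def wrapper(text):
            self.typus.log.append('Strip: before')
            result = func(text.strip())
            self.typus.log.append('Strip: after')
            return result
        return wrapper


class Shout(BaseProcessor):
    def __call__(self, func):
        def wrapper(text):
            self.typus.log.append('Shout: before')
            result = func(text).upper() + self.typus.suffix
            self.typus.log.append('Shout: after')
            return result
        return wrapper


class MiniTypus:
    processors = (Strip, Shout)
    suffix = '!'  # configuration read by the Shout processor

    def __init__(self):
        self.log = []
        func = lambda text: text  # innermost function does nothing
        # Wrap from the last processor to the first,
        # so the first one ends up outermost.
        for processor_cls in reversed(self.processors):
            func = processor_cls(self)(func)
        self.process = func

    def __call__(self, text):
        return self.process(text)


typus = MiniTypus()
print(typus('  hi '))  # HI!
print(typus.log)
# ['Strip: before', 'Shout: before', 'Shout: after', 'Strip: after']
```

The log shows the nesting: the first processor in the tuple starts first and finishes last.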
Several processors ship with Typus: EscapePhrases, EscapeHtml, TypoQuotes, Expressions.
There are cases when a certain piece of text must not be processed, or you know in advance that the typographer will stumble at this place. In that case you can do this:
    typus('"http://bar 2""', escape_phrases=['2"'])
    '«http://bar 2"»'
Without this, the typographer would take it for the closing quote: «http://bar 2»". Another example:
    typus(' (c) (c)', escape_phrases=[' (c)'])
    ' (c) '
The escape_phrases argument can be fed from a separate field in your CRUD application (aka "admin"), where the content manager lists the phrases separated by a delimiter, and you pass them on to the typographer.
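One way to picture what happens to escaped phrases is the usual hide-and-restore trick: swap each phrase for a placeholder, run the typographer, then swap the phrases back. The sketch below assumes exactly that; `process` and `naive_quotes` are hypothetical helpers, and the real EscapePhrases processor may work differently.

```python
import re


def naive_quotes(text):
    # Toy stand-in for a typographer: wraps "..." in « ».
    return re.sub(r'"([^"]*)"', r'«\1»', text)


def process(text, escape_phrases=()):
    # Hide every phrase behind a placeholder the toy typographer ignores.
    tokens = {}
    for i, phrase in enumerate(escape_phrases):
        token = '\x00{}\x00'.format(i)
        tokens[token] = phrase
        text = text.replace(phrase, token)
    text = naive_quotes(text)
    # Put the original phrases back untouched.
    for token, phrase in tokens.items():
        text = text.replace(token, phrase)
    return text


print(process('"http://bar 2""'))                         # «http://bar 2»"
print(process('"http://bar 2""', escape_phrases=['2"']))  # «http://bar 2"»
```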
To split the text, you can use this utility:

    from typus.utils import splinter

    split = splinter(',')
    split('a, b,c ') == ['a', 'b', 'c']
    split('a, b\,c') == ['a', 'b,c']

splinter understands escaped delimiters and calls str.strip() on each phrase.
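For readers curious how such a splitter can be built, here is a rough equivalent using a negative lookbehind. `make_splitter` is a hypothetical name and this is not the code of typus.utils.splinter; dropping empty phrases is also an assumption of the sketch.

```python
import re


def make_splitter(delimiter):
    # Split only on delimiters that are not preceded by a backslash.
    pattern = re.compile(r'(?<!\\)' + re.escape(delimiter))
    escaped = '\\' + delimiter

    def split(text):
        phrases = (
            part.replace(escaped, delimiter).strip()
            for part in pattern.split(text)
        )
        # Assumption: empty phrases are of no use and are dropped.
        return [phrase for phrase in phrases if phrase]

    return split


split = make_splitter(',')
print(split('a, b,c '))    # ['a', 'b', 'c']
print(split('a, b\\,c'))   # ['a', 'b,c']
```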
EscapeHtml hides HTML tags from the typographer and puts them back afterwards. Without it, <img src="http://bar"> would turn into <img src=«http://bar»>.
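The same idea can be written as a decorator, which is closer to how processors are described above. This is a sketch under assumptions: the tag regex is deliberately simplistic, and `escape_html` and `naive_quotes` are made-up names rather than the Typus implementation.

```python
import re

TAG_RE = re.compile(r'<[^>]+>')          # simplistic: anything between < and >
TOKEN_RE = re.compile(r'\x00(\d+)\x00')  # placeholder left in place of a tag


def escape_html(func):
    def wrapper(text):
        saved = []

        def hide(match):
            saved.append(match.group(0))
            return '\x00{}\x00'.format(len(saved) - 1)

        hidden = TAG_RE.sub(hide, text)
        processed = func(hidden)
        # Restore the tags exactly as they were.
        return TOKEN_RE.sub(lambda m: saved[int(m.group(1))], processed)
    return wrapper


def naive_quotes(text):
    # Toy stand-in for a typographer: wraps "..." in « ».
    return re.sub(r'"([^"]*)"', r'«\1»', text)


safe_quotes = escape_html(naive_quotes)

print(naive_quotes('<img src="http://bar"> "hi"'))
# <img src=«http://bar»> «hi»
print(safe_quotes('<img src="http://bar"> "hi"'))
# <img src="http://bar"> «hi»
```

A placeholder built from digits only survives if the wrapped function leaves digits alone, so a real implementation needs a token that no expression can match.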
TypoQuotes places the quotes. It expects the typographer to define the attributes loq, roq, leq, req. Example:

    from typus import BaseTypus
    from typus.chars import LAQUO, RAQUO, DLQUO, LDQUO


    class MyTypus(BaseTypus):
        # left odd, right odd, left even, right even quotes
        loq, roq, leq, req = LAQUO, RAQUO, DLQUO, LDQUO
Ready-made EnQuotes and RuQuotes are available in the typus.mixins module.
Expressions makes the expressions work. During initialization of the typographer, all regexes are compiled and stored in the processor instance.
If you pass debug=True to the typographer, it replaces all non-breaking spaces with underscores, which can be useful for debugging:

    ru_typus('(c) me', debug=True)
    '©_me'
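If you want the same visibility for text that did not come from Typus, the replacement is a one-liner. `show_nbsp` is a hypothetical helper for the example, not a Typus function:

```python
NBSP = '\u00a0'  # non-breaking space


def show_nbsp(text, marker='_'):
    # Make non-breaking spaces visible without touching ordinary ones.
    return text.replace(NBSP, marker)


print(show_nbsp('\u00a9\u00a0me and you'))  # ©_me and you
```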
Important: the demo runs on a very simple virtual machine and is only meant to demonstrate the possibilities.
I will not save anything (honestly); you will find the source code of the site on my GitHub.
pip install -e git://github.com/byashimov/typus.git#egg=typus
Then:

    from typus import en_typus, ru_typus

    en_typus('"Beautiful is better than ugly." (c) Tim Peters.', debug=True)
    '“Beautiful is_better than ugly.” ©_Tim Peters.'  # _ for nbsp
    ru_typus('" , ." () .')
    '« , .» .'  # cyrillic '' in '()'
For now this article can be considered the documentation, until I make a clumsy translation into English.
    Name                  Stmts   Miss  Cover
    -----------------------------------------
    typus/__init__.py         8      0   100%
    typus/chars.py           18      0   100%
    typus/core.py            24      0   100%
    typus/mixins.py          77      0   100%
    typus/processors.py      99      0   100%
    typus/utils.py           30      0   100%
    -----------------------------------------
    TOTAL                   256      0   100%

    ________________ summary ________________
      py25: commands succeeded
      py26: commands succeeded
      py27: commands succeeded
      py33: commands succeeded
      py34: commands succeeded
      py35: commands succeeded
      congratulations :)
Travis CI, which I use, does not support 2.5, and I do not always check it by hand, so if you still use it (my condolences), run the tests after installation.
Typus does not convert characters such as & into HTML entities. At the moment it is not clear to me why one would do this: browsers, search engines and parsers cope with such text effortlessly, and I just do not want to burn CPU only to make the code unreadable. I would be glad to see a specific example.
Perhaps ru_typus will also cope with Ukrainian and Belarusian texts (and possibly others); if so, I will add that to the project description.
Looks like that's it.
P.S. Highlighting inline code on Habr is some kind of hell.
Source: https://habr.com/ru/post/303608/