Micro Editors

There are many useful editors that help authors prepare content for publication on the web. Unfortunately, the best is often the enemy of the good, and you end up using a not very convenient program instead of tailoring an editor of your own to the task at hand. My approach to this problem is to write my own editor. In this post I want to describe which features I implemented and how.

Contents

Autogenerated content

Take Microsoft Word, for example: it can add a table of contents or notes (references, endnotes, footnotes) to the text. These features are sometimes needed for web publishing too, although not every portal supports, for example, named anchors. Nevertheless, two features I managed to add to my editor are the automatic generation of a table of contents (TOC) and of a list of notes.

Table of contents

The table of contents is shown near the top of this post. Its purpose is to allow quick navigation through the sections of a post or article. Naturally, the list is generated automatically from the heading tags (H1 to H6) found in the document. The editor simply pulls them out, finds the topmost heading level (for Habr, for example, it is H3), and builds the corresponding lists from nested UL and LI elements. It is not very safe, but a simple expression like this is used to search for headings:

var r = new Regex( "<h([^<]+)>(.+)</h.>" ); 

The table of contents is useful mainly for long posts, for example when you write an article for CodeProject. Note that if the document has no headings, no table of contents is created. The heading of the table of contents itself, if it has one, always uses the top heading level found in the file.
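As a rough illustration of the heading scan described above, here is a minimal C++ sketch. It is not the editor's code: the names Heading, find_headings, and top_level are made up for this example, and the pattern is deliberately a little stricter than the one-liner shown earlier (only levels 1 to 6, the closing tag must match the opening level, lazy matching). With std::regex the dot does not match line breaks, so a heading that spans several lines is not found.

```cpp
#include <algorithm>
#include <regex>
#include <string>
#include <vector>

// Hypothetical type for this sketch; not taken from the editor.
struct Heading {
    int level;         // 1..6, taken from the tag name
    std::string text;  // inner HTML of the heading
};

// Collects headings in document order.
std::vector<Heading> find_headings(const std::string& html) {
    // <hN ...> ... </hN>, where the closing level must equal the opening one
    static const std::regex pattern(R"(<h([1-6])[^>]*>(.*?)</h\1>)",
                                    std::regex::icase);
    std::vector<Heading> found;
    for (std::sregex_iterator it(html.begin(), html.end(), pattern), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        found.push_back({std::stoi(m[1].str()), m[2].str()});
    }
    return found;
}

// Returns the topmost (numerically smallest) level, or 0 if there are no headings.
int top_level(const std::vector<Heading>& headings) {
    int top = 0;
    for (const Heading& h : headings)
        top = (top == 0) ? h.level : std::min(top, h.level);
    return top;
}
```

A result of 0 from top_level corresponds to the "no headings, no table of contents" case mentioned above.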

Implementing such a list was not an easy task. Here is the piece of code that shows how it is done:

private static string GenerateToc([NotNull] string text, ConversionOptions options, out int minLevel)
{
  List<HeadingEntry> entries = new List<HeadingEntry>();
  var r = new Regex("<h([^<]+)>(.+)</h.>");
  var matches = r.Matches(text);
  int count = 0;
  foreach (Match m in matches)
  {
    int n;
    if (int.TryParse(m.Groups[1].Value, out n))
    {
      HeadingEntry he = new HeadingEntry(n, m.Groups[2].Value);
      // set tag text if clash
      bool bFound = false;
      foreach (HeadingEntry h in entries)
        if (h.SuggestedTagText == he.SuggestedTagText)
          bFound = true;
      he.TagText = he.SuggestedTagText + (bFound ? (count++).ToString() : string.Empty);
      entries.Add(he);
      // replace only first occurrence
      //text = text.Replace(m.Groups[0].Value,
      //  string.Format("<h{0}><a name=\"{2}\"></a>{1}</h{0}>", n, he.Text, he.TagText));
      int idx = text.IndexOf(m.Groups[0].Value);
      string emptyLink = options.UseImageHeadings
        ? string.Empty
        : string.Format("<a name=\"{0}\"></a>", he.TagText);
      text = text.Remove(idx, m.Groups[0].Value.Length)
        .Insert(idx, string.Format("<h{0}>{2}{1}</h{0}>", n, he.Text, emptyLink));
    }
  }
  minLevel = int.MaxValue;
  if (entries.Count > 0)
  {
    var hb = new HtmlBuilder();
    // all are, essentially, ULs
    int lastLevel = -1;
    foreach (HeadingEntry he in entries)
    {
      // if this level > last, open a new UL
      if (he.Level > lastLevel)
        hb.AppendLine("<ul>").Indent();
      if (he.Level < lastLevel)
      {
        int diff = lastLevel - he.Level;
        while (--diff >= 0)
          hb.Unindent().AppendLine("</ul>");
      }
      hb.AppendLine(string.Format("<li><a href=\"#{0}\">{1}</a></li>", he.TagText, he.Text));
      minLevel = Math.Min(minLevel, he.Level);
      lastLevel = he.Level;
    }
    // close out all indent levels
    while (hb.IndentLevel > 0)
      hb.Unindent().AppendLine("</ul>");
    if (!string.IsNullOrEmpty(options.TocLabel))
      hb.Insert(string.Format("<h{0}>{1}</h{0}>{2}", minLevel, HttpUtility.HtmlEncode(options.TocLabel), Environment.NewLine), 0);
    // at this point, hb contains the TOC, so
    if (!string.IsNullOrEmpty(options.TocAfterToken))
    {
      int idx = text.IndexOf(options.TocAfterToken);
      if (idx >= 0)
        return text.Substring(0, idx + options.TocAfterToken.Length) + hb + text.Substring(idx + options.TocAfterToken.Length);
    }
    hb.Append(text);
    return hb.ToString();
  }
  return text;
}
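One detail of the nesting logic is worth a closer look: the listing above opens a single UL when the level increases, however large the jump, but closes one UL per level when it decreases. If a document skips a level (say H3 followed directly by H5 and then H3 again), the two counts appear not to match. The C++ sketch below shows one possible way to keep the lists balanced by remembering the levels of the lists that are currently open. It is an illustration under that assumption, not the editor's implementation; TocEntry and build_toc are hypothetical names, and the text is assumed to be already HTML-safe.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical entry type for this sketch.
struct TocEntry {
    int level;           // heading level, e.g. 3 for <h3>
    std::string anchor;  // name of the anchor placed inside the heading
    std::string text;    // heading text, assumed to be HTML-safe already
};

// Builds nested <ul> lists; every <ul> that is opened is also closed,
// even when the heading levels skip intermediate values.
std::string build_toc(const std::vector<TocEntry>& entries) {
    std::ostringstream out;
    std::vector<int> open_levels;  // levels of the currently open lists
    for (const TocEntry& e : entries) {
        // close every list that is deeper than this entry
        while (!open_levels.empty() && open_levels.back() > e.level) {
            out << "</ul>\n";
            open_levels.pop_back();
        }
        // open a list if this entry is deeper than the innermost open one
        if (open_levels.empty() || open_levels.back() < e.level) {
            out << "<ul>\n";
            open_levels.push_back(e.level);
        }
        out << "<li><a href=\"#" << e.anchor << "\">" << e.text << "</a></li>\n";
    }
    // close whatever is still open
    while (!open_levels.empty()) {
        out << "</ul>\n";
        open_levels.pop_back();
    }
    return out.str();
}
```

Like the original, the sketch emits a nested list as a sibling of the list items rather than inside an LI. Browsers render that fine, although strict HTML expects the nested UL to sit inside an LI.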

Notes

Notes, or references, are additional, minor thoughts that appear after the article itself. I prefer to write them inline in square brackets, [ ]. Each note is assigned a number [1] in the order in which it appears in the file, and then all notes are neatly grouped at the end of the document.

Unfortunately, notes cannot be pulled out with regular expressions, because plain square brackets can also occur, for example, inside a PRE section. So "catching" square brackets is realistic only while walking through the file character by character with a transformer. Yes, this is the same transformer that produces the nice dashes and quotes.
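A character-by-character pass of this kind might look like the C++ sketch below. It is only an illustration of the idea and rests on several simplifying assumptions that are not from the original: PRE sections are delimited by exactly "<pre>" and "</pre>" in lowercase without attributes, notes do not nest, and every bracketed run outside PRE is a note. NotesResult and extract_notes are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct NotesResult {
    std::string body;                // text with each note replaced by [n]
    std::vector<std::string> notes;  // note texts in order of appearance
};

// Walks the text one character at a time, skipping <pre> sections,
// and numbers bracketed notes in the order they are encountered.
NotesResult extract_notes(const std::string& text) {
    NotesResult result;
    bool in_pre = false;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text.compare(i, 5, "<pre>") == 0) {
            in_pre = true;
            result.body += "<pre>";
            i += 5;
            continue;
        }
        if (text.compare(i, 6, "</pre>") == 0) {
            in_pre = false;
            result.body += "</pre>";
            i += 6;
            continue;
        }
        if (!in_pre && text[i] == '[') {
            std::size_t close = text.find(']', i + 1);
            if (close != std::string::npos) {
                result.notes.push_back(text.substr(i + 1, close - i - 1));
                result.body += "[" + std::to_string(result.notes.size()) + "]";
                i = close + 1;
                continue;
            }
        }
        result.body += text[i++];  // ordinary character: copy as is
    }
    return result;
}
```

The collected notes can then be appended to the end of the document in numeric order.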

Text as pictures

Sometimes you need to show off a particular font, sometimes you want to write something that a search engine will not index, and sometimes you just want to protect your content from plagiarism. So being able to output text as graphics is useful, and being able to render it with ClearType and OpenType is even better. For this purpose I had to add an unmanaged part to my managed (WPF) application. It draws nice-looking text using Direct2D, Microsoft's new API for two-dimensional graphics. How does it work? I write, for example, [{Hello, World!}], and the system replaces it with an image of that text.

Once plain text was working, I added proper support for fonts, sizes, and OpenType features to make everything look even better.

Since I could now freely replace text with graphics, I automated the process so that I can, for example, replace all headings with graphics of my own making.

One valuable snippet that I finally wrote is a method that measures the useful area of a bitmap after text has been drawn on it. I used to do this in GDI+, but recently I lost patience and wrote it in C++:

MYAPI RECT MeasureCropArea(BYTE* src, int width, int height, int stride, int color)
{
  Pixel& bgr = *reinterpret_cast<Pixel*>(&color);
  // zero-initialize so that a bitmap with no foreground pixels gives a defined result
  RECT result = { 0, 0, 0, 0 };
  // find the first non-conforming row of pixels
  for (int y = 0; y < height; ++y)
  {
    int y_offset = y * stride;
    for (int x = 0; x < width; ++x)
    {
      int offset = x * sizeof(Pixel) + y_offset;
      Pixel& s = *reinterpret_cast<Pixel*>(src + offset);
      // if this pixel is non-conforming, so is the row
      if (s != bgr)
      {
        result.top = y;
        // cause soft eject
        x = width;
        y = height;
      }
    }
  }
  // find the last non-conforming row of pixels
  for (int y = height - 1; y >= 0; --y)
  {
    int y_offset = y * stride;
    for (int x = 0; x < width; ++x)
    {
      int offset = x * sizeof(Pixel) + y_offset;
      Pixel& s = *reinterpret_cast<Pixel*>(src + offset);
      // if this pixel is non-conforming, so is the row
      if (s != bgr)
      {
        result.bottom = y;
        // cause soft eject
        x = width;
        y = -1;
      }
    }
  }
  // find the first non-conforming column of pixels
  for (int x = 0; x < width; ++x)
  {
    for (int y = 0; y < height; ++y)
    {
      int offset = x * sizeof(Pixel) + y * stride;
      Pixel& s = *reinterpret_cast<Pixel*>(src + offset);
      // if this pixel is non-conforming, so is the column
      if (s != bgr)
      {
        result.left = x;
        // cause soft eject
        x = width;
        y = height;
      }
    }
  }
  // find the last non-conforming column of pixels
  for (int x = width - 1; x >= 0; --x)
  {
    for (int y = 0; y < height; ++y)
    {
      int offset = x * sizeof(Pixel) + y * stride;
      Pixel& s = *reinterpret_cast<Pixel*>(src + offset);
      // if this pixel is non-conforming, so is the column
      if (s != bgr)
      {
        result.right = x;
        // cause soft eject
        x = -1;
        y = height;
      }
    }
  }
  // fall back to a 1-pixel extent if nothing (or a degenerate area) was found
  int w = result.right - result.left;
  int h = result.bottom - result.top;
  if (w < 1) { result.left = 0; result.right = 1; }
  if (h < 1) { result.top = 0; result.bottom = 1; }
  return result;
}
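The same measurement can also be done in a single pass over the bitmap by tracking the minimum and maximum coordinates of every non-background pixel. The sketch below is a portable variant written for illustration only. It assumes 32-bit pixels and a stride given in bytes, and it uses a made-up CropBox struct instead of the Windows RECT; measure_crop_box is likewise a hypothetical name. It trades the early exits of the four-loop version for simpler code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical result type for this sketch (stands in for RECT).
struct CropBox {
    int left, top, right, bottom;  // inclusive bounds of the non-background area
    bool empty;                    // true when every pixel equals the background
};

// Single pass: grow the box around every pixel that differs from the background.
CropBox measure_crop_box(const std::uint8_t* src, int width, int height,
                         int stride, std::uint32_t background) {
    CropBox box{width, height, -1, -1, true};
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + static_cast<std::ptrdiff_t>(y) * stride;
        for (int x = 0; x < width; ++x) {
            std::uint32_t pixel;
            // memcpy avoids alignment and aliasing problems when reading the pixel
            std::memcpy(&pixel, row + static_cast<std::ptrdiff_t>(x) * 4, sizeof pixel);
            if (pixel == background) continue;
            box.left = std::min(box.left, x);
            box.top = std::min(box.top, y);
            box.right = std::max(box.right, x);
            box.bottom = std::max(box.bottom, y);
            box.empty = false;
        }
    }
    if (box.empty) box = CropBox{0, 0, 0, 0, true};
    return box;
}
```

As in the listing above, right and bottom are the coordinates of the last non-background column and row, so the cropped width is right - left + 1.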

Syntax highlighting

RANT: I'm sick of seeing articles on Habr where a piece of code is followed by the line "This code was highlighted with CodeSyntaxHighlighter". That is unacceptable.

I needed exactly the same functionality in my editor. In the end I used the same library as that highlighter (I do not remember its exact name), dug into its API a little, and now it works: you write {{ }} and the highlighting is applied. The code examples are right here in this post :)

Graphs

Graphs are my new hobby. Sometimes there is simply no other way to explain an idea. Again, generating a graph automatically is simple in principle, but in practice it led to a DSL that lets me write simple commands and get a drawn graph as output. From a spec like this:

edge basic advanced
edge advanced TOC`generation
edge advanced Text-as-image`generation
edge advanced Graph`generation
edge Heading`substitution
edge TOC`generation Heading`substitution
edge Heading`substitution Migration`from`FlowDocument`to`Direct2D
edge Text-as-image`generation Heading`substitution
edge Graph`generation Subpixel`hinting`(future)

the editor produces a picture of the corresponding graph.

This is of course a small DSL, and I implemented it through reflection. Each edge directive turns into a real call to the edge method. Here is the parser for these commands:

public void AddInstruction(string line)
{
  if (string.IsNullOrEmpty(line)) return;
  // protect spaces inside quoted strings by temporarily turning them into backticks
  foreach (Match m in Regex.Matches(line, "([\"'])(?:\\\\\\1|.)*?\\1"))
    line = line.Replace(m.Value, m.Value.Replace(' ', '`'));
  string[] parts = line.Split(' ').Select(p => p.Replace('`', ' ').Unquote()).ToArray();
  if (parts.Length < 1) return;
  // having acquired the parts, see if there's a matching method
  MethodInfo mi = null;
  var key = new KeyValuePair<string, int>(parts[0], parts.Length - 1);
  if (cachedMethodInfo.ContainsKey(key))
    mi = cachedMethodInfo[key];
  else
  {
    var methods = GetType().GetMethods().Where(m => m.Name.ToLower() == parts[0] && m.GetParameters().Length == (parts.Length - 1));
    if (methods.Any())
    {
      mi = methods.First();
      // cache under the key used for the lookup (the directive as typed);
      // caching under mi.Name would miss whenever the method name is capitalized
      cachedMethodInfo.Add(key, mi);
    }
  }
  if (mi != null)
  {
    var pars = mi.GetParameters();
    object[] ps = new object[0];
    if (pars.Length == parts.Length - 1)
    {
      // try building a parameter array
      ps = new object[pars.Length];
      for (int i = 0; i < pars.Length; i++)
      {
        var par = pars[i];
        var source = parts[i + 1];
        ps[i] = ConvertString(source, par.ParameterType);
      }
    }
    // parameters ready - call it
    try
    {
      mi.Invoke(this, ps);
    }
    catch (Exception ex)
    {
      // a failing directive is silently ignored
    }
  }
}

In the code above I tried to cache the expensive lookup of method information. There may be better solutions, but this one works for me. It also shows once again that you do not always need F# or the DLR to quickly get a small DSL interpreter.
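For comparison, here is how the same "split the line, then dispatch on name and argument count" idea might look in C++. C++ has no runtime reflection, so this sketch swaps the reflection lookup for an explicit table of handlers; that table plays the role of the cached method information. Everything here (Graph, handlers, run_instruction) is a hypothetical illustration, and it handles only the backtick-for-space convention, not quoted strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical graph model for this sketch: just a list of edges.
struct Graph {
    std::vector<std::pair<std::string, std::string>> edges;
};

using Args = std::vector<std::string>;
using Handler = std::function<void(Graph&, const Args&)>;
using Key = std::pair<std::string, std::size_t>;  // directive name + argument count

// Explicit dispatch table standing in for the reflection lookup.
const std::map<Key, Handler>& handlers() {
    static const std::map<Key, Handler> table = {
        {Key("edge", 2), [](Graph& g, const Args& a) {
             assert(a.size() == 2);  // guaranteed by the key's argument count
             g.edges.emplace_back(a[0], a[1]);
         }},
    };
    return table;
}

// Parses one line and runs the matching handler; returns false if none matches.
bool run_instruction(Graph& g, const std::string& line) {
    std::istringstream in(line);
    Args parts;
    for (std::string word; in >> word;) {
        // a backtick stands for a space inside a single argument
        std::replace(word.begin(), word.end(), '`', ' ');
        parts.push_back(word);
    }
    if (parts.empty()) return false;
    auto it = handlers().find(Key(parts[0], parts.size() - 1));
    if (it == handlers().end()) return false;
    it->second(g, Args(parts.begin() + 1, parts.end()));
    return true;
}
```

Feeding it the line edge advanced TOC`generation adds an edge from "advanced" to "TOC generation". A line with an unknown directive or the wrong number of arguments is reported through the return value instead of being silently dropped.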

That's all I wanted to tell you.

Notes

↑ This is footnote number 1

Source: https://habr.com/ru/post/72527/
