📜 ⬆️ ⬇️

Dummy about Dummies and one exciting journey into the depths of Excel. Long-awaited RegExp in tables

Prologue, just prologue!


Of course, my article is not as cool as the story about game dev on VBA in ARIMI, but there are source codes located right at the end of the article under the spoiler.



Kryptonite for a teapot.


Meet me a kettle! Actually, I'm really not very good at programming, but I feel particularly uncomfortable trying to create “something” in a language I don't know. In my story "Visual Basic for Applications" there is that language, which is unusual for me. In fact, all this lyrical digression, nothing more than an attempt to understate your expectations.
')
My story begins as a typical story of an office worker. The only distinguishing feature is that I don’t have an office and I work at home (hell, no one is surprised by this). But to everyone's happiness, I am as grumpy as the majority.

Monday morning as always was disgusting. Dissolving haze and barely noticeable breaking rain - an unpleasant sight. The desire to work disappeared on Sunday. As always, there is a lot of work, nothing is glued, and here also the Excel base of 300k lines fell into my hands. Well, not something that fell, finished parsing. And as you understand, a structured, formatted and homogeneous base didn’t smell there, otherwise how could the article come from. So, at the exit, I received a huge amount of heterogeneous information, from which it was necessary to select the necessary information that I needed to work, cutting off everything else.

Kettle Exposure


Looking at that horror and chaos, I spent some time in a state of procrastination, mindlessly multiplying entropy. By the way, I know excel quite badly, well, in a sense, I can fill the table with values ​​- and format the sheet too. Even able to remove duplicates (even in 2007), but no more. But, having come to himself a little and realizing that nothing was formed by itself, he began to look for a solution.

Like any other sensible person, I turned for help to a special forum with a party of accountants and excel geeks. I will not advertise the forums in which I asked for help, I will only note that out of 5 popular ones, only two agreed to help me. Guys if you read this - thank you so much!

In general, leaving my question, I began to look for a solution on my own.

Outline teapot


My tasks were very typical and simple. The table contained the most necessary information: Name (along with the type of registration of the company), Phone (mobile and urban mixed), Address (rare dreb), Company website (here everything is surprisingly ok), email (unimportant info, but still for checkmarks - grabanul) and Branch / Rubric. I needed to bring all this into proper form, for subsequent export to crm.

Waiting for the long-awaited response, I began to put it into practice. The bottom line is even worse. Of course, I am exaggerating, but nevertheless that formula (by the way, hell, not formula), which was provided to me by one of the responsive forum users, turned out to be not entirely viable. Of course, it performed its function, but it was all very dreary, long and terrible. Realizing that formulas are not an option, I decided to write a macro.

And ... Created another tearful topic on the above forum (black belt on forum-up). In the old manner, I waited for an answer, and this time some not indifferent user wrote me a fairly tolerable macro suitable for my needs. He was able to separate phones, as well as to separate the type of enterprise from the name (just to separate the columns by the last comma).

The principle of its operation can not be simple. On the basis of the dictionary, all extra characters were removed from the column with numbers using Replace: dashes, parentheses and commas. Then the program went through the values ​​and separated those in which there is a designated operator code. As a result, the Numbers * of course * were separated, but the formatting was broken and without that was not particularly aesthetic.

And figs with him, but there was some understatement, incompleteness.

Kettle razor


Sitting in the evening in the chair and finishing his glass of bourbon ... Yeah, he sat and sawed another hack taken in freelance. To my shame, I don’t remember what the essence was there (“More monotonous work is needed” (C) War3). But the most important thing is that, applying RegExp to the e-mail verification field, it suddenly struck me, but what if ... Yes, after so many useless words, I finally came to the most important thing. And what if you use regular expressions to juggle values ​​within a table?



Scrolling through several foreign sites, nothing sensible was found. Of course, there were macros that add functionality to form regeex, but this is not it. Taking the formula macro code as a basis, I modified the source script and, surprisingly, it worked. Eureka, I shouted and began to finish drinking bourbon .

Triumph kettle


The victory dance was interrupted by an error in the execution of my macro version 2.0. By the way, I released major updates as often as turning around, to check if there is anyone behind my back. Paranoia, you know.
What am I talking about? Oh yeah, it later turned out that the essence of the error was in the random variable in quotes and the wrong number of submatches. Having fixed this defect, I began to finish the functionality specifically for my task. After 2 hours I finished. The final macro successfully formatted and separated the phone by operator code (differentiating by code as mobile, and the rest as urban ones), separated the company name and type of registration, and also separated the company’s city from the address and entered it into a separate column.

Nothing special, you say? And you will be absolutely right. Most people who can work with excel at the “advanced user” level would do the same operation in 15 minutes using formulas and basic formatting functions. But I was happy and decided to share this happiness with others.



The result, just the result!


At the bottom of the article I posted a detailed form code that allows other less advanced users, like me, to apply their own regular expressions to the data contained in the table. In fact, my tweak (not exactly my course) can do the following:



Separate - separates the values ​​for a given mask into a new column, leaving the original column intact.
Important note: due to lack of experience and knowledge, I did not manage to force the macro to create a new empty column to the right of the one being edited. This means that the adjacent right column must be empty, otherwise the output \ final data will replace the original content.

Delete - deletes values ​​by the specified mask. It is noteworthy that I managed to get the macro to work in such a way that, provided the number of capture groups is more than 0 (from 1 to 3), only the surrounding information is deleted, and not the value that matches the mask in brackets.

Split - splits the value into two groups. Values ​​suitable for RegExp are put into the new column, everything that does not match remains in the old one. It also means that the adjacent right column must be empty.

TRUE / FALSE - an additional tool for interacting with formulas. When a mask matches, the values ​​in the adjacent column are TRUE; otherwise, FALSE.

Wishlist or additional options

The number of capture groups - the number of masks in brackets. Example: \ d {3} \ s? (. *?), \ S (. *?) \. - there are two of them.

Select column - the name speaks for itself. Select editable column.

Replace with \ add - In case of deletion by mask, replaces the deleted values ​​with the text entered by the user. In the case of separation / separation - adds to the end of each value or group, the entered characters.

The result on a new sheet is the result on a new sheet.

Winner epilogue


Forgive me for some pretentiousness and unusual habrastil. Because of my innate inability to beautifully expose my thoughts to the text, as well as create an action, explaining the code, I decided that the article itself would describe the case better, and place well-commented code at the bottom of the article.

200 lines of terrible code
Private Function regexpmulti(S As Variant) '  '  Dim Sl As String, bRes As Boolean, RegExp As Object, oMatches As Object, n As Integer, P As String, Text As String, TextReplace As String Sl = "" ReDim x(1) As String bRes = False Set RegExp = CreateObject("VBScript.RegExp") '  RegExp.Global = True RegExp.IgnoreCase = False '    RegExp.Pattern = TextBox1.Text ' TextBox1.Text -    On Error Resume Next '         bRes = RegExp.test(S) If ComboBox1.Text = "" Then '  ,    p    P = 0 Else P = ComboBox1.Text End If If OptionButton2 Then '    " /" Text = "" TextReplace = TextBox2.Text Else Text = TextBox2.Text End If If bRes Then Set oMatches = RegExp.Execute(S) '  '       If P = 0 Or P > 5 Then '     Sl Sl = oMatches(0) & Text '     For n = 1 To oMatches.Count - 1 Sl = Sl & oMatches(n) & Text '    Next ElseIf P = 1 Then Sl = oMatches(0).subMatches(0) & Text For n = 1 To oMatches.Count - 1 Sl = Sl & oMatches(n).subMatches(0) & Text Next ElseIf P = 2 Then Sl = oMatches(0).subMatches(0) & oMatches(0).subMatches(1) & Text For n = 1 To oMatches.Count - 1 Sl = Sl & oMatches(n).subMatches(0) & oMatches(n).subMatches(1) & Text Next ElseIf P = 3 Then Sl = oMatches(0).subMatches(0) & oMatches(0).subMatches(1) & oMatches(0).subMatches(2) & Text For n = 1 To oMatches.Count - 1 Sl = Sl & oMatches(n).subMatches(0) & oMatches(n).subMatches(1) & oMatches(n).subMatches(2) & Text Next End If If OptionButton2 Then '      S = RegExp.Replace(S, TextReplace) ' TextReplace -    End If End If Set RegExp = Nothing '   If OptionButton4 Then '   "/" '  If Sl <> x(0) Then x(0) = Sl: x(1) = "" regexpmulti = x Else x(0) = Sl: x(1) = "" regexpmulti = x End If Else x(0) = Sl: x(1) = S regexpmulti = x End If End Function Private Sub ComboBox1_DropButtonClick() ComboBox1.List = Array("0", "1", "2", "3") '     ComboBox1.style = fmStyleDropDownList '     End Sub Private Sub ComboBox2_DropButtonClick() ComboBox2.List = Array("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z") ComboBox2.style = fmStyleDropDownList End Sub Private Sub CommandButton1_Click() '     On Error GoTo Error ' .       . '  Dim R, c Dim M(), RZ(), U(), S, P As String If ComboBox2.Text = "" Then '      S = "A" '     ,       A Else S = ComboBox2.Text End If P = S & "1" & ":" & S '    "A1:A" M = ActiveSheet.Range(P & ActiveSheet.Range(S & Rows.Count).End(xlUp).Row).Value ' RZ -  ,    ,      . ReDim RZ(1 To UBound(M), 1 To UBound(M, 2) + 1) '    For R = 1 To UBound(M) If OptionButton3 Or OptionButton4 Then RZ(R, 1) = M(R, 1) '       End If c = regexpmulti(M(R, 1)) '    .  "M(R, 1)" -       If OptionButton2 Then '      -          If ComboBox1.Text > 0 Then RZ(R, 1) = c(0) & c(1) Else RZ(R, 1) = c(1) End If End If If OptionButton3 Then RZ(R, 2) = c(0) End If If OptionButton1 Then RZ(R, 1) = c(1) RZ(R, 2) = c(0) End If If OptionButton4 Then RZ(R, 2) = c(1) End If Next R If CheckBox1.Value = True Then '   ,      Worksheets.Add '    Range("A1").Resize(UBound(RZ), UBound(RZ, 2)) = RZ '     Else Range(S & "1").Resize(UBound(RZ), UBound(RZ, 2)) = RZ End If ' \ \ Cells.Columns.AutoFit Cells.Rows.AutoFit GoTo Skip '      Error: '      MsgBox "       .   -  ", vbCritical, ",    =(" GoTo 111 '      (Unload Me)   ,      Skip: Unload Me 111: End Sub Private Sub Label2_Click() End Sub Private Sub OptionButton1_Click() Label2.ForeColor = &H80000012 '   Label* Label3.ForeColor = &H80000012 Label3.Caption = ":" '  Label* Label2.ControlTipText = "  -     .  - '\s?\d{3}( )\s'.        ,  - ." Label3.ControlTipText = "      ." '  ComboBox1.Enabled = True ComboBox1.BackColor = vbWindowBackground TextBox2.Enabled = True TextBox2.BackColor = vbWindowBackground End Sub Private Sub OptionButton2_Click() Label2.ForeColor = &H80000012 Label3.ForeColor = &HFF& Label3.Caption = " :" Label2.ControlTipText = "  -     .  - '\s?\d{3}( )\s'.       0 -    ,   ." Label3.ControlTipText = "  ,   ." ComboBox1.Enabled = True ComboBox1.BackColor = vbWindowBackground TextBox2.Enabled = True TextBox2.BackColor = vbWindowBackground End Sub Private Sub OptionButton3_Click() Label3.ForeColor = &H80000012 Label2.ForeColor = &H80000012 Label3.Caption = ":" Label2.ControlTipText = "  -     .  - '\s?\d{3}( )\s'.        ,  - ." Label3.ControlTipText = "      ." ComboBox1.Enabled = True ComboBox1.BackColor = vbWindowBackground End Sub Private Sub OptionButton4_Click() Label3.ForeColor = &H80000000 Label2.ForeColor = &H80000000 Label2.ControlTipText = "   -  ." Label3.ControlTipText = "   -  ." ComboBox1.Enabled = False '  ComboBox ComboBox1.BackColor = vbButtonFace '   TextBox2.Enabled = False TextBox2.BackColor = vbButtonFace End Sub 


Link to working scripts : download .



Thanks for attention!

Source: https://habr.com/ru/post/239369/


All Articles