The more one-liners I write in the shell, the more I arrive at two important realizations:
- It is a very powerful tool for "direct programming", that is, for telling the computer what to do.
- Most of a one-liner is devoted to grep / awk / cut / tr, which somehow pick the needed fields out of the previous utilities' output and knock it into human-readable form.
While the pipe model itself is wonderful, the thoroughly dirty hacks for capturing the needed fields in the output mentioned in the second point ("and here we can pick out the field we need by its telltale comma using awk -F, '{print $2}' ...") make the whole procedure a dubious pleasure, and certainly an unreadable one.
Another serious problem: even though the shell offers quite a few idioms borrowed from functional programming, it has no idiom for filtering a list by the result of running an external program. That is, a list can be iterated over, but keeping only those elements for which some program returned "success" is impossible.
Meanwhile, there exists a hostile and not particularly well-made environment: PowerShell (Windows). It took a good idea (pipes carry objects rather than text) but spoiled it with two things:
- The unergonomic Windows console (where is my Shift-PgUp, eh? Ctrl-PgUp in newer versions, they say), plus the suggestion to go and learn .NET in order to work with object methods properly.
- Its absence on most operating systems.
I want objects in the pipe in a warm, cozy Linux shell. With hand-candy (little typing), eye-candy (pleasant to look at), and overall ergonomics of the process. I also want to be able to combine the "new approach" with the old one, that is, with the usual text pipe.
Idea
We need to write a set of tools that make it possible to operate on structured data in pipe style. The obvious choice is JSON (not XML).
We need:
- Utilities that accept common input formats and convert them to JSON.
- Utilities that make it possible to manipulate JSON within the pipe.
- Utilities that render JSON back into a "normal" format.
That way a person would not see JSON on the screen, yet would still be able to work with it.
A teaser
(For clarity I will write out long utility names; in real life these would be short abbreviations, that is, not json-get-object but something like jgo or jg.)
Show only the files for which file managed to determine the type:
ls -la | ls2json | json-filter 'filename' --exec 'file {} >/dev/null' | json-print
Download an authorization token from a certain site, pick it out of the JSON and put it into an environment variable; then download a list, filter the author field against a regexp, and download all the matching URLs:
curl mysite/api.json | env `json-get-to-env X-AUTH-TOKEN`;curl -H X-AUTH-TOKEN $X-AUTH-TOKEN mysite/api/list.json | json-filter --field 'author' --rmatch 'R.{1,2}dal\d*' | json-get --field 'url' | xargs wget
Parse the output of find -ls, sort by the size field, slice elements 10 through 20 out of the array, and print them as CSV:
find . -ls | ls2json | json-sort --field 'size' | json-slice [10:20] | json2csv
Terminology
inputs
Their main task is to make JSON candy out of messy output. Importantly, there must be an option for handling malformed input: a) ignore it, b) stop the pipe with an error.
Examples:
Generic:
- line2json - converts ordinary output into an array of strings, one string per line.
- words2json - the same, but split by "words".
- csv2json - converts CSV to an object, allowing a specified field to be used as the key.
- lineparse2json - converts a line into an object by splitting it on the specified characters. Reminiscent of awk -F: '{print $1, $2}' (a sketch follows this list).
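None of these utilities exist yet, so here is a minimal sketch of what lineparse2json could look like in Python (the author's language of choice, per the Implementation section below). Every flag name here (-F, --keys, --on-garbage) is my own invention for illustration, not a settled interface:

    #!/usr/bin/env python3
    # Hypothetical lineparse2json: split each stdin line on a separator
    # (like awk -F) and emit a JSON array of objects.
    import argparse
    import json
    import sys

    def main():
        p = argparse.ArgumentParser()
        p.add_argument('-F', '--separator', default=None,
                       help='field separator; default is any whitespace')
        p.add_argument('--keys', help='comma-separated field names')
        p.add_argument('--on-garbage', choices=['ignore', 'stop'],
                       default='stop', help='policy for lines with too few fields')
        args = p.parse_args()
        keys = args.keys.split(',') if args.keys else None
        result = []
        for line in sys.stdin:
            fields = line.rstrip('\n').split(args.separator)
            if keys and len(fields) < len(keys):
                if args.on_garbage == 'ignore':
                    continue  # policy (a): silently skip malformed lines
                sys.exit('lineparse2json: malformed line: %r' % line)
            names = keys or [str(i) for i in range(len(fields))]
            result.append(dict(zip(names, fields)))
        json.dump(result, sys.stdout)
        sys.stdout.write('\n')

    if __name__ == '__main__':
        main()

Something like lineparse2json -F: --keys login,pass,uid,gid,gecos,home,shell < /etc/passwd would then yield an array of user objects ready for json-filter or json-sort.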
App-specific:
- ls2json - (optionally either runs ls itself or takes ls output) structures it as an array of objects, where each object is a file with a bunch of fields, perhaps even more than ls itself can show (regular and extended lsattr attributes, full inode information, creation dates, etc.); a sketch follows this list.
- ps2json - likewise, for the process list.
- lsof2json - a list of objects describing the applications using a file.
- openfiles2json - a list of the fds opened by an application (/proc/PID/fd), with built-in filtering such as "regular files only" or "ignore /dev/null". Objects for network sockets come with all their information attached: ports and IPs.
- iptables2json - displays the current iptables settings as JSON.
As was suggested to me privately, mysql-json fits this idea perfectly. Streaming query output straight out of SQL? Easily.
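Returning to ls2json: a minimal sketch of the "runs ls itself" mode, which stats directory entries directly instead of parsing ls output; that is precisely how it could report more than ls shows. The field names below are assumptions, not a fixed schema:

    #!/usr/bin/env python3
    # Hypothetical ls2json in the "runs ls itself" mode: stat each entry
    # of a directory and emit an array of file objects.
    import json
    import os
    import stat
    import sys

    def file_object(path):
        st = os.lstat(path)  # lstat: describe symlinks themselves
        return {
            'filename': os.path.basename(path),
            'size': st.st_size,
            'mode': stat.filemode(st.st_mode),
            'uid': st.st_uid,
            'gid': st.st_gid,
            'mtime': st.st_mtime,
            'inode': st.st_ino,
            'nlink': st.st_nlink,
        }

    def main():
        directory = sys.argv[1] if len(sys.argv) > 1 else '.'
        names = sorted(os.listdir(directory))
        objs = [file_object(os.path.join(directory, n)) for n in names]
        json.dump(objs, sys.stdout)
        sys.stdout.write('\n')

    if __name__ == '__main__':
        main()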
File-specific:
These read a file and output it as JSON (a minimal ini2json is sketched after the list):
- syslog2json
- ini2json
- conf.d2json
- sysv2json, upstart2json
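Of these, ini2json is almost free in Python, since configparser already does the parsing; a minimal sketch:

    #!/usr/bin/env python3
    # Hypothetical ini2json: read an INI file and print it as a JSON
    # object of sections, each section an object of key/value pairs.
    import configparser
    import json
    import sys

    def main():
        cp = configparser.ConfigParser()
        cp.read(sys.argv[1])
        sections = {name: dict(cp[name]) for name in cp.sections()}
        json.dump(sections, sys.stdout, indent=2)
        sys.stdout.write('\n')

    if __name__ == '__main__':
        main()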
native JSON transforms
The most delicious part is native manipulation of JSON. As above, these should have options for handling non-JSON input: "ignore" / "stop".
- json-filter - filters objects / arrays by the specified criteria (sketched below, after this list).
- json-join - merges two JSON documents into one by the specified method.
- json-sort - sorts an array of objects by the specified field
- json-slice - cuts a slice out of an array
- json-arr-get - returns an element from an array
- json-obj-get - returns the specified field / fields of the specified object
- json-obj-add - adds an object
- json-obj-del - deletes an object
- json-obj-to-arr - displays keys or a specified field of objects as an array
- json-arr-to-obj - turns an array into an object, forming the key from a given attribute.
- json-uniq - removes duplicate elements from an array (or prints only the duplicates)
(add to taste and needs)
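A sketch of json-filter in the --field/--rmatch form used in the teaser above, including the ignore/stop policy for non-JSON input (the --on-garbage flag name is my own invention):

    #!/usr/bin/env python3
    # Hypothetical json-filter: keep only the objects of an array whose
    # given field matches a regexp.
    import argparse
    import json
    import re
    import sys

    def main():
        p = argparse.ArgumentParser()
        p.add_argument('--field', required=True)
        p.add_argument('--rmatch', required=True)
        p.add_argument('--on-garbage', choices=['ignore', 'stop'],
                       default='stop')
        args = p.parse_args()
        try:
            data = json.load(sys.stdin)
        except ValueError:
            if args.on_garbage == 'ignore':
                data = []  # policy (a): treat garbage as an empty list
            else:
                sys.exit('json-filter: stdin is not valid JSON')
        rx = re.compile(args.rmatch)
        kept = [o for o in data
                if isinstance(o, dict) and rx.search(str(o.get(args.field, '')))]
        json.dump(kept, sys.stdout)
        sys.stdout.write('\n')

    if __name__ == '__main__':
        main()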
outputs
These give JSON a human-readable form:
- json2fullpath - turns JSON into string notation of the form key1.key2[3].key4="foobar" (sketched after this list)
- json2csv
- json2lines - prints an array one element per line; if the elements are objects, their values are separated by spaces within the line.
- json2keys - displays object keys
- json2values - displays only object values
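json2fullpath is a good illustration of how cheap the outputs are to build: a dozen lines of recursion. A sketch, with the notation taken from the list above:

    #!/usr/bin/env python3
    # Hypothetical json2fullpath: flatten JSON into lines of the form
    # key1.key2[3].key4="foobar".
    import json
    import sys

    def walk(node, path=''):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, '%s.%s' % (path, key) if path else key)
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, '%s[%d]' % (path, index))
        else:
            print('%s=%s' % (path, json.dumps(node)))

    walk(json.load(sys.stdin))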
iterators
Essentially an xargs extension for JSON:
- json-keys-iterate - runs the specified commands for each key
- json-values-iterate - runs the specified commands for each value
- json-iterate - runs the specified commands for each element
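A sketch of the last of the three, json-iterate: run a command once per array element, substituting the element for {} the way xargs -I (and the --exec teaser above) does; string elements are passed as-is, anything else as serialized JSON:

    #!/usr/bin/env python3
    # Hypothetical json-iterate: run the given command for each element
    # of a JSON array on stdin, replacing {} with the element.
    import json
    import subprocess
    import sys

    def main():
        template = sys.argv[1:]
        for item in json.load(sys.stdin):
            value = item if isinstance(item, str) else json.dumps(item)
            cmd = [value if arg == '{}' else arg for arg in template]
            subprocess.call(cmd)

    if __name__ == '__main__':
        main()

For example, json-iterate echo {} would simply print each element, one command invocation apiece.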
Difficulties
Of course, such tools cannot solve the problem of processing arbitrary JSON: it may turn out to be too "unstructured". But firstly, the input utilities produce JSON of the same predictable shape, and secondly, processing JSON is still more predictable than processing "the elements here seem to be separated by spaces" in the existing shell.
Implementation
I would have written it myself, but part of what is needed I simply do not know, and for the rest I do not have enough time. I am not a programmer. The secret idea of this article is that "someone will write it for me", but if no one does, it will at least stand as a position piece motivating me to finish my education and do it myself.
If someone is ready to take this on, I will be extremely grateful. If not, I will uncase my lousy Python; ideas and suggestions are welcome.
UPDATE: People seem to have stirred a little. Your commits will be welcome here:
github.com/amarao/json4shell. When it will become usable, I do not know yet. Whether I have enough gunpowder, I do not know either.