The compression command performs the following tasks:
- Removing the \n and \t characters:
tr -d '\n\t'
- Removing extra spaces:
tr -s ' ' or sed 's/[ ]\+/ /g'
- Removing comments:
sed 's:/\*.*\*/::g'
: is used as a sed delimiter to avoid the need to escape / since we need to use /* and */.
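In sed, * is escaped as \* so that it matches a literal asterisk.
.* matches all the text in between /* and */. Because .* is greedy and the newlines have already been removed, a file with several comments loses everything from the first /* to the last */, including any code in between.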
- Removing all the spaces preceding and following the {, }, (, ), ;, :, and , characters:
sed 's/ \?\([{}();,:]\) \?/\1/g'
The preceding sed statement works like this:
- / \?\([{}();,:]\) \?/ in the sed code is the match part, and /\1/g is the replacement part.
- \([{}();,:]\) matches any one character in the [ { } ( ) ; , : ] set (spaces inserted for readability). \( and \) are group operators that memorize the match so that it can be back-referenced in the replacement part. ( and ) are escaped to give them their special meaning as group operators. A space followed by \? (an optional space) precedes and follows the group to match the space character that may precede or follow any of the characters in the set.
- In the replacement part, the matched string (that is, the combination of an optional space, a character from the set, and again an optional space) is replaced with the character matched. The \1 symbol is a back reference to the first group, that is, the character memorized by the \( and \) group operators.
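The four steps above can be chained into one pipeline. The following is a minimal sketch rather than the exact command from the recipe: the function name compress_js and the sample input are made up for illustration, and GNU sed is assumed, because \? and \+ are GNU extensions to basic regular expressions.

```shell
#!/bin/bash
# Hypothetical helper: reads JavaScript on stdin, writes the compressed form to stdout.
# Assumes GNU sed (\? is a GNU extension).
compress_js() {
    tr -d '\n\t' |                          # remove newlines and tabs
    tr -s ' ' |                             # squeeze runs of spaces into one
    sed 's:/\*.*\*/::g' |                   # remove /* ... */ (greedy match)
    sed 's/ \?\([{}();,:]\) \?/\1/g'        # drop spaces around { } ( ) ; , :
}

# Illustrative input with a single comment
printf 'function sign( out ) {\n\t/* greet */\n\tvar x = 1 ;\n}\n' | compress_js
# prints: function sign(out){var x = 1;}
```

Spaces around = are kept because = is not in the character set. The sample contains only one comment on purpose, since the greedy .* would otherwise consume the code between the first and the last comment.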
The decompression command works as follows:
- s/;/;\n/g replaces ; with ;\n
- s/{/{\n\n/g replaces { with {\n\n
- s/}/\n\n}/g replaces } with \n\n}
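These three substitutions can be tried on a short compressed string. The sketch below assumes they are chained in a single sed invocation and that GNU sed is used, since it interprets \n in the replacement part as a newline. The function name decompress_js is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: reads compressed JavaScript on stdin, writes a more readable form.
# Assumes GNU sed, which turns \n in the replacement into a newline.
decompress_js() {
    sed 's/;/;\n/g; s/{/{\n\n/g; s/}/\n\n}/g'
}

# Illustrative input: one compressed statement block
printf 'function sign(out){var x = 1;}\n' | decompress_js
```

The result restores line breaks after ; and around braces, but it does not bring back the original indentation or the removed comments.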