The awk language includes many built-in string manipulation functions:
- length(string): This returns the string length.
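- index(string, search_string): This returns the position (counting from 1) at which search_string first occurs in the string, or 0 if it does not occur.
- split(string, array, delimiter): This populates an array with the strings created by splitting a string on the delimiter, and returns the number of pieces. The delimiter may be a single character or a regular expression.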
- substr(string, start-position, length): This returns the substring that begins at start-position (counting from 1) and is at most length characters long. If length is omitted, the substring runs to the end of the string.
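- sub(regex, replacement_str, string): This replaces the first regular expression match in the string with replacement_str. The string is modified in place, and the function returns the number of replacements made (0 or 1).
- gsub(regex, replacement_str, string): This is like sub(), but it replaces every regular expression match and returns the number of replacements made.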
- match(string, regex): This returns the position at which the regular expression (regex) first matches in the string, or zero if no match is found. Two special variables are associated with match(): RSTART and RLENGTH. The RSTART variable contains the position at which the regular expression match starts (zero if there is no match). The RLENGTH variable contains the length of the string matched by the regular expression (-1 if there is no match).
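
The functions above can be tried out in a single BEGIN block, which runs without reading any input. The following is a minimal sketch, not a definitive reference: the wrapper name demo_awk_strings and all sample strings are made up for illustration, and the output assumes a POSIX-compatible awk such as gawk or mawk.

```shell
#!/bin/sh
# demo_awk_strings is a hypothetical helper name; it only groups the calls.
demo_awk_strings() {
  awk 'BEGIN {
    text = "hello world"
    print length(text)              # number of characters
    print index(text, "wor")        # 1-based position, 0 if absent
    n = split("a:b:c", parts, ":")  # n is the number of pieces
    print n, parts[2]
    print substr(text, 7, 5)        # 5 characters starting at position 7
    s = "foo bar foo"
    sub(/foo/, "X", s)              # first match only; s is modified in place
    print s
    s = "foo bar foo"
    n = gsub(/foo/, "X", s)         # every match; n is the replacement count
    print n, s
    if (match("id=12345;", /[0-9]+/))
      print RSTART, RLENGTH         # start and length of the match
  }'
}

demo_awk_strings
```

With these sample strings, the script should print 11, 7, "3 b", world, "X bar foo", "2 X bar X", and "4 5", one per line. Note that sub() and gsub() change the variable passed as their third argument rather than returning the new string, which is why s is reset between the two calls.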