We will implement a small tool that searches files for user-provided text patterns. It works similarly to the UNIX tool grep but, for the sake of simplicity, is not as mature or powerful.
- First, we need to include all the necessary headers and declare that we use the std and filesystem namespaces.
#include <iostream>
#include <fstream>
#include <regex>
#include <vector>
#include <string>
#include <filesystem>
using namespace std;
using namespace filesystem;
- We implement a helper function first. It accepts a file path and a regular expression object that describes the pattern we are looking for. We instantiate a vector that will hold pairs of matching line numbers and their content, and an input file stream object from which we read and pattern-match the content, line by line.
static vector<pair<size_t, string>>
matches(const path &p, const regex &re)
{
    vector<pair<size_t, string>> d;
    ifstream is {p.c_str()};
- We traverse the file line by line using the getline function. regex_search returns true if the string contains our pattern. If this is the case, then we put the line number and the string into the vector. Finally, we return all collected matches.
    string s;
    for (size_t line {1}; getline(is, s); ++line) {
        if (regex_search(begin(s), end(s), re)) {
            d.emplace_back(line, move(s));
        }
    }
    return d;
}
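To see the line-numbering and matching logic in isolation, here is a minimal sketch of the same loop that reads from an arbitrary input stream instead of a file. The name matches_in_stream is hypothetical and not part of the recipe; it only exists so the logic can be tried on in-memory text.

```cpp
#include <cstddef>
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant of the matches helper: it reads from any input
// stream instead of opening a file, so it can be tried on in-memory text.
std::vector<std::pair<std::size_t, std::string>>
matches_in_stream(std::istream &is, const std::regex &re)
{
    std::vector<std::pair<std::size_t, std::string>> d;
    std::string s;
    for (std::size_t line {1}; std::getline(is, s); ++line) {
        // regex_search succeeds if the pattern occurs anywhere in the line.
        if (std::regex_search(s, re)) {
            // s is moved from here; the next getline call assigns it anew.
            d.emplace_back(line, std::move(s));
        }
    }
    return d;
}
```

Feeding it a std::istringstream that holds the three lines foo, bar, and baz together with the pattern "ba." should yield the pairs (2, "bar") and (3, "baz").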
- In the main function, we first check whether the user provided a command-line argument that we can use as the pattern. If not, we error out.
int main(int argc, char *argv[])
{
    if (argc != 2) {
        cout << "Usage: " << argv[0] << " <pattern>\n";
        return 1;
    }
- Next, we construct a regular expression object from the input pattern. If the pattern is not a valid regular expression, the constructor throws an exception. If that happens, we catch it and error out.
    regex pattern;
    try { pattern = regex{argv[1]}; }
    catch (const regex_error &e) {
        cout << "Invalid regular expression provided.\n";
        return 1;
    }
- Now, we can finally iterate over the filesystem and look for pattern matches. We use recursive_directory_iterator to iterate over all the files in the working directory. It works exactly like directory_iterator in the previous recipe, but it also descends into subdirectories. This way, we don't have to manage the recursion ourselves. On every entry, we call our helper function matches.
    for (const auto &entry :
         recursive_directory_iterator{current_path()}) {
        auto ms (matches(entry.path(), pattern));
- For every match (if any), we print the file path, the line number, and the matching line's complete content.
        for (const auto &[number, content] : ms) {
            cout << entry.path().c_str() << ":" << number
                 << " - " << content << '\n';
        }
    }
}
- Let's prepare a file called "foobar.txt", which contains some test lines we can search for.
foo
bar
baz
- Compiling and running yields the following output. I launched the app in the /Users/tfc/testdir folder on my laptop, first with the pattern "bar". Within that directory, it found the second line of our foobar.txt file and a match in another file, "text1.txt", which is located in testdir/dir1.
$ ./grepper bar
/Users/tfc/testdir/dir1/text1.txt:1 - foo bar bla blubb
/Users/tfc/testdir/foobar.txt:2 - bar
- Launching the app again, this time with the pattern "baz", finds the third line of our example text file.
$ ./grepper baz
/Users/tfc/testdir/foobar.txt:3 - baz
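Both runs above use valid patterns. A malformed pattern, such as one with an unbalanced parenthesis, takes the error path in main instead. The following minimal sketch isolates that check; the function name try_make_regex and the use of std::optional are assumptions made for illustration and do not appear in the recipe.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: returns the compiled pattern, or an empty optional
// if the text is not a valid regular expression.
std::optional<std::regex> try_make_regex(const std::string &text)
{
    try {
        return std::regex(text);
    } catch (const std::regex_error &) {
        // The constructor reports a malformed pattern by throwing.
        return std::nullopt;
    }
}
```

With this sketch, a call such as try_make_regex("ba.") is expected to return a usable pattern, whereas try_make_regex("(") is expected to return an empty optional because the group is never closed.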