We are going to implement two different custom string classes: lc_string and ci_string. The first class stores a lower-case version of whatever string it is constructed from. The second class leaves its content untouched but compares strings case-insensitively:
- Let's include the few necessary headers first and then declare that we use the std namespace by default:
#include <iostream>
#include <algorithm>
#include <string>
using namespace std;
- Then we reimplement the std::tolower function, which is already declared in <cctype>. The existing function is fine, but it is not constexpr. Several std::char_traits member functions are constexpr since C++17, however, and we want to keep that property in our own custom character traits classes. The function maps upper-case characters to lower case and leaves other characters unchanged:
static constexpr char tolow(char c) {
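// A plain range check is used because case ranges ('A'...'Z') are a
// GCC/Clang extension, not standard C++.
if (c >= 'A' && c <= 'Z') { return c - 'A' + 'a'; }
return c;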
}
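Because such a function is constexpr, its mapping can be checked at compile time. The following is a minimal sketch and not part of the recipe's program: tolow_sketch is a hypothetical stand-in for tolow, and it assumes an ASCII-compatible character set in which 'A' to 'Z' are contiguous.

```cpp
// Hypothetical stand-in for tolow (assumption: ASCII-compatible execution
// character set, so 'A'..'Z' and 'a'..'z' are contiguous ranges).
constexpr char tolow_sketch(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// These checks are evaluated by the compiler; a wrong mapping would
// stop the build instead of failing at run time.
static_assert(tolow_sketch('A') == 'a', "upper-case letters are lowered");
static_assert(tolow_sketch('Z') == 'z', "the end of the range is included");
static_assert(tolow_sketch('q') == 'q', "lower-case letters are unchanged");
static_assert(tolow_sketch('!') == '!', "non-letters are unchanged");
```

If any of the static_assert lines were false, compilation would fail, which is a cheap way to test a constexpr helper.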
- The std::basic_string class accepts three template parameters: the underlying character type, a character traits class, and an allocator type. We are only changing the character traits class in this section because it defines the behavior of strings. In order to reimplement only what should differ from the ordinary strings, we are publicly inheriting from the standard traits class:
class lc_traits : public char_traits<char> {
public:
- Our class accepts input strings but transforms them to lower case. The assign function copies a single character, so this is where we apply our own tolow function. Because assign is constexpr, it can only call other constexpr functions, which is why we wrote our own constexpr tolow:
static constexpr
void assign(char_type& r, const char_type& a ) {
r = tolow(a);
}
- The other function copies an entire character sequence into the string's own memory. We use a std::transform call to copy all the characters from the source to the destination buffer and, at the same time, map every character to its lower-case version:
static char_type* copy(char_type* dest,
const char_type* src,
size_t count) {
transform(src, src + count, dest, tolow);
return dest;
}
};
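To see what these two trait functions do in isolation, they can be called directly on plain buffers, without going through basic_string. This is only a sketch: the traits class is repeated so that the snippet compiles on its own, tolow is written with a range comparison that assumes ASCII, and lower_via_traits is a hypothetical helper name that does not appear in the recipe.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Assumes an ASCII-compatible character set.
constexpr char tolow(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Same idea as the lc_traits class above, repeated to keep the snippet
// self-contained.
struct lc_traits : std::char_traits<char> {
    // Single-character assignment, lowering the character on the way.
    static constexpr void assign(char_type& r, const char_type& a) {
        r = tolow(a);
    }
    // Bulk copy, lowering every character on the way.
    static char_type* copy(char_type* dest, const char_type* src,
                           std::size_t count) {
        std::transform(src, src + count, dest, tolow);
        return dest;
    }
};

// Hypothetical helper: runs lc_traits::copy over an ordinary std::string.
std::string lower_via_traits(const std::string& in) {
    std::string out(in.size(), '\0');
    lc_traits::copy(&out[0], in.data(), in.size());
    return out;
}
```

Calling lower_via_traits("Foo Bar Baz") returns "foo bar baz", and lc_traits::assign(c, 'Q') stores 'q' in c.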
- The first trait helps build a string class that effectively transforms strings to lower case. We are going to write a second trait that leaves the actual string payload untouched but is case-insensitive when it comes to comparing strings. We inherit from the existing standard character traits class again, and this time, we redefine some other member functions:
class ci_traits : public char_traits<char> {
public:
- The eq function tells whether two characters are equal. We do this too, but we compare their lower-case versions. This way 'A' equals 'a':
static constexpr bool eq(char_type a, char_type b) {
return tolow(a) == tolow(b);
}
- The lt function tells whether the value of a is less than the value of b. We apply the less-than operator for that, again after lower-casing both characters:
static constexpr bool lt(char_type a, char_type b) {
return tolow(a) < tolow(b);
}
- The last two functions worked on character-wise input, and the next two functions work on string-wise input. The compare function works similarly to the old-school strncmp function. It returns 0 if both strings are equal within the length that count defines. If they differ, it returns a negative or positive number, which tells which input string is lexicographically smaller. Calculating the difference between the characters at every position must, of course, be done on their lower-case versions. The nice thing is that loops like this one have been allowed in constexpr functions since C++14:
static constexpr int compare(const char_type* s1,
const char_type* s2,
size_t count) {
for (; count; ++s1, ++s2, --count) {
// int keeps the sign of the difference even where char is unsigned
const int diff {tolow(*s1) - tolow(*s2)};
if (diff < 0) { return -1; }
else if (diff > 0) { return +1; }
}
return 0;
}
- The last function we need to implement for our case-insensitive string class is find. For a given input string, p, and length, count, it searches for a character, ch. It returns a pointer to the first occurrence of that character, or nullptr if there is none. The comparison in this function has to be done through the tolow "glasses" in order to make the search case-insensitive. Unfortunately, we cannot use std::find_if, because it is not constexpr in C++17 (it only became constexpr in C++20), so we must write the loop ourselves:
static constexpr
const char_type* find(const char_type* p,
size_t count,
const char_type& ch) {
const char_type find_c {tolow(ch)};
for (; count != 0; --count, ++p) {
if (find_c == tolow(*p)) { return p; }
}
return nullptr;
}
};
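Since all four functions are constexpr, their case-insensitive behavior can be verified at compile time. The sketch below repeats the traits class so that it compiles on its own (C++14 or later). The tolow variant with a range comparison assumes ASCII, and sample is a hypothetical test buffer that is not part of the recipe.

```cpp
#include <cstddef>
#include <string>

// Assumes an ASCII-compatible character set.
constexpr char tolow(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Same idea as the ci_traits class above, repeated to keep the snippet
// self-contained.
struct ci_traits : std::char_traits<char> {
    static constexpr bool eq(char_type a, char_type b) {
        return tolow(a) == tolow(b);
    }
    static constexpr bool lt(char_type a, char_type b) {
        return tolow(a) < tolow(b);
    }
    static constexpr int compare(const char_type* s1, const char_type* s2,
                                 std::size_t count) {
        for (; count != 0; ++s1, ++s2, --count) {
            const int diff = tolow(*s1) - tolow(*s2);
            if (diff < 0) { return -1; }
            if (diff > 0) { return +1; }
        }
        return 0;
    }
    static constexpr const char_type* find(const char_type* p,
                                           std::size_t count,
                                           const char_type& ch) {
        const char_type wanted = tolow(ch);
        for (; count != 0; --count, ++p) {
            if (wanted == tolow(*p)) { return p; }
        }
        return nullptr;
    }
};

// Hypothetical test buffer: 'B' sits at index 4.
constexpr char sample[] = "Foo Bar";

static_assert(ci_traits::eq('A', 'a'), "case is ignored for equality");
static_assert(ci_traits::lt('a', 'B'), "ordering uses lowered characters");
static_assert(ci_traits::compare("HELLO", "hello", 5) == 0, "equal strings");
static_assert(ci_traits::compare("abc", "ABD", 3) < 0, "'c' sorts before 'd'");
static_assert(ci_traits::find(sample, 7, 'b') == sample + 4, "finds 'B'");
static_assert(ci_traits::find(sample, 7, 'x') == nullptr, "no match");
```

None of these checks produces any run-time code; they only confirm that the compiler can evaluate the trait functions in constant expressions.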
- Okay, that's it for the traits. Now that we have them in place, we can define two new string class types. lc_string means lower-case string. ci_string means case-insensitive string. Both classes differ from std::string only in their character traits class:
using lc_string = basic_string<char, lc_traits>;
using ci_string = basic_string<char, ci_traits>;
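Because the traits class is part of the type, a basic_string with custom traits is a distinct type from std::string, and the two do not convert into each other implicitly. One way to move data between them is to copy the characters explicitly. The following is a minimal sketch under stated assumptions: demo_traits is a placeholder standing in for lc_traits or ci_traits, and to_std_string and from_std_string are hypothetical helper names.

```cpp
#include <string>

// Placeholder traits class standing in for lc_traits or ci_traits
// (an assumption made to keep this sketch self-contained).
struct demo_traits : std::char_traits<char> {};
using demo_string = std::basic_string<char, demo_traits>;

// Hypothetical helper: copies the characters of any char-based string,
// whatever its traits class, into an ordinary std::string.
template <typename Traits>
std::string to_std_string(const std::basic_string<char, Traits>& s) {
    return std::string(s.data(), s.size());
}

// Hypothetical helper for the opposite direction. The target type has to
// be named explicitly, for example from_std_string<demo_string>(s).
template <typename Str>
Str from_std_string(const std::string& s) {
    return Str(s.data(), s.size());
}
```

Both helpers go through data() and size(), so the copy is made by the target string's own traits class. A string type built on a lowering traits class can therefore still be expected to lower the characters it receives.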
- In order to make the output streams accept these new classes for printing, we quickly need to overload the stream operator<<:
ostream& operator<<(ostream& os, const lc_string& str) {
return os.write(str.data(), str.size());
}
ostream& operator<<(ostream& os, const ci_string& str) {
return os.write(str.data(), str.size());
}
- Now we can finally begin implementing the actual program. Let's instantiate a normal string, a lower-case string, and a case-insensitive string, and print them immediately. The normal string and the ci_string should be printed unchanged, while the lc_string should come out entirely in lower case:
int main()
{
cout << " string: "
<< string{"Foo Bar Baz"} << '\n'
<< "lc_string: "
<< lc_string{"Foo Bar Baz"} << '\n'
<< "ci_string: "
<< ci_string{"Foo Bar Baz"} << '\n';
- In order to test the case-insensitive string, we can instantiate two strings that are basically equal but differ in the casing of some characters. In a truly case-insensitive comparison, they should nevertheless compare equal:
ci_string user_input {"MaGiC PaSsWoRd!"};
ci_string password {"magic password!"};
- So, let's compare them and print that they match if they do:
if (user_input == password) {
cout << "Passwords match: \"" << user_input
<< "\" == \"" << password << "\"\n";
}
}
- Compiling and running the program yields the expected results. When we printed the same string three times using different types, the std::string and ci_string instances came out unchanged, while the lc_string instance is all lower case. The comparison of the two strings that differ only in their character casing was indeed successful and produces the right output:
$ ./custom_string
string: Foo Bar Baz
lc_string: foo bar baz
ci_string: Foo Bar Baz
Passwords match: "MaGiC PaSsWoRd!" == "magic password!"
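The program above exercises only operator==. Because basic_string also routes searching and ordering through its traits class, the find and compare overrides should make those operations case-insensitive as well. The following sketch illustrates this under stated assumptions: the traits class is repeated so that the snippet compiles on its own, tolow assumes ASCII, and contains_ci is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>

// Assumes an ASCII-compatible character set.
constexpr char tolow(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Same idea as the ci_traits class from the recipe, repeated to keep the
// snippet self-contained.
struct ci_traits : std::char_traits<char> {
    static constexpr bool eq(char_type a, char_type b) {
        return tolow(a) == tolow(b);
    }
    static constexpr bool lt(char_type a, char_type b) {
        return tolow(a) < tolow(b);
    }
    static constexpr int compare(const char_type* s1, const char_type* s2,
                                 std::size_t count) {
        for (; count != 0; ++s1, ++s2, --count) {
            const int diff = tolow(*s1) - tolow(*s2);
            if (diff < 0) { return -1; }
            if (diff > 0) { return +1; }
        }
        return 0;
    }
    static constexpr const char_type* find(const char_type* p,
                                           std::size_t count,
                                           const char_type& ch) {
        const char_type wanted = tolow(ch);
        for (; count != 0; --count, ++p) {
            if (wanted == tolow(*p)) { return p; }
        }
        return nullptr;
    }
};

using ci_string = std::basic_string<char, ci_traits>;

// Hypothetical helper: case-insensitive substring test built on the
// ordinary basic_string::find member, which consults the traits class.
bool contains_ci(const ci_string& haystack, const ci_string& needle) {
    return haystack.find(needle) != ci_string::npos;
}
```

With these definitions, contains_ci(ci_string{"Foo Bar Baz"}, ci_string{"BAZ"}) is true, ci_string{"Foo Bar Baz"}.find('b') returns index 4, and ci_string{"apple"} < ci_string{"BANANA"} holds because the ordering ignores case.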