Sunday, November 8, 2015

C++: splitting strings

There are multiple ways of splitting, or tokenizing, strings in C++. Below are four approaches that seem to be the most used and/or useful.

The C-style strtok

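#include <cstdio>
#include <cstring>

// strtok writes '\0' into the buffer it scans, so str must be a writable array
char str[] = "The quick brown fox jumps over the lazy dog";
char* pch = std::strtok(str, " ");
while (pch != nullptr)
{
  std::printf("%s\n", pch);
  pch = std::strtok(nullptr, " ");  // a null first argument continues with the same buffer
}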

Using the C++ std::stringstream class

#include <iostream>
#include <sstream>
#include <string>

std::string input = "The quick brown fox jumps over the lazy dog";
std::stringstream ss(input);
std::string item;
while (std::getline(ss, item, ' ')) {
    std::cout << item << std::endl;
}
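One detail of the std::getline loop is easy to miss: two delimiters in a row produce an empty item, while a delimiter at the very end of the input does not. The sketch below wraps the loop above in a small function so that this can be checked; the name split_getline and the choice to return a std::vector are my own assumptions, not part of any library.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (the name is illustrative): collects the pieces that
// std::getline produces for a single-character delimiter.
std::vector<std::string> split_getline(const std::string& input, char delim) {
    std::vector<std::string> items;
    std::istringstream ss(input);
    std::string item;
    // getline stops at each delimiter; an empty field between two
    // delimiters is returned as an empty string.
    while (std::getline(ss, item, delim)) {
        items.push_back(item);
    }
    return items;
}
```

With this helper, split_getline("a,,b", ',') gives three items (the middle one empty), while split_getline("a,b,", ',') gives only two.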

Using std::string methods only

#include <iostream>
#include <string>

std::string input = "The quick brown fox jumps over the lazy dog";
std::string strSplit = " ";  // the delimiter; may be longer than one character
size_t pos = 0;
size_t start = 0;
std::string subStr;
while( (pos = input.find(strSplit, start)) != std::string::npos){
    subStr = input.substr(start, pos-start);
    start = pos + strSplit.size();  // skip past the delimiter just found
    std::cout << subStr << std::endl;
}
// whatever follows the last delimiter is the final token
subStr = input.substr(start);
std::cout << subStr << std::endl;
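The same loop is easier to reuse when it is wrapped in a function that returns the pieces instead of printing them. This is only a minimal sketch: the name split and the handling of an empty delimiter are my own assumptions.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from a library): splits `input` on every
// occurrence of `delim`, which may be longer than one character.
std::vector<std::string> split(const std::string& input, const std::string& delim) {
    std::vector<std::string> parts;
    if (delim.empty()) {
        // find("") matches without advancing, so the loop below would never
        // end; returning the whole input is an arbitrary but safe choice.
        parts.push_back(input);
        return parts;
    }
    std::size_t start = 0;
    std::size_t pos = 0;
    while ((pos = input.find(delim, start)) != std::string::npos) {
        parts.push_back(input.substr(start, pos - start));
        start = pos + delim.size();  // skip past the delimiter just found
    }
    parts.push_back(input.substr(start));  // piece after the last delimiter
    return parts;
}
```

For example, split("a--b--c", "--") returns the three strings a, b and c, and split("a,,b", ",") keeps the empty piece between the two commas.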

Using the boost libraries

#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>
#include <boost/algorithm/string/iter_find.hpp>

std::string input = "The quick brown fox jumps over the lazy dog";
std::string strSplit = " ";
std::vector<std::string> stringVector;
// first_finder matches strSplit as a whole substring, so it can be a word
boost::iter_split(stringVector, input, boost::first_finder(strSplit));
for(const auto& token : stringVector){
    std::cout << token << std::endl;
}

All these methods do the same thing here, each having its pros and cons (which won't be explained here; Google knows best :) ).
The first two methods accept only single-character delimiters (strtok takes a set of them, std::getline exactly one), while the last two can also split on whole words.
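To make the strtok case concrete: its second argument is a set of delimiter characters, not a word, and runs of delimiters are collapsed, so it never returns an empty token. Here is a small sketch that shows this; the helper name split_strtok is hypothetical, and it works on a copy of the input because strtok overwrites the buffer it scans.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes a private copy of `input` with std::strtok.
// Every character in `delims` acts as a separate delimiter.
std::vector<std::string> split_strtok(std::string input, const char* delims) {
    std::vector<std::string> tokens;
    // &input[0] is the writable, null-terminated storage of the by-value copy
    for (char* tok = std::strtok(&input[0], delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.emplace_back(tok);
    }
    return tokens;
}
```

So split_strtok("one, two,three", ", ") yields one, two and three, because both the comma and the space are delimiters on their own. Keep in mind that strtok keeps hidden state between calls, so this sketch is not safe to use from several threads at once.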
