comp.lang.c++

elementary string processing question

tonywh00t

11/1/2008 3:29:00 AM

Hi everyone,

I have a "simple" question, especially for people familiar with regex.
I need to parse strings that have the form:

1:3::5:9

which indicates the set of integers {1 3 4 5 9}. In other words, I have
a set of numbers separated by ":", where "::" indicates a range from
lo to hi inclusive. It is desirable to error check this string (i.e., it
should start and end with a number, and be composed only of numbers,
"::", and ":"). I'm currently using the Boost C++ library, and I've
worked out some pretty ugly solutions. If anyone has a suggestion, I'd
very much appreciate it. Thanks!
3 Answers

James Kanze

11/1/2008 9:29:00 AM

On Nov 1, 4:28 am, tonywh00t <tony.s...@gmail.com> wrote:

> I have a "simple" question, especially for people familiar
> with regex. I need to parse strings that have the form:

> 1:3::5:9

> which indicates the set of integers {1 3 4 5 9}. In other
> words, I have a set of numbers separated by ":", where "::"
> indicates a range from lo to hi inclusive. It is desirable to
> error check this string (i.e., it should start and end with a
> number, and be composed only of numbers, "::", and ":"). I'm
> currently using the Boost C++ library, and I've worked out
> some pretty ugly solutions. If anyone has a suggestion, I'd
> very much appreciate it. Thanks!

I presume that the number of entries in the string may vary;
otherwise, as you said yourself, regex alone would do it. I'd
still use regex to validate the string; something like
"^\\d+(:\\d+|::\\d+)*$" should do the trick. (It would be really
elegant if you could use capture, but capture doesn't work well
within closures: only the last match is captured.)
Then I'd simply break the string up into substrings at each ':':

#include <algorithm>
#include <string>
#include <vector>

// Split source at each ':'; a "::" yields an empty field.
std::vector< std::string >
parse( std::string const& source )
{
    typedef std::string::const_iterator
        TextIter ;
    std::vector< std::string >
        result ;
    TextIter current = source.begin() ;
    TextIter const end = source.end() ;
    while ( current != end ) {
        TextIter fieldBegin = current ;
        current = std::find( current, end, ':' ) ;
        result.push_back( std::string( fieldBegin, current ) ) ;
        if ( current != end ) {
            ++ current ;    // step over the ':' itself
        }
    }
    return result ;
}

This gives you an array of strings, with an empty string for each
"::" (so when you see an empty string, you know you have a range).
So you could do something like:

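#include <algorithm>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

int
toInt( std::string const& string )
{
    std::istringstream cvt( string ) ;
    int result ;
    if ( !( cvt >> result ) ) {
        // Never return an uninitialized value for a bad field.
        throw std::invalid_argument( "not a number: '" + string + "'" ) ;
    }
    return result ;
}

std::vector< int >
convert( std::vector< std::string > const& source )
{
    typedef std::vector< std::string >::const_iterator
        FieldIter ;
    std::vector< int > result ;
    FieldIter current = source.begin() ;
    FieldIter const end = source.end() ;
    while ( current != end ) {
        result.push_back( toInt( *current ) ) ;
        ++ current ;
        if ( current != end && *current == "" ) {
            // Empty field: the numbers on either side bound a range.
            int bottom = result.back() ;
            ++ current ;    // safe only because of the regex precheck
            int top = toInt( *current ) ;
            if ( top <= bottom ) {
                // Any suitable exception type will do here.
                throw std::invalid_argument( "range bounds out of order" ) ;
            }
            while ( ++ bottom <= top ) {
                result.push_back( bottom ) ;
            }
            ++ current ;
        }
    }
    std::sort( result.begin(), result.end() ) ;
    // Or you might want to track the last seen to ensure
    // that the input was correctly sorted.
    return result ;
}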

Note that all of the above code supposes the precheck on the
format using regex. Otherwise, you'll need a lot more error
handling and special cases.
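
The precheck assumed above might look like the following minimal sketch. It uses std::regex from C++11 rather than Boost.Regex, and the function name isValidRangeList is made up for illustration. std::regex_match only succeeds when the whole string matches, so the ^ and $ anchors are dropped from the pattern.

```cpp
#include <regex>
#include <string>

// Returns true when text is a well-formed list such as "1:3::5:9".
// Hypothetical helper; std::regex stands in for Boost.Regex here.
bool
isValidRangeList( std::string const& text )
{
    // One number, then any count of ":number" or "::number" groups.
    static std::regex const pattern( "\\d+(:\\d+|::\\d+)*" ) ;
    // regex_match requires the entire string to match the pattern.
    return std::regex_match( text, pattern ) ;
}
```

A caller would check isValidRangeList( source ) first and only then hand the string to parse and convert. Note that the pattern also accepts chained ranges such as "1::2::3", which the conversion step has to reject or handle itself.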

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Juha Nieminen

11/1/2008 10:02:00 AM

tonywh00t wrote:
> I'm currently using the Boost C++ library, and I've
> worked out some pretty ugly solutions. If anyone has a suggestion, I'd
> very much appreciate it. Thanks!

My experience is that whenever you need to parse input data that is
more complicated than fixed-format, whitespace-separated elements, the
parsing code becomes very complicated in C++ (as well as in C). Neither
language was designed for writing parsers of complicated formats as
one-liners, and often not even as 100-liners (especially if you want
full error checking).

Of course, libraries have been developed over the decades to help with
this, but they often help more with the abstraction than with the
verbosity and complexity of the code.
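
To make the verbosity point concrete, here is a rough sketch of a hand-written, single-pass parser for the "1:3::5:9" format with error checking and no regex. The name expandRangeList and the choice of std::invalid_argument are assumptions for illustration. It does not check for integer overflow, does not sort its output, and (unlike a stricter grammar) accepts chained ranges such as "1::3::5".

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Expands e.g. "1:3::5:9" to {1, 3, 4, 5, 9}.
// Throws std::invalid_argument on malformed input.
std::vector< int >
expandRangeList( std::string const& text )
{
    std::vector< int > result ;
    std::size_t pos = 0 ;
    bool inRange = false ;      // true when the last separator was "::"
    for ( ;; ) {
        // Every field must start with at least one digit.
        if ( pos == text.size()
                || ! std::isdigit( static_cast< unsigned char >( text[ pos ] ) ) ) {
            throw std::invalid_argument(
                "expected a number at position " + std::to_string( pos ) ) ;
        }
        int value = 0 ;
        while ( pos != text.size()
                && std::isdigit( static_cast< unsigned char >( text[ pos ] ) ) ) {
            value = 10 * value + ( text[ pos ] - '0' ) ;
            ++ pos ;
        }
        if ( inRange ) {
            // "lo::hi": lo is already stored, so append lo+1 .. hi.
            int const lo = result.back() ;
            if ( value <= lo ) {
                throw std::invalid_argument( "range bounds out of order" ) ;
            }
            for ( int i = lo + 1 ; i <= value ; ++ i ) {
                result.push_back( i ) ;
            }
        } else {
            result.push_back( value ) ;
        }
        if ( pos == text.size() ) {
            return result ;     // input ended right after a number
        }
        if ( text[ pos ] != ':' ) {
            throw std::invalid_argument(
                "unexpected character at position " + std::to_string( pos ) ) ;
        }
        ++ pos ;
        // A second ':' turns the separator into a range marker.
        inRange = pos != text.size() && text[ pos ] == ':' ;
        if ( inRange ) {
            ++ pos ;
        }
    }
}
```

Even this small grammar takes around fifty lines once each failure mode is reported, which is roughly the trade-off described above.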

tonywh00t

11/1/2008 4:40:00 PM

thanks guys very much for your suggestions and help =).