comp.lang.python

How to find the best solution ?

Johny

3/23/2010 10:49:00 AM

I have a text and would like to split it into smaller parts of,
say, 100 characters each. But if the 100th character is not a
blank (that is, it falls inside a word), the part must be shorter
than 100 characters, because a word itself cannot be split.
The smaller parts must contain only whole (unsplit) words.
I was thinking about a regex but do not know how to find the correct
regular expression.
Can anyone help?
Thanks
L.
2 Answers

Tim Golden

3/23/2010 10:55:00 AM


On 23/03/2010 10:48, Johny wrote:
> I have a text and would like to split it into smaller parts of,
> say, 100 characters each. But if the 100th character is not a
> blank (that is, it falls inside a word), the part must be shorter
> than 100 characters, because a word itself cannot be split.
> The smaller parts must contain only whole (unsplit) words.
> I was thinking about a regex but do not know how to find the correct
> regular expression.
> Can anyone help?
> Thanks
> L.

Have a look at the textwrap module in the standard library.

TJG
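
As a rough illustration of the textwrap suggestion above, the following is a minimal sketch and not code from the thread. The helper name split_whole_words and the sample text are made up for the example. It assumes that a word longer than the limit should be kept whole on its own over-long chunk rather than cut.

```python
import textwrap


def split_whole_words(text, size=100):
    """Split text into chunks of at most `size` characters, never cutting a word.

    Hypothetical helper for illustration only.
    """
    return textwrap.wrap(
        text,
        width=size,
        break_long_words=False,   # a word longer than `size` stays whole
        break_on_hyphens=False,   # treat hyphenated words as single words
    )


# Made-up sample text
sample = "the quick brown fox jumps over the lazy dog " * 10
chunks = split_whole_words(sample, 100)
for chunk in chunks:
    print("%i: [%s]" % (len(chunk), chunk))
```

Note that textwrap drops the whitespace at the chunk boundaries, so the chunks have to be rejoined with a space if the original text is needed again.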

Tim Chase

3/23/2010 5:31:00 PM


Johny wrote:
> I have a text and would like to split it into smaller parts of,
> say, 100 characters each. But if the 100th character is not a
> blank (that is, it falls inside a word), the part must be shorter
> than 100 characters, because a word itself cannot be split.
> The smaller parts must contain only whole (unsplit) words.
> I was thinking about a regex but do not know how to find the correct
> regular expression.

While I suspect you can come close with a regular expression:

import re, random

size = 100
# match up to `size` characters, ending on a word boundary
r = re.compile(r'.{1,%i}\b' % size)
# generate a random text string with a mix of word lengths
words = ['a', 'an', 'the', 'four', 'fives', 'sixsix']
data = ' '.join(random.choice(words) for _ in range(200))
# for each chunk of 100 characters (or fewer, so that
# it ends on a word boundary), do something
for bit in r.finditer(data):
    chunk = bit.group(0)
    print("%i: [%s]" % (len(chunk), chunk))

it may have an EOF fencepost error, so you might have to clean up
the last item. My simple test seemed to show it worked without
cleanup though.

-tkc
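
The end-of-text concern above can be reproduced with a small check. This is a sketch, not part of the original post. The function names, the sample sentence, and the whitespace-anchored variant are assumptions introduced for illustration. It assumes single-line text, since "." does not match a newline by default.

```python
import re


def chunks_word_boundary(text, size=100):
    # Pattern from the post: up to `size` characters ending at a \b.
    return [m.group(0) for m in re.finditer(r'.{1,%i}\b' % size, text)]


def chunks_whitespace(text, size=100):
    # Hypothetical variant: each chunk starts and ends on a non-space
    # character and must be followed by whitespace or the end of the text.
    if size < 2:
        raise ValueError("size must be at least 2")
    pattern = r'\S(?:.{0,%i}\S)?(?=\s|$)' % (size - 2)
    return [m.group(0) for m in re.finditer(pattern, text)]


# Made-up sample that ends in punctuation rather than a word character
text = "the quick brown fox jumps over the lazy dog."

print(chunks_word_boundary(text, 20))
# -> ['the quick brown fox ', 'jumps over the lazy ', 'dog']
print(chunks_whitespace(text, 20))
# -> ['the quick brown fox', 'jumps over the lazy', 'dog.']
```

With the \b pattern the final "." is lost, because there is no word boundary after a trailing non-word character. The whitespace-anchored variant keeps it. Both patterns still silently skip the start of any word longer than the limit, so such words need separate handling.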