comp.lang.lisp

Re: Most efficient way to read words from string.

William James

5/13/2015 2:40:00 PM

Pierre Mai wrote:

> > What I'd like to do is take a string like "I am a string." and return the
> > words and punctuation as multiple values such as:
> >
> > (my-read-from-string "I am a string.")
> > => "I"
> > "am"
> > "a"
> > "string"
> > "."
> >
> > Or have it build a list, whichever is more efficient:
> > (my-read-from-string "I am a string.")
> > => ("I" "am" "a" "string" ".")
>
> Here is a simplistic tokenization function most Perl users will find
> vaguely familiar. It currently will only tokenize along a given set
> of whitespace (which by definition will be elided from the result), so
> you will have to change it a bit for your punctuation characters,
> along which you'll split, but which should be _kept_ in the result
> sequence:
>
> (defun split (string &optional max (ws '(#\Space #\Tab)))
>   "Split `string' along whitespace as defined by the sequence `ws'.
> The whitespace is elided from the result. The whole string will be
> split, unless `max' is a non-negative integer, in which case the
> string will be split into `max' tokens at most, the last one
> containing the whole rest of the given `string', if any."
>   (flet ((is-ws (char) (find char ws)))
>     (loop for start = (position-if-not #'is-ws string)
>             then (position-if-not #'is-ws string :start index)
>           for index = (and start
>                            (if (and max (= (1+ word-count) max))
>                                nil
>                                (position-if #'is-ws string :start start)))
>           while start
>           collect (subseq string start index)
>           count 1 into word-count
>           while index)))


As he indicated, it does not satisfy the specification:

* (split "I am a string.")

("I" "am" "a" "string.")


Gauche Scheme:


(use gauche.lazy :only (lrxmatch))

(map (cut <> 0) (lrxmatch #/\w+|[.,?!;:]/ "Here, it is!?"))
===>
("Here" "," "it" "is" "!" "?")
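
For comparison, here is a rough Common Lisp sketch of the same idea (runs of word characters, or single punctuation marks) using only standard functions. The name `tokenize` and its `punctuation` parameter are made up for illustration and are not from the thread. It also treats a "word" as a run of `alphanumericp` characters, which is an assumption: unlike the regex `\w`, that excludes the underscore.

```lisp
;; Hypothetical helper, not from the thread: split STRING into words and
;; single punctuation characters, dropping whitespace and anything else.
(defun tokenize (string &optional (punctuation ".,?!;:"))
  "Return a list of the words and punctuation marks in STRING.
A word is a maximal run of alphanumeric characters; each character
found in PUNCTUATION becomes a one-character token of its own."
  (let ((tokens '())
        (start nil))                    ; start index of the word in progress
    (flet ((flush (end)
             ;; Emit the word ending just before END, if one is open.
             (when start
               (push (subseq string start end) tokens)
               (setf start nil))))
      (loop for i from 0 below (length string)
            for char = (char string i)
            do (cond ((alphanumericp char)
                      (unless start (setf start i)))
                     (t
                      (flush i)
                      (when (find char punctuation)
                        (push (string char) tokens)))))
      ;; A word may run to the end of the string.
      (flush (length string)))
    (nreverse tokens)))
```

* (tokenize "I am a string.")

("I" "am" "a" "string" ".")

It makes a single pass over the string and builds a list; wrapping the call in `values-list` would give the multiple-values form instead.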
