comp.lang.lisp

Re: newbie post - by delimited text file reader suite any good?

William James

3/31/2015 8:26:00 AM

Rob Warnock wrote:

> ;;; PARSE-CSV-LINE -- Parse one CSV line into a list of fields,
> ;;; stripping quotes and field-internal escape characters.
> ;;; Lexical states: '(normal quoted escaped quoted+escaped)
> ;;;
> (defun parse-csv-line (line)
>   (when (or (string= line "")           ; special-case blank lines
>             (char= #\# (char line 0)))  ; or those starting with "#"
>     (return-from parse-csv-line '()))
>   (loop for c across line
>         with state = 'normal
>         and results = '()
>         and chars = '() do
>     (ecase state
>       ((normal)
>        (case c
>          ((#\") (setq state 'quoted))
>          ((#\\) (setq state 'escaped))
>          ((#\,)
>           (push (coerce (nreverse chars) 'string) results)
>           (setq chars '()))
>          (t (push c chars))))
>       ((quoted)
>        (case c
>          ((#\") (setq state 'normal))
>          ((#\\) (setq state 'quoted+escaped))
>          (t (push c chars))))
>       ((escaped) (push c chars) (setq state 'normal))
>       ((quoted+escaped) (push c chars) (setq state 'quoted)))
>     finally
>      (progn
>        (push (coerce (nreverse chars) 'string) results) ; close open field
>        (return (nreverse results)))))

That is not a correct csv parser.

If a field contains a double-quote character ("), it
is not supposed to be escaped with a backslash; it is
supposed to be doubled, i.e., " becomes "".

Let's say we want to construct a csv record that
contains these 3 fields:

foo
Rob "Loopy" Warnock
bell, book, and candle

Since the 2nd field contains quotes, we must surround
it with quotes and double the interior quotes.
The 3rd field must also be surrounded with quotes.

foo
"Rob ""Loopy"" Warnock"
"bell, book, and candle"
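
The quoting rule above can be sketched in Common Lisp roughly as follows.
This is only an illustration, not code from the thread: the names
CSV-QUOTE-FIELD and MAKE-CSV-RECORD are made up here, and the sketch
assumes that a field needs quoting only when it contains a double quote,
a comma, or a newline.

```lisp
;;; Hypothetical helpers (not from the original post).
;;; Assumption: a field is quoted only if it contains ", a comma, or a newline.

(defun csv-quote-field (field)
  "Return the string FIELD as CSV text, quoting it only when necessary."
  (if (find-if (lambda (c) (find c '(#\" #\, #\Newline))) field)
      (with-output-to-string (out)
        (write-char #\" out)                 ; opening quote
        (loop for c across field
              do (when (char= c #\")
                   (write-char #\" out))     ; double each interior quote
                 (write-char c out))
        (write-char #\" out))                ; closing quote
      field))

(defun make-csv-record (fields)
  "Join the list of strings FIELDS into one CSV record."
  (format nil "~{~A~^,~}" (mapcar #'csv-quote-field fields)))
```

Calling (make-csv-record '("foo" "Rob \"Loopy\" Warnock" "bell, book, and candle"))
should produce the three fields shown above, joined by commas.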

So the csv record can be described by this:

(define csv-record
  (symbol->string '|foo,"Rob ""Loopy"" Warnock","bell, book, and candle"|))


Gauche Scheme:

(use gauche.lazy) ; lrxmatch

(define (parse-csv record)
  (map
    (lambda (match)
      (regexp-replace-all
        #/""/
        (regexp-replace-all #/^"|"$/ (rxmatch-substring match 1) "")
        "\""))
    (lrxmatch #/("(?:[^"]+|"")*"|[^,]*),/
              (string-append record ","))))


(for-each
  print
  (parse-csv csv-record))

===>
foo
Rob "Loopy" Warnock
bell, book, and candle
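
For comparison, the same doubled-quote rule could be handled in Common Lisp
by a small state machine in the style of the quoted code. This is a hedged
sketch and not anyone's posted solution: the name PARSE-CSV-LINE-DOUBLED and
the :QUOTE-SEEN state are invented for illustration, backslashes are treated
as ordinary characters, and malformed input (e.g. text right after a closing
quote) is accepted leniently rather than rejected.

```lisp
;;; Hypothetical sketch (not from the original post).
;;; States: :normal      outside quotes
;;;         :quoted      inside a quoted field
;;;         :quote-seen  just saw a " while inside a quoted field

(defun parse-csv-line-doubled (line)
  "Split the string LINE into a list of field strings, treating \"\" inside
a quoted field as one literal double quote."
  (let ((state :normal) (chars '()) (fields '()))
    (flet ((end-field ()
             (push (coerce (reverse chars) 'string) fields)
             (setf chars '())))
      (loop for c across line do
        (ecase state
          (:normal
           (case c
             (#\" (setf state :quoted))
             (#\, (end-field))
             (t (push c chars))))
          (:quoted
           (if (char= c #\")
               (setf state :quote-seen)      ; closing quote, or first of ""
               (push c chars)))
          (:quote-seen
           (case c
             (#\" (push c chars) (setf state :quoted))   ; "" -> literal "
             (#\, (end-field) (setf state :normal))      ; field is finished
             (t (push c chars) (setf state :normal)))))) ; lenient fallback
      (end-field)                            ; close the last open field
      (nreverse fields))))
```

Applied to the record used above, this should return the list
("foo" "Rob \"Loopy\" Warnock" "bell, book, and candle"). Note that an empty
line yields a list holding one empty string, unlike the quoted code, which
special-cases blank lines.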