[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

[QUIZ] Editing Text (#145

James Gray

10/26/2007 1:03:00 PM

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rub...

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Eric Mahurin

Have you ever wondered how a text buffer might be represented in a text editor
or word processor? A simple string to represent text buffer isn't efficient
enough because inserting (i.e. typing) and deleting (backspace) in the middle
would result in moving all of the text to the end for each operation. A data
structure that can efficiently insert and delete in the middle is needed.

The task is to implement data structures for efficiently editing text. One
common data structure is a gap buffer:

http://en.wikipedia.org/wiki/...

Other options to consider include: ropes (see quiz #137), linked lists, simple
strings, or a multi-level combination of data structures (i.e. for lines vs.
characters in a line). There are many data structures that may work efficiently
with simple editing operations, but not all of those data structures will work
well for more complex functionality.

All of the basic operations occur around a text cursor. The minimal operations
on/at the cursor should be:

* Insert a character before or after the cursor.
* Delete a character before or after the cursor and return the
deleted character.
* Move the cursor left or right (crossing line boundaries if
necessary) and return the character or nil at the beginning
or end of the buffer.
* Move the cursor up or down a line and return nil/false only if a
line boundary could not be crossed. The cursor may be placed in
the most natural column for the data structure.

Additional useful operations that you might find in a typical text editor can
also be added:

* Get current line and column numbers
* Copy some amount of text before or after the cursor and return
this buffer.
* Display some context around the cursor.
* Cut some amount of text before or after the cursor and return
this buffer.
* Paste a copy/cut buffer before or after the cursor.
* Insert a file.
* Write to a file.
* Goto a specific line or column.
* Goto the begin/end of the line/buffer.
* Copy or cut to a specific line/column.
* Filter some text through a ruby block.
* Search (and possibly replace) using a regular expression.
* Undo/redo.

Major bonus points for the following where gap buffers probably won't work:

* Only store changes to a file.
* Handle multiple efficient cursors in a text buffer.

Although the focus is on data structures and making the ruby DSL equivalent to
unix "ed" or DOS "edlin", a GUI could be added to make a full-screen/window text
editor.

Here is a benchmark for testing that needs the minimal implementation
(#insert_before, #insert_after, #delete_before, #delete_after, #left, #right,
#up, #down):

# edit_test.rb
# usage: ruby -r klass.rb edit_test.rb <iter> # [<constructor> [<lines> <columns>] ...]

require 'benchmark'
require 'test/unit/assertions'
include Test::Unit::Assertions

# char = byte pre 1.9, each_char already defined in 1.9
unless "".respond_to?(:each_char)
class String;alias_method(:each_char, :each_byte);end
end

iterations = ARGV.shift.to_i

while cursor = ARGV.shift
nlines = (ARGV.shift || 10000).to_i
ncolumns = (ARGV.shift || 100).to_i
n = nlines*ncolumns
chars = (?a..?z).to_a
line = (0...ncolumns).inject("") { |line, i|
line << chars[i%chars.length]
}
line[-1] = ?\n

iterations.times { eval(cursor).instance_eval {
Benchmark.benchmark(
"#{cursor}: #{nlines}x#{ncolumns}\n",16,nil,"total"
) { |b|

total = b.report("insert_before") {
nlines.times { line.each_char { |ch| insert_before(ch) } }
}
i = 0
total += b.report("left") { i += 1 while left }
assert_equal(n, i)
i = 0
total += b.report("right") { i += 1 while right }
assert_equal(n, i)
i = 0
total += b.report("up") { i += 1 while up }
assert_equal(nlines, i)
i = 0
total += b.report("down") { i += 1 while down }
assert_equal(nlines, i)
total += b.report("insert_after") {
nlines.times { line.each_char { |ch| insert_after(ch) } }
}
i = 0
total += b.report("delete_before") {
i += 1 while delete_before
}
assert_equal(n, i)
i = 0
total += b.report("delete_after") {
i += 1 while delete_after
}
assert_equal(n, i)

[total]

}
} }
end

44 Answers

Todd Burch

10/26/2007 1:15:00 PM

0

James Gray wrote:
> All of the basic operations occur around a text cursor. The minimal
> operations
> on/at the cursor should be:
>

James, thanks for putting all these challenges together, and thanks to
the contributors too.

Just a terminology thing - the proper term is "caret", not a "text
cursor".

Todd
--
Posted via http://www.ruby-....

James Gray

10/26/2007 1:27:00 PM

0

On Oct 26, 2007, at 8:15 AM, Todd Burch wrote:

> James Gray wrote:
>> All of the basic operations occur around a text cursor. The minimal
>> operations on/at the cursor should be:

> Just a terminology thing - the proper term is "caret", not a "text
> cursor".

While I probably would have called it a caret as well, the term
doesn't seem completely out of line:

http://en.wikipedia.org/w...

http://en.wikipedia.org/w..._%28computers%29

James Edward Gray II

Todd Burch

10/26/2007 1:43:00 PM

0

James Gray wrote:
> On Oct 26, 2007, at 8:15 AM, Todd Burch wrote:
>
>
> While I probably would have called it a caret as well, the term
> doesn't seem completely out of line:
>

Not out of line in the least. Without resorting to cheesey emoticons
(wink wink, smiley, grin, et al), mine was a passing comment. I am sure
no one got confused from your description.

Again, thanks.
--
Posted via http://www.ruby-....

James Gray

10/26/2007 1:48:00 PM

0

On Oct 26, 2007, at 8:43 AM, Todd Burch wrote:

> James Gray wrote:
>> On Oct 26, 2007, at 8:15 AM, Todd Burch wrote:
>>
>>
>> While I probably would have called it a caret as well, the term
>> doesn't seem completely out of line:
>>
>
> Not out of line in the least. Without resorting to cheesey emoticons
> (wink wink, smiley, grin, et al), mine was a passing comment. I am
> sure
> no one got confused from your description.

Just to be clear, I didn't write that quiz. Eric Mahurin did.

James Edward Gray II

Rick DeNatale

10/26/2007 2:27:00 PM

0

On 10/26/07, James Edward Gray II <james@grayproductions.net> wrote:
> On Oct 26, 2007, at 8:15 AM, Todd Burch wrote:
>
> > James Gray wrote:
> >> All of the basic operations occur around a text cursor. The minimal
> >> operations on/at the cursor should be:
>
> > Just a terminology thing - the proper term is "caret", not a "text
> > cursor".
>
> While I probably would have called it a caret as well, the term
> doesn't seem completely out of line:
>
> http://en.wikipedia.org/w...
>
> http://en.wikipedia.org/w..._%28computers%29

Actually, I'd say that caret is a UI/View related term (its the name
of the ^ character, traditionally used in blue-pencil editing to mark
a place to insert test.

Cursor is kind of both a view and a model term, and I think we're
talking about a model in this quiz.

That said, in general, when I've looked at similar code, the analogous
concept usually used is a selection, which represents a contiguous run
of zero or more characters within the buffer. The quiz seems to be
using a degenerate version of this where the length is constrained to
be zero.


--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

Joel VanderWerf

10/26/2007 5:45:00 PM

0

Todd Burch wrote:
> James Gray wrote:
>> All of the basic operations occur around a text cursor. The minimal
>> operations
>> on/at the cursor should be:
>>
>
> James, thanks for putting all these challenges together, and thanks to
> the contributors too.
>
> Just a terminology thing - the proper term is "caret", not a "text
> cursor".
>
> Todd

When the Macintosh first came out, the little blinking vertical bar in
text editors was called the insertion point or caret, but, because of
its shape, it was also called the "stick".

It doesn't matter whether you see it as a caret or a stick, as long as
it provides sufficient motivation for the Quiz. ;) ;) ;)

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Eric Mahurin

10/27/2007 9:13:00 PM

0

On 10/26/07, Rick DeNatale <rick.denatale@gmail.com> wrote:
> On 10/26/07, James Edward Gray II <james@grayproductions.net> wrote:
> > On Oct 26, 2007, at 8:15 AM, Todd Burch wrote:
> >
> > > James Gray wrote:
> > >> All of the basic operations occur around a text cursor. The minimal
> > >> operations on/at the cursor should be:
> >
> > > Just a terminology thing - the proper term is "caret", not a "text
> > > cursor".
> >
> > While I probably would have called it a caret as well, the term
> > doesn't seem completely out of line:
> >
> > http://en.wikipedia.org/w...
> >
> > http://en.wikipedia.org/w..._%28computers%29
>
> Actually, I'd say that caret is a UI/View related term (its the name
> of the ^ character, traditionally used in blue-pencil editing to mark
> a place to insert test.
>
> Cursor is kind of both a view and a model term, and I think we're
> talking about a model in this quiz.

Correct. I've also seen the term cursor being used with databases
(pointer to what row is being operated on). It has also been used to
mean the same as what rubyists call an "external iterator" (ruby
mainly uses "internal iterators" or what others might call visitors).

Before mice came on the scene the symbol representing the text
insertion point was also called a "cursor". I didn't realize that
"cursor" is now mainly used for the mouse now and the text cursor has
been demoted to "caret". I usually just say "mouse pointer" and "text
cursor". According to how the term "cursor" is used in data
structures, I think the text cursor/caret deserves to use this term
more than a mouse pointer/cursor does.

> That said, in general, when I've looked at similar code, the analogous
> concept usually used is a selection, which represents a contiguous run
> of zero or more characters within the buffer. The quiz seems to be
> using a degenerate version of this where the length is constrained to
> be zero.

Don't let the test in this quiz stop you. I just provided a benchmark
for the simple stuff. Feel free to implement other text editor
operations in addition to the simple operations that the benchmark
uses. And don't let the benchmark restrict your API. If the model
that the benchmark assumes doesn't match your model (i.e. a selection
in your case?), change the benchmark test to what you need. Also,
achieving the best absolute performance on this benchmark isn't
necessarily the most important, but you should get reasonable big-O
performance. I'd like to see what data structures some of you come up
with and how far you can take them in terms of functionality. I think
there are lots of solutions with a variety of data structures.

Have fun!

Eric

Eric Mahurin

10/29/2007 1:36:00 PM

0

On 10/26/07, Ruby Quiz <james@grayproductions.net> wrote:
> A simple string to represent text buffer isn't efficient
> enough because inserting (i.e. typing) and deleting (backspace) in the middle
> would result in moving all of the text to the end for each operation.

Here is an inefficient string-based solution:

class StringCaret
# cursor/caret is between char i-1 and i
def initialize(data="", i=0)
@data = data
@i = i
end
def insert_before(ch)
@data[@i,0] = ("" << ch)
@i += 1
end
def insert_after(ch)
@data[@i,0] = ("" << ch)
end
def delete_before
@i.nonzero? and @data.slice!(@i-=1)
end
def delete_after
@data.slice!(@i)
end
def left
@i.nonzero? and @i-=1
end
def right
@i<@data.length and @i+=1
end
def up
while @i.nonzero?
@i -= 1
break(true) if @data[@i]==?\n
end
end
def down
while @i<@data.length
break(@i+=1) if @data[@i]==?\n
@i += 1
end
end
end


The benchmark results look like this on my machine:

> ruby -r string.rb test.rb 2 StringCaret.new 1000 100 StringCaret.new 2000 100
StringCaret.new: 1000x100
...
total 7.800000 0.030000 7.830000 ( 7.851117)
StringCaret.new: 2000x100
...
total 26.810000 0.100000 26.910000 ( 27.015445)

The run-time almost quadrupled when the data-size doubled, so this is O(n**2).

An O(n) solution with this minimal functionality is doable in about
the same number of lines as above. But, you have to use another data
structure...

Eric

Philip Gatt

10/29/2007 6:50:00 PM

0

Now that you've submitted a reference implementation, it looks much
more interesting to try to optimize. I bet you'll get a better
response now.

On Oct 29, 2007, at 6:36 AM, Eric Mahurin wrote:

> On 10/26/07, Ruby Quiz <james@grayproductions.net> wrote:
>> A simple string to represent text buffer isn't efficient
>> enough because inserting (i.e. typing) and deleting (backspace) in
>> the middle
>> would result in moving all of the text to the end for each operation.
>
> Here is an inefficient string-based solution:
>
> class StringCaret
> # cursor/caret is between char i-1 and i
> def initialize(data="", i=0)
> @data = data
> @i = i
> end
> def insert_before(ch)
> @data[@i,0] = ("" << ch)
> @i += 1
> end
> def insert_after(ch)
> @data[@i,0] = ("" << ch)
> end
> def delete_before
> @i.nonzero? and @data.slice!(@i-=1)
> end
> def delete_after
> @data.slice!(@i)
> end
> def left
> @i.nonzero? and @i-=1
> end
> def right
> @i<@data.length and @i+=1
> end
> def up
> while @i.nonzero?
> @i -= 1
> break(true) if @data[@i]==?\n
> end
> end
> def down
> while @i<@data.length
> break(@i+=1) if @data[@i]==?\n
> @i += 1
> end
> end
> end
>
>
> The benchmark results look like this on my machine:
>
>> ruby -r string.rb test.rb 2 StringCaret.new 1000 100
>> StringCaret.new 2000 100
> StringCaret.new: 1000x100
> ...
> total 7.800000 0.030000 7.830000 ( 7.851117)
> StringCaret.new: 2000x100
> ...
> total 26.810000 0.100000 26.910000 ( 27.015445)
>
> The run-time almost quadrupled when the data-size doubled, so this
> is O(n**2).
>
> An O(n) solution with this minimal functionality is doable in about
> the same number of lines as above. But, you have to use another data
> structure...
>
> Eric
>


Eric Mahurin

10/29/2007 7:20:00 PM

0

On 10/29/07, Philip Gatt <gattster@gmail.com> wrote:
> Now that you've submitted a reference implementation, it looks much
> more interesting to try to optimize. I bet you'll get a better
> response now.

Yes, I probably should have put something like this in the quiz to begin with.