Asp Forum - Pass subarray by reference (not by value) in Common Lisp

Alexandre

8/18/2015 2:45:00 PM

Let's suppose I have an array - which I will call *my-array* - that looks like this:

#2A((1 2 3)
(4 5 6)
(7 8 9))

and I wish to apply some function f on the subarray

#2A((5 6)
(8 9))

I'd love to be able to write

(f (subarray *my-array* '(1 2) '(1 2))

where the subarray function takes as arguments:

- the original array
- a 2-element list with starting point and ending point on the 1st dimension
- another 2-element list with starting point and ending point on the 2nd dimension
- etc.

I am looking for some way to pass the subarray as argument to function f by reference (or by pointer?) instead of by value.

(The dumb way to address this would be to write a function that creates (in this specific case) a 2*2 array and loops over i and j copying values from the original array. However, if you are dealing relatively large arrays, this would be quite costly.)

I found there exists a cl-slice package but I do not get whether it copies values or accesses data by reference.

17 Answers

Barry Margolin

8/18/2015 3:16:00 PM

In article <3cf9857c-9893-49b3-be49-7ca4bd0a1d95@googlegroups.com>,
Alexandre Landi <alandi@lrde.epita.fr> wrote:

> Let's suppose I have an array - which I will call *my-array* - that looks
> like this:
>
> #2A((1 2 3)
> (4 5 6)
> (7 8 9))
>
> and I wish to apply some function f on the subarray
>
> #2A((5 6)
> (8 9))
>
> I'd love to be able to write
>
> (f (subarray *my-array* '(1 2) '(1 2))
>
> where the subarray function takes as arguments:
>
> - the original array
> - a 2-element list with starting point and ending point on the 1st dimension
> - another 2-element list with starting point and ending point on the 2nd
> dimension
> - etc.
>
> I am looking for some way to pass the subarray as argument to function f by
> reference (or by pointer?) instead of by value.
>
> (The dumb way to address this would be to write a function that creates (in
> this specific case) a 2*2 array and loops over i and j copying values from
> the original array. However, if you are dealing relatively large arrays, this
> would be quite costly.)
>
> I found there exists a cl-slice package but I do not get whether it copies
> values or accesses data by reference.

For a 1-dimensional array you could do this with a displaced array. But
CL doesn't have displaced arrays that refer to multi-dimensional arrays,
except by accessing a contiguous slice in the flat, row-major
representation (Lisp Machine Lisp had "conformant" arrays for this).

Perhaps you should redefine F so it takes the slice coordinates as
arguments, similar to the way the sequence functions take :START and
:END keywords.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Pascal J. Bourguignon

8/18/2015 3:40:00 PM

Alexandre Landi <alandi@lrde.epita.fr> writes:

> Let's suppose I have an array - which I will call *my-array* - that looks like this:
>
> #2A((1 2 3)
> (4 5 6)
> (7 8 9))
>
> and I wish to apply some function f on the subarray
>
> #2A((5 6)
> (8 9))
>
> I'd love to be able to write
>
> (f (subarray *my-array* '(1 2) '(1 2))
>
> where the subarray function takes as arguments:
>
> - the original array
> - a 2-element list with starting point and ending point on the 1st dimension
> - another 2-element list with starting point and ending point on the 2nd dimension
> - etc.
>
> I am looking for some way to pass the subarray as argument to function f by reference (or by pointer?) instead of by value.
>
> (The dumb way to address this would be to write a function that
> creates (in this specific case) a 2*2 array and loops over i and j
> copying values from the original array. However, if you are dealing
> relatively large arrays, this would be quite costly.)
>
> I found there exists a cl-slice package but I do not get whether it copies values or accesses data by reference.

The trick is to use closures to implement access to the subarray.
If you want a smooth integration with existing sources that use aref,
array-dimensions, etc, you can shadow those CL symbols and use symbols
with the same name.

(defun subarray-dims (subarray) (funcall subarray 'dims))
(defun subarray-rank (subarray) (funcall subarray 'rank))
(defun subarray-get (subarray &rest indices) (apply subarray 'get indices))
(defun (setf subarray-get) (new-value subarray &rest indices) (apply subarray 'set new-value indices))

(defun subarray (array corner1 corner2)
"CORNER1 and CORNER2 are lists of indices"
(let ((dimensions-of-subarray (mapcar (function -) corner2 corner1)))
(if (some (function minusp) dimensions-of-subarray)
(lambda (msg &rest arg-and-indices)
(error "Cannot index an empty subarray"))
(lambda (msg &rest arg-and-indices)
(flet ((indices (subarray-indices)
(loop
:with indices := '()
:for base :in corner1
:for max :in dimensions-of-subarray
:do (if (plusp max)
(let ((i (pop subarray-indices)))
(check-type i (and fixnum (integer 0)))
(assert (<= i max))
(push (+ base i) indices))
(push base indices))
:finally (return (nreverse indices)))))
(ecase msg
(dims (mapcar (function 1+) (remove 0 dimensions-of-subarray)))
(rank (count-if (function plusp) dimensions-of-subarray))
(get (apply (function aref) array (indices arg-and-indices)))
(set (setf (apply (function aref) array (indices (rest arg-and-indices)))
(first arg-and-indices)))))))))

(let ((a #4A((((1001 1002 1003 1004)
(1011 1012 1013 1014)
(1021 1022 1023 1024)
(1031 1032 1033 1034))
((1101 1102 1103 1104)
(1111 1112 1113 1114)
(1121 1122 1123 1124)
(1131 1132 1133 1134))
((1301 1302 1303 1304)
(1311 1312 1313 1314)
(1321 1322 1323 1324)
(1331 1332 1333 1334)))
(((2001 1002 1003 1004)
(2011 1012 1013 1014)
(2021 1022 1023 1024)
(2031 1032 1033 1034))
((2101 1102 1103 1104)
(2111 1112 1113 1114)
(2121 1122 1123 1124)
(2131 1132 1133 1134))
((2301 1302 1303 1304)
(2311 1312 1313 1314)
(2321 1322 1323 1324)
(2331 1332 1333 1334)))
(((3001 1002 1003 1004)
(3011 1012 1013 1014)
(3021 1022 1023 1024)
(3031 1032 1033 1034))
((3101 1102 1103 1104)
(3111 1112 1113 1114)
(3121 1122 1123 1124)
(3131 1132 1133 1134))
((3301 1302 1303 1304)
(3311 1312 1313 1314)
(3321 1322 1323 1324)
(3331 1332 1333 1334))))))

(let* ((b (subarray a '(0 0 0 0) '(0 0 1 2))))
(print (list :rank (subarray-rank b) :dims (subarray-dims b)))
(terpri)
(loop
:for i :below (elt (subarray-dims b) 0)
:do (loop
:for j :below (elt (subarray-dims b) 1)
:do (format t "~5D " (subarray-get b i j)))
(terpri)))

(let* ((b (subarray a '(1 1 0 0) '(1 2 2 2))))
(print (list :rank (subarray-rank b) :dims (subarray-dims b)))
(terpri)
(loop
:for i :below (elt (subarray-dims b) 0)
:do (loop
:for j :below (elt (subarray-dims b) 1)
:do (loop
:for k :below (elt (subarray-dims b) 2)
:do (format t "~5D " (subarray-get b i j k))
(setf (subarray-get b i j k) (* 2 (subarray-get b i j k))))
(terpri))
(terpri)))
(terpri)
(pprint a)
(terpri))

(:rank 2 :dims (2 3))
1001 1002 1003
1011 1012 1013

(:rank 3 :dims (2 3 3))
2101 1102 1103
2111 1112 1113
2121 1122 1123

2301 1302 1303
2311 1312 1313
2321 1322 1323

#4A((((1001 1002 1003 1004)
(1011 1012 1013 1014)
(1021 1022 1023 1024)
(1031 1032 1033 1034))
((1101 1102 1103 1104)
(1111 1112 1113 1114)
(1121 1122 1123 1124)
(1131 1132 1133 1134))
((1301 1302 1303 1304)
(1311 1312 1313 1314)
(1321 1322 1323 1324)
(1331 1332 1333 1334)))
(((2001 1002 1003 1004)
(2011 1012 1013 1014)
(2021 1022 1023 1024)
(2031 1032 1033 1034))
((4202 2204 2206 1104)
(4222 2224 2226 1114)
(4242 2244 2246 1124)
(2131 1132 1133 1134))
((4602 2604 2606 1304)
(4622 2624 2626 1314)
(4642 2644 2646 1324)
(2331 1332 1333 1334)))
(((3001 1002 1003 1004)
(3011 1012 1013 1014)
(3021 1022 1023 1024)
(3031 1032 1033 1034))
((3101 1102 1103 1104)
(3111 1112 1113 1114)
(3121 1122 1123 1124)
(3131 1132 1133 1134))
((3301 1302 1303 1304)
(3311 1312 1313 1314)
(3321 1322 1323 1324)
(3331 1332 1333 1334))))

--
__Pascal Bourguignon__ http://www.informat...
â??The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.â? -- Carl Bass CEO Autodesk

Marco Antoniotti

8/19/2015 6:59:00 AM

On Tuesday, August 18, 2015 at 5:45:07 PM UTC+3, Alexandre Landi wrote:
> Let's suppose I have an array - which I will call *my-array* - that looks like this:
>
> #2A((1 2 3)
> (4 5 6)
> (7 8 9))
>
> and I wish to apply some function f on the subarray
>
> #2A((5 6)
> (8 9))
>
> I'd love to be able to write
>
> (f (subarray *my-array* '(1 2) '(1 2))
>
> where the subarray function takes as arguments:
>
> - the original array
> - a 2-element list with starting point and ending point on the 1st dimension
> - another 2-element list with starting point and ending point on the 2nd dimension
> - etc.
>
> I am looking for some way to pass the subarray as argument to function f by reference (or by pointer?) instead of by value.
>
> (The dumb way to address this would be to write a function that creates (in this specific case) a 2*2 array and loops over i and j copying values from the original array. However, if you are dealing relatively large arrays, this would be quite costly.)
>
> I found there exists a cl-slice package but I do not get whether it copies values or accesses data by reference.

The CL-SLICE package (thank you! :) ) copies when it cannot displace. AFAIK there are no "mainstream" languages that do not copy when slicing a n-dimensional array. The reason is that arrays (matrices) are usually kept in row-major order (CL, C/C++ etc) or in column-major order (FORTRAN and whatever uses FORTRAN as a base: R and Matlab come to mind).

Cheers
--
MA

Kaz Kylheku

8/19/2015 2:17:00 PM

On 2015-08-19, Marco Antoniotti <marcoxa@gmail.com> wrote:
> On Tuesday, August 18, 2015 at 5:45:07 PM UTC+3, Alexandre Landi wrote:
>> Let's suppose I have an array - which I will call *my-array* - that looks like this:
>>
>> #2A((1 2 3)
>> (4 5 6)
>> (7 8 9))
>>
>> and I wish to apply some function f on the subarray
>>
>> #2A((5 6)
>> (8 9))
>>
>> I'd love to be able to write
>>
>> (f (subarray *my-array* '(1 2) '(1 2))
>>
>> where the subarray function takes as arguments:
>>
>> - the original array
>> - a 2-element list with starting point and ending point on the 1st dimension
>> - another 2-element list with starting point and ending point on the 2nd dimension
>> - etc.
>>
>> I am looking for some way to pass the subarray as argument to function f by
>> reference (or by pointer?) instead of by value.
>>
>> (The dumb way to address this would be to write a function that creates (in
>> this specific case) a 2*2 array and loops over i and j copying values from
>> the original array. However, if you are dealing relatively large arrays,
>> this would be quite costly.)
>>
>> I found there exists a cl-slice package but I do not get whether it copies
>> values or accesses data by reference.
>
> The CL-SLICE package (thank you! :) ) copies when it cannot displace. AFAIK
> there are no "mainstream" languages that do not copy when slicing a
> n-dimensional array. The reason is that arrays (matrices) are usually kept
> in row-major order (CL, C/C++ etc) or in column-major order (FORTRAN and
> whatever uses FORTRAN as a base: R and Matlab come to mind).

That's hardly a reason. If you want non-copying slicing badly enough, you
can make it work for a sparse hash table, even.

Every entry in the slice view can independently alias the corresponding
original location, if such extremes are necessary.

A displacement from one array to another, regardless of the number of
dimensions, is basically just a coordinate transform with bounds checking.

1. Check that the coordinates are in the aliased view, else reject.
2. Apply coordinate transformation.
3. Check that the coordinates are in the target array, else reject (if
out-of-bounds slices are allowed, or targets can be resized after a slice is
created, so that views become partially or wholly out-of-bounds).
4. Delegate the access to the target array.

No knowledge of the storage implementation of the target array should
be required, other than for clever optimizing.

Barry Margolin

8/19/2015 3:21:00 PM

In article <20150819071043.139@kylheku.com>,
Kaz Kylheku <kaz@kylheku.com> wrote:

> A displacement from one array to another, regardless of the number of
> dimensions, is basically just a coordinate transform with bounds checking.

The issue isn't whether it's possible -- it's "just math", after all.
The issue is performance. Arrays are typically used because they're
really efficient -- almost all CPUs have had indexing instructions or
addressing modes that allow optimizing array accesses. When you throw in
the extra transformations necessary for conformant slicing, though,
these methods cannot be used.

However, maybe this bias is a remnant of decades-old thinking. Processor
speeds have increased many orders of magnitude, and we can probably
afford to do the extra work now. Also, maybe there's some way to
leverage coprocessors like GPUs, which are designed to handle complex
transformations.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Marco Antoniotti

8/19/2015 4:23:00 PM

On Wednesday, August 19, 2015 at 6:20:48 PM UTC+3, Barry Margolin wrote:
> In article <20150819071043.139@kylheku.com>,
> Kaz Kylheku <kaz@kylheku.com> wrote:
>
> > A displacement from one array to another, regardless of the number of
> > dimensions, is basically just a coordinate transform with bounds checking.
>
> The issue isn't whether it's possible -- it's "just math", after all.
> The issue is performance. Arrays are typically used because they're
> really efficient -- almost all CPUs have had indexing instructions or
> addressing modes that allow optimizing array accesses. When you throw in
> the extra transformations necessary for conformant slicing, though,
> these methods cannot be used.
>
> However, maybe this bias is a remnant of decades-old thinking. Processor
> speeds have increased many orders of magnitude, and we can probably
> afford to do the extra work now. Also, maybe there's some way to
> leverage coprocessors like GPUs, which are designed to handle complex
> transformations.
>

+1

--
MA

Marco Antoniotti

8/19/2015 4:23:00 PM

On Wednesday, August 19, 2015 at 5:16:47 PM UTC+3, Kaz Kylheku wrote:
> On 2015-08-19, Marco Antoniotti <marcoxa@gmail.com> wrote:
> > On Tuesday, August 18, 2015 at 5:45:07 PM UTC+3, Alexandre Landi wrote:
> >> Let's suppose I have an array - which I will call *my-array* - that looks like this:
> >>
> >> #2A((1 2 3)
> >> (4 5 6)
> >> (7 8 9))
> >>
> >> and I wish to apply some function f on the subarray
> >>
> >> #2A((5 6)
> >> (8 9))
> >>
> >> I'd love to be able to write
> >>
> >> (f (subarray *my-array* '(1 2) '(1 2))
> >>
> >> where the subarray function takes as arguments:
> >>
> >> - the original array
> >> - a 2-element list with starting point and ending point on the 1st dimension
> >> - another 2-element list with starting point and ending point on the 2nd dimension
> >> - etc.
> >>
> >> I am looking for some way to pass the subarray as argument to function f by
> >> reference (or by pointer?) instead of by value.
> >>
> >> (The dumb way to address this would be to write a function that creates (in
> >> this specific case) a 2*2 array and loops over i and j copying values from
> >> the original array. However, if you are dealing relatively large arrays,
> >> this would be quite costly.)
> >>
> >> I found there exists a cl-slice package but I do not get whether it copies
> >> values or accesses data by reference.
> >
> > The CL-SLICE package (thank you! :) ) copies when it cannot displace. AFAIK
> > there are no "mainstream" languages that do not copy when slicing a
> > n-dimensional array. The reason is that arrays (matrices) are usually kept
> > in row-major order (CL, C/C++ etc) or in column-major order (FORTRAN and
> > whatever uses FORTRAN as a base: R and Matlab come to mind).
>
> That's hardly a reason. If you want non-copying slicing badly enough, you
> can make it work for a sparse hash table, even.
>
> Every entry in the slice view can independently alias the corresponding
> original location, if such extremes are necessary.
>
> A displacement from one array to another, regardless of the number of
> dimensions, is basically just a coordinate transform with bounds checking.
>
> 1. Check that the coordinates are in the aliased view, else reject.
> 2. Apply coordinate transformation.
> 3. Check that the coordinates are in the target array, else reject (if
> out-of-bounds slices are allowed, or targets can be resized after a slice is
> created, so that views become partially or wholly out-of-bounds).
> 4. Delegate the access to the target array.
>
> No knowledge of the storage implementation of the target array should
> be required, other than for clever optimizing.

All of this is fine but it is not displacing in the Common Lisp sense (or in the FORTRAN sense, if available; my FORTRAN-fu is quite low).

Given the row-major layout of CL arrays you can get some form of "displacing" in certain cases.

As for the rest, you have to resort to this or that trick (PJB uses closures) to do the coordinate transform. That is what CL-SLICE does.

Cheers
--
MA

Kaz Kylheku

8/19/2015 5:56:00 PM

On 2015-08-19, Barry Margolin <barmar@alum.mit.edu> wrote:
> In article <20150819071043.139@kylheku.com>,
> Kaz Kylheku <kaz@kylheku.com> wrote:
>
>> A displacement from one array to another, regardless of the number of
>> dimensions, is basically just a coordinate transform with bounds checking.
>
> The issue isn't whether it's possible -- it's "just math", after all.
> The issue is performance.

Ah, performance: that last resort of the scoundrel.

> Arrays are typically used because they're
> really efficient -- almost all CPUs have had indexing instructions or
> addressing modes that allow optimizing array accesses.

But we can't forget that arrays are also used because they allow
random access. For that reason, the arrays abstraction is sometimes
used even when emulated by hashing (where the addressing modes don't
help).

They are also used because of the arithmetic relationship among the places of
related elements; not only can we randomly access an element, but perform
calculations on the place to access some other, related element, at some
displaced location in any direction.

All this is retained evn if we go through some fairly costly coordinate transform
on each access.

rwiker

8/19/2015 7:04:00 PM

Kaz Kylheku <kaz@kylheku.com> writes:

> On 2015-08-19, Barry Margolin <barmar@alum.mit.edu> wrote:
>> In article <20150819071043.139@kylheku.com>,
>> Kaz Kylheku <kaz@kylheku.com> wrote:
>>
>>> A displacement from one array to another, regardless of the number of
>>> dimensions, is basically just a coordinate transform with bounds checking.
>>
>> The issue isn't whether it's possible -- it's "just math", after all.
>> The issue is performance.
>
> Ah, performance: that last resort of the scoundrel.
>
>> Arrays are typically used because they're
>> really efficient -- almost all CPUs have had indexing instructions or
>> addressing modes that allow optimizing array accesses.
>
> But we can't forget that arrays are also used because they allow
> random access. For that reason, the arrays abstraction is sometimes
> used even when emulated by hashing (where the addressing modes don't
> help).
>
> They are also used because of the arithmetic relationship among the places of
> related elements; not only can we randomly access an element, but perform
> calculations on the place to access some other, related element, at some
> displaced location in any direction.
>
> All this is retained evn if we go through some fairly costly
> coordinate transform
> on each access.

Note that, in the case of using a sub-volume of an n-dimensional array,
the only additional work is a few more additions when transforming a
multidimensional index into a single-dimensional offset. You don't need
to bounds-check against the original array, as you know that any valid
point in the displaced array are also part of the original array.

So, the cost of n-dimensional array displacement may well be lower than
most people expect.

Kaz Kylheku

8/19/2015 9:04:00 PM

On 2015-08-19, Raymond Wiker <rwiker@gmail.com> wrote:
> Kaz Kylheku <kaz@kylheku.com> writes:
>
>> All this is retained evn if we go through some fairly costly
>> coordinate transform
>> on each access.
>
> Note that, in the case of using a sub-volume of an n-dimensional array,
> the only additional work is a few more additions when transforming a
> multidimensional index into a single-dimensional offset. You don't need
> to bounds-check against the original array, as you know that any valid
> point in the displaced array are also part of the original array.

[Covered in step (3) in my original posting.]

comp.lang.lisp

Pass subarray by reference (not by value) in Common Lisp

Alexandre

Barry Margolin

Pascal J. Bourguignon

Marco Antoniotti

Kaz Kylheku

Barry Margolin

Marco Antoniotti

Marco Antoniotti

Kaz Kylheku

rwiker

Kaz Kylheku

x Login to ForumsZone