### SWAR implementation of (some #'zerop ...)

SWAR (SIMD Within A Register) codes can portably express short-vector parallelism for operations on small packed data (e.g. byte or even nybble vectors). A trivial application of the technique is when we test whether any bit in a word is set (equal to 1) by comparing the whole word against 0. Obviously, that also work to test whether any field (of arbitrary width) is not filled with 0. This document, from 1997, provides a fairly clear and complete overview.

Just like testing whether any bit is set, it is easy to find whether
some bit is *not* set (it's simply the opposite of whether every bit
in the word is set). Things are more complex when the data
are wider than a single bit (but obviously narrower than a full
word). I found a short implementation (and barely tested it), but
it might be possible to do even shorter. Skip to the series of
asterisks if you want to solve that puzzle (to efficiently find
whether any field in a sequence of data, itself packed into a single
word, is 0) yourself.

To simplify the description, I'll assume that we're working with 4-bit-wide fields. It should be clear how the code can be adapted to other widths or even mixed widths.

Let `x = aaaabbbbccccdddd...`

be the input.

1. `x' = x | (x >> 1)`

.
The first bit in each field is now, for our purposes, noise. However,
some of the remaining 3 bits are now non-zero iff some the original 4
were.

2. `ones = x' & ~0x8888...`

. The noise bits are masked out.

3. `borrow = 0x8888... - ones`

. The first bit of each field
in `borrow`

is 0 iff some of the 3 other bits in `ones`

aren't (iff some
of the 4 bits in `x`

weren't).

4. `result = borrow & 0x8888...`

is zero iff the first bit
of every field in `borrow`

is 0 (iff every field in `x`

was non-null).

And, finally, that is easy to test for, a word at a time. In the end,
it takes 5 (hopelessly serial) operations (`>>`

, `|`

, `&`

, `-`

and `&`

) and a
conditional branch.

`****`

Testing whether any field in a word is filled with 0 may seem
incredibly obscure. Apart from ```
(some #'zerop
[packed-unboxed-vector])
```

, what use is there for such code sequences?
One trick is to exploit `xor`

. `xor`

lets us compare multiple fields at a
time: a field is 0 in `a xor b`

iff the corresponding fields in `a`

and `b`

are identical (bit for bit). Now that we can determine when at least
one pair of fields is equal, it's simple to implement, e.g., default
`FIND`

on specialised vectors without having to test each datum
separately (until the very end, when we know that one of the pairs is
equal, but not which). As usual, a full implementation, capable of
dealing with `:start`

, `:end`

and displaced vectors is a lot more work.