Asp Forum - Probabilistic BDD?

Robert Feldt

11/1/2006 5:51:00 AM

Hi,

I'm playing around with BDD à la test/spec and found that I need to
specify properties probabilistically, i.e. saying that they are
likely/unlikely. Has there been any previous work along these lines?

Should we add something like this to test/spec (Christian, are you listening? :))

diff -rN old-testspec/lib/test/spec.rb new-testspec/lib/test/spec.rb
286a287,300
>   def unlikely(specname, probability = 0.01, preName = "unlikely", &block)
>     specify(preName + " " + specname) do
>       count = 0
>       num_repetitions = 100
>       num_repetitions.times {count += 1 if (block.call == true)}
>       actual_probability = count.to_f / num_repetitions
>       assert actual_probability <= probability, "Expected probability of #{probability} but was #{actual_probability} (#{count} in #{num_repetitions} repetitions)"
>     end
>   end
>
>   def likely(specname, probability = 0.99, &block)
>     unlikely(specname, 1.0 - probability, "likely") {!block.call}
>   end
>

so that one can write specs like

> # Example of probabilistic specifications
>
> context "random generation" do
>   unlikely "that two consecutive calls to rand gives the same value" do
>     rand(1000) == rand(1000)
>   end
>
>   likely "that two consecutive calls to rand gives different values" do
>     rand(1000) != rand(1000)
>   end
> end

?

This should probably be generalized so that the number of repetitions
to run depends on the probability of the event, but I think you get the
idea.
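
As a rough sketch of that generalization (not part of the patch above): one
simple heuristic is to run enough repetitions that the rarer of the two
outcomes is expected to show up about ten times. The helper name
`repetitions_for` and its `expected_hits` and `minimum` parameters are
hypothetical, and the value of ten is an arbitrary assumption.

```ruby
# Hypothetical helper, not part of test/spec: pick a repetition count so that
# the rarer outcome (the event or its complement) is expected to occur about
# `expected_hits` times, never running fewer than `minimum` repetitions.
def repetitions_for(probability, expected_hits = 10, minimum = 100)
  rare = [probability, 1.0 - probability].min
  unless rare > 0.0
    raise ArgumentError, "probability must be strictly between 0 and 1"
  end
  [(expected_hits / rare).ceil, minimum].max
end

puts repetitions_for(0.01)
puts repetitions_for(0.5)
```

With something like this, `num_repetitions = 100` in the patch could become
`num_repetitions = repetitions_for(probability)`, so rarer events get more
repetitions.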

Comments?

/Robert Feldt

7 Answers

Robert Feldt

11/1/2006 6:02:00 AM

> > # Example of probabilistic specifications
> >
> > context "random generation" do
> >   unlikely "that two consecutive calls to rand gives the same value" do
> >     rand(1000) == rand(1000)
> >   end
> >
> >   likely "that two consecutive calls to rand gives different values" do
> >     rand(1000) != rand(1000)
> >   end
> > end
>
Forgot one thing:

A decision has to be made on whether setup code should run before each
evaluation of the block or only once before the whole series of
repetitions. To be more in line with the rest of test/spec, maybe setup
code should run before every evaluation of the unlikely/likely blocks?

/Robert Feldt

Ben Nagy

11/1/2006 6:24:00 AM

> -----Original Message-----
> From: Robert Feldt [mailto:robert.feldt@gmail.com]
> Sent: Wednesday, November 01, 2006 12:51 PM
> To: ruby-talk ML
> Subject: Probabilistic BDD?
>
> Hi,
>
> I'm playing around with BDD à la test/spec and found that I need to
> specify properties probabilistically, i.e. saying that they are
> likely/unlikely. Has there been any previous work along these lines?
>
> Should we add something like this to test/spec (Christian are
> you listening? :))
[...]
> so that one can write specs like
>
> > # Example of probabilistic specifications
> >
> > context "random generation" do
> >   unlikely "that two consecutive calls to rand gives the same value" do
> >     rand(1000) == rand(1000)
> >   end
> >
> >   likely "that two consecutive calls to rand gives different values" do
> >     rand(1000) != rand(1000)
> >   end
> > end
>
> ?
>
> This should probably be generalized so that the number of repetitions
> to run depends on the probability of the event but I think you get the
> idea.
>
> Comments?

I think that the example you give is not appropriate for testing rand(), or
for pretty much any code where the result is expected to conform to a set of
statistical properties. If you take a look at randomness test suites like
Diehard, there is a battery of different tests that should be applied before
data can be called 'random' with any confidence.

http://en.wikipedia.org/wiki/Die...

The tests as you have written them would be satisfied by any number of
broken PRNGs, or even NRAAGs (Not Random At All Generators) (e.g. alternating
'1' and '2' ;). In particular, unlikely events must occur sometimes and
likely events must fail to occur sometimes, so some form of === seems better
than <=.
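
To illustrate the point about === versus <=, here is a minimal sketch of a
two-sided check: the observed number of hits must be close to the expected
number, not merely below a ceiling. It uses the normal approximation to the
binomial distribution, which is only rough when the expected count is small.
The method name `frequency_consistent?` and the three-sigma default are
assumptions made for illustration.

```ruby
# Hypothetical two-sided check: is the observed hit count within `sigmas`
# standard deviations of the count expected for the given probability?
# Uses the normal approximation to the binomial distribution.
def frequency_consistent?(hits, repetitions, probability, sigmas = 3.0)
  expected = repetitions * probability
  std_dev  = Math.sqrt(repetitions * probability * (1.0 - probability))
  (hits - expected).abs <= sigmas * std_dev
end

# An event specified at 1% that never occurs in 10,000 runs is rejected,
# even though it would pass a plain "<=" comparison.
puts frequency_consistent?(100, 10_000, 0.01)
puts frequency_consistent?(0, 10_000, 0.01)
```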

If you wanted to test RNGs then you need to run a whole series of tests -
either like the Diehard tests or just basic stuff like chi-square, binomial,
Monte Carlo calculation of pi etc.

More generally, I think that 'likely' and 'unlikely' are going to be so
context-dependent that the user would be better off writing their own test
code, surely? I can see a place for should_be_random, but likely and
unlikely strike me as a bad idea. In any case, when running test code I
expect that it will give me the same result every time, so any tests should
at least have that property.
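
One way to get the "same result every time" property, sketched under the
assumption that the code under test can be handed its own generator: seed a
`Random` instance explicitly instead of relying on the global `rand`. The
method name `observed_frequency` and the seed value are hypothetical.

```ruby
# Hypothetical helper: estimate how often two consecutive draws collide,
# using an explicitly seeded generator so every run gives the same answer.
def observed_frequency(repetitions, seed)
  rng  = Random.new(seed)
  hits = 0
  repetitions.times { hits += 1 if rng.rand(1000) == rng.rand(1000) }
  hits.to_f / repetitions
end

first  = observed_frequency(10_000, 42)
second = observed_frequency(10_000, 42)
puts first == second   # same seed, same sequence, same frequency
```

The trade-off is that a fixed seed only ever exercises one sequence of
draws, so it makes the test repeatable rather than more thorough.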

Sorry to sound negative. :(

ben

Robert Feldt

11/1/2006 7:40:00 AM

> I think that the example you give is not appropriate for testing rand(), and
> pretty much any code where the result is expected to conform to a set of
> statistical properties. If you take a look at randomness test suites like
> Diehard there are a battery of different tests that should be applied before
> data can be called 'random' with any confidence.
>
> http://en.wikipedia.org/wiki/Die...
>
> The tests as you have written them would be satisfied by any number of
> broken PRNGs, or even NRAAGs (Not Random At All Generators) (eg alternating
> '1' and '2' ;). In particular, unlikely events must occur sometimes and
> likely events must fail to occur sometimes, so some form of === seems better
> than <=.
>
> If you wanted to test RNGs then you need to run a whole series of tests -
> either like the Diehard tests or just basic stuff like chi square, binomial,
> monte-carlo calculation of pi etc.
>
I don't want to test RNGs; that was just the smallest possible
example use of the likely/unlikely methods I could think of. I'm
fairly well versed in RNG testing, thank you.

> More generally, I think that 'likely' and 'unlikely' are going to be so
> context dependant that the user would be better off writing their own test
> code, surely? I can see a place for should_be_random, but likely and
> unlikely strike me as a bad idea. In any case, when running test code I
> expect that it will give me the same result every time, so any tests should
> at least have that property.
>
> Sorry to sound negative. :(
>
It is OK to be negative, but I have run into test situations many times
where there is an element of varying behavior involved, and specifying
exactly what is to be expected can only be done by running multiple
tests and making claims about overall properties of the results.

But it may be the case that people should write their own test code for it, yes.

Still, I think this is an important discussion in general: for
complex algorithms where it is costly to calculate the exact expected
results, ways to write partial specs are important.

Regards,

Robert

Wilson Bilkovich

11/1/2006 3:40:00 PM

On 11/1/06, Robert Feldt <robert.feldt@gmail.com> wrote:
> > I think that the example you give is not appropriate for testing rand(), and
> > pretty much any code where the result is expected to conform to a set of
> > statistical properties. If you take a look at randomness test suites like
> > Diehard there are a battery of different tests that should be applied before
> > data can be called 'random' with any confidence.
> >
> > http://en.wikipedia.org/wiki/Die...
> >
> > The tests as you have written them would be satisfied by any number of
> > broken PRNGs, or even NRAAGs (Not Random At All Generators) (eg alternating
> > '1' and '2' ;). In particular, unlikely events must occur sometimes and
> > likely events must fail to occur sometimes, so some form of === seems better
> > than <=.
> >
> > If you wanted to test RNGs then you need to run a whole series of tests -
> > either like the Diehard tests or just basic stuff like chi square, binomial,
> > monte-carlo calculation of pi etc.
> >
> I don't want to test RNG's; that was just the smallest possible
> example use of the likely/unlikely methods I could think of. I'm
> fairly well versed in RNG testing, thank you.
>
> > More generally, I think that 'likely' and 'unlikely' are going to be so
> > context dependant that the user would be better off writing their own test
> > code, surely? I can see a place for should_be_random, but likely and
> > unlikely strike me as a bad idea. In any case, when running test code I
> > expect that it will give me the same result every time, so any tests should
> > at least have that property.
> >
> > Sorry to sound negative. :(
> >
> It is ok to be negative but I have run into test situations many times
> where there is an element of varying behavior involved and specifying
> exactly what is to be expected can only be done by running multiple
> tests and making claims about overall properties of the results.
>
> But it may be the case that people should write their own test code for it yes.
>
> Still I think this is an important discussion in general since for
> complex algorithms where it is costly to calc the exact expected
> results ways to write partial specs are important.
>

Personally, I would do something like:

def something_run_a_bunch_of_times
  results = []
  100_000.times {results << whatever_is_being_verified}
  results.matches_statistical_requirements_of_domain?
end

specify "should be totally awesome" do
  something_run_a_bunch_of_times.should.be true
end
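
As one possible way to fill in the placeholders above, assume the thing
being verified is a six-sided die and the domain requirement is that the
faces come up uniformly. The sketch below uses a chi-square goodness-of-fit
statistic; `roll_die`, `chi_square_uniform` and `die_rolls_look_uniform?`
are hypothetical names, and the seed is arbitrary.

```ruby
# Chi-square statistic for observed counts against a uniform expectation.
def chi_square_uniform(counts)
  expected = counts.sum.to_f / counts.size
  counts.sum { |observed| (observed - expected)**2 / expected }
end

# Hypothetical stand-in for `whatever_is_being_verified`: one roll of a die.
def roll_die(rng)
  rng.rand(6) + 1
end

# Hypothetical stand-in for `matches_statistical_requirements_of_domain?`.
def die_rolls_look_uniform?(rolls = 60_000, seed = 2006)
  rng    = Random.new(seed)
  counts = Array.new(6, 0)
  rolls.times { counts[roll_die(rng) - 1] += 1 }
  # 15.086 is the chi-square critical value for 5 degrees of freedom
  # at the 1% significance level.
  chi_square_uniform(counts) <= 15.086
end

puts chi_square_uniform([10, 10, 10, 10, 10, 10])
puts die_rolls_look_uniform?
```

A spec could then assert on `die_rolls_look_uniform?` in the same way as
`something_run_a_bunch_of_times` above.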

M. Edward (Ed) Borasky

11/1/2006 3:45:00 PM

Robert Feldt wrote:
>> I think that the example you give is not appropriate for testing
>> rand(), and
>> pretty much any code where the result is expected to conform to a set of
>> statistical properties. If you take a look at randomness test suites like
>> Diehard there are a battery of different tests that should be applied
>> before
>> data can be called 'random' with any confidence.
>>
>> http://en.wikipedia.org/wiki/Die...

Speaking of numerical test suites, do you happen to know of any test
suites online for elementary functions? I used to know of one, but
haven't been able to find it. Nor have I been able to find my copy of
"Software Manual for the Elementary Functions".

And I'm *not* talking about "paranoia" ... that just tests arithmetic,
and I found it.

Robert Feldt

11/1/2006 5:33:00 PM

> Personally, I would do something like:
>
> def something_run_a_bunch_of_times
>   results = []
>   100_000.times {results << whatever_is_being_verified}
>   results.matches_statistical_requirements_of_domain?
> end
>
> specify "should be totally awesome" do
>   something_run_a_bunch_of_times.should.be true
> end
>
Yes, I'll probably keep my own set of test/spec extensions for now.

lambda {some bool test}.should.be.unlikely

is kind of tempting though... ;)

Thanks for your input,

/Robert

Christian Neukirchen

11/2/2006 11:02:00 AM

"Robert Feldt" <robert.feldt@gmail.com> writes:

>> Personally, I would do something like:
>>
>> def something_run_a_bunch_of_times
>>   results = []
>>   100_000.times {results << whatever_is_being_verified}
>>   results.matches_statistical_requirements_of_domain?
>> end
>>
>> specify "should be totally awesome" do
>>   something_run_a_bunch_of_times.should.be true
>> end
>>
> Yes, I'll probably keep my own set of test/spec extensions for now.
>
> lambda {some bool test}.should.be.unlikely
>
> is kind of tempting though... ;)

I would add these methods to the test/spec distribution, but I don't
want to add new kinds of contexts, at least not before 1.0.

Just send me a patch.

> Thanks for your input,
>
> /Robert
--
Christian Neukirchen <chneukirchen@gmail.com> http://chneuk...

comp.lang.ruby
