jasonp: thank you for your hints.

I already had Barrett modular multiplication on my radar but haven't taken a deeper look at it so far.
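
For anyone following along, here is a minimal sketch of the Barrett idea in Python. It is not the kernel code (which does not use Barrett yet), just an illustration under some assumptions: the helper names barrett_setup, barrett_reduce and pow2_mod are made up for this example, and Python's big integers stand in for the multi-word arithmetic a GPU kernel would need. The point is that the division by the factor candidate q is replaced by a multiplication with a precomputed constant plus a shift.

```python
def barrett_setup(q):
    """Precompute k = bit length of q and mu = floor(4**k / q) (hypothetical helper)."""
    k = q.bit_length()
    mu = (1 << (2 * k)) // q
    return k, mu


def barrett_reduce(x, q, k, mu):
    """Return x mod q for 0 <= x < q*q without dividing by q."""
    t = (x * mu) >> (2 * k)  # estimate of x // q: never too large, only slightly too small
    r = x - t * q
    while r >= q:            # small correction step for the underestimate
        r -= q
    return r


def pow2_mod(p, q):
    """Compute 2**p mod q by left-to-right square-and-multiply with Barrett reduction."""
    k, mu = barrett_setup(q)
    r = 1
    for bit in bin(p)[2:]:
        r = barrett_reduce(r * r, q, k, mu)      # square
        if bit == "1":
            r = barrett_reduce(2 * r, q, k, mu)  # multiply by the base 2
    return r
```

A candidate q divides 2^p - 1 exactly when pow2_mod(p, q) returns 1, e.g. for p = 11 and q = 23.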

Current speed: 30 million factor candidates checked per second for a 26-bit exponent with factors up to 71 bits :)

Next on my to-do list:
- more flexible input (e.g. the exponents are still hardcoded in the host code)
- presieving of factor candidates
- reduce the amount of data transferred from/to device
- interleave host/device code
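
To make the presieving item a bit more concrete, here is a rough sketch in Python of what I have in mind. It is only an assumption about how it could look, not existing code: the function name presieve, its parameters and the choice of small primes are made up for illustration. It relies on two known facts: any factor of 2^p - 1 has the form q = 2kp + 1, and q must be 1 or 7 mod 8. Candidates divisible by a small prime are struck out before the expensive modular exponentiation is run on them.

```python
def presieve(p, k_start, k_count, small_primes=(3, 5, 7, 11, 13)):
    """Return candidates q = 2*k*p + 1 for k in [k_start, k_start + k_count)
    that survive sieving by small_primes and the q mod 8 test.

    Assumes every q is larger than the sieve primes (true for large exponents);
    a q equal to a sieve prime would be struck out by mistake.
    Needs Python 3.8+ for pow(x, -1, m).
    """
    alive = [True] * k_count
    for s in small_primes:
        if (2 * p) % s == 0:
            continue  # then q = 1 (mod s) for every k, so s never divides q
        # Solve 2*k*p + 1 = 0 (mod s) for k: k = -(2p)^-1 (mod s)
        k0 = (-pow(2 * p, -1, s)) % s
        first = (k0 - k_start) % s         # index of the first multiple of s
        for i in range(first, k_count, s):
            alive[i] = False               # strike every s-th candidate
    survivors = []
    for i, ok in enumerate(alive):
        q = 2 * (k_start + i) * p + 1
        if ok and q % 8 in (1, 7):         # factors of 2^p - 1 are +-1 mod 8
            survivors.append(q)
    return survivors


print(presieve(11, 1, 10))  # → [23, 89, 199]
```

Of the ten candidates for p = 11 only three are left, and two of them (23 and 89) are the actual factors of 2^11 - 1 = 2047. Doing this on the host would also cut down the number of candidates that have to be sent to the device.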
