You'll need to be specific as to what you mean by "toy". If you mean Mlucas, it's had Arm64 128-bit SIMD assembly support since 2017. I no longer run it on my little Odroid because that is too slow; I currently run it only on the last of the batch of 12 Samsung Galaxy 7 Android broke-o-phones I bought in the used/for-parts market in early 2019.

ATM, based on test-compiles from Laurent Desnogues, I am making some code-fiddles to try to get it to build using the Clang compiler on the new Apple M1. The instruction set is identical, and I use an older version of Clang myself on my old Macbook classic, where I do most of my code editing and proof-of-principle work. But the version of Clang on the M1 is doing some macro-inlining-related optimizations which GCC-on-Arm64 does not do (lowering the -O* level did not cure this), and Laurent hit "ran out of registers" errors on a pair of core-FFT macros which max out the GCC macro-arglist limit of 30. Hopefully I'll have something buildable by the new year.
It seems the M1 has some undocumented instructions that may be useful for Mlucas:
