Yeah, the 2/8 combo was probably what we went with in the product as well. The 512k was more of a concept demo shoehorned into an existing product.
The next thing we did was make a version of our CPU with an MMU, designed to work optimally with Linux. (The first version was based on the uClinux concept: a kernel without MMU support, and user-space programs that couldn't fully rely on fork() or mmap().) After a year or two with MMU-less Linux, it was like heaven to be able to run on an MMU :)
The 512k is impressive.