I have a fairly large dataset (~1.4m rows) that I'm doing some splitting and summarizing on. The whole thing takes a while to run, and my final application depends on frequent running, so my thought was to use I have a fairly large dataset (~1.4m rows) that