Finished a 12-week internship at Meta

I worked on GPU-accelerated decoding of Nimble files in the open-source query engine Velox. The implementation is based on Wave, an alternative to libcudf that uses aggressive kernel fusion to reduce memory round-trips. The three main challenges were (1) decoding data with nested encodings without materializing intermediate results; (2) handling chunked data efficiently; and (3) evaluating filters while decoding. PRs are available here. I will also present this work at the Velox Tech Talk in September. Stay tuned!
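
To give a rough feel for challenges (1) and (3), here is a minimal CPU-side sketch in plain C++. It is not the Wave kernels and not Nimble's actual layout: the `DictOverRle` struct, the `RleRun` struct, and the `decodeWithFilter` function are hypothetical names made up for this illustration. It assumes a dictionary encoding whose index stream is itself run-length encoded, and shows how decoding and filtering can happen in one pass without ever writing out the expanded index vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical layout for illustration only (not Nimble's real format):
// a dictionary encoding whose index stream is run-length encoded.
struct RleRun {
  uint32_t dictIndex; // index into the dictionary
  uint32_t length;    // number of consecutive rows sharing this index
};

struct DictOverRle {
  std::vector<int64_t> dictionary;
  std::vector<RleRun> runs;
};

struct FilteredColumn {
  std::vector<uint32_t> rows;  // row numbers that passed the filter
  std::vector<int64_t> values; // decoded values for those rows
};

// Decodes and filters in a single pass. The RLE-expanded index vector is
// never materialized: each run is resolved through the dictionary once,
// and the predicate is evaluated once per run rather than once per row.
template <typename Predicate>
FilteredColumn decodeWithFilter(const DictOverRle& column, Predicate keep) {
  FilteredColumn out;
  uint32_t row = 0; // row number of the first row in the current run
  for (const RleRun& run : column.runs) {
    assert(run.dictIndex < column.dictionary.size());
    const int64_t value = column.dictionary[run.dictIndex];
    if (keep(value)) {
      for (uint32_t i = 0; i < run.length; ++i) {
        out.rows.push_back(row + i);
        out.values.push_back(value);
      }
    }
    row += run.length; // skipped runs still advance the row counter
  }
  return out;
}
```

For example, with the dictionary `{10, 20, 30}`, runs `{0, 2}, {2, 3}, {1, 1}`, and the filter `value >= 20`, the first run (rows 0 and 1) is rejected with a single comparison, and only rows 2 through 5 are written to the output. A GPU version would split this work across threads, which this sequential sketch does not attempt to show.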