Limitations for Gazelle Plugin
Spark compatibility
Currently, Gazelle Plugin works with Spark 3.1.x and 3.2.x.
Operator limitations
All performance-critical operators in TPC-H/TPC-DS should be supported. For unsupported operators, Gazelle Plugin automatically falls back to the row-based operators in vanilla Spark.
Columnar Projection with Filter
We use a 16-bit selection vector for the filter, so the max batch size must be < 65536.
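The limit follows from how a selection vector works: the filter records the positions of the passing rows rather than copying them, and a 16-bit position can only address rows 0 through 65535. The following is a minimal sketch of that idea, not Gazelle Plugin's actual implementation; the function name `FilterGreaterThan` and the `int32_t` column type are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Gazelle Plugin): returns the positions of
// the rows in `values` that are strictly greater than `threshold`.
std::vector<uint16_t> FilterGreaterThan(const std::vector<int32_t>& values,
                                        int32_t threshold) {
  // Limit stated in this section: the batch must hold fewer than 65536 rows,
  // otherwise a row position would not fit in a 16-bit entry.
  assert(values.size() < 65536);

  std::vector<uint16_t> selection;
  selection.reserve(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i] > threshold) {
      // Store the row position only; the row data itself is not copied.
      selection.push_back(static_cast<uint16_t>(i));
    }
  }
  return selection;
}
```

Each selection entry in this sketch costs 2 bytes instead of the 4 bytes a 32-bit position would need, which is the trade-off behind the batch size limit.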
Columnar Sort
Columnar Sort does not support spilling to disk yet. To reduce peak memory usage, we use a smaller data type (uint16_t), which imposes two limits:
- the max batch size must be < 65536
- the number of batches in one partition must be < 65536
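One way to see why both limits arise is to picture the sort ordering lightweight row references instead of the rows themselves. The sketch below assumes each reference is a pair of uint16_t values, one for the batch and one for the row within it; this layout, the `RowRef` struct, and the `SortRowRefs` function are hypothetical and are not taken from Gazelle Plugin's source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical row reference: two 16-bit fields, so both the batch index and
// the row index within a batch must be below 65536.
struct RowRef {
  uint16_t batch_id;
  uint16_t row_id;
};

// Hypothetical helper: sorts all rows of one partition in ascending order and
// returns the references in sorted order. The batches stay in memory, since
// this sketch (like the operator described above) has no spill path.
std::vector<RowRef> SortRowRefs(
    const std::vector<std::vector<int32_t>>& batches) {
  assert(batches.size() < 65536);  // limit on batches per partition
  std::vector<RowRef> refs;
  for (std::size_t b = 0; b < batches.size(); ++b) {
    assert(batches[b].size() < 65536);  // limit on rows per batch
    for (std::size_t r = 0; r < batches[b].size(); ++r) {
      refs.push_back({static_cast<uint16_t>(b), static_cast<uint16_t>(r)});
    }
  }
  // Order the references by the value they point to; the data is not moved.
  std::stable_sort(refs.begin(), refs.end(),
                   [&](const RowRef& x, const RowRef& y) {
                     return batches[x.batch_id][x.row_id] <
                            batches[y.batch_id][y.row_id];
                   });
  return refs;
}
```

Under this assumed layout, halving the width of each field halves the memory held by the reference array compared with 32-bit indices, at the cost of the two limits listed above.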