Available on x86-64 and target feature avx512vbmi2,avx512vl only.
Contiguously store the active 16-bit integers in `a` (those with their respective bit set in writemask `k`) to `dst`, and pass through the remaining elements from `src`.
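
The compress-and-pass-through behavior can be easier to follow as scalar code. The following is a minimal sketch of the semantics described above, not the intrinsic itself: the function name `mask_compress_epi16_model` is hypothetical, the lane count `N` is left as a const generic because this excerpt does not state the vector width, and the mask is assumed to fit in a `u32` with bit `j` controlling lane `j`.

```rust
/// Scalar model of a masked 16-bit compress (illustrative only, not the intrinsic).
///
/// Lanes of `a` whose bit is set in `k` are packed into the low lanes of the
/// result in their original order; every lane above the packed run keeps the
/// value `src` holds at that same position.
fn mask_compress_epi16_model<const N: usize>(src: [i16; N], k: u32, a: [i16; N]) -> [i16; N] {
    // Assumption: at most 32 lanes, so every lane has a bit in the u32 mask.
    assert!(N <= 32, "mask is modeled as a u32");

    // Start from `src` so that untouched lanes pass through unchanged.
    let mut dst = src;
    // Index of the next free low lane in `dst`.
    let mut m = 0;

    for j in 0..N {
        // Lane j is active when bit j of the writemask is set.
        if (k >> j) & 1 == 1 {
            dst[m] = a[j];
            m += 1;
        }
    }
    dst
}
```

For example, with eight lanes, `src` filled with `-1`, `a = [10, 11, 12, 13, 14, 15, 16, 17]`, and `k = 0b1010_0101` (lanes 0, 2, 5, and 7 active), the model returns `[10, 12, 15, 17, -1, -1, -1, -1]`. A mask of `0` returns `src` unchanged, and a mask with all eight bits set returns `a`.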