Available on x86 and target feature avx512vpopcntdq,avx512vl only.
For each packed 32-bit integer, maps the value to the number of logical 1 bits (its population count).
Uses the writemask in k: an element is zeroed in the result if the corresponding mask bit is not set; otherwise, the computed count is written to that element.
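The masking behaviour described above can be sketched with a scalar model. This is not the intrinsic itself, only an illustration of the per-lane semantics. The function name `maskz_popcnt_epi32_model` and the lane-count parameter `N` are hypothetical, and the sketch assumes that bit i of the mask controls lane i and that the vector holds at most 8 lanes (4 for a 128-bit vector, 8 for a 256-bit one).

```rust
/// Scalar model of a zero-masked, per-lane population count.
/// Hypothetical helper for illustration; `N` is the number of 32-bit lanes.
fn maskz_popcnt_epi32_model<const N: usize>(k: u8, a: [i32; N]) -> [i32; N] {
    // An 8-bit mask can address at most 8 lanes.
    assert!(N <= 8);
    let mut dst = [0i32; N];
    for i in 0..N {
        // Assumption: bit i of the mask selects lane i.
        if (k >> i) & 1 == 1 {
            // Mask bit set: store the number of 1 bits in this lane.
            dst[i] = a[i].count_ones() as i32;
        }
        // Mask bit clear: the lane keeps its initial value of zero.
    }
    dst
}

fn main() {
    let a = [0b1011, -1, 7, 0x0F0F];
    // Lanes 0, 1 and 3 are selected; lane 2 is zeroed.
    let r = maskz_popcnt_epi32_model(0b1011, a);
    println!("{:?}", r); // prints [3, 32, 0, 8]
}
```

Lane 2 holds 7, which has three 1 bits, but its mask bit is clear, so the result for that lane is 0 rather than 3.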