Available on x86 and target feature avx512f only.
Round the lower single-precision (32-bit) floating-point element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
Rounding is done according to the imm8[2:0] parameter, which can be one of:
_MM_FROUND_TO_NEAREST_INT // round to nearest
_MM_FROUND_TO_NEG_INF     // round down
_MM_FROUND_TO_POS_INF     // round up
_MM_FROUND_TO_ZERO        // truncate
_MM_FROUND_CUR_DIRECTION  // use MXCSR.RC; see _MM_SET_ROUNDING_MODE
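
As an illustration of the semantics described above, here is a minimal scalar sketch in plain Rust. It is a software model, not the intrinsic itself, and it does not require AVX-512 hardware. The names `Mode`, `roundscale_model`, and `roundscale_ss_model` are hypothetical helpers introduced for this example. The model assumes the fraction-bit count M is taken from imm8[7:4] (as in Intel's pseudocode for this operation) and computes 2^-M * round(2^M * x). The `_MM_FROUND_CUR_DIRECTION` case is left out because it depends on the runtime MXCSR state.

```rust
/// Hypothetical enum standing in for the imm8[2:0] rounding selector.
/// `_MM_FROUND_CUR_DIRECTION` is omitted: it depends on MXCSR.RC at run time.
#[derive(Clone, Copy, Debug)]
enum Mode {
    Nearest,  // _MM_FROUND_TO_NEAREST_INT (ties to even)
    Down,     // _MM_FROUND_TO_NEG_INF
    Up,       // _MM_FROUND_TO_POS_INF
    Truncate, // _MM_FROUND_TO_ZERO
}

/// Scalar model: keep `fraction_bits` binary digits after the point.
/// Assumption: `fraction_bits` corresponds to imm8[7:4], so it is 0..=15.
fn roundscale_model(x: f32, fraction_bits: u32, mode: Mode) -> f32 {
    assert!(fraction_bits <= 15, "imm8[7:4] is a 4-bit field");
    let scale = (1u32 << fraction_bits) as f32; // 2^M, exactly representable
    let scaled = x * scale;
    let rounded = match mode {
        Mode::Nearest => scaled.round_ties_even(),
        Mode::Down => scaled.floor(),
        Mode::Up => scaled.ceil(),
        Mode::Truncate => scaled.trunc(),
    };
    let result = rounded / scale;
    // If scaling overflowed to infinity, pass the source value through.
    if result.is_infinite() { x } else { result }
}

/// Lane layout of the scalar (`ss`) form: lane 0 comes from the rounded
/// lower element of `b`, lanes 1..=3 are copied unchanged from `a`.
fn roundscale_ss_model(a: [f32; 4], b: [f32; 4], fraction_bits: u32, mode: Mode) -> [f32; 4] {
    [roundscale_model(b[0], fraction_bits, mode), a[1], a[2], a[3]]
}
```

For example, with one fraction bit the results are multiples of 0.5: `roundscale_model(1.3, 1, Mode::Nearest)` gives `1.5`, while `roundscale_model(-1.3, 1, Mode::Truncate)` gives `-1.0` and `roundscale_model(-1.3, 1, Mode::Down)` gives `-1.5`.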