Available on x86-64 and target feature avx only.
Loads 256 bits (composed of 4 packed double-precision (64-bit)
floating-point elements) from memory into the result.
mem_addr does not need to be aligned on any particular boundary.
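
The following is a minimal sketch of an unaligned load, not a definitive usage pattern. It assumes the intrinsic described here is `_mm256_loadu_pd` from `std::arch::x86_64`, since the page text does not name it. The helpers `load4_avx` and `load4_at` are hypothetical names introduced only for illustration. The wrapper checks for AVX at runtime and falls back to a scalar copy elsewhere.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_mm256_loadu_pd, _mm256_storeu_pd};

/// Hypothetical helper: reads four f64 values from `src` into a 256-bit
/// register and writes them back out to an array.
///
/// Safety: `src` must be valid for reading four f64 values, and the CPU
/// must support AVX. No particular alignment of `src` is required.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn load4_avx(src: *const f64) -> [f64; 4] {
    let mut out = [0.0_f64; 4];
    unsafe {
        // Unaligned 256-bit load, assumed to be the intrinsic documented here.
        let v = _mm256_loadu_pd(src);
        // Unaligned store so the lanes can be inspected as plain f64 values.
        _mm256_storeu_pd(out.as_mut_ptr(), v);
    }
    out
}

/// Hypothetical safe wrapper: returns the four elements starting at
/// `offset`, or `None` if the slice is too short.
pub fn load4_at(data: &[f64], offset: usize) -> Option<[f64; 4]> {
    // Bounds check: exactly four readable elements are needed.
    let window = data.get(offset..offset.checked_add(4)?)?;

    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX was detected at runtime and `window` holds four f64.
            return Some(unsafe { load4_avx(window.as_ptr()) });
        }
    }

    // Scalar fallback for other targets or CPUs without AVX.
    Some([window[0], window[1], window[2], window[3]])
}
```

Calling `load4_at(&data, 1)` on an `[f64; 6]` array starts the load one element past the array base. That address is typically not 32-byte aligned, which is the case the unaligned load is meant to tolerate.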