As far as I know, there's no direct way to do what you're asking for (it would require unaligned access, which is highly inefficient on some architectures). But I found an efficient way to transfer the data to an in-process array:
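import numpy as np
# Map the file as raw bytes; it holds packed 6-byte big-endian integers.
a = np.memmap("filename", mode='r', dtype=np.dtype('>u1'))
e = np.zeros(a.size // 6, np.dtype('>u8'))  # integer division: the shape must be an int
# Copy each value as three 16-bit words into words 1..3 of its 8-byte slot; word 0 stays zero.
for i in range(3):
    e.view(dtype='>u2')[i + 1::4] = a.view(dtype='>u2')[i::3]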
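To check the word-wise copy without a file, here is a minimal sketch that runs the same loop on an in-memory buffer. The sample values and the names `values` and `raw` are made up for illustration; `np.frombuffer` stands in for the memmap.

```python
import numpy as np

# Hypothetical sample data: three 48-bit big-endian integers packed back to back.
values = [1, 0x010203040506, 2**48 - 1]
raw = b"".join(v.to_bytes(6, "big") for v in values)

# Stand-in for the memmap: a read-only byte view of the packed data.
a = np.frombuffer(raw, dtype=np.dtype('>u1'))
e = np.zeros(a.size // 6, np.dtype('>u8'))

# Each 6-byte value is three 16-bit words; they go into words 1..3 of the
# corresponding 8-byte slot, and word 0 (the most significant) stays zero.
for i in range(3):
    e.view(dtype='>u2')[i + 1::4] = a.view(dtype='>u2')[i::3]

print(e.tolist())
```

The loop runs only three times regardless of file size, so all the per-element work happens inside NumPy's strided copies.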
You can get unaligned access using the strides constructor parameter:
e = np.ndarray((a.size - 2) // 6, np.dtype('<u8'), a, strides=(6,))
However, with this layout each element overlaps the next (every 8-byte read also picks up the first 2 bytes of the following value), so to actually use it you'd have to mask out the high bytes on access.
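A minimal sketch of the masking step, assuming little-endian 6-byte values as in the `'<u8'` line above. The sample data is made up, and `buf` here is an in-memory stand-in for the mapped file bytes; two padding bytes are appended so the last 8-byte read stays inside the buffer.

```python
import numpy as np

# Hypothetical sample data: three 48-bit little-endian integers, plus 2 bytes
# of padding so that the final 8-byte element does not run past the buffer.
values = [1, 0x010203040506, 2**48 - 1]
raw = b"".join(v.to_bytes(6, "little") for v in values) + b"\x00\x00"

buf = np.frombuffer(raw, dtype=np.uint8)
n = (buf.size - 2) // 6

# 8-byte elements spaced 6 bytes apart: consecutive elements share 2 bytes.
overlapping = np.ndarray((n,), np.dtype('<u8'), buf, strides=(6,))

# The top 16 bits of each element belong to the next value; keep the low 48 bits.
masked = overlapping & np.uint64(0xFFFFFFFFFFFF)
print(masked.tolist())
```

Note that `(size - 2) // 6` only covers every value when at least 2 bytes follow the last one; for a file that is an exact multiple of 6 bytes, the final value would have to be read separately.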