Lip-reading via deep neural networks using hybrid visual features