From Linear Classifier to Convolutional Neural Network for Hand Pose Recognition

Paweł Rościszewski


Recently gathered image datasets and the new capabilities of high-performance computing systems have enabled the development of new artificial neural network models and training algorithms. With these machine learning models, computer vision tasks can be accomplished using the raw values of image pixels instead of specific features. The principle of operation of deep artificial neural networks increasingly resembles what we believe happens in the human visual cortex. In this paper we build up an understanding of convolutional neural networks by investigating supervised machine learning methods such as K-Nearest Neighbors, linear classifiers and fully connected neural networks. We provide examples and accuracy results based on our implementation aimed at the problem of hand pose recognition.


machine learning; artificial neural networks; computer vision






