Kibern. vyčisl. teh., 2018, Issue 4 (194), pp.
Grytsenko V.I., Corresponding Member of NAS of Ukraine,
Director of International Research and Training
Center for Information Technologies and Systems
of the National Academy of Sciences of Ukraine
and Ministry of Education and Science of Ukraine
International Research and Training Center for Information Technologies
and Systems of the National Academy of Sciences of Ukraine
and Ministry of Education and Science of Ukraine,
Acad. Glushkov av., 40, Kiev, 03187, Ukraine
NEURAL DISTRIBUTED REPRESENTATIONS OF VECTOR DATA IN INTELLIGENT INFORMATION TECHNOLOGIES
Introduction. Distributed representation (DR) of data is a form of vector representation in which each object is represented by a set of vector components, and each vector component can belong to the representations of many objects. In ordinary vector representations each component has a defined meaning; in a DR it does not. Nevertheless, the similarity of DR vectors reflects the similarity of the objects they represent.
DR is a neural network approach based on modeling the representation of information in the brain; it arose from ideas about "distributed" or "holographic" representations. DRs have a large information capacity, allow the use of the rich arsenal of methods developed for vector data, scale well to large amounts of data, and have a number of other advantages. Methods for transforming data into DRs have been developed for data of various types, from scalars and vectors to graphs.
The purpose of the article is to provide an overview of part of the work of the Department of Neural Information Processing Technologies (International Center) in the field of neural network distributed representations. The approach develops the ideas of Nikolai Amosov and his scientific school on modeling the structure and functions of the brain.
Scope. The article considers the formation of distributed representations from the original vector representations of objects by means of random projection. DRs make it possible to estimate efficiently the similarity of the original objects represented by numerical vectors. The use of DRs also allows developing regularization methods that yield a stable solution of discrete ill-posed inverse problems, increasing the computational efficiency and accuracy of the solution, and deriving analytical estimates of that accuracy. DRs thus increase the efficiency of the information technologies that apply them.
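The similarity-estimation idea described above can be illustrated with a minimal sketch. It assumes a dense Gaussian random projection matrix, which is only one common choice and is not necessarily the construction used in the article; the helper names random_projection_matrix, project and cosine are hypothetical and introduced here for illustration only.

```python
import math
import random


def random_projection_matrix(k, n, seed=0):
    """Return a k x n matrix with i.i.d. N(0, 1) entries.

    Assumption: a dense Gaussian matrix; other random matrices
    (sparse, binary, ternary) are also used for random projection.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]


def project(R, x):
    """Map an n-dimensional vector x to the k-dimensional vector R x / sqrt(k)."""
    scale = 1.0 / math.sqrt(len(R))
    return [scale * sum(r * v for r, v in zip(row, x)) for row in R]


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    rng = random.Random(1)
    n, k = 50, 2000  # original and projected dimensions (illustrative values)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [xi + 0.5 * rng.gauss(0.0, 1.0) for xi in x]  # a noisy copy of x

    R = random_projection_matrix(k, n, seed=2)
    print("cosine of originals:  ", cosine(x, y))
    print("cosine of projections:", cosine(project(R, x), project(R, y)))
```

The two printed values should be close, and the agreement improves as k grows. In this sketch k is larger than n only to make the estimate tight; in practice random projection is typically applied to reduce dimensionality, trading some accuracy of the similarity estimate for lower cost.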
Conclusions. DRs of various data types can be used to improve the efficiency and intelligence level of information technologies. DRs have been developed both for weakly structured data, such as vectors, and for complex structured representations of objects, such as sequences, graphs of knowledge-base situations (episodes), etc. Transforming different types of data into the DR vector format makes it possible to unify the basic information technologies used to process them and to achieve good scalability as the amount of processed data grows.
In the future, distributed representations will naturally combine information on structure and semantics to create computationally efficient and qualitatively new information technologies in which relational structures from knowledge bases are processed by the similarity of their DRs. The neurobiological relevance of distributed representations opens up the possibility of creating intelligent information technologies based on them that function similarly to the human brain.
Keywords: distributed data representation, random projection, vector similarity estimation, discrete ill-posed problem, regularization.