Unsupervised Learning in Neuromemristive Systems

Cory Merkel and Dhireesha Kudithipudi
Department of Computer Engineering, Rochester Institute of Technology
Rochester, New York 14623-5603
Email: {cem1103,dxkeec}@rit.edu

Abstract—Neuromemristive systems (NMSs) currently represent the most promising platform to achieve energy-efficient, neuro-inspired computation. However, since the research field is less than a decade old, there are still countless algorithms and design paradigms to be explored within these systems. One particular domain that remains to be fully investigated within NMSs is unsupervised learning. In this work, we explore the design of an NMS for unsupervised clustering, which is a critical element of several machine learning algorithms. Using a simple memristor crossbar architecture and learning rule, we are able to achieve performance which is on par with MATLAB's k-means clustering.

I. INTRODUCTION

The present research explores multiple aspects of unsupervised learning in a class of neurologically-inspired computer architectures referred to as neuromemristive systems (NMSs). Although they are closely related, NMSs differ from neuromorphic systems, pioneered by Mead in the late 1980s [1], in two respects. First, they are designed using a mixture of CMOS and memristor technologies, which affords levels of connectivity and plasticity that are not achievable in neuromorphic systems. The second distinction, which is more subtle but equally important, is that NMSs focus on abstraction, rather than mimicry, of neurological processes. This abstraction generally improves the efficiency of the resulting hardware implementations.

NMS research and development took off rapidly in the late 2000s, coinciding with a growing interest in two-terminal memristors, which will be described briefly in Section II for unfamiliar readers. The utility of these systems has been demonstrated in various application domains, especially image processing and analysis; see [2] for a review. Learning in these systems has primarily been supervised. Although there are some examples of unsupervised learning in spike-based systems [3], it is relatively unexplored in non-spiking NMSs. One example where unsupervised learning in non-spiking networks has been demonstrated is [4], where principal component analysis (PCA) is used to reduce the dimensionality of images. However, the authors do not discuss any circuit for implementing the PCA algorithm in hardware.

In this research, we propose a non-spiking NMS design for unsupervised clustering. The NMS is tested on samples from the MNIST database of handwritten digits. To the best of our knowledge, this is the first work that explores general aspects of clustering in these systems. Given its ability to reduce data dimensionality and aid in classification, we believe that clustering is a key primitive for future NMSs. Furthermore, we believe this work will advance the state of unsupervised learning in NMSs and help others do the same.

II. BRIEF DESCRIPTION OF MEMRISTORS

Technically, a memristor can be defined as any two-terminal device with a state-dependent Ohm's law [5]. More concretely, a memristor is a thin film (I) sandwiched between a top (M) and a bottom (M) electrode. The stack is referred to as a metal-insulator-metal (MIM) structure because the film material is nominally insulating. That is, in its stoichiometric crystalline form it will have a large band gap and not enough free carriers to conduct. The film is made conductive by introducing defects into the crystalline structure, either through fabrication, applying an electric field, or both. Defects may be interstitial metallic ions which are oxidized at one electrode and then drift to the other, where they are reduced. Defects may also be vacancies, such as oxygen vacancies in a TiO2 film. In addition, defects may be changes in polarization, such as those in ferroelectric films, or even just changes in crystallinity, as in phase change memory. In some films, the defect profile can be gradually adjusted by applying electric fields for short durations, yielding incremental modifications to the film's overall conductance. In other films, only two conductance states can be reached. Moreover, there is usually a minimum amount of energy required to effect a change in the film's defect profile. This often translates to a threshold voltage which must be applied across the film to change its conductance.

Given the constant evolution of memristor technology, it makes little sense to design an NMS around any specific memristor device parameters. Instead, we assume devices will have these general characteristics: (1) a large minimum resistance value (e.g., in kΩs), (2) a large OFF/ON resistance ratio (at least 10^3), (3) high endurance (the ability to switch many times before failing), (4) high retention (non-volatility), and (5) incremental conductance states that can be reached by applying bipolar voltage pulses above a particular threshold voltage. All of these properties have been demonstrated in various devices; see [6] for a review.
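For illustration, these five assumed characteristics can be captured in a minimal behavioral sketch. The model below is ours, not one from the literature, and the parameter values (1 kΩ minimum resistance, 10^3 OFF/ON ratio, 1 V write threshold) are hypothetical, chosen only to satisfy properties (1)-(5):

```python
class Memristor:
    """Minimal behavioral sketch of a bipolar memristor satisfying the five
    assumed device properties. All parameter values are hypothetical."""

    def __init__(self, g_min=1e-6, g_max=1e-3, v_th=1.0, dg=1e-5):
        self.g_min = g_min  # OFF conductance -> R_OFF = 1 MOhm
        self.g_max = g_max  # ON conductance  -> R_ON = 1 kOhm (ratio 10^3)
        self.v_th = v_th    # write threshold voltage (V)
        self.dg = dg        # incremental conductance change per pulse (S)
        self.g = g_min      # current, non-volatile conductance state

    def current(self, v):
        """State-dependent Ohm's law: i = G(state) * v."""
        return self.g * v

    def apply_pulse(self, v):
        """Bipolar pulses beyond the threshold nudge the conductance
        incrementally; sub-threshold voltages leave the state unchanged."""
        if v > self.v_th:
            self.g = min(self.g + self.dg, self.g_max)
        elif v < -self.v_th:
            self.g = max(self.g - self.dg, self.g_min)
```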
III. CLUSTERING ALGORITHM DESIGN

Clustering algorithms uncover structure in a set of m unlabeled input vectors {u^(p)} by identifying M groups, or clusters, of vectors that are similar in some way. In one common approach, each cluster is represented by its centroid, so the clustering algorithm reduces to finding each of the M centroids. This can be achieved through a simple competitive learning algorithm: initialize M vectors w_i by assigning them to randomly-chosen input vectors. These will be referred to as weight vectors. Then, for each input vector, move the closest weight vector a little closer. After several iterations, the algorithm should converge with the weight vectors lying at (or close to) the centroids.

Of course, there are several parameters which must be defined, including a distance metric for measuring closeness. The most obvious choice is the ℓ2-norm. However, computing this is expensive in terms of hardware because it requires units for calculating squares and square roots. In addition, as we will discuss later, it is easy to use a high-density memristor circuit called a crossbar to compute dot products between input and weight vectors. Therefore, it is preferable to use a dot product as a distance metric. For example, if all of the vectors are normalized (||u^(p)|| = ||w_i|| = 1), then w_{i*} · u^(p) > w_i · u^(p) for all w_i ≠ w_{i*}, where w_{i*} is the closest weight vector to u^(p). However, the constraint that ||u^(p)|| = ||w_i|| = 1 creates a large overhead, because every input vector has to be normalized and every weight vector has to be re-normalized each time it is updated.

We propose the following solution: map each input vector to a vertex of a hypercube centered about the origin, u^(p) ∈ {−1,1}^N, where N is the dimensionality of the input space. Now, w_i · u^(p) will yield a scalar value d*_{i,p} between −N and +N. Moreover, this scalar value can be linearly transformed to a distance d_{i,p} which is the ℓ1-norm, or Manhattan distance, between the weight vector and the input:

    d_{i,p} \equiv N - d^*_{i,p} = \sum_{j=1}^{N} | w_{i,j} - u_j^{(p)} |.   (1)

Using this distance metric, we never need to re-normalize the weight vectors. Furthermore, mapping input vectors to hypercube vertices can usually be accomplished by thresholding. For example, grayscale images can be mapped by assigning −1 to pixel values from 0 to 127 and +1 to pixel values from 128 to 255.

Algorithm 1 summarizes the procedure:

Algorithm 1 Proposed clustering algorithm.
 1: Map inputs to hypercube vertices.
 2: Initialize weight vectors to random input vectors.
 3: for epoch = 1 : N_train do
 4:   for p = 1 : m do
 5:     d*_{i,p} = w_i · u^(p)   ∀ i = 1, 2, ..., M
 6:     x_i = 1 if d*_{i,p} = max_k(d*_{k,p}), and x_i = 0 otherwise   ∀ i = 1, 2, ..., M
 7:     ∆w_{i,j} = α x_i u_j^(p)   ∀ i = 1, 2, ..., M, ∀ j = 1, 2, ..., N
 8:   end for
 9: end for

The first two lines are initialization steps. Within the double for loop, x_i is set to 1 when weight vector i is the closest to the current input (called the winner) and to 0 otherwise. Then, the components of the winning weight vector are moved closer to the current input vector using a Hebbian update rule. The pre-factor α, which is called the learning rate, determines how far a weight vector moves each time it wins. Notice that this algorithm is completely unsupervised: there are no labeled input vectors.
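As a behavioral reference point for the hardware described next, Algorithm 1 transcribes directly into a few lines of NumPy. This is an illustrative sketch only, not the circuit model; the function name, seeding, and the hypothetical `images` array in the usage comment are ours:

```python
import numpy as np

def cluster(U, M, n_epochs, alpha, seed=0):
    """Direct transcription of Algorithm 1.

    U: m x N array of inputs already mapped to hypercube vertices ({-1, +1}).
    Returns the M x N array of weight vectors (approximate cluster centroids)."""
    m, N = U.shape
    rng = np.random.default_rng(seed)
    # Line 2: initialize weight vectors to randomly chosen input vectors.
    W = U[rng.choice(m, size=M, replace=False)].astype(float)
    for epoch in range(n_epochs):            # line 3
        for p in range(m):                   # line 4
            d_star = W @ U[p]                # line 5: dot-product similarity
            x = np.zeros(M)
            x[np.argmax(d_star)] = 1.0       # line 6: winner-takes-all
            W += alpha * np.outer(x, U[p])   # line 7: Hebbian update (winner only)
    return W

# Example usage with the thresholding described above, assuming a hypothetical
# array `images` of grayscale images with pixel values 0-255:
# U = np.where(images.reshape(len(images), -1) >= 128, 1.0, -1.0)
# W = cluster(U, M=10, n_epochs=500, alpha=0.005)
```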
IV. NMS HARDWARE DESIGN

The unsupervised clustering algorithm discussed in Section III can be implemented efficiently in an NMS by representing weight vectors as memristor conductances. A block diagram of the proposed design is shown in Figure 1. The inputs, which are represented as positive and negative currents, are fed through M crossbar circuits. Together with a non-inverting summing amplifier (represented as a circle), each crossbar computes the distance between the current input and the weight vector represented by its memristors' conductances.

Fig. 1. Block diagram of proposed NMS for unsupervised clustering. [Figure: inputs u_1, ..., u_N feed memristor crossbars with distance calculation, followed by a winner-takes-all (WTA) stage producing x_1, ..., x_M and a weight-update block.]

The configuration of the crossbar and summing amplifier is shown in Figure 2. Memristors in the top row inhibit, or contribute a negative component to, the output, while memristors in the bottom row excite, or contribute a positive component to, the output. Therefore, each crossbar column represents one component of one weight vector w_i, which can be positive or negative. If we assume that the op amp has a high open-loop gain and the wire resistances are small, then

    v_{d^*_{i,p}} = \sum_{j=1}^{N} i_{u_j^{(p)}} R \left. \frac{G_1 - G_2}{G_1 + G_2} \right|_{i,j},   (2)

where G_1 and G_2 are the top and bottom memristors in each column, respectively. The output of the circuit is a voltage representation of the distance between the current input and the weight vector represented by the crossbar.

Fig. 2. Crossbar and summing amplifier circuit for computing the distance between the input and a weight vector. [Figure: input currents i_{u_1^(p)}, ..., i_{u_N^(p)} enter the crossbar for w_i; write voltages v_{w_{i,1}}, ..., v_{w_{i,N}} are gated by train_en; a summing amplifier with feedback resistance R outputs v_{d*_{i,p}}.]
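Numerically, equation (2) amounts to a dot product between the input currents and a signed, normalized weight derived from each column's conductance pair. The sketch below follows our reconstruction of the fraction in (2); the function names and the feedback resistance value are hypothetical:

```python
import numpy as np

def crossbar_distance_voltage(i_u, G1, G2, R=1e3):
    """Sketch of equation (2): output voltage of the crossbar for weight
    vector w_i given input p.

    i_u: length-N array of input currents i_{u_j^(p)}.
    G1, G2: length-N arrays of top and bottom column conductances.
    R: summing-amplifier feedback resistance (value hypothetical)."""
    # Each conductance pair realizes one signed weight component in [-1, 1];
    # the sign convention follows equation (2) as reconstructed here.
    w_eff = (G1 - G2) / (G1 + G2)
    return np.sum(i_u * R * w_eff)
```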
The weight vectors are modified by connecting the columns to write voltages v_{w_{i,j}} using a training enable signal, train_en. The write voltages are determined by the value of ∆w_{i,j} in line 7 of Algorithm 1. Specifically, if ∆w_{i,j} is negative, then v_{w_{i,j}} will be a negative voltage below the memristor's write threshold, and if ∆w_{i,j} is positive, then v_{w_{i,j}} will be a positive voltage above the memristor's write threshold. Otherwise, the write voltage is zero.

So far, we have only discussed the memristor crossbar and distance calculation parts of Figure 1 (line 5 in Algorithm 1). The winner-takes-all circuit (line 6 in Algorithm 1) can be implemented in a number of ways; in this work, we used the current-mode design described in [7]. Finally, the weight update (line 7 in Algorithm 1) can be computed using simple combinational logic circuits.

V. CLUSTERING MNIST IMAGES

One exciting application of the proposed hardware is automatically identifying clusters in sets of images. We took 1000 images (m = 1000) from the MNIST handwritten digit dataset and clustered them using a behavioral model of the NMS described in the last section. Each image was originally 20×20 grayscale pixels (N = 400). The images were mapped to hypercube vertices using the thresholding approach discussed earlier. In addition, we used 10 clusters (M = 10), 500 training epochs (N_train = 500), and α = 0.005. The results are shown in Figure 3, where we have plotted the weight vectors representing the centroid of each cluster.

Fig. 3. 10 cluster centroids found in a set of 1000 MNIST images using the proposed NMS.

Figure 4 shows the cost versus the training epoch, where the cost is defined as

    J = \sum_{p=1}^{m} \min_i d_{i,p}.   (3)

We see that the cost function for the proposed NMS approaches that of MATLAB's built-in k-means clustering after 500 epochs.

Fig. 4. Cost function versus epoch while clustering MNIST images using the proposed NMS. [Figure: J (×10^5) versus epoch for MATLAB's k-means and the proposed design, over 500 epochs.]
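For reference, the cost in equation (3) can be evaluated directly from the weight matrix by using the identity d_{i,p} = N − w_i · u^(p) from equation (1). A minimal sketch (ours, matching the arrays used in the earlier transcription of Algorithm 1):

```python
import numpy as np

def clustering_cost(U, W):
    """Sketch of equation (3): sum, over all m inputs, of the l1 distance to
    the nearest weight vector. U is m x N in {-1, +1}; W is M x N."""
    m, N = U.shape
    D = N - U @ W.T               # m x M matrix of distances d_{i,p} via eq. (1)
    return float(D.min(axis=1).sum())
```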
VI. CONCLUSIONS

The goal of this work was to explore both algorithmic and hardware design aspects of unsupervised learning in NMSs. To that end, we proposed a clustering algorithm that maps inputs to vertices of a hypercube and then iteratively finds cluster centroids using a Hebbian learning rule. We argue (although we have not proven) that the proposed algorithm can be implemented more efficiently in an NMS than algorithms that use either the ℓ2-norm or cosine similarity as a distance function. The algorithm was implemented in a custom NMS design that leverages crossbar circuits to compute the distance between inputs and weight vectors. To test our design, we clustered 1000 MNIST images and found the results to be consistent with MATLAB's k-means clustering implementation.

REFERENCES

[1] C. Mead, "Neuromorphic electronic systems," Proceedings of the IEEE, vol. 78, no. 10, pp. 1629–1636, 1990.
[2] D. Kudithipudi, C. Merkel, M. Soltiz, G. S. Rose, and R. Pino, "Design of neuromorphic architectures with memristors," in Network Science and Cybersecurity, R. Pino, Ed. Springer, 2014, pp. 93–103.
[3] S. Yu, B. Gao, Z. Fang, H. Yu, J. Kang, and H.-S. P. Wong, "A low energy oxide-based electronic synaptic device for neuromorphic visual systems with tolerance to device variation," Advanced Materials, vol. 25, no. 12, pp. 1774–1779, Mar. 2013.
[4] S. Choi, P. Sheridan, and W. D. Lu, "Data clustering using memristor networks," Scientific Reports, vol. 5, p. 10492, Jan. 2015.
[5] L. Chua, "Resistance switching memories are memristors," Applied Physics A, vol. 102, no. 4, pp. 765–783, Jan. 2011.
[6] D. Kuzum, S. Yu, and H.-S. P. Wong, "Synaptic electronics: materials, devices and applications," Nanotechnology, vol. 24, no. 38, p. 382001, Sep. 2013.
[7] J. Lazzaro, S. Ryckebusch, M. A. Mahowald, and C. A. Mead, "Winner-take-all networks of O(N) complexity," in Advances in Neural Information Processing Systems, 1988, pp. 703–711.