python - fastest way to count the number of differences among rows in 2d-array


I need to calculate the number of differences (a score) between every row and every other row of a 2D array. Here is a simple example, but I need to do this on huge 2D arrays with thousands of rows, up to about 100,000, so I am looking to speed up my naive code:

    import numpy

    a = numpy.array([[1, 2], [1, 2], [1, 3], [2, 3], [3, 3]])
    score = 0
    scoresquare = 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            scoretemp = 0
            if a[i, 0] != a[j, 0] and a[i, 1] != a[j, 0] and a[i, 1] != a[j, 1] and a[i, 0] != a[j, 1]:
                # the two rows being compared have no element in common
                scoretemp = 2
            elif (a[i] == a[j]).all():
                # identical rows
                scoretemp = 0
            else:
                scoretemp = 1
            print(a[i], a[j], scoretemp, (a[i] == a[j]).all(), (a[i] == a[j]).any())
            score += scoretemp
            scoresquare += scoretemp * scoretemp
    print(score, scoresquare)

a[0] is identical to a[1], so their score (number of differences) is 0, but a[0] and a[2] ([1 2] vs [1 3]) have one difference, and a[0] and a[4] ([1 2] vs [3 3]) have two differences. To compute such a distance (statistic), I need the intermediate sums of the score and of the squared score.

    a[i]   a[j]   score
    [1 2]  [1 2]  0
    [1 2]  [1 3]  1
    [1 2]  [2 3]  1
    [1 2]  [3 3]  2
    [1 2]  [1 3]  1
    [1 2]  [2 3]  1
    [1 2]  [3 3]  2
    [1 3]  [2 3]  1
    [1 3]  [3 3]  1
    [2 3]  [3 3]  1

    sum_score = 11
    sum_scoresquare = 15

My code is quite naive and does not take full advantage of arrays. Is there a way to accelerate this kind of computation? Thank you for your help.

np.in1d(array1, array2) returns True for every element of array1 that also appears in array2. So we need to negate the result by using ~np.in1d(...). After that, np.where returns the indices that hold True, and therefore len(np.where(...)[0]) gives the total number of mismatches. I hope this will help you:
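To make the three steps above concrete, here is a minimal sketch on a single pair of rows. It uses np.isin, which newer NumPy releases recommend in place of np.in1d and which behaves the same way for 1-D inputs; the names row_p, row_q, match and mismatch_idx are illustrative only and do not come from the answer.

```python
import numpy as np

row_p = np.array([1, 2])
row_q = np.array([3, 3])

# True where an element of row_p also occurs somewhere in row_q.
match = np.isin(row_p, row_q)

# Indices of the elements of row_p that have no match in row_q.
mismatch_idx = np.where(~match)[0]
score = len(mismatch_idx)

# Same count without building the index array.
score_alt = int(np.count_nonzero(~match))

# The membership test is one-directional: every element of [3, 3] occurs
# in [1, 3], so that order reports no mismatch, while the reverse reports one.
forward = int(np.count_nonzero(~np.isin(np.array([1, 3]), np.array([3, 3]))))
backward = int(np.count_nonzero(~np.isin(np.array([3, 3]), np.array([1, 3]))))

print(score, score_alt, forward, backward)
```

Because the test only asks whether elements of the first row occur in the second, the count can depend on which row comes first. On the sample array, with rows taken in the order p < q, it agrees with the loop in the question, but that is worth checking on real data.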

    >>> import numpy as np
    >>> a = np.array([[1, 2], [1, 2], [1, 3], [2, 3], [3, 3]])
    >>> res = [len(np.where(~np.in1d(a[p], a[q]))[0])
    ...        for p in range(a.shape[0])
    ...        for q in range(p + 1, a.shape[0])]
    >>> res = np.array(res)
    >>> sum_score = sum(res)
    >>> sum_score_square = sum(res * res)
    >>> print(sum_score, sum_score_square)
    11 15
    >>> k = 0
    >>> for i in range(a.shape[0]):
    ...     for j in range(i + 1, a.shape[0]):
    ...         print(a[i], a[j], res[k])
    ...         k += 1
    ...
    [1 2] [1 2] 0
    [1 2] [1 3] 1
    [1 2] [2 3] 1
    [1 2] [3 3] 2
    [1 2] [1 3] 1
    [1 2] [2 3] 1
    [1 2] [3 3] 2
    [1 3] [2 3] 1
    [1 3] [3 3] 1
    [2 3] [3 3] 1
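The list comprehension above still runs a Python-level loop over every pair of rows. As a possible follow-up, and not part of the original answer, the same scores could be computed with broadcasting. The sketch below assumes the membership rule used above (count the elements of row p that do not occur in row q, for p < q); the function name pairwise_mismatch_scores is hypothetical.

```python
import numpy as np


def pairwise_mismatch_scores(a):
    """For every pair p < q, count elements of a[p] that are absent from a[q].

    Hypothetical helper: it mirrors the per-pair membership test, but
    evaluates all pairs at once with broadcasting.
    """
    n = a.shape[0]
    # Row indices of all pairs with p < q, in the same order as the nested loops.
    p_idx, q_idx = np.triu_indices(n, k=1)
    rows_p = a[p_idx]  # shape (n_pairs, n_cols)
    rows_q = a[q_idx]  # shape (n_pairs, n_cols)
    # present[k, i] is True if rows_p[k, i] equals any element of rows_q[k].
    present = (rows_p[:, :, None] == rows_q[:, None, :]).any(axis=2)
    return (~present).sum(axis=1)


a = np.array([[1, 2], [1, 2], [1, 3], [2, 3], [3, 3]])
res = pairwise_mismatch_scores(a)
sum_score = int(res.sum())
sum_score_square = int((res * res).sum())
print(sum_score, sum_score_square)
```

This builds an intermediate boolean array of size n_pairs x n_cols x n_cols, so for something like 100,000 rows the pairs would have to be processed in blocks rather than all at once. Treat it as a starting point to benchmark, not as a guaranteed speed-up.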
