python - Calculate STD manually using Groupby Pandas DataFrame -


I was trying to write a different, manual way of calculating the mean and STD.

I created a DataFrame:

    import pandas as pd

    a = ["Apple", "Banana", "Cherry", "Apple"]
    b = [3, 4, 7, 3]
    c = [5, 4, 1, 4]
    d = [7, 8, 3, 7]

    # Empty 4-row frame with columns A..D, filled column by column
    df = pd.DataFrame(index=range(4), columns=list("ABCD"))
    df["A"] = a
    df["B"] = b
    df["C"] = c
    df["D"] = d

Then I made a list of the unique values in A, looped over those groups, and calculated the mean and STD for each one.

    import numpy as np

    l = list(set(df.A))                # unique values of column A
    df.groupby('A', as_index=False)    # result is not used

    listMean = [0] * len(df.C)
    listSTD = [0] * len(df.C)

    # Mean of C per group, written back to every row of that group
    for x in l:
        s = np.mean(df[df['A'] == x].C.values)
        z = [index for index, item in enumerate(df['A'].values) if x == item]
        for i in z:
            listMean[i] = s

    # STD of C per group, written back to every row of that group
    for x in l:
        s = np.std(df[df['A'] == x].C.values)
        z = [index for index, item in enumerate(df['A'].values) if x == item]
        for i in z:
            listSTD[i] = s

    df['C'] = listMean
    df['E'] = listSTD
    print(df)

I used describe() to calculate the mean and STD grouped by "A":

    print(df.groupby('A').describe())

and tested the suggested solution:

    result = df.groupby(['A'], as_index=False).agg({'C': ['mean', 'std'], 'B': 'first', 'D': 'first'})

I noticed that the STD I calculated ("E") differs from these results. I'm just curious, what did I miss?

There are two kinds of standard deviation (SD): the population SD and the sample SD.

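The population SD

$$\sigma_N = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}$$

is used when the values represent the entire universe of values being studied.

The sample SD

$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}$$

is used when the values are only a sample from that universe.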

np.std calculates the population SD by default, while pandas' Series.std calculates the sample SD by default.

    In [42]: np.std([4,5])
    Out[42]: 0.5

    In [43]: np.std([4,5], ddof=0)
    Out[43]: 0.5

    In [44]: np.std([4,5], ddof=1)
    Out[44]: 0.70710678118654757

    In [45]: x = pd.Series([4,5])

    In [46]: x.std()
    Out[46]: 0.70710678118654757

    In [47]: x.std(ddof=0)
    Out[47]: 0.5

ddof stands for "delta degrees of freedom", and controls the number subtracted from N in the divisor of the SD formula.
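To make the role of ddof concrete, here is a minimal sketch of the SD computed by hand. The helper name `manual_std` is hypothetical (it is not part of NumPy or pandas); it only illustrates that the divisor is N - ddof.

```python
import math


def manual_std(values, ddof=0):
    """Standard deviation with divisor N - ddof (illustrative helper, not a library function)."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the mean
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - ddof))


population_sd = manual_std([4, 5], ddof=0)  # divisor N, like np.std's default
sample_sd = manual_std([4, 5], ddof=1)      # divisor N - 1, like Series.std's default
print(population_sd, sample_sd)
```

With ddof=0 this matches the np.std default for [4, 5], and with ddof=1 it matches the pandas default for the same values.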

The formulas are shown above. What is sometimes called the "uncorrected sample standard deviation" is what I call the population SD, and the "corrected sample standard deviation" is the sample SD.
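Applied to the data from the question, the two conventions can be put side by side. This is only a sketch: the column names `std_sample` and `std_population` are made up for illustration, and only columns A and C from the question are used.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": ["Apple", "Banana", "Cherry", "Apple"],
    "C": [5, 4, 1, 4],
})

grouped = df.groupby("A")["C"]

# pandas default (ddof=1): sample SD, broadcast back to every row of the group.
# Groups with a single row give NaN because the divisor N - 1 is zero.
df["std_sample"] = grouped.transform("std")

# ddof=0 reproduces what np.std computes by default (population SD).
df["std_population"] = grouped.transform(lambda s: s.std(ddof=0))

# Going the other way: passing ddof=1 to np.std in the manual loop
# would make it agree with pandas.
manual_sample = np.std(df.loc[df["A"] == "Apple", "C"].values, ddof=1)

print(df)
```

Either choice makes the manual loop and the groupby result agree; which one is appropriate depends on whether the rows are the whole population or a sample from it.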

