r - Convert FIX message format ("Tag=Value") into CSV


I have a CSV/log file of FIX 35=S messages (Quote messages, in "tag=value" format) and I need to extract the rates into a proper CSV file for data mining. This is not strictly a FIX question; it is more of an R question about how to clean this dataset.

The raw messages look something like this:

  190=1.1204,191=-0.000029,193=20141008,537=0,631=1.12029575,642=0.000145,10=56
  190=7.20425,191=0.000141,537=0,631=7.2034485,10=140,,
  190=1.26237,191=0,537=1,10=068,,,

I need to get to a first intermediate dataset that looks like this, where the same tags are aligned in the same column:

  190=1.1204,191=-0.000029,193=20141008,537=0,631=1.12029575,642=0.000145,10=56
  190=7.20425,191=0.000141,,537=0,631=7.2034485,,10=140
  190=1.26237,191=0,,537=1,,,10=068

This, in turn, will need to be transformed into:

  190,191,193,537,631,642,10
  1.1204,-0.000029,20141008,0,1.12029575,0.000145,56
  7.20425,0.000141,,0,7.2034485,,140
  1.26237,0,,1,,,068

I'm in the middle of developing a script with awk, but I wonder if I can do this in R. Currently, my biggest challenge is arriving at the intermediate table. To get from the intermediate table to the final one, I have thought of using the tidyr package, specifically the separate function. If anyone could suggest a better approach, I would greatly appreciate it!

Another possibility: start with the same scan as @Andree, but also use the arguments strip.white and na.strings:

  x <- scan(text = "190=1.1204,191=-0.000029,193=20141008,537=0,631=1.12029575,642=0.000145,10=56
  190=7.20425,191=0.000141,537=0,631=7.2034485,10=140,,
  190=1.26237,191=0,537=1,10=068,,,",
            sep = ",",
            what = "character",
            strip.white = TRUE,
            na.strings = "")

  # remove NA
  x <- x[!is.na(x)]

Then use colsplit and dcast from the reshape2 package:

  library(reshape2)

  # split 'x' into two columns
  d1 <- colsplit(string = x, pattern = "=", names = c("x", "y"))

  # create an id variable, which is needed in dcast
  d1$id <- ave(d1$x, d1$x, FUN = seq_along)

  # reshape from long to wide
  d2 <- dcast(data = d1, id ~ x, value.var = "y")

  #   id  10     190       191      193 537      631      642
  # 1  1  56 1.12040 -0.000029 20141008   0 1.120296 0.000145
  # 2  2 140 7.20425  0.000141       NA   0 7.203449       NA
  # 3  3  68 1.26237  0.000000       NA   1       NA       NA
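One caveat with the id variable above: because it numbers the occurrences of each tag (FUN = seq_along), it only matches the message number when every tag's k-th occurrence falls in the k-th message, which happens to hold for this sample. A tag that first appears in the second message would land in row 1. A minimal base R sketch that keeps the message number instead, assuming one message per line (the names raw, fields, long and wide are illustrative, not from the original answer):

```r
# Raw messages, one per line (the sample from the question).
raw <- c(
  "190=1.1204,191=-0.000029,193=20141008,537=0,631=1.12029575,642=0.000145,10=56",
  "190=7.20425,191=0.000141,537=0,631=7.2034485,10=140,,",
  "190=1.26237,191=0,537=1,10=068,,,"
)

# Split each message into its non-empty "tag=value" fields.
fields <- lapply(strsplit(raw, ",", fixed = TRUE), function(f) {
  f <- trimws(f)
  f[nzchar(f)]
})

# Long table: one row per field, with the message number as id.
long <- data.frame(
  id    = rep(seq_along(fields), lengths(fields)),
  tag   = sub("=.*$", "", unlist(fields)),
  value = sub("^[^=]*=", "", unlist(fields)),
  stringsAsFactors = FALSE
)

# Wide character matrix: one row per message, one column per tag,
# NA where a message does not carry that tag.
tags <- unique(long$tag)
wide <- matrix(NA_character_, nrow = length(fields), ncol = length(tags),
               dimnames = list(NULL, tags))
wide[cbind(long$id, match(long$tag, tags))] <- long$value

# Print as CSV; pass a file name as the second argument to write to disk.
write.csv(wide, row.names = FALSE, na = "", quote = FALSE)
```

The values are kept as character strings here, so the checksum 068 keeps its leading zero, and the columns appear in the order in which the tags are first seen.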

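Because you mentioned tidyr:

  library(dplyr)
  library(tidyr)

  separate(data = data.frame(x), col = x, into = c("x", "y"), sep = "=") %>%
    # assumed reconstruction of the intermediate steps: number the occurrences of each tag
    group_by(x) %>%
    mutate(id = row_number()) %>%
    ungroup() %>%
    spread(key = x, value = y)

  #   id  10     190       191      193 537        631      642
  # 1  1  56  1.1204 -0.000029 20141008   0 1.12029575 0.000145
  # 2  2 140 7.20425  0.000141     <NA>   0  7.2034485     <NA>
  # 3  3 068 1.26237         0     <NA>   1       <NA>     <NA>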

Here the values are of class character. If you want them as numeric, set convert = TRUE in spread.
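For a table that is already in wide form, a comparable conversion can be sketched in base R with type.convert. This is only an illustration on a hand-built data frame (d and num are hypothetical names, with values taken from the question), not part of the original answer:

```r
# Character columns, as they come out of a tag=value split.
d <- data.frame(
  `190` = c("1.1204", "7.20425", "1.26237"),
  `193` = c("20141008", NA, NA),
  `10`  = c("56", "140", "068"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Convert each column to the simplest type that fits its values.
num <- d
num[] <- lapply(d, type.convert, as.is = TRUE)

str(num)
```

Note that a numeric conversion drops the leading zero of the checksum (068 becomes 68), so tag 10 may be better left as character if it has to be written back out unchanged.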
