What is the fastest way to look up a large number of values using R?
I have a list of more than 1,000,000 numbers and a lookup table that maps ranges of numbers to categories. For example, 0-200 is Category A and 201-650 is Category B (the ranges are not all the same length).
I need to iterate over the list of 1,000,000 numbers and get a list of the 1,000,000 corresponding categories.
Edit:
For example, the first few elements of my list are 100, 125.5, 807.5, 345.2, and the lookup should return categories like 1, 1, 8, 4. The lookup logic is implemented in a function, categoryLookup(cd),
and I am using the following command to get the categories:
cats <- sapply(list.cd, categoryLookup)
However, while this works for lists of up to about 10,000 elements, it takes a very long time for the complete list.
What is the fastest way to do this? Is there any indexing that can help speed up the process?
Numbers:
numbers <- sample(1:1000000)
Groups:
groups <- sort(rep(letters, 40000))
Lookup:
categories <- groups[numbers]
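To illustrate why this beats calling a function once per element, here is a minimal sketch on a smaller input. The function slowLookup is a hypothetical stand-in for categoryLookup (whose body is not shown in the question), and the group size of 40 is chosen only so the slow version finishes quickly.

```r
# Hypothetical stand-in for the question's categoryLookup(): one value per call.
# Assumes 26 groups of 40 consecutive integers (1-40 is "a", 41-80 is "b", ...).
slowLookup <- function(x) letters[ceiling(x / 40)]

set.seed(1)                     # reproducible shuffle
numbers <- sample(1:1040)       # 26 * 40 integers in random order

# Build the lookup vector once: position i holds the group of the integer i.
# rep(..., each = 40) yields 40 a's, then 40 b's, and so on.
groups <- rep(letters, each = 40)

# A single vectorised indexing operation replaces one function call per element.
categories <- groups[numbers]

# Check that both approaches give the same answer.
agree <- identical(categories, unname(sapply(numbers, slowLookup)))
```

The cost of building groups is paid once; each lookup afterwards is a plain vector index, so the work per element no longer involves an R function call.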
Edit:
If you do not have a vector of "groups", you can create it first.
Assume that you have the range limits in a data frame:
ranges <- data.frame(group = c("a", "b", "c"), start = c(1, 300001, 600001), end = c(300000, 600000, 1000000))
#   group  start   end
# 1     a      1 3e+05
# 2     b 300001 6e+05
# 3     c 600001 1e+06
# If the groups are sorted and do not overlap:
groups <- rep(ranges$group, (ranges$end - ranges$start) + 1)
Then continue as before:
categories <- groups[numbers]
Edit: As @Jabau MS said, in this case you need to add 1 to (ranges$end - ranges$start). (Already edited in the example above.) Additionally, your first start coordinate should be 1 and not 0.
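The lookup-vector approach relies on integer positions, while the sample values in the question (125.5, 807.5) are not whole numbers; R truncates such indices toward zero. As an alternative, and not the method used in this answer, base R's findInterval does a binary search over sorted break points. A minimal sketch, assuming the first two ranges from the question plus a made-up third category C for values of 651 and above:

```r
# Lower bound of each range, in increasing order. The first two come from the
# question (0-200 is A, 201-650 is B); the third is an assumption for illustration.
starts <- c(0, 201, 651)
labels <- c("A", "B", "C")

values <- c(100, 125.5, 807.5, 345.2)

# For each value, findInterval() returns the index of the last start <= value.
idx <- findInterval(values, starts)

# Map the interval indices to category labels.
categories <- labels[idx]
```

With these break points a value such as 200.5 falls into A, because each range runs up to the next start. Values below the first start get index 0, which R silently drops when indexing, so such inputs would need to be handled separately.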