0 votes
in Data Science by (17.6k points)

I have a dataset that contains a single column with a huge number of rows. The column contains public IP addresses. Since it is possible to get the geolocation of these IPs from a lookup service, I want to generate a column of country names containing the country name for the IP in each row. Here is my naive approach -


#Import the XML package used to parse the API responses

library(XML)

#Import your list of IPs

ip.addresses <- read.csv("ip-address.csv")

#This is my API

api.url <- ""

#Appending API URL before each of the IPs

api.with.ip <- paste(api.url, ip.addresses$IP.Addresses, sep = "")

#Creating an empty vector for collecting the country names

country.vec <- c()

#Running a for loop to parse country name for each IP

for(i in api.with.ip) {

    #Using xmlParse & xmlToList to extract the IP information

    data <- xmlParse(i)
    ip.info <- xmlToList(data)

    #Selecting only the country name via $CountryName;
    #if the country name is NULL, putting NA instead

    if(is.null(ip.info$CountryName)) {
      country.vec <- c(country.vec, NA)
    } else {
      country.vec <- c(country.vec, ip.info$CountryName)
    }
}



#Combining IPs with its corresponding country names into a dataframe

result <- data.frame(ip.addresses, country.vec)

colnames(result) <- c("IP Address", "Country")

#Exporting the dataframe as csv file

write.csv(result, "IP_to_Location.csv")

But as I have a huge number of rows, this for-loop approach is very slow. How can the process be made faster?
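One incremental improvement, independent of which lookup service is used, is to stop growing country.vec with c() inside the loop: that copies the whole vector on every iteration, making the loop quadratic in the number of rows. Building the column in a single pass with sapply avoids that. A minimal sketch, assuming the same XML-returning API and the same $CountryName field as in the question:

library(XML)

#Build the country column in one pass instead of growing a vector;
#each element is the CountryName from the parsed response, or NA if absent
country.vec <- sapply(api.with.ip, function(u) {
  ip.info <- xmlToList(xmlParse(u))
  if (is.null(ip.info$CountryName)) NA else ip.info$CountryName
})

Note that this removes the vector-growing overhead but still makes one web request per IP, so the network round trips remain the dominant cost.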

1 Answer

0 votes
by (41.4k points)

This process can be made much faster by using the 'rgeolocate' package with a local MaxMind .mmdb database. Every lookup then hits a file on disk instead of making one web request per IP:



library(rgeolocate)

ipdf <- read.csv("IP_Address.csv")

ipmmdb <- system.file("extdata","GeoLite2-Country.mmdb", package = "rgeolocate")

results <- maxmind(ipdf$IP.Address, ipmmdb, "country_name")

export.results <- data.frame(ipdf$IP.Address, results$country_name)

colnames(export.results) <- c("IP Address", "Country")

write.csv(export.results, "IP_to_Locationmmdb.csv")
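If some IPs are not present in the bundled GeoLite2 database, their country_name comes back as NA, so it can be worth checking how many lookups failed before exporting. A short sketch, continuing from the code above (the "Unknown" label is just an illustrative choice):

#Count how many IPs could not be resolved to a country
sum(is.na(export.results$Country))

#Optionally label unresolved IPs explicitly in the export
export.results$Country[is.na(export.results$Country)] <- "Unknown"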

