
I have a dataset with one column and a huge number of rows. The column contains public IP addresses, so it's possible to get the geolocation for each IP using a site like http://freegeoip.net. I want to generate a column containing the country name for the IP in each row. Here is my naive approach:

library(XML)

#Import your list of IPs
ip.addresses <- read.csv("ip-address.csv")

#This is my API
api.url <- "http://freegeoip.net/xml/"

#Appending API URL before each of the IPs
api.with.ip <- paste(api.url, ip.addresses$IP.Addresses, sep="")

#Creating an empty vector for collecting the country names
country.vec <- c()

#Running a for loop to parse country name for each IP
for(i in api.with.ip)
{
    #Using xmlParse & xmlToList to extract IP information
    data <- xmlParse(i)
    xml.data <- xmlToList(data)

    #Selecting only Country Name by using xml.data$CountryName
    #If Country Name is NULL then putting NA
    if(is.null(xml.data$CountryName)){
      country.vec <- c(country.vec, NA)
    }
    else{
      country.vec <- c(country.vec, xml.data$CountryName)
    }
}

#Combining IPs with its corresponding country names into a dataframe
result <- data.frame(ip.addresses,country.vec)
colnames(result) <- c("IP Address", "Country")

#Exporting the dataframe as csv file
write.csv(result, "IP_to_Location.csv")

But since I have a huge number of rows, my for-loop approach is very slow. How can I make the process faster?

1 Answer


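You can make this much faster with the 'rgeolocate' package, which looks up IPs in a local MaxMind database file (.mmdb) instead of making one HTTP request per IP.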

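library(rgeolocate)

# Adjust to the folder that contains your CSV file
setwd("/home/imran/Documents/")

# Read the IPs as character strings (older R versions default to factors)
ipdf <- read.csv("IP_Address.csv", stringsAsFactors = FALSE)

# Path to the GeoLite2 country database bundled with rgeolocate
ipmmdb <- system.file("extdata", "GeoLite2-Country.mmdb", package = "rgeolocate")

# Look up all IPs in one call against the local database
results <- maxmind(ipdf$IP.Address, ipmmdb, "country_name")

export.results <- data.frame(ipdf$IP.Address, results$country_name)
colnames(export.results) <- c("IP Address", "Country")
write.csv(export.results, "IP_to_Locationmmdb.csv")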
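
With a huge number of rows, many IPs are likely to repeat, so a further option is to look up each distinct address only once and map the results back to the rows. The following is only a minimal sketch of that idea: lookup_country() is a hypothetical stand-in (not part of rgeolocate) that returns made-up labels so the example runs without a database file, and the sample IPs are illustrative.

```r
# Hypothetical stand-in for a real lookup such as rgeolocate::maxmind();
# it returns made-up labels so the sketch runs without a database file.
lookup_country <- function(ips) {
  paste0("country-of-", ips)
}

# Illustrative input with repeated addresses
ips <- c("8.8.8.8", "1.1.1.1", "8.8.8.8", "8.8.8.8", "1.1.1.1")

# Look up each distinct IP once...
unique.ips <- unique(ips)
unique.countries <- lookup_country(unique.ips)

# ...then map the results back to every row by position
countries <- unique.countries[match(ips, unique.ips)]

result <- data.frame(ips, countries, stringsAsFactors = FALSE)
colnames(result) <- c("IP Address", "Country")
```

Here only two lookups are made for five rows. How much this helps in practice depends on how many duplicates your data actually contains.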
