I am working with NCBI Reference Sequence accession numbers like variable a:

a <- c("NM_020506.1","NM_020519.1","NM_001030297.2","NM_010281.2","NM_011419.3", "NM_053155.2")  

To get information from the biomaRt package, I need to remove the version suffix (.1, .2, etc.) after the accession numbers. I normally do this with this code:

b <- sub("..*", "", a)

# [1] "" "" "" "" "" ""

But as you can see, this isn't the correct way for this variable. Can anyone help me with this?

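In a regular expression an unescaped "." matches any character, so the pattern "..*" matches the whole string and sub replaces it with "". To remove the part of the string from the "." onward, use the gsub function with the escape characters (\\) before the "." so that it is treated as a literal period, as follows: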

a <- c("NM_020506.1","NM_020519.1","NM_001030297.2",
       "NM_010281.2","NM_011419.3", "NM_053155.2")

gsub("\\..*", "", a)


# [1] "NM_020506"    "NM_020519"    "NM_001030297" "NM_010281"    "NM_011419"
# [6] "NM_053155"
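A few equivalent variants, as a minimal sketch in base R. The names b, b2 and b3 are only illustrative, and the anchored pattern assumes the version suffix is always a dot followed by digits at the end of the accession:

```r
a <- c("NM_020506.1", "NM_020519.1", "NM_001030297.2",
       "NM_010281.2", "NM_011419.3", "NM_053155.2")

# Escaped dot, anchored to the end of the string:
# removes only a trailing ".<digits>" version suffix
b <- sub("\\.[0-9]+$", "", a)

# Same idea without backslashes: inside a character class
# the dot is literal, so "[.].*" removes from the first dot onward
b2 <- sub("[.].*", "", a)

# No regex at all: split on a fixed "." and keep the first piece
b3 <- vapply(strsplit(a, ".", fixed = TRUE), `[`, character(1), 1)

print(b)
# [1] "NM_020506"    "NM_020519"    "NM_001030297" "NM_010281"    "NM_011419"
# [6] "NM_053155"
```

The anchored form is the most conservative of the three, since it leaves a string untouched if it does not end in a dot followed by digits.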

