This is the first post in a new “mini” series on tool features and research results. Unlike previous long-winded blog posts, each post in the mini series gives only a very short description of the topic, shows an example and concludes.
bioknack’s collection of tools has been extended with a small command line tool for retrieving the PubMed IDs that are associated with a given gene. The information is retrieved from www.pubmed2ensembl.org: the user specifies only a species and an Ensembl gene ID, and the tool queries the data sources “Entrez Gene”, “MEDLINE”, “PMC”, “EMBL BLAST”, “EMBL XREF” and “text2genome” (see this poster for a data-source description). Results are printed in TSV format. The first column gives the BioMart attribute that was queried and the second column gives the PubMed ID that was returned for that attribute.
Examples
Query 1: Query PubMed IDs for the Ensembl gene ENSG00000139618 for the default species Homo sapiens
bk_pubmed2ensembl.rb -g ENSG00000139618
Output:
blast56_c100t_flat_blast56_c100t_pmid_1090	8524414
blast56_c100t_flat_blast56_c100t_pmid_1090	8640236
blast56_c100t_flat_blast56_c100t_pmid_1090	12100744
blast56_c100t_flat_blast56_c100t_pmid_1090	14722926
embl_flat_embl_pmid_1092	8640236
entrez_flat_entrez_pmid_1094	1072445
entrez_flat_entrez_pmid_1094	7581463
entrez_flat_entrez_pmid_1094	7597059
entrez_flat_entrez_pmid_1094	8091231
[...]
Query 2: Query PubMed IDs for the Ensembl gene FBgn0001325 of Drosophila melanogaster
bk_pubmed2ensembl.rb -s dmelanogaster -g FBgn0001325
Output:
entrez_flat_entrez_pmid_1094	1327756
entrez_flat_entrez_pmid_1094	1346367
entrez_flat_entrez_pmid_1094	1346608
entrez_flat_entrez_pmid_1094	1348871
entrez_flat_entrez_pmid_1094	1423595
entrez_flat_entrez_pmid_1094	1438276
entrez_flat_entrez_pmid_1094	1451665
entrez_flat_entrez_pmid_1094	1457465
entrez_flat_entrez_pmid_1094	1463605
entrez_flat_entrez_pmid_1094	1480489
[...]
Query 3: List the available species
bk_pubmed2ensembl.rb -l
Output:
Available species:
acarolinensis
btaurus
celegans
cfamiliaris
choffmanni
cintestinalis
cjacchus
cporcellus
csavignyi
[...]
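Because the same PubMed ID can be reported by more than one data source (8640236 appears under both a BLAST and an EMBL attribute in Query 1), it can be handy to regroup the two-column output by PubMed ID. The following is only a minimal sketch of such post-processing, not part of bioknack: the method name `pmids_by_source` is made up for illustration, and the sample rows are copied from the Query 1 output above, assuming the columns are separated by a single tab.

```ruby
# Regroup the two-column TSV output (attribute, PubMed ID) so that each
# PubMed ID maps to the list of attributes that reported it.
# NOTE: `pmids_by_source` is a hypothetical helper, not part of bioknack.
def pmids_by_source(tsv)
  result = Hash.new { |hash, key| hash[key] = [] }
  tsv.each_line do |line|
    attribute, pmid = line.chomp.split("\t")
    next if attribute.nil? || pmid.nil? # skip blank or malformed lines
    result[pmid] << attribute unless result[pmid].include?(attribute)
  end
  result
end

# Sample rows taken from the Query 1 output (tab-separated by assumption).
sample = [
  "blast56_c100t_flat_blast56_c100t_pmid_1090\t8524414",
  "blast56_c100t_flat_blast56_c100t_pmid_1090\t8640236",
  "embl_flat_embl_pmid_1092\t8640236",
  "entrez_flat_entrez_pmid_1094\t1072445"
].join("\n")

# Print one line per PubMed ID, followed by the attributes that returned it.
pmids_by_source(sample).each do |pmid, attributes|
  puts "#{pmid}\t#{attributes.join(',')}"
end
```

To run this on real output rather than the sample, replacing `sample` with `STDIN.read` and piping the result of `bk_pubmed2ensembl.rb -g ENSG00000139618` into the script should work, provided the tool's output matches the format shown above.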
Acknowledgements
- bk_pubmed2ensembl.rb makes use of Darren Oakley’s ‘biomart’ Ruby gem.




Joachim
June 18, 2011
Now with -b parameter for bulk retrieval and -i for passing gene IDs via STDIN.