# PLOS figures in R

I’m preparing a manuscript for PLOS ONE and saw this in the figure guidelines:

Figure text must be in Arial font, between 8 and 12 point.

Easy, I thought. Just a matter of specifying a font family in the device I print to.

Think again.

Apparently, R does not make it easy to use fonts other than its built-in defaults. Of course, I’m not the first person to encounter this problem. In an excellent post, Gavin Simpson explains how to get around this, and even how to embed fonts in PDFs produced by R. In short, take a look at the extrafont package. It enables you to use the fonts on your system in your R figures.

To get this to work properly, I had to point the import at a single directory, since I apparently had multiple copies of Arial on my system (Mac OS X 10.9). This is easily done with font_import(paths=c('/Library/Fonts'), recursive=FALSE), which ensures that only one copy is used.

# Authors and affiliations in LaTeX

Oh, that’s right, I have a blog. Had almost forgotten about it.

I just faced the problem of setting multiple authors in a LaTeX document together with their affiliations. On top of that, one of the authors had multiple affiliations. Google returned this excellent answer on TeX StackExchange. In short: use the authblk package:

\documentclass[a4paper,11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{authblk}

\title{More than one Author with different Affiliations}
\author[1]{Author A\thanks{A.A@university.edu}}
\author[2]{Author D\thanks{D.D@university.edu}}
\author[1,2]{Author E\thanks{E.E@university.edu}}
\affil[1]{Department of Computer Science, \LaTeX\ University}
\affil[2]{Department of Mechanical Engineering, \LaTeX\ University}

\begin{document}
\maketitle
\end{document}


\thanks is useful for adding footnotes to the authors, in this case their email addresses.

# Python and XML

Today I faced a task where I had to parse huge XML files. And when I say huge, I mean 6-14 GB. My weapon of choice is Python, since I’m comfortable with it. However, I had never parsed XML with it before. Because of the size of the files, it was infeasible to load an entire file into memory, and for my purposes it was not necessary either.

After Googling for a while, I found that many people recommend the ElementTree module and its C equivalent, cElementTree. The function iterparse proved to be a real lifesaver. By iterating through the element tree and deleting elements as you go, you consume only small amounts of memory. The following snippet is more or less taken from the documentation.

# cElementTree was removed in Python 3.9; the plain ElementTree module
# uses the C accelerator automatically when it is available.
import xml.etree.ElementTree as ET

# Get an iterable of (event, element) pairs
context = ET.iterparse('huge_xml_file.xml', events=('start', 'end'))
# Get an iterator
context = iter(context)
# Get the root element to be able to clear it
event, root = next(context)

# Iterate through the tree and do what you have to do
for event, elem in context:
    if event == 'start' and elem.tag == '{some_namespace}some_tag_name':
        # The element was opened here
        pass
    if event == 'end' and elem.tag == '{some_namespace}some_tag_name':
        # Now the whole element is available! Process it here.
        # When done with it, call
        root.clear()  # to free up some memory

I don’t like the way I had to specify the namespaces, but I guess there’s a better way of doing it. When running this on a 6.4 GB XML file (161 million rows, 81 million elements), the code above did not consume more than 15 MB of memory. I don’t remember how long it took, but it was reasonably fast.
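To try the pattern without a multi-gigabyte file at hand, here is a minimal, self-contained sketch that runs the same loop over a tiny in-memory document. The namespace URI, the tag name `record` and the function name `count_records` are all made up for illustration; only `iterparse`, `iter`, `next` and `clear` come from the standard library.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical namespace and tag name, used only for this illustration
NS = 'http://example.com/ns'
TAG = '{%s}record' % NS  # ElementTree spells qualified tags as {uri}name


def count_records(source, tag=TAG):
    """Stream through `source` and count the closed `tag` elements."""
    context = iter(ET.iterparse(source, events=('start', 'end')))
    # The first event is the start of the root element
    _, root = next(context)
    count = 0
    for event, elem in context:
        if event == 'end' and elem.tag == tag:
            count += 1
            # Drop the children parsed so far to keep memory use low
            root.clear()
    return count


# A tiny stand-in for the huge file
sample = io.BytesIO(
    b'<data xmlns="http://example.com/ns">'
    b'<record id="1">a</record>'
    b'<record id="2">b</record>'
    b'<other/>'
    b'</data>'
)
n_records = count_records(sample)
print(n_records)
```

Building the qualified tag once, as in `TAG`, at least keeps the `{namespace}` prefix out of the loop itself.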

# Minecraft: Pi Edition

I’ve never played Minecraft (if you don’t count my futile attempts at playing Minicraft on my Windows Phone), but now it seems a (free) version of it is coming to the Raspberry Pi. Not only can you play it like the ordinary Minecraft, you can also program it in real time! Check out the video below.

Since I haven’t gotten started with Minecraft yet, this is a golden opportunity for some procrastination! It’s hopefully going to be released by the end of this year, and I can barely wait!

# Raspberry Pi Camera

My Raspberry Pi has been collecting dust for far too long now. I’ve had some projects in mind, but I haven’t found the motivation to realise them yet. Now, however, the Raspberry Pi camera is on its way! Hopefully it will be available sometime in the first half of next year, and I want one. For a while I’ve been thinking that I want to try implementing some machine learning methods, e.g. for facial recognition, just for fun. This seems like a golden opportunity! It shoots 1080p at 30 fps according to DesignSpark.

Since I have a webcam in my computer, that should work as well, but I’m having trouble installing the Python bindings for OpenCV. It feels like I’ve tried every configuration possible, but I still can’t get it to work. If I get some time on my hands I will primarily try to fix that, but the Raspberry Pi camera is a tempting option. Even if I get it to work on my Mac, I will probably buy it anyway.

# Slow going

As of now, I’ve had Internet in my apartment for a few weeks, and as soon as I got it I apparently stopped posting here. Constructive, yeah I know.

Anyhow, today I had my start seminar for my PhD where I introduced my project to the rest of the group plus some other people. It went better than I expected, and I’m really excited to get started for real. However, there’s a lot of administrative stuff that has to be solved before I can really dig in.

I really like it at the University. The predecessor to the University of today was founded in 1859, and in 2005 it got University status and was renamed the Norwegian University of Life Sciences. Some of the buildings are quite old, but in a charming way. With narrow corridors and strange floor plans, it was a mess finding my way around in the beginning. Just next to my office there’s actually a small dairy. Sometimes they have cream left over, and you can just go there and fill a bottle. Pretty nice!

# SpeechJammer

I still don’t have an Internet connection in my apartment, but now it should be on its way!

Meanwhile, I found that this year’s Ig Nobel Prize in acoustics was awarded to Kazutaka Kurihara and Koji Tsukada for creating the SpeechJammer. It’s a device that disturbs people’s speech by simply playing back what the person is saying, but with a tiny delay. I’ve experienced this myself while working in technical support at a major Swedish ISP. Sometimes I could hear my own voice in the headset, with a delay, and it was tremendously difficult to talk to the customer without stuttering.

If only I had thought of this application back then (2007), I could’ve been awarded an Ig Nobel Prize instead. Ah, well…

# Still no Internet

I’m still waiting to get the stuff I need to set up an Internet connection in my apartment. I couldn’t imagine how restless I would become without an Internet connection, but it hasn’t been all that bad. I managed to read through book six in the Dark Tower series by Stephen King, and I’m halfway through the seventh (and last) book. I highly recommend this series if you’re a fan of the work of Stephen King, and even if you’re not. I also managed to pick up an old hobby of mine: drawing. If I feel confident enough, maybe I’ll post some of my drawings here.

Next week I will go to Umeå for some meetings, and a crash course in RNAseq data analysis. It’ll be nice to come back “home” and see my girlfriend and her dog. Not that I don’t like it here. I feel quite at home here in Norway. The people are friendly, and the infrastructure is nice. There’s just one thing that I find really annoying: most supermarkets are closed on Sundays. That sucks.

# MacBook Pro with retina display

Wow! I had heard that the retina displays were nice, but I couldn’t imagine how nice. I just got my work computer, a 15″ MacBook Pro with retina display, and I’ve only been working with it for 15 minutes. Still, when I look at my old 13″ MacBook, it looks absolutely dreadful. My economy won’t survive this awakening…

# dos2unix on Mac

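In Linux there’s a nice little tool called dos2unix that converts those nasty Windows line breaks (CRLF) into Unix line breaks (LF). It is not installed by default on OS X, but the same thing can be done with tr by deleting the carriage returns. (Replacing them instead, as in tr '\r' '\n', is only right for old Mac-style files that use a bare carriage return; on Windows files it doubles every line break.)

tr -d '\r' < file_to_convert.txt > output.txt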


Nice and simple, and it seems to work just fine.
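If Python is closer at hand than the shell, the same conversion can be sketched in a few lines. This is only a minimal sketch that assumes the file fits in memory; the function names `dos_to_unix` and `convert_file` are made up for illustration.

```python
def dos_to_unix(data: bytes) -> bytes:
    """Turn Windows line breaks (CRLF) into Unix line breaks (LF)."""
    return data.replace(b'\r\n', b'\n')


def convert_file(src: str, dst: str) -> None:
    """Read `src` as bytes, convert the line breaks and write to `dst`."""
    with open(src, 'rb') as infile:
        data = infile.read()
    with open(dst, 'wb') as outfile:
        outfile.write(dos_to_unix(data))


# Quick check on an in-memory example
converted = dos_to_unix(b'first line\r\nsecond line\r\n')
print(converted)
```

Working on bytes rather than text avoids any guessing about the file’s encoding, since only the line-break bytes are touched.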