Jérôme Belleman

Editing PDFs

14 Jun 2015

Two ways of editing existing PDF documents: one with LaTeX for automation's sake, another one with Inkscape for visualisation's sake. This post describes how.

Googling for "pdf editor linux" will take you to discussions such as "Which programs can I use to edit PDF files?" or "How to Edit PDFs?", which refer to tools such as flpsed, PDFedit, even LibreOffice Draw. I quite like using LaTeX or Inkscape myself, depending on the needs.

1 LaTeX

This approach doesn't allow you to change the PDF per se, so much as it lets you add contents on top of an existing PDF file and generate a new PDF. The LaTeX graphicx package supports PDF perfectly and TikZ is a rather good solution for placing text and adding drawings. The idea is to make an overlaid tikzpicture for each page and place e.g. text nodes. This part is relatively easy and can be automated on the fly with any good text editor.

This approach being non-graphical, the difficulty lies in actually finding the right position for those nodes. This will involve repeatedly changing coordinates and checking the result. You'll soon want to use all the shortcuts you can wield to make this bearably fast. For instance, I find it convenient to use Vim's CTRL-X and CTRL-A commands to decrease and increase values respectively, as with a slider, and to run :w | !latexmk -pdf once, then repeat it with @: to save and compile the file quickly. Unlike in Inkscape, you can't use snapping at all, so the positioning precision can only be perceptual.

Your file will look like this:

\documentclass[a4paper]{article}
\usepackage{graphicx}
\usepackage{tikz}

\pagestyle{empty}

\begin{document}

\begin{tikzpicture}[remember picture,overlay]
    \node at (current page.center) {\includegraphics{originaldoc.pdf}};
    \node at (0,5)  {Foo};
    \node at (0,10) {Bar};
\end{tikzpicture}

\pagebreak

\begin{tikzpicture}[remember picture,overlay]
    \node at (current page.center) {\includegraphics[page=2]{originaldoc.pdf}};
    \node at (1,10) {Boo};
\end{tikzpicture}

\end{document}
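
To cut down on the trial and error, one option is to overlay a temporary grid and read coordinates off it. The following is only a sketch under a few assumptions: the page is A4, the file is still called originaldoc.pdf, and the origin is moved to the bottom-left corner of the page with a shifted scope, which is my choice here rather than something the example above does. Because of remember picture, the document needs compiling twice before positions settle, which latexmk takes care of.

```latex
\documentclass[a4paper]{article}
\usepackage{graphicx}
\usepackage{tikz}

\pagestyle{empty}

\begin{document}

\begin{tikzpicture}[remember picture,overlay]
    \node at (current page.center) {\includegraphics{originaldoc.pdf}};
    % Move the origin to the bottom-left corner of the page (assumption:
    % coordinates are then page coordinates in cm)
    \begin{scope}[shift={(current page.south west)}]
        % Temporary helper grid with 1 cm steps; delete once nodes are placed
        \draw[help lines,step=1] (0,0) grid (21,29.7);
        \foreach \x in {1,...,20} \node[font=\tiny] at (\x,0.3) {\x};
        \foreach \y in {1,...,29} \node[font=\tiny] at (0.3,\y) {\y};
        % Example node: 10.5 cm from the left edge, 20 cm from the bottom
        \node at (10.5,20) {Foo};
    \end{scope}
\end{tikzpicture}

\end{document}
```

Once the nodes sit where they should, the \draw and the two \foreach lines can be removed and the rest kept as is.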

2 Inkscape

Inkscape is a superb vector graphics editor which supports a variety of formats. Since version 0.91 and thanks to poppler, it's become rather good at supporting PDF files which you can open and save. Imported PDF documents are organised in groups of objects which you can all edit graphically, i.e. remove, duplicate, change their colours, shape – anything you fancy, really. What's more, you can use snapping to position objects with the best possible precision.

To edit a PDF file in Inkscape, simply create a new Inkscape document, then open menu File → Import and choose the original PDF file. Inkscape can unfortunately only import one page at a time, which you need to select in the PDF Import Settings dialogue that will appear while importing the file. I never found any of the Clip to options to do anything useful, so don't bother with them. However, do tick the import via Poppler setting to ensure the best possible PDF support. I normally leave the other options at their default values.

One annoying problem is that the imported page is cropped to its contents, so importing an A4 document will always result in an object smaller than A4 if the content of the page is smaller than the page itself. LaTeX to the rescue: you can draw a rectangle the size of each page with this snippet of code. Just change the loop condition according to the number of pages, and the path to the original PDF file.

\documentclass[a4paper]{article}
\usepackage{tikz}
\usepackage{graphicx}
\usepackage{forloop}

\pagestyle{empty}
\newcounter{pagenumber}

\begin{document}
  % Change the max page number
  \forloop{pagenumber}{1}{\value{pagenumber} < 3}{
    \begin{tikzpicture}[remember picture,overlay]
      \node [rectangle,draw] at (current page.center) 
        % Change the file path
        {\includegraphics[page=\value{pagenumber}]{lipsum.pdf}};
    \end{tikzpicture}

    \pagebreak
  }
\end{document}
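
A possible alternative, if you'd rather not count pages by hand, is the pdfpages package: its pages=- option includes every page and its frame option draws a frame around each one. This is a different technique from the forloop snippet above, shown only as a sketch. It assumes the same lipsum.pdf file name, and whether Inkscape treats that frame as the page outline on import would need checking.

```latex
\documentclass[a4paper]{article}
\usepackage{pdfpages}

\begin{document}
  % pages=- includes all pages of the file; frame draws a border around each
  % Change the file path
  \includepdf[pages=-,frame]{lipsum.pdf}
\end{document}
```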

The page will be imported as a single group of object groups which you can repeatedly ungroup with menu Object → Ungroup to be able to edit them. If you've added a clipping rectangle, you can easily remove it at this point.

Once you're satisfied, you can use menu File → Save to directly save a new PDF file. Again, Inkscape doesn't understand multiple pages so you'll need to save a PDF file for each one of them. I use pdfjoin to join all pages to a single PDF file again:

pdfjoin lipsum-1.pdf lipsum-2.pdf

This will create a single PDF file called lipsum-2-joined.pdf with all the pages joined. You'll find that the resulting file will be much heavier than the original one – Inkscape isn't too good at keeping PDF files optimised.
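
pdfjoin is part of pdfjam, which itself builds on the LaTeX pdfpages package, so the same join can be sketched directly in LaTeX if pdfjoin isn't available. The file names are the ones from the command above, and the output takes the name of whatever you call the .tex file (joined.tex, a name chosen purely for illustration, would give joined.pdf).

```latex
\documentclass[a4paper]{article}
\usepackage{pdfpages}

\begin{document}
  % Each \includepdf appends all pages (pages=-) of the given file, in order
  \includepdf[pages=-]{lipsum-1.pdf}
  \includepdf[pages=-]{lipsum-2.pdf}
\end{document}
```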

3 Summary

                            LaTeX             Inkscape
Support for multiple pages  Yes               No
Clipping                    Correct           Incorrect
Graphical editing           No                Yes
Automation                  Easy              Difficult
Positioning precision       Only perceptual   Exact with snapping
Resulting PDF size          Light             Heavy
