The Scots Law Student

The SLS : Life and trials of learning law in Scotland

Tag: pdf

File formats and the pirate bay

The Pirate Bay is a great source of material for blog posting. Oddly enough, this isn’t about the issue of, you know, their big court case. This is actually about their rather entertaining “Legal Threats” page. The Pirate Bay has (had?) a policy whereby if you found someone had posted a torrent with your copyrighted material on the Pirate Bay tracker / search engine, you could write to the Pirate Bay and they… would promptly ignore it. Or they’d send you a cheeky reply.

They post the letters they get on this page. Mostly these are copies of emails, shown as simple plain text listings, generally with lots of lawyerly signatures including the words “STRICTLY CONFIDENTIAL” etc. However, one of the documents is interesting because it’s a PDF. The Pirate Bay took this and replied with a 1 megabyte image in .BMP format which looked a lot like this:

Pirate Bay message

“I can use annoying formats too” they say. But is PDF annoying? I’m not so sure.

With my techie hat on I know that the best form to find text in is simple, human readable plain text, the sort of thing you’d get if you typed it in Notepad. It’s just the words: you can do anything with it, you can copy and paste it into any other program, and any computer you can find will be able to display it. However, with my (law) student hat on I happen to really like the not so humble Portable Document Format.

What is PDF?

It’s probably worth talking about what PDF is by comparing it to the other options for text.

1) Plain text

Examples, created by: Notepad, Text Editor

Pares everything down to the words themselves. There is no option for formatting, fonts, colours, pages, anything. All you do is type a long sheet of contiguous text. The great thing is the sheer efficiency of what you produce. The document provides all the substantive content of the fancier formats but without messing with formatting issues.

Pros

  1. Very lightweight
  2. Easily transferred
  3. Easily modified in many different programs running on many different systems
  4. Easily adapted into other forms, not burdened by extra code put in for formats etc.

Cons

  1. No formatting, at all. Need to use things like *bold* or /italic/ to distinguish formatting
  2. No diagrams. It’s possible to do using letters and symbols but no chance for images in the text
  3. Can be hard to set out – things like footnoting and tables of contents pretty much need to be set out by hand in the vast majority of plain text editors.
  4. Can be very elegant, can be very crude.

2) Rich text

Examples, created by: MS Word, OpenOffice

Pros

  1. Most common kind of text – every web page and every Word document are rich text.
  2. Allows visible formatting – select text and make it bold, italic etc. Allows fonts
  3. Allows image embedding; depending on the specific format this can be within the file itself (eg, Word documents) or through referencing (eg, web pages)
  4. Can be very feature rich – templates, automated footnoting, automated table of contents etc are all possible.

Cons

  1. Extra features means compatibility suffers. Documents created in MS Word may have compatibility issues when opened in slightly different programs, eg. OpenOffice, Word Perfect, Abiword.
  2. Although you can choose various fonts for your documents these fonts will only appear on other people’s computers if they also have the same fonts installed. If they don’t they’ll see a fallback option which you may not have chosen. There are ways around this.
  3. Will not look the same on every computer, settings will vary and the resulting document can be affected.

3) Image

Examples, created by: Paint, Photoshop

I might surprise some people by including this option here but I really do think that image formats are a real option (of sorts) for conveying text on a computer. The flexibility that allows the same picture format to contain a picture of a funny cat or a world famous old master also allows it to hold the shape of words.

Pros:

  1. Document looks exactly as it did on your computer for everyone
  2. Very easily shared between users – every modern computer can understand the common picture formats, so no need for specialist software to view it.
  3. Very, very good for diagrams. Will look exactly as intended, allows full colour and photorealistic images to be included directly with the text.
  4. Very flexible layout – not bound by justification or layout tags, can put elements in anywhere on the document

Cons

  1. Very big files for email etc (the Pirate Bay image was 1 megabyte for 7 words)
  2. Can be hard to edit, and editing it well requires specialist software that’s hard to use
  3. Can be hard to add extra pages
  4. Not actually text – only an image so can’t be copied and manipulated like a text document
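
The size gap in the first con can be checked with some rough arithmetic. The sketch below assumes an uncompressed 24-bit BMP of 640 by 480 pixels. The real dimensions of the Pirate Bay image are not given here, so these are guesses chosen to land near 1 megabyte. It compares that with the quoted message stored as plain text.

```python
def bmp_size_bytes(width, height, bits_per_pixel=24):
    """Approximate size of an uncompressed BMP file."""
    header = 14 + 40  # file header plus the common 40-byte info header
    # Each row of pixels is padded to a multiple of 4 bytes.
    row = ((bits_per_pixel * width + 31) // 32) * 4
    return header + row * height


message = "I can use annoying formats too"
text_size = len(message.encode("ascii"))
image_size = bmp_size_bytes(640, 480)  # assumed dimensions, not measured

print(f"plain text: {text_size} bytes")
print(f"bitmap:     {image_size} bytes")
print(f"the image is roughly {image_size // text_size} times larger")
```

A compressed format such as PNG would shrink the image a great deal, but it would still not be text that you can copy and paste.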

4) Device Independent formats

Examples, created by: Acrobat, Foxit, TeX

Pros

  1. Will look the same on every computer (is device independent). Designed to be transferred between computers
  2. Allows you to rely on page, line numbers because it is identical to each user
  3. Allows direct embedding of images, allows for diagrams to be laid in text exactly where intended by the creator
  4. Is still text, so can be copied and pasted as text. Possible to also have original image as well as text, for example if scanning a book, in the same document
  5. Can be pretty immutable, so provides quite a good historical reference. (eg, harder to edit a PDF report from Westlaw than an RTF)

Cons

  1. Can be “annoying” – that is if you’re browsing the internet and you come across a PDF document your browser will need to load an external reader.
  2. Can be expensive. Adobe Acrobat, the official program for creating PDFs, is not cheap. On the other hand, DVI and PDF files can also be produced by many different free and open-source programs.
  3. Can be pretty immutable, it can be difficult to just change something in a PDF document.

Now, if I point you to the second pro under 4), I think I will show you a huge reason to like PDF (and other device independent formats): the ability to rely on the page numbers, so that the useful summation of a case’s ratio at the bottom of page 4 is at the bottom of page 4 on everyone’s computer.
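
To see why page and line references are fragile in formats that reflow, here is a minimal sketch using Python's standard textwrap module. The sample sentence and the two window widths are made up for illustration. The same text is wrapped at two widths and the line on which a given word lands is reported for each.

```python
import textwrap

# Made-up sample text for illustration only.
SAMPLE = (
    "The pursuer drank ginger beer from an opaque bottle and later "
    "found a snail in it. The court held that the manufacturer owed "
    "her a duty of care. That duty is the ratio of the case."
)


def line_of(word, text, width):
    """Return the 1-based line on which `word` lands when `text` is wrapped."""
    for number, line in enumerate(textwrap.wrap(text, width=width), start=1):
        if word in line.split():
            return number
    return None


narrow = line_of("ratio", SAMPLE, width=40)  # e.g. a small window
wide = line_of("ratio", SAMPLE, width=80)    # e.g. a larger window

print(f"narrow window: line {narrow}")
print(f"wide window:   line {wide}")
```

The word sits on a different line in each case, which is exactly the problem a fixed-layout format avoids.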

I can’t really understand why you would email someone a PDF version of a letter instead of writing your message in the email itself. I find that strange but I don’t think that means that the format is annoying. Feel free to use these formats in your own workflow. They’re good.

How to generate pdfs of books or case reports while in the library

I’ve been looking at programs which may help me in my studies. One of the most promising I’ve found is one which is intended to allow people to create multi page pdf copies of any documents, books, whiteboards or cards they can photograph. The whiteboard mode is surprising and I’m not certain it fits into my current teaching style. However, there is nothing quite like being able to see exactly what the teacher has written on a whiteboard long after the lesson has finished.

It’s called Snapter and I’m pleasantly surprised with how effective it is. I tested it out with my camera phone and a copy of 100 Cases Every Scots Law Student Should Know. As long as you remember to abide by the rules the program gives you (take the photos from straight above with the spine vertical in the image), you can reliably create a very readable pdf from the images. It’s not a quick process, and it’s almost certainly the most processor intensive application you will ever use for your legal studies, but the results are very surprising and usable. I’ve done an example here with Scott Adams’s “Way of the Weasel”, which I chose because it includes text boxes and images alongside text, so it’s actually more complicated to scan than most law textbooks.

Snapter has a deceptively simple interface for what is a powerful program, with many features and controls hidden in the boxes. For the best results you should set the controls each time you use Snapter, but the defaults manage well on their own. I found the most useful option was the “original size

Basic photographic principles apply. If you used a higher resolution camera and a better lens with a tripod you would see better results than these. These test shots came from my 3.2 megapixel Sony Ericsson K800i camera phone, which I chose because it’s the only camera I routinely take to the library. Users with newer phones with 5 or more megapixel cameras will almost certainly find that the pdfs produced are extremely readable even on small text.

I intend to use Snapter to replace my photocopying, which makes the $50 price tag for the full version (needed to fully enable the program’s Book mode after the free trial expires) extremely affordable. With photocopying running at about 3-6p per sheet, the expense of photocopying personal copies of cases becomes substantial. Also, filing the vast amounts of photocopying which you naturally generate as a law student is a task which requires considerable discipline to avoid the dreaded student “pile of paper under the desk”. Being able to directly create pdfs of reference books without needing to photocopy them is more economical and more ecological, with the added advantage that the files are harder to lose than the photocopies.
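
The cost argument can be put into rough numbers. This sketch works out how many photocopied sheets it takes to spend as much as the Snapter licence costs. The exchange rate is an assumption: $50 is treated as roughly £30, so substitute the current figure before relying on the result.

```python
def break_even_sheets(price_pence, pence_per_sheet):
    """Smallest number of photocopied sheets that costs at least the price."""
    return -(-price_pence // pence_per_sheet)  # ceiling division on integers


# Assumption: $50 is treated as about 30 pounds (3000 pence).
SNAPTER_PRICE_PENCE = 3000

for pence_per_sheet in (3, 6):
    sheets = break_even_sheets(SNAPTER_PRICE_PENCE, pence_per_sheet)
    print(f"At {pence_per_sheet}p per sheet the licence pays for itself "
          f"after {sheets} sheets")
```

Under that assumption the break-even point falls somewhere between 500 and 1000 sheets, which a law student can get through in a few lengthy cases and a textbook chapter or two.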

There are other book scanning solutions but these tend to rely on the user being able to scan the book using a specially designed flatbed scanner (for example the PlusTek Optiscan), which is less than ideal in a law library. Snapter’s advantage comes from the convenience of being able to take a record of the exact text you need on the fly using nothing other than the devices you would already be carrying.

You can use it to inexpensively produce copies of cases for other people as well. Instead of needing to recopy each page of your own photocopy for others you can simply email the pdf around, and you can do the processing on your laptop while you are in the library, all while using your university’s reproduction licence. It’s not the fastest process, so be aware that it will both drain battery life and take its time, but it’s the only example of automatically transforming photos of books into documents that I’ve seen. It’ll save paper, money and the environment in its own small way.

The direct competitors to this are the online legal databases, which also give you the option of downloading a digital copy of the report to your computer, and I find these a better option than hurriedly produced Snapter pdfs. However, Westlaw does not provide copies of textbooks, nor does it provide copies of cases which are either very old or very obscure, and it is in these situations that Snapter shines. If your law library provides paper copies of journals or law reports which are not available online in full text format then you need some way to make a copy for yourself.

With many of the most sought after books only available on loan from the library for a matter of hours, a student may sometimes find that they spend the entire time they have with the book running it through a photocopier instead of reading it. A fast camera can take photos of every page of a textbook within a university’s short loan time, which means that books which are extremely sought after (for example the set textbook) can be copied out. The prohibitive expense of photocopying a textbook is considerably lessened when your only outlay is the fixed cost of a digital camera and a copy of Snapter. Remember too that with law textbooks retailing for around £40 from the university bookshop (and science subjects cost even more), any use that a student can get from the library is to be pounced on.

For those students who are also looking at using Snapter to produce copies of music, students in Glasgow can use the libraries of other higher education institutions, including the Glasgow School of Art and the Royal Scottish Academy of Music and Drama, on a reference only basis, which means that you can use the RSAMD to find sheet music for yourself. I read here that Snapter was less impressive at capturing music books but I disagree based on my experiences using the newest version.

I was so surprised to read that Snapter gave such poor results on capturing music that I immediately grabbed a book of scales off my shelf and tried it for myself. I believe I have a newer version than was tested since I downloaded my copy last night. Again, I used a Sony Ericsson K800i camera phone, which is only 3.2Mpx. Some of the text is smudged (small bold text had a harder time of it) because of the resolution and the height I had to take the picture at to get both pages in frame, but the edges of the picture were detected perfectly and there was no issue seeing marks on semiquavers or the like.
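
The link between resolution, shooting height and smudged small text can be estimated roughly. The sketch below assumes typical sensor widths for each megapixel class (2048 pixels for 3.2 megapixels, 2592 pixels for 5 megapixels) and a two-page spread about 30 cm wide filling the whole frame. Both are illustrative guesses, not measurements of my phone or my books.

```python
CM_PER_INCH = 2.54


def pixels_per_inch(pixels_across, width_cm):
    """Pixels available per inch of paper along one edge of the photo."""
    return pixels_across / (width_cm / CM_PER_INCH)


# Assumptions: common sensor widths for each megapixel class, and a
# two-page spread about 30 cm wide filling the whole frame.
SPREAD_WIDTH_CM = 30
cameras = {"3.2 megapixel": 2048, "5 megapixel": 2592}

for name, pixels_across in cameras.items():
    ppi = pixels_per_inch(pixels_across, SPREAD_WIDTH_CM)
    print(f"{name}: about {ppi:.0f} pixels per inch")
```

Flatbed scans of text are commonly made at around 300 dots per inch, so both figures are on the low side, which fits with small bold text coming out smudged while larger marks stay clear.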

I’m all for Snapter. I think it’s designed for times you couldn’t bring an automated book scanner with you (in my case when I’m at the reference library) and it does very well using even phone photos in those situations. It beats having to scan photocopies at home or having no copy at all, that’s for sure. I think it will provide a very important service for students above all, but remember that the possibility of generating digital versions of paperwork is often very useful even just for collaboration with other people by email, for instance emailing digital copies of forms to other professionals. Consider Snapter to be an extremely flexible (allowing for the easily foxed edge detection), inexpensive digitiser which can be used anywhere that a photocopier or a scanner would also work, with a much smaller footprint and less time spent with the original.
