
Converting Mailman "Gzip'd Text" archive files to proper mbox files

Mailman archives are often only available in the pretty useless "Gzip'd Text" format, which you cannot easily download and view locally (and threaded) in a MUA such as mutt. But that is exactly what I want to do from time to time, e.g. to read the past weeks' discussions on mailing lists I have only just subscribed to.

After some searching I found one way to do it, which I stripped down to my needs:

 $ cat mailman2mbox
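 #!/usr/bin/perl
 # Turn a Mailman "Gzip'd Text" archive (on STDIN) into an mbox (on STDOUT).
 while (<STDIN>) {
   # Undo the address obfuscation: "user at example.org" -> "user@example.org".
   s/^(From:? .*) (at|en) /$1\@/;
   # Rewrite asctime-style dates into mail-header form; the +0000 offset is assumed.
   s/^Date: ([A-Z][a-z][a-z]) +([A-Z][a-z][a-z]) +([0-9]+) +([0-9:]+) +([0-9]+)/Date: $1, $3 $2 $5 $4 +0000/;
   print;
 }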
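To see what the two substitutions do to individual lines, here is a rough Python port of the same regular expressions. It is only a sketch for illustration: the function name `fix_line` and the sample address are made up, and it shares the script's limitations (the timezone is simply assumed to be +0000, and any body line that starts with "From" and contains " at " gets rewritten too).

```python
import re

# Mailman obfuscates addresses as "user at example.org" ("en" in some locales).
FROM_RE = re.compile(r"^(From:? .*) (at|en) ")
# Dates appear in asctime style, e.g. "Sat Aug  1 12:34:56 2009".
DATE_RE = re.compile(
    r"^Date: ([A-Z][a-z][a-z]) +([A-Z][a-z][a-z]) +([0-9]+) +([0-9:]+) +([0-9]+)"
)


def fix_line(line):
    """Apply both substitutions to a single line (hypothetical helper)."""
    # Greedy ".*" means the last " at " on the line becomes "@".
    line = FROM_RE.sub(r"\1@", line, count=1)
    # Reorder to "Day, DD Mon YYYY HH:MM:SS" and append an assumed +0000 offset.
    line = DATE_RE.sub(r"Date: \1, \3 \2 \5 \4 +0000", line, count=1)
    return line


if __name__ == "__main__":
    for sample in (
        "From jdoe at example.org  Sat Aug  1 12:34:56 2009",
        "From: jdoe at example.org (John Doe)",
        "Date: Sat Aug  1 12:34:56 2009",
    ):
        print(fix_line(sample))
```

The first line is the mbox separator, which is what lets mutt split the file into individual messages; the `Date:` rewrite gives mutt a header it can parse for sorting.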

Example run on some random mail archive:

 $ wget http://participatoryculture.org/pipermail/develop/2009-August.txt.gz
 $ gunzip 2009-August.txt.gz
 $ ./mailman2mbox < 2009-August.txt > 2009-August.mbox

You can then view the mbox as usual in mutt:

 $ mutt -f 2009-August.mbox

Suggestions for a simpler method to do this are highly welcome. Maybe some mbox related Debian package already ships with a script to do this?

jhead - List and modify EXIF fields in JPEG photos

jhead is a very nice and very powerful command line utility to mess with JPEG headers (esp. EXIF fields).

  $ apt-get install jhead

It can display a large number of metadata fields from JPEG files and also extract the thumbnails stored in them (if any). The following will list all known metadata fields from a sample photo:

  $ wget http://farm4.static.flickr.com/3173/3061542361_60acb0904b_o.jpg
  $ jhead *.jpg
  File name    : 3061542361_60acb0904b_o.jpg
  File size    : 1074172 bytes
  File date    : 2008:11:26 23:38:04
  Camera make  : Panasonic
  Camera model : DMC-FZ18
  Date/Time    : 2008:03:05 15:45:52
  Resolution   : 3264 x 2448
  Flash used   : No
  Focal length : 4.6mm  (35mm equivalent: 28mm)
  Exposure time: 0.0100 s  (1/100)
  Aperture     : f/3.6
  ISO equiv.   : 100
  Whitebalance : Auto
  Metering Mode: matrix
  Exposure     : program (auto)
  GPS Latitude : N %:.7fd %;.8fm %;.8fs
  GPS Longitude: E %;.8fd %:.7fm %;.8fs
  GPS Altitude : 174.00m
  Comment      : Aufgenommen auf dem <a href="http://www.froutes.de/TT00000014_Ars_Natura">Kunstweg Ars Natura</a>.
  ======= IPTC data: =======
  Record vers.  : 4
  Headline      : Felsburg auf dem Felsberg
  (C)Notice     : www.froutes.de
  Caption       : Aufgenommen auf dem <a href="http://www.froutes.de/TT00000014_Ars_Natura">Kunstweg Ars Natura</a>.

As you can see, there's a huge amount of potentially privacy-sensitive metadata in a typical JPEG as generated by your camera (including camera type, settings, date/time, maybe even the GPS coordinates of your location, etc.).

You can extract the thumbnail stored in all JPEGs in the current directory with:

  $ jhead -st "&i_t.jpg" *.jpg
  Created: '3061542361_60acb0904b_o.jpg_t.jpg'

Random flickr image and its differing thumbnail

Note that the JPEG thumbnail does not necessarily show the same picture as the JPEG itself. Depending on the image manipulation software that was used to create the edited/fixed/cropped JPEG, the thumbnail may still reflect the original JPEG contents (see sample image on the right-hand side). This is a huge potential privacy issue. There were a number of articles about this some years ago, in case you missed them.

Thus, an important jhead command line to know is the following, which removes all metadata (including any thumbnails) from all JPEG images in the current directory:

  $ jhead -purejpg *.jpg
  Modified: 3061542361_60acb0904b_o.jpg

As you can see the result is that only very basic information can be gathered from the file afterwards:

  $ jhead *.jpg
  File name    : 3061542361_60acb0904b_o.jpg
  File size    : 1052506 bytes
  File date    : 2008:11:26 23:38:04
  Resolution   : 3264 x 2448
  $ jhead -st "&i_t.jpg" *.jpg
  Image contains no thumbnail
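If you want to double-check the result without jhead, the following Python sketch walks the segment markers at the start of a JPEG and reports the ones that usually carry metadata: APP1 (EXIF, including the embedded thumbnail, and XMP), APP13 (IPTC) and COM (comments). This is a simplified illustration and not how jhead itself works; the function name `list_metadata_segments` is made up, and the parser assumes every marker before the image data carries a length field, which holds for typical camera files but not necessarily for every valid JPEG.

```python
import struct

# Markers that typically hold metadata rather than image data.
METADATA_MARKERS = {
    0xE1: "APP1 (EXIF/XMP)",
    0xED: "APP13 (IPTC/Photoshop)",
    0xFE: "COM (comment)",
}


def list_metadata_segments(data):
    """Return (segment name, payload size) pairs found before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte before the real marker: skip it.
            pos += 1
            continue
        if marker == 0xDA:
            # Start of scan: compressed image data follows, stop here.
            break
        # The 2-byte big-endian length includes the length field itself.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in METADATA_MARKERS:
            found.append((METADATA_MARKERS[marker], length - 2))
        pos += 2 + length
    return found
```

Feed it the raw bytes of a file, e.g. `list_metadata_segments(open(path, "rb").read())`; after a `jhead -purejpg` run the returned list should be empty.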

I recommend doing this for most photos you make publicly available on sites like flickr etc. (unless you have a good reason not to). Finally, see the jhead(1) manpage for lots more options that the tool supports.

The Underhanded C Contest - Results

Being too busy sucks. I didn't even have the time to blog about the Underhanded C Contest, whose results have now been announced.

Quick reminder: the goal of the contest is to

write innocent-looking C code implementing malicious behavior. In many ways this is the exact opposite of the Obfuscated C Code Contest: in this contest you must write code that is as readable, clear, innocent and straightforward as possible, and yet it must fail to perform at its apparent function. To be more specific, it should do something subtly evil.

I blogged about the contest earlier, but only later decided to take part in the contest myself (together with Daniel Reutter). After some initial brainstorming we hacked together our solution in roughly one day.

Although we didn't win (damn, no beer for us ;-), we managed to submit one of the simplest solutions (ca. 34 lines of code), and in so little code it's very hard to hide anything malicious but innocent-looking... Our solution exploits an array bounds overrun caused by an extra equals sign ("<=" instead of "<").
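The general class of bug is easy to illustrate. The sketch below is not the contest entry; it merely models a C-style memory layout in Python as a flat bytearray, with a hypothetical 4-byte buffer directly followed by a hypothetical one-byte flag, and shows how "<=" in the loop condition writes one element too far. All names (`BUF_LEN`, `FLAG_OFFSET`, `fill`, `run`) are made up for the example.

```python
BUF_LEN = 4
FLAG_OFFSET = BUF_LEN  # hypothetical flag byte stored right behind the buffer


def fill(memory, value, inclusive):
    """Fill the buffer with `value`; inclusive=True mimics `i <= BUF_LEN`."""
    i = 0
    while (i <= BUF_LEN) if inclusive else (i < BUF_LEN):
        memory[i] = value
        i += 1


def run(inclusive):
    """Return the flag byte after filling the buffer."""
    memory = bytearray(BUF_LEN + 1)  # 4 buffer bytes + 1 flag byte, all zero
    fill(memory, 0x41, inclusive)
    return memory[FLAG_OFFSET]


if __name__ == "__main__":
    print("with '<' :", run(inclusive=False))  # flag stays untouched
    print("with '<=':", run(inclusive=True))   # flag gets overwritten
```

In real C the extra write lands in whatever happens to sit behind the array, which is exactly what makes a single stray "=" so easy to overlook in review.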

I have yet to look at the two winning entries by M. Joonas Pihlaja and Paul V-Khuong (team submission), as well as Natori Shin. Congratulations, guys! Also, I noticed the Slashdot story about the contest results, but didn't get around to reading that article, either. Sigh...

Biferno - Reinvent the Wheel

From tabasoft.it:

Biferno is a new generation, Cross Platform Web Scripting Language that allows developers the rapid implementation of dynamic Web applications and of Web sites that offer a high degree of user interactivity.

Biferno is an Open Source Project distributed under the GNU GENERAL PUBLIC LICENSE, its current version is 1.2.0.

So what's wrong with PHP, Perl, Python, Ruby (on Rails)? Why develop yet another (web) scripting language, especially one which looks like an exact PHP clone? Can anyone enlighten me as to why they're doing it and what advantages they see over other languages?

(via Schockwellenreiter)
