Blogs

Using mdadm to recover from a dead disk in a Linux RAID-1 array

Yes, it's that time of the year again. A disk in my desktop-replacement laptop with 2 disks and a RAID-1 has died. Time for recovery.

This laptop has been running 24/7 for the last 3 years or such, so it's not too surprising that a disk dies. Surprisingly though, for the first time in a long series of dead disks, smartctl -a does indeed show errors for this disk. Here's a short snippet of those:

  $ smartctl -a /dev/sda
  [...]
  Error 1341 occurred at disk power-on lifetime: 17614 hours (733 days + 22 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 41 02 1f c0 9c 40  Error: UNC at LBA = 0x009cc01f = 10272799

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   60 f8 08 20 c0 9c 40 00  41d+01:51:50.974  READ FPDMA QUEUED
   60 08 00 18 c0 9c 40 00  41d+01:51:50.972  READ FPDMA QUEUED
   ef 10 02 00 00 00 a0 00  41d+01:51:50.972  SET FEATURES [Reserved for Serial ATA]
   ec 00 00 00 00 00 a0 00  41d+01:51:50.971  IDENTIFY DEVICE
   ef 03 45 00 00 00 a0 00  41d+01:51:50.971  SET FEATURES [Set transfer mode]

  SMART Self-test log structure revision number 1
  Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
  # 1  Short offline       Completed: read failure       90%     20511         156170102
  [...]
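
The numbers in that error entry can be cross-checked by hand. Here's a minimal sketch, assuming the 28-bit register layout this summary log appears to use (SN, CL and CH hold LBA bits 0-23, the low nibble of DH holds bits 24-27). The function name is made up for illustration:

```python
def lba_from_registers(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from ATA task-file registers (assumed layout)."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

# Register values copied from the "40 41 02 1f c0 9c 40" line above.
lba = lba_from_registers(sn=0x1F, cl=0xC0, ch=0x9C, dh=0x40)
print(hex(lba), lba)            # 0x9cc01f 10272799

# Power-on lifetime: split 17614 hours into days and remaining hours.
days, hours = divmod(17614, 24)
print(days, hours)              # 733 22
```

Both results match what smartctl printed (LBA 10272799, 733 days + 22 hours).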

The status of the degraded RAID array looks like this:

  $ cat /proc/mdstat
  Personalities : [raid1] 
  md1 : active raid1 sdb7[1]
       409845696 blocks [2/1] [_U]
  md0 : active raid1 sda6[0] sdb6[1]
       291776 blocks [2/2] [UU]

The [_U] means that one of the two disks has failed; normally it would be [UU]. There are actually two RAID-1 arrays: a small md0 (sda6 + sdb6) for /boot, and the main md1 (sda7 + sdb7), which holds the OS and my data. Apparently (at first at least), only sda7 was faulty and got kicked out of the array:

  $ dmesg | grep kick
  md: kicking non-fresh sda7 from array!
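
If you'd rather not eyeball /proc/mdstat, the [_U] check can be scripted. This is only a rough sketch, assuming the simple two-line-per-array layout shown above; the function name is my own invention:

```python
import re

def degraded_arrays(mdstat_text):
    """Return {array: status} for md arrays whose status has a '_' slot."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        m = re.match(r"^(md\d+)\s*:", line)
        if m:
            current = m.group(1)
            continue
        # The status is the trailing [UU] / [_U] / [U_] group.
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if current and status:
            if "_" in status.group(1):
                result[current] = status.group(1)
            current = None
    return result

SAMPLE = """\
Personalities : [raid1]
md1 : active raid1 sdb7[1]
     409845696 blocks [2/1] [_U]
md0 : active raid1 sda6[0] sdb6[1]
     291776 blocks [2/2] [UU]
"""
print(degraded_arrays(SAMPLE))   # {'md1': '_U'}
```

On a real system you would pass in open("/proc/mdstat").read() instead of the sample text.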

Anyway, I ordered a replacement disk, removed the dead disk (I checked the serial number and brand beforehand, so that I wouldn't accidentally remove the wrong one), inserted the new disk and rebooted.

Note: For this to work, you must have (previously) installed the bootloader (usually GRUB) onto both disks; otherwise you may not be able to boot from the surviving disk after the other one dies. In my case, sda was now dead, so I put sdb into its place (physically, by using the other SATA connector/port) and the new replacement disk would become the new sdb.

After the reboot, the new disk needs to be partitioned like the other RAID disk. This can be done easily by copying the partition layout of the "good" disk (now sda after the reboot) onto the empty disk (sdb):

  $ sfdisk -d /dev/sda | sfdisk /dev/sdb

Specifically, the RAID partitions need to have the type/ID "fd" ("Linux raid autodetect"); check that this is the case. Then you can add the new partitions to the RAID arrays:

  $ mdadm /dev/md0 --add /dev/sdb6
  $ mdadm /dev/md1 --add /dev/sdb7
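
The partition type check mentioned above (before running the --add commands) can also be done on the `sfdisk -d` dump. A minimal sketch, assuming the dump marks MBR partition types as `Id=fd` (older sfdisk) or `type=fd` (newer sfdisk). The sample dump and the function name are hypothetical:

```python
import re

def non_raid_partitions(dump_text, wanted):
    """List wanted partitions whose type in an `sfdisk -d` dump is not 'fd'."""
    types = {}
    for line in dump_text.splitlines():
        m = re.match(r"^(/dev/\S+)\s*:.*\b(?:Id|type)=\s*([0-9A-Fa-f]+)", line)
        if m:
            types[m.group(1)] = m.group(2).lower()
    return [p for p in wanted if types.get(p) != "fd"]

# Hypothetical dump, NOT taken from my disks: sdb7 has the wrong type here.
SAMPLE_DUMP = """\
/dev/sdb5 : start=     2048, size=  4194304, Id=82
/dev/sdb6 : start=  4198400, size=   583552, Id=fd
/dev/sdb7 : start=  4784000, size=819691392, Id=83
"""
print(non_raid_partitions(SAMPLE_DUMP, ["/dev/sdb6", "/dev/sdb7"]))
# ['/dev/sdb7']
```

An empty list means all partitions you asked about are of type "fd".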

After a few hours the RAID will be re-synced properly and all is good again. You can check the progress via:

  $ watch -n 1 cat /proc/mdstat
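
While the rebuild is running, /proc/mdstat gains a progress line. Here's a small sketch that pulls the numbers out of it, assuming the usual "recovery = N% (done/total) finish=Nmin" layout. The sample line is made up, not copied from my machine:

```python
import re

def resync_progress(mdstat_text):
    """Return progress info from the first recovery/resync line, or None."""
    m = re.search(
        r"(recovery|resync)\s*=\s*([\d.]+)%\s*\((\d+)/(\d+)\)\s*finish=([\d.]+)min",
        mdstat_text,
    )
    if not m:
        return None
    return {
        "kind": m.group(1),
        "percent": float(m.group(2)),
        "done_blocks": int(m.group(3)),
        "total_blocks": int(m.group(4)),
        "finish_minutes": float(m.group(5)),
    }

# Hypothetical progress output for illustration only.
SAMPLE = """\
md1 : active raid1 sdb7[2] sda7[1]
      409845696 blocks [2/1] [_U]
      [==>..................]  recovery = 12.6% (51654016/409845696) finish=74.6min speed=80000K/sec
"""
info = resync_progress(SAMPLE)
print(info["percent"], info["finish_minutes"])   # 12.6 74.6
```

When no rebuild is in progress, the function returns None.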

You should probably not reboot during the resync (though I'm not 100% sure if that would be an issue in practice; please leave a comment if you know).

Also, don't forget to install GRUB on the new disk so you can still boot when the next disk dies:

  $ grub-mkdevicemap
  $ grub-install /dev/sdb

And it might be a good idea to use S.M.A.R.T. to check the new disk, just in case. I did a quick run for the new disk via:

  $ smartctl -t short /dev/sdb # Wait a few minutes after this.
  $ smartctl -a /dev/sdb
  [...]
  SMART Self-test log structure revision number 1
  Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
  # 1  Short offline       Completed without error       00%        22         -
  [...]

Looks good. So far.

libsigrokdecode 0.1.1 released, more protocol decoders supported


Just a quick announcement: We released libsigrokdecode 0.1.1 today, a new version of one of the shared libraries that are part of the open-source sigrok project (for signal acquisition/analysis with various test and measurement gear, such as logic analyzers, scopes, multimeters, etc.). I will update the Debian package soonish.

As you probably know, in addition to the infrastructure for protocol decoding, this library also ships with a bunch of protocol decoders written in Python. Currently we support 29 different ones (in various states of "completeness", improvements are ongoing).

This release adds support for the following new protocol decoders:
CAN probing

Please check the announcement on the sigrok blog and/or the NEWS file for the full list of changes and improvements.

Happy hacking and decoding!

sigrok at the 29th Chaos Communication Congress (29c3)


Yup, it's been a while since my last blog post, but I'm not dead yet. Most of my spare time goes into sigrok development these days (open-source signal analysis suite for logic analyzers, oscilloscopes, multimeters, and lots more), but I'll try to revive my blog too. I have various microcontroller/embedded topics and devices I want to talk about in a small blog post series in the nearer future. But more on that later.

Feel free to subscribe to the sigrok-devel mailing list, join us on IRC in #sigrok (Freenode) where most of the discussions take place, or follow our new sigrok blog (RSS) if you're interested in the ongoing sigrok developments. Anyway, for now just a quick announcement:

Same as last year, we will be at the Chaos Communication Congress (29c3), this time in Hamburg, Germany. The conference takes place from December 27th to 30th, 2012.

We'll have a sigrok "assembly", likely in area 3b of the conference building, where we'll be hanging around, working on new sigrok features, new hardware drivers, new protocol decoders and various other things. We'll have lots of gear with us for demo and development purposes, including logic analyzers, oscilloscopes, MSOs, multimeters, and lots more.

Bring your own device if you own models we don't yet support or know about. We'll be happy to have a look!

Chat with us, give us your suggestions which features you'd like to see, which devices you want to be supported, which protocol decoders you'd like to have, or even help us write some drivers/decoders!

Hope to see you there!

sigrok - cross-platform, open-source logic analyzer software with protocol decoder support


I'm happy to finally announce an open-source (GNU GPL), cross-platform (Linux, Mac OS X, FreeBSD, Windows, ...) logic analyzer software package that Bert Vermeulen and I have been working on for quite a long time now: sigrok (it groks your signals).

History

I originally started working on an open-source logic analyzer software package named "flosslogic" in 2010, because I grew tired of almost all devices coming with proprietary, Windows-only software, often with limited features, limited input/output file formats, limited usability, limited protocol decoder support, and so on. Thus, the goal was to write portable, GPL'd software that can talk to many different logic analyzers via modules/plugins, and that supports many input/output formats and many different protocol decoders.

The advantage is that every time we add a new driver for another logic analyzer, it automatically supports all the input/output formats we already have, you can use all the protocol decoders we already wrote, etc. It also works the other way around: If someone writes a new protocol decoder or file format driver, it can automatically be used with any of the supported logic analyzers out of the box.
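
To illustrate the idea (and only the idea), here's a toy sketch of such a plugin design in Python. The registries, decorators and the "demo" driver are all hypothetical and have nothing to do with the actual libsigrok API, which is written in C:

```python
# Hypothetical registries; illustrative names, not libsigrok's API.
DRIVERS = {}
OUTPUT_FORMATS = {}

def driver(name):
    def register(func):
        DRIVERS[name] = func
        return func
    return register

def output_format(name):
    def register(func):
        OUTPUT_FORMATS[name] = func
        return func
    return register

@driver("demo")
def demo_driver(num_samples):
    """Pretend hardware: yields an 8-bit counter as samples."""
    for i in range(num_samples):
        yield i & 0xFF

@output_format("bits")
def to_bits(samples):
    return [format(s, "08b") for s in samples]

@output_format("hex")
def to_hex(samples):
    return [format(s, "02x") for s in samples]

def capture(driver_name, format_name, num_samples):
    """Any registered driver can be combined with any registered format."""
    samples = DRIVERS[driver_name](num_samples)
    return OUTPUT_FORMATS[format_name](samples)

print(capture("demo", "bits", 3))  # ['00000000', '00000001', '00000010']
print(capture("demo", "hex", 3))   # ['00', '01', '02']
```

Adding one more driver or one more output format means adding one more decorated function; every existing combination keeps working.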

Turns out Bert Vermeulen had been working on similar software for a while too (for exactly the same reasons: crappy Windows software, etc.), so it was only logical that we joined forces and worked on this together. We kept Bert's name for the software package ("sigrok"), set up a SourceForge project, mailing lists, IRC channel, wiki, etc. and started working.

Overview, Features

You can get the latest sigrok source code from our main git repository:

  $ git clone git://sigrok.git.sourceforge.net/gitroot/sigrok/sigrok

Here's a short overview of sigrok and its features as of today. The software consists of the following components:

  • libsigrok, a shared library written in C, which contains the general infrastructure for handling logic analyzer data in a streaming fashion.
    It also contains the individual hardware drivers which add support for various logic analyzers. Currently supported hardware includes: Saleae Logic, CWAV USBee SX, Openbench Logic Sniffer (OLS), ZEROPLUS Logic Cube LAP-C, ASIX Sigma/Sigma2, ChronoVu LA8, and others. Many more devices are on our TODO list (and we already own them), it's just a matter of time to reverse engineer the USB protocols and implement a driver for them.

    Thanks ASIX for being open and helping with the ASIX Sigma driver, and many thanks to ChronoVu for being open as well and providing information about the ChronoVu LA8 protocol! Thanks to Håvard Espeland, Martin Stensgård, and Carl Henrik Lunde (who contributed the ASIX Sigma driver), Sven Peter and "Haxx Enterprises"/bushing (for contributing the ZEROPLUS Logic Cube LAP-C driver, ported from their zerominus tool). Also, thanks to Daniel Ribeiro and Renato Caldas who worked on the Link Instruments MSO-19 driver (still work in progress).

    Finally, libsigrok also contains the individual input/output file format drivers. Currently supported are: sigrok session (the default format, which contains all metadata), bits, hex, ASCII, binary, gnuplot, the OpenBench Logic Sniffer format, the ChronoVu LA8 format, Value Change Dump (VCD) viewable in gtkwave, and Comma-separated values (CSV).

  • libsigrokdecode, a shared library written in C, which contains the protocol decoder infrastructure and the protocol decoders themselves, which are written in Python (>= 3.0).

    The list of currently supported protocol decoders includes:

      dcf77                DCF77 time protocol
      lpc                  Low-Pin-Count
      mx25lxx05d           Macronix MX25Lxx05D
      jtag_stm32           Joint Test Action Group / ST STM32
      i2s                  Integrated Interchip Sound
      spi                  Serial Peripheral Interface
      edid                 Extended display identification data
      pan1321              Panasonic PAN1321
      mlx90614             Melexis MLX90614
      jtag                 Joint Test Action Group
      rtc8564              Epson RTC-8564 JE/NB
      transitioncounter    Pin transition counter
      usb                  Universal Serial Bus
      i2cdemux             I2C demultiplexer
      i2c                  Inter-Integrated Circuit
      i2cfilter            I2C filter
      mxc6225xu            MEMSIC MXC6225XU
      uart                 Universal Asynchronous Receiver/Transmitter
    

    Many more decoders are on our TODO list, and we especially welcome contributed protocol decoders, of course! We intentionally chose Python as the implementation language for the decoders, to make them as easy to write (and understand) as possible, even if that means that performance suffers a bit. Have a look at the SPI decoder for example, to get a feeling for the implementation.

    Protocol decoders can be stacked on top of each other, e.g. you can run the i2c decoder and pipe its output into the rtc8564 (Epson RTC-8564 JE/NB) decoder for further processing of the RTC-specific, higher-level protocol. We also plan to support more complex stacking and combining of decoders in various ways in the nearer future.

  • sigrok-cli, a command-line frontend which uses both libsigrok and libsigrokdecode. It can acquire samples from logic analyzers and output them in various formats into files or to stdout, and/or run protocol decoders on the acquired data.

    Example: Data acquisition with 1MHz samplerate into a file.

     $ sigrok-cli -d chronovu-la8:samplerate=1mhz --time 1ms -o test.sr
    

    Example: Protocol decoding (JTAG).

     $ sigrok-cli -i test.sr -a jtag:tdi=5:tms=2:tck=3:tdo=7
     [...]
     jtag: "New state: EXIT1-IR"
     jtag: "IR TDI: 11111110, 8 bits"
     jtag: "IR TDO: 11110001, 8 bits"
     jtag: "New state: UPDATE-IR"
     jtag: "New state: RUN-TEST/IDLE"
     [...]
    

  • sigrok-qt, a Qt-based GUI for sigrok, using both libsigrok and libsigrokdecode.

    This is intended to be a cross-platform GUI (runs fine and looks "native" on Linux, Windows, Mac OS X) supporting data acquisition and protocol decoding.

    NOTE: The Qt GUI is not yet usable! We're working on getting it out of alpha-stage for the next release.

  • sigrok-gtk, a GTK+-based GUI for sigrok, using both libsigrok and libsigrokdecode (soon).
    This is a cross-platform GUI contributed by Gareth McMullin (thanks!), supporting data acquisition (and soon protocol decoding).

    NOTE: The GTK+ GUI is not yet fully usable (but it's more usable than sigrok-qt)! Consider it alpha-stage software for now.
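
The decoder stacking described in the libsigrokdecode item above can be sketched with plain Python generators. To be clear, this is a toy model and not the libsigrokdecode decoder API: the low-level stage samples a data line on rising clock edges (roughly what an SPI mode 0 decoder does, MSB first), and the stacked stage interprets the resulting bytes as made-up (register, value) pairs:

```python
def bits_to_bytes(samples):
    """Low-level decoder: sample data on rising clock edges, MSB first.

    `samples` is an iterable of (clk, data) pairs, one per sample.
    """
    prev_clk = 0
    value = 0
    count = 0
    for clk, data in samples:
        if clk == 1 and prev_clk == 0:      # rising edge
            value = (value << 1) | data
            count += 1
            if count == 8:
                yield value
                value = 0
                count = 0
        prev_clk = clk

def register_writes(byte_stream):
    """Stacked decoder: treat each byte pair as (register, value)."""
    it = iter(byte_stream)
    for reg in it:
        val = next(it, None)
        if val is None:                     # incomplete pair at the end
            break
        yield f"write reg 0x{reg:02x} = 0x{val:02x}"

def clock_out(byte_values):
    """Helper: produce (clk, data) samples for the given bytes."""
    for b in byte_values:
        for bit in range(7, -1, -1):
            d = (b >> bit) & 1
            yield (0, d)
            yield (1, d)

annotations = list(register_writes(bits_to_bytes(clock_out([0x02, 0x59]))))
print(annotations)   # ['write reg 0x02 = 0x59']
```

The higher-level stage never sees raw samples, only the output of the stage below it, which is the whole point of stacking.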

We're happy to hear about other (maybe special-purpose) frontends you may want to write using libsigrok/libsigrokdecode as helper libs!

Firmware


Some logic analyzer devices require firmware to be uploaded before they can be used. As always, firmware is a bit of a pain, but here's what we currently do: For non-free firmware we provide instructions on how to extract it from the vendor software or from USB dumps, if possible. For distributable firmware we have a git repo where you can get it (thanks ASIX for allowing us to distribute the ASIX Sigma/Sigma2 firmware files!).

  $ git clone git://sigrok.git.sourceforge.net/gitroot/sigrok/sigrok-firmwares

Finally, for all Cypress FX2 based logic analyzers we have an open-source (GNU GPL) firmware named fx2lafw, started by me, but most of the work (including finishing the firmware) was then done by Joel Holdsworth, thanks! The support list includes Saleae Logic, CWAV USBee SX, CWAV USBee AX, Robomotic Minilogic/BugLogic3, Braintechnology USB-LPS, and many others. Get the code from the fx2lafw git repository:

  $ git clone git://sigrok.git.sourceforge.net/gitroot/sigrok/fx2lafw

Example dumps

We collect various captured logic analyzer signals / protocol dumps in the sigrok-dumps git repository:

  $ git clone git://sigrok.git.sourceforge.net/gitroot/sigrok/sigrok-dumps

They can be useful for testing the sigrok command-line application, the sigrok GUIs, or the protocol decoders.

We're happy to include further contributed example data in our repository; please send us .sr files of any interesting data/protocol you may come across (even if sigrok doesn't yet have a protocol decoder for that protocol). See the Example dumps wiki page for details.

Packages, distros, installers


I'm currently working on updated Debian packages for sigrok (it will be apt-get install sigrok to get everything), and we're happy about further packaging efforts for other distros. We have preliminary Windows installer files (using NSIS), but the Windows code needs some more fixes and portability improvements before it's really usable. On Mac OS X you can use Fink/MacPorts to install as usual; fancier .app installer files are being worked on.

Future

Apart from support for more logic analyzers, input/output formats, and protocol decoders, we have a number of other plans for the next few releases. This includes support for analog data, i.e. support for (USB) oscilloscopes, multimeters, spectrum analyzers, and such stuff. This will also require additional GUI support (which could take a while). Also, we want to improve/fix the Windows support, and test/port sigrok to other architectures we come across. Performance improvements for the protocol decoding as well as more features there are also planned.

Contact

Feel free to contact us on the sigrok-devel mailing list, or in the IRC channel #sigrok on Freenode. There's also an identi.ca group for sigrok. We're always happy about feedback, bug reports, suggestions for improving sigrok, and patches of course!

HOWTO: Using OpenVPN on Debian GNU/Linux

Here's a quick HOWTO for setting up an OpenVPN server and client on any (Debian, in this case) Linux machine of your choice. I'm running an OpenVPN server on a box at home, and a client on my laptop, so I can securely route all my laptop traffic through my OpenVPN server, no matter where I am.

I highly recommend reading the official OpenVPN HOWTO from top to bottom, at least once. But here's a short, condensed HOWTO (specifically geared towards my needs, yours might be different):

On the server:

Install OpenVPN (apt-get install openvpn), then copy the "easy-rsa" files to /etc/openvpn/easy-rsa from where we'll use them to create our keys and certificates:

  $ cp -r /usr/share/doc/openvpn/examples/easy-rsa/2.0 /etc/openvpn/easy-rsa
  $ cd /etc/openvpn/easy-rsa

In the vars file change the KEY_SIZE variable from 1024 to 4096 for good measure:

  export KEY_SIZE=4096

Then, read in the vars file, clean old keys and certificates (if any) and create new ones:

  $ . ./vars
  $ ./clean-all
  $ ./build-ca

You'll now have the chance to enter some data such as country code (e.g. "DE"), state/province, locality, organization name, organizational unit name, common name, name, and email address. The values you choose don't really matter much (except for commonName, maybe, which could be your hostname or domain or such). Finally, the ca.key (root CA key) and ca.crt (root CA certificate) files will be created.

Next, we'll create the server key:

  $ ./build-key-server server

You'll have to enter lots of info again (see above), commonName could be "server" or such this time. Upon "Sign the certificate? [y/n]" say y, as well as upon "1 out of 1 certificate requests certified, commit? [y/n]". Finally, the server.key and server.crt files will be created.

Same procedure for creating a client key (I used "client1" as filename and commonName here):

  $ ./build-key client1

Next up we'll generate Diffie Hellman parameters (this will take a shitload of time due to keysize=4096, go drink some coffee):

  $ ./build-dh

When this step is done, you'll have a dh4096.pem file.

As we want to use OpenVPN's "tls-auth" feature as an additional layer of protection (it "adds an additional HMAC signature to all SSL/TLS handshake packets for integrity verification"; this is separate from perfect forward secrecy, which comes from the Diffie-Hellman key exchange), we'll have to generate a shared secret:

  $ openvpn --genkey --secret ta.key
  $ mv ta.key keys
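
To get a feeling for what "an additional HMAC signature" means, here's a minimal sketch using Python's standard hmac module. This is not OpenVPN's actual packet format; the framing (payload followed by a 20-byte HMAC-SHA1 tag) and the function names are assumptions made purely for illustration:

```python
import hashlib
import hmac
import os

TAG_LEN = 20  # length of an HMAC-SHA1 tag in bytes

def sign(key, packet):
    """Append an HMAC-SHA1 tag to a packet (illustrative framing only)."""
    tag = hmac.new(key, packet, hashlib.sha1).digest()
    return packet + tag

def verify(key, signed):
    """Return the payload if the tag checks out, else None."""
    packet, tag = signed[:-TAG_LEN], signed[-TAG_LEN:]
    expected = hmac.new(key, packet, hashlib.sha1).digest()
    return packet if hmac.compare_digest(tag, expected) else None

shared_key = os.urandom(32)          # stands in for the contents of ta.key
signed = sign(shared_key, b"handshake packet")

print(verify(shared_key, signed))                # b'handshake packet'
print(verify(b"some other key", signed))         # None
```

A peer that doesn't know the shared secret can't produce a valid tag, so its packets can be dropped before any expensive TLS processing happens.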

So much for creating keys. Now, we'll have to configure OpenVPN. Copy the default server config file and edit it:

  $ cd /etc/openvpn
  $ cp /usr/share/doc/openvpn/examples/sample-config-files/server.conf.gz .
  $ gunzip server.conf.gz

The most important change in my setup is that I use port 443/TCP instead of the usual OpenVPN default of 1194/UDP. This increases the chances that you'll be able to use OpenVPN almost anywhere, even in environments which firewall/block lots of stuff, since port 443/TCP (for HTTPS) will almost always be usable. I also uncommented the following line, which tells the client to use the VPN interface (usually tun0) by default, so that all the client's traffic (web browsing, DNS, and so on) goes over the VPN:

  push "redirect-gateway def1 bypass-dhcp"
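
In case you wonder how "def1" manages to override the default gateway without deleting it: it adds the two routes 0.0.0.0/1 and 128.0.0.0/1, which together cover the whole IPv4 space and are more specific than the existing 0.0.0.0/0 default route. Here's a small sketch of the longest-prefix-match logic; the helper function is mine and merely models what the kernel routing table does:

```python
import ipaddress

default = ipaddress.ip_network("0.0.0.0/0")
vpn_routes = [ipaddress.ip_network("0.0.0.0/1"),
              ipaddress.ip_network("128.0.0.0/1")]

def best_route(address, routes):
    """Longest-prefix match: the most specific covering route wins."""
    addr = ipaddress.ip_address(address)
    candidates = [r for r in routes if addr in r]
    return max(candidates, key=lambda r: r.prefixlen)

table = [default] + vpn_routes
print(best_route("8.8.8.8", table))        # 0.0.0.0/1
print(best_route("203.0.113.7", table))    # 128.0.0.0/1
```

Every address falls into one of the two /1 routes, so the original default route is never the best match while the VPN is up.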

Here's my server config file (comments and commented out lines stripped):

  port 443
  proto tcp
  dev tun
  ca /etc/openvpn/easy-rsa/keys/ca.crt
  cert /etc/openvpn/easy-rsa/keys/server.crt
  key /etc/openvpn/easy-rsa/keys/server.key  # This file should be kept secret
  dh /etc/openvpn/easy-rsa/keys/dh4096.pem
  server 10.8.0.0 255.255.255.0
  ifconfig-pool-persist ipp.txt
  push "redirect-gateway def1 bypass-dhcp"
  keepalive 10 120
  tls-auth /etc/openvpn/easy-rsa/keys/ta.key 0 # This file is secret
  comp-lzo
  user nobody
  group nogroup
  persist-key
  persist-tun
  status openvpn-status.log
  log-append openvpn.log
  verb 3

You can now start the OpenVPN server, e.g. via

  $ /etc/init.d/openvpn restart

Server firewall setup/changes:

I'm running a custom iptables script on pretty much all of my boxes. Here are the relevant changes needed to allow the OpenVPN server to work properly. Basically, you need to enable IP forwarding, accept/forward tun0 traffic and set up masquerading (change "eth0" below, if needed):

  echo 1 > /proc/sys/net/ipv4/ip_forward
  iptables -A INPUT -i tun+ -j ACCEPT
  iptables -A FORWARD -i tun+ -j ACCEPT
  iptables -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
  iptables -t nat -F POSTROUTING
  iptables -t nat -A POSTROUTING -s 10.8.0.0/24 -o eth0 -j MASQUERADE

My firewall script gets run upon every reboot. If you don't use such a script, you could add the above stuff to your /etc/rc.local file.
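
One detail that's easy to get wrong: the "-s 10.8.0.0/24" in the MASQUERADE rule has to describe the same network as the "server 10.8.0.0 255.255.255.0" line in the server config. A quick sketch for converting between the two notations (the function name is made up):

```python
import ipaddress

def cidr_from_server_directive(line):
    """Turn 'server <network> <netmask>' into CIDR notation."""
    _, network, netmask = line.split()
    return str(ipaddress.ip_network(f"{network}/{netmask}"))

cidr = cidr_from_server_directive("server 10.8.0.0 255.255.255.0")
print(cidr)                                   # 10.8.0.0/24

vpn_subnet = ipaddress.ip_network(cidr)
print(ipaddress.ip_address("10.8.0.6") in vpn_subnet)      # True
print(ipaddress.ip_address("192.168.1.10") in vpn_subnet)  # False
```

If you pick a different VPN subnet in server.conf, the iptables rule needs to change accordingly.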

On the client:

Install OpenVPN (apt-get install openvpn), then copy the default client config file and edit it:

  $ cd /etc/openvpn
  $ cp /usr/share/doc/openvpn/examples/sample-config-files/client.conf .

Change the parameters to match the server config (port 443/TCP, and so on) and use "tls-auth /etc/openvpn/ta.key 1" (note the "1" on the client, and the "0" on the server!). Replace xxx.xxx.xxx.xxx with the public IP address of your OpenVPN server. If it doesn't have a public, static IP address already, you can use services such as DynDNS, or (my preferred method), my ssh-based DIY poor man's dynamic DNS setup.

Here's my full client config:

  client
  dev tun
  proto tcp
  remote xxx.xxx.xxx.xxx 443
  resolv-retry infinite
  nobind
  user nobody
  group nogroup
  persist-key
  ca /etc/openvpn/ca.crt
  cert /etc/openvpn/client1.crt
  key /etc/openvpn/client1.key
  ns-cert-type server
  tls-auth /etc/openvpn/ta.key 1
  comp-lzo
  verb 3
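
Since the client config has to match the server config in a few places, here's a rough sketch of a consistency check. It only looks at the directives discussed in this post (proto, port, dev, comp-lzo, tls-auth direction) and uses a naive parser of my own; OpenVPN itself offers nothing like this function as far as I know:

```python
def parse_conf(text):
    """Map each directive to its list of arguments (comments stripped)."""
    conf = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            name, *args = line.split()
            conf[name] = args
    return conf

def tls_auth_direction(conf):
    args = conf.get("tls-auth", [])
    return args[1] if len(args) > 1 else None

def mismatches(server, client):
    """Return a list of human-readable problems (empty if all is fine)."""
    problems = []
    for name in ("proto", "dev"):
        if server.get(name) != client.get(name):
            problems.append(f"{name} differs")
    remote = client.get("remote", [])
    client_port = remote[1] if len(remote) > 1 else "1194"
    if server.get("port", ["1194"])[0] != client_port:
        problems.append("port differs")
    if ("comp-lzo" in server) != ("comp-lzo" in client):
        problems.append("comp-lzo differs")
    if (tls_auth_direction(server), tls_auth_direction(client)) != ("0", "1"):
        problems.append("tls-auth direction should be 0 on server, 1 on client")
    return problems

# Shortened versions of the two configs from this post.
SERVER_CONF = """\
port 443
proto tcp
dev tun
tls-auth /etc/openvpn/easy-rsa/keys/ta.key 0 # This file is secret
comp-lzo
"""
CLIENT_CONF = """\
client
dev tun
proto tcp
remote xxx.xxx.xxx.xxx 443
tls-auth /etc/openvpn/ta.key 1
comp-lzo
"""
print(mismatches(parse_conf(SERVER_CONF), parse_conf(CLIENT_CONF)))   # []
```

An empty list means the two files agree on the checked directives; anything else tells you what to fix first.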

Now you only need to copy the required certificates and keys to the client (into /etc/openvpn): client1.crt, client1.key, ca.crt, and ta.key. Do not copy the other, server-specific private keys and such to the client(s)! Also, the root CA key (ca.key) should not even be left on the server, but rather moved to some offline storage/box, so that it cannot fall into the wrong hands, e.g. in the case of a server compromise.

I prefer to manually start the client on my laptop when needed, so I use AUTOSTART="none" in /etc/default/openvpn and then start the client via:

  $ openvpn /etc/openvpn/client.conf

That's it. Comments and suggestions for improving the setup and/or the security aspects of it are highly welcome!
