Heckroth Industries

Ramblings

m3u Tools

9 years ago I purchased a cheap MP3 player that I used most days. When I first got it I thought that I’d be doing well to get 4 years out of it. Well, 4 years later the SD card capacity to price ratio had improved enough that I could start using my FLAC files directly rather than having to convert them to MP3. Or at least I could if my MP3 player supported FLAC files, which of course it didn’t.

At this point I was starting to think that I was going to have to keep on converting my files to MP3 to use them in my player, or get a new MP3 player, but then I discovered RockBox, which, once installed on my MP3 player, would enable it to play FLAC files.

Well, a week ago my old player stopped working, which means that thanks to RockBox it had lasted 5 years longer than I thought it would. Of course MP3 players would surely have improved over the last decade, right… Turns out the answer is “slightly”. My new player does support FLAC files, which is good, but other than that nothing else seems to have changed.

I also hadn’t realised just how much I’d gotten used to RockBox’s audio interface, where it would read out the menu options to you. When using my old player I rarely looked at the screen, and given that I normally use my player when I’m out walking, keeping my eyes on where I’m heading rather than on a little screen is very helpful.

Unfortunately the new player isn’t supported by RockBox (yet?). So I had to come up with a new way to operate that minimised the time I had to look at its screen. What I really wanted was a simple option that would play my albums in alphabetical order. Of course a simple Play All Albums option wasn’t something that the new player had, but it does support playlists (then again so did my old player, I just didn’t use it as RockBox’s audio interface worked so well for me).

The new player’s playlists are M3U files, which really are just text files containing an ordered list of paths to the files to be played. The player’s manual says that you need to put the playlist in the same directory as the files to be played, but a quick Google search confirmed that you don’t need to: as long as the paths in the playlist are relative, you can put your playlist in the player’s Playlists directory and point to the FLAC files in the Music directory structure - e.g.

../Music/Artist_1/Album_1/Track_1
../Music/Artist_1/Album_1/Track_2
../Music/Artist_1/Album_1/Track_3
../Music/Artist_1/Album_1/Track_4
../Music/Artist_1/Album_2/Track_1
../Music/Artist_1/Album_2/Track_2
../Music/Artist_1/Album_2/Track_3
../Music/Artist_1/Album_2/Track_4

So all I needed to do was to create a simple text file listing my FLAC files in album order. Luckily I store my music files in the following directory structure:

<Artist>/<Album Title>/<Track Number> - <Track Title>

It’s a relatively sensible ordering and it has the unplanned benefit that a Perl script can build my playlist file just by walking the directory structure.
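
The basic idea is something along these lines (a stripped-down sketch rather than the real script, run from the player’s root, with the playlist name made up for the example):

use strict;
use warnings;
use File::Find;

# Collect every FLAC file under the Music directory, warning about any
# paths that contain non-ASCII characters (the player ignores those entries).
my @tracks;
find(
    sub {
        return unless /\.flac$/i;
        warn "Non-ASCII path: $File::Find::name\n"
            if $File::Find::name =~ /[^\x00-\x7F]/;
        push @tracks, $File::Find::name;
    },
    'Music'
);

# Write the playlist with paths relative to the Playlists directory,
# sorted so the albums come out in alphabetical order.
open my $fh, '>', 'Playlists/All_Albums.m3u' or die $!;
print {$fh} "../$_\n" for sort @tracks;
close $fh;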

The resulting Perl script is available in the m3u-tools repository on GitLab.

The only major issue I encountered was with non-ASCII characters in the path, which resulted in the new player ignoring that entry. For this issue I took the easy path and simply had the script warn me about the file so I could go and fix up the file path so that it just contains ASCII characters.

Now all I have to do is to remember to rebuild the playlist when I put new music on the player…

Jason — 2021-08-17

Advent of Code 2020 - Day 1

So, it’s that time of year again when I spend a bit of time tackling some of the Advent Of Code challenges. As a bit of an exercise I’ve decided to write down an explanation of how I tackled Day 1.

Part 1

Each Day’s challenge consists of two parts and the simplified explanation for the first part for Day 1’s challenge is:

Given a list of numbers, find two numbers that add up to 2020. Take those two numbers and multiply them together to get your solution.

Multiplication is trivial, getting the two numbers that add up to 2020 is the part of the challenge that you have to think a little bit about. So let’s look at how I tackled that.

So we need a function that, given a list, finds the two entries that add up to 2020 - let’s call it findTwoEntriesThatSumTo. The first parameter can be the number we want the two list entries to add up to, and the rest of the parameters can be the list.

Now we know what our function is going to be called and what parameters it takes, we can write a test:

use Test::More;
use Test::Deep;

use Day_01 qw( findTwoEntriesThatSumTo );

my @testExpenses = ( 1721, 979, 366, 299, 675, 1456 );

subtest 'Part 1' => sub {
    cmp_deeply(
        [ findTwoEntriesThatSumTo(2020, @testExpenses ) ],
        bag( 1721, 299 ),
        "Should find the two entries that sum to 2020"
    );
};

We’re comparing against a bag as the order that the two entries are returned doesn’t matter.

Now that we have a failing test we can write the code to make it pass. There’s a lot of different ways to tackle the problem, but the one I ended up with is:

sub findTwoEntriesThatSumTo {
    my ( $value, @list ) = @_;

    while ( my $a = shift @list ) {
        foreach my $b ( @list ) {
            return ( $a, $b ) if $a + $b == $value;
        }
    }

    return;
}

The first line defines the function, and the second extracts the parameters into variables (Perl passes parameters into a function via the special @_ array).

In Perl shift removes the first item from an array and returns it. If the list is empty then it’ll return undef, and when the condition of a while loop evaluates to something false (which undef is) the loop is done.

Inside the while loop we have a foreach loop that loops through the entries left in our array.

Inside that foreach loop we’ll return the two values if they add up to the value we’re looking for, otherwise the foreach loop will move onto the next value in the list.

If the foreach loop completes without finding a suitable value then control is returned to the while loop which takes the next value off the start of the list and runs through the foreach loop again to see if any other number in the list will add up with it to the value we’re looking for. (Note that each time round the while loop the list is getting shorter).

If there isn’t a pair of numbers in the list that add up to 2020 then eventually the list will be empty and the while loop will hand control to the next statement after it, which in this case is a simple return. That return at the end of the function makes sure that we’ll get an undef back from the function if it doesn’t find a suitable answer.

Once that was written, the bugs fixed and the test passing, I put together a simple script that used it to find the answer to the first part of the Day 1 challenge.
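
The script itself is nothing fancy; a minimal sketch along these lines would do it, assuming the puzzle input is saved as one number per line in a file called input.txt (the filename is just for the example):

use strict;
use warnings;
use Day_01 qw( findTwoEntriesThatSumTo );

# Read the expense report, one number per line.
open my $fh, '<', 'input.txt' or die "Can't open input.txt: $!";
chomp( my @expenses = <$fh> );
close $fh;

# Find the pair that sums to 2020 and multiply them for the answer.
my ( $first, $second ) = findTwoEntriesThatSumTo( 2020, @expenses );
print "Part 1: ", $first * $second, "\n";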

Part 2

Part 2’s challenge is almost identical to Part 1 except this time we’re looking for three numbers that add up to 2020. So let’s create a new function called findThreeEntriesThatSumTo that takes the same parameters as our function that solves Part 1. As before we’ll create a test so that we know what we’re trying to achieve:

subtest 'Part 2' => sub {
    cmp_deeply(
        [ findThreeEntriesThatSumTo( 2020, @testExpenses ) ],
        bag( 979, 366, 675 ),
        "Should find the three entries that sum to 2020"
    );
};

With a failing test we’re in a position to write our function to solve the problem. There’s a lot of ways to view the new challenge, but they’re all going to involve looping through our list of numbers trying to find two others that, when added to it, give us 2020. Luckily for us we already have a function that, given a target value and a list of numbers, will try to find two that add up to that target value. So let’s use it in a while loop similar to the one we used in Part 1:

sub findThreeEntriesThatSumTo {
    my ( $value, @list ) = @_;

    while ( my $a = shift @list ) {
        my ( $b, $c ) = findTwoEntriesThatSumTo( $value - $a, @list );
        return ( $a, $b, $c ) if defined $b;
    }

    return;
}

The while loop functions in exactly the same way as it does in our solution to Part 1, shifting the first entry out of the list each time round.

Each iteration of the while loop starts by calling the findTwoEntriesThatSumTo function and storing the result in the variables $b and $c. The first parameter passed to findTwoEntriesThatSumTo is the result of subtracting the number we shifted off the start of our list ($a) from our target value ($value), the rest of the parameters we pass are what remains of our list.

If our findTwoEntriesThatSumTo function doesn’t find two values then it returns undef and $b and $c will both be undefined. We can use this as a check on the next line to see if we’ve found our answer or if we need to carry on to the next time round the loop. If $b is defined then we’ve found an answer and can return the three values ($a, $b and $c). However, if $b isn’t defined then the while loop will move on and try the next number in our list.

If by some chance it doesn’t find three numbers in the list that add up to our target value then the while loop will finish and the return at the end of the function will return undef to the caller.

Once the function was working I updated my script from Part 1 to also call our new function to solve Part 2.

Jason — 2020-12-04

Picking up DBD::Mock

A few months ago I started using the DBD::Mock Perl module as part of some unit tests for a project I was working on. It was pretty simple to pick up and use, but I found that there was a feature missing that would make it easier for me to use. As it’s open source I was able to dig into the module’s code and figure out how to add the new functionality. The internals of the module are logically structured so it only took about an hour to prepare a patch, but when I tried to submit the patches back to the source I discovered that the module was no longer being actively maintained. This discovery triggered a chain of events which resulted in me taking on a maintainer role for the module.

As the new maintainer, the first task I had to undertake was to get the codebase into a repository that I controlled. This involved cloning the old GitHub repository with git’s --bare option and then using the --mirror option to push it up to the new GitLab repository.

Once that was done I needed to build a development environment around it, starting with migrating the build process to be consistent with my other CPAN modules (i.e. get it set up with Minilla).

Migrating the build process to using Minilla left one last step to do before the development environment was ready, Continuous Integration (CI). In GitLab the CI logic is controlled by the .gitlab-ci.yml file, and I didn’t need anything complicated, so I first went with:

image: perl:latest

before_script:
  - cpanm Minilla

stages:
  - test

unitTests:
    stage: test
    script:
      - minil test

Quick explanation of this .gitlab-ci.yml file:

  • image tells GitLab’s CI which Docker image to use (in this case the latest Perl image from Docker Hub)

  • before_script sets a series of commands to prepend to each job’s script

  • stages lists the stages in our CI pipeline (in this case just the test stage)

  • unitTests is a job with the following properties:

    • stage names the stage this job is part of (this one’s part of the test stage)
    • script lists the commands that get run to perform the job

Now I had a development environment ready, I could get started with figuring out what to tackle for my first release. Reviewing the module’s RT queue showed a number of issues that needed investigating and resolving. I decided to keep it simple for my first release and targeted three easy issues:

  • Adding in details about the module’s Git repository

  • Fixing a spelling mistake in the POD

  • Adding in my patches

Once they were done I used Minilla to release a new version (v1.46). A few hours later and the new version was available on CPAN and could be installed in the usual way for CPAN modules.

The next day I got an email from CPAN Testers, a group of people who test CPAN modules against different versions of Perl on different operating systems. The new version was failing on versions of Perl below v5.10.0. Sure enough I’d used a defined-or operator (//), which isn’t available in Perl v5.8.
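
As a made-up illustration (not the actual DBD::Mock code), the two ways of writing the same default-value check look like this:

my %args = ( limit => undef );

# Perl v5.10 and later: use the value if it's defined, otherwise a default.
my $limit = $args{limit} // 10;

# Pre-v5.10 equivalent that works on Perl v5.8.
my $limit_v58 = defined $args{limit} ? $args{limit} : 10;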

The first thing to do was to fix my CI pipeline to make sure that I tested against Perl v5.8 as well as the latest version, so I wouldn’t make this mistake again. After a bit of playing with the .gitlab-ci.yml file, it looked like the following:

stages:
  - test

before_script:
  - cpanm Module::Build::Tiny Test::Pod Test::Pod::Coverage
  - cpanm --installdeps .
  - perl Build.PL
  - perl Build

unitTestsLatest:
    image: perl:latest
    stage: test
    script:
      - perl Build test

unitTestsV5.8:
    image: perl:5.8-threaded
    stage: test
    script:
      - perl Build test

There were three key changes:

  • Removal of Minilla from the build and testing process; the before_script now consists of four commands that install the extra build and test modules, install the dependencies, run Build.PL and use the Build file it produces to build the module ready for testing

  • A new unitTestsV5.8 job for testing against Perl v5.8

  • The image property has moved into the jobs as each job needs to use a different Perl docker image depending on the version being tested

These changes made it a lot easier to extend the versions of Perl being tested against by simply adding a new job (hint: the latest version of DBD::Mock tests against 13 different versions of Perl).

Once the CI was testing against Perl v5.8 as well as the latest, I could actually get around to fixing the bug and preparing the next release (v1.47). As development of the module had carried on in the time before CPAN Testers reported the Perl v5.8 issue, the new release also contained the following changes:

  • Max Carey’s patch from rt86294

  • Addition of a new experimental Connection Callback feature

Over the next month, two additional releases of DBD::Mock were made, which resolved the last of the open issues in its RT queue. I’m now holding off on development for a little while to give time for any bugs to be found and reported.

Jason — 2019-10-09

New site

It’s been a while since I last had any spare time to work on the site, but I’ve finally found some. It’s changed a bit and as usual I’m using it to experiment with different ways of tackling an issue. So what’s changed? Well, I’ve dropped WordPress and moved to a static site. I’ve moved the contents into JSON files which are rendered via a project I’ve been working on, renderJsonAsHtml. I’ve made it available via npm and the source is available on GitLab.

The styling of the site is built on top of another project I’ve finally released, CoopersCoreCSS. This is the core set of CSS that I normally start from when developing a website and build the rest of the site’s CSS on top. I’ve also made this available via GitLab.

Jason — 2017-01-26

Access denied, but I'm the Administrator!

Recently my HTPC, Windows Media Center (WMC) running under Windows 7, started to exhibit an odd issue. One or two of the recordings couldn’t be deleted from within WMC. So I tried falling back to closing down WMC and deleting the files from the Recorded TV directory, only to receive an “Access Denied” error message. Ok, let’s run a shell (cmd) as the administrator user, change to the Recorded TV directory and use del to delete the files. Still I got the “Access Denied” error message.

Looking at the files’ properties showed even odder results: they weren’t owned by anyone. When I tried to change the owner to the administrator account I got the same error message. This was the point that I decided to fall back to basics: if the file system is behaving oddly then check the disk. So I ran

chkdsk c: /R

It replied that it couldn’t check it now but would I like to check it on the next restart, which of course I said yes to, and then I immediately restarted the machine. 20 minutes later, after a full disk check, Windows reappeared and I could delete the files. My current assumption is that Windows had lost some of the NTFS details for the files and that the chkdsk reset them.

Jason — 2015-01-06

Launching screen only if the current terminal isn't already in a screen instance

If you like to use screen then this is a useful piece of code to add to the end of your .bashrc file. It will launch screen, but only if the current terminal isn’t already inside an instance of screen. The great thing about this is that it works across machines, so you can put it in your .bashrc on all the machines you like: when you ssh into the first one you get a new screen session, but ssh into a second machine from within that screen session and the second machine won’t launch another session of screen.

command_exists () {
    type "$1" &> /dev/null ;
}

# If we are in a ssh tty and not already running screen and screen exists then think about starting it
if [ $SSH_TTY ] && [ ! $WINDOW ] && [ "$TERM" != "screen.linux" ] && command_exists screen; then
    # If we don't have any attached screen sessions run screen and attach to the first detached session
    SCREENLIST=`screen -ls | grep 'Attached'`
    if [ $? -eq "0" ]; then
        echo -e "Screen is already running and attached:\n ${SCREENLIST}"
    else
        screen -U -R
    fi
fi
Jason — 2013-12-03

IPv6 Routing issue

Recently I have been noticing an issue with my IPv6 networking. Everything was working fine locally but if I connected from outside the network then connections would lock up when they got busy.

It took a bit of investigating but I discovered that the IPv6 MTU size wasn’t being detected correctly by some devices. The reason it would work locally was that the local network’s MTU was 1500, but for IPv6 packets leaving the network the routing goes through an IPv4 tunnel, which results in it having a lower MTU size.

The solution was to add a line to the radvd.conf file (used by the daemon that handles the router advertisements). The line that I added was

AdvLinkMTU 1280;

This results in those devices that configure their IPv6 details from the router advertisements setting their MTU to 1280, which is the minimum MTU that IPv6 links are required to support.
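
For reference, the option just sits alongside the other settings in the interface block of radvd.conf; something like this, with the interface name and prefix made up for the example:

interface eth0
{
    AdvSendAdvert on;
    AdvLinkMTU 1280;
    prefix 2001:db8::/64
    {
    };
};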

Jason — 2013-04-24

Extending the number of post parameters supported by Tomcat5 on RedHat Enterprise Linux 5

I have just had to work around a weird limit to the number of post parameters in Tomcat5 on a RedHat Enterprise Linux 5 install. Specifically the error was

org.apache.catalina.connector.Request parseParameters WARNING: Exception thrown whilst processing POSTed parameters

java.lang.IllegalStateException: Parameter count exceeded allowed maximum: 512

A quick Google led me to a solution editing the catalina.properties file, but that can get overwritten by a yum update. The adapted solution is to edit your /etc/sysconfig/tomcat5.conf file and extend the JAVA_OPTS variable to include this parameter change.

-Dorg.apache.tomcat.util.http.Parameters.MAX_COUNT=10000
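
In other words, the JAVA_OPTS line in /etc/sysconfig/tomcat5.conf ends up looking something like this (the -Xmx option is just a stand-in for whatever options you already have set):

JAVA_OPTS="-Xmx256m -Dorg.apache.tomcat.util.http.Parameters.MAX_COUNT=10000"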
Jason — 2012-06-18

Linux or GNU/Linux

Yet again an old argument has reared its head and so I have decided to voice my opinion. The discussion in question is “should it be called Linux or GNU Linux?”

To me the idea that I should call the operating system GNU Linux is wrong. The operating system is Linux, the kernel is Linux. Yes, I run a lot of GNU software on Linux, but I run a lot of GNU software on my Windows machines and no-one tells me I should refer to it as GNU Windows.

I also think that it is detrimental for GNU when people refer to Linux as GNU Linux. It sets up people to only think of GNU software if they are looking for software on Linux, when in fact there is GNU software available for most OS’s.

So in conclusion, it’s going to be called Linux by me for the foreseeable future. (Though it looks like my CMS has decided to call it GNU Linux via its tag ordering.)

Jason — 2012-05-11

Using libcurl with wxDev-C++

Just had to do some C++ development on a Windows 7 machine, so I downloaded and installed wxDev-C++ on my machine, downloaded the required packages for libcurl and added -lcurl to the linker options for the project.

The problem came when I tried to compile the code; I got the following errors

[Linker Error] undefined reference to `_imp__curl_easy_init'
[Linker Error] undefined reference to `_imp__curl_easy_setopt'
[Linker Error] undefined reference to `_imp__curl_easy_perform'
[Linker Error] undefined reference to `_imp__curl_easy_cleanup'
[Linker Error] undefined reference to `_imp__curl_easy_init'
[Linker Error] undefined reference to `_imp__curl_easy_init'
[Linker Error] undefined reference to `_imp__curl_easy_setopt'
[Linker Error] undefined reference to `_imp__curl_easy_setopt'
[Linker Error] undefined reference to `_imp__curl_easy_setopt'

They all looked like I was missing the curl library, but it was in the linker options. After looking through the libraries available I discovered that there was a second libcurl library, libcurldll.a, so I changed the linker options to include

-lcurl -lcurldll

Having added in the extra library I could recompile the code without any problems.

Jason — 2012-01-12

Time for a new battery

Recently my eeePc 900’s battery has been struggling, so I figured it was time to actually test how dead it is. I charged it up fully and ran a script to log the battery’s charge every 15 seconds. I then unplugged the power and left it downloading files over a wireless network. 36 minutes later it ran out of power.

I then plugged in the power turned it back on and once it had booted up I ran the same script to log the battery levels as it charged up. Turns out it took about 90 minutes to charge up again.

As it now only lasts half an hour I decided that a new battery was due. So I replaced it with the same spec battery and reran the tests. With the new battery it can run for 3 hours, though it does still take a long time to charge.

As I had the data to hand I decided to run it through gnuplot to produce a nice graph to visualise the differences.

Jason — 2011-12-13

Please don't truncate error messages

This morning I have spent 10 minutes trying to figure out what an error message was referring to. Cryptic error messages are one thing that you get used to dealing with, but this time it was different. The error message made sense but left you thinking that you had to add some more space to something, without saying what.

It turns out that the error message was from the underlying Oracle database, but that the software running the SQL against Oracle had truncated the error message. After reading through Oracle’s logs I discovered the end of the message, which detailed which tablespace was in need of expanding. A couple of minutes later the tablespace was extended and it all started working.

So if you output error messages from an underlying system, please don’t truncate them.

Jason — 2011-11-28

Augmenting LDAP

Recently I needed to augment an LDAP service so that we could authenticate users against an Active Directory or another internal system. This initially looked like a very time consuming task which would get very messy, until I stumbled across OpenLDAP’s ability to use a Perl module for processing LDAP requests.

The documentation for the actual Perl module side of this is not very good; the best thing to do is to play with the example module to get an understanding of how it all fits together.

Any sticking points? Yes: if you are using Red Hat Enterprise Linux (version 5 in my case) then you have to get the stable version of OpenLDAP and manually compile and install it (remembering to specify --enable-perl when configuring). The version being used by Red Hat doesn’t include the Perl backend part of OpenLDAP, and even if you install their source and try to compile it as a module it fails.

If you require the option to accept a search filter with sAMAccountName in it then you will need to create a schema file alongside the other OpenLDAP schemas containing

attributetype ( 1.2.840.113556.1.4.221
NAME 'sAMAccountName'
EQUALITY caseIgnoreMatch
SYNTAX '1.3.6.1.4.1.1466.115.121.1.15'
SINGLE-VALUE )

and in your slapd.conf add an include line to include the schema file.
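
For example, if the schema snippet above were saved as /etc/openldap/schema/samaccountname.schema (a made-up name), the include line would be:

include /etc/openldap/schema/samaccountname.schema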

The bind method wasn’t present in the Perl example module I started from, and while some people say you shouldn’t do the bind in Perl, I couldn’t get it working without it. Here is an extract of the bind method I used.

sub bind {
    print {*STDERR} "==== bind start ====\n";
    my ($this, $dn, $pass)=@_;
    print {*STDERR} "DN=$dn\n";
    #print {*STDERR} "Pass=$pass\n";

    my $retval=1;

    # Code here to set $retval to 0 if the distinguished name and password are valid
    print {*STDERR} "==== bind end ====\n";
    return $retval;
}

If you don’t require a method then simply return the value 53, which is an “unwilling to perform” error.

twitterMap on Hak5

I had a pleasant surprise this morning when I discovered that twitterMap has appeared on the latest episode (season 9, episode 19) of Hak5.

Other news about twitterMap, I have almost finished version 2. The new version will contain the ability to produce wordlists and also let you map links between followers. The follower mapping won’t be fast as that part of the twitter API is rate limited. My tests so far have shown it to be a “leave running in the background for the rest of the day” tool rather than a “go and have a cup of tea while it finishes running” tool.

To reduce the number of calls I have also added the option to cache the mapping between a twitter account’s number (returned from some of their APIs) and the user’s screen name. This has really helped with the testing and should help reduce the number of rate-limited API calls if you are running multiple maps of the same person or group of people.

Jason — 2011-06-30

comm the opposite of diff

Today I needed to find matching lines in a number of text files. My first thought was, what is the opposite of diff? The answer is comm, which expects its input files to be sorted. To compare two sorted text files and output the lines that appear in both use

comm -1 -2 <file 1> <file 2>

To get matching lines between 4 files I redirected the output to temporary files and then comm’d them.

comm -1 -2 <file 1> <file 2> > tmp1
comm -1 -2 <file 3> <file 4> > tmp2
comm -1 -2 tmp1 tmp2

You can pipe into comm by using ‘-’ instead of a filename so you could also compare 4 files with

comm -1 -2 <file 1> <file 2> | comm -1 -2 - <file 3> | comm -1 -2 - <file 4>
Jason — 2011-06-21

Getting started with PowerShell

I have just started using PowerShell instead of CMD and I can say that it is a big improvement. The first thing I wanted to do though was to edit my profile so that I could tailor it to me.

First I listed my initial requirements

  • Don’t put anything in the prompt except >
  • vi, vim, gvim should all launch gvim for editing files
  • gimp should launch gimp for editing image files
  • <CTRL>+D should close PowerShell down

First I discovered that I would have to change my execution policy for PowerShell. This was a simple case of launching PowerShell as an administrator and entering

Set-ExecutionPolicy RemoteSigned

This lets PowerShell run local scripts but requires remote scripts to have been signed. After doing that I closed down the Administrator PowerShell and opened one as my standard user.

To edit my profile I initially used the command:

notepad $profile

and entered the following

# Functions

# prompt function, redefines what prompt is displayed
function prompt
{
    "> "
}

After closing PowerShell and reopening it I now had a simple ‘>’ as the prompt.

The next step was to put in the aliases for vi, vim, gvim and gimp. So I edited the profile again and entered

# Assign vi, vim, gvim to gvim
Set-Alias vim 'C:\Program Files (x86)\Vim\vim73\gvim.exe'
Set-Alias gvim vim
Set-Alias vi vim

# Assign gimp to gimp
Set-Alias gimp 'C:\Program Files (x86)\GIMP-2.0\bin\gimp-2.6.exe'

After restarting PowerShell again I could now edit my profile by using

vim $profile

Now that I was using vim to edit my profile I could use <CTRL>+V to insert special characters, which would let me put a <CTRL>+D (^D) in my profile, specifically as an alias, though I would still have to press <CTRL>+D followed by <RETURN>. Initially I tried

Set-Alias ^D exit

But that didn’t work. Ironically it wasn’t the <CTRL>+D character causing problems; instead it was the use of exit in an alias. The solution is to wrap exit in a function and call the function from the alias.

# ex function, required to use exit in aliases
function ex
{
    exit
}

# Aliases

# Assign CTRL+D to exit
Set-Alias ^D ex

That so far is my standard PowerShell profile. Here it is in one chunk in case you want to cut and paste (remembering to replace ^D with a proper <CTRL>+D character).

# Functions

# prompt function, redefines what prompt is displayed
function prompt
{
    "> "
}

# ex function, required to use exit in aliases
function ex
{
    exit
}

# Aliases

# Assign CTRL+D to exit
Set-Alias ^D ex

# Assign vi, vim, gvim to gvim
Set-Alias vim 'C:\Program Files (x86)\Vim\vim73\gvim.exe'
Set-Alias gvim vim
Set-Alias vi vim

# Assign gimp to gimp
Set-Alias gimp 'C:\Program Files (x86)\GIMP-2.0\bin\gimp-2.6.exe'
Jason — 2011-06-03

Getting DSpace's handle server to run over IPv4 and IPv6

I have spent this morning figuring out how to get a DSpace Handle Server to accept connections over both IPv4 and IPv6. Previously it was working over IPv4 only. The usual Google hunt didn’t turn up any advice, though it did turn up the exact opposite of what I wanted (how to get it to only run on IPv4).

In the end the solution turned out to be very simple, just not documented. In the handle server’s configuration file, change the bind_address entries from the IPv4 address to :: and it will start listening on the relevant ports on all interfaces (IPv4 and IPv6).

Jason — 2011-06-02

How tough is a flash drive?

Back when I first started using computers (Oric Atmos and ZX Spectrum) everything was stored on cassette tapes (C15, C60, C90, etc). They weren’t the most reliable way to store data and there was a definite black art involved in retrieving data stored on them, though a volume of 7 is a good place to start.

When I started using computers at school (BBC Micros) their data was stored on 5.25 inch floppy disks and later on a network drive (50 machines all sharing a 10MB hard disk). They were easier to retrieve data from, though don’t leave the 5.25 floppy disks in the sun for too long or they wouldn’t stay flat.

Later on in life I started using PCs (286, 386, 486 and Pentiums) at school/college which used a 3.5 inch floppy disk to store 720KB, which later on increased to 1.44MB. At home I moved onto an Amiga 500 and later an Amiga 1200 (A1200), both of which used 3.5 inch floppy disks (880KB), and a 120MB hard disk in the case of the A1200. These were again more reliable and I have actually still got a large collection of 3.5 inch disks to use with my A1200.

From then on I used 3.5 inch floppy disks until USB flash drives came along. OK, sometimes I would use CDs but not for storing files I was working on, just for archiving data onto.

So why am I talking about the history of my data storage? Well, I haven’t had any storage medium in the past that would have survived what I put one of my flash drives through the other day. It was an old one that I use to write ISOs to for booting rather than writing a CD, just 1GB of storage. I had put MemTest86+ on it to test the memory of my new media center PC. Needless to say that afterwards it ended up in my trouser pocket and the next night I put those trousers into the washing machine and gave them a hot wash. When I was taking out the washing there was my flash drive sat in the drum, having been washed and spun.

Personally I had at that point written the drive off; I know they are tough but I didn’t expect it to survive that. I let it dry, then plugged it into my computer and rebooted. The boot menu had the flash drive as an option so I selected it. MemTest86+ started up and started running. It looks like that flash drive is a lot tougher than I thought.

Jason — 2011-05-11

UTF-8 and CSVs

Recently I have had to produce some CSVs using UTF-8 character encoding. The UTF-8 encoding is easy to do; you just need to remember to set the header charset to be utf-8 when printing the CGI header.

use CGI;
my $CGI = CGI->new;
print $CGI->header( -type => 'text/csv', -charset => 'utf-8', -attachment => $filename );

Then you have to print the Byte Order Mark (BOM), which in hex is FEFF, as the very first thing so that Excel will recognise the CSV as being in UTF-8 and not in its default character set.

print "\x{FEFF}";
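
One thing worth adding (this isn’t part of the original snippet): to get the wide character written out as UTF-8 bytes without a “Wide character in print” warning, put a UTF-8 encoding layer on the output handle first:

# Make sure wide characters (including the BOM) are encoded as UTF-8 on output.
binmode STDOUT, ':encoding(UTF-8)';
print "\x{FEFF}";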

Interestingly, from what I can tell this BOM is actually the UTF-16 one; the BOM for UTF-8 should be 0xEFBBBF, but that didn’t seem to work with Excel.

Note: Usually the BOM is not recommended for UTF-8 as it can cause problems, but in the case of CSVs that you want to open in Excel it is required.

Jason — 2011-04-13

Authenticating Apache against an Active Directory with multiple top level OUs containing users

Wow, that was a long title, but it is just what I have been dealing with recently. The first solution, which is easy, is to use Kerberos; this works great unless you also want authentication to fall back to a standard .htpasswd file. In that case you need to use LDAP. Why? Because LDAP and File use the same AuthType of Basic, whereas Kerberos uses an AuthType of Kerberos. Using LDAP and File authentication you can use a config like this

<Directory /var/www/html/private>
    SSLRequireSSL
    AuthName "Private"
    AuthType Basic
    AuthBasicProvider ldap file

    # File Auth
    AuthUserFile /var/www/.htpasswd

    AuthLDAPURL "ldap://ADServer.domain.co.uk/ou=Users,dc=domain,dc=co,dc=uk?sAMAccountName?sub?(objectClass=*)"
    AuthLDAPBindDN User@Domain.co.uk
    AuthLDAPBindPassword XXXXXXX

    AuthzLDAPAuthoritative off
    Require valid-user
    Satisfy any
</Directory>

This works unless you have your users split over a number of OUs in the Active Directory. If that is the case here is the way I got around it.

<AuthnProviderAlias ldap ldap-group1>
    AuthLDAPURL "ldap://ADServer.domain.co.uk/ou=Group-OU1,dc=domain,dc=co,dc=uk?sAMAccountName?sub?(objectClass=*)"
    AuthLDAPBindDN User@Domain.co.uk
    AuthLDAPBindPassword XXXXXXX
</AuthnProviderAlias>

<AuthnProviderAlias ldap ldap-group2>
    AuthLDAPURL "ldap://ADServer.domain.co.uk/ou=Group-OU2,dc=domain,dc=co,dc=uk?sAMAccountName?sub?(objectClass=*)"
    AuthLDAPBindDN User@Domain.co.uk
    AuthLDAPBindPassword XXXXXXX
</AuthnProviderAlias>

<Directory /var/www/html/private>
    SSLRequireSSL
    AuthName "Private"
    AuthType Basic
    AuthBasicProvider ldap-group1 ldap-group2 file

    # File Auth
    AuthUserFile /var/www/.htpasswd

    AuthzLDAPAuthoritative off
    Require valid-user
    Satisfy any
</Directory>
Jason — 2011-02-28

Comparison of ext2,3,4 and ntfs on usb flash drive

There is a topic over on the Hak5 forums asking which filesystem format is best for use on a USB flash drive. I figured that I would run some basic tests and see if my results match up with other tests on the internet.

I decided to test 3 different types of operation: reading, writing and deleting. Each test on each file system was performed 10 times, with the results averaged and then plotted onto graphs.

Each test used 11 different file sizes (1KB, 512KB, 1MB, 2MB, 4MB, 8MB, 16MB, 32MB, 64MB, 128MB, 256MB).

The filesystems were all tested on a 4GB Dane-elec flash drive over USB2.0 on my eeePC 900 running Linux.

Reading

Before performing the reading test the disks were synced and the file cache dropped, to try to get a more accurate measure of the filesystem instead of the file cache.
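
For anyone wanting to reproduce this, the sync and cache drop can be done with the usual commands, something like the following (run as root):

sync
echo 3 > /proc/sys/vm/drop_caches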

The results from the read performance tests showed ext2 and ext4 performed the best overall, with ext2 having a slightly better performance than ext4 on the larger files.

An interesting result is the way that ext3 really seemed to struggle with reading the large files. The performance from ntfs was slower than ext2 and ext4.

Writing

The writing performance tests showed again that ext3 seemed to struggle as the file size increased. A bigger gap is also shown between ext2 and ext4 with ext4 standing out as better with the larger file sizes.

Interestingly once the file sizes get beyond 32MB ntfs stands out as the best performer for writing.

Deleting

ext3 and ext4 perform the worst at deleting the larger files while ext2 performs the best. Again ntfs is worse than the other file systems on smaller files but performs better than ext3 and ext4 on the larger files.

Conclusion

Based upon these results I would recommend ext4 as it does a good job with reading and writing, and while slower than the others at deleting larger files it is still capable of deleting a 256MB file in less than an eighth of a second.

It would be interesting to run these tests on existing file systems which have been used a lot to see if there is a difference after files have been added and removed repeatedly (which is quite common with USB flash drives).

Of course if you are going to be using the drive on a windows machine then ntfs would make much more sense.

Jason — 2011-01-12

Why I use cat even though there are more efficient methods

When using a series of commands tied together with pipes I usually start with the cat command. A lot of times when I post a one liner solution on a forum someone will reply that there was no point in starting with cat as it is inefficient. So I decided to put a quick post about why I use cat rather than one of the other methods.

The main reason that I use cat at the start of most strings of pipes is that it is easier to maintain. The logical flow of the data is going from left to right and the file that goes into the pipe is easy to spot, e.g.

cat /etc/passwd | grep bash | grep -v :x:

We can see here that /etc/passwd gets pushed through grep first to find those lines containing bash. Then those lines are pushed through grep again looking for lines that don’t contain :x: (i.e. non shadowed passwords). This could have been written in a number of different ways.

grep bash /etc/passwd | grep -v :x:
</etc/passwd | grep bash | grep -v :x:

In these examples the first way would be reasonable, but the original file at the start of the pipe is a little hidden, tucked away in the first grep command. The second way puts the original file at the start and is very clear, but a typo of > instead of < will destroy the file I am really wanting to read from.

So yes, there are more efficient ways to start off a string of pipes, but I like to use cat as it makes things a bit more obvious than some and less prone to destroying data with a simple typo than others.

Jason — 2011-01-10

gzip v bzip2

I have recently been looking at revamping our backup setup and I had to make a decision on the compression method to be used. Should I be using tar with gzip, bzip2 or a combination of the two? They were the only two real contenders, mainly due to being stable and supported as standard in tar. The last thing I want to do with backups is to use an exotic compression method, as I want to be sure I will be able to restore the backups.
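
Both are just a flag away in tar, which is part of why they were the only real contenders; for example (paths made up):

tar -czf backup.tar.gz /path/to/data     # gzip
tar -cjf backup.tar.bz2 /path/to/data    # bzip2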

So the first thing I did on a mixture of servers was to time how long it took tar to create the compressed tarball with each tool. The results showed that for our data bzip2 was considerably slower at compressing the tar than gzip. Looking at the size of the final tarballs also showed that bzip2 produced smaller tarballs. So do I want faster generation of the tarballs or smaller resulting tarballs?

The final solution I decided on is to use a mixture of both gzip and bzip2. If it is a small quantity of data then bzip2 is used, as the time difference to produce a small tarball is negligible. For the backup of large sets of data gzip is used, as bzip2 takes a lot longer to compress it than the time that would be saved pushing the smaller bzip2 tarball across the network to the server responsible for writing the backups to tape.

Jason — 2011-01-07

Making my SoSmart work the way I want it to

Last year I purchased a Dane-Elec SoSmart media player. It has a 1TB disc in it and numerous outputs, including HDMI. As a media player I can’t really fault it. The only problem that I had was that it uses the NDAS protocol to share the disc; it doesn’t use SMB or any other sane protocol.

For those that don’t know what NDAS is, and I didn’t until I got this, rather than using a protocol to share/transfer files it uses the NDAS protocol to share the physical disc. There are drivers for Linux as well as Windows, but the speeds I got were terrible. Was there a solution?

Kind of: Sir Dekonass had produced a custom firmware build that would let you access the device using both telnet and ftp. Not exactly secure, but on my little home network not much of a problem. Great, except I didn’t have write access to the disc, only read access. I telnetted in; root didn’t have a password and I couldn’t set one either.

I then tried to unmount the drive and remount it, but it came back as read only again. I assumed this was due to it being ntfs and the driver being used. I connected to the device using a USB cable and I found that I could use fdisk to recreate the partition on the disk as a Linux partition and then use mkfs.ext3 to create a new file system on this. Restarting it I then discovered that it had mounted it as read only again even though it was now an ext3 partition. I tried unmounting it and mounting it again and this time I could write to the disc. Great except I had to log in each time I turned it on and remount the disk, which I really didn’t want to do.

It took a while of digging through the start up scripts on the SoSmart but I discovered that in the startup sequence it looks at each disc it has mounted for a file called mvix_init_script and, if it exists, executes it. I knew I couldn’t easily put it on the disc in the machine as I wanted to unmount it. To get around this problem I got an old USB flash drive and repartitioned it into two new partitions and formatted them both as ext3.

On the first partition I created my custom mvix_init_script. On the second I copied the contents of the /etc directory. Inside my mvix_init_script I effectively have 3 sections.

The first unmounts the second partition on the USB and remounts it over /etc. This lets me add passwords and new users to the setup, so I now only have a passwordless root account for a very short period on startup.

The second section unmounts the internal hard disk and remounts it, making sure it is writeable. This lets me use ftp to transfer files over to the disk, which is a lot faster and easier than maintaining the NDAS setup.

The third section kills the shttpd processes and restarts them with a document root of just the internal hard disk. This stops people viewing things like /etc/passwd with it, which really wouldn’t be a good thing.

Is this a perfect setup? Not really I would like to have SSH on it rather than telnet but then copying files over SSH would take longer than the simple ftp method.

It turns out, after googling “mvix_init_script”, that the system is actually the Mvix one built for the SoSmart and a lot of good information is available at http://mvixcommunity.com/.

Jason — 2010-12-06

What language should I learn first

This is a question that I have seen asked on many forums and here is the reply I usually give.

Just pick one and start using it

When first learning to program, people these days seem to get wrapped up in learning the “best language”. This makes sense to non-programmers because they don’t want to waste their time learning one language only to throw it away and start again with the best one later.

The problem is though that there isn’t a best language, all programming languages are better than others in some area (with the exception of Java). I wouldn’t consider using assembler for processing web pages or text files but I would use Perl. I wouldn’t use Perl to write a piece of code that needed to run as fast as possible. I wouldn’t write an operating system in BASIC, but I would write one in C. I wouldn’t try to teach a complete beginner with C but I would teach them programming with BASIC or Python or another scripting language.

The key thing to learning a programming language is not what language that you are learning but that you are learning. The time that people waste asking on forums for advice on what language they should learn first would have been much better spent just picking one of the ones they have listed and trying to learn it.

What beginners aren’t told

What a lot of beginners are never told is that there are a number of skills that you need to learn to program. The first is logic; without an understanding of boolean logic you won’t be able to do much within a program.

The second is algorithm design; it doesn’t matter what language you are using, there will be certain styles of algorithms that you will use, be it sorting items in an array or drawing an animation on the screen. If you get the wrong algorithm to solve the problem then you end up with a program that doesn’t scale; get the right algorithm and your small computer will be capable of dealing with data sets you never imagined when you wrote the code.

So to sum up my suggestion, don’t ask “what language should I learn” just pick a common one and start learning. Concentrate on the algorithms you are learning rather than the language, after all the algorithms you learn will usually be relevant to every other language that you learn in the future.

Jason — 2010-11-24

Quick way to use tcpdump to grab packets

Well, every now and again I need to grab packets going to and from a specific port on a server machine. The client side isn’t a problem as I have Wireshark installed on my workstation, but when dealing with server to server communications I prefer to use the command line. Most servers I deal with don’t have X-windows installed as there is no need to waste the resources on it.

Of course all Unix-style OSs usually have a program called tcpdump which can be used to collect the packets we are interested in. Being a very powerful tool, the man page can be a bit long and it can be hard to get started with it.

Usually I want to grab the packets going to or from a specific port, e.g. port 25 for SMTP or 80 for HTTP. Here are two examples that show how easy it is to use tcpdump

tcpdump -w /tmp/smtp.pcap -s 1500 -i lo 'tcp port 25'

This example will grab all tcp packets going to or from port 25 on the localhost interface (127.0.0.1) and put them in the /tmp/smtp.pcap file. This file is in the pcap format so you can copy it to your workstation and dig into it with Wireshark. The -s parameter specifies how much of the packet to grab; usually 1500 will be enough to get the whole packet, but you may need to increase this if you find packets getting truncated.

tcpdump -w /tmp/http.pcap -s 1500 -i eth0 'tcp port 80'

This example will grab all tcp packets going to or from port 80 on the eth0 interface and put them in the /tmp/http.pcap file. Again we want all the packets so we use a size of 1500.

Jason — 2010-11-17

Apache rewrite rules and HTTPS

Well, after 30 minutes of thinking I was going insane, I have discovered that Apache will assume that redirects are for the http version of a site unless they are also stated in the SSL configuration for the site; this is because the SSL part of the site is usually a virtual host with its own configuration.

The solution, to help me keep my sanity, was to create a separate rewrite.conf file in the conf.d directory (which is automatically included in the default httpd.conf in Red Hat Enterprise Linux). This file can then also be included in the https configuration, i.e.

include conf.d/rewrite.conf
Jason — 2010-11-03

Changing the hostname on Red Hat Enterprise Linux

So I have to change the hostname on a server running Red Hat Enterprise Linux (v5). As I have to do this quite often I figured that I would note down where I need to change it.

  • Update the HOSTNAME entry in /etc/sysconfig/network

  • Update the entry in /etc/hosts to have the correct new hostnames both with domains and without.

  • Update the list of hostnames that sendmail will accept emails for in /etc/mail/local-host-names.

  • Run hostname <NEW_HOSTNAME>. This will update the hostname there and then, and avoids the need to restart the server to pick up the new hostname.

  • Finally, if your server has DNS entries pointing to it update these to include the new hostname.

Jason — 2010-11-01

Fixing my eeepc 900's keyboard

Today I started up my eeepc 900 and it just didn’t want to work. I managed to work out that the control (CTRL) key seemed to be permanently pressed, even though the key wasn’t stuck down. No matter how much I poked and prodded the key it just didn’t make a difference. So the next port of call was removing the keyboard and trying to clean it, as I didn’t want to spend the money on a replacement.

Previous laptop keyboards I have removed have been a real faff, with some requiring me to take the entire laptop apart. With the eeepc 900 though, I just had to use a terminal screwdriver to push in 3 springy clips on the side of the keyboard by the screen, and then I could lift the keyboard up and slide it out. There is a connector cable near the touch pad that needs to be unplugged but there is enough cable to easily get at it.

Having removed the keyboard I then attacked it with a can of compressed air. Then I plugged the cable back in and slid the keyboard back into place and pushed it gently down and it just clicked into place again.

It is little things like this that really make me impressed by the eeepc 900 and why I don’t think I will be changing it for a long time to come. It isn’t as fast as a brand new netbook, but then running Debian with a custom build of Enlightenment v17 means that it is still fast enough for what I do with it. The screen isn’t as big as some new ones, but then that does increase its portability. It isn’t as nice looking as new ones, but as I just throw it in my backpack then rugged is probably better for me than stylish.

Jason — 2010-10-06

Bl***y Java

Warning: this is a rant, so if you’re a Java zealot it might wind you up. Also I am not looking at this from the point of view of someone who exclusively uses/develops Java. I am looking at it from my point of view, someone who has to maintain/develop multiple systems using multiple languages.

I am currently having to get my head round using Tomcat and Java for a web application I have to support. Why does everything with Java require its own tool when there are plenty of tools that already do the same job? First I have to use Maven to get the package as well as build it, then I have to use Ant to install it. Why not just use Make for building and installing, and CVS, Subversion, Git, etc. for getting the package? I have read that it is faster to use Maven and Ant for building your Java apps, but that isn’t a good reason to have to do it differently; it is a good reason to write a better Java compiler that will run just as quick when used in Maven, Ant, Make or even called manually from the command line.

Then you need Tomcat to serve the Java web app. Why not simply work with the CGI standards used by other web servers (Apache, lighttpd, even IIS)? Of course you need to put Apache in front of Tomcat to get it running on a standard port. OK, you can use IPTables to redirect ports 80 and 443 to Tomcat’s, but then you lose the ability to do things sensibly through Apache, like add in a few CGI scripts or do redirects in the same way as every other web server you are using.

Jason — 2010-09-02

Crontab oddity

I have just managed to figure out an oddity with crontab on a newish Red Hat Enterprise Linux machine. Whenever I typed in crontab -e to edit the crontab it would open up an empty file rather than the existing crontab. It turns out I needed to add an EDITOR environment variable in my bash profile.

export EDITOR=/usr/bin/vim

Once the EDITOR environment variable was there everything worked. The annoying thing was that you don’t know there is a problem till the second time you edit your crontab, at which point you instinctively close down the editor and crontab writes the empty file over your existing crontab, deleting the one you already had.

Jason — 2010-07-28

Always remember the block size option on dd

I am currently having to copy one 50GB partition over another 50GB partition. dd is great for this but the first time I did this I forgot to set the block size so dd defaulted to 512 bytes. An hour and 10 minutes later dd finished. The second time I remembered to set it to 20MB (just add bs=20M as one of the parameters for dd) and it took just under 9 minutes. Just to see what difference it makes I ran it a third time with a block size of 40MB and it took just over 7 minutes.
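
For the record, the command ends up looking something like this (device names made up; double-check if= and of= before running anything like it):

dd if=/dev/sda5 of=/dev/sdb5 bs=20M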

In the future I really should try harder to remember to set the block size when using dd.

Jason — 2010-07-27

JavaScript classes

There are so many ways to create a class in JavaScript that I have spent a few days reading up on them to figure out the best way to implement one. What I wanted was the ability to create a class that can easily be reused, not a single instance of a class like lots of tutorials show.

I also wanted to make it easier to have each class defined in a separate file, at least in the development stage. These files could be concatenated for the production instance of a website if performance is required.

The method that I favour at the minute, as it meets the above requirements, is

function EXAMPLE_CLASS() {
    this.variable = 10;

    function popup() {
        alert(this.variable);
    };

    EXAMPLE_CLASS.prototype.popup = popup;
}

To create an instance of this class use

var classInstance=new EXAMPLE_CLASS();

Defining classes like this makes it nice and easy to create multiple instances of the same class as well as helping keeping your code nice and clean.

The popup function is defined within the class to keep it out of the global namespace; after all, a lot of classes will want to share methods with the same name, and while it is possible to point a class’s method to any function name, it makes it a bit simpler to find the function that relates to a method if it has the same name.

The prototype line in the class definition function means that rather than each class instance having a separate popup function, one function will be created by the JavaScript interpreter and used for every instance of the class. This is generally a good thing to do unless you have a very good reason why each instance of a class should have its own functions rather than sharing them.

Jason — 2010-06-29

Wordpress 3.0 upgrade

Well, I have just run the latest upgrade for WordPress to upgrade the site to version 3.0. I have to admit that the WordPress upgrade process is one of the easiest updates I have to run.

Even on this site, which isn’t set up in a way that lets the automatic update work, all I have to do is upload the latest files to the server and then log in to the site; it automatically detects that the database needs updating and updates it. My custom crafted theme still works with version 3.0 and there haven’t been any problems, so far that is.

Why can’t all upgrades run this smoothly?

Jason — 2010-06-29

RedHat Enterprise Linux YUM update glitch

Well, I have just figured out why some of the machines that I use had stopped picking up updates. When I looked at the list of systems on the RedHat Network they had a list of updates that they hadn’t picked up, but when I logged into the machines and ran

yum update

it said there were no updates. After trying a lot of things on one of the systems that I was free to test with I still couldn’t find out what was going on. Eventually I discovered that yum had corrupted its cache, and so it thought that its list of packages was up to date when it wasn’t. The solution was quite easy after that

yum clean all

yum update

I know, I should have tried that at the start.

Jason — 2010-06-18

Awkward first post

This is the awkward first post that all bloggers have to get over. Why have I started blogging now? I have reached the conclusion that there are so many little things that I find, don’t need for 6 months, then forget where I read about them. A blog seems as good a place as any to store this sort of information and it may be useful for other people as well.

The first problem I had was that the WordPress theme I had created didn’t seem to work well with a sidebar. It hadn’t worried me before as I didn’t use one, but now I am blogging I thought it would be a good idea to have one. Everything I tried had the sidebar appearing either above or below the posts. Things that had worked on hundreds of pages in the past just wouldn’t work with my custom theme.

Finally in an act of desperation I set the widths of the sidebar and the main content to percentages rather than pixels and it just worked. Why the floating wouldn’t work with the sizes in pixels I don’t know. Just one of the many joys of developing things for the web.

Jason — 2010-06-03