Tuesday, July 04, 2006

PHP, Smalltalk and Java... sorted out

The other day I needed to sort something... doesn't matter what, but it consisted of a collection of key/value pairs... The keys were strings and the values were integer strings (strings whose values were actually integers). The code happened to be in PHP and I had been outputting unsorted results. This started to bother me so I thought "how hard could it be to sort" ?

So, googled the API... (and found it's call sort ... duh!) and inserted a sort call into my code ...

sort ($table);

but nope, didn't work. It sorted by values but it didn't keep the keys like I needed but looky here, PHP has a rich library and a bunch of friendly sort routines including asort, which does exactly what I wanted - it sorts the values and keeps the keys which were indexing those values... sounds perfect ? sort of ... I also wanted it sorted from into descending values which was no problem either, arsort to the rescue, except one problem. It returned results that looked like like "30,20,100,1" which makes sense if you remember what I said about values being strings which looked like integers. These strings also had some leading blanks on the values due to being read from a file so they sorted kind of funny including the problem where 100 is between 20 and 1. One last tweak to the rescue, PHP has an option on the sort functions to treat values like integers.

The final code added was

arsort ($table, SORT_NUMERIC);

pretty sweet and not too much work for me. So for reference, a simple example in PHP looks like this.

$a = array ("a"=>"1","b"=>"2","c"=>"3","d"=>"4","e"=>"100");
arsort ($a, SORT_NUMERIC) ;

Every one knows Smalltalk is my favourite language, it's just done right... so how would it compare ?

Lets start with a table in array form which looks like this. I set it up so it matches PHP.

| table |
table := #(('a' '1')('b' '2')('c' '3')('d' '4')('e' '100')).

sorted as strings ...

(table asSortedCollection: [ :a :b |(a at: 2) > (b at: 2)]) asArray

-> (('d' '4') ('c' '3') ('b' '2') ('e' '100') ('a' '1'))

sorted as numbers ...

(table asSortedCollection: [ :a :b | (a at: 2) asNumber > (b at: 2) asNumber]) asArray

-> (('e' '100') ('d' '4') ('c' '3') ('b' '2') ('a' '1'))

Exactly what I wanted - properly sorted arrays, not too much code.

How about our old friend Java ? ... sigh ...

String[][] anArray = {{ "a", "1"},{"b","2"},{"c","3"},{"d","4"}, {"e","100"}};

and a print helper

private static void printArray(String[][] anArray, String message) {
System.out.println(message);
for (String[] i : anArray) {
System.out.println(i[0] + " " + i[1]);
}
}

some snippets. Below is my first try, sure it compiles but it doesn't work.

System.out.println("This sort will runtime error... but compiles just fine.");
Arrays.sort(anArray); // type safety ? ha ! still get a runtime error ...
printArray(anArray, "Sorted using Arrays.sort no, extra comparator.");

Yes, I get a runtime error in the library when it does a cast to Comparable.

Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.String; incompatible with

java.lang.Comparable
at java.util.Arrays.mergeSort(Arrays.java:1171)
at java.util.Arrays.sort(Arrays.java:1095)
at stringsort.Test.main(Test.java:37)

Indeed, type safety is not a catch all, runtime errors are still possible (remember that kids when you try to tell your boss it compiled so it's gotta work). In fact, don't forget to mention cases like this to people who don't understand #doesNotUnderstand:


To fix, you add a comparator. Pretty simple (but blocks are simpler).

class CompareAsStrings implements Comparator {
public int compare(String[] a, String[] b) {
return b[1].compareTo(a[1]);
}} ;

and the code looks like this

Arrays.sort(anArray, new CompareAsStrings());
printArray(anArray, "Sorted using CompareAsStrings");

of course the output is string sorted (and wrong for what I want).

Sorted using CompareAsStrings
d 4
c 3
b 2
e 100
a 1

so instead of this, you add a compatator which converts to Integers.

class CompareAsInts implements Comparator {
public int compare(String[] a, String[] b) {
return new Integer(Integer.parseInt(b[1])).compareTo(new Integer (Integer.parseInt(a[1])));
}};


and you finally get happy sorted values...

Sorted using CompareAsInts
e 100
d 4
c 3
b 2
a 1

Of course some bozo is gonna tell me that they would have created a class for the key/value pairs which would have implemented comparable and all would have been fine. Yes, and it would have been a page of code.

So, what do you learn from this ... nothing really, just some random ranting about Java verbosity but if you're the lesson learning type...

Lesson #1 Wanna be agile ? Write less code, use a dynamic language like Smalltalk or PHP.
Lesson #2 Never trust the compiler despite how type safe it claims to be.

2 Comments:

Blogger Nick Edgar said...

I would have implemented the comparator as:

class CompareAsInts implements Comparator {
public int compare(String[] a, String[] b) {
return Integer.parseInt(b[1]) - Integer.parseInt(a[1]);
}};

But that does not negate the fact the Java blows chunks compared to ST ;-).

9:48 AM  
Blogger John Duimovich said...

yeah, I guess but I just translated the code direcly.
So, from

string.compareTo (string2)
a simple conversion each to Integers and run the compareTo for int. Just leave the rest of the code the same (including the compareTo)... this way I do not need to know what compareTo returns ( 0,1,-1 )

I admit I could have let autoboxing take care of the Integer parameter and had code like this

new Integer(Integer.parseInt(b[1])).compareTo(Integer.parseInt(a[1]));

but what I really want to write is

Integer.parseInt(b[1])).compareTo(Integer.parseInt(a[1])

since the only change I really need is the conversions to ints. The rest of the wrappering and silliness is due to ref types and base types living in different worlds.

12:47 PM  

Post a Comment

<< Home