Natural Sort. MySQL. Sorting a 2D Array By Multiple Columns in PHP Naturally. Array_MultiSort with Natural Sort

MySQL has a biggest missing feature: It does not sort naturally. This has been an opened request since 2003, and MySQL team has no plan to fix this in a near future. I have looked for a solution to solve this problem for a while. Unfortunately, I found nothing useful. Here is a summary:

The + 0 Trick

Many developers suggest this solution:

SELECT names FROM your_table ORDER BY age + 0 ASC

This is not a good solution because it only works in some very specific situations. This query fails in most general cases.

Drupal Patch

Some developers suggest to use the patch from Drupal. Forget it, I could not even install it in my system.

My solution: Let PHP to handle the dirty work

So I decide to move the process to PHP. Before you say something about the performance issue, please keep reading my story first. I will talk about it in the later section.

MySQL is a relational database, it stores everything in a table. If we express that in an array, it will look something like this:

Array
(
    [1] => Array
        (
            [Name] => John
            [Identification] => 10
            [Address] => 100 Michigan Ave.
        )

    [2] => Array
        (
            [Name] => Peter
            [Identification] => 1000
            [Address] => 1 Michigan Ave.
        )

    [3] => Array
        (
            [Name] => Tom
            [Identification] => 2
            [Address] => 10 Michigan Ave.
        )

    [4] => Array
        (
            [Name] => Paul
            [Identification] => 1
            [Address] => 10 Michigan Ave.
        )

)

where the array key represents the record ID in the table.

Now, let say I want to sort the contents in the following orders:

  1. Sort by Identification: Ascending
  2. Then by Address: Descending
  3. Then by Name: Ascending

I specify the orders in an array:

$order = array('Identification' => 'ASC',
			   'Address' => 'DESC',
			   'Name' => 'ASC');

And include the following function in my code:

//array_multiSort with natural sort
function naturalSort2D(&$array){

   if (!function_exists(naturalSort2DCompare)){
	function naturalSort2DCompare($a, $b){
		global $order;

		foreach($order as $key => $value){
			
			if (!isset($a[$key])) continue;

			unset($compareResult);

                        //Case insensitive string comparisons using a "natural order" algorithm
			$compareResult = strnatcasecmp($a[$key], $b[$key]);
			
			if ($compareResult === 0) continue;
			
			$value = strtoupper(trim($value));
			
			if ($value === 'DESC'){
				$compareResult = $compareResult*-1;
			}
			
			return $compareResult;
		}
		
		return 0;
	}
   }
	
   //Maintain index association
   //Preserve array keys
   //Use usort if you don't care about key index
   uasort($array, 'naturalSort2DCompare');
	
   return true;
}

Now I sort the array:

naturalSort2D($array);

And I get my result like the following:

Array
(
    [4] => Array
        (
            [Name] => Paul
            [Identification] => 1
            [Address] => 10 Michigan Ave.
        )

    [3] => Array
        (
            [Name] => Tom
            [Identification] => 2
            [Address] => 10 Michigan Ave.
        )

    [1] => Array
        (
            [Name] => John
            [Identification] => 10
            [Address] => 100 Michigan Ave.
        )

    [2] => Array
        (
            [Name] => Peter
            [Identification] => 1000
            [Address] => 1 Michigan Ave.
        )

)

Performance

Before we talk about the performance issue, I like to talk about my database design first. In my application, the records are stored in a table. It is a multiple user system. Each user can access their own records only. Each user has about 3000 records, and we have 1000 users. Therefore we have about 3 millions of records in one table.

Here is a data structure:

#Information input by users
Name
Identification
Address

#Information input by system
ID
UserID

So the SQL quere is nothing more than something like:

Select * FROM `database`.`table` WHERE `UserID` = '1'

Previously, I let MySQL to handle the filter and sorting, i.e.,

Select * FROM `database`.`table` WHERE (`UserID` = '1') AND (`Address` LIKE `%Michigan%`) ORDER BY `Identification` ASC LIMIT 5, 10

After moving to PHP, I just let MySQL to handle the filter,

Select * FROM `database`.`table` WHERE (`UserID` = '1') AND (`Address` LIKE `%Michigan%`)

Notice that I will get all qualified records, could be few thousands. Then I let PHP to do the sorting:

$order = array('Identification' => 'ASC');
naturalSort2D($array);

Now I have a sorted array. Next I need to slice the records:

$array = array_slice($array, 5, 10);

Sounds like lots of work huh? Now let’s talk about the performance impact. Guess what, I see performance improvement for about 15%!

In the new approach, MySQL does lesser work. PHP does more work. Apparently, PHP works more effectively than MySQL. However, this may not true for all database model. Remember that in my situation, the system needs to handle at most few thousands records at a time? So I think PHP is doing better in this range. If your application needs to handle millions of records at a time, e.g., bookstore, then you better let MySQL to do the work.

By the way, the new approach allows to sort the dynamic generated content, which is a big plus in my situation. That is something I cannot do in MySQL approach.

Have fun.

–Derrick

Our sponsors:

[PHP]How to get the number of CPU cores in Fedora, Ubuntu, and FreeBSD

I am working on a PHP application which needs to be deploy to multiple places. One of the variables is the number of CPU core. Since each server has different number of CPU cores, I need to specify the number of CPU core in my application for each server. I think there should be a smarter way to do it.

PHP is a scripting language, it has limit support on accessing the hardware level information. In short, there is no library or function to do it. Fortunately, we can do it via shell command. In the other words, the following methods are not limited to PHP, it will work in any languages, as long as your language supports running the UNIX command and catch the return.

Getting the number of CPU Cores – Linux

(Tested on Fedora, Ubuntu Linux, should work on other Linuxs because they all use the same Linux kernel.)

cat /proc/cpuinfo | grep processor | wc -l

This will return something like this:

8

Getting the number of CPU Cores – FreeBSD

sysctl -a | grep 'hw.ncpu' | cut -d ':' -f2

which will return something like this (notice the extra space before the number):

8

Now, let’s put everything together. Run the command inside your application (Here I am using PHP for example):

//Linux
$cmd = "cat /proc/cpuinfo | grep processor | wc -l";

//FreeBSD
$cmd = "sysctl -a | grep 'hw.ncpu' | cut -d ':' -f2";

$cpuCoreNo = intval(trim(shell_exec($cmd)));

Of course, you can make the application to detect the system automatically:

$cmd = "uname";
$OS = strtolower(trim(shell_exec($cmd)));

switch($OS){
   case('linux'):
      $cmd = "cat /proc/cpuinfo | grep processor | wc -l";
      break;

   case('freebsd'):
      $cmd = "sysctl -a | grep 'hw.ncpu' | cut -d ':' -f2";
      break;

   default:
      unset($cmd);
}

if ($cmd != ''){
   $cpuCoreNo = intval(trim(shell_exec($cmd)));
}

That’s it! Happy PHPing.

–Derrick

Our sponsors:

Comments on running webservers at home – Part 1

I like to experiment all kind of computer technologies especially server applications. I am not one of those who is satisfied with just making a website and putting it in somewhere. I like the full control of the server because I like to try different combination of applications for performance tuning. For example, I doubt regular web hosting company allows you to host the entire system on a ram disk with a reasonable price tag. That’s why I choose to host a server at home. That’s a lot cheaper, plus I have the full control. However, running server application may generate lots of upstream traffic. That’s why most internet server providers (such as Charter) do not allow their customers to run any server related applications using their internet connection service. They do it by blocking most service related common ports, e.g., 80 (HTTP), 21 (FTP), 22 (SSH) etc. So there is really nothing you can do other than hosting your applications on different ports.

Two years ago, I started hosting all of my websites using my own computers. I found a number of benefits.

Benefits of hosting websites at home

1. It saves me tons of money.

I was paying $72/year per domain for web hosting. Since I have more than ten domains, the total running costs per year is pretty high. This amount is really nothing comparing to the cost of the electricity.

Monthly cost for web hosting:

$72 per domain/yr * 10 domains / 12 months
= $60 per domain / month

My monthly electricity cost at home, which includes everything such as running 10 non-gaming computers, washer, dryer, lighting etc:

$80 / month

I haven’t tried measuring the exact energy but you can imagine the electricity used by computers should be under $10 / month.

2. It is fun (and environmental friendly too).

I have few stone-age computers including a Pentium II laptop, a Mac G3 (speed wise similar to Pentium II), a Pentium M Celeron laptop etc. I integrated them to a web server farm (web clusters). Since running a web server does not require a lot of CPU power, they are doing okay for hosting low-traffic websites. Also, it is cool to show off my friends the global data center that I build for my websites.

3. Your data is secured!

Have you ever heard of any bank host their web sites on web hosting? No matter what type of encryption you use for your web applications, you still need to process the raw, original, and unencrypted data on the server side at one point. Processing confidential information on a shared server is like talking your secrets in a study room in a public library. You think you are in an isolated environment, but you can be surveillanced, it’s very simple and easy.

Here is an example:
Supposes I have a web application which accepts the confidential information from my users, and all traffics are encrypted. After the confidential information is decrypted on the server-side, my web application processes the raw information and do further things.

Let’s say the server environment is Apache + PHP + MySQL, the most popular combination of web application environment. Since they are all open-source, it is very easy to modify the source codes and log every single thing into a file, including the raw, original, unencrypted data processed by my web application.

You may think this may require lots of work and it will never happen on you. What if your competitor wants your confidential information? It doesn’t cost much to hire someone to do it.

Sounds scary?

More scary things come along. Shared web hosting (hosting multiple domains on one single server) always come with lots of trouble that many people are ignored. In theory, every website on a shared hosting lives in a virtual, independent environment, think about it as a virtual machine like VMWare or Hyper-V. Practically, it is not easy to set up such environment (e.g., FreeBSD Jail) and many web hosting companies choose to go with a less difficult path, because customers will not realize it anyway. Now here is the interesting part, supposes my domain and your domain are hosted on the same server. I can access the resource at the operating system level first (which will required some hacking), then access your file after that. Now I have access your source code and I can do whatever I want.

The most secure place in the world is the place that can be accessed by you, and no body else, i.e., your home, or any place you have full control

Our sponsors:

Think twice before upgrading your server to PHP 5.3.3

I upgraded to PHP 5.3.3 today, and I was shocked! I found a huge security hole default setting change that may leak the source code to public, which never happened in the earlier version.

Here is an example. Suppose my code looks like this:

<?php

echo "This is a good example";

?>

Everything will turn out fine. Here is the result:

Good Example

However, if the open tag is changed from <?php to <?:

<?

//My comment
echo "This is a bad example";

?>

Then PHP 5.3.3 will not recognize the code within the < ? and ?> tag. Instead, it will parse it as a regular text file, i.e., your users will see the code. Here is the result:
Bad Example

I know that it happens on PHP for Windows (XAMPP for Windows), but I didn’t know that they move this new feature to Unix world.

Update:
Thanks for the help from bt garner, it is pretty simple to fill this security hole.

First, open the php.ini. If you are not sure where is it, you can run the following command to locate it:

sudo find / -name "php.ini"

And activate the following parameters:

short_open_tag = On

That’s it!

–Derrick

Our sponsors:

MediaWiki Error: “Parse error: syntax error, unexpected T_NAMESPACE, expecting T_STRING in /usr/local/www/apache22/data/wiki/includes/Namespace.php on line 46”


After upgrading my PHP to 5.3.2, I received the following error when accessing MediaWiki (1.6.12):

Parse error: syntax error, unexpected T_NAMESPACE, expecting T_STRING in /usr/local/www/apache22/data/wiki/includes/Namespace.php  on line 46

There are three solutions:

  1. Downgrade your PHP from 5.3 back to 5.2
  2. Upgrade your MediaWiki to a newer version.
  3. Fix the codes.

In my case, the first couple solutions do not work for me, and therefore I choose 3.

  1. Backup your wiki file first.
  2. Access your webserver and go to the MediaWiki/includes directory.
    For example:

    cd /usr/local/www/apache22/data/wiki/includes
  3. Edit Namespace.php
    nano Namespace.php
  4. Go to line 46, change the following from:
    class Namespace {

    to:

    class MWNamespace {

    Save the file.

  5. Now, let’s create a script:
    nano wikifix.sh

    Copy and paste the following:

    for fl in *.php; do
    mv $fl $fl.old
    sed 's/Namespace::/MWNamespace::/g' $fl.old > $fl
    done

    Save the file.

  6. Change the file to executable by:
    chmod u+x wikifix.sh
  7. Search the keyword Namespace:: and replace it by MWNamespace:::
    ./wikifix.sh

    Warning: Don’t run the script twice!

  8. Delete the script.
    rm wikifix.sh

Try to access the MediaWiki again and the problem should be gone.

Enjoy Wiki.

–Derrick

Our sponsors:

Weird PHP Error after FreeBSD port upgrade (PHP Warning: PHP Startup: Unable to load dynamic library)

Today I upgraded the port on my FreeBSD box using portmaster tools:

portmaster -Dai

After that, I found that the PHP gave the following weird error:

php -v

PHP Warning:  PHP Startup: Unable to load dynamic library '/usr/local/lib/php/20060613/memcached.so' - /usr/local/lib/php/20060613/memcached.so: Undefined symbol " php_session_create_id" in Unknown on line 0

PHP Warning:  PHP Startup: Unable to load dynamic library '/usr/local/lib/php/20060613/mysql.so' - Cannot open "/usr/local/lib/php/20060613/mysql.so" in Unknown on line 0

PHP 5.2.12 with Suhosin-Patch 0.9.7 (cli) (built: Apr  6 2010 15:50:36)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2009 Zend Technologies

In order to solve this problem, you will need to rebuild the port. In my situation, I need to clean up, uninstall, rebuild and install the affected ports:

Memcached: /usr/ports/databases/pecl-memcached

MySQL: /usr/ports/databases/php5-mysql

——————————————————————————————
Update on 2010.04.13:
Your situation may be different from mine. The tricky part is to rebuild the port at the right location. For example, if your PHP complains about pdo.so, you can rebuild the port at the affected location:

/usr/ports/database/php5-pdo

If you are not sure where does the port locate, you can search it on Google or find it with the following command:

find /usr/ports/ -name "*pdo*"

——————————————————————————————

In each affected port, run the following commands:

make clean
make deinstall
make
make install && sync
make clean

After that, restart the PHP by restarting Apache:

sudo apachectl restart

and run the php command again:

php -v

The error should be gone:

PHP 5.2.12 with Suhosin-Patch 0.9.7 (cli) (built: Apr  6 2010 15:50:36)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2009 Zend Technologies

Our sponsors:

Enough “Rails Surprise”! Goodbye Ruby On Rails.

Couple months ago, I published an article about Ruby On Rails. I was wrong.

The idea of Ruby On Rails is really good. You can create a very simple database-driven web application (i.e., User can add, browse, remove and search records) in less than half an hour. There is no way to do something similar in such a short time using other programming languages. However – perhaps this is the tradition of the Rails development team, it comes with surprise (“Rails Surprise”) every time after upgrading the Ruby gems component. They either make your Rails application couldn’t start, or some features in your apps are not functioning after the upgrade.

What kind of people should stick with Ruby On Rails?

  • Someone is big fan of Ruby/Gems/Rails.
  • Someone has plenty of spare time.
  • Someone likes to sharpen his/her debugging skill – To debug the backward incompatibility mess introduced by the Ruby/Gems/Rails core development team.
  • Someone is patient and is willing to spend number of hours(could be days) to look for workaround.

I am neither. I think I already have enough “Rails Surprise”. And I will make my own web application framework from now on (The code name is called ICEPIZZA by the way). It will be simple, elegant and backward compatible.

–Derrick

Our sponsors:

Installing XAMPP on Fedora 12 64-bit system

I was installing XAMPP on Fedora 11 64-bit today, and I saw this famous message:

sudo /opt/lampp/lampp startapache
XAMPP is currently only availably as 32 bit application. Please use a 32 bit compatibility library for your system.

The message is very clear that XAMPP doesn’t like 32-bit system. So, we will need to cheat XAMPP that the system is 32-bit. First, we will need to install some libraries:

sudo yum -y install glibc.i686 libgcc.i686 libstdc++.i686

and now, we need to skip the 32-bit annoying check. Use Nano or your favorite editor to open this file: /opt/lampp/lampp

sudo nano /opt/lampp/lampp

Replace the following from:

# XAMPP is currently 32 bit only
case `uname -m` in
       *_64)
       if /opt/lampp/bin/php -v > /dev/null 2>&1
       then
               :
       else
               $de && echo "XAMPP gibt es zur Zeit nur als 32-Bit Applikation. Bitte verwende eine 32-Bit Kompatibilitaetsbibliothek fuer Dein System."
               $de || echo "XAMPP is currently only availably as 32 bit application. Please use a 32 bit compatibility library for your system."
               exit
       fi
       ;;
esac

To:

# XAMPP is currently 32 bit only
#case `uname -m` in
#       *_64)
#       if /opt/lampp/bin/php -v > /dev/null 2>&1
#       then
#               :
#       else
#               $de && echo "XAMPP gibt es zur Zeit nur als 32-Bit Applikation. Bitte verwende eine 32-Bit Kompatibilitaetsbibliothek fuer Dein System."
#               $de || echo "XAMPP is currently only availably as 32 bit application. Please use a 32 bit compatibility library for your system."
#               exit
#       fi
#       ;;
#esac

Now run the following the start XAMPP:

sudo /opt/lampp/lampp startapache

Wow! Everything is working great!

–Derrick

Our sponsors:

Tornado Web Server – Very Fast!

I just did a benchmark comparison on Tornado Web Server and Apache + PHP server. The result is pretty amazing.

What to test:

A simple Hello World application.

How:

ab -n 1000 -c 300 http://myserver.ip

Test Environment

OS: FreeBSD 7.1
CPU: Pentium III 933 MHz
Ram: 256 MB
PHP: 5.2.9
Python: 2.5.4
Tornado: 0.2

Result:

Apache + PHP:

Concurrency Level:      300
Time taken for tests:   3.515 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      255000 bytes
HTML transferred:       12000 bytes
Requests per second:    284.50 [#/sec] (mean)
Time per request:       1054.499 [ms] (mean)
Time per request:       3.515 [ms] (mean, across all concurrent requests)
Transfer rate:          70.85 [Kbytes/sec] received

Tornado Web Server:

Concurrency Level:      300
Time taken for tests:   1.907 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      168000 bytes
HTML transferred:       12000 bytes
Requests per second:    524.48 [#/sec] (mean)
Time per request:       571.993 [ms] (mean)
Time per request:       1.907 [ms] (mean, across all concurrent requests)
Transfer rate:          86.05 [Kbytes/sec] received

What do these numbers mean?

Tornado Web Server can handle 1.84 times more requests than Apache + PHP server in a given time!

–Derrick

Our sponsors:

Updated: What technologies (or toys) I am playing recently?

Recently I am playing the following technologies (toys):

  1. Tornado Web – Not sure how long will Apache + PHP last. Time to learn Python.
  2. Tokyo Cabinet & Tyrant – Another one of my long term plans to step away from SQL.
  3. PHP API for Tokyo Tyrant — I decide to write my own because I cannot find a good one.
  4. Moving my development platform from Windows to FreeBSD. My servers have been already on FreeBSD for many years. I think now is a good time to move my desktop systems to FreeBSD as well. Why not Linux? Oh Well…
  5. LUA – I may pick up Lua if time is permitted.

Looks like I will have a busy time other than shoveling the snow this Winter.

–Derrick

Our sponsors: