Archive for the ‘code’ Category

Check size constraints for multidimensional arrays

Monday, February 8th, 2010

Multidimensional arrays are commonly used as a complex data structure, and they are often used to pass data between software systems internally. Some programming languages, however, offer no way to constrain the size of an array at any stage of its lifetime; PHP, for instance, cannot do this. A way must therefore be devised to make sure the structure of the passed data conforms to what the receiving system expects.

Think, for instance, of pixel convolution matrices. These are usually 3 by 3.

Here is my recursive PHP function, which checks sizes at any dimensional depth:

function isDimension($arr, $dimensions)
{
   $pass = true;
   $amount = array_shift($dimensions);
 
   if($amount === -1 || count($arr) === $amount)
   {
      if(!empty($dimensions))
      {
         foreach($arr as $element)
         {
            if(is_array($element))
            {
               if(!isDimension($element, $dimensions))
               {
                  $pass = false;
               }
            }
            else
            {
               $pass = false;
            }
         }
      }
   }
   else
   {
      $pass = false;
   }
   return $pass;
}

And when we utilize this function in the following way:

$subject = array(	array(1,2,3),
			array(2,3,4),
			array(3,4,5));
 
if(isDimension($subject, array(3,3)))
{
   print "Yay";
}
else
{
   print "Nay";
}

The output will be:

Yay

If the size of a certain dimension does not matter, you can skip it by entering -1 as the expected size:

$subject = array(	array(1,2,3),
			array(2,3,4),
			array(3,4,5),
			array(4,5,6));
 
$wellIsIt = isDimension($subject, array(-1,3)); // Returns true
$wellIsIt = isDimension($subject, array(4,3)); // Returns true
$wellIsIt = isDimension($subject, array(3,3)); // Returns false

This is a recursive function, meaning that it calls itself until a certain condition is met. It then cascades the result back to the initial function call, which in turn passes it back to whatever script called the function. More about this can be read on Wikipedia.
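The same recursive check can be sketched in JavaScript, the other language used on this blog. This is a port for illustration only, not part of the original PHP function: the name `isDimension` and the `-1` wildcard are carried over, the rest (such as the use of `Array.isArray` and `every`) is an assumption about how one might write it.

```javascript
// Recursive size check, ported from the PHP version above for illustration.
// dimensions lists the expected sizes, outermost first; -1 means "any size".
function isDimension(arr, dimensions) {
  if (!Array.isArray(arr)) {
    return false;
  }
  const [amount, ...rest] = dimensions;
  if (amount !== -1 && arr.length !== amount) {
    return false;
  }
  if (rest.length === 0) {
    return true; // no deeper levels left to check
  }
  // Every element must itself be an array that matches the remaining sizes.
  return arr.every((element) => isDimension(element, rest));
}

const kernel = [[1, 2, 3], [2, 3, 4], [3, 4, 5]];
console.log(isDimension(kernel, [3, 3]));  // true
console.log(isDimension(kernel, [-1, 3])); // true
console.log(isDimension(kernel, [3, 2]));  // false
```

Each call handles one level and hands the remaining sizes to the next level, which is the cascade described above.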

There ya go

Javascript: Remove values from array prototype

Wednesday, October 29th, 2008

In order to remove certain values or objects from an array, many people iterate through it and remove unwanted occurrences with the splice() method. It does what you want it to do, but it’s pretty cumbersome. There are far easier and simpler ways to remove a certain value from an array.

If we have an array like arr = [1,2,2,3,4,5,2,6] and call arr.remove(2), we end up with [1,3,4,5,6]. Nobody likes twos anyway.

Here is how I do it:

Array.prototype.remove = function (subject) {
	var r = new Array();
	for(var i = 0, n = this.length; i < n; i++)
	{
		if(!(this[i]==subject))
		{
			r[r.length] = this[i];
		}
	}
	return r;
}

Why is this better?

  • No copy of the original is needed to do the math
  • It is yet again inherently independent of other array prototypes or methods
  • It runs a lot faster ;)
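For comparison, here is a minimal sketch of the same idea without touching `Array.prototype`, assuming an environment that has `Array.prototype.filter` (ES5 and later). The name `removeValue` is made up for this example. Unlike the prototype function above it compares with `!==`, so `'2'` and `2` are treated as different values.

```javascript
// Returns a new array without the entries that are strictly equal to subject.
function removeValue(arr, subject) {
  return arr.filter(function (item) {
    return item !== subject;
  });
}

var arr = [1, 2, 2, 3, 4, 5, 2, 6];
var cleaned = removeValue(arr, 2);
console.log(cleaned);    // [ 1, 3, 4, 5, 6 ]
console.log(arr.length); // 8, the original array is left untouched
```

Like the prototype version, this builds a new array rather than splicing the original in place.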

So there you go, another mystery solved.
Take care!

PHP: Image Resizer Class

Thursday, June 19th, 2008

Being able to resize images automatically is a must for websites where users can upload their own images, avatars and so on. In PHP we can do this with the GD library. We don’t want users to upload pictures that are 2000 pixels wide and 1 pixel high. This will disrupt the layout of the site on which they are shown.

Thus we want the image to be resized to sane proportions and dimensions so that we can be sure it fits nicely into our website’s layout.

Therefore I made an image resizing class myself. It actually resamples the images for better results. Here are some of the features:

  • Handles JPG, GIF, PNG and BMP image files
  • Can make multiple resized copies
  • Can simply duplicate the original
  • Can crop images to a certain aspect ratio, maintaining proportions
  • Can save to JPG, GIF, PNG and BMP
  • Can return the image as a string (e.g. for database storage)
  • Can return the image to the browser
  • Throws exceptions for unexpected behaviour

Here is some code that shows how to utilize the class in your PHP application:

include "imageresizer.php";
 
// Increase the allowed memory size for the bigger images
ini_set('memory_limit', '32M');
 
try {
	$image = new imageResizer('original.jpg');
 
	// Make a smaller version of the original
	$image->resize(400, 300, 400, 100);
	$image->save('result.gif', GIF);
 
	// Make a thumbnail of the original
	$image->resize(100, 100);
	$image->save('thumb_result.png', PNG);
 
	// Retrieve the thumbnail as a string as BMP and show it in the browser as JPG
	$string = $image->getString(BMP);
	$image->show(JPG);
}
catch(Exception $e) {
	// Catch and display any exceptional behaviour
	print $e->getMessage();
	exit();
}
 
// Destroy object (executes the destructor) and more importantly, frees up memory
$image = null;

The resize method accepts four values: maximum width, maximum height, minimum width and minimum height. When you use the minimums, either the width pair or the height pair should be equal. In the example above the width is fixed at 400 pixels and the height of the result must be between 100 and 300 pixels. If the original can’t be fitted within this aspect ratio constraint, the edges are cropped off until it fits. This way the image is not stretched or compressed and we have control over both proportions and dimensions.
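To make the constraint a little more concrete, here is a rough sketch in JavaScript of how such a calculation could work for the fixed-width case. It is not taken from the imageResizer class; the function `planResize`, its parameters and the returned fields are all hypothetical, and the class itself may well compute things differently.

```javascript
// Hypothetical helper: works out the output size for a fixed width and a
// height range, and reports how much of the scaled image must be cropped.
function planResize(srcWidth, srcHeight, width, maxHeight, minHeight) {
  // Scale so the width matches exactly, keeping the aspect ratio.
  var scale = width / srcWidth;
  var scaledHeight = Math.round(srcHeight * scale);

  if (scaledHeight > maxHeight) {
    // Too tall: keep the scale and crop the surplus rows off the height.
    return { width: width, height: maxHeight, cropX: 0, cropY: scaledHeight - maxHeight };
  }
  if (scaledHeight < minHeight) {
    // Too flat: scale until the minimum height is reached instead, then
    // crop the surplus columns off the width.
    var scaledWidth = Math.round(srcWidth * (minHeight / srcHeight));
    return { width: width, height: minHeight, cropX: scaledWidth - width, cropY: 0 };
  }
  // Fits within the range: no cropping needed.
  return { width: width, height: scaledHeight, cropX: 0, cropY: 0 };
}

console.log(planResize(2000, 1000, 400, 300, 100)); // fits: 400 x 200, nothing cropped
console.log(planResize(800, 1600, 400, 300, 100));  // too tall: 400 x 300, 500 rows cropped
```

In both cases the image is only scaled uniformly and then cropped, never stretched.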

Download the imageResizer class here!

If you come up with any improvements or bugs please let me know.

T-SQL: Breadth-First shortest-route search

Wednesday, June 18th, 2008

A while ago I came across the problem of determining the shortest route in a many-to-many self-join table. The linker table consists of two ID columns to link nodes together and is called `vertices`. This represents an unweighted and undirected node graph.

In an unweighted, undirected graph there can be no heuristic search, for there is no clue (weight or direction) as to which neighbouring node gets you to the target quickest. An uninformed search is therefore the only option, and for this I used the breadth-first search algorithm.

The breadth-first algorithm visits nodes generation by generation rather than branch by branch. In an unweighted graph this means that the first generation in which the target shows up also gives the shortest route.
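The generation-by-generation idea can be sketched outside the database as well. The snippet below is an illustration in JavaScript, not a translation of the stored procedure; the function name `shortestDistance` and the edge-list format `[a, b]` are assumptions made for the example.

```javascript
// Breadth-first search over an undirected, unweighted edge list.
// Returns the number of steps from `from` to `to`, or -1 if unreachable.
function shortestDistance(edges, from, to) {
  // Build an adjacency list; every edge is usable in both directions.
  const neighbours = new Map();
  for (const [a, b] of edges) {
    if (!neighbours.has(a)) neighbours.set(a, []);
    if (!neighbours.has(b)) neighbours.set(b, []);
    neighbours.get(a).push(b);
    neighbours.get(b).push(a);
  }

  const visited = new Set([from]);
  let frontier = [from]; // all nodes of the current generation
  let generation = 0;

  while (frontier.length > 0) {
    if (frontier.includes(to)) {
      return generation;
    }
    // Expand the whole generation before moving one step further out.
    const next = [];
    for (const node of frontier) {
      for (const neighbour of neighbours.get(node) || []) {
        if (!visited.has(neighbour)) {
          visited.add(neighbour);
          next.push(neighbour);
        }
      }
    }
    frontier = next;
    generation += 1;
  }
  return -1;
}

const edges = [[1, 2], [1, 3], [2, 4], [3, 4], [4, 5]];
console.log(shortestDistance(edges, 1, 5)); // 3
console.log(shortestDistance(edges, 1, 6)); // -1
```

Because a whole generation is expanded before the next one starts, the search stops at the lowest generation that contains the target.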

T-SQL Implementation:

CREATE PROCEDURE BFS(
	@FROM INT,
	@TO INT
)
AS
 
BEGIN
	DECLARE @Nodes TABLE (Generation INT, p INT, r INT, UNIQUE(p, r))
	DECLARE @Generation INT
 
	SELECT @Generation = 0
 
	INSERT @Nodes
		(
			Generation,
			p
		)
	SELECT	@Generation,
			@FROM
 
	WHILE @@ROWCOUNT > 0 AND NOT EXISTS (SELECT * FROM @Nodes WHERE p = @TO)
		BEGIN
			SELECT @Generation = @Generation + 1
 
			INSERT	@Nodes
				(
					Generation,
					p,
					r
				)
 
			SELECT	@Generation,
					bid,
					aid
			FROM	vertices
			WHERE	aid IN (SELECT p FROM @Nodes WHERE Generation = @Generation - 1)
				AND bid NOT IN (SELECT p FROM @Nodes)
			UNION
			SELECT	@Generation,
				aid,
				bid
			FROM	vertices
			WHERE	bid IN (SELECT p FROM @Nodes WHERE Generation = @Generation - 1)
				AND aid NOT IN (SELECT p FROM @Nodes)
 
		END
 
	-- Backtracing method: Traces the route back from target to start
 
	DECLARE @Backtrace TABLE
	( p INT )
 
	INSERT @Backtrace VALUES(@TO)
 
	WHILE @Generation > 0
		BEGIN
			DELETE FROM @Nodes
			WHERE Generation = @Generation
				AND p NOT IN(SELECT p FROM @Backtrace)
 
			INSERT @Backtrace
				( p )
 
			SELECT DISTINCT r
			FROM @Nodes
			WHERE Generation = @Generation
 
			SELECT @Generation = @Generation - 1
		END

	-- Return the remaining rows: only the nodes that lie on a shortest route
	SELECT * FROM @Nodes ORDER BY Generation, r, p
END

The procedure returns the following columns:

  • Generation: The number of expansion steps from the start node
  • P: The progressive node ID
  • R: The regressive node ID

The regressive ID is in fact the node’s parent and the progressive ID its child. This way we can obtain the (outward) direction in the result set. Results can represent multiple shortest routes, as long as they all have the minimum number of steps (generations) necessary.

Suppose we want to obtain the shortest route(s) between the two nodes with IDs 776 and 777.

Execution of the Stored Procedure:

EXEC BFS @FROM = 776,  @TO = 777

The result set:

Generation  p     r
0           776   NULL
1           2881  776
1           3198  776
2           3362  2881
2           1582  3198
3           1579  1582
3           1262  3362
4           777   1262
4           777   1579

As you can see this result set proposes two shortest routes between nodes 776 and 777.
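To read the individual routes out of such a result set, one can follow the r column from the target back to the start. The following JavaScript sketch does that on the rows shown above; the function name `routesFromRows` and the `[generation, p, r]` row format are assumptions made for this example and are not part of the stored procedure.

```javascript
// Rows as returned by the procedure: [generation, p, r]; r is null for the start.
const rows = [
  [0, 776, null],
  [1, 2881, 776],
  [1, 3198, 776],
  [2, 3362, 2881],
  [2, 1582, 3198],
  [3, 1579, 1582],
  [3, 1262, 3362],
  [4, 777, 1262],
  [4, 777, 1579],
];

// Walks from the target back to the start along the r (parent) column and
// returns every complete route, listed from start to target.
function routesFromRows(rows, from, to) {
  function walk(node) {
    if (node === from) {
      return [[from]];
    }
    const routes = [];
    for (const [, p, r] of rows) {
      if (p === node && r !== null) {
        for (const route of walk(r)) {
          routes.push(route.concat(node));
        }
      }
    }
    return routes;
  }
  return walk(to);
}

// Prints the two routes 776, 2881, 3362, 1262, 777 and 776, 3198, 1582, 1579, 777.
console.log(routesFromRows(rows, 776, 777));
```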

In practice this technique has its limitations. It is not heuristic, so brute force is required. With each generation that is expanded, the number of nodes to be dealt with can grow exponentially. Luckily there are ways to cut down the exponential strain this procedure puts on the database. One of these is the bidirectional search method. Alas, I will not disclose that particular implementation.

Go code or something ;)

Javascript: Remove duplicates from Array

Friday, December 21st, 2007

Removing duplicate entries or values from a Javascript array is something which isn’t accommodated by the native functions in Javascript. I searched Google for a few solutions, but they were all lacking something in my opinion, be it performance or just sheer elegance.

*UPDATE*

There is now an array prototype function for removing duplicates. I suggest you use the prototype function instead of the functions below. It is the way it should have been done in the first place and is faster and moreover the most correct way to do it.

Array.prototype.unique = function () {
	var r = new Array();
	o:for(var i = 0, n = this.length; i < n; i++)
	{
		for(var x = 0, y = r.length; x < y; x++)
		{
			if(r[x]==this[i])
			{
				continue o;
			}
		}
		r[r.length] = this[i];
	}
	return r;
}

You can now utilize the unique function like this:

var arr = [1,2,2,3,3,4,5,6,2,3,7,8,5,9];
var unique = arr.unique();
alert(unique);

The result will be [1,2,3,4,5,6,7,8,9].

*UPDATE*

I therefore made two variants myself:

The first one does what you’d expect: it regards the first encountered entry as the original and all subsequent entries as duplicates.

function unique(a)
{
   var r = new Array();
   o:for(var i = 0, n = a.length; i < n; i++)
   {
      for(var x = 0, y = r.length; x < y; x++)
      {
         if(r[x]==a[i]) continue o;
      }
      r[r.length] = a[i];
   }
   return r;
}

If we pass the following array [1, 2, 3, 1, 4, 5] to the function the result will be [1, 2, 3, 4, 5].

The second variant returns different results. It will regard the last encountered duplicate as the original. This may be desirable in certain situations.

function unique(a)
{
   var r = new Array();
   o:for(var i = 0, n = a.length; i < n; i++) {
      for(var x = i + 1 ; x < n; x++)
      {
         if(a[x]==a[i]) continue o;
      }
      r[r.length] = a[i];
   }
   return r;
}

The output in this case will be [2, 3, 1, 4, 5].

I wrote both these functions with performance in mind. I have thoroughly tested and profiled both of them. Here are the benefits over other solutions to the problem:

  • it’s just one function
  • the loop length is determined through the object model just once per loop (see the for statements: var i = 0, n = a.length;)
  • after a duplicate has been detected in the nested loop it doesn’t iterate further but goes on to the next entry in the source array (see continue statement)

You can easily adapt the function to make it a prototype function of an Array object. If you don’t know how to do it, just ask and I’ll add it.
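For readers on a newer JavaScript engine, the two behaviours described above can also be sketched with built-ins. This assumes `Set` (ES2015) and `Array.prototype.lastIndexOf` are available; the names `uniqueFirst` and `uniqueLast` are made up here to keep the two variants apart, since both functions above are called `unique`. Note that these compare without type coercion, so `1` and `'1'` are kept as separate entries, unlike the `==` comparison used above.

```javascript
// Keeps the first occurrence of every value. A Set preserves insertion order.
function uniqueFirst(a) {
  return Array.from(new Set(a));
}

// Keeps the last occurrence of every value.
function uniqueLast(a) {
  return a.filter(function (value, index) {
    return a.lastIndexOf(value) === index;
  });
}

var input = [1, 2, 3, 1, 4, 5];
console.log(uniqueFirst(input)); // [ 1, 2, 3, 4, 5 ]
console.log(uniqueLast(input));  // [ 2, 3, 1, 4, 5 ]
```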

Hope this helps someone out!

Regex: validate e-mail address in PHP

Friday, December 21st, 2007

Validating whether an e-mail address conforms to the (informal) e-mail address specification is a good way of improving the overall validity of the information entered by a user. Regular expressions can help us in accomplishing this.

The following example doesn’t follow the RFC 2822 specification per se but enforces the more strict rules which are common for the internet nowadays and for instance enforced by Live Hotmail and Gmail.

The Wikipedia article about e-mail addresses is somewhat easier to digest than the original specification and is a good starting point for e-mail validation techniques.

Here goes!

Anatomy of, for example, john@doe.example.com:

  • john is the local name
  • doe is the subdomain name
  • example is the domain name
  • com is the top-level domain name

The rules:

  • Local name may only contain letters, digits, hyphen ‘-’, underscore ‘_’ and periods ‘.’
  • Local name may not begin and/or end with a period
  • Local name may not contain two or more subsequent periods ‘..’
  • Sub-domain name may only contain letters, digits and hyphen
  • Multiple sub-domains are permitted
  • Domain name may only contain letters, digits and hyphen
  • Domain name and sub-domain name may not begin and/or end with a hyphen
  • Domain name and sub-domain name must be between 2 and 63 characters long
  • Top-level domain may only contain letters
  • Top-level domain must be between 2 and 6 characters long
  • Sub-domain names, the domain name and the top-level domain name are separated by single periods ‘.’

The regular expression (Perl compatible):

/^[A-Za-z0-9\-_]+(\.[A-Za-z0-9\-_]+)*@(([A-Za-z0-9]+\-?[A-Za-z0-9]+)+\.)+[A-Za-z]{2,6}$/

And a PHP function to validate the e-mail addresses:

// Pass the e-mail address to the function.
// Will return true when valid, false when invalid.
// Note: the range A-Za-z is used instead of A-z, because A-z also matches
// the characters [ \ ] ^ _ and ` that sit between Z and a in ASCII.
function validate_emailaddress($input) {
   return preg_match('/^[A-Za-z0-9\-_]+(\.[A-Za-z0-9\-_]+)*@(([A-Za-z0-9]+\-?[A-Za-z0-9]+)+\.)+[A-Za-z]{2,6}$/', $input) === 1;
}

Note that the solution shown will reject a lot of possible RFC 2822 compliant e-mail addresses as invalid, but these are mostly used in intranet settings. Stricter rules apply for the web and these are the basis for the expression I’ve made.

What the expression doesn’t account for:

  • It doesn’t check whether the subdomain and domain names’ length exceeds the maximum of 63 characters
  • Every made up top-level domain will pass, given it is between 2 and 6 characters long
  • Something I might have missed?

If you have a solution for the aforementioned shortcomings, please let me know!
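One possible way to cover the length shortcoming is to check the address label by label instead of with a single expression. The JavaScript sketch below is only an illustration of the rules listed above, not a replacement for the PHP function; the name `isValidEmailAddress` is made up, and details such as allowing several hyphens inside a label are assumptions.

```javascript
// Sketch of the rules above, checked part by part instead of with one regex.
function isValidEmailAddress(input) {
  var parts = input.split('@');
  if (parts.length !== 2) {
    return false;
  }
  var local = parts[0];
  var labels = parts[1].split('.');

  // Local name: letters, digits, '-', '_' and single periods between them.
  if (!/^[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+)*$/.test(local)) {
    return false;
  }
  // At least a domain name and a top-level domain are required.
  if (labels.length < 2) {
    return false;
  }
  // Top-level domain: letters only, 2 to 6 characters.
  var tld = labels.pop();
  if (!/^[A-Za-z]{2,6}$/.test(tld)) {
    return false;
  }
  // Every (sub-)domain label: 2 to 63 characters, no leading or trailing hyphen.
  return labels.every(function (label) {
    return label.length >= 2 && label.length <= 63 &&
      /^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$/.test(label);
  });
}

console.log(isValidEmailAddress('john@doe.example.com'));  // true
console.log(isValidEmailAddress('john..doe@example.com')); // false
console.log(isValidEmailAddress('john@-example.com'));     // false
```

Made-up top-level domains still pass this check; telling real ones apart would need a list of valid top-level domains.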

Convert Magic eDeveloper / Pervasive date: `Days since AD` in PHP

Friday, December 21st, 2007

A while ago I was working on a PHP web-application that uses a Pervasive SQL database which belonged to a Magic eDeveloper application. I encountered a really weird date format in the database which I couldn’t really place. I had a hunch that it might be the number of days since AD (counting 01-01-0001 as day 1). This turned out to be right. A friend and I devised a way to convert these dates to a unix timestamp in the following manner:

  • Determine the number of days from AD (counting 01-01-0001 as day 1) to the Unix epoch (01-01-1970). Outcome: 719163.
  • Subtract the amount determined above from the weird date.
  • Multiply the outcome by the amount of seconds in a day (86400).

Presto, there you have your unix timestamp.

The code in PHP:

// A simple way to convert Magic date to a unix timestamp and vice versa
define("AD_TO_UNIXEPOCH", 719163);
define("SECONDS_IN_DAY", 86400);
 
// Function that converts the days to unixtime
function days2unixtime($date) {
   return ($date - AD_TO_UNIXEPOCH) * SECONDS_IN_DAY;
}
 
// The other way around
function unixtime2days($date) {
   return AD_TO_UNIXEPOCH + (int) floor($date / SECONDS_IN_DAY); // floor: the time of day must not push the result to the next day
}
 
// Proof of concept
$magicdate = 732468;
$unixtime = days2unixtime($magicdate);
print gmdate("d-m-Y", $unixtime) . "\n"; // Prints 06-06-2006 in UTC, a hellish output ;)
print unixtime2days($unixtime); // And converted back

NB: For dates before the Unix epoch you will get a negative result.
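The same arithmetic can be sketched in JavaScript, which counts milliseconds rather than seconds since the Unix epoch. This is an illustration only; the function names `daysToDate` and `dateToDays` are made up for the example, and the constant is the one determined above.

```javascript
// Day number of 01-01-1970 in the database's day count (value from the post).
var AD_TO_UNIXEPOCH = 719163;
var MS_IN_DAY = 86400 * 1000;

// Day count -> JavaScript Date at midnight UTC.
function daysToDate(days) {
  return new Date((days - AD_TO_UNIXEPOCH) * MS_IN_DAY);
}

// JavaScript Date -> day count; the time of day is discarded.
function dateToDays(date) {
  return AD_TO_UNIXEPOCH + Math.floor(date.getTime() / MS_IN_DAY);
}

var date = daysToDate(732468);
console.log(date.toISOString().slice(0, 10)); // 2006-06-06
console.log(dateToDays(date));                // 732468
```

Working in UTC avoids the result shifting by a day depending on the time zone of the machine that runs the code.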

Hope this helps someone who’s encountered the same problem.
If you know a better solution please leave a comment or mail.