Javascript: Remove duplicates from Array

Removing duplicate entries or values from a Javascript array is something which isn’t accomodated for by the native functions in Javascript. I searched Google for a few solutions but they were all lacking something in my opinion, be it performance or just sheer elegance.

*UPDATE*

There is now an array prototype function for removing duplicates. I suggest you use the prototype function instead of the functions below. It is the way it should have been done in the first place and is faster and moreover the most correct way to do it.

Array.prototype.unique = function () {
	var r = new Array();
	o:for(var i = 0, n = this.length; i < n; i++)
	{
		for(var x = 0, y = r.length; x < y; x++)
		{
			if(r[x]==this[i])
			{
				continue o;
			}
		}
		r[r.length] = this[i];
	}
	return r;
}

You can now utilize the unique function like this:

var arr = [1,2,2,3,3,4,5,6,2,3,7,8,5,9];
var unique = arr.unique();
alert(unique);

The result will be [1,2,3,4,5,6,7,8,9].

*UPDATE*

I therefore made two variants myself:

The first one does what you’ll expect, it’ll regard the first encountered entry as the original and all subsequent entries as duplicates.

function unique(a)
{
   var r = new Array();
   o:for(var i = 0, n = a.length; i < n; i++)
   {
      for(var x = 0, y = r.length; x < y; x++)
      {
         if(r[x]==a[i]) continue o;
      }
      r[r.length] = a[i];
   }
   return r;
}

If we pass the following array [1, 2, 3, 1, 4, 5] to the function the result will be [1, 2, 3, 4, 5].

The second variant returns different results. It will regard the last encountered duplicate as the original. This may be desirable in certain situations.

function unique(a)
{
   var r = new Array();
   o:for(var i = 0, n = a.length; i < n; i++) {
      for(var x = i + 1 ; x < n; x++)
      {
         if(a[x]==a[i]) continue o;
      }
      r[r.length] = a[i];
   }
   return r;
}

The output in this case will be [2, 3, 1, 4, 5].

I wrote both these functions with performance in mind. I have thoroughly tested and profiled both of them. Here are the benefits over other solutions to the problem:

  • it’s just one function
  • the loop length is determined through the object model just once per loop (see the for statements: var i = 0, n = a.length;)
  • after a duplicate has been detected in the nested loop it doesn’t iterate further but goes on to the next entry in the source array (see continue statement)

You can easily adapt the function to make it a prototype function of an Array object. If people don’t know how to do it just ask, i’ll add it.

Hope this helps someone out!

Tags: , , , , ,

43 Responses to “Javascript: Remove duplicates from Array”

  1. Pradeep Tiwari wrote:

    Really too good

  2. qwerty wrote:

    Well done. Thanks.

  3. Tony Chung wrote:

    Your blog entry was first when I searched Google for any posts indexed during past year that contained “javascript strip duplicates”. Saved me another several hours of work on a tight deadline.

    Thanks!

  4. roman wrote:

    very good!

  5. Mason Houtz wrote:

    You know, adding public methods to the Array prototype is actually a very destructive behavior. It’s one of the reasons they had to completely rewrite the prototype library last year (or so). New functions added into the prototype will appear in for/in loops, potentially messing up otherwise functional code.

    But if you don’t mind the potential headaches, knock yourself out. :)

  6. hribek wrote:

    nice, too bad it is not in std lib
    1. doing each to each is O(N**2)
    2. if you sort it and then remove guys which are same as guy before them you got O(N*log(N) + N)
    3. you can use hash – O(N)
    /// create list
    var foo = “ahoj # ahoj # foo # foobar # fooba # bar # foo # kkt # foo “;
    var list = new Array();
    list = foo.split(“#”); // chunk it
    for (var i in list ) {
    document.write(“#” + list[i] + “#”);
    }
    /// /// get rid of spaces
    foo = foo.replace(/[ ]+/gi, “”); // get rid of spaces
    /// /// finally create list
    list = foo.split(“#”);
    /// output
    var out = new Array();

    ///
    var h = new Array();
    var times = 0;
    for (var i in list ) {
    /// first time set it
    if (‘undefined’ == typeof(h[list[i].toLowerCase()])) {
    h[list[i].toLowerCase()] = 1;
    out.push(list[i]);
    } else {
    /// the other time ignore
    /// /// print removed
    document.write(“* ” + list[i] + “”);
    ++ times;
    }
    }
    ///
    /// /// print result
    document.write(“times” + times + “”);
    for (var i in out ) {
    document.write(“#” + out[i] + “#”);
    }

  7. hribek wrote:

    sorry for that mess ;)
    uncoment /// for debug output

    function uniq(x) {
    var y = new Array();
    for(var i in x ) { ///= 0, n = a.length; i < n; i++)
    /// document.write(“item:” + x[i] + “”);
    if (‘undefined’ == typeof(y[x[i]])) {
    y[x[i]] = x[i];
    } else {
    /// document.write(“again:” + x[i] + “”);
    }
    }
    return y;
    }

    result MUST be accessed by “foreach” :
    for (var i in result) { }

  8. durgesh suman wrote:

    Thanks Much! Martin

    indeed the code is very code and important…

    But what “Mason Houtz” said is good to note as well

  9. Naresh N wrote:

    thanks alot good job

  10. Justin wrote:

    Thank you for sharing!
    @ Mason: I’m not well-versed in prototyping so I’m not sure how adding .unique would create later conflicts. Gives me something to read about. I avoided it on your word.

    The updated solutions work great! Thanks.

    FYI: this comment textarea is too wide for it’s container and some of the comment gets hidden on the right while you’re typing. (FF2, Vista)

  11. Tim wrote:

    Can this be made to leave the duplicates and fire off an alert that there are duplicates?

  12. Tim wrote:

    I found this on the net which does what I am looking for.

    function unique(a) {
    var i=a.length;
    var a2, o, j;
    while (i–) {
    a2 = a.slice(0,i).concat(a.slice(i+1,a.length));
    o = {};
    j = a2.length;
    while (j–) {
    o[a2[j]] = ”;
    }
    if (a[i] in o) {
    return true;
    }
    }
    return false;
    }

  13. Vishwajeet Sinfh wrote:

    I just copy pasted the function and it worked like charm.

    Thanks alot and keep the good work.

    Cheers,
    Vishwajeet Singh

  14. Mike Koss wrote:

    Another simple solution – removes elements in place. Since sort is n log(n) – this is faster than the n^2 solutions above.

    function DeDupArray(a)
    {
    a.sort();
    for (var i = 1; i < a.length; i++)
    {
    if (a[i-1] == a[i])
    a.splice(i, 1);
    }
    }

  15. Sam Skielnik wrote:

    Thanks for the posting. I just used the DeDupArray function in the prior reply and found a problem when the array has more than 2 duplicates. Here’s the fix.

    function DeDupArray(a) {
    a.sort();
    for (var i = 1; i < a.length; i++) {
    if (a[i-1] == a[i]) {
    a.splice(i, 1);
    i–; //..go back one to compare the next pair in the altered array
    }
    }
    } //..end function DeDupArray

    Thanks.

  16. Parag wrote:

    Hi thanks ,
    this is really good .

  17. milk wrote:

    Mike Koss’ post doesn’t quite work (due the the array being resized by the splice). It needs a minor adjustment:

    function DeDupArray(a)
    {
    a.sort();
    for (var i = 1; i < a.length; i++)
    {
    if (a[i-1] == a[i])
    {
    a.splice(i, 1);
    i–;
    }
    }
    }

    For Mike’s original algorithm, for the input array containing (“a”,”a”,”a”,”c”,”b”,”b”,”c”,”c”,”c”), it returns (“a”,”a”,”b”,”c”,”c”)

  18. milk wrote:

    The above post should have “i–;” beneath the call to splice

  19. milk wrote:

    still not working! – it should be a call to decrement i

  20. Dodgy wrote:

    Would it be possible to return an array of indexs to be removed? As my array contains other info that is not unique ie (“length=5.5″) so i would like to search a names-only array and return an array of indexs to remove from the actual array.

  21. Peter wrote:

    Using a hash will be much faster (O(n) = n) than this approach.

    Sth. like (untested)

    var input = new Array(1,2,3);
    var result = new Object();
    for (var i in input) {
    result[input[i]] = input[i];
    }

  22. Joe Giunta wrote:

    A little off topic, but can someone tell me what the “o:” is in front of the for loop.
    I’ve never seen that before.

    Great script by the way!

    Thanks,

  23. Shamasis Bhattacharya wrote:

    Hash method is only faster for large arrays > 100 items. See: http://www.shamasis.net/2009/09/fast-algorithm-to-find-unique-items-in-javascript-array/

  24. GOP wrote:

    function DeDupArray(a)
    {
    a.sort();
    for (var i = 1; i < a.length; i++)
    {
    if (a[i-1] == a[i])
    {
    a.splice(i, 1);
    –i;
    }
    }
    }

    that works.

  25. glenn jackman wrote:

    I’d write DeDupArray as:

    function DeDupArray(a)
    {
    a.sort();
    for (var i = 1; i < a.length; )
    {
    if (a[i-1] == a[i])
    a.splice(i, 1);
    else
    i++;
    }
    }

  26. Brad Czerniak wrote:

    Glenn Jackman’s solution is elegant. Just to bring it all together:

    function unique(a){
    a.sort();
    for(var i = 1;i < a.length;){
    if(a[i-1] == a[i]){a.splice(i, 1);}
    else{i++;}
    }
    return a;
    }

  27. Nick wrote:

    @Joe:

    “A little off topic, but can someone tell me what the “o:” is in front of the for loop.”

    It’s a label – the “o” can be any valid symbol like “outerLoop” or whatever you like. It’s useful where you have loops inside each other, as by default commands such as “break” and “continue” will only break from the innermost loop. By using “break o” you can break out of the outer loop.

  28. Quall wrote:

    var arrFiltered = arr.filter(function(item) {
    return (arr.indexOf(item) === arr.lastIndexOf(item))
    });

  29. Quall wrote:

    Sorry, this method returns items, appeared once. All duplicated items are ignored

  30. Alexsander wrote:

    What’s the purpose of ‘o’ in these functions? If you know the answer, please email me at http://tr.im/DOA1. Thanks so much, Alex.

  31. Tinu wrote:

    var arr = [1,1,1,1,1];
    var unique = arr.unique();
    alert(unique);

    output is [1,]…….plase check

  32. bagavathe wrote:

    Thanks a lot .

  33. Dmitry wrote:

    Thanks a lot .

  34. Shanti wrote:

    With JavaScript 1.7, you can do this

    Array.prototype.unique = function ()
    {
    var r = new Array();
    for(var i = 0, n = this.length; i < n; i++)
    {
    if (this.lastIndexOf(this[i]) == i) r.push(this[i]);
    }
    return r;
    }

  35. Martin wrote:

    Back to Mason Houtz way up in the response list: check out http://www.kenegozi.com/blog/2009/04/13/javascript-and-the-extended-array-prototype.aspx to prove you wrong at the destructive side of array prototyping. When functionalities collide, always go back to the original intention behind it :)
    If though approaching arrays in an associative manner is imperative in your system, prototyping array functionality will indeed prove destructive.

  36. claudio wrote:

    I really appreciate it. Over so many solutions your worked.

  37. vakata wrote:

    Just came up with this one-liner:

    a = a.sort().join(“,,”).replace(/(,|^)([^,]+)(,,\2)+(,|$)/g,”$1$2$4″).replace(/,,+/g,”,”).replace(/,$/,”").split(“,”);

    NOTE:
    1) Choose delimiter carefully (this example uses “,”)
    2) Returned array is sorted!
    3) Returned array is an array of strings

    Cheers,
    Ivan

  38. csodar wrote:

    A file of mine has a virus and i don’t want to delete it how can I erase the virus from existence and still keep my file? If you have the name of a comp program that can do it please do tell.

    _____________
    grow taller

  39. Mason Houtz wrote:

    @Martin

    I’m definitely not suggesting it’s a good idea to misuse an array by treating it like an associative hash. What I’m suggesting is that others have nonetheless already done so, and my point was that this kind of thing can break existing code.

    I know better, man. :)

  40. Jose wrote:

    @vakata Thanks you very much!

  41. getburl wrote:

    Here is a slightly different approach. Rather than grabbing on the matches, I grabbed where in the sequence when they do not match after sorting.

    var dedup = function(a)
    {
    a.sort(); //get the dups together
    var b = []; //create a replacement array
    for(var i=a.length;i–;) //loop down the original array
    {
    if(a[i]!==a[i-1]) //since they are bunched, only pull out the ones that don’t match
    {
    b[b.length] = a[i]; // give the replacement array the ones that dont match
    }
    }
    return b; // return the replacement arra
    }
    alert(‘Original Array:\’bob\’,\’sally\’,9,\’sally\’,\’george\’,\’george \’,\’andrew\’,23,230,\’23\’\n’ + ‘Deduped Array:’+ dedup(['bob','sally',9,'sally','george','george ','andrew',23,230,'23']));

  42. Rajini wrote:

    Fantastic! It saved my time a lot.
    Keep it up!

  43. Alex wrote:

    There is an alternative example here at:
    http://www.bytechaser.com/en/functions/xjfj6grjfr/create-a-distinct-list-from-a-javascript-array.aspx
    which works with text and numbers and allows you to specify if you want the process to be case sensitive.

Leave a Reply