Javascript: Remove duplicates from Array
Removing duplicate entries or values from a Javascript array is something which isn’t accomodated for by the native functions in Javascript. I searched Google for a few solutions but they were all lacking something in my opinion, be it performance or just sheer elegance.
*UPDATE*
There is now an array prototype function for removing duplicates. I suggest you use the prototype function instead of the functions below. It is the way it should have been done in the first place and is faster and moreover the most correct way to do it.
Array.prototype.unique = function () { var r = new Array(); o:for(var i = 0, n = this.length; i < n; i++) { for(var x = 0, y = r.length; x < y; x++) { if(r[x]==this[i]) { continue o; } } r[r.length] = this[i]; } return r; }
You can now utilize the unique function like this:
var arr = [1,2,2,3,3,4,5,6,2,3,7,8,5,9]; var unique = arr.unique(); alert(unique);
The result will be [1,2,3,4,5,6,7,8,9].
*UPDATE*
I therefore made two variants myself:
The first one does what you’ll expect, it’ll regard the first encountered entry as the original and all subsequent entries as duplicates.
function unique(a) { var r = new Array(); o:for(var i = 0, n = a.length; i < n; i++) { for(var x = 0, y = r.length; x < y; x++) { if(r[x]==a[i]) continue o; } r[r.length] = a[i]; } return r; }
If we pass the following array [1, 2, 3, 1, 4, 5] to the function the result will be [1, 2, 3, 4, 5].
The second variant returns different results. It will regard the last encountered duplicate as the original. This may be desirable in certain situations.
function unique(a) { var r = new Array(); o:for(var i = 0, n = a.length; i < n; i++) { for(var x = i + 1 ; x < n; x++) { if(a[x]==a[i]) continue o; } r[r.length] = a[i]; } return r; }
The output in this case will be [2, 3, 1, 4, 5].
I wrote both these functions with performance in mind. I have thoroughly tested and profiled both of them. Here are the benefits over other solutions to the problem:
- it’s just one function
- the loop length is determined through the object model just once per loop (see the for statements: var i = 0, n = a.length;)
- after a duplicate has been detected in the nested loop it doesn’t iterate further but goes on to the next entry in the source array (see continue statement)
You can easily adapt the function to make it a prototype function of an Array object. If people don’t know how to do it just ask, i’ll add it.
Hope this helps someone out!
Tags: array, duplicate, duplicates, entries, javascript, remove
Really too good
March 21st, 2008 at 10:34 pmWell done. Thanks.
June 20th, 2008 at 3:11 pmYour blog entry was first when I searched Google for any posts indexed during past year that contained “javascript strip duplicates”. Saved me another several hours of work on a tight deadline.
Thanks!
August 27th, 2008 at 5:39 amvery good!
December 20th, 2008 at 4:24 amYou know, adding public methods to the Array prototype is actually a very destructive behavior. It’s one of the reasons they had to completely rewrite the prototype library last year (or so). New functions added into the prototype will appear in for/in loops, potentially messing up otherwise functional code.
But if you don’t mind the potential headaches, knock yourself out.
December 24th, 2008 at 9:56 pmnice, too bad it is not in std lib
1. doing each to each is O(N**2)
2. if you sort it and then remove guys which are same as guy before them you got O(N*log(N) + N)
3. you can use hash – O(N)
/// create list
var foo = “ahoj # ahoj # foo # foobar # fooba # bar # foo # kkt # foo “;
var list = new Array();
list = foo.split(“#”); // chunk it
for (var i in list ) {
document.write(“#” + list[i] + “#”);
}
/// /// get rid of spaces
foo = foo.replace(/[ ]+/gi, “”); // get rid of spaces
/// /// finally create list
list = foo.split(“#”);
/// output
var out = new Array();
///
January 11th, 2009 at 1:48 pmvar h = new Array();
var times = 0;
for (var i in list ) {
/// first time set it
if (‘undefined’ == typeof(h[list[i].toLowerCase()])) {
h[list[i].toLowerCase()] = 1;
out.push(list[i]);
} else {
/// the other time ignore
/// /// print removed
document.write(“* ” + list[i] + “”);
++ times;
}
}
///
/// /// print result
document.write(“times” + times + “”);
for (var i in out ) {
document.write(“#” + out[i] + “#”);
}
sorry for that mess
uncoment /// for debug output
function uniq(x) {
var y = new Array();
for(var i in x ) { ///= 0, n = a.length; i < n; i++)
/// document.write(“item:” + x[i] + “”);
if (‘undefined’ == typeof(y[x[i]])) {
y[x[i]] = x[i];
} else {
/// document.write(“again:” + x[i] + “”);
}
}
return y;
}
result MUST be accessed by “foreach” :
January 11th, 2009 at 2:02 pmfor (var i in result) { }
Thanks Much! Martin
indeed the code is very code and important…
But what “Mason Houtz” said is good to note as well
February 20th, 2009 at 1:43 pmthanks alot good job
March 24th, 2009 at 6:02 amThank you for sharing!
@ Mason: I’m not well-versed in prototyping so I’m not sure how adding .unique would create later conflicts. Gives me something to read about. I avoided it on your word.
The updated solutions work great! Thanks.
FYI: this comment textarea is too wide for it’s container and some of the comment gets hidden on the right while you’re typing. (FF2, Vista)
March 27th, 2009 at 7:50 pmCan this be made to leave the duplicates and fire off an alert that there are duplicates?
April 10th, 2009 at 12:29 pmI found this on the net which does what I am looking for.
function unique(a) {
April 10th, 2009 at 6:26 pmvar i=a.length;
var a2, o, j;
while (i–) {
a2 = a.slice(0,i).concat(a.slice(i+1,a.length));
o = {};
j = a2.length;
while (j–) {
o[a2[j]] = ”;
}
if (a[i] in o) {
return true;
}
}
return false;
}
I just copy pasted the function and it worked like charm.
Thanks alot and keep the good work.
Cheers,
May 14th, 2009 at 11:42 amVishwajeet Singh
Another simple solution – removes elements in place. Since sort is n log(n) – this is faster than the n^2 solutions above.
function DeDupArray(a)
May 23rd, 2009 at 11:42 am{
a.sort();
for (var i = 1; i < a.length; i++)
{
if (a[i-1] == a[i])
a.splice(i, 1);
}
}
Thanks for the posting. I just used the DeDupArray function in the prior reply and found a problem when the array has more than 2 duplicates. Here’s the fix.
function DeDupArray(a) {
a.sort();
for (var i = 1; i < a.length; i++) {
if (a[i-1] == a[i]) {
a.splice(i, 1);
i–; //..go back one to compare the next pair in the altered array
}
}
} //..end function DeDupArray
Thanks.
May 28th, 2009 at 2:12 amHi thanks ,
May 29th, 2009 at 7:32 amthis is really good .
Mike Koss’ post doesn’t quite work (due the the array being resized by the splice). It needs a minor adjustment:
function DeDupArray(a)
{
a.sort();
for (var i = 1; i < a.length; i++)
{
if (a[i-1] == a[i])
{
a.splice(i, 1);
i–;
}
}
}
For Mike’s original algorithm, for the input array containing (“a”,”a”,”a”,”c”,”b”,”b”,”c”,”c”,”c”), it returns (“a”,”a”,”b”,”c”,”c”)
June 12th, 2009 at 11:59 amThe above post should have “i–;” beneath the call to splice
June 12th, 2009 at 12:00 pmstill not working! – it should be a call to decrement i
June 12th, 2009 at 12:01 pmWould it be possible to return an array of indexs to be removed? As my array contains other info that is not unique ie (“length=5.5″) so i would like to search a names-only array and return an array of indexs to remove from the actual array.
June 24th, 2009 at 1:53 amUsing a hash will be much faster (O(n) = n) than this approach.
Sth. like (untested)
var input = new Array(1,2,3);
June 26th, 2009 at 1:20 pmvar result = new Object();
for (var i in input) {
result[input[i]] = input[i];
}
A little off topic, but can someone tell me what the “o:” is in front of the for loop.
I’ve never seen that before.
Great script by the way!
Thanks,
August 24th, 2009 at 8:16 pmHash method is only faster for large arrays > 100 items. See: http://www.shamasis.net/2009/09/fast-algorithm-to-find-unique-items-in-javascript-array/
September 6th, 2009 at 7:24 pmfunction DeDupArray(a)
{
a.sort();
for (var i = 1; i < a.length; i++)
{
if (a[i-1] == a[i])
{
a.splice(i, 1);
–i;
}
}
}
that works.
October 6th, 2009 at 2:21 pmI’d write DeDupArray as:
function DeDupArray(a)
October 6th, 2009 at 7:13 pm{
a.sort();
for (var i = 1; i < a.length; )
{
if (a[i-1] == a[i])
a.splice(i, 1);
else
i++;
}
}
Glenn Jackman’s solution is elegant. Just to bring it all together:
function unique(a){
October 14th, 2009 at 8:03 pma.sort();
for(var i = 1;i < a.length;){
if(a[i-1] == a[i]){a.splice(i, 1);}
else{i++;}
}
return a;
}
@Joe:
“A little off topic, but can someone tell me what the “o:” is in front of the for loop.”
It’s a label – the “o” can be any valid symbol like “outerLoop” or whatever you like. It’s useful where you have loops inside each other, as by default commands such as “break” and “continue” will only break from the innermost loop. By using “break o” you can break out of the outer loop.
October 18th, 2009 at 12:27 pmvar arrFiltered = arr.filter(function(item) {
October 27th, 2009 at 10:25 amreturn (arr.indexOf(item) === arr.lastIndexOf(item))
});
Sorry, this method returns items, appeared once. All duplicated items are ignored
October 27th, 2009 at 11:12 amWhat’s the purpose of ‘o’ in these functions? If you know the answer, please email me at http://tr.im/DOA1. Thanks so much, Alex.
November 1st, 2009 at 5:34 pmvar arr = [1,1,1,1,1];
var unique = arr.unique();
alert(unique);
output is [1,]…….plase check
November 4th, 2009 at 11:18 amThanks a lot .
December 10th, 2009 at 5:29 amThanks a lot .
January 8th, 2010 at 1:01 amWith JavaScript 1.7, you can do this
Array.prototype.unique = function ()
January 13th, 2010 at 11:52 pm{
var r = new Array();
for(var i = 0, n = this.length; i < n; i++)
{
if (this.lastIndexOf(this[i]) == i) r.push(this[i]);
}
return r;
}
Back to Mason Houtz way up in the response list: check out http://www.kenegozi.com/blog/2009/04/13/javascript-and-the-extended-array-prototype.aspx to prove you wrong at the destructive side of array prototyping. When functionalities collide, always go back to the original intention behind it
February 9th, 2010 at 3:09 amIf though approaching arrays in an associative manner is imperative in your system, prototyping array functionality will indeed prove destructive.
I really appreciate it. Over so many solutions your worked.
February 27th, 2010 at 1:58 amJust came up with this one-liner:
a = a.sort().join(“,,”).replace(/(,|^)([^,]+)(,,\2)+(,|$)/g,”$1$2$4″).replace(/,,+/g,”,”).replace(/,$/,”").split(“,”);
NOTE:
1) Choose delimiter carefully (this example uses “,”)
2) Returned array is sorted!
3) Returned array is an array of strings
Cheers,
March 9th, 2010 at 2:28 pmIvan
A file of mine has a virus and i don’t want to delete it how can I erase the virus from existence and still keep my file? If you have the name of a comp program that can do it please do tell.
_____________
April 20th, 2010 at 8:50 pmgrow taller
@Martin
I’m definitely not suggesting it’s a good idea to misuse an array by treating it like an associative hash. What I’m suggesting is that others have nonetheless already done so, and my point was that this kind of thing can break existing code.
I know better, man.
May 27th, 2010 at 2:37 am@vakata Thanks you very much!
June 10th, 2010 at 9:42 amHere is a slightly different approach. Rather than grabbing on the matches, I grabbed where in the sequence when they do not match after sorting.
var dedup = function(a)
June 17th, 2010 at 4:52 pm{
a.sort(); //get the dups together
var b = []; //create a replacement array
for(var i=a.length;i–;) //loop down the original array
{
if(a[i]!==a[i-1]) //since they are bunched, only pull out the ones that don’t match
{
b[b.length] = a[i]; // give the replacement array the ones that dont match
}
}
return b; // return the replacement arra
}
alert(‘Original Array:\’bob\’,\’sally\’,9,\’sally\’,\’george\’,\’george \’,\’andrew\’,23,230,\’23\’\n’ + ‘Deduped Array:’+ dedup(['bob','sally',9,'sally','george','george ','andrew',23,230,'23']));
Fantastic! It saved my time a lot.
July 12th, 2010 at 10:02 pmKeep it up!
There is an alternative example here at:
July 21st, 2010 at 2:12 pmhttp://www.bytechaser.com/en/functions/xjfj6grjfr/create-a-distinct-list-from-a-javascript-array.aspx
which works with text and numbers and allows you to specify if you want the process to be case sensitive.