Duke CPS 296.2 - Optimal Algorithms for Approximate Clustering

This preview shows pages 1-4 of 11.


Abstract

In a clustering problem, the aim is to partition a given set of n points in d-dimensional space into k groups, called clusters, so that points within each cluster are near each other. Two objective functions frequently used to measure the performance of a clustering algorithm are, for any L_q metric, (a) the maximum distance between pairs of points in the same cluster, and (b) the maximum distance between points in each cluster and a chosen cluster center; we refer to either measure as the cluster size. We show that one cannot approximate the optimal cluster size for a fixed number of clusters within a factor close to 2 in polynomial time, for two or more dimensions, unless P=NP. We also present an algorithm that achieves this factor of 2 in time O(n log k), and show that this running time is optimal in the algebraic decision tree model. For a fixed cluster size, on the other hand, we give a polynomial time approximation scheme that estimates the optimal number of clusters under the second measure of cluster size within factors arbitrarily close to 1. Our approach is extended to provide approximation algorithms for the restricted centers, suppliers, and weighted suppliers problems that run in optimal O(n log k) time and achieve optimal or nearly optimal approximation bounds.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

1. Introduction

Clustering problems arise in a great variety of contexts: data compression, pattern recognition, concept learning, point sampling, service site assignment, and windowing. Earlier work has shown that clustering is computationally hard in its optimal form, but in practice suboptimal solutions are very useful, and a variety of good heuristics are in widespread use (e.g. [G&I], [Eq87]). The purpose of the present work is to explore suboptimal solutions, in particular, to find solutions that are guaranteed to be within a factor of the optimum, either in the number of clusters used, or in the size of clusters used.

Let S be a set of n points in d-dimensional space. A partition of S into k sets, C_1, C_2, ..., C_k, is called a k-clustering, and the individual C_i are called clusters. The cluster size of a k-clustering is the least value D for which all the points in each cluster C_i are either (a) within distance D of each other, or (b) within distance D/2 of some cluster center, which can be any point in space, where distances are measured with any L_q metric. We refer to these two alternative definitions as pairwise cluster size and central cluster size, respectively. The pairwise clustering problem
(central clustering problem) is to find, for a given set S and integer k, a k-clustering for S with pairwise cluster size (central cluster size) D as small as possible. (We will also consider variations where D is given and the smallest k is sought.) Notice that in the plane, with L_2 and L_∞ (or L_1) distances, the central clustering problem is the problem of covering the points in S with k circles or squares, respectively, of the smallest size.

In one dimension, pairwise and central clustering can be solved in polynomial time by dynamic programming ([Br77]); we can achieve O(n log n) running times. For two or more dimensions, the problem is known to be NP-complete (see Fowler, Paterson and ...

© 1988 ACM 0-89791-264-0/88/0005/0434 $1.50

We give a simpler reduction for these problems from planar vertex cover, which allows us to obtain better bounds for α. We show that the clustering problem remains NP-hard if α < 1.822 for central L_2 clustering, α < 1.969 for pairwise L_2 clustering, and α < 2 for both pairwise and central L_1 and L_∞ clustering. Bounds of the form α < 2 were previously known for a nongeometric version of the pairwise clustering problem, where the only constraint is that distances satisfy the triangle inequality [HS86], and thus without the geometric restrictions imposed by L_2 and L_∞ distances in the plane. For the geometric restrictions we consider, Mentzer has independently obtained the central clustering versions of the above bounds [Me88].

An approximation factor of 2 has in fact been achieved.
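To make the two cluster-size measures (a) and (b) concrete, the following Python sketch evaluates both for a given clustering and builds a clustering with the classical farthest-point heuristic. This is an illustration under stated assumptions, not the O(n log k) algorithm the paper announces: the heuristic shown uses O(nk) distance evaluations, restricts centers to input points, and is known to stay within a factor of 2 of the optimal radius in any metric. All function names (`dist`, `pairwise_size`, `central_size`, `farthest_point_clustering`) are hypothetical, and `central_size` measures size only for the centers it is handed, whereas the definition in the text also minimizes over all possible center positions.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dist(p: Point, r: Point, q: float = 2.0) -> float:
    """L_q distance between two points; q = float('inf') gives L_infinity."""
    diffs = [abs(a - b) for a, b in zip(p, r)]
    if q == float("inf"):
        return max(diffs)
    return sum(d ** q for d in diffs) ** (1.0 / q)


def pairwise_size(clusters: Sequence[Sequence[Point]], q: float = 2.0) -> float:
    """Measure (a): largest distance between two points sharing a cluster."""
    return max(
        (dist(a, b, q) for c in clusters for a, b in combinations(c, 2)),
        default=0.0,
    )


def central_size(clusters: Sequence[Sequence[Point]],
                 centers: Sequence[Point], q: float = 2.0) -> float:
    """Measure (b) for the GIVEN centers: twice the largest point-to-center
    distance. The true central size would also optimize the centers."""
    return 2.0 * max(
        (dist(p, ctr, q) for c, ctr in zip(clusters, centers) for p in c),
        default=0.0,
    )


def farthest_point_clustering(points: List[Point], k: int, q: float = 2.0):
    """Farthest-point heuristic (assumed stand-in, not the paper's method):
    repeatedly promote the point farthest from the current centers."""
    centers = [points[0]]
    nearest = [0] * len(points)  # index of the closest center found so far
    gap = [dist(p, centers[0], q) for p in points]  # distance to that center
    while len(centers) < min(k, len(points)):
        i = max(range(len(points)), key=gap.__getitem__)
        centers.append(points[i])
        for j, p in enumerate(points):
            d = dist(p, points[i], q)
            if d < gap[j]:
                gap[j], nearest[j] = d, len(centers) - 1
    clusters: List[List[Point]] = [[] for _ in centers]
    for p, c in zip(points, nearest):
        clusters[c].append(p)
    return clusters, centers
```

For example, calling `farthest_point_clustering` on the four collinear points (0, 0), (1, 0), (10, 0), (11, 0) with k = 2 separates the two left points from the two right points; `pairwise_size` and `central_size` can then be applied to the returned clusters and centers to compare the two measures on the same partition.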
Hochbaum and Shmoys ([HS86], [HS85]) gave a graph algorithm for

