Data Mining Notes - Clustering - Canopy - Principles and a Simple Implementation
2017-12-10

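Canopy clustering is a simple, fast, and fairly accurate method for grouping objects into clusters. Each object is represented as a point in a multidimensional feature space. The algorithm uses a cheap approximate distance measure and two distance thresholds, T1 > T2. The basic algorithm starts with a set of points, removes one at random, creates a canopy containing that point, and then iterates over the remaining points. For each point, if its distance from the first point is less than T1, the point is added to the canopy. If, in addition, the distance is less than T2, the point is removed from the set. In this way, points that are very close to the center skip all further processing and can never become the center of another canopy. The algorithm loops until the initial set is empty, producing a set of canopies, each containing one or more points. A point may belong to more than one canopy.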

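The Canopy algorithm can be used for clustering on its own, but its results are most valuable as input to a more expensive clustering method, so it is more useful for data preprocessing than as a standalone clusterer. Canopy clustering is often used as the initial step of a more rigorous clustering technique such as K-means. Once the canopies are built, those containing only a few data points can be discarded, since they tend to contain outliers.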
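The pruning step described above could look roughly like the following. This is a minimal sketch and not part of the original code: the class name `CanopyFilterSketch`, the method `dropSmallCanopies`, and the `minSize` cutoff are all hypothetical, and each canopy is represented simply as a list of `double[]` points.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: discards canopies that hold too few points.
class CanopyFilterSketch {

    // Returns a new list containing only the canopies with at least minSize points.
    static List<List<double[]>> dropSmallCanopies(List<List<double[]>> canopies, int minSize) {
        List<List<double[]>> kept = new ArrayList<>();
        for (List<double[]> canopy : canopies) {
            if (canopy.size() >= minSize) {
                kept.add(canopy);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<double[]>> canopies = new ArrayList<>();
        // A canopy with three nearby points
        canopies.add(Arrays.asList(
                new double[] {8.1, 8.1}, new double[] {7.1, 7.1}, new double[] {6.2, 6.2}));
        // A canopy holding a single isolated point (made-up coordinates)
        canopies.add(Arrays.asList(new double[] {50.0, 50.0}));

        List<List<double[]>> kept = dropSmallCanopies(canopies, 2);
        System.out.println("canopies kept: " + kept.size());
    }
}
```

The right value for `minSize` depends on the data set; the value 2 here is only for illustration.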

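The Canopy algorithm proceeds as follows:

(1) Put all the data into a list and choose two distance thresholds T1 and T2, with T1 > T2.

(2) While (list is not empty) {

Pick a random point as the center of a new canopy and remove it from the list;

Traverse the list:

For each record, compute its distance to every canopy center;

If the distance is < T2, give the record a strong mark and remove it from the list;

If the distance is < T1, give the record a weak mark;

If its distance to every canopy center is > T1, make the record the center of a new canopy and remove it from the list;

}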

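The parameters need careful tuning:
When T1 is too large, many points belong to several canopies, which may leave the cluster centers close together and the clusters poorly separated;
When T2 is too large, more points receive the strong mark and the number of clusters decreases; when T2 is too small, the number of clusters increases, and so does the computation time.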
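To make the effect of the two thresholds concrete, here is a small sketch that counts canopies and total memberships for a few settings. It follows the basic algorithm from the introduction (take a point as center, add everything closer than T1, remove everything closer than T2), uses the Manhattan distance, and always takes the first remaining point instead of a random one so that the result is reproducible. The class name `CanopyThresholdSketch` and the method `summarize` are hypothetical; the data are the eight two-dimensional points used in this article's Java example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for exploring how T1 and T2 affect the result.
class CanopyThresholdSketch {

    // The eight sample points from the article's implementation
    static final double[][] SAMPLE = {
        {8.1, 8.1}, {7.1, 7.1}, {6.2, 6.2}, {7.1, 7.1},
        {2.1, 2.1}, {1.1, 1.1}, {0.1, 0.1}, {3.0, 3.0}
    };

    static double manhattan(double[] a, double[] b) {
        return Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]);
    }

    // Returns {number of canopies, total memberships summed over all canopies}.
    static int[] summarize(double[][] data, double t1, double t2) {
        List<double[]> remaining = new ArrayList<>();
        for (double[] p : data) {
            remaining.add(p);
        }
        int canopies = 0;
        int memberships = 0;
        while (!remaining.isEmpty()) {
            // Assumption: take the first remaining point rather than a random one
            double[] center = remaining.remove(0);
            canopies++;
            memberships++; // the center belongs to its own canopy
            Iterator<double[]> it = remaining.iterator();
            while (it.hasNext()) {
                double d = manhattan(it.next(), center);
                if (d < t1) {
                    memberships++; // weak membership
                }
                if (d < t2) {
                    it.remove(); // strong membership: no further processing
                }
            }
        }
        return new int[] {canopies, memberships};
    }

    public static void main(String[] args) {
        double[][] settings = {{8, 5}, {8, 1.5}, {20, 5}};
        for (double[] s : settings) {
            int[] r = summarize(SAMPLE, s[0], s[1]);
            System.out.println("T1=" + s[0] + " T2=" + s[1]
                    + " canopies=" + r[0] + " memberships=" + r[1]);
        }
    }
}
```

On these eight points, lowering T2 from 5 to 1.5 (with T1 = 8) raises the canopy count from 2 to 7. Raising T1 from 8 to 20 (with T2 = 5) leaves the count at 2 but increases the total memberships from 8 to 12, which means more points are shared between canopies.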

The algorithm is implemented below in Java. To keep things simple, the points are two-dimensional.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CanopyBuilder {
        private double T1 = 8;  
        private double T2 = 4;  
        private List<Point> points = null;  
        private List<Canopy> canopies = null;  
        public CanopyBuilder() {  
            init();  
        }  
        public void init() {  
            points = new ArrayList<Point>();  
            points.add(new Point(8.1, 8.1));  
            points.add(new Point(7.1, 7.1));  
            points.add(new Point(6.2, 6.2));  
            points.add(new Point(7.1, 7.1));  
            points.add(new Point(2.1, 2.1));  
            points.add(new Point(1.1, 1.1));  
            points.add(new Point(0.1, 0.1));  
            points.add(new Point(3.0, 3.0));  
            canopies = new ArrayList<Canopy>();  
        }  
          
        // Compute the Manhattan distance between two points
        public double manhattanDistance(Point a, Point b) {  
            return Math.abs(a.getX() - b.getX()) + Math.abs(a.getY() - b.getY());  
        }  
          
        // Compute the Euclidean distance between two points
        public double euclideanDistance(Point a, Point b) {  
            double sum =  Math.pow(a.getX() - b.getX(), 2) + Math.pow(a.getY() - b.getY(), 2);  
            return Math.sqrt(sum);  
        }  
      
        public void run() {
            while (!points.isEmpty()) {
                // Start each pass by taking a point as the center of a new canopy
                // (the first remaining point is used here rather than a random one).
                // Every pass therefore removes at least one point, so the loop terminates.
                Point seed = points.remove(0);
                Canopy seedCanopy = new Canopy();
                seedCanopy.setCenter(seed);
                seedCanopy.getPoints().add(seed);
                canopies.add(seedCanopy);
                Iterator<Point> iterator = points.iterator();
                while (iterator.hasNext()) {
                    Point current = iterator.next();
                    System.out.println("current point: " + current);
                    boolean isRemove = false;
                    // Number of canopies whose center is at least T1 away
                    int index = 0;
                    for (Canopy canopy : canopies) {
                        Point center = canopy.getCenter();
                        System.out.println("center: " + center);
                        double d = manhattanDistance(current, center);
                        System.out.println("distance: " + d);
                        // Closer than T1: add to the canopy with a weak mark,
                        // unless an earlier pass has already added this point
                        if (d < T1) {
                            if (!canopy.getPoints().contains(current)) {
                                current.setMark(Point.MARK_WEAK);
                                canopy.getPoints().add(current);
                            }
                        } else {
                            index++;
                        }
                        // Within T2: give a strong mark and remove from the list
                        if (d <= T2) {
                            current.setMark(Point.MARK_STRONG);
                            isRemove = true;
                        }
                    }
                    // At least T1 away from every canopy center: start a new canopy
                    if (index == canopies.size()) {
                        Canopy newCanopy = new Canopy();
                        newCanopy.setCenter(current);
                        newCanopy.getPoints().add(current);
                        canopies.add(newCanopy);
                        isRemove = true;
                    }
                    if (isRemove) {
                        iterator.remove();
                    }
                }
            }
            // Recompute each canopy's center as the mean of its points and print the result
            for (Canopy c : canopies) {
                System.out.println("old center: " + c.getCenter());
                c.computeCenter();
                System.out.println("new center: " + c.getCenter());
                ShowUtils.print(c.getPoints());
            }
        }
      
        public static void main(String[] args) {  
            CanopyBuilder builder = new CanopyBuilder();  
            builder.run();  
        }  
      
    }  

The Canopy class


    import java.util.ArrayList;
    import java.util.List;

    public class Canopy {
        private Point center = null;  
        private List<Point> points = null;  
        public Point getCenter() {  
            return center;  
        }  
        public void setCenter(Point center) {  
            this.center = center;  
        }  
        public List<Point> getPoints() {  
            if (null == points) {  
                points = new ArrayList<Point>();  
            }  
            return points;  
        }  
        public void setPoints(List<Point> points) {  
            this.points = points;  
        }  
          
        public void computeCenter() {  
            double x = 0.0;  
            double y = 0.0;  
            for (Point point : getPoints()) {  
                x += point.getX();  
                y += point.getY();  
            }  
            double z = getPoints().size();  
            setCenter(new Point(x / z, y / z));  
        }  
    }
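
The Point class

The listings above use a `Point` class that the article does not show. The following is a guess at a minimal version that would satisfy the calls made in `CanopyBuilder` and `Canopy` (a two-argument constructor, `getX`, `getY`, `setMark`, and the constants `MARK_WEAK` and `MARK_STRONG`). The constant values, the `getMark` accessor, and the `toString` format are assumptions, not the author's code.

```java
// Assumed reconstruction of the Point class used above; not the author's original.
// Declare it public in its own Point.java if it needs to be visible outside the package.
class Point {
    // Mark values are assumptions; the article only shows the constant names
    public static final int MARK_NONE = 0;
    public static final int MARK_WEAK = 1;
    public static final int MARK_STRONG = 2;

    private final double x;
    private final double y;
    private int mark = MARK_NONE;

    public Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    public double getX() {
        return x;
    }

    public double getY() {
        return y;
    }

    public int getMark() {
        return mark;
    }

    public void setMark(int mark) {
        this.mark = mark;
    }

    // Lets the println calls in CanopyBuilder show the coordinates
    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}
```

`ShowUtils` is likewise not shown in the article; replacing the `ShowUtils.print(c.getPoints())` call with a plain loop that prints each point would presumably serve the same purpose.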

