Attribute VB_Name = "Module6" '''''''''''''''''''''''''''''''''''''''''''''''''''''''' ' The dimensions of the matrix are checked ' Here '''''''''''''''''''''''''''''''''''''''''''''''''''''''' Private Function Find_R_C(Mat() As Double) As Double() Dim Rows As Integer, columns As Integer Dim i As Integer, j As Integer Dim Result() As Double columns = 0 If Mat_1D(Mat, Rows) Then ReDim Result(Rows, 1) Result(0, 0) = Rows Result(0, 1) = columns + 1 For i = 1 To Rows Result(i, 1) = Mat(i - 1) Next i Else Call Mat_2D(Mat, Rows, columns) ReDim Result(Rows, columns) Result(0, 0) = Rows Result(0, 1) = columns For i = 1 To Rows For j = 1 To columns '- 1 Result(i, j) = Mat(i - 1, j - 1) Next j Next i End If Find_R_C = Result End Function '''''''''''''''''''''''''''''''''''''''''''''''''''''''' ' Check if matrix has only one column ' shift the matrix one level and keep ' its dimensions details in Mat(0,0) and Mat(0,1) ' Mat(0,0)= no of rows ' Mat(0,1)= no of columns '''''''''''''''''''''''''''''''''''''''''''''''''''''''' Private Function Mat_1D(Mat() As Double, m As Integer) As Boolean Dim Temp_MAT As Double On Error GoTo Error_Handler Temp_MAT = Mat(0, 0) Mat_1D = False Exit Function Error_Handler: Mat_1D = True m = UBound(Mat) + 1 End Function '''''''''''''''''''''''''''''''''''''''''''''''''''''''' ' Check if matrix has more than one column ' if so return the dimension as described above '''''''''''''''''''''''''''''''''''''''''''''''''''''''' Private Sub Mat_2D(Mat() As Double, m As Integer, N As Integer) Dim Temp_MAT As Double, i As Integer i = 0 m = UBound(Mat) + 1 On Error GoTo Error_Handler Do Until i < -1 Temp_MAT = Mat(0, i) i = i + 1 Loop Error_Handler: N = i End Sub Public Sub MatrixTransp(A() As Double, AT() As Double) Rem This routine transposes matrix A to matrix AT. Dim Arows, Acols As Double Arows = UBound(A, 1) Acols = UBound(A, 2) ReDim AT(1 To Acols, 1 To Arows) For i = 1 To Arows For j = 1 To Acols AT(j, i) = A(i, j) Next j Next i End Sub Public Sub VectorMult(A() As Double, B() As Double, AB() As Double) Rem This routine multiplicates vectors A (vertical) and B (horisontal) and creates matrix AB. Dim Arows, Bcols As Double Arows = UBound(A) Bcols = UBound(B) If Acols <> Brows Then MsgBox ("The number of A's columns differs from the number of B's rows") End If ReDim AB(1 To Arows, 1 To Bcols) For r = 1 To Arows For C = 1 To Bcols AB(r, C) = A(r) * B(C) Next C Next r End Sub ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' ' Determinant of a matrix should be (nxn) ' Function returns the solution or errors due to ' dimensions incompatibility ' Example: ' Check Main Form !! ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' Public Function det(Mat() As Double) As Double Dim DArray() As Double, s As Integer Dim k As Integer, i As Integer, j As Integer Dim save As Double, ArrayK As Double, k1 As Integer Dim m1 As String, Mat1() As Double On Error GoTo Error_Handler Mat1 = Find_R_C(Mat) If Mat1(0, 0) <> Mat1(0, 1) Then GoTo Error_Dimension s = Mat1(0, 0) det = 1 DArray = Mat1() For k = 1 To s If DArray(k, k) = 0 Then j = k Do While ((j < s) And (DArray(k, j) = 0)) j = j + 1 Loop If DArray(k, j) = 0 Then det = 0 Exit Function Else For i = k To s save = DArray(i, j) DArray(i, j) = DArray(i, k) DArray(i, k) = save Next i End If det = -det End If ArrayK = DArray(k, k) det = det * ArrayK If k < s Then k1 = k + 1 For i = k1 To s For j = k1 To s DArray(i, j) = DArray(i, j) - DArray(i, k) * (DArray(k, j) / ArrayK) Next j Next i End If Next Exit Function Error_Dimension: Err.Raise "5011", , "Matrix should be a square matrix !" Error_Handler: If Err.Number = 5011 Then Err.Raise "5011", , "Matrix should be a square matrix !" Else Err.Raise "5022", , "In order to do this operation values must be assigned to the matrix !!" End If End Function Public Sub VectorMatrixMult(A() As Double, B() As Double, AB() As Double) Rem This routine multiplicates matrices A and B and creates matrix AB. Dim Arows, Acols, Brows, Bcols, ABrows, ABcols As Double Dim CellSum As Double 'Arows = UBound(A) Acols = UBound(A) Brows = UBound(B, 1) Bcols = UBound(B, 2) If Acols <> Brows Then MsgBox ("The number of A's columns differ from the number of B's rows") End If ReDim AB(1 To Bcols) For C = 1 To Bcols CellSum = 0 For i = 1 To Acols CellSum = CellSum + A(i) * B(i, C) Next i AB(C) = CellSum Next C End Sub Public Sub MatrixInverse(m() As Double, returnvalue() As Double) Rem This routine calculates inverse matrix (returnvalue) from matrix (m). Dim i As Double, j As Double, k As Double, row As Double, Pivot As Double row = UBound(m, 1) ReDim returnvalue(1 To row, 1 To row) For i = 1 To row returnvalue(i, i) = 1 Next For i = 1 To row Pivot = m(i, i) If Pivot <> 1 Then For k = 1 To row m(i, k) = m(i, k) / Pivot returnvalue(i, k) = returnvalue(i, k) / Pivot Next End If For j = 1 To row Pivot = m(j, i) If i <> j And Pivot <> 0 Then For k = 1 To row m(j, k) = m(j, k) - m(i, k) * Pivot returnvalue(j, k) = returnvalue(j, k) - returnvalue(i, k) * Pivot Next End If Next Next End Sub Public Sub LDATraining(A() As Double, FolderName As String) Rem This routine creates classification rules with linear discriminant analysis (LDA) Rem from given classified training set. Rem A()-matrix is observation matrix with n rows (number of observations) Rem and m columns (first column includes classes and the rest Rem columns features e.g. pixel values from different channels. Rem FolderName is location of the destination folder where results are sited. Rem ValidationProportion is the percent of observations shifted to validation set Dim NumOfObs As Double 'Number of observations Dim NumOfClasses As Double 'Number of classes Dim NumOfFeatures As Double 'Number of features Dim ValCount As Double 'Counter for mean calculations Dim ValSum As Double 'Sum for mean calculations Dim Class As Integer Dim NewClass As Boolean 'Logical operator that defines if observation belongs to new class Dim textstring As String 'This is for printing Dim Folder As String 'Destination folder's path Dim ClassList() As Double 'List of class names and their continuous indices (indices first) Dim mx() As Double 'Observation matrix Dim mM() As Double 'Mean vectors Dim DelCCount As Double 'Count of removed classes Dim X() As Double 'Next ones are for calculations Dim xT() As Double Dim xxT() As Double Dim m() As Double Dim mT() As Double Dim mmT() As Double Dim Determinant() As Double Dim mC() As Double 'Variance-covariance matrices Dim B() As Double 'Data without dropped observations and class information Dim ObsNumber As Double Dim ErrorMatrix() As Long NumOfObs = UBound(A, 1) NumOfFeatures = UBound(A, 2) - 1 Folder = FolderName Rem Printing of the full observation matrix for k-NN estimation. Rem First value is the number of original observations. Open Folder & "Obs.txt" For Output As 1 Print #1, NumOfObs For i = 1 To NumOfObs textstring = "" For j = 1 To NumOfFeatures textstring = textstring & A(i, j) & "," Next j textstring = textstring & A(i, NumOfFeatures + 1) Print #1, textstring Next i Close (1) Rem First we count the number of classes and number of observations in each class ReDim ADel(1 To NumOfObs, 1 To NumOfFeatures + 1) As Double ReDim DelList(1 To 100, 1 To 3) As Double For i = 1 To NumOfObs NewClass = True 'default For j = 0 To NumOfClasses If A(i, 1) = DelList(j + 1, 2) Then NewClass = False 'if this class already exists End If Next j If NewClass = True Then 'if we have new class DelList(NumOfClasses + 1, 1) = NumOfClasses + 1 'update of continuous class number DelList(NumOfClasses + 1, 2) = A(i, 1) 'update of real class number NumOfClasses = NumOfClasses + 1 'update of the number of classes DelList(NumOfClasses, 3) = 1 'count of observations in this class Else: For l = 1 To NumOfClasses If A(i, 1) = DelList(l, 2) Then DelList(l, 3) = DelList(l, 3) + 1 'count of observations in this class End If Next l End If Next i Rem Here we count how many classes and observations we have left Dim ClassNum As Double ClassNum = 0 Dim ObsNum As Double ObsNum = 0 For i = 1 To NumOfClasses If DelList(i, 3) > NumOfFeatures Then 'if we have enough observations ClassNum = ClassNum + 1 'update of class number ObsNum = ObsNum + DelList(i, 3) 'update of the number of observations End If Next i Rem Now delete observations that belong to small classes Dim k As Double k = 0 For j = 1 To NumOfObs For i = 1 To NumOfClasses If DelList(i, 2) = A(j, 1) Then If DelList(i, 3) > NumOfFeatures Then 'we have enough observations k = k + 1 'count of observations For l = 1 To NumOfFeatures + 1 ADel(k, l) = A(j, l) 'here we copy observations from matrix A Next l End If End If Next i Next j DelCCount = NumOfClasses - ClassNum 'count of removed classes If NumOfClasses - ClassNum > 0 Then MsgBox (NumOfClasses - ClassNum & " classes removed") Rem Here we update values NumOfClasses = ClassNum NumOfObs = ObsNum A = ADel Rem Printing of the new observation matrix Open Folder & "ObsAfterRemoval.txt" For Output As 1 For i = 1 To NumOfObs textstring = "" For j = 1 To NumOfFeatures textstring = textstring & ADel(i, j) & "," Next j textstring = textstring & ADel(i, NumOfFeatures + 1) Print #1, textstring Next i Close (1) '''''''Training starts'''''''' Rem Here we count the number of classes in the training set, Rem create classnumbers and put pattern vectors to matrix mx. ReDim mx(1 To UBound(A, 1), 1 To UBound(A, 2)) ReDim ClassList(1 To 100, 1 To 2) mx = A ClassList(1, 2) = CVar(A(1, 1)) ClassList(1, 1) = 1 NumOfClasses = 1 For i = 1 To NumOfObs NewClass = True For j = 1 To NumOfClasses If A(i, 1) = ClassList(j, 2) Then NewClass = False End If Next j If NewClass = True Then ClassList(NumOfClasses + 1, 1) = NumOfClasses + 1 ClassList(NumOfClasses + 1, 2) = A(i, 1) NumOfClasses = NumOfClasses + 1 End If For j = 1 To NumOfClasses If A(i, 1) = ClassList(j, 2) Then mx(i, 1) = ClassList(j, 1) End If Next j Next i Rem Some basic logical examination of the observation matrix If NumOfClasses > NumOfObs / 2 Then MsgBox ("There is less than twice as much observations than classes!") MsgBox ("Classification aborted") Exit Sub ElseIf NumOfClasses < 2 Then MsgBox ("There have to be more than one class for classification") Exit Sub End If Rem Save the LDA parameters (including class listing) for future use Open Folder & "LDAParameters.txt" For Output As 6 Print #6, NumOfFeatures Print #6, NumOfClasses For i = 1 To NumOfClasses Print #6, ClassList(i, 1), ClassList(i, 2) Next i Close (6) MsgBox ("LDA for " & NumOfClasses & " classes") Rem Here we calculate the mean values of vectors x for each class j Rem and put them to matrix mM (1 row for each mean vector) ReDim mM(1 To NumOfClasses, 1 To NumOfFeatures) For i = 1 To NumOfClasses For l = 1 To NumOfFeatures ValCount = 0 ValSum = 0 For j = 1 To NumOfObs If mx(j, 1) = i Then ValSum = ValSum + mx(j, l + 1) ValCount = ValCount + 1 End If Next j mM(i, l) = ValSum / ValCount Next l Next i Rem Here we print the mean vectors Open Folder & "mM.txt" For Output As 2 For i = 1 To NumOfClasses For j = 1 To NumOfFeatures Print #2, mM(i, j) Next j Next i Rem Code for printing names and means of the classes in matrix form Rem (for easier visual studying) 'For i = 1 To NumOfClasses ' For j = 1 To NumOfFeatures - 1 ' textstring = textstring & " " & Format$(mM(i, j), "#.00") & " " ' Next j ' textstring = textstring & Format$(mM(i, NumOfFeatures), "#.00") ' Print #2, textstring 'Next i Close (2) Rem In this section we calculate covariance matrices (mCj) Rem with equation 1/N*sum(x*xT)-mj*mjT (j stands for class). Rem x and m are vertical and their transposes horizontal vectors Rem which leads up to matrices instead of scalars. As result of this Rem covariance matrices are square matrices that have as many rows and columns Rem than there are features. Rem Previous determinants are deleted Open (Folder & "Dets.txt") For Output As 11 Close (11) ReDim X(1 To NumOfFeatures) ReDim xT(1 To NumOfFeatures) ReDim xxT(1 To NumOfFeatures, 1 To NumOfFeatures) ReDim m(1 To NumOfFeatures) ReDim mT(1 To NumOfFeatures) ReDim mmT(1 To NumOfFeatures, 1 To NumOfFeatures) ReDim Determinant(1 To NumOfClasses) ReDim mC(0 To NumOfFeatures - 1, 0 To NumOfFeatures - 1) ReDim DiffSumC(1 To NumOfFeatures, 1 To NumOfFeatures) As Double For i = 1 To NumOfClasses ValCount = 0 ReDim DiffSum(1 To NumOfFeatures, 1 To NumOfFeatures) As Double For j = 1 To NumOfObs If mx(j, 1) = i Then ValCount = ValCount + 1 For l = 1 To NumOfFeatures X(l) = mx(j, l + 1) Next l xT = X Call VectorMult(X, xT, xxT) For l = 1 To NumOfFeatures For k = 1 To NumOfFeatures DiffSum(l, k) = DiffSum(l, k) + xxT(l, k) DiffSumC(l, k) = DiffSumC(l, k) + xxT(l, k) 'this is for variance covariance matrix of all data Next k Next l End If Next j For j = 1 To NumOfFeatures m(j) = mM(i, j) Next j mT = m Call VectorMult(m, mT, mmT) For j = 1 To NumOfFeatures For k = 1 To NumOfFeatures mC(j - 1, k - 1) = (DiffSum(j, k) / ValCount - mmT(j, k)) * (ValCount / (ValCount - 1)) 'mC(j - 1, k - 1) = DiffSum(j, k) / ValCount - mmT(j, k) As above, but without correction made according to SAS Next k Next j Rem Determinants of covariance matrices are created and saved here Open Folder & "Dets.txt" For Append As 4 Determinant(i) = det(mC) Print #4, Determinant(i) Close (4) Rem Also covariance matrices have to be saved for future classification tasks Open Folder & "mC" & i & ".txt" For Output As 3 For j = 1 To NumOfFeatures For k = 1 To NumOfFeatures Print #3, mC(j - 1, k - 1) Next k Next j Close (3) Rem Code for printing the covariance matrices in matrix form ' Open Folder & "mC" & i & ".txt" For Output As 3 ' For j = 1 To NumOfFeatures ' textstring = "" ' For k = 1 To NumOfFeatures - 1 ' textstring = textstring & mC(j - 1, k - 1) & " " ' Next k ' textstring = textstring & mC(j - 1, NumOfFeatures - 1) ' Print #3, textstring ' Next j ' Close (3) Next i Rem Calculation of variance covariance matrix from all observations Rem Meanvalues from all data ReDim Means(1 To NumOfFeatures) As Double For j = 1 To NumOfFeatures For i = 1 To NumOfObs Means(j) = Means(j) + mx(i, j + 1) Next i Means(j) = Means(j) / NumOfObs MsgBox (Means(j)) Next j Rem Mean vector multiplicated by its transpose ReDim MeansT(1 To NumOfFeatures) As Double MeansT = Means Call VectorMult(Means, MeansT, mmT) Rem Calculation of covariance matrix For j = 1 To NumOfFeatures For k = 1 To NumOfFeatures mC(j - 1, k - 1) = (DiffSumC(j, k) / NumOfObs - mmT(j, k)) * (NumOfObs / (NumOfObs - 1)) Next k Next j Rem Printing of the covariance matrix Open Folder & "mC.txt" For Output As 12 For j = 1 To NumOfFeatures For k = 1 To NumOfFeatures Print #12, mC(j - 1, k - 1) Next k Next j Close (12) Rem Now we have all we need i.e. covariance matrices Cj, their determinants and Rem mean vectors mj for each pattern class j. Rem Next the classification is tested within training set. Rem Observations are now transferred to matrix B without class information ReDim B(1 To NumOfObs, 1 To NumOfFeatures) For i = 1 To NumOfObs For j = 1 To NumOfFeatures B(i, j) = A(i, j + 1) Next j Next i Call LDAClassification(B, FolderName) Open Folder & "Classification.txt" For Input As 7 Rem Here we create and print matrix with classification information. Proper classifications are Rem in the diagonal, real classes in columns and results in rows. ReDim ErrorMatrix(1 To NumOfClasses, 1 To NumOfClasses) As Long For i = 1 To NumOfObs Input #7, ObsNumber, Class, textstring For j = 1 To NumOfClasses If ClassList(j, 2) = Class Then For k = 1 To NumOfClasses If A(i, 1) = ClassList(k, 2) Then ErrorMatrix(ClassList(j, 1), ClassList(k, 1)) = ErrorMatrix(ClassList(j, 1), ClassList(k, 1)) + 1 End If Next k End If Next j Next i Close (7) Rem Now we calculate the Cohen's kappa and percent agreement Dim Kappa As Double Dim DiagSum As Long 'Sum of diagonal elements ReDim RowColSum(1 To NumOfClasses, 1 To 2) As Long 'Row sums and column sums Dim q As Double 'Number of cases expected in diagonal by chance Dim Correct As Double 'Percent agreement For i = 1 To NumOfClasses For j = 1 To NumOfClasses RowColSum(i, 1) = RowColSum(i, 1) + ErrorMatrix(i, j) RowColSum(j, 2) = RowColSum(j, 2) + ErrorMatrix(i, j) If i = j Then DiagSum = DiagSum + ErrorMatrix(i, j) Next j Next i For i = 1 To NumOfClasses q = q + (RowColSum(i, 1) * RowColSum(i, 2)) / NumOfObs Next i Kappa = (DiagSum - q) / (NumOfObs - q) Correct = 100 * (DiagSum / NumOfObs) Rem Here we print the results of classification Print #10, Correct, Kappa, NumOfClasses, NumOfObs Open Folder & "ErrorMatrix.txt" For Output As 10 For i = 1 To NumOfClasses textstring = "" For j = 1 To NumOfClasses - 1 textstring = textstring & Format$(ErrorMatrix(i, j), "000") & "," Next j textstring = textstring & Format$(ErrorMatrix(i, NumOfClasses), "000") Print #10, textstring Next i Close (10) End Sub Public Sub LDAClassification(A() As Double, FolderName As String) Rem This routine examines in which class observation vector x belongs to Rem (in which it gets the highest value of djx) Dim NumOfClasses As Double Dim NumOfFeatures As Double Dim NumOfObs As Double Dim Folder As String Dim dxMax As Double Dim dx As Double Dim Term2Sum As Double Dim textstring As String Dim ClassList As Variant Dim mx() As Double Dim mM() As Double Dim Determinant() As Double Dim xmDiff() As Double Dim xmDiffT() As Double Dim xmDiffTmCinv() As Double Dim mC() As Double Dim mCinv() As Double Folder = FolderName NumOfObs = UBound(A, 1) Rem Read number of classes and features and class conversion list Open Folder & "LDAParameters.txt" For Input As 1 Input #1, NumOfFeatures Input #1, NumOfClasses ReDim ClassList(1 To NumOfClasses, 2) For i = 1 To NumOfClasses Input #1, ClassList(i, 1), ClassList(i, 2) Next i Close (1) Rem Here we check that we have as many features as we had in LDA teaching If NumOfFeatures > UBound(A, 2) Then MsgBox ("Pattern vectors are too short!") Exit Sub End If If NumOfFeatures < UBound(A, 2) Then MsgBox ("Pattern vectors are too long!") Exit Sub End If ReDim mx(1 To NumOfObs, 1 To NumOfFeatures) ReDim mM(1 To NumOfClasses, 1 To NumOfFeatures) ReDim Determinant(1 To NumOfClasses) ReDim xmDiff(1 To NumOfFeatures) ReDim xmDiffT(1 To NumOfFeatures) ReDim xmDiffTmCinv(1 To NumOfFeatures) ReDim mC(1 To NumOfFeatures, 1 To NumOfFeatures) ReDim mCinv(1 To NumOfFeatures, 1 To NumOfFeatures) mx = A Rem Read mean vectors of classes and determinants of covariance matrices Open Folder & "mM.txt" For Input As 3 Open Folder & "Dets.txt" For Input As 4 For i = 1 To NumOfClasses Input #4, Determinant(i) For j = 1 To NumOfFeatures Input #3, mM(i, j) Next j Next i Close (3) Close (4) Rem Result file Open Folder & "Classification.txt" For Output As 2 Rem Classification starts. Rem The maximum decision function value (dxMax) is first set to small value. Rem It is assumed that classes are equally likely to occur. For k = 1 To NumOfObs dxMax = -99999 For i = 1 To NumOfClasses Term2Sum = 0 For j = 1 To NumOfFeatures xmDiff(j) = mx(k, j) - mM(i, j) Next j xmDiffT = xmDiff Open Folder & "mC" & i & ".txt" For Input As 4 For l = 1 To NumOfFeatures For h = 1 To NumOfFeatures Input #4, mC(l, h) Next h Next l Close (4) Call MatrixInverse(mC, mCinv) Call VectorMatrixMult(xmDiffT, mCinv, xmDiffTmCinv) For l = 1 To NumOfFeatures Term2Sum = Term2Sum + xmDiffTmCinv(l) * xmDiff(l) Next l dx = -0.5 * Log(Determinant(i)) - 0.5 * Term2Sum If dxMax < dx Then dxMax = dx Class = i End If Next i Print #2, k, ClassList(Class, 2), dx 'MsgBox (dx & " " & Class) 'SaplingData(k).Class = Class Next k 'MsgBox ("Classification done") Close (2) End Sub Public Sub FileToMatrix(FileName As String, A() As Double) Rem FileName is the name of the file to be converted. Rem A is the matrix, where values are put. Dim LineLength As Integer Dim LineString As String Dim NumOfSeparators As Integer Dim NumOfRows As Integer Dim NumOfColumns As Integer Dim Location() As Integer Dim ValueLength As Integer Open FileName For Input As 1 NumOfRows = 0 Rem First the dimensions of the matrix are checked Do Until EOF(1) NumOfSeparators = 0 Line Input #1, LineString LineLength = Len(LineString) For j = 1 To LineLength Rem Ascii code 9 stands for horizontal tab, 32 for space and 44 for comma If (Mid(LineString, j, 1) = Chr(9) Or Mid(LineString, j, 1) = Chr(32) Or Mid(LineString, j, 1) = Chr(44)) And (Mid(LineString, j + 1, 1) <> Chr(9) And Mid(LineString, j + 1, 1) <> Chr(32) And Mid(LineString, j + 1, 1) <> Chr(44)) And j < LineLength Then NumOfSeparators = NumOfSeparators + 1 End If Next j If NumOfRows > 0 And LineLength > 0 And NumOfSeparators <> NumOfColumns - 1 Then MsgBox ("Number of columns varies between rows!") Exit Sub End If NumOfColumns = NumOfSeparators + 1 If LineLength > 0 Then NumOfRows = NumOfRows + 1 Loop Close (1) Rem Values are saved to matrix row by row according to their lengths and locations Rem in the row ReDim A(1 To NumOfRows, 1 To NumOfColumns) ReDim Location(1 To NumOfColumns + 1, 1 To 2) Open FileName For Input As 1 'Dim textstr As String 'Open "c:\temp\matriisi.txt" For Output As 10 For i = 1 To NumOfRows textstr = "" Line Input #1, LineString LineLength = Len(LineString) Rem Locations and lengths are chekced here k = 1 ValueLength = 1 For j = 2 To LineLength If (Mid(LineString, j, 1) = Chr(9) Or Mid(LineString, j, 1) = Chr(32) Or Mid(LineString, j, 1) = Chr(44)) And (Mid(LineString, j - 1, 1) <> Chr(9) And Mid(LineString, j - 1, 1) <> Chr(32) And Mid(LineString, j - 1, 1) <> Chr(44)) Then Location(k, 1) = j Location(k, 2) = ValueLength ValueLength = 0 k = k + 1 Else: If (Mid(LineString, j, 1) <> Chr(9) And Mid(LineString, j, 1) <> Chr(32) And Mid(LineString, j, 1) <> Chr(44)) Then ValueLength = ValueLength + 1 End If If j = LineLength Then Location(k, 1) = j + 1 Location(k, 2) = ValueLength End If Next j Rem Values are put to matrix here For j = 1 To NumOfColumns A(i, j) = CDbl(Mid(LineString, Location(j, 1) - Location(j, 2), Location(j, 2))) ' textstr = textstr & A(i, j) & "," Next j ' Print #10, textstr Next i Close (1) 'Close (10) End Sub Public Sub kNN(A() As Double, Folder As String, k As Long) Rem This routine does k nearest neighbor classification for given Rem observation vectors based on previously collected training data Rem A matrix contains one pattern vector per each row, Folder is the folder, where Rem training data is located and k is the number of nearest neighbors to be searched Rem Mahalanobis distance is used as distance metric and is calculated as sqr((x-y)T*COVinv*(x-y)) Dim NumOfObs As Long Dim NumOfFeatures As Long Dim TrainingNum As Long Dim TrainingData() As Double Dim MahalDist() As Double 'Mahalanobis distances Dim NN() As Double 'k-nearest neighbors Dim Class() As Double Dim mC() As Double 'variance-covariance matrix Dim mCinv() As Double 'inverse variance-covariance matrix Dim diff() As Double Dim DiffT() As Double Dim DiffTmCinv() As Double Dim DiffTmCinvDiff As Double Dim NNNum() As Long 'numbers of used observations (with shorter distances) Dim IsNew As Boolean 'boolean for studying if observation belongs to NNNum Rem Dimensions NumOfObs = UBound(A, 1) NumOfFeatures = UBound(A, 2) Rem Filling of training data matrix Open Folder & "obs.txt" For Input As 1 Input #1, TrainingNum ReDim TrainingData(1 To TrainingNum, 1 To NumOfFeatures + 1) For i = 1 To TrainingNum For j = 1 To NumOfFeatures + 1 Input #1, TrainingData(i, j) Next j Next i Close (1) Rem Filling of variance-covariance matrix ReDim mC(1 To NumOfFeatures, 1 To NumOfFeatures) ReDim mCinv(1 To NumOfFeatures, 1 To NumOfFeatures) Open Folder & "mC.txt" For Input As 3 For l = 1 To NumOfFeatures For h = 1 To NumOfFeatures Input #3, mC(l, h) Next h Next l Close (3) Call MatrixInverse(mC, mCinv) ReDim MahalDist(1 To TrainingNum, 1 To 2) 'Classes and Mahalanobis distances between observation and training data ReDim NN(1 To k, 1 To 3) 'k nearest neighbors' classes, distances and frequencies ReDim Class(1 To NumOfObs, 1 To 3) 'nearest classes, distances and frequencies ReDim diff(1 To NumOfFeatures) ReDim DiffT(1 To NumOfFeatures) ReDim DiffTmCinv(1 To NumOfFeatures) ReDim NNNum(1 To k) For i = 1 To NumOfObs Rem Here we set long distances to NN matrix For j = 1 To k NN(j, 2) = 30000 Next j Rem Calculation of Mahalanobis distances between observation i and training points For j = 1 To TrainingNum MahalDist(j, 2) = 0 MahalDist(j, 1) = TrainingData(j, 1) 'class For l = 1 To NumOfFeatures 'difference vector diff(l) = A(i, l) - TrainingData(j, l + 1) Next l DiffT = diff '"transpose" Call VectorMatrixMult(DiffT, mCinv, DiffTmCinv) For cell = 1 To NumOfFeatures 'vector multiplication (gives scalar) MahalDist(j, 2) = MahalDist(j, 2) + DiffTmCinv(cell) * diff(cell) Next cell MahalDist(j, 2) = Sqr(MahalDist(j, 2)) Next j Rem Search for the shortest distance For l = 1 To TrainingNum If NN(1, 2) > MahalDist(l, 2) Then NN(1, 1) = MahalDist(l, 1) 'class NN(1, 2) = MahalDist(l, 2) 'distance NN(1, 3) = 0 'Initialization of frequency NNNum(1) = l 'List of used observations End If Next l Rem Next we find 2 to k (if k>1) next shortest distances If k > 1 Then For j = 2 To k For l = 1 To TrainingNum If NN(j, 2) > MahalDist(l, 2) Then IsNew = True For t = 1 To k 'look if vector is already used If l = NNNum(t) Then IsNew = False Next t If IsNew = True Then NN(j, 1) = MahalDist(l, 1) NN(j, 2) = MahalDist(l, 2) NNNum(j) = l End If End If Next l NN(j, 3) = 0 'Initialization of frequency Next j End If Rem Here we count how many nearest neighbors belong to each class For j = 1 To k For l = 1 To k If NN(j, 1) = NN(l, 1) Then NN(j, 3) = NN(j, 3) + 1 Next l Next j Rem Now we examine which class has the highest frequency Class(i, 1) = NN(1, 1) 'defaults Class(i, 2) = NN(1, 2) Class(i, 3) = NN(1, 3) If k > 1 Then For j = 2 To k If NN(j, 3) > Class(i, 3) Then Class(i, 1) = NN(j, 1) Class(i, 3) = NN(j, 3) End If Next j End If Rem Here we set observations, that have less than half hits out of k, to class 0 If Class(i, 3) < k / 2 Then Class(i, 1) = 0 Rem Calculation of Mahalanobis distance mean For j = 1 To k If NN(j, 1) = Class(i, 1) Then Class(i, 2) = Class(i, 2) + NN(j, 2) Next j Next i Rem Printing of the results Open Folder & "kNNClasses.txt" For Output As 2 For i = 1 To NumOfObs Print #2, Format$(Class(i, 2), "#.000") & "," & Class(i, 1) & "," & Class(i, 3) SaplingData(i).Class = Class(i, 1) Next i Close (2) 'MsgBox ("k-NN classification done") End Sub