KNN-classifier

Category: Clustering algorithms
Development tool: C++
File size: 1719KB
Downloads: 0
Upload date: 2023-04-13 02:09:17
Uploader: sh-1993
Description: KNN classifier

File list:
CMakeLists.txt (2271, 2023-07-03)
ClientCLI (0, 2023-07-03)
ClientCLI\AlgorithmSettings.cpp (967, 2023-07-03)
ClientCLI\AlgorithmSettings.h (371, 2023-07-03)
ClientCLI\ClassificationKnn.cpp (509, 2023-07-03)
ClientCLI\ClassificationKnn.h (364, 2023-07-03)
ClientCLI\ClientCLI.cpp (1587, 2023-07-03)
ClientCLI\ClientCLI.h (704, 2023-07-03)
ClientCLI\DisplayClassiffications.cpp (1217, 2023-07-03)
ClientCLI\DisplayClassiffications.h (401, 2023-07-03)
ClientCLI\DownloadFile.cpp (1722, 2023-07-03)
ClientCLI\DownloadFile.h (391, 2023-07-03)
ClientCLI\Downloader.cpp (448, 2023-07-03)
ClientCLI\Downloader.h (281, 2023-07-03)
ClientCLI\Makefile (1023, 2023-07-03)
ClientCLI\UploadFile.cpp (2484, 2023-07-03)
ClientCLI\UploadFile.h (304, 2023-07-03)
Distances (0, 2023-07-03)
Distances\AUC.cpp (289, 2023-07-03)
Distances\AUC.h (272, 2023-07-03)
Distances\CAN.cpp (279, 2023-07-03)
Distances\CAN.h (272, 2023-07-03)
Distances\CHB.cpp (300, 2023-07-03)
Distances\CHB.h (274, 2023-07-03)
Distances\Distances.h (299, 2023-07-03)
Distances\MAN.cpp (290, 2023-07-03)
Distances\MAN.h (287, 2023-07-03)
Distances\MIN.cpp (291, 2023-07-03)
Distances\MIN.h (269, 2023-07-03)
Distances\Makefile (217, 2023-07-03)
General (0, 2023-07-03)
General\Client.h (650, 2023-07-03)
General\Command.h (282, 2023-07-03)
General\Comparator.cpp (373, 2023-07-03)
General\Comparator.h (219, 2023-07-03)
General\Database.cpp (6787, 2023-07-03)
General\Database.h (1209, 2023-07-03)
General\Makefile (761, 2023-07-03)
... ...

# Knn classifier

## In this repo

- ClientCLI: directory containing client files
- Distances: directory containing distance files
- General: directory containing files required by both client and server programs
- ServerCLI: directory containing server files

### Building

```bash
$ make
```

### Running

#### Server

```bash
$ ./server.out
```

#### Client

```bash
$ ./client.out
```

#### Expected input (client)

The user enters the number corresponding to an option in the menu. '-1' closes the socket and terminates the program.

### Cleaning

```bash
$ make clean
```

## Project Walkthrough

### Distance class - functor

This class defines virtual functions which every distance must implement. Each distance is implemented as an object, which gives us the ability to switch distance functions at runtime for database functionality. This is an object-oriented solution to calculating the distance between database vectors and the input vector. Rather than passing a distance function (distance code from user input), we pass an object that contains the relevant distance method. Each class (AUC, CAN, CHB, MAN, MIN) holds its unique implementation in the () operator, which allows us to write simpler, more elegant code. We also hold a map whose key is the distance function name and whose value is the corresponding functor object, which keeps the code cleaner than a long chain of if/else conditions.

### Database class

Processes the user-chosen CSV files and turns them into a database, which is internally implemented as a vector. Every line from the CSV file is split on commas and each component is converted to a double. The class supports processing both classified and unclassified vectors. (For vector length issues, see 'Misc'.)

### Vector class

Same as assignment 1.

### Comparator class - functor

Compares two vectors based on their distFromArg member variable. A Comparator object is passed to the sorting function in the K-nearest algorithm in Database.
This is an object-oriented alternative to passing a reference to a static function, which is what we would normally do to implement a custom comparator. The Comparator functor lets us pass a reference to an initialized object rather than a static function.

### DefaultIO

SocketIO and FileIO inherit from this abstract class and implement its virtual methods write() and read(). This wraps the unsightly socket and file handling in a class and makes sockets and files easy to write to and read from.

### ClientCLI

This class wraps the client functionality. It runs the main menu and executes commands based on user input. The client is based on the implementation seen in the lecture presentation. The client prompts the user for an option, sends the data to the server and executes the command. Once it receives a response from the server, the client prints it to the console if appropriate or performs some additional logic (depending on the server message). When the client receives '-1' from the user, it disconnects from the server. All communication uses the TCP protocol.

### Server

The multithreaded server can handle multiple clients at once. It opens a ServerCLI for each client on a new thread.

### ServerCLI

The server is based on the implementation seen in the lecture presentation. It reads the user's choice and executes the appropriate command. All communication uses the TCP protocol.

### Used the command design pattern

![image](https://github.com/ArielElb/KNN-classifier/assets/94087682/c6ff14f6-5328-421a-955e-32058018694c)

- The Invoker is the ServerCLI, which holds a vector of Commands (polymorphism).
- The ConcreteCommands inherit from Command: UploadCommand, SettingsCommand, ClassifyCommand, DisplayCommand, DownloadCommand.
- There are multiple Receivers, such as DefaultIO and Database, which carry out the actions.
- ClientCLI: the client has commands parallel to the server's, so each side can do what is needed.
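
The invoker/command relationship listed above can be illustrated with a minimal sketch. It is not the repository's code: the `Log` receiver, the `execute()` signature, the `Invoker` class, and the "classification complete" message are hypothetical stand-ins chosen for this example; only the command class names and the "upload complete" reply come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical receiver standing in for Database / DefaultIO.
struct Log {
    std::vector<std::string> entries;
};

// Abstract command: every concrete command implements execute().
class Command {
public:
    virtual ~Command() = default;
    virtual void execute(Log& receiver) = 0;
};

class UploadCommand : public Command {
public:
    void execute(Log& receiver) override {
        receiver.entries.push_back("upload complete");
    }
};

class ClassifyCommand : public Command {
public:
    void execute(Log& receiver) override {
        // Placeholder message, assumed for this sketch.
        receiver.entries.push_back("classification complete");
    }
};

// Invoker: owns the commands and dispatches by 1-based menu number.
class Invoker {
public:
    void add(std::unique_ptr<Command> command) {
        assert(command != nullptr);
        commands_.push_back(std::move(command));
    }

    // Returns false when the option does not match any command.
    bool run(int option, Log& receiver) {
        if (option < 1 || static_cast<std::size_t>(option) > commands_.size()) {
            return false;
        }
        commands_[static_cast<std::size_t>(option) - 1]->execute(receiver);
        return true;
    }

private:
    std::vector<std::unique_ptr<Command>> commands_;
};
```

Because the invoker only sees the `Command` interface, adding a menu option means adding one class and one `add()` call, with no change to the dispatch code.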

In this assignment our server can handle many clients in parallel using threads. When a new client runs the program, the server sends it this text:

![alt text](https://github.com/TopazAvraham/IntroductionToCS-University-C-programming/blob/master/Screenshots/205.png?raw=true)

#### Option 1 - Upload CSV

If the user chooses this option, they are prompted to enter the path to a local CSV file on their computer. After they press Enter, the client sends the content of that file to the server. This file contains the classified vectors. After the file is sent, the server replies to the client with "upload complete". If the path is not valid, "invalid input" is printed in the user's terminal. This process is done twice: first for the classified vectors and then for the vectors that need to be classified. After both files are uploaded, the main menu is shown again.

![alt text](https://github.com/TopazAvraham/IntroductionToCS-University-C-programming/blob/master/Screenshots/204.png?raw=true)

#### Option 2 - Algorithm Settings

If the user chooses this option, they can change K, the number of neighbors the algorithm uses, or the distance metric on which the algorithm bases its calculations. First the current KNN parameters (K and the distance metric) are shown, and then the user is given the opportunity to change them. If the user enters an invalid value for K or for the distance metric, the server sends back an error: "invalid input for K" or "invalid input for metric" accordingly.

![alt text](https://github.com/TopazAvraham/IntroductionToCS-University-C-programming/blob/master/Screenshots/203.png?raw=true)

#### Option 3 - Classify Data

If the user chooses this option, the server runs the KNN algorithm on the uploaded files using the current algorithm settings.
If the user hasn't uploaded any files, or has uploaded only the training file or only the test file, the server sends an error to the client saying that the data should be uploaded first.

#### Option 4 - Display Results

If the user chooses this option, the server sends the client the KNN classification results based on the uploaded files and the current algorithm settings. If the user hasn't uploaded any files, or has uploaded only the training file or only the test file, the server sends an error to the client saying that the data should be uploaded first. Also, if the user hasn't classified the data yet (that is, hasn't chosen option 3 - Classify Data), the server sends an error indicating that the data needs to be classified.

![alt text](https://github.com/TopazAvraham/IntroductionToCS-University-C-programming/blob/master/Screenshots/202.png?raw=true)

#### Option 5 - Download Results

If the user chooses this option, they are prompted to enter a path on their computer, and the program creates a CSV file at that path containing the classification results. If the path is not valid, an error message is shown. If the user hasn't uploaded any files, or has uploaded only the training file or only the test file, the server sends an error to the client saying that the data should be uploaded first. Also, if the user hasn't classified the data yet (that is, hasn't chosen option 3 - Classify Data), the server sends an error indicating that the data needs to be classified.

- Important note: each time the user wants to download the results, we assign a new socket and bind it to port 0. The OS then picks a free port and returns it to us, so the results can be transferred while the server keeps accepting new commands and new users.

#### Option 8 - Exit

If the user chooses this option, the program exits right after releasing all resources in use, such as sockets and threads.
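
The note under Option 5 about binding a fresh socket to port 0 can be sketched with the POSIX socket API. This is a minimal illustration under assumptions, not the project's code: the function name `listenOnEphemeralPort` and the choice of the loopback address are made up for the example.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>

// Opens a TCP listening socket on a port chosen by the OS.
// Returns the socket descriptor, or -1 on failure; on success the
// assigned port is written to `port`.
// Name and loopback address are assumptions for this sketch.
int listenOnEphemeralPort(std::uint16_t& port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);  // 0 asks the OS to pick any free port

    socklen_t len = sizeof(addr);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), len) < 0 ||
        listen(fd, 1) < 0 ||
        // getsockname() reports the port the OS actually assigned.
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        close(fd);
        return -1;
    }

    port = ntohs(addr.sin_port);
    return fd;
}
```

The server would then send the returned port number to the client over the existing connection, so the client can open a second connection for the file transfer while the main connection stays free for further commands.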
The execution is straightforward: we run the client and server code in different terminals. The client asks to connect to the server, and the server host assigns it a designated socket and a designated thread. Then, in the client code, we read the input from the user and send it to the server.

In the server code we calculate the result based on the classes we implemented from previous assignments, using OOP principles.

We implemented a distance class where each metric is a different method, and we implemented a Knn class that calculates the distance of the given vector from the vectors in the file, updates the distances accordingly, and bubble-sorts the vectors from the file by the distance just calculated. Then a method takes the K received from the client, finds the most frequent name among the K nearest vectors, and returns it as the result of the KNN algorithm. It does that for all the unclassified vectors that the client sent. If needed, the server sends the result to the client using the designated client_socket and thread that were created for it.
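
The classification step described above can be sketched as follows. This is an illustration under several assumptions rather than the project's implementation: `Sample` and `classify` are hypothetical names, AUC and MAN are assumed to mean Euclidean and Manhattan distance, only two metrics are registered, and `std::sort` is swapped in for the bubble sort mentioned in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Abstract distance functor; each metric overrides operator().
struct Distance {
    virtual ~Distance() = default;
    virtual double operator()(const std::vector<double>& a,
                              const std::vector<double>& b) const = 0;
};

// Euclidean distance (assumed meaning of "AUC").
struct AUC : Distance {
    double operator()(const std::vector<double>& a,
                      const std::vector<double>& b) const override {
        double sum = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return std::sqrt(sum);
    }
};

// Manhattan distance (assumed meaning of "MAN").
struct MAN : Distance {
    double operator()(const std::vector<double>& a,
                      const std::vector<double>& b) const override {
        double sum = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) sum += std::fabs(a[i] - b[i]);
        return sum;
    }
};

// Hypothetical record: a labelled vector plus its distance to the query.
struct Sample {
    std::vector<double> values;
    std::string label;
    double distFromArg;
};

// Comparator functor: orders samples by distFromArg.
struct Comparator {
    bool operator()(const Sample& x, const Sample& y) const {
        return x.distFromArg < y.distFromArg;
    }
};

// Returns the majority label among the k samples nearest to `query`.
// `metric` must be a key of the functor map; requires 1 <= k <= data.size().
std::string classify(std::vector<Sample> data, const std::vector<double>& query,
                     const std::string& metric, std::size_t k) {
    static const std::map<std::string, std::shared_ptr<Distance>> metrics = {
        {"AUC", std::make_shared<AUC>()}, {"MAN", std::make_shared<MAN>()}};
    assert(k >= 1 && k <= data.size());

    const Distance& dist = *metrics.at(metric);  // throws if the name is unknown
    for (Sample& s : data) s.distFromArg = dist(s.values, query);
    std::sort(data.begin(), data.end(), Comparator{});

    std::map<std::string, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[data[i].label];

    // max_element returns the first maximum, so ties go to the first key in map order.
    typedef std::pair<const std::string, int> Vote;
    return std::max_element(votes.begin(), votes.end(),
                            [](const Vote& a, const Vote& b) { return a.second < b.second; })
        ->first;
}
```

Looking the metric up by name in a map replaces a chain of string comparisons, and switching metrics at runtime only changes the key passed to `classify`.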

## Misc

- The program recognizes valid numerical input, such as integers, decimal numbers, and scientific notation. Any invalid input in one of the vectors, whether read from a file or received from the user at runtime, causes the program to read another vector.
- The classification of a vector must be a non-numerical string. If the user attempts to classify a vector with a number, the program interprets it as a vector of length n+1 (i.e. given a vector of length 4 with a numerical classification, it reads an unclassified vector of length 5).
- The database must contain vectors of the same length. The program will not run unless all vectors in the database match the length of the first.
- The database must contain at least 1 vector. An empty database causes the program to exit.
- 1 <= k <= size of database. Failure to meet this requirement causes the program to exit.
- If k is a decimal number, e.g. 2.2, it is treated as k = 2.
- If there is a tie for the majority classification of the nearest neighbors (i.e. k = 3, and the 3 closest vectors have different classifications), the program takes the first maximum it comes across in the Map structure.
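
The first two rules in the list above (numeric validation, and a non-numeric last field acting as the classification) can be sketched like this. It is a hedged illustration, not the Database class itself: `ParsedRow`, `parseNumber`, and `parseRow` are hypothetical names, and `std::stod` is used as one possible way to accept integers, decimals, and scientific notation.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical parsed row: numeric components plus an optional label.
struct ParsedRow {
    std::vector<double> values;
    std::string label;  // empty when the row is unclassified
};

// True when the whole token is a number (integer, decimal, or scientific notation).
bool parseNumber(const std::string& token, double& out) {
    try {
        std::size_t used = 0;
        out = std::stod(token, &used);
        return used == token.size();  // reject trailing garbage such as "1.5x"
    } catch (const std::exception&) {
        return false;  // not a number, or out of range for double
    }
}

// Splits a CSV line on commas. Every field must be numeric, except that a
// non-numeric last field is taken as the label. Returns false for an invalid row.
bool parseRow(const std::string& line, ParsedRow& row) {
    std::vector<std::string> fields;
    std::stringstream stream(line);
    std::string field;
    while (std::getline(stream, field, ',')) fields.push_back(field);
    if (fields.empty()) return false;

    row = ParsedRow{};
    for (std::size_t i = 0; i < fields.size(); ++i) {
        double value = 0.0;
        if (parseNumber(fields[i], value)) {
            row.values.push_back(value);
        } else if (i + 1 == fields.size() && !fields[i].empty()) {
            row.label = fields[i];
        } else {
            return false;
        }
    }
    return !row.values.empty();
}
```

With this approach a row such as `1,2,3,4,5` comes back as an unclassified vector of length 5, matching the n+1 behavior described in the list. One caveat of `std::stod` is that it also accepts tokens like `inf` and `nan`, so labels spelled that way would be read as numbers.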
