STEM Team 5: Machine Learning Computing Resource: 2016

A confusion matrix is a table that summarizes the results of a supervised learning algorithm (such as the deep learning algorithm being worked on) on a classification task. It is composed of rows and columns, where the columns represent the classes that the algorithm predicted and the rows represent the actual classes of the objects tested.

                    Predicted
                Dog    Cat    Fish
Actual  Dog      6      2      0
        Cat      3      7      0
        Fish     0      0     11

In the above example there is a group of 8 dogs, 10 cats, and 11 fish. The algorithm has accurately identified 6 dogs as dogs, 7 cats as cats, and 11 fish as fish. The accurate results are easy to distinguish because they form a diagonal line from the top left of the table to the bottom right (highlighted in green). In this example we can identify the algorithm's mistakes as well. As shown above (in red), the algorithm mistook 2 dogs for cats and 3 cats for dogs; however, all the fish were accurately identified and no dogs or cats were mistaken for fish. From this information, one can observe that the algorithm was extremely accurate in identifying fish and made some errors when distinguishing cats from dogs and vice versa.
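The reading of the diagonal described above can be sketched in a few lines of Python. This is only an illustration: the names `labels`, `matrix`, and `per_class_summary` are made up for this example and are not part of the team's algorithm.

```python
# Rows are actual classes, columns are predicted classes
# (same layout as the table above).
labels = ["Dog", "Cat", "Fish"]
matrix = [
    [6, 2, 0],   # actual dogs
    [3, 7, 0],   # actual cats
    [0, 0, 11],  # actual fish
]


def per_class_summary(matrix, labels):
    """Return {label: (correct, total_actual)}.

    The correct count is the diagonal entry for that class, and the
    number of actual objects of that class is the sum of its row.
    """
    return {
        label: (matrix[i][i], sum(matrix[i]))
        for i, label in enumerate(labels)
    }


summary = per_class_summary(matrix, labels)
for label, (correct, total) in summary.items():
    print(f"{label}: {correct} of {total} correctly identified")
```

Each row sum gives the group size (8 dogs, 10 cats, 11 fish), and each diagonal entry gives how many of that group were identified correctly.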

Table of Confusion

A table of confusion is a table with 2 rows and 2 columns that reports the number of true positives, false positives, true negatives, and false negatives for a single class. It allows a more detailed analysis of a confusion matrix than overall accuracy (the proportion of correct guesses) does. Accuracy alone is unreliable because, if the data set is unbalanced, the results will be misleading. For example, if there were 95 cats and only 5 dogs in the data set, the classifier could easily be biased into classifying all the samples as cats. The overall accuracy would be 95%, but in practice the classifier would have a 100% recognition rate for the cat class and a 0% recognition rate for the dog class.
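The unbalanced example can be checked with a short Python sketch. The "always answer cat" classifier below is a hypothetical worst case built for illustration, and the function names are assumptions, not part of any library.

```python
# Hypothetical unbalanced data set from the example: 95 cats and 5 dogs.
actual = ["cat"] * 95 + ["dog"] * 5
# A biased classifier that labels every sample as a cat.
predicted = ["cat"] * len(actual)


def accuracy(actual, predicted):
    """Proportion of all samples whose predicted class matches the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def recognition_rate(actual, predicted, label):
    """Proportion of the samples of class `label` that were predicted as `label`."""
    predictions_for_class = [p for a, p in zip(actual, predicted) if a == label]
    hits = sum(p == label for p in predictions_for_class)
    return hits / len(predictions_for_class)


overall = accuracy(actual, predicted)
cat_rate = recognition_rate(actual, predicted, "cat")
dog_rate = recognition_rate(actual, predicted, "dog")
print(f"accuracy={overall:.0%}, cat={cat_rate:.0%}, dog={dog_rate:.0%}")
```

The overall accuracy looks high even though the classifier never recognizes a single dog, which is why per-class counts are more informative.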

The proper Table of Confusion for the dog class for the Confusion Matrix above would be:

6 true positives (6 dogs correctly identified as dogs)             |  3 false positives (3 cats that were incorrectly identified as dogs)
2 false negatives (2 dogs that were incorrectly identified as cats)  |  18 true negatives (18 animals that are not dogs and were not identified as dogs)
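As a rough sketch, the four counts can be derived from the confusion matrix for any class. The helper name `table_of_confusion` is hypothetical and used here only for illustration.

```python
# Rows are actual classes, columns are predicted classes.
labels = ["Dog", "Cat", "Fish"]
matrix = [
    [6, 2, 0],   # actual dogs
    [3, 7, 0],   # actual cats
    [0, 0, 11],  # actual fish
]


def table_of_confusion(matrix, index):
    """Return TP, FP, FN, and TN for the class at position `index`."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[index][index]                    # diagonal entry
    fn = sum(matrix[index]) - tp                 # rest of the class's row
    fp = sum(row[index] for row in matrix) - tp  # rest of the class's column
    tn = total - tp - fn - fp                    # everything else
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


dog_table = table_of_confusion(matrix, labels.index("Dog"))
print(dog_table)
```

For the dog class this gives the same 6, 3, 2, and 18 listed above. The 18 true negatives are the 7 + 0 + 0 + 11 entries that sit outside both the dog row and the dog column.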

Computer System Parts:
https://pcpartpicker.com/user/tylevy555/saved/#view=TWQ48d

CPU:
Intel Core i5-4690 3.5GHz Quad-Core Processor

Motherboard:
Gigabyte GA-Z97MX-Gaming 5 Micro ATX LGA1150 Motherboard

Memory (RAM):
Corsair Vengeance Pro 16GB (2 x 8GB) DDR3-1866 Memory

Storage:
Samsung 850 EVO-Series 250GB 2.5" Solid State Drive
Western Digital BLACK SERIES 1TB 3.5" 7200RPM Internal Hard Drive

GPU:
MSI GeForce GTX 980 Ti 6GB Video Card
MSI GeForce GTX 980 Ti 6GB Video Card
[2-Way SLI]

Case:
Corsair SPEC-03 Red ATX Mid Tower Case

PSU:
EVGA SuperNOVA 1000G2 1000W 80+ Gold Certified Fully-Modular ATX Power Supply

Monitor:
Acer G276HL Gbmid 27-inch Full HD (1920 x 1080) Widescreen Display

Base Total: $2105.71
Mail-in Rebates: -$10.00

Total Price: $2095.71

Rationale:

CPU:
I chose this CPU because the CPU isn't extremely important for the Deep Learning Algorithm to work. Even so, this CPU exceeds what the Deep Learning Algorithm requires of a CPU and is still cheap (for a CPU, at least).

Motherboard:
I chose this motherboard because it's a standard motherboard that is compatible with all the other parts. It also offers 3 PCI-E slots: the 2 graphics cards will occupy 2 of them, leaving room for 1 more in the future in case I decide to upgrade to 3-Way SLI.

Memory (RAM):
I chose 16 GB of RAM because CPU RAM should be equal to or greater than Video RAM. Since this build will have 12 GB of Video RAM (2 x 6 GB), 16 GB of CPU RAM keeps that ratio comfortably.

Storage:
I chose a 250 GB SSD to store the algorithm and the OS so the computer will boot up quickly and load the program and other important software quickly. The 1 TB HDD is just to store other software that is needed or wanted but is less important than the algorithm. Software on the HDD will load more slowly than on the SSD, but the HDD's lower price per GB makes up for that sacrifice.

GPU:
The most important part of the hardware list is the GPU. Deep learning algorithms tend to be incredibly compute intensive, and GPUs speed up that kind of highly parallel processing more than any other hardware part, so they are vital to building an effective deep learning oriented computer. Because of this, the GPUs take up about 66% of the total budget. Unfortunately, the best GPU on the market currently, the GeForce GTX Titan X, is too expensive to be practical given the budget ($2000). However, the 980 Ti's performance can be increased by overclocking it, which can bring it close to or past a stock Titan X while keeping expenditures within the budget.

Case:
I chose this case because it's large enough to contain all the hardware and still allow good airflow. It also comes with enough case fans to keep the hardware cool if it gets hot. It also looks cool.

PSU:
I chose this PSU because it's cheap and supplies enough power for both GPUs and the rest of the system.

Monitor:
I chose this monitor because I'm giving it to the school (in order to maintain the budget). Also, it's 27 inches which makes it more than large enough for dealing with multiple windows or the code.

Tuesday, February 2, 2016

Confusion Matrix

Tuesday, January 5, 2016

STEM Deep Learning Computer Part List